Form-Focused Instruction and the Advanced Language Learner: On the Importance of the Semantics of Grammar 9783631607497, 9783653018769, 3631607490

186 67 3MB

English Pages [514]

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Form-Focused Instruction and the Advanced Language Learner: On the Importance of the Semantics of Grammar
 9783631607497, 9783653018769, 3631607490

Table of contents :
Contents 7
Acknowledgements 13
Introduction 15
PART I: FOCUS ON FORM 19
Chapter One: On the importance of noticing: attention 21
1. Selectivity of attention 22
1.1. Attentional selectivity: main research paths to date 22
1.1.1. Focusing in controlled vs. automated processing 27
1.1.2. Extensive and intensive attention 29
1.2. Working memory and attention 32
2. The problem of consciousness 38
2.1. The easy problems of consciousness 40
2.2. The hard problem of consciousness – Chalmers’ dualism vs. Dennett’s physicalism 47
2.3. Dual Process Theories 51
2.3.1. Epstein’s Cognitive-Experiential Self-Theory 51
2.3.2. Bargh and Wegner’s automatic processing 53
2.3.3. Chaiken’s heuristic and systematic information processing 53
3. Attention and language learning 54
3.1. Attentional selectivity in language processing 54
3.1.1. Selectivity in language processing. Treisman revisited 54
3.1.2. Attentional selectivity in language learning. The constraints on the process 63
3.1.2.1. Selectivity in input processing. Form/meaning trade-off effects 64
3.1.2.2. Selectivity in output processing 69
3.2. On noticing. The problem of conscious awareness 73
3.2.1. The Noticing Hypothesis – strong and weak versions 74
3.2.2. The levels of conscious awareness 76
3.2.2.1. Awareness on the level of noticing 76
3.2.2.2. Awareness on the level of understanding 81
3.2.2.3. The strong version of the Noticing Hypothesis. A discussion 84
3.2.4. Constraints on noticing 90
Chapter Two: Language instruction: Theoretical underpinnings and practical options 97
1. Implicit and explicit learning 100
1.1. Implicit learning 100
1.1.1. Defining the implicit. The terminological hoax 101
1.1.2. Implicit learning paradigms 106
1.1.3. Implicit learning and educating intuition 113
1.2. Explicit learning 115
1.2.1. Defining the explicit 115
1.2.2. Explicit learning paradigms and models 119
2. Implicit and explicit learning of languages 126
2.1. Implicit language learning 127
2.1.1. Deductive implicit models: the Identity Hypothesis and UG-based approaches 127
2.1.2. Inductive-implicit models: the Interlanguage Theory, the Monitor Model and the Competition Model 130
2.1.3. Acquisition by Processing – the APT model 137
2.2. Explicit language learning 140
2.2.1. Input Processing (IP) 140
2.2.2. Output/interaction-based approaches 142
2.2.3. Language as a cognitive skill 144
2.2.4. Theoretical underpinnings of instructed learning 146
3. Focus-on form: pedagogical options 152
3.1. Comprehensible-input language pedagogy 153
3.2. Input enhancement 155
3.3. Consciousness Raising 158
3.4. Explicit focus on form: the experiential classroom 167
3.5. Explicit focus-on-form – direct instruction 171
4. What kind of focus on form? Learner individual differences and object-of-study constraints 177
4.1. Focus on form or focus on forms? 177
4.2. Learner constraints on form-focused instruction. Individual differences 182
4.2.1. Language aptitude and working memory as individual differences 182
4.2.2. Learning styles and instruction 186
4.3. Object-of-study constraints 190
PART II: FOCUS ON FORM 195
Chapter Three: The meaning of meaning 199
1. Introduction to meaning. The cognitive perspective 199
1.1. Prototype semantics 200
1.2. Conceptual and spatial semantics; Jackendoff’s model (2002, 2004) 202
2. Holistic semantic models 208
2.1. Mental spaces 208
2.2. Frames and cultural models 213
2.2.1. Cognitive frames 217
2.2.2. Interactional frames 218
2.3. Scripts 221
3. Holistic semantic models: the intra- and inter-space processes 225
3.1. Mappings 225
3.2. Perspective 226
3.2.1. Perspective in the scope of predication 227
3.2.2. Perspective in discourse 231
3.3. Profile 232
3.4. Active zones 234
4. Conceptual Integration Theory 237
5. Dynamic aspects of online meaning construction: frame shifting and conceptual blending 239
5.1. Conceptual blending 240
5.2. Frame shifting 246
Chapter Four: Focus on the semantics of the English tense-aspect system 251
1. Tense and aspect – the preliminaries 252
1.1. Tense 253
1.1.1. The timeline and the basic temporal concepts 253
1.1.2. The controversies 257
1.1.3. Tense as a grammatical category 261
1.2. Aspect 261
1.2.1. Perfective vs. imperfective 261
1.2.2. The perfect aspect 263
1.2.3. Lexical aspect 264
1.3. Tense and aspect. The grams 269
2. The English tense and aspect as grammatical categories. The prototype effects 273
2.1. The past tense 273
2.2. The present tense 276
2.3. The future tense 281
2.4. The progressive and the perfect aspects 283
2.5. Prototypicality effects in time orientation 288
3. Perspectivisation of the English tense and aspect 291
3.1. Tense as deixis: the present tense hoax 292
3.2. Simple totalities vs. complex partialities: the optimality/egocentricity issue 296
3.3. What the perfects have in common 298
3.4. Tense as a reference point construction: the perspective of mental space architecture 300
3.5. Temporal and aspectual profiles; the notion of re-profiling 303
4. The CIT perspective on the semantics of English temporal expressions 308
PART III: FOCUS ON FORM-MEANING CONNECTIONS IN THE ENGLISH TENSE-ASPECT SYSTEM 313
Chapter Five: Organic Approach Deductivised: Towards a research design 315
1. Form-meaning mappings in the tense-aspect system. Research to date 315
1.1. Contextualising the OAD studies 316
1.2. Form-to-function studies of the acquisition of the temporal system 317
1.2.1. The acquisitional sequence studies 317
1.2.2. Aspect Hypothesis (AH) studies 320
1.2.3. Discourse Hypothesis studies. Bardovi-Harlig’s (1998, 2000) aspect-and-narrative approach 323
1.3. The focus-on-form perspective on time-talk acquisition 325
2. Linguistic perspectives on form-focus pedagogy. Construction grammar as an option 330
2.1. Grammar theories to date and their impact on pedagogical grammar 331
2.2. The Lexical Approach – potential shortcomings 337
2.3. Construction Grammar (CxG) – an overview 341
2.4. Construction Grammar as the theoretical backbone of grammar pedagogy 342
3. Organic Approach Deductivised 345
3.1. The advanced language learner. Indentifying the problem 346
3.2. Form-form and form-meaning discrepancy in the advanced language classroom 347
3.2.1. The deductive, linear approach: the main culprit? 348
3.2.2. The organic, input-based approaches: another culprit? 351
3.3. Teaching the semantics of grammar. The treatment 354
Chapter Six: Implementing Organic Approach Deductivised: The studies 361
1. Research design 361
1.1. Research questions and hypotheses 362
1.2. Research variables 363
1.3. Research participants 363
1.4. The treatment 366
1.5. Research instruments, data and NS/NNS calculation procedures 368
1.5.1. The test format and the data collection procedures 368
1.5.2. The NS/NNS gap calculation procedure: distance in multidimensional spaces 371
2. Research implementation 373
2.1. Research chronology 373
2.2. Studies 1 and 2 374
2.2.1. Pre-test results 374
2.2.2. Post-test results 379
2.2.3. Delayed post-test results 385
2.3. Study 3 403
3. Organic Approach Deductivised: a discussion 416
3.1. Analysis of test results 416
3.2. Conclusions 425
FOCUS ON THE SEMANTICS OF GRAMMAR: CONCLUSIONS AND IMPLICATIONS 429
BIBLIOGRAPHY 439
APPENDIX 1 477
APPENDIX 2 483
APPENDIX 3 499

Citation preview

33

www.peterlang.de

PSEL 33-Turula-260749-HCA5-VH.indd 1

ISBN 978-3-631-60749-7

HKS 65

Anna Turula Anna Turula · Form-Focused Instruction and the Advanced Language Learner

Anna Turula holds a PhD in applied linguistics. She has more than 20 years experience in teaching English as a foreign language and has worked in TEFL teacher training for more than 10 years. At present, she is head of the Department of Modern Languages and Literature and a teacher trainer at the College of Foreign Languages, Czestochowa (Poland).

Polish Studies in English Language and Literature Edited by Jacek Fisiak

Lang

Form-Focused Instruction and the Advanced Language Learner looks at the role of FFI at higher levels of foreign language learning. It argues that – contrary to a common belief – there are aspects of grammar to be taught to the proficient FL user. While such a learner may be familiar with formal properties of different structures, (s)he still needs to focus on the semantics of target language grammar. Considering this, the book investigates the efficiency of a FFI treatment called Organic Approach Deductivised or 3-D language pedagogy, devised to teach the semantics of the English tense-and-aspect system to the advanced EFL learner. In doing so, the book takes the reader through different aspects of focus on form, looking at the semantics of the English time talk from the cognitive perspective.

Form-Focused Instruction and the Advanced Language Learner On the Importance of the Semantics of Grammar

33 PETER LANG

Internationaler Verlag der Wissenschaft en

09.06.11 22:36:14 Uhr

33

Anna Turula

L

www.peterlang.de

HKS 65

Anna Turula · Form-Focused Instruction and the Advanced Language Learner

Anna Turula holds a PhD in applied linguistics. She has more than 20 years experience in teaching English as a foreign language and has worked in TEFL teacher training for more than 10 years. At present, she is head of the Department of Modern Languages and Literature and a teacher trainer at the College of Foreign Languages, Czestochowa (Poland).

PSEL 33-Turula-260749-HCA5-VH.indd 1

Polish Studies in English Language and Literature Edited by Jacek Fisiak

ang

Form-Focused Instruction and the Advanced Language Learner looks at the role of FFI at higher levels of foreign language learning. It argues that – contrary to a common belief – there are aspects of grammar to be taught to the proficient FL user. While such a learner may be familiar with formal properties of different structures, (s)he still needs to focus on the semantics of target language grammar. Considering this, the book investigates the efficiency of a FFI treatment called Organic Approach Deductivised or 3-D language pedagogy, devised to teach the semantics of the English tense-and-aspect system to the advanced EFL learner. In doing so, the book takes the reader through different aspects of focus on form, looking at the semantics of the English time talk from the cognitive perspective.

Form-Focused Instruction and the Advanced Language Learner On the Importance of the Semantics of Grammar

33 PETER LANG

Internationaler Verlag der Wissenschaften

09.06.11 22:36:14 Uhr

Form-Focused Instruction and the Advanced Language Learner

Polish Studies in English Language and Literature Edited by Jacek Fisiak Advisory Board: Janusz Arabski (Katowice) Arleta Adamska-Sałaciak (Poznań) Grażyna Bystydzieńska (Warsaw) Maria Dakowska (Warsaw) Roman Kalisz (Gdańsk) Henryk Kardela (Lublin) Wiesław Krajka (Lublin) Tomasz Krzeszowski (Warsaw) Barbara Lewandowska-Tomaszczyk (Łódź) Jerzy Limon (Gdańsk) Michał Post (Wrocław) Stanisław Puppel (Poznań) Liliana Sikorska (Poznań) Tadeusz Sławek (Katowice) Aleksander Szwedek (Warsaw) Jerzy Wełna (Warsaw)

Vol. 33

PETER LANG

Frankfurt am Main · Berlin · Bern · Bruxelles · New York · Oxford · Wien

Anna Turula

Form-Focused Instruction and the Advanced Language Learner On the Importance of the Semantics of Grammar

PETER LANG

Internationaler Verlag der Wissenschaften

Bibliographic Information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the internet at http://dnb.d-nb.de.

The publication was financially supported by the College of Foreign Languages Czestochowa.

ISBN 978-3-653-01876-9 (eBook) ISSN 1436-7513 ISBN 978-3-631-60749-7 © Peter Lang GmbH Internationaler Verlag der Wissenschaften Frankfurt am Main 2011 All rights reserved. All parts of this publication are protected by copyright. Any utilisation outside the strict limits of the copyright law, without the permission of the publisher, is forbidden and liable to prosecution. This applies in particular to reproductions, translations, microfilming, and storage and processing in electronic retrieval systems. www.peterlang.de

mojemu Mowi, który musia chodzi na palcach i nie móg oddycha (ale wytrzyma) moim Synom, których problemy musiay czeka, a skocz zdanie moim Rodzicom, do których dzwoniam zdecydowanie za rzadko i moim Przyjacioom, którzy myleli, e o nich zapomniaam

@

Contents Acknowledgements .......................................................................................... 13 Introduction ...................................................................................................... 15 PART I: FOCUS ON FORM .......................................................................... 19 Chapter One: On the importance of noticing: attention .............................. 21 1. Selectivity of attention ................................................................................... 1.1. Attentional selectivity: main research paths to date ............................... 1.1.1. Focusing in controlled vs. automated processing ........................ 1.1.2. Extensive and intensive attention ................................................ 1.2. Working memory and attention .............................................................. 2. The problem of consciousness ....................................................................... 2.1. The easy problems of consciousness ...................................................... 2.2. The hard problem of consciousness – Chalmers’ dualism vs. Dennett’s physicalism ............................................................................ 2.3. Dual Process Theories ............................................................................. 2.3.1. Epstein’s Cognitive-Experiential Self-Theory ............................. 2.3.2. Bargh and Wegner’s automatic processing ................................. 2.3.3. Chaiken’s heuristic and systematic information processing ........ 3. Attention and language learning .................................................................... 3.1. Attentional selectivity in language processing ....................................... 3.1.1. Selectivity in language processing. Treisman revisited ............... 3.1.2. Attentional selectivity in language learning. The constraints on the process ..................................................... 3.1.2.1. Selectivity in input processing. Form/meaning trade-off effects .............................................................. 3.1.2.2. Selectivity in output processing ..................................... 3.2. On noticing. The problem of conscious awareness ................................ 3.2.1. The Noticing Hypothesis – strong and weak versions ................. 3.2.2. The levels of conscious awareness .............................................. 3.2.2.1. Awareness on the level of noticing ................................ 3.2.2.2. Awareness on the level of understanding ....................... 3.2.2.3. The strong version of the Noticing Hypothesis. A discussion .................................................................. 3.2.4. Constraints on noticing ................................................................

22 22 27 29 32 38 40 47 51 51 53 53 54 54 54 63 64 69 73 74 76 76 81 84 90

Chapter Two: Language instruction: Theoretical underpinnings and practical options .............................................................. 97 1. Implicit and explicit learning ........................................................................ 100

8

Contents

1.1. Implicit learning .................................................................................... 1.1.1. Defining the implicit. The terminological hoax .......................... 1.1.2. Implicit learning paradigms ........................................................ 1.1.3. Implicit learning and educating intuition .................................... 1.2. Explicit learning .................................................................................... 1.2.1. Defining the explicit ................................................................... 1.2.2. Explicit learning paradigms and models ..................................... 2. Implicit and explicit learning of languages ................................................... 2.1. Implicit language learning ..................................................................... 2.1.1. Deductive implicit models: the Identity Hypothesis and UG-based approaches .......................................................... 2.1.2. Inductive-implicit models: the Interlanguage Theory, the Monitor Model and the Competition Model ........................ 2.1.3. Acquisition by Processing – the APT model .............................. 2.2. Explicit language learning ..................................................................... 2.2.1. Input Processing (IP) .................................................................. 2.2.2. Output/interaction-based approaches .......................................... 2.2.3. Language as a cognitive skill ...................................................... 2.2.4. Theoretical underpinnings of instructed learning ....................... 3. Focus-on form: pedagogical options ............................................................. 3.1. Comprehensible-input language pedagogy ........................................... 3.2. Input enhancement ................................................................................ 3.3. Consciousness Raising .......................................................................... 3.4. Explicit focus on form: the experiential classroom ............................... 3.5. Explicit focus-on-form – direct instruction ........................................... 4. What kind of focus on form? Learner individual differences and object-of-study constraints ........................................................................... 4.1. Focus on form or focus on forms? ......................................................... 4.2. Learner constraints on form-focused instruction. Individual differences ............................................................................................. 4.2.1. Language aptitude and working memory as individual differences .................................................................................. 4.2.2. Learning styles and instruction ................................................... 4.3. Object-of-study constraints ...................................................................

100 101 106 113 115 115 119 126 127 127 130 137 140 140 142 144 146 152 153 155 158 167 171 177 177 182 182 186 190

PART II: FOCUS ON FORM ........................................................................ 195 Chapter Three: The meaning of meaning .................................................... 199 1. Introduction to meaning. The cognitive perspective .................................... 199 1.1. Prototype semantics ............................................................................... 200 1.2. Conceptual and spatial semantics; Jackendoff’s model (2002, 2004) .......................................................................................... 202

Contents

2. Holistic semantic models .............................................................................. 2.1. Mental spaces ........................................................................................ 2.2. Frames and cultural models ................................................................... 2.2.1. Cognitive frames ......................................................................... 2.2.2. Interactional frames .................................................................... 2.3. Scripts .................................................................................................... 3. Holistic semantic models: the intra- and inter-space processes .................... 3.1. Mappings ............................................................................................... 3.2. Perspective ............................................................................................. 3.2.1. Perspective in the scope of predication ....................................... 3.2.2. Perspective in discourse .............................................................. 3.3. Profile .................................................................................................... 3.4. Active zones .......................................................................................... 4. Conceptual Integration Theory ..................................................................... 5. Dynamic aspects of online meaning construction: frame shifting and conceptual blending .................................................................. 5.1. Conceptual blending .............................................................................. 5.2. Frame shifting ........................................................................................

9

208 208 213 217 218 221 225 225 226 227 231 232 234 237 239 240 246

Chapter Four: Focus on the semantics of the English tense-aspect system ............................................................... 251 1. Tense and aspect – the preliminaries ............................................................ 1.1. Tense ..................................................................................................... 1.1.1. The timeline and the basic temporal concepts ............................ 1.1.2. The controversies ........................................................................ 1.1.3. Tense as a grammatical category ................................................ 1.2. Aspect .................................................................................................... 1.2.1. Perfective vs. imperfective ......................................................... 1.2.2. The perfect aspect ....................................................................... 1.2.3. Lexical aspect ............................................................................. 1.3. Tense and aspect. The grams ................................................................. 2. The English tense and aspect as grammatical categories. The prototype effects .................................................................................... 2.1. The past tense ........................................................................................ 2.2. The present tense ................................................................................... 2.3. The future tense ..................................................................................... 2.4. The progressive and the perfect aspects ................................................ 2.5. Prototypicality effects in time orientation ............................................. 3. Perspectivisation of the English tense and aspect ......................................... 3.1. Tense as deixis: the present tense hoax ................................................. 3.2. Simple totalities vs. complex partialities: the optimality/egocentricity issue ...............................................................

252 253 253 257 261 261 261 263 264 269 273 273 276 281 283 288 291 292 296

10

Contents

3.3. What the perfects have in common ....................................................... 3.4. Tense as a reference point construction: the perspective of mental space architecture .................................................................................. 3.5. Temporal and aspectual profiles; the notion of re-profiling .................. 4. The CIT perspective on the semantics of English temporal expressions .....

298 300 303 308

PART III: FOCUS ON FORM-MEANING CONNECTIONS IN THE ENGLISH TENSE-ASPECT SYSTEM ........................................................ 313 Chapter Five: Organic Approach Deductivised: Towards a research design .................................................... 315 1. Form-meaning mappings in the tense-aspect system. Research to date ....... 1.1. Contextualising the OAD studies .......................................................... 1.2. Form-to-function studies of the acquisition of the temporal system ..... 1.2.1. The acquisitional sequence studies ............................................. 1.2.2. Aspect Hypothesis (AH) studies ................................................. 1.2.3. Discourse Hypothesis studies. Bardovi-Harlig’s (1998, 2000) aspect-and-narrative approach .................................................... 1.3. The focus-on-form perspective on time-talk acquisition ...................... 2. Linguistic perspectives on form-focus pedagogy. Construction grammar as an option ................................................................................................... 2.1. Grammar theories to date and their impact on pedagogical grammar .. 2.2. The Lexical Approach – potential shortcomings .................................. 2.3. Construction Grammar (CxG) – an overview ....................................... 2.4. Construction Grammar as the theoretical backbone of grammar pedagogy ............................................................................................... 3. Organic Approach Deductivised .................................................................. 3.1. The advanced language learner. Indentifying the problem ................... 3.2. Form-form and form-meaning discrepancy in the advanced language classroom ............................................................................... 3.2.1. The deductive, linear approach: the main culprit? ...................... 3.2.2. The organic, input-based approaches: another culprit? .............. 3.3. Teaching the semantics of grammar. The treatment .............................

315 316 317 317 320 323 325 330 331 337 341 342 345 346 347 348 351 354

Chapter Six: Implementing Organic Approach Deductivised: The studies ................................................................................ 361 1. Research design ............................................................................................. 1.1. Research questions and hypotheses ....................................................... 1.2. Research variables ................................................................................. 1.3. Research participants ............................................................................. 1.4. The treatment .........................................................................................

361 362 363 363 366

Contents

1.5. Research instruments, data and NS/NNS calculation procedures ......... 1.5.1. The test format and the data collection procedures .................... 1.5.2. The NS/NNS gap calculation procedure: distance in multidimensional spaces .......................................... 2. Research implementation .............................................................................. 2.1. Research chronology ............................................................................. 2.2. Studies 1 and 2 ...................................................................................... 2.2.1. Pre-test results ............................................................................. 2.2.2. Post-test results ........................................................................... 2.2.3. Delayed post-test results ............................................................. 2.3. Study 3 ................................................................................................... 3. Organic Approach Deductivised: a discussion.............................................. 3.1. Analysis of test results ........................................................................... 3.2. Conclusions ...........................................................................................

11

368 368 371 373 373 374 374 379 385 403 416 416 425

FOCUS ON THE SEMANTICS OF GRAMMAR: CONCLUSIONS AND IMPLICATIONS ..................................................... 429 BIBLIOGRAPHY............................................................................................ 439 APPENDIX 1.................................................................................................... 477 APPENDIX 2.................................................................................................... 483 APPENDIX 3.................................................................................................... 499

ACKNOWLEDGEMENTS I would like to thank all of those without whom the present research effort could have never been implemented and brought to a successful completion. First and foremost, I want to express my gratitude to my students from the E1, E2 and C groups for taking part in all of the studies, with a special thank you to the two experimental groups for their willingness to cooperate in a study project that went beyond their curricular requirements. I would also like to thank all of the native speakers who participated in the interviews and provided grammaticality judgments. They were (in the alphabetical order) Richard Flack, Carl Humphries, Tom Kay, James Lambie, Robert Lawler, Tom Linton, Jonathan Lowe, Lindsay Pinnell, Eva Piotrowska, Russell Preston, Hannah Richardson, Michele Simmons, Yahya Tamezouit, Scott Tattam and Christopher Walker. In addition to this, my most sincere appreciation is offered to Michele Simmons for proofreading my work and making sure it sounds decently English. Finally, a word of thank you to fellow academics who supported me in my research: to Bogush Bierwiaczonek for his inspiration and guidance; to Ewa Piechurska-Kuciel and Joanna Nijakowska for encouraging my efforts in statistics and for all the patience they showed me; and to Marek Oziewicz and Roman Ociepa for their help concerning various research practicalities. I would also like to express my gratitude to my reviewers, professor Kamila Turewicz and professor Mirosaw Pawlak, for all their comments, which helped me improve the book. Last but not least, a special appreciation to my friend Boena Pawowska for all of the warm support, encouragement and the shoulder to cry on she was always ready to offer.

INTRODUCTION “There is no grammar to teach to the advanced learner,” is a frequently made assertion based on the understanding that at higher levels of language proficiency formal instruction is reduced to mere recycling of what has already been successfully acquired. However, while it is true that the learned structures actually get automatised with exposure and practice, their retrieval and application becoming increasingly fluent, many non-native (NNS) advanced target language users have the feeling that their choices as regards form/context connection are very much unlike those of a native speaker (NS). This feeling is reinforced by exposure to authentic target language discourse in which numerous examples of use seem to contradict the automatised form-function mappings learned from respectable tutors and grammar books over the years of target language education. Such a perceivable NS/NNS gap is highly frustrating for the proficient language user, who expects to be near-native-like in every language area, grammar included. This feeling of dissatisfaction is further strengthened by the fact that an effective solution to the problem does not seem to be offered by the traditional grammar instruction. The research described in the present work, rooted in the above-described frustration, was undertaken in search of appropriate grammar instruction, instruction that could help to level the said NS/NNS gap in the advanced language learner. Considering the fact that in many examples of use the difference amounts to native speakers going beyond the rules to which non-natives have been taught to adhere, it has to be hypothesised that, to bring desirable results, the pedagogy at issue has to exceed purely formal tuition and enter into the sphere of bending the forms towards the meaning intended by the speaker. Such a domain, in which grammatical forms subserve the speaker’s intentions, is the semantics of grammar and it – contrary to the popular belief quoted above – constitutes an aspect of grammar to be taught to the advanced language learner. The three studies presented in the present work were carried out with the intention of confirming this claim; they also sought the answer to the question of how teaching grammar semantics can be implemented effectively. To be precise, it was investigated to what extent form-focused instruction can be applied to the successful teaching of form-meaning mappings at higher levels of language proficiency. In addition to this, the studies looked at the quality and the dynamics of such change. Paving the way for the research chapters of the book are two theoretical parts, in which focus on form is analysed from two perspectives: psychological and psycholinguistic (Part 1) as well as linguistic (Part 2). Part 1, entitled FOCUS on form, deals with two aspects of the uppercased focus: cognitive processing with special regard to attention (Chapter 1) as well as pedagogic procedures (Chapter 2) known as focus on form or form-focused

16

Introduction

instruction. Chapter 1 looks at attention through its two main characteristics, selectivity and conscious awareness. While the pedagogic importance of both is fully acknowledged based on relevant research in both psychology and language acquisition, the chapter argues for a fuller recognition of the importance of global attention and the role of f-conscious processes in learning. In doing so, it examines models of attention in relation to working memory, the latter seen as spreading activation patterns in the LTM store. Some constraints on attention, with special regard to frequency and reliability are discussed as well. Hinged on the global/focal, f-conscious/conscious distinction – discussed in Chapter 1 and treated as extremes on a certain continuum rather than oppositions – Chapter 2 takes the reader through language acquisition/learning approaches, starting, respectively, from the more non-interventionist and moving towards the more teacher-fronted and explicit ones. Alongside the different instructional modes, the two types of learning – implicit and explicit – together with the properties of the resulting knowledge are discussed, with special regard to the problem of interface between the two and their mutual, manifold interplay, which, it seems, have to be taken into account in any attempt at effective language pedagogy. Part 2 – under the title of Focus on FORM – is an attempt to linguistically delineate the object of the instruction whose pedagogic bases were discussed in Part 1. Since the focus is actually on the meaning of form, the area of investigation is grammar semantics. That is why, first of all, Part 2 looks at the broad meaning of meaning, describing it as encyclopeadic rather than dictionary. Such meaning is best interpreted within a conceptual architecture whose structure is built by various linguistic means and is filled in by holistic semantic models, such as frames or scripts, and in relation to the processes these models are subject to, including frame shifting and conceptual blending. Chapter 3, which describes this, creates the background for Chapter 4, in which all of the said theoretical prerequisites are applied to the area of language learning that is of special interest to the present work: the English tense and aspect or the so-called time talk. It has to be admitted that in both these parts the book, whose main focus is foreign language pedagogy, trespasses on foreign territory: psychology and theoretical linguistics. This, however, seemed to be a necessary step to take as one cannot address the problem of teaching grammar semantics without analysing the cognitive process of attention which is responsible for learning as well as delineating the object of such instruction by means of defining semantics per se. In both fields – psychology and linguistics – the defining word is cognitive. This is why, even in the absence of overt reference, the reflections presented in parts 1 and 2 are to a considerable degree motivated by the findings of Bruner (1956 and later works), Damasio (1994, 1999), Pinker (1994 and later works) as well as (Wierzbicka 1988, to whose publication I refer in the subtitle of my book), Langacker (1987 and later works), Jackendoff (1975 and later works), Radden and Dirven (2007) and many others.

Introduction

17

The pedagogical solutions as regards form-focus on the semantics of the English temporal expressions are presented in Part 3. Its first chapter, Chapter 5, which deals with the research design, starts with a description of research to date into the acquisition of the English tense and aspect. The historical perspective, in addition to offering the diachronic view on the area studied as well as relevant research contexts, helps to pinpoint the issues which need to be addressed in any attempt to design an experimental treatment: the linguistic framework within which the pedagogical intervention can be implemented; and the learner specificity of the treatment, with special regard to ID-determined needs and potential areas of difficulty. Since it is proposed that time talk constitutes a fullstructure system rather than a random collection of form-meaning connections, construction grammar with its form-meaning pairings and inheritance hierarchies is proposed as a linguistic point of reference. When it comes to the learner, the proficient student, a highly neglected species, and their needs are in focus. It is speculated that since the gap between such near-native users and native speakers of the target language is caused by conceptual transfer as a result of which certain form-meaning pairings are subject to native-language bias, the necessary pedagogic intervention needs to provide opportunities for conceptual refocusing. Chapter 5 concludes with a presentation of a relevant treatment called the Organic Approach Deductivised or 3-D grammar pedagogy, which hopefully fulfils the above requirement. The said treatment was put to the test in three experimental studies whose implementation, results, data analysis, conclusions and implications are presented in Chapter 6. However, what needs to be pointed out is that these studies will be accommodated within the research tradition delineated in Chapter 5 in two more ways. First of all, in spite of the fact that the final results of the Organic Approach Deductivised will be given and discussed, the main focus of the present work, as in the case of other time talk studies (cf. Bardovi-Harlig 2000), is going be on the cognitive processes accompanying the implementation of the treatment rather than the product of this treatment; on emergence rather than acquisition. Secondly – and very much in relation to the assertion above – the three studies presented in Chapter 6, like the majority of the said research to date, will be small-scale, qualitative rather than quantitative, and based on data collected from a small number of testees through tests and verbal reports. The decisions underlying the research design have been made in the belief that while large-format research efforts provide us with invaluable statistically significant data, small experimental studies are a worthy supplement offering insights into phenomena which would otherwise remain uncovered. The final part of the book offers a number of conclusions and teaching implications based on the research presented in Part III.

PART I: FOCUS ON FORM This part of the book deals with focus on form emphasising the focus part of this pedagogical phenomenon. This is why it seems important to clarify what focus may mean. There are two educationally sound interpretational possibilities: focusing on the part of the learner, which amounts to this learner paying attention to what is supposed to be learned; or focusing understood as instruction offered to this learner. The above-mentioned dual interpretation of focus is represented by Chapters 1 and 2: the former looks at attention, its two main aspects – selectivity and conscious awareness – and tries to relate the two to learning in general and foreign language education in particular; the second chapter, in turn, examines instruction in foreign language learning. All these considerations are aimed at answering the question of whether to learn a foreign language the learner needs to consciously pay focal attention to the object of study; a question that cannot be ignored when reflecting upon one of the most important postulates related to the role of attention in language learning: Schmidt’s Noticing Hypothesis (Schmidt 1990 and later works) in its strong version in the light of which allocating attentional resources fully awares and in the spotlight fashion is condition sine qua non of learning. Although – based on common sense as well as research to date (cited in both chapters of Part I) – it is difficult to deny the importance of conscious, selective attention in language learning, the present book argues for a middle-of-the-road position, closer to the weaker version of the Noticing Hypothesis, in accordance with which intensive attention is mandatory in some cases and merely facilitative in others. Such an understanding of learning can accommodate both intensive focal as well as extensive, global attention with its different pools activated with varying degrees of awareness. All this means that in learning selective and conscious attention needs to be paid in the spotlight manner but there are a number of reservations to be kept in mind. First of all, noticing does not always happen in a 01 fashion: as in the case of real spotlights, there are degrees of light and shade, closer to the centre of vision – understood both literally and metaphorically – or neighbouring on its peripheries. In language learning this will mean that while paying focal attention to certain aspects of language, the learner is likely to simultaneously record other features present in the input. Secondly, just as the spotlight can be moved from place to place, the learner can focus and refocus. Such attention shifts are either caused by conscious registration of incoming stimuli or induced without the subject’s full awareness. Similarly in language learning, one will have mastered certain areas of grammar due to clues present in the input (called input enhancement) or deductive teacher-fronted instruction; yet, one may as well spot them owing to the considerable frequency of certain structures in the language material that is encountered. In the latter case, while the very act of noticing may be

20

Part I

conscious, the paths leading to the final illumination will most certainly go through territories of the unconscious, outside the main stage of events (as in Baars’ Global Theatre metaphor, discussed thoroughly in Chapter 1). In fact, as will be shown in this part of the book, in human processing of input the selective, the linear and the conscious are in constant interplay with the global, the parallel and the unconscious, as argued by dual-process theories. Such interactive metal operations are also the case in language learning, as will be shown based on multiple examples of phonological, morphological, syntactic and semantic processing. All this is argued for in Chapter 1 in relation to both the native tongue and the second/foreign language in late bilinguals. It is also made clear that, selective and conscious though it has to be at lower levels, language processing becomes more parallel, conscious/unconscious as the learner develops and automatises their language skills. Such dual-process specificity of FOCUS ON FORM is also subscribed to throughout Chapter 2, which looks at how the teacher may draw the learner’s attention to formal aspects of language in ways that are particularly conducive to learning. Research to date (cf. Doughty and Williams 2004; Ellis 2002 and other works) shows that the best result comes if focus on form, explicit or implicit, is combined with communicative exposure to authentic language. That such a combine is effective pedagogically is hardly surprising if we remember the focal/global, intensive/extensive, conscious/unconscious processing modes, relate the former characteristic of each pair to instruction. What is also important is that these two types of processing are mutually reinforcing rather than mutually exclusive, which cannot be without its influence on the success of the said approach. At the same time, however, it will be too simplistic to assume that the focal, selective, intensive and conscious instruction will only lead to conscious declarative learning while the exposure will result in global extensive processing and subsequent not-fully-conscious intake. In classroom practice the exposure component combined with a multiplicity of teaching options will lead to different learning modes, explicit or implicit (both discussed in Chapter 2). This is the reason why the present work chooses to see language instruction as a kind of continuum of options rather than in terms of binary oppositions, interaction/instruction (Ellis 2002) or non-interventionist/interventionist (Pawlak 2006). Another reason for such an understanding of language instruction is that we cannot really divide instructional approaches into purely interventionist or noninterventionist; we rather place them on a less-to-more interventionist scale. Chapter 2 argues for such an interpretation of language teaching options, reflecting on which formal aspects of language are best taught/learned in each of the possible modes, especially in the case of the advanced language learner, who is of primary interest to the present work.

CHAPTER ONE ON THE IMPORTANCE OF NOTICING: ATTENTION By definition, attention is a cognitive process of selectively concentrating on one thing while ignoring the others. For the present book, this mental sieve is of interest in relation to language learning, with special regard to Schmidt’s Noticing Hypothesis (Schmidt 1990 and later works), in the light of which intake is the part of input on which the learner has selectively concentrated, while whatever was filtered out or ignored cannot be considered learned. However, in order to be able to critically reconsider Schmidt’s postulate and either subscribe to it or choose its weaker version, we need to look at attention as such, with special regard to its selectivity and the consciousness of the process. Both said aspects of attention are far from one-dimensional: there are a number of issues to be considered in relation to each of them. In the case of selectivity, we have to investigate its timing and properties as well as the related issues of parallel as opposed to linear one-track processing; contemplating consciousness we need to reflect upon whether such a cognitive filtering and focusing process has to involve only conscious mental states. The present chapter starts by clarifying the main concepts while looking at attention both diachronically and synchronically. Moving along the former line, Section 1 traces the interest in attention back to the 1960s, examining such attentional correlates as: processes including sensory activation as well as the problem of the selectivity of these processes; the relationship between attention and the affective domain; attention and other behavioural or cognitive processes such as simultaneous performance of different actions or memory; and finally, the question of consciousness. The section concludes on a more synchronic note, with the presentation of the contemporary views on attention, with special regard to the role of working memory in the process of cognitively filtering and registering incoming stimuli Section 2 discusses the question of consciousness, presenting different approaches to the issue from the perspective of the so-called easy and difficult problem of consciousness (Chalmers 2002). Various stances, motivated by both neuroscience and language philosophy are discussed in this part. In both of these parts the relations between attention and language learning and processing are initially considered, to be fully dwelled upon in Section 3, which relates both aspects of attention – selectivity and consciousness – firstly to the native tongue phenomena, and then to language learning, with special regard to language instruction and the afore-mentioned Noticing Hypothesis.

22

Chapter One

1. Selectivity of attention The idea of the selective intake of information is intuitively sound. No specialist psychological knowledge is necessary to realise that being constantly surrounded by numerous stimuli, we are aware of only a limited number of them. It is also common knowledge that it is impossible to be equally attentive to different simultaneously performed actions or different pieces of information processed at the same time. In such cases, selectivity is necessary to minimise processing investment in order to avoid cognitive overload and, consequently, error; or the risk of incohesion of the incoming data and the resulting inability to make a decision. Research to date has concentrated on the mechanisms of such selectivity as well as their timing. The latter refers to the attentional switch-on, the exact moment information processing changes qualitatively: from unconscious to conscious; from parallel to sequential. Analysing the evolution of views on the above-mentioned issues, Czyewska (1991) points out that the development of knowledge has been towards granting more importance to the unconscious. The following sections take the reader – chronologically – through a number of studies on attention, hopefully demonstrating Czyewska’s claim. The present work also tries to make it clear that attention is selective on both levels of processing: sensory, carried out in the primary cortex as well as representational, in the higher cortex. What is also important for learning in general – and language learning in particular – is that attentional selectivity is a factor in all three stages of the acquisition of knowledge: input, central processing and output. 1.1. Attentional selectivity: main research paths to date It was the selectivity to perceptual input that was first proposed as the defining characteristic of attention. In the light of one of such working models of attention – Broadbent’s Filter Model of Selective Attention (Broadbent 1954, 1958; in Czyewska 1991, Kolaczyk 1992 and Eysenck 2004) – a virtually unlimited number of sensory stimuli, received via different modalities, enter the module of very short term memory (VSTM), where they are subjected to unconscious parallel processing. It is in VSTM that the process of selection takes place: based solely on the sensory analysis, only the sensory input of significant perceptual salience is filtered for further processing. This is also the moment of a major qualitative change: once filtered, the so-far unconsciously processed input is subject to mental operations that are conscious and sequential. Broadbent justifies his theory based on two experiments: his own split-span number experiment (Broadbent 1954; in Czyewska 1991, Kolaczyk 1992 and Eysenck 2004) and Cherry’s two-texts experiment (Cherry 1953 cited in Broadbent 1958). In both experiments, the testees were subjected to a dichotic listening

On the importance of noticing: attention

23

task, in which each ear was a separate channel for one of two different inputs presented simultaneously. In Broadbent’s number experiment, the testees had two lists of numbers presented to them. While the attending ear received numbers such as 1, 2 and 3, numbers 4, 5, and 6 were played to the unattending ear. Immediately after the presentation, the testees were asked to retrieve the numbers they remembered. Even though the numbers were presented in pairs – 1-4, 2-5 and 3-6 – the retrieved sequence was always 1, 2, 3, 4, 5 and 6. In Cherry’s experiment in turn, the listening material contained a press article read by a man (the attending ear) and another text read by a different speaker (the other ear). Both texts differed in their middle part and the modifications in the material presented to the unattending ear included: the introduction of another speaker (a woman); playing the text backwards; or replacing individual words with tones. The testees were asked to concentrate on the listening material played to the attending ear. After the experiment it turned out that the text presented to the unattending reception channel had not been processed semantically: while the testees were almost certain the presentation contained actual human speech, they were unsure about the language of the presentation and totally unaware of its content. What was registered in the to-be-ignored passage, however, were the change of the speaker (man/woman) and the series of tones replacing words. The results of the two experiments were, according to Broadbent (1954, 1958), demonstrative of the mechanism of selective attention and its timing. What the testees were able to retrieve and the order of the remembered elements showed that parallel processing may take place during the initial not-yet-conscious analysis of the sensory properties of the incoming stimuli. However, once the process became conscious, it became serial and dealt with a limited amount of input. This is what Broadbent called a bottleneck effect: an inflexible, definitive narrowing down of the information channel filtering in only a part of input for further conscious processing. It was the bottleneck – inflexible and definitive – understanding of the attention switch-on that aroused criticism and led to research, whose results and interpretations showed that the initial unconscious phase may be longer and more significant than Broadbent thought. Among others, Moray’s (1959) cocktail party experiment showed that while the unattending ear may not register neutral words, the channel is definitely sensitive to the listener’s name, as often is the case when people socialise at parties and are oblivious to other conversations going on around unless they spot their name mentioned in one of them. This suggests that while unconscious of the incoming stimuli, we can actually process them not only for physical properties – a voice changing from male to female; words replaced by tones (Broadbent’s 1958) – but also, at least to a certain extent, semantically. The findings were confirmed by three other studies. The first of them, a textprocessing experiment, was carried out by Gray and Wedderburn (1960). Formally similar to Broadbent’s number test, it was more sophisticated semantically: the text chosen for the experiment was broken down into two

24

Chapter One

complementary parts and played to the testees dichotically, through the attending and the unattending ears. When asked to recall the text, the participants were unable to render it verbatim or near-verbatim but, significantly, could retrieve the content of the combined message. The second experiment demonstrated unattended semantic processing of visual input (Neisser 1964): the testees who were asked to locate a given letter (e.g.: Z) were able to perform the task much faster, if the neighbouring distractor letters did not share the visual properties of the searched graphic symbol; in other words, the more salient/unique a given stimulus was, the easier it was to spot. In turn, in his 1969 study, which was a visually adapted replica of Moray’s cocktail party, Neisser demonstrated that while concentrating on a text of a specified colour, the testees retained very little from fragments in other colours which were unattended to unless they contained the name of the test participant. All this shows that the unconsciously processed material is not entirely filtered out, especially if, to use Kolaczyk’s words, it is “important to the listener or organised in a meaningful way” (1992: 81; my own translation). As was pointed out earlier, all the above-quoted studies and their findings led to a major reconsideration of the problem of the selectivity of attention. New models of selective attention were put forward, the most important of which were the ones proposed by Treisman (1960, 1964), Norman (1968), Shiffrin and Schneider (1977), Deutsch and Deutsch (1963), Posner (1980) and Eriksen and St.James (1986). All of them, as can be seen from their presentation below, saw selection as carried out much later than Broadbent had proposed, following and not preceding the semantic analysis of the input. Revising Broadbent’s idea of the nature and timing of the selective mechanisms of attention, Treisman (1960, 1964) suggested that unattended stimuli are not filtered out at the threshold of consciousness but continue to be processed extensively and may influence the conceptualisation of the attended input. This happens because, as Treisman speculated, Broadbent’s attentional bottleneck may not be as inflexible as he had claimed; nor does the final transition from unconscious parallel to conscious serial appear so early in the course of input processing (at the level of physical cues, as claimed by Broadbent 1954, 1958). According to Treisman, the process of selection is gradual and hierarchical: in the case of language input processing, it starts at the level of visual or auditory cues and proceeds through syllables, words, phrases, sentences towards the analysis of discourse. As a result, Treisman’s model accounts for attentional attenuation of unattended stimuli rather than their filtering out in the bottleneck-like, definitive fashion. What is crucial to point out here – as it will be of importance (and, consequently, revisited) later in this chapter – is the particular applicability of Treisman’s intuitions about selectivity to the role of attention in the processing of linguistic input. This is because it coincides with somewhat later studies on the

On the importance of noticing: attention

25

mental lexicon (Aitchison 1994, Singleton 2000), research into parsing as well as memory vs. rule-based processing and understanding complex sentences or sentential ambiguity (cf. Pinker 1994 and 1999, among others). All this amounts to an assertion that we process the physical input parallelly while receiving it, and not only linearly after having received it. This happens because the sensory (auditory or visual) linguistic input carries an array of various potential candidates (words, phrases) for processing meeting – or not – our initial expectations about what is going to be heard or read. That is why, in order to understand a given utterance, we need to bottleneck or attenuate this input by successfully discarding dead-ending processing routes. This happens in the course of the parallel processing of the incoming data along the three pillars – phonological, syntactic and conceptual – of Jackendoff’s (2002) parallel architecture. Within this architecture three parallelly operating processors of language as a system – phonological, syntactic and conceptual – generate phonological, syntactic and conceptual structures which interact with each other communicating via interfaces. As a result, as proposed by Jackendoff (2002) and recycled by Truscott and SharwoodSmith (2004), language input is analysed parallelly by the three subsystems, which are in constant interplay: the sensory sound/image input is reprocessed in the phonological loop; simultaneously, the conceptual analysis is carried out, and this is frequently done way ahead of the structural analysis; and the syntactic module usually overproduces candidates for the structural interpretation of the message, these candidates, needless to say, being processed parallelly. The closer we get to the final selection – phonological (or graphic), syntactic, semantic – of input, the tighter the bottleneck becomes. Going back to attentional selectivity studies, another research effort worth quoting is Shiffrin and Schneider’s (1977) experiment, which corroborated the earlier findings (Neisser 1964; the letter test) that the unattended message is attentionally registered if it contains an element standing out from its surroundings, for example a figure in a group of letters. The results point to the importance of categorisation as the filtering mechanism, the consequence of which is particularly visible if the context categories – like the figures and letters used in the experiment – are clearly homogenous and internally coherent. In this respect Shiffrin and Schneider’s model is similar to the one proposed by Deutsch and Deutsch (1963), who claim that all incoming stimuli are processed by comparison with the data retrieved from the long-term term memory, semantic processing included. All this is done subconsciously up to the moment of meaning recognition, so the categorisation belongs to the parallel, unfiltered domain of mental operations; in order to recognise what is meaningful, the attention filter weighs all of the content stimuli parallelly, selecting those of a considerably high value. Once singled out, these stimuli themselves become part of the selection process, filtering out all other, less important or less salient pieces of information. In effect, we can speak of a shadowing effect – a stimulus of a very high value has

26

Chapter One

the potential of outweighing the rest of input to the point of its total negligence. In the case of visual input studies such a phenomenon is known as the Stroop effect and is manifested by the testees’ inability to semantically process a word like red if the word itself is printed in another colour (Stroop 1935; cf. also Deutsch Lezak et al. 2004). In language-related areas, the studies of Shiffrin and Schneider as well as by Deutsch and Deutsch conjure up issues such as the role of working memory as the worktable for language processing; the interplay between the declarative and the procedural memory (cf. Anderson’s ACT model; Anderson 1980 and later works); or the gestalt appraisal of linguistic data which is carried out prior to the analytic processing of this data. Phenomena similar to the Stroop effect, in turn, can be identified in the case of clutter on the working memory worktable, where stronger stimuli overshadow and off-focus the weaker ones. Such strong attention attractors are, for example, emotions – including language anxiety – that are the winners among equals, the brain being the captive audience of the body (Damasio 1994: 159-160). All the issues will be considered in detail later in this part of the book. Finally, we look at two models – Posner’s (1980) and Eriksen and St. James’s (1986) – together, as the two see attentional selectivity in terms of similar metaphors. According to Posner (1980), attention is best understood based on the spotlight metaphor: looking at a certain scene we perceive a very small area very clearly while the rest of the scene remains in a kind of shadow or darkness. As our motivation or task demands change, we can move the spotlight around the scene bringing out its different elements and regions from the shadow/darkness. All this seems to endorse the idea of attention as flexible and its selectivity as an ongoing rather than a fixed process. What, however, is equally – if not more – important is that, according to Posner (1980), there is a phenomenon called covert attention: the ability to move the attentional spotlight without actually moving one’s sight towards the singled-out item. This, according to Posner, means that there are actually two attentional systems: endogenous, controlled by intention and involved in the processing of central cues; and exogenous, in charge of automatic attention shifts towards peripheral cues. Similarly, in the light Eriksen and St. James’s (1986) zoom lens model of attention, we notice an item and concentrate on it. However, instead of changing the focus of attention altogether, as in Posner’s model, we can increase or decrease the focal area depending on the demands of the task, considering tradeoff effects: greater focusing limits scope while broadening the scope results in blurring the vision. In other words, we can either attend to a limited area intensively or deal with a larger number of input facets in a more extensive way. What is particularly interesting about these zoom adjustments is that – unlike spotlighting, which implies a switch from the unconscious to the awares – zooming in and out is a fluctuating rather than a definitive process: it may vary

On the importance of noticing: attention

27

while we are on the task with changing demands of the task in question without the switch-on/switch-off quality of the process. This means, as Awh and Pashler (2000) argue, that even if focused – regardless of the lens adjustment – our attention can still, to a certain degree and under certain conditions, process stimuli from regions of the field of vision which are non-adjacent to the area we are currently focusing on, this phenomenon being known as split attention. This proposal is reminiscent of a claim made by Kahneman et al. (1992: 176) to the effect that: “attention to any property of an object causes even irrelevant properties of the object to be attended”. This happens – to revisit the issue of “certain conditions” mentioned above – because the division of attention is facilitated by the fact that all of the attributes belonging to the same object are related in a coherent, uniting way – they are part of one scene, field of vision or context. If we venture to reach out, as in the case of previous models, towards language processing, we can say that combining the intensive and the extensive, the flexibility and the degrees of focusing of the spotlight and the zoom lens is highly reminiscent of the instruction plus exposure combine proposed by focus-on-form pedagogy. That is why we will return to these issues where instructed learning is discussed at a greater length. In conclusion to the review of the selectivity studies, we can say that, in spite of their differences in how they conceptualise attention or their understanding of the quality and quantity of filtering mechanisms, all of the above proposals share two basic assumptions: 1. there are different kinds of paying attention, intensive and extensive, overt and covert; 2. these parallelly processed different foci of attention can be of varying intensity as our attention fluctuates and shifts. These assumptions are discussed in the following sub-sections. 1.1.1. Focusing in controlled vs. automated processing The necessity of distinguishing between controlled and automated processes was put forward in the already quoted Shiffrin and Schneider (1977). They argued that while controlled processing requires attention and does not offer a lot of focusing capacity for any competing input, there are no such limitations on automated processes, which are performed without attention. This, however, means that automated actions are rather inflexible: they are not subject to modification as control also means the potential to adjust one’s performance when on task.

28

Chapter One

A slightly different view is presented in Posner (1982): attention, or – as he also calls it – the central executive (cf. Baddeley and Hitch’s 1974; model of working memory discussed later in this chapter) is in charge of both controlled and automated processes. There is a kind of interplay between these two types of foci, resulting from the fact that the brain is a system of systems (cf. Damasio 1994, discussed later in this chapter). Consequently, on the one hand, the automated processes consume a part of attentional resources allocated to controlled processes (Posner 1982). On the other hand, controlled processes will be, to a certain extent, automated and unconscious. The latter is because even new tasks, which require attentional focusing, are at the same time wired to background knowledge, previous similar tasks, etc. Such factors, which may be called contextual in the broad sense of the word, will influence attention-focusing processes by automatically providing conceptual blueprints within which the controlled processes can be accommodated. This influence will be positive or negative, as proved in an experiment carried out by Kahneman and Henik (1977; for details see Eysenck 2004), in the sense of speeding up the controlled processing or inhibiting it, if what is automatically provided is a mismatch. A similar model proposing a certain gradation of automaticity and control is Norman and Shallice’s (1986) schema activation, which distinguishes between: – fully controlled actions directed by a supervisory system similar to the central executive known from Baddeley and Hitch’s (1974; see also Baddeley 1986 and 2000) working memory model. This supervisory system (Norman and Shallice 1986; see also Shallice and Burgess 1996) is activated in the case of tasks which require control: novel and incoherently structured (Kolaczyk 1992). The system is in charge of constructing new schemas to underlie task execution, monitoring those novel schemas for error and evaluating them for potential future use; – partly automated actions, without conscious control yet involving what can be called contention scheduling, which is a mechanism used to choose between two conflicting schemas competing to blueprint an action; such a selection is motivated by contextual clues; – fully automatic actions, controlled by schemas or our plans of actions, requiring hardly any conscious awareness of what is being done. In the light of all of the above considerations, we can say that while the division into automated and controlled processing is a fact, it is not decidedly bipolar, automaticity and control being gradual rather than final. It is also important to note that while automatic processes are not particularly attention-consuming, they do, to a certain extent, compete for attentional resources because both controlled and automatic processes are supervised by a kind of central executive. This supervisory attentional resource, in addition to coordinating the input, monitors it

On the importance of noticing: attention

29

for its adherence to schemas, exercising less control when familiar action plans are activated as a blueprint for the currently perceived stimuli; or more control when the input is novel, incoherent and in need of new schemas to be constructed and monitored. All this brings up the question of intensive and extensive attention. 1.1.2. Extensive and intensive attention According to Kolaczyk (1992), the intensity of attention can be understood in two ways: 1. how much we decide to concentrate on the task and, consequently, how many resources (=attention, effort; Wickens 1984) we are ready to invest; and 2. how well the object of attention is singled out from the background/context. Extensive attention, in turn, amounts to the holistic processing of data, often in a relaxed way, with paratelic – not goal-oriented – motivation towards the task (Kolaczyk 1992). Paying attention extensively to a broader area of interest – often carried out in addition to intensively concentrating on one item only – re-introduces the phenomenon of divided attention. Broadbent (1954, 1958), as mentioned earlier in this chapter, discarded the idea altogether, arguing that we should rather be speaking about shifts of (intensive) attention between different tasks. Contrary to his assumption – and in agreement with what has already been written about controlled and automated actions – it is possible to be on a number of tasks at the same time, depending on their novelty/familiarity as well as where on the noviceexpert continuum the performer can be placed. As a consequence, a novel task and/or the one which is performed by a novice will necessitate a greater intensity of attention; in turn, an expert will be able to extensively attend to a number of operations especially if they require a standard, well-known course of action. Additional factors in the area of attentional intensity will include the similarity of the parallelly processed tasks and their relative difficulty (Eysenck and Keane 2000, Eysenck 2004). In the case of task similarity, two or more actions performed together benefit from different modalities; consequently, while we can read and listen to music, simultaneous reading and watching television is considerably more demanding. When it comes to task difficulty, attention is more efficiently divided between less demanding tasks; going back to the reading/listening example, it is less problematic to listen to music while engrossed in a novel than when one is on an exam-preparation task. Summing up research into divided attention, Eysenck

30

Chapter One

(2004: 205) states that: “two dissimilar, highly practised and simple tasks can typically be performed well together, whereas two similar, novel and complex tasks cannot”. The intensity and extensity of attention can also be understood in a way proposed by Kentridge et al. (1999). Similarly to Posner’s (1980) concept of endogenous/exogenous attention described earlier in this chapter, they argued that attentional resources can be item-oriented or field-oriented. In the first case we can speak of attention as volitional and selective, amounting to focusing on a particular item or its particular feature. In the latter case, attention will have the quality of a more general vigilance, alertness or readiness to be attracted to a certain conspicuous stimulus coming from the field. A contemporary model which seems to offer an interesting combination of the elements of intensive and extensive attention is the one presented by Arvidson (2004, 2006). Based on the work of Gurwitsch (1964 cited in Arvidson 2004, 2006), rooted in phenomenology and Gestalt psychology, Arvidson’s model proposes that we perceive all incoming stimuli in their context. We are aware of this context but to a different degree than we are aware of the main focus of our attention. The immediate context, in turn, is placed within a much larger situational framework to which attention is paid only marginally. That is why, when talking about attention, Arvidson (2006: 1) proposes working from the inside of the attention circle towards the outer, more peripheral layers, the “outer shells”, as he calls them. Consequently, we will be moving from the theme, the centrally located dimension of the main focus; through the thematic context of attention, our consciousness of the immediate circumstances surrounding the object of our thematic (focal) attention; towards the very vague peripheral attention, the consciousness of the outer word halo. As demonstrated below, each layer on the attentional tripartite model operates according to different organisational principles. In order to be the focus of attention, the theme has to be singled out from the background and consolidated as a unit on the basis of the “gestalt-coherence principle” (Arvidson 2006: 3). According to this principle, everything that is noticed about a given theme – clear, vague, complete or not – contributes to the perception of the theme as central against the backgrounded context (a thematic dog in the contextual yard, in Arvidson’s example; Arvidson 2006). Central as it is, the main theme is also prone to shiftiness and jumpiness, to use Arvidson’s terminology: any part of the central focus perceived as a whole may result in us refocusing our attention. This means that the above-mentioned dog (as a whole) may be backgrounded if we notice his/her wounded leg; the dog and his/her leg will, however, remain coherently linked1. ––––––––– 1

A similar point was made earlier in this chapter following the quote from Kahneman et al. (1992: 176): “attention to any property of an object causes even irrelevant properties of the object to be attended” on condition that all the properties are related in a coherent, uniting way – they are part of one scene, field of vision or context.

On the importance of noticing: attention

31

The thematic context, in turn, is subject to relevancy rules: we notice everything that presents itself as relevant to the theme (Arvidson 2006: 5). While it is true, as mentioned above, that the theme segregates itself from the background, it remains part of it in the sense that it continues to be relevant to its surroundings as an organising factor. Arvidson (2006: 5) calls this unity by relevancy and explains it as the inner coherence of the context, whose value goes beyond just “being there”, and the theme, “a central gestalt” in the network of gestalts which are all part of the same action or essive field. The closer a given gestalt is to the theme, the more important it appears; consequently, the context as a whole has what Arvidson calls a “gradation intensity” (2006: 6), this quality of gradual transition making the context itself flexible and prone to transformation, fluctuation and shifting. Modifying the dog example slightly, we can stipulate that the yard as a context for the dog will be subject to different kinds of processing activities performed in relation to the dog itself and in relation to the dog’s aforementioned wound. In the latter case we may want to refocus our context-oriented attention looking for sharp objects which, when located, will evoke – although not equally clearly – sharp objects as a class with their properties and possible accompanying incidents. Whether, as a result, our contextual attention will transform into focal/thematic attention depends on how important the original theme – the dog – still is. Finally, there is the outer world halo, which, as Arvidson (2006: 7) proposes “is irrelevant to the theme but is presented nonetheless”. Irrelevant, however, according to Arvidson, does not mean dispensable. The halo consists of three ever-present domains: the stream of consciousness or “streaming in attending” (Arvidson 2006: 7), embodied existence and the perceptual or environing world. To refer back to the dog example, the thematic focus on the animal and its immediate context will be accompanied by the marginal awareness of the passing of time (streaming), the fact that the person watching the dog is sitting or standing (embodiment) and the fact that cars are passing outside the fence surrounding the yard (the environing world). What Arvidson points out is that the marginal consciousness has its inner/outer structure too: there is gradation of the parts of the halo adhering to the thematic context and those more remote from it. Summing it all up, in the light of the considerations of attentional selectivity, historical and contemporary presented in Section 1.1, referring to both controlled vs. automated processing as well as intensive vis à vis extensive attention, we can assume that there is, as Eysenck (2004) puts it, a grain of truth in all three theories: the central processor theory, the bottleneck theory and the divided attention theory, also called the separate pools theory. Consequently, it may be a good idea to combine the three kinds of findings into a coherent construct of attention as multifaceted processing controlled by a kind of a supervisory resource. There have been a few attempts at translating the assumptions presented above into working models, the most convincing of which is the proposal to link

32

Chapter One

attention to working memory. Two such models are presented in the following section, the one put forward by Cowan et al. (2005), and Baars’ Global Workspace Theory (Baars 1997a, 1997b and 2002; Baars and Franklin 2003 and 2007). 1.2. Working memory and attention Before presenting the theoretical models that accommodate the concept of central executive and different pools of attention based on the idea of the convergence of attention and working memory (WM), the very construct of WM proposed by Baddeley and Hitch (1974) and later amended by Baddleley (1986 and 2000) will be briefly described. As Cowan et al. (2005: 42) put it, “[w]orking memory (WM) is the set of mental processes holding limited information in a temporarily accessible state in service of cognition” (Cowan et al. 2005: 42). It was exactly this kind of approach – the “in service of cognition” idea – that led to the reconsideration of the notion of memory, originally referred to in terms of three stores: sensory, short-term and long-term memory. Baddeley and Hitch (1974), the proponents of the working memory construct, put forward a claim that a module has to exist – later often referred to as a kind of worktable (cf. Stevick 1999, among others) – which is in charge not only of storage but also of retrieval and processing. Baddeley and Hitch saw such module as tripartite, its components being: – the phonological loop, which stores information in the form of speech (=phonologically); – the visuo-spatial sketchpad, in charge of coding of the visual/spatial input; – and, most importantly, the central executive, in charge of both processing tracks, which is modality free but whose capacity is limited. The three-component model was amended by Baddeley (2000), who added a fourth element – the episodic buffer – which enables multimodal temporary storage fed by and mediating between both the two original subsystems (the loop and the sketchpad) on the one hand and long-term memory on the other. The buffer itself, however, is separate from LTM and serves as its lead-in, a kind of modelling space, working, as it seems, through new episode/known episode comparisons and categorisation, similarly to the model of Deutsch and Deutsch (1963). Unlike it, however, retrieval from the episode memory buffer proposed by Baddeley (2000) is possible based on conscious awareness. The Baddeley and Hitch’s WM model and the earlier-quoted Eysenck’s (2004) proposal for a comprehensive model of attention have a number of convergence zones. First of all, the central executive – which was also linked to attention by Posner (1982; cf. earlier in the present chapter) – fits into the concept

On the importance of noticing: attention

33

of the supervisory resource. Its limited capacity, in turn, is the most likely underlying cause for the bottleneck effect on tasks in which “rehearsal and grouping processes are prevented, allowing a clearer estimate of how many separate chunks of information the focus of attention circumscribes at once” (Cowan et al. 2005: 43). Finally, the two ancillary subsystems, the phonological loop and the visuo-spatial sketchpad explain the idea of separate pools of attention and the fact that we can simultaneously attend to two different tasks if each of them utilises a different modality subcomponent. The tendency to find parallels between working memory and attention is very strong in contemporary research, one of the bents boiling down to amending the traditional concept of WM capacity, understood in terms of storage-and-processing measures and control over them, by highlighting either storing with its speed of retrieval or processing with its scope-of-attention measures. Cowan et al. (2005) argue in favour of this, claiming that the traditional interpretation of working memory in terms of a united concept of storage and processing can be difficult because these measures may be subject to individual differences (IDs). As a result, some people may exercise better control over storing and some over processing. What is more, in addition to IDs, it is the task specificity that will be of importance to these two measures, as some tasks require more storage capacity and some – a greater processing effort. Considering the latter we have to admit that somebody may be able to perform two or more tasks simultaneously because of their high WM capacity or because of the fact that the tasks in question are well automatised, comparatively easy or engage different modalities. Based on what was written earlier about attention, we can note that both the processing-and-storage dexterity (IDs) of working memory as well as its taskrelated specificity mentioned above call for a certain attention-related module/area on the WM worktable. Such a general, amodal attention resource which is a subcomponent of WM, shared between storage and processing, if necessary, is included in the working memory model of Cowan et al. (2005). Such an attentional capacity within working memory is, according to its proponents, characterised by 4 kinds of findings (Cowan et al. 2005: 49): 1. There is a limit in the capacity of the focus of attention. 2. This limit varies between individuals. 3. Measures of this capacity are theoretically and empirically related to storageand-processing measures of WM. 4. The common variance between these measures is related to intellectual aptitude measures. As for the first assertion, Cowan (2001) in his earlier work speaks of 3-5 chunks of information that can be subject to simultaneous processing. This, as noted by Bierwiaczonek (2010; personal communication), is an assertion which is intui-

34

Chapter One

tively sound in relation to language processing. The predicate-argument structure of a sentence – the subject, the verb, its (maximum) two objects and its (optional) adverbial – is elegantly accommodated within the 3-5 chunk range, much more neatly than within the earlier-proposed (Miller 1956) short-term memory capacity for processing information, in which the magical number 7 (5-9) was put forward. In turn, the fact that both Cowan and Miller suggest range of 3-5/5-9 implies that scores in this area differ between testees (assertion 2). As a result, it seems reasonable to argue that attentional capacity is a relevant individual difference, in general and in language acquisition/learning. In the light of assertion 3, it stands to reason that the total capacity of working memory will be the result of a multiplication between the number of chunks of information that can be put together on the worktable (memory capacity) and the effectiveness with which they can be effectively manipulated (attentional capacity). The result of this multiplication will be the most satisfying if both capacities are high. It can also be said that low memory capacity and high attentional capacity and high memory capacity and low attentional capacity will lead to comparable scores, an acknowledgment which, among others, lies at the explanation of different types of intellectual aptitude (4), a part of which is language aptitude, an issue to be revisited in Section 3 of the present chapter, where attention-related phenomena are discussed in connection with language learning. Another development on Baddeley and Hitch’s model of working memory (Baddeley and Hitch 1974; Baddeley 1986 and 2000) which incorporates the idea of a certain central/supervisory resource is Baars’ Global Workspace Theory (Baars 1997a, 1997b, 2002, Baars and Franklin 2003, 2007). The model proposes a relationship between working memory and consciousness, which is motivated by Baars’ (1997a and 1997b) intuition that consciousness has to be related to the limited capacity aspects of brain: we can be consciously involved with only one flow of information. At the same time, however, consciousness creates widespread access to sources of knowledge which are mostly unconscious and whose joint capacity is practically unlimited. While the issue of consciousness per se will be returned to in Section 2, here we will concentrate on the important role to be played by attention in the Global Workspace model: to explain the conscious gateway offering global access, Baars (1997b) speaks of “the spotlight of attention” in “the theatre of consciousness”. How exactly should this “theatre of consciousness” metaphor be understood? And how important is the attentional spotlight? The answers to these questions can only be clear if based on the Global Workspace Theory (Baars 1997b), in the light of which there are five important elements of this theatre: the stage, the spotlight, the actors, the behind-the-scenes context and the audience. Based on the early model by Baddeley and Hitch (1974), Baars proposes two dimensions of the stage of working memory, corresponding respectively to the phonological loop and the visuo-spatial sketchpad: the inner speech, practically

On the importance of noticing: attention

35

ongoing and difficult to stop for more than a few seconds; and the visual component, constantly underlying our spatial processing of the world. In Baars and Franklin (2003, 2007) a third dimension is introduced. Following the alreadymentioned Baddeley’s (2000) amendment of the WM model, the episodic buffer is included in the WM stage of the consciousness theatre, a kind of filter between working memory and the long-term episodic memory. According to Baars and Franklin (2003), while the interaction between working memory and the longterm store, carried out via the inner speech and the visual component, is mostly unconscious, the episodic buffer is a conscious go-between whatever happens at the stage and the long-term memory. As the contents of the working memory stage – by which we mean the current input, processing and output – may be within the conscious grasp as well as they may fade away and come back again, Baars (1997b and later works) introduces the concept of the spotlight of attention. Working memory is described as fleeting (Baars 1997b and later works), which is the best characteristic of the come-intolight-fade-away-come-into-light-again phenomenon of spotlight in the theatre. In relation to attention it means that the attentional spotlight can be guided – both voluntarily and spontaneously – singling out, for conscious processing, different elements on the scene which Baars (1997b and later works) calls the actors, trying to get into the bright spot. As a result, on the stage of working memory there is an ongoing competition between sensations, thoughts and images that want to get into the spotlight. The more important an actor is the more actively it will compete for attentional resources. However, the theatre of consciousness – like any theatre – is not limited to the stage. There are two additional elements: the out-of-stage context, including the director of the play, and the audience. As Baars (1997) points out, a lot of attentional selection – metaphorically understood as the fleeting of the spotlight from one element on the scene to another – is unconscious and spontaneous. This is because it is motivated from behind the scenes by factors such as beliefs or past memories which make the outof-stage context. There is also the director – which is how Baars (1997) understands the executive brain functions, the goal-driven system guiding human WM – who is also behind the scenes. We rarely have access to reasons – there is scientific proof that our minds play dice (Klarreich 2001), making inexplicable decisions on the spur of the moment – which is why we have to assume that the director seated off stage is not always conscious, surprising as it may sound. Finally, the audience in the theatre metaphor stands for long-term memory and automated productions. In its above-described shape, the theatre of consciousness performs 9 functions (Baars 1997b): 1) the adaptation and learning function: the more new information we encounter the more conscious involvement is needed; 2) the definitional and contextualising function: every conscious experience is shaped by

36

Chapter One

contextual unconscious factors (like prior ideas about a given phenomenon, now out of the stage); 3) access to a self system: self is the observant and the controller of conscious experience; it adds the subjective feel to the ongoing events; 4) the prioritising and access control functions: events are consciously related to higherlevel goals (an example is smoking related to health in social campaigns); 5) the recruitment and control function: conscious goals can recruit unconscious routines for the execution of new goals; 6) the decision-making and executive function: the theatre is controlled from outside; however, an actor on the scene can incite the controller into noticing an issue and dealing with it; 7) the error detection and editing function: unconsciously detected error breaks through to consciousness; 8) the reflective and self-monitoring function: we reflect upon our own functioning through inner speech and imagery; 9) optimising the trade-off between organisation and flexibility: what is in the spotlight is consciously controlled but the spotlight itself can be moved around freely. The model presented in Baars (1997b) was confirmed by neuroimaging studies carried out in subsequent years (Baars 2002; Baars and Franklin 2003, Baars and Franklin 2007) leading, once again, to the following observations (Baars and Franklin 2003: 166-167): – the brain can be viewed as a collection of distributed specialised networks; – consciousness is associated with a global workspace in the brain – a fleeting memory capacity whose focal contents are widely distributed (broadcast) to many unconscious specialised networks; – this workspace can also integrate many competing and integrating networks; – some unconscious contents called contexts shape conscious content; – sometime such contexts work together jointly constraining conscious events; – motives and emotions can be viewed as goal contexts (in this way they belong to the central executive domain); – executive functions work as hierarchies of goal contexts. All of this is also in agreement with contemporary views on working memory, which, if referred to as a space, is no longer seen as a two/three-dimensional enclosed area (a blackboard; a desk) but is rather “a transient pattern of activation of elements within long-term memory stores” (Miyake and Shah 1999 cited in Truscott and Sharwood-Smith 2004: 3). The Global Workspace Theory model presented above is also, to a large extent, similar to somewhat earlier considerations offered in LeDoux (1996) and much earlier claims put forward by Lashley (1951). Both of these authors emphasise the fact that we have access to only a fraction of information processing in our brain, namely the result of these operations. The rest is unconscious, yet not in the Freudian, ominous, dynamic, emotionally-loaded and motivated-by-repressed-memories way but cognitively unconscious. The term

On the importance of noticing: attention

37

itself – the cognitive unconscious – was put forward by Kihlstrom (1987 but see also Kihlstrom 1984), and is simply applied to all of the workings of the human mind which are inaccessible to us. If we agree that the final result of the analysis is conscious while the analysis itself remains in the shadow of the cognitive unconscious, the comparison between this idea and the theatre of the Global Workspace with its spot-lighted stage and the rest of the theatre in darkness is well justified. In the light of what has been written so far, we can draw a number of interim conclusions about attentional selectivity. First of all, we need to emphasise that attention is subject to bottleneck-like effects. As a result, the number of information-processing tasks we can intellectually handle at the same time is not unlimited. It, however, would be far-fetched to state that attention is undivided, and that the filtering process, once in progress, is one-directional and finite. The key words in the description of attention are fleeting (Baars 1997b; see also later works) and flexible, the latter manifesting itself in attenuation of attention – which is a term preferred to filtering – its graduity and hierarchality (Treisman 1960 and 1964). As a result, we have to assume that attentional processes will be subject to fluctuation in two different senses. On the one hand, a number of information chunks will be processed parallelly, the attentional focus shifting from one to another depending on current – and constantly changing – task demands. All this happens very much along the lines of the spotlight metaphors proposed by Baars as well as – much earlier – by Posner (1980), the legitimate intuitive interpretation of the spotlight being the one of attentional control. On the other hand, attention will be on the move in terms of its spectrum, in accordance with Eriksen and St. James’s (1986) zoom lens model, transforming flexibly – again motivated by task demands – from intensive to extensive. This transitional propensity of attention is quite adequately included in Arvidson’s (2004, 2006) three-level model of attention in which the fluctuation of focus between the theme, the thematic context and the halo of the outer world is based in the internal gestalt-like coherence between the three layers of the model, which is a continuum with a number of flexible seamlessly-crossed boundaries. At the same time however, we have to remember that the gestalt nature of attentional processes has one more dimension: the very same wholeness that allows for the flexibility of these processes calls for a central supervisory unit, a kind of central executive in control of the whole process. This demand is satisfactorily fulfilled by the concept of linking attention to working memory, as proposed by Baars (1997b and later works) as well as Cowan et al. (2005). All of the above can best be summed up in the words of Eysenck, who concludes that (2004: 207-209):

38

Chapter One

– there is considerable evidence for the existence of multiple [attentional] resources; – [at the same time] there is also evidence for the central processor on the one hand and, on the other, for the bottleneck effect – frequent but not omnipresent – which amounts to serial execution of tasks in dual-task – and, as it seems, multiple-task – performance; – as a result, the amount of dual task interference depends on the extent to which two tasks share common resources; simultaneously, there is often some disruption to performance even though two tasks make use of separate pools of resources, which depends, among others, on individual differences between task performers. That is why it is important to remember about a number of constraints on attention. Those mentioned so far have been: modality, with the assertion that divided attention is possible on the condition that competing inputs come from different modalities; individual differences, especially in the area of the capacity of working memory, with the acknowledgement that such differences may be the result of varying sub-capacities in the areas of both storage and processing; the degree to which certain operations are automatised; and last but not least, where the individual performing the tasks can be placed within the novice-expert continuum. Finally, it has to be pointed out that there is one more attention-related factor whose importance is still subject to debate, namely, the role of conscious and unconscious processes, emphasised by numerous theoretical models, Baars’ Global Workspace Theory, among others. What is consciousness and how can it be understood and related to attention? Are attention and consciousness synonymous? Is their relation as straightforward as it seems on the basis of the quoted research, which generally treats attentive as conscious? All these questions are considered in Section 2.

2. The problem of consciousness According to a number of researchers, there is a difference between attention and consciousness. Baars (1997a, 1997b and later works) argues that attention is a window to consciousness, the difference between the two being a matter of selecting an item/chunk – be it a person, an object, an event, etc. – from the background (attention) and becoming aware of this event (consciousness). Eysenck and Keane (2000: 119) explain this attention/consciousness variance as the difference between looking and seeing; listening and hearing; choosing a channel on a TV and actually watching what appears on the screen. Alternatively, it will be the difference between implicit and explicit perception, to use a term utilised in Chun and Wolfe (2001).

On the importance of noticing: attention

39

In fact, there is ample research to date which sees attention and consciousness as independent. Auksztulewicz (2007) quotes a number of studies which prove that consciousness and attention are not fully convergent; we can only talk about an overlap between some attentional processes and some types of consciousness. The most interesting model of such an interplay between attention and consciousness is the one presented in Lamme (2003), who, contrary to Baars’ claim that attention is a gateway to consciousness, sees consciousness as being the gateway to both attended and unattended intakes, out of which only the first are subject to reportability2. Based on the analysis of both psychological/theoretical and neurobiological arguments, Lamme (2003) claims that conscious experiences – like attention – are selective; in neural terms, however, they are two kinds of cerebral activities. In the light of the multiplicity of consciousness-related phenomena only just referred to above, it has to be stated that discussing in detail all of the intricacies of the consciousness/attention mutual relationship is definitely beyond the scope of the present work. Yet, as has been shown in the course of the present chapter, the word conscious appears in numerous attention-related contexts. That is why, while not venturing to offer a full explication of the problem, the following subsections attempt to clarify the very notion of consciousness, so that we can specify the meaning of the very frequently used phrase of “conscious attention”. In doing so, we have to remember that defining consciousness is far from easy as there is no agreement on what consciousness is. It “poses the most baffling problems in the science of the mind. There is nothing that we know more intimately than conscious experience, but there is nothing that is harder to explain” (Chalmers 1994: 200). Additionally, there are few things as multi-faceted as consciousness. In the light of the two above-mentioned constraints, trying to pinpoint consciousness as a phenomenon seems to be mission impossible or at least not easily accomplished. Considering the problem of consciousness is likely to send us back to Leibnitz and Newton and their interest in the physics of perception as well as to inevitably bring up the homunculus argument: the question of the Cartesian theatre and the little audience watching the scene of events. The following sections of the present chapter will look at some of these issues, considering them at least to a certain extent. Finally, there seems to be a terminological problem to solve: in noticing-related debates, consciousness is used interchangeably with awareness. The question which should be resolved before a more profound discussion is started is if these two are actually synonymous. Based on the distinction between the easy and hard problems of consciousness (to be discussed later in this section) the answer is “no”: awareness ––––––––– 2

Lamme’s model will be discussed in more detail in relation to Schmidt’s Noticing Hypothesis

40

Chapter One

is the easy part of the consciousness issue: it relates to mental functions and their underlying mechanisms which are objective and physical (Chalmers 2002), such as focusing attention together with the ability to discriminate and integrate information, report our mental states etc. The label consciousness, in turn is frequently reserved for the hard-problem phenomenon, that of the subjective experience which, at least according to some researchers, goes beyond the scientifically detectable. As a result, the hard problem amounts to explaining: why human beings have phenomenal experience together with issues related to such experience including awareness of sensory input and qualia; the question of philosophical zombies; and subjectivity of experience or phenomenal natures. What should also be kept in mind is that there are different approaches to the problem: philosophical, neurobiological, linguistic and so on. The present section looks at awareness and consciousness, as well as the easy and hard problems of consciousness, respectively, examining them from two – out of many – different stances: Chalmers’ (1994, 1996, 2002) dualism and Dennett’s (Dennett 1991, 1993 and 1997; Dennett and Akins 2008) physicalism. 2.1. The easy problems of consciousness According to Chalmers (1994: 200-201), the easy problems of consciousness – also called A-consciousness (access consciousness; Block 1997) – boil down to explaining the following phenomena: – – – – – – –

the ability to discriminate, categorise, and react to environmental stimuli the integration of information by a cognitive system the reportability of mental states the ability of a system to access its own internal states the focus of attention the deliberate control of behaviour the difference between wakefulness and sleep

Consequently, being aware will amount to a number of easily differentiated states and actions, such as: reacting to a certain input in a way that might be called intentional or deliberate; having access to one’s state of mind and its dynamics; being able to verbalise the accompanying sensations to describe both the current mental state as well as possible changes resulting from the accommodation of new information in the existing architecture of knowledge, which Gut (2009: 188) calls the “solidification of thoughts in a form of ... language”; as well as being aware that one is focusing on the incoming stimuli. The key word to explaining these states and actions is control because, as Block (1997) puts it, awareness does not amount to availability of certain stimuli alone, active control of thought and behaviour is indispensible.

On the importance of noticing: attention

41

All of the above phenomena can be – and have been – explained scientifically based on studies of cognitive and neurophysiologic models of a number of mechanisms: access and reportability, which are the mechanisms in charge of retrieving information about internal states and making it available for verbal report; integration of information, responsible for which are the mechanisms which consolidate and process incoming data; etc. The models that offer explanations of these mechanisms include, among others, the following theories (based on Chalmers 1994): Dennett’s (Dennett 1991; Dennett and Akins 2008) multiple drafts theory, in the light of which numerous processes in the brain integrate into the final perception of the experienced event; the already-quoted Baars’ (1988, 2003) Global Workspace Theory of consciousness, whose main idea is that of a central processor containing consciousness; and Crick and Koch’s (1990, 1994) neurobiological theory of consciousness relating the phenomenon in question to neuronal oscillations in the cerebral cortex; the final model will be endorsed by ideas put forward by two other brain scientists, Damasio (1999) and LeDoux (2002). Dennett’s (1991 and later works) theory of multiple drafts is a model developed in response to the so-called Cartesian materialism, in the light of which there is a central consciousness centre in the brain where the results of our experience are presented to the inner self or homunculus. Dennett’s escape from the Cartesian theatre, as explicated in Dennett and Akins (2008), amounts to: – breaking up the work supposedly done by the homunculus and distributing it, temporarily and spatially, to a number of lesser but more specialised agencies in the brain; – the effects of the work not having to be re-analysed or stored in memory – all the processes involved being parallel; the final draft is the result of preparing and discarding a number of interim drafts – the impossibility of precisely timing when the human being becomes conscious of the experience. The multiple draft process was subsequently renamed in Dennett’s later works as fame in the brain (cf. 1996, among others), and seen as a kind of competition (not all can be famous), in which some subsystems in our brain are quicker in processing what we experience; other subsystems take their time, discarding their initial drafts and taking up different ones. This often results in us deciding to choose one option but actually choosing another; screaming with fear and laughing at this fear at the same time; performing an action, like changing gear while driving, only to become aware of what has happened subsequently. What is important in relation to the problem of consciousness is that whether or not some events become famous in Dennett’s understanding of the term is a matter of their ability to draw attention to incoming visual or auditory stimuli and

42

Chapter One

give prominence to concurrent content fixations in the brain3. Events demonstrating such an ability are called probes and are characterised by their subsequent recollectability and reportability. Probes are what we are aware of as “the ability to report a content is conclusive evidence of consciousness” (Dennett and Akins 2008: 4321). In turn, Baars’ (1988, 1997a and 1997b, 2003) Global Workspace Theory of consciousness – already described in some detail in the section of the present chapter devoted to models of attention related to working memory – is based on a similar idea of the integrative function of consciousness: the brain being a kind of web, “a massive parallel distributed system” (Baars 2003: 1), there is a “fleeting memory” enabling access between otherwise separate brain functions. The present chapter has already offered a comprehensive description of the theatre; what remains to be added is that Baars (2003) claims that consciousness is the primary agent in such a global access function because it acts as a gateway to a number of brain functions. According to Baars, conscious perception opens the route, among others, to working memory, unconscious perception offering much more limited processing possibilities; conscious events enable all kinds of learning: explicit, implicit, episodic and skill learning; and, finally, consciousness is behind attentional selectivity. Clarifying the idea of the above-mentioned agent consciousness active in the Global Workspace, Baars (1988, 2003) uses the theatre metaphor. Yet, contrary to the Cartesian theatre idea, we are not dealing with a homunculus consciousness but rather with “the bright spot on the stage of immediate memory” – working memory also called “extended consciousness” (Baars 2003: 7) – directed there by a spotlight of attention under central executive guidance. “The rest of the theatre is dark and unconscious” (Baars 2003: 3). Poetic as it may sound, the model is well grounded neurologically: in the case of sensory consciousness, the “bright spot” amounts to the activation of visual, auditory, etc. cortex resulting in inner imagery or speech. In this form, the input is forwarded to the “decentralised audience of expert networks sitting in the darkened theatre” (Baars 2003: 3). In such a way consciousness performs its primary function: it sets the multiple unconscious networks in motion, often in a state of competition, while coordinating and integrating their activities. All this happens under dual cerebral control of the frontal executive cortex on the one hand and areas of, as Baars (2003a: 4) calls them, automatic interrupt control: the brain stem, pain systems or the amygdale. The coordination, integration and control exercised by consciousness in the theatre – like Dennett’s fame model – are the basis of subsequent reportability of the supervised events. In other words, in the theatre of consciousness, the performance starts in the spotlight (=the gateway), continues on stage (=within awareness) and out of it (=unawares) and returns to the ––––––––– 3

Very much like Baars’ actors; cf. earlier in the present chapter

On the importance of noticing: attention

43

stage (or at least some of the mental operations do) to be reportable when spotlighted (=within consciousness again). Finally, Crick and Koch’s neurobiological theory of consciousness (Crick and Koch 1990, 1994 and 2002) is an attempt to prove that “the best approach to the problem of explaining consciousness is to concentrate on finding what is known as the neural correlates of consciousness – the processes in the brain that are most directly responsible for consciousness” (Crick and Koch 2002: 94). Such an approach amounts to treating the hard problem of consciousness– the one of the subjective experience (cf. the next subsection) – as an easy problem by breaking it down into a number of questions such as: the reason why we experience things at all; the factors underlying particular experiences; the fact that some aspects of our conscious experience are not reportable to others. Crick and Koch offer – as they put it – “an answer to the last question and a suggestion to the first two” (2002: 94): explicit neuronal representation. In order to understand the nature of such representation, Koch and Crick claim, we need to be aware that everything we perceive is represented in our brain in a semihierarchical way. The representations appear in the primary and higher cortex, respectively: implicitly in the former, as the firing neurons generate a kind of general idea, lines and edges (differences between colours as they used to be represented on black-and-white TV); and explicitly, more particularly in the latter, including all of the specificities of a given item like the perspective, point of view, specific colour or the effects of light and shadow4. Crick and Koch claim that it is the latter, explicit representations that are the actual neural correlates of the subjective experience. This proposal – with is hierarchality and mappings between primary and higher cerebral processes – is very much in accord with the model of consciousness put forward by Damasio (1999), who distinguishes between emotions, our feelings of these emotions and the sense that it is our self feeling this emotion. He argues that the first two belong to the realm of core consciousness while the latter is demonstrative of higher-reason, extended consciousness5. Core consciousness is activated on perception of a certain object as a result of which, as Damasio (1999) suggests, our brain construes a non-verbal message of how our mental state is affected by the processing of this object. In other words, as a result of a certain ––––––––– 4

5

As Crick and Koch emphasise, brain damage affecting the neurons in charge of explicit item representation – and not defective receptors in the eye – results in dysfunctions such as prosopagnosia (the inability to consciously recognise a familiar item if this item is the face of someone we know or a certain known colour). Koch (2004), who compares Damasio’s core and extended consciousness to Block’s (1997) phenomenal and acess consciousness (terms used throughout the present chapter), writes: “Core consciousness is all about here and now, while extended consciousness requires a sense of self – the self-referential aspect that, for many people, epitomizes consciousness – and of the past and the anticipated future” (2004: 15).

44

Chapter One

emotion, which is reflected in our physiological responses (such as, for example, elevated blood pressure), a feeling – which Damasio describes as a “sensory pattern” or “image” (1999: 55) is generated. Neither of these two, as pointed out by Damasio, has to be conscious. It is not until the “[c]omplex customized plans of response are formulated in conscious reason and may be executed as behaviour” (Damasio 1999: 55) that the extended consciousness is reached. The transition – or mapping – between the two levels of consciousness implies that the first-order (or proto-self; 1999: 154) feelings are transformed into memories (or become part of the conscious autobiographical self; 1999: 173). What is important is that – as in Crick and Koch’s model (2002) – the extended consciousness has to be the substrate, or correlate, of the core consciousness because, as Damasio observes, the former does not exist without the latter. This correlation between the two types of consciousness vis à vis their abovedescribed differences brings back the problem of the nature of subjective experience and its reportability. Crick and Koch’s (2002) claim that for us to be able to verbalise the subjective experience of a given item – be it somebody’s face seen at dawn or a certain colour – explicit information represented in the higher visual cortex, the correlate of the phenomenal experience registered by the primary cortex, has to be further transferred to the motor cortex. As a result, we may not be able to fully report the nature of the inner feel accompanying the perception, but, according to Crick and Koch (2002), we can certainly see and report the difference between one subjective experience and another because of the specific encoding-reencoding going on between these two cortical areas. Addressing the other two questions – why we consciously experience at all and what underlies specific experiences – Crick and Koch suggest that there are ways in which neurons that explicitly encode an item can “convey the meaning” of this item “to the rest of the brain” (2002: 95). Such neuronal ability is related to what they call “a neuron’s projective field” (2002: 95), a synaptic pattern explicitly coding a certain concept. In other words, a familiar face represented explicitly in the higher visual cortex will be linked via the white brain to a corresponding area in the motor cortex, where the name of the person is represented and ready for uttering; to auditory cortex and the sound of the person’s voice; to the emotional brain and all of the memories of this person; etc. The above-mentioned idea of consciousness as the elusive self located in the white brain is endorsed by another neurobiologist – LeDoux (2002). He argues that what makes every self unique are synapses with their unique activation patterns. In other words we are what we remember and how6 (in terms of the connection strength as well as the content of these memories). Explicating the mechanism, LeDoux (2002: 303) writes: ––––––––– 6

An idea similar to Schacter’s (1996, 2001) view of memory-motivated self.

On the importance of noticing: attention

45

We all have the same brain systems, and the number of neurons in each brain system is more or less the same in each of us as well. However, the particular way those neurons are connected is distinct, and that uniqueness, in short, is what makes us who we are.

Reviewing all the three cognitive perspectives on consciousness discussed above, it is important to point out that all of them share certain characteristics, the most important being breaking up the work done in the brain and assigning it to the subsystems parallelly processing the data. In this respect all these theories seem highly convergent with other widely accepted models of cerebral organisation and performance: Damasio’s (1994) idea of the brain as a system of systems or the brain as a parallel processor, the core concept of the PDP (Parallel Distributed Processing) theory put forward by Rumelhart and McClelland (1986). The following paragraphs look at these similarities. First of all, Dennett’s model of multiple drafts with its idea of breaking up mental work and sharing it between special cerebral agencies as well as Crick and Koch’s concept of a neuron’s projective field are reminiscent of Damasio’s idea of brain which is not “a single, contiguous map, but rather an interaction and coordination of signals in separate maps” (1994: 66), a system of systems. As Damasio puts it: We can now say with confidence that there are no single “centres” for vision, or language, or for that matter, reason or social behaviour. There are “systems” made up of several interconnected brain units … dedicated to relatively separable operations that constitute the basis of mental functions … What determines the contribution of a given brain unit to the operation of the system to which it belongs is not just the structure of the unit but also its place in the system. (1994: 15)

As a result of such compositionality of brain representations, the images we store in our memories are what Damasio (1994: 102-103) calls “dispositional representations”, neuronal firing patterns which enable a momentary reconstruction of an image – for example a memory of a certain person – based on the joined activity of the individual assemblies of neurons storing the person’s voice, their profile at dawn, their giggle, their freckled nose, etc. Such ensembles of smaller neuronal systems are called “convergence zones” (Damasio 1994). Additionally, Dennett’s multiple draft and Crick and Koch’s projection field concepts will be considerably convergent with the core claim of PDP (Rumelhart and McClelland 1986): that information about individual entities like people, objects or situations is not stored as a single memory trace but in the form of a number of interconnected units any of which can trigger the retrieval of the whole entity. There is also a perceivable analogy between the way Damasio (1994) specifies neural correlates of mind and Crick and Koch’s (1990, 1994, 2002) attempt to define consciousness via explicit representations in the higher cortex. We can speculate that what Damasio means by “the place in the system” (cf. the

46

Chapter One

quotation on the previous page; Damasio 1994: 15) is the relative importance and, consequently, activation potential of a given higher cortical area. This potential will determine which of the interconnected neural subsystems and the representations they encode will participate in the inner feel of a given perception in a given moment. Equally important, to quote Damasio once more, will be the context in which the perception takes place as “the physiological operations that we call mind … mental phenomena can be fully understood only in the context of an organism interacting in an environment” (1994: xvii). In search of other conceptual similarities, we can note that Dennett’s proposal that the final draft of input is the result of multiple drafting and redrafting on the level of individual perception and that these processes involve parallel processing goes hand in hand with PDP, in the light of which the human mind consists of a number of elementary units making up the neuronal network and all mental processes involve parallel (rather than sequential) interactions – excitatory or inhibitatory – of these units (Rumelhart and McClelland 1986). The draftingredrafting concept also agrees with what Damasio (1994) writes about brain operations in macro-scale such as recall and learning: The brain’s systems and circuits, as well as the operations they perform, depend on the patterns of connections among neurons and on the strength of synapses constituting the connections (108). Since different experiences cause synaptic strengths to vary within and across many neural systems, experience shapes the design of circuits. ... Some circuits are remodelled over and over throughout the life span, according to the changes an organism undergoes (112)

What is emphasised is the cyclicality of the process of remodelling – cf. Dennett’s redrafting – as well as the process of strengthening neuronal connections. This goes hand in hand with PDP’s claim that learning consists of strengthening connections between the interconnected units of the overall memory trace. Finally, Baars’ concept of a “fleeting memory” enabling access between otherwise separate brain functions, though of different cerebral origin, is reminiscent of Damasio’s (1994: 182-183) “convergence zones” – assemblies of neurons in the prefrontal cortices in charge of item identification based on putting together all kinds of incoming information – and their content – “dispositional representations for the appropriately categorised and unique contingencies of our life experience” (Damasio 1994: 182-183). The latter are not representations of items per se but rather means for reconstructing such representations in recall, patterns of neuronal activity in the convergence-zone assemblies. What remains to be said as a form of sum-up is that when it comes to offering solutions to the easy problems of consciousness listed at the beginning of this section – the above-mentioned theories deal with a number of them: Dennett’s multiple drafts theory addresses reportability of mental states; Baars’ (1988, 2003)

On the importance of noticing: attention

47

Global Workspace theory of consciousness, deals with the issues of information integration and reportability; finally Crick and Koch’s (1990, 1994) neurobiological theory of consciousness refers to integration – or binding – of information as well as the reportability of explicit representations . However, if we were to quote Chalmers (1994: 202), the conclusion might be that if “these phenomena were all there was to consciousness, then consciousness would not be much of a problem”. 2.2. The hard problem of consciousness – Chalmers’ dualism vs. Dennett’s physicalism What is problematic – and a bone of contention, as will be shown in the present section – is the hard problem of consciousness: the one of experience, the subjective aspect of information processing. As Chalmers (1994) puts it, we can easily explain how our cognitive apparatuses engage in information intake and processing, which means that we clarify issues such as our ability to see colours, feel tastes, and hear sounds. What posits a problem is what makes this experience subject-specific: how and why we see colours as deep or pale; how and why tastes and sounds can lead to rich inner states, etc. To take this even further, the question is of organisms being conscious of what it is like to be themselves; of these organisms’ mental states being conscious in the sense that they know what it is like to be in a certain state; and of phenomenal consciousness or qualia, which amount to the way things seem to us (cf. among other, Chalmers 1994, 1996, 2002 and Block 1997). What is additionally important is that the unique character of this kind of consciousness makes it qualitatively different from the functional/access, the easy problem type of consciousness presented in the previous section. This is exactly where the disagreement arises. To start with, the already quoted Crick and Koch’s neurobiological theory of consciousness (Crick and Koch 1990, 1994, 2002) states that there is no such thing as the hard problem of consciousness. All of the questions related to it, such as the question of subjective experience, can be explained on the basis of scientifically detectable – and “easy” – neural correlates of consciousness; a similar stance is presented in the synaptic self put forward by LeDoux (2002). The very qualia, the way things seem to us – as described in the previous section – are the result of neuronal activity – neuronal connections (LeDoux) – even though their full reportability is not possible. Similar scepticism towards consciousness as a hard problem is expressed by Dennett (Dennett 1991, 1993, 1997 Dennett and Akins 2008), who claims that qualia do not exist, or, as he puts it, if qualia are something that we know about the objects of perception, something real, they are real in no special way, because: whenever someone experiences something as being one way rather than another, this is true in virtue of some property of something happening in them at the time, but these properties are so unlike the properties traditionally imputed to consciousness that it

48

Chapter One would be grossly misleading to call any of them the long-sought qualia. Qualia are supposed to be special properties, in some hard-to-define way. My claim – which can only come into focus as we proceed – is that conscious experience has no properties that are special in any of the ways qualia have been supposed to be special. (Dennett 1993: http://www.tufts.edu/as/cogstud/papers/quinqual.htm)

By stating the above Dennett, as he puts it himself, does not deny the existence of conscious experience; he simply claims that this experience can be explained in terms of its easily defined properties. Addressing the functional/phenomenal duality of consciousness, he inclines towards a quantitative rather than qualitative (cf. Chalmers and Block earlier in this section) distinction between the two sides of the consciousness coin (Dennett 1997: 417): he suggests richness of content and degree of influence for the phenomenal/experiential and functional/access consciousness, respectively. He refuses to see these two as separate phenomena, arguing that Block himself, in his own consideration of the duality problem (Block 1997), could not satisfactorily prove that “in the normal run of things”7 (Dennett 1997: 417) these two types of consciousness can actually exist separately. Chalmers (2002) discards the above-mentioned argument based, as he claims, on a methodological flaw in Dennett’s line of reasoning and the other cognitively motivated approaches (described in Section 2.1, devoted to the easy problem of consciousness). Cognitive research methodology, as Chalmers claims, while very effective in explaining functions and their underlying mechanisms, is of little use when it comes to investigating the origins and the quality of the very “inner feel” (Chalmers 1994, 1996, 2002), the subjective experience accompanying the performance of the above-mentioned functions and related mechanisms. According to Chalmers, while Dennett’s (1991, 1993) multiple drafts theory as well as the other two already-mentioned theories of consciousness: Baars’ (1988) Global Workspace Theory and Crick and Koch’s neurobiological theory (Crick and Koch 1990; Crick 1994) offer some insights into: discrimination, categorisation and integration of input; its reportability and accessibility; as well as attention-related phenomena of selectivity and control, they leave the “the inner feel” unexplained. Theories from other fields of science – neuroscience (nonlinear dynamics and non-algorithmic processing; Penrose 1994) or quantum mechanics (consciousness ––––––––– 7

Dennett (1997) addresses Block’s (1997) argument of blindsight, in the light of which patients suffering from damages to visual cortex can still see in their minds’ eyes. Block argues that such patients have access consciousness without phenomenal consciousness; according to Dennett they just have impoverished content with a simultaneous high degree of influence of the higher cortex able to compansate for the deficiencies in the primary cortex. We may observe that a similar higher-cortex influence is exerted in the case of prosopagnosia, yet in this case it is the higher-cortex deficiency which overcomes the uncomromised intake by the primary cortex.

On the importance of noticing: attention

49

arising from quantum-physical processes taking place in neuronal protein structures; Hameroff et al. 1994) – seem equally deficient to Chalmers (1994, 1996). Finally, there are also the so-called “mysterians” (Chalmers 2002: 92), who claim consciousness can never be understood and explained. To quote Chalmers (1994: 201), the main flaw of all these attempts boils down to the fact that we know that subjective experience “arises from a physical basis, but we have no good explanation of why and how it so arises”. As for the so-far unsatisfactory solutions offered to the hard problems of consciousness, the underlying cause of their failure to pinpoint the how and why of subjectivity is, according to Chalmers (1994, 1996, 2002), their reductionism, which amounts to trying to explain consciousness in terms simpler than the phenomenon itself. Where these cognitively-based approaches seem to have failed, a non-reductionist stance is offered. The point of departure for such a theory is seeing experience as fundamental, an axiom taken for granted and, consequently, unanalysable. What follows are the principles of structural coherence and organisational invariance as well as the double-aspect theory of information whose aim is to enable the leap (Chalmers 2002) across Levine’s (1999) explanatory gap between physical processes and consciousness. All three principles are briefly discussed below, following Chalmers (1994, 1996 and 2002). The principle of structural coherence refers to a certain isomorphism between consciousness and awareness, the former representing the mysterious inner feel and the latter applicable to various functional phenomena, accessible and reportable, discussed in the section devoted to the easy problem of consciousness. In the light of this principle, the conscious experience, even if impossible to analyse, will in no way be unrelated to the cognitive representation of the incoming sensory information. As Chalmers (1994, 1996) puts it, every instance of conscious experience leaves a trace of corresponding controllable and verbalisable information in the functional system; and vice versa: every trace in the system is a proof of conscious experience, these two as if mirroring each other (Chalmers 2002). While it is true that we still do not fully understand certain properties of experience because of its intrinsic subjective nature, Chalmers (2002) claims that we can still see through it into awareness-related substrates or correlates. To put it in a simpler yet metaphoric way, we cannot catch the main culprit but we know they were at the crime scene, as we can clearly see and analyse the footprints. This logically relates to the principle of organisational invariance, which states that if two different systems exhibit the same kind of awareness-related traces – the same neural substrates; the same functional architecture – they are demonstrative of the same kind of conscious experience. Finally, there is the double-aspect theory of information which stems from the above acknowledgement of the isomorphism between awareness and consciousness, or, to use Chalmers’ (1994, 1996) terms, between the physically embodied

50

Chapter One

information spaces and experiential (phenomenal) information spaces, where structural differences between the latter are the result of differences between the former. Logically inevitable is the conclusion that information has two aspects: physical and phenomenal. These two aspects being linked and mutually dependent, it seems obvious that activation of the former implies the emergence of the latter. How convincing Chalmers’ argument is and whether his three principles, supporting the qualitative difference between the two types of consciousness, are actually significantly different from the quantitative variance stance adopted by cognitive approaches to consciousness is a problem that goes beyond the scope of the present work; as a result, the dualism-physicalism debate is not going to be resolved here. Yet, keeping in mind that any consideration of consciousness means taking into account the two levels of the emergent information space, the phenomenal and the functional, is crucial as the point of departure for two important acknowledgements. First of all, it has to be pointed out that it is the latter, functional dimension of consciousness that is of much greater importance and use to the present argument. To start with, the easy problem of consciousness is where the very issue of attention as a cognitive phenomenon can be accommodated. Since the present work is written from the cognitive perspective, such a cognitively-grounded approach to consciousness is a most natural point of departure. As a consequence, the discussion on noticing will, to a certain, manageable extent, concentrate on the subjectivity of experience and phenomenal consciousness, particularly in our discussion of Schmidt’s Noticing Hypothesis; it will, however, deal much more thoroughly with discrimination, categorisation and integration of input; accessibility and reportability of mental states; different foci of attention; and the degree of control. These latter issues are going to be discussed later in this chapter. Different foci of attention will be considered in relation to strong and weak versions of the noticing hypothesis; accessibility, reportability and control of mental states are going to come back with problems of implicit and explicit learning in Chapter 2. Finally – and most importantly – it is necessary to point out that from this point on the term consciousness will be used interchangeably with awareness, both in reference to the functional dimension of the phenomenon in question. At the same time, however, it will be kept in mind throughout the subsequent parts of the argument that learning of any kind amounts to behaviour regulation and there are two – the immediate (phenomenal) and the intermediate (functional) – systems (Obuchowski 1967, Tomaszewski 1975 and Kolaczyk 1999) that deal with such change. Kolaczyk looks at them through their functions – which are, respectively, labelled adaptive and transgressive (Obuchowski 1967; 1993) or reactive and purposive (Tomaszewski 1975) – and the type of memory they involve – procedural for the immediate behaviour regulation system and declarative for the intermediate system. What is important to remember is that the

On the importance of noticing: attention

51

first type of memory operates based on gestalts while the latter utilises analysed input data. What is more, as Kolaczyk (1999) argues, the two regulation systems operate smoothly and collaboratively for two reasons. First of all, the procedural and declarative memory systems meet in the area of working memory – an idea similar to the one put forward by Anderson’s (1980, 1982, 1983, 1993 and Anderson et al. 1997) ACT and ACT-R models8. The second link between the immediate/phenomenal and the intermediate behaviour regulation is attention with its potential to subjectify both types of experience. All this leads to two observations: it confirms, once again, the already-mentioned convergence between working memory and attention; it also indicates that any kind of learning – language learning included – will be a complex process resulting from both explicit, (self)instruction, whose realm is the analytic intermediate behaviour regulation system, as well as implicit processing of the contextual clues of the input, based on family resemblance or through accumulation of exemplars, which belongs to the immediate, phenomenal behaviour regulation. 2.3. Dual Process Theories The above assertion about learning being a complex collaborative process is particularly strongly rooted in a conceptual construct called dual-process theories. Three such theories – Epstein’s Cognitive-Experiential Self-Theory; Bargh and Wegner’s (Bargh 1989; Wegner and Bargh 1998) automatic processing; and Chaiken’s heuristic and systematic information processing (Chaiken et al. 1989; Chen and Chaiken 1999) – are briefly introduced below. They have been selected from a number of other models based on Neckar (2003). The selection criteria will be skipped as it is not particularly important, as the three models chosen are presented here for the sole purpose of providing a greater clarity of the present argument. 2.3.1. Epstein’s Cognitive-Experiential Self-Theory In the light of Epstein’s Cognitive-Experiential Self-Theory (Epstein 1990 and later works; Epstein and Pacini 1999), adaptation to the environment is carried out owing to three processing systems: the unconscious system9 as well as two others including the ––––––––– 8 9

Illustrated and described in Chapter 2, Section 1.2.1 When Epstein talks about the third, unconscious associative level, his unconsciousness – once again very much like Kihlstrom’s cognitive unconscious and Jackendoff’s f-conscious – differs from Freud’s in the sense that it is called cognitive and perceived as evolutionary, adaptive and very influential when it comes to everyday human choices. Understood in this way, such unconsciousness is quite dynamic as it is influenced by four basic needs (Epstein 1994; following Kolaczyk 1999: 19): the need for maximal pleasure; the need for cognitive system coherence; the need for belonging; and the need to overcome the feeling of inferiority and boost one’s self-esteem.

52

Chapter One

preconscious, holistic experiential system and the conscious, analytic system. The analytic system processes data sequentially and works logically and, nomen omen, analytically. The experiential system, in turn, is affect-motivated and necessary for a positive or negative appraisal of the current situation, its basic constructs being two types of schemata: descriptive and motivational. The former facilitate the processing of what entities are like; the latter determine if-then relations in a currently appraised situation [=if I work hard I’ll succeed]. The fact that it processes all data holistically makes the experiential system indispensible in more sophisticated intellectual feats like understanding and constructing prototypes, stereotypes as well as coding and decoding metaphors and narratives (Epstein and Pacini 1999). Its major drawback, on the other hand, is that the heuristic judgments it performs are far from logical, as described by Epstein and Pacini (1999) based on the so-called ratio-bias phenomenon experiments: the testees, when asked to predict the probability of a certain event decided that selecting a given 10 out of 100 stands a greater chance than selecting 1 out of 10. What is important to our line of reasoning is that these two systems – rational and experiential – operate parallelly and interact (Epstein 1990). What is of particular interest in the point made above is the fact that the three cognitive systems – particularly the experiential system and the rational system – interact and complement each other’s operations. As indicated before, the primary workings of the rational system (Epstein 1994) are on the conscious level, which takes us back to Schmidt’s noticing hypothesis and makes it perfectly legitimate that any learning which is, analytic, logical, cause-effect based, and which codes the new information in symbols will have to be conscious. What, however, is important is that there is the other system, the experiential one, whose working is primarily unconscious: its operations (Epstein 1994) are automatic, based on a holistic rather than analytic evaluation of the incoming stimuli; and it works on the basis of associations rather than by interpreting cause-effect relations. The coding of input coming from such a system is carried out in images and narratives and not in abstract symbols. As Epstein puts it, when compared to the rational system, the experiential system is crude and its workings – evaluation and reactions to the outer world – can be described as dirty. As a result, the incoming data, before being subjected to the sequential, analytic processing, is pre-consciously interpreted, and this interpretation is of value to the following conscious analytic reasoning. In addition to all this, it needs to be said that the fact that the two systems described above, the rational and the experiential, are initially conscious and pre-conscious, respectively, does not mean that we can place an equal sign between the rational and conscious as well as the experiential and pre-conscious. As Kolaczyk (1999) puts it, we can be aware of the emotions that are the result of the experiential preconscious cognition as well as unconscious of the subsequent workings of our cognitive systems on the consciously analysed input. The latter, by the way, will be the difference between phenomenal noticing (conscious) and understanding (not necessarily conscious or verbalised).

On the importance of noticing: attention

53

2.3.2. Bargh and Wegner’s automatic processing When it comes to Bargh and Wegner’s model (Bargh 1989), the automaticity of processing is defined using the following adjectives: unconscious; effortless; unintentional; unmonitored; and devoid of control even if conscious. As proposed by Wegner and Bargh (1998), there is interaction between the two types of processes – automatic and controlled – which boils down to the following relations: – these two processes can be carried out parallelly, because – as was indicated on a number of occasions throughout this chapter – they do not compete for attentional resources; – they initiate each other: an automatic process may become conscious if something unusual happens in the course of such a mental activity; on the other hand a conscious process may become automatic, if stereotypes are – intentionally or not – applied; – they can dominate each other: an automatic process may overcome a conscious process in the form of a slip of tongue known as a Freudian slip; simultaneously, we may stop automatic data processing when we consciously choose to overthrow stereotypes; finally – one can evolve into the other: a process which starts as conscious may become automatic as in the case of driving a car which we may start fully intentionally and drop into the f-conscious procedural mode with time; or, we may perform a fully automated action only to realise later the steps that have been taken. In the case of language learning, a similar evolution may be the case when declarative knowledge becomes proceduralised or the already procedural knowledge – declarativised (to be discussed more thoroughly in Chapter 2). 2.3.3. Chaiken’s heuristic and systematic information processing The last dual-process model to be discussed here is Chaiken’s heuristic and systematic information processing (Chaiken et al. 1989; Chen and Chaiken 1999), in the light of which the latter type of processing is exhaustive and analytic, while heuristic mental operations are based on simple rules and schemata or stereotypes. What makes the theory different from the previous one is that the heuristic, gestaltlike processes can be both conscious and unconscious, while Bargh and Wegner (Bargh 1989; Wegner and Bargh 1998) argue that thinking, for example, in stereotypes is automatic and thus unconscious and uncontrolled. As for the mutual relationship between the two types of processes, Chen and Chaiken (1999) put forward the additive hypothesis – the two processing systems work independently but at the same time cohesively and collaboratively influence our actions.

54

Chapter One

3. Attention and language learning. The previous two sections of the present chapter dealt with the defining aspects of attention: its selectivity and the problem of consciousness. Both these issues will now be the point of departure for a discussion of the role of attention in language learning. Consequently, section 3.1 looks at the importance of selection in language processing and acquisition, referring to the issue of consciousness, if relevant. Section 3.2, in turn, concentrates on the role of consciousness in noticing. It considers Schmidt’s Noticing Hypothesis and addresses the proponent’s claim that “storage without conscious awareness is impossible” (Schmidt 1990: 136). 3.1. Attentional selectivity in language processing In this part of Section 3 we will look at attentional selectivity as regards language related phenomena. The main focus will be on two aspects of selection: the unquestionable bottleneck effect, amounting to channelling attentional resources towards conscious noticing on the one hand and, on the other, the graduity of the process and the role of the unconscious parallel processing which underlies the final selection. These two selection-related issues will be of slightly different importance with regard to what language we are talking about: the mother tongue or a second/foreign language. 3.1.1. Selectivity in language processing. Treisman revisited Attentional selectivity, or the bottleneck effect, was related to language processing by Treisman, whose model was presented in Section 1.1. What it emphasised was that the closer we get to the final selection – phonological (or graphic), syntactic, semantic – of input, the tighter the bottleneck becomes, the whole process being gradual rather than radical and, to a considerable extent, based on unconscious mental operations carried out parallelly rather than sequentially. These aspects of selectivity – graduity and the underlying parallel unconscious processing – are visible on a number of language input processing planes: recognition and retrieval of words; selecting word senses; memory vs. rule-based processing; parsing of phrases, utterances, discourse sections; as well as both understanding and making the meaning of utterances, all discussed below. In her preferred model of mental lexicon, Aitchison (1994) proposes what she calls “interactive circuitry”, a processing schema underlying the recognition and production of words. In the course of spoken language reception, it is the auditory sensation that triggers the mechanism. On hearing a certain sound sequence, we start a parallel spreading lexical search through a number of phonologically

55

On the importance of noticing: attention

similar items in order to finally select the one word that best suits the current co(n)text. The process – potential, not mandatory, as the activation may follow a different pattern – is schematically diagrammed below for the word handsome, (Figure 1) following Aitchison’s spreading activation model for bracelet (cf. Aitchison 1994: 237). As can be seen, there is constant interaction between different elements, as the search takes place cyclically. This means that looking for the best option may occasionally require going back to options already selected but abandoned. This cyclicality is a strong proof that there is a reason the module of working memory in charge of spoken language input is called a phonological loop10. Figure 1. Parallel spreading lexical search for handsome. PERCEIVED SEQUENCE ha___

hat

SOUND

_ome

handcuff handgun

ham

hand hands hamster

HAT

HAM

handsome

hands

HAMSTER HAND HANDS

HANDS

HANDSOME HANDGUN HANDCUFF

In production, we will be dealing with a reversal of the mental activity above, as seen in Figure 2 presenting the mental circuitry for a domestic fruit (cf. Aitchison 1994: 225; meaning: a small woodland animal):

––––––––– 10

It stands to reason that patterns similar to the one diagrammed below are activated for the reception of written texts. In their case, the process described below is triggered by the visual input and the visual sketchpad of WM is the place of interactive circuitry based on graphic – rather than phonological – similarity.

56

Chapter One

Figure 2. Parallel spreading lexical search for domestic fruit. MEANING (a domestic fruit)

SOUND ampoule

APPLE

a.........le

PLUM

p.........m

PEAR

p.........r

apple pram plum pair pearl pear

As can be seen in the diagramme above, in word production meaning is selected first and then a number of words are parallelly and unconsciously activated and cohort towards the processing space only to be gradually suppressed in the course of context-based input processing. The activation of individual items, which is not marked in the diagramme above, varies in degree, depending on how frequently a certain item is found in language – and, consequently, how high its resting level of activation is – and how effectively a particular item is constrained by the current context. All this results in gradual suppression of the items which do not fit11. It may happen, though, that the frequency-motivated activation level of an item or its emotional weight overpower contextual constraints. In such a case we witness slip-of-tongue phenomena. A similar process can be observed in the activation of different senses of one polysemous word, those more frequent and, consequently, more prototypical, being activated faster and their levels of activation being higher. The process is obviously constrained by the context, which is why the final selection is the result of interplay between a cohort of senses – on differing activation levels – and the utterance which contains a given item. However, very much as in the course of acoustic processing of input described above, “highly entrenched senses are difficult to suppress regardless of how inappropriate they may be in a given context” (Dbrowska 2004: 14). Resulting from this are garden-path effects, in which the initial incorrect prototype-motivated interpretation is difficult to ––––––––– 11

All this is also highly reminiscent of Dennett’s fame model (Dennett 1991; Dennett and Akins 2008).

On the importance of noticing: attention

57

abandon, as shown by longer processing time needed for the disambiguation of such utterances. The fact that such disambiguation is possible proves that the initially abandoned item must be on a standby; it is, then – to a certain extent – processed parallelly alongside the high-frequency item. Another example of zeroing in on the intended meaning, preceded by the parallel processing of a number of available alternatives, will be seen on the level of morphology and syntax. This is because understanding various formal complexities – resulting from word polysemy and the related competition of argument structure; memory-based processing of irregular forms vs. rule based processing of regularities; syntactic ambiguity; or our interlocutor’s digressions – often requires juggling a number of possible formal and conceptual options at one time. Examples of such simultaneous processing of a number of syntactic alternatives are discussed below. The first example of parallel processing and gradual suppression is given by Pinker (1999) in relation to past forms of regular and irregular verbs. While the first are produced based on the add-the-ed rule, the latter are retrieved from memory. This means, as Pinker argues, that there are two independent systems in charge of each of the two operations. Given the fact that past terms forms are produced quite instantaneously, it will be difficult to accept the possibility, that we activate our rule-based regular-verb system only after having run a complete memory search for the irregular past form. “A more likely possibility is that,” as noted by Pinker (1999: 144), words and rules are accessed in parallel, that is, at the same time. As we plan to utter a verb in the past tense, we simultaneously look up the word in memory and activate the rule. An inhibitory link runs from the memory box to the rule box, which gradually (emphasis mine) slows down the rule as evidence for a match is found, and eventually turns it off.

Another instance, this time of a syntactic parallel processing, is related to utterance parsing. According to Pinker (1994), on hearing the beginning of an utterance we normally set up tree diagrammes underlying its syntactic organisation. In doing so, we are usually minimally ahead of the speaker, waiting for the “usual suspects” – the most predictable elements – to fall into appropriate slots on the syntactic tree. Revisiting modern models of working memory, it may be said that we open Cowan’s (Cowan 2001; Cowan et al. 2005) 3-5 slots for the prospective chunks to fall into. In other words, if the utterance is a grammatical sentence, the presence of the noun phrase branch means that the verb phrase branch should soon “grow” too. In this sense our syntactic parser operates in the parallel processing mode: while attending to input as it becomes available, it, simultaneously, deals with what we expect to hear next. This double focus is relatively uncomplicated to maintain if our expectations are fulfilled as we move on through discourse; or, if the digressions the speaker makes on his/her way through the sentence are not to difficult for our working memory to handle. Yet,

58

Chapter One

should some syntactic complications arise, we need to set up an alternative tree diagramme or amend the presently construed one. What, however, usually happens is that, while doing so, we still keep our mental eye on the old arrangements, which again requires parallel processing. This will, for example, be the case of complex NPs, where we have to attend to the intricacies of in-phrase complementation while still keeping the overall NP+VP structure on standby. A similar tree-based parsing-related parallel processing will be the case in interpreting ambiguities such as Hunting children can be cruel or We talk about sex with Jarosaw Gowin. In the case of the former, we will have two candidates for the head of the subject NP; in the latter, there will be two complement options to consider. While we may see one of the solutions and their underlying meanings first, we are still able to go back to the other option instantaneously, which suggests that even if it does not emerge simultaneously with the first, the second choice has at least to be just below the access level to be retrieved so quickly. As seen in the Hunting children and sex with Jarosaw Gowin examples, parallel processing will often appear at the meeting point between syntax and broadly understood semantics. On the one hand, simultaneous processing of syntactic factors may help resolve a word polysemy problem; on the other hand, the numerous potential conceptualisations resulting from ambiguous syntax will, eventually lead to choosing one interpretative option which is acceptable in terms of both, the syntactic structure in question and the interpretation of meaning possible in a given context in which the parallel manipulation of content does not relate to syntax per se but rather to the propositional content triggered by syntactic processing (cf. Caplan and Waters 1999). Examples of the first situation are given in MacDonald (1994) and Baars (1997b). MacDonald argues that the meaning of some polysemous verbs can be disambiguated based on the accompanying argument structure. Hear being a verb of this kind, the opening phrase of a sentence such as The patients heard... has four possible developments (MacDonald 1994: 97): music (active transitive); with the help of ... (intransitive); that ... (sentential complement); and talking will be ... (reduced participle cause). While it seems unquestionable that the most prototypical active transitive is a likely candidate for the default choice, the other options are all likely to be on a standby until the proposition is disambiguated. This claim is based on the fact that if a less prototypical argument structure is used, the semantic disambiguation follows instantaneously; as a result, it stands to reason that if the alternatives are so quick to surface, they must be in the state of activation throughout the whole processing event. Baars (1997b), in turn, gives the example of the word set, chosen for its polysemy12, to explain the same process. He notes that, ––––––––– 12

75,000 words were devoted to explaining its numerous meanings in the Oxford Dictionary of English; in Baars (1997b: 296)

On the importance of noticing: attention

59

in spite of the plethora of meanings of set, based on our knowledge of its grammatical structure and collocates we can instantaneously select the appropriate meaning out of the numerous options on standby. We can also, equally quickly, switch between the options, should the co-text of the word – and, consequently, its meaning – change. As above, it stands to reason that, for the selection and the switch to be possible, the different options have to be processed parallelly until the bottlenecking is finalised. To give a final example, an analogous parallel processing of data seems possible in the case of constructional schemata which have more than one meaning. This can be seen, for example, in the case of tense and aspect as these syntactic structures are highly polysemous and thus open to more than one interpretation. As a result, like a polysemous word, a polysemous grammar construction will get disambiguated in a syntagmatic relationship with another construction. As I argue elsewhere (Turula 2007b), this, among others, is the case of future forms such as will+inf and be going to whose interpretation changes depending on their collocates, I’m not sure if ... as opposed to ... or else13. In turn, an example of semantics helping to disambiguate the syntactic properties of a sentence can be found in Chipere (2003). She looks at sentences with a familiar syntax like The robber arrested the policeman, which, in spite of their formal averageness, have to be reinterpreted in the course of the reconstruction of possible contexts in which policemen get arrested by robbers, the contexts being processed simultaneously and gradually abandoned. It also seems probable – and the point will be argued later in the present book – that such parallelly carried out interpretations and (re)interpretations of otherwise noncontroversial syntactic properties of utterances will take place in the case of various non-prototypical14 uses of certain grammatical forms the language user may meet in their linguistic encounters. All these language processing phenomena with their selectivity following parallel unconscious activation are also, to a certain extent, reflected in Langacker’s (1987) idea of holistic semantic processing described in relation to his usage-based model of language. His model, in addition to echoing Treisman’s understanding of attentional processes, bears considerable resemblance to the zoom-lens (Posner’s 1980) and spotlight (Eriksen and St. James’s 1986) models of focusing. As Langacker (1987) puts it, in our coding and decoding of expressions, we relate them to conceived situations or “scenes” but “the meaning ––––––––– 13 14

to be discussed more extensively – and in relation to Fauconnier’s (1994) mental spaces in Part II What I mean by non-prototypical is a use which differs from that proposed in pedagogical grammars and, as a consequence, is of interest to the present argument. These may include using: the present – and not continuous – tense for events taking place at the moment of speaking; the continuous tense for stative verbs; the past simple tense in accomplishment/achievement contexts where present perfect is normally preferred; etc.

60

Chapter One

of an expression is not adequately characterised just by identifying or describing the situation in question” (Langacker 1987: 116). What we need in addition to this, are focal adjustments of the scene carried out on three different levels: selection, perspective and abstraction. As will be seen in the course of the presentation below, all three aspects of focal adjustments will accommodate the syntax/semantics interaction phenomena described earlier in this section. In the case of selection, a full semantic characterisation of a selected entity requires parallel activation of numerous cognitive domains (related to the shape, function, size, etc. of the entity in question). In the process of (de)coding, focal adjustments pertaining to these domains – and, consequently, the parts of the scene we are dealing with – are carried out. As part of this process, the attentional focus may narrow down on a number of planes: between the specific cognitive domains which are activated by a certain expression; within the same domain, along a certain scale; and finally, in scope, which amounts to how big a portion of the element in question is focused on. The first, inter-domain focal adjustments will mean moving the attentional spotlight from the colour of the object to its shape and/or its function, with all of the conceptual consequences of such foregrounding, including, for example, meaning extensions. Such focal adjustments may also imply moving between the spatial and temporal domains pertaining to a lexical item such as close, as in the case of The tree is quite close to the garage vs. It’s already close to Christmas given by Langacker (1987: 117). When it comes to the scale-specific adjustments, different usage events of the same lexical item will require a conceptual zoom-in – or out – as in the case of close whose interpretation changes between the following sentences: The two galaxies are very close to one another; San Jose is close to Berkley; The runner is staying close to first base and The sulphur and the oxygen atoms are quite close to one another in this type of molecule (Langacker 1987: 118). Finally, in the case of scope, we will be interested in the portion of the scene which comes into focus, keeping in mind the principle of the immediacy of such scope. For example, what a knuckle will activate is a finger while two arms are likely to require the scope of the entire body; similarly, a predicate like an uncle will activate a smaller portion of the family tree than a great-grandfather. All of the above-described mental operations involve attentional selection based on what was earlier referred to as juggling a number of possible formal and conceptual options at one time for the focusing/refocusing to be instantaneous and automatic. With relevance to perspective, in turn, focal adjustments will refer to the prominence of the individual participants of the scene as regards the position from which the scene is viewed. Consequently, according to Langacker (1987), the variation in question will be determined by figure/ground alignment, viewpoint as well as deixis and subjectivity/objectivity. As all of the issues, belonging to the domain of cognitive semantics, will be discussed in extenso in Chapter 3, the present discussion will be limited to a brief introduction of the two areas of focal

On the importance of noticing: attention

61

adjustments in perspective which are more important to the present line of reasoning: the figure/ground and the subjectivity/objectivity relations. In the case of the former, “the figure within a scene is a substructure perceived as standing out from the remainder (the ground) and accorded special prominence as the pivotal entity around which the scene is organised” (Langacker 1987: 120), all this being very similar in quality to Arvidson’s (2006) focus on the theme, with all its shiftiness and jumpiness. To give examples, thematisation and other word order phenomena will be such means of focal adjustment; similarly, all imperfectives – as opposed to perfective structures – will shift the spotlight from the product to the process itself. When it comes to the subjectivity/objectivity dimension, “the speaker (or hearer), by choosing appropriate focal ‘settings’, and structuring a scene in a specific manner, establishes a construal relationship between himself and the scene so structured” (Langacker 1987: 128). As a result, the speaker/hearer may construe himself as a participant of the scene (continuous tenses; first-person reference) or subjectify themselves, in various ways and to varying degrees (simple tenses, third-person reference as in Don’t lie to your mother uttered by the mother; Langacker 1987: 131). The semantics of all of the perspective-related language means is to a large extent determined by the other available options: the meaning of continuous tenses or the application of the third person reference will be understood based on the comparison between these formal solutions and their alternatives (simple tenses; first-person reference, etc.), a comparison which requires a certain degree of parallel processing. If this is so, it seems justifiable to assume that a number of potential lexical and structural candidates will be subject to simultaneous activation in the course of perspectivisation. Finally, in the area of abstraction, the focal adjustments discussed here will refer to the level of the specificity of the individual perception. It will, after Langacker (1987), be understood as the omission of certain properties of an entity carried out on the level of perception – we adjust the attention focus to the colour but not the shape – and, more importantly for the present argument, as a departure from the immediate physical reality into the realm of abstract schematisations of symbolic units and their amalgams/constructions. Considering the parallel processing phenomena discussed before, we can argue that the instantiation and the schematisation will be processed alongside each other. In macro scale, all the individual examples of language-related attentional narrowing down described above, with their transition from parallel multi-option processing to the selection based single-channel interpretation are best understood with reference to Jackendoff’s (2002: 125) parallel architecture, whose very name implies simultaneous processing of language. In the light of Jackendoff’s model, all linguistic computations are carried out based on a tri-partite framework: phonological, syntactic and semantic formation rules underlie phonological,

62

Chapter One

syntactic and semantic structures, which, in turn, form the three mainstays of the language system. None of them autonomous to the extent Chomsky wanted them to be15, the three pillars in the system’s architecture communicate with each other via interfaces. As a result of such communication, as proposed by Jackendoff (2002) and recycled by Truscott and Sharwood-Smith (2004), language input – and, analogously, output – is analysed simultaneously by the three subsystems. As a result, a number of parallel processing phenomena of the kind described earlier in this section can be observed: the sound is processed cyclically in the phonological loop; the conceptual analysis carried out alongside is often faster than the phonological and the structural analysis and consequently overtakes both; and the syntactic module frequently overproduces candidates for the structural as well as conceptual interpretation of the message. All this parallel processing is subject to gradual narrowing down at the end of which the intended meaning of the utterance is reached. At this point in the argument, it is important to observe that the parallel processing leading to the final bottlenecking – all of the interplay inside language as a system; the simultaneous and cyclical processing of different aspects of input – is, in the case of L1, mostly un/subconscious. Or rather, as Jackendoff (2002) phrases it, to avoid Freudian associations of the subconscious, it is f-conscious: functionally conscious of certain subsymbolic representations. To clarify the concept, Jackendoff argues that when we say the little star, it is a bit farfetched to claim that we have an entity like a noun phrase in [the conscious] mind; we rather speak of such phrasing in functional terms, “this functional organisation [being] embodied in a collection of neurons engaging in electrical and chemical interaction” (Jackendoff 2002: 22). All this is highly reminiscent of the alreadydiscussed cognitive unconscious16 (Kihlstrom 1987, LeDoux 1996; see earlier in the chapter). According to LeDoux (1996), we do not plan the grammatical structure of our utterances as this is one of the many things the cognitive unconscious is in charge of. In the light of the above we can say that Jackendoff’s model seems to confirm the role of the unconscious in attentional selectivity in language. In other words, the language processing examples quoted in this subsection affirm that selection does not amount to a momentary switch from the unconscious to the conscious; in the case of L1, the final discrimination is preceded by the unconscious parallel processing of data which is subject to gradual narrowing down and the accompanying progressive zeroing in on the ––––––––– 15

16

Chomsky and other generativists attached particular importance to the syntax pillar, seeing this component as fully autonomous. The syntactic module was exclusively rule-governed which meant that all anomalies – like idioms or irregular verb forms were stored in the lexicon. Chomsky (1986) uses the word cognised to speak of processes which are f-conscious or cognitively unconscious.

On the importance of noticing: attention

63

intended meaning rather than an instantaneous conceptual switch-on. The fconsciousness of the whole process is what makes it smooth and instantaneous: we can handle manifold foci simultaneously because there is no need to pay focal attention to all the three language subsystems. Another factor that takes the cognitive load off the attentional capacity is what Dbrowska (2004: 17 and subsequent pages) calls “processing shortcuts”: storing and accessing language in chunks rather than single items assembled online based on algorithms. Such “availability of semi-idiomatic prefab[s] ... pre-empts more compositional interpretations” (Dbrowska 2004: 22) by filtering out semantic and syntactic options which would be subject to parallel processing if the utterances had to be assembled online – either in production or in reception – and fully parsed. The automaticity of the process makes it undemanding in terms of allocating attentional resources, as clarified earlier in this chapter in relation to allocating attentional resources to controlled vs. automated actions. 3.1.2. Attentional selectivity in language learning. The constraints on the process While all of the above remarks concerning attentional selectivity and the role of the f-conscious and processing shortcuts are, to a certain extent, valid for any language – be it L1, L2, Ln or FL – we have to be aware that, in the case of a language other than the mother tongue, the gradual zooming in/bottlenecking and focal adjustment based on parallel unconscious processing may be subject to numerous constraints. This is because the L2/FL is likely to become f-conscious, in Jackendoff’s (2002) understanding, on the condition that it is automatised as a result of ample exposure and practice. Before this happens – or, more frequently, unless it happens – we observe a number of undesirable phenomena related to attentional selectivity which negatively affect language intake, processing and production. These phenomena may include premature input bottlenecking in the phonological loop or interface blocking and the resulting single-channel processing in the tri-partite architecture of otherwise – meaning in L1 – smoothly interplaying subsystems. This may lead to a number of trade-off effects like meaning-form shadowing while processing input or certain constraints on L2 output, in which either accuracy or fluency or complexity are seriously compromised. When it comes to overall, conceptual processing – motivated by the interplay of the three mainstays of the language architecture – it may be far from native-like as focal adjustment of a language learner are far from automatic and inter-domain shifts have a tendency to be blocked as a result of insufficient knowledge of the target language or L1 conceptual transfer. Finally, the change of focus between the specific and the abstract – especially in the token-type-token changes of attentional focus – may be hindered by the insufficiently developed overall grasp of language as a system. All these problems are discussed below.

64

Chapter One

3.1.2.1. Selectivity in input processing. Form/meaning trade-off effects When it comes to the premature bottlenecking during different language tasks, its debilitating effect is known to all lower-level language learners. Such a phenomenon is observed, for instance, in tasks requiring online processing of aural input. It often happens that, in the course of listening, the learner’s attentional resources are allocated exclusively to a number of selected problematic words which become subject to cyclical phonological processing with almost total shadowing of the remainder of the message. Consequently, the learner perceives and processes a number of individual, syntagmatically unrelated lexical items while missing the gist of the input as a whole. Such a lack of overall control over the processed text is most probably the result of the fact that the problematic words – because of their novelty and difficulty – stick out as figures in Langackerian understanding whereas the rest of text is backgrounded. As a result, these words enter the focus of attention while the background is filtered out. Similar faulty foregrounding and the resulting premature bottlenecking of attention to language – this time motivated by word familiarity, not novelty – can be observed in sentence processing and decisions about grammar. Certain key words, particularly adverbials and conjunctions given by pedagogical grammars as indicators of particular tenses, tend to stand out of the sentence, backgrounding other interpretational cues. This, as I argue elsewhere (Turula 2007a) is, for example the case of since and its tendency to be understood primarily as from a moment in the past until the present moment. Such an ad hoc interpretation elicits the present perfect tense in spite of the co(n)textual evidence for the other meaning of since, which is offered by contexts such as I don’t know who did it but it couldn’t have been Alice since she was away at that time. Similar bottlenecking on a certain aspect of input combined with the filtering out of all other processing facets may take place as early as the level of phonology and overshadow any related semantic computation. As a result of this effect, the words one hears trigger lexical associations with other words, based on sheer sound similarity bypassing the semantics. This often happens in real-time processing tasks at the very low stages of language proficiency, at which “a simple state of ignorance ... provokes a desperate casting about for lexical straws to clutch at” (Singleton 1999: 132) and the phenomenon described is called, after Meara (1984 cited in Singleton 1999: 131-136) the clang effect. Singleton’s “clutching at” easily associates with tightening/narrowing down the flow of input (bottleneck) while the “desperate casting” about for clues is the result of limited attentional resources. Premature bottlenecking or faulty foregrounding is also a problem in comprehension tasks which, by nature, require simultaneous cyclical processing of one or more aspects of input. In a study I carried out recently (for details see Turula 2009b), the testees were asked to interpret skeletal sentences containing

On the importance of noticing: attention

65

noun-to-verb conversions in different grammatical constructions such as X bottled Y or X kept Y bungeed. The overall meaning of these sentences was simultaneously motivated by the verb itself as well as the syntactic frame it was used in, both understood as constructions with their dual character of a formmeaning pairing (Croft and Cruse 2004; Goldberg 1995 and 2006). The test was given to three groups of learners on three different levels of language proficiency, and the study showed that the less advanced learners, unable to simultaneously process both constructions, tended to interpret the whole sentence based on one of the two cues while ignoring/filtering out the other. In addition to this, a number of conceptual transfer phenomena were observed, which led to the conclusion that while trying to make certain inter-domain focal adjustments on polysemous words, the testees were often blocked or tended to rely on L1 meaning extensions. Moreover, as could be observed on the basis of the data collected, the less advanced foreign language users had problems with focal adjustments between the specific and the abstract, especially in the cases where the token-type-token attention shifts were necessary (for a more detailed analysis, see Turula 2009b). Finally, when unable to reconcile the verb and the syntactic frame in which the verb was used, the less advanced testees interpreted the overall meaning of the skeletal sentence based on the meaning of the verb while ignoring the syntactic construction. The phenomena described above, with special regard to the last observation, are part of a much broader problem, involving bottlenecking as well as focal adjustment effects, known as a form-meaning trade-off. This phenomenon results form the less advanced learner’s limited processing capacity and can be described as allocating attentional resources either to meaning or to form, the former definitely being the first choice. The fact that L2 learners, especially at lower levels of proficiency, find it difficult to attend to form and meaning while processing input – with special regard to online processing – was confirmed by a number of studies, two of which – VanPatten’s (1990) and Wong’s (2001) are described below. In the first of the studies, involving 202 students of Spanish at varying levels of proficiency, VanPatten (1990) showed that paying simultaneous attention to form and meaning in an aural presentation is rarely manageable for lower-level L2 learners. Such learners choose to process input: a) for meaning alone, as demonstrated by the highest test scores achieved in content-only tasks; or b) mainly for meaning and only minimally for form, as shown by the content+lexical item testing tasks (for details of the study see VanPatten 1990). According to VanPatten, such a lack of attention to form is most likely the case if formal aspects of language are unmarked and consequently noticing them is not indispensible for the understanding of input. The underlying causes of such a trade-off can be found in the earlier considerations of attention as a limited resource. If we take into account the finite attentional capacity in humans –

66

Chapter One

already described in detail in the present work – it stands to reason that the first to be discarded as unimportant will be the redundant aspects of input: its unmarked formal properties. All this boils down to VanPatten’s Primacy of Meaning Principle (VanPatten 1990 and later works) which can be further subdivided into: – the Primacy of Content Words Principle, in the light of which, in their search for meaning, learners will always process content before form; – the Lexical Preference Principle, which means that lexical verb processing will precede allocating attentional resources to structure; – the Preference for Nonredundancy and the Meaning-before-Nonmeaning principles, according to which forms of greater communicative value will be processed first; – the Sentence Location Principle, which states that a sentence initial items are processed prior to those that are sentence-central or sentence-final; this principle, however, is often overthrown by the Lexical Semantics and Event Probability principle, in the light of which we tend to rely on the meaning of individual words and our event schemata and scripts rather than on word order alone; – the Contextual Constraint Principle, in turn, emphasises the role of context in input processing declaring that we constrain all of the above interpretations based on a broadly understood context as well as co-text. In the latter case, however, as VanPatten (2004) argues, while we may pay attention to prosodic cues of the text, we rarely process them purely formally; and finally – the Availability of Resources Principle, in the light of which all the abovediscussed facets of input are constrained by the learner’s attentional capacity largely determined – as mentioned on numerous occasions in the present work – to the efficiency of their working memory (also cf. VanPatten 2004 as well as Just and Carpenter 1992). As a result, content will precede form17, as specified by the Primacy of Content, Lexical Preference and Meaning-before-Nonmeaning principles. Similarly, in the light of Sentence Location and – particularly, the Contextual Constraint principles, excessive focus on form will have a debilitating effect on content processing, such as described in the cases of premature bottlenecking and faulty foregrounding presented earlier in this section. All of the negative effects will ––––––––– 17

Occasionally, as was pointed out in relation to the clang effect (Meara 1984 in Singleton 1999), form processing may override content computation. This happens if the learner’s proficiency is far below the language material (s)he is being exposed to. VanPatten’s principles are subscribed to based on the assumption that we are dealing with inputs following Krashen’s i+1 (Krashen 1982) principle: they are one step ahead of what the learner’s current language processing potential.

On the importance of noticing: attention

67

possibly be alleviated in the course of the gradual automatisation of formal processing, which is possible only with growing learner proficiency, a point already argued here and confirmed by the Availability of Resources Principle. In the light of the above, it becomes clear why processing of input becomes much more complicated when the forms used are marked or, in other words, when they carry certain meaning nuances and, as a result, cannot be attentionally disposed of without compromising the correct understanding of the message, as was for example the case of my own study (Turula 2009b) mentioned earlier. Processing problems will result form the fact that if a form has to be processed because of its “real communicative value” (DeKeyser et al. 2002: 810), we will witness what VanPatten calls the competition of form and meaning for attentional resources during online processing of input. This competition will occur on a number of planes on which, in the light of VanPatten’s input processing (IP) model (1994, 1996, and 2004), the human IP parser operates. These planes include the overall sentence meaning – with regard to relational meaning understood as who did what to whom – the referential meaning as well as the already mentioned communicative value. It is important to point out that VanPatten’s model has been subject to criticism by DeKeyser et al. (2002). The main fault found with his theory was the fact that VanPatten subscribes to the effortful early-selection and the limited-capacity idea of attention put forward by Broadbent (1954, 1958) and Kahneman (1973), while other contemporary models – including the one put forward by Neumann (1996) – are more inclined towards the proposals of Treisman (1960, 1964), Wickens (1984) and all of the others providing for multiple attentional pools (see earlier in this chapter). The criticism does not seem particularly well-grounded, though, which can be shown in an analysis of individual charges. First of all, many of the critical comments seem slightly unsubstantial as the model proposed by VanPatten and the solution argued for in DeKeyser et al. (2002) are not that far apart conceptually. It is true that – unlike VanPatten, who opts for early selection based on a competitionwinning mechanism – the multiple-pool models address the competition in progress, focusing on interference rather than trade-off phenomena and ascribing the problem to confusion caused by a large number of similar stimuli involved in “cross-talk” (DeKeyser et al. 2002: 807) rather than early selection. However, it also cannot be denied that both types of models see multiple-focus processing as competitive and regard this competition as a factor constraining processing. Even if we agree with Gopher (1992) that in the cross-talk models attentional capacity – which is crucial to VanPatten – is irrelevant because the only factor of importance is time, there are two arguments which can be used to prove that this claim does not make the interference and the trade-off stances decidedly different. First of all, contemporary science sees time as yet another dimension. Consequently, attentional capacity – perceived metaphorically in terms of a physical 3-D space – will naturally become four-dimensional, a kind of temporal-spatial continuum, in which

68

Chapter One

questions of how many parallelly will be as important as how quickly. What is more, the time constraints are connected to selection mechanisms via the issue of the automaticity of performed mental operations. As demonstrated earlier in this chapter, the more automatised certain processes are the faster the computation and, consequently, the larger the number of facets can be attended to simultaneously. As a result, tasks with a time limit – such as online processing – will naturally involve attentional selection which will take place the earlier the less linguistically dexterous the learner is. In conclusion, attention is paid to language input selectively, regardless of whether we understand this selectivity in terms of the trade-off effect, bottlenecking and the resulting highlighting of one feature of input, or in terms of the simultaneous processing of the different interfering aspects of input. Finally, it is indeed difficult to refute DeKeyser et al.’s (2002) critique of VanPatten’s meaning-driven parser if we look at input processing from the L1 perspective, based on the fact that there truly are multiple sources of information motivating understanding, not only the semantic domain. It, however, stands to reason that in the case of L2, especially for lower-level learners, simultaneous processing of all these numerous aspects of the perceived message is impossible. This observation is confirmed by Gass et al. (2003: 498), who state: “with regard to proficiency, focused attention [has] a diminishing effect, with the greatest effect in the early periods of learning and the least in later stages”, as well as the findings of my own already quoted study (Turula 2009b) which proved that the lower the level of the testees, the greater their problems in handling numerous attentional foci at the same time. All in all, trade-off is inevitable and selection is a must. And because the semantic – or, even more importantly, pragmatic –is almost always more important than the formal, it is meaning that is processed first and often exclusively. An additional observation that can be made based on the above considerations is that whether the selection happens early or gradually depends on a number of factors, the most important of which is the level of learner proficiency in the target language, as has been highlighted on a number of occasions throughout this section. Other influential factors will include the “real communicative value of form” (cf. earlier in this section) in whose case attentional resources, rather than being allocated to meaning alone, have to be divided between meaning and form; if, in their current stage of development the learner is ready for such parallel processing is a different issue. Whether the above-described bottleneck effect will occur depends also on currently used reception channel/processing mode. Arguments at the fore of this were demonstrated in a partial replication of VanPatten’s research carried out by Wong (2001). Wong assumed that one of the most important constraints on dual – meaning vs. form – attention tasks will be the modality used when on task. That is why she carried out her study along two research lines looking at processing of both the spoken and written texts. Before launching research procedures, she put forward the following hypotheses (Wong 2001: 354):

On the importance of noticing: attention

69

– H1: attending to a grammatical feature (the definite article) in the auditory input will negatively affect comprehension; – H2: attending to a content lexical item in the auditory input will not negatively affect comprehension; – H3: attending to a grammatical feature (the definite article) in the written input will negatively affect comprehension; – H4: attending to a content lexical item in the written input will not negatively affect comprehension; – H5: the above-mentioned form vs. meaning processing effects will differ with regard to the type of modality (aural vs. written) required for a given task. Describing the results of her study (for a detailed description see Wong 2001), the author concludes that the findings from the aural modality study support VanPatten’s (1990) claim that when attention is allocated to form, comprehension is compromised. A similar detrimental effect, however, is not observed in the case of form/meaning competition on visual modality/ written input tasks. However, in spite of the labels Wong employs, all this does not seem to result from the activation of a different modality (aural/visual); it is rather a consequence of the varying processing modes that spoken and written inputs require. While the listener has to deal with the incoming message in real time, the reader can always go back to previously read passages or stop to rethink the gist they get from the material read. This to a certain extent confirms the claim, put forward earlier in the chapter, that time is an important factor vis à vis selectivity of attention and an important dimension in attentional capacity. Even though Wong did not control for this, based on language teaching experience we can argue that the processing mode used when on tasks, while of importance to all levels of proficiency, will be particularly crucial for the less advanced learners. This is because, as argued earlier in this section, such learners need more processing time to compensate for their low attentional capacity resulting from the fact that language processing in their case is far from automatic. As a result, from the pedagogical point of view, two things need to be postulated: processing in real time, especially on lower levels, should have one focus, meaning or form. If the tutor wants the two to be attended to simultaneously, the offline mode needs to be applied. 3.1.2.2. Selectivity in output processing In addition to the study on form/meaning competition in the above-described input tasks described in the previous subsection, Wong investigated modalityrelated effects on attentional selectivity in output-oriented assignments and obtained comparable results. She found out that while written tasks, being offline, allow for parallel processing, in oral tasks, an additional focus usually has a detrimental effect on production. The underlying causes for such online output

70

Chapter One

trade-off effects and the resulting single-channel processing – once again inevitably graver at the lower levels and diminishing with growing proficiency – are identified by Skehan (2003), who, like VanPatten, ascribes them to the competition of different task demands. Skehan (2003: 99) proposes a three-way distinction which, at the same time, is the scheme of all of the language areas which will compete for attentional resources in oral/online production: – code complexity, which is an umbrella term for the language required, its lexical and grammatical complexity, variety, density and redundancy; – cognitive complexity, which is further subdivided into cognitive familiarity and cognitive processing. The former boils down to how familiar the learner currently on task is with the topic of this task, the genre of the presentation and the task as such. The latter refers to the type, organisation and clarity of information given as well as the amount of simultaneous computation needed on task; – communicative stress reinforced by time pressure, the length of the oral presentation, and the degree of control over performance. As a result, as suggested by Skehan (2003), there will be trade-off effects in relation to the three aspects of spoken performance: its accuracy, fluency and complexity. Presenting the results of his two studies, Skehan argues that improvement in one of these areas usually means a decline in the two other. As a result, a pedagogically sound decision – and the one motivated by what we know about attention as a limited resource – will mean selectively concentrating on the three different aspects of output. Skehan gives the following task guidelines: 1. Good accuracy-oriented tasks will: – have a clear task macrostructure – the steps to be taken while on task will be typical and easy to determine; – be based on cognitively undemanding, familiar, structured information input – have planning time; and – require an “agreed” outcome rather than a discussion of controversial issues; 2. Good fluency-oriented tasks will: – refer to topics of personal interest to the learners; – contain structured information; – be based on familiar information; – be cognitively undemanding; and – have planning time similar to or slightly longer than accuracy-oriented tasks. 3. Good complexity-oriented tasks will: – require quite complicated mental operations; – be based on information input containing a number of interrelated elements or be unpredictable in nature; – require planning and transformation operations;

On the importance of noticing: attention

71

– assign different goals to different participants rather than requiring them to arrive at common solutions; and – have planning time longer than accuracy-oriented tasks. As can be seen in the above framework, if attentional resources are to be allocated to accuracy, the remaining task demands have to be low so that the attention required for their execution is minimal: familiarity and the lack of complexity of the macrostructure guarantee that the required processing will be highly automated, ample time being an additional advantage. Similar facilitations concerning the automaticity of tasks and the lack of cognitive involvement can be seen in fluency tasks. Complexity tasks, in turn, can afford to assign the greatest attentional pool to the cognitive processing of the problem itself because they do not require accuracy of language or fluency of performance. The need for more specialised language as the task is performed can be fully satisfied because task time is not limited and some of the processing actually takes place offline and the learners have the chance to ask the task facilitator for help or to consult resources such as grammars or dictionaries. In all of the three types of tasks it is accepted that attentional resources will have to be channelled one way or another and tradeoff effects are inevitable. That is why a reasonable compromise – in the form of allowances concerning task complexity and task time – is suggested prior to task execution. At the same time it has to be remembered that in the case of oral/online tasks, the learner’s attentional resources will be subject to one more competition: between the cognitive and affective processing demands, an issue already introduced in relation to Stroop effect (Stroop 1935), in the course of which certain very powerful stimuli interfere with others shadowing them and slowing down their processing. In the case of the affective domain we are talking about cognitive interference in oral tasks induced by language anxiety (Sarason 1984). Such a processing deficit is attentional in nature as it involves a competition between task-related and self-related attention. A similar interpretation of the phenomenon of affective interference is proposed by Wegner and Bargh (Wegner 1994; Wegner and Bargh 1998). They argue that in our attempts to accomplish goals two systems cooperate supporting the actions we take: the conscious operating system in charge of goal attainment proper and the unconscious monitoring-oriented control system. However, in the case of error – real or imagined – when the monitoring system starts looking for its source, monitoring becomes conscious processing and claims some territory thus compromising the operating system and its content. “[W]hen [additionally] the person's conscious control processes are overwhelmed by distractions or stress,” as Wegner and Bargh (1998: 457) put it, the “monitoring processes usually crop up” and gain more and more territory in the limited space of attention. This leads to a cluster-on-theworktable (Stevick 1999) phenomenon of multiple focus processing (DeKeyser et

72

Chapter One

al. 2002; quoted earlier in this chapter) whose manifold foci come from both the cognitive and the affective domains. Considering the cluster situation together with the limited attentional capacity, it seems obvious that the different incoming stimuli start competing for resources. However, considering Damasio’s (1994: 159) claim that “the brain [being] the body’s captive audience, feelings are winners among equals” we can assume that affect-related processing will not only appropriate the attentional resources that would otherwise satisfy the cognitive demands but also dominate the scene of mental operations. As a result, according to Wegner and Bargh (1998), if one tries to suppress the thought of certain anxiety-breeding factors, the thought will persistently reappear increasing this anxiety and, as it operates consciously, additionally take up some of the attentional resources allocated to goal attainment. The results of such – as Wegner (1994) calls them – ironic processes of mental control are detrimental to the outcomes of the task currently being performed. As a consequence of this cognitive phenomenon known as Affective Filter (Krashen 1985), anxious learners, as indicated by MacIntyre and Gardner (1994), have problems with attention and concentration; rely heavily on memory; are reluctant to hypothesise; and lack speech coherence. To conclude this section on the selectivity of attention, two major observations need to be made. First of all, channelling of attentional resources in language processing is gradual and motivated by parallel f-conscious computation which takes place based on separate pools of attention: more than one simultaneously activated spreading patterns. This functional consciousness of language-related mental operations seems to be rather limited at lower levels of language proficiency. That is why, if parallel processing of some kind is required of an nonadvanced foreign language learner, we can observe some undesirable effects of attentional selectivity in language processing in both reception and production tasks. There is competition between different aspects of input or output, which results in trade-off effects on some of them and attentional focus on others. The latter will on most occasions amount to spotlight on meaning – as opposed to form – within the cognitive domain, and self- as opposed to task-centeredness on the brink between the cognitive and the affective. All these constraints will have to be taken into account when making pedagogical choices; some suggestions concerning learning and teaching have been made in the present chapter. Simultaneously, selectivity of attention – or exclusiveness of one of its foci – which was demonstrated in the present section in relation to second/foreign language processing is in fact typical of any situation in which we choose to consciously concentrate on one aspect of input and not the other. This fact was argued for, based on a number of sources, at the beginning of this chapter. At the same time however, as also indicated previously, in addition to intensive attention

On the importance of noticing: attention

73

of the kind described above there is also another way of focusing – extensive. As a result, we may ask two questions the answers to which will be of interest to L2/FL language pedagogy. First of all, while acknowledging that it is impossible – or at least very difficult – to intensively concentrate on more than one aspect of input, it will be interesting to what extent the learner can still attend to some aspects of input extensively, on condition (cf. earlier in this chapter) that they are coherently related to the main focus. Secondly, it seems crucial to establish whether at all and – if yes – how far such extensive attention to language data is conducive to learning. Section 3.2, based on relevant research to date, is an attempt to find answers to the two questions above. 3.2. On noticing. The problem of conscious awareness Going back to – and developing on – the initial definition of attention as a process of selectively concentrating on one thing while ignoring the other, we can say that such conscious noticing is a complicated and multilevel process. Tomlin and Villa (1994) see it as consisting of three components: alertness, which is the very readiness to deal with incoming stimuli or data; orientation, meaning the actual directing attentional resources; and, finally, detection, tantamount to actual cognitive registration of incoming stimuli18. Stashed in-between might also be processes like preconscious registration, which Schmidt (2001: 3) understands as “detection without awareness” as well as facilitation and inhibition (Schmidt 2001), whose role boils down either to reinforcement of any of the above aspects of focusing or to cognitive interference if attentional resources are allocated to stimuli coming from outside and/or not related to the main focus. Consequently, as a phenomenon, attention is both multi-faceted and intrinsically coherent: it offers a number of qualitatively differing options of the same phenomenon: noticing. There is general agreement about the fact that “what learners notice in input is what becomes intake for learning” (Schmidt 1995: 20). However, researchers disagree about two aspects of noticing: firstly, the definition of what noticing actually amounts to and, consequently, whether or not it has to be exclusively conscious; and secondly, whether noticing qualifies as indispensable for learning – if we subscribe to the strong version of the Noticing Hypothesis presented in Schmidt (1990 and later works) – or simply facilitative, in the light of its weaker version, in whose understanding noticing is only “the gateway to subsequent learning” (Batstone 1994: 100). Both controversies are discussed in the present section. ––––––––– 18

Tomlin and Villa’s model was SLA-endorsed in a study of Leow, who attempted to “isolate the effects of the three attentional functions of alertness, orientation and detection” (Leow 1998: 137) in a study concerning morphosyntactic gains in a problem-solving task .

74

Chapter One

3.2.1. The Noticing Hypothesis – strong and weak versions The strong version of the noticing hypothesis was put forward and later developed by Schmidt (1990; 1991, 1993, 1995, 2001). In the light of this postulate, learning is equivalent to intake and “intake is that part of the input that the learner notices” (Schmidt 1990: 139). Such noticing, according to Schmidt (1990), needs to be conscious as “storage without conscious awareness is impossible” (1990: 136). To understand this argument properly, it is necessary to know that for Schmidt attention and awareness/consciousness are isomorphic, potentially two sides of the same coin: “attention as the mechanism that controls access to awareness” (2001: 5) or “controls access to consciousness” (2001: 11). As Schmidt himself points out, “[i]t is difficult to distinguish between paying attention to something and noticing or being aware of it” (1995: 18). He supports the claim with a quote from Carr and Curran (1994: 219), who state: “If you are conscious of something then you are attending to it ... and if you are attending to something than you are conscious of it”. In effect, if attention is indispensible in learning, so is consciousness; as claimed in Seth et al. (2004: 14), even implicit learning requires conscious attention to the stimuli from which regularities are unconsciously inferred. What is also important is – and this issue will be returned to later in this chapter – it does not matter to Schmidt whether this act of conscious awareness resulting in noticing is intentional: “it makes no difference whether the learner notices a linguistic form in input because he or she was deliberately attending to form, or purely inadvertently” (Schmidt 1990: 139). What matters is the act of noticing which Schmidt, as indicated before, sees as conducive to and indispensible for learning. As Schmidt (1990) himself admits, the experimental data proving the strong Noticing Hypothesis stance is very limited. In fact, the only scientific proof he quotes is his own study carried out in collaboration with Frota (Schmidt and Frota 1986), on the basis of which he concludes that there is “a close connection between noticing and emergence in production” but that “the study does not show that noticing is sufficient for learning” if the spotted item is not processed “sufficiently deeply to ensure retention” (Schmidt 1990: 141). Schmidt’s later work (1995) contains reference to one more study, in addition to his and Frota’s research cited above (Schmidt and Frota 1986), carried out by Hanson and Hirst (1988), in which the two researchers, by asking their testees to pay attention to certain features of words on a list, proved that encoding was better in the case of the attended rather than unattended features. What makes the Noticing Hypothesis strong is the fact that Schmidt rules out alternative options such as subliminal perception, which he does not see as potentially educational. He also claims that the importance of noticing is not disproved by studies on the natural orders of acquisition or acquisition sequences because there is no way of proving that in their case natural means without noticing. Schmidt (1990: 149) concludes by stating:

On the importance of noticing: attention

75

... intake is what learners consciously notice. This requirement of noticing is meant to apply equally to all aspects of language (lexicon, phonology, grammatical form, pragmatics), and can be incorporated into many different theories of second language acquisition. Theories of parameter setting in second language learning, for example, can easily incorporate the suggestion that whenever 'triggers' are required to set parameters, these must be consciously noticed. (Whether one such trigger can result in a cascade of related syntactic changes is a variant of the implicit learning question.)

The claim is repeated in a much more recent publication, where Schmidt (2001: 3) asserts: “There is no doubt that attended learning is far superior, and for all practical purposes, attention is necessary (emphasis mine) for all aspects of language learning”. An alternative to the above strong version of the Noticing Hypothesis is the weaker perspective on the role of attention. The controversy hinges on the fact that while Schmidt considers noticing to be indispensible, other researchers, some of whom are quoted below, believe that attention and the resulting conscious registration of input is simply facilitative. As Leow (2000: 558) notes: Recently a few studies conducted in the L2 classroom setting (e.g. Alanen 1995, Leow 1997, Robinson 1997a and 1997b, Rosa 1999, Rosa and O’Neill 1999) have reported overall positive effects of awareness on learners subsequent postexposure performances. However, Leow (1997) in his study of the role of awareness in foreign language behavior pointed out that “the issue of whether awareness is essential [original emphasis] for subsequent processing to take place remains unsolved” (494)

What is more, while Schmidt (1990, 2001), as indicated earlier, is of the opinion that learning is possible only if input is consciously noticed, Sharwood-Smith (1981), Rutherford (1987) and Ellis (1997) claim that such noticing can be both conscious and unconscious, the latter being conducive to learning as well. There are a few points in favour of the idea that global attention – as well as focal – and unconscious noticing – together with conscious awareness – are pedagogically beneficial, and should be considered alongside conscious focus-onform. First of all, it is difficult to disagree with Krashen’s (1985) claim that there are too many aspects of input for the learner to process them all consciously. Secondly, as pointed out by Truscott (1998) – who presents quite a radical critique of the strong version of the Noticing Hypothesis – and, hopefully, demonstrated in the initial part of this chapter, attention is too complicated a phenomenon to make any final judgments about what it is and “when it is or it is not allocated to a given task” (Truscott 1998: 105). Moreover, Truscott subscribes to the view – also discussed earlier – that, in fact, attention may be a combination of “a number of separate pools of resources which can be allocated to tasks more or less independently” (1998: 105). As a result, it seems a little farfetched to make any finite judgments about the degree to which noticing and awareness are interrelated and interdependent. Besides, even if we agree that only consciously registered data

76

Chapter One

can enter the short-term memory, the processing of this data, including noticing the gap between them and one’s current interlanguage knowledge as well as eventual integration of the intake in one’s long-term memory, does not need – and cannot be proved – to be conscious (Ellis 1997). Finally, Truscott (1998), evaluating attentionrelated research to date, claims that it “does not indicate that language acquisition requires anything more than global awareness of input” (107), which takes us back to the problem of both extensive and intensive attention and the already-made acknowledgement of the crucial role of the former, which goes well with the idea of gestalt, a concept of an integrated whole, which is how we perceive. The resolution of the conflict, however, will largely depend on how we understand consciousness, attention and noticing as well as how we conceptualise their relationship. 3.2.2. The levels of conscious awareness The answer to the question of what kind of conscious in conscious learning Schmidt is actually interested in can be found in his early publication, in which he (Schmidt 1990) explains conscious learning vis à vis unconscious learning. He defines the latter in three different ways: as unintentional learning (=learning without the intention to learn); implicit learning (=learning without explicit metaknowledge/ knowledge of rules); and finally as unconscious-proper learning (=learning without awareness). As Schmidt argues in his Noticing Hypothesis, conscious learning is conscious in the last sense: while it does not need to be intentional or lead to verbalisable knowledge, it has to amount to learning with awareness. In order to understand what this conscious learning or learning with awareness is, we need to follow Schmidt’s (1990, 1995) differentiation between awareness on different levels: the level of noticing and the level of understanding. 3.2.2.1. Awareness on the level of noticing As for awareness on the level of noticing, Schmidt (1995) asks two questions: Is it possible to learn with attention but without noticing? Is learning without awareness possible at the level of noticing? We can rephrase the first question asking whether it is possible to pay attention and not notice at the same time. The answer – which is yes on the condition that attention does not have to be understood as exclusively intensive – was offered earlier in the present work: as pointed out by Eysenck and Keane (2000; cf. also Bowers and Meichenbaum 1984; Chun and Wolfe 2001), we can listen but not hear; watch but fail to see. Is such listening/watching without hearing/seeing conducive to learning? Intuitively, it cannot be; consequently, we have to state that only attention that involves noticing is educationally beneficial. The point that needs to be made here, though, is that in the sense of hearing/seeing as mentioned above, noticing actually amounts to sensory registration which

77

On the importance of noticing: attention

makes noticing synonymous with consciousness in the phenomenological understanding of it. Consequently, we can state that if Schmidt’s Hypothesis refers to noticing as being phenomenally conscious, we have to agree that one takes in – and learns – only what is noticed. This can, quite convincingly be argued based on the model of a relationship between consciousness and attention proposed by Lamme (2003). As a point of departure for his proposal, Lamme observes that consciousness and attention are both selective and seem likely to share some properties; however, they are not identical but rather conducive to each other. For example, in the context of visual noticing – which belongs to the realm of phenomenal noticing which is of interest to our present argument – awareness is often defined as the focus of attention. We can notice that this perspective is common to a number of approaches ( cf. the beginning of the present chapter), starting from the proposals of Broadbent(1954, 1958), through Treisman (1960, 1964), Norman (1968), Schneider and Shiffrin (1977), Deutsch and Deutsch (1963), Posner (1980) and Eriksen and St.James (1986) to Baars (1997a, 1997b and later works) and Arvidson (2004, 2006); it can also be seen in Schmidt’s Noticing Hypothesis (cf. 1990, 1995 and 2001). It is the moment of focusing attention or noticing that narrows down the reception channel and subjects the incoming data to conscious cognitive processes such as discrimination and categorisation of input and its integration as well as the reportability of all of the above. It does not matter if the noticing takes place in the spotlight or in the peripheries. Continuing the considerations along the lines proposed by Lamme (2003), based on the acknowledgement of the relationship between consciousness and attention, we can ask which is a gate to which. Lamme attempts to find the answer by taking us through a number of models presenting the mutual relationship between attention and consciousness. He starts with a model he calls traditional (Figure 3; adapted from Lamme 2003: 13), which in fact is highly reminiscent of Broadbent’s and some later concepts of selection mechanisms mentioned in the previous paragraph, in which inputs can be either attended to or unattended to, and it is the attended input that leads to a conscious cognitive experience: Figure 3. The traditional model of the relationship between attention and consciousness inputs

attended unattended

conscious

conscious report

78

Chapter One

In the light of this model, only attended stimuli reach conscious awareness and, consequently, are available for verbal report. Taking the reader through a number of model modifications – which include adding the unconscious module to the attended/unattended block as well as removing the conscious block altogether based on the assumption that attention does not determine whether stimuli reach a conscious state but whether or not a verbal report of these stimuli is possible – Lamme arrives at his own proposal. His concept provides for a partly reverse order of the modules in which it is consciousness that is the gate to attention, or rather to two kinds of processes, attended and unattended. Out of these, in turn, only attended processes are subject to reportability, as diagrammed in Figure 4 (adapted from Lamme 2003: 13): Figure 4. Lamme’s model of the relationship between attention and consciousness inputs

conscious

attended unattended

conscious report

unconscious

What is important in the light of the above diagramme is that if we assume that the attended/unattended block stands for the input processing area, the organisation of the model suggests that cognitive mechanisms within this area operate only on the part of input of which we have been phenomenally conscious or, in other words, we have noticed19. Lamme (2003) argues for this point, giving examples of change blindness (CB) and inattentional blindness (IB) experiments which prove that if the testees are unconscious of changes occurring in the input because the changes are effectively masked, they remain inattentive to/unaware of this input. If, however, their attention is somehow focused on the item that was about to change, the change does not pass unnoticed. In the light of the final statement – and going back to the watching/seeing and listening/hearing metaphor – we can amend the model even further based on the assumption that if something is to be noticed it has to be subject to awareness/attention which is, at least, global, as shown in Figure 5:

––––––––– 19

A concept that goes hand in hand with Schmidt’s noticing hypothesis.

On the importance of noticing: attention

79

inputs

global attention

Figure 5. The relation between attention and consciousness accommodating Tomlin and Villa’s model

alertness

conscious

attended unattended

conscious report

unconscious

orientation

detection

As can be seen above, the diagramme in its present form accommodates Tomlin and Villa’s (1994) tri-partite concept of attention consisting of alertness, orientation and detection, the first belonging to the area of global attention, the second to the phenomenal consciousness and the final implying cognitive registration of stimuli being, as Tomlin and Villa themselves point out, separate from conscious awareness and a gateway to subsequent processing. In addition, we can observe that the above amendment is concordant with the results of studies on the selectivity of attention, including certain findings of Broadbent’s experiments as well as Moray’s (1959) cocktail party effect and later studies referring to the so-called unconscious noticing. In fact all these studies concern processes which are phenomenally conscious: in Broadbent’s experiments, though inattentive, the testees were globally aware of the message played to the other ear. As for Moray’s study, we can assume that such is the nature of the special context, the cocktail party, that we are globally aware – to use the phrase again – of the possibility of our name being mentioned in a conversation even though we are not intentionally attending to every exchange held within our hearing range. Finally, when it comes to the models providing for filtering mechanisms (Schneider and Shiffrin 1977), attention spotlights (Posner 1980) or zoom lens (Eriksen and St. James’s 1986), experiment participants had the potential to become conscious of the unattended – out-of-spotlight/out-of-zoom – stimuli owing to the fact that they were globally aware of the whole experimental context. As a result, the stimuli in question had a chance to enter the realm of phenomenal consciousness. In conclusion, the above model, endorsed by the results of previously described studies, validates Schmidt’s Noticing Hypothesis on the level of phenomenal consciousness/noticing: input becomes intake only if noticed in the sense that it is registered by one of the senses. There are, however, two issues to be resolved here; first of all, to what extent Schmidt’s interpretation of noticing accommodates global attention; secondly, whether what Schmidt himself means is noticing in this limited, phenomenal understanding.

80

Chapter One

As for the first question, what Schmidt (1990: 132) calls noticing “at the level at which stimuli are subjectively experienced” – hence in the area of phenomenal consciousness we are considering – are focal awareness, episodic awareness, apperceived input. In the light of this, the role of global awareness in Schmidt’s model can raise doubts. Schmidt (1990) addresses the problem in his consideration of consciousness on yet another level, the one of intention. Following Baars (1985), he states that intentions may be conscious or unconscious, the latter case referring to situations when we do not focally attend to some aspect of input but notice it anyway, which is the case of global awareness. The result is incidental learning, which Schmidt deems possible, stating that “it makes no difference whether the learner notices a linguistic form in input because he or she was deliberately attending to form, or purely inadvertently. If noticed, it becomes intake” (1990: 139). He, however, claims that this is definitely inferior to focused learning and observes that it requires appropriate structuring of a task which facilitates the unintentional yet conscious noticing of the target form. When it comes to resolving the second problem – the scope of noticing proposed by Schmidt – it seem that it goes beyond phenomenal consciousness into the realm of information processing mechanisms called access consciousness (Block 1997; cf. the discussion on the easy vs. difficult problem of consciousness earlier in this chapter). This, in particular, is the case with one type of noticing which, according to Schmidt, is essential to language learning: noticing the gap (Schmidt and Frota 1986). The point is that to notice this gap is not enough to register a certain aspect of input by means of the senses. The very act of spotting the difference between one’s interlanguage (IL) and the properties of the target language (TL) typical of its native-speaker users requires a lot of cognitive operations such as analysing and, frequently, abstracting as well as comparing and evaluating, which, in Lamme’s model will belong to the attended/unattended module representing access consciousness. In Schmidt’s understanding, noticing apparently reaches this stage if it “can be operationally defined as availability for verbal report” (Schmidt 1990: 132). Based on all of the above together with Schmidt’s claim that “attention is necessary for all aspects of L2 learning” (Schmidt 2001: 3; emphasis mine) we can infer that noticing in Schmidt’s understanding is operationalised at all stages of input processing, not only at the level of the sensory registration. As a result, trying to answer the two questions, we can say that the first problem can be resolved at the level of phenomenal noticing and the verdict is that, in the light of Schmidt’s hypothesis, global attention is conducive to learning but only under certain conditions and it is definitely inferior to focal attention. As for the scope of noticing as Schmidt operationalises the term, it definitely goes beyond the phenomenal noticing, into the access consciousness. That is why the question of whether the strong version of Noticing Hypothesis can be defended in

On the importance of noticing: attention

81

the realm of higher level awareness, which is defined by Schmidt (1995, 2001) as the level of understanding, will be presented in the following subsection, where the problem of conscious awareness vis à vis understanding is discussed. 3.2.2.2. Awareness on the level of understanding When it comes to awareness on the level of understanding as opposed to awareness in noticing, Schmidt describes the difference in the following way (1995: 29): I use “noticing” to mean conscious registration of the occurrence of some event, whereas understanding, as I am using the term, implies recognition of a general principle, rule or pattern. Noticing refers to surface level phenomena and item learning, while understanding refers to a deeper level of abstraction related to (semantic, syntactic, or communicative) meaning, system learning (Slobin 1985).

To further explicate the distinction, Schmidt (1995) gives the following examples: in forensics, noticing is about evidence collection while understanding implies creating the theory of the crime committed; in grammar – including artificial grammar learning – spotting a pattern means noticing while abstracting the rule implies understanding; in vocabulary learning, noticing consists of conscious registration of form while understanding amounts to apprehending the meaning of the word; in pragmatics, becoming aware that a speaker uses a certain utterance is an act of noticing whereas discovering the strategic intentions behind the utterance is understanding. When it comes to other similar constructs in attention-oriented literature, Schmidt (1995) relates his idea of understanding to two other concepts labelled differently by different authors: detection within selective attention (Tomlin and Villa 1994); or apperception (cf. Gass 1997; Gass et al. 2003). These two notions, as pointed out by Gass et al. (2003), share a number of characteristics. First of all, their definitions are convergent: detection, to go back to a quotation already used in the present paper, is “a cognitive registration of sensory stimuli ... a process that selects, or engages, a particular or specific bit of information” (Tomlin and Villa 1994: 192); apperception, on the other hand, is “a process of understanding” (Gass et al. 2003: 499; cf. also Gass 1997) which relates what we notice about the item in question to our previous experience. Secondly, as observed by Gass et al. (2003), when detection and apperception are considered in the context of language learning, they are both viewed by the proponents as crucial to the acquisition of grammatical patterns. “Once a grammatical alteration is detected, or, more precisely, once a token or instance of a grammatical alteration is noticed, it is then available for further processing” (Tomlin and Villa 1994: 198). Similarly, apperception is conceptualised (Gass 1997: 23) as “a priming device” which “prepares the learner for the possibility of subsequent analysis”.

82

Chapter One

In the light of the above, we have to conclude that both detection and apperception are a kind of go-between processes which mediate between what is phenomenally noticed and what is understood. Such an interpretation is in agreement with Robinson’s (1995) idea that detection alone is not likely to result in the permanent effect which is desired in any kind of learning; it is just the first stage which has to be accompanied by rehearsal in the working memory, which is a kind of lead-in to awareness. As a result, as Robinson (1995) argues, while for detection itself awareness may not be required, the working memory rehearsal will definitely complete the job of conscious understanding. This takes us back to one of the cognitive attempts to explain the neurobiology of consciousness, proposed by Crick and Koch (1990, 2002; described in Section 2.1). The two proponents suggest that any representation initially appears implicitly in the primary cortex and later, explicitly in the higher cortex, the latter, explicit representations being the neural correlates of the underlying subjective experience. In such a model, Tomlin and Villa’s detection and Gass’s apperception will be mediating processes whose aim is to bridge the gap between the primary and the higher-order processes. As a result of this interim, transitory nature, they will potentially belong neither to phenomenal nor access awareness. This is most probably how Tomlin and Villa’s (1994) statement that detection is separate from awareness should be understood. The acknowledgement that detection/apperception is separate from awareness is exactly where Schmidt’s understanding and the concept of detection seem to diverge. Schmidt (1995) argues that detection without awareness is restricted to phenomena such as subliminal perception, blindsight and implicit memory, all of which are either irrelevant to learning or their role is impossible to determine. While, based on quoted experiments (for details see Schmidt 1995: 23-25), he acknowledges subliminal perception and blindsight as fact, he discards them as conducive to learning. In turn, while addressing the role of implicit memory in learning, Schmidt (1995, 2001) agrees with Gass (1997) that successful language learning goes beyond input, among others into implicit memories of past learning events; he admits that (1995: 25) “past learning experiences affect current behaviour even if we do not consciously recall the relevant prior events” as well as the fact that “implicit memory is clearly relevant to understanding foreign language learning” (Schmidt 1995: 25). He, however, states that it is difficult to relate this problem to the Noticing Hypothesis because we have no way of deciding whether the learner was aware of the memory when it was formed. Generally speaking, according to Schmidt (1990, 1995), in the question of implicitness in learning there are too many ambiguities to be resolved. For example one of the symptoms of implicit knowledge (to be discussed in Chapter 2) – the unavailability for verbal report – can be ascribed to memory problems or to the lack of the appropriate metalanguage and not necessarily to learning without awareness. Consequently, Schmidt (1990) prefers to see the explicit and the implicit as a kind of continuum rather than in terms of opposing learning modes.

On the importance of noticing: attention

83

As a result, answering his own question (Schmidt 1995: 30-31) of whether understanding (system building, abstraction, generalisation, etc.) proceeds on the basis of “conscious processes” or “more basic unconscious mechanisms that may be encapsulated in a way that makes them unaffected by any conscious knowledge”, Schmidt subscribes to the former stance, a model of language acquisition which sees the learner as a scientist. Consequently, understanding, which amounts to comparing the newly noticed input with previously noticed inputs, “all this mental activity – what we commonly think of as thinking – goes on within consciousness” (Schmidt 1990: 132). Based on this assumption, Schmidt criticises the unconscious learning perspective, in the light of which the learner’s knowledge of language, unlike the one of the linguist, is not a result of conscious analysis. First of all, Schmidt (1995) dismisses as vacuous the already quoted complexity of input argument put forward by Krashen (1985, 1994), in the light of which language as a system is too multi-faceted to be learned consciously. This, as Schmidt (1990) observes, referring to Chomsky’s concept of cognisance, may be well justified in relation to the mother tongue but not the second or foreign language. He claims: “Chomsky’s rejection of any role for conscious attention or choice in first language learning rests on the assumption of uniform and complete learning by all children. This argument cannot be made for language learning” (1990: 142) in whose case “paying attention to language form is hypothesised to be facilitative in all cases ... particularly crucial in adult acquisition of redundant grammatical features” (Schmidt 1990: 149). Addressing the views of other authors (Paradis 1994; Van Patten 1984, 1994; cited in Schmidt 1995: 39-40) who, subscribing to views similar to Krashen’s, claim that some aspects of grammar are too subtle to be actually analysed by the learner, Schmidt (1995) states that none of them provides a very convincing case for implicit awareness-independent learning. First of all, as Schmidt notes, the subtleties listed by Paradis or VanPatten are exactly those aspects of learning that are actually “notorious problems in foreign language learning” for “both instructed and uninstructed learners”. As a result, the claim that such marked forms and form-meaning mappings are learned unconsciously seems unsubstantial, as they are not learned at all. Secondly, Schmidt censures VanPatten’s claim that a hypothetical student might form only partial rules and – unconsciously – fill in the gaps in the system by default, which would be a compensatory technique similar to the one utilised by blindsight patients20. Schmidt (1995) claims that detection without awareness which is typical of such patients is rather limited in ––––––––– 20

Blindsight patients, as a result of cortical damage, suffer from extensive gaps in their visual field. However, as numerous experiments have shown, when asked to guess what could be in the blindsighted area, they usually provide astonishingly correct answers (after Dennet 1991)

84

Chapter One

scope: the blindsight patients can mentally reconstruct only very simple items. In the light of this claim, VanPatten’s hypothetical student would actually be incapable of reconstructing anything complex or subtle and that is the core of the complexity argument. Additionally, it is impossible to determine what the hypothetical student may or may not be aware of. All in all, Schmidt (1995) sees some role of implicit induction in foreign language learning. However, arguing strongly against implicit abstraction, he limits the application of implicit learning to cases of fuzzy grammatical rules, in the case of which only associative, connectionist learning is possible. This is the only case where, according to Schmidt, awareness on the level of understanding is unnecessary; in all other cases, the cases of system building involving categorisation and abstraction, awareness is definitely involved. 3.2.2.3. The strong version of the Noticing Hypothesis. A discussion As was mentioned in Subsection 3.2.2.1, the strong version of Schmidt’s Noticing Hypothesis seems uncontroversial on the level of phenomenal consciousness. Also when it comes to access consciousness, it must be admitted that Schmidt’s arguments for conscious understanding sound very convincing. Even if based on implicit induction, the grasp of the factor uniting the collected tokens as part of the overall language system is definitely conscious as is the feeling of understanding: figuring out the place of an instance structure within a much vaster blueprint of grammar. There are, however, a few questions to be raised here. They include the role of the inborn capacity for language, applicable to the mother tongue but also, to some extent, argued for in the case of the other languages; the importance of the different types of attention and attentional pools described earlier in this chapter; the question of conscious and unconscious detection; and the role and scope of implicit learning. They are all addressed in the present subsection. First of all, there is a line of language theorists, starting with Krashen (1981) and Seliger (1983) and going through names so numerous that giving credit to all of them is beyond the scope of the present work, who believe that “obviously, it is at the unconscious level that language learning takes place” (Seliger 1983: 187). Most of those who subscribe to this view are also in favour of an inborn capacity for language learning. Schmidt, quoted in the previous subsection, rules out the question of cognisance in languages other than mother tongue. However, it seems that applied linguistics has yet to resolve the extent to which learning a new language involves the utilisation of a kind of language acquisition device, a toolkit we are born with (Jackendoff 2002) and which, working unconsciously/fconsciously for the native tongue, most probably does the same – at least to a certain extent – for any other language. Before the final verdict though, such an option cannot be ruled out entirely, especially in the light of a very convincing argument presented in Krashen (1985; 1994) concerning our inability to consciously attend to all of the numerous aspects of input.

On the importance of noticing: attention

85

Secondly, it has to be admitted that Schmidt’s notion of attention being conducive to learning on every level of input processing seems to agree with Lamme’s model in that there is a visually perceivable narrowing down along the globally attended-conscious-attended-conscious report line. To translate this visual effect into words, we can say, following Lamme (2003: 13) that “we are conscious of many inputs but without attention this conscious experience cannot be reported”. However, this very statement endorses the strong version of the Noticing Hypothesis only to a certain degree. It has to be emphasised that Lamme does not state that without consciousness of input and attending to it cognitive experience does not exist; he simply claims that it is not subject to verbal report. It is true that Lamme’s model does not explain the whole range of processes in which the unattended inputs contribute to the cognitive experience; neither does he acknowledge the role of mental processes that are unconscious or the way in which and the degree to which they influence input processing. Such explication, however, does not seem necessary in the light of our earlier discussion of studies which see attention as a multi-faceted and multi-pool phenomenon, intensive as well as extensive, in the light of which the role of unattended learning cannot be ignored. There is also a word to be spoken for the unconscious, f-conscious or automated, all discussed in the first section of the present chapter21. At this point it needs to be admitted that the studies to which the above claim refers did not investigate SLA-related problems. Consequently, basing any claim about the nature of language learning on the said research can be subject to critique similar to the one Simard and Wong (2001) expressed in relation to Tomlin and Villa’s (1994) fine-grained model of attention. Knowing that this model was based on the research of Posner et al. (Posner and Petersen 1990; Posner and Rothbart 1992) in the area of general cognition, Simard and Wong (2001: 109) state “[i]t is not clear how such systems might inform us about how attention might affect L2 learning”. Considering how much we know – or rather how much we still do not know – about the workings of the human brain, it has to be admitted that the above censure cannot be fully refuted. At the same time, and for the same reason, it cannot be accepted either. As a result, we have to assume that, unless proven otherwise, what refers to general cognition has to a certain extent to be true for language learning, language as a cognitive skill (Skehan 2003) being part of cognition. All this is in accordance with the new understanding of the modularity of the brain argued for by Damasio (1994), in the light of which the brain is a system of interconnected systems. This interconnectedness is a feature of the parallel architecture in which language as a system communicates with general cognition via yet another interface (Jackendoff 2002). ––––––––– 21

And revisited in Chapter 2, in relation to implicit learning, where the question of the superiority of the attended over the unattended is going to be reconsidered.

86

Chapter One

As a result, if general intake involves both conscious and unconscious processes (cf. Baars Global Workspace Theory), so does language learning. This is further endorsed by the fact that, as pointed out by Lewicki et al. (1992), non-conscious acquisition of knowledge in human cognition is virtually ubiquitous, especially in the case of complex patterns of stimuli in which there are too many input facets to consciously control all of them. Taking all this into account, our considerations of the role of attention can legitimately be based on the results of studies in the area of general cognition and to acknowledge the insights they offer. It has to be pointed out that it is within the very territory delineated by Schmidt’s hypothesis that unattended/f-conscious learning seems particularly relevant. When it comes to the earlier mentioned phenomenon of noticing the gap, it stands to reason that the processes underlying IL/TL comparisons cannot be exclusively conscious. Considering the previously quoted Krashen’s multiaspectuality of input, we may consciously notice some differences between certain TL input features and relevant aspects of our interlanguage; however, there may be certain other characteristics of the said difference that will be attended to globally rather than focally. What is more, even if certain aspects of TL input are noticed in a focal manner, the very TL/IL comparison may be carried out unconsciously, as argued by Sharwood-Smith (1981), Rutherford (1987), and McLaughlin (1987)22. In addition to that, certain explicit/consciously applied TL/IL comparison strategies that we utilise – taught to us or worked out on our own – may become automatised to such an extent that their application will be fconscious. Finally, the still not fully resolved role of our native tongue may be such that language processing will be partly unconscious because of whatever is left of the Language Acquisition Device is activated in – or exapted to (Pinker 1994) – the process. As a result, the IL/TL gap will be noticed and processed as a certain gestalt, unanalysed into perceivable items and consequently unattended to in the fully conscious mode Schmidt proposes. As for Schmidt’s (1995) claim that there is no detection without awareness apart from phenomena, whose importance to learning is close to zero, it is, to a certain extent supported by Simard and Wong (2001). As was mentioned before, they discard Tomlin and Villa’s (1994) claim that detection can exist without awareness – as well as without alertness and orientation – on the grounds that their fine-grained model is based on research (Posner and Petersen 1990; Posner and Rothbart 1992) whose results cannot be generalisable to the context of SLA. However, Simard and Wong do not deny that within the model, whose finegrainess they accept, the individual components can demonstrate varying degrees of activation and the expression not always highlighted in the quotation below suggests that, on occasion, detection may happen with, for example, a low level of ––––––––– 22

As noticed by Fotos (1993: 386), where all three authors are cited.

On the importance of noticing: attention

87

alertness, no actual orientation and, consequently, without full awareness in Schmidt’s understanding. Simard and Wong write (2001: 119): Given that a variety of variables may influence attentional demands during input processing, we proposed that a more feasible way to conceptualise attention is not to view alertness, orientation, detection and awareness as always (emphasis mine) separable, but as coexisting and interacting in graded levels, and whose degree of activation is determined by the nature of the task, the linguistic item in question, and individual differences, among other factors.

Additionally, what the above-quoted fragment confirms once more is that attention is too complex a phenomenon for any final judgments to be made23. As a result, a broader, all-inclusive – or at least more-inclusive – model seems a better pedagogical choice. In turn, when it comes to Schmidt’s scepticism about implicit learning, there are studies proving that learning a language without intention as well as without awareness can be successful (to be discussed thoroughly in Chapter 2). One such study to be mentioned here is the one carried out by Williams (2005). In an experiment requiring the testees to learn miniature noun class systems, the participants were taught a rule about the use of determiners with nouns, in the light of which the choice depended on the distance between a given noun and the subject of the sentence. Another rule, according to which, it was also the animacy of the noun in question that influenced determiner use was withheld from the testees. Williams found out that her respondents, after exposure to lists of sentences containing determiner+noun combinations, scored above chance on a subsequent test which checked the generalisation of the rule. In the light of this observation, Williams (2005) concludes that learning form-meaning connections without awareness is possible. Finally, addressing Schmidt’s claim (1995) that implicit learning is limited in scope as confined to fuzzy grammatical rules, it is hard to escape the impression that Schmidt sees such learning opportunities as something peripheral to the mainstream, rule-governed learning of language as a system. This is a very Chomskyan24 assertion and an early Chomskyan assertion too. Since his declared syntacocentrism ––––––––– 23

24

Simard and Wong’s position is subject to critique on the part of Leow (2002), who, as indicated previously, is very much in favour of Tomlin and Villa’s (1994) model (CF. Leow 1998). As indicated in my own comment preceding the quote from Simard and Wong, I personally believe that Leow’s remarks are exaggeratingly critical, as the two models – Tomlin and Villa’s as well as Simard and Wong’s – are conceptually quite close to each other in spite of the reservations the latter authors express about the concept of the former authors. The claim is Chomskyan in its approach to language as a system and not in its attitude to the problem of language acquisition. As regards the latter, Chomsky is of the opinion that language learning is incidental (cf. Chomsky 1975)

88

Chapter One

(Chomsky 1965), linguistics has witnessed the birth of such breakthrough ideas and theoretical positions as Rosch’s (1975a and 1975b) prototype theory, Langacker’s (1987) cognitive grammar, Lakoff’s (1987) study of the human ability to categorise or Taylor’s (2003) work which, among others, looked at the ways we categorise grammar structures. In the light of all of the above – which are but a fraction of a whole body of research – it can be stated that fuzzy categories and associative reasoning, including associative categorisation carried out on the basis of family resemblance (Rosch 1975a and 1975b; Lakoff 1987) may be far from peripheral. Moreover, even if we assume that language as a system is elegantly structured, we have to account for two facts: individual differences in learning and the role of the unconscious/f-conscious in arriving at the consciously aware stage of understanding. As for the individual differences, not all people learn it in an equally elegantly analytic way. As argued by Skehan (2003), in addition to learners who rely on their grammatical sensitivity, there are memory learners who acquire language material unanalysed. In the light of all this, Schmidt’s (1990, 1995, and 2001) claim that there is no or very little learning without awareness seems a little far-fetched. When it comes to the role of the unconscious/f-conscious, even if the final stage of abstraction is conscious, we can predict that the way to this conscious state of understanding has not necessarily been fully conscious. This is because the underlying categorisation might have been performed on the basis of family resemblance motivated by a stereotypical, gestalt perception of the item in question. Such a view is decisively endorsed by the research conclusions presented in Williams (1999: 37) when she states: Finally, it has been argued that inductive learning does not have to be seen as either an explicit process of hypothesis formation and revision or as an implicit process of unconscious rule abstraction. Rather, a middle position is possible whereby knowledge that is sufficiently explicit to transfer to other tasks can result from the implicit structuring effect of long-term memories of (appropriately structured) instances or the learner’s conscious perception of current input without the learner necessarily having an explicit intention to learn grammatical rules as such.

In addition to this, we have to take into account the fact that even goal-oriented activities – focusing intensively and consciously on a particular language form being an example of such an activity – which, by definition, require the awareness of what is to be accomplished, may not be as exclusively conscious as Schmidt wants them to be25. Bargh and Bandollar (1996) argue that goal-oriented activities are conscious and intentional as regards their “top node” in the goal system. This system, however, is a network of a number of other nodes under which certain substrategies and subprocesses aimed at goal attainment are subsumed. According ––––––––– 25

“Problem solving belongs to consciousness.” (Schmidt 1990: 133)

On the importance of noticing: attention

89

to Bargh and Bandollar, some of these subprocesses may be fully automatised, and, as a result, their performance does not require the involvement of any attentional resources. Such subprocesses in the overall goal architecture are referred to as chronic and the unconscious is seen as their repository (Bargh and Bandollar 1996: 457). In this sense, even an activity like goal execution, whose overall consciousness cannot be questioned, can have a partly unconscious internal composition. The concept is applicable to language processing in the form of the shortcuts proposed in Dbrowska (2004; discussed earlier in this chapter). The above claim shows once again that the distinction between the conscious and the unconscious seems to be on the fuzzy side. As already pointed out, the most clearly in Baars’ Global Workspace theory – but also with regard to Kihlstrom’s (1987) cognitive unconscious and Jackendoff’s (2002) f-mind – the boundaries between the systems are fuzzy and the individual contributions of the conscious and the unconscious are hard to determine on every level, starting from the phenomenal consciousness of noticing and finishing with the access awareness and noticing the gap. That is why, when talking about cognition in general and skill learning – such as, for example, language learning – as a cognitive process in particular, we always have to take into account two subsystems remembering that they are in constant cooperation as proposed in dual-process theories, which were discussed in Section 2.3. Summing up, examining the problem of noticing against such issues as an inborn capacity for language, different types of attention and attentional pools the question of conscious and unconscious detection; and the role and scope of implicit learning as well as dual processing we can conclude that the best stance to adopt on noticing is the middle-of-the-road approach which allows the importance of conscious noticing and focal attention to be fully estimated while also acknowledging the role of holistic, pre-conscious processes. At the same time, however, one more question needs to be addressed: the one of focal as opposed to global awareness. Following another quotation from Schmidt (1995: 20) that “focal attention and awareness are essentially isomorphic”, we can ask whether listening to hear and watching to see really requires attention to be fully focused or if a kind of global awareness of input would be enough. And even if the answer is to the fore of Schmidt’s position – which may not be the case considering our discussion of intensive and extensive attention – another question that arises is what exactly it means to be focused or not. Is it the 0-1 situation Searle (1990) argued for or can focused-unfocused be seen as a kind of continuum as suggested in Arvidson’s model (2004, 2006) and earlier in the zoom lens model (Eriksen and St.James 1986), among others? Taking all the above into account, the dualprocess models as well as the global/extensive vs. focal/intensive attention problem, it seems legitimate to state that focusing is a matter of graduity and is better understood as a place on a certain unconscious-globally conscious-focally conscious continuum and not in terms of a switch on/off mechanism.

90

Chapter One

At the same time, the importance of attention in learning is not – and never has been – subject to doubt. It is certainly indispensible on the level of phenomenal noticing and this is where the strong version of the Noticing Hypothesis should be adopted. However, in the light of the above, focal attention and noticing seem to be facilitative rather than mandatory, as declared by the proponents of the weaker version of the hypothesis in the case of both noticing the gap and system adjustment. At the same time, it seems most probably that conscious noticing will be indispensible in the case of all that is “infrequent, nonsalient and communicatively redundant” (Schmidt 2001: 23), where preparatory attention and voluntary orienting are the only ways to improve encoding (Schmidt 1995). What remains to be pointed out is that all these types of attention will be constrained by a number of factors discussed in the following section. 3.2.4. Constraints on noticing In the understanding of both versions of the Noticing Hypothesis, attention is reinforced by a number of factors, including (Schmidt 1990: 143) instruction, frequency, salience, skill level and task demands. Instruction may take on different forms. It may be a detailed study of language material by, for example, indicating the patterns worth noticing in the process of rule formation or using deductively presented rules as a point of departure, a kind of blueprint (Ellis 1994) for further linguistic investigations; it may also mean raising general awareness of the formal aspects of language or simply flooding the learner with language tokens. All these will be given due attention in Chapter 2, which is specifically devoted to focus on form as understood in terms of language instruction. In addition to a number of practical solutions, Chapter 2 will quote the results of ample research into the effects of instruction on learning. There is, however, one study which will be quoted here as it looks specifically at the relationship between instruction and noticing: Fotos (1993) observed that grammatical structures which are subject to explicit tuition are more readily noticed in the subsequent communicative input. This, in turn, results in an increased use of the noticed structures in output (ibidem). When it comes to the role of frequency, it apparently does not need any detailed comment. We all have an intuition that the more recurrent something is, the greater the chance for it to be noticed; an intuition proved by a number of studies (including N.Ellis 2002); and an intuition leading to conclusions referring to appropriate materials design (Biber and Reppen 2002). Frequency is important because, as N. Ellis puts it (2008: 373), “language processing is rational in that it is exquisitely sensitive to prior usage”. In an earlier publication, N. Ellis (2002: 143) shows the effects of frequency on acquisition in virtually every area of language, starting from phonology, phonotactics, through reading, spelling, lexis,

On the importance of noticing: attention

91

morphosyntax, to formulaic language, language comprehension, grammaticality, sentence production, and syntax. If we assume, after cognitive linguistics, that in any of these areas language learning is usage-based, we can observe that “[t]he regularities of language emerge from experience as categories and prototypical patterns. The typical route of emergence of constructions is from formula, through low-scope pattern, to construction” (N. Ellis 2002: 143). Such abstraction of regularities is, to use N. Ellis’s (N. Ellis 2002: 144) terminology, “frequencybiased”: the representations we form reflect the probability of a certain formmeaning pairing to occur, probability which is directly proportional to how often such a pairing is noticed. What, in addition to frequency, is conducive to noticing is also, according to N.Ellis (2008) the reliability vs. contingency of the relationship between certain cues and outcomes. In other words, we are sensitive to how frequently a certain constant relationship – and not just a separate linguistic unit – can be spotted in input. This is particularly important for syntagmatic combinations (how often two forms can be noticed together), paradigmatic relationships (how often a certain token represents a given type), and form-meaning mappings (how consistently a particular meaning is represented by the form in question). In the case of multiple cues leading to a certain outcome, noticing will be constrained by the cue that wins the competition, which will be based on its frequency (N. Ellis 2002; 2008). There is, however, the other side of the frequency coin. What is frequently spotted becomes known, familiar. However, familiarity may also imply what Stern (1997: 62) calls “unquestioning acceptance” which prevents “deployment of curiosity”. Stern (1997: 62-63) explains the effect of the frequent that becomes familiar in the following way: The familiar swallows anything. It is bottomless. When experience fades into the familiar, it loses substance; it becomes a ghost of itself. It may be gone forever, irretrievable in its original form ... “The known,” Hegel wrote somewhere, “just because it is the known, is the unknown”.

In the light of the above, we can assume that in the case of language learning, frequent encounters with a given form will lead to its automatisation which, in addition to positive effects like fluency of use will also – as indicated in Section 1.1.1 – makes it unavailable for control and further analysis. Consequently, in spite of the role frequency plays in language learning, we cannot, as pointed out by Tarone (2002: 291) think of the second language learners’ interlanguage as something that is, in its entirety, passively and unconsciously derived from raw input frequencies; and passively and unconsciously accessed in performance ... Learners’ noticing of language forms must lead them to see those forms as potential objects of analysis (G. Yule; personal communication – July 2001); their creativity can then lead them to see those same forms as potential objects of language play.

92

Chapter One

What is more, frequency as such is not a guarantee of language learning success, if we do not take into account factors such as developmental and L1 effects. As a result, as argued by Gass and Mackey (2002), we always need to consider frequency in relation to other aspects of SLA processes. The above-described effects of frequency and reliability can also be related to noticing on the level of understanding (cf. earlier in the present chapter) being, at the same time, placed within a broader context of the psychology of human judgment motivated by heuristics and biases (Tversky and Kahneman 1974). The ways in which we categorise new experiences, language encounters included, are to a certain extent simplified by “heuristic principles which reduce the complex tasks of assessing probabilities and predicting values to simpler judgmental operations” (Tversky and Kahneman 1974: 1124). The two main heuristics are the heuristic of availability and the heuristic of representativeness (Tversky and Kahneman 1974). The former boils down to the easiness with which the highly frequent or probably instances are recalled. The latter is based on the reliability of the newly encountered sample: on how many qualities of the ideal representative it actually demonstrates. The judgmental operations are called “simpler” (cf. above), because the judgments we make are not analytic, algorithm-based but gestalt, intuitive and instantaneous. In the case of language, a newly encountered form and its meaning will be processed in a way which does not follow a lockstep rule-governed approach; the processing will imply speedmatching the language chunk against other already known patterns motivated by the frequency of these patterns as well as their reliability – in terms of matching characteristics – as the blueprint/matrix for the new language chunk. As far as salience is concerned, it has to be understood both pedagogically and from the point of view of individual preferences and predispositions. On the one hand, there are a number of pedagogical ways to make the feature-to-benoticed more prominent, including all kinds of highlighting, to be discussed in Chapter 2 as focus-on-form micro-options. Looking from the other, individualrelated perspective, we have to remember, as relevance theory (Sperber and Wilson 1986a and 1986b) suggests, that human beings notice what they find relevant in terms of cognitive effects – the better the effect the more relevant the input – and processing effort – the greater the effort the less relevant the input. What, however, will be salient to an individual is hard to determine without reference to his/her broadly understood personal context. We can only speculate, following Batstone (2002), that all individual needs motivating attention to language will be derived either from communicative expediency – a particular form seems pragmatically effective – or from the desire to make sense of language as a system. As a result, the constraint of learner’s expectations (Schmidt 1990: 143) – what a particular individual expects to be found in input and what kind of conceptual background he/she evokes – will also be a factor contributing to what is salient in input.

On the importance of noticing: attention

93

In addition to all the extralinguistic factors presented above, it needs to be mentioned here that salience is decidedly an intralanguage feature. As Talmy (2008: 27) puts it: language has an extensive system that assigns different degrees of salience to the parts of an expression or of its reference or of the context. In terms of the speech participants, the speaker employs this system in formulating an expression; the hearer, largely on the basis of such formulations, allocates his or her attention in a particular way over the material of these domains.

The ways in which language allocates salience have been discussed in a number of publications, some of which will be presented in Part II, which is devoted to focus on FORM. To briefly introduce a few of them here, we can mention Rosch’s (1975a and 1975b; see also Lakoff 1987) prototypes; Fillmore’s (1982) frame semantics; Langacker’s (1987) profiling and active zones; or Talmy’s (2000a and 2000b) figure/ground in a represented situation. Talmy (2008) himself points out that “various portions or aspects of the expression, content, and context have differing degrees of salience” (27) as a result of the following domains of factors setting strength to attention (28 and the following pages): 1. 2. 3. 4. 5. 6. 7. 8.

Domain A: factors involving properties of the morpheme Domain B: factors involving morphology and syntax Domain C: factors involving forms that set attention outside themselves Domain D: phonological factors Domain E: factors involving properties of the referent Domain F: factors involving the relation between reference and its representation Domain G: factors involving the occurrence of representation Domain H: factors involving properties of temporal progression

As a detailed discussion of all these factors is beyond the scope of the present work, it will be limited to just a few examples of the factors above. Language salience will, therefore, be affected by: individual morphemes themselves – domain A – in the sense that an open-class morpheme will be more efficient in drawing attention than a closed-class item (his previous visit is more salient than when he last visited); the initial or a pre-verbal position – domain B – will, as Talmy puts it, foreground the referent of a constituent; particular phonological shapes and vocal delivery – domains C and D – will single out items; the referents divergence from norm – domain E – will be an attention drawing factor (scream will be more salient than shout); the content-over-form bias – domain F; the context, that will help to select a subset of concepts from a complex overt expression – domain G; the fact of whether we can or cannot make deictic reference – domain H. Finally, it is important to be aware that, in addition to pedagogical highlighting and individual preferences as well as intralanguage factors, formal

94

Chapter One

salience will be related to the attentional resources currently available in the online competition between form and meaning, discussed earlier in this chapter. If the message can be understood in spite of its formal properties, the form will not be salient enough to be noticed (Slobin 1985; VanPatten 1994 and later works, Wong 2001). Consequently, whether or not formal aspects of language are noticed will depend on learner-related and task-related factors. As a result, the learner’s noticing ability will positively correlate with the efficiency of their working memory and their analytical ability (Skehan 2003) and negatively – with the complexity of the task. The above remark briefly explains the constraints on noticing called skill level and task demands. Some guidelines considering the two are also offered in Cowan et al. (2005). Assuming that attention is “potentially involved in the reception, maintenance, and retrieval of information” (Cowan et al 2005: 51), we can assume that the amount of control the learner will exercise on the language input will depend on a number of factors. First of all, it will be related to the familiarity of the task. Cowan et al. (2005) argue that unfamiliar information may be represented in a greater number of chunks; familiar input is potentially categorised, grouped and has already been rehearsed. As a result, the familiar will be processed more easily, while the unfamiliar leads to cluster-on-the-worktable problems. Secondly, we need to consider the presence/lack of potential distracters, which, if co-processed, lead to the shadowing effect or the so-called Stroop effect, as indicated earlier in this chapter in relation to language anxiety. Yet another important factor related to task demands is the speed of the presentation. If items to be processed are presented quickly, there is little time for categorisation and rehearsal of new material or recognition of the already known data and effective retrieval of already existing patterns. Finally, task execution will be affected by the organisation of the material being presented. Presentation which is chaotic or does not comply with certain genre schemata results in processing problems similar to the difficulties related to the speed of presentation. When, in turn, it comes to skill level, noticing, as already mentioned on a number of occasions, will be constrained by individual differences related to attentional capacity as well as the current position of the learner on the novice-expert continuum. The influence of the latter on noticing was demonstrated, among others, by the results of a study carried out by Philp (2003), in the course of which 33 adult ESL learners received recasts of their nontargetlike question forms in 5 sessions of dyadic interaction with native speakers. Philp observed that even though the average noticing ratio amounted to 60-70%, there were individual differences among testees, caused, among others, by their language level. This, as is pointed out in the cited article (Philp 2003), may result from two different factors. Firstly, as Philp (2003: 104) argues referring to relevant research (Ellis 1994, McLaughlin 1987, Long 1996 and VanPatten 1996), in more advanced learners certain aspects of processing may already be automatised and, as such, make it possible for attentional resources to be allocated to other foci. Secondly (Mackey

On the importance of noticing: attention

95

1999, Mackey and Philp 1998), the advantage the more proficient language users have over novices may be related to learner readiness which translates into developmental stages in language pedagogy. Consequently, Philp’s lower level testees might not have been ready to notice some recasts as they went beyond their current language capacity. All the above-mentioned constraints on attention – instruction, frequency, salience, skill level and task demands – should be taken into account either as indispensible for learning or as simply facilitative. The latter approach will refer to all cases where noticing will be seen as “the gateway to subsequent learning” (Batstone 1994: 100). The former stance will be the case in the areas in which it seems justifiable to subscribe to the strong version of the Noticing Hypothesis presented in Schmidt (1990), including, as it seems, all infrequent, peripheral and marked forms or form-meaning mappings. In conclusion to the first chapter, it needs to be said that attention in general has to be considered in the context of selectivity and conscious awareness. The former was examined in relation to the problem of the intensiveness/extensiveness of attention as well as controlled vis à vis automated processes. All this was considered in relation to attentional capacity. As a result, attention was linked to working memory and examined in the light of theories that accommodate such a connection. When it comes to consciousness, it was discussed on the phenomenal and access levels and in relation to dual-process theories. These considerations of attention, carried out along the two different paths, have a number of consequences for first and second/foreign language processing, which have been described in the present chapter. When it comes to language pedagogy, which is of major importance to the present work, a number of important points have been made. First of all, there is a need to take into account various trade-off effects when parallel processing is required of the learner. Secondly, we need to acknowledge the importance of both conscious noticing and the underlying unconscious/preconscious/f-conscious processing. This particular observation is based on a consideration of Schmidt’s Noticing Hypothesis in relation to both phenomenal and access awareness. For the former, its strong version can be endorsed; when it comes to consciousness on the level of understanding or higher level cognitive phenomena like analysing, categorising, evaluating etc., the weaker version of the hypothesis seems more plausible, namely that noticing is facilitative to learning, a gateway to subsequent processing. Nevertheless, all this has to be considered in relation to a number of constraints on noticing, which ought to be taken into account regardless of which version of the Noticing Hypothesis – the strong or the weak – we choose to subscribe to. Ways of pedagogically satisfying the constraints will be presented in Chapter 2, which deals with focus on form from the perspective of the language classroom.

CHAPTER TWO LANGUAGE INSTRUCTION: THEORETICAL UNDERPINNINGS AND PRACTICAL OPTIONS The previous chapter looked at FOCUS ON FORM understood as attention, discussing its important characteristics such as selectivity and conscious awareness. Research to date on these two issues was presented, yet the main emphasis throughout Chapter 1 was on the importance of attention in the processing of language, with special regard to second/foreign language learning contexts. Consequently, an important place was given to the Noticing Hypothesis and its stance on the essentialness of conscious awareness in mastering a language other than one’s mother tongue. The two known versions of the Noticing Hypothesis were discussed: the strong one, proposed and developed by Schmidt (1990, 1993, 1995, 2001), in the light of which attention is indispensible in language learning; and the weak one, subscribed to by Ellis (1997), Batsone (1994, 2002) or Truscott (1998), to mention just a few names, who argue to the fore of attention being facilitative but not necessarily a condition sine qua non of successfully learning a language. Considering additionally what was said in Chapter 1 about the different kinds of attention – intensive and extensive – varying degrees of control as well as the role of conscious and unconscious processes, the present book is written with the conviction that a strong version of the Noticing Hypothesis is required but only on the level of phenomenal noticing, while, on the level of higher cortical processing – in spite of the important role that is assigned to consciousness in understanding – both attended and unattended processes are significant in the area of language learning. The above claim is crucial to what the present chapter is going to deal with: FOCUS ON FORM understood as the way in which foreign languages are learned and taught, focusing amounting to the quality and quantity of instruction offered to the learner. In doing so, Chapter 2 follows the renewed interest in instructing learners on the formal aspects of language, which originated from the growing disappointment in the purely communicative language teaching expressed by many researchers (Long 1983a, 1988 [cited in Fotos 1998: 302] and 1991; Fotos 1993 and 1998; Williams 1995; Ellis – numerous publications on instructed learning; and many others). Lyster and Mori (2006), in their reaction to the predominantly communicative classroom, argue for a form-meaning equilibrium which, as they put it in their counterbalance hypothesis, rests on “instructional activities and interactional feedback” acting as “counterbalance to a classroom's predominant communicative orientation” (Lyster and Mori 2006: 269). In fact, such equilibrium is proposed by most publications on the topic. Ellis (1994) as well as Doughty and Williams (2004) – to mention just two of the numerous works – argue that the best results are achieved if instruction is carried out in a meaningful, communicative context.

98

Chapter Two

Reflecting on instructed learning research from a quite recent perspective, Ellis (2008b: http://asian-efl-journal.com/September_05_re.php) puts forward ten principles underlying the instructional end of focus on form, in which he subscribes to the above-mentioned equilibrium: Principle 1: Instruction needs to ensure that learners develop both a rich repertoire of formulaic expressions and a rule-based competence. Principle 2: Instruction needs to ensure that learners focus predominantly on meaning. Principle 3: Instruction needs to ensure that learners also focus on form. Principle 4: Instruction needs to be predominantly directed at developing implicit knowledge of the L2 while not neglecting explicit knowledge. Principle 5: Instruction needs to take into account the learner's 'built-in syllabus'. Principle 6: Successful instructed language learning requires extensive L2 input. Principle 7: Successful instructed language learning also requires opportunities for output. Principle 8: The opportunity to interact in the L2 is central to developing L2 proficiency. Principle 9: Instruction needs to take account of individual differences in learners. Principle 10: In assessing learners' L2 proficiency, it is important to examine free as well as controlled production

Chapter 2 aims to address all of the issues included in Ellis’s ten principles, such as: different types of learning and different types of resulting knowledge; various pedagogical options which facilitate learning; instruction/knowledge and knowledge/knowledge interfacing; constraints on all of the above related to both the learner and the object of instruction. In doing so, however, the present work will try to fit into the discussion revolving around instructed learning, which is actually multifaceted and revolves around questions such as What is focus-onform?; Is form-focused instruction conducive to learning?; or What kind of language instruction is pedagogically beneficial in relation to learner individual differences as well as the object of learning? Based on relevant literature to date, we can observe that in their attempt to answer the first question, the theoretical efforts concerning focus on form are aimed at some kind of systematization of the phenomenon most frequently carried out through bi-polar labelling. As a result, numerous publications take us through consciousness raising vs. instruction (Rutherford 1987, Ellis 1994); implicit vs. explicit focus on form (Norris and Ortega 2000); focus on form vs. focus on formS (Long 1988, 1991) or pro-active vs. reactive focus on form (Doughty and Williams 2004; Ellis 2008a). While consciousness raising and instruction as well as the pro- and re-active routines will be discussed alongside FFI options, the dichotomies that are of primary interest to the present book are the implicitness/explicitness of the pedagogical treatment as well as focus-onform vis à vis focus-on-forms.

Language instruction: Theoretical underpinnings …

99

The first dichotomy to be discussed is the on of implicit vs. explicit language instruction. This distinction poses a definitional problem, as the implicit/explicit is often equated with organic/mechanic; inductive/deductive; learner-centred/teachercentred; noninterventionist/interventionist; or unobtrusive/obtrusive. While all the labels seem legitimate, the present book will try to distinguish the implicit from explicit focus on form by analogy to the difference between implicit and explicit learning. That is why the two learning modes – in general and in the area of language pedagogy – are discussed first; their differences are subsequently mapped onto form-focused instruction. Following Ellis and Bogart (2007), explicit learning is related to explicit teaching, while implicit learning is seen as motivated by ample practice, based on both input and output. At the same time, however, it will be argued that explicit instruction, especially combined with exposure to communicative contexts will facilitate implicit learning; in turn, practice and exposure, if offering considerable frequency of certain forms as well as the salience of relevant form-meaning mappings, is likely to result in explicit, rule-based and reportable knowledge. Related to the implicit/explicit distinction is the other dichotomy of focus-onform vs. focus-on-formS. It was first introduced by Long (1988, 1991), according to whom the former involves directing students’ attention to forms “as they arise incidentally in lessons whose overriding focus is on meaning and communication” (Long 1991: 46), while the latter belongs to the structural syllabus and consists in discrete teaching and testing of individual language structures. What, however, is problematic for any work trying to draw definitional lines is that the abovedistinction has not been applied consistently throughout relevant publications to date. First of all, as pointed out by Sheen (2002), there is a tendency to use the term focus-on-form to describe any kind of grammar instruction, either holistic or discrete. Secondly, what is offered is often self-contradictory. For example, Norris and Ortega (2000) argue that the distinction of focus-on-form from focus-onformS can be carried out along the lines of the following four criteria, which are in concordance with the Task-Based approach to language learning: 1) learner engagement with meaning rather than form; 2) task priority and the naturalness of the target language used; 3) the unobtrusiveness of the instruction; 4) the importance of learner mental process. However, as pointed out by Sheen (2002), in spite of their own criteria, Norris and Ortega (2000) classify as focus-on-form – and not focus-on-formS – a study by VanPatten and Sanz (1995), which violates their own criteria 1 and 3 as well as Long’s (1991) requirement for focus-on-form to be incidental rather than intentional and discrete. That is why, in its presentation of focus-on-form and focus-on-formS studies, the book will follow some of the criteria specified by Rutherford (1987) in his organic/mechanic metaphor of language and developed by Nunan (1996). Finally, the present chapter sets out to address the issues of the effectiveness of form-focused instruction, looking primarily at the questions Ellis (1994) asked

100

Chapter Two

and repeated in the following impressions of his work: 1) To what extent is instruction conducive to learning?; Does learner/instruction matching work?; Does learner training enhance the ability to learn from formal instruction? Ellis’s answers based on the research he quotes will be supplemented by the results of more recent studies. The question of which type of instruction, implicit and explicit, organic or mechanic, is best for a particular grammar problem will be given some attention too, in relation to different kinds of grammar (Lewis 1986; Willis 2005) as well as instruction/learnability factors (Ellis 2006a; N. Ellis 2006).

1. Implicit and explicit learning The reader will, most probably, find Section 1 a little unbalanced. This is because considerably more attention is going to be paid to implicit learning, which will be presented, in detail, through its different definitions and models, as well as in relation to intuition and its role in learning. This neglect of explicit learning – whose philosophy and models are going to be discussed less extensively – is deliberate and results from the fact that the latter type of learning is more common. Based on Schmidt’s claim – see Chapter 1 – that there is no learning without conscious awareness, most educators assume that proper assimilation of knowledge requires explicitness of presentation, and that the resulting expertise needs to be verbalisable. In fact the explicitness and being able to report one’s knowledge verbatim are the mainstays of most educational systems. Much less is known about – and considerably less attention is paid to – learning that is unintentional/incidental or – as in the case of implicit learning – unconscious. Consequently, the present section is an attempt to overcome this bias by giving implicit learning a slight preference over the better known, traditional way of attaining knowledge; a preference that is justified by the advantages of learning implicitly presented towards the end of the description. 1.1. Implicit learning The present sub-section looks at implicit learning in three different ways. The first part seeks to define the phenomenon, an attempt which is far from satisfactory. This results from a definitional hoax motivated by a number of factors such as different operationalisations of the very term implicit; the terminological and conceptual convergence of terms such as implicit learning on the one hand and incidental learning, implicit memory, subliminal perception and automatic processing on the other; the kind of learning processes that need to be implicit for the whole learning event to be classified as implicit; and finally, the problem of consciousness in implicit learning, whose terminological opaqueness (cf. Chapter 1) complicates the definitional process even further.

Language instruction: Theoretical underpinnings …

101

The following part investigates implicit learning through three different paradigms: artificial grammar learning; sequence learning; and control of complex systems. It also describes relevant models explaining what the actual educational the result of implicit learning is. The final part relates implicit learning to educating intuition. 1.1.1. Defining the implicit. The terminological hoax When approaching certain tasks on a day-to-day basis, we very often respond according to certain pre-learned criteria: we have an algorithm specifying which steps need to be taken in a particular situation. If asked to give a rationale for our actions, we are usually able to describe the rules – social, institutional, scientific – that we have followed. However, at times, even if acting in some rule-governed way, we are unable to explicate these rules. This is because “not all our abilities depend on our knowing explicitly how to do them. There also appear to be many examples of our learning to respond in some rule-like way without being able to state the rules that govern our behavior” (Berry and Diennes 1993: 1); we “typically learn about the structure of a fairly complex stimulus environment, without necessarily intending to do so, and in such a way that the resulting knowledge is difficult to express. This is what we mean by implicit learning” (Diennes and Berry 1997: 3). However, coining a satisfactory general definition of implicit learning (IL) is far from easy. As Cleeremans et al. (1998: 407) observe, While everyday life seems replete with examples of situations where we know more than we can tell, including language acquisition and use as well as skill learning in general, it has so far proven extremely difficult to provide a satisfactory definition for IL, let alone to provide clear empirical demonstrations of its existence and to establish exactly what its properties are.

Trying to establish these properties through a review of the relevant literature to date, Frensch (1998: 50-51) quotes as many as eleven different definitions: 1. “[learning] may be implicit, when people are merely told to memorize the specific material presented, but nevertheless learn about the underlying rules” (Berry and Broadbent 1988: 251) 2. “[implicit learning is] the acquisition of knowledge about the structural properties of the relations between (usually more than two) objects or events” (Buchner and Wippich, 1998: 6) 3. “implicit learning should designate cases where some knowledge is 1) acquired without intention to learn …, and 2) capable of influencing behavior unconsciously” (Cleeremans and Jiménez, 1996)

102

Chapter Two

4. “Implicit learning should be identified with evocative mental episodes. It consists of the establishment and use of evocative relations among nonpropositional but fully conscious content” (Dulany 1997: 189) 5. “[implicit learning is when] subjects are able to acquire specific … knowledge … not only without being able to articulate what they have learned, but even without being aware that they have learned anything (Lewicki, Czyzewska and Hoffman 1987: 523) 6. “implicit learning is thought to be an alternate mode of learning that is automatic, nonconscious, and more powerful than explicit thinking for discovering nonsalient covariance between task variables (Mathews et al. 1989: 1083) 7. “the term implicit learning designates an adaptive mode in which subjects’ behavior is sensitive to the structural features of an experienced situation, without the adaptation being due to an intentional exploitation of subjects’ explicit knowledge about these features” (Perruchet and Vinter 1998: 496) 8. “[implicit learning] is characterized by a situation-neutral induction process whereby complex information about any stimulus environment may be acquired largely independently of the subjects’ awareness of either the process of acquisition or the knowledge base ultimately acquired (Reber 1993: 12) 9. “Implicit learning (a) happens in the incidental manner, without the use of conscious hypothesis-testing strategies, (b) happens without subjects acquiring conscious knowledge to account for their performance on tests of their learning, (c) is of novel material rather than involving activation of previously acquired representations, and (d) learning is preserved in patients with amnesia” (Seger 1998: 296) 10. “we will reserve the term unconscious learning for learning without awareness regardless of what sort of knowledge is being acquired” (Shanks and St.John 1994: 367) 11. “essentially we argue that learning is implicit when the learning process is unaffected by intention” (Stadler and Frensch 1994: 423) Trying to systematise the proposals above and integrate them into one definition, Frensch (1998) finds a number of problems concerning knowledge representation in implicit learning as well as the very meaning of the word implicit and its operationalisation. When it comes to knowledge representation, Frensch (1998) points out that Dulany (1997) understands it as simply nonpropositional, Reber 1989 – as abstract, Perruchet and Vinter (1998), in turn, see such knowledge as episodic and fragmentary while other definitions do not address the issue at all. The very meaning the proponents of the above-quoted definitions attach to the term implicit differs as well. According to Frensch (1998), it may be equivalent to unconscious (definitions 1, 9, 10); unaware (5, 8); unintentional, automatic (3, 7, 11); or incidental (9).

Language instruction: Theoretical underpinnings …

103

An additional difficulty faced when trying to define implicit learning is that it is often used interchangeably with incidental learning on the one hand as well as implicit memory on the other. The point is that this is an improper practice, as implicit learning differs from both. As for implicit and incidental learning, the former means learning with no awareness or intention to learn while incidental learning is simply devoid of intention. Considering implicit learning as opposed to implicit memory is slightly more complicated as it requires the introduction of the very concept of this type of memory. Implicit memory was first defined by Graf and Schacter (1985) as a form of memory involved in priming performance and contrasted with explicit – recall and recognition – memory. The foundations for such a definition were laid a whole century earlier, by Ebbinghaus, who, having identified the following memory-related operations (1885/1966: 2): 1. Calling back into consciousness by an exertion of the will 2. Spontaneous return of mental states once present in the conscious 3. Vanished mental states giving indubitable proof that they exist even if they cannot return to consciousness ascribed 2 and 3 to implicit memory. Based on the two proposals – Graf and Schacter’s and the earlier Ebbinghaus’ distinction – Buchner and Wippich (1998: 6) put forward the following contrastive definition of implicit memory and learning: Implicit learning is often described as phenomenally unconscious … In contrast to implicit learning, the notion of implicit memory refers to the effects of past experiences with single events or objects such as words read or solved as anagrams in the acquisition phase. Most important, the learning situation does not need to be incidental. Rather, implicit memory refers to situations in which effects of prior incidental or intentional experience can be observed despite the fact that participants are not instructed to relate a current performance to a learning episode.

Based on the above, Buchner and Wippich (1998) list a number of parallels and differences between implicit memory (IM) and learning (IL). As for the similarities, both IM and IL do not vary much as a function of age; yet, on tests, in the case of both, a longer reaction time is observed for older testees; in explicit tests, in turn, younger testees score significantly better. There is also temporal stability on both: IM and IL are rather time-resistant as opposed to explicit memory and learning that fade away with time. Moreover – the characteristic not mentioned in Buchner and Wippich (1998) but indicated in Berry and Diennes (1993) – IM as well as IL are tied to the surface (=without reference to the underlying rules) characteristics of the incoming stimuli. Finally, they both suffer from second task interference and are remarkably good in amnesiacs. In turn,

104

Chapter Two

when it comes to IM/IL differences, there is, first of all, the specificity vs. abstractness difference: results of implicit memory tests are sensitive to changes in modality; the outcomes of implicit learning, in turn, including implicit knowledge of grammar, can be transferred between different modality instantiations of the same structure. In addition to this, IM and IL differ with regards to the state of awareness: implicit learning is tacit; implicit memory does not have to be unconscious. Continuing along the lines of terminological interference, Berry and Diennes (1993) extend the list of terms which are potentially confused with implicit learning by adding perception without awareness and automatic processing. The former, also called subliminal perception, has already been introduced in the present work (cf. Chapter 1, after Schmidt 1995). It occurs when stimuli presented below the threshold of awareness influence our cognition or affect. As for automatic processing, it amounts to computing of individual stimuli involving learning the links between stimuli or between stimuli and actions. Such links, however, are originally perceived consciously but then the conscious element is forgotten. Finally, when it comes to operationalising the concept, it is important to be aware what kinds of processes have to be implicit to be included in the overall concept of implicit learning. As Frensch (1998: 51) puts it, in any learning episode, implicit or other, three types of processes can be conceptually distinguished: perceptual processes that encode the constituents of learning, cognitive processes that underlie the acquisition of knowledge, and cognitive processes that underlie the retrieval of what was learned.

If, based on Frensch (1998), we assume that each of the three processes: perceptual processes, cognitive processes underlying acquisition and cognitive processes underlying recognition – labelled PP, CPA, CPR, respectively, for the clarity of the subsequent presentation – can be either implicit (I) or non-implicit (N), there will be seven possible combinations to be considered1: PP/CPA/CPR: I/N/N; N/I/N; N/N/I; N/I/I; I/N/I; I/I/N; I/I/I According to Frensch, it will be the two bolded combinations that can be seen as implicit learning per se. He argues that all of the I-initial sequences are excluded as instances of subliminal learning (cf. the case of phenomenal unconsciousness discussed, after Schmidt 1995, in Chapter 1) and the NNI combination pertains more to implicit memory than implicit learning mechanisms. The most ––––––––– 1

Assuming that we are able to talk about implicit learning at all, at least one of the elements in the sequence has to be implicit. Therefore, the NNN combination does not apply here.

Language instruction: Theoretical underpinnings …

105

prototypical learning combination is, according to Frensch, the NII, which if related back to the definitions, quoted at the beginning of this chapter fits the concept of proposals 3, 5, 8, 9 and 10. As for Frensch himself, he opts for the NIN combination. Summing up the meaning and operationalisation problems, he states: “existing definitions of the concept of implicit learning differ in their assigned meaning to the term implicit, just as they differ in whether they demand the learning process or both learning and knowledge retrieval processes to be implicit” (Frensch 1998: 54) and concludes: “meanings that treat implicit learning as learning only rather than learning plus retrieval phenomenon and meanings which regard implicit as equivalent with nonintentional/automatic rather than unconscious/unaware show greater promise in distinguishing the concept” (Frensch 1998: 66). A slightly different perspective is presented in an earlier publication by Berry and Diennes (1993). First of all, unlike Frensch, they extend implicit learning to retrieval. Secondly, they themselves incline towards the unaware more than the nonintentional/automatic definitions of the phenomenon. Their way out of the definitional hoax is via, what they call, “a working characterisation of implicit learning” (1993: 13), which includes the following defining features of implicit learning (Berry and Diennes 1993: 13-16): 1. specificity to transfer a. relative inaccessibility of knowledge in free recall b. relative inaccessibility of knowledge in forced-choice tests c. limited transfer to related tasks 2. incidental learning conditions 3. giving rise to a phenomenal sense of intuition; responses are given because they “feel right” 4. robust in the face of a. time b. psychological disorder, for example amnesia c. secondary task 5. acquired knowledge: less manipulable and more context bound as demonstrated by scores on a memory test which are affected by modality shift from learning to testing a. testees not able to describe what they know and what can be used as evidence b. such knowledge can be elicited by choice-forced test; it is the interpretation of results which poses problems c. no or reduced transfer shown in quoted studies The final question, of whether implicit learning is or is not unconscious, is more extensively addressed by Diennes and Berry in a later publication. They (Diennes and Berry 1997) claim that, as the notion of consciousness is far from transparent itself, it might be better to consider implicitness in terms of the inaccessibility of what the learner knows. This takes us back to Lamme’s (2003) model and the

106

Chapter Two

reportability of knowledge resulting from input that was attended to (cf. Chapter 1). What the learner knows implicitly in Diennes and Berry’s understanding is manifested by the lack of metaknowledge in the following representations (Diennes and Perner 1996): content explicitness and attitude explicitness. These two will, respectively, be – using Reber’s (1993) explication – knowing something and knowing that one knows it and is not just guessing. The latter dimension – attitude implicitness – is, according to Diennes and Berry (1997), close to the below-the-subjective-threshold2 criterion, which they see as a promising interpretation of implicitness. They claim that the implicit learning that takes place below the subjective threshold demonstrates the following characteristics – to repeat, once again, the main elements of the working characterisation by the same authors presented earlier in this section: – it is not subject to transfer of learning; – the learning focus is on particular items rather than underlying rules; – the resulting knowledge is durable, more intact than explicit knowledge. The above will be accepted by the present work as a working characterisation of implicit learning. 1.1.2. Implicit learning paradigms Implicit learning research to date has been carried out within a number of experimental paradigms (see Diennes and Berry 1997 for a detailed overview), out of which three have received the greatest interest: artificial grammar learning; sequence learning; and control of complex systems (also called complex problem solving [CPS] or complex dynamic control [CDC] tasks). The three paradigms are described in the present section. In artificial grammar learning (AGL; Reber 1967) research, subjects are asked to memorise strings of letters generated by a finite-state grammar. They are later informed that there are certain word (=letter) order rules in this system and asked to classify new strings as grammatical or ungrammatical (complying or not complying with these rules, respectively). Sequence learning, in turn, can be subdivided into two experimental domains: one in which facilitated processing of sequences is studied; and the other in which ––––––––– 2

The terms ‘objective threshold’ and ‘subjective threshold’ were originally used in subliminal perception literature (Cheesman and Merikle 1984). In a typical subliminal perception experiment, subjects are given a sequence of trials in which a stimulus is either presented or not. The subjective threshold is the point at which subjects do not know that they know that a stimulus was presented; the objective threshold is the point at which subjects do not know that a stimulus was presented at all. (Diennes and Berry 1997: 3).

Language instruction: Theoretical underpinnings …

107

the next element in a sequence has to be predicted. The first sub-paradigm is represented, among others, by the study of Nissen and Bullemer (1987), in which lights appearing on the computer screen followed either the repeating sequence condition or the random condition. Nissen and Bullemer found that the reaction times of their testees decreased for the former while remaining stable for the latter condition. The sequence prediction task (Kushner et al. 1991), on the other hand, consisted of a successive presentation of 5 different stimuli. The participants were supposed to foresee the place where the sixth stimulus would be located. Finally, when it comes to the control of complex systems (Berry and Broadbent 1984), such experiments require their subjects to make certain decisions about input variables in order to control output variables. An additional element is the constant feedback they receive on the output they generate. The best-known dynamic control task is the sugarfactory experiment (Berry and Broadbent 1984). In the task, the testees are asked to play the role of the manager in a small sugar producing factory. Their task is to reach and maintain a certain production output by manipulating the number of employees. As Berry and Diennes (1993: 21) observe when commenting on the results of the sugarfactory experiment vis à vis accompanying treatment offered to the testees, “practice significantly improved ability to control these tasks but had no effect on the ability to answer post-task written questions ... in contrast, detailed verbal instruction on how to reach and maintain the target value significantly improved the ability to answer questions but had no effect on control performance”. In their comparison of the three paradigms, Diennes and Berry (1997) note that there indeed was learning in all of the experiments as the participants’ scores on subsequent knowledge tests were above chance. At the same time, the learning was definitely implicit as the testees were “able to choose the correct reaction while in a task, but [were] later unable to recall the key characteristics that controlled the behaviour” (Broadbent 1991: 128). In terms of thresholds, in all three types of experiments, as noted by Diennes and Berry (1997), the resulting knowledge was below the subjective level. In the AGL tasks subjects concentrated on individual items within the letter strings rather than on rules motivating the accuracy of a given “word order”, and their level of metaknowledge, as shown on a subsequent test, was low. In sequence learning tasks subjects were surprised when informed that there actually was some regularity behind the stimuli. Finally, in the sugarfactory experiment, the testees, unable to specify how they exercised control over the system, claimed they were guessing. Furthermore, it is important to point out, following Diennes and Berry’s discussion, that the knowledge that resulted from the three types of experiments was far from flexible: it proved to be robust but not transferable to new tasks. In conclusion, as Diennes and Berry (1997: 18) put it, all of the studies “provided evidence that subjects could learn to perform well in a task without being able to freely report what they had learned or why they had made the right decision”.

108

Chapter Two

A question that may be asked at this point is: if testees on implicit learning experiments do not form verbalisable representations of what they are exposed to, what is it that they actually learn? The attempt to answer this question will be based on Druham and Mathews (1989), Mathews et al. (1989) for the AGL experiments and Broadbent et al. (1986) as well as Berry and Diennes (1993) for the CCS experiments. As for the AGL experiments, Druham and Mathews propose four models to explain what is actually learned: the classifier systems, the competitive chunking as well as the exemplar and the connectionist models. Classifier systems are lists of implicit, non-verbalisable rules used by AGL experiment participants on tests. The nature and workings of such systems can be explained based on the THIYOS (The Ideal Yoked Subject) experiment described in Druham and Mathews (1989). Their original testees on an AGL task, having learned to classify letter strings – which amounts to having formed a classifier system – were asked to give instructions to yoke (=new) subjects. Druham and Mathews (1989) note that such training was observed to bring the following two results: (i) on a subsequent test yoke subjects scored above chance (=they have learned the AG) but were observably worse than the original testees; (ii) the sets of instructions from different original testees did not converge on a common set of rules. Druham and Mathews (1989) explain the outcomes by claiming that, as a result of the offered instruction, the yoke subjects were able to construct their own classifier systems; that is why they learned the artificial grammar of the strings and scored above chance. However, as demonstrated by result (i), their classifier systems were not perfect as they lacked the missing knowledge “that the original [testees] did not verbalise” (1989: 82). Result (ii), in turn, is indicative of “a learning process that does not converge on the same knowledge base for different subjects, who obviously have varying sets of implicitly known classifiers” (1989: 82). Druham and Mathew (1989) explain results (i) and (ii) stating that classifier systems are a generic learning mechanism involving a large number of rules, or classifiers, which post messages on a message list (thought of as working memory). The original testees, obviously unable to consciously process and, subsequently, verbalise all of the rules, selected rules – at random but correctly – from the message list implicitly existing in their working memories and passed them on to the yoke subjects. The yoke subjects, in turn, used these rules to construct their own classifier systems. To understand the mechanism of classifier system construction in yokes, let us follow Druham and Mathews (1989) in their analysis of the THIYOS experiment in the context of a typical experimental AGL task. Such a task requires noticing a condition, which will be some aspect of a string (e.g. does it start with MTV?), and performing an action, such as indicating whether or not this condition is consistent with the rules of this particular artificial grammar. Noticing of this kind leads to posting a message and the message can be simultaneously

Language instruction: Theoretical underpinnings …

109

motivated by several classifiers. Whether or not a certain message is posted is a probabilistic function of the strength of a certain classifier. “Different classifiers can be given different priorities by tuning their strengths” (Druham and Mathews 1989: 83) such tuning being carried out on the basis of the general usefulness of a classifier. Classifiers “pay” for being used by reducing their strength; yet, if a classifier contributes to a successful decision, its strength increases. Now, if we relate all this to the THIYOS experiment, we can note that the strength of the classifiers depended on the phase of the experiment: the rules given by the original testees to yokes constituted classifiers of equal strength for these yokes: they had to be taken at face value; yet the resulting implicit rule formation in yokes was success and failure driven, which means that the classifiers used successfully increased their strengths while the strengths of those which did not guarantee a desired outcome reduced. Competitive chunking, another model which explains what is learned in AGL experiments, is based on the assumption that perception and memory are more or less automatic processes of chunking. As Druham and Mathews (1989) put it, “if an object can be seen as a single chunk, it will be perceived as maximally familiar” (85) in the form of such a chunk; and once encoded as a single chunk, the item is memorised in such a way. When, in the test phase of an AGL experiment, the testees are presented with a different string containing the same chunks, they find memorisation of such sequences easier. In other words, the testees feel the more conversant with new strings the more familiar chunks the strings contain; and the more familiar they appear, the more likely they are to be classified as grammatical. The competition, as indicated in the name of the model, occurs between familiar and unfamiliar chunks and is won by the former. A similar non-analytic concept formation and reliance on memory can be observed in the two remaining models proposed in Druham and Mathews (1989): the exemplar model and the connectionist model. The former shows the beneficial effect of reasoning by analogy. An example of such input processing comes from an experiment by Brooks (1978 cited in Druham and Mathews 1989), in which subjects were asked to learn strings of words as in a typical AGL task. What made their performance significantly better was when the strings in question were paired with actually existing English words. As for the connectionist model, in turn, it “attempts to model human performance according to patterns of activation across a number of simple computational elements or units connected by weights” (Druham and Mathews 1989: 90) forming two common architectures: a pattern associator and an autoassociator. In the light of the connectionist model, as proposed by Druham and Mathews, learning is based on a multi-layer pattern associator3 whose role is to perform paired associate learning. An autoassociator, ––––––––– 3

Input layer, output layer and the hidden layer in-between

110

Chapter Two

in turn, tries to associate every stimulus with itself. If the output patterns of the pattern associator are always identical to the input pattern, then such an associator is an autoassociator. As for the workings of the connectionist model, a stimulus presented to a pattern associator produces a pattern of activation (the input activation) across the units at time t. At time t+1 the autoassociator produces a response: an output activation for each unit based on the weighted sum of the input activations of other units. The aim of the autoassociator is to produce output activation equal to the input activation (Druham and Mathews 1989: 90)

In artificial grammar learning the task of the autoassociator is to establish the predictability of each letter in a string from other letters in this string. Complicated as it sounds, the latter model is effectively clarified based on the operation of the neural networks of artificial intelligence programmes. In such networks neurons are grouped into three main types of layers: input, hidden, and output (Figure 6). Figure 6: Neuronal network; input, hidden and output layers input layer

hidden layer

output layer

input#1 input#2 input#3 ... input#n

The input layer receives input data from the external environment, and passes it to the hidden layer(s). The hidden layer neurons are where the information is processed and sent to the output layer, where the output is compared with the required output and the network error is computed. Typically, the information about the error flows backwards through the network and the values of connection weights between the neurons are adjusted using the error term. The process, called backpropagation, is repeated in the network for the number of times that is necessary to achieve the output closest to the desired (actual) one. Finally, the network output is presented to the user. Neuronal network learning is basically the process by which the system arrives at the values of connection weights between neurons. Yet, according to Bartkowski (2001), the most fascinating characteristic of NNs is that they are not programmed like conventional software, but are literally trained to produce the desired effects. As a result, NNs are superior to classical algorithms (on the basis of which computer programmes

Language instruction: Theoretical underpinnings …

111

typically work) because they are capable of analyzing incomplete, “noisy” data, they deal with problems with no clear-cut solution and, most importantly, they learn on the basis of previous experience. The final characteristic makes the NN operation mode in particular and the connectionist approach in general applicable in AGL experiments. The incoming data are “noisy” in the sense that there are no explicit verbalisable rules but sets of classifiers, to use a previously introduced term. Success or failure strengthens the weights on some cues and weakens the weights on others, working as backpropagation mechanisms. The final result is a set of rules learned in a way that is effective but not verbalisable, as the system works through simple associative patterns and is not based on any conscious processes. The final statement makes the neural network comparison even more legitimate, as artificial intelligence works without consciousness. What, in turn, is actually learned in the experiments involving control of complex systems (CCS) amounts to a kind of “the look-up table” (Broadbent et al. 1986), in which successful actions are stored together with associated situations in which they were taken. Such action/situation pairings are called episodes. On a CCS test there will obviously be competition between a number of such stored episodes. Diennes proposes a connectionist explanation, analogous with the connectionist model presented for artificial grammar learning. Such connectionist learning of rules inspired by the reinforcement learning procedure amounts to “acquiring control over a system with unknown dynamics when the only feedback is the extent to which outcome is favourable” (Diennes 1993: 104). In other words, success increases the weight of a certain episode while failure decreases its competitive power. In this way, success or failure function as adaptive elements of a given episode (Diennes 1993: 104), which is understood as the output layer of the connectionist network. However, if we assume that the actions are stored together with the episodic context of situations remembered in connection with these actions, learning implies “the network [being able] to perform the right action in the right context and so needs to develop large weights between the corresponding input units and the adaptive elements” (Diennes 1993: 104). Finally, one remaining issue needs to be addressed. Throughout the description of knowledge resulting from implicit learning situations, terms such as rules and chunks/exemplars have been used. The question that arises at this point is whether the use of both is actually legitimate in the light of Schmidt’s (1995) claim – see Chapter 1 – that implicitly induced knowledge of rules does not exist and the only result of implicit learning may be a connectionist string of associations. In fact, implicit learning literature contains ample evidence overthrowing Schmidt’s position. Stanley et al. (1989; cf. also Pinker 1999) suggest that people draw on two separate but interacting knowledge structures: one based on memory of past experiences (close analogies; Schmidt’s connectionist associations) and the other

112

Chapter Two

based on the current mental model of the performed task. Implicit rules that control response selection are believed to be derived from both structures. The first structure has more control over response selection (being more specific) and the second structure has more control over verbal reporting as it is more accessible. This position is endorsed by a number of earlier and later experiments that allowed their authors to assume the following positions (following Underwood and Bright 1996): 1. The abstractionist (rule-based) position. Reber and Lewis (1977) showed that the knowledge resulting from implicit learning was not limited to the surface features of input. Their subjects, while on AGL tasks, successfully solved anagrams simultaneously demonstrating sensitivity to bigrams and trigrams, sensitivity which did not correlate with the actual frequency of the two letter combinations and implies some – limited, as argued by Berry and Diennes (1993) but noticeable – degree of schematisation (=rule-oriented processing) of input; 2. The analogy-based position. Brooks (1978: 208), commenting on the results of Reber and Lewis’s AGL experiments, states that the knowledge of the testees is “a looser confederation of special cases in which our knowledge of the general is often overridden by our knowledge of the particular” (cf. also Vokey and Brooks 1992). He, however, accepts the possibility of implicit learning resulting in two different kinds of concept formation: analytic and non-analytic. The former employs rule induction and logic to provide abstract generalisation; the latter deals with instances of importance to the individual. He claims that Reber’s results are open to both types of interpretation. As Underwood and Bright (1996: 23) observe, many experiments show that “grammaticality and similarity are confounded, thereby making the determination between the two accounts impossible”. This leads to the final proposal of 3. Abstract analogies – general vs. particular. As indicated in point (1), Reber’s (1977) argument that knowledge resulting from implicit learning is abstract rested on two claims: (i) subjects transfer efficiently from one letter set to another (cf. anagrams to bigrams and trigrams); (ii) knowledge of bigrams and trigrams is deeper than frequency of occurrence detection. These claims were disputed by Brooks and Vokey (1992) and Perruchet et al. (1992), who are more inclined towards analogies but divide them into general analogies (when two strings of letters are similar in that they start and finish with the same letter) and particular (when two identical strings reoccur). Categorisation, they claim, even when occurring throughout a range of materials, is carried out in relation to specific instances. Categorisation, however, as they understand it (Brooks and Vokey 1992; Perruchet et al. 1992), implies something more than mere associations and takes us from the level of token to the level of type; from the level of instantiation to the level of schema.

Language instruction: Theoretical underpinnings …

113

1.1.3. Implicit learning and educating intuition Implicit learning, already defined as unintentional, unconscious and leading to outcomes that are non-verbalisable inevitably elicits one more adjective: intuitive. The association is reinforced by the way intuition is defined by Hammond (1996): he sees it as “a cognitive process that somehow produces an answer, solution or idea without a conscious, logically defensible step-by-step process” (1996: 90), the wording of which brings out Bruner’s intuiting conceptualised as “the intellectual technique of arriving and plausible but tentative formulations without going through the analytical steps by which such formulations would be found to be valid or invalid conclusions” (1960/2003: 13). Hammond’s and Bruner’s definitions derive from Brunswik’s view on perception, in the light of which “the organism achieves a percept by attending to and taking into account various cues that indicate the nature of what is being seen” (Brunswik 1956 cited in Hogarth 2001: 7), the cue weighing being highly reminiscent of implicit learning, which does not seem to be accidental. In turn, “educating intuition”, used in the present subheading, is a concept borrowed from a book by Hogarth (2001), and this conceptual loan is meant to indicate that intuition – like the implicit knowledge described in the previous two sections – is a product of experience (learning included), and not a mysterious sixth-sense-type capacity. Consequently, the present section will try to demonstrate what intuition is and how it can be educated. According to Hogarth (2001), intuition-motivated actions are based on: general facts; past experiences; rough calculation/estimation; identifying the genus/type/category; and something unexplainable. Dissatisfied with the attempts to form one concise definition of intuition, Hogarth tries to define the phenomenon by its process, content and correlates. As for intuitive processes, Hogarth sees them as covert, unconscious, not logically defensible, automatic and dependent on different contributing cues which are weighted. As for content, intuition is based on a stock of knowledge which may be expert knowledge including our ways of interpreting the world which Hogarth calls cultural capital. On the basis of such knowledge we make prognoses for the future which are imprecise and usually approximate. Finally, the correlates of intuition, as Hogarth sees it, are speed and confidence: intuition is instantaneous (though some people may experience a slow build-up of the feeling) and carry a feeling of an amount of certitude. The definition at which Hogarth finally arrives is as follows: “the essence of intuition or intuitive processes is that they are reached with little apparent effort, are typically without conscious awareness. They involve little or no conscious deliberation” (Hogarth (2001: 14). In the light of it, the alreadysignalled parallels between such processes and implicit learning are hard to escape; so is the conclusion that intuition can be subject to such learning. Hogarth’s (2001) claim that intuition can be educated is based on 5 key ideas. Primarily, a lot of learning is automatic and automatic processes, like intuitive

114

Chapter Two

processes, are effortless. Secondly, a large proportion of learning comes from experience and is based on two principles: first, associations and contingencies (=things that occur together) are noticed; then rewards and punishments help people remember some associations better; intuition is motivated by similar factors. In addition to the above, there are two systems for learning and doing – tacit (including intuition) and deliberate; intuition (experience) and analysis (intellect) can both be involved in problem solving (cf. the discussion on dualprocessing in Chapter 1); what is more, if deliberate system actions are repeated over time, they can move into the domain of the tacit system. What is also true is the fact that intuition is related to expertise which means that both are domainspecific. Finally, scientific methods are made intuitive through experiential learning. The final assertion is very Brunerian (Bruner 1960/2003) in its approach: “[t]he good intuitor may be born with something special, but his effectiveness rests upon a solid knowledge of the subject, a familiarity that gives intuition something to work with (Bruner 1960/2003: 56-57) Consequently, acquiring intuition will amount to learning from experience and this, in turn, is characterised by Hogarth (2001) very much along the lines of implicit learning in complex problem solving (cf. Section 1.1.2): as making mental connections between things that happen together as well as reinforcing these connections in memory. As for the former process, Hogarth (2001) claims that seeing connections is critical to learning from experience and that such connections may range from simple, binary, to fairly complex, systematic. When it comes to reinforcement, its effectiveness depends on genetic predisposition, motivation and stimulus frequency. The information about the frequency of events is stored – Hogarth ascribes it to a kind of built-in counter in the mind – and encoded automatically4. Moreover, human beings possess the ability to automatically classify and maintain tallies of different kinds of experience. This knowledge is not verbalisable and, consequently, can be seen as implicit. In addition to the above, Hogarth (2001) points to the importance of memory processes in educating intuition. Such processes, crucial to intuitive processing, include recall and recognition. The former is an attempt to find or identify a particular piece of information in the long term memory (LTM). It is important to point out that, as Hogarth puts it, recall works by reconstructions: what is kept in LTM in the skeletal form is elaborated on through the use of numerous cues coming from the network of possible connections in a given domain of knowledge; as a result experts find reconstruction easier; experts also, as argued in Vincent and Wang (1998), observe the constraints that limit the complexity of what one is trying to remember. Recognition, on the other hand, is a process ––––––––– 4

Cf. N.Ellis’s (2002) implicit tallying hypothesis and Harrington and Dennis’s (2002: 261) concept of language learner as an intuitive statistician presented later in this chapter

Language instruction: Theoretical underpinnings …

115

involving little conscious attention or effort. It consists of matching an external stimulus and an internal representation. As Hogarth (2001) points out, judgments of similarity facilitate recognition because the power of analogies depends on the strength of the similarity. Consequently, recognition is enhanced by expertise as expertise depends on understanding redundancies and assuming the missing elements by default. In conclusion, Hogarth (2001: 138) states: intuition may be influenced by certain predispositions but it is essentially the result of experience. In many ways, the organism can be thought of as a learning machine that codifies many associations and their frequencies and can access this information when newly encountered stimuli trigger connections.

Summing up, we can say that implicit learning is the one that is generally defined as unintentional, unconscious and resulting in non-verbalisable knowledge which is represented by instantiations as well as schemata, whose degree of abstractness may differ but is a fact. The judgments that testees make on tests that are part of implicit learning experiments are fast, effortless and devoid of conscious deliberation and, as such, can be referred to as intuitive. The concept of intuition has been introduced on purpose. In the context of language learning there is frequent mention of the native-speaker-like intuition that advanced users have about the language they are learning. The position assumed in this work is that the intuition is in fact the implicit knowledge of the language which is motivated by expertise which, in turn, is the result of ample exposure to input in native and non-native speakers of the language as well as automatisation/proceduralisation of previously declarative linguistic knowledge in the case of non-native speakers. Both, the role of input and the role of automatisation are discussed later in this book. 1.2. Explicit learning While implicit learning is intuitive, effortless, unconscious, and implicitly acquired knowledge is not subject to verbal report, explicit learning, by disanalogy, is laborious, memory-based, conscious, and explicit knowledge is reportable. The present section develops the definition of the explicit, basing on the said disanalogy with implicit learning. Then it goes on to look at different explicit learning models – with special regard to skill learning – and modes: inductive (experiential) and deductive (direct; teacher-fronted). 1.2.1. Defining the explicit. Defining the explicit will be definitely shorter than the comparable fragment of Section 1.1.1, which was devoted to the explication of the term implicit, because explicit learning and explicit knowledge are less controversial than their implicit counterparts.

116

Chapter Two

Treating as a point of departure the definition of implicit learning arrived at at the end of the previous section, we can say that explicit learning is conscious, often intentional; the resulting knowledge is overtly represented and controlled in the sense of being verbalisable as well as analysable, and its outcomes are easy to abstract. This last feature, the symbolicity of explicitly learned knowledge – as opposed to the connectionist representations which are normally (cf. Section 1.1.2 for arguments to the opposite) not rule-like – is, according to Redington and Ronald (1999), one of the defining characteristics of the explicit; another one is discreteness of both learning steps and resulting representations of knowledge. As a consequence, when we learn something explicitly, we are able to specify, in a rule-like way, why we perform certain actions in a specific way; we can also identify the individual steps taken within a particular activity algorithm as well as apply the existing abstract schemata to novel tasks. When it comes to being able to articulate what one knows, explicit knowledge, as Ellis (1994: 355) puts it, “is not the same as ‘metalingual knowledge’ (knowing the special terminology for labelling ... concepts), although it is often developed hand in hand with such knowledge”, especially in instructed explicit learning, because metalingual expertise is usually the result of explicit instruction offered by a qualified trainer who knows and uses appropriate specialist terminology. In literature, the term explicit learning is very often used in relation to skill learning. Therefore, the present section will address this theoretical stance at some length. However, before we proceed to describe it, it needs to be pointed out that skill learning as such is addressed by both behavioural and cognitive psychology. The former discusses three theories of skill learning: classical conditioning, operant conditioning and vicarious learning also called modelling or observational learning. The first two models are not going to be addressed in this section, as their stimulus-stimulus and stimulus-response interfaces by nature exclude – or ignore – any cognitive explicit, verbalisable component of the learning process. As for modelling, its role in explicit learning will be mentioned in Section 1.2.2, in the part devoted to experiential learning. On the whole, however, modelling is best accommodated within the connectionist model of learning, which – as described in section 1.1.2 – belongs to the implicit learning paradigm. In turn, skill learning in cognitive understanding is sequential and involves three stages: declarative representation, proceduralisation and automaticity. One of the best-known cognitive models of skill learning is Anderson’s ACT-R model discussed below. ACT5 (Anderson 1980, 1982, 1983) and its later ACT-R version (Anderson 1993; Anderson et al. 1997) is “a general system modeling a wide range of higher ––––––––– 5

The theory is chosen for its considerable popularity but also for the reasons mentioned by O’Malley and Chamot (1990: 19): it incorporates numerous concepts form prevailing notions of cognitive processing; it accommodates learning and performance as well as factual knowledge and skills; and it has been continually updated, expended and revised.

Language instruction: Theoretical underpinnings …

117

level cognitive processes” (1997: 439). As such, it offers an insight into human memory and the learning of cognitive skills, one of which is language learning (to be discussed in relation to ACT/ACT-R later in this chapter). In the said model there are two types of knowledge: declarative, related to facts we know consciously and can retrieve from memory (unless we are constrained by the forgetting process); and procedural, displayed by behaviours we know how to do, that is unconscious. “Procedural knowledge,” as Anderson et al. (1997: 441) put it, “specifies how to bring declarative knowledge to bear on solving problems”. The operation of the ACT model is diagrammed in Figure 7. We will look at the process trying to establish the role of input (the factual knowledge) and output (task execution) as well as to examine the subsequent processes which lead to the restructuring of the internal representations of knowledge in the learner. Figure 7. The operation of the ACT model (adapted from Anderson 1990) learning

DECLARATIVE MEMORY retrieval

storing

PROCEDURAL MEMORY matching

execution

WORKING MEMORY

The arrows in the diagram show information flow in the system. When a certain task enters the working memory, declarative knowledge chunks (we assume that they are already instilled in the learner’s mind via some kind of instruction) that are suitable for solving this particular problem are retrieved from the declarative memory. These chunks may include explicitly remembered knowledge as well as past procedures used in dealing with similar tasks. Another option is, as argued by Bargh and Bandollar (1996; cf. Chapter 1), that the top-node of the retrieved chunk is declarative in character and so are some of the other nodes; yet the inner structure of the chunk will accommodate processes and strategies of procedural nature. The declarative and the procedural chunks are matched with the task requirements and, if they are suitable, the task is executed. As a result of this output, certain procedures get automatised or further automatised. In such a form, they are stored in memory, ready for another retrieval and matching to task, the process becoming more fluent as it is repeated. This is how we learn skills.

118

Chapter Two

In the light of the ACT/ACT-R model, the declarative knowledge is, generally speaking,6 primary, and the learner proceeds from state zero through knowing THAT to the performance of a particular skill (knowing HOW) in three stages (Anderson 1983, 1985): the cognitive stage, the associative stage and the autonomous stage. The cognitive stage may involve: instruction; observing an expert performer in task execution; or discovery techniques. The associative stage involves skill proficiency development – in the course of cycles following the pattern diagrammed in Figure 7 – which involve compilation of knowledge and its gradual proceduralisation, and result mainly in the efficiency with which the declarative knowledge is retrieved and matched to task. Finally, in the autonomous stage, the task is executed automatically, fluently and effortlessly in the sense that conscious processing is minimal or non-existent. The performance becomes fine-tuned in the sense that error will not constrain task execution. The proposal which explains – in more detailed stages – the learner’s proceeding from the declarative knowledge to the autonomous task performance is the one of coaching for competence (Dreyfus and Dreyfus 1986). The said stages include: – Novice – a novice performer uses objective facts and applies them utilising oversimplified rules in a methodical way – Advanced beginner – such a performer is more familiar with patterns; shows beginnings of situational judgment – Competence – a competent performer knows solutions to common problems; some decisions start to be intuitive – Proficiency – a proficient performer is one with the activity; decisions are rapid and fluid – a subconscious habit – yet still analytical – Expertise – an expert performer just knows what to do intuitively Relating the above to ACT/ACT-R, we can say that a novice performer will be still, to a considerable degree, at the cognitive stage; an advanced beginner and a competent performer will exhibit different degrees of compilation/proceduralisation of their declarative knowledge; and proficient and expert performers will show varying extent of automaticity when on task. All this is in agreement with the considerations of controlled vs. automated actions presented in the discussion of the selectivity and consciousness of attention in Chapter 1. Looking at skill learning from a pedagogical perspective, Bernstein et al. (2007) state that two important components of the learning of skills are practice (acknowledged in ACT) and feedback. As for practice, it should continue beyond satisfactory performance towards full automatisation of action. The feedback ––––––––– 6

As Anderson and Finchman write in one work (1994: 1323), “It is too strong to argue that procedural knowledge can never be acquired without a declarative representation. ... None the less, the research does indicate that this is the major avenue for acquisition of procedural knowledge”.

Language instruction: Theoretical underpinnings …

119

component, in turn, is in charge of confirming whether the efforts of skill learning lead to a desirable. Bernstein et al. (2007) claim that feedback should be provided soon enough to be effective, but not too quickly, as it may interfere with the skill being practised. As for the forms the explicit learning of skills may take, they were already introduced in the description of the cognitive stage of the ACT/ACT-R model and involve: instruction, modelling and discovery learning. The most prototypical of them, according to DeKeyser and Juffs (2005) is the first: deductive and top-down teaching, in the course of which declarative knowledge is instilled in the learners and subsequently applied in more or less controlled task execution. At the same time, it is also possible – and increasingly popular in contemporary pedagogy – for explicit learning to be triggered by different forms of consciousness-raising, by input and output enhancement or, as mentioned in the description of the cognitive stage of ACT, generated by the learner on their own, in the course of such discovery learning. In this latter body of cases explicit learning will proceed in a bottom-up fashion as the learner, instead of being presented with the knowledge, will be encouraged/made to discover facts and abstract schemata independently of the teacher. The bottom-up vs. top-down distinction is represented by two explicit learning paradigms: experiential learning (the constructivist perspective) and direct instruction (the objectivist perspective). They are addressed in Section 1.2.2, which discusses the two models on two levels – cognitive and metacognitive – the latter case involving a look at learning strategies and learning strategy training. 1.2.2. Explicit learning paradigms and models Experiential learning is a form of education in which the emphasis is on the subjective experiences of learners. Dewey (1938) describes experience as the basis for all genuine learning; Knowles (1990) emphasises its importance by claiming that the teacher and the textbook are secondary. As I argue elsewhere (Turula 2010: 277), experiential learning can be understood in two ways: – what one has already learned (the experienced learner), and – what one is feeling or taking part in at the moment (the experiencing learner). In relation to the first understanding, recognizing the learner as an experienced person has a number of practical implications in the classroom (Turula 2010: 277-278): – The teacher can build on the knowledge, understanding and skills learners already possess. Consequently, the experiential classroom will be a moraleboosting environment as it is the place where everybody can feel important because (s)he is a recognised expert in some field of knowledge.

120

Chapter Two

– If the teacher makes the students realise, refer to and base on their earlier experience and already possessed knowledge, (s)he not only makes room for associative learning and thus facilitates the acquisition of new material, but also humanises language increasing every individual’s motivation and involvement. It is because, as Moskowitz (1978) emphasises, effective education is that which combines the subject matter to be learned with the lives of the students – their feelings, emotions and experience. Involving such personal experience gives life, texture and subjective personal meaning to abstract concepts and also provides a concrete reference point for testing the validity of ideas created in the learning process (Nunan 1993). – The experiential classroom is a place where metacognitive instruction (to be discussed later in this section), understood mainly as strategy training, can be enriched by the learner’s own educational biography, which helps every student improve his/her self-awareness as a learner. As a result of such educational biography insights, students can identify their strengths and weaknesses, their learning preferences and styles. In addition to the students’ subjective experiences, attitudes and feelings about themselves as learners, experiential learning is also what Weils and McGill (1989) call meaningful discovery, a combination of action, autonomous and integrative learning. The experiencing learner – in the second understanding of experiential learning – is not a passive recipient of knowledge, and teaching cannot mean lecturing to an inactive audience. Students need to experience – to participate, to try out, to experiment, to discover – to be able to learn. The anatomy of such experiential learning will be cyclical and will consist of four stages (Kohonen 1993): 1. concrete experience in or outside the language classroom. In this context Nunan (1998) emphasises the importance of task-based learning. At this stage feeling is as important as reason; 2. abstract conceptualisation, which is a logical and systematic activity consisting in neating and precising the experience one has had; 3. reflective observation, which amounts to understanding the meaning of ideas, thoughts, feelings and judgement; 4. active experimentation – practical application of internalised ideas and learned skills that obviously involves risk taking. It can be speculated here that while stages 2-4 are definitely learner-centred, stage 1 – concrete experience – though definitely subjective – may involve all kinds of exposure, from taking part in a certain activity to listening to offered instruction or observing other people and trying to model one’s behaviour on their routines. This is

Language instruction: Theoretical underpinnings …

121

where experiential learning can intersect with direct instruction (to be discussed later in this section) and observational/vicarious learning, which, according to Bandura (1977), involves attention to the observed model, retention of the details of his/her behaviour and subsequent reproduction of these details in one’s own performance. On most occasions, modelling will be an implicit form of learning (cf. earlier in this chapter); if, however, the subsequent stages of experiential learning – abstract conceptualisation, reflective observation and active experimentation follow, we can say that such modelling is a conscious, explicit form of learning7. Learning from direct instruction, in turn, is a pedagogical option based on overt teaching, usually associated with traditional schooling involving lectures and demonstrations rather than discovery learning or discussions. Such top-down, teacher-fronted education is objectivist/behaviourist in nature, as it assumes that there is an objective body of knowledge to be transformed from the teacher to the learner. This proposal is definitely in disagreement with the earlier-presented constructivist view that learning is personal, unique and should be individualised. At the same time, though, there is an agreement among practitioners and researchers that such an explicit, top-down teaching mode is necessary as purely inductive, discovery-based approaches are not fully successful. Therefore deductive instruction is mandatory and appears more effective especially, as argued by Rosenshine (2000), for giving the learner knowledge about “a body of content or well-defined skills” (Rosenshine 2000: 403), on condition that this body of knowledge is well-structured, the concepts to be taught are clearcut, and the skills to be acquired follow explicit steps. Direct instruction is particularly suited in such a case because it is systematic, small-step-based and involves pausing to check the learner’s understanding. All this is deemed effective vis à vis what we know about information processing and the limits of human attention and the capacity of working memory (cf. Chapter 1). Explicit learning in the direct instruction context will go through five stages (Joyce et al. 2000): – orientation, in which the learner listens to the teacher clarifying objectives and procedures; connects to previous lessons and gets his background knowledge activated; – presentation, when the learner listens to the teacher explaining, demonstrating and giving examples of concepts, skills and strategies; gives examples if prompted by the teacher; and asks for clarification or concept development; ––––––––– 7

Modelling – or teaching by analogy – is also a form of explicit direct instruction, utilised, among others, in teaching mathematics and surface (non-transformational) grammar of a language.

122

Chapter Two

– highly structured practice which requires the learner to practice with the teacher’s support; – guided practice, during which the learner is offered close guidance from the teacher; and – independent practice, in which the learner practices on his/her own. As can be seen in the above-quoted stages of direct instruction, a lot of emphasis is put on the engagement of the learner, which is a partial compensation for the objectivist stance of this approach and a way of making the treatment at least partly learner-centred. Other amendments to the original model, such as active teaching (Good 1983) or generative teaching (Wittrock, 1990) involve broadening the concept of teacher-fronted schooling by incorporating learner participation or combining direct instruction with discovery techniques. In active teaching, the instructors whose students learn effectively (Vockell 2006: http://education. calumet.purdue.edu/Vockell/EdPsyBook/index.html): – are active in presenting concepts, providing appropriate engagement and practice activities, and monitoring those activities carefully; – actively look for ways to determine whether their students understand what they are doing; and – actively assume partial responsibility for their students' learning and are prepared to reteach when it is necessary to do so. In turn, in generative learning, the primary objective of instruction offered by the teacher, as explained by Vockell (2006), is to help students generate relationships: construct/generate meaning in the course of class activities by building relations which are intra-domain (related to the studied content) or inter-domain (related to other areas of the learner’s knowledge). In their attempts to make direct instruction more learner-centred, Vockell and Schwartz (2009) go as far as suggesting how it can be incorporated in the fourpartite experiential learning cycle, specifying the role for teacher-fronted instruction at every phase of this model. They claim that in the application and exploration phases, direct teacher interventions such as asking questions or providing suggestions will be required if the learners do not ask the right questions themselves, make wrong observations or are sidetracked from the task; during the concept introduction phase, in turn, it will naturally be the teacher’s role to provide information which, as Vockell and Schwartz (2009) argue, needs to be structured but such a structuring does not need to be authoritarian. As the presented paradigm is a teaching – rather than learning – model, the person of the instructor and the routines (s)he implements are quite important. That is why concepts that are particularly applicable to direct instruction include:

Language instruction: Theoretical underpinnings …

123

– Mediation/Mediated Learning Experience (Vygotsky 1986, Feuerstein 1990), an idea that interacting with other people is important in learning. There is a special role for the teacher-mediator: their task is to select and shape learning experiences of learners and, eventually, prepare them for independent and cooperative education. This is because, as argued by Feuerstein (1990), our ability to learn is not determined solely by our genetic equipment; what is equally important is cognitive enhancement from the environment. The teacher is an important part of this environment. – Zone of proximal development (Vygotsky 1978), understood as the distance between the current stage of development enabling self-sufficient taskexecution and the potential stage of development as demonstrated in tasks executed under guidance or in collaboration with more skilled performers. Vygotsky claims that the learner is capable of learning skills or knowledge that are just beyond what they can do at the moment, the level of guidance (the teacher) and cooperation (more able peers) constituting the limit. – Scaffolding (Vygotsky 1978), which is a concept related to the zone of proximal development in the sense that the guide/collaborator needs to adjust the quantity and the quality of help (s)he offers. Such help, according to Bransford et al. (2000) may consist in: activation of the learner’s knowledge; personalization, simplification or modification of input; provision of a model; offering directions/instructions; helping the learner to notice the gap (=perceive the difference between his/her own production and the target standard); the know-how and tools (strategies and resources to be used in task completion); or reducing anxiety and increasing motivation – Dynamic assessment (Feuerstein et al. 1979) in the light of which the teacher should not measure the outcomes and skills of the learner but rather evaluate the learner’ ability vis à vis the task. Such assessment should imply entering into a dialogue with the learner. – Transfer of training (Holton 1996), a phenomenon related to the degree to which learning has the potential to influence individual performance of tasks. Considering it in relation to the phases of direct instruction, we can say that successful transfer of this instruction will enhance the learner’s independent practice. According to Holton’s model of transfer of training (Holton 1996: 17), whether or not the learner’s task performance is affected by instruction will be determined by three factors: 1) how motivated the learner is to transfer training; 2) the design of transfer; and 3) the climate for transfer. In relation to the former factor, based on the cognitive views on motivation – the Expectancy-Valence Paradigm (Cross 1981); goal theories (Houle 1981) – we can say that the learner will be motivated to transfer training when on task if: (s)he is convinced that the training outcomes are applicable to this task in the sense that they guarantee goal execution; as well as if (s)he has a strong feeling of competence: of being able to use the knowledge resulting from instruction for successful task execution. When it comes to transfer

124

Chapter Two

design, Holton (1996) notes that transfer of training will be successful if the training phase itself provides for applying the gained knowledge to real-life-like tasks. This demand can be linked to the earlier-discussed modifications on direct instruction which involve and activate the learner if we agree that task realism and personalisation are successful incentives. Finally, the climate for transfer of training will be related to the extent to which the instruction-based task performance is successfully accommodated within a larger context. All in all, discussing transfer of training in relation to direct, teacher-fronted instruction, we can argue that in the light of Holton’s model (Holton 1996), the teacher will facilitate transfer of training if (s)he motivates the learner, showing the knowledge-to-task as well as task-tocontext applicability; increases the learner’s self-esteem, task-related and general; and provides opportunities for real-life-like practice. In relation to above-discussed models, both experiential and direct, the term explicit learning, as noted by Rodrigo et al. (2002), will also refer to “conscious knowledge-acquisition process that occurs when people invoke active strategies to discover the rules or principles that underlie the task (e.g. hypothesis testing or metacognitive procedures)” (Rodrigo et al. 2002: 170). In the light of the above assertion, it stands to reason that we should analyse explicit learning, in its two forms described above (experiential and direct), on two levels: the cognitive plane of the task itself (described earlier in this section) as well as the meta level, within the information processing framework of learning, with special relation to learning strategies, addressed below. Discussing explicit learning in the context of learning strategies is based on a claim that the skill-learning proceeds more effectively if the agent: uses effective cognitive tricks or shortcuts enhancing memorization and retrieval; plans, manages, monitors the execution of the task; and is able to command their affective domain as well as benefit from the outside context. Such “behaviours and thoughts that a learner engages in during learning” (Weinstein and Mayer 1986: 315 cited in O’Malley and Chamot 1990: 17) are utilised in all three phases of skill learning, including the process of encoding the declarative knowledge. In such a case, as argued by O’Malley and Chamot (1990), learning strategies will be identified and applied at each of the four stages of this process: at the selection stage, when the learner chooses the information that is of interest to him/her, concentrates on it and enters it into the working (short-term) memory; in acquisition, which consists in information transfer from short-term to long-term memory; during construction, which involves relating the newly learned material to the existing database; and finally at the stage of integration, implying information transfer from the long-term memory to the working memory during a novel task execution. The role of learning strategies at each of the four stages is “to make explicit what otherwise may occur without the learner’s awareness or may occur inefficiently during the early stages of learning (O’Malley and Chamot 1990: 18).

Language instruction: Theoretical underpinnings …

125

In their classification of learning strategies, O’Malley and Chamot (1990: 44-45) distinguish three groups: metacognitive, cognitive and social/affective strategies. The metacognitive strategies include: selective attention to certain aspects of input or parts of the task; planning the organisation of the taskexecution process; monitoring task performance; and evaluating task execution on its completion. The cognitive strategies are: rehearsal or repetition; organisation, classification, categorisation, analysis, synthesis, elaboration and summarising of content; reasoning: deduction and inferring from context; and transfer of knowledge. Finally, the social/affective strategies incorporate: collaborative learning; questioning and eliciting information from others; and self-talk aimed at mentally controlling one’s states, including affective states such as anxiety or motivation fluctuations. Considering the above-listed strategic repertoire, it is important to remember that it is not the size of the quantity of the used strategies that draws the line between successful and unsuccessful learners; the difference is qualitative in nature. As Chamot (2005) points out, weak learners use strategies too but they do this ineffectively or select strategies which are inappropriate for a given assignment. What makes a good learner is the capacity of “matching strategies to the task they [are] working on whereas less successful language learners apparently do not have the metacognitive knowledge about the task requirements needed to select appropriate strategies” Chamot (2005: 116). Strategies are learned and taught explicitly. As signalled earlier in the discussion of explicit learning at the meta level, such strategy training can be experiential, based on discovery or result from direct instruction. In the former case, it will follow the four-partite cycle of experiential learning, with the learner observing a useful strategy – in his/her own or other people’s performance – conceptualising it and actively experimenting with it. Direct strategy instruction, in turn, will be carried out in three stages: – presentation of a wide range of learning strategies; – offering learners ample opportunities to experiment and try out the techniques in order to select the ones that comply with their learning style and type of intelligence; and finally – automatisation of acquired learning tactics following both the ACT/ACT-R cycle as well as the phases of direct learning presented earlier in this section. Summing up Section 1, it is important to point out that the description of implicit and explicit learning – as was mentioned at the introductory part to the present chapter – was aimed, among others, at delineating the framework for the implicit and explicit learning of languages and, eventually, at defining explicit and implicit focus-on-form. Summing up, we can say that implicit learning is

126

Chapter Two

direct, unconscious/f-conscious and results in knowledge which, even if rulebased, is not verbalisable, whereas in explicit learning declarative systematic knowledge, deductive or inductive in origin is the core of instruction, and is accessible to conscious processing and verbal report. Implicit and explicit learning of languages (Section 2) will be described based on the same definitional foundations. In turn, different forms of focus-on-form will be classified as implicit or explicit (Section 3) in relation to whether – in the course of such form-focused instruction – language as a system of rules is paid attention to and discussed (explicit FFI) or not (implicit FFI).

2. Implicit and explicit learning of languages If, as was summarised above, implicit learning is direct, unconscious/f-conscious and results in knowledge which, even if rule-based, is not verbalisable, whereas in explicit learning rules are the basis of instruction and the resulting knowledge is accessible to conscious processing and verbal report, it is hard to escape an observation that those who subscribe to implicit learning – or, preferably, acquisition – of any target language will perceive the process as similar to first language acquisition, which is believed to proceed without – and occasionally counter to – overt interventions; explicit learning of languages, in turn, will be seen as a laborious rule-based/rules-first process accommodated within general cognitive mechanisms such as those belonging to access awareness and working memory (cf. Chapter 1). Bearing the above in mind, the present section looks at the theoretical explications of these two types of language learning. Implicit language learning is explained through the approaches based on Universal Grammar (Ellis 1994; Cook 1994 among others); the Interlanguage Theory (Selinker 1972); the Monitor Model (Krashen 1981; 1982) and the Competition Model (Bates and MacWinney 1982; MacWinney 2002); as well as a more recent proposal: the APT model (Truscott and Sharwood-Smith 2004). Explicit language learning, in turn, is theoretically related to Input Processing (VanPatten 1990 and later works); output-based processing (Swain and 2004); and the cognitive theory of language learning with its concept of language learning strategies (following Rubin 1975; O’Malley and Chamot 1990; Oxford 1990, to mention the bestknown proposals). As most of these implicit and explicit learning approaches and models are discussed quite comprehensively in a fairly contemporary work by Pawlak (2006), the presentation to follow will be limited to their most important characteristics. The section will also, whenever appropriate, provide a general linguistic perspective on the discussed models.

Language instruction: Theoretical underpinnings …

127

2.1. Implicit language learning The present section describes the theoretical models informing implicit language learning in two subsections, each based on a different similarity between L1 and any other language acquisition. Some models, such as the Identity Hypothesis or the UG-based approaches to language learning see L1 and Ln acquisition as similar in the sense that both are motivated by the same language-specific mechanisms in the brain. As such, the proposals are discussed in relation to the generative model of language acquisition put forward by Chomsky (1965 and later works); they are called implicit deductive, following DeKeyser (2003)8. In turn, in the light of the Monitor Model as well as the cognitive interpretations of the Interlanguage Theory, the acquisition of Ln and the mother tongue are alike in the sense that neither of them benefits from instruction and both are motivated by interaction with comprehensible input. Input-based is also the third model to be discussed in this section – the Competition Model. That is why the Interlanguage Theory, the Monitor Model and the Competition Model are labelled implicit inductive and discussed in relation to such linguistic models as Halliday’s SystemicFunctional Grammar (Halliday 1973, 1985, 1994, and 2003) as well as Langacker’s usage-based model (1987, 2000). The final subsection looks at the APT model, which seems to be a middle-of-the-road proposal, accounting for the importance of input processing as well as addressing the role of languagespecific modules in this processing. 2.1.1. Deductive implicit models: the Identity Hypothesis and UG-based approaches The Identity Hypothesis – also called the L2=L1 hypothesis (Ellis 1994) – is based on a claim that the language acquisition device (LAD. Chomsky 1965; McNeill 1966), a language organ which enables native language acquisition, can be accessed in the process of learning any other language. If, following McNeill (1966), we accept that LAD is in charge of the following abilities: – the ability to tell the difference between speech sounds and other non-linguistic “noises”; – the ability to categorise incoming linguistic data; ––––––––– 8

DeKeyser (2003), in his four-partite model of relationships between the learning modes and the underlying pedagogical approaches, sums the above up in the following way: learning mode: deductive/explicit – traditional pedagogy; learning mode: deductive/implicit – UGbased approaches; learning mode: inductive/explicit – rule discovery mode; learning mode: inductive/implicit – input-based approaches

128

Chapter Two

– the ability to distinguish a certain possible linguistic system from those that are not possible; and – the ability to constantly evaluate and develop the simplest possible system on the basis of incoming language data, and if we subscribe to the Indentity Hypothesis, we can assume that all these capacities will be utilised in the course of the acquisition of any language other than the mother tongue. Considering the fact that these abilities are present in all human children regardless of their language background, it is claimed that LAD structures cannot be tied to any language but must be universal. As a result, language structures, or principles, which are innately specified, are believed to form an entity called Universal Grammar9. When language acquisition is triggered by the incoming primary linguistic data, the universal structures – or principles –mentioned above are parameterized. The above-introduced understanding of language learning is based on the concept of Principles and Parameters put forward as a part of Chomsky’s Government and Binding Theory (Chomsky 1981a and 1981b). The same idea underlies UG-based approaches to language learning and teaching. Quoting Cook (1994: 25), the knowledge of any language consists of “principles that do not vary from one person to another and parameter settings that vary according to a particular language a person knows”. To explain the concept Cook uses the metaphor of a video recorder which needs two elements to function: the machinery, the same for every item sold and the tuning function utilised by the user depending on the specific circumstances. The principles that are represented by the machinery part of the metaphor are the equipment – or the toolbox, to use Jackendoffian (2002) terminology – that every human mind is genetically fitted out with. All of the principles have different parametric options – for example the word order principle can be set into the free or fixed – into which principles are tuned as a result of the speaker being exposed to a particular language. It is important to point out that initially every parameter is set neutrally or has a default (unmarked) value (Cook 1994: 31). Consequently, we either get in gear, to use another metaphor, or – if necessary – change the setting to marked as a result of the said exposure. The language material that one is exposed to ––––––––– 9

The newest cognitive research into the nature of world languages (cf. Everett 2008; Evans and Levison 2009, among others) questions the very concept of Universal Grammar claiming that the list of language universals shared by all world languages is almost empty. The present work is not going to address the issue as the aim of this section is to describe language learning approaches subscribed to by different scholars and not to evaluate them. The only remark that can be made here – regardless of my own view regarding the applicability of the UG concept – is that universals may be best understood as attributes or features not all of which are equally necessary and sufficient but which are better/more or less representative depending on a given language. This is a position convergent with the prototype theory presented in extenso in Chapter 3.

Language instruction: Theoretical underpinnings …

129

contains ample evidence both positive – of how certain forms are used – and negative – what does not occur in the language; another form of negative evidence, correction, is not necessary as is claimed by the proponents of the approach headed by Chomsky himself. “[A]cquiring a language thus means selecting, among the options generated by the mind, those which match experience and discarding other options” (Chomsky 2002: 16). In the course of such language acquisition, to quote Mehler and Dupoux (1992; cited in Chomsky 2002: 16), we learn our native tongue by setting parameters and “forgetting” all the universal phonetic distinctions, syntactic properties, etc. that are not represented in our own language. If this is so, learning a language that is not one’s mother tongue does not involve learning principles, as they are universal and already present in the mind; instead, language learning will consist in resetting the parameters that differ between languages. Theoretically, to revisit Mehler and Dupoux’s (1992) maxim, this would mean: first remembering what had been forgotten and then forgetting – at least for the sake of the newly learned language – the parameter setting typical of L1. In reality the whole operation is extremely complicated if not impossible, as the effectiveness of the whole remembering-resetting-forgetting process will depend on how much access we still have to Universal Grammar. There is some disagreement among researchers in this area and a number of theoretical claims are advanced. The no-access position amounts to claiming that there is actually nothing to be remembered and, consequently, the second language is learned based on mental faculties other than UG. This position – clearly not UG-based – is presented in Clashen and Muysken (1986 cited in Cook 1994: 37), who claim that parameter resetting is not an option, especially in adults. In the indirect access model, Universal Grammar is accessed via L1. We can assume, going back to the concept of learning by forgetting, that even if unable to remember the long forgotten parameter options, the L2/FL learner still has f-conscious access to principles, which are universal and part of any language. Consequently, treating L1 parameter setting as a point of departure, the learner will be able to consciously reason their way out of the native setting towards the target setting, based on the disanalogy between the two languages. Finally, the direct access10 stance provides for language learning which is a replication of the process of mother tongue acquisition (for a more detailed discussion, see Cook 1994 and Ellis 1994). What is important pedagogically is the fact that the two access perspectives offer different views on the initial parameter setting: in the light of the direct model (Cook 1994), all of the initial parameter options are still available ––––––––– 10

The direct access stance is also part of the dual access model proposed by Felix (1985), in the light of which, in spite of the accessibility of the Universal Grammar, mature language learners rely, interchangingly, on their UGs and general cognitive mechanisms.

130

Chapter Two

to the learner; the indirect model, in turn, assumes every parameter will be in an already fixed position determined in the course of L1 acquisition. What is important in our discussion of UG-based approaches as a model underlying implicit learning of languages is what exactly is needed for the abovedescribed parameter resetting. The answer will actually depend on what access position we assume11. According to the direct-access view, all that is needed for the parameter setting mechanism to be triggered is exposure and the evidence – both positive and negative, as in the case of L1 – that it provides. The indirect-access position will require optimalisation of input (Cook 1994), some forms of which are going to be discussed as implicit focus-on-form options in Section 3. What, however, may also be needed, especially in the case of marked parameter values are, according to Cook (1994: 39-40), “habit-based structure drills, grammar explanation or controlled communication games as well as negative evidence including, potentially, correction. All of this, as Cook points out, has to be implemented based on what type of treatment is best for a particular learner. Therefore, as it seems, there are forms of UG informed pedagogy which can actually be classified as explicit. 2.1.2. Inductive-implicit models: the Interlanguage Theory, the Monitor Model and the Competition Model The term interlanguage is used in SLA theory in two ways: to refer to “the internal system a learner has constructed at a given point in time” or to “a series of interconnected systems that characterise the learner progress overtime” (Ellis 1994: 350). This system, as proposed by the Interlanguage Theory (Selinker 1972), amounts to “a separate mental grammar that [the learners] draw on in L2 performance”. This grammar is “[a] system of implicit knowledge that the learner develops and systematically amends over time” (Ellis 1994: 354). The fact that the said mental grammar is implicit is rather uncontroversial. In the light of the Interlanguage Theory put forward by Selinker (1972), knowledge of this interim language system is implicit in a number of ways. First of all, the course of interlanguage development is the mirror image of L1 acquisition which is, as mentioned on a number of occasions, f-conscious; moreover, many followers of the theory also suggest the same starting point for L1 and L2. Finally, restructuring or amending interlanguage as a system is frequently motivated by substratum – hence not conscious – transfer from the learner’s mother tongue. The question of what kind of implicit – deductive or inductive – is a more complicated issue in the context of interlanguage. If we examine it from the ––––––––– 11

If we adopt the no-access view, the UG-based approaches do not apply

Language instruction: Theoretical underpinnings …

131

perspective of the processes of interlanguage construction – Selinker (1972; cf. Ellis 1994: 351) proposes that they include: language transfer; transfer of training; strategies of second language learning; strategies of second language communication; overgeneralisation – we can say that different characteristics point to different learning modes. The generalisation – and frequent overgeneralisation – of the language material to which the learner is exposed makes the model inductive implicit; transfer of training and the use of language strategies can even point to the explicit mode of learning. Finally, L1 transfer will suggest a deductiveimplicit process of IL construction, in which the first language functions as the LAD for the target language. The deductive implicit mode is further endorsed by Adjemian’s observation about the generative character of the interlanguage system which is used to “generate novel utterances” (Adjemian 1976: 299). These considerations are a part of a broader debate about the nature of interlanguage as a system, the most important facet of which is the Chomskyan/cognitive argument about the initial hypothesis or the starting point of language construction: the innate knowledge or a general process of hypothesis testing (Ellis 1994: 353). The present work, subscribing to the cognitive view of language learning, chooses to the model in the inductive implicit section. Related to the Interlanguage Theory is Krashen’s Monitor Model (Krashen 1981, 1982) based on the assumption that L2, like the native tongue, is learned not as a result of tuition but in the course of exposure to input, on the condition that this input is comprehensible, i.e. satisfies the i+1 formula (it contains language slightly above the learner’s present level). Resulting from the above claim is the distinction of two types of knowledge – implicit and explicit – and the noninterface position between the two, in the light of which languages are acquired subconsciously, and not learned as a result of conscious attention to rules and their memorization. The learned knowledge can never become implicit; its role is limited to monitoring performance and potential eradication of error. Such an understanding of the Monitor Model is based on five hypotheses (Krashen 1982): 1. The acquisition/learning hypothesis, in the light of which acquisition is natural, and learning is a process of conscious rule development. Learning does not lead to acquisition. 2. The monitor hypothesis, according to which the conscious learning mentioned above can work as a monitor, checking and repairing the errors resulting from acquisition. 3. The natural order hypothesis, which claims that the acquisition of grammatical structures follows a certain predictable sequence regardless of the learner’s mother tongue. Errors that occur on the way to language competence are a manifestation of such a naturalistic developmental process. 4. The input hypothesis, in the light of which acquisition (but not learning) is possible via exposure to meaningful i+1 input. The input needs to be rough-

132

Chapter Two

tuned to the present level of the learner. The code takes on the form of foreigner talk, an equivalent to caretaker speech – the kind of language parents and other caregivers speak to a child acquiring their mother tongue. 5. The affective filter hypothesis, which treats a learner’s emotional states and attitudes as a filter that either passes or blocks the input needed for acquisition. Krashen and Terrell (1983) claim that while motivation and self-confidence make the filter fully permeable, anxiety works as an input-blocking factor. The third – and last to be discussed here – implicit inductive model of language learning is the Competition Model. Ellis characterises the model as a performance model which “seeks to account for the kind of knowledge that underlies real-time processing in real-world language behaviour” (Ellis 1994:373). The main concept is that of form-function mappings: the task of the learner is to discover what form(s) to use for the realisation of a certain function in L2 and what weights should be attached to individual forms in the performance of different functions (Bates and MacWinney 1982; MacWinney 2002). The task can be accomplished because exposed to input, the learner receives what are called cues – information about how the language works and what individual forms mean in four areas: word order, vocabulary, morphology and intonation. Every cue is examined by the learner for it reliability (the extent to which it always maps the same function), availability (how often it appears in the input) and conflict validity (whether it wins or loses in competitive environments). Consequently, we can imagine the following competition between cues referring to the formation of the past simple tense in English: Cue 1: add the –ed suffix to verb; result: work-worked; learn-learned; go-goed Cue 2: some English verbs are irregular, go is one of them; the result of competition between the two cues (obtained through multiple exposure to language input): workworked; learn-learned; go-went

The scenario that seems to have been confirmed by the educational experience on the part of many a teacher is that, with ample exposure, the –ed suffix will win the competition in the case of regular verbs, because the weight that it carries will be strengthened by its consistent mapping of the past time reference function (high reliability) for a majority (availability) of verbs which never take alternative, irregular forms (conflict validity). Irregular verbs such as go, in turn, will be equally available, as – in spite of being less numerous – they refer to basic everyday actions (availability). With time and exposure negative evidence concerning their –ed suffixation will be accumulated, increasing the conflict validity and reliability of the irregular forms, whose weights will grow. As a template for language acquisition, the Competition Model is nonmodular and non-specific (Geoff 2004) in the sense that the cognitive mechanisms

Language instruction: Theoretical underpinnings …

133

utilised in resolving cue competition are the ones used in any kind of learning. It stems from a number of theoretical bases (MacWinney 1997; cf. Doughty and Long 2003 and Geoff 2004), all of which confirm its implicit learning mode: lexical functionalism providing for the already-mentioned inseparability of form and function; connectionism, in the light of which cerebral processing is associative – activation- and inhibition-based – rather than symbolic; input-driven learning where task frequency – and not internal mechanisms – are the most important factor in resolving cue competitions; and finally, processing capacity enabling undisturbed online computation of input. Out of the three models presented in this section, the Interlanguage Theory and Krashen’s Monitor Model are often linked to the Chomskyan view of language. However, considering their usage-based character, they seem to be much more convergent with such theoretical proposals as Halliday’s Systemic-Functional Grammar (Halliday 1973, 1985, 1994, and 2003), in the light of which language learning is experiential and shows the dynamism of town planning; as well as Langacker’s usage-based model (1987, 2000), which emphasises the role of the authentic use of language as well as the language user and his/her mental processes. Equally-related to these two proposals is the input-based Competition Model, with its lexical functionalism and input-reliance. Both theories – Halliday’s and Langacker’s – of language as a system, language processing and language development are introduced below. They are discussed at some length, as they seem to be slightly neglected in discussions of implicit language learning. In his relatively early publication, Halliday (1973: 7) defines the functional approach to language as the one that means: [f]irst of all, investigating how language is used: trying to find out what are the purposes that language serves for us and how we are able to achieve these purposes. .. But it also means more than this. It means seeking to explain the nature of language in functional terms: seeing whether language itself has been shaped by use and if so, in what ways.

The said “functional terms” explaining the nature of language are specified in later publications (1994, 2003) in the following way: The meaning potential of language, therefore, is organised by grammar on functional lines. Not in the sense that particular instances of language use have different functions but in the sense that language evolved in these functional contexts as one aspect of the evolution of the human species” (2003: 18)

As a result, in addition to form-function pairings in which the latter are understood as ways of using the language pragmatically, to get things done (to

134

Chapter Two

complain, apologise, explain, etc.), Halliday proposes language functions understood more abstractly – and labelled metafunctions (Halliday 1994) – which are divided into the experiential/ideational, interpersonal and textual functions. The experiential function, to start with, is explained based on the fact that children, while mastering their mother tongue, learn not only the language itself, but also accumulate their knowledge about the world through this language. As Halliday (2003) puts it, “they are using language to construe a theoretical model of their experience” (15). As a result, “grammar is not merely annotating experience; it is construing experience – theorising it in the form that we call understanding” (16). In turn, the ideational function of language boils down to linking the experiential function with grammatical logic and ensuring that the discourse makes sense as a whole because of the flow between its component parts. Still another function of language – interpersonal – amounts to enacting (i.e. acting out) personal experience, interpersonal meanings implying attitudes expressed prosodically (by means of intonation, word order, etc.). Finally, the textual function enables the language user to construct logical and coherent texts. The above-described language functions are complementary, and they are built into the basic architecture of human language. When it comes to complementarity, for a better understanding of the concept it is important to remember that – as already indicated – language learning amounts to using language to construe experience. In effect, the correspondences between the form and meaning of individual signs will be natural in the sense that “in general, what is construed systemically in the grammar will resonate with some feature of our experience of the ecosocial environment” (Halliday 2003: 15; see also Halliday 1994). Simultaneously, in the course of such form-meaning construction – or, as Halliday puts is, on the child’s way from a simplistic protolanguage to the fullblown mother tongue – the form itself will be subject to diversification and complication. Language, according to Halliday (1994, 2003), will become more complex in four different ways: – syntagmatically: individual language signs will be linked to form larger signs; – realisationally: the form-meaning pairings will be uncoupled to create new pairings: old forms will get new conceptualisations; old meanings will be realised formally in a different way; – stratificationally: signs will be layered one cycling onto another and the stratification will apply to forms. The child’s protolanguage originally unstratified – consistent of just meaning and phonology – will be subject to variation in the area of subsystems such as polarisation, mood, tense/aspect, etc.; and – paradigmatically: signs will be networked in relations of dependence. Based on this is the overall architecture of language as a system, a justification of the first component of the name of the model: Systemic-Functional. Halliday

Language instruction: Theoretical underpinnings …

135

himself (2003) admits that architecture may not be the best of metaphors, as it indicates a static organisation of language as a system, stating that “if we want a spatial metaphor of this kind we might perhaps think more in terms of town planning, with its conception of a spatial layout defined by the movement of people” (7). What is meant by this architecture – or the more dynamically conceptualised spatial layout – is an arrangement of meaning-making resources inside a space determined by the four interrelated vectors listed above – syntagmatic, realisational, stratificational and paradigmatic. Within the system different aspects can be perceived depending on the view we adopt (Halliday 2003). In the view ‘from above’, the architecture will be perceived through the previously described language metafunctions: as distinct modes of meaning used for construing experience and enacting interpersonal relationships. In the view ‘from below’, in turn, these two modes of meaning will be expressed through different kinds of structures: experiential meanings as organic configurations of parts (like the Actor+Process+Goal structure of the clause); or interpersonal meanings as prosodic patterns spread over variable domains (Halliday 2003: 16). “Most clearly, however, [the architecture] appears in the view ‘from round about’ – that is in the internal organisation of the lexicogrammar itself” (16), which, represented paradigmatically, is networks that form clusters in which language structures derive from the experiential, interpersonal and textual metafunctions which are mapped onto one another. Resulting instantiations are constrained by the options of the underlying system: the grammar of a given language. What is also crucial to point out is that Halliday’s model is also systemic in the sense that it sees grammar as a system of choices available to the user. Equally importantly, the approach takes these choices beyond the level of the sentence towards an analysis of the whole text. As Halliday (1994: 33) puts it, every clause is analysed in terms of three strands of meaning: message, exchange and representation. The first refers to what kind of information is expressed; the second: what kind of transaction there is between the speaker and their interlocutor and the third: how a certain fragment of experience is structured by means of the clause. These three, as pointed out by Evans and Green (2006), are the assumptions Halliday’s Systemic-Functional Grammar shares with other typological-functional approaches to grammar, as they all explain systematic classifications of types of patterns in terms of language use. Their approach to grammar seems to be best expressed by a definition put forward by Hopper (1987: 142 cited in Evans and Green 2006: 760), who is also the proponent of the term Emergent Grammar: The notion of Emergent Grammar is meant to suggest that structure or regularity comes out of discourse and is shaped by discourse as much as it shapes discourse in an ongoing process. [Its] forms are not fixed templates but are negotiable in face-to-face interaction in ways that reflect the individual speaker’s past experience of these forms, and their assessment of the present context, including essentially their interlocutors, whose experiences and assessments may be quite different

136

Chapter Two

A similar perspective on language to Halliday’s Systemic-Functional Grammar – and a plausible explication of what Halliday’s (2003: 16) view ‘from round about’ may mean – was offered around the same time by Langacker (1987, 2000) in his usage-based model addressing the internal organisation of the lexicogrammar. Langacker’s model is usage-based and non-reductionist in the sense that actual use is primary and the model operates bottom up: in spite of acknowledging the existence of language abstractions in the form of fully schematic networks, it attaches the most importance to low-level schemata. In effect, as Langacker puts it (2000: 91) revising his earlier publication (1987), “substantial importance is given to the actual use of the linguistic system and a speaker’s knowledge of this use”. In the light of the usage-based model, language as a system is an inventory of symbolic units, each of such entities consisting of the semantic pole and the phonological pole, as depicted in the diagramme below (Figure 8): Figure 8. Symbolic unit mapped onto a usage event; adapted from Langacker (1987: 77) symbolic unit

usage event

semantic unit cod

conceptualisation

phonological unit

vocalisation

According to Langacker, in a usage event the above-mentioned poles represent, respectively, the two components of the target structure: the conceptualisation and the vocalisation. Such symbolic units form hierarchies of inheritance containing low-level (more specific) structures and their high-level (abstract) schematisations. Remembering that Langacker’s model is dynamic and usagebased (Langacker 1987, 2000), we can say that the schemata in question will be abstracted from numerous encounters with highly specific applications of this structure. Such abstract schemata will coexist with a number of low-level idiosyncrasies, especially the ones which like I do, I swear to God I..., What do I do now? have become symbolic units – pairings of highly specific form and meaning – on the strength of linguistic convention. Moreover, both such schematisations and exemplars will range from smaller units to highly compositional entities which Langacker (2000: 109) calls “assemblies of symbolic structures” – linguistic conglomerates consisting of two or more symbolic units. Two important facts need to be remembered for a better understanding of the model: its role in every new usage event and the primacy of the user over the system. As for the former, Langacker (2000: 98) emphasises its non-generative

Language instruction: Theoretical underpinnings …

137

character stating that “linguistic knowledge is not conceived as an algorithmic device enumerating a well defined set of formal objects but simply an extensive collection of semantic, phonological and symbolic resources that can be brought to bear in language processing”. What is also important is that some of the said resources are incorporated in other resources – the already-mentioned “assemblies of symbolic structures” – and instantiations and their schematisations form hierarchies of inheritance. When it comes to the role of the system and its user in constructing and understanding novel expressions, Langacker emphasises the primacy of the user, emphasising the ancillary role of the system which is marshalled by the user together with a number of cognitive faculties such as “memory, planning, problem-solving ability, general knowledge, short- and longterm goals as well as a full apprehension of the physical, social, cultural, and linguistic context” (2000: 99). 2.1.3. Acquisition by Processing – the APT model The final theoretical model which underlies implicit learning of languages to be discussed in this section is the Acquisition by Processing Theory (APT; Truscott and Sharwood-Smith 2004). Similarly to LAD and UG-based approaches, it subscribes to the Chomskyan understanding of language learning as domain specific, yet, the language acquisition device is seen as a special genetic predisposition – an instinct (Pinker 1994); a toolkit (Jackendoff 2002); a triggering device (Dbrowska 2004) – rather than a kind of organ (Chomsky 1965). To quote APT proponents, the theoretical proposal argues for “a modular view of language in which competence is embodied in the processing mechanisms”, and in which development occurs “as a natural product of processing activity without any acquisition mechanisms as such” (Truscott and Sharwood-Smith 2004: 1; emphasis mine). This importance of processing links APT to the usage-based models discussed in Section 2.1.2. As an attempt to create an interdisciplinary platform to accommodate both research on general cognition and linguistics, incorporating the perspectives offered by Jackendoff (2002) and Baars (1997b), the APT theory is a middle-ofthe-road approach: it is modular but in Jackendoff’s understanding: the mind is composed of a number of modules which are in fact highly specialised processors. Jackendoff’s parallel architecture – presented in some detail in Chapter 1 – is such a tri-processor/tri-module concept. It includes three pillars of formation rules and relevant structures, which Truscott and Sharwood-Smith (2004) call integrative processors, as well as the interfaces between them, which are called interface processors. The former modules are in charge of putting together structures while the latter map the code of one module onto the representation typical of another module (phonological/syntactic/semantic). Language use processes are additionally carried out by other modules and processors: the visual/auditory and

138

Chapter Two

motor processors as well as the conceptual module which Jackendoff (2002) places outside the language-processing system. When it comes to L2/FL acquisition, Truscott and Sharwood-Smith (2004) see it accommodated within the modular system in which some (sub)modules are shared while others are separate and competing, all this being supported by very active metalinguistic knowledge, extramodular in the sense that it is knowledge about – and not of – the language. As for the model of language development called Acquisition by Processing, it is, as indicated above, a concept “seeking to explain the process with minimal appeal (ideally none) to mechanisms existing specifically for the purpose”, and “changes are to be seen as lingering effects of processing” (Truscott and Sharwood-Smith 2004: 4; emphasis mine). The explanation of such lingering effects of processing is offered based on the concept of the activation level. Truscott and Sharwood-Smith propose that every item/chunk of knowledge has its resting level, determined by all its past activations, and the current activation level, a kind of additional activation received from a processor during its activity. Such activation of items follows a spreading pattern and its direction is determined by the type of accompanying activity: reception or production (language experience, in other words). What is important is that the processor “uses the most active items at its level at a given moment to construct its representations” and “reduces the level of items that turn out to be unsuitable” (Truscott and Sharwood-Smith 2004: 5-6). As a result, every instance of processing incites certain language units working like a magnet on waterborne swarf, to use Truscott and Sharwood-Smith’s metaphor. However, as the proponents of APT admit, “[t]he representations produced during processing are temporary creations, lasting only long enough to allow construction of the message” (Truscott and Sharwood-Smith 2004: 6). Similarly – to go back to the above metaphor – once the magnet is removed from the surface of the tank, the swarf becomes entropic again. Consequently, the question that needs to be asked here is the one about the lingering effects, which are a condition sine qua non of learning in general and language learning in the case of our discussion. Truscott and Sharwood-Smith (2004) propose that such effects are brought about by repetitive processing. If the resting level of individual items changes as a result of every activation, it stands to reason that, the more activation via processing an item receives, the higher its resting level and, consequently, its availability. There is also another type of system modification that can be observed alongside the increased activation of items. This change amounts to the addition of new items and the modification of existing items. As a result of processing, items that are efficient in rendering the desired meaning are retained; others are rejected, fade away and need to be replaced by new items; still other items, which turn out to be partially successful in fulfilling processing demands are subject to readjustment. In this respect, the APT model – its modularity notwithstanding – is highly

Language instruction: Theoretical underpinnings …

139

reminiscent of the cognitive Competition Model (discussed earlier in this section). The resting level of items increasing as a result of activation by processing resembles the growth in item weight; the rejection/shadowing process, in turn, is very much like losing the competition. Truscott and Sharwood-Smith acknowledge the similarity between the two theories themselves. They, however, argue that no language processing strategies – as theorised by MacWinney Bates and MacWinney 1982; MacWinney 2002 – are applied; “[i]nstead, cross-linguistic contrasts are a product of acquired differences in the content of the lexical stores” (2004: 11). Relating their theory to second language acquisition, Truscott and SharwoodSmith (2004) address four issues: noticing the gap, the initial state, transfer and limits in SLA. Discussing the concept of noticing the gap12, the proponents of APT reject Schmidt and Frota’s (1986) idea of a direct, conscious comparison, arguing that there is no mechanism designed for the comparison, which is a byproduct, “an abstraction from the characteristics of processing” (2004: 13). These characteristics are imposed on the language system during online processing based on the decision that a certain representation fits the utterance better. As a result, the system receives a signal (Carroll 1999 cited in Truscott and SharwoodSmith 2004: 13) rather than an overt imperative to modify the system. This is in agreement with Truscott’s (1998) position on noticing as such as well as the understanding of the noticing-the-gap phenomenon, both presented and discussed in Chapter 1. As a brief reminder, Truscott denies the necessity of focal fully conscious attention being in favour of global awareness instead. In turn, the present work claims that there are occasions in which the discrepancy between the target language and the learner interlanguage is manifold; consequently, while we consciously attend to some aspects of the gap, other facets are noticed as part of a certain unanalysed gestalt. As for the initial state, or the presence of L1 characteristics in L2 performance, APT explains this in the following way: the processing chain – the spreading activation across modules – accesses both lexicons. As a result, some L1 items, due to their higher resting levels, are accessed more easily than L2 items. Consequently, the operating question will be not which L1 items are present in the initial state of L2 but rather which L1 items, how and in which conditions are accessed in processing. This also leads to redefining the notion of transfer, which the proponents of APT prefer to see as “chronic involuntary code switching” (2004: 14) resulting from the above-mentioned levels of activation. Finally, Truscott and Sharwood-Smith address the issue of the limits of SLA. They concentrate on an input competition view, limits on working memory and an output competition view. As for the competition views, they claim that when ––––––––– 12

For a detailed discussion, see Chapter 1

140

Chapter Two

parsing an utterance, L1 entries are activated alongside L2 items and compete for their share in the upcoming representations. Such a competition will take place in both comprehension (input) and production (output) in which the winning language chunks will increase their activation levels and be subject to modifications, including modifications to L1 entries. As far as limits on working memory are concerned, any decline in its capacity will lead to lower processing efficiency, an issue discussed quite thoroughly in Chapter 1. 2.2. Explicit language learning Section 2.2, devoted to explicit language learning, is divided into four subsections, each presenting a different theoretical model: Input Processing; output/interaction-based approaches; the approach to language as a cognitive skill, in which the already-discussed ACT/ACT-R model and the concept of language strategies are revisited and related to language learning; and finally an array of hypotheses related to instructed learning, out of which the Interface Hypothesis will be given the greatest attention, as it is the basis for deciding in what way instructed learning leads to learning at all. As was signalled before, the selection of the models mirrors the one proposed by Pawlak (2006). In addition to the four models selected for discussion, Pawlak describes the connectionist perspectives, which are omitted here as they seem to be better accommodated within the implicit learning paradigm13. 2.2.1. Input Processing (IP) Rooting his proposal in a broader research tradition, VanPatten quotes a number of observations made by researchers who have acknowledged the importance of input; some of these findings were already quoted in the subsection devoted to Krashen’s Monitor Model. The said observations are as follows (VanPatten 1996: 5): – humans acquire languages only in one way – by being exposed to comprehensible input (Krashen 1985: 2); – the availability of comprehensible input is a guarantee of success in both L1 and L2 acquisition (Larsen-Freeman and Long 1991: 142); – the input that is indispensible for language acquisition may be provided by means of interaction or in non-reciprocal discourse (Ellis 1994: 26); cf. also input text as opposed to input discourse (Widdowson 1974); ––––––––– 13

The present work mentioned the perspective before in connection to associative learning, learning by analogy and modelling, all of which are examples of learning which is not considered to be rule-based (for a detailed discussion see Pinker 1999 on Parallel Distributed Processing, among others). The Competition Model, which is connectionist in nature, was presented in Section 2.1 as an implicit learning model.

Language instruction: Theoretical underpinnings …

141

– the acquirer needs exposure to language exemplars or instances for language development to take place (Schwartz 1993: 148); – L1/L2 input – leads to L1/L2 grammar – a system that makes it possible to produce comprehensible output (White 1989: 37). Based on the above-mentioned concepts is VanPatten’s idea of Input Processing (IP), whose workings were, to some extent, discussed in Chapter 1, which was devoted to the role of attention in language learning. This was because IP puts a lot of emphasis on the focusing element in language learning, claiming that “attention to features in input [is] an integral part of the grammar acquisition process” (1996: 6). Related to attention are the two major principles of IP discussed in the said chapter, briefly revisited below (VanPatten 1996: 14-15): P1: Learners process input for meaning before they process it for form: P1a: learners process content words in the input before they process anything else; P1b: learners prefer processing lexical items to grammatical items; P1c: learners prefer processing more meaningful morphology before “less” or “nonmeaningful” morphology. P2: For learners to process what is not meaningful, they must be able to process informational or communicative content at no (or little) cost to attention.

The verb to process repeated throughout the two principles should be interpreted as to understand [the input] and ... . As VanPatten (2004: 25) himself states, input processing is “the first hurdle” in the sense that acquisition is a by-product of comprehension: what is learned is greatly dependent on the degree to which the learner understands input. What is needed, to get the learner past this first hurdle, is input modification in a way that makes it rich in a particular language feature as well as ways of pedagogical recycling of such input: activities that help learners pay attention to this feature. This is what VanPatten (1996 and later works) calls Processing Instruction (PI, to be discussed in Section 3). Triggered by these activities, Input Processing becomes an intervening factor – a monitor, to use Krashen’s terminology (1981; 1982) – on any acquisition model, as depicted in the Figure 9: Figure 9. Input Processing acquisition model (adapted from VanPatten 2004: 115) IP INPUT-----------------------------OTHER PROCESSORS AND MECHANISMS

LEARNER INTERNAL GRAMMAR AS A DEVELOPING SYSTEM

142

Chapter Two

An important aspect of VanPatten’s idea is the fact that Input Processing – what VanPatten (2004) himself admits – is not an attempt to construct a model of language acquisition. As the proponent claims, it is a concept rather than a model and – through its “other processors and mechanisms” – it is applicable to both parameter resetting14 or form regularisation processes15. This is because Input Processing is a concept that in a way belongs to all SLA theories because all theories acknowledge the following facts (VanPatten and Williams 2007: 9-12): – – – – – – – – – –

Exposure to input is necessary for SLA. A good deal of SLA happens incidentally. Learners come to know more than what they have been exposed to in the input. Learner's output (speech) often follows predictable stages in the acquisition of a given structure. Second language learning is variable in its outcome. Second language learning is variable across linguistic subsystems. There are limits on the effects of frequency on SLA. There are limits on the effect of a learner's first-language on SLA. There are limits on the effects of instruction on SLA. There are limits on the effect of output (learner production) on language acquisition.

2.2.2. Output/interaction-based approaches According to Krashen (1989 cited in Ellis 1994: 280), two hypotheses are related to the role of output in language acquisition: the Skill-Building Hypothesis and the Output Hypothesis with special regard to Swain’s (1985) Comprehensible Output Hypothesis. In the light of the Skill-Building Hypothesis (Krashen 1989), it is through practice that we automatise what we have learned consciously, a claim similar to that put forward by the ACT/ACT-R model (to be discussed Section 2.2.3). What makes the two models different is that, in the light of the Skill-Building Hypothesis, the internal system does not change; what is subject to improvement are the speed of retrieval and the amalgamation of structures in online language processing. All of this is in agreement with Krashen’s claim that (explicit) learning does not lead to acquisition. ACT/ACT-R, in turn, proposes that, as a result of practice, we modify our representations of task-execution procedures stored in the LTM; in addition to this, like the Skill-Building Hypothesis, ––––––––– 14 15

Discussed in Section 2.1.1 on UG-based approaches to language learning Such as the competition of cues (discussed in Section 2.1.2 in the part devoted to the Competition Model) as well as interaction (Section 2.2.2); or learning language as a cognitive skill (Section 2.2.3).

Language instruction: Theoretical underpinnings …

143

ACT/ACT-R provides for the increased speed of retrieval. As for the role of output practice in language learning, we can speculate that, in addition to affecting the fluency of production in a direct way – just as any practice develops a skill – the increasing speed of retrieval has a beneficial effect on online reception because items that are more readily available for speaking will be retrieved in listening with comparable ease. As for the (Comprehensible) Output Hypothesis, it puts an emphasis on the direct (correction) or indirect (negotiation of meaning) feedback that we receive from our interlocutors, the former factor situating Swain’s proposal among explicit learning models. What is particularly underscored in the (Comprehensible) Output Hypothesis is the influence of such feedback on our ability to restructure our internal grammar and make it develop. What cognitive functions and processes such functions trigger are involved in the development of language as a system in the case of such interactional input processing? As Swain (1985) claims, what is of utmost importance is processing the input that comes from one’s interlocutor for co(n)textual clues that (dis)confirm that one’s output is comprehensible. The functions activated as a result of such clues are, according to Swain (2004): noticing, hypothesis formulation and testing as well as metatalk, a verbalised reflection on these hypotheses. All this may result in the activation of a number of cognitive processes, two of which seem particularly important. The first of them will amount to searching the available language resources for forms that render the desired meaning in a clear(er) way. In this way the attention focus is going to be shifted from the gestalt meaning-only processing to a more finetuned analysis of the formal properties of language. The other process will consist in restructuring the existing system by making some form-meaning mappings stronger while shadowing or erasing the communicatively ineffective formmeaning pairings. The advantage of such interaction-based feedback lies in the fact that being communicative is a very strong desire and being able to satisfy this desire attaches considerable weight to a given form: the cue that comes from being understood is reliable and has a great conflict validity – it is a winning structure and its success is immediately and very strongly confirmed by being able to get one’s message across. The two cognitive processes – the search and the restructuring – whose activation is provided for by the Output Hypothesis, make Swain’s postulate similar to a slightly earlier proposal called the Interaction Hypothesis put forward by Long (1983b, 1996). Long’s hypothesis emphasises the importance of both the negotiation of meaning and the interactional modifications, each of which has the potential to trigger the said search for language resources as well as to result in restructuring the system. In this area Long (1996: 451-452) attaches particular importance to “interactional adjustments by the Native Speaker or more competent interlocutor”, which facilitate language learning by “[connecting] input, internal learner capacities, particularly selective attention, and output in

144

Chapter Two

productive ways”. Consequently, as noted by Gass (2007: 234), “conversation is not only a medium of practice, but also the means by which learning takes place”. A similar concept of why and how interaction helps restructure a learner’s internal representations of the target language system is provided by Andersen’s Nativisation model. Andersen (1990a; see also Ellis 1994: 378-379 on Andersen’s earlier work) claims that L2/FL learners are normally inclined to conform to internal norms, which amounts to forming hypotheses on the basis of their L1 knowledge. In turn, when interacting with a proficient target language user, they are exposed to an external standard which has a normative influence on the internal system. Andersen (1990a) calls this norm-resetting processes denativisation. 2.2.3. Language as a cognitive skill There have been numerous attempts to adapt skill-learning theories to language learning. The 1940s saw the application of the behavioural skill learning to language learning contexts through stimulus-response conditioning which took the classroom form of audiolingualism. More modern applications of the proposal, particularly Anderson’s ACT/ACT-R model, related to both cognition and metacognition in language learning, are briefly discussed below. On the cognitive/learning level, the ACT/ACT-R skill learning model was applied to language learning context by Anderson himself (Anderson 1980, 1983) as well as DeKeyser (2001, 2004, among others). As Anderson claims (1980: 224): We speak the learned language (i.e. the second language) by using general rulefollowing procedures applied to the rules we have learned, rather than speaking directly, as we do in our mother tongue.

In doing so, we follow the three-stage skill-learning cycle (described in Section 1.2.1), proceeding from rule-based production through the associative stage towards autonomy, the whole process being slow and painful, and full automatisation – extremely rare, as Anderson (1980) notes. Such proceduralisation, regardless of the degree it reaches, is achieved through practice. As pointed out by DeKeyser (2004: 52): If we want our students to achieve fluency in the second language (in the sense of the automatic procedural skill ...), then, according to cognitive theory, we must enable them to engage in the practice of using that language, while they keep the relevant declarative knowledge in working memory.

Alternative versions of such rule-based practice acknowledged by DeKeyser (2004) are Logan’s instance theory of automaticity (Logan 1988) and Schneider’s item-based theory (Schneider 1985). DeKeyser (2004) notes that both theories

Language instruction: Theoretical underpinnings …

145

provide for single-step, direct access retrieval of past solutions from the memory but in doing so they utilise practice in two different ways: in the former the role of practice is to increase the number of instances encoded and, as a result, facilitate retrieval; the latter model proposes that practice makes retrieval faster by strengthening the representation of an item in long term memory. The rule-based and the item/instance based models of skill learning correspond to two modes of language learning accounted for by the cognitive theory: rule-based and exemplar based, accommodated alongside each other in Skehan’s view of language. Skehan (2003) argues that these learning modes coexist and learner preference for one and not the other is motivated by individual differences: the rule-based mode is chosen by those whose language aptitude is related to their grammatical ability; the exemplar mode is chosen by learners who rely on their memory in language learning. Another area where the view of language as a cognitive skill applies is the metalinguistic/learning-to-learn domain, with special regard to learning strategies. They are discussed within the cognitive view of language based on the assumption that if language learning is like learning any other skill, then it may be made more efficient through “special ways of processing information that enhance comprehension, learning and retention of the information” (O’Malley and Chamot 1990: 1). The first interpretation of language learning strategies and in fact the first proposal of strategy classification came from Rubin (1975), as an afterthought of her good learner study. She divided “the techniques and devices, which a learner may use to acquire language (Rubin 1975: 43) into two groups: strategies 1) directly and 2) indirectly contributing to language learning (Rubin 1981: 124126). One of the most popular strategy taxonomies based on her model is the one devised by Oxford (1990)16, which divides learning strategies into direct and indirect strategies. The former include: memory strategies such as creating mental linkages, applying image and sound, reviewing well and employing action; cognitive strategies like practising, receiving and sending messages, analysing and reasoning as well as creating a structure for input and output; and compensation strategies which include guessing intelligently and overcoming limitations in production. Indirect strategies comprise: metacognitive strategies which are about centring, arranging, planning and evaluating learning; affective strategies such as lowering anxiety, encouraging oneself and taking one’s emotional temperature; and finally, social strategies including asking questions, cooperating and empathising with others. Learning strategies are applied to language learning based on the understanding that (O’Malley and Chamot 1990: 217): ––––––––– 16

Another popular model is the one proposed by O’Malley and Chamot (1990), which was described in Section 1.2.2.

146

Chapter Two

– learning is an active and dynamic process in which individuals make use of a variety of information and strategic modes of processing; – language is a complex cognitive skill that has properties in common with other complex skills in terms of how information is stored and learned; – learning a language entails a stagewise progression from initial awareness and active manipulation of information and learning processes to full automaticity in language use; and – learning strategies parallel theoretically derived cognitive processes and have the potential to influence learning outcomes in a positive manner. 2.2.4. Theoretical underpinnings of instructed learning Related to skill learning is the consideration of the role of instruction on both the acquisition of the declarative knowledge as well as its subsequent proceduralisation. In SLA theory both issues are addressed by a variety of hypotheses, a selection of which is discussed in this section. The influence of instruction on the declarative knowledge of language is addressed by the following hypotheses: – the Teachability Hypothesis (Pienneman 1985): formal instruction does not affect structures that are beyond the learner at a given stage. This hypothesis is clearly related to Vygotsky’s concept of the zone of proximal development, discussed earlier in this chapter and as such is not going to be given extensive attention here; and – the Noticing Hypothesis (Schmidt 1990). Schmidt’s proposal was presented quite extensively in Chapter 1. That is why in the present section it is going to be addressed in one aspect only: Schmidt’s claim that “what learners notice in input is what becomes intake for learning” (Schmidt 1995: 20). This claim is going to be considered from the perspective of interfacing between the instruction mode and the resulting knowledge as well as through knowledge/knowledge interfaces. In turn, when it comes to the relationship between instruction and knowledge, based on the body of research quoted in the present chapter it seems obvious that the most frequent interfaces between instruction and knowledge will be implicit/implicit and explicit/explicit. Ellis and Bogart’s (2007: 1) note: “explicit learning usually benefits from tutoring, whereas implicitly learned skills don’t need much support other than sufficient opportunity to practice”. The word usually is of particular importance here because if we take into account the graduity of extensive-intensive noticing as well as attentional shifts (cf. Chapter 1), we come up with an array of input/intake options, as presented in Table 1 below.

Language instruction: Theoretical underpinnings …

147

Table 1. Learning/knowledge interfaces

explicit deductive

explicit inductive

implicit inductive/deductive

knowledge  instruction mode 

implicit

explicit

As a result of exposure to comprehensible input and output, learners achieve a considerable degree of language schematisation which is rule- or analogy-based, but whose intricacies are inaccessible and, consequently non-verbalisable.

As a result of exposure to comprehensible input and output, learners – drawn by communicative expediency or their need to make sense of language as a system – will consciously notice the most salient language features, schematise them and come up with explicit language rules verbalised on demand. This knowledge will be explicit inductive. However, the induced explicit knowledge and may become the point of departure for subsequent top-down investigations of input. Explicit knowledge gained in this latter way will be of deductive origin. If input is approached by a learner whose awareness of language is intrinsically high or stimulated by instruction, or if this awareness is increased as a result of negotiation of meaning in output, the most salient features of this input will be consciously noticed, schematised and formalised into rules verbalised on demand. This knowledge will be of inductive origin. However, as above, the induced explicit knowledge may become the point of departure for subsequent top-down investigations of input. Knowledge gained in this way will be of deductive origin. In the course of focus-on-form activities, the learners will gain explicit rule-like knowledge of the formal aspects of language that are subject to instruction. This knowledge will be gained inductively, if instruction triggers the learner’s own conscious educational experience or deductively, if the teacherfronted, top-down instruction is assimilated.

If input is approached by a learner whose awareness of language is intrinsically high or stimulated by instruction, or if this awareness is increased as a result of negotiation of meaning in output situations, in addition to certain aspects of language being learned explicitly, some elements of input will be processed peripherally, owing to extensive attention that is paid alongside intensive focusing. In this way they may be subject to learning which is not fully explicit (the knowledge will be inaccessible or only partly accessible to verbal report).

In the course of activities in which the learner’s conscious and focal attention to language form is evoked and certain aspects of language are learned explicitly, some elements of input will be subject to implicit learning, owing to the peripheral attention, especially if focusing on form is carried out in the communicative context and/or the learner has access to samples of actual language use. As a result of the mutilaspectuality of input, this knowledge maybe inaccessible to verbal report. The knowledge will be inductive in origin.

148

Chapter Two

Finally, when it comes to the role of instruction in knowledge/knowledge interfacing, with particular regard to the proceduralisation of the declarative knowledge, the following hypotheses have been put forward: – different versions of the Interface Hypothesis which addresses the issue of whether by practicing structures learners can learn to use them without conscious control of them; in other words, whether explicit knowledge becomes implicit; – the Variability Hypothesis (Ellis 1987): teaching structures affects the careful style but not the vernacular (natural production) style; – the Selective Attention Hypothesis (Lightbown 1985a)/Pedagogical Grammar Hypothesis (Sharwood-Smith 1985): instruction provides hooks, points of access for further learning through exposure; and – the Delayed Effect Hypothesis (Lightbown 1985b), in the light of which the effects of instruction are reflected in learning after a time of germination. All the four hypotheses are addressed below. However, the main focus of the discussion is going to be on the Interface Hypothesis because, as was mentioned in the introductory part of this section, this is the postulate based on which we can decide to what extent instructed language learning can be seen as learning in the functional sense. The other three hypotheses will be cited in the course of this mainstream discussion. As was argued in relation to skill learning, declarative knowledge is potentially proceduralisable as a result of repetitive processing, ideally in the form of overlearning/cyclical practice. Some aspects of such a transformation were presented in the ACT/ACT-R model discussed earlier in this section. The word potentially is used here purposefully, as there is disagreement between researchers as to how proceduralisable explicit knowledge is. The controversy stems from different conceptualisations of the type of interface between explicit and implicit knowledge as well as from the different ways in which various researchers operationalise the term implicit. Starting from the most radical perspective, Krashen’s Monitor Theory (Krashen 1981, 1982) with its non-interface position provides for complete separation of the two types of knowledge and, consequently, rules out the possibility of the learned transforming into the acquired. A perspective which is slightly less radical that Krashen’s is the weak interface position, first presented in Bialystok’s (1978) Model of Second Language Learning. In the light of the model, explicit and implicit types of knowledge, even though still seen as separate, can be transformed into one another: explicit knowledge becomes implicit as a result of practice; implicit knowledge turns into explicit through inferencing. This, however, proceeds with considerable difficulty because of what Bialystok intuited, and what we now know based on neuroimiging: the two types of knowledge “involve different types of representation and are substantiated in separate parts of the brain” (N. Ellis 2005: 307; cf. also Paradis 1994).

Language instruction: Theoretical underpinnings …

149

Going a little further towards the interfacing of the two types of knowledge is the concept of implicit and explicit types of knowledge as intersecting continua, which is part of Ellis’ Variable Competence Model (Ellis 1984 cited in Ellis 1994: 365-366)17.The proponent writes (Ellis 1994: 365): “[l]ike Bialystok, I see [the two kinds of] L2 knowledge as represented differently in the mind according to how analysed and how automatic it is”. However, as Ellis admits, with time and practice, items that were originally available only in planned performance become part of the learner’s vernacular style; in turn, participating in different types of usage events helps the learner acquire rules. Ellis’s later proposal (1993) provides for yet another configuration of types of knowledge and interfaces – strong and weak – between them. The updated interface model (Ellis 1993) is based on the assumption that the two types of knowledge, explicit and implicit, can be both controlled and automatic. In the light of such an acknowledgement, Ellis proposes a strong interface between the controlled and automatic variations of each knowledge (explicit controlled/explicit automatic; implicit controlled/implicit automatic) and a weaker interface between – for example – explicit controlled and implicit automatic, suggesting that the latter type of transformation is far more difficult. The proposal coincides with Ellis’s Variability Hypothesis (Ellis 1987): the explicit teaching of structures affects the careful style to a greater extent than the vernacular (natural production) style. If we examine Ellis’s model (1993), it is easy to notice that, in the light of it, explicit knowledge can easily become implicit in Frensch’s (1998; see earlier in this chapter in Section 1.1 devoted to the definition of implicit learning) understanding of the implicit, which boils down to the unintentional and/or the automatic. However, it is much more difficult (Ellis 1993) – if not impossible – for conscious declarative knowledge to become unconscious without the learner being able to retract their cognitive steps and recall the original, conscious, verbalisable state of this knowledge, which is how the implicit is understood by Diennes and Berry (1993; see earlier in this chapter). In the case of the implicit in the latter understanding, it has to be accepted, after N. Ellis (2002, 2005) that the interfacing between the explicit and the implicit will be limited to the former triggering the latter. Ways in which implicit representations can be triggered by explicit knowledge are discussed below. Ellis (1994) points out that explicit knowledge in the form of rules is the blueprint necessary for the subsequent organic acquisition of implicit knowledge. He (Ellis 2003: 148) explains it in the following way: “the initial state of learning is conscious, involving explicit attempts on the part of the learner to understand ––––––––– 17

The intersection is possible owing to the Capability Continuum Paradigm – Ellis himself acknowledges the similarity between his model and Tarone’s – which “allows for two means of the internalization of interlanguage” (Tarone 1983: 155). Both, Ellis’ and Tarone’s belong to variability theories of L2 knowledge.

150

Chapter Two

what is to be learned”; in the subsequent learning the learner can be less concentrated on individual aspects of input. In his model of such a role of explicit knowledge in implicit learning, Ellis (2003) sees two planes on which the intervention of the articulate, conscious knowledge is of importance: the stage of intake, where explicit knowledge is a form of attention focusing device increasing the level of noticing; and the level of noticing the gap, where explicit knowledge of rules functions as a monitor. In the former case, “[i]f learners are armed with explicit knowledge of a [certain] feature, they are more likely to notice its occurrence in the ... input”; as for the noticing-the-gap plane, “when the learners know about a particular feature, they are better equipped to detect the difference” (Ellis 2003: 149) between their own performance and the target, expert-like execution of the task. A similar perspective is offered in an earlier work by N. Ellis (1994), who claims that the explicit, declarative knowledge of rules influences the working memory component called the central executive by making certain features of input salient and, consequently facilitating their perception and processing. If we go back to the view of working memory as activation patterns in the long-term store and simultaneously see the explicit/declarative knowledge of rules as certain overtly delineated paths of such activation, we can speculate that if they are pre-established, any new relevant input will be more easily processed along these activation paths instead of seeking a totally novel route. This is, most probably, what Ellis’s blueprint idea amounts to. A similar idea underlies the Selective Attention Hypothesis (Lightbown 1985a) and the Pedagogical Grammar Hypothesis (Sharwood-Smith 1985); both their proponents subscribe to the view that instruction provides hooks or points of access for further learning through exposure. The idea of the explicit blueprinting/providing hooks/triggering the implicit is incorporated by a number of language learning models. One model – in which this concept is particularly emphasised at the beginning of the learning process, in relation to novel problems – has been put forward by N. Ellis (2002) in his implicit tallying hypothesis. Very much in conjunction with Harrington and Dennis’s (2002: 261) conceptualisation of “the language learner as an intuitive statistician”, the implicit tallying hypothesis proposes that “the initial registration of a language representation may well require attention and conscious identification” (N. Ellis 2002: 173). However, “once a stimulus representation is firmly in existence, that stimulus ... need never be noticed again; yet as long as it is attended to for use in the processing of future input for meaning, its strength will be incremented and its associations will be tallied and implicitly cataloged” (N. Ellis 2002: 174). This, as N.Ellis points out in a later publication (2005), goes hand in hand with Baars’ (1997) observation that the most conscious involvement is needed in the case of novelty. Interpreting the above we can say that it is enough to limit explicit instruction to the introduction of a certain form and to later leave it for implicit/incidental processing in communicative context. How

Language instruction: Theoretical underpinnings …

151

long the introductory explicit phase should last depends on how complex and/or ambiguous a certain pattern is (cf. Curran and Keele 1993). In any case, explicit instruction has to be followed by what N. Ellis calls provision of exemplars and tuning opportunities (2002: 175). Another model of interaction and mutual triggering between the explicit and implicit types of knowledge is proposed by Rodrigo et al. (2002). They argue that explicit knowledge, understood as all kinds of rule-like articulate representations of language as a system, will be a part of the otherwise implicit learner interlanguage as discordances or slates: portions of knowledge that can be singled out from the continuous entity and which function as milestones on the learner’s route between their mother tongue and the target language. We can hypothesise that such portions of explicit knowledge will be the endmarks of some interlanguage processes and the starting points of other processes. Consequently, on the one hand, the explicit knowledge slates will be the result of global observations becoming focal when the representation of certain exemplars in input reaches critical mass, and some explicit rules are induced from it. This process will potentially be reinforced by teacher-fronted instruction being successfully assimilated or becoming active at the right moment in the right context. On the other hand, the said explicit representations resulting from teacher instruction and learner discovery will, hopefully, act a point of departure for further development of interlanguage, among others by helping to narrow the learner hypothesis space, as MacWinney (1997b) puts it. The latter mechanism is in agreement with Lightbown’s Delayed-Effect Hypothesis (Lightbown 1985b), in the light of which the result of instruction may not be perceived instantaneously, but it definitely “plants the seeds for later-stage development” (Pawlak 2006: 216). This means that teaching does not bring about the product but triggers a process whose successful completion is delayed in time for as long as it takes an individual learner to respond to the priming effect of instruction. This, in turn will depend on whether the instruction was offered within the learner’s zone of proximal development. A reverse process – that of implicit knowledge being declarativised – is also deemed to be possible, and is described in Johnson (1996). This is the process we frequently observe in the case of our native tongues: the implicit, f-functional knowledge of the language acquired in the pre-school period is subject to declarativisation in the course of the formal instruction received as part of the native language lessons. As a result of this, we learn the rules underlying the sofar natural use of our first linguistic system together with the metalanguage that is used to describe them. Similarly, in the case of the implicitly learned foreign language declaritivisation takes place when the learner acquires the knowledge of the rule underlying their so-far intuitive language behaviours either from the tutor in a top-down fashion or comes by this rule as a result of inferencing.

152

Chapter Two

Considering the nature of the learner’s language system, which is in a state of constant flux, the two types of interfacing described above – learning/knowledge and knowledge/knowledge – need to be seen in terms of processes rather than products. As a result, we can state that the learner’s expertise in the target language will be a combination of chunks of knowledge learned implicitly and explicitly, the latter represented in the learner’s internal grammar in the steppingstone fashion proposed by Rodrigo et al. (2002); and both kinds of knowledge will form intersecting continua, as suggested by Ellis (1984). The problem that remains to be considered is the one of instruction which facilitates the two kinds of learning and results in the two kinds of knowledge, its effectiveness and potential constraints. This instruction, similarly to learning modes and knowledge, will be discussed with reference to the implicit and explicit modes.

3. Focus on form: pedagogical options Stemming from the implicit and explicit models of language learning discussed in Section 2 are a number of classroom applications, a selection of which are going to be presented here. However, the binary-oppositions mode of presentation that Chapter 2 has used so far is going to be abandoned, and focus-on-form pedagogical options will be placed on a kind of implicit-explicit continuum, based on how implicit/explicit they are in the sense of whether or not – or, rather, to what extent – they pay attention to instilling in the learner’s mind conscious, rulelike knowledge of the target language (cf. the working definitions of implicit/explicit proposed in Section 1 of the present chapter). Therefore Krashen’s Natural Approach approach, which is discussed first (Section 3.1) is placed at the far-implicit end of the continuum, as Krashen values acquisition (unrelated to declarative representations) over conscious systematic learning. That Krashen’s approach is discussed as a focus-on-form option at all is motivated by the fact that contemporary FFI opts for a combination of some kind of instruction with exposure to the communicative context; Krashen’t Input/Comprehension Hypothesis satisfies the exposure end of the pedagogy. Input enhancement, which is presented next, in Section 3.2, is treated as an implicit option because the frequency and salience effects it is based on do not have to lead to cognitive effects that go beyond noticing and memorizing of exemplars, into the conscious abstractions of patterns and rules; they might, though, which makes them less implicit than Krashen’s model. Consciousness Raising (C-R), a middle-of-theroad approach which combines the implicit and the explicit, is described in Section 3.3. Section 3.4, which is the first of the two looking at explicit instruction, discusses the role of experience and discovery in explicit, rule-based language learning. However, some solutions, like those proposed within the TaskBased Learning framework, are still very close to the C-R organic-over-mechanic

Language instruction: Theoretical underpinnings …

153

approach to language. Finally, Section 3.5 presents options in the explicit deductive language pedagogy which is situated at the far-explicit end of the continuum. 3.1. Comprehensible-input language pedagogy Along the continuum of implicit/explicit approaches to language learning, one of the far ends is where we place Krashen and Terrell’s Natural Approach (Krashen and Terrell 1983) and its mainstay principle that comprehensible input is a sufficient condition for language acquisition. The fact that this approach is “comprehension-based” means that, to use VanPatten’s words, the necessary instruction is limited to “the general provision of comprehensible input by instructors and materials during the entire course of language acquisition” (2000: 82). Any other forms of instruction, as declared in Krashen’s Monitor Model (discussed earlier in this chapter), are of peripheral importance and of little influence on language acquisition. Consequently, the essence of one of the hypotheses informing the Natural Approach – the Input Hypothesis, later renamed as the Comprehension Hypothesis – can be expressed in Krashen’s words: “language acquisition does not happen when we learn and practice grammar rules. Language acquisition only happens when we understand messages.” (Krashen 2004: 21). What is it that makes it possible for the learner to understand messages or, in other words, what makes input comprehensible? Krashen (1981 and later works) describes it as i+1 (cf. Section 2.1.2), specifying that if we assume that the current language level of the learner is i, the language input should be slightly beyond this level. When it comes to practical solutions, Long (1983a), in whose understanding making input comprehensible amounts to its fine-tuning – a bone of contention between himself and Krashen – suggests using lexis and structure that the learner is already familiar with. Another idea is the one proposed by Ellis (1985), based on which input should be prepared in a way that enables the learner to utilise their linguistic and extralinguistic knowledge in interpreting this input. Krashen himself (2004) is in favour of what he calls “narrow input”: starting with one author/genre/topic/etc. and gradually branching out. “Staying ‘narrow’,” he claims, “allows the acquirer to take advantage of background knowledge built up through the input.” (Krashen 2004: 25). As was made clear in the description of the Monitor Model (see earlier in this chapter), Krashen (1981, 1982, 1985) relates his Input Hypothesis to his four other hypotheses on how languages are learned: the Natural Order Hypothesis; the acquisition/learning distinction and the Monitor Hypothesis; and the Affective Filter Hypothesis. In a much later work (Krashen 2004: 21), he confirms the relationship between the hypothesis, renamed as Comprehension Hypothesis, and the other four postulates by stating:

154

Chapter Two The Comprehension Hypothesis is closely related to other hypotheses. The Comprehension Hypothesis refers to subconscious acquisition, not conscious learning. The result of providing acquirers with comprehensible input is the emergence of grammatical structure in a predictable order. A strong affective filter (e.g. high anxiety) will prevent input from reaching those parts of the brain that do language acquisition.

In other words, the Input/Comprehension Hypothesis is dependent on the other hypotheses in the following way (Krashen 1985, 2004): – if input contains structures that are understood by the learner – at their level or slightly beyond – the learner is able to progress through a natural order of acquisition which is predictable, independent of the learner’s age, L1 background and conditions in which the exposure to target language takes place. What, however, is the most important is that this natural order is, according to Krashen, unaffected by language instruction. This claim is one of many manifestations that ... – ... there is a distinction between acquisition and learning, the latter functioning as a kind of monitor of the former. As Krashen puts it, the monitor system, which results from formal language instruction, is in charge of planning, monitoring and correcting the utterances which are created by the acquisition system. The influence of the consciously learned knowledge on this acquired naturally is limited; it is also learner-specific: some learners overuse the monitor system, some use it insufficiently and some, who are called the optimal users, utilise the system appropriately. Krashen (2004) claims that teaching rules explicitly makes particular sense in the case of advanced learners, whose speech should naturally be more “polished”; – comprehensible input, even though it is the condition sine qua non for language acquisition, is insufficient if learners are not affectively predisposed to take advantage of it. Consequently, as Krashen (1985) suggests, the best methods will be the ones offering comprehensible input in low anxiety learning contexts, which means, first and foremost, that early production will not be forced and students will be allowed to wait until ready to speak; moreover, lessons will contain material that is attractive to the learner. As for the proof that comprehensible input is both necessary and sufficient for language acquisition, Krashen (1985 and 1993; see also Ellis 1994: 277) puts forward a number of arguments. First of all, children as well as language learners acquire languages on condition that the input they receive is comprehensible. Such input takes the form of caretaker speech in the former case and foreigner talk in the latter; in the case of L1 lack of comprehensible input delays language acquisition. Secondly, children acquiring their L1 as well as L2 learners go through an initial Silent Period, which may suggest that for any form of production to emerge, the comprehensible input has to be subject to incubation. In addition to this, successful teaching methods work through the provision of comprehensible input; they include

Language instruction: Theoretical underpinnings …

155

TPR, immersion teaching and bilingual programmes as well as language education based on extensive reading. To start with, the success of TPR in YL and adult groups alike is confirmed by a number of studies (cf. Ellis 1994). In turn, in the case of immersion, especially promising results have been obtained in the case of the Canadian French. Their description – together with their history and variations – is offered in Ellis (1994) and Pawlak (2006). The general conclusion of the studies both authors present is that immersion students, who are taught content subjects in the target language, achieve both language and academic proficiency (Genesee 1984 and 1987 as well as Swain and Lapkin 1982 cited in Ellis 1994: 226). Finally, it has been found that extensive reading increases language abilities in both native and non-native readers. Krashen (1993) looks at studies comparing the language achievement of two groups of L1 learners: those who took part in classes with intensive reading comprehension activities and students who just read extensively. He notes that 93% of the extensive students did better than their intensive counterparts. Similar effects, this time of L2 extensive reading, are described in Elley (1991 cited in Prowse 1999: 10). The study in question involved 500 9-11 year-olds from 12 schools in Fiji. The control schools followed audiolingual classes (the study was carried out in the 1980s) while the experimental schools were asked to read 20-30 minutes for pleasure each day. On a comprehensive test two years later, the experimental groups outperformed their controls. The results were repeated in Elley’s subsequent study carried out in Singapore. All this shows what Krashen (1993) calls the power of reading, the beneficial effects of extensive exposure to text. 3.2. Input enhancement If we take into account the results of the immersion and extensive reading studies mentioned above as well as our own intuition about how languages are acquired/learned, based on the anecdotal, often autobiographic evidence of what the learner knows in spite of the lack of formal instruction in many areas, we have to agree with Krashen that comprehensible input plays an important role in language education. What, however, is subject to debate is whether it is input alone that is conducive to learner development or if the input needs to be modified – beyond merely comprehensible – in order to trigger acquisition processes. While Krashen subscribes to the former stance (cf. Long’s concept of fine-tuning, refuted by Krashen; Section 3.1), there are a number of researchers that advocate input manipulation in the form of highlighting certain aspects of language and/or increasing the frequency of their use. These two, implying qualitative and quantitative input enhancement, respectively, are discussed below. However, before the two enhancement modes are presented, one important assertion needs to be made: the presented examples of treatment very rarely use input enhancement in isolation; what is much more common is a hybrid instruction design, where highlighting certain aspects of input as well as

156

Chapter Two

increasing the frequency of use of a certain form is combined with explicit learning forms such as discovery techniques, inductive reasoning, awareness raining or direct instruction. This is one more proof that form-focused instruction is much better understood as an implicit/explicit continuum or collaborative mode rather than in terms of implicit/explicit oppositions. The qualitatively enhanced input will be the answer to salience, which is one of the constraints on noticing (Schmidt 1990). Consequently, input made noticeable will be a spoken or a written text, in which selected target language forms are made more conspicuous. In the case of aural input, this implies a change of pitch/volume (higher/lower sounds; speaking up/down) or a different delivery mode (slowing down, singing). Written texts, in turn, graphologically highlight the target form by underlining/bolding/colour-coding etc. The latter treatment is often labelled VIE (visual input enhancement), a term introduced by SharwoodSmith (1981) in relation to consciousness raising. Aural enhancement techniques, especially volume change, are most frequently – though not mandatorily18 – used in recasting, which is an implicit error correction routine, in which the teacher conversationally reformulates the learner’s erroneous utterance, rendering it in the grammatically acceptable form. Unfortunately, this and the above-mentioned examples are given based on pedagogical practice rather than literature to date, as there is only one study (Doughty and Varella 2004) into this type of input enrichment19. Visual input enhancement, definitely more popular than aural input enhancement, has been subject to ample research, whose results are presented here following Ellis (1999)20. An example of such studies is the one carried out by Jourdenais et al. (1995). The researchers demonstrated that English-speaking L2 learners of Spanish who had been exposed to texts containing graphologically highlighted preterit and imperfective verb forms used these forms in production much more frequently than learners who had worked with unenhanced texts. A similar treatment, combined with error correction and reflective learning was applied to a similar effect by Leeman et al. (1995). However, concerning the hybrid, implicit/explicit instructional mode of the latter study, its results may not be due to input enrichment alone, as pointed out by Ellis (1999). A more recent application of VIE can be found in a Polish study by Pawlak (2006; Study 1). The visually-enhanced target forms included past tense morphology (VIE: italics) and passive voice (VIE: bolding). In both cases the VIE treatment was combined with recasting and output enhancement (output based tasks; for details see Pawlak 2006; section 4.1.1.2.2). Based on his results, Pawlak (2006) concludes that the ––––––––– 18 19 20

Recast is a recast without aural input enhancement. To the best of my knowledge. For a full review of the 16 studies implemented since 1991, when the treatment gained popularity, cf. Lee and Huang 2008.

Language instruction: Theoretical underpinnings …

157

“relatively implicit” (2006: 426) instruction proved beneficial in the case of both the simpler (past tense) and the more complex (passive voice) structure leading to increased learner awareness and rare overuse of each of the said target forms. The other input enhancement technique to be discussed involves increasing the frequency of use of a certain structure in input and, consequently, is called input flooding. Ellis (1999) quotes the results of studies in which such treatment was applied. Trahey and White (1993) as well as Trahey (1996) showed that quantitatively enriched input gives results comparable to those brought about by explicit instruction; input flooding, however, does not help eliminate non-target interlanguage rules. The effect of the increased frequency was also tested in a hybrid design study carried out by Pawlak (2006; Study 2), in which input flooding was used alongside input enhancement, dictogloss21 and consciousness raising. As pointed out by the author, the treatment resulted in improved performance on oral communication tasks, increased learner awareness of the target form as well as the ability to monitor their production (a greater amount of self-repair was observed after the treatment). However, as in the case of some of the earlier quoted studies into quantitative input enhancement, it is difficult to decide whether the pedagogical success of Pawlak’s treatment was due to increased frequency of the target form alone; it seems more probable that it resulted from a particularly successful combination of different routines. By analogy to these two types of input enrichment, we can also talk about two implicit output enhancement options: – quantitative – a communication drill, with special regard to whole-class mêlées in which students complete communication tasks as a result of a repetitive application of some structures (Hadfield 1987). The effectiveness of such forms of practice is noted by DeKeyser (2003, 2004); or – qualitative, in which the frequency of use is additionally strengthened by the mode of delivery (song, chant, poem) (Graham 2006; see also Turula 2010). These two forms are a humanistic variation on input flooding based on a repetition drill typical of the Audiolingual Method. Finally, one more input enhancement mode should be mentioned – the one of interpretation-task enrichment, a term introduced by Ellis (1995) to refer to tasks which draw the learner’s extensive attention to a certain target form while intensively focusing on some factors related to the form. This will mean implicitly ––––––––– 21

In dictogloss the teacher dictates a passage at normal speed (no slowing down; no repetition). Students put down key words. When the dictation is finished, they have to reconstruct the text paying attention to structure.

158

Chapter Two

focusing on the form by explicitly paying attention to its meaning; context of use; other similar forms; etc. As an example of such treatment, Ellis (1999) quotes Doughty’s study (Doughty 1991), in which the implicit focus on form was carried out via explicit focus on meaning. The results of the meaning-oriented treatment, as Doughty (1991) observes, were comparable to those observed in the ruleoriented group. Besides, the meaning group outperformed the rule group on reading comprehension tasks. In his concluding remarks on different forms of input enhancement, Ellis (1999: 70) observes: 1. There is some evidence that enriched input with highlighting induces noticing of target features. However, little is yet known about whether highlighting is necessary or which method of highlighting works best. 2. There is fairly convincing evidence that enriched input (with or without highlighting) can help L2 learners acquire new grammatical features and use partially learned features more accurately. Moreover, the evidence suggests that acquisition can be maintained over time. However, clear positive effects are only evident when the treatment provided is prolonged and exposes learners to frequent use of the target structure. 3. There is some evidence to suggest that enriched input (positive linguistic data) does not enable L2 learners to eliminate non-target interlanguage rules. 4. There is some evidence to suggest that when the exposure to enriched input is substantial, it works as, or more, effectively than explicit instruction in promoting acquisition of complex grammatical structures. However, explicit instruction may be more effective with simple structures.

3.3. Consciousness Raising22 In agreement with the graduity of presentation scheme adopted the present chapter – and very much according to its proponent himself (Rutherford 1987) – consciousness-raising will be situated half-way between the implicit focus-on-form approaches to language learning and the form-focused instruction proper, which hinges on imparting linguistic knowledge to the learner by means of instruction which is explicit, inductive or deductive. Rutherford finds fault with both these approaches based on the claim that in their case the relationship between what the learner knows about language and what and how they are taught is far from perfect. ––––––––– 22

The term was first used by Sharwood-Smith (1981) but it is best known owing to Rutherford’s adaptation of the idea (Rutherford 1987). Currently, it is often used interchangingly with focus-on-form. As Hinkel and Fotos (2002: 6) put it, “many teachers and researchers currently regard grammar instruction as ‘consciousness raising’ ... in the sense that awareness of a particular feature is developed by instruction even if the learner cannot use the feature at once”.

Language instruction: Theoretical underpinnings …

159

As for the former implicit stance, Rutherford himself discards the idea that mere exposure to input is sufficient for the emergence of language as a system, claiming that “we have access to considerably less than the necessary range of data for making appropriate generalisations; this is especially true in situations where the classroom is the only resource for such data” (1987: 18). To support the view, he quotes a number of studies proving the point: Pica’s (1983), who argued for classroom instruction as having a positive effect on hypothesis formation; Harley and Swain’s (1984), in the light of which exposure to input alone is not a sufficient condition for learning (though it is necessary); as well as Spada’s (1986), according to whom both form-focused and function-focused practice are required for successful language learning. On the other hand, when it comes to the traditionally understood instructional approach, the problem lies in its main assumption that “[l]anguage learning ... entails the successive mastery of steadily accumulating structural entities (Rutherford 1987: 4). According to Rutherford (1987), the idea of the accumulation of such structures stems from the view that the essence of language is the assemblage of hierarchically arranged constructs and the essence of language learning/teaching is the explicit, overt, direct imparting of these constructs to the learner. These two assumptions are, as Rutherford (1987: 17) observes, based on two beliefs cherished by the traditionally-minded language teaching professionals: firstly, that human language is built up of discrete entities and learning it amounts to the accumulation of these elements; and secondly, that the knowledge about the essential characteristics of these elements can actually be imparted to the learner. These two assumptions are, in Rutherford’s view, exactly where the traditional approaches err. This is because language per se is more than just an assembly of structures; it is an intrinsically complex system. As a result, he claims, it is practically impossible to actually lockstep the instruction in any pedagogically sensible way. Consciousness-raising (C-R) is presented by Rutherford (1987) as an alternative to both input-only approaches and the above-mentioned teachercontrolled gradual agglutination of structures. Contrasted with the former, the C-R approach is definitely interventionist: it involves a certain pedagogical effort that goes beyond mere exposure not only by making input comprehensible or enhancing it but also, and more importantly, by making input matter in the context of language as a system. The assumption that the target language is a cognitive system, “an unfathomably complex labyrinth of intertwining, overlaying and convoluted phonological, syntactic, semantic and pragmatic relationships” (7) – the key word being the relationships between structures and not the individual structures themselves – is in turn, where, according to Rutherford, the C-R approach departs from traditional language pedagogy. Mastering a language through consciousnessraising will amount to developing awareness of the above-mentioned relationships and not to achieving a high-percentage of accuracy of the used forms. Another

160

Chapter Two

difference between C-R and conventional grammar teaching is that consciousnessraising “is a means of attainment of grammatical competence in another language ... whereas grammar teaching typically represents an attempt to instill the competence directly” (Rutherford 1987: 24). What is important is that in doing so, C-R does not treat the learner’s mind as a tabula rasa which needs to be filled in with language constructs; conversely, it is based on the assumption that learning occurs if and only if what is to be learned can be related to what is already known. The already known, which the learner contributes to the learning process, is of a dual character and comprises: 1) the knowledge that – which Rutherford defines as the unconscious foreknowledge23 or innate inkling of how languages are organised; and 2) the knowledge how – the ability to bend language forms to satisfy the desire for rudimentary communication (what we know about how language can be used). While both kinds of knowledge are universal to all humans, the former is called the knowledge of universal principles and the latter – of universal processes. Based on Rutherford (1987), the above-mentioned universals – the principles and the processes – are a point of departure for C-R based pedagogy because the roles of consciousness raising as well as its modes are related to these principles and processes. To start with, these roles and modes in relation to universal principles include: making the general principles available through the accumulation of language data that allow for the formulation and testing of hypotheses; and working from the familiar – what one already knows about language as such and about the target language – towards the unfamiliar. When it comes to universal processes, in turn, the role of C-R will amount to bridging the gap between expressing certain functions in the native and target languages. As a result, any pedagogical intervention of this kind will have to be (i) target-language specific; (ii) carried out in a disciplined and controlled fashion; and (iii) motivated by the relationship we are considering: the one between syntax and semantics as well as the between syntax, semantics and discourse. As for the syntax-semantics relationships, some of them are, according to Rutherford, more challenging for the learner for a number of reasons. In some cases, “the challenge stems from the straightforward need simply to learn a new syntactic network for the realisation of semantic concepts – as would be the case for the learning of any new language” (Rutherford 1987: 84). Other problematic aspects will be language specific. Difficulties in this area will spring from: collocation (what goes together with what) or lexical properties (synonymy, derivation, cohesive ties, etc.). Consequently, in the case of the need-to-learn challenge, the C-R treatment will amount to following the communicative in––––––––– 23

What we know about language as a system in general; the unconscious quality of this knowledge makes it different from the declarative knowledge that proposed in the ACT/ACT-R model and closer to a kind of language instinct – cf. Pinker (1994).

Language instruction: Theoretical underpinnings …

161

tentions of utterances and trying to pinpoint the lexical and syntactic properties which ensure the realisation of such intentions. When it comes to languagespecific difficulties, it will be the problematic areas listed above that should be subject to consciousness-raising. The syntax-semantics-discourse problems, on the other hand, will require consciousness-raising activities based on what the learner already knows about the tri-lateral relationship. This will include (Rutherford 1987: 97) being aware that: 1. the ways in which we, as humans, interact with each other through language conspire to bring about within discourse the arrangement of whole chunks of propositional content in preferred sequences; 2. the crucial semantic relationships destined for destruction in the placement of these chunks are rescued through grammatization; and 3. grammar thus ensures that the entire discourse/semantics complex becomes processable for comprehension. In interpreting the above quotation, we can state that human interaction via language is considerably conventionalised. As a result, certain form-meaning relationships are highly automatised and, consequently, unanalysable; this is what Rutherford means by the destruction of semantic relationships. This may pose a considerable problem for an L2/FL language learner who, in order to fully appreciate the semantic weight of a given form, has to be able to process the formmeaning relationship either in the course of exposure or as a result of pedagogic treatment. As exposure in classroom learning is insufficient (Rutherford 1987; cf. earlier in this section), such processing needs to be C-Red in the learner so that it can bring about the said “rescue through grammaticization24”. The activities that Rutherford (1987) sees as suitable in this respect include, first of all, making certain language elements available to the learner by means of massive exposure to language as well as the more specific tasks of: asking the learner to decide about the linear organisation of the elements in a sentence/text; sensitising the learner to the importance of discourse organisation (a sample activity may be reordering jumbled sentences); consciousness raising in the area of constituent relations (hyponymy; anaphora; cohesion); and many others. In practice, as suggested by Rutherford, such treatment may involve (in the case of English as the target language): – topic/comment  subject/predicate conversions; – word order conversions; ––––––––– 24

It is important to, point out that Rutherford understands grammaticisation as reanalysis, reinterpretation and reorganisation of the known – the native language as a system or the interim system of the target language – towards the new – the target language system).

162

Chapter Two

– loosening the linear, immediate relationship between form and meaning (The subject that I’m interested in IT  the subject that I’m interested in) as well as acquiring multiple structures to express the same meaning (His accent is hard to imitate; It’s hard to imitate his accent; To imitate his accent is hard); – changing the noun-to-verb ratio (normally 3:1; lower in less advanced learners); or – gradual introduction of coordination and subordination of clauses. From the practical point of view, all the above-mentioned instructional routines, though decidedly interventionist, differs from traditional grammar pedagogy in a number of ways. First of all, while traditional instruction involves explicitness and directness in pedagogical interventions, consciousness-raising should rather be understood as facilitation and not as instruction. As Rutherford himself puts it, C-R is understood as “the illumination of the learner’s path from the known to the unknown” (1987: 21), and amounts to aiding the learner’s gradual reanalysis of the native tongue structure as the target language structure. A similar view is expressed in Stevick (1980: 251), who claims that consciousness-raising: casts light on the unfamiliar pathways and the arbitrary obstacles through which [the student] must eventually be able to run back and forth with his eyes shut... the teacher needs to know the same pathways and obstacles ... on top of this there are skills of knowing when to turn on the spotlight and when to turn it off, and knowing how just to aim it so that it will help the student instead of blinding him.

Secondly, in the course of C-R facilitation “language is seen as not just successive states of ‘being’ but also long avenues of ‘becoming’” (Rutherford 1987: 42). Consequently, these two approaches differ in their product vs. process approach to language and its pedagogy. In the case of traditional grammar instruction, “it would seem that if one has to ‘get a handle’ on something, the most natural kind of thing to try to grasp in this way is a solid, stable, fixed piece of the total language product – something with edges on it ... in other words, a language construct” (Rutherford 1987: 56). Consciousness-raising, on the other hand, understood as a processcentred activity, concentrates on language in a way that is holistic and, as a result, there are no edges and it is impossible to get a handle or grasp of anything. The final product/process difference is best referred to by means of a set of twin metaphors proposed by Rutherford – the alternative views of language as an organism as opposed to a machine. As Rutherford puts it, “[o]rganism is a better metaphor than machine for what we know about language” (1987: 37), because of the associations: machine – precision, organism – plasticity; machine – constructed, organism – grows; machine – linear interconnections; organism – cyclical intercon-

Language instruction: Theoretical underpinnings …

163

nections; machine – sterile, organism – fecund. The metaphors, according to Rutherford (1987), have a bearing on how we conceptualise language: in the light of the metaphor of language as a machine, the workings of language as a system are determined by the individual workings of the units of language and the outcome of their combination. Consequently, learning a language is seen in terms of the general product – language mastery – being component of these individual products – the mastery of individual grammatical structures achieved based on imparting knowledge to the learner which is explicit and discrete. What it comes to the language-as-an-organism stance, the hierarchy is reversed: it is the workings of language as a whole that determine the operation of these individual units. As a result of such a holistic approach, language learning is seen in terms of process and language instruction – as was mentioned before – means lighting the paths of this process, in a way that enables the learner to gramaticise – reanalyse and reconstruct – their internal language system. As a consequence of the latter philosophy, what the language learner is taught is not grammar but rather how to learn or how to manage his own learning. Rutherford (1987: 153) observes: Target language grammar enters the learner’s experience not as an objectified body of alien knowledge to be mastered or as obstacles to be overcome but rather as a network of systems in which the learner is already enmeshed, the full grammatical implications of which he alone has to work out on the basis of what he comes in contact with what he himself contributes as an already accomplished language acquirer.

This means that grammar is not in command but in the service of learning and the learner, who, liberated from the lockstep patterns of instruction, follows their own learning schedule and enjoys the benefits of the individualisation of learning, collaboration, open-endedness of classroom procedures and learning outcomes. A good example of a contemporary instructional proposal which is organic rather than mechanic in nature is the Lexical Approach which creates “a lexical syllabus which subsumes a structural syllabus, it also indicates how the structures that make up the syllabus should be exemplified. It does this by emphasizing the importance of natural language” (Willis, 1990: vi). Consequently, to cite the main proponent of the Lexical Approach (Lewis 1993; 1997: 15), in language learning, more attention should be paid to: lexis – different kinds of multi-word chunks; specific language areas not previously standard in many EFL texts; listening (at lower levels) and reading (at higher levels); activities based on L1/l2 comparisons and translation; the use of the dictionary as a resource for active learning; probable rather than possible English; organizing learners’ notebooks to reveal patterns and aid retrieval; the language that learners may meet outside the classroom; and preparing learners to get maximum benefit from texts. At the same time, less attention should be paid to: sentence grammar; uncollocated nouns; indiscriminate recording of new words; and talking in L2 for the sake of it because you claim to use the communicative approach.

164

Chapter Two

Summing up the language-as-machine and language-as-organism metaphors, Rutherford (1987: 154-155) presents the two views of grammar-centred pedagogical programmes in the following schematic way: objectives

curriculum

methodology

mechanic view of language knowledge of language structure well-formedness grammar as an end teacher organised structures linear exhaustive uniqueness hierarchical accumulation agglutination product-oriented language/learner distance increasing complexity an end (necessary and sufficient) memory specific rules predictable, closed objectification competitive, divisive rule articulation group-focused grammar as an obstacle grammar in command speeding up (time needed for production) cadential (lockstep learning)

organic view of language knowledge of language system grammatical understanding grammar as a means teacher/learner organised operations cyclic selective relationship holistic metamorphosis fusion process-oriented language/learner proximity progressive reanalysis a means (necessary but not sufficient) understanding general principles unpredictable, open incorporation cooperative, supportive operational, experience individual-focused grammar as a facilitator grammar in service slowing down (time needed for reflection) chaotic (differential learning)

What, however, needs to be pointed out is that, in fact, Rutherford (1987) himself, though decidedly pro-organic, treats the two metaphoric conceptualisation of language as two sides of the same coin, one applicable in the area of the formal properties of language, and the other – to language functions. This would justify a more machine-like approach to form and a more flexible, organic, approach to meaning. This suggestion is very appealing, especially in the light of the attitude to communication systems proposed by Lem (1994) based on the idea put forward by a Russian mathematician, statistician and humanist Nalimov (1974, 1982). One of the problems Nalimov dealt with was the concept of probability in linguistic representations. In doing so, he asked, among other things, “[h]ow ... the fuzzy, probabilistically weighted vision of the world [can] be combined with formal logic, which we cannot afford to reject” (1982: 15). His own answer to the question was that (Nalimov 1982: 281):

Language instruction: Theoretical underpinnings …

165

[t]he insufficiency of logic in everyday language is made up for by the use of metaphors. The logic of the text and its metaphorical side are two mutually complementary phenomena. And, to my mind, further evolution of linguistic means should proceed by deepening language complementarity rather than by searching for another kind of logic.

Departing from this, Lem (1994) divides communication systems – or codes – into hard and soft, or rather proposes a continuum with hard/soft extremes. Hard languages, usually contextless, are the codes or notation systems of mathematics, computer programming or the Morse code. At the other end of the continuum are fairly flexible codes, such as metaphors or the language of art in a general sense. Placed on this spectrum of options, form (morphosyntax) will naturally be closer to the hard end of the continuum while semantics – with its naturally fuzzy boundaries, personal and social meanings – will occupy a place among the softer, more flexible options. Consequently, as will be argued later in the book, it might not be entirely farfetched to think of diverse instructional options, whose specificity will match either the more machine-like approach to form or the organic approach to form-meaning mappings. These considerations will be returned to later in the present work. Considering consciousness-raising as an actual pedagogic proposal, it is important to point out that research to date contains a number of examples of the successful implementation of various forms of C-R treatment; some of these studies are briefly introduced below. However, for the terminological clarity of the presentation, it needs to be pointed out that the term consciousness raising (CR) is used alternatively with language awareness (LA) raising, the latter introduced – unlike C-R, which is an American invention – on the European side of the Atlantic by Hawkins (1984; see Piechurska-Kuciel 2005 for a comprehensive diachronic and synchronic overview of the approach). Conceptualised similarly to consciousness-raising as “a mental attribute that develops through paying motivated attention to language in use [enabling] language learners to gradually gain insights into how language works” (Bolitho et al. 2003: 251), LA is built through discovery techniques and asking questions about the workings of a/the language (Hawkins 1984); it is manifested in the learner’s “enhanced consciousness of and sensitivity to the forms and functions of language” (Carter 2003: 64). The main principles and objectives of the language awareness approach are similar to those of consciousness-raising: engaging the learner affectively in communicative activities while encouraging them to invest their attentional resources into investigating the formal properties of input. Detailed language awareness raising techniques are similar to those proposed by Rutherford (1987) and can be described as analysing structure in context, speculating about meaning consequences of structural change and so forth. As mentioned above, the rationale for the C-R/LA approach to language instruction as well as potential classroom implementation of the approach have

166

Chapter Two

been examined by a number of studies a selection of which is presented below. Some of them are hybrid solutions (like the studies by Fotos, Izumi, Mohamed, Mackey as well as Laufer and Girsai) as – similarly to input enhancement – it is difficult to find C-R/LA treatment that would rely on one option alone. As for the said research, its body includes studies in which: – White (1991) proved the effectiveness of consciousness-raising activities in a verb movement study carried out from the UG perspective; – Fotos (1993) utilised two types of consciousness-raising exercises: implicit, in the form of structured output tasks, and explicit, providing for solving grammar problem tasks on the basis of positive and negative information provided to the learners; she discovered that both types of tasks promoted significant levels of noticing of the structure in question – comparative forms, question tags, indirect object placement – in subsequent tasks; – Yip (1994) observed that consciousness raising by means of directing learner attention to specific grammatical features in the form of a problem-solving task was successful in teaching English ergatives to Chinese learners; what she emphasises is the degree of interest and participation – and, consequently, intrinsic motivation – that resulted from the treatment; – Hales (1997) used learning based on the analysis of transcripts of recorded learner conversations to successfully raise language awareness concerning English future forms as well as differences between speech and writing or the meaning of words; – Thornbury (1997) suggested reformulation and reconstruction activities, such as dictogloss, to promote noticing; – Julian (2000) used awareness-raising activities – which consisted in dissecting the meaning of close terms and looking at their full semantic representation – to effectively increase accuracy and lexical force as well as to expand the active vocabulary of upper-intermediate learners; like Yip (1994; cited above), Julian emphasises the potential of such activities for increasing intrinsic task motivation; – Lynch (2001) proved that collaborative transcription of own texts – a study formally similar to the one carried out by Hales (1997) – reinforced by interactive feedback from the teacher resulted in greater output accuracy on the part of the learner participants; – Izumi (2002) studied the effects of output activities and visually enhanced input on the acquisition of English relative clauses. The treatment utilised in the course of the study included text reconstruction tasks and text comprehension. While Izumi does not report any positive effects of visual input enhancement, it is pointed out what the output promotes (2002: 573): noticing of the formal properties of language; integrative processing of the target structure; and noticing the gap; – Mennim (2003), in another reactive focus-on-form experiment, showed that transcription and analysis of self-produced material resulted in greater learner

Language instruction: Theoretical underpinnings …













167

accuracy in the area of articles, prepositions, passive structures and pronunciation; additionally, learner differences in their approaches to such a formfocus technique were observed; Mohamed (2004) looked at consciousness raising from the perspective of learner preferences concerning deductive and inductive C-R tasks and found that both types of activities are effective learning tools; Nitta and Gradner (2004) examined 9 contemporary intermediate coursebooks to find that all of them contained C-R activities (based on the organic rather than mechanic approach to structure) alongside traditional presentationpractice grammar spots; Eslami-Rasekh (2005) concentrated on raising pragmatic awareness through translation activities, saw them as moderately effective and argued for additional communication practice; Mackey (2006) discovered that interactional feedback, in the form of the teacher’s questions – recasts and negotiations – supports noticing and, consequently, learning; Vickers and Ene (2006) demonstrated that awareness-raising activities based on comparing own text to that authored by a native speaker lead to improvement in grammatical accuracy in the area of the past hypothetical conditional; Laufer and Girsai (2008) found a place for explicit contrastive analysis and translation activities in learning second language vocabulary and collocation.

Even though the above listing shows that – as in the case of form-focused instruction – conscious raising is subject to different interpretations, there are some common characteristics that link the above-mentioned experimental efforts: all of them treat language holistically rather than analytically25; they also activate semantic relations networks through translation or considering available formmeaning alternatives. As a result, they are what Rutherford (1987: 21) calls “the illumination of the learner’s path from the known to the unknown” rather than ways of overt pedagogic treatment of language units, even if, like some of them, they focus on a specific form. 3.4. Explicit focus on form: the experiential classroom One of the learner routines involved in consciousness raising activities is discovery learning. If such learning goes beyond increasing language awareness into the intended formulation of rules, we can talk about explicit inductive focus on form. Such instruction is experiential and explicit in the sense that through interaction with language learners induce its rules. ––––––––– 25

The word rather is used as some of the studies – Fotos (1993), Mennim (2003) and Laufer and Girsai (2008) – use C-R treatment alongside explicit instruction.

168

Chapter Two

Discovery language learning is a concept which goes back to 1960s and Gattegno’s method called the Silent Way. This method, as argued by its proponent (Gattegno 1972), is based on three principles: learning is effective if it is based on discovery and creation rather than on remembering and repetition; learning is facilitated by using accompanying physical objects (Cuisenaire Rods); and learning is facilitated by problem solving. A relatively new approach to language pedagogy which relies on discovery is Data-Driven Learning. Similar to the Silent Way in the sense that it values discovery and creation over memorisation, and it involves problem solving, DDL has replaced Cuisenaire Rods with language corpora26: it is carried out by means of the study of language as a system based on a number of accumulated authentic texts, a store of written and/or spoken language. According to Huston (2002) the pedagogical utility of language corpora in focus on form stems from the fact that they: – allow students to discover language for themselves; real language, as corpora do not contain the simplified, constructed and artificial language samples used in most coursebooks; – give information about how language works, which is not simply based on native speaker intuitions but confirmed statistically; – show the relative frequency of certain forms: their collocation tendencies, semantic prosody and details of phraseology; and – allow students to observe nuances in meaning, make comparisons between languages (if parallel corpora are subject to study). Consequently, as I argue elsewhere (Turula 2010), when using corpora, learners act as language detectives, discovering facts about the language. In doing so, they improve their general skills to deduce meaning from context. This is pedagogically important as, generally speaking, people are more inclined to remember what they have worked out on their own. Additionally, when working with different language corpora, the learner can compare the grammar of spoken and written language and see there are a number of grammars. Besides, studying a language based on a corpus allows for interacting with authentic language data and modelling one’s language behaviour on that of a native speaker. Finally, corpora are a form of input flooding; as such, they use the frequency of occurrence as a learning-triggering factor. Another experiential learning mode, in which learners focus on form in the course of communicative task completion, is Task-Based Learning (TBL). ––––––––– 26

According to Huston (2002), the origins of DDL most probably go back to 1987, when the Collins Birmingham University International Language Database (COBUILD) project was started.

Language instruction: Theoretical underpinnings …

169

Skehan (2003: 95), following Candlin (1987), Nunan (1989), Long (1989), defines a task as an activity in which: – – – – –

meaning is primary there is some communication problem to solve there is some sort of relationship to comparable real world activities task completion has some priority the assessment of the task is in terms of outcome

The above-described nature of a task guarantees that the combination of activities aimed at its execution will cater for both the communicative use of target language on the one hand and, on the other, learner motivation to use target language and pay attention to its accurate use. The former will be based on the interest aroused by: the task; its relation to real-life context; the fact that the activity is meaningful; and the challenge provided by the problem to be solved. In turn, efforts to be accurate will be undertaken because the task is evaluated in terms of outcome – formal outcome included – and that the outcome has to be presented in front of the teacher and fellow students, which is seen as reason enough to make the presentation attractive in terms of content, formally polished and complex. The fact that TBL pedagogy provides a balance between fluency and the authenticity of language use as well as formal accuracy can be seen in the tripartite task framework put forward by Willis (1996: 52) presented below: Pre-task Introduction to topic and task Teacher explores the topic with class, highlights useful words and phrases, helps students to understand task instructions and prepare. Students may hear a recording of others doing the same task. Task cycle Task

Planning

Report

Students do the task in pairs or small groups. Teacher monitors from a distance.

Students prepare to report to the whole class (orally or in writing) how they did the task, what they decided or discovered. Language focus

Some groups present their reports to the class or exchange written reports and compare results.

Analysis

Practice

Students examine and discuss specific features of the text or transcript of the recording.

Teacher conducts practice of new words, phrases and patterns that occurred in the data during or after the analysis.

170

Chapter Two

The stages of the TBL model provide for: pro-active focus on form, either inductive or deductive (the pre-task phase); formal facilitation and scaffolding on part of the teacher – pro-active or reactive; explicit or implicit – as well as negotiation of meaning and restructuring resulting from learner-learner or learnerinstructor interaction (the task cycle); and finally re-active focus on form, through learner self-evaluation as well as by means of error correction activities, explicit and implicit (language focus). In addition to the above-mentioned input-based attention to formal properties of language, Task-Based Learning gives numerous opportunities for output-based focus-on-form, motivated by an appropriate structuring of tasks that are to be performed by the learners. Skehan (2003: 122) calls this approach Machiavellian, as it requires the task to be prepared in a way that it constrains language depending on what the teacher wants to achieve. Loschky and Bley-Vroman (1993 cited in Skehan 2003: 122) suggest that in structure-oriented activities, the task and the language to be used can be related in three different ways: – Naturalness – target structures are not forced; alternative language is possible and acceptable; – Utility – target structures facilitate task completion but can be replaced by other structures or the use of communication strategies; and – Essentialness – target structures are indispensable for task completion (the most desirable situation in structure-oriented task pedagogy). Skehan argues for a middle-of-the-road approach in which the “naturalness of tasks still has importance, but attempts are made, through task choice and methodology, to focus attention on form” (2003: 125). Such an approach is based on the so-called information-processing tasks, whose pedagogy – more organic than mechanic, in fact – requires the implementation of five principles (Skehan 2003: 129-131): – choose a range of target structures: it is both ineffective and inherently artificial to choose one target structure; keep track of interlanguage development and identify a range of structures to be used; – choose tasks that meet the utility criterion: create appropriate task conditions and hope the range of structures would be used; – sequence tasks to achieve balanced goal development: tasks have to be at the appropriate level of difficulty and they have to be focused in their aims between fluency, accuracy and complexity; – maximise the chances of a focus on form through attentional manipulation: the pretask phase should increase chances for noticing; the planning and report phases should emphasise content as well as form; there must be an opportunity for on-task reflection, again in terms of content and form; raise learner consciousness; and – use cycles of accountability: involve learners in cycles of self-evaluation of how much language progress they have made.

Language instruction: Theoretical underpinnings …

171

An instructional model which caters for integrating another experiential learning option – the Content-Based Learning – with learning strategies (discussed within the explicit learning paradigm in Section 2) was proposed by Chamot and O’Malley (1994) and is known as CALLA (Cognitive Academic Language Learning Approach). As one of its proponent claims, CALLA enables students to “develop the ability to use the target language for learning essential academic content and language and to become independent and self-regulated learners through their increasing command over a variety of strategies for learning” (Chamot 1998: 2). CALLA has three major components: – lessons focusing on high-priority topics from content subjects such as science, math, social studies, and literature; – development of the language needed for understanding, speaking, reading and writing about content topics; as well as – explicit instruction in using learning strategies for learning both “language and content in the second language” (Chamot 1998: 2). The model is largely based on the knowledge and implementation of strategies: metacognitive, cognitive and social/affective. It, however, is the content that determines which strategies need to be selected. The CALLA lesson plan model is based on the following cyclical progression (O’Malley and Chamot 1990: 201-203): preparation: activation of student background strategy knowledge (what they know about strategies and how they usually set about doing tasks); presentation: selection of the strategies to be used in approaching a particular task; modelling strategies in (think-aloud) reflection on individual strategy utility while on task; practice: giving students ample opportunities to use strategies in tasks similar to the one used for modelling and encouraging reflection and focusing students’ attention on the task-strategy use relationship; and evaluation: discussing the helpfulness of strategies in relation to context as well as skills practiced; and expansion: offering further practice opportunities in a variety of tasks and reminding students to use strategies in the course of it. 3.5. Explicit focus-on-form: direct instruction As was mentioned before, explicit, instructed language learning is usually associated primarily with the traditional teacher-fronted classroom. That is why the first instructional mode that comes to mind in relation to such pedagogy is the Presentation-Practice-Production cycle, whose structure mirrors the ACT/ACT-R model as well as the phases of direct instruction discussed in Section 1.2. Discarded some time ago for the benefit of the task-cycle pedagogy (discussed in the previous section), the Presentation-Practice-Production mode is

172

Chapter Two

currently finding its way back to the language classrooms. This is because of the ongoing teachers’ belief in its pedagogical utility. However, as the new PPP needs to be both effective as well as accommodated within the communicative approach to language learning, each of the three components has been subject to a number of amendments, or has been supplied with a set of guidelines which make it more learner-centred. Addressing the Presentation stage, Westney (1994) argues for the presented rules to be “(a) accurate, (b) usable and (c) capable of gradual integration into broader patterns which reveal more of the structure of the language” (1994: 7677). These three requirements can be satisfied if teachers find a good balance between the rules of thumb and rules of grammar, starting with the former, simpler and more down-to-earth option and moving towards a more formalised set of abstract generalisation, as Westney (1994) argues, following Berman (1979). What, in addition to the graduity of difficulty, makes the teacher explanation accurate and usable by the learner is its compliance with Gricean (1975) co-operative principle. As I argue elsewhere (Turula 2010: 391), a good explanation will be: 1. reliable: the teacher should be certain that the explanation (s)he offers is well grounded in TL grammar. If the teacher does not have the necessary knowledge, they should put the explanation off or direct the learner to a relevant source; 2. relevant: the teacher should discuss only the issue in question; 3. short and to the point: digressions and/or discussing other – even if related – issues can be confusing; and 4. simple in terms of the metalanguage used. When it comes to Practice, the contemporary PPP, like any other focus-on-form approach, has to acknowledge the already emphasised fact that in FFI the best results are achieved if paying attention to formal properties of language is combined with exposure to the authentic communicative context (Ellis 1994 and later works; the studies presented in Doughty and Williams 2004). This implies that teacherlearner discourse exchanges ought to be referential rather than display in character. This also means that practice should enable the learner to see grammar as choice (Halliday 1994 and 2003; Nunan 1996; Larsen-Freeman 2003). In doing so, the upto-date practice should, according to Larsen-Freeman (2003: 59-66): help abandon the “one right answer” myth; base the choice of structure on social-interactional factors, identifying intentions (including pragmatic factors) behind the use of structure; and teach was is probable rather than what is possible (cf. also Lewis 1993), avoiding discourse that is grammatically correct but unlikely to be used by a native speaker of the target language. All this, as I argue in one classroom-oriented publication (Turula 2006b), may mean that practice can amount to: choosing different grammatical forms for a given piece of discourse while simultaneously

Language instruction: Theoretical underpinnings …

173

analysing possible changes of the speaker’s meaning (his/her attitude; identity or power relations they want to imply; cf. Larsen-Freeman 2003) attached to each form; collecting authentic samples of use of a given structure (a tense; a modal) instead of doing traditional 20-sample exercises containing it; and, most importantly, supplement the traditional one-sentence practice by going beyond it, into discourse-based contexts of grammar practice. As far as output practice is concerned, it should be meaningful, engaging as well as present a challenge in terms of form, meaning and use (Larsen-Freeman 2003: 117). When combining practice with the communicative contexts, we should also remember that each of the two can be a point of departure for the other. According to Azar (2007), the decision which direction to take lies at the root of distinguishing between focus-on-form and Grammar-Based Teaching (GBT). While the former seeks to accommodate attention to structure within communicative activities – this kind of practice was described, among others, in relation to the TBL model (see Section 3.4) – drawing the learner’s attention to grammar while (s)he is on task, GBT chooses to incorporate communicative activities within the structural syllabus. As a result, formal practice is separate, though integrated with CLT practice, and grammar is taught as one of the many components of a well-balanced language instruction (Azar 2007: 4). In other words, GBT practice will “proactively ... develop communicative competence in all skill areas through a widely varied practice opportunities and the inclusion of communicative methods” (Azar 2007: 6) only after it has been subject to direct instruction in a specially designed lesson or a series of lessons devoted exclusively to a given grammatical problem. Finally, as regards production, the activities in which the learner is involved should resemble real-life tasks. As for their design, they should help develop fluency, accuracy and complexity in a balanced way, which means that guidelines given in Section 3.4 with reference to output-based focus on form – as well as the characteristics of a good fluency/accuracy/complexity task presented in Chapter 1 – are applicable here as well. An instructional treatment which, in certain aspects, is convergent with PPP is Processing Instruction, “an input-based approach to focus on form” (VanPatten 1996: 6), whose pedagogy involves input modification in a way that makes it rich in a particular language feature as well as ways of pedagogical recycling of such input: activities that help learners pay attention to this feature. In order to incorporate meaning-bearing input into grammar instruction, Processing Instruction is divided in three parts (VanPatten 1996). Part one amounts to the explanation of the grammar point emphasising the connection between form and meaning. Part two presents the information about the best processing strategies for this topic, pointing out the strategies that lead to misinterpretation of meaning. Finally, part three consists of activities based on structured input, “purposefully

174

Chapter Two

manipulated sentences and discourse that carry meaning” (VanPatten 1996: 6). When it comes to the first two components of Processing Instruction, VanPatten gives the following guidelines (VanPatten 1996): “teach only one thing at a time . . . keep meaning in focus” (67); “learners must do something with the input . . . use both oral and written input . . . move from sentence to connected discourse” (68); “keep the psycholinguistic processing strategies in mind” (69). As for the part-three guideline, referring to input structuring raises the question of input finetuning (cf. Long 1983a; Ellis 1994: 273). Due to the three-partite structure, which makes PI similar to PPP, we can call VanPatten’s model – by analogy as well as disanalogy – presentation, practice, comprehension. What makes PI different from PPP is that VanPatten’s treatment involves “appropriate processing instruction activities [which] do not ask the learner to produce targeted grammatical items; instead, learners are pushed to attend to the properties of the language during activities in which they hear or see language that expresses some meaning” (1996: 6). The educational value of such provision of examples – proposed earlier in the form of collecting samples of use (Turula 2006b) – is emphasised by N. Ellis (1991 cited in Ellis 1994: 642), who demonstrated, in his comparative study of groups of adults, that mature learners benefit the most form structured explicit instruction if such instruction offers rules of language use plus examples of such use. Finally, similarly to experiential focus-on-form, explicit instruction – in addition to the above-mentioned pro-active options – utilises explicit error correction (explicit re-active FFI). Summing up Sections 3.4 and 3.5 we can say that the most important characteristic of explicit learning is the need for “systematic teaching efforts” as well as the need for the students “to be intentionally engaged in metacognitive, hypothesis-testing procedures or to be exposed to explicit knowledge conveyed in social discourses” (Rodrigo et al. 2002: 171). This makes explicit teaching and learning far more demanding in terms of effort and attentional resources than in the case of implicit learning which relies on exposure and an extensive rather than intensive awareness of input. However, Norris and Ortega (2000; 2001) argue that such investment of effort pays off. They emphasise the time factor claiming that explicit instruction has a much better effectiveness-to-time ratio: explicit transmission of knowledge – even if requiring greater investment in terms of learning effort and attention focusing – takes much less time than any form of implicit learning, the massive exposure the latter requires being extremely time-consuming. The question that may be asked at this point – particularly, though not solely in relation to sections 3.4 and 3.5 – is if the instructional effort investment pays off in terms of effectiveness as well or, in other words, Is (explicit) formal instruction conducive to learning and if yes – in what way?

Language instruction: Theoretical underpinnings …

175

In an attempt to answer this question, Ellis (see 1994: 611-645 for a detailed overview) mentions three types of studies to date which have looked into the effectiveness of instruction: studies investigating the relationship between instruction and levels of proficiency; studies looking at the effects of formal instruction on production accuracy; and finally, research into the relationship between form-focused instruction and the order of acquisition. When it comes to the first type of studies – those investigating whether learners who receive formal instruction achieve higher levels of proficiency than those who do not – their results are as follows (Ellis 1994): – formal instruction does make a difference but it leads to learning not to acquisition: the results are best visible in tests and not in natural production (cf. Ellis’s weak and strong interfacing between implicit and explicit knowledge discussed earlier in this chapter as well as his Variability Hypothesis); – such instruction may be particularly useful in adults (for whom implicit learning is more difficult) and beginners (who find it difficult to find comprehensible input from which to acquire a language implicitly); – the best result comes when formal instruction is combined with ample exposure to the target language in communicative contexts; and – generally, L2 learners develop greater accuracy and FL learners – greater communication skills as a result of instructed learning. The second type of studies described in Ellis (1994) includes samples of research into the effects of formal instruction on production accuracy. Their results show that: – formal instruction affects accuracy in planned (but not – except for one study – in unplanned) production; gains in accuracy are significant, especially if the structure under investigation is simple; a similar argument is put forward by Norris and Ortega (2000), who argue that instructed learning is beneficial but admit: “the measurement of change induced by instruction is typically carried out on instruments that seem to favour more explicit types of treatments by calling on explicit memory-based performance” (2000: 483) (cf. Ellis’s Variability Hypothesis discussed earlier in this chapter); – it is possible that instruction has a delayed effect on unplanned production (cf. Lightbown’s Delayed Effect Hypothesis discussed earlier in this chapter); – overlearning in one area of grammar may lead to errors in another area (in one study drilling in negatives led to error in declarative sentences); and – formal instruction may make learners cautious. Finally, Ellis (1994) quotes a number of studies investigating the effects of formal instruction on acquisition order/sequence which are characterised by the following results:

176

Chapter Two

– there are no significant differences between instructed and uninstructed learners in the area of developmental patterns; – premature instruction – ignoring the natural order of acquisition – may inhibit learning; conversely, if a grammatical structure is not subject to developmental constraints, it can be taught by means of formal instruction (cf. Pienneman’s Teachability Hypothesis discussed earlier in this chapter); and – it is possible to explain the meaning of a construction without enabling the structure for use in production (cf. Lightbown’s Delayed Effect Hypothesis as well as Sharwood – Smith’s Pedagogical Grammar Hypothesis discussed earlier in this chapter). When it comes to more recent studies, including, among others, Erlam (2003), Lyster (2004), Philp et al.(2006), Lyster and Mori (2006) as well as two Polish studies by Piechurska-Kuciel (2005) and Pawlak (2006) they, to a considerable extent, confirm the findings related to the effectiveness of formal instruction quoted in Ellis (1994). Although different in scope, length and individual selection of procedures, all of these studies share a number of characteristics. Their focus-on-form is based on the explicit presentation of the structure in question as well as on language awareness raising (Erlam 2003, Lyster 2004, Piechurska-Kuciel 2005); FFI is very often reinforced by fluency practice (Erlam 2003, Lyster 2004, Pawlak 2006) or elicited production (Philp et al. 2006, Pawlak 2006) during which such techniques as corrective feedback – including explicit correction, recasts and prompts – are used (Erlam 2003, Lyster 2004, Philp et al. 2006, and Pawlak 2006). Finally and most importantly, all of the above-mentioned researchers present quantitative data confirming the effectiveness of the undertaken pedagogical interventions: in all of these studies the groups that underwent form-focused instruction outperformed their controls on accuracy tests as well as demonstrated formally improved language production, the latter being the result of greater automatisation of the learned structure. In other words, the applied treatment has two positive effects – improved declarative knowledge of language and its more effective proceduralisation. What, however, needs to be pointed out is that the gains in fluency were observed in a controlled – rather than natural – environment. All this needs to be supplemented by one more finding reported throughout another ample body of research (Doughty and Williams 200427): focus on form gives the best result if it combines language instruction with exposure to meaningful input. The solution is in agreement with Ellis’s assertions that the best result comes when formal instruction is combined with ample exposure to the target language in communicative contexts. ––––––––– 27

Including: Long and Robinson 2004; DeKeyser 2004; Swain 2004; White 2004; Doughty and Varella 2004; William and Evans 2004; Harley 2004; Lightbown 2004; Doughty and Williams 2004.

Language instruction: Theoretical underpinnings …

177

The role of exposure, emphasised by all modern FFI approaches, can be ascribed to a number of factors, some of which were pointed out, following Krashen in Section 3.1 of the present chapter. First of all, it can be argued that it will provide the opportunity for implicit learning which, as argued in relation to instruction/knowledge and knowledge/knowledge interfacing, complements the explicit component and interacts with it. Secondly, exposure can be a means of providing examples whose importance in relation to instruction was discussed earlier in this section. Moreover, exposure to language incorporates all output options and the communicative context whose positive influence on language learning has also been discussed in this chapter. Finally, matching instruction with exposure has the potential to trigger different hybrid learning modes in which the implicit/explicit learning ratio will differ from learner to learner and from task to task. As a result, such a match is a way of catering to individual learner differences as well as various object-of-study constraints on pedagogy discussed in Section 4 below.

4. What kind of focus on form? Learner individual differences and object-of-study constraints Reflecting upon Section 3, we can state that there is ample research evidence that both the implicit and the explicit focus on form are conducive to learning; there are also numerous studies reporting inconclusive results as far as the effectiveness of different instructional modes is concerned. That is why, at this point of the argument, it is good to look at the question of what kind of instruction. The issue will be addressed in three different ways: Section 4.1 looks at the focus-on-form vs. focus-on-formS as two instructional options; Section 4.2 summarises findings on learner-instruction matching as well as presents a number of attempts to relate the implicit/explicit modes to learning styles; finally, section 4.3 tries to consider all focus on form options vis à vis different grammatical problems to be taught. 4.1. Focus on form or focus on formS? As was mentioned in the introductory part of the present chapter, one of the distinctions between different types of form-focused instruction is drawn along the lines of focus-on formS as opposed to focus-on-form (Long 1988, 1991). The former, as mentioned before, is often understood as traditional pedagogy, a teacher-fronted and linear treatment of grammar understood as a collection of discreet units. Focus-on-form, in turn, stems from a more flexible approach adopted as a result of a shift from the structural towards the more communicative model of language teaching. This shift can be traced back to mid 1970s. In his 1979 publication, Widdowson argues against a strictly linguistic, formal analysis of language, because, as he claims, it is difficult to reconcile the exactness of

178

Chapter Two

linguistic analysis and the open-endedness of communication. As he puts it (Widdowson 1979: 243): If meaning could be conveyed by exact specification, if it were signalled entirely by linguistic signs, then there would be no need for the kind of negotiation that lies at the very heart of communicative behaviour whereby what is meant is worked out by interactive behaviour

Widdowson’s approach to language pedagogy, called discourse-based, stems from the principle that we should not teach the linguistic realisation of functions but rather concentrate on developing interpretative strategies, the knowledge of discourse conventions. The development of linguistic competence starts with analysing discourse based on the above-mentioned strategies and knowledge. Grammar, as emphasised in Widdowson’s later publication, should not be “a constraining imposition but a liberating force: it frees us from a dependency on context and a purely lexical categorization of reality” (1990: 86). Teaching grammar as a liberating force will amount to the instruction in the notional and attitudinal meanings of structures, and will be carried out by means of “production tasks which challenge learners grammatically, and also lead them to notice gaps in their knowledge of the target language system” (Cullen 2008: 223). A similar attitude to grammar as liberating force as opposed to grammar taught linearly and in a hierarchical manner is expressed in Rutherford (1987) via his metaphor of an organism vis à vis a machine (discussed earlier in this chapter). However, when we analyse the descriptors of the mechanic and the organic views of language proposed by Rutherford, it becomes clear that, in many ways, the former stance coincides with explicit, especially the top-down explicit instruction, and the latter proposal is convergent with implicit learning and teaching. Explicit discovery learning will be situated somewhere in the middle, with some characteristics – such as systematicity and rule-primacy – belonging to the mechanic domain and other – including learner participation and reflection – typical of the organic view. As all of these issues have, to a certain extent, been addressed in the present work, the distinction of focus on form vs. focus-on-formS will be considered in the present section following Long’s own delineation of the two concepts as all-inclusive, progressing along “complex, gradual and interrelated developmental paths” (Long 1988: 11) as opposed to discrete, linear teaching and testing of individual language structures. Such a distinction is best explained through the “growing a garden” vs. “building a wall” metaphors proposed by Nunan (1996: 69). Considering the two pedagogical options, we can note that focus on formS, which implies linear, systematic and discreet learning, is best understood by means of the wall-building metaphor, while focus on form is more like growing a garden in the sense that instead of learning one item completely, learners learn a few items simultaneously and imperfectly. In addition to this, as Nunan (1996)

Language instruction: Theoretical underpinnings …

179

puts it, a garden – contrary to a wall – is a place where systematicity may well be replaced with variety and spontaneity on the one hand and mutual interrelations on the other. Transferring this from the level of metaphor to language pedagogy, we have to accept that in focus on form “rather than being isolated ‘bricks’, the various elements of language ... are affected by other elements they are closely related to in a functional sense” (Nunan 1996: 68). Another characteristic which Nunan associates with the wall-building approach of the linear pedagogy is its normativity: the fact the all of the bricks need to fit together in a predesigned fashion. This, as he notes, has been proved to be problematic by a number of grammaticality judgment tests on which native users of a given language accept as correct sentences that clearly violate some norms decreed by linearly organised pedagogical grammars. The above problem results from the fact that focus on formS is usually sentence-based: the normdictated strings have to be manageable; and strings of discourse including numerous samples of one form are extremely rare. This is a major weakness of the wall-building approach because, as indicated by Hopper in the already-quoted definition of Emergent Grammar: “structure or regularity comes out of discourse and is shaped by discourse as much as it shapes discourse ... [Its] forms are not fixed templates but are negotiable in face-to-face interaction” (1987: 142). As a result, what is decreed as a norm in a narrow sense, often fails to operate in broader contexts. The more versatile garden-growing approach, in turn – in an attempt to determine what a norm is and, consequently, what grammar choice needs to be made – takes into account the discoursal context, understood as both co-text as well as the cultural and situational milieu of the discourse. “Without reference to context,” to use Nunan’s words (1996: 68), “it makes little sense to speak of ‘facts’, ‘correctness’ or ‘propriecy’.” This is because grammar and discourse are interdependent, the latter determining higher-order choices which influence the lower-order grammatical decisions. As a result, the interpretation of numerous elements in a sentence is largely determined by factors from outside this sentence. Anaphora, elision, word order and other phenomena of this kind are an example of such influences not to mention structural choices such as tense, aspect, mood etc. whose selection may be determined stylistically, co-textually, in relation to the experience the user wants to describe or based on the interpersonal meaning they want to convey. On the other hand, however, it needs to be kept in mind that the garden approach, in spite of its advantages, may be confusing while the wall-building normativity is known to give the learner a sense of order and logic (s)he needs, particularly at the lower levels of language advancement. Summing up his discussion and opting himself for the garden-growing pedagogy, Nunan argues for teaching grammar through “active engagement in discoursal encounters” during which the learner is introduced to grammar as a choice of means for expressing “meanings of increasingly sophisticated kinds” (Nunan 1996: 73). What is important is that such exploration should: start at the level of text

180

Chapter Two

and move towards the lower-order grammatical structures and not vice-versa; be context-bound; and be based on the authentic text to allow for the formulation of hypotheses about the real – and not the artificial – language as a system. All this boils down to the following 5 pedagogical principles (Nunan 1996: 75): 1. Teach grammar as choice 2. Provide opportunities for learners to explore grammatical a discoursal relationships in authentic data 3. Teach grammar in ways that make form/function relationships transparent 4. Encourage learners to become active explorers of language 5. Encourage learners to explore relationships between grammar and discourse Teaching grammar as choice will, according to Nunan, result in replacing the question of Which form to use? with the one of What meaning is to be conveyed?. The choices learners make will be fully informed on condition that alternative forms together with their meaning potential are presented and explained. When it comes to providing exploration opportunities as well as encouraging learner initiative in this area, the suggestion is to persuade learners to take language out of the classroom and study naturally occurring language samples. In turn, clarifying form/function – or form-meaning – relationships, as well as exploring grammardiscourse interplay, may be carried out by means of all kinds of both inductive and deductive tasks, such as comparing alternative forms in the same context trying to detect meaning differences, aimed at increasing the learner’s understanding of individual structures. While the organic approach to grammar as a matter of choice rather than norm in a broad authentic context is certainly very attractive, especially to a humanistically-minded teacher, the more analytic, wall-building stance also has a number of subscribers, some of whom, such as VanPatten (1990 and later works), DeKeyser (2003, 2004), Azar (2007) and even, to a certain extent, Rutherford (1987) have been quoted in the present work. This approach, as has to be admitted – is well-grounded theoretically. The one-form-at-a-time, single-sentence-context pedagogy is in concordance with what we know about human attention and its limitations. Additionally, if we acknowledge the positive effects of teacherfronted instruction, we cannot ignore the fact that it may not be possible to offer such instruction on form in a garden-growing fashion; when teachers present grammar problems, they focus on individual forms, not on language as such. All in all, both instructional modes have their strong points and their limitations. That is why they need to exist along each other. This in fact is the case in language pedagogy: when we look backwards at the instructional options discussed in Section 3, we notice that all of them can be found in both – focus-onform and focus-on-formS – variations. The narrow input, as understood by Krashen, and input flooding will belong to the garden-growing tradition. In turn,

Language instruction: Theoretical underpinnings …

181

the slightly more explicit quantitative enrichment proposed in data-driven learning (DDL) is definitely closer to building a wall, with focus on individual forms: a typical corpus worksheet will contain hundreds of context samples for one particular structure. When it comes to qualitative input enhancement, most of the VIE studies cited here used language material in which only one structure (a tense; passive voice) was visually highlighted. However, it seems possible to utilise the same enhancement procedure in the implicit focus, if enhancement is applied to a number of forms in one text; in such a case we will talk about focuson-form. In turn, consciousness raising is understood as primarily organic. Yet, we can imagine consciousness raising activities in which students will be asked to reflect on the use of just one form and make inter-language or NS/NNS comparisons. Finally, explicit form-focused instruction is also open to both options. In Task-Based Learning the choice of one of the two instructional modes will depend on whether, in the structuring of the tasks, we choose the naturalness or the essentialness approach, the former being closer to focus on form, the latter requiring focus on formS. Focus on formS will also be a more natural choice in the rule-based top-down instruction; however, remembering Westney’s requirement for gradual integration of individual patterns into a broader system (Westney 1994: 77), learners should certainly be encouraged to make inter-pattern comparisons in a more organic, focus-on-form manner. It is also important to remember that while a typical focus-on-form treatment understood as a pedagogical option implies the garden-growing approach to instruction, Azar’s Grammar-Based Learning definitely stems from the focus-on-formS and its wall-building philosophy. The question of what kind of instruction asked in relation to focus-on-form as opposed to focus-on-formS will in many cases follow the guidelines offered in sections 4.2 and 4.3 as regards implicit vs. explicit pedagogy. There is, however, one recommendation based on which we can link focus on form and focus on formS to instruction timing. As proposed by Skehan (2002), for every area of grammar to be mastered, the learner passes through the following stages (Skehan 2002: 91-92): – noticing, during which “the initial insight ... that some aspect of form may be worth attention” (2002: 91) appears; – patterning, when – through identification, extending, complexifying and integrating – “the noticed input [is] analysed, processed, generalisations [are] made and then extensions [are] achieved” (2002: 91); – controlling, in which “a rule-based generalisation, initially handled with difficulty, becomes proceduralised” (2002: 91); this is manifested by avoiding error, achieveing salience and automatising (cf. also Norris et al. 2003); and – lexicalising, as a result of which “the learner is able to go beyond rule-based processing” (2002: 92) in real-time performance.

182

Chapter Two

Mapping the stages on available instructional options, Skehan (2002) notes that while patterning, controlling and lexicalising will benefit from instruction (I) – Skehan in fact argues for a series of interventions (from I1 to In) – the noticing stage is in fact a pre-instructional phase. In the light of this, it stands to reason that while both focus-on-form and focus-on-formS may be in place in stages 2-4, the noticing phase (1) is likely to benefit most from the more organic FFI mode. 4.2. Learner constraints on form-focused instruction. Individual differences Summarising research to date (for details, see Ellis 1994: 646-664; cf. also Wesche 1981 and Skehan 2003), Ellis enumerates four areas of individual differences which have been considered, and four potential recommendations related to learner/instruction matching (Ellis 1994): – IQ: in the case of a high IQ, a more structured instructional approach is favoured; for learners with a lower IQ a less systematic approach seems to be better; – Expert/novice: experts perform well with both explicit and implicit learning contexts; novices benefit from explicit learning; – Language aptitude: memory learners benefit from memory-based instruction and from inductive learning; analytic learners – from analytic and deductive learning; – Field (in)dependence: field dependent learners benefit from inductive treatment; field independent – from deductive teaching. More contemporary research into individual difference and FFI concentrates on language aptitude and working memory as important learner factors related to both learning outcomes as well as – and more importantly to the present section – learner-instruction matching. 4.2.1. Language aptitude and working memory as individual differences When it comes to language aptitude and different learning/teaching options, it is important to base the discussion of suitable FFI options on modern views of language aptitude, with particular regard to the fact that language aptitude is a multi-component combine28 of different predispositions in different configurations, ––––––––– 28

Carroll (1965), one of the proponents of the concept of language aptitude distinguished four such components: phonemic coding ability (the ability to code foreign sounds in a way that they can be remembered later; also related to the ability to spell and handle sound-symbol relationships); grammatical sensitivity (the ability to recognize grammatical functions in a sentence); inductive language learning ability (the ability to identify patterns and relationships between forms and meaning); and rote learning ability (the ability to form and remember associations between stimuli). Skehan (2003) combined grammatical sensitivity

Language instruction: Theoretical underpinnings …

183

related to acquisition stages, task demands, pedagogical techniques or overall learning modes rather than a sum-total of these predispositions. Such a perspective on aptitude accommodates Skehan’s proposal of matching individual aptitude constructs with language acquisition stages (Skehan 2002; see also Dörnyei 2005: 62 for a summary) and Robinson’s Aptitude Complex Hypothesis (Robinson 2001, 2005). When it comes to Skehan’s SLA stage/aptitude construct matching, the following links can be established between input (I), central processing (CP) and output (O) on the one hand and individual abilities (Skehan 2002; Dörnyei 2005): – input processing strategies, including input segmentation/parsing (I), require good attentional control and working memory capacity and speed; – noticing (I) is based on phonetic coding and working memory abilities; – pattern identification (I/CP) requires good working memory, inductive learning and phonemic coding abilities as well as grammatical sensitivity; – pattern restructuring and modification (CP) demands grammatical sensitivity and inductive language learning ability; – pattern control (O) relies on automatisation and is related to good integrative memory; and – pattern integration (CP/O) demands proficient chunking and good retrieval memory. Based on the above, we can note that individual differences in the area of all of the different components of language aptitude can be treated as a point of departure for learner-instruction matching, the instruction aimed at compensating for language aptitude deficiencies or relying on the learner’s strengths. Examples of the former, compensatory pedagogical approach are contemporary studies into the relationship between achievement and strategy training, strategy use being understood by O’Malley and Chamot (1990) as a manifestation of language aptitude. Sawyer and Ranta (2001) describe a number of such studies, singling out Cohen et al.’s (1998) research as one of the most ambitious programmes of this kind. Their experimental group, as a result of strategy-based instruction (SBI) consisting of “presentation, discussion, promotion and practice of strategies [with] the added element of explicit (as well as implicit) integration of training” (Cohen et al. 1998: 82), outperformed the comparison group on a number of subscales, among others – importantly for the present work – on a grammar task. As for the latter learner-instruction matching mode, the best known example of aptitude-related treatment relying on learner strengths is described in the ––––––––– and inductive language learning ability into linguistic (=language analytic) ability.

184

Chapter Two

already-cited publication by Wesche (1981). In this study the matched subjects – analytical learners undergoing analytical grammar instruction and memory learners matched to a CLT learning mode – outperformed the unmatched learners (the audiolingually-instructed group). The more contemporary body of research contains examples of studies in which certain aptitude-related individual differences prove to be particularly well-matched with some instructional modes and techniques. To start with, it is important to point out that numerous studies (Harley and Hart 1997, Robinson 1997b, De Graaff 1997) refute Krashen’s claim (Krashen 1981) that language aptitude is relevant in controlled (learning) but not in naturalistic (acquisition) language settings. Conversely, contemporary research shows that aptitude is relevant in all contexts, implicit and explicit, immersion and interventionist, corroborating Robinson’s (1996, 1997b) Fundamental Similarity Hypothesis, in the light of which language aptitude is a significant individual difference regardless of the adopted language pedagogy. What, however, seems more important in our discussion of individual differences is considering how different aptitude components (IDs) match with different instructional modes. Robinson’s study (Robinson 1997b) shows that grammatical sensitivity is a good predictor of success in implicit, explicit inductive and explicit deductive instructional settings, while it seems irrelevant in incidental, focus-on-meaning contexts. Therefore, as argued by Skehan (2003), learners who are more memory oriented and, consequently, unable to extract rules on their own, will be better off in explicit-instruction classroom – a similar point is argued by Ranta (1998 cited in Sawyer and Ranta 2001) – or in settings were learning is exemplar-based and rule extraction is of lesser importance. What is also significant is whether the instructional treatment offered is input or output oriented. As noted by Erlam (2005) based on her study of 60 students exposed to three instructional modes: deductive instruction, inductive instruction and structured input groups, learners with higher analytic ability and working memory capacity benefit most from instruction based on input and not requiring output. When it comes to individual differences and matching instructional microoptions, Robinson and Yamaguchi (1999) note that learning from recasts is particularly suitable for learners with high rote memory and phonetic sensitivity; in turn, Mackey et al. (2002) observe that recasts match well with good phonological working memory capacity (the ability proved to be the most important at lower levels). Based on Robinson’s interpretation (Robinson 2001), the results of the two studies are not necessarily mutually exclusive and contradictory: low WM capacity may make it impossible for the learner to rehearse recasts even if (s)he has the ability to notice them; noticing the gap, in turn, may be low if the ability to notice/identify patterns is low (and will not be compensated for by the memory for contingent text). The ultimate outcome may, in such a case, be less related to individual differences per se and better understood in the context of ability, aptitude and task demands combines.

Language instruction: Theoretical underpinnings …

185

Such a proposal linking abilities to aptitude complexes and to task demands is put forward in Robinson’s Aptitude Complex Hypothesis (Robinson 2001, 2005). In the light of Robinson’s postulate, language aptitude is the said multicomponent combine in which different cognitive abilities (called first-order, distinguished after Snow 1994) – such as attention, working memory or different types of reasoning and processing speed – form combinations which contribute to second order aptitude complexes, including noticing the gap, memory for contingent speech, deep semantic processing, memory for contingent text and metalingual rule rehearsal. What is pedagogically important in terms of learning/teaching techniques is that the individual abilities and the aptitude complexes they form are related to task aptitudes in the following ways (based on Robinson 2005: 52; Figure 1): – processing speed + pattern recognition = the noticing-the-gap aptitude complex, related pedagogically to here-and-now, open tasks; – phonological working memory capacity + phonological working memory speed = memory-for-contingent-speech aptitude complex, related pedagogically to one-way, convergent same-gender-participants tasks; – semantic priming + lexical inferencing = deep-semantic-processing aptitude complex, related pedagogically to same-proficiency and familiar-participants tasks; – text working memory capacity + text working memory speed = memory-forcontingent-text aptitude complex, related pedagogically to single, planningtime tasks; and – grammatical sensitivity + rote memory = metalinguisitc-rule-rehearsal aptitude complex, related to background knowledge, reasoning, few-elements tasks. Summing up the above, Robinson (2005) argues for a broader battery of processsensitive aptitude subtests enabling the creation of learner aptitude profiles to be later utilised in task design. Finally, a number of minor points concerning aptitude-instruction matching need to be made. First of all, individual differences in this area are age-sensitive. For example, it is important to remember that the earlier-described tendency for the analytic learning to be conducive to learning in different instructional settings depends on maturational process: Harley and Hart (1997) showed that this component of language aptitude is utilised in immersion learning in post-critical-period learners but not in children. When it comes to Robinson’s hierarchy of abilities and aptitude complexes, he claims (Robinson 2001) that while memory for contingent text and speech as well as deep semantic processing are typical of children, adults rely on noticing the gap and metalinguisitc rule rehearsal. What is more, we need to remember that the relationship between aptitudebased individual differences and instruction is mutual: the learner’s strengths or

186

Chapter Two

weaknesses should underlie the choice of the instructional mode; at the same time, however, instruction will also influence aptitude components. On the one hand, task complexity results in greater differentiation of performance – the more complex the task, the greater the impact of individual differences – as argued by Skehan (2003) and Robinson (2005). On the other hand, explicit deductive instruction will level the learner-learner IDs. Based on her study (see earlier in this section), Erlam (2005: 167-168) notes: “deductive instruction that gives students opportunities in language production minimizes any effect that individual differences in language aptitude may have with respect to instructional outcomes”. Related to language aptitude are studies into the importance of working memory – its speed and capacity (cf. Robinson 2001, 2005) – as an individual difference affecting success in language acquisition/learning. One of the most contemporary studies whose authors relate their research to Robinson’s Aptitude Complex model (Robinson 20010, 2005) is the one carried out by Biedro and Szczepaniak (2009) with the participation of 44 linguistically gifted individuals demonstrating near-native proficiency in 1-5 foreign languages. The psychological profiles of all of them demonstrate that they score considerably high on the MLAT (Modern Language Aptitude Test), standard IQ tests as well as demonstrate outstanding verbal intelligence and working memory (the last factor tested by PRSPAN, a WM test designed by the authors of the study). As for WM as an individual difference and learning/teaching mode considerations, Weissheimer and Mota (2009), based on a body of research, note that learners with high working memory capacity generally outperform those with poorer WM scores in “fluency, accuracy, complexity, lexical density and syntactic planning of speech” (2009: 94). More specifically – and more interestingly for the relationship between working memory as an individual difference and form-focused instruction – Harrington and Sawyer (1992) discovered a .57 correlation between WM scores and the reading and grammar results on TOEFL. Another study, Mackey et al. (2002) showed that good working memory was in charge of positive response to recasts as FonF microoption, the significance of WM scores vis à vis the ability to learn from recasts being the highest at lower levels of language proficiency. In a recent study, Mackay et al. (2010) also show that there is a positive correlation between learners’ WM test scores and their tendency to modify output. In turn, when it comes to the relationship between working memory and preferable learning modes, Robinson (1997b) found that WM-learners responded particularly well to incidental learning. 4.2.2. Learning styles and instruction In addition to the above – and, to a considerable extent, informed by the above – we can try to relate various instructional modes to other individual learner differences such as learning styles; multiple intelligences (Gardner 1983); and personality types. All are briefly discussed below.

Language instruction: Theoretical underpinnings …

187

One of the most popular learning style constructs is the one put forward by Kolb (1976; 1984). Learning style classification within this model is carried out along two predispositional poles: concrete-abstract, depending on how information is taken in by the learner, and active-passive, which indicates how the intake is processed. Consequently, Kolb’s learning style continuum is twodimensional and divided into four types: – concrete/reflective, the diverger, who wants to know why they learn something, i.e. in what way the course satisfies their individual goals and interests; – abstract/reflective, the assimilator, who wants to know what will be taught and desires it to be organised in a logical fashion with ample room for reflective learning; – abstract/active, the converger, who wants to know how and learns through trial and error; and – concrete/active, the accommodator, who wants to know what if, likes openended question and discovery learning. Willing (1987), who re-interpreted Kolb’s abstract-concrete as FI/FD and added the active/passive as a personality predisposition, arrived at the four-partite division of the learning-style space into: – analytic/passive: the conformist, who is authority oriented, classroomdependent and visual; – holistic/passive: the concrete learner, who is oriented towards the classroom and fellow learners, likes games and working in groups; – analytic/active: the converger, who is analytic, independent, solitary and prefers to learn about language; and – holistic/active: the communicative learner, who prefers the integrated skills approach (communicative learning in which all skills are taught in combination leading to the completion of a certain task, e.g.: a group project); they also take their learning out of the class. Looking at the two typologies and the defining characteristics of different types of learners given by their proponents, we can state – discussing Kolb’s types – that: – the diverger will most probably benefit from explicit deductive instruction, including strategy training; (s)he will value focus-on-formS over focus-onform and would like to know form-function mappings as well as practical applications of the learned rules; – the assimilator is likely to be open to explicit deductive instruction; (s)he will also benefit from discovery techniques; individuals with thick ego boundaries (cf. Ehrman 1999) or reluctant to take risks (cf. Ely 1986) may need the discovery tasks to be blueprinted by a top-down presentation of rules or post-task feedback; – the converger seems to need independent studies of language as a system, both top-down and bottom-up; and

188

Chapter Two

– the accommodator will be interested in all forms of discovery learning; (s)he will value both focus-on-form and focus-on-formS modes; (s)he will also benefit from discovery inducing implicit learning modes. In the case of Willing’s types, we can argue that: – the conformist is likely to benefit from teacher-fronted direct one-item-at-atime explicit instruction; being visual, (s)he will value the VIE mode; – the concrete learner will like explicit group-based learning of grammar, in which (s)he is going to look up to more able peers for scaffolding; (s)he is also likely to benefit from implicit instruction and output options; – the converger will prefer explicit learning: primarily deductive, from materials or experts of their choice; potentially inductive through discovery, in which case (s)he will also seek confirmation of his/her hypotheses with the teacher or – more likely – in books; and – the communicative learner is likely to benefit from output-based instruction, focus on form rather than focus on formS as well as implicit learning. In turn, trying to relate multiple intelligences to types of grammar instruction, we can speculate that the following treatment modes can be matched to them (Table 2; informed by Christison 1998; adapted from Turula 2010): Table 2. Multiple Intelligences: learner/instruction matching type of intelligence

associated abilities/skills

types of instruction

Linguistic intelligence

Ability: to use words effectively both orally and in writing. Skills: to remember information, to convince others to help you, and to talk about language itself. Ability: to use numbers effectively and reason well; to predict Skills: to understand the basic properties of numbers and principles of cause and effect; to use simple mathematical machines.

explicit: deductive input-based options FonF

Logicalmathematical intelligence.

Musical intelligence

Ability: to sense rhythm, pitch, and melody Skills: to recognise simple songs; to vary speed, tempo, and rhythm in simple melodies.

explicit-deductive; explicit inductive if blueprinted by teacher-fronted proor reactive focus on form (especially in the case of individuals with thick ego boundaries or reluctant to take risks; FonFS rather than FonF memory based or pattern based; aural input enhancement FonF or FonFS, depending on other dominant intelligences

Language instruction: Theoretical underpinnings … Spatial intelligence

Bodily-kinesthetic intelligence Interpersonal intelligence

Intrapersonal intelligence

189

Ability: to sense form, space, colour, visual input enhancement line, and shape; Skills: to graphically represent visual or spatial ideas. Ability: to use the body to express ideas discovery techniques reinforced solve problems by manipulatives Skills: coordination, flexibility, speed, an FonF Ability: to understand another person's interaction; output-based focus moods, feelings, motivations, and on form; implicit learning intentions. FonF Skills: to respond effectively to other people in some pragmatic way, such as participating in a project. Ability: to understand yourself: your explicit-inductive; discovery strengths, weaknesses, moods, desires, techniques and intentions; to know yourself as a FonF or FonFS, depending on language learner; to handle feelings other dominant intelligences

When it comes to personality types, considering their defining characteristics (based on Myers 1962, Ehrman 1989, Ehrman and Oxford 1990, Brown 2000, Felder and Brent 2005, Bielska 2006), we can argue that each of them will benefit from the following instructional modes (Table 3): Table 3. Personality types: learner/instruction matching

character profile extrovert

introvert

sensor

intuitor

language learningrelated advantages ready for conversational risks; uses social strategies get meaning across concentrated, selfdependent

hard-working and diligent; fond of memory strategies good at guessing from context; able to provide self-structure to input; uses compensation strategies

language learningrelated drawbacks dependent on outerworld authorities rejects social strategies; not fluent in speaking (long pre-speaking processing; avoidance strategies) lost in high-risk/lowstructure contexts inaccurate, prone to mistakes and inaccuracies; has a tendency to overcomplicate things

type of instruction (partly informed by Bielska 2009) explicit, teacher-fronted; output-based FonF or FonFS explicit inductive; discovery techniques

explicit, teacher fronted FonFS implicit or explicit inductive; reactive focus on form, either explicit or implicit; FonF

190 thinker

feeler

judger

perceiver

Chapter Two good analyst, selfdisciplined, instrumentally motivated; uses metacognitive strategies establishes good rapport with both the teacher and peers; integratively motivated; efficient affective strategy user systematic; works towards task completion flexible, open-minded, easily adapting to new circumstances; uses affective strategies

afraid of performance failure; needs external control and correction dependent on appreciation; inefficient in metacognitive strategies; affected by classroom atmosphere does not use affective strategies; inflexible, intolerant of ambiguity lazy; has long-term motivation/orientation problems

explicit inductive; may need proactive blueprinting or reactive feedback; explicit deductive implicit; may need reactive focus on form, either explicit or implicit

explicit; FonFS

implicit; FonF

What needs to be pointed out, though, is that one’s learning style is a matter of one’s place on a certain continuum (Bielska 2006); additionally, personality is usually considered in character types, such as, for example ENFJ (extrovertintuitive-feeling-judging). All this makes any final decisions related to the type of instruction to be used impossible to be taken before the final diagnosis of the learners’ predispositions and expectations is carried out. The same applies to the earlier-discussed learning styles and multiple intelligences. 4.3. Object-of-study constraints The implicit and explicit knowledge, quite distinct neurolinguistically (Paradis 1994), are also, as mentioned on a number of occasions throughout the present chapter, different along the lines of the seven dimensions listed below (Table 4; based on Ellis 2006a: 432-434): Table 4. Seven dimensions of implicit and explicit knowledge type of knowledge  dimension  awareness control systematicity accessibility online use self-report learnability (with age)

explicit knowledge

implicit knowledge

+ + + + +

+ + - (it decreases)

Language instruction: Theoretical underpinnings …

191

As can be seen in Table 4, the strong points of one type of knowledge are at the same time the disadvantages of the other. We can assume that what is known about differences in knowledge can be of consequence to learning and teaching. It, in fact, is so, as was discovered by Hammerley (1975) and confirmed by Ellis (2005) in his study of seventeen different grammatical structures, items that are difficult to learn explicitly may be easier to acquire implicitly and vice-versa (see also another study by Ellis 2002 cited earlier in this chapter). This observation led Ellis to the assumption that the implicit/explicit learning distinction is yet another dimension of language learning difficulty, in addition to objective, natural-order related problems and hindrances that result from individual learner differences. What makes a grammatical item easy for implicit or explicit learning may be external, a characteristic of the input, or an intrinsic quality of the item itself. Ellis (2006a: 435) and N. Ellis (2006) enumerate the following extrinsic factors that facilitate implicit learning and, potentially, diminish the necessity of overt tuition: – Frequency: how often the grammatical feature occurs in the input; – Salience: if the grammatical feature is easy to notice in the input; if it is not overshadowed as a result of cue competition (N. Ellis 2006); – Functional value: whether or not the grammatical feature maps on to a clear, distinct function; – Regularity: if the grammatical feature conforms to some identifiable pattern; or, as N. Ellis (2006) would ask, how reliable the co-occurrence of a certain mapping is; and – Processability: whether the grammatical feature is easy to process [for instance, in terms of natural sequences of acquisition] (Pienneman 1998). Based on N. Ellis (2006: 165), we can add another factor facilitating implicit learning: – The lack of associative attentional tuning involving interference, overshadowing and blocking, or perceptual learning, all of which are shaped by the learner’s L1. If the above-mentioned six factors do not occur in the learning environment, explicit instruction is a definitely better option. When it comes to the intrinsic specificity of a given form itself, grammatical phenomena of considerable complexity will be ideal for implicit learning (DeKeyser 2003), mainly because they are too complicated for explicit instruction, the latter working best (Robinson 1996 cited in Ellis 2006a: 437) for forms that are structurally regular, conceptually clear and, consequently, can be supplied with a fairly easy and straightforward explanation. In addition to this, the choice of appropriate pedagogy will also depend on what kind of grammar we teach.

192

Chapter Two As Lewis claims (1986: 9-12), there are three kinds of grammar:

– facts, such as the information that some verbs are regular and some are not; – patterns, such as, for example, question tags; – primary semantic distinctions, including, for example, temporal expressions (tense and aspect). Commenting on all three, Lewis claims that facts and patterns are relatively easy to understand and master based on explicitly provided rules or algorithms; their problem is that they are extremely difficult to automatise (cf. also Willis 2005 on teaching question tags), which implies the need for explicit tuition combined with extensive input and output practice. Primary semantic distinctions, on the other hand, may not be too difficult to proceduralise in terms of morphosyntax: the past simple tense does not cause major form-related difficulties; they, however, are definitely not easy to conceptualise in terms of form-meaning pairings as even quite advanced learners find it complicated to make decisions about which temporal expression should be used especially if subtler meaning distinctions are to be rendered. In the case of such grammar implicit learning as a sole or, potentially, supplementary pedagogical option may be the best solution. A similarly manifold view on grammar and its pedagogy can be found in Willis (2005), who speaks of teaching the grammar of pattern (including the afore-mentioned question tags), grammar of structure and grammar of orientation. As in the case of Lewis’s division, grammar of pattern and grammar of structure will be best taught explicitly, based on rules – if the teacher is dealing with more analytically minded learners – or through associative, memory-based patterning based on analogy – if the instruction is directed to gestalt learners. When it comes to grammar of orientation, in turn, Willis (2005 following Brazil 1995) points out that the purpose of grammar is not the construction of sentences but the construction of meanings. Grammar of orientation – with its tense system, articles, etc. – is central to this concept. It shows “how the things we are speaking or writing about are related to the real world and to other elements of the text” (Willis 2005: 34). As a result, this type of grammar will best be taught based on Rutherford (1987) and his two-sides-of-the-same-coin approach as well as to Lem’s (1994) distinction between soft and hard communication codes: as a combination of explicit instruction in the case of the formal properties of such structures and organic, implicitly-bent exploration aimed at the conceptualisation of the meaning of the structure. All in all, considering different constraints on the two discussed modes of instruction – the implicit and the explicit focus on form – such as the timing of the instruction, the individual differences of the learners and the object of formfocused instruction, it seems necessary to adopt the do both stance proposed by Azar (2007). She writes, from the practitioner’s perspective:

Language instruction: Theoretical underpinnings …

193

Do both. Those are the two words I write most often in margins when I read academic articles about the teaching of grammar in second-language instruction. Focus on fluency or accuracy? Do both, in proper balance given the students’ needs and goals. Have students work with grammar structures inductively or deductively? Do both, you never know where any particular student’s “Aha” is going to come from. Use authentic or adapted language? Students need both. Work with sentence-level vs. connecteddiscourse material? Both can have good pedagogical purpose and effect. Engage in open-ended communicative interaction or controlled-response exercises? Explicit instruction or communicative exposure? Both. “[F]ocus on awareness rather than performance” in teaching grammar (Ellis 2002, p. 29)? My handwritten note in the margin: Do both.

Such a do-both solution – accommodating the recommendations described above – called Organic Approach Deductivised (Turula 2009d) has been theorised and will be delineated in this book as a form-focus option directed particularly at such a complex grammatical problem as the formally and conceptually proper use of target language temporal expressions. First of all, however, it is necessary to clarify the second part of focus on form: what exactly we focus on when we focus on form, especially when working with advanced learners. As I refuse to subscribe to the view that teaching grammar at the advanced level is confined to recycling what is already known, Part II will address the problem of what it means to focus on form at higher levels of language proficiency. Chapter 3, entitled Focus on FORM, offers a look at the issue from Larsen-Freeman’s (2003) tripartite model of form, with particular regard to the most neglected part – the semantics of the form. Chapter 4 relates these observations to temporal expressions, which are of central interest in the book, and the study of the effectiveness of form-focused instruction it presents.

PART II: FOCUS ON FORM When it comes to the effectiveness of form-focused instruction, Chapter 2 presented a number of studies describing the positive outcomes of such pedagogic interventions (Ellis 1994 – a review of studies; Erlam 2003, Lyster 2004, Doughty and Williams 2004 – a collection of studies, Piechurska-Kuciel 2005, Philp, Loewen and Ellis 2006 and Pawlak 2006). To briefly update the reader on their outcomes as reported in Chapter 2, in all of these studies the groups that underwent form-focused instruction outperformed their controls on accuracy tests; some of the testees – if this was tested as part of the research – demonstrated improved language production resulting from greater automatisation of the learned structure. In other words, the applied treatment had two positive effects – it improved declarative knowledge of a certain formal aspect of language and resulted in its effective proceduralisation. While the positive automatisation outcome presented in all of the studies is uncontroversial, it seems appropriate to have a closer look at the quality of the linguistic knowledge that becomes proceduralised. There is good reason for a pinch of scepticism as it seems that the cited studies do not actually focus on forms and their meanings in the communicative context; the communicative context notwithstanding, the very focus on form in each research effort is exclusively on syntactic and morphological formal accuracy: Erlam (2003) – direct object pronouns; Philp, Loewen and Ellis (2006) – generic article a/an and 3sinS; Lyster (2004) – grammatical gender assignment in French; Pawlak (2006) – past simple and present perfect tenses as well as passive voice; the studies presented in Doughty and William’s collection (2004) refer to: the use of possessive determiners (White); past and conditional past (Doughty and Varella); the passive (Williams and Evans); and grammatical gender (Harley). The only study that, to a certain degree, attends to actual form-meaning correspondences is the one carried out by Piechurska-Kuciel (2005). Out of the three experimental groups participating in this research, one was subject to form-only-focus treatment of the kind described above, the other two received language awareness training which, in addition to formal accuracy, concentrated on the conceptual underpinnings of language use: the experimenter asked the testees to justify their grammatical choices, evoke potential contexts and ponder on meaning changes resulting from using available formal alternatives. Whether the preference of the majority of the cited studies for focus on the purely formal properties of language is justified is a question the answer to which depends on how we choose to define form or rather what we think a student should know when they know a particular form; another question is how different aspects of form are learned and, consequently, should be focused on in the language classroom. The answer to the question of what is form, which accords with the philosophy of language subscribed to throughout the present book, is the one

196

Part II

offered by Larsen-Freeman (2003). Clearly based on Hymes’ (1972) communicative competence model – amended and updated by Canale and Swain (1980) as well as Bachman (1990) in their proposals of the communicative language ability (CLA) – Larsen-Freeman (2003: 35) argues that in the case of every linguistic form there will be three interacting, equally important subsystems and, consequently, dimensions to the prototypical language unit: the form per se, its meaning/semantics and its use/pragmatics. Considering the fact that cognitive linguistics – with reference to which the present argument is constructed – proposes a usage-based approach to language and extends the scope of semantics to incorporate both meaning and use, we arrive at a representation of linguistic form reminiscent of the Langackerian symbolic unit (cf. Chapter 2) which, because of the said semantics-pragmatics continuum is not tripartite, as proposed by LarsenFreeman, but bipolar, a form-meaning pairing (a concept shared by all cognitively motivated philosophies of language, including all construction grammars; cf. Croft and Cruse 2004). This will mean that learning grammar should aim at what VanPatten et al. call establishing form-meaning connections (FCMs), which, as they point out, “go beyond lexical learning”(2004: 4) into the realm of syntax and morphology, looking at every linguistic unit as a form-meaning combine. In the light of the above we can consider the second question asking: if forms incorporate formal properties of language together with their meaning, is it not necessary for any language teaching philosophy to be uniform in its treatment of both poles of the unit? In other words, do focus-on-form pedagogic interventions have to consist in noticing the formal properties of a given structure together with the semantics as well as the pragmatic potential of this structure? Based on what was written in Chapter 2 in relation to hard and soft codes (Nalimov 1974, 1982; Lem 1994) as well as different types of grammar (Lewis 1986; Willis 2003), the answer to this question may seem to be No. Formal properties of structures, which are an example of a hard code, benefit from formfocused instruction; the meaning, in turn, being softer and more elusive, is learned through discoursal encounters in which the rationale behind the semantic choice of a certain form is intuited on the basis of numerous clues, found in- and out-ofdiscourse. This is one more reason why successful instruction matches focus on form with exposure to communicative contexts, both passive and active. There is, however, one fault to be found with this otherwise flawless reasoning: a NS/NNS gap many learners notice at the advanced level, a gap which is no longer formal but conceptual and can best be expressed by a question: Why does my understanding of the semantics of grammatical structures and, consequently, my choice of form usually differ form that of a native speaker?1. Consequently, it seems important to consider the following problem: maybe in teaching the ––––––––– 1

The problem will be considered in detail in section 3.1 of Chapter 5.

Focus on form

197

semantic pole of form we should also consider an array of options – just as we did in the case of the formal pole of form in Chapter 2 – taking into account extrinsic factors, learner individual differences and, most importantly, the intrinsic properties of different form-meaning pairings. This is because – considering the above-mentioned gap – we cannot rule out the need for explicit meaning-focused instruction, especially – as it stands to reason – in the case of the rarer, marked form-meaning pairings2. Part II of the present work considers intrinsic properties of the semantic pole of form in the hope of finding the answer to the question of what to teach and, eventually, how to teach grammar – with special regard to the tense-aspect system – to the advanced foreign language learner. The point of departure for the considerations is that at higher levels of proficiency, when formal properties of language are – in most cases – satisfactorily assimilated and automatised, focus on form should imply teaching the semantics of grammar. What remains to be done is delineating the territory. The near-native knowledge of meaning, according to Stern (1983: 343) amounts to “an intuitive grasp of the linguistic, cognitive, affective and sociocultural meanings expressed by language forms”. Consequently, we need to find and describe a model of semantic knowledge that will accommodate all of the listed meanings. This is the main aim of Part II. It starts with Chapter 3 which attempts to draw an overall picture of semantics. As the book is written from the cognitive perspective, meaning is investigated from the experiential, usage-based point of view. Consequently, it will take the reader through Langacker’s (1987) encyclopaedic approach to meaning and Rosch’s (1975a and 1975b) prototype theory towards the concept of the bipolar model of semantics of Jackendoff (2002, 2004). All this will lead to a discussion of the holistic semantic models such as frames, scenarios and mental spaces; as well as the examination of form-meaning pairings from the perspective of the Conceptual Integration Theory (CIT; Fauconnier 1994 and 1997; Fauconnier and Turner 1994) with its basic concepts of mapping, frame shifting and conceptual blending (Coulson 2000, Coulson and Kutas 2001 and Coulson et al. 2006). Chapter 4, in turn, will relate the above-mentioned theoretical underpinnings to the area of grammar that is of interest to the present work: English temporal expressions, with special regard to the tense-aspect system. The semantics of different English tenses – to use a term untilised in pedagogical grammars – will be presented based on Langacker (1987, 2000, 2001, 2002) and Taylor (2003, 2008); examined from the data-driven perspective (Carter and McCarthy 2006; Mindt 1995 and 2000); and investigated within the CIT framework for which purpose the results of my own study will be quoted (Turula 2007b). ––––––––– 2

This claim is partly informed by Cook’s argument for instruction on marked parameter settings in UG-based approaches discussed in Chapter 2.

198

Part II

All this – in connection to what was written in Part I about attention on the one hand and focus on form in language teaching on the other – will eventually lead to a consideration of the best classroom approach to teaching meaning to advanced foreign language learners. Some general guidelines will be presented in the final section of Part II; the question will be revisited from the practical point of view in Part III, which will present the proposed pedagogical solution, together with its implementation and results.

CHAPTER THREE THE MEANING OF MEANING Meaning in the cognitive understanding is a departure from philosophical semantics which, based on Aristotle’s category realism, studied the relationship between language and real world phenomena it describes, and which explained meaning is by means of necessary and sufficient conditions. To use the most frequently cited example, a bachelor was defined as [+adult, +male, -married]. Cognitive semantics, in turn, believes in the construction of meaning; consequently, it underscores the importance of experience in concept formation, as a result of which meaning is both denotative and connotative/associative. Revisiting the bachelor example, cognitive semantics subscribes to the view that, as Fillmore (1975, 1978) and Lakoff (1987) argue, there are better and worse examples of this category. Similarly, there are better and worse examples of birds, balls or even – as a group of American students claimed (Pinker 1999) – better and worse odd numbers. Such cases are best explained by prototype semantics, discussed in Section 1.1. However, as Aristotelian categories are also indispensible for the delineation of meaning – there are not, in fact, better or worse odd numbers – there is a need for bi-polar model of the study of meaning; such a concept (Jackendoff 2002; 2004) is described in section 1.2. After that, the present chapter proceeds to discuss semantic models whose conceptual spaciousness has the potential to accommodate the multi-faceted idea of meaning. Sections 2 and 3 address the mental space architecture as well as its building clocks – frames and scripts – and the structuring processes – mappings, based on different perspectives, profiling and zone activation. The chapter concludes by the presentation of Conceptual Integration Theory (Section 4), a proposal which encompasses all the above-mentioned concepts, models and processes, and which explains meaning making through frame shifting and conceptual blending (Section 5).

1. Introduction to meaning. The cognitive perspective Various attempts to define meaning propose a bipolar concept of semantics based, as Jackendoff (2004) notes, on Frege’s Sense and Reference (1972 [1892]) and his disassociation of the objective, externally available sense of an expression from the user’s subjective idea of this same expression. This results in denotativeconnotative or referential-emotive oppositions, one of which is Yule’s (1985) division into conceptual meaning and associative meaning. The former is a more or less objective definition of what a certain X is; the latter refers to our experience-based knowledge of this X. As a result, the conceptual meaning of a

200

Chapter Three

needle will be “a small thin polished piece of metal used for sewing”1, while its associative meaning will contain information such as: needles are sharp and, consequently, one can hurt oneself if one is not careful; being small, needles are difficult to locate if lost (especially in a haystack); etc. Langacker (1987), in spite of acknowledging the duality in his division into dictionary and encyclopaedic meaning, argues that it is actually impossible to draw a borderline between the lexical (dictionary) definition of an item and its other specifications pertaining to a number of different domains. As a result, it is best “to avoid such false dichotomies” and subscribe to the understanding of meaning that is “encyclopaedic in nature” (1987: 154), all-inclusive, semantic and pragmatic, linguistic and extralinguistic at the same time. He gives the example of banana, arguing that its being a tropical fruit is as important to its meaning as the fruit’s colour, shape, etc. The all-inclusiveness of Langackerian approach to meaning, its fuzzy boundaries and the fact that individual human experience is part of meaning leads to the acknowledgement made in Lewandowska-Tomaszczyk (1987: 45) that what we need are holistic models which assign semantic value to “dynamic wholes” rather than “component parts”. Such conceptual gestalts are an idea accommodated in prototype semantics. 1.1. Prototype semantics The prototype approach to semantics, proposed by Rosch (1975a; 1975b; Rosch and Mervis 1975; Rosch and Lloyds 1978) – based on the study of colours, which evolved into the investigation of basic categories – was one of the reactions against the traditional, objectivist philosophical view, with its sharp-boundary Aristotelian categories, and its offspring: formal linguistics and truth-conditional semantics; reactions against the problems posited by the necessary and sufficient conditions of the objectivist approach to meaning such as the famous bachelor example described by Fillmore (1975, 1978) and Lakoff (1987). Assuming that every category can be delineated by the if and only if condition, the latter amounting to a set of features – both Fillmore and Lakoff argue – bachelor will be a combination of the following attributes: male, adult and unmarried. If this is so, is the Pope a bachelor? And if the answer is ‘yes’, is he as good a bachelor as any 25-year-old male who does not don a cassock and has not yet said his ‘I do’ to anybody? Prototype semantics offers a way out of the bachelor hoax. It has to be said, however, that Rosch herself is not unsympathetic towards the idea of attributes or features. Yet, she does not treat them as necessary and sufficient but as better/more or less representative (see also Taylor 2008). Rosch and Mervis ––––––––– 1

Collins Cobuild English Dictionary

The meaning of meaning

201

(1975) observe that the best members of a given category will have numerous attributes in common with other good representatives of the same category and, potentially, few characteristics that they share with members of other categories. This observation is based on Wittgenstein’s (1953) idea of family resemblance. Based on the example of game as a category, Wittgenstein points out that sometimes it is difficult to come up with a definition of a game which would be suitable for all individual examples. This is because – as we go from one instance of a game, like a board game, to another, such as card games and still other including ball games – we can notice that in every such step some commonalities are preserved and some have to be abandoned. Wittgenstein’s advice on how to find one’s way out of such a maze of such a “complicated network of similarities overlapping and criss-crossing: sometimes overall similarities, sometimes similarities of detail” (Wittgenstein 1953 reprinted in Aarts et al. 2004: 41) is don’t think, look. On applying such a gestalt approach to individual category members we see a set of attributes which are present in some of them and absent from others. Among other examples quoted in literature, we will look at the attributes shared by the more prototypical or central representatives of the bird category examined by Rosch (1975b). The attributes include descriptors such as “has feathers”, “can fly”, “has a beak”, “builds nests in trees”, “lays eggs”, etc. The less such bird-like features are present in a particular category member the closer towards peripheries – and, consequently, other categories – it is. As a result, the answer to the bachelor problem is that some members of a category are more typical or better representatives (an eligible 25-year-old) of the category than others (the Pope). Another approach to the notion of prototype, as pointed out by Harley (2001: 288), is seeing it as “a special type of schema... a frame for organising knowledge that can be structured as a series of slots with fillers ... A prototype is a schema with all of the slots filled in with average values”. If in the case of bird the slots “has feathers”, “can fly”, “has a beak”, “builds nests in trees”, “lays eggs”, etc. are filled in with values that are far from extremes, normal, to use the word in its colloquial understanding (=complying with experience; in Poland this will in all likelihood mean: has average feathers with no special colour pattern; has a medium-sized pointed beak; builds tree nests and fills them in with an easy-tocount number of rather small eggs; etc.). Consequently, based on both family resemblance and the average value slot fillers, a sparrow will, in our country, be a better, more central member of the bird category than a stork or a parrot, not to mention penguins, which are peripheral everywhere outside, perhaps, Antarctica. The fact that category members can be rated in terms of how well they represent the category in question is called the prototype effect. Taylor (2008) notes that this effect can be noticed in numerous experiments concerning the speed of verification (decisions like if X is a member of category Y how good an example X is are easier for central examples); priming effects (category name

202

Chapter Three

primes the most representative category member); or list effects (when asked to list category members, testees list the most prototypical instances first). Such prototype effects underlie categorisation which is radial, in the sense that all members of a given category, interrelated based on different similarity clusters, are situated – at a different radius distance – around the centre of the category. In this form, the model has been widely adopted in cognitive linguistics, among others in the theory of Idealised Cognitive Models (ICMs). According to Lakoff (1987), ICMs – structures we use to describe and understand the world around us – are complex wholes, gestalts, very often imprecise, oversimplified, with a central predictable member of the category (a sparrow in the bird category of an average Pole). However, some Idealised Cognitive Models are cluster models. In such a cluster, there is not one prototypical category member but rather a number of prototypes each belonging to every individual ICM within a given cluster. This is best seen in all kinds of semantic extensions in which these different prototypes motivate various new, often metaphorical senses of a given word. An example of such a cluster model is the one of mother (Lakoff 1987). When asked what the prototype of a mother is, we may be inclined to see a housewife mother as closer to the idealised model than a working mother (especially if we live in a patriarchal society); in any case, many of us may choose the woman who gives birth to a child as the best example. If this is so, the question about proper categorisation of surrogate mothers, stepmothers or mothers superior may arise. Lakoff (1987) suggests a way out by classifying mother as a cluster model radial category, with different prototypes belonging to different participant ICMs. These different models, as mentioned above, will serve as the basis for different sense extensions, such as the ones in Necessity is the mother of invention (sense extension motivated by the birth ICM of mother) or He wants his girlfriend to mother him (sense extension motivated by the nurturance ICM of mother). 1.2. Conceptual and spatial semantics; Jackendoff’s model (2002, 2004) In the light of the above we can ask if semantics is exclusively experiential or if there will be any schematic, abstract ways to meaning, either inborn or acquired. The answer to this question will require a step back towards the bipolar concept of semantics introduced at the beginning of this section. A similar division or formal distinction yet along slightly different lines is reintroduced in Jackendoff (2002), who argues that there are actually two separate but interacting subcomponents: conceptual structure (CS) and spatial structure (SpS). The former is built on discrete features and functions; “it encodes such aspects of meaning as category membership and predicate-argument structure” (2002: 346). Spatial semantics, in turn, is a contribution from perceptual modalities and as such, encodes the “spatial understanding of the physical world” (Jackendoff 2002: 346) such as time, shape, motion, layout in space, etc. For instance, verbs of self-locomotion (Jackendoff

The meaning of meaning

203

2002: 35), such as walk, jog, limp, strut, will be conceptually indistinguishable (the same category membership – a kind of walking; similar predicate-argument structure – intransitive verbs, potentially used with adverbials), and their actual meaning differences, dynamic and static spatial configurations, will be coded in spatial semantics alone. However, Jackendoff himself admits, in certain cases it is difficult to determine where the boundaries between the two subcomponents can be drawn. Clarifying the idea of conceptual and spatial semantics in a later publication, Jakendoff (2004) uses, as a point of departure, Chomsky’s generative grammar in general and, in particular, the concepts of E-language and I-language (Chomsky 1986). While the former concept describes language seen as a kind of external artefact, used in performance, the latter refers to competence (which, to Chomsky, was of primary importance): the internally encoded information, the representation of language in the human mind. When it comes to ways in which the I-language is actually encoded, the explanation will best be motivated by the creativity of language, the fact that generation of an infinite number of utterances is based on finite resources, naturally limited by the finiteness of the human brain. Following this, (Jackendoff 2004: 323-324), “one’s potential repertoire of syntactic structures must be mentally encoded in terms of a finite set of primitives and a finite set of principles of combination that collectively describe (or generate) the class of possible sentences”. Jackendoff uses the above model – and quite obviously Wierzbicka’s construct of semantic primitives (Wierzbicka 1992) and her “alphabet of human thoughts” (1992: 27) – as the underlying concept for his Conceptual Semantics. Assuming that: 1. every syntactic structure conveys a concept and 2. there is no list of possible concepts just as there is no list of possible utterances he concludes that, like syntactic structures, there has to be a kind of conceptual architecture “characterised in terms of a finite set of mental primitives and a finite set of principles of mental combination that collectively describe the set of possible concepts expressed by sentences” (Jackendoff 2004: 324). The two – conceptual primitives and relevant principles – will constitute “the grammar of lexical concepts” (Jackendoff 2004: 324), or rather I-concepts, to emphasise the parallel between the model and the Chomskyan idea of competence rather than performance. The basic units for representing concepts in sentences are words encoded not as lists of instances but as abstract schemata used, in a rule-like fashion, for the categorisation of novel input. When it comes to the nature of a conceptual

204

Chapter Three

schema, Jackendoff (2004) sees it as compositional, because – in the course of the above-mentioned “mental combination” – the principles manipulate the set of primitives into conceptual sets. As for the intrinsic organisation of such sets, Jackendoff (2004) further subdivides the conceptual structure and describes it by means of: – ontological categories and argument structure; – organisation of semantic fields; and, finally – conceptualisation of boundedness and aggregation. As for the first subcomponent of conceptual structure, the major ontological categories include such conceptual constituents as Thing, Event, State, Place, Path, Property, and Amount, which correspond to the basic syntactic constituents of a sentence (the phrases). Each of the constituents will be subject to type-token distinctions as well as be influenced by its predicate-argument structure2. The former conceptual operations include decisions on whether the speaker currently conceptualises the learner as learner as species or this particular learner; in the case of conceptualisation informed by predicate-argument structure, it will be of importance whether X goes, with zero argument slots to fill in, or X goes to Y, the transitivity implying transaction between entities. When it comes to the organisation of semantic fields, Jackendoff (2004) observes that many verbs and prepositions appear across categories/semantic fields forming “intuitively related paradigms” (2004: 333). Based on the localist theory of Gruber et al. (1965 cited in Jackendoff 2004: 333), Jackendoff (2004) theorises that “the formalism for encoding concepts of spatial location and motion, suitably abstracted, can be generalised to many other semantic fields”. Such a suitable abstraction and generalisation can be seen (Jackendoff 2004) in the word go used across the fields of (a) spatial location and motion: go home; (b) possession: go to X; (c) ascription of properties: go from red to green; and (d) scheduling of activities: go from Monday to Tuesday. Finally, the question of aggregation and boundedness is the one of whether a combination of two syntactic structures carrying some conceptual meaning is always appropriate and if not, what the underlying causes of such a discrepancy are. Jackendoff (2004: 335-336) uses the example of certain verbs such as sleep, eat or run in combination with certain adverbials including for hours and until noon claiming that temporal bounding (punctuality) can be imposed only on unbounded processes; process that are already bounded – as in the case of He ran (into the building; 5 miles down the road) – cannot be further bounded without creating new interpretations – He ran into the building for hours implies repetition – or rendering the sentence ungrammatical – He ran 5 miles ––––––––– 2

Cf. verbs of self-locomotion discussed, after Jackendoff (2002), earlier in this section

The meaning of meaning

205

down the road for hours3. This, as Jackendoff notes, is part of the more serious problem of distinction between events and processes or telic and atelic actions: doing something in as opposed to for a period of time; as well as mass and count distinction in entities, where a part of entity (a) is this entity in the case of the former – some water is water; (b) is not this entity in the case of the latter – a bite of a sandwich is not a sandwich. In conclusion, considering the three subcomponents of conceptual structure discussed above as well as the earlier-described composition of conceptual constituents where conceptual primitives are combined based on relevant principles, Jackendoff writes (2004: 338): “[b]eneath the surface complexity of natural language concepts lies a highly abstract formal algebraic system that lays out the major parameters of thought”. There are, however, areas of conceptualisation, where the above mathematics of semantic primitives seems to fail. When discussing lexical schemata and defending the very concept, Jackendoff admits that these schemata may differ between individuals as well as have a certain degree of indeterminacy. A good example of this are colour categories, where, as Dbrowska (2004) observes, semantic decomposition is only partial and leaves “an unanalysed residue” (2004: 106). This is why the construct of conceptual semantics needs enrichment. One such extension is Spatial Semantics (Jackendoff 2002, 2004) with its three subcomponents: (a) spatial structure of objects, (b) focal values in continuous domains and (c) preference rule systems which help encode spatial (=visual) and other sensory understandings of the physical world. As will be seen below, all these subcomponents require input processing based on experience-motivated unanalysable gestalts, the don’t think, look approach of Wittgenstein cited in the discussion of prototype semantics being highly applicable. Another proposal amounts to introducing perceptual primitives alongside the atoms of meaning. These, however, unlike Wittgenstein’s experiential gestalts, are componential (Dbrowska 2004). When it comes to Jackendoff’s model and his Spatial Semantics, the notion of the spatial structure of entities refers to their physical properties that are not semantic primitives and cannot be reduced to a set of +/– features. As a result, this concept is best addressed at the intersection of language and visual faculties. This will refer to questions of what makes a stool different from a chair on the one hand and, on the other how distinctions between walk/stroll/stagger/pace are determined. Jackendoff (2004) argues that a proper conceptualisation of such differences means – in addition to the combination of relevant conceptual primitives – evoking certain extralinguisitc schemata encoding, among others, “geometric and topological prop––––––––– 3

Jackendoff (2004) states that the sentence may be seen as acceptable on condition that 5 miles down the road was where the runner was, and not where he got. In such a case, temporal bounding results in option 1: novel interpretation.

206

Chapter Three

erties of physical objects” (Jackendoff 2004: 338). The model of such schemata that appeals to Jackendoff the most is the 3D model structure proposed by Marr (1982 cited in Jackendoff 2004: 339) which provides for “the decomposition of objects into parts, the geometric systems of spatial axes around which objects are organised, and the relations among the parts” as well as “single objects [represented] in single positions, ... ranges of sizes, ranges of angles of attachment of parts, and ranges of detail from coarse- to fine-grained” (Jackendoff 2004: 339). Consequently, the geometric/topological representation will be a certain stereotyped instance which is easily related to the gestalts called prototypes discussed earlier in this section. In Jackendoff’s model of semantics, such a visually-motivated structure is connected, via an interface, to a relevant schema in the conceptual structure of language (Jackendoff 2002; 2004). In turn, the issue of focal values refers to continuous domains – like the concepts of temperature, size, colour, taste – where conceptualisation is possible only in relation to a certain point within this continuum, which Jackendoff (2004) calls focal value and relates to sensory perceptions – visual, in the case of e.g. colours, gustatory, for taste or tactile, when it comes to temperature. We can assume that such focal values will be what was earlier called average values after Harley (2001), and will be established based on the prototype effect related to previous experience. Finally, preference rule systems, which are Jackendoff’s (2004) final spatial supplement to the mathematics of conceptual semantics, are necessary in the analyses of the so-called cluster concepts, an instance of which was the example of mother discussed, after Lakoff (1987) earlier in this section. For each of the individual senses of the word – motivated by the birth, nurturance or any other of the ICM in the cluster – each individual condition is sufficient for generating the proper sense. There, however, will be preference for the more stereotypical sense, which, as indicated before, will give the biological mother primacy over all other senses. Similar preference rule systems operate in relation to form-function clusters of objects: the more stereotypical chair will be the one formally in accordance with the stereotyped 3-D model (four legs, a back) as well as the one which is functionally suitable (can be used for sitting). Any chairs in whose case one of the conditions is not satisfied – chairs without legs and backs as well as four-legged and backed chairs but made from paper and, consequently, unfuctional – will be seen as peripheral instances of the category. Another crucial aspect of preference rule systems – and a concept of particular importance to the present work, as will be shown in the latter part of the chapter – is that when information about the satisfaction of one condition is missing, the invariable assumption is that it is satisfied as default values. This is illustrated by sentences such as X climbed or X saw Y, in which climbing and seeing are interpreted in their stereotyped versions as no evidence to the contrary is offered. Conversely, when a peripheral interpretation is required, some information necessary to

The meaning of meaning

207

relinquishing the stereotype-related condition is needed. This is explicated by the examples given by Jackendoff (2004: 340), John climbed (up) the mountain and John climbed down the mountain in which up is assumed by default while down needs to be supplied to suppress the upward movement condition in climb. The concept of meaning related to space is also found in Dbrowska (2004), who, quoting a plethora of source texts (for a full overview, cf. 2004: 109) argues for “symbol systems grounded in perceptual experience” because, as she argues, the perceptual image is central to meaning. She observes that the concept behind every phonological label is formed as a result of numerous encounters with the entities behind the label, encounters in which we single out and remember the shared characteristics of these entities. This brings us back to prototype semantics and Rosch’s idea of attributes or features based on which we categorise newly encountered exemplars. However, contrary to Wittgenstein’s concept of family resemblance based on gestalts, we form our judgments compositionally. Dbrowska’s (2004) claim to the fore of such an understanding of perception hinges on a number of neuro-scientific studies which prove that different groups of neurons fire during visual intake of various aspects of the processed image such as shape, spatial orientation and combination (separation, contact, inclusion, etc.), movement and its trajectory as well as their likeness to other images. All this has a number of consequences. First of all, this means that perceptual processing is componential: the said different aspects are processed separately and fed into the overall perception very much in the convergence-zone fashion described following Damasio (1994) in Chapter 1. Secondly, as it is difficult to imagine that in the course of such composition the representations in question will be put together like jigsaw puzzles, we have to allow for a certain degree of indeterminacy in the perceptual image; Dbrowska (2004) points this out. This, in turn, is very much in agreement with Harley’s (2001) idea of prototype as a schema with slots and fillers, clearly convergent with Minsky’s (1975) concept of default assignments in frames (to be discussed later in this chapter). Finally, as Dbrowska (2004) argues, the composition resulting from perceptual experience will be multimodal, combining input from all sensory modalities as well as from the inner voice/eye/touch/etc. modality called introspection. As a result, as demonstrated by Regier (1996 cited in Dbrowska 2004: 111) in his computer model, we have to assume that any instance of processing will be constrained by three types of structures: an orientation-combination structure; a map-comparison structure; and a motion-trajectory structure (Dbrowska, following Regier; 2004: 111). Taking into account the fact that meaning relies on an interaction between conceptual semantics and spatial semantics (Jackendoff 2002; 2004) – or verbal definitions as well as perceptual concepts (Dbrowska 2004) – it has to be stated that we indeed need holistic semantic models but these models should be allinclusive in a way that accommodates the abstract and the specific; the level of schema and the level of instantiation. It seems that such models – “continuities

208

Chapter Three

between language and experience” (Petruck 1996: 1) resulting from the interaction between the conceptual and the spatial/experiential communicating via an interface (Jackendoff 2002; 2004) – will be best represented by concepts belonging to the realm of holistic semantics: frames, cultural models and scripts. All these notions are going to be described in Section 2, following major publications of their chief proponents. Their presentation will be preceded by a discussion of the overall structure that accommodates them known as mental space architecture; they will also be related to inter- and intra-space structuring phenomena such as profiling and zone activation on the one hand and, on the other, the basic notions of the Conceptual Integration Theory (CIT) including mapping, frame shifting and conceptual blending.

2. Holistic semantic models 2.1. Mental spaces Mental spaces, to start with the construct providing reference for the remaining concepts, are our context-based understandings of discourse which can be defined as “domains that we set up as we talk or listen, and that we structure with elements, roles, strategies and relations” (Fauconnier 1994: 2). These domains are both heralded and constructed by grammatical constructions, ranging from single morphemes such as –ed or if to more elaborate discourse markers and structures like John believes that; In the movie; etc. Called space builders, such constructions help us construe meaning by induction: brought together in a sentence – or a bigger part of discourse – they provide a starting point for further conceptual development of ideas. They do not assign meaning, they ensue it. What happens in discourse is that a “linguistically homogenous form” generates “heterogeneous and incomplete information” (Fauconnier 1994: xx) which needs to be filled in as we move along. All of this is based on Turner’s assumption that “[e]xpressions do not mean; they are prompts for us to construct meanings by working with processes we already know” (1991: 206), language being “only the tip of the iceberg of invisible meaning construction that goes on as we think and talk” (Fauconnier 1997: 1), which Fauconnier calls backstage cognition and which is based on the frames, scripts and cultural models discussed further in this section. Going back to the mental space architecture accommodating all of the underspecified meanings which are subject to individual specification and/or negotiation, it is constructed as we move along through a discourse encounter with another individual. It provides the scaffolding which allows us to link concepts and map them onto each other. Consequently, we are able to understand the message of our interlocutor in spite of their shifts in perspective; their leaps from the real to the hypothetical/fictional; the meaning extensions they use etc. As

The meaning of meaning

209

a result, it is possible to comprehend utterances rejected as self contradictory or nonsensical by traditional logic. For instance, Jackendoff’s (1975) famous example of In the painting, the girl with brown eyes has green eyes makes perfect sense because we acknowledge two different conceptual contexts: the real, inhabited by the brown-eyed girl, and the pictorial, in which, for some reason (artistic vision or bad paint quality), the real life model is physically altered. Based on Fauconnier’s (1994) ID (=Identification) Principle, the girl from the real life space is used to identify her counterpart in the picture space owing to the mental architecture linking the two spaces. Similar correspondences/mappings underlie metaphors and metonymies: The Hemingway you are looking for has been sitting on my desk for ages, where Hemingway (=writer) identifies a book written by this writer. The meaning construction as described above will be carried out in a number of ways motivated by various language forms which are granted one of the following statuses (Fauconnier 1997: 40-41): – (the already mentioned) space builders, which are grammatical expressions opening new spaces or shifting focus to another space; they take on a number of forms such as adverbials (in 1990), subject-verb complexes (She believes), prepositional phrases (in reality) or conjunction+clause (If I were you); – names and descriptions, which syntactically take on the form of NPs, represented by proper nouns (Anna; Acapulco), or common nouns, modified or not (a doctor; the doctor we saw at the party); – tenses and moods, determining the focus of a created space, its relation to the base space – which is either the speaker’s reality or any other parent space which is a point of departure for the created space – its accessibility and the location of parts used for identification; – presuppositional constructions like definite descriptions (denoting phrases including the present president of Poland), aspectuals (verbs with distinctive inherent lexical aspect such as begin, repeat, complete etc.), clefts and pseudo-clefts (What I need is a short holiday; A short holiday is what i need) as well as other word order phenomena; they signal that the assigned structure is introduced in the presuppositional mode and will potentially be propagated into neighbouring spaces influencing its definiteness, aspect and emphasis; – trans-spatial operators, be and other copulative verbs that enable links including mappings between spaces (A woman is a rose); and, finally – the Access Principle, also called the Identification Principle – introduced earlier in this section – in the light of which one element, the trigger, names, describes or points to its counterpart, the target, like the model pointing to their image in the-girl-with-brown-eyes sentence.

210

Chapter Three

When it comes to describing the correspondences between spaces in terms of functions, Fauconnier (1994: 3) refers to Nunberg’s (1978, 1979) concept of pragmatic functions for establishing links between objects in discourse. Departing from this concept, Fauconnier (1994) argues that such functions can be motivated by 1) locally pragmatic reasons as well as 2) psychologically, experientially and culturally. In the first instance of correspondence description we will be considering grammatical functions that assign roles such as Agent, Patient, Event, Action etc. or roles in the Fauconnierian (1994) understanding, similar to Jackendoff’s type/token differentiation described in the previous section as in the learner involving a single or variable denotation (=this learner as opposed to learners as such). Another example of this kind is the argument structure with its potential to trigger the conceptualisation of who does what to whom on the schematic as well as the actualuse level (cf. Goldberg 1995, 2006 and her discussion of the ditransitive construction). Still other locally pragmatic functions will be carried out by anaphora or explicit discourse markers in the form of different stylistic devices (see different statuses discussed on the previous page). Such functions will also be motivated implicitly, as suggested by Fauconnier (1994), based on the very genre of the discourse in question. In such a case, phrases like Once upon a time or Did you hear the one about will covertly induce both relevant discourse organisation as well as some narrative schemata typical of a fairytale or a joke, respectively. In turn, functions motivated psychologically will, among others, be the ones that result from the conceptual embodiment of language. Johnson (1987) argues that embodied experience gives rise to certain image schemata. As a result, functions with psychological underpinnings will provide connectors between spaces resulting in ontological (human body is a container) or orientational (up is positive) metaphoric mappings to mention just the most popular examples. Finally, when it comes to experiential and cultural functions, the former will include functions that link authors to books (Fauconnier 1994) as well as, in fact, any two parts of a certain experiential domain producing connections and mappings called metonymies; in the case of cultural functions, space building processes are motivated by experience that is shared with a certain group of people, as in the case of the president (Fauconnier 1994) which, unless indicated otherwise, will mean our president to the compatriots who are interacting. Another example of the culturally motivated construction of discourse may be the way argumentation proceeds: in a top-down or bottom up manner, reflecting a way of thinking that is typical of a particular culture. All this results from the fact that “construction of spaces represents the way we think and talk” even if it “does not, in itself, say anything about the real objects of this thinking and talking” (Fauconnier 1994: 152). What is important to point out – a fact already implied in the paragraph devoted to pragmatic functions – is that mental space construction is not limited to one-sentence utterances. In fact, what is created in every discourse is a complex

211

The meaning of meaning

space architecture or network with intricate patterns which allow the interlocutors to easily move through discourse and to fluently change their points of view and conceptual departure. The difference between one-sentence and broader-discourse contexts is that the former need local, “minimally simple” contexts (Fauconnier 1994: xxi) in which they can be understood: they have to be signalled explicitly by means of constructions such as I believe; If I were you, etc. Longer passages allow for more tolerance or referential opacity: they build more complex referential structures which enable co(n)text-based meanings to ensue; and they are more reliant on the ways in which discourse participants fill in the information gaps based on their shared personal and cultural experience. The fact that we are talking about mental spaces in terms of a referential structure implies interconnectedness; interconnectedness, in turn, means that the various inter-space links differ in terms of their place in the overall hierarchy and. The hierarchy, as described by Fauconnier (1994; 1997), is the result of the fact that every newly created space is linked to its parent space, be it the speaker’s reality or a She believes space, and is situated lower in the hierarchy as depicted in the diagramme below (Figure 10). Space R is the speaker reality space, space M1 is its daughter space and – simultaneously – the parent space for the M2 space; In Christie’s Moving Finger and Jerry Burton believes are the relevant space builders. Figure 10. The hierarchy of parent and daughter spaces for: In Christie’s Moving Finger Jerry Burton believes that the author of the letters should be found.

R

In Christie’s Moving Finger

M1 the space of the book

Jerry Burton believes M2 the space of Burton’s beliefs

that the author of the letters should be found

As for the relevant terminology used by Fauconnier (1994; 1997), the initial, starting space situated at the highest point of the hierarchy and providing the starting point for the whole architecture is called the Base; the space which is the point of departure and the structuring factor for a certain part of the discourse is called the Viewpoint; finally, the space that is of utmost importance to discoursal interpretations in progress, as it internally structures a space currently attended, is called the Focus. In the fragment of architecture presented above, R (=the space of

212

Chapter Three

the current discoursal encounter) will be the Base, space M1, initially in Focus, will be the Viewpoint for M2 which may, depending on how the discourse develops, become the Viewpoint for another space; or a new space may be created from the Viewpoint of one of the previous spaces. We can speculate, given the human tendency for parallel processing described in Part I that in the face of sentential or discoursal ambiguities, one space building device within a parent/base space will generate two or more daughter spaces which both/all will temporarily be part of the structure until the final resolution of the meaning opacity when the focus is subject to bottlenecking. In addition to such hierarchy-building “transspatial operators” (Fauconnier 1994: 143), there will be intraspatial connections established by different space builders, among others the majority of verbs, (Fauconnier 1994) whose connecting power derives from their argument structure. Summing up the referential structure of mental spaces, their architecture makes it possible for grammar and meaning to be associated in the four following ways (based on Fauconnier 1997: 65-66): – a language expression can do many things at once: build spaces, set up new elements within spaces, structure spaces internally, and link them to other spaces; – the effect of language expression depends on the space configuration it operates in: for example the copula be will link spaces differently depending on what connectors (space mappings) are available at the relevant discourse insertion point4; all this – the variable space building potential – is the result of the underspecificity of every language form; – the fact that language expressions underspecify cognitive constructions implies that grammatical constructions will normally generate many formally compatible configurations of the referential structure. They will be processed simultaneously until the ambiguity is resolved owing to constraints of relevance, prototypicality as well as the available default options determined experientially and culturally; and – some logical properties of sentences in isolation will be special cases due to their potential for space construction: when not specified otherwise by the context which, in isolation, is missing, two target spaces – de re and de dicto (given from the speaker’s, as opposed to the thinker’s, point of view5; referentially transparent vs. referentially opaque); or specific alongside non––––––––– 4 5

The potential be-induced mappings include categorization, metonymy or metaphor. Fauconnier (1997: 51-52) gives the example of a certain Ursula who has had James Bond introduced to her as Earl Grey, a tea importer, and thinks he is handsome. From the speaker’s point of view sentences Ursula thinks the British chief tea importer is handsome and Ursula thinks the top British spy is handsome are both true de re, whereas the second sentence is certainly false de dicto if judged from the perspective of the thinker (=Ursula).

The meaning of meaning

213

specific (the learner in the two previously discussed meanings) – will be triggered or set up, as a consequence of the general Access Principle cited earlier in this section. This conflict will potentially be resolved when some contextual cues are provided. The referential structure, as discussed above, will provide the basis for the conceptual structure, which fills in the mental space architecture by means of frames – including cultural models – and scripts. 2.2. Frames and cultural models Frames define the in-space entities of the previously discussed mental architecture in terms of roles and their values. They are defined as data-structures for representing stereotyped situations (Minsky 1974 and Fillmore 1982), like being in a certain kind of place or circumstances (a street; going to church; etc.). Minsky theorises that in every new encounter of a certain item, in the process of the recognition and categorisation of this item, “one selects from memory a structure called a Frame” (1974: 2), which, as can be added here, is the result of a number of previous encounters with such an item. Every frame is based on a network of nodes and relations. Its top levels, to use Minsky’s terminology, are fixed and represent what is always true about the ‘in-frame’ entity; the lower levels have numerous terminals – slots, which, in such encounters, are filled in by specific information not available within the frame. Yet, as Minsky argues, “a frame’s terminals are normally already filled with “default” assignments” (1974: 3), assumptions which come from previous experience. For example, for an utterance like “He is having dinner”, the default assignments we feed into the frame are that he is most probably seated at a table, uses dinner-specific kitchenware and, if we are Polish, we assume that what he is having is probably hot and consists of more than one course. The latter statement – the if-we-are-Polish reference – is convergent with Jaszczolt’s (2005) idea that the salience of certain cognitive defaults will frequently be motivated socially and culturally. As for how individual frames fit into a certain overall system, Minsky (1974) states that frames are not stored in isolation; rather, they form systems of interrelated stereotyped representations, and transformations between these representations mirror or record every important action. To explain the idea, Minsky uses the example of visual imagery: every scene or item will be represented by a series of frames that are related to each other in the system, each frame standing for this scene/item perceived from a different viewpoint; the interframe transformations, in turn, will mirror the real-life effect of changing perspective and observing the scene/item from different angles. This concept is highly reminiscent of the already-mentioned 3D model structure proposed by Marr (1982 cited in Jackendoff 2004: 339), the construct utilised by Jackendoff (2004) in his Spatial Semantics. Adding Minsky’s model to the considerations, we

214

Chapter Three

can speculate that in order to understand certain conceptual subtleties – like the chair/stool difference mentioned before or the dinner situation – we need to activate an appropriate fragment of the frame system as well as the mechanism of inter-frame transformations; we will also probably agree that viewpoint will imply multidimensionality in ways other than purely physical. It is important to point out that such frame-systems are connected to a superordinary information retrieval network which, according to Minsky (1974), partakes in the matching process between a new situation/item and its stereotyped representation. Such matching can be visualised as putting a more schematic pattern, a blueprint, over a particular novel instantiation to check if the latter can be accommodated within the former. Such accommodation means that certain important facets or orientation points – slots/terminals in the case of frames – of the schema and the instantiation coincide. When a novel situational frame does not fit in with the stereotyped frame in terms of terminal/slot values, which means that these values cannot be made to accord with the markers/characteristics of a given new situation/item, the said information retrieval network, as Minsky (1974) argues, provides a substitute stereotyped frame. In our conceptual reality, this will mean that in every new situation we will apply stereotypes constructed as a result of similar situations from the past only as long as the specifics of the new situation accord with those that are already part of the frame. If not, we will discard the retrieved frame and look for a substitute which will more accurately underlie the new experience. This compensatory process, which is possible due to the interconnectedness of frames in the network, is indicative of two different aspects of cognition. One of such aspects will be the potential of the system for different ways of representing knowledge including for understanding based on analogies, etc., discussed in Part I in relation to Damasio’s (1994) concept of brain modularity and to implicit learning modes; another aspect of cognition pertinent to the frame retrieval/matching process is the working memory model seen as spreading activation patterns (cf. Miyake and Shah 1999 cited in Part I). Finally, it is important to describe the ways in which, according to Minsky (1974), the perceived information is structured into frame representations. The question that may be asked – in relation to the previously quoted two-fold concept of semantics – is whether the stereotyped scenes are originally perceived in an analytic, discrete, serial and symbolic way or if these perceptions are holistic. Minsky himself tends towards the serial symbolic explanation, which is most probably the result of the fact that he construed his concept within the realm of computers and Artificial Intelligence; he, however, argues that the fact that perceptions are instantaneous has to be explained based on some field-like global parallel processes, preferably interactions between individual sensations and their abstractions based on the earlier-discussed networks they form. It has to be pointed out that this makes the concept highly congruent with our initial assumption that we need a holistic semantic system to accommodate “continuities

The meaning of meaning

215

between language and experience” (Petruck 1996: 1; cited earlier in this chapter), which was the point of departure for the present section. Elaborating on Minsky’s concept and relating it to other concepts such as Barlett’s (1932) schema6, Bower’s associative relations (cf. Bower 1974, Anderson and Bower 1980)7 and Geckeler’s (1971) semantic field8, Fillmore (1975: 123) writes, very much along the lines of Minsky’s proposal discussed above: The Frame idea is this. There are certain schemata or frameworks of concepts or terms which link together as a system, which impose structure or coherence on some aspect of human experience, and which may contain elements which are simultaneously parts of other such frameworks.

Where Minsky and Fillmore apparently differ is in the acknowledgement of the latter that the area of human experience on which frames impose structure or coherence is (Fillmore 1975) represented holistically rather than analytically: – either as a prototype, an exemplar demonstrated and manipulated but not analysed; or – as Bruner’s (1964) enactive or iconic memory representation; or – as a scene, “each case frame as characterizing a small abstract ‘scene’ or ‘situation’” (Fillmore 1982: 115) understood in quite a general sense as (Fillmore 1975: 124): including not only visual scenes but also familiar kinds of interpersonal transactions, standard scenarios defined by the culture, institutional structures, enactive experiences, body image and, in general, any kind of coherent human beliefs, actions, experiences and imagings.

––––––––– 6

7

8

Eysenck and Keane (2004) note that Barlett was the first psychologist to argue for the importance of underlying previous-experience-based schemata in understanding new stories. He investigated the effect of conflict between the new story and its underlying schema (seen, among others, in the event of listening to a story from another culture) on the remembered version of this story. This is because schemas, as Eysenck and Keane put it, “allow us to form expectations” which, if unfulfilled, make us take actions which interfere with normally semiautomatic data processing (2004: 384). As Anderson and Bower (1980: 36) explain, “the memory structure in HAM [Human Associative Memory] is ... conceived to be a network of associative relations among abstract semantic concepts, an interrelated set of meaningful propositions”. The associations are carried out in working memory where roles and cues of a current situation are matched with those retrieved from the semantic LTM (Gluck et al. 2007), which makes the process similar to the earlier described Minskyan interaction between reality and the frame system as well as the frame system and the information retrieval system. Geckeler (1971) argues for a kind of denotational continuum in which every lexical entry cannot be understood except in relation to its neighbouring entries, all of which belong to a certain thematic set.

216

Chapter Three

The above-listed holistic models, however, do not have to exclude analytic representations. Firstly, as stated earlier in this section, the prototype theory does not exclude the compositionality of semantic primitives or features. Moreover, the very notions of prototype, stereotype or scene imply a certain amount of schematicity, symbolicity or, as Fillmore himself admits (Fillmore 1982: 115; cf. above), abstractness. As a result, similarly to Minsky’s frame model, it will be assumed that, given the above, a frame remains a desirable element in the two-fold conceptual/experiential semantics, where the overall meaning is the result of the interaction between a certain abstract representation and an experiential instantiation. It is also important to note that Fillmore’s (1975, 1982) system described above is a two-level frame-and-scenes referential-conceptual construct. Within this system frames are linked to and activated by other frames based on the similarity of linguistic material they contain, an idea reminiscent Minsky’s systems of interrelated stereotyped representations as well as of Geckeler’s (1971) semantic field. Scenes, on the other hand, are connected with other scenes and have the potential to trigger them by virtue of sameness or similarity (Fillmore 1975), which, in turn, evokes the concept of Bower’s (Anderson and Bower 1980) associative memory. In the light of this concept of frames and scenes activating and triggering other frames and scenes, we can say that Fillmore’s frames, because of the linguistic material they carry, are conceptually close to – and, considering publication chronology, were most probably the inspiration for – Fauconnier’s (1994) space builders. In turn, the scene that a frame brings along with it to fill the in-frame space can be seen as Fauconnier’s conceptual structure. All this is particularly clear vis à vis Fillmore’s (1975: 125) idea of discourse analysis. As the interpreter moves, intellectually, through the text, (s)he develops an image/scene/picture of the world which is built up – in terms of the referential structure, as it seems – and filled out – by means of the individual contributing images/scenes/pictures. The point Fillmore makes is that the interpreter will start with a schematic scene triggered by the beginning of the text, a mere outline with numerous empty slots to be filled in the course of the text-interpretation experience with the above-mentioned conceptual content. This takes us back to Turner’s (1991) idea of expressions as prompts and Fauconnier’s concept of language being the tip of an iceberg of the whole domain of backstage cognition. What is also important in the Fillmore-Fauconnier analogy is that, while moving through the text, its interpreter will “introduce new scenes, combine scenes through history or causation or reasoning” (Fillmore 1975: 125), constructing the whole hierarchy of domains motivated both explicitly – by language expressions – and implicitly based on the experience – psychological, cultural etc. – that the author of the text and its interpreter share (Fauconnier 1994). Finally, it is important to point out that the frames-and-scenes that build discourse architecture can be divided into cognitive and interactional (Fillmore 1982), instances of which are given in the following subsections. The former,

The meaning of meaning

217

based on the commercial event frame as well as cultural models, will be described in relation to a number of cognitive phenomena such as a change of focus, coactivation of two different frames or situations; the latter, including discourse markers and other stylistic devices such as temporal expressions or word order phenomena, are going to be associated with the issue of intersubjectivity. 2.2.1. Cognitive frames As for particular examples of cognitive frames as well as scenes that fill in the slots of the conceptual architecture described above, we will look at Fillmore’s frame-and-scene semantics classics, including, first and foremost, the best-known commercial event (1977 cited in 1982: 116). The schematic scene that is related to it involves a person interested in exchanging their money for goods (the Buyer), a person who wants to exchange their goods for money (the Seller), as well as the objects of the transaction (the Money and the Goods). This general scene is relevant, semantically, in two ways. First of all, it indexes or evokes a number of verbs such as buy, sell, pay, spend, etc. and, consequently, relates them semantically in such a way that in order “to understand any of them you have to understand the whole structure in which it fits” (1982: 111). Secondly, each of the individual words may be used to focus on one aspect of this overall scene, backgrounding another part of it: buy foregrounds the Buyer, the Money and the Goods, simultaneously overshadowing or minimizing the importance of the Seller; pay makes the Buyer, the Seller and the Money salient while rendering the Goods peripheral to the event. All in all the frame-and-scene and its content are processed bi-directionally: whereas each of the words evokes the frame as a whole together with the scene it contains as a memory trace (cf. the PDP model of Rumelhart and McClelland 1986), it is the frame and its scene that give an overall structure to all of the word meanings (Fillmore 1982) and provide the contextual reference needed for the appropriate interpretation of each individual lexical item. In addition to the change of focus described for the commercial event frame, a single situation can be framed in two alternative ways with two different linguistic results. Fillmore (1982) gives the example of somebody unwilling to spend money and argues that his/her situation can be schematised either along the stingygenerous axis or along the thrifty-wasteful axis, each demonstrating a different speaker attitude. In a similar way, a single word, for example angle (Fillmore 1982), may trigger two separate frames with unrelated in-frame scenes depending on whether this term is used in its everyday or technical sense. This cognitive phenomenon is similar to Fauconnier’s (1997) special-case logical properties of sentences in isolation, described earlier in this section, in which the space building process, as a consequence of the general Access Principle, incurs a two-targetspace set up. In both cases – two frames; two spaces – the conflict can only be resolved on condition that it is context-informed.

218

Chapter Three

Finally, when it comes to frame-based cognitive phenomena, as Fillmore argues – on the basis of bachelor adapted to mean ‘a male fur seal without a mate during the mating season’ (1982: 125) – frames may be borrowed to produce new senses of a given word. An instance of the process will be the word mouse extended to mean a computer device based on the word’s frame including, among others, the concept of a grey ovalish entity with a tail. Returning to instances of cognitive frames filled in with scenes or scenarios, examples such as breakfast or weekend, discussed in Fillmore (1982), are slightly different in the sense that they are culturally bound. In addition to all of the scene participants and objects of their actions they need background reference in the form of a three-meal nutrition model in the case of the former and the calendric seven-day cycle for the latter. As such, they are socially shared frames which Coulson (2005) defines as cultural models, in whose case the analysis of meaning has to stem from anthropological models, conceptual compounds surrounding homo loquens and forming a part of their subjective background (Bartmiski and Niebrzegowska 1998). Cultural models sanction every interpretation of meaning by filling in such socially-based slots in the frame. As a result of cultural differences, these particular slots may be filled in differently and lead to slightly, considerably or totally different conceptualisations of in-frame scenes, depending on how big the differences in question are. To prove this point, we can refer to Fillmore’s (1975) example of the English write frame and the Japanese kaku frame. Despite sharing a number of concepts such as the person of the writer, the trace that is left as well as the implement and the surface, the two frames differ in the nature of the product which may be an indefinite sketch or a doodle for kaku and not for write. 2.2.2. Interactional frames An important issue to be addressed for the phenomenon of interactional frames to be satisfactorily explained is the question of subjectivity. It is, as pointed out by Sinha (1999), the contribution of cognitive semantics to Aristotelian category realism. In the light of Aristotle’s concept of meaning, understanding is motivated by experience which is the same for every human being as it results from contact with the same objective reality. While cognitive semantics includes Aristotle’s concept of the experiential source of meaning, its proponents argue that this experience is subjective and not objective, a claim included in Barmiski’s concept of homo loquens cited in the previous section. The questions that can be asked are: What kind of subjectivity are we dealing with? And how possible is it that my subjective meanings can actually be our subjective meanings? These questions are related to the problem of “uncertainty and ambiguity about the status of images, concepts and meanings ... and a vulnerability to the Other Minds objection” (Sinha 1999: 228). As a way out of this my/our subjective meaning

The meaning of meaning

219

hoax – which, as it seems, is bound to make interaction impossible – Sinha (1999) offers the alternative of intersubjectivity and referential realism. In the light of the proposal, meanings are not mental objects but acts which are “subjectively construed so as to make sense in an intersubjectively shared universe of discourse which is continuous with (not separate from) the material world in which other (non-discoursive) human activities are carried out” (Sinha 1999: 232). This joint reference of discourse participants – rather than truth, as declared within the objective, Aristotelian stance – is a factor that makes communication possible. Contemporary discussions of intersubjectivity relate this joint reference to shared knowledge, or a co-conception of the world (Overstreet 1999: 66); an earlier-used term – reciprocity of perspectives – implied assuming mutual experiences and disregarding personal differences. An interesting perspective on the co-conception is Tomasello’s (1999, 2005) theory of mind or, as he himself puts it, the theory of intention reading, in the light of which language learning – including the learning of meaning – is part of a broader process of collaborative cultural adaptation. As Moll and Tomasello (2007) point out – referring to Vygotsky’s (1978) general theory of culture – while other primates are motivated by social competition, humans are driven by group collaboration. Development of technologies, cultural institutions and – most importantly to the present work – the acquisition of symbolic systems like language are driven by such cooperation. This claim is based on what Moll and Tomasello (2007: 1) call the Vygotskian intelligence hypothesis and what is later developed by Hermann et al. (2007) as the Cultural Intelligence Hypothesis. Related to both hypotheses is the afore-mentioned Tomasello’s theory of mind. It is based on the acknowledgement that human beings possess “the foundational skill of understanding intentions” (Tomasello et al. 2005: 675) which hinges on the following abilities (Tomasello 2005: 3; abridged from research to date): – the ability to share attention with other persons to objects and events of mutual interest; – the ability to follow the attention and gesturing of other persons to distal objects and events outside the immediate interaction; – the ability to actively direct the attention of others to distal objects and events; and – the ability to culturally (imitatively) learn the intentional actions of others, including their communicative acts underlain by communicative intentions. These four abilities are reinforced, as Tomasello (2005: 4) points out, by patternfinding and skills including: forming perceptual and conceptual categories; construing schemas for recurrent patterns; performing statistically based distributional analysis of sequences; and creating analogies across two or more

220

Chapter Three

complex wholes. As such, they “are necessary for children to acquire the appropriate use of any and all linguistic symbols including complex linguistic expressions and constructions” (Tomasello 2005: 4). How do we make intersubjective meanings? As Tomasello notes, linguistic symbols are learned culturally in a bi-polar way, “not only the conventional form of the symbol but also its conventional use in the acts of communication” (2005: 12). The said cultural (imitative) learning is also bi-directional/intersubjective – all users know their conversational partners share the conventions. The linguistic symbols are used dyadically or triadically, fulfilling the regulatory and referential functions of language, respectively. What is also important is that linguistic symbols are perspectival in the sense that a person may refer to a given entity or event in a number of different ways depending on their own mental viewpoint and/or communicative goal. This, however, needs to be done “with respect to the listener’s attentional states” (Tomasello 2005: 12) to make sure the interlocutor partakes – truly intersubjectively – in our understanding of the situation. The above intersubjective perspective leads, in a sense, to the second type of frames mentioned in Fillmore (1982) called interactional, which give structure and coherence to “what is going on between the speaker and the hearer, or between the author and the reader” (Fillmore 1982: 117) by means of different deictic references as well as – even more importantly – through what Sinha (1999: 233), after Shotter (1993) calls “joint action, joint attention and joined intention” between discourse participants. As a consequence, interactional frame-and-scene semantics will deal with the schematisations of communicative situations by means of tenses, person markings or demonstratives on the one hand and, on the other, the knowledge of illocution, cooperative principles and conversational routines as well as all of the common and negotiated meaning before homo loquens and – equally importantly – homo audiens. If we decide to look for analogies between Fillmore and Fauconnier once more, the interactional frames (Fillmore 1982) resemble pragmatic functions – local and implicitly motivated – used in the construction of the mental architecture of discourse (Fauconnier 1994). Summing up, we can say, quoting Fillmore (1982: 11), that frame semantics – called semantics of understanding or U-semantics in a later publication (Fillmore 1985) – “offers a particular way of looking at word meanings, as well as a way of characterizing the principles for creating new words and phrases, for adding new meanings to words, and for assembling the meaning of elements in a text into a total meaning of the text”. What is important for the present argument – as will be demonstrated in later parts of the present work – frames and scenes, as theorised by Fillmore, are triggered not only by words – as can be seen in the examples quoted above – but also by grammatical structures or linguistic categories, which “can get associated with the prototypical instances of scenes” (Fillmore 1975: 124).

The meaning of meaning

221

2.3. Scripts Finally, we look at scripts, which have been used interchangeably with frames by Fillmore himself, but are, in fact, a separate construct introduced and popularised by Schank and Abelson (1977) in their study of the structure of human memory. The study, which was aimed at solving problems in text comprehension tasks performed by a computer programme, emphasised the importance of narratives in human cognition. As Schank and Abelson state in a much later publication (1995: 1): – virtually all human knowledge is based on stories constructed around past experiences, – new experiences are interpreted in terms of old stories, and – the content of story memories depends on whether and how they are told to others, and these reconstituted memories form the basis of the individual’s remembered self. In addition to such individual stories, there are, according to Schank and Abelson (1995), social stories based on common memories, stories which may accord or compete with individual stories. In the light of all this, the analogy between scripts and frames is quite obvious: in both cases all new input is categorised on the basis of schematised old experiences; frames and scripts alike can be personal or shared, the social stories mentioned above being highly reminiscent of cultural models. The chief difference between the two concepts is that frames, which contain situations or scenes, are naturally more static while the story-based scripts will tend to be more dynamic. One more important similarity between the two experiential semantic models lies in the fact that scripts as a way of representing knowledge – like frames – gained their popularity at a time when there was considerable interest in Artificial Intelligence, when the main concern was the one of how certain areas of human knowledge can be utilised in the creation of computer programmes. In this sense scripts as knowledge representations are a functional concept where the knowledge is best understood based on what people do when they utilise it rather than to what exactly the object of this knowledge is. Listed below are sample answers to the what people do when using knowledge question (Schank and Abelson 1995: 2-3): – – – – –

people answer questions; people make plans and inform others of them; people comprehend what others are saying; people inform other people of events that have taken place; and people give advice to other people.

222

Chapter Three

Acknowledging the fact that the above list is incomplete, Schank and Abelson (1995: 3) conclude: “[i]n each and every one of the situations listed earlier, the knowledge that people use to help them is encoded in the form of stories”. Consequently, stories, as argued later in the cited work, will be behind the facts that we know and they will also help us in the face of new situations. In the case of the former, every statement we make, like I used to live in Wrocaw, is a summary of a longer story, evoked for any of the purposes discussed above like making plans (I may be considering moving house back to my hometown) or giving advice (Since you are going to Wrocaw, I am the best source of relevant information). Even an apparently factual statement like Wrocaw is a city in Lower Silesia is also a trigger for a number of stories, personal and culturally shared, self-produced or heard, about the very place in question. As for the latter issue, using old stories to categorise, address or find solutions to new situations, Schank and Abelson (1995) argue that – when faced with a novel experience – we activate scenarios of old events and, based on similarity, decide which of the pastbased narrative schemata matches the newly encountered experience; a process in a way similar to the matching between a new situation/item and its stereotyped representation described by Minsky (1974) in relation to his frame-based superordinary information retrieval network. In the case of scripts, when comparing the old and new circumstances we adapt old stories in ways which accommodate the new course of events. Following this, we can imagine that every new acquaintance will be made in relation to previous personal encounters of this kind, and the necessary adaptations will potentially result in incorporating other similar scripts; similarly, a mismatch between an old schematised frame and a new in-frame entity resulted in the activation of a substitute frame of reference. Consequently, the new story will bear a resemblance to a number of other previously “written” stories, starting like one of them, continuing as another and finishing as still another, all of the individual scripts being “a set of expectations about what will happen next in a well-understood situation” telling us “how to act without our being aware that we are using them” (Schank and Abelson 1995: 5-6). Finally, the discussion of scripts is going to return to where it started. What may be important to point out is that a concept which used human psychology as the point of departure for an AI study returned to psychology in the form of Hermans’ valuation theory with life-story themes of S (=self-enhancement) and O (=the need to affiliate with others) seen as an indispensible element of our identity (cf. Hermans and Kempten 1993, among others). Similarly, Bruner (2004) argues that we naturally portray ourselves through stories, the self-telling usually being provoked by “episodes related to some long-term concern” (Bruner 2004: 8). Such long-term concerns may be the four life scenarios created under the influence of the environmental and personal experience proposed by McAdams (1985, 1993), who, based on more than 200 life interviews, defines human life narratives in terms of:

The meaning of meaning

223

– tone, dividing them into: a romance, in which life is an eternal quest and a never-ending struggle with evil; a comedy, where life is a series of unrelated episodes: one cannot control everything but success is possible owing to good luck and optimism; a tragedy, whose protagonist is a victim doomed to failure; and finally an irony, where the protagonist distances him/herself from the course of action; – imagery: the metaphors through which one perceives one’s life, like the wellknown life is a journey metaphor; – themes: like Hermans and Kempten’s (1993) division, of two kinds – 1) themes of agency or autonomy as well as 2) themes of communion with others. Going back to scripts as understood by Schank and Abelson (1977, 1995), we can speculate that the tone, imagery and theme of our life stories will be the cornerstones holding together the general script of our life which will, potentially, motivate all our individual scripts triggered in particular instances of reasoning, acting as our own individual interpretative defaults. The personal tone-imagery-theme defaults will not, however, be the only defaults motivating our understanding of scripts. In order to see and understand a whole array of such defaults it is necessary to relate the notion of scripts to a broader area of what is known as narrative semantics, “narrative ... understood to encompass productions of a certain kind, whether these are conversational, written, theatrical, filmic, or pictorial” (Talmy 2000: 417) and constituting a cognitive framework for Narrative Structure. According to Talmy, the goal of narrative semantics is to “develop a comprehensive framework for ascertaining and characterising the structure of all existing and potential forms of narrative, as well as the larger context within which the narrative is located” (2000: 417). This larger context is placed against a more comprehensive background of interacting patterns forming cognitive systems that will create “a single pattern understood as a story” (Talmy 2000: 418). Such a story pattern can be understood through three divisions in which the narrative context is seen: domains, strata and parameters. Domains include (Talmy 2000: 423): the spatiotemporal physical world; the culture or society with its conceptual and affective structure, its presuppositions, values, norms and so on. The meaning of narratives will also be related to the generative (speaker) and interpretative (listener) activity of the human mind as well as seen in the context of the narrative itself. As for strata (Talmy 2000: 438), they determine the narrative semantics through: the temporal structure, including the relation between the narrative time and the producer/experiencer time as well as the internal structure of events; the spatial structure and its static and dynamic components such as region and location (the former) and path and placement (the latter); the causal structure – “the physics of matter and energy in space and time”; and the psychological structure of the producer/experiencer (intellective and affective; values; personality). In relation to the above, strata will include the notion of the

224

Chapter Three

perspective point, both spatial and attitudinal, of both the individual and the group/society related to the narrative in question. Finally, parameters (Talmy 2000: 439) include: the relations between individual domains and strata (inclusion, co-extension, partial overlap, separation); relative narrative quantity (scope, size of narrative subdivisions also called granularity, density; degree of differentiation (vague/clear; sketchy/elaborate, implicit/explicit; the type of sequential structure; and evaluation. All these domains, strata and parameters will potentially serve as defaults for an individual script related to the story we are following, the (auto)biographic narrative, history, etc. What is of particular importance here – and constitutes the most important reason for quoting Talmy’s (2000) types of story pattern divisions – is the question of defaults. Introduced with the concept of frames at the beginning of our discussion of holistic semantic models and revisited above, defaults will be crucial to the argument offered in the present work in ways that will be clarified in Part III. At this point it will suffice to highlight the most important aspect of defaults: they are co(n)textual as well as extra-contextual, the latter case involving numerous layers of background knowledge involving all of the psychological, cultural, autobiographical, axiological, physical as well as other domains, strata and parameters quoted after Talmy above. All this is in concord with what Jaszczolt includes in her proposal of Default Semantics. She declares (2005: xvi-xvii): – the theory of discourse meaning (meaning of utterances and their sequences) is truth-conditional9 and dynamic; – pragmatic information, such as the output of pragmatic inference or defaults, contributes to the truth-conditional content; – the representation of the truth-conditional content is a merger of information from (i) word meaning and sentence structure, (ii) conscious pragmatic processes and (iii) default meanings. It is called a merger representation; – default meanings are conceived of as (i) cognitive defaults stemming from the properties of the human thinking process and (ii) social-cultural defaults, stemming from the way society and culture are organised; and – the merger is represented by using the principles of dynamic semantics, in particular those of Default Semantics. Applying the dynamic approach to mergers makes it possible to represent the meanings of multi-utterance discourses. What is particularly important is that Jaszczolt – unlike the proponents of the Relevance Theory (cf. Sperber and Wilson 1986 a and b) but very much along the lines of Turner’s (1991) expressions as prompts as well as Fauconnier’s (1997) ––––––––– 9

Truth-conditions relate to the meaning the speaker intends to communicate (Jaszczolt 2005).

The meaning of meaning

225

backstage cognition and language as a tip of the iceberg – believes that semantics is underspecified, meaning is normally underdetermined and defaults are “salient interpretations arrived at without the help of the context of the particular situation in which the utterance was uttered” (2005: 5). The present work will move along a similar route10. Putting it all together, we can say that, most importantly, the holistic experiential semantic models – frames/scenes, cultural models and scripts – are the conceptual structures filling in spaces in the referential network being constructed as we move through discourse. The links in this network are established by a plethora of space builders – explicit, linguistic as well as implicit, pragmatically, psychologically and culturally motivated, all of which have been discussed in this section. What remains to be clarified is the nature of the connections – once they are formed – as well as some in-space mental activities which underlie decisions of what connects to what in the complex mental architecture. That is why the following section looks at the phenomenon of mapping as well as the question of perspective and related processes known as profiling and zone activation.

3. Holistic semantic models: the intra- and inter-space processes 3.1. Mappings From the nature of mental spaces and their parent-daughter interrelations we can conclude that the construction of mental architecture requires certain cognitive links to be established between the entities which are accommodated within the network. These abstract correspondences or mental links we create are called mappings11 (Fauconnier 1994 and 1997, Coulson 2005, among others). Sometimes, as in the case of the “brown-eyed girl” – an example given earlier in the section devoted to mental space architecture – meaning construction is possible due to mappings sanctioned by the then-quoted ID principle, based on which a certain element in one space, called the trigger, is a means of identifying and mentally accessing the target element in another space. The process is slightly ––––––––– 10

11

This understanding of defaults is also in agreement with the definition of semantics proposed by Stern (1983) and quoted in the introduction to Part II, where the interpretation of a language unit depends on linguistic and conceptual as well as affective and sociological meaning. The most obvious type of mapping theorised by cognitive linguistics is the mapping between FORM and MEANING, which, is the primary interest of the present work and is repetitively referred to. This kind of mapping, however, does not seem to need a detailed discussion: the present work assumes that the intricacies of this mapping are best explained by what meaning is – as it is rather uncontroversial what form is – which is exactly the aim of Part II.

226

Chapter Three

more complex in the case of metaphor, where the intended meaning results from mappings between source and target spaces which are reinforced by the similarity between these two spaces motivated by their common generic space. The whole process – together with the different types of spaces – is going to be discussed in a more detailed way in Section 4, which presents the theoretical model of the Conceptual Integration Theory. At this point however, the very mechanism of metaphoric mapping will be explained based on two examples. In a sentence like Consumption is the engine of economy (discussed by Sweetser 1999), the metaphoric meaning is clear owing to the mental links we establish between consumption in the target space and engine in the source space; in turn, economy in the target space is mapped onto a missing element – car or train – which belongs to the source space. As mentioned earlier, what sanctions the mappings by making them “cognitively realistic” (Sweetser 1999: 133) is the generic – or inherent similarity – space containing the propeller or driving force similarity between consumption and engine. However, it has to be pointed out that the way in which certain entities are mapped onto one another is often less straightforward and proceeded by quite complex in-space structuring processes. We can see this in the example of mappings between the source domain rose and its two different target domains sanctioned by two different metaphors. In the case of a woman is a rose, the aspect of the rose domain that is subject to metaphoric mapping onto the woman domain is the broadly understood beauty of the flower. In turn, the metaphoric mapping in life events are roses, necessary to understand the proverb There is no rose without thorns, is based on links which are created along two different lines: between the floral beauty and the charm of human existence on the one hand and, on the other, between the thorns of a rose and existential hardships. What is important to point out is, considering the moralizing tone of the proverb, that the latter thornrelated mapping is, perhaps, more important in this metaphor. Intra-domain differences, like the ones triggered by the two different rose mappings described above, are related to inter-domain processes such as fore- or backgrounding of certain in-frame elements, a phenomenon that was described by Fillmore (1982) in his example of the commercial event. Cognitive linguistics calls such phenomena profiling or zone activation; both of them are described later in this section. First of all, however, the concept of perspective – which is crucial to the understanding of these two phenomena – is going to be discussed. 3.2. Perspective The ways in which we construe the in-space content on the one hand and structure the inter-space architecture on the other are related to how we look at a certain chunk of reality. The word look evokes visual experience as the primary association. Consequently, the question of differences in perspective can, first of all, be linked to the act of observing a scene/item from different angles, as a result

The meaning of meaning

227

of which some parts of this scene/item are foregrounded while others occupy the periphery of vision or are not seen at all. All this has, to a certain extent, already been referred to in the present work: the different-angle observation was part of Minsky’s (1974) idea of a series of frames, Marr’s 3D model (1982 cited in Jackendoff 2004: 339) and the related construct of Jackendoff’s (2004) Spatial Semantics; the question of visual fore/backgrounding, in turn, was discussed in Part 1 in relation to attention. However, looking at reality also implies seeing in another sense of the word which is interpreting and understanding. From this point of view perspective can be explained in a number of ways. First of all, perspectivisation, a term Taylor (2003: 93) uses after Dirven et al. (1982), will mean highlighting “different components of frame-based knowledge” of semantically complex words, like the case of the commercial event frame-and-scene, where each of the different verbs (sell, buy, pay) allows a different perspective on the event to be assumed involving directing the spotlight of attention to a chosen aspect of the manifold domain. Such selective highlighting of certain aspects of in-frame content is a means of rendering specific meanings including metaphor, which requires perspectivisation of a certain sub-domain which is subject to metaphoric meaning extension. Secondly, perspective will also be assumed and change dynamically in discourse depending on which space – Base, Viewpoint or Focus (Fauconnier 1994, 1997; see earlier in this chapter) – is currently our point of departure. Finally, perspective can be understood as a way of looking at reality and interpreting it based on one’s individual or shared viewpoint motivated by experience. Such a perspective will be the result of the already mentioned implicit space builders which provide psychologically, experientially and culturally motivated correspondences between spaces in mental space architecture. To comprehensively present all of the different types of perspective and perspectivisation, we will start by looking at it as a general framework accommodating notions such as figure/ground alignment, viewpoint as well as deixis and subjectivity/objectivity described after Langacker (1987, 2000). Then we will look at perspective in discourse, following Fauconnier’s (1994, 1997) observations. 3.2.1. Perspective in the scope of predication12 The first notions related to the concept of perspective are figure/ground alignments. According to Langacker (1987), a figure is a certain entity that stands out from the rest of the scene because of the special prominence it is granted; the ––––––––– 12

Scope of predication, or the Base, are those aspects of a scene that are specifically included in a particular predication (Langacker 1987: 493)

228

Chapter Three

remaining non-prominent part of the scene is the ground. What makes a good figure (Langacker 1987) are the following criteria of prominence: – the compactness of the entity in question, which, naturally, makes focusing easier. This means that a single player will be a better figure than a whole football team; as well as factors such as – motion, because it is easier to single out a moving entity from an otherwise stable scene; or – sound, which easily stands out as figure against the background of silence. Figure/ground alignments are, as Langacker puts it (1987: 120), “a valid and fundamental feature of cognitive functioning”, part of which is language functioning. As a consequence – in addition to the visually/auditorily perceived prominence of the type described above – figure foregrounding will be observed in grammar in the form of profiling (to be discussed later in this chapter) as well as subject/object, head/modifier and other linguistic hierarchies (Langacker 1987); it will also be present in word order phenomena such as thematisation as well as the overall organisation of discourse including proceeding from the general to the particular, from thesis to conclusion as well as all of the other foregrounding discoursal phenomena which were discussed in relation to pragmatic functions in the context of mental space architecture described earlier in this chapter. When it comes to viewpoint, it – like other aspects of perspective – is most easily understood in the context of visual experience, language being a means of expressing a particular viewpoint. As Langacker (1987) states, we can look at a certain object from different sides, getting, accordingly, different perceptions of it; we can also observe complex scenes with multiple participants, assuming positions close to some of these participants. All this directly affects our estimation – and, consequently, linguistic description – of two aspects of the observed scene: proximity and salience. These two are related to more specific viewpoint notions: vantage point, referring to the position from which the scene is observed; orientation, which amounts to how the viewer organises the scene in terms of 3D relations; and prominence, which influences our choice of X above/behind Y over Y below/in front of Y or vice versa. Normally, as pointed out by Langacker (1987), our ways of looking at entities – and, consequently, the linguistic means we use for their description – are canonical in the sense that our conceptualisations are in agreement with the alignment of human visual field axes. Such canonical alignments are typical of and reflected in conventional pictorial representations: human beings are

The meaning of meaning

229

portrayed as standing on their legs rather than heads; animals are depicted from the front or side and not from the back; a flying kite will have the string attached from the bottom and not from above; etc. In language, in turn, we choose to say that The picture is over the table rather than The table is under the picture and any unorthodox non-prototypical verbalisation will be possible only if there are special reasons for it: The table is under the picture may imply some special prominence of the picture (expensive; of sentimental value to the speaker; important to the listener, etc.). In addition to sensory experience, vantage point and observation are central to language meaning phenomena, such as relational (prepositional) expressions and – most importantly to the present work – tense and aspect (to be discussed in Chapter 4). In turn, deixis means defining an entity by means of reference to another entity which is “a ground element in the scope of predication” (Langacker 1987: 126). As a result, some expressions, which are inherently non-deictic such as pencil or stand, will be located relative to the ground through the use of linguistic means such as, for example, demonstratives – this pencil – or tense and aspect as in is standing; this in this pencil and the present progressive tense in is standing will be means of referring the non-deictic elements to the ground which, in both cases, is the moment of speaking. Deictic – or grounding – expressions fall into a number of categories. Langacker (1987: 127) lists expressions like I and you designating speech act participants, or here and now pertaining to the place and time of speaking. They will also have different specifications: such expressions as the ones listed above are characterised by positive specification; negative specifications (not X), in turn, can be found in the third person pronouns (not I) as well as in there and then (not here and now). Deictic expressions, regardless of their function and specification, can be further divided into explicit – as in the case of I or this – as well as implicit, including, among others, tense and aspect, whose grounding potential is implemented indirectly: is standing only implies what here and now communicate directly. Deictic in a similarly indirect way are prepositions linking two entities, used in contexts from which a tripartite relationship can be inferred. This pertains to descriptions such as The car is in front of the church, which presuppose the existence of the third entity – the speaker – unless indicated otherwise (Langacker 1987). Such use of reference, linguistic and cognitive, to an element in the scope of predication via a different element is, according to Langacker (2000), ubiquitous and, consequently, fundamental to both language as a system and the organisation of thought. Reference point constructions in language – as he calls means of linguistic expressions utilising deixis such as the ones quoted earlier – demonstrate that two or more entities are in mental contact, which, as was shown above, can be either explicit or “our reference point ability [will remain] below

230

Chapter Three

the threshold of explicit attention” (Langacker 2000: 173). The process of making reference, diagrammed in Langacker (2000: 174), can be explained in the following way: the conceptualiser (C) uses R (=reference point) to mentally access T (=target), both R and T are located within the same dominion or domain; the mental path in the conceptualiser will go from C to T through R – in this particular sequence – to indicate what is accessed through what. In addition to the deictic expressions mentioned above, Langacker (2000) gives numerous examples of referring to entities via other entities such as quasi-possessive constructions (including the dative-shift construction in She gave me a book); constructions in which the reference is provided by context and the topic; examples of metonymy; and many others. Finally, the subjectivity/objectivity dimension of perspective refers to the role of the speaker/hearer as the source of predication. Taking into account that, as Langacker (2000: 203) puts it, “[w]ith the possible exception of God, there is no such thing as a neutral, disembodied, omniscient and uninvolved observer”, we have to address the question of objectivity/subjectivity based not on how involved/uninvolved the speaker/hearer is but rather on how (s)he conceptualises the scene in relation to self. This amounts to the observer choosing appropriate focal settings to establish a construal relationship between themselves and the scene (Langacker 1987: 128). In relation to this, the terms subjectivity and objectivity can be defined as the perceived relationship between self [S] and other [O]. In what Langacker (1987: 129) calls the optimal viewing arrangement, [S] and [O] will be fully distinct, [S] will be separate from [O] and the attention of [S] will be intensively focused on [O]. In such a case, the observer will be fully subjectified13, the scene – fully objectified, and the role of the observer will be one of a conceptualiser and not of a scene participant. In turn, in the egocentric viewing arrangement (Langacker 1987: 130) the boundaries of the scene will expand to include the observer and their surroundings. In this way the speaker becomes objectified: the object of the observation, the attention focused on [S] as well as on [O]. There are a number of deictic expressions which put the speaker in focus as part of the objective scene. They include the I pronoun, the earliermentioned prepositional expressions implying a tripartite relationship as well as certain temporal expressions – such as the progressive aspect – to be discussed in the next chapter. ––––––––– 13

This interpretation results from Langacker’s (1987) special interpretation of the notions of subjectivity and objectivity. Subjectified means given the status of the subject understood as the objective (in the generally accepted sense), uninvolved viewer; objectification of the viewer, in turn, will imply their becoming part of the observed scene which involves a highly subjective (in the generally accepted sense, again) attitude on the part of the viewer.

The meaning of meaning

231

3.2.2. Perspective in discourse In turn, when it comes to the perspective we adopt when moving through discourse – the perspective that enables us to correctly interpret the whole discoursal encounter and, consequently, to follow its paths – is based on the three already-introduced standpoints: the Base, the Viewpoint and the Focus (Fauconnier 1997). To briefly revisit their definitions, the Base is the point of departure for the whole discoursal encounter, the point to which the interlocutors can always return, a kind of chief landmark delineating every new path of the exchange between the discourse participants. In turn, the Viewpoint is an interim point of departure, the starting space of a new topic which is introduced at a certain point in the discourse and is related to the Base. Finally, the Focus relates to the way in which the current space is structured internally in the sense inherent in the notion of perspectivisation as understood by Taylor (2003; see earlier in this section), whereby we are highlighting an aspect of the in-space complex content. To take the reader on a discourse perspective tour leading from Base to Viewpoint and Focus, we will consider the following discourse sample: Prototype semantics is a fascinating area of knowledge. It has been subject to numerous studies. It originated in the 1970s as a result of Rosch’s investigation of the concept of COLOUR. Yet, it had most probably been largely inspired by Wittgenstein’s concept of family resemblance.subjecting it to the analysis inspired by Fauconnier (1997: 73-75).

In the example above, the information that prototype semantics is a fascinating area of knowledge will be the Base of the discoursal snippet as well as its initial Viewpoint or current vantage point: a mental spot from which we can observe the whole discoursal event like a commander-in-chief surveying a battlefield from a hill. In the next sentence we introduce an additional piece of information: that prototype semantics has been subject to numerous studies. In discourse architecture this new sentence will be represented by means of a past event space but because of the present perfect tense used in it as well as the anaphoric it, the initial Viewpoint – present – and the Focus – prototype semantics – is preserved. In the next sentence, the space builder in the 1970s sets up a new Focus space, the one of Rosch’s study of colour. The Viewpoint space is still the one of our present outlook on prototype semantics; the perspective between the Viewpoint – now – and the Focus – the study in the 1970s – is indicated by the past tense. In the final statement, the Focus of the previous sentence becomes the Viewpoint as we see the new Focus – Wittgenstein’s concept – from the perspective of Rosch’s study, which is signalled by the past perfect tense used in the Wittgenstein sentence. Throughout the discoursal trip, we have managed to retain the basic perspective by keeping our mental eye on the Base, aware of the fact that the point of departure of all of the considerations is the notion of prototype semantics. At the same time, we have also assumed a number of interim perspectives, changing our

232

Chapter Three

vantage points and, consequently, Viewpoint from the prototype semantics to Rosch’s study. The whole four-step process of constructing the architecture of the discourse in question can be diagrammed in the way presented in Figure 11: Figure 11. Perspective in the Prototype Semantics discourse. B=Base; V=Viewpoint; F=Focus; E=Event (1) B/V/F

(2) B/V/F

(3) B/V

proto type seman tics

proto type seman tics

subject to

E

(4) B

proto type seman tics

subject to

proto type seman tics

Rosch’s study

E/F

subject to

numerous

numerous

numerous

studies

studies

tudies

Rosch’s study

V

F/E Wittgen stein’s concept

3.3. Profile Departing from the concepts of predication and designation, the latter “characterised by the elevation of some entity to a special level of prominence within predication” (Langacker 1987: 183), Langacker refers to the predication as base and to the designatum as profile which “stands out in bass-relief” against the base (1987: 183). When it comes to the meaning of an expression, it springs from both the profiled element as well as its place within a broader domain. To clarify the concept of standing out in bass-relief, Langacker (1987) gives numerous examples, the one of uncle being perhaps the most effective. Its base is the concept of a family tree, a network of persons connected by virtue of childbirth or marriage; its profile is the substructure of the network that directly pertains to the meaning of uncle which will be the I-parent-his/her male sibling area. As indicated above, the meaning of the word will be derived from both, its profile as well as the vast interconnected set of domains covering concepts such as “the conception of a person; the differentiation of people by gender and the associated notion of sexual intercourse; the concepts of birth and the life cycle; the parent-child ... and the sibling relationship[s]” (Langacker 1987: 185). A similar interaction between the profiled substructure and its base can be observed in the semantics of Fillmore’s (1982) commercial event frame-and-scene: individual

The meaning of meaning

233

words (sell; buy; pay) foreground/profile some in-frame relations and background others; however, every one of these profiled relationships can only be understood based on the commercial-event domain as a whole. What will be designated/profiled from different predications/bases is, according to Langacker (1987), determined by what kind of predication, nominal or relational, we are dealing with. In the case of the former the profiled element is a thing; the latter profiling singles out a process or an atemporal relation. While a nominal profile simply shows an entity or entities, the relational designatum has two participants standing out in the bass-relief fashion: a trajector [tr], which has the special status of “the figure within a relational profile (Langacker 1987: 217) and a landmark [lm], viewed as an entity simply providing a reference point for the location of the trajector. The difference between nominal and relational profiles is best depicted in the diagrammes (a), (b) and (c) depicted in Figure 12 below, based on Langacker (1987: 216; 247), out of which (a) depicts a nominal profile of the colour GREEN, (b) an atemporal relational profile of a GREEN object and (c) a temporal profile for The trees are getting GREEN. Figure 12. Profiles: (a) nominal – the colour green; (b) relational atemporal – green used as an adjective; (c) relational temporal – The trees are getting green. (a)

(b)

(c) lm

tr

tr

lm HUE

HUE

time

In the case of diagramme (a), green is a noun and, as such, designates/profiles a fragment of the colour domain standing out from it, as shown by means of the heavy line. Diagramme (b) illustrates green in its adjectival sense, in relation to a certain thing of this colour. In this relational profile both the colour and the thing of this colour stand out, yet the colour itself is just the reference point, the landmark, for this green thing, the trajector. Finally, in the last relation, which is temporal, there are a number of component states, each of which is an atemporal relation between the landmark (the colour domain with green profiled) and the trajector (the thing: trees); the broken line between the component states is Langacker’s way of linking all projections of the same landmark and trajector to indicate we are dealing with the same [lm] and [tr] from one state to the next. “The full set of projections (including those of states left undepicted) constitutes the temporal profile of a process” (Langacker 1987: 245). What is important to point out is the fact that two elements of the diagramme stand out in the bass-

234

Chapter Three

relief fashion: the final part of the trajectory designated by the are getting green expression; as well as a portion of time – marked by the heavy line on the timeline – which is the temporal profile of the designated process. However, as argued in Grzegorczykowa (1998) as well as demonstrated in Fillmore (1975 and later publications), in addition to profiling in the immediate scope of predication, described in Langacker (1987), it is possible to highlight certain aspects of a broader domain, such as the semantic frame underlying the predicate in question. Grzegorczykowa (1998) explains this point based on Bartmiski’s (1993) example of the predicate of water, whose individual contextinduced profiles will include highlighting water as source of life; water as a dilutant; or the purifying potential of water. Considering the fact that in the above example a whole entity – the conceptual domain of water – represents a part of this entity activated by the context in which the predicate is used, profiling in Bartmiski’s understanding seems to be equivalent to Langackerian zone activation, a cognitive phenomenon described in the following section. 3.4. Active zones Zone activation is a phenomenon typical of relational predications, in which the actual prominent structures within a certain profile coincide – often very imprecisely – with the trajector and its landmark. This is, among others, the case in example (c) above. In the sentence The trees are getting green it is not the whole tree but its crown that acquires the said colour as a result of leaves first budding and then growing on the branches. A similar phenomenon can be observed in the examples of We all heard the trumpet or Give me a red pencil (Langacker (1987: 271). Langacker argues that in the former instance, in spite of a whole physical object being profiled, it is actually the sound this object makes that gives us the auditory sensation referred to in the sentence; in the pencil example, we will not relate the red colour to the pencil as such but rather to its surface or its marking point. The crown of the tree as well as the sound of the trumpet and the red part of the pencil are the active zones of these three trajectors with respect to the processes in which they are partaking14. The latter is a very important statement because for other processes, as will be argued later in this chapter, different active zones of profiles which take part in these processes will be activated. In any case, the active zone, as stated by Langacker (2002), should not be understood as a precisely delimited region but rather as a focal area; yet, for the ––––––––– 14

Langacker (1991) argues that the active zone phenomenon is a very important notion as, in addition to “pertaining to the relational predication with its arguments”, dealing with subtleties to which linguistic form is often blind, it “go[es] to the heart of critical grammatical issues” (189) such as object-to-subject, subject-to-subject and subject-to-object raising. These issues, however, will not be discussed in the present work as they seem irrelevant to its line of reasoning.

235

The meaning of meaning

sake of analysis, it is often treated as bounded (Langacker 2000). It may also happen that, as argued in another publication (Langacker 1991: 190), for some relational predications the active zone of the trajector and/or landmark may be coincident with the whole of the profiled entity, as in the example of The spacecraft is now approaching Uranus (Langacker 1991: 190). An issue that needs to be addressed at this point – as it will be of importance when active zones are discussed in relation to conceptual blending – is the active zone/metonymy problem presented in Langacker (2000) and subjected to criticism in Bierwiaczonek (2010). Langacker argues that in sentences like (1) Your dog bit my cat as well as (2) I’m in the phonebook (2000: 62-64) we are dealing with a kind of metonymy: a part of the entity – its focal area or active zone, which actually takes part in the event (the dog’s teeth and whatever part of the cat that got bitten in the former; my name in print and a certain page of the phonebook in the latter) – is represented by the whole predicate (the dog; the cat; I; the phonebook). As a result, we can say that all these predicates are sources of metonymy and their active zones are their metonymic targets. The only difference between cases (1) and (2) lies, according to Langacker (2000), in the fact in the former the active zone is comprised in the predicate (Figure 13a) while for the latter the predicate and the active zone are separate (Figure 13b), as depicted below (adapted from Langacker 2000: 63). Figure 13. Trajectors and active zones for Your dog bit my cat and I’m in the phonebook. (a)

(b)

your dog [tr]

the teeth [az]

I [tr]

my name [az]

The above representation, together with the sign of equality put between zone activation and metonymy lies at the root of the criticism expressed by Bierwiaczonek in the form of a question: “If active-zone phenomena are indeed “a kind of metonymy”, what kind of metonymy are they?” (2010: 12). Dissatisfied with such an all-inclusive treatment of active zones which, as he argues, makes the very notion conceptually imprecise, Bierwiaczonek suggests limiting the concept of active zones to cases like (a) above, where the conceptualised area is vague and not bounded: both in the case of Your dog bit my cat, as in examples such as I cut my finger, it is impossible to say – precisely – what part of the cat was bitten, what part of the dog did the biting or what area of the finger was cut.

236

Chapter Three

Instances like (b), in turn, the individual entities have very sharp boundaries in the sense that they are easily distinguished (=conceptualised and lexicalised): the I in the phonebook stands for and is easily verbalised as my name. Consequently, Bierwiaczonek argues, such examples should be classified as metonymies and not active zone phenomena. While the sharp/fuzzy boundaries argument appears uncontroversial, it seems that drawing a very sharp distinction between zone activation and metonymy may not be fully justifiable. It is impossible to completely separate the two because all metonymy actually requires a kind of zone activation. The boundedness/ unboundedness question notwithstanding, in a whole-for-part metonymy, like America for the USA, the predicate represents the actual focus in exactly the same way as your dog stands for the dog’s teeth in Your dog bit my cat. The point that metonymic mappings involve zone activation is additionally endorsed by the case of non-referential metonymy (Koch 1999; Waltereit 1999), in which time-space contiguity – on which metonymic extensions are normally based – is replaced with experiential in-frame togetherness. As a result of this togetherness, an overall concept represented by the frame may predicate any of the in-frame aspects of the concept – or zones, in fact – activated under the influence of the context in which it is used. The example of water cited in the previous section after Grzegorczykowa (1998) and Bartmiski (1993, 1998) is an instance of such a phenomenon; other examples can be found, among others, in Turula (2009a). The results of the study I carried out show that metonymic extensions motivated by innovative use of nounto-verb conversion originate from a very broad, all-inclusive, associative cognitive domain indexed by the converted noun15. Assuming, based on Langacker’s (1987) encyclopaedic concept of meaning, that domains of all predicates have the potential for similar all-inclusiveness and context-dependent activation of in-frame zones, we can argue that the I frame, which needs to be selectively activated for the I’m in the phonebook sentence, will contain, among others, a concept like my name and phone number, a concept which will be the zone of the whole domain activated in the context of a phonebook. In the light of this, Langacker’s (2000) diagrammatic representation of the trajector and its active zone for this sentence may need to be amended in the way presented in Figure 14. ––––––––– 15

Listed below – together with the resulting event schemata – are the zones activated in the CUSHION domain by the skeletal sentence X cushioned Y (source: Turula 2009a: 109): CUSHION=a soft object Æ action (manner) schema – to act like a cushion: to soften the fall (3 answers) CUSHION=object used in pillow fights; to sleep; to muffle noise Æ action (instrument) schema – hit with a cushion; to lull sb to sleep putting them on a cushion (12 answers) CUSHION=comfort Æ transfer schema – to offer comfort (to win favour with) (2 answers) CUSHION=an object Æ essive schema – to use sb as if they were a cushion (1 answer) CUSHION=a place to put sth/sb Æ locative schema – to put on a cushion (1)

The meaning of meaning

237

Figure 14. Name and phone number activation within the I domain The I (experiential, encyclopaedic) frame [tr]

my name and phone number [az]

Consequently – as seen in the relevant diagrammes – in the case of both Your dog bit my cat and I’m in the phonebook we are dealing with the same phenomenon: an activation of a zone of a certain conceptual domain, be it a more general conceptual region, a base or the predicate itself. While it is true that the vagueness/autonomy of the part within a whole – an issue raised by Bierwiaczonek (2010) – remains unresolved, we can ask how important this issue actually is to a phenomenon like zone activation16. As will be demonstrated in the section dealing with conceptual blending, the vagueness/autonomy is not a factor that needs to be considered as far as this process is concerned. The concepts of frames, cultural models and scripts (discussed in Section 2) as well as the phenomena of mapping, profiling and zone activation, which have been the focus of the present section, are the basic concepts of the Conceptual Integration Theory (CIT) proposed by Fauconnier and Turner (1994); they are also important to the dynamics of online meaning construction, with particular regard to frame shifting and conceptual blending (Sweetser 1999; Coulson 2000 and 2005; Coulson and Kutas 2001). The following two sections look at both concepts: Section 5 presents an outline of CIT; Section 6 analyses the two abovementioned dynamic aspects on construing meanings in real time.

4. Conceptual Integration Theory The Conceptual Integration Theory put forward by Fauconnier and Turner (1994) is, according to Evans and Green (2006), a derivative of two cognitive semantics traditions: Mental Spaces Theory, discussed in Section 2, and the Conceptual Metaphor Theory (CMT). What CIT shares with the former is the dynamicity of meaning construction and the architecture of the cognitive spaces partaking in the process. The contributions of the Conceptual Metaphor Theory include the very concept of target and source spaces, the unidirectionality of mappings – concepts ––––––––– 16

The fact that it is important to the distinction between active zone phenomena and metonymy is fully acknowledged.

238

Chapter Three

are mapped from source to target domains but not vice versa – and, most importantly, the belief that metaphor is not a trope – a purely linguistic device, as formerly claimed – but a conceptual mechanism found in everyday speech (Lakoff and Johnson 1980). In addition to this – and particularly important to conceptual blending, which is of special interest to the present work and will be discussed in the next section – is the fact that, in the light of CMT, not all aspects of source and target spaces are explicitly stated and, consequently, metaphoric mappings involve entailments or inferences of these aspects (Evans and Green 2006). The process of constructing new metaphoric meaning, which is carried out by means of projection – or mapping – of conceptual structure between conceptually related entities is based on the following four-space structure (Figure 15): Figure 15. Fauconnier and Turner’s (1994) four space model the generic space

the source

the target

the blend

The projection model consists of four spaces: source and target as well as two middle spaces – the generic and the blended space. The important part of the model is the projection between the source and target spaces, which is based on their conceptual similarity motivated by the contents of the generic space, in the way which was presented earlier on the basis of the consumption is the engine of economy metaphor. The projection into the blended space – which contains the conceptual outcome of the whole mapping process – will be based on the lexical meaning of the involved source and target domains. However, while the generic middle space contains an abstract schematisation common to both domains, the middle blended space will include certain not easily extracted “specifics from the source and target, yielding the impression of a richer and often counterfactual or impossible structure” (Fauconnier and Turner 1994: 5). Enigmatic as they sound, the specifics of the richer, counterfactual and impossible structure will be given in Section 5.1 which is devoted to conceptual blending per se. What is important to point out here, though, is that, as mentioned before, the projection into the middle space – be it generic, blended or both at the same time – is not restricted to metaphoric mappings. As Fauconnier and Turner (1994: 1) state:

The meaning of meaning

239

projection to a middle space is a general cognitive process, operating uniformly at different levels of abstraction and under superficially divergent contextual circumstances ... The process of blending is in particular a fundamental and general cognitive process, running over many (conceivably all) cognitive phenomena, including categorization, the making of hypotheses, inference, the origin and combining of grammatical constructions, analogy, metaphor, and narrative.

What remains to be explicated here, in relation to CIT itself, is what exactly conceptual integration amounts to in each of the phenomena listed above. The problem is addressed by Bierwiaczonek (2005), who distinguishes three types of cognitive process: intrinsic inclusion (in the case categorization, among others); contiguity (in metonymy); and non-contiguity or separation (in metaphor). These three different types of projection on the conceptual level are manifested by different co-activation patterns on the neuronal level. Categorization will involve “the activation of neural and conceptual structures the two predications have in common” (Bierwiaczonek 2005: 12); metonymy, in turn, will cause synaptic activation resulting from the contiguity (temporal, spatial or experiential) of the concepts that take part in the mapping; and, finally, in the case of metaphor, the co-activated neural areas will be remote and their simultaneous inducement will be the result of the activation of the generic space (Bierwiaczonek 2005) – or generic/blended space (Fauconnier and Turner 1994) – they share on the conceptual level.

5. Dynamic aspects of online meaning construction: conceptual blending and frame shifting Assuming that every discourse event will be based on the conceptual architecture of mental spaces, frames and cultural models as well as numerous cross-sectional mappings between them, we have to accept as inevitable that discourse construction will by no means be linear and perfectly coherent. The opaque character of a number of linguistic forms on the one hand and the essentially unique experience of every homo loquens (Bartmiski and Niebrzegowska 1998) on the other, mean that “a potentially unlimited array of space configurations” (Fauconnier 1994: xlii) exist. As a consequence, following Coulson (2000 and 2005), it may be necessary to acknowledge the role of context while simultaneously considering different ways in which understanding a single form – with its encyclopaedic, all-inclusive underlying frame of reference – can influence the cognitive structuring of discourse by indexing mental spaces that – often unexpectedly – change the course of reasoning and require immediate online renegotiation of meaning. Coulson (2000; 2005) claims that in such dynamic meaning construction people employ two related sets of process - conceptual blending and frame shifting.

240

Chapter Three

5.1. Conceptual blending The idea of conceptual blending, already highlighted in the present book in the part devoted to mappings, is in contrast with the traditional “building-block” (Sweetser 1999: 136) understanding of semantic compositionality. In the process of the afore-mentioned mappings between the input and target spaces, Fauconnier and Turner (1994: 2) propose “projection to a middle space”, where a rich blending process takes place. It is based on the mutual multi-level activation of selected zones of the input concepts, and the outcome meaning differs from and is richer than the simple sum total of the input semantics. This is because, following Evans and Green (2006: 407-408), “conceptual integration gives rise to a blended space which provides a mechanism that accounts for the emergent structure not found in the input domains”. This emergent structure is the result of inferences or entailments, described earlier in this chapter, that are part of the conceptual mapping. Evans and Green (2006) explain the process on the basis of the surgeon is a butcher metaphor. Decoded within a two-space-only model, the metaphor would result in the following mappings (Evans and Green 2006: 402): BUTCHER  SURGEON CLEAVER  SCALPEL ANIMAL CARCASSES  HUMAN PATIENTS DISMEMBERING  OPERATING As Evans and Green observe, the above model cannot account for the following difficulty: while being a butcher is a profession demanding high skills, the overall evaluation of the surgeon who is called a butcher is negative. As a result, we have to state that this evaluation does not come from the source input space. This is where inferences/entailments step in and lead to the emergent structure – the blended space – where the overall meaning goes beyond the mere sum total of the elements included. In this emergent structure we are faced with a disanalogy: while surgeons save lives, butchers perform on dead carcasses. This gives rise to the meaning of incompetence rendered by the surgeon is a butcher metaphor. Summing up the above, in the process of blending the generic space gives structure to the whole architecture by establishing counterpart connectors between spaces, such as agent, patient, instrument, etc.; input spaces, as a result of mappings between them, provide selective projections into the blended space where meaning emerges, as a result of a creative process motivated by these projections as well as schema induction and inferences/entailments. As a result, we can state, following (Evans and Green 2006) that meaning construction goes through three stages: composition, completion and elaboration. The first two

The meaning of meaning

241

processes consist in putting together the overall four-space architecture as well as filling in the input spaces (butcher/surgeon) with content such as different aspects and attributes of the source and target entities (cleaver-scalpel; animal carcass-human patient; etc.). Elaboration, in turn, relies on various defaults: implications, associations, analogies and disanalogies fed by the multi-level interactions between the active zones of input domains, as indicated in the example above. As signalled by the quotation from Fauconnier and Turner (1994) used in the previous section, the above-described three-stage process of meaning construction is not the sole property of metaphors. As mentioned before, following Fauconnier and Turner (1994), blending and the meaning enrichment that it entails is a phenomenon found in everyday speech. A simple noun phrase such as a pet fish, an example given by Fodor and Lepore (1996), is not simply the common area of the two classes containing fish and pets; “[i]nstead, the category pet fish selectively integrates aspects of each of the source categories in order to produce a new category with its own distinct internal structure” (Evans and Green 2006: 400). Similarly, as I argue elsewhere (Turula 2007), a blend like booze indisposition will not simply mean an indisposition caused by booze. It will be a combination of selected meanings of booze and indisposition and will, most probably, denote some kind of regrettable physical state – be it loss of consciousness or hangover – resulting from excessive alcohol consumption. Depending on the speaker’s experientially justified defaults/ entailments/ inferences, such an expression may also imply that the indisposition is not a real illness; is well-deserved or evokes pity; etc. We create blends such as the two discussed above online, as we take part in discourse events. In fact every syntagmatic combination results in conceptual blending. The point of departure of such a process is zone activation (Langacker 1987) in both/all components of the combination in question. Sweetser (1999:130) adapts the Langackerian idea for the sake of such conceptual blending, which she explicated based on the example of red. Sweetser argues that when we talk about a red pencil and a red apple, we “do not refer to simple intersections between pencils or apples with red things”. What actually happens is a “flexible matching between frame-evoked roles and individuals referred to” (1999: 131). As a result, advanced interpretative processes are involved allowing discourse participants to activate one zone of red for pencils and another one for apples (and still other “reds” for lips and hair) as well as to decide, on the basis of context, whether the pencil referent has red surface or draws red lines. The process of such zone activation and the resulting conceptual blending is schematically diagrammed in Figure 16 (based on Sweetser’s 1999 observations on English Adjective+Noun modification construction):

242

Chapter Three

Figure 16. Zone activation for red

LIPS

HAIR

RED

BALL

APPLE

The following conceptual integration of an Adjective+Noun modification, for e.g. red ball, will utilise a four-space network which contains (Sweetser 1999: 139): the generic space accommodating visible physical objects with surfaces; the source space including a spectrum of colours visible on surfaces with red profiled within the hue continuum; the target space containing ball; and the blend with red ball. As argued before, in relation to the issue of boundedness/autonomy of the activate zone, this problem is of secondary importance to zone activation as such. What is activated may have fuzzy (the different shades of red) or easily-delineated (the surface of the ball) boundaries; the fact does not affect the nature of the process, whose most important characteristic is that, in a relational profile (cf. Section 3), the two language units that come into contact will mutually activate their active zones and that these units as such will predicate/stand for their activated focal areas. Simultaneously, in addition to such relatively simple integration models, there will be instances of multiple blending such as in the grim reaper blend analysed by Fauconnier and Turner (2002). In this case, the construction of meaning will be motivated by the generic space of agent affecting patient plus as many as three input spaces: the reaper space, the killer space and the death space. Consequently, considering the varying degrees of the internal composition of the model, Fauconnier and Turner (2002) come up with the following integration networks taxonomy (Table 5):

243

The meaning of meaning Table 5. Fauconnier and Turner’s (2002) integration networks taxonomy: type of integration

example

s0069mplex

terms of kinship: the relative of X

mirror

the Buddhist monk riddle17

single-scope

the boxing CEOs blend in which the concept of a boxing match – or generally physical struggle – is mapped onto business competition

double-scope

the earlierdiscussed surgeon is a butcher metaphor

multi-scope

the earlierdiscussed death as the grim reaper

input spaces there are two input spaces but only one of them provides a frame (in this case the more schematic mother/daughter conceptualisation frames expressions such as Anna is Elbieta’s daughter) there are two input spaces that share a common frame – the frame of a day-long journey, in the case discussed

there are two inputs, each of which contains a frame

there are two inputs, each of which contains a frame

there are three or more inputs, each of which contains a frame; there may also be more than one generic space

blend

the frame underlies the resulting blend

the common frame underlies the resulting blend the frame of one of the concepts underlies the resulting blend; in the case of the boxing CEOs metaphor it will be the BOX frame both frames motivate the resulting blend; selective parts of both input domains contribute to the final meaning all frames motivate the resulting blend; selective parts of all input domains contribute to the final meaning

––––––––– 17

There is a well-known riddle (cf. Fauconnier and Turner 2002) which can be summarised in the following way: A Buddhist monk embarks on a journey uphill. He starts in the morning, gets to the top of the mountain in the evening, spends the night meditating and climbs down the following morning within the same time frame – start in the morning, finish in the evening. The question is: is there a place on the twice-taken route in which the monk finds himself at the same time on both days? Finding the correct answer to the riddle is best achieved by imagining the two trips on the same day and guessing – correctly – that if the monk walked from the foot of the mountain to its top and from the top of the mountain to its bottom on the SAME day, he would certainly meet himself on the way. Such a conceptualisation requires mirror conceptual integration in which the two days are mirror images of each other.

244

Chapter Three

The applications of Conceptual Integration are numerous in areas such as humour, persuasion or formal blending, which are discussed below. First of all, conceptual blending underlies our understanding of humorous discourse. Coulson (2005) argues that “blending processes are important for humour production and comprehension, as humorous examples often involve the construction of hybrid cognitive models in so-called blended spaces”. As I argue in another publication (Turula 2008: 14-15), Kids’ talk18 is full of such hybrid models since children can be very creative with language and their interpretations of the surrounding world; sometimes, however, the blend becomes even more complex when it is enriched by a conceptual frame of an adult listener, who, as a result of such blending, sees the final hybrid as humorous. The point is explicated based on two samples of such discourse presented below. Kids talk; Sample 1 Jacek and Pawe are playing with saucepans. Jacek (putting a lid on one of the saucepans) ‘It’s cold, put a hat on’.

In Sample 1, the initial blend created by the child is based on the similarity between a pot and its lid on the one hand and a human head and a cap on the other. Consequently, the two input spaces – kitchenware (source) and human anatomy and fashion (target) result in a metaphoric mapping between the two. The It’s cold piece of information situates the blend within a broader frame of one person being concerned with the well-being of another. At this point, the adult listener (mother) steps in, blending the child’s input space with a space in which she features herself as a caregiver and sees the humour of the hybrid as presented below: Sample 1. Conceptual blending and its humorous effect Child’s input Blend Mother’s input child mother/child mother taking care of taking care of taking care of a pot a child/a pot the child

 Humorous outcome Mother may be a little infantile in her efforts to take care of the child; some situations may verge on grotesque

A similar kind of conceptual integration may be the result of an overlapping between a child’s and an adult’s frames in which the humorous effect is produced by the contrast between the defaults each of them provides (Sample 2). ––––––––– 18

The cited samples of discourse between myself and my own three children were collected in the years 1996-2001

245

The meaning of meaning Kids’ talk; Sample 2

Jacek, Pawe and Rafa are watching TV. All (on seeing an ad for the latest Barbie): ‘How stupid, I hate her’. Mum, although in secret agreeing with her sons’ views, decides to give them a lesson on political correctness. Mum: ‘I’m not at all surprised you don’t like Barbie – because it’s a toy for girls. But if a girl was to hear you saying how ‘stupid’ it was, I’m sure she’d be sad. Instead of saying it’s ‘stupid’, you should say something like ‘I don’t like it’. The effects of said lecture were very quickly obvious: Rafa (on seeing a Barbie ad): ‘tupid’ ‘tupid’ Pawe: ‘You don’t say something is stupid. You say it’s for girls’.

Sample 2. The blending mother’s input

child’s input

blend

be politically correct

don’t say what you think, say something else What, on earth, is ‘something else’? default

even greater political incorrectness

message concerning girls This must be what needs to be said

so much for hypocrisy! I’m a girl myself. Don’t I feel stupid?

Involuntary affective default insincerity/ hypocrisy

mother’s default

What needs to be noted is the fact that the humorous effects in Samples 1 and 2 indexed by the caregiver’s frame in the former and based on an unintended effect of political correctness training in the latter seem to confirm what has already been said about the conceptual blending: it is a very complex process in which the resulting meaning is far beyond the sum of input meanings. In addition to the above, processes of conceptual blending are part of sociocultural phenomena: they “mediate the exploitation of stable conventional mapping schemes in order to adapt shared cultural models to the idiosyncratic needs of individuals” (Coulson 2006: 187), where the final meaning – often induced in the process of persuasion – is a blend between the socially-shared meaning and the personal meaning of a given individual. Finally – and most importantly, as will be focused on in Chapter 4 in a discussion of the semantics of temporal expressions – there is a special type of blending referred to as formal blending (Evans and Green 2006), in which part of the constructed meaning is motivated by the semantics of the grammatical construction used.

246

Chapter Three

5.2. Frame shifting Frame shifting, the other dynamic aspect of online meaning construction and a concept that is valid in a discussion of Conceptual Integration Theory is, as Coulson (2000) defines it, a kind of semantic reanalysis of discourse, as a result of which old understanding is abandoned and the conceptual architecture is readjusted to the new concept. It is based on the acknowledgement – illustrated in Coulson (2000) by the famous Armstrong’s One step for [a] man, one giant leap for mankind – that our understanding of different phenomena can be altered quite dramatically by the change in the underlying assumptions we base them on, such as contextual factors and background knowledge. The result of such alteration is a semantic leap towards the less/non-obvious meanings of certain language constructions. As Coulson (2000: 2) puts it: Semantic leaps is not a technical term, but, rather a family of interesting natural language phenomena. It includes all sorts of nonstandard meanings, absent from dictionaries and not typically computable by traditional parsers. Leaps include things such as metaphoric and metonymic expressions, hyperbole, understatements and sarcastic quips. They also include things such as innuendo, subtle accusations and the private meanings that can arise when people live or work closely together.

Coulson and Kutas (2001) as well as Coulson et al. (2006) confirmed the phenomenon of frame shifting on the physiological level in two formally related experiments. In both studies adult testees were presented with a number of one line jokes such as When I asked the bartender for something cold and full of rum, he recommended his wife and their non-funny controls like When I asked the bartender for something cold and full of rum, he recommended a daiquiri. In the first experiment (Coulson and Kutas 2001), while the testees were reading the sentences their event-related brain potentials (ERPs) were recorded. What was observed was that “joke and non-joke ERPs differed in several respects depending on the participants’ ability to get the joke and contextual constraint” (2001: 71). In the later study (Coulson et al. 2006), which used a headband-mounted eye-tracker to examine the participants’ eye movements while the testees were reading the above-mentioned jokes and their non-funny controls, similar meaning processing differences were inferred on the basis of the observation that regressive eye movement could be noticed only for the joke sentences. Coulson et al. explain the phenomenon in the following way: when the testees see a sentence like the nonfunny control, they evoke a scene/construct which is a mental model of the situation. This does not happen once they are finished reading but while they are reading, which is the essence of online processing. In the case of the joke sentences, the surprising conclusion makes the reader abandon their already adopted cognitive architecture and (s)he has to reevaluate the sentence pragmatically: the bartender joke sentence quoted above loses the quality of a regular bar-based exchange and becomes the bartender’s insult to his wife.

The meaning of meaning

247

As Coulson et al. (2006) point out, the plot of humorous stories – jokes like the above-quoted one about the bartender’s wife – is deliberately built up in such a way so as to index one frame and, simultaneously, evoke elements of another frame. The much earlier Raskin’s (1985) Semantic Script Theory of Humor is based on a similar idea. Raskin suggests that the narrative structure of a joke enables the activation of two different scripts, or two event representations based on two different frames. The story is told in such a way so as to balance these two possible cognitive domains. What makes a joke funny is the final contrast between the two scripts/frames. Coulson et al. (2006), however, question the appropriateness of Raskin’s theory, arguing that the knowledge of the script/frames scenarios is necessary but not enough to understand many jokes. What such comprehension is based on are all of the aspects of online meaning construction discussed in the previous sections of the present work, including the creativity of the process as well as compositionality, the latter amounting to selective activation and multi-level combination and not mathematical addition, to go back to the composition, completion and elaboration (Evans and Green 2006) of the final blended space. As I argue in the earlier-cited article devoted to the role of frame shifting in the understanding of kids’ talk (Turula 2008), the above mechanisms seem particularly applicable to this kind of humorous discourse. This is because kids’ talk, though humorous, is very much unlike jokes in the sense that it does not simultaneously activate different frames/scripts, as Raskin proposes. In the case of jokes the very introduction “Did you hear the one about …?” keeps the listener – familiar with the genre and its schema – on the watch for the interplay between different cognitive domains. For kids’ talk, such cognitive alertness is not a question since the funny dialogues and quips in question are usually part of discourse that is not intended to be humorous. As a result, the situation bears every resemblance to the experiments with funny sentences and non-funny controls carried out by Coulson et al. (2001, 2006), as the post factum realization of the humour that is typical of kids’ talk requires considerable semantic leaps and much more advanced online meaning construction than in the case of regular jokes. The creativity and compositionality of the above-mentioned process is presented (Kids’ talk; sample 3), in an analysis of one humorous discourse encounter an adult and a child described in Turula (2008: 17): Kids’ talk; Sample 3 Mum, Jacek and Pawe are talking about violence on TV. Mum’s explaining that you can’t copy everything you see in cartoons, because , if everything in cartoons happened for real, all of them would’ve died. Pawe isn’t so sure. Mum decides to give an example: Mum: “Imagine what would happen, if everyone at preschool had a gun”. Pawe (suddenly livens up): ‘Cool, first I’d shoot Miss Halina and then Miss Ania…’

248

Chapter Three

The two frames that are indexed and, eventually, demand a semantic leap on the part of the adult, are presented below: Sample 3. The frame shifting KINDERGARTEN KINDERGARTEN adult’s frame child’s frame Top level of frame (generic): KINDERGARTEN is a place where children play and learn in groups taken care of by teachers child’s defaults adult’s defaults (based on immediate (based on an indirect personal experience) perspective additionally blurred by the Teletubbies frame) The children are sweet; It’s a place where your the teachers are comproperty and physical petent professionals well being are in conand nice human beings stant danger. You have to fight back to survive Final adult’s in-frame Final child’s in-frame entity entity KINDERGARTEN is KINDERGARTEN is close to heaven on a survival camp if not earth all-out-war Frame shifting humour based on disanalogy

A similar conceptual leap is necessary in the case of a figure of speech called zeugma, which – like metaphor and metonymy – is not a mere trope but a manifestation of a complex cognitive mechanism. In a zeugma (Taylor 2003: 104) two distinct senses of a polysemous word are “incongruously yoked together in one construction”. This rhetoric is present in numerous puns such as the joke – whose origin I cannot trace (and properly acknowledge) – cited below: Judge: ‘So you want to divorce your husband on account of his appearance?’ Woman: ‘Yes, your honour. He hasn’t made an appearance in 15 years now.’

or Cruse’s (1986: 13 cited in Taylor 2003: 105) example of Arthur and his driving licence expired last Thursday. The leaps we take – and the resulting frame shifting or rather frame juggling – are between the two individual spaces generated by two distinct senses of the words appearance and expire. In conclusion, both conceptual blending and frame shifting are phenomena related to the dynamic online meaning construction which takes place via conceptual integration. They have been discussed separately but, as could have been inferred

The meaning of meaning

249

in relation to a number of the examples given in the work, they are, in fact, overlapping processes: conceptual integration of the surgeon is a butcher metaphor, to mention just one example, required blending which was, to a certain extent, based on a semantic leap from the basic understanding of the butcher frame in terms of professional effectiveness towards the disanalogy between dealing with animal carcasses as opposed to attending to living people. On the other hand, while the samples of kids’ talk or the cited instances of zeugmas demonstrated cases of frame shifting, we can say that they both led to enriched, inference-based novel blended understandings of the situations. In the former, it resulted, for example, in the mother coming up with a non-typical kindergarten space filled, ad hoc, with a newly created conceptual blend; in the latter, a zeugma blended the two distinct senses of the word leaving each sense slightly “contaminated” with the other and, potentially, more prone to a creative yoking of the non-prototypical scenes in subsequent discourse events of a similar kind. The present chapter as a whole can be summed up by a statement that, in the light of cognitive semantic theory, meaning construction is a creative, non-linear and very complex process involving addition in which 2+2 gives 5 or more, taking a few steps forwards only to retrace one’s conceptual route back to where one started as well as frequently adjusting the network to incorporate elements of the culture one lives in, one’s own private story, and, last but not least, the abundance of subtleties that are rendered formally. The very notion of construction of meaning entails an act rather than a state, which, as argued by Sinha (1999) as well as Tomasello (1999, 2005) is intersubjective in the sense that both the speaker and the hearer, based on joint reference, create the meaning as they move through discourse. As they do so, they follow the paths of the internal network of the architecture of meaning making connections, shifting and blending in ways motivated by and dependent on current discourse demands. All this makes the whole process energetic: social and practical. Chapter 4 will be looking at such meaning construction in the interpretation of events through the tense-and-aspect system. Consequently, English temporal expressions will be investigated based on all of the aspects of cognitive semantics discussed in Chapter 3: the cognitive space architecture; its fillers in the form of frames and other holistic models; and finally the Conceptual Integration Theory and its main constructs: conceptual blending and frame shifting.

CHAPTER FOUR FOCUS ON THE SEMANTICS OF THE ENGLISH TENSE-ASPECT SYSTEM The investigation of form-meaning mappings – with special regard to the semantic pole – which has so far been presented in the present work is now going to focus on the area of grammar which is of particular interest to the present work: temporal expressions, including tense and aspect. The tense-aspect system is a part of the TAM (tense-aspect-modality; Li and Shirai 2000) trio of categories in which formal marking on verbs is used to indicate temporal concepts. The present chapter looks mainly at the first two, investigating modality only marginally in relation to the future tense1. That is why, throughout the work the terms tense-aspect system or temporal expressions will be used to refer to what pedagogical grammars call grammatical tenses. As is going to be demonstrated, the said system will cover a whole range of tense, aspect and, potentially, modality combinations in all three time spans: present, past and future. In the light of Functional Grammar, (Dik 1989, Hengeveld 1989 cited in Siewierska 1991 115-116) – the theory of which motivates the present investigations (cf. Chapter 2), the TAM trio operates on four different levels – predicate, predication, proposition and illocution – indicating, respectively (based on Hengeveld 1989: 132): – the internal temporal profile, by means of imperfective, perfective, ingressive, progressive, continuous or egressive aspect; – the external temporality including location on the time line, frequency, the (ir)realis issue expressed by tense, quantificational aspect (distance from the deictic centre), the objective modality; – the proposition and its source, in terms of modal subjectivity; and – the speaker’s intentions of face saving (downtoning) or highlighting (emphasising). The above FG distinction is rooted in the three-layered model of clause structure put forward by Foley and Van Valin (1984): the nucleus (predicate and its operators, including aspectual markers), the core (the nucleus plus some modal/auxiliary operators) and periphery (adjuncts, tense, illocutionary force). The present book adopts this perspective, considering the semantics of tense and aspect on all the three (Foley and Van Valin 1984) / four (Hengeveld 1989) above-mentioned levels. ––––––––– 1

In whose case the distinction between the modal and temporal understanding is not easy and subject to debate, as will be shown in the present chapter.

252

Chapter Four

The interest of the present work in the tense-aspect system as form-meaning pairings is part of a larger body of research into this area of second/foreign language acquisition; research to date into how temporal expressions and their meanings are learned will be presented at the beginning of Part III. What, however, seems to be of particular interest at this point of our discussion is that (Bardovi-Harlig 2004: 115) “[s]ome studies2 have demonstrated that second language learners show a mastery of form that exceeds their mastery of target language form-meaning associations”. Such a discrepancy between a considerable automaticity in the area of the formal morphological properties of tenses and aspects on the one hand and, on the other, the lack of knowledge of the semantics of these verb patterns is exactly the concern of the present work, mentioned at the beginning of Chapter 3, together with the fact that the main aim of the present work is discovering effective ways of levelling the above-mentioned non-native speaker disadvantage. The point of departure for this pedagogic quest is the already-verbalised assumption that the semantics of grammar should be subjected to some kind of form-focused instruction. However, as stated in a similar context in the previous chapter, before appropriate methodology is devised, it seems necessary to establish what the object of such instruction should be. In other words, we must decide what ought to be taught as part of the semantics of the English tense-aspect system. In this respect, our considerations of the tense-aspect system as form-meaning pairings mirror the discussion of meaning presented in Chapter 3. After the initial description of the preliminaries of temporal expressions (Section 1), we will again be looking – this time from the perspective of the tense-aspect system – at notions such as prototypicality effects (Section 2), mental space architecture, inter- and intra-space perspective (Section 3) as well as holistic semantic models and the related dynamics of online meaning construction – frame shifting and conceptual blending (Section 4). At the end of this chapter an overall model of the semantics of the English tense and aspect system will be presented. The model will be a proposal of what should be subject to meaning-oriented focus on form; the question of how this content should most effectively be taught as part of foreign language instruction will be preliminarily addressed as well, vis à vis the pedagogic guidelines presented in the concluding paragraphs of Part I. 1. Tense and aspect – the preliminaries Before addressing the question of tense and aspect semantics, it seems necessary to introduce these temporal expressions in terms of definition and scope. Looking at tense and aspect as delineated by Comrie (1976, 1985), we can say that tense is a grammaticalisation of location in time, while aspect represents ––––––––– 2

Bardovi-Harlig 1992a and 1992b; Dietrich et al. 1995

Focus on the semantics…

253

the internal temporal contour of a situation3. In other words, tense will contain the answer to the question of where on the timeline while aspect will be related to how something happened or – rather – how some event was construed by the speaker/hearer in terms of its intrinsic properties as well as in terms of the selfother relationship between the conceptualiser and the event. 1.1. Tense When it comes to tense as a grammatical category, it is deictic in the sense that using different grammatical tenses is a means of locating things – which in the present work will be called situations4 (following Comrie 1976, 1985; Lyons 1977 and Declerck 1991) – in time5 by means of referring them to other points or cycles on the timeline. For a better understanding of tense as a reference point – or reference cycle – construction, it is, according to Comrie (1985), necessary to define the three major parameters related to it: the deictic centre; the location of the event in question in relation to the deictic centre; and finally – and important to some languages only, English excluded – how far from the deictic centre the event is located6. 1.1.1. The timeline and the basic temporal concepts Comrie (1985) states that the above-mentioned parameters – the first two of which, being relevant to English, are going to be discussed in this section – are best illustrated on the timeline, the universal concept in which the flow of time is conceptualised from left to right; which has a specification of the present moment (0); and on which different situations – processes, events and states – are represented. The said situations can be punctual (A, B and C) or marked as stretches (D, E, F, G, H and I); they can also be sequential (B follows A; H follows G), overlapping precisely (D and E), partly (H and I) or occurring within each other (F within G), as depicted in Figure 17 below: ––––––––– 3

4 5 6

Similar definitions can be found throughout tense-related literature; cf. among others, Klein (1994: xi): “Tense, according to common understanding, serves to ‘localise’ an event (in the broadest sense of the word) in time – notably before, around or after the time of utterance. Aspect, on the other hand, allows us to present the event in various ways – for example as completed or not completed, or as seen from inside or outside”. Klein (1994: xi) calls them “event[s] in the broadest sense of the word; the present work uses the term events to refer to non-stative situations. Similar definitions can be found throughout tense-related literature; cf. Klein 1994: xi. Comrie (1985) distinguishes between the English immediate future – I am about to go – and the more distant future – I will go. This distinction is, however, relatively unimportant to the present line of argument and will be ignored.

254

Chapter Four

Figure 17. Situations on the timeline (adapted from Comrie 1985: 6) A

B

C

D

G

H

0 E

F

I

A similar concept of a timeline is present in other works related to tense (Lewis 1986; Declerck 1991) with one important reservation: it is important to differentiate between the physical objective concept of time on the one hand and psychological (Lewis 1986: 49) and linguistic (Declerck 1991: 16) time on the other. This will be related to the fact that, in the case of a temporal location of events, we will be dealing with Ontological Reality, which is the objective course of events, as well as the Epistemic Reality or Domain, which accommodates the events and their temporal characterisation as construed by the conceptualiser (Turewicz 2000). All this amounts to the fact that some events can take time to happen but may be conceptualised down to punctual proportions; at the same time, as Griffith (2006: 100) puts it, “we can also mentally stretch events and concern ourselves only with their middle stages”. In addition to this we can choose to concentrate on situations as such or, conversely, to only pay attention to their initiations or culminations. This remark is crucial to the considerations of tense and aspect that follow in further parts of the present chapter in relation to how events are perceived in terms of duration/punctuality as well as their internal temporal architecture (aspect). When it comes to the deictic centre as one of the concepts related to a timeline, it is an indispensible temporal parameter. This is because, to quote Comrie (1985: 13), “[t]ime itself does not provide any landmarks in terms of which one can locate situations”. As a result, it is necessary to arbitrarily specify points in time to which situations will be referred7. In fact humanity has been doing this for centuries through AD and BC dating as well as with expressions such as post-modernist or pre-Raphaelite, relating processes, events and states to other well-known entities (phenomena, events, people, etc.). Such historical – or future – reference points offer a relative perspective on situations, in the sense that these reference points themselves need to be located in time; that is why the linguistic means which are used for such temporal references are called relative. Typically however, as Comrie observes, the speech situation – usually the present moment, specified as [0] on the timeline above – is chosen as the reference point. ––––––––– 7

What is also important is that it is also possible to conceptualise deictic centres as cycles. The latter statement refers to sentences such as I do it every day (a cyclical deictic centre) as opposed to I’m doing it now (a punctual deictic centre).

Focus on the semantics…

255

As the here and now is the ultimate reference point, in the sense that it is impossible to imagine any further reference, the linguistic means used to locate situations in relation to the present moment of speech are called absolute. The importance of the deictic centre as a reference point is acknowledged by other tense theoreticians, and the very concept is described under a number of different labels such as: “temporal zero point” (Lyons 1977: 278) or “the tense locus” (Chung and Timberlake 1985: 204); it usually refers to the coding time – cf. Comrie’s present moment/speech situation above – or decoding/reception time (Declerck 1991: 14), the moment when the message is received by its addressee, which may be temporarily removed from the actual speaking/writing time. As for linguistic means of locating situations in time in reference to the deictic centre, Comrie (1985: 8) divides them into three classes, with examples that indicate absolute [a] or relative [r] reference: – lexically composite expressions, such as this morning [a], on the following day [r] – lexical items, including yesterday [a], lately [r] – grammatical categories such as grammatical tenses, indicated by verb morphology – look-->look-ed [a] or looking [r] – and/or by the grammatical words adjacent to the verb – look-->will look [a] or had looked [r] The reference time delineated by the expressions or lexis described in the first two examples is called the established time (ET) by Declerck (1991); it is understood that the situation time (ST) – the time of the state or event normally described in the utterance which contains the said linguistic means – will be part of or will coincide with the established time (Declerck 1991). This means that in a sentence like Yesterday I wrote a letter to Pat, the time devoted to the activity of writing will coincide with – or, more probably, be part of – the time conceptualised as yesterday. The nature of temporal reference is slightly different in the case of grammatical tenses: verbal morphology and/or the grammatical words adjacent to the verb denote the time of the situation but not the established time; or rather, by conceptually placing this situation in relation to the deictic centre, grammatical tenses establish the orientation of the situation on the timeline but not its exact location. As a result, a sentence such as I wrote a letter to Pat locates the situation on the timeline to the left-hand side of the moment of speaking [0] but it is impossible to precisely determine or establish the time of this situation, which has to be – and usually is – provided by the co(n)text. This question is highlighted here as it is going to be revisited in a later section, where grammatical tenses as means of locating situations in time are discussed in relation to Langacker’s reference-point constructions. Finally, it is important to look at grammatical tenses from the absolute/relative perspective. In the former case, the deictic centre will be the moment of speaking and the situation will be located on the timeline: at the same time as or including the present moment for present tenses; before the present moment for past tenses; and

256

Chapter Four

subsequent to the present moment for future tenses8. However, as indicated above, it is also possible to specify – by means of contextual clues, expressions such as lately, the following day or appropriate temporal morphology – deictic centres other than the present moment, which is the case for relative tenses. Comrie (1985) divides such tenses into pure relative and absolute relative. The former include all non-finite forms like present and past participles as in the following examples: Men talking to strangers often get/often got/will often get cheated or Anybody left behind is not/was not/will not be waited for. The three options given in each example indicate that the participles talking and left used in the relative clause will be interpreted as elipted present, past or future tenses with regard to the tense – present, past and future, respectively – used in the main clause, which is its deictic centre or reference point. When it comes to the second category of relative tenses – absolute relative tenses – they will include the perfect tenses9. These tenses are called absolute relative because “their meaning combines absolute time location of a reference point with relative time location of a situation” (Comrie 1985: 65). This can be illustrated by an example such as Teddy left when we had arrived, in which the first part is absolute as it has its deictic centre in the present moment, while the latter part is relative to the first part and also relative in the sense that even though we know that our arrival was prior to Teddy’s leaving we do not know exactly when the earlier arrival situation took place with reference to the later act of leaving. As for formal representations of absolute and relative tenses, Comrie (1985) bases his system on a much earlier model proposed by Reichenbach (1947), in which certain points on the timeline and their relations were specified. For absolute tenses there are three such points – E for the moment of event (referred to as situation in the present work), R for the moment of reference and S for the moment of speech – and three relations – before, after and simul(taneous). Consequently, the following combinations can be found in the languages of the world (Comrie 1985: 122-128): • • • • •

absolute tenses10: present – E simul S past – E before S future – E after S non-past – E not-before S non-future – E not-after S

• • • • •

relative tenses: present – E simul R past – E before R future – E after R non-past – E not-before R non-future – E not-after R

––––––––– 8 9

10

English future tenses are subject to debate and are referred to later in this section. This statement refers to the past perfect and future perfect. Present perfect, which will be discussed later in this section, is treated as a special case of an absolute tense by Comrie (1985), which is disputed by some theoreticians (e.g.: Declerck 1991). The controversy is presented in Section 1.1.2. In Comrie’s model, there is no R (reference point) for absolute tenses as it coincides with the moment of speaking.

Focus on the semantics…

257

When it comes to English tenses, Declerck (1991: 225) uses the following specification11 to relate the traditionally used labels to Reichenbachian (1947) symbols and terminology12:

E-R-S E,R-S R-E-S E-S,R S,R,E S,R – E S-E-R

Reichenbach’s model anterior past simple past posterior past anterior present simple present posterior present anterior future

Traditional labels past perfect past tense/preterit conditional tense present perfect present tense future tense future perfect

1.1.2. The controversies As was already indicated in Section 1.1.1, there are two controversies concerning tense, one connected with the present perfect tense, the other related to the status of future tenses. Both are briefly discussed below. As can be noticed in the division of tenses into absolute and relative, present perfect was not definitely assigned to either of the two categories. This is because, as claimed by Comrie (1985), it is a special case of a relative absolute tense: present perfect is absolute because, like past simple, it has the present moment as its deictic centre (a similar interpretation can be found in Kearns 2000); the event time, however, as pointed out by Comrie, is difficult to establish and, consequently, relative. Such an interpretation has been subject to critique by Declerck (1991), who argues that Comrie (1985), in labelling past simple and present perfect as absolute (=no reference point but the moment of speaking) ignores additional references provided by the context – such as yesterday as in I did it yesterday – which delineate the established time for past simple and not present perfect. Declerck observes that in the light of the above, Comrie’s analogy between present perfect and simple past in terms of temporal schemata is wrong. The above-cited argument concerning additional, context-based past-time reference as a temporal demarcation factor for past simple and present perfect does not seem to apply consistently to all uses of the two tenses. First of all, in many past simple utterances such a reference is not directly denoted by any linguistic means; it is inferred from or implied via a very broad, often only covertly present conceptual background. Secondly, there are numerous instances in which – for reasons known to the speakers alone – the two tenses are used interchangeably. If this is so, a good ––––––––– 11 12

The E/R/S combinations not typical of the English tenses have been excluded. The reading of E-R-S is “E before R before S”.

258

Chapter Four

solution of the present perfect controversy will be one which considers the language user rather than the language used as the source of the difference being discussed. Such an explanation will not be based on the presence – or absence – of the additional temporal reference but on the mental space architecture (Fauconnier 1997; see Chapter 3 for a detailed description), within which what makes a difference is the discourse perspective assumed by the speaker/hearer for each of the two tenses. As a result, both present perfect and simple past will be interpreted as absolute tenses with regard to the present moment of speaking, which will provide the Viewpoint space for each of them; they, however, will have two different Focus spaces: the present perfect event will have the moment-of-speaking as its Focus while past simple will imply adopting the-time-of-the-past-situation perspective. This is the understanding of the past simple / present perfect difference that will be adopted for the purposes of the present work. When it comes to future tenses as a source of controversy, the debate is triggered by a number of different factors: the morphology argument; the mood-rather-thantense argument or the multiplicity of forms argument, to mention just a few. The first claim (cf. Berk 1999 among others) is that there is a distinct morphological difference between the English past and present tenses on the one hand and future tenses on the other. While the former are a product of derivational morphology, future tenses are formed by means of the will modal. In the light of this, the difference goes beyond pure formal distinctions into the realm of tense semantics, as it is hard to dispute that by adding a modal and indicating the mood (another source of controversy) we add another kind of meaning. Comrie himself admits that: he states (1985: 43) that, unlike in the case of present and past tenses, future is more speculative than factual. Consequently, a reference to a situation by means of a future tense has to be made with the understanding that anything we say is a prediction which may turn out to be false. Besides, future tenses are not remote from the moment of speaking in the way past tenses are, as any usage event involving a future form amounts to implying rather than stating that what we are describing “does not hold for the present moment”. A good example of such implicature – rather than a factual statement – is a sentence like John will be eating his lunch when you call him in five minutes which does not rule out the possibility that John is having his lunch right now. “Following on this,” Comrie (1985: 44) writes, “one might argue that while the difference between past and present is indeed one of tense, that between future on the one hand and past and present on the other should be treated as a difference of mood”. Finally, as Huddleston and Pullum (2002) point out, with so many ways of indicating future – future simple as well as present simple, present continuous, be going to form, etc. – it is difficult to talk about future tense as a category. The different arguments against the future as a tense can be – and have been – addressed in the ways presented below.

Focus on the semantics…

259

The formal argument is often refuted based on the fact that there are other tenses – including perfects – which are formed morphologically as well as lexically. In turn, as regards the multiplicity of forms, (cf. Declerck 1991), there is a clear difference between the will future on the one hand and the other future forms: the former refers to future situations on its own and, consequently, can be seen as the absolute future; the other forms – such as the be going to form or present tenses with future uses – need some additional temporal anchors, including adverbials, to disambiguate their future time reference from other possible references. Finally, when it comes to the modal – rather than temporal – meaning, the problem is slightly more complex and, as such, will be addressed at greater length. Comrie, who argues in favour of both the temporal and modal meaning of will, claims (1985: 44) that in certain contexts will cannot be classified as a modal because it is quite definitive, and does not “make reference to alternative worlds”, which is the case of other modals such as may; consequently, It will rain tomorrow will be factual while It may rain tomorrow will not. On the other hand, we have sentences like He will be at home reading, I guess, with a very strong modal meaning to will. All this leads to two important observations. First of all, it seems indisputable that in language use there is a perceivable division into what can be called factual, realis future on the one hand and, on the other, the modal future. In the latter case, the modality of the future tense will generally be associated with two basic underlying factors (cf. Lyons 1977; Bardovi-Harlig 2004): the speaker’s belief in the probability of the occurrence of the situation in question (the epistemic modality) or the speaker’s sense of necessity and obligation (the deontic13 modality). In addition to the two basic categories, Carter and McCarthy (2006) list other modal uses of will such as habitual events and general truths: I will sit down in front of the TV all night; Flowers will grow better in sunny places; directives: Will you/You will/ sit down and be quiet!; or requests: Will you help me?. The other important observation on the temporal as well as modal nature of will is based on the fact that even such apparently neutral, temporal (Comrie 1985) statements as It will rain tomorrow or I will be old one day can have some modal connotations resulting from the uncertainty of the future (It may well not rain tomorrow and I may not live long enough to pass as an oldtimer). Decreasing the neutrality of statements even further, we will arrive at examples such as I will be here tomorrow which, like the two previous examples, can be seen as factual rather than modal but will potentially carry even more nontemporal implications, which, depending on the broader context, may indicate obligation, willingness, uncertainty or promise. ––––––––– 13

Quirk et al. (1985) further divide deontic modality into intrinsic and extrinsic depending on the source of the sense of obligation: the conceptualiser themselves or an external decider, respectively.

260

Chapter Four

In the light of what was written above about the fairly indisputable division into the temporal and modal will on the one hand and, on the other, the lack of clear-cut boundaries between the two categories, the best solution seems to treat the will future as a kind of temporal-modal continuum with fuzzy boundaries between the two extremes, the final semantic reading of a particular usage based on the broader context and always, to a certain extent, tentative. This proposal is in accordance with a number of stances on the future tense presented in the literature. First of all, a similar perspective is presented in Dahl (1985: 103), who writes: Normally, when we talk about the future, we are either talking about someone’s plans, intentions or obligations, or we are making a prediction or extrapolation from the present state of the world. As a direct consequence, a sentence which refers to the future will almost certainly differ, also modally, from a sentence with a non-future time reference. This is the reason why the distinction between tense and mood becomes blurred when it comes to the future.

Other approaches to the will future, similar to the continuum concept in the sense of not being decidedly discriminate, are presented in Bybee et al. (1994), Li and Shirai (2000) as well as in Carter and McCarthy (2006). Bybee et al. demonstrate that modal markers across world languages tend to grammaticise into future tense markers as a result of which, as observed by Li and Shrai (2000: 2) “it is often not easy to determine whether a form belongs to one category or another”. Carter and McCarthy, in turn, discuss will twice, once in the section devoted to future time reference and again as a modal. Considering all this, the present work – with its focus on tense and aspect – is going to deal with the temporal – though not necessarily future – meanings of will, acknowledging yet not interpreting the modal meanings unless it is necessary to indicate the difference between the future simple tense and other future forms14.

––––––––– 14

A situation that could be a pretext for falling back on modalities in the interpretation of temporal meanings is described in Lewis (1986). The author relates an ad hoc poll vote on the question of which future form is the strongest in the sense that it expresses the highest certainty, taken on a number of occasions by the native speakers of English. As Lewis states, the respondents listed the items in different ways, with the will future as the most certain in some rankings and the least certain in others. Lewis (1986: 140) limits himself to a statement that in the comparison of different future forms “the ‘degree of certainty’ is not only an impractical classroom explanation, it is also completely without foundation”, which is not entirely true. It seems more accurate to state, following one of the cognitive axioms – to which Lewis subscribes as well – that meaning is in the mind of the beholder, that the respondents must have seen will as weak or strong because of the epistemic or deontic modality they attached to it, most probably based on the context they supplied by default while semantically interpreting the forms.

Focus on the semantics…

261

1.1.3. Tense as a grammatical category In addition to the question of deixis and reference – discussed above, together with related controversies – Comrie (1985) looks at tense as a grammatical category. Consequently, he argues for a flexible characterisation of the semantics of each temporal form-meaning pairing based on the interaction of the contextindependent, conventionalised semantic representation and certain contextspecifics that motivate all possible implicatures (cf. the bi-polar concept of semantics discussed in Chapter 3). When it comes to the future simple, for example, Comrie – as shown in the previous sub-section – opts for two representations, temporal and modal, arguing that the meaning (Comrie 1985: 26) “will be explainable in terms of the interaction of context independent meaning and context”. In addition, the semantics of individual tense categories will be defined through their most typical as opposed to peripheral meanings and uses. Comrie (1985) argues that for the past simple tense past time reference will be its central meaning while the counterfactual present – its peripheral meaning. The interaction of context-dependent / context-independent meaning as well as prototype effects are emphasised here because a further analysis of tense as a grammatical category – presented later in this chapter – will be carried out along the same intersecting lines; lines delineated by the bi-polar concept of semantics as well as the cognitive semantic assumption that some meanings are more central to a given category than others. 1.2. Aspect 1.2.1. Perfective vs. imperfective When analysing aspect, Comrie (1976) divides it into two categories, perfective and imperfective, for situations seen – respectively – as bounded or unbounded temporarily. The perfective, as Comrie states, has long been subject to misinterpretations, ill-defined and misunderstood in terms of its role in creating linguistic meaning. First and foremost, what makes the perfective different from the imperfective is not the shorter rather than the longer duration of the situation, but the perception of this situation as “a single, complete whole” (Comrie 1976: 17), “a complete situation, with beginning, middle, and end” (Comrie 1976: 18). As a result, examples such as He reigned for 30 years and He stood there for an hour can be seen as both imperfective and perfective, depending on how we view them: whether or not, respectively, we want to make “explicit reference to the internal temporal constituency of the situation” (Comrie 1976: 21)15. Secondly, ––––––––– 15

English, as seen in the two examples (reigned and stood), has one form for both perfective and imperfective uses; Comrie, however, gives examples of languages – Russian, Greek –

262

Chapter Four

perfectively viewed situations do not need to be punctual. He cried can mean both a single cry and a number of cries made over a period of time. What is, however, true, is that perfective aspect rules out the possibility of an analysis of the internal structure of the situation which is conceptually punctual (even if it is not punctual temporarily). The final misinterpretation is expressed in the belief that the perfective describes a completed situation. Comrie (1976) prefers to use the word complete which, once again, means, a single undivided whole regardless of the fact if the activity is actually terminated, “[i]ndicating the end of a situation [being] at best only one of the possible meanings of a perfective form, not its defining feature” (Comrie 1976: 19)16. Consequently, the perfective will preferably be defined in terms of undivided, unanalysable, complete situations. The imperfective, on the other hand, is a way of making “explicit reference to the internal temporal structure of a situation, viewing a situation from within” (Comrie 1976: 23). It is further subdivided into: habitual, for situations performed repetitively/iteratively, over an extended period of time, which Comrie (1976: 26) refers to as “habitually durative”; and continuous, or the durative proper, which Comrie (1976: 33) calls “imperfectivity that is not occasioned by habituality. The continuous aspect includes two other subcategories – progressive (for non-stative meaning) and nonprogressive. This classification – in terms of the abovementioned oppositions – is presented in Figure 18. Figure 18. Aspect (adapted from Comrie 1976: 25) ASPECT

perfective

imperfective

habitual

continuous

nonprogressive

progressive

––––––––– 16

which indicate this conceptual difference formally. Comrie (1976) gives examples of the ingressive meaning of the perfective: a perfective form may, in addition to the complete situation, indicate the beginning of this situation, as in sit, which means remain seated as well as assume the seated position.

Focus on the semantics…

263

1.2.2. The perfect aspect In addition to the two main aspects diagrammed above, perfective and imperfective, whose opposition is based on the way the internal temporal constitution of a given situation is viewed, Comrie introduces one more aspect – perfect – which, contrary to the other two, “tells us nothing directly about the situation in itself, but rather relates some state to a preceding situation” (1976: 52). In the light of this definition, it becomes clear that the difference between past simple and present perfect, both of which Comrie (1985; see earlier in this chapter) classified as absolute, is not temporal, as claimed by Declerck (1991), but aspectual, determined by the perfect aspect which indicates “the continuing present relevance of a past situation” (Comrie 1976: 52); this interpretation is very much in accord with the Viewpoint/Focus explication offered earlier in this chapter. Aspectually motivated in a similar way is another difference which can be noticed between the present perfect tense and other absolute tenses. Unlike the latter, in whose case there is only one temporal reference point, the present perfect has two such points: the present moment of speaking as well as the point of the past situation with certain present-time relevance. What is important, however, is that the English present perfect cannot be used together with the specification of this past situation17; this specification can be made neither by means of lexical items and phrases (such as yesterday; cf. reference point expressions listed in the part of this section devoted to tenses for more examples) nor indirectly, as in How did you break your arm where the moment of the accident is implied. When it comes to types of perfect, Comrie (1976: 56) lists three which show that states can be related to preceding situations in three different ways (Comrie 1976: 56-65): – perfect of result, in which the present situation is described as the result of some past activity. Instances will include sentences such as He has arrived or I’ve had a bath, in which the fact of being here (former example) or being clean (latter example) are the outcomes of past activities denoted by the used verbs; – experiential perfect, used when the described situation occurred at least once in a certain period of time which started in the past and continues until now. Examples will include I’ve been to New York; – perfect of recent past, indicating temporal closeness between the past situation and the present moment of speaking, like He has just left; Being, as indicated earlier in this section, a special aspectual case, the perfect can be combined with other aspects, the perfective and all of the subtypes of the imperfective. ––––––––– 17

This refers to a punctual reference such as at 5 o’clock and not to extended periods of time that incorporate the present moment of speaking, such as today or this morning.

264

Chapter Four

1.2.3. Lexical aspect When it comes to ways of referring to the internal architecture of a situation, in addition to grammatical means of indicating aspect, we have to address what Comrie (1976: 41) calls the inherent, lexical meaning. However, before presenting Comrie’s perspective on lexical aspect, it seems chronologically proper to go back to the earliest classification of this kind: Vendler’s (1957) four categories of verb encoding: – activities, which are situations consisting of successive phases over a period of time with no inherent endpoint, like swim or run; – accomplishments, basically like activities but with a natural endpoint, such as paint a picture; – achievements, in turn, are accomplishments which are punctual, instantaneous, e.g.: fall or win; and finally – states, which are homogeneous situations with none of the above characteristics (no successive phases, no endpoints), like know. Andersen (1990 cited in Shirai 2004: 92) represents the above as follows: state activity accomplishment achievement

~~~~~~~~~~~ ~~~~~~~~~~x x

An additional category coined by Comrie (1976) and included in the lexical aspect typology proposed by Smith (1991) is semelfactive, which refers to situations which are punctual like achievements but do not encode an endpoint like cough or hiccup. Comrie (1976), instead of using the above five-category specification, discusses lexical aspect based on three contrasting oppositions18. As will be seen in the discussion below, he frequently concludes that clear-cut divisions within each of three pairs are actually quite rare. As a result, it seems that what he refers to is more frequently a situational rather than an inherently lexical aspect. ––––––––– 18

A similar, tri-partite bipolar typology was proposed a decade later by Quirk et al. (1985). Similarly to Comrie (1976), they divide verbs – or their meanings, as they refer to situation types rather than verbs – into stative (qualities, states and stances) and dynamic, the latter subdivided into durative and punctual. On the third level, dynamic durative as well as punctual verbs are further subcategorised as conclusive and nonconclusive, which – partly – overlap with Comrie’s telic and atelic categories.

Focus on the semantics…

265

The pairs of contrast described by Comrie are: – punctual vs. durative verbs. They are also referred to in the literature (cf. Comrie 1976) as perfective vs. continuous (cough or win vs. live or know). There are, however, as Comrie admits, a number of problems related to the distinction. First of all, as he states, durativity relates to how long a certain situation lasts or, at least, is perceived to last. Consequently, the only definite label will be “punctual”, referring to verbs denoting situations that have no duration and consequently no analysable internal structure, like cough. All this will, however, refer only to situations which are interpreted as semelfective (that happened only once) and not iterative (that happened repeatedly; in such a case He coughed may imply durative coughing). In the case of other verbs, their durativity/punctuality is subject to interpretation based on context, without which, as Comrie states, distinguishing the closed categories of punctual as opposed to durative verbs is virtually impossible; – telic vs. atelic verbs. The former are related to the term accomplishment used by the already quoted Vendler (1957), denoting situations which automatically terminate once a certain point of completion has been reached; atelic verbs, in turn, do not imply such an automatic termination. A test Comrie suggests for distinguishing atelic verbs from telic ones amounts to checking if a sentence containing the verb has the same meaning for the progressive and perfect aspects. The difference can be depicted by sing as opposed to make a chair. Comrie explains that a sentence containing an atelic verb, like John is singing, also implies that John has sung while John is making a chair cannot possibly be interpreted as John has made a chair. All this, however, is constrained by the argument structure of the sentence in which the verb is used. Therefore, sing is atelic, while sing a song will be telic, as there is the terminal point that is an indicator of telicity; besides, it would not be possible to interpret John is singing a song and John has sung a song as equal in meaning. As a result, considering the influence of co(n)text, Comrie prefers to speak of (a)telic situations rather than (a)telic verbs, subscribing to Dowty’s view (1972: 28 cited in Comrie 1976: 46) expressed in the statement that “I have not been able to find a single activity verb which cannot have an accomplishment sense in at least some special context”. Another subcategory proposed by Comrie contains telic verbs denoting achievements (cf. Vendler’s taxonomy), and this classification is also problematic. While it is true that such verbs refer to punctual situations and preclude the phase of activity in which the terminal point has not yet been reached – consequently, we say I cut my finger last night and not I was cutting my finger last night – the category contains verbs like die which can be used in the specifically imperfective form as in He was dying. As a way out of this hoax, Comrie postulates acknowledging that verbs like die denote a special class of achievements: the ones in which the punctual event and the immediately preceding process are so closely bound that once the lead-in process starts the event itself cannot be prevented; and

266

Chapter Four

– stative vs. dynamic verbs. The former denote states and the latter – dynamic situations such as events or processes (know vs. run). These two, according to Comrie, are distinguishable based on an examination of the different phases of the situations they denote: in the case of know whatever fragment of the situation is chosen as focus, it will be exactly the same as other fragments; conversely, in the case of run different phases of the situation will depict different aspects of the activity like having one or the other foot on the ground etc. This explanation, however, does not apply to all stative and dynamic situations. On the one hand, there are states, like stand that may or may not involve change in relation to different phases; there are also dynamic events, like the one denoted by emit a sound which do not have to imply any variation to the process. Consequently, an alternative basis on which stative and dynamic verbs can be differentiated is needed. The distinguishing criterion, as Comrie suggests, is that dynamic situations need energy input to continue; states will continue anyway, until terminated. In the light of this, punctual situations, which involve a change of state, are dynamic by definition. In turn, the fact that states are initiated and terminated, makes it possible to refer to them by means of verbs normally denoting perfectivity, like stand as in He stood in front of me (state) and He stood in front of me (initiation of the state). Later publications on lexical aspect (cf. Olsen 1997: 154 as well as Li and Shirai 2000: 16 based on Smith 1991: 30) combine Vendler’s (1957) and Comrie’s (1976) categorisations into the models presented in Table 6. Olsen (1997: 154) supplements the list with an additional category of the stage level state, in which a state is limited in time, as in I know (state) as opposed to I know now (stage-level state): Table 6. Models of lexical aspect

state activity accomplishment achievement semelfacitve stage-level state

nucleus +durative +dynamic +durative +dynamic +durative +dynamic +dynamic +durative

coda

+telic +telic +telic

Additionally, she views each situation in relation to its nucleus and coda, the two notions being important in relationships between lexical and grammatical aspect. As Olsen (1997) states, for the imperfective grammatical aspect event time and reference time intersect at the nucleus of the situation denoted by the verb in question; in the

267

Focus on the semantics…

case of the perfect aspect, the intersection is at the coda of the verb; and the perfective aspect, as indicated before, ignores the internal structure of the situation. Li and Shirai (2000: 16) present the two combined categorisations in a simplified form (and without the sixth category of stage-level state; Table 7): Table 7. Lexical aspect; Li and Shirai’s (2000: 16) categorization states dynamic punctual telic

activities +

accomplishments + +

semelfacitves + +

achievements + + +

The inherent dynamicity, punctuality and telicity (see Comrie’s 1976 definitions presented earlier in this section) of the five situations listed horizontally (Table 7) is indicated by means of a plus (+); the lack of it indicates, respectively, stativity, durativity and atelicity (cf. Comrie 1976). As a result, states (live, understand) are stative, durative and atelic; activities (run, learn) can be characterised as dynamic, durative and atelic; accomplishments (learn, climb a mountain) and achievements (die, understand) are both dynamic and telic with the difference that only the latter can be seen as punctual; finally, semelfactives (cough) are dynamic and punctual like achievements but, unlike them, atelic. Some of the bracketed examples have been purposely used in two different categories to indicate what was signalled in the discussion of lexical aspect based on Comrie (1976): quite frequently the inherent lexical aspect is, in fact, closer to the inherent situational aspect, and is influenced by co(n)text. A question that may arise at this point is why we should discuss the lexical aspect in a book whose main interest lies in the grammatical aspect. There are at least two important reasons for looking at the aspectual properties of verbs and their relations to the grammatical aspect, one connected to language use, and the other – to language acquisition. The relationship between the lexical and grammatical aspects – especially the influence of the latter on the former – has, to a certain extent, been implied in the remarks on co(n)textual constraints on lexical aspect (Comrie 1976) as well as signalled in relation to Olsen’s (1997) concept of verbal nuclei and codas. The influence, however, is by no means unidirectional. For example, the telicity and dynamicity of the verbs used affect the grammatical aspect, as described in Olsen (1997) in relation to perfect tenses, the examples including differences between: resultative perfect in I have eaten (telic) and continuative perfect in I have known him (atelic); or the existential/experiential (dynamic) perfect in I have travelled Europe and the universal (stative) perfect as in I have lived in Europe. Lexical aspect is also of importance in terms of restrictions some stative verbs place – or

268

Chapter Four

not – on the progressive aspect. As a result, in some cases we will have errors, such as *I’m knowing, while other instances of the interaction between the stative lexical and the progressive grammatical aspects will show the potential of the latter for the aspectual re-profiling (to be revisited in Section 3) of verbs. As a result of this, a state may become a stage level state in order to indicate a passing rather than permanent state19. A good example of the perfectivity of the middle – and rather transitory – phase of a state is, among others, the McDonald’s slogan of I’m lovin’ it, whose aim is to imply that the pleasant state does not outlast the moment of consumption. A similar interplay between different kinds of aspectuals was observed by Koops (2004) in his corpus-based study of emergent aspectual constructions in the present day English. Koops found that the newly emerging progressives (like be in the middle of...; sit/stand and ...; go ...ing) show a special preference for some verbs whose lexical meaning coincides with some aspects of the progressive meaning (respectively: the internal – container – perspective; duration; iteration/repetitiveness). What needs to be pointed out in addition to this is that the grammatical aspect interacts not only with the lexical aspect but also with other linguistic means delineating the internal temporal profile of an event, such as adverbials referring to the temporal extension of described situations. Koops (2004) notes a characteristic of the progressive constructions which he refers to as a “certain bias towards occurring with adverbials specifying punctual temporal reference” (2004: 126) like at 5 p.m. and a parallel bias against adverbials specifying duration, like all day or for an hour. All these phenomena can be accommodated within Smith’s (1991) twocomponent theory, in the light of which the situation type (achievement, accomplishment, activity, state and semelfactive) – which Li and Shirai (2000) relate to the lexical aspect – and the viewpoint aspect (perfective, imperfective and neutral20), called grammatical aspect by Li and Shirai (2000), will be in constant interplay. As demonstrated by studies treating lexicon as the organisational system of the basic structure of language (cf. Li and Shirai 2000), such interaction results from the fact that aspect as such is a kind of interface between lexicon and grammar, “[l]exical aspect [containing] information about the semantic properties of lexical items, and grammatical aspect [conveying] information, usually expressed by morphological devices, about grammatical categories” (Li and Shirai 2000: 5). All of this will influence language use – examples of which were quoted above – and language acquisition, discussed below. In turn, the body of research into how lexical and grammatical aspects interact in the process of language learning has been related to a theoretical ––––––––– 19 20

A similar process has been described by Comrie (1976; cited earlier in this section) in the case of the perfectivity of stative verbs that indicate the initiation or termination of the state. This is an aspectual value for which both perfective and the imperfective interpretations are possible (see also Li and Shirai 2000: 212).

Focus on the semantics…

269

construct called the Aspect Hypothesis. In the light of the hypothesis: “first and second language learners will initially be influenced by the inherent semantic aspect of verbs and predicates in the acquisition of tense and aspect markers associated with or affixed to these verbs” (Andersen and Shirai 1994: 133). Bardovi-Harlig (2002: 130; see also 2005: 397) based on Shirai (1991) as well as Andersen and Shirai (1996), breaks down the Aspect Hypothesis into four separate claims: (a) learners first use perfective past marking on achievements and accomplishments, eventually extending their use to activities and states; (b) in languages that encode the perfective/imperfective distinction, imperfective past appears later than perfective past, and imperfect past markings begin with states, extending next to activities, then to accomplishments, and finally to achievements; (c) in languages that have the progressive aspect, progressive marking begins with activities, then extends to accomplishments and achievements; (d) progressive markings are not incorrectly overextended to statives. Related to this will be what Bardovi-Harlig (2000) calls the native-speaker distributional bias: the lexical/grammatical aspect combinations to be learned first are the ones that are most frequently used in native speaker speech: the perfective achievements and accomplishments; the imperfective states; and the progressive activities. Considering all of the above, it might be more proper – very much, in fact, along the lines suggested by Comrie (1976) the later amendments notwithstanding – to talk about the aspect of situations in which the lexical and grammatical aspects interact with each other on the one hand and with various co(n)textual phenomena, including, in addition to the afore-mentioned argument structure, grammatical factors such as tense. This will lead us to the usage-based model of Bybee et al. (1994) discussed below. 1.3. Tense and aspect. The grams Based on Comrie’s (1976, 1985) proposal – and an extension to it – is the model accommodating both tense and aspect put forward by Bybee et al (1994). The team of researchers look, diachronically, at the process of grammaticalisation of lexical items and the gradual development – through phonological and semantic reduction – of grammatical morphemes called grams in the languages of the world. Grams, as defined by their proponents, are form-meaning pairings used to denote the temporality of expressions. The meaning they carry is very general, “often characterised as abstract or relational (Bybee et al. 1994: 5) and “more dependent on the meaning contained in the context” (Bybee et al. 1994: 7). As a result, the process of assigning meaning labels, as Bybee et al. argue, is best carried out in relation to “the conceptual content of the uses” (Bybee et al. 1994: 45). In this way, as Bybee et al. admit, their approach follows Dahl’s (1985) recommendation to look at grams not through Jakobsonian distinctive features but via their semantic content understood as “the semantic substance of the focal

270

Chapter Four

points in the conceptual space that are encoded by gram” (Bybee et al. 1994: 46). This final claim, which evokes associations with Fauconnier’s (1994, 1997) conceptual architecture, confirms that the described approach is usage-based with the focus space referring to the space of a current event; the gram will be the space builder while the conceptual content with which the space will be filled in is triggered by both the conventionalised meaning of the gram and the present context in ways that will be discussed in detail later in this chapter. At this point, we are going to concentrate on the individual meaning labels for grams. Bybee et al. (1994) divide grams into three groups: 1. completives or the grams that are found along the paths leading to pasts and perfectives; 2. grams, whose common meaning is progressive, imperfective and present; and finally 3. different pathways to the future, and their definitions are not purely temporal; Bybee et al. acknowledge the nongrammatical (lexical) sources of grams as well as take into account their current context-dependent relevance. In relation to group (1), Bybee et al. (1994) assign the following labels21: – completives, which indicate doing something “thoroughly and in completion” (54), as in the case of shooting someone dead or eating up; – perfectives, which stand for actions understood as indivisible wholes; which will have the meaning of completives but without the absolute thoroughly-andin-completion quality, such as paint a picture on the condition that we see them as bounded and not as processual. Perfectives will also be used for narratives containing sequences of discrete events, such as He opened the window, looked into the dark and spotted nothing; – anteriors or perfects, which are relational completives – situations occurring prior to a reference time and in relation to the situation at this reference time: I had finished dinner before they arrived or I’ve just eaten dinner; – resultatives, similar to passives in the sense that they indicate states that result from past actions, such as He’s gone; and – past related to situations that occurred before the moment of speaking. As Bybee et al. (1994) point out, the label past will frequently co-occur with meanings described by other labels – and not discussed in this group – such as ––––––––– 21

Bybee et al. (1994) look at grams across the languages of the world. As a result, the meaning labels they propose are an all-inclusive set of form-meaning pairings. They will be quoted as such a set, the meanings that are relevant for the English tense and aspect to be referred to later in the present chapter.

Focus on the semantics…

271

habituals, progressives or imperfectives. The reason why these meanings are absent from this group is because it includes only grams in whose case the past coincides with the completive. Group (2) includes aspects and tenses related to situations which are present, repeated and ongoing, including grams labelled as: – imperfectives, used for situations which, unlike in the case of perfectives, are treated as unbounded and viewed from within. The situation may be perceived as in progress: He’s painting a picture; habitual: He paints pictures; continuative: He keeps painting; iterative: He paints and paints and nothing happens; etc. As a result, Bybee et al. (1994: 126) find it difficult to view the so-called present tense as tense, because, as demonstrated above, we are actually dealing with “deictic temporal reference” covering “various types of imperfective situations with the moment of speech as the reference point” (Bybee et al. 1994: 126); as well as labels, some of which have already been mentioned in the definition of the imperfective, including: – progressive, viewing actions – expressed by dynamic predicates – as ongoing at reference time like She is reading; states are not normally described by progressive forms; – continuous, applied to both states and actions which are perceived as ongoing at the reference time; it will be parallel to Comrie’s (1976) continuous nonprogressive category, such as She lives in Paris; – habitual relating to situations repeated on several occasions, like I have a little workout every morning; – iterative in relation to situations such as the ones repeated on a particular occasion; mostly – but not only; cf. It rained and it rained and it rained22 – used for telic predicates; – frequentative, a type of habitual meaning with the indication that a certain situation is frequently repeated in a certain period of time: He visits us frequently/quite often; – continuative: progressive with the implication of the agent’s deliberation in keeping the action continuous, including: She keeps repeating this. Finally, labelling different grams which mark the future, Bybee et al. (1994: 243) start the discussion by admitting that “it is not uncommon for languages to have more than one gram which has future as a use”. This results from the already ––––––––– 22

Allan Milne, Winnie-the-Pooh, Chapter 9 in which Piglet is entirely surrounded by water, the chapter’s title itself being a nice example of a resultative gram.

272

Chapter Four

discussed opacity of future situations which, unlike the past and present situations, do not fully belong to the realm of factuality. The grams listed below bear the mark of high specificity: – immediate and distant futures: I’ll do it at once/now as opposed I’ll do it (some time in the future); – deontic and epistemic futures, the former including possibilities, necessities or volition – You will be punished; The king will be respected – the latter further subdivided into future certainty and future possibility, depending on how certain the speaker is that the situation will actually occur – I’m sure they will come as opposed to I hope they will come; – definite and indefinite futures, similar to deontic and epistemic futures with the reservation that in this case it is not so much the speaker’s certainty but rather the speaker’s assurance that the situation will occur at a definite time; – expected futures, referring to situations which are expected to occur, among others because they have been pre-arranged, such as the English: The coach leaves tomorrow at eight; I’m going to the seaside next Tuesday: or She is (going) to be here soon; finally Bybee et al. (1994) mention all kinds of aspectual futures, including perfective and imperfective, with all of the previously labelled meanings. Summing up the preliminary considerations of tense and aspect, we will revisit the main ideas. Comrie (1976, 1985) as well as the other theoreticians cited look at tense and aspect from two main perspectives: in terms of deixis (tense) and the intrinsic properties of the event in question (aspect); and as grammatical categories with special regard to meaning, basic and peripheral as well as conventionalised as opposed to context-specific semantic representations. The present chapter, in its subsequent parts, will follow this route, signposting it with relevant findings of cognitive semantics, discussed extensively in the previous chapter. As a result, both the English tense and aspect will, primarily, be examined as grammatical categories: we will look at them through two kinds of meaning – conventionalised and context-specific – and the relevant prototype effects related to them. In the case of the former, temporal profiles of individual grammatical tenses and different aspects will be presented; the question of re-profiling will also be discussed. When it comes to prototype effects, they will be examined in two ways: through the primary – as opposed to peripheral – meanings of a given tense/aspect; as well as through the primary – as opposed to peripheral – use of a certain prototypical – and not other – temporal expression. Then the English tense and aspect will be looked at from the perspective of perspective: tense through deixis and aspect from the point of view of the optimal/egocentric viewing arrangements. Subsequently, both tenseas-deixis and aspect-as-viewpoint will be examined as elements of the mental space architecture, temporal grams (Bybee et al. 1994) seen as space builders; the contents

Focus on the semantics…

273

of the focal space they trigger will be examined as well. Finally, the English tense and aspect will be accommodated within the Conceptual Integration Theory to show that – like any other form-meaning pairings – they are subject to the dynamic aspects of online meaning constructions such as frame shifting and conceptual blending, the latter in particular related to Comrie’s (1976, 1985) concept of the interaction between the conventionalised and context-specific meanings of a given tense/aspect combination. 2. The English tense and aspect as grammatical categories. The prototype effects One of the best known perspectives on tense as a grammatical category is the proposal offered by Taylor (2003) for the past tense. As such, it will presented here; it will also be the point of departure and the model for similar categorical representations that will be devised for the present and future tenses, in whose descriptions the respective categorisation paths delineated by Turewicz (2000) and Aijmer (1985) will be taken into consideration and used for the crossreferencing of the models. This section will also look at how these three tense categories relate to the previously described aspects, with special regard to the perfective/imperfective distinction. In addition to this, separate aspectual categories will be proposed for the progressive and perfect aspects because of their particular importance in the English perfect and continuous tenses described in pedagogical grammars. All of the above considerations and related conclusions will be based on the results of the corpus-based study carried out by Mindt23 and presented in his two publications (1995 and 2000). 2.1. The past tense Taylor looks at the past tense as polysemous and, consequently, “a cluster of related meanings” (2003: 171), which can be listed based on their prototypicality. First and foremost, the past tense is a deictic structure used for locating situations at a point in time prior to the moment of speaking. Another frequent use is the one found in the historic narrative, where states, events and actions are sequenced and, consequently, mutually-referencing. This use is extended to any narrative, including fictional narratives, in whose case it is difficult to speak of factual referencing because ––––––––– 23

Mindt’s is a corpus-based study carried out with the use of the following corpora (Mindt 2000: 596): 1) General corpora: Lancaster-Oslo-Bergen Corpus, Brown-University Corpus, Corpus of English Conversation, Longman Lancaster Corpus and British National Corpus; and 2) Newspaper corpora: British – The Times (1993) and The Independent (1993) and American – The New York Times (1993) and USA Today (1993). The total number of words – 243,098,954.

274

Chapter Four

situations described in fiction do not actually belong to the realm of realis. Still another meaning paired with the past form is that of counterfactuals, found in contexts such as if-conditionals, wishes and desires as well as suppositions and suggestions. The final subcategory of past tense meanings is the one formed by pragmatic softeners, in which the past tense is used as a face-saver such as I wanted to ask you something (instead of I want to ask you something) or Was there anything else that you wanted (instead of Is there ...), which are supposed to distance the speaker from the possible negative reaction of the addressee. Trying to co-relate these different meanings of the past tense, Taylor (2003) considers a number of possibilities for an overall uniting concept. The proposals include counterfactuality – or non-actuality, to use Comrie’s (1985) label – or the metaphorically understood remoteness. However, like Comrie, Taylor rejects both of them as the overall meanings of the past tense on the ground of their generality and the resulting temporal ambiguity: counterfactuality and remoteness may well refer to future situations as well as present-time states; and remoteness can be spatial (not here) as well as temporal (not now). Consequently, the conceptual representation of the past tense Taylor eventually proposes is not an attribute model but a radial category, in which – as in the case of the MOTHER ICM described in Chapter 3 – the central and peripheral meanings of the form in question are assigned to different nodes. They differ in their semantic extensions (e.g. narrative past is derived from historic past and not any other node), their syntactic environments (the contexts in which counterfactuals are used are unique) or their susceptibility to routinisation (the case of pragmatic softeners, where the past reference of the past tense is subject to semantic bleaching). As a result, the past tense – based on Taylor’s (2003) description of subcategories as well as the models he proposes for polysemous lexical items – can be represented by means of a radial diagramme (Figure 19A; the prototypical meanings are bolded; varying font size refers to differences in prominence). It is important to point out that in this diagramme, both the historical narrative and the fictional narrative are part of a common network, in which the latter is an extension of the former and both have the subcategory narrative as the upper-level schema that captures the commonality between them (cf. Taylor 2003: 164-165). At the same time, while it is difficult to disagree with Taylor that finding an overall uniting concept for all of the different meanings is difficult, it seems possible to see the deictic and narrative uses as a monosemous category. As such, it will have a perceivable inner structure and will be represented like e.g. the bird category, as seen in diagramme (Figure 19B), with deictic characterised by features 1, 2 and 3; historical narrative – 1 and 2; and fictional narrartive – 1. The prototypicality of deictic and narrative pasts with their primarily temporal meaning, has been corpus-confirmed by Mindt (2000). The results of his study show that past forms denoting the temporality – rather than other meanings – of states and events amount to 95% of the past tense use.

275

Focus on the semantics… Figure 19. Past tense as a radial category (A)

counterfactual if-conditionals; wishes and desires; suppositions and suggestions

THE PAST TENSE

pragmatic softeners

fictional

the non-deictic /narrative for co-referencing of sequenced situations

the deictic for locating situations at a point in time prior to the moment of speaking

historical

(B) 1

fictional narrative 2

historical narrative 3

deictic

situations 1) prior to the moment of speaking; 2) realis; 3) with reference to the moment of speaking

276

Chapter Four

As for the grammatical aspect that would typically be used in the case of past situations, we can postulate, based on the classification of grams proposed by Bybee et al. (1994) as well as on the native speaker distributional bias hypothesis (Bardovi-Harlig 2000), that the perfective will be the most frequent in the use of the past tense forms. This is confirmed by the statistics presented in Mindt (2000). First of all, it is important to note that in most cases (93%) past form uses refer to definite past, which means that the when of the described situation is specified – or inferred from the context – as a point or section on the timeline. This initial observation indicates that when talking about a past situation, we refer either to complete, perfective events such as accomplishments, achievements and activities or to imperfective situations in progress, including states and habitual activities. As for the perfective-to-imperfective ratio of past tense use, the former prevails, as indicated by a number of findings Mindt (2000) presents. To start with, the dominance of the perfective aspect seems to be proved by the list of the most frequent past tense verbs including: say, go, come, take, look and get, which cover approximately 23% of all uses of past forms (Mindt 2000: 186). It is important to note that they all carry the perfective rather than the imperfective meaning: three of them – come, take and get – are accomplishment/achievement words with a coda focus; go can be an activity verb or an accomplishment/achievement word with the focus on onset and most uses of look are semelfactive, with the dynamicity of achievements but devoid of their telicity. We can argue that all of the words can also be used to indicate repetitive occurrences of the situations they denote. This, however, is disconfirmed by Mindt’s other statistics: according to his findings, in only about 2% of their use do past forms carry the imperfective meaning, which Mindt (2000: 183) refers to as “progressive meaning without using the progressive form”. All of this suggests that the uses of the past tense are in all likelihood completive (perfective) rather than, for example, habitual/ frequentative (imperfective). Considering the above – together with the previously constructed radial diagramme and the attribute diagramme – we can state that the past tense will prototypically combine past time orientation with perfectivity. This is quite consistent with the properties of human perception as such: entities that are seen from a distance – and the prototypical past tense implies temporal remoteness – will tend to be conceptually punctual and unanalysable, which in the case of aspect means perfective. 2.2. The present tense When it comes to the present tense, Mindt (2000) gives three prototypical meanings of this temporal construction – 1) extended present; 2) actual present; and 3) timelessness – and three peripheral meanings – 4) future and 5) past time orientations as well as 6) if-conditional counterfactuals. The first two prototypes refer to real states or events occurring in the extended now, such as The club

Focus on the semantics…

277

members take part in three trips a month and I rarely eat more than an apple for breakfast; or the actual now, at/around the moment of speaking as in I want to go home or Here comes Charlie. Additionally, the actual present prototype includes present states such as I know, as well as events denoted by the so-called performative/declarative verbs like I beg you to stop. The final subcategory of the present tense – timelessness – is represented by statements of general truths, for example Children obey their parents. The peripheral meaning subcategories, in turn, include: the so-called historic present, which is the present tense used to narrate past events, e.g. It’s last Tuesday evening, the door opens and in comes Charlie ...; the future time orientation as in I’ll call you as soon as I get home or We come tomorrow; and finally the if-clause use of present forms, an example of which may be If I don’t leave now I could be late. The prototypes as listed above have been ordered by prominence: extended present accounts for 58% of all uses of the present tense; actual present – 24% and timelessness – 11%. It is important to point out that the percentages are quoted for the most typical present time reference verb pattern: that of verbs other than be, have or do; for the verb be, which is the second most typical present verb patterns, the ranking is different: extended present – 34%, timelessness – 33% and actual present – 32% (source of all statistics: Mindt 2000: 135-161 and 592-595). We are, however, dealing with the same trio of the most prominent present meanings, which can be diagrammed as in Figure 20A (the prototypical meanings are bolded; varying font size refers to differences in prominence). Considering the most prototypical aspects expressed by the present tense, we can speculate, following Bybee et al. (1994) and Bardovi-Harlig (2000) that the majority of its uses will be imperfective. To revisit a previously offered explication, we can also say that what is perceived directly – and the present tense perspective precludes any type of temporal distance – is more likely to be analysed. This claim can again be positively verified based on the data found in the cited corpus-based study (Mindt 2000). Firstly, Mindt (2000: 160) points out that present forms denote events (63%) more frequently than states (37%). This – combined with the fact that two of the three prototypical meanings of the present are extended present (58%) and timelessness (11%) – leads to the conclusion that most uses of the present tense (69%) will refer to events that are stretched over a period of time, which inevitably pinpoints imperfective habituals and frequentatives, as well as relatively frequent statives (cf. BardoviHarlig 2000).

278

Chapter Four

Figure 20. Present tense as a radial category (A)

future time orientation

past time orientation

extended present

THE PRESENT TENSE counterfactual

actual present

timelessness

(B) 2 1 timelessness

extended present

3 3 actual pres present

extended present

actual present

present

timelessness

1) habitual/frequentative; 2) stative; 3) perfective

Focus on the semantics…

279

All of this is confirmed by Mindt’s list of verbs most frequently used in the present tense (covering approximately 50% of all present forms; Mindt 2000: 160) such as the predominantly non-dynamic know, think, mean and see. The perfective, in turn, though definitely peripheral in the overall picture, will probably be the most common aspectual option for the actual present (among others in the already mentioned performatives/declarative verbs); this aspect will also be found in the non-prototypical use of the present tense for the past time reference, as the historic present is normally chosen if we want to import a certain amount of dynamicity into the description of a past situation. Combining the present tense and aspect prototypes, we can observe that the three types of aspect – habitual/frequentative, stative and the more peripheral perfective – can be accommodated within the timelessness  extended present  actual present continuum of the prototypical uses of the present tense. In this continuum the boundaries between the three components are fuzzy: it is, for example, difficult to decide, where actual present ends and the extended present begins. As a result, the transition from one to another will imply a gradual zooming in or out, with the extended present tending towards timelessness in one direction or towards the actual present in the other direction. Consequently, it is possible to draw a present tense-aspect bird-type attribute diagramme as shown in Figure 20B. A model of the present tense semantics which, to a certain extent, incorporates aspectual properties of verbs used in the pattern in question is the one put forward by Turewicz (2000). She proposes a slightly different perspective on the above-mentioned six – prototypical and peripheral – meanings of the present tense. She looks at the temporal orientation of each of them from the perspective of the Ground, containing elements of both the Ontological Reality (OR) and the Epistemic Domain of the conceptualiser (speaker or their interlocutor). Turewicz (2000: 153-157) states that the Ontological Reality (OR), with special regard to such dimensions as the Ontological Time of Speaking (OTS) and Ontological Ground Space (OGS) both encircle the fragment of OR constituting Ontological Immediacy (OI). As such, Ontological Reality is mirrored in the conceptualiser’s Epistemic Domain – the speaker/interlocutor conception of OR – with the dimension of Epistemic/Conceptual Time of Speaking (CTS), Ground Space (CGS) and Immediacy (EI). This mirroring, however, does not have to mean that CGS and CTS will always coincide with OGS and OTS, a fact that is demonstrated by the cases of the historic present or the future use of the present tense. Considering all of this, Turewicz (2000: 157) postulates the following definition of the meaning of the present tense: The present tense morpheme can be defined as a grounding predication which locates a conceptualised situation in the Epistemic Immediacy of the ground. In the cases when the Conceptualised Time of Speaking coincides with the Ontological Time of Speaking ... such cases define what can be called ontologically real and immediate meaning of the Present Simple Tense. In the cases when Conceptualised Time of Speaking is specified

280

Chapter Four as extending beyond the Ontological Time of Speaking, objective scenes from the part of Ontological Reality linguistically specified will be conceptualised as ontologically real and epistemically immediate.

Based on the definition, Turewicz (2000: 183-184) proposes the following specifications of six meanings24 of the present tense; the examples are labelled following Quirk et al. (1985): – ontologically real and immediate present, in which CTS coincides with OTS and EI with OI; an example will be the state present: timeless or extendedpresent states realised by imperfective verbs such as I know you or We live near here; – ontologically real and epistemically immediate present, in which CTS extends beyond OTS, as in the case of the habitual present: timeless or extended present frequentatives/iteratives realised by perfective verbs like I go to school; – ontological in-Ground present, in which CTS coincides with OTS and CGS with OTS; examples include commentaries (Ronaldinho scores), exclamatory sentences (In comes our star!) or performatives (I propose a toast to our host); – epistemic in-Ground present, in the case of which CTS is identical with OTS; the described situation belongs to OR prior to OI but the speaker chooses to place it in EI; this is the case of the historic present as in It’s Friday night, I enter the room and ...; – ontologically planned epistemically immediate present, in whose case the described situation belongs to OR which is ahead of OI; the situation is conceptualised as immediate because one of its facets is ontologically immediate: for example a timetable belonging to OR has the potential to create a reality which will be epistemically immediate (The train leaves at 7.00); and – ontologically non-real, epistemically irreal in-Ground present, in which CTS is identical with OTS and the situation can be placed in the area of Epistemic Irrealis, as Turewicz (2000: 184) puts it, “beneath Ontological Reality”; examples include conditional sentences type 1. Turewicz’s classification basically coincides with the one proposed by Mindt (2000), with one already indicated difference in the area of the prototypical uses: Turewicz ignores the distinction between timelessness and extended present; instead, she divides present situations into stative and dynamic. Other meaning categories overlap.

––––––––– 24

Turewicz’s specification differs from Mindt’s: timeless or extended-present uses are divided into stative and dynamic/habitual

Focus on the semantics…

281

2.3. The future tense In his statistics referring to will – discussed as a modal verb – Mindt (1995: 61) shows that approximately 94% of its occurrences have future time orientation, present time orientation and timelessness being considerably rare and amounting to 4% and 2%, respectively25. Within this framework three different prototypes of the future tense can be listed (Mindt 1995: 57): 1) certainty/prediction, as in You’ll get better soon (71%); 2) volition/intention, including examples such as I will tell you what I mean (16%); and 3) possibility/high probability, like You’ll probably get better (10%). In addition to these, Mindt lists two less frequent uses – 4) inference/deduction (That will be him, over there) and 5) habit (He will sit for hours, reading) – which together amount to 2% of all occurrences of will. The above examples – which, with the exception of the second prototype (volition/intention) which is deontic, belong to the epistemic modality – can be diagrammed as in Figure 21 (the prototypical meanings are bolded; varying font size refers to differences in prominence; attribute diagramme not included because of the polysemy of will): Figure 21. Will as a radial category

habit

inference /deduction

certainty /prediction THE FUTURE TENSE

possibility /high probability

volition /intention

––––––––– 25

Mindt (1995: 61) admits that will used for timelessness is slightly more frequent in expository prose – 12% of occurrences.

282

Chapter Four

The above attempt at constructing a prototype model of will is, to a certain extent, endorsed by historical linguistics. Based on a diachronic investigation of the semantic development of will, with particular regard to the change form “the modal notion of Volition to a temporal expression with future meaning” (Aijmer 1985: 11), Aijmer derives the prototypical temporal (cf. Mindt 1995 above) meaning of will from the Old English willan denoting a want/need/craving/desire. She notes that volitional modality of willan was later subject to weakening in will+that clause and will+infinitive constructions as well as the 3rd person dilution of the ‘active’ desire. The result of all of the semantic bleaching is the nonvolitional future for both human and, by extension, non-human subjects. As a result, as Aijmer (1985) argues, what started as a modal, volitional will, changes into an expression of future which, in a kind of U-turn, reassumes the modal epistemic meaning, epistemic modality being a kind of modality, “in which the speaker expresses his attitude to the whole situation (cf. Mindt’s categories of certainty/prediction and possibility/high probability above). Alongside these changes, as Aijmer (1985) shows based on diachronic comparisons, the volitional will also develops into a semantic vehicle for the omnitemporal (Aijmer 1985) or timeless (Mindt 1995) capacity, habit and disposition, which are peripheral in comparison to the prototype. Consequently, the prototype around which will is organised will be as follows (Figure 22; adapted from Aijmer 1985: 19): Figure 22. Radial category for will (adapted from Aijmer 1985: 19) Epistemic modality Disposition

future

future

will

Capacity

Habit Volition The diagramme (Figure 22), like the previous one (Figure 21; based on Mindt’s 1995 statistics) shows that the prediction- and volition-related uses of will are prototypical and the others – peripheral. Aijmer (1985: 16 following Lyons 1977: 846) further argues that “[t]he deontic interpretation is ... indistinguishable from the future one if the person who predicts something is in position to bring about

Focus on the semantics…

283

that the action takes place”. This is one more confirmation that, unless we are able to prove the opposite, we can assume that the temporal meaning of will is prevalent. As for the aspect most typically related to the future tense, following Bybee et. al. (1994) and Bardovi-Harlig (2000) we can say that it will be: perfective for future (planned, promised, desired, etc.) achievements and accomplishments as well as some activities, and imperfective for future states and repetitive events. All of this is corroborated by Mindt’s (1995) statistics, based on which the future tense is used with verbs denoting events (68%; examples go, make, clean) and states (32%; examples be, have, need). Relying on the earlier considerations referring to temporal remoteness and its effect on the perception of situations as punctual and unanalysable (cf. the discussion of the past, earlier in this section), we can postulate that, at least in the case of the much more frequent events26, the future tense, like the past tense, will be more likely to denote the perfective aspect. 2.4. The progressive and the perfect aspects In addition to the above, two aspects will be discussed separately: the progressive which is part of the imperfective domain (Comrie 1976; Bybee et al. 1994); and the perfect belonging to the group of completives (Bybee et al. 1994). There are two reasons why these two are dealt with separately and subsequently related to all tenses in general. First of all, regardless of the tense with which they are combined, these two aspects have their special meaning profiles which will be presented as radial categories below. Secondly, the progressive and the perfect are – in addition to simple – the basic pedagogical grammar labels given to English tenses, which is important as we are slowly approaching the part of the book in which educational decisions will be described. The imperfective progressive – represented by the be auxiliary plus the present participle of the verb in question – will denote one of the following meanings (Mindt 2000: 248-250): 1) incompletion; 2) temporariness; 3) iteration, especially in the case of semelfactives; 4) highlighting/prominence; 5) emotion; 6) politeness/downtoning; 7) prediction; 8) volition; and 9) matter-of-course. The most prominent meanings are (Mindt 2000: 256) incompletion (60%), temporariness (36%) and iteration/habit (12%) with the others being much less frequent: highlighting/prominence (9%), prediction (7%), volition (5%) and emotion, matter of course and politeness (5%). The radial category for the progressive is presented in Figure 23 (the prototypical meanings are bolded; varying font size ––––––––– 26

This, in fact, may also be the case for stative verbs. The meaning of You’ll be alright, which Mindt (1995: 62) gives as one of the prototypical examples, is not stative but the one of the achievemental/punctual You’ll overcome a problem and consequently perfective.

284

Chapter Four

refers to differences in prominence)27. The three prototypical uses share one characteristic of the described situation: it is in progress in the sense that it has not been completed, it is being repeated or it is temporary in the sense that it is unstable and cannot be classified as permanent. As a result of this similarity – as well as the fuzzy boundaries between the said subcategories – the three prototypes are frequently combined, the most popular co-occurrences being (Mindt 2000: 257-259): – incompletion+temporariness – I was sitting and thinking how difficult it is to write anything – 30% of all incompletion uses (incompletion in isolation, as in It was getting cold, 52%); – temporariness+incompletion – We were doing our homework when you came in – 50%, of all temporariness uses (temporariness in isolation, as in He was repairing the car all day yesterday, 40%) – iteration+incompletion – You are drinking too much – 44% of all iteration uses (iteration in isolation, as in She is always coming up with nice ideas, 31%) Figure 23. Progressive as a radial category

emotion matter of course politeness

volition

incompletion

THE PROGRESSI VE ASPECT

highlighting / prominence

temporariness

iteration/habit

––––––––– 27

The percentages, as indicated by Mindt, add up to more than 100% because individual meanings frequently overlap

285

Focus on the semantics…

Potential uses of the progressive in combination with all three tenses – present, past and future – are presented in the Table 8 below: Table 8. Present, past and future progressives temporal orientation  present

past

future28

incompletion

I’m reading the book.

temporariness

I’m working as a teacher.

We were finally making progress. We were doing our homework when you came in. She was coughing.

They will slowly be getting ready. I’ll be staying with my aunt and uncle on my holiday. She will still be haplessly hiccupping. You will be going, I swear!

aspectual meaning of the progressive



iteration

highlighting /prominence emotion politeness /downtoning prediction

volition

matter of course

She is crying.

I’m warning you!

Oh my, you are always complaining. What are you suggesting? They are coming.

I’m not doing it again. If we are sharing expenses, it may be good to ask the waiter for two separate bills.

That’s what I was telling you all that time! Oh God, we were waiting forever. I was thinking if you could possibly let me know ... We thought you weren’t coming. He said they were leaving. If you were doing it the usual way you were probably doing it wrong.

The prime minister will not be dining with us. They will be waiting for us at home, I guess. In the 22nd c. people will be living in space.

The list above contains more and less prototypical uses of the progressive. Based on corpus statistics (Mindt 2000: 251-265), they will be most frequently matched with the past tense (50%) and the present tense (41%). Future time orientation and timelessness – as in the emotive You are always complaining – are rare and account for 5% and 4% of uses, respectively. As for the combinations of the ––––––––– 28

It has been difficult to find examples of use for future emotion and future volition. In both cases the tentativeness of the things to come makes it difficult to make any predictions about our affective responses – including desires – which are usually automatic and not easily predictable.

286

Chapter Four

grammatical progressive and most frequently used verbs, the aspect in question, as postulated by Bardovi-Harlig (2000), is popularly applied to verbs demoting activities (87%) with go, do, get, come, try and look accounting for 60% of all uses; state verbs are much less frequent – 13% of uses. It is also important to point out that in the case of will+progressive, the non-progressive meaning – like habit/matter of course or downtoning – is much more frequent than the temporal progressive meanings (Mindt 1995) of temporariness, incompletion or iteration/habit. When it comes to the perfect aspect, the meaning differs slightly depending on the temporal orientation – present, past, future – and the combination of the grammatical aspect with the lexical aspect. In the case of the present perfect, the three prototypes will be (Mindt 2000: 222): 1) indefinite past, resultative, as in We’ve climbed Mount Doom (the combination: present tense+perfect aspect+telic verb); 2) indefinite past non-resultative, as in I’ve seen you before (the combination: present tense+perfect aspect+[usually] stative verb); and 3) indefinite continuative I’ve lived here for more than a decade now (the combination of present tense+perfect aspect+atelic verb). The three prototypes are, respectively, representative of 48%, 31% and 17% of all present perfect uses. The less frequent meanings are recent past (4%) and completion (less than 1%) (Mindt 2000: 224). The radial category for the present perfect is depicted in Figure 24 (the prototypical meanings are bolded; varying font size refers to differences in prominence). Figure 24. Present perfect as a radial category indefinite past: resultative

indefinite past: nonresultative

completion

THE PRESENT PERFECT

recent past

continuative past

287

Focus on the semantics…

In turn, the past perfect has seven meanings (Mindt 2000: 237), five of which, with the possible substitution of the past with the pre-past time orientation, are the same as the ones listed for the present perfect above. The two additional ones are definite pre-past: He had been to Paris in 1980; and the non-real ifconditional past: If I had been there last night ... . Mindt (2000: 240-242) lists three prototypical meanings of the past perfect: 1) definite pre-past (39% of all uses of the tense); 2) indefinite pre-past: resultative, as in She had given me the book (32% of all uses); and 3) indefinite pre-past: non-resultative, as in I had expected to be taken home (19% of all uses). The less frequent cases include: 4) non-real past (the if-conditional use; 6%); 5) continuative pre-past (I had not seen you; 3%); and 6) completion as well as recent pre-past (She had finished writing the letter; She had just finished writing the letter; approx. 1%). The radial category for the past perfect is depicted in Figure 25 (the prototypical meanings are bolded; varying font size refers to differences in prominence). Figure 25. Past perfect as a radial category

completion

recent prepast

definite pre-past

THE PAST PERFECT continuative past

indefinite past: resultative

non-real pre-past

indefinite past: nonresultative

288

Chapter Four

Future perfect, classified by Mindt (2000) as a modal perfect, has only one temporal meaning – completion. What is important to point out is that this meaning is peripheral, the core use of the will+have+ past participle pattern being the non-temporal modal of prediction about the past as in They will have arrived by now. Consequently, the three perfect tenses can be presented in the following bird-type attribute diagramme (Figure 26), which shows that the prototypical quality of indefiniteness will be shared by past and present perfect tenses while the common characteristic of all three, including the future perfect, is completion: Figure 26. Attribute diagramme for the present, past and future perfect tenses 1 2

3

4 past perfect

present perfect

future perfect

5 1) completion; 2) indefinite; 3) continuative; 4) definite; 5) non-real

2.5. Prototypicality effects in time orientation Finally, there is one more prototype effect which, unlike the previous ones, is not related to a particular verb pattern representing a tense and/or aspect; it refers to time orientation which, as demonstrated before, can be expressed by means of different tenses and aspects. In the summary section of his book, Mindt (2000: 592-595) reverses the reference hierarchy and shows the verb patterns which are the most typical for each of the three time orientations – present, past and future as well as timelessness – in three kinds of discourse: spoken conversation, fictional texts and expository prose. Table 9 presents the prototypes (in bold, bigger font size) and the less popular patterns (smaller font size) abridging the four tables presented in Mindt (2000: 592-595).

289

Focus on the semantics… Table 9. Prototypicality effects in time orientation ( based Mindt 2000) time orientation  type of discourse 

present

past

future

timelessness

spoken conversation

present forms [THE PRESENT SIMPLE TENSE] of verbs other than be, have and do (40-56%)

past forms [THE PAST SIMPLE TENSE] of verbs other than be, have and do (20-39%)

will+inf [THE FUTURE SIMPLE TENSE] (40-56%)

present forms of be (40-56%)

present forms of be (20-39%) present progressive (3-4%)

present perfect (1019%) past forms of be (10-19%) would+inf (10-19%) past progressive (59%) past perfect (3-4%) past forms [THE PAST SIMPLE TENSE] of verbs other than be, have and do (20-39%)

be going to form (10-19%) present progressive (3-4%)

present forms [THE PRESENT SIMPLE TENSE] of verbs other than be, have and do (20-39%)

will+inf [THE FUTURE SIMPLE TENSE] (40-56%)

present forms of be and of verbs other than be, have and do (20-39%)

past forms of be (10-19%) past perfect (5-9%) present perfect forms (3-4%) would+inf (3-4%) past progressive (34%) past forms [THE PAST SIMPLE TENSE] of verbs other than be, have and do (20-39%)

shall+inf (10-19%) be going to form (5-8%) present progressive (3-4%)

past forms of verbs other than be, have and do (5-9%)

will+inf [THE FUTURE SIMPLE TENSE] (40-56%)

present passive (2039%) present forms of be (20-39%)

past forms of be (10-19%) present perfect (1019%) past perfect (5-9%) would+inf (5-9%) past progressive (34%) past forms [THE PAST SIMPLE TENSE] of verbs other than be, have and do (40-56%)

shall+inf (3-4%) present forms of verbs other than be, have and do (3-4%)

present forms of verbs other than be, have and do (10-19%)

will+inf [THE FUTURE SIMPLE TENSE] (40-56%)

present forms of be (20-39%)

past forms of be (10-19%) present perfect (5-9%) past perfect (5-9%) would+inf (5-9%) past progressive (34%)

shall+inf (5-9%) be going to form (59%) present forms of verbs other than be, have and do (3-4%)

present forms of verbs other than be, have and do (20-39%) present passive (1019%)

fiction

present forms [THE PRESENT SIMPLE TENSE] of verbs other than be, have and do (40-56%) present forms of be (20-39%) present progressive (5-9%)

expository prose

present forms [THE PRESENT SIMPLE TENSE] of verbs other than be, have and do (40-56%) present forms of be (20-39%) present progressive (3-4%)

aggregate data (all three discourse types)

present forms [THE PRESENT SIMPLE TENSE] of verbs other than be, have and do (40-56%) present forms of be (20-39%) present progressive (5-9%)

290

Chapter Four

The main observations that can be made based on the above presentation are, as follows: – the simple tenses – present, past and future – and the present form of be are the prototypical verb patterns for all three types of time orientation, respectively. This observation is of great pedagogical importance as, traditionally, the socalled present tenses – the present simple and the present continuous/progressive – as well as respective past tenses and different ways of expressing futurity are taught together and given equal attention, as if their communicative weight was at least comparable. This is obviously not the case when we compare the aggregated percentages: the present simple (40-56%) and the present continuous (5-9%); the past simple (40-56%) and the past progressive (3-4%); and the future simple (40-56%), be going to (5-9%), the present simple (3-4%) and the present continuous (not listed). That is why, when communicative expediency is considered – which will be the case at lower levels of language proficiency, the more prototypical tenses should be given more pedagogical attention, in terms of both instruction and exposure to meaningful contexts containing the forms; – at the same time, though, the prototypicality/non-prototypicality distinction has another consequence, this time at higher levels of language proficiency, when the desire to express subtler differences in meaning grows. In such a case, the more prototypical tenses – whose frequent occurrence in discourse as well as communicative expediency guarantee that their meaning will have been learned at lower levels as a result of exposure, instruction or both – will need less attention than the more peripheral verb patterns; – quite surprisingly, there are no major differences between the three different types of discourse and the variation refers almost solely to the secondary, nonprototypical verb patterns: 1) for the present time orientation: the present continuous tense is slightly more common in fiction; 2) for the past time reference: the past perfect tense is a little more frequent in written texts; the present perfect is less used in fiction than it is in spoken conversation and expository prose; and the would+inf pattern is the most common in speech; 3) for the future time orientation: the present continuous tense disappears in expository prose while the popularity of the be going to form decreases between spoken discourse and fiction and vanishes in the category of expository prose; 4) for timelessness: the present forms of be is the most frequent in spoken conversation and scores evenly with the present simple in fiction to finally give priority to the present passive in expository prose; – the perfect tenses do not enjoy great popularity, not even the present perfect which is quite often referred to as “a very English tense”. For past time orientation it is decidedly less frequent than the past simple tense in spoken conversation and expository prose and much less frequent than the past simple in fiction. The past perfect is quite rare in any type of discourse with spoken conversation obviously being the least popular. The future perfect is not listed.

Focus on the semantics…

291

All in all, Section 2 looked at different types of prototypical effects. It started by examining tenses and aspects, their more prototypical and more peripheral functional uses, diagramming them as radial categories as well as – whenever possible – looking at the prototypical use clusters in search of their shared attributes, trying to single out underlying similarities of the different senses. Towards the end of the section the perspective was changed from tense/aspect to temporal orientation and the demonstrated prototypicality effects referred to more prototypical and more peripheral structural uses, indicating which verb patterns are typically used with each time reference. The importance of the latter viewpoint was presented above and will be returned to in the next chapter, in which pedagogical solutions related to teaching grammar are offered. In turn, the prototypicality effects connected with the individual verb patterns, together with the similarities pinpointed across the clusters, will be revisited in the next section, in which such primary semantic characteristics are seen as the basis for different temporal and aspectual profiles.

3. Perspectivisation of the English tense and aspect Although Comrie (1985) as well as Taylor (2003) – see Sections 1 and 2 – rule out distance as an overall uniting concept for past tenses, it has to be admitted that the metaphor of physical closeness and remoteness is quite attractive in underlining the contrast between the category of the present tense on the one hand and past and future tense categories on the other. It is both very convenient as well as well justified to imagine the difference between the present and the other two tenses in terms of the conceptual distance between the user and the described situation which will be shorter for the former and longer in the case of the latter. Quite obvious for all of the prototypical uses of the tenses, the distance effect is also clearly seen in the peripheral uses, resulting in what Declerck (1991: 24) calls a “shift of temporal perspective”, a perceivable, meaningful change in how the speaker/writer chooses to conceptualise the situation. For example, the temporal perspective will change and affect the meaning if the present tense is used to describe past events: the impression of a shorter (emotional) distance between the described situation and the speaker will imply his/her greater involvement in the course of events and, consequently, increase the liveliness of the narrative. Similarly, the present tense used for future events will – by shortening the speaker-situation distance – denote a higher certainty of the things that will happen based on the indication of the usual course of such events or the fact that the preparations are in progress. Another example is the past tense used in present-time contexts to indicate the counterfactuality of what is described or for greater politeness, both meanings motivated by the remoteness, not temporal but conceptual or social, between the speaker and the situation or the speaker and his/her interlocutor. Finally, in the case of the future tense in its present habitual or general

292

Chapter Four

truth use, the sense of modality underlying the will+inf verb pattern will trigger the effect of tentativeness or narrative elaboration both found in formal contexts which, like the polite uses of the past, often result from the social or emotional distance the speaker/writer wants to imply. The issues to be examined in this section are all related to the above notion of distance and other related concepts, all gathered under the umbrella term of perspective, understood in terms of all kinds of figure/ground alignments. For tense and aspect, we will be interested in processual profiles, in which the figure standing out from the background will include the main participant of the process, the process itself and the progression through time. Chapter 3 presented such a profile for The trees are getting green, singling out the trees, the colour and the relevant fragment of the timeline to indicate what Langacker (1987: 244) calls “the temporal evolution of the situation”. As shown in the diagramme accompanying the trees example, the relationship between the individual elements of the scene in such a profile is scanned sequentially. The above statement, with particular regard to evolution as well as the sequential scanning of such evolution suggests that every process implies some kind of change. In this section we will look at this problem, showing that the change occurs in the case of events and not states unless the latter are re-profiled. We will also examine the internal structure of both types of profiles based on simple totalities (simple tenses) as opposed to complex partialities29 (progressive tenses); we will also examine the issue of (un)boundedness, total for (im)perfectives or partial for the perfect aspect. The present section will refer all of the above questions back to Langacker’s description of perspectivisation, (Langacker 1987; cf. Chapter 3), together with a plethora of phenomena discussed in relation to it. They are going to be revisited in the following way: first of all, the deictic character of tense will be re-examined and related to the notion of viewpoint with special regard to optimality/egocentricity of viewing arrangements on the one hand and, on the other, to reference-point constructions; secondly, the idea of profiling and reprofiling will be described and temporal and aspectual profiles for different tenses in various aspects will be presented. 3.1. Tense as deixis: the present tense hoax The deictic character of tense was discussed in detail in Section 1. That is why, the present section will only briefly return to the theoretical preliminaries presented there, to look more thoroughly at some interesting or controversial ––––––––– 29

I use this term to refer to the progressive aspect by disanalogy to Lewis’s (1986) simple totalities, used in relation to simple tenses.

Focus on the semantics…

293

issues related to the topic, such as the problem of the present tense in the area of the coincidence between speaking and event time as well as implicit grounding in the scope of predication in relation to tense as a reference point construction. As was described before, tense locates events on a timeline by means of referring them to other points/cycles on the same timeline. If the reference point is the present (moment of speaking), the tense in question is called absolute; if there is another reference point in the past or in the future, the tense is relative. The absolute tenses will include the present, past and future tenses as well as the present perfect tense; the relative tenses will be the two remaining perfects – past and future. All this has already been described and diagrammed in Section 1 of the present chapter, following Comrie (1985) and the much earlier Reichenbach (1947). What has also been mentioned is that the past tense locates events prior to the time of speaking, the future tense – after the time of speaking and the present tense at the time of speaking. A similar linear from-left-to-right location of the past, the present and the future is proposed by Langacker (1991: 242-244) in his basic epistemic model. In Langacker’s model, however, the three are arranged in reference to different types of reality: known reality and irreality. Known reality accommodates past and present situations with the latter confined to a special area of immediate reality. Irreality, in turn, is divided into unknown present and past reality as well as present, past and future irreality. While the temporal location of events as prior or subsequent to the time of speaking or belonging to reality or irreality seems uncontroversial, the coincidence between present event(s) and present time of speaking as well as the concept of immediate reality pose a problem. This problem is related to the lack of an actual coincidence between speaking time (ST) and event time (ET) in the present tense utterances about the extended present/timeless situations as well as when the present tense is used for past/future time reference. Langacker (2000) offers an interesting solution of the present event/present reference time hoax. It is quoted below for its educational value as well as because it introduces the concept of a bounded/unbounded process, which will be of importance to the later discussion of aspect. In relation to the problem of coherence between the time of speaking (ST) and the time of the present event (ET)30, Langacker departs from an observation that the coincidence between the two is uncontroversial for the so-called performatives or declarative verbs. In sentences like I hereby name this ship ––––––––– 30

Langacker’s notation of ET and ST is adopted with the acknowledgement that both times are subject to the ontological/epistemic distinction with their various dimensions discussed in Section 2 in relation to the present tense meaning categorization proposed by Turewicz (2000). In the course of the present argument, for the sake of its clarity, this distinction will be ignored.

294

Chapter Four

Queen Anne or I promise to be good the user actually performs the action denoted by the verb by uttering this verb. The situation can be diagrammed in the following way31 (Figure 27): Figure 27. ET and ST coincidence in performatives (adapted from Langacker 2000) ET

t ST The source of the problem lies in the acknowledgement made in the previous section: the present tense with the use of performatives, which are perfective by nature, is not a prototypical mapping between the present form and its meaning. In most cases, the described event will be imperfective and its time reference – extended present or timelessness. This is exactly where the problem arises, because all extended present events – not to mention those that are timeless, which are, by definition, unrestricted in time – are much longer than the time needed to describe them. As a result, it is impossible for ET and ST to coincide, as depicted in Figure 2832: Figure 28. ET and ST discrepancy in timelessness/extended present (adapted from Langacker 2000) ET

t ST ––––––––– 31 32

The two time points are located on the same timeline (t); they are diagrammed separately for the clarity of presentation. Speaking time is delineated as punctual in spite of the fact that Langacker himself (1991: 243) argues against it, postulating that: “[i]t is linguistically significant that an act of speech is not punctual but has a brief temporal duration”. The point here is that the concept of punctuality as utilised in the present section refers to the conciseness of the event and not its duration (cf. Comrie 1976 on the perfective/imperfective difference). Please note that the punctual representation of ST and ET is used in this section only, for the reason specified above. In other places – for example in the case of temporal profiles of grams presented in Section 3.5 – the Langackerian bar (and not punctual) representation of both times is returned to.

295

Focus on the semantics…

In fact, however, it is possible to make the two times – ET and ST – coincide. As proposed by Langacker, what we need to do first is conceptualise the unboundedness of imperfective situations in terms of uncountability of substances. If we assume that any substance will be represented by a portion of this substance, as a result of which a drop of water is still water, we have to agree that what co-occurs with the moment of speaking is “a temporally coincident sample of the overall [unbounded] situation” (Langacker 2001: 24). As a result, as Turewicz points out, any phase of a process so construed qualifies as an adequate representation of the process”, which – even if internally varied or “factively heterogenuos” – is “effectively homogenous” (Turewicz 2000: 83), consistent in how it is perceived at the moment of speaking. Consequently, depending on whether the situation in question is a state (A) or a habitual/iterative activity (B), the correspondence between the time of speech and the time of event will be construed as diagrammed in Figure 29. Figure 29. Correspondence between ET and ST in unbouded states/events conceptualised as uncountables (adapted from Langacker 2000: 24) (A) state

(B) habitual/iterative event

ET

t

ET

t ST

ST

When it comes to the problem of temporal coincidence between the present moment of speaking and the future/past time reference of the present tense, the problem is, according to Langacker (2001), slightly different from the one presented above. In the case of the present tense used for planned future events – They open on Monday – or for events described in the historic present – It’s Friday night, I open the door, come in and I see ... – we are dealing with perfective, punctual situations. Consequently, the differing length of ST and ET as in future/past time reference is not an issue, the hoax consisting in the lack of synchronisation between ST (present) and ET (future or past). However, as Langacker observes, in both cases the reference is actually not between the present time of speaking and the future/past event but rather between the present time of speaking and a presently construed virtual schedule (2001: 31) of the future event or an imagined script (2001: 32) of the past event. The two can be presented schematically in a way, in which ET’ stands for the event time which is part of the virtual schedule or the imagined script (Figure 30):

296

Chapter Four

Figure 30. ET/ST coincidence in future and past events described by the present tense (adapted from Langacker 2000: 31-32) present tense for future plans

the historic present

ET’

ET’

t

t ST

ET

ET

ST

In this case, as Langacker puts it in a much earlier publication, we will be dealing with a kind of “radical mental transfer pertaining to the deictic centre” as the speaker “decouples the deictic centre from the here-and-now of the actual speech event and shifts it to another location” (Langacker 1991: 266). A similar schema/script is activated for the present time used in sport commentaries – the speaker is usually referring to their stereotyped script of the sporting event in question and the present tense forms correspond with the stereotyped punctual events in this script. 3.2. Simple totalities vs. complex partialities: the optimality/egocentricity issue What is important to emphasise is the now-extended application of the word punctual that has been used throughout the previous subsection. The concept of punctuality – which the present work formerly applied, following Comrie (1976) – to the perfective aspect alone, is now employed to describe all present tense uses, the perfective ones – like performatives, future plans or the historic present – as well as the imperfective ones. In the case of the latter, such an application of punctuality is possible based on the assumption (adopted from Langacker 2001) that a sample of an unbounded situation is bounded just as a portion of an uncountable substance becomes countable. This, in spite of the polysemy of the present tense, allows us to conceptualise all its uses – in terms of what Lewis calls primary semantic characteristics – as “single, simple entities, unities, totalities” (Lewis 1986: 66). Such punctuality will be the defining characteristic of other simple tenses – past and future – whether they are used to refer to perfective situations or samples of unbounded, habitual/iterative or stative ones. Progressives, in turn, regardless of their time reference and the currently used meaning will be seen as opposite – complex entities, partial rather than total. A similar distinction between the non-progressive and the progressive is proposed in Radden and Dirven (2007), where these two aspects are described as providing maximal and restricted viewing frames, respectively.

297

Focus on the semantics…

To understand this totality/complexity vs. partiality, or the maximal vs. restricted viewing frames, we will look at the differences between the progressive vis à vis the perfective on the one hand and the imperfective on the other, based on the three basic aspectual profiles proposed by Langacker (2001: 12): Figure 31. Totality vs. partiality in perfective, imperfective and progressive aspects (adapted from Langacker 2001: 12) perfective

imperfective

progressive

t

t

t

There are a number of observations that can be made in relation to the three diagrammes in Figure 31. Firstly, the progressive profile – like the imperfective – is unbounded. However, this unboundedness can be seen only within a certain region in whose boundaries the progressive is presented in a zoom-in fashion; it is also quite clear that such a narrowing down of the focus applies to a process which was originally perfective, as indicated by the boundedness of the underlying profile. In this way the representation indicates an important fact: that the progressive applies only to perfective processes, being itself a process of imperfectivisation; in the case of processes that are already imperfective, the effect, according to Langacker (2001) is vacuous. Whether or not such a statement is fully justified will be addressed in section 3.5 which is devoted to reprofiling. At this time, we are going to concentrate on the nature of the difference between the two imperfective processes: the one originally imperfective and the one imperfectivised. The distinguishing characteristic, as Langacker observes (2001: 12), is that “a progressive expression creates this imperfective process by selectively attending to the interior (emphasis mine) of an overall occurrence”. In another publication, following Goldsmith and Woisetschlaeger (1982 cited in Langacker 2002: 94), Langacker explains the difference between the present simple and continuous tenses in terms of, respectively, structural and phenomenal knowledge. The former, as he puts it implies portraying something as part of the world’s structure and, consequently, durable “until the world itself changes” (Langacker 2002: 94); the latter, in turn, is observed from a here-and-now perspective, a perspective which is delimited, internal to the speaker and their senses and, as a result, phenomenal. The concept of looking at the process phenomenally, and consequently from a personal immediate and delimited perspective, brings out the notion of the internal/external viewpoint with an egocentric as opposed to an optimal viewing arrangement discussed in relation to perspective in Chapter 3. Adapting

298

Chapter Four

Langacker’s diagramme (1987: 129), we can say that simple tenses (ST) will represent the optimal relationship between Self (S) / Other (O) as depicted in Figure 32A, while the progressive tenses (PT) will be conceptualised in terms of the egocentric arrangement (Figure 32B). Figure 32. Optimal and egocentric perspectives of simple and progressive tenses (informed by Langacker 1987: 129) (A)

(B)

S

O (ST)

S

O (PT)

In the sense implied by the diagrammes (Figure 32), the use of the progressive will denote the objectification of the speaker – the fact that the speaker perceives themselves as part of the described situation. Such interiority of perspective will be the primary defining characteristic of all of the senses of the polysemous progressive: the prototypical uses, like incompleteness, temporarity and iteration, all of which imply being in the middle of the activity; and the peripheral uses, in whose case volition, emotion and emphasis all manifest the affective engagement of the speaker. Both being in the middle as well as affective engagement – which, by the way, also implies being at the very heart of the matter – will be part of the “process by selectively attending to the interior of an overall occurrence”, which leads to the conceptualisation of the perceived situation which is only partial – as it is impossible to get an overall perspective from the inside – but quite complex – as, owing to this proximity, all of the details of the situation impose themselves on the conceptualiser. As a result, we can speak of the progressives in terms of complex partialities, as suggested in the heading. 3.3. What the perfects have in common Speaking of the primary semantic characteristics of different aspects, we identified simple totalities for the simple tenses and complex partialities for the progressive tenses. In the case of perfect tenses, it seems justifiable to derive the common underlying characteristic from the meaning of the verb have, which has been grammaticalised as part of the verb pattern. Departing from a locution schematised as X has Y, which denotes a relation of direct physical control and taking it through several attenuation stages, Langacker (2002) arrives at the abstract interpretation of the schema as denoting any kind of control or access. As such, X has Y is represented by an atemporal relational

299

Focus on the semantics…

profile typical of modified NPs such as X with Y. If, by analogy, we consider locutions like X has done Y and construe them, as Langacker (2002) proposes, in terms of such atemporal profiles, we will receive a denotation of X with a certain result (of an action applied to Y). In both types of predications – involving have as a verb of possession as well as grammaticalised into the perfect auxiliary – the X will function as a reference point and the trajector of the relational profile in which there is a mutual relationship of the trajector controlling or accessing the landmark, and the landmark being of importance to the trajector. As part of the afore-mentioned attenuation process, the actual relationship will – conceptually – give way to the sheer potential for such a relationship. The atemporal profile can be depicted as demonstrated in Figure 33A (based on Langacker 2002: 327-334). It is crucial to point out that the potential for relationship is depicted by the solid-line arrow and the less important actual relationship – by means of the dashed arrows; it is also of consequence that the auxiliary have landmark will have its own internal profile as an event involving the very trajector of profile A as its trajector (Figure 33B). Figure 33. The atemporal (A) and auxilary (B) profiles of have (adapted from Langacker 2002) A

B

lm tr

lm

tr

According to Langacker (2002), further grammaticalisation of have results in the subjectification of X, who/which ceases to be part of the relational profile and is replaced by a reference point coinciding with the time of speaking, whose role is to indicate the temporal priority of the landmark event, a path which will not be pursued as what is of relevance to the present line of reasoning is, in fact, the attenuated, yet still implicitly present notion of X with a result (of a temporarily prior event) or the prior event being of present/extended present/timeless relevance). The echo of the grammaticalised possessive have will be treated as the underlying semantic characteristic of perfect tenses, including resultative and completive as well as non-resultative or experiential. The former will denote having a certain kind of result of a past activity, like a product, achievement or accomplishment; the latter, in turn, will refer to having a certain experience resulting from our participation in past situations. In terms of perspective, the present – as well as past or future – form of have will imply the present, past or future Base, Viewpoint and Focus on a temporarily earlier state or event.

300

Chapter Four

3.4. Tense as a reference point construction: the perspective of mental space architecture Another interesting issue related to perspectivisation which seems to deserve special attention is the role of tense in the implicit grounding in the scope of predication, which, as it is going to be demonstrated in this section, is directly related to Fauconnier’s (1997) discourse perspective with its Base (B), Viewpoint (V) and Focus (F). First of all, however, we need to revisit, following Langacker (2000), a question that has already been addressed on a number of occasions in the present work: that tense and aspect are implicit deictics. This is because, as indicated before, they are used to trigger some reference points/grounding elements in the scope of predication without explicitly naming them. In other words, the periphrastic some time before NOW is an overt mention of the reference point which is the present time of speaking (NOW), while the –ed suffix added to the verb to form the past tense does the same thing but in a covert manner. This is why Langacker calls tense and aspect “grounding predications” (2000: 220), and conceptualises them as off-stage anchors of the processes occurring on the stage of the immediate scope of predication vis à vis objective descriptions, in which the phrase some time before now is present onstage as the landmark (Figure 34). Figure 34. Grounding in the scope of predication: objective description vs. temporal grounding (informed by Langacker 2000: 221) objective description (G=the periphrastic before NOW)

lm G

grounding predication (G=the past tense)

tr G

Tense and aspect can also be covert anchors in a slightly different way, which will be explained with reference to the past perfect tense, based on the implicit grounding in prepositional phrases. This phenomenon is described by Langacker (1987, among others), who shows how these phrases, which explicitly predicate two participant entities, are used in contexts from which a tripartite relationship can be inferred. As was mentioned before, an utterance such as There is a car [C] in front of the church [CH] presupposes the observer [O] aligned with the two remaining entities in the O-C-CH fashion. A similar inference and resulting

301

Focus on the semantics…

alignment can be observed in the use of relative tenses, such as the past perfect. A sentence like He actually had been a good man indicates cognitive reference between the now and the pre-past, in which there is a presupposition of the past reference point. This can be construed in the following way (Figure 35): Figure 35. Temporal implicit grounding in He actually had been a good man (informed by Langacker 2000: 174)

D

T R

C

C = conceptualiser (speaker or hearer) R = reference point; a situation in the past (the broken line is used to indicate implicitness) T = target: Smb actually being a good man D = the past dominion = mental path

What the diagramme (Figure 35) shows is that the speaker, using the past perfect tense, implies that the subject of this sentence was a good person before a certain event in the past. Triggered by such a verb pattern is mental space architecture A (Figure 36), which substantially differs from what could have been motivated by a space builder in the form of an absolute past tense in a sentence like He actually was a good man (mental space architecture B; Figure 36). Figure 36. Mental space architecture for He actually had been a good man (A) vs. He actually was a good man (B). (A)

(B)

B

B/V

the speaker’s reality

the speaker’s reality

V

F/E smb being good

F/E smb being good

302

Chapter Four

As can be seen in the two diagrammes (Figure 36), in both cases the Focus/Event (F/E) space remains the same. What changes is the Viewpoint (V) – the moment of speaking for sentence B and an empty space for sentence A. Sentences like A – He actually had been a good man – are a manifestation of a very interesting discoursal phenomenon. The fact that the event in the past which provides the reference point is not mentioned – the Viewpoint mental space is left empty – can be explained in two ways: the interlocutors share the knowledge about this event and the space is filled in by default; or the hearer does not share the speaker’s knowledge about the event in question and the speaker leaves the space empty to elicit a relevant question from the hearer and, in this way, focus the hearer’s attention on the event. The sentence He actually had been a good man will in the latter case become a pretext to talk about the very event that caused the change in the subject of the sentence. Whatever the motivation, the difference between architectures A and B is intentional and is of consequence to the interpretation of discourse. The past perfect tense in Figure 36 was used as an example of a more general phenomenon already introduced in Chapter 3, the one of distance in mental space configurations. This is because the general function of tense, as Fauconnier (1997: 93) sees it, “is to mark local distance between mental spaces along certain dimensions”. These “certain dimensions” primarily include, as Fauconnier (1997) observes, the temporal domain, which is the deictic distance theorised by Reichenbach (1947) and Comrie (1985) presented in Section 1. Another dimension along which the distance will be measured is the physical remoteness as situations occurring not now are frequently the ones that are removed spatially as well as temporarily. Finally, the distance that is equally important is the one Fauconnier (1997) calls epistemic, which is the case in all hypothetical uses like the unreal past, tentative uses as in the future tense or in the case of social and emotional distance which were mentioned at the beginning of Section 3. The above-mentioned three dimensions of distance – temporal, spatial or epistemic – are established by means of tense between the four discourse primitives introduced and described in Chapter 3 as well as utilised above in the discussion of the past perfect example: Base, Viewpoint and Focus as well as Event. Normally, as we progress through a discourse encounter, the four primitives will shift quite dynamically from one node in the network to another, conflated (all of the initial three/two of them) in one space or spread through a chain of consecutive spaces. As a result, every discourse encounter will, to a considerable extent, be a unique event with its own, unprecedented mental space architecture. On the other hand, though, just as language production involves an amalgamation of structures, the construction of mental architecture will utilise space builders (listed and described in Chapter 3) which, by establishing a certain space S, simultaneously generate a bigger chunk of the space architecture – an entire configuration including space S as well as its neighbouring spaces. Such

303

Focus on the semantics…

configurations may be usage-based or schematised. Curter (1994 cited in Fauconnier 1997: 73-77) proposes such combines for spaces that are tense-built, calling them “universal tense-aspect categories”. In the case of the three basic semantic tense types – as well as the perfect aspect – the configurations will be as presented in Table 9. (based on Curter 1994): Table 9. Tense/aspect space configurations (based on Curter 1994) PRESENT S S or S’s parent:

not prior to Viewpoint/Base; fact

PAST FUTURE applied to space S means that: S S is in Focus; S’s parent: S’s parent: is Viewpoint; S’s time frame is: prior to Viewpoint; posterior to Viewpoint; events or properties represented in S are: fact prediction

PERFECT S’s parent S’s parent

prior to Viewpoint;

fact

The Base in all three cases, unless shifted as a result of a discourse twist – like a story within a story or the use of a historic present – is the moment of speaking. In Table 9, the three basic semantic tense categories have been described together with the perfect aspect, because all four, as is shown in the table, are inter-space builders, affecting the mental architecture beyond a single node. The other aspects – perfective, imperfective and imperfective/progressive – in combination with the tenses will be in charge of intra-space configurations as demonstrated in the following section. 3.5. Temporal and aspectual profiles; the notion of re-profiling Taking into consideration the temporal profiles of individual usage events we have to be aware – as indicated in Section 1.2.3, where the interplay between the lexical and the grammatical aspect was described – that we are actually talking about simultaneous profiling in two different but interacting domains: the structural domain of the verb pattern as well as the lexical domain of the verb actually used in this pattern. In the case of verb patterns, we will not be talking about temporal profiles of tenses or aspects but rather about temporal profiles of grams (cf. Bybee et al. 1994), each of which is a combination of certain temporal and aspectual properties. Considering the fact that there are three basic tenses – present, past and future – as well as four basic aspects – perfective, imperfective, imperfective/ progressive and perfect – the twelve possible combinations will be as shown in Figure 37.

304

Chapter Four

Figure 37. Temporal profiles of grammatical tenses (=tense-aspect combines) (adapted from Langacker 2000: 224-229 and Langacker 2002: 330-340)33: present perfective

t

past perfective

t

present imperfective

t

past imperfective

t

present progressive

t

past progressive

t

present perfect

t

past perfect

t

future perfective

t

future imperfective

t

future progressive

t

future perfect

t

The perfect progressive tenses – for example the present perfect progressive – can be profiled as in Figure 38A; profiles of past and future perfect progressives will be similar combinations of the relevant progressive and perfect profiles. In turn, when it comes to the be going to form, Langacker (2002: 331) profiles it as ––––––––– 33

The diagrammes for present and past tenses are adapted from Langacker; future tenses are profiled by analogy

305

Focus on the semantics…

presented in Figure 38B, in which the trajector is construed as moving along a certain spatial path at the end of which it initiates some activity (B1); as the verb go is grammaticalised, the subject of the activity becomes subjectified and the profile acquires the quality presented in B2, in which G stands for the grounding predication or the reference point (temporal). Figure 38. Temporal profiles of the present perfect progressive and the be going to form A

B1

B2

present perfect progressive

go

be going to form

lm

lm

t

tr

t

G

tr

t

Like perfect progressives, the profiles of the be going to form amalgamated with the different tenses and aspects will be combinations of the profiles of all the participant structures. As has already been mentioned in the present work, temporal profiling means sequential scanning of a certain change within the scope of predication; it has also been pointed out that the change will be the case in temporal profiles of events while it will not occur in stative profiles. Considering, in turn, the lexical aspect we can state that some verbs will have stative or imperfective profiles while others will naturally be more dynamic and consequently perfective. The expression “more naturally” is used on purpose here because, as Langacker himself (1987: 258) notes “[p]redicting whether a given verb will behave as a perfective or an imperfective is ... not always possible given its basic semantic value alone”. This is because each of the verbs will, as mentioned at the beginning of the present section, interact and be influenced by the temporal profile of the verb pattern in which it is used. We can obviously say that – more naturally, again – stative verbs will appear in relation with imperfective grams, as demonstrated below (Langacker 1987: 255): J.P resembles his father. The Smiths have a lovely home in the country. Your sister believes that the earth is flat. An empty moat surrounds the dilapidated castle. This roads winds through the mountains.

306

Chapter Four

If, however, we want to imply that the inherently stative process is subject to some kind of evolution, the stative verbs from the examples above “receive a perfective construal” (Langacker 1987: 256) and are used in the progressive, as demonstrated below: J.P. is resembling his father more and more every day. The Smiths are having a lovely argument. That guy is spinning your sister a line and she’s believing every word of it. The SWAT team is surrounding the dilapidated castle. The road is winding through the mountains.

We will consider the above in the light of Langacker’s (2001) earlier-quoted statement about the vacuity of applying the progressive – which is a process of imperfectivisation – to processes that are already imperfective. Examples 2 and 4 are less interesting, because we are dealing with polysemous senses of the verbs have (own as opposed to be in the middle of an activity) and surround (stative vs. dynamic sense). The remaining sentences are more challenging, as they actually involve what Langacker sees as improper – further imperfectivisation of the already imperfective process. Upon closer examination of the two groups of sentences, there certainly is more to the introduced formal change than just a temporal zoom in, which differentiates a permanent or long-term situation from a state occurring at the moment of speaking. For such situations there is the stagelevel state lexical aspect (Olsen 1997) which refers to present vis à vis permanent states; the example used in Section 1.2.3 showed the difference between know and know now. In the light of this, we can state that there is a mechanism which perfectivises, if necessary, otherwise imperfective processes. This mechanism is most probably the first step towards subsequent imperfectivisation of such a process by means of the progressive form used to describe the process. This happens – apparently much faster nowadays – and happens for a variety of reasons. Implying evolution, which was the case of sentences 1, 3, and 5 quoted above, is only one of the possible motivations. Others may include supplying the imperfective process with the extra quality of habitualisation, politeness, formality, emotion and all of the other meanings that, after Mindt (2000) were identified, in Section 2. Listed below are examples of such a process: habitualisation: politeness: formalisation: emotion:

I’ve been seeing him around here. I’m just wanting your advice, is all. You will certainly not be needing this any more. I don’t believe what I’m hearing.

which Turewicz (1998) calls re-profiling: formal alterations whose task is to change a certain lexical symbolic unit in a way that enables its use with a seemingly conflicting aspect.

Focus on the semantics…

307

Considering the earlier discussed question of perspective, we can assume that re-profiling will also be utilised in any case requiring an alternative viewpoint, incurring subjective (Langacker 1987) viewing arrangements or, conversely, objectification of the conceptualiser; a shift from totality/punctuality to particularity/processuality or vice versa; or a change from stable to tentative and the other way round. All this is related to subjectivity/subjectivisation in the understanding proposed by Lyons (1994), which, as Scheibman (2001: 61) observes, amounts to “ways in which speakers use language to express their perceptions, feelings and opinions in discourse (=subjectivity) and how such expressive motivations and strategies conventionalise and interact with linguistic structure”. In other words, the discrepancies between grammaticality standards and what is actually confirmed by data-driven research into language use may, to a considerable extent, be motivated by the speakers’ desire to emphasise highly personal meaning distinctions and their own individual perspective on the described situation. The process is what Traugott describes as grammaticisation, a process aimed at rendering meanings which “often become increasingly based in speakers’ attitudes towards what they are saying” (1995 cited in Scheibman 2001: 62; emphasis mine). In turn, by using conventionalised language to describe our perceptions – which cannot be but subjective – we can choose to objectify (in Lyons’, not Langackerian, understanding) our attitudes, as observed by Haiman (1995). All of these highly subjective ways of rendering personal meanings have the potential to enter the intersubjectivity domain in the process of co-construction of meaning and the subsequent popularisation of novel formal solutions and their meanings. This is possible on condition that the innovation is perceived as attractive by other language users. The said attractiveness is usually determined by the overall usefulness of a certain formal innovation and the resulting frequency of its use. In other words, as pointed out by the proponent of emergent grammar (Hopper 1987: 150), “the more useful a construction is, the more it will tend to become structuralised in the sense of achieving cross-textual consistency”. As a result, what initially might have been a slippage between standards and actual use becomes increasingly represented in discourse, because its potential for rendering subtler meanings outweighs its structural oddity. Examples of such changes were given on the previous page. What is also important to point out in relation to the above-described phenomena is the nature of the shift from the regular, unmarked discoursal meaning towards special, sometimes nonce meanings. Uses of this kind result from the overlapping of two seemingly incongruent language units such as, for example, an inherently imperfective, stative verb and the progressive verb pattern. As a result of this incongruity, the extra-quality of the speaker’s individual meaning, mentioned in relation to the examples given above, is added by way of actually contaminating the imperfectivity of the verb by the progressiveness of the

308

Chapter Four

verb pattern. As a result, the said verb – and other similar verbs – enrich their semantics by becoming “more ready” for other uses with the progressive aspect. Such contamination or semantic enrichment is part of the process involved in dynamic online meaning construction, already described and called conceptual blending.

4. The CIT perspective on the semantics of English temporal expressions The above-mentioned conceptual blending potentially occurring at the intersection of forms carrying lexical and grammatical aspect is one manifestation of the dynamics of online meaning construction. The other such phenomenon is, as described in Chapter 3, frame shifting. Whether the potential is actually realised at the level of the conceptualiser, and if there indeed is the frame-shifting effect evoked by tense/aspect were the foci of a study carried out in the years 2006-2007 (Turula 2007). The study was based on verbal reports obtained from 4 native speakers of English – two BE and two AE speakers – while they were doing a grammar test. Each test unit consisted of two subsections – the first part was a fragment of an Anglophone film dialogue in which all verb forms were bracketed infinitives; the second part contained the same dialogue in its original form, as presented below in the sample test unit (Figure 39): Figure 39. Sample test unit Man and woman having lunch together Man: “What [1] _____ you ______ (do)?” Woman: “What [2] _____ you ______ (care)?” Man: “I [3] ______ (not). I just [4] ____________________ (make) polite conversation.” Man and woman having lunch together Man: “What do you do?” Woman: “What do you care?” Man: “I don’t. I was just making polite conversation.” (Wolf)

The task each testee had to complete was twofold: first they put the bracketed infinitives into tense/aspect forms that were, in their opinion, the most suitable for the given context as well as gave the rationale for their choices; then, if the choices differed from the original, the testees were shown the actual film dialogue and asked to comment on the differences. A number of observations were made on the basis of the obtained data (Turula 2007):

Focus on the semantics…

309

– The rationale given by NS for their formal choices was meaning related in two ways: a) the implicature/conceptual meaning of the utterance was given: Two ‘amateur robbers’ have almost been caught by the police. One of them says ‘That was way too close. We [1] ______________________ (not do) that again!’ Two ‘amateur robbers’ have almost been caught by the police. One of them says ‘That was way too close. We are not doing that again!’ (The Incredibles) Interviewer: Why not future simple? NS: This is confidently expected. It’s not a prediction or promise.

and/or b) an event-based frame or script was provided: The family being in a really tight spot, the children cry to their mother: ‘What __________we [1] _________________ (do)?’ Mother (very firmly): ‘We [2]___________ (not panic). We [3] ___________ (not die). I [4] _____________ (swear) I [5] __________(get) us out of this.’ The family being in a real tight spot, the children cry to their mother: ‘What are we going to do?’ Mother (firmly): ‘We are not going to panic. We are not going to die. I swear I’m going to get us out of this.’ (The Incredibles) Interviewer: Why be going to? NS: That’s what you say when you’re in danger.

– If the grammatical forms chosen by the native speaker differed from the original, frame shifting – abandoning the old understanding and adopting a new interpretation of the event – could be observed while the testee commented on the second part of the test unit: Man and woman having lunch together Man: “What [1] _____ you ______ (do)?” Woman: “What [2] _____ you ______ (care)?” Man: “I [3] ______ (not). I just [4] ____________________ (make) polite conversation.” Man and woman having lunch together Man: “What do you do?” Woman: “What do you care?” Man: “I don’t. I was just making polite conversation.” (Wolf) NS: [blank 4] ‘I’m just making polite conversation’. He’s trying to be nice. Interviewer (shows the original) NS: ‘I was just making polite conversation’. Well, he’s stopped being nice. The talk is over. With ‘I’m making’ she still had a chance, now – not any more.

310

Chapter Four

– Conceptual blending phenomena could be noticed in the rationale given for grammatical choices. If a given tense/aspect form was polysemous, it was the context that activates a suitable semantic zone within the whole entity, e.g. the be going to form: Neighbour to neighbour (angrily): “Get out immediately or there’s gonna be trouble. (As Good As It Gets) A man who’s about to lose his job to his faithful workmates: Man: “I forbid you to take any action if I’m fired.” One workmate: “I’m not sure if I’m going to be here after the takeover.” (Wolf)

NS [about situation 1]: the meaning of be going to – refers to the official, predictable course of action Interviewer: Not his intentions? NS9: No, he wants to sound official NS [about situation 2]: the meaning of be going to refers to the speaker’s intentions. Interviewer: Not the official, predictable course of action? NS: No, he’s not sure what his intentions are.

This conceptual blending process presented above can be explained based on Sweetser’s (1999) conceptual blending model discussed in Chapter 3 (Figure 40): Figure 40. Conceptual blending in temporal expressions by analogy to Sweetser’s model Conceptual blending - zone activation (based on Sweetser 1999) I’m not sure…

I’m not sure…

WILL

BE GOING TO

or else!

or else!

LIPS

HAIR

RED

BALL

APPLE

On the basis of the three above-mentioned observations, the following claims can be made (Turula 2007: 364-365): – The rationale offered by native speakers is predominantly semantic, which once more confirms that grammar structures (tense and aspect in the present study) are form-meaning pairings and should be taught as such, especially at higher levels of language proficiency.

Focus on the semantics…

311

– As was demonstrated by samples 1 and 2, the meaning attached to the forms in question is both conceptual (Yule and Jackendoff) / dictionary (Langacker) as well as associative (Yule) / encyclopaedic (Langacker) / spatial (Jackendoff). – The rationale behind the conceptual meaning is based on implicature, referring to “what the speaker wants to say by the particular grammar form”. In other words, meaning amounts to what one wants to mean. – The associative/encyclopaedic/spatial meaning of grammatical forms is based on “dynamic wholes” (Lewandowska-Tomaszczyk 1987: 45) or “continuities between language and experience” (Petruck 1996: 1) such as event frames and scripts, mental spaces and individual experience-induced cultural models which are the basic constructs of the Conceptual Integration Theory. – Online processing of grammatical forms is based on such phenomena as frame shifting and conceptual blending. As seen in the examples given above, a change of tense/aspect results in the re-evaluation of the entire situation as a result of which old understanding is abandoned and new meaning is noticed. In turn, semantic zone activation in polysemous structures like the be going to form and the future simple tense is motivated by what can be called space coercion: certain space building structures – like I’m not sure or or else! – profile one of the potential meanings for the particular context (Figure 40). In conclusion, focus on FORM in the area of tense and aspect, with special regard to the semantic pole, can be based on the following specification of the meaning of individual temporal grams. First of all, all grams can be analysed on both levels of Jackendoffian semantics: the conceptual and the spatial/experiential. The former will be represented by primary semantic characteristics including: – for simple tenses – simple single entities, totalities conceptualised punctually, either as perfective processes or as portions/samples of imperfective processes; – for progressive tenses – complex partialities with interiority, being in the middle of the activity, as an objectified, egocentric perspective; and finally – for perfect tenses – the underlying meaning of the lexical verb have implying the access the subject has to the landmark of the relational profile or the relevance the landmark has to the subject. The spatial/experiential level, on the other hand, will be of consequence in two ways: in terms of the different usage-based meanings of different grams and the different prototype effects. When it comes to the meanings, they will include: – present (extended, actual and generic/timeless) reference; future reference in the case of definite arrangements; past reference in historic present uses; and hypothetical use in if-conditionals for the present simple tense;

312

Chapter Four

– past reference including historical and fictional narratives; if-conditional and unreal past counterfactuals; and pragmatic softeners for the past simple tense; – certainty/prediction, volition/intention, possibility/high probability as well as habit and inference/deduction for the future simple tense; – incompletion; temporariness; iteration/habit; highlighting/prominence; volition; as well as matter-of-course, emotion and politeness for the progressive tenses; – indefinite past resultative and non-resultative; continuative past; recent past and completion for the perfect tenses, and the meaning of definite pre-past for the past perfect tense. The prototypical effects in the area of tense and aspect use as well as tense/aspect amalgamation will include: – a more frequent use of certain aspects with certain tenses – such as present+imperfective or past+perfective; – a more frequent use of simple rather than progressive tenses – present, past and future – for the relevant time orientation; – some usage-based meanings being more typical than others, such as the incompletion, temporariness and iteration meanings of the progressive being more popular than politeness-related uses. Secondly, tense and aspect are an important element of the mental space architecture. They mark local distance between mental spaces along temporal and epistemic dimensions as well as profile – and, equally importantly, re-profile – inspace content; this was demonstrated in the course of the present chapter by numerous examples and related diagrammes. Finally, tense and aspect generate phenomena related to the dynamics of online meaning processes, motivating processes such as conceptual blending and frame shifting proving that they indeed are part of the holistic-model semantics. In this way we have delineated what should be subject to a pedagogic treatment. Such treatment should consist in FFI instruction whose overall principle will be a combination of focus on form (=form/meaning pairings) with exposure to communicative contexts; FFI whose internal composition will be delineated in Chapter 5 as Organic Approach Deductivised. The studies investigating the effectiveness of OAD are described in Chapter 6.

PART III: FOCUS ON FORM-MEANING CONNECTIONS IN THE ENGLISH TENSE-ASPECT SYSTEM The first two parts of the present book looked at what it means to focus on form and how form per se should be understood vis à vis such pedagogical interventions. In an attempt to find the answers to the above two questions, Part I examined the role of attention, attentional selectivity and different attentional pools, the problem of conscious awareness as well as different forms of instruction, including comprehension-based approaches and different forms of inputenhancement; Part II argued that in the case of advanced learners focusing on form ought to boil down to learning form-meaning connections, the meaning understood broadly as encyclopaedic rather than dictionary, intersubjective to a greater extent than subjective . Interim conclusions that have been drawn on the basis of the said two parts are as follows. First of all, form-focus pedagogical interventions have to combine explicit and implicit learning motivated by both deductive and inductive instruction together with ample exposure to authentic language in communicative contexts so that the learner can equally benefit from intensive/focal and extensive/global attention as well as rely on his/her conscious and unconscious learning mechanisms. What, in turn, should be subject to so delineated instruction is the meaning of different grammatical forms investigated from the perspective of holistic semantic models with regard to prototype effects as well as semantic phenomena such as frame shifting and conceptual blending. The present – and final – part of the book sets out to adapt the above theoretical bases – psychological, pedagogical and linguistic – to teaching grammar to the advanced EFL learner in the area of the English temporal system. In doing so, it will start by looking at the relevant research to date following – and updating – a comprehensive description of studies into tense and aspect formmeaning mappings and their acquisition/learning routes proposed by BardoviHarlig (1999, 2000). The reference to the body of research presented in the said publication has a number of functions. First of all, it helps to contextualise the present research effort as well as provides a point of departure for the research design in two areas. Primarily, Bardovi-Harlig’s (2000) claim that temporal expressions actually form a grammatical subsystem with its own internal architecture will draw our considerations towards a suitable paradigm in which tense-and-aspect pedagogy can be accommodated. It will be argued here that construction grammar, with its form-meaning pairings and inheritance hierarchies accommodating both the rulebased and the analogy-based systems, offers a promising reference point in this area. Secondly, the present work will attempt to make up for the deficiency signalled by Bardovi-Harlig, who notes that “the field [of TAM research] has

314

Part III

nearly accomplished the necessary documentation of early to intermediate stages of acquisition ... there are still gaps in our knowledge of advanced stages in most languages” (1999: 369). Therefore, the focus will be on the proficient language user in the hope of diagnosing his/her knowledge of the semantics of temporal expressions, pinpointing potential difficulties and trying to devise a suitable pedagogical treatment. For this purpose, a detailed procedure behind the solution called Organic Approach Deductivised will be presented. All of this will be included in Chapter 5. Chapter 6 will present three experimental studies carried out to examine the effectiveness of the said approach, their results, a discussion and following conclusions.

CHAPTER FIVE ORGANIC APPROACH DEDUCTIVISED: TOWARDS A RESEARCH DESIGN No research is carried out in vacuum; it is always informed by previous investigations as well as relevant theoretical models. That is why, before presenting one’s own research design, it seems necessary to look at studies to date in terms of both their findings and their investigation models as well as look for appropriate theoretical grounding of the intended research effort. Taking all of this into account, this chapter starts by presenting research into the acquisition and learning of the tense and aspect in a second language (Section 1) in order to accommodate the studies to be described in Chapter 6 within a certain research tradition. Section 2, in turn, is an attempt to endorse the problem of form-focused instruction on the semantics of time talk in English by a linguistic model – construction grammar – whose utility in this respect is documented based on a comparison between this theoretical proposal and models informing both traditional structural approaches to grammar as well as the more consciousness raining-oriented Lexical Approach. Section 3, which concludes the chapter, presents Organic Approach Deductivised: it describes its 3-D pedagogical approach to grammar, whose innovation consists in taking the instruction plus exposure approach of modern form-focused treatments one step further, towards cultural learning of grammar; as well as justifies the treatment vis à vis problems faced by the advanced language learner.

1. Form-meaning mappings in the tense-aspect system. Research to date To quote VanPatten et al. (2004), the tense-aspect system is the most researched area with respect to form-meaning connections (FMCs; to be revisited later in this chapter). This claim is confirmed by the thorough account of studies into the acquisition of temporal expressions – also called time talk (Smith 1980) – in second language contexts offered in Bardovi-Harlig’s (2000). The present section is meant to be a recount of her comprehensive review; it is, however, supplemented in three ways: publications dated after 2000 are cited; in the section referring to Aspect Hypothesis studies, Bardovi-Harlig’s account is complemented by observations made by Li and Shirai (2000), who themselves have published widely in relation to the hypothesis; finally, the instructional stance on tense/aspect, a perspective Bardovi-Harlig disregards, is discussed as well.

316

Chapter Five

1.1. Contextualising the OAD studies Based on her diachronic look at tense-and-aspect research, Bardovi-Harlig (2000) states that originally the acquisition of time talk was subject to incidental studies carried out in the area of the order of morpheme acquisition (VanPatten 1984, VanPatten and Pica 1984 – among others) or phonetic constraints on the said developmental patterns (Wolfram 1985). As she points out, these studies were not concerned with the process of acquisition but its product. A considerable turn in research tendencies could be observed in the 1980s, with the advent of interest in the semantic development of the learner interlanguage, temporal semantics being part of the system. This change of research focus resulted in a shift from studies on the acquisition of morphology to the investigation of morphology as a surface realisation of an underlying semantic system, with three important areas of interest: tense, for locating the event on a timeline; aspect, for expressing one’s view of the situation; and lexical aspect, for the semantics of the predicate. It was then that a considerable body of research into the acquisition of the temporal expressions was carried out on both sides of the Atlantic. The geographic separation resulted in two different research paths (Bardovi-Harlig 2000): in the European tradition temporal semantics was the point of departure for research into the linguistic means learners use to express certain meanings (meaning-oriented research); the American tradition, on the other hand, followed the path of looking at verbal morphology first and then investigating the patterns and their semantics (form-oriented research). Offering examples of research belonging to both traditions, Bardovi-Harlig (2000) quotes a number of meaning-oriented and form-oriented studies. When it comes to the former, she refers to the ample body of research carried out from a number of perspectives: the concept-oriented approach (von Stutterheim and Klein 1987); the semantic approach (Giacalone-Ramat 1992); the functional-grammatical perspective (Skiba and Dittmar 1992); the notional perspective (Berretta 1995); and function-to-form studies (Long and Sato 1984, Sato 1990). These studies were carried out based on the assumption that adult learners have access to a whole range of semantic concepts and look for means of expressing them in the second/foreign language. Consequently, the above-cited researchers looked at different linguistic devices, crossing the boundaries of the tense-aspect system, trying to answer the following questions (Bardovi-Harlig 2000: 23): 1) How do learners express temporality at a given stage of their acquisitional process? 2) How does temporal reference change over time? 3) What are the explanatory factors that can account for the development? As has been proved by numerous meaning-oriented studies (cf. Bardovi-Harlig 2000: 2192), learners go through three different stages of mapping temporal meanings onto linguistic means. Initially they utilise pragmatic means such as scaffolding or relying on the contributions of their interlocutors; or by the chronological ordering

Organic Approach Deductivised…

317

of the events recounted, owing to which the novice learner is able to indicate which situations occurred prior to other situations. The second level of development is marked by the use of lexical time talk, such as the application of temporal adverbials. The final stage is characterised by the emergence of morphology which marks temporal relations. In turn, the form-to-function studies (Long and Sato 1984, Sato 1990, Berretta 1995) followed a form and checked where and how it is used by learners “thus determining what it means in the system (Bardovi-Harlig 2000: 11). As a result, research to date in this area has concentrated on “the distribution of the emergent verbal morphology” (Bardovi-Harlig 2000: 12), the main research paths including the study of acquisitional sequences as well as research aimed at investigating the Aspect (cf. Chapter 4) and Discourse Hypotheses. The present work – with its focus on the tense-aspect system as a morphological means of expressing temporal relations – will follow the path of form-oriented studies carried out in the American tradition. This is why, meaningoriented research will not be described any further and the remaining part of this section will concentrate on clarifying the three above-mentioned types of formoriented studies. 1.2. Form-to-function studies of the acquisition of the temporal system As indicated above, the studies carried out within the American form-first tradition can be divided into three groups: research into the acquisitional sequences as well as investigations of the Aspect and Discourse Hypotheses. Sections 1.1.1, 1.1.2 and 1.1.3 deal with the three groups of studies. In addition, Section 1.1.3 presents Bardovi-Harlig’s synthesis of the aspect and discourse approaches. 1.2.1. The acquisitional sequence studies1 The acquisitional sequence research presented in Bardovi-Harlig (2000: 93-190) contains an array of longitudinal studies into the developmental patterns in the attainment of tense-aspect morphology in a number of languages. Their design, as Bardovi-Harlig observes, distinguishes the said studies from the product-centred order-of-accuracy research mentioned earlier in this section: these studies concentrate on emergence rather than acquisition and, as the very label indicates, ––––––––– 1

The studies presented in this section belong to both the American and the European research traditions as the emergence of verbal morphology has been studied on both sides of the Atlantic (cf. the introductory remarks to this section). As Bardovi-Harlig (2000: 93) puts it, in this area “the meaning-oriented approach and the form-oriented approach afford a better opportunity for understanding the acquisition of tense-aspect than either approach alone”.

318

Chapter Five

on sequences rather than orders. All of this is based on the assumption that verbal morphology features do not emerge as single characteristics of the learner’s interlanguage but rather as nodes in the overall tense-aspect network in which “parts of the system are intrinsically related to the other parts and will have an impact on each other as the system develops” (Bardovi-Harlig 2000: 95). The findings of all of the studies (for design details see Bardovi-Harlig 2000: 97-100) can be summarised by means of the four major principles identified in the course of each of them (Bardovi-Harlig 2000: 111-114): – the development of temporal expression is slow and gradual: learners acquire verbal morphology in stages that take time to complete; – as part of the slow, gradual acquisition, form often precedes function: verbal morphology may emerge without clear meaning; – in the course of system emergence, irregular morphology precedes regular morphology; and – learners avoid discontinuous marking such as aux+V-ing and hypothesise that tense relies on suffix alone. In addition, as Bardovi-Harlig (2000: 186) states in a sum-up to the summary above: – the stages or acquisitional sequences learners go through are remarkably similar across target languages and little first-language influence can be identified, the latter statement being in accordance with the results of a much broader body of studies into acquisitional sequences cited, among others in Ellis (1994). The study into the acquisitional sequences in time talk that is of particular interest to the present work – because of the target language in question – is BardoviHarlig’s own research into the emergence of tense-aspect morphology related to the past tense in English. Consequently, it will be presented in greater detail here. Using the Reichenbachian framework (Reichenbach 1947; cf. Chapter 4) for the past tenses as her point of departure, Bardovi-Harlig defines the objective of her study by stating that “[t]he acquisition of the target-like tense-aspect system by learners of English entails the acquisition of both the morphology ... and its semantic and pragmatic features. In addition, learners must come to distinguish the meaning and use of each of the tense-aspect forms from those of their semantically close neighbours” (2000: 126-127). Translating the above into the practicalities of her own study, Bardovi-Harlig (2000) investigates the emergence of the tense-aspect form-meaning pairings of the past simple, past progressive, present perfect and past perfect grammatical tenses in 16 adult ESL learners enrolled in a two-year English course. All the testees, three of them women, were aged between 16 and 20 and represented four native language backgrounds:

Organic Approach Deductivised…

319

Arabic, Japanese, Korean and Spanish. In the course of the study, all its participants were asked to keep a daily journal by writing essays, compositions, film-based narratives and creative pieces; they also wrote a 35-minute essay as their final exam. In addition to this, the testees were asked to perform a number of oral tasks – three interviews, one film-retell and two narrative-retell assignments – which were recorded. The collected written and oral learner corpus, containing 1576 written and 175 oral texts, became the basis for an analysis of the emergence of tense-aspect morphology in the learner’s interlanguage, the results of which are presented below with reference to the four grammatical tenses listed above. As for the overall findings of her research, Bardovi-Harlig (2000) discovered that even though the participants started their language course at a comparatively same level of proficiency, during the two-year period they showed individual differences in the rate of acquisition, the length of course participation and their final language advancement. When it comes to detailed, acquisition-sequence related observations, they refer to two groups of findings: related to the order of the emergence of the form and the development of form-meaning connections. As for the former, Bardovi-Harlig discovered a set order of emergence of the four past tenses, past simple being the first to appear in learner discourse, followed by past progressive, present perfect and past perfect. As for the developmental path of the compound morphological patterns, they differ slightly between the past progressive and the two perfect tenses. The former starts at bare progressive (aux+V-ing) and goes through present progressive towards the target past progressive form. The sequence of attainment, however, is not unidirectional and the learners frequently exhibit U-shaped behaviours in this area. What is important is, as Bardovi-Harlig observes, that the development of past progressive is not dependent on the prior emergence of the simple past; all it hinges on is the present progressive, as demonstrated in the language samples of all testees, regardless of their language background. Conversely, the present and past perfect tenses emerge based on the earlier attainment of the simple past: their appearance in learner discourse, spoken or written, is preceded by a high percentage of the correct use of past simple morphology. As for the developmental stages these two go through, the morphological emergence of both present and past perfect follows the path of ill-formed verbs (V=base/second form); perfect continuous, in turn, is observed to emerge posterior to simple perfect. In a later publication, Bardovi-Harlig and Comajoan (2008) present the acquisition sequence of temporal expressions for future time reference: will > be going to > present progressive > present simple. In a study interpolating the two (Bardovi-Harlig 2000; Bardovi-Harlig and Comajoan 2008), Housen (2002) – partly based on VanPatten’s (1984) and Pica’s (1984) studies – presents the sequence for all temporal morphology, as it emerges in the learner’s interlanguage: base V form > V-ing > irreguar past of be > irregular past of other verbs > regular past; be going to > perfect aux+V > present Vs > future will+V. The

320

Chapter Five

developmental pattern is to a considerable extent convergent with Bardovi-Harlig’s findings (cf. above and earlier in this subsection) with the exception of the future simple tense. While Bardovi-Harlig sees it as the first to emerge in the area of future time reference, Housen places it at the end of the sequence. Considering the native speaker distributional bias (Bardovi-Harlig 2000)/prototype effects (Li and Shirai 2000), mentioned in the context of the Aspect Hypothesis in Chapter 4, and about to be revisited in the subsequent part of this section, it seems unlikely that the comparatively frequent future simple pattern will emerge after the much rarer be going to form. Consequently, the developmental sequence proposed by BardoviHarlig (2000; Bardovi-Harlig and Comajoan 2008) appears more reliable. When it comes to the emergence of form-meaning mappings in the learner’s interlanguage, Bardovi-Harlig (2000) notes, forms emerge prior to their semantic representations. Such a conclusion is reached because various forms are overgeneralised and, consequently, overused. This observation refers, among others, to the present perfect tense, which is over-generalised by the testees to uses where another tense-aspect form, usually the past simple, the past perfect or the present simple, is preferred. Past perfect, whose morphology emerges last of all the three compound past tenses, follows a similar over-generalisation pattern: it is often used in contexts where past simple or present perfect tenses are the correct choices. The above-mentioned over-generalisations disappear in the course of the attainment of the semantics of individual tense-aspect forms, which, according to Bardovi-Harlig (2000), is indicative of one more characteristic of the development of learner time talk: the fact that it changes through the reassignment of the whole system of temporal expressions. This process can be explained based on the present perfect. The fact that it is often over-generalised to present and past simple contexts results from its situation – formal and semantic – in the neighbourhood of these two tenses. When the new form-meaning mapping start to emerge and “the learners carve out the meaning and use associated with the newly acquired form, ... they must also reassign part of their understanding of the simple past and present tenses to the present perfect” (Bardovi-Harlig 2000: 175). All in all, temporal expressions are not learned as isolated form-meaning pairings but rather as nodes in a broader network of the time-talk system. This particular acknowledgement is of importance to the present line of reasoning and is going to be revisited later in this chapter. 1.2.2. Aspect Hypothesis (AH) studies Another body of studies into the acquisition of time talk has been carried out within the theoretical framework of the Aspect Hypothesis, whose underpinnings were already mentioned in Chapter 4. Before presenting research to date, it seems necessary to remind the reader briefly of the contents of the theoretical proposal at issue.

Organic Approach Deductivised…

321

As mentioned before, following Shirai (1991), Andersen and Shirai (1994, 1996) as well as Bardovi-Harlig (2000, 2002, 2005), at the onset of language acquisition, both first and second language learners are affected by the lexical aspect of verbs in their attainment of the grammatical tense and aspect of these verbs. As a result, Li and Shirai (2000) observe, there is a clear and consistent association between the lexical and grammatical patterns in the following way, (Bardovi-Harlig 2002 and 2005; see also Chapter 4): perfective past is first learned for events (accomplishments and achievements); the imperfective – for any time reference – is primarily associated with states; and, finally, the progressive originally emerge for activities. Li and Shrai (2000; cf. also Salaberry and Shirai 2002), base the Aspect Hypothesis on Prototype theory calling it the prototype account. Such a stance is motivated by the acknowledgment that “[l]earners ... infer from the input a core, or prototypical, meaning for each inflection such as ‘completed action’ for past and perfective morphology and then associate it with aspectual categories [of verbs] that have similar features” (Andersen and Shirai 1994: 148). Thus, as observed in Bardovi-Harlig (2000: 398), the prototype of the past as a completed action associates with events which are telic and have inherent endpoints; in turn, the prototype of progressive as “action in progress” associates with activities that are dynamic and atelic; and finally, the prototype of present as “continued existence” associates with states. Adherence to prototypes leads to the unmarked use of certain categories and, consequently, often results in their overuse, motivated by over-generalisation, a phenomenon described (following BardoviHarlig 2000) in the previous section in relation to past tenses. What is important is, especially to the present work, whose main focus in on the advanced learner, “it is not until the morphology begins to spread to other aspectual categories in increasingly less prototypical combinations that the system exhibits potentially native-like contrasts... it is not until contrasts are possible in interlanguage that grammatical aspect becomes a true viewpoint aspect” (Bardovi-Harlig 2005: 399). This is a perspective similar to the one expressed in Andersen (1990b), who claims that access to non-prototypical or marked associations is a native speaker advantage. Going back to the diachronic perspective on AH research to date, the onset of interest into the acquisition of the lexical aspect can be traced back to the mid1980s (Li and Shirai 2000) or even mid-1970s (Bardovi-Harlig 2000). Before that, as observed by Li and Shirai (2000), the study of aspect was part of the universal approaches to language development, the main traditions including: – Bickerton’s (1981, 1984) Language Bioprogram Hypothesis, in the light of which there are inborn (preprogrammed) semantic distinctions between states and processes as well as punctual and non-punctual events;

322

Chapter Five

– Slobin’s (1985) Basic Child programme hypothesis, which provided for the existence of a prestructured semantic space containing such prelinguistic notions as process and result; and – input-based approaches, which accounted for language acquisition based on the learner’s (child’s) ability to analyse incoming patterns (Bowerman 1985; Li and Bowerman 1998; and others). The first body of AH research proper – the very hypothesis first verbalised in Andersen and Shirai (1994) – contained studies into the primary language acquisition as well as research in the area of SLA. The former included a longitudinal study by Antinucci and Miller (1976 cited in Bardovi-Harlig 2000: 193), in which seven Italian children and one English child were proved to be “sensitive to lexical aspect in the morphological encoding of past events” (Bardovi-Harlig 2000: 193). These findings confirmed the results of a three-year earlier study of 74 French-speaking children carried out by Bronckart and Sinclair (1973 cited in Bardovi-Harlig 2000: 194). When it comes to L2 acquisition, Li and Shirai (2000) claim that the earliest studies into the acquisition of aspect were the ones carried out by Ramsay (1990), Andersen (1991), Bardovi-Harlig and Reynolds (1995) as well as Bardovi-Harlig and Bergström (1996). Different in design – the first two were concerned with native speakers of English learning Spanish as their L2; the other two dealt with English as the target language – all four studies reported similar findings: in naturalistic second language learning the developmental sequence of morphological development follows the L1 developmental path: perfective past marking on achievements and accomplishments, eventually extending its use to activities and states; imperfect past marking beginning with states, extending next to activities, then to accomplishments, and finally to achievements; progressive marking on activities, and only then on accomplishments and achievements; progressive markings on activities, later overextended to statives. The subsequent body of empirical studies addressing the Aspect Hypothesis is quite impressive (cf. Bardovi-Harlig 2000: 206-210) and shows, as claimed in Bardovi-Harlig, that “the aspect hypothesis is widely supported” (Bardovi-Harlig 2000: 251). Evidence in favour is also presented in a much more recent study carried out by Sugaya and Shirai (2007), in which native speakers of different European languages learning the Japanese imperfective aspect morphology, showed a considerable bias towards activity verbs, regardless of their mother tongue. However, there is also counterevidence to the Aspect Hypothesis cited in both Bardovi-Harlig (2000) as well as Li and Shirai (2000). The claim of the hypothesis that is most frequently undermined is the one that AH applies at the initial stages of acquisition. Bergström (1997) and Salaberry (1999) present findings to the contrary, claiming that the higher the level of development of learners the greater the congruence between AH and research findings; they also

Organic Approach Deductivised…

323

observe that lexical aspect grows in importance with learner advancement. Other counterexamples include studies of learners who showed developmental patterns other than the ones provided for the aspect hypothesis (cf. among others, Robinson (1995: 344; see also Bardovi-Harlig 2000: 266-269 for other examples). 1.2.3. Discourse Hypothesis studies. Bardovi-Harlig’s (1998, 2000) aspect-andnarrative approach The third and final research tradition described in Bardovi-Harlig (2000) contains studies that address the Interlanguage Discourse Hypothesis, and investigate the relationship between personal, impersonal and personalised narratives and tense/aspect morphology distribution. As in the case of the two previous groups of studies, there had been research efforts preceding and/or inspiring Discourse Hypothesis studies. In one of the earliest studies of this kind, Godfrey (1980) observed that tense and aspect morphology cannot be fully understood without recourse to discourse. Dahl (1985), in turn, who studied the distribution of tense/aspect morphology in relation to discourse in primary languages, claimed that the distinction between background and foreground by means of verbal morphology is a universal characteristic of the narrative. Based on the above is the Interlanguage Discourse Hypothesis, which states that “learners use emerging verbal morphology to distinguish foreground from background in the narratives” (Bardovi-Harlig 1994: 43 cited in Bardovi-Harlig 2000: 279); a narrative is understood after Dahl (1984: 116), as “a series of real or fictive events in the order in which they took place”. The ability to mark the route of the main discourse as well as the diversions is a characteristic of a competent language user. The most basic narratives of lower-level learners, as pointed out in Bardovi-Harlig (2000: 280), consist only of foreground, and are characterised by narrativity – which means that the order of textual units matches the order of event units – as well as completeness and punctuality, associated with the perfective aspect. As the learner develops, (s)he starts to control narrative diversions, an ability demonstrated by the use of imperfective verbal morphology, which enables the language user to indicate the difference between the main stream of events and background events. In her own theoretical proposal, Bardovi-Harlig (1998, 2000) synthetises the AH and narrative approaches and shows how narrative structure may contribute to the spread of tense-aspect morphology from unmarked to non-prototypical cases. She supports her stance with the observation that both the aspectual class and the narrative structure contribute to the order of tense-aspect morphology acquisition. All of this seems to be well grounded in the three principles accounting for the aspect/narrative influence distinguished by Andersen and Shirai (1994):

324

Chapter Five

– the congruence principle: verbs use tense/aspect morphemes whose meaning is the most similar to that of a verb; – the relevance principle: learners use morphology relevant to the verb, immediately following the verb stem and, consequently, they learn the most relevant morphology first. “[B]ecause aspect is more relevant to the meaning of the verb than tense, mood or agreement, the first use of verb morphemes is as aspect markers” (Andersen and Shirai 1994: 145); for example, as Andersen and Shirai point out, the –ed ending is used to indicate a completed action and not the past tense. Bybee2 sees it as primarily discourse-related, claiming that “inflectional aspect indicates how the action or state should be viewed in the context” (1985: 21); – one-to-one principle: learners expect every new morpheme to have only one meaning (guided by the prototypical meaning). Summing up the above, and referring their aspect-related observation to a discourse-specific need of the learner, Andersen and Shirai (1994: 152) state that: [t]hese principles follow naturally from the speakers’ (both learners and nonlearners) need to distinguish the reference to the main point/goal of talk from supporting information, within the tradition of research on grounding and the function of tense/aspect markings in narratives.

With the above as a point of departure, Bardovi-Harlig (2000: 313) puts forward the following hierarchy owing to which it can be predicted which verbs in a narrative will be inflected by lower-level language learners, whose linguistic resources are naturally very limited: – achievements are the predicates most likely to be inflected for the simple past, regardless of grounding; – accomplishments are the next likely type of predicate to carry the simple past. Foreground accomplishments show a higher rate of use than background accomplishments; and – activities are the least likely of all dynamic verbs to carry simple past (foreground activities more than background activities; activities also use progressive which is limited to the background. All in all, in spite of their different foci, studies carried out within the three theoretical frameworks share two important characteristics. Firstly, all of them are ––––––––– 2

“The more a morphological category is relevant to the verb […], the greater is the morphophonological fusion of that category with the stem’ (Bybee 1985: 11); the order: stem < aspect < tense < mood < person.

Organic Approach Deductivised…

325

actually sequence studies: they follow the learner’s development and concentrate on the process – with all its interim stages – rather than the end product of learning. Secondly, all of these research efforts address natural learning noninterventionist environments. While the present work sympathises with the process-over-product approach, it is, as has been pointed out on several occasions throughout the book, more interested in the research into time talk in controlled learning environments. Section 1.2 addresses the issue, treating Bardovi-Harlig’s (2000) comprehensive account as a point of departure. However, as Bardovi-Harlig herself seems to be biased towards uninstructed learning, the discussion will be supplemented by examples quoted from outside the main reference source. 1.3. The focus-on-form perspective on time-talk acquisition Following VanPatten’s (1990) distinction of three investigation areas such as foreign language learning, instructed second language learning and uninstructed second language learning, Bardovi-Harlig (2000) looks at all the three different learning environments and their influence on the emergence of tense-aspect morphology but, as she claims, her interest in the attainment of temporal morphology is not pedagogical but acquisitional. In this context, based on Gass’s (1989) claim that language acquisition is essentially the same process in all of the above environments as well as Pienneman’s Teachability Hypothesis (quoted in Chapter 2), in the light of which acquisition orders are unaffected by instruction, Bardovi-Harlig expresses her scepticism about the effect of tuition on the emergence of tense-aspect morphology. She admits that the effects may be easier to determine in experimental studies in which instruction will potentially increase the rate of acquisition; she, however, states (Bardovi-Harlig 2000: 405): Where instructional input, motivation and input through L2 contact are combined, the outcome seems to be an advanced level of development and, eventually, corresponding target-like form-meaning connections. However, instruction does not seem to be a privileged variable in any sense.

While it is not the intention of the present work to argue that instruction as such is in any way a prerogative, it will be maintained – as it was throughout Part I – that instruction is a crucial part of the language teaching process, the importance being particularly noticeable in foreign language teaching contexts. There is a very important reason for such a claim. While language acquisition may, as observed by Gass (1989), be an identical process for all three learning environments – the foreign language learning, instructed second language learning and uninstructed second language learning – this sameness will only be qualitative in the sense that in all three contexts the learners will acquire the target language based on similar cognitive mechanisms and following the same developmental patterns. There will,

326

Chapter Five

however, be a quantitative discrepancy between the three types of learning in the area of the amount of language the learner is potentially exposed to and, consequently, the amount of time needed for the activation of the said cognitive mechanisms and covering the said developmental stages. Considered from this perspective, the foreign language learner is at a disadvantage because of the scarcity of exposure to language in authentic contexts. The only way in which this deficiency can be compensated for is by means of certain pedagogical interventions. As a result, instruction in foreign language learning will, as admitted by Bardovi-Harlig (2000), increase the rate of acquisition, just as it would in any of the environments mentioned. However, contrary to what BardoviHarlig claims, it will be of crucial importance, being amendatory as well as acceleratory, to the understanding that, in addition to facilitating learning, instruction – as pointed out above – will have to compensate for the lack of exposure to the target language input by different ways of input enhancement as well as a certain degree of explicit, teacher-fronted about-the-language – rather than language-only – input. Taking the above into account, it is of interest to the present work how effectively the acquisition of tense and aspect morphology can be influenced by formal instruction, especially that Part III of the book sets out to devise an effective pedagogical treatment in this area. In search of an effective strategy, we will look at four research samples whose aim was to investigate the effect of instruction on the acquisition/learning of time-talk morphology: Bardovi-Harlig and Reynolds’ (1995), Doughty and Varella’s (2004) and two studies by Pawlak (2006), all of which are described below. In their study, Bardovi-Harlig and Reynold (1995) look at the emergence of the past simple tense morphology in 182 adult learners at six levels of language proficiency, identify areas of difficulty and propose a compensatory treatment which they call “an acquisitionally based approach to instruction” (1995: 107). This pedagogical proposal is motivated by the authors’ own research findings and the resulting assumption that instructed and uninstructed learning modes are very much alike. Bardovi-Harlig and Reynold observe that, similar to uninstructed contexts, the classroom-based acquisition of the tense at issue is not a unitary process, but proceeds gradually and in relation to the acquisition of verbs of a relevant lexical aspect, which can be explained in the light of the Aspect Hypothesis. Another analogy between instructed and uninstructed learning are the identified difficulties: initial neglect in using the past tense morphology in contexts where it would be applied by the native speakers of the language as well as under-generalisation of the form-meaning pairing. Considering the above, Bardovi-Harlig and Reynolds (1995) propose that the treatment should consist of a combination of an increased use of contextualised examples in input and focused noticing exercises. The researchers observe that the said interventions should primarily be applied to verbs such as states and activities,

Organic Approach Deductivised…

327

which are not prototypically used as perfectives and, consequently, pose considerable difficulty in the area of past tense morphology. When it comes to detailed pedagogical solutions, the interventions Bardovi-Harlig and Reynolds applied consisted in input-flooding the learner with text snippets containing activity and state verbs in the past simple tense. As for the focus component, the learners were subjected to a number of general awareness-raising activities which required finding a sentence with only one verb in the simple past or supplying the simple form for each progressive form found in the text. In their conclusions, BardoviHarlig and Reynolds (1995: 127) state: “We argue that input enhancement which includes focused noticing as well as positive evidence, provides learners with an awareness which helps input to become intake even outside the classroom”. When it comes to Doughty and Varella’s (2004) research, they investigated ways in which learners’ attention can be drawn to the formal properties of language without backgrounding the communicative value of the input in question, their main concern being the likelihood of explicit instruction precluding fluency. The subjects in the study were 34 middle-school ESL learners aged 11-14. Based on learner problem diagnosis, the time talk which was subject to investigation was the past tense, including the conditional past, while the educational context was a content-based classroom where English was the means of studying science and mathematics. The instructional procedure implemented in the course of the study was implicit FonF in the form of recasts carried out by means of intonational and corrective focus during content-oriented language exchanges between the tutor and the students. The researchers’ belief in the effectiveness of such a research design was based on the following three observations concerning language use and error repair between native speaker adults and children (Doughty and Varella 2004: 117): – adults are likely to use recasts with ill-formed rather than well-formed children’s utterances; – adults usually recast ill-formed utterances containing one – and not many – errors; and – while recasting, adults normally provide “specific contrastive evidence” such as examples of the correct use of a given form. “It has been shown,” the authors (Doughty and Varella 2004: 117) argue in relation to the above, following Grimshaw and Pinker (1989),“that children both notice the linguistic information brought into focus by adults and seem to make use of it” (emphasis in the original). Evaluating the effectiveness and feasibility of the communicative focus on form applied in their own study, Doughty and Varella (2004) state that it has been shown that the intonational focus and corrective recasting help the learners

328

Chapter Five

improve their use of the past tense morphology, and that the effects were durable, as demonstrated on a delayed post-test. Additionally, and equally importantly, the applied treatment did not interfere with the content-based teaching. Finally, in his two research efforts, Pawlak (2006) looks at the extent to which form-focused instruction can facilitate the learning of temporal morphology in the Polish EFL context. Study 1, which concerned the past simple morphology, was carried out in the action research mode with the participation of 33 Polish former secondary school students. In the treatment, which consisted of 6 sessions, the teacher-researcher utilised such instructional options as input and output enhancement as well as recasting. As the author admits (Pawlak 2006), in doing so he expected to achieve durable effects in the following areas: increased awareness of the formal property in question manifesting itself in the learners’ ability to monitor output; and learner fluency resulting from an increased automatisation of the grammatical structure at issue. In his conclusions to the study Pawlak (2006) states that as a result of the said treatment his students’ output demonstrated an increased frequency in the use of the past tense as well as greater accuracy in this area. Additionally, as Pawlak points out (2006: 426), “the instructional treatment resulted, to some extent, in reducing the gap in performance between the best and the poorest students in the class”. This final observation – which is in agreement with the findings of Erlam’s study (2005) cited in Chapter 2 – indicates that form-focused instruction can be particularly beneficial not only in foreign language learning contexts as such – a point made at the beginning of this section – but also as a way of levelling in-class differences to the advantage of the less proficient language learners being instructed. In addition to these general conclusions, Pawlak makes a number of detailed observations about the effectiveness of the treatment. He points out that the effect depends on how complicated the structure is – simple over complex – as well as what kind of production is required of students – controlled over spontaneous. The other study (Pawlak 2006) which concerned the present perfect morphology applied similar procedures and led to similar results and following conclusions. Involving 72 second-year Polish secondary school learners of English, the quasi-experimental research treatment offered its participants numerous opportunities to “practice [the present perfect tense] in a variety of text-manipulation activities and, to a much lesser extent, simple communicative tasks” (Pawlak 2006: 431). In addition to this, the instructional procedures involved the presentation of the said tense in context, in which the form was graphically highlighted by means of upper case font or bolding. As demonstrated by Pawlak on the post-test and delayed post-test, the treatment resulted in greater accuracy and appropriacy as well as fluency in the use of the present perfect. Summing up the results of both studies, Pawlak states (2006: 466):

Organic Approach Deductivised…

329

Generally speaking, the subjects in both studies appeared to benefit from the pedagogic intervention, although the nature and the scope of the improvement was a function of the inherent properties of the structure, the learners’ stage of interlanguage development as well as the type of performance in which the students were required to engage.

In conclusion to the present section, it can be stated that the area of the acquisition of tense-aspect morphology is decidedly well researched. There have been numerous studies, carried out from the three perspectives presented, which have investigated the emergence of temporal form-meaning pairings vis à vis different hypotheses as well as with special regard to instructional treatment. The most important of them have been described here and will be returned to – for reference and inspiration – when the present research design is described. At this stage it can be said that my studies which will be presented in this book, with their focus on time talk and its semantics, follow the American form-oriented mode, with individual temporal expressions as points of departure for meaning-related considerations. In addition to this, very much like the studies cited after BardoviHarlig (2000), my research focuses on the process rather than product, investigating the dynamics of change resulting from the applied treatment to a greater extent than its results. Finally, it has to be pointed out that the organic, exposure-based part of the treatment will be examined against the background of the AH studies and the prototypicality effect. What, however, needs to be pointed out as well is that in the course of the abovequoted comprehensive account of research to date, Bardovi-Harlig (2000) makes two important observations which will become a point of departure for the subsequent considerations related to my research. First of all, as Bardovi-Harlig (2000: 2) puts it, “within the expression of temporal semantics we find a subsystem of language with many of the characteristics of the larger linguistic system”. This means, as indicated earlier in this section, that learning a new part of the time-talk network affects its other parts; learning a new morpheme and the meaning it carries affects other formmeaning mappings in the neighbouring area. As a result, when considering the best pedagogical options in teaching the system of temporal expressions, it seems necessary to look for an underlying linguistic theory which will accommodate the idea of language-as-a-system. Construction grammar, as will be argued in Section 2, offers a promising model of this kind. Secondly, what needs to be emphasised in relation to research into time talk, is Bardovi-Harlig’s (2000) observation – based on the acquisition sequence studies – that forms emerge prior to form-meaning mappings. This, together with the concern expressed in her earlier publication (Bardovi-Harlig 1999) about the considerable neglect as regards the acquisition of temporal form-meaning mappings in the advanced language learner – an observation confirmed by the cited studies, all of which concentrated on beginner or intermediate learners – will

330

Chapter Five

be of importance in the design of an appropriate pedagogic treatment. Section three looks at the two issues and tries to pinpoint the potential FMC problems experienced at higher levels of language proficiency, their origins and possible ways of repair.

2. Linguistic perspectives on form-focus pedagogy. Construction grammar as an option3 Language pedagogy has always been rooted in a much broader domain, psychological, psycholinguistic and linguistic. The psychological and psycholinguistic perspectives on focus on form were presented in Part I. The present section, in turn, investigates the possibility of basing certain pedagogical solutions on a linguistic theoretical construct: construction grammar (CxG). However, before the advantages of relating grammar instruction to CxGmotivated underpinnings are discussed, it seems necessary to look at previous linguistic – generative and early cognitive – perspectives on language pedagogy. This is because construction grammar as an option is best seen in such a broader context. As I argue elsewhere (Turula 2009e; cf. also in the introduction to Part II), the early Chomskyan (Chomsky 1965) strong syntactocentrism – all of the criticism of its opponents and numerous revisions by its proponent (Chomsky1981a, 1981b, 1986, 1995) notwithstanding – seems to be surprisingly well off in the area known as pedagogical grammar. All of this most probably results from the foreign language learners’ (and teachers’) need for a reliable set of guidelines they can resort to in such a dangerous territory as FL grammar. An all-inclusive system of rules enabling every possible form-function combination to be generated would certainly be a much welcome solution to learning problems in this area if it were not for the numerous exceptions left out of the system and listed as “other uses”. The latter together with the truly syntactocentric disregard for the semantics of grammar – form-focused instruction, as was pointed out in Part II, rarely or never overtly concentrating on meaning – may result in a FL learner’s formal expertise but will rarely lead to the near-native-like understanding and, consequently, use of grammatical forms expected at higher levels of language proficiency, especially in the case of marked/non-prototypical form-meaning mappings. It would be unfair to claim that such a situation has never been subject to repair. Approaches to grammar pedagogy rooted in cognitive linguistics exploited, among others, by Skehan (2003) and Turewicz (2000 – to mention one of major Polish publications in this area) led to focus-on-form, which was primarily usage––––––––– 3

This section is an adaptation of my article under the same title (Turula 2009e).

Organic Approach Deductivised…

331

based/organic (Langacker 1987 and 2000 / Rutherford 1987), and was expected to result in the learner’s grammar understood as a combination of exemplars and/or rules. However, the Lexical Approach (Lewis 1993 and 1997), which is one of the most popular solutions of this kind and, to a certain extent, an alternative to traditional (yet still alive and kicking) grammar teaching, is – when implemented in the FL classroom – more inclined towards the exemplar end of the system. With its idea of grammaticalised lexis, it seems to neglect the concept of language as rule-based, as well as the ways in which generalisations are made and learned. This leaves one dissatisfied with the present state of affairs and inclined to look for a third solution, a golden-mean option with the potential of reconciling the two afore-mentioned ways with language and its grammar. Construction grammar seems to offer such a third way out. That is why, the present section intends to first present an overview of grammar theories to date together with their advantages and shortcomings; it then goes on to discuss the alternative – constructionist approaches – shows a number of constructions at work (Goldberg 2006), and finally concentrates on how construction grammar can be utilised in FL grammar pedagogy. 2.1. Grammar theories to date and their impact on pedagogical grammar Discussing all of the linguistic theories of the previous and the present centuries is well beyond the scope of this section. This is why, the present section will concentrate on two theoretical perspectives on language – Chomsky’s generative model (1965, 1981a and b, 1986, 1995) and the cognitive approach to language as suggested, among others, by Langacker (1987). There are a number of reasons for the selection of the two theoretical models. To start with, Chomsky’s and Langacker’s are the two proposals that seem to have dominated the approach to grammar pedagogy of the late 20th and the early 21st centuries. Secondly, these two approaches seem to be in opposition to each other as regards many languagerelated phenomena important to the present line of reasoning. The controversy that is of particular interest to the present work is the conflict between the generativist syntacocentrism, syntax independence and idiosyncrasies confined to lexicon on the one hand and, on the other, the cognitive belief in language as a collection of more and less abstract symbolic units, in which instantiations always come before schematisations. The focus on the controversy is motivated by the fact that the alternative described in the present section – construction grammar – seems to offer a sensible compromise between the two theories in question. However, before this balanced approach is presented, both models – the generative and the cognitive – deserve a brief description. As diagrammed in Croft and Cruse (2004: 227), the summative model of grammar, as presented in the generative theory of the 1960s to 1980s will be similar to the one below (Figure 41):

332

Chapter Five

Figure 41. The generative model of grammar (adapted from Croft and Cruse 2004: 227) phonology

L E X I C O N

syntax

linking rules

semantics Within this model the speaker’s linguistic knowledge is organised into components, each of which describes one set of properties of a given language unit: its phonological, syntactic and semantic characteristics4. Consequently, the notation for a lexical unit belonging to the vertical component (the lexicon), e.g.: the bat, would be represented phonologically, syntactically and semantically as shown in Figure 42 (based on Jackendoff 2003). Figure 42. Generative representation of a lexical unit (based Jackendoff 2003) NP

NP

Det

N

Det

the

bat

[/ /] [Det] [DEF]

N [/bæt/] [N, 3 sg] [BAT]

There are obviously information mappings between the three components and they are possible owing to quite general linking rules. As a result, a lexical unit like the one sampled in Figure 42 is an “arbitrary and idiosyncratic joining of form (phonological and syntactic) and meaning” (Croft and Cruse 2004: 227). Such mappings, however, cannot be extended beyond the unit as, according to generative grammar, “there are no idiosyncratic properties of language larger than a single word (Croft and Cruse 2004: 227).

––––––––– 4

In Chomsky (1981a), the syntactic component is further divided into levels (deep/surface structure) and modules (case theory, binding theory). This subdivision will not be discussed here as it was abandoned by Chomsky (1995) himself in his Minimalist Program.

Organic Approach Deductivised…

333

All of this has had a number of consequences for pedagogical grammars written under the influence of the Chomskyan model5. First of all, for any grammatical construction – be it the passive voice or a given tense, for example present simple – as many of its properties as possible will be described by general rules and all the unique properties will have to be stored in the lexicon. Such a description mode refers to both – the form and the use of the structure in question. That is why, a typical grammar book will be full of constructional as well as transformational algorithms, like the ones presented below in figures 43A and B, respectively, while the usage section will contain examples that are subject to general rules together with a number of so-called “other uses” which do not fit into the rules-only description, and which usually outnumber the rule-governed examples, as depicted in Figure 44. Figure 43. Transformational algorithms for the present simple tense (A) to form 3rd person sing:

add -s

if the verb finishes with -ss, -sh, -ch or -o – add -es

if the verb finishes with consonant+y – change the -y into -ie and add -s

––––––––– 5

There is no denying that the Chomskyan revolution led to a number of positive changes in language pedagogy. Above all, it led to the abandonment of the behaviorist-habit-formation routines in the language classroom and paved the way for the communicative approach. This, however, is of lesser importance to the present argument with its focus on grammar as the instructional focus.

334

Chapter Five

(B)

PEOPLE

LIKE

DOGS

.

DO V

EVERYBODY

LIKE

S

DOGS

.

DOES

V

to form questions in the present simple tense:

1) add auxiliary DO before subject NP; 2) replace full stop with question mark

(!) for 3rd person singular: 1) add auxiliary DOES 2) remove -s ending from main verb

Organic Approach Deductivised…

335

Figure 44. Present simple – rule and other uses (Thomson and Martinet6 1990: 160) 173 The present simple is used to express habitual actions A The main use of the simple present tense is to express habitual actions: He smokes. Dogs bark. Cats drink milk. This tense does not tell us whether or not the action is being performed at the moment of speaking, and if we want to make this clear we must add a verb in the present continuous tense: My dog barks a lot, but he isn’t barking at the moment B The simple present tense is often used with adverb or adverb phrases such as: always, never, occasionally, often, sometimes, usually, every week, on Mondays, twice a year, etc.: How often do you wash your hair? I go to church on Sundays. or with time clauses expressing routine and habitual actions. Whenever and when are particularly useful Whenever it rains the roof leaks. 174 Other uses A It is used chiefly with the verb say when we are asking about or quoting from books, notices or very recently received letters: What does the book say? Other verbs of communication are also possible: Shakespeare advises us not to borrow or lend. B It can be used in newspaper headlines: MASS MURDERER ESCAPES C It can be used for dramatic narrative. This is particularly useful when describing the action of a play, opera, etc.: Juliet is writing at her desk. Suddenly the window opens and a masked man enters. D It can be used for a planned future action or series of actions, particularly when they refer to a journey: We leave London at 10.00 next Tuesday and arrive in Paris … E It must be used instead of the present continuous with verbs which cannot be used in the continuous form, e.g.: love, see, believe, etc. F It is used in conditional sentences type 1 If I see Ann I’ll ask her. G It is used in time clauses (a) when there is an idea of routine: As soon as he earns any money he spends it. (b) when the main verb is in the future form: When it stops raining we will go out.

––––––––– 6

A Practical English Grammar by Thomson and Martinet is used here as an example in spite of the fact that it is quite an old grammar book. First of all, it is still in wide use; secondly, most contemporary grammars – including some corpus-based publications (Carter and McCarthy 2006) – follow the rules-plus-other-uses presentation mode.

336

Chapter Five

In addition to the algorithms and lists of rules and exceptions, Chomskyan syntacocentrism together with the generative belief in the autonomy of the syntactic component of the system underlies a grammar pedagogy which is devoid of semantic analysis. The focus of instruction is on the formal properties of the structure, and its meaning is taken care of by exposure alone (on condition that the said instruction is combined with such exposure), which, as was argued throughout this book, may not be enough. Equally insufficient are the rule-based algorithms used in pedagogical grammars of the kind described above, which tell the learner when and where but never why a certain form is required. For reasons other than – yet not dissimilar to – the above-mentioned pedagogical shortcomings, the generative model became subject to criticism on the part of cognitive linguists, among others Langacker (1987) who claimed that Chomsky was wrong about the elegant modularity of the system as well as the autonomy of syntax. Langacker himself argues for a model of language understood as an inventory of symbolic units, each of such entities consisting of the semantic pole and the phonological pole, a diagramme of which was presented, after its proponent, in Chapter 2. Within the structure, a Langackerian representation of the already introduced lexical unit – the bat will resemble the one depicted in Figure 45: Figure 45. Cognitive representation of a lexical unit (informed by Langacker 1987)

symbolic unit semantic unit cod

phonological unit

usage event BAT

bat

All of this has a number of consequences for grammar pedagogy. First of all, based on the above theoretical underpinnings, the schematised symbolic unit for a given grammar structure – like the already discussed present simple (cf. the fragment on the implications of the generative theory) – is likely to resemble the representation (simplified) shown in Figure 46.

Organic Approach Deductivised…

337

Figure 46. Cognitive representation of the present simple tense grammar (linguistic convention) symbolic unit

usage event

semantic unit

Conceptualisation: single, simple entities, unities, totalities (Lewis 1986: 66)

phonological unit

Vocalisation: bare infinitive (+ morphological variations)

Moreover, remembering that Langacker’s model is dynamic and usage-based (Langacker 1987, 2000), the schema in question will be abstracted from numerous encounters with highly specific applications of this structure and will coexist with a number of low-level idiosyncrasies, especially the ones which – like I do, I swear to God I..., What do I do now? – have become symbolic units: pairings of highly specific form and meaning matched on the strength of linguistic convention. The best-known application of the cognitive model in grammar pedagogy is called the Lexical Approach (Lewis 1993, 1997; discussed in Chapter 2) with its idea of a “grammaticalised lexis” to replace the more Chomskyan “lexicalised grammar”. According to this approach, “grammar is not logically distinct from ‘vocabulary’” because we operate within a “spectrum of generalisability” (Lewis 1993: 137), with words participating in a limited number of patterns at one extreme and widely applicable patterns at the other, this “spectrum” is not unlike the exemplar-rule continuum proposed by Skehan (2003). Compared to the traditional approach, whose language patterns were highly abstract and devoid of meaning and whose rules were evidently unable to cover all uses, the Lexical Approach definitely has a lot to offer to foreign language grammar pedagogy; a claim, which has already been confirmed in the classroom as Lewis’s proposal has been practised for almost two decades now. 2.2. The Lexical Approach – potential shortcomings However, in addition to acknowledging the obvious advantages, the experience of the years during which the Lexical Approach has been operational leads to a number of critical observations. As I argue in a recent publication (Turula 2009e), these reservations include the following four factors: lexical language

338

Chapter Five

learning is predominantly memory-based; extra-lexical grammar is neglected; pattern extraction is based mainly – if not solely – on heuristic browsing (induction); and finally, learners may not take full advantage of the creative potential of language. All of these potential shortcomings of the Lexical Approach are discussed below. The fact that language learning is predominantly memory-based results from the recommendations of the Lexical Approach, quoted in Chapter 1: “more attention should be paid to multi-word chunks” (Lewis 1997:15). While it cannot be denied that in Lewis (1997: 15) there is mention of “revealing and retrieving patterns”, it is also true that, as a result of the communicative demands that come first in language pedagogy, classroom practice is usually focused on “encouraging learners to use lexical phrases as the blocks out of which they construct simple L2 conversations” (Little 1994: 115). Making broader generalisations about the internal composition of the block seems to be of lesser importance. The obvious question at this stage is whether memory-based language learning is the best pedagogical option. The answer is no and yes. The main advantage of giving priority to rote learning will be the fact that prefabricated chunks – called by Langacker “pre-packaged assemblies no longer requiring conscious attention” (2000: 93) – if effectively proceduralised, definitely reduce the cognitive load which is required in online processing in communicative situations. This point was presented in Chapter 1, together with the argument that the need for focused, conscious attention in on-line language processing is conversely proportional to the automaticity of co-performed tasks. What is more, as Skehan (2003) claims, memory is a unique component of language aptitude and its importance grows over time. Grammar sensitivity, in turn, with its pattern extracting potential remains on the same level throughout the language-learning process. As a result, it can be said that the more advanced the learner is the more emphasis should be placed on memory-based learning. On the other hand though, there are claims to the opposite such as the one made by Rysiewicz (2006) as well as Mackey et al. (2002) and Biedro and Szczepaniak (2009): language aptitude depends on memory as long as it is textual memory or working memory, which accommodate two capacities: storing and processing. It is the latter ability – including the skill of seeing and generalising regularities, which according to Rysiewicz, is of greater importance than rote memory per se. Similar findings come from an earlier research by Liang (2002 quoted in Goldberg 2006: 116-117), which proved that the ability to use language [L2] proficiently is correlated with the recognition of constructional generalizations, findings confirmed by my own study into the relationship between language analytic ability and learner proficiency (Turula 2009b), in which I emphasise the role of what Truscott and Sharwood-Smith (2004) call syntactic working memory. All of this shows that however important, memory is only one side of the language aptitude coin, and sensitivity to pattern should not be neglected.

Organic Approach Deductivised…

339

Exclusive reliance on memory, together with Lewis’s recommendation to pay less attention to sentence grammar leads to another drawback of the Lexical Approach: neglect of extra-lexical – or sentence – patterns. The question as to whether or not it is potentially undesirable that extralexical grammar is neglected can be answered by Goldberg’s claim that the obviously Langackerian tendency to operate on the low-level schemata, seeing them as “more essential to language structure than ... broadest generalisation” (Langacker 2000: 92-93) may be too reductionist for some language analyses. Goldberg (2006) backs up her claim with two very convincing examples. In one, she refers to the ditransitive construction claiming that “the implication of transfer is not an independent fact about the words involved; rather the implication of transfer comes from the ditransitive construction itself” (Goldberg 2006: 9). In her other argument, she mentions opaque words like get which have “low cue validity as predictors of meaning” (Goldberg 2006: 105), stating that their semantics can never be sufficiently obvious unless one examines them on the sentence level. A similar point was argued in Chapter 1, based on MacDonald (1994) and Baars (1997b). All of this shows that, in addition to low-level processing of language, it is important to investigate the realm of higher generalisations. The learner’s ability to operate on abstractions may be of particular importance in the case of innovative use of language, as I argue in Turula (2009b). Still another observation which is critical of the Lexical Approach will refer to Lewis’s pedagogical attention foci such as “listening (at lower levels) and reading (at higher levels)”, which, together with the suggestion to “prepare learners to get maximum benefit from text”, usually mean that pattern extraction is based mainly – if not solely – on heuristic browsing (induction). Such bottom-up language data accumulation is something Langacker (2000) is very much in favour of. He calls it “non-reductive”, “maximalist”, praising it over the minimalist generative emphasis on “general rules” (2000: 91-92). The “top-down spirit of generative grammar” is further challenged by the claim that “rules can only arise as schematisations of overtly occurring expressions” (2000: 91-92). Later, as Langacker claims, such patterns are “available for the sanction of new expressions” (2000: 114). However, a claim like the one mentioned, together with the earlier (cf. Chapter 2) assertions about the role of deductive instruction, leads to a number of questions. First of all, considering the limited exposure to language in the FL classroom, there is no guarantee that as a result of such non-reductionist, bottom up browsing all of the important patterns will actually be noticed; the present book has been arguing to the contrary: exposure-based browsing alone is not sufficient to learn the semantics of some forms, particularly the marked/peripheral ones. Secondly, it is uncertain whether the conceptualisations that follow are really native-like7. Consequently and most ––––––––– 7

As I argue elsewhere (Turula 2006a), they are far from native-like for two reasons: bottom-

340

Chapter Five

importantly, we can never be sure whether the learner takes full advantage of the creative potential of language (?), especially that, as Langacker himself claims, “patterns of comparable generality can easily differ in their degree of productivity” (2000: 114). The question mark placed at the end of the “creative potential” statement above shows all of the doubts that one may have about the effectiveness of non-native intuition about the extent to which certain abstracted patterns can be made operational. The lack of confidence in linguistic creativity in an area as potentially tricky as a foreign language can be supported by the following example. Traditional pedagogical grammars instruct the learner that certain verbs – called stative or nonaction, depending on the particular handbook – do not appear in continuous tenses. For some time now, this rule has been subject to breach, the best-known example being the McDonalds’ slogan I’m lovin’ it. The process underlying such usage events, described by Langacker (1987: 256-7) as “imperfectivising what would otherwise be a perfective expression” and called re-profiling (Turewicz 1998; cf. Chapter 4) is implemented in this particular usage event to satisfy its semantic demand: a description of a current experience limited to a certain time and space (like a visit to a restaurant). Once noticed and abstracted, the pattern in question is open to further application. Apparently, though, while it stands ajar for some verbs, the opening is only slight for others. The problem is best illustrated by the pair of see and hear. Both of them being verbs of senses, one might speculate that they would be equally available for the imperfectivising process. This is, however, not the case as demonstrated by both corpus-like analysis8 and native-speaker grammaticality judgments9. Firstly, hear is much less frequent in continuous tenses than see. What is more, sentences like Don’t tell me you haven’t been hearing of him10 are rejected by native speaker judges as ungrammatical while You mean the man I’ve been seeing around here11 is seen as correct if not most natural. This reevokes the question of the speaker’s knowledge about the degree of productivity of a given pattern. In Langacker’s (2000) view, the key factor in such decision-making is entrenchment. Some patterns are highly productive and, consequently, well entrenched, which makes them more easily activated for potential usage events. Other schemata are less productive which makes them poorly entrenched, less used and, what follows, less frequently selected. The entrenchment-based grammaticality judgment, however, needs native-like exposure to linguistic input, which is rarely an option in foreign language learning. –––––––––

8 9 10 11

up browsing gives less impressive results in foreign language learning because of exposure to language that is limited and, in most cases, belated. Brown Corpus and Google search run on 28 April 2008. Verbal reports of 10 – 4 BE, 3AE and 3 AuE – native speakers of English (to be described in detail in Chapter 6, as the interviews were carried out as part of the present research). A quotation from Harry Potter and the Philosopher’s Stone (the film). A quotation from As Good as It Gets (the film).

Organic Approach Deductivised…

341

All things considered, it seems that a pedagogically sound solution to all of the described shortcomings will be a kind of lexical, data-driven approach to grammar with a top-down organising element that would provide the foreign language learner with a kind of blueprint, an overall scheme that would offer some anchoring for the incoming language patterns. A solution of this kind was argued for in Part I, with special regard to the form-focus options discussed in Chapter 2. It seems that construction grammar has the potential to constitute the theoretical linguistic background for such pedagogy. However, before its educationally relevant characteristics can be presented, the theory itself needs a brief introduction. 2.3. Construction Grammar (CxG) – an overview It would be wrong to believe that there is only one construction grammar. In fact, CxG constitutes a family of cognition-based syntactic theories. They include: Construction Grammar (in upper case) developed by Fillmore, Kay and collaborators (1988, 1993, 1996 quoted in Croft and Cruse 2004: 266); variants of construction grammar presented in Lakoff’s (1987) seminal work on the Thereconstruction and Goldberg’s (1995) analyses of argument structure constructions; Langacker’s (1987) Cognitive Grammar; and Croft’s (2001 quoted in Croft and Cruse 2004: 283) Radical Construction Grammar. All of these approaches vary on a number of issues and can in fact be organised in a kind of continuum with Fillmore and Kay’s theory situated at one end, close to certain formalist theories like Head-Driven Phrase Structure Grammar, with Langacker’s Cognitive Grammar occupying the other extreme, the most non-reductionist, maximalist stance12. What they all share, though, is a set of fundamental principles, including the belief that the grammar of language consists of “a large number of constructions of all types, from schematic syntactic constructions to substantive lexical items” (Croft and Cruse 2004: 256). The lexical-syntactic continuum is presented, after Goldberg (2006: 5) in Table 10. Table 10. The lexical-syntactic continuum (Goldberg 2006: 5) Morpheme Word Complex word Complex word (partially filled) Idiom (filled) Idiom (partially filled) Covariational conditional Ditransitive (double object) Passive

e.g. pre-; -ing e.g. avocado, and e.g. daredevil; shoo-in e.g. [N-s] for regular plurals e.g. kick the bucket e.g. jog memory e.g. the Xer the Yer (the sooner the better) Subj V Obj1 Obj2 (He gave her a book) Subj aux VPpp (PP by)

––––––––– 12

For a detailed overview cf. Croft and Cruse (2004: 266-290) as well as the sources they cite.

342

Chapter Five

In the light of all the above-mentioned CxG theories, every such construction is bi-polar, which means that it possesses formal (syntactic and phonological) as well as meaning-related (semantic and pragmatic) properties. Such form-meaning pairings are organised into a system of grammatical knowledge whose model is presented below, following Croft and Cruse (2004: 256), in Figure 47. Figure 47. The CxG model of grammar (adapted from Croft and Cruse 2004: 256)

L E X I C O N

C O N S T R U C T I O N 1

C O N S T R U C T I O N 2

C O N S T R U C T I O N 3

phonological properties

[etc.]

syntactic properties

semantic properties

Although slightly reminiscent of the generative model presented earlier in this section, the one above shows a considerable change: the knowledge of language is structured vertically, along the inheritance line of constructions, the three former horizontal mainstays of the system are reduced to properties of these constructions. Finally, what all theoretical variants of construction grammar agree on is that “languages are learned – they are constructed on the basis of the input together with general cognitive, pragmatic and processing constraints” (Goldberg 2006: 3). All of this, as demonstrated in the subsequent parts of this section, is of importance as the theoretical backbone of grammar pedagogy. 2.4. Construction Grammar as the theoretical backbone of grammar pedagogy To best show whether construction grammar can offer theoretical anchorage for some pedagogical measures, it seems appropriate to go back to the pinpointed shortcomings of the otherwise non-controversial Lexical Approach. As indicated in Section 2.2, the first criticism referred to the fact that language learning is predominantly memory-based. The solution construction grammar offers is a balanced approach in which both memory and analytical pattern sensitivity are important. This, to use Goldberg’s own words (2006: 45), is because: the constructionist approach to grammar offers a way out of the lumper/splitter dilemma: the approach allows both broad generalizations and more limited patterns to be analysed and accounted for fully. In particular, constructionist approaches are

Organic Approach Deductivised…

343

generally usage-based: facts about the actual use of linguistic expressions such as frequencies and individual patterns that are fully compositional are recorded alongside more traditional linguistic generalizations.

The above claim is justified by the fact that CxG theory provides for the existence of rules alongside exemplars13. As a result, the hierarchies of inheritance in language as a system, as diagrammed after Croft and Cruse (2004) earlier in this section, will be based on rules as well as analogies, in a dual-system mode also argued for by Pinker (1999). Consequently, CxG-based grammar pedagogy provides for language learning based both on heuristic bottom-up browsing of incoming usage data as well as blueprinted by and anchored in a top-down formalist fashion by an overall system. Consequently, the other two problems, namely the fact that extra-lexical grammar is neglected as well as pattern extraction is based mainly – if not solely – on heuristic browsing (induction) will also find their solutions through construction grammar. This is because within such a rule-exemplar motivated system, a structure like the previously discussed present simple will, on the one hand, be a form-meaning pairing hinging on the actual language use and, on the other, it will be a part of a larger temporal system constrained by its overall architecture. Within this broader structure, the tense in question will constitute just one node in a broader network of other present as well as other simple tenses. All this takes us back to the interaction between the lexical and grammatical aspect, described in Chapter 4, in which the lexical and the extra-lexical levels blend and result in a new semantic value. Such a perspective on grammar is also in accordance with the idea of interlanguage development through system reassignment (Bardovi-Harlig 2000), in the course of which the emergence of a new form affects the entire semantic neighbourhood, a process which is of particular importance to the tense-aspect architecture. This is where construction grammar steps in, making it possible for form-focused pedagogy to go beyond the symbolic unit into the realm of a system interrelated through certain interconstruction properties. The existence of the system, in turn, enables tuition which can combine the inductive heuristic browsing with deductive, top-down teaching. This is because there are rules or certain far-reaching analogies which can be translated into the metalanguage of explicit instruction. At the same time, it is important to note that, unlike in grammar pedagogy which used the generative approach as the point of departure, the pedagogy based on CxG pattern abstraction and construction of system architecture does not need to go back to the quasi-mathematical algorithms that derive one structure from another. In this area construction grammar is definitely more “lexical” than ––––––––– 13

It differs from Langacker’s proposal in this equality of these two sides of the said continuum; Langacker, while acknowledging both ends, prefers actual use to schematisations.

344

Chapter Five

“generative” as the linguistic course of action to be taken is sanctioned by two important assumptions: the surface generalization hypothesis and the target syntax argument. In the light of the former “there are typical powerful generalizations surrounding particular surface forms that are broader than those captured by derivations or transformations” (Goldberg 2006: 22-23); according to the latter “it is preferable to generate A directly instead of deriving it from C if there exists a pattern B that has the same target syntax as A and is clearly not derived from C” (Goldberg 2006: 23). A similar solution is argued for by Willis (1990), who uses the example of the English passive voice to show that such sentences do not need to be derived from their active counterparts but can be related to intensive sentence constructions, with the -be + past participle pattern of the former reminiscent of the -be + adjectival complement of the latter. Grammar pedagogy based on the Lexical Approach – Willis being one of its proponents – will then suggest a reliance on the identified pattern resemblance rather than transformation in teaching of the passive. Construction grammar encompasses this strategy. However, it takes it further, showing the passive voice as a node in a much broader network of language as a system, as diagrammed in Figure 48, based on Mindt (2000). Finally, the problem to be attended to is the one of creativity and non-native intuition in operationalising some schemata and not others expressed in the critical remark that learners may not take full advantage of the creative potential of language if taught in a LA classroom. Figure 48. The passive as a node within a system (informed by Mindt 2000) Passive Aux Passive construction

Aux constructions

[Subj aux VPpp (PPby)]

He was elected president 14

Cat (get/become)

He was depresse Cat constructions

Passive construction

He got elected ––––––––– 14

catentative

He got angry

Organic Approach Deductivised…

345

In trying to specify why some patterns are widely used while others show limited productivity, construction grammar fully subscribes to Langacker’s (2000) “entrenchment” argument. Goldberg (2006: 102) writes: “a pattern can be extended to a target form only if learners have witnessed the pattern being extended to related target forms”. However, what is additionally important is (Goldberg 2006: 102): if the target form has not systematically been pre-empted by a different paraphrase … In choosing between allowable alternatives that are conditioned by multiple factors, speakers must combine multiple cues.

What the latter statement means is that a certain schema will be fully available for use if there is no “competing” form-meaning pairing at hand. This will, to a certain extent, clarify the difference between the -be + Ving imperfectivising pattern being productive for see and not so much for the semantically close hear. The point is that I have not been hearing about him is successfully pre-empted by the homonymous I have not been hearing from him with hearing effectively entrenched as receiving a message. Pre-emption of this kind is not activated in the case of see, in spite of the family of entrenched constructions such as I’m seeing my doctor tomorrow most probably because seeing one’s doctor and seeing somebody around here both imply visual contact with the individual in question. Such a “combination of multiple cues” once again demands operating from beyond the low-level “lexical” perspective and requires seeing the unit in question not in isolation but as a part of a broader system. Summing up, it can be said that, at least with regard to the four shortcomings of the purely lexical pedagogy, instruction informed by construction grammar – including a number of theoretical proposals put forward by Langacker himself – is a satisfactory compensation to the otherwise quite successful Lexical Approach. Considering the advantages, construction grammar has been chosen as the philosophy underlying the treatment applied in the course of the present research. 3. Organic Approach Deductivised As mentioned in the concluding part of Section 1, the other problem which needs to be addressed alongside the theoretical linguistic underpinnings of grammar pedagogy is the neglect as regards the acquisition of temporal form-meaning mappings in the advanced language learner. The present section attempts to approach the issue. In doing so, it starts by identifying the difficulties more proficient students face related to time talk, and goes on to explain the underlying reasons for such problems. Finally, it presents the details of an instructional treatment called the Organic Approach Deductivised and tries to justify it in relation to cognitive processes and learning modes as well as the bi-polar notion of semantics, conceptual and spatial.

346

Chapter Five

3.1. The advanced language learner. Indentifying the problem According to Housen (2002: 155-156), the learner’s task of mastering the tenseaspect system can be broken into two subcomponents: the form-to-form mapping task as well as the form-to-function mapping task. In the former case we are dealing with learning the various verb form categories and morphological paradigms of the target language. For instance, learners of English as a second/foreign language have to learn that the past and present tense forms of the English verbal lexeme have are had and has rather than haved and haves. When it comes to form-to-function mappings, they amount to learning the relevant temporo-grammatical meanings and discourse-pragmatic functions that are obligatorily or optionally expressed in the TL, and linking these to the appropriate morphological forms. Referring to Chapter 4 of the present work, we can say that, for instance, learners have to learn that the –ing form of an English verb can express, amongst other things, progressivity, habituality, futurity, continuity and backgrounding in narrative discourse. As Housen points out, the two above-mentioned components of time-talk proficiency are linked to each other; learner development, however, does not need to proceed simultaneously in both areas. Based on the research into advanced learners carried out by Housen (2002) herself as well as studies by Bardovi-Harlig and Bofman (1989) and Kilhstedt (2002) we can state these two types of mappings actually do not proceed simultaneously: learners who are formally proficient are often not fully aware of the semantic distinctions between different structures. As a result, in the interlanguage systems of such learners form will always precede meaning (Klein 1994). What should be pointed out here is that the form-meaning discrepancy is far from negligible. As Bardovi-Harlig and Bofman found out in their analysis of the essays of proficient learners – learners who, to use Kihlstedt’s (2002) words, demonstrate a considerable morphological mastery – errors of use were 7.5 times the rate of formal errors. To understand the reasons behind the asynchronous learning of the form-toform as well as form-to-function/meaning mappings, we need to look at the nature of form-meaning connections (FMCs) and the ways in which they emerge in the learner’s interlanguage. VanPatten et al. (2004) distinguish three possible types of such pairings: one form denotes one meaning; one form denotes multiple meanings; or multiple forms denote one meaning. As for the ways in which they are established, every FMC goes through the following stages (VanPatten et al. 2004: 5): making the initial connection; the subsequent processing of the connection; and, finally, accessing the connection for use. The last stage is where the discrepancy of form-meaning mastery in advanced learners can be spotted; the discrepancy which, it seems, results from noticing and processing problems during the first two stages. These problems, as pointed out by N. Ellis (2006: 49), result in four kinds of failures (the bracketed remarks are mine):

Organic Approach Deductivised…

347

1. failing to notice certain (FMC) cues because they are not salient; 2. failing to notice that a (meaning-connected) feature needs to be processed in a different way from that relevant to L1; 3. failing to acquire a mapping because it requires complex (meaning) associations that cannot be learned implicitly; 4. failing to build constructions as a result of not being developmentally ready. It can be stated that the first two failures occur at the stage of making the initial connection while the other two are connected with the subsequent processing of this connection. Similar constraints and related problems are described by Housen herself (2002; I relate them to N.Ellis’s failures). Based on a case study, she states that the influence of inherent semantic principles in the development of tense-aspect morphology interacts with – and may be overridden by – four different factors. Firstly, L1based predispositions and native-tongue form-meaning pairings can compete – and win – against the target language FMCs (N. Ellis’s failure 2). Secondly, the learner’s ability to make form-meaning connections will be affected by type/token frequency, distributional-combinatorial patterns as well as the salience and transparency of the mappings (N.Ellis’s failure 1). This issue was addressed in Chapter 4 where prototypical/non-prototypical meanings were discussed. As was argued there, the peripheral meanings of certain forms are unlikely to be acquired via exposure alone because these FCMs are rare, and lose in their competition against the prototypical form-meaning pairings15. The final two factors mentioned in Housen (2002) include the morphophonemic properties of the form or the fact that different mechanisms (e.g. for regular and irregular morphology) are involved in processing (N.Ellis’s failures 3 and 4). This can be explained with reference to the form-meaning trade off effects discussed in Chapter 1: if attentional resources are excessively allocated to a form which is complicated to compute or nonprototypical – as is the case of morphological properties – the focus on meaning may be close to zero. 3.2. Form-form and form-meaning discrepancy in the advanced language classroom All of the above observations, which were made in relation to – and are mainly relevant for – uninstructed second language learning, can be transferred into the area of foreign language pedagogy. On the one hand, if based on deductive, teacher-fronted, linear instruction, language teaching is very often focused on form-form mappings with almost total neglect of the semantic pole of form––––––––– 15

Based on the treatment applied by Bardovi-Harlig and Reynolds (1995), we can infer that researchers generally see the non-prototypicality of certain form-meaning mappings as a major acquisitional constraint.

348

Chapter Five

meaning mappings. The semantics of grammar, if attended to at all, is focused on through form-function connections in the sense that learners are instructed on when a certain form should be used to perform a given micro-function, like apologising or reporting; the question of why forms should be used to convey various personal meanings are rarely or never addressed. The fact that the contemporary language classroom puts a strong emphasis on this type of instruction being supplemented with the exposure element – which, as argued in Chapter 2, has the potential to result in more organic, experiential and partly implicit language learning, suitable for soft codes like semantics – may not suffice because of the already-mentioned limited exposure to the target language. The shortcomings of the two above-mentioned in-class procedures are discussed below. 3.2.1. The deductive, linear approach: the main culprit? As was mentioned above, the main underlying reason for the advanced student’s insufficient understanding of the semantics of grammar seems to be the deductive approach, omnipresent at Polish schools and in the available grammar reference books. Its main drawback is that language rules are normally offered with reference to where and when a certain structure is to be used, disregarding the why of the usage which, as indicated earlier in the present work, is connected with the personal meaning of the speaker. Such an approach to FL grammar results from – and can fully be justified by – the rather limited time the teacher has on his/her hands in the foreign language classroom, the result of which is that all classroom pedagogy has to take certain shortcuts. Consequently, as Turewicz (2000) puts it, pedagogical grammars, being grammars of performance, adjusted to the needs of the teacher “typically focus on techniques to be employed in the classroom, or characterise a chosen aspect of language structure, important from the perspective of communicative teaching” (17). The communicative demands on teaching result in language pedagogy which starts by offering algorithms and goes on to proceduralise these algorithms through practice16. As could be seen based on an extract on the use of the Present Simple tense cited from Thompson and Martinet (1990; see Section 2 for details), the learner’s knowledge of this tense is reduced to the following application strategies: Use the present simple tense for present time reference: – if you want to express a habitual action or a routine; – with when and whenever; – for verbs of communication like say; ––––––––– 16

or in language pedagogy which offers grammaticalised lexis in the forms of chunks and goes on to proceduralise their retrieval through practice.

Organic Approach Deductivised… – – – –

349

for stative verbs which are not allowed in the continuous form in newspaper headlines; in dramatic past and present narratives; in conditional 1 and time clauses.

While it cannot be denied that such an instruction mode has every chance to be immediately effective as regards its intake – the form-use correspondences being straightforward and easy to understand – one is left with the impression that such tuition is exaggeratingly mechanic in the sense defined by Rutherford (1987): linear, exhaustive and product oriented and, consequently, reinforcing the language-learner distance in the area of grammar semantics. All of this will result from the fact that being able to use the structure based on such a simplified, objective algorithm will actually blind the learner to the meaning of the present simple tense delineated in Chapter 4 as single simple totalities, which allows for the application of this temporal expression that goes beyond the seven uses listed above. Such meaning-blindness is additionally reinforced by the practice grammar books offer: rules are supported by one-sentence examples and tested in a similar one-sentence mode, which results in the so-called backwash effect: the learner learns rules from a book to be able to do practice tests in this book. Consequently, we can go as far as to claim that the learner will actually be able to pretend to know the target language grammar without actually knowing it, as they will be using the tense without understanding. The simile that comes to mind in this context is the one between such a language learner and the Turing machine17, a machine supposed to be capable of human-like interaction without being human. Consequently, the deductive approach to grammar based on simple rule algorithms can be subject to criticism similar to the one expressed by Searle’s (1980) Chinese Room argument. Referring to the concept of the Turing machine, Searle (1980) proposes that a person who understands no Chinese should be locked in a room into which written Chinese characters would be passed. In the room there would also be a book containing a complex set of rules – written in ––––––––– 17

The name Turing machine comes from its proponent, Alan Turing, a British mathematician, who devised such a machine in 1936; or rather put forward the idea for such a machine, as the Turing machine was meant to be a thought experiment rather than something technically feasible. As for the practical solutions used in the experiment, Turing based the idea of his machine on a traditional game, in which a judge (J – sitting in one room) was supposed to guess whether (s)he was talking to a man or a woman (A and B – sitting in two other rooms, and separated from the judge). The guess was based on the answers the two opposite-sex participants gave to the judge’s questions sent via a messenger (the messenger was needed to avoid direct contact between the judge and the two participants). Turing thought that a human judge (J) could try to guess, using the same question procedure, whether his/her interlocutor was a human being or a thinking-talking machine. If a computer could deceive a human being into mistaking it (him ? her ?) for another human, thinking-talking machines would be a fact.

350

Chapter Five

advance in the person’s native language – based on which the person could manipulate these characters, and pass other characters out of the room. This would be done on a rote basis, requiring following commands like When you see character X, write character Y. Searle’s point was that a Chinese-speaking interviewer would pass questions written in Chinese into the room, and the corresponding answers would come out of the room appearing to those on the outside as if there were a native Chinese speaker in the room. According to Searle, such a system could indeed pass a Turing Test, yet the person who manipulated the symbols would obviously not understand Chinese any better than he did before entering the room. Just as a person who speaks no Chinese could pretend to be able to do so on condition that they have been equipped with a sufficiently detailed instruction manual, a learner of English can simulate knowledge of grammar and its formmeaning mappings if they have at their disposal a detailed list of grammatical rules in the form of the algorithm of uses presented for the Present Simple tense earlier in this section. The whole idea can further be explained based on another learning situation (Lewis 1986) in which a correct solution of a mathematical task can be achieved without understanding the underlying mathematical phenomenon of fractions. Such learning without understanding, as Lewis argues, is the result of shortcuts taken in classroom where product overrides process. Lewis (1986: 1516; abridged) writes: How do you react to this: 3/4 : 1/8 In many British schools, students were taught the “rule”: To divide fractions, invert and multiply. If we apply that to the example above, we get: 3/4 x 8 = 6 which is the correct answer. The strange, and unfortunate thing is that many students could apply the rule to a set of examples, get the right answers but still not understand fractions. What was the point of the rule? Has it helped students to do the exercise? – probably. Does it help them to understand the underlying problem? – certainly not. Look now at the following problem: A cake is cut into eight equal pieces. Somebody has eaten a quarter of the cake. How many pieces are left? Most students find this question so easy that they can do it in their heads … The interesting thing is that the problem is exactly the same as the one which intimidates so many people when it is put in symbols, and done using a strange “rule”. The same problem may be put in three ways: The cake problem described above. In words: How many 1/8ths are there in 3/4? In symbols: 3/4 : 1/8.

Similarly, based on the present simple rules proposed by grammar books (e.g.: Thomson and Martinet 1990; cited earlier in this section), a learner may pretend to use them meaningfully without actually being able to do so.

Organic Approach Deductivised…

351

3.2.2. The organic, input-based approaches: another culprit? The fact that form-meaning mappings – unlike form-use mappings –are not oneto-one arbitrary designations, together with the observation that they belong to the softer end of the code continuum presented, based on Nalimov (cf. Chapter 2 and introduction to Part II) may suggest that a more organic language instruction bordering on implicit learning resulting from exposure to authentic language input is an option compensating for the deductive form-focused teaching presented in the previous section; compensating in the sense that meaning is going to emerge on its own even if not focused on in such instruction. One of the proponents of the organic approach, Nunan (1996), critical of the linear, decontextualised view of grammar of many textbooks, emphasises the importance of the learner’s “active engagement in discoursal encounters” (73). Based on such encounters, learners “form working hypotheses about form and function” (74) and learn to understand grammar as choice rather than a set of consciously memorised rules. What happens – or rather is supposed to happen (the reasons for this scepticism will be presented later in this section) – in the area of the semantics of grammar as a result of such an organic context-informed pedagogical approach can be explained in terms of the Frame Theory described in detail in Chapter 3. As was described there, Minsky proposes to “substructure knowledge into ‘micro-worlds’” (1974: 1) and refers to such information chunks as frames with fixed top levels are fixed as well as numerous terminals – slots filled with specific information not available within the frame based on default assignments, assumptions about the situation in question which come from previous experience. Revisiting the concept put forward by construction grammar, that every construction – regardless of its place in the token-type, instantiation-schematisation hierarchy – is a form-meaning pairing, we can stipulate that default assignments will be a part of the semantic framing of every (deductively or inductively learned) grammatical structure, temporal expressions like the already quoted present simple tense included. However, based on Fillmore’s (1982) division into cognitive and interactional frames as well as Fauconnier’s (1994, 1997) notion of space builders, it seems possible that in the case of grammatical forms default assignments will contain both possible syntagmatic combinations of a given form as well as various related connotations such as feelings, presuppositions and attitudes that accompanied our previous encounters with the structure in question. In detail, as I propose in another publication (Turula 2009d), the abovetheorised emergence of form-meaning connection will proceed in the following way (Figure 49): on the basis of numerous encounters of a form in context, a certain schema – understood here as a conventionalised conceptualisation of various situations that a given grammatical structure has belonged to (Langacker 1987) – is extracted and becomes what Minsky (1974) calls a stereotyped entity included in a frame. This schema is later recognised in other instantiations of the form in question,

352

Chapter Five

and the broadly understood context of the previous encounters is recollected in the form of the already mentioned default assignments. As a consequence, when a language user encounters a given form in context, (s)he perceives its meaning based on a certain already-existing conceptual framework, which comprises not only the context itself but a number of the afore-mentioned default assignments (D) attached to the form in question. In this way, a full understanding of a given grammatical structure is ensured by both the schema as well as the context and the resulting default assignments, the latter constituting the borderline delineating the semantic frame of the structure. This, as indicated in Figure 49, is a semantic frame of a language user whose exposure to language can be described as massive, as in the case of a native speaker of the language in question. Figure 49. The conceptual frame of a native speaker (NS) for a given grammatical structure

D

Tense and aspect D

D

CONTEXT

D

D

D

D D

meaning - native speaker’s conceptual frame;

D

- default assignment

Unfortunately, promising as it may seem, such a cognitive scenario is not fully applicable to foreign language learning. The input-based, non-linear browsing of language data is unlikely to be fully conducive to an appropriate conceptualisation of meaning in controlled FL contexts. There are two important reasons for this. First of all, what has already been pointed out on a number of occasions, nativelike default assignments cannot be formed because, in the case of classroom-based instruction, exposure to input is far from massive and language data is too sparse for the implicit tallying (N.Ellis 2002) to be performed by “the language learner as an intuitive statistician” (Harrington and Dennis’s 2002: 261). Consequently, no

353

Organic Approach Deductivised…

statistically relevant hypotheses about language as a system can be made. Secondly, classroom-based foreign language learning usually takes place at what we can call the post-conceptualisation age, when certain conceptual frameworks are usually formed in the course of L1 development. Consequently, the non-native speaker’s access to the above-mentioned target language default assignments that facilitate the perception of meaning is usually – at least partly – blocked by L1 defaults. As diagrammed in Figure 50, the non-native speaker’s conceptual frame does not include some of the default assignments (D) typically utilised by a native speaker. To compensate for this, such a language user has to refer to his/her mother tongue, as well as whatever L2/FL experience (s)he has, to look for substitute default assignments (d), generalisations which which may be intralanguage (FL/FL) – as well as interlanguage (L1/FL). As a result, the conceptual framework is inferior to the one conceived by a native speaker and, consequently, the perceived meaning may, to a certain extent, be distorted. A proof of such faulty, L1-based defaults assigned to the semantic frame of tense and aspect by Polish learners of English as a foreign language is the overuse of the past progressive tense, resulting from the progressive/imperfective confusion, which Leko-Szymaska (2004) diagnosed at such a level of advancement as English Philology. Figure 50. The conceptual frame of a non-native speaker (NNS) for a given grammatical structure D

Tense and aspect d

d

D D d

CONTEXT

D D d D

d D

meaning

NNS conceptual frame

- the nat

- the native conceptual frame; D - the native default - the non-native conceptual frame; d - the non-native default

354

Chapter Five

3.3. Teaching the semantics of grammar. The treatment As was argued in the previous two subsections, the top-down deductive approach seems to fail in helping the foreign language learner establish form-meaning connections and the bottom-up, organic acquisition of linguistic data appears to be only partly successful. The failure of both these approaches – even in their contemporary FFI combine – results from their weaknesses: in the case of explicit instruction it is the neglect of semantics as the interface between form and use, mediating between the two; the implicit organic approach, in turn, is deficient because of insufficient exposure, which affects the most the marked/peripheral FMCs like the mentioned past progressive or non-prototypical uses of popular structures like present simple. It seems justified to say that effective pedagogy will require relying on the strengths of both ends of form-focused instruction as well as compensating for their drawbacks. As far as the strengths are concerned, explicit focus on form has a number of advantages (cf. Chapter 2), the most important of which seems to be what Ellis (1994; 2003) calls blueprinting the territory; on the other hand, the main advantage of the input-based, organic approaches is its empirical grammar pedagogy, data-driven and statistically intuitive (implicit) learning, which, in the case of a soft language code like semantics, are necessary components of any effective treatment. When it comes to compensation, it is hardly possible to increase exposure in controlled learning environments significantly; we can, however, enhance input on meaning and raise semantic consciousness, thereby making up for the quantity of encounters by means of the quality of exposure. It seems possible to fill in the instructional gap between form and use by means of a deductive treatment of the semantics of grammar. Such a treatment would mean offering explicit instruction from the two different directions delineated by Jackendoff (2002, 2004): conceptual semantics and spatial semantics. In the case of the former, focus on meaning will imply presenting the learner with the already-mentioned primary semantic characteristics as well as drawing their attention to tokens in a situational, frame-like context. In all this, it also seems necessary to take advantage of cultural learning typical of our species (Tomasello 1999 and later works). Consequently, the proposed Organic Approach Deductivised will be a combination of the top-down and bottom-up approaches consisting of three components: exposure/discoursal encounters; analysing the form that carries the meaning – the primary semantic characteristics, as proposed by Lewis (1986) are used as a given – while simultaneously flooding the learner with situational contexts in which the form appears; and hypothesising by means of Tomasello’s mind reading – based on both the primary and the contextual meaning – aimed at identifying the possible default assignments of the author of the text when (s)he selected this particular form to carry this/her intended meaning.

355

Organic Approach Deductivised… Figure 51. Three steps of meaning learning Step 1 Tense and aspect

CONTEXT meaning Step 2 D

D

Tense and aspect

CONTEXT

D

D

D D

D

meaning D

Step 3

D

D

Tense and aspect

CONTEXT

D

D

D D D

meaning D

356

Chapter Five

As I argue elsewhere (Turula 2009d), both the analysis and the hypothesis formation – aimed at the re-adjustment of the NNS semantic frames – should be carried out in three steps, which – as seen in Figure 51 – reverse/deductivise the originally bottom-up process of frame formation (described in Section 3.2.2). In step 1, the contextual meaning is analysed based on the primary semantic characteristics of a given form (for example, the already mentioned present simple understood as “single, simple entities, unities, totalities”). In step 2, the probable default assignments of the speaker are hypothesised, which means that learners try to figure out why discourse participants use a certain – and no other – structure in a given context. These default assignments delineate the boundaries of the nativelike conceptual frame in step 3 (Figure 51). In classroom practice, a treatment underlying such a three-step pedagogy includes: – teaching the conceptual semantics of a given form by presenting the primary semantic characteristics of the form included in all of its uses – the deductive, explicit FonF component; – examining the form in multiple situational contexts in order to reconstruct the potential default assignments motivating its choice. I propose that in foreign language teaching contexts the best way to expose the learner to a varied, event-based target language input is by means of film watching. Consequently, a hundred-strong collection of Anglophone films was chosen as study material – the organic, implicit/inductive FonF component; and – reconstructing the semantic frame of the given structure as delineated by the default assignments, through an analysis of the data which starts with whole class discussions of the presented video material to go on to individual learner analysis and learner/teacher exchanges utilising corrective feedback – the cultural learning component. Such a tripartite structure of Organic Approach Deductivised complies with the theoretical underpinnings presented in Parts I and II in a number of ways. Firstly, the devised treatment provides for the earlier-described two-faceted approach to attention: focal/intensive and global/extensive. On the one hand, individual temporal expressions are focused on discretely, along the lines of focus on formS preferred by VanPatten (1996) and DeKeyser (2004), which was chosen for the present treatment as auguring better in a FL classroom than the more communicative focus-on-form. On the other hand, Organic Approach Deductivised exposes learners to these grammatical formS in genuine, communicative contexts and, as a result of such holistic encounters with situational gestalts, gives them a chance to become globally aware of the different aspects of the input chunks which are out of the main focus, in the shadow around the spotlight, the outer shells/peripheral layers of the attention circle.

Organic Approach Deductivised…

357

Secondly, the semantics of grammar that is the object of the tuition is treated holistically and all form-meaning mappings are seen as “continuities between language and experience” (Petruck 1996: 1). This is achieved by the combination and interaction between the conceptual and the spatial/experiential/perceptual components of the treatment. The former is catered to by means of explicit instruction in primary semantic characteristics of time talk; the experiential semantics, in turn, is attended to by means of both exposure and the reconstruction of the potential default assignments of the speaker. The two modes together supply the necessary reference points for semantic processing and reinforce conscious conceptualization by offering possible defaults. What is of additional importance here is that the experience planned as part of the grammar instruction involves the visual input provided by film watching. This supplements the instruction in the semantics of grammar with the perceptual component which, as argued by Dbrowska (2004), is crucial in any meaning construction. If we go back to Regier’s (1996) three sensorimotor constraints on processing – orientation-combination; map-comparison; and motion-trajectory – we can propose that the understanding of form-meaning mappings in time talk, which will potentially develop in the advanced language learner exposed to the Organic Approach Deductivised treatment, will hinge on three types of perception-based processes. First of all, temporal expressions used in a watched film episode will be conceptualised in terms of distance – emotional and social – between the scene participants – as well as relations: interpersonal relations between the interlocutors and their multifarious interactions with the broadly understood environment (the orientationcombination structure). These conceptualisations will incorporate such meaning determinants as deixis or the optimality/egocentricity issue. The scene will additionally be organised in terms of viewpoints and perspectives of its participants (the motion-trajectory structure). Finally, while watching, learners will – for comparison – map the situational frames of individual scenes onto the frames of previously watched episodes (the map-comparison structure). In this way conceptual and spatial semantics (Jackendoff 2002, 2004) of the English time talk will be mutually supplementary. Thirdly, it needs to be mentioned that the treatment displays some important characteristics of implicit learning discussed in Chapter 2. If considered against the implicit learning paradigms, the extensive study of the Anglophone film corpus seems to be the closest to the control of complex systems. To update the reader briefly on the characteristics of these systems, they, as described by Quesada et al. (2005: 6; following Frensch and Funke 1995), are: – dynamic, because early actions determine the environment in which subsequent decisions must be made, and features of the task environment may change independently of the solver’s actions;

358

Chapter Five

– time-dependent, because decisions must be made at the correct moment in relation to environmental demands; – and complex, in the sense that most variables are not related to each other in a one-to-one manner. Similarly, the Organic Approach Deductivised as a pedagogic proposal is: – dynamic, as the dynamics of the students’ grammatical choices are influenced by the initial gestalt impression of each watched scene as a whole; – time dependent, as the data processing takes place online (while watching a film), immediately upon evoking a certain semantic frame necessary for the interpretation of grammar semantics of a given structure; – complex in the sense that it requires simultaneous management of different facets of every tense/aspect-related task, the focus being on: pre-taught grammar semantics; the context; possible default assignments of the speaker(s) in the scenes being analysed; and the effect their words have on the interlocutor(s) on the one hand, and the number of potentially relevant items within the studied frame on the other. If we assume that the first of the above-listed elements is salient while the rest are rather opaque, we can say, following Berry and Broadbent (1987, 1988), that while the former is learned explicitly, the latter are good candidates for implicit learning. The knowledge acquired in this way, as in every CPS task (Diennes and Fahey 1995), is based on implicit memory for specific instances. Such instances, in the light of instance-based learning theories (Logan 1988, 1990), differ from rule-based approaches in their understanding of skill formation: they are seen as the storage and deployment of particular episodes and instances rather than the learning of abstract rules. Considering the fact that Organic Approach Deductivised is a pedagogic treatment based on an extensive study of a corpus containing Anglophone films, it can be said that the accumulation of episode-based knowledge is a natural consequence of the treatment. In addition to the above, it needs to be observed that the process of default assignment reconstruction, especially the one carried out in class or under the influence of the teacher’s corrective feedback (both of which are part of the treatment), resembles the instruction procedure used in the implicit learning study of the Ideal Yoked subject carried out by Druham and Mathews (1989) which was described in Chapter 2. When it comes to the reconstruction of default assignments, the process of deducing – based on a broad situational context – the rationale for the speaker’s choice of a given temporal expression was by no-means rule-reliant and resembled a classifier system explanation more than an allapplicable algorithm; besides, as far-fetched as it is, a comparison can be made between the original subjects of the study cited and the tutor/the class on the one hand and the yoke subjects and the individual participants of the Organic Approach Deductivised treatment on the other.

Organic Approach Deductivised…

359

Finally and very importantly, it needs to be pointed out that grammar learning such as the one expected to take place in the course of the proposed treatment is in agreement with the principles of human cultural/imitative learning as are proposed by Tomasello (1999). To briefly update the reader on the theoretical model at issue, Tomasello argues that human beings, who have a theory of mind, know that the actions of other people are as intentional as their own actions. Consequently, based on their observation of other people, human beings are ready to imitate both what is done and why it is done or, in other words, to understand the result/end of an action as well as the means to achieve this end and the intentions behind the act. A similar process of learning by means of stepping into other people’s shoes (Tomasello 1999) is expected in the case of the Organic Approach Deductivised, in the course of which the advanced language learner familiarises him/herself with the way of using time talk in different contexts together with the pragmatics of individual forms, and they do so by following the observed speaker’s intentions together with the means (s)he uses to achieve his/her communicative goals. All of the above-discussed characteristics of the proposed treatment are in agreement with the recommendations put forward by contemporary form-focused approaches as regards the combination of instruction with exposure to authentic language in communicative contexts. Such pedagogy can be labelled 2-D, as it comprises the didactic element with demonstration. Based on the model, the Organic Approach Deductivised takes it further towards what can be called a 3-D form-focused instruction: didactics (the deductive element), demonstration (the organic element) and default, the final D – the cultural learning through mind reading – being the innovative element in the currently popular FFI mode. The efficiency of the OAD approach was investigated in three studies, whose design, implementation and results are presented in Chapter 6.

CHAPTER SIX IMPLEMENTING ORGANIC APPROACH DEDUCTIVISED: THE STUDIES As was mentioned in Chapter 5, the studies which are going to be described in the present chapter were carried out in accordance with the American time-talk research tradition, in which particular temporal forms – and not temporal meanings – are a point of departure for the subsequent study of form-meaning pairings. There are, as was indicated in the introduction to the present work, two more parallels between the studies described in the present chapter and the tradition within which they are accommodated. First of all, the focus of the Organic Approach Deductivised studies is on the product of the treatment as well as, like other time talk studies (cf. Bardovi-Harlig 2000), on the cognitive processes accompanying the treatment implementation: on gradual emergence of certain grammatical choice tendencies. Secondly – and very much in relation to the assertion above – the present studies, like the majority of the said research to date, are small-scale, qualitative rather than quantitative, and based on data collected from a small number of testees through tests and verbal reports. The decisions underlying the research design have been taken in the belief that while large-format research efforts provide us with invaluable statistically significant data, small experimental studies are a worthy supplement offering insights into phenomena which would otherwise remain uncovered. Another framework within which the investigation will be situated is the one of form-focused instruction – especially the one that inclines towards focus on formS (VanPatten 1996; DeKeyser 2004) – with grammar semantics treated as the pedagogical focus. The present chapter describes the three studies carried out in the course of the present research, and demonstrates as well as discusses their results. The first five sections present the research hypotheses and questions, introduce the reader to the variables, the samples, the procedure implemented as well as the instruments and research methodology. The main part of the chapter presents the results, their analysis as well as the resulting conclusions. 1. Research design The research into the effectiveness of the pedagogical treatment called Organic Approach Deductivised consisted of three studies carried out as action research in the course of two classic experiments. Both experiments involved the treatment as well as three evaluation procedures: a pre-test, a post-test and a delayed post-test. The aim of Study 1 was to establish the initial quantitative distance between the NNS participants and a group of native speakers of English as regards their formal choices in the area of time talk as well as to describe the fluctuations in

362

Chapter Six

this distance resulting from the experimental treatment applied. Study 2 was a replication of Study 1. Study 3 looked at the above-mentioned distance and its alternations in terms of rationales (the why) the testees gave for their grammatical choices in introspective verbal reports. 1.1. Research questions and hypotheses The three studies implemented in the course of the described research were carried out in search of the answer to the question about the effectiveness of the Organic Approach Deductivised treatment (introduced in the previous chapter and described in detail in Section 1.4 below) in the area of NNS’ understanding of English grammar semantics. However, the said research also, and even more importantly, was expected to offer insights into how the conceptualisation of time talk changes as a result and in the course of the said treatment. Both these issues: the product (i.e. the effectiveness and of the treatment) and the process (i.e. the change in the NNS conceptualisation of English grammar semantics) were understood throughout all three the studies as a consistent change in the subjects’ tendencies to choose certain temporal expressions over others on grammar tests, as well as to offer particular rationales for their choices in test-related verbal reports; both types of changes were operationalised as fluctuations in the mathematically calculated distance between the experimental groups and their NNS controls on the one hand and the NS controls on the other. All distances, were analysed for their statistical significance. Related to the above research questions are the following hypotheses: – Hypothesis#1: There will there be a gap – to use Schmidt and Frota’s (1986) term – between the native speaker (NS) and the non-native speaker (NNS) groups on the pre-test. The distance will manifest itself both quantitatively – in terms of what temporal expressions both groups choose for given contexts – as well as qualitatively – in how the testees justify their grammatical choices. – Hypothesis#2: It is possible to diminish the NS/NNS distance by means of the treatment under investigation called Organic Approach Deductivised. The distance can be changed in both quantitatively and qualitatively. – Hypothesis#3: The treatment – owing to the combination of focal and global attention, the specificity of its procedure (reminiscent of the control of complex systems), instruction based on classifier systems as well as the mind-reading component – will result in explicit as well as implicit knowledge of FL grammar semantics. The latter kind of knowledge will be time-resistant, not easily verbalisable, and quantitatively it will be subject to native-speaker bias. – Hypothesis#4: The effect of the treatment will be a sum total of numerous qualitative changes induced by the combination of all components of the treatment – the organic, the deductive and the mind reading.

Implementing Organic …

363

The three studies described in this chapter were carried out in the hope of corroborating the above hypotheses and gaining insights into a much broader problem of how to teach grammar to advanced language learners. 1.2. Research variables There were a number of independent and dependent as well as moderator variables in the 3-study research. The independent variables included the treatment referred to as the Organic Approach Deductivised used with the two experimental groups in addition to their regular course in the practical grammar of English; the NNS control group underwent only the traditional instruction pedagogy1+exposure normally offered in the course of philology studies. The dependent variables contained both quantitative and qualitative fluctuations in the area of grammar semantics in the experimental groups, measured by means of a grammatical test and operationalised as the difference in distance between test scores of these groups and the control NS and NNS groups. As the effect of the treatment was measured twice, on a post-test (immediately after the treatment) as well as a delayed post-test (a year after the treatment), time was treated as the moderator variable. Another intervening factor was the study programmes the NNS groups – experimental and control – covered in the 18 months in which the described research was implemented. As the main interest of the research was group – rather than individual – change, individual differences of the testees were not taken into account. 1.3. Research participants As was mentioned above, the present research consisted of three different studies. This is why, research participants will be presented as belonging to six samples in relation to these three studies. Study 1 The samples participating in the first study were: – Sample 1. The NNS experimental group – labelled E1 throughout Chapter 6 – second-year students of the University of Bielsko-Biala, Poland. N=24 (24 present on the pre-test; 23 – on the post-test; 19 – on the delayed post-test); ––––––––– 1

A course in the practical grammar of English, taught in the presentation+practice mode, the latter based – among others – on Foley, M. and D. Hall. 2003. Advanced Learners’ Grammar. London: Longman.; Thomson, J. and A. V. Martine. 1990. A Practical English Grammar. Oxford: Oxford University Press.; Hewings, M. 1999. Advanced Grammar in Use. Cambridge: Cambridge University Press.; and Vince, M. and P. Sunderland. 2003. Advanced Language Practice. Oxford: Macmillan.

364

Chapter Six

– Sample 2. The NNS control group – C – second-year students of the University of Bielsko-Biala, Poland. N=21 (18 present on the pre-test; 19 – on the posttest; 21 – on the delayed post-test). – Sample 3. The NS control group – US – American students, English majors, seniors (year 3). N=63 (Test 1 [used as pre-test and delayed post-test]: 29 students of Transylvania University, KY; Test 2 [used as post-test]: 34 students of Asbury College, KY). Study 2 – Sample 4. The NNS experimental group – E2 – second-year students of the University of Bielsko-Biala, Poland. N= 25 (21 present on the pre-test; 25 – on the post-test; 23 – on the delayed post-test) Samples 2 and 3 were used as controls in Study 2. Study 3 – Sample 5. The NNS experimental group – E2/NNS – second-year students of the University of Bielsko-Biala, Poland. N=10 (selected from Sample 4) – Sample 6. The NS control group – NS – 10 native speakers of English, all continuing teachers of EFL (at least one year of teaching experience), 2 academics, 6 DELTA/CELTA-certified, 2 in the process of getting their teaching qualifications. The represented varieties of English included: British English (4 testees), American English (3 testees) and Australian English (3 testees). In addition to the above, two native speaker teachers of EFL, both fully qualified (DELTA/CELTA-certified) and representing two varieties of English (BE and AuE), were asked to offer grammaticality judgments on the chosen temporal expressions. These judgments were used in the analysis of the quantitative data. The sampling of all groups can be described as purposive because all six samples were selected based on certain criteria (described below). The experimental and control groups (E1, E2 and C) were pre-established university groups. Second-year students were selected for two important reasons. First of all, they had already had their practical English grammar tuition in temporal expressions (year 1 programme). Secondly, all three groups had been subjected to a kind of standardizing procedure in the form of the end-of-year-one practical English examination at the level of C1. Taking into account the second criterion, the students’ history of language learning was not investigated because it was unimportant to the present research in how many years its participants reached the level required for successfully completing year 1 of their studies in terms of the practical knowledge of English2. ––––––––– 2

A typical Polish learner will have learned English for 6-12 years before entering a university

Implementing Organic …

365

What is important to point out is that the experimental and the control groups covered two different study programmes in which the control group – with Anglophone literature and culture as their majors – had considerably more exposure to authentic language (literature, film, media) than the two experimental groups, who did ELT methodology as their study specialisation. The programme differences are presented in Table 113 below: Table 11. Study programme differences between the experimental and the control groups. The experimental groups: 30h of British and American culture and history 60h of British and American culture and literature

The control group: 30h of British and American culture and history 60h of British and American culture and literature 45h of Canadian culture and literature 60h of seminars in Anglophone culture and literature

The US sample (Sample 3) consisted of two groups of students who in a way mirrored the NNS samples: like the Polish testees, the American students came from two university-established groups; they were in year 3 of their English studies. Their number – approximately 30 for each group tested – was close to the number of Polish testees. As for the group of native speakers (NS; Sample 6), they were all teachers of English as a foreign language in Poland with the required qualifications (CELTA/DELTA) or in the process of completing their certification; they had at least one year of teaching experience. The two aspects – professional competence and TEFL practice – were seen as important in order to guarantee access to the declarative/declarativised knowledge of English grammar. The number of these testees – 10 – was designed to mirror the number of student subjects (NNS; Sample 5) in the verbal report-based study. The above qualifications and experience criteria were also applied in the selection of the two native speakers who performed the grammaticality judgments. –––––––––

3

(6-16, if we count in kindergarten lessons). As demonstrated yearly by the results of school leaving exams in Poland, the level of the learners is anywhere between A2 and C2. All of this shows that the answer to the question of how many years of English the testees have had does not tell the Polish researcher much about the level of English in a given group. That is why a diagnosis of the current level of the learners was considered to be much more revealing. Subjects not requiring exposure to authentic language material are not listed as irrelevant to the study.

366

Chapter Six

1.4. The treatment During the experiment, which lasted for one term (October-January), the E1 and E2 groups underwent three Organic Approach Deductivised sessions. Each session consisted of 4 components: – classwork (1) – during which primary semantic characteristics as well as the various meanings of given grammar forms were presented (examples of both were described in Chapter 4) – 90 mins/a particular type of structures (e.g.: the present tenses); – homework (1) – which required every student to watch an Anglophone film of their choice and notice form-meaning mappings of the temporal structures in focus (present tenses, past tenses or future tenses). Every student watched a different film; the uniting factor was the given form-meaning mappings (e.g.: the present tenses). The suggested watching mode was with English subtitles on so as to avoid form/content trade-off effects resulting from the selectivity of attention. Students were asked to choose and quote in writing 5 coursebook (model; described in pedagogical grammars) uses and 5 surprising uses of each of the given temporal expression (surprising = not complying with – or violating – the rules the students knew). In each case, the cited form had to be placed in the situational context in which it was used, with a brief introduction of when/where/why/etc. the quoted exchange between the characters took place. Students were also asked to highlight (bold) the time-talk form that was currently under investigation (for a sample cf. Figure 52A). Estimated time – 180 mins/a particular type of structures (the present tenses). All selected uses were emailed to the class tutor, who compiled them into a time-talk corpus; – classwork (2) – during which selected form-meaning mappings from the said corpus were discussed, with special attention being paid to the mappings which had been considered “untypical”. In the course of the presentation, the group and the tutor deliberated on the why of the use presented and tried to reconstruct the default assignments of the speaker by guessing the intentions of the interlocutors – 90 mins/a particular type of structures (e.g.: the present tenses); – homework (2) – the purpose of which was an analysis of the student-selected film snippets compiled into a corpus of samples by the class tutor. Each participant was asked to select a number of snippets for comment. Such student comments were e-mailed to the tutor for feedback. Estimated time – 90 mins/a particular type of structures (e.g.: the present tenses). A typical dialogic exchange between a treatment participant and the tutor is presented in Figure 52B.

Implementing Organic …

367

Figure 52. A sample dialogic exchange between a treatment participant and the tutor A Gracie Hart’s boss tells her that she is the best person to be a new face of the FBI and when she doesn’t agree, he adds: ‘Hart, I've been getting five calls a day for you to be on talk shows.’ (Miss Congeniality 2) B The student: I think that present perfect continuous is not suitable for this sentence; probably it’s used here to put emphasis on the fact that those phone calls are annoying for Hart’s boss. The tutor: Possibly; or there are so many phone calls that it seems to be one continuous phone call – hence the present perfect continuous tense.

In addition to this, it is important to point out that during the experiment students wrote three why tests, in which they had to give the rationale for a number of form-meaning mappings. The tests were graded, weighed and incorporated in the final course grade. The rationale for all the components as well as individual techniques used in the course of the treatment can be found in the theoretical part of the book (as well as summarised at the end of Chapter 5). To briefly recapitulate the most important aspects: – The off-line processing mode – including having the English subtitles on, the possibility to pause the film and the ability to perform the tasks at the learner’s own pace – was a way of catering to constraints on attention so that the limited resource could be allocated to both form and meaning; prolonged processing time – treated as the fourth dimension on the attentional capacity (cf. Chapter 1) – allowed for this. – In general terms, Organic Approach Deductivised was an input-based treatment with a presentation-practice-comprehension design reminiscent of VanPatten’s Processing Instruction. When it comes to the focus-on-form techniques utilised in the course of the treatment, they included: explicit teacher and peer tutoring; input flooding; visual input enhancement and corrective feedback. – The treatment, following the postulates presented in Ellis (1994) as well as the numerous publications included in Doughty and Williams (2004), offered focus on form in the communicative context. This meant that the learners could rely on both the intensive and extensive attention; explicit and implicit learning.

368

Chapter Six

– As for the semantics of time talk that were taught, a number of assertions can be made. First of all, as mentioned in Chapter 3, meaning was contextualised, bipolar, conceptual and experiential. Secondly, every new element of time talk was placed within the system, every present tense sharing some meaning facets with other present tenses and every continuous tense – with other continuous tenses. Grammar was treated as a question of choice, always primarily traced back to the speaker’s intentions and not to a set of objective rules. The discussed grammatical temporal structures included: the present simple, the present continuous, the present perfect (simple and continuous), the past simple, the past continuous, the past perfect (simple and continuous), the future simple, the future continuous, the future perfect (simple and continuous), and the be going to form. – Finally, the considerations of the potential default assignments of the participants of the cited dialogues – which were part of the tuition – allowed the participants of the treatment to approximate native-like ways of thinking, to step into the NS shoes and – to use Tomasello’s (1999, 2005) term – to read the native speaker mind partaking in the experience of making and understanding the meaning. This last element of the treatment hopefully made it possible for the learners to enter the realm of intersubjectivity, whose importance in deciphering grammar semantics was emphasised, after Sinha (1999) as well as Tomasello and collaborators (numerous cited sources), in Chapter 3. 1.5. Research instruments, data collection and NS/NNS distance calculation procedures 1.5.1. The test format and the data collection procedures Both the pre-test, the post-test and the delayed post-test were gapped grammar tests based on extracts from Anglophone films. Two different tests (see Appendix 1) were used: Test 1 and Test 2. The two tests were done by the following groups: Test 1: – NNS groups E1, E2 and C – as the pre-test and the delayed post-test (Studies 1 and 2); 10 students selected from E2 provided their verbal reports based on it (Study 3); – students of Transylvania University, KY – to provide background for the pretest and the delayed post-test done by the NNS participants of the treatment (Studies 1 and 2); and – 10 native speaker teachers – as the background for the verbal report-based Study 3, pre-test and delayed post-test.

Implementing Organic …

369

Test 2: – NNS groups E1, E2 and C – as the post-test (Studies 1 and 2); 10 students selected from E2 provided their verbal reports based on it (Study 3); – students of Asbury College, KY – to provide background for the post-test done by the NNS participants of the treatment (Studies 1 and 2); and – 10 native speaker teachers – as the background for the verbal report-based Study 3, post-test. A sample unit is presented in Figure 53. Figure 53. A sample test unit with two gaps A woman who has long refused to marry the commodore is trying to win a favour with him. That is why she says: ‘Commodore, I [1] ______________(beg) you please, do this for me as a wedding gift.’ The woman’s father, who’s listening to the conversation and is very much in favour of the marriage, asks (hopefully) ‘Elisabeth, [2] _________you_______________ (accept) the commodore’s proposal?

An extract, like the one quoted above was treated as one testing unit. The pre-test consisted of 10 such units; the post-test was a 12-unit test. The number of gaps in each unit was between one and five. Both tests, though by no means fully equivalent as each had to contain different situational snippets, were constructed in such a way so as to contain the same number of gaps – 23 – and a comparable number of uses of all English tenses and aspects with present, past and future time reference. When it comes to data collection procedures, in Studies 1 and 2 the testees – both native and non-native speakers – were asked to fill in the gaps in the test with suitable forms of the verbs in brackets. The answers were not judged as regards their appropriateness; the only factor taken into account were the differences between groups in the area of the frequency with which particular temporal expressions were chosen. In Study 3, the pre-test, the post-test and the delayed posttest, the above procedure was supplemented by means of verbal reports in the form of TAPs (think-aloud protocols) delivered by the testees. In the experimental group (E2) they were 10 volunteer participants4, while the NS control group consisted of the 10 NS teachers of English. Their verbal reports consisted of the rationales for all grammatical choices. The rationales, offered asynchronously (=on the earlier written test), were recorded and transcribed, a procedure which provided research material containing 15h of audio-data and an approximately 35,000-word transcript. ––––––––– 4

The group of ten volunteers remained unchanged throughout the three tests. One replacement was made on a delayed post-test because one testee had dropped out.

370

Chapter Six

In the course of data analysis, the rationales offered by the testees were annotated and ascribed to one of the ten categories (Table 12). The categorisation was verified by two independent judges. Table 12. Categories of the rationales offered by the NS and NNS testees in Study 3 Category of rationale offered R – rules (which, in fact stand for the clichés repeated after pedagogical grammars currently in use) S(i) – semantics; interpretation (all effort aimed at determining the rationale behind the speaker’s choice of this particular, and not any other, grammatical form) S(t) – semantics; translation (translating the grammar form into its L1 equivalent) [A] – attempt (unsuccessful, at rules or semantic interpretation SG(kw) – sounds good because of certain key word(s) SG(f) – sounds good because it is part of a formula/lexical chunk SG(tx) – sounds good because as part of the text SG(e) – sounds good because it fits into the event schema SG(i) – sounds good but no explication is offered (intuition) TT – (overt) transfer of training (the justifications offered by the NNS on post-test which directly refer to class discussions of form-meaning mappings)

Sample answer The speaker uses the present continuous tense because what is happening is happening at the moment of speaking. The future simple tense is used for firmness.

‘I beg you’ meaning ‘I implore you’; or rendering the utterance in Polish (the NNS testees’ native tongue) any kind of form or meaning error Present perfect sounds good with ‘never’. Present perfect sounds good with this particular verb. The be going to form is used here for stylistic reasons: it’s used throughout the passage and it sounds good and consistent. ‘I’m going to come in’ is what my mother would say when she wanted to enter my room. It simply sounds good. I can’t explain why. This is continuous for politeness. We talked about it in our class.

It has to be admitted that the category distinction which posed the greatest interpretational problems was the proper classification of rules as opposed to semantic interpretations5. In search of appropriate landmarks or reference points, which would enable proper and consistent categorisation, the classification was based on the form-function algorithms of use found in pedagogical grammars for the R category, and on the form-meaning mappings as proposed by Mindt (1995, 2000) for the S(i) category. A sample R/S(i) opposition for the Present Continuous tense is presented in Table 13. ––––––––– 5

The other categories are in fact quotations of what the students themselves said and, as such, are uncontroversial.

Implementing Organic …

371

Table 13. The R (rules) vs. S(i) (semantic interpretations) indicators for the present continuous tense R (based on Thomson and Martinet 1990: 154-155) the present continuous is used for: 1) actions happening now 2) actions happening about this time but not necessarily at the moment of speaking 3) a definite arrangement in the future 4) with always 5) an in progress or apparently continuous

S(i) (based on Mindt 2000: 248-250) the present continuous is used for: 1) incompletion 2) temporariness 3) iteration/habit 4) highlighting prominence 5) volition 6) emotion, matter of course or downtoning/politeness

It is worth mentioning that the difference between rules and interpretations was verbalised – unsolicited – by one of the interviewed native speakers who, commenting on a use of the be going to form as opposed to the will future, said: “As a teacher you try to emphasise that be going to is used to indicate some earlier plans whereas the will future is used for plans made at the moment of speaking. With native speakers, it is different. I would say I use will because it shows stronger intentions of the speaker”. 1.5.2. The NS/NNS gap calculation procedure: distance in multidimensional spaces As was pointed out before, in all three studies the data was collected by means of a gapped grammar test on which the subjects, both native and non-native, were asked to fill in the blanks with the bracketed verb in a temporal expression (=grammatical tense) that, in their opinion, was the most suitable for the given context (Studies 1 and 2) and give their rationale for the particular choice (Study 3). Before the final analysis, the number of answers in each gap/category of rationale was totalled, and each such score was treated as one dimension in the multidimensional space. Then the distance between participant groups was established by means of a mathematical method known as distance calculation in multidimensional spaces (Steinhaus 1960). The method and the calculations are explained based on the example below. Knowing that gap 2 in the test unit presented in Figure 53 was filled in the following way6: ––––––––– 6

In the table below every percentage representation of score has been replaced by a decimal fraction for the clarity of calculation.

372

Chapter Six

The woman’s father, who’s listening to the conversation and is very much in favour of the marriage, asks (hopefully) ‘Elisabeth, [2] ________________________ (accept) the commodore’s proposal? Blank 2 US E1 C will you accept 21% – 0.21 42% – 0.42 17% – 0.17 are you accepting 17% – 0.17 12% – 0.12 5% – 0.05 do you accept 14% – 0.14 4% – 0.04 11% – 0.11 accept 10% – 0.10 you accept 10% – 0.10 did you accept 7% – 0.07 17% – 0.17 please accept 7% – 0.07 accepted 7% – 0.07 won’t you accept 3% – 0.03 are you to accept 3% – 0.03 have you accepted 21% – 0.21 22% – 0.22 would you accept 12% – 0.12 22% – 0.22 don’t you accept 4% – 0.04 are you going to accept 4% – 0.04 5% – 0.05

and considering the total number of grammatical options chosen (14), we arrive at the 14-dimensional space, in which the distance (d) between the US group and ATH (E) group can be calculated using the following formula7:

(0.21 − 0.42) 2 + (0.17 − 0.12) 2 + (0.14 − 0.04) 2 + (0.1 − 0) 2 + d= 2 (0.1 − 0) 2 + (0.07 − 0) 2 + (0.07 − 0) 2 + (0.07 − 0) 2 + (0.03 − 0) 2 + (0.03 − 0) 2 + (0 − 0.21) 2 + (0 − 0.12) 2 + (0 − 0.04) 2 + (0 − 0.04) 2 The mathematical method of distance calculation in multidimensional spaces was chosen because it seems particularly intuitive to conceptualise the NNS/NS gap in terms of distance; besides, it allows for an actual comparison of the profiles of groups with different numbers of members. However, it has to be admitted that the mathematical method has its limitations, which is probably the main reason why it is not widely used in pedagogical research. Its main shortcoming is that based on distance calculation in multi-dimensional spaces it is difficult if not ––––––––– 7

Complicated as it may look, the formula is based on a calculation known to most people as the Phytagoras theorem. In a 2-D space, c = a + b . In a 3-D space, d = 2

2

a 2 + b 2 + c 2 (a, b and c being the lengths of three different sides of the figure. Consequently, in any n-D space, the distance between two different points can be calculated by square root of the sum of squares of individual component distances.

Implementing Organic …

373

impossible to decide how significant the experiment-induced fluctuations are. That is why, the mathematical calculations will be analysed together with 2 and p measures calculated for inter-group (e.g.: US/E1) comparisons. The statistic was counted in two different ways: multivariate, for each distance calculation (cf. appendices 2 and 3; cited in the presentation of pre-test, post-test and delayed post-test results); and 2x2 (NS-chosen/other choices x onegroup/the-other-compared-group scores) used for pre-test/delayed post-test comparison. The latter procedure has been adopted because: 1) for same gaps on the tests the statistically important difference (in the multivariate analysis) was found to lie outside the temporal system8; or 2) in tests where numerous answers had been chosen by single testees, the statistical significance of the modes of two compared groups was obscured by high degrees of freedom9. The 2x2 calculations were done with the use of STATISTICA 9.0; the multivariate statistics were calculated by GW-BASIC 3.23 (Microsoft 1983-1988). 2. Research implementation 2.1. Research chronology The experimental treatment was first implemented in the years 2005-2006 in Group E1; Study 1, which was related to it, was carried out in the years 20052007, following the chronology presented below: – October 2005: pre-test in the NNS groups, E1 (Sample 1) and C (Sample 2); both Test 1 (pre-test and delayed post-test) and Test 2 (post-test) carried out in the NS control groups (Sample 3) – October 2005 – January 2006: the treatment – January 2006: post-test in E1 and C – January 2007: delayed post-test in E1 and C The treatment was replicated in the years 2006-2007 in Group E2. Study 2 was carried out within the following time frame: – October 2006: pre-test in E2 (Sample 4); C and the NS groups treated as controls – October 2006 – January 2007: the treatment – January 2007: post-test – January 2008: delayed post-test ––––––––– 8 9

Cf. Appendix 2: Test 1, sample viii.1 for E1, where the difference is determined by the group using – or not – uninverted questions in colloquial speech. Cf. Appendix 2: Test 1, sample v.1, US/E2 comparison

374

Chapter Six

Study 3 was carried out alongside Study 2, which means that the verbal reports of Sample 5 were collected after each test in October 2006, January 2007 and January 2008. The verbal reports of the NS teacher group (Sample 6) were collected in the years 2007-2009. 2.2. Studies 1 and 210 2.2.1. Pre-test results Before detailed pre-test results are presented, it seems important to draw the reader’s attention to one general observation that confirms Nunan’s (1996) as well as Halliday’s (1994, 2003) and Larsen-Freeman’s (2003) standpoint: grammar is a question of choice. In many test units, as in Test Sample 1/i.2, a number of options to fill in the blanks were chosen by the testees, both native and non-native. This is one more proof that the semantics of grammar goes not only beyond the sentence level but also beyond the discourse level, engaging default assignments based on text schemata as well as the (inter)subjective perspective based on previous experience – linguistic and general. Test Sample 1/i.2 The woman’s father, who’s listening to the conversation and is very much in favour of the marriage, asks (hopefully) ‘Elisabeth, [2] ________________________ (accept) the commodore’s proposal? blank 2 US E1 E2 C will you accept 21% 42% 38% 17% are you accepting 17% 12% 5% 5% do you accept 14% 4% 24% 11% accept 10% you accept 10% did you accept 7% 10% 17% please accept 7% accepted 7% won’t you accept 3% are you to accept 3% have you accepted 21% 14% 22% would you accept 12% 5% 22% don’t you accept 4% are you going to 4% 5% 5% accept

––––––––– 10

For complete results - see appendices 2 and 3

Implementing Organic …

375

However, alongside examples like the one quoted above, there are a number of cases where the native and non-native speakers are consistent in their selection of tense and aspect. In some of these areas there is a gap between the native and nonnative groups. They include the following NS/NNS differences: – Non-native speakers choose the non-prototypical form-meaning mappings of tense and aspect far less frequently then the US group (Test Sample 1/ix.2). Test Sample 1/ix.2 Press secretary to guest: ‘The Prime Minister [1] _______(expect) you. We [2] ______ (begin) to think you [3] ________(not come). I hope you[4] ______(enjoy) your stay.’ gap 2 US E1 E2 C began 66% 75% 67% 67% were beginning 28% 5% 8% had begun 7% have begun 12% 18% 8% begin 12% 10% 8%

The distance between the NS and NNS groups in this respect is 0.35 (US/E1), 0.32 (US/E2) and 0.24 (US/C), and is statistically significant for the two experimental groups as demonstrated by the following 2 and p scores: 1/ix.2 2 p df

US/E1 15.69501 p 0.8 1

E1/ E1d

1 E2/ E2d 2.459 628 p> 0.1 1

1

p< p< 0.001 0.001

US/ C

US/ E2

US/ E1

C 5 13

E2 3 18

E1 12 12

US/ Cd 8.636 2 328 p< p 0.005 df 1

p df

2

US 25 4

C/ Cd 1.62 103 p> 0.1 1

1

p< 0.025

6.552 721

US/ E1d

E1d 10 9

Cd 10 11

1

p< 0.001

14.62 845

US/ E2d

E2d 8 15

2 measures

Test 1 (pre-test and delayed (d) post-test results)

i. 1. A woman who has long refused to marry the commodore is trying to win a favour with him. That is why she says: ‘Commodore, I [1] ______________(beg) you please, do this for me as a wedding gift.’

Appendix 2

APPENDIX 2: Test 1 (pre-test and delayed (d) post-test results)

0.10 0.10 0.07 0.07 0.07 0.03 0.03

0.21 0.12 0.04 0.04

0.11 0.05

0.13 0.09 0.04

0.05

0.04

0.14 0.05

0.10

0.05

0.22 0.22

0.17

0.05

0.29 0.29

0.05

0.05

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.39 0.33 0.33 0.36 0.40 0.49 0.26 0.21 0.19 difference in distance 0.06 0.03 0.09

accept you accept did you accept please accept accepted won’t you accept are you to accept have you accepted would you accept don’t you accept are you going to accept

i.2. The woman’s father, who’s listening to the conversation and is very much in favour of the marriage, asks (hopefully) ‘Elisabeth, [2] __________you______________ (accept) the commodore’s proposal?’ US E1 E1d E2 E2d C Cd will you accept 0.21 0.42 0.42 0.38 0.48 0.17 0.24 are you accepting 0.17 0.12 0.16 0.05 0.13 0.05 do you accept 0.14 0.04 0.26 0.24 0.09 0.11 0.05 E1 10 3 1 0 0 0 0 0 0 0 5 3 1 1

US/ E1 2 20.61 175 p p< 0.005 df 6

US 6 5 4 3 3 2 2 2 1 1 0 0 0 0 US/ E2 19.2 7699 p< 0.01 9

E2 8 1 5 0 0 2 0 0 0 0 3 1 0 1 US/ C 21.46 814 p< 0.005 9

C 3 1 2 0 0 3 0 0 0 0 4 4 0 1 US/ E1d 13.7 4332 p< 0.05 8

E1d 8 3 5 0 0 0 0 0 0 0 2 1 0 0 US/ E2d 17.84 925 p< 0.05 9

E2d 11 3 2 0 0 1 0 0 0 0 3 2 0 1 US/ Cd 28.67 841 p< 0.001 9

Cd 5 0 1 0 0 1 0 0 1 0 6 6 0 1 E1/ E1d 6.683 575 p> 0.3 5

E2/ E2d 3.34 2062 p> 0.7 6

C/ Cd 2.41 6865 p> 0.07 5

484 Appendix 2

0.08

0.02

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.32 0.30 0.28 0.51 0.39 0.16

iii. A man who’s been invited to dinner asks his hostess: ‘Our host?’ ‘He [1] __________________ (not dine) with us tonight. He hopes you’ll understand. US E1 E1d E2 E2d C Cd won’t dine 0.72 0.08 0.11 0.48 0.30 0.22 0.14 isn’t dining 0.10 0.71 0.84 0.29 0.48 0.61 0.43 won’t be dining 0.07 0.08 0.05 0.09 0.05 cannot dine 0.07 shall not dine 0.03

US/ US/ US/ E1 E1d E2 distance 0.32 0.40 0.40 difference in distance 0.08

ii. Two ‘amateur robbers’ have almost been caught by the police. One of them says ‘That was way too close. We [1] _______________________ (not do) that again!’ US E1 E1d E2 E2d C Cd will not do 0.55 0.79 0.42 0.86 0.56 0.78 0.67 cannot do 0.21 0.05 shouldn’t do 0.10 0.08 0.05 0.05 0.10 aren’t doing 0.07 0.04 0.37 0.05 0.26 are not going to do 0.03 0.04 0.11 0.09 0.05 0.10 shall not do 0.03 wouldn’t do 0.04 0.09 0.05 0.05 don’t do 0.05 0.05 0.05 will not be doing 0.05 ‘d rather not do 0.05 E1 19 0 2 1 1 0 0 1 0 0 0 0

US 21 3 2 2 1 0 0 0

E1 2 17 2 0 0 1 2 0

US/ E1 2 9.61 7821 p p> 0.1 df 6

US 15 7 3 2 1 1 0 0 0 0 0 0

E2 10 6 1 0 0 4 0 0

US/ E2 13.6 7617 p> 0.05 7

E2 18 0 0 1 0 0 0 0 1 1 0 0

C 4 11 0 0 0 3 0 0

US/ C 7.36 3349 p> 0.2 6

C 14 1 1 0 1 0 0 1 0 0 0 0

E2d 13 0 0 6 2 0 2 0 0 0 0 0

E2d 7 11 2 0 0 3 0 0

US/ E2d 14.983 36 p< 0.025 6 E1d 2 16 0 0 0 3 0 0

US/ E1d 13.7 5523 p< 0.05 6

E1d 8 0 1 7 2 0 0 0 0 0 1 0

Cd 3 9 1 0 0 5 0 1

US/ Cd 12.6 1065 p> 0.1 8

Cd 14 0 2 0 2 0 1 0 0 0 1 1 E1/ E1d 11.2 1844 p< 0.05 5

E2/ E2d 10.3 0827 p< 0.05 5

C/ Cd 5.46 8254 p> 0.6 7

Appendix 2 485

0.05

0.19

0.13

0.17

0.05

0.24 0.05 0.05

iv.1. A woman is making a very firm invitation: ‘You [1] ______________ (come) in one hour, darling. US E1 E1d E2 E2d C will come 0.59 0.17 0.42 0.38 0.38 0.61 come 0.17 0.08 0.21 0.24 0.22 can come 0.14 must come 0.03 0.08 0.05 0.09 are coming 0.03 0.46 0.21 0.05 0.22 0.22 have to come 0.03 0.05 0.05 should come 0.04 0.09 0.05 may come 0.04 do come 0.04 0.16 0.05 are to come 0.04 shall come 0.04 0.05 0.10 would come 0.14 0.05 should have come 0.05 are going to come 0.05 0.05 0.05

0.05

0.14

0.05 0.14 0.05 0.10

Cd 0.24 0.19

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.59 0.74 0.72 0.16 0.27 0.23 0.22 0.02

0.04 0.08

US/ US/ US/ E1 E1d E2 distance 0.90 0.97 0.37 difference in distance 0.07

doesn’t dine isn’t going to dine couldn’t dine wasn’t able to dine didn’t dine

E1 4 2 0 2 11 0 1 1 1 1 1 0 0 0 0

E2 8 5 0 0 1 0 0 0 1 0 2 0 3 1 0

US/ E2 11.2 4442 p< 0.05 5

US/ E1 2 31.3 0255 p p< 0.01 df 6

US 17 5 4 1 1 1 0 0 0 0 0 0 0 0 0

0 0

0 0

0 0

C 11 0 0 0 4 1 1 0 0 0 0 1 0 0 0

US/ C 22.8 0619 p< 0.01 5

0 0

0 0

1 1

Cd 5 4 0 1 3 1 2 0 3 0 1 0 0 0 1

US/ Cd 27.2 5096 p< 0.01 8

E2d 9 6 0 1 6 1 0 0 1 0 0 0 0 0 0

US/ E2d 17.1 0688 p< 0.01 5

E1d 8 2 0 3 4 0 1 0 0 1 0 0 0 0 0

US/ E1d 32.1 33 p< 0.01 5

0 0 E1/ E1d 5.01 0556 p> 0.2 4

E2/ E2d 2.39 022 p> 0.4 3

C/ Cd 4.63 9541 p> 0.6 6

486

Appendix 2

distance difference in distance

0.05

0.10

0.10

0.28 0.05

Cd 0.81

C 0.61

0.64

0.79

0.24

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd 0.82 0.18 0.85 0.06 0.46 0.22 0.70 0.78 0.28

iv.2. I [2] ____________(insist). US E1 E1d E2 E2d insist 0.97 0.25 0.84 0.29 0.92 must insist 0.03 0.08 0.04 do insist 0.37 0.05 0.48 0.04 will insist 0.08 0.05 strongly insist 0.08 have to insist 0.04 don’t insist 0.04 am to insist 0.04 would rather insist 0.10 am insisting 0.11 0.10

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.63 0.34 0.32 0.34 0.30 0.44 0.41 0.28 0.46 difference in distance 0.02 0.13 0.29

E1 6 2 9 2 2 1 1 1 0 0

US/ E1 2 30.3 682 p p< 0.01 df 7

US 28 1 0 0 0 0 0 0 0 0

US/ E1 2 27.7 755 p p< 0.01 df 9

US/ E2 29.7 1603 p< 0.01 5

E2 6 0 10 1 0 0 0 0 2 2

US/ E2 15.3 5304 p< 0.05 9

US/ C 13.5 7963 p< 0.01 4

C 11 0 5 0 1 0 0 0 0 1

US/ C 13.2 3628 p> 0.05 7

US/ E1d 5.42 4847 p> 0.1 3

US/ E2d 1.32 5337 p> 0.3 2

US/ Cd 6.57 7267 p> 0.05 3

Cd 17 0 2 0 0 0 0 0 0 2

US/ CD 17.8 3309 p< 0.05 9

E2d 21 1 1 0 0 0 0 0 0 0

US/ E2d 10.7 4783 p> 0.2 6

E1d 16 0 1 0 0 0 0 0 0 2

US/ E1d 12.7 9784 p> 0.05 7

E1/ E1d 21.6 5688 p< 0.01 8

E1/ E1d 7.31 7545 p> 0.4 8

E2/ E2d 21.6 579 p< 0.01 5

E2/ E2d 11.5 726 p> 0.1 8

C/ Cd 3.69 5862 p> 0.2 3

C/ Cd 13.5 7575 p> 0.1 9

Appendix 2 487

are not panicking 0.03 don’t panic 0.03 0.04 aren’t going to panic 0.04 shall not panic

0.05 0.05

0.05 0.05 0.29 0.05 0.05

0.17 0.05

0.04 0.30 0.17 0.10

0.10

v.2. Mother (very firmly): ‘We [2]___________ (not panic). US E1 E1d E E2d C Cd won’t panic 0.72 0.58 0.63 0.43 0.30 0.28 0.43 can’t panic 0.14 0.17 0.05 0.05 0.28 0.19 shouldn’t panic 0.03 0.10 0.18 0.05 0.19 mustn’t panic 0.03 0.17 0.11 0.14 0.09

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.52 0.48 0.51 0.58 0.64 0.66 0.19 0.19 0.12 difference in distance 0.04 0.07 0.02

v.1. The family being in a really tight spot, the children cry to their mother: ‘What __________we [1] _________________ (do)?’ US E1 E1d E2 E2d C Cd will we do 0.52 0.12 0.16 0.19 0.09 0.11 0.14 are we going to do 0.24 0.54 0.53 0.62 0.61 0.72 0.76 shall we do 0.10 0.08 0.05 0.05 0.09 0.05 should we do 0.07 0.04 0.11 0.05 0.04 0.05 are we to do 0.03 do we do 0.03 0.17 0.05 0.13 0.05 have we done 0.04 0.05 would we do 0.05 0.05 are we doing 0.11 0.04 0.05

US 21 4 1 1 1 1 0 0

E1 14 4 0 4 0 1 1 0

E2 9 1 2 3 0 6 0 0

US/ E2 12.5 4285 p> 0.1 7

US/ E1  17.6 8432 p p< 0.01 df 6 2

E2 4 13 1 1 0 0 1 1 0

E1 3 13 2 1 0 4 1 0 0

US 15 7 3 2 1 1 0 0 0

C 5 5 1 0 0 3 3 1

US/ C 14.98 767 p< 0.025 6

C 2 13 1 0 0 1 0 1 0

E2d 2 14 2 1 0 3 0 0 1

E2d 7 0 4 2 1 7 1 1

US/E 2d 15.31 949 p< 0.025 6

E1d 12 1 0 2 1 1 1 1

US/ E1d 10.9 2004 p> 0.05 6

E1d 3 10 1 2 0 1 0 0 2

Cd 9 4 4 0 0 2 2 0

US/C D 17.01 054 p< 0.01 6

Cd 3 16 0 1 0 0 0 0 1 E1/ E1d 5.34 8897 p> 0.5 6

E2/ E2d 6.965 09 p> 0.4 7

C/ Cd 5.31 1003 p> 0.5 6

488 Appendix 2

distance difference in distance

0

US/ US/ US/ US/ US/ E1 E1d E2 E2d C 0.06 0 0 0 0

0.06

0.05

Cd 0.52 0.43

0

0.06

0

0

US/ E1/ E2/ C/ Cd E1d E2d Cd 0 0 0 0

Cd 1.00

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.26 0.38 0.53 0.12 0.17 0.18 0.09 0.15

v.4. I [4] __________________ (swear) US E1 E1d E2 E2d C swear 1.00 0.96 1.00 1.00 1.00 1.00 will swear 0.04

US/ US/ US/ E1 E1d E2 distance 0.07 0.18 0.17 difference in distance 0.18

v.3. We [3] ___________ (not die). US E1 E1d E2 E2d C won’t die 0.90 0.87 0.79 0.76 0.70 0.61 aren’t going to die 0.07 0.12 0.21 0.10 0.22 0.28 don’t want to die 0.03 0.10 wouldn’t die 0.11 aren’t dying 0.05 0.04 haven’t died 0.04

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.21 0.17 0.42 0.55 0.52 0.36 0.16 0.19 0.25 difference in distance 0.13 0.04 0.16

E2 21 0

C 18 0

US/ C .248 95 p< 0.05 3

C 11 5 0 2 0 0

US/ C 15.2 1628 p< 0.01 7

US/E US/ US/ 1 E2 C 2 1.231 571 p p>0.2 df 1

E1 23 1

US/ E2 2.49 8241 p> 0.4 3

US/ E1 2 1.27 1534 p p> 0.5 df 2 US 29 0

E2 16 2 2 0 1 0

US/ E2 11.5 1967 p< 0.05 5

E1 21 3 0 0 0 0

US 26 2 1 0 0 0

US/ E1 2 5.77 9742 p p> 0.4 df 6

-

US/ E1d

US/ E2d

-

US/ CD

Cd 21 0

US/ Cd 11.5 5134 p< 0.01 3 E2d 23 0

US/ E2d 6.05 4972 p> 0.1 4 E1d 19 0

US/ E1d 2.64 9551 p> 0.2 2

Cd 11 9 0 1 0 0

US/ CD 9.90 6953 p> 0.1 6

E2d 16 5 0 0 1 1

US/ E2d 19.1 966 p< 0.01 7

E1d 15 4 0 0 0 0

US/ E1d 5.75 4298 p> 0.5 7

E2/ E2d 4.20 349 p> 0.3 4

E2/ E2d 5.11 3245 p> 0.5 7

C/ Cd 1.25 2835 p> 0.5 2

C/ Cd 4.24 8338 p> 0.5 5

E1/ E2/ C/ E1d E2d Cd 0.805 15 p>0.3 1

E1/ E1d 0.56 9157 p> 0.4 1

E1/ E1d 4.09 4478 p> 0.5 6

Appendix 2 489

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.94 0.73 0.39 0.80 0.19 0.07 0.22 0.43 0.16 difference in distance 0.41 0.21 0.12

vi.1. A young poor boy has just learned he is a wizard and is to become a student of a special school for magic people. His aunt and uncle are strongly against the idea: ‘He [1] ________(not go). We swore we’d put a stop to all this rubbish.’ US E1 E1d E2 E2d C Cd won’t go 0.86 0.21 0.37 0.61 0.35 0.83 0.86 can’t go 0.07 0.10 0.10 will not be going 0.03 shall not go 0.03 isn’t going 0.67 0.53 0.29 0.61 0.17 0.05 mustn’t go 0.08 0.11 0.04 0.04 shouldn’t go 0.04 wouldn't go 0.04

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.13 0.10 0.06 0.13 0.09 0.14 0.07 0.12 0.17 difference in distance 0.03 0.07 0.05

v.5. I [5] _____________________(get) us out of this.’ US E1 E1d E2 E2d C Cd will get 0.90 1.00 0.95 0.90 1.00 0.94 0.81 get 0.07 0.05 0.10 can get 0.03 am going to get 0.05 0.05 0.04 0.10

df

p

2

36.1 8367 p< 0.01 6

US/ E1

10.78 559 p< 0.05 4

US/ E2

E1 5 0 0 0 16 2 1 0

US/ E2 2.19 8504 p> 0.5 3

US/ E1 2 2.63 1724 p p> 0.2 df 2

25 2 1 1 0 0 0 0

E2 19 1 0 1

E1 24 0 0 0

US 26 2 1 0

7.326 868 p> 0.1 4

US/ C

E2 13 2 0 0 6 0 0 0

US/ C 3.50 1025 p> 0.3 3

C 17 0 0 1

25.13 249 p< 0.01 5

US/ E1d

C 15 0 0 0 3 0 0 0

US/ E1d 3.52 4171 p> 0.3 3

27.43 046 p< 0.01 6

US/ E2d

Cd 17 2 0 2

2.934 662 p> 0.5 4

US/ CD

E1d 7 0 0 0 10 2 0 0

US/ CD 3.69 84 p> 0.2 3

E2d 23 0 0 0

US/ E2d 2.52 4983 p> 0.2 2

E1d 18 0 0 1

2.165 837 p> 0.5 3

E1/ E1d

C/ Cd 2.11 508 p> 0.3 2

3.060 065 p> 0.2 2

C/ Cd

Cd 18 2 0 0 1 0 0 0

7.964 412 p< 0.05 3

E2/ E2d

E2/ E2d 2.29 4784 p> 0.3 2

E2d 8 0 0 0 13 1 0 1

E1/ E1d 1.29 3233 p> 0.2 1

490 Appendix 2

0.19

0.18

distance difference in distance

0.63

0.03

0.03

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd 0.84 0.21 0.26 0.23 0.30 0.27 0.63 0.14 0.07

0.10 0.14

Cd 0.76

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.25 0.39 0.21 0.15 0.23 0.22

vi.3. and you [3] ____________ (never tell) me?’ US E1 E1d E2 E2d C told 0.97 0.37 0.84 0.76 0.83 0.72 would never tell 0.03 have never told 0.58 0.16 0.10 0.17 0.11 had never told 0.04 0.05 0.11 are never going to tell 0.05

US/ US/ US/ E1 E1d E2 distance 0.29 0.42 0.44 difference in distance 0.13

vi.2. Realizing they are hiding something from him, the boy asks his folk: ‘You [2] ___________ (know) all the time US E1 E1d E2 E2d C Cd knew 0.93 0.71 0.63 0.57 0.74 0.61 0.76 have known 0.07 0.25 0.37 0.29 0.22 0.28 0.14 had known 0.04 0.04 0.05 0.10 know 0.14 0.05 E1 17 6 1 0

E1 9 0 14 1 0 0

US/ E1 2 25.5 1212 p p< 0.01 df 3

US 28 1 0 0 0 0

US/ E1 2 4.84 4142 p p> 0.05 df 2

US 27 2 0 0

US/ E2 5.46 599 p> 0.2 3

E2 16 2 1 0 0 2

US/ E2 9.73 8537 p< 0.01 2

E2 12 6 0 3

US/ C 9.42 9867 p< 0.05 4

C 13 0 2 2 1 0

US/ C 7.97 9707 p< 0.05 3

C 11 5 1 1

US/ E1d 5.42 4847 p> 0.05 2

Cd 16 0 2 3 0 0 US/ CD 8.20 2717 p< 0.05 3

E2d 19 0 4 0 0 0 US/ E2d 6.11 2476 p< 0.05 2

Cd 16 3 2 0

US/ Cd 3.83 2054 p> 0.1 2

E2d 17 5 1 0

US/ E2d 3.91 8301 p> 0.1 2

E1d 16 0 3 0 0 0

US/ E1d 6.75 6946 p< 0.01 1

E1d 12 7 0 0

E1/ E1d 9.62 6408 p< 0.01 2

E1/ E1d 1.37 6204 p> 0.5 2

E2/ E2d 5.97 8586 p> 0.1 3

E2/ E2d 4.87 2136 p> 0.1 3

C/ Cd 1.28 7192 p> 0.5 3

C/ Cd 2.54 3541 p> 0.4 3

Appendix 2 491

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.28 0.36 0.16 0.11 0.25 0.23 0.04 0.20

US/ US/ US/ E1 E1d E2 distance 0.39 0.41 0.24 difference in distance 0.02

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.28 0.11 0.36 0.16 0.25 0.23 0.04 0.19

vii. 2. A bystander: ‘Who’s this?’ Fan: ‘Don’t tell me you [2] _______________________ (not hear) of him.’ US E1 E1d Ed E2d C Cd haven’t heard 0.86 0.92 0.89 0.90 0.96 1.00 0.95 never heard 0.10 didn’t hear 0.03 0.08 0.11 0.10 0.04 0.05

US/ US/ US/ E1 E1d E2 distance 0.39 0.41 0.24 difference in distance 0.02

vii. 1. Fan to celebrity: ‘I can’t believe I [1]______________________ (meet) you at last.’ US E1 E1d E2 E2d C Cd met 0.38 0.25 0.16 0.38 0.18 0.28 0.38 have met 0.31 0.62 0.68 0.38 0.42 0.61 0.43 am meeting 0.21 0.04 0.05 0.05 0.18 0.05 0.14 meet 0.03 0.12 0.10 0.19 0.18 got to meet 0.03 get to meet 0.03 will meet 0.04 E1 5 15 1 3 0 0 0 0

E1 22 0 2

US/ E1 2 3.08 0541 p p> 0.1 df 2

US 25 3 1

US/ E1 2 11.9 3817 p p< 0.05 df 5

US 11 9 6 1 1 1 0 0

US/ E2 2.94 6957 p> 0.2 2

E2 19 0 2

US/ E2 6.79 7965 p> 0.2 5

E2 8 8 1 4 0 0 0 0

US/ C 2.71 3713 p> 0.2 2

C 18 0 0

US/ C 5.99 5985 p> 0.3 5

C 6 11 1 0 0 0 0 0

US/ E1d 2.89 9663 p> 0.2 2

Cd 20 0 1 US/ CD 2.33 534 p> 0.3 2

E2d 22 0 1 US/ E2d 2.53 2904 p> 0.2 2

Cd 9 8 3 0 0 0 0 1

US/ CD 4.08 3358 p> 0.6 6

E2d 4 10 4 4 0 0 1 0

US/ E2d 7.93 2601 p> 0.2 6

E1d 16 0 3

US/ E1d 11.5 3392 p< 0.05 5

E1d 3 13 1 2 0 0 0 0

E1/ E1d 2.04 475 p> 0.05 1

E1/ E1d 0.26 5045 p> 0.95 3

E2/ E2d 0.46 2893 p> 0.4 1

E2/ E2d 4.27 3477 p> 0.3 4

C/ Cd 0.87 969 p> 0.3 1

C/ Cd 2.85 9837 p> 0.4 3

492 Appendix 2

US 0.34 0.21 0.10 0.10 0.03 0.03 0.03 0.03 0.03 0.03 0.03

0.08 0.04 0.04

E1 0.62 0.21

0.04 0.05 0.05

0.05

0.11 0.10 0.19

0.04 0.18

0.05 0.10 0.16 0.10

Cd 0.33 0.24

C 0.78 0.17

E1d E2 E2d 0.68 0.67 0.56 0.11 0.14 0.22

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.34 0.40 0.37 0.3 0.46 0.22 0.23 0.17 0.51 difference in distance 0.06 0.07 0.24

you have been sacked you were you sacked you sacked you are being sacked you are sacked you will be sacked you get sacked you being sacked you got sacked you getting sacked you are going to be sacked will you be sacked are you being sacked are you sacked would you be sacked wouldn’t you be sacked

viii. 1. Seeing a friend packing his stuff, a man says: ‘You [1] ______________________ (be sacked)?’

E2/ E2d 1.449122 p>0.6 3

E1/ E1d 8.968436 p>0.05 4

US/CD 5.001698 p>0.4 5

2 p df

E1d 13 2 0 1 3 0 US/C 7.201836 p >0.1 4

C 12 3 2 1 0 0

US/E2 6.531656 p >0.2 4

E2 14 3 0 2 2 0

US/E1 17.77743 p0.05 4

US/E1d 7.176028 p>0.1 4

E2d 13 5 0 1 4 0 US/E2d 4.670266 p>0.3 4

Appendix 2 493

0.21

0.16

0.06

US/ US/ US/ E1 E1d E2 distance 0.48 0.57 0.50 difference in distance 0.09

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.71 0.55 0.61 0.10 0.15 0.22 0.21 0.06

ix. 1. Press secretary to guest: ‘The Prime Minister [1] _________________ (expect) you. US E1 E1d E2 E2d C Cd expects 0.45 0.25 0.21 0.19 0.09 0.22 0.10 is expecting 0.31 0.71 0.79 0.76 0.78 0.72 0.90 expected 0.17 0.04 has been expecting 0.03 0.04 0.09 has expected 0.03 were expecting 0.03 was expecting 0.05 0.04

distance difference in distance

0.10

0.14 0.43

Cd 0.33

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd 0.80 0.59 1.06 0.90 0.76 0.70 0.24 0.20 0.21

viii.2. The other answers: ‘I [2] __________________________ (resign) actually.’ US E1 E1d E2 E2d C resigned 0.86 0.37 0.47 0.14 0.22 0.28 will resign 0.07 resign 0.07 0.05 have resigned 0.62 0.42 0.76 0.60 0.39 am going to resign 0.08 am resigning 0.08 0.11 0.10 0.18 0.28 E1 9 0 0 11 2 2

E1 6 17 0 1 0 0 0

US/ E1 2 1.45 812 p p< 0.05 df 5

US 13 8 5 1 1 1 0

US/ E1 2 26.2 9171 p p< 0.01 df 5

US 25 2 2 0 0 0

US/ E2 15.54 944 p< 0.025 6

E2 4 16 0 0 0 0 1

US/ E2 39.0 0423 p< 0.01 4

E2 3 0 0 16 0 2

US/ C 13.09 818 p< 0.05 6

C 4 13 0 0 0 0 1

US/ C 26.5 463 p< 0.01 4

C 5 0 1 7 0 5

US/ E1d 13.39 311 p< 0.025 5

Cd 7 0 3 9 0 2

US/ E2d 16.4 3938 p< 0.01 5

E2d 2 18 1 2 0 0 0

E1/ E1d 0.95 653 p> 0.6 2

E1/ E1d 1.91 8225 p> 0.5 3

US/ Cd 19.7 7437 p< 0.01 5

Cd 2 19 0 0 0 0 0

US/ CD 22.6 2418 p< 0.01 4

E2d 5 0 0 14 0 4

US/ E2d 35.1 0845 p< 0.01 4

E1d 4 15 0 0 0 0 0

US/ E1d 20.3 2839 p< 0.01 4

E1d 9 0 0 8 0 2

C/ Cd 2.57 6141 p> 0.2 2

C/ Cd 2.65 3983 p> 0.4 3

E2/ E2d 4.703 122 p> 0.3 4

E2/ E2d 1.21 1594 p> 0.5 2

494

Appendix 2

0.14 0.05

Cd 0.71 0.10

distance difference in distance

0.05

0.33

0.62

Cd

0.18

0.23

0.05

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd 0.65 0.47 0.80 0.57 0.69 0.64 0.42 0.48 0.14

ix. 3. you [3] __________________ (not come). US E1 E1d E2 E2d C weren’t coming 0.52 0.21 0.04 wouldn’t come 0.45 0.71 0.37 0.19 0.57 0.55 might not come 0.03 won’t come 0.29 0.32 0.52 0.26 0.44 don’t come 0.10 aren't going to come 0.05 0.04 weren’t going to come 0.05 aren’t coming 0.11 0.10 0.09

US/ US/ US/ US/ US/ US/ E1/ E2/ C/ E1 E1d E2 E2d C Cd E1d E2d Cd distance 0.35 0.30 0.32 0.28 0.24 0.25 0.05 0.08 0.08 difference in distance 0.05 0.04 0.01

ix. 2. We [2] __________________ (begin) to think US E1 E1d E2 E C began 0.66 0.75 0.74 0.67 0.70 0.67 were beginning 0.28 0.05 0.05 0.09 0.08 had begun 0.07 have begun 0.12 0.11 0.18 0.18 0.08 begin 0.12 0.11 0.10 0.04 0.08 E1 18 0 0 3 3

E1 0 17 0 7 0 0 0 0

US/ E1 2 23.2 6873 p p< 0.01 df 3

US 15 13 1 0 0 0 0 0

US/ E1 2 15.6 9501 p p< 0.01 df 4

US 19 8 2 0 0

US/ E2 37.4 4326 p< 0.01 7

E2 0 4 0 11 2 1 1 2

US/ E2 13.26 152 p< 0.025 4

E2 14 1 0 4 2

US/ C 23.0 8113 p< 0.01 3

C 0 10 0 8 0 0 0 0

US/ C 9.10 4907 p> 0.05 4

C 12 2 0 2 2

US/ E1d 15.7 6953 p< 0.01 4

Cd 0 13 0 7 0 0 0 1 US/ Cd 23.3 1692 p< 0.01 4

E2d 1 13 0 6 0 1 0 2 US/ E2d 21.8 4857 p< 0.01 5

Cd 15 2 0 3 1

US/ Cd 9.02 154 p> 0.05 4

E2d 16 2 0 4 1

US/ E2d 9.30 199 p> 0.05 4

E1d 4 7 0 6 0 0 0 2

US/ E1d 10.5 7779 p< 0.05 4

E1d 14 1 0 2 2

E1/ E1d 9.79 4628 p< 0.05 3

E1/ E1d 1.33 6678 p> 0.7 3

E2/ E2d 10.1 6539 p> 0.1 6

E2/ E2d 0.71 0559 p> 0.8 3

C/ Cd 1.23 4507 p> 0.5 2

C/ Cd 0.63 9682 p> 0.8 3

Appendix 2 495

0.06

0.27

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.55 0.64 0.37 0.09 0.16 0.27

US/ US/ US/ E1 E1d E2 distance 0.20 0.25 0.20 difference in distance 0.05

US/ US/ US/ E1/ E2/ C/ E2d C Cd E1d E2d Cd 0.43 0.20 0.13 0.11 0.56 0.19 0.23 0.07

x. One character to another (after, supposedly, annihilating the opponent): ‘We [1] _____________________________(not see) him again.’ US E1 E1d E2 E2d C Cd will not see 0.79 0.75 0.68 0.95 0.48 0.94 0.81 won’t be seeing 0.10 0.04 0.09 shall not see 0.07 0.05 0.10 didn’t see 0.03 0.05 aren’t going to see 0.17 0.16 0.13 0.05 aren’t seeing 0.04 0.11 0.26 0.05 wouldn’t see 0.04 0.05

US/ US/ US/ E1 E1d E2 distance 0.79 0.83 0.61 difference in distance 0.04

ix. 4. I hope you [4] ___________________ (enjoy) your stay.’ US 1 E1d E2 E2 C Cd enjoy 0.59 0.08 0.11 0.19 0.18 0.17 0.38 will enjoy 0.24 0.83 0.89 0.67 0.57 0.67 0.52 enjoyed 0.17 0.04 0.05 0.04 0.05 are enjoying 0.04 0.09 0.11 0.05 have enjoyed 0.10 0.04 0.05 would enjoy 0.04 will be enjoying 0.04

US/ E1 2 9.22 0117 p p> 0.05 df 5

E1 18 1 0 0 4 1 0 US/ E2 4.03 2535 p> 0.2 3

E2 20 0 0 1 0 0 0

US/ E2 14.1 2933 p< 0.01 3

US/ E1 2 21.4 8757 p p< 0.01 df 3 US 23 3 2 1 0 0 0

E2 4 14 1 0 2 0 0

E1 2 20 1 1 0 0 0

US 17 7 5 0 0 0 0

US/ C 5.63 4147 p> 0.2 4

C 17 0 0 0 0 0 1

US/ C 1749 99 p< 0.01 4

C 3 12 0 2 1 0 0

US/ E1d 10.4 8276 p> 0.05 5

Cd 17 0 2 0 1 1 0 US/ Cd 5.76 7652 p> 0.3 5

E2d 11 2 0 0 3 6 1 US/ E2d 16.9 689 p< 0.01 6

Cd 8 11 1 1 0 0 0

US/ CD 7.86 736 p< 0.05 3

E2d 4 13 1 2 1 1 1

US/ E2d 7.04 86 p< 0.01 6

E1d 13 0 1 0 3 2 0

US/ E1d 19.7 8412 p< 0.01 2

E1d 2 17 0 0 0 0 0

E1/ E1d 2.73 827 p> 0.6 4

E1/ E1d 1.68 4625 p> 0.6 3

E2/ E2d 15.5 5413 p< 0.01 5

E2/ E2d 4.28 8322 p> 0.6 6

C/ Cd 4.79 762 p> 0.3 4

C/ Cd 4.44 5072 p> 0.3 4

496

Appendix 2

non-proto typical uses distance difference in distance mean SD

proto typical uses distance difference in distance mean SD

mean SD

distance difference in distance

3.53

0.39 0.21

3.96

0.22 0.44 0.28

0.46 0.30

0.55 0.37

1.30 0.35 0.16

0.51 0.28

4.17

4.36

0.43 0.28

2.81

0.39 0.24

4.11

0.47 0.28

1.82

-0.71 0.47 0.20

4.24

1.48 0.36 0.29

2.88

0.40 0.24

0.61

0.39 0.23

3.48

0.47 0.22

3.72

0.42 0.22

0.18 0.37 0.26

3.30

1.28 0.31 0.20

2.44

0.36 0.22

1.33

0.20 0.16

1.80

0.27 0.26

2.14

0.22 0.19

0.29 0.17

2.63

0.27 0.22

2.16

0.25 0.17

2.25

0.28 0.17

2.20

0.27 0.19

0.18 0.245 0.05 0.24

1.60

0.23 0.14

1.86

0.20 0.11

0.32 0.19

2.91

0.20 0.13

1.62

0.27 0.17

0.20 0.16

1.80

0.26 0.26

2.07

0.22 0.19

0.22 0.16

1.95

0.24 0.13

1.89

0.22 0.14

0.31 0.21

2.81

0.29 0.19

2.35

0.29 0.18

0.23 0.14

2.09

0.19 0.10

1.49

0.20 0.12

E2d/ E1/ E1d/ E1d/ US/ US/ US/ US/ US/ E1/ E2/ C/ E2/C E1/C US/C Cd E2 E2d Cd E1 E1d E2 E2d Cd E1d E2d Cd 10.75 8.93 9.83 9.22 9.55 8.22 5.07 5.79 4.70 6.13 6.27 5.00 5.09 6.59 4.62

Appendix 2 497

US/E1 US/E2 US/C 1.02 0.78 1.14

US/E1 US/E2 US/C distance 0.78 0.67 0.75

ii.1. Neighbour to neighbour about his dog: “ I [1]____________(find) Verdell. US E1 E2 C found 1.00 0.39 0.44 0.37 have found 0.47 0.32 0.37 can’t find 0.13 0.16 0.16 will find 0.04 0.05 (-) 0.05 am going to find 0.04

distance

Distance in multidimensional spaces i. 1. Asked whether he’s seen sb a man thinks for a while and then answers: “Oh, you mean the coloured man I [1]__________________________ (see) around here? US E1 E2 C saw 0.91 0.03 0.28 0.09 see 0.06 0.26 0.04 0.05 seen 0.03 have seen 0.43 0.44 0.79 have been seeing 0.17 0.04 0.05 am seeing 0.09 0.08 was seeing 0.04

34 0 0 0 0 0

E1 9 11 3 0 0 0

E2 11 8 4 1 0 1

C

US/E1 US/E2 2 27.43377 24.9635 p p