Language Processing and Disorders [1 ed.] 9781527511958, 9781443895088


English Pages 410 Year 2017



Language Processing and Disorders
Edited by Linda Escobar, Vicenç Torrens and Teresa Parodi

This book first published 2017

Cambridge Scholars Publishing
Lady Stephenson Library, Newcastle upon Tyne, NE6 2PA, UK

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Copyright © 2017 by Linda Escobar, Vicenç Torrens, Teresa Parodi and contributors

All rights for this book reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.

ISBN (10): 1-4438-9508-3
ISBN (13): 978-1-4438-9508-8

TABLE OF CONTENTS

A Review of Language Processing, Second Language Acquisition and Language Disorders
Vicenç Torrens, Linda Escobar and Teresa Parodi

Part One: Language Processing

Reaction Time as a Measure of Implicit Grammaticality Judgment
Misha Becker

Probabilistic Phonotactics in Vocabulary Acquisition during Reading in a Native Language
Denisa Bordag, Maria Rogahn, Amit Kirschenbaum and Erwin Tschirner

The Construal Hypothesis and Relative Clause Processing: The Effect of the Referentiality Principle in Brazilian Portuguese
Gitanna Brito Bezerra and Márcio Martins Leitão

Language Experience and Memory Effects in Anaphora Resolution in Greek
Eleni Fleva, Georgia Fotiadou, Maria Katsiperi, Eleni Peristeri, Maria Mastropavlou and Ianthi Maria Tsimpli

The Form-Function Relation in of-phrases: An Experimental Approach
Takashi Fujiwara, Fuminori Nakamura and Daisuke Suzuki

An Acceptability Study of Long-Distance Extractions in Swedish
Anna-Lena Wiklund, Fredrik Heinat, Eva Klingvall and Damon Tutunjian

The Cue-based Retrieval Theory of Sentence Comprehension: New Findings and New Challenges
Dan Parker, Michael Shvartsman and Julie van Dyke

Contextual Information in Universal-Q processing and Some Remarks on Distributivity, Collectivity and Maximality
Erica dos Santos Rodrigues and Mercedes Marcilese

Part Two: Second Language Acquisition and Bilingualism

When the Burden of Age Does Not Wait: Early and Late L2 Acquisition of Differential Object Marking in Spanish
Pedro Guijarro-Fuentes and Acrisio Pires

Retrieving Presupposition in English L2: An Eye-tracking Study of Pragmatic Scales with Focus Particles
Olga Ivanova

Small Clauses in the Bilingual Syntactic Priming Paradigm
Paulina Łęska and Katarzyna Jankowiak

Part Three: Language Impairments

Is the Self-reference of Autistic Children Atypical? The Case of Two French Autistic Children
Camelia-Mihaela Dascalu

Psycholinguistics of Dementia: What is Known about Specific Language and Speech Traits of Alzheimer’s Disease?
Olga Ivanova, Juan José García Meilán, Francisco Martínez Sánchez, Emiliano Rodríguez and Juan Carro

Light Verbs Revisited: A Comparative Perspective of Impaired Language Development and Dementia
Vasiliki Koukoulioti and Stavroula Stavrakaki

Syntactic Complexity in Children with Autism Spectrum Disorder and Specific Language Impairment
Alexandrina Martins, Ana Lúcia Santos and Inês Duarte

Regular and Irregular Inflectional Morphology in Acquired Language Disorders: The Case of German
Martina Penke and Eva Wimmer

Pragmatic Coherence in Alzheimer’s Disease: A Protocol for Analysis
Ana Varela

A Case of Surface Dyslexia in a Spanish-English Bilingual Patient with Broca’s Aphasia
Cristina Vereda, Mercedes González-Sánchez and Lidia Taillefer

A REVIEW OF LANGUAGE PROCESSING, SECOND LANGUAGE ACQUISITION AND LANGUAGE DISORDERS

VICENÇ TORRENS, LINDA ESCOBAR AND TERESA PARODI¹

Language processing is considered an important part of cognition, and the number of studies conducted in this field is increasing. This book gathers together a collection of papers on language processing, second language acquisition and language disorders.

The first paper, by Misha Becker, “Reaction Time as a Measure of Implicit Grammaticality Judgment”, deals with children’s acquisition of the argument structure of novel predicates. This study is based on a reaction time methodology, on the premise that children take longer to process an ungrammatical sentence or a garden path. It presents an experiment in which children aged 4-7 years had to answer a grammatical or ungrammatical question. The questions differed with respect to correct or incorrect argument structure and included familiar and novel verbs, both transitive and intransitive. Becker found that children usually had slower reaction times with ungrammatical sentences than with grammatical ones. In a second experiment, she tested younger children aged 3-4 years. This time they had to answer questions that included transitive and intransitive familiar verbs. In this second experiment, she found that children answered the ungrammatical questions more slowly than the older children in the first experiment. These data suggest that children aged 3-7 years can distinguish between grammatical and ungrammatical questions in terms of the argument structure of the included verbs.

¹ Vicenç Torrens [corresponding author], Facultad de Psicologia, U.N.E.D. E-mail: [email protected]
Linda Escobar, Facultad de Filologia, U.N.E.D. E-mail: [email protected]
Teresa Parodi, Department of Theoretical and Applied Linguistics, University of Cambridge. E-mail: [email protected]


The paper, “Probabilistic phonotactics in vocabulary acquisition during reading in a native language”, by Bordag et al., investigates the role of the prominence of a word form in the incidental acquisition of new words by native German speakers. Specifically, the authors studied the role of the relative frequencies of segments and their sequences in words. To this end, they collected data on the inference and establishment of meaning, its integration into the existing semantic network and the establishment of a new word form. In a self-paced reading task, the authors found that participants learned and retained in memory the meaning of both high and low salient novel words, although reaction times were longer in the implausible condition than in the plausible condition. In a second task, using a vocabulary knowledge scale, participants had to judge their knowledge of the novel words they had learned in the previous task, along with totally novel words. The authors found that participants could more easily recall a word with a low phonotactic probability than a word with a high phonotactic probability. In a third experiment, using a semantically primed lexical decision task, the authors found that participants processed targets more slowly after a semantically related novel word prime than after an unrelated existing word. This semantic effect is totally different for words with firmly established representations. In a fourth experiment, a recognition task with repetition priming, the authors found no differences in priming effects between novel words with high and low phonotactic probability. They concluded that phonotactic probability in novel words is crucial to understanding lexical acquisition, which varies depending on different variables.

The aim of Bezerra & Leitão’s study was to conduct experimental research on the referentiality principle and, in more general terms, on the predictions that the construal hypothesis makes with respect to Brazilian Portuguese. In particular, this study addresses the referential status of the N2 and its effects on relative clause processing, as well as the initial syntactic underspecification proposed by the construal hypothesis. Given that the referentiality principle has been studied primarily through off-line experiments, this study used both on-line and off-line techniques to provide new experimental data: (i) a self-paced reading task; and (ii) a questionnaire.

Other researchers are also interested in complex phenomena such as anaphora resolution (AR). In the paper, “Language experience and memory effects in anaphora resolution in Greek”, by Eleni Fleva, Georgia Fotiadou, Maria Katsiperi, Eleni Peristeri, Maria Mastropavlou and Ianthi Maria Tsimpli, AR is examined by contrasting more and less complex sentences that include overt or null pronouns. In their experiment, participants were presented with two self-paced listening (SPL) tasks under different conditions. However, in both tasks, anaphora resolution of the null and overt pronoun was assessed with subject-verb-object (SVO) sentences and with sentences that presented a rather different word order, including the clitic-left-dislocation (O-cl-V-S) structure. A number of factors were taken into account, such as preference for the object or the subject referent of the sentence, as well as the time needed for resolution. The authors discuss interesting differences in participants’ performance, taking age, education, language experience and memory resources into account, among other things.

Takashi Fujiwara, Fuminori Nakamura and Daisuke Suzuki deal with the issue of synonymy in their paper, “The form-function relation in of-phrases: An experimental approach”, to discover whether “of-phrases” and adjectivals (e.g. “of interest” vs. “interesting”) are processed equally in English. They show that the form-function relation of “of-phrases” plays a significant role in the processing of English prepositional phrases. Drawing on previous corpus analysis, they select two factors for their analysis: “subject type” (how the subject slot is filled) and “modality” (whether or not the expressions co-occur with modal verbs). In this way, they could plan their experimental study through a questionnaire, paying more attention to these two variables in unison rather than individually. Traditionally, “of-phrases” have been treated as grammatically equivalent to adjectivals. Nevertheless, there is no fine-grained description of the differences between the two. The findings reported in this experimental study focus on the aforementioned, highly subjective variables and on how discourse function plays a role, providing a basis for future comparison of these two expressions.

The purpose of the study by Wiklund et al. was to obtain controlled acceptability judgment data for Swedish (probably extensible to other mainland Scandinavian languages) regarding structures that have been assumed not to involve island violations in Swedish, although they do in most other languages. In particular, three structures in their extracted and non-extracted forms are under discussion: relative clauses (RC), that-clauses (TC), and non-restrictive relative clauses (NRC). Contrary to expectation, the results obtained did not always align with the on-line measures obtained via eye-tracking in previous studies reported in the literature.

Parker, Shvartsman and Van Dyke’s paper provides an account of current perspectives on memory retrieval in sentence comprehension. After carefully reviewing psycholinguistic results on the timing and accuracy of dependency formation from a large number of current experimental studies, they argue that these findings can be best captured with “a direct-access retrieval mechanism that gives preferential weighting to syntactic information when navigating linguistic representations in memory.”

Inspired by the work of Lima (2013), but using a slightly different methodology, Erica dos Santos Rodrigues and Mercedes Marcilese present new findings from two experiments designed to examine the possible role of preceding context in the interpretation of potentially ambiguous Q-expressions. In their study, “Contextual information in Universal-Q processing and some remarks on distributivity, collectivity and maximality”, these authors investigate the processing of three Q-expressions, “cada”, “todo”, and “todos os”, in Brazilian Portuguese (BP), with a larger number of participants than in the previous research reported in the literature. They also present two further off-line experiments intended to research how the maximality property of universal quantifiers is associated with the processing of “todo” and “todos os” in BP. Interestingly, they discuss the differences found in the processing of these two apparently similar quantifiers, where context plays a decisive role.

In their paper, “When the burden of age does not wait: early and late L2 acquisition of differential object marking in Spanish”, Guijarro-Fuentes & Pires examine the acquisition of differential object marking (DOM) in L2 Spanish by speakers of English exposed to Spanish before puberty and later, thus testing the critical period hypothesis. The results show that age is not a predictor of accuracy in the use of the Spanish preposition “a” in the relevant DOM contexts. The authors discuss the results against the background of the interpretability and feature reassembly hypotheses. According to the former, uninterpretable features are expected to be problematic for L2 learners, but not interpretable ones.
According to the latter, the acquisitional problem is orthogonal to interpretability, as even with a similar set of L1 features, L2 learners may find difficulties in their target realization. The results appear to challenge the interpretability hypothesis: interpretable features turn out to be problematic for both early and late learners. On the other hand, the fact that semantic (interpretable) features are difficult to acquire is interpreted as evidence in favor of the feature reassembly hypothesis.

Olga Ivanova’s paper, “Retrieving presupposition in English L2: an eye-tracking study of pragmatic scales with focus particles”, deals with how L2 speakers of English with German L1 process scalar expressions with the focus particle “even”. The study compared processing in L2 to native processing of focus particles which, unlike L2 processing, has already been extensively studied. In a self-paced reading task, the test items were contrasted according to the occurrence (or not) of “even” in the utterance “Mike and Lucy love (even) punk”. The critical items measured by eye-tracking on first and second pass were “even” and the focus “punk”. These time measures were taken to reflect processing cost. The outcome showed that in the utterance without a discourse particle there were significant differences in the low- and high-level responses between English natives and German L2 English speakers; this was taken to indicate that cultural background may play a role. On the other hand, the presence of the discourse particle “even” yields similar results for both groups of participants: in both cases, “even” involves the highest processing cost as measured by reading times, in both the first and the second pass. According to the author, this result indicates that focus particles are universal tools that guide pragmatic inference independent of proficiency.

The paper by Łęska & Jankowiak looks for syntactic priming in English and Polish bilingual populations. They tested whether processing of a structure in one language was enhanced after using an analogue in another language. The structure under study was small clauses, present in both English and Polish. Small clauses are predication phrases with the semantic function of predication. Two types of small clause constructions were used: small clauses whose predicate is an adjective (AP predicates); and small clauses whose adjectival predicate is introduced by a preposition (PP predicates). This research compares the reaction times elicited by stimuli presented in L1 with those elicited by stimuli presented in L2, as reaction times are usually longer in a non-native than in a native language. There were three experimental blocks: two within-language blocks (a within-language Polish block and a within-language English block); and one between-language block. The participants had to decide whether sentences were grammatically correct or incorrect.
The authors found that accuracy rates were higher for primed compared to unprimed conditions in both languages, regardless of the small clause type. They also found that sentences are easier to process when they are primed with the same syntactic structure in both languages.

“Is the self-reference of autistic children atypical? The case of two French autistic children”, by Camelia-Mihaela Dascalu, addresses the question of whether non-typically-developing children have difficulty in learning pronominal reversals, on the assumption that there are a large number of causes that can produce this difficulty, such as echolalia, the mother’s input, conversational roles, the understanding of context, or the lack of a theory of mind. Dascalu conducted research on the use of self-reference by autistic children. The findings were compared with those of a typically-developing child. The data from the autistic children were collected at home with audio-visual recordings of spontaneous speech interaction with mothers and other children. The longitudinal corpus of one neuro-typical child was collected manually and transcribed with CLAN. Dascalu found that the typically-developing child produced linguistic markers very early on, whereas the autistic children rarely used their first names in subject position. Instead, they used the third and second person pronouns to refer to themselves. Dascalu argues that autistic children have a particular difficulty with perspective-taking and reference shift, which explains the delay in the development of the pronoun system in this population. She argues that this delay is not due to language development but to a lack of cognitive flexibility.

“Psycholinguistics of Dementia: what is known about specific language and speech traits of Alzheimer’s disease?”, by Ivanova et al., is a review of studies on language impairment in dementia. The paper also proposes a new method to measure a prosodic aspect of language, intended to facilitate an early and efficient diagnosis of Alzheimer’s disease (AD). Patients with AD have lexico-semantic problems, due to the disruption of the networks of semantic knowledge. This causes a loss of concepts and other difficulties in lexical access. The authors argue that lexical-based assessment can predict the probability of the onset of AD, since lexical impairment usually causes anomic aphasia in the disease. In addition, with regard to prosody, patients with AD show shifts in stress or in the amount of sound, along with voice breaks and variations in rhythm or intonation. Finally, these authors analyze the voices of people with non-pathological and pathological ageing, in order to identify and extract the prosody-related acoustic and vocal parameters associated with the disease. Ivanova et al. found a high percentage of voiceless segments, a significant difference in the prosodic parameters, a reduced speech and articulation rate in reading, a low effectiveness of phonation time, an increasing number and proportion of pauses, and significant changes in the speech analyses in the voices of AD patients, compared with those of people undergoing non-pathological ageing.

Children with specific language impairment (SLI) and individuals with different types of primary progressive aphasia have been found to overuse light verbs, which have an underspecified semantic representation and an incomplete argument structure. Koukoulioti & Stavrakaki report the results of a naming and a comprehension task administered to four children with SLI, and a study on a sentence elicitation task administered to seven patients with semantic dementia. They found that the two groups produced a high percentage of light verbs when trying to name actions and that semantic dementia patients relied more on light verbs than the SLI children. The structure of the light verb constructions was preserved in these patients, but the semantic components were not as precise as in an unimpaired population.

The paper by Martins, Santos and Duarte, “Syntactic complexity in children with Autism Spectrum Disorder (ASD) and Specific Language Impairment (SLI)”, compares the performance of typically-developing (TD) children to those with SLI and ASD, in the production and comprehension of relative clauses in European Portuguese. The focus lies on subject and object relative clauses in simple clauses and in those with one further level of embedding, i.e. where the subject or object is extracted from an embedded relative clause. Production was tested by means of elicited imitation and the results showed that the SLI and ASD children displayed a lower performance than age-matched TD peers; in simple relatives the ASD children performed better than the SLI children and a subject-object asymmetry was not in evidence. A careful analysis of errors shed light on how the different groups dealt with the subject-object asymmetry and the level of embedding. Comprehension, in turn, was tested by a truth-value judgment task. The performance was similar to that of the production task, with some group-specific differences. In the TD group there was a more obvious subject-object asymmetry, as well as an effect of the level of embedding for the younger children. As for the SLI and ASD groups, it appears that the level of embedding affected the ASD group less. These differences in error types are an interesting result and are potentially indicative of a production-comprehension contrast.

Penke and Wimmer’s paper deals with explanations of the neural basis of language disorders like Broca’s and Wernicke’s aphasia and Parkinson’s disease.
The single mechanism approach assumes that regular and irregular inflected forms are generated by only one cognitive mechanism; the dualistic view holds that regular and irregular morphology have different underlying processes. According to the dualistic view, only irregular forms are stored and retrieved as fully inflected forms from the mental lexicon. If two different components underlie regular and irregular inflection, language disorders might affect the regular or irregular inflectional processes, depending on the areas of the brain that have been damaged. Broca’s aphasia and Parkinson’s disease are expected to result in difficulties with regular inflection, whereas patients with Wernicke’s aphasia are assumed to have difficulties with irregular inflected forms. The authors of this paper found that in Wernicke’s and Broca’s aphasics and patients with Parkinson’s disease, the error rates for irregular inflected participle and noun plural forms were higher than for regular participle and noun plural forms. These findings in German differ from the predictions drawn from the current models on the representation and processing of inflection in the brain: Broca’s aphasics and patients with Parkinson’s disease were not selectively impaired in producing regular inflected forms and were even able to overapply regular inflection to irregularly inflecting stems. The different error rates observed in English versus German are attributed to differences in the inflectional systems of these languages. These authors propose that the three groups of patients were suffering from processing difficulties.

The paper by Varela presents a model of coherence production and reception, which was tested by analyzing the speech of a patient with moderate dementia of the Alzheimer’s type. The model of coherence proposes four steps for uttering coherently: (i) the intention of our statement; (ii) the cognitive factors that can influence the quality and appropriateness of our statement; (iii) the pragmatic factors to produce a meaning given the context; and (iv) linguistic factors so that we can produce and understand our utterance correctly. The comprehension process follows this route: (i) processing the stimulus with our cognitive abilities; (ii) considering the context we are surrounded by; (iii) decoding the message according to this context; (iv) getting the communicative intention. The patient under study showed many pragmatic deficits in her discourse, difficulties with naming and with complex syntactic structures, and difficulties with lexical retrieval and the management of shared knowledge. Personal orientation deficits caused problems in processing the participants of the conversation, and time orientation deficits were responsible for the lack of relevance. This model could be useful as a tool for identifying coherence deficits and developing speech therapies.

Vereda, González-Sánchez and Taillefer present a case study of a patient suffering from acquired Broca’s aphasia with acquired surface dyslexia. Taking into account that the subject was a Spanish-dominant bilingual, they checked whether acquired surface dyslexia was present in both Spanish and English. In order to find out if there was any significant difference between the two languages in terms of reading, they conducted lexical decision and reading tasks. The authors found a highly significant difference in spelling-sound regularity for exception words in English (p < 0.01). Significance was also found for regularly inflected words, showing an influence from Spanish pronunciation for regular inflected verbs in the past tense. This influence was also observed when words were similarly spelled in both languages.

The present volume deals with research on language processing and disorders presented at the Experimental Psycholinguistics Conference in Madrid. It covers topics ranging across syntax processing, second-language acquisition, lexical processing and language disorders. We would like to thank the plenary speakers (Roelien Bastiaanse, Manuel Carreiras, Volker Dellwo, Teresa Parodi, Jeannette Schaeffer) and the members of the scientific and organizing committees (Joanna Blaszczak, Denisa Bordag, Carlo Cechetto, Albert Costa, Naama Friedmann, Anna Gavarró, Nina Jeanette Hofferberth, Olga Ivanova, Victoria Marrero, Silvia Martínez-Ferreiro, Eva Moreno, Robert Reichle, Jeannette Schaeffer, Eva Soroli, Ellen Thompson, Spyridoula Varlokosta and Monica Wagner). We are also very grateful to UNED for their support in organizing this edition of the conference.

PART ONE: LANGUAGE PROCESSING

REACTION TIME AS A MEASURE OF IMPLICIT GRAMMATICALITY JUDGMENT

MISHA BECKER¹

Abstract

The purpose of this paper is to introduce an experimental methodology to measure children’s implicit judgments of grammaticality or acceptability; that is, to assess their judgments of well-formedness without requiring them to supply a metalinguistic grammaticality judgment. The methodology uses Reaction Time (RT) to gauge a child’s degree of “surprise” at hearing some linguistic form. The premise of this methodology is that it will take a speaker longer to respond to an ungrammatical prompt (in the implementation here, a yes/no question) than a grammatical one. Although RT is a familiar and long-used methodology in studies of language processing and other domains of cognition, it has not been used previously in the way outlined below. This paper is structured as follows. In section 1, I present the rationale for using RT to measure grammaticality judgment in children: I explain why existing methodologies of assessing children’s grammatical knowledge are insufficient and why RT succeeds where other methods fail. In section 2, I describe the methodology in detail, including how it is carried out and how the results can be analyzed. I also present some preliminary results obtained using this methodology. Finally, I address some open questions and directions that I hope this work will take in the future.
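As a rough illustration of the premise stated in the abstract (and not the analysis used in the studies themselves), the comparison can be sketched as a difference in mean RT between grammatical and ungrammatical prompts. The trial values, condition labels and function names below are invented for illustration; they are assumptions, not data or procedures from the paper.

```python
from statistics import mean

# Hypothetical trials: (condition, reaction time in milliseconds).
# These numbers are invented for illustration; they are NOT data from the study.
TRIALS = [
    ("grammatical", 820), ("grammatical", 760), ("grammatical", 910),
    ("ungrammatical", 1130), ("ungrammatical", 980), ("ungrammatical", 1220),
]


def mean_rt_by_condition(trials):
    """Group reaction times by condition and return the mean RT for each."""
    grouped = {}
    for condition, rt in trials:
        grouped.setdefault(condition, []).append(rt)
    return {condition: mean(rts) for condition, rts in grouped.items()}


def surprise_effect(trials):
    """Mean ungrammatical RT minus mean grammatical RT.

    A positive value means responses were slower after ungrammatical
    prompts, which is the pattern the RT premise predicts.
    """
    means = mean_rt_by_condition(trials)
    return means["ungrammatical"] - means["grammatical"]


means = mean_rt_by_condition(TRIALS)
effect = surprise_effect(TRIALS)
print(means, effect)
```

A real analysis would work with many trials per child and would need an inferential test rather than a raw difference in means; the sketch only shows the direction of the comparison.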

¹ Misha Becker, University of North Carolina, Chapel Hill. E-mail: [email protected]. These results have been published in Becker (2014, 2015a, b); thus the contribution of this particular paper is not the results themselves but a focus on how the studies were conducted.


1. Rationale

1.1. Grammaticality Judgment

Modern linguistic theory seeks to develop a formal model of the knowledge of a language that a speaker has in his or her mind, and to understand how that knowledge is structured and acquired. For many linguists, the real point of interest is the tacit body of knowledge known as “linguistic competence” (Chomsky, 1965). Linguistic competence is taken to be a quasi-pure form of our linguistic knowledge, unmarred by problems that can interfere with the actual usage of language (“performance”), including slips of the tongue and other errors of speech production and comprehension.

In the study of adult grammatical knowledge, a common approach has been to rely on native speaker judgments about sentence well-formedness or acceptability (Chomsky, 1965; Newmeyer, 1983; Schütze, 1996; Cowart, 1997). As observed by Schütze (1996), there are a number of advantages of using this method to learn about linguistic knowledge over simply observing naturalistic speech production: native speaker judgments give us information about constructions that are grammatical but only rarely used in natural speech; they allow us to examine speakers’ knowledge about ungrammatical forms; and, given that people do make speech errors in production, obtaining grammaticality judgments can help us to tease apart forms that actually conform to the speaker’s mental grammar from forms that may in fact be slips of the tongue.

However, this methodology has been criticized along several dimensions (Hiramatsu, 2000). A number of researchers have questioned the utility and accuracy of data collected via metalinguistic judgment as a means of drawing inferences about speakers’ underlying competence (Birdsong, 1989; Wasow & Arnold, 2005). Others have noted variability across speakers in the kinds of judgments given (Labov, 1975).
Snyder (2000) cautions against the problem of syntactic satiation, whereby certain syntactic violations (notably whether-island violations and complex NP violations) begin to sound more acceptable after multiple presentations of these ungrammatical sentences (cf. Sprouse, 2009 for an alternative account of the satiation effects obtained by Snyder). And there is disagreement over how native speaker judgments should be elicited, i.e. through a simple binary judgment (OK/not OK), Magnitude Estimation (Bard et al., 1996; Featherston, 2004), the Likert Scale, or forced-choice between two sentences. Concerns about this methodology are exacerbated when we consider using it with young children. Of primary concern is whether, and at what


age, children are capable of providing reliable metalinguistic grammaticality judgments. Metalinguistic awareness typically develops during the early school years as children learn other metalinguistic skills such as reading and writing (Gindes, 1980; Menyuk, 1981). A number of researchers have developed training protocols that can be used to successfully elicit grammaticality judgments in children ages 4 and older, with success occasionally extending to precocious 3-year-olds (McDaniel & Cairns, 1990, 1996). The training involves drawing children’s attention to their own language and perhaps different languages they have been exposed to, engaging them in discussion about how language works (e.g. that it has words, we use words in particular ways, etc.), and finally demonstrating some ill-formed sentences and pointing out that these sentences “sound funny” or are said in “the wrong way”. Using this type of procedure, researchers have successfully elicited children’s grammaticality judgments concerning pronoun reference, subjacency, wh-movement, relative clauses and subject-auxiliary inversion (McDaniel, Cairns & Hsu, 1990; Stromswold, 1990; Maxfield & McDaniel, 1991; McDaniel, Chiu & Maxfield, 1995; McDaniel & McKee, 1995, among others). As I will argue, however, reliance on children’s metalinguistic judgments is insufficient for studying particular areas of their developing grammars, notably their acquisition of abstract predicates and their argument structure properties.

1.2. Abstract Predicates

The area of grammar acquisition I am interested in is how children determine the argument structure properties of novel predicates. I am particularly interested in how children determine these properties for very abstract predicates, like seem or easy. With concrete predicates, such as eat, run, or pink, there is likely to be some experiential basis for determining at least the core aspects of the word’s lexical semantics, and this can help constrain the likely argument structure requirements or syntactic properties of the predicate, roughly along the lines suggested by the Semantic Bootstrapping Hypothesis (Pinker, 1989). For example, if a learner can determine, through hearing the verb run used in conjunction with running events, that the verb run denotes a self-initiated motion, then the learner might expect this verb to occur with an Agent subject, and perhaps with a locative adjunct, but he or she would not expect it to occur with a sentential complement.


(1) John ran (to/in the park).
(2) *John ran that he was tired.

Conversely, as demonstrated by a wealth of studies in the Syntactic Bootstrapping literature (Gleitman, 1990; Gleitman et al., 2005, among many others), knowing the argument structure properties of a predicate can help narrow down the lexical semantics of the word. Thus, if a learner hears an unknown verb in the sentence frame in (3), he or she might assume that the verb denotes a motion but not a mental state. But upon hearing an unknown verb in the sentence frame in (4), the same learner is more likely to surmise that the verb denotes a mental state or a verb of communication, rather than a motion (Kako, 1998; Snedeker & Gleitman, 2004).

(3) John verbed (to/in the park).
(4) John verbed that he was tired.

Unlike more concrete predicates, however, for certain abstract predicates like seem and easy, neither of these sources of information is straightforwardly available to learners. Since these predicates denote, precisely, abstract states and properties, they are not directly observable through the environment, so that experiential sources of information about lexical meaning are compromised. More importantly, in my view, these predicates overlap in their surface distribution with other subclasses of predicates that have very different argument structure properties (see Becker, 2006; Becker & Estigarribia, 2013, for discussion of why this presents a learning puzzle). As is well known, raising verbs like seem occur in some of the same surface strings as control verbs, such as claim (Davies & Dubinsky, 2004).

(5) John seems [t to be friendly.]
(6) John claims [PRO to be friendly.]

In a parallel fashion, tough-adjectives, like easy, occur in some of the same sentence strings as control adjectives, like eager (Lees, 1960; Chomsky, 1964) (t indicates a trace of movement, while e indicates an empty position without movement).

(7) John is easy [PRO to please t]
(8) John is eager [PRO to please e]


Despite the surface similarity, these subclasses of predicates differ in important ways. Within movement-based approaches to syntax, the subject in (5) is claimed to undergo movement from the embedded subject position, while the subject in (6) is base-generated in the main clause and “controls” the reference of the embedded subject, PRO (Chomsky, 1980; Collins, 2005). Though somewhat more contentious, the surface subject of (7) has also been analyzed as raising into that position from a position in the embedded clause (Rosenbaum, 1967; Brody, 1993; Hicks, 2009), whereas, like (6), the surface subject of (8) is base-generated in the main clause. While these analyses rely upon particular syntactic assumptions about movement and derivation, the argument structural differences between these predicates do not hinge on whether or not one’s syntactic analysis involves movement. In terms of argument structure, raising verbs and tough-adjectives do not select an Agent or Experiencer subject argument, and therefore can occur with an expletive or inanimate subject, while control predicates (claim, eager) do select such an argument and therefore disallow expletives.

(9) It seems/*claims to be cloudy.
(10) It is easy/*eager to please John.

In previous work, I have discussed the theoretical issues surrounding the acquisition of these predicates (Becker, 2014, 2015a, b). In this paper, I will focus on a method for studying how children categorize them. That is, how does a child determine that a given predicate in their language has the properties of seem or easy, as opposed to claim or eager? In order to answer this question, we first need a way to determine how a child has categorized a predicate of this sort. One approach would be to put a predicate the child has just learned into a sentence with an expletive subject (cf. (9-10)) and ask if the child finds the sentence acceptable.

(11) It gorps to be raining.
(12) It is daxy to please John.
If sentences (11-12) are acceptable, then the predicates gorp and daxy do not select an Agent or Experiencer subject. On the other hand, if these sentences are unacceptable, then these predicates require a semantic subject and do not tolerate an expletive. As noted in section 1.1, some researchers have successfully elicited metalinguistic grammaticality judgments from children (McDaniel &


Cairns, 1996). However, I have been unsuccessful in eliciting such judgments about sentences like (11-12) from children as old as 8 or 9 years. My hypothesis is that with these types of syntactic constructions, children have difficulty with the metalinguistic aspect of the task, and that if we could assess their implicit judgment of acceptability we would be able to learn about how they categorize these abstract predicates.

2. Methodology

The question raised immediately is how to assess children’s implicit knowledge. To do this, I will tap into something that developmental scientists have known about and exploited for many decades: babies exhibit surprise when they are faced with something unexpected. Exploiting this surprise response, researchers have been able to learn about babies’ categorical perception as well as other aspects of cognitive development, by measuring their sucking rate or heart rate (Eimas et al., 1971). In addition, Spelke and her colleagues (Spelke, 1988, 1991; Spelke et al., 1995; Woodward et al., 1998) measured relative looking time as an indication of how infants expect objects to move through space. Babies look longer at a scene in which inanimate objects move by themselves compared to one in which the objects are pushed by something else, or in which people move by themselves. Using the same principle, Onishi & Baillargeon (2005) assessed 15-month-olds’ expectations about where an observer should look for a hidden object according to the knowledge the observer had about the object’s location. Babies looked longer at the scene in which a person searched for an object in a location incongruent with where they could have known the object to be. Measuring relative looking time has also been used in a variety of linguistic studies, in which experimenters measure how long children turn toward one sound source or another, as in the Head-Turn Preference Procedure (HPP; Werker et al., 1981; Werker & Tees, 1984; Jusczyk et al., 1992), or how long they look at one scene or another after hearing a linguistic prompt, as in the Intermodal Preferential Looking Paradigm (IPLP; Hirsh-Pasek & Golinkoff, 1981, 1996). These methodologies have been used successfully in hundreds of studies and reveal important facts about how babies perceive and interpret language. But they cannot be used to tell us how 3- and 4-year-olds judge the acceptability of sentences.
The HPP procedure cannot be used for this because it was developed specifically to be used with very young infants; the maximum age with which it has been reported to be used is 18 months (Santelmann et al.,


2003).2 Moreover, it measures babies’ perception of a difference between two stimuli, and a pair of sentences that differ in their grammaticality will necessarily be different, purely in terms of their surface form. Recognizing a difference between them does not tell us whether the child finds their grammatical representation different. Similarly, the IPLP cannot be used to assess a child’s judgment of the form of a sentence because, by its nature, it does not measure a child’s judgment about a linguistic form as such. Rather, it tells us how a child interprets the semantics of the sentence, that is, whether the sentence matches or does not match a scene. Thus, neither of these methodologies is suited to the problem at hand.

2.1 Reaction Time

Reaction Time has been widely used as a means of measuring how adult speakers process language (Rubenstein et al., 1970; Meyer & Schvaneveldt, 1971; Meyer et al., 1975; Bates & MacWhinney, 1993; McElree & Griffith, 1995, i.a.) and other aspects of cognition (Luce, 1986). It is related to the familiar “double-take” that occurs when we encounter an ungrammatical sentence or a garden path. This double-take often translates into a longer time to respond to a prompt, or to process the input. Capitalizing on both the fact that children indicate surprise when presented with something unexpected, and that ungrammatical input is unexpected and generally yields a slower response, my hypothesis is that we can assess whether a given sentence/question is grammatical or ungrammatical for children by measuring how long it takes them to respond to the prompt. I noted in the introduction that this methodology had not previously been used to assess children’s grammatical knowledge. In fact, RT has been used in a handful of studies on language acquisition, both on L1 (Corrigan, 1988; Naigles et al., 1995) and L2 (Bley-Vroman & Masterson, 1989). But these studies measured subjects’ time to perform a metalinguistic task, rather than to assess their degree of surprise. My objective is to eliminate the metalinguistic aspect of the grammaticality judgment task. Therefore, my proposal is to pose a series of yes/no questions, some grammatical and some ungrammatical, and measure how long it takes children to answer each question. The assumption is that, on average, ungrammatical questions should be answered more slowly compared to grammatical questions.

2 According to M. Soderstrom (personal communication), HPP can be used with babies up to 30 months of age, but there are no published studies with children of this age.

2.2 Procedure

First, I will describe the general procedure, including the recording and importing of data, and the analysis. Then I will discuss how the methodology works specifically in the case of novel abstract predicates. Children were videotaped with their faces fully visible. An external microphone recorded the sound. After watching a short video of toys or puppets interacting, children were asked two questions by a puppet. One of the questions should have sounded grammatical, and the other should have sounded ungrammatical due to an argument structure violation, as in the example in (13) (order of presentation was counterbalanced across items and across participants).

(13) A: Hey! The policeman is sleeping!
B: Really? The policeman is sleeping?
A: Yeah! The policeman is sleeping.
B: Wow! Is the nurse sleeping too?
A: No, the nurse is not sleeping.
Question 1: Is the nurse sleeping? (grammatical; target answer “no”)
Question 2: Is the policeman sleeping the nurse? (ungrammatical)

I recorded children’s responses and, although the dependent variable of interest is RT, not correctness, I also assessed children’s correctness for the grammatical questions. RT can be measured using a stopwatch, which is how it was measured in the study by Naigles, Fowler & Helm (1995). However, the stopwatch method is subject to human error, both in starting and stopping the clock. Even without making an error, humans exhibit a small lag in actually executing the start/stop action, which can add as many as a couple of hundred milliseconds to the recorded times. A more precise way to measure RT involves analyzing the video footage using the ELAN program. ELAN is software that can be used for annotating and analyzing audio and video data. It is freely available through the Max Planck Institute (http://tla.mpi.nl/tools/tla-tools/elan/). In order to analyze the data with ELAN, we created an iMovie file for each child and imported the question and answer session for each item into iMovie.
We then exported this “movie” using Quicktime to create a .mov file, and by changing the export setting to “sound to wave” we created a .wav file for the audio portion. With the audio (.wav) and video (.mov) files in place, after opening ELAN one can open a new file for each child and then add the media files


(.wav and .mov) as well as a template for annotating the recording. The template is not necessary but is helpful in coding the responses according to different types of stimuli. For our template we used the following items: a code for the kind of stimulus (warm-up vs. filler vs. target), the kind of predicate (for target items), whether the question was grammatical or ungrammatical, and whether the child answered yes or no. Once the ELAN project is opened, the coder can view the video of the session and the soundwave of the audio file, as well as hear the sound simultaneously while watching the video. The coder can then select (highlight) the interval from the end of the experimenter’s question until the beginning of the child’s answer, and annotate that selection with items from the template. The waveform can be helpful in identifying with precision the end of the experimenter’s speech and the beginning of the child’s response. Researchers must decide in advance how to deal with gestural (nonverbal) responses. We chose to include these responses, coding them such that the beginning of the child’s response was the point where the child’s head movement was an unambiguous nod (for yes) or shake (for no). Figure 1 shows a screenshot of what a portion of the video and audio data looks like in ELAN. Once the entire file has been coded, the annotations can then be exported as tab-delimited text so that they can be viewed in Excel or another spreadsheet program, in which one can see the start time, end time and duration of the interval (in milliseconds), along with all of the information that was coded using the template. The exported data can be combined into a single spreadsheet file and, if each file is labeled with an identifier for the participant, the participant ID will be associated with each line of data. Durations can be easily converted to log10 transformed durations, as is standard in analyzing RT data.
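The export and log-transformation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (`participant`, `begin_ms`, `end_ms`, `stimulus`, `grammatical`, `answer`) and the sample rows are hypothetical, and the layout of a real ELAN export depends on the tiers and export options chosen.

```python
import csv
import io
import math

# Hypothetical column names and toy rows; adjust them to match the
# tab-delimited file that ELAN actually produces for a given template.
SAMPLE_EXPORT = (
    "participant\tbegin_ms\tend_ms\tstimulus\tgrammatical\tanswer\n"
    "P01\t12000\t13000\ttarget\tG\tyes\n"
    "P01\t20000\t30000\ttarget\tUG\tno\n"
)

def load_rt_rows(text):
    """Parse tab-delimited annotations and add duration and log10 RT."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        # The annotated interval runs from the end of the question
        # to the beginning of the child's answer.
        duration_ms = int(row["end_ms"]) - int(row["begin_ms"])
        if duration_ms <= 0:
            continue  # log10 is undefined for non-positive durations
        row["duration_ms"] = duration_ms
        row["log10_rt"] = math.log10(duration_ms)
        rows.append(row)
    return rows

rows = load_rt_rows(SAMPLE_EXPORT)
for r in rows:
    print(r["participant"], r["grammatical"], r["duration_ms"], round(r["log10_rt"], 3))
```

Keeping the participant identifier on every row, as the text recommends, makes it straightforward to combine the files from all children into one table before analysis.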


Figure 1. Screenshot of selected interval in ELAN.



2.3 Results

I have conducted two experiments using this methodology, one with children aged 4 to 7 years and the second with children aged 3 to 4 years (both are described in Becker, 2015a). First, I present the results of the warm-up and filler items to show that the basic premise of the methodology holds: children generally take longer to answer ungrammatical questions than grammatical ones. Figure 2 shows 4 to 7-year-old children’s RT in answering grammatical and ungrammatical questions involving the familiar verbs play and borrow (e.g. Did the farmer play with the car? vs. *Did the farmer play the car to his friend?) and two novel verbs, one used intransitively (ballop) and the other transitively (zorp).

Figure 2. Experiment 1: children aged 4-7 years. G = Grammatical, UG = Ungrammatical. [Bar chart: RT (log10) by lexeme.]


For the verbs play, borrow and ballop (novel intransitive verb) the contrast between RT to grammatical and ungrammatical questions was significant by an analysis of variance (with standard error adjusted for multiple observations within participants): play mean estimate x̄ = 0.3309, F2 = 19.20, p < 0.001; borrow mean estimate x̄ = 0.3439, F2 = 32.40, p < 0.001; ballop mean estimate x̄ = 0.3551, F2 = 26.23, p < 0.001. The one exception to the general result is the item zorp, which was taught as a novel transitive verb (an example sentence from the dialogue is You have to zorp the bell; it’s very fragile). The difference in RT to grammatical and ungrammatical questions for this item was not significant (mean estimate x̄ = 0.0272, F2 = 0.18, p = 0.6741). It is unclear why children showed no difference in their RT to the grammatical and ungrammatical questions using this verb, particularly since work with much younger children shows that they easily draw correct inferences about the argument structure frames of transitive and intransitive novel verbs (Yuan & Fisher, 2009). Nevertheless, the pattern of taking longer to answer ungrammatical questions is quite robust for all of the other items. Figure 3 shows 3 and 4-year-olds’ RTs in answering warm-up questions with the familiar verbs pet (transitive; e.g. Is the girl petting the cat? vs. *Is the boy petting?) and sleep (intransitive; e.g. Is the nurse sleeping? vs. *Is the policeman sleeping the nurse?). In all cases, both age groups were significantly slower in answering the ungrammatical questions. Pooling across the two lexical items, for 3-year-olds, mean estimate x̄ = -0.170, F2 = 5.57, p = 0.02; for 4-year-olds, mean estimate x̄ = -0.1960, F2 = 8.67, p < 0.01.
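The comparison between grammatical and ungrammatical questions can be illustrated with a small descriptive sketch. It is not the analysis of variance reported above: it only averages log10 RTs within each participant and condition and takes the difference, which is one simple way to respect the fact that each child contributes several observations. The data and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy observations: (participant, condition, log10 RT).
observations = [
    ("P01", "G", 3.0), ("P01", "G", 3.2), ("P01", "UG", 3.5),
    ("P02", "G", 2.9), ("P02", "UG", 3.1), ("P02", "UG", 3.3),
]

def mean_difference_by_participant(obs):
    """Return {participant: mean UG log10 RT minus mean G log10 RT}."""
    cells = defaultdict(lambda: defaultdict(list))
    for participant, condition, log_rt in obs:
        cells[participant][condition].append(log_rt)
    return {
        p: mean(c["UG"]) - mean(c["G"])
        for p, c in cells.items()
        if c["G"] and c["UG"]  # skip participants missing a condition
    }

differences = mean_difference_by_participant(observations)
overall = mean(differences.values())
print(differences)
print(round(overall, 3))
```

Because the values are on a log10 scale, a difference of d between two conditions corresponds to a ratio of 10**d between the geometric means of the raw RTs, which can make the size of an effect easier to interpret.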

Figure 3. Experiment 2: children aged 3-4 years. G = Grammatical, UG = Ungrammatical. [Bar chart: RT (log10) by age and lexeme (3-pet, 4-pet, 3-sleep, 4-sleep).]

As noted above, these figures show only that this methodology can inform us about children’s judgments of grammaticality involving familiar words and simple argument structures (transitive, intransitive). But the rationale I gave for needing this non-metalinguistic methodology was that metalinguistic tasks do not work with more complex, abstract and novel predicates. Let us now turn to these results. In the two experiments I conducted, children were taught novel adjectives that take an infinitive complement (e.g. John is daxy to please) by presenting them with a conversation (within a video scenario) in which the novel adjective was used five times. In some conditions within Experiment 1, the video conversation provided enough semantic context to allow children to infer that a given predicate had properties of either


tough-adjectives or control adjectives. For example, in some scenarios the novel adjective was applied to an individual who was either happy or excited to do something for someone else. In other contexts, the novel adjective was applied to a situation, for example drawing a person or object. If children were able to categorize the novel adjectives as belonging to the easy-class (tough-adjectives) or the eager-class (control adjectives), they should have been able to judge whether the adjective was grammatical or ungrammatical in sentences that allowed only one of these types of adjectives. That is, if the child categorized an adjective as a tough-adjective, then they should have judged that adjective to sound OK with an expletive subject (Is it daxy to draw a flower?). Crucially, the children had not encountered any expletive constructions in the dialogue. (14) presents the dialogue that accompanied the adjective daxy.

(14) Nurse: Hi Mrs. Farmer, I'd like to draw a picture, but I'm no good at drawing. Can you help me?
Mrs. Farmer: Sure, Nurse! First, you need to find something to draw. Look here: here's a flower. A flower is daxy to draw. Let me see if I can draw it. There! I did it! Now you try.
Nurse: OK, let me see... Oh, I can't do it! It didn't come out right. Let me try drawing that tree over there. I bet a tree is daxy to draw.
Mrs. Farmer: Wait, trees are not very daxy to draw. They have so many little branches and leaves. Here, try drawing this apple instead. Apples are very daxy to draw.
Nurse: OK. Hey look, I did it! Here's my drawing.
Mrs. Farmer: Good job!
Nurse: You were right: you have to find something that's daxy to draw when you're just learning.
(a) Is it daxy to draw a flower? (target: yes)
(b) *Is the tree daxy?
Figure 4 shows a subset of the results from 4 to 7-year-olds (the full results are given in Becker 2015a), showing that in conditions where children should have been able to infer that the adjective was a tough-adjective, they were faster in answering questions of the form Is it daxy to draw a flower? than questions of the form Is the tree daxy? where the latter construction does not admit tough-adjectives (*Is the tree easy?).


The difference in RT to the two types of questions was highly significant: mean estimate x̄ = 0.36833, F2 = 49.43, p < 0.0001.

Figure 4. Subset of results from novel adjectives, Experiment 2. [Bar chart: RT (log10) in the tough-adjective condition for questions of the form “Is it daxy to draw...?” and “Is the tree daxy?”]

3. Discussion

In experimental studies of any kind, it is imperative that the researcher be able to discriminate between genuine and spurious responses. Spurious responses may be given when the participant is not paying attention but answers anyway, or when the participant does not know the answer but guesses. With RT data, a common approach is to assume that any extremely fast or extremely slow responses are not genuine and should be thrown out. A cut-off that is often used is 2 standard deviations above or below the mean for a particular item (Baayen & Milin, 2010; Racine,


2013). One could also use an absolute cut-off, such as 100ms for a lower bound, and 5 seconds for an upper bound. I believe that when working with children one should be more generous in giving them time to respond; in my own work I have used 12 seconds as an upper bound. I would argue that there are times when one should be even more flexible, and use additional evidence to gauge whether a child’s answer was genuine or not. Some aspects of the data that should be considered are: (a) whether the child is overall slow or fast to respond; (b) whether the child gives other indications of paying attention (or not paying attention); if their face is clearly visible in the video one could take clues from the direction of their eye gaze; (c) whether the child answers correctly on the grammatical questions. I have used correctness of answering grammatical questions as a criterion for inclusion. Thus, children who responded incorrectly to too many grammatical questions (a reasonable minimum is 75% correct) are excluded altogether. When this measure is taken, the number of incorrect responses in the remaining data is small and eliminating them further does not change the overall results (it did not in the two experiments described above). The reason I believe one must sometimes allow extremely long RTs in an experiment like this is that the whole premise of this methodology is that one can “stump” people by asking them an ungrammatical question, and a person should be unsure how to answer a question if they are indeed stumped. In my experience running both experiments, there were times when a child was clearly on-task and paying attention (as evidenced by their correct and relatively faster responses to grammatical questions) but sat there for as many as 20 seconds before responding to an ungrammatical question. But if the lengthy pause before answering is due precisely to the ungrammaticality of the question, then such a data point is genuine and should not be thrown out. 
Thus, researchers must exercise caution and use multiple types of evidence in deciding whether to include or exclude particular data points. Using Reaction Time to measure implicit grammaticality judgment is new, and many questions remain open. I have not yet tested this methodology on adult controls, and this is a necessary step in validating the procedure. I am planning to test this methodology on a wider range of stimuli, including other types of argument structure errors and other types of ungrammaticality (errors of agreement, tense, aspect, perhaps wh-movement, overt vs. null subjects, etc.). Additionally, since one of the motivations for developing this methodology is to use it with children who may be too young to provide metalinguistic grammaticality judgments in


general (not only for novel abstract predicates), I am planning to test this methodology using simpler syntactic structures with even younger children, such as 2-year-olds.
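The trimming and inclusion criteria discussed in this section can be made concrete with a short Python sketch. It is a minimal illustration, not the procedure actually used in the experiments: the function names and the toy data are hypothetical, while the bounds (100 ms, 12 seconds), the 2 standard deviation cut-off and the 75% accuracy minimum are taken from the text.

```python
import statistics

LOWER_MS = 100     # absolute lower bound mentioned in the text
UPPER_MS = 12_000  # more generous upper bound used with children

def passes_accuracy(correct_flags, minimum=0.75):
    """True if the share of correct answers to grammatical questions
    reaches the minimum (75% by default)."""
    return bool(correct_flags) and sum(correct_flags) / len(correct_flags) >= minimum

def trim_absolute(rts_ms, lower=LOWER_MS, upper=UPPER_MS):
    """Keep only RTs that fall inside the absolute bounds."""
    return [rt for rt in rts_ms if lower <= rt <= upper]

def trim_by_sd(rts_ms, n_sd=2.0):
    """Keep RTs within n_sd sample standard deviations of the item mean."""
    if len(rts_ms) < 2:
        return list(rts_ms)  # a standard deviation needs at least two values
    item_mean = statistics.mean(rts_ms)
    item_sd = statistics.stdev(rts_ms)
    return [rt for rt in rts_ms if abs(rt - item_mean) <= n_sd * item_sd]

# Hypothetical RTs (in ms) for one item across children.
item_rts = [50, 900, 1100, 1000, 950, 1050, 20_000]
print(trim_absolute(item_rts))
print(passes_accuracy([True, True, True, False]))
```

Since a long pause before an ungrammatical question may be exactly the effect of interest, responses flagged by either trimming function are better treated as candidates for manual review against the video than as automatic exclusions.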

References

Baayen, H. & P.R. Milin. 2010. Analyzing reaction times. International Journal of Psychological Research 3, 12–28.
Bates, E. & B. MacWhinney. 1993. Processing a language without inflection: A reaction time study of sentence interpretation in Chinese. Journal of Memory and Language 32, 169–192.
Becker, M. 2006. There began to be a learnability puzzle. Linguistic Inquiry 37, 441–456.
Becker, M. 2014. The Acquisition of Syntactic Structure: Animacy and Thematic Alignment. Cambridge: Cambridge University Press.
Becker, M. 2015a. Animacy and the acquisition of tough-adjectives. Language Acquisition. DOI: 10.1080/10489223.2014.928298.
Becker, M. 2015b. Learning structures with displaced arguments. In Andreas Trotzke and Josef Bayer (eds.) Syntactic Complexity across Interfaces. Berlin: Mouton de Gruyter.
Becker, M. & B. Estigarribia. 2013. Harder words: Learning abstract verbs with opaque syntax. Language Learning and Development 9, 211–244.
Bley-Vroman, R. & D. Masterson. 1989. Reaction time as a supplement to grammaticality judgments in the investigation of second language learners’ competence. University of Hawai’i Working Papers in ESL, volume 8, 207–237.
Brody, M. 1993. Theta-theory and arguments. Linguistic Inquiry 24, 1–23.
Chomsky, C. 1969. The acquisition of syntax in children from 5 to 10. Research Monograph 57. Cambridge, MA: MIT Press.
Chomsky, N. 1964. Current issues in linguistic theory. The Hague: Mouton.
Chomsky, N. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1980. On binding. Linguistic Inquiry 11, 1–46.
Collins, C. 2005. A smuggling approach to raising in English. Linguistic Inquiry 36, 289–298.
Corrigan, R. 1988. Children’s identification of actors and patients in prototypical and nonprototypical sentence types. Cognitive Development 3, 285–297.
Davies, W.D. & S. Dubinsky. 2004. The grammar of raising and control. Malden, MA: Blackwell.
Eimas, P., E. Siqueland, P. Jusczyk & J. Vigorito. 1971. Speech perception in infants. Science 171, 303–306.
Gindes, M. 1980. The development of metalinguistic knowledge and its relation to reading acquisition. Paper presented at the Conference on the Language of the Young Child: Frontiers of Research, Brooklyn College, City University of New York.
Hicks, G. 2009. Tough-constructions and their derivation. Linguistic Inquiry 40, 535–566.
Kako, E. 1998. The event semantics of syntactic structures. Ph.D. thesis, University of Pennsylvania.
Lees, R.B. 1960. A multiply ambiguous adjectival construction in English. Language 36, 207–221.
McDaniel, D. & H. Cairns. 1996. Eliciting judgments of grammaticality and reference. In D. McDaniel, C. McKee & H. Smith Cairns (eds.) Methods for Assessing Children’s Syntax, 233–254. Cambridge, MA: MIT Press.
McElree, B. & T. Griffith. 1995. Syntactic and thematic processing in sentence comprehension: Evidence for a temporal dissociation. Journal of Experimental Psychology 21, 134–157.
Menyuk, P. 1981. Language development and reading. In J. Flood (ed.) Understanding Reading Comprehension. Newark, NJ: International Reading Association.
Naigles, L., A. Fowler & A. Helm. 1995. Syntactic bootstrapping from start to finish with special reference to Down Syndrome. In M. Tomasello & W.E. Merriman (eds.) Beyond Names for Things: Young Children’s Acquisition of Verbs, 299–330. Mahwah, NJ: Lawrence Erlbaum Associates.
Onishi, K. & R. Baillargeon. 2005. Do 15-month-old infants understand false beliefs? Science 308, 255–258.
Racine, J.P. 2013. Reaction time methodologies and lexical access in applied linguistics. Vocabulary Learning and Instruction. Advance online publication. doi: 10.7820/vli.v03.1.racine
Rosenbaum, P. 1967. The grammar of English predicate complement constructions. Cambridge, MA: MIT Press.
Santelmann, L., M. Soderstrom, P. Jusczyk & A.M. Jusczyk. 2003. 18-month-olds’ sensitivity to discontinuous dependencies over long verbs. In D. Houston, A. Seidl, G. Hollich, E. Johnson & A. Jusczyk (eds.) Jusczyk lab final report. http://hincapie.psych.purdue.edu/Jusczyk.
Snedeker, J. & L.R. Gleitman. 2004. Why it is hard to label our concepts. In D. Geoffrey Hall & Sandra Waxman (eds.) Weaving a Lexicon. Cambridge, MA: MIT Press.
Spelke, E. 1988. Where perceiving ends and thinking begins: The apprehension of objects in infancy. In A. Yonas (ed.) Perceptual Development in Infancy, volume 20 of Minnesota Symposium on Child Psychology. Mahwah, NJ: Lawrence Erlbaum.
Spelke, E. 1991. Physical knowledge in infancy: Reflections on Piaget’s theory. In S. Carey & R. Gelman (eds.) The Epigenesis of Mind: Essays on Biology and Cognition, 133–170. Mahwah, NJ: Lawrence Erlbaum Associates.
Spelke, E.S., A. Phillips & A.L. Woodward. 1995. Infants’ knowledge of object motion and human action. In D. Sperber, D. Premack & A.J. Premack (eds.) Causal Cognition: A Multidisciplinary Debate. Oxford: Clarendon Press.
Yuan, S. & C. Fisher. 2009. “Really? She blicked the baby?” Two-year-olds learn combinatorial facts about verbs by listening. Psychological Science 20, 619–626.

PROBABILISTIC PHONOTACTICS IN VOCABULARY ACQUISITION DURING READING IN A NATIVE LANGUAGE DENISA BORDAG, MARIA ROGAHN, AMIT KIRSCHENBAUM AND ERWIN TSCHIRNER1

Abstract

The study addresses the role of phonotactic probability in incidental vocabulary acquisition during reading by adult native speakers of German. On the one hand, phonologically regular words are more easily stored in memory than irregular ones (e.g. Ellis and Beaton 1993). On the other hand, readers must direct their attention to the new word in order to acquire it incidentally, and low phonotactic probability could contribute to this noticing process (Schmidt 2012). Seventy-two participants first read short texts that contained novel words with either high or low phonotactic probability substituted for low-frequency German nouns. A range of tasks (self-paced reading, episodic recognition with repetition priming, lexical decision with semantic priming, Vocabulary Knowledge Scale) was administered in order to assess various aspects of lexical knowledge. While the tasks that explored meaning acquisition revealed an advantage for the words with low phonotactic probability, the tasks exploring word form acquisition did not show acquisition differences between the two types of novel words. The findings indicate that low phonotactic probability contributes to readers’ noticing new words but that their focus is primarily on making sense of the novel words’ meaning in order to succeed in global text comprehension. The results justify a diversified approach to lexical knowledge acquisition and have important implications for reading studies.

¹ Denisa Bordag, Maria Rogahn, Amit Kirschenbaum and Erwin Tschirner are affiliated with the Herder Institute, Leipzig University. The research reported in this paper was supported by a grant from the Deutsche Forschungsgemeinschaft to Denisa Bordag (German Research Council, DFG BO 3615/2-1). We thank Thomas Pechmann for his helpful comments and Teres Zacharias and Marcel Fuchs for their help with preparing the experiments and recruiting the participants. Correspondence concerning this article should be addressed to Denisa Bordag, Herder Institute, Leipzig University, 04107 Leipzig, Germany. E-Mail: [email protected]

1. Introduction

Whereas child and second language vocabulary acquisition entail very obvious, dramatic gains in language competence that make them the centre of attention, adult native language vocabulary acquisition mostly occurs unnoticed, both by learners and researchers. However, although the core of a language’s vocabulary is acquired within the first decade of language acquisition, the acquisition process continues throughout adulthood, albeit typically with a focus on words with a low frequency or associated with specific jargons or registers (cf. Nation 2001, 20).

The acquisition of even a single word is a complex process that can proceed over an extended period of time and comprises several aspects. According to Nation (2001, 27), there are three basic categories of lexical knowledge that are involved in knowing a word: a) knowledge of form (spoken, written, word parts); b) knowledge of meaning (concept, referents, associations); c) knowledge of use (grammatical use, collocations, restrictions). Each of these categories also includes a distinction between receptive and productive vocabulary that indicates lexical competence.

The word acquisition process is cumulative, both with respect to quantity (sometimes referred to as the breadth of vocabulary knowledge) and quality (referred to as the depth of knowledge). While there are acknowledged standardized procedures to measure the breadth of vocabulary knowledge (Nation 2001), measuring the depth of word knowledge poses more difficulties. Though it is obvious that not all aspects of word knowledge can be acquired instantly, little is known about the progress of their acquisition. Most vocabulary tests measure later stages of vocabulary acquisition and focus primarily on meaning acquisition (e.g. translation tasks, generating a definition, multiple-choice tests).
The Vocabulary Knowledge Scale (Paribakht and Wesche 1997; Wesche and Paribakht 1996) represents one of the few attempts to include more aspects of lexical knowledge in testing, including knowledge of word form alone and grammatical use. In addition, it also attempts to assess the acquisition process.


However, there are many acquisition aspects that off-line tasks like those mentioned above cannot address. Introspection-based methods and/or methods that test explicit knowledge are not sensitive enough to measure the very initial stages of the acquisition process, during which the learners may not even be aware of the first memory traces that have already been left by the new words. They also do not allow insights into developing interactions with other (established) semantic representations. As noted by Borovsky, Elman, and Kutas (2012, 280) and other authors, “these are useful measures of word learning in its final stages but are relatively reticent about earlier stages of learning, when the learner’s knowledge is not stable and/or robust enough to drive such overt behaviours”. However, the ability to assess the initial stages of word acquisition is critical, especially in incidental vocabulary acquisition. While the information about the new word is to some degree (word form, meaning, grammatical properties) typically and explicitly available in intentional learning (through, for example, instruction or deciding to learn words from a dictionary or a lexicon), incidental acquisition requires inference of the new word's meaning and its other properties. It occurs as a concomitant phenomenon during the pursuit of another activity (e.g. reading or listening) and results in a new lexical entry that is added to the mental lexicon without the reader’s/listener’s explicit intent to commit it to memory. The lexical entry, however, emerges gradually through the reader’s repeated exposure to the same word in various contexts. Already, the first occurrence(s) of a word in a text may trigger the establishment of the initial representation(s) but the new memory traces might still be too weak to be measured by the aforementioned methods. 
In the present study, we employ a range of tasks that attempt to gauge the depth of vocabulary knowledge at an early acquisition stage and use the information revealed by these tests to explore incidental vocabulary acquisition, a mode through which most of the 20,000 word families that a competent native speaker knows are acquired (Nation 2001, 9). From school age onwards, it is incidental acquisition during reading that especially represents a veritable well of word learning (Nagy, Anderson, and Herman 1987; Nagy, Herman, and Anderson 1985; Sternberg 1987). However, most texts are read with the goal of understanding their overall meaning rather than each individual word. Rieder (2002a, 33, 79) argues that gaps in the mental model of the text's meaning are typically necessary for readers to shift their attention from the text level to the word level and to initiate word acquisition by focusing on an individual word, taking note of its form and integrating its inferred meaning within the existing semantic network. A focus on an individual word is also central to the
Noticing Hypothesis (NH). It defines noticing as “the conscious registration of attended specific instances of language” (Schmidt 2012, 32) and argues that some conscious registration is necessary for language learning (Schmidt 1990, 2001, 2012; but see Godfroid, Housen, and Boers 2010; Godfroid and Schmidtke 2013; Robinson 2003). The NH is often coupled with the assumption that “if some aspects of language are noticed before others, or indeed noticed at all, it is because they are 'salient' in their context” (Carroll 2006, 18). An example of a linguistic factor that has been shown or assumed to trigger the focus on an unknown word and thus foster its acquisition is the frequency of its occurrence or the level of its importance for the text meaning (cf. Hulstijn, Hollander, and Greidanus 1996, 328). In the present study, we investigate the role of one specific factor in the incidental acquisition of new words: the prominence of their word form (cf. also Rieder 2002a; Rieder 2002b). The factor is operationalized as phonotactic probability of novel letter strings. Phonotactic probability (i.e. the relative frequencies of segments and their sequences in words, Vitevitch and Luce 1999) has been studied with respect to the processing of auditory information (Gathercole, Willis, Emslie, and Baddeley, 1991; Gathercole, 1995; Vitevitch and Luce 1999, 2005) and also, exceptionally, as a factor in the intentional word learning of children (Storkel and Lee 2011) and adults (Storkel, Armbruster, and Hogan 2006). While spoken word recognition and production is facilitated by high phonotactic probability, i.e. common sounds and sequences (Munson, Swenson, and Manthei 2005; Newman and German 2005; Vitevitch, Armbruster, and Chu 2004; Vitevitch and Luce 1998, 1999), the results of Storkel, Armbruster, and Hogan’s studies indicate an advantage for low phonotactic probability words in child and adult word learning. 
However, Storkel and colleagues used tasks that either explicitly asked participants to remember the novel words they were hearing or that made the participants anticipate a post-test with a recall of the novel words. In addition, all of the aforementioned studies focused on the auditory modality. To our knowledge, there are no previous studies exploring the role of phonotactic probability in vocabulary acquisition during reading. Therefore, our study aims to address the unresolved question of how phonotactic structure affects adult word acquisition in an incidental setting during reading.² For this purpose, we use several psycholinguistic methods that have been shown to be sensitive enough to also provide information about recently established representations. We focus on the inference and establishment of meaning (Self-Paced Reading Task (SPR); Vocabulary Knowledge Scale (VKS)), its integration within the existing semantic network (semantic priming) and the establishment of the new word form (repetition priming, VKS). The results presented here were obtained as part of a larger study, the results of which have partly been reported elsewhere (Bordag, Rogahn, Kirschenbaum and Tschirner 2016).

² We use the term “phonotactic probability” because in languages with rather shallow orthography (like German), orthographic probability can be considered an approximation of phonotactic probability. We also want to place our study within a research context that has developed from spoken word research (see studies cited above).

2. Task 1: Text Reading and Self-Paced Reading

The aim of this task was to explore the influence of phonotactic probability on the inference and short-term storage of word meaning.

2.1 Method

2.1.1 Participants

The participants of the study were 72 native speakers of German who were students at the University of Leipzig. Their mean age was 24.8 years and they were paid for their participation.

2.1.2 Materials

A) Texts

The experimental texts were 20 short passages (ca. 100 words in length) which allowed inference of the meaning of unknown words that appeared in them (one in each text). The unknown words were novel words (i.e. pseudowords) with a high or low phonotactic probability. They acted as placeholders for low-frequency concrete German nouns (e.g. “Wams”, ‘doublet’, a piece of clothing). The texts were of average complexity and had been written with the help of dictionary definitions of the low-frequency German nouns, as well as of statistical co-occurrences using the DWDS and the Leipzig Wortschatz Projekt corpora. Eight additional filler texts were presented. These contained only existing German words with medium frequency.

B) Novel words

Twenty pairs of pseudowords were constructed with each pair corresponding to a single, low-frequency, concrete German noun (average frequency class of 19, according to the classification of the Leipzig
Wortschatz Projekt database³) that was used as a template for the meaning. One of the pseudowords of each pair had high, and the other low, phonotactic probability. Pseudowords were used to ensure that participants did not have prior knowledge of the vocabulary in focus. This approach has often been applied in research on L2 vocabulary acquisition (Hulstijn 1992, 1993; Pulido 2003, 2004). A post-test revealed that the low-frequency nouns were, for the most part, unknown to the participants, implying that, from their perspective, they were reading rare, unknown German words, possibly with new concepts.

Two indicators were used to create and select the high and low phonotactic probability (HPP vs. LPP) novel words (pseudowords) (see Table 1):

1) Probability of the novel words as sequences of characters in the German language. First, words in 500,000 sentences (obtained from the Wortschatz Projekt) were modelled based on n-gram (here, bigram and trigram) distributions. The n-gram model reflected not only the relative frequency of character sequences but also, crucially, their position in the word (beginning/middle/end), thus implicitly capturing the structure of the words in the language. The probability of a character sequence was then computed as the product of its corresponding n-gram transitional probabilities.

2) Ratings in a typicality pre-test were used to verify whether the novel word candidate pairs that had been selected on the basis of phonotactic probability were perceived as atypical or typical by 15 native speakers, who scored them on a six-point scale from 0 (“looks very atypical”) to 5 (“looks very typical”).

Finally, based on the two measures described above, 20 novel words were selected that adhered to the phonotactic rules of German and that consisted of bigrams and trigrams with relatively high transitional probabilities, and which also achieved relatively high ratings in the typicality pre-test (average rating was 3.38, see Table 1).
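The first indicator can be illustrated with a minimal sketch. It is not the authors' model: it assumes a tiny hypothetical word list in place of the 500,000 Wortschatz sentences, uses bigrams only, encodes position by tagging each bigram as word-initial, word-medial or word-final, and applies no smoothing. All function names and the toy data are illustrative assumptions.

```python
from collections import Counter
from math import prod


def positional_bigrams(word):
    """Yield (position, first_char, second_char) for every bigram in a word."""
    word = word.lower()
    last = len(word) - 2  # index of the word-final bigram
    for i in range(len(word) - 1):
        if i == 0:
            position = "begin"
        elif i == last:
            position = "end"
        else:
            position = "middle"
        yield position, word[i], word[i + 1]


def train(words):
    """Count position-tagged bigrams and their position-tagged left contexts."""
    bigram_counts, context_counts = Counter(), Counter()
    for word in words:
        for position, first, second in positional_bigrams(word):
            bigram_counts[(position, first, second)] += 1
            context_counts[(position, first)] += 1
    return bigram_counts, context_counts


def string_probability(word, model):
    """Product of the transitional probabilities P(second | first, position)."""
    bigram_counts, context_counts = model
    probabilities = []
    for position, first, second in positional_bigrams(word):
        context = context_counts[(position, first)]
        if context == 0:  # unseen context; this sketch does no smoothing
            return 0.0
        probabilities.append(bigram_counts[(position, first, second)] / context)
    return prod(probabilities)


# Hypothetical toy word list standing in for the corpus.
TOY_WORDS = ["wams", "wand", "wald", "hand", "band", "land"]
model = train(TOY_WORDS)
print(string_probability("wand", model))  # built from well-attested transitions
print(string_probability("wamd", model))  # contains an unattested final bigram
```

In this toy setting a string made of frequent, position-appropriate transitions receives a higher probability than one containing a rare or unattested transition, which is the contrast the HPP/LPP selection relies on.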
³ Publicly accessible via www.wortschatz.uni-leipzig.de. A frequency class of 19 means that the most common word, “der”, occurs about 2^19 times as often.

Another 20 novel words were selected that also respected the phonotactic rules of German but that had distinctive and unusual features found in only very few German word forms (or in a German dialect). These features were inspired by existing German words and included unusual word-initial consonant clusters, for example /gn/ as in “Gnade” (‘mercy’) or /gm/ as in “Gmüs” (Bavarian for “Gemüse”, ‘vegetables’); unusual word-initial letters, e.g. “Ä” (as in “Ähre”, ‘ear of a plant’); word-final vowels “i”, “u”, “o” (as in “Kanu”, ‘canoe’); as well as unusual syllable structure
(e.g. reduplication). They were rated by the participants as unusual (average rating: 0.97). Only pseudowords that scored more than 0 (“does not look like a German word at all”) in the average typicality rating of the participants were selected. The difference in rating between typical and atypical matches was 2.6 points, on average.

The LPP and HPP novel words were matched pairwise according to their phonotactic and other properties. Each LPP novel word had at most one tenth of the probability of its HPP match. The words of each pair (LPP and HPP) were assigned the same gender (compliant with phonological or semantic gender cues). Other potential factors which might affect the acquisition of the novel words (e.g. length, number of syllables, neighborhood density and pronounceability) were kept as constant as possible across the two types of novel words. Neighborhood density was gauged through two measures: 1) Neighbors at distance one, i.e. how many phonological neighbors could be arrived at by changing one letter in the pseudoword. This measure was 0 or 1 for all selected pseudowords; 2) Orthographic Levenshtein distance 20 (OLD20), computed using the Wuggy pseudoword generator and the vwr R package (Keuleers and Brysbaert 2010; Keuleers 2011). This measure takes the 20 closest existing words in the lexicon and computes the mean number of insertions, deletions or replacements needed to reach the input word (see Yarkoni, Balota, and Yap 2008). Pronounceability was measured through a pronounceability rating (see Ellis and Beaton 1993) on a four-point scale.

Table 1. Pseudoword pairs used in experiments 1 and 2 and their properties (averages). Perceived typicality ratings were measured on a scale of 0 (“does not look like a German word at all”) to 5 (“looks like a typical German word”).

Properties                       HPP Pseudowords       LPP Pseudowords
Length                           6.25                  6.05
Typicality judgment              3.38                  0.97
Bigram probability               1.43771316 × 10^-7    5.62008 × 10^-10
Trigram probability              4.46363116 × 10^-7    5.853202 × 10^-9
Neighborhood density (ND 1)      0.3                   0.05
Neighborhood density (OLD 20)    3.10                  3.11
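The OLD20 measure described above can be sketched in a few lines. This is an illustrative re-implementation under stated assumptions, not the vwr or Wuggy code used in the study: the lexicon is a hypothetical mini word list, and the function names and the parameter `n` (20 in the study) are chosen for the example.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def old_n(target, lexicon, n=20):
    """Mean Levenshtein distance from target to its n closest lexicon entries."""
    distances = sorted(levenshtein(target, word) for word in lexicon if word != target)
    closest = distances[:n]
    if not closest:
        raise ValueError("lexicon contains no comparison words")
    return sum(closest) / len(closest)


# Hypothetical mini lexicon; a real computation would use a full word list.
MINI_LEXICON = ["wald", "wand", "hand", "kanu"]
print(old_n("wams", MINI_LEXICON, n=2))
```

A lower value means that the string has close orthographic neighbors, so near-identical values for the two pseudoword sets (3.10 vs. 3.11 in Table 1) indicate that neighborhood density was matched.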


C) Self-Paced Reading Sentences

Twenty pairs of Self-Paced Reading (SPR) sentences were created, each containing a noun phrase consisting of the novel word plus a compatible adjective (e.g. “the dirty Tschinkum (i.e. doublet)”) in the plausible condition, and an incompatible one (e.g. “the helpless Tschinkum (i.e. doublet)”) in the implausible condition. These adjectives had not appeared in the texts. The noun phrases were preceded by at least two words and followed by at least four words within the SPR sentences. The plausible and implausible SPR sentence pairs were identical except for the adjective used. The content of the sentences was related to the topics of the short texts. For each text, none, one, or two filler sentences were constructed that were also related to the subject of the text but consisted only of known words.

2.1.3 Procedure

The Self-Paced Reading Task was part of the learning phase and immediately followed the text reading. Participants were asked to try to comprehend the texts in order to be able to respond to statements focusing on the content. The statements, requiring a TRUE or FALSE response, followed the presentation of one to three sentences which were read in a Self-Paced Reading (SPR) manner (moving window). One of these sentences was always the critical sentence while the others were filler sentences related by topic. Each participant read each text only once, either with an LPP or an HPP novel word, followed either by the semantically plausible or implausible SPR sentence. Thus, in order to form a complete experimental design comprising four conditions for each item, there were four experimental lists and each participant was assigned to one of them. Four participants thus created a super-participant⁴ with a complete design. The texts were presented in a randomized order, subject to the constraint that no more than two texts from the same condition (HPP/LPP, plausible/implausible) occurred consecutively. For each experimental list there were two randomizations.
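The list construction and the ordering constraint can be sketched as follows. The sketch makes several assumptions that the text does not settle: "condition" is taken to mean one of the four cells of the 2 × 2 design, conditions are rotated across lists in a simple Latin square, and an admissible order is found by reshuffling until the constraint holds. Function names and the seed are hypothetical.

```python
import random

CONDITIONS = [("HPP", "plausible"), ("HPP", "implausible"),
              ("LPP", "plausible"), ("LPP", "implausible")]


def latin_square_lists(n_items=20):
    """Return four lists; list k maps each item index to a condition, rotated by k."""
    return [
        {item: CONDITIONS[(item + k) % len(CONDITIONS)] for item in range(n_items)}
        for k in range(len(CONDITIONS))
    ]


def max_run(sequence):
    """Length of the longest run of identical consecutive elements."""
    longest = run = 0
    previous = object()  # sentinel that equals no real element
    for element in sequence:
        run = run + 1 if element == previous else 1
        longest = max(longest, run)
        previous = element
    return longest


def constrained_order(assignment, rng, limit=2, max_attempts=10_000):
    """Shuffle the items until no condition occurs more than `limit` times in a row."""
    items = list(assignment)
    for _ in range(max_attempts):
        rng.shuffle(items)
        if max_run([assignment[item] for item in items]) <= limit:
            return list(items)
    raise RuntimeError("no admissible order found")


lists = latin_square_lists()
order = constrained_order(lists[0], random.Random(1))
print([lists[0][item] for item in order])
```

Across the four lists every item appears once in each condition, which is what allows four participants with complementary lists to be combined into one complete design.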

⁴ Analyses with super-participants/super-subjects are considered a “standard procedure” (Isel, Gunter, and Friederici 2003, 280) in experiments with a Latin square design, in which several participants with complementary lists are treated as a single data point for statistical analysis.

2.1.4 Results: Self-Paced Reading

The results of the Self-Paced Reading Task are reported in detail in Bordag et al. (2016). They revealed that participants inferred, and at least temporarily retained in memory, the meaning of both high and low salient novel words. Their reading latencies were significantly longer in the implausible (436.3 ms) than in the plausible (420.3 ms) condition (measured on the novel word itself (position n) and on the spill-over regions n+1, n+2, n+3, and n+4), indicating that participants reacted to the semantic incompatibility between the meaning of the adjective and the novel noun (see Table 2). At least a general representation of the novel noun meaning is a prerequisite for such a response. At the same time, the implausibility effect appeared earlier (already in the novel word itself) in the LPP condition, while it appeared only in the spill-over region n+1 in the HPP condition. This finding suggests a stronger meaning representation of the LPP novel words, which induced a more immediate reaction to the semantic incompatibility. The results of this task thus imply an inference and/or acquisition advantage for words with LPP.

Table 2. Mean reading times in ms for each condition at positions n, n+1, n+2, n+3, and n+4 in the Self-Paced Reading Task.

Phonotactic                   Position
Probability    Plausibility   n        n+1      n+2      n+3      n+4
HPP            Plausible      423.0    429.6    395.0    411.3    446.0
HPP            Implausible    423.9    452.2    419.2    423.5    452.9
LPP            Plausible      412.9    441.2    396.6    416.5    440.6
LPP            Implausible    442.8    451.3    419.7    432.4    447.5
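The position-wise implausibility effect discussed for the Self-Paced Reading Task can be read off Table 2 by subtracting the plausible from the implausible means. The following sketch does only this arithmetic on the tabulated means; it says nothing about statistical significance, and the variable and function names are illustrative.

```python
POSITIONS = ["n", "n+1", "n+2", "n+3", "n+4"]

# Mean reading times (ms) copied from Table 2.
READING_TIMES = {
    ("HPP", "plausible"):   [423.0, 429.6, 395.0, 411.3, 446.0],
    ("HPP", "implausible"): [423.9, 452.2, 419.2, 423.5, 452.9],
    ("LPP", "plausible"):   [412.9, 441.2, 396.6, 416.5, 440.6],
    ("LPP", "implausible"): [442.8, 451.3, 419.7, 432.4, 447.5],
}


def implausibility_effect(probability):
    """Implausible minus plausible mean reading time at each position (ms)."""
    plausible = READING_TIMES[(probability, "plausible")]
    implausible = READING_TIMES[(probability, "implausible")]
    return [round(slow - fast, 1) for fast, slow in zip(plausible, implausible)]


for probability in ("HPP", "LPP"):
    print(probability, dict(zip(POSITIONS, implausibility_effect(probability))))
```

For the tabulated means, the difference at position n is 0.9 ms for HPP but 29.9 ms for LPP, whereas at n+1 it is 22.6 ms for HPP and 10.1 ms for LPP, which matches the earlier onset of the effect reported for the LPP words.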


3. Task 2: Vocabulary Knowledge Scale⁵

The purpose of this test was to collect data about conscious and explicit decontextualized knowledge about the novel words and compare it with data from the more implicit previous task, in which participants needed to recognize the novel word forms and access the new meaning within the acquisition context.

3.1 Method

Materials and Procedure

The 20 novel words that the participants encountered during the learning phase were presented in a paper-and-pencil task in a randomized list, along with 16 filler items (10 existing German words and six pseudowords) that had not been part of the experimental session. The participants were asked to judge their knowledge of these words by choosing one of seven possible statements (see Table 3, adapted from Paribakht and Wesche 1993, 1997; Wesche and Paribakht 1996). The completion of this task took approximately 15 minutes.

3.2 Results: VKS

The results showed that participants were less likely to report never having seen an LPP word (11.9%) than an HPP word (19.9%), i.e. they were more likely to recall encountering the LPP words (see Table 3). The analyses with the binary variable “depth of vocabulary knowledge”, with knowledge of “form” (percentages relating to statement number 2) and knowledge of “meaning” (summed percentages relating to statements 3, 4, 5 and 6) as the two possible values, further revealed that whereas the recollection of form did not differ between the conditions, participants recalled the meaning of LPP words more often than that of HPP words (23.6% vs. 16.7%).

⁵ This task is reported second because its content is relevant to Task 1. In the experimental session, however, it was performed last.

Table 3. Proportion of Selected Statements for HPP and LPP Novel Words on the VKS (in percent). Depth of vocabulary knowledge as a binary factor of either recall of form (answer 2) or meaning (answers 3 to 6).

No.      Statement                                                           LPP     HPP
1        I’ve never seen this word before.                                   11.9    19.9
2        I’ve seen this word, but I don’t know its meaning.                  64.6    63.4
3        I think this word means…                                            18.6    12.4
4        I know that this word means…                                        3.2     2.8
5        I know this word very well and can use it in a sentence.            0.9     0.0
6        I also know the gender of the word and/or its alternating forms.    0.9     1.5

Depth of Vocabulary Knowledge
2        Form                                                                64.6    63.4
3 to 6   Meaning                                                             23.6    16.7

The results of the vocabulary knowledge scale thus support the results of the Self-Paced Reading Task in showing that incidental acquisition of the meaning of new words profits from LPP. On the other hand, no evidence was found that LPP or HPP word forms are recalled better if the meaning is not available.
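The binary "depth of vocabulary knowledge" factor is a simple aggregation of the statement percentages in Table 3: form is statement 2, meaning is the sum of statements 3 to 6. The sketch below reproduces that arithmetic from the tabulated values; the data structure and function name are illustrative.

```python
# Percentages per VKS statement, copied from Table 3.
VKS_PERCENT = {
    1: {"LPP": 11.9, "HPP": 19.9},  # never seen
    2: {"LPP": 64.6, "HPP": 63.4},  # seen, meaning unknown
    3: {"LPP": 18.6, "HPP": 12.4},  # "I think this word means..."
    4: {"LPP": 3.2, "HPP": 2.8},    # "I know that this word means..."
    5: {"LPP": 0.9, "HPP": 0.0},    # can use it in a sentence
    6: {"LPP": 0.9, "HPP": 1.5},    # also knows gender / alternating forms
}


def depth_of_knowledge(condition):
    """Collapse statements 2 and 3-6 into the form and meaning categories."""
    form = VKS_PERCENT[2][condition]
    meaning = round(sum(VKS_PERCENT[s][condition] for s in (3, 4, 5, 6)), 1)
    return {"form": form, "meaning": meaning}


for condition in ("LPP", "HPP"):
    print(condition, depth_of_knowledge(condition))
```

The sums come out at 23.6% (LPP) and 16.7% (HPP) for meaning, as reported in the last row of Table 3, while the form values differ by only 1.2 percentage points.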

4. Task 3: Semantically Primed Lexical Decision Task

The goal of this task was to explore the integration of the novel words’ semantic representations into the existing semantic network.

4.1 Method

4.1.1 Materials

The 20 novel words were used as primes and paired with 20 semantically-related targets to create a semantically-related condition (e.g. Tschinkum (= Wams, “doublet”) - Weste (“vest”)). In order to create a semantically-unrelated condition, the same targets were paired with semantically-unrelated concrete nouns. In addition to the critical trials with existing
words and novel words as primes and targets, additional filler trials requiring “yes” responses were added that contained both pseudowords and words as primes.⁶ An equal number of trials requiring “no” responses (i.e. with pseudowords as targets) were added to create a balanced design.

4.1.2 Procedure

Each session started with a practice block of 10 trials. Afterwards, 292 trials followed in four blocks (73 in each block). The length of the pauses that separated the blocks was determined individually by the participants. Each trial started with a fixation sign (200 ms), followed by a prime (400 ms), after which a target appeared. Participants had to make a decision about its lexical status by pressing a YES or NO button. After participants pressed the response button or after a maximum time window of 1500 ms, the stimulus disappeared and the next trial started after an inter-stimulus interval (blank screen) of 500 ms. Primes and targets were coded with different colors. Participants were instructed to make their lexical decision only for the (green) target words/pseudowords and ignore the primes. The task took approximately 15 minutes. The order in which the items appeared on the screen was individually pseudo-randomized for each participant. A maximum of three items of the same status (i.e. semantically related, unrelated, fillers of different types) and with the same intended answer (YES, NO) were allowed to appear in succession.
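As a rough plausibility check on the trial structure, the timing parameters given in the procedure imply an upper bound on the pure stimulus time of a session. The sketch assumes that every trial runs to the full 1500 ms response window and ignores the self-paced pauses between blocks and any instruction screens; the constant and function names are illustrative.

```python
FIXATION_MS = 200       # fixation sign
PRIME_MS = 400          # prime duration
MAX_RESPONSE_MS = 1500  # maximum response window for the target
ISI_MS = 500            # blank screen between trials

PRACTICE_TRIALS = 10
MAIN_TRIALS = 292       # four blocks of 73 trials


def max_trial_ms():
    """Longest possible duration of a single trial in milliseconds."""
    return FIXATION_MS + PRIME_MS + MAX_RESPONSE_MS + ISI_MS


def max_session_minutes(n_trials):
    """Upper bound on stimulus time in minutes, excluding pauses between blocks."""
    return n_trials * max_trial_ms() / 60_000


print(max_trial_ms())
print(round(max_session_minutes(PRACTICE_TRIALS + MAIN_TRIALS), 1))
```

Under these assumptions a trial lasts at most 2.6 s and the 302 trials at most about 13 minutes, which is compatible with the reported duration of approximately 15 minutes once pauses are added.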

4.2 Results for the Semantically-Primed Lexical Decision Task

Observations were excluded from the analyses due to cut-off or because they were incorrect. In addition, responses to the items that participants did not recognize in the VKS test were excluded. Altogether, 1286 (23.5%) observations were excluded.

⁶ Some of these trials created conditions that are not of direct relevance for the present paper.

Table 4. Reaction times in the semantically-primed lexical decision task for trials with semantically-related and unrelated pairs of novel words used as primes and existing word targets in the two phonotactic probability conditions (LPP words vs. HPP words).

             LPP              HPP
Unrelated    586.8 (77.7%)    585.0 (75.2%)
Related      607.2 (77.9%)    600.1 (75.3%)
Diff         +17.4            +15.1

A 2 x 2 ANOVA with the factors relatedness and phonotactic probability revealed that the novel words exhibited an inhibitory effect in the semantically-related condition (F1 (1, 17) = 68.1, p < .001, F2 (1, 18) = 19.9, p < .001). Response latencies were longer when the target was preceded by a semantically-related novel word prime than when it was preceded by an unrelated existing word. This effect was evident irrespective of the phonotactic probability of the novel words. It indicates that the novel words were integrated into the semantic network, as they interacted with the existing semantic representations. However, the semantic effect goes in the opposite direction to the semantic facilitation effect that has typically been observed for semantically-related words with firmly established representations (Perfetti, Landi, and Oakhill 2005; Breitenstein, Zwitserlood, de Vries, Feldhues, Knecht, and Dobel 2007). A similar inhibitory effect for newly established, weak representations was observed in our previous experiments (Bordag et al. 2014) and led to a conclusion that follows a neurally inspired theory by Walley and Weiden (1973), further developed by Dagenbach and colleagues (Carr and Dagenbach 1990; Dagenbach, Carr, and Barnhardt 1990; Dagenbach, Horst, and Carr 1990). It assumes a center-surround inhibitory mechanism that enables retrieval of weakly represented meanings of new words that have just been added to semantic memory, by decreasing the activation of stronger representations with which they are linked, i.e. of their semantically-related competitors (see Bordag et al. 2014 for a more detailed clarification).
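The F1 and F2 statistics reported here correspond to analyses over participants and over items, respectively. The sketch below shows only the aggregation step that precedes such analyses: the same trial-level data are averaged once per participant and once per item. The trial records are hypothetical, no ANOVA is computed, and the function names are illustrative. The 17 denominator degrees of freedom in F1 would be consistent with 18 super-participants (72 participants in groups of four), although the text does not state this explicitly.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial-level records (reaction times in ms).
TRIALS = [
    {"participant": 1, "item": "A", "relatedness": "related", "rt": 600.0},
    {"participant": 1, "item": "B", "relatedness": "unrelated", "rt": 580.0},
    {"participant": 2, "item": "A", "relatedness": "unrelated", "rt": 590.0},
    {"participant": 2, "item": "B", "relatedness": "related", "rt": 620.0},
]


def cell_means(trials, unit):
    """Mean RT per (unit, relatedness) cell; unit is 'participant' or 'item'."""
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial[unit], trial["relatedness"])].append(trial["rt"])
    return {cell: mean(rts) for cell, rts in cells.items()}


def relatedness_effects(trials, unit):
    """Related minus unrelated mean RT for every participant or item.

    Assumes each unit contributes at least one trial to both conditions.
    """
    means = cell_means(trials, unit)
    units = sorted({key[0] for key in means})
    return {u: means[(u, "related")] - means[(u, "unrelated")] for u in units}


print(relatedness_effects(TRIALS, "participant"))  # input to a by-participant (F1) test
print(relatedness_effects(TRIALS, "item"))         # input to a by-item (F2) test
```

A positive difference in such a table corresponds to the inhibitory pattern described above, in which related primes slow responses relative to unrelated ones.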

5. Task 4: Episodic Recognition Task with Repetition Priming

The purpose of this task was to explore the role of phonotactic probability in incidental word form acquisition.


5.1 Method

5.1.1 Materials

The 40 novel words from the 20 pseudoword pairs, along with 40 existing words, were presented to the participants as targets; 20 items in each group had not been seen by the participants in any of the previous tasks and therefore required a “no” response in the recognition task (“new” words). Eight of the existing words that required a “yes” answer in the recognition task were the key words in the filler texts (“old” words). The other 12 “old” existing words came from the body of the texts, were repeated two to three times, and were also used in the semantic priming experiment.

5.1.2 Procedure

At the beginning of the experiment, participants were instructed to press the “yes” button when they saw a word which they had encountered in the previous texts and the “no” button when they believed the word had not appeared earlier. Each target was paired either with an unrelated or an identical prime. Each participant saw each item only in one of these (unrelated vs. identical) conditions, so that two participants always built a complete experimental list in which each item was seen once in each condition. From each group of items (old pseudowords (novel words), new pseudowords, old existing words, and new existing words), half of the items appeared in the identical and the other half in the unrelated condition. All primes were existing words (except when they were novel words as repetition primes) and had the same length as the target. The experiment started with 10 practice items to familiarize the participants with the episodic recognition task. Then 80 trials followed in one block that started with two additional practice items. Each trial started with a fixation point (200 ms) followed by a prime in lower case for 400 ms and a target in upper case for 2000 ms. After the stimulus disappeared from the screen, feedback was given (“too slow!”, “correct!” or “false”, as well as the overall percentage of correct items).
Participants’ reaction times were measured and the answers scored for accuracy. The experiment took approximately eight minutes. The order in which the items appeared on the screen was individually pseudorandomized for each participant. A maximum of two items of the same status (old pseudowords (novel words), new pseudowords, old existing words, and new existing words), and a maximum of three items in each condition (unrelated vs. identical) could appear in succession. The pseudorandomized order was also controlled for conditions in which the
experimental items appeared in the previous experiment. A maximum of three LPP or HPP experimental items and a maximum of three words with the same answer (old or new), relatedness (identical, unrelated) and word type (novel, existing or pseudoword) were allowed to follow each other.

5.2 Results for the Episodic Recognition Task

A total of 1942 responses (35.5%) were excluded from the analyses based on the cut-off (two standard deviations, two iterations) or because they were incorrect. Because the responses to the old and new items were of different types (yes vs. no, respectively), the main analyses with the factors relatedness (identical vs. unrelated) and word type (pseudoword/novel word vs. existing word) were performed separately for these two groups of items (see Table 5).

Table 5. Reaction times (in ms) in the episodic recognition task and valid values (in percent) for the factors oldness (old vs. new), relatedness (unrelated vs. identical) and word type (pseudoword/novel word vs. existing word).

Relatedness   New                             Old
              Pseudoword      Existing Word   Pseudoword      Existing Word
Unrelated     892.3 (67.5%)   875.2 (70.8%)   869.4 (70.2%)   915.3 (71.5%)
Identical     747.8 (66.8%)   766.3 (67.4%)   663.7 (79.4%)   746.7 (76.0%)
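The exclusion criterion mentioned at the start of this section (a cut-off of two standard deviations, applied in two iterations) can be read as an iterative trimming procedure. The sketch below shows one plausible reading: in each pass, reaction times more than two sample standard deviations from the current mean are dropped. Whether the authors trimmed per participant, per condition or over all data, and whether they used the sample standard deviation, is not stated in the chapter; those choices and the function name are assumptions.

```python
from statistics import mean, stdev


def trim_rts(rts, n_sd=2.0, iterations=2):
    """Drop values beyond n_sd standard deviations from the mean, repeatedly."""
    kept = list(rts)
    for _ in range(iterations):
        if len(kept) < 2:
            break  # a standard deviation needs at least two values
        centre, spread = mean(kept), stdev(kept)
        kept = [rt for rt in kept if abs(rt - centre) <= n_sd * spread]
    return kept
```

The second pass matters because an extreme value inflates the standard deviation in the first pass and can thereby shield a milder outlier, which is only caught once the extreme value has been removed.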

The two pairs of ANOVAs with the factors relatedness and word type revealed a robust repetition priming for both the old (F1(1,17) = 279.2, p < .001; F2 (1,37) = 80.1, p < .001) and the new items (F1(1,19) = 136.6, p < .001; F2 (1,37) = 61.1, p < .01). Though there was a tendency towards an interaction between the factors relatedness and word type for both the old and the new items, it was significant only in F2 and not in F1. In addition, the factor word type was highly significant for the old items: F1(1,17) = 22.0, p < .001; F2(1,37) = 22.4, p < .001. Participants were 83ms faster when responding to the novel words than when responding to the existing words. This finding supports the assumption that participants noticed the novel word forms in the input. Despite the fact that both the novel words and the existing words appeared in the text passages and the previous


semantic priming experiment, participants responded faster to the novel words than to the existing words, indicating that they focused their attention on these words during the learning phase. Additional analyses of only the responses to the novel words/pseudowords with the factors relatedness, phonotactic probability and oldness confirmed that participants responded faster in the identical than in the unrelated condition (old: F1(1,17) = 173.4, p < .001, F2(1,19) = 122.6, p < .001; new: F1(1,17) = 62.4, p < .001, F2(1,19) = 43.3, p < .001) (see Table 6). The factor phonotactic probability was marginally significant only in F1 of the new items (F1(1,17) = 3.2, p = .091), indicating a weak tendency towards faster response times to the LPP strings (723.6 ms in the identical condition) than to the HPP strings (775.6 ms). The interaction between phonotactic probability and relatedness was not significant.

Table 6. Reaction times (in ms) in the episodic recognition task and valid values (in percent) for the factors oldness (old vs. new), relatedness (unrelated vs. identical) and phonotactic probability (LPP vs. HPP).

Relatedness   New                             Old
              LPP             HPP             LPP             HPP
Unrelated     893.1 (70.8%)   891.4 (64.3%)   864.0 (71.9%)   875.2 (68.4%)
Identical     723.6 (72.0%)   775.6 (61.7%)   657.6 (81.6%)   669.9 (77.2%)

In contrast to the findings from the other tasks, the results of the recognition task with repetition priming revealed no reliable differences between the priming effects of the HPP and LPP novel words: the repetition priming effect was of the same size (ca. 206 ms) for both types of stimuli, suggesting that neither type was better stored in memory. The tendency towards faster recognition of the LPP strings, together with the numerical tendency towards larger priming for the new LPP pseudowords (169.5 vs. 115.9 for HPP), indicates that strings with LPP might be identified faster in memory but does not indicate an acquisition advantage in the incidental acquisition setting. This result is in accordance with the VKS results, which also did not provide evidence of an acquisition advantage of LPP vs. HPP word forms.
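The priming effects cited in this paragraph can be recomputed from the cell means of Table 6 as the difference between the unrelated and the identical condition. The sketch below does only that arithmetic; the dictionary layout and names are illustrative. Because it starts from the rounded table means, it yields 115.8 ms for the new HPP strings rather than the 115.9 ms given in the text, presumably because the reported figure was derived from unrounded means.

```python
# Mean reaction times (ms) copied from Table 6.
TABLE6_RT = {
    ("new", "LPP"): {"unrelated": 893.1, "identical": 723.6},
    ("new", "HPP"): {"unrelated": 891.4, "identical": 775.6},
    ("old", "LPP"): {"unrelated": 864.0, "identical": 657.6},
    ("old", "HPP"): {"unrelated": 875.2, "identical": 669.9},
}


def priming_effect(cell):
    """Repetition priming = unrelated RT minus identical RT, in ms."""
    return round(cell["unrelated"] - cell["identical"], 1)


effects = {key: priming_effect(cell) for key, cell in TABLE6_RT.items()}

for (oldness, probability), effect in effects.items():
    print(f"{oldness} {probability}: {effect} ms")
```

For the old items the two effects differ by about one millisecond (206.4 vs. 205.3 ms), which is the basis for the statement that priming was of the same size for LPP and HPP novel words.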


6. General Discussion

In this series of experimental tasks, we explored the role of phonotactic probability in incidental acquisition of various aspects of lexical knowledge during reading by adult native speakers of German. Overall, the results reveal an initial acquisition advantage for the meaning of novel words with LPP. This advantage is evident both within the immediate learning context from which the readers inferred the meaning (Self-Paced Reading Task) and outside the learning context, when participants were asked to recall the meaning of the isolated novel words (VKS). While our study provides evidence that phonotactic probability affects the establishment of the new semantic representation itself, no support was found to suggest that it would also affect the integration of the new representations within the existing semantic network (semantic priming task). Newly added semantic representations of both low and high probability novel words exhibited a semantic effect of the same size. Similarly, we found no evidence that phonotactic probability would play a significant role in the establishment of the phonological/orthographic form, as evidenced both by the VKS and the episodic recognition task. The latter task, however, provided support for the assumption that the novel word forms were noticed and focused on, compared to other existing words that had appeared in the texts. Overall, the data show that a particular factor (in this case, the phonotactic probability of the novel word forms) affects incidental acquisition of various aspects of lexical knowledge to different degrees. This finding justifies a diversified approach to lexical knowledge when exploring the process of word acquisition. In addition, the results of our study show that the phonotactic probability of the word form affects adult L1 incidental acquisition during reading.
This finding has important methodological implications for studies on incidental vocabulary acquisition during reading. For instance, the well-known “Clockwork Orange” studies (Saragi 1978 in L1 English; Pitts, White, and Krashen 1989 in L2 English) attempted to create realistic reading conditions by employing authentic reading materials while at the same time relying on target words which were incontrovertibly unknown to the readers. To achieve this, the authors explored the incidental acquisition of Nadsat (Russian slang) words embedded in the novel. These words, as well as target words in similar studies (e.g. Pellicer-Sánchez and Schmitt 2010 used Ibo words), have highly unusual phonotactics, compared to the English common usage vocabulary that is used throughout the rest of the texts. Our results raise doubts about the validity and generalizability of the


outcomes of those studies, since their focus was on words with unusual phonotactics which do not represent a typical subset of words commonly learned during extensive reading. Our study indicates that phonotactic properties of novel words used in texts to explore incidental vocabulary acquisition need to be controlled for, because they affect the acquisition process (at least in its initial stages). It remains to be clarified why the incidental acquisition of meaning was positively affected by the LPP while no evidence was found of a similar profit for the acquisition of word forms with LPP. Storkel and colleagues (Storkel, Armbruster, and Hogan 2006; Storkel and Lee 2011) argue that the benefit for novel words with LPP arises due to the LPP words’ advantage in the first two stages of acquisition (i.e. triggering the word acquisition and formation of an initial representation) and does not affect integrating the new representation with existing ones, which complies with our results. The authors propose that the detection of a mismatch between the novel LPP word and any existing word form representations is easier because of its infrequent constituents, and that it is this detection of novelty that triggers the word acquisition and the formation of a new representation. However, Storkel refers primarily to the emergence of word form representation. In contrast, in our study we find no support for better acquisition of novel word forms with LPP, only of their meanings. We assume that the acquisition setting plays a crucial role. While Storkel and colleagues explored intentional word learning in an auditory mode, we focused on incidental acquisition during reading. The primary goal of our participants was thus not to learn new words but to understand the text’s global meaning. We assume that LPP did contribute to the noticing (Schmidt 2012) of an unknown word in a stream of known words in the text. 
Rather than initiating an attempt to commit the new form to memory (as would be expected in an intentional setting), the noticing during reading contributed to the participants’ efforts to figure out the novel word’s meaning, which was vital for their primary purpose (text understanding). In addition, there is also evidence suggesting that while LPP is advantageous for triggering lexical acquisition, it is detrimental to storing the word form. Gathercole and colleagues (Gathercole, Willis, and Baddeley 1991; Gathercole, Willis, Emslie, and Baddeley 1992) suggest that when a new word is stored, similar word forms are activated in long-term memory, which then help to process the word in phonological short-term memory. Without the support of similar word forms (which is more likely in the case of LPP words) the phonological loop may become more easily over-burdened. Gaskell and Dumay (2003a, b) and Storkel, Armbruster, and Hogan (2006) provide further support for this notion by hypothesizing that novel words with LPP are less likely to activate representations of existing words. Consequently, the aspect (LPP) that fosters noticing and thus triggers the process of establishing a new representation is the same aspect that negatively affects the storage of the word form. It could thus also be the case that the positive effect of LPP on triggering the establishment of a new word form representation is cancelled out by the negative effect that LPP might have on storing the new word form in memory. Clearly, more research is necessary to shed light both on the process of acquisition of various aspects of lexical knowledge and on the role of phonotactic probability in vocabulary acquisition in different contexts. Moreover, the present study focused only on the very initial stage of vocabulary acquisition. It would be of great interest to explore the two issues from a longer-term perspective, especially with respect to the role of knowledge consolidation.

References

Bordag, Denisa, Amit Kirschenbaum, Erwin Tschirner, and Andreas Opitz. 2015. “Incidental Acquisition of New Words during Reading in L2: Inference of Meaning and Its Integration in the L2 Mental Lexicon.” Bilingualism: Language and Cognition 18 (3): 372–390. doi:10.1017/S1366728914000078.
Bordag, Denisa, Amit Kirschenbaum, Maria Rogahn, and Erwin Tschirner. 2016. “The Role of Phonotactic Probability in L1/L2 Incidental and Intentional Vocabulary Acquisition.” Second Language Research 33 (2): 147–178.
Borovsky, Arielle, Jeffrey L. Elman, and Marta Kutas. 2012. “Once Is Enough: N400 Indexes Semantic Integration of Novel Word Meanings from a Single Exposure in Context.” Language Learning and Development 8 (3): 278–302.
Breitenstein, Caterina, Pienie Zwitserlood, Meinou H. de Vries, Christiane Feldhues, Stefan Knecht, and Christian Dobel. 2007. “Five Days versus a Lifetime: Intense Associative Vocabulary Training Generates Lexically Integrated Words.” Restorative Neurology and Neuroscience 25 (5): 493–500.
Carr, Thomas H., and Dale Dagenbach. 1990. “Semantic Priming and Repetition Priming from Masked Words: Evidence for a Center-Surround Attentional Mechanism in Perceptual Recognition.” Journal of Experimental Psychology: Learning, Memory, and Cognition 16 (2): 341.
Carroll, Susanne E. 2006. “Salience, Awareness and SLA.” In Proceedings of the 8th Generative Approaches to Second Language Acquisition Conference, edited by Mary Grantham O’Brien, Christine Shea and John Archibald, 17–24. Somerville, MA: Cascadilla Proceedings Project.
Dagenbach, Dale, Thomas H. Carr, and Terrence M. Barnhardt. 1990. “Inhibitory Semantic Priming of Lexical Decisions due to Failure to Retrieve Weakly Activated Codes.” Journal of Experimental Psychology: Learning, Memory, and Cognition 16: 328–40.
Dagenbach, Dale, Sonia Horst, and Thomas H. Carr. 1990. “Adding New Information to Semantic Memory: How Much Learning Is Enough to Produce Automatic Priming?” Journal of Experimental Psychology: Learning, Memory, and Cognition 16 (4): 581.
Ellis, Nick, and Alan Beaton. 1993. “Factors Affecting the Learning of Foreign Language Vocabulary: Imagery Keyword Mediators and Phonological Short-Term Memory.” The Quarterly Journal of Experimental Psychology 46 (3): 533–58.
Gaskell, M. Gareth, and Nicolas Dumay. 2003a. “Effects of Vocabulary Acquisition on Lexical Competition in Speech Perception and Production.” In Proceedings of the 15th ICPhS Conference, 1485–1488. Adelaide, Australia: Causal Productions.
—. 2003b. “Lexical Competition and the Acquisition of Novel Words.” Cognition 89 (2): 105–32.
Gathercole, Susan E. 1995. “Is Nonword Repetition a Test of Phonological Memory or Long-Term Knowledge? It All Depends on the Nonwords.” Memory & Cognition 23 (1): 83–94.
Gathercole, Susan E., Catherine Willis, and Alan D. Baddeley. 1991. “Differentiating Phonological Memory and Awareness of Rhyme: Reading and Vocabulary Development in Children.” British Journal of Psychology 82 (3): 387–406.
Gathercole, Susan E., Catherine S. Willis, Hazel Emslie, and Alan D. Baddeley. 1992. “Phonological Memory and Vocabulary Development during the Early School Years: A Longitudinal Study.” Developmental Psychology 28 (5): 887.
Godfroid, Aline, Alex Housen, and Frank Boers. 2010. “A Procedure for Testing the Noticing Hypothesis in the Context of Vocabulary Acquisition.” In Cognitive Processing in Second Language Acquisition, 169–97.
Godfroid, Aline, and Jens Schmidtke. 2013. “What Do Eye Movements Tell Us About Awareness? A Triangulation of Eye-Movement Data, Verbal Reports, and Vocabulary Learning Scores.” Noticing: L2 Studies and Essays in Honor of Dick Schmidt. Honolulu: University of Hawai’i at Manoa, National Foreign Language Resource Center.
Hulstijn, J.H. 1992. “Retention of Inferred and Given Word Meanings: Experiments in Incidental Learning.” In Vocabulary and Applied Linguistics, edited by P.J. Arnaud and H. Béjoint, 113–25. London: Macmillan.
—. 1993. “When Do Foreign-Language Readers Look Up the Meaning of Unfamiliar Words? The Influence of Task and Learner Variables.” The Modern Language Journal 77 (2): 139–47.
Hulstijn, J.H., Merel Hollander, and Tine Greidanus. 1996. “Incidental Vocabulary Learning by Advanced Foreign Language Students: The Influence of Marginal Glosses, Dictionary Use, and Reoccurrence of Unknown Words.” The Modern Language Journal 80 (3): 327–39.
Isel, Frédéric, Thomas C. Gunter, and Angela D. Friederici. 2003. “Prosody-Assisted Head-Driven Access to Spoken German Compounds.” Journal of Experimental Psychology: Learning, Memory, and Cognition 29 (2): 277.
Keuleers, Emmanuel. 2011. Vwr: Useful Functions for Visual Word Recognition Research (version 0.1). http://cran.r-project.org/web/packages.
Keuleers, Emmanuel, and Marc Brysbaert. 2010. “Wuggy: A Multilingual Pseudoword Generator.” Behavior Research Methods 42 (3): 627–33.
Munson, Benjamin, Cyndie L. Swenson, and Shayla C. Manthei. 2005. “Lexical and Phonological Organization in Children: Evidence from Repetition Tasks.” Journal of Speech, Language, and Hearing Research 48 (1): 108–24.
Nagy, William E., Richard C. Anderson, and Patricia A. Herman. 1987. “Learning Word Meanings from Context during Normal Reading.” American Educational Research Journal 24 (2): 237–70.
Nagy, William E., Patricia A. Herman, and Richard C. Anderson. 1985. “Learning Words from Context.” Reading Research Quarterly, 233–53.
Nation, Ian Stephen Paul. 2001. Learning Vocabulary in Another Language. Ernst Klett Sprachen.
Newman, Rochelle S., and Diane J. German. 2005. “Life Span Effects of Lexical Factors on Oral Naming.” Language and Speech 48 (2): 123–56.
Paribakht, T. Sima, and Marjorie Wesche. 1993. “Reading Comprehension and Second Language Development in a Comprehension-Based ESL Program.” TESL Canada Journal 11 (1): 9–29.
—. 1997. “Vocabulary Enhancement Activities and Reading for Meaning in Second Language Vocabulary Acquisition.” In Second Language Vocabulary Acquisition: A Rationale for Pedagogy, edited by James Coady and Thomas Huckin, 174–200. Cambridge: Cambridge University Press.
Pellicer-Sánchez, Ana, and Norbert Schmitt. 2010. “Incidental Vocabulary Acquisition from an Authentic Novel: Do Things Fall Apart.” Reading in a Foreign Language 22 (1): 31–55.
Perfetti, Charles A., Nicole Landi, and Jane Oakhill. 2005. “The Acquisition of Reading Comprehension Skill.” The Science of Reading: A Handbook, 227–47.
Pitts, Michael, Howard White, and Stephen Krashen. 1989. “Language Acquirers’.” Reading in a Foreign Language 5 (2): 271.
Pulido, Diana. 2003. “Modeling the Role of Second Language Proficiency and Topic Familiarity in Second Language Incidental Vocabulary Acquisition through Reading.” Language Learning 53 (2): 233–84.
—. 2004. “The Effect of Cultural Familiarity on Incidental Vocabulary Acquisition through Reading.” The Reading Matrix 4 (2). http://www.msu.edu/~pulidod/pulido3.pdf.
Rieder, Angelika. 2002a. “Beiläufiger Vokabelerwerb: Theoretische Modelle Und Empirische Untersuchungen.” Dissertation, Universität Tübingen. http://w210.ub.uni-tuebingen.de/volltexte/2002/646/.
—. 2002b. “A Cognitive View of Incidental Vocabulary Acquisition: From Text Meaning to Word Meaning.” Views 11 (1&2): 53–71.
Robinson, Peter. 2003. “Attention and Memory during SLA.” In The Handbook of Second Language Acquisition, edited by Catherine J. Doughty and Michael H. Long, 631–78. Blackwell Publishing Ltd.
Saragi, Thomas. 1978. “Vocabulary Learning and Reading.” System 6 (2): 72–78.
Saragi, Thomas, Ian Stephen Paul Nation, and Gerold Fritz Meister. 1978. “Vocabulary Learning and Reading.” System 6 (2): 72–78.
Schmidt, Richard. 1990. “The Role of Consciousness in Second Language Learning.” Applied Linguistics 11 (2): 129–58.
—. 2001. “Attention.” In Cognition and Second Language Instruction, edited by Peter Robinson, 3–32. Cambridge: Cambridge University Press.
—. 2012. “Attention, Awareness, and Individual Differences in Language Learning.” In Perspectives on Individual Characteristics and Foreign Language Education, 6:27–50.
Sternberg, Robert J. 1987. “Most Vocabulary Is Learned from Context.” The Nature of Vocabulary Acquisition, 89–105.
Storkel, Holly L., Jonna Armbruster, and Tiffany P. Hogan. 2006. “Differentiating Phonotactic Probability and Neighborhood Density in Adult Word Learning.” Journal of Speech, Language and Hearing Research 49 (6): 1175.
Storkel, Holly L., and Su-Yeon Lee. 2011. “The Independent Effects of Phonotactic Probability and Neighbourhood Density on Lexical Acquisition by Preschool Children.” Language and Cognitive Processes 26 (2): 191–211.
Vitevitch, Michael S., Jonna Armbrüster, and Shinying Chu. 2004. “Sublexical and Lexical Representations in Speech Production: Effects of Phonotactic Probability and Onset Density.” Journal of Experimental Psychology: Learning, Memory, and Cognition 30 (2): 514.
Vitevitch, Michael S., and Paul A. Luce. 1998. “When Words Compete: Levels of Processing in Perception of Spoken Words.” Psychological Science 9 (4): 325–29. doi:10.1111/1467-9280.00064.
—. 1999. “Probabilistic Phonotactics and Neighborhood Activation in Spoken Word Recognition.” Journal of Memory and Language 40 (3): 374–408.
—. 2005. “Increases in Phonotactic Probability Facilitate Spoken Nonword Repetition.” Journal of Memory and Language 52 (2): 193–204.
Walley, Roc E., and Theodore D. Weiden. 1973. “Lateral Inhibition and Cognitive Masking: A Neuropsychological Theory of Attention.” Psychological Review 80 (4): 284.
Wesche, Marjorie, and T. Sima Paribakht. 1996. “Assessing Second Language Vocabulary Knowledge: Depth Versus Breadth.” Canadian Modern Language Review 53 (1): 13–40.
Yarkoni, Tal, David Balota, and Melvin Yap. 2008. “Moving beyond Coltheart’s N: A New Measure of Orthographic Similarity.” Psychonomic Bulletin & Review 15 (5): 971–79.

THE CONSTRUAL HYPOTHESIS AND RELATIVE CLAUSE PROCESSING: THE EFFECT OF THE REFERENTIALITY PRINCIPLE IN BRAZILIAN PORTUGUESE

GITANNA BRITO BEZERRA AND MÁRCIO MARTINS LEITÃO¹

Abstract

Two experiments were performed in order to investigate the Referentiality Principle in Brazilian Portuguese. A self-paced reading task did not reveal an on-line effect of N2 referentiality on relative clause comprehension, with a general preference for N1 attachment even when the N2 was referential. However, a questionnaire study did show an influence of this non-structural information, with more N2 interpretations when the N2 was referential than when it was not. These on-line and off-line data are discussed in terms of the Construal Hypothesis, also considering some assumptions of the Good-Enough Theory.

1. Introduction

The Construal Hypothesis was put forward by Frazier & Clifton (1996) as a reformulation of the Garden Path Theory (Frazier, 1979; Frazier & Rayner, 1982). This theory assumes a parser that operates in an incremental and serial way, which means that in situations of syntactic ambiguity it immediately pursues just one structural analysis, according

¹ Corresponding author: Gitanna Bezerra, Federal University of Paraiba, 384 Anacleto Eloy, Quarenta, 58416-265 Campina Grande, Paraiba (Brazil). E-mail: [email protected]


to a general principle that predicts: “choose the first analysis available” (Frazier, 1987). Thus, the main assumptions made by this theory are that the syntactic analysis will not be delayed and that more than one syntactic analysis will not be considered in parallel. The nature of the first analysis available is evidenced under this theory through two main structural principles: Minimal Attachment and Late Closure. According to the first principle, the parser prefers the analysis that requires the fewest number of syntactic nodes, explaining the preferences of analysis observed in ambiguous sentences such as, “the horse raced past the barn (fell)” and “Sam hit the girl with a book”. The second principle, in turn, predicts that the parser must attach the new input into the linguistic material currently being processed, explaining the preferences of analysis found in ambiguous sentences such as, “while Mary was mending the sock fell off her lap”. The Late Closure, more specifically, would predict a systematic preference of analysis in an ambiguous sentence like “someone shot the servant of the actress who was on the balcony”, where there are two sites of attachment structurally available for the relative clause (the N1 “the servant” and the N2 “the actress”). Following the late closure, the N2 attachment should be immediately chosen by the parser. However, Cuetos & Mitchell (1988), in a seminal work on the processing of this type of syntactic ambiguity (relative clause associated with a complex noun phrase (complex NP)), reported that native speakers of Spanish preferred the N1 attachment, while native speakers of English preferred the N2 attachment. 
This initial divergence started a great discussion about the universality of late closure, and many hypotheses have been considered in order to explain the absence of a systematic preference of analysis during the processing of this type of structure (Cuetos & Mitchell, 1988; Gibson, Pearlmutter, Canseco-Gonzalez & Hickok, 1996; Fodor, 1998, 2002; Hemforth, Konieczny & Scheepers, 2000; Maia, Fernández, Costa & Lourenço-Gomes, 2006; Grillo & Costa, 2014; Hemforth, Fernandez, Clifton, Frazier, Konieczny & Walter, 2015). Frazier & Clifton (1996, 1997) proposed the construal hypothesis in this scenario and reinforced the relevance and universality of the late closure, but with one observation: this principle is not expected to operate in the processing of some types of phrases which include, for example, the relative clause relation just mentioned. Here, the construal hypothesis assumes a crucial distinction: primary phrases or relations vs. secondary phrases or relations. The first type includes the subject and main predicate of a finite clause and their complements and obligatory constituents, while the second type includes the phrases that cannot, even temporarily, be


analyzed as possible primary phrases. Roughly, the first type refers to arguments, which are required to satisfy the lexical description of lexical heads; the second type refers to adjuncts, which are optional and therefore not specifically related to the lexical description of a lexical item. Considering this distinction, the construal hypothesis proposes that primary phrases/relations are initially given a fully determinate syntactic analysis via on-line application of structural principles (such as Minimal Attachment and Late Closure) and that secondary phrases/relations are initially subject to a syntactic under-specification: they must be associated into a current thematic processing domain (the extended maximal projection of the last theta assigner) and interpreted using structural and non-structural information. Focusing on relative clause processing, the construal hypothesis predicts that relative clauses must be associated into the current thematic processing domain and then interpreted within this thematic domain using structural and non-structural information. Gilboy, Sopena, Clifton & Frazier (1995) proposed that one kind of non-structural information that plays an important role in relative clause interpretation is the referentiality of the nouns that compose the complex NP, and formulated the Referentiality Principle:

The heads of some maximal projections are referential in the sense that they introduce discourse entities (e.g., participants in events described in the discourse) into a discourse model (at least temporarily), or correspond to already existing discourse entities. Restrictive modifiers (e.g., restrictive relative clauses) preferentially seek hosts which are referential in this sense (Gilboy et al., 1995, p. 136).

The underlying assumption is that a head noun of an NP corresponds to a discourse entity (and is therefore referential) when it is introduced by a Determiner. Following this principle, when there are two NPs within the current thematic processing domain and one of them is non-referential, the relative clause will prefer the referential one. One of the types of complex NPs that was focused on by the authors in order to test these predictions was “substance NPs”, which more naturally contain a referential N1 and a non-referential N2 – “the sweater of wool”, for example. A non-referential N2 would therefore be more natural in this type of complex NP. These predictions were supported by experimental evidence obtained from questionnaire studies conducted with native speakers of English and native speakers of Spanish. Thus, the N1 interpretation was preferred by the subjects in sentences like “yesterday they gave me the sweater of cotton that was illegally imported” (in the Spanish questionnaire: “Ayer


me regalaron el jersey de algodón que importaban de contrabando”), which contains a referential N1 and a non-referential N2. In a second questionnaire study, which was conducted just with native speakers of English, the authors manipulated the N2 referentiality, producing sentences like “yesterday they gave me the sweater of the cotton that was illegally imported”. The prediction was that the number of N2 responses would increase considering that now the N2 was also referential. Indeed, the percentage of N2 responses increased from 26% for a non-referential N2 to 55% for a referential N2. The Referentiality Principle was first investigated in Brazilian Portuguese (BP) by Maia & Finger (2007). These authors conducted a questionnaire study and investigated the influence of several types of complex NP on relative clause interpretation. Substance NPs were also studied, but without manipulating the N2 referentiality. The results were in line with what is proposed by Gilboy et al. (1995). In sentences like “O técnico fez críticas à antena de metal que oxida” (“The technician criticized the antenna of metal that oxidizes”), where just one referential head is available for the relative clause (the N1), Brazilian Portuguese speakers exhibited a general preference for the N1 interpretation. With this theoretical overview, the aim of the present study was to advance the investigation of the Referentiality Principle and, in more general terms, the predictions of the construal hypothesis regarding Brazilian Portuguese data. In this language, most of the studies that have focused on relative clause processing considered complex NPs containing a referential N1 and a referential N2 (Ribeiro, 1999, 2005; Miyamoto, 1999; Maia, Fernández, Costa & Lourenço-Gomes, 2006). Besides this, the only experimental study in BP focusing on the Referentiality Principle has not specifically addressed the referential status of the N2. 
Taking this into account, as well as the fact that the Referentiality Principle has been primarily studied through off-line experiments, the present research focuses also on N2 referentiality and uses both on-line and off-line experimental techniques. It thereby contributes new experimental data that bear more specifically on the influence of N2 referentiality on relative clause processing and on the initial syntactic under-specification proposed by the construal hypothesis. The two experiments that support this discussion are reported in the next section.


2. Experiments

The aim of this experimental research is to investigate the Referentiality Principle in BP. In order to investigate whether the referential status of a noun could influence the preferences of relative clause attachment (Gilboy et al., 1995; Frazier & Clifton, 1996), two experiments were conducted: a self-paced reading task, which could provide evidence related to the early stages of language processing, and a questionnaire study, which could show processes related to the late stages of processing. Considering the general preference for N1 attachment in sentences with a complex NP that shows a substance interpretation, the referentiality of the N2 was specifically manipulated in both experiments to verify whether a referential N2 is more likely to be chosen as the site of attachment.

2.1. Self-paced Reading Task In this experiment, two variables were manipulated: N2 referentiality and the gender of the participle of the relative clause. This generated the following experimental conditions: a) Non-referential N2/gender agreement with N1 (N2NR/N1) O policial apreendeu/ a bolsa de couro/ que foi irregularmente importada/ pela empresa. b) Non-referential N2/gender agreement with N2 (N2NR/N2) O policial apreendeu/ a bolsa de couro/ que foi irregularmente importado/ pela empresa. c) Referential N2/gender agreement with N1 (N2R/N1) O policial apreendeu/ a bolsa do couro/ que foi irregularmente importada/ pela empresa. d) Referential N2/gender agreement with N2 (N2R/N2) O policial apreendeu/ a bolsa do couro/ que foi irregularmente importado/ pela empresa. Comprehension question: A bolsa foi importada? The sentences were divided into four segments, with the third segment being the critical one as it contained the disambiguating material (gender information). The hypotheses that were tested based on these experimental conditions were the following: (1) if N2 referentiality can affect the online processing of relative clauses, the under-specification produced by the initial association process will not persist for a long time and a specific

attachment will occur rapidly; (2) in (a) and (b) there will be a preference for N1 attachment, with the critical segment being read more slowly in (b) than in (a); (3) in (c) and (d) the N1 preference will be attenuated and the critical segment will be read more quickly in (d), where the two independent variables converge, than in (c); (4) the critical segment will also be read more quickly in (d) than in (b), in which the non-referential N2 does not favor low attachment; (5) finally, the critical segment may be read more slowly in (c) than in (a), since (c) has a referential N2, which potentially favors low attachment, and a gender that guides the parser toward N1 attachment.

2.1.1 Method

2.1.1.1 Participants

Thirty-two undergraduate students from the State University of Paraíba (UEPB) and the Higher Education Union of Campina Grande (UNESC) voluntarily participated in this experiment. All of the participants were native speakers of Brazilian Portuguese and had an average age of 21 years.

2.1.1.2 Materials

The materials consisted of four experimental sets, each containing 16 experimental sentences (four sentences per experimental condition) and 32 filler sentences, following a Latin square, within-subjects design. Each subject was exposed to four instances of each of the four experimental conditions, but to no more than one version of any experimental sentence. The experimental sentences had the following linguistic structure: NP + V / complex NP (N1 of N2) / relative clause (disambiguating portion) / by + agent. The configuration of segment 2 varied according to the referential status of the N2: in the N2R conditions (but not in the N2NR conditions) the N2 was preceded by a determiner (definite article). Within the complex NP, the N1 and the N2 always differed in gender: when the N1 was feminine, the N2 was masculine, and vice versa.
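The Latin square distribution described above can be illustrated with a short sketch. It is not the authors' actual list-building procedure: the function name `build_lists`, the zero-based item numbering, and the rotation rule (item index plus list index, modulo four) are assumptions made for illustration, and filler sentences are left out.

```python
from collections import Counter

# Condition labels as used in the chapter.
CONDITIONS = ["N2NR/N1", "N2NR/N2", "N2R/N1", "N2R/N2"]


def build_lists(n_items=16, conditions=CONDITIONS):
    """Return one list of (item, condition) pairs per experimental set.

    Hypothetical rotation rule: in set k, item i appears in
    condition (i + k) mod number_of_conditions.
    """
    n_cond = len(conditions)
    return [
        [(item, conditions[(item + k) % n_cond]) for item in range(n_items)]
        for k in range(n_cond)
    ]


if __name__ == "__main__":
    for k, experimental_set in enumerate(build_lists(), start=1):
        counts = Counter(cond for _, cond in experimental_set)
        print(f"Set {k}: {dict(counts)}")
```

Under this rotation, every set contains each item exactly once and each condition four times, and across the four sets every item occurs once in each condition, which matches the constraints stated for the materials.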
The structure of segment 3 varied across conditions according to the gender marked on the participle: in the N1 conditions, the participle agreed in gender with the N1 (high attachment), and in the N2 conditions it agreed with the N2 (low attachment). Segment 4 was both the post-critical and the final segment; in all experimental items it was followed by a comprehension question that always focused on the N1, such as “A bolsa

foi importada?” in the case of the sentences shown previously. Given this question type, the correct response was “yes” in conditions (a) and (c) and “no” in conditions (b) and (d).

2.1.1.3 Procedure and Equipment

The experiment was a self-paced reading task with non-cumulative presentation of the stimuli (Mitchell, 2004). The participants read sentences displayed on the computer screen and pressed the “L” key to advance to the next segment of the sentence, answering the comprehension question by pressing “yes” or “no” (labels assigned to the keys “O” and “P”, respectively). Each participant was tested individually in a silent room. Before the test, they received an explanation of the general mechanism of the task and took part in a practice test, which consisted of eight sentences with structures that differed from those of the experimental sentences. The experimental sessions lasted 20 minutes on average and the participants did not report specific difficulties in carrying out the task. The experimental apparatus consisted of a MacBook Pro running Psyscope (Cohen, MacWhinney, Flatt & Provost, 1993), the program used to build and run this experiment.

2.1.2 Results

The dependent variable initially considered was the reading time of segment 3 (the critical one). However, during the statistical analysis, the data obtained from other segments were also considered relevant and taken into consideration. Considering, at first, segment 2, a t-test was performed and revealed that the reading times were longer in the N2R conditions than in the N2NR conditions (t(31) = 2.50; p < 0.01), as can be seen in Figure 1:

Figure 1. Mean reading times on complex NPs according to the N2 referentiality.
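As a rough illustration of the segment 2 comparison, the sketch below computes a paired t statistic from per-participant mean reading times. It assumes the test was a paired, by-participant comparison, which the reported 31 degrees of freedom with 32 participants would be consistent with, but the chapter does not state this. The function name `paired_t` and the four reading times are hypothetical and are not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for a paired t-test on two equal-length samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the pairwise
    differences x - y and sd is the sample standard deviation.
    """
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical per-participant means in ms (made-up values).
    n2r = [1250.0, 1310.0, 1190.0, 1405.0]
    n2nr = [1200.0, 1280.0, 1195.0, 1350.0]
    t, df = paired_t(n2r, n2nr)
    print(f"t({df}) = {t:.2f}")
```

With 32 participants the same function would return 31 degrees of freedom; a positive t indicates longer reading times in the first sample.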

Focusing now on segment 3, a 2 (N2NR vs N2R) x 2 (N1 vs N2) ANOVA was performed and revealed neither a main effect of N2 referentiality (F1(1,31) = 0.384, p