Linked Noun Groups: Opposition and Expansion as Genre and Style Markers [1 ed.] 9783030539856, 9783030539863

This book provides a corpus-led analysis of multi-word units (MWUs) in English, specifically fixed pairs of nouns which


English Pages 167 [162] Year 2020


Table of contents :
Acknowledgements
Contents
Abbreviations
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Introduction
1.2 Corpus-led Inquiry and Corpora Used
1.3 The Relevance for Stylistic Studies of Literature
1.4 Relevant Applications
1.5 The Structure of this Book
References
Chapter 2: LNGs in Spoken Interaction and Written Academic Texts
2.1 Introduction
2.2 Spoken Discourse: LNGs in BNC-2014
2.2.1 BNC-2014 LNGs with and and or
2.2.2 BNC-2014 LNGs with and and or: Personal Pronouns
2.2.3 BNC-2014 LNGs with and and or: Numerals and Time Indicators
2.2.4 BNC-2014 LNGs with and and or: Vagueness
2.3 BASE LNGs: Spoken Academic English
2.3.1 BASE LNGs: Personal Pronouns
2.3.2 BASE LNGs: Spatial Relations
2.3.3 BASE LNGs: Numerals and Time Indicators
2.3.4 BASE LNGs: Highlighting Subject-Specific Information
2.4 LNGs in Spoken Data
2.5 LNGs in Written Academic Texts
2.5.1 Introduction
2.5.2 Numerals
2.5.3 Specific Information Containers
2.5.3.1 Introduction
2.5.3.2 Dominant Forms
2.5.3.3 Country and Law
2.5.3.4 Medicine and Science
2.5.3.5 Social Sciences
2.5.3.6 Business, Economics and Education
2.5.4 Gendered Language
2.5.4.1 Introduction
2.5.4.2 Gendered and Age-Referential LNGs in BNC-AC and BAWE
2.5.4.2.1 Gender
2.5.4.2.2 Age-Related Usage
2.6 Conclusions: LNGs in Academic Written English
Appendices
Opposites and Expansions with –and-
Opposites and Expansions with –or-
Law and Country
Medicine and Science
Social Sciences
Business, Economics and Education
Gendered LNG Usage; Reference to Children
References
Chapter 3: LNGs in UK and US Poetry
3.1 Introduction
3.1.1 The Corpora
3.1.2 Instead of a Literature Review
3.1.3 Groundwork: Identifying Key Themes in the Gutenberg Poetry Corpus
3.2 LNGs in the Domains Treasure, Song and Nature
3.2.1 Treasure
3.2.2 Song
3.2.3 Nature
3.3 LNGs in the Domain of God
3.4 LNGs in the Themes of Love and Death
3.4.1 Heart LNGs
3.4.2 Joy LNGs
3.4.3 Death LNGs
3.4.4 Love LNGs
3.5 LNGs in the Domain of Time
3.5.1 Spring and Winter; Morning and Evening LNGs
3.5.1.1 Spring and Winter
3.5.1.2 Morning and Evening
3.5.2 Day and Night LNGs
3.6 LNGs in the Domains of World
3.6.1 Snow and Hills LNGs
3.6.2 Land and Sea LNGs
3.6.3 Earth and its Juxtaposition LNGs
3.7 LNGs in the Domain of Sky
3.7.1 Air and Wind LNGs
3.7.1.1 Air Binominals
3.7.1.2 Wind Binominals
3.7.2 Sky and Heaven LNGs
3.7.3 Moon, Sun and Stars LNGs
3.8 Conclusions
References
Chapter 4: LNGs in Nineteenth- and Twentieth-Century British Fiction
4.1 Introduction
4.1.1 Corpora Used
4.1.2 Method
4.1.3 The Idea of Investigating Literature
4.2 Thematic LNGs in British Fiction
4.2.1 Introduction
4.2.2 LNGs with but in Fiction
4.2.2.1 LNGs with or—Antonymic Pairs
4.2.2.2 LNGs with or—Numerals and Times
4.2.2.3 LNGs with or-time and Vagueness Markers
4.2.2.4 LNGs with or: Summary
4.2.3 LNGs with and in Fiction
4.2.3.1 LNGs with and: References to love and death
4.2.3.2 LNGs with and: Clothing and the Environs
4.2.3.3 LNGs with and: Time Markers
4.2.3.4 LNGs with and: Body Parts
4.2.3.5 LNGs with and: Food Items
4.2.3.6 LNGs with and: People
4.2.3.7 LNGs with and: Pronouns
4.3 Concluding Thoughts: LNGs in British Fiction Texts
4.4 A Case Study: LNGs Occurrence Structure in Dickens and Nineteenth-Century Marx Translations
4.4.1 Introduction
4.4.2 LNGs in the Dickens Corpus Compared to General Fiction Corpora
4.4.2.1 Marginal LNGs: The Use of neither…nor
4.4.2.2 LNGs with or that are (A-)typical where Dickens is Compared to 19C
4.4.2.3 LNGs with and that are (A-)typical where Dickens is Compared to 19C
4.4.2.4 LNGs in the Marx Corpus and How These Compare
4.4.2.5 LNGs with or that are (A-)typical where Marx is Compared to BAWE and 19C
4.4.2.6 LNGs with and that are (A-)typical where Marx is Compared to BAWE and 19C Literature
4.4.3 Case Study Conclusion
Appendices
Appendix for Sect. 4.2.3.1: Death
Appendix for Sect. 4.2.3.2: Items of Clothing
Appendix for Sect. 4.2.3.2: Housing and Environs
Appendix for Sect. 4.2.3.3: Body Parts
Appendix for Sect. 4.2.3.4: Time Markers
Appendix for Sect. 4.2.3.5: Food Items
Appendix for Sect. 4.2.3.6: People
Appendix for Sect. 4.2.3.7: Pronouns
Appendix for 4.4.2(a): Dickens or LNG Usage
Appendix for 4.4.2(b): Dickens and LNG Usage
References
Chapter 5: Findings, Applications and Conclusions
5.1 Findings
5.2 Applications
5.2.1 LNGs and Teaching
5.2.2 LNGs, (Critical) Discourse Analysis and Style
5.2.3 LNGs and Natural Language Processing Tools
5.3 Linked Noun Groups—Oppositions and Expansions
5.4 Appendix: Google Autocomplete Examples
References
People Index: Poets, Scholars and Writers
Subject Index


Linked Noun Groups: Opposition and Expansion as Genre and Style Markers

Michael Pace-Sigge


Michael Pace-Sigge, School of Humanities, University of Eastern Finland, Joensuu, Finland

ISBN 978-3-030-53985-6
ISBN 978-3-030-53986-3 (eBook)
https://doi.org/10.1007/978-3-030-53986-3

© The Editor(s) (if applicable) and The Author(s) 2020

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover pattern © Melisa Hasan

This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Acknowledgements

This book has a rather curious genesis. While researching how far the language of Marx left any mark on nineteenth- and twentieth-century fiction, I noticed the extraordinary prevalence of trigrams: binomials or, to be more precise, linked noun phrases. Before work on the Marx paper was completed, I had set about investigating these particular constructions. Yet, at that stage, little did I know that two years later (and, given that we now live under the cloud of a global pandemic, in a different life altogether) a book would come out of this.

Consequently, my first thanks go to Dr Odette Vassallo and her team, who gave me an early platform to air my ideas during the 2018 InterVarietal Applied Corpus Studies (IVACS) conference in Valletta, Malta. I am also particularly grateful to Professor Cantos Gomez, who published my paper on multi-word units in poetry: crucial background research for Chap. 3 of this book. I would also like to thank Dr Katie Patterson and Jess Pope who, again, were willing proofreaders of my script. I would not know what to do without you! My thanks go to my editor at Palgrave Macmillan, Cathy Scott, the anonymous reviewers who provided useful advice and the team of copy editors who straightened out the script. Last but not least, I am extremely grateful to all those friends, colleagues and students who patiently listened to me droning on about my findings.

This book is dedicated to Christopher S. Boase Esq. We should have spent so much more time together!


Abbreviations

19C: Corpus of 100 texts of nineteenth-century British fiction
BASE: British Academic Spoken English corpus
BAWE: British Academic Written English corpus
BNC-2014: Corpus of spoken conversation
BNC-AC: BNC-Written-Academic British National Corpus sub-corpora
BNC-F: BNC-Written-Fiction/Prose British National Corpus sub-corpus
DC: Corpus of Dickens’ novels
GPC: Gutenberg Poetry Corpus
L1: Near-collocate, positioned directly before the target word
LDOCE: Longman Dictionary of Contemporary English
LL: Log-likelihood (value); statistical testing
LNG: Linked Noun Group (a binomial consisting of two nouns linked by a conjunct)
MC: Corpus of Marx’s texts in English translation
N: Number of occurrences
p/mio, p/m: Number of occurrences per million words
R1: Near-collocate, positioned directly after the target word

List of Figures

Fig. 2.1  Distribution of pronouns in and LNGs
Fig. 2.2  Distribution of pronouns in or LNGs
Fig. 2.3  Occurrence numbers for Noun and stuff LNG in BNC-2014
Fig. 2.4  Occurrence numbers for Noun -[or]- Noun vagueness markers
Fig. 2.5  Distribution of pronouns in and LNGs in BASE
Fig. 2.6  NPs preceding as a whole
Fig. 2.7  Distribution (per million words) of f-and-m and m-and-f LNGs. Left: BNC-AC, right: BAWE
Fig. 2.8  BNC-AC gendered LNG with or. The Total Numbers Here are Vanishingly Small: 0.12/Million Equal to Two
Fig. 2.9  All concordance lines of Women or Men in BNC-AC
Fig. 3.1  song LNG forms
Fig. 3.2  sound LNG forms
Fig. 3.3  bird LNG forms with beast
Fig. 3.4  -and Nature LNG forms
Fig. 3.5  hell with heaven LNG forms
Fig. 3.6  soul LNG forms
Fig. 3.7  heart LNGs
Fig. 3.8  joy LNG opposites
Fig. 3.9  hills and its opposite LNGs
Fig. 3.10  land with sea LNGs
Fig. 3.11  sea LNGs (occurrence N: right)
Fig. 3.12  air LNG expansions
Fig. 3.13  wind LNG expansions
Fig. 3.14  sky LNG expansions
Fig. 3.15  heaven LNG opposites
Fig. 3.16  moon, stars and sun interrelated LNGs
Fig. 4.1  nothing -[but]- a/the N in 19C and BNC
Fig. 4.2  nothing -[but]- N in 19C (top) and BNC (bottom)
Fig. 4.3  The most frequent or 4-grams 19C (top) and BNC (bottom)
Fig. 4.4  The most frequent LOVE and LNGs. Left: 19C; right: BNC
Fig. 4.5  The most frequent and LOVE LNGs. Left: 19C; right: BNC
Fig. 4.6  Ten most frequent LNGs in fiction (excluding pronouns)

List of Tables

Table 2.1  Corpora used for Chap. 2
Table 2.2  LINK word frequencies
Table 2.3  and LNGs for pronouns in BNC-2014
Table 2.4  or LNGs for pronouns in BNC-2014
Table 2.5  Comparison of the most frequent L1 numerals in and LNGs in BNC-2014
Table 2.6  Comparison of the most frequent L1 numerals in or LNGs in BNC-2014
Table 2.7  and LNGs for time indicators
Table 2.8  or LNGs for time indicators
Table 2.9  or LNGs referring to non-gendered persons in BASE
Table 2.10  and LNGs for spatial relation in BASE
Table 2.11  numeral LNGs in BASE
Table 2.12  high-info LNGs in BASE
Table 2.13  Numeral LNGs in BAWE and BNC-AC
Table 2.14  Time-linked LNGs in BNC-AC and BAWE
Table 2.15  Opposites with and LNGs in BNC-AC and BAWE
Table 2.16  Opposites with or LNGs in BNC-AC and BAWE
Table 2.17  Expansion LNGs in BNC-AC and BAWE
Table 2.18  Country and Law LNGs with and in BNC-AC and BAWE
Table 2.19  Medicine and Science LNGs with and and or in BNC-AC and BAWE
Table 2.20  Social Science LNGs in BNC-AC and BAWE
Table 2.21  Business and Teaching LNGs in BNC-AC and BAWE
Table 2.22  LNGs that reference age-group-related pairs
Table 3.1  Comparison of Poetry corpora
Table 3.2  Ontology of salient themes in the GPC
Table 3.3  hope LNG oppositions and expansions
Table 3.4  joy LNG expansions
Table 3.5  death LNG oppositions and expansions
Table 3.6  love LNGs with and expansion
Table 3.7  love LNGs with opposition
Table 3.8  day LNGs
Table 3.9  earth LNGs
Table 3.10  sun, moon and stars LNG as extension
Table 3.11  sun as weather marker
Table 3.12  The most frequent LNGs found in GPC
Table 4.1  Corpora used for Chap. 4
Table 4.2  Frequency and relative frequency for the links AND, BUT and OR
Table 4.3  The most frequent or LNGs with opposites
Table 4.4  The most frequent or LNGs with numerals
Table 4.5  The most frequent or LNGs with time markers
Table 4.6  The most frequent or LNGs with vagueness markers
Table 4.7  The most frequent and LNGs with items of clothing
Table 4.8  The most frequent and LNGs with housing or environs
Table 4.9  The most frequent and LNGs with time markers
Table 4.10  The most frequent and LNGs with body parts
Table 4.11  The most frequent and LNGs with food items
Table 4.12  The most frequent and LNGs with reference to people
Table 4.13  The most frequent and LNGs with pronouns
Table 4.14  Comparing Dickens or LNGs to 19C and BNC-F
Table 4.15  Comparing Dickens and LNGs to 19C and BNC-F
Table 4.16  Comparing Marx or LNGs to BAWE and 19C
Table 4.17  Comparing Marx and LNGs with BAWE occurrences
Table 4.18  Comparing Marx and LNGs with 19C literature and Dickens Corpus (DC)
Table 4.19  Marx-typical most frequent and LNG forms
Table 5.1  LNG usage across text types

CHAPTER 1

Introduction

© The Author(s) 2020
M. Pace-Sigge, Linked Noun Groups, https://doi.org/10.1007/978-3-030-53986-3_1

1.1   Introduction

This book deals with a minor yet essential and highly functional grammatical construction in the English language. Unlike established features of grammar (for example, conjugation or morphology), it does not have a single, widely recognised name. Like many other elements of language (see, for example, Bill Louw’s semantic prosody), the occurrence patterns of these constructions become more visible through the application of corpus linguistics. Often, this construction is found under its general heading: binomials. Within a broad perspective of linguistics, binomial phrases have attracted interest only relatively recently. Almost all the literature seems to agree that they were first discussed by Richard Abraham, who talked of fixed coordinates and highlighted this fixedness with the following example: “we say ‘It’s a matter of life and death’ and ‘It’s a matter of death and life’ although logical enough, cannot be considered colloquial or idiomatic English” (1950, 276). Abraham continues that similar fixedness can be found in Romance languages as well as German. His in-depth investigation into binomials in several languages concludes as follows:

it is likely that at least twenty per cent of fixed coordinates are unclassifiable semantically and, as we have already seen, although rhythm and phonology doubtless play their part in determining the order of words which are coordinated, one can never predict which of the many principles will be dominant in any given case. (my highlights; Abraham, 1950, 287)

Similarly, Yakov Malkiel, in 1959, observed its very fixed (i.e. hard to reverse) structure. The English language does, indeed, make use of a large number of multi-word units (MWUs), and these are often fixed to a high degree. As can be seen, a lot of the groundwork on the use of binomials happened during pre-corpus times. One of the later seminal works comes from Marita Gustafsson, starting with her PhD thesis.1 The work undertaken then involved going manually through thousands of pages from eleven different texts (novels, newspapers, magazines, popular science and law texts) in order to extract 4330 occurrences of 2720 different binomials (Gustafsson 1975, 1976). In her description, these phrases display a characteristic frozenness. Nevertheless, Gustafsson, in her various articles, provided a most comprehensive description of the forms and functions of binomials; her work has become a basis for a lot of research in the field, and this book will keep coming back to it to compare the results produced here with her own research.

Amongst word pairs and trigrams that are idiomatic, binomials are a specific subcategory that is prominent yet cannot be described as highly frequent compared to other types of fixed clusters. Amongst all binomials, a number of investigations have highlighted the most frequently occurring subcategory, namely a pair of nouns linked by a conjunction. Some of these are widely used and easily recognised, for example law and order or boy and girl. However, there are a number of less obvious trigrams of that sort, for example nose and cheekbone, increase or decrease. Biber and colleagues [1999] (2007) talk of coordinated binomial phrases2 (Noun and NOUN). They also state that “[m]ost binominal phrases occur too infrequently to be considered part of recurrent lexical bundles”.
Biber and Conrad (1999, 183) define lexical bundles as “extended collocations: sequences of three or more words that show a statistical tendency to co-occur”.3 They contrast them with idioms, highlighting that the main difference is that “lexical bundles are the sequences of words that most commonly co-occur in a register”. The Collins Cobuild English Grammar (Sinclair, 1990), which is based on a Pattern Grammar-led classification, refers to these as Linked Noun Groups (LNG). Moon (1998) highlights that these linked nouns are typical of fixed expressions found in her contemporary English corpus. A number of possible conjuncts are available to link pairs from the same word class. Biber et al. [1999] (2007) describe their coordinated binomial phrases only with the words and or or as coordinator. It must be noted, however, that this grammatical structure also presents occurrences for conjuncts like as, but or nor as links for pairs of nouns; this points to a colligational preference. Be that as it may, the binomial phrases N-[OR]-N and N-[AND]-N are, in all cases, the most frequently occurring Linked Noun Groups. The conjunct or fulfills, according to the Longman Dictionary of Contemporary English (LDOCE) (2009, 1229), six different functions:

1. possibilities / choice (“or anything like that”)
2. “and not” (“to mean not one thing and not another either”)
3. avoiding bad result (“hand over your money or else”)
4. correction (“or rather”)
5. proof (“not urgent or else they would not have called”)
6. uncertain amounts (“a mile or so”)

The following chapters will highlight which of these functions are fairly prominent when N-[OR]-N LNGs are employed in different text types. The conjunction of two items with and appears, on the face of it, straightforward. However, the LDOCE, which is corpus-based, gives an indication of the particular usage patterns found to be prevalent. The conjunction and is far more frequent than or. Moreover, it also covers a wider range of functions as described in the LDOCE (2009, 55)4:

(1) join two words, phrases etc. referring to related things (“get some fish and chips”)
(2) one event following another (“knocked at the door and went in”)
(3) causation (“I missed supper and I’m starving”)
(4) adding numbers (“six and four is ten”)
(5) (British English) after verbs like go, come, try: show intention (“I can try and persuade her”)
(6) (spoken) to introduce a statement, remark, question etc. (“And who’s the lucky man?”)
(7) between repeated words to add emphasis (“more and more people are losing their jobs”)
(8) in numbers (“hundred and five” / “three and three-thirds”)5
(9) between repeated plural nouns to indicate one thing is better than the other (“there are experts and experts”)


It can be seen that cases like (2), (3), (5) or (6) are of little relevance for an investigation into LNGs. By contrast, (1), (4), (7), (8) and (9) appear typical for such conjunct binomials. It is therefore interesting to see which particular types of or and and conjuncts are typical for LNGs in the various genres investigated in this book. Mollin (2014, 34) quotes Gustafsson (1975, 85–87), who distinguishes four main categories of semantic relationships that are typical for binomials:

(1) semantic opposition, in which the two elements are antonyms, differing only in one componential feature,
(2) semantic homeosemy (also referred to as synonymy or near-synonymy), in which the two elements have identical meaning or meanings that differ only in connotational nuances,
(3) semantic hyponymy, in which one element is the hyperonym of the other,
(4) semantic complementation.

Looking at the literature, these categories seem to prevail. When it comes to LNGs, there are very few examples of (2) and, amongst these, there is also direct repetition (i.e. “wind and storm” as well as “wind and wind”). Typically, LNGs appear as (1) opposition (or antonyms) and (4) semantic complementation. The latter I shall refer to as expansion. Crucially, certain nouns have a preference for either of these; furthermore, opposition or expansion can also depend on the position of the target noun. As a consequence, a noun can reflect a semantic complementation when it is in initial position of the trigram, yet opposition when in final position. While LNGs are not particularly frequent, a large number of them have been documented in grammar reference works, as shown above. LNGs have also been discussed as part of critical discourse analysis. Consequently, it has been pointed out that LNGs reflect gender bias in English: the Juliet and Romeo effect.
Siyanova-Chanturia, Conklin and van Heuven (2011) investigated sentences containing three-word binomial phrases (bride and groom) and their reversed forms (groom and bride). These are identical in syntax and meaning but differ in phrasal frequency. There is strong evidence of the absence of the “Juliet and Romeo” effect (Hegarty et al., 2011) overall; at the same time, the most recent data shows that there is an increase in equal representation and an inversion of the masculine-before-feminine naming.6 In this book, data from the nineteenth century through to the twenty-first century will be investigated to see whether gender bias is, indeed, entrenched through LNG use and whether the most contemporary data points towards any observable changes.

Likewise, discussing LNGs in legal parlance, Nebot (2017) highlights the fact that phrases such as “men and women” or “women and men” leave out intersexual individuals. In fact, LGBTQ+ can be described as an extended LNG; yet its usage is too recent to appear even in relatively recent corpora.7 Nebot’s chapter is found in a volume (Goźdź-Roszkowski and Pontrandolfo, 2017) applying corpus linguistic methods to investigate legal texts. Here, almost all contributions highlight the important role binomials play in the language of the law, something already highlighted by Gustafsson’s pre-corpus research in 1984. Unlike in the present volume, these writers focus their research on specialist corpora of English and Scottish law texts. The reader might find it of interest that there is some convergence with the findings presented in this book, as the corpora of academic written texts include sub-corpora from legal scholars. However, the level of divergence from the most frequent binomials described in the Goźdź-Roszkowski and Pontrandolfo volume is of particular interest and highlights how much the choice of corpus material matters when identifying such LNGs. The reader will find this described in greater detail in Chap. 2.

Possibly the best-known recent corpus-based study into English binomials is Sandra Mollin’s 2014 book, aptly titled The (ir)reversibility of English binomials. While in many respects similar to Benor and Levy’s (2006) magisterial study, her choice of a general corpus appears to be more suitable to reflect highly frequent binomials. Her 2014 book appears to be flawed in that it represents a lot of corpus-specific hapax legomena.8 Using the whole of the BNC, this appears like a corpus-based, updated version of Gustafsson’s 1975 thesis.
Mollin reports that her “final file of high-frequency binomials includes 544 types” (2014, 22), which she lists in her Appendix (pp. 223–237), ordered from fully irreversible down to equally found in its prototypical and reverse form. This massive table makes very clear that the overwhelming majority of binomials are, indeed, noun-based. The second most common binomials are adjective constructions, and the rarest type of binomials described by her are linked verb groups. While her analysis is in-depth and serious, to me one major flaw appears quite obvious: while she discusses phonetic and semantic constraints in detail, the differences in frequency of occurrence in the different genres (spoken vs. written) and between sub-corpora (e.g. newspaper and academic writing) that Mollin describes in chapter three of her book seem to be left without further analysis. Crucially, she never seems to discuss the spread of any of the binomials extracted.9 So, for example, “Cherry and Whites” is listed as one of the most frequent newspaper sub-corpus binomials (see p. 33). I failed to reproduce this finding with my version of the BNC, leaving me to suspect that the 103 entries quoted may come from a single file I failed to include in my search. Similarly, “Charles and Diana” appears over 100 times, yet these come from no more than ten sources, meaning that single articles use this trigram repeatedly.

Seifart et al. (2018) have demonstrated that non-native learners of a language can find the use of nouns problematic where L1 and L2 are culturally and structurally very different. It is therefore important to highlight, in teaching, that LNGs are multi-word units: as such, they are classed as formulaic sequences. There are compelling reasons for thinking that the brain represents formulaic sequences in long-term memory, bypassing the need to compose them online through word selection and grammatical sequencing in capacity-limited working memory (see, for example, Siyanova-Chanturia et al. 2011; Conklin and Schmitt, 2012; Wray, 2013). In Conklin and Schmitt’s work, mixed-effects modelling reveals that native speakers and non-native speakers read more frequent formulaic sequences more quickly than less frequent ones. Another important angle in the study of these groups is their importance in teaching English: they form building blocks that demonstrate fluency (see, for example, Ernestova, 2007 or Sugiati and Rukmini, 2017). However, amongst teaching materials, there is little evidence that the distribution of LNGs in different genres of English has been addressed. Crucially, however, the occurrence pattern of such pairs is dependent on genre (cf.
Hoey, 2005), and this book will document the structural distribution of the most frequently occurring as well as significantly overused LNGs in the different corpora discussed.

Linked Noun Groups: Opposition and Expansion as Genre and Style Markers has one clear mission. Gustafsson has provided a comprehensive and still-valid basis of what binomials are and how they function, based on a manual evaluation of the material available to her. This particular approach has then been refined by Mollin in the last decade by basing her research fully on the 100 million word corpus of the BNC and with the specific aim to describe to what extent binomials are found to be “irreversible” or, if reversible, which form the cline of reversibility takes. However, Sinclair, Biber et al. and Mollin have based their investigations on large, general corpora; Moon has generalised her findings, which come from a relatively small specialist corpus. Likewise, Gustafsson (1984) as well as the authors in Goźdź-Roszkowski and Pontrandolfo’s volume look at all forms of binomials but in specialist, legal corpora. Only Biber et al., Moon and Sinclair focus on linked noun binomials. This book provides a different approach: with a focus on Linked Noun Groups only amongst all possible binomials, their specific usage pattern is to be traced in a variety of (mostly British English) text genres that go back as far as the nineteenth century (for poetic texts) and which are as recent as the mid-2010s (the spoken BNC-2014 corpus). These corpora are outlined below.

1.2   Corpus-led Inquiry and Corpora Used

This book sets out to show that LNGs are a clear stylistic marker. To support this claim, a corpus-led analysis is undertaken. To gain a first impression of the most frequent LNGs in the target corpora, WordSmith Tools versions 7 and 8 (Scott 2020) have been employed to create lists of the most frequent trigrams. As not all corpora are available with part-of-speech (POS) tagging, the most frequent trigrams of the form noun + conjunction (AND; AS; BUT; NOR; OR) + noun are selected and their usage analysed and compared. Manual selections were undertaken in particular where nouns could not be directly classified. This is the same technique as employed by Katja Basaneže (2017, 209). Furthermore, concordance lines (82 characters in length) based on and, but and or have been saved separately for all the corpora used, to create material for examining the most frequent trigrams. In a final step, the conjunct is the target word during concordance searches. Using the Pattern function in WordSmith, the L1 and R1 columns are investigated for nouns separately. Where nouns are identified, the relevant full concordance lines are checked to see whether an L1 or an R1 use of a noun results in an N-[CONJ.]-N construction.10

In Chap. 3, dealing with the poetry corpus, this method has been modified. Here, major recurring themes are determined first, and the LNGs are determined by these. The main tool employed for concordancing, KeyWord extraction and so on is WordSmith Tools 7 and 8 (Scott 2020). For further analysis, I have also used the LancsBox tool (Brezina et al., 2018), which allows for POS tagging of uploaded material. Where other tools have been employed, these are listed in the relevant chapter(s).
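The trigram search described above can be approximated in a few lines of Python. This is only a minimal sketch under stated assumptions, not the WordSmith Tools procedure itself: the tokeniser, the small NOUNS set (a hypothetical stand-in for the manual noun selection) and all function names are illustrative and do not come from the book. It collects every word triple centred on one of the five conjuncts, keeps those where both the L1 and R1 slots hold a listed noun, and normalises raw counts to occurrences per million words (p/mio).

```python
import re
from collections import Counter

# The five link words named in the text.
CONJUNCTS = {"and", "as", "but", "nor", "or"}

# Hypothetical noun list standing in for manual selection or POS tagging.
NOUNS = {"life", "death", "law", "order", "boy", "girl", "increase", "decrease"}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def conjunct_trigrams(tokens):
    """Yield every (L1, conjunct, R1) triple centred on a conjunct."""
    for i in range(1, len(tokens) - 1):
        if tokens[i] in CONJUNCTS:
            yield tokens[i - 1], tokens[i], tokens[i + 1]


def lng_counts(tokens, nouns=NOUNS):
    """Count triples whose L1 and R1 slots are both in the noun list."""
    return Counter(
        t for t in conjunct_trigrams(tokens) if t[0] in nouns and t[2] in nouns
    )


def l1_r1_profile(tokens, conjunct):
    """Count the words directly before (L1) and after (R1) one conjunct."""
    l1, r1 = Counter(), Counter()
    for left, conj, right in conjunct_trigrams(tokens):
        if conj == conjunct:
            l1[left] += 1
            r1[right] += 1
    return l1, r1


def per_million(count, corpus_size):
    """Normalise a raw count to occurrences per million words."""
    return count * 1_000_000 / corpus_size


sample = (
    "It is a matter of life and death. Law and order matter, "
    "but life and death matter more. Expect an increase or decrease."
)
tokens = tokenize(sample)
counts = lng_counts(tokens)
for trigram, n in counts.most_common():
    print(" ".join(trigram), n, round(per_million(n, len(tokens))))
```

Note that this sketch ignores sentence boundaries and relies entirely on the noun list, so a real study would still need the manual checking of full concordance lines that the text describes.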


One key aim of this book is to make the reader aware that LNG use is highly genre-specific. Earlier corpus-based research shows LNG usage yet does not indicate that particular LNGs reflect specific communication events. This book will look at different genres and time frames as shown below with a focus on British English:

No.   Text variety (corpora used in brackets)                      In chapter
(1)   Spoken English (BNC-2014 and BASE)                            (2)
(2)   Academic written English (BAWE)                               (2)
(3)   Poetry (Gutenberg material)                                   (3)
(4)   Nineteenth-century fiction (19C corpus, Patterson 2018)       (4)
(5)   Twentieth-century fiction (BNC)                               (4)
(6)   Dickens’ novels; Marx in translation (individual styles)      (4)

For (1), the BNC-2014 is the newest large casual spoken corpus: it is being used to demonstrate how rare LNGs are in online production. The British Academic Spoken Corpus is also fairly recent (2005) and reflects that LNGs are still relatively rare in their use but there are parallels to written academic English. (2) British Academic Written English (2007) is the most recent written corpus material available, and it is here in particular where LNGs seem to be employed. There are three corpora for imaginative writing. (3) This has been assembled by the author because there is, to date, only one large poetry corpus in existence¹¹: the Gutenberg Poetry Corpus (Parrish, 2018). That, however, includes prose text and non-English poetry. The Gutenberg Poetry material corpus (GPC) employed here has been manually created. (4) consists of 100 full-text nineteenth-century British prose novels, which provide a highly suitable comparator to the GPC (which has a large amount of nineteenth-century material) and has previously been employed to show diachronic change when compared to (5), the BNC Prose-Fiction sub-corpus. (6) These are corpora by Mahlberg (2007) and Pace-Sigge (2018), which have been chosen to indicate that the clear division between fiction-style writing and academic-style writing found in the present was less prevalent 150 years ago. Between the two extremes of usage (spoken English and written academic English) there is the area of imaginative texts. This research shows that LNGs, unless they are numerals, are not typical of spoken English:


they are found in carefully crafted written texts, in particular in academic writing. In imaginative writing, poetry clearly focusses on a select number of main themes, and this book investigates the key LNGs prominent within these themes. When it comes to fiction, novels and short stories of the nineteenth and twentieth century have a high degree of overlap in their usage; at the same time, diachronic changes—for example, a move to greater informality—are easily traceable. Finally, case studies of the works of two influential writers of the nineteenth century—Charles Dickens’ novels and the translations of the writings of Karl Marx—highlight a number of surprising parallels in use.

1.3   The Relevance for Stylistic Studies of Literature

To my knowledge, there has been very little corpus-led research into poetry. This book presents one step in addressing this issue, highlighting that poetry is concerned with a small group of major themes and that different poets employ recurrent LNGs in their writings. This suggests a greater degree of formulaicity than commonly assumed. When it comes to stylistic devices recurrent in novels, the use of particular LNGs can reflect patterns typical of a particular period; it thus gives an insight into salient diachronic changes that are reflected in wording choices by authors of different time periods. As a result, some of these phrases appear both in the imaginative texts written by Charles Dickens and in non-fiction texts of the same time. Overall, however, LNGs are often found to form phrases that can be seen as typical of a specific text type; areas of divergence indicate characteristic styles unique to individual writers.

1.4   Relevant Applications

Biber (1988)¹² has written widely about different collocations that form genre variations; Hoey (2005) as well as O’Keefe et al. (2007) show how collocations and colligations diverge in different genres; Hoey (2005) indicated that lexical primings tend to be highly genre-specific; and Sardinha (2017) describes how collocates allow the audience to predict genre. Pace-Sigge (2015) shows the importance of genre differences with respect to English-language teaching. The findings presented in this book appear to give a number of relevant points English-language teachers


should be aware of when teaching and observing the use of LNGs. For any learner of the English language, for instance, awareness of idiomatic patterns like these serves to assist their register-specific application, and it will be shown how this can be employed at various levels of text composition. If, as Sugiati and Rukmini (2017) claim, relevant textbooks are failing to prepare students to use formulaic phrases, this research should assist in the provision of useful teaching material. The use of complex noun phrases is highly dependent on the pragmatics of each speech event. As such, they should not be expected in spoken usage (they take up a far greater processing time) but should be expected in edited texts. By contrast, LNGs are high in informational content. They are, therefore, characteristically found in academic writing. Learners have a need to be exposed to such multi-word units (MWUs). In particular, they have to be made aware of the particular nesting patterns (cf. Hoey, 2005) of LNGs as demonstrated here. While this work cannot replace a textbook, a reader will find a large number of examples, will become aware of genre-specific usage patterns and, hopefully, will find inspiration for further investigations, maybe in the form of student projects. One further application resulting from this investigation should be in the development of virtual speech recognition tools. A simple test has shown that advanced digital assistants are able to recognise English idioms (thus arguably demonstrating an ability to deal with metaphoricity). Genre-specific LNGs are a similar, though possibly less frequently occurring, feature of the English language; yet the (2019) crop of assistants appear unable to recognise these. It can be argued that developers of virtual tools (see Google’s Duplex and its related tools dealing with natural conversation¹³) will need to integrate this knowledge in order to enhance their machines’ abilities to understand pragmatic and semantic implicature better.

1.5   The Structure of this Book

This book is divided into five chapters. After the introduction, Chap. 2 deals with LNG multi-word units as they occur in (or are absent from) typical casual conversations in British English. This will be contrasted with academic written English, which diverges most visibly from the spoken form. Chapters 3 and 4 focus on imaginative writing. Chapter 3 starts with an outline of the main themes and topics found in a large corpus of poetical writings. Based on this, LNGs found for each of these topics will be


investigated separately. Chapter 4 will give an overview of the most salient LNGs found in nineteenth- and twentieth-century British fiction. In a second section to the chapter, two case studies will look at idiosyncratic usage of prominent LNGs in Dickens and Marx. Finally, Chap. 5 will look at the relevance of these findings and provide an overall concluding chapter. Chapter 5 will also look at what LNG genre-specific usage means for the teaching of English as a (foreign) language and also provides indications as to why these findings should be seen as important by developers of virtual speech tools.

Notes

1. Gustafsson will have been aware of the work of Koskenniemi, who, in 1968, described a number of what she then called “repetitive word pairs”. In her work, she highlights that these can be found in Old and Middle English source material already.
2. The acronym for this would be CBP, which, in the outside world, stands for the “Customs and Border Protection” of the US. Something which, in the era of the 45th president, one does not want to refer to.
3. Biber (and a number of corpus linguists) does not mention dispersion in this respect. However, dispersion (within academic subjects or as characteristic of a poet or fiction writer) is discussed in depth throughout the argument presented in this book. It must also be said that lexical bundles have a tendency to present a complete expression (“black lives matter”) unlike chunks, which are just highly frequent collocates, like, for example, “during the whole”.
4. The tenth sense “and?” in spoken usage has been left out here as it is not relevant within LNGs.
5. Up to the early nineteenth century, this is also true for lower numbers, for example “one-and-twenty”; see Chap. 4.
6. Omar ([1973] 2007) reported the emergence of masculine before feminine forms in Arabic-learning children.
7. In fact, “LGBT” appears five times in the most recently published corpus, the BNC2014. However, no long form of the acronym is ever used.
8. Gustafsson (1976, 625) already pointed out that the majority of her binomials were hapaxes: “among the 2,720 items only 446 occurred more than once”.
9. Benor and Levy (2006) undertook a study not dissimilar to the one presented in Mollin’s monograph a decade later. Here, too, the same criticism can be raised, as their US-English corpus threw up oddities that seem more typical of their particular corpora chosen than of naturally occurring English.
10. WS7 allows the user to click on any collocate in order to create a new concordance list based on the original concordance list. This feature has been employed here. Where the word “class” is not obvious (e.g. the run vs. run away) the concordance lines were used to see whether a sufficient number of noun forms occurred.
11. The Poetry sub-corpus in the BNC is rather too small: just 226,367 tokens.
12. Furthermore, in 1990, 1993, 1995 and, with Edward Finegan, 1989, 1994, 1997.
13. See, for example https://www.blog.google/products/assistant/chattingyour-google-assistant-just-got-easier/ (last accessed February 2019).

References

Abraham, R. D. (1950). Fixed Order of Coordinates: A Study in Comparative Lexicography. Modern Language Journal, 34(4), 276–287.
Basaneže, K. D. (2017). Extended Binomial Expressions in the Language of Contracts. In Phraseology in Legal and Institutional Settings (pp. 203–220). London: Routledge.
Benor, S. B., & Levy, R. (2006). The Chicken or the Egg? A Probabilistic Analysis of English Binomials. Language, 82, 233–278.
Biber, D., & Conrad, S. (1999). Lexical Bundles in Conversation and Academic Prose. In H. Hasselgård & S. Oksefjell (Eds.), Out of Corpora: Studies in Honour of Stig Johansson (pp. 181–190). Amsterdam and Atlanta: Rodopi.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. ([1999] 2007). Longman Grammar of Spoken and Written English. Harlow: Pearson Education.
Biber, D. (1988). Variation Across Speech and Writing. Cambridge: CUP.
Brezina, V., McEnery, T., & Wattam, S. (2018). LancsBox v. 4.0 (Concordance Software). Retrieved from http://corpora.lancs.ac.uk/lancsbox/
Conklin, K., & Schmitt, N. (2012). The Processing of Formulaic Language. Annual Review of Applied Linguistics, 32, 45–61.
Ernestova, M. (2007). Role of Binomial Phrases in Current English and Implications for Readers and Students of EFL. In G. Shiel, I. Stričević, & D. Sabolović-Krajina (Eds.), Literacy without Boundaries (pp. 273–279). Zagreb: Croatian Reading Association.
Goźdź-Roszkowski, S., & Pontrandolfo, G. (Eds.). (2017). Phraseology in Legal and Institutional Settings: A Corpus-based Interdisciplinary Perspective. London: Routledge.
Gustafsson, M. (1975). Binomial Expressions in Present-day English: A Syntactic and Semantic Study. Turku: Annales Universitatis Turkuensis.
Gustafsson, M. (1976). The Frequency and “Frozenness” of Some English Binomials. Neuphilologische Mitteilungen, 77(4), 623–637. Retrieved March 27, 2020, from www.jstor.org/stable/43343096.
Gustafsson, M. (1984). The Syntactic Features of Binomial Expressions in Legal English. Text-Interdisciplinary Journal for the Study of Discourse, 4(1–3), 123–142.
Hegarty, P., Watson, N., Fletcher, K., & McQueen, G. (2011). When Gentlemen are First and Ladies Last? Effects of Gender Stereotypes on the Order of Romantic Partners’ Names. British Journal of Social Psychology, 50, 21–35.
Hoey, M. (2005). Lexical Priming: A New Theory of Words and Language. London: Routledge.
Koskenniemi, I. (1968). Repetitive Word Pairs in Old and Early Middle English Prose: Expressions of the Type “Whole and Sound” and “Answered and Said”, and Other Parallel Constructions (Vol. 107). Turun Yliopisto.
Longman Dictionary of Contemporary English (LDOCE). (2009 [1978]). 5th ed. Harlow: Pearson Education.
Mahlberg, M. (2007). Clusters, Key Clusters and Local Textual Functions in Dickens. Corpora, 2(1), 1–31.
Malkiel, Y. (1959). Studies in Irreversible Binomials. Lingua, 8, 113–160.
Mollin, S. (2014). The (Ir)reversibility of English Binomials. Amsterdam: John Benjamins.
Moon, R. (1998). Fixed Expressions and Idioms in English. Oxford: Clarendon Press.
Nebot, E. M. (2017). The Out-grouping Society: Phrasemes Othering Underprivileged Groups in the International Bill of Human Rights (English-French-Spanish). In Phraseology in Legal and Institutional Settings (pp. 109–125). Routledge.
O’Keefe, A., McCarthy, M., & Carter, R. (2007). From Corpus to Classroom: Language Use and Language Teaching. Cambridge: Cambridge University Press.
Omar, M. ([1973] 2007). The acquisition of Egyptian Arabic as a native language. The Hague: Mouton.
Pace-Sigge, M. (2015). The Function and Use of TO and OF in Multi-word Units. Houndmills and Basingstoke: Palgrave Macmillan.
Pace-Sigge, M. (2018). How homo economicus is reflected in fiction – A corpus linguistic analysis of 19th and 20th century capitalist societies. Language Sciences 70, 103–117.
Parrish, A. (2018). A Gutenberg Poetry Corpus. Available at https://github.com/aparrish/gutenberg-poetry-corpus (last accessed 09/11/2018).
Sardinha, T. B. (2017). Lexical Priming and Register Variation. In M. Pace-Sigge & K. J. Patterson (Eds.), Lexical Priming: Applications and Advances (pp. 189–230). Amsterdam: John Benjamins.
Scott, M. (2020). WordSmith Tools 8. www.lexically.net (last accessed 16/January/2020).
Seifart, F., Strunk, J., Danielsen, S., Hartmann, I., Pakendorf, B., Wichmann, S., Witzlack-Makarevich, A., De Jong, N. H., & Bickel, B. (2018). Nouns Slow Down Speech Across Structurally and Culturally Diverse Languages. Proceedings of the National Academy of Sciences May 2018, 201800708; https://doi.org/10.1073/PNAS.1800708115.
Sinclair, M. (Ed.). (1990). Collins Cobuild English Grammar. London and Glasgow: Collins.
Siyanova-Chanturia, A., Conklin, K., & van Heuven, W. J. B. (2011). Seeing a Phrase “Time and Again” Matters: The Role of Phrasal Frequency in the Processing of Multiword Sequences. Journal of Experimental Psychology: Learning, Memory and Cognition, 37(3), 776–784. https://doi.org/10.1037/a0022531.
Sugiati, A., & Rukmini, D. (2017). The Application of Formulaic Expressions in The Conversation Texts of Senior High School English Textbooks. EEJ, 7(2), 103–111.
Wray, A. (2013). Formulaic Language. Language Teaching, 46(3), 316–334.

CHAPTER 2

LNGs in Spoken Interaction and Written Academic Texts

2.1   Introduction

While these two forms of English are at extreme ends of language usage, they provide a helpful platform to introduce the research presented in this book. Complex noun phrases like the Linked Noun Groups (LNGs) should not be expected in spoken usage (most probably due to the longer processing time required; see Siyanova-Chanturia et al., 2011) but should be expected in edited texts. This has been found to be true even where language is more formal and high in information density (in the British Academic Spoken Corpus, BASE). They are found to be rare in casual, informal conversation as demonstrated through investigating the recent BNC-2014 data. By contrast, LNGs are high in informational content. They are, therefore, characteristically found in academic writing, where they reflect topicality to a very high degree. The use of noun phrases and nominalisations is, indeed, seen as typical of academic (scientific) writing. “Nominalizations are crucial to the conciseness expected in academic language”, according to Snow (2010, 452). Furthermore, this chapter will pinpoint that more recent academic writings display a small yet noticeable shift towards greater gender awareness, and this is manifest in how a number of LNGs appear as newly coined multi-word units. In order to find this shift, material from the late twentieth century (BNC) is compared with more recent, post-2000 material

© The Author(s) 2020 M. Pace-Sigge, Linked Noun Groups, https://doi.org/10.1007/978-3-030-53986-3_2


(BAWE). Chapter 5 will take case studies into account where usage patterns for key examples are looked at in detail. This chapter will make use of the corpora shown in Table 2.1. As there is not sufficient space to look at all possible LNGs, this chapter will focus on the most frequently occurring ones. As a result of doing so, the relevant dispersions are revealed (see 2.4.3). The links for the noun groups investigated are the following conjunctions, whereby N stands for the number of tokens and the percentage within the total of tokens for the relevant conjunct is given. Table 2.2 shows that, while and is used proportionally without much divergence, the use of or is more prominent in BNC-AC; the most noticeable differences are, however, found for the use of but, which is clearly prominent in the spoken corpora. And yet, the LNG with but appears to exist only, if at all, in the form of rare, idiosyncratic expressions in the BNC-2014 and BASE, so it is not further investigated.

Table 2.1  Corpora used for Chap. 2
Corpus      N tokens      N files   Publication year
BNC-2014    10,447,716    1,251     2017
BASE        1,634,056     203       2007
BNC-AC*     16,163,228    501       1994
BAWE        6,529,941     2,761     2005
* BNC-AC are the written academic writing sub-corpora. These are the six W-AC sub-corpora within the BNC written. For details see http://www.natcorp.ox.ac.uk (last accessed 08/June/2020)

Table 2.2  LINK word frequencies
Corpus     AND N      AND %   OR N      OR %   BUT N      BUT %
BNC-14     276,683    2.64    41,634    0.40   103,418    0.99
BASE       45,606     2.79    7,573     0.46   10,178     0.62
BNC-AC*    416,576    2.61    88,758    0.56   55,849     0.35
BAWE       190,670    2.84    22,984    0.34   15,149     0.23
* BNC-AC are the written academic writing sub-corpora
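The relation between raw counts, corpus size and the normalised figures used in this chapter can be illustrated with a short calculation. This is a minimal sketch: the function names are hypothetical, and it simply assumes that percentages are counts divided by the corpus token totals of Table 2.1, and that the p/mio values in the later tables are counts per one million tokens.

```python
def share_percent(count, corpus_tokens):
    """Share of all corpus tokens taken up by an item, in per cent."""
    return 100 * count / corpus_tokens


def per_million(count, corpus_tokens):
    """Occurrences per one million tokens (the p/mio figure in later tables)."""
    return 1_000_000 * count / corpus_tokens


# Token totals as listed in Table 2.1.
TOKENS = {"BNC-2014": 10_447_716, "BASE": 1_634_056}

# BUT in BNC-2014 (N = 103,418) and AND in BASE (N = 45,606), from Table 2.2.
print(round(share_percent(103_418, TOKENS["BNC-2014"]), 2))
print(round(share_percent(45_606, TOKENS["BASE"]), 2))

# ME AND I occurs 329 times in BNC-2014 (Table 2.3).
print(round(per_million(329, TOKENS["BNC-2014"]), 1))
```

The three printed values are 0.99, 2.79 and 31.5, in line with the corresponding cells of Table 2.2 and Table 2.3. Normalising in this way is what makes counts from corpora of very different sizes comparable.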


2.2   Spoken Discourse: LNGs in BNC-2014

2.2.1  BNC-2014 LNGs with and and or

It has to be highlighted that the usage profile for and LNGs mirrors the findings for or LNGs perfectly within casual spoken English, while the colligational structures are broadly similar. Therefore, we can focus on the following four categories when it comes to LNG use:

1. Personal pronouns
2. Numerals
3. Time markers
4. Vagueness markers

In a comparison of the LNGs, the word frequencies and the words employed differ depending on whether the conjunct is and or or.

2.2.2  BNC-2014 LNGs with and and or: Personal Pronouns

Casual spoken English, as represented by the BNC-2014 (usually recorded in informal settings, amongst friends and family), presents a clear predominance of personal pronouns. This is also reflected in the occurrence of LNGs. While pronouns are not the focus of this book, Table 2.3 gives a good impression of how these occur, including their use with the vagueness marker stuff.

Table 2.3  and LNGs for pronouns in BNC-2014
L1 (total N)       LNG              N     p/mio
ME (N=2,794)       ME AND I         329   31.5
ME (N=2,794)       ME AND MY        166   15.9
ME (N=2,794)       ME AND YOU       98    9.4
ME (N=2,794)       ME AND HE        84    8.0
ME (N=2,794)       ME AND SHE       75    7.2
YOU (N=2,269)      YOU AND I        259   24.8
YOU (N=2,269)      YOU AND YOU      203   19.4
YOU (N=2,269)      YOU AND YOUR     69    6.6
YOU (N=2,269)      YOU AND HE       67    6.4
YOU (N=2,269)      YOU AND SHE      41    3.9
THEM (N=1,793)     THEM AND THEY    151   14.4
THEM (N=1,793)     THEM AND YOU     68    6.5
THEM (N=1,793)     THEM AND STUFF   26    2.5
It must be said that “you or you” or “you and you” is only an option in multi-participant conversations (it’s a form of deixis) and needs at least three people


Fig. 2.1  Distribution of pronouns in and LNGs: ME 41%, YOU 33%, THEM 26%
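The shares shown in Fig. 2.1 can be recovered from the L1 totals reported in Table 2.3. The following is a small sketch under the assumption that each percentage is the pronoun's L1 total divided by the sum of the three totals; the variable names are illustrative only.

```python
# L1 totals for the three pronouns, as given in Table 2.3.
l1_totals = {"ME": 2794, "YOU": 2269, "THEM": 1793}

grand_total = sum(l1_totals.values())

# Share of each pronoun among all pronoun-initial and LNGs, rounded to whole per cent.
shares = {pron: round(100 * n / grand_total) for pron, n in l1_totals.items()}
print(shares)
```

Under that assumption the rounded shares come out as 41, 33 and 26 per cent, matching the figure.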

Table 2.4  or LNGs for pronouns in BNC-2014
L1 (total N)     LNG                  N    p/mio
ME (N=162)       ME OR YOU            16   1.5
ME (N=162)       ME OR SOMETHING      14   1.4
ME (N=162)       ME OR ANYTHING       11   1.0
ME (N=162)       ME OR I              6    0.6
YOU (N=349)      YOU OR YOU           30   2.9
YOU (N=349)      YOU OR I             17   1.7
YOU (N=349)      YOU OR ANYTHING      11   1.0
YOU (N=349)      YOU OR WHATEVER      9    0.9
THEM (N=245)     THEM OR SOMETHING    30   2.9
THEM (N=245)     THEM OR ANYTHING     13   1.2
THEM (N=245)     THEM OR WHATEVER     9    0.9

Figure 2.1 also shows a preference for speakers to move out in concentric circles: from the self to the direct addressee to a broadly defined group of people outside. Table 2.4, in comparison to Table 2.3, shows that while or LNGs have clear equivalents (me and I / me or I; you and you / you or you), the use of personal pronouns is much less pronounced, as the total numbers are a lot lower. This is understandable given that or occurs only about once for every six to seven occurrences of and (Table 2.2). Instead, the pointer towards vagueness is far stronger where or is employed; furthermore, while stuff is strongly associated with and LNGs, it does not surface in or LNGs. Table 2.4 is set to mirror the order found in Table 2.3. Crucially, or LNGs do not use the same order; in fact, the most personal is the least used, as Fig. 2.2 demonstrates. This highlights that and is used in a more personally involved manner while or indicates “possibilities or choices”, which a speaker seems to be more likely to offer to others.

Fig. 2.2  Distribution of pronouns in or LNGs: ME 22%, YOU 46%, THEM 32%

Table 2.5  Comparison of the most frequent L1 numerals in and LNGs in BNC-2014
L1 + conjunct (total N)   R1        N     per/mio
HUNDRED AND (N=934)       FIFTY     403   38.6
HUNDRED AND (N=934)       TWENTY    203   19.4
HUNDRED AND (N=934)       THIRTY    136   13.0
TWO AND (N=591)           A HALF    233   22.3
TWO AND (N=591)           TWO       29    2.8
TWO AND (N=591)           THREE     14    1.3
TWO AND (N=591)           PRON.     81    7.8

2.2.3  BNC-2014 LNGs with and and or: Numerals and Time Indicators

A further Linked Noun Group type is found in numerals. There are 934 (L1) counts of hundred in BNC-2014 (see Table 2.5). Only the hundred and count follows a neat, Zipfian pattern.¹ Both and and or LNGs use two, yet it can be seen that their reference points are quite different. Table 2.6 shows that Number OR has a clearly colligational format in BNC-2014: the numerals are very low and, in eight out of ten cases, are followed by the next number up. Therefore, we also find 230 occurrences of FIVE or (mostly, five or six) and 140 of SIX or (mostly, six or seven). Where there is no following numeral, the speaker employs a vagueness marker. Furthermore, though this is not an LNG, there are 268 instances of more or, 231 of which are more or less.
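The Zipfian reading of the hundred and counts can be checked with a small calculation. The sketch below assumes the simplest form of Zipf's law, with exponent 1, under which the item at rank r is expected to occur about f(1)/r times; that choice of exponent and the function name `zipf_expected` are assumptions made for illustration, not part of the original analysis.

```python
def zipf_expected(top_count, rank, exponent=1.0):
    """Expected frequency at a given rank under f(r) = f(1) / r**exponent."""
    return top_count / rank ** exponent


# Observed counts for HUNDRED AND + FIFTY / TWENTY / THIRTY (Table 2.5).
observed = [403, 203, 136]

for rank, count in enumerate(observed, start=1):
    expected = zipf_expected(observed[0], rank)
    deviation = abs(count - expected) / expected
    print(f"rank {rank}: observed {count}, expected {expected:.1f}, "
          f"deviation {deviation:.1%}")
```

For these three counts the observed values stay within two per cent of the expectation, which is the sense in which the pattern can be called neat. The two and counts in the same table (233, 29, 14) fall away far more steeply.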


Table 2.6  Comparison of the most frequent L1 numerals in or LNGs in BNC-2014
L1 + conjunct (total N)   R1          N     per/mio
ONE OR (N=511)            TWO         243   23.3
ONE OR (N=511)            THE OTHER   33    3.2
ONE OR (N=511)            SOMETHING   26    2.5
TWO OR (N=504)            THREE       403   38.6
TWO OR (N=504)            SOMETHING   28    2.7
TWO OR (N=504)            WHATEVER    12    1.1
THREE OR (N=354)          FOUR        277   26.5
THREE OR (N=354)          SOMETHING   19    1.8
THREE OR (N=354)          HIGHER      8     0.7

Table 2.7  and LNGs for time indicators
L1 (total N)      LNG               N     per/mio
TIME (N=1,166)    TIME AND pron     39    3.7
TIME (N=1,166)    TIME AND THEN     111   10.6
TIME (N=1,166)    TIME AND STUFF    12    1.1
TIME (N=1,166)    TIME AND EFFORT   11    1.0
DAY (N=994)       DAY AND pron      331   31.7
DAY (N=994)       DAY AND THEN      101   9.7
DAY (N=994)       DAY AND AGE       20    1.9
YEARS (N=567)     YEARS AND pron    123   11.8
YEARS (N=567)     YEARS AND YEARS   287   27.5
YEARS (N=567)     YEARS AND THEN    64    6.1
YEARS (N=567)     YEARS AND THAT    5     0.4

Related to the issue of numerals are references to a time frame: there are time, day and years, as shown in Table 2.7. As we have seen, there is a clear preference for pronoun references. Table 2.7 indicates that all time indicators carry and then. In fact, and then as well as and pron. are typical for this structure in BNC-2014. A number of random checks (for the nouns home, people, stuff, something, thing, work) appear to confirm that. Otherwise, we get idiomatically fixed phrases like time and effort, day and age or years and years as well as the use of vagueness. Looking at or LNGs (Table 2.8), the results mirror what was seen earlier. There is again a dominant move to use vagueness markers, and, unlike what we have seen with the use of numerals, they are the predominant marker with any time-related word.²

2.2.4  BNC-2014 LNGs with and and or: Vagueness

In spoken casual conversation, vagueness markers are predominant (see, for example, Jucker et al. 2003, Evison et al. 2007, or Pace-Sigge 2013), and it can therefore be expected that the most typical usage found in BNC-2014 are LNGs that mirror this. As a consequence, stuff is the most

Table 2.8  or LNGs for time indicators
L1 (total N)     LNG                 N    per/mio
DAY (N=152)      DAY OR SOMETHING    33   3.1
DAY (N=152)      DAY OR TWO          22   2.1
DAY (N=152)      DAY OR WHATEVER     7    0.6
YEAR (N=187)     YEAR OR SOMETHING   41   3.9
YEAR (N=187)     YEAR OR TWO         30   2.9
YEAR (N=187)     YEAR OR SO          29   2.8
YEAR (N=187)     YEAR OR WHATEVER    6    0.5
WEEK (N=208)     WEEK OR SOMETHING   46   4.4
WEEK (N=208)     WEEK OR TWO         40   3.8
WEEK (N=208)     WEEK OR SO          28   2.7
WEEK (N=208)     WEEK OR WHATEVER    9    0.8

Fig. 2.3  Occurrence numbers for Noun and stuff LNG in BNC-2014: WORK & 31; FRIENDS & 19; FOOD & 18; HOUSE & 15; PEOPLE / TIME & 12; PHOTOS / TRAINING / CLOTHES & 11; MONEY / KIDS / BILLS / HOUSES / FILMS & 10

commonly employed part within and LNGs, as Fig. 2.3 demonstrates.³ Nevertheless, the total figures are low enough to make this a rare event, with the most common, “work and stuff”, appearing less than three times in a million words. Stuff is, by contrast, very rarely used in a Linked Noun Group following or. There are only 27 occurrences of N-or-stuff and each one is a singular expression. Nevertheless, LNG or is the colligational structure that makes use, as we have already seen, of vagueness markers to a very high degree, with the clear focus that these are in R1 position to the conjunct or, as Fig. 2.4 clearly shows.

Fig. 2.4  Occurrence numbers for Noun -[or]- Noun vagueness markers. R1 (following or): WHATEVER 2,034; SOMETHING 6,818; ANYTHING 1,259; SOMEWHERE 147; SOMEBODY / SOMEONE 118. L1 (preceding or): SOMETHING 399; THINGS 76

Overall, it can be seen that LNG use with and is extremely rare in BNC-2014: the numbers are vanishingly low compared to the size of the corpus, or even compared to the number of times and occurs. The same can be said for LNG use with or.

2.3   BASE LNGs: Spoken Academic English

2.3.1  BASE LNGs: Personal Pronouns

BASE represents spoken British academic material: in other words, the recordings of lectures and seminars in a university setting. As such, it is less direct yet more formal than the BNC data. The BNC-2014 is roughly 6.4 times larger than BASE (10,447,716 against 1,634,056 tokens; Table 2.1), which means that the low figures observed above are lower still in this data. The structure of use is also quite different. Take, for example, the use of you as L1 to and: there are nearly 2,300 occurrences in BNC-2014 but only 62 in BASE. In fact, as Fig. 2.5 (in contrast to Fig. 2.1) highlights, speakers refer only rarely to themselves, address their listeners a lot less than in the BNC-2014 and, instead, focus on referring to third parties. References to gender inclusiveness are vanishingly rare in BNC-2014. For example, L1 he has 57 occurrences, of which only seven are he or she. Contrast this to BASE usage (Table 2.9).

Fig. 2.5  Distribution of pronouns in and LNGs in BASE: ME 12%, YOU 36%, THEM 52%

Table 2.9  or LNGs referring to non-gendered persons in BASE (HE: L1 N=22; HIS: N=13)
LNG           N    per/mio
HE OR SHE     22   13.5
SHE OR HE     1    0.6
HIS OR HER    13   8.0

Every single time they use he/his, the speakers also include she/her, thus aiming for non-gendered personal reference. It must be specifically highlighted, however, that only one speaker (and only once) inverts this so that the male is not in first position. This is all the more interesting given that the total count of he with and in BASE is 61 and that of she is 60, which is neatly balanced. However, in L1 position, she occurs only in this single instance.

2.3.2  BASE LNGs: Spatial Relations

The use of spatial and time relations appears to be more important in BASE than in BNC-2014, as shown in Table 2.10. In connection with either here or there, pronouns appear in R1. This has not been found in BNC-2014.


Table 2.10  and LNGs for spatial relation in BASE
L1 (total N)      LNG               N    per/mio
HERE (N=342)      HERE AND THEN     28   17.2
HERE (N=342)      HERE AND THIS     13   8.0
HERE (N=342)      HERE AND HERE     12   7.4
HERE (N=342)      HERE AND PRON.    46   28.2
THERE (N=298)     THERE AND THEN    32   19.6
THERE (N=298)     THERE AND THIS    9    5.5
THERE (N=298)     THERE AND THERE   9    5.5
THERE (N=298)     THERE AND PRON.   29   17.8

Table 2.11  numeral LNGs in BASE
L1 + conjunct   R1        N    p/m
HUNDRED AND     FIFTY     59   36.2
HUNDRED AND     EIGHTY    39   23.9
HUNDRED AND     TWENTY    37   22.7
HUNDRED AND     …
HUNDRED AND     THIRTY    27   16.6
HUNDRED AND     FORTY     17   10.4
TWO AND         A HALF    50   30.7
TWO AND         THREE     10   6.1
TWO AND         ONE       7    4.3
TWO AND         YOU       6    3.4
ONE OR          TWO       93   57.1
ONE OR          MORE      11   9.5
TWO OR          THREE     62   38.0
TWO OR          MORE      6    3.7
ONCE OR         TWICE     6    3.7
THREE OR        FOUR      43   26.3

2.3.3  BASE LNGs: Numerals and Time Indicators

These match the data found for BNC-2014. The most relevant figure for LNGs with the conjunct and is hundred, as shown in Table 2.11.⁴ As in the BNC-2014, hundred and fifty is the most frequent of these. However, the other numerals, while also using full counts of ten, appear in differing frequencies. The use of two and does follow, however, the same pattern as seen in BNC-2014. Amongst the time markers, there is only one LNG, namely years and years, which occurs 26 times. Furthermore, minute or two is recorded seven times. Neither week nor day appears in LNGs.

2.3.4  BASE LNGs: Highlighting Subject-Specific Information

There is, however, one area where the BASE data differs substantially from BNC-2014, and this is also the only area where there is overlap with the data of academic written texts (BAWE): LNGs that provide highly specific information. Still, it must be said that these remain low in number: the most frequent of these appears six times in every 10,000 uses of and in BASE.


Table 2.12  high-info LNGs in BASE
       BASE and LNG          N    p/m     BASE or LNG              N    p/m
(a)    GOODS AND SERVICES    31   19.0    MORE OR LESS             77   47.2
(b)    NEEDS AND WANTS       19   11.7    PLUS OR MINUS            14   8.6
(c)    BLACK AND WHITE       18   11.0    ONE WAY OR ANOTHER       13   8.0
(d)    X AND Y               18   11.0    COMMENTS OR QUESTIONS    6    3.4

Table 2.12 represents idiomatic fixed phrases.⁵ The occurrences in and LNGs for (a) are typically preceded by of or the and followed by a relative clause, while (b), barring two exceptions, is the fixed phrase their needs and wants. The reverse form is not found, though there is wants and interests, which appears three times in one file, twice in the statement that “a syllabus should take students’ wants and interests into account”. (c) refers exclusively to skin colour, and it must be noted that white and- occurs only six times and white and black is not recorded at all. Finally, (d) X and Y are the typical placeholders, as in “you can think about x and y as being anything you like”. Both (a) and (d) prominently occur in BAWE LNGs as well. The occurrences in or LNGs are rather formulaic, as typified by (a) and (c), whereby (a) is merely an example of the most frequent ADJ-or-ADJ in all corpora. (d) usually appears at the end of a lecture or seminar.

2.4   LNGs in Spoken Data

Overall, there are very few LNGs recorded in both casual speech and speech employed in lectures and seminars. Where they are found, the total numbers are vanishingly low. Typically, they can be found with personal pronouns or numeral markers; where N-[OR]-N constructions are used, vagueness predominates. There are marked differences between the two types of spoken data: BASE is a lot less personalised and it reflects greater awareness of gendered language use. It is also more focussed on specific details, including spatial markers. Beyond that, there are a large number of fixed phrases which support the thesis that lack of processing time in online production does not allow for complex noun syntax.


M. PACE-SIGGE

2.5   LNGs in Written Academic Texts

2.5.1  Introduction

Written academic English provides two apparently opposed tendencies: on the one hand, the style is at the end of the spectrum with regard to formality (cf. Snow, 2010). Discourse-specific use of academic language is a crucial marker, as Hyland and Tse (2007, 247) describe: “writers must encode ideas and frame arguments in ways that their particular audience will find most convincing, (…) frequently moulding everyday words to the distinctive meanings of the disciplines”. It is conservative and formulaic (cf. Biber, 2015). Indeed, Bennett (2009, 43) goes as far as saying that English Academic Discourse “seems rigidly standardized and rule-bound, monolithic even”. As a result, one expects trigrams like “the results of” to be rather prominent while a possible example of a typical academic-language style Linked Noun Group would be “theory and practice.”6 On the other hand, “[t]here is also no single academic language, (…) Academic language features vary as a function of discipline, topic, and mode” (Snow, 2010:450). It can be said that, within these topics, authors often strive to coin neologisms: this means that one would expect any such trigrams, in particular LNGs, to be rather few in number. There is, however, one final caveat that pulls these two strands together: every writer in each discipline is expected to use the current terminology that is fitting for the subject,7 and to use it indeed so often that the term “buzzword” is not necessarily out of place. This is true for both words and phrases.

A final point on the wording: academic writing shows a heightened awareness, in particular with regard to word choices that aim to be non-gender-specific and non-exclusionary to minorities. Given that data published ten years apart (mid-1990s, early 2000s) is being discussed here, any evidence of that (as observed in BASE above) should be of interest, both as to the use and as to the development of LNGs in academic writing.
Beyond the content, the quantity of data processed is of importance. Compared to the spoken corpora discussed above, a lot more material will be used here: the academic texts in the BNC account for over 16 million words. That said, it must be pointed out that the largest coherent cluster in the BNC-AC, “part of the”, still occurs only 5088 times. This, the reader must keep in mind, means that the most frequent trigram appears almost 315 times per one million words in this corpus.

The corpora have been tested for a number of conjuncts within their LNGs. None of but, nor or yet produced anything beyond a few instances. The single exception is as, in the phrase “N as a whole”. The most frequent of these is “society as a whole”, which occurs 118 times in BNC-AC and 43 times in BAWE. More interesting still, the pragmatic and colligation structure of “as a whole” is extremely stable. The preceding noun typically carries the definite article and as is always followed by the indefinite article.8 Both corpora only use this term in connection with the word field society, as Fig. 2.6 demonstrates (most frequent NP top).

[Figure: The SOCIETY; The ECONOMY; The POPULATION; The NATION; The WORLD; The GROUP; The COUNTRY; The COMMUNITY; The SCHOOL; The STATE; The UK / USSR / US / TEXAS etc. + AS A WHOLE]

Fig. 2.6  NPs preceding as a whole

Apart from this single exception, all other LNGs appear with either and or or. These will be discussed in detail below.

2.5.2  Numerals

As seen above, in the spoken data the use of number-and-number / number-or-number noun groups accounts for the most frequently occurring such clusters. In neither BAWE nor BNC-AC are numeral LNGs the most frequent. Also, in particular with and, there is a noticeable tendency to use placeholders, as shown in Table 2.12. Table 2.13 shows that the same LNGs appear broadly with the same frequencies in both data sets. As in BASE, we see that LNGs with or are typically employed for vagueness, the figures for “… or more” being substantially higher than those for the more specific description. In particular, the less precise indications using or tend to be significantly more frequent. LNGs with and here make strong use of placeholders, while “1 and 2” or “3 and 4” tend to refer to tables or graphics. As the direct comparison in Table 2.12 highlights, amongst the LNGs with or, numerals are the most frequent trigrams, albeit for only low-count numbers: unlike the spoken data where there is no reference to hundreds. The one exception is denotations of decades; see below for details.
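The per-million figures quoted for “part of the” and used throughout the tables follow from normalising a raw count by corpus size. A minimal Python sketch of that arithmetic is given here. The token total of 16.16 million for BNC-AC is an assumption: the text only states “over 16 million words”, and this value was chosen because it is consistent with the reported 5088 occurrences and “almost 315” per million.

```python
def per_million(count: int, corpus_tokens: int) -> float:
    """Normalise a raw count to occurrences per one million tokens."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return count * 1_000_000 / corpus_tokens


# Assumption: hypothetical corpus size, not a figure stated in the text.
BNC_AC_TOKENS = 16_160_000

# Raw counts taken from the text: "part of the" and "society as a whole".
part_of_the = per_million(5088, BNC_AC_TOKENS)
society_as_a_whole = per_million(118, BNC_AC_TOKENS)

print(f"part of the: {part_of_the:.1f} per million")
print(f"society as a whole: {society_as_a_whole:.1f} per million")
```

With a different assumed corpus size the normalised values shift accordingly, which is why raw counts and per-million rates are best read side by side.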


Table 2.13  Numeral LNGs in BAWE and BNC-AC

LNG                      N BAWE   per mio BAWE   N BNC-AC   per mio BNC-AC
1 AND 2                  149      22.8           262        0.063
A AND B                  139      21.3           398        0.096
2 AND 3                  65       10.0           241        0.058
B AND C                  61       9.3            175        0.042
X AND Y                  54       8.3            125        0.030
FIRST AND SECOND         44       6.7            121        0.030
PRIMARY AND SECONDARY    41       6.2            162        0.040

LNG                      N BAWE   per mio BAWE   N BNC-AC   per mio BNC-AC
ONE OR MORE              81       12.4           469        0.528
TWO OR MORE              59       9.0            317        0.357
YEARS OR MORE            11       1.7            61         0.065
ONE OR TWO               61       9.3            272        0.306
TWO OR THREE             48       7.4            219        0.247
THREE OR FOUR            20       3.1            116        0.131
FOUR OR FIVE             13       2.0            51         0.057

Both forms “1 and 2” / “one and two” and so on appear with and. This is more predominant in BAWE, probably reflecting that these are unpublished works. All percentages are in relation to and and or token totals (see Table 2.2)

Overall numbers for this type of LNG are low, lower in any case than in the spoken data. This already indicates that LNGs in written academic English have a tendency to point to a different kind of information content.

2.5.3  Specific Information Containers

2.5.3.1 Introduction

Snow (2010, 450) gives the following definition of academic language: “[a]mong the most commonly noted features of academic language are conciseness, achieved by avoiding redundancy; using a high density of information-bearing words, ensuring precision of expression; and relying on grammatical processes to compress complex ideas into few words”. This is borne out in the material investigated here: the two corpora show a high degree of agreement as to the LNGs employed. They are very focussed on communicating high-information values and act as specific containers that reflect the topics writers deal with. The categories below stem from the corpus-derived trigrams: the BAWE corpus is not subdivided by subject area, though the BNC is. LNGs with or most frequently refer to quantities, while and as the linking conjunct has quantities as highly frequent. This section focusses on information containers, where presence or absence or negative or positive are the most frequent LNGs after numeral references on the one hand, and where England and Wales or formula and formula are most frequent amongst all the LNGs. This section shows there does not seem to be one particular field where LNGs are dominant. However, while a small number of LNGs in this genre seem to be quite frequent, it must be highlighted that both the total number and the relative frequencies of this construction remain low. As a consequence, these trigrams rarely account for a frequency of higher than six instances per one million words. As a rough rule of thumb, frequencies of lower than 50 (BNC-AC) or 30 (BAWE) were not taken into account. The full list of all the LNGs found in both corpora can be found in the chapter’s appendix (Sect. 2.6). The choice as to the respective categories was first made based on the assumed topical area of the trigram, that is, its use within a particular faculty of academia. This was then cross-checked for its classification within the BNC subfolders. Below is a discussion of key findings of the two academic text corpora.

2.5.3.2 Dominant Forms

This section highlights the use of LNGs with and or or that appear non-specific to any one topic. The first classification was made intuitively: information containers that could appear in more than one of the subsections used below. Where the BNC had such a trigram overwhelmingly in use within one particular academic field, this is noted. At times the lower level of proficiency (in using technical terms and phrases) can be noted in BAWE. In Sect. 2.3.2, we have seen that academic texts have a strong preference for numerals. An extension to that is the use of time markers, as Table 2.14 shows.
Table 2.14  Time-linked LNGs in BNC-AC and BAWE

LNG                    N BNC-AC   BNC-AC per million   N BAWE   BAWE per million
THE 1950s AND 1960s    106        6.56                 10       1.66
THE 1960s AND 1970s    117        7.24                 10       1.66
THE 1970s AND 1980s    70         4.33                 12       1.84
TIME AND PLACE         79         4.89                 41       6.28
SPACE AND TIME         74         4.57                 59       9.04
TIME AND ENERGY        50         3.09                 3        0.46
TIME AND MONEY         36         2.33                 34       5.21
PRESENT AND FUTURE     35         2.29                 42       6.43

A brief initial check as to the dispersion of THE 1950s AND 1960s reveals that amongst the BNC-AC it comes from 55 different sources, amongst which not one used this phrase more than five times. This is comparable to SPACE AND TIME in BAWE, which occurs in 24 different sources. However, its high occurrence per million makes it a clear outlier: this can be explained by the fact that one single file in BAWE uses this phrase a total of 14 times. It is noteworthy here that academic writing refers to two full decades, speaking of either “in the 1950s and ...” or “during the 1960s and 70s”; variations include “in the 1950s and the 1960s”. These come from all BNC sub-corpora apart from ac-tech-engin, with the majority being from the social sciences. That there are relatively few of these in BAWE, which seems to have proportionally greater use of the more vague “time and place” and so on, could be seen as evidence of greater empirical rigour in the published, earlier material. This seems to be supported by the type of information conveyed:

1. …intensified markedly between the 1950s and the 1970s.
2. …defended in depth by Isaiah Berlin in the 1950s and ’60s.

While both corpora use examples as in (1), the less specific forms seem to be less used in BNC-AC. Furthermore, the contracted form (“…and ’60s”) appears three times in BAWE (bringing the total to two per million words) while it is only recorded twice in the far larger body of concordance lines in the BNC-AC.

“Time and place” is often preceded by “specific” or “particular” in both corpora. In the BNC-AC, it appears a number of times in legal discussions. By contrast, “space and time”, the dominant LNG in BAWE, seems to appear very often because the texts discuss space, that is, the universe, in detail. BNC-AC, by contrast, employs a more physical or philosophical focus, with references to geographical developments or to Kant and Plato. There is a single reference to “time and space” constraints of the writer (in BAWE) too. Lastly, “time and money” predominates in the BNC-AC-polit-law-edu, while “present and future” often occurs as in “past, present and future”


and can lead to very evocative language as in this line (from BNC-AC-polit-law-edu): “authority is diffused between past, present and future; between the old, the new and what is to come. It is steady because, though it moves, …”.

The dominant form, and therefore a strong stylistic feature of written academic texts, is the use of opposites. This is a rather interesting finding: opposites are a rhetorical tool. When looking at the literature, the focus is invariably on adjectives (like dull vs. brilliant); see Jones et al. (2007). Indeed, such adjective forms are very strongly represented in my corpora, too (“public and private”, “true or false”, “black and white” or “left and right”), the latter two also, in their dispreferred colligation, as LNGs. Crucially though, even specialists on the genre of academic writing (e.g. Douglas Biber) or the teaching of the same (e.g. Avril Coxhead 1998, Ken Hyland 2002) seemingly failed to notice this particular discourse device. Tables 2.15 (LNGs with and) and 2.16 (LNGs with or) provide a list of the most frequent of these; and it can be seen that these are fairly equal in their distribution across the two corpora.9

Table 2.15  Opposites with and LNGs in BNC-AC and BAWE

LNG                             N BNC-AC   BNC-AC per million   N BAWE   BAWE per million
THEORY AND PRACTISE             107        6.56                 48       7.35
CAUSE AND EFFECT                93         5.75                 54       8.27
INPUT AND OUTPUT                82         5.11                 49       7.50
STRENGTHS AND WEAKNESSES        73         4.52                 68       10.41
ADVANTAGES AND DISADVANTAGES    53         3.27                 62       9.49
SIMILARITIES AND DIFFERENCES    23         1.42                 34       5.21

Coxhead and Byrd (2007, 134) point out that a typical feature of academic language is the use of “long complicated noun phrases with nouns more often followed by prepositional phrases than by relative clauses…and a tendency to use words of Latin or Greek origin”. This is certainly true here, where of is the most typical preposition to follow the above LNGs. Furthermore, it must be noted that a number of the opposites refer to a plurality, which points to a discussion. Looking at “theory and practise”, for example, “practise” is the most frequent noun-collocate, co-occurring 207 (80) times in the 6455 (5868) times “theory” occurs in BNC-AC (BAWE). Typically preceded by between, or in, the LNG is mostly part of “the theory and practise of” (20 in BNC-AC, seven in BAWE). The BNC-AC has, furthermore, five instances of “policy, theory and practise”. Looking at “cause and effect”, this is typically preceded by of. However, in BAWE “to establish cause and effect” appears in 8% of all its uses, while it is not recorded in BNC-AC, which most prominently uses “between cause and effect” in 13% of all of these LNGs. As can be seen, “strengths and weaknesses” and “advantages and disadvantages” are markedly more frequent in BAWE. Here, both are often used with regard to economics, whereas both seem to cover a wider range of topics in BNC-AC. “Strengths and weaknesses” is hedged just twice using “apparent” in BNC-AC and only three times in BAWE. Crucially though, “advantages and disadvantages” is never preceded by a form of hedging in BAWE, yet shows a number of different downtoners in BNC-AC.

Table 2.16 demonstrates that LNGs indicating opposition are evident in BNC-AC, yet rather marginal in the newer corpus. The first typically appears in the form of “the presence or absence of” (111 times in BNC-AC, all occurrences in BAWE); the last, typically “in one way or another”, appears in the humanities, social sciences and politics/law/education subfolders of the BNC-AC, a few times in natural sciences, yet never in engineering texts. That it occurs at all might surprise students of academic writing skills courses: the language is rather vague and it does, indeed, point towards the lack of more precise knowledge in the writer. Thus, one gets the double vagueness in “and in one way or another more or less Bellinesque”. Alternatively, it is employed as the fixed idiom, as in “most of them in one way or another emphasizing the value of local initiative”. As such, the idiom seems to imply that “further more detailed information is not necessary here”.
Below are just some of the opposites; others (profit and loss, males and females, etc.) will be described as part of the subsections below.

Table 2.16  Opposites with or LNGs in BNC-AC and BAWE

LNG                    N BNC-AC   BNC-AC per million   N BAWE   BAWE per million
PRESENCE OR ABSENCE    143        8.85                 11       1.68
ONE WAY OR ANOTHER     54         3.34                 14       2.14


Table 2.17  Expansion LNGs in BNC-AC and BAWE

EXPANSIONS                      N BNC-AC   p/mio   N BAWE   p/mio
MATERIALS AND METHODS           120        7.42    10       1.66
RESEARCH AND DEVELOPMENT        111        6.87    74       11.33
CHARGES AND EXPENSES            76         4.70    0        0
CONVEYANCE OR TRANSFER          61         3.71    10       1.53
HARDWARE AND SOFTWARE           65         4.02    15       2.30
HARDWARE OR SOFTWARE            6          0.38    2        0.3
SOFTWARE OR HARDWARE            10         0.62    0        0
KNOWLEDGE AND SKILLS            63         3.9     17       2.60
RECRUITMENT AND SELECTION       3          0.19    91       13.94
RELIABILITY AND VALIDITY        3          0.19    37       5.67
INFORMATION AND CONSULTATION    0          0       40       6.13
CASES AND MATERIALS             0          0       39       5.95

Table 2.17 looks at another form of lexical bundles: not opposites but expansions of the first term. The most frequent in BNC-AC, “materials and methods”, appears because it is usually a chapter heading. There are two LNGs which are the odd ones out: “charges and expenses” and “conveyance or transfer” are used only in two law texts; their dispersion is so low as to be almost idiosyncratic. At the other extreme stands “Cases and Materials”, which only occurs in BAWE and is part of a variety of book titles. Something similar can be observed with the LNG “recruitment and selection”. This is rare in BNC-AC; the texts only come from eight sources. This seems to indicate that the material here is highly specific, as is the colligation structure, with the LNG typically preceded by method, structure, policy and so on. “Software and hardware” as well as “knowledge and skills” are fairly fixed phrases. In the BNC-AC, there are only twelve instances of “hardware and software” and “skills and knowledge” is marginal (three occurrences). It must be noted that “software or hardware” is preferred to its reverse form in BNC-AC, while BAWE hardly shows any use of the or-LNG at all. Another fixed expression is “reliability and validity”. It could be said that this occurs strongly in BAWE as the writer feels the need to justify the methods or processes employed: BNC-AC authors may expect their audience to be familiar enough with these.


2.5.3.3 Country and Law

The following four subsections present clear differences in the foci of the two corpora. While there is a degree of overlap, it can be seen that Country and Law is given far more prominence in the BNC-AC; Medicine and Science and Social Sciences provide an indicator of which priorities have shifted; finally, the Business, Economics and Education section shows how LNGs related to these topics have become more prominent in BAWE than in the older BNC-AC.

The first thing one notices, looking at LNGs in the Country and Law section, is the absence of N-[OR]-N constructions in any significant numbers. The single exception is “any/consequential/other loss or damage”, which occurs 3.65 times/million in BNC-AC (almost exclusively in the subset of law texts) yet less than once per million words in BAWE. As the full list (see appendix) shows, this area appears rather one-sided, with the majority of LNGs found only in the BNC-AC. One reason is “Crime and Punishment”, which, though occurring amongst law and social science texts, comes mainly from two ac-humanities files (Dostoyevsky’s novel). Interestingly, “State and Revolution” (V.I. Lenin) is the BAWE equivalent, making titles by Russian writers a reason for a stylistic marker in academic texts. The reason for most other LNGs is, however, that law-related texts clearly form a large chunk of the BNC-AC (20 per cent). This does explain, for example, the highest count for any Linked Noun Group for “England and Wales”. At the same time, frameworks of conduct, which may or may not be based on law, that is, the LNG “rules and regulations”, appear almost two and a half times as frequently in BAWE as in the older corpus (Table 2.18).

Table 2.18  Country and Law LNGs with and in BNC-AC and BAWE

LNG                      N BNC-AC   BNC-AC per million   N BAWE   BAWE per million
ENGLAND AND WALES        704        43.56                46       7.04
LAW AND ORDER (noun)     140        7.54                 10       1.53
RANK AND FILE (noun)     33         2.05                 6        0.93
RULES AND REGULATIONS    30         1.86                 33       4.60
TOWNS AND CITIES         65         4.02                 10       1.53
CITIES AND TOWNS         26         1.79                 0        0
CRIME AND PUNISHMENT     95         5.88                 2        0.31
STATE AND REVOLUTION     3          0.19                 35       5.36


The two rows marked “(noun)” show that LNGs are also quite close to linked adjective groups: there are indeed “law and order politicians” and “rank and file members”. Where the trigram modified a noun (usually but not always post-positioned), it has been treated here as an adjective group. However, in examples like “law and order was being imposed” or “many of the ordinary rank-and-file”, LNGs are being seen. One clear curiosity is “towns and cities”, appearing nearly twice as often as “cities and towns” in BNC-AC. Neither the context nor the variety of sources seems to be different: clearly, the authors preferred a structure of small-to-large / less-to-more-important to the reverse form. However, BAWE produces a full conformity that does not allow for “more important” to be named first here.

2.5.3.4 Medicine and Science

Table 2.19 provides the clearest example of change in use over time, for example the Department of Science and Technology and its secretaries of state and ministers: the department has long been defunct (1988 is the last corpus mention). “Science and Technology” is mostly preceded by in or of. However, in BAWE the majority of this Linked Noun Group appears as part of a journal or book title. Other LNGs reflect a typical feature of academic writing: section/chapter headings. Therefore, we find headings like “Evidence based care and issues for research. A brief consideration of the evidence base required for the diagnosis and management of the patient’s problem(s)” (in BAWE) and “Patients and Methods” in medicine files in BNC-AC.

Table 2.19  Medicine and Science LNGs with and and or in BNC-AC and BAWE

LNG                         N BNC-AC   BNC-AC per million   N BAWE   BAWE per million
SCIENCE AND TECHNOLOGY      98         6.06                 51       7.96
EDUCATION AND SCIENCE       78         4.83                 2        0.31
PATIENTS AND METHODS        78         4.83                 0        0
ACCIDENT AND EMERGENCY      59         3.65                 7        1.07
HEALTH AND SOCIAL CARE      32         1.98                 51       7.81
DIAGNOSIS AND TREATMENT     30         1.88                 10       1.53
DIAGNOSIS AND MANAGEMENT    10         0.62                 49       7.50
BLOOD OR URINE              44         2.72                 0        0
PATIENTS OR CLIENTS         5          0.31                 12       1.84


While the staffing and procedures in Accident and Emergency Departments were of particular interest to authors in the 1980s, this shifted to “health and social care” in the early 2000s. In fact, health and social care needs (13 per cent) as well as /services /system/workers are subject to some BNC-AC discussion, yet the focus in BAWE is on health and social care professionals (42 per cent) as well as health and social care /settings/needs/ practise (in that order). To a degree, the focus has shifted from patient to provider: the latter has been upgraded in status from worker/service to professional, and the actual provision (settings/practise) has become a lot more important. “Blood and urine” (five different writers, one use in a law-related text) is one of several examples where LNGs directly refer to medical practise in BNC-AC, yet these are far less visible in BAWE. Lastly, the wording “patients or clients” appears rather managerial. In all cases, the writers refer to the work of health professionals: their patients and clients. This, however, is insufficient evidence of a shift in the paradigm, as all twelve concordance lines were written by only two authors.

2.5.3.5 Social Sciences

Social Science texts have a strong tendency to employ adjective groups, for example “social and emotional/cultural/political” or “political and ideological/individual” and so on. This subsection, in particular, highlights that adjectives can be, as a technical word class, very close to nouns, something we already indicated earlier (see Table 2.16). One can have working class and middle class as an NP-conj-NP construction, yet this 4-gram typically premodifies a noun or noun phrase, as in working class and middle class households. Also, as seen above, LNGs are very useful for journal and book titles: half of the uses of “Policy and Practise” are just that.
Table 2.20 indicates that “age and sex”, as a descriptor for participants of a study, is a dominant descriptor in BNC-AC.10 In BAWE, however, the more neutral identification of “age and gender” is nearly as frequent, yet it is marginal in the older corpus.11 The referrer “individuals and/or groups” records a slight drop in use over time. It is notable that, while the or-LNG is less frequent, the differences are not that pronounced. Their colligations only differ marginally, too.


Table 2.20  Social Science LNGs in BNC-AC and BAWE

LNG                       N BNC-AC   BNC-AC per million   N BAWE   BAWE per/mio
AGE AND SEX               132        8.17                 20       3.06
AGE AND GENDER            10         0.62                 14       2.14
POLICY AND PRACTISE       91         5.63                 18       2.66
INDIVIDUALS AND GROUPS    64         3.99                 16       2.44
INDIVIDUALS OR GROUPS     48         2.99                 11       1.68
CLASS AND GENDER          29         1.80                 37       5.67
GENDER AND CLASS          8          0.47                 34       5.21

Among and between precede “individuals and groups”, while there is only by before “individuals or groups”. By contrast, “class and gender” is more than three times as frequent as its reverse form in BNC-AC, while they are about equal in use in BAWE. This reflects the focus of the particular sentences written there. Crucially, interest in this topic has increased substantially over time.

2.5.3.6 Business, Economics and Education

At first sight, it might be odd to put education into the same category as business and economics. Still, training, skills and development are all necessary prerequisites of a modern economy. Also, in the context of a chronological comparison, it is of interest to see whether any notable changes have taken place. This is also the section that appears to have the highest degree of LNGs that appear both with and and or. A first glance at Table 2.21 shows that, barring one notable exception, the or-LNG form is being far less used in recent times (as per BAWE). Most notable for this section is that it reflects clear changes in the importance of particular topics and wordings between the 1980s BNC-AC and early 2000s BAWE material. The one notable shift over time is from “goods or services” to “product(s) or service(s)”; “goods and services” has increased slightly in use over time, while the LNG “products and services” is the most-used phrase in the early 2000s. This provides some indication that the term goods is on its way out and products is generally adopted in its stead.


Table 2.21  Business and Teaching LNGs in BNC-AC and BAWE

LNG                        N BNC-AC   BNC-AC per million   N BAWE   BAWE per million
GOODS AND SERVICES         166        10.27                87       13.32
GOODS OR SERVICES          66         4.08                 11       1.68
PRODUCTS AND SERVICES      8          0.49                 102      15.62
PRODUCTS OR SERVICES       2          0.12                 23       3.52
PRODUCT OR SERVICE         0          0                    37       5.67
HEALTH AND SAFETY          115        7.12                 93       14.24
TRADE AND INDUSTRY         65         4.10                 17       2.6
TRADE OR BUSINESS          42         2.62                 1        0.15
COSTS AND BENEFITS         41         2.54                 34       5.21
SUPPLY AND DEMAND          35         2.17                 32       4.60
RICH AND POOR (noun)       30         1.86                 28       4.29
PROFIT AND LOSS            22         1.41                 60       9.19
PROFIT AND LOSS ACCOUNT    11         0.70                 41       6.28

TEACHING AND LEARNING      104        6.43                 5        0.77
EDUCATION AND TRAINING     94         5.82                 22       3.36
EDUCATION AND SKILLS       85         5.26                 11       1.68
TRAINING AND DEVELOPMENT   6          0.38                 46       7.04

Overall, the LNGs used for Business and Economics have become more prominent in their use from the time of the BNC-AC authors to the time of the BAWE texts. Most of the above have doubled in frequency; “profit and loss (account)” does occur over eight times more frequently.12 The exception is “trade or business”, which occurs in the form of “in the course of a/any trade or business” in BNC-AC 14 times but is almost non-existent in BAWE. A similar trend can be observed with regard to education-related terminology. Thus, “teaching and learning” is predominant in the BNC yet barely used anymore in the BAWE. Instead, references to learning or education are neatly subsumed by the LNG “training and development” in BAWE, a corpus where “training” is clearly preferred to the more general “education”.
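Statements such as “doubled in frequency” can be checked directly against Table 2.21 by dividing the BAWE rate by the BNC-AC rate. The sketch below does this for a selection of rows; the rates are copied from the table, while the function name and the choice of rows are illustrative assumptions rather than part of the original analysis.

```python
# Per-million rates copied from Table 2.21 as (BNC-AC, BAWE).
RATES = {
    "GOODS AND SERVICES": (10.27, 13.32),
    "HEALTH AND SAFETY": (7.12, 14.24),
    "COSTS AND BENEFITS": (2.54, 5.21),
    "SUPPLY AND DEMAND": (2.17, 4.60),
    "PROFIT AND LOSS": (1.41, 9.19),
    "PROFIT AND LOSS ACCOUNT": (0.70, 6.28),
    "TRADE OR BUSINESS": (2.62, 0.15),
    "TEACHING AND LEARNING": (6.43, 0.77),
}


def rate_ratio(older, newer):
    """Return newer/older, or None when the older rate is zero."""
    if older == 0:
        return None  # ratio undefined for forms absent from the older corpus
    return newer / older


ratios = {lng: rate_ratio(*pair) for lng, pair in RATES.items()}

# All selected rows have a non-zero BNC-AC rate, so sorting is safe here.
for lng, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lng:<26}{ratio:6.2f}")
```

Rows with a BNC-AC rate of zero (for example PRODUCT OR SERVICE) are left out of the selection because no ratio can be formed for them.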


2.5.4  Gendered Language

2.5.4.1 Introduction

It is a widely described feature in the usage of English language that it employs a male-first, female-second description mode. See, for example, Hardman (1999), Wright et al. (2005) and Hegarty et al. (2011). Motschenbacher (2013) and Schmid (2015) offer a corpus-based analysis. This language feature results in N-conjunct-N constructions like “Adam and Eve”, “Jack and Jill”, “Fred and Wilma”, “Mr and Mrs Smith” and “husband and wife”. Similarly, roles that are typical of females are highlighted by a phrase like “women and children”. Kesebir (2017) has shown (using gender order in particular) that this type of preference reflects that “audiences assign stronger relevance to a party when the party is mentioned first rather than second”. This psychology experiment can be interpreted as either a deeply ingrained patriarchal attitude that is reflected in language or, alternatively, that a change of the prototypical order (the play has the title “Romeo and Juliet” after all13) is to forefront the importance of the first-mentioned person. There is clear evidence of the absence of the “Juliet and Romeo” effect (Hegarty et al., 2011) in all text genres. Though some academic writers try to invert the masculine-before-feminine naming, these appear, in the data used, as clear exceptions. This section focusses on the attempts made by writers in their essays, theses and articles to present a heightened language awareness; in the case of LNGs, this means they are attempting to invert the prototypical phrase. This then would be an example of the “Juliet and Romeo” effect.

2.5.4.2 Gendered and Age-Referential LNGs in BNC-AC and BAWE

2.5.4.2.1  Gender

Academic writing, given the amount of discourse studies and gender studies published, should be a place where writers display their awareness of (unequal) relations.
One example is the use of references to “she” instead of “he”.14 Similarly, it could be expected that published academics strive for an equal gender balance. One would also expect that this behaviour becomes more pronounced in recent times.


For reasons of space, personal pronouns are not separately discussed here. Readers can find such an analysis in Motschenbacher (2013, 221 ff.) for the BNC-W corpus. Suffice to say that the male bias here is at its most prominent (see Appendix 4.7). Two things need to be said with regard to BAWE in comparison to the BNC data. For one, the former is smaller and this means that where occurrences in the BNC are recorded, these might not appear in the BAWE. (A full list of all these LNGs can be found in the appendix.) The second, related point is that certain topics only appear in the BNC academic data (“police men and women”) or are discussed more (“husband and wife”, “parents and children”). This, again, makes a side-by-side investigation difficult. That being said, there remain a large number of LNGs that provide sufficient material for salient findings.

“Men and women” is the most common LNG identifier, occurring nearly 32 times per million words in BNC-AC and nearly 37 times in BAWE. By contrast, “women and men” is roughly one quarter of these: less than 7.5 times per million words in both. The use of “women and men” appears to be, in almost all concordance lines and in both corpora, a deliberate choice to counter a male-preferential style. Still, most of the examples of the f-&-m structure reflect the topic (and traditional perceptions), as in “division of household work between women and men is beginning to change”, or reasons of stress, as in “I believe in gender equality. Women and men should share responsibilities”, in both corpora. Another point is that the dispersion (the number of writers actually using this structure) is quite low in BNC-AC: less than one-fourth. Most of these writers are in the social science field. This is in sharp contrast to BAWE, where the number of authors is half of the number of the concordance lines. Initially, the BAWE seems to show no greater gender language awareness than the BNC-AC.
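The two measures used in this comparison, the share of the reversed order and the dispersion across writers, can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for the corpus study: the function name, the (source_id, text) input format and the toy sentences are assumptions made here, and dispersion is taken as the number of distinct sources divided by the number of hits, in line with the description above.

```python
import re
from collections import Counter


def order_and_dispersion(lines, first, second, conj="and"):
    """Count a binomial and its reverse in (source_id, text) pairs."""
    patterns = {
        "forward": rf"\b{re.escape(first)}\s+{re.escape(conj)}\s+{re.escape(second)}\b",
        "reverse": rf"\b{re.escape(second)}\s+{re.escape(conj)}\s+{re.escape(first)}\b",
    }
    result = {}
    for name, pattern in patterns.items():
        regex = re.compile(pattern, re.IGNORECASE)
        per_source = Counter()
        for source_id, text in lines:
            hits = len(regex.findall(text))
            if hits:
                per_source[source_id] += hits
        total = sum(per_source.values())
        result[name] = {
            "hits": total,
            "sources": len(per_source),
            # distinct sources per hit; 1.0 means every hit has its own source
            "dispersion": len(per_source) / total if total else 0.0,
        }
    return result


# Invented toy data for illustration only; not taken from BNC-AC or BAWE.
toy_lines = [
    ("essay_01", "Men and women were interviewed. Women and men should share tasks."),
    ("essay_01", "Household work is divided between women and men."),
    ("essay_02", "Both men and women took part."),
    ("essay_03", "Policemen and women were surveyed."),
]

stats = order_and_dispersion(toy_lines, "men", "women")
print(stats)
```

The word-boundary anchors keep “policemen and women” from being counted as “men and women”, which matters for exactly the kind of gendered forms discussed in this section.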
The only Linked Noun Group (LNG) that is in balance is “boys and girls” / “girls and boys”. Looking at “men and women” / “male(s) and female(s)” however, the small amount of the reverse form seems to match the relative use found in BNC-AC. This is similar to Motschenbacher (2013, 223), who also recorded that the gender balance is very biased to male-first usage for man-woman / male-­ female(n) while the contrast is not so pronounced for girl-boy. This being said, progress can be found during a more detailed study. The trigrams “female and male” and “male and female” are typically linked

Fig. 2.7  Distribution (per million words) of f-and-m and m-and-f LNGs. Left: BNC-AC, right: BAWE [Bar chart; y-axis: occurrences per million words, 0 to 45; series: f-&-m and m-&-f; categories in each panel: men and women, males and females, male and female, boys and girls, husband and wife]

adjectives: “between male and female participants”. The use given in Fig. 2.7 refers to the trigram in LNG form only, which accounts for about one-fourth of the total usage. In these, the LNG “female and male” does not occur at all in the larger BNC sub-corpus, while it is recorded three times (about one-seventh of the use of “male and female”) in BAWE. BAWE also shows a shift to a more formal usage, where reference to male(s) and female(s) in LNGs is significantly higher. A further point of comparison is the use of gender-neutral terms. While “husband and wife” is fairly prominent in BNC-AC (nearly eleven occurrences per million words, while “wife and husband” reaches a mere 0.12), it occurs only 3.36 times per million words in BAWE. However, the non-gendered alternative, “spouses”, occurs about eight times per million words in both corpora. The clearest gender bias in BNC-AC appears to be the Linked Noun Group “policemen and women” (six per million), which has no reverse form. Yet the neutral form, “police officers”, is far more frequent: 12.2 per million in BNC-AC and 2.1 per million in BAWE. Looking at LNGs with the conjunct or, the total numbers are vanishingly small (see appendix). Many forms found in BNC-AC are not recorded in BAWE. Yet the occurrence patterns for the contrastive form need closer inspection as, here, the female-precedes-male structure is the preferred one, as shown in Fig. 2.8.
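The relationship between male-first and female-first orders discussed above can be made explicit with a small calculation. The following is a minimal sketch, not part of the original study: the helper name `reverse_share` is hypothetical, and the figures are the per-million frequencies listed in the appendix table on gendered LNG usage.

```python
# Per-million frequencies copied from the appendix table "Gendered LNG Usage".
# Key: (male-first form, female-first form); value: corpus -> (male-first, female-first).
PAIRS = {
    ("men and women", "women and men"): {"BNC-AC": (31.62, 7.49), "BAWE": (36.91, 7.35)},
    ("husband and wife", "wife and husband"): {"BNC-AC": (10.70, 0.12), "BAWE": (3.36, 0.0)},
    ("boys and girls", "girls and boys"): {"BNC-AC": (5.44, 1.55), "BAWE": (2.60, 2.45)},
    ("husband or wife", "wife or husband"): {"BNC-AC": (0.49, 1.41), "BAWE": (0.0, 0.0)},
}


def reverse_share(m_first, f_first):
    """Share of the female-first order among all uses of the pair (None if unattested)."""
    total = m_first + f_first
    return None if total == 0 else f_first / total


for (m_form, f_form), corpora in PAIRS.items():
    for corpus, (m, f) in corpora.items():
        share = reverse_share(m, f)
        label = "n/a" if share is None else f"{share:.0%}"
        print(f"{corpus:7} {f_form!r:22} share of pair: {label}")
```

Expressing the reverse form as a share of the whole pair, rather than as a raw rate, makes corpora of different sizes directly comparable and shows at a glance where the female-first order dominates.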

Fig. 2.8  BNC-AC gendered LNG with or. The total numbers here are vanishingly small: 0.12 per million equals two occurrences [Bar chart; y-axis: occurrences per million words, 0 to 2; series: m-&-f and f-&-m; bar values shown: 1.41, 0.49, 0.49, 0.19, 0.12, 0.12; labelled f-&-m bars: wife or husband, wives or husbands, women or men]

That “wife or husband” occurs more frequently in the BNC than in BAWE is explained by the most frequent trigram here. It is a formula from a legal text that runs as follows: “for benefit of the settlor or the wife or husband of the settlor in any circumstances”, which appears several times, and there are variations of this, too. Furthermore, there is “to incriminate that person or the wife or husband of that person of an offence”. This means, however, that, while “wife” precedes “husband”, the latter is the active part, either the settlor or the offender, while the wife appears to be the bystander. Similarly, as Fig. 2.9 demonstrates, most of the uses of “women or men” show a prosody for the female part that is (with the possible exception of concordance lines 2 and 8) negative. Lines 4, 5 and 7 seem to show women as victims. Lastly, concordance line 1 provides a logical link (females, feminists), but this is to mark a contrast to a dominant male actor. This matches the observation by Motschenbacher, who compared husband-wife to widow-widower occurrence patterns: “[t]he following traditional gender discourses may be retrieved from this distribution: men occupy the dominant position within marriage; women seem to be perceptually more salient when they are not yet or no longer married” (2013, 226). Overall, while there are some positive shifts towards a more gender-neutral balance on the surface, deeper investigation seems to indicate that it is context-dependent restraints that fully support the number of detailed


Fig. 2.9  All concordance lines of Women or Men in BNC-AC

test cases of lexically gendered word pairs by Kesebir (2017) and the conclusion that “word order is a function of and cue for relevance. … We saw that context could shift the order of references to the two genders but people also display a tendency to repeat predominant word orders” (2017, 276f.). This, some might say unfortunately, is also true for academic writing.

2.5.4.2.2  Age-Related Usage

This section provides a clear link to the above. Firstly, there is the formulaic sequence that prefers reference to females: “women and children” has no reverse form in either corpus. One can put this down to the fact that the idiom is fully fossilised. However, there are clearly moves to undermine the perception that the female is supposed to be the traditional caregiver. Secondly, there is also a move to avoid gender- and age-group-specific LNGs, as can be seen in Table 2.22. On the surface, BNC (ac-humanities) writers seem to be enlightened, talking of “fathers and children”. However, this turns out to be the title of the Turgenev novel Fathers and Children.15 Motschenbacher (2013, 207) records the following figures for binominals in the whole of the written BNC: “mother/child: 156; father/child: 22”. The findings presented here point towards the fact that the father/child figure could be lower still. This being said, both corpora avoid being gender-specific by speaking of “parents and children” far more often. Looking at the BNC-AC concordance lines for “children and parents”, it becomes clear that writers give the children a role as a responsible actor (“leaflets directed at children and parents” is an example), rather than let the parent(s) be dominant.

Table 2.22  LNGs that reference age-group-related pairs

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
MOTHERS AND CHILDREN | 18 | 1.14 | 1 | 0.15
FATHERS AND CHILDREN | 4 | 0.24 | 0 | 0
PARENTS AND CHILDREN | 101 | 6.25 | 8 | 1.42
CHILDREN AND PARENTS | 11 | 0.70 | 0 | 0
ADULTS AND CHILDREN | 32 | 1.98 | 9 | 1.33
CHILDREN AND ADULTS | 20 | 1.21 | 7 | 1.07
CHILDREN AND YOUNG ADULTS | 100 | 6.19 | 1 | 0.15
TEENAGERS AND YOUNG ADULTS | 0 | 0 | 5 | 0.75
CHILDREN AND YOUNG PEOPLE | 63 | 3.78 | 8 | 1.23

“Children and adults” appears nearly as often as the reverse form because these texts describe activities where young people are the focus, for example language development or computer games. This notwithstanding, there are a good number of concordance lines that appear to allow a reversal without creating an atypical-sounding passage. Finally, the last three rows of Table 2.22 show that LNGs are employed to describe people who are not legally adult. Unfortunately, the total numbers in BAWE are too low to claim that “young people” is a preferred form to “young adults”; however, the absence of “teenagers” in BNC-AC LNGs should be noted.

2.6   Conclusions: LNGs in Academic Written English

While there are a number of fairly frequent LNGs, the total number of those that are employed repeatedly is fairly low; likewise, their occurrence rate is rather low: few LNGs can be found more often than six times per million words in either corpus. Indeed, adjective groups (“social and political tendencies”) or ubiquitous adverb groups (“more and more” as well as “more or less”) tend to be yet more widely used. Last but not least, LNGs appear to be restricted to using the conjunct and and, to a far lesser degree, or, while all other linking elements are too infrequent to be of note. Bennett, surveying style guides for English academic writing, concludes that “the single most important factor to have emerged from this survey of style manuals is the remarkable degree of consistency that exists as


regards the general principles and main features of academic discourse in English” (2009, 52). Looking at the LNGs employed by published academics in the 1980s and in university academic writing of the early 2000s, Biber’s (2015) contention that, of all genres, academic writing is the most conservative still holds. The key differences are driven by context dependency; further differences show slight shifts in wording choices. Similarly, the word choices with regard to gender or age equality show only marginal changes. Here, as for the LNGs above, the findings of Mollin hold true:

Our findings here confirm previous studies in the conclusion that a clear hierarchy of constraints exists: semantic factors explain the preferred order of binomials to the greatest degree, (…) Analysing success rates of individual constraints and the resolution of clash cases, where different types of constraints predict different orders, has allowed for the suggestion of a clear constraint hierarchy which operates for binomials. (Mollin 2012, 102)

If anything, this section highlights that LNGs, in academic writing, often provide a rhetorical discourse marker. While a great number are specific to the subjects under discussion (and these provide the reason for a lot of the differences found where BNC-AC is compared to BAWE), the majority of LNGs are either opposites (“theory and practise”, “presence or absence”, “males or females”) or expansions (“science and technology” or “goods and services”). This also explains why a number of the frequently found LNGs turn out to be the titles of articles or books: such a rhetorical device is meant to be eye-catching and, at the same time, aims to provide information-rich indications about the content.

Appendices

The following lists highly frequent LNGs in (British) academic written texts that have not been discussed further above. These roughly mirror the structure in Sect. 2.3. First of all, Sects. 2.5.1 and 2.5.2 look at opposites and expansions that are found in various sub-corpora of academic writing. The following sections give highly frequent LNGs found in a number of academic fields. Finally, Sect. 4.7 shows LNGs that are highly frequent and also provides figures for reverse forms of the fixed binomials in order to show how writers of academic texts try to counter gendered (i.e. male-first) language.

Opposites and Expansions with –and–

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
Opposites
Policy and practise | 91 | 5.63 | 0 | 0
Input and output | 82 | 5.11 | 49 | 7.50
Strengths and weaknesses | 73 | 4.52 | 68 | 10.41
North and south | 67 | 4.15 | 40 | 6.13
Advice and assistance | 60 | 3.71 | 0 | 0
Advantages and disadvantages | 53 | 3.27 | 62 | 9.49
Similarities and differences | 23 | 1.42 | 34 | 5.21
Top and bottom (n) | 23 | 1.42 | 9 | 1.31
Black and white (n) | 18 | 1.13 | 6 | 0.92
Advice and support | 16 | 0.96 | 13 | 1.99
Positive and negative (n) | 9 | 0.50 | 2 | 0.31
Opportunities and threats | 1 | 0.05 | 37 | 5.67
Logical relation
Cause and effect | 93 | 5.75 | 54 | 8.27
Causes and reasons | 0 | 0 | 44 | 6.73
Technical
Formula and formula | 198 | 12.25 | 286 | 43.80
Food related
Food and drink | 16 | 0.99 | 33 | 5.1
Fruit/s and vegetables | 10 | 0.62 | 45 | 6.89
Food and beverage | 0 | 0 | 32 | 4.9
Expansions
More and more (a) | 393 | 24.3 | 181 | 27.7
Charges and expenses | 76 | 4.70 | 0 | 0
Recruitment and training | 10 | 0.62 | 3 | 0.46
Recruitment and selection | 3 | 0.19 | 91 | 13.94
Reliability and validity | 3 | 0.19 | 37 | 5.67
Information and consultation | 0 | 0 | 40 | 6.13

(a) These adverb phrases have been included because they are the most frequent “and” and “or” trigrams

Opposites and Expansions with –or–

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
More or less (a) | 659 | 40.8 | 101 | 15.5
Presence or absence | 143 | 8.85 | 11 | 1.68
Loss or damage | 59 | 3.65 | 0 | 0
One way or another | 54 | 3.34 | 14 | 2.14
True or false | 53 | 3.28 | 11 | 1.68
Positive or negative | 48 | 3.0 | 42 | 6.43
Right or wrong | 43 | 2.66 | 22 | 3.36
Success or failure | 30 | 1.86 | 16 | 2.45
Knowledge or skills | 4 | 0.21 | 5 | 0.766

(a) These adverb phrases have been included because they are the most frequent “and” and “or” trigrams

Law and Country

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
Landlord and tenant (act) | 107 (68) | 6.62 (3.9) | 0 | 0
Rank and file | 101 | 6.25 | 14 | 2.14
Policemen and women | 97 | 6.0 | 0 | 0
Crime and punishment | 95 | 5.88 | 2 | 0.31
Rights and duties | 91 | 5.63 | 12 | 1.84
Rights and obligations | 84 | 5.28 | 12 | 1.84
Retribution and deterrence | 71 | 4.33 | 0 | 0
Rules and regulations | 30 | 1.86 | 33 | 4.60
Loss or damage | 59 | 3.65 | 6 | 0.92
Town and country | 94 | 5.81 | 0 | 0
Scotland and Wales | 82 | 5.11 | 0 | 0
Local and national | 68 | 4.21 | 18 | 2.99
Towns and cities | 65 | 4.02 | 10 | 1.53
National and local | 59 | 3.65 | 14 | 2.14
Town and country planning | 54 | 3.34 | 0 | 0
Britain and the United States | 53 | 3.28 | 7 | 1.07
National and international | 53 | 3.28 | 28 | 4.51
Church and state | 51 | 3.14 | 4 | 0.66
Rural and urban | 50 | 3.09 | 11 | 1.68
England and France | 46 | 2.85 | 9 | 1.38
Britain and France | 26 | 1.61 | 9 | 1.38

Medicine and Science

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
Patients and methods | 78 | 4.83 | 0 | 0
Education and science | 78 | 4.83 | 2 | 0.31
Morbidity and mortality | 58 | 3.59 | 0 | 0
Ulcerative colitis and Crohn’s disease | 47 | 2.91 | 2 | 0.31
Mind and body | 20 | 1.21 | 45 | 6.89
Health and illness | 9 | 0.56 | 54 | 8.27
History and examination | 5 | 0.31 | 48 | 7.35
Mitochondria and chloroplasts | 0 | 0 | 57 | 8.73
Fertilisation and embryology | 0 | 0 | 33 | 4.60
Atoms or molecules | 3 | 0.19 | 10 | 1.53
Illness or disability | 2 | 0.10 | 12 | 1.84

Social Sciences

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
Words or behaviour | 49 | 3.03 | 0 | 0
Working class and middle class | 12 | 0.74 | 0 | 0

Business, Economics and Education

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
Opportunities and threats | 1 | 0.05 | 37 | 5.67
Hospitality and tourism | 0 | 0 | 36 | 5.48
Art and design | 93 | 5.7 | 0 | 0
Schools and colleges | 66 | 4.09 | 1 | 0.15
Language and literature | 63 | 3.90 | 3 | 0.45
The success or failure | 18 | 1.14 | 10 | 1.53
Increase or decrease | 16 | 0.96 | 20 | 3.06

Gendered LNG Usage; Reference to Children

LNG | N BNC-AC | BNC-AC per million | N BAWE | BAWE per million
Men and women | 511 | 31.62 | 241 | 36.91
Women and men | 121 | 7.49 | 48 | 7.35
Husband and wife | 173 | 10.70 | 22 | 3.36
Wife and husband | 2 | 0.12 | 0 | 0
Spouses | 90 | 8.42 | 52 | 8.01
Male and female (n) | 44 | 2.82 | 23 | 3.52
Female and male (n) | 20 | 1.24 | 3 | 0.46
Policemen and women | 97 | 6.0 | 0 | 0
Policewomen and men | 0 | 0 | 0 | 0
Police officers | 198 | 12.24 | 14 | 2.14
Males and females | 77 | 4.76 | 78 | 11.94
Females and males | 3 | 0.19 | 4 | 0.61
Boys and girls | 88 | 5.44 | 17 | 2.60
Girls and boys | 25 | 1.55 | 16 | 2.45
Women and children | 54 | 3.31 | 22 | 3.36
Men and children | 0 | 0 | 0 | 0
Husbands and wives | 21 | 1.36 | 7 | 1.07
Wives and husbands | 0 | 0 | 0 | 0
Mothers and children | 18 | 1.14 | 1 | 0.15
Fathers and children | 4 | 0.24 | 0 | 0
Husband or wife | 8 | 0.49 | 0 | 0
Wife or husband | 22 | 1.41 | 0 | 0
Husbands or wives | 2 | 0.12 | 0 | 0
Wives or husbands | 2 | 0.12 | 0 | 0
Male or female | 34 | 2.10 | 8 | 1.42
Female or male | 0 | 0 | 0 | 0
Men or women | 3 | 0.19 | 4 | 0.70
Women or men | 8 | 0.49 | 0 | 0
He and his | 59 | 3.70 | 0 | 0
She and her | 15 | 0.93 | 4 | 0.70
He and she | 4 | 0.24 | 0 | 0
She and he | 0 | 0 | 0 | 0
His or her | 714 | 44.17 | 90 | 13.78
Her or his | 11 | 0.71 | 3 | 0.46
He or she | 586 | 36.26 | 68 | 10.41
She or he | 1 | 0.06 | 0 | 0
Him or her | 129 | 7.98 | 11 | 1.68
Her or him | 3 | 0.19 | 0 | 0
Parents and children | 86 | 5.14 | 8 | 1.42
Children and parents | 11 | 0.70 | 0 | 0
Adults and children | 32 | 1.98 | 9 | 1.33
Children and adults | 20 | 1.21 | 7 | 1.07
Children and young adults | 100 | 6.19 | 1 | 0.15
Teenagers and young adults | 0 | 0 | 5 | 0.75
Children and young people | 63 | 3.78 | 8 | 1.23

Where terms are non-gendered, they have been highlighted in bold

Notes

1. There is also thousand and (406 occurrences); these mostly refer to calendar years.
2. Including the word time (146 occurrences, 18 of which are time or something).
3. Occurrences lower than ten have not been taken into account.
4. These numeral bigrams occur with the following frequencies: HUNDRED AND N = 241, TWO AND N = 169, ONE OR N = 138, TWO OR N = 78, ONCE OR N = 6, THREE OR N = 55.
5. These include a number of hedging devices that are formulaic, though not LNGs: Whether or not (N = 56), May or may not (N = 25), Less than or equal (N = 15).
6. There is, indeed, a number of academic articles that have theory and practise as part of their subtitle. Please note that practice would refer to the noun form in both British and American English.
7. This is well demonstrated by a discussion (in summer 2019) where the reviewer of Big and Small: A Cultural History of Extraordinary Bodies by Lynne Vallone, Marina Warner, criticises her use of the term “pigmies” instead of “San and Khoi” (LRB, Vol. 41 No. 11, 6 June 2019, p. 27f.). This sparked a debate with letters published over a number of the following issues.
8. Of the 1436 concordance lines of as with whole, every single one is the phrase “as a whole”; only a small fraction of these are not preceded by a noun or noun phrase in BNC-AC.
9. A full list of the most frequent such LNGs can be found in the Appendices.
10. The label is, in fact, used in more social science (16) than medicine (ten) files.
11. “Age and gender” is used by social science writers in eight out of ten cases.

12. The exception is “Trade And Industry”, which, in both cases, is in reference to the “Ministry of Trade and Industry”: a reference, as we have seen above, that can easily date.
13. One can, after all, imagine a play entitled “Gertrude”, where the Queen and not her son is the main character.
14. One recently published book, job interviews by (Routledge 2018), has case studies which reflect a 50/50 male–female divide. Where a fictional applicant is referred to, the author uses “she” instead of “he”. It must be noted that other gender identities, if used, were too few in number to be counted here.
15. The title depends on the translation here: the novel has also been published as Fathers and Sons.

References

Bennett, K. (2009). English Academic Style Manuals: A Survey. Journal of English for Academic Purposes, 8, 43–54.
Biber, D. (2015). When an Uptight Register Lets its Hair Down: The Historical Development of Grammatical Complexity Features in Specialist Academic Writing. University of Lancaster: Presentation, Corpus Linguistics 2015.
Coxhead, A. (1998). An academic word list (Vol. 18). School of Linguistics and Applied Language Studies. Victoria University of Wellington.
Coxhead, A., & Byrd, P. (2007). Preparing Writing Teachers to Teach the Vocabulary and Grammar of Academic Prose. Journal of Second Language Writing, 16, 129–147.
Evison, J., McCarthy, M., & O’Keeffe, A. (2007). ‘Looking Out for Love and All the Rest of It’: Vague Category Markers as Shared Social Space. In Vague Language Explored (pp. 138–157). Basingstoke: Palgrave Macmillan.
Hardman, M. J. (1999). Why We Should Say “Women and Men” Until It Doesn’t Matter Any More. Women and Language, 22(1), 1–3.
Hegarty, P., Watson, N., Fletcher, K., & McQueen, G. (2011). When Gentlemen are First and Ladies Last? Effects of Gender Stereotypes on the Order of Romantic Partners’ Names. British Journal of Social Psychology, 50, 21–35.
Hyland, K. (2002). Directives: Argument and Engagement in Academic Writing. Applied Linguistics, 23(2), 215–239.
Hyland, K., & Tse, P. (2007). Is there an “Academic Vocabulary”? TESOL Quarterly, 41(2), 235–253.
Jucker, A. H., Smith, S. W., & Lüdge, T. (2003). Interactive Aspects of Vagueness in Conversation. Journal of Pragmatics, 35(12), 1737–1769.
Kesebir, S. (2017). Word Order Denotes Relevance Differences: The Case of Conjoined Phrases with Lexical Gender. Journal of Personality and Social Psychology, 113(2), 262–279.
Mollin, S. (2012). Revisiting Binomial Order in English: Ordering Constraints and Reversibility. English Language and Linguistics, 16(1), 81–103.
Motschenbacher, H. (2013). Gentlemen Before Ladies? A Corpus-based Study of Conjunct Order in Personal Binomials. Journal of English Linguistics, 41(3), 212–242.
Pace-Sigge, M. (2013). Lexical Priming in Spoken English Usage. Basingstoke: Palgrave Macmillan.
Schmid, H. J. (2015). Does Gender-related Variation Still have an Effect, Even When Topic and (Almost) Everything Else is Controlled. Change of Paradigms–New Paradoxes: Recontextualizing Language and Linguistics, 31, 327.
Siyanova-Chanturia, A., Conklin, K., & van Heuven, W. J. B. (2011). Seeing a Phrase “Time and Again” Matters: The Role of Phrasal Frequency in the Processing of Multiword Sequences. Journal of Experimental Psychology: Learning, Memory and Cognition, 37(3), 776–784. https://doi.org/10.1037/a0022531.
Snow, C. E. (2010). Academic Language and the Challenge of Reading for Learning About Science. Science, 328, 450–452.
Wright, S. K., Hay, J., & Bent, T. (2005). Ladies First? Phonology, Frequency, and the Naming Conspiracy. Linguistics, 43(3), 531–561.

CHAPTER 3

LNGs in UK and US Poetry

© The Author(s) 2020. M. Pace-Sigge, Linked Noun Groups, https://doi.org/10.1007/978-3-030-53986-3_3

3.1   Introduction1

This chapter looks at a corpus of British and US poetry. It focusses on the main themes that surface, investigates their usage in depth and compares the words and sets of words found with their occurrence patterns in prose literature. The genre of poetry is often viewed as the least suitable when wanting to observe frequently recurring patterns of fixed sets of words. It is seen as a use of language that subverts every pattern (linguistically speaking) that it can. Thus, Thorne (2006, p. 2) describes these imaginary writings as follows: “[p]oetry is perhaps the most distinctive literary genre in terms of its (…) often unexpected approach to language and syntax”. The poet A.E. Housman said that “poetry defies rational definition” (cf. Hutchings, 2012, p. 1ff.). Yet this still does not mean that a poet can fully create lexical patterns from the ground up: while there might be idiosyncratic divergence from conventions seen as “normative” or “natural”, the poet still has to strive in her or his writings for a level of convergence with the language of the readership in order to fulfil a communicative need. As Louw (1993) has demonstrated, poets do not necessarily use uncommon words: they use words readers are familiar with, albeit in more frequent ways and with nesting patterns that are found to be less commonly occurring. John Sinclair (1966) and Geoff Leech (1969) already looked at the way language works in a piece of poetry. Following in their footsteps, linguists such as Louw (1993), Hoey (2007), O’Halloran (2007), McIntyre and


Walker (2010), and O’Halloran (2012) all undertook language analyses of a number of poems in which they showed how the comparison with corpus evidence makes visible colligations, collocations and prosodies that also exist in other text types. These investigations highlight that a search for high-frequency Linked Noun Groups should yield salient results for the text type of written poetry as well.

3.1.1  The Corpora

As a first step, existing corpora of poetry were reviewed. Details of this process can be found in Pace-Sigge (2019a, b). For this particular chapter, only the Gutenberg Poetry Corpus (GPC) created by the author has been employed: a collection of US and UK poetry from 1600 to the early 1900s, with the majority published in the nineteenth century. In order to reveal keywords and key concepts in the poetry corpora, Patterson’s (2014) Nineteenth-Century British prose fiction corpus (“19C” hereafter) is employed as a comparator corpus (see Table 3.1).

Table 3.1  Comparison of Poetry corpora

CORPUS | N FILES | N TOKENS
BNCPOETRY | 30 | 226,367
Gutenberg Poetry Corpus (Parrish, 2018) | n/a | 19,223,679 (a)
GPC | 168 | 4,203,894
19C | 100 | 13,933,715

(a) Number of tokens used for wordlist with the number of numerals (amounting to 2.96 million items) deducted

3.1.2  Instead of a Literature Review

A detailed review of the literature concerning linguistic and, in particular, corpus-based or corpus-assisted investigations of poetry can be found in Pace-Sigge (2019b). Relevant for the study at hand are the investigations by McIntyre and Walker (2010), who conducted a keyword study on Blake’s poetry, in particular by comparing and contrasting Songs of Innocence (1789) with Songs of Experience (1794). This study focusses on domains that have a number of words assigned to them, namely the HAPPY, VIOLENT/ANGRY and FEAR/SHOCK (sic) domains. Given the small size of the corpus, they were then able to trace back occurrences to individual poems and describe how the contrast between the two books was achieved. Lo (2008) and Fang, Lo and Chinn (2009) have gone one step further. On the basis of classical Chinese poetic texts, they developed the idea of an automated system that allows for a “structured imagery of poetic texts”. Their approach was found to be similar to the work done by me, though for a very different language and a different purpose. Using over 50,000 poems by nearly 3000 poets as their corpus, Lo (2008) segmented individual characters into meaningful word units (WUs) before the WUs were indexed according to their semantic class. These then became part of the keyword set employed in their 2009 research paper. For example, for “winter” they focus on every bigram, both meaningful clusters (e.g. harsh winter) and frequent chunks (e.g. winter spring).2 As a result, they present an ontology of salient themes found in classical Chinese poetry. Interestingly, the ontology is very similar to the one found by Pace-Sigge (2019a, 2019b) and shown in Table 3.2.

Table 3.2  Ontology of salient themes in the GPC

THEME | Most frequent words
WORLD | world, earth, sea, land, ground, hills, snow
SKY | sky, sun, air, wind, stars, moon, heaven
TIME | day, night, morn/ing, spring, winter, (dark), (old)
GOD | god, soul, glory, muse, heaven, hell, hope
LOVE | love, soul, heart, joy
NATURE | nature, tree, flower, bird, doe
DEATH | death, heaven, hell
SONG | song, sound
TREASURE | gold, silver

Table only shows the relevant nouns. Words in brackets indicate that these words might take a different word category and may, therefore, be of no relevance when looking at LNGs

3.1.3  Groundwork: Identifying Key Themes in the Gutenberg Poetry Corpus

The first step of this piece of research was to create wordlists of highly frequent words and sets of words (2–4 grams) for the GPC. WordSmith Tools (Scott, 2020) was employed in order to create the wordlists. Frequently occurring words in these two wordlists were initially highlighted as they seemed to point towards a number of salient themes that were shown to reoccur. As a next step, wordlists of single and bigram to


4-gram sets of words were created for the 19C corpus. Based on that, the GPC keywords in comparison to the 19C British fiction material were calculated. These lists were then compared, using keywords.3 After conducting these two moves, it became clear that there are a select number of salient themes that the poets have addressed in their writings. Thus, nine dominant themes (expressed through their salient lexical expressions) became apparent. These are given in the top row of Table 3.2. The columns for each row give the most frequent words of those semantic fields. The investigations in this chapter will focus on these words. A large number of topics, as shown in Table  3.2, are widely seen as prototypical themes dealt with in poetic texts.4 Amongst these, the theme of LOVE is most key in GPC compared to 19C. Table 3.2 is organised by the number of relevant keywords (which are nouns) found for each domain. As can be seen, the final domain, colour, employs adjectives typically and is given here only for reasons of completeness. It must be noted, however, that the two most frequent semantically coherent trigrams (which also include a noun) are “of the world” and “all the world” with 334 and 249 occurrences, respectively. Thus, they are below the 0.01% threshold within the total size of the corpus (79.5 and 59.3 times per million words). This indicates that, when compared with the figures found in Chaps. 2 or 4, total numbers are notably lower. The most frequent linked [and] noun group is outside the domains discussed in this chapter: “here and there”, occurring 241 times.5 In this chapter the most frequent of the discussed LNGs is, however, “day and night”, which can be found 44 times per million words (184 occurrences in the corpus). 
This confirms what has been described earlier: as poetic writings tend to be less bound by rules than all other text forms, repeated clusters of words remain relatively infrequent; they are, however, not fully avoidable by writers. This allows us to investigate their use in greater depth. The nine domains and the words associated with each of them provide the framework within which LNGs will be investigated in this chapter. While some of these words may be highly frequent, they might not appear in LNGs; others are more likely than not found together in the same trigram. In the following section, the discussion will start with the domain which provides the least exposure to LNGs and move towards the one with the most.


3.2   LNGs in the Domains Treasure, Song and Nature

3.2.1  Treasure

This is possibly the smallest domain within the corpus of poetry, with a focus on the word gold. This appears 1961 times in total (0.05% of the total corpus size), of which one-third have been identified as “gold” in the form category of nouns. It is typically an adjective, found in linked groups like “green and gold N” or “gold and purple N”. The most frequent Linked Noun Group (LNG) is “gold and silver”, occurring 15 times (three times used by Christina Rossetti). This is followed by eight occurrences of “gold and ivory” where, again, Christina Rossetti uses a treasure-related LNG: “If I one day may see / Its spices and cedars, / Its gold and ivory.” The stylistic doubling is used here effectively to highlight the domain of treasure. Thomas Hood makes liberal use of the exclamation “gold and gold”: “Gold! and gold! and gold without end! / He had gold to lay by, and gold to spend”, which he repeats six times in one poem. There are a further ten LNGs of “pearl and gold”, three of which are used by Robert Herrick. Similarly, A.C. Swinburne prefers the phrase “silver and gold” to its reverse form: he uses it three times within a total of seven occurrences in the full corpus. A minor use, to finish, is “gold nor silver”, which, as an LNG, occurs only once. The only times silver appears in an LNG without gold are in the two occurrences each of “silver and silk” and “silver and myrrh”. Given the high frequency of the term gold in the GPC, it is certainly a highly relevant lexical item in this genre. However, as far as LNGs are concerned, its noun form is rarely employed apart from “silver and gold”, which is also not untypical of nineteenth-century writings (see Chap. 4).

3.2.2  Song

The word song occurs 2509 times (0.06%) in GPC; it is typically found in trigrams like “Song of love/birds/praise/death/nature” et cetera.
However, despite the relatively high frequency of the word, fixed phrases are very rare. The most common binominal form links “dance” with song. Accordingly, there are five occurrences each of “song and dance” and “dance and song”, and three of “song or dance”.6 What seems to be prevalent here, however, is the linking of song with a number of related items.

Fig. 3.1  song LNG forms (occurrence N: right) [Diagram. N and song: dance 5, love 2, feast 2, light 2, shriek 1, prayer 1, revel 1, scent 1. Song and N: dance 5, light 3, silence 2, emotion 2, show 2, wine 2, smoke 1, speed 1, shout 1, smile 1]

Fig. 3.2  sound LNG forms (occurrence N: right) [Diagram of and/or links between sound and the nouns sight, silence, light and sense]

Thus, we find that both the colligational structure and the positive prosody are typical markers of how this item occurs in LNGs (Fig. 3.1). Looking at sound, the most usual pattern here is to link it with visual experience (in other words, an expansion) or, in a lower number of cases, an opposite. Sound tends to form frequent binominals in GPC only with a very restricted number of other nouns, as Fig. 3.2 shows. Stylistically more interesting are those uses of sound LNGs which are both single occurrence and otherwise unique. It turns out that it is A.C. Swinburne who uses this word to create evocative new binominal groups: “Meantime the purple inward of the house / Was softened with all grace of scent and sound / In ear and nostril perfecting my praise” or “Even all men’s eyes and ears / With fiery sound and tears” or “The star came out upon the east / With a great sound and sweet” or “full of sound and shadow”, to list just a few. All of these are examples of

3  LNGS IN UK AND US POETRY 

59

Fig. 3.3  bird LNG forms with beast Beast or- (3) -or Beast (6)

BIRD Beast and(8)

- and Beast (16)

how sound appears to be a word of particular appeal to this poet and how he creatively employs it to create a personal voice. 3.2.3  Nature The word nature occurs 1597 times in the GPC (0.04%): it is a highly frequent noun in the corpus and does appear in a number of LNGs, which will be discussed in detail below.7 By contrast, the terms flower or tree appear in prepositional phrases (“tree of life”, “tree of knowledge”) or similes (flower-like). In fact, they appear only in the single LNG “tree and flower”. This occurs five times.8 Flower also carries two marginal LNGs, which only occur in this particular order: “leaf and flower” (four occurrences) and “flower and fruit” (twice). Bird (1196 occurrences) is an interesting item as, in the GPC, it only occurs in LNGs in combination with “beast” (see Fig. 3.3). There are also “neither bird nor beast” (twice) and “fish nor beast nor bird” (once). These constructions appear to be used by a variety of poets, Milton being prominent amongst them, using “beast and bird” once; “bird and beast” three times and even once saying “bird with beast” in Paradise Lost. It should come as no surprise then that Joseph Barker (1848) counted over a dozen occurrences of “bird and beast” in the Bible which is clearly the source for this singular combination of two nouns. “Bird and beast”, appearing at below only four times per million words in GPC, is also the most frequent LNG in this subsection.


Fig. 3.4  -and Nature LNG forms
god and NATURE (9); art and NATURE (5); time and NATURE (5); life and NATURE (2)

Nature, like tree, appears in prepositional phrases like “voice of nature” or “law of nature” in GPC. Importantly, it forms the second part of a small group of LNGs, as shown in Fig. 3.4.

3.3   LNGs in the Domain of God

Many publications in the nineteenth century have a particular focus on religion and the Bible. It should therefore not come as a surprise that the GPC contains a notable number of items that reflect such content. This section will describe how poets used LNGs that fit in this tradition.

The word muse does not occur in Linked Noun Groups, and glory is more typically used for “glory to god”: its most frequent LNG is “love and glory”, which occurs only four times, with “peace and glory”, “family and glory” and “glory and joy” being found three times each. Nor are there many LNGs involving god: the deity usually appears in prepositional phrases or in address forms like “Lord our God”. The following three coordinated binomials occur more than five times:

1. God and man (26 occurrences: 6.2 times per million)
2. God and nature (eight occurrences)
3. God and love (six occurrences)
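The per-million figures quoted alongside raw counts follow from a simple normalisation. The following is a minimal sketch in Python, not the procedure used for this book: the corpus size of about 4.2 million words is an assumption, back-calculated from the figure this chapter reports for “day and night” (184 occurrences, 43.8 per million), and the name per_million is illustrative only.

```python
# ASSUMPTION: the GPC holds roughly 4.2 million words. This size is not stated
# in this section; it is inferred from "184 occurrences = 43.8 per million".
ASSUMED_CORPUS_WORDS = 4_200_000


def per_million(count: int, corpus_words: int = ASSUMED_CORPUS_WORDS) -> float:
    """Normalise a raw occurrence count to a rate per million words."""
    return count / corpus_words * 1_000_000


# Raw counts for the three God binomials listed above.
for phrase, n in [("God and man", 26), ("God and nature", 8), ("God and love", 6)]:
    print(f"{phrase}: {n} occurrences, {per_million(n):.1f} per million words")
```

Under that assumed corpus size, the 26 occurrences of “God and man” come out at the 6.2 per million reported in the list. A different corpus size would shift every normalised figure proportionally.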


“God and man”, it must be noted, quite often appears with negative prosody. For example, “foe” or “Satan” as well as “scorned”, “cursing”, “vile” or “hate” appear. Still, apart from being low in spread, the number of occurrences even for the most frequent God LNGs is rather low.

By contrast, the item hell is more interesting: while total numbers for LNGs are low, it appears with antonyms and in a tightly restricted set of combinations. Apart from the constructions shown in Fig. 3.5, there is also the ominous phrase “death and hell” (occurring 13 times) and the much less used formula “earth and hell” (five occurrences).

As can also be seen in Chap. 4, soul is one word that occurs in a number of LNGs. It must be noted that the LNGs recorded for nineteenth-century literature exactly mirror the ones found in the poetry corpus (Fig. 3.6). The most frequent LNG here is “body and soul”, occurring over 6.4 times per million words in the GPC.⁹

There is only one word that expands this particular domain: hope. This word could also be assigned to the domains of love or death. It has a particularly strong association with the divine world (“faith and hope”, “hope of heaven”) and, for this reason only, has been placed in this section.

Fig. 3.5  hell with heaven LNG forms
heaven and HELL (17); heaven or HELL (15); HELL and heaven (8)

Fig. 3.6  soul LNG forms
X and SOUL: BODY (27), HEART (15), MIND (5)
SOUL and X: BODY (17), SENSE (5)


Table 3.3  hope LNG oppositions and expansions

              HOPE -link- X        N      X -link- HOPE        N
opposition    HOPE and FEAR       24      FEAR and HOPE        5
              HOPE or FEAR        15
expansion     HOPE and TRUST       7      FAITH and HOPE      13
              HOPE and JOY         7      LOVE and HOPE       11
              HOPE and LOVE        6      JOY and HOPE         6
              HOPE and PEACE       6

As can be seen in Table 3.3, the most common form is “hope and fear”, occurring 5.7 times per million words. It is encountered five times in William Morris’ poems and three times in Christina Rossetti’s,¹⁰ while two poets (Swinburne and Henry Vaughan) use it twice each. It also occurs once in Lord Byron: “While, future hope and fear alike unknown, / I think with pleasure on the past alone”. It must be pointed out that a number of poets use both “hope and fear” and “hope or fear”. “Love and hope” does occur twice in Byron, yet Dante Rossetti uses it a full five times. This is in marked contrast to the phrase “faith and hope”, which is employed by different poets only once each. Lastly, it must be pointed out that, in all combinations, hope combines with only a small number of words which keep reoccurring with and, or or nor; hope is typically the first-positioned noun for opposition and, to a far lesser degree, the second-positioned noun for expansion.

3.4   LNGs in the Themes of Love and Death

As has been said above, poetry often appears to revolve around the themes of love and death. This section shall demonstrate to what degree this is true in the way that poets have employed linked noun binominal groups. Love is, indeed, the most frequent noun in GPC, occurring 10,046 times (0.24%). However, 5 of the 168 poetic texts in the GPC do not mention it at all. It might be seen as positive that death, by contrast, occurs only 3872 times (0.09%); it is not mentioned in 18 of the 168 files. It must also be noted that a number of items that correspond to this domain (like soul or hell) have already been discussed above. This means that this section looks at heart and joy as well as love and death only.


3.4.1   Heart LNGs

The use of heart in GPC LNGs mirrors, on a smaller scale, what is described for fiction in Chap. 4. Overall, heart is followed by other body parts (though some might be more abstract). Heart rarely appears as the second noun in these binominals: “head and heart” (six occurrences), “hand and heart” and “eyes and heart” (five occurrences each) are the only forms occurring.¹¹ Figure 3.7 shows how “heart and -” is typically expanded.

As can be seen, even the most frequent, “heart and brain”, is not particularly common, occurring about 5.2 times per million words. The more abstract “heart and soul” and “heart and mind” are less frequent. In contrast to fiction, where the abstracts (“heart and soul”, “heart and mind”) are the most frequent by far, GPC has “heart and brain” as far more frequent in both absolute and relative terms. It appears that Edward Doyle is the one who creates this imbalance: out of the 22 concordance lines of “heart and brain”, five are his, and another two each come from Robert Lowell and Elizabeth Browning. Given that Emerson and Whitman also use this particular phrase, it appears to be another US preference. All the other LNGs appear, however, to be single occurrences per poet.

Fig. 3.7  heart LNGs
HEART and -: BRAIN (22), SOUL / HAND (15), HEAD (13), MIND (7), SPIRIT / EYE (6), LOVE / DEATH (4)

3.4.2  Joy LNGs

The term joy forms two distinct Linked Noun Groups within the GPC: firstly, one of opposites and, secondly, one of expansions. While, by number of occurrences, the opposites are the more frequent, the expansions account for a wider range of co-terms. As can be seen in Fig. 3.8, the dominant form is joy [and/or] opposite, while “sorrow and joy” and “grief or joy” are a lot less frequent. It must be noted that Keats and Teasdale use “joy and pain” twice each, while Hardy and Rossetti employ the links and and or, and William Morris uses and, nor and or. This indicates that this formula is used by a restricted number of poets only, while “sorrow and joy” and “grief or joy” appear to be single occurrences per writer.

Fig. 3.8  joy LNG opposites
opposite -link- JOY: SORROW and JOY (6), GRIEF or JOY (3)
JOY and/or opposite: PAIN, GRIEF, SORROW

As has been pointed out, the expansion of the term provides fewer examples, but a wider range of them, as shown in Table 3.4. It is remarkable to see that the Linked Noun Group for this particular word also appears with a clear colligational difference: the overall tendency is that opposites follow joy, whereas expansions appear before joy. Furthermore, it is found to be used by a variety of poets. The one person to stand out is Walt Whitman: he appears to have created each of the “expansion and joy” constructs; in fact, “light/pride and joy” occurs twice. The only time there is “joy and” used by Whitman is “joy and pride”. Used twice by Whitman and once by Ralph Waldo Emerson, this seems to be a particularly nineteenth-century US poetry phrase.

Table 3.4  joy LNG expansions

JOY as final expansion       N      JOY as initial expansion      N
LOVE and JOY                 9      JOY and HOPE                  6
LIFE and JOY                 8      JOY and PEACE                 5
HOPE and JOY                 7      JOY and PLEASURE              5
YOUTH and JOY                5      JOY and PRIDE                 5
LIGHT and JOY                5
PEACE, PRIDE and JOY         3
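The positional tendency described for joy (opposites following the word, expansions preceding it) rests on counting where the node word stands in each binominal. A hypothetical sketch of such a tally is given below. It is not the tool used for this study: the function name position_tally is illustrative, the sample lines are invented rather than taken from the GPC, and the matching is purely lexical, so it does not check that both linked items are nouns.

```python
import re
from collections import Counter


def position_tally(lines, node, links=("and", "or", "nor")):
    """Count 'X link Y' sequences in which `node` is the first or second item."""
    tally = Counter()
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        # Slide a three-token window over the line so overlapping groups are seen.
        for first, link, second in zip(tokens, tokens[1:], tokens[2:]):
            if link not in links:
                continue
            if first == node:
                tally[("initial", link, second)] += 1
            elif second == node:
                tally[("final", link, first)] += 1
    return tally


# Invented sample lines, for illustration only (NOT corpus data).
sample_lines = [
    "joy and pain were mingled",
    "with light and joy she came",
    "his pride and joy",
    "in joy or grief",
]

for key, count in sorted(position_tally(sample_lines, "joy").items()):
    print(key, count)
```

A real study would work from part-of-speech tagged concordance lines, since a purely lexical window also matches sequences such as “joy and then”.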

3  LNGS IN UK AND US POETRY 

65

3.4.3  Death LNGs

As we have seen with joy above, both death and love are terms that appear with their opposites as well as displaying expansion when occurring in LNGs. Death itself does not appear in a great many different binominal noun constructions. In fact, death has a preference for opposites in these constructions, and, amongst all the LNGs available, one single opposite occurs far more often than all other linked binominals together, as Table 3.5 demonstrates.

What is notable is that there are two opposites to death: both “life” and “birth”. “Love” has been excluded as it is closer to an expansion (similar to death bringing about “change” or being the “end”). Twice there is the phrase “song of love and death”, and once it is certainly given equal status, in J.R. Lowell: “…had battled down the triple gloom / Of sorrow, love, and death”. As for “life and death”: it is the most frequent Linked Noun Group uncovered so far, occurring 16.4 times per million words. While widely used, it is a phrase typically used by Henley (4/69), Swinburne (10/69) and Whitman (9/69). Whitman, too, makes strong use of “life or death” (6/27). Otherwise, there appears to be no single concurrent scheme of use amongst the poets for these LNGs, though Shelley presents a topical threefold combination: “The gradual paths of an aspiring change: / For birth and life and death, and that strange state”.

Table 3.5  death LNG oppositions and expansions

              DEATH final           N      DEATH initial         N
opposition    LIFE and DEATH       69      DEATH and LIFE       12
              LIFE or DEATH        27
              BIRTH and DEATH      10      DEATH and BIRTH       8
expansion     LOVE and DEATH       12      DEATH and HELL       13
              CHANGE and DEATH      6      DEATH and DEATH       8
              END and DEATH         5

3.4.4   Love LNGs

Love is one of the most frequent words in the GPC, and significantly more frequent than in nineteenth-century prose (19C), occurring 2391 times per million words. When it comes to linked binominals, however, single forms are rarely repeated to a high degree, as can be seen below.

Table 3.6  love LNGs with and expansion

- AND LOVE            N    p/mio      LOVE AND -                       N    p/mio
life                 26     6.2       truth                           16     3.8
love                 14     3.3       hope, life                      12     2.9
faith                12     2.9       peace                           10     2.4
truth                10     2.4       joy                              9     2.1
home, peace           9     2.1       fame, faith, laughter            8     1.9
god, joy, hope        8     1.9       beauty                           7     1.7
beauty                5     1.2       pleasure                         6     1.4
                                      friendship, fortune, wisdom      5     1.2

Table 3.6¹² shows that even the most frequent Linked Noun Group (LNG), “life and love”, is rather rare, appearing just over six times per million words. It is widely used, being found in Keats (three times), Dunbar, Spenser, Tennyson as well as Whitman (all twice). “Love and love” is the kind of repetition that usually runs over two lines, as in Ella Wilcox: “Too short for spite, but long enough for love. / And love lives on forever and forever”. Alternatively, it is found, as in Byron or in Shakespeare’s Sonnet 136, in two consecutive clauses in a single line: “Make but my name thy love, and love that still”. Looking at the most frequent LNG with love as the initial element, we can see that “love and truth” is clearly preferred by Burns or Lowell (it occurs three times) while also being used by Lewis Carroll, Thomas Gray or Helen Keller, amongst others. As we have seen above, poets who use one form seem to be averse to using its reverse form. Thus, “love and life” is found to be used by Browning, Shelley, Swinburne and Wilde, none of whom uses its reverse form. The one exception is Paul L. Dunbar, who says both, albeit in different poems:

That is full of love and life in every line
Of life and love and jealous hate!

While “life and love” and “love and love” can be seen as fixed phrases, the colligation is, for the term love, probably the most interesting part: there is a wider range of LNGs starting with love than of those having the word as the second part. Furthermore, there is a clear tendency for LNGs ending in “-and love” to be positive expansions. However, “love and/or-” is the one binominal construction that has fairly frequent terms that are juxtaposed to love, as Table 3.7 shows. There is, however, a certain degree of mutability: both “life and love” and “love and life” are found; the same is true for “truth”, “peace” and “fame” as well as “death”, “hate” and “pity”. Beyond and and or as linking elements, there are a small number of LNGs with nor: “neither love nor hate” (seen in Frost, Graves, Rossetti and Tennyson) appears five times; there is also “neither joy nor love” (seen in Arnold and Milton), which occurs twice.

Table 3.7  love LNGs with opposition

LOVE OR -        N    p/mio      - OR LOVE      N    p/mio
hate            10     2.4       hate           5     1.2
pity             5     1.2

LOVE AND -       N    p/mio
death, hate     12     2.9
fear             9     2.1
pain, pity       7     1.7
grief            6     1.4

3.5   LNGs in the Domain of Time

As we have seen above, there is a wide range of terms that are indicators of time: these can either be interpreted as consecutive (spring follows winter, morning follows night) or as oppositions (where winter symbolizes death or sleep and spring awakening or new life). There is, unfortunately, not enough space in this book to discuss the finer details of their usage. However, poets do make use of these terms within a number of linked binominals, as will be shown below.

3.5.1  Spring and Winter; Morning and Evening LNGs

While the actual words are definitely key when the poetry corpus is compared to prose fiction of a similar time period, there are only a small number of select binominals that occur with these terms.


3.5.1.1 Spring and Winter

These seasons are encountered a lot less frequently than references to the full day (see Sect. 3.5.2). It is also clear that poets have a preference to talk about spring, which occurs 1701 times (0.04% = 405 times per million words), almost twice as often as they mention winter, which occurs only 886 times (0.02% = 211 times per million). Neither season seems open to use with binominals that reoccur. The only exceptions are “summer and winter” (five occurrences) as well as “spring or fall”, which appears twice.

3.5.1.2 Morning and Evening

This subsection is probably the most underwhelming. With evening appearing 694 times (0.02%) and morning 1355 times (0.03%; morn occurs a further 840 times), these nouns are comparatively infrequent. As LNGs, they only appear more than once with and. As such, the following have been found (number of occurrences in brackets):

• night and morning (eight)
• evening and morning (four)
• morn’ and evening (six)
• morning and evening (five)

These are merely given to be on the record. It must be said that “night and morning” is used twice by both Yeats and Emerson; “morn and evening” is used twice by Robert Herrick, while he never uses “morning and evening”, something found in concordance lines of five different poets.

3.5.2  Day and Night LNGs

Unlike the terms above, day and night are counted amongst the most prominent nouns in the GPC. Day appears 7974 times (0.19% = 1875 times per million words) and night 6129 times (0.14% = 1459 times per million words). It seems to be reasonable, therefore, to expect these in a number of binominals, too. Expectations of how items are being used do not, however, correlate with mere frequencies. There are typical genitive forms that are descriptive, for example, “summer’s day”, “light of day”, “break of day” or “dead of night”. However, when it comes to Linked Noun Group constructions, day almost exclusively co-occurs with its opposite, night.


Table 3.8  day LNGs

DAY AND -       N     p/mio      - AND DAY      N     p/mio
night         184     43.8       night        140     33.3
hour            8      1.9

DAY OR -        N     p/mio      - OR DAY       N     p/mio
night          27      6.4       night         21      5.0
two            17      4.0

As Table 3.8¹³ shows, day forms LNGs in either direction, with both and and or, almost exclusively with night within the poetry corpus. There is also the phrase “a day or two” which, in this count format, occurs 14 times. The clear exception is “day and hour”, used by seven different poets (twice by Ella Wheeler Wilcox), for example here: “Yet the day and hour advances when in fright you all flee before it”. Beyond that, poets like Arnold, Emerson, Service or Tennyson seem to prefer “night and day”, while “day and night” is a format found often repeated by Allingham, Arnold, Barnes, Bridges, Crabbe, Lowell, Milton, Rossetti, Spenser, Swinburne and, in particular, Whitman (who employs it over 30 times). The high frequency of both forms seems to leave it as an option for any poet to use either form, though a poet typically uses one form more often than its reverse.

Finally, it must be pointed out that the word night does also occur in a small number of LNGs that do not link it to day. There is “noon an’ / and night”, used by eight different poets once each; “night and sleep” occurs seven times, two of which come from Emily Dickinson, who wrote, “I trudge the day away,—/ Half glad when it is night and sleep”. Finally, “love and night” occurs just four times (by different poets) and is probably most beautifully employed by Swinburne: “Baby’s eyes / Laugh to watch it rise, / Answering light with love and night with noon.”
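How strongly one ordering of a reversible binominal dominates its reverse can be expressed as a simple share. The sketch below is illustrative only: the counts are the ones reported in this chapter for three reversible pairs, while the function name dominance and the choice of a share (rather than, say, a ratio) are assumptions made for the example.

```python
# Counts as reported in this chapter for each ordering of three reversible pairs.
PAIRS = {
    "day / night": (("day and night", 184), ("night and day", 140)),
    "life / death": (("life and death", 69), ("death and life", 12)),
    "bird / beast": (("bird and beast", 16), ("beast and bird", 8)),
}


def dominance(count_a: int, count_b: int) -> float:
    """Share of all occurrences taken by the more frequent ordering."""
    return max(count_a, count_b) / (count_a + count_b)


for label, ((form_a, n_a), (form_b, n_b)) in PAIRS.items():
    preferred = form_a if n_a >= n_b else form_b
    share = dominance(n_a, n_b)
    print(f"{label}: '{preferred}' accounts for {share:.0%} of {n_a + n_b} occurrences")
```

On these counts, “day and night” is only mildly preferred over “night and day”, whereas “life and death” clearly dominates its reverse form.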

3.6   LNGs in the Domain of World

The domain world stands out as the one area where GPC keywords are frequent, yet do not necessarily all appear in Linked Noun Groups. Here we have no repeated binominals with either ground or world, and only very few for hills or snow. The prominent LNGs are “land and sea” and, even more so, “heaven and earth” and its variants.

3.6.1  Snow and Hills LNGs

Snow appears in this section mainly because it is used by two poets: Coleridge, who uses “land of mist and snow” in The Rime of the Ancient Mariner, and Herrick, who uses “frost and snow” in four of eight recorded instances¹⁴ and “snow and sleet” in two of his poems. Herrick also uses “hail and snow”, which occurs four more times in the work of different poets. This is all that can be found on snow extensions.

Oppositions rather than expansions appear to be more typical for hills LNGs. While there is a single expansion, “hills and woods” (five occurrences by different poets), hills as a geographical feature are typically juxtaposed, as Fig. 3.9 shows. “Hills and valleys” appears in eight different poems, as do the others, yet “hills and dales” occurs twice in Burns’ work. “Hills and waters” is not in the graphic as it is used only once by Swinburne, yet Whittier employs this a full six times.

Fig. 3.9  hills and its opposite LNGs
HILLS and valleys (8); HILLS and dales (7); HILLS and hollows (5)

3.6.2  Land and Sea LNGs

The number of combinations that make up the binominals for these terms is highly restricted. The two words form the core for a set of binominals; all other binominals with either land or sea are opposites. Figure 3.10 displays the most common form, “land and sea”, which occurs nearly nine times per million words; its reverse form, “sea and land”, occurs nearly six times. Notably, “land and sea” is used by only 21 poets, four of whom use it twice, and two three times. It is almost idiosyncratic for Spenser (who uses it four times) and Whitman (six times). Whitman uses “sea and land” only once. It is a phrase typical of Milton (four times, three of which are in Paradise Lost) and Emerson (six times). Emerson and Hood also employ “land or sea” twice, and Whitman and Emerson are the two poets who have put all three variants into their poems.


Fig. 3.10  land with sea LNGs
LAND and SEA (37); SEA and LAND (25); LAND or SEA (15)

Fig. 3.11  sea LNGs (occurrence N in brackets)
X and SEA: EARTH (18), WIND (6), SUN (3)
SEA and X: SKY (12), SHORE (8), AIR (3)

It is the term sea that also forms binominals with words other than land,¹⁵ though these are still clear opposites, as shown in Fig. 3.11. “Earth and sea” is a clear near-synonym for “land and sea”. As such, it is used by 14 poets; Brown, Low and Whitman use it twice, and Swinburne uses it three times. Looking at the colligation, “land and sea” tends to be at the end of a clause, while “earth and sea” mostly appears mid-clause. More importantly, perhaps, “land and sea” is connected with active verbs like fight, govern, travel, whereas “earth and sea” tends to be more abstract and often collocates with nouns like sky or air. Finally, “sea and sky/shore/air” are LNGs that occur only once in the corpus per poet.

3.6.3  Earth and its Juxtaposition LNGs

Earth is the one item that only ever appears with opposites and, as such, has one single associate that appears more often than all other binominals with earth put together: heaven. These particular LNGs have been discussed earlier (in Sect. 3.3). This section will therefore focus on all those occurrences where earth is linked to words other than heaven.

Table 3.9  earth LNGs

EARTH AND -      N    p/mio
SKY             29     6.9
AIR             21     5.0
WATER            8     1.9
SKIES            7     1.7
OCEAN            6     1.4

Table 3.9 can be divided into two groups that are juxtaposed to earth: the opposite to landmass (“sea”, see Fig. 3.11, “ocean” or “water”) and the atmosphere (“sky”, “skies” or “air”), with the latter being far more frequent for earth Linked Noun Groups. The most frequent phrase, “earth and sky”, seems to be preferred by certain US poets (Bryant, Emerson and Whitman use it three times each); otherwise British poets employ it just once. This is in stark contrast to “earth and skies”, which is used by an Australian, a US and a number of British poets. The other option, “earth and air”, is used twice each by Pope and Bryant. Significantly, it is found three times in the collection of poetry by the Brontë sisters, including this extended form: “And called my willing soul away, / From earth, and air, and sky”. This seems to be echoed by D.H. Lawrence, who uses another binominal: “The sky and earth and water and live things everywhere”.

It must be pointed out that twice “earth and water” refers to “watering plants”, as in “To get the seeds of wild flowers, and to plant them / With earth and water, on the stumps of trees” (Coleridge). It can also be used as a reference to lakes, as in George Crabbe, D.H. Lawrence or, indeed, Pope: “Here hills and vales, the woodland and the plain, / Here earth and water seem to strive again;” For Shakespeare, Spenser and Swinburne, “earth and water” is a reference to the elements: “Ayre hated earth, and water hated fyre.” (Spenser)¹⁶

3.7   LNGs in the Domain of Sky

First of all, an oddity: the first three terms in this domain are three-letter words, followed by three or four letters; the more biblical term heaven stands out. There are clear overlaps here (the sun is one of the stars and is a counterpoint to moon; air and wind are related). Furthermore, there may be particular reasons why one of the terms appears in an initial position in a binominal (see Pace-Sigge (2019b) for a deeper analysis of this). Similar to what we have seen above, there are relatively few recurring LNGs here, and different terms have a different level of variety. What stands out for this particular domain are LNGs that have a tendency to include the definite article “the”.

3.7.1  Air and Wind LNGs

Although they are near-synonyms, these two items are used with entirely different co-nouns in GPC binominals. It is the colligational structure that is, however, broadly similar: both air and wind appear with an expansion-type LNG. Unlike in Sect. 3.6.1, the items in this sub-section stand out for the variation of nouns they combine with. Furthermore, while a typical form can have occurrences of the reverse format, the latter will be extremely rare.

3.7.1.1 Air Binominals

As Fig. 3.12 demonstrates, the most commonly occurring reference is to the ancient classification of elements, that is, “earth and air”. It is noteworthy, too, that “sun and air” is semantically close to “light and air”, and that air tends to be the end-positioned noun in these LNGs.¹⁷,¹⁸ “Earth and air” appears three times in poems by the Brontë sisters and twice by William Bryant; all other uses are single occurrences per poet. Without exception the phrase is clause (line) final. This is quite different for the expression “light and air” as in “[i]t seems to me that every thing in the light and air ought to be happy”: three of the six occurrences are, like the quote above, found in Whitman’s Leaves of Grass. Whitman also uses “sun and air”, but just once, unlike J.G. Whittier, who accounts for four of its seven occurrences. With the other examples coming from J.R. Lowell and Christina Rossetti, it seems overwhelmingly a turn of phrase found in US poetry.

Fig. 3.12  air LNG expansions
Earth and AIR (21); Sun and AIR (7); Light and AIR (6); Earth or AIR (6); AIR and Light (5)

3.7.1.2 Wind Binominals

Wind, unlike air, tends to be front-focussed in the majority of cases. The only major alternative is “rain and wind”, which appears 2.5 times less often than “wind and rain” (Fig. 3.13). As with “earth and air”, “wind and rain” appears less than five times in a million words. The phrase typically (though not always) is clause-final; three times it ends a stanza in Newbolt’s poem Waggon Hill. Otherwise, it is found times in Burns’ and three times in Hardy’s poetry. Similarly, “rain and wind” turns up four times in the poetry: twice in Melancholy and in two further poems. “Sea and wind” is employed once by Edmund Waller, who seems to prefer “sea or wind”: all three occurrences are his. A parallel to this is Swinburne, who composed poems including “wind and sea” five times, four of them, unsurprisingly, in The Armada, with Newbolt’s Admirals All supplying the final example.

Another interesting point is that “sea and wind” covers the same semantic field as “wind and tide” or “wind and wave”. “Wind and wave”, like “sun and air”, is a typical Whittier phrase: “Which wind and wave on wild Genesareth heard” is one of the three examples by him. “Wind and tide”, finally, is employed by ten different poets: “Of distant blues, where water and sky divide, / Urging their engines against wind and tide” (Robert Bridges). With the exception of Waller, none of them appears to use any of the other LNGs above. Finally, it must be noted that the chunk “the wind and the” is the most popular construction in this segment, occurring 31 times (7.4 times per million words).
As a result, there are extended phrases with definite articles that form Linked Noun Groups like “wind and the sun” (six occurrences), “wind and the sea” (four occurrences) or “wind and the rain”. Similarly, there are “the sun and wind” (five occurrences) and “the rain and wind” (three occurrences).

Fig. 3.13  wind LNG expansions
X and WIND: RAIN (7), SEA (3)
X or WIND: SEA (3)
WIND and X: RAIN (21), TIDE (10), WAVE (7), SEA (6)

3.7.2  Sky and Heaven LNGs

While sky and heaven could be employed synonymously, this is absolutely not the case for LNG constructions in the GPC. Typically, they appear with two distinct expansions (for sky) and two distinct opposites (for heaven), as can be seen below.

As can be seen in Fig. 3.14, poets tend to end-focus on sky. As such, “earth and sky” is ten times as frequent as the reverse form, and “sea and sky” still appears slightly more often than the alternative. As (within the same clause) “sky and stars” is the only form appearing, it appears as if the writers start with what is physically nearer to them and put the more distant reference in second position. As above, a different set of poets uses “sea and sky” as opposed to “sky and sea”. Also, the phrase occurs only once per poet. The same holds true for the three variations “sea and sky”, “sky and sea” and “sea and ocean”. Lastly, “earth and sky”, occurring 6.9 times per million words, occurs four times in Whitman’s and three times in Bryant’s poems, and twice each in Dunbar and Kipling.

The usage pattern found for heaven is very different. First of all, heaven is front-focussed in binominals. We also find the domain’s most frequent LNGs here: “heaven and earth” (16.7 times per million words)¹⁹ and “earth and heaven” (7.6 times). While this presents one obvious juxtaposition (the godly paradise vs. worldly existence), the secondary antonym is found in “heaven and hell” (occurring four times in a million words). Unlike all other items, there is also a clearly drawn opposition presented: “heaven or hell” as well as “earth or heaven” (see Fig. 3.15). “Earth and heaven” is used three times by Arnold and Rossetti and twice by Lowell; otherwise it occurs once per poet. By contrast, “heaven and earth” is almost clichéd in its repeated use: Browning and Waller use it six times each, and Dryden, Lowell and Milton five times.
A number of poets draw on it twice or three times in various poems. This is in direct contrast to “heaven and/or hell”, which is employed once only by a variety of poets.

Fig. 3.14  sky LNG expansions
X and SKY: EARTH (29), SEA (12)
SKY and X: SEA (9), OCEAN (4), EARTH (3), STARS (3)

Fig. 3.15  heaven LNG opposites
X and HEAVEN: EARTH (32), HELL (7)
X or HEAVEN: EARTH (10)
HEAVEN and X: EARTH (70), HELL (17), ALL (6)
HEAVEN or X: HELL (14)

3.7.3  Moon, Sun and Stars LNGs

These three items are in this one subsection as they tend to combine into frequently occurring LNGs; after all, these are lights in the firmament which stand in obvious relation to each other. As with the word wind (see Sect. 3.7.1.2), extended LNGs appear here in use as well. As can be seen in Fig. 3.16, the most common near-collocates in binominal form for sun are either the earth’s natural satellite, the moon, or, alternatively, other suns. There is also a clear hierarchy in what comes first (see Table 3.10): sun before moon, and moon before stars. This is fully in line with Moon’s observation: “[we] seem to observe the ‘me-first’ or ‘towards speaker’ orientation [in binomials]” (Moon, 1998, 154). In fact, there is even a triple-noun LNG that demonstrates this order: “sun and moon and stars”, which occurs three times in GPC, used by Southey, Yeats and Whitman.²⁰

Fig. 3.16  moon, stars and sun interrelated LNGs
SUN and MOON; SUN and STARS; MOON and STARS

There are also those LNGs which are extended forms including definite articles:


Table 3.10  sun, moon and stars LNG as extension

SUN AND -        N    p/mio      - AND SUN       N    p/mio
MOON            40     9.5       MOON            9     2.1
STARS           14     3.3       RAIN            7     1.7
                                 WIND            4     <1

MOON AND -       N    p/mio      - AND MOON      N    p/mio
STARS           19     4.5       STARS           5     1.2

Table 3.11  sun as weather marker

SUN AND -        N      p/mio
RAIN            14      3.3
SHOWER/S         8/7    1.9/1.7
WIND / AIR       7      1.7

• the sun and moon (16 occurrences)
• the moon and stars (seven occurrences)
• the sun and the moon (six occurrences)
• the moon and the stars (three occurrences)

Only “the sun and moon” (four occurrences per million words) is used more than once by a single poet: three times by Christina Rossetti, twice each by Emerson and Whitman.

As Table 3.11 shows, sun is also a marker of weather: it denotes the short version of “sunshine”, a word that does not occur within LNGs apart from single occurrences. “Sun and rain” is used twice by Swinburne and Whitman. The reference to heavy rain appears in Whittier once as “sun and shower” and twice as “sun and showers”. The latter also occurs thrice in Herrick, who does not use the singular form. While “sun and wind” has not been used by the same poet twice, “sun and air” is a metaphor preferred by Whittier, who uses it four times in Anti-Slavery Poems. There is only one occurrence where the conditions of living in the open are reflected, in the lines “To take potluck beneath the sky / With sun and moon and wind and rain”. This super-extended LNG, referencing both the firmament and weather conditions, is in the long poem The Hare by Wilfrid Wilson Gibson.


3.8   Conclusions

Poetry is an interesting showcase for the function and use of Linked Noun Group (LNG) constructions. Before the results above are reviewed, I may be briefly allowed to make a direct comparison with the most frequent binominals in fiction. While the most frequent Linked Noun Groups in nineteenth-century fiction occur more frequently, these (as in BNC-Fiction and Academic English material) are merely count nouns (two or three, day or two). In the GPC, however, these LNGs are less frequent: “two or three”, for example, is almost six times more likely to be encountered by a reader of 19C fiction than by readers of the poetry corpus material. Instead, the prominent constructions are of high lexical value (as can be seen in Table 3.12). This ties in with the fact that, unlike in fiction, where the most frequently occurring forms are N-[or]-N, such forms are notably less frequent in the GPC than N-[and]-N. There is only one phrase that provides overlap: “men and women”, which is significantly more frequent in GPC than in either fiction corpus reviewed in this book.

Table 3.12  The most frequent LNGs found in GPC

LNG with and        N      p/mio
DAY AND NIGHT       184    43.8
NIGHT AND DAY       170    40.5
MEN AND WOMEN        99    23.6
EAST AND WEST        60    14.2
HEAVEN AND EARTH     70    16.7
LIFE AND DEATH       69    16.4

LNG with or         N      p/mio
TWO OR THREE         48    11.4
MAN OR WOMAN         33     7.8
DAY OR NIGHT         27     6.4
NIGHT OR DAY         21     5.0
LIFE OR DEATH        27     6.4
HOPE OR FEAR         15     3.6
NORTH OR SOUTH       14     3.3

Phrases in italics are highly frequent in their use amongst a number of poets. Details are below.

In Table 3.12, it can be observed that most of the highly frequent LNGs are to be found in the domains discussed. The exceptions are also peculiar because they are not only very frequent but are also very often used by the same poets. The chief culprit
here is Walt Whitman: his works make up less than 4% of the corpus, yet he has produced roughly half of all occurrences of such phrases. In detail, these phrases are “men and women”, 59/99;[21] “East and West”, 10/60;[22] “Man or Woman”, 20/33; “North or South”, 10/14. While “two or three” has a wider spread (21 poets use it 44 times), Whitman (again) employs it four times, though Hartley is yet more keen: it occurs seven times in his Yorkshire Ditties.

This chapter has been exceptional, however, because the selection of LNGs is not determined by mere frequency but by theme. As observed above, poetry is mainly concerned with a number of key themes. Each domain is represented through a number of key nouns that are significantly more prominent in the GPC than in the 19C fiction corpus. While the most highly frequent clusters are also typical of a number of domains, four important findings stand out:

One: There are a number of key items in the GPC that show a distinct dis-preference for forming recurrent LNGs. Examples are world and the domains of Song and Nature. Whatever binominals have been uncovered here are very unlikely to be found oft-repeated.

Two: Where there are N-[and]-N or N-[or]-N constructions with recurrence figures higher than one (meaning the binominal phrase appears at least once per million words), they are more likely to be overused by a small number of poets if the total number is quite high.

Three: If a Linked Noun Group appears in a reverse form (e.g. “heaven and hell” / “hell and heaven”), one of these tends to be prototypical and occurs far more frequently than its counterpart. It is important to note that only on very rare occasions does the same poet use both forms: either one or the other is chosen and often used more than once.[23]

Four: This final conclusion is characteristic of poetic material: there are simply not that many LNGs that are formulaic.
Keywords that are found with a relatively high number of Linked Noun Groups tend to have only one or two binominals with relatively high frequencies, and the other binominals are a lot lower in their numbers. Crucially, this does not mean that these items that are key in the GPC are not employed in this particular grammatical format: more often than not there are some visible LNGs, and these are visible because of their repetition. At the same time, below the line of visibility (occurring once or twice) these keywords are found in a large number of novel LNGs that are idiosyncratic to one poet’s output. At times the impression is given that, instead of a formulaic well-known phrase, a poet has chosen a second noun that is either semantically
or phonetically close to the expected item. This is perfectly demonstrated by the Linked Noun Group that is the title of the first volume of poetry by E.E. Cummings: Tulips and Chimneys (1923). While “tulips and roses/daffodils” is an expected binominal, as would be “fireplaces and chimneys”, the modernist poet created a unique combination.

A further important point that has to be made is how poetry, as a text genre, is consumed. Unlike readers of a thriller (a “page-turner”), readers of poetry tend to dip in and out of volumes; this results in a relatively low to average word count encountered in a single sitting. This also means that even “high-frequency” LNGs like “day and night” are encountered only infrequently. In fact, a random reading of pieces by poets in the corpus provides examples of how a reader might encounter these. A look at Christina Rossetti’s poem Eve presents just one Linked Noun Group, “sorrow and sin”, which appears only one other time in the GPC, in Ella Wheeler Wilcox’s poem Here and Now.[24] In other words: LNGs do exist and are used by different people, but they are exceedingly rare.

The other finding of this chapter was that keywords are used by a number of poets, and poets do use the grammatical form of an LNG, yet the resulting LNGs are idiosyncratic. Looking at Swinburne’s Hymn to Proserpine, there are a few LNGs, two of which may serve as examples here: “racks and rods” and “vapour and storm”. The latter has one GPC keyword, namely storm. However, the resulting LNG occurs only once, and in this it is like “racks and rods”. This means that not only is Swinburne the poet who created this LNG,[25] he also employs it (at least in the poems of his in the GPC) only a single time.

At the start of this chapter, I quoted various sources that describe the language of poetry as highly edited and thus devoid of typical idiomatic phrases.
This was countered by the view that all poems have to have a communicative value: thus, they cannot be free of known lexical and grammatical constructs. Based on the findings presented, a synthesis can be made at the end of this chapter: poets do, indeed, create works that have a very individual voice; yet no poet can write without making use of one or another frequent binominal. Some poets, in fact, are prime users of certain LNGs. Furthermore, there is a certain gradient in the structure of usage: some binominals described here are typical of US poets; some poets (Whitman is the person who keeps reappearing) make great use of formulaic LNGs, while amongst others (Byron, for example) we rarely encounter them. It shall be left to literary critics to decide whether overuse of well-worn LNGs is a marker of hastiness, carelessness or lack of imagination in the writer, or whether this particular style reflects the use of an authentic voice, accessibility or an expression of freedom.
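The statement in these conclusions that Whitman accounts for roughly half of the occurrences of certain phrases can be re-derived from the fractions quoted in the text. The short Python sketch below is only an arithmetic check; it assumes that “such phrases” refers to the four phrases for which Whitman’s share is given, and the variable names are invented for illustration.

```python
# (occurrences in Whitman, occurrences in the whole GPC), as quoted in the text.
whitman_counts = {
    "men and women": (59, 99),
    "East and West": (10, 60),
    "Man or Woman": (20, 33),
    "North or South": (10, 14),
}


def share(part, whole):
    """Proportion of `whole` contributed by `part`."""
    return part / whole


for phrase, (by_whitman, total) in whitman_counts.items():
    print(f"{phrase}: {by_whitman}/{total} = {share(by_whitman, total):.0%}")

# Assumption: "all such phrases" means these four phrases taken together.
whitman_total = sum(w for w, _ in whitman_counts.values())
overall_total = sum(t for _, t in whitman_counts.values())
print(f"combined: {whitman_total}/{overall_total} = "
      f"{share(whitman_total, overall_total):.0%}")
```

Under this reading, the combined figure is 99 out of 206 occurrences, that is 48%, which is in line with “roughly half”.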

Notes

1. Parts of the introduction as well as Tables 3.1 and 3.2 have been taken from earlier work by the author (Pace-Sigge 2019a, 2019b).
2. As a further step, in order to be more detailed, these sets of words were subdivided into four categories, namely primary, complex, extended and textual. These roughly translate as keywords, bi-grams, longer n-grams and phrases with strong collocational strength.
3. Keywords in WS are calculated as described here: https://lexically.net/downloads/version5/HTML/index.html?keywords_calculate_info.htm (last accessed 08/May/2019).
4. Answers from an anecdotal mini-survey. Answers to the question “what are the themes typical of poetry?”
   Respondent (1) Romance. Number one. That could be unrequited love, falling in love. Anything to do with love. Death is another popular one. And I’d say lastly maybe nature, like scenery.
   Respondent (2) Life and its components.
   Respondent (3) Ancient stuff has lots of war, magic, religion, journeys.
   Respondent (4) Love and death.
   Respondent (5) Anything and everything in the world that we are concerned about that the human condition deals unable to understand/properly express/control sufficiently – poetry is an attempt to breach this gulf – in some sense, sometimes it appears to do so when the poetry is truly sublime.
   Respondent (6) Faith (Donne, Herbert, Eliot, Hopkins, Rossetti, Cowper...), war (Tennyson, Homer, Scott...), legends and fantasy (Byron, Spenser, Coleridge, Emily Bronte, Tennyson again...), nature (Wordsworth, Gay, Hughes, Hopkins (again), Keats...), communities (Chaucer, Crabbe, Goldsmith...), love (as already noted), lust (Ovid, Rochester, Dryden, Betjeman, Sappho...), moral/social/emotional reflection (Blake, Larkin, Plath, Yeats...). Nothing is closed off to poetry except the trite and the expected, and even these can be subverted or even celebrated.
5. The most frequent or LNG is “two or three”: 48 occurrences.
6. There is also one occurrence each of “dance/tale/concerto/hymn/speech/prose or song”.
7. Doe does not appear in any LNGs.
8. Twice used by the British poet R. Bridges (1844–1930).
9. This is, in relative terms, twice as frequent as “body and soul” is in 19C.
10. She also uses “fear and hope” once.
11. All of these appear to have been used once only by different poets in the corpus.
12. No occurrences below five are taken into consideration here. There are examples like “love and night” or “love and sorrow” (four times) or “love or grace/death/sleep” (three times).
13. For purposes of simplification, both the forms an’ and and have been counted together in these LNGs.
14. Two more uses by John Clare, one each by Coleridge and Bryant.
15. There are six examples of “house” and “land”, which seem to have fallen out of use. They only appear amongst older works: Dryden, Keats, Pope (twice) and anon (twice).
16. Amongst all concordance lines, this is the only example where this is a chunk and not an LNG as such. However, it is the best demonstration of the words as elements.
17. There are also “air and earth” and “air and music”, both occurring just twice, as well as one use each of “air and/but/or/nor fire”. These are, again, expansions.
18. Note, too, that there is “sun and air” as opposed to “wind and sun” (four occurrences), which highlights the divergence in positioning.
19. This LNG occurs 19 times as “heav’n and earth”. This has not been taken into account here, as 17 of these lines are in Milton’s Paradise Lost.
20. Walt Whitman extends this phrase even further: “oh sun and moon and all your stars” as well as “Sun and moon and countless stars above”.
21. Only 23 different poets use this phrase. It is often used repeatedly: apart from Whitman, by Elizabeth Browning (five occurrences), Christina Rossetti and Sara Teasdale (four occurrences each).
22. Used seven times by Thomas Hood, four times each by Kipling, Emerson and Lowell.
23. “Night and Day” / “Day and Night” is a clear exception here, probably because of the very high rate of occurrences. Yet, even here the majority of poets employ either one or the other format. Furthermore, it has to be noted that Hoey (2005, Chapter 5) describes how a similar division between ‘majority usage / minority usage’ can be seen with polysemous items.
24. “Sorrow and Sin” occurs only once in the 19C corpus.
25. Swinburne’s phrases were also checked for occurrence in the 19C and in the BNC-Fiction, BNC-Drama and BNC-Poetry subfolders: they are not recorded.

References

Barker, J. (1848). A Review of the Bible; Containing Remarks on the Scripture History of Creation, etc. London: J. Chapman.
Carter, R. (2004). Language and Creativity. London: Routledge.
Fang, A. C., Lo, F., & Chinn, K. C. (2009). Adapting NLP and Corpus Analysis Techniques to Structured Imagery Analysis in Classical Chinese Poetry. Association for Computational Linguistics. Proceedings of the Workshop on Adaptation of Language Resources and Technology to New Domains, 27–34.
Hoey, M. (2005). Lexical Priming: A New Theory of Words and Language. London: Routledge.
Hoey, M. (2007). Lexical Priming and Literary Creativity. In M. Hoey, M. Mahlberg, M. Stubbs, & W. Teubert (Eds.), Text, Discourse and Corpora: Theory and Analysis (pp. 7–30). London: Continuum.
Hutchings, W. (2012). Living Poetry. Houndmills: Palgrave Macmillan.
Leech, G. N. (1969). A Linguistic Guide to English Poetry. London: Longman.
Lo, F. (2008, July 3–6). The Research of Building a Semantic Category System Based on the Language Characteristic of Chinese Poetry. In Proceedings of the 9th Cross-Strait Symposium on Library Information Science, Wuhan University, China.
Louw, B. (1993). Irony in the Text or Insincerity in the Writer? The Diagnostic Potential of Semantic Prosodies. Text and Technology: In Honour of John Sinclair, 240, 251.
McIntyre, D., & Walker, B. (2010). How Can Corpora be Used to Explore the Language of Poetry and Drama? In A. O’Keeffe & M. McCarthy (Eds.), The Routledge Handbook of Corpus Linguistics (pp. 544–558). London: Routledge.
Moon, R. (1998). Fixed Expressions and Idioms in English. Oxford: Clarendon Press.
O’Halloran, K. (2007). Corpus-assisted Literary Evaluation. Corpora, 2(1), 33–63.
O’Halloran, K. (2012). Performance stylistics: Deleuze and Guattari, Poetry and (Corpus) Linguistics. IJES, 12(2), 171–199.
Olsen, A. H. (1986). Oral-Formulaic Research in Old English Studies: I. Oral Tradition, 1(3), 548–606.
Pace-Sigge, M. T. (2019a, September). Typical Phraseological Units in Poetic Texts. In International Conference on Computational and Corpus-Based Phraseology (pp. 330–344). Cham: Springer.
Pace-Sigge, M. T. (2015). The Function and Use of TO and OF in Multi-word Units. Basingstoke: Palgrave Macmillan.
Pace-Sigge, M. T. (2019b). A Case Study on Some Frequent Concepts in Works of Poetry. Journal of Research Design and Statistics in Linguistics and Communication Science, 5(1–2), 123–152.
Parrish, A. (2018). A Gutenberg Poetry Corpus. Retrieved November 9, 2018, from https://github.com/aparrish/gutenberg-poetry-corpus.
Patterson, K. J. (2014). The Analysis of Metaphor: To What Extent Can the Theory of Lexical Priming Help Our Understanding of Metaphor Usage and Comprehension? Journal of Psycholinguistic Research. Retrieved May 8, 2019, from http://link.springer.com/article/10.1007/s10936-014-9343-1#page-1.
Rayson, P. (2016). Log-likelihood and Effect Size Calculator. Excel spreadsheet. Retrieved December 8, 2018, from http://ucrel.lancs.ac.uk/llwizard.html.
Scott, M. (2020). WordSmith Tools 8. Retrieved January 16, 2020, from www.lexically.net.
Sinclair, J. (1966). Taking a Poem to Pieces. In R. Fowler (Ed.), Essays on Style and Language (pp. 68–81). London: Routledge and Kegan Paul.
Thorne, S. (2006). Mastering Poetry. Houndmills: Red Globe Press.

CHAPTER 4

LNGs in Nineteenth- and Twentieth-Century British Fiction

4.1   Introduction

This forms a companion chapter to Chap. 3. The discussion will be based on prominent LNGs found in a dedicated full-text corpus of nineteenth-century literature (Patterson 2014) and the twentieth-century fiction sub-corpus of the British National Corpus (BNC). For both corpora (nineteenth and twentieth centuries) the most frequent as well as the most significant Linked Noun Group (LNG) clusters will be extracted and their individual uses examined. A key aim of this chapter is to directly compare and contrast the use of key elements found in both corpora and to indicate developments where there has been a modification or visible change in the patterns used overall in fiction of these time periods.

This chapter will also include a section that looks at two sets of writing from nineteenth-century Britain in particular, namely Dickens’ novels and the English translations of Marx’s writings. This particular case study will indicate how far Dickens’ idiosyncratic use of LNGs differs from that of other writers in Victorian Britain; the study of frequent LNGs in Marx’s writings provides a link to the discussion of contemporary academic writing (Chap. 2). While Charles Dickens uses a number of noun groups that are typically found in other Victorian literature, it might come as a surprise that there are instances of LNGs that are salient in their use both in English translations of Marx’s philosophical writings and in Dickens’ fiction.

© The Author(s) 2020 M. Pace-Sigge, Linked Noun Groups, https://doi.org/10.1007/978-3-030-53986-3_4


4.1.1  Corpora Used

For this particular chapter, a variety of sources will be used. The baseline for fiction corpora is the BNC-written prose fiction sub-corpus (in the following, BNC). This corpus represents British fiction of the late twentieth century. The comparator corpus is the 19C British fiction corpus (in the following, 19C), consisting of full-text fiction writing collected from all of the nineteenth century. This corpus was created by Katie J. Patterson (2014). Inspiration for the 19C corpus came from the Dickens corpus (in the following, DC), created by Mahlberg (2007) and consisting of novels and longer fiction by the Victorian author. Finally, there is the corpus of Marx’s writings (in the following, MC) compiled by Pace-Sigge (2018). This seems to be, on the face of it, an odd choice: Marx’s writings were research based and would, these days, more likely be classified as “academic writing”. Furthermore, these are translations of the original.[1] However, the distinction into different genres was far less prominent at a time when both scientific writing and the novel still showed their roots in an epistolary style. In the late nineteenth century, the idea of specialisms only started to take root and the educated reader was expected to read, and understand, widely (cf. Biber and Conrad 2000).

A number of connecting links have been investigated for their use in LNGs. For example, the conjunct yet does not seem to appear in LNGs in the fiction corpora. Similarly, as only yields chunks of text that appear of little consequence for this investigation. There are frequent instances of her as she or face as she in the BNC. Yet this only indicates a stronger concern with female protagonists in the BNC, given that 19C most frequently records him as he. These pronoun-led chunks, however, reveal little of consequence for the current investigation. As a result, the fiction corpora LNG research will focus on the three conjuncts in Table 4.2.
Table 4.1  Corpora used for Chap. 4

Corpus                        N tokens      N files
BNC-W-fict-prose (BNC)        16,819,078    432
19C-British fiction (19C)     13,933,715    100
Dickens Corpus (DC)            4,534,669     23
Marx’s Writings (MC)           1,387,229     13

Table 4.2  Frequency and relative frequency for the links AND, BUT and OR

Corpus / conjunct      AND N      AND p/mio    BUT N     BUT p/mio    OR N      OR p/mio
BNC-W-fict-prose       423,916    25,203       96,763    5,750        36,406    2,164
19C-British fiction    457,455    32,840       97,425    6,994        40,834    2,931
Dickens Corpus         162,978    35,977       24,346    5,374        14,920    3,294
Marx’s Writings         31,015    22,312        6,909    4,971         7,147    5,142

Table 4.2 already hints at structural differences: all conjuncts are more frequent in use in 19C compared to BNC; only for or usage is the
frequency a lot higher in the Marx Corpus. But is a frequent link, yet it is prominent almost exclusively in adverb/adjective binomials (“all but sunny”, “more but less”). The LNGs with but are, this notwithstanding, a clear stylistic marker.

4.1.2  Method

The initial move was to check the most frequent conjuncts for their use in Linked Noun Groups. For this, L1 and R1 collocates for the items (see Sect. 4.1.1) were investigated in their respective concordance lines. Where frequently reoccurring LNGs that were meaningful clusters were detected, these conjuncts (and, but and or, as it turned out) were selected. In order to be better able to focus on LNGs with these items, the concordance lines (with around 160 characters per line) for each of these items were saved as a separate file. These files were then checked for the most frequent clusters, using both the wordlist cluster and the concordance cluster tools available in WordSmith 7 (Scott, 2019). A further step involved a search for nouns directly preceding or following the linking conjunct. This search was facilitated by the pattern function in WordSmith 7, which makes it possible to find LNGs that might have lower occurrence frequencies but which should still be included in this particular research. In order to provide a direct comparison of the occurrence patterns between the two corpora targeted (first BNC vs. 19C, and then DC vs. MC), LNGs were tested to see whether differences are statistically significant (using the log-likelihood calculator; Rayson, 2016). To do so, LNGs that occur relatively frequently in one of the two corpora and are also found in the other corpus were selected. Then, both the relative
frequencies and the actual usage patterns (as found in the respective sets of concordance lines) are discussed side by side. At times, in order to highlight how different LNGs are from other linked groups (in particular, linked adjective groups or linked adverb groups), examples of these are presented as well.

4.1.3  The Idea of Investigating Literature

Biber and Conrad point out:

“From a style perspective, eighteenth-century novels are also similar to modern novels in many of their typical lexical and grammatical characteristics. It is somewhat difficult to specify what a “typical” modern novel is, because there is considerable experimentation with a wide range of linguistic styles.” (Biber and Conrad, 2000: 147)

The latter half appears to be true for a lot of imaginative writing, be it prose or poetry. However, where a diachronic shift can be found in literature, corpus linguistic methods provide a good basis to outline broad trends. Looking at something as specific as LNGs will, furthermore, highlight both continuities and changes in the focus of producers of literature. Indeed, Biber and Conrad (2000) have undertaken a detailed comparison and point out that word choice and spelling are the main differences and that sentence length and complexity have declined over time. In particular, “the most important differences from modern novels involves the syntactic complexity of noun phrases. In eighteenth-century novels, noun phrases tend to have many modifiers, especially relative clauses” (Biber and Conrad 2000, 154). This stands in sharp contrast with twentieth-century novels, which may not use relative clauses or use other, non-nominal forms: “Instead of complex noun modification, modern novels tend to employ simpler syntax with more verbs and simple clauses. Descriptive details are often given in adverbials rather than being embedded inside noun phrases” (Biber and Conrad 2000, 155). The reader will find that the LNGs discussed below substantiate this particular claim along a further dimension. Not only do noun phrases display a tendency towards decreasing complexity, Linked Noun Groups also appear to display a greater paucity of variation for certain nouns. In other words, twentieth-century
fiction not only presents different noun groups to the reader, but also gives a more restricted use for certain nouns.[2]

This chapter does not, and cannot, attempt to be an exercise in literary criticism. It does not focus on individual works, writers or types of fiction. It does, however, attempt to make provision for an empirical approach to literary studies. The potential importance of this is highlighted by Siegfried J. Schmidt:

“[I]nterpretation will remain what each literary scholar declares it to be, either by definition and explication or by daily praxis. Experience tells me that every attempt to introduce a rational and explicit theory into literary scholarship as well as any demand to legitimate scholarly work in this domain is immediately refused by a majority of scholars as normative and imperialistic.” (Schmidt, 2000: 623)

While it is debatable whether all scholars would raise such an objection, this book’s research does not intend to “set a norm”: the book’s intention is to reflect what is typical (“natural”) usage, as reflected in highly frequent recurrent (and highly dispersed) patterns of noun usage. Nor can I claim a greater (“imperialistic”) authority. Schmidt, again, states: “… all observations are observer dependent, and whatever is said is said by an observer to an observer” (ibid., 625). Consequently, the choice of my material (British English imaginative writing), the methods chosen and the patterns identified are particular to me. As can be seen in Marchi and Taylor (2009) or the volume by Baker and Egbert (2016), the concept of triangulation can minimise observer bias, yet even colleagues who are very close in their aims and approaches obtain different results. That being said, unlike mere idiosyncratic “interpretation” of literary works, an empirical approach as shown here does, at the very minimum, provide firmly grounded data to support the views put forward.

The final section of this chapter moves away from general corpora (reflecting literary works of a specific time) to an investigation of two specific corpora, namely looking at stylistic elements in the writings of Charles Dickens and the translations of Karl Marx’s work. This section aims to give stylistic insights but also to compare and contrast the choice of LNGs used by these two writers in a wider context.
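As a worked illustration of the relative frequencies used throughout Sect. 4.1, the per-million (p/mio) values for or in Table 4.2 can be recomputed from the token counts in Table 4.1. The following Python sketch assumes the usual normalisation (raw count divided by corpus size, multiplied by one million); the function name per_million is invented here, and small deviations from the printed values are to be expected, since the rounding conventions and exact token counts used for the table are not stated.

```python
# Token counts from Table 4.1 and raw frequencies of "or" from Table 4.2.
corpus_tokens = {
    "BNC": 16_819_078,
    "19C": 13_933_715,
    "DC": 4_534_669,
    "MC": 1_387_229,
}
or_counts = {"BNC": 36_406, "19C": 40_834, "DC": 14_920, "MC": 7_147}

# Per-million values as printed in Table 4.2, kept for comparison only.
printed_per_million = {"BNC": 2_164, "19C": 2_931, "DC": 3_294, "MC": 5_142}


def per_million(count, tokens):
    """Normalise a raw count to occurrences per million tokens."""
    return count / tokens * 1_000_000


for corpus, tokens in corpus_tokens.items():
    recomputed = per_million(or_counts[corpus], tokens)
    print(f"{corpus}: recomputed {recomputed:.0f}, "
          f"printed {printed_per_million[corpus]}")
```

With these inputs the recomputed figures stay within one per cent of the printed ones, and the ranking (MC highest, BNC lowest) is unchanged.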


4.2   Thematic LNGs in British Fiction

4.2.1  Introduction

This chapter shall focus on LNGs in 19C and BNC with the conjuncts but, or as well as and. This particular order has been chosen because of the gradient of complexity it offers. There are only a few highly key LNGs with but, and these shall therefore give an initial indication as to how 19C and BNC fiction material differ. Or provides, for the material at hand, the richest seam of highly frequent LNGs. We have seen in Chap. 2 how LNGs are an instrument to pointedly present opposites; LNGs with or in these sub-corpora seem to fulfil similar functions. Finally, LNGs with and are discussed. These are the most intricate. There are more examples with fairly low relative frequencies, yet these can be found to fit into a particular ontology of usage. Unlike LNGs with or, those employing and as link appear to be typically used to describe an expansion (though opposites, for example “night and day”, are also quite visible).

4.2.2  LNGs with but in Fiction

For this usage pattern, the use of pronouns stands out. Thus, there is NP-[but]-Pronoun as well as Pronoun-[but]-NP. Linked [but] Noun Groups tend to be very low in frequency and the only re-occurring patterns of any frequency use the vague pronoun nothing.[3] There are the constructions nothing but a+NP (160 occurrences in 19C and 132 occurrences in BNC) and nothing but the+NP (180 occurrences in 19C and 98 in BNC; see Fig. 4.1).

Fig. 4.1  nothing -[but]- a/the N in 19C and BNC
[Figure: the node “nothing but” with its most frequent continuations: the truth (19C / BNC) 11; a nuisance (BNC) 5; a pair (BNC) 4; a heap (19C) 3]


As can be seen, within multimillion-word corpora these 4-grams are very rare. This must be seen in the light of the most frequent trigrams (“out of the”, for example, has 3604 occurrences in 19C). Within the fewer than 1,000,000 instances of but, even the most frequent, “nothing but the truth”, appears only in 0.01% of all uses. And this is for a fairly commonly understood formula. The next-most frequent form is the doubly negative “nothing but a nuisance” (BNC) or the descriptive and mostly negative “nothing but a heap (of blackened ruins / debris)” (19C). Furthermore, there are the uses of nothing -[but]- N (Fig. 4.2), which present an interesting insight into chronological change and fairly good evidence of semantic narrowing.

Fig. 4.2  nothing -[but]- N in 19C (top) and BNC (bottom)
[Figure, 19C: truth / death 7; misery / misfortune / confusion / trouble 5; contempt 5; sorrow / lies / ashes / revenge / blackness 3; love / beer / pleasure / kindness 3]
[Figure, BNC: trouble 15; misery / darkness 4; lies / work 3]

First of all, given that but is more frequent in 19C, Fig. 4.2 shows that there used to be fewer repeated LNGs in 19C; yet 19C had a wider range
of LNGs with “nothing but”. The change that is immediately noted is the contraction of prosodic meaning, that is, in BNC, this Linked Noun Group is always negative (doubly so, apart from, some might say, where “work” is mentioned).[4] Yet, while a few nouns (trouble, lies and also darkness/blackness) appear in both corpora, those words that reflect a positive prosody in 19C (truth, love, beer, pleasure, kindness) find no equivalent in BNC. As pointed out above, the sum total of available evidence is exceedingly small. It can be seen, however, as positive evidence of de-lexicalisation over time for this particular LNG structure. Nothing further can be said with regard to N-[but]-N constructions. The next section will look at those LNGs that show the most frequent trigrams.

4.2.2.1 LNGs with or: Antonymic Pairs

These particular LNGs are the ones which are the most prominent information carriers. They are relatively infrequent and reflect particular interests prevalent at the time of writing. As has been noted before, linked ADJ groups (“good or evil”, “dead or alive”) are more prevalent with or. What Table 4.3 demonstrates will be seen again below (in Sect. 4.2.2.3). Overall, such LNGs are used with far higher frequency in 19C. It is noteworthy, however, that these trigrams mostly also occur in BNC, though they tend to be quite rare. This delineates therefore even more strongly the concerns and the turns of phrase typical of a time period.

Table 4.3  The most frequent or LNGs with opposites

LNG                 N 19C    19C per million    N BNC    BNC per million
MAN OR WOMAN        67       4.8                32       1.9
LIFE OR DEATH       38       2.7                10       0.6
HAND OR FOOT        20       1.4                 0       0
DAY OR NIGHT        18       1.3                12       0.7
FATHER OR MOTHER    17       1.2                 3       0.2
NIGHT OR DAY        16       1.2                 6       0.4
MEN OR WOMEN        12       0.9                 2       0.1
HEAD OR TAIL        10       0.7                 8       0.5
TEA OR COFFEE        5       0.3                34       2.0

The antonymic pairs most divergent in 19C compared to BNC for this construction refer to the sexes: “man or woman”[5] or “father or mother”. By contrast, the social ritual of drinking a hot beverage is no longer
reduced to having a cup of tea: a waiter will now ask “tea or coffee” in the BNC. Lastly, “day or night” / “night or day” is slightly more frequent in 19C. This is often a metaphorical shortcut for “all the time”.

4.2.2.2 LNGs with or: Numerals and Times

The most frequent type of or LNGs all refer to numerals or times. As such, they would be seen as what the Longman Dictionary of Contemporary English (LDOCE) terms “uncertain amounts”, though the degree of vagueness varies from the nearly precise (“three or four”) to the definitely imprecise (“one or the other”). The next-most frequent set of LNGs refers to time spans. As Table 4.4 demonstrates, the usage patterns for the same noun groups vary between 19C and 20C material. The earlier usage is used as the benchmark here. Counting on, the next trigram would be “ten or eleven”. However, given that Britain uses either the duodecimal or the decimal system, and given the similar sound of “ten or twelve” in speech, this prime number is rare here: five instances in 19C and seven in the (larger) BNC prose fiction sub-corpus.

Table 4.4  The most frequent or LNGs with numerals

LNG                     N 19C    19C per million    N BNC    BNC per million
TWO OR THREE            894      64.2               347      20.6
ONE OR TWO              483      34.7               612      36.4
THREE OR FOUR           402      29.9               217      12.9
FOUR OR FIVE            166      11.9                82       5.1
FIVE OR SIX             115       8.3                78       4.6
SIX OR SEVEN             84       6.0                45       2.7
ONE OR OTHER             74       5.3                71       4.2
SEVEN OR EIGHT           48       3.4                44       2.6
EIGHT OR NINE [a]        43       3.2                25       1.5
TEN OR TWELVE [5]        33       2.4                34       2.0
TWENTY OR THIRTY         33       2.4                23       1.4
EIGHT OR TEN             28       2.0                 6       0.4
FORTY OR FIFTY           27       1.9                19       1.1
SIX OR EIGHT             26       1.9                 2       0.1
FIFTY OR SIXTY           22       1.6                 8       0.5
THIRTY OR FORTY          21       1.5                17       1.0
TWO OR THREE HUNDRED     22       1.6                 8       0.5
TWO OR THREE MORE        21       1.5                 1       <0.1
SOME TWO OR THREE        20       1.4                 0       0.0
SOME THREE OR FOUR       17       1.2                 4       0.2

[a] Counting on from here

94 

M. PACE-SIGGE

Table 4.4 presents two immediate insights: one, numeral N-[or]-N LNGs are overall less frequent in the twentieth-century data6 and, two, that neither corpus presents the count numbers in consecutive order when it comes to their consecutive frequencies.7 Starting with the second point, this is a feature more prominent in 19C which presents “two or three” nearly twice as often as “one or two”—almost the reverse of what is found in BNC. The Zipfian distribution of occurrence is, nevertheless, fairly regular in both. This points to an important factor with regards to the author’s intention: use of certain numerals carries a semantic association with particular concepts. For example, looking at the item hour, Hoey writes: “hour is likely to be primed for many speakers of English to collocate with half an, one, two, three, four and twenty four, but thirty occurs only once in my data” (2005, 16). This gives also a first indication why single-digit LNGs are all present, yet there are too few teens and numbers higher than 60 (until the hundreds are mentioned) to appear in the most frequent list. The bottom lines in Table 4.4 also give some indications where “two or three” may differ in usage. While “two or three times” appears in nearly one out of ten  of all uses in both corpora, “two or three hundred” or “some two or three” are very rare in the BNC fiction data. As can be seen in Fig. 4.3, the most frequent numeral LNGs are being used in reference to time, distance or people. One or two stands out as it usually refers to concepts. While the ranking of usage differs between 19C and BNC, there is no real deviation where the two usage patterns are concerned. However, unlike what we have observed with but, the LNGs for or numerals extend to other things (trees, cars, bicycles) in BNC amongst the most frequent 4-grams. 
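The kind of trigram count that underlies Table 4.4 can be illustrated with a minimal sketch. It is not the procedure used for this study: the function name, the simple regular-expression tokeniser and the sample sentence are all assumptions made for illustration, and the sketch counts any word-or-word sequence without checking part of speech.

```python
import re
from collections import Counter


def count_linked_trigrams(text: str, link: str = "or") -> Counter:
    """Count three-word sequences whose middle word is the given link.

    Assumption: a plain lowercase word tokeniser is good enough for a
    demonstration; no part-of-speech filtering is applied.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Slide a three-token window over the text and keep windows centred on the link.
    for left, middle, right in zip(tokens, tokens[1:], tokens[2:]):
        if middle == link:
            counts[f"{left} {middle} {right}"] += 1
    return counts


# Hypothetical sample text, invented for this sketch.
sample = (
    "He waited two or three days, then one or two more; "
    "after two or three hours she left."
)
counts = count_linked_trigrams(sample)
for trigram, n in counts.most_common():
    print(trigram, n)
```

Ranking the resulting counts, as `most_common()` does here, is what allows a comparison such as “two or three” against “one or two” across corpora once the raw counts are normalised per million words.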
The degree of uncertainty, indeed vagueness (“one or two other things”), points to the spoken dialogue that often makes up an essential part of prose fiction, in particular in novels.

Fig. 4.3  The most frequent or 4-grams 19C (top) and BNC (bottom). Panels: ONE OR TWO, TWO OR THREE, THREE OR FOUR
19C: (other) things, points, words; times, days, years, hours, months, minutes, weeks; days, years, times, hours, months, minutes, weeks; questions, occasions, attempts, pieces, exceptions; hundred, steps, miles; feet, miles, points, inches, yards, hundred, thousand; men, people, friends, gentlemen; others, men, women; men, people, soldiers
BNC: times, days, years, hours, minutes, weeks, months; years, times, days, hours, minutes, months, o'clock; things, pieces, pairs, points, notes; questions, occasions, cases, trees; hundred, feet, inches, miles; miles, paces, hundred; people, ladies, women; cars, men, occasions; people, men, bicycles

4.2.2.3 LNGs with or: Time and Vagueness Markers

Lastly for this section are N-[or]-N constructions that are prototypical of the genre. In a narrative, the framework of time will usually find a place in one form or another. Furthermore, fiction texts contain a lot of conversation. As spoken language tends to be more vague than descriptive written texts, vagueness markers would be expected to be fairly frequent. This subsection highlights how this structural element presents clear evidence of changes in style over time.

Table 4.5  The most frequent or LNGs with time markers

| LNG | N 19C | 19C per million | N BNC | BNC per million |
|---|---|---|---|---|
| DAY OR TWO | 340 | 24.4 | 141 | 8.4 |
| MINUTE OR TWO | 185 | 13.3 | 92 | 5.5 |
| HOUR OR TWO | 127 | 9.1 | 58 | 3.4 |
| YEAR OR TWO | 96 | 6.9 | 50 | 3.0 |
| MOMENT OR TWO | 80 | 5.7 | 118 | 7.0 |
| WEEK OR TWO | 76 | 5.5 | 61 | 3.6 |
| TWO OR THREE TIMES | 75 | 5.4 | 38 | 2.3 |
| TWO OR THREE DAYS | 70 | 5.0 | 27 | 1.6 |
| MONTH OR TWO | 51 | 3.7 | 24 | 1.5 |
| TIME OR OTHER | 41 | 2.9 | 14 | 0.8 |
| TWO OR THREE YEARS | 40 | 2.9 | 20 | 1.2 |
| SECOND OR TWO | 40 | 2.9 | 57 | 3.4 |
| THREE OR FOUR DAYS | 36 | 2.6 | 14 | 0.8 |
| THREE OR FOUR YEARS | 27 | 1.9 | 20 | 1.2 |
| AN HOUR OR SO | 61 | 4.5 | 118 | 7.0 |
| MINUTE OR SO | 23 | 1.7 | 49 | 2.9 |
| FOR A MINUTE OR TWO | 71 | 5.0 | 36 | 2.1 |
| FOR A DAY OR TWO | 65 | 4.7 | 38 | 2.3 |
| FOR A MOMENT OR TWO | 45 | 3.2 | 51 | 3.0 |
| FOR AN HOUR OR TWO | 40 | 2.9 | 22 | 1.3 |
| SOME TIME OR OTHER | 37 | 2.6 | 14 | 0.8 |

Table 4.5 shows two clear chronological shifts: first of all, while “(a) day or two” is the most frequent such Linked Noun Group in both corpora,8 overall

such references to a time span were far more prominent in nineteenth-century fiction than in twentieth-century fiction. Secondly, fiction in the BNC displays a stronger tendency for vagueness than 19C, with “-or so” being, relatively, always more frequent. In fact, while the phrase “(a) moment or two” is not significantly less frequent in 19C, this phrase, together with “an hour or so”, is the second-most frequent time marker N-[or]-N construction in BNC. Furthermore, it is interesting to see how the times are distributed: in 19C, it is day/minute/hour before year/moment/week/times. In BNC, it is day/moment/week before hour/second/year. Also of interest should be that one particular phrase, “(some) time or other”, is almost never recorded in BNC compared to 19C. Similarly, periods longer than “one or two” are rare in BNC, whereas there are a number of time frames with “three or four” in 19C. As Table 4.5 shows, elements of vagueness appear with time markers. Consequently, Table 4.6 will give an indication of fixed phrases that indicate vagueness for or-LNGs.

Table 4.6  The most frequent or LNGs with vagueness markers

| LNG | N 19C | 19C per million | N BNC | BNC per million |
|---|---|---|---|---|
| WORD OR TWO | 81 | 5.8 | 15 | 0.8 |
| ONE WAY OR ANOTHER | 52 | 3.8 | 73 | 4.1 |
| SOMETHING OR OTHER | 51 | 3.7 | 35 | 2.0 |
| WAY OR ANOTHER | 49 | 3.5 | 76 | 4.6 |
| SOMEWHERE OR OTHER | 20 | 1.5 | 10 | 0.3 |
| ONE OR TWO THINGS | 12 | 0.9 | 43 | 2.4 |

Yet these N-[or]-N phrases are, with the exception of “(one/some) way or another”, rather infrequent. As has already been observed above, certain phrases seem to fall out of favour: 19C still records “(a) word or two” with some frequency of note, whereas it is quite rare in BNC.

4.2.2.4 LNGs with or: Summary

Returning to the LDOCE definition, fictional texts of the nineteenth and twentieth centuries seem to mostly employ or in N-[or]-N constructions as per definitions (1), (2) or (6), in other words, to express choice, “and not” or uncertain amounts. They are typically employed for numerical lists and, related to this, time markers. Less certain amounts, or, in other words, greater vagueness, appear to be more typical of the BNC than the 19C material. It can be said, too, that N-[or]-N constructions are, overall, more frequently used in 19C. This can be interpreted in two ways: either it is a construction that is no longer prevalent in contemporary literature or, alternatively, the construction still persists but is far less formulaic.9 Finally, this Linked Noun Group is used both for opposites10 (“night or day”) and to express vagueness (“something or other”). These are, while present in both corpora, relatively few in number.

4.2.3  LNGs with and in Fiction

The final part of this section focusses on N-[and]-N constructions: these are the ones centred around the most frequent conjunct. Yet these are not necessarily found in higher frequencies for the LNGs concerned. Instead, a wider spread of themes appears to be covered. In fact, narrative fiction has, as a typical construction, VP-NP-and-VP (like “he opened the door and entered”). There is also the typical old-fashioned way of describing that a character has come of age in 19C: “one and twenty (years)”.11


Overall, for the LNGs encountered, it appears that they allow for a whole ontology. In British fiction, authors use linked pronouns like my and mine and refer to words like people, food items, items of clothing, body parts and love, amongst others. These will be presented below, moving from the least frequent topical LNGs to the most frequent. There are a great many N-[and]-N forms in the fiction corpora that are used several times. However, frequencies might be low or they might only appear in one of the two corpora. Unless the latter is noteworthy, these forms will not be discussed further. A large number of LNGs not taken into consideration are, however, to be found in the appendix.

4.2.3.1 LNGs with and: References to love and death

Love is a term with particular resonance in English-language imaginative writing (compare also Chap. 3). In all its uses, it occurs 9074 times (0.07% of the total of tokens) in 19C and 8588 times (0.05%) in BNC-F. In fact, the form LOVE and has been recorded as prominent in the historical collection of English books (containing Early English Books Online (EEBO), Eighteenth Century Collection Online (ECCO) and Evans). Sketchengine (Kilgariff and Rychly, 2019) lists 306,439 tokens for LOVE as noun12 for this collection and gives the following most frequent trigrams:

• Faith and love
• Love and affection
• Love and mercy
• Love and favour
• Love and peace
• Love and charity
• Love and obedience
• Love and joy
• Love and gratitude
• Love and friendship
• Love and care
• Love and delight
• Love and respect
• Love and esteem
• Love and reverence
• Love and kindness

As can be seen, love overwhelmingly starts these trigrams. While the term is highly frequent, the love-[and]-N constructions are decidedly infrequent. A trigram must appear 14 times in 19C, and 17 times in BNC-F, to be counted as “once per million words”; against this threshold, Figs. 4.4 and 4.5 demonstrate that these LNGs exist but are rare. Figure 4.4 shows that the overlap between 19C and BNC fiction is rather narrow (“Love and Happiness”, “Love and Friendship”). More importantly, the differences clearly highlight the key concerns of the respective ages.
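The once-per-million threshold can be made concrete with a short sketch. The corpus sizes used below (roughly 13.9 million tokens for 19C and 16.8 million for BNC-F) are assumptions back-calculated from the counts and per-million rates in the tables; they are not stated in this section, and the function names are illustrative only.

```python
import math

# Assumed corpus sizes in tokens: approximate values inferred from the
# tables (e.g. 894 occurrences reported as 64.2 per million), not given here.
CORPUS_SIZE = {"19C": 13_900_000, "BNC-F": 16_800_000}


def per_million(count: int, corpus_tokens: int) -> float:
    """Normalise a raw frequency to occurrences per million tokens."""
    return count / corpus_tokens * 1_000_000


def min_count_for_once_per_million(corpus_tokens: int) -> int:
    """Smallest raw count that reaches at least one occurrence per million."""
    return math.ceil(corpus_tokens / 1_000_000)


for name, size in CORPUS_SIZE.items():
    print(name, min_count_for_once_per_million(size))
# prints "19C 14" and "BNC-F 17" under the assumed sizes
```

Under these assumed sizes the thresholds come out at 14 and 17, matching the minimum counts given in the text. Because the sizes are approximations, individual per-million values computed this way may differ from the published tables in the last decimal place.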

Fig. 4.4  The most frequent LOVE and LNGs. Left: 19C; right: BNC
Figure labels (second noun, with frequency): HOPE; DUTY 10 / AFFECTION 7 / BEAUTY, MARRIAGE 9 / CARE 6 / RESPECT, TRUST, TRUTH 8 / REVERENCE; WAR, FAITH; HAPPINESS; LIFE; KINDNESS; ADMIRATION; WAR 6 / TENDERNESS; FRIENDSHIP; JOY 5 / WAR 5 / JOY; KISSES; LONGING; LOYALTY; MARRIAGE; ROMANCE 4 / DEATH; FRIENDSHIP, HAPPINESS; LIGHT; DUTY; PRIDE; TRUST; TRUTH; WORK 3 / DEVOTION; CHARITY; GENTLENESS; PITY; SYMPATHY; YOUTH 4

Love is less frequent in BNC; consequently, the number of LNGs recorded is lower. The interesting point here is the shift (which can also be seen when looking at the EEBO etc. data above) in the focus of these trigrams. Some of the trigrams appear in a different category (as verb rather than noun, “love and honour” being such an example). We can find that Love and Duty is very typical of Victorian fiction. In contrast, it is of low frequency in the BNC. Similarly, love and hope was used more frequently in 19C than in BNC, while love and affection, love and care as well as love and kisses can be seen as more typical of twentieth-century expressions. Figure 4.5 demonstrates not only a wider variety of the reverse form using love as the final part of the LNG; crucially, the reverse forms in Fig. 4.5

Fig. 4.5  The most frequent and LOVE LNGs. Left: 19C; right: BNC
Figure labels (first noun, with frequency): PEACE 8 / LIFE 14 / HOPE 10 / TRUST; GRATITUDE; JOY; LIFE 5 / WARMTH 6 / RESPECT 3

echo the forms seen in Fig. 4.4 for 19C. However, they are entirely different for BNC. Another dimension of difference is life and love, which is the most frequent form found in the BNC. These are, in their majority, clause final and carry a strong negative prosody (e.g. “a man badly damaged by life and love ought to avoid…”). This appears to be quite different in 19C where the negativity is not pronounced and where this phrase is always mid-clause. Indeed, it is split into two clauses by George Gissing: “Beauty is solace of life, and Love the end of being”. “Love and affection” has got its equivalent in “warmth and love”. This is in contrast to 19C where we find “joy and love” yet the more formal “respect and love” also occurs. When it comes to death, only the phrase “life and death” can be found with any frequency in both corpora: 93 times in 19C and 44 times in BNC material, or, in other words, 6.7 and 2.6 times per million words, respectively.

4.2.3.2 LNGs with and: Clothing and the Environs

Table 4.7 is thematically ordered, and follows the “top-to-toe” structure. If anything, items of clothing [and]-N constructions demonstrate how fashion has changed over a period of about 100 years. Bonnets, shawls, sticks, waistcoats and stockings are out. Hats are only worn when it is cold: no longer with jackets but with a coat (where it can be first or second placed). By contrast, only the BNC records descriptions like “jacket and trousers” or “shoes and socks”. The total occurrences are low whichever way they are being looked at. Still, the most frequent trigrams give an impression of what literary characters might have looked like:

| 19C | 20C |
|---|---|
| Bonnet and shawl / hat and gloves | Hat and coat |
| Coat and waistcoat | Coat and skirt / shirt and trousers |
| Shoes and stockings | Shoes and socks |

Table 4.7  The most frequent and LNGs with items of clothing

| LNG | N 19C | 19C per million | N BNC | BNC per million |
|---|---|---|---|---|
| BONNET AND SHAWL | 29 | 2.1 | 0 | 0 |
| HAT AND GLOVES | 15 | 1.1 | 3 | 0.1 |
| HAT AND STICK | 11 | 0.8 | 0 | 0 |
| HAT AND JACKET | 10 | 0.7 | 0 | 0 |
| HAT AND COAT | 8 | 0.6 | 24 | 1.4 |
| COAT AND WAISTCOAT | 22 | 1.6 | 1 | <0.1 |
| COAT AND SKIRT | 0 | 0 | 14 | 0.9 |
| COAT AND HAT | 6 | 0.5 | 14 | 0.9 |
| JACKET AND TROUSERS | 0 | 0 | 10 | 0.6 |
| SHIRT AND TROUSERS | 6 | 0.4 | 13 | 0.8 |
| SHOES AND STOCKINGS | 20 | 1.5 | 3 | 0.1 |
| SHOES AND SOCKS | 0 | 0 | 18 | 1.2 |

Table 4.8  The most frequent and LNGs with housing or environs

| LNG | N 19C | 19C per million | N BNC | BNC per million |
|---|---|---|---|---|
| KNIFE AND FORK | 48 | 3.5 | 50 | 3.0 |
| GOLD AND SILVER | 45 | 3.2 | 11 | 0.7 |
| PEN AND INK | 27 | 2.0 | 3 | 0.1 |
| CUPS AND SAUCERS | 25 | 1.9 | 39 | 2.3 |
| KNIVES AND FORKS | 24 | 1.8 | 19 | 1.1 |
| BOOKS AND PAPERS | 21 | 1.6 | 25 | 1.5 |
| HOUSE AND HOME | 13 | 1.0 | 6 | 0.3 |
| HOUSE AND GARDEN | 12 | 0.9 | 16 | 0.9 |
| HOUSE AND GROUNDS | 10 | 0.7 | 6 | 0.3 |
| HOME AND ABROAD | 10 | 0.7 | 2 | 0.1 |
| BOARD AND LODGING | 14 | 1.1 | 8 | 0.5 |
| BED AND BREAKFAST | 1 | <0.1 | 36 | 2.1 |
| BED AND BOARD | 4 | 0.3 | 8 | 0.5 |
| TABLE AND CHAIRS | 3 | 0.2 | 14 | 0.8 |
| CHAIR AND TABLE | 7 | 0.5 | 2 | 0.1 |
| TREES AND BUSHES | 10 | 0.7 | 13 | 0.8 |
| TREES AND SHRUBS | 8 | 0.5 | 9 | 0.4 |
| TOWN AND COUNTRY | 14 | 1.0 | 3 | 0.1 |
| HEAVEN AND EARTH | 46 | 3.3 | 20 | 1.2 |
| LAND AND SEA | 20 | 1.4 | 6 | 0.3 |
| SEA AND LAND | 13 | 1.0 | 3 | 0.1 |
| SEA AND SKY | 11 | 0.8 | 13 | 0.8 |

Table 4.8 moves away from personal dress to the personal surroundings and the wider environment characters experience in the fiction data. Table 4.8 is thematically ordered. With two exceptions, all trigrams are low in frequency. The first section, house / home and *, is remarkably similar in its frequencies13, except for the phrases “gold and silver” and “pen and ink”, which are amongst the most frequent of such LNGs in 19C but have

become rare in twentieth-century fiction. Similarly, the phrase “home and abroad” appears to be fading out of use. The opposite can be found in the next section, furniture; recorded only once in 19C (in Wilkie Collins’s Armadale), “bed and breakfast” has moved from the lowercase service provided by a person to the uppercase place which is fairly frequently mentioned (36 times in 14 different texts). Indeed, bed, tables and chairs are found more often in BNC than in 19C; also the sparse “chair and table” is rare in BNC compared to “table and chairs”. Moving on to the surrounding environment, the description of trees usually comes with undergrowth. Apart from the two trigrams in Table 4.8, there is also “trees and hedges” and “trees and flowers”. Overall, the differences in occurrence between the two corpora here are negligible. It also hints towards a particular nesting: trees, in these constructions, are typically followed by other forms of organic growth which are less tall.


When it comes to the further surroundings (towns, states, the sea), we can see a concrete absence in more contemporary writing. In particular, the phrases “town and gown” and “church and state” have disappeared in the BNC fiction sub-corpus.14 References using the and Linked Noun Group for earth, land and sea have likewise decreased in use over time. The exceptions are “sea and sky” and “heaven and earth”. The latter seems to be moving towards delexicalisation in BNC, with 7/20 instances being move/moving heaven and earth.15 In 19C, the link to the biblical phrase is far stronger; hence, 26/46 occurrences are preceded by a preposition like in, of or between.

4.2.3.3 LNGs with and: Time Markers

This and the following subsections display trigrams that come with more noticeable frequency in both corpora. As we have observed with “mind and body” / “body and mind” in Sect. 4.2.4.3, “day and night” and its reverse form (see Table 4.9) are close in frequency in 19C. In 19C, it comes from 49 different sources; several writers use it over five times. This, the most frequent N-[and]-N construction, is used by a far wider variety of writers in the BNC-Fiction, where the 74 tokens come from 69 different sources. Why this might be is discussed in greater detail below. While there is some overlap between this and the reverse forms, it has to be said that a writer might use both but has a clear preference for one or the other form. This time reference is also more than twice as common in 19C as it is in BNC.

Table 4.9  The most frequent and LNGs with time markers

| LNG | N 19C | 19C per million | N BNC | BNC per million |
|---|---|---|---|---|
| DAY AND NIGHT | 123 | 8.8 | 74 | 4.4 |
| NIGHT AND DAY | 95 | 6.9 | 25 | 1.5 |
| DAY AND AGE | 0 | 0 | 20 | 1.2 |
| DAYS AND NIGHTS | 42 | 2.9 | 22 | 1.3 |
| MORNING AND EVENING | 23 | 1.7 | 7 | 0.3 |
| NIGHT AND MORNING | 14 | 1.0 | 6 | 0.3 |
| YEARS AND YEARS | 46 | 3.2 | 32 | 1.9 |
| TIME AND PLACE | 28 | 2.1 | 27 | 1.6 |
| TIME AND SPACE | 14 | 1.0 | 21 | 1.2 |

The key difference appears to be that “day and night” appears in longer clusters, most prominently “at all hours of the day and night”

(used five times by four different authors). Furthermore, and LNGs with morning can be seen as more preferred in 19C than in BNC, where the difference for “morning and evening” is significant when we look at all uses of “morning”. Another key divergence is that 19C presents greater variation to refer to “time and place” than BNC, while (albeit with very low numbers) the phrases “day and age” and “time and money” are clearly preferred in the twentieth century (see appendix).

4.2.3.4 LNGs with and: Body Parts

Far more frequent than either the items of clothing or the surroundings of the characters in novels are and LNGs that refer to body parts. Table 4.10 is split into two sections: first, the body references overall (including metaphorical references to the body) are listed, then the table is organised from “top-to-toe”, starting with references to “hair” and ending in a Linked Noun Group ending in “* and foot”. Similar to what we have seen in Table 4.7, the change in fashion from the nineteenth to the late twentieth century is reflected in some of these constructions. In order to be listed in this table, an LNG had to occur at least once in a million words in either corpus. Starting with the metaphors in Table 4.10, there appears a concrete drop in usage between the nineteenth- and the twentieth-century material. Both have “flesh and blood” as the most frequent Linked Noun Group here, and both use it to refer to “offspring”, to “being vigorous” or as opposition to “ghost”. Both corpora focus on the terms body, mind and soul. Yet, apart from higher frequencies of use, 19C also employs LNG constructions with heart. Thus, for example, “heart and brain” appears typically as a container term, as in “I might feel myself the heart and brain of a multitude”; the other key difference is that “mind and body” appears as often as “body and mind” in 19C, whereas the former occurs twice as often as the latter in BNC.
In 19C, the dispersion rate is similar, though only a minority of source texts appear to use both forms. They appear to be used in similar ways, typically preceded by “in” or “of” and the only 5-gram being “both in body and mind” / both of mind and body”. The one discernible difference is the use of the personal pronoun (his, my) preceding “mind and body” which has no equivalent. Looking at descriptive body part [and]-N constructions, both corpora have “hands and knees” as their most frequent descriptor, followed by “head and shoulders” and “hands and feet”. A number of descriptors are more typical of 19C: “hair and whiskers”, “face and figure / face and form” or, indeed, “face and manner”. This seems to reflect that whiskers were more fashionable then, and it should be noted that “face” was often

Table 4.10  The most frequent and LNGs with body parts

| LNG | N 19C | 19C per million | N BNC | BNC per million |
|---|---|---|---|---|
| FLESH AND BLOOD | 128 | 9.2 | 71 | 4.2 |
| HEART AND SOUL | 74 | 5.3 | 26 | 1.5 |
| BODY AND SOUL | 47 | 3.2 | 23 | 1.4 |
| HEART AND MIND | 22 | 1.6 | 8 | 0.5 |
| MIND AND HEART | 16 | 1.2 | 6 | 0.3 |
| HEART AND BRAIN | 13 | 1.0 | 0 | 0 |
| MIND AND BODY | 46 | 3.2 | 30 | 1.8 |
| BODY AND MIND | 45 | 3.2 | 15 | 0.9 |
| HAIR AND WHISKERS | 13 | 1.0 | 1 | <0.1 |
| HAIR AND BEARD | 13 | 1.0 | 9 | 0.6 |
| HAIR AND EYES | 11 | 0.8 | 22 | 1.4 |
| HEAD AND SHOULDERS | 38 | 2.7 | 60 | 3.6 |
| HEAD AND EARS | 36 | 2.6 | 2 | 0.1 |
| HEAD AND FACE | 19 | 1.4 | 8 | 0.5 |
| FACE AND NECK | 23 | 1.7 | 38 | 2.3 |
| FACE AND HANDS | 22 | 1.6 | 43 | 2.6 |
| FACE AND FIGURE | 22 | 1.6 | 4 | 0.2 |
| FACE AND FORM | 15 | 1.1 | 3 | 0.1 |
| FACE AND BODY | 2 | 0.1 | 18 | 1.2 |
| FACE AND HAIR | 6 | 0.4 | 17 | 1.1 |
| FACE AND MANNER | 15 | 1.1 | 1 | <0.1 |
| EYES AND MOUTH | 14 | 1.0 | 17 | 1.1 |
| EYES AND NOSE | 5 | 0.4 | 17 | 1.1 |
| NECK AND SHOULDERS | 14 | 1.0 | 15 | 0.8 |
| ARMS AND LEGS | 22 | 1.6 | 100 | 5.9 |
| HANDS AND KNEES | 50 | 3.6 | 60 | 3.6 |
| HANDS AND FEET | 40 | 2.9 | 36 | 2.1 |
| HANDS AND ARMS | 17 | 1.2 | 11 | 0.7 |
| HANDS AND EYES | 13 | 1.0 | 7 | 0.4 |
| HANDS AND FACE | 12 | 0.9 | 28 | 1.8 |
| HAND AND FOOT | 36 | 2.6 | 14 | 0.8 |
| FINGER AND THUMB | 41 | 3.1 | 38 | 2.3 |

linked to larger entities and was seen as the point from which a person can be judged.16 As this belief receded, so, it seems, have related lexical items. By contrast, the BNC shows a strong preference for the trigrams “arms and legs” and “head and shoulders”. In both corpora, these are typically preceded by “his” or “her”; only in BNC, however, the longer phrases “stood / towering head and shoulders above” are recorded. “Hands and


face” appears, proportionally, twice as often in BNC as in 19C. In both corpora, these are most frequently followed by “and”. Also, there is a connection to cleaning these extremities17 which seems to make it almost a fixed phrase, as half of the concordance lines display this usage format.

4.2.3.5 LNGs with and: Food Items

Table 4.11 is thematically ordered. It also presents one of the strongest indicators of how preferences (in this case, dietary ones) have changed in the course of just over 100 years. Neither have the phrases “bread and butter issues” or “pieces / slices of bread and butter”. Yet, while in 19C there is an extended [and] LNG with “tea and bread and butter” (six occurrences), the BNC has “bread and butter” as part of a larger fare, for example “tea with ham, bread and butter” or even “potatoes, hard-boiled eggs, bread-and-butter, trifle; all at once, mind you” or “caviare [sic] and black bread and butter”. In a similar vein, “bread and cheese”, which is twice recorded as “a supper of bread and cheese” in 19C, is now a mere “lunch of bread and cheese” (four occurrences) in BNC. And, while it might be “scanty” or just a “mouthful” and maybe “with beer” in Victorian Britain, in the twentieth century it might come with “salad”, “apples” or “meat”, and “coffee” rather than an alcoholic drink. This is further reflected in the use of “bread and water” and “bread and milk”. While the former appears as “nothing but bread and water to live on” in 19C, “bread and water” is typically referred to as the bare necessities in BNC, something stressed by the fact that it more prominently features “food and water” (see appendix).

Table 4.11  The most frequent and LNGs with food items

| LNG | N 19C | 19C per million | N BNC | BNC per million |
|---|---|---|---|---|
| BREAD AND BUTTER | 186 | 13.4 | 81 | 4.8 |
| BREAD AND CHEESE | 67 | 4.9 | 54 | 3.2 |
| BREAD AND WATER | 25 | 1.8 | 9 | 0.6 |
| BREAD AND MILK | 20 | 1.4 | 10 | 0.6 |
| BREAD AND MEAT | 11 | 0.8 | 6 | 0.3 |
| BRANDY AND WATER | 74 | 5.3 | 2 | 0.1 |
| GIN AND WATER | 17 | 1.2 | 5 | 0.3 |
| RUM AND WATER | 13 | 1.0 | 2 | 0.1 |
| FIRE AND WATER | 15 | 1.1 | 5 | 0.3 |

Interestingly, the clearest difference for “bread and milk” is that it is twice mentioned in connection with breakfast in 19C; it is once

a “bread and milk breakfast” but also “bread and milk for supper” or “bread and milk for dinner”. The biggest contrast can be found with regard to drink. Twentieth-century fiction appears to be far less concerned about spirits: “brandy/gin/rum with water” are certainly recognisable descriptive phrases in 19C, yet they are extremely rare in BNC. Lastly, “fire and water”, in 9/15 cases in 19C, is “go* through fire and water”, a metaphor that has nothing to do with nutriments as the reference is to the elements. However, in the very few examples found in BNC, it never appears as part of the above phrase. It is used twice to indicate “opposites” and three times in its literal sense.18

4.2.3.6 LNGs with and: People

Looking at proper nouns, references to people provide the most frequent N-[and]-N constructions in fictional texts. If anything, it shows a society that has become less formal, less concerned with god and working animals, yet making more references to friends. In order to give an insight into how far the frequencies of occurrence diverge, the log-likelihood for each trigram (as measured within the total word count of the respective corpus) has been added in Table 4.12.

Table 4.12  The most frequent and LNGs with reference to people

| LNG | N 19C | 19C per million | N BNC | BNC per million | LogLikelihood |
|---|---|---|---|---|---|
| MR. AND MRS. | 310 | 22.3 | 167 | 9.9 | 74.67 |
| MEN AND WOMEN | 247 | 17.8 | 222 | 13.2 | 10.19 |
| FATHER AND MOTHER | 232 | 16.7 | 45 | 2.7 | 175.83 |
| BROTHER AND SISTER | 95 | 6.8 | 65 | 3.8 | 12.72 |
| LADIES AND GENTLEMEN | 87 | 6.2 | 82 | 4.4 | 2.58 |
| WOMEN AND CHILDREN | 80 | 5.7 | 66 | 3.9 | 5.27 |
| BROTHERS AND SISTERS | 55 | 3.9 | 41 | 2.4 | 5.53 |
| MOTHER AND SISTER | 52 | 3.7 | 52 | 3.1 | 0.92 |
| HUSBAND AND WIFE | 50 | 3.6 | 26 | 1.5 | 12.90 |
| GOD AND MAN | 32 | 2.3 | 6 | 0.4 | 24.76 |
| MALE AND FEMALE | 24 | 1.9 | 25 | 1.4 | 0.27 |
| LORDS AND LADIES | 24 | 1.9 | 7 | 0.5 | 10.11 |
| LORD AND MASTER | 22 | 1.7 | 7 | 0.5 | 11.23 |
| WIFE AND MOTHER | 15 | 1.1 | 18 | 1.1 | 0.00 |
| FAMILY AND FRIENDS | 9 | 0.6 | 18 | 1.1 | 0.00 |
| MUM AND DAD | 0 | 0 | 158 | 9.4 | 190.70 |
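The log-likelihood column of Table 4.12 can be approximated with the standard two-corpus log-likelihood (G2) comparison. The sketch below is an illustration under stated assumptions, not a reproduction of the calculator used for the book: the corpus sizes are approximate values back-calculated from the per-million rates, and the function name is hypothetical. The critical value 10.83 is the chi-square threshold for p < 0.001 at one degree of freedom.

```python
import math


def log_likelihood(freq_a: int, freq_b: int, size_a: int, size_b: int) -> float:
    """Two-corpus log-likelihood (G2) for one item.

    freq_a, freq_b: observed frequencies in corpus A and corpus B.
    size_a, size_b: total token counts of the two corpora.
    """
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Assumed corpus sizes (tokens), inferred from the tables; not stated in this section.
SIZE_19C, SIZE_BNC_F = 13_900_000, 16_800_000
CRITICAL_P001 = 10.83  # chi-square critical value, 1 degree of freedom, p < 0.001

rows = [("FATHER AND MOTHER", 232, 45), ("WIFE AND MOTHER", 15, 18), ("MUM AND DAD", 0, 158)]
for lng, n_19c, n_bnc in rows:
    score = log_likelihood(n_19c, n_bnc, SIZE_19C, SIZE_BNC_F)
    print(f"{lng}: LL = {score:.2f}, significant at p < 0.001: {score > CRITICAL_P001}")
```

With these assumed sizes the scores for “father and mother” and “mum and dad” should land within about a point of the 175.83 and 190.70 reported in the table, while “wife and mother” stays close to zero; small deviations are expected because the exact corpus sizes are not known here.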

Table 4.12 is ordered by frequency in 19C. It can be seen that, amongst the low-frequency and LNGs, there is no significant difference in use between the two corpora.19 By contrast, the most frequent such Linked Noun Group, “Mr. and Mrs.”, which is, predictably, followed by a surname, is significantly overused in 19C, as are most of the other trigrams. Apart from the marginal difference in the use of the address form “ladies and gentlemen”, one can also detect a shift in people relations. Both “of / between god and man” and “lord and master” are significantly more common in 19C.  The former is a fixed phrase, occurring in the forms of “the laws of god and men” or “the love of god and men”. For the latter, this should come as no surprise, given that several occasions of “(her) lord and master” is a reference to a woman’s husband. Furthermore, (see appendix) “horse and man” appears 1.5 times per million words in 19C but is no longer relevant in the BNC, where even the trigram “car and driver” is less frequent. Instead, in BNC, friends seem to be given slight preference. Indeed, the phrase “friends and family” is not recorded in 19C yet can be found used a few times in BNC. The one clear exception (demonstrating a concrete shift from the formal to a more informal form of address) is “father and mother” as opposed to “mum and dad”. The former is most significantly overused in 19C, while the latter is a key trigram in BNC—it occurs not even once in the 19C corpus. “Father and mother” is usually preceded by a personal pronoun (my, his, her) or a proper name (Armadale’s, Hassan’s). The notable difference is “both father and mother”, which appears six times in 19C but never in BNC. Also interesting is the shift in prosody: “father and mother” is followed by adjectives “delighted” as often as “buried” (twice) and also once by “proud of him” or “excellent people” in 19C. 
However, the adjectives used in relation to “father and mother” in BNC are “dead” (three times20), “murdered”, “divorced”, “liars” or, in a negative tone, “religious” (once each).

4.2.3.7 LNGs with and: Pronouns

These are the most frequently occurring [and] LNGs in the corpora. In order to give an insight into how far the frequencies of occurrence diverge, the log-likelihood for each trigram (as measured within the total word count of the respective corpus) has been added in Table 4.13. Table 4.13 is ordered by frequency in 19C. These particular N-[and]-N constructions are the ones which show the highest frequencies of use across the board. Some of these run over two clauses (“me, and I”, “us,

Table 4.13  The most frequent and LNGs with pronouns

and I”). It must be noted that only the reflexive forms “you and your” and “he and his” are used with relatively similar frequencies. A shift that can be interpreted as a stronger focus on female characters can be seen in the use of “her and she”, which is significantly more used, as well as “him and she” and “she and her”, which are slightly overused in BNC. The total figures shift the balance from a male-dominated set of referrers in 19C to a more female-focussed one in BNC. The clearest divergence between the two corpora lies in the significant overuse in 19C of the pronouns “me” (“me, and I”, “me and my”) and “I” (“you and I”, “us, and I”). This might be a reflection that first-person narration was more fashionable in Victorian Britain (see Gallagher, 2015). There is also the most common n-gram, “V-me, and I will”. This occurs 48 times and is, typically, reported speech. Lastly, “Me and my” is interesting: “me and my sore back” appears four times in the same text in BNC, whereas “me and my children” or “me and my family” seem to be a more typical use in 19C.


4.3   Concluding Thoughts: LNGs in British Fiction Texts

The results presented in this chapter may be interpreted in several ways. On the surface, we can detect less depth and less variation for a number of LNGs that use the same core words. This could indicate delexicalisation or a shift to a simplification of language usage. Whilst that might be true, a note of caution must be given: 19C is made up out of 100 full-text files; the BNC fiction prose sub-corpus, however, consists of excerpts from 432 texts. This, by itself, can be seen as the cause for a number of trigrams that are found with a higher frequency of occurrence in the 19C data, as idiosyncratic usage forms of individual writers would become more prominent: this is simply a result of the set-up available. Figure 4.6 looks at the most frequent Linked Noun Groups in the two British fiction corpora. The fact is that the total numbers are demonstrably lower in the BNC data; however, while there are one or two clear differences, it is remarkable how stable the most frequently used LNG constructions have remained in their use over the course of two centuries. The main topics where LNGs occur are counting, time and people; we have also seen that vagueness markers and references to distance show little variation: the way they are referred to by linked nouns has shifted only very little (see Fig. 4.6). There are also noteworthy differences between the two corpora. As can be expected, changes in fashion are reflected: whiskers, bonnets and sticks are out, and jackets and socks are featured instead. The rare interrogative “tea or coffee” is contemporary, while the dramatic phrase “life or death” seems to be more typical of Victorian fiction. A clear change is demonstrated in the reference to food stuffs: while spirits are not often mentioned in BNC, the reference to meals and home furnishing gives a slight indication that the characters are better off in the twentieth century.
This echoes observations made, looking at the same corpora, by Pace-Sigge (2018). The reference to parents is a curious one: not only does it point to a higher level of informality (“mum and dad”) but also a shift towards greater negative prosody for the Linked Noun Group “father and mother”.

4  LNGS IN NINETEENTH- AND TWENTIETH-CENTURY BRITISH FICTION 

19C                    N / per million    BNC-F                           N / per million
TWO OR THREE           894 / 64.2         ONE OR TWO                      612 / 36.4
ONE OR TWO             483 / 34.7         TWO OR THREE                    347 / 20.6
THREE OR FOUR          402 / 29.9         MEN AND WOMEN                   222 / 13.2
DAY OR TWO             340 / 24.4         THREE OR FOUR                   217 / 12.9
MR. AND MRS.           310 / 22.3         MR. AND MRS.                    167 / 9.9
MEN AND WOMEN          247 / 17.8         MUM AND DAD                     158 / 9.4
FATHER AND MOTHER      232 / 16.7         DAY OR TWO                      141 / 8.4
BREAD AND BUTTER       186 / 13.4         MOMENT OR TWO / an HOUR OR SO   116 / 6.9
MINUTE OR TWO          185 / 13.3         MINUTE OR TWO                   92 / 5.5

Fig. 4.6  Ten most frequent LNGs in fiction (excluding pronouns)
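
The per-million figures in Fig. 4.6 are raw counts normalised by corpus size. The following minimal Python sketch shows that normalisation and, in reverse, the corpus size implied by a count and its rate. The function names are illustrative only, and the token totals it derives are back-calculated from the figure rather than reported by the author, so they should be read as rough assumptions.

```python
def per_million(count: int, corpus_tokens: float) -> float:
    """Normalise a raw count to occurrences per million tokens."""
    return count / corpus_tokens * 1_000_000


def implied_corpus_size(count: int, rate_per_million: float) -> float:
    """Back-calculate the token total implied by a count and its rate."""
    return count / rate_per_million * 1_000_000


# Top-ranked rows of Fig. 4.6; the sizes are estimates, not published totals.
size_19c = implied_corpus_size(894, 64.2)    # "two or three" in 19C
size_bnc_f = implied_corpus_size(612, 36.4)  # "one or two" in BNC-F

print(f"implied 19C size:   {size_19c:,.0f} tokens")
print(f"implied BNC-F size: {size_bnc_f:,.0f} tokens")
print(f"BNC-F / 19C ratio:  {size_bnc_f / size_19c:.2f}")
```

Because the two corpora differ in size, only the normalised rates, not the raw counts, can be compared directly across the two columns of the figure.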

4.4   A Case Study: LNGs Occurrence Structure in Dickens and Nineteenth-Century Marx Translations

4.4.1  Introduction

Taking the kind of structural analysis presented above, this section will present a brief case study. As such, the LNGs found in the Dickens Corpus (DC hereafter; Mahlberg, 2007) will be compared with what the 19C and BNC fiction corpora have shown. As a further step, there will be a look at the Marx Corpus (MC hereafter; Pace-Sigge, 2018). Reasons why the translations of Marx can be seen as a valid form of nineteenth-century British writing have been described in detail before (Pace-Sigge, 2018). The interesting point is that “academic English” at that point in time was less conformist than in the second half of the twentieth century. The question is, therefore, whether the philosophical and economic treatises written by Marx show, in their translations, a closeness to academic writing, or whether the translator is more influenced by the popular writings (in this case, Dickens) of his time.

As has been pointed out above, a difference in the number of files per corpus can be misleading when the relative numbers of occurrences are being compared. In this section the main comparison is between the 23-file DC and the 100-file 19C. Both are full-text corpora. Two books by Dickens are contained in 19C.[21] In order to see where there is a clear divergence in the frequency of use, the trigrams were statistically tested. Tables 4.14 and 4.15 have been assembled based on LNG occurrences that are significantly more frequent in DC than in 19C, using Rayson’s (2016) log-likelihood calculator. The minimum certainty level is at the 99.9th percentile (p < 0.001); often, however, it is within the 99.99th percentile. Those n-grams whose frequency divergence is statistically significant only to a lower degree can be found in the appendix.

4.4.2  LNGs in the Dickens Corpus Compared to General Fiction Corpora

4.4.2.1 Marginal LNGs: The Use of neither…nor

Neither the link “as” nor the link “but” shows any evidence of repeated Linked Noun Groups in use within the corpus of Dickens’ novels. There are, however, two LNGs with “nor” as link: the common phrase “neither here nor there” and the trigram “eyes nor ears”. The former occurs nine times (two times per million words) in DC, but is recorded just once in 19C. It is similar for the latter: recorded four times in Dickens, it is only apparent twice in 19C, one instance of which is in Dickens’ novel David Copperfield. These LNGs must therefore be seen as typical, idiosyncratic turns of phrase by Dickens.
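
The significance testing mentioned in Sect. 4.4.1 relies on Rayson’s log-likelihood calculator. As a rough illustration only, the sketch below implements the standard two-corpus log-likelihood formula in Python and applies it to “neither here nor there” (nine occurrences in DC, one in 19C). The corpus sizes are assumptions back-calculated from the per-million figures given in this chapter, not totals reported by the author, and the function is a generic sketch rather than a reproduction of the spreadsheet itself. The thresholds used are the usual chi-square critical values for one degree of freedom (10.83 for p < 0.001, 15.13 for p < 0.0001).

```python
import math


def log_likelihood(freq1: int, size1: float, freq2: int, size2: float) -> float:
    """Two-corpus log-likelihood (G2) for one item's frequency."""
    total = freq1 + freq2
    # Expected frequencies if the item were spread evenly over both corpora.
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)
    ll = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Assumed corpus sizes (tokens), estimated from counts and per-million rates.
DC_TOKENS = 4_530_000
C19_TOKENS = 13_900_000

score = log_likelihood(9, DC_TOKENS, 1, C19_TOKENS)
print(f"LL = {score:.2f}")
print("significant at p < 0.001: ", score > 10.83)
print("significant at p < 0.0001:", score > 15.13)
```

The test compares observed counts against what the relative corpus sizes would predict, which is why a raw count as low as nine can still mark a clear divergence when the comparison corpus is about three times larger and yields a single hit.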


4.4.2.2 LNGs with or that are (A-)typical where Dickens is Compared to 19C

There are just a handful of N-[or]-N constructions that are fairly frequent and used significantly more often by Dickens than by either his contemporaries or more recent fiction writers. As can be seen in Table 4.13, there is one exception: “day or two” appears underused in Dickens’ novels. The usage pattern of this phrase is identical in DC and 19C; it is simply less preferred by the London author. Table 4.14 shows those or LNGs which are significantly overused in DC; these have been ordered thematically. Interestingly, both “step or two” and “pace or two” are often preceded by “back”, thus forming the phrase “fell / falling back a step or two” or “stepped back a pace or two”. These usage patterns are identical across all the nineteenth-century writing; Dickens, however, has a clear preference for this cluster. The same holds for the most prevalent usage of this construction. Both “two or three” and “three or four” usually refer to a time frame and are most frequently followed by “times”, “years” or “days”. The clear exception is “six or eight”. Both DC and 19C use it to refer to “feet”, “months” or “years”. However, Dickens also uses it to speak of “six or eight persons / people / young gentlemen / noblemen”.[22]

Table 4.14  Comparing Dickens or LNGs to 19C and BNC-F

LNG                   N     DC per/mio   19C per/mio   BNC per/mio
LADY OR GENTLEMAN     13    2.9          0.4           >0.1
STEP OR TWO           46    10.0         4.2           0.9
PACE OR TWO           23    5.1          2.0           0.8
TWO OR THREE          398   87.9         64.2          20.6
THREE OR FOUR         176   38.9         29.9          12.9
SIX OR EIGHT          24    5.3          1.9           0.1
DAY OR TWO            47    10.4         24.4          8.4

4.4.2.3 LNGs with and that are (A-)typical where Dickens is Compared to 19C

This section provides a rich seam of Linked Noun Groups that can be seen to demonstrate quite well the main themes and concerns of Dickens as a writer. All the highly frequent LNGs that are statistically not very divergent in their frequency from the 19C data are in the appendix. Table 4.15 lists, in thematic order, all those which are significantly overused in DC, with the exception of the bottom section, where there are “women and children” (no difference) and two trigrams (highlighted) which are underused in Dickens’s oeuvre.[23] “Her and she” is significantly underused even compared to other Victorian writers. This could be seen as support for the view of some critics who have accused Dickens of giving a rather poor representation of women (see, for example: Robson, 1992 or Langland, 2002). The log-likelihood test highlights that the underuse of “father and mother” is yet more strongly significant. Interestingly, this particular underuse of LNGs that refer to parents seems not to have been given very much attention in literary criticism of Dickens. However, looking at his novels, this is a reflection of the heroes in many of them: “[t]hat Dickens novels abound in child victims of injustice bears witness to the persistent rankling of his own early injuries and humiliations” (Adrian, 1971: 3). A large part of his novels focus on children, and in these it is the absence of caregivers (whether it is David, Oliver or Pip, etc.) that drives the plot. Consequently, it appears that “boy and girl” also appears significantly more frequently in Dickens than in general literature corpora.

Table 4.15  Comparing Dickens and LNGs to 19C and BNC-F

Section   LNG                       N     DC p/mio   19C p/mio   BNC p/mio
1.        MR AND MRS                357   78.8       22.2        7.8
          YOU AND ME                123   27.3       10.6        7.8
          LADIES AND GENTLEMEN      100   22.1       6.2         4.4
          LORDS AND GENTLEMEN       26    5.7        0.2         0
          BOY AND GIRL              12    2.6        1.2         0.5
2.        BRANDY AND WATER          120   26.5       5.3         0.1
          RUM AND WATER             46    10.1       1.0         0.1
          BREAD AND MEAT            35    7.7        0.8         0.3
          KNIFE AND FORK            107   23.6       3.5         3.0
          KNIVES AND FORKS          38    8.4        1.8         1.1
          CUPS AND SAUCERS          38    8.4        1.9         2.3
3.        FIVE AND TWENTY           97    21.4       8.2         0.1
          FOUR AND TWENTY           68    15.0       5.2         0.1
4.        PEN AND INK               55    12.2       2.0         0.1
          BOOKS AND PAPERS          31    6.8        1.5         1.5
          WORD AND HONOUR           32    7.1        0.8         0
5.        COAT AND WAISTCOAT        19    4.2        1.6         >0.1
          HEAD AND FACE             27    6.0        1.4         0.5
6.        BOARD AND LODGING         16    3.5        1.1         0.5
          DOOR AND WINDOW           14    3.1        0.6         0.2
          WHEELS AND HORSES         10    2.2        0           0
7.        DAY AND NIGHT             96    21.2       8.8         4.4
          DAYS AND NIGHTS           35    7.7        2.9         1.3
          NIGHT AND DAY             57    12.6       6.9         1.5
          MORNING NOON AND NIGHT    16    3.5        0.5         0.4
8.        LOVE AND DUTY             13    2.9        0.7         0.1
          LOVE AND TRUTH            11    2.4        0.6         0.1
          LOVE AND GRATITUDE        7     1.5        0.1         0
          HER AND SHE               75    16.6       29.2        41.6
          FATHER AND MOTHER         48    10.5       16.7        2.7
          WOMEN AND CHILDREN        26    5.7        5.7         3.9

The table represents the overused LNGs in sections. We can therefore see the topics which appear to carry particular weight in the novels of Mr. Dickens:

1. Terms of address
2. Food and drink; cutlery
3. Count numbers
4. The written and spoken words
5. Appearances
6. Residence and transport
7. Day and night
8. Love

These preferences for certain Linked Noun Groups in Dickens’ works give a good indication of the kind of environment and atmosphere the author tried to create. When we look at those trigrams that are overused, a pattern already established in Sect. 4.4.2.2 becomes apparent: the nesting of the phrases is quite similar to that found in other nineteenth-century writing. In fact, the most frequent of these, “Mr. and Mrs”, shares not only its colligations across both corpora; even its position within the text (quite often, the first line of a new chapter) is typical in both. Looking at the other N-[and]-N constructions, we find two particular forms. First, as we have seen above, Dickens uses them in a way similar to his fellow writers, and, while there might be some small divergences in collocates, the main difference is the preference for these phrases found in Dickens. This can be seen as the idiolect of the author, revealed through the use of long strings or complex constructions that occur rarely or never in the texts of other writers (cf. Coulthard, 2004). The other peculiarity is found in very strong overuse: these seem to form the majority of the and LNGs. The most obvious ones are extremely rare in 19C (and, indeed, BNC). Thus, for example, the DC has 26 occurrences of “Lords and gentlemen”, whereas 19C has only three, of which two are from Dickens’ novel Bleak House. Other phrases that are very rare in 19C are “bread and meat”, “word and honour”, “wheels and horses”, “door and window” as well as the phrase “morning noon and night” and every single “love and –N-” LNG.

Looking at the phrases in detail, only three of the eight sections identified, namely Sections 1, 2 and 5, and also Section 8,[24] appear to be used in very much the same way in both corpora, apart from the clear preference in Dickens’ novels. All remaining sections show divergent usage patterns for the trigrams listed. Section 3 sees “five-and-twenty” typically used for years in both, but 19C has “some five and twenty” as the most frequent 4-gram, while in Dickens it is “about five and twenty”, whereby the former more often refers to years. This is in contrast to “four-and-twenty”, which typically counts “hours” or “years” in both corpora. The contrast is even starker for Section 4, where “pen and ink” and “books and papers” are used with different collocates in the different corpora. “Word and honour” is not only far more prominent in Dickens: it is also, in eight out of ten cases, the exclamation “upon my word and honour”, but only in four out of eleven cases in 19C. The latter also has “I give you my word and honour” relatively much more frequently. Section 6 has both “door and window” and “wheels and horses” as idiosyncratic turns of phrase in Dickens. The colligation and semantic association for “board and lodging” is quite different. In 19C, it is typically the first-person singular narrator who mentions this (me, my, my own), whereas in Dickens it is “his board and lodging” (four out of 16) or “her board and lodging” (two out of 16), which indicates a third-person or omniscient narrator under these circumstances. Section 7 refers to time sequences. In 19C, we find the phrase “day after day, night after night” four times[25] in 123 concordance lines, yet it appears only once in Dickens. However, “day and night again” stands out, as Chap. 5 in Hard Times starts as follows: “DAY [sic] and night again, day and night again….” and this is then repeated a few lines later. Similarly, the cluster “day and night, ever …” appears twice in Dombey and Son yet nowhere else. Clearly, this is an example of a stylistic feature the author employed. And it is probably all the more effective as this phrase is nowhere else in evidence.

4.4.2.4 LNGs in the Marx Corpus and How These Compare

With the translations of Marx (see Pace-Sigge, 2018 for details), there are two points of comparison. First of all: how far does it look like modern academic writing? And, secondly: is there evidence that the translators reflect Linked Noun Group usage that is typical of the nineteenth century?


The only LNG links found in the MC are “and” and “or”. There is one single exception: “neither value nor surplus value”, which appears six times and is easily identified as a typical Marxian phrase.

4.4.2.5 LNGs with or that are (A-)typical where Marx is Compared to BAWE and 19C

The most frequent N-[or]-N constructions found in MC are not oppositions but typically expansions of a given entity. As such, these are all very specific to Marx’s subject, as shown in Table 4.16. In fact, Marx’s writings are so topical that only the most frequent LNGs are recorded at all in BAWE. However, even the most frequent LNG here, “increase or decrease”, appears just 1.1 times per million words in BAWE. The one occurrence of “surplus value or profit” is, indeed, a Marx quote. None of the N-[or]-N phrases found can be said to be typical of 19C literary writing.[26]

4.4.2.6 LNGs with and that are (A-)typical where Marx is Compared to BAWE and 19C Literature

As we have seen above, N-[and]-N constructions are typically dominant. It is here that clear overlaps in usage can be demonstrated. For these comparisons, only the most frequently occurring LNGs are highlighted. At first glance, the proportional use of these LNGs is marginal in BAWE. However, it must be highlighted that that corpus consists of over 2700 files, covering different subjects by a large selection of writers. The MC consists of 16 files by a single writer who writes on a single subject.

Table 4.16  Comparing Marx or LNGs to BAWE and 19C

LNG                         MC N   MC p/mio   BAWE N   19C N
GOLD OR SILVER              60     44.1       4        1
INCREASE OR DECREASE        34     25.0       19       0
INCREASES OR DECREASES      20     14.7       6        0
SURPLUS VALUE OR PROFIT     30     22.1       1        0
VALUE OR SURPLUS VALUE      20     14.7       0        0
MONEY OR COMMODITIES        19     14.0       0        0
VALUE OR PRICE              19     14.0       0        0
EXPANSION OR CONTRACTION    11     8.1        1        0
INCREASE OR DIMINUTION      9      6.6        0        0
DECREASE OR INCREASE        8      5.9        1        0
PURCHASE OR SALE            7      5.1        0        0


Therefore, it is not so much the words per million that are relevant here but the fact that Marx uses linked noun constructions that are still very much in use one and a half centuries later. In this respect (see Chap. 2), the general academic phrases “1 and 2” / “A and B” clearly stand out. It is also noteworthy that “supply and demand” is the typical form for this trigram: it is twice as frequent as its reverse form in both Marx and BAWE. One of the most remarkable changes, to me, is the clear decline of the term “profit and rent”. Rent is unearned revenue that is not the result of economic activity. Rent from houses is the obvious example, but so are monies gained through holding patents or copyrights. This was clearly still very important to Marx, whereas modern-day economists (with the exception of Piketty and his school) seem to ignore it all too often. Instead, the most frequent BAWE trigram is “profit and loss”, occurring 50 times (3.0 times per million).

The first two entries in Table 4.18 might be surprising for not having been included in Table 4.17. However, “gold and silver” is referred to in BAWE a mere ten times, “silver and copper” not at all. While “gold and silver” is one of the key LNGs that a reader of Marx will encounter, it is also a reflection of the concerns of the time: it is relatively frequently mentioned in nineteenth-century literature. The other entries do not seem typical of the language of either an economist or academic writing. They are, however, fairly typical trigrams of their time, none more so than the rather informal “here and there”.[27] While the evidence here seems insufficient to say that Marx (or his translator) was influenced by his reading of Victorian literature, it is nevertheless noteworthy that there appears to be a borrowing of usage that Karl Marx (or his translator) would have come across as part of his own reading.
Table 4.17  Comparing Marx and LNGs with BAWE occurrences

LNG                               MC N   MC p/mio   BAWE N   BAWE p/mio
WEAR AND TEAR                     169    124.3      11       1.7
SUPPLY AND DEMAND                 109    80.1       32       4.9
DEMAND AND SUPPLY                 53     38.9       16       2.5
PROFIT AND RENT                   94     69.1       16       2.5
I AND II (1 AND 2 in BAWE)        66     48.5       149      22.8
A AND B                           57     41.9       139      21.3
PRODUCTION AND CONSUMPTION        35     25.7       19       2.9
WOMEN AND CHILDREN                34     25.0       22       3.4
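
The observation that “supply and demand” is about twice as frequent as its reverse form in both corpora can be checked directly against the raw counts in Table 4.17. The short Python sketch below does only that arithmetic; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Raw counts taken from Table 4.17 (MC = Marx Corpus).
counts = {
    "MC": {"supply and demand": 109, "demand and supply": 53},
    "BAWE": {"supply and demand": 32, "demand and supply": 16},
}


def order_ratio(forward: int, reverse: int) -> float:
    """How many times more frequent the forward order is than the reverse."""
    return forward / reverse


ratios = {
    corpus: order_ratio(c["supply and demand"], c["demand and supply"])
    for corpus, c in counts.items()
}

for corpus, ratio in ratios.items():
    print(f"{corpus}: forward order is {ratio:.2f} times as frequent")
```

Since both counts in each pair come from the same corpus, the ratio is independent of corpus size, so raw counts can be used here without per-million normalisation.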


Table 4.18  Comparing Marx and LNGs with 19C literature and Dickens Corpus (DC)

LNG                   MC N   MC p/mio   19C p/mio   DC p/mio
HERE AND THERE        46     33.8       39.4        45.0
GOLD AND SILVER       330    250.0      3.2         3.1
SILVER AND COPPER     27     20.5       >0.1        0.4
WOMEN AND CHILDREN    34     25.0       5.7         5.1
DAY AND NIGHT         16     12.1       8.8         21.2

Table 4.19  Marx-typical most frequent and LNG forms

LNG                               MC N
PURCHASE AND SALE                 56
PURCHASES AND SALES               47
BUYER AND SELLER                  37
VALUE AND SURPLUS VALUE           35
COMMODITIES AND MONEY             33
CAPITAL AND MONEY                 31
FIXED AND CIRCULATING CAPITAL     55
RAW AND AUXILIARY MATERIALS       48
CONSTANT AND VARIABLE CAPITAL     38

Finally, Table 4.19 lists all those and LNGs that are typical of Marx. These either appear not at all in BAWE or only once or twice, and not at all in 19C. Unsurprisingly, given that this corpus contains three volumes of Capital, LNGs with “capital” or “money” are dominant. There are also longer noun groups, like the bottom three, which are not strictly LNGs in English. These translations are clearly mirroring the German original, where compound nouns can be sliced up and seem to appear as adjectives; however, this is simply a shortened form. Thus, for example, it should read “fixed capital and circulating capital” (compare the German “Festkapital und Umlaufskapital”, or, in the original, “Fest- und Umlaufskapital”). This is, admittedly, hard to translate. As a typical Marxian noun construction, it should, however, be taken into consideration in this case study.

4.4.3  Case Study Conclusion

Given the space available, these case studies had to be quite concise. Nevertheless, an insight has been provided into the key concerns of both writers. While this is quite narrow in the case of Marx, it is detailed for the 23 works by Dickens. In fact, the mere key-phrase analysis of LNGs appears to back up issues highlighted by Dickens researchers. Thus, there is an absence of parents and females, yet children and males are recognisable in the Linked Noun Group usage. Likewise, Dickens has a clear focus on various groups of people (and a preference, it seems, for lords and ladies) and gives a good amount of detail for household items. Lastly, the preference for time markers (“night and day” and related ones) also stands out.

With regard to the Marx translations, N-[or]-N constructions are almost fully typical of the books in this corpus. This stands in contrast to N-[and]-N constructions, which include many phrases that are still found in current academic use. Marx also uses a small number of turns of phrase that seem to be more typical of nineteenth-century usage. Lastly, an investigation into LNGs also shows how the translators tried to grapple with specific German compound noun constructions: the resulting English form appears to be a noun phrase that uses a double-adjective modifier.

Appendices

Appendix for Sect. 4.2.3.1: Death

LNG                   N 19C   19C per million   N BNC   BNC per million
Death and life        5       0.4               2       0.1
Disease and death     6       0.4               0       0
Torture and death     5       0.4               0       0
Sickness and death    3       0.2               2       0.1

Appendix for Sect. 4.2.3.2: Items of Clothing

LNG                     N 19C   19C per million   N BNC   BNC per million
Bonnet and veil         6       0.4               0       0
Bonnet and cloak        6       0.4               0       0
Bonnet and gloves       5       0.4               0       0
Hat and cloak           6       0.4               0       0
Hat and umbrella        5       0.4               2       0.1
Hat and overcoat        5       0.4               1       >0.1
Dress and coat          0       0                 5       0.3
Dress and appearance    5       0.4               0       0
Boots and shoes         7       0.5               4       0.2


Appendix for Sect. 4.2.3.2: Housing and Environs

LNG                    N 19C   19C per million   N BNC   BNC per million
House and furniture    9       0.6               0       0
House and land/s       5       0.4               4       0.2
Home and abroad        10      0.7               2       0.1
Home and friends       5       0.4               0       0
Home and family        1       >0.1              10      0.4
Door and window        9       0.6               5       0.2
Table and chairs       3       0.2               14      0.8
Chair and table        7       0.5               2       0.1
Trees and hedges       7       0.5               6       0.3
Trees and flowers      5       0.4               7       0.3
Town and county        9       0.6               0       0
Town and gown          5       0.4               0       0
Earth and heaven       11      0.8               1       >0.1
Earth and sky          10      0.7               0       0
Earth and sea          8       0.7               0       0
Earth and air          8       0.7               0       0
London and Paris       6       0.4               5       0.2
Church and state       10      0.7               2       0.1

Appendix for Sect. 4.2.3.3: Body Parts

LNG                 N 19C   19C per million   N BNC   BNC per million
Mind and heart      16      1.2               6       0.3
Heart and brain     13      1.0               0       0
Hair and clothes    0       0                 11      0.7
Eyes and lips       12      0.9               7       0.4
Neck and bosom      5       0.4               0       0
Neck and arms       5       0.4               1       >0.1
Neck and throat     0       0                 7       0.4
Hands and legs      3       0.2               5       0.3
Hand and arm        12      0.9               1       >0.1


Appendix for Sect. 4.2.3.4: Time Markers

LNG               N 19C   19C per million   N BNC   BNC per million
Place and time    10      0.7               5       0.2
Place and hour    5       0.4               0       0
Time and money    6       0.4               15      0.9

Appendix for Sect. 4.2.3.5: Food Items

LNG                  N 19C   19C per million   N BNC   BNC per million
Air and water        12      0.9               2       0.1
Wind and water       11      0.8               3       0.2
Wood and water       9       0.7               0       0
Soap and water       9       0.7               9       0.6
Milk and water       9       0.7               2       0.1
Whiskey and water    7       0.5               8       0.5
Food and water       6       0.4               15      0.9
Mud and water        1       >0.1              7       0.5
Land and water       1       >0.1              4       0.2

Appendix for Sect. 4.2.3.6: People

LNG                          N 19C   19C per million   N BNC   BNC per million
Horse and man (a)            18      1.3               1       >0.1
Husband and father           11      0.8               7       0.5
Friends and acquaintances    8       0.6               11      0.7
Friends and relatives (b)    8       0.6               11      0.7
Husband and children         5       0.4               5       0.3
Friends and family           0       0                 10      0.6

(a) car and driver: five occurrences in BNC
(b) friends and relations in 19C


Appendix for Sect. 4.2.3.7: Pronouns

LNG                 N 19C   19C per million   N BNC   BNC per million
You and My          41      2.9               7       0.5
You and Miss        29      2.1               5       0.3
You and Mrs.        20      1.4               7       0.5
One and the same    45      3.2               32      1.8

Appendix for 4.4.2(a): Dickens or LNG Usage

LNG                    DC N   DC p/mio   19C p/mio   BNC p/mio
Minute or two          57     12.6       13.3        5.5
Word or two            50     11.0       5.8         0.8
A step or two          46     10.0       4.2         0.9
Something or other     35     7.9        3.7         2.0
Hour or two            35     7.9        9.1         3.4
Two or three times     32     7.1        5.4         2.3
Five or six            25     5.5        8.2         4.4
Week or two            25     5.5        5.5         3.6
A pace or two          23     5.1        2.0         11
Man or woman           20     4.4        4.8         1.9
Two or three days      20     4.4        5.0         1.6
A minute or so         19     4.2        1.7         2.9
Two or three hours     18     4.0        2.2         0.8
Three or four times    17     3.8        1.7         1.0
Mile or two            14     3.1        2.0         0.8
Three or four years    14     3.1        1.9         1.2
Two or three years     14     3.1        2.9         1.2
A mile or two          12     2.7        1.7         0.7
Life or death          12     2.7        2.7         0.6
Three or four days     11     2.4        2.6         0.8


Appendix for 4.4.2(b): Dickens and LNG Usage

LNG                     DC N   DC p/mio   19C p/mio   BNC p/mio
Up and down (a)         592    130.7      77.6        58.0
You and I               158    34.9       32.6        21.7
Men and women           97     21.4       17.8        13.2
Bread and butter        89     19.6       13.4        4.8
You and your            70     15.5       10.2        10.0
Me and My               57     12.6       11.1        4.4
Hundred and fifty       53     11.7       7.1         7.2
Brother and sister      49     10.8       6.8         3.8
Heart and soul          43     9.5        5.3         1.5
Bread and cheese        39     8.6        4.9         3.2
Years and years         30     6.6        3.2         1.9
Gold and silver (n)     28     6.2        3.2         0.7
Brothers and sisters    28     6.2        3.9         2.4
Lords and ladies        6      1.3        1.9         0.5
Bonnet and shawl        25     5.5        2.1         0
House and home          24     5.3        1.0         0.3
Head and shoulders      23     5.0        2.7         3.6
Three and sixpence      21     4.7        0.5         0.1
Finger and thumb        21     4.7        3.1         2.3
Male and female         20     4.4        1.9         1.4
Mother and sister       19     4.1        3.7         3.1
Heads and shoulders     7      1.5        0.4         >0.1
Tenderness and love     3      0.7        0.4         0.1
Mum and dad             0      0          0           9.4

(a) The adverb phrase “up and down” has been included because it is the most frequent and trigram. Statistically (log-likelihood test), the occurrence frequency between DC and 19C shows zero divergence.

Notes

1. Then again, Marx is definitely more widely read in every language other than German. Furthermore, the translators will have had to find a form of English that would be acceptable to readers in Victorian Britain.
2. This is particularly visible because the BNC (twentieth century) corpus is over 17% larger than the 19C corpus, which, one might assume, would mean that higher frequency also means greater variety.
3. This derives from the NP “no thing”.
4. BNC also has “anything but work” (three occurrences); “anything but kindness” has two occurrences in 19C.
5. This trigram occurs in 29 texts. It is very strongly used in Trollope’s Eustace (eleven times).
6. Amongst the most frequent of these trigrams, it is “one or two” and “one or other” that are used proportionally the same.
7. A third point would be that twentieth-century writers (or publishers?) insist on writing out numerals, while numbering in-text was acceptable in 19C prose fiction.
8. “Day or Two” in 19C appears in 72 different sources. Among these, a substantial number of writers use it five or more times.
9. There is also a third possible reason: given that 19C has 100 files and BNC 432 files, the lower spread of original sources might be the reason that there are fewer repeated phrases.
10. Opposites with or are, however, far more typical as adverb or adjective constructions: more or less, right or wrong, good or evil etc.
11. Such count-LNGs are only listed in Sect. 4.4.
12. Love as verb occurs 182,062 times in the Historical Books Collection. This means that love as noun accounts for 0.04% of all tokens and love overall for over 0.05%.
13. Amongst the less frequent LNGs, there is one exception in “house and furniture”, with 0.6 times per million words in 19C (usually preceded by verb+pronoun). It is not recorded in the BNC, which has, however, “house and everything” (0.5 times), which appears to be the equivalent.
14. They continue, however, to appear in twentieth-century newspaper articles.
15. This compares to only 2/46 in 19C instances of “moving heaven and earth”.
16. Hence the strong interest in the “science” of Physiognomy, which allows one to draw conclusions about a person’s character. This was something even used by law enforcement (see, for example, https://www.oldpolicecellsmuseum.org.uk/content/learning/that-criminal-look; last accessed 12/2019).
17. Only “washed” occurs (twice) in 19C, while there are the lemmas “wash” as well as “washed and washing” in BNC. There is also “bath/ed”, “rinsed”, “splashed” and “wiped” in BNC.
18. By contrast, in 19C “fire and water” stands for opposites three times and is only once used in its literal sense.
19. This includes the mention of “husband and …” (see appendix).
20. As opposed to “parents were alive”, which occurs once only.
21. These are Bleak House and David Copperfield. Where this could have influenced occurrence patterns, it has been highlighted.
22. Occurring a single time in each case. By contrast, 19C only has one concordance line “six or eight children”.
23. There are also highly frequent LNGs with names typical of Dickens novels (Jarndyce and Jarndyce; Pickwick and friends) which have been left out here.
24. These have a strong tendency to be clause-final phrases. It must be noted that the occurrences included in 19C always include a line or two from Dickens as well.
25. By four different writers, one of them being Dickens.
26. What should really have been done here is to compare the Marx translations with other economists’ and philosophers’ writings of the nineteenth century. However, as I am not aware of such a corpus, this more fitting comparison was omitted here.
27. “Here and there” does occur in BAWE too, yet only three times. It appears, in MC, to be a direct translation of the German “hier und da”.

References

Adrian, A. A. (1971). Dickens and Inverted Parenthood. Dickensian, 67(363), 3–13.
Baker, P., & Egbert, J. (2016). Triangulating Methodological Approaches in Corpus Linguistic Research. London: Routledge.
Biber, D., & Conrad, S. (2000). Register, genre, and style. New York: Cambridge University Press.
Coulthard, M. (2004). Author Identification, Idiolect, and Linguistic Uniqueness. Applied Linguistics, 25(4), 431–447.
Gallagher, C. (2015). Novel. In The Encyclopedia of Victorian Literature (pp. 1–12). New York: John Wiley.
Hoey, M. (2005). Lexical Priming. A New Theory of Words and Language. London: Routledge.
Kilgariff, A., & Rychly, P. (2019). Sketchengine. Brno: Lexical Computing Ltd. Retrieved December 12, 2019, from https://www.sketchengine.eu/.
Langland, E. (2002). The Receptions of Charlotte Brontë. In C. Dickens, G. Eliot, & T. Hardy (Eds.), A Companion to the Victorian Novel (pp. 387–405). Oxford: Blackwell.
Longman Dictionary of Contemporary English (LDOCE). (2009 [1978]). 5th ed. Harlow: Pearson Education.
Mahlberg, M. (2007). Clusters, Key Clusters and Local Textual Functions in Dickens. Corpora, 2(1), 1–31.
Marchi, A., & Taylor, C. (2009). If On a Winter’s Night Two Researchers…: A Challenge to Assumptions of Soundness of Interpretation. Critical Approaches to Discourse Analysis Across Disciplines: CADAAD, 3(1), 1–20.
Pace-Sigge, M. (2018). How Homo Economicus is Reflected in Fiction: A Corpus Linguistic Analysis of 19th and 20th Century Capitalist Societies. Language Sciences, 70, 103–117.
Patterson, K. J. (2014). The Analysis of Metaphor: To What Extent can the Theory of Lexical Priming Help Our Understanding of Metaphor Usage and Comprehension? Journal of Psycholinguistic Research, 45(2), 237–258. https://doi.org/10.1007/s10936-014-9343-1.
Rayson, P. (2016). Log-likelihood and Effect Size Calculator. Excel spreadsheet. Retrieved December 8, 2018, from http://ucrel.lancs.ac.uk/llwizard.html.
Robson, L. (1992). “The Angels” in Dickens’s House: Representation of Women in A Tale of Two Cities. The Dalhousie Review, 72(3), 311–333.
Schmidt, S. J. (2000). Interpretation: The Story Does Have an Ending. Poetics Today, 21(4), 621–632.
Scott, M. (2019). WordSmith Tools Version 7. Stroud: Lexical Analysis Software. Retrieved December 8, 2019, from www.lexically.net.

CHAPTER 5

Findings, Applications and Conclusions

5.1   Findings This book set out to demonstrate the breadth of use of binomials that are Linked Noun Groups (LNGs) across different text types. LNGs are a specific subgroup of binomials which have a long history of academic interest, starting, it appears, with Richard Abraham in 1950 and then Yakov Malkiel in 1959.1 Little distinction between different forms has been taken however, and, less focus still, it seems has been on the distinction between different genres. Rosamund Moon following on from  John Sinclair (1990); Douglas Biber and his team have, at least, concentrated on linked noun binomials; Neal Norrick (1988), referring back to work done in the mid-1970s by Marita Gustafsson (who started out with an investigation of the phonetic length of binomials) does focus on one particular text type, namely “binomials in free conversation”. One clear drawback appears to be that a lot of research clearly focussed on the issue that binomials are relatively infrequent—and the discussions are simply based on large, general corpora like the Longman Corpus (used by Biber et al. 2007) or the whole of the BNC (used by Mollin in 2014). The resulting findings show that Moon (1998) was both correct and mistaken in her discussion of Linked Noun Groups. Linked noun groups are, indeed, low in frequency and prevalent in the English language. Her small corpus is not a  general one and does, therefore, not present the breadth needed when looking at usage patterns within different genres. Building on her early efforts, this book aimed to expand the base of © The Author(s) 2020 M. Pace-Sigge, Linked Noun Groups, https://doi.org/10.1007/978-3-030-53986-3_5

corpora employed; the results provide a far clearer insight into how the construction is employed. Unlike in Biber et al. or Mollin, binomial LNGs are not identified in a large corpus overall but are viewed as specific and characteristic for each genre: in this, the book echoes Norrick’s early investigation. The usage structure shown in the preceding chapters demonstrates very clearly that Linked Noun Groups can only be usefully appraised within different genres. There is a more fine-grained level of usage still, as Chaps. 3 and 4 have highlighted: the repeat usage of particular LNGs is part of the idiosyncratic voice of a poet or a writer. The result is that the reader finds in this book the most frequently occurring Linked Noun Groups in casual spoken British English, in British academic spoken as well as written corpus data, in British and US poetry, and, finally, in nineteenth-century and twentieth-century British fiction. Table 5.1 gives a first impression of the overall frequencies in the different text types and gives examples of some of the most frequently repeated LNGs in each of them. If this is compared with what Gustafsson (1976) and Mollin (2014)2 have said with regard to the frequency distribution across text types for binomials, the research presented in this book confirms their findings overall. Looking at the word frequencies in different corpora, “and” tends to be five times more frequent than “or” as a single item. Table 5.1 demonstrates that it is also the most frequently used link between two nouns in the constructions investigated. Within these, it is most frequently found with

Table 5.1  LNG usage across text types

Text type          In chapter   LNG frequency overall(a)   Most frequent LNGs
SPOKEN             II           1-2                        me and I; one hundred and fifty
ACADEMIC WRITING   II           3-4                        one and two; cause and effect
POETRY             III          1-2                        day and night; sun and moon
19C FICTION        IV           4-5                        two or three; me and I; Mr and Mrs
20C FICTION        IV           3-4                        one or two; her and she; men and women

(a) 1 = up to 1/mio words; 5 = above 10/mio words

count numbers. Other linking elements between nouns (but, nor, neither, etc.) are, by comparison, very rare. Further observations are that fiction (in particular Victorian fiction) and academic writing tend to be the most highly formulaic text types, showing the highest numbers of repeat usage of LNGs. The lowest numbers are found in spoken usage and poetic writing, yet this is for entirely different reasons. Amongst producers of spoken utterances (apart from the reference to numerals), the construction of linked binomials appears to take too much processing time and, apart from formulaic phrases (often vagueness markers of the order of “day or something”, “work and stuff”), there are not many. By contrast, the number of LNGs in poetry which are used by a large number of poets is highly restricted. This seems to support the view that poets work very hard not to repeat clichéd phrases. Instead, it has been found that particular LNG phrases appear to be preferred by certain poets, who use them in a number of different poems. More importantly, keywords in poetry do appear in LNGs, but these occur only once or twice, as poets appear to have striven to create a novel expression. In this book, the most frequently occurring linked noun constructions for each text type were presented; furthermore, a brief study of the idiosyncratic usage of both Charles Dickens and Karl Marx (in translation) was given in the previous chapter.

5.2   Applications

As has been hinted at in the Introduction, the applications coming out of this research are numerous and go beyond the simple enhancement of providing descriptive data on language choices. In the words of S.J. Schmidt:

(Let’s) talk about what to do with results. How can the results obtained by empirical research in literary studies be applied to problems outside the academia, and how can we transform the knowledge available in a community of investigators by systematically integrating new knowledge instead of simply replacing one opinion with another? (2000, 631)

One key area is discourse analysis: fixed phrases, it has to be said, often stand in the way of modernising and modifying language usage patterns. The probability that a learner will encounter any particular Linked Noun Group needs to be taken into account in English (as a foreign) language teaching; the same is true where machine learning is concerned and

systems of natural language processing have to integrate the knowledge of these fixed phrases: both their existence and where they should typically be encountered.

5.2.1  LNGs and Teaching

Pace-Sigge (2015) already highlighted the importance for students of English to know how multi-word units (MWUs) are an essential part of the language and that these need to be learned and their usage understood by foreign language learners. This book has focussed on an infrequent but highly relevant section of MWUs, and the data presented should help any learner to select Linked Noun Groups that are appropriate for the text type they produce. In fact, the research presented here can be used as a strong argument why students should be exposed to as many naturally occurring texts as possible, as learners have a need to be exposed to such multi-word units. Ernestova (2007) highlighted the importance of binomials in particular when it comes to language learning. If, as Koprowski (2005) or the more recent study by Sugiati and Rukmini (2017) claim, relevant textbooks are failing to prepare students to use formulaic phrases, this research should assist in the provision of useful teaching material. Apart from textbooks, reference works also need to take note, as Gabrovšek (2011, 28) writes:

Fixed binomials represent an intriguing and surprisingly diverse if minor category within the phraseology of English. …the basic element of lexicographical policy is that opaque binomials, being as they are idiom-like or compound-like vocabulary units, must always be listed and defined; on the other hand, whenever a binomial is transparent in meaning, it should merely get listed (not defined!) only in dictionaries designed also with an encoding component in mind.

Consequently, language learners should be made aware of the particular nesting patterns of Linked Noun Groups as demonstrated here. When looking at binomials in general, linked adverb groups (“more and more”, “more or less” etc.) are the ones which are relatively stable in their use across genres. By contrast, it must be stressed that clear divergences between LNGs reflect different genres. First of all, spoken English: this is probably the most formulaic genre of texts, yet it makes very little use of Linked Noun Groups. In fact, complex noun phrases should not be expected in spoken usage, possibly for reasons of the mind’s processing time involved, but should be expected in

edited texts. The types of LNGs occurring in spoken (British) English are pronoun clusters (“you and me”), vague referrers (“work and stuff”) and, most widely, count nouns (“hundred and fifty”). Even the more specific form of spoken usage, in academic institutions, yields little that diverges from the above. There are more deictic pointers (“here and there”) as well as some repeated subject-specific phrases (“goods and services”), yet, overall, the production of spoken English seems to make do with only a limited range of Linked Noun Groups. It must be noted that the insights gained from the present research support the findings shown by Seifart et al. (2018, 5722), who stated: “[o]ur results from naturalistic speech contradict experimental studies showing faster planning of nouns and thus suggest that the effect of referential information management overrides potential effects of higher processing costs of verbs”. In fact, they highlight the reason why nouns, in spoken exchanges, lead to a slowdown (which might be seen as a reason for lower rates of usage in speech): “[p]ragmatic principles of noun use and the slowdown associated with new information converge to create a uniform pattern of speech rate variation across diverse languages and cultures” (Seifart et al. 2018, 5722).

Next, we move on to written texts. When looking at occurrence patterns of Linked Noun Groups, two key findings of this book should be kept in mind:

1. The higher the degree of editing, the more likely it is that a Linked Noun Group will be encountered. For this reason, they can be expected to be infrequent in blogs (which are close to the spoken genre), yet often-revised texts (works of fiction or poetry; academic publications) will show a much wider variety of LNGs.
2. LNGs are high in informational content. They are, therefore, characteristically found in academic writing and the highly descriptive sections of novels.3

There are a number of studies that have shown how important “frequent formulaic sequences” are in the reading and comprehension of texts. Looking at studies undertaken over the last ten years, Arnon and Snider (2010) found that the processing times for highly frequent phrases are similar to the processing times for well-known words: in almost all cases, they can be processed (and are therefore comprehended) much faster. Siyanova-Chanturia, Conklin and van Heuven (2011) undertook an eye-tracking study where participants were exposed to three-word binomials

(“bride and groom”) and their reversed forms, which they assumed to differ only in frequency and not syntactically or semantically. Their mixed-effects modelling revealed that native speakers and non-native speakers read highly frequent formulaic sequences more quickly than less frequent ones. As a result, Conklin and Schmitt conclude in their review (where they look at the relationship between corpus-extracted formulaic language and psycholinguistic processing): “[t]here are compelling reasons to think that the brain represents formulaic sequences in long-term memory, bypassing the need to compose them online through word selection and grammatical sequencing in capacity-limited working memory” (2012, 45). The findings of this research are relevant for language learners because “each and every occurrence of a linguistic form, a word or a phrase, contributes to its degree of entrenchment in a speaker’s memory” (Siyanova-Chanturia, Conklin, & van Heuven, 2011, 783). This, in fact, echoes findings made by others, amongst them Nick Ellis (2002), Michael Hoey (2005) and Pace-Sigge (2013). Therefore, entrenching an awareness of when and where which LNG forms appear to be typical should be part of the teaching of English as a foreign language (EFL).

5.2.2   LNGs, (Critical) Discourse Analysis and Style

As we have seen, there is a long line of research into binomials that shows how fossilised their patterns are in the English language. Gustafsson (1976) refers to them as “frozen”; her research also includes the phonetic properties displayed in binomials. Siyanova-Chanturia, Conklin, and van Heuven (2011) have shown that reversed linked noun forms are seen as atypical enough to warrant longer processing time by the listener/reader. As a result, Linked Noun Groups are shown to be particularly resistant to shifts in language even where other parts of language use start reflecting greater degrees of change.
This is particularly clear when the focus is on language that is equally balanced between genders, what Hegarty et al. (2011) refer to as the “Juliet and Romeo” effect. While clearly absent in 19C fiction texts and even within the twentieth-century BNC (cf. Motschenbacher, 2013), it could be expected in more recently produced wordings. One can see that there is a slight shift, however. The trigram “men and women” is not particularly frequent in BNC 2014. Still, while it is recorded 24 times, the reverse form, “women and men”, also occurs at least five times. However, in effect, this is a phenomenon that a

listener would come across 0.4 times in a million (compared to just over two times per million words for the prototypical form). Chapter 2 details how far reverse forms can be found in one other genre, namely academic writing. Yet, while Mollin (2014) demonstrated that a large number of binomials exist in the prototypical as well as the reverse form, even very recent academic essays and theses invert the masculine-before-feminine Linked Noun Groups only at a very low rate of occurrence. This book has shown that Linked Noun Groups, while atypical of spoken English, are found in carefully crafted written texts, in particular academic writing. As described in Chap. 2, academic language has long, complex noun phrases as one characteristic marker of its style. Possibly most surprising is the variety and divergence of LNGs in academic writing, even where older translated forms of a single writer (Marx in this case) are investigated. This is something Goźdź-Roszkowski’s (2011) in-depth study of lexical bundles across US legal genres already hinted at, as contracts clearly differ in their use of lexical bundles from legislation, textbooks or other material in that genre. As a result, the widest range and variety of LNGs is to be found in this book’s Chap. 2. While a single corpus of “academic English” is investigated, the different academic subjects (be it law, business, social sciences or medicine) display their own, highly specific LNGs. As Hyland and Tse (2007, 244) show, “a random analysis of AWL families with potential homographs reveals a considerable amount of semantic variation across fields”. This, again, highlights that any writer of an academic text is expected to know and employ the technical language, including the Linked Noun Groups, typically employed in their subject area. The one exception is reference pointers like “a and b”, “x and y” or, indeed, “cause and effect”.
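By way of illustration, the per-million figures quoted above for “men and women” and “women and men” can be derived from the raw counts with a simple normalisation. The following is a minimal sketch only: the function names are hypothetical, and the corpus size of roughly 11.4 million words is an assumption made for the example, not a figure given in this chapter.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw count to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count / corpus_size * 1_000_000


def reversal_share(prototypical: int, reversed_form: int) -> float:
    """Share of all occurrences of a pair that use the reversed order."""
    total = prototypical + reversed_form
    return reversed_form / total if total else 0.0


# Assumption for illustration only: a spoken corpus of about 11.4 million words.
ASSUMED_CORPUS_SIZE = 11_400_000

# Raw counts as reported in the text: 24 ("men and women"), 5 ("women and men").
print(f"men and women: {per_million(24, ASSUMED_CORPUS_SIZE):.2f} per million words")
print(f"women and men: {per_million(5, ASSUMED_CORPUS_SIZE):.2f} per million words")
print(f"share of reversed order: {reversal_share(24, 5):.1%}")
```

Under the assumed corpus size, the two counts come out at just over two and at about 0.4 occurrences per million words, in line with the figures given in the text; a different corpus size would shift both values proportionally.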
Within the field of discourse analysis, therefore, it is highly relevant to know that there is no one single “academic discourse”: there are typical markers of academic discourse which provide a general base layer. Beyond that, there is a second layer that is made up out of subject-specific discourse markers that are not typically found in other areas of (academic) discourse. At the same time, this does not mean that there is no leakage: for example, “law and order” is part of legal discourse, yet is also a prominent phrase in many newspapers. Likewise, “health and safety” or “supply and demand” are specific to business studies (economics)—however, they can even be found in the (spoken) BNC 2014 corpus. Thus, there are 50 occurrences of the former (4.8 occurrences per million words) and five of the latter. The two other text forms that are highly edited are fiction and poetry. Fiction Linked Noun Groups turn out to be, on the surface of it,

surprisingly formulaic: the most common ones used by a single writer (Dickens) also occur, with slightly differing relative frequencies, in both general nineteenth- and twentieth-century British fiction corpora. Yet, below the surface, an investigation of LNGs appears to be a perfect tool to reveal not just idiosyncratic turns of phrase, but also the main concerns of each writer. Consequently, we can see that the Marx translations reflect the translator’s own background: a large number of N-[OR]-N constructions are fairly typical of their time and are found elsewhere in the 19C corpus. At the same time, the N-[AND]-N constructions are genre specific; a number of the LNGs used by Marx are still in use in academic writing over a hundred years later. Equally, the key concerns of the German philosopher shine through: “value and surplus value”, for example. Similarly, the discourse revealed by a single writer like Dickens appears to be far more relevant. Simply by looking at the frequencies of LNGs employed in Dickens’ works and by comparing these with the occurrence patterns of other Victorian writers (and checking for significant differences), the key concerns (or lack of concerns) of the writer become apparent. Thus, we can see in Chap. 4 that Dickens has a strong preference for male-focussed references (the Mr Pickwick mark) while there seems to be a dis-preference for female characters and, indeed, caregivers (the David Copperfield mark). Forms of address, food and drink (and feasting), the highly self-referential LNGs like “pen and ink” or “books and papers”, and the description of (male) dress (“coat and waistcoat”) all seem to be descriptive of the concerns the writer Charles Dickens had carried from his own personal experiences into his books.
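The comparison described above, in which the frequency of an LNG in one writer’s works is set against a reference corpus and checked for a significant difference, can be sketched as follows. This is a minimal sketch that assumes a two-corpus log-likelihood test of the kind commonly used in corpus linguistics; the function name and all counts are hypothetical and are not taken from the data in this book.

```python
import math


def log_likelihood(freq_a: int, size_a: int, freq_b: int, size_b: int) -> float:
    """Two-corpus log-likelihood (G2) score for a single item.

    freq_a, freq_b: occurrences of the item in corpus A and corpus B
    size_a, size_b: total number of words in corpus A and corpus B
    """
    total_size = size_a + size_b
    if total_size <= 0:
        raise ValueError("corpus sizes must be positive")
    total_freq = freq_a + freq_b
    # Expected counts if the item were equally frequent in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size

    def term(observed: int, expected: float) -> float:
        # A zero count contributes nothing to the sum.
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    return 2 * (term(freq_a, expected_a) + term(freq_b, expected_b))


# Hypothetical counts, invented for illustration: an LNG that occurs four times
# as often in a single-author corpus as in a reference corpus of equal size.
score = log_likelihood(40, 1_000_000, 10, 1_000_000)
CRITICAL_VALUE = 3.84  # chi-square critical value, 1 degree of freedom, p < 0.05
print(f"LL = {score:.2f}; significant at p < 0.05: {score > CRITICAL_VALUE}")
```

A score above the critical value suggests that the difference in relative frequency is unlikely to be due to chance alone; it says nothing by itself about the direction of the difference, which has to be read off the normalised frequencies.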
Lastly, while a binomial like “love and duty” appears to be typical in its Victorian sentiment, it is noticeable that it occurs proportionally four times more often in Dickens’ books than in other Victorian novels. Finally, the (non-)use of frequently reoccurring Linked Noun Groups within poetic texts was addressed in this book. Here, again, Linked Noun Groups can be identified as a highly useful marker of a typical style. Amongst the few critical discourse analyses (CDAs) of poetry, many focus on intertextuality. At the same time, DA is seen by some as the most widely applied form of Anglo-American literary criticism. Thus, Easthope (2013) talks of “poetry as discourse”, which echoes Fowler’s 1981 “Literature as Social Discourse”. The approach in this book is quite different. Chapter 3 outlines other pieces of corpus-based research, yet the clear focus on LNG occurrence patterns uncovers some important stylistic choices. Looking at N-[and/or]-N only, poetry displays a rather low usage

of such constructions as often-repeated trigrams. Where these occur with high frequency, however, indications are strong that the high numbers are created because either a single poet or a particular subgroup of poets makes use of such a phrase. “Night and day” is one such example. A further stylistic marker is how often poets use different LNGs. Chapter 3 clearly demonstrates that, as a user of this particular colligational structure, Walt Whitman stands out. Given his background as a journalist (he was editor for a number of newspapers), this particular observation gives rise to the question whether one particular style of writing has, in his case, been transferred from one genre to another. Mollin (2014, 32) shows that binomials are more frequently found in newspaper texts than in either fiction or speech texts. As a result, Whitman’s poems can be seen as more direct and personal.4 A further discovery can be linked to Hoey’s (2005) theory of lexical priming: namely that particular sets of words appear in one speech community (namely, US-based poets in the 1800s) yet are absent amongst British poets. As such, it can be said that certain Linked Noun Groups are widely enough used in the US to have made their way into poems by three different producers. Finally, a reading of poetry displays the highly relevant notion of absence: as Gustafsson (1975) pointed out, a lot of binomials found are hapax legomena. While this study focussed on LNGs that occur repeatedly, the particular focus in Chap. 3 on poetry-related keywords highlighted that a lot of poets do, indeed, employ LNGs. However, these might either be rare in use (in other words, only one or two other poets seem to have used the same binomial) or they are unique creations by a single poet, which are not found in other poets’ works. The example we saw is that of Swinburne, who does use LNGs, yet rarely is the same trigram used more than once in his entire body of work.
All in all, one hopes that it has been demonstrated that Linked Noun Groups, being dense information carriers that are relatively infrequent in use, are a potent marker of style due to their unique capacity to expand or contradict the message of the initial noun. Furthermore, while several LNGs are fossilised, frequently used constructions, their innate flexibility provides an ideal platform for modification through either boosting or contradicting, while allowing for endless new combinations if creative or stylistic frameworks demand this.

5.2.3  LNGs and Natural Language Processing Tools

In this short section, I would like to highlight how far developers of natural language processing (NLP) tools have worked to identify and process Linked Noun Groups. Also referred to in the context of machine learning (ML) or artificial intelligence (AI), NLP tools make use of vast quantities of data: the “NL” stands for text material that has been extracted from billions of words from various sources, including spoken text. In everyday life, typical uses are search engines like Google, personal assistants like Siri and Alexa, or appliances making use of Google Go.5 We have seen above that Linked Noun Groups are a key candidate to convey the message a producer wants to give. As such, it should be no surprise that software engineers have tried to use the high information density of linked nouns, as this early patent description by Corman and Dooley (2007, 7) shows:

Additional sentences can be converted into a network of linked noun phrases … A large document may yield hundreds or even thousands of networks of linked noun phrases. Accumulating the network of each sentence yields a network of words comprised of the subjects and objects of the text and how these are related to one another.

This, however, like Negishi and Takeuchi’s 2015 patent, looks for noun phrases (e.g. “update procedure”) which are close together, as these linked noun phrases certainly are high in their information content. The theory described by Corman and Dooley in order to “identify information content” appears persuasive, and it appears that the algorithm thus created could plausibly be adapted to extract not just related noun phrases but also related LNGs, that is, those particular Linked Noun Groups that expand on the meaning of the initial noun. An alternative approach is to ensure that n1 is linked to n2 with either [AND] or [OR], as these are the most frequent links we have discovered. This is quite important, as it will shorten processing time for any kind of NLP application if word clusters like the trigram “women and children” are instantly identified as single lexical items. At the time of writing (April 2020), the author tested some of the most frequent LNGs described in this book. Trying Siri (using the LNGs given in Table 5.1), either by prompting it to autocomplete after the first two words or by saying the complete trigram, showed that the Apple software was unable to respond meaningfully: “I am not sure about that.”

This is quite different with Google Go (the software underlying the voice-based search engine). Giving the first two elements of a Linked Noun Group (“Mr. and …”; “day and …”) would result in the wanted completion. This, however, appeared not in the results but only when the “auto-complete” search pane was reviewed. Crucially though, the system was noticeably slow. Compound nouns like “police station” or “processing effort” would yield results without a visible delay, yet giving incomplete Linked Noun Groups clearly meant that the system had to check through a large number of possible options. Even the fairly common phrase “cause and effect” seemed to take longer to process than any compound noun tried. While this means that the Google tool appears more adept than Siri, drawbacks are still obvious. A final experiment looks at the autocomplete function of the Google search engine. The results are a lot more promising for this. The only exceptions were “one and two” (aca.) and “her and she” (20C lit.), where the first two words did not yield the trigram in the first ten suggestions. By contrast, just typing in “cause” will instantly offer the option of “cause and effect” (aca.); “me” gives the option of “me and I” (spoken). The results for all tests can be found in the appendix. What does this mean, however, for the development of NLP tools? Firstly, it seems interesting that the speech-based tools are a lot less open to recognising LNGs than typed-text tools. This reflects the occurrence patterns of Linked Noun Groups overall; however, proficient users of the English language will still be able to process all types of LNGs (and will easily recognise reversed forms as being atypical). According to an IBM article6 on the future of AI, developers apparently agree “that advancing the ability of computers to interact with us in a more natural way is critical for the AI-human relationship to reach its fullest potential”.
If, however, application developers aim to make their tools more conversation-like, there are clear obstacles: “[t]he transition to a conversational assistant is harder from a computer science standpoint”, Google’s CEO said in a 2016 interview.7 Even at that point, he had identified a clear trend from the first-generation text-based search to a voice-based search and, beyond that, the development towards a system that mimics natural-sounding conversation. Consequently, moving on from the Google Go assistant, there was Google Duplex (2018), which aimed to sound

humanlike by adding backchannel vocalisations and the ability to “rethink”. Then, in 2020, Meena, described as a tool “Towards a Human-like Open-Domain Chatbot” (the title of the paper by Adiwardana et al. 2020), was presented by engineers of the Google Research Brain Team. While binomials are not explicitly mentioned, the “training objective is to minimize perplexity, the uncertainty of predicting the next token (in this case, the next word in a conversation)”. The developers claim that, by a metric they developed, Meena is getting close to humans for the “Sensibleness and Specificity Average” (SSA) score. The preceding chapters have clearly demonstrated that LNGs are an integral part of the English language; compellingly, their usage patterns are highly genre specific. In order to artificially recreate natural language processes, any tool needs to have a sense of when it is sensible (and, conversely, when it is nonsensical) to understand and employ LNGs. These are carriers of highly specific information in a dense format, and a human speaker/listener or writer/reader would have a range of expectations as to genre-appropriate usage. This particular knowledge must, therefore, be integrated in the development of any NLP tool.
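The alternative approach mentioned earlier in this section, in which n1 is linked to n2 by either [AND] or [OR] and the resulting trigram is treated as a single item, can be sketched in a few lines. This is an illustrative sketch only, not the method of any of the tools or patents discussed: it assumes that the text has already been tokenised and part-of-speech tagged by some other tool, and the tag names ("NOUN", "PROPN") and the function name are assumptions that would need adjusting to the tagger actually used.

```python
from collections import Counter
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, part-of-speech tag)

LINKS = {"and", "or"}          # the two most frequent links discussed in this book
NOUN_TAGS = {"NOUN", "PROPN"}  # assumed tag names; widen for numerals or pronouns


def extract_lngs(tagged: List[TaggedToken]) -> Counter:
    """Count N-[and/or]-N trigrams in a sequence of POS-tagged tokens."""
    counts: Counter = Counter()
    # Slide a three-token window over the text.
    for (w1, t1), (link, _), (w2, t2) in zip(tagged, tagged[1:], tagged[2:]):
        if t1 in NOUN_TAGS and t2 in NOUN_TAGS and link.lower() in LINKS:
            counts[f"{w1.lower()} {link.lower()} {w2.lower()}"] += 1
    return counts


# Hand-tagged example sentence (hypothetical input).
sample = [
    ("Women", "NOUN"), ("and", "CCONJ"), ("children", "NOUN"),
    ("worked", "VERB"), ("day", "NOUN"), ("and", "CCONJ"), ("night", "NOUN"),
    ("more", "ADV"), ("or", "CCONJ"), ("less", "ADV"),
]
for trigram, n in extract_lngs(sample).items():
    print(trigram, n)
```

Because the filter looks only at the tags of the first and third token, adverb groups such as “more or less” are skipped, while adding numeral or pronoun tags to the assumed tag set would also capture groups like “two or three” or “you and me”.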

5.3   Linked Noun Groups: Oppositions and Expansions

Given the evidence provided in this book, what conclusions can be drawn? Apart from certain exceptions, this book provides the reader with the most frequent Linked Noun Groups occurring in five different corpora (BNC 2014, BASE, BAWE, GPC and 19C) and two sub-corpora of the BNC (academic writing and prose fiction) that show repeat occurrences higher than five. The result echoes the appendix of Mollin’s 2014 book on 544 binomials under analysis and surpasses the earlier manual research as well as the all-too-brief entries in standard grammars. According to Mollin,

the vast majority of binomials is register-independent, (…) a number of highly register-specific binomials occur almost exclusively in one of the registers. These reflect the subject matters typical of the registers: for example, the binomials salt and pepper and name and address are virtually restricted to magazine writing (…) Academic-exclusive binomials include Marx and Engels as well as policy and practice. (Mollin 2014, 33)

The range of sources chosen for this book is wider than Mollin’s and includes a large number of texts that are more recent; this helped to highlight some recent developments, in particular in academic discourses. Furthermore, Chap. 4 (and, to a degree, Chap. 3) has provided a demonstration that Linked Noun Groups can be seen as a characteristic and idiosyncratic style marker of one individual prose writer (Dickens or Marx) or, indeed, poet. In Chap. 1, different functions for the key links “or” and “and” have been described. In spoken discourse (either speech or spoken exchanges in fiction texts) in particular, the use of “or” for uncertain amounts (“two or three”, “a year or two”) is prominent, mirroring the use of “and” in numbers (“one and twenty”). Otherwise, the key function in Linked Noun Groups seems to fit the LDOCE category of “possibilities or choice”: “tea or coffee”, “one way or another”. Yet the categories “avoiding a bad result”, “correction”, “proof” and the absolute “and not” seem not to be typical. It must be pointed out, however, that “opposition / antonymy” appears a strong candidate for “or” usage, quite strongly so in poetry, such as “sorrow or joy” and “hope and fear”, and also in academic writing, as in “presence or absence” or “positive or negative”. “And”, apart from when used with numbers, is very typical for what the LDOCE calls to “join two words, phrases etc. referring to related things”. In this book, these forms are referred to as expansions. Use of “and” also has one further, minor function within Linked Noun Groups: “between repeated words to add emphasis”. Typically, these are found in other binomials like adjective or adverb groups (“more and more” is the most frequent of these). However, there are some singular exceptions, like “sun and sun”, “heart and heart” (once) or “love and love” (three occurrences) in the GPC, where repetition is used as a stylistic tool.
Looking at the four categories of binomials described by Gustafsson (1975, 85–87), her second and third categories (homeosemy and hyperonymy) are difficult to keep apart at times: “sun and stars”, for example, covers both of these. Yet, in the particular subcategory of binomials investigated here, linked nouns, it is these two which are overwhelmingly employed: semantic opposition/antonyms (“day and night”, “land and sea”) and semantic complementation (“coat and waistcoat”, “knives and forks”). The colligational links are particularly strong. Apart from a few exceptions, the initial noun of an LNG is followed by either an opposition or an expansion in the majority of cases.8 When it comes to geography, there is a clear preference to name the items closer to the human

experience first; therefore, we have “land and sea” or “sun and stars”. It can be said, therefore, that, at least amongst linked nouns, there is a greater level of predictability than Abraham saw in 1950. To sum up, it can be said that Linked Noun Groups, throughout their usage in different text types and genres, tend to be used overwhelmingly for oppositions or expansions.

5.4   Appendix: Google Autocomplete Examples

Prompt: men and
Suggestions: men and women; men and man; men and black; men and women equality; men and mice; men and women symbol

Prompt: man or
Suggestions: man or a monster lyrics; man or men; man or muppet; man or astroman; man or woman of the sea; man or woman

Prompt: her and
Suggestions: her and him; her and him bells thorne; her and skip marley; her and his; her and i; her and i or her and me; her and him free

Prompt: Mr. and
Suggestions: mr. and mrs.smith; mr. and mrs; mmr. and mrs. ramachari; mr. anderson; mr. and mrs. 420; mr. and mrs. khiladi; mr. and mrs. smith cast

Prompt: sun and
Suggestions: sun and moon; sun and moon sign; sun and steel; sun and moon pokemon; sun and moon pokedex; sun and bass 2020; sun and sand

Prompt: day and
Suggestions: day and night; day and time; day and age; day and ross; day and night blinds; day and night nurse; day and night map

Prompt: two or
Suggestions: two or more atoms; two or three; two or more velocities add by; two or three things i know about her; two or three witnesses; two or more are gathered

Prompt: two and
Suggestions: two and a half men cast; two and a half men netflix; two and a half men kandi; two and a half men prudence; two and a half men; two and a half men rose; two and a half men jake

Prompt: one and
Suggestions: one and done; one and a half men; one and a half year; one and the other; one and only chords; one and only lyrics; one and only

Prompt: one or
Suggestions: one or more; one orange calories; one or two; one or the other; one order of magnitude; one origin; one or the other questions

Prompt: one hundred and
Suggestions: one hundred and one dalmatians; one hundred and fifty; one hundred and eighty; one hundred and fifty thousand; one hundred and one; one hundred and twenty; one hundred and fifty thousand in numbers; one hundred and ten dollars; one hundred and twenty thousand; one hundred and ten

Prompt: me
Suggestions: me naiset; me before you; me and the boys; me and i; me too; me saatio; me gusta

Prompt: cause
Suggestions: cause synonym; cause; cause and effect; cause suomeksi; causeway; cause meaning; cause of diabetes

5  FINDINGS, APPLICATIONS AND CONCLUSIONS 


Notes
1. Both Norrick and Google Scholar give this as the earliest research on “English binomials”.
2. It could be pointed out that figure 5.1 here broadly echoes the findings that Mollin (2014, 32) gives in her figure 3.3 with regard to the whole of the BNC.
3. These issues were looked at in greater detail in Sect. 2.3.
4. There are, of course, many other elements of Whitman’s poetry that give just that impression. The point made here is that the use of LNGs is yet another stylistic tool employed to bring this about.
5. See Pace-Sigge (2018) for details.
6. IBM (n.d.) http://www.ibm.com/cognitive/advantage-reports/future-of-artificial-intelligence/ai-conversation.html (last accessed 20 April 2020).
7. Helft, M. “One-On-One with Google CEO Sundar Pichai: AI, Hardware, Monetization and the Future of Search”. Forbes. https://www.forbes.com/sites/miguelhelft/2016/05/20/one-on-one-with-sundar-pichai-on-the-future-of-google/#4f291e801042 (last accessed 20 April 2020).
8. Thus, there are, in GPC, eight occurrences of the antonymic “hills and valleys” yet only two occurrences of the expansion “wood and hills”.

References
Abraham, R. D. (1950). Fixed Order of Coordinates: A Study in Comparative Lexicography. Modern Language Journal, 34(4), 276–287.
Adiwardana, D., Luong, M. T., So, D. R., Hall, J., Fiedel, N., Thoppilan, R., Yang, Z., Kulshreshtha, A., Nemade, G., Lu, Y., & Le, Q. V. (2020). Towards a Human-like Open-domain ChatBot. arXiv preprint arXiv:2001.09977.
Arnon, I., & Snider, N. (2010). More than Words: Frequency Effects for Multi-word Phrases. Journal of Memory and Language, 62, 67–82. https://doi.org/10.1016/j.jml.2009.09.005.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. ([1999] 2007). Longman Grammar of Spoken and Written English. Harlow: Pearson Education.
Conklin, K., & Schmitt, N. (2012). The Processing of Formulaic Language. Annual Review of Applied Linguistics, 32, 45–61.
Corman, S. R., & Dooley, K. J. (2007). System and Method of Analyzing Text using Dynamic Centering Resonance Analysis. U.S. Patent 7,295,967. Arizona State University.
Easthope, A. (2013). Poetry as Discourse. London: Routledge.
Ellis, N. C. (2002). Frequency Effects in Language Acquisition: A Review with Implications for Theories of Implicit and Explicit Language Acquisition. Studies in Second Language Acquisition, 24, 143–188.


Ernestova, M. (2007). Role of Binomial Phrases in Current English and Implications for Readers and Students of EFL. In G. Shiel, I. Stričević, & D. Sabolović-Krajina (Eds.), Literacy without Boundaries (pp. 273–279). Zagreb: Croatian Reading Association.
Fowler, R. (1981). Literature as Social Discourse: The Practice of Linguistic Criticism. Bloomington: Indiana University Press.
Gabrovšek, D. (2011). “Micro” Phraseology in Action: A Look at Fixed Binomials. ELOPE: English Language Overseas Perspectives and Enquiries, 8(1), 19–29.
Goźdź-Roszkowski, S. (2011). Patterns of Linguistic Variation in American Legal English: A Corpus-based Study. Frankfurt am Main: Peter Lang.
Gustafsson, M. (1975). Binomial Expressions in Present-day English: A Syntactic and Semantic Study. Turku: Annales Universitatis Turkuensis.
Gustafsson, M. (1976). The Frequency and “Frozenness” of Some English Binomials. Neuphilologische Mitteilungen, 77(4), 623–637. Retrieved March 27, 2020, from www.jstor.org/stable/43343096.
Hegarty, P., Watson, N., Fletcher, K., & McQueen, G. (2011). When Gentlemen are First and Ladies Last? Effects of Gender Stereotypes on the Order of Romantic Partners’ Names. British Journal of Social Psychology, 50, 21–35.
Hoey, M. (2005). Lexical Priming: A New Theory of Words and Language. London: Routledge.
Hyland, K., & Tse, P. (2007). Is there an “Academic Vocabulary”? TESOL Quarterly, 41(2), 235–253.
Koprowski, M. (2005). Investigating the Usefulness of Lexical Phrases in Contemporary Coursebooks. ELT Journal, 59(4), 322–332.
Malkiel, Y. (1959). Studies in Irreversible Binomials. Lingua, 8, 113–160.
Mollin, S. (2014). The (Ir)reversibility of English Binomials. Amsterdam: John Benjamins.
Moon, R. (1998). Fixed Expressions and Idioms in English. Oxford: Clarendon Press.
Motschenbacher, H. (2013). Gentlemen Before Ladies? A Corpus-based Study of Conjunct Order in Personal Binomials. Journal of English Linguistics, 41(3), 212–242.
Negishi, S., & Takeuchi, H. (2015). Computer-implemented Method, Program, and System for Identifying Non-self-descriptive Terms in Electronic Documents. U.S. Patent 9,158,756. International Business Machines Corp.
Norrick, N. R. (1988). Binomial Meaning in Texts. Journal of English Linguistics, 21(1), 72–87. https://doi.org/10.1177/007542428802100106.
Pace-Sigge, M. (2013). Lexical Priming in Spoken English Usage. Houndmills and Basingstoke: Palgrave Macmillan.
Pace-Sigge, M. (2015). The Function and Use of TO and OF in Multi-word Units. Houndmills and Basingstoke: Palgrave Macmillan.


Pace-Sigge, M. (2018). Spreading Activation, Lexical Priming and the Semantic Web: Early Psycholinguistic Theories, Corpus Linguistics and AI applications. Houndmills and Basingstoke: Palgrave Macmillan.
Schmidt, S. J. (2000). Interpretation: The Story Does Have an Ending. Poetics Today, 21(4), 621–632.
Seifart, F. J. S., Danielsen, S., Hartmann, I., Pakendorf, B., Wichmann, S., Witzlack-Makarevich, A., De Jong, N. H., & Bickel, B. (2018). Nouns Slow Down Speech across Structurally and Culturally Diverse Languages. Proceedings of the National Academy of Sciences, 115, 5720–5725. https://doi.org/10.1073/PNAS.1800708115.
Sinclair, M. (Ed.). (1990). Collins Cobuild English Grammar. London and Glasgow: Collins.
Siyanova-Chanturia, A., Conklin, K., & van Heuven, W. J. B. (2011). Seeing a Phrase “Time and Again” Matters: The Role of Phrasal Frequency in the Processing of Multiword Sequences. Journal of Experimental Psychology: Learning, Memory and Cognition, 37(3), 776–784. https://doi.org/10.1037/a0022531.
Sugiati, A., & Rukmini, D. (2017). The Application of Formulaic Expressions in The Conversation Texts of Senior High School English Textbooks. EEJ, 7(2), 103–111.

People Index: Poets, Scholars and Writers

Note: Page numbers followed by ‘n’ refer to notes.

A
Abraham, Richard, 1, 2, 129, 142
Adiwardana, Daniel, 140

B
Benor, Sarah Bunin, 11n9
Biber, Douglas, 2, 3, 6, 7, 9, 11n3, 26, 31, 45, 86, 88, 129, 130
Blake, William, 54, 81n4
Bridges, Robert, 74
Brontë Sisters, 73
Brown, 71
Browning, Robert, 66, 75
Bryant, William Cullen, 72, 73, 75, 82n14
Burns, Robert, 66, 70, 74
Byron, George Lord, 62, 66, 80, 81n4

C
Clare, John, 82n14
Coleridge, Samuel Taylor, 70, 72, 81n4, 82n14
Conklin, Kathy, 4, 6, 133, 134
Conrad, Susan, 2, 86, 88
Corman, Steven R., 138
Crabbe, George, 69, 72, 81n4

D
Dickens, Charles, 9, 11, 85, 86, 89, 111–120, 126n23, 126n24, 126n25, 131, 136, 141
Dooley, Kevin J., 138
Dryden, John, 75, 81n4, 82n15
Dunbar, Paul Laurence, 66, 75

© The Author(s) 2020 M. Pace-Sigge, Linked Noun Groups, https://doi.org/10.1007/978-3-030-53986-3


E
Ellis, Nick, 134
Emerson, Ralph, 63, 64, 68–70, 82n22
Ernestova, Marie, 6

F
Fowler, Roger, 136

G
Gabrovšek, Dušan, 132
Goźdź-Roszkowski, Stanislaw, 5, 7, 135
Gustafsson, Marit, 2, 4–7, 11n1, 11n8, 129, 130, 134, 137, 141

H
Hegarty, Peter, 4, 39, 134
Herrick, Robert, 57, 68, 70, 77
Hoey, Michael, 6, 9, 10, 53, 94, 134, 137
Hood, Thomas, 70

K
Keats, John, 64, 66, 81n4, 82n15
Kesebir, Selin, 39, 43
Kipling, 75
Koprowski, Mark, 132

L
Leech, Geoff, 53
Levy, Roger, 5, 11n9
Louw, Bill, 1, 53
Low, 71
Lowell, Robert, 63, 65, 66, 69, 74, 75, 82n22

M
Mahlberg, Michaela, 8, 86, 111
Malkiel, Yakov, 2, 129
Marx, Karl, 9, 11, 85, 86, 89, 111–120, 124n1, 126n26, 131, 135, 136, 140, 141
McIntyre, Dan, 54
Milton, John, 59, 67, 69, 70, 75, 82n19
Mollin, Sandra, 4–6, 11n9, 45, 129, 130, 135, 137, 140, 141, 143n2
Moon, Rosamond, 2, 6, 7, 76, 129
Motschenbacher, Heiko, 39, 40, 42, 43, 134

N
Negishi, Satoshi, 138
Newbolt, 74
Norrick, Neal, 129, 130, 143n1

O
O’Halloran, Kieran, 53, 54

P
Pace-Sigge, Michael, 8, 9, 20, 54, 55, 73, 81n1, 86, 110, 112, 116, 132, 134
Parrish, Allison, 8
Patterson, Katie J., 54, 85, 86
Pontrandolfo, Gianluca, 5, 7
Pope, Alexander, 72, 82n15

R
Rayson, Paul, 87, 112
Rossetti, Christina, 57, 62, 74, 75, 77, 80, 81n4, 82n21
Rossetti, Dante, 62, 64, 67, 69

S
Schmidt, Siegfried J., 89, 131
Schmitt, Norbert, 6, 134
Scott, Michael, 7, 81n4, 87. See also Wordsmith Tools
Seifart, Frank, 6, 133
Shakespeare, William, 66, 72
Sinclair, John, 53, 129
Siyanova-Chanturia, Anna, 4, 6, 15, 133, 134
Spenser, Edmund, 66, 69, 70, 72, 81n4
Sugiati, Ana, 6, 10, 132
Swinburne, Algernon Charles, 57, 62, 65, 66, 69–72, 74, 77, 80, 82n25, 137

T
Teasdale, Sara, 64, 82n21
Tennyson, Alfred Lord, 66, 67, 69, 81n4
Thorne, Sara, 53

V
Van Heuven, Walter, 4, 133, 134

W
Walker, Brian, 54
Waller, Edmund, 74, 75
Whitman, Walt, 63–66, 69–73, 75–77, 79, 80, 82n20, 82n21, 137, 143n4
Wilcox, Ella, 66, 69, 80

Subject Index

NUMBERS AND SYMBOLS
19C, 54, 86, 134

A
Absence, 4, 29, 32, 34, 39, 44, 45, 103, 115, 120, 137, 141
Academic
    academic discourse, 45, 135, 141
    academic language, 15, 26, 28, 31, 135
    academic spoken, 130
    academic texts, 15–45, 135
    academic writing, 5, 9, 10, 15, 26, 29, 31, 32, 35, 39, 43–45, 85, 86, 112, 116, 118, 131, 133, 135, 136, 140, 141
    written academic, 8, 15–45
Academic subjects
    business and economics, 37, 38
    country, 34–35, 47–48
    law, 34–35, 47–48
    medicine, 35–36, 135
    science, 2, 35–36, 48, 125n16
    social sciences, 30, 32, 34, 36–37, 40, 48, 50n10, 50n11, 135
Adverb(s), 44, 46, 47, 87, 88, 124, 125n10, 132, 141
Antonym(s), 4, 61, 75, 141

B
BAWE, 16, 24, 27–45, 117–119, 126n27, 140
Binomials, 1–7, 11n8, 45, 60, 76, 129, 130, 132–134, 136, 137, 140, 141
BNC-2014, 7, 8, 15–24
British Academic Spoken Corpus (BASE), 15, 16, 22–27, 140
British National Corpus (BNC)
    academic written sub-corpus, 6
    fiction sub-corpus, 85, 86, 103
    poetry sub-corpus, 12n11


C
Children, 39, 40, 43, 49–50, 109, 114, 115, 120, 126n22, 138
Colligation, 9, 17, 27, 31, 33, 36, 54, 66, 71, 115, 116
Collins CoBuild English Grammar, 2
Compound(s), 119, 120, 139
Conjunct(s), 2–4, 7, 16, 17, 21, 24, 26, 28, 41, 44, 86, 87, 90, 97
Conversation, 10, 15, 17, 20, 94, 129, 139, 140

E
Education, see Teaching
Expansions, 4, 33, 45–47, 58, 62–67, 70, 73–75, 82n17, 90, 117, 140–142, 143n8

F
Fixed coordinates, 1
Formulaic
    language, 134
    LNGs, 2, 15, 56, 85, 129
    phrase, 10, 131, 132
    sequences, 6, 43, 133, 134

G
Gender
    female, women, 5, 32, 39–43, 45, 78, 79, 86, 109, 114, 115, 120, 134, 136, 138
    feminine, 11n6
    gendered, 25, 39–44, 49–50
    male, man, 23, 32, 40–42, 45, 60, 61, 79, 92, 100, 108, 120, 136
    masculine, 11n6
Genre(s), 4–9, 29, 31, 39, 45, 53, 57, 80, 86, 94, 129, 130, 132, 133, 135–137, 140, 142
Google
    Google Duplex, 139
    Google Meena, 140
Grammar, 1, 4, 140
Gutenberg
    Gutenberg Poetry Corpus (GPC) (see Parrish, Allison)

H
Hapax legomena, 5, 137
Homeosemy, 4, 141
Hyponymy, 4

I
Idiom, 2, 10, 32, 43

J
Juliet and Romeo Effect, 4, 39, 134

L
Lancsbox, 7
LGBT (LGBTQ+), 5, 11n7
Log-likelihood, 107, 108, 115, 124
Longman Dictionary of Contemporary English (LDOCE), 3, 93, 97, 141
Love, 56, 92, 136

M
Metaphor, 77, 104, 107

N
Natural language processing (NLP), 138–140
Nesting, 10, 53, 102, 115, 132
Novel, 2, 8, 9, 34, 43, 51n15, 79, 85, 86, 88, 94, 104, 112, 115, 116, 126n23, 131, 133, 136
Number/numeral, 2, 15, 54, 85, 131

O
Ontology, 55, 90, 98
Opposites, 31–33, 45–47, 58, 63–65, 68, 70, 71, 75, 76, 90, 92, 97, 102, 107, 125n10, 125n18

P
Pattern, 1, 6, 9, 10, 19, 24, 41, 42, 53, 58, 85, 87, 89, 90, 115, 125n21, 132–134, 136, 139
    usage pattern, 3, 7, 10, 16, 75, 88, 90, 93, 94, 113, 116, 129, 131, 140
    See also Formulaic
Phraseology
    multi-word unit, 2, 6, 10, 15, 132
    phrases, 132
Priming (lexical priming), 9, 137
Pronoun(s), 23, 90, 98, 108–109, 123, 133
    personal pronoun(s), 17–19, 22–23, 25, 40, 104, 108
Prosody, 1, 42, 58, 61, 92, 100, 108, 110
Prose fiction, 54, 67, 86, 94, 125n7, 140
Psycholinguistics, 134

S
Sketchengine, 98
Spoken
    casual spoken, 8, 17, 130
    English, 8, 17, 132, 133, 135
    informal spoken, 17
    spoken academic English, 25
Style
    discourse analysis, 4, 7, 131, 134–137
    style manual, 44
    stylistic feature, 31, 116
    stylistic marker, 7, 34, 87, 137
    stylistic tool, 141, 143n4
    written prose, 86
    written texts, 9, 24, 45, 94, 133, 135
Swinburne, 58
Syntax, 4, 25, 53, 88

T
Teaching, 6, 9–11, 31, 38, 131–134
    students, 10, 132
Technology
    applications, 131–132
    assistant, 10, 138, 139
    See also Natural language processing (NLP)
Textbook, see Teaching
Themes (in poetry), 7, 9, 10, 53, 55–56, 62, 79
    See also Ontology
Time markers, 17, 24, 29, 96, 97, 103–104, 120, 122
Translation(s), 9, 51n15, 85, 86, 89, 111–120, 126n26, 126n27, 131, 136

V
Vagueness markers, 17, 19–22, 94–97, 110, 131

W
Wordsmith Tools, 55
    See also Scott, Michael