Multi-Dimensional Analysis: Research Methods and Current Issues 9781350023826, 9781350023857, 9781350023833

Multi-dimensional Analysis: Research Methods and Current Issues provides a comprehensive guide both to the statistical m


English Pages [279] Year 2019


Table of contents :
Cover
Half title
Series page
Title
Copyrights
Dedication
Contents
List of Illustrations
List of Contributors
Acknowledgments
Introduction Tony Berber Sardinha and Marcia Veirano Pinto
Part One Understanding the Principles: Origins of the Method, Corpus Design, and Annotation
1 Multi-Dimensional Analysis: A Historical Synopsis Douglas Biber
2 Corpus Design and Representativeness Jesse Egbert
3 Tagging and Counting Linguistic Features for Multi-Dimensional Analysis Bethany Gray
4 The Multi-Dimensional Analysis Tagger Andrea Nini
Part Two Conducting an MD Analysis: Quantitative and Qualitative Aspects
5 Multivariate Statistics Commonly Used in Multi-Dimensional Analysis Pascual Cantos-Gomez
6 Doing Multi-Dimensional Analysis in SPSS, SAS, and R Jesse Egbert and Shelley Staples
7 From Factors to Dimensions: Interpreting Linguistic Co-occurrence Patterns Eric Friginal and Jack A. Hardy
8 Adding Registers to a Previous Multi-Dimensional Analysis Tony Berber Sardinha, Marcia Veirano Pinto, Cristina Mayer, Maria Carolina Zuppardi, and Carlos Henrique Kauffmann
Part Three Exploring the Method
9 Examining Lexical and Cohesion Differences in Discipline-Specific Writing Using Multi-Dimensional Analysis Scott A. Crossley, Kristopher Kyle, and Ute Römer
10 Using Discriminate Function Analysis in Multi-Dimensional Analysis Marcia Veirano Pinto
11 Using Multi-Dimensional Analysis to Detect Representations of National Cultures Tony Berber Sardinha
Index


Multi-Dimensional Analysis

Also available from Bloomsbury

Research Methods in Linguistics: Second Edition, Lia Litosseliti
Research Methods in Applied Linguistics, Brian Paltridge
Corpus Linguistics and Linguistically Annotated Corpora, Sandra Kuebler
Experimental Research Methods in Language Learning, Aek Phakiti

Multi-Dimensional Analysis
Research Methods and Current Issues

Edited by Tony Berber Sardinha and Marcia Veirano Pinto

BLOOMSBURY ACADEMIC
Bloomsbury Publishing Plc
50 Bedford Square, London, WC1B 3DP, UK
1385 Broadway, New York, NY 10018, USA

BLOOMSBURY, BLOOMSBURY ACADEMIC and the Diana logo are trademarks of Bloomsbury Publishing Plc

First published in Great Britain 2019
Paperback edition published 2021

Copyright © Tony Berber Sardinha, Marcia Veirano Pinto and Contributors, 2019

Tony Berber Sardinha and Marcia Veirano Pinto have asserted their right under the Copyright, Designs and Patents Act, 1988, to be identified as Editors of this work.

For legal purposes the Acknowledgments on p. xvi constitute an extension of this copyright page.

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers.

Bloomsbury Publishing Plc does not have any control over, or responsibility for, any third-party websites referred to or in this book. All internet addresses given in this book were correct at the time of going to press. The author and publisher regret any inconvenience caused if addresses have changed or sites have ceased to exist, but can accept no responsibility for any such changes.

A catalogue record for this book is available from the British Library.
A catalog record for this book is available from the Library of Congress.

ISBN:
HB: 978-1-3500-2382-6
PB: 978-1-3501-9040-5
ePDF: 978-1-3500-2383-3
eBook: 978-1-3500-2384-0

Typeset by Deanta Global Publishing Services, Chennai, India

To find out more about our authors and books visit www.bloomsbury.com and sign up for our newsletters.

To Marilisa and Julia
Tony

To Walter, Otto, and Rafael
Marcia

Contents

List of Illustrations  ix
List of Contributors  xv
Acknowledgments  xvi

Introduction  Tony Berber Sardinha and Marcia Veirano Pinto  1

Part One  Understanding the Principles: Origins of the Method, Corpus Design, and Annotation
1 Multi-Dimensional Analysis: A Historical Synopsis  Douglas Biber  11
2 Corpus Design and Representativeness  Jesse Egbert  27
3 Tagging and Counting Linguistic Features for Multi-Dimensional Analysis  Bethany Gray  43
4 The Multi-Dimensional Analysis Tagger  Andrea Nini  67

Part Two  Conducting an MD Analysis: Quantitative and Qualitative Aspects
5 Multivariate Statistics Commonly Used in Multi-Dimensional Analysis  Pascual Cantos-Gomez  97
6 Doing Multi-Dimensional Analysis in SPSS, SAS, and R  Jesse Egbert and Shelley Staples  125
7 From Factors to Dimensions: Interpreting Linguistic Co-occurrence Patterns  Eric Friginal and Jack A. Hardy  145
8 Adding Registers to a Previous Multi-Dimensional Analysis  Tony Berber Sardinha, Marcia Veirano Pinto, Cristina Mayer, Maria Carolina Zuppardi, and Carlos Henrique Kauffmann  165

Part Three  Exploring the Method
9 Examining Lexical and Cohesion Differences in Discipline-Specific Writing Using Multi-Dimensional Analysis  Scott A. Crossley, Kristopher Kyle, and Ute Römer  189
10 Using Discriminate Function Analysis in Multi-Dimensional Analysis  Marcia Veirano Pinto  217
11 Using Multi-Dimensional Analysis to Detect Representations of National Cultures  Tony Berber Sardinha  231

Index  259

List of Illustrations

Figures

Figure 2.1

Rates of occurrence (per million words) for verbs in COCA

33

Figure 3.1

Example interface for an interactive tag editing program (FixTag)

61

Figure 4.1

Screenshot of MAT interface for Windows

73

Figure 4.2

Mean scores and ranges for Dimension 1, Involved versus Informational Discourse, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs

89

Figure 4.3

Mean scores and ranges for Dimension 2, Narrative versus Non-narrative Concerns, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs

89

Figure 4.4

Mean scores and ranges for Dimension 3, Situation-Dependent versus Explicit Reference, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs

90

Figure 4.5

Mean scores and ranges for Dimension 4, Overt Expression of Persuasion, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs

90

Figure 4.6

Mean scores and ranges for Dimension 5, Abstract versus Nonabstract Information, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs

91

Figure 4.7

Mean scores and ranges for Dimension 6, On-Line Informational Elaboration, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs

92

Figure 5.1

Plot correlating factors and genres (adapted from Biber 1988, 18)

102

Figure 5.2

Mean factor scores and related genres (adapted from Biber and Finegan 1986, 32)

103

Figure 5.3

Rotated solution graph

105

Figure 5.4

Dendrogram of nine Russian near synonyms meaning “to try” (adapted from Gries 2009, 22)

108

Figure 5.5

Cluster analysis of fifteen text types based on one hundred common words (adapted from Nishina 2007, 31)

109

Figure 5.6

Dendrogram

110

Figure 5.7

Data display

111

Figure 5.8

Data display of nonhierarchical clustering

112

Figure 6.1

SPSS results containing the KMO statistic

134

Figure 6.2

SPSS scree plot

135

Figure 6.3

SPSS output containing variance explained by the factors in the model

135

Figure 6.4

SPSS output for complete factor matrix (only first five features displayed)

136

Figure 6.5

SAS results containing the KMO statistic

137

Figure 6.6

SAS scree plot

137

Figure 6.7

SAS output containing eigenvalues for each factor

138

Figure 6.8

SAS output for complete factor matrix (only first five features displayed)

138

Figure 6.9

R results containing KMO statistic

138

Figure 6.10 R scree plot

139

Figure 6.11 R output containing variance accounted for by each factor

139

Figure 6.12 R output for complete factor matrix (only first five features displayed)

139

Figure 7.1

Comparison of average factor scores for Dimension 1: Involved production (+) versus Information production (−). Adapted from Biber (1988, 128)

152

Figure 7.2

Group comparison of caller versus agent texts from Friginal’s Dimension 2

156

Figure 8.1

Calculation of a factor score in an Excel spreadsheet

178

Figure 8.2

Vertical plot of Web registers added to Biber’s (1988) Dimension 3, Explicit versus Situation-Dependent reference

180

Figure 8.3

Horizontal bar plot of Web registers added to Biber’s (1988) Dimension 3, Explicit versus Situation-Dependent reference

181

Figure 9.1

Dimension 1 loadings: Ease of function words

199

Figure 9.2

Dimension 2 loadings: Text simplicity

201

Figure 9.3

Dimension 3 loadings: Content word frequency

203

Figure 9.4

Dimension 4 loadings: Word overlap

205

Figure 9.5

Dimension 5 loadings: Function word repetition

206

Figure 10.1 SPSS main discriminant analysis screen

220

Figure 10.2 SPSS discriminant analysis: Statistics screen

220

Figure 10.3 SPSS discriminant analysis: Classification screen

221

Figure 10.4 SPSS discriminant analysis: Save screen

221

Figure 11.1 Scree plot for American

237

Figure 11.2 Scree plot for Brazilian

238

Figure 11.3 Sequential mean decade scores for Dimension 1, American

240

Figure 11.4 Sequential mean decade scores for Dimension 2, American

242

Figure 11.5 Sequential mean decade scores for Dimension 3, American

243

Figure 11.6 Sequential mean decade scores for Dimension 4, American

245

Figure 11.7 Sequential mean decade scores for Dimension 5, American

246

Figure 11.8 Sequential mean decade scores for Dimension 1, Brazilian

249

Figure 11.9 Sequential mean decade scores for Dimension 2, Brazilian

250

Figure 11.10 Sequential mean decade scores for Dimension 3, Brazilian

252

Figure 11.11 Sequential mean decade scores for Dimension 4, Brazilian

253

Figure 11.12 Comparison between Dimension 2 and 4, Brazilian

253

Figure 11.13 Sequential mean decade scores for Dimension 5, Brazilian

254

Tables

Table 1.1

MD studies of specialized discourse domains in English

18

Table 1.2

MD studies based on other linguistic characteristics

19

Table 3.1

Contingency table for calculating precision and recall for POS tags (adapted from Manning et al. 2008)

57

Table 3.2

Contingency table for calculating precision and recall for passives (data based on Gray 2015)

57

Table 4.1

Short descriptions and summary of the six dimensions of register variation for English found by Biber (1988)

68

Table 4.2

Short description and summary of the eight text types for English found by Biber (1989)

69

Table 4.3

Comparison of dimension scores and distribution of text types between Biber’s (1988, 1989) analysis of the LOB corpus and a MAT analysis of the same corpus

74

Table 4.4

Comparison of dimension scores and distribution of text types between Biber’s (1988, 1989) analysis of the LOB corpus and a MAT analysis of the Brown corpus

79

Table 4.5

Comparison of dimension scores and distribution of text types between the analysis of the LOB corpus and the analysis of the Brown corpus with MAT

84

Table 4.6

Comparison of mean dimension scores and standard deviations for MFTs and personal and professional letters from Biber (1988)

88

Table 5.1

Factors and factor loadings (adapted from Biber and Finegan 1986, 28)

101

Table 5.2

Rotated sums of squared loadings

104

Table 5.3

Rotated component matrix with suppressed loadings