Multi-dimensional Analysis: Research Methods and Current Issues provides a comprehensive guide both to the statistical m
Language: English. Pages: 279. Year: 2019.
Table of contents :
Cover
Half title
Series page
Title
Copyrights
Dedication
Contents
List of Illustrations
List of Contributors
Acknowledgments
Introduction Tony Berber Sardinha and Marcia Veirano Pinto
Part One Understanding the Principles: Origins of the Method, Corpus Design, and Annotation
1 Multi-Dimensional Analysis: A Historical Synopsis Douglas Biber
2 Corpus Design and Representativeness Jesse Egbert
3 Tagging and Counting Linguistic Features for Multi-Dimensional Analysis Bethany Gray
4 The Multi-Dimensional Analysis Tagger Andrea Nini
Part Two Conducting an MD Analysis: Quantitative and Qualitative Aspects
5 Multivariate Statistics Commonly Used in Multi-Dimensional Analysis Pascual Cantos-Gomez
6 Doing Multi-Dimensional Analysis in SPSS, SAS, and R Jesse Egbert and Shelley Staples
7 From Factors to Dimensions: Interpreting Linguistic Co-occurrence Patterns Eric Friginal and Jack A. Hardy
8 Adding Registers to a Previous Multi-Dimensional Analysis Tony Berber Sardinha, Marcia Veirano Pinto, Cristina Mayer, Maria Carolina Zuppardi, and Carlos Henrique Kauffmann
Part Three Exploring the Method
9 Examining Lexical and Cohesion Differences in Discipline-Specific Writing Using Multi-Dimensional Analysis Scott A. Crossley, Kristopher Kyle, and Ute Römer
10 Using Discriminate Function Analysis in Multi-Dimensional Analysis Marcia Veirano Pinto
11 Using Multi-Dimensional Analysis to Detect Representations of National Cultures Tony Berber Sardinha
Index
Multi-Dimensional Analysis
Also available from Bloomsbury
Research Methods in Linguistics: Second Edition, Lia Litosseliti
Research Methods in Applied Linguistics, Brian Paltridge
Corpus Linguistics and Linguistically Annotated Corpora, Sandra Kuebler
Experimental Research Methods in Language Learning, Aek Phakiti
Multi-Dimensional Analysis: Research Methods and Current Issues
Edited by Tony Berber Sardinha and Marcia Veirano Pinto
BLOOMSBURY ACADEMIC
Bloomsbury Publishing Plc
50 Bedford Square, London, WC1B 3DP, UK
1385 Broadway, New York, NY 10018, USA
BLOOMSBURY, BLOOMSBURY ACADEMIC and the Diana logo are trademarks of Bloomsbury Publishing Plc
First published in Great Britain 2019
Paperback edition published 2021
Copyright © Tony Berber Sardinha, Marcia Veirano Pinto and Contributors, 2019
Tony Berber Sardinha and Marcia Veirano Pinto have asserted their right under the Copyright, Designs and Patents Act, 1988, to be identified as Editors of this work.
For legal purposes the Acknowledgments on p. xvi constitute an extension of this copyright page.
All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers.
Bloomsbury Publishing Plc does not have any control over, or responsibility for, any third-party websites referred to or in this book. All internet addresses given in this book were correct at the time of going to press. The author and publisher regret any inconvenience caused if addresses have changed or sites have ceased to exist, but can accept no responsibility for any such changes.
A catalogue record for this book is available from the British Library.
A catalog record for this book is available from the Library of Congress.
ISBN:
HB: 978-1-3500-2382-6
PB: 978-1-3501-9040-5
ePDF: 978-1-3500-2383-3
eBook: 978-1-3500-2384-0
Typeset by Deanta Global Publishing Services, Chennai, India
To find out more about our authors and books visit www.bloomsbury.com and sign up for our newsletters.
To Marilisa and Julia
Tony
To Walter, Otto, and Rafael
Marcia
List of Illustrations
Figures
Figure 2.1 Rates of occurrence (per million words) for verbs in COCA
Figure 3.1 Example interface for an interactive tag editing program (FixTag)
Figure 4.1 Screenshot of MAT interface for Windows
Figure 4.2 Mean scores and ranges for Dimension 1, Involved versus Informational Discourse, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs
Figure 4.3 Mean scores and ranges for Dimension 2, Narrative versus Non-narrative Concerns, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs
Figure 4.4 Mean scores and ranges for Dimension 3, Situation-Dependent versus Explicit Reference, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs
Figure 4.5 Mean scores and ranges for Dimension 4, Overt Expression of Persuasion, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs
Figure 4.6 Mean scores and ranges for Dimension 5, Abstract versus Nonabstract Information, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs
Figure 4.7 Mean scores and ranges for Dimension 6, On-Line Informational Elaboration, for a selection of Biber’s (1988) registers compared to the mean and range for MFTs
Figure 5.1 Plot correlating factors and genres (adapted from Biber 1988, 18)
Figure 5.2 Mean factor scores and related genres (adapted from Biber and Finegan 1986, 32)
Figure 5.3 Rotated solution graph
Figure 5.4 Dendrogram of nine Russian near synonyms meaning “to try” (adapted from Gries 2009, 22)
Figure 5.5 Cluster analysis of fifteen text types based on one hundred common words (adapted from Nishina 2007, 31)
Figure 5.6 Dendrogram
Figure 5.7 Data display
Figure 5.8 Data display of nonhierarchical clustering
Figure 6.1 SPSS results containing the KMO statistic
Figure 6.2 SPSS scree plot
Figure 6.3 SPSS output containing variance explained by the factors in the model
Figure 6.4 SPSS output for complete factor matrix (only first five features displayed)
Figure 6.5 SAS results containing the KMO statistic
Figure 6.6 SAS scree plot
Figure 6.7 SAS output containing eigenvalues for each factor
Figure 6.8 SAS output for complete factor matrix (only first five features displayed)
Figure 6.9 R results containing KMO statistic
Figure 6.10 R scree plot
Figure 6.11 R output containing variance accounted for by each factor
Figure 6.12 R output for complete factor matrix (only first five features displayed)
Figure 7.1 Comparison of average factor scores for Dimension 1: Involved production (+) versus Information production (−). Adapted from Biber (1988, 128)
Figure 7.2 Group comparison of caller versus agent texts from Friginal’s Dimension 2
Figure 8.1 Calculation of a factor score in an Excel spreadsheet
Figure 8.2 Vertical plot of Web registers added to Biber’s (1988) Dimension 3, Explicit versus Situation-Dependent reference
Figure 8.3 Horizontal bar plot of Web registers added to Biber’s (1988) Dimension 3, Explicit versus Situation-Dependent reference
Figure 9.1 Dimension 1 loadings: Ease of function words
Figure 9.2 Dimension 2 loadings: Text simplicity
Figure 9.3 Dimension 3 loadings: Content word frequency
Figure 9.4 Dimension 4 loadings: Word overlap
Figure 9.5 Dimension 5 loadings: Function word repetition
Figure 10.1 SPSS main discriminant analysis screen
Figure 10.2 SPSS discriminant analysis: Statistics screen
Figure 10.3 SPSS discriminant analysis: Classification screen
Figure 10.4 SPSS discriminant analysis: Save screen
Figure 11.1 Scree plot for American
Figure 11.2 Scree plot for Brazilian
Figure 11.3 Sequential mean decade scores for Dimension 1, American
Figure 11.4 Sequential mean decade scores for Dimension 2, American
Figure 11.5 Sequential mean decade scores for Dimension 3, American
Figure 11.6 Sequential mean decade scores for Dimension 4, American
Figure 11.7 Sequential mean decade scores for Dimension 5, American
Figure 11.8 Sequential mean decade scores for Dimension 1, Brazilian
Figure 11.9 Sequential mean decade scores for Dimension 2, Brazilian
Figure 11.10 Sequential mean decade scores for Dimension 3, Brazilian
Figure 11.11 Sequential mean decade scores for Dimension 4, Brazilian
Figure 11.12 Comparison between Dimension 2 and 4, Brazilian
Figure 11.13 Sequential mean decade scores for Dimension 5, Brazilian
Tables
Table 1.1 MD studies of specialized discourse domains in English
Table 1.2 MD studies based on other linguistic characteristics
Table 3.1 Contingency table for calculating precision and recall for POS tags (adapted from Manning et al. 2008)
Table 3.2 Contingency table for calculating precision and recall for passives (data based on Gray 2015)
Table 4.1 Short descriptions and summary of the six dimensions of register variation for English found by Biber (1988)
Table 4.2 Short description and summary of the eight text types for English found by Biber (1989)
Table 4.3 Comparison of dimension scores and distribution of text types between Biber’s (1988, 1989) analysis of the LOB corpus and a MAT analysis of the same corpus
Table 4.4 Comparison of dimension scores and distribution of text types between Biber’s (1988, 1989) analysis of the LOB corpus and a MAT analysis of the Brown corpus
Table 4.5 Comparison of dimension scores and distribution of text types between the analysis of the LOB corpus and the analysis of the Brown corpus with MAT
Table 4.6 Comparison of mean dimension scores and standard deviations for MFTs and personal and professional letters from Biber (1988)
Table 5.1 Factors and factor loadings (adapted from Biber and Finegan 1986, 28)
Table 5.2 Rotated sums of squared loadings
Table 5.3 Rotated component matrix with suppressed loadings