The Meta-Analytic Organization

Introducing Statistico-Organizational Theory

Lex Donaldson

First published 2010 by M.E. Sharpe

Published 2015 by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
711 Third Avenue, New York, NY 10017, USA

Routledge is an imprint of the Taylor & Francis Group, an informa business

Copyright © 2010 Taylor & Francis. All rights reserved.

No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Notices
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Donaldson, Lex.
  The meta-analytic organization : introducing statistico-organizational theory / by Lex Donaldson.
    p. cm.
  Includes bibliographical references and index.
  ISBN 978-0-7656-2067-5 (hardcover : alk. paper)
  1. Organization. 2. Statistics. I. Title.
HD31.D568 2011
302.3’501—dc22

2010018815

ISBN 13: 9780765620682 (pbk)
ISBN 13: 9780765620675 (hbk)

To Jerald Hage—role model of a theorist


Contents

List of Tables and Figures
Foreword by Frank Schmidt
Preface

Part I. The Vision for a New Organizational Theory

1. Creating Organizational Theory From Methodological Principles
   Methodological Principles as Foundations for Organizational Theory
   The Theoretical Structure of Statistico-Organizational Theory
   Managerial Errors
   Overview of Statistico-Organizational Theory

2. The Deep Structure of Data
   Cognitive Positivism
   The Methodological Philosophy Underlying Meta-Analysis
   Hierarchy in Inquiry
   Continuities of New Theory With Prior Organizational Theory Research

Part II. The Sources of Error

3. Managerial Errors From Small Numbers
   The Law of Small Numbers
   Inference and the Nature of What Is Counted
   Errors Caused by Small Organizational Size
   Weak Inference and Organizational Mortality
   International Comparative Advantage in Inference
   Relationships With Some Previous Organizational Theories
   Conclusions

4. Data Disaggregation and Managerial Errors
   The Fallacy of Disaggregation
   Cycles of Dysfunctional Control
   Inference and the Management of Human Resources
   The Problem of Aggregating Previous Inferences
   The Fallacy of Immediacy
   Organizational Structure for Inference
   Conclusions

5. Measurement Error of Profit
   The Problem of Measurement Error
   Measurement Error of Profit
   Measurement Error of Profitability Ratios
   Measurement Error Caused by Time Rates
   Measurement Error Caused by Control Variables
   Measurement Error Caused by Comparison With Standard
   Measurement Error in Contingency Misfit Analyses
   Low Reliability Not Readily Increased
   Conclusions

6. Quantifying the Measurement Error of Profit
   Formal Analysis of Measurement Error of Profit
   Sensitivity of Profit Reliability
   Causal Model of the Determinants of Profit Reliability
   Measurement Error of Growth of Sales
   Conclusions

7. Measurement and Sampling Errors in the M-Form and Strategic Niches
   Errors in M-Form Divisional Profitability
   Errors in Niche Strategy Analysis
   Conclusions

8. Errors From Range Restriction and Extension
   Errors From Range Restriction in Organizational Management
   Errors From Range Restriction in Organizational Misfit and Fit
   Errors From Range Extension
   Conclusions

9. Confounding by the Performance Variable
   Severe Confounding Produced by Weak Spurious Correlations
   Confounding by Definitional Connections
   Confounding by Reciprocal Causality: Performance-Driven Organizational Change
   Confounds and Other Sources of Error
   Conclusions

10. Controlling for Confounding by Using Organizational Experiments
    Confounding as a Source of Error
    Experiments With Control Groups
    Experiments in Organizations
    Bias in Organizational Experiments
    Conclusions

11. Controlling for Confounding by Data Aggregation
    Summary
    Controlling Confounds by Averaging
    Confounding by Multiple Causes
    Confounding in Multiple Causes of Organizational Performance
    Confounds Readily Eliminated
    Controlling Confounds Through Averaging in Organizational Management
    Conclusions

Part III. Integration

12. Errors Not Self-Correcting
    Repeated Operation of Errors
    Overall Error From Multiple Error Sources
    Conclusions

13. Equations of Statistico-Organizational Theory
    True Correlation
    Measurement Error
    Range Restriction and Range Extension
    Combination of Measurement Error and Range Restriction or Extension
    Confounding
    Overall Error Without Sampling Error
    Sampling Error
    Overall Error With Sampling Error
    Conclusions

14. How Managers Can Reduce Errors
    Reducing Sampling Error
    Reducing Measurement Error
    Reducing Errors From Range Artifacts
    Reducing Errors From Confounding

15. Conclusions
    Situations That Lead to Errors
    Reflections and Limitations

Appendix
References
Name Index
Subject Index
About the Author

List of Tables and Figures

Tables

10.1 Results of Impact of MBO on Productivity in Studies Using Control Groups
11.1 Elimination of Confounding of Effects on Performance by Averaging
15.1 Summary of Statistico-Organizational Theory

Figures

1.1 Truer Estimates From Larger N at Higher Hierarchical Levels
1.2 The Theoretical Structure of Statistico-Organizational Theory
1.3 Causal Model of Statistico-Organizational Theory
1.4 Errors in Observed Correlation From Range, Measurement Error, Reverse Causality, and Sampling Error
1.5 Errors in Causal Relationships From Number of Observations and Attenuation
1.6 Managerial Misperceptions From Number of Observations and Attenuation
5.1 Bar Chart Presentation of Profits of Disney’s Business Segments: Reported Versus Possible True Profits
5.2 Error in Bar Chart Analysis of Understating True Relationship Because of Measurement Error
6.1 Causal Model of Determinants of Profit Reliability
9.1 The Possible Values of X – Y From the Values of X and Y
9.2 The Possible Values of X – Y Showing Their Association With X
9.3 A Difference Score Variable (W) and Its Definitional Connections With Its Constituent Variables (X and Y) Masking a Positive Effect of W on Z
9.4 Constituent Variables (Divisional Sales and Divisional Costs) of Divisional Profit Masking the Positive Effect of Divisional Price Discounting on Divisional Profit
11.1 Effects on Performance Showing Strong and Weak Confounding
12.1 Correlation After True r Is Cumulatively Affected by Unreliabilities of the Cause and Profit Variables, Range Restriction, Prior Profit Confound, and Sampling Error (Upper and Lower Bounds for N = 20)

Foreword

Frank Schmidt

This book applies the methods of meta-analysis—and psychometric methods in general—to the types of business-related data that managers use almost daily to make important managerial decisions: sales revenue and profit data. The results show that these figures are often distorted by sampling error and measurement error and that taking them at face value can lead to erroneous business decisions. Up until now these psychometric and meta-analysis methods have been applied only to research studies conducted by scientific researchers (mostly academics). In applying them to business data, this book represents a new and creative step—one that I think yields important theoretical insights and practical benefits.

The pervasive theme in this book is that data are often deceptive if taken at face value without consideration of the distortions produced by statistical and measurement artifacts. This theme resonates well with me, because for the last forty years I have been concerned with probing the hidden meaning of data. The methods of psychometric meta-analysis that John Hunter and I developed have shown that many of the apparently conflicting results between different studies addressing the same question disappear when statistical and measurement artifacts are recognized and corrections are made for their effects. The upshot of this is the conclusion that the traditional naive empiricism of most researchers—the tendency to take observed data at face value without appreciating how misleading data can be—was a major error that retarded progress in understanding.

The main focus of this book is how sampling error and measurement error distort business data. Of course, there are other artifacts that also cause data distortion—examples include recording and transcription errors in data; range restriction; and dichotomization of continuous variables. However, these other artifacts are sometimes present and sometimes not present in data. But sampling error and measurement error are always present. A sample always deviates randomly from the population it is supposed to represent, and there are no measures without measurement error—that is, there are no perfectly reliable measures.

As strange as it may seem, some researchers have questioned the existence of sampling error. One of the early objections to meta-analysis maintained that sampling error was only a hypothesis, not a fact. This objection viewed as mere speculation the idea that sampling error produces variation in data. In fact, statisticians have empirically tested sampling error formulas in numerous studies (for example, by randomly drawing marbles of different colors from urns) and have repeatedly proven the accuracy of sampling error formulas. Some researchers have also maintained that some measures do not have any measurement error. But these arguments have never been able to present an example of such a measure. Even in the most exact areas of physics, the measures used have measurement error, and physicists acknowledge this by presenting confidence intervals based on measurement error to show the uncertainty in estimates computed from their data.

A key strength of this book—and of the Hunter-Schmidt meta-analysis method—is that it simultaneously takes both sampling error and measurement error into account. The reason why this has not always been the case centers on an important difference between statisticians and psychometricians. Traditionally, statisticians and statistics have focused on sampling error (their specialty) and ignored measurement error, while the opposite has been the case with psychometricians (who invented the models for measurement error). In their writings, psychometricians have typically told the reader to assume a very large sample so that sampling error is almost nil and can be ignored—the better to focus solely on measurement error. Likewise, statisticians have implicitly told the reader to assume perfect measurement so that the focus can be solely on sampling error. Thus, despite the fact that both are always present in all data, approaches that simultaneously dealt with the effects of both sampling error and measurement error were not developed until recently. Without such methods it is not possible to determine the real meaning of data.

In this book, Professor Donaldson is on the same mission I have been on since graduate school—to determine what data really mean. Meta-analysis has shown that getting the truth out of research data has turned out to be a much more complicated and challenging process than we have traditionally thought. Professor Donaldson’s book shows that this is also true of the business data used daily by managers to make critical business decisions. This is an important contribution. And this book goes further: it shows that these facts provide insights that help us to understand organizations and shed important light on organizational theories. The full implications of this work will probably take years to play out; the process will be interesting to watch.
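The urn experiments Schmidt mentions are easy to reproduce. The following minimal sketch (an illustration added here, not from the book; the urn proportion, sample size, and seed are arbitrary) draws repeated samples from an urn and compares the observed scatter of sample proportions with the theoretical standard error sqrt(p(1 − p)/n):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# An urn in which 30 percent of the marbles are red; each trial draws
# a sample of n = 50 marbles (with replacement) and records the
# proportion of reds observed.
p, n, trials = 0.30, 50, 10_000
sample_props = rng.binomial(n, p, size=trials) / n

# The sampling error formula predicts the scatter: sqrt(p(1 - p) / n).
theoretical_se = (p * (1 - p) / n) ** 0.5

print(f"Observed SD of sample proportions: {sample_props.std():.4f}")
print(f"Theoretical standard error:        {theoretical_se:.4f}")
# The two figures agree closely (both around 0.065), which is the kind
# of empirical vindication of the formulas that Schmidt describes.
```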

Preface

This book is intended for academics interested in organizational theory and for their students. For them, it tries to shed new light on traditional organizational theory issues, such as strategy and structure, by looking at these issues from the unfamiliar perspective of statistics and methodology as generators of substantive theory. It may be of similar interest to academic specialists in human resource management. Those interested in statistical techniques and methodological questions may find the applications and discussion of interest.

Included are several instances of methodological principles, from meta-analysis or elsewhere, and of the errors they describe in managerial inference-making. However, these should be seen only as one set of possibilities. Other methodological principles, not discussed here, could lead to errors in the making of inferences by managers. The central message of this book should be understood as the general idea that it is possible to create theory about organizational management from methodological principles. This creation of organizational theory from methodological principles is meant to hold widely across methodological principles, without being restricted only to some of them. Readers of this book may well have their own favorite methodological principles. Hopefully, the present book will encourage them to think about how to generate organizational theoretical propositions from their methodological principles. In that way, statistico-organizational theory is intended to be an open architecture, within which many colleagues can contribute, so that they develop it in their own ways.

In the first chapter, I offer an overview of statistico-organizational theory, showing how, according to methodological principles, errors in managerial inferences arise from data whose properties are shaped by the organizational situation. I also discuss the theoretical structure of statistico-organizational theory and abstract certain general patterns about how types of errors are liable to change in an organization over time as it grows and develops.

The origins of this book go way back in my professional life. I started doing social science research in 1966. At about that time, I was working in a company analyzing its data on how employees’ job satisfaction was affected by the job itself. I was using the Kolmogorov-Smirnov test from what was then one of the “hot” books on methodology: Sidney Siegel’s Nonparametric Statistics for the Behavioral Sciences (1956). His book contained mathematical formulas and deductive logic. It reminded me of the physics I had learned in high school. By contrast, the theory of how job design affected job satisfaction, stated in words rather than mathematics, seemed much less precise and scientific. I often pondered this disjunction between the scientific nature of the methodology and the lack of science in the theory as I rode home in the evening on the London tube.

I encountered this discrepancy again and again with my teachers at the University of Aston (then home to the Aston Group studies). They would assert a theoretical idea, but only as long as the empirical results supported it. Indeed, there was much skepticism openly expressed about many hallowed theories. There was a freely shared distrust about results of case histories and case studies because they were not rigorous enough methodologically. There was a readiness to give up believing in a theory according to whatever the multiple regression or factor analyses showed. It was as if method, not theory, reigned supreme.

Later, when I was in the Organisational Behaviour Research Group of the London Graduate School of Business Studies, there was an ongoing discussion about what it was that made the research we were doing scientific. The conclusion reached was that it was the use of scientific method, rather than the theories themselves being scientific. Our leader there—and my PhD supervisor—was Derek Pugh. He used to say to opponents in seminars, “Show me the color of your data!” Theorizing was one thing, but the acid test was the empirical data. The methods underlying data collection were thereby key. One of Derek’s favorite precepts was: “To get covariance, you must first get variance.” For a correlation to measure the association it is supposed to, there has to be adequate variation in the variables being correlated. Derek’s pithy statement typifies the concerns over methodology, and it signifies the importance of methodology: any correlation not based on adequate variance is worth little and is potentially highly misleading. Given the high value placed by my teachers and mentors on methodology, and their more casual attachment to theories, I referred to them as theoretical agnostics. Yet, clearly, they all believed strongly in methodology.

In 1977 I joined the Australian Graduate School of Management. This was a collection of scholars drawn from many disciplines, all sharing a commitment to the practice of social science. There was general adherence to tenets of rigorous research. Methodology was a core part of the language we shared. Later on, I was on sabbatical leave at the University of Maryland. One day I walked into the hallway where there was a crowd around the visiting speaker, Frank Schmidt. He said, in reply to a question: “It turns out that there are stable underlying generalizations if we just look for them properly.” This claim grabbed my attention, because for me this was the big quest.


In 1982 Frank was a visitor at the Australian Graduate School of Management. He inducted us into meta-analysis. His lectures contained a big warning about the perils of small numbers of observations and the resulting sampling error. Primary studies were rife with this problem, and meta-analysis provided the solution by aggregating many studies’ results so that its findings were based on larger and far more adequate numbers of observations. Conventionally, primary studies sought to compensate for their small numbers of observations by using significance tests, but Frank showed convincingly that these tests often misled. Another principle was, as Frank was wont to say, “All data have measurement error.” As I have reflected upon this declaration, it has come to seem to me like Einstein’s oft-repeated statement that the speed of light is a constant—recognizing this uniform limit guided his theorizing and enabled him to take many crucial steps. A third of Frank’s meta-analytic strictures was that many primary studies suffer from range restriction, so that their correlation is an understatement. This, of course, reinforces Derek Pugh’s dictum that to get covariance you have to first get variance. Only now there was an equation that told you by how much the failure to get variance reduced the correlation (covariance). This was but one example of how psychometric methods introduced more certainty and precision into the analysis. Psychometrics starts with the true correlation and then specifies quantitatively how much error is introduced, to produce the observed correlation that we actually see. These causal models allow us to be much more definite about how error works in data. This definiteness differs from most sociological discussions of error, in which error is treated as a vague trouble and discussed vaguely. Therefore, I have sought to use the psychometric approach to error wherever possible in this book.

The psychometric way of thinking combines with the statistical way of thinking to form the view that there is a deep structure to data. What is seen on the surface of the data is not the truth, which exists only deeper down. Consequently, data lie, in that they give a misleading picture. These distortions and errors can be avoided or corrected by following the procedures or formulas of methodology.

During the 1980s, I conducted a number of meta-analyses and supervised a PhD student doing one. Thus, I gained added familiarity with the principles of meta-analysis through practice and, in this way, came to internalize the ideas of meta-analysis. In particular, I came to a vivid appreciation of the three-level hierarchy in meta-analysis: the individual subject, the group of subjects that is the sample in a primary study, and the aggregation of samples in the meta-analysis. As we go up the hierarchy, errors, such as those caused by small numbers of observations, are reduced, so that the meta-analysis gives a truer view than the primary study, which, in turn, gives a truer view than the case study of the individual subject.
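The equation Donaldson refers to can be sketched in a few lines. Under direct range restriction, the standard psychometric formula (in the form used in the Hunter-Schmidt tradition) relates the observed correlation to the true one via u, the ratio of restricted to unrestricted standard deviation. A minimal illustration, with made-up values of r and u:

```python
import math

def restricted_correlation(r_true: float, u: float) -> float:
    """Observed correlation under direct range restriction.

    u is the ratio of the restricted standard deviation of the cause
    variable to its unrestricted standard deviation (u < 1 means the
    data have lost variance). Standard psychometric formula.
    """
    return u * r_true / math.sqrt(u**2 * r_true**2 + 1 - r_true**2)

r_true = 0.50
for u in (1.0, 0.6, 0.3, 0.1):
    print(f"u = {u:.1f}  ->  observed r = {restricted_correlation(r_true, u):.3f}")
# The output runs from r = .500 at full variance down toward zero as
# the variance shrinks: no variance, no covariance.
```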


of observations controls much error. Thus, the national figure is truer than the branch view, which is truer than the individual salesperson’s figure. At the lower levels, figures are based on smaller numbers of observations and are therefore more beset by the random error from sampling. These errors mean that inferences made from data are less true, and so mislead decision-makers more, at lower, not higher, hierarchical levels. Of course, small organizations are like a single branch and so inherently suffer small-numbers problems and have more sampling error. Thus, large organizations have an inference advantage relative to small organizations. This inference advantage can be put alongside any other sources of competitive advantage possessed by an organization, such as economies of scale or rare resources. Statistico-organizational theory seeks to specify which organizations will have their managers making more inference errors, as opposed to those organizations that possess an inference advantage. The theory does so where the source of the error is the number of observations. Similarly, other sources of error featured in meta-analysis, such as measurement error and range artifacts, also play their roles in organizations. That is to say, these sources of error are also present in the data in organizations from which managers seek to make inferences. Again, the theory specifies which organizational situations will have more of these problems afflicting the inferences being drawn by managers. A methodological principle drawn not from meta-analysis, but rather directly from psychometrics comes in here. Where a variable is the difference between two variables, it can have much measurement error, even if the two variables themselves have little measurement error. As I cast around for situations in which managers were dealing with this “difference score” type of variable, it struck me that profit was exactly such, being the difference between sales and costs. As the book shows, this difference score can produce dramatically large amounts of measurement error, obscuring any association between profit and another variable. Yet profit is the backbone of capitalism and the centerpiece for assessing performance inside many firms. Thus, when managers look at organizational data to find the causes or effects of profit, they can be very much misled by errors in the measurement of profit. In psychometrics, difference scores are known also to be prey to confounding. The difference score is prone to correlate with the two variables that compose it. If either of them is, in turn, correlated with another variable, then that correlation creates a spurious correlation between that third variable and the difference score variable. This can confound any true relationship between that variable and the difference score variable. Again, because profit is the difference between sales and costs, it is open to confounding via sales and costs. I had been aware of these two problems with difference score variables (heightened measurement error and confounding) for many years, but, as I constructed the present theory, their significance for the variable of profit became much clearer. In the 1980s and 1990s, I also did work (sometimes with PhD students) showing the positive effect of the fit of structure to strategy on organizational performance.


This work revealed to me the extent to which this positive effect is prey to a negative effect of organizational performance on these strategy-structure fits—because often crises of poor performance are the triggers for organizations to change from misfit to fit. Thus, reverse causality could be a confound. Since managers are much concerned with organizational performance and with its causes and effects, statistico-organizational theory emphasizes errors in it, including errors from confounds and from using profit to measure organizational performance.

Rodgers and Hunter (1991) made a meta-analysis of studies of the effects of management by objectives (MBO) on organizational productivity. This meta-analysis found that top management commitment was a moderator. Rodgers and Hunter went on to make the startling argument that the studies that used control groups were in organizations that had low top management commitment, so that insistence upon studies using control groups unwittingly produces much downward bias in estimates of the effects of MBO on productivity. Their discovery forms the basis of my argument here that organizational experiments are to be avoided. Moreover, Rodgers and Hunter argue that control groups are superfluous in a meta-analysis because it averages effects, and the average effect in the control groups is very small—so small that if the control were omitted little error would be present. This leads me to argue that averaging can eliminate confounding. Hence, averaging is a remedy for confounding by other causes of performance—usually considered the most troublesome source of confounding in social science. For managers, confounding of the figures in their organization’s data can be avoided by the aggregation that occurs as their data flow up the hierarchy. The resulting figures are averages, and I show that they are sound guides for managers. These arguments are likely to be controversial, but they are an offshoot of the meta-analytic literature. Hence, large organizations are like a meta-analysis: they aggregate data and thus mostly eliminate sampling error and confounding.

Most of the book is presented just in words, deliberately so, to make it accessible to as broad a readership as possible. However, the methodological principles are often expressed in the literature in mathematical formulas. In Part III of this book, I present those formulas that are being used in statistico-organizational theory. I also consider how the different sources of error interact to produce an overall error, by constructing an equation that combines the individual equations. Since statistico-organizational theory seeks the factors that shape these errors, many of the variables in the methodological equations are then replaced by those factors. This produces the overall theoretical equation of statistico-organizational theory.

While the book is intended to offer an explanatory theory of where errors in inference-making from data will occur, some readers will be most interested in how to minimize these errors in their own organization. For them, I offer some guidance in the penultimate chapter about how to avoid or reduce these errors.

Finally, in the Conclusions chapter, I summarize the theory from the point of view of individual managers, suggesting how they will be influenced by these errors as they look at data in their organization. In closing, I offer some reflections on statistico-organizational theory and note its limitations.

Acknowledgments

I would like to acknowledge Ron Amey, who first impressed upon me the importance of statistics in management, at the University of Aston. Similarly, a strong and influential early proponent of methodology was Derek Pugh, my PhD thesis supervisor and leader of the Organisational Behaviour Research Group at the London Graduate School of Business Studies. I also acknowledge the inspiration and companionship of Frank Schmidt, whose meta-analytic methods have shaped many of the ideas in this book. Long conversations as we walked the fall cornfields of Iowa have had a profound effect on my thinking. His forthright and clear style of writing about abstract issues has been a model for me. I thank Steven Charlier, a doctoral student in the Management and Organizations Department of the University of Iowa, for kindly making available to me data used in this book.

I should also like to thank the following colleagues who provided constructive feedback on early drafts of chapters of this book: W. Richard Scott, Sara Rynes-Weller, Edwin Locke, Richard Priem, Martin Evans, Terry Boles, Kenneth Brown, Maria Kraimer, and Scott Seibert. Other colleagues gave useful feedback in seminars at the universities of Iowa and Wisconsin-Milwaukee. Tom Powell and his colleagues at Saïd Business School, Oxford University, also gave valuable feedback. At the Australian School of Business, Markus Groth and Steve Lui gave valuable comments. My diligent doctoral students helped with the final preparation: Jane Qiu helped especially with the figures, and Ben Nanfeng Luo checked my math. The book has benefited much from all these comments, but the responsibility for the final version remains with me alone.

I should also like to thank my editors: June Ohlson, whose cogent comments and suggestions improved the coherence and readability of the final typescript and who created the equation in Chapter 1 that symbolizes the whole theory; Doug Cooper, who turned the final typescript into American English usage and improved it in other ways; Elizabeth Granda and the team at M.E. Sharpe, including Henrietta Toth and Laurie Lieb, who did a great job in the final production and editing of the book; and Harry Briggs, executive editor at Sharpe, who has been very supportive and patient throughout its long gestation. Last, but not least, I acknowledge Harry (“Harald”), research assistant extraordinaire.

Part I The Vision for a New Organizational Theory


1. Creating Organizational Theory From Methodological Principles

This book offers a new organizational theory: statistico-organizational theory. Unlike previous organizational theories, statistico-organizational theory utilizes ideas from statistics to generate a theory of organizations. Statistical and other methodological principles are used as the basis from which to derive substantive theory about the behavior of managers and organizations.

Many of the methodological principles used in this book are drawn from the methodology of meta-analysis. Meta-analysis holds that a single, “primary” study gives less valid information than the results from numerous studies aggregated together into a meta-analysis. The meta-analytic result is superior because it is based on more observations and so avoids error from small sample size in the primary studies. Analogously, a small organization is like a primary study, and it suffers the same problems of error from small sample size. In contrast, a large organization, with multiple levels in its hierarchy, is able to aggregate larger numbers of observations and so avoids error from small sample size. As the data flow up the hierarchy from sections to departments and then to the whole organization, so the number of observations grows larger, allowing more accurate estimates (Figure 1.1). The upper hierarchical levels are essentially conducting a meta-analysis, while the lower levels are like a series of primary studies. Hence, the large organization is a meta-analytic organization.

Any single, primary study is potentially subject to confounding by extraneous variables that introduce spurious correlations, distorting results. Aggregation across studies can neutralize these confounds. Similarly, the meta-analytic organization can neutralize its confounds by aggregating its data. Hence, the meta-analytic organization can reduce not only sampling errors but also biases from confounding, so that its findings are doubly accurate.
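To see in miniature how aggregation can neutralize confounds, consider the following simulation (a sketch added here, not from the book; the effect size, number of units, and noise levels are arbitrary). Each unit’s local estimate is pushed around by a unit-specific confound, but because the confounds are not systematic—an assumption the argument requires—they cancel in the average:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

true_effect = 0.30   # true productivity gain from some program
n_units = 200        # branches (or primary studies) being aggregated

# Each unit's observed effect is the true effect plus a unit-specific
# confound (local events, concurrent changes) plus sampling noise.
confounds = rng.normal(0.0, 0.25, n_units)
noise = rng.normal(0.0, 0.15, n_units)
observed = true_effect + confounds + noise

print(f"One unit's estimate:       {observed[0]:+.2f}")
print(f"Average across all units:  {observed.mean():+.2f}")
# Any single unit can be badly confounded, but because the confounds
# here are nonsystematic (mean zero) they cancel in the average,
# which lands close to the true +0.30.
```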

Figure 1.1 Truer Estimates From Larger N at Higher Hierarchical Levels

[Figure: a three-level hierarchy in which Sections (N = 10) aggregate into Departments (N = 100), which in turn aggregate into the Organization (N = 1,000).]
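A small simulation conveys what Figure 1.1 depicts (again an added illustration; the true mean, spread, and level sizes are invented). Estimates based on section-sized, department-sized, and organization-sized numbers of observations scatter around the truth in proportion to sigma/sqrt(N):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Each observation is one salesperson's result; the true mean is 100.
true_mean, sd = 100.0, 30.0

for level, n in [("Section", 10), ("Department", 100), ("Organization", 1000)]:
    # Draw 5,000 independent estimates at this level and measure how
    # widely they scatter around the true mean.
    estimates = rng.normal(true_mean, sd, size=(5000, n)).mean(axis=1)
    print(f"{level:<12} (N = {n:>4}): SD of estimate = {estimates.std():.2f}")
# The scatter falls as sd / sqrt(N) -- roughly 9.5, 3.0, and 0.95 --
# so the organization-level figure is the truest and the section-level
# figure the most misleading.
```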

By operating meta-analytically, the large organization can make superior inferences from numerical data, enabling its managers to make more effective decisions. Thus, we can begin to appreciate how the large organization can possess an inference advantage. However, this is not inevitable, and large organizations can unwittingly throw away some of their inference advantage, as we shall see. Meta-analysis also deals with errors from range artifacts and measurement error, and these also have roles in the information systems of organizations.

Methodological Principles as Foundations for Organizational Theory

Organizational theory is presently composed of many different theories that draw on various base disciplines. Some organizational theories, such as institutional theory (Scott 1995), draw upon ideas from sociology. Organizational ecology (Hannan and Freeman 1989) draws upon ideas from biology. Other organizational theories, such as transaction cost economics (Williamson 1975), draw upon ideas from economics. Yet it is possible that productive ideas with which to build organizational theory may be drawn from academic disciplines other than these traditional source disciplines. Previously, we suggested that the finance discipline might yield a new organizational theory, organizational portfolio theory (L. Donaldson 1999). Herein, we are suggesting that it may be fruitful to draw organizational theory ideas from statistics and the related area of social science methodology. Thus, the present work is part of a larger program intended to enrich organizational theory by drawing ideas from source disciplines other than the conventional ones.


Statistico-organizational theory is a departure from conventional organizational theories and so potentially offers previously unavailable insights. Moreover, it draws on ideas from statistics that enjoy a high degree of validity and coherence. Many of these ideas are already familiar to social scientists versed in modern methodology. Indeed, many researchers already believe the ideas, so that statistico-organizational theory only applies them in a new way. The key crossover attempted by statistico-organizational theory is to take principles of methodology and use them to generate substantive theory about organizations.

The central idea of statistico-organizational theory is that the pitfalls that exist whenever social scientists examine their research data are also present whenever managers examine data in their organization. Therefore, the principles of social science methodology can be used to predict errors of managers in making inferences from numerical data. Statistico-organizational theory analyzes how the data that organizations make available to their managers shape the inferences managers make, thereby influencing managerial decision-making. The theory seeks to develop a series of propositions about when managers who are using numerical data will make errors. The theory also seeks to specify the conditions under which those errors are likely to be most egregious. In this way, statistico-organizational theory seeks to reverse a neglect that has characterized organizational theory to date—namely, neglect of the informational content of organizational control systems. The propositions of statistico-organizational theory offer illumination on a broad range of organizational issues, such as strategy, structure, human resource management, and franchising. The discussion will be presented in a nontechnical way because the emphasis is on the concept of deriving theory from methodological principles and the theory’s potential broad applications to understanding organizational management.

Inferences From Data by Researchers and Managers

Many social science researchers spend long hours poring over quantitative data. They use statistical tools, such as significance tests, to help them avoid making erroneous inferences. They use these statistical tools to separate information from noise. A core premise of the present work is that managers face the same task when they look at quantitative data in their organization. The researcher looks at study data to try to better understand how the world works. Likewise, the manager looks at data inside the organization to try to understand how it is working and what is happening in its environment. The manager tries to make valid inferences from data and tries to separate information from noise. This managerial problem is not merely an analogy of the researcher’s problem. Managers face the same problems as researchers in many respects. Both face the problems that inhere in making inferences from numerical data. The potential traps, such as errors from looking at unreliable data, are present in any numerical data, whether in the hands of researchers or managers. The statistical properties of data give rise to problems that are inherent in numerical data. Thus, considerations that apply to researchers also apply to managers. The methodological principles that apply when a researcher looks at data in a study also apply when a manager looks at data about the organization and its environment. Hence, methodological principles in social science must have implications for managerial decision-making.

Researchers have used statistical theory over the years to create a set of methodological principles that help the researcher make valid inferences. These include statistical theories, such as the law of large numbers, and methodological issues, such as the sources of error in measurement. These issues must hold also when managers look at organizational data. Therefore, to better understand managerial decision-making, we must theorize the processes that govern managers when they make inferences from organizational data. This involves an analysis of many facets of the data in organizations, such as the number of observations, the error in measurement, and so on. It also leads to an analysis of how the data presented to managers are shaped by the organizational structure. Again, it leads to an analysis of how the data in an organization are shaped by other characteristics of the organization, such as its size, and by characteristics of the environment of the organization, such as the size of the national population.

A core idea of statistical inference is that what appears to be true may be false. A manager can therefore fall into the trap of making a false inference from the data. This wrong message, in turn, feeds into managerial decision-making. Knowledge of statistical and other methodological principles enables prediction of when managerial inference errors will be made and when they will be most egregious. This in turn allows prediction of when data-based inferences will produce mistaken managerial decisions.

Foundations of the Kuhnian Research Paradigm

A fundamental issue in social science is the search for foundations of knowledge—that is, for valid premises from which to generate theoretical propositions. The argument of statistico-organizational theory is that the principles of social science methodology provide a secure foundation from which to develop valid theory about some aspects of social organization. Statistics is much involved in helping scientists make the soundest inferences possible from their data. For example, estimates of parameters are more valid, the larger the number of cases in the sample. Also, estimates of parameters are more accurate if the data are reliable through having little measurement error. These basic statistical ideas are used in established social science methods to deduce many statistical procedures that form a coherent intellectual structure. This structure allows reasoning involving mathematical formulas and precise quantity estimates. The structure displays a degree of rigor that much of social theory can only envy. Social scientists are trained in these quantitative techniques, such as statistical testing, and become familiar with the underlying statistical theoretical ideas. These ideas (such as the importance of large numbers in sampling) function as accepted principles on which methodological practices are based. Thus, much of the day-to-day work conducted by social science researchers when devising studies or processing data takes these statistical ideas as axiomatic.

In contrast, the propositions derived from organizational theories are treated by many academic researchers as mere hypotheses, to be tested through the methodological principles. The theories are fallible and, indeed, may be shown to be false, possibly leading to their abandonment, when the methodological principles are applied. Thus, the truth or falsity of substantive theories is ascertained against the methodological principles that are the criteria for validity. Kuhn (1970) argues that scientists normally work in a paradigm, a set of assumptions about what constitutes valid knowledge, which the scientists adhere to without challenging these assumptions. For many social scientists in organizational research, this unquestioned Kuhnian paradigm is not a theory but, rather, the set of methodological principles that are used to test the theories. Indeed, many working social scientists entertain a degree of skepticism about substantive organizational theory while believing in the methodological principles. These methodological principles are respected and treated as fundamental truths. If methodological principles are so strong and substantive theories are relatively frail, why not construct substantive theories from methodological principles?

Statistico-organizational theory focuses attention upon the statistical properties of data, such as how the numbers of observations shape errors of inference. These statistical properties of data are integral to the organization (such as its size) and its environment (such as the variability in industry sales). Thus, the statistical properties of the situation confronting the organization are determinants in statistico-organizational theory. These statistical properties are located in the data about the organization and its environment that are visible to organizational managers inside the organization. These organizational data are objective facts that constrain the managers who are making inferences from them. In that sense, the present analysis continues the tradition of positivist theorizing in organizational science (Burrell and Morgan 1979; L. Donaldson 1996; Hannan and Freeman 1989).

It is a particularly opportune time to be applying methodological principles in the new way stated here because recent advances in methodology have led to the crystallization of a philosophy that data have a deep structure. This philosophy implies a strong form of the argument that an analyst can be misled by the initial impression given by data (Hunter and Schmidt 2004). In social science research, this awareness leads to the adoption of the methodological techniques of meta-analysis in order to reveal the true, underlying picture (Hunter and Schmidt 2004). The implications for managers are that in order for them to see the true picture, certain conditions would need to be met. Yet some organizations are unlikely to readily meet these conditions because of inherent situational limitations, such as small organizational size. Again, some organizations could meet these conditions but fail to do so because of managerial practices, such as the way the organization is structured or how performance is measured, so that managerial inferences have become infected with error.

We seek to construct a theory of how the properties of organizational data affect the validity of managerial inference and the consequences. That is to say, we are interested in the pattern of managerial and organizational behavior that results from organizational data. And we are also interested in antecedents of the organizational data—that is, in the causal variables that shape organizational data, such as organization size. Thus, we wish to better understand how decision-making and managerial action are formed by data whose properties and pitfalls are specifiable by statistical theory. In this way, we can create a statistical theory of organizations that can shed light on important aspects of organizational life and death beyond that available from existing organizational theories.

What the Theory Is Not

Inductivist, or grounded, research seeks to develop theory from data, by finding patterns in empirical evidence and then devising a theory from them. The present enterprise is completely different. Statistico-organizational theory is derived from methodological principles. Thus, it is deductive and theory-driven. Statistico-organizational theory is not empirically driven, nor inductive.

Statistico-organizational theory is not a methodology; rather, it uses methodological principles to construct an organizational theory. The role of statistics in statistico-organizational theory is not to provide a method for analyzing data. Instead, statistical (and other methodological) ideas are used as the premises of a theory of organizational management. There is no presumption in statistico-organizational theory that managers do, or should, base all their decision-making on quantitative data; rather, the argument is that when they do so, certain errors are liable to arise. Prescriptively, managers may sometimes be advised to follow statistical procedures to reduce errors of inference, so that statistics is being used normatively. In contrast, the intent of the present work is positive: to use statistics to explain where managers will make erroneous inferences from data.

There has been a tradition of empirical study of biases in human decision-making. In particular, it has been shown that human beings have cognitive biases and often fail to grasp fundamental statistical ideas or fail to use them in a logical way (e.g., Janis 1972; Kahneman and Riepe 1998; Staw 1981; Vecchio, Hearn, and Southey 1996). There is also a literature about managerial decision-making as being affected by political processes, such as the pursuit of self-interest (e.g., Pettigrew 1973), and about how structural position in the organization can affect perceptions. Accepting that these processes occur, our concern is different. Our concern is to examine the role of another set of factors that bias managerial judgments. Our focus is upon the data that are presented to managers by the organization and its environment, and the resulting statistical properties of those data that then influence managerial decisions. Thus, the central issues are the factual properties of organizational data and the inferences managers make from them. Hence, even if managers are completely rational, they will still make the errors discussed here. Thus, without invoking inherent human biases, inability to be a natural statistician, or political self-interest, managerial decision-making is still subject to the problems that are identified by statistico-organizational theory.

Information-Processing and Data

The theory of organizational structure has been dominated by the information-processing and administrative decision-making school. From the foundational writings of Simon (1957) in the 1950s onward, the major paradigm for much organizational theory has been the organization as a decision-making engine. Following Simon, an organization is conceived of as a system for processing information and making decisions. Within this paradigm, Galbraith (1973) has articulated the central role of information-processing, which has been pursued by subsequent organizational theorists (e.g., Egelhoff 1982, 1988, 1991). The Galbraithian model focuses upon the need to efficiently schedule operations throughout an organization that contains several interdependent parts and thereby the need to coordinate those parts. This coordination is achieved through mechanisms such as plans and may be affected by inventory buffers.

Organization theorists have been much concerned with the effects of uncertainty in such a system. Uncertainty reduces the ability to predict and to make estimates accurately. This prevents clear choices between options and reduces the effectiveness of planning. The main solution to uncertainty is to obtain more information. Galbraith’s central idea is that when a decision is uncertain, more information must be processed to resolve the uncertainty. This insight leads to an emphasis on how much information is available in each part of the organization, through hierarchy of authority, procedures, and the like. The discussion embraces enhanced mechanisms for processing information, such as investment in vertical information systems—for example, computers to process more information and more quickly transmit it to a central planning office (Galbraith 1973). The obverse of reducing uncertainty is taking steps to cope with substantial uncertainty, by carrying inventories and dividing operations into autonomous subunits so as to isolate them from each other’s fluctuations.

While this discussion of uncertainty and organization design by Simon (1957) and then Galbraith (1973) is seminal, it has focused mainly on the amount and location of information. In so doing, the theory overlooks certain implications of its own insights. Uncertainty does, indeed, require more information for its reduction. However, for data to reduce uncertainty, the data must allow people to extract valid information. The data provide an increase in certainty only if they yield accurate estimates of the underlying variables. That is, the parameter estimates, such as mean values, need to be accurate. Otherwise, extensive data collection by the organization only yields noisy data that cannot reduce the uncertainty. Therefore, our concern, in a sense, goes back a step, to ask how information is extracted from data.

The approach taken here focuses therefore on how the properties of data affect the inferences that managers draw from them. The control systems of the organization provide the managers with much of the data they receive. For example, sales performance reports generate sales data, accounting systems generate cost data, and human resource management systems create appraisals of managerial performance. In his seminal case histories of the development of the modern corporation, Chandler (1977) charts the crucial role played by control systems, such as accounting and forecasting systems. He shows how these control systems were developed as part of the changes in scale, strategy, and structure that brought into being the modern corporation. Thus, the vertical integration of specialist-function firms, when they were combined to form the multifunction firm, was premised, in part, on the development of flows of numerical data and statistics that allowed sales forecasts to form the basis for production volumes (Chandler 1977). Subsequently, the integration of multiple businesses into the diversified multidivisional structure was facilitated by the measurement of the profitability of each division, as the main accountability structure, in place of cost centers (Chandler 1962). Organizational theory has followed Chandler in being much concerned with organizational strategy and structure (e.g., Rumelt 1974). However, organizational theory has mostly neglected to theorize about how the informational content of control systems shapes managerial decision-making. We seek to fill some of that lacuna.

The Theoretical Structure of Statistico-Organizational Theory

In this section, we bring out the theoretical structure of statistico-organizational theory, emphasizing its causal structure and the roles of methodological principles and situational variables. The methodological principles are at the heart of the theory, providing the core explanatory mechanism of how errors are produced from data. Statistico-organizational theory specifies the situations in which these errors will arise through the way the organizational data that managers examine are affected by the characteristics of the organization or its environment.

In social science, the role of any theory is to explain some phenomena by certain causes (Kincaid 1996). The theory may be stated as a causal model. It refers to causes and explains their effects by the intervening processes that connect the causes to the effects. Transaction cost economics explains vertical integration by the cause of asset specificity, which works through intervening processes such as opportunism (Williamson 1975). Population ecology explains that population density causes an increase in the founding rate of new organizations, through the intervening process of enhanced legitimacy (Hannan and Freeman 1989). Statistico-organizational theory likewise explains effects by causes that operate through intervening processes.

Figure 1.2 The Theoretical Structure of Statistico-Organizational Theory

[Figure: a causal chain running from situational variables to data properties, which operate through methodological principles to produce errors in inferences.]

In statistico-organizational theory, errors in inferences made by managers from data in their organization are explained by causes operating through intervening processes. The causes of the errors are the properties of the data, which operate through methodological principles, as the intervening processes. The data properties are shaped by situational variables in the organization or its environment. Hence the ultimate causes of the errors are these situational factors, while the data properties are the more proximal causes. The central scenario in statistico-organizational theory is the manager, or staff analyst, in an organization looking at numerical data about that organization or its environment and trying to infer how an individual scores on a variable, or what causes what. Such inferences are subject to errors. These are the phenomena that the theory seeks to explain. The causes of these errors are the properties of the numerical data the manager is examining, which in turn are caused by the situational variables that shape the data. These causes work through the processes specified in the methodological principles to produce the errors. Schematically (see Figure 1.2), statistico-organizational theory may be stated by saying that situational variables, S, create properties in the data, D, that, according to methodological principles, M, produce errors, E. This theoretical schema may be summarized in the formula E = S(D) × M. Errors, E, are caused by the situation, S, that determines properties in the data, D, which are mediated through the methodological principles, M, to create the errors, E. The symbols S(D) signify that the situation, S, determines the data properties, D. The multiplication sign between S(D) and M signifies that the data properties, D, interact with the methodological principles, M, to cause the errors, E. For instance, suppose an organization assesses the value of training its employees by correlating the numbers of days of training that each individual received in a year with that individual’s productivity. The result shows only a trivial correlation, leading the organization to abandon training as worthless. However, training was quite standard throughout the organization, so that all employees received four or five days per year. Therefore, the variation in training was limited. This entails the methodological principle that, to attain the true correlation, the variation in data has to be equal to the variation in the population. The amount of reduction in the observed correlation due to the limited variation is predictable by the methodological principle. Thus, the observed, trivial correlation understates the true


Statistico-organizational theory may thus be stated as a causal model. The dependent variable is errors made by managers and analysts when they make inferences from data in their organizations. The causes are the situational variables that increase or decrease these errors. The causes are characteristics of the organization or its environment. They produce errors through the intervening processes, such as the existence of sampling error, that are specified in the methodological principles. Figure 1.3 shows the causal connections between the situational variables and the errors they produce. Whereas the methodological principles themselves and the derived data properties (e.g., number of observations) are generic methodological considerations, the predictions about the situational variables that lead to errors are distinctive of statistico-organizational theory.

There are two main types of errors: bogus variability and attenuation of association. Bogus variability is apparent variation from one data set to another that is purely artifactual. For example, means will vary from data set to data set despite really being constant; similarly for correlations. Generically, bogus variability leads managers and analysts to the mental state of complexity mystification, in that the world seems to be more causally complex than their models—though this perception is illusory. Bogus variability leads to other errors, such as chance playing a role in the assessment of the performance of managers, and to cycles of dysfunctional control, which can lead to organizational mortality. Conversely, organizations in situations that create lesser amounts of bogus variability, such as large organizations in large countries, have, thereby, an inference advantage, whereas smaller organizations tend to have an inference disadvantage and, at the extreme, higher mortality rates.

Attenuation of association (e.g., correlation) understates associations such as that between organizational performance and one of its causes. Generically, attenuation of association leads managers and analysts to the mental state of causality pessimism, in that their models seem to feature causes that are weak. It leads managers to underuse those causal levers and to search vainly for other, perhaps stronger causes, when none may exist.

Bogus variability is increased by situational variables, many of which have size as a common feature. Bogus variability is positively affected by the size of the organization's products (or services) in that larger products will tend to be less numerous and so estimates will be based on a smaller number of observations, incurring sampling error as the underlying methodological problem. Country size positively affects the size of organizations in that country. Organizational size directly increases the number of observations in data inside those organizations, thereby decreasing bogus variability.

Figure 1.3 Causal Model of Statistico-Organizational Theory

[Diagram: the situational variables (product size, country size, organizational size, structural differentiation, temporal differentiation, profit centers, multicausality, profitability, homogeneity of profitability, prior performance, competition, other causes, and sales or costs) feed, with positive or negative signs, into the two core errors—bogus variability and attenuation of association. Bogus variability leads to complexity mystification and to the outcomes of chance-based performance assessment, dysfunctional control, and organizational mortality, while lower bogus variability yields an inference advantage. Attenuation of association leads to causality pessimism, underuse of causal levers, and a (possibly illusory) search for other causes.]


However, organizational size also increases structural differentiation (e.g., number of divisions), which fosters data disaggregation, so that inferences are made from fewer observations, increasing bogus variability. Thus, organizational size has an indirect positive effect on bogus variability that works to offset its direct negative effect. The fallacy of immediacy can also lead to data disaggregation, so that data can be said to be temporally differentiated, leading to the use of only the most recent observations and again increasing bogus variability. Creating multiple profit centers in an organization also disaggregates the organizational data—there being fewer observations about a profit center than about the profit of the whole organization—and so increases bogus variability. Weak correlations have more sampling error and therefore more bogus variability. Weak correlations will occur where an effect, such as organizational performance, has multiple causes (i.e., multicausality). Hence, bogus variability is raised by product size, organizational size (indirectly via structural differentiation), temporal differentiation, profit centers, and multicausality. Bogus variability is lowered by organizational size (directly). Thus, while larger organizational size, fostered by being located in a larger country, leads toward an inference advantage for the larger organizations, especially those producing small things (e.g., pencils) in large quantities, this can be undermined by the bogus variability that can come from disaggregating organizational data through structural and temporal differentiation, from multiple profit centers, and from multicausality.

Thus, size—of the country, of the organization, and of the number of products produced—works to increase the number of observations. Structural and temporal differentiation, along with the division of an organization into numerous profit centers—all of which tend to increase as the size of the business increases—work to decrease the number of observations. Thereby, size variables reduce bogus variability, while these size-affected variables raise it.

Attenuation of association is increased by situational variables, many of which have profit as a common feature. Attenuation of association is positively affected by having multiple profit centers inside the organization and so using profit as the measure of the performance of organizational subunits. Profit, being a difference score, tends to have its association with any other variable reduced, because of enhanced measurement error. Measurement error of profit tends to be greater where profitability is low, so that profitability negatively affects attenuation of association. Conversely, the reliability of profit tends to be lower where the homogeneity of profitability in a data set is high, so that homogeneity of profitability positively affects attenuation of association. Multicausality means that the causes of performance (on average) have lower correlations with performance, so that such correlations are more easily materially confounded. The fact that organizational change is often driven by poor organizational performance can produce a masking confound by prior performance on any positive effect of an organizational characteristic on organizational performance, which attenuates the positive association between that characteristic and performance.


Competition tends to reduce the range of organizational performance and the characteristics that cause it, so that competition attenuates associations between characteristics and organizational performance. Thus, attenuation of association is made greater by multicausality, multiple profit centers, homogeneity of profitability of a data set, prior performance, and competition, while attenuation is lessened by profitability.

In general, profit affects attenuation of association. Thus, attenuation of association is increased by profit centers, homogeneity of profitability, prior performance, and competition and is decreased by profitability. Moreover, in many organizations, prior profitability is a salient part of the prior performance that induces performance-driven organizational change. Again, in many organizations competition reduces profit, thereby attenuating associations involving profit, such as in managerial efforts to find the causal levers of the profitability of the organization. Hence, many of the situational variables that affect attenuation of association are profit-related.

Attenuation of association can also be affected by confounding by other causes of the dependent variable or, where profit is one of the variables being associated, by sales or costs. However, each of these confounds can be negative or positive, so that they could either depress or inflate the observed association. Therefore, they attenuate or disattenuate the association. Hence, the effects of these two confounds on attenuation of association are indeterminate, unlike the other causes of attenuation in the causal model, which are determinate. For either of these two confounds, its effect on an association could vary from data set to data set, swinging from positive to negative, contributing to bogus variability across data sets. Of course, the net effect of the determinants of attenuation of association (e.g., profitability) could also vary from data set to data set, thereby introducing some bogus variability. But because these determinants are more stable, changing in degree but not swinging from positive to negative, the degree of bogus variability they produce is limited.

Managerial Errors

In the discussion of statistico-organizational theory in this section, we will pull together numerous methodological principles to focus on broad types of errors. This broader style of discussion will end by abstracting certain overall patterns as an organization grows and develops. However, these are only trends and so supplement, but do not replace, the more precise predictions about the errors caused by specified situations that will be offered in the course of the book.

We take here as the point of departure the types of managerial errors. The effect on association (correlation) of sampling error, measurement error, range restriction, and confounding by reverse causality is shown in Figure 1.4. As the number of observations, N, increases, so bogus variability decreases. This is shown in Figure 1.4 as a cone that converges toward the true value when size is large. The observed correlation values exist in a fan above and below the true value.

Figure 1.4 Errors in Observed Correlation From Range, Measurement Error, Reverse Causality, and Sampling Error

[Diagram: the observed correlation (vertical axis) plotted against the number of observations (horizontal axis). Sampling error forms a cone, wide when observations are few (high sampling error) and narrowing as observations increase (low sampling error), converging on the true value. Range extension lies above the true value; range restriction, measurement error, and reverse causality lie below it.]

This error decreases with size, but at a decreasing rate as size increases, which is shown as the curved walls of the cone in Figure 1.4. Size is the number of observations on which the correlation is based. It forms the horizontal axis in Figure 1.4.

Measurement error, range restriction, and confounding by prior performance (reverse causality) all reduce the observed association below its true value. (Range extension increases the observed correlation above its true value.) The higher are unreliability, range restriction, and the spurious correlation from prior performance, the greater is the understatement of association (correlation). The value of the correlation is shown as the vertical axis in Figure 1.4. The more the observed correlation is below the true correlation, the greater is the attenuation of the correlation. Conversely, if the correlation is not attenuated, so that it has its true value, then this is disattenuation. The lower are unreliability, range restriction, and the spurious correlation from prior performance, the greater is the disattenuation of association (correlation).
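Both dimensions of Figure 1.4 can be illustrated with a short simulation. The numbers below (a true correlation of .30, a reliability of .70) are invented for illustration only; the sketch shows the sampling-error fan narrowing as the number of observations grows, and measurement error shifting the whole fan downward:

```python
import numpy as np

rng = np.random.default_rng(1)
true_r = 0.30

def sample_r(n, reliability=1.0):
    """One observed correlation from a sample of size n, optionally with
    measurement error added to both variables."""
    x, e = rng.standard_normal((2, n))
    y = true_r * x + np.sqrt(1 - true_r**2) * e
    if reliability < 1.0:
        noise_sd = np.sqrt(1 / reliability - 1)  # yields the stated reliability
        x = x + noise_sd * rng.standard_normal(n)
        y = y + noise_sd * rng.standard_normal(n)
    return np.corrcoef(x, y)[0, 1]

for n in (20, 100, 1000):  # the cone narrows as N grows
    rs = [sample_r(n) for _ in range(2000)]
    print(f"N={n:4d}: middle 95% of observed r runs "
          f"{np.percentile(rs, 2.5):+.2f} to {np.percentile(rs, 97.5):+.2f}")

# Unreliability (.70 in both measures) attenuates the whole fan:
rs = [sample_r(1000, reliability=0.70) for _ in range(2000)]
print(f"mean observed r with unreliability: {np.mean(rs):.2f}  (.30 x .70 = .21)")
```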


Figure 1.5 shows the two dimensions of the number of observations and disattenuation, which for simplicity are combined to give four cells, and also the resulting errors produced in each cell.

Figure 1.5 Errors in Causal Relationships From Number of Observations and Attenuation

[2 × 2 diagram: number of observations (small, large) against attenuation-disattenuation. The cells are: bogus variability (small N, disattenuation); valid inference (large N, disattenuation); bogus variability and underestimated causal relationship (small N, attenuation); underestimated causal relationship (large N, attenuation).]

A small number of observations produces more bogus variability, while attenuation produces more underestimated associations. A small number of observations and attenuation produce both more bogus variability and more underestimated associations. Conversely, data that have both a large number of observations and disattenuation produce associations that have their true value and hold across data sets, so that a manager can make a valid inference from the numerical data.

Figure 1.6 shows the two dimensions again, and this time the managerial states produced by these errors. A small number of observations produces complexity mystification. Attenuation produces causality pessimism. A small number of observations and attenuation produce both complexity mystification and causality pessimism. Data that have both a large number of observations and disattenuation produce causality integrity, in that a manager can make a valid inference from the numerical data about causality, including its strength and generalizability.

Small organizations tend to have small numbers of observations and so are prey to bogus variability. Larger organizations tend to have less of this error. But growth in organizational size often eventually leads to diversification and resultant divisionalization (Rumelt 1974), which tend to lead to disaggregation of organizational data, so that some bogus variability is retained. Diversification and divisionalization lead also to the use of profit to measure divisional performance, which reduces reliability, leading causal effects to be understated.


Figure 1.6 Managerial Misperceptions From Number of Observations and Attenuation

[2 × 2 diagram: number of observations (small, large) against attenuation-disattenuation. The cells are: complexity mystification (small N, disattenuation); causality integrity (large N, disattenuation); complexity mystification and causality pessimism (small N, attenuation); causality pessimism (large N, attenuation).]

Thus, as organizations grow in size and then diversify, they tend to move from being small with bogus variability to being large and diversified with understated associations. They thereby shift from one of the types of error (bogus variability) to the other type (understated association). In terms of Figure 1.5, this move is from the top left-hand corner to the bottom right-hand corner. Thus, as organizations change over time, they will initially grow in size and thereby start to decrease bogus variability. Eventually, the size growth will lead to diversification that decreases reliability, leading to underestimated associations. Thus, the typical trajectory will not include being in the benign quadrant of causality integrity in the upper right-hand corner of Figure 1.6.

An organization can escape moving from one error to another if it grows but does not diversify. Then its growth leads it to have a large number of observations, so that bogus variability is avoided, while retaining a functional structure that avoids using profit to assess its organizational subunits, so that underestimation of association is also avoided. In this way, both complexity mystification and causality pessimism would be avoided and the organization would attain causality integrity. But given that large organizations typically diversify—in either products or services, or geographies—this benign outcome is not likely to be usual.


In terms of Figure 1.6, the typical trajectory is from complexity mystification to causality pessimism. Managers will initially perceive the world as more changeable, and causality as more complex, than they really are. Then managers will come to see more stability and more generalizability, but perceive that the causes are weak, and so underuse the causal levers they know while looking for other causes.

Thus, initially the organization is in start-up mode, small in size, with managers seeking to master the most elementary aspects of the organization: its product, production, marketing, and financial health. While seeking to solve these fundamental puzzles, they will look at numerical data containing little stability and few signs of robust cause-and-effect relationships on which they can depend as guides for effective decisions. Thus, their task as managers will be beset by uncertainty, and examining numerical data may only increase their sense of uncertainty. There will be fruitless searches to ascertain the conditions under which each cause-and-effect relationship holds true. If, nevertheless, managers make enough effective decisions for their small organization to survive and grow, bogus variability will subside, so that inexplicable variability in the effects of causes and temporal instability will both decrease. Then, genuine temporal constancies and trends will become clearer, as will the generality and dependability of cause-and-effect relationships, including those involving sales and costs. This allows learning from numerical data, helping managers to make informed decisions and increase effectiveness.

When the now large organization diversifies and divisionalizes into multiple profit centers, associations with performance will become increasingly underestimated. There will be a sense that once-dependable causes of performance are becoming feebler. There will be disillusionment with familiar paradigms as no longer delivering the benefits they once did, so that their techniques will be less utilized. Instead, there will be a search for new paradigms and new techniques. But this may be pointless, because the existing techniques may in fact be delivering, and there may be no more effective causal levers. Thus, at maturity, the managerial culture may be jaundiced and entertain a succession of fads, brief enthusiasms for the new that quickly fade in their promise. However, the size, wealth, dominant market share, and reduced risk of the large, diversified corporation will usually allow it to continue in being.

Having briefly stated the overall theoretical causal model and sketched some broad patterns, we will now introduce the individual error sources featured in statistico-organizational theory.

Overview of Statistico-Organizational Theory

The central idea of statistico-organizational theory is that organizations are statistical machines. Their workings are governed by the universal principles of statistics that pervade all of their operating processes. Statistico-organizational theory takes ideas from statistics and social science methodology and uses these as the axioms from which substantive theory about organizations is developed.


This book focuses mainly on four principles, each of which pinpoints a factor or factors determining whether numerical data give managers correct signals about the true state of their organization and its environment. These four principles are the law of small numbers, measurement error (that is, unreliability), range artifacts (restriction and extension), and confounded causation. There are, however, other methodological principles and ideas that could be used to generate theoretical propositions about organizations and their managers. Different methodological principles and ideas may also be used in the future to produce a more comprehensive statement of statistico-organizational theory. Since many social scientists are well versed in methodological principles, it is likely that many of them can add to statistico-organizational theory and, thereby, refine it. In that sense, the present book is simply a preliminary statement of the theory that puts forward the main concept—that is, using methodological principles to generate substantive theory—and illustrates this for certain methodological principles. Borrowing from the computer industry, we may say that the theory has an "open architecture" in that, potentially, many scholars could contribute to its development. The present four methodological principles, however, have been foci of methodological concerns in social science and so are not incidental. They pertain to the philosophy of data that underlies one contemporary stream of social science research methodology, meta-analysis, which guides the present book. This is the philosophy that data have a deep structure, as will be explained in the next chapter.

We will now give an overview of the main ideas of statistico-organizational theory, which will then be discussed more fully in the ensuing chapters.

The Law of Small Numbers

From the law of large numbers in statistics we take the idea that estimates of parameters by managers in organizations are more valid when the data consist of a large number of observations. Conversely, when inferences are made from few observations, errors are more likely. The smaller the number of observations, the greater is the error in the estimate derived from it. This is the law of small numbers (Schmidt, Hunter, and Urry 1976; Tversky and Kahneman 1971). Managerial decision-making about marketing and production strategies, as examples, will be more erroneous, the smaller the number of cases, because the inferences on which they are based are more likely to be incorrect.

The law of large numbers states that the larger the number in the sample from a population, the closer the value of a statistic of the sample (e.g., the sample mean) approximates the parameter value of the population (e.g., the population mean). The error from small numbers of observations is a random error, so that the observed value is randomly above or below the true value and the magnitude of the error varies randomly. There is (on average) less random variation from sample to sample in large samples than in small samples.
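The law is easy to see numerically. In the hypothetical sketch below (all figures invented), the true mean daily sales of a product is 100; estimates based on a handful of days scatter widely, while estimates based on many days cluster tightly around the truth:

```python
import numpy as np

rng = np.random.default_rng(42)

# True mean daily sales = 100, standard deviation = 20 (invented figures).
for n in (5, 50, 5000):
    means = rng.normal(100, 20, size=(1000, n)).mean(axis=1)  # 1,000 samples
    print(f"N={n:4d}: middle 95% of sample means runs "
          f"{np.percentile(means, 2.5):6.1f} to {np.percentile(means, 97.5):6.1f}")
# N=   5: roughly  82 to 118
# N=  50: roughly  94 to 106
# N=5000: roughly  99 to 101
```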


The larger the sample, the more likely that the mean obtained in the first sample will be found also in the second sample. Conversely, there is more random variation from sample to sample in small samples than in large samples. The smaller the sample, the less one can rely on the statistic (e.g., mean) obtained in the first sample being found in the second sample. Thus, the smaller the number of observations in a sample, N, the higher the probability that the inferences from that sample will be erroneous. This principle applies to any quanta: pencils, gaskets, sales, books, people. Every time a manager estimates a statistic based on a small number of observations (i.e., small N), random error intrudes. The implications for organizations are pervasive.

This problem of small numbers can arise in many ways. Countries with small populations tend to have small-sized organizations. Small organizations have fewer sales and fewer customers or clients than large ones (Blau and Schoenherr 1971). But even larger organizations may unwittingly decimate the data they present to their managers through decentralization into small branches. Also, overenthusiasm for very recent data can throw away useful observations from prior periods. In these and many other ways, managers, sometimes unknowingly, find themselves facing data from too few observations, so that their inferences are erroneous and lead to wrong decisions. The priority here is not to lament this process, but to understand how it leads to predictable patterns of behavior that enable us to better understand organizations.

Meta-analysis has shown that small numbers are a particularly major source of errors in social science (Hunter and Schmidt 2004), and so our discussion will emphasize small numbers as a potential source of error for managers. In meta-analysis in social science (and other areas such as medicine and biology), each individual study of a topic is combined with others, so that parameter estimates can be based on large Ns. Thereby, these estimates are sounder than those from any individual study, which are based on smaller Ns. Thus, aggregation across studies produces larger Ns and so a truer estimate (Hunter and Schmidt 2004). In meta-analysis, the variations in statistic values from study to study can be shown to result from the random variation due to the small Ns of the studies, rather than from real variation. Therefore, the aggregate value is the truer value (Hunter and Schmidt 2004). Thus, meta-analysis leads to the philosophy that data should be combined to produce a bigger N. It leads also to distrust of variations that may appear to reflect some real difference in locales but are often just random variation. Whether apparent variation between locales is genuine can be ascertained only by an analysis of the aggregate data. Thus, in research, the results from one study are less truthful than the results from studies combined into a meta-analysis. Meta-analysts sit atop a pyramid of primary studies, a vantage point from which they can discern the truth more accurately.

Large organizations are like meta-analytic machines. The managers of large organizations sit atop a vast pyramid, which produces, at its top, data with large sample sizes. Thus, most of the random variation has already been extinguished from the data looked at by top managers, simply through combining observations from all the different parts of the organization.
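A minimal sketch of this aggregation logic, in the bare-bones Hunter-Schmidt spirit (all numbers invented): fifty small "studies"—or branches—each estimate the same underlying correlation; the individual estimates scatter widely, the N-weighted aggregate recovers the truth, and the observed spread is close to what sampling error alone would produce:

```python
import numpy as np

rng = np.random.default_rng(2)
true_r, n, k = 0.25, 30, 50  # one constant correlation, 50 samples of 30

rs = []
for _ in range(k):
    x = rng.standard_normal(n)
    y = true_r * x + np.sqrt(1 - true_r**2) * rng.standard_normal(n)
    rs.append(np.corrcoef(x, y)[0, 1])
rs = np.array(rs)

rbar = np.average(rs, weights=np.full(k, n))    # N-weighted mean correlation
observed_var = rs.var()
expected_var = (1 - rbar**2) ** 2 / (n - 1)     # variance from sampling error alone

print(f"individual estimates run {rs.min():+.2f} to {rs.max():+.2f}")
print(f"aggregate estimate: {rbar:.2f} (true value {true_r})")
print(f"observed variance {observed_var:.4f} vs. sampling-error variance {expected_var:.4f}")
```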


In contrast, managers of smaller organizations are faced with small-scale data sets, consequently full of puzzles and apparent trends, many of which are illusions—that is, simply artifacts produced through small numbers. The task of these managers of small organizations is akin to that of the social scientist trying to infer causal relations from a small sample of, say, twenty subjects—in fact, it is the same task. Thus, large organizations are meta-machines and therefore are inherently superior to smaller organizations in the informational aspects of organizational learning, adaptation, and management.

Small organizations are closer to their customers, but so close that they cannot see the forest for the trees. The manager of a small firm receives up-to-the-minute data on a rich variety of aspects, such as by dealing with individual customers and talking to production workers—more so than any one manager in a large firm, who tends to be remote from the "firing line." The problem for the manager of the small firm is making sense out of the jumble of facts, in their variety of formats and small numbers of observations. In contrast, the senior managers in the large firm review tables of statistics that aggregate thousands of data points in standardized format, allowing comparison. From these data, clear, meaningful patterns and trends can be identified, allowing causal inference, prediction, and planning. For this reason alone, other things being equal, large firms will tend to be the trendsetters in reacting to environmental changes. Smaller firms may match their larger competitors and so follow their lead. Hence, some part of the leadership role of major firms in an industry, sometimes ascribed to oligopoly dominance or professionalized management, resides in large firms' inherent ability to make better numerical analyses, a capacity that small firms cannot have.

While a large organization starts with the inherent advantage of having a large number of observations, it may dissipate this advantage by disaggregating its data. This can occur through aggregating data only within organizational subunits, such as divisions or departments or branches, so that the number of observations is only for that subunit. Therefore, the base for any statistic calculated from the data is not the whole organization but only the subunit. This means that there will be more random variation in the statistics of a subunit than in those of the whole organization, so the figures are more likely to mislead managers into making the wrong decision. Therefore, the organization needs to have a structure in which data are aggregated at the center. These aggregate data can then be used to validly determine which factors really vary by organizational subunit (e.g., locale), so that decisions about them may be decentralized down to those organizational subunits. In that sense, the organizational structure has to centralize before it decentralizes.

A conventional human resource management (HRM) development practice, for instance in a bank that has branches, is to appoint an individual to be the manager in charge of a small branch. If that branch's performance over a period of, say, two years is high, then the individual is judged to be an effective manager and is promoted to manage successively larger branches. If, however, that branch's performance over the period is poor, then the individual is judged to be an ineffective manager and possibly outplaced.


However, because smaller numbers of business operations (e.g., commercial loans) occur in a small branch, there is more random variation in the performance of a small branch than of a large branch. Thus, the attribution by upper managers that lower branch managers are effective, or ineffective, is contaminated by the luck of the small branch. Hence, the subsequent decision to promote has elements of a lottery. If, subsequently, the branch managers ascend the hierarchy, their performance will be that of larger units and so luck will decline as the arbiter of their fate—if they survived the early trials. Thus, the career success of managers is more determined by luck in the early, rather than in the later, stages of their career. The smaller the number of observations from which to assess managerial performance, the more prone to chance will be the HRM systems of management development and internal selection.

There is an implication of the small numbers problem for organizational control. Upper-level managers tend to intervene if they see a problem of poor performance, but this can be illusory, caused by small sample size. At each lower level of the hierarchy, the number of observations (e.g., sales) is smaller and therefore the sales figure is fuzzier (i.e., more random). Therefore, there is more likelihood that, by chance alone, performance will appear poor when it really is not. Thus, if upper-level managers initially suspect that there may be a performance problem lower down the hierarchy and if they inspect the sales data broken down by organizational subunits, the investigation will tend to falsely throw up cases of poor performance. This apparent poor performance is just an artifact of the disaggregation of organizational data that is inherent in analyzing the performance of the different subunits. Nevertheless, the false poor performance figure will confirm the initial suspicion of upper managers that there is a problem requiring their intervention. They may therefore erroneously blame, punish, or countermand lower-level managers, upsetting trust and the smooth working of the organization; but much of this perception of poor performance and of the need for rectification is an illusion, created by small samples from disaggregating the organization's data. Having had their initial suspicions confirmed, upper-level managers may then break down the figures into those from even lower hierarchical levels, which are based on even smaller numbers of observations, and so will produce even more random error and thus identify cases of even poorer performance that are even more false. In this way, pathological cycles of analyzing, blaming, and intervening can occur, cascading down the organization, growing in magnitude, and becoming increasingly detached from reality.

Some decisions are taken by having each branch, or franchise, decide whether a new approach works or not. Then each branch essentially "votes" by saying whether the approach worked or not, and the "votes" determine the organization's decision; but each branch is much smaller than the organization, and so there is more random variation in the results for branches than in the result for the organization. Hence, voting combines the experiences of branches in a way that distorts the true picture.
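A simulation makes the lottery vivid. In the hypothetical sketch below, every branch manager is equally skilled—each loan succeeds with the same probability—and branches differ only in how many loans they make; the apparent best and worst performers are nevertheless almost all small branches:

```python
import numpy as np

rng = np.random.default_rng(3)

# Identical managers: every loan succeeds with probability .80.
p, small, large = 0.80, 25, 2500           # loans per small/large branch
loans = np.array([small] * 50 + [large] * 50)
success_rates = np.array([rng.binomial(n, p) / n for n in loans])

order = np.argsort(success_rates)
print("loans made by the 5 'worst' branches:", loans[order[:5]])
print("loans made by the 5 'best' branches: ", loans[order[-5:]])
# Both extremes are almost always the 25-loan branches: their success
# rates scatter far more widely around .80, so judging managers by
# small-branch results largely rewards and punishes luck.
```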


In contrast, aggregating all the data across the whole organization would yield a much more accurate picture, leading to better decisions.

Because small N is a major cause of error in numerical analyses, two chapters of this book are devoted to its discussion (in addition to the discussion of small N in Chapter 2). Chapter 3 analyzes the importance of the number of observations and organizational size. It brings out the implications for organizational mortality and for international comparative advantage in managerial inference-making. Chapter 4 discusses the fallacy of disaggregation, which seems to promise greater accuracy, but breaks up large N data and so introduces errors. The chapter brings out the implications of small numbers problems for cycles of dysfunctional control, human resource management, and problems from aggregating previous inferences. Another way in which data disaggregation can arise is through the fallacy of immediacy. Chapter 4 also takes up the implications of data disaggregation for organizational structure.

Measurement Error

Again from social science methodology, we can take the concept of measurement error (Hunter and Schmidt 2004). This arises from several origins, and two in particular will be highlighted here, because of their likelihood of occurrence in organizations and their potential severity for managerial decision-making. The two causes of measurement error are difference scores and single-item indicators, both of which inflate measurement error. A reliable score is measurement with little error, so unreliability and measurement error can be used as synonyms.

A major source of measurement error is difference scores. A difference score is produced by subtracting one score on a variable from another score on another variable (or that same variable). This difference score can have lower, often much lower, reliability than the variables from which it is derived (Johns 1981). Therefore, estimates of parameters involving difference scores are often (much) less reliable than the original variables. Hence, inferring the true parameter from difference-score data is hazardous and error-prone. Organizations, however, can generate data that are difference scores. Indeed, organizations may use difference scores as mainstays of their control systems, oblivious to the errors in inference that this introduces. For instance, profit is a difference score, being the difference between revenue and costs. Therefore, profit is subject to inherent unreliability, so that it is error-prone. Nevertheless, profit is widely used in internal organizational control systems as a basis for performance measurement, accountability, rewards, managerial succession, investment, strategy setting, and so on—despite the fact that its unreliability makes it a hazardous measure on which to base such decisions.

Not only profit but also other measures, often used by managers and directors of an organization to measure its performance, can suffer from low reliability. Many of these measures are difference scores. Profitability measures—that is, ratios of profit to other financial variables such as assets or sales—are as unreliable as profit.
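The standard psychometric formula for the reliability of a difference score (of the kind discussed by Johns 1981) shows how severe this can be. The figures below are invented for illustration: even when revenue and costs are each measured quite reliably, their high correlation leaves profit far less reliable than either component:

```python
def difference_score_reliability(sd_x, sd_y, rel_x, rel_y, r_xy):
    """Classical reliability of the difference D = X - Y: the true variance
    of the difference divided by its total variance. r_xy is the observed
    correlation between X and Y (errors assumed uncorrelated)."""
    cov = r_xy * sd_x * sd_y
    true_var = rel_x * sd_x**2 + rel_y * sd_y**2 - 2 * cov
    total_var = sd_x**2 + sd_y**2 - 2 * cov
    return true_var / total_var

# Revenue and costs each measured at .90 reliability, correlated .85,
# with similar spreads (all figures hypothetical):
print(difference_score_reliability(sd_x=10, sd_y=9,
                                   rel_x=0.90, rel_y=0.90, r_xy=0.85))
# about .35: profit is far less reliable than either of its components
```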


Similar problems inhere in sales growth, which subtracts sales in one period from sales in the subsequent period and is therefore a difference score, leading potentially to low reliability. Again, analyzing a performance measure relative to a control variable or a target also leads to low reliability. Low reliability reduces a correlation below its true value. Thus, if managers examine data to find the causes of, for instance, profitability, which is measured unreliably, the managers will tend to wrongly see a strong cause as being only weak, so they might underuse it or fail to use it.

The use of single-item indicators is another source of measurement error in organizational management. Social science methodology recommends that, to reduce measurement error, a variable should be measured by multiple items, rather than by just a single item. Yet in management a single item may be used—for example, the sales by a department. This leads to measurement error and hence can lead to false inferences by managers. Such measurement error can be reduced by multiple-item measurement, yet this has been less common in organizational management. A "balanced scorecard" contains multiple performance measures and so would be expected to have the merit of higher reliability.

Divisional profitability measures compound the unreliability of the profitability measure with the problems from small sample size (discussed in the previous section) in that profitability is applied to just a part of the organization's aggregate performance. In strategy, analyses of niches suffer likewise, so that the finer-grained the niches, the more unreliable are their profitabilities, and so the more erroneous are the resulting managerial decisions about strategy.

Chapter 5 provides a general discussion of the errors introduced by measurement error (i.e., unreliability) in various measures of organizational performance—profitability, profit ratios (e.g., return on sales), and changes over time (e.g., sales growth)—or by the use in performance analyses of controls or comparisons with a standard. Chapter 6 provides a formal, quantitative analysis of the degree of measurement error and the conditions in which it will be most egregious and therefore most likely to severely mislead organizational managers. Chapter 7 discusses the sources and effects of errors introduced when unreliability in measures of organizational performance is compounded by applying them to just part of the organization or its environment: divisional profitability and strategic niches.

Errors From Range Artifacts

According to social science methodology, another source of error is range artifacts (Hunter and Schmidt 2004). A fundamental methodological principle is that, for any association between two variables to be revealed truly, both variables must have their true range—that is, the range in the population. Thus, for the full association between one variable and another to be given by data, those data must have the full ranges of those variables.


If the range is restricted (less than full), then the data will understate the association. Conversely, if the range is extended, so that it is more than the true range, then the data will overstate the association. Either misstatement could lead managers to err in their decision-making.

An organization's managers are interested in seeing which variables are associated strongly with organizational performance in order to pinpoint causal levers and act upon them. However, organizational performance may be constrained so that its range is restricted. This will lead even strongly associated variables to seem to have a weak relationship or possibly even a nil relationship. If a statistical significance test is used, the lowered association from range restriction could make a significant, true correlation appear to be nonsignificant, so that it would be classified as being not really different from zero (Hunter and Schmidt 2004). Hence, managers would wrongly discount factors as being unimportant causes of organizational performance and so not act upon them.

When a cause of organizational performance resides in the fit between some organizational characteristic (e.g., organizational structure) and some other variable (e.g., organizational strategy), then a high degree of misfit may be avoided by organizations and their managers. An even poorer performance, from extreme misfit, may lead to organizational demise, so that extreme misfit cases cease to exist. In both ways, the degree of misfit in a study of organizations would be restricted in range, leading to understatements of the effect of misfit upon organizational performance, possibly leading to the erroneous view that fit does not matter.

Conversely, range may sometimes be extended, so that it is greater than exists in reality, leading to the overstatement of associations. This can occur if managers take just the top and bottom cases—for example, the top salesperson versus the bottom salesperson—and compare them on some other variable. Suppose a manager wishes to find the effectiveness of sales training by correlating the amount of training of each salesperson with sales performance. If the manager uses just the top five and bottom five performers out of fifty salespersons, the correlation overstates the true effect for the fifty salespersons. A falsely high correlation could lead the manager to stop searching for other causes of sales performance, even though training is actually only one cause and has limited effect in boosting the performance of the salespersons. Chapter 8 discusses problems that stem from range restriction and range extension.
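The size of the understatement is predictable. A sketch of the classic range-restriction formula (the forward form of the Thorndike Case II correction used in Hunter and Schmidt 2004), with invented numbers:

```python
import math

def restricted_correlation(true_r, u):
    """Expected correlation under direct range restriction, where u is the
    ratio of the restricted to the full-population standard deviation."""
    return true_r * u / math.sqrt(true_r**2 * u**2 + 1 - true_r**2)

def corrected_correlation(observed_r, u):
    """Invert the attenuation to recover the full-range correlation."""
    big_u = 1 / u
    return observed_r * big_u / math.sqrt(observed_r**2 * big_u**2 + 1 - observed_r**2)

# A true correlation of .50, seen through data with half the population
# spread, looks much weaker:
r_obs = restricted_correlation(0.50, u=0.5)
print(f"observed under restriction: {r_obs:.2f}")                # about .28
print(f"corrected back to full range: {corrected_correlation(r_obs, 0.5):.2f}")  # .50
```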


All three of these sources of error (small N, unreliability, and range artifacts) may be operating to produce error in the organizational data used by managers to make a decision. Hence, all three sources can add to the error in the data. Thus, managerial inferences from data are potentially prone to considerable error. As just seen, range artifacts can deflate (range restriction) or inflate (range extension) a correlation. Unreliability deflates a correlation. Thus, a variable that has range restriction and unreliability will doubly understate a true correlation. Sampling error randomly decreases or increases the correlation. A correlation deflated by range restriction and unreliability could be inflated by sampling error, so that the errors from range restriction and unreliability are to some degree offset by sampling error. It is equally likely that a correlation deflated by range restriction and unreliability will be deflated by sampling error, so that the errors from range restriction, unreliability, and sampling error combine to drag the observed correlation below its true value. As seen, this could lead managers to underappreciate the strength of the causal factors they identify and to erroneously search for other, nonexistent causes of organizational performance.

Errors From Confounds

A confound exists whenever the true relationship between a cause and an effect is obscured by some other cause. Managers are interested in knowing the true causes of the performance of their organization. But this relationship can be obscured by confounding in the organizational data, leading to mistaken inferences and wrong decisions. The effect of a cause on organizational performance can be confounded by the feedback effect of performance on the cause, because many features of an organization are changed as a result of poor performance. This can create a negative feedback from performance to its causes, which will offset any true positive effect of the cause on performance or inflate a true negative effect of the cause on performance.

Moreover, where performance is measured by profit, its constituent variables, sales and costs, can each have a relationship with a cause that may obscure the true effect of that cause on profit. The reason is that profit can be definitionally correlated with sales or costs, so that they can become confounds. The greater the sales, the greater the profit, so that sales tend to be positively correlated with profit. Conversely, the greater the costs, the lesser the profit, so that costs tend to be negatively correlated with profit. If sales or costs are, in turn, also correlated with some other variable, then that variable will be correlated with profit. Thus, a relationship will exist between that variable and profit that is indirect and due to sales or costs, therefore confounding any direct relationship between that variable and profit. Hence, confounding in organizational data may mislead managers about the true causes of profit. Given that profit centers are an integral part of the multidivisional form that is widely used in large corporations, this is a widespread potential problem.

Confounding can occur also by two variables each being related to a third variable. For example, a cause of organizational performance can be confounded if there is another cause of organizational performance that is correlated with it. Such extraneous causation may be removed from organizational data by running an experiment in the organization. While this removes error from confounding, it can introduce a bias that understates the true effect. The reason is that an organizational innovation (e.g., a redesigned job) run as an experiment can produce only a weak effect that understates the stronger effect that would result from full-blooded adoption by the organization.
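The definitional confound via sales is simple to reproduce. In the hypothetical sketch below, "advertising" merely tracks sales and has no effect of its own on profit, yet it shows a substantial raw correlation with profit; partialling out sales removes the spurious association:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

sales = rng.normal(100, 10, n)
costs = rng.normal(80, 10, n)
profit = sales - costs                      # profit definitionally tied to sales

# A variable that simply scales with sales, with no effect on profit:
advertising = 0.2 * sales + rng.normal(0, 1, n)

def partial_r(x, y, z):
    """Correlation of x and y with z partialled out."""
    rxy = np.corrcoef(x, y)[0, 1]
    rxz = np.corrcoef(x, z)[0, 1]
    ryz = np.corrcoef(y, z)[0, 1]
    return (rxy - rxz * ryz) / np.sqrt((1 - rxz**2) * (1 - ryz**2))

print(f"raw r(advertising, profit): {np.corrcoef(advertising, profit)[0, 1]:+.2f}")  # ~ +.63
print(f"with sales partialled out:  {partial_r(advertising, profit, sales):+.2f}")   # ~ .00
```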


Confounding may also be removed by aggregation, in that averaging eliminates the confounding, leaving a valid estimate of the true effect. The random positive error produced by confounding, which is present in data from one part of the organization, may be offset by negative error from confounding in another part of the organization, so that an average for the whole organization is free of the confounding. This is another sense in which the large organization operates as a meta-machine, reducing the errors in data as the data move up the organizational hierarchy and are aggregated. Thus, it is better for organizations to use averaging, rather than experiments, to remove confounding. However, such aggregation is only available in large organizations.

Again, confounding could add to the other sources of error: small N, unreliability, and range. Confounding, like small N, can either deflate or inflate the true value of a parameter, such as a correlation. Thus, where small N, unreliability, and range restriction are all deflating a correlation, confounding could further deflate it, dragging it further down below its true value. Hence, confounding could combine with the other sources of error discussed here to mislead managers in the inferences they draw from data.

Chapter 9 discusses problems that come from confounding by performance, both from negative feedback effects and from definitional connections. Chapter 10 discusses confounding by a third variable, such as some other cause of organizational performance, and the possibilities for eliminating such confounds by experiments. Chapter 11 argues that confounding can be eliminated by aggregation.

Errors Not Self-Correcting

While confounding may be corrected in large organizations through aggregation, other errors are not subject to such benign self-correction. It might be thought that, while the main errors discussed in this book (sampling error, unreliability, range artifacts, and confounding) exist "one off"—that is, in a single analysis of a single data set—their repeated operation leads them to cancel each other out, so that the errors are self-correcting. This might be thought to occur in two ways. First, the same analysis might be repeated on different data sets. Second, within the same analysis, errors from one source might be canceled out by errors from another source, leaving a true conclusion. However, neither operation tends to produce self-correction and error elimination on any dependable basis, as Chapter 12 shows. A repeated analysis of the same type over different data sets leads to a pattern of errors, which gives false signals of more variability than really exists. This makes managers needlessly pessimistic about their understanding of causality in their organization or its environment. Also, the tendency for analyses to understate true correlations is not eliminated just by repetition of the same analysis across data sets. Again, this leads to mistaken pessimism by managers about their understanding of causality.

The different sources of error can sometimes offset each other and cancel each other out. However, the more likely scenarios are that the different errors will combine to produce false conclusions that true positive correlations are zero or negative. This is likely given that the true positive correlation with organizational performance of one of its causes tends to be small (e.g., .3). This is diminished by unreliability and (often) by range restriction, making it prey to negative confounding by reverse causality, so that the overall result is a zero or negative correlation. Sampling error produces a range of final, observed correlations around this zero or negative value, so that the final correlation tends to be zero or negative. This tends to lead managers to fail to use the causal levers before them.
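The pessimistic arithmetic can be sketched by chaining the artifacts together. The function below is an illustration only—the ordering of the steps and all the parameter values are assumptions, not the book's equations—but it shows how a genuine .30 cause of performance can easily present itself as a zero or negative correlation:

```python
import math

def expected_observed_r(true_r, rel_x, rel_y, u, n):
    """Chain the attenuating artifacts: measurement error, then range
    restriction, then the sampling-error spread around the result."""
    r = true_r * math.sqrt(rel_x * rel_y)           # unreliability
    r = r * u / math.sqrt(r**2 * u**2 + 1 - r**2)   # range restriction
    se = (1 - r**2) / math.sqrt(n - 1)              # sampling-error SD
    return r, se

# A .30 true cause of profitability, profit measured at .35 reliability,
# the cause at .70, range restricted to u = .7, and only 30 observations:
r, se = expected_observed_r(true_r=0.30, rel_x=0.70, rel_y=0.35, u=0.7, n=30)
print(f"expected observed r: {r:.2f}, give or take {se:.2f}")
# about .10 +/- .18: zero and negative results are entirely ordinary
```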


While this book mostly avoids mathematics, there are formulas that predict the errors, and these underlie the discussion. They are expressly presented in Chapter 13 to provide a formalization of the theory for those seeking it. These formulas are then combined into an overall methodological equation to provide an integration of the argument of the book. Many of the terms in this equation have their values shaped by factors in the organizational situation. When these situational factors replace these terms, we have the overall theoretical equation of statistico-organizational theory. It predicts the amount of overall error that will occur in an organizational situation.

Although this book is primarily concerned to offer new, positive organizational theory rather than prescriptive advice, nevertheless, because of the seriousness of the errors discussed in it, in Chapter 14 we suggest how managers can avoid or at least reduce the main errors. Finally, Chapter 15 summarizes the theory that has been presented in the book. It does so by reviewing the organizational factors that shape the varying properties of the data that confront managers and the errors that they cause.

In each chapter, theoretical propositions are presented to summarize the argument and to point the way toward hypotheses that can be tested in future empirical research. The initial proposition is the methodological principle, and then the ensuing propositions state the conditions under which the more egregious errors will be produced, according to that principle. Before plunging into the detailed discussions of the various sources of error in Chapters 3 to 11, we first take a step back and examine the underlying philosophical position that data have a deep structure.

2 The Deep Structure of Data

This chapter will discuss the roots of statistico-organizational theory. It will explain the intellectual background of statistico-organizational theory as being in the tradition of cognitive positivism. It will then outline the specific set of methodological principles used by statistico-organizational theory and their origins in meta-analysis. Appreciating these methodological principles involves understanding the underlying philosophy that data have a deep structure. The implications of these principles for managerial inference-making are sketched. The chapter closes with a brief discussion of how the present work relates to earlier work.

Cognitive Positivism

Statistico-organizational theory is a positivist organizational theory. More specifically, it is in the tradition that may be called cognitive positivism, a category that is not always recognized. Theories such as that propounded herein, which relate the organization to its environment, are classed as positivist (Burrell and Morgan 1979). The organization and its structure are shaped by objective factors such as material variables—for example, organizational size (Donaldson 1996). Commentators have criticized such theories for the neglect of the human actor and the absence of an analysis of how organizational strategy and structure come about as a result of human decision processes (e.g., Silverman 1970). This criticism has led to calls for an analysis at the level of action—that is, the motivated behavior of actors acting on their perceptions and values (Child 1972). In such a program, there is a disjunction as the analysis leaves the positivist theory of organization and moves down to the individual-level theory of human behavior. However, there may be some insight to be gained from theoretical continuity—that is, by extending the positivist approach from the macro level to the micro level. The present chapter attempts to offer elements of a positivist theory of managerial decision. This is far from being a comprehensive theory, for other factors that would need to be brought into play lie outside the purview of a positivist theory.


Nevertheless, by pursuing a positivist theoretical line, fresh insights, distinct from those presently available, may come into view.

The link between positivist theory and managerial cognition is that statistical theory yields insights into the way organizational managers make sense of data. For instance, managerial action is often informed by perceptions of performance, and the data on which these managerial perceptions are based are numbers, such as the dollar values of sales or profits, and so are subject to the laws of statistics. The organizational data used by managers have characteristics such as sample size and reliability, which affect the accuracy of the inferences managers make from them. These characteristics of the data are shaped, in turn, by the organization and its contingencies, especially its size. Social scientists understand how data condition the inferences that they draw in their research. Managers face similar issues about how their data condition the inferences that they draw. Since inference-making by managers from numerical data has not been analyzed much within organization theory, statistico-organizational theory tries to go some way toward filling this gap.

Positivism is often equated with explanation by materialist factors. The theory advanced here is positivist in the sense that material factors, such as the number of observations, N, on which an estimate is based, play a major role. However, there is more to positivism in toto than materialism. Positivism contains the concept that knowledge possessed by human beings can be increased by using scientific methods of inquiry and that such enhanced knowledge can form the basis for better judgments by social policy-makers (Comte 1896). Clearly, this process involves ideas in the minds of human beings. Positivism itself is also clearly an idea. Therefore, positivism cannot be completely nonideational or anti-ideational. In particular, the stress on developing superior knowledge through use of scientific methods entails a set of concepts about how people think and how they can make better inferences about cause and effect. This aspect of positivism that refers to ideas and thinking may be termed cognitive positivism. Although not always recognized, especially by critics of positivism, cognitive positivism has been an integral part of positivism from its inception. This book is concerned with managerial inference-making and the relation of this process to the methodological principles used in social science. In that sense, the basic intellectual position of this book is cognitive positivism.

Earlier theories have dealt with biases in individual perception and decision-making and are thereby psychological theories (e.g., Kahneman and Riepe 1998). Again, some theories have dealt with how individuals are not natural statisticians in that they do not process information in the way required by statistics to produce sound conclusions. In contrast, the present theory is not primarily about how individuals process information. It is primarily about how the data that individuals look at may influence their thinking. In that sense it is primarily objective rather than subjective. Its focus is on facts about the data, rather than human foibles. It is thus a discussion of managerial inference-making "from the outside in." In those senses, it is positivist.

Thus, the starting point of the present analysis is positivist, focusing upon factual aspects of data, such as the number of observations, and their implications for inference in an impersonal way. However, these positivist factors interact with more perceptual or emotional factors. For instance, managers make inferences from data, but their overall interpretation can be influenced by attributional factors. For example, a senior manager may attribute the cause of the poor performance of an organizational subunit to its individual manager, thereby displacing the blame away from the senior manager. In such ways, the present book incorporates subjective elements along with the objective or materialist elements. Hence, it draws in nonpositivist elements and adds them to the positivist base. Thus, the present analysis is not completely positivist, nor is it intended to be so—nor is it arguing for a purely positivist approach to the analysis of managerial decision-making. Nevertheless, it is strongly positivist in its basic orientation, in that it emphasizes factors inherent in organizational data as shapers of the inferences made by managers.

Central to almost all managerial phenomena—be they strategy, structure, or whatever—is the managerial decision. This decision is often based upon inferences drawn from data. Any process that causes errors to arise in inference will necessarily adversely affect managerial decisions. It will thereby have ramifications throughout the organization, impacting many facets of organizational life. Thus, organization theorists need to attend to managerial inference-making as a cause prior to managerial decision-making. Statistical theory deals with how inferences are drawn from data.

Weick (1995) has theorized about the process of sense-making, whereby people reflect upon and make sense of events that have occurred. He shows how sense-making is subject to the emotions of the sense-maker and other factors. Our concern here is with a particular aspect of sense-making: making sense of numerical data. This is no doubt affected by emotions and the other causes of sense-making, but it is also subject to the errors dealt with in statistics.

The term inference is often used in statistics to refer to the process whereby information about the population is inferred from a sample. Sound inference-making is highly influenced by the formal principles laid down by statistics. We use these principles to generate a theory of managerial inference-making by contrasting the formal requirements of statistics with the practices that managers may use.

McKelvey (1997, 354) has written about "stochastic idiosyncrasy," whereby "microstates" (356) have local variation that can obscure the underlying and more stable background laws that are central to scientific explanation. The stochastic idiosyncrasy produces probability distributions around the mean value of the variables of interest to the scientist. A similar situation confronts the manager, who must abstract the underlying pattern from data that are subject to stochastic variations. This inference-making involves aggregation across cases to find parameter values, such as the mean (i.e., average) value, of each variable.

The Methodological Philosophy Underlying Meta-Analysis

The core idea of this book is that data themselves can easily lead those examining them to make errors of inference, so that the errors that occur when social science researchers look at their data also occur when managers look at their data. Therefore, the intellectual strategy of this book is to briefly articulate the errors that arise from data in social science and then to show how these errors might arise in organizations, with what consequences for managerial decision-making.

The present theory draws its methodological principles particularly from the Hunter-Schmidt type of meta-analysis. These principles are themselves based upon statistics (e.g., the concept of sampling error and its correction). They also draw from psychometrics (e.g., measurement error and its correction, and range restriction and its correction). Meta-analysis is based on fundamental methodological principles that are hallowed canons in social science. However, meta-analysis develops some of these ideas in ways that are opposed to some conventional research practices, lending them an unfamiliar air. It is a conviction driving this book that meta-analysis is a major development in social science methodology. Thus, the principles that underlie the Hunter-Schmidt variant of meta-analysis are used as the set of methodological principles from which to form the theory, because they seem to be among the better contemporary views of the problem of making sound inferences in social science (Hunter and Schmidt 2004; Hunter, Schmidt, and Jackson 1982). These kinds of techniques are frequently used by psychologists in their numerical work. However, the present theory is not an analysis of individuals, as would be the case in psychology, but an analysis of organizations. The organization and its environment shape the data that drive managerial inferences, which in turn lead to decisions affecting the organization, its strategy, and its performance. The theory is therefore an organizational theory and appropriately called statistico-organizational theory.

Truer View at the Aggregate Level

To begin understanding the meta-analytic view, we may start with an analogy with archaeology. An archaeologist on the ground can find an archaic building that has become covered with soil only by bending down and scraping the soil away. An archaeologist flying in an airplane, however, can look down and see the pattern of the building—and indeed of the entire buried archaic town—which is visible from marks on the surface when viewed from hundreds of feet above the ground. Thus, going up in altitude allows the archaeologist to see the true, underlying structure more clearly. The archaeologist on the ground can only guess where exactly to start digging, whereas the aerial archaeologist eliminates this guessing. The spade archaeologist slowly uncovers a part, but does not know how it may or may not connect with other uncovered parts and so what its significance is within the whole structure.
Indeed, while digging, the archaeologist may well come upon a rock and mistake it for part of an old wall. The aerial archaeologist is less likely to be misled in this way. The aerial archaeologist sees a broader pattern and so may see the true meaning of each part. Moreover, the archaeologist using the spade will only very gradually reveal the whole structure, whereas the aerial archaeologist may perceive it at the outset. The archaeologist digging in the dirt is closer, yet sees less clearly than the archaeologist who flies above.

Meta-analysis holds that a similar process occurs in social science research. A virtue of meta-analysis is that it allows the researcher to see down through the often erroneous findings of individual studies to the true underlying position. This reveals a picture that is truer and may also have the additional virtue of being simpler. Schmidt (2010, 233–242) presents a meta-analysis of studies of the relationship between a test of decision-making and job performance whose correlations differ widely, and shows that, when correction is made for sampling and measurement errors, the studies are consistent with there being a single underlying value. As Schmidt says of his result: "Note that this interpretation is not only more accurate, it is also more parsimonious. That is, it is an example of application of Occam's razor in data interpretation. It is an example of simplicity underlying apparent complexity. The surface structure of the data is quite complex but the deep structure is quite simple" (236).

Hence, the Hunter-Schmidt version suggests that "data lie," so that an analyst must work hard to see through the superficial local variations to the more stable, underlying truth. Thus, in order to see more truthfully, the analyst must go beyond the surface appearance of local findings to the hidden, true general position, much of which is attained by aggregating data from different locales. The Hunter-Schmidt version of meta-analysis articulates the underlying philosophy that data have depth and that appearances are deceptive, but that, by use of these meta-analytic techniques, the truth can be found or better approximated. Schmidt states his view that data can mislead the researcher: "I concluded that the commonly held belief that research progress requires only that we 'let the data speak' is sadly erroneous. If data are allowed to speak for themselves, they will typically lie to you" (233). Schmidt also writes: "So it is clear that the injunction to 'Just let the data speak' is very naïve and deceptive. Data often look you in the eye and lie to you—without even blinking" (239). Thus, at the heart of the philosophy is a wary mistrust of taking data at face value, and a recourse to procedures that try to avoid being misled by data.

Meta-analysis uses methods from statistics and psychometrics to drill down beyond the often false surface appearance of data to the underlying reality. Psychometrics deals with the true scores of a variable and the true parameter values derived from them, such as the true mean of a variable or the true correlation between variables. True scores and parameters are contrasted with observed scores and parameters. The gap between the true and the observed misleads researchers into making erroneous inferences from the data.
Naive quantitative research takes the quantities presented by research data and treats them as if they were reality. Meta-analysis, and psychometrics more generally, start with the concept of a true world that is not readily apparent. They work back from the observed world to infer what this true world is. The aim of the meta-analyst is to go from the surface appearance of the data to the true underlying reality. As Hunter and Schmidt (2004) state repeatedly, it is the true, underlying relationships that are the stuff of science.

According to the Hunter-Schmidt version of meta-analysis, a major reason why numerical data can mislead greatly is that the sample sizes often used in social research are small enough to produce large sampling error. Therefore, the true value of a parameter, such as the mean, can be distant from the observed mean. The observed mean can be greater or less than the true mean. Hence, sampling error produces random variation of observed values around true values.

In the Hunter-Schmidt view, another reason why numerical data can mislead is that many studies understate the true, underlying relationship. Prominent causes of such understatement are measurement error and range restriction. Measurement error occurs in all variables in social science (Hunter and Schmidt 2004). Measurement error has the effect that any observed measure of association, such as a correlation coefficient, is less than the true association (e.g., correlation). Some variables are measured less reliably than others and so have more measurement error. Thus, the degree of understatement of the strength of an association is affected by the amounts of measurement error in the variables. But because all variables are measured with error, all correlations between variables tend to understate the true correlation.

Range restriction is another cause of understatement of a relationship. A variable in a study may be restricted in range—that is, have less variation than exists in the population or universe. Range restriction lowers the observed association below the true association. To fully reveal the degree of covariation between two variables, it is necessary first to obtain the full variation on both variables, but this may not be attained in a study. For example, a study of firms in a small country such as New Zealand, whose large firms tend to be small relative to large firms in, say, the United States, will not contain the full range of organizational sizes in the world, because of the lack of those very large firms. Therefore, in any study of firms in New Zealand, the correlation with size understates the true correlation in the universe of organizations—that is, across the world, including the large-sized organizations in the United States.

The Hunter-Schmidt meta-analysis procedure (stated most simply) takes each study finding, corrects it for both measurement error and range restriction, averages across studies to estimate the true general value, and then calculates how much of the variation in the findings of the studies is due to sampling error. The corrections for measurement error and range restriction increase the correlations, making them closer to the true value. Averaging the correlations cancels out much of the random error from sampling error (which causes each study to vary above or below the true value) and so leads to a better estimate of the true value.
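
To make the procedure concrete, the following is a minimal sketch in Python of the bare-bones arithmetic. All the study values (correlations, sample sizes, reliabilities) are invented for illustration, and a full Hunter-Schmidt analysis involves further refinements omitted here, such as adjusting the sampling-error variance for the corrections made and correcting for range restriction:

import math

# Hypothetical studies: observed correlation r, sample size n, and the
# reliabilities rxx and ryy of the two measures (all values invented).
studies = [
    {"r": 0.18, "n": 45,  "rxx": 0.80, "ryy": 0.70},
    {"r": 0.31, "n": 60,  "rxx": 0.75, "ryy": 0.65},
    {"r": 0.05, "n": 30,  "rxx": 0.70, "ryy": 0.60},
    {"r": 0.26, "n": 120, "rxx": 0.85, "ryy": 0.75},
]

# Correct each observed correlation for measurement error (attenuation):
# the disattenuated correlation is r / sqrt(rxx * ryy).
for s in studies:
    s["rc"] = s["r"] / math.sqrt(s["rxx"] * s["ryy"])

# Average the corrected correlations, weighting by sample size.
total_n = sum(s["n"] for s in studies)
r_bar = sum(s["rc"] * s["n"] for s in studies) / total_n

# Compare the observed variance of the corrected correlations with the
# variance expected from sampling error alone (bare-bones approximation).
obs_var = sum(s["n"] * (s["rc"] - r_bar) ** 2 for s in studies) / total_n
exp_var = (1 - r_bar ** 2) ** 2 * len(studies) / total_n

print(f"estimated true correlation: {r_bar:.2f}")
print(f"variance of corrected correlations: {obs_var:.4f}")
print(f"variance expected from sampling error alone: {exp_var:.4f}")

If the observed variance is no larger than the variance expected from sampling error, the studies are consistent with a single underlying value.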

Meta-analysis contains the concept that the way to come closest to the truth is to combine all the studies of a topic. Thus, meta-analysis is inclusive, in that all the studies of a relationship are included and then averaged. In the Hunter-Schmidt version of meta-analysis, the variations of individual study findings around this average are examined to see whether they are due to artifacts—that is, to measurement error, range restriction, and sampling error. If the variations between individual studies are due to artifacts, these variations are spurious and the studies come from the same population. Then the true value of the population is the average of the studies. Hence, the average of the studies is not just a "mere average," to put it colloquially, but is the true position, around which study findings vary artifactually. This shows that there is general validity of the relationship. Then, those individual study findings that vary from the true figure are incorrect. Thus, individual study findings can mislead, sometimes greatly. How far the variation in the findings of individual studies is explicable by artifacts is an empirical matter, which differs according to the relationship being examined. In some meta-analyses, the artifacts explain only some of the variation across studies, in which case there may be real variation. This real variation may be due to moderators that modify the strength of the relationship from study to study. Nevertheless, the Hunter-Schmidt version of meta-analysis has the potential to show that all variation across studies is due to artifacts, so that a relationship has general validity.

For instance, Hunter and Schmidt used meta-analysis to show the validity generalization of personnel selection procedures. In particular, they showed that tests of general mental ability validly predict performance, to a substantial degree, in all jobs. The variation across studies, apparently showing variation due to different jobs or settings, was overwhelmingly due to artifacts, mostly sampling error. Much of the remaining variation was explained by the range and measurement error (unreliability) artifacts (Pearlman, Schmidt, and Hunter 1980). Some studies have more range restriction than others, a discrepancy that produces variations in the findings of studies. Similarly, some studies use less reliable measures than others, also producing variations in the findings of studies. In the meta-analysis, the proportion of variation in the findings of studies that is due to the artifacts of range and unreliability can be found. Hence, the meta-analysis can explain differences between studies as being due to the artifacts (sampling error, range, and unreliability). In this way, the mental ability selection tests were shown to be valid and to generalize.

The general correlation understates the true correlation, because of measurement error (unreliability). However, Hunter and Schmidt (2004) use a correction formula whereby the general correlation is corrected to the value that it would have if the measurements were without error. This is the true correlation, and it is higher than the general correlation.

It bears repeating that, in this situation, meta-analysis is not just a technique for integrating research findings and working out an average correlation that is "just an average." The average value is valid across many situations, so that the generalization is valid. Going further, the general finding (i.e., the average correlation) is more valid than any individual study finding.
The reason is that each study finding contains much more sampling error artifact than the average (averaging cancels out much of the random variation across studies). Also, the average is corrected in meta-analysis for the other artifacts that are present in individual studies, such as unreliability, to yield a truer result than any individual study. Thus, the general finding is more valid than the individual study finding.

Individual Studies Mislead

The meta-analytic philosophy gains added impetus from the revelation that the traditional individual study is deficient. An individual study will often have a small sample size, so that parameters (e.g., the mean of a variable) computed from it can differ widely from the value of that parameter in the universe. Moreover, the thinking behind meta-analysis holds that sampling error is greater in most data sets than many people, including many social scientists, realize (Hunter and Schmidt 2004). The sample sizes used in much social science are small enough that there is great sampling error, so that individual study findings vary randomly above or below the true value.

As an illustration, Hunter and Schmidt (2004, 4) show in a simulation that thirty samples of sizes 19 to 72 drawn from a population with a true correlation of +.33 can yield correlations that vary from –.10 to +.56. These small samples produce random variations above and below the true correlation. The largest correlation was +.56, which is .23 above the true correlation of +.33. The smallest correlation was so far down that it was negative, –.10, which is .43 away from the true value (+.33). Hence, looking at just one correlation from one of these small samples, one could infer that the relationship was +.56 and therefore much stronger than it really is (+.33), or that the relationship was negative (–.10) when it is really positive (+.33). But these variations above and below the true value come from random sampling error. As would be expected of errors produced by small samples, the greater variations occurred for the smaller samples. For instance, the two smallest samples, of size 19, had correlations of –.02 and +.52—that is, they varied greatly from each other and from the true value (+.33). Thus, the smallest samples contained the most random error and so gave the most misleading estimates of the true value. As this illustration shows, studies using small samples can produce great variations in findings that are just artifacts and that distort the true picture.
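
This kind of demonstration is easy to re-create. The following toy simulation (not Hunter and Schmidt's own code; the random seed and exact sizes are arbitrary) draws thirty small samples from a population whose true correlation is +.33 and reports the spread of the observed correlations:

import numpy as np

rng = np.random.default_rng(0)
true_r = 0.33
cov = [[1.0, true_r], [true_r, 1.0]]

observed = []
for _ in range(30):                      # thirty small-sample "studies"
    n = rng.integers(19, 73)             # sample sizes between 19 and 72
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    observed.append(np.corrcoef(x, y)[0, 1])

print(f"true correlation: {true_r}")
print(f"observed correlations range from {min(observed):+.2f} to {max(observed):+.2f}")
print(f"mean of the thirty observed correlations: {np.mean(observed):+.2f}")

On a typical run, the thirty correlations straddle a wide interval around +.33, even though every sample comes from the same population.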

Sample sizes this small are quite frequently found in research on organizations. In the Hunter and Schmidt example, the range of sample sizes (from 19 to 72) has a midpoint of 45.5, which would be considered respectable in some topics of organizational theory research. For instance, in organizational structural research, Hage and Aiken (1967) had a sample of 16 organizations and Pugh et al. (1969) had a sample of 46. In the Aston Program, there have been forty studies of the relationship between organizational size and functional specialization (for a review, see Donaldson 1996, 138), and the mean sample size of the studies was 40.5. Thus, samples as small as those in the Hunter and Schmidt simulation are quite common in organizational theory research, and such samples will contain much random variation from sampling error.

Larger samples produce less sampling error, but can still produce a substantial amount of apparent, but false, variation. Schmidt gives an example of real studies of the relationship between individuals' scores on a decision-making test and their job performance. The studies were conducted in sixteen different organizations. The sixteen studies varied considerably in their correlations: from .02 to .39. Thus, in some organizations the test appears to relate to performance, while in other organizations it hardly relates at all. The mean number of observations in the studies was 156. Nevertheless, almost all of the variation in study correlations (.02 to .39) was due to sampling error. After taking account of sampling error, the studies were consistent with an underlying correlation of .2 and had a residual variance of only .0001 (Schmidt 2010, Table 1). In other words, when sampling error was controlled, there was very little variation between the studies. The apparent variation was an artifact created by sample size—even though the mean sample size was 156, which would be considered quite large for much social science research.

Inadequate Protection by Using Statistical Significance Tests

Unfortunately, the conventional tool for dealing with the problem of sampling error, significance testing, is inadequate and often misleads. The small sample sizes typical in much social science research lead to low statistical power, so that the significance tests often give misleading results (Harlow, Mulaik, and Steiger 1997; Schmidt, Hunter, and Urry 1976). In particular, the test may say that the parameter (e.g., a correlation) is not significantly different from zero when the true underlying parameter is actually nonzero. Continuing the simulation example from above, of the thirty samples whose correlations varied from –.10 to +.56, nineteen attained statistical significance, but eleven were nonsignificant (Hunter and Schmidt 2004, 5). Therefore, if each sample were a study, over one-third (36.7 percent) of the studies would be viewed as having results consistent with the true underlying relationship being zero—that is, "as not really finding a relationship." Hence, use of the statistical significance test within each study would lead to a conclusion of "no relationship" for these studies, which is false, because all of the studies are just samples from a population whose true correlation is +.33. Again, in the studies of the decision-making test, half of them were not significant (Schmidt 2010, Table 1), even though all were consistent with a correlation of +.2.

This problem of low statistical power is particularly likely to occur if the true underlying value is small (e.g., a correlation of +.2). Then the confidence interval around such a small correlation more readily includes zero, leading to a conclusion of "not significantly different from zero." Such a small correlation is quite possible if the variable of interest is one of several causes of the dependent variable—that is, if there are multiple causes, which is often the case in social science.
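
The low power of the significance test can be seen directly. The sketch below uses the standard Fisher z approximation (an approximation, not the exact test) to compute the probability that a two-sided test at the .05 level will detect a given true correlation; the sample sizes echo those discussed above:

from math import atanh, sqrt
from statistics import NormalDist

def power_r(rho, n, alpha=0.05):
    # Approximate power of a two-sided test of r = 0: under the Fisher z
    # transformation, the test statistic is roughly normal with mean
    # atanh(rho) * sqrt(n - 3) and standard deviation 1.
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = atanh(rho) * sqrt(n - 3)
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

for n in (19, 45, 72, 156):
    print(f"n = {n:3d}: power for rho = .33 is {power_r(0.33, n):.2f}; "
          f"for rho = .20 it is {power_r(0.20, n):.2f}")

At small sample sizes, power falls well below the .80 conventionally treated as adequate, so frequent nonsignificant results are exactly what sampling theory predicts even when the true correlation is positive.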

For instance, organizational performance (e.g., profitability) has many causes, so that any single cause tends to have a small correlation with performance. For example, Schlevogt (2002, 358) found a set of causes of organizational performance. As will be seen in Chapter 11, the highest correlation of any cause with performance (controlling for confounding) was +.29 (for planning) and the next highest was +.20 (for two causes: firm age and firm size), with the other causes having correlations of +.19 or less. Moreover, a small correlation is more likely if the true correlation is understated because of unreliability in measurement and range restriction. Because these conditions often occur in social science, the application of the statistical significance test will often cause researchers to conclude falsely that there is no true relationship between the variables in their study.

A researcher comparing a study's findings with prior studies' findings might try to gain a surer conclusion by seeing whether the studies were consistently significant. However, if each study has low statistical power, which could occur if each study had a small sample, then each statistical significance test could mislead. Hence, looking across studies for patterns of significance just repeats the error of using the test in an individual study. As we saw in the simulation example from Hunter and Schmidt (2004), eleven out of the thirty studies were nonsignificant, so the studies were not consistently significant. Thus, using consistency of significance as the criterion would lead to the conclusion that the correlation fails to be generally valid, yet this is false, because all the studies come from the same population, whose correlation is +.33. Again, in the studies of the decision-making test, eight out of the sixteen were not significant (Schmidt 2010, Table 1), so there was not consistent significance, even though all the studies were consistent with a correlation of +.2.

A weaker criterion would be that there be more significant studies than nonsignificant studies. In the simulation example above, in which nineteen studies were significant to eleven nonsignificant, this criterion would have correctly concluded that the set of studies overall supports the relationship. But the nonsignificance of some studies would tend to suggest that there is a moderator, such that the relationship is weaker in some studies than in others. Again, however, this would be false, because all the studies come from the same population, whose correlation is +.33. In the studies of the decision-making test, only eight out of the sixteen were significant (Schmidt 2010, Table 1), which is not a majority, even though all the studies were consistent with a correlation of +.2.

In meta-analysis, the superior approach is considered to be combining the findings from all the studies and using that combination to calculate the parameter (e.g., the mean). The parameter is then estimated from the total number of observations pooled across all the studies. This estimate will typically have much less sampling error, so that the true value of the parameter can be assessed much more accurately. The remaining sampling error can be dealt with by inferential statistics, such as calculating confidence intervals to see whether they exclude zero, indicating that the parameter is truly nonzero. The aggregation of findings across studies produces a truer picture than that given by the individual studies. By taking the overall perspective in meta-analysis, the true nature of the underlying reality can be better discerned. This fundamental idea in meta-analysis is a methodological principle on which it is founded.
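
A short calculation shows how pooling shrinks the confidence interval. The numbers are hypothetical (a single study of 156 observations versus sixteen such studies pooled), and treating the pooled studies as one simple sample is itself a simplification:

from math import atanh, tanh, sqrt
from statistics import NormalDist

def ci_r(r, n, alpha=0.05):
    # Approximate confidence interval for a correlation via Fisher z.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    z, half = atanh(r), crit / sqrt(n - 3)
    return tanh(z - half), tanh(z + half)

for n in (156, 16 * 156):
    lo, hi = ci_r(0.20, n)
    print(f"r = +.20, n = {n:4d}: 95% CI = ({lo:+.2f}, {hi:+.2f})")

The pooled interval is far narrower and sits comfortably away from zero.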

Hierarchy in Inquiry

Hierarchy in Academic Research

Meta-analysis implies a hierarchy of three levels in social science research. For studies of organizations, at the bottom of the hierarchy is the case study of an individual organization. The case study researcher has "dirt under the fingernails," but inference from a case is often hazardous. When a variable has changed, this may be attributed to the earlier change in a variable that is believed to be its cause, yet other variables that are causes may also have changed, upsetting causal inference.

At the next level up is the study of a number of organizations. The researcher may compare across the organizations, measuring not only the effect and cause variables, but also rival explanatory variables, so that they can be controlled. However, all the estimates of the parameters involved, such as means and correlations, entail sampling error, often considerable, due to the small sample sizes that are typical in social research. Therefore, the estimated parameters may be misleading. Moreover, as just seen, using statistical significance tests in order to avoid being misled by sampling error may itself mislead, again because of small sample size.

At the third level up is the meta-analysis, where combining study findings yields estimates of the parameters that are much closer to the true values, because the estimates are based on a much larger sample size. The parameters estimated, such as the mean, vary much less around the true, underlying value than in the individual study. Moreover, the distorting effects of other artifacts, such as unreliability and range restriction, can also be corrected in the meta-analysis, to produce a better estimate of the parameter (e.g., a mean or correlation). Thus, meta-analysis produces data that have less of one major error—sampling error—and then corrects for some other errors.

In general, whenever we try to understand reality by looking at numerical data, there are gains from moving up from the individual case, first to the comparative study of many cases, and then to the meta-analysis of all the studies. These gains occur because of the misleading nature of data at the lower levels. All of these errors inhere in the numerical data themselves, because all data are subject to sampling error and measurement error, and many data are subject to other artifacts, such as range restriction. Therefore, such errors also apply when managers look at the numerical data in their organization.

Hierarchy in Managerial Inference-Making

Managers examining numerical data about their organization and its environment are subject to making the same errors from their data as social scientists are from their data.
These errors from the data can lead managers to make false inferences and thus wrong decisions. Again, as in social science research, there is a three-level hierarchy when organizational managers examine numerical data. At the bottom level, there is the manager examining a single case, such as an employee or a customer, and trying to make an inference from that case, such as why a sale failed to occur. At the next level up, there is the manager examining the data from a set of cases, such as all the sales in the manager's branch. At the meta-analytic level in an organization, there are the data on all the sales of all the branches of that organization.

Data in an organization will tend to have sampling error because of limited sample size. Aggregating all the data about a variable will reduce the sampling error, by basing the estimate on a larger N. Thus, analyses of data at the center of an organization (e.g., the head office) function like a meta-analysis, which reduces sampling error by aggregating across studies. In aggregate analyses of organizational data, the pooling of large numbers of observations reduces the ratio of noise to signal. The resulting figures are less fuzzy and therefore a better basis for managerial inference-making. Managers and staff analysts higher up the organizational hierarchy will tend to make better inferences from numerical data than those at lower levels of the hierarchy. The reason is that the higher managers will tend to use more aggregated data, which are less problematic and so lead to more valid inferences. In this sense the organization is a meta-analytic machine. As data about the organization or its environment ascend the organizational hierarchy, they are aggregated. Therefore, the data come to have more of the properties of a large, rather than a small, N, so there is less sampling error and the parameter estimates are more accurate. Thus, by "going up in altitude," the managerial analyst sees more clearly the true, underlying reality.

In saying that an organization is like a meta-analytic machine, we are referring to sampling error as a source of potential error in management, because that is the problem that is reduced through aggregating data up the organizational hierarchy. Sampling error has been shown by Hunter and Schmidt (2004) to be the greatest source of error in organizational research, because the sample sizes used in organizational research are typically small. Similarly, in organizations, sampling error would be expected to explain much of the variation within organizational data sets of about the same sizes as those used in organizational research. In contrast, in organizational data sets of large size, there will be little sampling error, so it is not a source of major managerial inference errors. Organizational data sets vary in their size (i.e., the number of observations on which they are based), and this affects whether managerial inferences from the data are sound or unsound. Thus, statistico-organizational theory specifies the situations in which sampling error is likely to be a problem for inference, and its magnitude; data set size is a major determinant.
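
A small simulation (all figures invented) illustrates the point: each branch manager sees a noisy mean, while the head office, pooling every observation, sees a stable one:

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical firm: 40 branches, each with 25 monthly sale amounts, all
# drawn from the same underlying distribution (mean 100, sd 30).
branches = rng.normal(100, 30, size=(40, 25))

branch_means = branches.mean(axis=1)      # what each branch manager sees
print(f"branch means range from {branch_means.min():.1f} "
      f"to {branch_means.max():.1f}")

# The head office pools all 1,000 observations, much as a meta-analysis
# pools studies, and its estimate sits close to the true mean of 100.
print(f"head-office mean over all observations: {branches.mean():.1f}")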

Even in organizational management situations that have large numbers of observations, and so little sampling error, there will still be other sources of managerial inference errors from data. All data in an organization will have measurement error, because all variables are measured imperfectly. Depending upon its magnitude, measurement error will have a greater or lesser adverse impact on managerial inference. Other sources of managerial inference errors, such as range artifacts, may also be present in the data, even when the data have a large N.

In social science, the way to reduce measurement error (i.e., to increase reliability) is to avoid measuring a variable in a way that produces error. This means avoiding single items and, instead, measuring a variable by using multiple items. It also means avoiding difference scores, which reduce reliability, and, instead, using the whole variables. These two points have strong implications for what organizations use to measure performance. Difference scores are implicit in profit, sales growth, and other performance measures frequently used by managers. The whole variables from which they are constructed would provide more reliable measures. Thus, in trying to ascertain the true picture in organizational data, it is necessary to make allowances for the distortions due to measurement error, which is always present in data, and range restriction (or extension), which is often present.

These methodological principles mean that statistico-organizational theory can make predictions about the circumstances in which managers will make errors and what those errors are liable to be. Managers will make two main kinds of errors. The first error, that the observed figure varies randomly around the true figure, is caused by sampling error. The second error, understating the strength of a relationship, is caused by several sources, particularly measurement error and range restriction. In combination, these two errors produce an observed relationship that is both noisier and more understated than the true one.

The methodological principles drawn upon to derive statistico-organizational theory are mostly well established in meta-analysis. However, one principle is not presently part of meta-analysis, though it is drawn from meta-analytic work and so has a connection to the body of meta-analysis. This principle may be somewhat contentious and in that sense should be viewed more tentatively than most of the other ideas in this book. The principle is that meta-analytic results obtained by combining studies may also reduce the effects of confounds—that is, of other causal variables that obscure true effects in individual studies. Thus, the reduction in artifacts that is attained by using the general picture—that is, the results from the large number of observations aggregated from individual studies—may yield another benefit, beyond those recognized in meta-analysis to date. Combining the results from studies may offset the effects of confounds, which work in different ways in individual studies, so that the average is freer of confounds. Thus, again, it is the general correlation, the mean from aggregating the studies, that has more validity than the individual studies.
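
Before turning to how the theory relates to earlier work, a brief simulation (with invented figures) makes concrete the earlier point about difference scores such as profit. Reliability is computed here, for convenience, as the squared correlation between observed and true scores:

import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Hypothetical firms: cost tracks revenue closely, so true profit
# (revenue minus cost) varies far less than either component.
true_rev = rng.normal(100, 20, n)
true_cost = 0.9 * true_rev + rng.normal(0, 5, n)
true_profit = true_rev - true_cost

# Each component is recorded with the same modest measurement error.
obs_rev = true_rev + rng.normal(0, 4, n)
obs_cost = true_cost + rng.normal(0, 4, n)
obs_profit = obs_rev - obs_cost

def reliability(obs, true):
    return np.corrcoef(obs, true)[0, 1] ** 2

print(f"reliability of measured revenue: {reliability(obs_rev, true_rev):.2f}")
print(f"reliability of measured cost:    {reliability(obs_cost, true_cost):.2f}")
print(f"reliability of measured profit:  {reliability(obs_profit, true_profit):.2f}")

The two whole variables remain highly reliable, while the difference score loses much of its reliability, because differencing removes the shared true variance but retains both measurement errors.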

Continuities of New Theory With Prior Organizational Theory Research

Previously, we criticized organizational theory, especially in the United States, for an emphasis on novelty that leads to the spawning of new theories and the abandoning of old ones (Donaldson 1995). This process is made worse in that, rather than seeking cumulation of knowledge through combining old and new theories, the new theories tend to be presented as distinct paradigms, or to function as paradigms, that overthrow (or ignore) previous theories. This process fragments organizational theory as a whole and frustrates the buildup of knowledge, and thus of a coherent view that can be offered to managers (Donaldson 1995).

The present work is a new theory, but it is not presented as a distinct paradigm that overthrows previous theories. It has a strong sense of continuity with previous research, in that it starts from methodological principles that have long been accepted as fundamental in social science research (e.g., small samples contain much sampling error). Thus, rather than challenging beliefs, it builds on principles many of which are widely adhered to by social scientists. Statistico-organizational theory is not based on a critique of existing organizational theories, so it does not entail any negation of them. Rather, it points to a gap in present theorizing and seeks to fill it, so that its relationship to existing organizational theories is potentially complementary rather than antagonistic. Thus, while this book offers a new theory, it is not a distinct paradigm. Moreover, while it adds to the number of organizational management theories, hopefully it will not make the field more divided, because it connects methodology and theory.

Statistico-organizational theory holds that objective properties of the numerical data facing managers influence their decision-making. Thus, it is in the tradition of theories that state that external conditions shape managerial decisions and thus organizations (Lawrence and Lorsch 1967). Moreover, prior organizational theories hold that organizational size influences organizational structure (e.g., Blau and Schoenherr 1971; Child 1973a), so statistico-organizational theory is similar in arguing that organizational size affects the control and information systems of organizations. Some of the causal factors in statistico-organizational theory, such as organizational size, are set by the situation and so are fixed determinants, but some of them can be affected by managerial choice, such as whether to aggregate or disaggregate data, thereby affecting sample size. Managerial decisions about which performance measures, such as profitability, to use also affect the properties of the data that such measures present to managers. Therefore, managerial decision-making using these figures is shaped by prior managerial choices (Child 1972).

As noted above, the Hunter-Schmidt method of meta-analysis gives prominence to N, the number of observations in each study, arguing that much variation between studies is often due to the small sizes of the samples used in individual studies. Focusing on such a simple variable as N may seem too slight to merit serious intellectual discussion. However, it is worth noting briefly how this simple variable, N, the number of some thing, is ubiquitous in modern organizational theory. In structural contingency theory, the number of employees in an organization has been shown empirically to be strongly, positively related to many aspects of organizational structure, such as specialization and formalization (Pugh et al. 1969).
Similarly, Blau and Schoenherr (1971) showed that the structural differentiation of organizations is strongly, positively related to the number of employees in state employment security agencies and that this, in turn, is related strongly to the population of the state. In population ecology, the number of organizations in a population (i.e., the density) drives the founding and disbanding rates of organizations (Hannan and Freeman 1989). In institutional theory, mimetic isomorphism has been operationalized by showing that an organization is more likely to adopt a feature, the greater the number of organizations in its field that have already adopted that feature (Fligstein 1985). Thus, be it ever so humble, N, the simple count of the number of things, is often used in the modern organizational theory literature as an explanatory variable. In this book, we continue this practice, using N, here the number of observations, as a major explanatory variable of the accuracy of parameter estimates and thus of the degree of managerial error.

Statistico-organizational theory is structural in that emphasis is placed on the situation that confronts the manager, such as the environment of the manager's organization and given characteristics of the organization, such as its size. In that regard, the new theory stands in a tradition of sociological structuralism going back to Durkheim (1947). The approach is also structural in that the organizational structure shapes the way that data are presented to managers, such as whether a manager sees aggregated or disaggregated data. The structural position and responsibilities of the manager also play a part, such as whether the manager is at the center of the organization with responsibility for the organization overall, or is in a subunit and responsible for only that part of the organization. These different responsibilities will affect what data the manager attends to and so what inferences he or she is likely to draw. In this regard, the new theory relates to previous organizational theory analyses of organizational structure and to analyses of how the structural position of managers affects their decision-making (e.g., Pettigrew 1973).

There are further continuities with prior organizational theory. Several aspects of statistico-organizational theory are continuities with the Aston Program of organizational research. The present author worked earlier as part of the Aston Program of research into organizations and especially organizational structures (e.g., Donaldson and Warner 1974). The Aston Program studies compared across organizations by using a set of variables and measuring each organization on each variable (Pugh et al. 1963). The program featured much use of psychometrics. Variables were constructed from multi-item scales to increase their reliability. Variables were then combined into factors using factor analysis (Pugh et al. 1968). The relations between structural factors and multiple contextual variables were assessed using multiple regression analysis (Pugh et al. 1969). At the time, these techniques were common in psychology, but less so in organizational-level research. Now, by utilizing the statistical and psychometric techniques used in meta-analysis, we may again advance organizational theory and in this way maintain continuity with the Aston Program.

There is another continuity between the Aston Program and the present endeavor. The Aston research empirically analyzed organizational structure using the concept of bureaucracy from Weber. For Weber (1968), bureaucracy is itself a type of formal rationality applied to administration. Other manifestations of formal rationality are science, written law, religion codified into written precepts, and music written in notation. Statistico-organizational theory uses scientific principles of methodology to deduce substantive propositions about managerial inference. It is therefore a development that is part of the general tendency toward increased formal rationality that Weber delineated.

In social science, methodological principles play a very important role and function as the superordinate framework. Social science has various theories, and the empirical validity of each is ascertained by testing them according to the principles of sound methodology. Thus, methodology is the criterion used to judge the truth or falsity of theories. Methodology is therefore the core belief of many researchers. Whatever fondness they may have for a theory, many researchers will give it up if it does not pass the empirical tests—the tests set by methodological principles. Some organizational researchers are theoretical agnostics, freely admitting that they are not sure whether any theory is really right or wrong. However, they may be committed to methodology, in that they will abide by methodologically based tests of theories. An example, again, is the Aston study of organizational structure. This study empirically examined the relationships between organizational context and organizational structure, but without any hypotheses, reflecting a lack of belief in any one theory (e.g., size theory or technology theory) (Pugh et al. 1969).

Statistico-organizational theory, because it is derived from methodological principles that many social scientists already believe in, builds upon their prior beliefs, rather than asking them to subscribe to new ideas that lack credibility. Given researchers who believe in social science method, it is hard to see how they can refuse to entertain statistico-organizational theory, because to do so would violate their methodological beliefs. These beliefs are often formed through core professional training (e.g., in classes in PhD programs). They are then reinforced through working on research day by day, analyzing data. Either way, professionally competent academics are well versed in methodological principles, so they can readily grasp the logic of statistico-organizational theory. Indeed, the experience of the author is that, once the basic ideas of the theory are set before colleagues, they can themselves readily generate statistico-organizational theory propositions. Statistico-organizational theory may thus be developed by many colleagues other than the present author. Also, it may be developed in ways different from some suggested in this book. Indeed, other researchers, better equipped than the author to think about some of the methodological and technical issues, may improve the theory well beyond the present book. The theory thus has the potential to be an inclusionary theory—rather than a narrow, exclusionary paradigm (Aldrich 1992).

In Hunter-Schmidt meta-analysis, sampling error is the major source of error in social science research. Therefore, in discussing the sources of error from data in organizations, we will begin by analyzing size-based problems in Chapters 3 and 4 and then turn to the other sources of problems in later chapters: measurement error in Chapters 5, 6, and 7; range artifacts in Chapter 8; and confounds in Chapters 9, 10, and 11.

Part II

The Sources of Error


3

Managerial Errors From Small Numbers

Social science methodology has long been concerned with the limitations inherent in data from samples of small size. These difficulties are present in any situation in which a researcher is faced with trying to make an inference from data that contain this weakness. Similarly, managers can face small numbers in data about the operations of their organizations or about their organizations' environments. The problem of trying to make inferences from small numbers of observations can lead managers to make errors and unsound decisions. Yet the existence of this source of error in management is not generally recognized within organization theory. Accordingly, the task here is to develop a theory about the likely location and severity of problems in management due to small numbers of observations.

We consider the implications of the law of small numbers and show that it has a pervasive influence on managerial inference-making from numerical data. The number of observations upon which an inference is drawn is shaped by the nature of the thing counted, some things (e.g., pencils) being more numerous than others (e.g., oil refineries). The number of observations is also shaped by the size of the organization (e.g., its number of members). This confers an inference advantage on large organizations. Consideration of managerial inference suggests, contrary to the popular view, that there are certain advantages from large size and centralization, even for innovatory responses. Conversely, small organizations are more prone to error in the inferences their managers make, contributing to the well-known high mortality rate of small business firms. Organizational size is shaped in turn by the size of the country in which the organization is located. This confers an international comparative advantage in inference upon large countries. We close the chapter with a discussion of familiar organizational sociological topics, such as bureaucratization and professionalization, and argue that inferences underlie some of these processes.

The Law of Small Numbers

The larger the number of observations in a sample taken from a population, the closer the value of a statistic (e.g., the mean) of that sample will be to the value of the parameter (e.g., the mean) of the population. Thus, the larger the sample, the more accurate is the estimate that it yields of the population parameter. This is the law of large numbers. Moore et al. state it as: "Draw independent observations at random from any population with finite mean μ. As the number of observations drawn increases, the mean x̄ of the observed values gets closer and closer to the mean μ of the population." Moore et al. also explain that "the law of large numbers says that μ is also the long-run average of many independent observations on the variable. ... In the long-run ... the average outcome gets close to the population mean." Furthermore, Moore et al. stress that "if we keep on taking larger and larger samples, the statistic x̄ is guaranteed to get closer and closer to the parameter μ." And they reassure about the generality of this law: "[The law of large numbers] is remarkable because it holds for any population, not just for some special case such as Normal distributions" (Moore et al. 2009, 291).

Therefore, there is more random variation in small samples than in large samples (Yeomans 1968). Going from a sample of, say, 30 people to one of 300 people, the parameter estimates become much more stable. Thus, accuracy is problematic for small numbers of observations, which yield inaccurate estimates of parameters. The smaller the number of observations, the worse the inaccuracy problem; inaccuracy will be greater with only ten observations than with twenty.

The law of small numbers is the converse: the smaller the sample, the more a sample statistic varies randomly around the true value. By chance, on occasion, the sample statistic will be exactly the true value. More usually, the sample value will be above or below the true value. The smaller the sample, the more likely it is that the sample statistic will differ from the true value, and the more the sample value can differ from the true value. This principle of small numbers leading to inaccurate parameter estimates holds for any quantitative data, such as the numbers of employees, customers, sales, outsourced contracts, and so on. The noise in data is greater when the number of observations on which the data are based is small.

The value taken by a parameter estimate is affected by a host of underlying causes that produce random variation about the true value. The fewer the observations, the greater the random variation. When there are more observations, the underlying causes tend to cancel each other out, so that the resulting random variation is small. This is because any variable can be considered to be caused by a large number of underlying factors. On a chance basis, these will occasionally be positively correlated, producing extreme and unrepresentative parameter values. More usually, however, many of these factors covary in negative ways, so that they tend to offset each other and the observed value is more representative of the true, general parameter value.
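
A small simulation (the population values are invented) shows the law at work, by drawing repeated samples of increasing size from the same population and measuring how far the sample means typically stray from the population mean:

import numpy as np

rng = np.random.default_rng(3)
population_mean = 50.0

for n in (10, 30, 300, 3000):
    means = [rng.normal(population_mean, 15, n).mean() for _ in range(1000)]
    print(f"n = {n:4d}: sample means typically stray about "
          f"{np.std(means):.2f} from the true mean of {population_mean}")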

As the number of observations increases, the multiple underlying causes of the variable come to cancel each other out, so the parameter estimate becomes more accurate. Thus, any variable is itself like a portfolio, and the portfolio model of mutually counteracting causes applies to it. The same condition applies in finance, where negative correlations between the prices of the stocks of different companies produce stability in the returns of a portfolio of those stocks (Brealey and Myers 1966). Thus, estimates are more accurate when based on larger numbers of observations.

The important point is the absolute number of observations. The determinant of sampling error is not the number of actual observations as a percentage of all possible observations. Thus, for example, in estimating the mean job satisfaction of employees in a company, the sample size is the absolute number of employees sampled, not their percentage of total employees. For instance, if 20 employees are sampled out of a total of 200 employees in the company, the sample size is 20, not 10 percent.

It should be noted, however, that the accuracy of parameter estimates is not a linear function of the number of observations. One hundred observations do not yield a parameter estimate that is ten times more accurate than an estimate based on ten observations. For instance, the standard error of the mean of a sample is a mathematical function whose denominator is the square root of N (Moore et al. 2009, 296). Therefore, increasing N from 10 cases to 100, that is, by a factor of 10, increases the square root of N only from 3.16 to 10, that is, by a factor of 3.16. Thus, the reduction in the standard error due to increasing N is less than proportionate to the increase in N. Hence, the accuracy of estimation increases with the number of observations, but at a decreasing rate. Thus, bigger samples yield better estimates than small samples, but the rate of improvement decreases as samples become larger. Hence, even quite large samples tend to retain some sampling error, rendering their parameter estimates somewhat inaccurate. This is a reason why, given the sample sizes that are feasible in many situations, even seemingly substantial samples can still contain substantial sampling error.

The key point is that any parameter (e.g., a mean) estimated from numerical data will tend to have more error in it, the smaller the number of observations on which the estimate is based. This error reduces as the number of observations increases, but less than proportionately to the increase in the number of observations. Increasing the number of observations is the reason why, in social science research, rather than relying upon individual samples, it is better to combine the results from different samples in a meta-analysis, in order to have the bigger N that will have little sampling error (i.e., a small standard error). This logic applies equally to a manager making an inference from data in an organization, who would be well served to aggregate data from the whole organization, in order to maximize the number of observations, and to avoid disaggregating data.
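
The arithmetic of these diminishing returns is easy to verify (the population standard deviation of 15 is an arbitrary illustration):

from math import sqrt

sigma = 15.0  # hypothetical population standard deviation
for n in (10, 100, 1000):
    # standard error of the mean = sigma / sqrt(n)
    print(f"n = {n:4d}: standard error of the mean = {sigma / sqrt(n):.2f}")

Each tenfold increase in N cuts the standard error only by a factor of about 3.16.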

Some protection from the problem of small sample size may be sought by using statistical significance testing to see whether sampling error could have caused the sample statistic to occur. The significance test tells whether the statistic has a certain probability of occurring given the true value in the population. For example, given some true mean value, is the observed mean value in the sample likely or unlikely to occur—and with what probability? However, even a manager who uses a statistical significance test may still make an erroneous inference, because of the low statistical power of some significance tests. The power of a test is defined as 1 minus the probability of making a Type II error—that is, of accepting the null hypothesis when it is in fact false (Siegel 1956, 9–10). Power is low when the number of observations is small, so it is problematic to use significance tests to try to avoid making false inferences due to small samples (see Hunter and Schmidt 2004, 59–64).

The problem of the low power of a statistical test is acute when the true parameter value is small (Hunter and Schmidt 2004). Yet there are likely to be many occasions in management when the true value is small, because any one factor is only one cause out of many. This applies to measuring the degree of association, such as the correlation, between an effect and one of its many causes. If an effect has a single cause, then the true underlying association will be very high, in fact a correlation of one (i.e., r = 1.0). However, many organizational phenomena are widely believed to have multiple causes—witness almost any textbook on most aspects of management. For example, organizational effectiveness (i.e., performance) has been shown to have many causes (e.g., Schlevogt 2002). Therefore, the true correlation between performance and any one of its causes will on average be low (e.g., r = .2), and so the power problem of statistical significance tests will be present. Thus, statistical significance tests applied to managerial data will not always prevent erroneous managerial inferences.

Samples in Organizational Management

In statistics, a sample refers to a set of cases taken from a population or universe. Because of sampling error, the set of cases will usually be less than completely representative of the population. Therefore, a sample statistic, such as the sample mean, can differ in value from the value of that parameter in the population. Many data sets in organizational management are samples, and so the problem of sampling error applies to them, as in social science research.

In research, the theory is typically a generalization that applies to the whole population or universe. In turn, a figure (e.g., a mean or correlation) from the data in a study is not merely descriptive of that study, but is intended to be an estimate of the population value (Schmidt and Hunter 1984; Schmidt et al. 1985). Thus, one can ask: what is the population of which the study is a sample? For example, Blau's (1970) theory that size leads to specialization applies, as a theory, to work organizations of all types in all countries. Any study of a set of organizations, such as organizations in one city or in one industry, is a part of the population of organizations.

Samples in Organizational Management

In statistics, a sample refers to a set of cases taken from a population or universe. Because of sampling error, the set of cases will usually be less than completely representative of the population. Therefore, a sample statistic, such as the sample mean, can differ in value from the value of that parameter in the population. Many data sets in organizational management are samples, and so the problem of sampling error applies to them, as in social science research. In research, the theory is typically a generalization that applies to the whole population or universe. In turn, a figure (e.g., mean or correlation) from the data in a study is not merely descriptive of that study, but is intended to be an estimate of the population (Schmidt and Hunter 1984; Schmidt et al. 1985). Thus, one can ask: what is the population of which the study is a sample? For example, Blau’s theory (1970) that size leads to specialization, as a theory, applies to work organizations of all types in all countries. Any study of a set of organizations, such as organizations in one city or in one industry, is a part of the population of organizations. Researchers treat the individual study as a sample of the population, because the main interest is not to know something about just that concrete set of organizations in the study (i.e., in the city or industry), but to make statements about the theory by seeing whether the general theory holds in the organizations studied (e.g., whether size correlates with specialization). Researchers often use inferential statistics, such as significance tests (with their limitations), to see whether the parameter value in the study is compatible with the parameter value that the theory states exists in the population. For example, if the theory states that size and specialization will be positively correlated, a test may be made to see whether the positive correlation in the study is large enough to be significantly different from zero. Thus, the significance of the study correlation provides evidence that the study has come from the population depicted by the theory. Hence, the study provides empirical support for the validity of the general theory, because the study is considered to be a sample from the population (e.g., the population of work organizations). The claim that a theory holds generally is often investigated in research by conducting more than one study. Where the first empirical study is followed by a second empirical study, researchers may examine whether their results are the “same” by seeing whether they are statistically significantly different from each other. Thus, if the first study produced a correlation of +.5, the question is whether the second study produced a correlation that is significantly different. If the correlations are not significantly different from each other, then the conclusion is drawn that the results are the same. This conclusion may be reached even though the second correlation is not identical—that is, it may be +.4. The researchers are thus recognizing that the two studies could be from the same population, but differ to some degree in their correlations because of sampling error. Thus, the two studies are being treated as samples from the same population. Latterly, general validity across numerous studies of the same relationship may be examined by conducting a meta-analysis. Again, this technique treats each study as a sample from the same population. In statistics, the techniques of significance testing are often referred to as inferential statistics because they indicate what inferences can be made from the observed sample to the population. Thus, researchers use inferential statistics to make better inferences from samples to the population. Similarly, the managers of an organization often look at a set of numbers to make an inference from that data set to the population. Thus, the sales of automobiles by Ford in any month are a subset of the company’s sales over a longer period. In using the data for a month, managers at Ford seek information about causal processes that operate beyond that month—that is, beyond the immediate data set. For instance, they may investigate whether there is a relationship between advertising and selling more automobiles. The managers may examine whether the sales in months when Ford advertised in the newspaper are greater than the sales in months without the advertising. Their interest is in the general question of whether advertising boosts sales. Thus, the monthly sales figure is treated as an observation about an underlying variable, namely, Ford sales in general. Hence, the monthly sales figure is a sample from the population of sales over all months.
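
A simulation sketch makes the point vivid. Assume, purely hypothetically, that true demand is constant and that monthly unit sales are Poisson-distributed counts: a national operation selling about 50,000 units a month against a small-city dealership selling about 8:

    import numpy as np

    rng = np.random.default_rng(1)
    months = 12

    national = rng.poisson(lam=50_000, size=months)  # ~50,000 units per month nationally
    city = rng.poisson(lam=8, size=months)           # ~8 units per month in one small city

    dev_national = (national - 50_000) / 50_000 * 100
    dev_city = (city - 8) / 8 * 100
    print("national: percent deviation from true demand:", np.round(dev_national, 2))
    print("city:     percent deviation from true demand:", np.round(dev_city, 1))

    # Typical output: the national figures stay within about half a percent of
    # true demand, while the city figures swing by 30 percent or more from
    # month to month -- even though true demand is identical and constant.

Any apparent trend in the small-city series is, at these numbers, indistinguishable from chance.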

These managers are like researchers in that they are looking at a data set to answer a more general, causal question. The monthly sales figure is not being looked at as purely descriptive—that is, as just a thing in itself. Instead, the monthly sales figure is a sample. Therefore, it is subject to sampling error in that the monthly sales figure is not completely representative of Ford automobile sales in general. Hence, the problem of small samples leading to random variation around the value in the population will occur for managers looking at data about their organization or its environment. So the smaller the number of observations, the less representative of the population the sample will tend to be. In fact, Ford sells many automobiles each month, so sampling error, while existing, will only be small; but the Ford dealership in one small city would sell far fewer in a month. Moreover, if the analysis were to be made of just one model, say the Taurus, perhaps only a handful would be sold in a month. Inferences about the real prospects of that model from the data of sales in one small city would be problematic, because the data would contain much random variation. Thus, for instance, a trend of declining monthly sales of that model in that one city would not necessarily validly indicate a real decline for the market because the monthly variations could have arisen by chance. This is the same problem as in academic social science of trying to infer whether a general theory holds true from a small sample in one city. Thus, in management, managers who want to know about a general variable, such as sales in a market, will look at a data set not for its own sake, but to learn about the general picture. The sales of one model in each of ten small cities for one month are like ten samples from a population. Even if the causal forces operating on sales are identical in the cities, so that they conform to a general causal pattern, each sales figure will vary randomly around the sales of the population due to sampling error. In summary, a single data set examined by managers is often a sample of a larger population. Therefore, the data set is subject to sampling error that will produce spurious variations around the true population value. Hence, the problems that afflict samples in social science research also apply in many managerial settings. In particular, small samples have more sampling error. Therefore, the law of small numbers, whereby smaller numbers of observations contain more random error, applies to many situations in which managers seek to glean information from their numerical data. Managers seek to make inferences, thereby raising the issues dealt with by inferential statistics. Hence, it is appropriate to analyze managerial inferences in terms of sample size problems. When statisticians talk about a sample, they are referring to a random sample—that is, a sample composed by drawing a sample from the population using random number tables (Ehrenberg 1975). The sales of Ford cars in one small city are a sample but not a random sample, in that they are not drawn at random from the population. However, they are like a social science study of organizations drawn from one city, such as Birmingham, England (Pugh et al. 1968), or of one industry (e.g., textiles) in one country (e.g., Canada) (Clark 1990), which

has scores of the organizations on some variables (e.g., specialization). A study is then used to examine the validity of general theoretical relationships, even though the studies are not random samples of the population of organizations in general. Such studies may use inferential statistics to allow for sampling error in the studies when making inferences about the population of organizations in general, so that the studies are treated as samples. However, because the studies are not random samples, the sampling error calculations from inferential statistics are not completely accurate. The nonrandom samples will tend to have more sampling error than that calculated by inferential statistics. Therefore, the allowances made for sampling error are conservative, in that some of the sampling error is not corrected for. The same will hold for managers; the often nonrandom samples that they examine will have more sampling error than would be corrected for if they used inferential statistics, such as confidence intervals or significance tests. Hence, the sampling error variation present in data being examined by managers will be greater for nonrandom samples. The sampling error given by statistical formula will tend to understate the amount of sampling error. Therefore, the hazards of managerial inferences from small numbers of observations will be greater than would be the case if the samples were random. This strengthens the argument of statistico-organizational theory that managerial inferences from smaller numbers of observations are problematic. Hence, the nonrandomness of some samples used by managers reinforces the cautions expressed under the heading of the law of small numbers. When sales are the variable being dealt with in statistico-organizational theory, it is not the dollar value of the sales that is the variable of interest; rather, it is the number of separate transactions. Each separate transaction is an observation, and so the number of observations (N) is the number of transactions. Hence, the N can be small, because the number of events may be much less than their total dollar sales value, so that the small numbers problem could be stronger than might appear by looking at a large dollar value of total sales. The implication of the law of small numbers for management is that a manager looking at any data that are based on a small number of observations is looking at data that are infected with artifactual error—that is, with random chance effects that obscure true cause-and-effect relationships. The proposition from methodology is: 3.1 The smaller the number of observations, the more erroneous the inference from them. Applied to managers making inferences from numerical data, this becomes: 3.2 The smaller the number of observations used by managers, the more erroneous the inference they make.
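
The transaction-count point lends itself to a worked illustration (all figures hypothetical): two businesses with identical dollar sales but very different numbers of transactions face very different chance swings in their totals:

    import math

    # Two hypothetical businesses, each with $5,000,000 in annual sales:
    # a machine-tool maker (10 transactions) and a burger chain (500,000).
    for name, n, mean_ticket in (("machine tools", 10, 500_000.0),
                                 ("burger chain", 500_000, 10.0)):
        sd_ticket = 0.3 * mean_ticket        # assume ticket size varies, sd = 30% of mean
        total = n * mean_ticket
        se_total = sd_ticket * math.sqrt(n)  # standard error of the total = sd * sqrt(N)
        print(f"{name}: chance swing in annual total = {se_total / total * 100:.2f} percent")

    # Output:
    # machine tools: chance swing in annual total = 9.49 percent
    # burger chain: chance swing in annual total = 0.04 percent

The dollar totals are identical, but their reliability differs enormously, because the number of observations behind them differs so greatly.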

In management, any condition of the organization that produces large numbers of observations leads to more accurate estimates of the variables of interest to its managers and hence produces better inferences from data, leading to better decisions. These conditions include factors such as organizational size (e.g., number of employees or number of sales). They also include the way that information is structured, as a result of organizational strategy and structure. A major way to increase error in data, and so degrade the quality of inferences made from data, is to reduce the number of observations through disaggregation, such as into divisional, departmental, or branch figures, so that the number of observations on which those figures are based is small.

Inference and the Nature of What Is Counted

Some organizational variables naturally come in large numbers (e.g., hamburgers in fast-food restaurants), and others come in small numbers (e.g., milling machines sold by a capital goods manufacturer or the number of accounting systems in a diversified company). Therefore, it is easier to obtain more adequate, larger numbers of observations on the former aspects (i.e., piece-parts) than on the latter, more holistic aspects of the organization and its environment. More holistic aspects, with their inherently smaller numbers of observations, will be more prone to sampling error and therefore to less accurate inferences from numerical data than will piece-parts, with their inherently larger numbers of observations. The proposition is: 3.3 Management inference will tend to be less valid about inherently less numerous, holistic aspects of an organization and its environment than about piece-parts.

Errors Caused by Small Organizational Size

Large organizations deal in larger volumes, and thus in larger numbers of observations, and so achieve more accuracy in parameter estimates than smaller organizations that inherently deal in smaller volumes. Thus, many data in large corporations are based on large numbers, such as total national sales or total labor costs for the year. However, in small business firms, much of the data on their operations and environment is based upon small numbers of sales or customers, and hence the data limit the ability to produce valid inferences. Large organizations (i.e., those with many employees) are usually seen in organization theory as inflexible and unable to respond to their environments compared with small organizations. This rigidity is due to several factors, including the bureaucratization and dispersion of power in the large organization (Mintzberg 1979). At the extreme, large organizations are seen as inertial and unable to cope with changes in their ecological niches (Hannan and Freeman 1989). Accepting

some of this argument, we nevertheless see possible inference-based advantages enjoyed by large organizations that help them respond to their environments. Organization theory has previously tended to overlook the inference advantages of large size, and here we identify them as a step toward a more balanced view of the large organization. Partly under the influence of population ecology theory, some organizational scholars subscribe to the view that large corporations seldom make adaptive changes and are prone to die out (Hannan and Freeman 1989). However, an analysis of the hundred largest U.S. corporations shows that they made substantial changes to their strategies and structures (Fligstein 1985, 1991) and had low rates of mortality, less than 2 percent per annum (Fligstein 1990). Chandler (1977) reaches a similar view that large corporations endure successfully (see Donaldson 1995, 73–75). Thus, it behooves us to identify the advantages of large organizations, because they clearly must possess some. Several, such as scale economies, have already been identified in the literature (Chandler 1977). Here we suggest that an additional advantage of the large organization may lie in its capacity to make inferences from data. Large organizations enjoy an inherent superiority through the capacity of their managers to make sound inferences based upon data. Managers at their apex examine figures, such as total corporate sales, that are produced by aggregating large numbers of observations, which removes much of the random variation, so these figures tend to be relatively sound guides for action. Conversely, even the top manager of a smaller organization examines figures, such as firm sales, based only on a small number of observations. In consequence, the figures can contain patterns and seeming trends that are just quirks resulting from small numbers. Thus, small organizations are inferior inference engines and are disadvantaged relative to larger organizations in making data-based inferences and in the decisions that flow from them. Large organizations can make more accurate numerical estimates than small organizations, allowing the large organizations to better assess and control their activities. Similarly, thanks to these more accurate numerical estimates, the larger organizations can better forecast their futures than smaller organizations can forecast theirs. Therefore, the tasks of direction and planning can be performed better in larger, than in smaller, organizations. In larger organizations the key numerical indicators that managers use in their inference-making are less subject to random variation than those figures used in smaller organizations. This holds even if both the large and small organizations are producing the same product or service for the same market, using the same process technology and input materials. Thus, figures such as the annual sales, the cost of a product, or the probability of an item going out of stock are all less subject to random variation in large organizations. In the large organization, its figures are more meaningful and are better bases of prediction and planning. Better predictions can be made about the future regarding customer demand, raw material needs, operating costs, and so on. This means that managers examining figures to identify a problem, such as the

cause of falling sales, have an inherently easier task in large than in small organizations. Interpretation hinges on causal analysis. This entails finding variables, such as price or advertising or type of outlet, which covary with the dependent variable, such as total sales. But the sales figures of a small firm are based on small numbers and thus contain a large amount of random variation. Hence identifying the real causes—that is, the real independent variables—is more difficult because of all the random noise. The association between the causes of sales and the sales figures in the small data set tends to vary from the true association. It can be weaker or stronger than the true association. Either error provides a false message about the likely impact on sales of changing the cause (e.g., the price). In small organizations, as compared with large organizations, data will tend to be from a smaller number of observations and therefore will tend to contain more random variation. Therefore, errors from data will be greater in smaller organizations. Occasionally, the parameter values will be the same as the true value, but usually they will be above or below it. Thus, the erroneous inferences from data are more likely in small rather than large organizations. Also the errors are likely to be more egregious in small organizations, in that the magnitude of errors can be greater, because of the greater variations from the true value that are possible in smaller data. Therefore, the average error will be greater in small organizations than in large organizations. 3.4 The smaller the organization, the higher the probability that its managers will make erroneous inferences from its data. 3.5 The smaller the organization, the greater the errors that its managers will make from its data. By a smaller organization we mean here smaller in number of its employees. Many other aspects of size correlate with number of employees, such as sales revenue and capital (Child 1973b; Hopkins 1988; Lioukas and Xerokostas 1982; see Donaldson 1996, 147–158). The foregoing propositions are meant to apply to these aspects of size that vary closely with organizational size. As already mentioned, some other variables, such as number of items (e.g., pork pies) produced, can be large even in small organizations (i.e., organizations with few employees). Therein, the critical variable determining inference errors about operational aspects is the number of items produced, which could be many, even though the organization is small in terms of numbers of employees. Thus, an organization could be large for number of items produced but small for employees. Accordingly, it would be prey to greater inference errors for analyses about its employees than about the numbers of items produced. Thus, whether it is the number of employees or the number of products (or service transactions) that determines the error in an analysis depends upon whether the subject of the analysis is employees or products, or some variable associated with either employees or products. Propositions 3.4 and 3.5 are to be interpreted in this way.
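
The wandering of small-sample associations around their true value can be sketched by simulation (hypothetical numbers; a true price-sales correlation of -.3):

    import numpy as np

    rng = np.random.default_rng(2)
    true_r = -0.3  # hypothetical true correlation between price and sales
    cov = [[1.0, true_r], [true_r, 1.0]]

    for n in (24, 500):  # e.g., two years of monthly data vs. a large pooled data set
        estimates = []
        for _ in range(1000):
            price, sales = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
            estimates.append(np.corrcoef(price, sales)[0, 1])
        lo, hi = np.percentile(estimates, [2.5, 97.5])
        print(f"N = {n:3d}   95 percent of estimates fall between {lo:+.2f} and {hi:+.2f}")

    # Typical output: at N = 24 the estimates range from about -.62 to +.10;
    # at N = 500 they stay between about -.38 and -.22.

At the small N, the estimated association can even carry the wrong sign, falsely suggesting that raising the price increases sales.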

When their number of observations is small, a solution to the problem for the organization is to amass larger quantities of data, such as by combining sales over several years. Then, the parameters will become relatively freer of random error. However, when an organization amasses data over several time periods (e.g., months), this will also conceal any real changes in the parameter value from time period to time period. Given that problems of small numbers of observations tend to occur more in small organizations, these considerations imply that the same degree of accuracy can be approximated in the small, relative to the large, organization, but only after a greater time has elapsed, because the small firm takes years to accumulate the number of sales that a large corporation makes in a month. Hence, a large organization may have valid monthly data and so see valid trends (i.e., changes over those months). In contrast, the small organization has to accumulate months of data to have a valid figure and so may miss a trend or change point or only see it after the large organization. Thus, smaller organizations are inherently slower in diagnosis of changes occurring over time than large organizations. An implication is that large firms have a particular advantage in their better diagnostic capacity in situations that are rapidly changing. When an industry turns down in sales, a large firm will spot, plot, and extrapolate from this trend more quickly than will a small firm whose data are too insensitive to yield meaningful tracks by month. Thus, the more changeable, unstable, unpredictable, turbulent, and discontinuous is the environment, the greater is the adaptive advantage of large over small organizations in terms of data-based diagnoses. There may be offsetting factors, such as the greater centralization of power and lesser bureaucratization of the smaller firm, that raise its probability of successful adaptation to change relative to the larger firm. However, any advantage in implementation by small organizations comes with the disadvantage, relative to the large organization, of being less able to formulate the appropriate plan because of deficiencies in data. The point for the present discussion is that, whereas the structural flexibility of small firms is conventionally recognized in organization theory, the inherent inference weakness of small firms is overlooked. Some changes in the environment may be of qualitative kinds that are not caught in the data stream of a large organization. A small organization whose managers are close to the product, customers, and production process may well catch sight of such a change sooner and appreciate it better. However, it may be too easy to overstate the amount of organizationally important change that is of this “new paradigm,” frame-breaking variety. Large organizations routinely collect data on many aspects of sales, costs, and profits; and the factors that affect these variables are those that are of most significance in business organizations. Such factors can be more readily evaluated when there are streams of aggregate data on all of these variables than when such data are poor or absent. Therefore, much dealing with environmental change involves, in part, managers looking at quantitative data and seeking to make inferences from them. Smaller firms are in closer touch with their customers, in a concrete sense, than large organizations, but large organizations better understand their business as a

system interacting with its environment than do smaller organizations. Smaller firms can attend to an individual customer or to part of their own business more alertly and more quickly than larger firms. However, the problem for the smaller firms is knowing what action to take, because, relative to the larger firms, they have less ability to make an accurate diagnosis of what is causing the problem. Therefore, actions taken in small firms have more of the character of gambles or blind stabs, some of which will turn out to be correct on a chance basis. In contrast, in larger organizations, better quality diagnosis of underlying true cause and effect is possible, leading to valid solutions as the bases for actions taken. Thus, while small firms may be “faster on their feet,” they may also be “slower and more confused in their head” than larger firms. 3.6 The smaller the organization, the longer the time period over which it must accumulate data in order to make an accurate estimate of a parameter (e.g., mean). 3.7 The smaller the organization, the more slowly it can diagnose changes from its data.

Weak Inference and Organizational Mortality

Another extension of statistico-organizational theory relates to the explanation of different mortality rates of business firms. Mortality in the sense of bankruptcy, failure, and disbandment of the firm has attained theoretical interest through population ecology. In particular, new and small businesses (which are often the same thing) have been identified as prone to high mortality rates—that is, to the “liability of newness” (Hannan and Freeman 1989). More colloquially, it is widely acknowledged that small business is risky relative to big business, with most new firms failing to survive more than five years (Aldrich 1979). The riskiness of small business is often attributed to factors such as inadequate capital, overexpansion, excessive stockholding, lack of trained management, and uncertainty in the environment of the small business. An additional reason for the uncertainty of small businesses is simply that they are small. Small size means few customers, few sales, and few employees, therefore rendering hazardous the making of inferences from data that are beset by the properties of small numbers. Failures to forecast future requirements in capital and stocks, often ascribed to deficiencies in training, would be problematic—even were the entrepreneur trained—because of the small number of observations from which to make inferences about the future. Thus, some part of the unusually high mortality rates of new and small business resides in problems with inference. The proposition is: 3.8 The high mortality of small organizations is due in part to their greater inference problems because of their small size.
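
The forecasting hazard behind this proposition, projecting future requirements from few observations, can be sketched numerically (all figures hypothetical):

    import numpy as np

    rng = np.random.default_rng(3)
    true_demand = 30  # hypothetical true mean units demanded per month, stable over time

    for history_months in (6, 60):
        errors = []
        for _ in range(2000):
            history = rng.poisson(true_demand, size=history_months)
            forecast = history.mean()  # naive forecast: project the historical mean forward
            errors.append(abs(forecast - true_demand) / true_demand)
        print(f"{history_months:2d} months of history: "
              f"median forecast error = {np.median(errors) * 100:.1f} percent")

    # Typical output: about 5 percent median error from 6 months of history,
    # and under 2 percent from 60 months.

An entrepreneur budgeting stock or capital from a few months of figures is exposed to this extra error however well trained he or she may be.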

One implication is that more of what occurs in small organizations is blind variation and more of what occurs in large organizations is correct adaptation. This means that the mechanism of blind variation followed by culling of nonadaptive structures—the mechanism emphasized in population ecology theory (Hannan and Freeman 1989)—will be more prevalent in small rather than in large organizations. Thus blind variation, mortality, natural selection, and population-level adaptation will all be greater and more important mechanisms in small rather than in large organizations. This is due in part to the weaker inferences that can be made in small organizations, as compared with larger organizations.

International Comparative Advantage in Inference

In terms of international competition between nations, statistico-organizational theory argues that large nations with large companies thereby possess an inherent inference advantage vis-à-vis smaller nations. The larger the nation, the larger will be its largest companies, and so the greater will be the accuracy of their figures. By contrast, smaller nations typically yield smaller organizations, even for their largest companies, and so their data have more sampling error. As emphasized, the error from small size is sampling error, which decreases as the number of observations increases. The sampling error of a mean is inversely proportional to the square root of the number of observations. Therefore, a country whose population is one hundred times that of another country has one-tenth the sampling error in calculating means, such as when comparing means to try to identify valid causes of managerially important effects. Thus, a country with a population of 300 million will have one-tenth the sampling error of a country with a population of 3 million. The largest companies in each country will differ in their numbers of observations roughly in proportion to their country sizes. Therefore, there will be about ten times as much sampling error in the data of the company that is based in the small country as in the company that is based in the large country. Thus, the company in the large country will enjoy about ten times more size-based inference advantage than the company in the small country. Moreover, country size, by setting company size, also affects the ability of the company to draw valid inferences from disaggregated data. For large companies in large countries, their large size leads to a larger size of the groupings produced if their data are disaggregated. Hence, disaggregation could still yield sufficiently large numbers to permit valid inference in the data of organizational subunits (e.g., by broad product categories); similarly, as regards disaggregation of figures about the environment of the organization (e.g., the market of the company). Disaggregation, for example by demographics, could still yield data sets large enough for valid inferences about real differences between consumers by gender, age, or other factors. Therefore, valid market segmentation is more viable in the larger organizations of larger countries. By contrast, for small countries, the smaller size

of their largest organizations means that disaggregation is more likely to introduce errors and erroneous inferences. For the United States, a nation with a large population and large-sized corporations relative to most other countries, including Japan, its large size offers a comparative advantage in management analysis. U.S. organizations can develop more valid analyses of their internal organization and environment than can organizations in other countries. It is a commonplace that many consumer trends occur first in the United States. This may be due in part to the United States being ahead in objective changes, such as demographic shifts, or ahead in innovation or fashions. However, even if these changes were not occurring earlier in the United States, corporations in the United States might be able to perceive them earlier because of inference advantages. The United States can exploit this advantage, while it lasts. The economic development of countries such as China and India that have population sizes larger than the United States means that those countries may increasingly give rise to companies that are larger than the U.S. firms with which they compete. Also, at present British and French companies are limited in their inference potentialities by the smaller size of their organizations and of their markets, relative to their U.S. competitors. But as Europe evolves toward a fully integrated European Union, the resultant superstate, e.g., the United States of Europe, will be larger in size (and wealth) and will be able to support larger domestic corporations than the United States. Thus, the size-based inference superiority of the United States may not last for many more years. These national size differences, however, are mitigated by multinational or global corporations that operate in many countries other than their home country. They can grow much larger than their domestic base would allow. Thus, the multinationalization of a company from a small country can potentially offset its inference disadvantage. However, any inference benefits from multinationalization exist only if the company aggregates data across the countries in which it operates. It is not clear that this always occurs. Managements of multinational corporations may instead analyze data only within each country or region of the world. Moreover, top management might rely primarily upon the data about its home country. These parochial practices mean that differences between countries in inference advantages due to country size would persist to some degree even in multinational corporations. Another offsetting factor is that organizations in smaller countries may seek to reduce the size-based inference advantage by pooling data, through trade associations, government bureaus, and so on. 3.9 Large countries will tend to possess larger organizations than smaller countries, so that the data in organizations in larger countries will tend to contain less sampling error than in organizations in smaller countries.

3.10 U.S. companies presently enjoy a comparative advantage in inference, but that may be reduced over time due to the rise of competitors from large countries.

Relationships With Some Previous Organizational Theories

Existing organizational theory stresses benefits to organizational decision-making and effectiveness from bureaucratization, repetition, automation, and professionalization. However, these tend to be fostered by an increase in organizational size, so that some part of the inference advantage from larger size flows indirectly through these processes. Thus, some of the beneficial aspects of these developments come from the greater accuracy of parameter estimates from larger N data. Hence, the conventional discussion of the benefits from routinization and allied processes needs to be reinterpreted through the lens of the newer theory, statistico-organizational theory. Bureaucratic theory (Weber 1968) speaks of the deliberate attempts to reduce variability through bureaucratic structuring (e.g., rules and standard operating procedures). However, the bureaucratic norms are sensible and functional only in that they arise out of learning based on a large number of trials, so that analysis can detect stable patterns and thus pinpoint effective procedures. Therefore, effective bureaucratization is based on the inference advantage of a large number of observations. Such large N is more likely in an organization of large size (i.e., number of employees) or scale (e.g., number of products sold), so that the well-known connection between organizational size and bureaucratization in conventional organizational theory (Pugh et al. 1969) reflects in part the causal processes identified in statistico-organizational theory. Bureaucratization is fostered by repetition of the same operation (e.g., hiring employees, making the product, conducting a sale, delivering the service), leading to mass operations done in a standard way (Blau 1972). The standardization can be done sensibly because the main parameter values are well known, due to repetition producing a larger number of observations. Thus, repetition leading to standardization is facilitated by the inferences made possible by larger Ns. The superior inference promoted by larger N allows organizational managers to define effective, standard ways to do the work of the organization. The increase in repetition and bureaucratization that size induces leads to rules and standard procedures that reduce uncertainty. However, larger size provides a reduction in uncertainty in addition to the reductions brought about by repetition and bureaucratization. Every time a manager looks at a figure, it is a better guide to the future if it is based on a large N. Thus there is a direct contribution to the reduction in uncertainty from size itself, flowing from superior capacity to make sound inferences. This can occur for decisions that have not been standardized or bureaucratized, and so some part of the inference advantage from large size flows

directly to uncertainty reduction, without going through the intervening causal variables of repetition or bureaucratization. Once managers have an organizational system whose main parameters are known fairly accurately, allowing its cause-and-effect chains to be understood, they can take further steps to run the system more effectively. This will often involve developing the system so that it is even more predictable and even less randomly variable in its operation. For example, the managers may be able to institute automation, which will drive out some of the residual uncertainty in the operating system. One of the key insights of Woodward (1965) was that as batch size increased from small to mass, so the production system became less uncertain and more predictable, because managers battled with production within steadily finer tolerances. Increasing batch size means larger Ns, and so there will be more observations, leading to better estimates of parameters that in turn allow automation. Hence, part of what is conventionally ascribed to advances in technology leading to greater predictability and efficiency is actually based on larger numbers of observations. Moreover, some increases in the effectiveness of operations are attained through instituting production planning and total quality assurance programs. The prerequisite for these is large-scale production—that is, sufficient numbers of products per type and period to give predictable, stable parameters from which to plan and design these systems for improvements. In contrast, when production runs are small, the parameters tend to vary widely, obstructing planning and systems design. Refinements through large-scale production come about through learning, and the basis of cognitive managerial learning is not mere behavioristic repetition, but large numbers of cases for analysis. Much of what we have referred to here is often discussed as professionalization of management in large organizations—that is, having an array of specialists skilled at statistics, planning and control, and so on. Their skills typically involve creating and analyzing databases. Yet their conclusions are only as good as their data, and large data sets are better for statistical estimation. Since size of database is such a key, all-pervasive prerequisite for high-quality professional work, professionals will have more value in large rather than small organizations. This is a reason, an underlying material fact, why professionals and administrative specialization are used more in large than in small organizations. This holds both for the expertise of internal specialists and for the employment of external consultants (Child 1973b). Cynics argue that elaborate internal administrative specialization and employment of consultants are symptoms of too much wealth in large organizations. This argument belies the fact that modern professional experts need data, a lot of data, to properly exercise their profession and this data cannot be as abundantly available about small organizations. In summary, size leads to bureaucratization, repetition, automation, and professionalization, with the beneficial consequences recognized in existing organizational theory. Some of these processes and benefits rest on inference advantages from having large numbers of observations, in that the ability of managers and professional staff to make decisions from numerical data is facilitated by large Ns.

Conclusions

The focus of this chapter has been on discussing the law of small numbers and mapping its effects on managerial inferences. Wherever estimates are based on small numbers of observations, the error will tend to be high. Thus, the nature of what is counted plays a role: less numerous holistic things lead to greater inference errors than more numerous piece-parts. Managers in small organizations tend to make more erroneous inferences from numerical data than managers in large organizations. Erroneous inferences based upon small numbers are endemic in small organizations and contribute both to their mortality and to the shaping of their population by the high rates of change among the organizations that compose it. The size of organizations reflects the size of their countries, and so organizations in large countries have an inference advantage over organizations in small countries. At present, U.S. corporations benefit from this advantage, but it may be eroded in the future. The contributions to uncertainty reduction and effectiveness from bureaucratization, repetition, automation, and professionalization are underlain by the inference advantages of having large numbers of observations. In the next chapter, we shall discuss how even in situations where large numbers of observations are available, the organization may fail to aggregate them and so incur small numbers problems.

4 Data Disaggregation and Managerial Errors

As we argued in the last chapter, large organizations have an advantage over smaller organizations, in that large organizations tend to have larger numbers of observations, allowing their managers to make more valid inferences from their organizations’ data. The inherent inference advantage of the large organization can be dissipated, however, when the organization disaggregates its total data into component parts, such as divisional, departmental, sectional, and branch data. Going down the hierarchy, the successive subunits and sub-subunits, such as divisions, departments, and sections, become smaller. Thus, the more that data are disaggregated, the more problematic inference becomes, because the underlying number of observations becomes smaller. Similarly, the more the data are disaggregated by product-market segments, the smaller the number of observations there are for each segment. It therefore becomes important to consider why organizations disaggregate data and whether this practice really helps them achieve their intended outcomes. Inference problems stemming from small numbers due to disaggregation of data lead to false managerial inferences with dysfunctional consequences. In this chapter, we extend statistico-organizational theory to some specific topics: human resource management, cycles of control, and the problem of making inferences by aggregating previous inferences. We also consider the fallacy of immediacy and the data disaggregation to which it leads. Finally, having established that it is desirable for the organization to aggregate its data, we consider the organizational structure that is required to do so.

The Fallacy of Disaggregation

While the pitfalls created by disaggregating data are well known in statistics, many laypeople nevertheless believe that “to understand a figure you have to break it

down and see what is really going on.” This refers to disaggregating, for example, total firm sales into the sales of each product. Indeed, analysis often connotes going from the broad brush to the detail. Disaggregation is felt to reveal more, as if one is putting the leg of a fly under a microscope. However, this idea that disaggregation provides meaningful analysis is fallacious, because disaggregation can obscure the truth. Unfortunately, there is no equivalent of the microscope in statistical analysis. The smaller focus of the subsample produces not a clearer insight into the real processes, but a blur, because of the randomness introduced by the small number of observations. Yet faced with a performance problem, say, poor national sales figures, the response of many managers would be to disaggregate them by product, or region, or whatever. Such disaggregation will reveal differences across groupings of the data, seemingly confirming that the national average figure is “just a meaningless average” that is “misleading” and “masks important variations.” The problem is that, while the disaggregated figures will contain real variation by product, region, and so on, this real variation will be muddled up with artifacts that are introduced by the disaggregation. These artifacts grow worse, the further the disaggregation is pressed, because the size of the resultant subgroups gets progressively smaller. Progressively finer disaggregation reveals more differences across sub-subgroups, but much of this difference is meaningless. Thus, disaggregation of data can be pathological for managerial inference. The more finely that data are disaggregated, the less valid the managerial inference and the worse the resultant decisions. Smaller Ns produce fuzzier numbers. The ratio of noise to signal increases as the N decreases, and hence the ratio increases with disaggregation. Going down the organization from apex to grassroots, any performance figure, such as annual sales, becomes disaggregated into divisions, regions, areas, cities, and so on. These disaggregated sales figures, say, by city, are necessarily small samples, relative to total company sales. Thus, an overall company sales growth of 6 percent in the year may, down at the city level, show wide variations, from 2 percent to 9 percent sales growth, as the simulation sketch below illustrates. Localized causes, which are random noise at the systems level, yield major variation at the city level. These localized shocks are short-lived and multitudinous, thus canceling each other out over the long run and in aggregation. Nevertheless, the figures will show more variation the more they are disaggregated. Yet much of this variation is illusory, produced simply by disaggregating the organization’s data. The hazards of disaggregation are reinforced by the psychological mechanism of attribution. The shift from aggregate to disaggregate is a shift from the abstract to the concrete: from the sales of Sears to the sales of washing machines at Sears, from national sales to sales in the East, to sales in Boston, to sales in the downtown Boston store, and then to individual salespersons’ results. The human mind clutches readily at the concrete product or the individual city or individual person and can readily invent all kinds of causal attributions for good and bad sales. Some of these attributional theories will be true, but many will be false.
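
A simulation sketch of the artifact (all numbers hypothetical): one hundred cities that all share exactly the same true 6 percent growth, observed through the noise of their customer counts:

    import numpy as np

    rng = np.random.default_rng(4)
    true_growth = 0.06                             # identical in every city
    customers = rng.integers(500, 5000, size=100)  # 100 cities of varying size
    noise_sd = 0.5                                 # hypothetical customer-level volatility

    # Observed city growth = true growth + sampling noise that shrinks with sqrt(N)
    observed = true_growth + rng.normal(0.0, noise_sd / np.sqrt(customers))

    national = np.average(observed, weights=customers)
    print(f"national growth: {national * 100:.1f} percent")
    print(f"city growth ranges from {observed.min() * 100:.1f} "
          f"to {observed.max() * 100:.1f} percent")

    # Typical output: the national figure sits near 6 percent, while individual
    # cities scatter several points on either side of it. Every apparent city
    # "difference" here is pure noise, since true growth is identical everywhere.

Disaggregating another level down (to stores or to individual salespeople) widens the spurious spread again, because each subgroup’s N shrinks further.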

The problem for the management is to distinguish true variation from all the noise in the disaggregated data. Again, managers and administrators are subject to an expectation, popularly held and sanctified in certain prescriptive theories, that they should manage not by the numbers at their desks atop the organization, but by “managing by walking around” (MBWA), by “getting out and into the field,” “down at the coal face,” and immersing themselves at firsthand in actual products, locales, salespeople, and customers. There is some value in information so gained, but there is still the thorny problem of how to make valid inferences about operating variables such as sales from these myriads of locales and employees and from the profusion of explanations readily offered, some of which are self-serving. For this reason, managers are likely to look at the figures and rely on them to some degree when making inferences and attributions about the causes of organizational performance. The fundamental propositions are: 4.1 Data disaggregation reduces the number of observations and therefore produces error. 4.2 The more that data are disaggregated, the greater is the error. The implications for managerial inference are: 4.3 The probability that managers will make an erroneous inference from data is higher where the data are based on disaggregation. 4.4 The more that organizational data are disaggregated, along any organizational dimension (e.g., products, markets, locales, organizational subunits, or people), the more that differences will be revealed to managers, and the more false some of these differences will be, leading to more erroneous managerial inferences. Of course some variations within total, aggregate data are real. A region may sell more winter coats than another. Televisions may sell more than washing machines. But all such variations are best identified by aggregating the data, to give the largest possible N, and then looking for variations within it. This may reveal, for instance, that an organization operates in several different ecological niches, so that there are real differences across its product markets and divisions. Disaggregation is more of a problem in small than in large organizations. As we have said, the validity of inference does not increase linearly with the number of observations, but curvilinearly, increasing at a decreasing rate. Larger gains in validity are made with each additional observation when the total number of observations is small than when it is large. The gain in increasing the sample from 1 to 50 is greater than in going from 50 to 100, which, in turn, is greater than going from 100 to 150. This means that the losses in validity of inference through data disaggregation are smaller for large than for small organizations.

Disaggregating a large organization of 100,000 employees tenfold still leaves groups of 10,000 employees. Each group has substantial validity for an analysis in which employees are the unit of study; and the validity for 10,000 is only somewhat less than for 100,000 employees. In contrast, disaggregating a small organization of 100 employees tenfold creates groups of only 10 employees, so each group will contain considerable random error. A group of 10 will have a lot more random error than the group of 100 employees. Thus, the problem of disaggregation is not a linear function of original size, and it is greater in small rather than in large organizations. 4.5 The smaller the organization, the more disaggregation of data leads to inference errors. Some organizations will be large enough that their major subunits, such as divisions, are themselves large organizations, which have large Ns allowing valid inference-making. But a smaller organization will have correspondingly small subunits, such as the functional departments of a small firm, so that inference from its small N data readily leads to substantial errors.

Cycles of Dysfunctional Control

There is an implication of managerial inference errors for organizational control. Control is monitoring performance relative to some standard and taking action to rectify a disparity of actual performance below the targeted or standard performance level. The performance figure is subject to the law of large numbers. The more observations the performance figure is based on (e.g., people, products, or sales), the more stable it will tend to be. Conversely, the fewer the observations on which a performance figure is based, the less stable it will be, because random shocks will introduce greater variation around the true underlying figure. As one goes down the organization from apex to grassroots, any performance figure, such as annual sales, becomes disaggregated into divisions, regions, areas, cities, and so on. As has been said, these disaggregated sales figures, say, by city, necessarily are small samples relative to the total company sales. Thus, an overall company sales target of 10 percent growth in the year and an actual performance of 11 percent will, down at the city level, show variation in degree of attainment of the target. Moreover, the figures will show more variations the more they are disaggregated, so that there is more chance-based variation at lower compared to higher levels in the hierarchy. A senior manager dissatisfied with total sales growth might initiate an internal inquiry down the sales hierarchy. Examining the sales growth of the major organizational subunits, say regions, will reveal variation, but some will occur purely by chance, because it is based on a smaller number of observations than for the whole organization. Some part of this variation, however, will be genuine, due to

real causes, such as the regional economy or the personality of the regional sales manager. Nevertheless, there also will be a random component in the figure, so that some of the variation in the disaggregated figures is spurious, due to chance. This misleads the senior manager about true causation. This error, in turn, leads her to faulty diagnosis and so to faulty blame-placing about the performance of her subordinates. For instance, a regional sales manager may be blamed for the poor performance of the region and dismissed, whereas the poor performance is actually due to random chance. Because the regional sales manager is not really less effective as a regional manager, firing him will not lead to higher performance of the region. Thus, attempts to rectify such poor performance would be ineffectual because they are based on a misdiagnosis that wrongly attributes blame. Plus the firing may demoralize other regional sales managers or aspirants to that role, consequently damaging the organization in the long term. Given that the analysis has revealed variations by region—some of them worrisome—the senior manager may initiate a further analysis down to the level below the region, say the state level. Alternatively, the regional managers, concerned by the unease of the senior manager, may themselves make analyses of the states in their regions. Because states have smaller numbers of observations than regions, more of their performances will be due to chance than regional performances. Therefore, chance alone will lead to the discovery of greater performance variations at state level than at regional level. State sales managers will be subject to even more blame-placing and draconian actions than were the regional managers. The results from the states may cause the senior manager, or the regional or state managers, to analyze the performances of the subunits of the states, the territories. Again, because the territories are smaller than the states, more of their performance will be due to chance, being based on an even smaller number of observations. Thus, from chance alone, there will be even greater performance variation between the territories, leading to even more energetic blaming and punishing. The variability will be greater at the lower level than at the level above because the lower-level figure is more disaggregated. Hence, the further the analysis is pushed down the hierarchy, the smaller the number of observations on which the performance figure is based, the more performance variation that appears, even if wholly illusory, and the more blaming and punishment meted out, even if undeserved. Thus, once the process is initiated at the top of the hierarchy, there will be some tendency for it to cascade downward, because initial suspicions that something is wrong somewhere are seemingly confirmed, and more so at each lower level. Hence the process is self-reinforcing, in that the more the disaggregation, the more likely that incidences of poor performance will be seen, even though due purely to chance. Hence, some part of exercising of control through disaggregation of data about organizational performance is quite illusory, being based on random chance. Experienced senior managers in a large corporation may be aware that some such

variation is spurious and will disappear if they just wait for more data to accumulate. When the overall organization is going well, this is what may happen. If, however, for whatever reason, total organizational performance becomes substandard, then senior management is likely to initiate the sequence of cascading disaggregation, discovering variation, blaming, intervening, and making ineffective rectification attempts. Thus, digging down into aggregate performance data and identifying some disaggregated data in which poor performance spuriously appears can produce misleading results that lead to ineffectual actions by upper management. Moreover, the process of questioning subordinates’ capabilities is experienced as coercive by the subordinates and upsets their morale. It also reduces the trust relationship that allows superordinates to delegate down the hierarchy to subordinates. Furthermore, as superiors come to distrust the performance capability of their subordinates, so less delegation of decision authority may be made by the superior or accepted by the subordinate, undermining the decentralization that needs to occur as organizations grow, diversify, or seek to innovate. Thus, identifying poor performance through progressively finer disaggregation sets off a dysfunctional cycle that can lead to spurious identification of poor performance, blaming, punishing, and distrust that upsets delegation. As stressed previously, disaggregation is more of a problem in small organizations because their aggregate data are few to start with. Small organizations are, in a sense, condemned to be buffeted by wide performance variations, both positive and negative, that will periodically catapult them into the pathological cycle of investigation, disaggregation, blame, and rectification that has just been described. This cycle accounts for some of the lack of managerial stability in small firms. The owner-manager who typically runs a small business is particularly likely to become caught in the dysfunctional cycle of mistakenly seeing performance problems. He will recurrently take strong and possibly punitive action. This will tend to sow distrust between him and subordinate managers, to whom he is only just beginning to delegate authority. The owner-manager may readily become convinced that he needs to cease delegating and resume micromanaging. This decision reinforces the entrepreneurial nature of his management system, because it teaches the entrepreneur that, even years after founding the firm, he must be prepared to step in to stop things going out of control. This preserves the power of the entrepreneur but undermines attempts to create a subordinate cadre of professional managers, thus jeopardizing the adoption of sound administration and the choice of an experienced successor. This, in turn, may retard the growth of the small firm and may even increase its chance of mortality—helping to explain the well-known high mortality rates of small firms (Aldrich 1979). This example shows how the institutionalization of professional managers in a business may be upset, which is counterproductive in a growing small business. The dysfunctional cycles of control that statistico-organizational theory analyzes

The dysfunctional cycles of control that statistico-organizational theory analyzes may help explain why the transition from control by owner-managers to control by professional managers is so difficult in growing small businesses. The propositions are:

4.6 Poor organizational performance will often lead to disaggregation of performance data.

4.7 The more organizational performance data are disaggregated, the more spurious poor performances are identified.

4.8 The more organizational performance data are disaggregated, the more lower-level managers and employees are blamed erroneously.

4.9 The more organizational performance data are disaggregated, the more lower-level managers and employees are sanctioned erroneously.

Inference and the Management of Human Resources

The selection of senior managers in an organization is often done internally, by seeing how junior managers perform in their assignments. A promising junior manager may be placed in charge of a very small branch to test her fitness for command. This practice is seen as conservative and risk-avoiding for the organization, which can "see how she goes." Trying out juniors as managers of small branches protects the company from having its large branches, which are vital to the company's health, run by incompetents. If successful in the small branch, the manager may be promoted to head a larger branch. This promotion provides a reward for the manager. However, if she is unsuccessful in the small branch, she will not be promoted and may be outplaced from the organization. The success of the manager is assessed by how well the branch performed during her tenure.

While this testing process seems reasonable, the small branch will unfortunately have small-numbers properties that affect the key output variables on which branch performance is monitored, such as sales, new customers, complaints, staff turnover, and employee accidents. For instance, a bank that seeks to extend its lending from individuals to companies will emphasize making commercial loans. Yet, inherently, a small branch may make very few commercial loans in a year. Hence, chance-based variations in the number of such loans could easily turn branch performance from good to poor. Therefore, the performance of the small branch during the tenure of its new manager will be due in part to her competency and in part to chance. The new manager is, to some degree, being handed a lottery ticket. The smaller the branch, and thus the fewer the observations on which the performance variable is measured, the more her career fate will hang on chance. Therefore, the dismissal of the manager of a small branch, or her promotion, could be unjust because it is based on chance rather than her work.
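How strongly branch size drives this lottery can be quantified with exact binomial arithmetic. In the hedged sketch below (Python; the true success rate, the loan counts, and the "failure" cutoff of 20 percent below the true rate are illustrative assumptions of mine, not figures from the text), a manager whose true ability exactly matches the norm still has a substantial chance of looking like a failure when the branch generates few observations.

    from math import comb

    def prob_at_most(k, n, p):
        """Exact binomial probability of k or fewer successes in n trials."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

    P_SUCCESS = 0.25  # the true underlying rate, identical for every manager

    # A "failure" verdict: the observed rate falls below 80% of the true rate.
    for n_loans in (12, 48, 480):  # loans written during the manager's tenure
        cutoff = int(0.8 * P_SUCCESS * n_loans)
        p_fail = prob_at_most(cutoff, n_loans, P_SUCCESS)
        print(f"{n_loans:4d} loans: P(looks like a failure) = {p_fail:.2f}")

With only a dozen loans, the chance of an undeserved "failure" verdict is on the order of 40 percent; with several hundred observations it falls to well under 1 percent.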

If the manager survives the hurdle of successfully running the small branch, she may be promoted to run a larger branch, whose performance will therefore be less subject to random chance variations. If promoted higher, to head a division or the whole company, chance will be a still smaller factor, because her performance will be measured on a larger number of observations. Hence, chance plays a larger role in determining the fate of the manager earlier, rather than later, in her career.

Because heading the small branch is just a trial to allow the organization to see whether the neophyte really has the abilities required to be a full-fledged part of the management team, the organization may have the manager run the small branch for just a year or two. Yet this will mean that the numbers involved are small and therefore less valid reflections of the true ability of that junior manager. Hence, the shorter the period of time that the manager is in the position, the more that chance will determine the performance of her unit and the less valid the inference about her promotability will be. The propositions are:

4.10 The smaller the organizational subunit of which a manager is in charge, the more that chance affects the subunit's performance.

4.11 The smaller the organizational subunit of which a manager is in charge, the more that chance affects the manager's career advancement.

4.12 For a manager who is promoted up the hierarchy, chance affects success early in the career more than late in the career.

4.13 The shorter the tenure of a manager in charge of an organizational subunit, the more that chance affects the subunit's performance.

4.14 The shorter the tenure of a manager in charge of an organizational subunit, the more that chance affects the manager's career advancement.

This problem will be worse in organizations whose subunits have their performance particularly affected by small-numbers problems. This includes small organizations. It also includes organizations with many subunits, whose size is thereby particularly small. Again, it includes organizations that produce small numbers of products or services. In contrast, internal promotion decisions will tend to be based on more valid inferences in larger organizations, because the performances of their organizational subunits are less affected by small-numbers problems. The proposition is:

4.15 Chance affects career success more in small than in large organizations.

Hence, the more chance affects the performance and career of a manager, the worse are the promotion decisions being made by the organization about that individual. A manager who, by chance, performs well early in his career and who, in consequence, is rapidly promoted up the organizational hierarchy may be inept and so eventually damage the organization when he gains more authority, because his inherent decision-making and managerial abilities are poor. Other managers who are unlucky early on, and so are denied promotion but remain in the organization, will observe such lucky former peers with dismay.

Such a situation leads to perceptions of inequity, to loss of motivation, and to loss of confidence in the organization. These problems may be avoided if the organization has a policy of letting go those managers who were "unsuccessful" at running the small branch, in order to stem the loss of morale. Yet some of the managers who are let go are in reality competent, and so are a loss to the organization, which has wasted the resources put into their recruitment and development. The propositions are:

4.16 Smaller organizations will have less effective career systems than larger organizations.

4.17 Smaller organizations will have more problems of rightly perceived lack of equity in their career systems than larger organizations.

It should be noted that there is no claim here that the presence of chance in promotion decisions implies that such decisions are governed wholly by chance. Our argument is only that chance plays a role in the circumstances specified. Again, recognizing the element of chance is in no way to deny that managers in charge of small units can be effective by skillfully attending to the local peculiarities of their small unit, so that attributing to them responsibility for some part of the performance of their unit is valid.

The Problem of Aggregating Previous Inferences

Inference problems can occur when a final decision results from a series of smaller, prior decisions. These prior decisions are each taken on a fraction of the total data and so are prone to the small-numbers problem, leading to erroneous inferences on which the decisions are based. Thus, again, the problem introduced by data disaggregation occurs. The final decision combines these disaggregated decisions and so is prone to error. In contrast, if the data had not been disaggregated and the final decision had been made using all the data, it would have had less error. The final aggregation of the inferences does not avoid the errors that infect those inferences, because they come from disaggregated data.

The greater the number of prior decisions that determine the outcome of the final, overall decision, the smaller is the number of observations in each of the prior decisions, and therefore the greater is the probability that the final decision will be erroneous. Ceteris paribus, for an organization of a given size, the greater the number of segments into which the organization and its data are decomposed, the smaller is the number of observations in each segment. (Also, the weaker will be the power of any statistical test that is used to control for the random chance that arises from the small numbers.) Therefore, there is a high probability that the wrong inference will be made within any one of the prior decision segments.

The final decision may be taken by reviewing all of these prior decisions and noting the pattern of their dispersion.

There will inevitably be some variation in the outcomes of the prior decisions (e.g., some segments have positive outcomes and some negative) that is due wholly to chance. This variation could sow doubt in the minds of decision-makers about what the true picture is and whether any valid generalization can be made. Any generalization could seem too "broad brush" an approach—that is, insufficiently sensitive to the range of different situations. This may lead to the belief that there is some underlying moderator that requires more work to understand and identify empirically. Although these variations could be purely artifactual, they could nevertheless lead to hesitancy, or to outright rejection of a proposal as too uncertain and difficult. Thus, disaggregated decision-making is more likely to produce an erroneous decision than is the same issue decided by simply considering the whole data set. When the whole data set is considered, there will be less chance-based error to mislead the decision-makers.

Any kind of parallel decision-making in which each subunit of a larger organization makes decisions independently, based on its own data, disaggregates the data of the organization, and so the decisions made by each subunit will be more erroneous than the decisions that would have been made if the data of the whole organization had been used. Even if the independent decisions of the subunits are subsequently pooled and used to make an overall decision, this will still be more erroneous than if all the data from the subunits had been pooled for the organization and a decision then made. One way that such parallel decision-making and data disaggregation can occur is through franchising, in which a large organizational system is composed of subunits, each owned separately and each making its own decision based on its own data.

To exemplify this process, consider the following hypothetical scenario. Company A is a large automobile manufacturer presently considering bringing out a new car. The retail outlets of Company A are independently owned franchises and so cannot just be told to do something by the company. There is, instead, a strong cultural pressure to work "with" the dealers, and to be seen to work with the dealers, by being sensitive to their needs and by involving them in new product decisions. Company A is interested in a car with a new ceramic orbital engine developed in Australia. This engine has many advantages over conventional engines, not least that its only exhaust output is air and water vapor. The company is excited about this new product and is thinking of launching it globally as the "green" car: "The automobile for an environmentally sensitive age." First it wants to test customer and dealer reaction at the point of sale: the dealership.

Company A selects 100 dealers at random nationally and elicits their cooperation. The company gives each of them ten of the new, green cars. After a one-month trial, the franchise owners are flown to Detroit for a dealer conference on the new car, in order to ascertain whether it was a success. Dealer reactions vary widely. Some dealers have sold all ten cars and acclaim it a howling success. They want more cars and regular production. Other dealers have sold only a few, and still others have sold only one or none.

The latter dealers see the experimental car as a complete failure. Nobody they spoke to was really interested in buying the thing; it was just a gimmick. It did not "fly" in "their" market. During the discussion, these zero-selling dealers, who came from Oklahoma and Nebraska, are the butts of derisory jokes from their peers. There follows a long discussion about differences in the market and economy of Oklahoma versus the top-selling locale, Boston. That evening, at a lavish dinner in a five-star hotel, a story circulates to the effect that the Boston dealer who sold all ten of the cars did so by unfair means, such as putting a $3,000 rebate in cash in the trunk for the customer to find. The Boston dealer vigorously denies this rumor, but the other dealers who sold only a few cars are disinclined to believe these protestations. They ask, "So how come these other guys can sell three times as many cars?"

At the resumption of the conference the next day, company management points out that the average sales of cars by a dealership per month is 25 percent of stock, and states that this is a realistic criterion for the dealers to apply in order to determine whether their experience was a success or a failure. The dealers are asked for a show of hands on this basis. The voting shows that there are about as many failures as successes. The national marketing manager groans inwardly. She reports to the next executive committee meeting that the results of the dealer experiment on the green car are "mixed": some dealers tag it a success whereas others tag it a failure. Marketing has inaugurated more research and in-depth analysis to identify the moderating factors that make this difference, but until the results are known she recommends holding off from a final decision. She pledges to personally visit every dealer to "see for herself." The head of manufacturing greets this report with a snort of derision; if marketing cannot decide after an "experiment" for which his production people made a "preproduction" run of 1,000 real cars, how much more study does it need?

The deputy national marketing manager, who is in charge of advertising, comes to the aid of his boss at the executive committee meeting. He is sure that the marketing people can soon nail down the uncertainties—and when they do and decide to go with the green car, as he is sure they will, his people have a simply great advertising campaign ready to roll: "America Loves the Green Car." At this point, the vice president for corporate affairs and external relations intercedes to point out that the company could not possibly use that slogan and must now be exceedingly cautious in its posture, for she has been privately warned by several shareholders that if the new car flops, becoming "another Edsel," they will sue the top manager and the board of directors. The court could subpoena material from the Detroit dealer conference and have the Oklahoma and Nebraska dealers testify under oath that they knew the new car did not sell in their markets.

This wrangle continues within the company for several months, politicizing the decision, splitting the company, and making top management sick of the whole idea of the green car.

The CEO commences the next executive committee meeting by announcing that, this morning in the car while driving to work, the tape he was playing by a leading management guru said that companies should "stick to their knitting" and "get back to basics," and that this has set the seal on his thoughts. He is therefore canceling the green car project.

Of course, the wrangling and politicking are only a symptom. The cause is that, instead of aggregating the data from the dealerships and then making a decision, the company relied on each dealership deciding, from its chance-infected small number of observations, whether the green car had been a success or not, and the company then counted the positive and negative dealer outcomes.

Company B is another large automobile manufacturer and a major international rival of Company A. Company B also has a new green car using a similar, nonpolluting, ceramic orbital engine. However, its dealers are wholly owned, and they usually do what they are told by the company. Company B hears of Company A's experiment and runs the same experiment, with 100 dealers receiving 10 cars each. However, Company B just collects the sales results from each dealership and aggregates them at the head office. There is no consultation conference of dealers; nor are the dealers asked to vote, nor even asked for their opinion on whether they see the green car as a success. The marketing department at Company B records that 25 percent of the green cars sold within a month. Since the average monthly sales by dealerships is 25 percent of stock, Company B judges the project a success. The executive committee of Company B unanimously accepts the report, authorizes full production, and launches a massive advertising campaign proclaiming, "The World Loves the Green Car." This is an enormous success, and the green car becomes an automobile bonanza. Company A now reverses its decision and rushes its car to market, but it has lost all the first-mover advantages and never attains the market share and profit that Company B enjoys for the green car.

The irony is that the average sales rate of the green car in Company A was actually 25 percent, the same as in Company B. However, the culture and structure of Company A prevented calculation and recognition of this statistic. Instead, each dealer essentially voted for or against the green car based on whether that dealership had sold more or fewer than about 25 percent of its cars. Due to random sampling error, dealer results varied widely, so that many sold above 25 percent and about the same number sold below it. This led to an inconclusive vote in the show of hands at the conference. It also led to a search for the moderating factor that determined why some dealers sold many and some few. Yet the dealer variation in sales actually resulted from sampling error and not from a real moderator, so the search for a moderator was misplaced. Moreover, the signal of "mixed" success/failure and the uncertainty about the validity of the new product were erroneous, because in aggregate the new car was a clear success. The problem lay not in the car or customer reactions or dealer experiences or the data, but in the way the data were disaggregated through the organizational structure and inference process.

Thus, the fragmentation of data through disaggregation, which is endemic in parallel decision-making and often concretized into organization structures through franchising and autonomous local units, can seriously hamper managerial inference-making. The proposition is:

4.18 Decision-making that disaggregates data has a higher probability of leading to an erroneous managerial inference than making a single decision using the aggregate data.
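The green car scenario can be replayed numerically. In the hedged sketch below (Python; the seed and the assumption that each car independently has a 25 percent chance of selling within the month are illustrative), the dealer-by-dealer vote used by Company A splits roughly in half, while pooling the same 1,000 observations, as Company B does, recovers the true sell-through rate.

    import random

    random.seed(7)

    TRUE_SALE_PROB = 0.25          # assume each car independently sells in the month
    N_DEALERS, CARS_EACH = 100, 10

    sales = [sum(random.random() < TRUE_SALE_PROB for _ in range(CARS_EACH))
             for _ in range(N_DEALERS)]

    # Company A: each dealer votes against the benchmark of 25% of stock (2.5 cars).
    wins = sum(s > 2.5 for s in sales)
    print(f"dealer vote: {wins} successes, {N_DEALERS - wins} failures -> 'mixed' signal")

    # Company B: pool all 1,000 observations first, then apply the same benchmark.
    rate = sum(sales) / (N_DEALERS * CARS_EACH)
    print(f"aggregate sell-through: {rate:.1%} against the 25% benchmark -> clear verdict")

The vote is noisy because each dealer judges from only ten observations; the pooled figure is stable because it rests on a hundred times as many.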

The Fallacy of Immediacy

A related problem, which again leads to pathological disaggregation in the name of analysis, is breaking data down into short time periods, such as months or weeks, and then taking only the most recent period, this month or this week, in effect throwing away most of the data. The emphasis is upon the immediate, most recent data, but this can be fallacious. The best estimate of the future, such as how long it will take to drive from Boston to Chicago, is derived from making the drive on many occasions, rather than from relying on just the last occasion, because the last occasion is affected by an idiosyncratic set of values of the underlying causes, whereas the average over many occasions reflects the underlying factors more generally and truly (Ehrenberg 1975).
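The drive-time example can be simulated directly. In the sketch below (Python; the 900-minute true mean, the noise level, and the 24 past trips are all illustrative assumptions), the single most recent observation is typically several times further from the truth than the average over many occasions.

    import random

    random.seed(3)

    # The true average drive time is 900 minutes; each trip varies by chance.
    TRUE_MEAN, NOISE_SD, N_TRIPS = 900, 60, 24
    trips = [random.gauss(TRUE_MEAN, NOISE_SD) for _ in range(N_TRIPS)]

    last_trip = trips[-1]               # the "most recent" figure
    long_run = sum(trips) / len(trips)  # the aggregate over many occasions

    print(f"last trip:       {last_trip:6.0f} min (error {abs(last_trip - TRUE_MEAN):3.0f})")
    print(f"24-trip average: {long_run:6.0f} min (error {abs(long_run - TRUE_MEAN):3.0f})")

On almost any seed, the many-trip average lands much nearer the true value, since the standard error of a mean shrinks with the square root of the number of observations.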

The idea that the disaggregation of data into short periods is problematic is the reason that social scientists often use rolling averages of five years or so to analyze organizational performance data (e.g., Child 1974): they know that annual data can be misleading. Yet many large corporations measure performance more frequently than annually. For example, Cooper Industries, a U.S. diversified industrial products corporation, uses monthly reports of performance compared with budget to assess its divisions (Collis and Stuart 1991, 14). The large corporation itself is evaluated by financial markets on its quarterly performance, which provides an institutional pressure and a legitimated template for evaluating performance within the corporation on the basis of months.

Some managers and other analysts of an organization may reject data gathered over longer periods in favor of the fewer observations of the most recent time period. This choice is often driven by a psychological fixation with the present. People often express impatience and demand to know "what's happening now" because "the rest is just history." Therefore, for instance, the year's total sales of a company are disregarded in favor of last month's sales, despite that figure being based on about one-twelfth the number of observations of the sales for the year. With small numbers of observations, the monthly sales figure is more infected with error than is the aggregate for the year. Thus, by looking at this month's figure, the manager may erroneously perceive what is causing performance and so take action based on this erroneous inference. Hence, too finely grained analyses of the time slices of data can lead to errors of inference.

This practice is encouraged by the mass media, which emphasize the idea that news is important. It is also facilitated by the ready availability of very recent figures through electronic media. Yet such fixation on the most recent figures can be counterproductive, because often the manager wants not just to know last month's sales, but rather to diagnose what is causing sales or to predict what future sales will be. The best estimate of future sales may be last year's sales, not last month's sales, because last month's sales are affected by an idiosyncratic set of values of the underlying causes, whereas the year's sales better reflect the underlying factors. The proposition is:

4.19 If managers use only the most recent data, the resulting inferences will be more erroneous than if longer periods are used. For example, the most recent period is last month and the longer period is the last twelve months.

The proposition that longer-period data are more valid than just the most recent is to be understood as holding ceteris paribus, in that other factors (e.g., a large N based on numerous observations of piece-parts) could offset the recency problem.

The idea that aggregate data yield surer inferences than disaggregate data leads to several theoretical propositions about the organizational structure required for making sound inferences from data, as will be seen in the next section.

Organizational Structure for Inference

As has been stated, managerial inferences will be most effective if managers use aggregate rather than disaggregate data. Thus, for instance, regional data will be preferred to city data, and national data to regional data. This means that all the data must be standardized—collected and processed using standard procedures and standard definitions. Data must be written down or recorded in computers, which means that data collection needs to be formalized (Pugh et al. 1968). The data from around the organization need to be collated centrally and then analyzed and presented—all in a timely fashion. The proposition is:

4.20 Valid managerial inference requires formalized and standardized information collection systems.

Moreover, interpretation of the data needs to be made by managers and staff analysts at high levels in the hierarchy, for they work with overall organizational data—that is, aggregate data based on large numbers of observations. Managers at lower levels will tend to have only the data for their local subunit, based on smaller numbers of observations.

Nevertheless, data from across the whole organization are sometimes collated into a central database that is then made available to managers down the levels of the organization. This means that even lower-level managers can access and analyze aggregate data that have the benefits of large numbers of observations. Such decentralized access to centralized data is clearly desirable from an inference perspective. However, managers with more localized responsibilities, even if given access to overall organizational data (i.e., aggregate data), may attend only to the results that bear on the standing of the unit for which they are responsible—that is, "the sales of my region." Therefore, lower-level managers will tend to disaggregate the data and to place more emphasis on disaggregate data, thus making less sound inferences because their disaggregate data are less valid. Thus, ceteris paribus, centralized organizational structures will make superior data-based inferences compared to decentralized structures. The proposition is:

4.21 The probability is higher that a manager will make a superior data-based inference in a centralized organizational structure than in a decentralized structure.

As an example of the positive application of statistico-organizational theory, consider the classic structural problem of centralization versus decentralization of decision-making authority. Received theory stresses the necessity to decentralize in large organizations, especially to delegate authority down to local units where the environment differs geographically (Galbraith 1973). The organization chart of a large retail corporation might show a nationwide system of retail stores organized into district groups, or zones, and regional territories (Chandler 1962). Thus, the organization chart is a pyramid of the geographical tiers of head office, territories, and stores. This constitutes a multilevel hierarchical structure, differentiated by area and managed under a philosophy of decentralization (Corey and Star 1971). The implication might be that managers in each store would buy goods suitable for their locale, based on their knowledge of their specific customers. In practice, however, there is usually a large and elaborate head office buying department that selects goods, develops advertising material, and even specifies the physical layout of the goods in all the stores. Much of this may be formally "advice only," but it is almost always followed by the stores in practice (Corey and Star 1971).

Why would having a buyer in the head office be better than having the manager of each store buy for the store? There are several factors, including economies of scale in purchasing, that bear on the appropriateness of centralization, but these are already understood by organization theorists. However, there is also the issue of effective inference. Central managers will be better placed to collate and interpret data about sales, prices, and the like.

Indeed, identifying real variations in customer tastes across the regions requires a statistical comparison across the regions, and this is best performed centrally. Differences in tastes across regions are not self-evident and cannot be presumed to be known a priori; they need to be identified through analysis. Managers have to identify, out of the admixture of facts and myths, which tastes really are regional and by how much taste differs by region. This may be accomplished in large part by examining statistical data. Examination of localized data would run into the problem of small numbers and therefore random chance variation. Thus, a better approach involves collating data nationally and comparing across regions using as large a number of observations as possible.
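The point can be illustrated with a hedged simulation (Python; the two regions, their true preference rates of 55 and 50 percent, and the store and customer counts are all illustrative assumptions). Individual stores, each with only forty customers, produce estimates far too noisy to reveal the real regional difference, whereas pooling each region's two thousand observations pins the difference down to several times its standard error.

    import random
    from math import sqrt

    random.seed(5)

    # Two regions differ slightly in true preference for a product line.
    TRUE_P = {"North": 0.55, "South": 0.50}
    STORES, CUSTOMERS_PER_STORE = 50, 40

    def buyers(p, n):
        """Number of n customers who buy, each independently with probability p."""
        return sum(random.random() < p for _ in range(n))

    region_rate = {}
    for region, p in TRUE_P.items():
        per_store = [buyers(p, CUSTOMERS_PER_STORE) for _ in range(STORES)]
        store_rates = [s / CUSTOMERS_PER_STORE for s in per_store]
        # Any single store's estimate swings widely on 40 customers.
        print(f"{region}: single-store estimates range "
              f"{min(store_rates):.2f} to {max(store_rates):.2f}")
        region_rate[region] = sum(per_store) / (STORES * CUSTOMERS_PER_STORE)

    # Pooling 2,000 customers per region pins each rate down tightly.
    diff = region_rate["North"] - region_rate["South"]
    se = sqrt(sum(p * (1 - p) / (STORES * CUSTOMERS_PER_STORE)
                  for p in region_rate.values()))
    print(f"pooled regional difference: {diff:.3f} (standard error {se:.3f})")

A store manager looking only at local data could not tell the regions apart; the centrally pooled comparison can.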

Once regional or district variations are correctly identified, they form a basis for selective decentralization of buying decisions about certain goods, as occurs in some national retail chains (Corey and Star 1971). The identification of such parameters (e.g., regional differences) is most appropriately made centrally, because of the problems that would come from making inferences from disaggregate data about preferences, profit margins, trends, and geographic variations. Thus, "managing by the numbers" will be done better by centrally placed managers. The propositions are:

4.22 Only centralized analysis of data will reveal which decisions are more effectively decentralized.

4.23 Effective decentralization of decision-making authority can be achieved only after centralization of the structure.

This is not to assert that centralization is always the most effective organizational structure. There are benefits to decentralization, such as speed of decision-making, flexibility, and motivation of lower-level personnel, that are well understood in the organizational theory literature (Hage 1965; Mintzberg 1979). The point is that there are also advantages to centralization when managers make inferences from data, and these have largely been overlooked in organizational theory to date.

There are, of course, several contingency factors that lead an organization to decentralize, including size and diversification. Decentralization of decision-making is appropriate for a large (Pugh et al. 1969) and diversified (Chandler 1962) organization. Nevertheless, certain analyses of data are better made centrally, particularly on variables for which the number of observations per region or division is small, making analyses by regions or divisions prone to error. Other analyses are appropriately made by each region or division and thus decentralized. This holds for variables for which each region or division has enough observations to yield accurate estimates. Yet even then, some analyses would best be made at the center of the division rather than decentralized to branches. Again, the reason for concentrating these analyses at the division level is to avoid analyses at branches that would be made on too few observations. Thus, even within a decentralized, divisional structure, data requirements would lead to analyses that are to some degree centralized—for example, at the divisional level.

Conclusions

Disaggregating organizational data will increase error and so render flawed more of the judgments made from those data. This disadvantage applies to disaggregation on any dimension, such as into organizational subunits or customer segments. Thus, large organizations unwittingly dissipate their inference advantage whenever they disaggregate data about the organization or its environment, thereby producing smaller numbers of observations. This occurs, for example, when the performance of subunits rather than of the whole organization is examined. Similarly, throwing away data by taking only the data for the immediate period can lead to less sound inferences.

Detailed examination of the performances of subunits may be triggered by low overall organizational performance and may soon reveal further problems in the performance of subunits, some of which are artifactual rather than real, but which reinforce initial suspicions and create an artificial crisis and erroneous blame-placing. Further analysis of sub-subunits is liable to reveal even worse performances, in a vicious circle. Such dysfunctional cycles of control are liable to be particularly prevalent in small organizations and to lead entrepreneurs to resist delegating power to professional managers. The performance of small subunits is particularly infected with error, so judging the managers in charge of them contains a random and arbitrary element that, nevertheless, may determine their future careers.

Decisional disaggregation also leads to inference problems. When a decision is broken into a host of decisions taken in parallel, each based on data from only one part of the organization, their subsequent combination can yield a mistaken managerial judgment. Organizational decisions are most truly taken at the organizational level, based on organizational data. Disaggregating the decision down to subunits, such as franchises, disaggregates the data, and so errors enter in. Data can also be disaggregated by taking only the most recent data. Yet this immediacy, while appealing, can be fallacious, leading to the small-numbers problem again.

A sounder approach for large organizations is to base their analyses on the largest aggregates possible. This involves standardized data being analyzed centrally. This process can identify true differences, such as by region, thereby showing where some decentralization may be required. Only centralized data analysis will reveal which factors truly vary by organizational subunit (e.g., region), so sound decentralization can occur only after centralization.

This chapter completes our discussion of sample size as a factor affecting organizational management. In the next two chapters, we turn to another methodological principle that produces factors affecting organizational management: measurement error. In Chapter 7, we combine two kinds of error: sampling error and measurement error. In Chapter 8, we turn to a further methodological principle—range (restriction and extension)—that also produces factors affecting organizational management.

5 Measurement Error of Profit

In Chapters 3 and 4 we considered the errors produced by small numbers of observations. Now we consider the problems produced by measurement error and how they create errors for organizational managers. In keeping with much of the interest among managers, we shall be particularly concerned with errors in the measurement of performance. Since measures of performance are used for many purposes, including appraising and rewarding managers, it is important to ascertain whether management perceives performance correctly. There are reasons for holding that managers' perceptions of performance are not wholly accurate, thanks in part to errors in measuring performance. A pervasive source of problems is the potential for high measurement error in profit, stemming from the fact that profit is a difference score (i.e., the difference between sales and costs). This measurement error of profit can introduce errors into business decisions, which often rely upon profit data. Other performance measures, such as sales growth, can also suffer from high measurement error because they, too, are difference scores.

While it is conventional, both in society and in academic research, to emphasize profitability when assessing the performance of business organizations, there is the objection that profitability measures only a narrow aspect of the performance of a business. Whereas profitability is oriented toward meeting the interests of shareholders, the interests of other stakeholders are met by business performances that diverge from those of the shareholders (Child 1975; L. Donaldson 1995; Pickle and Friedlander 1967; Pfeffer and Salancik 1978). However, the weight given to the different stakeholder interests is not necessarily uniform in setting organizational goals, so that, for business organizations, profitability may still predominate as an evaluative criterion because shareholders are the prime beneficiaries (L. Donaldson 1985; Etzioni 1961).

Profit measurement is often recognized as "fuzzy," through distortions introduced by imprecision, arbitrary classifications, and, at times, manipulation. However, when looked at through the lens of psychometrics, profit is revealed to have an inherent flaw in that it is prey to measurement error.

Similar remarks apply to many other performance measures widely used in organizational management, such as sales growth.

We shall begin by briefly outlining the problem of errors of measurement—that is, data unreliability—in social science. Then we shall apply this concept to measures used in organizational management. Profit will be shown to be particularly prone to measurement error. Derived measures of profitability, such as the ratio of profit to assets, will also be shown to be prone to measurement error. More broadly, measures used in an organization that deal in the rate of change of a variable across time, such as sales growth, will be shown to be inherently prone to measurement error. Organizational performance measurements that control for some factor will also be shown to be prone to measurement error. Less so, but still prone to considerable measurement error, is the comparison of the actual level attained organizationally on a variable against some goal or standard. As we shall see, many performance measures are prone to measurement error because they are difference scores. And where the effect on performance of some organizational attribute is the result of a misfit between it and some other factor, that misfit is also a difference score and so prone to high measurement error.

Whenever a measure used in an organization has substantial measurement error, the observed measure misstates the true level of the variable, which can lead managers to draw the wrong inferences from the data and so make the wrong decision. Moreover, measurement error can make the association between two variables seem weaker than it really is, so that the strength of relationships is understated, which, again, could lead a manager to make a poor decision. In this chapter, we discuss the idea of measurement error in profit and argue that it also exists in many related variables, such as the ratio of profit to sales. In the following chapter we offer a more detailed, technical analysis of the degree of measurement error, identifying the conditions under which it will be most acute.

The Problem of Measurement Error

Methodology in social science holds that a source of error in data is unreliability, by which is meant error of measurement. Unreliability will be used as synonymous with measurement error in this book (a usage that differs from others in which reliability refers to the stability of a measure over time). Cohen et al. (2003, 55) define and explain the concept of reliability in the following way:

The reliability of a variable (rxx) may be defined as the correlation between the variable as measured and another equivalent measure of the same variable. In standard psychometric theory, the square root of the reliability coefficient rxx may be interpreted as the correlation between the variable as measured by the instrument or test at hand and the "true" (error-free) score. Because true scores are not themselves observable, a series of techniques has been developed to estimate the correlation between the obtained scores and these (hypothetical) true scores.

These techniques may be based on correlations among items, between items and the total score, between other subdivisions of the measuring instrument, or between alternative forms. They yield a reliability coefficient that is an estimate (based on a sample) of the population reliability coefficient. This coefficient may be interpreted as an index of how well the test or measurement procedure measures whatever it is that it measures. . . . The discrepancy between an obtained reliability coefficient and a perfect reliability of 1.00 is an index of the relative amount of measurement error.

Unreliability stems from problems in the way that a variable is measured, such as reliance upon subjective assessment, perhaps involving arbitrary assignment across categories. Unreliability can also stem from using only a single item to measure a variable, when using multiple items and aggregating across them would be more reliable (Van de Ven and Ferry 1980). Moreover, where a variable is itself constructed by subtracting the score on one variable from the score on another variable—that is, where the variable is a difference score—the unreliability of the two constituent variables is compounded. Indeed, the unreliability of a difference score can be considerably greater than the unreliabilities of its constituent variables. Thus, variables based on a difference score are particularly unreliable. In whatever way unreliability occurs, it leads to hazardous inferences about the true situation, based upon data that are infected with measurement error.

In the world of business and management these same artifacts also cause errors. However, because they operate in an underlying way, they are not always recognized as sources of error, and so appropriate remedies are not always taken. Because managers examine data, just as social scientists examine data, any unreliability in the data facing managers will generate problems. However, because managerial data are rarely discussed in such terms, beyond informal comments (such as "these numbers are a bit rubbery"), the organizational consequences of such unreliability have tended to be neglected. More specifically, organizational theory has neglected to theorize the significance of unreliability in managerial data. We will try to begin to fill this lacuna by presenting an analysis of the effects of data unreliability upon managerial inference.

In this chapter, we emphasize the way low reliability can come from difference scores. A reason for this emphasis is that unreliability is inherent in difference scores. Therefore, any performance measures, such as profit, that are difference scores will be prone to low reliability. Moreover, even if each constituent variable has been measured highly reliably, taking their difference can still produce a variable with low reliability. Furthermore, as we shall see, many measures of organizational performance are difference scores, so the resulting problem of potentially low-reliability measures is pervasive. Although the more prosaic sources of low reliability, such as subjective measures, are well understood and there have been efforts to reduce them in practice, unreliability from difference scores is less widely appreciated and so may be more prevalent.

Hence, by focusing on unreliability based on difference scores, we hope in the present discussion to make a contribution to organizational management theory.

In social science, a difference score such as X−Y, a variable made up by subtracting one variable, Y, from another, X, is less reliable than those variables by themselves (Johns 1981). This is so even though the two variables completely define the difference score. It holds where X and Y are positively correlated: the higher the positive correlation between X and Y, the lower is the reliability of the difference score X−Y. The fundamental methodological proposition is:

5.1 If a variable is the difference between two variables, e.g., X−Y, its reliability will tend to be lower than the reliability of those variables, X and Y.

This tendency can be greater or smaller depending upon certain conditions, as will be explained in the next chapter.
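The classical psychometric formula for the reliability of a difference score makes proposition 5.1 computable. The sketch below (Python) implements the standard formula discussed in the difference-score literature (e.g., Johns 1981); the input values are illustrative assumptions of mine, chosen to resemble two highly reliable but highly correlated variables such as sales and costs.

    def difference_score_reliability(rxx, ryy, rxy, sx, sy):
        """Reliability of the difference score D = X - Y."""
        num = rxx * sx**2 + ryy * sy**2 - 2 * rxy * sx * sy
        den = sx**2 + sy**2 - 2 * rxy * sx * sy
        return num / den

    # X and Y each measured very reliably (.95) but correlated at .90,
    # with equal spreads, loosely analogous to sales and costs.
    r_d = difference_score_reliability(rxx=0.95, ryy=0.95, rxy=0.90, sx=1.0, sy=1.0)
    print(f"reliability of X - Y: {r_d:.2f}")

With these inputs the difference score's reliability collapses to .50, far below the .95 of either constituent, illustrating how subtraction compounds even small measurement errors.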

Measurement Error of Profit

In seeking to make business managers more performance-oriented, better motivated, more realistic, and more attentive, companies make frequent, and probably increasing, use of the measurement of the performance of the organization in terms of profit. For the same reasons, companies increasingly use measures of the profit of organizational subunits, such as divisions. Creating profit centers is believed to be a recipe for helping middle managers to "keep their eye on the ball" and indeed for disciplining them (Williamson 1970, 1985). Profit centers are often adopted by an organization as part of a reorganization away from functional to divisional structures (Chandler 1962; Rumelt 1974). Having multiple profit centers in an organization is felt to be an important part of the benefits of the divisional structure (Williamson 1970, 1985). The manager in charge of a division is assessed in terms of its profit. The manager in charge of a subunit of a division, such as a business unit or profit center, may also be assessed in terms of its profit. Managers, whether at corporate or lower levels, may also be assessed in terms of figures that are based on profit, such as the profit-to-sales ratio, profit-to-assets ratio, return on investment, and annual rate of growth in profit.

The difficulties of performance measures such as the profit-to-assets ratio have been widely discussed in terms of the arbitrariness and lack of comparability introduced by the way accounting measures, especially of assets, are defined and manipulated (Child 1974; Parker 1972). Arbitrariness in accounting, subjectivity in judgments, and the like can produce measurement error, reducing the reliability of the profit variable. These problems, however, can be reinforced by what, psychometrically, is an overarching reliability problem with profit measures. Profit is defined as sales minus costs. It is therefore a difference score. Unreliability in either constituent variable (i.e., sales or costs) leads to much greater unreliability in the difference score. This holds for difference scores produced by subtracting the levels of two variables that are positively correlated (Johns 1981). Sales and costs are typically positively correlated. Hence, profit will typically be less reliable than sales or costs. Thus, any error in measuring sales (e.g., by not having all of this year's sales recorded in the annual total) or costs (e.g., by failing to fully attribute all costs) will lead to disproportionately large amounts of error in profit, because profit is a difference score.

For example, suppose that a company reports profit this year of $4 million, from sales of $24 million and costs of $20 million. However, the true profit is only $2 million, so that the error of measurement of profit is $2 million. Sales have been overstated by $1 million because some sales leads that have not yet been contracted have been booked. Costs have been understated by $1 million because some work has been done this year by contractors who will not be paid until the next financial year. Therefore, both sales and costs have errors of measurement. However, the error for sales is small: $1 million compared to true sales of $23 million—that is, only 4.3 percent. Similarly, the error for costs is small: $1 million compared to true costs of $21 million—that is, only 4.8 percent. Nevertheless, the error for profit is large: an error of $2 million compared to a true profit of $2 million—that is, 100 percent. This is much larger than the error for either sales (4.3 percent) or costs (4.8 percent), or even than the sales and costs errors added together, 9.1 percent. Here, profit is much less reliable as a figure than the sales and cost figures from which it is composed.

Because profit is the difference between sales and costs, profit is a smaller value than sales or costs; therefore, the percentage of error in profit is larger even though the amount of error in profit may be similar to that in sales and costs. Error of measurement is the ratio of noise to signal: the amount of error in the measurement of a variable relative to the true amount of that variable. For sales, the error is only a small fraction of the amount of sales: $1 million relative to $23 million. Similarly, for costs the error is only a small fraction of the amount of costs: $1 million relative to $21 million. But for profit, the ratio is much greater: $2 million relative to $2 million. The main reason the ratio is greater for profit is that its true amount is much smaller than for sales and costs, only about one-tenth. The denominator is much smaller for profit than for sales or costs, while the numerators are similar. Because profit is the margin between sales and costs, the fuzzy penumbra of error around profit is much wider than that around sales or costs, even though it is made up only of those errors. If only sales had its error but costs had no error, then reported profit would have been $3 million, so the error in profit would have been 50 percent (= $1 million/$2 million); hence, an error of 4.3 percent in sales would have produced an error in profit more than ten times greater.
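The noise-to-signal arithmetic of this example is worth making explicit. The short sketch below (Python) simply reproduces the percentages just given from the worked figures in the text.

    true_sales, true_costs = 23.0, 21.0                 # $ millions
    reported_sales, reported_costs = 24.0, 20.0

    true_profit = true_sales - true_costs               # $2 million
    reported_profit = reported_sales - reported_costs   # $4 million

    for name, err, true_value in [
        ("sales",  abs(reported_sales - true_sales),    true_sales),
        ("costs",  abs(reported_costs - true_costs),    true_costs),
        ("profit", abs(reported_profit - true_profit),  true_profit),
    ]:
        print(f"{name:6s}: error {err:.1f} on a true value of {true_value:4.1f} "
              f"-> {err / true_value:.1%}")

Errors of under 5 percent in the constituents emerge as a 100 percent error in their difference, because the difference has a far smaller denominator.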

Thus, even if only one of the two constituent variables (sales and costs) that define profit has some measurement error, the resultant measurement error in profit can be much greater.

Of course, if costs had been overstated by $1 million, to give reported costs of $22 million, then the reported sales of $24 million would have meant that reported profit was $2 million, which is the true amount. Hence, errors in the same direction—in this example, both costs and sales are overstated—cancel each other if they are of the same magnitude, leaving the true profit and an error of measurement of zero. However, this would be a rare case, because the magnitudes of the errors in sales and costs would have to be equal. More usually, if the errors of measurement of sales and costs are in the same direction, they will be unequal, so that some error of measurement in profit exists, and this can be greater than the error in sales and costs. For example, if reported sales were $25 million and reported costs were $22 million (both overstatements), they would produce a reported profit of $3 million, an error of $1 million, or 50 percent of the true profit ($2 million). This 50 percent profit error would come from an error in sales of only 8.7 percent and an error in costs of only 4.8 percent, so that, even if both these errors were added together, making 13.5 percent, the profit error would still be more than three and one-half times greater. Clearly, a 50 percent profit error is more of an error than the zero error of the immediately preceding example, but a lesser error than the 100 percent of the first example. Hence, depending upon the combination of errors in sales and costs—their directions and magnitudes—the resulting profit figure can be either true or erroneous to varying degrees.

If the error in sales is independent of the error in costs, then the resulting profit will contain error that varies randomly around the true profit. Thus, sometimes small errors in sales or costs will produce large error in profit, while at other times they will produce small or no error. Because this profit error is random, it is unpredictable. A manager or commentator looking at a firm's reported profit will not know whether it is true or erroneous, because the errors in sales and costs, and their interactions, will not be visible. Similarly, the profit of an organizational subunit, such as a division, is subject to the same type of error, due to errors in subunit sales or costs and their interactions. Therefore, when using a profit figure to determine a bonus for a divisional head or a resource allocation to a division, a top manager can be unaware that the figure is highly unreliable and erroneous.

Thus, an analysis of profit is an analysis of a figure that tends to be more unreliable than sales and costs. As seen in the first example, errors of measurement in sales and costs of less than 5 percent can produce errors of measurement in profit of more than ten times as much. Profit is nothing but the sales and cost figures reexpressed (by subtracting costs from sales), yet it is a figure that tends to have the very characteristic—unreliability—that social science measurement seeks to avoid. Thus, profit is the sort of unreliable figure that will lead to problems of inference.

Nevertheless, profit is widely relied upon in business organizations in assessing the effectiveness of human resource management practices and in making a range of decisions about strategy, resource allocation, bonus payments, managerial rewards, and managerial promotions. It might be suggested that business analysts should eschew profit in favor of sales and costs. Yet this suggestion would be resisted, because sales and costs are associated with functional structures and are seen as old-fashioned and as not being the "primary objective" but rather "just a means to an end, not the end itself" (for a discussion of measuring organizational performance by means or ends, see Mintzberg 1979). The proposition is:

5.2 Profit tends to be a less reliable measure of organizational performance than sales or costs.

Amount of Error in Profit From Low Reliability

Measurement error means that the value of a case on a variable will tend not to be its true value, so that the value is inaccurate. For an organization whose performance is being measured, measurement error means that the observed performance is not its true performance. In the next chapter, we will analyze the profits of business segments in the Walt Disney Company. The financial statements of the Walt Disney Company* give the sales revenue and operating income for 2002 for each of its four business segments: Media Networks, Parks and Resorts, Studio Entertainment, and Consumer Products. These business segments are subunits of the whole Disney organization and so illustrate analyses of the profits of organizational subunits. We will see that in 2002, the reliability of the profits of these business segments could have been only .22. Following Cohen et al. (2003), as quoted above, the square root of this reliability, which is only .47, is the correlation between the observed variable and the true variable. This means that the observed (i.e., reported) profits of the business segments could be very different from their true profits. The mediocre correlation of .47 means that some or all of the reported profits contain large errors.

It is possible to depict the amount of error that would exist for the profits of the business segments. This can be represented as the true profit that would exist for each business segment, in contrast to its reported profit. The difference between the true and reported profit for a business segment is its profit measurement error. By trial and error, a set of possible true profits was found that has a correlation of .47 with the reported profits. These are given below. This set is hypothetical; other sets consistent with a correlation of .47 are possible, but their overall amount of error would be the same as in the example here. Thus, the magnitude of the overall measurement error is correct, even though the amounts and directions (i.e., whether above or below the reported profit) of the true profits are hypothetical and illustrative.

———————
*Walt Disney Co. 2005; data collected and kindly made available by Steven Charlier of the Management and Organizations Department of the University of Iowa.
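The trial-and-error search described above is easy to mechanize. The sketch below (Python 3.10 or later, for statistics.correlation; the seed, noise level, and iteration count are arbitrary choices of mine, and the search is deliberately crude) perturbs the reported segment profits at random until a candidate set of "true" profits correlates with the reported figures at close to .47.

    import random
    from statistics import correlation

    random.seed(11)

    # Reported 2002 segment profits, $ millions (Media Networks, Parks and
    # Resorts, Studio Entertainment, Consumer Products), from the text.
    reported = [986, 1169, 273, 394]
    TARGET_R = 0.47  # square root of the assumed profit reliability of .22

    best, best_gap = None, float("inf")
    for _ in range(50_000):
        candidate = [p + random.gauss(0, 700) for p in reported]
        gap = abs(correlation(candidate, reported) - TARGET_R)
        if gap < best_gap:
            best, best_gap = candidate, gap

    print([round(t) for t in best], f"r = {correlation(best, reported):.2f}")

Any such set illustrates the same point: to be consistent with a correlation of only .47, the hypothetical true profits must sit hundreds of millions of dollars away from the reported ones.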

For instance, the Parks and Resorts business segment, with a reported profit of $1.169 billion, could have had a true profit of $2 billion—that is, an understatement of $831 million, which would be the amount of measurement error for that business segment. Meanwhile, Media Networks, whose reported profit was $986 million, could have had a true profit of only $229 million—that is, an overstatement of $757 million. The more modest reported profit of the Consumer Products business segment, $394 million, could nevertheless have been an overstatement by $294 million of a true profit of only $100 million. And the small (relatively speaking) reported profit of Studio Entertainment, $273 million, could have understated by $697 million a true profit of $970 million. This set of massive errors—$831 million, $757 million, $294 million, and $697 million—is consistent with the correlation between reported and true profit being only .47. Since overstating profit is as misleading as understating it, the absolute value of the errors is of interest. It averages $645 million, which is only a little less than the average of the (absolute) profits, $706 million. Hence, these profit figures contain almost as much noise as signal. This set is only one possible set of true profits; other combinations are possible, but for a correlation between reported and true profit of only .47, the errors must be of the large degree shown. Hence, while this example is hypothetical and illustrative, it does indicate the actual magnitude of errors that come from a profit reliability of only .22.

A manager or staff analyst might display the profits of the business segments by using a bar chart at a presentation or in a report. Figure 5.1 shows the bar chart of the profits of the four business segments. For each business segment, the left-hand bar is the amount of reported profit and the right-hand bar is the amount of the true profit. For all four business segments, the height of the reported-profit bar is very different from that of the true-profit bar. Thus, the large, misleading differences between reported and true profits would still exist if managers used a simple tool such as a bar chart.

Figure 5.1 Bar Chart Presentation of Profits of Disney’s Business Segments: Reported Versus Possible True Profits

[Bar chart: profit in millions of dollars (0 to 2,500) for each of the four business segments (Media Networks, Parks and Resorts, Studio Entertainment, Consumer Products), with paired bars for reported profit and possible true profit.]

An example of one set of true sales and errors corresponding to this correlation is as follows. The Media Networks business segment, whose reported sales were $9.733 billion, could have had true sales of $10.033 billion, an understatement error of only $300 million. The reported sales of Studio Entertainment of $6.691 billion could have overstated true sales of $6.491 billion, but only by $200 million. Meanwhile, Parks and Resorts had reported sales of $6.465 billion, but could have had true sales of $6.265 billion—that is, a small overstatement error of only $200 million. And the modest reported sales of the Consumer Products business segment of $2.44 billion could have understated true sales of $2.74 billion, but only by $300 million.

This set of small errors is consistent with the correlation between reported and true sales being .995. The absolute value of these errors averages only $250 million, which is only about 4 percent of the average (absolute) sales, $6.332 billion. Hence, these reported sales figures are quite truthful and contain very little noise relative to their signal. This set is only one possible set of the true sales; other combinations are possible. But for a very high correlation between reported and true sales of .995, the errors must be only of the small degrees shown. Hence, while this example is merely illustrative, it indicates the small actual magnitude of measurement errors in sales from a high sales reliability of .99.

In summary, the low reliability of profit leads to great errors in individual cases, such as the business segments of a corporation shown in these examples. Since being a difference score depresses the reliability of profit much below that of sales, the error in measuring the profit of a business segment is much greater than in measuring its sales. The profit reliability used here for Disney, .22, is the lowest possible out of a range, but is the most likely (as will be discussed in the next


chapter). Thus, it is not definite that the profit reliability is so low. Accordingly, this example only illustrates the full extent of the error that may exist for Disney in 2002. In that sense it is extreme, but it serves to show how large measurement errors in profit can be.

Measurement Error of Profit and Attenuation of Correlation

So far, we have seen that the unreliability (i.e., the degree of measurement error) of profit can be much greater than the unreliability of sales or costs. This greater unreliability affects measures of association between profit and other variables. Unreliability of a variable attenuates (lowers) its correlation with other variables (Hunter and Schmidt 2004). Thus, when profit is correlated with some other variable across cases, some cases will have their true profit and some will have errors in their profit, producing overall a lower than true correlation. Therefore, the observed correlations involving profit tend to understate the true correlation between profit and other variables. Hence, managers looking at data containing profit measures tend to see relationships that are weaker than they truly are. This could lead managers to see a cause of profit as weaker than it is. Alternatively, it might lead managers to see profit as a weaker cause of some other variable than it really is.

For example, consider a hypothetical corporation in which some of its business segments have implemented a program to increase quality and some have not. The corporation is wondering whether it should require all its business segments to implement the quality program. It wants to know whether the quality program is beneficial, by seeing whether the business segments that use it are more profitable than those that do not. The correlation observed between having the quality program and profit is .2, which some managers in the corporation hail as showing that quality produces positive profit improvement. However, other managers say the result is trivial or “not significant” or “accounts for less than 5 percent of the variance in profit” (because the variance explained is equal to the square of the correlation). However, unknown to the managers, their correlation is an underestimate because it is plagued by attenuation. For verisimilitude in this otherwise hypothetical example, we shall use the figures from Disney 2002, so that the reliability of profit is only .22. Suppose that the reliability of measurement of the quality program is .8; then the reliabilities of profit and the quality program combine to deflate the true correlation to about .42 (= √.22 × √.8) of its value. Hence, the true correlation is about .48, more than double the observed correlation of .2. If the managers had seen that relationship, they would have concluded that quality was beneficial and they would have made the correct decision to adopt the program corporation-wide.

In the above example, the profits are those of the business segments within the corporation. But similar remarks would apply to an analysis of profits across corporations outside the focal corporation. Hence, managers may analyze profit when looking inside their organization or outward from it. Either


way, profit will tend to be unreliable, and so correlations involving profit will tend to be understatements. The methodological proposition is:

5.3 The more the measurement error in a variable, the more that any association, e.g., correlation, involving that variable is attenuated (reduced) below its true value.

Leading to the propositions:

5.4 Because profit tends to have substantial measurement error, any association, e.g., correlation, involving profit tends to be substantially attenuated (reduced) below its true value.

5.5 Because profit tends to have more measurement error than sales or costs, any association, e.g., correlation, involving profit tends to be more attenuated than are associations involving sales or costs, in the same data set.

Some managers may not use correlations but, rather, seek to establish associations more simply through the use of bar charts and other visual representations of numerical data; yet these also are prone to measurement error from unreliability. We can see that a bar chart analysis would be afflicted by measurement error. Returning to the example above, suppose that a staff analyst wishes to help company managers visualize the correlation between having the quality program and profit by using a bar chart. She contrasts no program with having a program. She defines “no program” as the degree of implementation of the quality program that is one standard deviation below the mean level of implementation of the quality program, and “program” as the degree of implementation that is one standard deviation above the mean level.

Again using figures from Disney 2002 for verisimilitude, the mean profit of the business segments is $706 million, while their standard deviation is $439 million. The true correlation of .48 equates to a standardized slope coefficient of .48, so that a one standard deviation increase in the implementation of the quality program leads to .48 of a standard deviation increase in profit, which is $210 million. An increase in the degree of implementation of the quality program from its mean level to one standard deviation above that mean increases profit to $916 million (= $706 million + $210 million). Similarly, a one standard deviation decrease in the degree of implementation of the quality program from its mean level to one standard deviation below that mean decreases profit to $496 million (= $706 million – $210 million). Thus, increasing the degree of implementation of the quality program from one standard deviation below its mean level to one standard deviation above it increases profit from $496 million to $916 million. Hence, on the chart the height of the “no program” vertical bar would be $496 million, whereas the height of the “program” bar would be $916 million.
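The bar heights just derived, and those for the observed correlation discussed next, follow from simple arithmetic that can be sketched as follows (a minimal illustration using the figures above; the text rounds the profit shift to $210 million before adding it):

    # Bar heights in the quality-program example ($ millions).
    mean_profit = 706
    sd_profit = 439

    def bar_heights(slope):
        # A standardized slope r shifts profit by r standard deviations for a
        # one standard deviation change in program implementation.
        shift = slope * sd_profit
        return (round(mean_profit - shift), round(mean_profit + shift))

    print(bar_heights(0.48))  # true effect: about (495, 917), i.e., the $496M and $916M bars
    print(bar_heights(0.2))   # observed effect: about (618, 794)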


In contrast, the observed correlation of .2 equates to a standardized slope coefficient of only .2. A standardized slope coefficient of .2 implies that an increase in the degree of implementation of the quality program from its mean level to one standard deviation above that mean increases profit by only .2 of the standard deviation of profit, which is $88 million. That is, profit increases from $706 million to $794 million. Similarly, a one standard deviation decrease in the degree of implementation of the quality program from its mean level to one standard deviation below that mean decreases profit from $706 million to $618 million. Thus, increasing the degree of implementation of the quality program from one standard deviation below to one standard deviation above its mean level increases profit only from $618 million to $794 million. Hence, on the actual chart the height of the “no program” bar is $618 million, whereas the height of the “program” bar is only $794 million.

When the staff analyst creates the bar chart for a management presentation, it shows only a small profit benefit from having a quality program (Figure 5.2, lower half). In contrast, the true profit benefit would be given by a bar chart that showed the full gap in profit (Figure 5.2, upper half). If company managers were presented with this true bar chart, they would have a correct impression of the substantial benefit of having a quality program. The true benefit is stronger than the apparent benefit. But the managers would not see the true bar chart. They would see only the bar chart that understates the true effect, because the profit data are infected with measurement error. Thus, bar chart analysis contains the problem that relationships are understated due to measurement error. The difference in the heights of the “no program” and “program” bars is smaller with measurement error than without it, so managers are liable to infer that the program is less beneficial than it really is. They may make the incorrect decision not to require adoption of the program across all the business segments of the corporation.

Note that the amount of error in the correlation flows through to create the same amount of error in the bar chart. The observed correlation, .2, is only 42 percent of the true correlation, .48. Similarly, the difference between the profits of the “no program” and “program” bars as seen on the chart, $176 million, is only 42 percent of their true difference, $420 million. Thus, the attenuation of correlation, which we discuss many times in this book, is part of the more general problem of attenuation of association that exists in other ways that managers look at figures in the information system of their organization. In particular, very simple visual depictions such as bar charts, which are widely used in organizational management, are subject to problems due to measurement error.

Figure 5.2 Error in Bar Chart Analysis of Understating True Relationship Because of Measurement Error

[Two bar charts of profit in millions of dollars (0 to 1,000), each contrasting a “No program” bar with a “Program” bar. Upper panel, “Without measurement error: True effect”: the bars differ substantially. Lower panel, “With measurement error: Untrue effect”: the bars differ only modestly.]


Measurement Error of Profitability Ratios

Profitability is often expressed not just by simple profit, but also by a ratio of profit to some other financial variable, such as assets or sales or shareholders’ equity. In this book we will use profitability to mean the ratio of profit to sales, as distinct from profit on its own. However, this ratio still leads to the same problem of low reliability (i.e., measurement error) as does use of the simple profit measure.

Unreliability of Profit-to-Sales Ratio

Analyses of company profitability often use the ratio of profit to sales. This may be done to take out scale differences, so that the profitability of a small company can be compared with that of a large company that has higher profits because of having much greater sales. While this is sensible, reexpressing profit in terms of the ratio of profit to sales does not avoid the problems discussed in this chapter. Specifically, the error introduced by profit being a difference score is retained when the ratio of profit to sales is used, so that the reduced reliability of profit flows through to a reduced reliability whenever this ratio is used. The reason is that in profitability, which is profit divided by sales, the numerator is profit. Therefore, the numerator is prone to the low reliability of profit, which then afflicts the profitability ratio. Hence, profitability will suffer from the same unreliability as discussed above for profit. Any unreliability in sales or costs can produce greater unreliability in profitability, even though it is just the ratio of profit to sales. The greater unreliability of profitability will then tend to render any correlations involving profitability underestimates. The propositions are:

5.6 Profitability, i.e., a ratio of profit to sales, tends to be a more unreliable measure of organizational performance than sales or costs.

5.7 Any association, e.g., correlation, involving profitability tends to be more attenuated than an association involving sales or costs, in the same data set.

Unreliability of Profit-to-Assets Ratio

Profitability is often expressed by the ratio of profit to assets, to express the returns being obtained from the assets. But assets are famous for being “rubbery figures,” because they rely on assessments of their worth without their having been offered on the market. Moreover, they can be overstated, or they can be undervalued due to old estimates that have not caught up with economic inflation (e.g., Child 1974). Accepting all these reasons, we are concerned, rather, to point out that the error in profit that comes from its being a difference score persists when profit is reexpressed as the ratio of profit to assets. Again, profit is the numerator of this ratio, and so the error from profit being a difference score enters also into the profit-to-assets ratio. Therefore, the ratio of profit to assets can have more error than sales and costs. And correlations involving this ratio tend to be understatements. The propositions are:

5.8 The ratio of profit to assets tends to be a more unreliable measure of organizational performance than sales or costs.


5.9 Any association, e.g., correlation, involving the ratio of profit to assets tends to be more attenuated than an association involving sales or costs, in the same data set.

Measurement Error Caused by Time Rates

Another way in which errors of measurement can be inflated is by using rates over time, which are also used quite widely in management. Again, the reason is that the annual growth rate is a difference score: this year’s figure minus last year’s figure (often divided by last year’s figure). Therefore, a time rate tends to suffer greater unreliability than its constituent variables: this year’s figure and last year’s figure.

This problem of low reliability in a time rate variable occurs for growth in sales, despite sales being a simple variable that tends to be more reliable than profit. Sales growth may be measured as the difference between sales in a period and sales in the preceding period (the numerator), divided by sales in the preceding period (the denominator). Clearly, the numerator is a difference score, and it is therefore subject to the usual reliability problem. For instance, management may calculate the rate of sales growth of a company. If sales last year were $10 million and sales this year are $11 million, then the sales growth rate is 10 percent per annum. However, if last year the true sales were $10.2 million and this year they are truly only $10.8 million, then the true sales growth is only $600,000, so the true growth rate is only 5.9 percent, and the error is 4.1 percent. The error is 69 percent (4.1/5.9) of the true growth rate. This large error in the sales growth rate was produced by much smaller errors in the reported sales: for last year, only 2 percent, and for this year, only 1.8 percent (the sketch below reproduces this arithmetic).

A manager might compare across cases (of organizations or organizational subunits) to see if the sales growth rate is correlated with some other variable. For example, divisional expenditure on advertising might be correlated with the ensuing growth rate in divisional annual sales, to test whether the advertising is really boosting sales growth. Being a difference score, the growth rate in annual sales tends to be less reliable, the higher is any positive correlation between the sales in the different years (Johns 1981). The correlation between the sales in any two adjacent years is the serial autocorrelation for sales. This serial autocorrelation is often positive and high, because sales in any year build upon the sales in the previous year. Hence, the more unreliable will be the sales growth variable. This, in turn, produces an understated correlation, which could mislead the manager into falsely inferring that the advertising expenditure by divisions produces little or no benefit in sales.
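A minimal sketch of the sales-growth arithmetic above (figures in $ millions, as in the text):

    def growth_rate(prev, curr):
        # Annual growth rate: a difference score divided by last year's figure.
        return (curr - prev) / prev

    reported = growth_rate(10.0, 11.0)   # 0.100, i.e., 10 percent
    true     = growth_rate(10.2, 10.8)   # 0.059, i.e., 5.9 percent

    # The error in the growth rate, relative to the true rate, is about .70
    # (69 percent in the text, which rounds the rates to 4.1 and 5.9 first).
    print(round((reported - true) / true, 2))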


If the performance measure used is growth in profit, then the unreliability inherent in profit is amplified by the unreliability from using annual growth rates. Essentially, the calculation used to produce the profit growth rate has first taken the difference between sales and costs (i.e., profit), which typically has lower reliability than either sales or costs. This has been done for each of the two years. The time rate of change in profit has then taken the difference between the profits of the two years. Profit growth rate is the profit for year 2 less the profit for year 1, divided by the profit for year 1. Hence, the numerator is a difference score (year 2 minus year 1), and so profit growth rate will suffer a reduction in its reliability. Hence, for profit growth, the decrease in reliability through taking difference scores has occurred twice, by using profit and then by taking the difference between years, considerably decreasing reliability.

Profitability, the ratio of profit to sales, might be expressed as the annual growth rate in profitability. Again, profit being the numerator would tend to be low on reliability, and then taking the difference between the two annual profitability figures would tend to further reduce reliability. Similar remarks would apply to the ratio of profit to assets expressed as its annual growth rate. The general proposition is:

5.10 Any organizational variable that is the rate of change of some measure over time tends to be less reliable than that variable measured at a single time.

The more specific propositions are:

5.11 The growth in sales over time tends to be less reliable than sales measured at a single time.

5.12 The growth in profit over time tends to be less reliable than profit measured at a single time.

Leading to the general proposition that:

5.13 Any association, e.g., correlation, involving the rate of change of some measure over time tends to be more attenuated than an association involving that variable measured at a single time.

The more specific propositions are:

5.14 Any association, e.g., correlation, involving growth in sales over time tends to be more attenuated than an association involving sales measured at a single time.

5.15 Any association, e.g., correlation, involving growth in profit over time tends to be more attenuated than an association involving profit measured at a single time.

Measurement Error Caused by Control Variables

In academic research, difference scores can enter in another way. In order to control for the effect of some exogenous variable and prevent confounding and hence


spurious correlation, that variable may be entered into the analysis. For instance, in a multiple regression analysis, the control variable may be entered alongside the independent variable. The analysis then gives the effect of the independent variable on the dependent variable, controlling for the control variable. More specifically, in a multiple regression analysis, the slope of the dependent variable on the independent variable is a partial slope controlling for the exogenous (i.e., control) variable. This means that the relationship between the exogenous variable and the dependent variable is established, and then taken out, by calculating the residuals around this relationship and using the residuals to measure the association between the independent and dependent variables (Blalock 1972, 433). Thus, the dependent variable is really itself minus the expected value from the exogenous (control) variable. This can readily be appreciated when the familiar equation explaining the dependent variable, Y, by the independent variable, X, while controlling for the control variable, C,

Y = X + C

is rewritten in its mathematically equivalent form:

Y – C = X.

Thus, the dependent variable is really a difference score between the actual value on the dependent variable, Y, and the expected value from the control variable, C. Clearly, there is error of measurement in the dependent variable, Y. And there is also error of measurement in the control variable, C. Even if these two variables have high reliabilities, the reliability of the residual, Y – C, tends to be reduced by the degree of positive correlation between Y and C. The control variable, C, will only be needed as a control if it has a substantial correlation with the dependent variable, Y, so the correlation between Y and C could often be substantial. This contributes to lowering the reliability of the residual variable, Y – C, below that of variables Y and C. Thus, the observed correlation of X and Y will tend to be understated for purely measurement reasons, apart from any controlling of spurious effects of C.

The same issue can arise within organizations among managers. For instance, in the accounting systems of some companies, in a comparison of the production costs of plants that vary in size and therefore enjoy different economies of scale, this size differential may be controlled for. Thus, plant size is essentially used as a control variable, by having an expected cost figure for each size of plant and then comparing each plant’s actual costs with that figure. This is equivalent to subtracting the expected from the actual costs and thus is a difference score. Again, this brings with it increased unreliability.

Suppose that the unit cost of a product from Company C’s large plant in Alabama is $95 and the average for the company’s large plants is $90. Then the difference is $5, and so the Alabama plant is above expected costs by 5.6 percent and seems


too expensive—leading the head office to fire its manager. However, errors in the cost accounting system mean that the Alabama plant’s true cost is $93 and the true large-plant average is $91, so that the Alabama plant is really only $2 above the true expected large-plant cost. This excess is wholly due to the older equipment at Alabama, it being the original Company C plant. The error in the difference is $3, which is 150 percent (= $3/$2) of the true excess. Yet the errors that lead up to it are in themselves small. The error in Alabama’s costs is only 2.2 percent (= $2/$93). The error in the large-plant average costs is only 1.1 percent (= $1/$91). These small errors lead to large errors when controls are applied, because the analysis uses a difference score. After controls have been applied, the error in the dependent variable (here $3, on an observed difference of $5) will typically be larger than the error in either the original dependent variable ($2) or the control variable ($1). The propositions are:

5.16 Controlling for a variable tends to make the resulting analysis less reliable.

5.17 Any association, e.g., correlation, involving a control variable tends to be more attenuated than that association without that control variable.

A way to reduce errors of measurement is to aggregate data. Then random errors in the measurement of each case will tend to cancel out, producing an average for that set of cases that has much less measurement error. However, there are limitations to how far this method can be used in the management of many organizations. Organizations will often wish to examine the performance of each unit individually for many purposes, such as assessing the performance of the manager in charge or deciding which unit should receive new investment. A company may have few similar units (e.g., large plants), so the average used for standard-setting is based only on that small number. Therefore, the aggregation method of reducing measurement error that is frequently used in academic research may be less usable in many managerial contexts, leaving the company’s performance measurement after applying controls highly unreliable.

Measurement Error Caused by Comparison With Standard

The unreliability due to difference scores can occur in other managerial control systems. Simon (1957) states that in making decisions managers usually compare the attained performance level with the satisficing level—that is, the level that is deemed to be satisfactory. According to Simon, managers tend not to take action as long as performance attains that level. When performance drops below the satisficing level, there is a problem that causes managers to take problem-solving action to restore performance to the satisficing level. Thus performance is compared with a standard. This idea may also be expressed by comparison of attained performance with a goal or target or benchmark or “hurdle rate,” set by managers


for themselves or by their superordinates (e.g., the head office). The gap between desired and attained performance is a trigger for action. Williamson (1970) states that in the M-form corporation, the corporate head office disciplines the divisions, which often in practice means punishing managers who fail to attain the standard, while rewarding those who attain and exceed it. Locke et al. (1981) emphasize the positive motivational consequences that come from having a goal and striving to attain it. The budget is often a core management tool in an organization and each of its departments; managing by budgets involves comparisons of actual income and expenditures against those budgeted (Hopwood 1976). All these processes entail comparison of the actual performance of an organization (or its subunits) with some desired level.

Such a comparison between the actual level of a variable and the expected level is a difference score and so again brings in the problem of unreliability. Thus, organizational performance is frequently interpreted in management in a way that leads to a comparison with some standard. This produces a figure that is the gap between the actual level and the standard; being a difference score, it will be prey to the inflated unreliability inherent in difference scores. The attained level is subject to measurement error; however, the target level is not—that is, a target of 20 percent is definitely 20 percent, not 19 or 21 percent. However, the error of measurement in the attained level applies to the gap between the attained level and the target, and the gap is usually (much) smaller than the attained level itself. For instance, an attained level of 19 but a target of 20 leaves a gap of only one. Thus, the error of measurement is a greater ratio to the gap (e.g., 1) than to the attained level (e.g., 19). Again we have inflation of the ratio of noise to signal.

For example, suppose that management announces that, while present annual sales are $100 million, the goal for next year is $110 million, an increase of 10 percent. At year-end, the reported sales attained are only $105 million, leaving an apparent gap of $5 million. However, sales this year were really only $104 million; that is, the reported sales figure has an error of $1 million, which is an error of only about 1 percent. But relative to the true gap of $6 million, this $1 million is an error of 17 percent. Thus, the percentage error in the sales target gap is seventeen times larger than the percentage error in sales (see the sketch below). This is because the true gap ($6 million) is much less than the true sales attained ($104 million). Again, it is the radical shrinking of the denominator, from sales to gap, that leads to the inflation of the percentage of error.
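A minimal sketch of this target-gap arithmetic (figures in $ millions, as in the text):

    target = 110
    reported_attained = 105
    true_attained = 104

    error = reported_attained - true_attained      # $1 million measurement error
    true_gap = target - true_attained              # $6 million

    print(round(error / true_attained * 100, 1))   # about 1.0 percent of true sales
    print(round(error / true_gap * 100))           # about 17 percent of the true gap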


Even though the target figure is measured without error, comparisons between the target and the attained level suffer from considerably more measurement error than the simple variable of the attained level, because the gap is a difference score. This holds for any variable whose attained level is compared with some target level (e.g., number of products that need reworking or number of satisfied customers). Nevertheless, because the target is measured without error, only the attained level introduces measurement error into measuring the gap between target and attained levels. Therefore, the amount of error can be less than in the difference scores discussed above, where both of the variables (whose difference is being measured) have measurement error. Because setting performance standards is considered a core task of management, and because the number of such standards increases through quality improvement programs, benchmarking, and so on, the scope of the problem of unreliability being introduced by having standards is likely increasing within management.

Clearly, the problem of unreliability introduced by comparison of the actual figure with a standard can occur for profit figures, or profitability. The actual profit attained, for example by a division, is compared with the level that is required, such as the 20 percent return on investment (i.e., profit on capital invested) required by the head office. In such situations, the measurement error from using profit is compounded by the measurement error from comparison with a standard. In budget systems, where actual performance is compared to expected performance for short periods, such as each month, the error caused by difference scores is reinforced by sampling errors from the smaller number of instances that occur during a shorter time period, so that unreliability interacts with sampling error to increase error (in the manner to be discussed later). The reduction in reliability when an organizational variable is compared with a standard leads, again, to attenuation of association (e.g., correlation). The propositions are:

5.18 Expressing an organizational variable relative to a standard tends to make it less reliable.

5.19 Any association, e.g., correlation, involving an organizational variable relative to a standard tends to be more attenuated than that association without that organizational variable being relative to that standard.

It will be apparent that profit measures used about organizations or in organizations (e.g., to assess business segments or divisions) can be subject to multiple sources of unreliability, combining to lower the reliability of the profit measures. Profit is inherently unreliable, as are the derived profitability measures (e.g., the profit-to-sales ratio). The unreliability of profit can be compounded by expressing it as a rate over time. Further unreliability can be added by analyzing profit while controlling for some variable, or by expressing profit relative to some standard. Multiple sources of unreliability could co-occur, such as expressing profit as the rate of increase over time in the ratio of profit to sales, relative to a standard, while controlling for some exogenous variable—which would considerably reduce the reliability. Of course, some sources of profit unreliability would cause more unreliability than others, and not all sources of unreliability would apply to every profit measure. But in dealing with profit measurement, the potential is there for substantial unreliability problems. And the lower the reliability of the profit measure, the more error


in the profit figure, and the more that any correlation involving that profit measure would be an underestimate.

Measurement Error in Contingency Misfit Analyses

Academic researchers sometimes study the relationship between misfit, such as the misfit of the organization to its environment or the misfit of its structure to its strategy, and the resulting organizational performance. Contingency theory (L. Donaldson 2001; Lawrence 1993) holds that performance is affected by the fit between an organizational attribute (e.g., structure) and the factors on which it is contingent (e.g., strategy). Managers may also, on occasion, use contingency thinking. If they examine the degree of misfit, then they are using a difference score—that is, the gap between the actual level of a variable and the level required to fit the level of the contingency variable. If the contingency and organizational attribute variables are highly positively correlated, misfit is measured much less reliably than the contingency and organizational attribute variables—whose level of fit to each other defines the misfit. Therefore, contingency misfit analyses are prone to great measurement error, leading to erroneous inferences. The propositions are:

5.20 Analyses of misfit between an organizational attribute and its contingency factor tend to be less reliable than the organizational attribute or the contingency factor.

5.21 Any association, e.g., correlation, involving a misfit between an organizational attribute and its contingency factor tends to be more attenuated than an association involving just that organizational attribute on its own.

As we keep emphasizing, performance can also often be a difference score, because it is measured by profit or a time rate. Therefore, when a misfit score is correlated with performance, both may be difference scores: misfit is the difference between actual and required level, and performance may be measured by a difference score such as profit or sales growth. Because misfit scores can reduce reliability considerably, there could be low reliability on both sides of the analysis, misfit and performance, producing double unreliability in the analysis. For example, suppose the contingency and organizational attribute variables have each been measured quite reliably, at .8 (and assuming, for simplicity, that the standard deviations of these variables are the same). Suppose also that they are correlated positively, at .7; then their misfit has a reliability of only .33. Clearly, the reliability of misfit (.33) is much less than the reliabilities of either the contingency (.8) or the organizational attribute (.8). If, also, sales and costs have each been measured quite reliably at .8 and are correlated positively, at .7, then profit has a reliability of only .33. Thus, the correlation between misfit and profit is reduced to only .33 (= √.33 × √.33) of its true value.
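This double attenuation can be verified directly (a minimal sketch using the reliabilities just given):

    from math import sqrt

    r_misfit = 0.33   # reliability of misfit as a difference score
    r_profit = 0.33   # reliability of profit as a difference score

    attenuation = sqrt(r_misfit) * sqrt(r_profit)    # about 0.33
    true_correlation = -0.3
    print(round(true_correlation * attenuation, 2))  # about -0.1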


Hence, a true correlation would appear to be two-thirds smaller—for example, a true misfit-profit correlation of –.3 would appear to be only –.1. A manager might well view this figure as trivial or “not really different from zero.” Therefore, the errors of measurement are liable to be substantial in analyses of misfit and performance—that is, in contingency analyses by managers. This problem could apply at the organizational level, such as in an analysis across firms in an industry, or at the subunit level, such as comparing across divisions or departments within an organization.

Low Reliability Not Readily Increased

The view might be taken that the problem of low reliability is, in reality, minor and temporary because it can be readily solved. In social science, reliability can be improved by taking multiple measures of the same concept and then aggregating them together. The usual view is that the multiple measures should be highly positively correlated. This idea is formalized when the reliability of measurement of a concept is expressed in terms of the correlations among the items or indicators that are collectively used to measure it. High positive intercorrelations are taken as evidence that the concept is being reliably measured. This is evidence of reliability in the sense of repeated measures of the same concept giving similar scores. However, it is not necessarily evidence that the collective set of items is a better measure of the underlying concept. In Cohen et al.’s (2003) terms, the issue is whether the items together are correlated more closely with the concept than is a single item. High positive intercorrelations among items do not necessarily mean that they are together more highly correlated with the concept than is one of them singly. Thus, if an item has low reliability, meaning that it correlates poorly with the concept, adding to it other items that are highly positively correlated with it will not necessarily decrease measurement error. In fact, measurement error may be decreased more by adding to the original item other items that are only weakly correlated with it.

As an example, let us return to the case of the profits of business segments in the Walt Disney Company in 2002. As we saw, the possible low profit reliability of .22 means that the correlation between observed profit and true profit is only .47. Let us call this measure of observed profit, profit A. We will seek to bolster its low reliability by adding to it another measure of profit, profit B, such that profit A and profit B have a high positive correlation. Therefore, profit B is constructed so as to be very highly positively correlated with profit A, so much so that they are perfectly, positively correlated (r = +1.0, for simplicity and for the sake of showing the logic; in practice such a perfect correlation is infeasible). This is attained in a simulation by defining profit B as being equal to profit A less $20 million. However, when the scores of profit B are added to those of profit A, the new, multi-item variable, profit (A and B), again correlates only .47 with the true profit. This is the same as the low correlation (.47) between observed profit A and the true profit that we started with.
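The failure of the highly intercorrelated item to help can be checked directly. The following minimal sketch reuses the hypothetical true profits from the simulation earlier in this chapter (figures in $ millions):

    from statistics import correlation  # requires Python 3.10+

    true_profit = [2000, 229, 100, 970]          # hypothetical true profits
    profit_a    = [1169, 986, 394, 273]          # reported (observed) profits
    profit_b    = [a - 20 for a in profit_a]     # perfectly correlated with A
    combined    = [(a + b) / 2 for a, b in zip(profit_a, profit_b)]

    print(round(correlation(profit_a, true_profit), 2))  # 0.47
    print(round(correlation(combined, true_profit), 2))  # still 0.47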


Adding the new item, profit B, has not reduced the error of measurement of true profit. This is not surprising, because profit B is almost a replica of profit A. Instead, what is needed is another item that is unlike profit A, so that it compensates for the deficiencies in profit A. The addition of this item to profit A will reduce the measurement error of true profit. Let us call this item profit C. As we saw earlier, the correlation between profit A and true profit is low (.47) because of the errors of measurement of the profit of each business segment. To compensate for this, profit C should have an error that is equal in amount, but opposite in direction, to the error in profit A. For instance, in the earlier simulation, the Parks and Resorts business segment has an error such that the observed profit is $831 million less than the true profit. Suppose that profit C has the same amount of error, $831 million, but above the true profit. Averaging the scores of profit A ($1.169 billion) and profit C ($2.831 billion), the errors would cancel each other out completely, leaving the true profit ($2 billion). Similarly, profit C would be constructed for all the other business segments so that the average of profits A and C would be the same as the true profit for that business segment. Hence, average profit (A and C) would correlate perfectly with true profit (+1.0, again, for simplicity and for the sake of showing the logic; in practice such a perfect correlation is infeasible). Thus, the objective of eliminating measurement error has been attained.

However, the correlation between profit A and profit C is only +.24. While profit C is positively correlated with profit A, it is much less than highly so; in fact, it is only weakly correlated. Hence, error in the measurement of true profit has been eliminated by adding together two indicators (profit A and profit C) that would fail the criterion that they be highly positively correlated. Following the criterion of high intercorrelation is not the path to solving the problem of low reliability, meaning high measurement error. What is required is an additional item (e.g., profit C) whose measurement errors are negatively correlated with the measurement errors of the original item (e.g., profit A). In the example, the correlation between the errors in profit A and profit C is –1.0 (again, for simplicity and for the sake of showing the logic; in practice such a perfect correlation is infeasible). This would attain the target of completely eliminating the measurement error, a lofty but theoretical objective. Measurement error could still be reduced, though not eliminated, simply by having the measurement errors in profit C be uncorrelated with those in profit A.

However, since in real life (unlike in the simulation) the true profit, and hence the errors of measurement of each case, are unknown, it is not readily feasible to devise a new item (e.g., profit C) that will combine with the original item (e.g., profit A) to eliminate measurement error or even to substantially reduce it. Therefore, the problem of the propensity of profit, growth rates, and other difference score measures of organizational performance to have low reliability, and hence considerable error and attenuation of correlation, is persistent and pervasive. (A small sketch of the profit C construction is given below.)
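Again for the interested reader, the construction of profit C can be sketched as follows (a minimal illustration continuing the simulation; figures in $ millions):

    from statistics import correlation  # requires Python 3.10+

    true_profit = [2000, 229, 100, 970]
    profit_a    = [1169, 986, 394, 273]

    # Profit C: errors equal in amount but opposite in direction to profit A's,
    # i.e., C = true + (true - A).
    profit_c = [2 * t - a for t, a in zip(true_profit, profit_a)]
    avg_ac   = [(a + c) / 2 for a, c in zip(profit_a, profit_c)]

    print(profit_c)                                    # [2831, -528, -194, 1667]
    print(round(correlation(profit_a, profit_c), 2))   # only 0.24
    print(round(correlation(avg_ac, true_profit), 2))  # 1.0: error eliminated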


Similarly, if management sought to avoid the low reliability of profit by combining it with another performance measure, say sales growth, so that divisional performance, for instance, was assessed as a combination of divisional profit and divisional sales growth, this might not in fact avoid the problem. Again, the combination would have substantially increased reliability over profit on its own only if the errors of measurement of profit and sales growth were negatively correlated. Since these errors are not known, ensuring their negative correlation is not readily feasible.

Conclusions

Measurement error (unreliability) is a source of error in the figures used in the control systems of organizations. Profit figures are especially prone to unreliability, not simply because of arbitrary and subjective judgments but because of the nature of profit as a difference score. Profit is sales less costs, which, psychometrics teaches, can be particularly unreliable. Hence, profit tends to be more unreliable than the variables from which it is composed: sales and costs. Similarly, profitability ratios, such as profit to sales and profit to assets, are prone to unreliability, because they include profit and therefore involve difference scores. Profit and other performance measures, such as sales, can have their reliability reduced by expressing them as a rate over time, controlling for an exogenous variable, or reexpressing them relative to a standard. All of these reduce reliability by introducing difference scores.

The unreliability of any of these performance measures leads to erroneous management inferences about individual cases, such as the profit of a division, and so can lead to poor decisions, such as in evaluating and rewarding its managers. Moreover, unreliability in performance measures reduces correlations between performance and any other variable, such as an organizational attribute that is a cause of performance. This means that the true effect of the cause on performance is understated by the observed correlation. This can mislead managers to underuse, or possibly cease to use, a cause of performance, or to search vainly for some other, more powerful causes that may not exist. Unreliability may also be present in variables being correlated with performance, such as the misfit between an organizational attribute and a factor on which its effect on performance is contingent, depressing their correlation, because, again, misfit is a difference score.

In this chapter, we have introduced the idea that profit is prone to measurement error because it is a difference score. In the next chapter, we will pursue this idea further in a quantitative analysis that calculates how much measurement error (unreliability) there is and how it varies according to specific situations.

6 Quantifying the Measurement Error of Profit

In the previous chapter, we demonstrated that profit is a difference score and so potentially prey to substantial measurement error, leading to problems of errors in profit scores and attenuation of correlation. In this chapter, we calculate the amount of measurement error (that is, the diminution in reliability) due to profit being a difference score, and we show the conditions under which its measurement error will be highest. We show also that measurement error in profit may be quite unstable and fluctuate considerably. This more detailed type of argument might not suit the needs and tastes of every reader; those so disposed may prefer to move to the following chapter, which resumes the mainly nontechnical discussion.

Formal Analysis of Measurement Error of Profit

In order to demonstrate more exactly the propensity for profit reliability to be low, and to illustrate this more concretely, we develop a formal analysis, which draws strongly upon psychometrics. The financial statements of the Walt Disney Company* give the sales revenue and operating income for 2002 for each of its four business segments: Media Networks, Parks and Resorts, Studio Entertainment, and Consumer Products. These business segments are subunits of the whole Disney organization and so illustrate analyses of the profits of organizational subunits.

The Walt Disney Company in 2002 had a reliability of the profit of its business segments that could be as low as .22 (the derivation of this figure will be explained below). Its square root is .47, and in the previous chapter we saw how misleading the resulting profit figures could be. Moreover, any correlation involving business

———————
*Walt Disney Co. 2005; data collected and kindly made available by Steven Charlier of the Management and Organizations Department of the University of Iowa.


segment profit would be attenuated to only 47 percent of its true value. Hence, the observed correlation would be about half its true value. Much (53 percent) of the true value would disappear, because of attenuation of correlation solely due to profit being a difference score. For instance, suppose that at Disney there is a cause of business segment profit that has a true correlation of .3. Its observed correlation would be only .14 if profit had a reliability as low as .22. Such a low correlation could easily be judged as very weak, trivial, and not worth basing policy upon, or dismissed as “not significant.”

To understand what determines the reliability of profit as a difference score, we need to use the equation from psychometrics that gives the reliability of a difference. According to Johns (1981), a difference score variable, X – Y, has a reliability that can be calculated from the following equation:

$$r_{diff} = \frac{s_x^2 r_{xx} + s_y^2 r_{yy} - 2 r_{xy} s_x s_y}{s_x^2 + s_y^2 - 2 r_{xy} s_x s_y}.$$

The equation that gives the reliability of profit (i.e., sales minus costs) is therefore

$$r_{prof} = \frac{s_s^2 r_{ss} + s_c^2 r_{cc} - 2 r_{sc} s_s s_c}{s_s^2 + s_c^2 - 2 r_{sc} s_s s_c} \qquad (1)$$

where
r_prof = reliability of profit
s_s^2 = variance of sales
s_c^2 = variance of costs
r_ss = reliability of sales
r_cc = reliability of costs
r_sc = correlation between sales and costs
s_s = standard deviation of sales
s_c = standard deviation of costs.
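Readers who wish to experiment with equation 1 can implement it directly. The following minimal sketch also reproduces the misfit example of the previous chapter (two variables with reliabilities of .8, correlated at .7, with equal standard deviations), which gave a difference-score reliability of .33:

    def diff_score_reliability(s_x, s_y, r_xx, r_yy, r_xy):
        # Reliability of a difference score X - Y (Johns 1981), equation 1.
        numerator = s_x**2 * r_xx + s_y**2 * r_yy - 2 * r_xy * s_x * s_y
        denominator = s_x**2 + s_y**2 - 2 * r_xy * s_x * s_y
        return numerator / denominator

    # Two variables measured at reliability .8, correlated .7, equal SDs.
    print(round(diff_score_reliability(1, 1, 0.8, 0.8, 0.7), 2))  # 0.33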

It will be convenient to simplify this equation a little by considering the average of the reliabilities of sales and costs, r_aa. (This can be thought of as the average of the reliabilities weighted by their variances.) Substituting the average of the reliabilities of sales and costs, r_aa, for the reliabilities of each of sales and costs, we obtain

$$r_{prof} = \frac{s_s^2 r_{aa} + s_c^2 r_{aa} - 2 r_{sc} s_s s_c}{s_s^2 + s_c^2 - 2 r_{sc} s_s s_c}.$$


Thus, the equation for the reliability of profit becomes

$$r_{prof} = \frac{r_{aa}(s_s^2 + s_c^2) - 2 r_{sc} s_s s_c}{s_s^2 + s_c^2 - 2 r_{sc} s_s s_c}. \qquad (2)$$

Because variance is the standard deviation squared, there are only four distinct variables determining profit reliability in this equation: the average reliability of sales and costs, r_aa; the correlation between sales and costs, r_sc; and the variances of sales and costs, s_s^2 and s_c^2. The variables are quantities within a data set and are its properties. For example, the correlation between sales and costs is that existing within a data set. For illustration we use a simple data set that consists of the four business segments in the Walt Disney Company. Hence, the correlation between sales and costs is the correlation between the sales of the business segments and their costs. Similarly, the derived variables, such as profit reliability, are those quantities for that data set—for example, for the Disney business segments.

From the figures of the Walt Disney Company, the costs of each business segment can be calculated as sales revenue minus operating income. Thus, we now have the segment costs, as well as the segment sales. From these data, the standard deviation of sales can be calculated to be $3 billion and that of costs to be $2.8 billion. The variance of sales is calculated as $9 quintillion (i.e., billion billion) and that of costs as $7.8 quintillion. Next, the correlation between sales and costs can be calculated, which is .981. Thus, to be able to calculate profit reliability, we need to know only one more variable: the average reliability of sales and costs.

Knowing the sales-costs correlation, the average reliability of sales and costs can be inferred. With such a high correlation, .981, the reliabilities of sales and costs must be high, because if they were lower they would attenuate the sales-costs correlation and it would be less than .981. More specifically, a true correlation is always higher than the observed correlation, because there is always some unreliability in the variables being correlated and hence some attenuation of their correlation. Hence, the true sales-costs correlation must be higher than the observed sales-costs correlation, because there will be some unreliability in both sales and costs, and hence some attenuation of their correlation. The most conservative assessment of the reliabilities of sales and costs comes from assuming that the true sales-costs correlation is one. Given this, the reliabilities of sales and costs can be calculated by using an equation from psychometrics in the following way.

In psychometrics, the true correlation produces the observed correlation through the inaccuracy introduced by the unreliabilities of the variables. The observed correlation is equal to the product of the true correlation multiplied by the square roots of the reliabilities of the variables. The equation (Hunter and Schmidt 2004) is

$$r_{osc} = r_{tsc} \sqrt{r_{ss}} \sqrt{r_{cc}}$$


where r_osc is the observed correlation between sales and costs, r_tsc is the true correlation between sales and costs, r_ss is the reliability of sales, and r_cc is the reliability of costs. To simplify, consider the average reliability of sales and costs, r_aa, and substitute this for the reliabilities of both sales and costs. (This substitution is valid where the reliabilities of sales and costs are equal, which must be nearly so with their average reliabilities being so high, as seen below.) Then, the equation becomes

$$r_{osc} = r_{tsc} \, r_{aa}.$$

Hence:

$$r_{aa} = \frac{r_{osc}}{r_{tsc}}. \qquad (3)$$

Thus, the average reliability of sales and costs can be calculated by dividing the observed sales-costs correlation by the true sales-costs correlation. For Disney, substituting the observed and true correlations into equation 3 above, the average reliability of sales and costs is .981 (= .981/1).

Now we have values of all the variables in equation 2, which gives the reliability of profit. Inserting these values into equation 2, the reliability of profit is .22. This says that the reliability of profit is low. Although the reliabilities of both sales and costs are very high, averaging .981, profit reliability is low, only .22.

How does it come about that the profit reliability is so low? In overview, equation 2 means that profit reliability is lower (i.e., measurement error of profit is greater), the less that the average sales-costs reliability exceeds the sales-costs correlation and the more equal are the variances of sales and costs. As can be seen in equation 2 above, the numerator contains the average reliabilities minus the correlation between sales and costs. Therefore, the effect of the average reliabilities of sales and costs on profit reliability is positive. This is to be expected, because the better that sales and costs are measured, the better is profit, which is derived from them, measured. However, more measurement error is produced in profit the more that sales and costs are positively correlated. Clearly, these two terms oppose each other, so that measurement error is less, the more that the average reliabilities of sales and costs exceed the correlation between sales and costs. To put it the other way around, the less that the average sales-costs reliabilities exceed the sales-costs correlation, the greater the measurement error of profit. Profit reliability is the opposite of measurement error, and so profit reliability is higher, the more that the average reliabilities of sales and costs exceed the correlation between sales and costs. The proposition is:

6.1 The less that the average sales-costs reliabilities exceed the sales-costs correlation, the greater the measurement error of profit.
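Equation 2 can be explored numerically to see proposition 6.1 at work. The sketch below uses illustrative values, not the Disney figures: with equal standard deviations, profit reliability falls to zero as the average reliability drops to meet the sales-costs correlation.

    def profit_reliability(r_aa, r_sc, s_s, s_c):
        # Equation 2: reliability of profit from the average sales-costs
        # reliability (r_aa), the sales-costs correlation (r_sc), and the
        # standard deviations of sales (s_s) and costs (s_c).
        numerator = r_aa * (s_s**2 + s_c**2) - 2 * r_sc * s_s * s_c
        denominator = s_s**2 + s_c**2 - 2 * r_sc * s_s * s_c
        return numerator / denominator

    print(profit_reliability(0.95, 0.90, 4, 4))  # 0.5: reliabilities exceed correlation
    print(profit_reliability(0.90, 0.90, 4, 4))  # 0.0: they merely equal it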


In the Disney example, the average sales-costs reliabilities, .981, are equaled by the sales-costs correlation, .981, which leads toward zero profit reliability. If these were the only terms in the equation, profit reliability would be zero. (However, there are other terms in the equation, and so the value of profit reliability also depends upon them, as will be seen shortly.)

In general, profit reliability will tend to be lower when the sales-costs correlation is high. The reason is that then it is infeasible for the average sales-costs reliabilities to be much higher than the sales-costs correlation, because it is infeasible for the average sales-costs reliabilities to be very high (this would mean that sales and costs were measured with almost no measurement error). Hence, when the sales-costs correlation is high, the average sales-costs reliabilities will tend to approximate it, producing low profit reliability, as seen in Disney. The proposition is:

6.2 The higher the sales-costs correlation, the greater tends to be the measurement error of profit.

The sales-costs correlation will often be high, because increases in units sold (goods or services) produce increases in sales revenue, but also produce increases in the costs incurred in producing and selling the units. Therefore, sales and costs tend to be highly positively correlated, as at Disney. Moreover, sales tend to be measured with limited error, because they are a simple variable and subject to scrutiny by the firm’s accountants, auditors, and senior managers. Costs also tend to be measured with limited error. Hence, the inherently high sales-costs correlation will be subject to little attenuation from measurement error in sales and costs and so will have a high observed correlation. As seen, this high sales-costs correlation tends to lead toward lower profit reliability.

The relative effects on profit reliability of the average sales-costs reliabilities and the sales-costs correlation are affected by a weighting factor for each. In equation 2, the average sales-costs reliabilities are weighted by the sum of the variances of sales and costs. In contrast, the sales-costs correlation is weighted by twice the product of the standard deviations of sales and costs. The former term is usually larger than the latter, as it is here in the Disney case: the sum of the variances is $16.72 quintillion (i.e., billion billion), while twice the product of the standard deviations of sales and costs is $16.68 quintillion—that is, the former is about $40 quadrillion (i.e., million billion) more than the latter. These differences in weighting make the term involving the average sales-costs reliabilities (in equation 2) greater than the term involving the sales-costs correlation. This turns the difference between the average sales-costs reliabilities and the sales-costs correlation from zero to positive.

The weight given to the average sales-costs reliabilities is greater than that given to the sales-costs correlation whenever the standard deviations of sales and costs are unequal. In contrast, if the standard deviations of sales and costs are equal, then the weight given to the average sales-costs reliabilities is the same as


that given to the sales-costs correlation. Recall that a variance is the square of a standard deviation. If the standard deviation of sales is 4, its variance is 16, as is that of costs if its standard deviation is also 4. In this case, the product of their standard deviations is 16, while the sum of their variances is 32. Hence, in this case, the term used in the equation, twice the product of the standard deviations of sales and costs (i.e., 32), is equal to the sum of their variances (32). Thus, the two weighting terms used in the equation are equal if the standard deviations of sales and costs are equal. If, however, the standard deviations of sales and costs are unequal, then the two weighting terms differ in their numerical values. For instance, if the standard deviation of sales is 3 and that of costs is 5, twice their product is 30, which is less than the sum of their variances, 34 (= 9 + 25). If the standard deviations of sales and costs are unequal, then the sum of their variances will be greater than twice the product of their standard deviations. Hence, profit reliability will be above zero, even if the average sales-costs reliabilities equal the sales-costs correlation. The greater the difference between the standard deviations of sales and costs, the more that the sum of their variances exceeds twice the product of their standard deviations, and so the more positive profit reliability will be.

The weight given to the average sales-costs reliabilities relative to the sales-costs correlation is affected by profit. The standard deviations of sales and costs will tend to be unequal if mean sales are different from mean costs. This will occur in situations of substantial profit or loss. Only at breakeven (i.e., at about zero profit) will the two weighting terms tend toward being equal, so that, then, equal sales-costs average reliabilities and correlation produce zero profit reliability. In the other profit situations, the weighting terms will be unequal, so that the lowest profit reliability will be above zero. For example, at Disney, mean business segment sales were $6.3 billion, while mean business segment costs were less, at $5.6 billion. The outcome was that, rather than being zero, profit reliability was positive. However, as just seen, mean sales were only somewhat above mean costs, so that mean profit was $0.7 billion, which is only 11 percent of mean sales (averaging the profitabilities of the individual business segments gives a similar figure, 12 percent). This leads to profit reliability being positive but low (.22).

Overall, profit reliability will tend toward zero, because the average sales-costs reliabilities will tend toward equaling the sales-costs correlation. When mean profit is also zero, then profit reliability can be as low as zero. As long as the average sales-costs reliabilities equal the sales-costs correlation, profit reliability will still be low, though it will become positive and somewhat higher the more that mean profitability departs from zero. However, as long as mean profitability is only modest, profit reliability will be low, as seen in the Disney case (.22). Such modest mean profitabilities are liable to occur more widely among business firms or their subunits (e.g., business segments) when the economy is in recession than in boom. The propositions are:

6.3 The lower the mean profitability, the greater tends to be the measurement error of profit.

6.4 Higher measurement error of profit is more likely in economic recession than in boom.
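A short sketch makes the weighting argument above concrete. It computes the two weighting terms of equation 2, and the floor of profit reliability that results when the average reliabilities exactly equal the sales-costs correlation. The standard deviation pairs (4 versus 4, and 3 versus 5) are the worked numbers used above; the correlation of .98 is an illustrative assumption.

```python
def weights(sd_a, sd_b):
    # The two weighting terms in equation 2: the sum of the variances
    # (weighting the average reliabilities) and twice the product of the
    # standard deviations (weighting the sales-costs correlation).
    return sd_a ** 2 + sd_b ** 2, 2 * sd_a * sd_b

def floor_reliability(sd_a, sd_b, r):
    # Profit reliability when the average reliabilities exactly equal the
    # sales-costs correlation r: zero for equal standard deviations,
    # above zero for unequal ones.
    var_sum, prod2 = weights(sd_a, sd_b)
    return r * (var_sum - prod2) / (var_sum - prod2 * r)

print(weights(4, 4))                  # (32, 32): equal weights
print(weights(3, 5))                  # (34, 30): unequal weights
print(floor_reliability(4, 4, 0.98))  # 0.0
print(floor_reliability(3, 5, 0.98))  # clearly above zero
```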


In summary, high sales-costs correlations are likely, so that profit reliability is low, which is problematic because it attenuates correlations and misleads managers in other ways. Profit reliability is likely to be particularly low when firm profitability is low. Thereby, at the very time that managers will be anxious about profits and conducting numerical analyses to find profit levers, the results of those analyses will tend to be misleading, possibly completely so. Important drivers of profit may appear to have little or no correlation with profit, casting strong doubts on their efficacy.

The attenuation of correlations involving profit could create errors in many types of analysis that involve profit. For instance, the top management of Disney might worry whether the company will survive economic recession. The managers reason that disastrously low corporate profit is avoidable if the various business segments have profit fluctuations that are not strongly positively correlated with each other. If they are strongly positively correlated with each other, then downturn in one will be accompanied by downturn in others, producing corporate financial disaster. The managers look at the profit fluctuations over time of the two business segments that produce the most profits, Media Networks and Parks and Resorts, which together produced 76 percent of Disney profits in 2002. It can be calculated that their correlation over the five years from 2000 to 2004 was only .44. This is quite low, and the common variance, the correlation squared, is only 19 percent. This result might lead an analyst writing a report to conclude: "This means our two largest business segments have only about one-fifth of their profit fluctuation in common." The resulting managerial view might be that there is no threat to corporate financial health from the profits of the business segments being correlated.

The true correlation, however, has been attenuated by the low reliability of both of the profits that are being correlated together. Each of the two profit figures attenuates the correlation by the square root of its profit reliability. The observed correlation is equal to the true correlation multiplied by the two square roots of profit reliability:

ropmpr = rtpmpr × √rpm × √rpr

where ropmpr is the observed correlation between the profits of Media Networks and the profits of Parks and Resorts, rtpmpr is their true correlation, rpm is the reliability of profit for Media Networks, and rpr is the reliability of profit for Parks and Resorts. Assuming the two reliabilities are the same, they can each be substituted by rpp, so the equation becomes

ropmpr = rtpmpr × rpp.


Hence, the true correlation can be calculated from:

rtpmpr = ropmpr / rpp.    (4)
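A minimal sketch of equation 4, the disattenuation correction, using the figures from this section (the observed correlation of .44 between the two segments' profits and the profit reliability of .22):

```python
def disattenuate(observed_r, reliability):
    # Equation 4: the true correlation is the observed correlation divided
    # by the shared profit reliability, capped at 1.0, the maximum possible
    # value of a correlation.
    return min(observed_r / reliability, 1.0)

print(disattenuate(0.44, 0.22))  # 1.0: the segments' profits are fully correlated
```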

Using the observed correlation of .44 and applying the 2002 profit reliability of .22 in equation 4 gives the value of the true correlation as 2 (= .44/.22). The maximum possible value of a correlation is 1, so the true value would be 1. The observed correlation (.44) is less than one-half of the true correlation (1). The analysis wrongly gives the message that there is little correlation between the two business segments, whereas it is in reality overwhelming. The analyst who used the common variance (i.e., the correlation squared) understated the true association by four-fifths, giving it as 19 percent rather than 100 percent. The profit fluctuations of the two largest business segments are completely correlated. A downturn in one business segment is likely to be accompanied by a downturn in the other (insofar as the past is a guide to the future). Since the corporation obtains most of its profits from these two business segments, the corporation is vulnerable to financial disaster. This understatement of correlation is produced every time that the performance measure is a difference score, in any of its various forms—profitability ratios, time rates, performance controlling for some variable, or performance relative to a standard—as was discussed in the previous chapter.

Sensitivity of Profit Reliability

Profit reliability can vary markedly according to the average sales-costs reliabilities, so that small changes in average sales-costs reliabilities can produce large changes in profit reliability. For Disney in 2002, the average sales-costs reliabilities were .981, which led to profit reliability being low at only .22. However, if the average sales-costs reliabilities had been .986, the profit reliability would have been .64. The square root of this reliability is .8, so that any correlation involving profit would have 80 percent of its true value. Hence, the degree of attenuation would be only 20 percent. A true correlation of .3 would be attenuated to .24, only a small drop. This change in profit reliability from .22 to .64 is produced by a change in average sales-costs reliabilities from .981 to .986. Thus, a tiny difference in average sales-costs reliabilities can create the difference between low and medium-high profit reliability, which produces, respectively, great or small attenuation of profit correlations. These minute shifts in average sales-costs reliabilities are unlikely to be visible to managers, so that profit correlations could swing wildly in organizational data without those looking at the correlations realizing that the variation is due to purely technical differences in the measurement of sales or costs. Very minor changes in the measures of sales or costs that produced minor changes in their reliabilities could radically change observed profit correlations. Since minor changes in the measurement


of sales or costs could occur, profit reliability could be prone to instability. Substantial shifts in profit reliability could happen quickly. If profitability were very low, so that a minimum profit reliability of zero was possible, an organization with high profit reliability could, by losing just a very small amount of average sales-costs reliability, readily fall to zero profit reliability. This would be tumbling into the "black hole" of organizational information systems, because correlations involving profit would appear to be nil. Their true information value would never be seen, due to the information being trapped inside the data and not visible to those looking at it. The profit reliability of subunits (e.g., business segments) of a firm could vary from year to year, because of variations in either the average sales-costs reliabilities or profitability. Such variations in profit reliability might well not be apparent to managers, possibly leading them to search vainly for real-world reasons for variations in profit correlations that are actually due to the purely technical reason of variations in profit reliability.

As we have seen, because sales and costs are always measured with some error, very high values of average sales-costs reliabilities are unlikely. The most likely value is at the bottom of the possible range, with the average sales-costs reliabilities being equal to the sales-costs correlation, which tends toward zero profit reliability. The sensitivity of profit reliability to slight changes in average sales-costs reliabilities means that profit reliability could sometimes rise to levels above its minimum, but it is more probable that profit reliabilities would decline toward their minimum. The propositions are:

6.5 If the sales-costs correlation is high, profit reliability is very sensitive to changes in average sales-costs reliabilities.

6.6 If the sales-costs correlation is high, profit reliability can readily vary.
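The sensitivity is easy to demonstrate with the profit_reliability() sketch from earlier in the chapter. The standard deviations below are the same assumed values used there (back-solved from the rounded Disney statistics), so the exact outputs differ somewhat from the .22 and .64 computed from the unrounded figures, but the pattern is the same.

```python
# profit_reliability() as defined in the earlier sketch.
for avg_rel in (0.981, 0.983, 0.986):
    rel = profit_reliability(0.95, 0.88, avg_rel, 0.981)
    print(f"average reliability {avg_rel:.3f} -> profit reliability {rel:.2f}")
```

With these inputs the sweep runs from about .13 up to about .36: a shift of a few thousandths in average reliability, too small for any manager to notice, multiplies profit reliability nearly threefold.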

Profitability and Profit Reliability

As we are seeing, the degree of correlation of sales and costs is crucial in determining profit reliability. The sales-costs correlation features explicitly in the equation (2) used to calculate profit reliability, and it also has an implicit role in setting the range of possible values of the average sales-costs reliabilities, another major variable in equation 2. Under what conditions will the sales-costs correlation be high or low? Sales will be highly correlated with costs when sales increase proportionately with respect to cost increases (so that all the data points would fall on the regression line of sales on costs). In this condition, the difference between sales and costs, profit, will also increase proportionately with respect to sales increases. Therefore, the ratio of profit to sales—that is, profitability—will be constant. Hence, where


profitability is constant across organizations or their subunits, sales and costs will be highly correlated. Conversely, if profitability varies greatly across organizations or their subunits, then sales and costs will be lowly correlated. As we have seen, high profit reliability is more likely for low sales-costs correlations, and low profit reliability is more likely for high sales-costs correlations. Therefore, heterogeneity of profitability across organizations or their subunits makes it likely that profit reliability will be high. Conversely, homogeneity of profitability across organizations or their subunits makes it likely that profit reliability will be low. It is quite common for corporations to place uniform demands across their divisions or other subunits for profitability, so that, despite differences in sales volumes, they must nevertheless generate similar rates of profit. The corporate level management can use coercive isomorphism (DiMaggio and Powell 1983) over the managers of divisions or other subunits (Williamson 1970). This will lead toward homogeneity and isomorphism of profitability of organizational subunits. Where a comparison is made across those organizational subunits, the homogeneity of their profitability will make for greater sales-costs correlations and therefore for less reliable profit figures. For example, at Disney in 2002, its four business segments were all close to the regression line of sales on costs, so that the sales-costs correlation is high (+.981). If the correlation had been perfect, +1.0, all four business segments would have lain exactly on the regression line of sales on costs, in which case the ratio of profit to sales (i.e., profitability) would be the same for all of them—that is, would be a constant. However, the correlation being less than +1.0 results in some variation in the profits of the business segments. The profitabilities of the business segments were 18, 16, 10, and 4 percent. Corporate managers of large U.S. corporations often challenge their business units to attain demanding targets, such as 20 percent profitability. The Disney figures are compatible with business segments that are trying to attain high, uniform targets of that type, but succeeding only to varying degrees. While the resultant profitabilities are not constant, they are sufficiently similar to give a high correlation of sales and costs across business segments, which in turn leads to the low profit reliability (.22) of Disney in 2002. Firms in the same industry may be subject to similar business conditions as well as to public comparisons of their profitability, so their profitabilities may be somewhat similar, through mimetic and normative isomorphism (DiMaggio and Powell 1983). Hence, they may be somewhat homogeneous, though less so than subunits of the same organization, which, as we have seen, are subject to the coercive isomorphism (DiMaggio and Powell 1983) of the corporate head office (Williamson 1970). Therefore, firms in the same industry will tend to have medium homogeneity of profitability. When an analysis is made across firms in the same industry, medium homogeneity of profitability will produce medium levels of profit reliability. Across the industries of an economy, profitability will vary considerably, so the heterogeneity of profitability in analyses across industries will make for high reliability of profit.
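The link between homogeneity of profitability and the sales-costs correlation can be illustrated with a small simulation. Everything here is synthetic: units are given sales of varying size and a profit margin drawn from either a narrow band (homogeneous profitability, as among divisions facing a uniform corporate target) or a wide band (heterogeneous profitability, as among firms in different industries), with costs simply following sales scaled by the margin.

```python
import random
import statistics

random.seed(0)

def sales_costs_corr(margins):
    # Each unit's costs follow its sales, scaled by its profit margin.
    sales = [random.uniform(1.0, 10.0) for _ in margins]
    costs = [s * (1.0 - m) for s, m in zip(sales, margins)]
    return statistics.correlation(sales, costs)

homogeneous = [random.uniform(0.10, 0.14) for _ in range(40)]    # similar margins
heterogeneous = [random.uniform(-0.20, 0.40) for _ in range(40)]  # varied margins

print(sales_costs_corr(homogeneous))    # very close to +1
print(sales_costs_corr(heterogeneous))  # clearly lower
```

(statistics.correlation requires Python 3.10 or later.) The homogeneous margins force the units onto almost a single regression line of costs on sales, so the correlation approaches +1 and, by the argument above, profit reliability is driven down.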


The propositions are:

6.7 Profit reliability will tend to be lower where profitability (the ratio of profit to sales) is more homogeneous.

6.8 Profit reliability will tend to be lower in analyses of organizational subunits than in analyses of firms in the same industry.

6.9 Profit reliability will tend to be lower in analyses of firms in the same industry than of firms in different industries.

Managers searching for the causes and correlates of the profit of subunits (e.g., divisions) of their own organization are more likely to have those correlations severely attenuated than are analysts looking for correlates of firm profit across industries. The managers are more likely to fall into the dread black hole of zero profit reliability—without knowing it. Then they will experience the debilitating frustration that "nothing seems to correlate with the profits of our divisions."

Causal Model of the Determinants of Profit Reliability

The formal analysis of the determinants of profit reliability may be summarized in Figure 6.1. Profit reliability is higher, the higher are the average sales-costs reliabilities, because the more reliably sales and costs are measured, the more reliably is measured the variable that they constitute: profit. This is shown in Figure 6.1 as the positive causal arrow from average sales-costs reliabilities to profit reliability. Conversely, profit reliability is higher, the lower is the sales-costs correlation, because profit reliability depends on the average sales-costs reliabilities less the sales-costs correlation (for a positive correlation). Thus, the sales-costs correlation has a negative effect on profit reliability, as shown by the negative arrow in Figure 6.1. The average sales-costs reliabilities also affect the sales-costs correlation, by raising the value of the observed correlation. This is shown in Figure 6.1 by the positive causal arrow from average sales-costs reliabilities to the sales-costs correlation. Because the effect of the sales-costs correlation on profit reliability is negative, the raising of it by the average sales-costs reliabilities means that the average sales-costs reliabilities have an indirect, negative effect on profit reliability via the sales-costs correlation. Thus, average sales-costs reliabilities have both a positive, direct effect on profit reliability and a negative, indirect effect. Hence, higher average sales-costs reliabilities lead to higher profit reliability, while also reducing it by increasing the sales-costs correlation—through narrowing the gap between the average sales-costs reliabilities and the sales-costs correlation. This two-edged effect—of average sales-costs reliabilities simultaneously raising and lowering profit reliability—is part of the process that limits profit reliability. The fact that there is always some measurement error means that the highest average sales-costs reliabilities are unlikely, so that low profit reliabilities are the most likely.

[Figure 6.1, Causal Model of Determinants of Profit Reliability, appears here: a path diagram in which presence of measurement error has a negative arrow to average sales-costs reliabilities; average sales-costs reliabilities have a positive arrow to profit reliability and a positive arrow to the sales-costs correlation; homogeneity of profitability has a positive arrow to the sales-costs correlation; the sales-costs correlation has a negative arrow to profit reliability; profitability has a positive arrow to unequal sales-costs variances; and unequal sales-costs variances have a positive arrow to profit reliability.]

The downward pressure on average sales-costs reliabilities is shown in Figure 6.1 by the negative causal arrow from presence of measurement error to average sales-costs reliabilities. Homogeneity of profitability across the units being studied (e.g., organizational subunits) leads to higher sales-costs correlations. This is shown by the positive causal arrow from homogeneity of profitability to the sales-costs correlation in Figure 6.1. Thus, homogeneity of profitability causes lower profit reliability, indirectly via the sales-costs correlation. Unequal variances of sales and costs raise profit reliability. This is shown in Figure 6.1 by the positive causal arrow from unequal variances of sales and costs to profit reliability. Profitability tends to lead toward unequal variances, shown in Figure 6.1 by the positive causal arrow from profitability to unequal variances of sales and costs. Thereby, profitability causes higher profit reliability indirectly via the unequal sales-costs variances. Hence, the homogeneity of profitability lowers profit reliability, while the level of profitability raises profit reliability. Hence, profit reliability tends to be lower where profitability is homogeneously low.

Thus, in the overall model, there are three immediate (proximal) causes of profit reliability: two positive, the average sales-costs reliabilities and unequal sales-costs variances, and one negative, the sales-costs correlation. The distal causes of profit reliability are presence of measurement error, profitability, and homogeneity of profitability. Whether their effect on profit reliability is positive or negative depends upon their intervening variables. For some of the variables, their quantitative effects on each other are given by equations 2 and 3. Figure 6.1 simply summarizes the connections between the variables and the signs (positive or negative) of those connections.
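For readers who want to trace the indirect effects mechanically, the signed arrows of Figure 6.1 can be written down directly. This encoding is simply a restatement of the arrows described above; multiplying the signs along a path gives the sign of that indirect effect.

```python
# Signed causal arrows of Figure 6.1: (cause, effect) -> +1 or -1.
EDGES = {
    ("measurement error present", "average sales-costs reliabilities"): -1,
    ("average sales-costs reliabilities", "profit reliability"): +1,
    ("average sales-costs reliabilities", "sales-costs correlation"): +1,
    ("homogeneity of profitability", "sales-costs correlation"): +1,
    ("sales-costs correlation", "profit reliability"): -1,
    ("profitability", "unequal sales-costs variances"): +1,
    ("unequal sales-costs variances", "profit reliability"): +1,
}

def path_sign(*nodes):
    # Multiply edge signs along a path to get the sign of the indirect effect.
    sign = 1
    for cause, effect in zip(nodes, nodes[1:]):
        sign *= EDGES[(cause, effect)]
    return sign

# The indirect, negative effect of reliabilities via the correlation:
print(path_sign("average sales-costs reliabilities",
                "sales-costs correlation",
                "profit reliability"))  # -1
```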


Measurement Error of Growth of Sales

In the last chapter, we argued that time rates, such as growth rates of sales and profit, are difference scores and therefore subject to measurement error. We can use the formulas developed in this chapter to quantify the amount of measurement error and show concretely that it has high values. As an example of how low reliability can exist in sales growth, consider the growth in sales of Disney's business segments in 2002 over the previous year, 2001. The numerator of sales growth from 2001 to 2002 is the difference between sales in 2002 and in 2001; this, again, is a difference score and so is prone to low reliability. The serial autocorrelation between business segments' sales in 2001 and 2002 is very high: .986 (to three decimal places). Applying equation 3 to infer the average reliability of the two sales figures, that reliability has a minimum of .986, which is more likely than higher values because sales must be measured with some error. Sales in 2001 have a standard deviation of $2.9 billion and a variance of $8.2 quintillion (the figures for 2002 have already been given above). Equation 2, for the reliability of difference scores, can be applied to the difference between sales in 2001 and 2002 (replacing costs in 2002 by sales in 2001). From this it can be calculated that sales growth from the one year to the next has a reliability of only .06. Its square root is .25. Therefore, any correlation involving sales growth (2001–2002) of the business segments at Disney could have only one-quarter of its true value. A true correlation of .3 would be observed to be only .08, rendering it liable to be dismissed as worthless. Once again, we see that being a difference score makes a figure prone to low reliability. Sales growth is not profit and so is free of many of the problems associated with the measurement of profit—and, indeed, of problems associated with the measurement of costs. Nevertheless, the simple fact of taking one sales figure away from another can be sufficient to lead to low reliability. Again, the sales growth rate will tend to be less reliable than the two sales figures that constitute it, and correlations involving sales growth will tend to understate the true relationship.
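The calculation can be run with the profit_reliability() sketch from earlier, treating sales in 2001 as the "costs" term. Only the 2001 standard deviation ($2.9 billion) is quoted in this section, so the 2002 standard deviation below is an assumption, and the average reliability is set at its minimum, the serial correlation of .986; the result is accordingly only in the neighborhood of the .06 reported from the unrounded figures.

```python
# Reusing profit_reliability() from the earlier sketch; sales in 2001
# play the role of costs. The 2002 standard deviation (3.0, in $ billions)
# is an assumed value, not a figure quoted in the text.
growth_reliability = profit_reliability(sd_sales=3.0, sd_costs=2.9,
                                        avg_reliability=0.986,
                                        r_sales_costs=0.986)
print(round(growth_reliability, 2))  # a very low value, near the .06 in the text
```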

Conclusions

Using equations from psychometrics, we have been able to make numerical estimates of the measurement error of profit in a real company and show that it can be substantial. Even when sales and costs are measured with very little error, there can be large measurement error in profit. These large errors in measurement would make profit figures misleading and lead correlations involving profit to appear much weaker than they truly are. Great measurement error in profit is liable to occur when the correlation between sales and costs is high, which itself is quite likely, especially when profitability is homogeneous within a data set. When this situation is accompanied by low profitability, profit reliability can tend toward zero.

The measurement error of profit can be highly sensitive, so that it could change markedly over time or from data set to data set, without the change in measurement error being transparent to managers, thereby confusing them. Growth in sales has likewise been shown to be prone to substantial measurement error, because it too is a difference score. We will now turn to an analysis of the problem introduced by unreliability interacting with the problem introduced by small numbers of observations and so creating more serious problems. These occur both in the M-form organizational structure and in niche strategies.

7 Measurement and Sampling Errors in the M-Form and Strategic Niches

In the previous chapters, we established that profit is liable to be an unreliable measure of performance. Profit is used as a performance measurement in the multidivisional form (M-form) of organization, where the profitability of each division is used to measure divisional performance. Likewise, in some analyses of strategy, profit is used as the way to measure the performance of different product-market niches in order to identify the most successful niche strategies for a company. Because both the M-form and niche analyses use profit, both of them entail the unreliability inherent in profit. Moreover, they also both disaggregate company data and thereby introduce the errors from small numbers. Thus, this chapter presents examples of how both measurement error (unreliability) (discussed in Chapters 5 and 6) and sampling error from small numbers of observations (discussed in Chapters 3 and 4) can occur simultaneously, rendering it more likely that managers will make invalid inferences from data.

Errors in M-Form Divisional Profitability

The M-form is an organizational structure composed of multiple divisions under the control of a corporate head office. Williamson (1985) states that the M-form gives superior monitoring and control of assets, relative to market mechanisms. A key feature of the M-form for Williamson is that corporate management holds the divisions accountable for their profit performance, which reduces opportunism by middle management and consequent loss of performance by the corporation. Certain limitations of the use of divisional profitability have been identified previously. In a corporation where the divisions are interdependent, such as Alcoa, the


prices of semifinished products transferred from one division to another introduce biases into divisional profitability measures (Rumelt 1974). In contrast, divisions that lack task interdependence—that is, divisions that have only pooled interdependence in Thompson's (1967) terms—have divisional profitability measures that are not biased in this way. Similarly, the argument has been made that corporations should rely upon financial controls that emphasize profitability measures only if the corporation is highly diversified (and thus low on interdependence between divisions) and that less-diversified corporations should rely upon strategic controls that go beyond profitability to also use the operational and competitive conditions that underpin corporate performance (Hoskisson and Hitt 1994). Yet, allowing for such considerations, using profit as the measure of divisional performance brings with it the inherent problem of the measurement of profit. As we have seen, the profit variable, being a difference score, tends to have a lower reliability than some other financial performance measures, such as sales and costs. Therefore, divisional profit has the potential for lower reliability. This potential was analyzed in the previous chapters as existing for overall corporate profit and for divisional profit. Divisional profit, however, is less reliable than corporate profit. The reason is that summing divisional profitability to make corporate profit increases reliability, because it is like summing multiple items into a scale. Each division's profitability is affected by error factors idiosyncratic to it; these factors tend to randomly cancel each other out when the profitabilities of the divisions are added together.*

[*I owe this insight to a contribution by Todd Darnold of the PhD program of the Henry B. Tippie College of Business of the University of Iowa, in the 6J:205 Contemporary Topics in Management and Organizations class on October 27, 2004.]

Moreover, because divisional profit is the profit of only one part of the corporation, this disaggregates the profit data, leading to small-numbers problems, additional to the problem of the low reliability of profit. In the next section, we will analyze the small-numbers problem, and, in the subsequent section, we will consider the interaction between the small-numbers and profit unreliability problems.

M-Form as Data Disaggregation

Splitting the organization into divisions in the M-form is a disaggregation of the organization, which increases the potential for the small-numbers problem that has been discussed in Chapters 3 and 4. Organizational profit is split into divisional profits that are based on smaller numbers of observations than is the profit of the whole corporation, increasing error in the profit figures. Inside each division there may be multiple profit centers, so that divisional profit data become further disaggregated, introducing more errors from the even smaller numbers of observations. Empirical research leads to the view that large firms often have multiple profit centers. Hill and Pickering (1986, Table 1, p. 29) surveyed 144 large firms in the


United Kingdom, of which 119 had multidivisional structures. The median number of divisions was between four and six, that is, about five divisions, with 22 percent of firms having seven or more divisions (Table 3, p. 31). However, subsidiaries of the 144 firms were more numerous: the median number of subsidiaries per firm was between twenty-one and forty, with 8 percent of firms having more than 100 subsidiaries (Table 3, p. 31). Furthermore, the divisionalized firms typically had numerous subsidiaries within each division: "The average operating division in the divisionalized companies contained on average 10.4 subsidiaries, so it appears that the divisions are often quite complex entities in their own right" (Hill and Pickering 1986, p. 31). Since subsidiaries would be profit centers, this implies that the divisionalized firms typically contained about fifty profit centers (= 5 divisions x 10 profit centers per division). Thus, many of these large firms in the United Kingdom contain many profit centers, so that the average size of a profit center is only about 2 percent of the size of the firm. Hence, analyses of the profits of profit centers in such organizations will tend to be beset by the small-numbers problem. The small-numbers problem exists for the profit centers, and to a lesser degree the divisions, much more so than for the firm as a whole, whose profit figure is not subject to disaggregation and so is based on larger numbers of observations.

For example, according to Rosenzweig (2007, 39), the large Swedish-Swiss corporation ABB had 5,000 profit centers: its "matrix had fifty-one business areas and forty-one country managers, which intersected in 1,300 separate companies. These companies were divided into 5,000 profit centers, each one accountable to deliver profits." Thus, the average size of a profit center in ABB would be one five-thousandth of the size of ABB. The number of observations, N, in an analysis of a typical profit center would be only .02 percent of the size of ABB. Of course, what matters statistically is not the proportionate size, but the absolute size. However, the proportionate size affects the absolute size, N. In 1988, the total number of employees at ABB was 169,459 (Bartlett 1993, 14), which gives an average number of employees per profit center of only thirty-four. Using thirty-four as the number of employees per profit center for ABB as a whole, the average profitability of each employee in the typical profit center would have a standard error equal to its standard deviation divided by about six (the square root of thirty-four) (Moore et al. 2009, 296). In comparison, for ABB as a whole, the average profitability of each employee would have a standard error equal to its standard deviation divided by about 412 (the square root of 169,459). Thus, if other things were equal, the standard error of the typical profit center would be about sixty-nine times larger than the standard error of ABB as a whole company. Unless the standard deviation of ABB is sixty-nine times greater than the standard deviation of the profit center, the profit center figure will contain more sampling error than the profit figure for the company.

Goran Lindahl, the head of the relays business area of ABB, one of the corporation's seven business segments, explains its structure: "The newspapers may describe ABB's power transmission segment as a $5 billion operation with


35,000 employees, but I think of it as almost two hundred operating companies further divided into 700 profit centers each with about 50 employees and $7 million in revenues" (Bartlett 1993, 3). Thus, an analysis of employees in the average profit center of the relays business area would have an N of only fifty. This is a little larger than the number of employees in the average profit center for ABB as a whole, but is still quite small. The average profitability of each of the fifty employees in the typical profit center within the power transmission segment would have a standard error equal to its standard deviation divided by seven (i.e., the square root of fifty). In contrast, the average profitability of each employee in the entire power transmission segment, with its 35,000 employees, would have a standard error equal to its standard deviation divided by 187 (the square root of 35,000). Thus, if other things were equal, the standard error of the typical profit center would be about twenty-six times larger than the standard error of the business segment within which it resides. Unless the standard deviation of the business segment is twenty-six times greater than the standard deviation of the profit center, the profit center figure will contain more sampling error than the business segment figure. Thus, the average profit-per-employee figure would likely have more sampling error for a profit center than for the power transmission business segment of ABB.

Overall, the analyses show that, even for a large company with 169,459 employees, disaggregation of profit data by having many small profit centers will tend to create substantial sampling error. When managers disaggregate profit figures, such as average profit per employee, to profit centers from larger aggregates such as the whole company or even from large business segments (as here), error increases because of the small-numbers problem. As seen (other things being equal), the standard error of an average profit-per-employee figure would be increased by a factor of sixty-nine when examining individual profit centers rather than the whole of ABB. If we use the relays business area to represent the size of the typical business area in ABB, then we can say that this increase in error comes mainly from disaggregating from the business area to a profit center, a factor of thirty-one (the ratio of 187 to 6). In contrast, the disaggregation from ABB to a business area increases error by only a factor of two (the ratio of 412 to 187). This is because the decrease in size (35,000 to 34) from the business area to the profit center creates an increase in standard error that is the ratio of the square roots of their sizes (187 to 6), whereas the decrease in size (169,459 to 35,000) from ABB to the business area creates an increase in standard error that is only the ratio of the square roots of their sizes (412 to 187). Thus, disaggregating ABB as a whole company into business areas that are just under one-fifth the size of ABB only increases standard error a little, doubling it. However, further disaggregation down the hierarchy from business areas to profit centers increases standard error by a factor of thirty-one. Thus, disaggregating from business areas to profit centers produces fifteen times the error of disaggregating from the whole company to the business areas. This is because the business areas


are only one-fifth the size of the whole company, whereas the profit centers are about one-thousandth the size (the ratio of 34 to 35,000) of the business areas. (Of course, while the proportionate size of organizational subunits affects the number of observations within each, it is the absolute number of observations, N, that affects sampling error.) Disaggregation is greater within each business area than from the whole company to business areas. The greater the disaggregation as one goes down the hierarchy, the more that error from small numbers increases (other things being equal).

The profit centers of ABB are typically of small size. This will tend toward small-numbers problems in profit figures for profit centers. The resulting errors are likely to be greater than for aggregate figures, such as for ABB as a whole, or even for large business segments. When managers disaggregate profit figures, such as average profit per employee, to profit centers from larger aggregates, such as the whole company or large business segments, error increases, because of the small-numbers problem. However, ABB is well known for following a philosophy of decentralization and keeping operating units small, so it may not be typical of the amount of disaggregation into profit centers among large corporations. But the company provides a cautionary case of how far large corporations can disaggregate themselves into small profit centers and of the resulting small numbers that could make figures (e.g., averages) for those profit centers problematic. The proposition is:

7.1 The profit figure will contain more sampling error for a profit center than for the whole company.
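The square-root arithmetic behind these comparisons is compact enough to script. This sketch just reproduces the 1/√N scaling of the standard error of a mean, using the ABB sizes quoted above (the "about sixty-nine" and "about twenty-six" in the text come from first rounding the square roots to 412, 187, 7, and 6).

```python
import math

def se_ratio(n_small, n_large):
    # Other things being equal, the standard error of an average scales
    # with 1/sqrt(N), so the ratio of standard errors is sqrt(n_large/n_small).
    return math.sqrt(n_large / n_small)

print(se_ratio(34, 169_459))      # profit center vs. the whole of ABB: ~70
print(se_ratio(50, 35_000))       # profit center vs. power transmission segment: ~26
print(se_ratio(35_000, 169_459))  # business area vs. whole company: ~2
```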

Interaction of Profit Unreliability and the Small-Numbers Problem in the M-Form

Internal profit analyses across profit centers and divisions in the M-form corporation present two problems: profit unreliability and small numbers. These two problems interact and can create considerable error in the assessment of the performance of each division or profit center. The accuracy of the profit accountability to which managers are subject in the M-form will be lessened by this error. This inaccuracy, in turn, will tend to bias any decisions based thereupon, such as new capital allocation to divisions and rewards such as promotions of divisional managers. These arguments will now be considered in more detail.

Divisional profit (divisional sales less divisional costs) is a difference score and so suffers from the problem of unreliability, as was discussed in the previous chapters. A corporation may use the ratio of profit to sales to compare profitability across divisions despite their varying in size (i.e., in sales). However, the ratio of profit to sales for each division will also tend to be unreliable, for the reasons explained in the previous chapter. Divisional profitability may also be expressed as a ratio of profit to some other measure, such as assets, but that

calculation suffers the same problems of inherent unreliability as explained previously. Calculating growth in profit over time is a third way to measure divisional performance, but this also suffers from the tendency toward low reliability through being a difference score. Thus divisional profit, whether taken as simple profit, profit to sales, profit to assets, or profit growth, will tend to be infected with measurement error, potentially leading to problematic inferences being drawn by divisional and corporate managers. These problems of divisional profitability are inherent to profitability and so are the same as those that occur for corporate profitability. Additionally, however, there are other sources of error that can operate for divisional profitability that make it more error-prone than corporate profitability. These sources arise from a division being only part of the whole corporation. They are the problem of transfer pricing and the problem of disaggregation, which will now be discussed in turn.

The accurate assessment of the profit of each division is most problematic when the divisions are interdependent, so that one division receives inputs from another, such as product supplies, information, or new product technology. This interdependence leads to the problem of transfer pricing, whereby the attribution of profit to one rather than another division becomes arbitrary (Lorsch and Allen 1973). Most M-form companies, in fact, sell related products, and some are vertically integrated, so that there is interdependence between some divisions in the typical M-form (Capon et al. 1987; Channon 1973, 1978; Dyas and Thanheiser 1976; Fligstein and Brantley 1992; Grinyer, Yasai-Ardekani, and Al-Bazzaz 1980; Pavan 1976; Rumelt 1974; Suzuki 1980). Thus, most M-form companies will experience transfer-pricing problems that are liable to make the profits of their divisions less reliable by increasing measurement error.

While the problem of transfer pricing is widely understood, there is another source of error in divisional profitability that has received less attention, and so we will focus on it. For all companies, calculating the profitability of each division disaggregates total corporate data. This disaggregation applies not just to companies whose divisions are interdependent (i.e., related-product or related-service companies or vertically integrated companies), but also to companies whose divisions are independent of each other (i.e., unrelated-product companies or conglomerate corporations). The number of observations on which divisional profitability is calculated is smaller than for the whole corporation, as we saw in the last section. This introduces the problems of smaller sample sizes, discussed previously in Chapters 3 and 4. Therefore, the profitability of a division will be more error-prone than that of the whole corporation. This problem of divisional performance being measured erroneously is worse if the division is small in size, so that its profit measure is based on a small number. Small-sized divisions will occur either when the corporation is small or when the corporation is broken into a large number of divisions. If the corporation is small and it is also broken into many divisions, both factors apply: the size of each division


will be very small and there will be substantial variations in performance across divisions that are erroneous. Thus, the inference errors resulting from disaggregation of the corporation into divisions are increased for small companies and for more numerous divisions. The smaller the size of a division, the more error-prone is its profitability. Therefore, the smaller the average divisional size of a corporation, the more error-prone is its assessment of the divisions. The propositions are:

7.2 The smaller is the size of a division, the more erroneous is its profit figure, leading to more erroneous managerial inferences and mistaken decisions about that division.

7.3 The smaller is the size of the divisions in a corporation, the more erroneous are the divisional profit figures, leading to more erroneous managerial inferences and mistaken decisions about the divisions in that corporation.

Of course, a corporation having a large number of divisions or profit centers has two different implications for the number of observations. On the one hand, the size of each division or profit center will tend to be small—that is, their average size will be small. This small size will increase the standard error of a figure about that division or profit center (e.g., its average profit per employee), as we saw in the ABB example above. This sampling error will tend to compound the error from the low reliability of profit. However, having a large number of divisions or profit centers also means that there will be many observations in analyses that compare across the divisions or profit centers. Therefore, these analyses will suffer less from small-numbers problems, and they will tend to have small standard errors (i.e., little sampling error). Thus, the concern in this chapter is where the analyses of the profit of a division or profit center contain substantial errors both from profit unreliability and from small numbers of observations. Hence, the focus is upon figures that characterize a division or profit center on its own (e.g., its average profit per employee), rather than figures that examine associations across divisions or profit centers.

For a division or profit center, the low reliability of profit leads to much random variation around the true figure—for example, the average profit per employee in that division or profit center. Half the cases would be above the true value and half would be below the true value. Likewise, the small-numbers problem leads to much random variation around the true figure. Again, half the cases would be above the true value and half would be below the true value. If the two sources of error are independent, then their combined errors will be random combinations. The two errors will sometimes amplify each other and sometimes dampen each other. In one-quarter of cases, both unreliability and sampling error produce errors above the true value, so their combined effect would be to produce an even larger positive error. Similarly, in one-quarter of cases, both unreliability and sampling error produce errors below the true value, so their combined effect would be to


produce an even larger negative error. Hence, in one-half of the cases, the error is amplified due to the interaction of unreliability and sampling error. In contrast, in one-quarter of cases, unreliability produces an error above the true value while sampling error produces an error below the true value, so their combined effect would be to produce a reduced error. Similarly, in another one-quarter of cases, unreliability produces an error below the true value while sampling error produces an error above the true value, so their combined effect would be to produce a reduced error. Hence, in this one-half of the cases, the error is reduced due to the interaction of unreliability and sampling error. Thus, while the two errors tend to offset each other in half the cases, in the other half they combine to make the error worse. Hence, the problem identified earlier of low profit reliability will be not only preserved, but also compounded by the small-numbers problem, in half the cases, while it will be made less of a problem in the other half.

Therefore, in comparing across profit centers, managers will be looking at some figures that are fairly accurate, mixed with some figures that are inaccurate. Nor will it be apparent which are which, because the two errors and their interactions are not apparent to managers from looking at profit figures. For example, the average profitability of each employee of a profit center would contain some error due to the likely low reliability of profit, plus some error due to sampling error from the small number of employees per profit center. While in half the profit centers these two errors would reduce each other, in the other half of the profit centers they would amplify each other. As a result, in those latter profit centers, average profit per employee would be overstated or understated, possibly considerably. In about one-quarter of the profit centers, the profit would be considerably overstated and in about one-quarter of the profit centers the profit would be considerably understated. Because the errors and their interactions are not visible to managers, they would have no way of telling which profit centers had their average profit figure afflicted. Therefore, erroneous judgments may be made about those profit centers and their managers. Moreover, this type of random variation fluctuates from one time period to the next, so that a profit center whose profit is erroneously understated in one period might have it correctly stated in the next—or erroneously overstated. These fluctuations create uncertainty. If sense is made of them (Weick 1995), it will be erroneous, because random fluctuations are liable to be mistaken for systematic cause and effect.

Because divisional profit is based on smaller numbers of observations than corporate profit, even if the unreliability of profit were the same for divisions as for the corporation as a whole, the error in divisional profit would be greater. However, as argued above, the unreliability of profit is liable to be higher for divisions than for the corporation as a whole. Therefore, the two sources of error compound, and so there is liable to be more random error in divisional profit than in corporate profit. Hence, decisions made by managers relying upon divisional profit figures are more likely to be suboptimal than those based on corporate profit.
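The half-amplify, half-dampen logic follows from the independence of the two error sources and can be checked with a tiny Monte Carlo sketch (the normal distributions and their scales are illustrative assumptions; only the signs matter for the argument).

```python
import random

random.seed(1)

# Two independent error sources (unreliability and sampling error) around a
# true figure: in about half the trials they push in the same direction and
# amplify; in the other half they push in opposite directions and partly cancel.
trials = 100_000
same_direction = 0
for _ in range(trials):
    e_reliability = random.gauss(0.0, 1.0)
    e_sampling = random.gauss(0.0, 1.0)
    if (e_reliability > 0) == (e_sampling > 0):
        same_direction += 1

print(same_direction / trials)  # close to 0.5
```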


The proposition is:

7.4 Divisional profit figures can contain more random error than the profit figure for the corporation, leading to more erroneous managerial inferences and mistaken decisions from the profit figures of the divisions than of the corporation.

To repeat, these remarks apply to figures used to characterize part of an organization, such as a division or profit center, not to analyses across these parts, such as a correlation of divisional profitability with some other divisional variable. In that situation, disaggregation of company data into separate data for many parts (e.g., divisions) produces a large number of observations. This, in turn, produces little sampling error. The low reliability of profit attenuates the correlation involving divisional profitability (as seen in the previous chapters), so that the true correlation is substantially understated. Sampling error then produces a range of values around this attenuated correlation. Because sampling error is limited (due to the large N), the range of values is also limited, so that the observed correlation tends to be close to the attenuated correlation. Therefore, a manager is highly likely to make an incorrect inference from the data. The error is systematic (i.e., biased downward to an underestimate) rather than a random error that produces sometimes underestimates and sometimes overestimates. Hence, the profit correlation will tend not to fluctuate from time period to time period and will be quite stable. Therefore, managers will tend to feel some certainty about the cause-and-effect processes underlying profit correlations, while underestimating the strength of the causal connections. In contrast, if the company data are disaggregated into only a few parts, such as business segments, there will be only a small number of observations and thus much sampling error. Then there will be a large range of observed correlations around the attenuated correlation, a range which includes the true correlation, though the likely result is a correlation that differs from it. Hence, a manager might by luck draw the correct conclusion but is more likely not to. Thus, in comparisons across parts of the organization using correlations, much disaggregation—and so a large number of observations—preserves the attenuation due to low profit reliability. Therefore, the manager is certain to be wrong. On the other hand, little disaggregation, and so a small number of observations, makes being wrong merely probable, because there is still some chance of random error canceling out the attenuation of the correlation.

Errors in the M-Form

Williamson (1970) argues that the M-form is superior to the unitary form (U-form), but this claim needs to be critically examined in light of the above. In the M-form, the performance of the divisions is assessed through their profits. In contrast, the


U-form measures the functional departments (e.g., manufacturing, research and development) mainly as cost centers, with marketing also considered as a sales revenue center. These cost and sales revenue measures tend to be more reliable than the measures of divisional profitability. Profit is the difference between sales revenue and costs and so suffers the unreliability inherent in difference scores. In contrast, sales and costs are simple variables, not difference scores, and therefore tend to be more reliable. Thus, the M-form employs a measurement approach that is inherently more error-prone than the U-form, and managerial inferences and decisions about divisions are more likely to be mistaken in the M-form corporation than are managerial inferences and decisions about functional departments in the U-form corporation. The proposition is:

7.5 The performance of a division in an M-form corporation is assessed less reliably than the performance of a functional department in a U-form corporation, leading to more erroneous managerial inferences and mistaken decisions.

Aggregation of data, helpful in reducing error, can occur over time if divisional performance is assessed not just over a single year, but over several years. Some corporations use a three-year window over which to assess performance, including the profitability of their divisions (Lorsch and Allen 1973). If an average of annual profit is taken, this should produce a truer figure, because the random overestimates can be offset to some degree by the random underestimates. This averaging may be especially beneficial for small divisions or for divisions that do not sell many items each year, because they suffer more from the small-numbers problem that creates the random error. This longer time frame is commonly used in corporations that have related products. In contrast, unrelated-product corporations tend to focus just on annual performance (Lorsch and Allen 1973). Thus, unrelated-product corporations may have more measurement error introduced by single-year data in their assessments of divisional performance compared with related-product corporations.

In the M-form theory of Williamson (1970, 1985), the corporate head office polices the performance of the divisions, so that managers in charge of divisions that produce high profits are rewarded, through praise, bonuses, promotion, and so on. In contrast, managers whose divisions perform poorly are criticized and lose bonuses; they may be demoted and, at the extreme, dismissed. Holding managers so sharply accountable hinges upon the accurate assessment of divisional profit. Any sense that profit figures are inaccurate will be seen as unfair by the divisional managers. This perception will tend to breed a sense of injustice among divisional managers who feel that their assessed profit is lower than its true value. This in turn may affect their interactions with corporate management, leading to evasion, camouflage, and conflict. Further, it may lead the management of one division into


conflict with that of another, rather than collaborating for the corporate common good. Low performance in and of itself would produce some of this conflict, but perceived inaccuracies are liable to compound the stress of the situation. The inherent difficulty of measuring profit accurately makes rewards based on divisional performance problematic. This may be part of the reason that corporations in which the divisions are interdependent tend to place less emphasis on the profit performance of the divisions in calculating the bonuses of divisional managers and more emphasis upon the corporation (Pitts 1974, 1976). This emphasis avoids problems and helps to focus on the superordinate goal. It also emphasizes corporate profit, a measure that is more accurate and so less liable to lead to unjust assessments. Nevertheless, corporations whose divisions are unrelated tend to place emphasis upon divisional profit in assessing the accountability of divisional general managers and determining their bonuses—despite the problem of random error in profit measures. Williamson (1985) holds that the internal capital allocation within the M-form is superior to that of the external capital market, in that the corporate center knows the profit performance of each division and can use this information to allocate new capital to the better-performing divisions. However, as we have seen, the divisional profit figures will in fact contain errors because of the unreliability in measurement, so that new capital is not always allocated to the division making the highest economic return for the corporation’s investment in it. By contrast, the external capital market relies upon publicly available information, which for many companies is restricted to the total corporate profit. Because this figure is not the result of disaggregation, it avoids the problems that inhere in small numbers, as well as probably being measured more reliably than divisional profit, so that corporate profit should be less error-prone than the divisional profit figures. Thus, the benefits of internal capital allocation in the M-form relative to external capital allocation through the market may be overstated in the Williamsonian theory, due to overlooking inference problems. In the multidivisional corporation, the performances of the divisions are typically assessed primarily in terms of profit, and because profit is unreliable, some part of the variation observed across divisional performances will be erroneous. Further, this error will be compounded when the size of the divisions is small. When divisional profits are measured and the divisions are small, the variation in performance across divisions increases due purely to error. Greater error in divisional performance measurement implies a greater chance that the performance of a division drops below the satisficing level (Simon 1957). When managers fail to attain the satisficing level, they are liable to be deemed to have failed by the head office; the smaller the division, the more likely that this judgment is false. Thus, more divisions will be erroneously labeled as having unsatisfactory performance when the corporation is small or the divisions are small. Hence, corporate managerial attention will be drawn off into analyzing and intervening in divisions where such intervention is not really required. This may frustrate the managers of that


This may frustrate the managers of that division, leading them to resist, precipitating further political action, and extending in some cases to dismissals or resignations. Such a wild goose chase may distract corporate management from attending to real problems in the corporation. Thus, the cycles of dysfunctional control discussed in Chapter 4 may occur in divisionalized corporations through disaggregation of their performance data into divisions.

A further implication is that a manager in charge of a small division is more exposed to chance relative to the manager of a large division. Thus, there may be perceptions of inequality leading to demotivation and conflict. Also, less experienced managers may be placed in charge of small divisions and only successful ones promoted to lead larger divisions, so that success early in divisional general management is more chance-based than success later on in the managerial career. This repeats in divisional general managers the problem of chance in career advancement that was discussed in Chapter 4.

In summary, the benefits of the M-form that Williamson argues for in his theory are undermined by the errors in measurement of divisional profit performance stemming from the unreliability of profit, especially when reinforced by the divisions being of small size. These errors blunt accountability and may lead to distrust and demotivation of divisional management. The errors may lead to mistaken assessments of some divisions as poor performers, thus distracting management from really important matters. These inference problems also blunt the superiority of the M-form as an internal capital market relative to the external market. None of these problems argues against use of the M-form; rather, they serve to caution us about its limitations, so that the benefits from the M-form are reduced. The M-form is not inferior in performance consequences to the U-form; rather, the advantages of the M-form are reduced to a degree by the inference problems. The M-form would remain the preferred structure over the U-form for a diversified corporation because it constitutes a fit between strategy and structure (Chandler 1962).

The more that a performance measure is used as a basis for judging organizational managers—including for bonuses, promotions, and dismissals—the more incentive they have to manipulate that measure, leading to more error in its measurement. Thus, greater emphasis upon a performance variable will lead to it being less reliable. The tendency in the 1990s was to accentuate share price, such as by giving CEOs large numbers of share options, which leads to “managing earnings.” This can lead to profit manipulation and to increased measurement error on profitability, because that is the variable often used to assess the performance of a company. Similar processes can apply at the divisional level, accentuated when divisional general managers’ bonuses are tied to the profitability of their divisions.

Profit Used to Determine Bonuses in Small Organizations

We have emphasized divisions as the organizational subunits that are profit centers, and many of these divisionalized companies are large. However, performance measurement using profit can also play a role in small companies.


Gibbs et al. (2009) surveyed automobile dealerships in the United States, which typically were independent small business firms. In these dealerships, the profit of a department was used to determine a bonus received by the departmental managers. The organizational structure of these dealerships involved a general manager and several subordinate managers, who headed the departments of new car sales, used car sales, and car servicing (Gibbs et al. 2009, Table 1, p. 246). The departmental managers received a bonus that was determined by a formula (Table 1, p. 246) based on profit in 99 percent of cases (Table 4, p. 249). The profit measure was either “gross profit (revenue less the cost of goods sold) or net profit (gross profit less other costs)” (Gibbs et al. 2009, 248; emphasis added). Clearly, these profit figures were difference scores: revenue less costs. They were, therefore, potentially prone to the problem of substantial measurement error (low reliability).

Nevertheless, the new car sales departmental managers, for instance, received a profit-based bonus averaging $53,635, which was more than the average salary, $33,555, of those managers, indicating the importance of this bonus to them (Table 1, p. 246). Moreover, 12 percent of these managers received no salary (Table 1, p. 246), so were presumably very dependent upon the bonus. While 85 percent were eligible to receive this bonus, only 58 percent received any of it (Table 1, p. 246); 27 percent received nothing from it, despite being eligible. Thus, managers could win or lose substantially from this bonus system, so it would be highly salient to them and likely to be quite consequential for their resultant feelings and behaviors. Therefore, the accuracy of the metric, profit, used to determine the bonus was important. The unit whose profit performance was used to determine the managers’ bonus was, in most cases (74 percent), the unit of which they were in charge—for example, the new car sales department for the new car sales manager (Table 4, p. 249).

The departmental managers averaged nineteen employees who reported directly to them (Table 1, p. 246), indicating that their departments were small. The combined, average number of direct reports of the departments was fifty-seven, and the general manager had an average of twenty additional direct reports (apart from the three departmental managers) (Table 1, p. 246), so the average size of a dealership was at least seventy-seven, indicating that these dealerships were quite small organizations rather than large corporations. Thus, even in small organizations, the profit of their subunits can be used as the performance metric to determine monetary bonuses, despite the subunits being small. Moreover, the small size of these departments, nineteen on average, made them vulnerable also to sampling error, additional to the error from unreliability. Suppose that a dealership used the average quarterly profit over the year to calculate the bonus. Calculating the sampling error would involve dividing by 8.8 (the square root of 77) for the dealership, but by 4.4 (the square root of 19) for the departments. Hence (if other things were equal), the sampling error for the departments, and therefore for their managers, would be twice as much as that of the dealership and hence for the general manager.
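The square-root arithmetic here, and in the comparison with a large division that follows, can be checked directly. The unit sizes come from the averages just cited and from the hypothetical 10,000-employee division discussed next; the assumption of a common underlying standard deviation across units is purely illustrative:

```python
# Standard error scales as 1 / sqrt(n), other things being equal.
import math

department, dealership, division = 19, 77, 10_000

print(math.sqrt(dealership) / math.sqrt(department))  # ~2.0: department error
                                                      # twice the dealership's
print(math.sqrt(division) / math.sqrt(department))    # ~22.9: division error
                                                      # about 23 times smaller
```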


By contrast, if divisional profit were used to set divisional manager bonuses in a large firm whose divisions had 10,000 employees, the standard error of average quarterly profit of a division, which would involve dividing by 100, would be about twenty-three times smaller than that for the dealership departments. Unless the standard deviation of the division was twenty-three times greater than that of the dealership department, the dealership department would contain more random error in its profit than the division. Once again, we see how using profit figures about small organizational subunits produces sampling error, which can reinforce the lower reliability of profit to create a greater error. Nevertheless, these potentially erroneous departmental profit figures are used to set large amounts of pay, so that any errors in profit cause nontrivial errors in the total pay of the manager, with resulting possible negative effects on motivation, feelings of justice, and propensity to cooperate with colleagues.

It should be noted that the organizational subunits in these automobile dealerships are not divisions, but rather functional departments (e.g., sales). This cautions that the problems arising from treating organizational subunits as profit centers can sometimes exist in functionally structured organizations—and not just in divisional structures. The divisional (i.e., M-form) structure, however, typically treats its divisions as profit centers and so has a higher probability of incurring the problems of low reliability that flow from profit centers than has the functional structure.

Errors in Niche Strategy Analysis

A particular feature of much analysis of corporate and business strategies is the emphasis placed on profitability in examining different market segments and product lines. This entails disaggregating company profit into the profit for each product market, thereby engendering smaller numbers of observations for each product market. Since profitability tends to suffer from inherent low reliability, the use of product-market profit figures compounds the problem of the unreliability of profit with the problem inherent in disaggregation. Hence, such analyses are liable to produce incorrect assessments of the profitability of a product market or of one product market compared with another. This in turn will lead to incorrect conclusions about wherein lies a profitable niche for the company. Thus, companies will be prone to wrongly identifying certain market segments or products as profitable and committing resources thereto, while disengaging from other market segments or products that in reality are as, or possibly more, profitable. This problem is seen particularly in the pursuit of a niche strategy.

A fashionable idea in strategic management is the concept of niche strategy. The notion is that a company should identify particular products and markets that are favorable to it and then pursue its fortunes in them by letting go of other product markets that are less favorable. The prescription is to analyze existing products and markets in terms of the profitability, market share, and so on enjoyed by the company.


The niche philosophy extends to a critique of relying on conventionally derived accounting measures of the profitability of each product-market segment, and it enjoins the manager to allocate overheads more accurately to each separate product market so as to yield their true profit.

Whatever the merits of the niche strategy, analysis of the profitability of each product market entails disaggregation of company total figures. Thus, a certain amount of the variation across product and market segments will be due to chance. Pursuit of “profitable” niches will therefore run the risk of error. This will be especially so, the smaller the scale of the company and the more finely grained the niche analysis. Once again, the two sources of error will enter here: unreliability of profit and disaggregation. We saw in a previous chapter that aggregates yield better inferences than disaggregates; a corollary is that managerial strategies predicated upon aggregates are inferentially sounder than those based on disaggregates. Yet the search for profitable niches often leads to attempts to assess separately the profit from each part of the organization or its products or markets—that is, to disaggregation of profitability data. This disaggregation may extend to examining the profit not merely of each division, but of each product or market segment or customer type or geographic locale—or some combination of these. Thus, there can be considerable disaggregation. Hence, analyses of the profitability of each niche will combine random error from the unreliability of profit with the random error from small numbers of observations, in a similar way to that discussed earlier regarding the profitability of each division. Again, these errors will sometimes reinforce each other and sometimes offset each other, so that the profits of some niches will be quite accurate and the profits of other niches will be quite inaccurate. Nor will these underlying disturbances be visible to the managers and consultants examining niche profitability, so they cannot correct the profitability of each niche for its particular error.

Moreover, assessing the profit of individual products is difficult and is becoming more so over the years. In traditional manufacturing, the cost of a product could be ascertained by counting the costs of its inputs, such as materials and labor. But the increasing use of more capital-intensive manufacturing processes has shrunk the proportion of costs that are due to direct labor—that is, the person who makes the product. Much of the manufacturing relies upon advanced, perhaps automated, machinery, which is made possible by technical staff, engineers, and others who are not assigned to any one machine. Many of these costs of machines and indirect staff are classified as overhead in accounting systems; and overhead has risen to make up much of the costs of a manufacturing plant. Yet the attribution of overhead to individual products in multiproduct plants with complex technical processes is notoriously difficult. There is an old joke among managers: “You ask me: ‘Is it profitable?’ If you want it to be profitable, we’ll allocate little overhead and so it will be profitable. If you want it to be unprofitable, we’ll keep adding on more and more overhead until it is unprofitable. You just tell me what you want.” Thus, niche strategy is placing increased emphasis upon the finely grained attribution of profit to products at the same time that this is becoming more difficult in many organizations.
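How disaggregation manufactures spurious "losers" can be shown with a simulation. Everything here is assumed for illustration: every niche has the same true margin, and the only difference between the coarse and the fine analyses is the amount of noise in each estimate, which grows as each unit of analysis shrinks:

```python
# A hedged illustration (all parameters assumed): every niche has the
# SAME true profit margin; only the noise differs, growing as the data
# are disaggregated into smaller units of analysis.
import random

random.seed(2)

TRUE_MARGIN = 5.0       # percent, identical for every niche
SD_10_CATEGORIES = 1.0  # noise when profit is pooled into 10 categories
SD_100_SUBCATS = 3.0    # larger noise for 100 small subcategories
DOG_THRESHOLD = 2.0     # margins below this look like "dog products"

coarse = [random.gauss(TRUE_MARGIN, SD_10_CATEGORIES) for _ in range(10)]
fine = [random.gauss(TRUE_MARGIN, SD_100_SUBCATS) for _ in range(100)]

print("apparent dogs among 10 categories:    ",
      sum(m < DOG_THRESHOLD for m in coarse))
print("apparent dogs among 100 subcategories:",
      sum(m < DOG_THRESHOLD for m in fine))
# The finer analysis "finds" many more losing niches, although every
# niche is equally profitable: the extra variation is pure noise.
```

Under these assumptions the finer analysis appears to uncover numerous unprofitable niches, although none exists; the hypothetical scenario that follows plays out the managerial consequences.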


Consider the following hypothetical scenario. A company embraces the niche philosophy and conducts an analysis of its products. It breaks its products down into ten categories and identifies the profitability of each category. This examination reveals worrying evidence that several product categories are unprofitable, or only marginally profitable, so they are duly axed. However, the consultants performing the niche analysis conclude with a caution that the analysis into ten categories is rather crude and should be regarded as only preliminary. Company management, flushed with success at having identified unprofitable product categories and pruned them out as a result of the preliminary niche analysis, subsequently retains the same consultants to perform a stage 2 investigation, which is a more finely grained analysis of the profitability of the company’s products split into 100 different subcategories. The results of this are even more interesting than the stage 1 analysis, as the new results reveal even greater variations in profitability among the product subcategories than among the product categories. The consultants are vindicated in their original view that the stage 1 analysis of categories was only a preliminary examination that masked important differences and only crudely identified niches for the company. Top managers now convene a series of meetings between themselves and each division manager, with the consultants giving expert advice, and go through each product subcategory, pruning out the “dog products.” At the end of this second round of the niche-focusing exercise, the company has wiped out 25 percent of its sales—destroying economies of scale in manufacturing and purchasing and relinquishing growth options to competitors—and has irreparably damaged its long-term standing in the industry. Yet because of the successive disaggregation, the market segments and product lines examined have been progressively smaller and therefore more infected with error. Thus, some of the segments or product lines exited were erroneously classified as unprofitable.

The niche philosophy leads companies to withdraw from niches revealed to be unprofitable; but disaggregation of data and its consequent noise may lead companies to withdraw from product markets erroneously regarded as unprofitable. Repetition of this pattern over the years could lead to repeated erroneous withdrawals, culminating in injurious loss of sales, profits, and scale by the company. The niche philosophy has been credited with contributing to the decline of certain parts of the U.S. electronics industry, through premature relinquishing of product segments to international competitors (Egelhoff 1993). Yet because of the possibility of inference problems due to disaggregation combining with inference problems due to the lower reliability of profit, the underlying judgments about which niches are profitable could be in error.

Small firms, or firms other than the top few “majors” in an industry, are often advised to adopt a niche strategy in order to avoid head-to-head competition with the majors. Yet such smaller companies, precisely because of their small size, will have more difficulty identifying truly viable niches than will the majors.


When the smaller firm disaggregates its sales-market data by product or geographical region, it will have inherently smaller numbers than will the major firm, and therefore the niche analysis of the smaller firm will be less valid and useful than that of the larger firm. Thus, the niche strategy, which is often presented to small firms as “building on their strength,” is, from the point of view of inference capacity, tapping into their inherent weakness.

In summary, the more finely grained the niche analysis, the smaller the company, and the more that profit is emphasized rather than sales revenue and costs, the more erroneous the niche analysis is liable to be. The propositions are:

7.6 There will be more error in analyses of the profit of niches than in analyses of the profit of the company.

7.7 The smaller the niche used in analyses of the profit of niches, the higher the probability that erroneous differences in profit of niches will be found.

7.8 The smaller the niche used in analyses of the profit of niches, the higher the probability that incorrect strategic decisions will be made.

Conclusions

In the multidivisional (M-form) structure, emphasis is placed on the profit performance of each division, yet this is likely to contain errors of measurement. As seen in the previous chapter, divisional profit, like corporate profit, is a difference score and therefore tends to be unreliable. However, the unreliability of divisional profitability is greater than for corporate profit. Moreover, the problem of the low reliability of profit is compounded by the problems of disaggregation, in that a division is only part of the company and so subject to more random variation than corporate profitability. The smaller size of divisions and profit centers relative to the whole corporation makes their profit figures more erroneous than those of the whole company. These errors will often amplify the errors from the low reliability of profit.

The unreliability of divisional profitability reduces some of the advantages claimed for the M-form. Divisional profit figures will tend to be more erroneous than the cost figures used to assess functional departments in the U-form organization. These errors in the performance measures used in the M-form, relative to the U-form, reduce claimed advantages of the M-form, such as those regarding accountability, capital allocation, and rewarding managers.

In niche strategy, the search for profitable niches again emphasizes the profit measure, which is prone to unreliability, and again produces the problem of disaggregation. Thus, niche analyses run the risk of producing strategies that are counterproductive, because errors from the low reliability of profit can be reinforced by the sampling error from small numbers of observations in the niche. The more finely grained the niche, the more this is so.

With the concepts of errors created by small numbers and measurement error, we have two building blocks of a theory of the management of inference. They should provide a starting point for a formal analysis of the contents of control systems, a topic that has been neglected in organizational theory to date. We now turn to a third building block: range restriction and extension.

8 Errors From Range Restriction and Extension

In the methodological principles that underlie meta-analysis, a third important artifact is range restriction. This is in addition to the first two artifacts: sampling error and unreliability. In this chapter, we will consider the role of range artifacts in managerial inferences.

In social science research, the range artifact can attenuate (i.e., reduce) any measure of association between two variables, such as the correlation between X and Y. This occurs when the variation on either variable is less in the study than in the real world (i.e., the universe or population). In order for covariation (i.e., association) between any two variables to be fully revealed, both variables must have their full variation. When range is restricted on either variable, its variation is reduced and so the correlation is reduced. Thus, range restriction on either variable attenuates correlation. This easily occurs when a study fails to include the top or bottom end of the range of a variable. For example, a study of the correlation between organizational size and organizational structure may not include very large or very small organizations and so restrict the range of size, producing a correlation less than that in the universe of all organizations. Range restriction can occur in either of the variables being correlated (e.g., X or Y). Restriction in one variable will tend to lead to restriction in the other variable. When examining the covariation between organizational performance and a cause of it, restriction in either performance or its cause will lead to attenuation of the correlation between them, so that the effect of the cause on performance will be understated. The fundamental methodological proposition is:

8.1 Range restriction leads to attenuation of association, i.e., reduction of the observed correlation below the true correlation.
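Proposition 8.1 is easy to demonstrate numerically. In the sketch below, all numbers are assumed: two variables are generated with a true correlation of about .6, and the sample is then restricted to the upper part of the range of X (the `statistics.correlation` function requires Python 3.10 or later):

```python
# A minimal simulation sketch of Proposition 8.1 (all numbers assumed).
import random
import statistics  # statistics.correlation needs Python 3.10+

random.seed(3)
xs, ys = [], []
for _ in range(50_000):
    x = random.gauss(0, 1)
    y = 0.6 * x + random.gauss(0, 0.8)  # true correlation is 0.6
    xs.append(x)
    ys.append(y)

print(round(statistics.correlation(xs, ys), 2))  # ~0.60 over the full range

# Restrict the range: keep only cases in the upper part of X, as a
# study that omits small organizations implicitly does.
pairs = [(x, y) for x, y in zip(xs, ys) if x > 0.5]
rx, ry = zip(*pairs)
print(round(statistics.correlation(rx, ry), 2))  # ~0.36, attenuated
```

The restricted sample shows a visibly weaker correlation even though the underlying relationship is unchanged.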


The contrary phenomenon is range extension, which is also an artifact. Here, the range on one of the variables in a study is greater than that in the universe, producing a correlation that is greater than the relationship in the universe. This may occur, for example, in a study that takes only the students in the top third and bottom third of a variable, such as test score, and correlates that with some other variable, such as their anticipated future salary. The proposition is:

8.2 Range extension leads to inflation of association, i.e., inflation of the observed correlation above its true value.

Differences in range can also lead to differences in correlations between studies. If one study restricts range while the other study does not, then the first study is artifactually attenuating the correlation and so understating it, while the second study is not and so will have a higher correlation. As an illustration, a meta-analysis of the relationships between organizational structural variables (Walton 2005) showed that range accounted, on average across fifteen pairs of structural variables (e.g., centralization and specialization), for 20 percent of the variation between studies in their correlations. This is only somewhat less than sampling error, which accounted for 25 percent on average; and it is more than unreliability, which accounted for only 5 percent on average. Thus, for the variables in those studies, range is an important artifact. The phenomena of range restriction and range extension can occur also in management, as we shall now see.

Errors From Range Restriction in Organizational Management

Range restriction can lead to attenuation of correlation in analyses by managers and their staff in organizations. The result is that the observed correlation understates the true correlation.

Suppose that the Brown Corporation has ten divisions, all of which manufacture products. The company has traditionally believed that economies of scale are important, and so it has grown or acquired its divisions so that they are large players within their industries. In fact, the CEO proclaims the philosophy that the divisions should be number one or two in size in their industry, or have a plan to become so, or they will be sold or shuttered. His belief is that scale leads to lower unit costs, so that the division is the low-cost producer in its industry, enabling it to have both sharply competitive pricing and good profit margins. However, lately there has been much talk in the business press about the end of economies of scale because of new information technologies that facilitate lowest-cost production regardless of scale. Worried, the CEO commissions a study by the head of the corporate planning department. For each division, she takes the product that has the highest number of units produced annually and counts that number and the unit cost of that product. She correlates the number of units with unit cost and finds the expected negative correlation, but much weaker than expected, only –.2.


This result leads to intense discussion in the corporation and to criticism of the belief in economies of scale as “old thinking.” Yet there really is a high negative correlation between scale and unit costs in these industries, looking across the complete range of scale, including small- and medium-sized firms as well as large divisions like those in the Brown Corporation. By analyzing only divisions in the Brown Corporation, the head of corporate planning has omitted the small and medium sizes, so that the scale variable is very restricted in range, producing a correlation weaker than that which exists between scale and costs. Furthermore, while there is variation in scale of many thousands of units between divisions, the relationship is actually between the logarithm of scale and the logarithm of unit cost, so that even these scale differences produce only minor variation in the logarithm of scale, attenuating the scale-cost relationship. Thus, there really are still important economies of scale in the manufacturing industries in which the Brown Corporation operates, and its divisions benefit by having lower costs than many of the smaller firms that compete with them. However, these benefits are obscured in the analysis because it fails to obtain the same amount of variation as exists in the manufacturing industries as a whole.

There can be a particular problem of range restriction when organizational performance is the variable being correlated. M. Meyer and Gupta (1994) have argued that measures used to assess the performance of organizations tend over time to show less variation, so that organizations come to differ less from each other on the organizational performance variable, thereby restricting the range of organizational performance. An interpretation would be that once an aspect of an organization’s performance is being measured and used to evaluate that organization, its managers have an incentive to increase the level of that variable. Thus, variance is eliminated over the years, producing more homogeneous performances—that is, the organizations become more alike in their performances. This means that their range is being restricted relative to what it had been. Insofar as this occurs by moving the drivers of performance to high levels for all the organizations, the correlations between those drivers (i.e., causes of performance) and performance will be attenuated (reduced).

Suppose that training organizational members was a cause of organizational performance. Initially, the amounts of training conducted by organizations varied widely across the organizations, producing a strong correlation between training and organizational performance. This led to the correct inferences by organizational managers and industry regulators that training was important and to be encouraged. Later, however, when training was widespread, the correlation between training and organizational performance declined, purely because of the decreases in the variation of organizational performance. This led to the inference that training was no longer effective (perhaps based on a perception that its content was out of date), so that training was discontinued. Yet training was just as effective latterly as formerly.


The high level of performance enjoyed across the organizations later was due in part to their large amounts of training, so that training was a cause of their high performance. The correlation between training and performance no longer recorded this fact, because it was artifactually lowered due to range restriction on the performance measure. Training of new employees would still have considerable effect on performance, because of their limited job knowledge, so it should be continued.

M. Meyer and Gupta (1994) further argue that, after a performance measure loses its variation, it will tend to be replaced by another performance measure, which is unrelated to the first and which restores variation in performance across the organizations. This means that the range of performance is no longer restricted and so attenuation ceases. Therefore, each cause of the new performance variable can again have substantial variation, so that it covaries substantially with performance, producing large correlations (if that is their true value). These correlations lead to the correct inference that these variables are indeed causes of performance, and so managers will encourage them again. For instance, if training of organizational members is a cause of this new aspect of performance that is captured by the new measure, then the correlation between training and performance will become substantial. Thus, the organizational manager or industry regulators, or other commentators, observing the differences between organizations, will tend to see training as valuable. They will recommend that it be increased in organizations that do little and maintained in organizations that do a lot. Thus, when the new performance measures are introduced, valid inferences about cause and effect can again be made from the correlations, or other associations, observed between performance and its causes across the existing organizations.

Hence, if the sequence described by Meyer and Gupta occurs, the ability of managers to infer from data comparing across the organizations in their industry (or field) will go through a predictable sequence. As organizations become more alike in performance, the correlations between performance and its causes will decline, leading to erroneous inferences that the causes are ineffectual and that the levels of those causal variables should be decreased. After replacement by new performance measures, organizations become more unlike in performance, and so the correlations between performance and its causes will reveal their true strength, leading to correct inferences about their effectiveness and to steps to increase the levels of those causal variables, especially in organizations that are lower on them relative to other organizations. Of course, this will raise the performance of the lower-performance organizations, leading to less variation and higher average performance of the organizations—the very scenario that Meyer and Gupta document. Thus, the decline in variation in organizational performance that Meyer and Gupta address can result from initially correct managerial inferences drawn from comparing across organizations and correctly perceiving correlations between performance and its causes. As these perceptions are acted upon, variation in performance declines, leading to circumstances that can make inference problematic.

The tendency for the variation in performance of organizations to decrease could be due to various processes.


Being evaluated on a performance metric focuses the attention of managers on that metric and how to raise it. Organizations may raise their performance on the measure by learning which factors lead to higher performance and energetically working to raise those levels. Mimetic isomorphism may be involved as organizations mimic others; coercive isomorphism may be involved if a regulator penalizes low performance (DiMaggio and Powell 1983). Loose coupling (J. Meyer and Scott 1983) might also occur, in that the organization may use impression management to make performance look high, while continuing to operate in a way that is contrary to the spirit of the performance measures.

The processes that M. Meyer and Gupta (1994) analyze may occur not just between organizations, but also within an organization, at its divisional or subunit level. For instance, over time, the performances of divisions of an organization might become more similar on whatever metric the head office uses to assess the divisions—for example, divisional profitability. Thus the profitability of divisions might become more alike, as all strive to attain some corporation-wide target, such as 20 percent divisional profitability. In an analysis comparing across divisions, the causes of divisional profitability will become less correlated with divisional profitability as the differences between divisional profitabilities decline. This can lead to false inferences that those factors are no longer important causes, leading to counterproductive moves such as curtailing the activities that boost or maintain the levels of the causal variables. The propositions are:

8.3 Range restriction leads to attenuation of association, so that managers tend to perceive their organizational practices as being less efficacious than they truly are.

8.4 Any tendency for performance measures to lose variance leads to underestimates of the efficacy of organizational practices that affect those performance measures.

Range will tend to be restricted due to selection. For any cause of organizational performance, low levels of the cause lead to low performance, which tends to reduce the probability that the organization will survive. Thereby, the organization will exit from the population being studied and so the low level of the cause will cease to exist in that population. For instance, if sales strategies affect firm profitability, then firms with poor sales strategies will have poor performance. In competitive environments, poor performance will tend to lead to bankruptcy and the disbanding of those firms. Indeed, the stronger the effect of a cause on organizational performance, the less likely organizations that are low on that causal variable are to survive. Therefore, paradoxically, estimates of the effect on performance of a strong cause will tend to be understated more than estimates for weak causes, because weak causes have less effect on survival.

Range will also tend to be restricted due to adaptation.


Organizations suffering low performance as a result of some characteristic (e.g., strategy or structure) will tend to replace it with more appropriate characteristics that produce higher performance. Economists hold that in highly competitive environments in which organizations are highly adaptive, the correlation between causes and performance will tend toward zero. March and Sutton express these tendencies in these terms: “Information about apparent determinants of differences in performance diffuses through a population of competitors and thereby tends to eliminate variation in both the determinants and their effects” (1999, 340). March and Sutton also state that organizations imitate others: “Organizations seek to emulate the performance successes of others by emulating their organizational forms and practices. This practice is institutionalized through concepts of ‘best practice’ and in the activities of managerial media and consultants” (341). Moreover, abandonment of ineffective practices and imitative adoption of effective practices means that there is more competition, so that effective practices produce fewer performance gains: “Poor performance rankings are interpreted by potential competitors as indications that a practice does not work or a market does not exist, thus inhibiting imitation and competition, thereby reducing the competitive pressure and improving relative performance. Good performance rankings, on the other hand, not only stimulate admiration; they also encourage imitation and competition that tend to erode a favorable position” (340).

Thus, adaptation among competing firms produces less variation in practice and less performance variation from that practice. The outcome is a reduction in variation in both the practice and the resulting performance, which in turn produces a reduction in the correlation between the practice and the performance—that is, attenuation of the correlation. Thus, the effect of the practice on performance appears to be less than it really is in industries where competition and resultant imitation are strong. Therefore, while the gains from adopting the high-performing practices would be substantial, evidence-based inferences by managers in strongly competitive industries would lead them to see these practices as only weakly correlated with performance. Similarly, managers in weakly competitive industries looking at results from strongly competitive industries would be misled. The proposition is:

8.5 The more the competitive pressure for performance on an organization and the more a cause affects performance, the more the range on that causal variable will be restricted, leading to attenuation of association between that cause and performance.

Errors From Range Restriction in Organizational Misfit and Fit

Range restriction based on success also occurs for organizational alignment or fit, such as the organization aligning or fitting its structure to its strategy or fitting its strategy to its environment. Misfit produces lower performance than fit.


Thus, by comparing across organizations, their managers can perceive that organizations in fit outperform those in misfit. Misfit should therefore correlate negatively with performance. However, this relationship will appear most strongly when there is variation in fit—that is, when some organizations are in fit and some are in misfit. The greater the variation in fit, with some organizations being in fit while some are in misfit, the greater the resulting variation in performance and thus the greater the covariation (i.e., correlation) between fit and performance. However, the range of fit can be restricted in two ways: misfit may be less than complete and fit may also be less than complete.

First, high levels of misfit may not be present, because organizations with the resulting poor performance have failed to survive, having been culled or selected out of their ecological niche, in population ecology terms—that is, they have disbanded (Hannan and Freeman 1989). Also, they may have been taken over and reorganized by a new corporate “parent,” reducing their degree of misfit (Goold, Campbell, and Alexander 1994). Thus, such high misfits will tend to be absent from a set of organizations at any one time, despite their having existed temporarily at an earlier date. Hence, the more strongly misfit depresses performance, the less likely it is that high misfits will exist to produce the variation at the high end of misfit, and therefore the less evident will be the true strength of the relationship between fit and performance. Again, organizations in misfit may adapt by becoming better fitted (e.g., to their environment), thus reducing the variation in misfit. The more strongly misfit affects performance, the more likely it is that adaptation will occur sooner and for more moderate degrees of misfit, because the organization and its managers will suffer the negative consequences of misfit. For reasons of selection and adaptation, then, organizations will tend not to be found in high misfit.

Second, turning to fit, organizations may tend not to be found in complete fit. For instance, regarding organizational structure, research shows that organizations that have diversified tend to adopt divisionalization, which is a fitting structure (Chandler 1962; Donaldson 1987; Rumelt 1974). However, examination of the details of structure shows that many divisionalized organizations have nevertheless failed to adopt all the characteristics of full divisionalization, such as divisional autonomy and control systems (Hill and Pickering 1986). The reasons for this incomplete fit may include political resistance and a lack of managerial knowledge of what complete fit entails. Moreover, while incomplete fit means that organizations attain only suboptimal performance, this is consistent with the idea that other factors, such as dominant market share, can provide slack that absorbs some inefficiency from maladaptation (Child 1972). An organization attaining only incomplete fit is also consistent with the idea that management decision-making seeks only to attain the satisficing level rather than the optimal level (Simon 1957).


Hence, for these reasons, there may be few organizations at either the low or the high end of the fit-misfit continuum, producing a lesser correlation between fit and performance than would hold if organizations were more fully distributed along the continuum. This means that the full benefits to be derived from going from complete misfit to complete fit are understated by observed correlations. The proposition is:

8.6 Measures of misfit tend to have range restriction, leading to attenuation of their association with performance, leading managers to perceive the effects of misfit as being less consequential than they truly are.

Errors From Range Extension

Range extension occurs when the variation in a study is greater than in the population (or universe). It produces an inflated correlation that is greater than the true correlation in the population. The inflated correlation misleads the analyst into thinking that the cause is stronger than it is. Range extension can occur in organizational management.

Range extension could happen when managers focus on the extremes of a variable. For instance, a sales manager might have twenty subordinates but compare just the top-selling salesperson with the bottom salesperson. The variable of weekly sales per salesperson would have the full range, from $3,000 (by the top salesperson) to zero (by the bottom salesperson); but the intermediate sales levels, $2,500, $1,700, $1,100, and so on, of the other eighteen salespersons would be omitted. Thus, the standard deviation of sales overstates the variation that exists in reality. A correlation between personality and the sales made by each individual salesperson would be greater comparing just the top- and bottom-selling salespersons than if all the other salespersons were included. This leads to the effect of the causal variable being overstated, so that it seems to be stronger than it really is. This in turn could lead to the misunderstanding by managers that there are no other causes of success in operation. Thus, the managers may focus too much on one cause, possibly neglecting other important causes. The proposition is:

8.7 Range extension produces overestimates of association, leading managers to perceive the effects of organizational practices as being more efficacious than they truly are.
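A simulation along the lines of the sales-manager example shows the inflation. All numbers below are assumed for illustration (and `statistics.correlation` requires Python 3.10 or later):

```python
# A hedged sketch of Proposition 8.7 (numbers assumed): comparing only
# the extremes of a group inflates the observed correlation.
import random
import statistics  # statistics.correlation needs Python 3.10+

random.seed(4)
n = 2_000
trait = [random.gauss(0, 1) for _ in range(n)]            # e.g., a personality score
sales = [0.3 * t + random.gauss(0, 0.95) for t in trait]  # true r ~ 0.30

print(round(statistics.correlation(trait, sales), 2))  # ~0.30, all salespeople

# Keep only the top and bottom deciles of sales, as a manager who
# compares only the best and worst performers implicitly does.
ranked = sorted(sales)
lo, hi = ranked[n // 10], ranked[-(n // 10)]
extremes = [(t, s) for t, s in zip(trait, sales) if s <= lo or s >= hi]
et, es = zip(*extremes)
print(round(statistics.correlation(et, es), 2))  # inflated, roughly 0.5
```

Under these assumptions, the extreme-group comparison roughly doubles the apparent strength of the cause, inviting exactly the overemphasis on a single cause described above.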


Conclusions

Range artifacts distort the associations between organizational variables and thus estimates of the effects of causes.

Range restriction leads to attenuation of association and so to underestimates of the effects of organizational practices that cause performance. This could lead managers to abandon or underinvest in those practices, despite their efficacy. Restriction in range can occur in either a cause of performance or performance itself. Any trend over time to reduce variation in performance will attenuate associations involving that performance measure. Selection, adaptation, and isomorphism can all work to restrict the range of organizational variables. Where the cause is a misfit between two variables, restriction in the range of the misfit, through incomplete misfits and incomplete fits, will tend to attenuate associations in which misfit is involved.

Range extension produces the opposite problem: associations are inflated, leading to misperceptions that causes have more effect than they really do. This could lead managers to utilize practices that are only weakly effective or to overinvest in practices that are less effective than other practices.

In Chapters 9, 10, and 11, we will turn to a discussion of another source of error in analyses inside organizations: confounding. We begin in Chapter 9 with an analysis of how examining data for the causes of organizational performance can lead to confounding. In Chapters 10 and 11, we complete the discussion by turning to the more familiar form of confounding: by another factor apart from the focal cause.

9 Confounding by the Performance Variable

In social science methodology, confounding is a major obstacle to making correct inferences from data. Confounding means that a spurious association obscures the true relationship between two variables, usually due to a third variable. The confounding can lead to no apparent relationship when really one exists, or to an apparent relationship when really there is none. Either way, the analyst is misled by the data. In this chapter, we discuss how analyses of the causes of performance can be misled by confounding due to the performance variable itself. In the next chapter, we will discuss how an analysis of a cause of performance can be confounded by a third variable other than performance—the more usual type of confounding.

In confounding, a correlation between two variables is really spurious, due to the association of some other variable. An observed positive correlation could really be nil or negative, or an observed zero correlation could really be positive or negative. The stronger the association between the third variable and the two focal variables, the stronger is the spurious relationship it induces. Also, the weaker the true relationship between the two focal variables, the more their relationship is confounded by any spurious relationship. Confounds are potentially very troublesome, because they do not merely change the magnitude of the observed relationship from its true value, but may render it zero or of opposite sign. Thus, they have the potential to be more severe than the attenuation of correlation due to unreliability, which simply reduces the magnitude of a relationship. Whereas unreliability reduces the observed relationship below its true value, confounds can increase the observed relationship above its true value. Thus, confounds are like sampling error in that they can produce either an underestimate or an overestimate of the true relationship.


Confounds can occur purely accidentally, in the sense that the focal cause of an effect happens to be correlated with a third variable that is also correlated with that same effect. However, confounds can also arise systematically, in that something is tending to make them occur. We shall focus in this chapter on systematic confoundings, because these are regular and predictable, therefore allowing us to predict where managers will make errors and thereby to contribute to statistico-organizational theory. Systematic confounding may occur when the definition of a variable is such that it is associated with another variable. Another source of systematic confounding is causal, in that a causal relationship confounds the focal cause. Both sources of systematic confounds will be considered here.

Difference scores are a source of confounding. Because of their definitions, they can have correlations with their constituent variables. These constituent variables may, in turn, be correlated with some other variable, which confounds the relationship between it and the difference score (Wall and Payne 1973). In particular, this problem is potentially present whenever a cause of performance is being investigated and performance is measured by a difference score, such as profit. Thus, when the effect being assessed is profit performance, this can introduce a confound because of a definitional connection. In this way, causation is being confounded by the definition of performance (e.g., profit).

Reciprocal causality provides another source of confounding, which can be fatal to making sound inferences. Here, the dependent variable is also a cause of the independent variable. This confounds the relationship between cause and effect: the feedback from effect to cause obscures the effect of that cause. This reciprocal causation is particularly likely to be operative where organizational performance is the dependent variable, because much change in organizational variables is driven by organizational performance. Hence, again, causation can be confounded by performance.

Thus, both sources of confounding considered in this chapter are confoundings of causation by performance. Performance produces these confounds either definitionally or by feedback effects. Both are ways that the effect of a cause on performance can be confounded by performance itself, without any other variable having to enter in to produce the confound. We are interested in confounding because of its capacity to lead managers to draw the wrong inference from data and thereby make the wrong decision. We will begin by discussing why the spurious relationships produced by confounds can be strong enough to obscure the true relationships. Then we will discuss, in order, the two sources of confounding being analyzed here: definitional connection and reciprocal causality.

Severe Confounding Produced by Weak Spurious Correlations

The concern with confounds is that they are great enough to produce spurious correlations that are large relative to the true effects, leading to false conclusions being drawn from the data.


Severe confounding of a relationship, such as complete masking, can occur even if the spurious relationship is not strong (i.e., lacks a high correlation). This means that it is easier for confounds to lead to false inferences. The amount of confounding by a spurious relationship is its magnitude relative to the true relationship. Thus, for example, a spurious correlation of +.4 is not to be judged absolutely against the possible range of positive correlations (i.e., up to +1.0). Rather, relative to a true correlation of –.2, the spurious correlation of +.4 produces a net, or observed, correlation of +.2. The conclusion that would be drawn is that the relationship is positive, when it is actually negative. Again, if the spurious correlation were only +.2, the net, or observed, correlation would be zero. The conclusion drawn would be that the relationship is nil, when really a negative relationship exists. Thus, even if not strong, spurious relationships are a problem if the true relationships they confound are weak or similar to them in degree.

While it may be tempting to dismiss weak true correlations of about +.2 as not worth bothering about, there is reason not to follow that path. Some dependent variables of interest in organizational science are influenced by multiple causes. Therefore, the true effect of any one cause is, on average, weak. For example, if organizational performance has eleven causes, which are independent of each other, then the average cause has a correlation of only .3. Such a correlation could easily be attenuated in an empirical study—through a combination of unreliability in the independent and dependent variables, and range restriction—so that it would be only .2. Therefore, weak correlations are germane to organizational management. Given that confounding is more feasible for such weak correlations, the issue of confounding becomes more salient.

Confounding by Definitional Connections

Confounding by Difference Score

A difference score is defined as being the difference between two variables, so that it is definitionally connected to them. This definitional connection can lead the difference score to be correlated with the variables that constitute it. If these variables are correlated with some other variable, then this induces a spurious correlation between that variable and the difference score that confounds their true relationship. Whenever a variable is generated by a difference score, sometimes known as a deficiency score (Wall and Payne 1973), it tends to be correlated with both of the variables that constitute it. These correlations can lead to confounds, as will now be explained.

Wall and Payne caution that “because of the constraints inherent in the derivation of deficiency scores, obtained relationships between such scores and an independent variable may reflect no more than the relationship between one of the two component measures of the deficiency score and that independent variable.”

Figure 9.1  The Possible Values of X – Y From the Values of X and Y

                           X
               1     2     3     4     5
     Y    5   –4    –3    –2    –1     0
          4   –3    –2    –1     0     1
          3   –2    –1     0     1     2
          2   –1     0     1     2     3
          1    0     1     2     3     4

Wall and Payne identify two constraints, of which it is their “logical constraint” (1973, 322) that concerns us here. This constraint restricts the range of the deficiency score, such that it tends to be correlated with its component variables. If those component variables are also correlated with a third variable, there is a correlation between the deficiency score and the third variable that is spurious. This is what concerns Wall and Payne in the passage just cited. It means that a true correlation between the deficiency variable and the third variable is confounded by the spurious correlation due to its component variables. This would make the true correlation appear to be some other value that is not true. In particular, our concern here is with the possibility that, through this type of confounding, a true positive correlation between an organizational characteristic and profit could falsely appear to be negative, nil, or inflated positive, because of confounding by the components of profit (i.e., sales or costs).

The correlation between a difference score and its constituent variables occurs in the following way (Wall and Payne 1973). If variable W is the difference between X and Y (i.e., X – Y), then W tends to be correlated positively with X.


The reason is that higher values of X increase W. This is due to range restriction on the difference score, W, that is associated with X. In Wall and Payne’s terms, there is a logical ceiling on the values of W. If X and Y are both measured on scales of 1 to 5, then, when X is 1, W (i.e., X – Y) can take a value in the range of –4 to zero—that is, the range of W is constrained such that it cannot be positive (see Figure 9.1). However, when X is 3, W can take a value in the range of –2 to +2—that is, W is constrained such that it cannot be highly negative or highly positive. And when X is 5, W can take a value in the range of zero to +4—that is, W cannot be negative. Clearly, the higher the value of X, the higher the values that W can take (see Figure 9.2). To put it another way, as X increases from 1 to 5, the expected value of W increases from –2 to +2. Thus, there tends to be a positive correlation between X and W purely from the way that W is defined, as being the difference between X and Y. Similarly, W tends to be negatively correlated with Y, because higher values of Y decrease W. These two definitional correlations exist because W is defined as being X minus Y, which gives it a positive association with X and a negative association with Y. Thus, there is a definitional connection between the difference score, W, and its constituents, X and Y.

If, in turn, X is correlated with some other variable, Z, then W will be spuriously correlated with Z. The definitionally based correlation between W and X means that, if X and Z are also correlated, then W and Z are spuriously correlated. Hence, there is a confounding of the relationship between W and Z, due to X. Similarly, if Y is correlated with Z, then, again, W will be spuriously correlated with Z. Hence, again, there is a confounding of the relationship between W and Z, this time due to Y. Thus, a spurious correlation between W and Z can arise because of either X or Y.

As just seen, the definitionally created correlation WX is positive, so that if the correlation XZ is negative, then there is a spurious negative correlation, WZ, due to X (see Figure 9.3). The definitionally created correlation WY is negative, so that if the correlation YZ is positive, then there is another spurious negative correlation, WZ, due to Y. In this example, both confounds are of the same sign (negative), so the confound due to Y reinforces the confound due to X. Hence, the confounding of the relationship between W and Z is greater than that due to either X or Y alone. If the true correlation between W and Z is positive, the negative confound produced by X and Y together could nevertheless reduce it, make it zero, or make it negative, depending upon whether the joint confound was less than, equal to, or greater than that true correlation.

Because the definitionally induced correlations of W with X and with Y tend to be opposite in sign, if the correlations XZ and YZ are also opposite in sign, then the confounds of W and Z due to X and Y tend to be of the same sign, and so are reinforcing. In this circumstance, the confounding is stronger than for any one constituent variable.

152 THE SOURCES OF ERROR Figure 9.2

The Possible Values of X – Y Showing Their Association With X

X–Y=W

ƒ

4 ƒ

ƒ

ƒ

ƒ

ƒ

ƒ

ƒ

ƒ

ƒ ƒ

3 2 1 0

ƒ

ƒ

ƒ

ƒ

–1

ƒ

ƒ

ƒ

ƒ

–2

ƒ

ƒ

ƒ

–3

ƒ

ƒ

–4

ƒ

Expected value

X

1

2

3

4

5

together to be great enough to disguise the true relationship between W and Z. If, however, the correlation between X and Z is of the same sign to the correlation between Y and Z, then the definitionally created spurious correlations produced by each of X and Y with Z are of opposite signs and so offset each other. Then, the confounding is weaker than the stronger of the confounds from the constituent variables, reducing their collective ability to produce an observed correlation between W and Z that is misleading.
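These definitional tendencies are easy to check by simulation. The sketch below is purely illustrative (the variables and numbers are invented for the example, and Python is used for convenience): X and Y are drawn independently on 1-to-5 scales, so that W = X – Y correlates positively with X and negatively with Y, and W picks up a spurious negative correlation with a third variable Z that is related only to X.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # X and Y are independent ratings on 1-to-5 scales.
    x = rng.integers(1, 6, size=n).astype(float)
    y = rng.integers(1, 6, size=n).astype(float)

    # W is the difference score.
    w = x - y

    # Z is related only to X (negatively), not to W itself.
    z = -x + rng.normal(0.0, 1.0, size=n)

    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]

    print(f"corr(W, X) = {corr(w, x):+.2f}")  # about +0.71: definitional
    print(f"corr(W, Y) = {corr(w, y):+.2f}")  # about -0.71: definitional
    print(f"corr(W, Z) = {corr(w, z):+.2f}")  # about -0.58: wholly spurious

Although Z has no causal or definitional link to W, the observed correlation between them is substantial, arising entirely from W’s definitional tie to X.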

Figure 9.3 A Difference Score Variable (W) and Its Definitional Connections With Its Constituent Variables (X and Y) Masking a Positive Effect of W on Z
[Figure: path diagram in which W (= X – Y) has a positive definitional connection with X and a negative one with Y; X is negatively correlated with Z and Y is positively correlated with Z, so each produces a spurious negative correlation that masks the positive effect of W on Z.]

Thus, while the definitional connection between a difference score and its constituent variables is not always such as to produce a confound, it may do so, and the confounding produced can be large enough to yield misleading conclusions. The definitional connections between a difference score and its constituent variables mean that the score tends to be correlated with them. Whether a substantial confound is produced by these definitional connections is an empirical matter, depending upon the signs and strengths of the correlations in the particular case.

Some difference scores are absolute differences; for example, in pay mismatch (i.e., the mismatch between actual and desired pay), any difference between actual and desired pay is a positive score.
With absolute differences, there tends to be a definitionally induced correlation with the constituent variables only if there are more cases of one type of inequality than of the other type—for example, more cases of actual pay being less than desired pay than cases of actual pay being more than desired pay. The greater the imbalance between the two types of cases, the stronger is the definitionally induced correlation with the constituent variables. Thus, for absolute difference variables, there tends to be confounding due to definitional connections only if the imbalance is more than trivial.

The fundamental methodological proposition is:

9.1 When a variable is a difference score, e.g., X – Y, confounds are possible due to the constituent variables X and Y.

The confounding of the relationship between a variable and a difference score by the constituent variables of the difference score can be controlled, so as to eliminate their spurious effects. This is usually done by including the constituent variables in a multivariate analysis of the effect of the difference score variable on the dependent variable (Wall and Payne 1973). The proposition is:

9.2 When data contain difference score variables, e.g., X – Y, managers and other analysts may make erroneous inferences, because of confounding, if they do not control for the constituent variables X and Y.

Profit Confounded by Sales and Costs

As already emphasized, profit is a difference score, so it is potentially prone to the problem of being correlated with its constituents and confounded by them. Since profit is sales revenue minus costs, it tends to be positively correlated with sales and negatively correlated with costs. Therefore, a relationship between profit and a variable could be confounded by a relationship between sales and that variable, or by a relationship between costs and that variable. If the relationship between sales and the variable and the relationship between costs and the variable are opposite in sign, any confounds they produce are likely to be of the same sign, and so reinforce each other, making a severe confound more feasible. Thus, any analysis of data using profit runs the risk of being confounded by sales or costs and, thereby, the risk of managers making the wrong inference.

Suppose a corporation facing an economic recession encourages its divisions to discount their prices. Some discount and some do not, so the head office seeks to evaluate the effect of price discounting by divisions on their profits. The head office managers expect to find a positive correlation between divisional discounting and divisional profit. However, the correlation is weakly negative, which throws management throughout the company into confusion. Yet the true effect of divisional discounting on divisional profit is positive; it is masked by negative confounds that are a little stronger (see Figure 9.4). This confounding is due to definitional connections: divisional profit is positively correlated with divisional sales and negatively correlated with divisional costs.

Figure 9.4 Constituent Variables (Divisional Sales and Divisional Costs) of Divisional Profit Masking the Positive Effect of Divisional Price Discounting on Divisional Profit
[Figure: path diagram in which divisional price discounting has a positive effect on divisional profit (= divisional sales – divisional costs); divisional sales relate negatively to discounting and positively to profit, while divisional costs relate positively to discounting and negatively to profit, producing spurious negative correlations.]
In turn, divisional sales have a negative effect on divisional discounting, because divisions doing well in sales do not discount their prices. Thus, there is a spurious negative correlation between divisional discounting and divisional profit that is due to divisional sales. Divisional costs have a positive effect on divisional discounting, because divisions facing high costs discount their prices so as to make enough sales to cover those costs. Thus, there is also a spurious negative correlation between divisional discounting and divisional profit that is due to divisional costs. Together, these two definitionally based negative confounds overwhelm the true positive correlation between divisional discounting and divisional profit, producing the observed weak negative correlation.

The error stems from something inherent in profit: it tends to be correlated with both sales and costs, because they constitute it. Given that profit is integral to a business firm and is often used to assess its divisions, internal comparisons of profit become endemic to management, yet they bring with them potential traps that prevent sound inference.

As already stated, the solution to the problem of the confounding of a difference score by correlations from its constituent variables is fairly straightforward in social science research: researchers include the constituent variables as control variables in any analysis involving the difference score. By implication, managers in organizations conducting analyses that include profit can also avoid the confounding problem by including sales and costs as control variables, as the sketch below illustrates. However, there is some cultural pressure away from doing so. Sales and costs are apt to be seen as only the means to an end; what really counts is profit. Therefore, it seems more purposeful to target just profit and analyze it, without considering sales and costs.
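A small simulation makes the remedy concrete. The sketch below is illustrative only (the data-generating numbers are invented, and Python is used for convenience): regressing profit on discounting alone yields a misleadingly negative coefficient, while adding sales and costs as controls recovers the true positive effect.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 10_000  # division-periods

    # Divisions doing well on sales do not discount; high-cost divisions do.
    sales = rng.normal(100.0, 10.0, size=n)
    costs = rng.normal(90.0, 10.0, size=n)
    discount = 0.5 * costs - 0.5 * sales + rng.normal(0.0, 5.0, size=n)

    # The true effect of discounting on profit is positive: +0.3 per unit.
    profit = (sales - costs) + 0.3 * discount + rng.normal(0.0, 2.0, size=n)

    def slope(y, *regressors):
        # Ordinary least squares; returns the coefficient on the first regressor.
        X = np.column_stack([np.ones_like(y)] + list(regressors))
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return beta[1]

    print(f"discounting alone:           {slope(profit, discount):+.2f}")
    print(f"controlling sales and costs: {slope(profit, discount, sales, costs):+.2f}")

The first estimate is strongly negative, purely because of the definitional confounds; the second is approximately +0.30, the true effect.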


Moreover, the social science argument about difference scores and confounding by constituent variables is little known outside academia. Managers, staff analysts, and organizational consultants are likely to be unaware of the need to control for sales and costs when analyzing profit. Hence, profit, being a difference score, is inherently prone to confounding. Since most managers and analysts will not control for its constituent variables, sales and costs, the possibility of confounding exists in many organizational analyses.

9.3 When analyzing relationships involving profit, managers and other analysts may make erroneous inferences, because of confounding, if they do not control for both sales and costs.

Organizational Situations in Which Definitional Confounding Is Likely

As we have seen, profit is prone to confounding by its constituent variables, sales and costs. This confounding problem potentially applies every time profit is used to measure the performances of organizational subunits and to compare across them. In contrast, neither sales nor costs are difference scores, so both are free of this kind of confounding.

Functional structures treat their functional departments as cost centers, while using costs and sales to assess their sales departments. Hence, functional structures use sales and costs to assess their major organizational subunits (departments) and their managers. Thus, functional structures use assessments of their major subunits that are not inherently subject to confounding. In contrast, multidivisional, or M-form, structures usually measure the performance of their divisions in terms of profitability. Thus, multidivisionally structured companies use profitability, which is inherently prone to confounding. Hence, the performance assessment of major organizational subunits is more likely to be confounded, and therefore less valid, in multidivisional structures. Therefore, managers are more likely to draw erroneous inferences, leading to wrong decisions, from performance assessments of major organizational subunits in multidivisional than in functional structures.

Similarly, in the holding company structure, each business may be a subsidiary that is a profit center. Since many multinational corporations treat each foreign subsidiary as a profit center, an analysis of the profitability of their subsidiaries is prone to be integrally confounded. In contrast, a company operating purely domestically that is organized functionally would be spared this problem, because it lacks multiple profit centers. Multiple profit centers can arise in other ways in companies, such as when regions or branches or factories are profit centers. Whenever a company consists of multiple profit centers, an analysis comparing their profitability is prone to confounding.


The propositions are:

9.4 In organizations whose major subunits are profit centers, analyses of their performances are prone to more confounding than are analyses in organizations whose major subunits are cost centers.

9.5 M-form, or multidivisionally structured, organizations are prone to more confounding in analyses of the performances of their organizational subunits than are organizations that are functionally structured.

9.6 Holding company–structured organizations are prone to more confounding in analyses of the performances of their organizational subunits than are organizations that are functionally structured.

9.7 Multinational organizations are prone to more confounding in analyses of the performances of their organizational subunits than are organizations that are functionally structured.

Confounding and Unreliability of Difference Scores

There is some similarity here with the argument in Chapters 5, 6, and 7 about unreliability, which stated that difference scores are likely to produce high unreliability that can prevent managers from drawing sound inferences from data. For confounding, as for unreliability, difference scores are a hazard to sound inference, yet they are used with some frequency in organizations.

As seen, difference scores tend to suffer from both low reliability and confounding. The low reliability makes their observed correlation with another variable an understatement of the true correlation—for example, a true high positive correlation is observed as a low positive correlation. This attenuation of the true correlation by unreliability makes the observed correlation weak and therefore makes it easier for the confounding to be strong relative to it. Confounding can alter the sign of the relationship involving the difference score variable—for example, a true positive correlation could be observed as a negative correlation. Hence, the true relationship between the difference score and the third variable is first attenuated and then increased or decreased by the confounding. Thus, the combination of low reliability and confounding can make observed correlations involving difference scores misleading to managers.

Of course, unreliability is a more uniform problem than confounding, in that all measurements have some unreliability, so observed correlations always understate true correlations. In contrast, confounding may not occur—if neither of the constituents is actually correlated both with the difference score variable and with the variable to which the difference score is being related, or if the constituent variables confound positively and negatively, completely offsetting each other. However, when confounding does occur, it always operates jointly with the attenuation due to unreliability.
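The joint working of attenuation and confounding can be shown numerically. The sketch below is an illustration with invented numbers (and it treats the confound as simply additive, a simplification that holds approximately in a linear path model): a true positive correlation is first attenuated by unreliability, using the classical attenuation formula, and a negative confound then reverses its sign.

    # True correlation between difference score W and outcome Z.
    r_true = 0.30

    # Reliabilities; difference scores often have low reliability.
    rel_w, rel_z = 0.45, 0.80

    # Classical attenuation: the observed correlation shrinks by the
    # square root of the product of the two reliabilities.
    r_attenuated = r_true * (rel_w * rel_z) ** 0.5  # 0.30 * 0.6 = 0.18

    # A spurious negative correlation contributed by the constituent variables.
    r_confound = -0.25

    r_observed = r_attenuated + r_confound  # -0.07
    print(f"attenuated true correlation: {r_attenuated:+.2f}")
    print(f"observed correlation:        {r_observed:+.2f}")

A manager seeing the observed value of –0.07 would conclude that the relationship is nil or slightly negative, when the true correlation is +0.30.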


Confounding by Reciprocal Causality: Performance-Driven Organizational Change

Reciprocal causality is an idea widely appreciated in social science methodology, and it is germane to analyses of data by managers in organizations. Moreover, a particular form of reciprocal causality figures in organizational theory research, because performance feeds back to drive organizational change. The reciprocal causation considered here occurs when an organizational characteristic affects organizational performance, and performance feeds back to affect that organizational characteristic.

Managers are interested in the effect of an organizational characteristic on organizational performance. Organizational theory research has discovered that not only do organizational characteristics (e.g., organizational structure) affect organizational performance, but organizational performance often affects some of these organizational characteristics in turn. This feedback effect of organizational performance on an organizational characteristic is reverse causation. Where the organizational characteristic affects organizational performance and organizational performance also affects the organizational characteristic, reciprocal causation exists. In reciprocal causality, X causes Y and Y causes X.

A major way in which organizational performance affects organizational characteristics is through performance-driven organizational change. Crises of low organizational performance trigger adaptive organizational change, such as adoption of a more effective organizational structure (Chandler 1962). While any analysis of the effect of X on Y can be confounded if Y also causes X, such reverse causation occurs regularly when one of the variables is organizational performance, because organizational change is performance-driven. Therefore, analyses by organizational managers of organizational performance will tend to be confounded. Since the reverse causality runs from performance to the other variable, the confounded relationship is the effect of that variable on organizational performance. Thus, the effect of some organizational characteristic on performance is confounded by the reverse causality. The reverse causal effect of performance on the organizational characteristic tends to be negative, so any positive effects of the organizational characteristic on performance are masked. Because performance-driven organizational change systematically creates this confound, analyses of the effects of organizational characteristics on performance will recurrently be subject to confounding, whereas analyses of relationships between other variables may be less frequently subject to confounding. Hence, when managers conduct causal analyses, they will be more subject to problematic confounding when performance is the putative effect than when other variables are.
March and Sutton write of “negative feedback effects by which success or failure in organizational performance creates countervailing tendencies. According to one common speculation, organizational performance below target or aspiration levels (failure) triggers increases in search, decreases in organizational slack, and decreases in aspirations.” As a result, “simple unidirectional causal interpretation of organizational performance is likely to fail” (1999, 343–344).

Causation running from performance to an organizational characteristic tends to be ubiquitous, because performance-driven change is common. Simon (1957) argues that managers are boundedly rational, so that much decision-making is problem-solving, which is initiated only when performance falls below the satisficing level. Since this is a general theory of managerial decision-making (L. Donaldson 1999), it follows that it applies to all organizational characteristics. Hence, any organizational characteristic is liable to be left at its existing level by managers until performance drops to an unacceptable level. Studies show that organizational structures remain in place, even if suboptimal, until organizational performance becomes low (Chandler 1962; L. Donaldson 1987; Ezzamel and Hilton 1980; Hill and Pickering 1986). Similarly, studies show that organizational strategy tends to remain constant, despite dysfunctions, until performance drops to crisis levels (Cibin and Grant 1996; G. Donaldson 1994). Thus, social science research on organizations has shown that many adaptive changes by organizations are driven by crises of poor performance. Organizations that have retained a suboptimal characteristic, such as a strategy or structure, and need to adopt a new and more optimal one often fail to do so until their performance has declined and become unacceptable.

Thus, there is causation from the characteristic to organizational performance, but also feedback from organizational performance to the characteristic—that is to say, reverse causation. Typically, the effect of the characteristic on organizational performance is positive. In contrast, the effect of organizational performance on the characteristic is negative, because low organizational performance leads to the adoption of higher levels of the characteristic. For example, L. Donaldson (1987) found that fit of structure to strategy positively affects performance, but also that performance negatively affects fit of structure to strategy. Hence, there is simultaneously both a positive effect of the characteristic on organizational performance and a negative effect of organizational performance on the characteristic. Being opposite in sign, these positive and negative effects tend to cancel each other out, leading to a low or nil observed relationship between the characteristic and organizational performance. Depending upon the relative strengths of the two effects, the observed relationship between the characteristic and organizational performance will be negative, zero, or below the true positive value. Any of these will misstate the true effect of the characteristic on organizational performance. The negative result is incorrect because the true relationship is positive. The zero result is incorrect because the true relationship is nonzero. The positive result understates the true relationship (because of the offsetting negative effect of performance on the characteristic).


The fundamental methodological proposition is:

9.8 Where it exists, reciprocal causation, i.e., X causes Y and Y causes X, confounds causal interpretation.

The derived propositions for organizational theory are:

9.9 Where change in an organizational characteristic is induced by crises of low performance, a negative effect of prior performance on the characteristic confounds any true effect of the characteristic on performance.

9.10 Reciprocal causation by prior performance confounds causal interpretation of the causes of performance.

9.11 The positive correlation between an organizational characteristic and organizational performance can be masked to a degree by a negative effect of organizational performance on that organizational characteristic.

The reverse causality of organizational performance on organizational characteristics is a systematic type of confound that is likely to recur. Therefore, the effects of organizational characteristics on organizational performance are liable to be confounded more regularly than other relationships between variables.

9.12 When causes of organizational performance are analyzed, they will be subject to confounding more regularly than other relationships.

9.13 When positive causes of organizational performance are analyzed, they will be subject to masking more regularly than other relationships.

These false findings can lead managers to make erroneous decisions to the detriment of their organizations. A perceived negative relationship might well lead a manager to conclude that an organizational characteristic harms organizational performance and so to refuse to increase its level, or even to reduce it, which damages the organization. A perceived nil relationship might lead a manager to conclude that the organizational characteristic has no value for the organization and so to put no resources into increasing its level, or to make no effort to prevent its being reduced, which, again, damages the organization. Even a perceived positive relationship that understates the true magnitude tends to lead a manager to conclude that the organizational characteristic has only limited value for the organization and so to make little effort to increase its level, or only a weak effort to prevent its being reduced, which, once again, damages the organization.

Such a problem of false inferences by managers might occur in an organization in the following way.
Suppose a corporation has introduced accountability management: setting key performance indicators for managers, defining their responsibilities clearly, and training them in how to exercise their delegated authority effectively. There is some dispute in the corporation as to whether this accountability program is really raising managerial effectiveness and whether its benefits outweigh its costs. A manager in the central human resource management department at the corporate head office tries to answer these questions. Of the corporation’s twenty divisions, some have implemented the accountability program, to varying degrees, and some have not. The manager rates the degree of implementation of the accountability program in each division on a four-point scale: none, weak, medium, and strong. He then correlates the rating of each division with its profitability. The profitability of each division is taken from the most recently available figures, which date from last year. The correlation turns out to be mildly negative, surprising the manager and indicating to him that the program is not working and is actually counterproductive. He writes a report recommending that, to reduce further harm to the corporation, the program should be discontinued.

However, the true effect of the accountability program on divisional performance is positive. Implementing the program changes managers’ behavior and makes them more effective. This improvement occurs in the eighteen months after they complete the training. Thus, to capture the positive effect of the program, divisional profitability would need to be measured about two years after the program. Instead, divisional profitability has been measured for the nine months prior to the analysis. Moreover, divisions that performed poorly last year were more likely to adopt the accountability program than those that performed well. The managers in charge of poorly performing divisions had a sense of urgency that they must do something “to lift their game” and so adopted a program that seemed to offer some hope of improvement. In contrast, the managers of well-performing divisions saw no need for urgent action and maintained a skeptical attitude that the program was just the latest fad. Therefore, there was a negative association between adopting the program and prior profitability. A couple of divisions adopted the program early, some twenty months before the analysis, and their profitability reflected the boost from the program; but, having been the poorest divisions, their performance was still below that of the nonadopting divisions. However, most divisions adopted the program in the last nine months, and they still had low profitability. Hence, overall, the positive effect of the program on profit was swamped by the negative effect of profitability on adopting the program.

In this scenario, measuring performance (profit) before the program has had time to work increased the weight of the negative effect of performance on the program relative to the positive effect of the program on performance. Yet this result is likely to occur with some frequency, because managers will tend to want to measure the program (or other organizational characteristic) for the latest period—to be “relevant” and “up to date”—while also taking only the most recent available performance figures, which are usually somewhat old.
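The masking in this scenario is easy to reproduce. The simulation below is an illustrative sketch (the parameters are invented, and Python is used for convenience): divisions with low prior profit are more likely to adopt the program, the program’s positive effect arrives with a lag, and adoption is correlated with recent, largely pre-benefit, profit figures.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 2_000  # division-periods; large, so the result is not a sampling fluke

    prior_profit = rng.normal(0.0, 1.0, size=n)

    # Poorly performing divisions are more likely to adopt the program.
    adopt = (prior_profit + rng.normal(0.0, 1.0, size=n) < 0).astype(float)

    # The program truly raises profit (+0.4), but only a quarter of the lagged
    # benefit has arrived by the time the latest profit figures are taken.
    # Profit also persists strongly from period to period.
    current_profit = (0.8 * prior_profit
                      + 0.4 * 0.25 * adopt
                      + rng.normal(0.0, 0.5, size=n))

    r = np.corrcoef(adopt, current_profit)[0, 1]
    print(f"correlation of adoption with current profit: {r:+.2f}")

The printed correlation is clearly negative, even though the program’s true effect on profit is positive: the reverse causation from prior performance swamps the not-yet-realized benefit.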


Of course, the problem could be avoided by measuring performance some time after the program, say one or two years later. But waiting for years to see the effect of present programs is frustrating and seems almost irresponsible, because decisions are needed now about utilizing, or not utilizing, the program across the corporation. Moreover, two years from now the accountability program is likely to have been modified, so the present program is liable to be seen as obsolete and its effects only a matter of historical or academic interest.

We have emphasized the case where the negative effect of performance on an organizational characteristic offsets a positive effect of the organizational characteristic on performance. Managers are usually interested in knowing whether practices that are supposed to have a positive effect on performance actually do so, and hence the focus here on characteristics that positively affect performance is pertinent. However, if a characteristic negatively affects performance, reverse causality could again confound it. Any negative effect of performance on the characteristic would reinforce the negative effect of the characteristic on performance, confounding it. Thus, the observed relationship between the characteristic and performance would appear stronger (i.e., more negative) than the true effect of the characteristic on performance. The exaggerated negative result might lead managers to abandon the organizational characteristic even though its real, mildly negative effect on performance was acceptable to them, so that they would have continued with the practice had they known its true value.

For instance, suppose an organization has disastrously high staff resignations, which lead to a crisis of low profitability. In order to retain staff, the company opens a day-care center. The employees start going downstairs to check on their children during work hours, so the day-care center harms productivity per staff member, but only weakly. However, the productivity data are confounded by the large fall in productivity that occurred earlier, due to the resignations. Therefore, the productivity decline attributed to the day-care center is considerably inflated, leading management to erroneously discontinue it. Yet the day-care center had retained staff, who then left when it was discontinued, so that organizational productivity subsequently declined disastrously. Management had erroneously discontinued the day-care center, despite its positive effect on retention and thereby its positive indirect effect on productivity, because its negative direct effect on productivity was inflated by confounding.


Prior performance tends to confound any effect of an organizational characteristic on performance. Profit is a measure of performance, and therefore analyses in which profit is treated as an effect are prone to confounding by reciprocal causality. Research shows that the negative feedback effect of performance on organizational characteristics holds not only for profit but also for sales—for example, low sales induce a performance crisis that triggers organizational change (L. Donaldson 1987, Table 3, p. 18). Therefore, studies of the relationship between an organizational characteristic and sales can suffer from confounding due to reciprocal causation. Hence, performance, whether measured by profit or sales, is prone to confounding from reciprocal causality. Thus, reciprocal causality as a source of confounding applies more widely than definitional connection, which applies only to profit, profit being a difference score.

Putting both sources of confounding together, we can say that analyses involving performance are prone to confounding from reciprocal causation and, where performance is measured by profit, also from definitional connection. Thus, both sources of confounding identified in this chapter may be present simultaneously. Since profit is used in many organizations to measure performance, profit can be subject to both confounds simultaneously. The two confounds might reinforce each other, so that the confounding is severe, leading to a misleading observed correlation. Alternatively, the negative confound from reciprocal causation could be offset to some degree by a positive confound from definitional connections, leading to less error.

Confounds and Other Sources of Error

As seen earlier, the definitional connection between profit and sales may be reinforced or offset by a definitional connection between profit and costs. The confounding effect of either of these depends upon the correlation between sales (or costs) and the organizational characteristic that is being related to profit. As just seen, this net confounding can be reinforced or offset by the reverse causal effect of performance on the organizational characteristic. The net of these three confounds may or may not be strong enough to produce a severe confound of the true relationship. All of these confounds vary in the magnitude of the spurious relationships they create. There is also possible error due to sampling error, which varies in sign (positive or negative) and in magnitude. Thus, there is a distribution of possible net confounds on the focal relationship, depending upon how the individual confounds jointly occur.

The effect of confounding, such as from definitional connections or reciprocal causality, is to vary the observed correlation around the true correlation, altering the magnitude of the correlation and, possibly, its sign. Sampling error also alters the magnitude of the correlation and, possibly, its sign, relative to the true correlation. The errors from confounding can reinforce or offset the errors from sampling. Thus, a confound from a definitional connection could be reinforced by a confound from reciprocal causation, and these in turn could be reinforced by sampling error, rendering a false observed relationship more likely. Alternatively, if the confounds’ net effect were to depress a true positive correlation, a positive error from sampling could offset that net effect, bringing the observed correlation back closer to the true effect. While confounds and sampling error induce error around the true correlation, attenuation (reduction in correlation) from unreliability always acts to reduce the observed correlation below the true correlation. This depressing effect on the true correlation could be either reinforced or offset by confounds and sampling error.


Conclusions

When managers seek to identify any organizational characteristic that positively affects performance, the effect tends to be masked by a negative reverse causal effect. Moreover, if profit is used to measure performance, then other confounds can arise from definitional connections between profit and its constituent variables, sales and costs. In particular, if sales correlate negatively with the characteristic, or if costs correlate positively with the characteristic, they may create spurious correlations that mask the positive effect of the organizational characteristic on performance. This increases the chance that the observed relationship will be completely masked. Confounding is thus a serious problem that can lead to the wrong decision.

When managers seek causes of positive performance, reverse causation can often lead to underestimates that, if severe enough, could induce managers not to increase the level of the causal variable, because the benefits seem too meager or uncertain. When performance is measured by profit, definitionally induced confounds could also come into play, possibly leading to complete masking or even to a false negative effect.

Organizations that base performance assessment on costs will tend not to suffer the systematic confounding by difference scores and reverse causality considered herein, because neither applies. Therefore, their analyses will tend to reveal the true power of the causes of cost performance. Organizations that base performance assessment on sales will tend to understate the true power of the positive causes of their performance, owing to masking by reverse causality. Organizations that base performance assessment on profits may completely misstate the causes of performance, because masking by reverse causality may be reinforced by masking due to confounding by sales or costs or both. However, this depends upon the sign and magnitude of the definitionally induced confounds, both of which are variable. Overall, assessment of organizational subunit performance by cost centers will tend to lead to less confounding than assessment by profit centers. Therefore, top managers in functionally structured organizations are more likely to make accurate inferences from their internal comparisons of the performances of organizational subunits than are managers in multidivisional or holding company structures or in multinational corporations.

Having discussed in this chapter the confounding that arises in analyses of organizational performance and profit, in Chapters 10 and 11 we turn to the more familiar form of confounding: confounding by factors that are distinct from both the focal cause and the dependent variable.

10 Controlling for Confounding by Using Organizational Experiments

Social science methodology holds that inferences from data can be upset by confounds, so that what appears to be a relationship between two variables is really false. Similarly, organizational data containing a confound can mislead the manager. One approach to avoiding confounds is to run experiments in organizations. In this chapter, we will discuss the problem posed by confounds and critically evaluate organizational experiments as a way of overcoming them.

Confounding occurs when the true effect of a focal cause is obscured by the spurious effect of some other variable that is correlated with the true cause. Confounding is an error that leads the true value to appear inflated or masked. Confounding can make a true positive effect of a cause appear nil or even negative, thereby obscuring not just the strength of the relationship but even its existence, or whether its effect is positive or negative. Confounding can also make two variables that have no relationship appear to have one. Thus, a manager looking at data that contain a strong confound can be completely misled about causes and effects.

There are various strategies in social science for avoiding or coping with confounds, and their implications for organizational management will be discussed. In particular, a strategy for controlling confounding in social science is to conduct experiments that use control groups. We will discuss the implications of this approach for organizations wishing to make sound inferences about which management techniques lead to high performance.
An organization could control for confounds by making an experiment in one of its parts while using another part as a control group. The control group is not given the experimental treatment, but is otherwise the same. Such a control group study is most feasible in large organizations that are standardized and possess homogeneous subunits. The control group study produces unconfounded estimates of the true effect, eliminating the error due to confounding. As will be explained, however, it also leads to underestimates of the possible true effects, so that the estimate is biased downward, possibly considerably. Thus, the results of control group experiments in organizations can be a poor guide to future action for managers.

Confounding as a Source of Error

In the previous chapter, we discussed special cases of confounding due to difference scores or to reverse causation. However, confounding is present to some degree whenever a third variable is associated with both a cause and its effect. It is this more general type of confounding that is discussed in this chapter. A confound occurs whenever a cause of an effect is correlated with another cause of that same effect. Thus, the effect of a cause on organizational performance can be confounded by any other cause of organizational performance. This is worrying because there are potentially many causes of some organizational outcomes, such as organizational performance, and thus many ways in which confounding could occur. Moreover, the confounding does not have to be by another cause of the dependent variable; it can be by another variable that is merely correlated with both the cause and the dependent variable. Therefore, confounding can potentially occur quite frequently, especially confounding of dependent variables, such as organizational performance, that have many causes and correlates.

Confounding can make a true positive relationship appear to be nil or even negative. This result could lead a manager to make the completely wrong decision—for example, to fail to choose an option that benefits the organization. For this reason, confounding is a potentially severe kind of error. It can be more severe than some of the other sources of error considered in this book, such as unreliability, which reduces the magnitude of a correlation, but not to zero, and does not change its sign (e.g., from positive to negative). Accordingly, confounding deserves considerable discussion, which was begun in the previous chapter and continues in this chapter and the next.

The magnitude of confounding is variable. The confounding consists of a spurious relationship between the true cause and its effect, which masks or inflates the true relationship. The degree of confounding depends upon the strength of that spurious relationship relative to the true relationship. The strength of the spurious relationship depends upon the strength of the correlation between the true cause and the confounding cause (or correlate), and upon the strength of the correlation between the confounding cause (or correlate) and the effect. The spurious correlation is the product of these two correlations.
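This product rule is easy to apply. The sketch below is a simple numerical illustration (the numbers are invented, and the rule is exact only under the simplifying assumption of a linear path model):

    # Correlation between the true cause and the confounding variable.
    r_cause_confounder = 0.5

    # Correlation between the confounding variable and the effect.
    r_confounder_effect = 0.6

    # Spurious component transmitted to the observed cause-effect correlation.
    r_spurious = r_cause_confounder * r_confounder_effect
    print(f"spurious correlation: {r_spurious:+.2f}")  # +0.30

    # If either component correlation is near zero, so is the spurious effect.
    print(f"with one weak link:   {0.05 * r_confounder_effect:+.2f}")  # +0.03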


Thus, confounding is greater, the greater is the correlation between the confounding cause and the effect, relative to the correlation between the true cause and the effect. Confounding is also greater, the greater is the correlation between the true cause and the confounding cause (or correlate). The greater are both these components of confounding, the greater is the confounding. If either component is weak, then the spurious effect is weak and there is little or no confounding. Hence, there will be no confounding if there is zero correlation between the true cause and the putatively confounding cause (or correlate), or between the putatively confounding cause (or correlate) and the effect. Thus, confounds are potentially frequent, yet conditions need to be met for them to exist. However, with many causes (or correlates) of effects such as organizational performance, the concern is that these conditions may be met frequently, leading to the frequent presence of confounds that pose challenges to sound inference.

In order to deal with confounding, social science methodology has developed procedures that frequently involve establishing controls (Cook and Campbell 1979). As Ilgen (1986, 258) explains: “Without some control, it is often impossible to disentangle the effects of many different covarying and confounded variables on the behaviors of interest.” An experiment may be conducted in which one set of individuals is treated by increasing their level on a focal causal variable (e.g., their participation in decision-making) and its effect is measured. As far as possible, no other causal variable is allowed to alter during the experiment, and the individuals are isolated from other influences that may be changing. For any remaining extraneous influences, their effect is registered by a control group. The control group effect is subtracted from the experimental group effect to ascertain the true effect of the treatment. The fundamental methodological proposition is:

10.1 In an experiment, confounding by all factors other than the independent variable can be measured by using a control group.

The implications for organizational management of using experiments to avoid the pitfalls of confounding will now be discussed.

Experiments With Control Groups

The idea of strict control of extraneous causal influences through isolation of individual subjects leads in social science to the laboratory experiment. Laboratory experiments are often conducted in universities and may have students as their subjects. Thus, because of potential confounding, causal inference from data is viewed as so difficult that research needs to be conducted away from organizations and their managers. We need to probe this idea as part of a discussion of confounding and as part of a critical analysis of the role of experiments. The discussion leads toward the view that managers may make causal inferences free from confounds by looking at data within their own organizations, rather than by relying on experiments in universities.


The fact that experiments are often performed in universities with students as subjects raises the issue of how far such results hold in other organizations and for other people, such as employees or managers. This is an issue of external validity: whether the results generalize from laboratory to field. It is often believed that these experiments are necessary because of their superior internal validity—that is, valid attribution of causality to true causes rather than to confounds. The idea is that, to eliminate confounding, it is necessary to conduct controlled experiments in which the subjects are isolated from any causal force other than the one of interest in the study. Thus, researchers conduct laboratory studies to validate the causal theory before undertaking field studies to show generality. Such a program prolongs the research process and delays the stage at which knowledge based on field studies may be offered to organizational managers.

Laboratory experiments are often criticized for their artificiality, leading to doubts about their external validity. The noted organizational psychologist Locke and his colleagues have replied to these criticisms by showing that, on the topic of goal-setting and its effect on performance, the results of laboratory studies are similar to the results of field studies: “The beneficial effect of goal setting on task performance is one of the most robust and replicable findings in the psychological literature. Ninety percent of the studies showed positive or partially positive effects. Furthermore, these effects are found just as reliably in field settings as in the laboratory” (Locke et al. 1981, 145; emphasis added). Similarly, Latham and Lee, reviewing studies of goal-setting, conclude: “In the laboratory, thirty-seven out of thirty-eight experiments showed that a specific, challenging goal had a positive effect on performance. This finding was supported by twenty-seven out of twenty-eight studies conducted in the field” (1986, 105). Hence, by the criterion of the proportion of studies that support goal-setting, laboratory (97 percent) and field (96 percent) studies have very similar results. Locke (1986) argues that this similarity shows that laboratory study results generalize to the field.

Locke edited a book in which numerous authors examined the issue of the generalizability of findings from laboratory to field on eleven topics in industrial-organizational psychology, organizational behavior, and human resource management. Locke found that “in case after case and topic after topic, basically the same results were obtained in the field as in the laboratory” (1986, 6).
In the epilogue, Locke summarizes the findings on the eleven topics, coding the five topics on which sizes of effects could be compared: on four topics (attribution theory, financial incentives, job scope-job satisfaction, and job satisfaction-job performance) there was either “virtually no difference” or a “small difference” in the results for laboratory and field, and on only one topic (objective feedback) were the laboratory and field results “somewhat dissimilar.” For three more topics (expectancy theory and job choice, performance appraisal, and participation), as well as goal-setting, size-of-effect comparisons were not available, but comparisons of the direction of effect could be made, and these were either “virtually identical” or “highly similar” for laboratory and field. Thus, Locke and his colleagues have provided an impressive array of evidence that results on many core topics of industrial-organizational psychology, organizational behavior, and human resource management generalize from the laboratory to the field.

This conclusion, however, undercuts the argument that laboratory studies are superior to field studies. In particular, if laboratory studies had the superior internal validity claimed for them (i.e., by controlling better for extraneous causes through isolating subjects), then laboratory studies would have different findings from field studies. Thus, the more similar the results of laboratory and field studies, the more similar their internal validity. (Alternatively, laboratory studies could have superior internal validity—yet inferior external validity—relative to field studies, with the two offsetting each other and leading to similar overall validity for laboratory and field studies.) The similarity holds at the aggregate level, comparing the results of numbers of laboratory studies with numbers of field studies.

Traditional methodology in industrial-organizational psychology has been wont to classify experimental studies with control groups as methodologically “good,” sound, or scientific, while studies that lack experimental methods and control groups are classified as methodologically “bad,” unsound, or unscientific. These a priori distinctions have then been used to exclude nonexperimental or non-control-group studies from reviews. However, meta-analysts have taken a more empirical approach, asking whether there is any difference in averages between the two sets of studies (Rodgers and Hunter 1994). In the absence of such a difference, the judgment is made that the nonexperimental, non-control-group studies are not inferior and should be included in reviews. Hence, from the meta-analytic perspective, similarity of findings between laboratory and field studies, such as for goal-setting, is a basis for viewing laboratory studies as not being methodologically superior to field studies.

Thus, there is no compelling reason to conduct laboratory studies, or to delay the commencement of field studies until after laboratory studies have been completed and their results known. Instead, researchers could make field studies without delay and thereby attain external validity while also attaining internal validity. To the contrary, Ilgen (1986, 261–264) argues that there are circumstances in which field studies cannot feasibly be conducted, so that laboratory studies are required. These arguments are cogent, for instance, when field studies would endanger the safety of subjects. However, the list of topics in Locke (1986) indicates that many core topics of interest in industrial-organizational psychology, organizational behavior, and human resource management have been, and therefore can be, addressed adequately in field studies. Thus, notwithstanding the caution by Ilgen (1986), field studies are often viable, so that laboratory studies are not required. Locke (private communication) makes the point that laboratory studies are easier and faster, which are valid reasons for conducting them, but not for requiring them.


Experiments in Organizations

The argument that field studies can be as valid as laboratory studies dispenses with the need for isolation through laboratory studies. However, the argument would remain that experiments with control groups are required for sound causal inference. Such studies could be conducted within organizations, making them organizational experiments rather than laboratory experiments. An experiment could be conducted on members of just one part of an organization, keeping all other causes as constant as possible in that part of the organization. Another, preferably identical, part of the organization could act as a control group, registering the effects due to other causes that changed during the experimental period. Subtracting the control group result from the experimental group result would give the true effect.

An Example of Organizational Experiments: MBO

As discussed, social science methodology prescribes the use of experiments with control groups as a technique for detecting confounding. In such experiments, the experimental group receives a treatment and subsequent changes in the dependent variable are recorded. However, some or all of this change could be due to the effect of another cause, or causes, of that same dependent variable, which has also changed at the same time as the treatment. The changes in the control group record the effect of any other cause. That is to say, the control group records the degree to which the experimental group’s results have been confounded by other causes that are associated with the treatment. By subtracting the control group’s results from the experimental group’s results, an estimate is obtained of the true effect of the treatment, controlling for the confounding.

The effect on the performance of an organizational subunit of having a management by objectives (MBO) program can be studied by conducting an experiment in one organizational subunit, the experimental group, whose employees are subject to an MBO program, and comparing it with a subunit in the same organization that has no MBO program but is otherwise similar and so provides a control group. The control group effect is then subtracted from the experimental group effect to give the true effect. Rodgers and Hunter (n.d.) reviewed studies of the effects of MBO on organizational performance and identified six studies that used control groups. Typically, each study is of one organization, comparing its subunits.


In a study of the effect of an MBO program on the productivity performance of an organizational subunit (Ivancevich 1977), the gain of the experimental group (as calculated by Rodgers and Hunter, n.d., Table 3, p. 42, rounding to whole numbers) was 7 percent, apparently showing that MBO had a positive effect. However, the control group showed that performance decreased by 6 percent during that same time (as calculated by Rodgers and Hunter, n.d., Table 2, p. 41). Hence, the true effect was a gain of 13 percent: the recorded gain of 7 percent plus the background decrease of 6 percent that it overcame. Thus, a positive effect of MBO on performance had been confounded by another cause (or causes) that was depressing performance, so that about half of the true effect was masked. Hence, this study illustrates how a true positive effect can be confounded by a negative effect, so that only part of the true effect appears. A manager viewing the result of the experimental group alone would underestimate the benefit of MBO for her organization. She might not implement it, judging it not worth the costs and trouble involved, so losing the performance that would have been gained for the organization had it adopted MBO.

In another study of the effect of an MBO program on the productivity performance of a subunit (Muczyk 1978), the gain of the experimental group was also 7 percent (as calculated by Rodgers and Hunter, n.d., Table 3, p. 43), apparently showing that MBO had a positive effect. However, the control group showed that performance increased by 10 percent during that same time (as calculated by Rodgers and Hunter, n.d., Table 2, p. 41)—that is, performance increased more without MBO than with it. Hence, the true MBO effect was a loss of 3 percent (= 7 – 10). Thus, a negative effect of MBO on performance had been confounded by another cause (or causes) that had a positive effect on performance. Hence, this study illustrates how a true negative effect can be confounded by a positive effect, so that it falsely appears to be positive. It also illustrates how the other cause(s) can completely confound the true cause, because the confounding effect is greater than the true effect. A manager viewing the result of the experimental group alone would wrongly conclude that MBO had benefit for his organization. He might implement it, losing performance for his organization.

Confounding of organizational data can thus lead to wrong inferences by management. These may not be merely errors of degree, but total errors, in the sense that relationships appear to be the opposite of what they truly are—for example, a negative relationship appears to be positive. The resultant inferences, and the managerial actions based thereupon, are completely wrong. By conducting an experiment in one part of the organization, while making no change in a similar part of the organization that serves as the control group, management may ascertain the true effect of the experimental change free from confounding.
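The control group logic reduces to a simple calculation, shown here with the figures from the two studies just described (a minimal sketch; the function is ours, written in Python for convenience):

    def true_effect(experimental_gain, control_gain):
        # Treatment effect net of background change, in percentage points.
        return experimental_gain - control_gain

    # Ivancevich (1977): experimental +7, control -6 -> true gain of 13.
    print(true_effect(7, -6))  # 13

    # Muczyk (1978): experimental +7, control +10 -> true loss of 3.
    print(true_effect(7, 10))  # -3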
Organizations Most Able to Conduct Organizational Experiments

In order to conduct a valid organizational experiment, certain conditions must be met. There must be measures of the causal variable and its putative effects. Such measures are most likely in an organization that has well-developed measurement systems that track costs, productivity, sales, absenteeism, grievances, or other relevant outcome variables over time.

Ideally, the control group in an experiment should be the same as the experimental group in all respects except the treatment being studied. Such a control group is more attainable if the organization is composed of parts that are the same as each other. This is more likely in an organization that has differentiated itself into multiple parts whose work is standardized. These conditions of having formalized reporting systems, standardization, and specialization are most likely to be found in organizations with large numbers of employees (Pugh et al. 1969).

Again, for the results to be uncontaminated, the operations and outcomes of the experimental unit need to be independent of the operations and outcomes of the unit being used as a control group. This condition is most likely to be found in organizational subunits that are specialized vis-à-vis each other by geography, because such units are independent of each other. It is less likely to be found in organizational subunits that are specialized vis-à-vis each other by function or by related products (or services), because such units are dependent on each other. These two conditions, of similarity and independence between experimental and control organizational subunits, are most likely to be available in large organizations.

In an organizational experiment, not only do the work and structure need to be the same between the experimental and control organizational subunits, but so do the people in them, because outcomes often result from the interaction of the situation and the person. Ideally, in an experiment, subjects are, on average, the same between the experimental and control groups, a goal that is often attained through random allocation of individuals to the groups. In an organization, different parts may contain different types of people. For example, geographically distinct plants may contain different types of people, reflecting their different locations (e.g., rural versus urban). However, some organizations practice widespread recruitment and then widespread assignment, so that persons at any one location are of mixed geographical origins. For example, military organizations often recruit from around the nation and centrally assign personnel across the nation. This process may help to homogenize personnel across the geographic locations of the organization. Furthermore, some organizations scrupulously select applicants using a wide battery of tests of health, ability, and personality. The resulting selectees will be more homogeneous than in an organization that is less selective and so admits more widely varying personnel. Again, military organizations have tended to exemplify this pattern; some large corporations also use selection batteries.

Thus, the larger an organization is, the more likely it is to have formalized, standardized, independent subunits that are homogenized in their personnel, and so to provide control groups for organizational experiments that control more completely for extraneous influences. The proposition is:

10.2 Control groups for organizational experiments control extraneous influences more completely, the larger is the organization.

The more detailed propositions are:

10.3 Control groups for organizational experiments control extraneous influences more completely, the more formalized is the reporting system of the organization.
10.4 Control groups for organizational experiments control extraneous influences more completely, the more standardized are the subunits of the organization.

10.5 Control groups for organizational experiments control extraneous influences more completely, the more independent are the subunits in their outcomes.

10.6 Control groups for organizational experiments control extraneous influences more completely, the more widely personnel are allocated across the subunits of the organization.

10.7 Control groups for organizational experiments control extraneous influences more completely, the more rigorous is the selection of organizational personnel.

By means of using control groups, research can identify the confounding of study results and control for it, producing a truer estimate of the effect of a cause, as was exemplified by the MBO studies described in the preceding section.

Bias in Organizational Experiments

Thus, use of control groups in an organizational experiment allows the confounding effect of other causes to be controlled. However, Rodgers and Hunter (1991) caution that this gain comes at a cost. The average effect of MBO on performance in studies using control groups was much less than in studies using no control groups. From their analysis, Rodgers and Hunter conclude that top management's commitment to MBO moderates its effect on performance. The studies with high top management commitment had average performance gains of 56.5 percent, whereas the studies with low top management commitment had average performance gains of only 6.1 percent (Rodgers and Hunter 1991, 329). In organizations where top management strongly endorses and participates in MBO programs, the effect of the MBO program is much stronger than where top management commitment is low and the MBO program is supported only by, say, a local HRM manager or a consultant (Rodgers and Hunter 1991). Yet studies using control groups are found only in the low top management commitment condition (Rodgers and Hunter, n.d., 39–41). When top management commitment is low, it is feasible to introduce an MBO program "as an experiment," "to see if it works or not in our organization," by trialing it in only one part of the organization, while the rest of the organization does not use MBO and so can provide a control group. In contrast, where MBO has top management commitment, MBO is implemented from the top down throughout the organization—so that there is no part that does not use MBO that can be the control group. In the low top management commitment condition, a skeptical, cautious attitude pervades the program, so that it is not fully implemented and so is unable to produce strong benefits.
In contrast, in the high top management commitment condition, the top managers support and model the new behaviors, such as by setting objectives for themselves and their subordinates, thus inducing MBO behaviors to cascade down the hierarchy and to become institutionalized. This leads, in turn, to realization of the full benefits of MBO in that organization. Therefore, Rodgers and Hunter (1991) caution against reliance upon control group studies and warn that they mislead, with the sounder estimates coming from studies without control groups. As Rodgers and Hunter argue: "When MBO is implemented with top management commitment, it is instituted from top management downward, throughout the organization. Typically control groups are obtained by looking at units of the organization where innovation is not introduced. That type of control group would not likely be available in a company which adopted MBO in all units." Therefore, "our results indicate that MBO effectiveness is contingent on high top management commitment. Because high commitment is absent in studies which employ control groups, additional well controlled studies would have obscured identification of a top management commitment moderator" (n.d., 21–22).

Thus, organizational experiments, running a trial in one part of an organization "as an experiment" while the rest of the organization functions conventionally, can be fallacious. The experiment operates as a self-fulfilling prophecy: it is based on a skeptical attitude about whether the new technique will work, and this tentative approach produces weakly positive results that confirm the initial skepticism. The mild benefits produced may be seen as not justifying the costs, so that the new technique is never adopted by the organization, becoming ultimately a failure in that organization. Thereby the organization does not avail itself of the benefits that the new technique would have provided if it had been implemented properly. Thus, organizational experiments, instead of being prudent, are unwittingly foolish. The use of control groups enables removal of confounds, but at the price of large downward bias.

By using a control group to control for confounding in each study, the organizational experiments eliminate that error. In the six MBO studies that used control groups (Ivancevich 1974, 1976, 1977; Muczyk 1976, 1978), the confounds led to errors both below and above the true value for each study, so that these errors vary in sign and magnitude: 11.2, 9.5, 2.4, 0.4, –2.0, and –5.9 percent, respectively (see Table 10.1). They are essentially random errors around the true value, or noise that obscures the true value. This error, due to confounding, ranges 8.5 percent below and above the average confounding effect of 2.6 percent (as calculated by Rodgers and Hunter, n.d., Table 2, p. 41). However, the control group studies, undertaken at organizations where top management commitment was low, understate the true effect, so that their estimates are biased downward. In Rodgers and Hunter (1991, 329), the difference is 50.4 percent in productivity gain from MBO between high and low top management commitment. Since the studies with control groups correspond to the low top management commitment condition, they understate the true effect by 50.4 percent. This downward bias, 50.4 percent, is more than five times greater than the range of error, 8.5 percent, introduced by confounds.
Table 10.1
Results of Impact of MBO on Productivity in Studies Using Control Groups
(Percentages of productivity effects)

Study                                  Control   Experimental   True     Absolute confound/
                                       group     group          effect   true as percent
Ivancevich 1974 Marketing Division      11.2      38.3           27.1     41
Muczyk 1978                              9.5       7.2           –2.3    413
Ivancevich 1976                          2.4       2.9            0.5    480
Muczyk 1976                              0.4       3.3            2.9     14
Ivancevich 1974 Production Division     –2.0       2.6            4.6     44
Ivancevich 1977                         –5.9       6.9           12.8     46
Mean                                     2.6      10.2            7.6    173
Mean of absolutes                        5.2      10.2            8.4

Hence, the use of control groups in these MBO studies introduces a downward bias that is much worse than the error from confounding that the control groups remove. Thus, organizational experiments that involve control groups to eliminate confounds are prone to the defect of underestimating the true benefits of management techniques and so could lead managers to excessive conservatism and failure to adopt beneficial new techniques. The propositions are:

10.8 Organizational experiments that use control groups to deal with confounds are liable to understate the benefits of new techniques.

10.9 Organizational experiments that use control groups to deal with confounds are liable to lead to failure to adopt beneficial new techniques.

As argued above, large organizations are more likely than small organizations to possess the features (standardization, formalization, independent subunits, central allocation of personnel, and rigorous selection) that make the use of control groups feasible. Therefore, large organizations are more likely to produce estimates that are freer from confoundings and the resulting random error. However, large organizations that use control group experiments are likely to suffer downward bias in their estimates of the effects of techniques. In contrast, small organizations are less likely to be able to conduct experiments to eliminate confounding and so are stuck with confounds and thus the random error that confounds produce. However, the inability of small organizations to use control groups saves them from the downward bias.
The argument that high top management commitment produces much higher productivity than lower top management commitment must be viewed with some caution, because the absence of control groups where top management commitment is high means that we cannot be sure how much of the productivity increase is really due to extraneous influences. Accordingly, the amount of downward bias from using organizational experiments with control groups may be overstated in the MBO review, which relies on extrapolating the magnitude of the control group effect from studies with control groups to those without (Rodgers and Hunter 1991). Nevertheless, the idea that the existence of control groups hinges upon less than enthusiastic support from top management is persuasive. This insight leads to the recognition that organizational experiments with control groups contain some downward bias. It is the existence of some nontrivial degree of bias that is drawn upon in this book, rather than bias of the large magnitude posited by Rodgers and Hunter (1991).

Conclusions

Confounding has the potential to create serious errors of inference from data. Organizational experiments that feature control groups are a way to avoid confounds. Such experiments are more feasible in large organizations than small, because large organizations are more likely to possess the necessary preconditions, such as standardization and comparable subunits. However, organizational experiments can lead to serious downward bias in estimates of the effectiveness of management techniques. Those large organizations that seek to avoid confounding error through organizational experiments may incur bias, leading to conservatism and failure to adopt effective techniques. In the next chapter, we will outline a way that confounds can be avoided in organizational data—through averaging.

11 Controlling for Confounding by Data Aggregation

As we saw in the previous chapter, confounding, in which a figure is afflicted by a spurious relationship, can be controlled by experiments. However, as we also saw, experiments in organizations are prone to a bias of markedly understating the true effect of a cause on performance. In this chapter, we will propose another way to eliminate confounding: aggregation of data. This reduces confounding without incurring the downward bias from using organizational experiments. It is a novel approach, so it will be argued for at some length.

Above, we have said that the information system in a large organization is a meta-machine, in that it improves the inferences that are possible from numerical data by aggregating the data as they flow up the hierarchy. This improvement came from reducing sampling error. Now we argue that this aggregation of data as they flow up the hierarchy also reduces confounding and so is a second source of error reduction. It is a second way in which the organization is a meta-machine. Because the argumentation in this chapter is quite complex, we will briefly summarize it at this point.

Summary

Averaging always reduces confounding. Averaging eliminates the variations from one data set to another in their confoundings and so eliminates the more extreme confounding. Thus, the average contains less confounding than some of the individual data sets. Furthermore, confounds can be eliminated by averaging. This is facilitated by their being small and varying in sign. Confounds that vary around zero can be completely eliminated by averaging. The confounding of the average can be zero or trivial.
The confound in the average will be smaller, the smaller are the confounds in the individual studies and the more of them that differ in sign. Small confounds are possible, despite the other causes having greater effect than the focal cause, when the other causes have low correlations with the focal cause and the positive confounds are offset by the negative confounds. Nevertheless, sometimes nontrivial confounds can be left in the average. Thus, some averages can be confounded. Hence, averaging does not eliminate all confounding, but it does eliminate some and reduces other confounding. Therefore, an organization that repeatedly uses averages will sometimes reduce confounding a lot and sometimes only a little, but overall it will suffer less from confounding than an organization that does not average. Thus, large organizations, by being able to average numerous data sets, will have less confounding than small organizations, which do not have multiple data sets and so are unable to average. The figures produced by averaging will provide valuable guidance to managers on numerous issues.

Controlling Confounds by Averaging

As has been seen in the last chapter, confounding can be controlled by measuring the confound in each study and subtracting it from the experimental effect in that study. However, confounding in a set of studies may also be controlled by taking the average of the experimental effects across studies. Thus, confounds can be controlled by just averaging the apparent (i.e., experimental) effects, rather than needing to identify and control for the confounds in each study. Similarly, because large organizations also aggregate various data sets and take averages, those average figures will be less subject to confounds, conferring an inference advantage on organizations able to average—that is, large organizations whose many subunits generate data that are then aggregated at higher levels in the organizational hierarchy. Therefore, the findings in social science research about averaging are important and will be discussed at some length here because of their implications for the reduction of confounds by averaging.

The point of departure for this alternative method for controlling confounds is again the work of Rodgers and Hunter (1991). In their review of management by objectives (MBO) studies that use control groups, they calculate the average (mean) effect of MBO on performance (productivity) in the control groups of those studies. This control group effect mean is small, only 2.6 percent (as calculated by Rodgers and Hunter, n.d., Table 2, p. 41). In three of the studies, the confounding (as shown by the control group effects) was 11.2, 9.5, and –5.9 percent. Averaging eliminates these extreme confounds, which are quite substantial, and replaces them by a much smaller confound, 2.6 percent. Thus, we see how averaging of confounds eliminates the larger confounds. What remains of confounding is the average confound, and this is small (2.6 percent) and closer to zero than half of the studies (11.2, 9.5, and –5.9 percent).
Thus, when averaged, the effect on performance of all the other causes of performance that change during the time that the MBO program is being studied is small. The average apparent (i.e., experimental) effect on performance of MBO in these studies is 10.2 percent. This figure is inflated by the 2.6 percent underlying increase in performance. Subtracting 2.6 percent from 10.2 percent gives the true effect of 7.6 percent, free from confounding. Clearly, the difference between the true effect, 7.6 percent, and the confounded effect, 10.2 percent, is small. If the confounded estimate of the MBO effect, 10.2 percent, were used by managers in place of the true, unconfounded effect, 7.6 percent, there would be little error: only a trivial overestimation (2.6 percent) of the effectiveness of MBO. It seems unlikely that this degree of error would lead managers to a wrong decision about whether or not to implement MBO across an organization. Managers would correctly see that the effect of MBO was positive, but would expect future implementation of MBO to produce slightly higher performance than it actually would. Thus, after averaging figures, the estimate of the true effect (7.6 percent) is only weakly confounded. At the aggregate level (i.e., combining across studies), not having used control groups would have made little difference, because there was so little confounding to be controlled for. The averaging process has controlled for almost all of the confounding.

The power of averaging to reduce confounding is strong because the mean control group effect is weak, only 2.6 percent, and close to zero. The purpose of a control group is to avoid confounding by measuring the effect of other causes. By subtracting the control group effect from the experimental group effect, researchers can obtain an estimate of the effect of the focal cause that is free from confounds and therefore valid. However, if the control group mean is close to zero, then subtracting it from the experimental group mean makes little difference to the estimate of the effect of the focal cause. Thus, if the averaging of control group effects across studies produces a mean control group effect that is zero, or near zero, then control groups are no longer needed. Hence, the rationale for having control groups disappears. Because control groups are considered an important defense against confounding, the fact that they can be dispensed with through averaging is testament to the power of averaging to eliminate confounding.

Averaging of control group effects made it possible to be rid of almost all the confounding in the MBO studies that used control groups. How did this work in these MBO studies? Table 10.1 in the previous chapter gave the results of the impact of MBO on productivity of the six studies that used control groups. The figures for the experimental and control groups are taken from Rodgers and Hunter (n.d., Tables 1 and 2, except for the Ivancevich [1974] experimental effects for Marketing and Production, which are calculated from comparing start and end date values in his Table 3, p. 569). The true effect is calculated by taking the experimental group result minus the control group result. The studies have been ranked by the magnitude of the control group effect. The control group effect is given first, because this is the centerpiece of our interest; the apparent effect of the MBO experiment is given second, while the true effect is the net of the first two figures and so is given last.
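To make the table's arithmetic concrete, the following is a minimal sketch (ours, not from Rodgers and Hunter) that recomputes the true effects and column means of Table 10.1 from the reported control- and experimental-group percentages; the study labels and variable names are our own.

```python
# Recomputing Table 10.1 from the reported control- and experimental-group
# productivity gains (percentages). Study labels are ours; the figures are
# those quoted in the text.

studies = [
    ("Ivancevich 1974 Marketing",  11.2, 38.3),
    ("Muczyk 1978",                 9.5,  7.2),
    ("Ivancevich 1976",             2.4,  2.9),
    ("Muczyk 1976",                 0.4,  3.3),
    ("Ivancevich 1974 Production", -2.0,  2.6),
    ("Ivancevich 1977",            -5.9,  6.9),
]

true_effects = []
for name, control, experimental in studies:
    true = experimental - control   # the confound (control effect) removed
    true_effects.append(true)
    print(f"{name:27s} true effect = {true:5.1f}")

n = len(studies)
print(f"mean control (confound): {sum(c for _, c, _ in studies) / n:4.1f}")  # 2.6
print(f"mean experimental:       {sum(e for _, _, e in studies) / n:4.1f}")  # 10.2
print(f"mean true effect:        {sum(true_effects) / n:4.1f}")              # 7.6
```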
In the six studies, the confounding (as shown by the control group effects) includes 11.2, 9.5, and –5.9 percent, so that it is not trivial. Nevertheless, the average confounding, the control group mean, is only 2.6 percent. The smaller the control group mean, the more confounding is eliminated. Smaller control group means can be produced either by small confounds in each study or by differences in sign, so that positive confounds are offset by negative confounds. In the MBO control group effects, both mechanisms operate. The absolute confounds (i.e., the control group effects disregarding sign) average only 5.2 percent. Thus, the smallness of the average confounds is due in part to the confounds each tending to be small. Also, the positive confounds are offset to a degree by the negative confounds, so that the average confound, regarding sign, is only 2.6 percent. Thus, although the confounding includes the effect of all the other causes of performance that change over the study period, on average, the confounding effect (2.6 percent) is smaller in magnitude than the true effect (7.6 percent). How can this be? Also, the confounding varies in sign from positive to negative. How can this occur? What underlying causal processes would produce this pattern of confounding?

Confounding by Multiple Causes

We wish to suggest that, when multiple causes confound the effect of a focal cause, the overall confounding can be weak. This can hold despite the total effect of the multiple causes being much greater than the effect of the focal cause. Suppose that a focal cause produces effects in a dependent variable, but only weakly, with most of the effect being due to another cause (see Figure 11.1). Then a high correlation between the two causes will greatly confound the weaker cause, obscuring its true value. Even a modest correlation will produce a substantial confounding of the weaker cause by the stronger. If the underlying correlation between these two causes varies randomly around zero, then there will occasionally be some confounding and infrequently severe confounding. Therefore, when a focal cause is accompanied by another cause that is much stronger, substantial confounding is possible.

Some organizational outcomes are affected by many causes, however. Organizational performance, for example, is affected by many causes, such as organizational design, leadership, motivation, cost of capital, marketing, and superior product. Multiple causes may be thought more likely to confound any one of them. Being one cause among many would tend to mean that any single cause of the dependent variable is weaker than the other causes combined. This is the source of a frequently expressed worry among researchers: that it is difficult to explain organizational performance because the effect of any one cause tends to be swamped by other causes in empirical research. However, this traditional view may be to a degree mistaken.

Suppose that the true effect of a focal cause is, again, only weak, but there are nine other causes, whose true effect is each also weak. (Figure 11.1 displays a diagrammatic representation of multiple other causes confounding a focal cause, though for simplicity only four other causes are shown.)

Figure 11.1 Effects on Performance Showing Strong and Weak Confounding

[Figure 11.1 placeholder: the upper panel, "Strong Confounding," shows a single other cause correlated with the focal cause, with both causes affecting performance. The lower panel, "Weak Confounding," shows four other causes whose correlations with the focal cause are of mixed sign (positive and negative), all affecting performance.]

Combined, these nine other causes affect organizational performance much more than the focal cause affects it. Therefore, if they are correlated with the focal cause and have the same sign, together they strongly confound the focal cause, as in the first scenario. However, these multiple other causes may not work together, so that the resulting confound is much weaker or nonexistent. Whether these causes actually confound the focal cause to an appreciable degree depends primarily upon their correlations with the focal cause. Their correlations with the focal cause need to be of the same sign, and then the more they are correlated with the focal cause, the more they together confound the focal cause.
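The offsetting logic can be illustrated with a small simulation. This is our sketch, not an analysis from the book: it assumes, purely for illustration, that each other cause has the same effect on performance and a correlation with the focal cause drawn from a distribution centered on zero, and it approximates each cause's confound as that correlation times its effect.

```python
# Simulation sketch: confounding from one other cause versus from nine,
# when correlations with the focal cause vary randomly around zero.
# The numbers (effect size, correlation spread) are assumptions.

import random

random.seed(1)

EFFECT = 0.3      # effect of each other cause on performance (assumed)
CORR_SD = 0.15    # spread of its correlation with the focal cause (assumed)

def confound(n_other: int) -> float:
    """Total confound: sum over other causes of correlation x effect."""
    return sum(random.gauss(0.0, CORR_SD) * EFFECT for _ in range(n_other))

TRIALS = 10_000
for n_other in (1, 9):
    mean_abs = sum(abs(confound(n_other)) for _ in range(TRIALS)) / TRIALS
    # Potential confound if every other cause were correlated with the
    # focal cause at r = 0.3 and with the same sign:
    potential = n_other * 0.3 * EFFECT
    print(f"{n_other} other cause(s): mean |confound| = {mean_abs:.3f}, "
          f"or {mean_abs / potential:.0%} of the aligned-causes potential")
```

With nine causes the absolute confound is larger than with one, but it grows far more slowly than the combined potential of the nine causes, so the confound becomes a much smaller share of what the other causes could in principle do, which is the point made in the text.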
Whatever the probability of a single other cause of performance correlating with the focal cause with a particular sign (i.e., positive or negative), the probability of all nine other causes being correlated with the same sign with the focal cause is that probability raised to the ninth power, which is much less than the probability of the single instance. In fact, it is a very small probability. For random variations around zero, it is quite improbable. Substantial confounding is less likely from multiple other causes than from a single other cause. It is unlikely that every other cause would have the same sign correlation, and a sufficient correlation, with the focal cause. The absence of high same-sign correlations between the other causes and the focal cause means that the confounding effect of the other causes tends to be weak. The confounding effect of any one cause on the focal cause will tend not to be reinforced by a similar confounding effect of another cause on the focal cause. Whereas the first other cause may be positively correlated with the focal cause (as in Figure 11.1), creating a positive confound, the second other cause may be negatively correlated with the focal cause, so that the second confound offsets the first, diminishing the confounding of the focal cause. Similarly, the third other cause is positively correlated with the focal cause, whereas the fourth other cause is negatively correlated with the focal cause. Hence, there are multiple positive confounds, but they are offset by multiple negative confounds. Thus, multiple causes that differ in their signs (between positive and negative) tend to weaken the confounding effect they have on a focal cause. This weakness holds even though collectively the multiple other causes affect organizational performance much more strongly than the focal cause.

If any cause of organizational performance has no relationship with the other causes, then the underlying correlation between that focal cause and the other causes taken together is zero. The correlations between the focal cause and each of the individual other causes will depart randomly from zero, being positive for some and negative for some, producing some confounding between the focal cause and some of the other causes. However, the proportion of positive correlations will tend to be equal to the proportion of negative correlations, so that they will tend to offset each other; and the correlations between the focal cause and each of the other causes will tend to be small. Therefore, the most probable correlation between the focal cause and the other causes taken together will be zero. When the correlation between the focal cause and the other causes taken together is nonzero, it will probably be small, rather than high. Hence, despite the existence of multiple other causes of performance, the correlation between a focal cause and performance will tend to be unconfounded. If there were just one other cause of performance that is much stronger than the focal cause, then, if there is a substantial correlation between the two causes, there will be substantial confounding of the effect of the focal cause on performance. However, with a large number of other causes, their multiplicity, combined with their correlations with the focal cause varying around zero, means that their confounds tend to be weak, and the positive and negative confounds tend to offset each other, leaving little or no confound.
In summary, the effect of multiple other causes of organizational performance can be confounds that vary at random around zero, varying in sign and magnitude but tending to be small. They create random errors around the true effect of a focal cause on organizational performance.

Returning to the review of studies of the effects of MBO programs on organizational performance by Rodgers and Hunter (1991), we see that this work exemplifies the above model of confounds. The model says that confounds are sometimes positive and sometimes negative, varying around an underlying zero. Given that there were six studies, the expectation would be an equal split of positives and negatives—that is, three positive and three negative. Empirically, of the six studies, four are positive and two are negative, which differs by only one study from the expectation, illustrating the way confounds can vary in sign around zero. The model says that the confounds will be small, and, as seen, they are: their average is only 5.2 percent of productivity (in absolute terms). This is small relative to the true MBO effect of 8.4 percent (also in absolute terms), despite the confounds being potentially caused by all the causes of organizational performance other than MBO. Again as seen, because the positive confounds are offset by the negative confounds, the average confound is only 2.6 percent. Thus, the confounds registered by the control group studies of MBO show empirically that it is possible to have confounds that are a pattern of errors around zero producing only a small average error.

Confounding in Multiple Causes of Organizational Performance

In the MBO studies discussed above, the control group registers the effect of all the other causes of productivity additional to MBO. In order to understand why all the other causes have in total such a weak confounding effect, it is necessary to look at them more closely and see their interaction with the focal cause. This is not possible in the MBO studies, because, rather than identify the other causes, the control group just measures the net effect on productivity of changes in them. What is needed, therefore, is an empirical study that identifies the other causes and their confounding of a focal cause. Usually in organizational science research, "other causes of performance" apart from the focal cause remains an abstract and elusive concept, because only a few of the other causes are identified and measured. However, Schlevogt (2002) studied the effect of a large number of different causes of organizational performance ("effectiveness") in 124 Chinese firms. Organizational performance was a composite index derived from profitability, revenue growth, financial liquidity, morale, and public image. Its causes were based upon a wide range of theories in the literature about the causes of organizational performance. There were twelve causes: proactive strategy, price-cutting strategy, governmental
support, participative leadership, firm size, planning decisions, firm age, structural fit, subcontracting, environmental constraints, management expertise, and family-based private ownership. These causes together explained 60 percent of the variance in performance (Schlevogt 2002, 140), so that, in total, these causes explained most of the variance in performance. Thus, we can examine the effect of a cause knowing the simultaneous effect of most of the other causes of performance. Therefore, we can see how a cause is confounded by the other causes of performance that collectively make up most of the causation of performance. We can examine each cause separately and its role in confounding the other causes, adding to our analysis of the phenomenon of the averaging of confounds. For each causal variable, Schlevogt and Donaldson (2000, Table 1, p. 11) present its zero-order correlation with organizational performance, which is the apparent effect of that cause on performance. They also present results of a multiple regression analysis in which each cause has its effect on organizational performance, estimated while controlling for every other cause, which is to say, controlling for the confounding effect of all other causes (Schlevogt and Donaldson 2000, Table 1, p. 11). Thus, the latter is an estimation of its true effect (ignoring issues of reverse causality). This regression (partial slope) coefficient is the true effect without the confounding. The zero-order correlation coefficient is the true effect with the confounding. Therefore, the difference between the two coefficients gives an estimate of the confounding of each focal cause by all the other causes. For each cause, subtracting the regression coefficient from the zero-order correlation gives an estimate of the confound. (The regression coefficient is a standardized slope coefficient and is therefore in the same units as the correlation coefficient; the numerical value of a standardized slope coefficient is the same as that of the correlation coefficient of the same pair of variables.) Thus, for each cause, it is possible to identify the apparent effect (the zero-order correlation coefficient), the true effect (the regression coefficient), and the confound (the zero-order correlation minus the regression coefficient). To simplify the discussion, it is convenient to think of all the causes as having true effects that are positive. There are three causes that have negative true effects: price-cutting strategy, environmental constraints, and management expertise. Turning the signs of these three true effects from negative to positive entails renaming the causes as lack of price-cutting strategy, lack of environmental constraints, and lack of management expertise. Variations in sign remain only in the confounds. The issue is whether these confounds conform to the model presented above. Table 11.1 gives the twelve causes of organizational performance. For each cause, the table gives, in the second column, the zero-order correlation between the cause and performance. This is the apparent effect of the cause on performance that would be seen by a manager looking at these results. This apparent effect is contaminated by the confounding effects of the other eleven causes of performance. In the third column, Table 11.1 gives the standardized slope (beta) partial slope
coefficient, which is the effect of the cause on performance after controlling for all the other eleven causes of performance. This is the true effect of the cause on performance, uncontaminated by the confounding effects of the other eleven causes. The fourth column gives the difference between the second and third columns—that is, the zero-order correlation less the standardized slope (beta) coefficient. It is the apparent effect minus the true effect, the difference being due to confounding. This fourth column gives the magnitude of the confounding of the effect of the focal cause on performance by the effects of all the other eleven causes of performance. As such, it quantifies the confounding effect of many other causes of performance on the relationship between the focal cause and performance. More cautiously, this could be said to be a quantification of the amount of confounding of the correlation between a focal variable and performance that is due to many other variables that are also correlates of performance. Either way, the analysis captures the degree of confounding of the relationship between the focal variable and performance.

Table 11.1 (column four) ranks the confounding from most positive to most negative. The degree to which the firm was a family-owned private firm ("family-based private ownership") had the greatest confound, –.28: a negative confound that reduced a true positive effect of +.06 to an apparent negative effect of –.22. In this case, confounding not only turned the true positive into an apparent negative but also inflated the magnitude by more than threefold. The least confounded was the effect of structural fit on performance, which had a confound of only –.01, so that its true effect of .12 was only slightly reduced, appearing as .11.

As in the MBO review, averaging reduces confounding; in fact, averaging virtually eliminates confounding. The average confounding effect by the multiple other causes of organizational performance is close to zero. In correlation terms, the average confound is only –.01 (see Table 11.1), which, clearly, is very small. Averaging has reduced confounds that individually range from +.15 to –.28 to only –.01. Thus, extreme confounds have been eliminated and replaced by an average confound that is close to zero. This remaining confound, the average confound, is so small that it is trivial. The average apparent effect of the causes is .13. If the apparent effect (.13) were used by a manager instead of the true effect (.14), there would be only a little error (–.01). Hence, we see again the phenomenon that averaging the apparent effects produces a figure that is virtually unconfounded. This unconfounded figure is produced simply by averaging the apparent effects, without having to control for the confounds. It is not necessary to go through and control each cause separately for its confounding, because taking the average of the apparent effects controls the confounding among the causes. Again the reason is that, at the average level, the confounding is close to zero. The reason why the average confound is close to zero (here –.01) is that the magnitudes of the confounds are (on average) small and they contain mixtures of signs. The pattern of the confounding of individual causes of performance is what would be expected on the model that confounds are random variations around zero.

Table 11.1
Elimination of Confounding of Effects on Performance by Averaging (N = 124)

                                       Correlation with performance
Cause                                  Apparent    True    Confound   Absolute   Absolute confound/
                                                                      confound   true as percent
Proactive strategy                       0.28      0.13      0.15       0.15     115
Lack of price-cutting strategy           0.10      0.01      0.09       0.09     900
Government support                       0.23      0.14      0.09       0.09      64
Participative leadership                 0.20      0.11      0.09       0.09      82
Firm size                                0.25      0.20      0.05       0.05      25
Planning decisions                       0.34      0.29      0.05       0.05      17
Firm age                                 0.24      0.20      0.04       0.04      20
Structural fit                           0.11      0.12     –0.01       0.01       8
Subcontracting                           0.11      0.17     –0.06       0.06      35
Lack of environmental constraints        0.07      0.19     –0.12       0.12      63
Lack of management expertise            –0.11      0.11     –0.22       0.22     200
Family-based private ownership          –0.22      0.06     –0.28       0.28     467
Means                                    0.13      0.14     –0.01       0.10     166
Sums                                               1.73                 1.25

Source: Based on Schlevogt and Donaldson (2000), Table 1, p. 11.
One-third of the confounds are at about zero, one-third are considerably above zero, and one-third are considerably below (see Table 11.1). In detail, of the twelve causes, four have only weak confounds, of magnitude .05 or less: structural fit (–.01), firm age (.04), planning decisions (.05), and firm size (.05). Another four causes are confounded by larger, positive confounds of .09 or more: participative leadership (.09), governmental support (.09), lack of price-cutting strategy (.09), and proactive strategy (.15). The remaining four are confounded by negative confounds of magnitude .06 or more: subcontracting (–.06), lack of environmental constraints (–.12), lack of management expertise (–.22), and family-based private ownership (–.28).

The model says that, because each confound is randomly distributed around zero correlation, the confounds will be about equally positive and negative in sign. With twelve confounds, the random expectation is six positive confounds and six negative confounds. Empirically, of the twelve causes, seven have positive confounds (proactive strategy, lack of price-cutting strategy, government support, participative leadership, firm size, planning decisions, and firm age) and five have negative confounds (structural fit, subcontracting, lack of environmental constraints, lack of management expertise, and family-based private ownership). This is very close to the prediction, departing by only a single confound. The average confound is the difference between the sums of the positive and negative confounds, which is then averaged. If the confounds tend to be about equally positive and negative, then the difference is not large and so the average is small. The sum of the positive confounds is .56 and the sum of the negative confounds is –.69, so the difference is –.13, yielding an average across the twelve causes of only –.01.

The model holds that, apart from sign differences, the near-zero average confound arises from each confound being small. The average of the absolute confounds is .10 (see fifth column of Table 11.1), which is small. This is also smaller than the average (absolute) true effect of .14. Thus, on average, the size of a confound is only 71 percent (= .10/.14) of the true effect of a focal cause. The potential effect of all the other causes is much greater than a focal cause, and yet the actual confounding is small. By definition, the average effect of twelve causes is only one-twelfth the magnitude of all the causes. Therefore, if the average cause is considered to be the focal cause, the other eleven causes have together eleven times the effect of the focal cause. Thus, on average, the other causes are together eleven times stronger than the focal cause. Nevertheless, as just seen, the (average) confounding (.10) of the (average) focal cause (.14) is less (71 percent) than the focal cause. Thus, the average confounding of a focal cause is only 6 percent (71 percent × 1/12) of the potential from all the other causes. Even the largest confound, –0.28, of family-based private ownership, is still only two-elevenths (18 percent) of the potential confounding effect of all the other causes.
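These averages can be verified directly from the (apparent, true) pairs of Table 11.1. A minimal sketch, ours; the figures are those reported in the table, listed in the same order as its rows.

```python
# Recomputing the averaging arithmetic of Table 11.1 from the twelve
# (apparent, true) pairs; figures as reported in the table.

pairs = [  # (apparent, true)
    (0.28, 0.13), (0.10, 0.01), (0.23, 0.14), (0.20, 0.11),
    (0.25, 0.20), (0.34, 0.29), (0.24, 0.20), (0.11, 0.12),
    (0.11, 0.17), (0.07, 0.19), (-0.11, 0.11), (-0.22, 0.06),
]

k = len(pairs)
confounds = [a - t for a, t in pairs]           # confound = apparent - true
print(f"mean apparent:   {sum(a for a, _ in pairs) / k: .2f}")        #  0.13
print(f"mean true:       {sum(t for _, t in pairs) / k: .2f}")        #  0.14
print(f"mean confound:   {sum(confounds) / k: .2f}")                  # -0.01
print(f"mean |confound|: {sum(abs(c) for c in confounds) / k: .2f}")  #  0.10
```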
Hence, the actual confounding is much smaller than that potentially provided by the existence of so many other causes of performance, which could collectively dwarf any one focal cause. This is because of the lack of correlation between a focal cause of performance and the other causes. The correlations among causes tend to be low, producing the modest confounding of the focal cause(s).

The correlation among the causes is the key to the confounding among them. For any focal cause of performance, its confound by another cause of performance is the arc composed of two correlations: the correlation between the focal cause and the other cause, and the correlation of that other cause with performance. The product of these two correlations is the confound. For each focal cause, the arc linking it to performance via another cause was calculated. For example, for planning decisions, its link to performance via proactive strategy equals the correlation between planning and proactive strategy, .39, times the correlation between proactive strategy and performance, .28, which gives a product of .11. Similarly, the arc linking planning decisions to performance via lack of price-cutting strategy was calculated to be .01. The arcs for each of the links between planning decisions and performance via the other nine causes were also calculated. For planning decisions, the sum of the arcs linking it to performance via all eleven other causes was .22. For each other focal cause, the sum of the arcs connecting it to performance via all the other causes was calculated. Across the twelve causes, the correlation between the sum of the arcs for a focal cause and the confounding of that focal cause was .78. Thus, the arcs explain to a high degree the confounding of the causes. The correlation between the confounding of each focal cause and its total correlations with the other causes was .86, indicating that much of the explanatory power of the sum of arcs is due to the correlation of a cause with the other causes. A cause is more confounded by other causes if it is more correlated with them, as would be expected.

The reason why the average confounding is close to zero is that the average of the sums of arcs is also close to zero, .06. Hence, the phenomenon of confounds averaging toward zero is also present in the sums of arcs, which explains the phenomenon for the confounds. The average sum of arcs is so small partly because there are sign differences across the causes, with eight positive sums of arcs being partially offset by the four negative sums of arcs. Also, each sum of arcs is itself quite small, as indicated by the average of the absolute (to remove sign differences) sums of arcs being only .27.

The model postulates that the underlying correlation is zero among the multiple causes of the same outcome (here performance). For these twelve causes of organizational performance, the mean correlation is –.01, which is virtually zero, conforming to the theory. The model also states that the correlations among the causes tend to be centered around zero, with equal positive and negative correlations and with low correlations more frequent than high correlations. The twelve causes have sixty-six
correlations among them, of which half (33) are positive and half (33) are negative. Thus, consistent with the theoretical expectation, positive and negative correlations are equally frequent. Twenty-seven correlations are between zero and +.2, and twenty-six correlations are between zero and –.2. Hence, fifty-three (80 percent) of the correlations are of magnitude .2 or less (i.e., small). Thus, low correlations are more frequent than high correlations, consistent with the theory. There are three positive correlations ranging between +.21 and +.4 and three between +.41 and +.59. Similarly, there are four negative correlations ranging between –.21 and –.4 and three between –.41 and –.63. Thus, the correlations are symmetrically distributed around zero, again consistent with the theory. Overall, these data indicate the clustering around zero that the model predicts. Thus, while all the causes have a positive effect on performance, they vary evenly between positive and negative in their correlations with each other, and mostly they have low intercorrelations. This provides the reason why the confounding effect of each cause on any other cause tends to be weak: small correlations that are also offset because of their opposite signs.

These data allow us to appreciate the role played by multiple causes of performance in confounding any one focal cause. They display the phenomenon of the average confound being practically zero, and its origins lying in the fact that the correlations among the causes are distributed around zero, as the model says. Because the average confound tends toward zero, the average true effect is unconfounded and so the potentially strong source of error from confounding is eliminated. The random error around the true value that confounds can produce is removed for averages. This error reduction by averaging occurs without creating any downward bias, such as may be produced by using control groups to eliminate confounding (as discussed earlier). No bias is produced through averaging, because the correlations among causes center around zero.

The two illustrations show that the average of the apparent effects is the true effect. In the MBO review, the average of the apparent (experimental group) effects has little confound and provides a good estimate of the true effect. In the Chinese firms' data, the average apparent effect is virtually the true average effect. Both studies show that averaging across figures eliminates much confounding. Thus, these studies show that averaging can provide a way to reduce confounding. The Chinese study also shows that elimination of confounding by averaging occurs because of multiple causes whose correlations vary around zero. The propositions are:

11.1 Averaging reduces confounding, so that the average figure contains less error from confounding than the figures that are averaged.

11.2 Averaging eliminates confounding error most fully when the underlying correlation between causes is zero.

11.3 Average apparent effects can be valid estimates of average true effects.
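The arc calculation described above can be written down compactly. This sketch is ours; only the single pair of correlations quoted in the text (.39 and .28) is reproduced here, so the full sums of arcs cannot be recomputed from this chapter alone.

```python
# The "arc" through which another cause confounds a focal cause: the
# product of the (focal cause x other cause) correlation and the
# (other cause x performance) correlation.

def arc(r_focal_other: float, r_other_performance: float) -> float:
    return r_focal_other * r_other_performance

# Planning decisions -> proactive strategy -> performance:
print(round(arc(0.39, 0.28), 2))   # 0.11, as in the text

# With the full correlation matrix, the sum of arcs for a focal cause
# would be: sum(arc(r_focal[j], r_perf[j]) for each other cause j).
```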
Confounds Readily Eliminated

It is noteworthy that confounding is readily eliminated through aggregation. Averaging fairly few data sets reduces the confounding to a trivial level and, therefore, creates almost no error. The reduction is a function of the number of data sets, K, that are aggregated and then averaged. By averaging, we are able to go from individual data sets, some of which may have highly confounded apparent effects, to an average apparent effect from which much or almost all confounding is eliminated.

For the MBO studies (see Table 10.1 in the previous chapter), the confounding (expressed as the percentage of the confound relative to the true correlation) in individual studies was on average 173 percent. In other words, the average confounding was more than one and one-half times the size of the true correlation. Thus, typically, the observed correlation was more error than true value; there was more noise than signal. Nevertheless, as we have seen, simply averaging across the six studies produces an average confound that is only trivial (2.6) and is only 34 percent of the true effect of MBO on performance (7.6 percent). Likewise, for the Chinese firm study (see Table 11.1), the confounding (again expressed as the percentage of the confound relative to the true correlation) in individual causes of organizational performance was on average 166 percent (in absolute terms). Again, the average confound was over one and one-half times the size of the true correlation. Nevertheless, as we have seen, simply averaging across the causes produces an average confound that is only trivial (–.01) and is only 7 percent of the average true effect of a cause on performance (.14).

Yet for MBO, there were six studies in the review by Rodgers and Hunter that used control groups, so K was only six. Even with a K as small as six, most of the error from confounding is eliminated. Similarly, for the Chinese firms study, there were twelve causes of organizational performance, so K was only twelve. Hence, with a K of only twelve, almost all of the error from confounding is eliminated. This indicates that confounding can be readily reduced as K increases: it was mostly eliminated when K was six and almost completely eliminated when K was twelve. In detail, for a K of six (for MBO), the remaining confound was only 34 percent, while for a K of 12 (for Chinese firms), the remaining confound was a very small 7 percent. Doubling K from 6 to 12 reduces the confounding to about one-fifth (= 7/34).

Of course, this treats both the MBO and Chinese studies as if they are directly comparable, but there is reason to believe that they might be. For both MBO and the Chinese firms, the average confounding of a single MBO study (173 percent) or cause of performance (166 percent) was similar, at about 170 percent of the true effects. Moreover, the reduction in confounding when six causes are averaged in the Chinese firms is about the same as for the six MBO studies. For the Chinese firms, if only six causes had been investigated, their confounding would have been 32 percent.
(This is the mean of the least reduction in confounding—average confounding relative to the average true effect—from averaging the six causes that have the highest positive confoundings, and the most reduction in confounding, from averaging six causes that have both positive and negative confoundings.) Thus, if there had been the same number (6) of causes in the Chinese study as there were MBO studies, the percentage of confounding of the average would have been about the same: 32 and 34 percent, respectively, an average of 33 percent. Thus, the average confounding for a K of one is about the same for the MBO studies and the causes of Chinese firm performance: about 170 percent. And the average confounding for a K of six is about the same for the MBO studies and the Chinese causes: 33 percent. The effect of K on confounding is similar across the two quite different things being aggregated: studies of MBO and causes of Chinese firm performance. In increasing K from one to six, for the MBO studies, confounding decreases from 173 to 34 percent (i.e., to one-fifth), and, for the causes of Chinese firm performance, confounding decreases from 166 to 32 percent (i.e., also to one-fifth). For both the MBO and Chinese studies, increasing K from one to six produces a reduction of confounding to one-fifth of its value when K is one. Thus, the increase in K from one to six had a similar effect in reducing confounding to that just seen above from increasing K from six to twelve. Going from a K of one to a K of six reduces confounding to one-fifth of its original value, while increasing K further, to twelve, again cuts the remaining confounding to one-fifth. Hence, approximately equal increments of K (i.e., from one to six and from six to twelve) produce equal fractional decreases in confounding, namely, four-fifths. Therefore, the effect of K on confounding is a regularity of a nonlinear kind. As K increases, confounding decreases, but at a decreasing rate with respect to K. It appears to be a negative geometric curve, asymptoting to zero. As K becomes larger, presumably confounding comes ever closer to zero. Thus, averaging reduces confounding as a positive function of increasing K. Barely attaining double figures for K (12) was sufficient in the Chinese data to eliminate almost all confounding, despite that confounding having been strong (166 percent for any one cause, on average).

In the Chinese study, the number of firms was 124; hence N is 124. But this is not the parameter that determines the reduction in confounding from aggregation. It is the number of causes, K, here 12, whose effects are aggregated. Hence, it is K rather than N that determines the reduction in confounding due to aggregation. For an organization, the equivalent is for its managers to look at data that are aggregated, typically coming up the hierarchy from across organizational subunits. It could be data on the effect of a program on departmental costs from, say, the six departments of the organization.

Because K can be highly effective in reducing confounding even when K is only twelve, it is in that sense easier for K to reduce confounding than for N to reduce sampling error.
N reduces the sampling error of a mean in proportion to the inverse of its square root—that is, an N of 25 reduces sampling error by a factor of 5 only, not 25. As seen, an increase in K from one to six reduces confounding by four-fifths, whereas that increase in N reduces sampling error (of the mean) by only about three-fifths. Similarly, an increase in K from one to twelve reduces confounding by about 96 percent, whereas an increase in N from one to twelve reduces sampling error (of the mean) by only about 70 percent. However, whereas simply adding cases, such as the number of employees, increases N, it does not necessarily increase K. Where K is the number of organizational subunits, say departments, this tends to grow proportionately to the logarithm of the number of employees, N (Blau and Schoenherr 1971), so that progressively larger increases in the number of employees are required for each increment in K. Hence, linear increases in K may require progressively larger increases in the N that underlies them. Overall, smaller increments in K may be required to materially reduce confounding error than are required in N to materially reduce sampling error. However, those required levels of K may in turn require larger Ns, so that the levels of N are roughly the same for reductions in confounding and for reductions of similar amounts in sampling error. Since the discussion in this book points toward sampling error and confounding as two of the most serious sources of error for managers making inferences from numerical data, it is of interest to see that both errors may decrease at roughly similar rates as N increases. This again points up the inference advantage of the large organization.

The immediate lesson is that the reduction in confounding by averaging depends on K, rather than N, and can be substantial even for small values of K. The interesting point for organizational theorists is that confounding can be virtually eliminated by averaging for a K as small as 12. The propositions are:

11.4 Reduction in confounding by averaging is a positive function of K, the number of data sets averaged.

11.5 Reduction in confounding by averaging increases as K increases, at a diminishing rate with respect to K.

11.6 Averaging can almost completely eliminate confounding for a K as small as 12.

11.7 Increases in K reduce confounding error proportionately more than increases in N reduce sampling error.

The remarks in this section must be viewed as tentative, because they are based on only two studies. Many more studies are needed. Nonetheless, the present remarks are offered to indicate the kind of relationships that may be possible between confound elimination and K.
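As a rough way to see the comparison, the following sketch (our extrapolation, not the book's) encodes the tentative geometric pattern suggested above (each additional six data sets cutting confounding to about one-fifth) alongside the standard 1/sqrt(N) decay of the sampling error of a mean. The fitted exponent is our assumption, not an established law.

```python
# Comparing two decay patterns: confounding versus K (a tentative
# geometric fit to the two reviews) and sampling error of a mean
# versus N (the standard 1/sqrt(N) law).

import math

def confound_remaining(k: int) -> float:
    """Share of K = 1 confounding left; hypothetical fit, not a law."""
    return 0.2 ** ((k - 1) / 5.5)   # each ~+6 in K cuts it to ~1/5

def sampling_remaining(n: int) -> float:
    """Share of N = 1 sampling error of the mean left."""
    return 1.0 / math.sqrt(n)

for size in (1, 6, 12, 24):
    print(f"K = N = {size:2d}: confounding ~{confound_remaining(size):5.1%}, "
          f"sampling error {sampling_remaining(size):5.1%}")
```

Under these assumptions, confounding falls to roughly a quarter of its level by K = 6 and a twentieth by K = 12, while sampling error is still about two-fifths and three-tenths at the same sizes, matching the four-fifths versus three-fifths contrast in the text.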
Controlling Confounds Through Averaging in Organizational Management

Averaging of figures provides a way to eliminate confounds, as was seen in the social science research examples. These benefits of averaging potentially also occur in organizations whenever managers average the figures available in their organization. By taking an average of a set of figures, a manager can control for confounds in those figures. The averaging will reduce the confounding. The closer the underlying relationship of the other causes with the focal cause is to zero, the more averaging will eliminate them. The manager does not need to know whether there are confounds, or what their identity or magnitude is. Thus, she has no need to ascertain whether a variable is confounding the figures or what variable it is. She thus saves time and effort. Since the variable that is confounding a set of figures may be hidden from the manager's view, she would be unable to correct for it by controlling for it in each figure. Therefore, averaging provides a very simple and cheap way of controlling for confounds in organizational management. For some analyses in an organization, averaging may reduce confounding almost completely, while for other analyses in the same organization it may reduce confounding only somewhat. Overall, however, the confounding error in the organization will be considerably less than if none of the analyses had their confounds reduced by averaging.

Averaging requires that there be multiple figures that assess the effect of the same cause on the same outcome, so that the figures can then be averaged. In an organization, a ready source of such multiple figures is the multiple organizational subunits, each of which has its own figure. These figures can then be averaged. Typically, in an organization, such figures may be drawn up the hierarchy from lower levels and averages taken at higher organizational levels. Because averaging reduces confounds, confounding will be greater at lower hierarchical levels and less at higher levels. In a multilevel hierarchy, the lower-level figures may be averaged at each successive level to create first averages and then averages of averages, with a consequent reduction in confounding at each successive upward level. The implication is that figures for local organizational subunits at the lowest hierarchical level will be more confounded than figures at higher levels. Therefore, estimates of the effects of the causes of, say, the performance of a local subunit will tend to be confounded and thus subject to error that leads, quite frequently, to less than optimal decisions. In contrast, estimates of the effects of causes of outcomes that have been averaged at a higher level will tend to be less confounded and therefore subject to much less error, leading to more optimal decisions.

Large organizations possess multiple (K) figures about the same variable that can be averaged. In contrast, small organizations may possess only one figure about a variable and so cannot take an average. Therefore, the figures that managers in small organizations examine are more confounded. Hence, larger organizations


have an inference advantage through being able to average more than small organizations can.

In the MBO review (Rodgers and Hunter 1991) there were six studies, drawn from five separate organizations. Suppose instead that these were six divisions of the same organization, all of which had conducted an experiment in MBO, with the same results as in Table 10.1. Averaging these six figures above the divisions, higher up in the hierarchy, say at the corporate level, would produce a figure of 10.2 percent productivity gain that is mostly free from confounds, as we have seen. However, these studies all used control groups, and were able to do so because of low top management commitment, so that their estimated effect on performance is strongly biased downward (as discussed in the last chapter). The more desirable approach, which more fully reveals the power of MBO when there is top management commitment, is to adopt MBO throughout an organization and then average its benefits across organizational subunits. Because averaging removes confounding without the need to measure it, there is no need for control groups and the low top management commitment on which they are based, thereby avoiding the resulting bias toward obtaining only weak performance benefits. For instance, if an organization of six divisions adopted MBO in all of them simultaneously, measured its effect on performance, and then averaged these figures, the average effect would probably be close to the true effect—that is, almost unconfounded. This result would enable a valid assessment of whether the new technique was valuable and worthwhile relative to its costs; the decision could then be taken whether to continue and possibly enhance the program. Because the technique was adopted throughout the organization, this widespread adoption is compatible with high top management commitment, so that the treatment could be strong and coherent, revealing the full potential of the technique. There would be no control groups to measure the confounding in each division, but this would not matter, because the averaging across divisions would eliminate the confounding. The resulting average for the organization would be free both of the error from confounding and of the downward bias that results from running an experiment with other parts of the organization not participating.

The average for the organization would be the general effect of using MBO across the divisions. However, there could be considerable variation across the divisions in the true effect of MBO. In the review of MBO studies that used control groups, the studies varied in their true effects from 27.1 to –2.3 percent (Table 10.1). An average would not capture that variation, so it would not illuminate divisional differences or their reasons. Nevertheless, for other purposes, an average figure is valuable. The average could be used for the many purposes for which averages are legitimately employed. For example, it could provide the best estimate of the likely productivity benefit from MBO across all the divisions of the company. It could provide the best estimate of the benefit of using MBO in a division that does not yet


use it, the best estimate of the likely productivity for future time periods, and the best estimate of the benefit of installing MBO in a new division, such as a newly acquired company. While the average has limitations in being a general figure, it is of considerable value, enhanced by being mostly unconfounded.

In order to assess a new technique, an organization could consciously use the approach just sketched, out of an appreciation of the arguments about the nature of confounds. However, many organizations will already be using some such approach without necessarily appreciating the rationale provided by the argumentation. An organization might change its accounting system, employee benefits, or pay systems in an organization-wide manner and then measure the effects by figures that are, in reality, averages across the subunits of the organization. Thereby, the organization is implicitly controlling for confounds. It is also avoiding the downward bias in effect estimates that could have eventuated if, instead of adopting the change organization-wide, it had just made an experiment in one part of the organization while leaving another part unchanged as a control group.

Our central contention is that averaging of data in large organizations provides a safeguard against confounding. Thereby, the large organization, through functioning as a meta-machine, can help its managers avoid errors of inference that arise from confounding. There are similarities between confounding and sampling error; in particular, both errors are smaller in large organizations than in small organizations. However, confounding and sampling error are distinct sources of error. In the study of Chinese firms, there are twelve correlations that are significant at the one-in-a-hundred level of significance (two-tailed). For a sample of size 124 (organizations), out of the 66 correlations only about 1 would be expected to be significant due to sampling error. Therefore, there are eleven correlations that are larger than would be expected just from sampling error. The degree of variation around zero in these correlations shows that some process additional to sampling error is creating the variation. Thus, confounding is distinct from sampling error, and the large organization is able to benefit by reducing both sources of error.

In earlier chapters, we stressed that the law of small numbers pervasively affects organizational data, in that observations made on small samples are prone to more error than those made on large samples. Large organizations are seen as meta-machines that eliminate much of this sampling error through the aggregation of their data at higher levels of the organizational hierarchy. Now we are arguing that something similar happens for confounding. Confounding is more of a problem in small organizations than in large organizations. The aggregation of data sets that occurs in large organizations also reduces confounding. Thus, managers at the top of large organizations look at data that inherently contain less confounding than do managers in small organizations. Hence, managers in large organizations are twice blessed: they have less error due to sampling error and less error due to confounding.
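The way averaging strips out confounds that the manager never identifies can be illustrated with a small simulation. The sketch below is ours, not the book's: it assumes six divisions sharing one true effect, with each division's confound drawn independently around zero (the condition under which the chapter says averaging works best); the gain figure echoes the 10.2 percent above, but the spread is hypothetical.

```python
import random

random.seed(3)

TRUE_GAIN = 10.2     # assumed true productivity gain in every division (percent)
CONFOUND_SD = 8.0    # assumed spread of divisional confounds around zero
K = 6                # number of divisional figures averaged
TRIALS = 10_000

single_error = average_error = 0.0
for _ in range(TRIALS):
    figures = [TRUE_GAIN + random.gauss(0, CONFOUND_SD) for _ in range(K)]
    single_error += abs(figures[0] - TRUE_GAIN)         # one division's figure
    average_error += abs(sum(figures) / K - TRUE_GAIN)  # the corporate average

print(f"typical error in a single divisional figure: {single_error / TRIALS:.1f} points")
print(f"typical error in the average of {K} figures : {average_error / TRIALS:.1f} points")
```

The corporate average is typically far closer to the true effect than any single divisional figure, even though no confound was ever identified or measured.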


The propositions are:

11.8 Aggregation of data reduces confounding.

11.9 Large organizations tend to have less confounding error in their figures than small organizations.

Conclusions

Aggregation of data by taking an average reduces confounding. Where the different causes of an effect are fundamentally uncorrelated, so that they vary randomly around zero, averaging can completely eliminate confounding. Then the average apparent effect is the true average effect, so that averages can be used without any additional controls for confounding. When the different causes of an effect are correlated and of the same sign, the average retains some confounding, but still less than in the individual data sets. When an organization, by averaging, eliminates some confounding completely and reduces some only partially, it will still have reduced the confounding of its analyses considerably overall. Averaging reduces confounding more, the more data sets are averaged, but at a decreasing rate with respect to their number. Averaging as few as twelve data sets can eliminate almost all confounding. All of these averaging processes can occur in an organization through the normal aggregation of data as they flow upward in the hierarchy. The more that averages reduce confounding, the more that averages can be used without additional controls for confounding, and the more useful averages become as guides to managers for various purposes, including prediction.

Large organizations can use averaging more than small organizations, because large organizations have more data sets to average over (for example, more subunit figures). Thus, large organizations can avoid much of the error due to confounding by averaging. Small organizations can make less use of averaging and so are beset with errors in inference from confounding. Averaging across organizational subunits eliminates confounding as organizational size increases, but at a greatly decreasing rate with respect to size. Because averaging avoids the need for control groups, it also avoids the downward bias in estimated effects that comes from organizational experiments. Thus, averaging produces figures in an organization that are unconfounded and free of the bias from low top management commitment.

The advantage of large organizations in controlling for confounding adds to their advantage in controlling for sampling error. These combined advantages derive from the large organization's status as a meta-machine that reduces error in the inferences that can be made from data as they flow up the hierarchy and are aggregated. These two sources constitute inference advantages for the large organization, especially since they can reinforce each other. While


both advantages derive from large organizational size, they are distinct and deserve to be recognized as such.

In Part II, we considered four sources of error—sampling error, unreliability, range artifacts, and confounding—mainly individually. In Part III, we integrate the discussion by considering the four sources of error together. It might be supposed that the repeated operation of any one of these errors cancels out that error, or that the simultaneous operation of multiple errors cancels them all out. We turn to these questions in Chapter 12 and show that the errors are not self-correcting in these ways. In Chapter 13, we formalize the effects of the errors into equations and then combine these equations to give the overall effect of the errors. Chapter 14 advises managers how to minimize the errors when making inferences from data. Chapter 15 considers statistico-organizational theory overall, in terms of the situations that facilitate error.


Part III Integration


12 Errors Not Self-Correcting

In this chapter, we consider interactions: specifically, interactions across time and between error sources. What happens to each of these errors if analyses are repeated? What is the overall error produced by interactions between error sources? In particular, we need to address the possibility that an error occurring on a single occasion might conceivably be eliminated if the same analysis is repeated across time. Similarly, the errors arising from one error source might be canceled out by errors from other sources when all are working in combination. Thus, we inquire into whether the repetition of data analyses, or the operation of multiple errors, produces self-correction that eliminates errors. As we shall see, repeated data analyses tend not to lead to the elimination of errors, and the multiple errors tend not to cancel each other. Thus, the errors discussed in this book are mostly not self-correcting.

Repeated Operation of Errors

What happens to the errors discussed in this book if a manager looks at one data set, but then repeats the analysis on other data sets? Will the error in one data set be offset by opposite errors in the other data sets, so that the errors are corrected, leaving the manager with a quite accurate estimate? If so, then the errors discussed in this book would lead only to minor misperceptions and temporary errors in managers' decisions. However, repeated making of inferences from data by managers is unlikely to lead to self-correction of these errors, as will now be seen.

Of the four sources of error discussed in this book, some work the same way repeatedly, while others vary in their effect from occurrence to occurrence. As has been discussed, measurement error (unreliability) and range restriction always depress the true correlation. Therefore, every new data set in an organization understates the true strength of relationships. (This holds where there is range restriction; if there is range extension, it also holds, unless the range extension increases the correlation more than unreliability reduces it. In theory, the impact


of range extension alternating with range restriction should negate any effect, but this is unlikely in fact, because range extension occurs less frequently than range restriction.) Where the relationships are between cause and effect, the data recurrently understate the true strength of the causes being examined. Thus, data lead managers to too modest an assessment of how far the causes they are measuring are important drivers of the effect. This, in turn, may lead to underutilization of these causes and underinvestment in them. It also leads to a search for other causes, which may not actually exist. This forlorn search could be accompanied by experimenting with all sorts of ineffective ways to try to create the effect being sought, which might breed the perception that managers are simply following fads, reducing their credibility.

In contrast to the consistent depression of observed values below true values caused by unreliability and range restriction, sampling error creates more variable errors. Sampling error creates random variation around the true value: it produces random error above and below the true value of correlations and means, so that from data set to data set the derived values vary randomly above and below the true score. The variations are larger for data that contain fewer observations.

Sampling error, in both means and correlations, leads to misperceptions that causality is more changeable and complex than it really is, in ways that are not fully understood. Because of sampling error, an assessment of means will tend to show that they vary from data set to data set. If the data sets are from different times, the perception will be that means vary over time more than they truly do. Thus, the system characteristic being studied will seem more volatile than it is in reality, leading to a perception that the organization or environment is more changeable than it truly is. It may also lead the manager to think that the mean varies according to some other factor, creating a perception that the world is more complex than it really is. This might prompt a pointless search for nonexistent moderators—that is, some variables that are supposedly causing the mean value to vary over time. Similarly, assessments of the effects of causes, such as correlations, will appear to vary from data set to data set more than they really do. This will make it seem as if the causes are less predictable and dependable in their effects than they really are: sometimes a cause will seem strong in its effect and sometimes weak. Managers may search for factors that lead the cause to vary in its effects, that is, for moderators, even if none exist. This search will consume time and attention as frustrated managers try to pin down a mirage. They perceive the causal structure of their world to be more complex than it really is.

Of course, if managers accumulate the results of their analyses over time, the above-true values and below-true values will tend to cancel each other out, leading to a quite accurate estimate (i.e., at or close to the true value). In that way, the


sampling error problem could be self-liquidating. However, that sort of accumulation of results over time goes against the popular wisdom that there are real differences from one time period to the next that need to be captured by focusing on the results for each period, rather than by combining them. Thus, the spurious changeability created by sampling error will persist to the degree that accumulation across time is not used, or that the aggregation of data is insufficient to overcome the variability from small numbers of observations. Even if data from repeated studies are aggregated over time to remove most of the sampling error, the means and correlations will still be depressed by unreliability and range restriction, leading to false pessimism about how far causality is known.

Both types of errors, those that depress correlations and those that cause random error, lead to a misperception that the world is complex and multicausal and that the manager does not understand it fully, and to a fruitless search for other causes that do not exist. In summary, unreliability and range restriction lead to repeated underestimates of correlations across data sets. This produces a misperception, and a pessimism, that knowledge of causality is incomplete, in that some additional, unknown cause(s) must exist. Sampling error leads to misperceptions that causality is more complex than it really is, in ways that are not understood. Hence, errors in data lead managers to see the world as more complex and multicausal than it really is. This pessimistic and exaggerated view of the complexity of their organization and its environment will persist, despite attempts to understand the complexity of causation more fully, because that complexity is illusory, being based on ongoing errors in the data. Thus, repeatedly analyzing data sets over time and deriving from them multiple estimates of means and correlations will usually not reduce the errors discussed in this book. Unless the figures are aggregated across time, a method that may often be resisted, the means and correlations will retain the same errors.
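To make the spurious changeability concrete, the following sketch (ours; the metric, its level, and its spread are hypothetical) draws eight quarterly samples from a process whose true level never changes:

```python
import random

random.seed(1)

TRUE_LEVEL = 100.0   # the metric's true, unchanging level
SPREAD = 15.0        # observation-to-observation variation
N = 20               # observations per period

quarterly_means = []
for _ in range(8):
    sample = [random.gauss(TRUE_LEVEL, SPREAD) for _ in range(N)]
    quarterly_means.append(sum(sample) / N)

print("Quarterly means:", [round(m, 1) for m in quarterly_means])
print("Pooled mean    :", round(sum(quarterly_means) / 8, 1))
```

Read quarter by quarter, the wobble looks like genuine volatility and invites a search for moderators; only the pooled figure is stable.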


Overall Error From Multiple Error Sources

If repeated analysis does not lead a single source of error to be self-correcting, does the interaction between different sources of error cause them to cancel each other out, leading to correct results of numerical analyses? In order to understand the overall error that the various sources combine to produce, we need to appreciate the strength of the various errors, especially relative to each other and to the correlation of interest, and also the variability of these errors.

Relative Magnitude of Errors From the Different Sources

The various sources of error discussed in this book vary in the magnitudes of their effects. Sampling error and confounds can produce larger errors than unreliability and range restriction. The seriousness of an error lies in how great that error is relative to the true figure (e.g., the true correlation). Sampling error is likely to be a larger source of error than the other sources in the many organizations that have only a small number of observations, N, on which to base their numerical estimates. Therefore, random error, and the illusory sense of changeability and unpredictability it creates, are liable to be prevalent. Sampling error can lead to errors in the estimation of correlations that are larger than the true correlation.

Similarly, confounding can lead to errors in the estimation of correlations that are larger than the true correlation. The confounding by a variable is the product of its correlations with the independent and dependent variables. This confound could be greater than the correlation between the independent and dependent variables; that is, the confounding could be greater than the true correlation. This applies also to confounding that arises when either the independent or the dependent variable is a difference score, from correlations between the constituent variables of the difference score and the independent or dependent variables. When performance is the dependent variable, reverse causality by prior performance can be a confound, and it can be greater in magnitude than the true effect of the independent variable on performance. Thus, confounding, whether from exogenous causes, definitional connections from difference scores, or reverse causality of prior performance, can create confounds that are each greater than the true effect of the independent variable on the dependent variable. This is more likely if the true effect is small, which is likely if the dependent variable is organizational performance, because performance has many causes. Moreover, the three sources of confounding (exogenous causes, definitional connections, and prior performance) can add together, so that in combination they could be much greater than the true effect of a cause on organizational performance.

Unreliability will also be present in any data, so that measures of association will be understatements. Range restriction is also liable to occur often. Both lead to the correlations being understatements. However, the magnitude of the problems from unreliability and range restriction is often smaller than that from sampling error and confounding. Unreliability and range restriction cannot produce errors greater than the true correlation, because both mediate the true correlation by acting as multiples of it. The error in the observed correlation from unreliability is the unreliability factor multiplied by the true correlation; thus, at its maximum, unreliability produces no more error than the true correlation. Similarly, the error in the observed correlation from range restriction is the range restriction factor multiplied by the true correlation; thus, at its maximum, range restriction produces no more error than the true correlation. In combination, the unreliability and range restriction factors produce an observed correlation that is a fraction of


the true correlation. This accords with the intuition that both unreliability and range restriction lie in a causal path from the true to the observed correlation. Thus, together, the error from unreliability and range restriction is never more than the value of the true correlation.

In summary, sampling error and confounding can each be greater than the true correlation, but unreliability and range restriction never are. Moreover, the depression of the observed correlation by unreliability and range restriction makes the other sources of error stronger: sampling error and confounding will have more effect on such depressed correlations. In addition, sampling error is greater for smaller correlations than for larger ones, so the sampling errors around the correlation will tend to be greater for correlations that have been depressed by unreliability and range restriction. Thus, there is a greater chance of a negative sampling error that is as large as the positive correlation, fully masking it or possibly making it appear negative. Similarly, it is easier for a confound to be as great as the correlation if that correlation has been depressed by unreliability and range restriction. Therefore, the correlations depressed by unreliability and range restriction are more vulnerable to the other sources of error, making false conclusions more likely.

Variability of Errors From the Different Sources

What about the variability in these sources of error? What are their likely values? Sampling error could be zero, and zero is its single most likely value; however, zero is less likely, the smaller the N. Sampling error is most likely to be nonzero and is likely to be greater, the smaller the N. Sampling error can, of course, vary from positive to negative: it varies around the true value. Confounds can produce spurious correlations that vary from positive to negative. Substantial confounding by exogenous variables is more feasible if the true correlation is small, such as when the dependent variable is organizational performance, because the other causes will together have more effect than the focal cause. However, the other causes may not all be correlated with the focal cause, or they may have both positive and negative correlations with it, offsetting each other. Hence, it is plausible that, for weak true correlations, other causes will have some confounding effect, though usually a weak one. Confounding introduced by difference scores is possible. Confounding of a cause of organizational performance by prior performance is likely, because change in organizational characteristics is often driven by crises of low performance. Confounding by prior performance is likely to be serious, because prior performance may well correlate with the cause to a degree similar to the correlation between that cause and performance. All three sources of confounding could reinforce each other, possibly producing a large confound. Unreliability can vary, but it is constrained by the true correlation at its maximum and, while reducing a true positive correlation, does not produce a negative


correlation. In these ways, it varies less than sampling error. However, it is always present to some degree, because there is always some measurement error. Range restriction is not inevitable, though it frequently occurs, and it too is constrained by the true correlation at its maximum. Range extension is also possible and is not as constrained as range restriction. Because range extension overstates the true value of a correlation, it will tend to offset errors from unreliability or confounds that lead to underestimates. Overall, then: sampling error can be highly variable; confounding of a cause of organizational performance is likely from prior performance and possibly also from definitional connections or exogenous causes; unreliability is less variable, but some error from it will always exist; and problems from range differ between range restriction and range extension, so they can produce either negative or positive errors.

Modeling the Typical Overall Error

In this book, we have identified a number of sources of error that could affect managerial inferences from organizational data. The full model therefore has some complexity and is capable of producing a range of outcomes. The conclusions will vary across organizations and are ultimately an empirical matter. However, it may be useful to discuss the patterns that are likely to emerge. There may be a modal pattern around which empirical cases vary, so that, though the pattern would not apply to all organizations, it may hold sufficiently often to be a useful way of thinking about the general case. To make such an investigation, we need to make certain assumptions in order to limit the variation in the variables and reduce the number of their permutations. In so doing, we shall try to use assumptions that are reasonable on the basis of existing theory and evidence, so that the ensuing model examines the overall errors that would occur under the most likely scenario.

Assumptions of the Modeling

The assumptions made in the remainder of this chapter are as follows. We focus on analyses of organizational data that are about explaining organizational performance. Managers are particularly interested in the performance of their organization and seek to identify what drives it—that is, they seek the causes of organizational performance in their organization. Hence, the true correlation discussed in this chapter is that between organizational performance and one of its causes. We assume also that organizational performance has multiple causes, which means that the correlation will usually be small, say around .2 or, more optimistically, .3, but sometimes just .1. This assumption accords with research showing that organizational performance has multiple causes whose magnitudes are often in the range of .1 to .3 (Schlevogt 2002). We assume also that the cause of organizational performance has a positive effect on organizational performance. This is feasible because any causal variable


that has a negative effect can be made positive by taking its opposite—for example, if X negatively affects organizational performance, lack of X positively affects it. Low correlations between the cause and organizational performance make the correlation vulnerable to confounds, because confounds that produce even low spurious correlations are enough to seriously change the correlation, so that a true positive effect can seem zero or negative. We assume also that prior organizational performance has a negative effect on the causal variable, which is supported by empirical research showing, for instance, that low performance triggers change in structure (Chandler 1962; L. Donaldson 1987; Ezzamel and Hilton 1980; Hill and Pickering 1986). Hence, prior performance is a masking confound of the positive effect of a cause on organizational performance. Consistent with empirical research findings, we assume that the confounding effect of prior performance on an organizational characteristic is about the same magnitude as the effect of the organizational characteristic (i.e., the focal cause) on organizational performance. All of these assumptions seem reasonable in light of previous empirical research and theorizing about organizations.

More tentative are the following assumptions. We assume that there is no strong confounding effect of the other causes of organizational performance. These other causes are numerous, and their collective effect on organizational performance is stronger than that of the focal cause. However, they typically have only weak correlations with the focal cause. Also, they have both positive and negative correlations with the focal cause, thus offsetting each other. Thus, the effect of other causes in general (i.e., on average) produces trivial confounding, which is what the research shows (see Chapter 11). Therefore, confounding by other causes will be ignored, though this assumption is made only for the purposes of this modal analysis. For some causes in some organizations there will be confounding by other causes that materially affects the observed correlation between the focal cause and organizational performance, so the conclusions drawn by managers will vary from those in this chapter, and this should be borne in mind.

We have seen in this book that, where a variable is a difference score, there can be correlations between that variable and the two variables that compose it. If these constituent variables are in turn correlated with either the independent or the dependent variable, then a confound will exist. This kind of confound could exist if organizational performance is measured as a difference score (e.g., profit). Since organizational performance is often measured as profit or some other difference score, this confound is feasible in practice. However, for the confound to be material, the constituent variables would have to correlate highly with the focal cause. Moreover, any such confounding would be stronger if both constituent variables produced confounding in the same direction (same sign). It may be too much to assume that this is typically the case, so this kind of confound will also be ignored in this chapter. Hence, some of the possible sources of error identified in the book (confounding by exogenous factors or definitional connections) are implicitly treated as being

208

INTEGRATION

small enough to be ignored in the present chapter, in the interest of trying to see a simple general picture. Yet, there is always the possibility that either holds in an organization or that, together, they materially affect the conclusions that would be drawn from organizational data. The Simplified General Model Consider here the magnitude and variability of errors and how they interact to influence the true effect of a cause on organizational performance to produce an observed correlation that contains error. The individual methodological factors interact to produce overall error according to a set of equations, which are used implicitly here, but are presented explicitly in the next chapter. These calculations lead to the conclusion that the observed correlation would differ considerably from the true correlation, so that there is much overall error. Hence, the interactions among sources of error are not such as to correct each other. Measurement errors (unreliability) will be present, but will only be a proportion of the true correlation between a cause and organizational performance. Range problems in social science are more usually in the form of range restriction than range extension, and the same may be true for organizational data. Any errors from range restriction will tend to be only a proportion of the true correlation between a cause and organizational performance. Nevertheless, unreliability and range restriction combine to create more error and reduce the true correlation between a cause and organizational performance. Consider the example of a correlation between organizational performance and one of its causes. Suppose that the measurement of the cause was a single item in a poorly designed questionnaire, with a reliability of only .6, and that performance was profit, measured with a reliability of only .5. The combined effect of these two mediocre reliabilities is to lower the correlation to about .55 of the true correlation. Therefore, a true correlation of .3 would become a correlation of .165, an error of .135. Figure 12.1 shows a true correlation coefficient (r) of .3 and then how it is reduced cumulatively by the effects of unreliability in the cause and in profit. If, additionally, range restriction was substantial, with a coefficient of only .6, this would further lower the correlation to .33 (= .55 x .6) of the true correlation—that is, the correlation would be about .1 (Figure 12.1). Hence, the true correlation of .3 becomes a correlation of only .1—that is, one-third of its true value, an error of .2, just through the depression due to unreliability and range restriction. Such reduced correlations are easy to be neutralized by errors from confounding or sampling, making erroneous conclusions of zero or negative effect more possible. Given this weak correlation between the cause and organizational performance that is left after its reduction by unreliability and range restriction, if the strength of the negative prior performance confound exceeds it, then the correlation may become negative. Continuing the example, after its reduction by unreliability and range restriction, the positive effect of the cause on organizational performance

ERRORS NOT SELF-CORRECTING

209

[Figure 12.1 about here: Correlation after the true r is cumulatively affected by the unreliabilities of the cause and profit variables, range restriction, the prior profit confound, and sampling error (upper and lower bounds for N = 20). The vertical axis runs from 0.4 down to –0.8; the successive stages plotted are: true r; r after reliability of the cause; plus reliability of profit; plus range restriction; plus prior profit confound; plus sampling error upper bound; plus sampling error lower bound.]

Continuing the example: after its reduction by unreliability and range restriction, the positive effect of the cause on organizational performance was only +.1. If the negative confounding effect of prior performance was –.3, then the correlation would be –.2, which is negative (Figure 12.1). The result is incorrect, given that the true correlation is +.3.

Sampling error then creates a range of correlations around this incorrect negative correlation. The likelihood and magnitude of sampling error depend in part on the size of the correlation: sampling error is greater for smaller correlations. For the scenario considered here, the correlation is small, so sampling error is likely to be greater. This is one reason why sampling error is likely to be troublesome. Sampling error also depends in part upon N, the number of observations. In a large-N organization, sampling error will be trivial. Therefore, the correlation after unreliability, range restriction, and confounding will be virtually the final correlation, which, as stated here, is usually a false, negative correlation. In contrast, in a small-N organization, the error from sampling could be large and could exceed the correlation left after unreliability, range restriction, and confounding. The sampling error is positive in about half the cases and negative in about half. In a small-N organization the sampling error interacts with the error from confounding: the two sources of error could reinforce or offset each other. When sampling error creates a negative error, this reinforces the negative correlation from confounding; when it creates a positive error, this offsets the negative correlation. Thus, in half the cases, sampling error will make the negative correlation


more highly negative, so that it is more erroneous. In the other half of the cases, sampling error will be positive. Low values of positive sampling error will only take the final correlation toward lower negative, or zero, values. Only a large positive sampling error will make the final correlation positive, but such large sampling errors are unlikely. Therefore, the probability is only small that positive sampling error will make the final correlation similar to the true positive value. Hence, under the scenario considered here (of confounding greater than the true positive correlation after unreliability and range restriction), sampling error is unlikely to lead to a final positive correlation of the same magnitude as the true correlation.

In summary, for large-N organizations, the result is an incorrect negative final correlation that comes from the interaction of the measurement artifacts with confounding. For small-N organizations, the result is usually an incorrect negative final correlation, and only in a small minority of cases is the result near the correct positive value. Aggregation of data, to turn a small N into a large N, will not solve the problem, because this only turns a chance of obtaining a false result into a certainty of doing so.

To better explain the role of sampling error in interaction with the other sources of error, we will continue the example quantitatively. Earlier, we showed how a true positive correlation of .3 could, through unreliability, range restriction, and confounding, become a correlation of –.2 (i.e., negative). We now add the effects of sampling error. If the data were from an analysis with an N of 20, the sampling error for an underlying correlation of –.2 would produce a "fan," or range, of correlations between –.62 and .22 (Figure 12.1). Thus, at the upper extreme, a positive value of .22, a little below the true .3, is possible. However, negative values are much more likely: the most likely values for an underlying correlation of –.2 lie between –.4 and zero—that is, almost all negative. While a manager, at best, may correctly conclude that the cause has a positive effect of almost the true magnitude, he is much more likely to falsely conclude that the effect is negative.
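The chain of numbers in this example can be reproduced in a few lines. The sketch below is ours; for the final step it uses one common large-sample approximation to the sampling variability of a correlation, which reproduces, to rounding, the fan of –.62 to +.22 given above.

```python
import math

r = 0.3                               # true correlation, cause -> performance
r *= math.sqrt(0.6) * math.sqrt(0.5)  # unreliability of cause (.6) and profit (.5): ~.165
r *= 0.6                              # range restriction: ~.10
r += -0.3                             # prior-performance confound: ~-.20

n = 20
se = (1 - r**2) / math.sqrt(n)        # approximate standard error of a correlation
lo, hi = r - 1.96 * se, r + 1.96 * se
print(f"correlation after all determinate errors: {r:+.2f}")
print(f"95% sampling fan for N = {n}: ({lo:+.2f}, {hi:+.2f})")  # ~(-0.62, +0.22)
```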


Thus, given a low true positive correlation (reduced by unreliability and range artifacts, so that the correlation is less than the confounding negative correlation from prior performance), the result is a negative correlation. Sampling error then creates a fan of correlations, but, because they are centered on the negative correlation, only at their upper extreme do the correlations tend toward the true correlation. Thus, once the errors from measurement and prior performance have operated, sampling error cannot usually compensate for them, so the final correlation typically misleads the manager. The crux of the matter is the low true correlation, which, after reduction by measurement artifacts, is overwhelmed by the negative spurious correlation from prior performance, leaving a negative correlation. The weakness of this correlation inflates sampling error, but even so, the fact that sampling error is anchored around a negative correlation makes the final correlations mostly negative, with only rare, extreme correlations approaching the true positive value. This is what we believe the modal pattern looks like. The results will typically fail to demonstrate that a cause that positively affects organizational performance does so. Instead, most analyses will falsely conclude that a positive cause of organizational performance is negative in its effect. Managers will often conclude, wrongly, that the cause is ineffectual or harmful to organizational performance. They will thus fail to use that causal lever. By failing to invest in raising the level of that causal variable, managers will miss the opportunity to improve the performance of their organization. Thus, the interactions among the different sources of error will not usually eliminate serious error.

Of course, this conclusion is for the scenario considered here. In some organizations range may be extended, increasing the observed positive correlation. Again, the other confounds, from exogenous causes or definitional connections, might operate in other cases, and they might be positive in their effects, counteracting the negative prior performance confound. The actual overall error will always depend on the sign and magnitude of all the errors operating in a given organization. Nevertheless, the present analysis, based upon what existing research tends to show about the errors from the various sources, indicates that they will typically not cancel each other out and self-correct.

This discussion of the interacting effects of the errors from the various sources leads to the same view as the discussion of repeated analyses over time. In both cases, the errors continue to exist when these multiplex situations are considered. Neither case, moving from single episodes to repeated episodes or from single error sources to multiple error sources, leads to error elimination in any way that can be depended upon. As the errors considered here become recurrent or interactive, they do not cancel themselves out. The conclusions drawn in statistico-organizational theory are robust whether we consider moves from single episodes to repeated episodes or from single error sources to the interaction between multiple error sources.

Conclusions

The errors considered in this book do not work to cancel each other. If the error produced in a single episode were followed by opposite errors in subsequent episodes, and if they were aggregated together, then they would to some extent cancel each other out. However, aggregation across time tends not to occur sufficiently to eliminate the problem of small numbers of observations. Hence, the variations across time arising from sampling error tend to be interpreted as instabilities in the levels of variables or in their relationships, pointing toward unknown complexity. Thus, far from self-correcting, the variations over time can produce fresh, inexplicable problems. Likewise, on the topic of the causes of organizational performance, the interactions between errors from different sources are such that there is liable to be overall


error. This overall error may be complete, in the sense that managers draw exactly the wrong lesson from their numerical data: a positive correlation can appear to be negative. Thus, consideration of the interactions across time and across error sources does not cast doubt on the central thesis of statistico-organizational theory: that, when managers examine numerical data in their organization, the traps within those data can often lead them to draw false conclusions that mislead decision-making.

In this chapter, in calculating the errors from the error sources and the overall error from their interactions, we have utilized certain quantitative formulas. The equations for these error sources, and the equation showing how they interact to create the overall error, will be presented in Chapter 13 to provide a formalization of the argument. Readers not disposed to mathematical notation and equations may skip over that chapter, because its substantive implications have already been dealt with in other chapters.

13 Equations of Statistico-Organizational Theory

In the methodological literature, an argument is often formalized with the help of equations. So too, in statistico-organizational theory, we may use equations. They help to formalize the argument and, hopefully, improve its clarity and precision. They also allow estimation of effects, and they enable us to appreciate more clearly how the errors from various sources combine, by offsetting or reinforcing each other, to give the overall error. Moreover, the equation that gives the overall error can then be modified to display the role of the organizational characteristics that statistico-organizational theory holds lead to these errors in varying organizational situations. In this way, statistico-organizational theory as a whole may be formalized into an equation. (Notwithstanding these advantages, the nonmathematical reader should feel free to proceed to other chapters.)

The strategy in this chapter is first to use the methodological literature to give the equations that express how properties of data, such as the correlations or standard deviations therein, give rise to errors. Some of these equations are then stated in simplified form, to capture the main effects while leaving the equation tractable. The equations are then combined into an overall methodological equation, so that the interactions among errors that produce the overall error can be seen. The terms in the overall methodological equation are then reexpressed in terms of the organizational characteristics that predict how serious these errors will be across the different situations that managers face in organizations. This equation, expressed in the organizational characteristics, is the overall theoretical equation of statistico-organizational theory.

Errors are often discussed in social science in a vague way. The analyst looks at data in which the errors are already present, blurring true patterns, but exactly how is often part of the blur. In contrast, some methodologists (Hunter and Schmidt 2004; Johns 1981) deal with errors in a more rigorous way, allowing clearer treatment. In particular, they may start with the true underlying pattern, such as a true score or


true correlation, and then show how errors operate, in ways that are stated explicitly, to produce predictable degrees of error in the observed finding. Thus, the analysis of errors can be made by causal models. This approach helps us to think more clearly about the errors occurring in managerial analyses of organizational data, and so it will be drawn upon extensively in this chapter (as it has been, implicitly, in other chapters).

There are two kinds of processes occurring in statistico-organizational theory: determinate and indeterminate. In a determinate process, the outcome can be stated. Most processes considered in this book are determinate: measurement error (unreliability), range restriction and extension, and confounding. For each of these, the error produced can be predicted by a formula from the values of its component variables. Moreover, their overall error can be stated by combining them in an equation that captures their combined errors. Thus, we can proceed by focusing on the equation for each of these error sources and then combining them into an overall equation. By contrast, in an indeterminate process the outcome cannot be stated. Sampling error is indeterminate in this sense, in that it varies randomly about the true value, being sometimes positive and sometimes negative. Moreover, the magnitude of the sampling error varies from sample to sample. What can be stated, however, is the distribution of sampling error, and this distribution widens for smaller numbers of observations. We can combine the determinate and indeterminate errors by considering the determinate errors to have their effect on the overall error, which then joins with the indeterminate error to produce the final error. Because the determinate outcome is a certain value, while the indeterminate outcome is a range of values, the final error is a range of values. It will be convenient to start with the determinate errors, giving their equations and then combining them, before turning to the indeterminate error. Accordingly, we will consider, in order, measurement error (unreliability), range restriction and extension, and confounding, and the overall error they produce, before turning to sampling error.

Before considering the errors, however, it is appropriate to consider first the true correlation between the independent and dependent variables. This conditions the role of the errors in several ways. As will be seen, measurement error (unreliability) and range restriction work by reducing the true correlation by some fraction of itself (or inflating it, in the case of range extension), so the magnitude of these errors in the observed correlation is conditional upon the true correlation. The confound errors work by adding to, or subtracting from, the observed correlation, so the magnitude of these errors is governed by the magnitude of their spurious correlations relative to the true correlation.

True Correlation

We are concerned with managers examining numerical data in their organizations and trying to infer the positive causes of organizational performance. Managers care


for organizational performance and are monitored and rewarded for the performance of their organization. The performance of an organization is affected by many causes, such as its strategy, structure, human resource management, marketing, and finance, as well as environmental factors such as competition and the business cycle. Because there are multiple causes of organizational performance, any single cause will tend to be weak in its effect. Therefore, we can postulate that the correlation of a cause of organizational performance with organizational performance will tend to be low.

To appreciate this point, consider the following hypothetical example. If organizational performance happened to have eleven causes, each independent of the others, it could be calculated from a formula in Hunter and Schmidt (2004) that the correlation between a cause and organizational performance would be .3. The causes, being uncorrelated with each other, would each explain some percentage of the variance in organizational performance, and the percentage explained by each cause would add to that explained by every other cause, so that in total they explained 100 percent of the variance. Therefore, if there were eleven different causes of equal strength, each cause would explain 9 percent of the variance in organizational performance, so that, in total, the eleven causes would explain 99 percent of the variance. Thus, each cause would be correlated .3 with organizational performance (because the correlation is the square root of the variance explained). If the causes had different strengths, which is more realistic, the average correlation of a cause with organizational performance would nevertheless be .3. Of course, one cause might be correlated more than .3 and another less; but, most typically, a cause would have the average correlation of .3. Since we are interested in this book in causes of organizational performance in general, we will focus on such average and, therefore, low correlations, rather than on unusually high correlations.

Since organizational performance has numerous causes, any focal cause will therefore have a low correlation, say, .3. The research literature has indeed found many causes of organizational performance, supporting the focus here on a cause that is one of many and so has a low correlation. As seen in Chapter 11, in a study of a large number of possible causes of performance in business firms, Schlevogt (2002) found that only one correlation between a cause (planning) and performance was over .2, and it was only .29 (though these correlations will be attenuated by unreliability and affected by any range artifact). Thus, the idea that causes of organizational performance have low correlations with organizational performance has some empirical support. Hence, we will use as an axiom herein that organizational performance has numerous causes and, therefore, that any focal cause will have a low correlation, which we postulate to be only .3. In so doing, we acknowledge that there may be many more causes than about eleven, so that the average correlation is less than .3. For instance, if there are 100 causes of organizational performance that are independent of each other, then their average variance explained is only .01, so


that their average correlation is only .1 (because the correlation is the square root of the variance explained). Clearly, .1 is only one-third of .3. Therefore, it is feasible that the true correlation is typically much less than the .3 postulated here. This makes the present postulate a conservative assumption: the analysis shows how a low correlation of .3 is prone to serious error, while a lower correlation of .1 would be even more prone to serious error.

The explicit assumption is that the causes are independent of each other, and there is evidence that many causes of organizational performance are independent of each other, as was discussed in Chapter 11. Some causes may be correlated, but if they are correlated recurrently, from data set to data set, this may be because one cause is caused by the other, in which case the second cause is really the underlying cause of the first. Since the point of causal analysis is usually to identify the different causal levers, the search is not for causes where one drives another, but for causes that each drive the dependent variable independently of the others. Thus, the interest here is in managers finding causes of the performance of their organizations that are independent of each other, so dealing in uncorrelated causes is consistent with the overall logic.

Therefore, for modeling purposes, consider that the true correlation is .3. In symbols, the true correlation, $r_t$, equals .3:

$$r_t = .3$$

Because this true correlation is low, the reduction in it due to measurement error and range restriction will make the observed correlation smaller still—for example, .2. Hence, the observed correlation, $r_o$, equals .2:

$$r_o = .2$$

It is therefore easy for confounds to have spurious negative correlations that are large enough to have a strong confounding effect—that is, to render the observed correlation a very small positive, zero, or negative correlation. For example, if the observed correlation is only .2, then a spurious negative correlation of only –.2 will make the final correlation appear to be zero. Yet all such findings are erroneous and lead the manager to the wrong conclusion—that is, to underuse, cease using, or decrease the use of that causal lever. When the dependent variable is performance, confounds can arise from the numerous other causes of performance. Confounding can also arise from prior performance, which tends to be negatively correlated with many organizational characteristics (e.g., strategy and structure), because organizations often do not change those characteristics until there is a crisis of low performance. Hence, using performance as the dependent variable makes confounds more likely, while the true correlation is low, making severe confounding quite feasible.
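The arithmetic behind the .3 and .1 figures can be checked in a couple of lines (a sketch of ours, assuming k equally strong, mutually independent causes that jointly explain all the variance):

```python
import math

# k independent, equally strong causes that jointly explain (nearly) all the
# variance in performance: each explains 1/k of the variance, so each
# correlates sqrt(1/k) with performance.
for k in (11, 100):
    r = math.sqrt(1 / k)
    print(f"{k:3d} independent causes -> average correlation ~ {r:.2f}")
# 11 causes -> ~0.30; 100 causes -> ~0.10, matching the figures in the text.
```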


Profit is an important aspect of performance in many organizations (e.g., business firms). Therefore, in many analyses of cause and effect, profit is the dependent variable. Profit is a difference score, and this reduces its reliability, so that the true correlation is even more depressed than usual by measurement error. The difference score also introduces another source of confounding, due to definitional connections. If its spurious correlation is of the same sign as those of the confounds due to other causes and prior performance (e.g., if all are negative), they reinforce each other. The resulting enhanced negative confounding may well exceed the diminished true positive correlation (after unreliability and range restriction), leading to an observed, erroneous negative correlation, which misleads the manager. Thus, in the managerially relevant case of an analysis of organizational data in which the manager looks for a positive correlation between a cause and performance, especially profitability, the true correlation is likely to be weak, while the errors are numerous and potentially stronger than the true correlation, leading to false conclusions.

Measurement Error

Measurement error (unreliability) introduces an error into the observed correlation that reduces it below the true correlation. This error is proportional to the unreliabilities in the measurement of the independent and dependent variables. Moreover, if either the independent or the dependent variable is a difference score, the observed correlation is further reduced below the true correlation. However, because the error in the observed correlation is a reduction below the value of the true correlation, the magnitude of the error from unreliability is never more than the true correlation.

In psychometrics, the effect of reliability is conceptualized in terms of a path model, in which the true value causes the observed value through the reliability (Hunter and Schmidt 2004). The true value is not perceived directly but is registered imperfectly by the observed value, so that the observed value is diminished by the extent of the unreliability. Thus, the reliability coefficient fully mediates the effect of the true correlation on the observed correlation. This path modeling is used in psychometrics for range restriction and extension as well, as will be seen below; and we will use the same path modeling approach for confounds.

Reliability affects the true correlation according to the magnitude of the reliability coefficients. The reliabilities are measured as lying between zero (complete unreliability) and one (complete reliability). The formula is that the observed correlation is the product of the true correlation between the two variables and the square roots of the reliabilities of the measurement of each variable (Hunter and Schmidt 2004). In other words, the observed correlation of X and Y equals the true correlation of X and Y multiplied by the square root of the reliability of X and by the square root of the reliability of Y. In symbols, if $r_o$ is the observed correlation, $r_t$ is the true correlation, $r_{xx}$ is the reliability of variable X, and $r_{yy}$ is the reliability of variable Y, then

218

INTEGRATION

$r_o = r_t \sqrt{r_{xx}} \sqrt{r_{yy}}$

Therefore, the less reliable is the measurement of the variables in the correlation, the more the true correlation is reduced in producing the observed correlation. The underlying model is a path in which the observed correlation is produced by the true correlation working through the reliabilities: the lower the reliability of measurement, the less the observed correlation reflects the true correlation. For example, if X is measured with reliability of .8 (rxx) and Y is measured with a lesser reliability, say .7 (ryy), then the joint effect of the reliabilities is such that

$r_o = r_t \sqrt{.8} \sqrt{.7} = .75\,r_t$

Hence, the value of the observed correlation is reduced so that it is only 75 percent of the true correlation. A true correlation of .3 would be an observed correlation of only .23.

The formula used here makes it clear that the error produced by unreliability is some fraction of the true correlation. As seen:

$r_o = r_t \sqrt{r_{xx}} \sqrt{r_{yy}}$

Therefore, the observed correlation varies according to the reliabilities, from zero (if rxx or ryy is zero) to the whole of the true correlation (if rxx and ryy are both one). Hence, the corresponding error in the observed correlation varies from the whole value of the true correlation to zero. The error from the unreliabilities of X and Y is equal to the difference between the observed correlation of X and Y and the true correlation of X and Y:

$\text{Error in } r_o = r_o - r_t$

This formula gives both the amount of the error and its sign—that is, an observed correlation less than the true correlation is correctly given by the formula to be a negative error. Substituting for ro the formula used to calculate it,

$\text{Error in } r_o = r_t \sqrt{r_{xx}} \sqrt{r_{yy}} - r_t$

which can be rearranged as

$\text{Error in } r_o = r_t \left( \sqrt{r_{xx}} \sqrt{r_{yy}} - 1 \right)$
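These attenuation formulas are easy to check numerically. The following is a minimal sketch in Python; the function name and input values are illustrative choices of ours, not part of the theory.

```python
import math

def attenuated_correlation(r_true, rel_x, rel_y):
    """Observed correlation after attenuation by unreliability
    (Hunter and Schmidt 2004): r_o = r_t * sqrt(r_xx) * sqrt(r_yy)."""
    return r_true * math.sqrt(rel_x) * math.sqrt(rel_y)

r_t = 0.3                                # postulated true correlation
r_o = attenuated_correlation(r_t, 0.8, 0.7)
print(round(r_o, 2))                     # 0.22 (about 75 percent of r_t; the text rounds to .23)
print(round(r_o - r_t, 2))               # error in r_o: about -0.08
```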

This equation gives the idea that the initial condition is the true correlation, into which error is introduced to give the error in the observed correlation. Returning to the example, as seen above, if the reliabilities of X and Y are .8 and .7, then a true correlation of .3 would be reduced to only 75 percent of itself. The error in ro would be (.75 – 1)—that is, –.25 of rt (.3)—making an error of about –.08. The negative sign shows that the error is a reduction of the observed correlation below the true correlation. Thus, the error in the observed correlation is always some fraction of the true correlation. The error in the observed correlation is never more than the value of the true correlation.

Difference Score

Some variables are difference scores, meaning that the variable, Z, is made by taking the difference between two variables, say, X and Y, so that Z = X – Y. For instance, if organizational performance is measured by profit, profit is the difference between sales and costs, so P = S – C. In psychometrics it is understood that, if the two variables are positively correlated, then taking their difference produces a less reliable variable than the constituent variables (e.g., S and C) from which the difference score variable (e.g., P) is made. As seen in Chapter 6, there is a formula (Johns 1981) that gives the reliability of a difference score variable:

$r_{diff} = \frac{s_x^2 r_{xx} + s_y^2 r_{yy} - 2 r_{xy} s_x s_y}{s_x^2 + s_y^2 - 2 r_{xy} s_x s_y}$

If the constituent variables were standardized, so that their standard deviations, s, equal 1, their variances, s², also equal 1, so that the formula reduces to

$r_{diff} = \frac{r_{xx} + r_{yy} - 2 r_{xy}}{2 - 2 r_{xy}}$

in which rdiff is the reliability of the difference score variable; rxx and ryy are the reliabilities of the constituent variables, X and Y, respectively, from which the difference variable is made; and rxy is the correlation between X and Y. To better understand the formula, we may take the average reliability of X and Y—that is, their mean reliability. Twice the average reliability, raa, is equal to the sum of rxx and ryy, so, making this substitution, the formula becomes

$r_{diff} = \frac{2 r_{aa} - 2 r_{xy}}{2 - 2 r_{xy}} = \frac{2 (r_{aa} - r_{xy})}{2 (1 - r_{xy})}$

Canceling the factor of 2 in the numerator and denominator,

$r_{diff} = \frac{r_{aa} - r_{xy}}{1 - r_{xy}}$

In words, this says that the reliability of a difference score variable equals the mean reliability of its constituent variables less the positive correlation between the constituent variables, divided by the extent to which those variables are uncorrelated. The numerator says that the average reliability of the constituent variables, X and Y, is reduced by the degree of positive correlation between X and Y. Hence, the higher the correlation between X and Y, the more their reliabilities are reduced to give the reliability of the difference variable. The denominator is the extent to which X and Y are uncorrelated. The more highly X and Y are positively correlated, the more the denominator boosts the reliability of the difference score back up. This restitution is only partial, however. It offsets, but only somewhat, the tendency in the numerator for high positive correlations between X and Y to produce low reliability of the difference score. Hence, the denominator has the effect of limiting the reduction in difference score reliability due to the numerator. Thus, difference score reliability is more reduced, the higher is the positive correlation between X and Y, but only to a limited degree. For example, if the reliability of X is .9 and the reliability of Y is .7, their average reliability is .8, and so, if the correlation between X and Y is .6, then the reliability of the difference variable, X – Y, is

$r_{diff} = \frac{.8 - .6}{1 - .6} = \frac{.2}{.4} = .5$

The numerator reduces the reliability from .8 to .2, but the denominator boosts it back up to .5. Overall, in this example, taking the difference between X and Y produces a difference variable, X – Y, whose reliability is only .5, which is much less than the average, .8, of the reliabilities of X, .9, and Y, .7. Thus, for two variables measured with reliability of .8 on average, taking their difference reduces the reliability to .5. Returning to the previous example, if the independent variable was measured with reliability of .8, but the dependent variable was, say, profit, which is a difference score, so that it was measured with the low reliability just calculated of .5, then the observed correlation would be

$r_o = r_t \sqrt{.8} \sqrt{.5} = .63\,r_t$

Thus, the observed correlation of the effect of the independent variable on profit would be over one-third less than the true correlation. For example, if the effect of the independent variable on profit were a true correlation of .3, the observed correlation would be only .19. Hence, as long as the correlation between the constituent variables is positive, their reliability is reduced in a difference score variable. Thus, in these circumstances, using a difference score variable reduces the observed correlation below the true correlation. It is a source of error that compounds the error due to unreliability in the simple variables, X and Y. The more highly the constituent variables, X and Y, are positively correlated, the greater is the reduction in reliability of a difference score variable relative to the reliabilities of its constituent variables.

Profit is a difference score variable, so its reliability, rpp, is liable to be lower than that of its constituent variables, sales and costs. It is calculable from the reliabilities of sales, rss, and costs, rcc, knowing their correlation, rsc:

$r_{pp} = \frac{r_{ss} + r_{cc} - 2 r_{sc}}{2 - 2 r_{sc}}$
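The reliability penalty for difference scores is easy to see numerically. Below is a minimal sketch, assuming standardized constituent variables so that the simplified Johns (1981) formula applies; the function name is our own.

```python
def difference_score_reliability(rel_x, rel_y, r_xy):
    """Reliability of X - Y for standardized X and Y (Johns 1981):
    (r_xx + r_yy - 2*r_xy) / (2 - 2*r_xy)."""
    return (rel_x + rel_y - 2 * r_xy) / (2 - 2 * r_xy)

# Worked example from the text: reliabilities .9 and .7, correlation .6.
print(round(difference_score_reliability(0.9, 0.7, 0.6), 2))   # 0.5

# Profit as sales minus costs: the same formula gives r_pp from r_ss, r_cc, r_sc.
print(round(difference_score_reliability(0.8, 0.8, 0.6), 2))   # 0.5
```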

In an analysis of the effect of an organizational characteristic, G, whose reliability is rgg, on profit, the true correlation between them, rtgp, is subject to attenuation, due to reliability, to produce the observed correlation between them, rogp:

$r_{ogp} = r_{tgp} \sqrt{r_{gg}} \sqrt{r_{pp}}$

Substituting for the reliability coefficient for profit, rpp, the formula for its calculation just given above, the equation becomes

$r_{ogp} = r_{tgp} \sqrt{r_{gg}} \sqrt{\frac{r_{ss} + r_{cc} - 2 r_{sc}}{2 - 2 r_{sc}}}$

Therefore, the error in rogp equals

$\text{Error in } r_{ogp} = r_{ogp} - r_{tgp} = r_{tgp} \left( \sqrt{r_{gg}} \sqrt{\frac{r_{ss} + r_{cc} - 2 r_{sc}}{2 - 2 r_{sc}}} - 1 \right)$

Range Restriction and Range Extension

Range restriction and range extension can jointly be termed range artifacts. Range restriction occurs when the range of a variable in a study is less than the range of that variable in the universe. Range restriction leads to attenuation (reduction) of the observed correlation below the magnitude of the true correlation. However, the reduction in correlation is some fraction of the true correlation, and so range restriction never produces an error greater than the value of the true correlation.

Range extension leads to inflation of the observed correlation above the magnitude of the true correlation. The effect of the range of a variable on a correlation involving that variable is the effect of variation on covariation. To fully capture covariation between variables, it is necessary to fully capture variation of each variable. The variation of a variable may be measured by its standard deviation. Range artifacts are present if the standard deviation of the variable in the data set is unequal to the standard deviation in the universe. If the standard deviation of a variable in a data set is equal to the standard deviation of that variable in the universe (i.e., population), then the covariation between that variable and another variable will not have any error from range restriction or range extension. (This holds if the other variable in the correlation also has no range artifact.) However, if the standard deviation of a variable in a data set is less than the standard deviation of that variable in the universe, then there will be range restriction, meaning that the covariation between that variable and another variable will be understated. Similarly, if the standard deviation of a variable in a data set is more than the standard deviation of that variable in the universe, then there will be range extension, meaning that the covariation between that variable and another variable will be overstated.

The degree of understatement or overstatement of the true correlation is proportional to the ratio of the standard deviation of the variable in the data set to the standard deviation of that variable in the universe (Hunter and Schmidt 2004). The formula is that the observed correlation is equal to the product of the true correlation and the ratio of the standard deviations. In other words, the observed correlation equals the true correlation multiplied by the ratio of the standard deviation in the data set to the standard deviation in the universe. In symbols, if SDd is the standard deviation in the data set and SDu is the standard deviation in the universe (i.e., population), then

$r_o = r_t \left( \frac{SD_d}{SD_u} \right)$

Hence, less than the true correlation is observed if the ratio is less than one (i.e., there is range restriction), while more than the true correlation is observed if the ratio is more than one (i.e., there is range extension). In range restriction, the ratio of the standard deviation in the data set to the standard deviation in the universe is less than one, and so the true correlation is reduced, to produce the observed correlation. For example, if the ratio of the standard deviation of the sample to the standard deviation of the universe is .7, and the true correlation is .3, then the observed correlation is .21, a reduction of .09 below the true correlation. The smaller the ratio of the standard deviation in the data set to the standard deviation in the universe, the smaller is the observed correlation relative to the true correlation.

In range extension, the ratio of the standard deviation in the data set to the standard deviation in the universe is more than one, and so the true correlation is inflated to produce the observed correlation. For example, if the ratio of the standard deviation of the sample to the standard deviation of the universe is 1.3, and the true correlation is .3, then the observed correlation is .39, an increase of .09 above the true correlation. The larger the ratio of the standard deviation in the data set to the standard deviation in the universe, the greater is the observed correlation relative to the true correlation. Thus, the degree of range restriction or extension is proportionate to the ratio of the standard deviations.

This treatment of the effect of range restriction and extension as being the ratio of standard deviations is a simplification of the actual formula of the relationship between true and observed correlations as affected by range (Hunter and Schmidt 2004). However, this simplification produces a good approximation of the value from the fuller formula, if conditions are met. These conditions include that the correlation be small (Hunter and Schmidt 2004). In this book, we focus on the correlation between organizational performance and one of its causes. As we have emphasized herein, because there are multiple causes of organizational performance, we expect the correlation to be low, in the range of .1 to .3. Therefore, in using the simple formula for range effects, the assumption made is consistent with the overall argument of this book.

The error in the observed correlation, ro, from range artifacts equals

$r_t \left( \frac{SD_d}{SD_u} - 1 \right)$

Continuing the range restriction example, if the ratio of the standard deviations is .7, then the error is –.3 of the true correlation (.3)—that is, –.09—the negative sign meaning that the observed correlation is reduced below the true correlation. Since, for range restriction, SDd is always less than SDu, the ratio of these standard deviations is always less than one, and so the error is always less than the magnitude of the true correlation. Continuing the range extension example, if the ratio of the standard deviations is 1.3, then the error is +.3 of the true correlation (.3)—that is, +.09—the positive sign meaning that the observed correlation is inflated above the true correlation.

Combination of Measurement Error and Range Restriction or Extension

Unreliability is always present in data to some degree, so if either range restriction or range extension exists in a data set, both sources of error are present. Their errors can combine to produce a greater error than from either alone. As seen, unreliability lowers the observed correlation below the true correlation. Range restriction also lowers (attenuates) the observed correlation, so range restriction amplifies unreliability to further lower the correlation. Nevertheless, at its maximum, the error in the observed correlation from unreliability and range restriction combined is less than the magnitude of the true correlation. Range extension raises the observed correlation, so range extension offsets unreliability, which lowers the observed correlation, to reduce the error they jointly produce in a correlation. Usually, therefore, range extension and unreliability will combine to produce a weaker error than the sum of either alone. Range extension that is stronger than unreliability produces an error in which the observed correlation overstates the true correlation.

Range artifacts cause the observed correlation to vary from the true correlation; the observed correlation is also affected by unreliability in the measurement of the variables. Combining the two:

$r_o = r_t \sqrt{r_{xx}} \sqrt{r_{yy}} \left( \frac{SD_d}{SD_u} \right)$

As stated, for range restriction, the ratio SDd/SDu is less than one, reducing the observed correlation. For range extension, the ratio SDd/SDu is more than one, increasing the observed correlation. The value from the range artifacts, the ratio SDd/SDu, combines multiplicatively with the square roots of the reliability coefficients of X and Y to give the observed correlation. For example, if the reliability of X was .8 and of Y was .6, and if the ratio SDd/SDu was .7 (i.e., there was range restriction), these effects would combine to lower the observed correlation proportionately to .48 ($= \sqrt{.8} \times \sqrt{.6} \times .7$) of the value of the true correlation. If the true correlation were .3, the observed correlation would be only .14 (= .48 × .3). Thus, in this example, the range restriction combines with modest unreliabilities to make a true correlation of .3 appear to be only about half its value—a considerable understatement.

In contrast, if, again, the reliability of X was .8 and of Y was .6, but the ratio SDd/SDu was 1.5 (i.e., there was range extension), these effects would combine to make the observed correlation have the proportion of 1.04 ($= \sqrt{.8} \times \sqrt{.6} \times 1.5$) of the true correlation. If the true correlation were .3, the observed correlation would be .31 (= 1.04 × .3). Thus, in this example, the range extension offsets the reduction from the modest unreliabilities, to make the correlation appear to be about its true value, .3. Here, the range extension counteracts the unreliabilities to restore the observed correlation to about the true correlation.

The error in the observed correlation is equal to

$\text{Error in } r_o = r_o - r_t = r_t \left( \sqrt{r_{xx}} \sqrt{r_{yy}} \left( \frac{SD_d}{SD_u} \right) - 1 \right)$

Again, a positive error means that—as a result of the joint effect of unreliabilities and a range artifact—the observed correlation is inflated above the true correlation, while a negative error means that the observed correlation is reduced below the true correlation. As seen, it is possible for range extension to completely offset unreliability, so that the error from unreliability that reduces the correlation is completely offset by the error from range extension that increases the correlation. This will occur when unreliability is numerically equaled by the inverse of the range extension (because in range extension SDd is more than SDu):

$\sqrt{r_{xx}} \sqrt{r_{yy}} = \frac{SD_u}{SD_d}$

For example, if both rxx and ryy had reliabilities of .7, the error they produce in the observed correlation would be to reduce it by a factor of .7. If the ratio of SDu to SDd were also .7, the range extension would be inflating the observed correlation by the same amount. Therefore, the unreliability error would be canceled out by the range extension error. In this situation the observed correlation would equal the true correlation, so that no error would occur. In this case, there is no diminution of the observed correlation below the true correlation, despite the occurrence of unreliability. However, range restriction is probably a more frequent occurrence in organizational data than is range extension. We present this discussion of range extension simply to clarify that the errors from unreliability sometimes will be equaled, or exceeded, by the range extension error, and then the joint effect of unreliability and range extension is either no error or an overstatement of the correlation, respectively.

When profit is one of the variables in the correlation, then, as seen, the longer formula for unreliability is used, which combines with range artifacts:

$\text{Error in } r_{ogp} = r_{ogp} - r_{tgp} = r_{tgp} \left( \sqrt{r_{gg}} \sqrt{\frac{r_{ss} + r_{cc} - 2 r_{sc}}{2 - 2 r_{sc}}} \left( \frac{SD_d}{SD_u} \right) - 1 \right)$
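To see how unreliability and range artifacts trade off, the short sketch below combines them multiplicatively, as in the equations above. It is a minimal illustration with assumed inputs, not a general-purpose tool.

```python
import math

def observed_correlation(r_true, rel_x, rel_y, sd_ratio):
    """Observed correlation after unreliability and a range artifact:
    r_o = r_t * sqrt(r_xx) * sqrt(r_yy) * (SD_d / SD_u)."""
    return r_true * math.sqrt(rel_x) * math.sqrt(rel_y) * sd_ratio

# Range restriction (ratio .7) with reliabilities .8 and .6:
print(round(observed_correlation(0.3, 0.8, 0.6, 0.7), 2))   # 0.15; the text rounds the factor to .48, giving .14
# Range extension (ratio 1.5) with the same reliabilities:
print(round(observed_correlation(0.3, 0.8, 0.6, 1.5), 2))   # 0.31
```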

Confounding

Confounds affect the observed correlation, either lowering it or raising it, so that it differs from the true value. The sources of error being considered now—unreliability, range restriction and extension, and confounding—all affect the observed correlation, say, between organizational performance and one of its causes. However, unreliability and range restriction and extension work by conditioning the amount of the true correlation that flows through to become the observed correlation. This has been made explicit in the foregoing equations, which showed the unreliability and range factors as some fraction of the true correlation. In contrast, confounding directly affects the observed correlation. Confounding will therefore be represented as a term in the equation of the causes of the observed correlation that is not an interaction term involving the true correlation. Hence, confounding is not limited in amount by the magnitude of the true correlation and can have a greater effect on the observed correlation.

In this book, three types of confounding have been considered: confounding by other factors, confounding from definitional connection (arising from difference scores), and confounding by prior performance. Each type of confounding will be considered in turn.

Confounding by Other Factors

An observed correlation, roxy, between variables, X and Y, will be confounded by any third variable, Z, which is correlated with both X and Y. Z could be a cause of Y, but does not have to be. Z could be caused by Y, or be a cause of X, or not be causally related to either X or Y. Z only has to be a correlate of X and Y. The spurious correlation introduced by Z into the observed correlation, roxy(z), is equal to the product of the correlations rxz and rzy:

$r_{oxy(z)} = r_{xz} r_{zy}$

Hence, if the true correlation, rtxy, were zero, then the observed correlation, roxy, would be

$r_{oxy} = r_{xz} r_{zy}$

When there is a nonzero true correlation, rtxy, the observed correlation will be that true correlation plus the spurious correlation due to Z:

$r_{oxy} = r_{txy} + r_{xz} r_{zy}$

The product of rxz multiplied by rzy can be positive or negative. If the true correlation, rtxy, is positive, then a positive spurious correlation adds to it, inflating the observed correlation above the true level. If the true correlation is positive, a negative spurious correlation offsets it, deflating the observed correlation below the true level. The fact that the spurious correlation is a product of two correlations means that small correlations produce only low products. For instance, if rxz and rzy are both only .3, their product, the spurious correlation, is only .09. However, the confounding is determined by the size of this spurious correlation relative to the true correlation. Continuing the example, if the true correlation is .6, a spurious correlation of .09, if negative, lowers the observed correlation to .51, which is trivially different from the true correlation.
However, a true correlation of .1 would, if confounded by a spurious negative correlation of –.09, lead to a confounded correlation of only .01, which would lead to a false conclusion of a near-zero relationship. Thus, the degree of confounding is a function of the product of the correlations with Z and the true correlation, taken together. In this book, we are often concerned with the relationship between organizational performance and one of its causes. Because there are many causes of organizational performance, the focal cause will usually be weak. Moreover, there are many other causes that may confound it, possibly jointly. Thus, it becomes feasible that the focal cause of performance is confounded by another cause(s) of performance.

Confounding by Definitional Connection

As discussed in previous chapters, some variables are difference score variables—for example, X – Y. Such difference scores can have correlations with each of their constituent variables, which arise from connections created by how the difference score is defined. For instance, profit, P, is sales revenue, S, minus costs, C (i.e., S – C), and so P is liable to be positively correlated with sales revenue, S, and to be negatively correlated with costs, C. If either of these constituent variables is, in turn, correlated with an organizational characteristic, G, whose relationship with profit is being analyzed, rgp, there will be a spurious correlation between G and profit due to that constituent variable. This spurious correlation will be equal to the product of the correlations between G and S, and S and P,

$r_{gp} = r_{gs} r_{sp}$

or between G and C, and C and P,

$r_{gp} = r_{gc} r_{cp}$

Definitionally, there is potential for rsp to be positive, so, if rgs is also positive, then there is a spurious positive correlation, rgp, between G and P. Similarly, there is potential for rcp to be negative, so, if rgc is also negative, then there is a positive spurious correlation between G and P. Therefore, the two spurious correlations would reinforce each other, leading to a larger spurious correlation, which would make a troublesome confound more likely. For example, if G is correlated .5 with S, and S is correlated .7 with P, then the spurious correlation between G and P is .35 (= .5 × .7). Similarly, if G is correlated –.4 with C, and C is correlated –.6 with P, then the spurious correlation between G and P is .24 (= –.4 × –.6). These two spurious correlations would reinforce each other to produce a larger spurious correlation. If both spurious correlations were independent of each other, they could be simply added together to give their joint effect:

$r_{gp} = r_{gs} r_{sp} + r_{gc} r_{cp}$

However, the spurious correlation due to C is not wholly independent of the spurious correlation due to S, because S and C are correlated; therefore, only some of the spurious correlation due to C adds to that due to S. In the example, S correlates .7 with P, and C correlates –.6 with P. Therefore, S and C are connected by a correlation of –.42 (= .7 × –.6), which means that they share 18 percent common variance (i.e., the correlation squared). Thus, the spurious correlation due to C adds 82 (= 100 – 18) percent of its spurious correlation to that due to S alone. Therefore, this proportion needs to be applied as a weight to the spurious correlation due to C when it is being added to the spurious correlation due to S. The weight is the variance that is not common to C and S. In this example, the combined spurious correlation due to S and C is .35 + (.24 × .82) = .35 + .20 = .55. Therefore, because the positive spurious correlation is inflating the observed correlation, the observed correlation is equal to the true correlation inflated by .55:

$r_o = r_t + .55$

If the true correlation had been .3, the observed correlation would have been .85, so that the spurious correlations would have grossly inflated it, leading to the false conclusion that it is much stronger than it is and that there is no other important cause of the dependent variable, whereas the other cause(s) are stronger than the focal cause. Conversely, if the two spurious correlations had both been negative, deflating the observed correlation, the observed correlation would have been equal to the true correlation deflated by –.55:

$r_o = r_t - .55$

If the true correlation, rt, had been .3, the observed correlation would have been –.25, so that the spurious correlations would have completely masked it, producing a negative observed correlation, hence leading to the false conclusion that there is a negative effect of the cause on profit, whereas the effect of the cause is positive. The general formula for the observed correlation, rogp, between organizational characteristic, G, and profit, P, after confounding by the correlation between G and P due to definitional connections between S, C, and P, is

$r_{ogp} = r_{tgp} + r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp})$
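The weighting logic above can be sketched in a few lines. This is an illustrative calculation using the text's example values, where the S-to-C correlation is derived from the two correlations with P; in general, rsc would be known directly. The function name is our own.

```python
def definitional_confound(r_gs, r_sp, r_gc, r_cp):
    """Combined spurious correlation between G and profit from the
    definitional connections of profit to sales (S) and costs (C).
    The C term is weighted by the variance not shared between S and C."""
    r_sc = r_sp * r_cp          # correlation linking S and C through P (as in the text's example)
    weight = 1 - r_sc ** 2      # variance not common to S and C
    return r_gs * r_sp + weight * (r_gc * r_cp)

# Example from the text: .35 + (.24 x .82) = about .55
print(round(definitional_confound(0.5, 0.7, -0.4, -0.6), 2))   # 0.55
```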

The error in the observed correlation between organizational characteristic, G, and P, after confounding by the correlation between G and P due to definitional connections between S, C, and P, is

$r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp})$

Confounding by Prior Performance

When the dependent variable is organizational performance, there is a likelihood that reverse causation exists that confounds it. Poor organizational performance is often the trigger for change in organizational characteristics such as structure (Chandler 1962; L. Donaldson 1987), so when a positive effect of such a characteristic is examined, it is likely to be offset by a negative effect of performance on the characteristic. This source of error can be substantial, in that the negative correlation due to the effect of prior organizational performance can be as great as the true correlation due to the positive effect of the organizational characteristic on organizational performance (L. Donaldson 1987). Therefore, the confounding by prior performance can be great enough that the positive effect is completely masked, leaving a zero or possibly negative correlation. This gives managers the misleading impression that the organizational characteristic is not a positive cause of the performance of their organization. This, in turn, may lead the managers to underutilize or abandon that causal lever, unwittingly sacrificing organizational performance.

The effect of organizational performance on an organizational characteristic (e.g., structure) means that performance subsequently leads the organizational characteristic variable to take a certain level. Therefore, it is prior performance that is negatively affecting the correlation between that organizational characteristic and performance. In contrast, it is the effect of the organizational characteristic on subsequent performance that is positively affecting the correlation between that organizational characteristic and performance. If an analysis across time periods could be conducted, the negative effect of prior performance on the organizational characteristic could be separated from the positive effect of the organizational characteristic on subsequent performance, so there would be no confound. However, in a cross-sectional analysis of data from the same time, both effects are present simultaneously and confuse each other. The cross-sectional correlation records both the negative impact of performance at an earlier period on the later level of the organizational characteristic and the positive effect of prior levels of the organizational characteristic on later performance. Thus, the confounding variable is the correlation, rppg, between prior performance, PP, and the organizational characteristic, G. The equation of this confounding of the observed correlation, rogp, between an organizational characteristic, G, and organizational performance, P, is

$r_{ogp} = r_{tgp} + r_{ppg}$

Given the negative values of rppg, the true correlation, rtgp, is reduced, giving an observed correlation that is either a lower positive, or zero, or negative—all of which understate any true positive correlation.

Combining Multiple Confounds

All three confounds—due to other causes, definitional connections, and prior performance—could be present together in an organizational data set being viewed by a manager. An observed correlation between, say, a divisional characteristic variable, G, and divisional profitability, P, could be subject to confounding by any other cause of divisional profitability, Z; definitional connections to sales, S, and costs, C; and prior divisional performance. The equation showing how the observed correlation is determined by the true correlation and multiple confounds acting together is

$r_{ogp} = r_{tgp} + r_{gz} r_{zp} + r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp}) + r_{ppg}$

The prior performance correlation would tend to be negative, so would offset any true positive correlation, but the other two sources of confounding could each produce either positive or negative errors, so could restore or mask any true positive correlation. Because the errors produced by the confounds are independent of the value of the true correlation, the overall error from the confounds is just their additive sum:

$\text{Error in } r_{ogp} = r_{gz} r_{zp} + r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp}) + r_{ppg}$

Any single confound that is positive in its value produces a positive error in the observed correlation. Similarly, any confound that is negative in its value produces a negative error. The overall error in the observed correlation from all the confounds is the net of their positive and negative effects. For instance, if the confound from Z was +.1, that from S and C combined was +.18, and that from prior performance was –.5, their overall confound would be –.22.

Overall Error Without Sampling Error

For the correlation of the effect of a cause, G, on organizational profit, P, the combined error from unreliability and range, together with multiple confounds, in the resulting correlation, rogp, is

$r_{tgp} \left[ \sqrt{r_{gg}} \sqrt{r_{pp}} \left( \frac{SD_d}{SD_u} \right) - 1 \right] + r_{gz} r_{zp} + r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp}) + r_{ppg}$

In this equation, the unreliability and range restriction (if any) reduce the magnitude of the true correlation, but do not change its sign. The confounds contribute their error directly, independently of the true correlation. The confounds can be greater than the true correlation. They can be either positive or negative. When there is range restriction, the effect of it and unreliability produce a negative error that reduces the true correlation. A positive correlation is further reduced if the confounds are net negative, while a negative correlation is further reduced if the confounds are net positive. When there is range extension, the effect of it and unreliability may produce a positive error that inflates the true correlation. A positive correlation is further inflated if the confounds are net positive, while a negative correlation is further inflated if the confounds are net negative. As has been our focus, if the true correlation is positive, then the unreliability and range restriction lower its positive value. In this scenario, confounds that create negative spurious correlations add to the total error because they mask the true positive correlation even more. In contrast, in this scenario, if the confounds create positive spurious correlations, then they offset the error from unreliability and range restriction, thus subtracting from the total error.

Sampling Error

Additionally, there is sampling error, which creates a range of correlations around the correlation that is produced by the true correlation after being moderated by unreliability and range artifacts and then affected by the confounds. The sampling error of a correlation can be stated as

$\frac{(1 - r^2)}{\sqrt{N}}$

(Chambers 1964, 61), where r is the correlation and N is the number of observations. This formula has the attraction of being stable and easy to use in equation building. It holds for most values of N. However, it does not hold for a small N, from about 15 and downward, so in that sense it should be understood as only an approximation for modeling purposes. Chambers cautions that this simple formula for error from sampling is only "approximately true when N is large and the values of r are small or moderate in size" (1964, 61). Blalock (1972, 401) recommends using the well-known Fisher's logarithmic z transformation, because that formula gives the confidence intervals of the correlation more accurately. However, that formula is cumbersome because it involves transforming r into z and then z back into r, so that it does not seem to lend itself to a tractable formula that could be used for theory construction and discussion. In contrast, the Chambers formula allows expression of the confidence intervals for a correlation in a formula that is tractable and relatively easy to understand and discuss. How much difference do the two methods make to the estimation of the amount of sampling error? Fortunately, the simple formula gives confidence intervals that are a good approximation to the z formula. The 95 percent confidence intervals around a correlation are similar for the two formulas (see Appendix on page 253). Overall, the simple formula approximates the z formula quite well. However, it should be borne in mind that the formula used here to represent sampling error is increasingly distorted for N smaller than 15 and tends to overstate the amount of error. With this proviso, the Chambers formula provides a way of capturing sampling error in its magnitude and the variables that cause it, r and N. Thus, the sampling error in a correlation can be represented by the equation

$\frac{(1 - r_{ogp}^2)}{\sqrt{N}}$
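The claim that the simple formula approximates the Fisher z intervals can be checked directly. A small sketch follows, assuming a 95 percent interval (roughly plus or minus 1.96 standard errors); the function names are our own.

```python
import math

def ci_simple(r, n):
    """95% interval using the simple standard error (1 - r^2)/sqrt(N)."""
    se = (1 - r ** 2) / math.sqrt(n)
    return (r - 1.96 * se, r + 1.96 * se)

def ci_fisher_z(r, n):
    """95% interval via Fisher's z transformation, with SE 1/sqrt(N - 3)."""
    z = math.atanh(r)
    half = 1.96 / math.sqrt(n - 3)
    return (math.tanh(z - half), math.tanh(z + half))

print(ci_simple(0.3, 100))     # roughly (0.12, 0.48)
print(ci_fisher_z(0.3, 100))   # roughly (0.11, 0.47)
```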

Sampling error can be considered to come after the other errors, bringing a final indeterminacy to the end result. Sampling error produces a range of correlations that is centered upon the correlation that emerges after the true correlation has been acted upon by the other errors. These other errors can be considered to turn the true correlation into an observed correlation, and the sampling error is around its value. As the discussion has emphasized, sampling error creates a range of possible values. The quantity given by the sampling error term is for the upper and lower bound, the range in which all but the most extreme correlations will fall—that is, the most likely range. Any actual correlation could be anywhere within this range—that is, it could have this value or a value ranging in magnitude down to zero.

Overall Error With Sampling Error

To give the final error, we take the errors from unreliability, range, and confounds and add in the error from sampling. The amount of error from unreliability, range, and confounds, together, can be summarized as the difference between the observed correlation that those sources of error produce and the true correlation (consistent with the sign convention used throughout, in which error is the observed value less the true value). The error from sampling is then added to this:

Overall error = (rogp – rtgp) ± sampling error of rogp

The ± sign is used because the random error from sampling will sometimes be positive, meaning increasing a true positive correlation or making a true negative correlation appear weaker or positive, and sometimes be negative, meaning decreasing a true positive correlation or making it appear negative or making a true negative correlation appear stronger. We may say that the overall error in a correlation is the sum of errors from unreliability, range disparity, and confounding, plus or minus sampling error. To codify:

Overall error in a correlation = unreliability error + range disparity error + confounding errors ± sampling error

As already seen, errors from unreliability and range disparity are represented as

$r_{tgp} \left[ \sqrt{r_{gg}} \sqrt{\frac{r_{ss} + r_{cc} - 2 r_{sc}}{2 - 2 r_{sc}}} \left( \frac{SD_d}{SD_u} \right) - 1 \right]$

confounding errors are represented as

$r_{gz} r_{zp} + r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp}) + r_{ppg}$

and sampling error is represented as

$\frac{(1 - r_{ogp}^2)}{\sqrt{N}}$

Therefore, putting these equations together, the total, overall error in a correlation is given by the overall methodological equation:

$\text{Overall error} = r_{tgp} \left[ \sqrt{r_{gg}} \sqrt{\frac{r_{ss} + r_{cc} - 2 r_{sc}}{2 - 2 r_{sc}}} \left( \frac{SD_d}{SD_u} \right) - 1 \right] + r_{gz} r_{zp} + r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp}) + r_{ppg} \pm \frac{(1 - r_{ogp}^2)}{\sqrt{N}}$

The overall methodological equation contains three reliability coefficients, two standard deviations, ten correlations, and the number of observations, N. It says that the overall error in a correlation equals the errors from unreliability and range disparity, plus confounding errors and sampling error.

We can substitute for some of these data properties (e.g., rgg) the organizational characteristics that shape the magnitude of many of these errors. Reliability is a function of the number of measures, M, that are aggregated to form the score on a variable (e.g., an organizational characteristic, such as organizational structural formalization) (Pugh et al. 1968). Reliability increases as M increases, but at a decreasing rate; therefore, we can use the square root of the M term. Hence, $\sqrt{M_{gg}}$, $\sqrt{M_{ss}}$, and $\sqrt{M_{cc}}$ can be substituted in place of the reliability coefficients rgg, rss, and rcc, respectively. The confounding by another factor tends to be less when the number of data sets, K, being aggregated through averaging is greater. But the reduction of confounding through averaging occurs at a decreasing rate as K increases. Therefore, the reduction in confounding is expressed as a factor of $1/\sqrt{K}$. The confounding is the product term, rgz rzp, but this varies in magnitude and sign from facet to facet. For aggregation to reduce this confound, the positive and negative confounds must about cancel each other out. The confounding has a maximum value in a data set where rgz rzp is the highest across the data sets. This declines with more studies, to become, for K studies, $(r_{gz} r_{zp})/\sqrt{K}$. Hence, we can substitute $(r_{gz} r_{zp})/\sqrt{K}$ for $(r_{gz} r_{zp})$. The number of observations, N, tends to be greater when the organizational size, O, is greater, so O can be substituted for N. Organizational size is the number of members of an organization—for example, employees in a business firm. The number of observations is equal to organizational size if all the organizational members are used in an analysis—that is, if the analysis aggregates across all the members. However, as emphasized herein, some analyses aggregate only within an organizational subunit, so that the aggregation is only partial and the number of observations is smaller. Hence, O is to be understood in the present formula as referring to the organizational size created through the actual aggregation used in an analysis. Note that M, K, and O all refer to the numbers of things aggregated: measures, data sets, and organizational members. Making all these substitutions, we obtain the overall theoretical equation:

$\text{Overall error} = r_{tgp} \left[ \sqrt[4]{M_{gg}} \sqrt{\frac{\sqrt{M_{ss}} + \sqrt{M_{cc}} - 2 r_{sc}}{2 - 2 r_{sc}}} \left( \frac{SD_d}{SD_u} \right) - 1 \right] + \frac{(r_{gz} r_{zp})}{\sqrt{K}} + r_{gs} r_{sp} + (1 - r_{sc}^2)(r_{gc} r_{cp}) + r_{ppg} \pm \frac{(1 - r_{ogp}^2)}{\sqrt{O}}$

In the overall theoretical equation, error is a function of ten correlations, two standard deviations, and M, K, and O. However, the value of the observed correlation, rogp, is already given by other terms—the true correlation, unreliability and range artifacts, and confounds—so overall error is given by nine independent correlations plus two standard deviations, M, K, and O. Overall error is greater, the larger are rtgp, rsc, rgz, rzp, rgs, rsp, rgc, rcp, and rppg, and the smaller are M, K, O, and SDd (relative to SDu). That is, the overall error is greater, the larger are the correlations and the less numerous are M (number of measures of the organizational attribute, sales, or costs), K (number of data sets), and O (number of organizational members). The error in the correlation of an effect on profit is greater, the larger are its true correlation, the correlation between sales and costs, and the confounding correlations, and the smaller are the aggregation factors and the range (relative to the universe). Alternatively, range restriction and range extension can both be considered forms of range disparity that increase error. The overall proposition is:

12.1 Error in a correlation is greater, the larger are the correlations (true, sales-costs, and confounding) and range disparity, and the smaller are the numbers of measures of organizational performance, data sets, and organizational members.

In mathematical and verbal forms, this is the statistico-organizational theory expressed formally. Its derivation has been to take the formulas that predict errors, simplify and combine them, and then change some of their terms into the properties of the organization or analysis that shape the data used to quantify the terms in organizations.

Conclusions

In this chapter, statistico-organizational theory has been expressed in equations in order to heighten the clarity and precision of the theory. Whereas errors are often treated as murky and vague, methodological principles can be used to render them explicit and definite. In psychometrics, the treatment of errors starts with an underlying true score or correlation that is then altered by errors. This is captured in causal models of how true correlations come to take their observed values. The disparity between the true and the observed correlation gives us the error in the correlation. Each major error therefore can be captured in an equation whereby the underlying true value becomes manifest in its observed value, through the mediation of properties of the data (e.g., reliability coefficients and standard deviations). This chapter has presented an equation for each error, sometimes simplifying them as made apposite by their intended context and in the interests of theory construction. These separate equations have then been combined into a single equation that gives the overall error, which is the overall methodological equation.

The discussion herein has distinguished between two sorts of errors: determinate and indeterminate. Most of the errors considered herein are determinate, in that the amount of error produced in each situation by the data properties can be stated exactly. However, sampling error is indeterminate, in that it creates a range of possible values, so that the actual value within that range cannot be predicted. The overall methodological equation predicts what the correlation will be, and then sampling error creates a range of possible final values around that value.

The equations developed here reflect the managerial foci of this book: attention is given to analyses aimed at finding the true causes of organizational performance, especially of profitability. Accordingly, appropriate terms are included in the overall methodological equation, such as reverse causation from performance to organizational characteristic, and also profit as a difference score, which leads to additional terms for unreliability and confounding.

The overall methodological equation contains its defining elements: reliability coefficients, standard deviations, correlations, and the number of observations. These are the data properties that shape the organizational data and thereby the errors from the data in managerial analyses. However, some of these terms can be reexpressed into the organizational characteristics that affect the data properties. This reexpressed equation is the overall theoretical equation. The overall theoretical equation shows how the error in a correlation between a cause and organizational performance is produced by the true correlation, the correlation between sales and costs, the confounding correlations, and the number of things aggregated (measures, data sets, and organizational members), together with the standard deviation in the data relative to that in the universe. The overall theoretical equation states how these variables interact to give the error in the final correlation. In this way, we can appreciate how the multiple sources of error discussed in the body of this book combine to create the overall error. Overall error is greater when the correlations (true, sales-costs, and confounding) are larger, and the aggregation variables (number of measures, data sets, and organizational members) and the data range (relative to the universe) are smaller. As seen in the previous chapter, these data properties (the correlations and range) and the organizational characteristics (the aggregation variables) can combine to produce substantial error. They can even produce enough error to make a true positive effect on organizational performance of one of its causes appear to be nil or negative, completely misleading the managerial analyst. The formalization of the argument into equations should clarify how these errors combine to form the overall error in what managers see in their data.

14 How Managers Can Reduce Errors

In this book, we have talked about many different kinds of errors that managers can make when they seek to draw inferences from numerical data. The emphasis has been on how the situation determines the errors made. This focus is in keeping with the aim of the book to contribute to organizational theory as an explanatory or positive inquiry, rather than to offer prescriptive guidance. However, managers can reduce such errors by taking certain steps. Many of these steps are currently well laid out in books on business statistics that aim to offer prescriptive advice to managers (e.g., Moore et al. 2009). The present book does not seek to duplicate such efforts. However, the errors discussed in this book can be serious, adversely affecting organizational performance. Some of them require remedies that are not always dealt with in the conventional business statistics books. Therefore, we will briefly revisit many of the major errors that have been discussed herein and indicate steps managers can take to avoid or reduce them. We will discuss them in the order in which the error sources were discussed in the book.

Reducing Sampling Error

As discussed in Chapters 3 and 4, small numbers of observations lead to sampling error. The solution is to avoid small numbers and, instead, base the analysis upon larger numbers, which will minimize sampling error. Small numbers of observations tend to occur in small-sized organizations. Such small organizations can increase the number of observations by combining their data with data from other organizations, such as those in the same industry. This may be done, for instance, through industry associations, government bureaus, or consultants. Large organizations may possess more numerous data but fail to draw them together, leaving them in branches, divisions, or subsidiaries—that is, leaving the data disaggregated. The solution is to aggregate the data by drawing together all observations from across the organization's branches, divisions, subsidiaries, affiliates, and so on.

Organizations in small countries will tend to be small in size and so suffer more sampling error. They may be able to obtain a larger number of observations by gaining access to data from numerous other small countries or some larger countries—for instance, through international associations, international governing bodies, or consulting firms. Small subsidiaries of multinational corporations may be able to obtain data from larger subsidiaries in larger countries or from numerous small subsidiaries in small countries. Provision of data observations to small- or medium-sized subsidiaries, whether abroad or domestically, is a way in which the corporation can "parent" (Goold, Campbell, and Alexander 1994) and add value to its subsidiaries. Small organizations can also tap into the research results of organizations that are able to use large numbers of observations. For instance, the U.S. military does studies about its human resource management practice that can have more than 40,000 subjects (e.g., Schmidt, Hunter, and Pearlman 1980). The results of some of these studies are published, making them freely available to anyone in the world through the social science literature.

Reducing Measurement Error

In Chapters 5, 6, and 7, we discussed how measurement error could enter an analysis, creating lower reliabilities, which, in turn, lead correlations to be understated. A variable may simply be measured unreliably. Alternatively, two variables may be measured reliably, but taking their difference may create unreliability. The solution to the first problem is to increase the reliability of the measurement of the variable. The solution to the second problem entails increasing reliability by using an alternative to difference scores.

Increasing the Reliability of Measurement

The reliability of measurement of a variable may be increased by using items and definitions that are less subjective and less prone to biases, such as giving the socially approved, rather than the true, answer. Reliability will also tend to be increased if each subdimension is assessed by information from a source that is different from that used for the other subdimensions, providing a diversity of formats and informants. For example, for organizational performance, profitability figures could come from the accounts department, whereas data on employee morale could come from the human resource department. In particular, it may be more difficult for upper managers to manipulate data from many departments than from one department. Furthermore, reliability may be increased by using information sources that are independent from management—for instance, product quality rankings from an independent outside agency, such as J.D. Power for the automobile industry.

In psychometrics, another way to increase the reliability of a variable is to include more than one item in the scale that is being used to measure it. A number of items are used to measure the variable, taking their sum as the measure. While any one item has measurement error, that error will tend to be random relative to the error of the other items, so that a scale's total score tends to wash out the errors. Thus, multi-item scales are a way to increase the reliability of measurement. Van de Ven and Ferry (1980) argue that considerable improvement in the reliability of a variable can be obtained by having as few as five items to measure it.

Turning to the organizational management context, similar logic holds. A performance measure could be made more reliable by having multiple subdimensions and then taking the total score—for example, using the total score to assess the performance of an organization, or one of its subunits, and the manager responsible for it. Regarding profitability, this might be done by taking multiple measures of profitability and then taking their total or average to give a composite score. However, as shown in Chapter 5, to really reduce measurement error, these multiple measures of profit need to have errors that are negatively correlated, which may not be feasible.

The balanced scorecard approach (Kaplan and Norton 1996) provides such a multi-item approach, in that multiple subdimensions of organizational performance, such as financial performance and customer satisfaction, are identified and included in the final performance metric. This has the advantage of combining attainment on various fronts. The concept of organizational performance is broader than just profitability. Such a conceptual shift could produce more reliability in measures of organizational performance. As stated, an organization may evaluate itself or its subunits by a range of performance variables that go beyond just profit. An example of a large corporation that evaluates its divisions not just on their profitability but also on a broad range of indicators is Cooper Industries, a U.S. diversified industrial products corporation. An executive vice president (EVP) for operations heads each of the worldwide operating segments (Collis and Stuart 1991, 14): "EVPs did not focus on any single measure in evaluating a division's performance, but looked at all financial data that would indicate if the strategy was on course, such as sales, profitability, growth, cash flow, and return on assets. These financial data were supplemented with operating data, including order rates, which served as indicators of the level of product demand for the upcoming period, and first-pass line fill, or service-level data, which provided information on the number of orders that could be filled from stock."

In summary, a way to overcome the unreliability of organizational performance is to use a balanced scorecard. Instead of measuring performance through one single variable, several measures, including innovation, would be used. This would be the equivalent of using a multi-item scale in psychometrics; it would increase the reliability of the measurement of corporate performance. However, it would not necessarily improve the reliability of profit, so that, while the reliability of organizational performance may be improved by using multiple measures, the reliability of profit as a specific aspect of organizational performance may not.

Alternative to Difference Scores in Assessing Effects of Profit

In analyzing measurement error in Chapters 5, 6, and 7, we have seen that measurement error can be increased considerably each time the variable being analyzed is produced by taking the difference between two variables. For example, profit is the difference between sales and costs; mathematically, profit is sales minus costs. The problem (as seen in Chapters 5 and 6) is that a difference score can have much greater unreliability than the two variables from which it is composed, so profit is typically much less reliable than either sales or costs. It is therefore attractive to use sales and costs when we want to understand profit, because sales and costs are more reliable than profit.

Such an approach of using sales and costs in an analysis focusing on profit may be derived from a method put forward by Edwards and Parry (1993). Edwards and Parry argue that if we want to know the relationship between a difference score, X – Y, and some other variable of interest, Z, we can find it by examining the relationship between each of the constituent variables, X and Y, of the difference score, and that variable, Z. This is possible through a simple algebraic transformation of the difference score:

Z = b(X – Y)
Z = bX – bY

Therefore, empirically regressing Z on both X and Y simultaneously yields b, which is the coefficient that tells the effect not just of X and Y on Z, but of their difference score, X – Y, on Z. Moreover, the two bs should be identical in numerical value and opposite in sign. These two conditions constitute evidence that b is measuring the effect on Z of the difference between X and Y, rather than just measuring the main effect on Z of X on its own (or the main effect on Z of Y on its own).

Conventionally, the relationship between profit and some other variable, W, is ascertained by relating profit to the variable, W. However, this has the problem that profit is a difference score (sales minus costs) and so tends to be unreliable, much more so than either sales or costs. Applying the new method to profit, we proceed in the following way. Suppose we are interested in showing how far divisions' profitabilities are rewarded by giving more new investment capital to the divisions that have the higher profits. This is done by looking, in a regression analysis, at the effect of divisional profit on new investment capital. W is the new investment capital allocated to the division by the corporation, and divisional profit is for the time period two years earlier. Using the Edwards and Parry (1993) transformation:


Reducing Errors From Range Artifacts

In Chapter 8, we saw that when the range in the data differs from the range in the population (or universe), the observed correlation differs from the true correlation. This error comes from range artifacts of two kinds: range restriction and range extension. Both problems can be remedied.

Reducing Error From Range Restriction

Range restriction occurs when the range in the data is smaller than the range in the population. It leads to the observed correlation understating the true correlation. Range restriction is best cured by eliminating it, by ensuring that the data are drawn so that they represent the variation in the population of interest. Restriction in range often comes about because the study involves only one set of people in one place, whereas obtaining adequate range would require studying a broader set of people from multiple locations. If data cannot be found that represent the population, then a correction formula can be applied that estimates what the correlation would have been if the broader range representative of the population had been used. This increases the observed correlation. The correction involves multiplying the observed correlation by the ratio of the standard deviations of the population and the study data, plus some other terms (Hunter and Schmidt 2004).
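As a sketch, the standard correction for direct range restriction (Thorndike's "Case II" formula, the simplest member of the family described by Hunter and Schmidt 2004; the standard deviations below are illustrative) can be written as:

```python
import math

def correct_range_restriction(r_obs: float, sd_pop: float, sd_study: float) -> float:
    """Correct an observed correlation for direct range restriction.
    U > 1 adjusts a range-restricted correlation upward; U < 1 adjusts
    a range-extended correlation downward (see the next section)."""
    U = sd_pop / sd_study
    return (U * r_obs) / math.sqrt(1.0 + (U**2 - 1.0) * r_obs**2)

# A study of surviving firms only: the predictor's SD is 8 in the
# data but 12 in the population of interest, so r is corrected up.
print(correct_range_restriction(0.25, sd_pop=12, sd_study=8))   # ~0.36
# The same formula shrinks a correlation inflated by extreme-case
# sampling, where the study SD exceeds the population SD.
print(correct_range_restriction(0.50, sd_pop=8, sd_study=12))   # ~0.36
```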


Reducing Error From Range Extension

Conversely, range extension occurs when the range in the data is larger than the range in the population. It leads to the observed correlation overstating the true correlation. Extension in range often comes about because the study is done by taking extreme cases. This succeeds in its intention of capturing variation of interest, but at the cost of exaggerating the correlations involving that variable. Again, range extension is best cured by eliminating it, by ensuring that the data are drawn so that they represent the variation in the population of interest. This means capturing, in addition to the (often infrequent) extreme cases, the (often more numerous) middling cases. If data cannot be found that represent the population, then a correction formula can be applied that estimates what the correlation would have been if a smaller range, more representative of the population, had been used. This decreases the observed correlation. The correction involves multiplying the observed correlation by the ratio of the standard deviations of the population and study data, plus some other terms (Hunter and Schmidt 2004).

Reducing Errors From Confounding

Three sources of confounding have been distinguished in this book—definitional connections and reverse causality in Chapter 9, and confounding by causes other than the focal cause in Chapters 10 and 11—each of which has its own remedy, as will now be discussed. We have particularly emphasized the sorts of confounds that can be troublesome to managers trying to assess the causes of organizational performance, so we will emphasize their reduction.

Reducing Errors From Confounding by Definitional Connections

When a variable is itself composed of two variables, it can have a connection with those variables that works to obscure its true effect. In particular, when a variable is the difference between two variables, it tends to be correlated with them. Profit is sales minus costs, and so profit tends to correlate positively with sales. Therefore, if sales are also correlated with some other variable, then there will be a spurious correlation between that variable and profit, and the true relationship between that variable and profit will be confounded. Similarly, costs tend to correlate negatively with profit, so that, again, a spurious correlation can occur that confounds the true relationship between profit and some other variable. A remedy is to control for sales and costs in analyses of the relationship between profit and some other variable. For instance, by including sales and costs as well as profit in multivariate analyses that examine the effect of profit on some other variable (e.g., divisional profit on the bonuses of the divisional managers), the observed effect of profit will not then be confounded by its constituent variables, sales and costs.
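A sketch with hypothetical data illustrates the definitional confound and its remedy. Bonuses here are driven by sales alone, yet a naive regression shows a sizable "effect" of profit on bonuses, which disappears once a constituent of profit is controlled. (One caveat of the sketch: when profit is defined exactly as sales minus costs, controlling for sales alone suffices, since costs is then implied, and entering all three variables would be perfectly collinear.)

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000
sales = rng.normal(100, 20, size=n)
costs = rng.normal(70, 15, size=n)
profit = sales - costs
bonus = 0.5 * sales + rng.normal(scale=5, size=n)  # rewards sales only

# Naive regression of bonus on profit alone: spuriously positive,
# because profit is definitionally correlated with sales.
X1 = np.column_stack([np.ones(n), profit])
b1, *_ = np.linalg.lstsq(X1, bonus, rcond=None)
print(f"profit coefficient, no controls:  {b1[1]:+.2f}")  # ~ +0.32

# Adding sales as a control (costs is implied by profit and sales):
X2 = np.column_stack([np.ones(n), profit, sales])
b2, *_ = np.linalg.lstsq(X2, bonus, rcond=None)
print(f"profit coefficient, with control: {b2[1]:+.2f}")  # ~ 0.00
```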


Reducing Errors From Confounding by Reverse Causality

Managers are often interested to know the effect of some organizational characteristic (e.g., strategy or structure) on organizational performance. They look for a positive association between the organizational characteristic and organizational performance. However, many such organizational characteristics are adopted only as a result of deterioration in organizational performance, which entails a negative association between organizational performance and the organizational characteristic. Thus, the association between an organizational characteristic and organizational performance can be confounded by the reverse causal process running from organizational performance to the organizational characteristic. This means that the positive correlation between an organizational characteristic and organizational performance is to some degree masked by the negative correlation between organizational performance and the organizational characteristic.

The remedy is to control for the reverse causal effect of organizational performance on the organizational characteristic when looking for the effect of the organizational characteristic on organizational performance. Cross-sectional analyses, in which all variables are measured at the same point in time, do not permit this and thereby suffer the confounding. Therefore, it is necessary to measure variables at more than one time. To obtain an unconfounded estimate of the effect on organizational performance of an organizational characteristic, it is necessary to measure the organizational characteristic (the cause) before organizational performance (the effect). From experience, when analyzing the effect on organizational performance of an organizational characteristic, a lag of two years seems to work best (L. Donaldson 1987; Rogers 2006), reflecting the time necessary for improvements in the organizational characteristic to produce improvements in sales and costs, which are then recorded in profits in company annual financial reports. Thus, the organizational characteristic should be measured about two years before its effect (i.e., subsequent organizational performance). Any effect of organizational performance at that time on the organizational characteristic will not occur for some time hence, so it should not confound the results.
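A stylized sketch of the lagged design follows (synthetic data; the adoption rule, the true effect of +0.5, and the noise levels are assumptions for illustration). The cross-sectional correlation is negative even though the characteristic genuinely helps, while the lagged regression recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000                                    # business units
perf_t0 = rng.normal(size=n)                 # performance before adoption
# Poorly performing units are the ones that adopt the new structure:
adoption = -0.6 * perf_t0 + rng.normal(scale=0.8, size=n)
# Two years on, the structure genuinely helps (true effect +0.5) and
# past performance also persists:
perf_t2 = 0.5 * adoption + 0.5 * perf_t0 + rng.normal(scale=0.5, size=n)

print(np.corrcoef(adoption, perf_t0)[0, 1])  # ~ -0.6: reverse causality
print(np.corrcoef(adoption, perf_t2)[0, 1])  # ~ +0.3: benefit partly masked

# Lagged design: regress later performance on the earlier characteristic,
# controlling for performance at the time of adoption.
X = np.column_stack([np.ones(n), adoption, perf_t0])
coef, *_ = np.linalg.lstsq(X, perf_t2, rcond=None)
print(coef[1])                               # recovers roughly +0.5
```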


Reducing Errors From Confounding by Other Causes

When a focal cause is correlated with some other variable that is itself correlated with the dependent variable, that other variable confounds the correlation between the focal cause and the dependent variable. Where the dependent variable is organizational performance, there are likely to be many other causes that could confound a focal cause. An effective remedy for confounding by other causes is aggregation. The potential confounding of a focal cause of organizational performance by all the other numerous and collectively dominant causes, taken together, is great. In practice, however, it is not so strong, because the other causes tend to be correlated only weakly with the focal cause; furthermore, some are correlated positively and some negatively, so that they considerably offset each other. This produces only a weak confounding of the focal cause by other causes of organizational performance. Aggregating across data sets further reduces confounding, because the confounding within each data set is weak, as just discussed, and the confounding is positive in some data sets and negative in others, so that the confounds offset each other. As a result, the average confounding across data sets can be trivial. Therefore, it may not matter if no control for confounding is made, because the averaging has already effectively eliminated almost all of the confounding. Hence, a manager who sits some levels up in the organizational hierarchy, looking at a figure that is essentially an average of the data that have flowed up from various subunits, is looking at a figure that is free of most confounding.

An attempt may be made to control confounding by running an experiment within the organization and measuring confounding by a control group. However, this seems to bias the results downward, falsely reinforcing skepticism and caution. Therefore, organizational experiments are not recommended as a solution to the problem of confounding; the strategy of averaging, as described in the preceding paragraph, is preferred.
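A stylized sketch of why averaging works (the sizes of the confounds and of the sampling error are assumptions for illustration): each individual data set can mislead badly, yet the mean across data sets sits close to the true effect.

```python
import numpy as np

rng = np.random.default_rng(4)
true_effect = 0.30
n_sets = 40                              # departments, branches, etc.

# In each data set the observed effect equals the true effect plus a
# confound (positive in some settings, negative in others) plus
# sampling error from the small local N.
confounds = rng.normal(scale=0.15, size=n_sets)
sampling_error = rng.normal(scale=0.10, size=n_sets)
observed = true_effect + confounds + sampling_error

print(observed.min(), observed.max())  # single sets can badly mislead
print(observed.mean())                 # the average sits near 0.30
```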
In summary, managers looking at organizational performance data that have problems from confounding—by reverse causality, definitional connections, or other causes—can avoid the problems by measuring organizational performance subsequent to its causes, by including sales and costs in analyses of profit, and by averaging across data sets.

In conclusion, while certain situations propel managers to make errors when they make inferences from the data in their organization, the use of the approaches outlined in this chapter may enable them to avoid some of the errors or at least reduce them.
15 Conclusions

This book has sought to offer a new theory of organizational management: statistico-organizational theory. This concluding chapter will present a summary, reviewing the theory from the situational factors that determine the data properties that create errors, through the intervening methodological problems, to the resulting errors. Finally, some reflections and limitations will be offered.

Hopefully, statistico-organizational theory can shed new light on management by illuminating aspects that are neglected by existing organizational theories. Specifically, the theory focuses on how the inferences managers make from numerical data contain errors that can mislead managerial decision-making. The errors come from objective properties of the data—for example, from the number of observations. Methodological principles allow the prediction of what type and magnitude of errors come from these properties of the data. The errors that come from academic data in social science research must also be present in the data that managers look at in their own organizations. We have sought to show how the properties that produce errors in academic research (e.g., small numbers of observations, measurement error) can also be present in the data that managers look at in their organizations.

In the body of the book, we have presented the theory by considering, in turn, each of the methodological principles and their associated data properties and the errors to which they give rise. The theory has been crystallized en passant by a series of theoretical propositions, which should help future researchers formulate hypotheses and conduct tests of the theory.

Table 15.1 provides a summary of the theory. The column third from the right, Error Source, gives the types of error identified by the methodological principles used in this book. The columns to its right, Error and Managerial State, give the types of error that flow from that source and their effect on managerial thought, respectively. The information in the columns to the left of Error Source gives the preconditions that lead to that source being activated and producing errors. Usually, some data property can be identified that leads to more error. Sometimes, in turn, this data property is created by a situational variable. In some cases, it is


possible to state a methodology issue that intervenes between the data property and the error source. We shall now go through Table 15.1, row by row, to provide a concluding discussion. However, rather than focus on error sources, which would simply mirror the arrangement of the book, we shall approach the issues in terms of the data properties that cause the errors. Focusing on the data takes us to the perspective of the managers in the organizations and thereby gives a managerial orientation.

Situations That Lead to Errors

There are certain situations that make it likely that numerical data will mislead a manager. The data will have varying properties that determine whether, and by how much, the data mislead. In explaining the theory, these properties have been stated already, but we will summarize them here from a managerial viewpoint.

Data may have a small or large number of observations, N. In some data sets in organizations the N is large, while in others the N is small. If a firm produces bottles, the N may be in the thousands, whereas if the firm produces huge industrial milling machines, the N may be small. Hence, the subject of analysis (i.e., the thing being analyzed) affects the number of observations. The smaller the number of observations, the greater the sampling error, so that there is more random variation around a figure derived from the data, say, a mean or a correlation (see row one in Table 15.1). This deflation or inflation of the inferred statistic can mislead the manager. Repeated analyses of different sets of data with small numbers of observations will produce variations that are spurious. This may give a false impression that the situation is more changeable than it really is, creating a mistaken sense of changeability. It can also lead to a forlorn search for the situational, or other, moderator that explains this variation. However, the moderator is a mirage, and so the search is fruitless. Managers thus infer an exaggerated view of the changeability and causal complexity of their world.

In some organizations, the potential for large numbers of observations may be thrown away by disaggregating data, such as into data for divisions, product lines, or branches, or by taking only the most recent data (row two in Table 15.1). This leads back to data sets based on only small numbers of observations, and hence to spurious random variation in statistics and, again, exaggerated changeability and causal complexity. In other organizations, their small size (e.g., number of members or output volume) means that they are inherently locked into the small-numbers problem for many of their analyses (row three in Table 15.1). The size of an organization is conditioned by factors in its environment. For many organizations, the size of their country influences the size of the organization and hence the propensity for the small-numbers problem.
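A brief simulation sketch (the true correlation of .30 and the sample sizes are illustrative) shows how much an observed correlation can wander when the number of observations is small:

```python
import numpy as np

rng = np.random.default_rng(5)
true_r = 0.30
cov = [[1.0, true_r], [true_r, 1.0]]

def sample_r(n):
    # Draw n observations from a bivariate normal with correlation .30
    # and return the correlation observed in that one sample.
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    return np.corrcoef(x[:, 0], x[:, 1])[0, 1]

for n in (20, 200, 2_000):
    rs = np.array([sample_r(n) for _ in range(1_000)])
    print(f"N={n:5d}: observed r ranges {rs.min():+.2f} to {rs.max():+.2f}")

# At N = 20 the same underlying relationship can look clearly negative
# in one data set and very strongly positive in another; by N = 2,000
# the spurious variation has all but disappeared.
```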

Table 15.1

Summary of Statistico-Organizational Theory

Row | Situational variables | Data properties | Methodology issue | Error source | Error | Managerial state
----|-----------------------|-----------------|-------------------|--------------|-------|------------------
1 | Few products or service transactions | Small number of observations | | Sampling error | Random variation | Exaggerated changeability, causal complexity
2 | Divisionalization | Data disaggregation | Small number of observations | Sampling error | Random variation | Exaggerated changeability, causal complexity
3 | Small size of country | Small size of organization | Small number of observations | Sampling error | Random variation | Exaggerated changeability, causal complexity
4 | Multiple causes | Small correlation | | Sampling error | Random variation | Exaggerated changeability, causal complexity
5 | | All variables are measured with error | | Measurement error | Understated correlation | Underuse of cause
6 | | | Few items | Measurement error | Understated correlation | Underuse of cause
7 | High positive correlation between sales and costs | Profit as the variable | Difference score | Measurement error | Understated correlation | Underuse of cause
8 | Isomorphism or survival | Missing cases | Selection | Range restriction | Understated correlation | Underuse of cause
9 | | Extreme cases | | Range extension | Overstated correlation | Overreliance on cause
10 | Multiple causes | Small true correlation | Confounding more feasible | Confounding | Over/understated correlation | Overreliance/underuse of cause
11 | Multiple causes | Correlation between focal cause and other factor | Masking or inflating | Confounding | Over/understated correlation | Overreliance/underuse of cause
12 | Correlations of sales or costs with profit | Profit | Definitional connections | Confounding | Over/understated correlation | Overreliance/underuse of cause
13 | Performance-driven organizational change | Negative correlation of cause and performance | Reverse causality | Confounding | Masking of true positive correlation | Mistaken causality
14 | Large organizations | Organizational experiments | | Bias | Understates effect | Self-fulfilling skepticism


Even if an organization is large and has a large volume of output, the organization will have a large number of observations only if the data are coded in a standard way and collected together. Therefore, formalization, standardization, and centralization of data are required to produce large-N data.

When managers focus on finding the causes of organizational performance, the multiplicity of such causes means that their correlations with performance tend to be small, and smaller correlations have more sampling error than larger correlations, so managers are pulled again into the problem of sampling error (row four in Table 15.1).

Human resource management decisions about the promotion of junior managers may depend on the performance of their branch. The smaller the branch, the more the small-numbers problem arises, so that promotion becomes like a lottery. Similarly, the shorter the time that the manager is in charge of the branch, the more the small-numbers problem arises, so that, again, promotion becomes like a lottery. Decentralized organizations, such as franchises, may require each franchise to use its data to decide whether an idea is good or bad and then aggregate their "votes" across the franchises. Again, the small-numbers problem arises, producing a variety of results from the franchises, which makes a conclusion difficult. In contrast, aggregating the data would lead to a straightforward and more accurate conclusion.

In any analysis of organizational data by any manager, all variables contain some measurement error and therefore are somewhat unreliable. Unreliability lowers any correlation below its true value (row five in Table 15.1). Therefore, managerial analyses will falsely conclude that causes are weaker than they truly are. Whenever a manager looks at a figure, if it is based on a single measure, then it will be less reliable than if multiple measures had been used and then aggregated into a single, more robust figure (row six in Table 15.1). Such a less reliable figure makes decisions (e.g., rewards, resource allocation) using it prone to inaccuracy. Decisions could be wrong, arbitrary, unfair, and resented. Correlations will be below their true value compared to correlations for multiple measures, leading managers to underuse causal levers.

Managers often analyze data about the profits of their organization and its subunits, or of competitors. However, profit is the difference between sales and costs, and difference scores are prone to heightened unreliability. Therefore, profit is measured less reliably than sales or costs (row seven in Table 15.1). The reliability of profit tends to be lower when there is a high positive correlation between sales and costs, and also when organizational profitability is low and industry profitability homogeneous. Again, the low reliability of profit makes decisions using it prone to inaccuracy. Moreover, analyses that seek to relate profit to some other variable will tend to understate the true correlation. This may lead the manager to think that the identified causes of profit are weaker than they really are, so that the manager uses them less or underinvests in them. Furthermore, the weak correlations may lead to a search for other, stronger causes that may not exist.
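The classical formula for the reliability of a difference score, as discussed by Johns (1981), makes the point directly; the reliabilities and correlations below are illustrative values, not estimates from any real firm:

```python
def diff_score_reliability(rel_x, rel_y, r_xy, sd_x=1.0, sd_y=1.0):
    """Classical-test-theory reliability of the difference X - Y,
    where rel_x and rel_y are the component reliabilities and r_xy
    is the observed correlation between X and Y."""
    num = sd_x**2 * rel_x + sd_y**2 * rel_y - 2 * r_xy * sd_x * sd_y
    den = sd_x**2 + sd_y**2 - 2 * r_xy * sd_x * sd_y
    return num / den

# Sales and costs each measured quite reliably (0.90):
for r_sales_costs in (0.0, 0.5, 0.8):
    rel = diff_score_reliability(0.90, 0.90, r_sales_costs)
    print(f"corr(sales, costs) = {r_sales_costs:.1f} -> "
          f"reliability of profit = {rel:.2f}")

# Reliability falls from 0.90 (r = 0.0) to 0.80 (r = 0.5) to 0.50
# (r = 0.8): the more sales and costs move together, the less
# reliable their difference, profit.
```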


Many business organizations contain multiple profit centers, at least one for each division. Thus, managers may compare the varying profitability of different divisions and seek the causes of this variance. Multidivisional, holding, and multinational corporations all have multiple profit centers, and so analyses in all of them are liable to use profit figures, despite their unreliability. Strategically, such corporations tend to be diversified by product, geography, or other factors, so managers in diversified corporations unwittingly tend to face the problem of unreliable data. In contrast, undiversified firms tend to have functional structures, which use cost centers for their major subunits (i.e., functional departments). Cost tends to be more reliable than profit, so managers in these functionally structured firms can draw more valid inferences from their data.

This problem of the unreliability of profit extends also to other measures of profitability, such as the ratio of profit to sales, change in profit over time, profit relative to some standard, and profit figures adjusted to control for extraneous factors. Hence, the problem of the unreliability of profit is quite general across many forms of profit figures, as well as many organizational forms (multidivisional, multinational, etc.). If managers seek a more sophisticated form of causal analysis, examining whether the fit of some organizational characteristic to another characteristic (a "contingency factor") produces beneficial outcomes, this method is also subject to the enhanced unreliability discussed here.

Quite often a manager may examine data for an association, but the correlation is reduced by restriction in the range of one of the variables (row eight in Table 15.1). Such restriction might come about by isomorphism—for example, if a regulatory body requires that organizations be high on some variable, over time all those organizations come to have high levels on that variable. Such restriction can also come about by selection of organizations—when, for example, only surviving organizations are examined and they have high values on essential characteristics—or of organizational members, when, for example, employees are selected who are high on tested ability. Competition can force organizations toward optimal values, thereby eliminating variance in comparisons.

A manager, conversely, might examine data in which the correlation has been inflated by extension in the range of one of the variables (row nine in Table 15.1). Such extension, for instance, might come about because only extremes have been included—for example, the sales of the top 10 percent and bottom 10 percent of salespersons are correlated with a personality trait such as extroversion.

Often managers examine data to identify the causes of performance of their organization or its subunits. As noted above, because there are multiple causes of performance, any one cause tends to have a low true correlation (row ten in Table 15.1). This makes confounding easier, in that the spurious correlations from confounds can readily be large enough to strongly mask or inflate the focal correlation. Moreover, a multiplicity of other causes makes it more probable that one of them (or several jointly) will spuriously affect the focal correlation enough to strongly mask or inflate it (row eleven in Table 15.1). If the focal correlation is masked, the manager


will falsely conclude that a true positive cause of organizational performance has no effect or possibly even harms performance. If the focal correlation is inflated, the manager will falsely conclude that the true positive cause of organizational performance has more effect than it really has, so that the manager may fail to search for, and use, other causes of performance.

If the manager is analyzing the relationship between some variable and profit, profit tends to be correlated (positively) with sales and (negatively) with costs. If either sales or costs happens to be correlated with the variable, then a confound arises (row twelve in Table 15.1). Such a confound could, again, either mask or inflate the correlation, leading the manager to the mistaken inference that the cause is weak or strong, respectively. This definitional confound will exist in organizations using profit centers and not in organizations that use just cost centers. Hence, this confound will be greater in businesses with diversified strategies and multidivisional, holding, or multinational structures, and absent in businesses with undiversified strategies and functional structures.

If the manager is examining data for a positive correlation between organizational performance and an organizational characteristic that causes it, there will tend to be a reverse causal impact of prior performance on that organizational characteristic (row thirteen in Table 15.1). The latter is a negative correlation that masks the former, positive correlation. Therefore, again, the manager might mistakenly perceive the organizational characteristic as having no effect on performance or as harming it.

Managers may seek to avoid confounds by conducting experiments in one organizational subunit while using a similar subunit as the control group (row fourteen in Table 15.1). This will be more feasible in large organizations, because they are likely to possess similar subunits—that is, multiple subunits that conduct the same task in the same way (standardization) independently of each other, with comparable performances and using similar personnel. The control group measures the total confounding from all the confounds. However, the existence of a subunit not using the new technique relies upon half-hearted commitment by the organization, so that the effects from organizational experiments tend to be biased downward. Aggregation of results from multiple data sets, such as multiple departments of the same organization, tends to eliminate confounding without incurring the downward bias. Thus, a high-level manager in a large organization committed to the new technique can aggregate the data coming up the hierarchy and achieve a valid appreciation of the strength of the benefits that the technique produces.

Overall, there are numerous sources of error when managers seek to make valid inferences from numerical data. These errors tend to be heightened when managers analyze the causes of organizational performance, because the positive effects tend to be masked by negative correlations due to reverse causality by prior performance. Moreover, since performance has many causes, any one cause tends to have a low correlation and to be readily subject to interference from other causes, making confounding easier. Furthermore, if organizational performance is measured by profitability, this, as a difference score, tends to suffer lower


reliability and also introduces definitional confounds, adding to the other sources of confounding. Hence, managers' responsibility for organizational performance and, in business organizations, for profit makes their inference-making from data particularly prey to errors. Nonetheless, some of these potential problems are ameliorated in large organizations, which function meta-analytically. As data flow from branches to the center of the organization and from the bottom up to higher levels in the hierarchy, the data often become aggregated. This increases the number of observations and so reduces the sampling error. Similarly, aggregation of data can also reduce confounding due to other causes or correlates of performance. In this way, large organizations come to possess an inference advantage over smaller organizations. Hence, we can see the meta-analytic organization.

Reflections and Limitations

Previously, we have been critical of the antimanagement tendencies in much of modern organizational theory (L. Donaldson 1995). Unlike those theories, statistico-organizational theory is not antimanagement. While it sees errors in the inferences made by managers, the errors come from limitations in the data, not from failures in managerial motivations. Modern organizational theory includes negative depictions of managers as opportunistically working to maximize their own self-interest at a cost to the shareholder. In contrast, statistico-organizational theory posits no such baleful motivations. The errors made by managers are honest mistakes due to the limitations of their data. Again, managers are depicted in population ecology theory as failing to make sound adaptations, partly because they are ruled by vested interests, so that the managers merely preside over inertial organizations until the defects of these organizations lead them to be culled. Statistico-organizational theory contains no such view of managers as passive; rather, managers search for solutions but are sometimes misled by their data. This can, on occasion, lead to failure to adapt, and hence to culling, but this is not seen as the main causal process of organizational change. Once again, neo-institutional theory sees organizational managers as conformists who work to secure legitimacy rather than enhance the operational effectiveness of their organizations. In contrast, statistico-organizational theory sees managers as making inferences from numerical data with the intention of enhancing the operational effectiveness of their organizations.

Statistico-organizational theory makes statements about how managers will make errors in looking at organizational data because of the traps that inhere in numerical data. However, these ideas are not derived from any close study of managerial behavior; rather, they are extrapolations from well-known occurrences among social scientists doing their research. The central idea is that the errors in social science research must exist also in management. In that way, the present exercise is a series of conjectures, influenced in part by Popper's maxim that theory


construction is about the making of bold conjectures (Popper 1969). But because the methodological principles used in statistico-organizational theory have been borne out in experience in social science research, there is reason to hold that these are not mere speculations. Thus, the present book is to be understood as an exercise in theory construction. Its aim is to use methodological principles to make predictions about the inferences that managers make from their numerical data. Like all scientific theories, the theory is explanatory, stating what causal processes operate to connect organizational data to the errors in the inferences that managers make from it. The theory identifies causes of these errors (e.g., the number of observations) within organizational data. It also identifies more background causes (e.g., organizational size) that shape the organizational data. Thus, the theory offers explanations in terms of causes and effects.

The aim of the book has been to lay out this theory. In order to motivate the discussion, examples have been presented, some based on data from real organizations and some hypothetical and illustrative. Hypothetical scenarios have been presented about managers making inferences in organizational contexts. The managerial behaviors and contexts have been described to try to make them like real life, so that the errors discussed have credibility. However, this book is primarily a work of the imagination, so these scenarios are not offered as any kind of empirical proof. The theory has not yet been tested; that needs to be done in future research. Hopefully, the many theoretical propositions that have been presented in this book will make it easier for future researchers to devise hypotheses to test in their empirical research.

Social science research methodology cautions that there are many ways to draw erroneous inferences from numerical data and that some errors are great. Statistico-organizational theory follows this tradition and applies it to managers looking at their data. Nevertheless, there is always the possibility that managers somehow avoid drawing substantially erroneous inferences from numerical data because, through living in their organizations, they have access to other information that provides a corrective to misleading numerical data. For instance, managers can draw upon their experience, discussions with colleagues, and qualitative information. Thus, the errors of inference postulated in this book may be less prevalent in practice. Ascertaining the extent of such compensatory processes may be a task for future inquiry, to give a more comprehensive analysis of all the sources of information used in managerial inference-making. This is beyond the scope of the present book. Hopefully, statistico-organizational theory, by articulating the numerical sources of errors in managerial inferences, may contribute to that more comprehensive future analysis. In order to fully appreciate the role of numerical sources of errors by managers, it is necessary to give them due attention, and the present book is offered as a modest step in that direction.

Appendix

Comparisons of the Sampling Error of Correlations Calculated by Formula Versus the Z Transformation Procedure

As stated in Chapter 13, the sampling error of a correlation calculated by the simple formula (1 − r²)/√(N − 1) (Chambers 1964, 61) was compared with that calculated from the more complex procedures of the z transformation. The simple formula was found to provide a fairly good approximation to the z transformation, except for Ns of about 15 and below. The results of the simple formula and the z transformation were compared over the range of correlations from zero to 1.0 in increments of .1 (i.e., .1, .2, and so on). The results from using the simple formula tend to reproduce the results from using the z formula. For an N of 100, for the lower bound (i.e., the correlation minus the confidence interval), the average difference is only a correlation of .01, and the maximum difference is that the simple formula is greater than the z formula by only .02. Even for an N of only 20, for the lower-bound confidence interval, the average difference is only a correlation of .06, and the maximum difference (for a correlation of .6) is that the simple formula gives a correlation of .21, which is only .11 less than that given by the z formula, .32. Thus, the simple formula works quite well for giving fairly accurate confidence intervals, across the range of correlation values, for Ns of 20 or more. However, for smaller Ns, the simple formula gives bounds that are greater than those given by the z formula. This distortion is pronounced for N = 5, being on average .39 (for the lower bound). The distortion increases as N decreases below 20, though it is worse for some correlations than others. For N = 15, for the lower bound, the average correlation is only .09 greater for the simple than the z formula, and the maximum difference in correlations is .15 (at r = .6), where the simple formula gives .28, which is more than double that given by the z formula, .13, so


that the simple formula is overstating the sampling error materially. However, for the upper bound (i.e., correlation plus the confidence interval), the average correlation is only .04 greater for the simple than the z formula, and the maximum difference in correlations is .07 (at r = .5), where the simple formula gives .88, while the z gives .81—this overstatement would not seem to be material. Thus, for Ns as small as 15, material overstatements begin to arise from using the simple formula, though they are the exception. However, as N shrinks from 15 down to 5, these overstatements grow in magnitude and become pervasive across the range of correlations .2 to .9.
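For readers who wish to reproduce this kind of comparison, the sketch below contrasts the two procedures, assuming the simple standard error (1 − r²)/√(N − 1) with conventional two-sided 95 percent bounds; the book does not state the critical multiplier it used, so these numbers need not match the appendix's values exactly:

```python
import math

def simple_bounds(r, n, z_crit=1.96):
    # Simple formula: standard error (1 - r^2) / sqrt(n - 1),
    # with symmetric bounds around r.
    se = (1 - r**2) / math.sqrt(n - 1)
    return r - z_crit * se, r + z_crit * se

def fisher_z_bounds(r, n, z_crit=1.96):
    # Fisher z transformation: work in z = atanh(r), whose standard
    # error is 1 / sqrt(n - 3), then transform the bounds back.
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

for n in (100, 20, 10):
    for r in (0.3, 0.6):
        s = simple_bounds(r, n)
        f = fisher_z_bounds(r, n)
        print(f"N={n:3d}, r={r}: simple ({s[0]:+.2f}, {s[1]:+.2f})  "
              f"z ({f[0]:+.2f}, {f[1]:+.2f})")

# The two procedures agree closely at N = 100 and drift apart as N
# shrinks; note that the simple formula's bounds are not kept inside
# the interval -1 to +1 and can exceed 1 at very small N.
```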

References

Aldrich, Howard E. 1979. Organizations and Environments. Englewood Cliffs, NJ: Prentice Hall.
———. 1992. "Paradigm Incommensurability: Three Perspectives on Organizations." In Rethinking Organization: New Directions in Organizational Theory and Analysis, ed. Michael I. Reed and Michael D. Hughes, 17–45. Newbury Park, CA: Sage.
Bartlett, Christopher A. 1993. "ABB's Relays Business: Building and Managing a Global Matrix." Prod. #: 394016-PDF-ENG, 1–23. Boston: Harvard Business School.
Blalock, Hubert M., Jr. 1972. Social Statistics. 2nd ed. New York: McGraw-Hill.
Blau, Peter M. 1970. "A Formal Theory of Differentiation in Organizations." American Sociological Review 35, no. 2 (April): 201–218.
———. 1972. "Interdependence and Hierarchy in Organizations." Social Science Research 1, no. 1 (April): 1–24.
Blau, Peter M., and Richard A. Schoenherr. 1971. The Structure of Organizations. New York: Basic Books.
Brealey, Richard A., and Stewart C. Myers. 1996. Principles of Corporate Finance. 5th ed. New York: McGraw-Hill.
Burrell, Gibson, and Gareth Morgan. 1979. Sociological Paradigms and Organisational Analysis: Elements of the Sociology of Corporate Life. London: Heinemann.
Capon, Noel, Chris Christodolou, John U. Farley, and James M. Hulbert. 1987. "A Comparative Analysis of the Strategy and Structure of United States and Australian Corporations." Journal of International Business Studies 18, no. 1 (Spring): 51–74.
Chambers, E.G. 1964. Statistical Calculation for Beginners. Cambridge, UK: Cambridge University Press.
Chandler, Alfred D., Jr. 1962. Strategy and Structure: Chapters in the History of the Industrial Enterprise. Cambridge, MA: MIT Press.
———. 1977. The Visible Hand: The Managerial Revolution in American Business. Cambridge, MA: Belknap Press.
Channon, Derek F. 1973. The Strategy and Structure of British Enterprise. London: Macmillan.
———. 1978. The Service Industries: Strategy, Structure and Financial Performance. London: Macmillan.
Child, John. 1972. "Organizational Structure, Environment and Performance: The Role of Strategic Choice." Sociology 6, no. 1 (January): 1–22.
———. 1973a. "Predicting and Understanding Organization Structure." Administrative Science Quarterly 18, no. 2 (June): 168–185.


———. 1973b. "Parkinson's Progress: Accounting for the Number of Specialists in Organizations." Administrative Science Quarterly 18, no. 3 (September): 328–348.
———. 1974. "Managerial and Organizational Factors Associated with Company Performance." Journal of Management Studies 11, no. 3 (October): 174–189.
———. 1975. "Managerial and Organizational Factors Associated with Company Performance, Part 2: A Contingency Analysis." Journal of Management Studies 12, no. 1 (March): 12–27.
Cibin, Renato, and Robert M. Grant. 1996. "Restructuring Among the World's Leading Oil Companies, 1980–92." British Journal of Management 7, no. 4 (December): 283–307.
Clark, Peter. 1990. Aston Programme: Describing and Explaining the Structure of Canadian Textile Firms. Birmingham, UK: Aston University, Aston Programme Press.
Cohen, Jacob, Patricia Cohen, Stephen G. West, and Leona S. Aiken. 2003. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. 3rd ed. Mahwah, NJ: Lawrence Erlbaum.
Collis, David J., and Toby Stuart. 1991. "Cooper Industries' Corporate Strategy (A)." Prod. #: 391095-PDF-ENG, 1–28. Boston: Harvard Business School.
Comte, Auguste. 1896. The Positive Philosophy of Auguste Comte. Trans. Harriet Martineau. London: Bell.
Cook, Thomas D., and Donald T. Campbell. 1979. Quasi-experimentation: Design and Analysis for Field Settings. Chicago: Rand McNally.
Corey, E. Raymond, and Steven H. Star. 1971. Organization Strategy: A Marketing Approach. Boston: Harvard University, Graduate School of Business Administration, Division of Research.
DiMaggio, Paul J., and Walter W. Powell. 1983. "The Iron Cage Revisited: Institutional Isomorphism and Collective Rationality in Organization Fields." American Sociological Review 48, no. 2 (April): 147–160.
Donaldson, Gordon. 1994. Corporate Restructuring: Managing the Change Process from Within. Boston: Harvard Business School Press.
Donaldson, Lex. 1985. In Defence of Organization Theory: A Reply to the Critics. Cambridge, UK: Cambridge University Press.
———. 1987. "Strategy and Structural Adjustment to Regain Fit and Performance: In Defence of Contingency Theory." Journal of Management Studies 24, no. 1 (January): 1–24.
———. 1995. American Anti-Management Theories of Organization: A Critique of Paradigm Proliferation. Cambridge, UK: Cambridge University Press.
———. 1996. For Positivist Organization Theory: Proving the Hard Core. London: Sage.
———. 1999. Performance-Driven Organizational Change: The Organizational Portfolio. Thousand Oaks, CA: Sage.
———. 2001. The Contingency Theory of Organizations. Thousand Oaks, CA: Sage.
Donaldson, Lex, and Malcolm Warner. 1974. "Structure of Organizations in Occupational Interest Associations." Human Relations 27, no. 8 (October): 721–738.
Durkheim, Emile. 1947. The Division of Labor in Society. Glencoe, IL: Free Press.
Dyas, Gareth P., and Heinz T. Thanheiser. 1976. The Emerging European Enterprise: Strategy and Structure in French and German Industry. London: Macmillan.
Edwards, Jeffrey R., and Mark E. Parry. 1993. "On the Use of Polynomial Regression Equations as an Alternative to Difference Scores in Organizational Research." Academy of Management Journal 36, no. 6 (December): 1577–1613.
Egelhoff, William G. 1982. "Strategy and Structure in Multinational Corporations: An Information-Processing Approach." Administrative Science Quarterly 27, no. 3 (September): 435–458.
———. 1988. Organizing the Multinational Enterprise: An Information-Processing Perspective. Cambridge, MA: Ballinger.


———. 1991. "Information-Processing Theory and the Multinational Enterprise." Journal of International Business Studies 22, no. 3 (September): 341–368.
———. 1993. "Great Strategy or Great Strategy Implementation: Two Ways of Competing in Global Markets." Sloan Management Review (Winter): 37–50.
Ehrenberg, Andrew S.C. 1975. Data Reduction: Analysing and Interpreting Statistical Data. London: John Wiley.
Etzioni, Amitai. 1961. A Comparative Analysis of Complex Organizations: On Power, Involvement, and Their Correlates. New York: Free Press.
Ezzamel, Mahmoud A., and Keith Hilton. 1980. "Divisionalisation in British Industry: A Preliminary Study." Accounting and Business Research 10, no. 38 (Spring): 197–214.
Fligstein, Neil. 1985. "The Spread of the Multidivisional Form among Large Firms, 1919–1979." American Sociological Review 50, no. 3 (June): 377–391.
———. 1990. The Transformation of Corporate Control. Cambridge, MA: Harvard University Press.
———. 1991. "The Structural Transformation of American Industry: An Institutional Account of the Causes of Diversification in the Largest Firms, 1919–1979." In The New Institutionalism in Organizational Analysis, ed. Walter W. Powell and Paul J. DiMaggio, 311–336. Chicago: University of Chicago Press.
Fligstein, Neil, and Peter Brantley. 1992. "Bank Control, Owner Control, or Organizational Dynamics: Who Controls the Large Modern Corporation?" American Journal of Sociology 98, no. 2 (September): 280–307.
Galbraith, Jay R. 1973. Designing Complex Organizations. Reading, MA: Addison-Wesley.
Gibbs, Michael, Kenneth A. Merchant, Wim A. Van der Stede, and Mark E. Vargus. 2009. "Performance Measure Properties and Incentive System Design." Industrial Relations 48, no. 2 (April): 237–264.
Goold, Michael, Andrew Campbell, and Marcus Alexander. 1994. Corporate-Level Strategy: Creating Value in the Multibusiness Company. New York: John Wiley.
Grinyer, Peter H., Masoud Yasai-Ardekani, and Shawki Al-Bazzaz. 1980. "Strategy, Structure, the Environment, and Financial Performance in 48 United Kingdom Companies." Academy of Management Journal 23, no. 2 (June): 193–220.
Hage, Jerald. 1965. "An Axiomatic Theory of Organizations." Administrative Science Quarterly 10 (December): 289–320.
Hage, Jerald, and Michael Aiken. 1967. "Program Change and Organizational Properties: A Comparative Analysis." American Journal of Sociology 72, no. 5 (March): 503–519.
Hannan, Michael T., and John Freeman. 1989. Organizational Ecology. Cambridge, MA: Harvard University Press.
Harlow, Lisa L., Stanley A. Mulaik, and James H. Steiger, eds. 1997. What If There Were No Significance Tests? Mahwah, NJ: Lawrence Erlbaum.
Hill, Charles W.L., and John F. Pickering. 1986. "Divisionalization, Decentralization and Performance of Large United Kingdom Companies." Journal of Management Studies 23, no. 1 (January): 26–50.
Hopkins, H. Donald. 1988. "Firm Size: The Interchangeability of Measures." Human Relations 41, no. 2 (February): 91–102.
Hopwood, A. 1976. Accounting and Human Behavior. Englewood Cliffs, NJ: Prentice Hall.
Hoskisson, Robert E., and Michael A. Hitt. 1994. Downscoping: How to Tame the Diversified Firm. New York: Oxford University Press.
Hunter, John E., and Frank L. Schmidt. 2004. Methods of Meta-Analysis: Correcting Error and Bias in Research Findings. 2nd ed. Thousand Oaks, CA: Sage.
Hunter, John E., Frank L. Schmidt, and Gregg B. Jackson. 1982. Meta-Analysis: Cumulating Research Findings Across Studies. Beverly Hills, CA: Sage.


Ilgen, Daniel R. 1986. "Laboratory Research: A Question of When, Not If." In Locke 1986, 257–267.
Ivancevich, John M. 1974. "Changes in Performance in a Management by Objectives Program." Administrative Science Quarterly 19, no. 4 (December): 563–574.
———. 1976. "Effects of Goal Setting on Performance and Job Satisfaction." Journal of Applied Psychology 61, no. 5 (October): 605–612.
———. 1977. "Different Goal Setting Treatments and Their Effects on Performance and Job Satisfaction." Academy of Management Journal 20, no. 3 (September): 406–419.
Janis, Irving L. 1972. Victims of Groupthink. Boston: Houghton Mifflin.
Johns, Gary. 1981. "Difference Score Measures of Organizational Behavior Variables: A Critique." Organizational Behavior and Human Performance 27, no. 3 (June): 443–463.
Kahneman, Daniel, and Mark W. Riepe. 1998. "Aspects of Investor Psychology: Beliefs, Preferences and Biases Investment Advisors Should Know." Journal of Portfolio Management 24, no. 4 (Summer): 52–65.
Kaplan, Robert S., and David P. Norton. 1996. The Balanced Scorecard: Translating Strategy into Action. Boston: Harvard Business School Press.
Kincaid, Harold. 1996. Philosophical Foundations of the Social Sciences: Analyzing Controversies in Social Research. Cambridge, UK: Cambridge University Press.
Kuhn, Thomas S. 1970. The Structure of Scientific Revolutions. 2nd ed. Chicago: University of Chicago Press.
Latham, Gary P., and Thomas W. Lee. 1986. "Goal Setting." In Locke 1986, 101–117.
Lawrence, Paul R. 1993. "The Contingency Approach to Organizational Design." In Handbook of Organizational Behavior, ed. Robert T. Golembiewski, 9–18. New York: Marcel Dekker.
Lawrence, Paul R., and Jay W. Lorsch. 1967. Organization and Environment: Managing Differentiation and Integration. Boston: Harvard University, Graduate School of Business Administration, Division of Research.
Lioukas, Spyros K., and Demitris A. Xerokostas. 1982. "Size and Administrative Intensity in Organizational Divisions." Management Science 28, no. 8 (August): 854–868.
Locke, Edwin A., ed. 1986. Generalizing from Laboratory to Field Settings: Research Findings from Industrial-Organizational Psychology, Organizational Behavior, and Human Resource Management. Lexington, MA: Lexington Books.
Locke, Edwin A., Karyll N. Shaw, Lise M. Saari, and Gary P. Latham. 1981. "Goal Setting and Task Performance: 1969–1980." Psychological Bulletin 90, no. 1 (July): 125–152.
Lorsch, Jay W., and Stephen A. Allen. 1973. Managing Diversity and Interdependence: An Organizational Study of Multidivisional Firms. Boston: Harvard University, Graduate School of Administration, Division of Research.
March, James G., with Robert I. Sutton. 1999. "Organizational Performance as a Dependent Variable." In The Pursuit of Organizational Intelligence, ed. James G. March, 338–353. Oxford, UK: Basil Blackwell.
McKelvey, Bill. 1997. "Quasi-Natural Organization Science." Organization Science 8, no. 4 (July–August): 352–380.
Meyer, John W., and W. Richard Scott, with the assistance of Brian Rowan and Terence E. Deal. 1983. Organizational Environments: Ritual and Rationality. Beverly Hills, CA: Sage.
Meyer, Marshall W., and Vipin Gupta. 1994. "The Performance Paradox." In Research in Organizational Behavior: An Annual Series of Analytical Essays and Critical Reviews, ed. Barry M. Staw and Roderick M. Kramer, 16: 309–369. Greenwich, CT: JAI Press.
Mintzberg, H. 1979. The Structuring of Organizations: A Synthesis of the Research. Englewood Cliffs, NJ: Prentice Hall.
Moore, David S., George P. McCabe, William M. Duckworth, and Layth C. Alwan. 2009. The Practice of Business Statistics. 2nd ed. New York: W.H. Freeman.


Muczyk, Jan P. 1976. "MBO in a Bank and Railroad Company: Two Field Experiments Focusing on Performance Measures." In Industrial Relations Research Association, Proceedings of the Twenty-Ninth Annual Winter Meeting, 13–19.
———. 1978. "A Controlled Field Experiment Measuring the Impact of MBO on Performance Data." Journal of Management Studies 15, no. 3 (October): 318–329.
Parker, Robert H. 1972. Understanding Company Financial Statements. Harmondsworth, Middlesex, UK: Penguin Books.
Pavan, Robert J. 1976. "Strategy and Structure: The Italian Experience." Journal of Economics and Business 28, no. 3 (Spring–Summer): 254–260.
Pearlman, Kenneth, Frank L. Schmidt, and John E. Hunter. 1980. "Validity Generalization Results for Tests Used to Predict Job Proficiency and Training Success in Clerical Occupations." Journal of Applied Psychology 65, no. 4 (August): 373–406.
Pettigrew, Andrew M. 1973. The Politics of Organizational Decision-Making. London: Tavistock.
Pfeffer, Jeffrey, and Gerald R. Salancik. 1978. The External Control of Organizations: A Resource Dependence Perspective. New York: Harper and Row.
Pickle, Hal, and Frank Friedlander. 1967. "Seven Societal Criteria of Organizational Success." Personnel Psychology 20, no. 2 (June): 165–178.
Pitts, Robert A. 1974. "Incentive Compensation and Organization Design." Personnel Journal 53, no. 5 (May): 340–348.
———. 1976. "Diversification Strategies and Organizational Policies of Large Diversified Firms." Journal of Economics and Business 28, no. 3 (Spring–Summer): 181–188.
Popper, Karl. 1969. Conjectures and Refutations: The Growth of Scientific Knowledge. 3rd ed. London: Routledge and Kegan Paul.
Pugh, Derek S., David J. Hickson, C. Robert Hinings, Keith M. Macdonald, Chris Turner, and Tom Lupton. 1963. "A Conceptual Scheme for Organizational Analysis." Administrative Science Quarterly 8, no. 3 (December): 289–315.
Pugh, Derek S., David J. Hickson, C. Robert Hinings, and Chris Turner. 1968. "Dimensions of Organization Structure." Administrative Science Quarterly 13, no. 1 (June): 65–105.
———. 1969. "The Context of Organization Structures." Administrative Science Quarterly 14, no. 1 (March): 91–114.
Rodgers, Robert C., and John E. Hunter. n.d. "Impact of Management by Objectives on Organizational Productivity." Mimeographed.
———. 1991. "Impact of Management by Objectives on Organizational Productivity." Journal of Applied Psychology 76, no. 2 (April): 322–336.
———. 1994. "The Discard of Study Evidence by Literature Reviewers." Journal of Applied Behavioral Science 30, no. 3 (September): 329–345.
Rogers, Meredith. 2006. "Contingent Corporate Governance: A Challenge to Universal Theories of Board Structure." PhD diss. University of New South Wales, Sydney, Australia.
Rosenzweig, Philip. 2007. The Halo Effect . . . and the Eight Other Business Delusions That Deceive Managers. New York: Free Press.
Rumelt, Richard P. 1974. Strategy, Structure, and Economic Performance. Boston: Harvard University, Graduate School of Business Administration, Division of Research.
Schlevogt, Kai-Alexander. 2002. The Art of Chinese Management: Theory, Evidence, and Applications. Oxford, UK: Oxford University Press.
Schlevogt, Kai-Alexander, and Lex Donaldson. 2000. "An Organizational Portfolio Analysis of Knowledge: Are Managers Able to Learn?" Paper presented at the Australian and Pacific Researchers in Organizations Conference, Sydney, December.
Schmidt, Frank L. 2010. "Detecting and Correcting the Lies That Data Tell." Perspectives on Psychological Science 5, no. 3 (May): 233–242.


Schmidt, Frank L., and John E. Hunter. 1984. "A Within-Setting Test of the Situational Specificity Hypothesis in Personnel Selection." Personnel Psychology 37, no. 2 (June): 317–326.
Schmidt, Frank L., John E. Hunter, and Kenneth Pearlman. 1980. "Task Differences and Validity of Aptitude Tests in Selection: A Red Herring." Journal of Applied Psychology 65, no. 2 (April): 166–185.
Schmidt, Frank L., John E. Hunter, and Vern W. Urry. 1976. "Statistical Power in Criterion-Related Validation Studies." Journal of Applied Psychology 61, no. 4 (August): 473–485.
Schmidt, Frank L., Benjamin P. Ocasio, J.M. Hillery, and John E. Hunter. 1985. "Further Within-Setting Empirical Tests of the Situational Specificity Hypothesis in Personnel Selection." Personnel Psychology 38, no. 3 (September): 509–524.
Scott, W. Richard. 1995. Institutions and Organizations. Thousand Oaks, CA: Sage.
Siegel, Sidney. 1956. Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.
Silverman, David. 1970. The Theory of Organizations. London: Heinemann.
Simon, Herbert A. 1957. Administrative Behavior: A Study of Decision-Making Processes in Administrative Organization. New York: Free Press.
Staw, Barry M. 1981. "The Escalation of Commitment to a Course of Action." Academy of Management Review 6, no. 4 (October): 577–587.
Suzuki, Y. 1980. "The Strategy and Structure of the Top 100 Japanese Industrial Enterprises 1950–1970." Strategic Management Journal 1, no. 3 (July–September): 265–291.
Thompson, James D. 1967. Organizations in Action. New York: McGraw-Hill.
Tversky, Amos, and Daniel Kahneman. 1971. "Belief in the Law of Small Numbers." Psychological Bulletin 76, no. 2 (August): 105–110.
Van de Ven, Andrew, and Diane L. Ferry. 1980. Measuring and Assessing Organizations. New York: Wiley.
Vecchio, Robert P., Greg Hearn, and Greg Southey. 1996. Organisational Behaviour. 2nd ed. Sydney, Australia: Harcourt Brace.
Wall, Toby D., and Roy Payne. 1973. "Are Deficiency Scores Deficient?" Journal of Applied Psychology 58, no. 3 (December): 322–326.
Walt Disney Co. 2005. NYSE: DIS, Mergent Online. www.mergentonline.com/compdetail.asp?company=2488&type=financials.
Walton, Eric J. 2005. "The Persistence of Bureaucracy: A Meta-analysis of Weber's Model of Bureaucratic Control." Organization Studies 26, no. 4 (April): 569–600.
Weber, Max. 1968. Economy and Society: An Outline of Interpretive Sociology, ed. Guenther Roth and Claus Wittich. New York: Bedminster Press.
Weick, Karl. 1995. Sensemaking in Organizations. Thousand Oaks, CA: Sage.
Williamson, Oliver E. 1970. Corporate Control and Business Behavior: An Inquiry Into the Effects of Organization Form on Enterprise Behavior. Englewood Cliffs, NJ: Prentice Hall.
———. 1975. Markets and Hierarchies: Analysis and Antitrust Implications. New York: Free Press.
———. 1985. The Economic Institutions of Capitalism: Firms, Markets, Relational Contracting. New York: Free Press.
Woodward, Joan. 1965. Industrial Organisation: Theory and Practice. Oxford, UK: Oxford University Press.
Yeomans, Keith A. 1968. Statistics for the Social Scientist: Volume 2, Applied Statistics. Harmondsworth, Middlesex, UK: Penguin Books.

Name Index

Aiken, Leona S., 84, 89, 104
Aiken, Michael, 37
Al-Bazzaz, Shawki, 126
Aldrich, Howard E., 45, 60, 71
Alexander, Marcus, 144, 238
Allen, Stephen A., 126, 130
Alwan, Layth C., 50, 51, 123, 237
Bartlett, Christopher A., 123, 124
Blalock, Hubert M., Jr., 99, 231
Blau, Peter M., 21, 43, 44, 52, 63, 192
Brantley, Peter, 126
Brealey, Richard A., 51
Burrell, Gibson, 7, 30
Campbell, Andrew, 144, 238
Campbell, Donald T., 167
Capon, Noel, 126
Chambers, E.G., 231
Chandler, Alfred D., Jr., 10, 57, 80, 81, 86, 132, 144, 158, 159, 207, 229
Channon, Derek F., 126
Child, John, 30, 43, 58, 64, 78, 83, 86, 96, 144
Christodoulou, Chris, 126
Cibin, Renato, 159
Clark, Peter, 54
Cohen, Jacob, 84, 89, 104
Cohen, Patricia, 84, 89, 104
Collis, David J., 78, 239
Cook, Thomas D., 167
Corey, E. Raymond, 80, 81
Deal, Terence E., 142
DiMaggio, Paul J., 116, 142
Donaldson, Gordon, 159
Donaldson, Lex, 4, 7, 30, 37, 43, 44, 57, 58, 83, 103, 144, 159, 162, 184, 186t, 207, 229, 243, 251
Duckworth, William M., 50, 51, 123, 237
Durkheim, Emile, 44
Dyas, Gareth P., 126
Edwards, Jeffrey R., 240
Egelhoff, William G., 9, 136
Ehrenberg, Andrew S.C., 54, 78
Etzioni, Amitai, 83
Ezzamel, Mahmoud A., 159, 207
Farley, John U., 126
Ferry, Diane L., 85, 239
Fligstein, Neil, 44, 57, 126
Freeman, John, 4, 7, 10, 44, 56, 57, 60, 61, 144
Friedlander, Frank, 83
Galbraith, Jay R., 9, 80
Gibbs, Michael, 132, 133
Goold, Michael, 144, 238
Grant, Robert M., 159
Grinyer, Peter H., 126
Gupta, Vipin, 140, 141, 142
Hage, Jerald, 37, 81
Hannan, Michael T., 4, 7, 10, 44, 56, 57, 60, 61, 144
Harlow, Lisa L., 38
Hearn, Greg, 8
Hickson, David J., 43, 44, 45, 54, 63, 79, 81, 172, 233
Hill, Charles W.L., 122, 144, 159, 207
Hillery, J.M., 52
Hilton, Keith, 159, 207
Hinings, C. Robert, 43, 44, 45, 54, 63, 79, 81, 172, 233
Hitt, Michael A., 122
Hopkins, H. Donald, 58
Hopwood, A., 101
Hoskisson, Robert E., 122
Hulbert, James M., 126
Hunter, John E., 7, 20, 21, 24, 25, 26, 33, 35, 36, 37, 38, 39, 41, 52, 92, 109, 169, 170, 171, 173, 174, 176, 178, 179, 183, 184, 213, 215, 217, 222, 223, 238, 241, 242
Ilgen, Daniel R., 167, 169
Ivancevich, John M., 170, 174, 175t, 179
Jackson, Gregg B., 33
Janis, Irving, 8
Johns, Gary, 24, 86, 87, 97, 108, 213
Kahneman, Daniel, 8, 20
Kaplan, Robert S., 239
Kincaid, Harold, 10
Kuhn, Thomas S., 7
Latham, Gary P., 168
Lawrence, Paul R., 43, 103
Lioukas, Spyros K., 58
Locke, Edwin A., 168, 169
Lorsch, Jay W., 43, 126, 130
Lupton, Tom, 43, 44, 45, 54, 63, 79, 81, 172, 233
Macdonald, Keith M., 44
March, James G., 143
McCabe, George P., 50, 51, 123, 237
McKelvey, Bill, 32
Merchant, Kenneth A., 132, 133
Meyer, John W., 142
Meyer, Marshall W., 140, 141, 142
Mintzberg, H., 56, 81, 89
Moore, David S., 50, 51, 123, 237
Morgan, Gareth, 7, 30
Muczyk, Jan P., 171, 174, 175t
Mulaik, Stanley A., 38
Myers, Stewart C., 51
Norton, David P., 258
Ocasio, Benjamin P., 52
Parker, Robert H., 86
Parry, Mark E., 240
Pavan, Robert J., 126
Payne, Roy, 148, 149, 150, 154
Pearlman, Kenneth, 36, 238
Pettigrew, Andrew M., 8, 44
Pfeffer, Jeffrey, 83
Pickering, John F., 122, 144, 159, 207
Pickle, Hal, 83
Pitts, Robert A., 131
Powell, Walter W., 116, 142
Pugh, Derek S., 43, 44, 45, 54, 63, 79, 81, 172, 233
Riepe, Mark W., 8
Rodgers, Robert C., 169, 170, 171, 173, 174, 176, 178, 179, 183, 184
Rogers, Meredith, 243
Rosenzweig, Philip, 123
Rowan, Brian, 142
Rumelt, Richard P., 10, 17, 86, 122, 126, 144
Saari, Lise M., 168
Salancik, Gerald R., 83
Schlevogt, Kai-Alexander, 39, 52, 183, 184, 186t, 206, 215
Schmidt, Frank L., 7, 20, 21, 24, 25, 26, 33, 34, 35, 36, 37, 38, 39, 41, 52, 92, 109, 213, 215, 217, 222, 223, 238, 241, 242
Schoenherr, Richard A., 21, 43, 44, 192
Scott, W. Richard, 4, 142
Shaw, Karyll N., 168
Siegel, Sidney, 52
Silverman, David, 30
Simon, Herbert A., 9, 100, 131, 144, 159
Southey, Greg, 8
Star, Steven H., 80, 81
Staw, Barry M., 8
Steiger, James H., 38
Stuart, Toby, 78, 239
Sutton, Robert I., 143
Suzuki, Y., 126
Thanheiser, Heinz T., 126
Thompson, James D., 122
Turner, Chris, 43, 44, 45, 54, 63, 79, 81, 172, 233
Tversky, Amos, 20
Urry, Vern W., 20, 38
Van der Stede, Wim A., 132, 133
Van de Ven, Andrew, 85, 239
Vargus, Mark E., 132, 133
Vecchio, Robert P., 8
Wall, Toby D., 148, 149, 150, 154
Walton, Eric J., 139
Warner, Malcolm, 44
Weber, Max, 45, 63
Weick, Karl, 32, 128
West, Stephen G., 84, 89, 104
Williamson, Oliver E., 4, 10, 86, 101, 116, 121, 129, 130, 131
Woodward, Joan, 64
Xerokostas, Demitris A., 58
Yasai-Ardekani, Masoud, 126
Yeomans, Keith, 50


Subject Index

ABB Corporation, 123–25
Academic research, 40
Aggregate level, 33–37
  See also Confound control by data aggregation
Aston Program, 44–45
Attenuation of association
  measurement error of profit, 92–94, 95f
  range artifacts, 138
  theoretical structure, 12, 13f, 14–19
Attribution, 67–68
Automation, 64
Averaging data. See Confound control by data aggregation
Balanced scorecard approach, 239–40
Bias in organizational experiments, 173–76
Blind variation, 61
Bogus variability, 12, 13f, 14–19
Bureaucratization, 45, 63–64
Causal model
  profit reliability determinants, 117–18
  statistico-organizational theory, 10–12, 13f
Cognitive positivism, 30–32
Comparison with standard measurement error, 100–3
Confound control by data aggregation
  averaging in organizational management, 193–96
    propositions, 196
  confounding by multiple causes, 180–83
    illustration of, 181f
  confounding in multiple causes of organizational performance, 183–89
    averaging effects, 183–85, 186t, 187–89
    propositions, 189
  confounds readily eliminated, 190–92
    propositions, 192
  controlling confounds by averaging, 178–80, 186t
    in organizational management, 193–96
    in organizational performance, 183–85, 186t, 187–89
  management by objectives (MBO) program, 178–80, 183, 185, 189, 190–92, 194–95
  summary, 177–78
Confound control by organizational experiments
  confounding as error source, 166–67
    propositions, 167
  control group experiments, 167–69
  organizational experiments
    bias, 173–76
    management by objectives (MBO) program, 170–71, 173–76
    organizational requirements for, 171–73
    propositions, 172–73, 175
Confounding
  error reduction strategies, 242–44
  errors not self-correcting, 203–5, 207–8, 209–11
  managerial errors, 15–16
  theoretical equations, 225–30
    true correlation, 216, 225–26, 228–29, 231
  theoretical overview, 27–28
Confounding by definitional connections, 148, 149–54
  error reduction strategies, 242
  theoretical equations, 227–29
Confounding by other factors, 226–27, 243–44
Confounding by performance variable
  by definitional connections
    difference scores, 148, 149–54
    illustration of, 150f, 152f, 153f, 155f
    likely organizational situations, 156–57
    logical constraint, 149–50
    profit confounded by sales and costs, 154–56
    propositions, 154, 156, 157
    unreliability of difference scores, 157
  other error sources, 163
  by reciprocal causality, 148, 158–63
    propositions, 160
  severity produced by weak spurious correlations, 148–49
Confounding by reciprocal causality, 148, 158–63
Confounding by reverse causality, 243
Confounding by spurious correlations, 148–49, 216–17, 226–29
Contingency misfit analyses
  measurement error, 103–4
Control variable measurement error, 98–100
Curvilinearity, 68–69
Cycles of dysfunctional control, 69–72
Data disaggregation errors
  aggregating previous inferences, 74–78
    propositions, 78
  cycles of dysfunctional control, 69–72
    propositions, 72
  fallacy of disaggregation, 66–69
    attribution, 67–68
    curvilinearity, 68–69
    data noise, 67
    propositions, 68, 69
  fallacy of immediacy, 78–79
    propositions, 79
  inference and human resource management (HRM), 72–74
    propositions, 73, 74
  multidivisional form (M-form), 122–25
  organizational structure for inference, 79–81
    propositions, 79, 80, 81
Data noise
  data disaggregation errors, 67
  law of small numbers, 50–51
  measurement error of profit, 87
  random noise, 67
Data reliability, 84–85
Data unreliability. See Measurement error
Deep structure data
  cognitive positivism, 30–32
  meta-analysis philosophy
    aggregate level, 33–37
    misleading individual studies, 37–38
    statistical significance tests, 38–40
  methodological principles, 7–8, 33–40
  research hierarchy
    academic research, 40
    managerial inferences, 32, 40–42
  theoretical continuities, 42–46
Definitional connections, 148, 149–54, 227–29, 242
Determinate process, 214
Difference scores
  confounding by performance variable, 148, 149–54, 157
  measurement error equations, 219–21
  measurement error of profit, 85–86, 87
Error reduction strategies
  confounding
    by definitional connections, 242
    by other causes, 243–44
    by reverse causality, 243
  measurement error
    alternative to difference score in profit assessment, 240–41
    balanced scorecard approach, 239–40
    increasing reliability, 238–40
  range extension, 242
  range restriction, 241
  sampling error, 237–38
Errors not self-correcting
  overall error
    confounds, 203–5, 207–8, 209–11
    illustration of, 209f
    measurement error, 203–5, 208
    modeling typical overall error, 206–8
    range restriction, 203–5, 208–10
    relative magnitude of errors, 203–5
    sampling error, 203–4, 205–6, 209–11
    simplified general model, 208–11
    variability of errors, 205–6
  repeated operation of errors, 201–3
    measurement error, 201
    range extension, 201–2
    range restriction, 201–2, 203
    sampling error, 202–3
  theoretical overview, 28–29
Fallacy of disaggregation, 66–69
Fallacy of immediacy, 78–79
Human resource management (HRM)
  data disaggregation errors, 72–74
  law of small numbers, 22–23
Indeterminate process, 214
Inference
  data disaggregation errors
    aggregating previous inferences, 74–78
    in human resource management (HRM), 72–74
  international comparative advantage, 61–63
  managerial inferences, 32, 40–42
  organizational structure for, 79–81
  by researchers and managers, 5–6
  weak inference and organizational mortality, 60–61
Inferential statistics, 32, 52–54
Information-processing and data, 9–10
Institutional theory, 4, 44, 251
International comparative advantage, 61–63
Kuhnian research paradigm, 6–8
Law of large numbers, 20–21
Law of small numbers
  characteristics of, 50–56
    data noise, 50–51
    propositions, 55
    random sample, 54–55
    samples in organizational management, 52–56
    standard error of the mean, 51
    statistical significance tests, 51–52
    true value, 50
  errors from small organizational size, 56–60
    propositions, 58, 60
  international comparative advantage in inference, 61–63
    propositions, 62–63
  nature of what is counted, 56
    propositions, 56
  organizational benefits
    automation, 64
    bureaucratization, 63–64
    professionalization, 64
    repetition, 63–64
  theoretical overview, 20–24
  weak inference and organizational mortality, 60–61
    blind variation, 61
    propositions, 60–61
Logical constraint, 149–50
Management by objectives (MBO) program
  confound control by data aggregation, 178–80, 183, 185, 189, 190–92, 194–95
  confound control by organizational experiments, 170–71, 173–76
Managerial errors, 15–19
Measurement error
  balanced scorecard approach, 239–40
  error reduction strategies, 238–41
  errors not self-correcting
    overall error, 203–5, 208
    repeated operation of errors, 201
  managerial errors, 15–16
  meta-analysis philosophy, 35, 41–42
  theoretical equations
    difference scores, 219–21
    with range extension, 223–25
    with range restriction, 223–25
    true correlation, 216, 217–19, 223–25, 231
  theoretical overview, 24–25
Measurement error of profit
  caused by comparison with standard, 100–3
    propositions, 102
  caused by control variables, 98–100
    propositions, 100
  caused by time rates, 97–98
    propositions, 98
  in contingency misfit analyses, 103–4
    propositions, 103
  low reliability not readily increased, 104–6
  measurement error in profit, 86–94
    amount of error from low reliability, 89–92
    attenuation of correlation, 92–94, 95f
    company illustration, 89–92
    data noise, 87
    difference scores, 87
    error in bar chart analysis, 95f
    profit centers, 86
    propositions, 89, 93
  multidivisional form (M-form), 121–34
  niche strategy analysis, 134–37
  problem of measurement error, 84–86
    data reliability, 84–85
    data unreliability, 84–86
    difference scores, 85–86
    propositions, 86
  of profitability ratios, 94, 96–97
    profit-to-assets ratio, 86, 96–97
    profit-to-sales ratio, 96
    propositions, 96–97
  psychometrics, 83–84, 86–87
  See also Quantifying measurement error of profit
Meta-analysis philosophy, 33–40
Meta-analytic organization, 3–4, 21–22, 41
  illustration of, 4f
Methodological principles
  meta-analysis, 33–40
  statistico-organizational theory, 3, 4–10
M-form. See Multidivisional form (M-form)
Multidivisional form (M-form)
  as data disaggregation, 122–25
    company illustration, 123–25
    propositions, 125
  divisional profitability errors, 121–34
    errors in, 129–32
    propositions, 130
  profit centers, 122–25, 127–29, 132
  profits for bonus determination in small organizations, 132–34
  profit unreliability and small-numbers problem, 125–29
    propositions, 127, 129
  sampling error, 122–29, 132–34
  versus unitary form (U-form), 129–30
Niche strategy analysis, 134–37
  propositions, 137
Organizational ecology, 4
Organizational mortality, 60–61
Organizational portfolio theory, 4, 51
Organizational structure for inference, 79–81
  decentralization vs. centralization, 80–81
Population ecology theory, 10, 44, 57, 144, 251
Professionalization, 64
Profitability ratio measurement errors, 94, 96–97
Profit centers
  managerial errors, 19
  measurement error of profit, 86
  multidivisional form (M-form), 122–25, 127–29, 132
Profit-to-assets ratio, 86, 96–97
Profit-to-sales ratio, 96
Psychometrics
  increasing reliability, 239
  measurement error of profit, 83–84, 86–87
  meta-analysis philosophy, 34–35
  quantifying measurement error of profit, 109–10
  reliability, 217
Quantifying measurement error of profit
  causal model of profit reliability determinants, 117–18
    model illustration, 118f
  company illustration, 107–15, 116
  formal analysis of, 107–14
    equations, 108–10, 113–14
    propositions, 110, 111, 112–13
  measurement error of sales growth, 119
  psychometrics, 109–10
  sensitivity of profit reliability, 114–17
    profit and profit reliability, 115–17
    propositions, 115, 117
Random sample, 54–55
Range artifacts
  attenuation of association, 138
  covariation, 138
  theoretical overview, 25–27
Range extension, 138–39
  error reduction strategies, 242
  errors from, 145
  errors not self-correcting, 201–2
  managerial errors, 16
  propositions, 139, 145
  theoretical equations, 221–25
    with measurement error, 223–25
    true correlation, 222, 223, 224, 225, 231
  theoretical overview, 26–27
Range restriction
  error reduction strategies, 241
  errors in organizational management, 139–43
    adaptation restrictions, 142–43
    propositions, 142, 143
    selection restrictions, 142
  errors in organizational misfit and fit, 143–45
    propositions, 145
  errors not self-correcting
    overall error, 203–5, 208–10
    repeated operation of errors, 201–2, 203
  managerial errors, 15–16
  meta-analysis philosophy, 35, 42
  propositions, 138
  theoretical equations, 221–25
    with measurement error, 223–25
    true correlation, 216, 221, 222, 223–24, 225, 231
  theoretical overview, 26–27
Reciprocal causality, 148, 158–63
Repetition, 63–64
Research hierarchy, 40–42
Reverse causality, 243
Sales growth measurement error, 119
Sampling error
  error reduction strategies, 237–38
  errors not self-correcting
    overall error, 203–4, 205–6, 209–11
    repeated operation of errors, 202–3
  managerial errors, 15–18
  meta-analysis philosophy, 37–38, 41–42
  multidivisional form (M-form), 122–29, 132–34
  statistical significance tests, 38–40
  theoretical equations, 231–32, 253–54
  See also Data disaggregation error; Law of small numbers
Sensitivity of profit reliability, 114–17
Spurious correlations, 148–49, 216–17, 226–29
Standard error of the mean, 51
Statistical significance tests
  law of small numbers, 51–52
  meta-analysis philosophy, 38–40
Statistico-organizational theory
  cognitive positivism, 30–32
  managerial errors, 15–19
    confounding, 15–16
    measurement error, 15–16
    profit centers, 19
    range extension, 16
    range restriction, 15–16
    sampling error, 15–18
  meta-analytic organization, 3–4, 21–22, 41
  methodological principles, 3, 4–10
    deep structure data, 7–8, 33–40
    information-processing and data, 9–10
    Kuhnian research paradigm, 6–8
    meta-analysis, 33–40
    researcher/manager inferences, 5–6
    theoretical characteristics, 8–9
  organizational theory continuities, 42–46
  statistical principles, 3, 5, 6–8
  summary of, 245–46, 247t, 248–51
    situational errors, 246, 247t, 248–51
  theoretical analysis, 251–52
  theoretical overview, 19–29
    confounds, 27–28
    errors not self-correcting, 28–29
    law of large numbers, 20–21
    law of small numbers, 20–24
    measurement error, 24–25
    range artifacts, 25–27
    range extension, 26–27
    range restriction, 26–27
  theoretical structure, 10–15
    attenuation of association, 12, 13f, 14–19
    bogus variability, 12, 13f, 14–19
    causal model, 10–12, 13f
    illustration of, 11f
Statistico-organizational theory equations
  confounding, 225–30
    combining multiple confounds, 230
    by definitional connections, 227–29
    by other factors, 226–27
    by prior performance, 216, 229–30
    by spurious correlations, 216–17, 226–29
    true correlation, 216, 225–26, 228–29, 231
  determinate process, 214
  error analysis, 213–14
  final error, 214
  indeterminate process, 214
  measurement error, 217–21
    difference scores, 219–21
    with range extension, 223–25
    with range restriction, 223–25
    true correlation, 216, 217–19, 223–25, 231
  overall error, 214
  overall error without sampling error, 230–31
  overall error with sampling error, 232–35
    propositions, 234
  overall methodological equation, 213
  overall theoretical equation, 213
  range extension, 221–25
    with measurement error, 223–25
    true correlation, 222, 223, 224, 225, 231
  range restriction, 221–25
    with measurement error, 223–25
    true correlation, 216, 221, 222, 223–24, 225, 231
  sampling error, 231–32, 253–54
  true correlation, 214–17
    confounding, 216, 225–26, 228–29, 231
    measurement error, 216, 217–19, 223–25, 231
    range extension, 222, 223, 224, 225, 231
    range restriction, 216, 221, 222, 223–24, 225, 231
Statistics, 3, 5, 6–8
  inferences, 32, 52–54
Stochastic idiosyncrasy, 32
Structural contingency theory, 43–44
Theoretical continuities, 42–46
Theoretical overview, 19–29
Theoretical structure, 10–15
Time rate measurement errors, 97–98
Transaction cost economics, 4, 10
U-form. See Unitary form (U-form)
Unitary form (U-form), 129–30
Walt Disney Company
  measurement error of profit, 89–92, 93–94, 104–6
    bar chart presentation, 91f
    Consumer Products, 89, 90, 91, 107
    Media Networks, 89, 90–91, 107, 113
    Parks and Resorts, 89, 90, 91, 107, 113
    Studio Entertainment, 89, 90, 91, 107
  quantifying measurement error of profit
    formal analysis, 107–14
    sensitivity of profit reliability, 114–15, 116
Weak inference and organizational mortality, 60–61

About the Author

Lex Donaldson is a professor of management in organizational design at the Australian School of Business, University of New South Wales, in Sydney. He holds a BSc from the University of Aston in Birmingham, England, and a PhD from the University of London. Previously, he worked in the Technical Efficiency and Organization Department of Mullard Ltd. and the Central Personnel Department of Philips Electrical (UK) Ltd. He has conducted numerous quantitative studies on topics ranging from machine utilization to corporate governance.
