
Organizing for Reliability

HIGH RELIABILITY AND CRISIS MANAGEMENT
Series Editors: Karlene H. Roberts and Ian I. Mitroff

SERIES TITLES

Narratives of Crisis: Telling Stories of Ruin and Renewal, by Matthew W. Seeger and Timothy L. Sellnow (2016)
Reliability and Risk: The Challenge of Managing Interconnected Infrastructures, by Emery Roe and Paul R. Schulman (2016)
Community at Risk: Biodefense and the Collective Search for Security, by Thomas D. Beamish (2015)
Leadership Dispatches: Chile's Extraordinary Comeback from Disaster, by Michael Useem, Howard Kunreuther, and Erwann O. Michel-Kerjan (2015)
The Social Roots of Risk: Producing Disasters, Promoting Resilience, by Kathleen Tierney (2014)
Learning from the Global Financial Crisis: Creatively, Reliably, and Sustainably, edited by Paul Shrivastava and Matt Statler (2012)
Swans, Swine, and Swindlers: Coping with the Growing Threat of Mega-Crises and Mega-Messes, by Can M. Alpaslan and Ian I. Mitroff (2011)
Dirty Rotten Strategies: How We Trick Ourselves and Others into Solving the Wrong Problems Precisely, by Ian I. Mitroff and Abraham Silvers (2010)
High Reliability Management: Operating on the Edge, by Emery Roe and Paul R. Schulman (2008)

Organizing for Reliability
A Guide for Research and Practice

Edited by Rangaraj Ramanujam and Karlene H. Roberts

Stanford Business Books An Imprint of Stanford University Press Stanford, California

Stanford University Press
Stanford, California

©2018 by the Board of Trustees of the Leland Stanford Junior University. All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or in any information storage or retrieval system without the prior written permission of Stanford University Press.

Special discounts for bulk quantities of Stanford Business Books are available to corporations, professional associations, and other organizations. For details and discount information, contact the special sales department of Stanford University Press. Tel: (650) 725-0820, Fax: (650) 725-3457

Printed in the United States of America on acid-free, archival-quality paper

Library of Congress Cataloging-in-Publication Data
Names: Ramanujam, Rangaraj, editor. | Roberts, Karlene H., editor.
Title: Organizing for reliability : a guide for research and practice / edited by Rangaraj Ramanujam and Karlene H. Roberts.
Description: Stanford, California : Stanford Business Books, an imprint of Stanford University Press, 2018. | Series: High reliability and crisis management | Includes bibliographical references and index.
Identifiers: LCCN 2017045421 (print) | LCCN 2017047534 (ebook) | ISBN 9781503604537 (e-book) | ISBN 9780804793612 (cloth : alk. paper)
Subjects: LCSH: Organizational effectiveness. | Reliability. | Organizational resilience. | Management.
Classification: LCC HD58.9 (ebook) | LCC HD58.9 .O745 2018 (print) | DDC 658.4/013—dc23
LC record available at https://lccn.loc.gov/2017045421

Cover design: Preston Thomas, Cadence Design; Cover image: iStock
Typeset by Newgen in 10.5 Garamond Roman.

Contents

Acknowledgments  vii

Part I. Setting the Stage

1. Advancing Organizational Reliability (Karlene H. Roberts)  3
2. The Multiple Meanings and Models of Reliability in Organizational Research (Rangaraj Ramanujam)  17

Part II. Important Aspects of Reliable Organizing

3. Three Lenses for Understanding Reliable, Safe, and Effective Organizations: Strategic Design, Political, and Cultural Approaches (John S. Carroll)  37
4. Mindful Organizing (Kathleen M. Sutcliffe)  61
5. Reliability through Resilience in Organizational Teams (Seth A. Kaplan & Mary J. Waller)  90
6. How High Reliability Mitigates Organizational Goal Conflicts (Peter M. Madsen & Vinit Desai)  118
7. Organizational Learning as Reliability Enhancement (Peter M. Madsen)  143
8. Metaphors of Communication in High Reliability Organizations (Jody L. S. Jahn, Karen K. Myers, & Linda L. Putnam)  169
9. Extending Reliability Analysis across Organizations, Time, and Scope (Paul R. Schulman & Emery Roe)  194

Part III. Implementation

10. Organizing for Reliability in Health Care (Peter F. Martelli)  217
11. Organizing for Reliability in Practice: Searching for Resilience in Communities at Risk (Louise K. Comfort)  244
12. Applying Reliability Principles: Lessons Learned (W. Earl Carnes)  274

Epilogue (Karlene H. Roberts & Rangaraj Ramanujam)  301

Contributors  313

Index  319

Acknowledgments

We would like to thank the chapter authors for thoughtfully addressing the key issues, challenges, and future research opportunities in this field. We would like to thank the Owen Graduate School of Management, Vanderbilt University for supporting this work. We would also like to thank Olivia Bartz for her invaluable editorial assistance during the final stretch to our manuscript submission. A special thanks to Margo Beth Fleming, our editor at Stanford University Press. Her critical insight and patient guidance were vital in bringing this project to fruition.


Part I

Setting the Stage


Chapter 1

Advancing Organizational Reliability

Karlene H. Roberts

The field of high reliability organizations (HRO) research is now over thirty years old. This chapter discusses the original reasons for delineating this area of research and the nature of the early research. I go on to indicate the reasons this book is needed at this time and conclude with a brief description of each chapter.

In the Beginning

HRO research began in the mid-1980s at the University of California, Berkeley. Berkeley researchers Todd La Porte, Karlene Roberts, and Gene Rochlin were soon joined by Karl Weick, then at the University of Texas. We were interested in the growing number of catastrophes that appeared to us to have been partially or wholly caused by organizational processes, including the 1981 Hyatt Regency walkway collapse; the 1994 South Canyon Fire, about which Weick wrote (1995, 1996); and the 1984 Union Carbide disaster in Bhopal, India. Unfortunately, the list goes on and on.1 Some time ago we were asked to write reflections on the early work. The first section of this chapter draws heavily on those reflections. As Weick (2012) points out:

Prominent ideas were available to analyze evocative events such as the preceding [Hyatt Regency walkway collapse, etc.]. [Charles] Perrow (1984) had proposed that increasingly tightly coupled, complex systems fostered accidents as a normal occurrence, a proposal that encouraged counterpoint. Barry Turner (1978) had sketched the outlines of organizational culture, the incubation of small failures (later to be conceptualized as "normalization"), and organizational blind spots: "As a caricature it could be said that organizations achieve a minimal level of coordination by persuading their decision makers to agree they will all neglect the same kinds of considerations when they make decisions (p. 166)." Not long thereafter, Linda Smircich (1983) in a definitive [Administrative Science Quarterly] article gave legitimacy to the notion of organizational culture. Trial and error learning was a basic assumption, which meant that the possibility that groups in which the first error was the last trial provoked interest and a search for explanations. . . . My point is that these ideas, and others not mentioned, were available to make sense of an emerging set of organizations that were complex systems, under time pressure, conducting risky operations, with different authority structures for normal, high tempo, and emergency times, and yet in the best cases were nearly error free. (p. 2)

At the beginning of our research, we were introduced to three technologically sophisticated, complex subunits of organizations that were charged with doing their tasks without major incident: the US Navy's Nimitz-class aircraft carriers, the US Federal Aviation Administration's [FAA] air traffic control [ATC] system, and Pacific Gas and Electric Company's [PG&E] Diablo Canyon nuclear power plant. To us these organizations appeared to engage in different processes, or to bundle the same processes differently, than the average organization studied in organizational research. We kicked off the research with a one-day workshop held on the aircraft carrier USS Carl Vinson, which was also attended by members of the other two organizations we planned to study. One outcome of the workshop was that managers in all three organizations felt they faced the same sophisticated challenges.

Important Characteristics of the Initial Work

The HRO project did not start by looking at failures but rather at the manner in which organizations with a disposition to fail had not failed. It became readily apparent that HROs do not maintain reliability purely by mechanistic control or by redundancy or by "gee whiz" technology. They work into the fabric of these mechanistic concerns a mindset and culture that makes everyone mindful of their surroundings, how they are unfolding, and what they may be missing. High reliability organizing deploys limited conceptions to unlimited interdependencies. These organizations are set apart from other organizations because they handle complexity with self-consciousness, humility, explicitness, and an awareness that simplification inherently produces misrepresentations.

The initial high reliability project focused on current functioning because we researchers knew little of past practices and operations. In all cases it took many years to reach the level of performance observed by the researchers. For example, the Air Commerce Act was signed into law in the United States in 1926. It contained provisions for certifying aircraft, licensing pilots, and so on. By the mid-1930s there was a growing awareness that something needed to be done to improve air travel safety. At the same time, the federal government encouraged airlines to embed control centers in five US airports. Maps, blackboards, and boat-shaped weights were used to track air traffic. Ground personnel had no direct radio link with aircraft, and ATC centers contacted each other by phone. Technological changes have vastly altered what high reliability functioning looks like today. Not long after the emergence of ATC centers, semi-automated systems were developed based on the marriage of radar and computer technology. In 2004 the US Department of Transportation announced plans for a "next gen" system to guide air traffic from 2025 onward. This plan will take advantage of the growing number of onboard technologies for precision guidance (Federal Aviation Administration, 2015).

Because the researchers had no experience with the histories of these organizations, many existing organizational processes that may no longer serve a purpose were probably not uncovered. An apocryphal story is told about the US Army on the eve of World War II. A senior officer was reviewing an artillery crew in training. The officer noticed that each time the gun fired, one of the firing team members stood off to the side with his arm extended straight out and his fist clenched. The inspecting officer asked the purpose of this procedure, but no one seemed to know. Sometime later a World War I veteran reviewed the gun drill and said, "Well, of course, he's holding the horses." An obsolete part of the drill was still in use (Brunvand, 2000), as is probably true in HROs.


The units under study in the original research were subunits of larger organizations and not necessarily representative of the organization as a whole. Flight operations, for example, are essential to the missions of an aircraft carrier but not its entire menu of complex tasks, which include navigation, weapons handling, supply, housing and feeding six thousand people, and so on (Rochlin, 2012).2 The carrier is central to the task force and is an important part of the navy, which is part of the US Department of Defense. It was beyond the scope of the project to determine the contribution to the nested series by safer or, rather, more reliable operation of the suborganizations. By studying subunits the team may have created an error of the third kind (Mitroff & Featheringham, 1974)—that is, solving the wrong problem precisely by only examining part of the problem. Paul Schulman and Emery Roe attempt to rectify this problem in Chapter 9.

More research is needed to explore how organizations or units of organizations that can fail disastrously are linked to other organizations or units of organizations. Specifically, more attention needs to be given to organizations that help other organizations fail or fail alongside them (Roberts, 2012). The failure of BP and its semisubmersible deepwater drilling rig Deepwater Horizon is a good example. As reported, on April 20, 2010, the Macondo well blew out; this accident cost the lives of eleven men and began an environmental catastrophe when the drilling rig sank and over four million barrels of crude oil spilled into the Gulf of Mexico (National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling, 2011, back cover).

There are a number of reasons for the paucity of research on interlinked and interdependent organizations. Such research is costly and resource demanding. Moreover, most organizations in which some units need to avoid catastrophe are complex in a manner that would require large, multidisciplinary research teams. The Diablo Canyon nuclear power plant is an example of an interdependent, hence complex, organization requiring extensive research resources. It is enmeshed in the problems, politics, and legalities of PG&E and its regulator, the California Public Utilities Commission (CPUC), to say nothing of local community politics. Building a multidisciplinary (or interdisciplinary) team is not easy. Scholars are not used to talking with other scholars who speak different languages and are enmeshed in different constructs.

The original HRO conceptual and process findings were rooted in intensive case studies of the three organizations. Early on, we realized that formal interviews and questionnaires were of little value in organizations in which researchers didn't have an available literature on which to build. Both these research methodologies assume researchers know some basics about what is going on in the organization.

A Conceptual Problem That Doesn't Go Away

A frequent criticism of HRO research is the lack of agreement on a definition by the (now) many authors contributing to the work (e.g., Hopkins, 2007). The chapters in this book reflect this lack of consensus. Despite this repeated criticism, early in the work Rochlin (1993, p. 16) provided a list of defining criteria that seem to provide fairly clear boundaries for organizations to be labeled as high reliability or reliability seeking:

1. The organization is required to maintain high levels of operational reliability and/or safety if it is to be allowed to continue to carry out its tasks (La Porte & Consolini, 1991).
2. The organization is also required to maintain high levels of capability, performance, and service to meet public and/or economic expectations and requirements (Roberts, 1990a, 1990b).
3. Because of the consequentiality of error or failure, the organization cannot easily make marginal trade-offs between capacity and safety. In a deep sense, safety is not fungible (Schulman, 1993).
4. As a result, the organization is reluctant to allow primary task-related learning to proceed by the usual modalities of trial and error for fear that the first error will be the last trial (La Porte & Consolini, 1991).
5. Because of the complexity of both technology and task environment, the organization must actively manage its activities and technologies in real time while maintaining capacity and flexibility to respond to events and circumstances that can at most be generally bounded (Roberts, 1990a, 1990b).
6. The organization will be judged to have "failed"—either operationally or socially—if it does not perform at high levels. Whether service or safety is degraded, the degradation will be noticed and criticized almost immediately (Rochlin, La Porte, & Roberts, 1987).

The labeling problem is further compounded by the fact that most high reliability research still selects on the dependent variable by first identifying organizations that researchers think are or should be high reliability or reliability seeking. But, does reliability mean the same thing to all employees in a single organization or across organizations? Definitional problems, too, may have led to the fact that the research project went by several names before "high reliability" stuck as the work matured. It is disconcerting that the acronym HRO has become a marketing label: "When it is treated as a catchword this is unfortunate because it makes thinking seem unnecessary and even worse, impossible. The implication is that once you have achieved the honor of being an HRO, you can move on to other things" (Weick, 2012).

Early Research Findings

According to Chrysanthi Lekka (2011), the original research identified several characteristics and processes that enabled the three organizations to achieve and maintain their excellent safety records (e.g., Roberts & Rousseau, 1989; Roberts, 1990b, 1993a; La Porte & Consolini, 1991; Roberts & Bea, 2001). These include:

Deference to expertise. In emergencies and high-tempo operations, decision making migrates to people with expertise regardless of their hierarchical position in the organization. During routine operations decision making is hierarchical.

Management by exception. Managers focus on the "bigger picture" (strategy) and let operational decisions be made closer to the decision implementation site. Managers monitor these decisions but only intervene when they see something that is about to go wrong (Roberts, 1993a, 1993b).

Continuous training. The organization engages in continuous training to enhance and maintain operator knowledge of the complex operations within the organization and improve technical competence. Such training also enables people to recognize hazards and respond to "unexpected" problems appropriately and is a means to build interpersonal trust and credibility among coworkers.

Safety-critical information communicated using a number of channels. Using a variety of communication channels ensures that workers can receive and act on information in a timely way, especially during high-tempo or emergency operations. For example, at the time of the research, nuclear-powered aircraft carriers used twenty different communication devices ranging from radios to sound-powered phones (Roberts, 1990b). Currently, the Boeing 777 aircraft uses eight communication devices.

Built-in redundancy. The provision of backup systems in case of failure is one redundancy mechanism. Other such mechanisms include internal cross-checks of safety-critical decisions and continuous monitoring of safety-critical activities (e.g., Roberts, 1990b; Hofmann, Jacobs, & Landy, 1995). Nuclear-powered aircraft carriers operate a "buddy system" whereby activities carried out by one individual are observed by a second crew member (Roberts, 1990b).

Organizing for Reliability: A Guide for Research and Practice

The central purpose of Organizing for Reliability is to showcase the different perspectives about high reliability organizing that have emerged over the past thirty years. We feel that all too often reliability is embedded in studies of accidents (e.g., Caird & Kline, 2004), safety (e.g., Naveh & Katz-Navon, 2015), errors (e.g., Jones & Atchley, 2002), and disasters (e.g., Guiberson, 2010) rather than being given attention in its own right. The basic question in HRO research is, What accounts for the exceptional ability of some organizations to continuously maintain high levels of operational reliability under demanding conditions? Its importance is underscored by distant accidents (e.g., the Challenger and Columbia space shuttles) and more recent mishaps (e.g., the Volkswagen emissions debacle, PG&E's San Bruno, California, pipeline explosion) in which organizations failed to operate reliably. Contributors to this book are richly diverse in terms of career stages (from junior to very senior authors), academic disciplines (organizational studies, industrial/organizational psychology, social psychology, sociology, communications, public health, and public policy), and familiarity with many organizational contexts. One point of consensus among these authors is clear: it is time to provide a relatively full panoply of HRO research in one place so that researchers can identify what they think are next steps, and practitioners can pick up strategies to help them in making their organizations more reliable.

Organizing for Reliability is organized in three parts. Part I, "Setting the Stage," offers some important background to high reliability research. Chapter 1 provides a short history of the work and an introduction to the remaining chapters. Chapter 2 discusses the various meanings of reliability in organizational research and the need to develop context-specific models of reliability. Rangaraj Ramanujam notes that reliability is an increasingly multifaceted construct that currently covers multiple notions (levels of analysis, organizational capabilities, and assessment criteria). As a result, the overlap among various disciplinary approaches is growing. This overlap presents untapped opportunities for research on reliability, and Ramanujam identifies many for us. Finally, Ramanujam discusses implications of multiple notions and models of reliability for future research. In all this he broadens out the relatively narrow purview of the history of the research.

John S. Carroll opens Part II, "Important Aspects of Reliable Organizing," by examining reliability in the broader context of interrelated theoretical concepts and perspectives. Framing reliability as "an intersection of effectiveness, safety, and resilience," he proposes that organizing for reliability can be more meaningfully understood by viewing organizations through three distinct lenses: strategic design, political, and cultural. Carroll cautions against relying exclusively on any one perspective, noting that multiple perspectives must be considered together to recognize "the complex interdependencies of the organization, the difficulties of implementing change, and the heterogeneity among individuals and groups." He concludes Chapter 3 by identifying the dominant perspective underlying each of the subsequent chapters.

In Chapter 4 Kathleen M. Sutcliffe investigates mindful organizing, a dynamic process comprising ongoing patterns of action that fuel capabilities to more quickly sense and manage complex, ill-structured contingencies. Mindful organizing enables collective mindfulness through actions that promote "preoccupation with failure, reluctance to simplify interpretations, sensitivity to operations, commitment to resilience, and flexible decision structures that migrate problems to pockets of expertise," all of which operate at multiple organizational levels. Sutcliffe traces the conceptual underpinnings of mindful organizing and the growth in theory and empirical research. She concludes that although mindful organizing was initially studied in the context of "prototypical" HROs, it is increasingly important for all organizations and for nonreliability outcomes such as innovation. She calls for more research to better understand the meaning of reliability as a property of relationships that can be developed and enhanced in mundane settings.

Seth A. Kaplan and Mary J. Waller focus on team resilience in Chapter 5. Team composition in high-risk work settings is typically subject to ongoing change that accelerates during crisis situations. These authors discuss several strategies that teams can implement to enhance resilience. First, they discuss the roles of internal team dynamics such as changes in team composition and emergent states. Second, they point to the importance of team boundary dynamics such as adapting team composition to events, rapid onboarding of new members, and managing internal and external communication with stakeholders. Third, they suggest that these capabilities can result from specific actions developed and enhanced through simulation-based training. Simulations can create a learning context that allows teams to experience the interplay of dynamic behaviors and states and develop resilience-enhancing behaviors and emergent states.

Peter M. Madsen and Vinit Desai examine goal conflict in the context of organizational reliability in Chapter 6. They note that the tension between reliability goals and efficiency goals is particularly magnified and problematic in high-risk work settings where multiple goals are high in both their performance relatedness (e.g., decisions that affect reliability are also likely to affect other performance-linked goals) and their causal ambiguity (e.g., uncertainty regarding the effects of specific actions on various performance dimensions). One implication is that typical approaches to resolving goal conflict such as identifying and pursuing a single overriding goal or addressing multiple goals concurrently or sequentially may be inadequate or even infeasible. A collective dynamic capability to continually assess and resolve goal conflicts is required. These authors identify and elaborate on four processes that enable HROs to manage goal conflicts—continuous bargaining over goal priority, balancing incentives to reward both efficiency and safety, incremental decision making that builds on the analysis of even very small errors and failures, and commitment to resilience that enables the organization to allocate operational resources to areas of operational weakness whenever problems are anticipated.

In Chapter 7 Madsen assesses the role of organizational learning in enhancing reliability. He points out that sustained reliability in high-risk/high-hazard settings requires organizations to engage in different forms of learning—experiential learning, vicarious learning, learning from near misses, and simulation learning. He reviews the HRO literature as well as the broader organizational literature and offers an integration using Turner's (1978) disaster incubation model. The four learning strategies together facilitate the early identification and correction of latent errors, appropriate timely response to contain the severity of a disaster as it unfolds, and effective cultural readjustment in the aftermath of a disaster.

Jody L. S. Jahn, Karen K. Myers, and Linda L. Putnam discuss three mechanisms of reliability enhancement and state that they depend on communication to make it possible to negotiate meanings and actions in organizations. These authors show that studies of HROs often conceptualize communication as accessory to action or as operating at the periphery of organizations. Chapter 8 illuminates these perspectives by comparing communication-relevant studies of HROs with different views of what communication is and how it functions. The authors adopt five metaphorical lenses to review the role of communication in the HRO literature, which they find has relied on some lenses rather than others. Finally, they demonstrate how HRO research could be strengthened by adopting more complex understandings about the relationship between communication and highly reliable performance.

Paul R. Schulman and Emery Roe propose a reformulation of the scope and time frame of reliability research in Chapter 9, arguing that reliability is increasingly the property of an interorganizational network, not just a single organization as typically assumed. They identify public dread of hazard and regulation as critical yet understudied factors that shape the practice of reliability management. They examine reliability using different standards and performance states. Most studies focus on a single standard—precluded events (i.e., outcomes that simply cannot be allowed to happen, such as a radioactive release into the environment). However, several other standards must also be considered—for example, inevitable events (i.e., events that cannot be avoided, where the focus is on speedy recovery, such as a power outage) and avoided events (i.e., events that should be avoided, such as foreseeable errors). Similarly, the assessment of reliability must take into account an expanded set of performance states (e.g., normal operations, disruption, restoration, failure, recovery, and establishment of a "new normal") and adopt a much longer-term ("generations") orientation than is currently the case.

Many readers may want to turn immediately to Part III, "Implementation," as it contains numerous ideas about how to improve reliability and provides some cautions about what, perhaps, not to try. In Chapter 10 Peter F. Martelli focuses on improving reliability in the health-care sector. He begins by noting that uniform reliable safety has not been achieved in these organizations and continues by discussing assumptions and challenges to high reliability in the quest for quality and safety in health care. Health care's first forays into the HRO world were in anesthesia, which borrowed principles from aviation. High reliability was rarely mentioned in the health-care literature in the 1990s. Instead, formal language and perspectives came from human factors engineering. By the mid-2000s, reliability was being regularly discussed in the literature, and two books that greatly influenced the future of high reliability research and application were published. In the last decade, acceptance of high reliability as an approach to providing greater safety in health care has been steadily growing. Still, there are significant gaps in understanding HRO as a theory, especially with respect to scope conditions and transferability and preconditions to sustained implementation. Martelli describes one of the larger implementation efforts. After reviewing what HRO is and what problem it is trying to solve in health care, he ends with a call to "preserve our pioneer spirit, drawing lessons across disciplinary boundaries in order to explore the character of work, the character of error, and the technical, economic, and institutional forces that promote reliability seeking in complex, interdependent organizations."

Louise K. Comfort examines organizing for reliability at the community level in Chapter 11. While the majority of HRO research has dealt with single organizations, she notes that organizations are nested in various levels of other organizations and systems of organizations, which create complex adaptive systems of systems (CASoS). The issue is how each system can scale up and down to achieve coherent performance. To illustrate the breakdown in CASoS, Comfort examines responses to the 2004 Indonesian earthquake and the 2011 Japanese earthquake, tsunami, and nuclear breach. She reviews how planning processes differed in each case and how various factors reduced or increased risk. Finally, she provides steps essential to building global resilience.

In Chapter 12 W. Earl Carnes addresses the issue of just how difficult it is to implement high reliability processes in organizations. He comes from the world of implementation and provides a unique perspective in this book.

Preferring the term "reliability seeking organization" (RSO), Carnes first discusses what a high reliability world looks like by answering the following questions: (1) How do RSOs get to be this way? (2) Why is it so hard to see what people in RSOs do? and (3) What's important to know if you want to be an RSO? To address the issue of implementation, Carnes invited thirty-one professional colleagues to share their answers to the question, "What are the most important lessons you have learned about applying high reliability principles and procedures?" He categorizes their responses into five high-level lessons learned. Carnes then turns to some of the important behaviors mentioned by his respondents. He concludes by asking how to implement these behaviors. In implementation we must ask three questions: What does "good" look like, how are you doing, and how do you know?

In the epilogue, we point out that a common definition of reliability remains elusive and that the expansive definitional set we have suggests the need for integrating research from different overlapping fields of inquiry. Since different authors discuss different levels of analysis, we also need work on how reliability at one level influences reliability at another level. While authors discuss different theoretical perspectives, most focus on only one or, at best, two perspectives; the political perspective, as discussed by Carroll, is not often discussed in the wider literature. Finally, one is still left with the question of whether HRO findings are generalizable across organizations.

Notes

1. In December 2016 Worldwatch Institute reported that natural disasters are on the increase. Even earlier, in 2014, the secretary general of the International Civil Defense Organization stated, "Today, natural and man-made disasters are increasing in frequency and destructive capability" (Kuvshinov, 2014, p. 1).

2. The original research was done on Nimitz-class carriers. The US Navy is currently building a new carrier class, the Gerald R. Ford class, and the first carrier in the series has been delivered for sea trials. This class is characterized by new technologies as well as other design features intended to improve efficiency, reduce operating costs, reduce crew size, and provide greater comfort. (See http://www.thefordclass.com; Bacon, 2014.)

References

Bacon, L. M. (2014, October 13). Crew's ship: Sailors' comfort a centerpiece of new supercarrier Ford. Navy Times. Retrieved from https://www.navytimes.com/story/military/tech/2014/10/13/crews-ship-sailors-comfort-a-centerpiece-of-new-supercarrier-ford/17180325/
Brunvand, J. H. (2000). The truth never stands in the way of a good story. Urbana: University of Illinois Press.
Caird, J. K., & Kline, T. J. (2004). The relationships between organizational and individual variables to on-the-job driver accidents and accident-free kilometres. Ergonomics, 47, 1598–1613.
Federal Aviation Administration. (2015). A brief history of the FAA. Retrieved from https://www.faa.gov/about/history/brief_history/
Guiberson, B. Z. (2010). Disasters: Natural and man-made catastrophes through the centuries. New York: Holt.
Hofmann, D., Jacobs, R., & Landy, F. (1995). High reliability process industries: Individual, micro, and macro organizational influences on safety performance. Journal of Safety Research, 26(3), 131–149.
Hopkins, A. (2007). The problem of defining high reliability organisations (Working paper 51). Canberra: Australian National University.
Jones, T. C., & Atchley, P. (2002). Conjunction error rates on a continuous recognition memory task: Little evidence for recollection. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 374–379.
Kuvshinov, V. (2014, March 1). Message of Dr. Vladimir Kuvshinov, secretary general of the International Civil Defence Organisation, on the occasion of the 2014 World Civil Defense Day. Retrieved from http://www.icdo.org/files/9113/9357/9173/WCDD_Message_EN.pdf
La Porte, T. R., & Consolini, P. (1991). Working in practice but not in theory: Theoretical challenges of "high-reliability organizations." Journal of Public Administration Research and Theory, 1(1), 19–47.
Lekka, C. (2011). High reliability organisations: A review of the literature (Research Report 899). London: Health and Safety Executive.
Mitroff, I., & Featheringham, T. (1974). On systemic problem solving and the error of the third kind. Behavioral Science, 19(6), 383–393.
National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling. (2011). Deep water: The Gulf oil disaster and the future of offshore drilling. Report to the president. Washington, DC: US Government Printing Office.
Naveh, E., & Katz-Navon, T. (2015). A longitudinal study of an intervention to improve road safety climate: Climate as an organizational boundary spanner. Journal of Applied Psychology, 100(1), 216–226.
Perrow, C. (1984). Normal accidents: Living with high-risk technologies. New York: Basic Books.
Roberts, K. H. (1990a). Managing high reliability organizations. California Management Review, 32, 101–113.
Roberts, K. H. (1990b). Some characteristics of one type of high reliability organization. Organization Science, 1(2), 160–176.
Roberts, K. H. (Ed.). (1993a). New challenges to understanding organizations. New York: Macmillan.
Roberts, K. H. (1993b). Some aspects of organizational culture and strategies to manage them in reliability enhancing organizations. Journal of Managerial Issues, 5, 165–181.
Roberts, K. H. (2012). Reflections on high reliability research (Paper prepared for the Vanderbilt workshop). Nashville, TN: Vanderbilt University.
Roberts, K. H., & Bea, R. G. (2001). When systems fail. Organizational Dynamics, 29, 179–191.
Roberts, K. H., & Rousseau, D. M. (1989). Research in nearly failure-free, high-reliability organizations: "Having the bubble." IEEE Transactions on Engineering Management, 36, 132–139.
Rochlin, G. I. (1993). Defining high reliability organizations in practice: A taxonomic prolegomenon. In K. H. Roberts (Ed.), New challenges to understanding organizations (pp. 11–32). New York: Macmillan.
Rochlin, G. I. (2012). Bullets over Vanderbilt (Paper prepared for the Vanderbilt workshop). Nashville, TN: Vanderbilt University.
Rochlin, G. I., La Porte, T. R., & Roberts, K. H. (1987). The self-designing high-reliability organization: Aircraft carrier flight operations at sea. Naval War College Review, 40, 76–90.
Schulman, P. (1993). The analysis of high reliability organizations: A comparative framework. In K. H. Roberts (Ed.), New challenges to understanding organizations (pp. 33–54). New York: Macmillan.
Smircich, L. (1983). Concepts of culture and organizational analysis. Administrative Science Quarterly, 28(3), 339–358.
Turner, B. A. (1978). Man-made disasters. London: Wykeham Science.
Weick, K. E. (1995). South Canyon revisited: Lessons from high reliability organizations. Wildfire, 4, 54–68.
Weick, K. E. (1996). Drop your tools: An allegory for organizational studies. Administrative Science Quarterly, 41(2), 301–313.
Weick, K. E. (2012). Continuities in HRO thought (Paper prepared for the Vanderbilt workshop). Nashville, TN: Vanderbilt University.

Chapter 2

The Multiple Meanings and Models of Reliability in Organizational Research

Rangaraj Ramanujam

Introduction

Reliability has been the focus of systematic organizational research since the pioneering work of a group of researchers at the University of California, Berkeley, in the 1980s (Roberts, 1990). The researchers were curious about the seemingly theory-defying ability of some organizations to avoid catastrophic operational outcomes despite operating technologies that were fraught with exceptionally high levels of risk, uncertainty, hazard, and public intolerance of failures (La Porte & Consolini, 1991; Roberts, 1993). Their "careful description of a special set of organizations" (Boin & Schulman, 2008, p. 1053) was encapsulated in the acronym HRO—that is, high reliability organizations. Their primary conclusion—HROs' attainment of exceptionally high and sustained levels of reliability is the outcome of deliberate and ongoing organizational efforts—continues to serve as the guiding premise for subsequent research.

The impact of this work on research, policy, and practice has been remarkable. The last three decades have seen a steady stream of HRO-related research articles (for periodic reviews see Roberts, 1993; Weick, Sutcliffe, & Obstfeld, 1999; Roe & Schulman, 2008; Bourrier, 2011; Lekka, 2011), journal special issues (e.g., Waller & Roberts, 2003; Reinertsen & Clancy, 2006; Ansell & Boin, 2011), and books (Weick & Sutcliffe, 2001, 2007, 2015; Roe & Schulman, 2016).

Investigations of major accidents regularly draw on HRO research to frame their analysis and recommendations (e.g., Columbia Accident Investigation Board [CAIB], 2003). Industries such as nuclear power and chemicals have developed their own "HRO operating manuals" (see Chapter 12). The leaders of The Joint Commission (TJC), the accreditation agency for hospitals in the United States, have called for the adoption of "HRO principles" to enhance patient safety in health-care delivery (Chassin & Loeb, 2013).

As Gene Rochlin (1993) presciently observed, a three-letter acronym like HRO, if used without careful qualifications, risks conveying a misleading picture about the extent to which the underlying issues have been settled and agreed on. The popularity of the HRO label has increased that risk. In reality, as with any vibrant field of academic inquiry, reliability research is nuanced, dynamic, and disagreement-prone. This chapter revisits what it means to study the reliability of organizations, a topic that has been previously addressed with clarity and insistence (e.g., Roberts, 1993; La Porte, 1996; Weick et al., 1999; Roe & Schulman, 2008). An updated discussion could encourage synthesis across disciplines, reduce duplication of efforts, and identify opportunities for advancing understanding.

This chapter is organized around two basic themes—multiple meanings of reliability in organizational research and the need for developing context-specific models of organizational reliability. First, I discuss the evolving definition of reliability that currently encompasses several distinct but interrelated notions (e.g., safety or the avoidance of harmful outcomes, resilience or the effective response to and recovery from shocks to the system, etc.), spans multiple levels of analysis (e.g., team, organizational, and interorganizational), calls for a diverse mix of organizational capabilities (e.g., prevention, anticipation, response to unexpected events, recovery, learning, and resilience), and points to a variety of standards for assessing reliability (e.g., avoidance of precluded events or accidents, absence of harm, rates of errors and/or near misses, recovery from failures, and avoidance of service disruption). Second, I draw attention to the growing overlaps between high reliability management (Roe & Schulman, 2008) and other approaches such as systems engineering (Leveson, Dulac, Marais, & Carroll, 2009), error management (Frese & Keith, 2014), and safety management (Grote, 2012) and the opportunities they present for organizational research. Third, I examine recent suggestions that the effective pursuit of organizational reliability might take different forms in different contexts. In other words, organizational reliability might be better understood through multiple context-specific models or descriptions rather than through a single context-invariant model (Boin & Schulman, 2008; Amalberti, 2013). In this regard, market pressures and regulatory oversight represent potentially important contextual features that can account for different organizational approaches to managing reliability. Finally, I discuss the implications of the multiple notions and models of reliability for future research.

Delineating Reliability: Multiple Notions, Capabilities, Levels of Analysis, and Standards

Emery Roe and Paul Schulman (2008) have conveyed the various meanings attached to the term "reliability" in organizational research:

For some reliability means constancy of service; for others the safety of core activities and processes (La Porte 1996). Increasingly it means both anticipation and resilience, the ability of organizations to plan for shocks as well as to absorb and rebound from them in order to provide services safely and continuously. (p. 5)

Several features of this excerpt are noteworthy. First, it refers to at least four distinct, but interrelated, notions of reliability—performance consistency, safety, resilience, and continuity of service. Second, each notion of reliability implies specific standards for assessing reliability. For example, the criteria for assessing safety are different from those for assessing resilience. Third, implicit in the excerpt is that reliability can be studied at multiple levels of analysis such as a team, an organization, or an interorganizational network. The level of analysis in turn informs the demarcation of the "system" that should function reliably. For instance, given an organizational-level focus, the relevant system comprises organizational incentives, formal structures, organizational culture, the technology, the people, and the various defenses in depth, and so on, that need to function individually and jointly in ways that contribute to the overall reliability of the organization. Finally, each notion of reliability emphasizes a particular set of organizational capabilities linked to anticipation and resilience. Anticipation includes preparing for and preventing foreseeable errors and predictable surprises through careful design, rigorous training, thoughtful compliance, periodic inspections, continuous learning from near misses, and alertness to early signs of trouble (Weick et al., 1999). Resilience is about how the organization responds both during an adverse event or shock as well as in its aftermath (Vogus & Sutcliffe, 2007). As the event unfolds, resilience enables timely response to the event, adequate containment of its adverse consequences, and maintenance of service continuity. In the aftermath of the event, resilience aids effective recovery and learning from the event, which in turn enhances anticipation and resilience.

Most definitions of reliability encompass multiple notions, capabilities, levels of analysis, and standards. Nevertheless, it is useful to consider each notion of reliability separately to understand the corresponding collective capabilities that it emphasizes and the specific reliability standards it implies. See Figure 2.1 for a summary of the discussion.

The notion of reliability as performance consistency refers to the ability of a component (which can be a team or an organization depending on how one defines the system) to consistently perform its intended or required function or mission, on demand and without degradation or failure. Formally, it refers to the probability that a component satisfies its specified behavioral requirements over time and under given conditions—that is, it does not fail (Leveson, 2012). Performance consistency emphasizes organizational capabilities for anticipation through design, training, and ongoing monitoring through audits and other tests. The associated reliability standards assess repeatability and predictability in operational processes and outcomes (e.g., low variance). This viewpoint assumes that the technology and tasks function predictably for the most part, and it emphasizes compliance with standardized procedures. By contrast, most organizational studies of reliability emphasize the potential for the unexpected in technology and tasks. As Karl Weick, Kathleen Sutcliffe, and David Obstfeld (1999) have argued, unvarying procedures cannot address what they were not designed to anticipate.
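The formal engineering usage cited above has a standard expression, which is not part of this chapter's own exposition and is offered here only as an illustrative sketch. Writing T for the time to a component's first failure, reliability over a mission time t is conventionally

R(t) = \Pr(T > t), \qquad F(t) = 1 - R(t), \qquad t \ge 0,

where R(t) is the probability that the component meets its specified behavioral requirements through time t under stated operating conditions, and F(t) is the complementary probability that it has failed by then. The organizational notions of reliability discussed in the remainder of this section range well beyond this single quantity.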

[Figure 2.1 Reliability Standards across Levels and Organizational Capabilities. The figure arrays the level of analysis (individuals, teams, organizational subunit, organizational, interorganizational) against organizational capabilities unfolding over time, and lists the standards associated with each capability: anticipation and prevention criteria (absence of precluded events; rates of errors and near misses; noncompliance with safety procedures; presence of error-inducing structures and processes, e.g., weak safety climate, work overload); response to adverse event criteria (speed of response; magnitude of adverse outcomes, both local and system-wide); and resilience criteria (avoiding service disruption; restoration of services to pre-event levels; enhanced post-event service delivery; post-event decrease in rates of errors and near misses).]


Reliability as safety emphasizes the avoidance of harm through ongoing management of the risk of injury, danger, or loss from routine operations. Organizational studies of reliability are at least in part studies of safety. For instance, the initial studies of HROs were primarily focused on the avoidance of catastrophic events such as a meltdown in a nuclear power plant (Roberts, 1993). The growing number of reliability-based studies in health care focus on outcomes that endanger patient safety such as medication errors and hospital-acquired infections (Chassin & Loeb, 2013). Safety requires organizational capabilities for both anticipation as well as resilience. Anticipation refers to design and planning to avoid preventable errors and early detection of unpredictable errors, whereas resilience calls for collective processes that enable quick response to contain harm from developing situations, strategies for containing errors before they escalate, and recovering from adverse events or shocks (Roe & Schulman, 2008).

Safety can be assessed using multiple outcome-based and process-based standards. A basic standard, of course, is the sustained absence of unacceptably harmful events ("precluded events") over an extended period (see Schulman and Roe, Chapter 9). However, because precluded events are extremely rare, their absence in and of itself conveys little information about how close the organization might have come to experiencing such an outcome. A more frequently used set of outcome-based safety standards focuses on intermediate outcomes or precursor events—for example, errors, violations, near misses, and incidents (Reason, 1997). Process-based assessments of safety examine the extent to which safety-enhancing structures and processes are present (e.g., survey-based assessments of safety climates).

It is worth noting that the "high" in HRO initially referred not only to the high level of safety outcomes (i.e., success in avoiding a precluded event over very long time periods) but also to the high level of challenges presented by the operating environment (Rochlin, 1993). That is, organizations were deemed to be HROs not only because of their exceptional track record of safe outcomes but also because they managed to do so in the face of exceptionally high levels of risk, hazard, complexity, uncertainty, and interdependence in their operating environments. The initial HRO case studies described in great detail the challenging operating conditions and essentially assessed safety as a binary outcome. Subsequent studies, however, have tended to assume the existence of challenging operating conditions and assess reliability as a continuous variable (e.g., rates of errors). In this sense, the standard of "high" has become somewhat blurred.

Reliability as resilience focuses on the organizational response to and recovery from a major shock such as an accident (Woods, 2006). Timothy Vogus and Sutcliffe (2007) define resilience as "the maintenance of positive adjustment under challenging conditions such that the organization emerges from those conditions strengthened and more resourceful. By 'challenging conditions' we include discrete errors, scandals, crises, and shocks, and disruptions of routines as well as ongoing risks (e.g., competition), stresses, and strain" (p. 3418). This notion of reliability emphasizes organizational capabilities for continuously monitoring operating conditions, so that deviations from safety limits can be detected and contained early, and for recovering from challenging conditions. Whereas anticipation emphasizes prevention, resilience emphasizes recovery. A key idea is the distinction between recovery and resilient recovery (Bonanno, 2004). Recovery emphasizes the notion of reliability as service continuity—that is, the capability of the organization to continue delivery of products or services at acceptable predefined levels following a disruptive incident. It implies a return to a pre-event level of capabilities and performance. However, in resilient recovery, not only does the organization recover, but it also emerges from the event with enhanced capabilities and performance (Bonanno, 2004). In other words, learning is integral to resilient recovery. Reliability as resilience suggests different standards such as time for recovery from an event, improved safety outcomes following recovery, and faster response and recovery from subsequent shocks. However, to date, few organizational studies of resilience have empirically examined the outcomes linked to resilience.

What emerges from this discussion is a complex picture of reliability as a multifaceted construct. The different meanings of reliability—performance consistency, safety, service continuity, and resilience—are not mutually exclusive. Together, they offer a wide-ranging description of organizational reliability. It is the organization's ability to continuously provide a critical set of products or services of predefined quality without disruption by intentionally managing the risks to the safety of processes and people within and without the organization and by responding to and recovering and learning from unexpected adverse events. This means that reliability cannot be simply equated to the absence of errors and accidents or, for that matter, to the absence of the causes of errors and accidents.

For instance, high workload; inadequate knowledge, ability, or experience of employees; poor interface design; inadequate supervision or instruction; and a stressful environment are some common causes of errors in organizational settings (Reason, 1997; Frese & Keith, 2014). However, their mere presence or absence is neither necessary nor sufficient for inferring reliability. Contexts that warrant reliability tend to be error-prone because of the highly complex, uncertain, and interdependent nature of work (Perrow, 1984). In other words, the pursuit of reliability often occurs and acquires significance in the presence of error-inducing conditions (Ramanujam & Goodman, 2011).

Reliability and Related Concepts

As the foregoing discussion indicates, the definition of "reliability" refers to other constructs such as safety, resilience, and accidents, which are subjects of research in their own right. Although they represent distinct ideas, there is growing overlap and convergence in how they are framed and studied. For instance, discussions of safety management are increasingly framed in terms of processes linked not only to safety but also to resilience and learning (Grote, 2012). In organizational psychology, error management refers to an active approach to errors that reduces negative error consequences and enhances positive effects, such as learning and innovation (Frese & Keith, 2014). The basic premise is that error management is made possible through quick detection and damage control as the errors happen. In resilience engineering the ability to perform in a resilient manner is distinguished from the ability to avoid failures and breakdowns. Resilience is defined as a system's intrinsic ability to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions (Hollnagel, Pariès, Woods, & Wreathall, 2011). This convergence in various disciplines toward a broader definition of the scope of inquiry is consistent with a basic premise of HRO research that organizational reliability is best understood through multiple disciplinary perspectives. It prompts a reassessment of how organizational research can draw from as well as distinctively contribute to this merging research stream.

A couple of examples might illustrate how organizational research can draw on research about safety. In studies of HROs, reliability is implicitly treated as a necessary condition for safety (La Porte, 1996; Roberts, 1990; Weick et al., 1999). However, Nancy Leveson and colleagues (Leveson et al., 2009) challenge this assumption. They argue that, from a systems engineering perspective, reliability and safety represent different system properties and that one does not imply the other. In other words, it is conceivable that a system can be reliable yet unsafe, or it can be safe but unreliable. The two could also be in conflict so that efforts to improve reliability might decrease safety, and vice versa. For instance, efforts to promote compliance with safety protocols (i.e., reliability as performance consistency) could potentially undermine safety if they also constrain organizational ability to respond flexibly to unexpected situations. In complex and dynamic work settings, for example, deviating from the protocols (i.e., reducing performance consistency) might be necessary to ensure safety (Reason, 1997). Such different viewpoints partly reflect differences in how reliability and safety are defined in organizational research and in systems engineering. Nevertheless, such contradictory viewpoints compel organizational researchers to reexamine their assumptions about the link between safety and reliability. By formulating reliability and safety as outcomes that can be observed at different interacting levels of a hierarchical system, the systems engineering viewpoint highlights the need for organizational researchers to consider cross-level effects—for example, how does group-level reliability affect, and how is it affected by, organizational-level reliability? Such questions have received little attention in organizational research (for an exception, see Roe and Schulman's [2008] insightful account of the role of reliability professionals as critical linking mechanisms between the operational and institutional parts of the organization).

Research on safety climate is another example of a multiple disciplinary approach. Although much of this initial research was carried out in manufacturing settings to study employee safety in the context of industrial accidents (Zohar, 2000), it has been extended to other settings, including health-care organizations where it is used to study patient safety. In general, climate refers to "the shared perceptions of the employees concerning the practices, procedures, and the kind of behaviors that get rewarded, supported, and expected in a setting" (Schneider, 1990, p. 22). Climate influences people's behaviors by influencing how they think and feel about certain aspects of their environment such as service, innovation, safety, or implementation (Klein & Sorra, 1996).

26

multiple meanings and models of reliability

expectations about their behavior and its consequences. In particular, the perceptions of one’s coworkers provide especially important cues that influence behaviors and, often, performance. A representative study of patient safety climate in a health-care system identified four dimensions (Katz-Navon, Naveh, & Stern, 2005). These reflect employee perceptions about the nature of safety procedures (i.e., how congruent the organizational safety procedures were with the daily work demands and processes in employees’ work units), the safety information distributed to them (extent of continuous feedback and access to safety information via distribution and training and the clarity of such information), their manager’s safety practices (employees’ perceptions about a supervisor’s safety-related activities and methods), and the priority given to safety (degree to which safety is a top priority for employees within the work unit). This study found that a favorable climate of patient safety was associated with fewer treatment errors. Other studies have reported a similarly strong and favorable relationship between safety climate and employees’ safety-related behaviors and outcomes (e.g., Hofmann, Gerras, & Morgeson, 2003; Zohar & Luria, 2005). Notably, these studies consistently suggest that supervisors play an important role in establishing a safety climate (Zohar, 2002). For instance, employees working in units whose supervisor was committed to safety were likely to develop perceptions that safety was important, and, in contrast, employees working for supervisors who were not committed to safety were likely to view safety as being less important (Hofmann & Stetzer, 1996). This research carries several implications for studying reliability. First, it provides evidence for the effects of informal processes, such as employees’ shared perceptions about aspects of their work environment, on work-related behaviors as well as outcomes. By extension, it suggests that similar contextual effects may shape behaviors critical for enacting reliability. Second, the findings about supervisors point to the need for better understanding the role of frontline managers in enabling and facilitating reliable processes and outcomes. For the most part, organizational research on reliability has tended to be silent about the role of frontline managers.

Developing Context-Specific Models of Reliability

The pioneering work on HROs describes the characteristics of a particular set of organizations that were successful in performing highly reliably despite operating complex, error-prone, and hazardous technologies in intolerant social and political environments (Rochlin, 1993). Subsequent extensions and applications, however, have encompassed a wider variety of organizations including many that seem very different from the initial HROs, which were explicitly identified and categorized in opposition to other organizations (i.e., non-HROs). In response to this trend, new terminology was proposed to further clarify the dichotomy. The initial set of HROs represents reliability-attaining organizations, whereas other organizations that aspire to, but have not yet attained, high levels of reliability are better viewed as reliability-seeking organizations (RSOs) (La Porte, 1996). This distinction raises a broader question about the context-specificity of models or formal descriptions of reliability. For example, the Columbia Accident Investigation Board (CAIB, 2003) noted that the National Aeronautics and Space Administration (NASA) did not possess HRO characteristics that could have helped prevent the disaster and suggested that high reliability research offers a useful description of the culture that should exist in the human spaceflight organization. Arjen Boin and Schulman argue that these conclusions are based on a “misreading and misapplication” of high reliability research (2008, p. 1050). They propose that NASA was not and could not have been an HRO and that, as an RSO, NASA will probably have to rely on a different framework for achieving reliability. According to these researchers, key elements of such an alternative framework of reliability include a well-developed and widely shared operating philosophy about the value of trade-offs that helps employees negotiate the tensions linked to reliability, mechanisms for minimizing risks and errors such as error reporting and commitment to learning from errors, and organizational efforts to maintain institutional integrity in the face of shifting stakeholder demands. In general, “context” refers to a set of relatively stable features of a setting that shape processes and behaviors of interest (Rousseau & Fried, 2001). A potentially important set of contextual features for understanding organizational reliability might be the extent to which organizations are subject to market pressures and regulatory control and, as a result, may or may not have the flexibility to treat reliability as a probabilistic property that can be traded at the margins for other organizational values such as efficiency or competitiveness. The HROs that were studied initially operated in closely regulated environments with limited direct exposure to market pressures. As a result,
the primacy of reliability and safety as organizational values was a stable organizational feature. In comparison, many RSOs are subject to unrelenting pressures from the market and operate with less regulatory oversight. These RSOs need to continuously think about reliability in terms of trade-offs with other competing priorities. Echoing this idea, René Amalberti (2013) argues that it is simplistic to expect a single model of safety (a term he more or less uses interchangeably with reliability) to accurately depict the potentially diverse organizational approaches to avoiding accidents. He proposes that there could be different safety models depending on how organizations approach risk management and settle on trade-offs between adaptability and safety. Amalberti identifies three distinct models along a continuum from models that give priority to adaptation to models that give priority to rules and supervision. These models avoid, manage, or embrace risk and hazard. The ultraresilient model is observed in contexts where risk taking is inherent in the economic model and is associated with benefits as well as high accident rates (e.g., sea fishing, where skippers seek out the riskiest conditions to maximize their haul). Organizations operating in such contexts are sensitive to safety and emphasize training. The leader’s autonomy and expertise take precedence over the hierarchical organization of the group or the specification of procedures. However, there are significant interorganizational variations in safety. For example, in professional deep-sea fishing, the rate of fatal accidents varies by a factor of four between ship owners in France and by a factor of nine at the global level (Morel, Amalberti, & Chauvin, 2008). Therefore, multifold safety improvements are still possible through refinements within this model. For instance, an organization could focus its safety improvement efforts on identifying managers with consistently high safety records, learning from their experiences, and transferring the lessons to other managers. What Amalberti terms the HRO model is relevant in contexts where risk management is an active daily occurrence. The emphasis here is on better organized and team-driven local adaptation. The team enacts the mindful organizing practices and maintains a high level of local collective regulation. In his summary assessment, Amalberti notes that the HRO model is primarily about improving detection of and recovery from hazardous situations and only secondarily about prevention, which he views as avoiding exposure to risky situations where possible. Here again, he suggests that many opportunities to improve safety exist within the model. The ultrasafe model is relevant in contexts where risk avoidance is a nonnegotiable priority. It emphasizes prevention through standardization of roles, activities, and processes as well as external supervision and relies much less on the local expertise of frontline operators. As an example, Amalberti cites the complete shutdown of air traffic over Northern Europe during the volcanic eruption in Iceland. In this case, safety-directed actions were initiated by external regulators. He cites airlines, the nuclear power industry, medical biology, and radiotherapy as excellent examples of this category. Training of frontline operators is focused on respect for their various roles, the way they work together to implement procedures, and how they respond to abnormal situations in order to initiate ad hoc procedures. He acknowledges that this model can become completely procedural by constraining the experience of operators to a permissible set of difficult situations. Once again, the best and the least good operators within a single occupation differ by about a factor of ten. We do not have to agree with the details of these models to appreciate the underlying premise—different economic, competitive, and regulatory conditions can lead to very different and yet comparably effective approaches to reliability management (Grote, 2012). This premise remains significantly underexplored in organizational research (Amalberti, 2013). One implication is that it may not be necessary or even feasible for all RSOs to adopt the characteristics of HROs—for example, recommending a reliability model that relies on teamwork to an organization or industry that values hierarchical authority. Another implication is that multiple models might be operating within the same industry or even the same organization. Within a hospital, for instance, some units face turbulent work environments that call for high reliance on individual expertise (e.g., emergency department). Other units carry out work that is highly complex, specialized, exception-prone, and dependent on team-based local adaptation (e.g., operating room). Some other units face unacceptably high risks of hazard, and, to avoid such risks, their operations are almost completely protocolized (e.g., radiotherapy). Finally, it is also conceivable that the same unit could move rapidly from one model to another depending on changes in the case mix and work.

Conclusions

The key idea emerging from the foregoing discussion is multiplicity. What began as an inquiry into the characteristics of a narrow set of organizations has dramatically expanded to encompass multiple notions of reliability, multiple organizational capabilities, multiple reliability standards, multiple levels of analysis, and multiple models of reliability. This presents several questions for future research. The multiple meanings embedded in the term “reliability” call for greater clarity about what exactly is being studied. For example, studies of reliability as safety and reliability as resilience focus on very different organizational capabilities and reliability standards. Any attempt to equate or combine findings from such studies is built on some strong and often implicit assumptions. It assumes that safety and resilience have a common set of antecedents—that is, organizational features that enhance safety also enhance resilience. It also assumes that the antecedents of safety and the antecedents of resilience are not in conflict. It further assumes equivalence among studies that rely on different reliability standards to examine the same notion of reliability. That is, it assumes that findings from studies that assess safety as the absence of precluded events are comparable to findings from studies that measure safety as the rate of near misses. As plausible as these assumptions are, they remain insufficiently tested. Verifying these suppositions is critical both for carrying out a systematic review of the relevant literatures and for offering valid prescriptions for enhancing reliability. The multiple organizational capabilities linked to reliability suggest a need for more research about their antecedents. What specific structures and processes enhance organizational capabilities to effectively anticipate, prevent, detect, contain, recover from, and learn from extreme outcomes? As several chapters in this book report, significant advances in this line of inquiry have been made. However, some basic issues remain unresolved. For instance, studies of HROs have identified specific structural features such as flexible hierarchies, redundancy, and incident command systems as antecedents of reliability (e.g., Bigley & Roberts, 2001). Similarly, studies of mindful organizing have identified specific processes linked to reliability such as preoccupation with failure, reluctance to simplify explanations, sensitivity to operations, deference to expertise, and commitment to resilience (e.g., Vogus & Sutcliffe, 2007). In advancing their initial formulation of mindful organizing, Weick, Sutcliffe, and Obstfeld (1999) stated that their
goal was to “provide theoretical enrichment of previous discussions that have been largely macro-level technology driven structural perspective by supplying a mechanism by which reliable structures are enacted” (p. 82). However, few studies have empirically verified the link between HRO structures and the processes of mindful organizing, much less the similarities and differences between the structural antecedents of processes linked to various organizational capabilities. The growing overlap and convergence among different disciplines create challenges and opportunities. Where possible, organizational research needs to draw on these other disciplines to expand its conceptualization and identify new research opportunities. As previously discussed, research on safety climate can be helpful in incorporating the role of frontline managers in accounts of reliability. Similarly, stylized problem formulations in fields such as systems engineering can help organizational researchers identify gaps and inconsistencies. For instance, with a few exceptions, studies of reliability have focused on a single level of analysis, typically a team or an organizational subunit. Few studies have examined such questions as whether and how variables one level above and one level below affect reliability at the focal level of analysis (Goodman, 2000; Hackman, 2003). The possibility of context-specific models of reliability points to various questions and opportunities. First, we need to empirically verify the generalizability of key findings that inform our current understanding. For example, in Chapter 4, Sutcliffe discusses studies that have linked mindful organizing to safety outcomes in a variety of settings. We need more such efforts. Second, we need to develop and test context-specific models of reliability such as those proposed by Boin and Schulman (2008) and Amalberti (2013). This in turn will require careful research design and significant resources to compare reliability across different contexts. Third, questions about organizational transition within and between models warrant attention. Studies of HROs have, without exception, only examined organizations that were already deemed to be HROs. As a result, we have very little understanding about the origins of HROs and the extent to which their emergence is a function of regulatory environment or organizational leadership. Stated differently, we do not know whether and how an RSO can become an HRO. The need for studies of organizational transitions is especially urgent in the light of growing demand for implementation of reliability-enhancing practices in organizations. Fourth, the possibility
of different models operating within a single organization or between organizations linked by interdependent work raises questions about how coordination between various models is negotiated without adversely affecting reliability.

References

Amalberti, R. (2013). Navigating safety: Necessary compromises and trade-offs—Theory and practice. New York: Springer.
Ansell, C., & Boin, A. (Eds.). (2011). Special issue: In honor of Todd R. La Porte. Journal of Contingencies and Crisis Management, 19(1), 9–13.
Bigley, G. A., & Roberts, K. H. (2001). The incident command system: High-reliability organizing for complex and volatile environments. Academy of Management Journal, 44(6), 1281–1299.
Boin, A., & Schulman, P. (2008). Assessing NASA’s safety culture: The limits and possibilities of high-reliability theory. Public Administration Review, 68(6), 1050–1062.
Bonanno, G. A. (2004). Loss, trauma, and human resilience: Have we underestimated the human capacity to thrive after extremely aversive events? American Psychologist, 59(1), 20–28.
Bourrier, M. (2011). The legacy of the theory of high reliability organizations: An ethnographic endeavor (Working Paper No. 6). Switzerland: University of Geneva.
Chassin, M. R., & Loeb, J. M. (2013). High-reliability health care: Getting there from here. The Milbank Quarterly, 91(3), 459–490.
Columbia Accident Investigation Board. (2003). Columbia Accident Investigation report. Ontario, Canada: Apogee.
Frese, M., & Keith, N. (2014). Action errors, error management, and learning in organizations. Annual Review of Psychology, 66, 1–21.
Goodman, P. S. (2000). Missing organizational linkages: Tools for cross-level research. Thousand Oaks, CA: Sage.
Grote, G. (2012). Safety management in different high-risk domains—All the same? Safety Science, 50, 1983–1992.
Hackman, J. R. (2003). Learning more by crossing levels: Evidence from airplanes, hospitals, and orchestras. Journal of Organizational Behavior, 24(8), 905–922.
Hofmann, D. A., Gerras, S. J., & Morgeson, F. P. (2003). Climate as a moderator of the relationship between leader-member exchange and content specific citizenship: Safety climate as an exemplar. Journal of Applied Psychology, 88(1), 170–178.
Hofmann, D. A., & Stetzer, A. (1996). A cross-level investigation of factors influencing unsafe behaviors and accidents. Personnel Psychology, 49(2), 307–339.
Hollnagel, E., Pariès, J., Woods, D. D., & Wreathall, J. (Eds.). (2011). Resilience engineering in practice: A guidebook. Farnham, UK: Ashgate.
Katz-Navon, T., Naveh, E., & Stern, Z. (2005). Safety climate in healthcare organizations: A multidimensional approach. Academy of Management Journal, 48, 1073–1087.
Klein, K. J., & Sorra, J. S. (1996). The challenge of innovation implementation. Academy of Management Review, 21(4), 1055–1080.
La Porte, T. R. (1996). High reliability organizations: Unlikely, demanding and at risk. Journal of Contingencies and Crisis Management, 4(2), 60–71.
La Porte, T. R., & Consolini, P. (1991). Working in practice but not in theory: Theoretical challenges of “high-reliability organizations.” Journal of Public Administration Research and Theory, 1(1), 19–47.
Lekka, C. (2011). High reliability organisations: A review of the literature (Research Report 899). London: Health and Safety Executive.
Leveson, N. (2012). Engineering a safer world: Applying systems thinking to safety. Cambridge, MA: MIT Press.
Leveson, N., Dulac, N., Marais, K., & Carroll, J. (2009). Moving beyond normal accidents and high reliability organizations: A systems approach to safety in complex systems. Organization Studies, 30(2/3), 227–249.
Morel, G., Amalberti, R., & Chauvin, C. (2008). Articulating the differences between safety and resilience: The decision-making of professional sea fishing skippers. Human Factors, 50, 1–16.
Perrow, C. (1984). Normal accidents: Living with high-risk technologies. New York: Basic Books.
Ramanujam, R., & Goodman, P. S. (2011). The role of organizational feedback processes in the link between latent errors and adverse outcomes. In D. Hofmann & M. Frese (Eds.), Errors in organizations (pp. 245–272). New York: Routledge.
Reason, J. (1997). Managing the risk of organizational accidents. Aldershot, UK: Ashgate.
Reinertsen, J. L., & Clancy, C. (Eds.). (2006). Keeping our promises: Research, practice, and policy issues in health care reliability [Special issue]. Health Services Research, 1535–1538.
Roberts, K. H. (1990). Some characteristics of one type of high reliability organization. Organization Science, 1(2), 160–176.
Roberts, K. H. (Ed.). (1993). New challenges to understanding organizations. New York: Macmillan.
Rochlin, G. (1993). Defining high reliability organizations in practice: A taxonomic prolegomenon. In K. Roberts (Ed.), New challenges to understanding organizations (pp. 11–32). New York: Macmillan.
Roe, E., & Schulman, P. (2008). High reliability management. Stanford, CA: Stanford University Press.
Roe, E., & Schulman, P. (2016). Reliability and risk: The challenge of managing interconnected infrastructures. Stanford, CA: Stanford University Press.
Rousseau, D. M., & Fried, Y. (2001). Location, location, location: Contextualizing organizational research. Journal of Organizational Behavior, 22, 1–13.
Schneider, B. (Ed.). (1990). Organizational climate and culture. San Francisco: Jossey-Bass.
Vogus, T. J., & Sutcliffe, K. M. (2007). Organizational resilience: Towards a theory and a research agenda. IEEE Systems, Man, and Cybernetics 2007 Conference Proceedings: 3418–3422.
Waller, M. J., & Roberts, K. H. (Eds.). (2003). High reliability and organizational behavior: Finally the twain must meet [Special issue]. Journal of Organizational Behavior, 24(7), 813–814.
Weick, K. E., & Sutcliffe, K. M. (2001). Managing the unexpected: Assuring high performance in an age of complexity. San Francisco: Jossey-Bass.
Weick, K. E., & Sutcliffe, K. M. (2007). Managing the unexpected: Resilient performance in an age of uncertainty (2nd ed.). San Francisco: Jossey-Bass.
Weick, K. E., & Sutcliffe, K. M. (2015). Managing the unexpected: Sustained performance in a complex world (3rd ed.). San Francisco: Wiley.
Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for high reliability: Processes of collective mindfulness. In R. S. Sutton & B. M. Staw (Eds.), Research in organizational behavior (Vol. 21, pp. 81–123). Greenwich, CT: JAI.
Woods, D. D. (2006). Essential characteristics of resilience. In E. Hollnagel, D. D. Woods, & N. Leveson (Eds.), Resilience engineering: Concepts and precepts (pp. 21–34). Burlington, VT: Ashgate.
Zohar, D. (2000). A group-level model of safety climate: Testing the effect of group climate on microaccidents in manufacturing jobs. Journal of Applied Psychology, 85(4), 587.
Zohar, D. (2002). Modifying supervisory practices to improve subunit safety: A leadership-based intervention model. Journal of Applied Psychology, 87(1), 156–163.
Zohar, D., & Luria, G. (2005). A multilevel model of safety climate: Cross-level relationships between organization and group-level climates. Journal of Applied Psychology, 90(4), 616–628.

Part II

Important Aspects of Reliable Organizing

Chapter 3

Three Lenses for Understanding Reliable, Safe, and Effective Organizations
Strategic Design, Political, and Cultural Approaches
John S. Carroll

Introducing the Three Lenses

Changes in technology and society require organizations to be flexible. Organizations must continually find innovative ways to achieve and maintain success, and they often do this by adapting new concepts that emerge regularly from both practice and theory, including total quality management, learning organization, networked organization, business process reengineering, and sustainability. High reliability organizing has emerged as a promising approach to organizational effectiveness especially applicable in industries that manage hazardous operations. Reliability represents an intersection of effectiveness, safety, and resilience—of particular importance in industries where accidents and disasters happen with disappointing frequency. Given the complexity and flux in management thinking, it is tempting to latch onto a simple theory or model to bring everything into focus, cut through the confusion, explain what is happening, and derive effective actions. If we could just write the proper procedures or provide the right training or set the right incentives or hire the right chief executive officer (CEO) or create a healthy safety culture, then surely we could create highly reliable (productive, safe, effective, resilient) organizations.

Unfortunately, our best knowledge and theories cannot assemble a neat package for effective management in hazardous organizations such as nuclear power plants, oil rigs, chemical plants, mines, and hospitals. Over the past decades, attention has shifted from technological approaches to human error reduction to management and organizational issues, and it will undoubtedly cycle through these repeatedly. For example, organizations in high-hazard industries make a lot of formal rules. But are these formal rules followed rigorously? Is there a belief that safety can be assured by following the right rules, or is there a belief that the rules exist so managers can blame someone when something goes wrong? In many organizations people circumvent the rules in order to get the work done efficiently or simply for their own convenience, creating informal rules about “how we work around here.” In organizations that strive for innovation, the underlying principle is “find a rule and break it.” But if we throw out all the rules, how can we work together effectively? There must be processes and practices to help everyone sort out which rules can be broken, when, and by whom. Or would these just be another set of rules to be broken? I believe that the complexities of management and the richness of organizational life cannot be reduced to a simple model or single theory of behavior. For this very reason, the title of this chapter identifies multiple goals of “reliable, safe, and effective” organizations (and views them through multiple “lenses”). Each of these overlapping (but not synonymous and sometimes inconsistent) concepts has its own tradition and research literature. Organizational effectiveness is the broadest concept, used by organization scholars (e.g., Leavitt, 1965) and consulting companies (e.g., McKinsey 7-S Framework; see Waterman, Peters, & Phillips, 1980), to analyze the coordination and functioning of parts of an organization in order to produce results desired by internal and external stakeholders. Although safety is a paramount goal within high-hazard organizations, because an accident carries such enormous consequences, highhazard organizations must still manage multiple goals including financial viability and innovation along with safety. Reliability has at least two meanings: to engineers, reliability refers to components or systems serving their designed function (e.g., Leveson, 2012), but to organization scholars, high reliability organizing refers to performing organizational processes repeatedly without disrupting key functions, especially safety (e.g., La Porte & Consolini, 1991; Roberts, 1990; Weick & Sutcliffe, 2007).

Models and theories are incredibly useful, but we are far from having—and may never have—a comprehensive unitary theory of organizational behavior and processes. Most researchers and practitioners think that their model of the situation is the obvious and appropriate one; they rarely appreciate that others operate with different models. Instead, this chapter offers an approach to organizational analysis, of which safety management is one application domain, based on what is called the three lenses (symbolized in Figure 3.1): strategic design, political, and cultural (Ancona, Kochan, Scully, Van Maanen, & Westney, 2004). Each lens is a perspective on organizations that distills the essence of related theories that share ideas about human nature, the functions of organizations, the meaning of organizing, and the information needed to make sense of an organization. Each perspective has its own assumptions and vast amounts of research that flesh out the ideas. Yet the three lenses are distinct enough that they cannot directly compete or combine: one will not “win” a contest among lenses nor will all three join in a happy union or comprehensive model. By using all the lenses to analyze an organization or a problem,

The Three Lenses Perspectives or “lenses” are organized ideas (e.g., metaphors) that fundamentally shape our understanding of things and events.

Strategic Design

Political

Organizations are machines.

Organizations are contests.

An organization is a mechanical system crafted to achieve a defined goal. Parts must fit well together and match the demands of the environment.

An organization is a social system encompassing diverse, and sometimes contradictory, interests and goals. Competition for resources is expected.

Action comes through planning.

Action comes through power.

Cultural Organizations are institutions. An organization is a symbolic system of meanings, artifacts, values, and routines. Informal norms and traditions exert a strong influence on behavior.

Action comes through habit. figure 3.1 The Three Lenses

40

three lenses for understanding

however, we gain new insights and a richer picture of the organization also suggesting a broader repertoire of useful actions. Approaches to safety management, as I discuss in this chapter, are based on distinct traditions, approaches, and assumptions; improvement efforts based around an approach may not have considered all three lenses and therefore fail to deliver the expected results or produce undesirable side effects. This chapter maps out and analyzes the range of approaches and ideas so that specific good ideas can be placed in a broader, more systemic context. This sets the stage for understanding why there are so many ways to think about and work toward safety and why any improvement effort has to consider multiple lenses. I therefore begin with a description of each lens, then discuss safety management through application of each lens, and finally consider how to put the lenses together to advance ideas and practices of organizational effectiveness, safety, and reliability. I close the chapter by briefly outlining how other chapters in this book link to one or more of the three lenses.

The Strategic Design Lens

Any organization can be thought of as a kind of machine that has been designed to achieve goals by carrying out tasks. The designers of the organization, possibly the founders, board of directors, and/or senior managers, have a vision or purpose for the organization and a strategy for achieving that vision based on rational analysis of opportunities and capabilities. In order to enact that strategy, people are hired with necessary skills or given appropriate training, grouped into departments or teams in order to carry out subtasks, connected by information systems and workflows to coordinate tasks, monitored for their performance according to plan, and rewarded to promote continued performance. This approach is highly rational and analytical. People, money, equipment, and information are moved around a strategic and operational chessboard using logical principles of efficiency and effectiveness. The model assumes that, with the right plan and information flow, the organization can be optimized to achieve its goals. Every organization must decide how to group together people and tasks, typically by specialty, geography, product, customer, and so on. Tasks are grouped together to facilitate the flow of materials and information (interde-
pendencies within and across business processes) or to achieve economies of scale or scope or to build capabilities in groups of people. In typical nuclear power plants, for example, there are functional departments such as operations, engineering, maintenance, training, and safety. Some of these functions may be located at the plant, such as that of the operators who run the plant in shifts, but some functions could be either local or at corporate headquarters, such as design engineering, regulatory affairs, legal, and safety. If these functions are gathered together at a corporate facility, more functional expertise can be developed and shared among specialists but at the expense of losing some contact with separated local sites. When people are grouped together, boundaries are created that have to be bridged by various linking mechanisms. Operations, maintenance, engineering, and training departments have to coordinate their work. Formal managerial hierarchies, such as department heads, are simple and pervasive coordination mechanisms, but managers can become overloaded if they have to deal with complex coordination problems. The flow of information up the hierarchy and then back down is slow and inefficient, and information gets distorted as it passes through people with different priorities, viewpoints, and lexicons. If managers get too distant from the details of the work and lose their technical edge, they may make poor decisions. Other linking mechanisms, formal and informal, spring up to bridge the gaps. Formal linking mechanisms include liaison roles, cross-unit groups such as quality teams or task forces, integrator roles such as project managers or account managers, accounting and IT systems, planning processes, and meetings. Informal linking mechanisms typically involve networks of personal relationships that develop around friendships, car pools, churches, sports, prior projects, and so forth. Information often flows much more readily through the grapevine of informal networks than through official channels. Even if different groups have strong linking mechanisms, the groups often want different things: shareholders want profits and share price increases, operations wants production and lower costs, safety wants no accidents, and unions want jobs. These groups will naturally disagree from time to time. Individuals trying to advance their careers may find that they get ahead by doing things that undermine others in the organization. Mechanisms are needed to align the efforts of diverse individuals and groups with the overall
organizational mission and strategy. Alignment mechanisms include reward and incentive systems such as bonuses, raises, stock options, and promotions. The “balanced scorecard” (Kaplan & Norton, 1992) adds customer satisfaction and social responsibility to typical measures of task accomplishment or return on assets. Resource allocation decisions ensure that groups have the people, money, equipment, knowledge, and information to carry out their assigned tasks. Human resource development processes around hiring, training, mentoring, and job rotation help to align human resources with the mission and design of the organization. In addition, the informal organization, the network of social relationships and status systems that grows up in any social setting, can serve to align the organization as people adapt to new demands or can act as a conservative force preventing change. Finally, the organization’s goals and strategy must fit the demands of its environment in order to survive and grow. Products and services have to be purchased by customers. Business practices have to be seen as legitimate within regulatory statutes, professional standards, ethics codes, and societal values. Entire industries rise and fall with changes in technologies or public opinion.

The Political Lens

The political lens shatters the assumption that an organization has a mission and goals that “it” is pursuing. Instead, people who use the political lens view the organization as a contested struggle for power among stakeholders with different goals and underlying interests. Whereas the strategic design lens groups and links units that must work together to accomplish tasks, the political lens understands that units with similar interests and goals combine into coalitions that advocate their side of important issues. Goals and strategy are either imposed by a ruling coalition or negotiated among interest groups. As circumstances change, power shifts and flows, coalitions evolve, and agreements are renegotiated. From the political perspective, power is the ability to get things done. Power is like the energy in a system coupled with wires or channels to connect that energy to action. Power is not the same as control over others; the idea of control implies that there is a fixed amount of power that must be contested. Instead, if we can realign stakeholder interests or find solutions that
produce benefits for more parties, everyone can get more of what they want without sacrificing their power or interests. If I can “empower” you to be able to do more, it doesn’t mean that I have lost power. Indeed, I may have gained power by having allies or coalition partners who cooperate to get things done. Further, different individuals and groups have different sources of power or power bases (which emerge when something is needed by the organization, in short supply, and/or hard to replace). Typical sources of power include position power or formal authority, control over scarce resources, rules and regulations, information and expertise, alliances with others, skill at manipulating symbols and persuading others, skill at shaping the negotiation and decision-making process, and personal energy and charisma. By adding different power bases or combining them in complementary ways, we may be able to multiply our collective power. For example, a manager with position power and an engineer with technical expertise may support each other (or they may be in conflict). Looked at through the political lens, changes in mission, strategy, organization, or personnel are not simply rational moves to accomplish organizational goals but also threats to those who hold power and opportunities for those who want more power. As the environment shifts or new strategies are developed, groups that have the capabilities to deal with these new demands come to the fore. If an organization strives to make safety more important, it is likely that those with safety expertise will gain power, and those who were respected for their ability to produce the product may now have to negotiate with the safety organization. Such a proposed change is sure to have strong supporters and strong opponents whose success in the old system is now threatened. It may take a coalition of internal and external interests or a dramatic organizational failure or a change in executive leadership to get change underway. Even then, there may be pockets of resistance, overt or underground, that seek to delay and subvert the change process.

The Cultural Lens

Those who use the cultural perspective assume that people take action as a function of the meanings they assign to situations. We all make sense of situations in terms of our past history, analogies and metaphors, language categories, observations of others, and so forth. These meanings are not given
but rather are constructed from bits and pieces of social life. For example, the same situation can be presented or “framed” as a gain or loss, depending on the comparison (e.g., an annual safety record of two fatalities could be a success against a prior year with ten, but a failure if your competition had no fatalities or your values promote “zero fatalities”), and decisions can change when framed differently. More broadly, cultural elements—the symbols, stories, and experiences from which meanings are derived—are shared among members of a culture and transmitted to new members. Cultures develop over time as groups solve important problems and pass on their traditions. Culture is a way of life; it is what we do around here and why we do it. It is customs and laws, heroes and villains, art and science. Some organizations have strong and pervasive cultures, widely shared throughout an organization, but others have fragmented cultures mixed with local, professional, and hierarchical subcultures. Cultures can be thought of in layers, with visible symbols or artifacts that are easily observed, articulated attitudes and beliefs that are written and discussed, and underlying assumptions and meanings that are more difficult to observe (Schein, 1990). Corporate headquarters are often dressed up to pamper executives and impress shareholders and customers. The artifacts of tall buildings, large offices, and dedicated parking spaces symbolize the power and prestige of the executives. Executives espouse beliefs about the importance of the mission and strategic plans (created by the top management team) and may assume but not say that those who have risen to high office are smarter and better and have every right to govern and receive the lion’s share of rewards. Or, consider a company that has “Safety First” as its motto, visible on every office wall, and yet routinely blames and fires frontline workers when problems arise. What are the underlying cultural assumptions: stupid, lazy, and untrainable workers, or valiant workers desperately holding together a crumbling plant with few resources while executives receive stock options for reducing costs, or both (depending on whom you ask)?

Putting the Lenses Together

Using all three lenses does not guarantee the “right” answer. Complex situations don’t have a right answer. Problems don’t have root causes; systems
have causal relationships. However, an analysis that considers and combines all three lenses is more likely to reveal the complex interdependencies of the organization, the difficulties of implementing change, and the heterogeneity among individuals and groups. This allows us to distinguish better answers (better for what purpose? better for which stakeholders?) from worse answers. Most importantly, a richer analysis suggests more ways to intervene to bring change, support change, overcome resistance, and achieve desired outcomes.

Strategic Design Approaches to Safety Management

It is natural to think of safety as a goal and to organize tasks, roles, and responsibilities in order to achieve safety. Obviously, if managers and workers are given production goals, measured on output and cost, and given incentives to meet those objectives, then safety will receive little attention. A common adage is that “you get what you measure,” and therefore safety management begins with safety objectives and measures. Of course, this implies some form of organization—that is, a structure that establishes who is responsible for achieving those objectives, for measuring how performance compares to objectives, and for allocating resources to tasks designed to get the desired results. For example, we could give each operations manager production, cost, and safety goals, measure and reward his or her performance, and leave it to each manager (of a plant, unit, site, process, etc.) to manage his or her own operations and resolve any conflicts among goals, such as how much time and personnel to allocate to safety. Alternatively, we could create a safety department with safety goals and let it advise and influence the operations department and its managers, each of whom has production and cost goals. Each of these ways of organizing has strengths and weaknesses, and the essence of the strategic design lens is to help organizations understand these issues and make intelligent design choices. Typically, for example, if the organization lacks technical expertise on a subject matter such as safety, it is helpful to consolidate experts into a department where they can easily share information, develop competence, and stay on top of the latest professional information in their field and industry. However, it may become difficult for the professional experts to understand the operational issues on the shop floor since they rarely get out of their offices or out from behind their computers,
and operating managers may find their advice difficult to understand or inflexible—out of touch with their needs. So, pressure develops to relocate the experts or reassign them to operating departments and thereby gain more flexibility and innovation that helps operations. But when that occurs, there is a tendency to lose focus on safety and to slowly erode specific safety expertise. At that point, pressure may develop to strengthen the central safety group in order to rebuild expertise and ensure that someone is “in charge of safety.” And then every ten years we reorganize again (Nickerson & Zenger, 2002). Currently, many industries label the strategic design approach a safety management system (SMS) or a safety and environmental management system (SEMS). The essence of such systems is that there is a set of goals and measures of both process and performance, and responsibilities are clearly defined in the management structure. Decisions are often made on a risk-informed basis, meaning that reduction of risk is an important criterion rather than just compliance with requirements. In the traditional form of safety management, requirements are written and compliance is enforced. Requirements often come from professional societies such as the American Society of Mechanical Engineers (ASME) or regulatory authorities. So, for example, the requirements may say that pipes for certain applications must be made of particular materials and thicknesses and be inspected in specified ways at given intervals. Inspections are carried out to ensure compliance with these requirements, and deviations are noted and corrective actions must be taken. However, in risk-informed regimes, organizations are given risk targets and have latitude to choose among ways to reduce risk. So, the same organization may choose to control the risks associated with older pipes either by replacement or inspection or a combination (depending on the age of the pipes, their location, and/or the specific hazards associated with each pipe). Ideally, the replacement and inspection regimes could be structured to be cheaper and safer than the current situation. Hence, good management can find ways to be both productive and safe rather than view production and safety as a set of either/or trade-offs. Of course, complex systems must both comply with requirements and make risk-informed choices in design, operations, maintenance, and so forth. The US Navy Submarine Safety Program (SUBSAFE; Sullivan, 2003) is one informative example. SUBSAFE was created immediately following the loss of the nuclear submarine USS Thresher in 1963. The purpose of SUBSAFE
is very specific—maintain hull integrity and operability of crucial systems to allow control and recovery. It has nothing to do with safety of the nuclear reactor, missiles, or slips, trips, and falls. In approximately fifty years prior to the establishment of the SUBSAFE program, sixteen submarines were lost to noncombat accidents; in fifty years subsequent to the program, only one submarine was lost, and that sub was not part of the SUBSAFE program. The core of SUBSAFE is a comprehensive set of requirements that permeates every aspect of submarine design, construction, operations, and maintenance. This includes how work is conducted, what materials are used, how every element of work is documented, how inspections and audits are used to verify compliance with requirements, and so forth. Every ten years, the entire program is evaluated and upgraded or renewed (small changes are made throughout), so that the program is never viewed as finished or complete. The certification process, applied to critical structures, systems, and components, is a central element of the program. Certification is strictly based on objective quality evidence, which is a statement of fact, quantitative or qualitative, that deliberate steps were taken to comply with requirements. This results in considerable documentation of design, material, fabrication, and testing that can be audited for certification and for recertification throughout the life of the sub. Without certification, the sub cannot be operated. This is quite different from the practice in many other industries where industrial plants are always operating with known and unknown problems (and lists of promises of work to be done) and therefore are outside their design envelope in ways that make risks very difficult to estimate. One of the newest and most systematic approaches to safety management is the Systems-Theoretic Accident Model and Processes (STAMP; Leveson, 2012, 2015). STAMP is based on the concept that safety is an emergent systems property rather than an aggregation of reliable components. Safety is assured by maintaining control over hazards through control actions and information feedback, arranged in a hierarchical structure of control. Such control structures include both human and technological components. Take, for example, the case of a commercial airline pilot. In the modern cockpit, the pilot does not “fly” the plane but rather issues commands to computers that actuate the physical components of the plane. Instrument indicators and sensory cues return feedback about what the plane is actually doing. The electronic controls
have embedded within them a model of how the plane is supposed to behave, including issuing warnings or overriding a pilot’s command if it is deemed dangerous or impossible. Of course, this creates new failure modes as pilots may no longer understand what the controls are doing in planes that have been designed to be flown by computer, not by humans. But the pilot is only part of the control structure. The pilot in turn receives advisories and commands from airlines operations management and air traffic control (ATC) and additional advisories from the electronic Traffic alert and Collision Avoidance System (TCAS). These elements of the control structure are concerned not only with a single plane (the primary role of the pilot) but also with a system of planes and support structures. Operations management needs to move planes around to serve customers while minimizing costs (airline fuel, etc.). ATC needs to ensure safe and smooth takeoffs and landings and transfer of control across geographical boundaries. Airline operations reports to airline corporate, which itself receives requirements and delivers reports to shareholders, regulators, the public, financial markets, and so forth. ATC itself receives direction from and reports to various governmental levels. TCAS receives information from the transponders on other aircraft in the vicinity and provides advisories to ensure safe separation with aircraft at the same altitude. Although traditional safety management approaches consider accidents to be due to a cascading set of human errors and/or component failures, STAMP considers accidents to be a loss of control due to failure to enforce safety constraints. STAMP is able to consider interactions among system elements, including humans and computer software. In the 2002 collision of two airplanes over Germany, for example, a number of problems combined in unanticipated ways (Leveson, 2015). Pilots of the two airplanes received conflicting advisories from ATC and TCAS; pilot training in the countries of origin of the two airlines differed in whether to follow ATC over TCAS or TCAS over ATC, although all airlines were supposed to train pilots to follow TCAS in such circumstances. Two controllers were supposed to be in the ATC tower on the ground, but the Swiss tower had been routinely allowing one controller at night. Apparently, no one was checking that the right training and staffing were in place. STAMP views this as a failure to maintain the control loops that ensure safety—that is, there was no feedback about this aspect of training and staffing and therefore no opportunity to correct the situation (until the accident).

Political Approaches to Safety Management

The strategic design approach assumes that organizations can agree on a single set of goals. In contrast, the political lens assumes that a complex organization has multiple stakeholders with differing interests. A venerable maxim states that organizations are designed to turn goal conflicts into political conflicts. For example, the perceived trade-offs between production and safety goals turn into a conflict between the production and safety departments. Although an organization may resolve such conflicts by referring them to senior executives, more likely the departments will negotiate and actions will reflect the relative power and status of the departments (and their leaders). So, if the production department is seen as the engine of profit, and the leader of the department is considered next in line for the executive suite, whereas the safety department is seen as a regulatory requirement that interferes with real work, and the leader of that department is seen as not being a team player and having dubious expertise, it should be no surprise that safety is in constant jeopardy. Perrow’s seminal Normal Accidents (1984) includes the insightful comment that different industries have different rates of accidents not simply because of the inherent complexity and riskiness of their technologies but also because of who is at risk. In mining and fishing, two of the most dangerous industries, miners and fishermen are lower-status workers whose lives and troubles are of little interest to most of society. Their injuries and deaths generate little attention or alarm outside their local communities, unless a union or investigative reporter champions their cause. In contrast, the airline industry is extremely safe in part because the people at risk are elites. Political leaders, industry executives, and wealthy travelers rely on planes, and if a plane crashes enormous attention is directed toward the causes of the accident. The FAA and the National Transportation Safety Board (NTSB) receive far more generous funding than many other regulators because elites influence their representatives to ensure airplane safety. Once again, the SUBSAFE program offers an instructive example. The program explicitly recognizes the potential conflict among goals and stakeholders and gives voice and weight to each of three key roles: (1) the platform program manager is responsible for design and operation of a particular sub design or “platform”; (2) the independent technical authority is responsible for technical expertise—for example, recommending acceptable designs from


which the program manager may choose; and (3) the independent quality assurance and safety authority is responsible for compliance with requirements. None of these actors can make a unilateral decision. Designs can move forward only if all three have agreed that their goals are satisfied. This system of checks and balances is analogous to the roles for the executive, legislative, and judicial branches of the US government established by the Constitution. However, as we observe from the ebb and flow of power across the branches of government, balance of power can shift in ways that may entail risk. Consider the NASA shuttle Columbia accident in which space shuttle program management had gradually acquired power over the supposedly independent safety organization. Program people initially sat in on, then became members of, and even chaired safety committees that were, by policy, independent of the program organization. As a result, the “voice” of safety was given less attention and gradually disempowered (Leveson et al., 2005; cf. psychological safety, Edmondson, 1999). SUBSAFE audit practices and philosophy are also astutely oriented to the political realities of a complex organization. Audits of every SUBSAFE-certified ship and facility (e.g., shipyards and contractors) are conducted frequently to verify compliance with requirements. The audit philosophy is based around learning in a constructive manner rather than “policing” and punishing violators. Consistent with that philosophy, audit teams include both external auditors and facility personnel. External auditors are mostly (80 percent) peers from other facilities; their facilities are also the subject of audits, possibly by teams including individuals from the site now being audited. Continuous communication flows between the audit team and the facility with the goal of fully understanding identified problems; there is no desire to “catch” or “surprise” people. The compliance verification organization is of equal status and authority with the program managers and technical authority. Headquarters is also audited and must accept and resolve audit findings just like any other part of the SUBSAFE community. Another interesting example of conflict and power comes from the Millstone Nuclear Power Station crisis in the late 1990s (Carroll & Hatakenaka, 2001). Millstone had three units of three different designs, manufacturers, and vintages on the same site. After the company nearly went bankrupt building the third unit in the mid-1980s, and spurred by a consultant’s report that


envisioned a deregulated energy world competing on costs, the company cut back some of its support to the nuclear plants and insisted that they generate electricity and revenue and no longer strive for “excellence.” Over the next decade, the Millstone plants were running well but were not keeping up with industry improvements. Promises made to the regulator were not being kept. Engineers at headquarters who had been doing the high-status work of designing the next plant were moved on-site to support plant operations. Some of those assigned to the older units were appalled at the condition of the units and raised concerns about safety, first to managers who did not welcome these concerns, and then to the regulator and the press. Aware that Time magazine was about to run a cover story (Pooley, 1996) on whistle-blowers at Millstone, the Nuclear Regulatory Commission moved into action and, as each unit was shut down for routine outages, required Millstone to make a variety of changes before restarting each unit. Some of these requirements involved engineering upgrades, some involved reducing the backlog of maintenance work, and some involved creating a “safety conscious work environment” (Carroll & Hatakenaka, 2001, p. 71)—the first time this phrase had ever been used. Northeast Utilities, the owner of Millstone, brought in a new chief nuclear officer and requested help from the rest of the industry. Three other utilities volunteered teams of managers to help Millstone, and each team was assigned to help one of the three units. What had not been anticipated was that each team would take its own approach to managing and improving its unit. By allowing each unit to pursue its own interests (which in part involved the interests of the external partner utilities), the interests of Millstone and Northeast Utilities were in jeopardy. In particular, the units competed for who could spend the most money, with the new managers of Unit 2 boldly spending at a very high rate. It took some months for this political situation to be appreciated—by distributing resources across all three units, none of the units would be generating electricity and money very soon, and Millstone was losing $2 million per day paying for its improvements and purchasing electricity from other utilities for its customers. Instead, the strategic decision was made to concentrate resources on Unit 3, the largest and most modern unit. By recovering Unit 3 as quickly as possible, since it needed the least work in upgrading to industry standards, Millstone would start generating electricity and revenue before bankrupting Northeast Utilities. Efforts could then turn to Unit 2 and finally


Unit 1. Indeed, Unit 3 was brought back online after two years of tremendous effort, Unit 2 restarted in another year, and Unit 1 was decommissioned since it was the smallest and oldest, generated the least electricity, and required the most upgrading.
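Returning to SUBSAFE, its three-role system of checks and balances amounts to a concurrence gate: a design advances only when the platform program manager, the independent technical authority, and the independent quality assurance and safety authority have all agreed. The sketch below is a minimal illustration of that decision rule, using role names taken from the description earlier in this section; it is an assumption-laden teaching example, not the Navy's actual process or software.

```python
# A minimal sketch of a SUBSAFE-style concurrence gate: no single role can
# advance a design unilaterally. Role names follow the description above; the
# data structure and function are illustrative assumptions, not Navy systems.

REQUIRED_ROLES = (
    "platform_program_manager",         # design and operation of the platform
    "independent_technical_authority",   # technical adequacy of the design
    "independent_qa_safety_authority",   # compliance with requirements
)

def design_may_proceed(approvals: dict) -> bool:
    """Return True only if every required role has explicitly concurred."""
    return all(approvals.get(role) is True for role in REQUIRED_ROLES)

# The program manager alone cannot move a design forward ...
print(design_may_proceed({"platform_program_manager": True}))         # False
# ... only the concurrence of all three checks and balances suffices.
print(design_may_proceed({role: True for role in REQUIRED_ROLES}))    # True
```

The design choice worth noticing is that the gate is conjunctive: weakening any one role, as happened when NASA program managers absorbed the shuttle safety organization, silently converts a three-party balance of power into a unilateral decision.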

Cultural Approaches to Safety Management

The International Atomic Energy Agency (IAEA) coined the term “safety culture” over twenty-five years ago following the Chernobyl nuclear accident (International Nuclear Safety Advisory Group [INSAG], 1991). The investigation had clearly revealed that the accident was caused by more than equipment failures or human error. The “something else” was given a label, and over the following decades more and more attention has been given to safety culture in industry, regulation, and research. However, culture had already emerged as a focus of attention in the concept of safety climate (Zohar [1980]; see Turner’s [1978] insightful analysis of accidents). After years of resisting any regulatory intrusions into management prerogatives, including “culture,” almost every industry now makes safety culture a consideration or even a requirement. Accident investigations routinely identify safety culture as a cause, and corrective actions are designed around improving culture. Industry groups and regulators describe a healthy safety culture and offer measures and suggestions, or even requirements, for culture (e.g., ten traits outlined in Institute of Nuclear Power Operators [INPO], 2012). Although this is a remarkable change, there is no recipe for “improving culture.” Part of my argument in this chapter is that a strategic design approach to culture change, with a project team, timeline, culture metrics, and incentives, is not likely to work without more specific attention to the cultural lens (and how the political lens thinks about culture). Although an oversimplification, the concept of HROs or high reliability organizing directs attention to cultural aspects of safety. In its original form (La Porte & Consolini, 1991; Roberts, 1990), HRO was developed inductively from field observations of a small number of organizations that seemed to carry out very hazardous operations with extremely low rates of accidents. One example is the aircraft carrier USS Carl Vinson, which routinely receives and launches airplanes in rapid succession from a moving target with hazards from high-


speed aircraft, explosive munitions and fuel, and variable weather and waves. There seemed to be something quite special and rare to such organizations, an ideal of high reliability to which other high-hazard organizations could aspire and from which all types of organizations could learn. Karl Weick and Kathleen Sutcliffe (2007) codified the HRO concept into five attributes that clearly reflect the importance of culture. Three of the attributes involve anticipation of problems and attention to early warning signs, and the other two involve containment of cascading effects and resilience:

1. Preoccupation with failure—Imagine possibilities and attend to early signs of failure;
2. Reluctance to simplify—Avoid labels and easy answers that limit understanding and search for information;
3. Sensitivity to operations—Focus on the dynamics of work operations;
4. Commitment to resilience—Preserve function despite adversity, return to service, and learn from experience; and
5. Deference to expertise—Rely on the expertise of those closest to the work, who therefore should have more control during a crisis, regardless of rank.

The first three attributes describe how people think and the assumptions and categories they use. Instead of insisting on a single logic or mental model, such as “comply with rules” or “design errors out of the system” or “reward the desired behaviors,” HRO exhorts everyone to pay attention, focus on the work, and never get complacent (e.g., NASA started referring to the space shuttle as a “bus,” mislabeling its operations as simple, familiar, and routine). In short, multiple logics and multiple ways of “sensemaking” are considered more robust and more helpful than a single approach, which tends to generate overconfidence and complacency. I am reminded of a Zen observation: “Everything changes, everything is connected, pay attention” (Jane Hirshfield, quoted in Grubin, 2010). The last two HRO attributes are somewhat more complicated from a three lenses viewpoint. Commitment to resilience is partially about cultural values of resilience, robustness, and learning. But it also involves how this is accomplished, which brings up strategic design issues around redundancy of function, slack resources, cross training, mechanisms for event analysis and learning from operating experience, and so forth. Analogously, deference to


expertise is partially cultural, illustrated by the difference in status accorded to craft workers in the United States and Germany: in the United States, even skilled crafts are performed by blue-collar workers of modest social status and limited authority in their industry (relative to engineers and managers), whereas in Germany the meister or master craftsman is revered as an expert and leader. However, deference to expertise is also about power and how much expertise is valued compared to rank. In the nuclear power industry, for example, the chief operator in the control room is officially (and legally) in charge during any emergency, and yet when an executive shows up and tries to give orders or even ask questions, it is difficult for the chief operator to ask the executive to leave the control room (which is considered an appropriate action in an emergency). In Germany and some other countries, the collective power of craft workers is expressed in strong unions, tripartite participation with industry and government, and mandated membership on company boards of directors. One danger of the HRO approach is that it can be construed as placing responsibility on lower ranks of workers to make decentralized decisions. Exhortations for workers to be vigilant about weak signals and to defer decentralized decisions to frontline workers who have detailed knowledge of the work can slip into blame of those workers when something goes wrong. The new human error could be failure to be vigilant or to not exercise expertise. Although that is a misreading of HRO, it is an understandable temptation. A principle from a different professional culture, engineering, states that reliance on human attention is the weakest form of defense against accidents. Better to design the hazard out of the system or provide engineered defenses that mitigate a problem or to have clear procedures that help people do the safe thing. In short, relying on decentralized decisions in a highly coupled system misses opportunities to examine the larger system and make improvements in design and operation that help those decentralized decision makers work better. As Jens Rasmussen explained in his analysis of the MS Herald of Free Enterprise ferry disaster, each local actor could “not see the complete system and judge the state of the multiple defenses conditionally depending on decisions taken by other people” (1997, p. 189). Expecting a person to interpret and respond to a unique event in the moment is like blaming the goalie in soccer or ice hockey for every goal—reducing shots on goal is everyone’s job in a team sport, whereas the goalie is the last (and sometimes desperate) line of defense.


Given the comprehensive nature of the SUBSAFE program, it should be no surprise that attention is also paid to cultural issues. Leaders routinely point to three cultural challenges: ignorance, arrogance, and complacency. Avoiding these traps is a constant struggle; passionate, engaged, and effective leadership is considered the key to setting the right cultural tone. Leaders actively promote a questioning attitude that includes critical self-evaluation, a learning orientation (continuous training, audit philosophy), an assumption that everyone is trying to do the right thing (but we have to verify), and a focus on objective quality evidence rather than opinion. Each year SUBSAFE holds an annual meeting on the anniversary of the loss of the Thresher. During this meeting, the team views a video of the Thresher that includes an emotionally laden set of images of the crew and civilians on board; relatives of those who died attend and speak of their loss. It is a cultural ritual that keeps alive the experience and emotions of a disaster. Of course, the annual meeting and associated training also have a strategic design purpose to train on lessons learned and changes made during the past year, but the heart of the annual meeting is a shared emotional commitment to safety—they will never forget the Thresher. The very concept of “safety” is a cultural construction (certainly “safety culture” is!). The conceptualization of safety—and it is different from place to place—affects the way safety is managed. There are as many kinds of safety as there are hazards, including tripping hazards, falling objects, fires, explosions, toxic releases, radiation, theft of hazardous material, and terrorist attacks. In various industries and countries, safety may be managed separately or bundled in with health, environment, and/or security. In many US industries, “safety” refers to industrial safety as regulated by the Occupational Safety and Health Administration (OSHA) and measured by days away from work or other indexes. Even in high-hazard industries with potential for low-probability but extreme-consequence disasters, safety statistics are often dominated by falls, dropped objects, and car accidents. For example, over the past twenty-five years, BP has had an impressive and measurable reduction in personal safety incidents driven by an obsessive focus on hazards. I have taught in MIT-BP executive education programs where the class could not start until there was a safety moment describing the emergency procedures and status of any emergency drills (i.e., were any alarms expected?). All extension and audiovisual cords had to be taped to the floor to avoid tripping hazards. During class I


was asked to get off the table I was sitting on because tables were not designed to support my weight. As I was talking with one BP employee while walking up a flight of stairs, he took my hand and moved me over to the banister. The cultural belief at this time was that if employees took care of these safety hazards, BP operations would be safe. Unfortunately, as revealed by the Texas City chemical plant explosion in 2005 and again by the Gulf oil spill in 2010, assuring process safety or system safety involves a different set of skills and knowledge. Process safety hazards are often invisible and can involve combinations of multiple pieces of equipment, materials in process, human actions, and computer software that cannot be understood just by looking at the screen. Nor will everyone doing what is in the procedure manual necessarily avoid accidents, since procedures are frequently missing, incomplete, confusing, or wrong. In 1979 operators at Three Mile Island were trained to do exactly the wrong thing: they believed they should keep the pressurizer from “going solid” (filling completely with water), since that would make the plant vulnerable to water hammer, which could break even very large pipes. Indeed, breaking a pipe on a submarine (which is where many of the operators received their first training in nuclear power operations) is an extreme hazard, whereas losing cooling water in a small reactor with very little nuclear fuel is not much of a problem. However, in a large commercial nuclear power plant with a huge amount of nuclear fuel, “going dry” (letting the water boil off) is a far greater hazard than going solid.

Putting the Lenses Together: What's Next for Safety Management?

The three lenses framework is not a theory of organizations or of safety management but rather an approach to achieving more useful understanding. The lenses are not mutually exclusive, and compartmentalizing knowledge by lens (as we have done in the preceding discussion) is only a checkpoint to make sure we are taking everything into consideration. In analyzing an event or an organization, the ideas from each lens are not so much added together as they are compared and combined to achieve a more comprehensive analysis from multiple perspectives. Especially when planning and enacting change, we must consider all the lenses to understand unintended side effects and issues around implementation (typically political resistance and cultural fit).


Consider an analogy to experiencing modern art: rather than creating a representation of something and conveying a message to the viewer, the artist creates an opportunity for sensemaking in which the viewer can have multiple experiences. Looking at the colors of the piece generates one set of impressions, looking at the figures evokes another, thinking of the images from different perspectives (left, right, near, far) creates more impressions, and so forth. It can be unpleasant, confusing, exhausting, and disappointing (all at the same time!), but it can also be exciting, insightful, challenging, fresh, and even revelatory. Analyzing an organization using all three lenses can be all those things too. SUBSAFE has served as a valuable set of examples throughout this chapter. Although it is part of the US Navy and thus subject to all sorts of stereotypes about “command and control,” the reality is that the SUBSAFE program has given significant attention to all three lenses, and that is why, in my opinion, it has been successful for over fifty years. To reiterate, it is not just the structure of the program as a set of requirements, obsessive documentation of objective quality evidence, roles and responsibilities, audit practices, and so forth, but also the enacted and reenacted experience of the program with its balance of powers, annual renewal ceremony, audit philosophy and teamwork, and engaged leadership (and much more). A second interesting example of combining the three lenses arises from concepts of cultural strength such as Ron Westrum’s (2004) model of organizational stages of safety culture (pathological, reactive, calculative, proactive, and generative), which is similar to the DuPont Bradley Curve that represents stages of culture progress (reactive, dependent, independent, interdependent) measured by a safety perception survey that is intended to predict reductions in safety incidents. Although these models are labeled as stages of culture development, they really represent all three lenses. Indeed, I would assert that what is needed to move from one stage to the next is not more of the same but rather using a different lens. The pathological culture is really a political culture characterized by each stakeholder being out for his- or herself. The reactive culture is highly local, with subcultures doing their thing until something bad happens, at which time they do what is required and go back to local routines. Moving to a calculative culture means establishing structures and procedures; setting a strategic design with goals, metrics, and incentives; and seeking compliance to standard procedures. This can only be accomplished if local authority and discretion are subsumed to centralized authority in the


form of plans, standards, and other routines. A guiding coalition has to make and enforce the rules, and this emerges from and contributes to a shift in the balance of power. But a proactive culture requires a different kind of change: reporting of near misses and other problems requires more than a compliance orientation and mandate to report. Indeed, a compliance orientation is often seen as standing in the way of a strong safety culture built on trust, participation, and teamwork. The generative culture is an effort to be systemic, bringing all lenses together in a mindful way (cf. Hudson, 2007).

Researchers are beginning to confront the tensions among multiple goals and approaches. For example, safety climate does not always have simple linear effects on safety outcomes but rather has to be considered in a context of other climates (other goals) such as learning climate and innovation climate. Research has found seemingly paradoxical results in which more safety climate produces worse outcomes when the organization does not have complementary climates, such as a learning climate (Katz-Navon, Naveh, & Stern, 2009).

Managers and other practitioners are rightfully concerned about taking action. They want something that works in practice, regardless of whether it works in theory. Safety management has been a domain in which the world of practice has often led the world of theory: the genesis of HRO theory from observations of the USS Carl Vinson and other HROs is just one example. The SUBSAFE program was not designed through the three lenses, but somehow the designers seem to have anticipated features from all the lenses. Although the three lenses may appear simplistic and obvious in hindsight, my three decades of research and consulting in several high-hazard industries suggest that senior leaders typically operate from their own perspectives and rarely do the breadth of thinking that is facilitated by using all three lenses, nor do they readily create a climate in which others in the organization can bring alternative perspectives to bear. Leaders and managers (everyone, really) would benefit from a little more openness, a little more humility, a little more curiosity, and a little more inquiry and experimentation, even if that takes a little more time (see Schein, 2013).

The Remainder of this Book

The remaining chapters of Organizing for Reliability offer a broad and deep examination of safety management and HRO in particular. This chapter serves


as an introduction and way of setting the stage. It challenges the reader not only to consider each chapter in terms of its own approach but also to situate each chapter among the other chapters and within each of the three lenses. For example, the strategic design lens is at the core of Chapter 10, which reviews reliability in health care. The cultural lens is at the heart of Chapter 12, which focuses on HRO implementation. Of particular interest are chapters that use more than one lens, such as Chapter 7, which uses the strategic design and cultural lenses; Chapter 5, which invokes the strategic design and political lenses; and Chapters 4 and 9, which cite the political and cultural lenses. Finally, Chapter 11, focusing on resilient communities, uses all three lenses. As our field develops further, the three lenses could serve as an analytical device and a call for synthesis, although a single overarching theory is unlikely to emerge.

References

Ancona, D., Kochan, T., Scully, M., Van Maanen, J., & Westney, D. E. (2004). Managing for the future: Organizational behavior & processes (3rd ed.). Boston: South-Western.
Carroll, J. S., & Hatakenaka, S. (2001). Driving organizational change in the midst of crisis. MIT Sloan Management Review, 42, 70–79.
Edmondson, A. (1999). Psychological safety and learning behavior in work teams. Administrative Science Quarterly, 44, 350–383.
Grubin, D. (Producer/Director). (2010, April 7). The Buddha [Television broadcast]. Arlington, VA: Public Broadcasting Service (PBS).
Hudson, P. (2007). Implementing a safety culture in a major multinational. Safety Science, 45(6), 697–722.
Institute of Nuclear Power Operators. (2012). Traits of a healthy nuclear safety culture. Atlanta, GA: Author.
International Nuclear Safety Advisory Group. (1991). Safety culture: A report by the International Nuclear Safety Advisory Group (Safety Series No. 75-INSAG-4). Vienna: International Atomic Energy Agency (IAEA).
Kaplan, R. S., & Norton, D. P. (1992). The balanced scorecard: Measures that drive performance. Harvard Business Review (January–February), 71–79.
Katz-Navon, T., Naveh, E., & Stern, Z. (2009). Active learning: When is more better? The case of resident physicians' medical errors. Journal of Applied Psychology, 94, 1200–1209.
La Porte, T. R., & Consolini, P. M. (1991). Working in practice but not in theory: Theoretical challenges of "high-reliability organizations." Journal of Public Administration Research and Theory, 1(1), 19–47.
Leavitt, H. J. (1965). Applied organizational change in industry. In J. G. March (Ed.), Handbook of organizations (pp. 1144–1170). Chicago: Rand McNally.
Leveson, N. (2012). Engineering a safer world: Applying systems thinking to safety. Cambridge, MA: MIT Press.
Leveson, N. (2015). A systems approach to risk management through leading safety indicators. Reliability Engineering and System Safety, 136, 17–34.


Leveson, N., Cutcher-Gershenfeld, J., Carroll, J. S., Barrett, B., Brown, A., Dulac, N., . . . & Marais, K. (2005). Systems approaches to safety: NASA and the space shuttle disasters. In W. Starbuck & M. Farjoun (Eds.), Organization at the limit: Lessons from the Columbia disaster (pp. 269–288). Malden, MA: Wiley-Blackwell.
Nickerson, J. A., & Zenger, T. R. (2002). Being efficiently fickle: A dynamic theory of organizational choice. Organization Science, 13, 547–566.
Perrow, C. (1984). Normal accidents: Living with high-risk technologies. New York: Basic Books.
Pooley, E. (1996, March 4). Nuclear warriors. Time, pp. 46–54.
Rasmussen, J. (1997). Risk management in a dynamic society: A modeling problem. Safety Science, 27(2/3), 183–213.
Roberts, K. H. (1990). Some characteristics of one type of high reliability organization. Organization Science, 2, 160–176.
Schein, E. H. (1990). Organizational culture. American Psychologist, 45, 109–119.
Schein, E. H. (2013). Humble inquiry: The gentle art of asking instead of telling. San Francisco: Berrett-Koehler.
Sullivan, P. E. (2003). Statement of Rear Admiral Paul E. Sullivan, U.S. Navy Deputy Commander for Ship Design, Integration and Engineering Naval Sea Systems Command before the House Science Committee on the SUBSAFE program. Retrieved from http://www.navy.mil/navydata/testimony/safety/sullivan031029.txt
Turner, B. A. (1978). Man-made disasters. London: Wykeham Science.
Waterman, R. H., Jr., Peters, T. J., & Phillips, J. R. (1980). Structure is not organization. Business Horizons, 23(3), 14–26.
Weick, K. E., & Sutcliffe, K. M. (2007). Managing the unexpected: Resilient performance in an age of uncertainty (2nd ed.). San Francisco: Jossey-Bass.
Westrum, R. (2004). A typology of organisational cultures. Quality and Safety in Health Care, 13(suppl. 2), ii22–ii27.
Zohar, D. (1980). Safety climate in industrial organizations: Theoretical and applied implications. Journal of Applied Psychology, 65(1), 96–102.

Chapter 4

Mindful Organizing
Kathleen M. Sutcliffe

HROs are central to organization theory because they provide a unique window into organizational effectiveness under trying conditions. The ways in which HROs "mindfully" organize are a dormant infrastructure for performance improvement in all organizations. In fact, the claim that HROs are adaptive organizational forms for complex environments is as true for organizations facing increasingly volatile, uncertain, complex, and ambiguous environments in the twenty-first century as it was in 1999 when the concept of high reliability organizing was first introduced (see Weick, Sutcliffe, & Obstfeld, 1999). True, HROs have been treated as exotic outliers in mainstream organization theory, but their processes are not unique.

Reliable organizations are sensitive to and constantly adjust to small cues or mishaps that if left unaddressed, could accumulate and interact with other parts of the system, resulting in larger problems. By constantly adapting, tweaking, and solving small problems as they crop up throughout the system, organizations prevent more widespread failures. (Barton & Sutcliffe, 2009, p. 1330)

In this chapter I examine processes of mindful organizing as a means for achieving simultaneous adaptive learning and reliable performance. Although processes of mindful organizing and the mechanisms through which reliable performance is enacted are often thought to be stronger in HROs, an emerging view is that they are no less important for ordinary organizations. Naturally, some organizations, such as "prototypical" HROs (i.e., nuclear power generation plants, ATC and commercial aviation, and naval aircraft carrier operations), cannot afford to fail because when they do, results can be catastrophic. Thus,


some may think that these high stakes make HROs irrelevant. But most ordinary organizations experience hazards and crises of a much smaller scale every day. Crises are relative to what organizations and their members expect won’t go wrong. In other words, crises are relative to what people (and organizations) come to count on. For example, when a shipping company like FedEx grounds a faulty aircraft because a worn part cannot be repaired and the supplier is out of stock, it may not cost lives for the courier or its supplier, but it may be a “disaster” to customers who were expecting artwork for an opening exhibition, an originally signed legal brief, or a critical engine part (Sutcliffe & Vogus, 2014). It is a fiasco to those who counted on FedEx to reliably deliver what it had promised. For other RSOs, such as technologically complex pharmaceutical firms that are subject to regulation, competition, and demands for quality and social responsibility, the stakes may be even higher (Rerup & Levinthal, 2014). Unreliable performance may not cost billions of dollars in damages, but it can cost reputations, market shares, careers, and livelihoods. Mindful organizing decreases the likelihood that organizations will be blindsided by events that they didn’t see coming, and disabled by events that do catch them unawares (Weick & Sutcliffe, 2007). I begin by describing the conceptual foundations of mindful organizing and the associated concept of collective mindfulness in research on high-risk HROs (Weick, 1987; Weick & Roberts, 1993; Weick et al., 1999) and studies of individual mindfulness (e.g., Langer, 1989). I then describe processes of mindful organizing and review recent research to explore how this research domain has evolved over the past decade. I move on to examining lingering questions and possible avenues for future research. I conclude with some implications for managerial practice.

Conceptual Foundations

The perspective on mindful organizing described in this chapter emerged in the late twentieth century as part of the evolving research stream on high-hazard organizations. The growing body of work on HROs at that time was characterized (with some exceptions; see, for examples, Roberts & Rousseau, 1989; Roberts, Rousseau, & La Porte, 1994) as an eclectic mix of case studies, more descriptive than theoretical, relatively a-conceptual (more about


accidents and crises than organizations), and of insufficient coherence to generalize (Weick et al., 1999). Existing studies had tended to focus on system characteristics, bureaucratic mechanisms such as organizational structure and formal processes (e.g., policies and procedures, extensive training, etc.), technological redundancy, and other activities aimed at anticipating or precluding untoward events (Roberts, 1990, 1993). Yet, the internal dynamics and microsystem processes in these organizations had been relatively underexplored. As researchers recognized the value of using HROs as templates of adaptive organizational forms for increasingly complex environments, interest in articulating the mechanisms through which these organizations achieved reliable performance surged along with understanding the conditions under which everyday organizations resemble HROs and how these prosaic organizations might replicate HROs’ exceptional performance (Barton & Sutcliffe, 2009). The provocation to broaden and deepen research in this area was accompanied by a desire to integrate this research stream into organizational studies more generally. The HRO literature, perceived as a unique niche on the periphery of mainstream organization theory, was disconnected from studies of more ordinary, everyday organizations. Scholars such as sociologist Dick Scott (see Scott, 1994, p. 25) had questioned this state of affairs and argued that research on high-risk organizations should be more widely diffused as it had potential to inform existing research, particularly in the domains of organizational effectiveness and organizational learning. HROs deserved closer attention, theoretically speaking, because of their capabilities to adapt and to suppress inertia in complex, dynamic environments, but they also deserved closer attention practically speaking (Weick et al., 1999). Recall that interest in and research about total quality management and business process reengineering grew wildly in the last two decades of the twentieth century (see, for example, Dean & Bowen, 1994, a special issue of the Academy of Management Review). The core concern in quality programs was the focus on eliminating defects and reducing variability. In 1982 W. Edwards Deming had suggested that in addition to using statistical process controls, organizations seeking total quality management must create and sustain broad-based organizational vigilance for finding and addressing other systemic problems (see Sitkin, Sutcliffe, & Schroeder, 1994). The frenzied interest in and pressures for higherquality and more highly reliable performance across multiple industry sectors


seemed to be accompanied by increasing volatility and complexity (see Ilinitch, D’Aveni, & Lewin, 1996). Together, these forces propelled researchers to better understand how HROs achieve highly reliable performance and, just as importantly, to more solidly link and embed HRO research into mainstream organization and management theory (Sutcliffe & Vogus, 2014).

Reliability and Logics of Anticipation and Resilience

To be highly reliable is to strive for a minimum of jolt, a maximum of continuity (James, 1909, p. 61). To describe something as reliable is to describe “what one can count upon not to fail in doing what is expected” (Weick & Sutcliffe, 2001, p. 91). Reliability, to put it succinctly, is a “situation specific localized accomplishment” (Weick, 2011, p. 21). In a broad sense, highly reliable performance in complex interdependent organizations and systems emerges from two logics: a logic of anticipation and a logic of resilience (Wildavsky, 1991; Bigley & Roberts, 2001; Schulman, 2004). Anticipation requires that organizational members understand the nature of their work and how it is accomplished, anticipate and identify the events and occurrences that must not happen, identify all possible causal precursor events or conditions that may lead to those events, and then create a set of procedures for avoiding them (Schulman, 2004; Sutcliffe, 2010; Reason, 1997). Repeated high performance, then, is achieved by a lack of unwanted variance in performance (e.g., by doing things just this way through standardized operating procedures and routines). Indeed, studies show that HROs are obsessed with detailed operating procedures, contingency plans, rules, protocols, and guidelines and using the tools of science and technology to better control the behavior of organizational members to avoid errors and mistakes (Hirschhorn, 1993; Schulman, 2004). At one point, the Diablo Canyon nuclear power plant, for example, had 4,303 separate, multistep, written procedures, each one revised up to twenty-seven times, that were designed to anticipate and avoid problems with maintenance, operations, protection, and analysis (Hirschhorn, 1993). Anticipation removes uncertainty and reduces the amount of information that people have to process. It also decreases the chances of memory lapses, minimizes judgment errors or other biases that can contribute to crucial fail-


ures, provides a pretext for learning, protects individuals against blame, discourages private informal modifications that are not widely disseminated, and provides a focus for any changes and updates in procedures (Sutcliffe, 2010; Goodman et al., 2011). But it is impossible to write procedures to anticipate all the situations and conditions that shape people’s work (Hirschhorn, 1993; Sutcliffe, 2010). Even if it were possible to craft procedures for every situation, too many rules can create unwanted and even harmful complexity (Katz-Navon, Naveh, & Stern, 2005). People can lose flexibility in the face of too many rules and procedures. In some instances compliance with detailed operating procedures is critical to achieving reliability, in part because it creates operating discipline. But blind adherence is unwise if it reduces the ability to adapt or to react swiftly to unexpected surprises (Sutcliffe, 2010). The idea that standard operating procedures and invariant routines are means through which reliable outcomes occur, conflates variation and stability and makes it more difficult to understand the mechanism of reliable performance under dynamic, varying conditions (Weick et al., 1999; Weick & Sutcliffe, 2006, Levinthal & Rerup, 2006). Reliability is far broader. To respond flexibly in real time, reorganizing resources and taking actions to avoid drift (Pettersen & Schulman, 2016), maintain or restore functioning despite unforeseen disruptions (e.g., surprises, variations, or peripheral failures), or to recover after major damage requires resilience as well as anticipation (Schulman, 2004; Weick et al., 1999). These abilities are generally traced to dynamic organizing. HROs develop capabilities to detect, contain, and bounce back from inevitable surprises such as mishaps, errors, and failures that are part of an indeterminate world. The hallmark of an HRO is not that it is error-free but that errors don’t disable it (Weick & Sutcliffe, 2007). Reliable performance in complex systems is complicated because it is a dynamic nonevent (Weick, 1987)—something that is difficult to specify and visualize. As Karl Weick has written (1987, 2011), “dynamic” refers to the fact that highly reliable performance is preserved by timely human adjustments. “Nonevent” refers to the fact that successful outcomes rarely call attention to themselves. Because reliable outcomes are constant, there is nothing to pay attention to. This constancy can increase the propensity toward complacency and inertia as it decreases vigilance, the sense of vulnerability, and the quality of attention across an organization (Sutcliffe, 2010). Coupled with the fact that


our grasp of events is always a bit behind because discrete concepts simplify and lag behind continuous perceptions (Weick, 2011, p. 25), this can be a recipe for disaster. Adverse outcomes sometimes result from performance and execution mistakes, but misperceptions, misconceptions, and misunderstandings can lead to even greater vulnerability and harm (Reason, 1997; Schulman, 2004; Weick, 2011). For example, in five of the seven years between 1988 and 1994, the Bristol Royal Infirmary (BRI), a teaching hospital associated with Bristol University’s Medical School located in southwestern England, experienced a mortality rate for open-heart surgery in children under the age of one year that was roughly double that of other centers across the country (Weick & Sutcliffe, 2003, 2007). A close analysis revealed that the collective conception that practices were less than ideal but that BRI and its surgical team were “on a learning curve,” minimized questions, feedback, and inquiry into the quality of care being delivered (Weick & Sutcliffe, 2007, see chap. 6). In sum, highly reliable performance results not from organizational invariance but rather from continually managing fluctuations in job performance and human interactions (Weick et al., 1999, p. 88). To some extent all organizations, like prototypical HROs, face uncertain, volatile, and complex environments that increase vulnerability to surprises and conditions that can change without warning. These changes can have large and negative consequences. To remain reliable, organizational systems must be able to handle unforeseen situations in ways that forestall negative consequences. This requires mechanisms that enable organizational members across an organization to become alert and aware of inevitable fluctuations, to cope with, circumscribe, or contain them as they occur and before their effects escalate and ramify. This requires mindfulness enabled by mindful organizing to which we now turn.

Organizational Mindfulness and Mindful Organizing

Organizational mindfulness has been defined as a collective capability to discern discriminatory detail about emerging issues (threats/opportunities) and to act swiftly and wisely in response to these details (Weick et al., 1999; Weick & Sutcliffe, 2006; Vogus & Sutcliffe, 2012). Although originally used to describe how HROs avoid catastrophe and perform remarkably well under


trying conditions, organizational mindfulness has come to characterize “organizations that pay close attention to what is going on around them, refusing to function on ‘auto pilot’” (Ray, Baker, & Plowman, 2011, p. 188). For example, substantial overlap exists between the processes and practices that characterize mindful organizing and the attributes of the so-called Toyota Way. The Toyota Way is the set of principles defining Toyota’s organizational culture, production processes, and decision-making processes and structures (Liker, 2004). It comprises two pillars, the principle of continuous improvement and the principle of respect for people. Toyota’s principle of continuous improvement reflects a desire to learn, both periodically from major events and continuously from everyday experiences in order to make adjustments and improvements no matter how small (Ibid.). A focus on collective mental processes was informed by research in organization theory and psychology. For example, Lloyd Sandelands and Ralph Stablein (1987, pp. 137–138) argued that organizations are mental entities capable of thought. As they proposed, although “mind” is commonly defined by what it is able to do, such as think, feel, perceive, or will, it is “not so much a substance with intellective powers as it is a process of forming ideas . . . an ideational process.” Sociologist Ron Westrum (1992, 1997) similarly proposed that some organizations (e.g., HROs) are generative, thinking entities protected by a comprehensive envelope of human thought (1997, p. 237). Weick and Karlene Roberts (1993) found that the reliable operations on naval aircraft carrier flight decks resulted from the “collective mind,” embodied in the interrelating of crewmembers’ social activities and interactions. Weick, Kathleen Sutcliffe, and David Obstfeld (1999) took this research one step further by building on Ellen Langer’s Western perspective on mindfulness to create a clear specification of the collective mindfulness concept and the means through which it is brought about (Sutcliffe & Vogus, 2014). Langer (1989, 2005) defined mindfulness as a state of alertness and lively awareness and proposed that when individuals engage in routine behaviors they are inclined to act mindlessly, which sometimes leads to untoward outcomes. To counteract this tendency, people have to learn to switch their modes of thinking in order to see “both similarities in things thought different and differences in things thought similar (2005, p. 16). In Langer’s model (grounded in a cognitive information-processing framework), individuals exhibit the


enriched awareness associated with a mindful state by (a) actively differentiating and refining their experience through categories and distinctions (1989, p. 138); (b) creating new, discontinuous categories out of ongoing events and experiences (p. 157); and (c) appreciating the nuances of contexts and alternative ways to adapt and operate in them (p. 159). In extending these ideas to the group level, Weick and colleagues (1999) proposed that organizational mindfulness is not about single individuals being mindful or engaging in meditative practices, although the veracity of that claim is unsettled and is a prime target for future scholarly research. Rather, they argued that in a compact way, it is an enriched alertness and awareness across an organization. It involves interpretive work directed at weak signals of unexpected developments. That interpretive work consists of differentiation of perceptions and reframing of conceptions (Barton, Sutcliffe, Vogus, & DeWitt, 2015; Weick, 2011). Interpretive work suggests both corrective actions and new sources of ignorance that may become new imperatives for noticing. As Weick and Sutcliffe (2006) elaborate, mindfulness means that organizations differentiate coded information more fully and more creatively and develop a rich awareness of detail. In other words, mindfulness increases attentional stability and attentional vividness. An important facet of mindfulness is that it is focused on the present. It is about attention as much as thinking (Weick & Sutcliffe, 2006). Mindfulness is focused on clear and detailed comprehension of one’s context and on ­factors that interfere with such comprehension. Organizational mindful­ness has been portrayed as a relatively enduring and stable property of organizations (Ray et al., 2011; Vogus & Sutcliffe, 2012), a top-down process that creates the context for thinking and action on the front line. But, as Timothy Vogus and Sutcliffe (2012) highlight, mindfulness produces reliability (both strategic and operational) by operating across organizational levels: “It is not enough to focus on senior managers, middle managers, or frontline employees in isolation” (p. 726). Collective mindfulness is expressed as by-products of mindful ­organizing—­specifically through preoccupation with failure, reluctance to simplify interpretations, sensitivity to operations, commitment to resilience, and flexible decision structures that migrate problems to pockets of expertise, which operate at multiple organizational levels (Weick et al., 1999; Weick & Sutcliffe, 2001, 2006, 2007; Vogus & Sutcliffe, 2012).


Mindful Organizing in Practice

Theory detailing how mindful organizing is enabled and enacted has proliferated over the past decade. Mindful organizing is enabled through the actions and interactions of leaders and organizational members as they shape the organization's social and relational infrastructure (e.g., climate and culture; Weick, 1987; Weick, 2011). That is, mindful organizing is more likely to take hold under particular contextual or cultural conditions. Mindful organizing is enacted through a set of five interrelated guiding principles and accompanying contextualized practices (Weick et al., 1999; Sutcliffe & Vogus, 2014; Vogus & Hilligoss, 2016). Together, these principles strengthen the system's (e.g., team, unit, organization) overall capabilities to discern, learn, and adapt. I describe the basics of these dynamics in the following sections.

Conditions That Enable Mindful Organizing

Whatever is realized day to day in work organizations "depends on the construction of relationships of work: people and their objects and technologies, on the one hand, and people with people on the other. . . . [H]ow people relate to each other . . . explains the kind of intelligence they produce" (Taylor & Van Every, 2000, p. x). In other words, intelligence is a product of interconnectivity (p. 213). Relationships and the ways in which people interact are central to organizing for highly reliable performance because they are the site where trust is negotiated (Weick, 2009), where discourse can sharpen or blunt sensitivity to unexpected discrepancies (Barton, Sutcliffe, Vogus, & DeWitt, 2015), and where ambiguity about action options can be more or less resolved (Weick & Sutcliffe, 2015). Weick (2009) recounts how organizational members often face collective situations in which their private view is at odds with a majority view. When people face this common social and organizational dilemma, they can feel threatened, which makes it harder for them to express their views and/or speak up about potential problems (Blatt, Christianson, Sutcliffe, & Rosenthal, 2006). These dynamics are often stronger in organizational contexts where work is hierarchical, distributed, and dynamic and where there are widespread differences in social status and power. To encourage people to voice their concerns and question interpretations, organizational leaders and members must establish a context of trust and respect. In contexts where respect is a norm,


people are both more likely to communicate their interpretations to others and more likely to generate shared interpretations through these communications and opportunities for interaction (Christianson & Sutcliffe, 2009). For example, in a study of medication misadministration in a large sample of nursing units, Vogus and Sutcliffe (2007b) found that processes of mindful organizing were more strongly associated with reliable outcomes in units where nurses reported a strong climate of trust. In addition to attending to relational norms and building a context of trust and respect, leaders and organizational members must pay heed to issues of interconnectivity, particularly to the interrelating of activities (Weick, 2011). Research detailing the performance reliability of flight operations on aircraft carriers has shown that reliable outcomes are more likely when the crews of aircraft carriers are more heedful in their relationships (Weick & Roberts, 1993). Heedful interrelating is a social process through which individual action contributes to a larger pattern of shared action. The pattern of shared action is a threefold combination of contributions, representations, and subordination (Weick, 2009, p. 164). That is, when people interrelate heedfully, they first understand how a system is configured to achieve some goal, and they see their work as a contribution to the system and not as a standalone activity. Second, they see how their job fits with other people's jobs to accomplish the goals of the system (they visualize the meshing of mutually dependent contributions). And third, they maintain a conscious awareness of both as they perform their duties. Although a simplification, one way to better understand heedful interrelating is by considering its opposite, heedless interrelating—when people simply do their jobs without considering how their work contributes to the overall outcome and ignoring what is going on around them, both upstream and downstream. I draw attention to the importance of developing and enhancing norms of respect and trust as well as people's abilities to work effectively with their colleagues because without a strong relational foundation, mindful organizing is much more difficult to attain.

Mindful Organizing Processes

Mindful organizing is a macro-level pattern of collective daily processes and organizing practices that help people to focus attention on perceptual details that are typically lost when they coordinate their actions and share


their interpretations and conceptions (Weick et al., 1999). Mindful organizing reduces the loss of detail by increasing the quality of attention across the organization, enhancing people’s alertness and awareness so that they can detect discrepancies and subtle ways in which situations and contexts vary and call for contingent responses (Weick & Sutcliffe, 2006, 2007). Mindfulness generates enhanced awareness of context, which then makes situations more meaningful. As noted earlier, that increase in meaningfulness occurs through five interrelated processes and associated practices (i.e., habits [Vogus & Hilligoss, 2016]) that increase information about failures, simplifications, operations, resilience, and expertise associated with present and emerging challenges (Weick & Sutcliffe, 2007). Without these contextual inputs, awareness is more likely to be simplified into familiar categories that have been applied in the past (e.g., mindlessness). Although there is nothing wrong with influence from the past, it tends to be stripped of context and subject to the editing of hindsight (Weick & Sutcliffe, 2015). The following list describes mindful organizing and its five key processes: 1. Preoccupation with failure. A preoccupation with failure is consistent with James Reason’s (1997) emphasis on the importance of fostering “intelligent and respectful wariness” (p. 195) to avoid catastrophe. This preoccupation reflects the organization’s ongoing proactive and preemptive analyses of surprises and possible vulnerabilities and aims to overcome normalizing, crude labels, inattention to current operations, rigid adherence to routines, and inflexible hierarchies (Sutcliffe & Weick, 2013). By actively searching for and attending to innocuous or seemingly insignificant weak signals, people can more quickly detect if the system is acting in unexpected ways. 2. Avoidance of simplifying interpretations. To organize for higher reliability is to weaken the shareability constraint and strengthen the impact of knowledge by acquaintance (Weick, 2011). The shareability constraint (see Freyd, 1983; Baron & Misovich, 1999), arises when people coordinate and communicate and “impose discrete but shared concepts on the continuous perceptual flow” and “begin to simplify their direct perceptions into types, categories, stereotypes, and schemas” (Weick, 2011, p. 23). General labels can obscure insightful details and can lull people into a false sense that they know precisely what they face. They also constrain the precautions people take and the number of undesired consequences they imagine.


Consequently, highly reliable organizations create processes to complicate people’s interpretations and worldviews. In practice this means frequently discussing alternatives as to how to go about their everyday work (Vogus & Sutcliffe, 2007a). 3. Sensitivity to operations. Organizational members can build an integrated big picture of current situations through ongoing attention to real-time information and making adjustments to forestall small problems from compounding and growing bigger. Many untoward events originate in latent failures—that is, loopholes in the system’s defenses such as defects in supervision, training, briefings, or hazard identification (Reason, 1997). Being in close touch with what is happening here and now means that latent problems can get the attention they need. It is also a means of reducing the likelihood that any one problem or error will become aligned with others and interact in ways not previously seen (Weick et al., 1999). 4. Commitment to resilience. A preoccupation with failure, reluctance to simplify interpretations, and sensitivity to operations are aimed at anticipating vulnerabilities, contingencies, or discrepancies—either to preclude them or prevent them from accumulating into bigger problems or crises (Weick & Sutcliffe, 2007). Jointly, these three processes enable a rich representation of the complexity of potential threats. But it is not possible to totally reduce uncertainty and anticipate all situations and conditions that shape people’s work. A commitment to resilience and deference to expertise (sometimes known as flexible decision structures) together comprise the pool of expertise and the capacity to use it in a flexible manner that allows for swift recovery from disruptions or surprises (Sutcliffe & Vogus, 2014). Resilience results from enlarging individual and organizational action repertoires and capabilities to improvise, learn, and adapt (Sutcliffe & Vogus, 2003). More mindful organizations have a strong commitment to improving overall capability (Wildavsky, 1991, p. 70) through continual training and simulation, varied job experiences, learning from negative feedback, and ad hoc networks that allow for rapid pooling of expertise to handle unanticipated events (Weick et al., 1999). 5. Deference to expertise. Organizations often privilege hierarchical structures such that important choices are made by important decision makers who participate in many choices (Weick et al., 1999). Mindful organizing ex-

mindful organizing 73

presses a different priority. When unexpected problems arise, the organization loosens the designation of who is the “important” decision maker in order to allow decision making to migrate along with problems (see Roberts, Stout, & Halpern, 1994, p. 622). The result is that expertise trumps rank, which increases the likelihood that new capabilities will be matched with new problems assuring that emerging problems will get quick attention before they blow up. By deferring to expertise, the organization and system is more flexible, has more skills and experience to draw on, and can deal with inevitable uncertainty and imperfect knowledge (Weick & Sutcliffe, 2015).

Research on Mindful Organizing

Earlier I noted that the past decade has seen strong growth in theory and qualitative research on collective mindfulness and processes of mindful organizing, but empirical research testing theory and particular hypotheses has lagged. Perhaps this shouldn’t be surprising in such a relatively new field of inquiry. Research on individual mindfulness is in an early stage of development as well, yet important evidence of mindful organizing, its construct validity, antecedents, and evidence of linkages to important outcomes is beginning to accumulate. Research on mindful organizing has diffused and spans industries such as construction and highway maintenance (Busby & Iszatt-White, 2014), drug rehabilitation (Cooren, 2004), education (Ray et al., 2011), firefighting (Bigley & Roberts, 2001; Barton & Sutcliffe, 2009), offshore oil and gas production (Maslen, 2014), and IPO software firms (Vogus & Welbourne, 2003), to name a few. Studies exploring processes of mindful organizing in health care are particularly prominent. In part, this may reflect the realization that health care, relatively speaking, is a low-reliability industry and is undergoing tremendous social pressure for improvement (Chassin & Loeb, 2013; Sutcliffe, Paine, & Pronovost, 2016). One of the more troubling facts about medical harm (in addition to the fact that errors are rampant) is that most mishaps and errors are an indigenous feature of a dynamic, uncertain, and oftentimes vague unfolding work process. Acts become mistaken after the fact; they don’t start off that way (e.g., they arise as part of the trajectory of medical care; Paget, 1988, p. 93). With its emphasis on improving system awareness and alertness

as well as the capacity to act, it is not surprising to find widespread interest in mindful organizing by researchers, policy makers, and clinicians in the health-care domain. For example, a recent article in the health policy journal Milbank Quarterly coauthored by Mark Chassin, president and CEO of The Joint Commission (TJC), the independent, not-for-profit organization that accredits and certifies more than 20,500 health-care organizations and programs in the United States, and Jerod Loeb (Chassin & Loeb, 2013) suggests that with some attention to readiness (e.g., leadership commitment, cultural engagement, and techniques of process improvement), principles of high reliability and “collective mindfulness” may enable safer and more reliable health care. In the following sections, I review some of the more general findings and move to a discussion of research directions and practical implications.

Construct Validity

Naturally, developing valid ways to measure a given concept is critical for researchers who wish to conduct empirical research on the concept. Thus, issues of construct validity are important considerations. Some of the earliest studies of collective mindfulness inferred its presence from the absence of errors or mishaps in situations where they were expected (Weick & Roberts, 1993). Some studies of collective mindfulness have relied on secondary data from previously published works or inquiry reports (e.g., Weick & Roberts, 1993; Weick et al., 1999), have used qualitative methods that illustrated the concept richly but limited cross-study coherence (e.g., Rerup, 2005, 2009), or have used indirect measures such as leader behaviors and specific human resource practices (Vogus & Welbourne, 2003). Recently, survey measures have been developed and validated both for mindful organizing and for collective mindfulness. I use these as a starting point for our discussion.

Vogus and Sutcliffe (2007a) sought to establish construct reliability as well as the convergent, discriminant, and criterion validity of a measure of mindful organizing in a study of 1,685 registered nurses from 125 nursing units in thirteen hospitals. Their measure consisted of nine items that assessed the degree to which members of a work group collectively engaged in behaviors representing the five processes of mindfulness (e.g., preoccupation with failure: “When giving report to an oncoming nurse we usually discuss what to look out for”; see Vogus and Sutcliffe 2007a, p. 54, for all nine items).

In addition to establishing reliability and validity, they also demonstrated the collective nature of the construct by assessing the extent to which individual responses could be aggregated to the unit level. The results confirmed that the single-factor measure was a precise, unidimensional measure of mindful organizing at the unit level that closely resembled the content domains identified in earlier work (e.g., Weick et al., 1999; Weick & Sutcliffe, 2001). It was also consistent with the original idea that mindful organizing is a joint function of all five processes described earlier (e.g., preoccupation with failure, avoidance of simplification, etc.).

Additional research has confirmed the validity of the mindful organizing construct. For example, Dietmar Ausserhofer and colleagues (Ausserhofer, Schubert, Blegen, De Geest, & Schwendimann, 2013) assessed the psychometric properties (i.e., validity and reliability) of the mindful organizing scale developed by Vogus and Sutcliffe (2007a) in three European health-care samples (German, French, and Italian). The findings supported the one-factor model. Additionally, in a laboratory study of student teams, Vogus and colleagues (Vogus, Tangirala, Lehman, & Ramanujam, 2015) examined the discriminant validity of mindful organizing and demonstrated its distinctiveness from several related work-group constructs including communication frequency, transactive memory, and several teamwork behaviors.

To examine organizational mindfulness in a sample of US business schools with dual goals of empirically validating the organizational mindfulness construct and exploring the usefulness of mindful organizing for the educational context, Joshua Ray, Lakami Baker, and Donde Plowman (2011) developed and validated a five-factor measure of organizational mindfulness and its constituent processes. Ray and colleagues’ measure of organizational mindfulness assesses the extent to which administrators enact practices and structures that work to ensure more mindful ways of acting, thinking, and organizing at the college rather than the subunit level. In addition to confirming a five-factor model (consistent with Weick & Sutcliffe, 2007), Ray and colleagues showed that individuals in different organizational roles (e.g., deans, associate deans, and department chairs) differed in the extent to which they perceive their colleges to be mindful: individuals at the top have more positive perceptions of mindfulness than those in other roles. As Vogus and Sutcliffe (2012) assert, at this stage in the development of collective mindfulness research there is no reason to curtail construct development in search of an ideal index.

The fact that multiple perspectives on mindfulness exist is emblematic of the richness of the mindfulness construct and the deep and wide-ranging lines of study and practice that lie at the core of this work. In fact, “these two measures capture important nuances that distinguish mindful organizing and organizational mindfulness respectively, and presents unique strengths for explaining organizational outcomes” (Vogus & Sutcliffe, 2012, p. 727). Certainly more discriminant validation is needed, but these findings provide evidence that mindful organizing and mindfulness can be measured systematically and rigorously.

Antecedents of Mindful Organizing

Vogus and Iacobucci (2016) examined antecedents of mindful organizing and some mechanisms through which these antecedents exert their effects in a health-care setting. The results showed that reliability-enhancing work practices (REWPs) such as selective staffing, extensive training, developmental performance appraisal, and decentralized decision making were positively associated with mindful organizing and performance reliability. REWPs such as these focus on building interpersonal skills, encourage people to share expertise and make recommendations for improvement, and provide those on the front lines control over their work processes and the opportunity to make local adaptations (p. 913). These practices lead to higher levels of mindful organizing by increasing the levels of trust and respect in employee communications and interactions. In an unpublished study of hospital nursing units, Vogus and colleagues (Vogus et al., 2015) explored the effects of work-group professional characteristics on mindful organizing and found that professional experience had a curvilinear relationship with mindful organizing (i.e., a positive relationship with diminishing returns at high levels of experience). The effects of experience on mindful organizing were strengthened when members of a work group collectively had high professional commitment but were diminished when a work group had high variability in its experience. Similarly, in a study of software firms and performance innovation, Vogus and Theresa Welbourne (2003) found evidence that human resource and work design practices were antecedents of mindful organizing and subsequent innovation in the long term.

Finally, in a study of the California Independent System Operator (CAISO), Emery Roe and Paul Schulman (2008) found that characteristics of frontline personnel (particularly individuals’ cumulative knowledge base and their capabilities to access expertise embedded in organizational networks) were associated with abilities to mindfully organize.

Outcomes of Mindful Organizing

A number of qualitative studies provide evidence of mindful organizing’s effects, many of which pertain to reliable performance outcomes. G. Eric Knox, Kathleen Simpson, and Thomas Garite (1999), for example, studied hospital obstetrical units and found that those with better safety performance and fewer malpractice claims were distinguished by the features of mindful organizing. Other studies in health care affirm the salutary effects of mindful organizing. For example, Roberts and colleagues (Roberts, Madsen, Desai, & van Stralen, 2005; Madsen, Desai, Roberts, & Wong, 2006) conducted a qualitative longitudinal study of a pediatric intensive care unit (PICU) and found that the introduction of mindful organizing practices (e.g., reluctance to simplify interpretations evidenced by constant in-service training to interpret and question data and working hypotheses; collaborative rounding by the entire care team that enabled increased sensitivity to current operations, etc.) was associated with lower levels of patient deterioration on the unit.

In a large-sample quantitative study, Vogus and Sutcliffe (2007b) similarly found benefits to mindful organizing. Inpatient nursing units with higher levels of mindful organizing reported fewer medication errors over the subsequent six months. In addition, the inverse association between medication errors and mindful organizing was stronger when registered nurses reported high levels of trust in their nurse managers and when units reported extensive use of standardized care protocols. Using a measure of mindful organizing similar to Vogus and Sutcliffe (2007a), Vogus and colleagues (Vogus et al., 2015) studied a sample of graduate student teams and found that teams that scored higher on the mindful organizing scale were two and a half times more likely to build a safe bridge that withstood testing.

Outside of health care, Claus Rerup (2009) studied the multinational pharmaceutical company Novo Nordisk using a longitudinal case study method and found that the firm’s recovery from a crisis and subsequent highly reliable performance resulted from alertness and attention to weak signals that were enabled by three mindful processes including a preoccupation with failure, reluctance to simplify interpretations, and sensitivity to operations.

Additional support for the benefits of mindful organizing comes from Rerup’s (2005) study of serial entrepreneurs, in which venture success appeared to be a consequence of processes of mindful organizing, although some evidence suggested that mindfulness was helpful only up to a point. In other words, the relationship between mindful organizing and success was curvilinear.

In addition to its effects on organizational outcomes, mindful organizing appears to have some surprising effects on individuals. For example, in a study of nurses in three midwestern acute-care hospitals, Vogus and colleagues (Vogus, Cooil, Sitterding, & Everett, 2014) found that mindful organizing was negatively associated with emotional exhaustion on units with higher rates of adverse events and positively associated with emotional exhaustion on units with lower rates of adverse events. Moreover, mindful organizing was associated, over time, with lower unit-level turnover rates.
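The quantitative studies summarized here share a common analytic pattern: individual survey responses to the nine-item scale are averaged within work units, and the resulting unit-level mindful organizing score is then related to an outcome, sometimes conditional on a moderator such as trust in the nurse manager. The sketch below is a minimal, hypothetical illustration of that pattern; the file names, column names, and model specification are placeholders for exposition and are not the data or procedures used in the studies cited above.

```python
# Illustrative sketch only: hypothetical files, columns, and a simplified model,
# not the analyses reported in the studies cited in this chapter.
import pandas as pd
import statsmodels.formula.api as smf

# Each row is one nurse's survey response; mo_1..mo_9 stand in for the
# nine mindful organizing items.
df = pd.read_csv("survey_responses.csv")
items = [f"mo_{i}" for i in range(1, 10)]
df["mo_individual"] = df[items].mean(axis=1)

# Aggregate responses to the unit level, the theoretical level of mindful organizing.
units = df.groupby("unit_id").agg(
    mindful_organizing=("mo_individual", "mean"),
    trust_in_manager=("trust_item", "mean"),
).reset_index()

# Merge in unit-level outcomes (e.g., reported medication errors).
outcomes = pd.read_csv("unit_outcomes.csv")
units = units.merge(outcomes, on="unit_id")

# A simple test of an inverse association and its moderation by trust,
# echoing the pattern of findings described in the text.
model = smf.ols(
    "medication_errors ~ mindful_organizing * trust_in_manager", data=units
).fit()
print(model.summary())
```

A curvilinear association of the kind reported by Rerup (2005) would be probed in a similar way, for example by adding a squared term for the predictor to the model formula.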

Future Research Agenda

The growth of theory in high reliability and mindful organizing over the past few years has led to a surfeit of ideas related to gaps, unresolved questions, and areas ripe for future inquiry (for examples see Vogus & Sutcliffe, 2012; Sutcliffe & Vogus, 2014; Rerup & Levinthal, 2014). Some of these ideas include: more closely and systematically examining the variety of organizational outcomes of mindful organizing (Vogus & Sutcliffe, 2012; Sutcliffe & Vogus, 2014); exploring the subjective and affective experiences of those working in contexts where mindful organizing is the norm (Sutcliffe & Vogus, 2014); understanding the cost and effectiveness trade-offs of mindful organizing and the reasons why mainstream organizations pursue these processes in the absence of obvious threats (Ibid.); more clearly understanding the role of organizational routines in mindfulness and mindful organizing (Vogus & Hilligoss, 2016; Rerup & Levinthal, 2014); and better understanding how mindful organizing influences the capability to change course or adapt and adjust in real time (Gartner, 2013; Sutcliffe & Vogus, 2014). I expand on some recent ideas and propose some additional areas that are in need of further exploration in the paragraphs that follow. One lingering unanswered question in the literature is whether there is a link between individual and organizational mindfulness (Vogus & Sutcliffe,

2012; Sutcliffe & Vogus, 2014). To date the literatures on organizational mindfulness and mindful organizing have been silent on the role of individual mindfulness. As described earlier, collective mindfulness grew out of research on individual mindfulness. But the question as to whether mindful organizing and organizational mindfulness require (or depend on) individual mindfulness has, with a few exceptions (see Weick & Putnam, 2006), been unexplored. Recall my perspective that organizational mindfulness results not from individual mindsets or intrapsychic processes but rather from patterns of action and interaction (e.g., organizing processes; Weick & Sutcliffe, 2007). But this isn’t to say that mindful organizing and individual mindfulness are unrelated, if only because in the process of acting and interacting, individuals may develop habits or routines that enable their own mindfulness. Or, it may be the other way around. That is, individual mindfulness may be antecedent. The development of a multilevel model for mindful organizing and mindfulness is just beginning to emerge from a variety of findings across the individual and organizational mindfulness literatures (see Sutcliffe, Vogus, & Dane, 2016). For example, in studying the beneficial effects of mindfulness on a variety of positive individual and organizational outcomes, Jochen Reb, Jayanth Narayanan, and Zhi Wei Ho (2015) found that organizational constraints and organizational support predicted employee mindfulness and also that leader mindfulness played an important role in fueling employee mindfulness and outcomes such as employee well-being. Research by James Ritchie-Dunham (2014), consistent with prior work by C. Marlene Fiol and Edward O’Connor (2003), suggests that mindful leaders shape organizational mindfulness through the strategic processes they institute. Mindful leaders are those who convincingly commit to mindful practices, institutionalizing consideration of new perspectives, new categories, and new information. These processes include appreciative inquiry and engagement of multiple stakeholders, scenario planning and double-loop learning to appreciate situation and context, and story building and story busting to observe, refute, and ultimately integrate new information (Ritchie-Dunham, 2014). In part, this work may reflect how leaders and organizational members reciprocally shape a mindful organizational culture through patterns of mindful organizing. Weick and Sutcliffe (2007) also discuss leadership in the context of organizational mindfulness and culture, particularly the importance of credibility and communication of core values in fueling mindful organizing. This research suggests that organizational

context and leadership play an important role in shaping individual mindfulness in the workplace and provide interesting top-down evidence for linkages. Daire Cleirigh and John Greaney (2014), however, found that individuals treated with a brief mindful intervention performed better on group tasks, suggesting a possible bottom-up effect of individual mindfulness on group interactions and group/organizational performance. Silke Eisenbeiss and Daan van Knippenberg (2015) studied leader-follower dyads and found that follower mindfulness importantly moderated the association between ethical leadership and follower effort and helping such that at higher levels of follower mindfulness, ethical leadership was more strongly related to follower extra effort and helping. Silvia Jordan and Idar Johannessen (2014) examine organizational defensiveness and argue that individual defensiveness can negatively transform features of organizational culture, which subsequently subvert or prevent mindful organizing. Patricia Schultz and colleagues (Schultz, Niemiec, Legate, Williams, & Ryan, 2015) provide an interesting complement to this theorizing in finding that individual mindfulness reduces defensive responses to some situations. These results suggest that individual mindfulness may matter both to mindful organizing and organizational mindfulness, but further research could provide important insight into the creation of a generalizable multilevel model for mindfulness that combines top-down and bottom-up mechanisms. Earlier I highlighted progress in ascertaining the validity of mindful organizing and collective mindfulness. But more needs to be done to discriminate these concepts from related others. One important example in organizational and professional contexts is situation awareness, which is “the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future” (Endsley, 1995, p. 36). In other words, situation awareness describes the construction and maintenance of a cognitive map representing the overall situation and operational status of a complex system. Endsley’s model consists of three elements: perception, integration, and extrapolation. Endsley’s perception element is consistent with being sensitive to operations (Weick et al., 1999), while integration and extrapolation are by-products of the overall achievement of mindfulness in an organizational context. Theoretically, situation awareness differs from organizational mindfulness in important ways. Situation awareness primarily describes the cognitive map of an individual gaining awareness of a

complex system; organizational mindfulness is not limited to the individual, although individual engagement is important. Instead, organizational mindfulness acknowledges that an individual’s cognitive resources are necessarily limited, which leads to simplification and generalization. To gain moment-to-moment awareness of complex situations in complex environments requires social interaction (Roth, 1997), “a combination of shared mental representations, collective story building, multiple bubbles of varying size, situation assessing with continual updates, knowledge of physical interconnections and parameters of plant systems, and active diagnosis of the limitations of preplanned procedures” (Weick et al., 1999, p. 98). More research is needed to clarify how situation awareness and mindfulness are similar and different as well as the social and interactive factors that contribute to both.

Research on mindful organizing and collective mindfulness grew out of research on HROs—organizations that navigate difficult conditions in a nearly error-free manner. Thus, it isn’t surprising to find that although exceptions exist, research often has been aimed at surfacing or making clear things to be avoided or mitigated (e.g., discerning threats rather than discerning opportunities; Sutcliffe, Vogus, & Dane, 2016). Still, as I noted earlier, the study (and practice) of mindfulness and mindful organizing, naturally, has spread far beyond the high-risk contexts of traditional HROs. That is, mindful organizing is less unique than it is sometimes portrayed. Although research on mindful organizing has focused on identifying negatives—threats rather than opportunities—it is reasonable to think that mindful organizing and mindfulness will be just as important to organizations and settings that vary widely in hazard and risk and to organizations looking for opportunities. After all, mindful organizing processes improve overall performance reliability by enhancing attention to perceptual details, conceptualization of those details, and the ability to act on what is “seen.” Consistent with research by Vogus and Welbourne (2003), who found a link between mindful organizing and firm innovation, this line of thinking is supported in Michelle Barton’s (2010) study of high-tech entrepreneurs. Barton recognized that when founders developed mindful practices for monitoring unfolding events and making sense of equivocal experiences, they were better able to shape and capitalize on new opportunities. More mindful practices helped entrepreneurs rapidly build knowledge about their emerging opportunity and continually shape

and update their processes to reflect that emergent knowledge. In addition, a more mindful approach allowed entrepreneurs to make smaller, less disruptive adjustments as they shaped their opportunities over time, leading to better overall firm performance. Further work is needed to determine whether organizations with higher levels of organizational mindfulness are better able to detect and respond to market opportunities and otherwise adapt more quickly. Two additional ideas for future research come to mind. The first relates to the goal of mindful organizing and the second relates to the goal of this book. If higher reliability is produced through processes of mindful organizing, we should expect that organizations (or their subunits) that organize for mindfulness will experience fewer crises and catastrophes and have more resilient performance over the long term than their counterparts that are not so organized. More work is needed to explore this notion. Finally, the aim of this book is to better understand high reliability and its generalizability across settings. Reliability is fundamental to what organizing is—being able to depend on others and to be depended on (Weick & Sutcliffe, 2001, p. 91; Busby & Iszatt-White, 2014, p. 69). In hazardous settings the meaning of reliability is relatively clear and generally pertains to performance outcomes, but this may not be the case in more mundane settings. As Jerry Busby and Marian Iszatt-White (2014) claim, the meaning of reliability in mundane settings may be more variable and more relational. More research is needed to better understand reliability as a central construct in organizational research and how it is a property of relationships in organizational systems.

Implications for Managerial Practice

Although specific practices will naturally vary from context to context, some practices may be relevant to all organizations. Mindful organizing is a dynamic continual accomplishment. Organizations and their members that pursue these activities repeatedly and continually are likely to achieve greater reliability than those that don’t, in part because of the binding culture that is created through enactment of these practices. In some ways mindful organizing enables complex systems and organizations to develop an intrinsic resistance to strategic and operational hazards. Thus, paying attention to these ideas makes sense for leaders and organizational members who want to minimize

performance jolts and maximize performance continuity. Here I provide some ideas for enabling and enacting mindful organizing:

• Spend time assessing unit and organizational climate and culture and the existence of social-relational norms of trust and respect.

• Encourage people to mentally simulate their work in order to help them build capabilities to identify discrepancies from expectations and cope with disturbances once they appear. What activities lie upstream and downstream from them? How can their work unravel? How can disturbances be corrected?

• Continuously evaluate surprises and foiled expectations, failures, mistakes, near misses, and close calls using various protocols. For example, Winston Churchill’s debriefing protocol may be helpful: Why didn’t I know? Why didn’t my advisors know? Why wasn’t I told? Why didn’t I ask? (Weick & Sutcliffe, 2001).

• Develop richer forms of communication. The STICC protocol may be useful in situations where people are handing off work (a brief illustrative sketch follows this list): Situation (“Here’s what I think is going on.”), Task (“Here’s what I think we should do.”), Intent (“Here’s why.”), Concern (“Here’s what I think we should keep our eye on.”), Calibrate (“Now, talk to me.”) (Weick & Sutcliffe, 2007, p. 156).

• Make tacit expectations more explicit so that it is easier to spot violations of expectations. For example, keep track of what you would expect if your working hypothesis for a particular situation is correct.

• Encourage conceptual slack—that is, recognize a divergence in team members’ analytical perspectives and foster a willingness to question what is happening rather than feign understanding.

• Appoint someone to play devil’s advocate, which legitimizes questioning and alternative interpretations.

• Invest in broad, generalized training and retraining, as certain skills, such as teamwork, can decay over time and must be refreshed or relearned in order to remain effective.

• Identify pockets of expertise and encourage people to self-organize into ad hoc networks to provide expert problem solving when problems or crises appear.
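As a concrete illustration of the STICC handoff mentioned in the list above, the sketch below casts the protocol’s five prompts as a simple structured template. The class, field names, and example content are hypothetical and are meant only to show the shape of such a handoff, not a prescribed implementation.

```python
# Illustrative sketch of a STICC-style handoff as a structured template.
# The class and field names are hypothetical; only the five prompts
# (Situation, Task, Intent, Concern, Calibrate) come from the protocol.
from dataclasses import dataclass

@dataclass
class SticcHandoff:
    situation: str  # "Here's what I think is going on."
    task: str       # "Here's what I think we should do."
    intent: str     # "Here's why."
    concern: str    # "Here's what I think we should keep our eye on."
    calibrate: str  # "Now, talk to me." (invites questions and corrections)

    def briefing(self) -> str:
        # Render the handoff as a short spoken or written briefing.
        return (
            f"Situation: {self.situation}\n"
            f"Task: {self.task}\n"
            f"Intent: {self.intent}\n"
            f"Concern: {self.concern}\n"
            f"Calibrate: {self.calibrate}"
        )

# Example handoff between an outgoing and an oncoming nurse (hypothetical content).
handoff = SticcHandoff(
    situation="Patient in bed 4 has had a rising heart rate over the last hour.",
    task="Recheck vitals every 30 minutes and review the morning labs.",
    intent="We want to rule out early deterioration before it escalates.",
    concern="Watch the blood pressure; it has been trending down.",
    calibrate="Tell me what you see differently or what I may have missed.",
)
print(handoff.briefing())
```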


Conclusions

Mindful organizing involves both anticipating and rebounding in the face of surprise and complication in service of improving performance reliability. It is an important means to build system resilience. What is distinctive about mindful organizing is that there is a consistent effort to recapture detail. There is an effort to refine and differentiate existing categories, create new categories, and detect subtle ways in which contexts vary and call for contingent responding. I end this chapter on mindful organizing with two final comments. First, if we view mindful organizing through the three lenses framework (see Chapter 3), we can see that it is composed of elements of strategic design and culture but is relatively silent on power. That is, through patterns of action (routines and structures) mindful organizing creates a mindful culture, an emergent ordered system of meaning and symbols that shapes how, on an ongoing basis, organizational members interpret their experiences and act. Culture encompasses what people value, how they think things work, and how they think they should act. Second, mindful organizing, with its emphasis on intelligent wariness, doubt, and a preoccupation with failure, might be perceived as a “negative perspective,” antithetical to optimism, growth, and support. This raises an important question: How does mindful organizing align with the growing attention to positive psychology and positive organizational scholarship? In the past two decades, positive psychology, what leads to people’s happiness and well-being, has taken the world by storm. In fact, belief in the power of positive thinking and cultivating an optimistic outlook has infused much of our lives. Positivity has infused the world of organizations as well. Positive organizational scholarship explores “positive outcomes, processes, and attributes of organizations and their members” (Cameron, Dutton, & Quinn, 2003, p. 4) and dynamics that enable organizational flourishing and strength. The answer to the question asked above is embedded in another: How does strengthening—the capability to resist attack—come about? My answer is that strengthening comes about by organizing in such a way as to become aware of problems earlier so that they can be remedied before they get large and unwieldy. It also comes about by building capabilities to cope with whatever problems do break through. Small problems, mistakes, mishaps, or lapses are

a natural part of organizational life, but a strong tendency to pay careful attention to best cases and careless attention to worst cases often means that small details don’t get attention until too late (Cerullo, 2006, p. 6). I argue that this doesn’t have to be the case. Disruptions, adverse events, crises, and accidents are not inevitable; they result from small problems, surprises, or lapses that shift, grow, and escalate until they are too big to handle. Developing capabilities to anticipate, contain, and repair vulnerability should be a priority for all organizations regardless of the hazardous (and risky) nature of the work or the industry.

R efer ences Ausserhofer, D., Schubert, M., Blegen, M. A., De Geest, S., & Schwendimann, R. (2013). Validity and reliability on three European language versions of the safety organizing scale. International Journal for Quality in Health Care, 25(2), 157–166. Baron, R. M., & Misovich, S. J. (1999). On the relationship between social and cognitive modes of organization. In S. Chaiken & Y. Trope (Eds.), Dual-process theories in social psychology (pp. 586–605). New York: Guilford Press. Barton, M. A. (2010). Shaping entrepreneurial opportunities: Managing uncertainty and equivocality in the entrepreneurial process (Unpublished doctoral dissertation). University of Michigan, Ann Arbor. Barton, M. A., & Sutcliffe, K. M. (2009). Overcoming dysfunctional momentum: Organizational safety as a social achievement. Human Relations, 62(9), 1327–1356. Barton, M. A., Sutcliffe, K. M., Vogus, T. J., & DeWitt, T. (2015). Performing under uncertainty: Contextualized enagement in wildland firefighting. Journal of Contingencies and Crisis Management, 23(2), 74–83. Bigley, G. A., & Roberts, K. H. (2001). The incident command system: High-reliability organizing for complex and volatile environments. Academy of Management Journal, 44(6), 1281–1299. Blatt, R., Christianson, M. K., Sutcliffe, K. M., & Rosenthal, M. M. (2006). A sensemaking lens on reliability. Journal of Organizational Behavior, 27(7), 897–917. Busby, J., & Iszatt-White, M. (2014). The relational aspect to high reliability organization. Journal of Contingencies and Crisis Management, 22(2), 69–80. Cameron, K. S., Dutton, J. E., & Quinn, R. E. (Eds.). (2003). Positive organizational scholarship: Foundations of a new discipline. San Francisco: Berrett-Koehler. Cerullo, K. A. (2006). Never saw it coming: Cultural challenges to envisioning the worst. Chicago: University of Chicago Press. Chassin, M. R., & Loeb, J. M. (2013). High-reliability health care: Getting there from here. The Milbank Quarterly, 91(3), 459–490. Christianson, M. C., & Sutcliffe, K. M. (2009). Sensemaking, high reliability organizing, and resilience. In P. Croskerry, K. Crosby, S. Schenkel, & R. L. Wears (Eds.), Patient safety in emergency medicine (pp. 27–33). Philadelphia: Lippincott. Cleirigh, D. O., & Greaney, J. (2014). Mindfulness and group performance: An exploratory investigation into the effects of brief mindfulness intervention on group task performance. Mindfulness, 6(3)1–9. Cooren, F. (2004). The communicative achievement of collective minding analysis of board meeting excerpts. Management Communication Quarterly, 17(4), 517–551.


Dean, J. W., & Bowen, D. E. (Eds.). (1994). Total quality. Academy of Management Review, 19(3). Eisenbeiss, S. A., & van Knippenberg, D. (2015). On ethical leadership impact: The role of follower mindfulness and moral emotions. Journal of Organizational Behavior, 36, 182–195. Endsley, M. R. (1995). Toward a theory of situation awareness in dynamic systems. Human Factors, 37(1), 32–64. Fiol, M., & O’Connor, E. J. (2003). Waking up! Mindfulness in the face of bandwagons. Academy of Management Review, 28(1), 54–70. Freyd, J. J. (1983). Shareability: The social psychology of epistemology. Cognitive Science, 7, 191–210. Gartner, C. (2013). Enhancing readiness for change by enhancing mindfulness. Journal of Change Management, 13(1), 52–68. Goodman, P. S., Ramanujam, R., Carroll, J. S., Edmondson, A. C., Hofmann, D. A., & Sutcliffe, K. M. (2011). Organizational errors: Directions for future research. Research in Organizational Behavior, 31, 151–176. Hirschhorn, L. (1993). Hierarchy versus bureaucracy: The case of a nuclear reactor. In K. H. Roberts (Ed.), New challenges to understanding organizations (pp. 137–150). New York: Macmillan. Ilinitch, A. Y., D’Aveni, R. A., & Lewin, A. Y. (1996). New organizational forms and strategies for managing hypercompetitive environments. Organization Science, 2, 211–220. James, W. (1909). What pragmatism means. Lectures delivered at the Lowell Institute, Boston, Massachusetts, December 1906, and at Columbia University, New York City, January 1907. In Pragmatism: A new name for some old ways of thinking (p. 61). New York: Longmans. Jordan, S., & Johannessen, I. A. (2014). Mindfulness and organizational defenses: Exploring organizational and institutional challenges to mindfulness. In A. Ie, C. T. Ngnoumen, & E. J. Langer (Eds.), The Wiley Blackwell handbook of mindfulness (Vol. 1, pp. 424–442). Chichester, UK: Wiley Blackwell. Katz-Navon, T., Naveh, E., & Stern, Z. (2005). Safety climate in healthcare organizations: A multidimensional approach. Academy of Management Journal, 48, 1073–1087. Knox, G. E., Simpson, K. R., & Garite, T. J. (1999). High reliability perinatal units: An approach to the prevention of patient injury and medical malpractice claims. Journal of Healthcare Risk Management, 19(2), 24–32. Langer, E. J. (1989). Minding matters: The consequences of mindlessness-mindfulness. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 22, pp. 137–173). San Diego, CA: Academic. Langer, E. J. (2005). On becoming an artist: Reinventing yourself through mindful creativity. New York: Ballantine. Levinthal, D. A., & Rerup, C. (2006). Crossing an apparent chasm: Bridging mindful and less mindful perspectives on organizational learning. Organization Science, 17(4), 502–513. Liker, J. K. (2004). The Toyota way: 14 management principles from the world’s greatest manufacturer. New York: McGraw-Hill. Madsen, P. M., Desai, V. M., Roberts, K. H., & Wong, D. (2006). Mitigating hazards through continuing design: The birth and evolution of a pediatric intensive care unit. Organization Science, 17(2), 239–248. Maslen, S. (2014). Learning to prevent disaster: An investigation into methods for building safety knowledge among new engineers to the Australian gas pipeline industry. Safety Science, 64(20), 82–89. Paget, M. A. (1988). The unity of mistakes: A phenomenological interpretation of medical work. Philadelphia: Temple University Press. Pettersen, K. A., & Schulman, P. R. (2016, March 26). 
Drift, adaptation, resilience and reliability: Toward an empirical clarification. Safety Science [online version]. Retrieved from http://www .sciencedirect.com/science/article/pii/S0925753516300108 Ray, J. L., Baker, L. T., & Plowman, D. A. (2011). Organizing mindfulness in business schools. Academy of Management Learning and Education, 10(2), 188–203.


Reason, J. (1997). Managing the risk of organizational accidents. Aldeshot, UK: Ashgate. Reb, J., Narayanan, J., & Ho, Z. W. (2015). Mindfulness at work: Antecedents and consequences of employee awareness and absent-mindedness. Mindfulness, 6, 1250–1262. Rerup, C. (2005). Learning from past experience: Footnotes on mindfulness and habitual entrepreneurship. Scandinavian Journal of Management, 21, 451–472. Rerup, C. (2009). Attentional triangulation: Learning from unexpected rare crises. Organization Science, 20(5), 876–893. Rerup, C., & Levinthal, D. A. (2014). Situating the concept of organizational mindfulness: The multiple dimensions of organizational learning. In G. Becke (Ed.), Mindful change in times of permanent reorganization (pp. 33–48). Berlin: Springer-Verlag. Ritchie-Dunham, J. L. (2014). Mindful leadership. In A. Ie, C. T. Ngnoumen, & E. J. Langer (Eds.), The Wiley Blackwell handbook of mindfulness (Vol. 1, pp. 443–457). Chichester, UK: Wiley Blackwell. Roberts, K. H. (1990). Some characteristics of one type of high reliability organization. Organization Science, 1(2), 160–176. Roberts, K. H. (Ed.). (1993). New challenges to understanding organizations. New York: Macmillan. Roberts, K. H., Madsen, P. M., Desai, V. M., & van Stralen, D. (2005). A case of the birth and death of a high reliability healthcare organization. Quality and Safety in Health Care, 14, 216–220. Roberts, K. H., & Rousseau, D. M. (1989). Research in nearly failure-free, high-reliability organizations: Having the bubble. IEEE Transactions on Engineering Management, 36(2), 132–139. Roberts, K. H., Rousseau, D. M., & La Porte, T. R. (1994). The culture of high reliability: Quantitative and qualitative assessment aboard nuclear-powered aircraft carriers. The Journal of High Technology Management Research, 5(1), 141–161. Roberts, K. H., Stout, S. K., & Halpern, J. J. (1994). Decision dynamics in two high reliability organizations. Management Science, 40, 614–624. Roe, E., & Schulman, P. (2008). High reliability management. Stanford, CA: Stanford University Press. Roth, E. M. (1997). Analysis of decision making in nuclear power polant emergencies: An investigation of aided decision making. In C. Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 175–182). Mahwah, NJ: Erlbaum. Sandelands, L. E., & Stablein, R. E. (1987). The concept of organization mind. In S. Bacharach & N. DiTomaso (Eds.), Research in the sociology of organizations (Vol. 5, pp. 135–161). Greenwich, CT: JAI. Schulman, P. R. (2004). General attributes of safe organisations. Quality & Safety in Health Care, 13(Suppl. 2), ii39–ii44. Schultz, P. P., Niemiec, C. P., Legate, N., Williams, G. C., & Ryan, R. M. (2015). Zen and the art of wellness at work: Mindfulness, work climate, and psychological need satisfaction in employee well-being. Mindfulness, 6, 971–985. Scott, W. R. (1994). Open peer commentaries on “Accidents in high-risk systems.” Technology Studies, 1, 23–25. Sitkin, S. B., Sutcliffe, K. M., & Schroeder, R. G. (1994). Distinguishing control from learning in total quality management: A contingency perspective. Academy of Management Review, 18(3), 537–564. Sutcliffe, K. M. (2010). High reliability organizations (HROs). Best Practice & Research Clinical Anaesthesiology, 25(2), 133–144. Sutcliffe, K. M., Paine, L, & Pronovost, P. J. (2016). Re-examining high reliability: Actively organising for safety. BMJ Quality and Safety. Retrieved from http://qualitysafety.bmj.com/ content/26/3/248.full Sutcliffe, K. M., & Vogus, T. J. (2003). 
Organizing for resilience. In K. S. Cameron, J. E. Dutton, & R. E. Quinn (Eds.), Positive organizational scholarship: Foundations of a new discipline (pp. 94–110). San Francisco: Berrett-Koehler.


Sutcliffe, K. M., & Vogus, T. J. (2014). Organizing for mindfulness. In A. Ie, C. T. Ngnoumen, & E. J. Langer (Eds.), The Wiley Blackwell handbook of mindfulness (Vol. 1, pp. 407–423). Chichester, UK: Wiley Blackwell. Sutcliffe, K. M., Vogus, T. J., & Dane, E. (2016). Mindfulness in organizations: A cross-level review. Annual Review of Organizational Psychology and Organizational Behavior, 3, 55–81. Sutcliffe, K. M., & Weick, K. E. (2013). Mindful organizing and resilient healthcare. In E. Hollnagel, J. Braithwaite, & R. L. Wears (Eds.), Resilient healthcare (pp. 145–158). London, UK: Ashgate. Taylor, J. R., & Van Every, E. J. (2000). The emergent organization: Communication as its site and surface. Mahwah, NJ: Erlbaum. Vogus, T. J., Cooil, B., Sitterding, M. C., & Everett, L. Q. (2014). Safety organizing, emotional exhaustion, and turnover in hospital nursing units. Medical Care, 52(10), 870–876. Vogus, T. J., & Hilligoss, B. (2016). The underappreciated role of habit in highly reliable healthcare. BMJ Quality and Safety, 25(3), 141–146. Vogus, T. J., & Iacobucci, D. (2016). Creating highly reliable health care: How reliability enhancing work practices affect patient safety in hospitals. ILR Review, 69(4), 911–938. Vogus, T. J., & Sutcliffe, K. M. (2007a). The impact of safety organizing, trusted leadership, and care pathways on reported medication errors in hospital nursing units. Medical Care, 41(10), 992–1002. Vogus, T. J., & Sutcliffe, K. M. (2007b). The safety organizing scale: Development and validation of a behavioral measure of safety culture in hospital nursing units. Medical Care, 45(1), 46–54. Vogus, T. J, & Sutcliffe, K. M. (2012). Organizational mindfulness and mindful organizing: A reconciliation and path forward. Academy of Management Learning and Education, 11(4), 722–735. Vogus, T. J., Tangirala, S., Lehman, D. W., & Ramanujam, R. (2015). The antecedents and consequences of mindful organizing in workgroups (Working paper). Vanderbilt University. Vogus, T. J., & Welbourne, T. M. (2003). Structuring for high reliability: HR practices and mindful processes in reliability-seeking organizations. Journal of Organizational Behavior, 24, 877–903. Weick, K. E. (1987). Organizational culture as a source of high reliability. California Management Review, 29, 112–127. Weick, K. E. (2009). Making sense of the organization: The impermanent organization (Vol. 2). Chichester, UK: Wiley. Weick, K. E. (2011). Organizing for transient reliability: The production of dynamic non-events. Journal of Contingencies and Crisis Management, 19(1), 21–27. Weick, K. E., & Putnam, T. (2006). Organizing for mindfulness: Eastern wisdom and Western knowledge. Journal of Management Inquiry, 15(3), 275–287. Weick, K. E., & Roberts, K. H. (1993). Collective mind in organizations: Heedful interrelating on flight decks. Administrative Science Quarterly, 38, 357–381. Weick, K. E., & Sutcliffe, K. M. (2001). Managing the unexpected: Assuring high performance in an age of complexity. San Francisco: Jossey-Bass. Weick, K. E., & Sutcliffe, K. M. (2003). Hospitals as cultures of entrapment. California Management Review, 45(2), 73–84. Weick, K. E., & Sutcliffe, K. M. (2006). Mindfulness and the quality of organizational attention. Organization Science, 17(4), 514–524. Weick, K. E., & Sutcliffe, K. M. (2007). Managing the unexpected: Resilient performance in an age of uncertainty (2nd ed.). San Francisco: Jossey-Bass. Weick, K. E., & Sutcliffe, K. M. (2015). 
Managing the unexpected: Sustained performance in a complex world (3rd ed.). San Francisco: Wiley. Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for high reliability: Processes of collective mindfulness. In B. M. Staw & L. L. Cummings (Eds.), Research in organizational behavior (Vol. 21, pp. 81–123). Greenwich, CT: JAI.


Westrum, R. (1992). Cultures with requisite imagination. In J. A. Wise, D. Hopkin, & P. Stager (Eds.), Verification and validation of complex systems: Human factors issues (pp. 401–416). Berlin: Springer-Verlag. Westrum, R. (1997). Social factors in safety-critical systems. In F. Redmill & J. Rajan (Eds.), Human factors in safety critical systems (pp. 233–256). London: Butterworth-Heinemann. Wildavsky, A. (1991). Searching for safety (4th ed.). New Brunswick, NJ: Transaction.

chapter 5

Reliability through Resilience in Organizational Teams
Seth A. Kaplan & Mary J. Waller

Introduction

Given the complexity most contemporary organizations face, teams of individuals, rather than single actors, are typically charged with creating and enacting organizational responses to critical, unexpected events (Baker, Day, & Salas, 2006; Mitroff, 1988); not surprisingly, both the HRO literature (e.g., Benn, Healey, & Hollnagel, 2007; Burke, Wilson, & Salas, 2005; Riley, Davis, Miller, & McCullough, 2010) and crisis management literature (e.g., Pearson & Mitroff, 1993; Pearson & Clair, 1998; James & Wooten, 2010) explore how teams address crisis events. Here, we draw from and attempt to integrate ideas from both of these literatures. Considering the implications of team responses to such events, it is also not surprising that boards of directors increasingly require management to implement some form of crisis preparation and training (PriceWaterhouseCoopers, 2014). This training largely focuses on improving the prediction and prevention of possible organizational crises. At the same time, management team members often retort, “How can we train for ‘Black Swan’ events that are, by definition, completely unpredictable?” The focus on the prediction and prevention of specific events, rather than on building enduring team capabilities vital across responses to crises, tends to create an organizational impasse and, in many cases, inaction.


By rerouting the conversation in the direction of team resilience, however, teams designated to respond to critical events on behalf of their organizations can focus on developing capabilities that increase the likelihood of appropriate team responses to such situations. Over time, these more appropriate responses lead to more reliable organizations. Many conceptualizations of resilience emphasize adaptability rather than constant invulnerability; these more dynamic conceptualizations imply the presence of resources that can be tapped, configured, and arranged differently in order to cope with unexpected critical situations. Kathleen Sutcliffe and Timothy Vogus (2003) summarize many of these definitions, noting that resilience can be thought of as “the maintenance of positive adjustment under challenging conditions” (p. 95), which forms the capacity to adapt and rebound during such conditions. Similarly, Aaron Wildavsky (1988) argues that to be resilient is to be vitally prepared for adversity, which requires “improvement in overall capability, i.e., a generalized capacity to investigate, to learn, and to act, without knowing in advance what one will be called to act upon” (p. 70). Although some recent work warns that rampant adaptability over a long period might lead to cumulative “drift” away from routines designed to enhance organization-level reliability (Pettersen & Schulman, 2016), our focus here is on more temporally proximal adaptive responses to unexpected situations. The information we emphasize is based on years of research and empirical evidence linking specific team properties and behaviors to higher team performance during critical, unexpected situations (e.g., Salas, Wilson, & Edens, 2009; Stachowski, Kaplan, & Waller, 2009; Waller, 1999; Weick & Sutcliffe, 2007). In relation to John Carroll’s (2006) framework, this information can be best interpreted through the lens of strategic design. Based on this previous work, we suggest that resilience in teams can be achieved through these properties and subsequent behaviors and can be developed and enhanced through recurrent simulation-based training (SBT). We first describe the nature of the resilience-prediction debate, drawing from work in crisis management. Next, we detail key properties within teams necessary for teams to remain resilient during critical events. We then describe the nature of interactions that span across teams’ boundaries to their surrounding environments during critical situations—interactions that serve to maintain or enhance team resilience. Finally, we describe components of resilience-focused SBT designed for teams likely

to respond to critical events. We close this chapter by examining some of the implications of this discussion for future research on teams operating in HROs.

Resilience versus Prediction

Although the nature of critical events varies in terms of immediacy, magnitude and probability of consequences, physical or psychosocial proximity, and the form of the threat (Hannah, Uhl-Bien, Avolio, & Cavarretta, 2009), two conclusions about critical events are beyond question: their outcomes are extremely consequential to organizations, and organizations rely on teams, either alone or in multiteam systems (Zaccaro, Marks, & DeChurch, 2011), to address them. Much of the traditional literature on crisis management focuses on preparing for the inevitable crises that are likely to befall organizations; however, and as emphasized by Ian Mitroff (2003), the emphasis on predicting specific events relies heavily on probabilistic thinking based on risk analyses using statistical likelihood calculations drawn from data about events that have occurred in the past. In addition to probabilistic thinking, Mitroff argues that organizations should engage in “possibilistic” thinking to imagine and prepare for events that have never occurred or that combine elements of past events in new ways. Possibilistic thinking helps move organizations more toward a crisis-­prepared stance as compared with a reliance on specific event prediction (Mitroff, 2003). Other crisis management literature emphasizes the roles of crisis management plans as well as practice and simulation for crisis management teams (Crandall, Parnell, & Spillan, 2014); still other work in this area emphasizes leader capabilities during crises (James & Wooten, 2010). However, a disconnect seems to exist between much of the relevant peer-reviewed empirical research identifying the team-level behavioral capabilities that likely enhance team resilience during critical events and the more applied advice regarding teams in the crisis management literature. We draw from this empirical research in what follows, focusing first on factors within teams and second on factors connecting teams and their environments.

Internal Team Factors

Over the past several decades, research concerning teams such as aviation flight crews, nuclear plant control room crews, military and police teams,

and emergency medical teams has added greatly to our knowledge regarding which team properties and behaviors are associated with higher team performance in time-pressured, consequential situations. This research has included studies of existing intact teams as well as ad hoc or “swift starting” teams (see McKinney, Barker, Smith, & Davis, 2004) composed of members who have not worked together before. While the majority of this research focuses on identifying team behaviors that lead to better outcomes, we suggest here that team staffing is a more reasonable starting point for an exploration of resilience in teams facing nonroutine events.

Team Staffing

Managing who is part of a team is important for team effectiveness in general. Empirical evidence conclusively demonstrates that the characteristics of team members—such as their cognitive ability, personality traits, and functional backgrounds—and the mix of people with respect to those characteristics can have an impact on team functioning and success (Bell, 2007). Figure 5.1(a) depicts a traditional team model, wherein certain individuals comprise an intact, stable team. In subsequent sections we contrast this model with those that better capture the reality of team composition for teams operating in reliable contexts and responding to demanding scenarios [portrayed in panels (b) and (c) in the figure]. It is worth emphasizing that managing the (dynamic) composition of teams may be both especially critical and especially challenging for teams who must remain resilient under demanding conditions. The events these teams may face, and which may pose the greatest risk, are not completely knowable or predictable (Somers, 2009). Aspects of specific crisis-like events such as their timing, length, magnitude, and complexity cannot be forecast with perfect accuracy, and, in fact, some events (such as those that transpired on September 11, 2001) may be virtually impossible to even fathom before their actual occurrence (Pearson & Clair, 1998; Starbuck & Farjoun, 2005). One way organizations can set the stage for resilience in teams is by staffing teams with the (mix of) individuals most adept at dealing with rare, critical events. However, the unpredictability of these events complicates staffing efforts, as the specific performance requirements of given situations cannot be fully known a priori. As such, trying to form teams with the requisite characteristics to handle those requirements is not always feasible.

[figure 5.1 Three Models of Team Composition for Teams in Reliable Organizations. Panels: (a) Intact, Stable Team; (b) Dynamic Team Composition; (c) Information Exchange Involving Internal and External Actors (with Dynamic Team Composition). Legend: Communication Exchange; Team Leader.]

While organizations can create teams with some sense of the types of specific knowledge, skills, and ability team members must possess (e.g., firefighters should be physically able to handle relevant situations), linking specific characteristics to specific events may be less realistic. This recognition—that traditional “best practices” of employee selection and team formation (e.g., Morgan & Lassiter, 1992) may not suit the needs of organizations seeking to enhance reliability through composing resilient teams—implies that organizations may consider additional approaches to staffing these teams. Instead of striving to form teams by only considering potential members’ technical proficiencies and functional backgrounds, organizations might also try to build resilient capacity at the team level by forming teams able to adapt to a range of potential critical scenarios. Specifically, we suggest that owing to a learning focus and a desire to grow from adversity, these teams composed for resilience should help in achieving and developing reliable organizations in at least three ways. First, resilient teams, unlike purely reactive ones, proactively seek potential problems and tend to build institutional infrastructure to but-

tress the organization if such events do occur (Wildavsky, 1988), engaging in second-order problem solving by seeking and eliminating root causes of problems (Tucker & Edmondson, 2003). Second, resilient teams are superior in adapting to and weathering the dynamic, stressful, and sometimes prolonged actual crises that organizations can face (Sutcliffe & Vogus, 2003). Finally, postevent, resilient teams are especially adept at learning and benefiting from the experience, instead of simply regarding it as a loss or setback (Somers, 2009). Arnold Sameroff and Katherine Rosenblum (2006) suggest that although resilience is often considered to reside in the individual, factors existing in the surrounding social context may be more robust predictors of individual resilience; similarly, selecting team members who are individually resilient may lead to the “Dream Team” fallacy of composing teams of excellent individual performers who perform horribly together as a team (Colvin, 2006). How, then, can organizations identify the individual-level characteristics and mix of individuals that will provide such resilience at the team level? While there is a voluminous literature on team composition, studies examining the mix of characteristics best for challenging and demanding scenarios are scarce. Some exceptions do exist, though. For example, Sutcliffe and Vogus (2003) offer several promising suggestions such as including members with a learning goal orientation (Dweck & Legget, 1998) who collectively bring a broad and diverse set of knowledge and experiences and who are likely to develop a sense of collective efficacy. In our own research, we have identified some other candidate attributes associated with team resilience and effectiveness. In one study of nuclear power plant operating control room crews responding to simulated crises, we found that groups whose members were higher on polychronicity (i.e., those who prefer to multitask) performed worse than groups composed of members who preferred to perform tasks sequentially (Kaplan & Waller, 2007). Multitaskers were more likely to ignore or forget important information and tasks as they attempted to respond to everything simultaneously in a crisis. In this same context, we also found that teams homogeneous with respect to members’ propensities to experience positive emotions outperformed teams with members who diverged in this tendency; this effect was partly due to experiencing less frustration during the crisis event (Kaplan, LaPort, & Waller, 2013). These and similar studies are informative, but much more needs to be done in identifying both individual factors and

specific compositional constellations of these factors that lead to team resilience during critical situations.

Emergent States

Beyond initial team composition, key internal team properties that emerge as team members interact—or “emergent states” (Marks, Mathieu, & Zaccaro, 2001)—must exist within a team (or an ad hoc team must be able to develop these states very quickly) in order to facilitate resilience during an unfolding crisis. Emergent states have been defined as enduring properties of teams that arise from lower-level interactions among team members (Cronin, Weingart, & Todorova, 2011), or “constructs that characterize properties of the team that are typically dynamic in nature and vary as a function of team context, inputs, processes, and outcomes” (Marks et al., 2001, p. 357). We suggest that the emergent states especially likely to enhance team resilience during critical situations include situational awareness, shared mental models, transactive memory systems, and collective efficacy. Teams facing unexpected, complex, and critical situations collectively create shared cognitions of the situations, systems, and tasks that help team members share information and coordinate action.
Team situation awareness is a team’s awareness and understanding of a complex and dynamic situation at any given point in time (Salas, Prince, Baker, & Shrestha, 1995). The more fully developed a team’s shared awareness of changes or developing critical situations, the earlier the team may sense an emerging crisis. This early sensing may give the team the added time necessary to react appropriately to the event, thus reducing the toll on emotional and cognitive stores and enhancing team resilience. Similarly, shared mental models are collective cognitive structures that facilitate team-coordinated activity in at least two ways. First, teams create system-oriented mental models that enable them to describe, plan, and predict a system with which they interact (Hodgkinson & Healey, 2008; Rouse & Morris, 1986). A highly shared, highly developed understanding of a system may enable a team to more accurately understand the implications of a crisis event as well as anticipate how the system is likely to react to the unfolding situation. This enhanced understanding and anticipation may enhance team resilience by implicitly (i.e., not requiring discussion, debate, or time) ruling out numerous possible actions to take, again saving time and reducing cognitive and emotional strain. Similarly, by enhancing the accuracy of efforts

to anticipate or hypothesize how the system will respond in the near future, system-oriented mental models that are shared across team members act to reduce felt uncertainty and stress associated with the critical event, and thus enhance resilience in the team. Second, teams create task mental models, which are functional structures that help teams create representations of their tasks, including processes and strategies (Uitdewilligen, Waller, & Zijlstra, 2010). These mental models, when shared across team members, facilitate task coordination, freeing team members from expending cognitive and emotional energy on concern or uncertainty about whether certain subtasks are being performed or will be attended to, enhancing resilience. Similar to task mental models, transactive memory systems are collective memories that are developed in teams for encoding, storing, and retrieving information to ensure that important details are accessible by the team during a crisis (see Wegner, 1986). In addition to facilitating team members’ knowledge of which team member possesses knowledge in which area, a transactive memory system also facilitates the distribution of incoming information to the appropriate team member. The specialization of knowledge represented in a transactive memory system can reduce the cognitive load of each individual team member (Waller, Gupta, & Giambatista, 2004), freeing team members’ attention for more anticipatory and support tasks, and thus enhancing team resilience. Finally, the emergent state of collective efficacy is likely to exert direct effects on team resilience during crisis situations. Collective efficacy is the shared belief that the team has the ability to perform the tasks before it; team members with such confidence in their team will believe in their collective competence and be more likely than other teams to function effectively during adverse conditions (Gully, Incalcaterra, Joshi, & Beaubien, 2002). Put succinctly, perceived collective efficacy “fosters groups’ motivational commitment to their missions, resilience to adversity, and performance accomplishments” (Bandura, 2000, p. 75).

Team Boundary Dynamics

While team staffing and emergent states influence team resilience within the boundaries of the team, other factors exist in team boundary dynamics that

involve the interaction of the team with elements of its context external to the team. These dynamics include adapting team composition to the situation by reaching out to add necessary (and eliminating unnecessary) team members depending on the changing situation (Klein, Ziegert, Knight, & Xiao, 2006) and rapidly onboarding new team members. Team boundary dynamics also involves the careful tuning of communication with specific stakeholders (Kouchaki, Okhuysen, Waller, & Tajeddin, 2012). We discuss these related factors—adapting team composition as events unfold and managing internal and external communication with stakeholders—in turn next.

Adapting Team Composition to Events

While organizations may endeavor to compose stable, resilient teams to respond to nonroutine events, the reality of these situations often results in teams who must become quite malleable in terms of team composition. First, rather than being a stable, intact set of individuals, constituent teams responding to nonroutine events often represent groups of previously unacquainted individuals (Klein et al., 2006). For example, in many hospitals, teams responding to individual trauma events are composed of the particular health-care providers (e.g., physicians, nurses, technicians) who first arrive at the trauma bay. Members may never have met each other before that trauma situation, or may be only casually acquainted, and may be ignorant of each other’s tendencies, work styles, roles, and even names (Kolbe et al., 2014). In such situations, team members must quickly try to learn about and adapt to each other’s idiosyncrasies (Vashdi, Bamberger, & Erez, 2013). Similarly, these teams often lack the shared mental models and transactive memory systems that derive from past experience and instead must try to construct shared meaning while concurrently engaged in the demanding scenario (Stout, Cannon-Bowers, Salas, & Milanovich, 1999). Further complicating matters is that the complexity, fluidity, and prolonged duration of these events often necessitate changes in team membership. Over the course of a given event, team members may come and go, depending on the capacities and types of expertise that the dynamic situation requires at different points (Bedwell, Ramsay, & Salas, 2012; Mathieu, Tannenbaum, Donsbach, & Alliger, 2014). As such, the “team” per se, instead of being a stable set of individuals, may very well represent more of a rotating cast of members, some of whom may never serve at the same time as others (Ziller, 1965).

With each change and addition in team membership comes potential challenges. For instance, the group may have to devote time to onboarding new members who may lack knowledge and situational awareness of the scenario (Edmondson, 2012). Also, the changing membership can introduce communication and coordination challenges as individuals may continually need to renegotiate roles and communication norms (Summers, Humphrey, & Ferris, 2012). Additionally, each change also modifies the composition mix and expertise of the group, thereby resulting in a potentially gestalt-level change in team dynamics and identity as a whole (Mathieu et al., 2014). Figure 5.1(b) depicts the nature of this dynamic team composition. The dashed lines of the ovals here represent the permeability of the team boundaries and the opportunities for changes in team composition. In this panel, two individuals are the first to arrive (whether in-person or through some other medium) at the onset of a potential nonroutine event. These individuals represent the team at this point. In less reliable organizations, these “first responders” may never have worked together previously or, even if they have, may not have worked together in a scenario similar to the one they currently face. At “time 2” two additional members join the team. Again, these members may not only be unfamiliar with the other members, but they may also be largely ignorant with respect to the nature and dynamics of the unfolding situation. Upon their arrival, one of the original team members must temporarily leave the immediate context to complete another pressing task (e.g., communicate with a higher-level organizational official about the scenario). At “time 3” this member rejoins the team, resulting in a team of four individuals. Changes like these will continue to occur as events unfold. This portrayal is a more accurate depiction of team composition in reliable organizations than is the model of stable team composition shown in panel (a). We propose that resilient teams will be able to better weather the challenges that changing team composition presents and potentially even thrive in the face of them. Specifically, institutions striving for reliability can implement several strategies to foster team resilience in such situations. First, as a proactive step, organizations should seek to create opportunities for interaction and collaboration among employees from different units to counter the challenges that accompany working with otherwise unacquainted individuals (Bedwell et al., 2012). Doing so can serve to reveal latent organizational vulnerabilities and can ignite and generate novel opportunities and ideas (Perry-Smith &

Shalley, 2003; Starbuck & Farjoun, 2005). This strategy can also help build familiarity among individuals who one day may need to work together on an unforeseen challenge or crisis. In turn, those subsequent challenges will appear not so much as crises but rather as extreme instantiations of past interactions for which members have had the opportunity to prepare. Supportive of these ideas, one study showed that member familiarity was positively associated with requesting and accepting more help among teams of air traffic controllers (Smith-Jentsch, Kraiger, Cannon-Bowers, & Salas, 2009).
Second, in addition to just facilitating interaction among unacquainted employees, reliable organizations may institute formal training strategies among members who may one day serve together on a team. For instance, and as elaborated on in the section on SBT, organizations can implement standardized simulations to develop transferable teamwork skills and to promote procedural norms and shared mental models regarding team functioning (Salas, Rosen, Held, & Weissmuller, 2009). These simulations could include features such as having previously unacquainted individuals participate together, having members arrive at the critical event at staggered times, providing different information to different members to highlight the need for information sharing (e.g., Lu, Yuan, & McLeod, 2012), suddenly decreasing the time available for teams to act (Waller, Zellmer-Bruhn, & Giambatista, 2002), and having members leave and then return to the event. Introducing features such as these can help reliable organizations develop in their employees transferable teamwork capabilities whose utility should translate across events, regardless of the composition of the specific organizational members who make up ad hoc response teams (Salas, Rosen, et al., 2009). In addition to simulations, or as part of them, organizations seeking to build resilient teams should also consider using job rotations and cross-training to develop members’ knowledge of others’ roles and in turn improve understanding and coordination during challenging events.
A third strategy that reliable organizations can use to manage changing team composition is to emphasize and train for agility during member switches. We suggest that resilient teams are those who can manage dual objectives during these transitions. On the one hand, reliable organizations seek to establish strong and consistent norms that will guide members, including unacquainted ones, as they transition into existing teams. On the other hand, reliable organizations must at the same time create an environment of adaptability, whereby existing members have the awareness and capacity to make adjustments to incorporate the new member (Burke, Stagl, Salas, Pierce, & Kendall, 2006). An example may help to illustrate the dynamic we are describing. Consider a band performing a concert where additional performers are waiting backstage and may be called to join the band onstage at any point. Both the members onstage and those waiting in the wings know the basic norms and procedures for new members joining mid-song while at the same time recognizing that some improvisation on both the band’s and the incoming members’ parts will be necessary as the onboarding occurs so as not to “skip a beat.” Strategies such as training can aid in meeting these dual objectives.
A final strategy we would suggest regarding these compositional changes concerns what happens after the resolution of particular events. Here, the change is the (potential) dissolution of the team. In many instances action teams work together to respond to a particular event and then immediately disband after addressing the scenario (Arrow, McGrath, & Berdahl, 2000). However, these teams, and their experiences during demanding situations, have the potential to provide insights that can enhance the reliability of the organization going forward (Wildavsky, 1988). Rather than just viewing a given event as a threatening shock to the organization, reliable institutions regard such events as opportunities to reveal previously unforeseen peril and capitalize on dormant capabilities (Starbuck & Farjoun, 2005). To ensure that this learning occurs after challenging events, we recommend that organizations implement thorough debriefings with the teams who experienced them before those teams disband. Considerable evidence suggests that teams who go through debriefings tend to be more effective in the future than those who do not (Tannenbaum & Cerasoli, 2013). Also, having members from different units provide lessons learned when they return to their constituent units can enhance learning at the organizational level.

Managing Information across the Team Boundary

A second factor that teams operating in demanding circumstances must address is the challenge of managing information in the face of often complex boundary dynamics. While timely and appropriate information exchange is critical to team functioning in general (e.g., Waller, 1999), it can become especially critical for the types of scenarios that teams in (potentially) reliable organizations encounter. Given the demanding and sometimes uncertain nature of these scenarios (Pearson & Clair, 1998), sharing, obtaining, and integrating information about the events and the possible ways to address them can dictate whether the team responds effectively (Waller et al., 2004). Unlike in routine or easily addressable cases, teams operating in reliable contexts face unique situations that make effective understanding of the situation paramount. However, several aspects of these demanding scenarios can also make this necessary information exchange and integration especially daunting. First, in the types of complex scenarios that these teams face, the sheer amount of information is overwhelming, as different messages and bits of data depicted across different media swirl about. Moreover, these bits of information are often ambiguous and contradictory. As such, teams must try to separate fact from fiction in piecing together a coherent tapestry of the situation and potential solutions (Woods, Patterson, & Roth, 2002). Plus, even when all the information is consistent, it could still be wrong. For example, the Northeastern Air Defense team responsible for the airspace around New York City on September 11, 2001, received several false reports of aircraft locations and hijackings, along with accurate data, as it struggled to make sense of the events on that day (Waller & Uitdewilligen, 2009).
Compounding these difficulties associated with the amount and type of information are the team process challenges inherent in trying to share and act on this information. As members enter and leave the team (see earlier discussion), the team must ensure that there is a continuity of knowledge and shared situational awareness. Furthermore, information sharing and corresponding decision making do not occur only within the team. Rather, in these types of situations, a network of individuals and teams usually responds (Zaccaro et al., 2011). Figure 5.1(c) builds on scenario (b) to depict this more common state of affairs. Here, not only must the team maintain effective internal communication while individual team members come and go, but also, concurrently, the team must exchange information with external constituents. For example, it is common across contexts to have the team leader act as the information hub in a team responding to a crisis. The leader coordinates members’ specific tasks while communicating with external partners to share and gather relevant information about the unfolding scenario. In an aviation context, the leader is typically the captain, the other team member would be the first officer (the

second officer would be an additional member, if present), and the external parties would include air traffic controllers, ground operations, and perhaps flight attendants. In nuclear power plant control room crews, the leader is the supervisor coordinating the actions of the operators who are monitoring and manipulating specific controls and parameters, and external parties may include a shift manager who alternates between speaking with the supervisor and communicating with other external parties (e.g., other plant personnel). In trauma teams a physician (often a resident) is generally at the bedside working on the patient while coordinating team members’ (e.g., nurses’ and technicians’) tasks. A few feet away, a more senior physician may stand aside or perhaps behind a line marked on the floor, overseeing the entire scenario and, at times, communicating with the bedside physician. In each of these examples, the team leader is central, serving as an information nexus [see Figure 5.1(c)]. This individual has two primary communication tasks. The leader must alternate between (1) communicating with his or her teammates (e.g., providing information about the scenario and issuing task instructions to them and gathering their feedback about the scenarios and their task execution), and (2) communicating with the external agents—for example, providing information about the local scenario and gathering knowledge and instructions from those agents. Regardless of whether the external agents are colocated with the team or are geographically separate from it, in each of these scenarios, the leader must alternate between communicating with the team members who are monitoring and interacting with local conditions and with these other parties who try to provide a more global perspective on the scenario. In essence, then, the team leader (and potentially other team members) must concurrently achieve effective communication both within each sphere and between them. The challenge in executing these dual tasks reflects more than simply role or information overload. The difficulty lies not just in having to manage the potentially massive amounts of information (which, again, may be ambiguous and/or contradictory in nature; Woods et al., 2002) but also in having to decide with whom to communicate, for how long, and how frequently to do so. To fully appreciate this challenge, consider the trauma team physician. How often should this individual check in with the senior physician in the room? At what points in this scenario should she do so? When initiating these interactions, should she also involve the local members of the team (e.g., nurses

and technicians) who are working on the patient? If she does not, how often and when should she brief those individuals? Should she brief them all or have mini briefings with smaller groups? The team leader must make these types of decisions all while executing her own tasks (e.g., with the patient).

Interaction Patterns within and across Team Boundaries

We suggest that resilient teams, and the leaders who guide them, engage in distinct types of interaction that allow them to achieve superior outcomes. Specifically, mounting evidence suggests that what distinguishes more effective teams from those less adept is not just the presence or amount of certain types of communication but also (and sometimes more so) the patterns of interaction in which those communications are embedded (e.g., Kolbe et al., 2014; Stachowski et al., 2009). In our own work, we and our collaborators have conducted a series of studies in naturalistic settings or using high-fidelity simulations in various contexts, including aviation (Lei, Waller, Hagen, & Kaplan, 2014; Waller, 1999), nuclear power plant control room crews (Stachowski et al., 2009), anesthesia teams (Kolbe et al., 2014), and mine rescue teams (Waller & Kaplan, 2015), to uncover the amount and nature of the patterns that relate to team effectiveness. As described next, these studies collectively paint a consistent picture of how teams should structure their interactions—including alternating between those within the team and those external to it.
First, highly effective teams switch between engaging in more versus less consistent and patterned interactions. During routine scenarios or phases of situational assessment (e.g., after implementing a given procedure), these teams tend to operate in a structured and consistent manner, establishing and adhering to systematic patterns of communication. This regularity is functional insofar as the situation matches members’ cognitive template of it. However, when the situation deviates, morphing into a nonroutine event or crisis, effective teams are able to shed these ingrained patterns of interaction and adapt as the situation unfolds and unforeseen contingencies appear (e.g., Lei et al., 2014; Stachowski et al., 2009). Here, rather than adhering to a protocol of how often to interact with whom, resilient teams and their leaders possess a latent capacity for adaptability (Somers, 2009) and fluctuate between using more or less patterned interaction as the situation demands.
In contrast, we find that these teams’ less effective counterparts are not as flexible. Even when the situation is novel and/or deviates from expectations,

these teams often continue to interact in more patterned, consistent ways and in a less adaptive manner during the crises. They continue to hold longer, more consistent, and more elaborate briefings, for example, in contrast to adaptive teams’ rapid and frequent briefings (Waller et al., 2004), suggesting an inability or unwillingness to engage with the dynamic environment and adjust situational understanding and protocols as the situation changes. This rigid response is indicative of these teams being less resilient and adaptive. We recently replicated and built on the previous findings in a study of five-person mine rescue teams engaged in high-fidelity simulation (Waller & Kaplan, 2015). Specifically, this study revealed two other differences in patterns that serve to bifurcate more effective teams from their counterparts and that also appear indicative of varying levels of resilience. Our results first indicated that captains (i.e., leaders) of the higher-performing rescue teams engaged in more internal/external team communication iterations—that is, they spanned their team boundary more often. Figure 5.1(c) illustrates this pattern of information exchange. Perhaps more noteworthy, though, is the finding that the captains of effective teams engaged in different patterns when communicating with an external party who was on the surface versus when communicating with members of the rescue team who were colocated in the mine. As in past studies, we found that leaders of effective teams engaged in shorter, less complex, and less reciprocal communications when interacting with team members in the mine with them. Notably, though, the captains of the higher-performing crews employed a different interaction style when communicating with the briefing officer on the surface. Here, these captains demonstrated interaction patterns that were significantly longer, more complex, and more reciprocal (or two-way). Taken together, these results suggest that effective leaders recognize the importance of iterating between internal and external agents and are able to adjust interaction styles across them. This agility is indicative of a latent source of resilience as it evidences awareness of the need for such adaptability as well as the ability to execute it.
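To make the idea of quantifying such interaction patterns concrete, the following minimal sketch shows one way a coded communication log might be summarized. It is purely illustrative and is not drawn from the studies cited above: the function name, labels, and sample log are hypothetical, and it assumes only that each leader utterance has been coded as addressed to the team or to an external party.

    from itertools import groupby

    def interaction_summary(targets):
        """Summarize a leader's coded communication log.

        `targets` is a time-ordered list of labels, one per utterance
        (e.g., "I" for internal team communication, "E" for external),
        here entirely hypothetical.  Returns the number of switches
        between internal and external communication (boundary-spanning
        iterations) and the average length of each run of same-target talk.
        """
        runs = [len(list(group)) for _, group in groupby(targets)]
        switches = max(len(runs) - 1, 0)   # one switch between each pair of adjacent runs
        avg_run = sum(runs) / len(runs) if runs else 0.0
        return {"switches": switches, "average_run_length": avg_run}

    # Hypothetical log for one simulated scenario.
    log = ["I", "I", "E", "I", "I", "I", "E", "E", "I"]
    print(interaction_summary(log))   # {'switches': 4, 'average_run_length': 1.8}

Metrics of this kind could then be compared across higher- and lower-performing crews, in the spirit of the pattern analyses described above.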

Simulation-Based Training

In the preceding discussion, we have described the challenges inherent in the scenarios that (potentially) reliable organizations confront and the primary role of teams in responding to those challenges and, ultimately, in improving

organizational effectiveness. Specifically, we have elaborated on selected intrateam emergent states that largely underlie team effectiveness in reliable organizations and also on the complex boundary dynamics these teams face. A primary thesis serving as the connective thread throughout this discussion has been that the demanding situations that (potentially) reliable organizations encounter are unique and largely unknowable. Because of the idiosyncratic nature of these events, attempts to formulate, and train for, responses to particular circumstances are often not effective in addressing those circumstances. Supportive of this notion, the evidence regarding the causal efficacy of predisaster planning in actual crisis response is, at best, mixed and inconclusive (Tierney, Lindell, & Perry, 2001). Furthermore, some scholars have even suggested that developing a specific plan to respond to specific scenarios can be maladaptive. Trying to understand an event through a certain preconceived lens or schema can discourage organizations from recognizing and responding to the unique features of that particular event (Weick & Sutcliffe, 2007). These features of potential scenarios clearly present a dilemma for organizations. How are organizations to prepare to respond to situations without knowledge of the nature and contingencies of those situations? The answer to this question speaks to the larger question of how organizations can develop teams to respond to nonroutine events—namely, by developing and fostering resilience among those teams. Whereas some have defined resilience simply as an ability or capacity to “bounce back” from unforeseen events, others construe resilience as capturing the commitment to be prepared for such occurrences (see Somers, 2009, p. 13). Adopting this latter treatment, we regard team resilience as not just about weathering crises; it also refers to a latent capacity to adapt to events that may not resemble those for which teams had prepared. Notably, though, this capacity may not be manifest until events occur or, due to the team’s proactive capabilities and corresponding prevention of such events, may not be manifest at all. We submit that teams, and the organizations that employ them, can develop such resilience through two broad strategies (see Bedwell et al., 2012, for a similar suggestion). First, and as discussed previously, organizations can select members and staff teams with individuals who possess the stable qualities likely to result in resilient teams. In doing so, organizations can consider

these attributes at the organizational level (i.e., selecting members with levels of certain traits, regardless of position or job type) and/or at the team level (i.e., creating teams whose members possess certain levels of certain traits). In either case, the organization also must consider which configural properties are important (Mathieu et al., 2014). For instance, an organization might decide that achieving a higher mean of positive affectivity is beneficial (at the organization or team level) and therefore only select individuals with higher scores on this attribute. Another potential perspective is that heterogeneity on this trait is (also) important, and therefore the organization would try to achieve such heterogeneity (perhaps while also trying to achieve a fairly high mean; see Kaplan et al., 2013). These considerations apply to staffing more generally but have received surprisingly little attention in the area of organizational reliability. While there is certainly potential value in considering the dispositional attributes of team members, this strategy also likely has some notable limitations as the primary method of developing team resilience. First, as noted earlier, teams in reliable organizations are often formed in an ad hoc manner and are composed of members who may never have worked together or even known each other prior to the precipitating scenario (e.g., Bedwell, 2012). Thus, selection efforts would have to take place at the organizational level with the goal that any ad hoc collection of members will possess the desired attributes. Trying to achieve other configural properties (e.g., a mix of experience levels or variability with respect to given traits) in rapidly forming teams would be more challenging. Also problematic is the fact that organizations (rightly or wrongly) likely would be hesitant to prioritize selecting on attributes related to team resilience above selecting on characteristics related to technical proficiency (e.g., general cognitive ability or specific knowledge or skills). For these reasons, selection and placement efforts cannot represent the entirety of organizational efforts to build resilient teams. An additional strategy is required. We suggest that organizations can build resilient teams through recurrent SBT during which teams exercise and further develop the behaviors and emergent states previously discussed. In simulations, teams encounter scenarios that reflect fundamental characteristics of the situations they may face in reality. As in actual demanding situations, simulation scenarios can be designed to be reactive to the actions of the participants (Waller, Lei, & Pratten, 2014). Depending

on the participants’ decisions and actions, the nature of the information teams receive in response, or the subsequent problem space and resultant decision points, may change (Moats, Chermack, & Dooley, 2008). Although perhaps generally thought of as being high in fidelity and even potentially dangerous (e.g., a firefighting team practicing with a real burning structure), simulations range in terms of both psychological and physical fidelity and also can include methods such as gaming and tabletop exercises. While a full review of SBT methods is beyond the scope of this chapter, certain points bear mention. For decades SBT has been a fundamental component of team training and preparation in high reliability industries such as aviation, the military, and power generation facilities (Salas, Rosen, et al., 2009). Increasingly, the healthcare industry is also advancing the potential role of team SBT in preparing for both routine and nonroutine scenarios (e.g., a train accident resulting in mass numbers of trauma patients; see Weaver, Dy, & Rosen, 2014). SBT is also becoming more common in business organizations, to prepare both for extreme, potentially life-threatening events (e.g., earthquakes) as well as extreme business events (e.g., publicized ethical scandals; see Clarke, 2009). Still, though, while the benefits seem apparent in terms of training specific skills such as following appropriate procedures under stress and time pressure, many organizations have not embraced systematic and recurrent SBT as a mandatory and fundamental aspect of developing transferable team capabilities such as resilience. Given the evidence documenting the benefits of SBT (Salas, Rosen, et al., 2009), this lack of integration seems problematic. Notably, while SBT may be generally helpful in enhancing team performance, several aspects of simulation suggest that its potential to develop team resilience may be especially pronounced. Even though teams facing real challenges may not be composed exactly as the teams trained through SBT, it seems reasonable to expect that teams that have experienced and benefited from emergent states in simulated conditions would be more likely than others to readily engage in interactions that support the emergence of the same states during real situations. Simulations can be quite challenging and present cognitive and emotional demands that far surpass those to which teams are accustomed, further encouraging the development of the emergent states and information management capabilities discussed here. These demands, though, ultimately can benefit the team. They can encourage errors that foster

metacognitive processes implicated in learning (Tucker & Edmondson, 2003). Also, these demands force teams to face and overcome poor performance and perhaps even failure—key ingredients of some definitions of resilience (e.g., Luthar, Cicchetti, & Becker, 2000). Used over time, simulations may help teams become desensitized to situations involving time pressure, uncertainty, and dynamic workloads.

Implications for HRO Team Research

Previously we described three factors proposed to be significant for the development and effectiveness of resilient teams in HROs. As documented throughout, a considerable amount of research exists on each of these three factors. Despite the volume of this research, though, significant questions about these topics remain, both in general and with respect to their application to teams in HROs. Thus, in the hope of motivating and helping to guide future research, we highlight here some questions that seem especially important.

Future Research on Internal Team Dynamics and Emergent States

In HROs, the teams called on to address unexpected critical situations may be single-purpose action teams—that is, teams such as firefighting crews that exist only for the purpose of addressing such situations, whose members have no significant routine organizational duties to perform—or dual-purpose action teams such as nuclear plant control room crews that normally perform routine work but are also called on to address nonroutine events when they occur (Waller & Uitdewilligen, 2012). Either type of team consists of members who typically train as a team to address crisis events and thus possess similar expectations of emergent states, although the membership of a single-purpose action team may be more unpredictable from day to day than that of a dual-purpose action team. Given this context, future HRO research on teams might approach the two types of teams differently; although both exist in HROs and both address extreme events, the challenges the two types of teams face regarding emergent states differ significantly.
First, because the membership of single-purpose teams fluctuates based on shift assignments and other factors, the teams do not begin work together during an event with fully formed emergent states already in

place; instead, team members must interact in order for shared mental models, transactive memory systems, and collective efficacy to emerge. Although these emergent states are facilitated by shared training and protocols, the manifestation of each state is unique to the team and requires team member interaction and time to emerge. Because single-purpose teams are typically activated by other organizational mechanisms once a critical event is recognized, the teams are typically not required to scan the environment constantly in order to activate themselves. Future HRO research could focus on the factors enabling single-purpose teams to facilitate the quick emergence of these important states.
However, the situation is quite different for dual-purpose teams, which spend the majority of their work time performing complex but routine tasks. Although these teams of relatively permanent membership are likely to possess refined emergent states, with the possible exception of transactive memory systems, these states may have emerged due to interactions in routine situations and may not be appropriate or useful for nonroutine action. Future HRO research should investigate if this is indeed the case, and if higher-performing dual-purpose teams (1) are able to jettison states for routine operations in favor of creating new states for nonroutine operations, or if the teams (2) possess two sets of states—one for routine and one for nonroutine situations—with the latter being created via simulation training or prior nonroutine experience. The additional emergent state of situation awareness is key for these teams, as they themselves must often notice that a critical event is occurring and decide when to abandon routine work in favor of nonroutine procedures and action. As such, studies directly measuring situation awareness (in real time) would seem especially valuable.

Future Research on Team Boundary Dynamics

Research focusing on how teams do and should manage boundary dynamics in HROs is scarce. We return to Figure 5.1 in suggesting directions that we regard as potentially yielding significant theoretical and practical value in understanding how resilient teams can manage these dynamics. Turning first to panel (b), questions abound regarding the onboarding and departing of team members. While a growing body of literature looks at the positive and negative effects of changing team composition (see Mathieu et al., 2014), we know very little about the dynamics that transpire when these transitions are occurring (but see Summers et al., 2012), especially in demanding and time-pressured scenarios. As members come and go, what information is most critical to convey, in what manner should it be conveyed, by whom, and at which point during critical incidents? Presently, answers to questions like these are lacking.
As an example of the type of question we propose addressing, a useful study could investigate the impact of different forms of transitioning when one or more new members join the team. One transition strategy would be for the team to hold frequent meetings at regular intervals during which the team leader would have the opportunity to brief any members who have joined the team since the prior meeting. This approach lends predictability but also may result in delayed transfer of information to team newcomers. Conversely, the leader may prioritize briefing the new members, suspending team operations to do so. This strategy introduces more interruptions and may lead to more switches between action and transition phases (Marks et al., 2001), but it also allows for more rapid onboarding and information transfer. Another option is for the team leader to brief the incoming members without the rest of the team present or to delegate the briefing to another team member. Knowing which approach is superior (in which circumstances and for which outcomes) not only has obvious practical value for developing resilient teams but also has theoretical impact. Such a query could both draw from and contribute to literature on topics such as team communication (Kolbe et al., 2014), transactive memory (Wegner, 1986), task prioritization and distribution (Waller, 1999), and shared situational awareness (Salas et al., 1995).
Figure 5.1(c)—depicting the information exchange between internal team members and external team partners—also suggests several fundamental issues that merit addressing. One such topic concerns the relational dynamics between team members who may seek assistance and those external members who may be called on to provide it. Helping is a dyadic phenomenon; one individual (or team) must seek or accept help and another must provide it. In our own research, though, we have observed that team members sometimes hesitate to seek outside assistance and that others sometimes respond to such requests with frustration. For instance, we have spoken with nurses who had concerns about a patient’s condition but were reluctant to initiate a “code” and request assistance for fear of outside members’ (e.g., physicians’) reactions if the nurse’s assessment was incorrect. While promoting psychological safety can reduce such occurrences (Edmondson, 1999), establishing such safety among members who may never have worked together (or

even met each other) would seem more challenging. Thus, we would call for research addressing how to promote familiarity in HROs and how to foster safety even in the absence of such familiarity.
Figure 5.1(c) also calls for an analysis of team leaders’ switching behaviors. While alternating between communicating with internal and external members is likely beneficial in general (Waller & Kaplan, 2015), the appropriate frequency and pattern of switching may vary, depending on factors such as the phase of the scenario. For instance, frequent switching may be more necessary during action phases than during times of transition (Marks et al., 2001). During action phases, the leader may need to regularly update the external agent on changing local conditions and team actions (e.g., regarding specific parameters of a nuclear power reactor), and the external individual needs to provide corresponding information about the resultant consequences of those actions (e.g., on other parts of the system or plant). Alternatively, given that such action phases can be especially demanding, particularly for leaders (Ibid.), perhaps leaders should delay such switches until transition phases. Experimental, simulation, and naturalistic studies addressing questions like this would seem important to conduct for developing resilient team responses.

Future Research on Simulation-Based Training

Finally, we also propose some future directions for the use of SBT in developing resilient teams. One potential direction follows from our own observation that there sometimes exists a lack of motivation to participate in simulations (or to do so in a serious manner; see also Smith, 2004). While SBT is standard practice in some HRO industries such as aviation and nuclear power generation (Salas, Rosen, et al., 2009), other industries, such as health care, have only recently begun widely implementing simulations (Weaver et al., 2014). Although we can only speak anecdotally, we have observed cases of significant resistance to participating in SBT in some of these newer forums. Part of this resistance seems to reflect a perceived lack of utility of these simulations, especially for developing “soft skills” such as communication and teamwork. While speculative, one certainly could imagine even greater skepticism if individuals are asked to participate in the types of SBT that we regard as important for developing team resilience—such as those among unacquainted members responding to events unlike any they have experienced (or could fathom occurring). As such, given the critical importance of motivation

in training effectiveness (Colquitt, LePine, & Noe, 2000), we would call for work on how organizations, and perhaps even industries, can induce cultural changes in perceptions of SBT. In general, we would suspect these changes to be especially likely to manifest, and to be successful, insofar as they are part of a gestalt shift from a response-based culture to a resilient, anticipatory one (Somers, 2009). Following from this last point, a second area of potential research would entail developing simulations to foster team capabilities for foreseeing events, growing from them, and for transitioning from one event to another. As emphasized earlier, resilient teams anticipate and embrace change and may “bounce” from one critical and demanding situation to the next. Current SBT, though, generally does not mirror this reality. Rather, SBT focuses on (team) response to a single critical event (or to multiple events in a given simulation session) and, to some degree, on developing generic teamwork knowledge, skills, and abilities to enable such a response (Salas, Rosen, et al., 2009). Simulations are needed that develop foresight for potential threats and opportunities (Kaplan, Stachowski, Hawkins, & Kurtessis, 2010) and for handling the emotional, cognitive, and logistical challenges in transitioning from one change to a different (perhaps unrelated) change. Creating these types of simulations presents theoretical and operational challenges (e.g., in terms of the time required for the simulation and creating emotional responses). Research addressing these challenges, though, could make a significant contribution to helping to develop more resilient teams.

Conclusions

Throughout this chapter we have suggested ways in which managers and leaders of teams might enhance reliability through team resilience. These suggestions center on: (1) engaging in deliberate selection and staffing strategies designed to enhance team-level resilience, and (2) using recurrent, realistic, interactive training simulations to help teams develop important emergent states (e.g., situational awareness, shared mental models, transactive memory systems, and collective efficacy) as well as the capability to adapt their own composition and communication to dynamic situations. Many of our ideas regarding staffing and team composition for teams likely to face critical events are based on integrations of existing research and are thus partially tested in

the literature we have cited. The development and use of SBT for teams, however, is a well-researched technique that most organizations—even those not typically considered HROs—could better leverage (Salas, Rosen, et al., 2009; Waller, Lei, & Pratten, 2014). In sum, key dynamics unfolding inside crisis management teams, along with dynamics involved in bridging team boundaries, form the repository for resilience in teams facing unexpected critical situations. No one area of team dynamics alone is sufficient for the development of team resilience. The challenge addressed by SBT is the creation of a learning context in which teams can experience the interplay of dynamic behaviors and states in a realistic and stressful setting. However, elements of organizational culture may decrease the effectiveness of such training by negatively influencing attitudes toward simulation, perhaps labeling it as a “game.” Thus, in order to leverage resilience for longer-term reliability, a multilevel organizational approach is paramount.

References

Arrow, H., McGrath, J. E., & Berdahl, J. L. (2000). Small groups as complex systems: Formation, coordination, development, and adaptation. Thousand Oaks, CA: Sage.
Baker, D., Day, R., & Salas, E. (2006). Teamwork as an essential component of high-reliability organizations. Health Services Research, 41, 1576–1598.
Bandura, A. (2000). Exercise of human agency through collective efficacy. Current Directions in Psychological Science, 9, 75–78.
Bedwell, W. L., Ramsay, P. S., & Salas, E. (2012). Helping teams work: A research agenda for effective team adaptation in healthcare. Translational Behavioral Medicine, 2, 504–509.
Bell, S. T. (2007). Deep-level composition variables as predictors of team performance: A meta-analysis. Journal of Applied Psychology, 92, 595–615.
Benn, J., Healey, A. N., & Hollnagel, E. (2007). Improving performance reliability in surgical systems. Cognition, Technology & Work, 10, 323–333.
Burke, C. S., Stagl, K. C., Salas, E., Pierce, L., & Kendall, D. (2006). Understanding team adaptation: A conceptual analysis and model. Journal of Applied Psychology, 91, 1189–1207.
Burke, C. S., Wilson, K. A., & Salas, E. (2005). The use of a team-based strategy for organizational transformation: Guidance for moving toward a high reliability organization. Theoretical Issues in Ergonomics Science, 6, 509–530.
Carroll, J. S. (2006). Introduction to organizational analysis: The three lenses (Working paper). Boston: MIT Sloan School of Management.
Clarke, E. (2009). Learning outcomes for business simulation exercises. Education & Training, 51, 448–459.
Colquitt, J. A., LePine, J. A., & Noe, R. A. (2000). Toward an integrative theory of training motivation: A meta-analytic path analysis of 20 years of research. Journal of Applied Psychology, 85, 678–707.
Colvin, G. (2006, June). Why dream teams fail. Fortune, 153(11), 87–92.
Crandall, W., Parnell, J. A., & Spillan, J. E. (2014). Crisis management: Leading in the new strategy landscape (2nd ed.). Los Angeles: Sage.
Cronin, M. A., Weingart, L. R., & Todorova, G. (2011). Dynamics in groups: Are we there yet? In J. P. Walsh & A. P. Brief (Eds.), The Academy of Management annals (Vol. 5, pp. 571–612). Philadelphia: Taylor & Francis.
Dweck, C. S., & Leggett, E. L. (1988). A social-cognitive approach to motivation and personality. Psychological Review, 95, 256–273.
Edmondson, A. C. (1999). Psychological safety and learning behavior in work teams. Administrative Science Quarterly, 44, 350–383.
Edmondson, A. C. (2012). Teaming: How organizations learn, innovate, and compete in the knowledge economy. San Francisco: Jossey-Bass.
Gully, S. M., Incalcaterra, K. A., Joshi, A., & Beaubien, J. M. (2002). A meta-analysis of team-efficacy, potency, and performance: Interdependence and level of analysis as moderators of observed relationships. Journal of Applied Psychology, 87, 819–832.
Halbesleben, J. R. B., & Wheeler, A. R. (2015). To invest or not? The role of coworker support and trust in daily reciprocal gain spirals of helping behavior. Journal of Management, 41, 1628–1650.
Hannah, S. T., Uhl-Bien, M., Avolio, B., & Cavarretta, F. L. (2009). A framework for examining leadership in extreme contexts. The Leadership Quarterly, 20, 897–919.
Hodgkinson, G. P., & Healey, M. P. (2008). Cognition in organizations. Annual Review of Psychology, 59, 387–417.
James, E. H., & Wooten, L. P. (2010). Leading under pressure: From surviving to thriving before, during, and after a crisis. New York: Taylor & Francis.
Kaplan, S. A., LaPort, K., & Waller, M. J. (2013). The role of positive affect in team performance during crises. Journal of Organizational Behavior, 34, 473–491.
Kaplan, S., Stachowski, A., Hawkins, L., & Kurtessis, J. (2010). Canaries in the coalmine: On the measurement and correlates of organizational threat recognition. European Journal of Work and Organizational Psychology, 19, 587–614.
Kaplan, S., & Waller, M. (2007, April). On the perils of polychronicity: Multitasking effects in nuclear crews. Paper presented at the 22nd annual meeting of the Society for Industrial and Organizational Psychology, New York.
Klein, K. J., Ziegert, J. C., Knight, A. P., & Xiao, Y. (2006). Dynamic delegation: Shared, hierarchical, and deindividualized leadership in extreme action teams. Administrative Science Quarterly, 51, 590–621.
Kolbe, M., Grote, G., Waller, M. J., Wacker, J., Grande, B., Burtscher, M., & Spahn, D. R. (2014). Monitoring and talking to the room: Autochthonous coordination patterns in team interaction and performance. Journal of Applied Psychology, 99, 1254–1267.
Kouchaki, M., Okhuysen, G., Waller, M. J., & Tajeddin, G. (2012). The treatment of the relationship between groups and their environments: A review and critical examination of common assumptions in research. Group & Organization Management, 37, 171–203.
Lei, Z., Waller, M., Hagen, J., & Kaplan, S. (2014, August). Team adaptiveness in dynamic contexts: Contextualizing the roles of interaction patterns and in-process planning. Presented at the Academy of Management annual meeting, Philadelphia.
Lu, L., Yuan, Y. C., & McLeod, P. L. (2012). Twenty-five years of hidden profiles in group decision making: A meta-analysis. Personality and Social Psychology Review, 16, 54–75.
Luthar, S. S., Cicchetti, D., & Becker, B. (2000). The construct of resilience: A critical evaluation and guidelines for future work. Child Development, 71, 543–562.
Marks, M. A., Mathieu, J. E., & Zaccaro, S. J. (2001). A temporally based framework and taxonomy of team processes. Academy of Management Review, 26, 356–376.
Mathieu, J. E., Tannenbaum, S. I., Donsbach, J. S., & Alliger, G. M. (2014). A review and integration of team composition models: Moving toward a dynamic and temporal framework. Journal of Management, 40, 126–156.
McKinney, E. H., Jr., Barker, J. R., Smith, D. R., & Davis, K. J. (2004). The role of communication values in swift starting action teams: IT insights from flight crew experience. Information & Management, 41, 1043–1056.
Mitroff, I. I. (1988). Crisis management: Cutting through the confusion. Sloan Management Review, 29, 15–20.
Mitroff, I. I. (2003). Crisis leadership: Planning for the unthinkable. New York: Wiley.
Moats, J., Chermack, T. J., & Dooley, L. M. (2008). Using scenarios to develop crisis managers: Applications of scenario planning and scenario-based training. Advances in Developing Human Resources, 10, 397–424.
Morgan, B. B., & Lassiter, D. L. (1992). Team composition and staffing. In R. W. Swezey & E. Salas (Eds.), Teams: Their training and performance (pp. 75–100). Norwood, NJ: Ablex.
Pearson, C. M., & Clair, J. A. (1998). Reframing crisis management. Academy of Management Review, 23, 59–76.
Pearson, C. M., & Mitroff, I. I. (1993). From crisis prone to crisis prepared: A framework for crisis management. Academy of Management Executive, 7, 48–59.
Perry-Smith, J. E., & Shalley, C. E. (2003). The social side of creativity: A static and dynamic social network perspective. Academy of Management Review, 28, 89–106.
Pettersen, K. A., & Schulman, P. R. (2016, March 26). Drift, adaptation, resilience and reliability: Toward an empirical clarification. Safety Science [online version]. Retrieved from http://www.sciencedirect.com/science/article/pii/S0925753516300108
PricewaterhouseCoopers. (2014). Trends shaping governance and the board of the future: PwC’s 2014 annual corporate directors survey. New York: Author.
Riley, W., Davis, S., Miller, K., & McCullough, M. (2010). A model for developing high-reliability teams. Journal of Nursing Management, 18, 556–563.
Rouse, W. B., & Morris, N. M. (1986). On looking into the black box: Prospects and limits in the search for mental models. Psychological Bulletin, 100, 349–363.
Salas, E., Prince, C., Baker, D. P., & Shrestha, L. (1995). Situation awareness in team performance: Implications for measurement and training. Human Factors, 37, 123–136.
Salas, E., Rosen, M. A., Held, J. D., & Weissmuller, J. J. (2009). Performance measurement in simulation-based training. Simulation & Gaming, 40, 328–376.
Salas, E., Wilson, K. A., & Edens, E. (2009). Crew resource management. Surrey, UK: Ashgate.
Sameroff, A. J., & Rosenblum, K. L. (2006). Psychosocial constraints on the development of resilience. In B. Lester, A. Masten, & B. McEwen (Eds.), Resilience in children. Annals of the New York Academy of Sciences, 1094, 116–124.
Smith, D. (2004). For whom the bell tolls: Imagining accidents and the development of crisis simulation in organizations. Simulation & Gaming, 35, 347–362.
Smith-Jentsch, K. A., Kraiger, K., Cannon-Bowers, J. A., & Salas, E. (2009). Do familiar teammates request and accept more backup? Transactive memory in air traffic control. Human Factors, 51, 181–192.
Somers, S. (2009). Measuring resilience potential: An adaptive strategy for organisational crisis planning. Journal of Contingencies and Crisis Management, 17, 12–23.
Stachowski, A. A., Kaplan, S. A., & Waller, M. J. (2009). The benefits of flexible team interaction during crises. Journal of Applied Psychology, 94, 1536–1543.
Starbuck, W., & Farjoun, M. (Eds.). (2005). Organization at the limit: Lessons from the Columbia Disaster. New York: Blackwell.
Stout, R. J., Cannon-Bowers, J. A., Salas, E., & Milanovich, D. M. (1999). Planning, shared mental models, and coordinated performance: An empirical link is established. Human Factors, 41, 61–71.
Summers, J. K., Humphrey, S. E., & Ferris, G. R. (2012). Team member change, flux in coordination, and performance: Effects of strategic core roles, information transfer, and cognitive ability. Academy of Management Journal, 55, 314–338.
Sutcliffe, K. M., & Vogus, T. J. (2003). Organizing for resilience. In K. Cameron, J. E. Dutton, & R. E. Quinn (Eds.), Positive organizational scholarship: Foundations of a new discipline (pp. 94–110). San Francisco: Berrett-Koehler.
Tannenbaum, S. I., & Cerasoli, C. P. (2013). Do team and individual debriefs enhance performance? A meta-analysis. Human Factors, 55, 231–245.
Tierney, K., Lindell, M., & Perry, R. (2001). Facing the unexpected: Disaster preparedness and response in the United States. Washington, DC: Joseph Henry.
Tucker, A., & Edmondson, A. (2003). Why hospitals don’t learn from failures: Organizational and psychological dynamics that inhibit system change. California Management Review, 45, 56–72.
Uitdewilligen, S., Waller, M. J., & Zijlstra, F. (2010). Team cognition and adaptability in dynamic settings: A review of pertinent work. In G. P. Hodgkinson & J. K. Ford (Eds.), The international review of industrial and organizational psychology (Vol. 25, pp. 293–353). Chichester, UK: Wiley.
Vashdi, D. R., Bamberger, P. A., & Erez, M. (2013). Can surgical teams ever learn? The role of coordination, complexity, and transitivity in action team learning. Academy of Management Journal, 56, 945–971.
Waller, M. J. (1999). The timing of adaptive group responses to nonroutine events. Academy of Management Journal, 42, 127–137.
Waller, M. J., Gupta, N., & Giambatista, R. (2004). Effects of adaptive behaviors and shared mental model development on control crew performance. Management Science, 50, 1534–1544.
Waller, M. J., & Kaplan, S. (2015). Life on the edge: Dynamic boundary spanning in mine rescue teams. Paper presented at the annual meeting of the Academy of Management, Vancouver, BC, Canada.
Waller, M. J., Lei, Z., & Pratten, R. (2014). Exploring the role of crisis management teams in crisis management education. Academy of Management Learning & Education, 13, 208–221.
Waller, M. J., & Uitdewilligen, S. (2009). Talking to the room: Collective sensemaking during crisis situations. In R. Roe, M. Waller, & S. Clegg (Eds.), Time in organizations—Approaches and methods (pp. 186–203). London: Routledge.
Waller, M. J., & Uitdewilligen, S. (2012). Transitions in action teams. In G. Graen & J. Graen (Eds.), Management of teams in extreme context (pp. 167–195). Charlotte, NC: Information Age.
Waller, M. J., Zellmer-Bruhn, M. E., & Giambatista, R. C. (2002). Watching the clock: Group pacing behavior under dynamic deadlines. Academy of Management Journal, 45, 1046–1055.
Weaver, S. J., Dy, S. M., & Rosen, M. A. (2014). Team-training in healthcare: A narrative synthesis of the literature. BMJ Quality and Safety, 23, 1–14.
Wegner, D. M. (1986). Transactive memory: A contemporary analysis of the group mind. In B. Mullen & G. R. Goethals (Eds.), Theories of group behavior (pp. 185–205). New York: Springer-Verlag.
Weick, K. E., & Sutcliffe, K. M. (2007). Managing the unexpected. San Francisco: Jossey-Bass.
Wildavsky, A. (1988). Searching for safety. New Brunswick, NJ: Transaction.
Woods, D., Patterson, E. S., & Roth, E. M. (2002). Can we ever escape from data overload?
A cognitive systems diagnosis. Cognition, Technology and Work, 14, 22–36. Zaccaro, S. J., Marks, M. A., & DeChurch, L. A. (Eds.). (2011). Multiteam systems: An organization form for complex, dynamic environments. New York: Taylor & Francis. Ziller, R. C. (1965). Toward a theory of open and closed groups. Psychological Bulletin, 64, 164–182.

Chapter 6

How High Reliability Mitigates Organizational Goal Conflicts
Peter M. Madsen & Vinit Desai

Organizational scholars have long recognized that goals constitute a fundamental characteristic of formal organizations (Parsons, 1956; Simon, 1964). Indeed, Philip Selznick (1949) defined organizations as “instruments designed to attain certain goals” (p. 254). However, organizational theorists have also noted that complex organizations rarely pursue unified, unitary goals but rather simultaneously seek multiple goals, some compatible and some incompatible. For example, Richard Cyert and James March (1963) note that rather than a single goal, “an organization has a complex of goals” (p. 47). Resolving conflicts among disparate organizational goals is a major challenge for organizational decision makers (Ethiraj & Levinthal, 2009). The ability to effectively deal with such goal conflicts is one of the definitional qualities of HROs. HROs exhibit virtually error-free performance despite operating in a highly hazardous environment that offers many opportunities for catastrophic failure. In such environments, always salient is “the goal of avoiding altogether serious operational failures” (La Porte & Consolini, 1991, p. 21). However, like other organizations, HROs are also characterized by the need to achieve operational objectives, which Todd La Porte and Paula Consolini (1991) refer to as “short term efficiency” goals. The tension between


reliability goals and efficiency goals is present in virtually all organizations. But because HROs inhabit environments that offer so many opportunities for serious failure, they do not have the luxury of leaving unresolved goal conflicts that could damage reliability. For example, Karlene Roberts (1990) notes that one of the challenges to reliability in complex organizations is that such organizations are characterized by “large numbers of constituencies and participant groups that result in multiple and conflicting goals” (p. 164). She goes on to discuss how HROs are characterized by “a strong sense of primary mission” that allows their members to resolve apparent goal conflicts without jeopardizing the primary goal (p. 172). HROs may be unique in their ability to manage goal conflicts without compromising the attainment of important goals. The broader literature on the organizational pursuit of multiple goals suggests that the process of resolving goal conflicts is a difficult one. Management scholars building on the Behavioral Theory of the Firm (BTF; Cyert & March, 1963) tradition have theorized several mechanisms through which organizational decision makers may deal with goal conflict (Audia & Brion, 2007; Ethiraj & Levinthal, 2009; Greve, 1998, 2008). But attempts at understanding the challenges of resolving goal conflicts also pervade the literatures of several other business and economic disciplines. For example, accounting scholars have devoted a great deal of attention to the measurement and reporting of multiple dimensions of performance (Ittner & Larcker, 2003; Kaplan & Norton, 1996). The balanced scorecard, which was recently named one of the one hundred greatest management innovations of the twentieth century (Meyer, 2002), is one outcome of this line of research. Additional accounting and economic work examines the performance implications of the number of performance variables that corporations explicitly pursue (Cools & van Praag, 2003). Similarly, agency theorists examining the design of incentive contracts for corporate executives have considered the implications of tying executive compensation to multiple types of corporate performance, including stock market performance, accounting measures of financial performance, social and environmental performance, and customer satisfaction (e.g., Prendergast, 1999). Finally, economic, business and society, and stakeholder theorists have long debated the number and nature of performance variables for which organizations ought to be considered responsible (e.g., Jensen, 2001; Freeman, 1990).


These varied literatures have proposed several mechanisms through which organizational decision makers might deal with the complexity of concurrently considering multiple dimensions of performance in guiding organizational policy and action. In this chapter we first propose a framework that categorizes this prior work on goal conflict resolution. We then demonstrate that the method employed by HROs to deal with goal conflicts constitutes a novel mechanism distinct from those discussed in the broader organizational and business literatures. While HROs may be unique in their ability to manage goal conflicts, the structures and tactics used by these organizations to balance competing goals may be broadly applicable to a large array of organizations operating in complex environments across diverse industries such as health care, transportation, utilities, technology, finance, education, and business services, to name a few. Thus, we conclude this chapter with a discussion of the implications for managing goal conflicts across all organizations operating in hazardous or uncertain domains.

Existing Models of the Resolution of Organizational Goal Conflicts

As noted, prior work in several disciplines suggests various methods for resolving organizational goal conflicts. We suggest that these models of how decision makers manage various forms of performance differ from one another primarily along two dimensions: performance relatedness and causal ambiguity. We define performance relatedness as the degree to which a model assumes that various forms of performance are interdependent, such that organizational decisions that affect one type of performance are likely to affect others as well. Relatedness between performance dimensions can be positive, such as when an increase in one type of performance (like talent retention) drives an increase in another type of performance (like new product development), or negative, such as when an increase in performance on one dimension reduces performance on another dimension (e.g., when an increase in market share achieved through product price reduction leads to a decrease in profitability). In either case the model that predicted the positive or negative relationship between two performance dimensions would be rated as high in relatedness because the dimensions are thought to interact with one another. However, if a model assumes that organizational decisions made in the pursuit of one


type of performance are largely unrelated to other forms of performance, that model is rated as low in relatedness. We define causal ambiguity as the degree to which organizational decision makers face uncertainty regarding the effects of their actions on organizational performance dimensions of interest. In other words, a model of how decision makers concurrently manage goals related to multiple types of performance that assumes that decision makers can predict with some accuracy the effects of their actions on the various types of performance that are of interest is rated low in causal ambiguity; whereas a model that assumes difficulty in comprehending means-ends relationships is rated high in causal ambiguity. Causal ambiguity may arise due to several reasons, such as when the relationship between cause and effect is complex, when the relationship is partially stochastic, or when the mechanism through which a cause relates to an effect is unobservable (Levitt & March, 1988; March & Simon, 1958). Our definition of causal ambiguity encompasses all these possibilities. Based on differences in assumed performance relatedness and causal ambiguity, we propose a typology of models of organizational decision making in the face of organizational aspirations on multiple performance dimensions. Figure 6.1 illustrates the proposed typology in the form of a two-by-two matrix that classifies organizational decision-making models into four quadrants. Existing approaches to goal conflict resolution fall into three of these quadrants,

                                  Causal Ambiguity: Low               Causal Ambiguity: High
Performance Relatedness: High     Unitary Consideration Models        HRO approach
Performance Relatedness: Low      Concurrent Consideration Models     Sequential Consideration Models

figure 6.1 Typology of Organizational Goal Conflict Resolution Models


which we label concurrent consideration models, sequential consideration models, and unitary consideration models. The fourth quadrant, the HRO approach, has not been formally theorized in prior work on organizational goal conflict. We argue that the model of goal conflict resolution proposed in the HRO literature falls into this category. Concurrent Consideration Models Several proposed models of organizational decision making in the context of multiple goals assume both low levels of relatedness among disparate types of performance and low causal ambiguity. We refer to models of this sort as concurrent consideration models. These models assume that organizational decision makers pursue goals on multiple types of organizational performance at once but that decision makers treat each goal independently (i.e., the performance of the organization relative to one goal does not affect decisions designed to meet a different goal related to a different type of performance). Concurrent consideration models also assume that it is possible for organizational decision makers to pursue disparate goals independently because the causal relationships between organizational action and various types of performance are relatively clear. For example, agency theorists argue that any aspect of organizational performance that can be (cost efficiently) measured and adds incrementally to a principal’s ability to gauge the level of effort exerted by an agent should be included in incentive contracts and tied to the agent’s compensation (Eisenhardt, 1989; Holmstrom, 1979). This agency theoretic view of managing multiple forms of organizational performance takes for granted that various dimensions of organizational performance are largely independent, such that an executive could pursue goals related to each of them simultaneously and that means-ends relationships can be understood. Similarly, a large body of work in the accounting literature examines financial and nonfinancial performance measurement in organizations (for a recent review, see Ittner & Larcker, 2009). Much of this work argues that organizations should pursue goals related to several nonfinancial forms of performance as well as financial goals at the same time, since financial performance may not provide a clear picture of global organizational performance (Ittner & Larcker, 1998, 2003; Kaplan & Norton, 1996). The balanced scorecard is one attempt to operationalize this recommendation; it indicates that organizational deci-


sion makers should track performance relative to goals in at least four areas: financial, customer satisfaction, internal process, and employee development (Kaplan & Norton, 1996). The balanced scorecard holds that decision makers must pay attention to each of these areas independently because these four different forms of performance cannot be easily decomposed to a common denominator that would allow them to be integrated into a higher-level performance metric. Consequently, this work implicitly adopts a concurrent consideration model of organizational decision making by assuming that performance goals on one dimension of performance may be tracked independently of other goals related to other forms of performance. A small body of organizational theory research also implicitly adopts this perspective, examining the independent and additive effects of organizational performance relative to disparate goals without examining any possible interaction among the different goals. For example, Pino Audia and Olav Sorenson (2001) find that organizational decision makers pursue both technological performance and sales performance goals concurrently, but these authors do not consider how performance relative to one of these goals influences how decision makers approach the other. Similarly, Joel Baum and colleagues (Baum, Rowley, Shipilov, & Chuang, 2005) explore how decision makers simultaneously seek market share and organizational network status without examining interactions between the two goals. Concurrent consideration decision-making models have been applied to several different research topics in different literatures. The central feature of all such models, however, is the assumption that organizational decision makers pursue multiple goals on various forms of organizational performance simultaneously without giving much consideration to interactions among the disparate goals. Sequential Consideration Models The second class of models of organizational decision making in the pursuit of multiple goals also assumes that different dimensions of organizational performance are largely unrelated to one another but also assumes that organizational decision makers usually operate in a context of high causal ambiguity. Following Henrich Greve (2003), we refer to such models as sequential consideration models. This name derives from the predictions of these models,


which argue that decision makers devote attention to organizational goals on disparate performance dimensions sequentially rather than simultaneously. This prediction in turn derives from the assumptions of BTF on which sequential consideration models build. BTF argues that organizational decision makers are boundedly rational due to their cognitive limitations (Simon, 1957). It is these cognitive limitations that generate causal ambiguity in means-ends relationships, as decision makers lack the ability, in most contexts, to accurately predict the long-term effects of their decisions (Levitt & March, 1988). As a result of bounded rationality, decision makers are unable to identify “optimal” alternatives (as is assumed by classical decision theory), and must rather be content to satisfice—to search for satisfactory alternatives (Cyert & March, 1963; March & Simon, 1958). Organizational decision makers determine whether performance has been satisfactory by comparing performance with organizational performance goals; performance that exceeds goals is deemed satisfactory and performance that falls short of goals is deemed unsatisfactory. Building on BTF, sequential consideration models hold that boundedly rational decision makers pursue goals on different dimensions of performance separately and sequentially, rather than concurrently (Greve, 2003). Most sequential attention models argue that decision makers strive to attain performance goals in some predetermined order of importance (March, 1962). For example, March and Zur Shapira (1987, 1992) suggest that decision makers in most organizations hold both financial and survival goals, but that survival goals predominate such that decision makers seek performance goals only after being assured that the attainment of survival goals is not in jeopardy (also see Wiseman & Bromiley, 1996; Miller & Chen, 2004). Similarly, Greve (2008) finds that organizational leaders consider both financial and size goals. But because decision makers value financial goals more than size goals, they only devote attention to size goals when financial performance exceeds financial goals. In other situations decision makers may turn attention to secondary goals only after primary goals have not been met or may prioritize goals based on actual performance. For example, decision makers who are motivated to avoid making changes to organizational policies or structures could minimize the perceived need to change following failure to achieve a primary goal by devoting attention to a secondary goal. Prior work has labeled decision-making models that take this approach as self-enhancement models (Greve, 1998). Audia and


Sebastien Brion (2007) found evidence that organizational decision makers follow the self-enhancement model in both lab and field settings. Alternatively, in some cases decision makers may not maintain consistent prioritizations of goals over time. Rather, they may use performance feedback to change goal priorities. The so-called fire alarm decision model is one suggested sequential consideration model that takes this perspective (Greve, 1998). The fire alarm model suggests that organizational decision makers shift their attention first to dimensions of organizational performance where performance falls the most short of the goal; then after performance in that area has reached its aspiration level, decision makers turn their attention to other aspects of performance (Greve, 2003). Clearly, the several sequential consideration models that have been proposed in the literature offer disparate predictions about decision-maker behavior. But these models all share several common characteristics: they build on BTF’s assumption of causal ambiguity brought about by bounded decision-maker rationality; they assume that this causal ambiguity constrains decision makers to focus on one performance goal at a time, shifting attention among various goals over time; and they assume (often implicitly) that different performance goals are independent enough that they can each be pursued separately and sequentially. Unitary Consideration Models A third class of organizational decision making models in the context of multiple goals, which we label unitary consideration models, relaxes the assumption that disparate performance goals are independent of one another, instead taking the perspective that different performance goals are intimately interrelated. These models also assume that causal ambiguity is low, such that the various performance goals held by organizational decision makers can be decomposed to a common denominator and compared directly to one another. This approach derives from the neoclassical economic assumption of decision-maker rationality, which suggests that decision makers’ preferences across all performance goals of interest can be reduced to a utility function and compared directly (Arrow, 1951; Becker, 1978). Like most sequential consideration models, unitary consideration models assume that organizational decision makers prioritize their various performance goals, such that some aspects of organizational performance are more valued


than others. From these two assumptions (goal decomposability and goal prioritization) it follows that organizational decision makers should pursue one primary performance goal and concern themselves with other goals only to the extent that their fulfillment furthers performance relative to the primary goal. Unitary consideration models typically arise in normative work describing what aspects of performance corporate executives should be concerned with. For example, Michael Jensen (2001) argues that “it is logically impossible to maximize in more than one dimension at the same time. . . . Thus, telling a manager to maximize current profits, market share, future growth in profits, and anything else one pleases will leave that manager with no way to make a reasoned decision. The result will be confusion and a lack of purpose that will handicap the firm in its competition for survival” (pp. 11–12). Thus, unitary consideration models suggest that organizational decision makers pursue one overriding performance goal and focus attention on other goals only insofar as accomplishing those goals contributes to the accomplishment of the primary goal. The most common version of the unitary consideration model in the academic literature privileges some form of financial performance (typically shareholder value) as the dominant aspect of organizational performance (e.g., Jensen, 2001; Friedman, 1970; Cools & van Praag, 2003). But this is only a special case of the general class of unitary consideration models. Unitary consideration models espoused by organizational decision makers often take other forms of performance to be the primary organizational goal. For example, decision makers at Ponseti International, a nonprofit organization dedicated to “the treatment of children born with clubfoot through education, research, and improved access to care” (Ponseti International, 2017), set goals on several performance dimensions, including financial (fund-raising), educational (number of health-care professionals trained), and treatment (number of children treated). Organizational leaders place priority on the treatment goal and attend to the other goals only to the extent that they expect meeting those goals to further the treatment goals (Ibid.). Many other nonprofits report similar single-minded pursuit of nonfinancial goals.
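To make the contrast among these three classes of models concrete, the sketch below (in Python) expresses each as a toy decision rule over performance-aspiration gaps. It is purely illustrative and is not drawn from any of the studies cited above; the dimension names, performance figures, aspiration levels, and the choice of a "financial" primary goal are hypothetical.

    # Illustrative sketch only: toy decision rules for the three existing model classes.
    # Dimension names, performance figures, and aspiration levels are hypothetical.
    performance = {"financial": 105, "customer": 78, "internal_process": 92}
    aspiration = {"financial": 100, "customer": 90, "internal_process": 90}

    def concurrent_consideration(perf, asp):
        # Treat every goal independently: flag each shortfall, with no attention
        # to how action on one dimension might affect the others.
        return {dim: perf[dim] < asp[dim] for dim in perf}

    def sequential_consideration(perf, asp):
        # "Fire alarm" variant: rank dimensions by how far performance falls short
        # of aspiration and attend to them one at a time, largest shortfall first.
        gaps = {dim: asp[dim] - perf[dim] for dim in perf}
        return sorted(gaps, key=gaps.get, reverse=True)

    def unitary_consideration(perf, asp, primary="financial"):
        # Privilege one goal; other dimensions matter only insofar as they are
        # believed to further it (assumes causal ambiguity is low enough to know this).
        return {"attend_to": primary, "shortfall": max(asp[primary] - perf[primary], 0)}

The point is only the shape of each rule: the concurrent rule inspects every gap but never relates them, the sequential rule works through ranked gaps one at a time, and the unitary rule collapses attention onto a single privileged dimension.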

Multiple Goals in HROs

The manner in which organizational leaders in HROs approach goal conflicts does not fall neatly into any of these existing classes of decision-making


models in the context of multiple goals. In fact, the HRO approach to goal conflict resolution appears to be distinct in that the HRO literature assumes that goal conflicts are high in both performance relatedness and causal ambiguity. In the HRO literature, goal conflicts are typically considered to occur between the goals of organizational reliability and organizational efficiency, cost reduction, or scheduling pressure (La Porte & Rochlin, 1994; Rochlin, 1993). For example, La Porte and Consolini (1991) note that HROs “share the goal of avoiding altogether serious operational failures. This goal rivals short-term efficiency as a primary operational objective. Indeed, failure-free performance is a condition of providing benefits” (p. 21). Similarly, Palie Smart and colleagues (Smart et al., 2003) call for manufacturing organizations that have focused on “lean” thinking for decades to integrate HRO principles, noting: “Creating organizations that have focused almost exclusively on ‘getting more for less’ might be argued to have stripped organizations of their adaptive and responsive capacity or organizational slack, by promoting a managerial mindset driven more or less exclusively by the continuing requirement for cost minimization” (p. 733). Furthermore, citing the loss of the Columbia orbiter as an example of unacceptably low reliability, the Columbia Accident Investigation Board argued that, leading up to the Columbia disaster, “NASA had conflicting goals of cost, schedule, and safety. Safety lost out” (Columbia Accident Investigation Board, 2003, p. 200). In the HRO context, these forms of organizational performance—reliability and efficiency—must be seen as intimately related. For example, in the context of goal relatedness in aircraft carrier operations, the commanding officer of the USS Carl Vinson said to Roberts (1990), “Sometimes I think you don’t keep your eye on the goal around here. The primary goal is to get the planes off the pointy end of the ship and back down on to the flat end without mishap” (p. 172). Unreliable performance on an aircraft carrier, in the form of accidents or mishaps, is antithetical to the efficient fulfillment of this primary goal. Thus, the goals of reliability and efficiency cannot be separated in this context. Similarly, in the US air traffic control (ATC) system, both reliability and efficiency are key goals, but the two are virtually inseparable in the sense that system inefficiencies lead to hazardous conditions, while unreliability (in the form of accidents or near accidents) reduces efficiency and increases costs (La Porte, 1988). At the same time that it views different performance goals as highly related, the HRO literature also views causal ambiguity as high in HRO operations.


HRO theorists view organizational environments, and organizations themselves, as complex, uncertain, tightly coupled, and unpredictable, rendering the ability of organizational leaders to comprehend all relevant causal mechanisms weak at best (Roberts, 1993; Roberts & Rousseau, 1989). For example, in his study of the Diablo Canyon nuclear power plant, Paul Schulman (1993) notes a “widespread recognition that all of the potential failure modes into which the highly complex technical systems could resolve themselves have yet to be experienced. Nor have they been exhaustively deduced” (p. 384). Such recognition that organizational understandings and models of systems and environment are incomplete leads to a “reluctance to simplify interpretations” among members of HROs, in the sense that members of HROs hesitate to ignore any aspect of their operations or environment, not knowing in advance where future challenges may come from (Chassin & Loeb, 2013; Weick, Sutcliffe, & Obstfeld, 1999). In a setting where causal mechanisms are clearly known, this reluctance to simplify would be wasteful and inefficient. Only in settings where causal mechanisms are not clearly known or evolving does reluctance to simplify make sense. Thus, by assuming that various categories of organizational performance are highly related and assuming that causal ambiguity is high, the HRO view on the resolution of goal conflicts falls into the upper-right quadrant in Figure 6.1—a domain that has not been previously theorized by the mainstream literatures on organizational decision making in the context of multiple goals. The processes through which HROs resolve goal conflict (discussed in the next section), therefore, may have broader applicability in any context where goal relatedness is high and causal ambiguity is high.

Strategies for Goal Conflict Resolution in HROs

A variety of frameworks relating processes in HROs to the performance of those organizations touch on the approaches to goal conflict resolution employed by HROs (e.g., Bigley & Roberts, 2001; La Porte & Consolini, 1991; Libuser, 1994; Roberts, 1990; Weick et al., 1999). These approaches share the view that the ability of HROs to determine a desirable course of action in the presence of goal conflicts derives from mindfulness—the ability of an organization’s members to actively engage with organizational conditions and envi-


ronments continuously rather than relying heavily on automatic processes for environmental scanning and decision making (Rerup, 2009; Weick & Roberts, 1993; Weick et al., 1999). Particularly, they consider mindfulness to involve more than merely allocating attention broadly across various dimensions of performance but rather describe it as the capability to collectively notice weak or rare signals such as minor or unanticipated deviations from expected outcomes along any of these valued performance dimensions, to accurately and efficiently determine the causes of these deviations even under the considerable causal ambiguity and high performance relatedness that characterize HROs, and to pursue desirable corrective actions even when these conflict with received wisdom or standard practices. Thus, mindfulness refers not just to a rich awareness of unexpected events but also to the organization’s capacity to act in ways that effectively manage and correct these deviations. This prior work calls out four processes in particular that enable HROs to manage goal conflicts mindfully: bargaining, balancing of incentives, incremental decision making, and commitment to resilience. Each of these processes will be discussed in detail in the following paragraphs. Bargaining Roberts (1990) argues that goal conflicts between the drives for efficiency and for reliability are inherent to daily operations in HROs because “the organization needs to resolve tensions created by the need for maximal organizational effectiveness today without jeopardizing the ability to do it again tomorrow” (p. 172). Based on her study of US Navy aircraft carrier operations, Roberts identifies bargaining—the ongoing negotiation of the relative importance of different goals—as the primary mechanism through which these goal conflicts are managed: “Bargaining is a successful strategy for operationalizing goals because the superordinate goals are clear to everyone. . . . Despite multiple constituencies the organization does not fail at specifying and operationalizing goals” (Ibid., p. 172). Continuous bargaining over goal priority serves to keep all members of the organization informed about the activities and needs of all parts of the organization. Thus, bargaining makes the organization’s “big picture” clear to everyone in the organization and helps them to coordinate their work (Roberts & Bea, 2001). Organizational members within HROs collectively construct


mental maps of their organizations’ operational environment, which can be fairly complex as they include dynamic relationships among people, tools, technologies, and environmental pressures, and use these integrative cognitive maps to continuously identify and implement “ongoing small adjustments that prevent errors from cumulating” (Weick et al., 1999, p. 43). Such mental maps are particularly beneficial when HROs encounter unexpected events or unanticipated problems, since members’ reaction times can be considerably quicker and their choices of action considerably more sophisticated when they already possess an accurate understanding of their operating environment rather than when this understanding must first be developed (Weick & Roberts, 1993). Ongoing bargaining over goals allows the organization’s members to detect and proactively manage goal conflicts. For example, Karl Weick, Kathleen Sutcliffe, and David Obstfeld (1999) provide the example of air traffic controllers dealing with complex airspace; they must balance the competing aims of conveying aircraft through their jurisdiction quickly or efficiently but also safely. To manage this problem, other air traffic controllers may “gather around a person working a very high amount of traffic and look for danger points,” effectively enhancing the organization’s capacity to simultaneously attend to performance along multiple dimensions (Ibid., p. 44). This process departs from the presumed approaches in other decision-making models, since the unitary and sequential models assume that organizational resources need not be directed toward making interpretations along multiple dimensions of performance at the same time, and the concurrent model assumes that while this is possible, there is no need to integrate information about performance along multiple dimensions throughout the organization because each category of performance is distinct and unrelated to the others. Balancing of Incentives Perhaps one of the largest differences between HROs and other organizations is that HROs formally account for the possibility and costs of catastrophic failure (Libuser, 1994; Roberts & Bea, 2001; Roberts & Libuser, 1993). Thus, “organizations that have fewer accidents than expected balance the tension between rewarding efficiency and rewarding reliability. . . . They seek to establish reward and incentive systems that balance the costs of potentially unsafe but short-run profitable strategies with the benefits of safe and long-run


profitable strategies” (Roberts & Bea, 2001, p. 74). Maintaining a balance in incentive systems between efficiency and safety is a major challenge in organizations because efficiency (and short-term profitability) concerns are constant, while serious accidents are very rare (Madsen, 2013). Thus, over time, adaptive pressures tend to increase the perceived value of efficiency and reduce the perceived value of reliability in the minds of organizational decision makers such that actions that improve efficiency come to be rewarded much more richly than actions that promote reliability—until an accident occurs, which reminds decision makers of the importance of safety (Marcus & Nichols, 1999). This cycle then repeats itself when time since the last major accident increases and efficiency again comes to be valued above reliability (Turner, 1978). The Deepwater Horizon disaster is a recent example of this tendency. Prior to the disaster, BP executive compensation had not been closely tied to process safety. But in the months following the April 2010 Deepwater Horizon spill, BP CEO Tony Hayward resigned and was replaced by Robert Dudley. One of Dudley’s first actions after taking control of BP was to announce that all BP executive bonuses would be made contingent solely on corporate safety performance. HROs avoid such ebbs and flows of incentives for reliability by continuously balancing incentive systems to reward both efficiency and reliability. Roberts and Robert Bea (2001) argue that HROs maintain such balance in reward systems by making “it politically and economically possible for people to make decisions that are both short-run safe and long-run profitable. This is important to ensure that the focus of the organization is fixed on accident avoidance” (p. 74). HROs balance reward systems by developing metrics of reliable performance at the individual level and placing performance on those reliability metrics on the same standing as financial and efficiency metrics in formal incentive programs, and by continuously reviewing incentive systems to ensure that this balance is not eroded over time (Roberts & Libuser, 1993). HROs also seek to maintain this balance by continuously surveying their members to verify that the “real goals of the organization are the same as the public goals” (Roberts & Bea, 2001, p. 74). This balance in reward and incentive systems allows the organization to simultaneously address and resolve problems along multiple dimensions of performance by forcing decision makers to consider each dimension of performance in tandem with the others. As a result, HROs accumulate relevant


information about their environment and direct it appropriately within their organizations, ultimately enhancing their ability to simultaneously address multiple dimensions of performance and comprehensively resolve any conflicts that arise among these dimensions. This approach departs from the concurrent, unitary, and sequential consideration models, which allow decision-making authority to migrate within organizations but maintain that organizational decision makers are nonetheless only motivated or able to address unrelated dimensions, single dimensions, or sequential dimensions of performance one at a time. Incremental Decision Making HROs are distinct from many organizations in their attention to detecting and learning from failures (Weick & Roberts, 1993; Weick et al., 1999). Failures reveal lapses in structures and systems that could induce even more problematic or catastrophic effects if left uncorrected. La Porte and Consolini argue that this emphasis on learning from past deviations represents a form of “incremental decision making” (1991, p. 26). At the same time, though, HROs tend to have a limited history of failures to analyze and learn from, given their typically error-free operations. Thus, HROs are often “preoccupied with something they seldom see” (Weick et al., 1999, p. 39). To overcome this tension, HROs tend to treat any failures that occur, no matter how rare, as rich windows into system health, they tend to expand the volume of knowledge-generating events by analyzing near failures as well as outright failures, and they actively seek to avoid the complacency that follows long histories of success. La Porte and Consolini (1991) argue that because of the use of this form of incremental decision making, “errors resulting from operational or policy decisions are limited and consequences are bearable or reversible, with the costs less than the value of the improvements learned from feedback analysis” (p. 27). HROs operationalize this type of incrementalism through practices such as “hot washups”—where organizational members debrief immediately following an operation so that any small errors that occurred during the operation can be captured and analyzed while memories of them are fresh—and the production of “lessons learned” case studies that capture the context surrounding negative outcomes and the knowledge gleaned from them (Jordan, 2010; La Porte & Consolini, 1991; Roberts, 1990).


This strategy differs from the approaches presumed by the unitary, sequential, and concurrent consideration models of organizational decision making. According to the unitary model, for instance, failures along secondary dimensions of performance would only receive attention to the extent that they interfere with the attainment of a more primary goal (e.g., Jensen, 2001; Friedman, 1970). According to the sequential model, failures along various dimensions of performance would be attended to one at a time, rather than simultaneously, following some predetermined order of importance (Greve, 2003). The strategy presumed by concurrent consideration models is perhaps closest to the approach within HROs in this regard, as this model also allows organizational decision makers to simultaneously attend to failure along multiple dimensions of performance. However, the concurrent model assumes that decision makers treat goals independently and that they pursue separate corrective actions with respect to each goal (Audia & Sorenson, 2001; Baum et al., 2005). In contrast, failures to attain goals along multiple performance dimensions in HROs are presumed to be interrelated and are interpreted collectively in order to gain a broader sense of underlying lapses and to design more comprehensive interventions (Weick et al., 1999). Commitment to Resilience Weick and colleagues (1999) argue that one of the central elements of HROs, their commitment to resilience, allows them to deal with goal conflicts. Resilience refers to the capacity of organizations to cope with problems and absorb their effects with minimal adverse impact to ongoing operations but also in a broader sense conveys the ability of organizations to anticipate and prepare for these problems in advance (Vogus, Rothman, Sutcliffe, & Weick, 2014; Weick et al., 1999). An essential attribute of resilience, in this broader sense, involves HROs’ efforts to amass and direct operational resources toward areas of operational weakness whenever problems are anticipated. Latent interpersonal networks on aircraft carriers, one example of this process identified by the HRO literature, spontaneously emerge and serve to combine knowledge collectively throughout the organization whenever problems are anticipated but rapidly disassemble following these deviations from the norm (Weick & Roberts, 1993; Weick et al., 1999). By combining disparate knowledge throughout the organization, these networks help their members to cope


with imperfect individual knowledge and choose acceptable courses of action despite the causal ambiguity that characterizes their environment (Colquitt, Lepine, Zapata, & Wild, 2011; Weick et al., 1999). This notion, that HROs accumulate and direct resources toward solving complex performance challenges in a resilient fashion, departs considerably from the approaches suggested by the unitary, concurrent, and sequential consideration models. In particular, these three models portray the resolution of goal conflict as a more static process that does not invoke an escalation of organizational resources. For instance, the sequential model presumes that attention within organizations is limited and that members will first turn to the most pressing problems whenever deviations from the norm arise (Greve, 2003), while the HRO approach suggests that organizational attention can be expanded through an escalation of resources in order to simultaneously consider pressing problems along with less pressing problems, given that this combined information may reveal better knowledge about potential solutions than any effort to attend to each performance dimension sequentially or in isolation. The HRO approach, therefore, also differs from the unitary model, which considers one performance dimension in isolation, and from the concurrent consideration model, which addresses multiple performance dimensions but does not require that information is combined throughout the organization in this process.
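Continuing the toy vocabulary of the earlier sketch, the fragment below illustrates, again purely hypothetically, two of the mechanisms just described: an incentive score that keeps reliability on the same standing as efficiency, and a joint review that refuses to interpret a deviation on any single dimension in isolation. The metric names, weights, and the pairing of related dimensions are assumptions for illustration, not prescriptions drawn from the HRO literature.

    # Illustrative sketch only; metric names, weights, and related pairs are hypothetical.
    def balanced_incentive_score(efficiency, reliability, weights=(0.5, 0.5)):
        # Weight reliability and efficiency metrics equally in the formal reward,
        # so rewards cannot drift toward efficiency alone between rare accidents.
        w_eff, w_rel = weights
        return w_eff * efficiency + w_rel * reliability

    def joint_deviation_review(deviations, related_pairs):
        # A deviation on any dimension pulls every related dimension into the same
        # review, rather than being read (or ignored) on its own.
        review = set()
        for dim, deviated in deviations.items():
            if deviated:
                review.add(dim)
                for a, b in related_pairs:
                    if dim in (a, b):
                        review.update((a, b))
        return review

    # Hypothetical use: a minor customer-complaint deviation also brings the related
    # sales-revenue dimension under review.
    flags = {"customer_complaints": True, "sales_revenue": False, "on_time_delivery": False}
    print(joint_deviation_review(flags, related_pairs=[("customer_complaints", "sales_revenue")]))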

Conclusions

Members of HROs are required by the challenging environments their organizations occupy to continuously balance the demands of multiple organizational performance goals on multiple performance dimensions—in particular reliability and efficiency. In the face of these demands, members of HROs have developed techniques for resolving goal conflicts effectively without compromising performance along any important performance dimension. This ability to manage goal conflicts is one reason why HROs have proven capable of operating in hazardous conditions while experiencing far fewer serious failures than would be expected. The HRO literature also identifies unique and broadly applicable strategies for organizational decision makers to resolve goal conflicts. According to prior research on the unitary, sequential, and concurrent consideration mod-


els, decision makers are presumed to resolve goal conflicts by attending predominantly to one dimension, one dimension at a time, or by pursuing actions among multiple unrelated dimensions. These extant strategies appear useful in cases where either performance relatedness or causal ambiguity is low, yet this leaves open the question of how decision makers should resolve conflicts among multiple related forms of performance (high performance relatedness) when the effects of their actions are also difficult to predict (high causal ambiguity). The HRO literature fills this void by identifying the structures and tactics used by organizations to balance competing goals and raise awareness of the connections between actions and outcomes as action unfolds. This has notable implications for theory and practice, which we discuss in turn next. Theoretical Implications and Directions for Future Work A variety of settings feature both high performance relatedness as well as high causal ambiguity, and the strategies for goal conflict resolution identified through the HRO literature likely hold implications for future research across a wider array of theoretical approaches and empirical settings. First, research on the BTF (Cyert & March, 1963) aims to broadly address and identify organizational responses to performance feedback, and this literature has increasingly examined multiple performance dimensions (Audia & Brion, 2007; Greve, 2003, 2008). While theory in this area has primarily espoused sequential consideration models, HRO research provides a window into additional models of performance feedback. For instance, organizations with processes related to bargaining, incentive balancing, incremental decision making, or resilience may uniquely respond to multiple dimensions of performance feedback simultaneously, or through actions aiming to uniformly address all forms of performance shortfall. If so, this would provide a theoretical lens to predict and examine differences in organizational responses to performance feedback based on internal structural elements. At least subsets of the defining structural characteristics of HROs are employed in a variety of other organizations and may allow these organizations to resolve competing demands of conflicting goals. Future work in this area could yield important insights regarding organizational decision making for both HROs and other organizations. Next, researchers would be wise to consider the performance implications for organizations that employ HRO structures and processes in environments where either performance relatedness or causal ambiguity is low rather than


concurrently high. Although HRO research identifies tactics such as bargaining, incentive balancing, incremental decision making, and resilience as useful in environments with high performance relatedness and causal ambiguity, an open question involves whether these strategies for goal conflict resolution may actually become detrimental in environments where either dimension is low. For instance, organizations that devote more time to bargaining or incremental decision making may respond more slowly to environmental changes, which could be a detrimental approach in environments where causal ambiguity or performance relatedness are low. Thus, a direct comparison of these approaches to goal conflict resolution, across organizations in different environments or distinct settings, seems warranted. Such research could shed light onto the environmental conditions that require HROs versus other organizational forms. Finally, when the HRO literature addresses the issue of goal conflict resolution, the two goals that are almost universally discussed are reliability and efficiency. This focus is logical given the importance of these two objectives for HROs and the inherent conflict that exists between them. However, these two goals represent only a subset of the organizational objectives that may be relevant to members of HROs and that may come into conflict. This being the case, a contribution of this chapter is to delineate a general approach to conceptualizing goal conflict in HROs. This approach may be applied to any set of conflicting goals in HROs and is not restricted to reliability and efficiency goals. Future HRO research could examine the resolution of conflicts among a wider variety of goals—including, perhaps, reliability, response speed, quality, efficiency, and the perceptions of external audiences, among others. Such research could yield additional insights into the processes that allow HROs to successfully navigate highly uncertain environments. We encourage work in this direction. Practical Implications Even beyond HROs, an astounding variety of organizations operate in complex environments characterized by high causal ambiguity and performance relatedness. Indeed, organizations in many industries and sectors that are central to the economy, such as health care, transportation, utilities, technology, finance, education, and business services—to name a few—are driven by decision makers who must attend to multiple forms of performance and


choose appropriate courses of action for their organizations, despite facing significant ambiguity regarding potential outcomes and substantial complexity regarding how changes in any one area might unintentionally affect related performance in other areas. Several of the unique goal conflict resolution strategies identified in HROs may help facilitate decision making in this broader group of organizations. While the HRO literature turns attention to several promising approaches (La Porte & Consolini, 1991; Libuser, 1994; Roberts, 1990; Vogus et al., 2014; Weick & Roberts, 1993), one commonality across these frameworks is that HROs are often able to determine a desirable course of action through mindfulness, or an active and continuous engagement rather than more automated or sporadic environmental scanning (Weick & Roberts, 1993). The processes identified by the HRO literature to enhance mindfulness within HROs, which include bargaining, balancing of incentives, incremental decision making, and resilience, may also support mindfulness in other organizations facing goal conflicts. For instance, accounts exist in the HRO literature of ongoing bargaining and negotiation over the importance of various performance goals (Roberts, 1990). While organizations in other settings may feature similar bargaining processes, these may represent more sporadic or superficial efforts rather than ongoing commitments. For example, public corporations may articulate their commitment to long-term growth, community involvement, or philanthropy but must often defer to short-term profitability whenever conflicts arise (Marcus & Goodman, 1991). However, ongoing bargaining and negotiation in these settings may reveal situations where broad or long-term objectives are viewed as so important that the short-term market penalty is acceptable or may even direct organizational decision makers to identify common approaches that enhance short-term value while simultaneously accomplishing other broader or longer-term objectives deemed important to the organization. In addition, the HRO literature emphasizes the balancing of incentives, where organizations in hazardous industries strive to balance the tension between rewarding safety or efficiency, specifically by avoiding the pressure to emphasize efficiency given that serious accidents are often already very rare (Madsen, 2013; Roberts & Bea, 2001). Incentives and reward systems in other settings may also drift over time, for example when organizations that balance between exploration and exploitation begin to emphasize new product


development or market expansions at the expense of the routine activities or minor refinements required to maintain existing operations (March, 1991). Organizations that avoid this drift may be able to maintain a more appropriate balance between exploration and exploitation. For instance, these organizations may be able to achieve a greater level of stability by foregoing exploration when it would be too risky or costly rather than rewarding their members for pursuing exploration even when exploitation would appear more beneficial. The HRO literature also identifies incremental decision making as an essential characteristic of organizations striving to perform reliably in hazardous settings, meaning that these organizations must view any deviations from the norm, no matter how minor, as rich windows into system health given that more catastrophic failures are naturally very rare, and that they must use knowledge from these deviations to incrementally and continuously refine their operating practices and procedures (La Porte & Consolini, 1991). This approach may be beneficial for resolving goal conflicts at organizations operating in other settings as well. An essential attribute of incremental decision making at HROs is that deviations are viewed as important regardless of which performance metric they affect. Since the involved metric may be related to other forms of performance (given that performance relatedness is high within HROs), any deviations that arise provide important information regarding the overall organizational system, even if they involve forms of performance that are not initially viewed as valuable or important. In contrast, in many other organizational settings, a performance deviation may be overlooked unless it directly involves a metric of interest to the organization. For example, some organizations may overlook minor or infrequent complaints from customers regarding salesforce behavior as long as salespeople meet revenue targets. However, these complaints could provide knowledge regarding training gaps that could create larger problems for the organization in the future or could yield information about how to improve salesforce revenue performance even further. Finally, resilience is known to enhance HROs’ abilities to anticipate and resolve goal conflicts (Christianson, Sutcliffe, Miller, & Iwashyna, 2011; Weick et al., 1999). Specifically, HROs appear to amass operational resources and direct these resources to areas of weakness whenever problems are anticipated. Whenever goal conflicts arise, additional resources and expertise allow HROs to determine how to prioritize various actions and to more fully explore the potential outcomes of different action scenarios. Similar fluidity may arise in


other settings as well, such as when governing board meetings are convened to deal with important strategic or competitive dilemmas, yet HROs appear to take resilience further by maintaining the ability to decentralize decision making in these situations, allowing people or teams with the greatest expertise to implement solutions regardless of their formal roles or positions within the organization. Such flexibility may enhance decision making in other settings as well; even though the expertise to solve particular problems may be sporadically distributed throughout an organization, decision making within many organizations appears to remain rigidly hierarchical even when goal conflicts or other dilemmas arise. Although strategies such as bargaining, balancing of incentives, incremental decision making, and resilience appear to aid HROs with resolving goal conflicts, they may require modifications prior to their transfer to other settings. To be effective these strategies require ongoing managerial commitments as well as socialization processes that encourage members to work seamlessly across formal boundaries whenever required. While HROs increasingly tend to occupy cultural, social, and industry environments that support these approaches, organizations in other competitive contexts may find that the required investments create competitive trade-offs or lead to strategic disadvantages given the underlying costs of maintaining these strategies. This opens an array of questions, suitable for future researchers, regarding whether and how successfully non-HROs are able to adapt HRO approaches to resolving goal conflicts. For instance, are non-HROs able to strike a careful balance among the strategies used by HROs, or do the strategies reinforce each other, such that organizations risk failure unless they adapt all of the strategies in their entirety? Or, can organizations in contexts where failure is less hazardous or costly still benefit through the HRO strategies, or are these strategies particularly tailored to contexts where failure along multiple performance dimensions is to be avoided at all costs? Despite the vastness and depth of the literature on organizational decision making and the resolution of goal conflicts, these questions perhaps suggest that the literature and its implications for practice are still in their infancy.

References

Arrow, K. J. (1951). Social choice and individual values. New York: Wiley.
Audia, P., & Brion, S. (2007). Reluctant to change: Self-enhancing responses to diverging performance measures. Organizational Behavior and Human Decision Processes, 102, 255–269.

Audia, P. G., & Sorenson, O. (2001). A multilevel analysis of organizational success and inertia (Unpublished working paper). London School of Business. Baum, J. A., Rowley, T. J., Shipilov, A. V., & Chuang, Y. T. (2005). Dancing with strangers: Aspiration performance and the search for underwriting syndicate partners. Administrative Science Quarterly, 50(4), 536–575. Becker, G. S. (1978). The economic approach to human behavior. Chicago: University of Chicago Press. Bigley, G. A., & Roberts, K. H. (2001). The incident command system: High-reliability organizing for complex and volatile task environments. Academy of Management Journal, 44, 1281–1299. Chassin, M., & Loeb, J. (2013). High-reliability health care: Getting there from here. Milbank Quarterly, 91, 459–490. Christianson, M., Sutcliffe, K., Miller, M., & Iwashyna, T. (2011). Becoming a high reliability organization. Critical Care, 15(6), 314. Colquitt, J., Lepine, J., Zapata, C., & Wild, R. (2011). Trust in typical and high-reliability contexts: Building and reacting to trust among firefighters. Academy of Management Journal, 54, 999–1015. Cools, K., & van Praag, M. (2003). The value relevance of a single-valued corporate target: An explorative empirical analysis (Tinbergen Discussion Paper TI2003-049/3). Tinbergen Institute, The Netherlands. Retrieved from http://www.tinbergen.nl/discussionpapers/03049.pdf Columbia Accident Investigation Board. (2003). Columbia Accident Investigation Board report, volume 1. Washington, DC: Government Printing Office. Cyert, R. M., & March, J. G. (1963). A behavioral theory of the firm. Englewood Cliffs, NJ: Prentice-Hall. Eisenhardt, K. (1989). Agency theory: An assessment and review. Academy of Management Review, 14, 57–74. Ethiraj, S. K., & Levinthal, D. (2009). Hoping for A to Z while rewarding only A: Complex organizations and multiple goals. Organization Science, 20(1), 4–21. Freeman, R. E. (1990). Corporate governance: A stakeholder interpretation. Journal of Behavioral Economics, 19(4), 337–359. Friedman, M. (1970, September 13). The social responsibility of business is to increase its profits. New York Times Magazine, 122–126. Greve, H. R. (1998). Performance, aspirations, and risky organizational change. Administrative Science Quarterly, 43(1), 58–86. Greve, H. R. (2003). Organizational learning from performance feedback: A behavioral perspective on innovation and change. Cambridge: Cambridge University Press. Greve, H. R. (2008). A behavioral theory of firm growth: Sequential attention to size and performance goals. Academy of Management Journal, 51, 476–494. Holmstrom, B. (1979). Moral hazard and observability. The Bell Journal of Economics, 10, 74–91. Ittner, C., & Larcker, D. (1998). Are nonfinancial measures leading indicators of financial performance? An analysis of customer satisfaction. Journal of Accounting Research, 36(Suppl.), 1–35. Ittner, C., & Larcker, D. (2003). Coming up short on nonfinancial performance measurement. Harvard Business Review, 81(11), 88–95. Ittner, C., & Larcker, D. (2009). The stock market’s pricing of customer satisfaction. Marketing Science, 28, 826–835. Jensen, M. C. (2001). Value maximization, stakeholder theory, and the corporate objective function. Journal of Applied Corporate Finance, 14(3), 8–21. Jordan, S. (2010). Learning to be surprised: How to foster reflective practice in a high-reliability context. Management Learning, 41(4), 390–412. Kaplan, R., & Norton, D. (1996). The balanced scorecard: Translating strategy into action. 
Boston: Harvard Business School Press.

La Porte, T. R. (1988). The United States air traffic system: Increasing reliability in the midst of rapid growth. In R. Mayntz & T. Hughes (Eds.), The development of large scale technical systems (pp. 215–244). Boulder, CO: Westview Press. La Porte, T. R., & Consolini, P. (1991). Working in practice but not in theory: Theoretical challenges of high reliability organizations. Journal of Public Administration Research and Theory, 1, 19–47. La Porte, T. R., Jr., & Rochlin, G. (1994). A rejoinder to Perrow. Journal of Contingencies and Crisis Management, 2, 221–227. Levitt, B., & March, J. G. (1988). Organizational learning. Annual Review of Sociology, 14, 319–340. Libuser, C. (1994). Organizational structure and risk mitigation (Unpublished doctoral dissertation). University of California, Los Angeles. Madsen, P. M. (2013). Perils and profits: A re-examination of the link between profitability and safety in U.S. aviation. Journal of Management, 39, 763–791. March, J. G. (1962). The business firm as a political coalition. Journal of Politics, 24, 662–678. March, J. G. (1991). Exploration and exploitation in organizational learning. Organization Science, 2, 71–87. March, J. G., & Shapira, Z. (1987). Managerial perspectives on risk and risk-taking. Management Science, 33(11), 1404–1418. March, J. G., & Shapira, Z. (1992). Variable risk preferences and the focus of attention. Psychological Review, 99(1), 172–183. March, J. G., & Simon, H. A. (1958). Organizations. New York: Wiley. Marcus, A. A., & Nichols, M. L. (1999). On the edge: Heeding the warnings of unusual events. Organization Science, 10(4), 482–499. Marcus, A. A., & Goodman, R. S. (1991). Victims and shareholders: The dilemmas of presenting corporate policy during a crisis. Academy of Management Journal, 34(2), 281–305. Meyer, M. W. (2002). Rethinking performance measurement: Beyond the balanced scorecard. Cambridge: Cambridge University Press. Miller, K. D., & Chen, W. (2004). Variable organizational risk preferences: Tests of the MarchShapira model. Academy of Management Journal, 47, 105–115. Parsons, T. (1956). A sociological approach to the theory of organizations. Administrative Science Quarterly, 1, 63 Ponseti International (2017). http://www.ponseti.info/about-us.html last accessed on Aug 28, 2017. Prendergast, C. (1999). The provision of incentives in firms. Journal of Economic Literature, 37, 7–63. Rerup, C. (2009). Attentional triangulation: Learning from unexpected rare crises. Organization Science, 20(5), 876–893. Roberts, K. H. (1990). Some characteristics of one type of high reliability organization. Organization Science, 1(2), 160–176. Roberts, K. H. (Ed.). (1993). New challenges to understanding organizations. New York: Macmillan. Roberts, K. H., & Bea, R. G. (2001). Must accidents happen: Lessons from high reliability organizations. Academy of Management Executive, 15, 70–79. Roberts, K. H., & Libuser, C. (1993). From Bhopal to banking: Organizational design can mitigate risk. Organizational Dynamics, 21(4), 15–28. Roberts, K. H., & Rousseau, D. M. (1989). Research in nearly failure-free, high-reliability organizations: Having the bubble. IEEE Transactions on Engineering Management, 36, 132–139. Rochlin, G. (1993). Defining high-reliability organizations in practice: A definitional prolegomenon. In K. H. Roberts (Ed.), New challenges to understanding organizations (pp. 11–32). New York: Macmillan. Schulman, P. R. (1993). The negotiated order of organizational reliability. Administration & Society, 25(3), 353–372.

Selznick, P. (1949). TVA and the grass roots: A study in the sociology of formal organization. Berkeley: University of California Press. Simon, H. A. (1957). Administrative behavior (2nd ed.). New York: Macmillan. Simon, H. A. (1964). On the concept of organizational goal. Administrative Science Quarterly, 9, 1–22. Smart, P. K., Tranfield, D., Deasley, P., Levene, R., Rowe, A., & Corley, J. (2003). Integrating “lean” and “high reliability” thinking. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture, 217(5), 733–739. Turner, B. A. (1978). Man-made disasters. London: Wykeham Science. Vogus, T., Rothman, N., Sutcliffe, K., & Weick, K. (2014). The affective foundations of high-reliability organizing. Journal of Organizational Behavior, 35(4), 592–596. Weick, K. E., & Roberts, K. H. (1993). Collective mind and organizational reliability: The case of flight operations on an aircraft carrier deck. Administrative Science Quarterly, 38, 357–381. Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for high reliability: Processes of collective mindfulness. In R. S. Sutton & B. M. Staw (Eds.), Research in organizational behavior (Vol. 1, pp. 81–124). Greenwich, CT: JAI. Wiseman, R. M., & Bromiley, P. (1996). Toward a model of risk in declining organizations: An empirical examination of risk, performance and decline. Organization Science, 7, 524–543.

Chapter 7

Organizational Learning as Reliability Enhancement
Peter M. Madsen

Introduction

When an organizational catastrophe occurs, a natural reaction among organizational members and stakeholders is to promise that the organization will learn from the event so that a similar catastrophe will never happen again. For example, following the 2006 Sago Mine disaster in Sago, West Virginia, in which an explosion killed twelve coal miners, US Secretary of Labor Elaine Chao vowed to “determine the cause of this tragedy and . . . take the necessary steps to ensure that this never happens again” (US Department of Labor, 2006). Despite such promises, however, organizational learning from disaster faces many challenges that render safety improvements highly uncertain. In response to large failures, for instance, determining accountability and punishing those found to be responsible for the failure often becomes the central motive of postfailure investigations. When assigning responsibility is seen as paramount, efforts to uncover the root causes of the failure and learn from them must necessarily be secondary (Sagan, 1993). Moreover, following a major fiasco, organizational members may be reluctant to share information about their involvement for fear of being blamed. Finally, due to the drive to assign responsibility for a large failure, organizational leaders may be less likely to undertake any significant learning or change efforts (Staw & Ross, 1987; Weick, 1984).

In HROs, learning to prevent disasters is critical. HROs must achieve "nearly error-free operations all the time because otherwise they are capable of experiencing catastrophes" (Weick & Roberts, 1993a, p. 357). This pressure has motivated HROs to develop strategies for learning how to avoid major failures as well as for learning from the few disasters that do occur. This chapter lays out these HRO learning approaches and discusses how HROs utilize them to better defend against disaster. Specifically, I examine and synthesize the HRO perspective on experiential learning, that is, direct learning from major failure experience, as a basis for future reliability improvement, and I discuss several other learning strategies that do not rely on direct failure experience, including vicarious learning, learning from near misses, and experimental and simulation learning. Although research from an HRO perspective, as well as other perspectives, has begun to examine the utility of all these strategies, there are still many open questions that warrant future work. Some avenues for further research are discussed at the end of the chapter.

Learning in HROs

An organization may be said to be high-risk when it operates complex technologies under such difficult environmental conditions that it is at risk of experiencing catastrophic accidents (Perrow, 1984; Shrivastava, 1987; Vaughan, 1996). Barry Turner (1978; also see Pidgeon & O’Leary, 2000; Turner & Pidgeon, 1997) suggests a six-step model of how disasters occur in high-risk organizations called the disaster incubation model (DIM). DIM holds that organizational disasters (man-made disasters in Turner’s terminology) develop through six stages: (1) starting point, (2) incubation period, (3) precipitating event, (4) onset, (5) rescue and salvage, and (6) full cultural readjustment. The various stages in Turner’s DIM highlight different sources of information that organizational members may learn from and different opportunities and challenges to such learning. This being the case, I adopt DIM as a central and unifying perspective on learning in HROs in this chapter. I introduce the model in the following paragraphs, then, as the various HRO learning approaches are discussed in following sections, I place them into the context of the DIM process. The DIM starting point is the promulgation of an accepted model of the hazards that an organization faces and how these hazards may be managed well

enough that negative consequences may be avoided. This disaster-­mitigation model consists of organizational members’ mental models of what is required for safe organizational operation as well as organizational standard operating procedures, rules and codes designed to promote safety and related laws and regulations. During the incubation period, small latent errors occur both because actual organizational operations may fail to follow the organization’s disaster-­mitigation model completely and because the model maps imperfectly onto the true requirements of the organizational environment. Such misalignment between reality and organizational understandings of what is required to prevent disaster may occur because of failure to adjust the model to environmental change, because of gradual alterations (or “shift”) to the model itself, or because the model never fit well with task requirements in the first place (see Starbuck & Milliken, 1988; Vaughan, 1996). Despite the possibility that minor accidents indicate fundamental shortcomings of the organizational disaster-mitigation model, all such errors are typically interpreted as indicators that this model is not being adequately implemented or that minor adjustments to it are required (see Carroll, Rudolph, & Hatakenaka, 2002; Turner, 1976). Thus, latent errors accumulate in the organizational system during the incubation stage. The precipitating event is a minor organizational error or novel environmental condition that exposes the weaknesses of the extant disaster-mitigation model and initiates a full-scale organizational disaster. This process of disaster initiation usually occurs through the interaction of the precipitating event with one or more of the latent errors that have crept into the system as well as hazards extant in the organizational environment. The fourth step, onset, represents the initiation of the disaster itself and the development of the disaster. The fifth step, rescue and salvage, is the organization’s initial response in the immediate aftermath of the disaster, including clean-up efforts, providing aid to victims, and recovery efforts. During Turner’s final stage, full cultural readjustment, organizational members and outsiders analyze the disaster, assign it one or more “causes,” and adjust the organization’s disaster-mitigation model accordingly. In essence the full cultural readjustment phase encompasses the process through which organizational participants come to terms with the deficiencies in their prior understandings of the hazards facing the organization and their approaches to mitigating those hazards. During this process organizational members and

stakeholders develop a new consensus on the true nature of the hazards in the organization’s environment and what changes need to be made to the organization’s policies and practices to better manage those hazards going forward. Postdisaster investigations and the development and implementation of new safety programs based on those investigations occur during this stage. The adjusted disaster-mitigation model developed during the full cultural readjustment phase then becomes a new starting point, and the cycle repeats itself. Karlene Roberts and Denise Rousseau (1989) argue that “high-reliability organizations are a subset of high-risk organizations designed and managed to avoid such accidents” (p. 132; also see Waller & Roberts, 2003; Vogus & Welbourne, 2003). Like other high-hazard organizations, HROs typically display four key characteristics (Perrow, 1984; Roberts & Rousseau, 1989, 1990; Rochlin, La Porte, & Roberts, 1987). First, HROs are complex systems in the sense that they are made up of a wide variety of technical components, people, and organizational structures. Second, HROs are characterized by tight coupling among these various system elements such that they interrelate reciprocally in time-dependent manners with very little slack in the process. Third, HROs are made up of many organizational levels populated by many different decision makers who interact in complex communication networks. Fourth, HROs pursue more than one primary operating objective simultaneously. Because of these distinctive characteristics, organizational learning poses unique challenges in HROs. Indeed, HROs are differentiated from most other organizations to the extent that Roberts (1990) questions whether mainstream organization studies literatures can meaningfully be applied to HROs “addressing phenomena at least partly derived from precisely the opposite conditions” (p. 161; but also see Waller & Roberts, 2003). These differences notwithstanding, in this chapter I draw on and integrate both the HRO literature and related literatures on organizational learning and disaster prevention in examining learning in and by HROs. The early HRO literature discounted the importance of learning (particularly learning from experience) to HROs because HROs were seen as organizations that couldn’t afford to experience failure in the first place (e.g., Rochlin et al., 1987). However, later HRO research identified the importance of various forms of learning to the production of high reliability operations (La Porte & Consolini, 1991; Ramanujam & Goodman, 2003; Roberts, Stout, &

Halpern, 1994). As I have already pointed out, learning in HROs takes several forms, including experiential learning, vicarious learning, learning from near misses, and experimental and simulation learning (see Sterman, 1994). In the following sections, I discuss each of these forms of organizational learning in turn, integrating their use in HROs with the various stages of the DIM to illustrate how they interrelate.
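Because the DIM frames the rest of this chapter, a compact representation of its six stages may help readers keep the sequence in mind. The sketch below, in Python, simply encodes the stages and the cyclical step from full cultural readjustment back to a new starting point; it is an illustrative data structure of my own, not part of Turner's formulation.

from enum import Enum

class DIMStage(Enum):
    """Turner's six disaster incubation model stages, in order."""
    STARTING_POINT = 1              # an accepted disaster-mitigation model is in place
    INCUBATION_PERIOD = 2           # latent errors accumulate unnoticed
    PRECIPITATING_EVENT = 3         # a minor error or novel condition exposes the model's weaknesses
    ONSET = 4                       # the disaster itself unfolds
    RESCUE_AND_SALVAGE = 5          # immediate response, aid, and recovery
    FULL_CULTURAL_READJUSTMENT = 6  # the mitigation model is reexamined and revised

def next_stage(stage: DIMStage) -> DIMStage:
    """After full cultural readjustment, the revised model becomes a new starting point."""
    return DIMStage((stage.value % 6) + 1)

assert next_stage(DIMStage.FULL_CULTURAL_READJUSTMENT) is DIMStage.STARTING_POINT

The point of the cycle in the final line is the one made above: the adjusted disaster-mitigation model produced in stage 6 is itself the starting point for the next incubation period.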

Experiential Learning

HRO research has coevolved with a related perspective on organizational disasters—normal accidents theory (NAT). Although taking a similar perspective as HRO work on the factors that generate disastrous failures in organizations, NAT takes the contrary perspective that because high-risk organizations are so complex and tightly coupled, serious accidents are inevitable (Perrow, 1984; Sagan, 1993; Vaughan, 2005). As part of this argument, NAT suggests that it is virtually impossible for organizations to learn effectively from major accidents that they experience (Sagan, 1993; Vaughan, 2005). It holds that since serious accidents are rare, they do not provide enough information to allow organizations to understand the complex environments and technologies involved in the production of the accident (Baumard & Starbuck, 2005; Levitt & March, 1988; Vaughan, 1999). Furthermore, because organizational participants interpret the nonoccurrence of disasters as evidence that the level of organizational resources devoted to safety may be reduced, organizational defenses against serious accidents quickly degrade (Marcus & Nichols, 1999; Starbuck & Milliken, 1988). The NAT position on learning also suggests that disasters are caused by unexpected interactions between multiple components of hazardous sociotechnical systems. Because such systems are large, complex, and tightly coupled, any safeguards put in place to prevent the interaction that initiated one disaster will not provide protection against many other potentially disastrous interactions and, consequently, will not increase overall system reliability (Perrow, 1984; Rijpma, 1997; Vaughan, 1996). Furthermore, because major accidents are highly visible and highly emotional events, a significant (often unstated) purpose of virtually all postaccident investigations is to assign responsibility for the calamity (Sagan, 1994). As a result, such investigations become more

political than scientific, with involved parties directing more effort toward protecting themselves from fallout than toward discovering how future accidents may be avoided (Edmondson, 1996; Perrow, 1984; Sagan, 1994; Vaughan, 1996). From this perspective it appears unlikely that organizations could learn from experience with disaster to prevent future disasters. But as with other aspects of organizational reliability, the HRO literature disagrees with the NAT perspective on learning, arguing that organizations (when designed and managed effectively) are capable of learning from disasters despite these difficulties (Carroll, 1997; Carroll et al., 2002; La Porte & Consolini, 1991). HRO scholars suggest that firsthand experience with disaster teaches some organizations to watch vigilantly for all unexpected events, not just those events that brought about the initial disaster (Rochlin et al., 1987). And this vigilance, coupled with improvisation strategies that allow organizational participants to respond effectively to novel situations, allows these organizations to operate for long periods of time without experiencing serious accidents (March, Sproull, & Tamuz, 1991; Rerup & Feldman, 2011; Roberts, 1990; Weick & Roberts, 1993a; Weick, Sutcliffe, & Obstfeld, 1999). Furthermore, some organizations retain the lessons learned from experience with disaster over very long periods of time, resisting the tendency to dismantle safety programs after accident-free periods (Starbuck, 1988). For example, the US Navy credits the strong safety record of its nuclear submarine program over the past fifty years to the lessons learned from two serious accidents that occurred in the mid-1960s (Columbia Accident Investigation Board, 2003). In DIM terminology, HRO theorists suggest that HROs are able to learn from disasters first by taking full advantage of full cultural readjustment (stage 6) to update their safety models to correctly reflect the realities of the hazards they face. In other words, members of HROs seem to be able to learn richly from the very small sample of disasters they do experience and, in doing so, bypass the political processes that NAT suggests can make learning from disaster impossible. The noted emphasis in HROs on operations as opposed to other organizational functions (Weick et al., 1999) may facilitate such dramatic reimagining of organizational safety models and processes following disaster. At the same time, HROs are able to learn from experience with disaster to enhance their ability to identify and fix latent errors that creep into their systems during the incubation period (stage 2) in order to reduce the

likelihood that unexpected interactions among latent errors and precipitating events will generate future disasters (Weick et al., 1999). HROs are able to maintain this additional vigilance for seeking and fixing latent errors for a very long time following the disaster experience. For example, members of the US Navy aircraft carrier community still speak regularly of a massive fire that occurred onboard the USS Forrestal in July 1967, which resulted in 134 fatalities, as a reminder of the constant need to remain vigilant for safety hazards (Roberts & Bea, 2001). Although HROs are certainly not immune to the politics and self-preservation that can make learning from disaster experience so challenging, they seem to be able to deal with those challenges enough to permit genuine experiential learning. The small amount of existing empirical work that examines organizational learning from disasters lends support to the HRO view of learning. For example, Pamela Haunschild and Bilian Sullivan (2002) find that prior airline accident experience decreases future accident rates in large US airlines. Peter Madsen (2009) reports similar findings of learning from coal mining disasters by extraction companies. And Madsen and Vinit Desai (2010) find that orbital launch vehicle organizations learn from failed launch attempts to improve their future reliability rates. These findings certainly don't suggest that learning from experience with disaster occurs automatically or easily, but they do indicate that such learning is possible. Moreover, empirical work on the rate at which organizational knowledge derived from failure is forgotten in organizations suggests that the lessons learned from even severe failures are lost relatively quickly in most organizations. For example, Madsen (2009) found that immediately following a fatal accident, the likelihood that a coal mine would experience another fatal accident dropped sharply but that after four or five years, the fatal accident risk had returned to its previous level. Qualitative work looking at HROs suggests that the rate of forgetting the lessons learned through disaster is dramatically lower than this in HROs—even approaching zero in some cases (Madsen et al., 2006; Roberts & Bea, 2001).

Shamsie, & Shapira, 2009). Second, members of HROs are able to retain this knowledge learned through experience with disaster over long periods of time and use it both to guard against the specific conditions that produced the disaster and to maintain high levels of vigilance for other conditions and small errors that could interact in novel ways to create new disasters in the future.
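The pattern Madsen (2009) reports, in which the post-accident reduction in fatal-accident risk fades over roughly four to five years, can be pictured as a simple knowledge-depreciation curve. The sketch below is a stylized illustration with invented parameter values, not a reproduction of Madsen's statistical model; it merely contrasts a typical organization, whose learning benefit decays, with an idealized HRO whose forgetting rate approaches zero.

import math

def accident_risk(years_since_accident, baseline=1.0, initial_reduction=0.6, decay_rate=0.5):
    """Stylized hazard rate after a serious accident: the learning benefit
    (initial_reduction) depreciates exponentially at decay_rate per year."""
    return baseline * (1 - initial_reduction * math.exp(-decay_rate * years_since_accident))

# Hypothetical parameters: a typical organization forgets quickly (decay_rate=0.5);
# an idealized HRO barely forgets at all (decay_rate=0.01).
for years in (0, 1, 3, 5):
    typical = accident_risk(years, decay_rate=0.5)
    hro = accident_risk(years, decay_rate=0.01)
    print(f"year {years}: typical={typical:.2f}, HRO={hro:.2f}")
# With these made-up values the typical organization is back near its baseline
# risk after about five years, echoing the pattern Madsen describes, while the
# low-forgetting organization retains most of the benefit.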

Vicarious Learning

Because HROs so rarely experience disasters firsthand (Roberts, 1990), a strategy of learning directly from experience alone would yield relatively little new knowledge. To overcome this dearth of firsthand experience, members of HROs expend considerable effort attempting to learn vicariously from disasters experienced by other organizations (Schulman, 1993; Weick & Roberts, 1993b). Perhaps no group of organizations illustrates this ability to learn vicariously more clearly than does the US nuclear power industry. The nuclear power plant is the archetypal example of an organization that is both enormously complex and very tightly coupled. As such, the 1979 Three Mile Island accident, and the characteristics of the plant at which it occurred, formed the genesis of Perrow’s (1984) NAT. However, recognizing the significance of the risks that their organizations face, members of the US nuclear power industry have developed processes for learning from each other and from nuclear power operations in other countries in the hope of improving their reliability and ability to prevent disaster. For example, in the wake of the Three Mile Island accident (just two weeks after the event), members of the US nuclear power industry created INPO, a private organization funded by the industry. It acts as a private regulator of sorts as well as distributes best practices and lessons learned from negative events across the industry. INPO was created as an exercise in vicarious learning—­ specifically tasked with helping the industry learn both from Three Mile Island and from the enviable safety record of the US Navy’s nuclear program. Indeed, the founding CEO of INPO was a retired navy admiral (Blandford & May, 2012). The lessons learned by other nuclear operators from Three Mile Island as well as the best practices imported from the navy’s nuclear operations precipitated substantial improvements in safety and reliability for the US nuclear power industry as a whole (David, Maude-Griffin, & Rothwell, 1996),

demonstrating the value of vicarious learning for reliability enhancement. Paul David, Roland Maude-Griffin, and Geoffrey Rothwell (1996) examine this improvement empirically, treating the occurrence of Three Mile Island as a natural experiment. They find that indicators of unreliability, such as unplanned nuclear power plant outages, occurred significantly less frequently in the United States after the Three Mile Island incident, indicating meaningful vicarious learning from the disaster. The US nuclear power industry, in connection with INPO, likewise undertook major learning efforts following the Chernobyl disaster in 1986 and the Fukushima disaster in 2011—the two worst nuclear power accidents in history. The Chernobyl disaster in particular would have been easy for US-based operators to dismiss as a learning opportunity, given significant technological differences between Chernobyl and US plants (Kletz, 2001). However, the US nuclear power industry, largely through INPO, investigated the Chernobyl disaster in depth and derived a set of lessons learned that applied across this technology difference—including the need for enhanced operator training on emergency situations and the importance of responding quickly to precursor events that do no damage themselves but create hazardous conditions (Blandford & May, 2012). Similarly, the US nuclear power industry has undertaken multiple investigations of the Fukushima disaster with the aim of learning to improve the safety of nuclear plants in the United States. Some of the main findings from these investigations have been the importance of designing for low-frequency external threats such as natural disasters and preparing for emergencies during which external power is not available (Miller et al., 2011). In the US nuclear power industry case, vicarious learning is facilitated by technical and organizational similarities between the organizations that experienced the disasters and those that drew lessons vicariously from them. Even in the face of technological differences between American and Soviet nuclear power systems, the fundamental elements of the Chernobyl plant were similar to those of US nuclear power plants. However, members of HROs have also gone beyond learning vicariously from similar organizations in the same industry, searching for lessons from very different organizations that, nonetheless, may enhance reliability. For example, Madsen and colleagues (2006) report the case of a PICU whose reliability in patient outcomes was outstanding relative to its peers over several years. The PICU’s founding ­physician-director was a former

navy fighter pilot. The director attributed the PICU’s exemplary performance, in part, to lessons in reliable operations that he had learned through his experience in naval aviation and imported to the medical domain. Specifically, in the navy the PICU director had learned the value of making safety and reliability the responsibility of every member of an organization—including frontline, lower-level members. Drawing on this lesson, the director instituted virtually unprecedented levels of authority over patient care decisions for nurses and medical technicians. This decentralization of patient care decision making permitted very rapid responses to emergencies, which led to better patient outcomes. Such borrowing of lessons from aviation for use in improving reliability in health-care settings is becoming more common. A recent review identified twenty-five studies that tested the usefulness of aviation practices in health-care organizations (Wauben, Lange, & Goossens, 2012). The bulk of this work reports positive effects of this cross-industry vicarious learning. HROs commonly borrow lessons from one domain for application in another, and many maintain libraries of information about disasters in and across industries. In a few cases, observing disasters from other domains allows members of an HRO to identify concrete similarities between the latent errors that generated the disaster and vulnerabilities in their own organization. More often, however, the value of such vicarious examinations of disasters experienced elsewhere seems to come in reminding members of the HRO that small latent errors can interact in unexpected and unpredictable ways and thus encourage them to maintain vigilance in searching for and fixing latent errors in their own operations. In DIM terminology it seems that vicarious learning allows members of HROs to enhance reliability in two ways. First, vicarious observation of others’ disasters may produce a sort of “cultural readjustment” (DIM stage 6). The extent to which this readjustment of safety models is undertaken varies. In the example of the US nuclear power industry’s response to the Three Mile Island meltdown, the disaster prompted other nuclear operators to engage in nearly as deep and full of a cultural readjustment as if they had experienced the disaster firsthand in the sense that they rethought their entire approach to safety and reliability as a result of the accident. In other cases the readjustment may be much smaller in scale, representing perhaps only a tweak in the applicable safety model. Even so, repeated small improvements to an organiza-

tion’s safety model over time in response to vicarious learning can significantly improve the organization’s reliability (Madsen, 2009). The second DIM process that may be enhanced by vicarious learning is the incubation period. During this second stage, latent errors creep into the organizational system and build up over time unobserved. However, vicarious learning from other organizations may facilitate the vigilance necessary for members of organizations to identify and correct many of the latent errors that occur in any organization. By correcting latent errors in a timely fashion before they interact with each other and with precipitating events, organizations may be able to extend the incubation period and stave off the onset stage (DIM stage 4) indefinitely, or at least for a very long time (Roberts, 1990).

Learning from Small Failures and Near Misses

In addition to attempting to learn from disasters and serious accidents (whether experienced by their own organizations or others), members of HROs meticulously search for lessons from errors and small failures that their organizations experience in the hope of finding ways to enhance reliability (Vogus, Rothman, Sutcliffe, & Weick, 2014). In fact, Karl Weick, Kathleen Sutcliffe, and David Obstfeld (1999) argue that one of the defining characteristics of HROs is their members’ “preoccupation with failure” (p. 39). Weick and colleagues define such preoccupation with failure as an almost obsessive desire to find failures in an organization’s operations, no matter how small, with an aim of correcting such failures before they do damage to the organization. This drive to find even small, obscure failures is a unique aspect of HROs because most organizations tend toward the opposite approach of ignoring small failures until they germinate into large failures (see Weick, 2015). Examples of this tendency abound and include such well-known organizational disasters as the Challenger and Columbia tragedies (Vaughan, 2003), the Air France–­ Concorde crash (Dillon, Tinsley, Madsen, & Rogers, 2016), and the BP Deepwater Horizon explosion (Tinsley, Dillon, & Madsen, 2011). The HRO approach toward learning from small failures lines up with Sim Sitkin’s (1992) suggestion that organizational leaders should place an emphasis on learning from small failures rather than from large ones—a practice he labels the “strategy of small losses.” The “small losses” perspective holds that

following any failure, organizational leaders are motivated by two often competing motives: to assign responsibility for the failure so that those deemed at fault may be held accountable for it, and to discover the root causes of the failure so that organizational members can learn to prevent similar failures in the future (Sagan, 1993; Sitkin, 1992; Weick, 1984). Sitkin (1992) argues that small failures do not threaten the identity of the organization and that, consequently, organizational leaders are not powerfully motivated to determine accountability. As a result, organizational responses to the failure center on searching for novel solutions to its underlying causes so that future failures may be mitigated. The validity of the strategy of small losses is supported by a significant body of theoretical and empirical research demonstrating the difficulty of organizational learning from large failures and the benefits of learning from small failures (Hayward, 2002; Perrow, 1984; Sagan, 1993; Staw & Ross, 1987). For example, Mathew Hayward (2002) examined organizational learning from acquisition experience and found that firms learned more from their prior acquisitions that had resulted in small losses than they did from prior acquisitions that had produced any other type of performance. Similar findings on the value of learning from small failures come from HRO case studies. For example, Paul Schulman's (1993) study of the Diablo Canyon nuclear power plant describes a persistent wariness among plant employees about the possibility that an unexpected chain of events would generate a serious accident. This wariness pushes plant employees to view any unexpected events and small failures in the plant as signs of possible pathways through which this unexpected chain could occur. Their hope is that by learning from these small events, they can disrupt the chain that could produce disaster (for similar findings in nuclear power plants, also see Carroll, 1997). Similarly, Martin Landau and Donald Chisholm (1995) describe how much value is placed on identifying small operational failures on a US Navy aircraft carrier, where such small errors, if not corrected, could easily cascade into catastrophe. Indeed, Mark Cannon and Amy Edmondson (2001) hypothesized that higher-performing nursing teams would report fewer small errors than their lower-performing peers but were surprised to find that higher-performing teams actually reported more small errors—indicating that the identification of, and learning from, small errors may be a key to enhanced reliability performance. Although Sitkin's (1992) original articulation of the small losses approach did not deal with near misses, the small losses logic applies to near misses as

well. A near miss may be defined as an “event that could have had bad consequences, but did not” (Reason, 1990, p. 118). In other words, near misses are events during which hazardous conditions were present but did not interact with a precipitating event in such a way that a significant failure was generated (Dillon & Tinsley, 2008). Because major organizational failures are usually caused by the chance interaction of multiple hazards and latent errors with a precipitating event (Perrow, 1986; Turner, 1978), the preconditions that produce near misses are very similar to those that generate outright failures. But the outcomes of near misses are typically much less costly than those of outright failures. Near misses have been referred to as “free lessons” because they provide opportunities to correct hazardous conditions that could cause future failures (Reason, 1997, p. 119). Since near misses do not result in costly losses, near-miss events should not put into motion political and defensive processes that reduce the effectiveness of organizational learning efforts (Sitkin, 1992). Opportunities for learning from near misses are not only less costly and less threatening than learning from larger failures, but they are also much more numerous. Estimates suggest that between several hundred to several thousand near misses occur in an organization for each major accident (Bird & Germain, 1996; Kletz, 2001). For example, medical researchers estimate that physicians make significant errors in 10 to 15 percent of all medical diagnoses; but the vast majority of these errors do not result in any harm to patients (Graber, 2005; Shojania, Burton, McDonald, & Goldman, 2003). The frequency of near misses stands in contrast to the infrequency of major organizational disasters, which happen so rarely in HROs (as discussed earlier) that it is very difficult for organizational decision makers to be certain that any conclusions drawn from one failure will generalize to other situations and settings (March et al., 1991). Thus, because near misses are generated by the same conditions that lead to large failures, if organizational decision makers could identify and correct hazardous conditions through experiencing and learning from near misses, they may be able to reduce the likelihood that their organizations would experience major failures in the future. Because of this potential for extremely low-cost learning from near misses, near-miss identification, codification, and learning has become a high priority in many domains. In recent decades, for example, organizations and regulators in several high-hazard industries have implemented near-miss learning systems, including in chemical processing,

commercial aviation, medicine, and nuclear power industries, among others (Barach & Small, 2000; Tamuz, 1994). One such system, the FAA's Aviation Safety Reporting System (ASRS), receives more than 50,000 reports of near-miss events each year (Aviation Safety Reporting System, 2009). In DIM terminology, organizational learning from small losses and near misses allows members of organizations to find latent errors that have accumulated in their systems during the incubation period without the high costs of a disaster. If members of organizations can consistently eliminate such latent errors, they can starve potential disasters of the fuel that they need in order to materialize, and thus allow their organizations to remain in the incubation stage indefinitely.
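The base-rate arithmetic behind this argument is worth making explicit. Using the rough ratios cited above (several hundred to several thousand near misses per major accident), the sketch below shows how many low-cost learning opportunities a near-miss reporting system can surface for every costly failure. The specific figures are illustrative assumptions, not estimates for any particular organization or reporting system.

def learning_opportunities(major_accidents_per_year, near_misses_per_accident, capture_rate):
    """Expected reported near-miss 'free lessons' per year, given how many near
    misses precede each major accident and what share of them the reporting
    system actually captures."""
    return major_accidents_per_year * near_misses_per_accident * capture_rate

# Illustrative assumptions: one major accident per decade, roughly 1,000 near
# misses per accident, and a reporting system that captures 40 percent of them.
per_year = learning_opportunities(major_accidents_per_year=0.1,
                                  near_misses_per_accident=1000,
                                  capture_rate=0.4)
print(per_year)  # 40.0 reportable near misses a year versus about 0.1 accidents

Even under these modest assumptions, near misses supply hundreds of times more learning opportunities than accidents do, which is the practical rationale for the reporting systems described above.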

Experimental and Simulation Learning

Because HROs experience very few large-scale accidents themselves, their members frequently resort to alternatives to experiential learning in the attempt to improve their ability to prevent disaster. One approach to learning from disaster in the absence of an actual disaster is experimental learning, whereby organizations, systems, and processes are consciously pushed beyond normal operating levels in the attempt to learn about how they perform under extreme conditions (Garvin, 2000). An example of experimental learning in an HRO setting is the experimental process through which NASA determined that a foam strike was responsible for the breakup of the Columbia orbiter during its return to Earth in 2003. Initially after the event, engineers were skeptical that foam from the external tank could break through the reinforced carbon-carbon panels that made up the leading edge of Columbia's wings. But an experiment involving firing foam insulation at a reinforced carbon-carbon panel demonstrated that, at high speeds, the foam was capable of doing just that. In many HRO settings, direct experimentation on real systems is not possible due to the risks of bringing high-hazard systems to extreme conditions (the 1986 Chernobyl accident was partially the result of an experiment). This being the case, members of organizations often experiment by simulating disasters in the hope of learning how to prevent real ones. Simulation of failure scenarios is an additional demonstration of HROs' preoccupation with failure (Weick et al., 1999). Because members of HROs are so concerned with failure,

when genuine failure cannot be found, they create simulated failure. The purpose of these simulations is to help organizational members learn both how to prevent real failures and how to respond effectively when a failure does happen such that the effects of the failure may be minimized. Simulation learning has many potential benefits for organizations. First, failures may be simulated much more frequently than they occur naturally, allowing members of an organization to gain experience dealing with events that they may not have ever seen firsthand. Second, simulation learning allows organizational members to learn about hazardous situations under conditions where mistakes will not lead to costly outcomes. For example, due to increasing societal concern with medical error, medical schools are relying more and more on simulation rather than supervised patient care to give medical students their initial medical training (McLaughlin et al., 2008). Third, simulations may explore an infinite variety of hazards, conditions, and errors, allowing organizational members to test uncommon conditions and interactions among system elements and environmental hazards. Simulation learning facilitates HROs’ reluctance to simplify reality (Weick et al., 1999) in that it allows organizational participants to consider the myriad different ways in which failures may occur. Such experience can be extremely beneficial when organizational members find themselves in an uncommon situation. What has become known as the “Miracle on the Hudson” illustrates how prior simulated situations came to bear on an emergency situation. When US Airways flight 1549 hit a flock of geese during takeoff from LaGuardia Airport on January 15, 2009, it lost engine power and was forced to make a very difficult emergency landing on the Hudson River. Pilot Chesley “Sully” Sullenberger and his flight crew safely landed the Airbus A320 on the river with no loss of life. Of the difficult landing with no engine power, Sullenberger later told an interviewer that he recalled “on a number of occasions attempting [similar landings] in the simulator under visual conditions” (Shiner, 2009). Thus, the water landing that NTSB board member Kitty Higgins referred to as “the most successful ditching in aviation history” may not have been carried off without simulation learning (Olshan, 2009). Because of these benefits, simulation learning has become a major thrust in many high-hazard industries in recent decades. Simulation training was pioneered in the aviation industry and remains an important element of ­pilot

training, as evidenced by the US Airways flight 1549 example (Allerton, 2010). Simulation learning is also a staple of operator training in the nuclear power and chemical process industries, where it has been credited with significant ongoing improvements in reliability (Hoffman, 2009; Zhe, Jun, Chunlin, & Yuanle, 2014). More recently, simulation learning has become important in reliability enhancement efforts in health care (McLaughlin et al., 2008). In fact, the bulk of the recent research on the value of simulation learning has been conducted in health-care settings (see Christianson, Sutcliffe, Miller, & Iwashyna, 2011). Such use of simulation learning as a training method for members of organizations that manage hazardous technologies and environments syncs well with the HRO literature's view of the importance of training for maintaining reliability. Indeed, Carolyn Libuser (1994) includes ongoing training of personnel in her typology of the fundamental characteristics of HROs. Simulation learning may play an important part in such training. Despite the major potential benefits of simulation learning for reliability enhancement, concerns exist regarding whether simulation of failure can lead to genuine learning about how to prevent real failures. A major question researchers ask is whether learning produced through simulated experience with disasters translates to real-world disasters (Ford & Schmidt, 2000; Gonzalez & Brunstein, 2009). This question is a key issue in the training and development field (Aguinis & Kraiger, 2009). In this domain, a common approach to evaluating the effectiveness of a training program is to employ Kirkpatrick's hierarchy (Kirkpatrick & Kirkpatrick, 1994, 2005), which suggests four different hierarchical levels of learning from training: reaction, knowledge, behavior, and results. Reaction refers to the attitudes of those trained toward the training; knowledge refers to the acquisition of declarative knowledge by participants through the training; behavior refers to behavioral changes exhibited by trainees as a result of the training; and results refers to changes (typically improvements) in the performance results obtained by participants as a result of the training. Evidence for the effectiveness of simulation learning on reliability enhancement may be characterized according to the levels of Kirkpatrick's hierarchy. The use of simulation training for disaster prevention and response is widespread—common in both private organizations and governments (Perry & Lindell, 2003). Such simulations are usually conducted live but also may be

computer-based (Gonzales & Brunstein, 2009). Yet there is surprisingly little research examining simulation training’s effectiveness—particularly at the highest levels of Kirkpatrick’s hierarchy. The lowest level of the hierarchy, reaction, is the most widely studied in the context of emergency and disaster simulation. The most common finding in this work is that participants are nearly universally positive about their experience with the simulations they are involved in (for reviews, see Khanduja et al., 2015; McLaughlin et al., 2008; Okuda et al., 2009). Simulation training provides an enjoyable experience that participants perceive to be valuable. The evidence of learning through simulation at the second level of the hierarchy, knowledge, is also compelling. Many studies examine the value of simulation learning in emergency contexts for the acquisition and retention of declarative knowledge related to disaster prevention and response. The bulk of these studies consider simulation training alone, but some also report control groups trained using other training techniques (for reviews, see Allerton, 2010; Hoffman, 2009; Khanduja et al., 2015; Okuda et al., 2009). The results of this work show that, in the vast majority of cases, simulation training is effective for the acquisition and retention of declarative knowledge and that simulation training is usually more effective for both learning and retention than are other training techniques such as classroom-based instruction. Again, simulation learning appears to be highly effective in helping organizational members acquire knowledge related to reliability enhancement. Empirical studies examining the third level of Kirkpatrick’s hierarchy, behavior, are much less common than those looking at the first two levels, perhaps due to the difficulty of observing behavioral changes in real-world settings. Only a handful of studies look at the link between emergency or disaster simulation learning and genuine post-training behavior differences. And these studies report divergent findings. On the one hand, Susan Steineman and colleagues (2011) find that simulation training given to members of a trauma team in a health-care setting resulted in improved teamwork behaviors in emergency situations. But on the other hand, Marc Shapiro and colleagues (2004) and Jeffrey Cooper and colleagues (2008) report no significant differences in work behaviors in emergency health-care situations following simulation training. These mixed results almost certainly reflect the need for additional work in this area, as very few studies have examined the issue.

A larger number of studies look at behavioral change in emergency simulations following simulation training. And this work generally finds positive effects of simulation training on behaviors exhibited during emergency simulations (for a review, see Okuda et al., 2009). The highest level of the Kirkpatrick hierarchy, results, is the most difficult to study but the most relevant for reliability enhancement. Recent reviews of the use of simulation learning in hazardous industries have failed to identify any examples of empirical tests of the effects of simulation learning on actual performance in preventing or responding to emergency situations (Allerton, 2010; Khanduja et al., 2015). This is a key gap in the empirical literature where future work could make a very valuable contribution. Organizations in many industries rely heavily on simulation training for disaster prevention and preparation. Empirical work demonstrating the efficacy of such training on actual operational results could help establish the value of simulation learning. Such work could also help identify what characteristics of simulation training lead to the best outcomes. While there are suggestions in the literature that simulation learning is more effective when it is high fidelity and correctly represents the full complexity of high-hazard activities (e.g., Malakis & Kontogiannis, 2012), these suggestions have yet to be tested empirically. Simulation learning has the potential to improve organizational reliability across virtually all six of the DIM stages. In particular, simulation training is typically used to help organizational members prepare to intercede between the precipitating event (stage 3) and disaster onset (stage 4) and to react to disaster during rescue and salvage (stage 5). In the first context, organizations can use simulation to anticipate and prepare for a wide variety of possible latent errors, precipitating events, and failure pathways. In the latter context, simulation learning allows organizational members to prepare for many different types of potential failure and disasters and become familiar with the interorganizational coordination that is necessary during disaster response.
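One way to keep the evidence reviewed above organized is to tally findings by Kirkpatrick level. The sketch below encodes the four levels and a trivial summary over a set of hypothetical study records; the study entries are placeholders for illustration, not codings of the actual studies cited in this section.

from collections import Counter
from enum import IntEnum

class KirkpatrickLevel(IntEnum):
    """Kirkpatrick's four levels of training evaluation, lowest to highest."""
    REACTION = 1   # trainees' attitudes toward the training
    KNOWLEDGE = 2  # acquisition of declarative knowledge
    BEHAVIOR = 3   # changes in on-the-job behavior
    RESULTS = 4    # changes in actual performance outcomes

# Hypothetical study records: (level evaluated, did the study find a positive effect?)
studies = [
    (KirkpatrickLevel.REACTION, True),
    (KirkpatrickLevel.REACTION, True),
    (KirkpatrickLevel.KNOWLEDGE, True),
    (KirkpatrickLevel.BEHAVIOR, True),
    (KirkpatrickLevel.BEHAVIOR, False),
    # No RESULTS-level entries, mirroring the gap noted in the text.
]

evidence = Counter(level for level, positive in studies if positive)
for level in KirkpatrickLevel:
    print(f"{level.name:<9} positive findings: {evidence.get(level, 0)}")

A tally of this kind makes the empirical gap visible at a glance: evidence thins out as one moves up the hierarchy and disappears at the results level, which is exactly where reliability researchers most need it.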

Conclusions

Although in many ways learning appears antithetical to the notion of HROs as organizations that are able to stave off disaster almost indefinitely (Roberts, 1990), in reality HROs engage in several different forms of learning that facilitate their members’ ability to perform reliably. The same types

of learning may enable any organization operating in hazardous conditions to enhance its reliability. Organizational disasters are rare enough events that members of organizations have few opportunities to learn firsthand about the causal pathways that produce disaster or about effective strategies for responding to the onset of disaster. In the face of these difficulties, members of HROs engage in several learning approaches. First, members of HROs attempt to learn richly from the few disasters that they do experience. They do this by collecting as much information as possible about the disaster, the conditions and latent errors that allowed it to incubate, and the events that precipitated it. Such rich information collection requires all members of the organization to be committed to learning rather than to assigning or avoiding blame. Even when members of HROs engage in deep learning from direct experience with disaster, the rarity with which they encounter disaster directly requires additional forms of learning, such as vicarious learning. Vicarious learning strategies take advantage of the fact that although major accidents are rare for any individual organization, they occur regularly across a large population of organizations. Members of HROs use serious accidents experienced by other organizations to supplement their knowledge of failure modes and techniques for identifying and preventing them. Similarly, members of HROs look to expand their understanding of the systems they operate and the environments that they operate in by observing and learning from small failures and near misses. These events are easily ignored because they don't produce large negative consequences. But because of their emphasis on understanding failure, members of HROs see small errors as low-cost opportunities to learn. Small failures and near misses are typically produced by the same sort of interaction among latent errors and precipitating events that can generate major failures. So, when organizational members investigate and correct the causes of these small events, they remove ingredients for disaster from their organizational systems, reducing the likelihood that an unexpected interaction will cascade into disaster. Thus, learning from small failure events and near misses can ultimately elongate the incubation stage of the DIM indefinitely—producing the long periods of operation without serious failures that characterize HROs. Finally, members of HROs are concerned enough with learning to prevent failures that, in addition to observing failures that occur in the wild, they create simulated failures themselves in order to explore their antecedents and

Simulation learning allows organizational members to gain experience detecting and dealing with events that they will experience firsthand only very rarely. It also permits them to imagine all sorts of potential failure modes and learn how to prevent them.

Avenues for Future Research

Organizational learning represents a domain in which HRO research has the potential to inform and further mainstream organizational scholarship because the focus in HRO research on rare events is germane to theories on organizational learning generally. Future research on organizational learning from an HRO perspective would be valuable both to the HRO literature and to other research streams. Important areas for future work exist relating to the four learning approaches discussed in this chapter. In the following paragraphs, I discuss each in turn.

The topic of organizational learning from the experience of disaster is relevant to many different phenomena and research literatures. Nevertheless, there are major gaps in our current understanding of how such learning occurs. For example, although HRO and organizational learning research has now established that organizations can, and often do, learn from direct experience with major failures and disasters, we know very little about the conditions under which organizations and their members learn more or less from such experience (for exceptions, see Desai, 2015; Haunschild & Sullivan, 2002). The HRO literature suggests that organizations that take nonpunitive approaches to employees following a failure may learn faster (Weick & Roberts, 1993a), but this assertion has not yet been tested empirically. Future work could test this suggestion as well as propose and test other organizational characteristics and practices that may facilitate direct learning from major failure. One interesting avenue for future work in this vein could be to examine how individual organizational members experience and learn from organizational failures, in light of recent research showing that individuals don't learn as well from their own failures as from their successes (Diwas, Staats, & Gino, 2013).

Similarly, the fact that organizations learn vicariously from the failures of other organizations is well established, but the precise mechanisms through which this vicarious learning occurs are not well studied. For example, it is clear that in many industries regulators play an important role in the vicarious learning process (see Madsen, 2009), but the details of how regulators learn from failures, how they share what they learn with regulated organizations, and how this knowledge is integrated with the organizations' own vicarious learning efforts are not well understood. Similarly, the extant literature has not illuminated what types of vicarious failures facilitate greater learning. Many organizations make efforts to learn from disasters that occur in industries very different from their own. More research is needed to determine whether these efforts can be successful or whether organizations learn effectively only from failures among industry peers. Additional work on these topics from both HRO and organizational learning perspectives could be very valuable.

Research on how organizations and their members deal with near misses has documented a human tendency to view near misses as successes rather than as near failures (Dillon & Tinsley, 2008). As noted earlier, HROs seem to counteract this tendency in their members by maintaining an obsessive focus on failure. But the precise mechanisms through which this process occurs in HROs are not well understood. Additional research from an HRO perspective may be able to isolate the causal processes that allow HROs to detect near misses and recognize their significance.

Similarly, extant research on learning through disaster and failure simulation has not progressed enough to answer several basic questions about how simulation learning could best be employed to prevent serious organizational failures. First, as already noted, research on simulation learning has not yet demonstrated that simulation training allows members of organizations to better prevent or respond to actual disasters. Additional work in this direction could help establish the value of simulation training in high-hazard contexts. Second, the current literature does not establish the degree of fidelity that is necessary in simulating disaster scenarios to accrue the benefits of simulation learning. This is a key practical question in that higher-fidelity simulations are typically more costly to build and employ than are more abstracted simulations. If a relatively simple simulation can achieve the same outcomes as a more detailed, more costly one, organizations in high-hazard domains would be well advised to invest in the simplest simulation that achieves high efficacy. Future research that could document the necessary level of fidelity for effective simulation learning in an emergency or disaster context could be both theoretically and practically important.

Parting Thoughts

Taken together, the four learning strategies discussed in this chapter facilitate organizational learning across all of the stages of Turner's (1978) DIM. They facilitate the identification and correction of the types of latent errors that typically become incorporated into an organizational system during the incubation stage. These learning approaches also help organizational members to more effectively identify and respond to precipitating events, to intercede during disaster onset so as to minimize the severity of a disaster, and to most effectively undergo cultural readjustment in the wake of a disaster. When used in concert, these four learning approaches may allow members of organizations to operate so reliably as to prevent major failures entirely over very long periods of time and, thus, qualify as HROs.

References

Aguinis, H., & Kraiger, K. (2009). Benefits of training and development for individuals and teams, organizations, and society. Annual Review of Psychology, 60, 451–474.
Allerton, D. J. (2010). The impact of flight simulation in aerospace. Aeronautical Journal, 114, 747–756.
Aviation Safety Reporting System. (2009, January–2010, January). Callback: NASA's Aviation Safety Reporting System. Retrieved from asrs.arc.nasa.gov/publications/callback.html
Barach, P., & Small, S. D. (2000). Reporting and preventing medical mishaps: Lessons from nonmedical near miss reporting systems. BMJ: British Medical Journal, 320, 759–763.
Baumard, P., & Starbuck, W. H. (2005). Learning from failures: Why it may not happen. Long Range Planning, 38(3), 281–298.
Bird, F. E., & Germain, G. L. (1996). Loss control management: Practical loss control leadership (rev. ed.). Oslo, Norway: Det Norske Veritas.
Blandford, E., & May, M. (2012). Lessons learned from "lessons learned": The evolution of nuclear power safety after accidents and near-accidents. Cambridge, MA: American Academy of Arts and Sciences.
Cannon, M. D., & Edmondson, A. C. (2001). Confronting failure: Antecedents and consequences of shared beliefs about failure in organizational work groups. Journal of Organizational Behavior, 22, 161–177.
Carroll, J. S. (1997). Organizational learning activities in high hazard industries: The logics underlying self-analysis (Working paper no. 3936). Cambridge, MA: MIT.
Carroll, J. S., Rudolph, J. W., & Hatakenaka, S. (2002). Learning from experience in high-hazard organizations. Research in Organizational Behavior, 24, 87–137.
Christianson, M., Sutcliffe, K., Miller, M., & Iwashyna, T. (2011). Becoming a high reliability organization. Critical Care, 15(6), 314.
Columbia Accident Investigation Board. (2003). Columbia Accident Investigation Board report, volume 1. Washington, DC: Government Printing Office.
Cooper, J. B., Blum, R. H., Carroll, J. S., Dershwitz, M., Feinstein, D. M., Gaba, D. M., . . . & Singla, A. K. (2008). Differences in safety climate among hospital anesthesia departments and the effect of a realistic simulation-based training program. Anesthesia & Analgesia, 106(2), 574–584.
David, P. A., Maude-Griffin, R., & Rothwell, G. J. (1996). Learning by accident? Reductions in the risk of unplanned outages in U.S. nuclear power plants after Three Mile Island. Journal of Risk Uncertainty, 13(2), 175.
Desai, V. (2015). Learning through the distribution of failures within an organization: Evidence from heart bypass surgery performance. Academy of Management Journal, 58(4), 1032–1050.
Dillon, R. L., & Tinsley, C. H. (2008). How near-misses influence decision making under risk: A missed opportunity for learning. Management Science, 54(8), 1425–1440.
Dillon, R. L., Tinsley, C. H., Madsen, P. M., & Rogers, E. W. (2016). Improving recognition of near-miss events through organizational repair of the outcome bias. Journal of Management, 42(3), 671–697.
Diwas, K. C., Staats, B. R., & Gino, F. (2013). Learning from my success and from others' failure: Evidence from minimally invasive cardiac surgery. Management Science, 59(11), 2435–2449.
Edmondson, A. (1996). Learning from mistakes is easier said than done: Group and organizational influences on the detection and correction of human error. Journal of Applied Behavioral Science, 32, 5–28.
Ford, J. K., & Schmidt, A. M. (2000). Emergency response training: Strategies for enhancing real-world performance. Journal of Hazardous Materials, 75, 195–215.
Garvin, D. A. (2000). Learning in action: A guide to putting the learning organization to work. Boston, MA: Harvard Business School Press.
Gonzalez, C., & Brunstein, A. (2009). Training for emergencies. Journal of Trauma: Injury, Infection, and Critical Care, 67(2 Suppl.), S100–S105.
Graber, M. (2005). Diagnostic errors in medicine: A case of neglect. Journal on Quality and Patient Safety, 31, 106–113.
Haunschild, P. R., & Sullivan, B. N. (2002). Learning from complexity: Effects of prior accidents and incidents on airlines' learning. Administrative Science Quarterly, 47, 609–643.
Hayward, M. L. A. (2002). When do firms learn from their acquisition experience? Evidence from 1990–1995. Strategic Management Journal, 23, 21–39.
Hoffman, E. (2009). Safety by simulation. ATW: International Journal for Nuclear Power, 54(6), 396–402.
Khanduja, P. K., Bould, M. D., Naik, V. N., Hladkowicz, E., & Boet, S. (2015). The role of simulation in continuing medical education for acute care physicians: A systematic review. Critical Care Medicine, 43(1), 186–193.
Kirkpatrick, D. L., & Kirkpatrick, J. D. (1994). Evaluating training programs: The four levels. San Francisco: Berrett-Koehler.
Kirkpatrick, D. L., & Kirkpatrick, J. D. (2005). Transferring learning to behavior: Using the four levels to improve performance. San Francisco: Berrett-Koehler.
Kletz, T. (2001). Learning from accidents (3rd ed.). Oxford, UK: Gulf Professional.
Lampel, J., Shamsie, J., & Shapira, Z. (2009). Experiencing the improbable: Rare events and organizational learning. Organization Science, 20(5), 835–845.
Landau, M., & Chisholm, D. (1995). The arrogance of optimism: Notes on failure avoidance management. Journal of Contingencies and Crisis Management, 3, 67–80.
La Porte, T. R., & Consolini, P. (1991). Working in practice but not in theory: Theoretical challenges of high reliability organizations. Journal of Public Administration Research and Theory, 1, 19–47.
Levitt, B., & March, J. G. (1988). Organizational learning. Annual Review of Sociology, 14, 319–340.
Libuser, C. (1994). Organizational structure and risk mitigation (Unpublished doctoral dissertation). University of California, Los Angeles.
Madsen, P. M. (2009). These lives will not be lost in vain: Organizational learning from disaster in U.S. coal mining. Organization Science, 20, 861–875.
Madsen, P. M., & Desai, V. M. (2010). Failing to learn? The effects of failure and success on organizational learning in the global orbital launch vehicle industry. Academy of Management Journal, 53, 451–476.
Madsen, P., Desai, V., Roberts, K., & Wong, D. (2006). Mitigating hazards through continuing design: The birth and evolution of a pediatric intensive care unit. Organization Science, 17, 239–248.
Malakis, S., & Kontogiannis, T. (2012). Refresher training for air traffic controllers: Is it adequate to meet the challenges of emergencies and abnormal situations? International Journal of Aviation Psychology, 22(1), 59–77.
March, J. G., Sproull, L. S., & Tamuz, M. (1991). Learning from samples of one or fewer. Organization Science, 2(1), 1–13.
Marcus, A. A., & Nichols, M. L. (1999). On the edge: Heeding the warnings of unusual events. Organization Science, 10(4), 482–499.
McLaughlin, S., Fitch, M. T., Goyal, D. G., Hayden, E., Kauh, C. Y., Laack, T. A., . . . & Gordon, J. A., on behalf of the SAEM Technology in Medical Education Committee and the Simulation Interest Group. (2008). Simulation in graduate medical education 2008: A review for emergency medicine. Academic Emergency Medicine, 15, 1117–1129.
Miller, C., Cubbage, A., Dorman, D., Grobe, J., Holahan, G., & Sanfilippo, N. (2011, July 12). Recommendations for enhancing reactor safety in the 21st century: The near-term task force review of insights from the Fukushima Dai-ichi accident. Rockville, MD: US Nuclear Regulatory Commission. Retrieved from http://pbadupws.nrc.gov/docs/ML1118/ML111861807.pdf
Okuda, Y., Bryson, E. O., DeMaria, S., Jacobson, L., Quinones, J., Shen, B., & Levine, A. I. (2009). The utility of simulation in medical education: What is the evidence? Mount Sinai Journal of Medicine, 76(4), 330–343.
Olshan, J. (2009, January 17). Quiet air hero is Captain America. New York Post. Retrieved from http://nypost.com/2009/01/17/quiet-air-hero-is-captain-america/
Perrow, C. (1984). Normal accidents: Living with high-risk technologies. New York: Basic Books.
Perrow, C. (1986). Complex organizations: A critical essay (3rd ed.). New York: Random House.
Perry, R. W., & Lindell, M. K. (2003). Preparedness for emergency response: Guidelines for the emergency planning process. Disasters, 27, 336–350.
Pidgeon, N. F., & O'Leary, M. (2000). Man-made disasters: Why technology and organizations (sometimes) fail. Safety Science, 34, 15–30.
Ramanujam, R., & Goodman, P. S. (2003). Latent errors and adverse organizational consequences: A conceptualization. Journal of Organizational Behavior, 24, 815–836.
Reason, J. T. (1990). Human error. Cambridge: Cambridge University Press.
Reason, J. T. (1997). Managing the risks of organizational accidents. Aldershot, UK: Ashgate.
Rerup, C., & Feldman, M. (2011). Routines as a source of change in organizational schemata: The role of trial-and-error learning. Academy of Management Journal, 54(3), 577–610.
Rijpma, J. A. (1997). Complexity, tight-coupling and reliability: Connecting normal accidents theory and high reliability theory. Journal of Contingencies and Crisis Management, 5(1), 15–23.
Roberts, K. H. (1990). Some characteristics of one type of high reliability organization. Organization Science, 1(2), 160–176.
Roberts, K. H., & Bea, R. G. (2001). Must accidents happen? Lessons from high reliability organizations. Academy of Management Executive, 15, 70–79.
Roberts, K. H., & Rousseau, D. M. (1989). Research in nearly failure-free, high-reliability organizations: "Having the bubble." IEEE Transactions on Engineering Management, 36, 132–139.
Roberts, K. H., Stout, S. K., & Halpern, J. J. (1994). Decision dynamics in two high reliability military organizations. Management Science, 40, 614–624.
Rochlin, G., La Porte, T., & Roberts, K. (1987). The self-designing high-reliability organization: Aircraft carrier flight operations at sea. Naval War College Review, 40(4), 76–90.
Sagan, S. D. (1993). The limits of safety: Organizations, accidents, and nuclear weapons. Princeton, NJ: Princeton University Press.
Sagan, S. D. (1994). Toward a political theory of organizational reliability. Journal of Contingencies and Crisis Management, 2, 228–240.
Schulman, P. R. (1993). The negotiated order of organizational reliability. Administration & Society, 25(3), 353–372.
Shapiro, M. J., Morey, J. C., Small, S. D., Langford, V., Kaylor, C. J., Jagminas, L., . . . & Jay, G. D. (2004). Simulation based teamwork training for emergency department staff: Does it improve clinical team performance when added to an existing didactic teamwork curriculum? Quality and Safety in Health Care, 13(6), 417–421.
Shiner, L. (2009, February 18). Sully's tale: Chesley Sullenberger talks about That Day, his advice for young pilots, and hitting the ditch button (or not). Air & Space. Retrieved from http://www.airspacemag.com/as-interview/aamps-interview-sullys-tale-53584029
Shojania, K. G., Burton, E. C., McDonald, K. M., & Goldman, L. (2003). Changes in rates of autopsy-detected diagnostic errors over time: A systematic review. Journal of the American Medical Association, 289, 2849–2856.
Shrivastava, P. (1987). Bhopal: Anatomy of a crisis. Cambridge, MA: Ballinger.
Sitkin, S. B. (1992). Learning through failure: The strategy of small losses. In B. M. Staw & L. L. Cummings (Eds.), Research in organizational behavior (Vol. 14, pp. 231–266). Greenwich, CT: JAI.
Starbuck, W. H., & Milliken, F. J. (1988). Challenger: Fine-tuning the odds until something breaks. Journal of Management Studies, 25(4), 319–340.
Staw, B. M., & Ross, J. (1987). Behavior in escalation situations: Antecedents, prototypes, and solutions. Research in Organizational Behavior, 9, 39–78.
Steinemann, S., Berg, B., Skinner, A., DiTulio, A., Anzelon, K., Terada, K., . . . & Speck, C. (2011). In situ, multidisciplinary, simulation-based teamwork training improves early trauma care. Journal of Surgical Education, 68(6), 472–477.
Sterman, J. D. (1994). Learning in and about complex systems. Journal of the System Dynamics Society, 10, 291–330.
Tamuz, M. (1994). Developing organizational safety information systems for monitoring potential dangers. In G. E. Apostolakis & T. S. Win (Eds.), Proceedings of PSAM II (Vol. 2, pp. 7–12). Los Angeles: University of California.
Tinsley, C. H., Dillon, R. L., & Madsen, P. M. (2011, April). How to avoid catastrophe. Harvard Business Review, 90–97.
Turner, B. (1976). The organizational and inter-organizational development of disasters. Administrative Science Quarterly, 21(3), 378–397.
Turner, B. A. (1978). Man-made disasters. London: Wykeham Science.
Turner, B. A., & Pidgeon, N. F. (1997). Man-made disasters (2nd ed.). Boston: Butterworth-Heinemann.
US Department of Labor. (2006). Statement of U.S. Secretary of Labor Elaine L. Chao on the West Virginia mine incident. Retrieved from https://www.dol.gov/opa/media/press/opa/archive/OPA20060004.htm
Vaughan, D. (1996). The Challenger launch decision: Risky technology, culture, and deviance at NASA. Chicago: University of Chicago Press.
Vaughan, D. (1999). The dark side of organizations: Mistake, misconduct, and disaster. Annual Review of Sociology, 25, 271–305.
Vaughan, D. (2003). History as cause: Columbia and Challenger. In Columbia Accident Investigation Board report, volume 1 (pp. 195–204). Washington, DC: Government Printing Office.
Vaughan, D. (2005). System effects: On slippery slopes, repeating negative patterns, and learning from mistakes? In W. H. Starbuck & M. Farjoun (Eds.), Organization at the limit: Lessons from the Columbia disaster (pp. 41–59). Malden, MA: Blackwell.
Vogus, T., Rothman, N., Sutcliffe, K., & Weick, K. (2014). The affective foundations of high-reliability organizing. Journal of Organizational Behavior, 35(4), 592–596.
Vogus, T. J., & Welbourne, T. M. (2003). Structuring for high reliability: HR practices and mindful processes in reliability-seeking organizations. Journal of Organizational Behavior, 24, 877–903.
Waller, M. J., & Roberts, K. H. (2003). High reliability and organizational behavior: Finally the twain must meet. Journal of Organizational Behavior, 24, 813–814.
Wauben, L. S. G., Lange, J. F., & Goossens, R. H. M. (2012). Learning from aviation to improve safety in the operating room—A systematic literature review. Journal of Healthcare Engineering, 3(3), 373–389.
Weick, K. E. (1984). Small wins: Redefining the scale of social problems. American Psychologist, 39, 40–49.
Weick, K. E. (2015). Ambiguity as grasp: The reworking of sense. Journal of Contingencies and Crisis Management, 23(2), 117–123.
Weick, K. E., & Roberts, K. H. (1993a). Collective mind and organizational reliability: The case of flight operations on an aircraft carrier deck. Administrative Science Quarterly, 38, 357–381.
Weick, K. E., & Roberts, K. H. (1993b). Collective mind in organizations: Heedful interrelating on flight decks. Administrative Science Quarterly, 38, 357–381.
Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for high reliability: Processes of collective mindfulness. In R. Sutton & B. Staw (Eds.), Research in organizational behavior (Vol. 1, pp. 81–124). Greenwich, CT: JAI.
Zhe, S., Jun, S., Chunlin, W., & Yuanle, M. (2014). The engineering simulation system for HTR-PM. Nuclear Engineering and Design, 271, 479–486.

chapter 8

Metaphors of Communication in High Reliability Organizations
Jody L. S. Jahn, Karen K. Myers, & Linda L. Putnam

Communication in High Reliability Organizing

Karl Weick’s (1987) work problematizes the “reliability question” in understanding organizations and organizing processes. He treats reliability as encompassing three main activities. First, reliability refers to seeing and acting on the invisible in dynamic nonevents (La Porte & Consolini, 1991; La Porte, 1996; Weick, Sutcliffe, & Obstfeld, 1999). As Weick points out, problems and failures are visible, but reliability is invisible, meaning that people do not know how many mistakes they could have made but didn’t. Thus, when situations go well, organizational members do not notice what makes organizing reliable. In this way, reliability is invisible but also active. Through constant attention, adjustment, and change, it is possible to produce a reliable outcome; yet when constant tweaking stops, accidents happen. Second, reliability involves enactment. Enactment refers to people developing systems that they in turn must manage. For example, air traffic controllers configure airplanes in a certain way, which enacts a system that solves some problems while creating others (Weick, 1987). Enactment makes reliability possible, but it also adds elements that can hinder long-term effects. Third, reliability

is made possible through requisite variety. This concept suggests that when a system has a great deal of complexity and variety, organizational members must respond with equal or greater human complexity to achieve high reliability. Requisite variety is attained through incorporating multiple and divergent perspectives and by relying on rich communication (e.g., dense talk, rich media; Weick, 1987). However, successfully incorporating divergent perspectives requires trust among members and efforts to ensure that divergent voices are heard and valued. These three mechanisms of reliability depend on communication to make it possible to negotiate meanings and actions. These mechanisms also imply that reliability should be a contested process that incorporates multiple perspectives through social interactions. Yet, studies on HROs often conceptualize communication as an accessory to action or as operating at the periphery of organizing—for example, serving as a vehicle for information transmission, creating voice, or remaining silent. Treating communication as the generic transmission of information or exchange of meanings runs the risk of providing incomplete insights as to what communication is and how it functions in HROs. This chapter makes these perspectives explicit by comparing and contrasting communication-relevant studies of HROs that have different views of what communication is and how it functions. We adopt several metaphorical lenses to review the role of communication in the HRO literature. These lenses reveal how scholars conceptualize communication, aid in identifying the strengths and weaknesses of these conceptualizations, and point to ways that scholars might reimagine the role of communication in HRO processes. They also show that for the most part the HRO literature has relied on some metaphors of communication more than others. In this chapter we recognize and privilege each of the different metaphors of communication in understanding HROs, and we show how the use of each reveals insights as well as blind spots in understanding the complexities of HROs. Our aim is to demonstrate that research could be strengthened through adopting complex understandings about the relationship between communication and HRO cultures. In addition, we urge scholars to embrace the underutilized metaphors discussed at the end of this chapter, particularly the symbol and performance metaphors, which focus on meaning and action.

Organizational Culture, Communication, and High Reliability

The concept of organizational culture is pivotal to understanding different metaphors of communication in HRO studies. Research on the ways that organizational and safety cultures influence high reliability practices has a long history in the HRO literature (Klein, Bigley, & Roberts, 1995; Roberts, 1993; Roberts, Rousseau, & La Porte, 1994; Weick, 1987). From a communication perspective, organizational culture refers to the values, norms, and beliefs that emerge from social interactions in an organization (Keyton, 2011). In effect, communication scholars treat culture as what an organization is rather than something that it has (Smircich, 1983). Thus, we embrace John Carroll's (see Chapter 3) cultural lens as a way of exploring the role of communication in understanding organizational complexity. Organizational culture is typically developed through communication, specifically messages, information flow, symbols, and interpretations of actions (Keyton, 2011).

HRO scholars have focused on the types of organizational cultures that facilitate reliability and error-free environments. In particular, cultural norms, values, and beliefs support key management processes that enable high reliability (Roberts, 1993). For example, scholars have drawn attention to ways that members enact distributed organizing to allow for local decision making and enable loose coupling and tight coordination (Novak & Sellnow, 2009). Members also weigh checks and balances through managing by exception or negation (Scott, Allen, Bonilla, Baran, & Murphy, 2013). This means members learn to follow rules, but they also know that each individual assumes responsibility for assessing emerging situations and knowing when to act and when to bend or break rules. In addition, HRO members develop a big-picture or umbrella understanding of what members are trying to accomplish while also digging deep to grasp the patterns of a situation (Roberts, 1993; Roth, Multer, & Raslear, 2006). This literature treats culture as an underlying feature of reliability, but it focuses primarily on how culture reinforces ongoing practices and conveys norms and values. Both are top-down understandings of how culture operates, and both fall short in accounting for how culture enacts HRO processes. Drawing on organizational culture, we review the communication-based HRO literature through several metaphorical lenses, including conduit, information processing, voice, symbol, and performance.

Communication and HROs

A metaphor is a way of seeing something complex and abstract through comparing it to something concrete (Lakoff & Johnson, 1980). Morgan’s (1986, 2001) work, Images of Organizations, shows how metaphors provide a lens for scholars to understand abstract concepts, such as organizations, by revealing implicit assumptions that underlie different research programs. Organizational theorists have employed metaphor analysis to decenter old theories, unpack nuances of different perspectives, and develop new orientations (Oswick, Putnam, & Keenoy, 2004). Metaphors establish figure-ground relationships between a source and a target by focusing on similarities between the two, but in doing so, scholars often overlook dissimilarities; thus, a metaphor is only a partial view. Understanding the metaphors that researchers use opens up opportunities to rethink concepts and engage in theory building. Consistent with organizational theorists, Linda Putnam and Suzanne Boys (2006) set forth eight metaphors that represent the ways that organizational scholars typically view communication. To review and extend the HRO literature on communication, this chapter draws on five of these metaphors: (1) the conduit metaphor, which treats communication as a channel in an organizational container; (2) the information-processing metaphor, which focuses on the nature and flow of communication in organizations seen as maps; (3) the voice metaphor, which is expressive, suppressive, or silent; (4) the symbol metaphor, which views sensemaking through rituals and narratives and organizations as literary texts; and (5) the performance metaphor, which casts communication as social interaction and the organization as coordinated activities. We selected metaphors that have surfaced in the HRO literature or that we believe provide a valuable lens for adding complexity to HRO communication studies. These five metaphors highlight particular features of communication and organizational culture that have direct implications for reliability. Although each offers value, some metaphors are more comprehensive, complex, and constitutive than others. We suggest that scholars can enrich each metaphor through integrating communication with studies on organizational culture in HRO environments. In particular, we argue that two metaphors, symbol and performance, are best suited to examine how communication constitutes high reliability organizing.

The Conduit Metaphor

For the most part, HRO studies conceptualize communication through the metaphors of conduit, information processing, or voice. A conduit is “a channel through which something is conveyed, such as a tube, cable or cylinder” (Putnam & Boys, 2006, p. 545). With the conduit metaphor, communication functions as a tool for accomplishing goals as members transmit messages between each other. This metaphor focuses on (1) concerns for the adequacy and amount of information exchanged, and (2) communication skill or competencies. HRO research that adopts a conduit view of communication advocates adequacy of information flow and accurate readings of organizational environments. Adequacy of Information Flow In the conduit or transmission view of communication, high reliability is linked to assuring an adequate flow of information. Reliability is at risk when members fail to pass along enough information about cues, plans, and changes. Research shows that failures to maintain continuous communication and to report errors immediately threaten reliability (La Porte & Consolini, 1991; Weick, Sutcliffe, & Obstfeld, 2005). Reliability is also at risk when a message between the sender and the receiver is obstructed, such as a failure to communicate clearly, which causes a communication “breakdown.” This approach treats communication as an issue of adequacy or reducing barriers to message transmission. Accurate Understandings Another line of research that fits the conduit or transmission model of communication emphasizes accuracy. For example, Emilie Roth, Jordan Multer, and Thomas Raslear (2006) found that reliability for railroad operations depended on clear, precise, and constant communicative exchanges. This approach treated both communication and organizational culture as means to particular ends—that is, both became tools or instruments to achieve organizational success and avoid failure (Keyton, 2011). Strengthening the Conduit Metaphor Overall, the conduit metaphor treats communication as serving instrumental ends by linking the transmission of information to organizational

effectiveness and managerial goals (Putnam & Boys, 2006). Importantly, a conduit lens assumes an unproblematic process that ignores the ways that status, rank, norms, and other factors either clarify or obstruct pathways for message flow. In addition, a reliance on the conduit metaphor centers on developing accurate rather than plausible understandings of operating environments. As Karl Weick (1979) notes, however, accurate understandings can never be fully attained, nor are they necessary. Instead, members act on plausible or acceptable meanings derived from interpretive processes. The conduit metaphor also treats organizational culture as homogeneous or shared. It fails to acknowledge that subcultures with different values and norms often exist and could alter the understandings of messages. Overall, the conduit lens ignores the complex, multifaceted ways that the amount, accuracy, and interpretation of information are intertwined with organizational culture.

The Information-Processing Metaphor

Although closely linked to the conduit metaphor, the information-processing lens focuses on the routing of information and the content of messages as well as targets of exchange, feedback, and information-seeking activities (Putnam & Boys, 2006). “In the information processing metaphor, the organization becomes a map or trajectory; that is, organizational circumstances such as socialization, turnover, or innovation affect the nature and routing of information” (p. 546). This lens begins with the assumption that organizational effectiveness depends on acquiring and processing information. HRO research employs an information-processing metaphor to examine how organizations develop cultures that foster and manage convergent and divergent views, particularly through inculcating centralized decision premises. Seeking Information and Providing Feedback HRO cultures tend to reward and encourage practices associated with sending, receiving, sharing, and processing information that provides feedback and influences desired outcomes (Erickson & Dyer, 2005; Krieger, 2005; Novak & Sellnow, 2009). These studies reveal that organizational subgroups develop their own local norms, acceptable attitudes, and coping strategies for dealing with the dangerous demands of their work (Roberts et al., 1994).

These norms function as centralized decision premises for members to assess whether certain actions are consistent with what is acceptable and rewarded in their subgroup. For example, Janice Krieger (2005) observed that teams of pilots who fostered “positive reasoning” behaviors during flight simulations avoided failure. To enact positive reasoning, these teams made ongoing efforts to seek information, acknowledge each other’s messages, and share similar or different information. Their patterns of information processing—rooted in asking questions, sharing messages, and acknowledging each other—created centralized decision-making norms that fostered high reliability. In research that embraces the information-processing metaphor, HROs privilege practices that enable decentralized, yet interconnected work. In this view communication means exchanging information to build a big picture or a “bubble” that creates a unified understanding of the system (Roberts, 1993). In the rail industry, for example, a big picture is attained through cooperative communication practices or “courtesies” that construct a “shared situational awareness,” such as letting other train operators know when a train has passed so members can visualize and map how the tracks are being used (Roth et al., 2006). Building a big-picture understanding depends on having a communication system in place so that members can respond quickly through their access to internal and external sources of information (La Porte, 1996). Fostering Alternative Perspectives Additionally, researchers who embrace the information-processing metaphor see HROs as fostering different viewpoints to avoid groupthink. Thus, centralized decision premises embedded in subgroup norms and behaviors also create cultures that elicit and manage different types of information, such as corrective feedback and alternative viewpoints (Krieger, 2005). This orientation to communication in turn allows for subcultures that might differ in ways they access and process information. Developing a free flow of information among HRO members helps them avoid groupthink and stagnancy (Burke, Wilson, & Salas, 2005). Strengthening the Information-Processing Metaphor Viewing communication as information processing has limitations. Primarily, this lens treats information as if it were neutral and self-evident in its

meaning. Scholars assume that information carries equal weight regardless of what it says or who communicates it (Putnam & Boys, 2006). By adopting the belief that both communication and culture are complex, scholars could improve research through questioning whether and how information is free flowing. Adopting a network or linkage metaphor for communication might extend this research. Instead of directing attention toward transmission and information processing, the linkage metaphor could address the connections that tie members together in communication networks and relationships that span time and space (Ibid.). The linkage metaphor shifts the focus of communication to the patterns and structures of interactions linked to power and influence, knowledge, and relationships between subgroups and organizations.

Three options for future research might add complexity to the information-processing lens. First, research that examines information routing and flow in HROs needs to situate these patterns in networks of power and influence (Ibid.). Following Noshir Contractor and Susan Grant (1996), researchers might examine how HRO members gain access to information and generate meanings through the density and diversity of their networks and the efficiency with which they generate big-picture understandings of their environment and operations. Second, HRO studies that put too much emphasis on information exchange as a way to avoid failure should attend to how this information is interpreted and treated as knowledge. One way HRO researchers could do this, following Timothy Kuhn and Steven Corman (2003), is to examine how connections among members create structures that enact knowledge collectively and converge on clusters of meaning. This type of research would shed light on ways that information is not neutral. Rather, meanings emerge and cluster together in rationalities of knowledge about what constitutes dangerous aspects of work and how to handle them. Third, HRO studies often assume that organizations are monolithic cultures or that all members interpret and enact values and beliefs in similar ways. However, an organization's espoused or advocated norms might differ from those enacted within and across subgroups. HRO researchers could examine and compare the norms and communication patterns in different subgroups of the same organization. For example, Jody Jahn (2016) compared how two helicopter rappel wildland firefighting workgroups appropriated common safety

rules. She found that one crew viewed its group as a learning environment that valued proactive activities of talking through possible options for action and debriefing what happened on assignments. A second crew, in contrast, viewed its group as a collection of experts that enacted reliability through valuing independent decision making, autonomy of judgment, and signaling members’ expertise. In effect, scholars that embrace the information-processing metaphor could enrich their work through avoiding the presumption that information is neutral and that all organizational actors interpret it in the same way. Examining access to and patterns of information networks, exploring how information functions as knowledge, and viewing organizations as differentiated rather than consensual cultures provide ways to strengthen research that embraces this view of communication.

The Voice Metaphor

Unlike the conduit and information-processing metaphors, the voice metaphor focuses on when and how individuals speak up, raise concerns, and enact participatory processes: “Voice is not simply having a say, but the ability to act, to construct knowledge and to exert power” (Putnam & Boys, 2006, p. 556). Voice entails a focus on efforts to be heard, expression or suppression of different voices, occasions to speak, and equality of voice. HRO research that embraces a voice metaphor examines voice as (1) action, (2) participation, and (3) control. Voice as Action HRO studies often examine the conditions in which members perceive that they can speak up—that is, voice their thoughts and questions. Organizational members might raise issues because they believe that they will be heard, or they might remain silent if they anticipate that their ideas will fall on deaf ears or produce a negative reaction. As an example, Ruth Blatt and colleagues (Blatt, Christianson, Sutcliffe, & Rosenthal, 2006) studied medical residents’ accounts of mishaps, specifically whether they voiced their opinions or remained silent at critical moments. These authors found that most medical residents, especially individuals who were transitioning between medical student and physician, remained silent in two circumstances: (1) when they

anticipated that speaking up would anger the recipient, and (2) when their identities as “physicians” might be jeopardized. These findings indicate that the degree to which members see the situation as calling for voice influences their decisions to speak up. Also, this study reveals that organizational members self-censor or mask their opinions when they doubt their legitimacy to speak up. In a different HRO context, Michelle Barton and Sutcliffe (2009) also point out that wildland firefighters, working in a profession that values experience, often fail to speak up because they question their own expertise. Thus, legitimacy is particularly important in HROs because members must manage equivocal circumstances in which numerous plausible explanations exist and alternative perspectives are important in making the best possible choice. Voice as Participation The view of voice as participation parallels research on whether organizational members speak up or not. This work, however, focuses specifically on what members say, not just the act of speaking up. These studies call for voice that raises objections or contributes disconfirming information. From the voice metaphor, activities such as disagreement, criticism, and raising objections are cast as important aspects of participation. Participation directs attention to the importance of both confirming and disconfirming information, regardless of the source. According to Julie Novak and Timothy Sellnow (2009), for example, upward dissent or raising objections with supervisors is crucial for making corrective actions linked to reliability, but dissent can be rejected when a culture does not welcome it. Thus, participatory practices of voice succeed when organizational cultures value openness, criticism, and dissent. Voice as Control Both the action and the participation approaches to voice are linked to power and control. Treating voice as control embraces Carroll’s (Chapter 3) political lens to the study of HROs in which power shapes communication processes. Studies that extend the voice metaphor into the arena of power examine how organizational cultures and supervisory practices develop tacit patterns of control that silence members. For instance, Heather Zoller’s (2003) investigation of a safety culture in an automobile manufacturing plant demonstrated how cultural norms effectively suppressed workers’ concerns about injuries and

costly safety repairs. Specifically, company values that privileged productivity over safety combined with worker benefits and privileges perpetuated injuries and the failure to address safety violations. By not voicing concerns about their own well-being, employees subjected themselves to managerial control and perpetuated a culture that prioritized profitability and being viewed as a strong, tough worker over safety. Strengthening the Voice Metaphor With the exception of Zoller’s (2003) study, most HRO research that embraces the voice metaphor treats communication as a binary form of either speaking up or remaining silent. These studies often fail to consider how voice is tied to organizational cultures and forms of control—that is, the meanings that arise from expression and suppression depend on relationships and cultural norms that make silence seem normal and natural. We argue that future HRO studies should explore the links between voice and organizational power, voice and diversity, and voice as social practice. HRO studies could incorporate the work on communication and dissent into attaining high reliability. Specifically, they could examine how willingness to engage in dissent relates to the quality of supervisory relationships, an employee’s status, his or her personal influence, and freedom of speech in the workplace (Kassing, 2001; Kassing & Avtgis, 1999). Also, future research could consider how diversity contributes to the complexity of voice by exploring how members create systems of participation that privilege some voices and marginalize others (e.g., based on race, gender, ability, etc.). Bringing a diversity lens sheds light on who is silent, who speaks, who voices dissent, and how equality of voice becomes possible. Finally, research that treats communication as voice typically houses it as an individual action as opposed to a social practice. That is, individuals through their own volitions choose to speak up, participate, or dissent. When HRO is tied to individual behaviors, scholars fail to unpack social practices that produce high reliability. Studies of voice that move beyond individual choice examine ways that organizations foster authentic participation, employ dialogue effectively, create and maintain a system of self-reflection, and maintain structural and logistical conditions that encourage participation and accountability (Putnam & Boys, 2006, p. 559; Cheney, 1995; Cheney et al., 1998; Hoffman, 2002). In effect, HRO scholars often treat communication as voice—namely,

the action of speaking up, offering disconfirming or critical feedback, or remaining silent. Although these studies are tied to high reliability, they locate communication in individuals and treat voice and silence as binaries that either help or hinder reliability. Situating HRO research on voice in the broad social context can enrich the use of this metaphor and account for social practices tied to supervisory relationships, organizational norms, cultural values of respect and openness, power and control, and participation activities (see also Carroll, Chapter 3, in terms of power bases and relationships).

The Symbol Metaphor

As noted previously, the conduit, information-processing, and voice metaphors typically privilege channels, message transmission, information exchange, or speaking out as forms of communication. What is missing in these views of communication is a focus on meaning or the interpretation of messages and symbols that are critical for HRO. Meanings and symbols are important for HROs because members rely on culture to carry lessons that cannot be learned through trial and error without great risk to human life (Weick, 1987). Symbols play a central role in sensemaking and creating shared meanings (Keyton, 2011). A symbol is something that represents or stands for something else (de Saussure, 1916/1983). Examples of symbols include stories, terminology or language patterns, routine practices, and special ceremonies. A symbol requires translation and interpretation because it combines concrete or direct experiences with abstract feelings and attitudes to represent something different (Keyton, 2011, p. 19). For instance, Karlene Roberts, Denise Rousseau, and Todd La Porte (1994) found that stories, rites, and ceremonies were used symbolically on nuclear-powered aircraft carriers to socialize new members to the cultural norms, values, and beliefs regarding failure-free organizing. Symbols serve four functions that aid in making sense of organizations: (1) they make values and beliefs tangible and thus reflect organizational culture, (2) they influence behavior by triggering values and norms, (3) they facilitate communication and sensemaking about organizational experiences, and (4) they produce organizational systems of meaning (Keyton, 2011). Several HRO studies focus on the first two functions—ways that symbols reflect or-

ganizational culture and trigger values and norms (La Porte, 1996; Roberts, 1993; Roberts et al., 1994; Weick, 1987)—however, less attention in the HRO literature focuses on sensemaking and meaning. Studies that embrace the symbol metaphor examine (1) how organizational cultures make actions sensible, and (2) how they determine which behaviors and actions in a crisis appear in the foreground and which are cast in the background. Making Actions Sensible Symbols form the foundation of organizational culture and make organizations unique. In addition, they are the keys to maintaining and changing culture (Keyton, 2011). HRO studies that embrace the symbol metaphor treat reliability as an “occasion for sensemaking” (Weick, 1995, p. 86) through examining language, stories, and rituals. Sensemaking focuses on how individuals make plausible, coherent, and reasonable accounts of what is happening in organizations (Ibid.)—that is, it examines how organizational members construct interpretations of actions that are acceptable and ring true in the situation. Unlike information processing, sensemaking is aimed at believability rather than accuracy. In HROs sensemaking occurs when something happens that was not expected or when something that was expected does not occur (Weick, 1995; Weick et al., 2005). These occasions, often marked by complexity and equivocality, become focal points for organizational stories that make members aware of unusual or unexpected situations and reinforce how to act appropriately (Klein et al., 1995; Roberts et al., 1994). For example, commanding officers on aircraft carriers told stories about events that helped explain why they engaged in certain activities, like pointing out system flaws that needed correcting (Roberts et al., 1994). These stories might be accompanied by memorable messages (Stohl, 1986), such as “a million atta boys don’t equal one ah shit,” a message that condenses a story and reinforces the meaning of it (Roberts et al., 1994, p. 157). Further, rites—such as daily meetings of all personnel on the aircraft carrier—reinforced social structure, allowed people to express common feelings, and maintained social relationships (Ibid.). These authors also indicated that rites of passage, degradation, enhancement, and integration (Trice & Beyer, 1993) might reinforce organizational norms and beliefs as members changed rank or status.

Routine practices also serve as symbolic occasions for sensemaking. For instance, Larry Browning and colleagues (Browning, Greene, Sitkin, Sutcliffe, & Obstfeld, 2009) described a cultural practice at Barnstorm Air Force Base in which the flight technicians who repaired planes conducted a Foreign Object Damage (FOD) walk by checking every inch of the flight line to ensure that no object was left on the ground after a plane was repaired. Even the smallest foreign object could get sucked into an engine and lead to catastrophic results. As a symbolic activity, the FOD walk connected the aircraft technicians to the pilots and physically demonstrated their commitment to the pilots’ safety. The FOD walk served as a regular occasion for sensemaking because each time technicians swept the deck for debris, they were reminded that the pilots’ safety depended on them and that their work was important in keeping error rates down. Providing Figure/Ground Distinctions Sensemaking is also related to how individuals array cues in their environments, particularly through distinguishing which elements are figure or focal points and which are background. Individuals from diverse organizational and occupational cultures make different figure/ground distinctions among appropriate behaviors and interpretive repertoires. For example, Clifton Scott and Angela Trethewey (2008) found that occupational discourses about how to fight fires in buildings shaped the interpretive repertoires that firefighters drew on to take action. They defined interpretive repertoires as “discursive resources that members [used] to appraise the nature and extent of hazards” (Scott & Trethewey, 2008, p. 300). They found that occupational discourses about firefighting connected directly to the ways that members defined and acted on what they considered to be priority behaviors. In particular, talk that emphasized the novelty and uncertainty of fire-related hazards promoted cautious actions while interactions that underscored the need for speedy responses promoted hazardous driving practices. Framing also reveals how members make sense of appropriate behaviors and actions. Framing refers to a “communication process which occurs when people create foreground/background distinctions during their definitions of what is going on in a situation” (Brummans et al., 2008, p. 26). The foreground refers to the elements that organizational actors focus on while the

background includes features that fall into the periphery (Bergeron & Cooren, 2012). Framing presumes that different people will construct the same situation in a variety of ways because organizational members bring diverse professional knowledge and expertise to a crisis. As an example, Caroline Bergeron and François Cooren (2012) observed that a fire commander, an arson investigator, and an insurance adjuster focused on vastly different features of the same burned building, based on concerns linked to their professional roles. Their study of municipal crisis simulations revealed diverse figure/ground distinctions aligned with professional roles that influenced recommendations for action. Importantly, reliability worked best when each agent fully embraced his or her role so that team members could make sense of each other’s concerns and could navigate authority. For instance, in an H1N1 outbreak simulation, the health unit representative failed to enact her designated role in speaking on behalf of the agency. Her detachment diminished her authority on the team because she failed to show other representatives that she understood their concerns (i.e., concerns of their constituencies) and also failed to answer their questions; hence, no “common ground” for a solution emerged from discussing the problem. Framing, then, plays a key role in sensemaking and in HROs through revealing diverse ways that individuals see a situation and creating expectations for appropriate behaviors linked to professional roles. Strengthening the Symbol Metaphor Much of the HRO research that embraces this metaphor focuses on how symbols reflect organizational culture. Very little attention is given to the social interactions, shared experiences, and dynamics of meaning construction that make sense of ongoing actions or failure events. In this work, symbols often function in narrow and reactive ways, such as conveying or transmitting values and normative behaviors. Sensemaking, however, is a co-constructed process in that meanings shape organizing rather than simply serve as a byproduct of culture (Rafaeli & Worline, 2000). Future studies need to capture the co-constitutive processes of sensemaking through treating it as an ongoing, dynamic, and evolving process. In effect, reliability occurs through the ways that sensemaking is situated within ongoing streams of activities, not in isolated events. Related to this recommendation, HRO studies should analyze the key features of symbols rather than treat them as reflecting organizational


culture. For instance, the process of sharing failure events is not simply giving information; rather, it is storytelling that enacts a point of view, a logic, a rationale, and a scenario with plots and characters (Keyton, 2011). HRO studies need to focus on examining the symbols per se as enacted in a continual social process that plays a role in high reliability situations. Moreover, future HRO studies need to focus on locally constructed practices of sensemaking or ones situated in group and local organizational actions. Some local cultures may be better than others at reducing equivocality. Locally situated norms for interaction also influence how members coordinate with each other, and these situated norms define which actions are sensible for a group. Acting outside of what is considered to be sensible may confuse members and impede their abilities to derive plausible explanations and workable actions needed to reduce equivocality and attain high reliability. In effect, HRO scholars who conceive of communication as symbols focus on interpretations and meanings, especially through enhancing reliability, making actions sensible, and drawing on diverse figure/ground framing based on different occupational cultures. Researchers can strengthen their use of this metaphor through treating sensemaking as a dynamic, co-constructed process rather than an isolated event and focusing on locally constructed practices, especially ones that analyze features of symbols per se.

The Performance Metaphor

While the symbol metaphor focuses on sensemaking and interpretations, the performance metaphor highlights social interactions in organizations. Here, “performance refers to the process of enacting organizing, rather than to an organization’s productivity or output. [It] combines [Victor] Turner’s (1980) view of accomplishment with Weick’s (1979) notion of enactment” (Putnam & Boys, 2006, p. 549). Organizational cultural performances are interactive (include the participation of others), contextual (grounded in history and map future behaviors), episodic (have clear beginnings and endings), and improvisational (never fully scripted). As one researcher explains, “The situational and temporal embeddedness of an organizational performance allows specialized and localized meanings to develop” (Keyton, 2011, p. 84). Hence in some studies the symbol metaphor is productively combined with


the performance lens. Alexandra Murphy’s (2001) study illustrated the symbolic nature of flight attendant safety announcements, for example. While attendants communicated important safety information, their announcements functioned as performances to symbolize flight attendants’ authority. These performances served as sensemaking for travelers and other flight attendants. Grounded in Action and Interactions HRO researchers who adopt the performance metaphor treat organizational cultures as grounded in action, produced and reproduced through ongoing interactions, and carrying consequences for high reliability organizing. First, through social interactions, members configure their organizational relationships to assert voice. Under the performance metaphor, the notion of participation includes the contributions of both human actors and material and non-material “figures” (Cooren, 2010, p. 6). Figures, such as texts (e.g., rules, policies, and procedures) and organizational positions (e.g., fire chief, fire captain, etc.), participate in organizing when members invoke them to assert authority. When Bergeron and Cooren (2012) analyzed a crisis simulation that involved multiple Canadian municipal agencies, they found that documents as symbols of identity, expertise, and professionalism were often incorporated into interactions when members invoked a position title (e.g., doctor) and its attendant concerns (e.g., virus spread, public health) to gain legitimacy over others who held low-status or nonrelevant positions. Members also invoked written safety checklists as a trump card to argue for rejecting another person’s plan (Jahn, 2016) and to assert the complementarity of relationships. These studies demonstrate how scholars have combined the performance and symbol metaphors of communication. Secondly, organizational performances both enable and constrain member learning. Jahn (2016), for example, observed that a wildland firefighting group engaged in a daily collective debriefing discussion that not only created a space where multiple perspectives were welcomed but also generated expectations that members must have something to contribute. This practice prompted members to search for anomalies and concerns that they could discuss later. This performance also reinforced a complementary relationship between veterans and newcomers that promoted an ongoing teaching/ learning dialogue.


Participating in High Reliability Cultures

Participation plays a different role in HROs when it enacts rather than reflects organizational culture, as it does with the voice metaphor. In these enactments organizational members draw from objects, bodies, and spaces to enhance legitimacy. Members might speak on behalf of their organizations and might invoke documents, collective norms, or particular identities to lend authority to what they say (Cooren, 2010). Of particular interest for the performance metaphor are action-based features of organizational culture (e.g., norms, rituals, or collective identities). Jahn (2016) found that the collective identity of a wildland firefighting crew, for example, was closely tied to members' high levels of experience. In their interactions with each other, crew members drew on their team's reputation for expertise. Because members felt pressured to perform as experts, team interactions were characterized by conflict rather than dialogue as they continually asserted themselves to test out options and enact reliability.

Strengthening the Performance Metaphor

For the most part, HRO studies rarely employ the performance metaphor of communication. Thus, finding ways to incorporate this lens into HRO research is the first step for strengthening this metaphor. HRO work that examines communication as performance often ignores organizational history and societal influences while placing too much emphasis on organizing processes. Thus, future HRO research should examine how an organization's history, cultural stories, and institutional responses to successes and failures shape everyday activities. Relatedly, HRO research often ignores the role of written safety documents and rules in achieving high reliability, especially as they shape and are shaped by member interactions. Additionally, scholars need to show how HRO performances are improvisational while also ritualistic and routine. Another way to strengthen HRO research is to broaden the notion of what performing reliability entails. Most HRO literature centers on enculturation performances—that is, ones linked to socializing, acculturating, and learning procedures that enhance reliability. However, additional types of communicative performances that are pivotal to HROs include the enactment of emotion. HROs may perform emotional routines in ways that might heighten reliability. For example, Karen Myers's (2005) study


noted that emotional routines became essential for new firefighters to gain acceptance and become assimilated into the culture. Specifically, probationary firefighters (called booters) demonstrated deference to more senior firefighters by performing emotionally draining and demeaning routines known as the “Humble Boot.” These routines—such as daily washing of trucks, cleaning bathrooms and kitchens, and working from dawn to dusk without taking a break—demonstrated the booters’ trustworthiness and commitment to the crew and to departmental and firefighting traditions that in turn influenced the exercise of safety performances. Capturing emotional performances and routines, then, aids in understanding assimilation and ways that organizational members develop high reliability cultures. Other performances that are rarely studied in HRO research and yet seem pivotal to reliability are the ways that members exercise power and influence. Future work might investigate who influences whom and how political systems operate in HROs (see Carroll, Chapter 3). Studies of health-care organizations might examine performances that negotiate or exercise power among physicians and health-care workers or between health-care firms and insurance companies in ways that help or hinder reliability. In effect, the performance metaphor treats communication as social interactions that shape reliability, hence, it draws on process-based studies that enable and constrain learning, perform legitimacy and authority, and engage in conflict rituals. Scholars who employ this lens sometimes combine it with the symbol metaphor. They might, for example, examine how organizational members use written documents and safety rules as symbols to enact legitimacy and reliability. HRO studies, however, center heavily on enculturation performances and need to broaden their foci to incorporate emotional routines, the enactment of power, and societal/institutional influences on reliability.

Conclusions and Future Directions

Communication in HROs

Communication plays a pivotal and complex role in understanding HROs. This chapter reveals that research on HROs employs different metaphors that yield multiple insights about the link between communication and reliability. In both the conduit and information-processing metaphors, reliability stems


from passing along information, reporting errors, fostering diverse points of view, and developing decision-making premises that help organizational members see the big picture. These studies, however, typically treat information as neutral, content-free, and devoid of meaning and emphasize continual information flow, clarity, and accuracy rather than plausibility. Research that incorporates the voice metaphor introduces content of communication, organizational roles, and cultural norms as significant factors that contribute to reliability. Specifically, organizational members are more likely to speak up, report errors, and offer criticism if they see these practices as consistent with their roles and identities and as welcomed in their organizational cultures. They are more likely to withhold disconfirming information when their expertise or legitimacy might be questioned and when the organization or subculture discourages it. This research, however, often fails to show how power and control shape these cultural norms and practices. In contrast, studies that embrace the symbol metaphor examine communication by focusing on the meanings and interpretations of such symbols as stories, rituals, and norms. Stories and organizational routines often convey safety lessons that encourage newcomers to point out system flaws and be mindful of unexpected situations. HRO researchers also focus on framing or how organizational actors sustain reliability through situating some features of an equivocal situation in the figure and others in the background. For example, framing a situation as novel and uncertain is likely to increase mindfulness as opposed to casting it as needing quick action. Finally, research that adopts a performance lens focuses on organizational routines as social interactions, especially ones that enact legitimacy, enable and constrain learning, and develop complementary relationships to enhance high reliability. Importantly, combining the symbol and performance metaphors shows that invoking written documents and safety rules can enter into performances of expertise and authority to legitimate multiple perspectives and norms for interrogating safety practices. In conceptualizing communication in the majority of these studies, researchers adopt a container view of organizations, one that casts communication as an activity that occurs within HROs, rather than as a core process responsible for enacting high reliability. We contend that understanding communication as a central process of constituting organizing brings attention to the con-


sequences of communication for HROs. To move away from this container view, we argue that the symbol and performance metaphors hold the greatest potential for understanding how communication hinders participation; organizes relationships in particular ways; and accounts for the roles of safety rules, policies, and local norms in producing high reliability. Communication as High Reliability Like HRO theorizing, treating communication as constituting an organization has its roots in Weick (1979), especially his assumption that an organization does not exist as an entity but as a set of ongoing practices, texts, and memory traces (Fairhurst & Putnam, 2004). A constitutive view does not assume that communication generates the organization “from zero,” rather, organizing is both locally situated and contextually distant (Putnam & McPhee, 2009). Several assumptions that underlie this approach aid in reconceptualizing communication and high reliability cultures. The first assumption is that reliability occurs in the continuous flow of local interactions that are developed through complementary relationships among organizational members (Fairhurst & Putnam, 2004). Complementary relationships (e.g., supervisor-subordinate, mentor-mentee, etc.) are ones in which members interact on behalf of an organization or in relation to it; they represent the starting point for communication (Taylor & Van Every, 2000). In a complementary relationship, members have their own orientations and perspectives to an organization that they bring to their interactions. A second assumption is that these local interactions become layered and interwoven across time and in different communities of action with different practices and local rationalities. These communities of action reflect previous interactions while they guide ongoing sensemaking (Taylor, 2009; Taylor & Van Every, 2000). A core preoccupation of communication studies is reconciling how enduring understandings of an organization emerge from both local interactions and contextually distant communication (Fairhurst & Putnam, 2004). Research on HRO cultures would benefit from examining how contextually distant aspects of an organization become translated into unfolding activities through symbols that constitute history, engage in safety performances, and establish workgroup norms. How do organizations invoke symbols linked to their histories, lore, or distinctive reputations to motivate safety-related actions?


How do occupations in which bravery and resilience are considered badges of honor enable and control meanings of risk and failure? How does the history or reputation of an industry discourage organizations from promoting a culture of improved safety? In addition, examining how workgroups generate and reinforce norms for appropriate actions can connect these guidelines to regulations that require compliance. Many HROs are highly regulated operations. Specifically, in the areas of mining, wildland firefighting, nuclear power generation, and air transportation, regulations are an inescapable aspect of work. Therefore, how do safety documents participate in generating (or inhibiting) reliability, and how do members translate mandatory rules and procedures into ongoing actions and interactions in largely unpredictable circumstances? Relatedly, HROs are defined by their preoccupation with failure (Weick & Sutcliffe, 2001, 2007). Yet scholars have largely focused on the ways that avoiding failure fosters learning cultures while ignoring how such responses might create a chilling effect that obscures learning from failures and admitting future mistakes. This kind of inhibitor would be an important issue to overcome in achieving and maintaining reliability. Following the 1994 South Canyon wildland fire fatalities, for example, the US Forest Service discursively opened discussions about improving safety, activities that fostered a learning environment for a few years (Thackaberry, 2004). However, following the 2001 Thirtymile fire fatality incident, the Forest Service charged one of the employees as criminally responsible for the four deaths in the disaster (Maclean, 2007). This practice might cause firefighters to question whether their agencies will punish them for taking action that could later be interpreted as a “mistake.” This fear of doing the wrong thing might inhibit members’ abilities to act confidently in the moment—a core requirement of reliability. These institutional actions, while arguably removed from local interactions, send strong signals about an organization’s priorities and the degree to which management will support its personnel who are fighting fires on the ground. This chapter focuses on the importance of communication in achieving high reliability in organizations. It demonstrates how research draws on various metaphors of communication to reveal and obscure meanings and to enact norms linked to organizational culture. Through analyzing the metaphors that researchers embrace, scholars can gain an understanding of the strengths and weaknesses of existing research. By combining and embracing the symbol


and performance metaphors, researchers can adopt a more complex view of communication and its critical role in enacting high reliability.

References

Barton, M. A., & Sutcliffe, K. M. (2009). Overcoming dysfunctional momentum: Organizational safety as a social achievement. Human Relations, 62(9), 1327–1356. Bergeron, C. D., & Cooren, F. (2012). The collective framing of crisis management: A ventriloqual analysis of emergency operations centres. Journal of Contingencies and Crisis Management, 20(3), 120–137. Blatt, R., Christianson, M. K., Sutcliffe, K. M., & Rosenthal, M. M. (2006). A sensemaking lens on reliability. Journal of Organizational Behavior, 27(7), 897–917. Browning, L. D., Green, R. W., Sitkin, S. B., Sutcliffe, K. M., & Obstfeld, D. (2009). Constitutive complexity: Military entrepreneurs and the synthetic character of communication flows. In L. L. Putnam & A. M. Nicotera (Eds.), Building theories of organization: The constitutive role of communication (pp. 89–116). New York: Routledge. Brummans, B. H. J. M., Putnam, L. L., Gray, B., Hanke, R., Lewicki, R. J., & Wiethoff, C. (2008). Making sense of intractable multiparty conflict: A study of framing in four environmental disputes. Communication Monographs, 75, 25–51. Burke, C. S., Wilson, K. A., & Salas, E. (2005). The use of a team-based strategy for organizational transformation: Guidance for moving toward a high reliability organization. Theoretical Issues in Ergonomics Science, 6(6), 509–530. Cheney, G. (1995). Democracy in the workplace: Theory and practice from the perspective of communication. Journal of Applied Communication Research, 23, 167–200. Cheney, G., Straub, J., Speirs-Glebe, L., Stohl, C., DeGooyer, Jr., D., Whalen, S., Garvin-Doxas, K., & Carlone, D. (1998). Democracy, participation, and communication at work: A multidisciplinary review. In M. E. Roloff (Ed.), Communication yearbook 21 (pp. 35–91). Thousand Oaks, CA: Sage. Contractor, N. S., & Grant, S. (1996). The emergence of shared interpretations in organizations: A self-organizing systems perspective. In J. H. Watt & C. A. VanLear (Eds.), Cycles and dynamic processes in communication processes (pp. 216–230). Newbury Park, CA: Sage. Cooren, F. (2010). Action and agency in dialogue: Passion, incarnation, and ventriloquism. Amsterdam: John Benjamins. de Saussure, F. (1983). Course in general linguistics (C. Bally & A. Sechehaye, Eds., R. Harris, Trans.). La Salle, IL: Open Court. (Original work published 1916). Erickson, J., & Dyer, L. (2005). Toward a strategic human resource management model of high reliability organization performance. International Journal of Human Resource Management, 16, 907–928. Fairhurst, G. T., & Putnam, L. L. (2004). Organizations as discursive constructions. Communication Theory, 14(1), 5–26. Hoffman, M. (2002). "Do all things with counsel": Benedictine women and organizational democracy. Communication Studies, 53(3), 203–218. Jahn, J. L. S. (2016). Adapting safety rules in a high reliability context: How wildland firefighting workgroups ventriloquize safety rules to understand hazards. Management Communication Quarterly, 30, 362–389. Kassing, J. W. (2001). From the looks of things: Assessing perceptions of organizational dissenters. Management Communication Quarterly, 14(3), 442–470. Kassing, J. W., & Avtgis, T. A. (1999). Examining the relationship between organizational dissent and aggressive communication. Management Communication Quarterly, 13(3), 100–115.


Keyton, J. (2011). Communication and organizational culture: A key to understanding work experiences (2nd ed.). Thousand Oaks, CA: Sage. Klein, R., Bigley, G., & Roberts, K. (1995). Organizational culture in high reliability organizations: An extension. Human Relations, 48, 771–793. Krieger, J. L. (2005). Shared mindfulness in cockpit crisis situations: An exploratory analysis. Journal of Business Communication, 42(2), 135–167. Kuhn, T., & Corman, S. (2003). The emergence of homogeneity and heterogeneity in knowledge structures during a planned organizational change. Communication Monographs, 30, 198–229. Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press. La Porte, T. R. (1996). High reliability organizations: Unlikely, demanding and at risk. Journal of Contingencies and Crisis Management, 4(2), 60–71. La Porte, T. R., & Consolini, P. M. (1991). Working in practice but not in theory: Theoretical challenges of high-reliability organizations. Journal of Public Administration Research and Theory, 1, 19–47. Maclean, J. N. (2007). The Thirtymile fire: A chronicle of bravery and betrayal. New York: Holt. Morgan, G. (1986). Images of organization (1st ed.). Beverly Hills, CA: Sage. Morgan, G. (2001). Images of organization (2nd ed.). Thousand Oaks, CA: Sage. Murphy, A. (2001). The flight attendant dilemma: An analysis of communication and sensemaking during in-flight emergencies. Journal of Applied Communication Research, 29(1), 30–53. Myers, K. K. (2005). A burning desire: Assimilation into a fire department. Management Communication Quarterly, 18(3), 344–384. Novak, J. M., & Sellnow, T. L. (2009). Reducing organizational risk through participatory communication. Journal of Applied Communication Research, 37(4), 349–373. Oswick, C., Putnam, L. L., & Keenoy, T. (2004). Tropes, discourse and organizing. In D. Grant, C. Oswick, L. L. Putnam, & T. Keenoy (Eds.), The Sage handbook of organizational discourse (pp. 105–127). London: Sage. Putnam, L. L., & Boys, S. (2006). Revisiting metaphors of organizational communication. In D. Grant, C. Oswick, L. L. Putnam, & T. Keenoy (Eds.), The Sage handbook of organization studies (pp. 541–576). London: Sage. Putnam, L. L., & McPhee, R. D. (2009). Theory building: Comparisons of CCO orientations. In L. L. Putnam & A. M. Nicotera (Eds.), Building theories of organization: The constitutive role of communication (pp. 187–208). New York: Routledge. Rafaeli, A., & Worline, M. (2000). Organizational symbols and organizational culture. In N. M. Ashkanasy, P. M. C. Wilderom, & M. F. Peterson (Eds.), International handbook of organizational climate and culture (pp. 71–84). Thousand Oaks, CA: Sage. Roberts, K. H. (1993). Cultural characteristics of reliability enhancing organizations. Journal of Managerial Issues, 165–181. Roberts, K. H., Rousseau, D. M., & La Porte, T. R. (1994). The culture of high reliability: Quantitative and qualitative assessment aboard nuclear-powered aircraft carriers. Journal of High Technology Management Research, 5, 141–161. Roth, E. M., Multer, J., & Raslear, T. (2006). Shared situation awareness as a contributor to high reliability performance in railroad operations. Organization Studies, 27(7), 967–987. Scott, C. W., Allen, J. A., Bonilla, D. L., Baran, B. E., & Murphy, D. (2013). Ambiguity and freedom of dissent in post-incident discussion. Journal of Business Communication, 50, 383–402. Scott, C. W., & Trethewey, A. (2008). 
Organizational discourse and the appraisal of occupational hazards: Interpretive repertoires, heedful interrelating, and identity at work. Journal of Applied Communication Research, 36, 298–317. Smircich, L. (1983). Concepts of culture and organizational analysis. Administrative Science Quarterly, 28, 339–358.


Stohl, C. (1986). The role of memorable messages in the process of organizational socialization. Communication Quarterly, 34, 231–249. Taylor, J. R. (2009). Organizing from the bottom up? Reflections on the constitution of organization in communication. In L. L. Putnam & A. M. Nicotera (Eds.), Building theories of organization: The constitutive role of communication (pp. 153–186). New York: Routledge. Taylor, J. R., & Van Every, E. J. (2000). The emergent organization: Communication as site and surface. Mahwah, NJ: Erlbaum. Thackaberry, J. A. (2004). “Discursive opening” and closing in organizational self-study: Culture as trap and tool in wildland firefighting safety. Management Communication Quarterly, 17(3), 319–359. Trice, H. M., & Beyer, J. M. (1993). The cultures of work organizations. Englewood Cliffs, NJ: Prentice Hall. Turner, V. (1980). Social dramas and stories about them. Critical Inquiry, 7, 141–168. Weick, K. E. (1979). Social psychology of organizing. Reading, MA: Addison-Wesley. Weick, K. E. (1987). Organizational culture as a source of high reliability. California Management Review, 29, 112–127. Weick, K. E. (1995). Sensemaking in organizations. Thousand Oaks, CA: Sage. Weick, K. E., & Sutcliffe, K. (2001). Managing the unexpected: Assuring high performance in an age of complexity. San Francisco: Jossey-Bass. Weick, K. E., & Sutcliffe, K. M. (2007). Managing the unexpected: Resilient performance in an age of uncertainty (2nd ed.). San Francisco: Jossey-Bass. Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (1999). Organizing for high reliability: Processes of collective mindfulness. Research in Organizational Behavior, 21, 81–123. Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (2005). Organizing and the process of sensemaking. Organization Science, 16(4), 409–421. Zoller, H. M. (2003). Health on the line: Identity and disciplinary control in employee occupational health and safety discourse. Journal of Applied Communication Research, 31(2), 118–139.

chapter 9

Extending Reliability Analysis across Organizations, Time, and Scope
Paul R. Schulman & Emery Roe

This chapter addresses a number of challenges to the promotion of high reliability as an organized process, and it does so by looking carefully at the concept of reliability itself. We extend the frame of reference for understanding reliability, not only to encompass larger-scale networks of organizations but also in relation to time periods over which events and effects are managed for the production of "reliable" outcomes. In addition, we differentiate standards of reliability and the alternative approaches and strategies to pursue them. These efforts to expand the concept of reliability in scale, scope, and time are founded on a recognition from our reliability research into interconnected infrastructures (Roe & Schulman, 2016) that modern organizations increasingly cannot autonomously design, control, or even fully assess the reliability of their own internal processes. In key respects, high reliability can no longer be a property of single organizations but is in fact becoming an interorganizational or networked property (de Bruijne, 2006). Unfortunately, it is far from clear that key decision makers know how to manage for reliability, nor have organizational researchers fully analyzed reliability as a networked property. An ongoing problem with the analysis of organizational reliability is that its focus has been just that—the reliability of single organizations. From the


early period of research on HROs, the basic unit of analysis was the organization, even when those “organizations”—the flight deck operations of a nuclear aircraft carrier, the Diablo Canyon nuclear power plant, or individual ATC centers—were units of larger organizations (Rochlin, La Porte, & Roberts, 1987; La Porte & Consolini, 1991; Roberts, 1993; Schulman, 1993) or, as in the case of ATC centers, were enmeshed within a larger network of air traffic control and air transportation organizations (Sanne, 2000). In order to extend the perspective on high reliability, we first address input factors (both supports and constraints), including supply chains (on which organizations are dependent for reliable performance), regulatory organizations, planning activities, and, importantly, public attitudes toward specific risks associated with production or service provision activities. After introducing the special topic of reliability within interconnected infrastructures, we then address output factors including downstream organizational and social dependencies that develop on a foundation of reliable performance as well as risks that extend across time and generations. We explore these factors as analytic variables central to the future analysis of organizational reliability. We conclude with an assessment of the likely future of reliability itself as a feature of complex interorganizational processes.

Expanding Reliability Input Analysis

Public Attitudes toward Organizations That Manage Risks

The original HRO research looked at organizations with high reliability standards that focused on certain failures or events that could not be allowed to happen. These were treated as "precluded events," such as the environmental release of radioactive materials or gases from a nuclear power plant, the collision of commercial aircraft under air traffic control, and the loss of readiness for the flight deck operations of a major nuclear aircraft carrier (La Porte, 1996; Roberts, 1990; Schulman, 1993; Rochlin et al., 1997). HROs were organized to deterministically prevent events like these from happening. This precluded-event reliability constitutes a different standard from the probabilistic or marginal reliability that prevails in many conventional organizations. In the latter, reliability is traded off against other organizational values, such as efficiency, speed, volume in production, constancy of service, or cost containment.


In HROs reliability is nonfungible—it is not traded off at the margins for other values. These organizations can and do shut down services or output rather than risk a precluded event. A foundation of this reliability standard is the very precarious “niche” occupied by each organization with respect to its task environment where, as Todd La Porte put it, “the first error could be the last trial” (La Porte & Consolini, 1991). Relatively little high reliability research, however, has probed the character of these niches, the reasons they exist, and the public attitudes and psychology that come to shape them.1 The evolution of commercial nuclear power in the United States, for instance, has been dramatically affected by an intense public worry concerning nuclear energy—what one analyst has called the “nuclear fear” (Weart, 2012) arising out of its original weapons association. This public attitude has led to a long-standing, ever watchful public concern expressed by a variety of active antinuclear groups (Mothers for Peace in San Luis Obispo, site of the Diablo Canyon nuclear plant, for example) and a close regulatory scrutiny of the nuclear industry by regulators, such as the US Nuclear Regulatory Commission (including two on-site resident inspectors at all operating nuclear plants) as well as the industry group INPO. It could well be argued that a necessary cornerstone for the pursuit of high reliability in organizations managing hazardous technical systems (such as nuclear reactors or air traffic control) is public dread of the hazards (Schulman, 2004). In its absence we do not see examples of the adoption of precluded-event standards of reliability. In health care, or even in natural gas systems, for instance, there is little sense of collective public risk nor a prevailing public dread of the risks; medical errors as well as gas explosions are hardly precluded and indeed have been socially tolerated in the operation of these infrastructures. The hazards at the center of public dread are understood by the public to be involuntary and collective. The dread of these hazards is both a constraint and a support for high reliability management. It constrains because careful and often hostile media and regulatory scrutiny attend the operation of the hazardous systems and any errors made in their operation. At the same time, it supports precluded-event reliability commitments in organizations because, while public pressure results in a variety of regulatory requirements, these same regulations may also demand and justify reliability investments well in excess of basic operational and managerial requirements.


Regulation can also sustain these investments in market organizations because market rivals cannot undercut required safety measures and expenditures to secure a competitive advantage. Continuing pressures from public dread can also help safeguard high reliability commitments in organizations from internal pressures to trade off reliability from competing values such as speed, efficiency, or other cost-cutting measures. Another distinctive feature of public dread in relation to high reliability is its stability over time. In this respect dread can be contrasted to another public attitude—condemnation following accidents or failure. Dread is prospectively focused on what might happen. Condemnation is retrospectively focused on what has happened. Dread, in this respect, is a stable fear. Condemnation, however, is episodic or cyclical. After time, the public’s condemnation recedes, following what Anthony Downs termed an “issue attention cycle” (Downs, 1972). This attenuation in public attention can leave regulatory agencies precariously positioned politically to continue their close regulation of hazardous operations. This in turn can make reliability investments and commitments more fungible in organizations and vulnerable to trade-offs to the competing values mentioned earlier. Thus dread, as opposed to after-the-fact condemnation, is a significant factor in protecting organizations from a “drift into failure” (Dekker, 2011; Pettersen & Schulman, 2016). Given its impact, the dynamics of public perceptions are important to ­understand—specifically, in relation to perceptual shaping and mediating factors. Public perceptions consist of both public expectations and aspirations. Sometimes these are matched and reinforcing. This is a stable foundation for the pursuit of high reliability. When expectations fall significantly below aspirations, condemnation heightens and leads to amplifying spirals of public disillusionment and despair. Obviously, the media has a major impact on public perceptions. The US media tends to be event-oriented, causing general perceptions about the safety of an industry to focus around single incidents that can undermine previous expectations and sharpen their conflict with aspirations. It is difficult for an organization to “bank” a positive reliability image gained or sustained over the years against a sudden lapse in safety. Public perceptions and pressures are also affected by the character of actual failures. In HROs precluded-event management is directed toward risks to


safety that are collective and external—that imperil people outside an organization, not only its employees or individual clients. In other industries threats may be primarily internal and/or individuated. In construction, injuries are confined principally to the job site. In medical organizations, safety risks are largely internal and individuated. Except for the public health arena, medical errors do not typically spill out to the public beyond a hospital or health-care facility and do not extend beyond the individual patient (Schulman, 2004). Public Perceptions and Alternative Reliability Standards Finally, public perceptions are also influenced by assumptions about the inevitability of organizational failure or error. In our research on critical infrastructures, we have found that many interconnected infrastructures are not in a position to preclude significant events (Roe & Schulman, 2016). First, the major infrastructure events of concern to modern society are the loss of service itself and its aftermath. Given social dependence on infrastructure services, the loss of service on its own can constitute a societal catastrophe. But infrastructures cannot preclude the loss of service in the other infrastructures they depend on or which depend on them. Temporary disruptions and loss of service for major infrastructures are not that infrequent. Modern interconnected infrastructures cannot preclude failure in the face of natural catastrophes such as earthquakes, major fires, and floods, which cause destruction of assets and an extended loss of service. Loss of key services in turn can produce concatenations of downstream failure. Failures and consequences of this type may actually be accepted by the public under a different standard of what might be termed inevitable-event reliability. Here, restoration after a disruption becomes itself a “service”—and the accepted reliability standard becomes how quickly can disrupted service be restored. Crisis management and recovery after failure can be the public test for reliability in an interconnected infrastructure system. The postrecovery system and its operation under a new normal then pose yet another organizational reliability test—the new output must at least attain if not exceed the performance levels of prior normal operations. For still other events, societal aversion may be less strong and persistent. Refinery and natural gas explosions and Internet hacking arouse the aforementioned social condemnation, but public attention is evanescent and only


retrospectively energized. Regulatory and managerial attention may similarly follow a cyclical pattern. When societal dread is less salient, precluded events unclear, and technical control less complete, an infrastructure may internally adopt its own "avoided-events" standard for reliability. Here there are primarily internal events that the infrastructure tries to avoid but cannot or chooses not to attempt to preclude.2 Infrastructures may, however, prioritize these events against other values, including other risks that are considered less compelling. These are "should avoid" rather than "must never happen" events, and they can occur in normal operation, disruption, or restoration conditions. For example, a natural gas distribution system seeks to avoid depressurization of its lines in both normal and disrupted operations due to the delay, difficulty, and risk of subsequent relighting of pilot lights, one by one. It is possible, of course, that pursuing an avoided event may make a socially adverse, but not precluded, event more likely. That effort by a natural gas provider to maintain line pressure to avoid costly and hazardous repressurization processes may make gas explosions more likely in the face of leaks. Depending on public perceptions, even a socially must-prevent event or internally should-avoid event can be framed within a publicly acceptable compensable reliability standard—if the new normal (based on lessons learned, technical systems redesigned, or organizational reforms) works for a higher future reliability compared to what preceded it. The Three Mile Island accident proved to be such a compensable event when it led to significant safeguards and much safer industry practices in nuclear power. Whether an event can be compensable depends on public perceptions and political reaction as much as on actual learning and technical, organizational, and managerial changes that occur later on. It is interesting to note that two types of reliability—precluded-event and avoided-event reliability—are founded on preventing specific things from happening. Psychologist Karl Weick has described this reliability as "a dynamic non-event . . . continuously reaccomplished" (2011, p. 25). This formulation, however, does not cover the other two types of reliability. Inevitable- and compensable-event reliability are about very distinctive things that do happen—namely, disruptions or failures and effective restoration or recovery to a notably improved new normal.


This necessarily brief analysis argues that public perceptions need to be more carefully understood as important explanatory elements in shaping internal and regulatory approaches taken to reliability management within an organization. It may well be assumed (wrongly) that public perceptions don't change whether an organization is actually reliable or not, but in reality public perceptions concerning risk do affect the reliability standards, supports, and constraints surrounding the reliability management of organizations. Public perceptions, for example, are a key element in determining whether a precluded- or avoided-event standard will be adopted as a reliability strategy. Analyzing reliability without taking into account the larger public context within which it is pursued leads to an incomplete understanding of the management of reliability as a social as well as organizational process.

Regulatory Inputs for Reliability

Public safety perspectives regarding an organization have two important dimensions: first, the reliability standard applied to it, ranging for our purposes from precluded event to inevitable event to avoided or compensable event; and second, how much public trust is accorded the expertise of operators and managers of these risks. These perspectives in turn shape the regulatory context—how intensive and extensive such regulation will be and how adversarial or cooperative regulation will be. In understanding organizational reliability, at whatever standard it is practiced, it is important to recognize the effects that external regulation can have on internal reliability management. On the positive side, strong external regulatory attention adds to the internal organizational status of personnel with expertise relating to regulatory compliance. Safety management regimes and their officials can gain an enhanced role in organizations from external public safety pressure and its impact on regulatory oversight (Hayes, 2013). But the positive relationship depends on public confidence in the reliability and safety management generated within these organizations. If substantial public doubt about the effectiveness of an internal safety regime (Amalberti, 2013) exists, then the regulatory relationship will likely become adversarial and prescriptive so as to limit discretion or autonomous initiative on the part of internal safety managers. Frequent external regulatory inspections may preempt and further diminish the authority of these internal safety managers.


Regulatory approaches, whether cooperative or adversarial, can in this way shape specific safety management regimes within individual organizations, including the strategies they establish for selection, training, job design, accountability, performance assessment, supervision, rewards, conflict resolution, protocols, and information systems, and the specific reliability structures and procedures they adopt. In our infrastructure research, we have found it is necessary to distinguish two aspects of regulation: regulation for reliability (the commitment of a regulatory agency to reliability and safety as goals of regulation) and regulatory reliability (the degree to which regulators are able and competent in their own right to avoid prescriptive error and to promote reliability in a targeted organization). Regulatory Reliability versus Regulation for Reliability In the most general terms, regulatory agencies are created to promote a public interest through standards, legal constraints and requirements, inspections, and enforcements applied to organizations and their operational decisions and actions. The public interest sought by regulators can depend on the reliability of their own internal processes.3 From this wider perspective, regulation is frequently an exercise in what we term “macro design,” establishing formal anticipatory principles as a framework under which to control behavior as a means of guaranteeing safety and reliability through compliance. In a focus on regulation, we are now considering a larger scope of reliability input variables, and we are also extending the analysis of these variables over time. Regulatory requirements in rules, procedures, and fines can have effects on real-time operations over many years. They can influence protocols of operation and standards of reliability as well as internal organizational culture relative to reliable operation. Further, regulation can become part of the background reliability assumptions of personnel, which helps to establish a bandwidth of “acceptable” conditions for reliable operation.4 But this design-based approach and its compliance focus, while a necessary element in ensuring reliability and safety, is insufficient to produce the levels of reliability and safety that would match public precluded-event expectations and demands. We have found in our research that experience and tacit knowledge must compensate for limitations in formal design in the promotion of safety in complex operations. In fact, a major role for control


operators in many infrastructures that we have observed is to buffer errors or incompleteness in formal design relative to the range of conditions and states that can be expressed by complex systems in their real-time operations (Roe & Schulman, 2008). Just as macro design is not the full answer to reliability and safety in operations, so formal rules and prescriptions in regulation are not the full solution to achieving high reliability and safety in human behavior (Schulman, 2013). Macro designs are subject to representational error—that is, their failure to depict accurately how the system they cover actually works, case by case. These errors can undermine the ability of regulators to produce the reliability effects they seek. Sensitivity and attention to the possibilities of error or incompleteness in formal reliability and safety regulation is a major part of what we term “regulatory reliability.” Regulatory reliability involves a distinction between goal-focused management and error-focused management. In organizations we have analyzed, it is evident that reliability and safety are not understood as discrete goals in their own right but rather as outcomes of processes organized, among other things, to detect and prevent errors of misunderstanding, misperception, and miscommunication surrounding hazardous operations (Schulman, 2004). These error-detection and prevention processes are not permanently locked into place by formal structures, rules, or procedures. They must be ongoing and subject to continual renewal and improvement, not only to avoid the representational drift just mentioned but also to cope with changes in technology and the task environment. There are other contrasts between the two regulatory approaches (see Figure 9.1). Within this set of contrasts, regulatory reliability has several dimensions that can be sketched briefly. Clearly, regulatory rules, requirements, and constraints are instrumental in ensuring that safety and reliability as values are not lost in organizations among competing values such as return on investment, efficiency, speed, and the like. Regulation also functions as a background condition for safety and reliability management in interconnected infrastructures, so that each infrastructure can assume some base-level of predictability in the others. But regulatory reliability must be more. It must also entail error-focused management in regulatory agencies themselves. This means a constant reappraisal with respect to questions such as: To what extent do regulatory activities


figure 9.1 Goal-focused vs. Error-focused Regulation

Definition of Safety
  Goal-focused regulation: Specific set of targets measured and attained
  Error-focused regulation: Clearly defined set of events to be precluded or avoided, plus a broader set of avoided precursor conditions as "proxies" of safety

Focus of Attention
  Goal-focused regulation: Goals to be achieved
  Error-focused regulation: Processes to be continued

Performance Metrics
  Goal-focused regulation: Retrospective (successes)
  Error-focused regulation: Prospective (risks and failures)

promote and enhance error management in regulated organizations and industries? To what degree might regulatory activities push organizations away from error management to a more narrow compliance goal orientation toward safety and reliability? To what degree might adversarial relations between regulators and organizations lead to formalization and rigidity of safety management? Answers to these questions have profound implications for regulatory reliability. In particular, if regulators themselves define “safety” as a formal goal of regulation, the risk is that the regulatory purview will narrow to a small set of compliance metrics. The regulatory time perspective may become retrospective and not prospective. A regulatory goal perspective may increase representational errors insofar as the goals and their metrics might give an incomplete or misleading picture of how reliable and safe regulated operations in an organization may be. Organizations and their members will distort their behavior toward looking good in measured goals if the stakes are high in achieving them. As a saying in organizational analysis affirms: Organizations will do what you inspect, not necessarily what you expect. Formal goals, rules, and procedures can also be founded on a misunderstanding of the actual activities toward which they are directed. The use of formal checklists to define and measure a “safety culture,” for example, can lead to regulatory error. As one observer noted when a regulated organization was required to demonstrate its safety culture: They are being asked to reify something that they do into something that they have.5


This effort to formalize often requires that a regulator deal with “knowledge by description” in place of “knowledge by acquaintance” (Baron & Misovich, 1999; Weick, 2011). But formal descriptions cannot fully or accurately represent what is tacitly known by operators and professionals who work the complex systems. The translation error in converting one type of knowledge into the other can easily lay the foundation for regulatory error (Hollnagel, 2014). Regulatory reliability requires sensitivity to these types of regulatory error. It may also require that regulatory agencies attempt to enhance the internal error-detection capabilities of a regulated organization even as they make continuous efforts to enhance their own. This is likely to be a cooperative rather than an adversarial challenge for both parties. Yet in our view, the full weight of the regulatory reliability challenge has been neglected by both regulatory agencies and their individual regulatees: the challenge is increasingly one of promoting the management of reliability across organizations, particularly interconnected organizations in networks. Networked reliability and risk are now in combination one of the most important challenges facing the understanding and pursuit of system reliability in the modern era (de Bruijne, 2006; Roe & Schulman, 2016). Planning As an Input to Reliability In considering planning we again extend the understanding of reliability over time. Planning decisions can set the stage for reliability effects far into the future. With planning also comes an expansion of scope—considering additional risks that could come into play, such as threats to environmental conditions, that could arise as future consequences of today’s reliable output. Just as we distinguish regulating for reliability from reliable regulation, we must also distinguish planning for reliability from reliable planning. Reliable planning is about the process of planning itself—a process for selecting appropriate means for achieving expected outcomes in safety and reliability. Planning can assure assets and resources that can maintain or increase reliability in the future. In electrical grids, for example, transmission planning can assure future line capacity that can sustain grid reliability in the face of rising electricity demands. Conversely, planning failures can undermine electrical reliability, and the effects can extend far into the future because it can take as long as ten years to site and build new transmission lines.


Two error types in particular undermine reliable planning and therefore the future reliability of organizations and networked infrastructures:

• Errors of underestimated uncertainty. These are errors of overconfidence and overplanning to a level of false precision relative to forecast accuracy, cost estimations, and implementation capability. "Answering the wrong questions precisely" is another form this error takes (Mitroff & Silvers, 2009; Kahneman, 2013). Here planning can propagate a misunderstanding of how a system actually works and what variables should be addressed to preserve reliability over time.

• Errors of decision avoidance. These are errors of delay, inaction, or omission in planning relative to reliability and safety risks. Here the absence of planning adds to future risks, as in the earlier case of failure to plan for transmission lines needed to carry future load or the failure to plan for climate change and sea level rise in flood protection policy.

From the standpoint of reliability analysis, it is important to recognize that current real-time operations are founded, both in positive and negative aspects, on the results of previous planning. When it comes to planning as an input, as one transmission engineer put it: “Everything that happens in planning ends up in ops.”6

Networks and Interconnected Reliability

A major feature of modern reliability is that it can be both an input and an output property. In our research on modern infrastructures such as electric power, water, marine transport, telecoms, and dams and levees, it became quickly apparent that the outputs of any one of these infrastructures can depend on the inputs from others (Hokstad, Utne, &Vatn, 2013; Palfrey & Gasser, 2012; Roe & Schulman, 2016). Infrastructures such as water or telecoms depend on the long-term reliability of electric power, for instance. But the interconnections can be much more complex than a simple one-way serial dependency.7 While telecoms may depend on electricity, electricity reliability may depend on telecoms to coordinate the repair of downed lines. This is a bidirectional interdependency. Meanwhile, some infrastructures such as a high-voltage electrical grid or an air traffic or marine vessel control center establish a pooled or


mediated interconnectivity among multiple users. For the grid, generators are pooled together in the supply of electric power and users are tied to generators through their grid connections. Pooled interdependency can socialize risk among diverse participants because of aggregated resources. But if load is high in the grid and cuts severely into generation reserves, the pooled interconnectivity can shift into a reciprocal unmediated interdependency of each generator on another. All in effect are hostage to the least reliable among them because a single generator failure could trip the grid into conditions where blackouts are necessary, affecting both energy suppliers and users. Some infrastructures, such as the Internet, place many users into potentially direct, reciprocal interconnectivity with one another. This reciprocal interconnectivity lies at the heart of the functionality we seek from the Internet. Yet at the same time, it challenges the reliability of large numbers of users and the Internet itself. As has been noted, it is vulnerable to cyberattack from any location, any scale, and at varying degrees of precision (Demchak, 2011). These varying types of interconnectivity each have different implications for the reliability of operations in the individual organizations. Moreover, the configurations themselves can change as a result of shifts in the performance states of networked components. Specific types of interconnectivity that exist within normal operations can also shift when one or more of the systems moves into different states along a range of operational conditions (e.g., shift from normal operations into disruption or failure). The shift points in interconnectivity are exceedingly important to appreciate, as they represent the transformation of what were latent interconnections (i.e., not part of normal operations) into manifest ones. To understand this process, we need to rethink reliability in relation to output variables.

Expanding Reliability Output Analysis

In classical HRO analysis, the concept of “high reliability” was treated as a nominal variable. It referred to a special category of organization. High reliability meant precluding dreaded events in these organizations. The events were either precluded or they weren’t. High reliability defined one of two binary states, normal operations or failure. In this perspective reliability wasn’t considered a continuous variable—there was no framework for analyzing higher or lower or more or less reliable organizations. But, here again consid-
ering reliability over time and scope leads us now to consider a larger system perspective beyond that of a single organization (Leveson, Dulac, Marais, & Carroll, 2009). “Reliability” takes on new meaning, as does that overgeneralized term “resilience” (Hopkins, 2014; Boin & van Eeten, 2013). Our research into interconnected infrastructures highlights that these individual infrastructures assume a variety of performance conditions or states beyond normal operations and failure. Their outputs can range across at least six states that affect customers, clients, and connected infrastructure organizations differently: normal operations, disruption, restoration, failure, recovery, and establishment of a “new normal” (Roe & Schulman, 2016). Obviously, operational reliability is directed toward maintaining operations as continuously and safely as possible. At times, events or stresses may push an organization and its operating personnel near or even into their precursor zone of unwanted, higher-risk operating conditions. High reliability, then, requires that organizations be able to exercise what we term “precursor resilience.” This is the capability to recognize precursor conditions and respond to move quickly back into acceptable operating conditions. Precursor resilience keeps outputs stable and continuous throughout this process so that in essence no downstream effects occur and the exercise of precursor resilience may well be invisible outside an individual organization. But in highly dynamic networked systems, this resilience may not be possible and alternate states occur: disruption, which we define as a temporary and often partial loss of function or service, generally not over twenty-four hours; or failure, the loss of function or service, associated with the destruction of structures, equipment, or other assets, lasting considerably longer than twenty-four hours. The state of disrupted service is transient and can lead to two follow-on states: restoration, which can lead back to normal operation, or disruption, which can fall over into failure. Failure, given its association with the destruction of assets, can be protracted and may be terminated only by recovery and a new normal. What does reliability mean in these additional infrastructure states? In disruption, reliability is associated with the speed and surety of restoration back to normal operation and the decreased likelihood that disruption will decay or flip into failure. In disruption, it’s the services, not the infrastructures that disappear, and control operator reliability during disrupted service can be directed to containing lags, lapses, and errors that would delay restoration.
This reliability is closely associated with what we term "restoration resilience," the ability of operators and managers to restore outputs quickly and prevent disruptions from lapsing into failure.

The Special Reliability of Interinfrastructural Recovery

In failure the infrastructure (in whole or part) disappears, such as the collapse of a levee, the explosion of a pipeline, contamination of a water system, or the destruction of major transmission assets. Reliability here means the speed and surety of reconstructing both physical systems and their interconnectivity. This process, different from restoration after a temporary disruption in service, is associated with what we term "recovery resilience." Control operators may need to formulate entirely new strategies to aid this recovery process. Recovery reliability can also mean that the recovered (new normal) operations in the infrastructure are at least as reliable, if not upgraded in reliability, as before failure, and this may entail the cooperation of higher-level organizational executives, political leaders, and policy makers. Recovery is especially an interorganizational process. In our view, recovery has special features that bear on how we should understand reliability from a networked perspective. For one thing, a precluded-event or avoided-event reliability standard of normal operations is likely to be moot once failure has occurred involving multiple infrastructures. The reliability challenge now must be directed toward coordinating a variety of diverse activities across many organizations, including special emergency response organizations (often called incident command centers) that may now be mobilized (within the United States at federal, state, and local levels) as a consequence of the failure.8 Such building back poses its own organizational and cognitive challenges for control operators as well as coordination challenges for policy makers, political leaders, and regulators. Given unique and contingent aspects of each failure, the standard of reliability applied to this process may only be retrospectively determined.

Long-Term Outputs and Reliability Analysis

A last set of reliability issues arises as time and scope dimensions associated with outputs lengthen still further. These issues can be addressed by considering two questions: Reliability with respect to what? Reliability with respect to whom?

An organization may have a record of high reliability in providing a given set of services and preventing a set of publicly dreaded precluded events. But this set often constitutes a narrow view of reliability in both time and scope. Over time unobserved, unrecognized, or discounted actions or events can add up cumulatively to a large-scale problem or risk.9 A number of naval bases that once seemed to be reliable operations routinely stored fuel and other fluids in tanks underground. Over time some of these storage tanks leaked, contaminating groundwater. As bases closed, the tanks became major hazards to future land use. Even nuclear power plants that operated as archetype HROs were and still are contributing to the growing problem and risks of radioactive waste. Concerns for environmental sustainability may well lead analysts and the public to rethink what constitutes high reliability beyond today's precluded-event frame of reference. If the "reliability with respect to what" question arises with time and scope, so does that of "reliability with respect to whom." Reliability analysis, cast on an extended time frame, must inevitably address risks and consequences that current "reliable" operations are likely to impose on future generations. The loss of future environmental resources is one example. Ironically, one negative and unintended future consequence can stem from current reliability itself. As mentioned, infrastructure reliability can increase dependency on infrastructure services on the part of other networked infrastructures as well as society in general. A major pillar of modernity indeed is the social pace, pattern, and scale of contemporary life that has evolved largely to match current infrastructure capacities. Modern social life is closely attuned to, not merely dependent on, the functioning of its infrastructures. "Always-on" infrastructure reliability is not just taken for granted; it is a prerequisite of modern social life. Already this has led to increased social vulnerability to infrastructure failure, and not just to terrorist attack. As is now well recognized, the failures of modern infrastructures can create their own social catastrophes. In this sense current levels of infrastructure reliability may lay the foundation for vulnerability among successive generations to a deeper form of social catastrophe.

Conclusions: On the Future of Reliability and Its Analysis

We have argued here that the analysis of organizational reliability is too limited in time and scope. On the input side, single organization reliability
analysis fails to address important external variables necessary to understand the foundations of reliability and challenges to it. Reliability analysis has failed to embrace the fact that reliability is more and more an interorganizational and networked property. It is not clear that we know how to manage for ­interorganizational reliability and its increased dimensions of scale, scope, and time. The role of extra-organizational variables such as public psychology regarding specific risks has been neglected, and this has helped to produce a partial and misleading picture of the autonomy of reliability practices and processes within single organizations. The importance of regulatory and planning processes as inputs to reliability has also been neglected in reliability analysis. Many reliability studies are all too often ahistorical—amounting to the study of operations and practices in a single and limited slice of time. This severely limits our attention to regulation and planning as reliable operations in themselves, as well as our attention to the propagation of errors in these processes onto present or future reliability practices. On the output side, reliability analysis has largely been framed around the concept of high reliability and has neglected the existence and/or emergence of other standards of reliability. We have also failed to consider the multiple performance states and diverse types of interconnections that organizations and their core technologies can assume and the reliability challenges associated with each.10 In addition, reliability analysis has had too limited a frame in time and scope to inspect all the outputs of production and service processes from the perspective of risks to the environment and to future generations. It would be a “tragic choice” indeed if high reliability in some present-day infrastructures is paving the way for more difficult failures later on. Reliability research will need to extend across a variety of research fields to address these enlargements. We believe involving multiple fields of study is necessary to create more productive links between our research and the world of organizational practice. At present, too many managers have approached reliability researchers with questions, and more often entreaties, along the lines of, “How can I transform my organization into an HRO?” or “How can we make our industry as reliable as nuclear power and aviation?” In doing so they have no idea, nor often do we investigators, of the full range of input and output variables that bear on these questions. We need to enlarge our research
agenda so that our answers to these questions do not in themselves constitute representational errors. Currently, a great deal of practical strategy in organizations centers on new technology as the key to enhanced reliability. This is particularly the case in the domain of critical infrastructures. The "smart grid" is one example of this deus ex machina approach. While technologies can and have in the past improved reliability through the reduction of human error, the major assumption behind this is that the problem with reliability is nearly always human error at the "sharp end" of operations. An engineer once expressed it this way: "I try to design systems that are not only fool proof but damned fool proof, so even a damned fool can't screw them up!"11 But from our research into control rooms, it seems to us that a great many errors attributed to control operators are forced errors—a consequence of technological designs that are demanding, unforgiving, and filled with errors in their own right. A good deal of the value added to reliability from control operators comes from their ability to fashion "make do's" or "work-arounds" to overdesigned technical systems necessary to make them work. This operational redesign, as we call it, compensates for a great many design errors in first-generation technological "advances" (Roe & Schulman, 2008). Enhanced reliability research is important to the future of reliability as a means to identify both the likely positive and negative reliability effects of many design-based input factors such as technology, regulation, and planning. At present control operators in many critical systems are in effect continually "beta testing" the designs and prescriptions of others, without a strong enough research foundation to assess their risks to both themselves and the public. In the meantime it is important to recognize and address an immediate threat to the future of reliability analysis itself. Researching reliability and risk management practices is becoming much more difficult. Access to the control rooms of large sociotechnical systems, never an easy matter in the United States, is in many cases closing: "9/11 changed everything," a senior Coast Guard emergency manager told us, and we couldn't agree more.12 When you add homeland security to the growing restrictions on control room access because of proprietary and market pressures, it is no surprise that entry into, continued observations of, and long-term analysis of the reliability management of critical systems in the United States have themselves become unreliable.

The challenge of continued research is great, especially as longitudinal studies, not point-in-time or short-term investigations (so favored by funding agencies and consultants), are needed to capture shifting interconnectivity configurations as well as assess the stability of reliability mandates, regulatory relationships, and public attention. Without steady interorganizational and control room access by researchers and younger scholars, the reliability management of many organizations will become even more of a “black box” lying outside careful analysis and deep understanding by scholars, regulators, policy makers, and the public.

Notes

1. There is, of course, a great deal of research on public attitudes toward risk but little on the effects of these attitudes on organizations that must manage these risks. One exception has been a small strain of research on institutional trust and confidence (Schöbel, 2009; Metlay, 1999; La Porte & Metlay, 1996).
2. In effect many precursor events and conditions identified and managed under a precluded-event standard are avoided events for the infrastructures in question.
3. In some regulatory organizations, consumer rate containment is also an important public interest sought in regulation. This can lead to potentially contradictory objectives in a single regulatory organization.
4. For an analysis of high reliability bandwidths, see Emery Roe and colleagues (Roe, Schulman, van Eeten, & de Bruijne, 2002).
5. Ragnar Rosness, SINTEF. Statement at a NetWork conference titled "How desirable or avoidable is proceduralization?" Tarn, France (December 9–11, 2010).
6. Chris Mensah-Bonsu, Regional Transmission Engineer, California Independent System Operator, March 19, 2012. Personal communication.
7. Serial, pooled, and intensive interrelationships have been described in a classic work by organization theorist James D. Thompson (2003). Additional interrelationships are explored in La Porte (1975).
8. In important respects the emergency response part of a failure and recovery process often doesn't have clear reliability standards associated with it. In firefighting organizations, for example, there is a strong commitment not to have fatalities among firefighters. But this is clearly not a precluded-event reliability standard, and while it may constitute an avoided-event standard among firefighters, it is not necessarily an accepted reliability standard for the public, which may well use speed of containment and acreage and number of homes lost as reliability performance standards by which firefighting is judged.
9. The problem of these "crescive" events has been explored insightfully by Thomas Beamish in Silent Spill: The Organization of an Industrial Crisis (2002).
10. We have introduced these topics here, but they are much more fully discussed in Roe and Schulman (2016).
11. Ray McClure, Precision Engineering Program, Lawrence Livermore Laboratory, September 1983. Personal communication.
12. Scott Humphrey, Vessel Traffic Service, United States Coast Guard Sector San Francisco. Personal communication.

References

Amalberti, R. (2013). Navigating safety: Necessary compromises and trade-offs—Theory and practice. New York: Springer.
Baron, R., & Misovich, S. (1999). On the relationship between social and cognitive modes of organization. In S. Chaiken & Y. Trope (Eds.), Dual process theories of psychology (pp. 586–605). New York: Guilford.
Beamish, T. (2002). Silent spill: The organization of an industrial crisis. Cambridge, MA: MIT Press.
Boin, A., & van Eeten, M. J. G. (2013). The resilient organization. Public Management Review, 15(3), 429–445.
de Bruijne, M. (2006). Networked reliability: Institutional fragmentation and the reliability of service provision in critical infrastructures (Doctoral thesis). Delft University of Technology, The Netherlands.
Dekker, S. (2011). Drift into failure: From hunting broken components to understanding complex systems. London: Ashgate.
Demchak, C. (2011). Wars of disruption and resilience: Cybered conflict, power, and national security. Atlanta: University of Georgia Press.
Downs, A. (1972). Up and down with ecology—The issue attention cycle. The Public Interest, 28 (Summer), 38–50.
Hayes, J. (2013). The regulatory context. In Safety Management in Context (pp. 69–71). Zurich: Swiss Re Center for Global Dialog.
Hokstad, P., Utne, I. B., & Vatn, J. (Eds.). (2013). Risk and interdependencies in critical infrastructures: A guideline for analysis. New York: Springer.
Hollnagel, E. (2014). Safety-I and safety-II: The past and future of safety management. London: Ashgate.
Hopkins, A. (2014). Issues in safety science. Safety Science, 67, 6–14.
Kahneman, D. (2013). Thinking, fast and slow. Princeton, NJ: Princeton University Press.
La Porte, T. (Ed.). (1975). Organized social complexity: Challenge to politics and policy. Princeton, NJ: Princeton University Press.
La Porte, T. R. (1996). High reliability organizations: Unlikely, demanding and at risk. Journal of Contingencies and Crisis Management, 4(2), 60–71.
La Porte, T. R., & Consolini, P. (1991). Working in practice but not in theory: Theoretical challenges of high reliability organizations. Journal of Public Administration Research and Theory, 1(1), 19–47.
La Porte, T., & Metlay, D. (1996). Hazards and institutional trustworthiness: Facing a deficit of trust. Public Administration Review, 56(4), 341–347.
Leveson, N., Dulac, N., Marais, K., & Carroll, J. (2009). Moving beyond normal accidents and high reliability organizations: A systems approach to safety in complex systems. Organization Studies, 30(2/3), 227–249.
Metlay, D. (1999). Institutional trust and confidence: A journey into a conceptual quagmire. In G. Cvetkovich & R. Lofstedt (Eds.), Social trust and the management of risk (pp. 100–116). London: Earthscan.
Mitroff, I., & Silvers, A. (2009). Dirty rotten strategies: How we trick ourselves and others into solving the wrong problems precisely. Stanford, CA: Stanford University Press.
Palfrey, J., & Gasser, U. (2012). Interop: The promise and perils of highly interconnected systems. New York: Basic Books.
Pettersen, K. A., & Schulman, P. R. (2016, March 26). Drift, adaptation, resilience and reliability: Toward an empirical clarification. Safety Science [online version]. Retrieved from http://www.sciencedirect.com/science/article/pii/S0925753516300108
Roberts, K. (1990). Some characteristics of one type of high reliability organization. Organization Science, 1(2), 160–176.
Roberts, K. (Ed.). (1993). New challenges to understanding organizations. New York: Macmillan.
Rochlin, G., La Porte, T., & Roberts, K. (1987). The self-designing high-reliability organization: Aircraft carrier flight operations at sea. Naval War College Review, 40(4), 76–90.
Roe, E., & Schulman, P. (2008). High reliability management: Operating on the edge. Stanford, CA: Stanford University Press.
Roe, E., & Schulman, P. (2016). Reliability and risk: The challenge of managing interconnected critical infrastructures. Stanford, CA: Stanford University Press.
Roe, E., Schulman, P., van Eeten, M., & de Bruijne, M. (2002). High reliability bandwidth management in large technical systems. Journal of Public Administration Research and Theory, 15(2), 263–280.
Sanne, J. (2000). Creating safety in air traffic control. Lund, Sweden: Arkiv Förlag.
Schöbel, M. (2009). Trust in high-reliability organizations. Social Science Information, 48(2), 315–333.
Schulman, P. (1993). The negotiated order of organizational reliability. Administration and Society, 25(3), 353–372.
Schulman, P. (2004). General attributes of safe organizations. Quality and Safety in Health Care (Suppl. 2), ii39–ii44.
Schulman, P. (2013). Procedural paradoxes and the management of safety. In M. Bourrier & C. Beider (Eds.), Trapping safety into rules: How desirable or avoidable is proceduralization? (pp. 243–256). London: Ashgate.
Thompson, J. (2003). Organizations in action: Social science bases of administrative theory. New Brunswick, NJ: Transaction.
Weart, S. (2012). The rise of nuclear fear. Cambridge, MA: Harvard University Press.
Weick, K. (2011). Organizing for transient reliability: The production of dynamic non-events. Journal of Contingencies and Crisis Management, 19(1), 21–27.

Part III

Implementation

Chapter 10

Organizing for Reliability in Health Care
Peter F. Martelli

Introduction

Undoubtedly, health-care organizations face significant technical, organizational, financial, and regulatory challenges, and health care as a sector presents many opportunities for improvement. Of the major challenges, patient safety, an area increasingly in the public eye since the 1990s, may be even worse than previously imagined (James, 2013). The performance gaps are well documented, yet progress in reducing the impact of medical error has been painfully slow (Clancy, 2010; Dickey, Corrigan, & Denham, 2010; Wachter, 2010; Rice, 2014; US Senate Subcommittee on Primary Health and Aging, 2014). The publication of two influential reports by the Institute of Medicine (IOM) galvanized practitioner focus on reducing preventable medical error (To Err Is Human; Institute of Medicine [IOM], 1999) and exploration of systems thinking and safety science in health care, including such approaches as human factors engineering and HRO theory (Crossing the Quality Chasm; IOM, 2001). More than a decade after these landmark reports, it is possible to say that “overall [healthcare] certainly is safer in many places more of the time. We are still watching significant variation within the industry despite all of the amazing organizations we see. We’re seeing organizations whose journey has yet to begin” (Jim Conway, Senior Fellow, Institute for Healthcare Improvement [IHI], quoted in Johnson, Raggio, Stockmeier, & Thomas, 2010, p. 4). Yet, while the focus on quality and safety has led to success stories and a sense
that “change is gradually occurring,” it is generally acknowledged that “uniformly reliable safety in healthcare has not yet been achieved” (Dickey et al., 2010, p. 1), and the sentiment persists that “if safety is a marathon, I think we just now know what the two mile marker looks like, but boy, there are a lot of miles between here and there” (Gary Yates, chief medical officer [CMO] of Sentara Healthcare, quoted in Dixon & Shofer, 2006, p. 1627). If the question is, “Are patients clearly safer in U.S. hospitals today than they were 15 years ago?” then “the unfortunate answer is no. . . . We have not moved the needle in any demonstrable way overall [and] no one is getting it right consistently” (Ashish Jha, professor, Harvard T. H. Chan School of Public Health, quoted in Rice, 2014, para. 3). In this safety marathon toward uniformly reliable safety, the increased perception is that high reliability is the “next stop” in the journey (Chassin & Loeb, 2011). What technical, economic, and institutional forces have generated this next stop, and what challenges might health care encounter on the way? What does it mean for a hospital to “become an HRO”? Or for health care to be “reliable” more generally? The title of this chapter is “Organizing for Reliability in Health Care.” Whether and to what extent high reliability theory faithfully applies in this setting depends on what we mean by HRO and what we mean by in health care. If health-care organizations are striving to become HROs, then we need to consider not only what that entails in theory but also how it has emerged in practice. To give context to this question, this chapter presents an overview of critical moments related to the establishment of HRO in health care, leading into two illustrative cases that explore implications for the theory’s application toward patient-safety issues. By examining the growth of HRO per se in health care, I argue that a solution in one setting became an “implementation label” for several approaches in health care, and that this confusion results from a growing gap, or “increasing relative ignorance” (La Porte, 1994) between HRO theory and practice. This chapter assumes familiarity with the contemporary framework of health-care delivery and with the basics of HRO theory, much of the latter being addressed elsewhere in this book. The reader is also cautioned that this is not an operational catalog of HRO research in health care, nor a “handbook” for implementation. At present, we know very little about moving from low to high reliability—a point particularly salient in health care, where “we know of no well-documented blueprints for elevating a low-reliability organi-
zation or industry into a highly reliable one and sustaining that achievement over time" (Chassin & Loeb, 2013, p. 467). Rather, perhaps by refocusing on the assumptions and challenges of HRO in the quest for quality and safety in health care, we can either make progress in developing an HRO handbook or concede that a handbook may not be preferable or even possible.

HRO Theory Meets Health-Care Practice

Articles discussing the potential of high reliability in health care began to appear in the earliest periods of the theory’s development, stemming particularly from the anesthesiology literature (e.g., Gaba, Howard, & Jump, 1994), where its consideration built on earlier efforts to address errors using the framework of human factors design (Cooper, Newbower, Long, & McPeek, 1978) and informed by the systems focus of NAT (Gaba, Maxwell, & DeAnda, 1987). That anesthesiology was at the vanguard of safety science in health care is not surprising. The surgical anesthesiology setting is an ideal microcosm to test approaches to attenuating behavioral risk. On the one hand, once the procedure room door shuts, it is a comparatively closed system with distinct protocols, defined and specialized roles within teams, and reasonable agreement on what constitutes catastrophic failure. On the other hand, it is subject to highly dynamic decisions, intense time pressure, ill-structured problems, and often competing goals among multiple players (Gaba, Howard, Fish, Smith, & Sowb, 2001). But it is not merely that individuals in this setting dealt with HRO first but also that the features just described give anesthesia a distinct character compared to other medical fields. In this respect HRO was a “solution” tailormade for this setting—and one that was very easy to assimilate in its totality, albeit with the necessary customization. If there are to be scope conditions for HRO, especially given the theory’s potential limitations across the health system, it is important to remind readers of what David Gaba (1994) observed early in the theory’s application: What makes anesthesia different from the rest of Medicine? Why do we in anesthesia, and not those in other specialties (e.g., internal medicine or pediatrics), seek models for our work from arcane fields far removed from the care of the sick. . . . The reason is that characteristics of the work are analogous. The dominant features of the
anesthetist’s environment include a combination of extreme dynamism, intense time pressure, high complexity, frequent uncertainty, and palpable risk. This combination is considerably different from that encountered in most medical fields. Thus, anesthetists have had to turn away from much of the research on decision-making in medicine and look instead at other human activities that share these features. (p. 198)

In pursuing the solution that HRO offered, these researchers naturally built on established successes and “ways of doing things” in anesthesiology. The use of mannequins to improve technical skills had been around for decades, but the application of team approaches based on cockpit (or crew) resource management (CRM) from the aviation industry introduced a new approach to simulation training (Howard, Gaba, Fish, Yang, & Sarnquist, 1992). Drawing analogies between the efforts of flight crews and operating teams allowed the extension of technical simulation into the realm of team-based training, thus opening the door for wider discussion of the organizational elements of safe surgical practice, including communication, authority, decision making, and culture. The communication and leadership concepts in CRM dovetailed nicely with research streams of the late 1980s in human factors engineering, teamwork, cognitive load, and naturalistic decision making. As Gaba (1994) has noted, managing real-world resources in health care requires “cognitive skills, of which interaction with the rest of the operating room team is particularly important” (p. 218). In some respects the growth of CRM is the growth of HRO in health care. Perhaps because of the focus of its application on surgical teams, the wide potential of training through simulation, and its championing by high-status medical centers, the CRM concept seemed to establish itself in health care more readily than other fundamental HRO tributaries. For instance, principles from the incident command system (ICS) have been important in elaborating a picture of the flexible structuration seen in high reliability systems. Yet, the diffusion of the ICS concepts took a very different route, particularly in health care. Until Gregory Bigley and Karlene Roberts’s (2001) definitive academic article in the Academy of Management Journal, there appeared to be no explicitly stated connection between HRO theory and the ICS by which a wider diffusion and application across domains could arise.1 Though the ICS had been used in firefighting since the early 1970s (e.g., Phoenix Fire Ground Command System, NFPA 1561, and FIRESCOPE), its application in
health care as the Hospital Emergency Incident Command System grew out of the need for a standardized emergency response plan, originally to respond to earthquakes in California (San Mateo County Health Services Agency, 1998; Federal Emergency Management Agency, 2004; Londorf, 1995). This framework for the ICS in health care as formalized crisis operations has kept it generally segregated from other hospital operations—only recently has the range of applications widened beyond times of emergency response (see California Emergency Medical Services Authority, 2014).

Turning Patient Safety into a Priority

As high reliability was developing in the 1990s, the health-care world was experiencing punctuated change. In December 1994 Betsy Lehman, a well-known health columnist at the Boston Globe, died of a chemotherapy overdose at Dana-Farber Cancer Institute. Her death led to several high-profile newspaper articles, which both publicly surfaced the topic of medical error and encouraged serious introspection on how and where error occurs (Knox, 1995; Altman, 1995; Marcus, 1995). In the ensuing year, the "first multidisciplinary conference on errors in health care" was held2 (Leape et al., 1998, p. 1444), the National Patient Safety Foundation (NPSF) and the public-private National Patient Safety Partnership were formed,3 and the IOM convened the first of six meetings of the new National Roundtable on Health Care Quality, which later provided part of the foundation for To Err Is Human (Chassin & Galvin, 1998). Michael Millenson (2002) points out how during this period the press "turned patient safety into a priority," despite the well-worn protest of the medical establishment. He observed that the "AMA and the [JCAHO]—whose board is dominated by representatives of the AMA and the [AHA]—used the conference [establishing the NPSF] to publicly embrace [Lucian] Leape and other long time critics of hospital safety. Talk of 'isolated' errors, 'good intentions,' and 'bad' clinicians was conspicuously absent" (Ibid., p. 58). Like those in anesthesia, Leape was at the forefront of health-care scholars who recognized the potential of organizational approaches to safety and eschewed simplistic "bad apple" justifications (Berwick, 1989). Coincident with the Lehman case, two important publications argued that "the person associated with an error is not automatically considered as the precipitating factor, the cause of that error" (Bogner, 1994, p. 3), that the
human causes of failure can vary from lapses and mistakes to deliberate violations, and that these failures can lie latent in an organization for a long time before becoming evident (Reason, 1995). The focus of these publications was to acknowledge and address the human factors that lead to errors as well as the organizational defenses erected to protect against failure (cf. the Swiss Cheese model). The concept of “systems thinking” was increasingly adopted by authors in the years leading up to the first IOM report, but the developing notion of high reliability was rarely mentioned in this period4 in favor of the language and perspective of human factors engineering, with the model of safety in aviation remaining dominant (e.g., Leape, Simon, & Kizer, 1999, pp. 2–8; Vincent, Taylor-Adams, & Stanhope, 1998). For the most part, the health-care literature was still contending with contemporary health services ideas of improving quality by tackling overuse, underuse, and misuse, using performance-based payment structures (e.g., “P4P”), and engaging in continuous quality improvement and total quality management techniques (for a contemporary discussion, see Ferlie & Shortell, 2001). At least one of these research streams, applying the Six Sigma® approach developed by Motorola to address operational defects in manufacturing, created the potential for deep confusion by equivocating the “high reliability” of HRO and the “high reliability” of statistical process control (see, e.g., Chassin, 1998; and later, Nolan, Resar, Haraden, & Griffin, 2004)—an issue that remains problematic on account of the broad penetration of these approaches, paired with theoretical tensions with high reliability derived from divergence in the value of slack resources.5 A few months after the release of To Err Is Human, the British Medical Journal (BMJ) published a special issue focused on “facing up to medical error” and advocated for a systems approach (“System Approach to Reducing Error,” 2000) and learning from other high-risk industries (“Facing Up to Medical Error,” 2000). In addition to articles on reporting systems, leadership responsibility, use of information technology, and continuity of care, two other important works were published in this issue. Gaba’s (2000) article cited anesthesiology as a model for patient safety, and James Reason’s (2000) article expanded on his earlier work, now explicitly referencing and couched in terms of high reliability research. The editors at BMJ pointedly indicated in this issue that “the BMJ argued 10 years ago that Britain needed a similar study and was roundly criticised by the president of a medical royal college
for drawing the attention of the mass media to medical error” (“Facing Up to Medical Error,” 2000, para. 2). In the period immediately following the IOM report, articles on both research (e.g., Pronovost et al., 2003) and practice (Uhlig, Brown, Nason, Camelio, & Kendall, 2002) started alluding to high reliability, albeit in more general terms as a guide or inspiration. It is fair to assume that before this point some intrepid health-care managers had been exposed to features important for high reliability organizing, such as safety culture, mindfulness, and organizational learning. However, based on the literature and archived websites from the period, it is also fair to assume that most in health care at this time were unaware of HRO theory as a developing concept to explore, pursue, and attempt to apply in health-care settings.

Efforts to Bridge the Worlds of HRO Research and Health-Care Practice

The first notable efforts to bridge the gaps between research and practice occurred in this period. For instance, the first of a continuing series of conferences to discuss and share HRO principles was organized in 2003, explicitly promoted as “High Reliability Organization Theory and Practice in Medicine and Public Safety.”6 As a benefit of attendance, certain nonmedical health-care professions were able to apply attendance hours toward continuing education requirements (van Stralen, 2003). Speakers at the two-day conference included Gaba, Roberts, Weick, Thomas Mercer—one-time captain of the USS Carl Vinson—and other HRO luminaries, with several of the presentations drawing connections between the foundations of high reliability research in the US Navy and its practical applications in medicine.7 While difficult to benchmark the impact of the conference, it is reasonable to assert that it was part of the groundwork for more widespread adoption of the HRO concept in health care. Also by this point, innovating hospitals seeking new models to address quality and safety were becoming aware of the tenets of high reliability and the potential in their application. For instance, the first TJC and National Quality Forum (NQF) “John M. Eisenberg Patient Safety Award for System Innovation” was awarded to Concord Hospital in New Hampshire for developing a “team-based, collaborative rounds process—the Concord Collaborative Care
Model—that involved use of a structured communications protocol,” and which was developed “on the basis of theory and practice from human factors science, aviation safety, and [HRO] theory” (Uhlig et al., 2002, p. 666). Two influential books published in the early 2000s bridged the worlds of HRO research and health-care practice. Another major IOM publication, Keeping Patients Safe: Transforming the Work Environment of Nurses, released in 2004, more liberally and comprehensively referenced high reliability theory, likely as a result of Roberts’s participation on the report’s committee (IOM, 2004). And Weick and Sutcliffe (2001) published the first edition of Managing the Unexpected, offering a compact set of five core HRO principles around anticipation and containment. The brevity and insight of these principles, as opposed to the academic verbosity of earlier research (e.g., on sensemaking), allowed them to be quickly adopted as a standard language of high reliability. For instance, the book had an immediate and sustained impact within the US Forest Service, which had been contending with preparing for and managing crises for years.8 The agency established the Wildland Fire Lessons Learned Center in 2002 to archive and disseminate knowledge on safety and reliability. By the second edition in 2007, Managing the Unexpected had become an entry point for HRO-inquiring practitioners and a “must-cite” reference in academic articles in health care and elsewhere (e.g., see Chassin & Loeb, 2013; Henriksen, Dayton, Keyes, Carayon, & Hughes, 2008). Now in its third edition, the book is considered a “classic” reference in health-care circles (Patient Safety Network [PSNet], 2015). In 2003 Health Affairs featured an article written by Carolyn Clancy, director of the Agency for Healthcare Research and Quality (AHRQ), and Tom Scully, administrator of the Centers for Medicare and Medicaid Services (CMS), signaling that the “federal government’s health agencies are responding to the call for improved patient safety and accountability” by focusing on a “‘systems’ view” and beginning “several initiatives that address the silences of both deed and word” (Clancy & Scully, 2003, p. 113). While the article referenced CMS’s Quality Improvement Organization (QIO) initiative and AHRQ’s new WebM&M (Morbidity and Mortality Rounds on the Web) publication, it made no reference to the rapidly establishing field of high reliability in health care. This was to change in the next few years. By late 2005 AHRQ, together with the Canadian Patient Safety Foundation (CPSF),9 had
convened a group of leaders from nineteen hospital systems committed to applying high reliability concepts into a short-term “HRO Learning Network” in order to explore “how concepts learned from [HROs] might be applied to improve patient safety . . . , [to] learn more about HROs in healthcare, and to provide healthcare organizations an opportunity to share their experiences” (Hassen & Clancy, 2009, para. 1). From the network experiment, AHRQ reported that “while [members of their sample] were well versed in the concepts of patient safety, only two volunteered the term ‘HRO’ in their descriptions of their patient safety efforts,” and that “the Agency saw a great opportunity to work with health care systems interested in becoming an HRO” (Dixon & Shofer, 2006, p. 1624). The following years saw the explicit introduction of high reliability per se into health-care practice, with 2006 as a possible inflection point for the theory’s favorable reception. A special issue of Health Services Research (­Clancy & Reinersten, 2006) highlighted the reflections of health-care researchers contending with HRO and included early results of AHRQ’s qualitative research with health systems belonging to the short-term HRO Learning Network. Although the desired long-term network never materialized, AHRQ subsequently released a report based on its findings entitled Becoming a High Reliability Organization: Operational Advice for Hospital Leaders (Hines, Luna, Lofthus, Marquardt, & Stelmokas, 2008). On those lines, stories of “achieving,” “inventing,” “creating,” or “becoming” an HRO become more visible in this period, along with advice or cautions to the pursuit of high reliability (e.g., Ibid.; Jose, 2005; Sheps & Cardiff, 2005; Madsen, Desai, Roberts, & Wong, 2006; Roberts, Madsen, Desai, & van Stralen, 2005). Likewise, diffusion and implementation of strategies across hospitals under the banner of high reliability theory became increasingly evident—for instance, in the case of “promoting high reliability surgery and perinatal care through improved teamwork and communication at Kaiser Permanente” (McCarthy & Blumenthal, 2006, p. 18). Finally, probably cementing the turn toward high reliability, periodicals aimed at health-care executives started reporting on, and implicitly advocating, implementation of the theory, citing high-profile adopters such as Don Berwick, administrator of CMS (e.g., “High-reliability organizations understand the risky environments in which they operate.” [DerGurahian, 2008a, para. 22]), and Mark Chassin, president of TJC (e.g., “The
goal is to reach the same level of effectiveness present in other so-called high-reliability industries, like aviation and nuclear power." [DerGurahian, 2008b, para. 6]). With a heightened public awareness about the errors occurring in health care (Spear, 2005; Kalb, 2006)10 and a continued focus on promoting general quality and safety efforts within health care (e.g., IHI's 100,000 Lives Campaign [Davis, 2005; McCannon, Schall, Calkins, & Nazem, 2006], NQF's Patient Safety Practices [Kizer, 2001], and the Leapfrog initiatives [Milstein, 2002]), the framework for widespread attention to HRO in health care had been established.

The HRO Approach as It Stands Today

In the decade since that point, acceptance of high reliability as an approach per se has been steadily rising. Take, for example, Crozer-Keystone Health System, which is “joining dozens of other healthcare systems that are fundamentally changing the way everyone thinks, communicates and acts. Like them, the health system’s goal is to become a high-reliability organization (HRO): using proven principles from naval aviation and other high-risk, high-­performance industries to improve quality, safety and efficiency” (Crozer-Keystone, 2012, para. 1). Also consider Cincinnati Children’s Hospital Medical Center’s strategic plan, which “calls for the elimination of all serious patient harm and achievement of the lowest rates of employee injury by leveraging internal and external expertise toward becoming a high reliability organization (HRO) by June 30, 2015” (James M. Anderson Center, n.d., para. 1). These organizations appear to have a clear sense of what HRO is, what it entails, and how to achieve it—yet, as other chapters in this book attest, significant gaps remain in our understanding of HRO as a theory, especially with respect to the finer causal mechanisms and their interactions, the scope conditions and transferability, and any preconditions to sustained implementation. This issue casts a giant shadow for high reliability theory as it diffuses through the health-care environment, driven not only through ongoing innovation by sites but also through promotion by consultants and the influence of high-profile champions. The first serious efforts by consulting firms offering to educate, run initiatives about, or transform hospitals into HROs also began in this decade.11

Knowledge Intermediaries Drive the Conversation

Perhaps it is a truism that popular ideas seem to take on a life of their own. In the case of high reliability, its recent popularity may be creating a scenario in which “the range of alleged high reliability concepts is now enormous” (Vincent, Benn, & Hanna, 2010, para. 3). Mathilde Bourrier (2011) has gone as far to argue that in the recent period, “the HRO literature has continued to grow, evolving from a research topic to a powerful marketing label: organizations concerned with their level of safety and/or with their public image want to become HROs and maybe more importantly they want to be described as HROs. The HRO term has somehow become a label of excellence, even appearing in Wikipedia” (p. 12).12 So, what exactly is diffusing from site to site and through these consultants and champions? To understand this situation better, consider two instructive cases of knowledge intermediaries driving adoption of particular approaches.13 First, in 2002, driven by CMO Yates, Sentara Healthcare, one of the original hospital networks included in the AHRQ HRO Learning Network effort, partnered with Performance Improvement International (PII), an existing consultancy “rooted in the foundations of engineering and refined in the exacting environment of the nuclear power industry” (Performance Improvement International, 2015, para. 2), to develop the Sentara Safety Initiative at Sentara Norfolk General Hospital (Cohn, 2011; McCarthy & Klein, 2011; Weinstock, 2007). The overlap is evidenced in many of the initiative’s actions, such as the establishment of an organization-wide Lessons Learned program “modeled after the Institute for Nuclear Power Operations’ Significant Event Evaluation Information Network” (McCarthy & Klein, 2011, p. 8). The Sentara Safety Initiative not only reported successes in reducing medical error rates but also led to the formation of Healthcare Performance Improvement (HPI) in 2006 as a joint engagement between Yates and Craig Clapper, then chief operating officer of PII. HPI markets the aim of “improving and sustaining a culture of reliability that optimizes results in safety and performance excellence” and reports to have “led comprehensive safety culture improvement in over 600 hospitals across the United States” using methods “based on the best practices of high reliability organizations that ‘get it right’ in safety” (Healthcare Performance Improvement, 2015). HPI’s broad consulting penetration, together with Sentara’s recognition as a model of HRO
application by AHRQ, The Commonwealth Fund, and elsewhere makes their nuclear power–inspired approach to dissemination noteworthy. While HPI's models may have found success in nuclear power, aviation, and within circumscribed areas of hospitals, it is not evident that these methods are necessarily transferable to broader health-care applications. Schulman (2002, 2004, 2014; Schulman & Roe, 2014) has thoughtfully argued for a distinction between "precluded events" and a "marginal" conceptualization of reliability. On the one hand, reliability by the standard of "precluded events" is driven by social dread that a given failure will be catastrophic and is characterized by nonprobabilistic (every-last-case) management and nonfungible resources to prevent that defined failure (Schulman, 2014). On the other hand, "marginal reliability" is a continuous, not categorical, variable characterized by probabilistic (run-of-cases) management and by fungible resources that are leveraged in trade-offs across competing reliability priorities (Ibid.). In health care, it is not clear "that society views medical mistakes with the same dread as those of other precluded event organisations [and] because there is no social dread surrounding medical reliability, other values applied to medicine, such as efficiency, overall cost control, and timeliness seem to intrude in ways they do not in precluded event organisations" (Schulman, 2004, p. ii43). In fact, as Schulman's (2002) analysis suggests, "an approach to reliability at the level of precluded events is itself precluded for medical organizations. It would take a major shift in public demand or an alteration in the boundedness of medical errors to lay the foundations for such an approach" (p. 214). What might it mean to so widely disseminate an approach that is itself "precluded"?

Second, coincident with his keynote address at the Fourth International HRO Conference, Chassin started a wider campaign to drive the adoption of high reliability theory across US hospitals. As a longtime champion of quality and safety in health care, Chassin was perfectly positioned to further promote HRO as the "next stop" in the "ongoing quality improvement journey" (Chassin & Loeb, 2011). He partnered with Strategic Reliability, LLC,14 to cohost and present at the 2012 International HRO Conference at The Joint Commission Conference Center in Chicago (van Stralen, 2012; Chassin, 2012) and shortly thereafter established The Joint Commission High Reliability Resource Center website (The Joint Commission Center for Transforming Healthcare [TJC-CTH], 2015).

The hallmark of Chassin’s approach to HRO is a “stages of maturity” model, consisting of four stages of organizational progress toward high reliability, ranging from Beginning, to Developing, Advancing, and finally, Approaching (Chassin & Loeb, 2013). A hospital can benchmark its “development as it grows into a high reliability organization” by means of the High Reliability Self-Assessment Tool® (HRST), a trademarked, web-based survey allowing “senior leadership to determine the ‘maturity’ of their organization in adoption of practices that lead to High Reliability within three domains: Leadership Commitment [“to zero patient harm”], Adopting a Safety Culture, [and] Applying Robust Process Improvement™ [RPI] methods” (Smith, 2014, p. 36). The last of these items, RPI, requires further note. Almost immediately after joining TJC in 2008, Chassin established The Joint Commission Center for Transforming Healthcare, a 501(c)(3) affiliate that offers to assist in “Creating Solutions in High Reliability Health Care” through its trademarked RPI method and Targeted Solutions Tool® (TST; Chassin et al., 2010; TJC-CTH, 2014a, 2014b). Both RPI and TST advertise using a “fact-based, systematic, and data-driven problem-solving methodology [that] incorporates tools and concepts from Lean Six Sigma and change management methodologies” (TJCCTH, 2014b, p. 5), an area in which Chassin (1998) had been publishing for more than a decade and in which TJC is disseminating widely through both informal and formal routes (e.g., The Joint Commission, 2012). However, as operationalized by the Center for Transforming Healthcare, the exact composition of these methods and tools is not publicly available. Nonetheless, the tools speak the language of health-care professionals and appear to hold great promise for operational improvements—and because the work of TJC is widely influential in the health-care sector, they are likely to take root. Whether a maturity model is a viable or appropriate rubric for high reliability is a broader theoretical and empirical issue15; however, the grouping of HRST, RPI, and TST within the Center for Transforming Healthcare does stimulate a few questions. First, to what extent are existing methods to address quality and safety in health care also appropriate to promote reliability? Which approaches should be incorporated, across which subsystems, and with what priority? Second, to what extent does the HRST survey, and any similar current approaches, leverage the aviation/nuclear power “precluded-events” perspective on reliability, rather than a “marginal reliability” perspective that
might be more theoretically suitable for health care (cf. Schulman, 2004)? Finally, to what extent should high reliability models, tools, and measures remain transparent and testable versus subject to intellectual property constraints?16 As these data shift increasingly from public and governmental sources to private organizations, how can the scientific community build effective partnerships to evaluate whether these discoveries extend our theoretical understanding? A healthy debate of these questions will be beneficial.17

HRO and the Problem It Addresses in Health Care

As we enter this new phase of HRO dissemination and adoption, it is worth returning to the question of what problem HRO is solving and what constitutes reliability in this setting. Clearly, the problem is no longer limited to surgical anesthesia, and, as a consequence, reliability is taking on a broader character than originally conceived. Consider what the problem looks like from the practitioner perspective. There remains “a wide gap between the level of knowledge published and debated in the academic circles on these issues and the level of knowledge transfer that has actually occurred from these circles to the industry or regulatory circles” (Bourrier, 2011, p. 12). For most hospitals, the problem of reliability is understood in the context of reducing errors by improving aspects of patient safety. For William Beaumont Hospital in Royal Oak, Michigan, the problem and solution looked like this: What is patient safety all about, and how can an organization be transformed toward its pursuit? Human factors, high-reliability organization, error, injury, disclosure, empowerment, communication, team-work, hierarchy, just culture, nonpunitive culture, nurse-to-patient ratios, duty hours, work shifts, mindfulness, legibility, computerized order entry, IHI, NPSF, ISMP, ECRI, JCAHO, CMS, AHRQ, NQF, the Leapfrog Group, simulation, learning organization, standardization, simplification, Six Sigma, lean thinking, flow, throughput, handoffs, Internal Bleeding, “The Bell Curve,” pay for performance, compensation, litigation, risk management, organizational ethics, mission, apology, latent errors, active errors, Swiss cheese, sentinel events, incident reports, Codman award, Eisenberg award, Quest for Quality award, and on and on: With all of these terms, do we really understand the essence of patient safety? At Beaumont, we believe that we do. (Winokur & Beauregard, 2005, p. 19)


A great deal of research on the constituent concepts underpinning high reliability has taken place, much of which has been convincingly demonstrated to promote features of quality and safety, primarily in hospitals. These concepts principally include: safety culture and climate (Singer & Vogus, 2013a, 2013b; Vogus, Sutcliffe, & Weick, 2010; Helmreich & Merritt, 1998; Weick, 1987; Singer et al., 2009; Goodman, 2003), learning (Carroll, Rudolph, & Hatakenaka, 2002; Davies & Nutley, 2000; Bohmer & Edmondson, 2001; Tucker & Edmondson, 2003; Edmondson & Singer, 2012), teamwork (Baker, Day, & Salas, 2006; Knox & Simpson, 2004; Salas, Cooke, & Rosen, 2008; Wilson, Burke, Priest, & Salas, 2005; Manser, 2009), psychological safety (Edmondson & Lei, 2014), mindfulness (Weick & Sutcliffe, 2006; Vogus, Rothman, Sutcliffe, & Weick, 2014; Vogus & Sutcliffe, 2012), complexity (Plsek & Greenhalgh, 2001; Plsek, 2003), just culture (Wachter & Pronovost, 2009; Frankel, Leonard, & Denham, 2006), and relational coordination (Carmeli & Gittell, 2009), as well as extensive research on communication, leadership, and other topics subsumed by the research just cited. In addition, dozens of notable applied research models and methods have been applied in health-care settings, such as simulation, checklists, timeouts, huddles, frontline staff, and so forth. Yet, can we say that all the tools being used by Beaumont Hospital work together to solve the problem of reliability in the hospital as a whole? This is an unresolved question because there has yet to be a compelling integrative study in the hospital setting.18A longitudinal, comparative case study will be required to disentangle which mechanisms within the dozens of interventions that occur in a hospital at any given time are sustaining reliability in the long run. It seems natural that, like Beaumont, many practitioners would either add “HRO” to the list of interventions planned, contracted, staffed, and under way, or attempt to augment existing interventions to serve multiple goals; the latter may especially obtain under the conditions of “initiative overload” that many hospitals face, where there are too many initiatives to adequately attend to all the details of implementation, measurement, and reporting. A practitioner should be careful with AHRQ’s suggestion that “although high reliability concepts are very useful, you should not view them as conflicting with strategies or vocabularies [such as “Six Sigma®, Lean, Baldrige, and Total Quality Management (TQM)”] that you already may be using to promote quality and safety” (Hines et al., 2008, p. 3). It may be that they conflict in a given system context to produce unintended consequences either for

short-term outcomes or for longer-term reliability; the question of context is a serious one, as the history of the diffusion of surgical checklists attests (e.g., Dixon-Woods, Bosk, Aveling, Goeschel, & Pronovost, 2011). With little integrative guidance, no understanding of the limits of a ­precluded-events approach, unclear model organizations (see Griffith, 2015), and a tolerance for consulting solutions, it is no surprise that health-care organizations overwhelmed by the prospects of the “journey to high reliability” would expect to harness the “Power of Zero”19 (May, 2013; Rees, 2013) by turning to available solutions, such as a portfolio of tools and other resources for hospitals and health care organizations to use to help guide them toward high reliability, including The Joint Commission Center’s High Reliability Self-Assessment Tool (HRST), Outcome Ingenuity’s [sic]20 Just Culture Community training and resources, Healthcare Performance Improvement’s (HPI) Safety Event Classification (SEC) and Serious Safety Event Reporting (SSER) system. (South Carolina Hospital Association, 2015, para. 6)

Conclusions: Toward Reliability Seeking

Part of the hope of this book is to recall what HRO theory was conceived to address, take stock of how it is being applied, and discuss what revisions need to be made in theory, practice, or both in order to extend the spirit of the original reliability framework into future application. Assembling the preceding historical observations, what can we say about the state of HRO in health care? First, we might conclude that the concept of HRO taking root in health care is deeply associated with aviation and nuclear power and that this association has led to the wide diffusion and institutionalization of an approach that was originally appropriate to solve a problem within a highly localized setting within health care. Second, the intense efforts to address health-care quality and safety in the period of HRO's establishment, coupled with the continuing development of the theory itself, have led practitioners to use an amalgamation of approaches under the heading of "high reliability organizing" that may not properly fit that label; in fact, we still do not understand what features are necessary or sufficient in order to establish and maintain reliability in health care—not to mention in what amount, with what priority and investment, and at the expense
of what other resource allocations. This suggests that a challenge to the expectation and reality of HRO as a cohesive approach may come in the future, especially as we consider the diffusion and adoption in health-care settings across the continuum of care. Third, it is important to note that most of the research on HRO and its related concepts in health care has been based in hospitals and that a wider system application across the health-care continuum will require additional theorizing.21 Both fixing discrete health problems and ensuring continued physical and mental wellness increasingly occur outside of the brick-and-mortar hospital organization, in settings as diverse as community clinics, physical therapy offices, nutritionist kitchens, long-term care facilities, or the patient's home. In any case it seems clear that our theorizing is not matching the incredible speed of change occurring in practice, leading inevitably to the "increasing relative ignorance" between HRO theory and practice in health care. If high reliability is to be the "next stop" on the journey to patient safety, it is important to acknowledge that "in health care, high reliability has come to be used as a proxy for safety, although reliability and safety are not truly equivalent" (Sutcliffe, 2011, p. 135).22 This problematic conflation arises not only from indifference to scope conditions but also from the technical, economic, and institutional forces that generated the need to "become" an HRO. As Todd La Porte noted: "We never attributed the moniker HRO. The whole idea is deeply pessimistic, very hard in the long term" (personal communication, May 4, 2012). In health care we should be mindful of "the intrinsic costs and difficulties of seeking continuously to achieve failure-free performance in large organizations and the theoretical impossibility of assuring it under all conditions" (La Porte, 1996, p. 61). Reconsidering high reliability health care in terms of Schulman's marginal reliability might yield an opportunity to bring theory and practice closer together under a reliability-seeking framework and in turn hopefully stem normative adopters from looking for turnkey solutions and certifications. Yet, the unspoken question remains: If incremental gains are being made in ensuring patient safety, as the health-care literature suggests, then does it matter whether what we're doing is "high reliability" as conceived? This is a bigger issue than can be resolved here. It is possible that the version of HRO being applied to improve patient safety in narrow settings may not be appropriate for sustained reliability of health-care institutions across the system
continuum, or perhaps that these gains are made in isolation or even at the expense of the wider system. As Charles Vincent, Jonathan Benn, and George Hanna (2010) remind us: "The problem is not that health care is not reliable or resilient at all, but that huge variability exists within teams, within organisations, and across the system. The hospital that contains centres of excellence may have other units in which outcomes are poor or even dangerous" (para. 5). We can expect more integrative and implementable research on high reliability organizing in health care in the near future.23 We can also expect that health-care practitioners will increasingly turn to knowledge intermediaries, whether thoughtful or naïve, to find solutions to their organizational patient-safety and quality problems. Superimposed over both, the rapidly changing world of health care will require greater understanding of practice constraints and theorizing on how high reliability management functions across the health-care continuum. Until that point, we should continue to contemplate the intrinsic and path-dependent conditions affecting implementation in health care and explore the most potent ways in which health care and high reliability can speak to and learn from one another. For the broader reliability-seeking project addressed in this book, perhaps we can learn a few lessons from the story of high reliability in health care. First, complex service organizations urgently seeking solutions for safety may face similar pitfalls, including contending with the nonequivalence of safety and reliability and disentangling the array of interrelated improvement strategies that these organizations are often enticed to implement and evaluate simultaneously. Second, the parameters of appropriate application for these organizations will require both defining the problem that HRO is engaged to solve and delimiting the boundaries of the affected system. Without clarity on these questions, it will be difficult to answer the question of "reliability with respect to what?" Third, we should be sensitive to the role that knowledge intermediaries play in defining these parameters, driving adoption, and developing professional standards, particularly with respect to the diffusion of a precluded-events or marginal reliability perspective. As HRO enters new sectors, these intermediaries will be critical partners in ensuring that organizations understand their boundary conditions, protect attention and resources over the long term, and avoid the temptation of seeking a badge or status to be attained before "moving on to other things." Fourth, given the extensive role of CRM-derived principles across HRO applications, this may be an opportune
time to revisit the functions of organizational structure not only in terms of flexible structuration in the ICS tradition but also with respect to the structural characteristics in interorganizational coordination and corporate governance that constrain action. Finally, returning to Gaba (1994), recall that we “seek models for our work from arcane fields [when the] characteristics of the work are analogous.” As we continue the reliability project, we should recall and preserve our pioneer spirit, drawing lessons across disciplinary boundaries in order to explore the character of work, the character of error, and the technical, economic, and institutional forces that promote reliability seeking in complex, interdependent organizations.

No t e s 1. Localized diffusion of the ICS concept as connected to HRO may have occurred, especially in Southern California, but there is no indication of more concrete or widespread establishment in the indexed literature or available gray literature until this point. 2.  Other participants were the American Association for the Advancement of Science (AAAS), the American Medical Association (AMA), the Joint Commission on Accreditation of Healthcare Organizations (JCAHO, or TJC), and the Annenberg Center for Health Sciences. 3. Founding members included the AMA, American Hospital Association (AHA), Veterans Health Administration (VHA), American Nurses Association (ANA), American Association of Medical Colleges (AAMC), JCAHO, IHI, and the new NPSF (Leape et al., 1998). 4.  When mentioned in either the academic or gray literature, the reference to HRO was often ceremonial—save for a few notable cases, such as Berwick’s address at the National Forum on Quality Improvement in Health Care in December 1999, later published as Escape Fire: Lessons for the Future of Health Care (2002), which focused on sensemaking in systems, and Marilynn Rosenthal and Kathleen Sutcliffe’s (2002) book Medical Error: What Do We Know? What Do We Do? which resulted from a conference at the University of Michigan and included articles by Karl Weick and Paul Schulman. 5. The challenge reduces to a twofold problem around waste and slack: first, which processes can be trimmed of waste; and second, in what circumstances is the removed element necessary versus waste? A longer discussion of Six Sigma is being prepared for publication by the author. 6. This conference was held in California in 2003 and 2006, then in Deauville, France, in 2007, and from 2010 on it has been held annually in cities across the United States. 7. For instance, “HRO: From Aircraft Carriers to Medicine and Public Safety” and “Crew Resource Management: Naval Aviation and Hospital Operating Room” were two of the presentations. 8. The Montana Mann Gulch Fire of 1949, for example, promoted “standing orders” and “watch out situations.” 9.  Canada was tackling the same issues related to patient safety and produced its own reports that researched and referenced high reliability (see Baker & Norton, 2002, p. 167; Sheps & Cardiff, 2005). 10.  Whether this would qualify as “social dread” for HRO purposes is an open question. 11.  Many firms exist, including several with significant background in the aviation industry, such as Safer Healthcare and Convergent HRS. Government registered trademarks on the terms “high reliability teams” and “high reliability mindset” are enough to suggest that, in addition to any purer motivations for clinical improvement, the potential for economic profit is likewise recognizable.


12. This labeling has been a worry of HRO authors since the start of the research endeavor. In 1993 Gene Rochlin remarked that the “compact, acronymic terminology was both unnecessary and unfortunate” (1993, p. 12). In our 2012 workshop, Weick reiterated, “It is disconcerting that the acronym HRO has become something of a marketing label. When it is treated as a catchword this is unfortunate because it makes thinking seem unnecessary and even worse, impossible. The implication is that once you have achieved the honor of being an HRO, you can move on to other things” (2012, p. 4). 13. Intermediary information and knowledge brokers include consultants, professional ­organizations, thought leaders, conferences, and executive education programs (see Shortell, 2006). 14.  Strategic Reliability, LLC, was formed in 2010 by Daved van Stralen, host of the HRO Conference Series and a pediatrician who had become familiar with fundamental HRO principles when working in emergency services, and Rear Admiral Tom Mercer, who had served as captain of the USS Carl Vinson, a nuclear-powered aircraft carrier, during the original high reliability studies. Though Strategic Reliability eschews the term “consultancy” in favor of “coaching,” it serves in many of the same functions of training and workshop development. 15.  John Carroll and Jenny Rudolph (2006) have previously discussed high reliability design as an “ongoing self-design process” in “different organizational stages” from the local to control to open to deep stages. Their model emphasizes organizational challenges and tensions and is less prescriptive than the Stages of Maturity model in its current form. 16. The RPI/TST example above is hardly unique. Recently, PII likewise formed a subsidiary that “developed exclusive technology called Error-Free®,” promising an “operation event rate and injury rate [that] could go to near zero, instantly” (Error-Free, Inc. Retrieved from http://errorfree .com/; Error-Free® operation and human performance. Retrieved from http://errorfree.com/error -free-operation-and-human-performance-2/). 17. The cases presented here are examples of prominent drivers of HRO adoption. The reader is encouraged to research other potential routes for isomorphic adoption—for example, “the Institute for Healthcare Improvement has sponsored a variety of educational efforts that describe reliable processes in healthcare and the types of activities to achieve error-free operation” (Clancy, 2008, para. 6). 18. There have been many papers on individual interventions that yield benefits in safety and quality (as cited in the text), several catalogs of important common features of interventions (e.g., Lekka, 2011), some good framework discussions of a general (e.g., Tamuz & Harrison, 2006) and more specific nature (e.g., Christianson, Sutcliffe, Miller, & Iwashyna (2011), focused on ICUs, and a handful of cross-sectional cases offering insight (e.g., McCarthy & Klein, 2011), but no longer-term reliability studies of the type conducted by Emory Roe and Schulman (2008) with the California ISO electric grid. 19. The “zero” here refers to “zero errors” by a given safety measure, which is the natural consequence of maintaining a precluded-events framework. The idea of “zero” has become a common enough buzzword in health care (e.g., see DerGurahian, 2008a) to warrant a closer investigation of its consequences on adoption. 20.  
Outcome Engenuity is a firm that produces the Just Culture Organizational Benchmark Survey™ and offers Just Culture certification courses. 21. Though this is slowly changing, much more research is required to understand the ramifications of these approaches. 22. The article containing this observation was published in the journal Best Practice and Research in Clinical Anaesthsiology, which provides a fitting bookend to the larger history provided in this chapter. 23. For an example of integrative work, see Pronovost and colleagues’ 2015 article, “Creating a High-Reliability Health Care System: Improving Performance on Core Processes of Care at Johns Hopkins Medicine,” which focuses on governance issues.


R efer ences Altman, L. K. (1995, March 24). Big doses of chemotherapy drug killed patient, hurt 2d. New York Times. Retrieved from http://www.nytimes.com/1995/03/24/us/big-doses-of-chemotherapydrug-killed-patient-hurt-2d.html Baker, D. P., Day, R., & Salas, E. (2006). Teamwork as an essential component of high-reliability organizations. Health Services Research, 41(4), 1576–1598. Baker, G. R., & Norton, P. (2002). Patient safety and healthcare error in the Canadian healthcare system: A systematic review and analysis of leading practices in Canada with reference to key initiatives elsewhere—A report. Ottawa: Health Canada. Berwick, D. M. (1989). Continuous improvement as an ideal in health care. New England Journal of Medicine, 320(1), 53. Berwick, D. M. (2002). Escape fire: Lessons for the future of health care. New York: Commonwealth Fund. Bigley, G. A., & Roberts, K. H. (2001). The incident command system: High-reliability organizing for complex and volatile task environments. Academy of Management Journal, 44(6), 1281–1299. Bogner, M. S. E. (Ed.). (1994). Human error in medicine. Hillsdale, NJ: Erlbaum. Bohmer, R. M. J., & Edmondson, A. C. (2001). Organizational learning in health care. Health Forum Journal, 44(2), 32–35. Bourrier, M. (2011). The legacy of the high reliability organization project. Journal of Contingencies and Crisis Management, 19(1), 9–13. California Emergency Medical Services Authority. (2014). Hospital incident command system guidebook (5th ed.). Rancho Cordova, CA: Author. Retrieved from http://www.emsa.ca.gov/media/ default/HICS/HICS_Guidebook_2014_10.pdf Carmeli, A., & Gittell, J. H. (2009). High‐quality relationships, psychological safety, and learning from failures in work organizations. Journal of Organizational Behavior, 30(6), 709–729. Carroll, J. S., & Rudolph, J. W. (2006). Design of high reliability organizations in health care. Quality and Safety in Health Care, 15(Suppl. 1), i4–i9. Carroll, J. S., Rudolph, J. W., & Hatakenaka, S. (2002). Organizational learning from experience in high-hazard industries: Problem investigation as off-line reflective practice (MIT Sloan Working Paper No. 4359-02). Boston: MIT Sloan School of Management. Retrieved from http://papers .ssrn.com/sol3/papers.cfm?abstract_id=305718 Chassin, M. R. (1998). Is health care ready for Six Sigma quality? Milbank Quarterly, 76(4), 565–591. Chassin, M. R. (2012, May 21). High reliability healthcare: What’s holding us back? Presentation at the Fifth International High Reliability Organizing Workshop, Oakbrook Terrace, IL. Chassin, M. R., Conway, J. B., Umbdenstock, R. J., Dwyer, J., Langberg, M. L., & Petasnick, W. D. (2010). Accreditation. Chassin and Joint Commission aim to inspire. Interview by Howard Larkin. Hospitals & Health Networks/AHA, 84(3), 24. Chassin, M. R., & Galvin, R. W. (1998). The urgent need to improve health care quality: Institute of Medicine National Roundtable on Health Care Quality. JAMA, 280(11), 1000–1005. Chassin, M. R., & Loeb, J. M. (2011). The ongoing quality improvement journey: Next stop, high reliability. Health Affairs, 30(4), 559–568. Chassin, M. R., & Loeb, J. M. (2013). High‐reliability health care: Getting there from here. Milbank Quarterly, 91(3), 459–490. Christianson, M. K., Sutcliffe, K. M., Miller, M. A., & Iwashyna, T. J. (2011). Becoming a high reliability organization. Critical Care, 15, 314. Clancy, C. M. (2008, May–June). AHRQ: Putting reliability into practice. Lessons from healthcare leaders. Patient Safety & Quality Healthcare. 
Retrieved from http://psqh.com/ahrq-putting -reliability-into-practice-lessons-from-healthcare-leaders Clancy, C. M. (2010). Ten years after “To err is human.” American Journal of Medical Quality, 24(6), 525–528.


Clancy, C. M., & Reinersten, J. L. (Eds.). (2006). Keeping our promises: Research, practice, and policy issues in health care reliability [Special issue]. Health Services Research, 41(4, pt. 2), 1535–1720. Clancy, C. M., & Scully, T. (2003). A call to excellence. Health Affairs, 22(2), 113–115. Cohn, K. (2011, May 5). Building and sustaining a system-wide culture of healthcare safety: Getting it done chapter 4 [Blog post]. Retrieved from http://healthcarecollaboration.com/chapter -4-getting-it-done-building-and-sustaining-a-system-wide-culture-of-safety/ Cooper, J. B., Newbower, R. S., Long, C. D., & McPeek, B. (1978). Preventable anesthesia mishaps: A study of human factors. Anesthesiology, 49(6), 399–406. Crozer-Keystone Health System. (2012, August). Becoming a “high-reliability organization”: The next step in improving quality, safety and efficiency. The Journal: News and Views from the Crozer-Keystone Health System. Retrieved from https://web.archive.org/web/20130902115154/ http://www.crozerkeystone.org/news/Publications/The-Journal/2012/august/becoming-a-high -reliability-organization-the-next-step-in-improv/ Davies, H. T. O., & Nutley, S. M. (2000). Developing learning organisations in the new NHS. BMJ: British Medical Journal, 320(7240), 998–1001. Davis, T. (2005, May–June). 100,000 lives campaign adds 1,700 hospitals . . . and counting. Physician Executive, 31(3), 20–23. DerGurahian, J. (2008a). Mark of zero; Providers, patient-safety advocates brainstorm on decreasing errors, bad attitudes. Modern Healthcare, 38(49), 26. DerGurahian, J. (2008b). Raising the industry to the bar; Chassin hopes to reproduce other industries’ levels of effectiveness. Modern Healthcare, 38(49), 26. Dickey, N. W., Corrigan, J. M., & Denham, C. R. (2010). Ten-year retrospective review. Journal of Patient Safety, 6(1), 1–4. Dixon, N. M., & Shofer, M. (2006). Struggling to invent high-reliability organizations in health care settings: Insights from the field. Health Services Research, 41(4, pt. 2), 1618–1632. Dixon-Woods, M., Bosk, C. L., Aveling, E. L., Goeschel, C. A., & Pronovost, P. J. (2011). Explaining Michigan: Developing an ex post theory of a quality improvement program. Milbank Quarterly, 89(2), 167–205. Edmondson, A. C., & Lei, Z. (2014). Psychological safety: The history, renaissance, and future of an interpersonal construct. Annual Review Organizational Psychology and Organizational Behavior, 1, 23–43. Edmondson, A. C., & Singer, S. J. (2012). Confronting the tension between learning and performance. Reflections, 11(4), 34–43. Facing up to medical error [Special issue]. (2000, March 18). BMJ: British Medical Journal, 320(a). Federal Emergency Management Agency. (2004, November 23). NIMS and the incident command system. Retrieved from http://www.fema.gov/txt/nims/nims_ics_position_paper.txt Ferlie, E. B., & Shortell, S. M. (2001). Improving the quality of health care in the United Kingdom and the United States, a framework for change. Milbank Quarterly, 79(2), 281–315. Frankel, A. S., Leonard, M. W., & Denham, C. R. (2006). Fair and just culture, team behavior, and leadership engagement: The tools to achieve high reliability. Health Services Research, 41(4, pt. 2), 1690–1709. Gaba, D. M. (1994). Human error in dynamic medical domains. In M. S. E. Bogner (Ed.), Human error in medicine (pp. 197–224). Hillsdale, NJ: Erlbaum. Gaba, D. M. (2000). Anaesthesiology as a model for patient safety in health care. BMJ: British Medical Journal, 320(7237), 785. Gaba, D. M., Howard, S. K., Fish, K. 
J., Smith, B. E., & Sowb, Y. A. (2001). Simulation-based training in anesthesia crisis resource management (ACRM): A decade of experience. Simulation & Gaming, 32(2), 175–193.


Gaba, D. M., Howard, S. K., & Jump, B. (1994). Production pressure in the work environment: California anesthesiologists’ attitudes and experiences. Anesthesiology, 81(2), 488–500. Gaba, D. M., Maxwell, M., & DeAnda, A. (1987). Anesthetic mishaps: Breaking the chain of accident evolution. Anesthesiology, 66(5), 670–676. Goodman, G. R. (2003). A fragmented patient safety concept: The structure and culture of safety management in healthcare. Hospital Topics, 81(2), 22–29. Griffith, J. R. (2015). Understanding high-reliability organizations: Are Baldrige recipients models? Journal of Healthcare Management, 60(1), 44–62. Hassen P., & Clancy, C. M. (2009, December 9). Invitation by the Canadian Patient Safety Institute and the Agency for Healthcare Research and Quality to the HRO Learning Network Consensus Meeting [Invitation letter]. Helmreich, R. L., & Merritt, A. C. (1998). Culture at work in aviation and medicine: National, organizational, and professional influences. Aldershot, UK: Ashgate. Henriksen, K., Dayton, E., Keyes, M. A., Carayon, P., & Hughes, R. (2008, April). Understanding adverse events: A human factors framework. In R. G. Hughes (Ed.), Patient safety and quality: An evidence-based handbook for nurses (chap. 5). Rockville, MD: Agency for Healthcare Research and Quality. Hines, S., Luna, K., Lofthus J., Marquardt, M., & Stelmokas, D. (2008, April). Becoming a high reliability organization: Operational advice for hospital leaders (Prepared by the Lewin Group under Contract No. 290-04-0011, AHRQ Publication No. 08-0022). Rockville, MD: Agency for Healthcare Research and Quality. Howard, S. K., Gaba, D. M., Fish, K. J., Yang, G., & Sarnquist, F. H. (1992). Anesthesia crisis resource management training: Teaching anesthesiologists to handle critical incidents. Aviation, Space, and Environmental Medicine, 63(9), 763–770. Healthcare Performance Improvement. (2015). Who we are. Retrieved from http://hpiresults.com/ index.php/intro/who-we-are Institute of Medicine. (1999, November). To err is human: Building a safer health system. Washington, DC: National Academy Press. Retrieved from http://www.nap.edu/openbook.php?isbn=0309068371 Institute of Medicine. (2001, March). Crossing the quality chasm: A new health system for the 21st Century. Washington, DC: National Academy Press. Retrieved from http://www.nap.edu/ openbook.php?record_id=10027 Institute of Medicine. (2004, November). Keeping patients safe: Transforming the work environment of nurses. Washington, DC: National Academy Press. Retrieved from http://books.nap.edu/ openbook.php?record_id=10851 James, J. T. (2013). A new, evidence-based estimate of patient harms associated with hospital care. Journal of Patient Safety, 9(3), 122–128. James M. Anderson Center for Health Systems Excellence. (n.d.). Becoming a high reliability organization. Retrieved from http://www.cincinnatichildrens.org/service/j/anderson-center/safety/ methodology/high-reliability/ Johnson, K., Raggio, R. D., Stockmeier, C., & Thomas, Jr., C. S. (2010). Healthcare performance improvement and high reliability: A best practice methodology. In M. L. Delk, L. Linn, M. Ferguson, & R. Ross (Eds.), Healing without harm: 21st century healthcare through high reliability (pp. 17–26). Atlanta, GA: Center for Health Transformation. Jose, K. S. (2005, March/April). Creating a high reliability organization. Notes on Nursing at Lahey Clinic. Burlington, MA: Lahey Clinic. 
Retrieved from http://www.lahey.org/About_Lahey/ Publications/Nursing_Publications/Notes_on_Nursing_March_-_April_2005.aspx Kalb, C. (2006, October 16). Fixing America’s hospitals. Newsweek, 72, 44–68.


Kizer, K. W. (2001). Patient safety: A call to action. A consensus statement from the National Quality Forum. Medscape General Medicine, 3(2), 10. Knox, G. E., & Simpson, K. R. (2004). Teamwork: The fundamental building block of high-­reliability organizations and patient safety. In B. J. Youngberg & M. J. Hatlie (Eds.), The patient safety handbook (pp. 379–415). Sudbury, MA: Jones and Bartlett. Knox, R. A. (1995, March 23). Doctor’s orders killed cancer patient: Dana-Farber admits drug overdose caused death of Globe columnist, damage to second woman. Boston Globe, p. A1. La Porte, T. R. (1994). A state of the field: Increasing relative ignorance. Journal of Public Administration Research and Theory, 4(1), 5–15. La Porte, T. R. (1996). High reliability organizations: Unlikely, demanding and at risk. Journal of Contingencies and Crisis Management, 4(2), 60–71. Leape, L. L., Simon, R., & Kizer, W. K. (1999). Reducing medical error: Can you be as safe in a hospital as you are in a jet (Issue Brief No. 740). Washington, DC: National Health Policy Forum. Leape, L. L., Woods, D. D., Hatlie, M. J., Kizer, K. W., Schroeder, S. A., & Lundberg, G. D. (1998). Promoting patient safety by preventing medical error. JAMA, 280(16), 1444–1447. Lekka, C. (2011). High reliability organisations: A review of the literature (Research report 899). London: Health and Safety Executive. Londorf, D. (1995). Hospital application of the incident management system. Prehospital and Disaster Medicine, 10(3), 184–188. Madsen, P., Desai, V., Roberts, K., & Wong, D. (2006). Mitigating hazards through continuing design: The birth and evolution of a pediatric intensive care unit. Organization Science, 17(2), 239–248. Manser, T. (2009). Teamwork and patient safety in dynamic domains of healthcare: A review of the literature. Acta Anaesthesiologica Scandinavica, 53(2), 143–151. Marcus, J. (1995, April 2). Fatal goof jolts famous cancer institute. Los Angeles Times. Retrieved from http://articles.latimes.com/1995-04-02/news/mn-49896_1_boston-globe-s-betsy-lehman May, E. L. (2013). The power of zero: Steps toward high reliability healthcare. Healthcare Executive, 28(2), 16–22. McCannon, C. J., Schall, M. W., Calkins, D. R., & Nazem, A. G. (2006). Saving 100,000 lives in US hospitals. BMJ: British Medical Journal, 332(7553), 1328–1330. McCarthy, D., & Blumenthal, D. (2006). Committed to safety: Ten case studies on reducing harm to patients. New York: Commonwealth Fund. McCarthy, D., & Klein, S. (2011). Sentara Healthcare: Making patient safety an enduring organizational value. The Commonwealth Fund, 1476(8), 1–18. Millenson, M. L. (2002). Pushing the profession: How the news media turned patient safety into a priority. Quality and Safety in Health Care, 11(1), 57–63. Milstein, A. (2002). What does the Leapfrog Group portend for physicians? Seminars in Vascular Surgery, 15(3), 198–200. Nolan, T., Resar, R., Haraden, C., & Griffin, F. A. (2004). Improving the reliability of health care [IHI Innovation Series white paper]. Boston: Institute for Healthcare Improvement. Patient Safety Network [PSNet]. (2015). Managing the unexpected: Sustained performance in a complex world, 3rd edition [Book report]. Retrieved from http://psnet.ahrq.gov/resource.aspx?resourceID=1605 Performance Improvement International. (2015). What makes PII’s approach unique. Retrieved from http://errorfree.com/company/ Plsek, P. (2003, January 27–28). Complexity and the adoption of innovation in health care. 
Paper presented at the Conference on Accelerating Quality Improvement in Health Care: Strategies to Speed the Diffusion of Evidence-Based Innovations. Washington, DC: National Institute for Health Care Management Foundation and National Committee for Quality Health Care. Retrieved from http://www.nihcm.org/pdf/Plsek.pdf


Plsek, P. E., & Greenhalgh, T. (2001). The challenge of complexity in health care. BMJ: British Medical Journal, 323(7313), 625–628. Pronovost, P. J., Armstrong, C. M., Demski, R., Callender, T., Winner, L., Miller, M. R., . . . & Rothman, P. B. (2015). Creating a high-reliability health care system: Improving performance on core processes of care at Johns Hopkins Medicine. Academic Medicine, 90(2), 165–172. Pronovost, P. J., Weast, B., Holzmueller, C. G., Rosenstein, B. J., Kidwell, R. P., Haller, K. B., . . . & Rubin, H. R. (2003). Evaluation of the culture of safety: Survey of clinicians and managers in an academic medical center. Quality and Safety in Health Care, 12(6), 405–410. Reason, J. (1995). Understanding adverse events: Human factors. Quality in Health Care, 4(2), 80–89. Reason, J. (2000). Human error: Models and management. BMJ: British Medical Journal, 320(7237), 768–770. Rees, C. (2013, April 11). An update on South Carolina safe care: The journey to high reliability. Presentation to Medical University of South Carolina Board of Trustees. Retrieved from http:// www.fathompbm.com/MUSC/internal/QUALITY-REPORT-REES.pptx Rice, S. (2014, July 17). Hospital patients no safer today than 15 years ago, Senate panel hears. Modern Healthcare. Retrieved from http://www.modernhealthcare.com/article/20140717/ NEWS/307179965/hospital-patients-no-safer-today-than-15-years-ago-senate-panel-hears Roberts, K. H., Madsen, P., Desai, V., & van Stralen, D. (2005). A case of the birth and death of a high reliability healthcare organisation. Quality and Safety in Health Care, 14(3), 216–220. Rochlin, G. I. (1993). Defining high-reliability organizations in practice: A taxonomic prolegomenon. In K. H. Roberts (Ed.), New challenges to understanding organizations (pp. 11–32). New York: Macmillan. Roe, E. M., & Schulman, P. R. (2008). High reliability management: Operating on the edge. Palo Alto, CA: Stanford Business. Rosenthal, M. M., & Sutcliffe, K. M. (2002). Medical error: What do we know? What do we do? San Francisco: Jossey-Bass. Salas, E., Cooke, N. J., & Rosen, M. A. (2008). On teams, teamwork, and team performance: Discoveries and developments. Human Factors, 50(3), 540–547. San Mateo County Health Services Agency, Emergency Medical Services. (1998, June). The third edition of the hospital incident command system. San Mateo, CA: San Mateo County Health Services Agency. Schulman, P. R. (2002), Medical errors: How reliable is reliability theory? In M. M. Rosenthal & K. M. Sutcliffe (Eds.), Medical error: What do we know? What do we do? (pp. 200–216). San Francisco: Jossey-Bass. Schulman, P. R. (2004). General attributes of safe organisations. Quality & Safety in Health Care, 13(Suppl. 2), ii39–ii44. Schulman, P. R. (2014, October 12). High reliability research and safety research: Frameworks for unified theory? Presentation at the Center for Catastrophic Risk Management, Berkeley, CA. Schulman, P. R., & Roe, E. M. (2014, November 14). What is high reliability management? Can it help health and medical care? Presentation at the Vanderbilt Healthcare Workshop, Nashville, TN. Sheps, S., & Cardiff, K. (2005, December). Governance for patient safety: Lessons from non-health riskcritical high-reliability industries. (Report to Health Canada, Project 6795-15-2003-5760006). Ottawa: Health Canada. Shortell, S. M. (2006). Promoting evidence-based management. Frontiers of Health Services Management, 22(3), 23–29. Singer, S. J., Falwell, A., Gaba, D., Meterko, M., Rosen, A., Hartmann, C., & Baker, L. 
(2009). Identifying organizational cultures that promote patient safety. Health Care Management Review, 34(4), 300–311.


Singer, S. J., & Vogus, T. J. (2013a). Reducing hospital errors: Interventions that build safety culture. Annual Review of Public Health, 34, 373–396. Singer, S. J., & Vogus, T. J. (2013b). Safety climate research: Taking stock and looking forward. BMJ Quality and Safety, 22(1) 1–4. Smith, C. (2014, May 4). Driving high reliability: A systematic approach for eliminating harm. Presentation at the Indiana Patient Safety Summit, Indianapolis, IN. South Carolina Hospital Association. (2015). South Carolina safe care commitment. Retrieved from https://web.archive.org/web/20150308014903/http://www.scha.org:80/south-carolina-safecare-commitment Spear, S. J. (2005, August 29). The health factory [Op-ed]. New York Times, p. 15. Sutcliffe, K. M. (2011). High reliability organizations (HROs). Best Practice & Research Clinical Anaesthesiology, 25(2), 133–144. A system approach to reducing error works better than focusing on individuals [Special issue]. (2000, March 18). BMJ: British Medical Journal, 320(g). Tamuz, M., & Harrison, M. I. (2006). Improving patient safety in hospitals: Contributions of highreliability theory and normal accident theory. Health Services Research, 41(4, pt. 2), 1654–1676. The Joint Commission. (2012). Improving patient and worker safety: Opportunities for synergy, collaboration and innovation. Oakbrook Terrace, IL: Author. The Joint Commission Center for Transforming Healthcare. (2014a). Creating solutions for high reliability health care [Brochure]. The Joint Commission Center for Transforming Healthcare. (2014b). Targeted solutions tools. Retrieved from http://www.centerfortransforminghealthcare.org/assets/4/6/TST_brochure.pdf The Joint Commission Center for Transforming Healthcare. (2015). High reliability resource center. Retrieved from http://www.jointcommission.org/highreliability.aspx Tucker, A. L., & Edmondson, A. C. (2003). Why hospitals don’t learn from failures: Organizational and psychological dynamics that inhibit system change. California Management Review, 45(2), 55–72. Uhlig, P. N., Brown, J., Nason, A. K., Camelio, A., & Kendall, E. (2002). System innovation: Concord hospital. Joint Commission Journal on Quality and Patient Safety, 28(12), 666–672. US Senate Subcommittee on Primary Health and Aging. (2014, July 17). Subcommittee hearing— More than 1,000 preventable deaths a day is too many: The need to improve patient safety. Retrieved from http://www.help.senate.gov/hearings/hearing/?id=478e8a35-5056-a032-52f8-a65f8bd0e5ef van Stralen, D. (2003, August 10–11). Getting it right: The science behind solving the unsolvable (with the duty to act) [Conference brochure]. Retrieved from http://high-reliability.org/files/25 _Conference_Brochure.pdf van Stralen, D. (2012, May 21–23). Seeking reliability through operations, attitudes, and measuring success [Conference brochure]. Retrieved from http://high-reliability.org/flyer-rev2.pdf Vincent, C., Benn, J., & Hanna, G. B. (2010). High reliability in health care. BMJ: British Medical Journal, 340(c84). Vincent, C., Taylor-Adams, S., & Stanhope, N. (1998). Framework for analysing risk and safety in clinical medicine. BMJ: British Medical Journal, 316(7138), 1154. Vogus, T. J., Rothman, N. B., Sutcliffe, K. M., & Weick, K. E. (2014). The affective foundations of high‐reliability organizing. Journal of Organizational Behavior, 35(4), 592–596. Vogus, T. J., & Sutcliffe, K. M. (2012). Organizational mindfulness and mindful organizing: A reconciliation and path forward. 
Academy of Management Learning and Education, 11(4), 722–735. Vogus, T. J., Sutcliffe, K. M., & Weick, K. E. (2010). Doing no harm: Enabling, enacting, and elaborating a culture of safety in health care. The Academy of Management Perspectives, 24(4), 60–77. Wachter, R. M. (2010). Patient safety at ten: Unmistakable progress, troubling gaps. Health Affairs, 29(1), 165–172.


Wachter, R. M., & Pronovost, P. J. (2009). Balancing “no blame” with accountability in patient safety. New England Journal of Medicine, 361(14), 1401–1406. Weick, K. E. (1987). Organizational culture as a source of high reliability. California Management Review, 29(2), 112–127. Weick, K. E. (2012, April 1). Continuities in HRO thought. Preconference materials for the Looking Back, Looking Forward Workshop, Nashville, TN. Weick, K. E., & Sutcliffe, K. M. (2001). Managing the unexpected: Assuring high performance in an age of complexity. San Francisco: Jossey-Bass. Weick, K. E., & Sutcliffe, K. M. (2006). Mindfulness and the quality of organizational attention. Organization Science, 17(4), 514–524. Weinstock, M. (2007). Can your nurses stop a surgeon? Hospitals & Health Networks/AHA, 81(9), 38. Wilson, K. A., Burke, C. S., Priest, H. A., & Salas, E. (2005). Promoting health care safety through training high reliability teams. Quality and Safety in Health Care, 14(4), 303–309. Winokur, S. C., & Beauregard, K. J. (2005). Patient safety: Mindful, meaningful, and fulfilling. Frontiers of Health Services Management, 22(1), 17–32.

chapter 11

Organizing for Reliability in Practice
Searching for Resilience in Communities at Risk
Louise K. Comfort

Organizational Reliability at the Community Level

The challenge to organizational reliability lies in practice, and the degree to which an organization can achieve reliability is shaped by the context in which the organization operates. If that context is changing, the organization’s previous rules, assumptions, and practices may also need to change simply to maintain its basic performance. The paradox of organizational reliability operating in uncertain conditions is that, in order to maintain stability, the organization, like a gyroscope, must adapt dynamically to a changing environment or lose its balance in performance. Other chapters in this book review the early research on HROs (Rochlin, 1987; 1996; Roberts, 1993; La Porte & Consolini, 1991; Weick, 1995; Weick & Sutcliffe, 2001, Roe & Schulman, 2008). Briefly restated, the goal of the early research on HROs was to identify the factors that enabled organizations to maintain reliable performance under conditions of high uncertainty and change. In this broad arena, early researchers focused on specific types of organizations designed to function in high-risk environments, such as aircraft carriers, nuclear plants, or electrical grids, and the personnel who operate these highly technical and potentially dangerous systems.


The standard means of managing risk in these environments is to bound the organization, develop training and operating procedures for that specific set of tasks, foster practice of heedful interaction to alert all members of the organization to potential error or malfunction, and articulate a culture in which each member accepts responsibility for correcting observed errors in any part of the organization’s performance. Maintaining high performance is viewed as a collective goal and a shared responsibility of the whole organization. Implicit in this early research is the assumption that the factors that produce high reliability in one type of organization are transferable to other types of organizations operating in risky environments. Consequently, the goal was to understand this process by intensively studying single organizations operating in high-risk environments, identifying the key factors that contributed to reliable performance, and applying this model in practice to other organizations, increasing their reliability in performance accordingly. Social reality, however, belies this basic assumption. Instead, there is a growing recognition that the goal of highly reliable performance can no longer be met by single organizations. Most, if not all, societal functions operate in highly complex, multiorganizational systems at nested scales of action and interaction. Given the fundamental interdependence of organizations operating in human communities, from single-family households to metropolitan regions to globally connected networks of commerce, transportation, and finance, the task of achieving high reliability becomes much more complex. Yet, the earlier HRO research offers substantive insight into framing inquiry into reliable performance on a community scale. First, recognizing that reliability at the community level is achieved by a system of organizations working in concert to achieve a shared goal, such as seismic risk reduction, alters the basic unit of analysis. It is essential to identify the number of actors and levels of operation that are necessary for mobilizing the personnel, equipment, materials, and actions involved in reducing seismic risk. For example, specific tasks include drafting building codes; assessing existing buildings; monitoring the construction of new buildings; and ensuring that the builders, owners, clients, financiers, and users of buildings are aware of the requirements for seismic risk reduction and have the knowledge, skills, and capacity to meet them. These tasks are performed by different people, ­organizations, and authorities; if any one unit falters, the reliability of the whole system involved in reducing seismic risk is diminished.


Second, given the number of actors engaged at different scales of operation within the same system that is seeking to reduce seismic risk, information asymmetry between different scales of operation becomes a risk in daily operations. Differences in access to information, capacity for understanding and updating information in real time, and ability to share and exchange information with relevant actors in the system provide key measures of potential dysfunction for the whole system. By identifying the differences in performance among system components that lead to gaps or weaknesses in information flow, it is possible to diagnose and reduce those differences before they weaken system performance as a whole. Third, rules and procedures for operations are defined as a systematic method of achieving reliable performance in changing conditions. The difficulty is that the rules and procedures for operating at one level of performance in a complex system may be inconsistent or conflict with rules and procedures defined for organizations operating at a different level of performance in the overall system, even if they are seeking to achieve the same goal. If organizations have not implemented a means of detecting inconsistencies or resolving procedural conflicts within the larger system, these inconsistencies may inhibit the flow of information throughout the system and, instead of increasing reliability, reduce system performance through dysfunction and conflict. Related to managing differences in rules and procedures among different levels of operation within a complex operating system is the need to monitor performance at distinct levels within the system. If there is no clear understanding of distinct activities performed at each level, nor a method for integrating different types of information from different sets of activities into a common profile of operations for the whole system, there is little capacity for the different actors and components to function as a coherent system. The dynamic energy of the whole system is dissipated through an inadequate structure for managing multiple tasks simultaneously. In contrast, channeling the information flow throughout the whole system creates the shared awareness that guides collective action and enables reliable performance. Given these observations, the question of what factors lead to high reliability in organizational performance changes to how can complex, interorganizational activities scale in manageable operations to achieve coherent performance as a system? That is, how can organizational performance at one level of operations
be designed to create a basis for aggregation and integration of actions at the next level of operations, and the next? What are the minimum requirements that allow transition from one level of complex operations to the next wider level of activities, influence, and performance, and the next, while still retaining focus on the same goal for the whole system? The inquiry moves from seeking reliability for a single organization to seeking reliability for a system of interacting organizations—a more difficult, complex task.
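The shift in the unit of analysis described above, from a single organization to organizations interacting across nested levels, can be pictured with a small, purely hypothetical sketch. The Python fragment below is not drawn from this chapter or its cases; the organization names, levels, and links are invented for illustration. It represents organizations as nodes tagged with an operational level and flags pairs at different levels that share no information link in either direction, a crude stand-in for the information asymmetry discussed earlier.

# Hypothetical illustration only: organizations at nested operational levels,
# with directed "shares information with" links between them.
ORG_LEVEL = {
    "building_inspector": "local",
    "city_planning_dept": "municipal",
    "utility_operator": "municipal",
    "state_seismic_board": "state",
}

INFO_LINKS = {
    ("building_inspector", "city_planning_dept"),
    ("city_planning_dept", "state_seismic_board"),
    # Note: the utility operator exchanges information with no one in this toy example.
}

def missing_cross_level_links(levels, links):
    """List pairs of organizations at different levels with no information link
    in either direction: a rough indicator of cross-level information gaps."""
    gaps = []
    names = sorted(levels)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if levels[a] != levels[b] and (a, b) not in links and (b, a) not in links:
                gaps.append((a, b))
    return gaps

if __name__ == "__main__":
    for a, b in missing_cross_level_links(ORG_LEVEL, INFO_LINKS):
        print(f"no information link between {a} ({ORG_LEVEL[a]}) and {b} ({ORG_LEVEL[b]})")

In an actual analysis the nodes, levels, and links would come from empirical mapping of a community's organizations, and a richer measure than a missing edge would be needed; the point here is only that once the system of organizations is made explicit, gaps between levels become something one can enumerate and monitor.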

The Challenge: Measuring and Calibrating Risk in a System of Systems

A long-held maxim in organizational studies is that, in order to understand a complex social construct, it is necessary to measure it. The challenge is to create appropriate tools for measuring dynamic system performance and visualizing the complex sets of interactions as operating conditions change, allowing actors to adapt their performance in real time. In dynamic systems, the unit of analysis shifts from a single organization to systems of organizations interacting at different scales of operation. With this shift, a fundamental change occurs in the performance of the system. The mechanisms of operation necessarily become sociotechnical as human managers can no longer track and comprehend the volume and rapid shifts in information flow without technical assistance. The evolving system necessarily includes human managers, technical sensors, and computers working together to provide decision support for managing complex tasks. Yet, just as this sociotechnical interaction enables enlargement of the system’s operational capacity to address more complex problems encompassing greater risk, it also creates new sources of potential error, if any one of the primary actors—human managers, technical sensors, or computational systems—fails. The basic mechanisms of system operation include computation of the likely consequences of success or failure in making decisions for action and provision of prompt feedback to relevant actors in order to translate new information into action and adaptation to a changing environment. But these mechanisms, designed by fallible humans, are also subject to failure that may cascade throughout the system. Measurement of system performance to capture the three-way interaction among organizational actors, technical infrastructure, and actual operating

conditions requires assessment of system operations in reference to an actual context of risk. In hazardous situations, risk and reliability tend to coexist in reciprocal tension. That is, as attention, effort, and knowledge are focused on reducing a specific risk, the operational system becomes more reliable in managing that known risk, but more vulnerable to other, unanticipated threats that may emerge outside the bounds of the managers’ focused attention and resources (Carlson & Doyle, 2000). This condition, termed “highly optimized tolerance” (Ibid.) illustrates the limits of human managerial capacity (Simon, 1997) in situations where the proportion of known risk to unknown risk decreases. The complexity of technical and social interactions may simply exceed human capacity to comprehend and manage such systems reliably. Consequently, risk, unattended, may escalate for the very communities that build sociotechnical systems to create more reliable performance in providing basic functions such as power, communications, transportation, gas, water, and wastewater distribution.

A Theoretical Framework for Communities at Risk

Four key concepts from the review of the earlier literature on HROs provide a logical framework for analyzing dynamic systems in practice. These basic concepts, grounded in earlier literature on high reliability, complex adaptive systems, and interorganizational learning, serve as a useful structure for analyzing the search for reliable performance among organizations in communities at risk.

Initial Conditions

Drawing on complex systems theory, the role of "initial conditions" is fundamental in shaping organizational systems that are capable of self-assessment, learning, and adaptation in response to change, whether sudden or slow onset (Prigogine & Stengers, 1984; Kauffman, 1993; Comfort, 1999; Ostrom, 2005). The specific tasks of assessing risk in its changing states, allocating each state of risk to known personnel for managing its visible indicators, aggregating the overview of emerging risk for the whole community, and communicating the degree of known risk to relevant groups in the whole community build on a sound assessment of the existing characteristics, functions, and resources
of a specific community. The capacity to execute these tasks depends on the degree of knowledge, resources, and personnel available at each level of operation. As the physical area, size of the population, and complexity of interdependent functions increase, the fragility of the system designed to manage risk also increases.

Reliability as a Dynamic Process

A key implication of the high reliability approach is to view reliability as a dynamic process. The concept of "heedful interrelating" among individuals exposed to shared risk (Weick & Roberts, 1993, p. 360) acknowledges that maintaining collective awareness in uncertain conditions is an ongoing process. Karl Weick (1995) extends this fundamental insight in later work on sensemaking, treating this effort as a continuing activity in uncertain conditions that acknowledges the need to validate perceptions of risk against actual evidence and to update one's assessment as more information becomes available. Exploring further conditions under which organizations manage to maintain reliable performance in changing contexts, Weick and his colleague, Kathleen Sutcliffe (2001), identify strategies for coping with vague, uncertain conditions as an ongoing process of recognizing discrepancies from expected models of action and adjusting both perceptions and actions to fit actual conditions more appropriately.

Limits on Reliability

In contrast to the assumption that high reliability is consistently achievable in potentially dangerous, uncertain contexts is recognition that any complex set of operations will have constraints of time, knowledge, and budget that will limit its performance. Rather than expending resources to overcome constraints endemic to complex operations, Robert Axelrod and Michael Cohen (1999) advocate "harnessing complexity" as a means of acknowledging both the limits of any one organization seeking to achieve a certain goal and the possibility of generating complementary interactions with other organizations in order to achieve a shared, if partial, goal. Acknowledging the limits of its actions at one level of operation requires an organization to select its interactions with other organizations from the total population of available strategies in a consistent effort to achieve its basic goal at the next level of operations.
Recognition of its limits becomes a driver for the selection of the next set of interactions among available strategies and conditions. Consequently, the complexity of the operating environment shapes the degree of reliability that can be achieved within the constraints of time, budget, and knowledge at any scale of operations. Scalability as Transition within Flexible Structures Understanding transitions in size and scale as demand for organizational response fluctuates is a long-standing challenge for multiorganizational networks. If rules are developed to define orderly progression in stable conditions, the same rules may restrict necessary adaptation and adjustment in unexpected or uncertain conditions. The concept of fractals in complex systems offers insight into this dilemma, as mathematician Benoit Mandelbrot (1977, 2004) asserted that fractals represent self-similar patterns at different scales of size and extent of interaction. If these self-similar patterns are abundant in nature—for example, in wave forms or sand dunes—are they also characteristic of organizational forms that humans create? If so, then a small number of key organizational elements may replicate their patterns of interaction at different scales of performance, retaining the same form and function at micro-, meso-, and macrolevels of operation. The concept is intriguing as it balances change with stability and offers a plausible mechanism for managing the transition from one level of operation to the next in a large-scale, sociotechnical system. Viewing interactions among multiple organizations as a complex, adaptive system provides an appropriate method for analyzing the search for reliability at a community scale. Searching for Reliability in Communities at Risk Managing the scalable transition from one operational level to the next in a complex adaptive system of systems (CASoS) is a key requirement for enhancing the reliability of complex systems consisting of multiple, interacting subsystems that are performing distinct tasks. Is there an identifiable set of self-similar patterns that enable transition from one operational level to the next in a CASoS? Or, are there simply different lenses through which organizational participants view the same set of operational dilemmas, as argued by John Carroll (Chapter 3), consequently leading to different choices in action?


What are the key components that drive patterns of reliable performance in multiorganizational contexts, and could they be replicated for different types of events in different types of communities? This inquiry next examines the theoretical construct of organizational reliability against empirical evidence in three cases of response systems that evolved following severe earthquakes. This analysis provides a critical test for extending the concept of organizational reliability from single organizations to multiorganizational communities at risk.

Managing Organizations in the Extreme

Recent events serve as vivid reminders that hazards occur in a given location but trigger interacting consequences over a wider region, crossing jurisdictional, organizational, and disciplinary boundaries. Extreme events represent extraordinary challenges to practicing managers as they seek to create and maintain stable, secure communities for their residents. Such events include “Superstorm Sandy,” which ravaged the coasts of four states in the eastern United States in late October and early November 2012; the floods that disrupted downtown Jakarta, Indonesia, in January 2013; the Ya’an earthquake that tumbled houses and lives in Sichuan province, China, on April 20, 2013; and the devastating EF5 tornado that touched down in Moore, Oklahoma, on May 20, 2013. The challenge is to recognize, before an event occurs, the interconnected, interdependent systems that create the potential for cascading failures.

The impact of Sandy not only crumpled beachfront homes and businesses, but also shut down electrical power through a wide region, impairing performance in homes, businesses, schools, public agencies, and nonprofit organizations. In Jakarta, flooded streets in the downtown area damaged electrical and computational systems and stalled governmental performance in the capital city, affecting the delivery of public services to other provinces in Indonesia and communications with other nations in the region. It is not the specific hazard in these events that is most disruptive, but the cumulative set of interactions triggered by the initial failure in the mesh of physical, technical, social, and economic systems that characterize the contexts of current society. Unraveling the interdependencies that both generate and lessen exposure to risk in a specific community is the quintessential task of emergency services. Capturing the dynamics involved in these interacting


systems requires an integrated, interdisciplinary, interjurisdictional approach, in contrast to the daily practice of controlling single systems separately. Actual events also illustrate the three lenses of strategic design, political contestation, and cultural norms (see Carroll, Chapter 3) that filter differing perceptions of the same event and shape the potential set of strategies articulated for community action. The question is whether actual practice reveals a small set of critical conditions that facilitate transition across rapidly escalating scales of operation and thereby increase the reliability of community action.

CASoS

CASoS is a promising approach to analyzing the dynamics of rapidly evolving emergency events. The CASoS initiative is led by Robert Glass and his colleagues at Sandia National Laboratories (Glass et al., 2011). This group has studied the dynamics of change in interacting systems and initiated a research agenda to explore the interactions among key agencies in emergent systems that form in response to specific events or tasks.

For example, a metropolitan regional transportation system consists of multiple systems: an engineering system that lays out the routes and mechanisms of transport in a geographic region; a physical system of hills, rivers, and urban construction over which the engineered transport system is laid; an organizational system of transportation engineers, window clerks, and mechanics that keep the trains, buses, and autos running on time; and a socioeconomic system of passengers, prices, and demand for transportation that drives the entire set of functions to enable people in the area to move easily and efficiently within the region. Each part of the system is influenced by the other parts; no component of the system could function alone. As one component adapts to fit a changing environment more appropriately, other components are also affected and adapt in turn. It is the flow of people, machines, and money through the subsystems that enables the meta-system to provide needed transportation as a service to the region. The interdependence of functions in the system is such that a disruption in any one subsystem likely results in delay or disruption in other subsystems, which cumulatively slows down the performance and efficiency of the whole system. Public managers are tasked with determining which functions are essential to keep the system running and what nodes in the system


serve as either facilitators or inhibitors of system performance under stress. The goal is not to deny potential disruptions from hazards, as they will occur, but rather to acknowledge the risks to which the region is exposed and to enable multiple actors in complex, multiscale operating systems to assess, design, and test methods of adaptation and learning in actual environments that build resilience on a community-wide scale.

Other analytical approaches incorporate similar recognition of multilayered scales of operation, with different sets of shared beliefs, rules, and expectations characterizing each scale (Ostrom, 2005; Hess & Ostrom, 2007; Fligstein & McAdam, 2012). Researchers have sought to identify and analyze networks of action in administrative contexts in a range of policy areas (Nohria & Eccles, 1992; Provan & Milward, 2001; Agranoff, 2007; Koliba, Meek, & Zia, 2010; Koppenjan & Klijn, 2004). Each of these studies offers insights into the design and management of networks of action. The CASoS approach is distinctive in its recognition of the interaction of technical systems with organizational systems in a changing context that enables self-organizing, collective action in anticipation of risk.
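The interdependence among subsystems in the transportation example can be made concrete with a toy propagation model. The sketch below is purely illustrative and is not part of the Sandia CASoS toolkit; the subsystem names and the dependency map are invented for the example.

    from collections import deque

    # Hypothetical dependency map for a regional transport system: each
    # subsystem lists the subsystems whose outputs it needs to function.
    depends_on = {
        "physical_terrain":  [],
        "engineered_routes": ["physical_terrain"],
        "operations_staff":  ["engineered_routes"],
        "service_delivery":  ["operations_staff", "engineered_routes"],
        "ridership_revenue": ["service_delivery"],
    }

    def degraded_by(initial_failure):
        """Return every subsystem degraded, directly or indirectly, by one failure."""
        # Invert the map: which subsystems consume each subsystem's output?
        consumers = {name: [] for name in depends_on}
        for node, needs in depends_on.items():
            for need in needs:
                consumers[need].append(node)
        affected, queue = {initial_failure}, deque([initial_failure])
        while queue:  # breadth-first propagation of the disruption
            for downstream in consumers[queue.popleft()]:
                if downstream not in affected:
                    affected.add(downstream)
                    queue.append(downstream)
        return affected

    print(degraded_by("engineered_routes"))

Run against this hypothetical map, a failure of the engineered routes degrades operations, service delivery, and ridership revenue in turn, which is the cascading pattern the text describes.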

Vision: A Global Commons for Collective Learning and Action in Risk Environments

Given rapid advances in information technology in recent years, with handheld devices enabling rapid search, transmission, and access to information in real time, it is now both essential and practical to envision a global framework to support responsible leadership in developing local capacity to manage risk. Such a vision entails four key components: (1) articulation of a global goal of scalable risk reduction for all communities of the world, with consequent specification of actors and actions needed to achieve that goal at different levels of administrative jurisdiction; (2) design of strategies of action that are appropriate for each level of operation within an overall framework of an operating meta-system; (3) implementation of designs for action in an actual context to test the overall framework for mobilizing response to key needs and its utility at different levels of operation; and (4) integration, synthesis, and evaluation of performance at both subsystem and system-wide levels of operation in achieving the goal of global risk reduction. Each of these components entails


a detailed specification of risks, resources, knowledge, and tasks (Carley, 2003) as well as the design of an overall process of iterative review, reflection, and redesign (Comfort, 2007b). Such an approach allows for updating assumptions based on valid data, refocusing strategies of action in accordance with actual constraints, and incorporating systematic feedback from the participants in the process to the managers making policy decisions and guiding action.

Such a vision compels the design of a framework for collective learning and action in dynamic, complex environments. It is essential to expand the existing capacity of organizations already engaged in risk reduction activities to function effectively in a complex, global arena. Equally important is the acknowledgment of emergent organizations that form and reform in response to changing conditions. Such organizations may serve temporary purposes, forming and reforming into new networks that operate within a larger social network. This capacity for adaptation to changing conditions strengthens the resilience of the whole community. Identifying and tracking the conditions under which such networks emerge or transition into new forms of social organization and action are central tasks in understanding the dynamic processes of risk management. The driving force in this model is the shared commitment of all participating organizations to develop and maintain effective performance for managing risk under uncertain conditions.

Asymmetry of Information

The inherent complexity of a global system to manage risk underscores a classic problem in decision making under uncertainty. Since a global system is necessarily a multilevel set of operational units, each unit interacts with different local contexts, cultures, and laws, or the lack of them. The basic assumption of a global system for managing risk is that all actors share the same basic goal. In practice, different actors bring particular perspectives regarding methods of reaching their shared goal, and, consequently, they bring different degrees of experience regarding risk, resources, and knowledge of methods to achieve that goal. Further, these organizations seek, and acquire, different types of information regarding the same event. This situation creates the practical problem of asymmetry in information distributed among the actors participating in a specific event that may lead to miscommunication, misunderstanding, and


inadequate recognition of the risks, resources, knowledge, and tasks that are needed to characterize and address that event (Comfort, 2007a). While asymmetry of information is regrettably persistent in complex systems, it is also a largely solvable problem. This problem fundamentally engages public managers in the tasks of monitoring risks and managing the information essential for decision making to reduce risk at the multiple levels of operation. Calculating the interdependencies among possible actions by different organizations at different levels of jurisdictional operations is a rigorous task but allows comparison of different options (Comfort et al., 2013) to identify those with the highest utility, given the actual constraints of the situation. This analytical task enables practicing policy makers to consider varying combinations of actions that may result in more efficient, effective outcomes than following simple, rule-based checklists that are common practice in emergency services. Building collective capacity for action in a system-wide, collaborative effort to reduce disaster risk means building a common understanding of risk among multiple actors and using that understanding to guide collaborative action to achieve a shared goal. Resilience, in simplest terms, means the capacity of organizations and actors to recognize risk, learn in a dynamic situation, adjust actions to existing constraints, and, most importantly, to continue operations for the community in the most innovative, effective way possible, given available knowledge, skills, and resources.
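The comparison of action options described above can be illustrated with a small, hedged sketch. The costs, utilities, and option names below are invented for the example, and the enumeration is not the model reported in Comfort et al. (2013); it simply shows what ranking feasible combinations of actions against a resource ceiling can look like.

    from itertools import combinations

    # Hypothetical candidate actions: (name, cost in arbitrary units, assumed utility).
    actions = [
        ("shelter_operations",    4, 9),
        ("debris_clearance",      3, 6),
        ("medical_field_teams",   5, 10),
        ("public_risk_messaging", 1, 4),
        ("interagency_liaison",   2, 5),
    ]
    BUDGET = 8  # assumed resource ceiling

    def best_portfolios(actions, budget, top=3):
        """Rank every affordable combination of actions by total assumed utility."""
        feasible = []
        for r in range(1, len(actions) + 1):
            for combo in combinations(actions, r):
                cost = sum(c for _, c, _ in combo)
                if cost <= budget:
                    utility = sum(u for _, _, u in combo)
                    feasible.append((utility, cost, [name for name, _, _ in combo]))
        return sorted(feasible, reverse=True)[:top]

    for utility, cost, names in best_portfolios(actions, BUDGET):
        print(f"utility={utility:2d} cost={cost} -> {names}")

In this toy data, combinations of several lower-cost actions outrank any single high-cost action, illustrating why comparing portfolios of actions, rather than items on a fixed checklist, can matter.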

Contrast between Plans and Action

Dynamic events create a particular conundrum for public managers. While they do plan for extreme events to reduce risk, their planning processes are primarily based on known information and past experience. Ironically, plans based on past events often fail to anticipate the cumulative, interdependent processes of change that characterize a complex, dynamic society (Taleb, 2010). Strategies that had proven effective ten years ago may now be obsolete, given changes in technologies, movements of population, increasing deterioration of infrastructure, recent findings from scientific studies of risk, or constraints in funding for hazard mitigation. As a result, public agencies responsible for managing risk and communities exposed to hazards often do not anticipate the interdependent relationships among the interconnected systems that provide public


services and undergird the basic economic, social, technical, and ecological functions of their region. Consequently, these communities face further and more far-reaching consequences when failure in one system triggers failure in the next system, and the next. An extreme event escalates into region-wide consequences that damage other communities and becomes a major disaster.

The dilemma, then, is how public managers can learn from past events to anticipate future risks. Crystal balls are notoriously ineffective, but it is possible to reexamine planning processes for extreme events, incorporate current methods of dynamic modeling into these processes, and reframe the problem as one of continual inquiry and adaptive learning for the whole community. To do so, it is instructive to compare the planning processes that were in place in communities that experienced a severe event and to assess the extent to which those plans enabled the community at risk to minimize the consequences of the event and to reduce the scale of the hazard it confronted.

In this study, I briefly compare the planning processes for three major seismic disasters—the Sumatra earthquake and tsunami, December 26, 2004; the Haiti earthquake, January 12, 2010; and the Tohoku-Oki earthquake, tsunami, and nuclear breach, March 11, 2011—and assess the utility of those plans in reducing the consequences of these sudden, urgent events. In each of these communities, planning processes had been initiated to varying degrees. Yet the plans proved seriously ineffective in enabling the communities to respond to the actual risk with informed, timely action. The cost and consequences following each disaster escalated beyond the community’s capacity to respond, resulting in catastrophic, global events.

Divergent Planning Processes: Indonesia, Haiti, and Japan

Professional planning assumes a direct relationship between the process of planning and the capacity of organizations engaged in the process to implement the plans that are developed. Yet, in planning for extreme events, even the most carefully developed plans almost never fit the requirements of the actual situation. Planning does, however, serve a fundamental purpose in focusing attention on a risk shared among multiple actors and participants and engaging them in considering potential strategies for action. To the degree that


planning informs community residents about the risks they face and catalyzes collective action in preparation for extreme events, it is a critical step toward reducing disaster risk. The discrepancy between planning for extreme events and capacity for action during extreme events is a measure of resilience for the community. Determining how to reduce this discrepancy between planning and action is a key task for public managers. Examining this problem in comparative perspective allows identification of factors that contribute to, or inhibit, the capacity for collective action in extreme events. Three major disasters in Indonesia, Haiti, and Japan reveal different degrees and types of discrepancy between planning and action in response to the actual hazard. It is instructive to review briefly how these planning processes differed among the three nations exposed to seismic risk and to assess what factors served to reduce or, conversely, increase risk in each nation. Further, what consequences followed from the actions taken, or not taken, to reduce risk in each nation? Finally, what steps are essential to build global resilience to the recurrence of extreme events?

Indonesia: Sumatra Earthquake and Tsunami

On December 26, 2004, at 0758 local time, a massive earthquake, registering moment magnitude (Mw) 9.3, occurred on the Sunda trench off the western coast of North Sumatra, Indonesia. The rupture extended for approximately 1,300 kilometers along the fault line and caused a massive uplift of the sea floor, estimated at 5 meters. This huge displacement generated three tsunami waves that affected twelve nations in the Indian Ocean Basin, but proved most destructive to Indonesia. Although seismic risk is well known in Indonesia, with moderate to severe earthquakes occurring every year somewhere in the archipelago, this earthquake was beyond anyone’s expectation. Indonesia had no national disaster plan at that time, and, consequently, people residing in coastal communities had little knowledge of tsunami risk and less information regarding what protective action to take, should one occur. This was particularly the case for Banda Aceh, a city of 350,000 located at the tip of the island of Sumatra, as it was struck by tsunami waves from two directions: first as the giant wave moved eastward toward the Malay Peninsula, and again as the wave returned westward, ricocheting off the peninsula to strike Sumatra. Given the lack of


preparedness and the power of the tsunami waves, an estimated one out of every three persons in Banda Aceh was lost in this event.

Indonesia, an archipelago of nearly 18,000 islands stretching more than 3,000 miles from west to east, is extraordinarily vulnerable to seismic and volcanic risk. Yet little planning on a national scale had been done prior to 2004. In the province of Aceh, at the northern tip of the island of Sumatra, an intense civil war had raged between the national government and the Free Aceh movement, sapping the attention, energy, and resources of both sides in this long-running conflict. Consequently, the huge tsunami waves, reaching more than 30 meters high at some locations along the shore, caught residents of the coastal communities by surprise. Without a systematic disaster plan in place, residents had no guidance on what to do or where to go should danger strike.

The losses from this sobering event were dramatic. Indonesia reported 237,071 people dead or missing; Sri Lanka tallied over 30,957 fatalities; India listed 10,749 deaths; and Thailand reported 5,393 dead, with approximately half that number identified as tourists from Europe. The economic losses were estimated at US$36 billion across the region, as the tsunami triggered major disruptions in social, economic, political, and cultural activities across the Indian Ocean region.

Seeking to identify the organizational response to this event, our research team at the University of Pittsburgh conducted a content analysis of news accounts of response operations reported in the Jakarta Post for three weeks following this event, December 26, 2004, through January 16, 2005. Table 11.1 presents the distribution of organizations participating in response operations following the earthquake and tsunami. News reports do not provide a complete record of organizational participation in disaster operations, but they do provide a measure of daily participation and offer at least a partial characterization of the number and types of organizations that engaged in response operations. Given imperfect data documentation in disaster environments, these records provide at least a systematic account, which we then validated against expert interviews and documentary records from national and international organizations. The striking finding in Table 11.1 is the large number of international organizations that participated in response operations, nearly 43 percent of the total of 350 organizations, followed by national organizations at 30.3 percent. Local organizations contributed the third highest proportion, at 15.4 percent, with the remaining jurisdictions representing lesser proportions.


table 11.1 Frequency Distribution of Organizations Participating in Response Operations following the 2004 Sumatra Tsunami by Jurisdiction and Source of Funding

                        Source of Funding
                   Public         Private       Nonprofit     Special Interest     Totals
Jurisdiction        N     %        N     %       N     %         N     %          N      %
International       99   28.3      23    6.6     26    7.4        2    0.6       150    42.9
National            57   16.3      28    8.0     19    5.4        2    0.6       106    30.3
Provincial          10    2.9       1    0.3      3    0.9        1    0.3        15     4.3
Special Region      13    3.7       0    0.0      5    1.4        0    0.0        18     5.1
Regency              4    1.1       0    0.0      2    0.6        0    0.0         6     1.7
Subdistrict          1    0.3       0    0.0      0    0.0        0    0.0         1     0.3
Local               32    9.1      15    4.3      7    2.0        0    0.0        54    15.4
Totals             216   61.7      67   19.1     62   17.7        5    1.4       350   100.0

Source: News reports, Jakarta Post [Indonesia], December 26, 2004–January 16, 2005. Totals are subject to rounding error.
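As a rough illustration of how coded news records of this kind can be tabulated into such a frequency distribution, the sketch below uses pandas; the records, organization names, and field names are hypothetical and do not reproduce the research team’s coding instrument or data.

    import pandas as pd

    # Hypothetical coded records: one row per organization identified in the news reports.
    records = pd.DataFrame([
        {"organization": "UN World Food Programme", "jurisdiction": "International", "funding": "Public"},
        {"organization": "Oxfam",                   "jurisdiction": "International", "funding": "Nonprofit"},
        {"organization": "Indonesian military",     "jurisdiction": "National",      "funding": "Public"},
        {"organization": "Telecom operator",        "jurisdiction": "National",      "funding": "Private"},
        {"organization": "Aceh provincial office",  "jurisdiction": "Provincial",    "funding": "Public"},
    ])

    # Cross-tabulate counts by jurisdiction and source of funding, with row and
    # column margins, mirroring the layout of Table 11.1.
    counts = pd.crosstab(records["jurisdiction"], records["funding"], margins=True)

    # Percentages of the grand total, also with margins.
    shares = (pd.crosstab(records["jurisdiction"], records["funding"],
                          margins=True, normalize="all") * 100).round(1)

    print(counts)
    print(shares)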

Figure 11.1 shows the Indonesian national response network of organizations, linking five key nodes to other nodes in the network. The key nodes represent the president of Indonesia, the Indonesian military, the Indonesian Red Cross, the Ministry of Peoples Welfare, and the Ministry of Health. Figure 11.2 shows the erratic entry of new organizations into the response system, with a marked increase in the number of organizations entering in the first three days, followed by a sharp drop in new entries on days 4, 5, and 6, and then a peak of new organizations entering the response system on January 4, 2005. This peak represented a major influx of international organizations following the convening of an international conference by the Association of Southeast Asian Nations (ASEAN) to address the need for a global response to this severe disaster (Jakarta Post, 2005). This graph reveals the degree of disconnectedness among organizations from different jurisdictions, documenting a lack of coherent organization and a near absence of planning across jurisdictional scales. While a response system did emerge, and organizations engaged in operations with a clear sense of their particular missions, there was little coherence in mobilizing collective response to this devastating event. The results from this analysis show a strong discrepancy between the degree of planning and the capacity of the local communities to respond to this calamity. In many respects, the situation reflected in these findings propelled the Indonesian government to begin, in 2005, to invest substantially in disaster planning and public education as a means of reducing disaster risk in the vulnerable nation.

figure 11.1 Indonesian Response Network, Sumatran Tsunami, 12/26/2004–1/17/2005. Source: News Reports, Jakarta Post, Jakarta, ID, 12/26/2004–1/17/2005. [Network diagram omitted; node shapes denote jurisdiction (International, National, Provincial, Regency, Sub-District, Special Region, Local) and node shading denotes sector (Public, Nonprofit, Private, Political).]


figure 11.2 Indonesian National Response System, Sumatran Tsunami, 12/26/2004–1/17/2005. [Chart omitted; x-axis: date, December 26, 2004–January 16, 2005; y-axis: number of new organizations entering the response system, shown separately for Local, Subdistrict, Regency, Provincial, National, International, and Special Region jurisdictions.]
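A hedged sketch of the kind of coding-and-graphing step behind Figures 11.1 and 11.2 is shown below. It uses the networkx library with invented interaction records; it is not the Center for Disaster Management’s analysis code, and the organization names and dates are illustrative only.

    from collections import defaultdict
    import networkx as nx

    # Hypothetical coded interactions: (date, organization A, organization B).
    interactions = [
        ("2004-12-26", "gov_indonesia",        "indonesian_military"),
        ("2004-12-27", "indonesian_red_cross", "ministry_of_health"),
        ("2004-12-27", "gov_indonesia",        "ministry_of_peoples_welfare"),
        ("2005-01-04", "un_wfp",               "gov_indonesia"),
        ("2005-01-04", "oxfam",                "indonesian_red_cross"),
    ]

    G = nx.Graph()
    first_seen = {}
    daily_new = defaultdict(int)
    for date, a, b in interactions:
        G.add_edge(a, b, date=date)
        for org in (a, b):
            if org not in first_seen:  # record the day each organization first appears
                first_seen[org] = date
                daily_new[date] += 1

    # Degree centrality per organization; the highest-ranked nodes play the "key node" role.
    centrality = nx.degree_centrality(G)
    key_nodes = sorted(centrality, key=centrality.get, reverse=True)[:3]

    print("new organizations per day:", dict(daily_new))
    print("key nodes by degree centrality:", key_nodes)

Counting first appearances per day gives an entry curve like the one in Figure 11.2, and ranking nodes by centrality surfaces hub organizations of the kind highlighted in Figure 11.1.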

The Indonesian government and responsible agencies engaged in developing a national disaster plan, officially adopted the plan in 2008, and proceeded to invest in targeted programs of disaster risk reduction, evacuation planning, and public education in six major cities. One of those six cities, Padang, West Sumatra, experienced an earthquake in 2009. The plan improved response to the 2009 event, but a significant discrepancy remained between the 2008 national disaster plan and the capacity of the province of West Sumatra and the city of Padang to implement their plans at the local level in response operations. The need to build the capacity of local communities in Indonesia to manage the substantial risk of seismic and other hazards continues.

Haiti: Earthquake and Aftermath

The role of disaster planning in Haiti is more difficult and complex for a nation with a fragile government and limited economic, social, and technical capacity. The nation had been undergoing a planned development process since the change in government in 2004, with structured support from the


international community in security and economic policy. Ironically, the international scientific community understood the seismic risk in Haiti and sought financial support for geological instrumentation that would have monitored the risk.1 Yet, other hazards were deemed more immediate, and other social and economic needs were considered more pressing in this vulnerable nation. Haiti had not experienced a severe earthquake for more than two hundred years; the most recent events occurred in 1770 and 1751 (Calais et al., 2010). Thus, disaster planning, to the extent that it existed in Haiti, focused on hurricanes, a virtually annual hazard. Haiti had no established building codes, and many residents built their own houses without benefit of engineered design. The interacting conditions of fragile government, limited economy, low literacy, and endemic poverty led to an almost total lack of awareness of seismic risk among the population and an equally troubling lack of capacity for organized action.

Given these initial conditions in Haiti, the consequences of the Mw 7.0 earthquake were sobering. The earthquake occurred on the Enriquillo-Plantain Garden fault, with an epicenter close to the small city of Léogâne, approximately 25 kilometers west of Port-au-Prince, the capital city and population center of Haiti. The earthquake struck on Tuesday, January 12, 2010, at 4:53 p.m., just as government agencies and businesses were closing for the day. A shallow earthquake with a depth of 13 kilometers, or 8.1 miles, the event shattered buildings and tumbled already compromised infrastructure. An estimated 230,000 lives were lost, although the exact figure is still in dispute. Approximately 80 percent of the buildings in Port-au-Prince were damaged or destroyed, including eleven out of twelve government ministries. Nearly 1.5 million people were left homeless, living in tents with temporary water and sanitation facilities. Figure 11.3 shows two photographs that have become emblematic of the damage caused by the 2010 Haiti earthquake. The presidential palace, no longer habitable, was completely demolished in 2012 and has not been rebuilt. The building housing the Ministry of Public Health, the agency responsible for maintaining basic health services in Haiti, was reduced to a heap of rubble.

Table 11.2 presents the distribution of organizations participating in the Haiti response system, identified from news reports in the regional newspaper Caribbean News Online. Nearly three-quarters of the organizations reported as engaged in disaster operations were international, at 74.5 percent. Next were organizations from the Caribbean region, at 18.8 percent.


figure 11.3  Photographs of Presidential Palace (left) & Ministry of Public Health (right)

table 11.2 Frequency Distribution of Organizations Participating in the Haiti Response System by Jurisdiction and Source of Funding

                     Source of Funding
                  Public        Private      Nonprofit        Totals
Jurisdiction       N     %       N     %      N     %        N      %
International      97   56.7     17    9.9    13    7.6     127    74.5
Regional           22   12.9      4    0.2     6    3.5      32    18.8
National           10    5.9      0    0.0     1    0.1      11     6.5
Local               1    0.1      0    0.0     0    0.0       1     0.2
Totals            130   76.0     21   12.3    20   11.7     171   100.0

Source: Caribbean News Online, January 12–February 3, 2010. Totals are subject to rounding error.

Figure 11.4 shows the network diagram of interacting organizations reported in Caribbean News Online as engaged in response operations to the 2010 Haiti earthquake for the three-week period January 12–February 3, 2010. The overall measure of degree centralization, at 19.92 percent, shows a loosely connected network with four clear nodes of interaction in Figure 11.4, centering on the government of Haiti, the Caribbean Community, the Caribbean Disaster and Emergency Management Agency, and the government of Jamaica, with a lesser focus on the office of President of Haiti.

figure 11.4 Network Diagram of Interacting Organizations in the Haiti Earthquake Response System, January 12–February 3, 2010. Source: Caribbean News Online, January 12–February 3, 2010. [Network diagram omitted; node shapes denote jurisdiction (International, Regional, National, Municipal) and node shading denotes sector (Public, Nonprofit, Private, Political).]
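Degree centralization, the 19.92 percent figure cited above, is a graph-level index of how strongly ties concentrate on a few nodes. The sketch below shows one common way to compute Freeman’s degree centralization with networkx; the miniature graph is invented for illustration and does not reproduce the published value or the underlying Haiti data.

    import networkx as nx

    def degree_centralization(G):
        """Freeman degree centralization for a simple undirected graph."""
        n = G.number_of_nodes()
        if n < 3:
            return 0.0
        degrees = dict(G.degree())
        d_max = max(degrees.values())
        # Sum of gaps between the most central node and every other node,
        # normalized by the maximum possible sum, attained by a star graph.
        return sum(d_max - d for d in degrees.values()) / ((n - 1) * (n - 2))

    # Tiny illustrative graph: a hub tied to several partners, plus one extra tie.
    G = nx.Graph([("gov_haiti", "caricom"), ("gov_haiti", "cdema"),
                  ("gov_haiti", "gov_jamaica"), ("caricom", "cdema")])
    print(f"degree centralization: {degree_centralization(G):.2%}")

Values near zero indicate that ties are spread evenly across the network; values near one indicate a star-like structure in which nearly everything runs through a single hub.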


The international public organizations connected largely through these nodes, but not noticeably with one another. More surprising is the ring of disconnected organizations that surrounds the central nodes; they appear to be operating largely without contact with Haitian organizations. Key organizations, such as the government of Jamaica, played bridging roles in linking international public organizations to the government of Haiti. The network reveals marked asymmetry in information processes, illustrating the dependence of the Haitian government on international organizations and the lack of community resilience. This situation was exacerbated by the near absence of disaster planning and preparedness prior to the earthquake.

Japan: The Tohoku-Oki Triple Disasters

The March 11, 2011, Tohoku-Oki earthquake, tsunami, and nuclear reactor breach represented a striking contrast to the disasters experienced by Indonesia and Haiti in terms of planning, but revealed remarkably similar patterns of information asymmetry and lack of coordination in response operations, particularly for the nuclear disaster. Seismic risk is well known in Japan, with its advanced economy and technical expertise in engineering and seismology. The nation has made major investments in planning for earthquake risk reduction and public education, implementing a nationwide seismological monitoring system that provides immediate notification to residents of earthquakes reported throughout the nation. Planning for tsunami hazards was less well developed, based on the assumption that tsunamis would not occur unless there was a major undersea earthquake. Consequently, tsunami risk reduction was largely subsumed under earthquake preparedness.

Unlike other nations, Japan granted substantial responsibility to private engineering companies for managing nuclear risk, asserting that the companies had more advanced engineering knowledge and extensive practice than government agencies. Yet the organizational infrastructure among public agencies remained largely hierarchical, with limited communication and exchange among the national, prefectural, and local levels of operation. This pattern restricted the exchange of information among public, private, and nonprofit organizations at each jurisdictional level that was needed to build community resilience to nuclear risk. The result, again, was a serious asymmetry in information exchange between the Tokyo Electric Power


Company, which owned and managed the plant, and the national government agencies responsible for public safety, as well as disconnected communications among the jurisdictional levels of government agencies responsible for risk reduction and response operations. The interacting disasters created a cascade of failures that escalated the consequences beyond the planners’ imagination and the communities’ capacity for collective action.

Table 11.3 shows the frequency distribution of the types of organizations that participated in response operations following the Fukushima, Japan, nuclear disaster on March 11, 2011. This distribution contrasts sharply with the comparable classifications for Indonesia and Haiti, with a much smaller percentage of international organizations, at 17.7 percent, and a larger percentage of national organizations, at 58.3 percent, of which nearly 25 percent were private organizations participating in response operations. The striking difference between the response system that emerged following the Japan nuclear disaster and those that followed the Indonesian tsunami and Haitian earthquake is the significantly larger percentage of private organizations (36.8 percent), more than one-third of the total number of organizations participating in response operations, in contrast to the relatively small percentage, 4.6 percent, of nonprofit organizations. This distribution of organizations participating in response operations reveals asymmetry of information among the organizations engaged in response operations, but a very different pattern than in either the Indonesian or Haitian response systems.

table 11.3 Frequency Distribution of Responding Organizations to the 2011 Fukushima, Japan, Nuclear Disaster by Jurisdiction and Source of Funding

                     Source of Funding
                  Public        Private      Nonprofit        Totals
Jurisdiction       N     %       N     %      N     %        N      %
International      45   12.1     21    5.6     0    0.0      66    17.7
Regional            0    0.0     12    3.2     1    0.3      13     3.5
National          115   30.9     92   24.7    10    2.7     217    58.3
Prefectural        32    8.6      5    1.3     3    0.8      40    10.8
Municipal          11    3.0      1    0.3     3    0.8      15     4.0
Local              15    4.0      6    1.6     0    0.0      21     5.6
Totals            218   58.6    137   36.8    17    4.6     372   100.0

Note: Local jurisdiction refers to those in Iwate, Miyagi, and Fukushima Prefectures. Totals are subject to rounding error.
Source: Yomiuri [Tokyo] newspaper, March 11–April 1, 2011.


Private organizations operating at the national level played a critical role, but there were no specified requirements for communication of information regarding risk to other jurisdictional levels of public organizations and no prior experience or expectation for communication with nonprofit organizations in anticipation of risk of exposure to nuclear contamination.

Figure 11.5 shows the network diagram of organizations participating in response operations to the Fukushima nuclear disaster. The network diagram shows the dominance of national organizations, with a densely connected set of public organizations interacting with one another (black, upward-pointing triangles), but relatively little interaction with prefectural organizations (black, downward-pointing triangles), and almost no interaction with municipal organizations (black circle in a box) or local organizations (black diamonds).

The sobering finding from this analysis is that, despite the extensive investment in planning for seismic risk reduction in Japan, the planning processes for tsunami risk, triggered by the earthquake, and nuclear risk, triggered by the tsunami, proved seriously inadequate for the cumulative sequence of hazardous events. More troubling, the process used did not enable the relevant organizations to craft more effective means for responding to these catastrophic events as they were occurring. The planning process, carefully designed to meet specific, clearly identified needs from previous seismic events, proved woefully inadequate to meet the multiple interacting and interdependent sequences of damaging events triggered by the massive undersea earthquake. In this fundamentally altered environment that shattered both technical and organizational systems, the affected communities had little collective capacity to respond.

Comparison of Extreme Events

Networks of response actions emerged at the local level of operations in all three nations—Indonesia, Haiti, and Japan. Yet, as the events escalated in complexity and impact in each nation, response at the prefectural and national levels stumbled and slowed in similar patterns. Lack of coordination among participating organizations was apparent in each response system at multiple scales of operation.2 The flow of information through direct means of communication was not sufficiently accurate, timely, or complete to enable the

figure 11.5 Nuclear Network by Source of Funding and Jurisdiction, March 11–April 1, 2011. Source: Yomiuri Newspaper, Tokyo, Japan, March 11–April 1, 2011. Analysis by Aya Okada. [Network diagram omitted; node shapes denote jurisdiction (International, National, Regional, Prefectural, Municipal, Local) and node shading denotes sector (Public, Nonprofit, Private).]


relevant organizations to mobilize action effectively in coherent response to the rapidly changing demands in any of the three systems. Interestingly, aspects of the three lenses of interpretation of risk in organizational performance, advanced by Carroll (Chapter 3), can be observed in each of these cases.

In the Indonesian case, strategic design was largely absent as documented by the lack of planning for tsunami disasters in an archipelago widely vulnerable to seismic and tsunami risk. Yet, strategic design appeared in the decision of the Indonesian government to establish a national disaster plan after experiencing the devastating impact of the 2004 tsunami. The cultural lens was evident in both the long-running civil conflict between the Free Aceh movement and the Indonesian government, prior to this traumatic event, and in the decision, after the tsunami, by both parties involved in the conflict to resolve their differences and collaborate in recovery as a shared goal. The political lens, nonetheless, was still apparent in the conflicting priorities placed on reconstruction by the different parties at the national, provincial, and community levels of operation in the recovery process.

In the Haitian case, strategic design was apparent before 2010 when geologists and geophysicists recommended the implementation of a seismic monitoring system for the small island nation, but it was limited by the effects of the other two lenses. The cultural effect of lack of awareness of seismic risk in the broader population diminished the importance of making the necessary investment in scientific instruments and training needed to monitor the actual risk. The political choice to use scarce resources for more immediate threats created a cumulative sequence of decisions that limited collective capacity to respond to the January 12, 2010, earthquake when it occurred.

In the 2011 triple disaster in Japan, the effects of all three lenses were most acute in reference to decision making regarding the nuclear breach at Fukushima, resulting from the cascading effect of the earthquake that generated three massive tsunami waves that, in turn, disabled the nuclear reactor, releasing radioactive pollution into the air, land, and sea. The lens of strategic design was very apparent in the decision by the Japanese government, long prior to March 11, 2011, to place major responsibility for building, monitoring, and managing the nuclear plant at Fukushima with the Japanese private engineering company Tokyo Electric Power Company. Presumably, the rationale for this decision was that the engineering company had expertise in operating nuclear plants that the government agencies did not have. This decision was reinforced


by a cultural lens of respect for authority and reluctance to question experts, either from the engineering company or from national governmental agencies, characteristic of Japanese society. The political lens was evident in the slow response of the Japanese government agencies to acknowledge the extent of the damage and to provide informed response and assistance to the residents of Fukushima and the surrounding region. While the influence of the three lenses—strategic design, cultural norms, and political contestation—can be observed in the cases of Indonesia, Haiti, and Japan, one can conclude that each separate lens masks critical actions that are essential for mobilizing collective action by the whole community to reduce risk and attain high reliability in performance on a community scale. The initial differences among the three interorganizational response systems in planning and investment notwithstanding, the parallel lesson from each of them is clear: Building capacity for resilient action to reduce disaster risk requires an adaptive, iterative learning model across all jurisdictional levels of operation in specific regions. Without local, national, and regional capacity to reduce risk and respond rapidly to severe hazards, the impact of disaster escalates to the global level of operations.

Redesign of a Global Commons to Reduce Disaster Risk

Rethinking measures of disaster risk reduction on a global scale requires a fresh conceptualization of disaster management. It means recognition of the initial conditions that precipitate disaster and building capacity to reduce risk through collective action on a community-wide scale. This is a broader, more complex, and more interactive set of tasks than has been incorporated into the disaster planning process in most countries. Yet, expanding information and electronic communication technologies have enabled new means of building capacity for collective action through the timely, multiway exchange of information among groups and organizations exposed to recurring risk. The near-ubiquitous access to cell phones provides a means of transmitting information to individuals, groups, and institutions in near real time. Social media platforms such as Facebook and Twitter, and their technical counterparts in other nations, provide a means of capturing and sharing images and video of hazardous events as they are occurring. Such


technologies are still evolving, and their effective role in managing and responding to risk requires review and redesign, but they provide a remarkable opportunity for building resilience in complex systems through collective action. The experience of the three disaster response systems briefly outlined in the preceding text demonstrates that the greatest resource for communities exposed to recurring risk is the human capacity to learn and to adapt actions to changing conditions on the basis of timely, accurate information received through trusted channels of communication. Broadly speaking, this capacity is generalizable across contexts. Designing environments that enhance collective learning about risk through sociotechnical means at the community level builds resilience that supports the next level of operations, which, in turn, enhances the third and fourth levels of operation. In this complex, interdependent context of shared risk, responsibility for informed action is shared among participants with different sets of skills, knowledge, resources, and vulnerabilities. Crafting the design and means of guiding collective action in such environments is quintessentially a learning process.

Building Resilience in CASoS

The practical tasks involved in the CASoS approach require identifying the key components of a complex adaptive system of systems as well as the mechanisms of change in each component system. The components are necessarily the geophysical, engineered, computational, socioeconomic, and cultural systems that make up living, functional communities. Identifying interdependencies among these components provides insight into what forces and conditions shape community response to hazards and enable collective action. A CASoS approach involves rapid, dynamic adaptation to risk by the whole community and constitutes a web of practice among groups and organizations that is supported by sociotechnical systems. Building that web of practice is a continuing challenge for communities exposed to recurring risk.

Acknowledgments: I acknowledge, with warm thanks and appreciation, Thomas W. Haase, Aya Okada, Michael Siciliano, and Steven Scheinert, who, as graduate student researchers at the Center for Disaster Management, assisted in coding the newspaper articles for the three earthquake response systems and in running the network analyses. They are now young researchers in their


own professional settings, and I am grateful for their contribution to the data analyses presented in this chapter.

Notes

1. M. Matera, World Bank, Washington, DC, May 23, 2013. Personal communication.
2. Briefing, Morioka Emergency Operations Center, Morioka, Japan, October 31, 2011.

References

Agranoff, R. (2007). Managing within networks: Adding value to public organizations. Washington, DC: Georgetown University Press.
Axelrod, R., & Cohen, M. D. (1999). Harnessing complexity: Organizational implications of a scientific frontier. New York: Free Press.
Calais, E., Freed, A., Mattioli, G., Amelung, F., Jónsson, S., Jansma, P., Hong, S-H., Dixon, T., Prépetit, C., & Momplaisir, R. (2010). Transpressional rupture of an unmapped fault during the 2010 Haiti earthquake. Nature Geoscience, 3, 794–799. Retrieved from http://www.nature.com/ngeo/journal/v3/n11/full/ngeo992.html
Carley, K. M. (2003). Dynamic network analysis. In R. Breiger & K. M. Carley (Eds.), Summary of the NRC workshop on social network modeling and analysis (pp. 133–145). Washington, DC: National Research Council.
Caribbean News Online. (2010, January 12–February 3). http://cananewsonline.com/main/
Carlson, J. M., & Doyle, J. (2000). HOT: Robustness and design in complex systems. Physical Review Letters, 84, 2529–2532.
Comfort, L. K. (1999). Shared risk: Complex systems in seismic response. Amsterdam: Pergamon Press.
Comfort, L. K. (2007a). Asymmetric information processes in extreme events: The 26 December 2004 Sumatran earthquake and tsunami. In D. Gibbons (Ed.), Communicable crises: Prevention, response and recovery in the global arena (pp. 135–165). Charlotte, NC: Information Age.
Comfort, L. K. (2007b). Crisis management in hindsight: Cognition, communication, coordination, and control. Public Administration Review, 67 [Special Issue: Administrative failure in the wake of Katrina], S188–S196.
Comfort, L. K., Colella, B., Voortman, M. J., Connelly, S. S., Wukich, R. C., Drury, J. L., & Klein, G. L. (2013). Real-time decision making in urgent events: Modeling options for action. Proceedings of the 10th International ISCRAM Conference. Baden-Baden, Germany.
Fligstein, N., & McAdam, D. (2012). A theory of fields. New York: Oxford University Press.
Glass, R. J., Ames, A. L., Brown, T. J., Maffitt, S. L., Beyeler, W. E., Finley, P. D., . . . & Zagonel, A. A. (2011, June). Complex adaptive systems of systems (CASoS) engineering: Mapping aspirations to problem solutions (SAND 2011-3354 C). Albuquerque, NM: Sandia National Laboratories. Retrieved from http://www.sandia.gov/CasosEngineering/_assets/documents/ICCS_Mapping_Aspirations_2011-3354.pdf
Hess, C., & Ostrom, E. (Eds.). (2007). Understanding knowledge as a commons: From theory to practice. Cambridge, MA: MIT Press.
Jakarta Post [Indonesia]. (2004, December 26–2005, January 16). http://www.thejakartapost.com/
Jakarta Post [Indonesia]. (2005, January 4). More donations pour in for tsunami victims in Republic of Indonesia. http://www.thejakartapost.com/


Kauffman, S. A. (1993). The origins of order: Self-organization and selection in evolution. New York: Oxford University Press.
Koliba, C., Meek, J., & Zia, A. (2010). Governance networks in public administration and public policy. Washington, DC: CRC Press.
Koppenjan, J., & Klijn, E-H. (2004). Managing uncertainties in networks: Public private controversies. New York: Routledge.
La Porte, T. R., & Consolini, P. (1991). Working in practice, but not in theory: Theoretical challenges of “high reliability organizations.” Journal of Public Administration Research and Theory, 1(1), 19–48.
Mandelbrot, B. B. (1977). Fractals: Form, chance, and dimension. San Francisco: W. H. Freeman.
Mandelbrot, B. B. (2004). Thinking in patterns: Fractals and related phenomena in nature. M. M. Novak (Ed.). River Edge, NJ: World Scientific.
Nohria, N., & Eccles, R. G. (Eds.). (1992). Networks and organizations: Structure, form, and action. Boston: Harvard Business School Press.
Ostrom, E. (2005). Understanding institutional diversity. Princeton, NJ: Princeton University Press.
Prigogine, I., & Stengers, I. (1984). Order out of chaos: Man’s new dialogue with nature. New York: Bantam Books.
Provan, K. G., & Milward, H. B. (2001, July/August). Do networks really work? A framework for evaluating public sector organizational networks. Public Administration Review, 61(4), 414–423.
Roberts, K. H. (Ed.). (1993). New challenges to understanding organizations. New York: Macmillan.
Rochlin, G. I. (Ed.). (1996). Special issue: New directions in reliable organizational research. Journal of Contingencies and Crisis Management, 4(2).
Rochlin, G. I., La Porte, T. R., & Roberts, K. H. (1987). The self-designing high-reliability organization: Aircraft carrier flight operations at sea. Naval War College Review, 40(4), 76–90.
Roe, E., & Schulman, P. (2008). High reliability management. Stanford, CA: Stanford University Press.
Simon, H. A. (1997). The sciences of the artificial (3rd ed.). Cambridge, MA: MIT Press.
Taleb, N. N. (2010). The black swan: The impact of the highly improbable (2nd ed.). New York: Random House.
Weick, K. E. (1995). Sensemaking in organizations. Thousand Oaks, CA: Sage.
Weick, K. E., & Roberts, K. H. (1993). Collective mind in organizations: Heedful interrelating on flight decks. Administrative Science Quarterly, 38(3), 357–381.
Weick, K. E., & Sutcliffe, K. M. (2001). Managing the unexpected: Assuring high performance in an age of complexity. San Francisco: Jossey-Bass.
Yomiuri [Tokyo] newspaper. (2011, March 11–April 1). The author worked through a translator to access this paper, which has the largest circulation in Japan.

chapter 12

Applying Reliability Principles: Lessons Learned

W. Earl Carnes

Introduction

This chapter differs from the others in this book because it is the product of practitioners, myself included, who operate, regulate, or oversee organizations that demand high reliability. The experts who contributed to this chapter would not presume that their organizations exemplify some abstract standard of being an HRO, and they don’t pretend to have all the answers. For these reasons I use the term “reliability seeking organizations,” or RSOs (Vogus & Welbourne, 2003; Roberts, Stout, & Halpern, 1994), to indicate the reality of continuing pursuit rather than the hubris of attainment. This chapter is a dialogue among practitioners, which I have facilitated by blending their voices to provide context and transitions. Respondents’ own words are used as much as possible and selected quotes that introduce or emphasize a theme are indented. I also synthesize comments from multiple respondents that amplify themes. These paraphrases are introduced with the nomenclature “dialogue.” While outlier comments merited careful consideration and discussion, a lesson expressed by any one person is not necessarily sufficient for adoption by others, thus themes in this chapter represent viewpoints shared among a majority of the contributors. It is worth noting, however, that in no case did any contributor contradict important lessons shared by others; differences expressed were essentially matters of emphasis and examples cited.


The Survey

I sent individual e-mails to practitioners in various industries with an invitation to participate by explaining the chapter concept and posing the fundamental question, “What are the most important lessons you have learned about applying high reliability principles and practices?” I included four amplifying questions to prompt broad reflection about factors influencing reliability such as technical excellence, leadership, management systems, inquisitiveness, constant learning, and risk-based regulatory environment. Most respondents chose to provide written replies ranging from one to five pages; a few preferred telephone conversations lasting from one to two hours. Similarly, I used follow-up e-mail exchanges and phone conversations for clarifications and agreement checking. In keeping with the sensitivity of positions held by contributors, their comments are not attributed, but I include the industry they are affiliated with.

Respondents usually opened their comments with general discussions about reliability seeking and recounted stories from their careers, then offered observations about lessons and suggestions they wanted to highlight. Some referenced source material to supplement their comments. The chapter is structured similarly. Recurrent topics are introduced by questions followed by narrative amplified by contributor comments. I present five high-level lessons learned, followed by an expansive discussion of organizational behaviors that most strongly characterize RSOs. I then provide suggestions about first steps for implementing reliability principles.

The Respondents

Thirty-one of the thirty-nine practitioners who were invited to participate accepted my invitation. My objective was to seek perspectives from successful senior practitioners who have dedicated their careers to managing or regulating high-hazard operations requiring high standards of reliability. Selection was a “convenience sample” stratified across industries based on my network of reliability professionals. Contributors serve as chief operating officers, senior managers or advisors in major business units for international corporations, and career professionals and political appointees with US government regulatory and oversight agencies; others are senior professionals who provide consulting services to RSOs. Their business sectors include electrical generation (nuclear),

276

applying reliability principles

electrical transmission utilities, petrochemical, transportation, nuclear defense, environmental remediation, and basic and applied research. With few exceptions, contributors have earned academic degrees in engineering or the physical sciences. Most have earned multiple degrees in multiple disciplines. Many hold professional certifications (e.g., professional engineer, certified safety professional) as well as internal certifications required by their respective organizations (e.g., radiation worker, chemical worker, project manager); eight have earned PhDs. Each has worked at the front lines of their profession, learning firsthand about the practice of hazardous work, and each, following a different path, has over time converged on similar principles of what it takes to seek reliability.

What Does the Reliability World Look Like?

According to Adam Gopnik (2015), "A 'world' . . . consists of real people who are trying to get things done, largely by getting other people to do things that will assist them in their project. . . . The resulting collective activity is something that perhaps no one wanted, but is the best everyone could get out of this situation and therefore what they all, in effect, agreed to." In RSOs people know why they come to work, they know what their job is and how to do it, they know how to work with others in the organization to get the job done, they constantly challenge themselves and others, and they are constantly learning. Their work is more than a job; their labor and voluntary daily encounters with hazards are undertaken to make life better for others. What they do and how they do it is a part of their personal identity. They are members of unique professional families with relationships that last a lifetime.

How Do RSOs Get to Be This Way?

They were designed this way, gradually over time. Their design approach was different, not by engineered specification but rather by principle-based adaptive strategies, iterative and responsive to emergent discoveries. In short, these organizations were designed and continue to evolve by learning. RSOs excel at designing work-delivery systems to help people accomplish quality work, day after day. They strive to nurture resilience so individuals can sense, respond, and adapt to that which is unimaginable in moments of seeming normalcy. There are no artificial distinctions about doing the mission safely, securely, with quality—
or other conditionals often considered negotiable in normal organizations. Lack of safety, security, or quality can mean that people may die, so it's just how they do work. They succeed or fail together; there is no other alternative.

Why Is It So Hard to See What People in RSOs Do?

When trying to learn about RSOs, most people focus on tools, like process mapping, reporting systems, or standardization or accountability systems. But attempts to adopt step-by-step practices without understanding concepts, models, and principles routinely meet with failure. How people in RSOs think, how they work together, and what they view as important is missing in this approach. Perhaps the best way to learn is to listen to their stories, to hear about what they do and how they think. That's the point of this chapter.

What's Important to Know If You Want to Be an RSO?

Leaders need to understand that RSOs don't get that way by chance. Certain foundational elements must be in place, predicated on collaborative teamwork and a sense of camaraderie where all are committed to a shared vision through a sense of personal identity and shared collective contribution. RSOs don't leave competence to chance; they train and qualify everyone to become proficient in their work. Continuous improvement is a core skill based on open dialogue and trust, centered on curiosity and the intellectual flexibility to identify and understand problems, assess risk, and develop effective remediation activities or innovations. Improvement is enabled by being self-critical while maintaining a sense of competency and professional pride, without blame or pejorative defaults about individual actions or perceived intent. Team members have confidence that they work within a just culture where they can be open about their actions, their doubts, and their failures as they seek to improve themselves and the team; this confidence is not a social nicety but an operational necessity.

Five Lessons Shared by RSO Leaders

Lesson One—It's Simple, But Not Easy

"Simple" and "easy" are not the same. Many of the concepts and principles associated with RSOs are relatively simple. That does not mean they are easy.

Leaders devote much of their lives to learning how to design organizations that shape desired group behavior repertoires. People tend to follow provided guidance if it serves their needs. If that guidance is not clear or fully established or if trust is lacking, people fill in the blanks with their own perceptions and judgments. (Respondent, electrical transmission)

In RSOs, transparency of expectations and alignment around interpretations are constantly reviewed and refreshed. Several respondents commented on the simple/easy paradox. Dialogue: Unfortunately, because the concepts and principles are relatively "simple," managers unindoctrinated in the culture of RSOs try to make too many changes at once, presuming they can just tell other people to make this "simple" change and somehow performance will magically improve. Consider the straightforward surgical practice of flagging. When surgery is to be performed on a body part where a mix-up is possible (e.g., ears, fingers, knees, wrists, elbows, shoulders), either the doctor or the patient will flag the body part with a marker to indicate the area to be operated on (or the area not to be operated on). Flagging is common practice now, but it certainly was not that way ten or twenty years ago. The point is, you can't get much simpler than using an inexpensive marker and flagging a part of your body. Even with such simple precautions, wrong-site surgery remains a significant patient safety concern. Surgical practices are long entrenched, and the adoption of new behaviors is often slow to permeate a system and become the new normal. Simple and easy are not the same. Advocate, educate, monitor, communicate; don't try to fix everything at once and don't assume problems are completely understood. Prioritize and go after the top three to five issues each year. A living improvement plan, with regular performance monitoring and management oversight, is a basic element of sustainment. People are smart and know how to get things done. Over time, they will find easier and faster ways to achieve the goal. (Respondent, electrical transmission)

Dialogue: It is human nature: over time these streamlined methods may deviate from the original intent, and unexpected consequences, including catastrophic failure, could ensue. Leaders may even encourage this behavior by only looking at and measuring outcome results or by rewarding people who seemingly accomplish heroic tasks. There may be no one right way to perform many tasks, but there are definitely wrong ways. Reliable practice is a function
of both routines and mental models; they are co-constitutive. Variation is normal and frequently acceptable and is often the source of innovation. RSOs are uniquely aware of this issue and mindfully approach practice variation with healthy caution and omnipresent awareness of risk. We have to establish standards for practice—organizational and individual practices—within our prescribed envelope of reliability and be alert to drift. What is expected needs to be inspected. (Respondent, nuclear industry)

Lesson Two—Engage Everyone in Meaningful Work

Two experts in organizational culture wrote: The single most consistent observation in our experience in the effort to apply high reliability principles is that the level of engagement across the entire organization is the key component for success. Many times we have seen organizations attempt to apply these principles only at a management level, often only senior management, when they must be understood and implemented throughout the entire organization. (Respondents, consulting advisors)

A nuclear practice leader added: From experience in 30 nuclear safety culture assessments, management engagement with the workforce, coupled with comprehensive plans and methods to communicate throughout the organization, offers the greatest opportunity to affect the culture of the organization. The challenge to the management team is learning how to balance leadership skills with management skills. The reward is the improved performance and the sustaining effect of the commitment. (Respondent, nuclear consulting)

It's easy for people of position and education to become convinced of their importance and forget fundamental human truths. Specialists in human and organizational factors view engagement as a tapestry of human relationships and interactions. All respondents spoke about relationships and engagement, offering several perspectives and emphasizing management responsibility. Dialogue: A leadership team that sets ambitious goals, understands its maturity level honestly, and engages its workforce to achieve these goals through structured continual learning and improvement is critical to success. Engagement of the workforce entails more than just directing and showing up from
time to time. Engagement is a two-way street, and leaders need to establish mechanisms to “listen” to employees’ concerns, as well as their suggestions. For employees to become engaged, first managers must engage with employees—and with each other. Dialogue: In RSOs the frontline operator is at the center. If your operators don’t believe that their time, intellect, and experience are incredibly valued, no token gifts or platitudes will do. If you’re working to support the operators and they believe in and trust you, you will have a never-ending stream of energy, support, advice, and a phenomenal sounding board. They can be your biggest supporters, and together you can create massive gains in reliability and risk reduction. You’ll also have a lot of fun along the way. Because—let’s face it—they are really awesome, if you only let them be. Dialogue: Ultimately, the goal of every manager is to facilitate the extreme success of his or her employees. For operators, this means they need to know their management will focus on giving them the most support that is reasonable. If there’s an issue that may have a political tinge, it’s easy for management to become fearful, and ultimately this leads to operators getting mixed messages or perceiving their jobs or support as tenuous. Then in high-stakes, high-stress scenarios, this additional uncertainty increases the probability of human error and changes patterns of responding. Therefore, operators must feel that they and their manager share an understanding of the support the operator can expect. Ideally, it’s a lot of support. At the end of the day, unless we decide to let high reliability functions be run by autonomous machines, we will see human beings with all their brilliant strengths and shortcomings come into play. (Respondent, electrical transmission)

Dialogue: People will get stuck in patterns of thinking, won't adapt well to change, and will tend to respond more out of fear (when fear is present) than toward any positive goal. At those moments, it's important to remember that we all see our behaviors in the context of our situations (not personalities), and if we can be helped to feel safe, we can spend our energies growing, rather than fighting a battle to defend ourselves.

Lesson Three—RSO Leaders Are Not Deciders-in-Chief

Is the job of the CEO, as often mythologized, that of Decider-in-Chief? Can leaders design their organizations by embracing experimentation, learning, and iterative change so that decisions are made by the people best qualified to make them? Comments (paraphrased) from a chief nuclear executive illustrate key points about leader roles in transforming toward reliability. Dialogue: The prior executive team had focused on technical and financial details. The new team determined that our real job was to create an environment and provide the resources that would enable our people to perform with excellence. With the support of our board, we rewrote all our position descriptions accordingly and redesigned the executive compensation system to align with our new position descriptions. Naturally, we retained final accountability for the technical and financial performance of the organization, but day-to-day management and monitoring of these areas were assigned to the second-tier executive team. Designing an organization that enabled our people to excel became our primary focus. The challenge for many executives was expressed by a chief operating officer for a large government contractor: The successful formula I have found in minimizing adverse operational events is creating a culture of reinforcement for desired individual and organizational practices. (Respondent, nuclear defense and environmental remediation)

Dialogue: Most companies “actively” reinforce things that specifically make a business viable, such as revenue, gross margins, days’ sales outstanding, accounts receivable, and so on. Unfortunately, things like industrial and process safety are “passively” reinforced—that is, other than flippantly noting they are “priorities” at meetings, very little is done to increase the margin of safety. This phenomenon is understandable given the lack of patience by company stakeholders and the willingness to change leadership if growth numbers don’t meet expectations. A similar comment comes from the electric transmission industry: Managers matter more than you think. Often, Western organizations have the habit of promoting people to management positions due to their amazing performance as individual contributors. Knowing how to write an amazing algorithm doesn’t make you any better at bringing together an amazing team, or supporting your employees. (Respondent, electrical transmission)

A majority of respondents underscored the need to acculturate managers to become reliability-competent. Dialogue: Continuously train your managers,
provide them with mentors and coaches, send them out in the world to learn how others think, expose them to all aspects of the organization’s operations. Praise your managers every time an employee has a success, and encourage employees to share praise for successes with their management. Ask employees for positive feedback on their managers, and highlight the ones who stand out. Give them gifts, plaques, whatever it takes—but if you don’t have good management, the best you can hope for as an organization is to “stumble into okay.” Many contributors spoke about culture as a daily focus: Don’t set and forget. The industries that need high reliability are always in flux, always growing and learning. There is no “set and forget.” Measure what matters and strive to continually grow. (Respondent, petrochemical industry)

As we can see, there is no perfect design, no perfect organization, and no single designer. Becoming an RSO is a journey of exploration and uncertainty, guided by principles and aspirations that take primacy over goals, strategies, and metrics.

Lesson Four—Reliability Cannot Be Regulated, But It Can Be Encouraged

Think that reliability can be regulated? Think again. Regulatory respondents and several industry respondents voiced concerns about unrealistic expectations of regulators and discussed the value of different regulatory approaches. Many seem to believe that responsibility is shared between the industry and the regulator, and when something bad happens, all responsible parties must be "punished." Regulators—given resource and staff limitations—can at most hope to influence the performance of the operator. (Respondent, regulator)

Dialogue: Regulators rarely have enough data and information to truly be “risk-based.” Risk-informed is likely the best we can ever hope for. Industry will always have much more information and be much more nimble than the government. Bureaucratic burdens created by Congress and the administration create hugely challenging barriers to any organization trying to adapt to changing circumstances. The challenges of government are not entirely dissimilar from those of industry. In many ways they can be even greater: Regulators never get to set safety standards that they believe are adequate due to the barriers of the regulatory process. (Respondent, regulator)

Dialogue: Examples of these barriers are cost/benefit analyses, environmental analyses, small-business impact analysis, information collection/reporting burden checks, advisory committees who pass judgment on all rules, and so on. Two different regulators echoed what many in the industry say: The standards set in regulations take years to develop and are watered down all along the process due to these influences. (Respondent, regulator) Following an accident, investigating bodies—pressed for time, money, and competent staff—all too quickly conclude that the event was caused by “inadequate oversight” and “bad safety culture.” (Respondent, regulator)

The terms “inadequate oversight” and “bad safety culture” themselves are not actionable and are all too easy to use as popular labels. The investigative root-cause determinations rarely go broadly enough to fully understand the many contributions to accidents (Rasmussen, 1997). Without full consideration of systemic factors, crafting constructive remedial actions that hold promise for sustaining safe performance becomes difficult and resources are wasted. Investigator conclusions of “inadequate oversight” do not go to improvement but instead serve to shift attention and blame from the operator to the regulator without fully appreciating the limitations of the regulator or regulations. (Respondent, regulator)

Trends toward collaborative governance are encouraging. Moving from compliance toward reliability can be incremental but most often requires transformational change. Even if precipitated by a survival-threatening crisis, transformation can only be internally generated and requires a social ecology that supports the development of reliability-centered behaviors (Carnes, 2011). I illustrate this movement with two examples (aviation and health care) in the text that follows, though a number of other examples come to mind (e.g., commercial nuclear power and pipeline safety). What does such a social ecology look like? Traceable to the 1930s, the US aviation regulatory history focused for decades on technical design and human factors. Aviation safety today is energized by a collaborative approach that began in the 1960s. Driven by learning from operational experience, the Aviation Safety Reporting System (ASRS) is a benchmark for confidential reporting and was a precursor to more extensive data-driven analysis. In the 1990s the Commercial Aviation Safety Team was established as a forum for industry and government to jointly analyze industry
operating experience data and agree on actions needed to address vulnerabilities. In 2007 the FAA established the Aviation Safety Information Analysis and Sharing (ASIAS) program to leverage internal FAA datasets, airline proprietary safety data, publicly available data, manufacturers' data, and other data. In early 2015 the FAA issued a final rule requiring most US commercial airlines to have a safety management system (SMS) in place by 2018. An SMS is intended as a set of business processes and management tools to examine data gathered from everyday operations, isolate trends that may be precursors to incidents or accidents, take steps to mitigate the risk, and verify the effectiveness of the program ("Safety Management Systems," 2015; US Department of Transportation, 2013). The aviation system illustrates fundamental characteristics of a supportive social ecology: collaborative efforts informed by rich data collection and analysis, a regulatory system that maintains the integrity of technically sound requirements, a performance-focused regulatory oversight system, and nonbinding advocacy for the pursuit of excellence through stakeholder engagement. In health care, collaborative governance began to emerge in the early 2000s. The Institute of Medicine's (IOM's) 2000 report catalyzed a national health-care improvement initiative for which high reliability science has become the reference framework (Committee on Quality of Health Care in America, 2000). The logic of this transformation was explained in a paper coauthored by Mark Chassin, president of The Joint Commission (TJC), the principal health-care certification and accreditation body in the United States: Regulation had only a modest and supportive role in the dramatic quality and safety improvements in other industries. . . . In health care, regulators should pay attention first and foremost to identifying and eliminating requirements that obstruct progress toward high reliability. In some instances, such requirements impose outdated and ineffective methods of quality management. In others, they impose unproductive work on regulated organizations that distracts them from dealing more effectively with their quality challenges. Regulators can support the transformation to high reliability, for example, by well-crafted programs of publicly reporting reliable and valid measures of quality. (Chassin & Loeb, 2013, p. 484)

TJC regulates but also advocates. Support for organizations seeking reliability is provided in the form of education, research, and promulgation of effective practices by organizations such as the Institute for Healthcare Improvement (IHI), the National Patient Safety Foundation (NPSF) Lucian Leape Institute, the Agency for Healthcare Research and Quality (AHRQ), and other similar industry organizations.

Collaborative governance is established in practice with demonstrated results. However, this approach is far from being universally accepted and remains a challenge requiring constant attention from regulators, industry, and policy makers. I personally believe in a collaborative, problem-focused approach between equal partners—both of whom have a strong set of ethics. However, it is clear that Congress and the media both see this as inherently evil and far too cozy. The media, whether because it “plays” or because they just don’t have the time to really understand the whole scope of a story, are particularly skillful in applying innuendo to discredit a regulatory body (not that some haven’t deserved it). (Respondent, regulator)

It is important that the regulator and the RSOs have broadly interacting rather than narrow relationships. One of the biggest killers to morale is the concept of “no good deed going unpunished.” An RSO has to have open communications with the regulator and present not only the required information but also properly prepare the regulator’s understanding of the processes being followed and provide the regulator the much needed feedback within the context of the bigger picture. (Respondent, regulator)

Dialogue: The regulator must honor such discourse with professional ethics, understanding the importance of voluntary disclosure as well as the difference between appropriate interactions and malfeasance. Enacted policy must reflect such dialogue as desirable conduct between the regulator and the regulated. Until the industry and regulator get beyond cat-and-mouse antics, the real quest for seeking reliability cannot begin. Nothing is more dangerous than an uninformed regulator. Industry must take responsibility for this as well as the regulator to make sure they are eliciting the information and not withholding for fear of reprisal. The back and forth has to be constantly worked as rapport and trust are built. (Respondent, regulator)

Often-hyped aspersions of regulatory capture or coziness with industry can create a chilled environment and overshadow standards of ethics and professionalism that serve to protect the public much more than bureaucratic “Thou Shall Not’s.” Achieving an atmosphere of openness in the exchange of risk-related information is always a work in progress and is usually a precarious balance of safety, security, and business-sensitive factors.

Lesson Five—Some Behaviors Matter More Than Others

A leader in the US pipeline industry brought up organizational culture: By far the biggest lesson I've learned is that all the procedures and processes in the world don't matter if you don't have the right organizational culture to follow them and be vigilant about hazards and risks. Too many companies have fooled themselves by thinking their binders on the shelf are controlling risks when what really matters is what the frontline employees and contractors do every day. (Respondent, petroleum industry)

Culture as a prerequisite to reliability design is a top issue with almost all respondents. Dialogue: Leaders who design and sustain RSOs care first about the culture of reliability, knowing that culture shapes how people think, what they value, how they relate and work with one another, and ultimately how they succeed or fail. Mindful of this, technical organizations tend more toward James Reason's (1997) perspectives on engineering a culture. Their attention is directed at exhortations, affirmations, organizational systems, processes, and individual behaviors, what Edgar Schein (2010) writes about as artifacts and espoused values. While "engineering a culture" concepts are not wholly sufficient, they are necessary and provide a platform for change. The substantive work of designing toward reliability should begin by addressing foundational organizational behaviors.

Which Organizational Behaviors Provide the Most Leverage?

Research sponsored by the US Nuclear Regulatory Commission identified seventeen organizational behaviors associated with high-performing organizations (Haber, Shurberg, Jacobs, & Hofmann, 1994). The principal investigator for that research observed: A healthy safety culture is most often found within an aligned organization that has effective processes and motivated people. (Respondent, applied research, consulting advisor)

An assessment model based on these behaviors was used to assess organizational culture in over sixty different organizations representing diverse
industries and five different countries. The assessments consistently identified four organizational behaviors as discriminators for a positive and healthy organizational culture: (1) leadership, (2) communication, (3) organizational learning, and (4) problem identification and resolution. We examine each behavior in turn.

Behavior One—Leadership

I have grown to believe that the key element in any successful organization is an engaged leadership that actually cares about and listens to the people it leads. (Respondent, consulting advisor)

Culture is set from the top and echoed below. Culture enables the management systems to produce safely, effectively, and efficiently. A committed, problem-solving workforce that is engaged is a sure sign of good leadership. Some of the executive respondents made note of high-level barriers to effective leadership. Dialogue: One alarming corporate peculiarity is the rise of finance and legal dominance of boards and "C" suites (CEO, CFO, etc.). To successfully manage risk, one must be a little afraid of it. If one has never experienced a significant failure in operations—especially those with injuries, fatalities, or massive property loss—judgment suffers. If resource allocation decisions are not tempered by risk judgment and are dominated solely by best rate of return, then a company's likelihood of failure rises quickly. There are enabling behaviors for high reliability, such as thinking processes of reflection, self-assessment, articulating risks, and operationalizing constructs like a questioning attitude, ownership, accountability, and responsibility. Leaders have to create space to nurture such behaviors and take opportunities to emphasize them. Probably the largest obstacle encountered in analysis of many organizations is resistance to change and resistance to learning. Management tends to take even constructive feedback very personally and has a difficult time accepting the results of assessment. (Respondent, consulting advisor)

About half of the respondents spoke about management resistance. Dialogue: Few have the objectivity, courage, or integrity to embrace those results and use them to formulate a path forward to improvement. Often, resistance and negativity are encountered even when assessment conclusions are
supported by substantial amounts of data. Such resistance likely means the data are touching truths of serious organizational dysfunction, ethical breaches, or even potential illegal actions. When the results are “accepted,” there is often an attempt to place blame on individuals or groups within the organization. Blame allocation is commonly seen as a tactic to justify the data and to distance (or insulate) organizational senior leadership or specific managers from the results. In other words, even with considerable data and methodological rigor, bad news is someone else’s fault. An example was mentioned of a safety culture assessment chartered by management and designed and led by a PhD-trained engineer manager. This is the principal conclusion and recommendation: The number-one opportunity for improvement in the [name brand] self-assessment is that managers have a more positive perception of safety culture than nonmanagers. This disparity could be corrected by improving the perceptions of nonmanagers. (Respondent, nuclear defense)

This statement represents a not infrequent example of senior managers who are completely out of touch with employees, and it bespeaks an attitude of a "divine rightness of hierarchy" in which the boss is always right. Where such attitudes exist, reliability is a mere publicity slogan, indicative of a widespread organizational disease that has been described as the "triumph of emptiness" (Alvesson, 2013). RSOs strive to achieve a level of emotional maturity among managers and staff so people can face difficult facts and enact the change necessary to chart a healthier future.

Behavior Two—Communication

Want to get real personal about what communication is? Just get sick. A review of reports by TJC found that communication failure (rather than a provider's lack of technical skill) was at the root of over 70 percent of serious adverse health outcomes in hospitals (Rosenstein & O'Daniel, 2008). High reliability is about asking good questions, particularly about surprise. Technical excellence may be worthless if it can't be clearly and more simply articulated to policy makers or leaders that don't have the technical background. (Respondent, nuclear industry)

Several respondents spoke about sensitivity to clear communication. Dialogue: Expectations must be clearly stated, communicated, and constantly reinforced. Many managers/leaders believe that they can simply stand up and tell everyone that they want employee behavior to change and it will happen. They may follow that up with an e-mail, or, if they are really working hard, they will send everyone to a four-hour training class. Job done—start to change your behavior. The fallacy is that "managers tell and employees do." An organization is about conversation, and the first steps toward change are about managers changing how they communicate and behave. It's what Schein (2013) refers to as "humble inquiry." If things seem not to be going well, begin with some questions: Here's what I think we may be seeing; what do you think? What may be going on? What could we do differently? When changes are needed in work execution, behavior change can only occur through collaboration and sustained communication. (Respondent, nuclear industry)

A few respondents offered examples of this kind of collaboration, even at the CEO level. Dialogue: Desired work behaviors must be clearly stated in a manner that everyone can understand. Multiple communication methods must be used; for example, talk about the why and what of changes in training, meetings, tailgate sessions, and so on. Write it down so that everyone has access for reference. Build it into your processes. Put it in a procedure if necessary. Actively seek feedback on how the new approaches are working—go out and see. Constantly observe work practices and coach on the new approaches. Don't know how to do this? Then ask for help. Outside specialists can help, perhaps a coach who specializes in reliability. But more importantly, ask people in your own organization. What do you think would happen if you asked a respected mechanic to walk you around a plant, to take you along on a work job? A CEO for a sizable holding company did just that at a large plant that had received a lot of negative regulator attention. He started visiting the plant, going with teams working in hazardous environments, dressing in protective clothing just like they did. He did not go there to tell; he went there to just listen and learn. After several visits, word got around. People began to say hello; gradually they
would stop him, offer comments, and ask questions. In turn he would ask questions of them, making notes of what he heard, then asking management about what he was hearing. After a while, craft personnel presented him with a custom-designed hardhat inscribed with his name—so more people would know when he was in the facility. There were many improvement interventions over several years, not just visits by the CEO. As of the writing of this chapter, that plant is respected as a leader in the US commercial nuclear industry. Who knows what you can learn by asking? Unfortunately, simply disseminating information is where it ends for most organizations, and they wonder why people have not changed the way they do things. (Respondent, nuclear industry)

Dialogue: Managers/leaders must be “consistent” and “persistent”—consistent in the way one establishes and communicates the expected change and persistent in continuous reminders that it is an expected change and must be done. Dialogue: There is a need to constantly test how messages are heard and understood at all levels of the organization. It is very easy to get misaligned between the intent of messages from senior management and actual practice. This is particularly dangerous if the front line stops raising risks and concerns—you may see unintended consequences because the intent at the top does not line up with the impact in the field. RSO principles and practices have been effective ways to address this issue. Alignment of intent, understanding, and mutual agreement requires constant checks, calibrations, affirmations, and revisiting; failure to do so is one of the most insidious sources of normalized deviation. Communication takes many forms; often the most influential are the simple day-to-day interactions around just doing work. Work procedures are a particular case in point. Work procedures have caused operational events due to being overly confusing and complex. Procedure growth typically occurs over time as organizations, with good intentions, work to include “catch-all” controls that more often than not create confusing, overly complex instructions that are nearly impossible to implement verbatim. (Respondent, nuclear industry)

Some respondents saw attention to work controls as a primary influencer of trust and communication. Dialogue: Do people feel comfortable talking
about work processes that don't work or are too cumbersome, and if they do speak up, does the organization listen and then act? Do people have any influence over their work processes and tools? In RSOs people have ownership of their work and the tools of work are their tools. Ownership or lack thereof sends strong messages.

Behavior Three—Organizational Learning

Learning in the RSO sense encompasses individual skills and knowledge; collective skills and knowledge; and institutional and social mechanisms for capturing, analyzing, and adapting using new knowledge. One cannot overemphasize the importance of a highly trained workforce. (Respondent, nuclear industry)

Academic degrees or technical and professional certifications may be entry requirements, but learning to do the job is not left to others. RSOs do it themselves to make sure it’s done right. The first step is to focus on the physics [i.e., the basic science and engineering; this respondent uses the term “physics” in the sense of knowing the fundamentals of the science involved]. (Respondent, nuclear defense)

A majority of the respondents emphasized the coequal role of technical and nontechnical skills. Dialogue: There is no substitute for mastery of the operational science and technology. Training goes far beyond technical task performance. Culture, organizational, and human factors provide insights about the nontechnical skills needed to do work with professional excellence. Leadership, teamwork, communications, and relationships are essential to people in reliability-demanding organizations and are too important to be left to chance. Learning is social as well as structured. Nontechnical traits may be facilitated by formal training, but coaching, tailored job experiences, and organizational social influence are also necessary. Peer encouragement and recognition help people change the way they think. Learners may lack the requisite degree of humility at first. It may take some time to recognize they possess the very traits and characteristics of a reliability professional. Here’s a case in point. A manager of nuclear fuel operations in a large nuclear services corporation was at first skeptical of the high reliability idea, but to make a long story short:

Within 18 months of shifting his mindset from blame to designing for performance, he was recognized by the executive body of the multi-business-line international corporation with a corporate Leadership Award. (Respondent, nuclear industry)

The essence of the story goes like this: The manager discounted the “new way of thinking” at first. Past performance included “notices of violations” from the regulator, several significant events adverse to safety, and negative observations by customers. Human error was routinely attributed to worker violation of procedure. Performance turned around by questioning whether the old way was the best way and by fixing the fundamentals instead of the symptoms. The injury rate and regulatory oversight declined while product quality and financial performance improved. Now workers raise procedure variances to their supervisor’s attention and work together with engineers on developing more efficient and safer steps. Can executives learn? Yes, if they design executive intelligence and reinforcement systems. It is vitally important that executive leadership routinely show the value he or she has for “how” we are accomplishing the mission. From a process standpoint, we do this by making sure that we have a strong self-assessment process, and we also invest in external independent review teams. (Respondent, nuclear defense and environmental remediation)

Executive respondents made particular note of the role of assessments. Dialogue: External assessments are investments in the future, not inexpensive, but essential to protect against complacency. Less formally, we must work hard at providing positive reinforcement for actions in the field that demonstrate technical inquisitiveness and a willingness to stop and understand before moving forward. Often significant industrial events are avoided because someone took a difficult stand to stop work until they were comfortable with the evolution. There are boundaries to what is acceptable; the ability to recognize and stay within the envelope of reliability is a hallmark of reliability professionalism. Leaders must embrace and reinforce those instances to create the kind of environment that continually asks questions and improves. (Respondent, nuclear defense)

This is a simple example of how leadership can show and reinforce that it’s important “how” things get accomplished:

Say a work crew accomplishes a difficult compressor overhaul in just a couple of shifts, where it usually takes several days. In addition to reinforcing that they completed the task quickly, which allows the process to be restarted quickly, you might ask how they were able to get it done so efficiently without anyone getting hurt. You will either get a blank stare, which suggests they were lucky, or you'll get a thoughtful answer of how they talk through the hazards during a pre-job briefing, and so on. In either case you have reinforced what is important to you and what is expected in your work culture. (Respondent, nuclear defense)

Have you practiced failing recently? In RSOs people are trained to fail. They are drilled and drilled in seemingly no-win scenarios, not to tear them down but to develop resilience, to promote innovation in the face of adversity; so that when the unexpected happens they do not freeze, they do not panic—instead they are programmed to remain calm, to think, not react, to depend on each other, to persevere. RSOs understand that people are capable of truly amazing things when they are properly prepared, so prepare them for failure as well as success.

Have you practiced failing recently? In RSOs people are trained to fail. They are drilled and drilled in seemingly no-win scenarios, not to tear them down but to develop resilience, to promote innovation in the face of adversity; so that when the unexpected happens they do not freeze, they do not panic—instead they are programmed to remain calm, to think not react, to depend on each other, to persevere. RSOs understand that people are capable of truly amazing things when they are properly prepared, so prepare them for failure as well as success. Behavior Four—Problem Identification and Resolution Do you recall a time when something just didn’t feel right or things seemed a bit strange? Anything that seems a bit off from the norm falls within what the nuclear industry calls problem identification and resolution (PI&R). It could be a formal concern report or as simple as asking for a time-out to discuss a team member’s feeling of uncertainty. PI&R refers to the extent to which the organization draws on knowledge, experience, and current information to identify potential problems. Dialogue: Problems need not be violations, error, or emergent crises—though they might be. Problems may be issues that left unresolved can result in accidents or declining performance over time; they may also be opportunities for setting new standards that enable growth and organizational transformation. Either way, anomalies within the normal scheme of things are identified and acted on by RSOs. Just identifying problems is not the point, and definitely not the solution. Each issue needs to be understood in its individual and collective context, be analyzed for extent of condition, and be considered for organizational or cultural implications. Finding and fixing issues has to be accorded the same level of attention as other corporate functions and must be incorporated into the organization’s normal funding and accounting practices.

If managers and business units are monitored on the basis of financial performance, then finding and fixing problems will be viewed as a cost, not an investment. (Respondent, natural gas industry)

Creating separate funding accounts for PI&R is an essential element of beginning a culture change toward an RSO. Bottom line: don't pretend that assessment and improvement will be paid for as an overhead out of operations funds; it just won't happen. If you want it done, fund it and pay for it, and make the results visible on the positive side of the ledger. Reducing uncertainty may be more important than reducing risk. (Respondent, regulator)

A majority of industry respondents and all of the regulatory respondents commented about risk management. Dialogue: Risk reduction is based on a mathematical framework that uses known or well-characterized variability in the systems and barriers being analyzed (simply put, risk = consequence × probability). This model works reasonably well for simple systems. However, as systems increase in complexity and range of possible consequences, the uncertainty (what is not known or well-characterized) in the variability of the systems, barriers, and consequences causes the simple risk model to break down. Consequently, an RSO pays continual attention to reducing that uncertainty (converting it into risk that can be managed) through continual support and analyses of all aspects of operational performance, relevant science and engineering research, and lessons learned from all available sources. The one thing I'd change to advance the goals of high reliability is to convince middle management that high reliability/high safety is the best business choice. Too often we fight the perceived tension between safety and budget, a shortsighted goal that does not take into account the full perspective. In industries like mine, one big incident could end the company. That's not to say budgets are irrelevant—ignoring that could end the company as well. However, we often take too short-term a focus when making decisions. (Respondent, nuclear industry)
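The simple risk model the respondents refer to, and the way uncertainty undermines it, can be made concrete with a small numerical sketch. The example below is not from the chapter: the function names, the interval treatment of poorly characterized inputs, and all of the figures are hypothetical, and it shows only one elementary way to see why narrowing uncertainty can matter more than shaving a nominal probability.

```python
# Hypothetical illustration of "risk = consequence x probability" and of how
# growing uncertainty in the inputs overwhelms the point estimate.

def point_risk(probability: float, consequence: float) -> float:
    """Classic point estimate: risk = consequence x probability."""
    return probability * consequence

def risk_bounds(p_low: float, p_high: float, c_low: float, c_high: float) -> tuple:
    """Bound the risk when probability and consequence are known only as ranges.
    The spread of the result reflects uncertainty, not the nominal risk."""
    return (p_low * c_low, p_high * c_high)

if __name__ == "__main__":
    # Hypothetical event: best-guess probability 1e-4 per year, consequence $50M.
    print("Point estimate:", point_risk(1e-4, 50e6))                   # 5000.0 per year

    # Well-characterized (simple) system: narrow input ranges, tight bounds.
    print("Simple system: ", risk_bounds(0.8e-4, 1.2e-4, 40e6, 60e6))  # (3200.0, 7200.0)

    # Poorly characterized (complex) system: same point estimate, but the
    # bounds now span several orders of magnitude; reducing this spread
    # through analysis and operating experience changes the picture more
    # than a modest reduction in the nominal probability would.
    print("Complex system:", risk_bounds(1e-5, 1e-3, 10e6, 500e6))     # (100.0, 500000.0)
```

Run as a script, both systems share the same point estimate, but only the first stays close to it; the second is dominated by what is not known, which is the respondents' argument for treating uncertainty reduction as a continual activity rather than a one-time calculation.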

What's the Plan; How Do We Implement It?

Successful managers are action-oriented; they love to solve problems and get things done. Their main concern is: How do we implement this? In their
zeal to get things done, they overlook the principle that in sustainable change, thoughts precede action.

RSOs Understand the Practical Significance of Mental Models

A petroleum pipeline manager wrote: I have seen some fundamental challenges in shifting how people think, particularly in a highly technical organization with a lot of engineers. (Respondent, natural gas industry)

A number of respondents offered generalizations that apply to a certain mindset common in industries like energy and transportation. Dialogue: Engineers often think in terms of projects, with a clear start and end. When we talk about a management system or reliability-seeking programs, there is no project and no end date—it's a continuous state and shifts frequently. A lot of the engineers kept asking me, "When will we be done with these reliability processes?" and were very frustrated by the answer of "Never, as long as we are operating this asset." Dialogue: For design engineers, when designing and building infrastructure, it is essential that the outcome be error-free. However, with management systems such as robust Management of Change, we frequently ran into managers who did not want to begin the program until we had thought of every possible nuance and variable before we defined a plan. While that is the right approach to constructing pipelines, if that approach were used in implementing reliability principles, we would never move forward. Sometimes you need to get it "good enough" and move forward. This is not to say that your goal isn't zero incidents and 100 percent reliability—it's more the recognition that behavior and culture change sometimes takes stages of evolution and cycles of improvement.

What Implementation Might Look Like

Respondents from a US nuclear defense operation describe how they began implementing RSO practices. Dialogue: Insights learned from organizational behavior were used to design an inquiry process and to equip teams to investigate organizational causes of events. A framework was developed to describe how organizational practices should fit together, an explicit mental model to tie theory to practice, including: (1) a common/consistent language;
(2) a methodology to apply the practices; (3) tools to implement the practices, test ideas, and improve behaviors; and (4) measures to monitor systems (systems thinking) and to monitor improvements in behavior (organizational culture assessments). When the challenges with safety culture were raised, we recognized that it was not "the problem" but rather a symptom of the problem because it meant that our employees weren't buying into our systems as we had designed them. (Respondent, nuclear defense)

Dialogue: The historical institutional problem definition was shifted from one of fixing deficiencies to one of improving the work environment. A research-based transformation process was adopted to analyze pushback from those resistant to pursuing a reliability-based improvement strategy. We began to understand that it’s not just about correcting errors or changing processes; that in fact every action dealing with people is a transformation effort. (Respondent, nuclear defense)

Dialogue: Elements of organizational transformation can be derived from a variety of sources—reliability theorists, change experts, culture experts—but one thing is clear: the plan must be visible, and it must be managed as a project to provide the best chance of being successful. Organizations need to take a systemic view of the performance that they want to improve. They must be sensitive to the complexity of what influences performance: the culture of the organization as well as outcomes desired. Interdependencies and interrelationships of internal and external factors must be considered to truly change and sustain improved performance. Above all, remember two things: the people who do the work know most about how work is really done, and models of organizing and standards of practice derive from the work, not from requirements, systems, processes, or ideology.

Keep Asking Three Questions

Begin by asking three questions—ask them over and over again: What does "good" look like? How are you doing? How do you know? (Respondent, nuclear industry)

First, priorities should be performance-based. Use independent self-assessment and performance standards as a benchmark for improvement. While getting these in place,
focus on small wins in operations. A beginning operational approach has been described in the two-volume Human Performance Improvement Handbook published by the US Department of Energy (2009). Sometimes known as Reality Checking, this learning assessment allows organizations to begin a journey toward reliability by providing work performers with resources to identify and compensate for organizational vulnerabilities, make work practices visible, and initiate collaborative organizational learning. RSOs are built by persistent progress, not by Hail Mary passes; "Small wins make for long races," as one respondent put it.

Performance Standards to Support Critical Self-Assessment

You cannot talk about how and why your performance is drifting if you don’t have documented standards or expectations that everyone understands. (Respondent, nuclear industry)

Examples of reliability-supportive standards are numerous: performance objectives and criteria (INPO), management system standards (IAEA), health-care quality standards (AHRQ), and so on. These are the products of collaboration by industry and regulators; as such they are important sources of benchmarks. At the local level—the organization—the whole point of asking "What does 'good' look like?" is to question how work is imagined to happen in comparison to how it is actually done. In other words, what are your standards in practice? Beware of assuming that standards of practice are explicit—typically they are not. To the extent that explicit standards do exist, they tend to be technical in nature with some degree of process specification. They are usually audit-driven checklists with little recognition of culture, organizational, and human factors that influence the performance of operational work. The question of "What does 'good' look like?" is intended to bring into stark reality how much is assumed versus how much is really understood about the performance of work, the delta between how work is actually performed and what is necessary to perform work at consistently high levels across a multiplicity of dynamic conditions. (Respondent, nuclear industry)

Separate the executive intelligence system from operational control. This is how to drive strategic learning. Design an independent function reporting
to the executive level to carry out assessment, analysis, and championing of improvement interventions. The ultimate goal is for evaluation and improvement to be core strengths of each person and each organizational unit, yet the truth is that managers are driven by attainment of production objectives. They cannot possibly be expected to produce and assess with equal priority; it’s just human nature. Create a separate executive intelligence system to assess changes needed to design for excellence, recruit respected high-caliber people to staff the organization, train and educate them in organizational science and practice, make membership a badge of honor, and use a rotating assignment basis to spread the knowledge, skills, and practices throughout the management and technical ranks. Assessment is intended to drive the organization to examine how it is performing now and how it is equipped to perform sustainably by adapting to foreseeable and unforeseeable changes in internal and external environments. Assessment should be inquiry-based, not compliance-based or normative of a fixed set of expectations. In the beginning simply answer the question “How do we do work?” and compare various sets of expectations about how work should be done from different communities within the organization. The objective is to make work visible over time and to establish shared mental models of how work should be done.

Concluding Observations

Examples of success and failure using a variety of improvement tools and approaches for organizational change abound. An organization's ability to achieve a highly reliable state for an extended period is determined more by the organization's ability to assess, understand, and evolve with the changing environment than by the specific tools employed. What's the difference between RSOs and normal organizations? Both exhibit the same range of personalities and idiosyncrasies of the general population. But RSOs are different in that they are trained, mentored, and socialized to do critical hazardous work. They are never satisfied; they are always seeking, questioning, and learning. Competence is expected, excellence is pursued, and they are wary of hubris. RSOs find community acceptance, identity, and satisfaction in knowing that perfection is a dangerous myth and success is simply doing well enough today, and thinking about how to do better tomorrow.

The problems of safety-critical organizations are fundamentally problems of people, not of science or technology per se. What this really means, what it implies for the future, is woefully misunderstood by most policy makers and so-called industry leaders. The problem does not lie with errors of individuals but in collectives of people: how we think about the roles of organizations in society, the influence of policy and regulation, the jobs of management, the education and qualification of people, and how to mobilize collaboration in pursuit of meaningful work. Acknowledgments: I gratefully acknowledge David Bowman, Fred Carlson, Ronald A. Crone, Steve Erhart, Mark Griffin, Sonja B. Haber, Monica Hagge, Chris Hart, Rick Hartley, Bill Hoyle, Paul Kruger, Mike Leggat, Jake Mazulewicz, John McDonald, Cheryl McKenzie, James Merlo, Brianne Metzger-Dorian, Doug Minnema, Mike Moon, Billy Morrison, Tom Neary, Ed Peterson, Bill Rigot, Jim Schildknecht, Mike Schoener, Deborah A. Shurberg, Ralph Soule, Pat Sweeney, Cindy Wagner, and Jeff Weise. Special thanks for insightful reviews of drafts to Rick Hartley, Jake Mazulewicz, Karlene Roberts, and my D.D. R efer ences Alvesson, M. (2013). The triumph of emptiness: Consumption, higher education, and work organization. Oxford, UK: Oxford University Press. Carnes, W. E. (2011). Highly reliable governance of complex socio-technical systems (Working paper, Deepwater Horizon Study Group). Retrieved from: http://ccrm.berkeley.edu/pdfs_papers/ DHSGWorkingPapersFeb16-2011/HighlyReliableGovernance-of-ComplexSocio-Technical Systems-WEC_DHSG-Jan2011.pdf Chassin, M. R., & Loeb, J. M. (2013). High-reliability health care: Getting there from here. Milbank Quarterly, 91(3), 459–490. Committee on Quality of Health Care in America. (2000). To err is human: Building a safety health system. Washington, DC: National Academies Press. Gopnik, A. (2015, January 12). The outside game: How the sociologist Howard Becker studies the conventions of the unconventional. The New Yorker. Retrieved from http://www.newyorker .com/magazine/2015/01/12/outside-game Haber, S. B., Shurberg, D. A., Jacobs, R., & Hofmann, D. A. (1994). Safety culture management: The importance of organizational factors (R.N. 940653). NUREG-60966-Engineering Technology Division, Brookhaven National Laboratory. Upton, New York. Rasmussen, J. (1997). Risk management in a dynamic society: A modelling problem. Safety Science, 27(2/3), 183–213. Reason, J. (1997). Managing the risks of organizational accidents. Aldershot, UK: Ashgate. Roberts, K. H., Stout, S. K., & Halpern, J. J. (1994). Decision dynamics in two high reliability military organizations. Management Science, 40(5), 614–624.

Rosenstein, A. H., & O'Daniel, M. (2008). A survey of the impact of disruptive behaviors and communication defects on patient safety. The Joint Commission Journal on Quality and Patient Safety, 34(8). Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.475.4722&rep=rep1&type=pdf
Safety management systems for domestic, flag, and supplemental operations certificate holders, 80 Fed. Reg. 1308 (January 8, 2015) (to be codified at 14 C.F.R. pts. 119, 5). Retrieved from https://www.federalregister.gov/documents/2015/01/08/2015-00143/safety-management-systems-for-domestic-flag-and-supplemental-operations-certificate-holders
Schein, E. (2010). Organizational culture and leadership (4th ed.). San Francisco: Jossey-Bass.
Schein, E. (2013). Humble inquiry: The gentle art of asking instead of telling. San Francisco: Berrett-Koehler.
US Department of Energy. (2009, June). Human performance improvement handbook (Vols. 1–2, DOE-HDBK-1028-2009). Washington, DC: Author.
US Department of Transportation. (2013, December 18). FAA's safety data analysis and sharing system shows progress, but more advanced capabilities and inspector access remain limited (Office of Inspector General Audit Report AV-2014-017). Washington, DC: Author. Retrieved from https://www.oig.dot.gov/sites/default/files/FAA%20ASIAS%20System%20Report%5E12-18-13.pdf
Vogus, T. J., & Welbourne, T. M. (2003). Structuring for high reliability: HR practices and mindful processes in reliability-seeking organizations. Journal of Organizational Behavior, 24(7), 877–903.

Epilogue
Karlene H. Roberts & Rangaraj Ramanujam

This book's contributors offer a diverse set of perspectives for understanding HROs/RSOs, designing research, and implementing reliability-enhancing processes. Their discussions identify several important themes and questions for future research.

The Definition of Reliability

As Karlene Roberts (Chapter 1) observes, reliability connotes different things to different authors. Rangaraj Ramanujam (Chapter 2) notes reliability has multiple meanings (e.g., performance consistency, safety, resilience, and service continuity) that correspond to multiple standards for assessment (e.g., variance in performance outcomes, avoidance of precluded events, rates of errors and near misses, recovery from shocks, effective learning from failures, and avoidance of disruptions in service delivery) and multiple organizational capabilities (e.g., anticipation, detection, containment, response, recovery, resilience, and learning). Such differences in how reliability is framed are reflected in the various chapters and in the studies they draw from. For instance, Peter Madsen and Vinit Desai (Chapter 6) adopt a view of “high reliability” as the avoidance of major adverse outcomes, whereas Seth Kaplan and Mary Waller (Chapter 5), Madsen (Chapter 7), and Louise Comfort (Chapter 11) discuss reliability in terms of organizational response to adverse outcomes or shocks— that is, resilience and learning from failures. Kathleen Sutcliffe (Chapter 4) and
Paul Schulman and Emery Roe (Chapter 9) explicitly acknowledge multiple notions and multiple standards of reliability. Given the complexity of reliability as a construct, a consensus definition will remain elusive. However, the multiplicity of meanings, standards, and capabilities raises some basic questions: To what extent are the organizational antecedents of reliability as anticipation (i.e., avoidance of adverse outcomes) different from the organizational antecedents of reliability as resilience? Does organizing for anticipation pose a conflict with organizing for resilience? Intriguingly, Sutcliffe's (Chapter 4) review of empirical studies of mindful organizing suggests that collective processes that aid anticipation (e.g., preoccupation with failure) are often observed in tandem with collective processes that aid recovery (e.g., commitment to resilience). Further studies will help to delineate the structures and processes that contribute to the several distinct notions of reliability. The expansive conceptualization of reliability also creates a need for integrating findings from multiple overlapping fields of inquiry. For example, safety management is increasingly framed in terms of processes linked not only to safety but also to resilience and learning (Grote, 2012). Error management, an active approach to errors that reduces negative error consequences and enhances positive effects, emphasizes the timely detection and containment of errors (Frese & Keith, 2014). Resilience engineering focuses on a system's ability to sustain required operations under both expected and unexpected conditions (Hollnagel, 2014). This overlap and convergence in the scope of inquiry present unexplored opportunities for organizational research to draw from as well as contribute to these merging research streams.

Level of Analysis

The authors discuss reliability at different levels of analysis. For instance, Sutcliffe (Chapter 4) and Kaplan and Waller (Chapter 5) focus primarily on reliability in teams or organizational subunits. Others such as John Carroll (Chapter 3), Madsen and Desai (Chapter 6), Madsen (Chapter 7), and Jody Jahn, Karen Myers, and Linda Putnam (Chapter 8) focus, implicitly or explicitly, on reliability at the organizational level of analysis. Schulman and Roe (Chapter 9) and Comfort (Chapter 11) discuss reliability at an interorganizational level of analysis. Notably, most studies that are discussed in these
chapters focus exclusively on a single level of analysis. However, we currently have very limited understanding about how reliability at one level (e.g., subunit) affects and is affected by reliability at other levels (e.g., organization, interorganizational system).

Theoretical Lens

Carroll’s three lenses approach (strategic design, political, and cultural), presented in Chapter 3, provides one systematic way or even a checklist to identify the perspectives that might be missing in any given account of reliability and to develop a more comprehensive analysis by comparing ideas from each perspective against one another and combining them where possible. Applied to organizational research on reliability, this approach reveals the extent to which most narratives draw implicitly on the strategic design and/or cultural aspects of reliability. As Carroll notes, chapters in this book view reliability either through a single lens (e.g., Chapter 10, which draws on the strategic lens) or through two lenses (e.g., Chapter 9, which cites cultural and political lenses). Notably, few recent studies have focused on the political context around the pursuit of reliability. As a result, several important questions remain underexplored. What is the role of power and political processes in whether and how the reliability agenda is framed and enacted in organizations? How will incorporating a political perspective modify our current understanding of what it takes to develop organizational capabilities for anticipation, resilience, and learning? Also, how can organizational political processes aid the development of such capabilities?

Context-Specific Models of Reliability

The chapters in this book seem to rest on the implicit assumption that findings from studies of organizational reliability are generalizable across contexts. Sutcliffe (Chapter 4) cautiously interprets the consistency in the findings from empirical studies of mindful organizing carried out in very different organizational settings as preliminary evidence for generalizability. However, Ramanujam (Chapter 2) draws on René Amalberti’s (2013) thesis that different organizational contexts might call for very different approaches depending on how organizations approach risk management and settle on trade-offs
between adaptability and safety. For instance, some models might give priority to local adaptation, whereas other models might give priority to rules and supervision. Identifying and explicating the prevalence of different models of reliability and the contextual features that drive such differences represent important tasks for future research.

Organizational Transition to High Reliability

Peter Martelli (Chapter 10), Comfort (Chapter 11), and W. Earl Carnes (Chapter 12) provide detailed information about efforts to implement reliability-enhancing practices in very different settings. Even as these chapters highlight the noteworthy success of organizational reliability research in influencing practice, they underscore the paucity of studies that examine how RSOs become significantly more reliable. Simply put, studies of organizational reliability are almost without exception snapshots of reliability-attaining organizations (RAOs) or RSOs. To emphasize this point, Martelli quotes Mark Chassin and Jerod Loeb (2013): "We know of no well-documented blueprints for elevating a low-reliability organization or industry into a highly reliable one and sustaining that achievement over time" (p. 467). Clearly, we need studies that can help us understand how organizations make this transition.

Important Aspects of Reliable Organizing

In addition to the prior themes that run through multiple chapters, several chapters identify specific opportunities for future research about the various aspects of reliable organizing. Sutcliffe (Chapter 4) advances the notion that the processes underlying high reliability organizing are not unique to the “prototypical” HROs but are in fact relevant, even essential, to organizations operating in environments characterized by growing volatility, uncertainty, complexity, and ambiguity. As evidence she points to studies that span such varied settings as hospitals, highway maintenance, drug rehabilitation, firefighting, offshore oil and gas production, and IPO software firms. Overall, research has made significant advances in systematically and rigorously measuring mindful organizing and identifying several of its antecedents and outcomes.


These advances open up several opportunities for future research. What is the dimensionality of mindful organizing? Most studies report evidence for a single factor, suggesting that the five constituent processes function in unison (e.g., Vogus & Sutcliffe, 2007). However, another study (Ray, Baker, & Plowman, 2011) reported a five-factor structure. More research is needed to investigate the intriguing possibility that the nature of mindful organizing might vary across contexts. How is mindful organizing related to and distinct from related group-level constructs such as situation awareness, safety climate, team coordination, transactive memory, and team-level resilience? What are the various outcomes, in addition to reliability, that are linked to mindful organizing? How does mindful organizing enable and how is it enabled by trade-offs that organizations make between reliability and other goals? What role do organizational routines play in facilitating and/or impeding mindfulness? What is the link between individual and organizational mindfulness? As Sutcliffe notes, the question of whether and to what extent mindful organizing depends on individual mindfulness remains understudied.

Kaplan and Waller (Chapter 5) discuss research that links specific properties such as team staffing, team boundary dynamics, stability in team composition, and behaviors of teams to high performance during critical and unexpected situations. They discuss two broad strategies for enhancing resilience. One is to select and staff teams with requisite qualities or dispositions, such as a growth mindset, that aid resilience. The second is to rely on simulation-based training (SBT), a fundamental component of team training and preparation in high reliability industries such as aviation, the military, and power generation facilities (Salas, Wilson, & Edens, 2009). In the context of reliability, teams could be either single- or multipurpose. Kaplan and Waller suggest that the emergence of shared mental models, transactive memory systems, and collective efficacy will be critical for resilience and might develop differently in these two types of teams. In particular, single-purpose teams are less stable in their membership and have fewer ongoing interactions between events. How do team composition, stability, and onboarding work in these teams? What is the role of team leaders in switching between internal and external communications? Given the critical importance of motivation in training effectiveness, these authors call for research to better understand the conditions that lead to effective SBT in high-risk work settings.


Madsen and Desai (Chapter 6) point to a basic yet underrecognized feature of HROs—the pursuit of reliability as a goal occurs in the context of several other potentially conflicting goals such as efficiency, speed, and innovation. Viewed in this light, the organizational pursuit of reliability is inevitably in conflict with the organizational pursuit of other equally important goals. Traditional models of goal conflict resolution would suggest that organizations deal with such situations by focusing on multiple goals concurrently or sequentially, or by focusing unitarily on a single overriding performance goal. However, these models cannot adequately explain the simultaneous pursuit of efficiency and reliability in contexts where reliability is not fungible. The authors argue that HROs rely on collective mindfulness not only to allocate attention to multiple performance goals but also to notice even minor deviations from expected outcomes with respect to these goals. However, we need additional studies to better understand how the mindfulness-linked processes (continuous bargaining over goal priority, balancing the incentives for reliability and other goals, incremental decision making through learning from small failures, and commitment to resilience) facilitate goal conflict resolution. The authors propose that these processes together allow organizational attention to be expanded through an escalation of resources to simultaneously consider multiple problems linked to different goals. What are the conditions that promote such processes? Do these strategies conflict with or reinforce one another?

Madsen (Chapter 7) describes the multiple forms of learning that enable and sustain high levels of reliability. The low frequency of major accidents in an HRO provides limited opportunities for learning from firsthand experience. Therefore, organizations must rely on learning directly from the rare disasters they do experience, learning vicariously from the disasters experienced by other organizations in their population, learning continuously from small failures and near misses, and learning experimentally through simulations that present events and failures that fall outside the range of their direct experience. Although a few studies have examined each kind of learning in the context of reliability, several open questions remain. How does each type of learning augment various organizational capabilities for enhancing reliability? Specifically, what kinds of learning aid prevention, response, and resilience? How do these forms of learning interact with one another? More generally, the chapter underscores the need to draw on the literature on organizational learning.


Jahn, Myers, and Putnam (Chapter 8) propose that organizational reliability research can benefit from in-depth investigations of the relationship between communication and HRO cultures. They discuss this relationship from the perspective of five communication metaphors (as a conduit, as information processing, as voice, as symbol, and as performance). They point out what the use of each metaphor reveals and obscures about actions embedded in organizational culture. The conduit metaphor focuses on passing along information but sees information as neutral, content-free, and devoid of meaning. The information-processing metaphor addresses the content and routing of information as well as targets of information-exchange activities. The voice metaphor introduces meaning and shows when people speak up or withhold information, but it fails to show how power and control operate. The symbol metaphor emphasizes meanings and interpretations of stories, artifacts, rituals, and so on. The performance metaphor centers on routines as social interactions. They conclude that symbol and performance, the two metaphors that focus on meaning and action, remain underutilized in reliability research. They propose that combining and embracing these metaphors can deepen our understanding of communication and its critical role in enacting high reliability. From the viewpoint of communication as symbol, how do organizational cultures make actions sensible and, during a crisis, cause some actions to appear in the foreground while causing other actions to recede to the background? From the viewpoint of communication as performance, how are everyday activities as well as the enactment of emotion shaped by an organization's history, cultural stories, and institutional responses to successes and failures?

Schulman and Roe (Chapter 9) propose an important reframing of reliability as an interorganizational or networked property and argue that the scope of research must be expanded to include input and output factors outside the organization. Simply put, the reliability of an organization cannot be understood in isolation and must take into account such factors as the shifting public attitudes toward and expectations about the risks of operational failures in organizations; the reliability of regulatory agencies responsible for what the authors refer to as "macro design"; the interorganizational coordination required to respond to and recover from failures that affect shared infrastructures; and the long-term effects of failures on future generations. This raises several important questions. What is the role of public perceptions of risk and dread in
organizational efforts to manage reliability? What are the reliability standards in an interconnected world? What is the role of regulatory agencies and their emphasis on compliance in organizational efforts to promote flexibility and resilience? Answering these questions, however, calls for large-scale, longitudinal studies that require the kind of access that is becoming more difficult to obtain.

Implementation of Reliability-Enhancing Practices

Martelli (Chapter 10) examines the adoption of the language and principles of high reliability organizing in health care, the largest sector in the economy. Whether HRO theory applies to health care depends on what one means by HRO theory and what one means by health care. Martelli points out that HRO concepts were brought into health care by anesthesiology, a relatively closed, dynamic, intense, time-pressured, complex, uncertain, and high-risk setting. Subsequently, a number of factors pushed the industry to pay attention to medical errors as a widespread threat to patient safety and to search for solutions from other industries and academic disciplines. The result, as Martelli concludes, is "an amalgamation of approaches under the heading of 'high reliability organizing' that may not properly fit that label." Some of the differences in how reliability was originally conceived in organizational research and how it is being interpreted in health care present opportunities for refining our conceptualization—for example, thinking of reliability as a continuous rather than a dichotomous variable is helpful to research in this setting. Other differences, such as using high reliability as a proxy for patient safety—although the two are not equivalent—pose the threat of obscuring the explanatory focus of reliability studies.

The experience of health care is instructive in many ways. First, delimiting the boundaries of the system (e.g., patient floor, clinical unit, hospital, or the continuum of care that often stretches across organizations and providers) is critical to designing, implementing, and assessing the effectiveness of reliability-enhancing practices. Second, knowledge intermediaries such as consultants play an increasingly important role in how findings from academic research about reliability are interpreted, implemented, and assessed in practice. At the very least, this trend calls for a careful examination of the role of consultants in disseminating reliability research and its consequences. Third, implementation efforts in health care frequently look for evidence-based
prescriptions for appropriate organizational structures for coordination and governance that will enhance reliability. This in turn points to the need for empirical studies to better understand and specify the organizational design for enhancing reliability in different contexts.

Comfort (Chapter 11) asks whether findings from HRO research can be applied to a community of organizations. She suggests that the appropriate questions asked in HROs are about factors leading to high performance. By contrast, the appropriate question in the context of a system of organizations is, How can complex interorganizational activities scale in manageable operations to achieve coherent performance as a system? She discusses the importance of good metrics in this area and provides a theoretical framework for investigation. The theoretical framework includes initial conditions of a system, reliability as a dynamic process, limits to reliability, scalability and transition in flexible structures, and searching for reliability in communities at risk. Comfort states that in nested systems there are differences in access to accurate information, capacity for understanding and updating information, and the ability to share information. She notes that in such systems the use of rules and procedures is seen as a systematic way to achieve reliable performance but that there may be inconsistent or conflicting rules and procedures at different levels of analysis. She poses key questions about the shift in focus from the reliability of a single organization to that of a system of interacting organizations. How can organizational performance at one level of operations be designed to create a basis for aggregation and integration of actions at the next level of operations, and the next? What are the minimum requirements that allow transition from one level of complex operations to the next wider level of activities, influence, and performance, and the next, while still retaining focus on the same goal for the whole system?

Carnes (Chapter 12) offers a refreshingly different perspective, that of practitioners who operate, regulate, or oversee organizations that demand high reliability. Drawing on his conversations with thirty-one of his professional colleagues, he discusses what they've learned from managing the challenges of RSOs. His respondents consistently identify a set of critical issues including leadership, communication, organizational learning, and problem identification and resolution (PI&R). Leadership depends on nurturing behaviors such as self-reflection, articulating risk, questioning attitudes, and overcoming management resistance. Good communication includes asking good questions, being consistent and persistent, and constantly
testing how messages are heard and understood. In the minds of Carnes and his respondents, the issue of organizational learning has to do with whether executives can learn. The answer: yes, if the right incentives are in place. PI&R is concerned with drawing on knowledge, experience, and current information to understand issues in their contexts. The discussions in this chapter draw attention to the paucity of studies in the organizational reliability literature about the role of leadership and top management in guiding the implementation of reliability-enhancing practices in their organizations.

Conclusions

Systematic examination of the organizational origins of reliability commenced in the 1980s as a small set of studies of so-called HROs in “extreme settings” (Roberts, 1990). This examination has since evolved into an expanding line of inquiry about reliable organizing in an ever-widening range of contexts. The chapters in this book substantiate the significant advances in our understanding of the nature of reliable organizing and its antecedents. Equally, they point to several exciting opportunities for future research. The growing public intolerance for even small-scale organizational failures is heightening awareness about how much we depend on organizations and anxiety about how much we take their dependability for granted. As a result, the reliability of organizations, not just technologies, is increasingly coming into question. The need for understanding and implementing reliability-enhancing structures and processes is greater than ever before. We hope the chapters in this book will help to further advance research and practice that can begin to address this need.

References

Amalberti, R. (2013). Navigating safety: Necessary compromises and trade-offs—Theory and practice. New York: Springer.
Chassin, M. R., & Loeb, J. M. (2013). High-reliability health care: Getting there from here. Milbank Quarterly, 91(3), 459–490.
Frese, M., & Keith, N. (2014). Action errors, error management, and learning in organizations. Annual Review of Psychology, 66, 1–21.
Grote, G. (2012). Safety management in different high-risk domains—All the same? Safety Science, 50, 1983–1992.
Hollnagel, E. (2014). Safety-I and safety-II: The past and future of safety management. London: Ashgate.


Ray, J. L., Baker, L. T., & Plowman, D. A. (2011). Organizing mindfulness in business schools. Academy of Management Learning and Education, 10(2), 188–203.
Roberts, K. H. (1990). Some characteristics of one type of high reliability organization. Organization Science, 1(2), 160–176.
Salas, E., Wilson, K. A., & Edens, E. (2009). Crew resource management: Critical essays. Surrey, UK: Ashgate.
Vogus, T. J., & Sutcliffe, K. M. (2007). The safety organizing scale: Development and validation of a behavioral measure of safety culture in hospital nursing units. Medical Care, 45(1), 46–54.


Contributors

W. Earl Carnes (retired) was Senior Advisor for High Reliability at the US Department of Energy (DOE). Part of his twenty-one years at DOE included serving in safety policy and oversight positions. He has eighteen years of experience in commercial nuclear power including with the Institute of Nuclear Power Operations (INPO) and as a management consultant. He has authored various technical documents for INPO, DOE, and the International Atomic Energy Agency. He was a member of the INPO safety culture advisory group.

John S. Carroll is Gordon Kaufman Professor of Management at the MIT Sloan School of Management. He has published four books and numerous articles in several areas of social psychology and organization studies. Current projects examine organizational safety issues in high-hazard industries such as nuclear power, oil and gas, and health care; the focus of his work in these projects includes leadership, self-analysis and organizational learning, safety culture, communication, and systems thinking. He is also Fellow of the American Psychological Society and received the Jay W. Forrester Award in 2012 for the best System Dynamics paper in the previous five years from the System Dynamics Society.


Louise K. Comfort is Professor of Public and International Affairs at the University of Pittsburgh. She teaches in the field of public policy analysis, information policy, organizational theory, and sociotechnical systems. She is currently engaged in several large-scale research projects on crisis management focused on the design, development, and integration of information processes to support decision making in urgent, uncertain environments.

Vinit Desai is Associate Professor of Management at the University of Colorado, Denver. His academic research focuses on issues of organizational learning, failures, and organizational legitimacy. He has published in the Academy of Management Journal and Organization Science, among other outlets, and his research has been featured in online articles by Forbes, Fortune, the Economist, and others. He received his MS and PhD in Business Administration from the University of California, Berkeley.

Jody L. S. Jahn is Assistant Professor in the Department of Communication at University of Colorado at Boulder. Her research examines communication processes involved in high reliability organizing, safety climate, and high performance. Specifically, she is interested in ways that team members in high-hazard occupations utilize safety rules and make sense of hazards. Her research has been published in Management Communication Quarterly, Journal of Applied Communication Research, and others.

Seth A. Kaplan is Associate Professor of Industrial/Organizational (IO) Psychology at George Mason University. His research focuses on understanding the determinants of team effectiveness in HROs and in extreme contexts. His scholarly work has appeared in journals including Psychological Bulletin, the Journal of Applied Psychology, the Journal of Management, and the Journal of Organizational Behavior among several others. He has received funding from sources such as the Department of the Army and NIH.

Peter M. Madsen is Associate Professor of Organizational Leadership and Strategy at the Marriott School of Management at Brigham Young University. His research focuses on organizational learning from failure and organizational impact on employee health and safety. Madsen's work has been published in top management and safety journals including Academy of
Management Journal, Organization Science, Strategic Management Journal, Journal of Management, Harvard Business Review, Business Ethics Quarterly, and Quality and Safety in Health Care. He has received the following awards and honors: Western Academy of Management Ascendant Scholar, Next Generation of Hazards and Disasters Researchers Fellowship, Academy of Management Meetings Best Paper (Careers Division), and Western Academy of Management Best Paper.

Peter F. Martelli is Assistant Professor of Healthcare Administration in the Sawyer Business School at Suffolk University and a board member of the Center for Catastrophic Risk Management, University of California, Berkeley. His research focuses on the intersection of theory and practice, with an emphasis on evidence-based management, organizational change, and behavioral models of risk and resilience. He received an MSPH from Thomas Jefferson University and a PhD in Health Services and Policy Analysis from the University of California, Berkeley, and was previously Research Coordinator in the Scientific Policy Department at the American College of Physicians.

Karen K. Myers (PhD, Arizona State University) is Associate Professor in the Department of Communication at the University of California, Santa Barbara. Her current research includes membership negotiation (socialization, assimilation); vocational anticipatory socialization; workplace flexibility and work-life balance issues; organizational identification; and interaction between generational cohorts in the workplace. Her work has appeared in Management Communication Quarterly, Human Communication Research, Journal of Applied Communication Research, Communication Monographs, Communication Yearbook, Human Relations, and elsewhere.

Linda L. Putnam is Distinguished Research Professor Emerita in the Department of Communication at the University of California, Santa Barbara. She is coeditor of eleven books, including The SAGE Handbook of Organizational Communication (3rd ed.), Building Theories of Organization: The Constitutive Role of Communication, Organizational Communication: Major Work, and The SAGE Handbook of Organizational Discourse. She is a Distinguished Scholar of the National Communication Association, a Fellow of the International Communication Association, and the recipient of Life-Time Achievement Awards
from the International Association for Conflict Management and Management Communication Quarterly.

Rangaraj Ramanujam is Professor at Vanderbilt University's Owen Graduate School of Management. His research examines the organizational causes of operational failures in high-risk settings. He serves on the editorial board of Stanford University Press's High Reliability and Crisis Management series and on the advisory board of Underwriters Laboratories' Institute for Integrated Health and Safety.

Karlene H. Roberts is Professor Emeritus at the Walter A. Haas School of Business, University of California, Berkeley, where she is also Chair of the Center for Catastrophic Risk Management. A recipient of the 2011 Academy of Management Practice Impact Award, her research focuses on the design and management of organizations in which error can result in catastrophic consequences. She is coeditor of Stanford University Press's High Reliability and Crisis Management series.

Emery Roe is Senior Research Associate at the Center for Catastrophic Risk Management, University of California, Berkeley. He has been a practicing policy analyst and is author of many articles and other publications. His most recent book, coauthored with Paul Schulman, is Reliability and Risk: The Challenge of Managing Interconnected Infrastructures (2016, Stanford University Press). His other books include Making the Most of Mess: Reliability and Policy in Today's Management Challenges (2013, Duke University Press); High Reliability Management: Operating on the Edge (with Paul Schulman, 2008, Stanford University Press); and Narrative Policy Analysis (1994, Duke University Press).

Paul R. Schulman is Professor of Government Emeritus at Mills College and Senior Research Associate at the Center for Catastrophic Risk Management at the University of California, Berkeley. He has written extensively and consulted on the challenge of managing hazardous technical systems to high levels of reliability and safety within organizations and across networks of organizations. His books include Reliability and Risk: The Challenge of Managing Interconnected Critical Infrastructures (with Emery Roe; Stanford University Press, 2016), High Reliability Management (with Emery Roe; Stanford University Press, 2008), and Large-Scale Policy-Making (Elsevier, 1980).


Kathleen M. Sutcliffe is Bloomberg Distinguished Professor of Business and Medicine at Johns Hopkins University. Her research focuses on understanding how organizations and their members cope with uncertainty and unexpected surprises and how organizations can be designed to be more reliable and resilient. She has published widely in organization theory and health care. In 2015 she received a distinguished scholar award from the Managerial and Organizational Cognition division of the Academy of Management.

Mary J. Waller (PhD, University of Texas at Austin) is M.J. Neeley Professor of Management at Neeley School of Business, Texas Christian University. Her research interests center on team interaction and effectiveness in complex, critical contexts. Her work appears in various management and psychology outlets including Academy of Management Journal, Journal of Applied Psychology, Organization Science, Management Science, and Journal of Organizational Behavior.


Index

Page numbers followed by f, t, and n indicate figures, tables, and notes. Accounting, 119, 122–23 Action: plans versus, 255–56; voice as, 177–78 Agency for Healthcare Research and Quality, 224–25, 231 Agency theory, 122 Aircraft carriers: classes of, 14n2; communication, 181; HRO research, 4, 6, 8; mindful organizing, 70; organizational goal conflicts, 127, 129; three lenses framework, 52–53, 58 Aircraft technicians, 182 Airline industry. See Aviation Air traffic control, 4, 5, 29, 48, 127, 130 Amalberti, René, 28 Ambiguity, causal, 121, 127–28, 135–36 Anesthesiology, 219–20. See also Health care Anticipation, 64–65 Art, modern, 57 Assessments, 292–93, 297–98 Audia, Pino, 123, 124–25 Audit teams, 50 Ausserhofer, Dietmar, 75 Automobile manufacturing, 178–79 Aviation: governance, collaborative, 283–84; organizational learning, 156,

157–58; political approaches to safety management, 49; team leaders, 102–3 “Avoided-events” reliability standard, 198 Axelrod, Robert, 249 Baker, Lakami, 75 Balanced scorecard, 119, 122–23 Band concerts, 101 Bargaining, 129–30, 137 Barnstorm Air Force Base, 182 Barton, Michelle, 81–82, 178 Baum, Joel, 123 Bea, Robert, 131 Beaumont Hospital (Royal Oak, Michigan), 230 Behavior, in Kirkpatrick’s hierarchy, 158, 159–60 Behavioral Theory of the Firm, 119, 123, 125, 135 Benn, Jonathan, 234 Bergeron, Caroline, 183, 185 Berwick, Don, 225 Bidirectional interdependency, 205 Bigley, Gregory, 220 Blatt, Ruth, 177–78


Boin, Arjen, 27 Bounded rationality, 124 Bourrier, Mathilde, 227 Boys, Suzanne, 172 BP, 6, 55–56, 131 Brion, Sebastien, 124–25 Bristol Royal Infirmary, 66 British Medical Journal, 222–23 Browning, Larry, 182 Busby, Jerry, 82 Business schools, 75 Calculative culture, 57–58 Canadian Patient Safety Foundation, 224–25, 235n9 Cannon, Mark, 154 Caribbean News Online, 262–63, 263t, 264f Carl Vinson (aircraft carrier), 52–53, 58, 127 Carroll, John, 91, 171, 178, 236n15, 269 CASoS (complex adaptive systems of systems), 252–53, 271. See also Communities at risk Causal ambiguity, 121, 127–28, 135–36 Chassin, Mark, 74, 225–26, 284, 304 Chernobyl nuclear accident, 52, 151 Chisholm, Donald, 154 Churchill, Winston, 83 Cincinnati Children’s Hospital Medical Center, 226 Clancy, Carolyn, 224 Cleirigh, Daire, 80 Cockpit resource management approach, 220 Cohen, Michael, 249 Collaborative governance, 283–85 Collective efficacy, 97 Columbia space shuttle accident, 27, 50, 127, 156 Communication: about, 307; conduit metaphor, 172, 173–74; future research, 189–91, 307; information-processing metaphor, 172, 174–77; lessons learned, 288–91; linkage metaphor, 176; metaphors overview, 172; network metaphor, 176; organizational culture and, 171; performance metaphor, 172, 184–87; symbol metaphor, 172, 180–84; voice metaphor, 172, 177–80 Communities at risk: about, 309; asymmetry of information, 254–55; CASoS, building resilience in, 271; CASoS initiative,

252–53; extreme events, 251–52; extreme events, comparison of, 267, 269–70; future research, 309; global commons, 253–54, 270–71; Haiti earthquake, 261– 63, 263f, 263t, 264f, 265, 269; Indonesia earthquake/tsunami, 257–59, 259t, 260f, 261, 261f, 269; Japan earthquake/ tsunami/nuclear reactor breach, 151, 265– 67, 266t, 268f, 269–70; organizational reliability at community level, 244– 47; plans versus action, 255–56; risk in system of systems, 247–48; theoretical framework, 248–51; three lenses framework, 269–70 Compensable reliability standard, 198 Complex adaptive systems of systems (CASoS), 252–53, 271. See also Communities at risk Concord Hospital (New Hampshire), 223–24 Concurrent consideration models, 122–23 Condemnation, 197 Conduit metaphor for communication, 172, 173–74 Consolini, Paula, 118, 127, 132 Consumer rate containment, 212n3 Contractor, Noshir, 176 Control, voice as, 178–79 Conway, Jim, 217 Cooper, Jeffrey, 159 Cooren, François, 183, 185 Corman, Steven, 176 Craft workers, 54 Crew resource management approach, 220 Crozer-Keystone Health System, 226 Cultural lens, 43–44, 52–56 Cultural readjustment, in disaster incubation model, 145–46, 148, 152–53 Culture: calculative, 57–58; generative, 58; Just, 232, 236n20; organizational, 171, 286; pathological, 57; proactive, 58; reactive, 57; safety, 52, 57–58, 203 Cyert, Richard, 118 Dana-Farber Cancer Institute, 221 David, Paul, 151 Debriefing, 101 Decision avoidance, errors of, 205 Decision making, incremental, 132–33, 138 Decision model, fire alarm, 125


Deepwater Horizon explosion, 6, 131 Deming, W. Edwards, 63 Desai, Vinit, 149 Diablo Canyon nuclear power plant, 4, 6, 64, 128, 154 Disaster incubation model: about, 144–46, 164; experiential learning, 148–49; learning from small failures and near misses, 156; simulation learning, 160; vicarious learning, 152–53 Disasters, increasing frequency of, 14n1 Disruption, defined, 207 Downs, Anthony, 197 Dread, public, 196–97 Dudley, Robert, 131 Dynamic nonevent, reliability as, 65, 169, 199 Dynamic process, reliability as, 249 Earthquakes: Haiti, 261–63, 263f, 263t, 264f, 265, 269; Indonesia, 257–59, 259t, 260f, 261, 261f, 269; Japan, 265–67, 266t, 268f, 269–70 Edmondson, Amy, 154 Efficacy, collective, 97 Efficiency goals, 118–19, 127, 136 Eisenbeiss, Silke, 80 Electrical reliability, 204, 205–6 Emergent states, 96–97, 109–10 Enactment, 169, 184 Endsley, M. R., 80 Engineers, 295 Entrepreneurs, 81–82 Error-Free technology, 236n36 Error management, 24 Errors of decision avoidance, 205 Errors of underestimated uncertainty, 205 Exception, management by, 8 Experiential learning, 147–50, 161, 162 Experimental learning, 156. See also Simulation-based training Expertise, deference to, 8, 53–54, 72–73 Failures: defined, 207; preoccupation with, 53, 71; preparing for, 293; public attitudes toward, 197–98; small, 153–54, 161 FedEx, 62 Financial goals, 124 Fiol, C. Marlene, 79

Fire alarm decision model, 125 Firefighters: communication, 176–77, 178, 182, 185, 186–87; failure, 190; reliability standards, 212n8 Fishing, 49 Flagging, in surgery, 278 Flight attendants, 185 Flight technicians, 182 Floods, 251 Foreign Object Damage, 182 Foresight, 113 Framing, 182–83 Fukushima nuclear reactor breach, 151, 265–67, 266t, 268f, 269–70 Full cultural readjustment, in disaster incubation model, 145–46, 148, 152–53 Gaba, David, 219–20, 222, 235 Garite, Thomas, 77 Generative culture, 58 German craft workers, 54 Glass, Robert, 252 Global commons, 253–54, 270–71 Goal conflicts. See Organizational goal conflicts Goals: efficiency, 118–19, 127, 136; financial, 124; multiple, 126–28; size, 124; survival, 124 Gopnik, Adam, 276 Governance, collaborative, 283–85 Grant, Susan, 176 Greaney, John, 80 Greve, Henrich, 123, 124 Gun drill, 5 Haiti earthquake, 261–63, 263f, 263t, 264f, 265, 269 Hanna, George, 234 Haunschild, Pamela, 149 Hayward, Mathew, 154 Health Affairs, 224 Health care: about, 308; articles, 224–25; books, 224; challenges, 217–18; cockpit/ crew resource management approach, 220; communication, 183; conferences, 221, 223, 228, 235n2, 235nn6–7; future research, 234, 308–9; governance, collaborative, 284; HRO approach,


Health care: about (continued ) current, 226; HRO Learning Network, 224–25; incident command system principles, 220–21, 235n1; issues addressed by HRO, 230–32; knowledge intermediaries, 227–30, 236n13; lessons learned, 234–35; mindful organizing, 73–74, 74–75, 76, 77, 78; organizational learning, 159; patient safety, 26, 221–23; reports, 217, 225; research and practice, 223–26; safety awards, 223–24; team resilience, 98; zero errors as concept in, 236n19. See also Hospitals; Nurses Healthcare Performance Improvement, 227–28 Health Services Research, 225 Heedful interrelating, 70 Helping, 111–12 Herald of Free Enterprise ferry disaster, 54 Highly optimized tolerance, 248 High reliability organizations (HROs): attributes, 53–54, 146; cultural approaches to safety management, 52–54; mindful organizing, 61–62; as term, 18, 22, 227, 236n12. See also specific topics High reliability organizations (HRO) research: characteristics of initial, 4–7; conceptual problem, 7–8; early, 3–4, 17; findings, early, 8–9; history, 17–18; HRO as label, 18, 22; question, basic, 9 High Reliability Self-Assessment Tool, 229 Ho, Zhi Wei, 79 Hospitals: communication, 288; consulting firms, 226, 235n11; mindful organizing, 78; organizational learning, 151–52; reliability models in, 29; stages of maturity model, 229; team resilience, 98. See also Health care; Nurses HRO Learning Network, 224–25 HRO model, 28–29 HROs. See High reliability organizations Human Performance Handbook (US Department of Energy), 297 “Humble inquiry,” 289 Iacobucci, D., 76 Images of Organizations (Morgan), 172 Implementation, 294–97 “Inadequate oversight,” 283

Incentives, balancing, 130–32, 137–38 Incident command system, 220–21, 235n1 Incremental decision making, 132–33, 138 Incubation period, in disaster incubation model, 145, 148–49, 153, 156 Individual mindfulness, 78–80 Indonesia: earthquake/tsunami, 257–59, 259t, 260f, 261, 261f, 269; floods, 251 Inevitable-event reliability, 198 Information: asymmetry of, 254–55; channels for communicating, 8; seeking, 174–75; team boundary, managing across, 101–4 Information-processing metaphor for communication, 172, 174–77 Input analysis: planning as input to reliability, 204–5; public attitudes toward organizations that manage risks, 195–98; public perceptions and alternative reliability standards, 198–200; regulatory inputs for reliability, 200–201; regulatory reliability versus regulation for reliability, 201–4, 203f Inquiry, humble, 289 Institute of Medicine, 217, 224, 284 Institute of Nuclear Power, 150, 151 Interconnected reliability, 205–6 Interdependency, 205–6 Internal team factors: about, 92–93; collective efficacy, 97; emergent states, 96–97, 109–10; future research, 109–10; mental models, shared, 96–97; situational awareness, 96; team staffing, 93–96, 94f; transactive memory systems, 97 International Atomic Energy Agency, 52 Interrelating, heedful, 70 Iszatt-White, Marian, 82 Jahn, Jody, 176, 185, 186 Jakarta, Indonesia, floods, 251 Japan earthquake/tsunami/nuclear reactor breach, 151, 265–67, 266t, 268f, 269–70 Jensen, Michael, 126 Jha, Ashish, 218 Johannessen, Idar, 80 Joint Commission Center for Transforming Healthcare, 229, 284, 288 Jordan, Silvia, 80 Just Culture, 232, 236n20


Kirkpatrick’s hierarchy, 158–60 Knowledge, in Kirkpatrick’s hierarchy, 158, 159 Knowledge intermediaries, 227–30, 236n13 Knox, G. Eric, 77 Krieger, Janice, 175 Kuhn, Timothy, 176 Landau, Martin, 154 Langer, Ellen, 67–68 La Porte, Todd, 118, 127, 132, 180, 196, 233 Leaders and leadership, 79–80, 102–3, 112, 280–82, 287–88 Leape, Lucian, 221 Learning: experiential, 147–50, 161, 162; experimental, 156; vicarious, 150–53, 161, 162–63. See also Organizational learning; Simulation-based training Lehman, Betsy, 221 Lekka, Chrysanthi, 8–9 Lessons learned: about, 274, 309–10; behaviors that matter, 286–94; communication, 288–91; implementation, 294–97; leadership, 287–88; leadership roles, 280–82; organizational learning, 291–93; performance standards, 297–98; problem identification and resolution, 293–94; regulation versus encouragement of reliability, 282–85; respondents to survey, 275–76; RSO characteristics, 276–77; simple/easy paradox, 277–79; survey, 275; work, meaningful, 279–80 Leveson, Nancy, 24–25 Libuser, Carolyn, 158 Linkage metaphor for communication, 176 Loeb, Jerod, 74, 304 Macro design, 201–2 Madsen, Peter, 149, 151–52 Management by exception, 8 Managerial practice, 82–83, 287–88 Managing the Unexpected (Weick and Sutcliffe), 224 Mandelbrot, Benoit, 250 March, James, 118, 124 Marginal reliability, 228 Maturity stages model, 229 Maude-Griffin, Roland, 151 Meaningful work, 279–80

Media effect on public attitudes, 197 Mediated interdependency, 205–6 Medical error, 221–23 Medical residents, 177–78 Medication errors, 77 Mental models, 96–97, 129–30, 295 Milbank Quarterly, 74 Millenson, Michael, 221 Millstone Nuclear Power Station crisis, 50–52 Mindful organizing: about, 61–62, 304; antecedents, 76–77; conceptual foundations, 62–64; conditions enabling, 69–70; construct validity of, 74–76; future research, 30–31, 78–82, 305; goal conflict resolution and, 128–29, 137; individual mindfulness, 78–80; leader mindfulness, 79–80; managerial practice, implications for, 82–83; opportunities versus threats, 81–82; organizational mindfulness and, 66–68; outcomes, 77–78; positive psychology and, 84; in practice, 69–73; processes, 70–73; reliability and logics of anticipation and resilience, 64–66; research on, 73–78; situation awareness, 80–81; three lenses framework and, 84 Mining, 49, 105, 143, 149 “Miracle on the Hudson,” 157–58 Mitroff, Ian, 92 Morgan, G., 172 MS Herald of Free Enterprise ferry disaster, 54 Multer, Jordan, 173 Multiple goals, 126–28 Multitasking, 95 Murphy, Alexandra, 185 Myers, Karen, 186–87 Narayanan, Jayanth, 79 National Aeronautics and Space Administration (NASA), 27, 50, 127, 156 National Patient Safety Partnership, 221, 235n3 Naval base storage tanks, 209 Near misses, 154–56, 161, 163 Network metaphor for communication, 176 Networks, 205–6 Normal Accidents (Perrow), 49 Normal accidents theory, 49, 147–48


Northeastern Air Defense team, 102 Northeast Utilities, 51–52 Novak, Julie, 178 Novo Nordisk, 77–78 Nuclear power: Chernobyl accident, 52, 151; Diablo Canyon plant, 4, 6, 64, 128, 154; Fukushima breach, 151, 265–67, 266t, 268f, 269–70; Millstone Nuclear Power Station crisis, 50–52; organizational learning, 150–51; public attitudes toward, 196; team leaders, 103 Nurses: mindful organizing, 70, 74–75, 76, 77, 78; organizational learning, 154. See also Health care; Hospitals Obstfeld, David, 20, 30–31, 67, 130, 153 O’Connor, Edward, 79 Onset, in disaster incubation model, 145 Operational redesign, 211 Operations, sensitivity to, 53, 72 Organizational culture, 171, 286 Organizational goal conflicts: about, 118–20, 306; ambiguity, causal, 121; bargaining, 129–30, 137; concurrent consideration models, 122–23; decision making, incremental, 132–33, 138; future research, 135–36, 306; incentives, balancing, 130–32, 137–38; mindful organizing and, 128–29, 137; models for resolving, 120–26, 121f; multiple goals in HROs, 126–28; performance relatedness, 120–21; practical implications, 136–39; resilience, commitment to, 133–34, 138–39; sequential consideration models, 123–25; strategies for resolving, 128–34; theoretical implications, 135–36; unitary consideration models, 125–26 Organizational learning: about, 143–44, 306; assessments in, 292–93; experiential learning, 147–50, 161, 162; experimental and simulation learning, 156–60, 161–62; experimental learning, 156; future research, 162–63, 306; learning in HROs, 144–47; lessons learned, 291–93; near misses, 154–56, 161, 163; small failures, 153–54, 161; vicarious learning, 150–53, 161, 162–63. See also Simulationbased training

Organizational mindfulness, 66–68 Organizations, defined, 118 Outcome Engenuity, 232, 236n20 Output analysis, 206–9 Oversight, inadequate, 283 Pacific Gas and Electric Company Diablo Canyon nuclear power plant, 4, 6, 64, 128, 154 Participation, voice as, 178 Pathological culture, 57 Patient safety, 26, 221–23 Pediatric intensive care units, 151–52 Performance consistency, reliability as, 20 Performance Improvement International, 227, 236n36 Performance metaphor for communication, 172, 184–87 Performance relatedness, 120–21, 127, 135–36 Perrow, Charles, 3–4, 49, 150 Physicians, 155 Pilots, 47–48, 175 Plans and planning, 204–5, 255–56 Plowman, Donde, 75 Political lens, 42–43, 49–52 Ponseti International, 126 Pooled interdependency, 205–6 Positive psychology, 84 Positive reasoning, 175 Possibilistic thinking, 92 Power, 42–43 Precipitating event, in disaster incubation model, 145 Precluded events, 195, 228 Precursor resilience, 207 Prediction, 92 Proactive culture, 58 Probabilistic thinking, 92 Problem identification and resolution, 293–94 Psychological safety, 111–12 Psychology, positive, 84 Public dread, 196–97 Public opinion, 195–200 Putnam, Linda, 172 Raslear, Thomas, 173 Rasmussen, Jens, 54 Rationality, bounded, 124


Ray, Joshua, 75 Reaction, in Kirkpatrick’s hierarchy, 158, 159 Reactive culture, 57 Reality Checking, 297 Reason, James, 71, 222, 286 Reasoning, positive, 175 Reb, Jochen, 79 Reciprocal interdependency, 206 Recovery, 23 Recovery resilience, 208 Redundancy, built-in, 9 Regulation, 200–204, 203f, 282–85 Reliability: about, 19–20, 21f; definitions, 38, 64, 301–2; as dynamic nonevent, 65, 169, 199; as dynamic process, 249; electrical, 204, 205–6; enactment and, 169, 184; future research, 30; inevitable-event, 198; interconnected, 205–6; limits on, 249–50; marginal, 228; as multifaceted construct, 23–24; as performance consistency, 20; requisite variety and, 169–70; as resilience, 22–23; as safety, 20, 22; safety and, 24–25; safety climate and, 25–26; as service continuity, 23. See also specific topics Reliability analysis: about, 194–95, 307; future research, 209–12, 307–8; input analysis, 195–205, 203f; levels of, 302–3; networks and interconnected reliability, 205–6; output analysis, 206–9 Reliability-enhancing work practices, 76 Reliability input analysis. See Input analysis Reliability models: about, 26–28, 303–4; future research, 31–32; HRO model, 28–29; reliability attaining versus reliability seeking organizations, 27–28; ultraresilient model, 28; ultrasafe model, 29 Reliability output analysis, 206–9 Reliability seeking organizations (RSOs), 27– 28, 62, 276–77. See also specific topics Requisite variety, 169–70 Rerup, Claus, 77–78 Rescue and salvage, in disaster incubation model, 145 Resilience: in CASoS, 271; commitment to, 53, 72, 133–34, 138–39; defined, 24, 91; precursor, 207; prediction versus,

92; recovery, 208; reliability as, 22–23; restoration, 208. See also Team resilience Resistance, management, 287–88 Respect, 69–70 Restoration, defined, 207 Restoration resilience, 208 Results, in Kirkpatrick’s hierarchy, 158, 160 Risk management, 294 Ritchie-Dunham, James, 79 Roberts, Karlene: communication, 180; high reliability organizations, 146; incident command system, 220; mindful organizing, 67; organizational goal conflicts, 119, 127, 129, 131 Robust Process Improvement, 229 Rochlin, Gene, 7, 18, 236n12 Roe, Emery, 19–20, 76–77 Rosenblum, Katherine, 95 Roth, Emilie, 173 Rothwell, Geoffrey, 151 Rousseau, Denise, 146, 180 RSOs (reliability seeking organizations), 27– 28, 62, 276–77. See also specific topics Rudolph, Jenny, 236n15 Rules, formal, 38, 65 Safety: awards for, 223–24; as cultural construction, 55; patient, 26, 221–23; psychological, 111–12; reliability and, 24–25; reliability as, 20, 22 Safety climate, 25–26 Safety culture, 52, 57–58, 203 Safety management: cultural approaches to, 52–56; future of, 56–58; political approaches to, 49–52; strategic design approaches to, 45–48 Sago Mine disaster, 143 Sameroff, Arnold, 95 Sandelands, Lloyd, 67 Sandy, Superstorm, 251 Schein, Edgar, 286, 289 Schulman, Paul, 19–20, 27, 76–77, 128, 154, 228 Schultz, Patricia, 80 Scott, Clifton, 182 Scott, Dick, 63 Scully, Tom, 224 Self-assessment, 297–98


Self-enhancement models, 124–25 Sellnow, Timothy, 178 Selznick, Philip, 118 Sensemaking, 181–82, 183 Sentara Safety Initiative, 227 September 11, 2001 terrorist attacks, 102 Sequential consideration models, 123–25 Service continuity, reliability as, 23 Shapira, Zur, 124 Shapiro, Marc, 159 Shareability constraint, 71 Simple/easy paradox, 277–79 Simplification, avoidance of, 53, 71–72 Simpson, Kathleen, 77 Simulation-based training: foresight, 113; motivation to participate in, 112–13; organizational learning, 156–60, 161–62, 163; team resilience, 100, 105–9, 112–13 Sitkin, Sim, 153–54 Situation awareness, 80–81, 96 Size goals, 124 Small failures, 153–54, 161 Small losses perspective, 153–55 Smart, Palie, 127 Smart grids, 211 Smircich, Linda, 4 Sorenson, Olav, 123 Stablein, Ralph, 67 Stages of maturity model, 229 STAMP (Systems-Theoretic Accident Model and Processes), 47–48 Standards, 198–200, 297–98 Starting point, in disaster incubation model, 144–45 Steineman, Susan, 159 STICC protocol, 83 Strategic design lens, 40–42, 45–48 Strategic Reliability, LLC, 228, 236n14 Submarine Safety Program (SUBSAFE), 46– 47, 49–50, 55, 57, 58 Sullenberger, Chesley “Sully,” 157 Sullivan, Bilian, 149 Sumatra earthquake/tsunami, 257–59, 259t, 260f, 261, 261f, 269 Superstorm Sandy, 251 Surgery, 278 Survival goals, 124

Sutcliffe, Kathleen: communication, 178; HRO attributes, 53; Managing the Unexpected, 224; mindful organizing, 30–31, 67, 68, 74–75, 75–76, 77; organizational goal conflicts, 130; organizational learning, 153; reliability attributes, 20, 249; resilience, 22–23, 91, 95 Symbol metaphor for communication, 172, 180–84 Systems-Theoretic Accident Model and Processes (STAMP), 47–48 Targeted Solutions Tool, 229 Team boundary dynamics: about, 97–98; debriefing, 101; events, adapting team composition to, 98–101; future research, 110–12; helping, 111–12; information management, 101–4; interaction patterns, 104–5; psychological safety, 111–12; simulation-based training, 100; switching behaviors, 100–101, 112; team leaders, 102–3, 112; transition strategies, 111 Team leaders, 102–3, 112 Team resilience: about, 90–92, 305; emergent states, 96–97, 109–10; events, adapting team composition to, 98–101; future research, 109–13, 305; information management, 101–4; internal team factors, 92–97, 109–10; prediction versus, 92; simulation-based training, 105–9, 112–13; team boundary dynamics, 97–105, 110–12 Team staffing, 93–96, 94f Three lenses framework: about, 37–40, 39f, 303; combining three lenses, 44–45, 56–58; communities at risk, 269–70; cultural lens, 43–44, 52–56; mindful organizing and, 84; political lens, 42–43, 49–52; safety management, future of, 56–58; strategic design lens, 40–42, 45–48 Three Mile Island accident, 56, 150–51, 152 Thresher (submarine), 46, 55 Tohuku-Oki earthquake/tsunami/nuclear reactor breach, 151, 265–67, 266t, 268f, 269–70 Tolerance, highly optimized, 248 Toyota Way, 67


Traffic alert and Collision Avoidance System, 48 Training, continuous, 8. See also Simulationbased training Transactive memory systems, 97 Transportation system, 252 Trauma teams, 103–4 Trethewey, Angela, 182 Trust, 69–70 Tsunamis: Indonesia, 257–59, 259t, 260f, 261, 261f, 269; Japan, 265–67, 266t, 268f, 269–70 Turner, Barry, 4, 144–46, 164. See also Disaster incubation model Turner, Victor, 184 Ultraresilient model, 28 Ultrasafe model, 29 Uncertainty, errors of underestimated, 205 Unitary consideration models, 125–26 US Airways flight 1549 emergency landing, 157–58 US Army, 5 US Department of Energy, 297 US Federal Aviation Administration, 4, 5, 156 US Forest Service, 190, 224 US Navy, 46–47, 49–50, 55, 57, 58. See also Aircraft carriers US Nuclear Regulatory Commission, 286 USS Carl Vinson (aircraft carrier), 52–53, 58, 127 USS Thresher (submarine), 46, 55

Van Knippenberg, Daan, 80 Variety, requisite, 169–70 Vicarious learning, 150–53, 161, 162–63 Vincent, Charles, 234 Vogus, Timothy: mindful organizing, 68, 70, 74–75, 75–76, 77, 81; resilience, 22–23, 91, 95 Voice metaphor for communication, 172, 177–80 Weick, Karl: communication, 174, 189; enactment, 184; HRO as term, 236n12; HRO attributes, 53; HRO research, 3–4; Managing the Unexpected, 224; mindful organizing, 30–31, 67, 68, 70, 79–80; organizational goal conflicts, 130, 133; organizational learning, 153; reliability attributes, 20, 65, 199, 249; reliability mechanisms, 169–70 Welbourne, Theresa, 76, 81 Westrum, Ron, 57–58, 67 Wildavsky, Aaron, 91 William Beaumont Hospital (Royal Oak, Michigan), 230 Work: meaningful, 279–80; procedures for, 76, 290–91; relationships at, 69 Yates, Gary, 218, 227 Zero errors, as concept in health care, 236n19 Zoller, Heather, 178–79