Becoming A Data-Driven Organisation: Unlock The Value Of Data 3662603039, 9783662603031, 3662603047, 9783662603048

Data is the foundation of any current and future market transformation during this digital era.


English Pages 148 Year 2019


Martin Treder

Becoming a data-driven Organisation Unlock the value of data


Martin Treder Königswinter Nordrhein-Westfalen, Germany

ISBN 978-3-662-60303-1
ISBN 978-3-662-60304-8 (eBook)
https://doi.org/10.1007/978-3-662-60304-8

Springer Vieweg
© Springer-Verlag GmbH Germany, part of Springer Nature 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Clipart images: Public domain, taken from clipartpanda.com, wikiclipart.com, clipart-library.com, kisspng.com.

This Springer Vieweg imprint is published by the registered company Springer-Verlag GmbH, DE, part of Springer Nature. The registered company address is: Heidelberger Platz 3, 14197 Berlin, Germany

Companies that have not yet built a data strategy and a strong data management function need to catch up very fast or start planning for their exit. Thomas H. Davenport, “What’s your data strategy?”, Harvard Business Review (Davenport 2017)

Foreword

Many different trends are competing for an organisation’s attention, be it a company, a public authority or a non-governmental organisation¹. At first glance, “data” is just one of them. So why should you pay particular attention to the data topic now?

First of all, the discussion is not about “data OR other aspects” but about “data to support other aspects”. Data is becoming the foundation for existing and new topics alike. Without proactive and orchestrated management of an organisation’s data, it will be increasingly difficult to adequately address existing challenges or to introduce new ways of working.

And you cannot not deal with data. Sooner or later, every organisation needs to decide how to address the “data” topic, both strategically and organisationally.

This book is meant to give you a quick introduction to data. It tries to help you understand what data is about and where its value comes from. And it will provide recommendations about how to address the data topic within your organisation.

I have tried to avoid data jargon wherever possible. I realise that even the term “data” itself is jargon. You will hopefully understand it better as you read through this book.

¹ I am using the term “organisation” throughout the book to indicate that this topic is by no means limited to companies.

Some of my general recommendations may already be part of your management toolset. Please consider them part of my encouragement to consciously apply these concepts in the area of data as well.

This book does not contain any recommendation for data tools or cloud storage providers, for a good reason: these are IT topics. Data Management is not supposed to take over from IT. It should complement the responsibilities of the IT department, which will also make life easier for the CIO.

You think your organisation is too small for a dedicated Data Office or Chief Data Officer? While this may be the case, no organisation is too small for a data-driven approach. The recommendations in this book are meant to be scalable. The Data Office may be a team, it may be a CDO’s one-man show, or it may even be an additional role for the General Manager!

Are you ready to follow me on my short data journey? I hope it will enable you to make well-informed decisions about your way forward:

• Chapter 1: What is Data?
  “Data” is a very abstract term that could mean anything. This chapter explains what data stands for in the context of an organisation. It also provides a list of concrete topics a data office would deal with. Here you will find answers to typical questions about data management as they often come from within organisations.
• Chapter 2: The Value of Data
  Organisations should not deal with data simply because it is in vogue, or because “everybody does it”. There must be something in it for the organisation, a tangible commercial value. This chapter provides a clear rationale for active engagement in data today.
• Chapter 3: Designing your Data Office
  Efficient management of data is not an automatic result of an organisation’s commitment to take the subject seriously. It requires a conscious design, to systematically address all relevant data topics in a future-proof way. This chapter suggests a sustainable data approach that takes into account differences between organisational cultures, organisation sizes and industries.
• Chapter 4: Implementing the Data Office
  Even the best-designed data office means change to any organisation, compared to the status quo. Not everybody will embrace the change. This chapter provides hands-on recommendations to overcome obstacles such as reluctance, individual interests of players or misperceptions, across all stakeholder groups.

Martin Treder

Acknowledgements

I’d like to thank my wife, for sharing me with an evolving book for too many weeks. Great input has come from the communities of CDO Exchange and Data Leaders. Thanks to my numerous peers for ensuring my book is not based on a single person's perspective only. A special thanks goes to Hugo Patten, whose concrete and constructive feedback helped iron out a few wrinkles in my original manuscript.

Contents

1 What is Data?
  1.1 What Exactly is “Data”?
  1.2 Isn’t Data the Job of the IT Department?
  1.3 Isn’t Data and Information the Same?
  1.4 Isn’t Data Management About Analytics?
    1.4.1 Manage Data Sources
    1.4.2 Validate Data on Entry
    1.4.3 Classify Data
    1.4.4 Manage Data Quality
    1.4.5 Do Data Housekeeping
    1.4.6 Use Your Data: Analytics
    1.4.7 Use Your Data: Operational Use
    1.4.8 The “Next Big Thing”?
    1.4.9 Summary
  1.5 What Else Does Data Management Deal with?
    1.5.1 Reliable Sources
    1.5.2 Proper Definition of Data
    1.5.3 Documentation
    1.5.4 Active Data Quality Management
    1.5.5 Data Ownership
    1.5.6 Masterdata Management (MDM)
    1.5.7 External Data
    1.5.8 Data Sharing Across Organisations
    1.5.9 Data Flow Definition and Documentation
    1.5.10 “Data Reviews”
    1.5.11 Data Ethics
  1.6 Data in the Overall Context of Our Business
    1.6.1 Data as a Business Aspect
    1.6.2 Comparing Data with Goods and Money
  1.7 Data Helps Enhance Capabilities
  References
2 The Value of Data
  2.1 The Commercial Value of Data
    2.1.1 Data Versus Conventional Assets
    2.1.2 How to Treat Data as an Asset
    2.1.3 Why Should We Treat Data as an Asset?
  2.2 Opportunities Through Managing Data
    2.2.1 Aren’t We Using Data Already?
    2.2.2 Lean, Agile and Data
    2.2.3 Gaining Insight Where Humans See Nothing
    2.2.4 Research and Development
  2.3 The Cost of NOT Doing it Right
    2.3.1 The Cost of Silos
    2.3.2 The Cost of Non-quantifiable Impact
    2.3.3 The Cost of Complexity
    2.3.4 The Cost of Wasted Resources
    2.3.5 The Cost of Inaccuracy
    2.3.6 The Cost of a Ruined Reputation
    2.3.7 The Cost of Using Data Without Understanding it
    2.3.8 Do You Trust in Algorithms?
    2.3.9 The Bias of the Investigator
  2.4 Why Now
    2.4.1 The Amount of Data is Growing Exponentially
    2.4.2 The Expectation of Customers is Changing
    2.4.3 Business Decisions Require Data
    2.4.4 The Competition Does it Already
    2.4.5 The War for Talent has Started
    2.4.6 Data Privacy Sensitivity Increases
    2.4.7 Progress in Technology Accelerates Even Further
    2.4.8 Change has Become the Norm
  References
3 Designing your Data Office
  3.1 What are the Preconditions to Succeed?
  3.2 What does a Data Office Deal with?
    3.2.1 What: The Data Vision
    3.2.2 How: Translate the Vision
    3.2.3 The Different Kinds of Data
  3.3 Shaping the Data Office
    3.3.1 The Culture of an Organisation
    3.3.2 Objectives of the Data Office
    3.3.3 The Footprint
    3.3.4 The Mandate
    3.3.5 Managing Expectations
    3.3.6 Positioning a Data Office in the Organisation
    3.3.7 The Data Office – Police or Service Provider
    3.3.8 Centralising the Management of Data?
    3.3.9 Data Collaboration
  3.4 The Chief Data Officer (CDO)
    3.4.1 Authority or Charisma?
    3.4.2 What are the Key Attributes of a CDO?
    3.4.3 What is the Future of the CDO Role?
  3.5 Data Governance: Review and Decision
    3.5.1 Data Councils
    3.5.2 Data Representation on Existing Bodies
    3.5.3 Data Review and Decision Process
    3.5.4 Data Ownership
  3.6 How can a Data Office Wield Influence?
    3.6.1 Internal Data Networks
    3.6.2 Data Guidelines
    3.6.3 Regular Reports to the Board
    3.6.4 Data needs Internal Marketing
  References
4 Implementing the Data Office
  4.1 Set the Targets – Plan the Journey
    4.1.1 Stocktaking
    4.1.2 Opportunities
    4.1.3 Aspiration
    4.1.4 Plan and Execution
  4.2 Stakeholder Management
    4.2.1 Be an Active Sponsor
    4.2.2 Use Strong Change Management
    4.2.3 Get People to Play Ball
    4.2.4 The Data Office and Business Functions
    4.2.5 The Data Office and IT
    4.2.6 The Chief Digital Officer
    4.2.7 The Employees
  References
Glossary
References

List of Figures

Fig. 1.1 The Data Value Chain
Fig. 1.2 The Data Supply Chain
Fig. 1.3 The SCPDS change process
Fig. 2.1 Average PPG per season/LeBron James
Fig. 2.2 A normal distribution curve
Fig. 3.1 A data vision for the CDO
Fig. 3.2 The targets of a data office
Fig. 3.3 Types of data covered by the data office
Fig. 3.4 CDO model “King John Lackland”
Fig. 3.5 CDO model “voluntary support”
Fig. 3.6 CDO model “additional team”
Fig. 3.7 CDO model “moving teams”
Fig. 3.8 Growing the three dimensions of data
Fig. 3.9 Is there a space for a new CDO?
Fig. 3.10 The data governance pyramid

1 What is Data?

You can have data without information, but you cannot have information without data. Daniel Keys Moran (Quotes n.d.). An American computer programmer and science fiction writer, born 1962.

1.1 What Exactly is “Data”?

In daily life, we face multiple definitions of “data”. Most of them have a technical connotation. We think of “databases”, “data transmission” and other computer-related terms. The online tech dictionary Webopedia states: “Since the mid-1900s, people have used the word data to mean computer information that is transmitted or stored.” (Webopedia - Data 2019)

More recently, data is increasingly seen as a magical, intangible material used by advanced experts to facilitate better decision making.

I fear that neither of these views explains the relevance of data in today’s business reality. But maybe an analogy does:

© Springer-Verlag GmbH Germany, part of Springer Nature 2019 M. Treder, Becoming a data-driven Organisation, https://doi.org/10.1007/978-3-662-60304-8_1


Analogy

Data is the constituent of information and knowledge. It can be considered the intangible equivalent of quarks, which are defined as “a type of elementary particle and a fundamental constituent of matter” (Wikipedia - Quark 2019). (For this analogy, you do not need to understand physics, beyond believing that all atoms are composed of quarks and that all matter consists of atoms.)

Now I invite you to take this analogy further: Data needs to be put together in an intelligent way to become information. As a next step, information from different areas can be put together to gain previously unknown insight. This is comparable to how quarks form hadrons, which are components of atoms, of which a glass of orange juice consists, or a glass of lemonade, or… (fill in your favourite drink), the value of which is obvious.

Finally, you get from insight to value by drinking whatever drink you may have thought of. (I hope it was something as healthy as sparkling water!) Insight without creating value is as useless as a drink forgotten in the cellar while you are dying of thirst in the desert.

Why do I consider it important to explain and illustrate the meaning of data and the role it plays? You have arrived at the first key message of this book:

Data Management is not “yet another discipline” in the ever-increasing complexity of how our organisations are organised. Instead, data is at the centre of almost everything we do in our business life.

The current “digital age” may be the first one that does not reduce data to its technical aspects (which was characteristic of the second half of the twentieth century). It will not be the last one, though. Expect it to be followed by future “digital ages”. We do not yet know what these digital ages are going to look like. But we can be sure that all of them will be centred around data. That is why you may want to see your organisation structurally prepared for the challenges of the digital ages to come.

1.2 Isn’t Data the Job of the IT Department?

“Data can exist in a variety of forms – as numbers or text on pieces of paper, as bits and bytes stored in electronic memory, or as facts stored in a person’s mind. […] Simply speaking, a ‘datum’ is a piece of information, and ‘data’ is the plural of datum.” (Webopedia - Data 2019)

This definition indicates that data is not necessarily related to computers. In fact, data had existed for centuries, long before the first computers were built, let alone the first IT departments were set up. And as early as 1891, Sir Arthur Conan Doyle had Sherlock Holmes state: “It is a capital mistake to theorize before one has data.” (Doyle 1892)

That said, it has to be admitted that computers have made the handling of data far easier, faster and less error-prone. But, and this is important, computers don’t define data, and they don’t give meaning to data. Instead, they are a technical means of processing data.

Equally, it should not be the task of an IT department to manage the content and meaning of data. It is for good reasons that “IT” stands for “Information Technology”, not for the information itself.

As we will see, it is a good idea for organisations to look beyond technology and software, and to have someone look at the content and guide them in turning data into information. But what should such a person be called? Unfortunately, the historical job title of the head of Information Technology became “CIO”, “Chief Information Officer”, while the correct title would probably have been “Chief Information Technology Officer”.


As a result, the person in charge of managing information cannot be called Chief Information Officer. As the second-best choice, the title Chief Data Officer has become established (although the abbreviation CDO sometimes collides with that of the Chief Digital Officer¹).

So, if the CIO and the CDO co-exist, what is the ideal relationship between them, or between IT and Data Management in general? The following delineation has turned out to be helpful:

• Data Management is about the business view on data. It is about the content, the structure, the logic and the quality of data. It deals with the business questions directed at data, and it provides the methods to answer those questions.
• IT is about the technology to process, store and transmit data. Beyond hardware and connectivity, it covers software and databases, but also standards for software development, data storage and data transmission as well as technical solutions for data protection, data compression, data encryption etc.

These definitions do not suggest any particular reporting lines, but it becomes obvious that CIO and CDO have complementary tasks.

Analogy

IT is for Data Management what the Mercedes Formula 1 team is for Lewis Hamilton. Neither can win alone. But they are strong together, and each of the two can turn the other into a winner.

¹ Throughout this book, CDO is used as the abbreviation for Chief Data Officer, for consistency reasons. But if you have a Chief Digital Officer or a Chief Development Officer, it is absolutely okay to use the acronym CDO.


1.3 Isn’t Data and Information the Same?

If IT is about technology, and if Data Management is about the business perspective, can we use data and information synonymously? Actually, this is more than an academic discussion. There are good reasons for us to distinguish between the two.

According to techtarget.com, data is “information converted into binary digital form”. Again, this is a very computer-centric view. Remember, data existed long before computers did.

This brings us back to the definition of data as the first stage and information as the second stage of the path from data to value. Let’s call this path the Data Value Chain. I invite you to follow me on a short journey along that chain.

A key aspect of a Data Office’s work is to turn (raw) data into “information”, thus making it far less abstract to, say, an export clerk. Information describes reality. While 6′3″ is data, the fact that I am 6′3″ tall is information.

But this information alone does not help much. It is only when information is put in relation to other information, e.g. to clothing sizes or the average human body height, that people understand the world better. This is what I call insight, the third stage of the Data Value Chain.

But insight alone doesn’t earn me any money. We need to monetise what we have obtained so far. If a garment organisation used the insight it has gained about me and others to offer customers adequately sized clothes, it would reach the fourth stage of the Data Value Chain: value.

All of these steps need to be a joint effort of the respective business functions, IT and the Data Office. But who is supposed to lead which step of the Data Value Chain? Have a look at Fig. 1.1, which illustrates the setup in a well-governed organisation.

Fig. 1.1  The Data Value Chain

Note that even the very first step from data to information requires business folks to think about what insight they could gain from it. This is why this step should not be done by the Data Office in isolation. In fact, business needs to get involved at every step.

So much for the theory. Now let’s have a look at a concrete business example of a Data Value Chain:

Example 1

Sensors in a delivery car can collect temperature, humidity, time, speed, location and fuel consumption. All of this is data: terabytes of records in databases.

Computers can put all of this data together to reconstruct the entire delivery tour. This is meaningful information.

Information can now be turned into insight: understanding whether the driver has driven efficiently and followed the rules, understanding whether the best route was chosen, finding out to what extent the average speed differs between different days of the week, and so on.

Value is generated if this insight is used to pre-plan delivery tours based on historical traffic conditions by road segment, day of week and time of day, or to determine which drivers need training in driving skills or driving discipline.
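The four stages of this example can be made tangible with a few lines of code. What follows is only a minimal Python sketch under stated assumptions: the sensor readings are invented, and the names `readings`, `to_information` and `to_insight` are hypothetical helpers chosen for illustration, not part of any real telematics system.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Stage 1, data: raw sensor readings (timestamp, speed in km/h).
# The values are invented for illustration.
readings = [
    ("2019-03-04T08:00", 32.0),  # a Monday
    ("2019-03-04T08:05", 28.0),
    ("2019-03-09T08:00", 45.0),  # a Saturday
    ("2019-03-09T08:05", 47.0),
]

def to_information(raw):
    """Stage 2, information: give each reading context (which weekday?)."""
    return [
        {"weekday": WEEKDAYS[datetime.fromisoformat(ts).weekday()],
         "speed": speed}
        for ts, speed in raw
    ]

def to_insight(info):
    """Stage 3, insight: relate readings to each other (average per weekday)."""
    by_day = defaultdict(list)
    for row in info:
        by_day[row["weekday"]].append(row["speed"])
    return {day: mean(speeds) for day, speeds in by_day.items()}

insight = to_insight(to_information(readings))

# Stage 4, value: act on the insight, e.g. allow more time for tours
# on the day with the lowest average speed.
slowest_day = min(insight, key=insight.get)
print(insight, slowest_day)
```

Note that the code itself produces no value: value only arises when somebody uses `slowest_day` to plan tours differently, which is a business decision rather than a computation.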


Again: if we named the Data Office after its output rather than its input, we would call it Information Office or Insight Office. While the former term has a different connotation, the latter hasn’t been observed so far.

To illustrate the steps of the Data Value Chain, let me draw an analogy with conventional, physical assets:

Analogy

• Data = raw material
• Information = product
• Insight = usage of product
• Value = what somebody gains by using the product

Extending this analogy, you can hardly get value directly from data, just as you cannot get quickly from A to B using crude oil: you need to refine it first, and then you need to fuel a plane and fly.

Takeaway

None of the steps of the Data Value Chain happens automatically. The entire Data Value Chain needs to be actively managed, in order to transform data into information and information into insight.

1.4 Isn’t Data Management About Analytics?

You may have noticed that people who talk about “data” often focus on “gaining insight from data”. They limit their data perspective to admittedly famous data disciplines such as Business Intelligence, Analytics, Artificial Intelligence etc. All of this is good and important. However, you need good and reliable data for these disciplines to be successful.


Fig. 1.2  The Data Supply Chain

That is why an organisation needs to manage the entire data journey from data acquisition to data consumption. That is why a Data Office should never be limited to Analytics.

So, what are the different areas, and how does a data-driven organisation address them? Let’s have a look at the entire journey of data, describing concrete data disciplines to get from the acquisition of data to the utilisation of data. Fig. 1.2 illustrates this view; a brief description and a checklist for each step follow. You can already see from the diagram that the usage of data, be it for Analytics, Business Process Automation or any other important discipline, is only the sixth of six steps of the data journey.

1.4.1 Manage Data Sources

A lot of data is created within an organisation every day. This includes data calculated by software, data entered manually by employees and data captured by sensors. However, data needs to be acquired from outside as well, for example external data about competitors or customers, externally defined data such as IATA codes, or volatile data such as currency exchange rates.


Your data checklist:
• Know where your data comes from. Know about its reliability and quality.
• Select the right data to collect and store (externally/internally).
• Maintain and document data lineage, i.e. its journey through the systems, including all modifications.
• Consider data privacy: Are you collecting data “just in case” (which GDPR does not allow)? Are you anonymising data wherever possible?

1.4.2 Validate Data on Entry

It is easier to ensure clean data by not letting bad data in than to clean contaminated data later. In other words, the best recipe for having clean data is to check for bad data on entry. This requires some investment, but it is the most cost-effective way of ensuring data quality as each piece of data only needs to be validated once.

Data ages, of course, and what once was correct may have become incorrect over time. Internal manipulation of data (aggregation, anonymisation, merging etc.) may lead to quality issues as well. But proper data quality logging helps determine the need for action in time. If all data is validated upon entry, you can define by data type how frequently it needs to be re-validated. You cannot do this if bad data can enter unnoticed.

Your data checklist: Do you…
• …perform strict validation of all entering data (internal and external), based on clearly defined rules and metadata?
• …maintain quality statuses of data records?
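Validation on entry with a stored quality status can be sketched as follows. This is only an illustration: the field names, the patterns in RULES and the QualityStatus structure are assumptions made for the example, not rules taken from any particular organisation.

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical metadata: one validation rule per field.
RULES = {
    "customer_id": {"required": True, "pattern": r"\d{6}"},
    "country": {"required": True, "pattern": r"[A-Z]{2}"},
    "email": {"required": False, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
}

@dataclass
class QualityStatus:
    valid: bool
    errors: list
    validated_on: date  # lets you schedule re-validation per data type

def validate_on_entry(record: dict, today: date) -> QualityStatus:
    """Check one incoming record against the shared rules."""
    errors = []
    for field, rule in RULES.items():
        value = record.get(field, "")
        if not value:
            if rule["required"]:
                errors.append(f"{field}: missing")
            continue
        if not re.fullmatch(rule["pattern"], value):
            errors.append(f"{field}: bad format")
    return QualityStatus(valid=not errors, errors=errors, validated_on=today)
```

Because every record carries the date of its last validation, a later job can pick out the records that are due for re-validation instead of re-checking everything.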

1.4.3 Classify Data

Whenever it is possible to give structure to data, it should be done. Unstructured data is difficult to use. But it can be structured to a certain extent. Scanned letters are a nice example: The image file can be turned into text through Optical Character Recognition (OCR), and an algorithm can subsequently extract keywords and tag the document accordingly. As a result, you would be able to see how many letters are actually complaints or claims, and which topics customers complain about most.

Structure is also helpful to measure the quality of data. If you know that a certain data field is meant to contain a postcode, you can apply a postcode validation routine.

None of this is rocket science, nor is it limited to business. The tagging of holiday photos is a typical example that more and more people are familiar with from their private life.

Your data checklist: Do you…
• …define and document all data structures, hierarchies and relationships (data model!)?
• …use well-defined terminology (glossary!)?
• …standardise data formats (calendar dates, monetary values, units of measure)?
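The keyword tagging of OCR output described above can be sketched in a few lines. The keyword lists in TAG_KEYWORDS are invented for illustration; a real glossary would come from the business functions, and the OCR step itself is assumed to have happened already.

```python
import re

# Hypothetical keyword lists per tag (assumption for this example).
TAG_KEYWORDS = {
    "complaint": {"complaint", "dissatisfied", "unacceptable"},
    "claim": {"claim", "refund", "compensation"},
    "delivery": {"delivery", "late", "damaged"},
}

def tag_document(text: str) -> list:
    """Return the sorted tags whose keywords occur in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(tag for tag, keywords in TAG_KEYWORDS.items() if words & keywords)
```

Counting the tags over all incoming letters then gives the structured view mentioned in the text: how many letters are claims, and which topics come up most often.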

1.4.4 Manage Data Quality

No matter how we are going to use and exploit our data, it requires a solid data foundation. This is active work. You cannot reap where you don’t sow. So, what does “sowing” mean in this context? It simply means that data needs to be taken care of during its entire life cycle, from creation or acquisition to consumption. Why is that? Because you cannot do anything useful with data that is of either bad or unknown quality.

Usage of data is one of the areas where a famous statement of Classical Logic applies:

From falsehood, anything follows (Latin: ex falso sequitur quodlibet)2

In other words: Data teams don’t need high data quality to produce colourful dashboards, KPIs with two digits after the decimal point, ambitious statements about correlations and causality etc. However, all of this lacks any significance if it rests on incorrect (because unmanaged) Masterdata, missing data governance, missing data quality management and the absence of a well-documented data model. It is therefore of virtually no value to the organisation.

Yes, all of us want the data to tell us what we haven’t known before. But we always need to remember that data insight is a long journey. It does not only start when a sophisticated algorithm turns a huge amount of data into an understanding of causalities and correlations.

But what may render the best usage of data useless?
• Wrong data
• Outdated data
• Incomplete data
• Ambiguous data definitions
• Correct data but with an entirely different meaning
• Unknown data quality (How do you know the result is correct?)

Imagine that strategic decisions are accidentally based on such a wrong outcome. As long as the output looks plausible, you cannot tell that there is a problem with the input. Based on incomplete market data, you may have decided to go for product A while full information would have told you that product B is the better choice. Product A may earn you some money as well, so you will never get to know that there would have been a better choice.

2 This is called the “Principle of Explosion”, often quoted in its Latin version. For a definition see (Rationalwiki 2019).

It even becomes severely dangerous where incorrect data is used on purpose:
• Your Head of Sales may manipulate the definition of “revenue” to suggest that the last Sales initiative was a great success. How could you tell?
• Or an executive responsible for one of your organisation’s business units may filter market research data in such a way that the acquisition of a local competitor in China, as suggested by him/her, looks great. How would you know?

The best way to avoid such a situation is an entirely neutral Data Office that does not get rewarded or penalised depending on the outcome of an assessment.

Your data checklist: Do you…
• …measure data quality?
• …understand data through profiling?
• …correct data values (misspellings)?
• …add missing data values?
• …find and deal with duplicate data?
• …determine and fix root causes of bad data quality?
• …raise awareness for the importance of data quality?
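Two of the checklist items, measuring quality and finding duplicates, can be illustrated with a very small profiling routine. It is a sketch under simplifying assumptions: records are plain dictionaries, completeness is the share of non-empty values per field, and duplicates are detected by comparing the chosen key fields after trimming and lower-casing.

```python
from collections import Counter

def profile(records: list, key_fields: tuple) -> dict:
    """Return completeness per field and the number of duplicate records."""
    if not records:
        return {"completeness": {}, "duplicates": 0}
    total = len(records)
    fields = sorted({field for record in records for field in record})
    completeness = {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / total
        for field in fields
    }
    # Normalise the key fields so that trivial spelling variants collide.
    keys = Counter(
        tuple(str(r.get(f, "")).strip().lower() for f in key_fields)
        for r in records
    )
    duplicates = sum(count - 1 for count in keys.values() if count > 1)
    return {"completeness": completeness, "duplicates": duplicates}
```

Running such a profile regularly and keeping the results turns “unknown data quality” into a measured trend; real duplicate detection usually needs fuzzier matching than this exact comparison.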

1.4.5 Do Data Housekeeping

It is easy to pile up data endlessly. And you often hear data folks suggest never to throw away any data. They claim that we may come to use it later, and that storage is becoming cheaper and cheaper. But do you really benefit from more and more data? Old data may be outdated. It may blur the view on critical current data: you may miss the forest for the trees. And, of course, more and more data incurs more and more cost. This is not only about the storage costs but about the costs of keeping all the data under control. It easily becomes like herding cats.

Last but not least, you have legal obligations to delete (not just archive) old data as far as it is personal data.

Your data checklist: Do you…
• …remove (or archive) any data that is outdated or no longer required? (This requires strict governance. Data has the habit of staying by default.)
• …consider data privacy aspects by all means? (High-level policies are important but they need to get translated into concrete rules and processes.)
• …protect all of your data? (The more data you have, the more difficult it becomes to detect data leaks early.)
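Translating a high-level housekeeping policy into a concrete rule could look like the sketch below. The categories and retention periods in RETENTION_DAYS are purely hypothetical placeholders, not legal guidance; the only point taken from the text is that outdated personal data is deleted while other outdated data may be archived.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days per data category (assumption).
RETENTION_DAYS = {"personal": 365, "transactional": 3650}

def housekeeping_action(category: str, last_needed: date, today: date) -> str:
    """Decide what to do with a data set that was last needed on a given date."""
    limit = RETENTION_DAYS.get(category)
    if limit is None:
        return "review"  # no rule defined: flag it instead of keeping it silently
    if today - last_needed <= timedelta(days=limit):
        return "keep"
    # Outdated personal data must be deleted, not just archived.
    return "delete" if category == "personal" else "archive"
```

The "review" branch reflects the remark that data has the habit of staying by default: anything without an explicit rule is surfaced rather than kept unnoticed.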

1.4.6 Use Your Data: Analytics

This is what many people consider the key area of data management. But as we have seen, a lot of steps are necessary before you are able to gain insight from data. If, however, the foundation is laid properly, Analytics can give you the competitive edge.

Using data is like using tools: you start with what you want to achieve before selecting your tools. Technology-driven organisations, notably those with a strong IT department, tend to come up with answers before they start searching for the right questions. A Chief Data Officer’s obligation is to ensure “problems” come before “solutions”.

It is definitely good for a savvy IT department to stay abreast of technology trends and to keep the business teams up to date on the newest possibilities in the area of data handling. However, even the fanciest, trendiest and most performant solution cannot be justified if there is no business opportunity.

Let’s assume we are ready for Analytics in this sense. Isn’t there a wide variety of methods and approaches? Yes, there is. But as the area matures, certain categories are becoming established, which makes this topic easier to understand.

Let’s first look at some terminology, without a scientific attitude (however you define a term, someone will challenge you):

• “Analytics” is the generic term for all efforts to get from data to insight.
• The term “Business Intelligence” (BI) is as old as the first attempts to use computers to gain insight. While there is no clear delineation, you could consider BI as standing for the subset of Analytics that was already available around 2010, such as consolidating and aggregating (relatively) big amounts of data, statistically analysing data, and generating reports (including real-time reports). If you want to be on the safe side (and if you don’t want to sound old-fashioned), don’t use the term BI and always refer to Analytics, which covers it all.
• “Data Science” is a fascinating data discipline. If you consider regular reporting and forecasts as mass production, data science is craftsmanship. Data Scientists are the most highly skilled (and rare!) data people around as they need to have a broad knowledge of formulas and algorithms. They invest a lot of brainwork before they finally tell the computer how to analyse huge amounts of data.
• “Artificial Intelligence” (AI) has become a buzzword, often used by people who expect computers to replace human beings sooner or later. The term recognises the fact that, traditionally, human problems have always needed to get somewhat “pre-processed” for “non-intelligent” computers to be able to deal with them. AI aims at enabling computers to understand (and solve) human problems (more or less) right away. In fact, AI is not a method that someone implements and executes. It rather stands for the aspiration to make computers think similarly to human brains. This aspiration has led to the development of different methods throughout the past decades. That is why I don’t use this term for concrete approaches. It is a useful term, though, to distinguish modern data-handling concepts from conventional concepts where the entire translation from a business problem into a computer program has to be done by human experts (if possible at all).
• “Machine Learning” (ML) is a key method in Data Science: A computer interprets huge numbers of data sets based on a significant number of already classified data sets (from which it “learns”). It checks a data set against a hypothesis and returns the probability of that hypothesis being true. Well-known examples of ML are image classification and optical character recognition (OCR): ML would check whether an image shows a cat, or the letter “Z”. As cases in ML can be tremendously complex (a photo usually consists of millions of dots), ML decomposes them into smaller, recognisable chunks. It then determines the probability of these chunks matching their counterparts in the hypothesis.

Secondly, let’s look at a commonly accepted set of different types of Analytics:

1.4.6.1 Descriptive Analytics

Descriptive Analytics tries to find out what is going on. Example: How good is the quality of product X (compared to the target or to previous months), and to what degree do different production plants contribute to possible quality issues?

You find this both in batch processes (using big chunks of historical data) and in real-time processes (measuring along the way, data set by data set, and possibly sending alerts where thresholds are exceeded).

Forecasting is not the main purpose of Descriptive Analytics. But it allows you to answer the question “What is going to happen if nothing changes?”

1.4.6.2 Diagnostic Analytics

Diagnostic Analytics goes a step further than Descriptive Analytics. It asks “Why is it happening?”

Of course, computers cannot easily analyse technical cause-and-effect relationships unless you feed them with all relevant physical, chemical and biological laws. Instead, Diagnostic Analytics will search for correlation between events, using methods of Statistics.

Note that it is still usually the human interpreter who needs to tell whether there is causality or just correlation. This is the difference between “B always happens when A happens” and “A causes B”. Even if statistical models suggest a less-than-1 per cent chance that the observed joint occurrence of A and B is pure coincidence, they cannot tell whether A causes B or B causes A, or whether there is perhaps even a C that always causes both A and B.

This is where today’s trendy methodologies around the magic word “agile” may not really help you. Instead, the good old Six Sigma methodology (and mentality!) comes into play. It aims at always fixing the root cause rather than curing the symptoms. That’s why Six Sigma demands proper root cause analysis, suggesting that you ask “Why?” five times to get to the bottom of it. Diagnostic Analytics is a great support for Six Sigma to reach this objective. It allows for fact-based answers to the right questions.
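The “hidden C” case is easy to reproduce with made-up numbers. In the sketch below, a hypothetical temperature series (C) drives both ice cream sales (A) and sunscreen sales (B); the figures and the linear relationships are invented for illustration only.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

# Hypothetical data: C causes A and C causes B; A and B never touch each other.
temperature = [12, 15, 18, 22, 25, 29, 31]           # C
ice_cream = [2 * t + 5 for t in temperature]         # A
sunscreen = [3 * t - 10 for t in temperature]        # B

r = pearson(ice_cream, sunscreen)  # very close to 1.0
```

The coefficient comes out at practically 1, although neither product influences the other. The statistic is correct; the causal reading of it still has to come from a human who knows about the temperature.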

1.4.6.3 Predictive Analytics

Predictive Analytics focuses on forecasting. It asks “What can we expect to happen in future?” The art of Predictive Analytics is a smart way of extrapolating from the past. You’d apply algorithms to consider other parameters that may have an impact, and you’d incorporate any causality you may have found earlier. This is where Machine Learning unfolds its tremendous opportunities. Furthermore, external data can help create a broader basis than just looking at your own organisation.

Note that Predictive Analytics does not necessarily have to ask WHY. It may forecast a change in consumer behaviour in a surprisingly accurate way, without understanding the mechanisms, i.e. what makes consumers change their minds.

Predictive Analytics either asks “What is going to happen overall/on average?” or “What is going to happen in each concrete case, and with which probability?”. An example of the former could be the development of an organisation’s revenue if a product line is split into a standard product line and a luxury product line. An example of the latter would be a reliability rating of suppliers where each supplier gets an individual score that helps you decide which supplier not to select.

1.4.6.4 Prescriptive Analytics

Here we expect a machine to come up with proposals itself, finally suggesting the “best” option. This may be the supreme discipline of analytics, and it is by far the most difficult one to run. Note that the weighting of target criteria still needs to be provided by humans, usually based on an organisation’s value proposition.

Example 2

Prescriptive Analytics could recommend whether or not to violate a trade embargo, based on how much money you would earn if successful, the fine you’d have to pay if you get caught and the probability of the latter. However, you may wish to protect your organisation’s reputation as always being compliant. In this case, you’d NOT go for the deal even if Analytics told you that you’d win on average. For you to be able to take a well-informed business decision, you may wish for this type of Analytics to also tell you WHY the recommended option is the best one. You will have to insist on it: it is not a mandatory part of the core concept!
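The arithmetic behind Example 2 is a plain expected-value calculation, and the human-provided criterion can be modelled as a constraint that overrides it. The figures below are invented, and the function names are hypothetical; this is a sketch of the reasoning, not a recommendation engine.

```python
def expected_gain(profit: float, fine: float, p_caught: float) -> float:
    """Average outcome of the deal: profit if undetected, fine if caught."""
    return (1 - p_caught) * profit - p_caught * fine

def recommend(profit: float, fine: float, p_caught: float,
              must_stay_compliant: bool) -> str:
    """Return the recommendation together with the reason (the WHY)."""
    if must_stay_compliant:
        # Human-set target criterion: reputation outweighs the average payoff.
        return "decline: compliance is a hard constraint"
    gain = expected_gain(profit, fine, p_caught)
    verdict = "accept" if gain > 0 else "decline"
    return f"{verdict}: expected gain {gain:.0f}"
```

With a profit of 1,000,000, a fine of 3,000,000 and a 20 per cent chance of getting caught, the deal “wins on average” by about 200,000, yet the compliant organisation still declines. Returning the reason alongside the verdict is one simple way of insisting on the WHY.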

1.4.6.5 Future Development

Even the best approach is still far away from the human brain’s ability to think. ML tried to mimic the brain’s network of neurons, but it is far away from our brain’s ability to analyse and interpret patterns. On the other hand, the human brain is somewhat limited in its ability to process structured mass data and to see patterns behind the values in columns of a database.

But research is busy trying to complement existing Analytics capabilities with human ways of thinking, by observing and mimicking the operating mode of our brains. It is definitely worthwhile following the progress closely!

Your data checklist: Do you…
• …start Analytics with business needs instead of technology?
• …always base Analytics on quality-controlled data? (avoid the garbage-in/garbage-out effect)
• …actively follow the development of data science?

1.4.7 Use Your Data: Operational Use

Data is not only relevant for Analytics. Most organisations also need data to steer their daily business, e.g. their production or distribution processes. You need data to ensure proper operations in such diverse cases as:
• goods move in the right direction
• production is stopped in case of statistically significant deviations from the quality target
• customers are contacted in the most adequate way

And unlike many activities in Analytics or Business Intelligence, a failure in this area usually has an immediate impact on your daily business. While Analytics is about learning for the future based on past or current cases, operational usage tells us how to deal with exactly the cases we are looking at.

Your data checklist: Do you…

• …have the operational use of data in scope of the Data Office?
• …measure the impact of bad data quality on the quality of your operational processes?
• …ensure operationally used data is regularly measured and improved?
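One of the operational cases mentioned in this section, stopping production on a statistically significant deviation, can be sketched with a simple control limit. The three-sigma threshold is an assumption chosen for the example (a common convention in quality control), and the measurements are invented.

```python
from statistics import mean, stdev

def should_stop(baseline: list, new_value: float, sigmas: float = 3.0) -> bool:
    """True if the new measurement lies outside the control limits
    derived from an in-control baseline (needs at least two values)."""
    centre, spread = mean(baseline), stdev(baseline)
    return abs(new_value - centre) > sigmas * spread

# Hypothetical measurements taken while the process was known to be fine.
baseline = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0]
```

Here `should_stop(baseline, 10.3)` stays within the limits while `should_stop(baseline, 10.6)` triggers a stop. The rule is only as good as the baseline: if bad measurements slipped into it unnoticed, the limits are wrong and so is every decision based on them.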

1.4.8 The “Next Big Thing”?

Many emerging technologies keep filling the headlines, and more and more opportunities are brought to light as people focus on developing these technologies. Let’s have a brief look at three of these emerging technologies:

1.4.8.1 Blockchain

Blockchain technology originally became famous through Bitcoin, the digital currency, but more and more use cases for it are being developed. Its decentralised nature and its structural protection against undiscovered changes make it a great candidate for any type of “contract” between different parties3. Even if there is no reason for any such parties to trust each other, they do not require a dedicated third party as a referee. In fact, the term “distributed ledger” already suggests that a huge community of parties serves as the referee. The blockchain cannot prevent fraud from happening. But it makes unauthorised changes visible.

3 The newest prominent member of the family of blockchain currencies is possibly Libra, driven by Facebook. While it does not fulfil all blockchain criteria, its added value is clearly visible: One third of the world’s population doesn’t have a bank account, but half of these have a smartphone. Libra intends to enable them to digitally perform financial transactions for which a bank account is otherwise indispensable.
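The “structural protection against undiscovered changes” mentioned in the Blockchain section rests on chaining cryptographic hashes: each block stores the hash of its predecessor. The sketch below shows only this tamper-evidence. It deliberately leaves out the distribution across many parties and any consensus mechanism, and the block layout is an assumption made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first block

def block_hash(index: int, data: str, prev_hash: str) -> str:
    payload = json.dumps([index, data, prev_hash]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def build_chain(entries: list) -> list:
    chain, prev = [], GENESIS
    for index, data in enumerate(entries):
        current = block_hash(index, data, prev)
        chain.append({"index": index, "data": data, "prev": prev, "hash": current})
        prev = current
    return chain

def first_tampered_block(chain: list):
    """Return the index of the first inconsistent block, or None."""
    prev = GENESIS
    for block in chain:
        expected = block_hash(block["index"], block["data"], prev)
        if block["prev"] != prev or block["hash"] != expected:
            return block["index"]
        prev = block["hash"]
    return None
```

Changing the content of any block makes its stored hash inconsistent, and recomputing that hash would break the link to the next block. The change is not prevented, but it becomes visible to everyone who holds a copy.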

1.4.8.2 Internet of Things

We have learned from the Internet that any two computers on earth can communicate with each other as long as they are connected to the same physical network, and as long as they follow the Internet standards, including the unique identity of each computer. The key progress that came with the Internet was that software developers no longer need to agree bilaterally on how their computers communicate.

Technically, such computers do not need human beings either to decide what to communicate with which other devices. A predefined set of rules is sufficient for devices to communicate with each other.

Why would this be relevant? Imagine today’s situation in our private lives: We have created thousands of devices that are supposed to help us in our daily lives, from lamps to dishwashing machines or electric garage door openers. But we still need the human being in the centre to bridge the gap between them: The thermometer (or the weather forecast) tells a human being that it is getting colder and this human being decides to switch on the heating.

Do we really need human beings in-between? What if all these devices could talk directly to each other? What if we weren’t even required to find out that it is getting dark or that the refrigerator needs to be defrosted?

That’s where sensors and actuators come into play: a device measures something (“sensor”) and informs another device about it. Depending on the rules, the other device is going to do something (“actuator”) in response. You end up having dialogues between devices resulting in actions. All of this used to take place exclusively between human beings a few decades back. Humans can take a step back and leave the Internet communication to devices: The “Internet of Things”.

I have chosen the household example because most people have seen or used it for quite a while. With a bit of creativity, it is easy to imagine that this does not only work in our households.

In fact, any cause-effect relationship that can be described by rules is a potential candidate for this technology. The most prominent example may have become road traffic where each car becomes a “thing”, and where all of the information gathered by each car is transmitted to other “things” which may report a traffic jam or open a separate lane for the traffic. Add rain sensors as additional “things” which cause a digital traffic system to activate a speed limit in case of heavy rain, and you see how many different “things” can interact autonomously where humans used to be indispensable just a few years back.
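The sensor, rule and actuator pattern described for the Internet of Things can be sketched as a small rule table. All sensor names, thresholds and commands below are hypothetical; real devices would exchange such messages over a network rather than through a function call.

```python
# Hypothetical rule table: (sensor, condition, actuator, command).
RULES = [
    ("rain_mm_per_h", lambda value: value >= 10, "speed_sign", "limit_80"),
    ("outdoor_temp_c", lambda value: value < 15, "heating", "on"),
    ("light_lux", lambda value: value < 50, "lamp", "on"),
]

def react(readings: dict) -> list:
    """Turn the current sensor readings into actuator commands."""
    actions = []
    for sensor, condition, actuator, command in RULES:
        if sensor in readings and condition(readings[sensor]):
            actions.append((actuator, command))
    return actions
```

For example, `react({"rain_mm_per_h": 12, "outdoor_temp_c": 20})` yields only the speed limit command. No human is involved at run time, which is exactly why the rules and the sensor data behind them need to be managed carefully.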

1.4.8.3 Robotic Process Automation

Even the first generation of computers was able to do tasks that humans had to do before. Remember the departments big organisations had until 60 years ago where dozens of employees were doing nothing but calculations, a job taken over by pocket calculators in the ’80s.

But many jobs remained untouched as computers were considered unable to replace humans. Well, the abilities of humans haven’t changed too much, but those of computers have. First, the hardware became more and more powerful, then the algorithms developed as well. As a result, jobs that can be described by rules can usually be automated today.

Modern algorithms can even determine from a customer email that it contains a claim, that it is written in a friendly tone and that it contains a deadline and the claimed amount. If a piece of software can identify both the customer and the underlying transaction, and if the rules are well-defined, a system can decide to accept or reject the claim.

This is often implemented as a hybrid solution where a computer decides if the probability of correct interpretation of the email is above, say, 90 per cent, or if the amount in question is below a certain threshold. In all other cases, the email is forwarded to an employee for human judgement4. This allows for a significantly reduced workforce while still providing high-quality responses. And the customer usually cannot tell the difference.

4 This is often referred to as “augmented intelligence”.

What do these three “next big things” have in common? They need good data to work. That is why these topics should not be competing with data management for attention and resources. Instead, properly managed data is a precondition for each of these technologies to succeed. So, to be prepared, set up a strong data foundation!
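The hybrid claim handling described under Robotic Process Automation boils down to a routing rule. The sketch below follows the condition as stated in the text (interpretation probability above 90 per cent, or amount below a threshold); the threshold of 50 and the function name are assumptions made for illustration.

```python
def route_claim(confidence: float, amount: float,
                min_confidence: float = 0.90,
                small_amount: float = 50.0) -> str:
    """Decide who handles a claim email.

    confidence: estimated probability that the email was interpreted correctly
    amount:     claimed amount in the organisation's currency
    """
    if confidence > min_confidence or amount < small_amount:
        return "automated"
    return "human"  # everything else goes to an employee for judgement
```

So `route_claim(0.95, 1000)` and `route_claim(0.60, 20)` are handled automatically, while `route_claim(0.60, 1000)` is forwarded. Both inputs depend on data: a wrong customer record or transaction amount leads to a confident but wrong automated decision.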

1.4.9 Summary

Data exploitation is a bit like the viewing platform that you find on top of some skyscrapers: You cannot just build it into the air. You need to erect the entire building first! Otherwise, your viewing platform will become a castle in the air (pun intended!). And what do you need to do before you pile up the storeys? Yes, you need to lay a solid foundation. The higher you want your building to become, the broader and deeper the foundation has to be!

Both data operations and data analytics are dangerous if based on non-managed data. If you don’t have your data managed properly before using it, you will obtain erratic or wrong results. The trouble, particularly in predictive analytics, is that you cannot tell from the results how correct they are.

Analytics based on uncontrolled data is like a chef cooking with bad raw food! Sometimes you can smell the problem in time, but you may also end up in hospital, suffering from food poisoning. Certain types of food would not even have an immediate impact on your well-being. Yet they have the potential to shorten your life by a number of years. You may equally well base decisions on bad data for a long time without your organisation going bankrupt. But you will probably be wasting opportunities.

Your data checklist: Do you…

• …and all business stakeholders know the opportunities, both from a data perspective and a technology perspective?
• …have each initiative start with a problem statement confirmed by a business function?
• …make it mandatory to base all usage of data on organisation-wide agreed rules?5
• …have the governance to ensure the best possible use of data, ranging from operational usage to Analytics?

1.5 What Else Does Data Management Deal with?

Data Management consists of a lot of “enabler” activities without which the best Data Analytics and Data Science approaches cannot succeed. This chapter will give you a brief overview, without claiming to be exhaustive.

All of this shows that a CDO who is tasked with looking at Analytics alone will most probably fail. Too much can (and will) go wrong before data finally gets used in Analytics. It is indispensable for a Data Office to have the full data supply chain under control. Without all the necessary data management steps before, Analytics will tell lies.

1.5.1 Reliable Sources

We need to ensure we know where our data comes from, be it internal or external data, reference data or Big Data, structured or unstructured data. Data does not always need to be perfect, but we need to know to which degree it is.

5 Remember, the best data is no good if it is interpreted incorrectly. See Fig. 1.1.

1.5.2 Proper Definition of Data

Just like a technical device or a complex product, data needs to be described thoroughly. This applies to the definition of data terms (glossary) as well as to the logic of, and relationships between, different data elements (data model).

Example 3

It is not only important to understand the definition of a customer and of an account number—you also need to know whether one customer can have more than one account number, or whether one account number can be shared by multiple customers. This sounds easy to you? Then think of customers having independent accounts with your organisation in different countries, of different customers belonging to the same organisation, of one customer owning 50 per cent of another customer. Does your organisation have unambiguous rules for all of these cases? Or do you allow a global customer to always order through the one account for which the best conditions were negotiated—because your sales reps were not aware that all of these accounts belong to the same organisation…?
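The relationship questions raised in Example 3 become answerable once the customer hierarchy is stored explicitly. The sketch below uses invented customers and accounts and assumes a simple, cycle-free parent relationship; partial ownership (such as the 50 per cent case) would need a richer model.

```python
# Hypothetical Masterdata: customer -> parent organisation (None = top level).
PARENT = {
    "ACME-DE": "ACME-GLOBAL",
    "ACME-FR": "ACME-GLOBAL",
    "ACME-GLOBAL": None,
    "BOLT-UK": None,
}
# Hypothetical accounts: account number -> owning customer.
ACCOUNT_OWNER = {"100": "ACME-DE", "200": "ACME-FR", "300": "BOLT-UK"}

def ultimate_parent(customer: str) -> str:
    """Walk up the hierarchy (assumed to contain no cycles)."""
    while PARENT.get(customer) is not None:
        customer = PARENT[customer]
    return customer

def accounts_of_group(customer: str) -> list:
    """All account numbers that belong to the same organisation."""
    root = ultimate_parent(customer)
    return sorted(
        account for account, owner in ACCOUNT_OWNER.items()
        if ultimate_parent(owner) == root
    )
```

With this in place, a sales rep looking at account 100 can see that account 200 belongs to the same organisation, and whether ordering through either account is allowed becomes an explicit business rule instead of an accident.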

Example 4

Think of the many different ways of determining the exchange rate between two currencies: Who determines the rate? Is it a daily rate, or does it change continuously? Does it include surcharges? In which cases do you apply today’s exchange rate to transactions of the past?

Example 5

Does revenue include internal revenue (e.g. if a printer manufacturer buys its own printers for its own printing purposes)? When do you attribute revenue to the month of closing the deal, and in which cases do you consider the moment of payment?

1.5.3 Documentation

There is no value in developing a great data model and data rules if they don’t get documented and shared. And without a proper documentation policy, it will probably not happen.

Next to the rules and standards as such, their implementation status needs to be documented as well. This is to always know whether we just have nice standards, or whether they are being applied. Gaps require a systematic follow-up.

1.5.4 Active Data Quality Management

One of the key responsibilities of a Data Office is to ensure all data we use has the highest possible quality. Furthermore, it needs to check whether improvement initiatives have a measurable effect on data quality (measuring quality delta). This includes plausibility checks, validation, root cause analyses etc.

1.5.5 Data Ownership

Data structure needs to follow business logic. Business owners need to ensure all business needs are covered. But does each data area have one single, ultimately accountable business owner? Without a strong Data Office, we often observe the situation where everybody wants to have a say, but nobody wants to take ownership. And without ownership, things don’t happen.

1.5.6 Masterdata Management (MDM)

Masterdata exists, whether we manage it or not. The quality of Masterdata, however, depends strongly on how well it is managed. MDM defines the authoritative sources (where do we find the definitive list of products, locations etc.?).

It also answers data maintenance questions, like: Who is allowed to modify data, who has to review a change, whose approval is required?

Masterdata may be the most underestimated aspect of the entire data portfolio. The tendency of many organisations to narrow down data management to Analytics comes with tremendous risks:
• Critical information about our customers and suppliers is part of Masterdata. Whenever we get it wrong, we may lose revenue or jeopardise business success.
• Many operational processes are steered by Masterdata. This applies to all key activities of an organisation, such as production processes, finance processes, customer service processes or even customers’ browsing on an organisation’s website.
• Process outsourcing depends on the ability of third-party suppliers’ staff to execute processes without long experience. In order to always do the right things, they need correct Masterdata, both for direct lookups by humans and to make their IT tools take the right decisions.
• RPA (Robotic Process Automation) cannot succeed at all without solid Masterdata to base processes on. Wrong Masterdata usually means wrong automated decisions. A typical example is the credit rating of customers which is used by software to decide about offered payment methods: Accidentally offering purchase on account to customers with a high credit risk score can directly impact an organisation’s bottom line.
• Eventually, all Analytics and Machine Learning activity is based on Masterdata. If the attributes of our products are not maintained properly, how can we find out the best combination of attributes for our markets?

This is why a lot of attention needs to be paid to Masterdata, in terms of quality, completeness, timeliness, availability:

1.5.6.1 Ensure Masterdata is Validated upon Entry

Masterdata is often captured by employees based on telephone calls or completed paper forms. Is the data entry validated, and if so, against which rules? Do all systems validate against the same metadata (length of entry, alphanumeric or numeric-only, whether the field may stay blank, checksum, etc.)? Does the system do plausibility checks (e.g. to detect a data entry clerk who always enters the syntactically valid dummy VAT number GB999999973 if the true number is unknown)?

1.5.6.2 Regularly Execute Plausibility Checks

Ageing leads to wrong records. A full list of records may be implausible even if every single record is valid (e.g. if each customer has the same postcode). Such cases may stay undetected as all systems accept them. But using them will cause damage. So, just as you have your car checked regularly, or as you (hopefully) go to the dentist every year, your data needs a regular health check.

1.5.6.3 Plan for Scheduled Data Cleansing

Even data records passing all plausibility checks may be wrong. Even worse, you wouldn’t recognise them as wrong in most cases. That is why it is important to regularly go beyond plausibility checks and do data cleansing against the most recent data and parameters, in many cases against external data. Typical cases are official lists of codes that are subject to changes, e.g. Commodity Codes for Customs purposes6. If you work with an outdated list here, you can easily come into conflict with the law.

6 A nice brief description is provided by the Finnish customs service provider Tulli (tulli.fi 2019): “Customs needs the commodity code for the import goods e.g. in order to calculate the correct amount of duty and taxes. Finding the correct commodity code is important also for you, so that you will pay the correct amount and type of duties, taxes and other payments for the goods. As importer, you are responsible for classifying the goods correctly.”
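The difference between validation upon entry (1.5.6.1) and list-level plausibility checks (1.5.6.2) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the length limit, the 50% threshold, and the checksum, which follows the classic modulus-97 scheme for nine-digit GB VAT numbers as commonly described. A real implementation would take its rules from shared metadata rather than hard-coding them.

```python
import re
from collections import Counter

# Hypothetical field rules; real metadata would come from a shared repository.
POSTCODE_MAX_LEN = 10
VAT_PATTERN = re.compile(r"GB\d{9}")


def vat_checksum_ok(vat: str) -> bool:
    """Check a nine-digit GB VAT number with the classic modulus-97 scheme (assumed)."""
    if not VAT_PATTERN.fullmatch(vat):
        return False
    digits = [int(c) for c in vat[2:]]
    # Weights 8, 7, ..., 2 apply to the first seven digits.
    weighted = sum(d * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = digits[7] * 10 + digits[8]
    return (weighted + check) % 97 == 0


def validate_entry(record: dict) -> list[str]:
    """Return rule violations for one record (an empty list means it is accepted)."""
    errors = []
    postcode = record.get("postcode", "")
    if not postcode:
        errors.append("postcode must not be blank")
    elif len(postcode) > POSTCODE_MAX_LEN:
        errors.append("postcode too long")
    if not vat_checksum_ok(record.get("vat", "")):
        errors.append("VAT number fails format or checksum")
    return errors


def suspicious_values(records: list[dict], field: str, max_share: float = 0.5) -> list[str]:
    """List-level plausibility check: values that occur implausibly often."""
    counts = Counter(r[field] for r in records)
    return [v for v, n in counts.items() if n / len(records) > max_share]


# Invented sample data: three different customers, one dummy VAT number.
records = [
    {"postcode": "AB1 2CD", "vat": "GB999999973"},
    {"postcode": "EF3 4GH", "vat": "GB999999973"},
    {"postcode": "IJ5 6KL", "vat": "GB999999973"},
]
```

Each of the three sample records passes `validate_entry`, because the dummy number satisfies the checksum. Only `suspicious_values(records, "vat")` reveals that the same number has been entered for every customer, which is exactly the kind of issue a regular health check is meant to catch.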


1.5.6.4 Plan and Maintain a Masterdata Lifecycle

Different kinds of Masterdata age at different speeds, costs of cleansing differ, and the impact of outdated information depends on data and usage. As a consequence, Masterdata requires a lifecycle plan, to ensure each type of Masterdata is treated individually.

1.5.6.5 Make Masterdata Available

Even the best-maintained Masterdata is of no value if it is not used systematically by all processes. If this usage is not supported by the right technical solutions and not enforced by adequate policies, applications and processes will stay in their own silos, maintain their own Masterdata or even have such data hard-wired in their logic.

If you had to start with Data Management right now, Masterdata would be a better place to start than Analytics. It is the data that keeps the engine running. And this is not only about operating under the status quo: well-managed Masterdata gives you the agility to change direction, basically overnight.

Example 6

Imagine you intend to change the geographical structure of your organisation: A few country organisations are to be shifted from one region to another, and two regions are to be merged. What needs to be done to make it happen on the technical side? Analyse all IT applications and databases? Set up a change project to update the new geographical setup everywhere? Move customer data from one regional system to another? Schedule a phased migration approach, carefully managing the transition period with its inconsistencies between applications?

With well-organised Masterdata, you would not need to do any of this. Once the concrete date for the change is set, a Data Steward would do two things: (1) re-configure the list of regional organisations in a reference data solution, and (2) re-allocate countries to regions. The system would ensure that these changes become effective on the planned day of the change. That’s it. Magic? No, this is just modern Masterdata Management.

How does this work? In a properly architected environment based on Service-Oriented Architecture, all software applications would always take any currently valid Masterdata directly from the respective repositories, on demand, through web services. In other words, whenever the modification of Masterdata does not reflect a change to the overall business logic (and it does not in the example above), nothing needs to be done beyond updating reference data in the respective system(s).

Ten years ago, all of this required complex software development, which made many organisations postpone the introduction of adequate Masterdata management. Today, MDM software has matured sufficiently to allow organisations to opt for an off-the-shelf solution (or even a SaaS offering) and to configure it according to their needs. The assessment and planning as described above, however, need to be done using an organisation’s in-house expertise. The Data Office would usually run this as a project, together with IT and the business owner of the respective Masterdata.
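To make the mechanism behind Example 6 more tangible, here is a minimal Python sketch of effective-dated reference data. The class `RegionReference`, its methods and the sample allocations are hypothetical and not taken from any specific MDM product. In a real environment the lookup would sit behind a web service rather than an in-memory dictionary.

```python
from datetime import date


class RegionReference:
    """Hypothetical reference data store: country-to-region allocations with validity dates."""

    def __init__(self) -> None:
        # country code -> list of (valid_from, region), kept sorted by date
        self._allocations: dict[str, list[tuple[date, str]]] = {}

    def allocate(self, country: str, region: str, valid_from: date) -> None:
        """Record that `country` belongs to `region` from `valid_from` onwards."""
        entries = self._allocations.setdefault(country, [])
        entries.append((valid_from, region))
        entries.sort()

    def region_of(self, country: str, on: date) -> str:
        """Return the region that is valid for `country` on the given day."""
        valid = [region for start, region in self._allocations.get(country, [])
                 if start <= on]
        if not valid:
            raise LookupError(f"no region allocated to {country} on {on}")
        return valid[-1]  # the latest allocation that has already started


ref = RegionReference()
ref.allocate("PT", "Southern Europe", date(2015, 1, 1))
# The Data Steward enters the planned change ahead of time.
ref.allocate("PT", "Western Europe", date(2020, 7, 1))
```

An application that calls `ref.region_of("PT", today)` on demand gets "Southern Europe" up to 30 June 2020 and "Western Europe" from 1 July 2020 onwards, without any change to the application itself.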

1.5.7 External Data

Internally gathered, entered or captured transactional data helps understand the past. It has limited power to shed light on the future. And it is not suited to comparing your performance with that of your competition. That is where external data plays an increasingly important role.

You can buy data, but you can also collect data for free, or you can ask a specialist company to do it for you. Unless your organisation is big enough to have a dedicated department for external data intelligence, it is more efficient to rely on the expertise of providers that handle public data as their main profession.

What can you do with external data?

• You can find out whether the demand for a certain product category goes up or down.
• You can determine whether any of your competitors is preparing a new product line.
• You can measure the true customer satisfaction for any of your product or service offerings.
• You can measure customer loyalty, in comparison with your competition.

All of this requires nothing but access to publicly available information plus the right algorithms.

1.5.8 Data Sharing Across Organisations

Just as it is beneficial to share data between departments, you can increase the value of data by sharing it with other organisations. The two key models of data sharing are:

• Win/win situations: Data for data
• Unidirectional sharing of data: Data for money

Could it even make sense to share data with competitors? Yes, good models exist for the sharing of data between direct competitors. After all, everybody gains. And in many cases, the entire industry becomes stronger against substitution by other industries. Think of a number of organisations that update each other on cleansed customer data. None of them loses anything. And the entire industry may improve customer experience.


Sometimes organisations gain data as a side product of their core business. This data can be shared effectively. Think of a taxi organisation where each taxi gathers traffic data. The gathered information can be worth real money.

There are different ways of sharing data.

• You can use platforms between peers or independent organisations, or you can share data along a supply chain. The latter often involves customer/supplier relationships. Point-to-point sharing is possible, as well.
• You can use a third-party data hub, or you can use Blockchain technology to be independent of any third party. Or products of different suppliers communicate with each other at a customer’s site (see footnote 7).

In any of these cases, you need standardised data structures so that the systems are compatible with each other.

1.5.9 Data Flow Definition and Documentation

Data gets modified, aggregated, filtered etc. If we don’t know exactly to what extent data has been pre-processed, we may arrive at wrong results even though the data sources contain correct data. All of this needs to be managed, documented and made available to all possible consumers of data.

Don’t expect the various people involved in creating and processing this cacophony of data sources to do this in a coordinated way. You need a central Data Office to turn this into an orchestrated and well-documented flow of data.
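As a rough illustration of what “documented pre-processing” can mean in practice, the following Python sketch carries a log of every filter and aggregation step along with the data. The class `TrackedData` and the sample orders are invented for this purpose. Real data flow documentation would normally live in a dedicated lineage or catalogue tool.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedData:
    """Rows plus a log of every pre-processing step applied to them (illustrative only)."""
    rows: list[dict]
    lineage: list[str] = field(default_factory=list)

    def filtered(self, description: str, keep) -> "TrackedData":
        """Apply a filter and record it, so consumers can see what was removed."""
        kept = [r for r in self.rows if keep(r)]
        note = f"filter: {description} ({len(self.rows)} -> {len(kept)} rows)"
        return TrackedData(kept, self.lineage + [note])

    def aggregated(self, description: str, key: str, value: str) -> "TrackedData":
        """Sum `value` per `key` and record the aggregation."""
        totals: dict = {}
        for r in self.rows:
            totals[r[key]] = totals.get(r[key], 0) + r[value]
        rows = [{key: k, value: v} for k, v in totals.items()]
        return TrackedData(rows, self.lineage + [f"aggregate: {description}"])


# Invented sample data.
orders = TrackedData([
    {"country": "FI", "amount": 100, "status": "paid"},
    {"country": "FI", "amount": 50, "status": "cancelled"},
    {"country": "DE", "amount": 70, "status": "paid"},
])
revenue = (orders
           .filtered("paid orders only", lambda r: r["status"] == "paid")
           .aggregated("sum of amount per country", "country", "amount"))
```

A consumer of `revenue.rows` can consult `revenue.lineage` and see that cancelled orders were removed before the totals were calculated, instead of having to guess why the figures differ from the raw order list.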

7 A typical example in private life is the Smart Home, which can only work if various devices exchange information.


1.5.10 “Data Reviews”

Reviews from a data perspective should become a mandatory part of each project approval process. The commercial idea behind it is to ensure sustainable solutions and to make any cost/benefit calculation truly cross-functional. Data reviews would ideally take place as part of the initial project approval process and as part of the final project sign-off.

Any organisation will benefit from a Data Office that is truly independent by design. Such a Data Office won’t turn a blind eye to non-compliance, just because, say, it is about an executive’s favourite project (or else your organisation has a bigger governance issue).

Getting rejected after all preparatory work is not the nicest experience for a project manager. This would usually lead to the Data Office getting involved in the early preparation phases of any project to help avoid late surprises.

There may be cases where a violation of data standards and policies can be justified, e.g. where laws need to be adhered to quickly, or where employee health is at risk. In any such case, the data review will help make a conscious, well-informed decision, after balancing all pros and cons. In any case, non-compliance in the area of data needs to be documented, including a deadline for a compliant solution. Analogous to “technical debt” in Agile projects, we can talk about “data debt” here, which needs to be actively maintained and followed up.

The entire approach means a cultural change for most organisations. That is why it requires full Board support, including a “no exceptions” policy for the obligation to undergo the data reviews.

1.5.11 Data Ethics

It is impossible to deal with data without talking about the ethics of data. This is because data can not only be used; it can also be misused. You have certainly heard of these examples: What if…

• ...personal data is disclosed to unauthorised people?
• ...deliberately withheld information leads to incorrect conclusions?
• ...inadequately applied algorithms give undue preference to a party?
• ...data is NOT made available to authorised parties?

At first glance, each of these moves pays off. You misuse data, and the result is in your favour. It gets a bit more challenging if you multiply the benefit by the probability of not getting caught, and if you contrast it with the cost of getting caught, multiplied by its probability. Very often, a single case of data misuse is sufficient to ruin an organisation’s reputation for a long time. So, if you misuse data a couple of times, the overall risk compounds quickly.

How does that risk materialise? Lack of ethics can easily lead to severe commercial impact: A buyers’ boycott in response to (real or perceived) unethical behaviour of your organisation can severely impact your revenue.

On the other hand, actively applying data ethics doesn’t make life difficult for an organisation (except for those organisations where data misuse is part of the business model. Let’s assume that your organisation is none of those.) You can conclude that acting ethically is the result of having the organisation’s success in mind. Ethics, if publicly enforced, increase an organisation’s value. They build trust and increase reliability.

For another aspect, think back 20 years. Hardly anybody would have assumed that a significant number of people would be willing to pay a premium for ethically produced goods. We know better today. This is another good reason to make your organisation successful by being ethical! And remember, this is about behaving ethically, not about talking ethically. You really need to walk the talk.
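The weighing of benefit against the risk of getting caught, as described earlier in this section, can be written down in a few lines of Python. This is a deliberately simplified sketch: the function names are made up, all figures are invented, and the individual cases are assumed to be independent of each other.

```python
def expected_gain(benefit: float, cost_if_caught: float, p_caught: float) -> float:
    """Benefit weighted by the chance of not getting caught, minus the weighted cost."""
    return benefit * (1 - p_caught) - cost_if_caught * p_caught


def p_caught_at_least_once(p_caught: float, times: int) -> float:
    """Chance of being caught at least once in `times` independent cases."""
    return 1 - (1 - p_caught) ** times


# All figures below are invented purely for illustration.
single = expected_gain(benefit=100_000, cost_if_caught=5_000_000, p_caught=0.05)
repeated = p_caught_at_least_once(p_caught=0.05, times=10)
```

With these invented numbers, a single case already has a negative expected outcome, because the reputational cost dwarfs the benefit even at a 5% chance of detection. Repeating the misuse ten times raises the chance of being caught at least once to roughly 40%.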


1.6 Data in the Overall Context of Our Business

1.6.1 Data as a Business Aspect

According to the literature, there are quite a few cross-functional aspects of running a business. The most outstanding ones are probably: VALUE, PEOPLE, TECHNOLOGY, DATA and PROCESSES. All of these aspects should, therefore, be managed across the normal functional activities. Yet, too many organisations still don’t have DATA on their list.

1.6.2 Comparing Data with Goods and Money

Again, across all functions, most organisations deal with the flow and transformation of GOODS, MONEY and DATA. While the focus used to be on the flows of goods and money just a few decades back, the third aspect, the flow of data, has constantly been gaining relevance for the past 30 years, so that it also requires attention at management level.

Analogy

The list of activities required to manage data is similar to the lifecycle of conventional products:

• design (information architecture),
• construction (building the business),
• operations (running the business) and
• monitoring (steering the business through analysis and learning)

A strong IT department is required to ensure proper processing of data, just as most organisations need a good Engineering department and a good Logistics department to manage the physical processing of goods.


But just as we need planning teams to optimise the production and flow of goods, we need a business-oriented team to manage the Data Supply Chain: The Data Office. It needs to set direction, priorities, logic, rules, relationships, interpretation, quality, governance etc. around data.

1.7 Data Helps Enhance Capabilities

In order to stay competitive, it is not sufficient to simply do the same things more and more efficiently. Unless your organisation operates in a purely price-driven industry, you will need to give your customers reasons to select you over your competitors.

To get there, you would obviously try to enhance your organisation’s capabilities: “We as an organisation want to be able to do something that we haven’t been able to do before.” Capabilities cover such diverse things as additional product offerings, improved customer intimacy or the ability to react to changed customer needs more quickly.

But what would be the ideal sequence of steps, and which role does data play in this process? We should resist the temptation to start with the specification of an IT solution. Let’s have a look at two typical scenarios:

(i) We want to make it easier for customers to understand our product portfolio on our website.
(ii) We want to be able to calculate the lowest acceptable price for a key account customer within minutes (the price at which a customer’s profitability is equivalent to the internal rate of return).

Many organisations would now start by specifying the respective IT solutions (in both Waterfall and Agile environments) or engineering solutions. And very often, business requirements are in fact already technical specifications in disguise.

Instead, you may wish to take a step back and ask yourself a few strategic and logical questions first. You would also want to ensure a consistent design of processes and of data handling.


Fig. 1.3  The SCPDS change process

A sustainable approach follows the sequence of Strategy-Capabilities-Processes-Data-Solutions, as illustrated in Fig. 1.3. Let me share a few thoughts on each of these five steps:

1. Strategy

What do we, as an organisation, want to focus on? What do we want to be known for? Why should a customer select one of our products instead of a competitor’s? Do we want to be the fastest, the finest, the ones with the personal touch, the cheapest, the most reliable, …? Aiming at all of these at the same time will make us fail.

Let me add a few general remarks about strategy at this point: As you will know, a good strategy does not simply state the final target, such as in “We aim at being the fastest (biggest, nicest, most profitable etc.) organisation in our sector”. Instead, it shapes the direction to get there. In this sense, it represents a choice from a selection of generally valid options, e.g. “maturing in country A” versus “expanding to countries B and C”, or “focusing on B2B” versus “focusing on B2C”.

Such a strategy effectively allows teams to refer to it when choosing between different options on a smaller scale. A typical example is a decision between project X and project Y. A good strategy plays an important role here, complementing other factors such as business case, resources, dependencies etc.

The same applies to an organisation’s data strategy. The only difference is that the data strategy itself should be derived from the organisation’s overall strategy. While the latter is always in the lead, it definitely makes sense to consider data aspects when developing this overall strategy.

What if this overall strategy already exists, and if there are no plans to revise it any time soon? In this case, you’d simply ensure that your data strategy matches and supports your organisation’s strategy.

If a strategy is effective as described above, you would certainly foresee regular strategy alignment reviews: “Are we still on the right track?” A strategy alignment assessment should check whether all of your organisation’s activities match your strategy. The same applies to all data handling activities: they need to match your data strategy. Note that this applies equally to activities outside the Data Office.

Why is such a regular strategy alignment assessment valuable? Well, there is a difference between “Does it bring us forward?” and “Does it lead in the right direction?” The second question needs to be answered with Yes as well, to justify an activity. The strategy is your direction.

2. Capabilities

Once we are clear about our strategic positioning, we can determine which capabilities we need. Don’t try to achieve all of them: we will hardly be able to become the best and the cheapest at the same time. So, if we want to be the cheapest, we should try to automate. If we want to be known for our customer-centricity, maybe we should invest in well-informed customer service teams and workflows based on customer SLAs: “You will obtain a response within X hours” etc.

Does this step take place without any involvement of Data people?


No, of course not. But the role of data at this point needs to be clear: It is a consultancy role, usually together with IT and Engineering.

3. Processes

This is the translation of capabilities into execution. It leads from the WHAT to the HOW. And just as you need a cross-functional view on data, processes should be considered cross-functional. They hardly ever stop at functional borders. The ability of department B to process data will have an impact on the best way for department A to capture data. A siloed approach may reward department A for minimising its own effort.

4. Data

Most process steps deal with data, somehow. Data is captured, or data is requested from “the system”, to use it in subsequent steps.

Why does Data not come earlier in this sequence? Simply because it is not an end in itself. Data always has to serve a purpose. With the reduction in storage costs, you frequently hear the proposal to gather all data the organisation can get hold of. Someone would check for a purpose later. An organisation should not do so. You’d end up with a haystack that will make it difficult to find the needles. And remember, you do not only need to consider the cost of data storage. The cost of organising your data is high. And it goes up exponentially with the amount of data you keep in a managed way.

5. Solutions

It is only here that IT and Engineering solutions get specified. The teams will need to be involved far earlier, as “consultants”. But solutions should not be specified before the requirements are clear. Otherwise, it’s like giving answers before knowing the questions.

You will have noticed that IT and Engineering don’t play a leading role before the very last step. In fact, those teams are often burdened with tasks that should be with commercial functions. But if these tasks are executed well, the technical implementation job becomes straightforward! Data Management is an indispensable part of change management. And this step requires data experts!

References

Doyle, SAC (1892) A Scandal in Bohemia. In: The Adventures of Sherlock Holmes, by Sir Arthur Conan Doyle
Quotes, Daniel Keys Moran. allauthor.com n.d. https://allauthor.com/quotes/90995/. Accessed 31 July 2019
Rationalwiki (2019) rationalwiki.org. 20 July 2019. https://rationalwiki.org/wiki/Principle_of_explosion. Accessed 7 Aug 2019
tulli.fi (2019) https://tulli.fi/en/private-persons/ordering-goods-online/commodity-codes. Accessed 31 July 2019
Webopedia - Data (2019). https://www.webopedia.com/TERM/D/data.html. Accessed 24 July 2019
Wikipedia - Quark (2019). https://en.wikipedia.org/wiki/Quark. Accessed 29 Apr 2019

2

The Value of Data

The principle is pretty simple: The more usable you make data, the more productive you make the people who work with it. And the more productive you make people, the more money you’re likely to make. H. O. Maycotte (Founder and CEO of Umbel, a data rights management company headquartered in Austin, TX.)

2.1 The Commercial Value of Data

More and more, you hear people talk about “Data as an asset”. But what does that mean, beyond becoming yet another buzzword? A while ago, The Economist published an article with the title “The world’s most valuable resource is no longer oil but data”. Well, data doesn’t make planes take off, and it doesn’t heat houses. So why is data called an “asset”?

2.1.1 Data Versus Conventional Assets

Data may have more in common with the typical assets on a balance sheet than we think. But there are remarkable differences. Here is a brief overview:

© Springer-Verlag GmbH Germany, part of Springer Nature 2019 M. Treder, Becoming a data-driven Organisation, https://doi.org/10.1007/978-3-662-60304-8_2


2.1.1.1 What Does Data have in Common with Physical Assets?

• We can usually buy and sell it
• Using it brings tangible business benefits, e.g.:
  – The asset makes us more efficient
  – We can win new customers and keep existing ones
• Having it and not using it is waste
• We can transform it ourselves into something with an even higher value
• It degrades over time and requires maintenance. Few things are as dangerous as customer records with outdated credit rating…
• You don’t need to sell it in order to monetize it. Data can be a product, but it can equally well serve as auxiliary material, to create or improve other products.

2.1.1.2 What is Specific to Intangible Assets?

Intangible assets include data, Intellectual Property, patents etc. They have a few, often underestimated advantages over physical assets:

• An intangible asset doesn’t diminish when it is used
• It can be made available everywhere, and everybody can use it at the same time
• While “Sweating the asset” is often not sustainable for tangible assets, it is a perfect recipe for intangible assets such as data

2.1.1.3 How Does Data Differ from Other Intangible Assets?

Many intangible assets can make it onto the balance sheet. This is in recognition of the fact that they represent value, and that investing in them should not have a detrimental effect on your organisation’s financial rating.


Intentional overrating of intangible assets was one of the causes that made the dot-com bubble burst early in the twenty-first century. In response, rules have become stricter, e.g. through the International Accounting Standard (IAS) 38, which also provides a definition of an “intangible asset”, as published by the IFRS Foundation:

The cost of an item […] is recognised as an asset if, and only if: it is probable that future economic benefits associated with the item will flow to the entity; and the cost of the item can be measured reliably. (IFRS, IAS 16 2018)

(…)

An intangible asset is an identifiable non-monetary asset without physical substance. (IFRS, IAS 38 2019)

Although data matches this definition perfectly, it still cannot be treated as an asset on a balance sheet:

Examples of intangible assets include computer software, licences, trademarks, patents, films, copyrights and import quotas. (IFRS, IAS 38 2019)

But this should not prevent organisations from treating data as an asset in all other matters.

2.1.2 How to Treat Data as an Asset

In short, Data should be managed like an asset, just as any other item on your balance sheet. You can…

… apply a price tag
… inventory it
… maintain it
… keep track of all the ins and outs
… refine it

In fact, raw data is like raw material. We can increase its value.


To use another analogy, this is comparable to turning iron ore into steel, then turning steel into parts and finally assembling parts to create products such as planes or cars. With every step, the value increases.

How does this translate into data? Well, we increase its value by classifying it, by cleansing it, by bringing data sources together, by discovering patterns behind it, and by actively managing data quality across all steps.

2.1.3 Why Should We Treat Data as an Asset?

It has the potential of becoming a competitive advantage. An organisation’s core products are often already pretty much optimised, just as all competitors’ products are. It has become difficult to be the “better choice” for your customer purely based on your core product portfolio.

Opportunities often lie in digitalisation and utilisation of data. In many cases, investing in these areas gives you the highest marginal value (or more bang for the buck, to express it less scientifically).

2.2 Opportunities Through Managing Data

2.2.1 Aren’t We Using Data Already?

Ask yourself:

• Do we know what data we have? Does any single entity within the organisation know about all pockets of data?
• Do we know the potential value of each piece of data?
• Which role does Data play in our organisation strategy?
• Are we taking well-informed strategic decisions based on data?
• Do we systematically think about how to monetise our data?
• Does our Risk Management cover Data? (risk of lost data, wrong data, illegally kept data, …)
• Do we have the people and the know-how to act quickly?
• Can we judge big initiatives from a data perspective? (An example would be the introduction of Salesforce, Workday or other SaaS (Software as a Service) solutions, each of which comes with its own data logic.)

2.2.2 Lean, Agile and Data

We observe a general trend towards agility in business. Without further elaboration on this trend, we can state that the markets don’t allow for long, well-planned change cycles anymore. Let’s have a look at three different aspects of Agile in the light of managing data:

2.2.2.1 Agile Masterdata Management

How long does it take you to have a new product introduced across all business functions? If you have millions of small products on offer, such as the huge web shopping organisations, you will probably be well prepared to introduce new items quickly.

But many organisations deal with a small number of dedicated products, be it in the service sector, be it about selling huge items. Here, product engineering, market research and operational enablement already cost enough time. Wouldn’t it be nice to at least simplify the digital enhancement or change of your product portfolio within a day? Remember, a new product needs to be made available to your customers, known to all relevant employees, available to the billing process, visible to Customer Service, and so on.

This is where professional, cross-functional Masterdata Management can help. It ensures data maintenance is coordinated centrally, only once and in one place. Once activated, a change is made available across all departments, all software applications and all customers within a moment.


2.2.2.2 Stopping Escalation of Commitment

Data helps stop wrong development, by providing evidence. “Escalation of commitment” means sticking to a decision even as doubts come up. Although this is hardly ever in the interest of an organisation, the reasons for it may very well be rational. If, for instance, an organisation has the (informal) culture of using “failed” projects against those in charge, it may not be sensible to admit that a project should be stopped. In such a context, you can often observe that project managers push the project through, and compromise on quality or sustainability to avoid looking bad.

To prevent such situations from happening, organisations need to do two things: Firstly, they need to promote honesty (and occasionally promote managers who dared to recommend stopping their own projects). Secondly, they need to work in a data-driven way: The business case needs to be fact-based, and it needs to be maintained throughout the project. This approach fits very well into an Agile project approach.

Data-driven project reviews can enable a project manager or a sponsor to create a positive business case for the termination of a project: Sunk costs are sunk costs, but stopping now will make more sense than continuing (or: throwing good money after bad money, as the investors say).

2.2.2.3 Risk of just Curing the Symptoms

Agile is not unconditionally good when dealing with Data. If you keep fixing the issues as they come up, you may not address the underlying issues. If you don’t spend enough effort on a building’s foundation, you may have a hard time addressing instabilities while the building is growing.

So, be it in the early stages of a project, be it when the project gets in trouble, you need to ask for a proper analysis. Einstein once said: “If I had one hour to save the world, I would spend fifty-five minutes defining the problem”. (Note he didn’t say 15 or 30 min…)


Quick “agile” fixes often come with the risk of curing the symptoms instead of addressing the root cause. This is again an area where Six Sigma can really help: If you introduce Six Sigma tools like “Gemba Walks” or “5 Times Why”, you encourage teams to search for the root cause, and you provide them with adequate tools.

A specific point I repeatedly observe where data meets agile: If a project takes data architecture decisions too quickly, you may be logically tied to these decisions for the foreseeable future. Changes may become complex.

Example 1

If none of your systems allows you to distinguish between two different types of certification for raw material, it will cost you a fortune to add it later. And if your product identification numbering system is not globally unique, you may run into terrible duplication with your next acquisition of (or merger with) another organisation.

In summary, here are my four key rules for Data in Agile:

(i) Use Agile selectively
You should apply Agile methods where it makes sense, but don’t make “Agile” a mantra. Ensure your CDO is always part of the discussion.

(ii) Shorten time-to-market intelligently
You may be afraid of business requirements being outdated by the time of project completion. But you need to resist the temptation to shorten data projects by allowing for unsustainable shortcuts. Instead, ask for projects to be divided into smaller, digestible chunks, ask for independent parts to be run in parallel and be ready to change direction at any time during a project.

(iii) Fix workarounds quickly
If the markets really require a shortcut, tidy up immediately after going live. The more shortcuts an organisation keeps in place, the less agile it will become over time!

(iv) Become sustainably agile
Consider this general data rule: You need a non-agile approach first to become truly agile later.

2.2.3 Gaining Insight Where Humans See Nothing

Imagine a data record of an organisation and its history (or better, a filing card from the old days, taken from the customer card box). An advanced version may tell you whom that organisation does business with, how its revenue developed, how often it has been mentioned positively or negatively by investment banks, and so on.

What does that data tell you about doing business with that organisation? Nothing? Is that organisation going to pay your invoices? You don’t know, from looking at the record? You are not to blame. Humans can’t tell from a single record, and even thousands of records from comparable organisations wouldn’t help you much.

This is where Data Science comes into play. It analyses huge amounts of data, and it gives you astonishingly precise predictions, be it about credit risk, about fraud, about the probability of errors, or about reliability. It probably won’t tell you why, but it will tell you the probability of events.

What does Data Science use as input? Well, data, data, data! What does it look at? Trends, peer comparison, combinations of attributes, correlation and other things that human beings cannot derive from columns of data. What does it deliver? Predictions that will save you money on average, and that you will be able to validate against reality later.

Would you want to miss this opportunity? You can trust in information derived from data, provided the job is done by data experts. And your Data Office needs a trustworthy Chief Data Scientist to distinguish the true experts from the big herd of pretenders.

2.3  The Cost of NOT Doing it Right

2.2.4 Research and Development

Not all benefits of data are universally applicable and can be found in “Data as an asset” publications. Monetisation opportunities are often very specific to the industry your organisation is operating in. That is why it makes sense to establish an “Information R&D” function, to systematically explore opportunities to monetise data, both internally and externally. No matter how you run R&D in other areas, you can usually apply the same logic to data R&D.

2.3 The Cost of NOT Doing it Right

Several years back, the tyre company Pirelli ran a very famous advertising campaign. Their ads showed an athlete ready for a sprint, wearing red high heels. The catchy headline read “Power is nothing without control”. This phrase perfectly describes the risk of using data in the wrong way: Data has tremendous potential, but you need the right setup to realise this potential, just as you need the right shoes for a sprint or the right tyres to translate horsepower into speed. And it is not sufficient for the solution to look good, which the red high heels of Pirelli’s athlete certainly did. You need an adequate solution. Otherwise, you will fail.

This section describes a few consequences of not getting it right, including recommendations for a better approach.

2.3.1 The Cost of Silos

Of course, each department can build up its own data expertise, and each application can implement its own data logic. And, of course, this can work successfully, as it has obviously done in many organisations for decades.

But it destroys value. Without a shared, cross-functional approach to data, this is what may happen:
• Same questions, different answers (even if the algorithm is the same)
• Long time-to-market in case of structural changes (e.g. a new business model)
• Incompatibility where data from different areas is pulled together
• Double work: the same piece of logic needs to be implemented again in every department or application
• Difficult GDPR compliance if a customer’s information is spread across various systems

Example 2

Identity & Access Management (IAM) is often handled by IT, while employee data is managed by HR. There are interdependencies, obviously. Even where IAM is based on roles (as it should be), each role needs to be linked to one or more human beings. Where does IT get the employee data from? How do they ensure it is always up to date? Do they get automated updates when employees leave the organisation or change roles internally? How does IAM ensure no role is ever without a human being behind it? How does IAM ensure that people changing roles within the organisation get their old access rights revoked immediately (and not only after the next quarterly review)? Having old and new rights granted at the same time may violate basic Segregation of Duties principles. How does IAM ensure people leaving the organisation will lose their VPN access as soon as they are released from all duties that require this access?

Under cross-functional Data Management, a joint initiative of IT, the HR department and the Legal department would collect all relevant use cases first, considering known issues, new opportunities and legal regulations. This would result in the development of IT-agnostic processes, starting with a trigger (e.g. the change of an employee to a new internal role) and subsequently defining all necessary activities. Eventually, the initiative would look at all information impacted by each process step (e.g. the unique ID of an employee or the relationship between employee and role). Data Management would use the outcome to create a data model (e.g. the employee, all employee attributes and all relationships to other data entities). A logical Single Source of Truth would be defined.

Only now would the initiative look at how to implement the solution, starting with a gap analysis. Data principles would be used to define (and later to validate) the solution, so that no single function is tempted to take a shortcut. A migration path would be developed, with interim key milestones (each of which has to add value already) and a final milestone. It would be prioritised and scheduled based on the determined value-add and by comparing it with other initiatives competing for the same resources.

An implemented solution could have a logical single source of truth for employees, roles and access rights, where a near-real-time mechanism avoids gaps and outdated information. Features could include an easy mechanism to allow for tailored role-specific communication, and an allocation mechanism could ensure that, whenever a position is vacated, the line manager becomes the successor for all of its roles, incentivising him or her to find and configure successors quickly.

But why doesn’t everybody in each organisation voluntarily work towards an organisation-wide, collaborative data approach (e.g. a single source of truth and web services that can be used by all business functions)? In fact, such an approach may rob departments and projects of a good deal of their own flexibility and freedom. Furthermore, a project working towards benefits beyond its own scope will probably take longer. Data handling and logic need to be implemented in a way that other departments can use them as well. Both the solution itself and the necessary active engagement of other departments require additional effort and time.
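To make the IAM example a little more tangible, here is a minimal sketch of two of the triggers discussed above (an internal role change and an employee leaving), plus a check for roles without a human being behind them. All names (`Employee`, `change_role`, `offboard`, the role labels and employee IDs) are hypothetical and only serve as illustration. A real solution would sit on top of the organisation's single source of truth rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Employee:
    employee_id: str
    manager_id: Optional[str]
    roles: Set[str] = field(default_factory=set)


def change_role(staff: Dict[str, Employee], employee_id: str,
                old_role: str, new_role: str) -> None:
    """Trigger 'internal role change': revoke and grant in one step."""
    person = staff[employee_id]
    person.roles.discard(old_role)  # old rights are revoked immediately ...
    person.roles.add(new_role)      # ... so old and new rights never coexist


def offboard(staff: Dict[str, Employee], employee_id: str) -> None:
    """Trigger 'employee leaves': the line manager inherits all roles."""
    leaver = staff.pop(employee_id)
    if leaver.manager_id is not None:
        staff[leaver.manager_id].roles |= leaver.roles
    # Without a line manager, the roles become unassigned and
    # show up in roles_without_holder() below.


def roles_without_holder(all_roles: Set[str],
                         staff: Dict[str, Employee]) -> Set[str]:
    """Roles that currently have no human being behind them."""
    held = set().union(*(person.roles for person in staff.values()))
    return all_roles - held


# Hypothetical sample data
staff = {
    "E1": Employee("E1", manager_id=None, roles={"head_of_finance"}),
    "E2": Employee("E2", manager_id="E1", roles={"invoice_creator"}),
}
all_roles = {"head_of_finance", "invoice_creator", "invoice_approver"}

change_role(staff, "E2", "invoice_creator", "invoice_approver")
unassigned = roles_without_holder(all_roles, staff)
offboard(staff, "E2")
```

After the role change, `unassigned` contains the vacated `invoice_creator` role, which is exactly the kind of gap that should become visible immediately instead of at the next quarterly review.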

However, from a shareholder perspective, it is obviously the right thing to do. That is why it is important to develop mechanisms to make people take the shareholder perspective. Here are a few recommendations:

2.3.1.1 Transparency

Transparency is an important ingredient. Do not only explain why it is important, but ensure that all projects are assessed and certified. If this is handled well, project teams will not want to be labelled “Non-compliant with the organisation’s data standards”. Another aspect of transparency is the communication of successful cases: As soon as a second initiative becomes cheaper or faster by being able to re-use something from the first initiative, benefits kick in. The general acceptance of a sustainable approach goes up with each case.

2.3.1.2 Comprehensive Business Cases

Each project request and each funding request should require a comprehensive business case: The case needs to be calculated including the next couple of projects in the respective topic area, not only the first one. This approach is required to motivate the design of sustainable solutions. How to achieve this? We are talking about a typical task of an independent Data Architecture team. A CDO would know how to set up such a team.

2.3.1.3 Additional Resources for the Extra Effort

You can generally ask for project proposals to determine the extra effort required to do it right: the delta between “achieving the core targets of the project” and “doing the right thing from a shareholder perspective”. Projects may then be explicitly granted additional resources, funding and/or time if (and only if) they choose to “do it right first time”. If this approach becomes an organisation policy, project managers may be less concerned about being perceived as “slow and expensive”. You will find that hardly any project manager prefers a shortcut over a proper solution, unless they are measured exclusively against time and money.
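The idea of a business case that covers the next couple of projects can be sketched with a few lines of arithmetic. The figures below are purely hypothetical: a shortcut that costs 100 units per project, and a sustainable solution that costs 150 units for the first project but only 60 for each follow-up project that re-uses it. The function names are illustrative as well.

```python
def cumulative_cost(first_project: int, each_follow_up: int, projects: int) -> int:
    """Total cost of the first project plus all follow-up projects."""
    return first_project + each_follow_up * (projects - 1)


def break_even(shortcut, sustainable, horizon=10):
    """First number of projects at which the sustainable option is cheaper."""
    for n in range(1, horizon + 1):
        if cumulative_cost(*sustainable, n) < cumulative_cost(*shortcut, n):
            return n
    return None  # never cheaper within the horizon


shortcut = (100, 100)     # hypothetical: (first project, each follow-up)
sustainable = (150, 60)   # hypothetical: extra effort once, re-use later

first_cheaper_at = break_even(shortcut, sustainable)
```

With these assumed numbers, a business case that looks at the first project only favours the shortcut, while a case that includes the third project already favours the sustainable design.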

2.3.2 The Cost of Non-quantifiable Impact

Finding out about an issue is one thing. Determining the impact is another. And, of course, you want to know the impact. Where resources are scarce (which is the case more often than not), you need it to be able to prioritise between activities. In some cases, the impact analysis will even suggest that you do not fix an issue, as fixing it would cost more than the impact of the issue. Proper data handling allows you to ask questions like these:
• “How frequently does the problem occur?”
• “At what percentage is step xyz going wrong?”
• “What is the impact on cost / on the speed of service / on customer satisfaction?”

Using data to assess the impact often leads to surprising insights. You may, for instance, find out that the three executive complaints sent independently to the Head of Customer Service mean that something went wrong in a mere three out of three million cases. If you then find out that the three people who complained do not have a multiplier function (as many social media influencers have), you may conclude that it is better to spend the money elsewhere, where a fix has a bigger impact.
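The questions above boil down to simple arithmetic once the data is available. The following sketch uses the three-in-three-million case from the text. The cost per failed case and the cost of a fix are assumptions made up for illustration, and the function names are hypothetical.

```python
def failure_rate(failed_cases: int, total_cases: int) -> float:
    """Share of cases in which something went wrong."""
    return failed_cases / total_cases


def worth_fixing(failed_cases: int, cost_per_failure: float, fix_cost: float) -> bool:
    """Fix only if the quantified impact exceeds the cost of the fix."""
    return failed_cases * cost_per_failure > fix_cost


rate = failure_rate(3, 3_000_000)   # three complaints in three million cases

# Assumed figures: 500 per failed case, 20,000 for a permanent fix
fix_it = worth_fixing(failed_cases=3, cost_per_failure=500.0, fix_cost=20_000.0)
```

Under these assumptions the impact (1,500) is far below the cost of the fix, so the money is better spent elsewhere. With a different cost per failure, the same calculation could just as well recommend the fix.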

2.3.3 The Cost of Complexity

Have you ever faced the situation that a certain business change required heavy investment and a long project duration on the IT side? If you look at your organisation’s history of IT changes, you may notice a series of “minimum viable products”. Each of them required a little workaround to solve a problem. And each of them had a positive business case. But the resulting spaghetti chaos costs a fortune to maintain, and quick changes are no longer possible. Why? Because the business cases may not have considered the cost of complexity. What does that mean? Let’s look at a typical situation.

Example 3

Imagine you have three single systems in Finance, Operations and Customer Service, all of which deal with customer information and require data updates whenever customer information changes. If, say, a customer loyalty project adds another customer database, turning this into four systems, you could argue that four systems are still manageable.

A few years (and a few projects) later, another project may propose something similar: just adding a single, tiny data repository, to be able to quickly implement a solution at a reasonable cost. But meanwhile the story may have changed. It may now read: Hey, we have spaghetti anyway, so what is the difference if the number of repositories increases from 25 to 26? It doesn’t really matter, right?

The truth is: Little increases in complexity easily remain undiscovered. But after a number of little increases, the organisation’s application landscape has entirely lost its agility. Your CIO may even have warned you at times. But the first project with a sustainable approach would have faced a severe impact on the cost side of its own business case. That’s why organisations often decide to “do it right next time”.

A Data Office will help you manage business cases beyond single projects. They can calculate the benefits of enabler projects (which may not provide an immediate benefit on their own), and they can take a long-term, cross-functional perspective. And if a Data Office is incentivised based on the entire organisation’s success, they will resist the temptation to invest in data for the sake of data, which would make them biased themselves. To achieve adequate handling of business cases, you would select a CDO with a strong commercial or financial mindset, on top of all necessary data competencies.
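One way to put a number on “from 25 to 26” is to count potential synchronisation links. The sketch below rests on a deliberately simple assumption that is not taken from the text: every pair of repositories holding the same customer data may need its own point-to-point update link. Real landscapes will differ, but the growth pattern is the point.

```python
def pairwise_links(repositories: int) -> int:
    """Potential point-to-point links between n repositories: n(n-1)/2."""
    return repositories * (repositories - 1) // 2


def added_links(current_repositories: int) -> int:
    """Extra links caused by adding one more repository."""
    return pairwise_links(current_repositories + 1) - pairwise_links(current_repositories)


links_3, links_4 = pairwise_links(3), pairwise_links(4)
links_25, links_26 = pairwise_links(25), pairwise_links(26)
fourth_system_adds = added_links(3)
twenty_sixth_adds = added_links(25)
```

Under this assumption, the fourth system adds three potential links, whereas the “tiny” 26th repository adds 25 of them. The single step looks equally small in both business cases, but its cost of complexity does not.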

2.3.4 The Cost of Wasted Resources

You may have the feeling that your Analytics teams deliver good results, and this may be true. However, at what cost do they deliver? I have learned from discussions with many big or medium-sized organisations that Data Scientists often spend a mere 10-15 per cent of their time on actual science. The rest of their time is spent on searching, structuring and organising data. Why is that? It’s the lack of Data Governance. Just as in a physical warehouse, good organisation is key to efficiency. This will not happen automatically, and it requires a centralised governance approach.

2.3.5 The Cost of Inaccuracy

Some questions are not as easy as they seem at first glance. The more questions you ask, the more complex the situation becomes. You will definitely want the answers to be easy, but they also need to be accurate, right? You may not achieve accuracy without accepting complexity in getting there. Let’s look at a typical business question.

Example 4

A CEO wants to know from the Head of Sales: “What are our ten biggest customers?” A seemingly simple question like this often turns out to be difficult to answer. Look at the following questions:
• What is a customer? Is it a “company” (like “Microsoft”)? Is it an account (everything under one rate card)? Is it the logistics procurement person our salespeople talk to?
• Are customers and accounts the same?
• How many different definitions of account do we use?
• Do we consider organisations that belong together as one single customer? How do we know?
• Do we consider Volvo Cars and Volvo Trucks to belong to the same company?
• What does “belong together” mean? 100 per cent? Joint venture?
• Which selling period do we use? Can this list change every month?
• Do we rely on an external data provider like Dun & Bradstreet? (And do we know their definition of “customer”?)

So: no chance to get it right? Too many options? Well, let’s ask the right question: What do we want to do with the information “Ten biggest customers”? This question may help define “customer” in this specific sense.

Case 1: We may want to find out about all the revenue that would be at risk if we messed up in a certain case. In some cases, even the regional organisations within one customer organisation don’t talk to each other about the quality of their suppliers. It is obviously difficult to define “same organisation” here.

Case 2: One organisation = all units under the same CEO. This is important if we want to build an executive-level relationship. But how would we deal with organisations that are 50 per cent owned? And shouldn’t we look at the entities (pension funds etc.) that own most of our customers?

• Think of buying decisions: How do we determine on which level these decisions are made? Maybe an organisation’s CEO or General Manager is not the right contact as he or she doesn’t care about purchasing?
• How often do we accept changes to our list of key account customers? If we want to build a department for corporate accounts, we wouldn’t want to change the list every month.
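The effect of the chosen definition on the answer can be shown with a handful of records. The sketch below is a minimal illustration with invented accounts, group names and revenue figures. It ranks the same data once per account and once per corporate group.

```python
from collections import defaultdict

# Hypothetical revenue per account; all names and figures are invented.
accounts = [
    {"account": "A-001", "group": "Alpha Group", "revenue": 40},
    {"account": "A-002", "group": "Alpha Group", "revenue": 35},
    {"account": "B-001", "group": "Beta Ltd", "revenue": 60},
    {"account": "C-001", "group": "Gamma GmbH", "revenue": 50},
]


def top_customers(rows, definition, n):
    """Rank 'customers' by revenue, where `definition` names the grouping key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[definition]] += row["revenue"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


by_account = top_customers(accounts, "account", 2)
by_group = top_customers(accounts, "group", 2)
```

Ranked by account, Alpha does not appear among the top two at all. Ranked by group, it is the biggest customer. Neither list is wrong; they answer different questions, which is why the purpose has to be clear first.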

2.3.6 The Cost of a Ruined Reputation

What is the worst data issue you could face? That you have data quality problems? Honestly, the worst thing is if your customers spot your data quality problems before you do! Whenever you don’t have your data under control, and this becomes visible to your customers, what could happen to your organisation’s reputation?

As an example, we can have a look at the area of data privacy: Let me assume that your organisation is not 100 per cent GDPR-compliant BUT that you could prove to external investigators that you have done what could realistically be expected within the given timeframe. With this in mind, you may hope that investigators would focus on worse offenders amongst all of the organisations out there. You are probably right. However… Will you be able to respond to a customer’s request to provide all data you have collected about that customer (throughout your entire application and database landscape)? How about a customer asking to be entirely forgotten? These are questions of publicity. No matter how well your organisation is prepared compared to others, if the right person publishes your failure, your organisation’s reputation is at stake.

This topic shows again that Data Management is not primarily a technical discipline. It is not sufficient to work with business functions on how to properly manage their data.

That is why a Data Office ideally includes a person or team to work with support functions like Communication, Legal and HR in order to address the non-technical aspects of data.

2.3.7 The Cost of Using Data Without Understanding it

Usage of data doesn’t need experts anymore because we have all the algorithms and tools at hand, right? Dead wrong. In order to have data tell a valid story, it is not sufficient to be able to put it into a database, to apply algorithms and formulas to it, and to turn the output into charts or metrics. There are several traps that users may fall into if they only know how to use the tools without understanding the (complex!) science behind them. Let’s have a brief look at some of these traps:

2.3.7.1 The Bias of the Investigator

Hardly anybody assesses a question without being interested in the topic. Whoever is interested usually has an opinion before even starting the assessment. Wouldn’t you think this opinion can influence the results?

2.3.7.2 Mistrust in the Opportunities Behind Data Science

If you only look at what is possible today, you may miss the opportunities of tomorrow. Based on today’s knowledge, you may assume that AI may be getting better at face recognition but will never be as good as human beings. As a result, you will always trust a human being’s judgement more than a computer’s judgement. Data Scientists may remember how insufficient face recognition used to be just a few years back. This experience enables them to imagine what else is going to become possible if the pace of progress is sustained.

In fact, in certain areas such as skin cancer detection, algorithms are already on a par with human doctors (yet both fail in different areas, so that a combination of both is the most promising approach, at the moment). Seasoned data folks will, therefore, keep their eyes open, expecting what they know today to be outdated tomorrow. Yes, all areas of technology are developing quickly these days, but data science is developing at the speed of light!

2.3.7.3 Missing Common Sense

Using data for decision making is one of the key steps forward for an organisation in the twenty-first century. However, data advocates sometimes tend to throw the baby out with the bathwater, asking for the old-fashioned decision process to be entirely replaced by data-based decision-making. Business-savvy data folks know that you should complement experience and gut feeling with data rather than replace them with data. And they usually know how to balance the two.

2.3.7.4 Concluding from Competencies in One Area to Competencies in Other Areas

Have you ever noticed that people follow actors and musicians on social media, to hear what they say about politics, economy and other areas? Question: Why on earth should actors understand these topics any better than your neighbour, just because they are famous, and probably excellent actors? The same applies to data handling. Imagine the organisation’s undisputed marketing mastermind using data to tailor the next marketing campaign. We would tend to assume that the result must be excellent. But can we conclude that someone very marketing-savvy is an expert in handling data as well?

2.3.7.5 Using Great Tools for the Wrong Tasks

Yes, I am thinking of spreadsheets. But not only! Most modern tools in data handling have their specific purposes, and they should not be used for other tasks.

At best, the performance is bad (e.g. if a Hadoop environment is used for single queries rather than for batch processing of mass data), but in some cases, you’d even draw wrong conclusions. Issues of this kind are often observed in organisations where data handling is IT-driven so that the solution comes before the corresponding question, and in organisations where shadow IT departments develop that are lacking the overall IT perspective. This situation can best be addressed through a permanent dialogue between the IT department and the Data Office, representing the business community.

2.3.7.6 Ambiguous Verbiage

Who takes care of how certain terminology is defined and used within your organisation? Do you have a well-documented, organisation-wide definition of “revenue”? Is everybody in agreement about the definition of a “key account customer” (and how this status of a customer can change)? If you ask around, people will confirm how important well-defined expressions are. This is why, in bigger corporations, each department has its own glossary, and sometimes even each project does. But the existence of multiple glossaries defeats the purpose. The same expressions appear in different glossaries, and they are usually not aligned. Each department will be able to explain why the other departments’ definitions cannot be used, and sometimes rightfully so.

But if the Sales department reports an increase in revenue year-on-year while the Tax department reports a decline, you are caught between the devil and the deep blue sea. Even worse, both departments could be right! Sales may look at the sales process and attribute the revenue to the month where the “deal was closed”. Tax may be legally forced to consider the moment where the “money is in the bank”. This may be significantly later, and in some cases, it may not happen at all…
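The Sales-versus-Tax situation can be reproduced with four invented deals. The sketch below is only an illustration: the amounts, dates and the field names `closed` and `paid` are assumptions, and real revenue recognition rules are of course more involved.

```python
from collections import defaultdict
from datetime import date

# Hypothetical deals: when the deal was closed and when the money arrived.
deals = [
    {"amount": 100, "closed": date(2017, 12, 10), "paid": date(2018, 2, 1)},
    {"amount": 100, "closed": date(2018, 6, 1), "paid": date(2018, 8, 1)},
    {"amount": 90, "closed": date(2019, 3, 1), "paid": date(2019, 5, 1)},
    {"amount": 60, "closed": date(2019, 12, 1), "paid": None},  # not paid yet
]


def revenue_by_year(rows, date_field):
    """Sum the deal amounts per year, using the given date as the definition."""
    totals = defaultdict(int)
    for row in rows:
        when = row[date_field]
        if when is not None:  # unpaid deals do not count as 'money in the bank'
            totals[when.year] += row["amount"]
    return dict(totals)


sales_view = revenue_by_year(deals, "closed")  # "deal was closed"
tax_view = revenue_by_year(deals, "paid")      # "money is in the bank"
```

With this data, the Sales view shows revenue growing from 2018 to 2019, while the Tax view shows it shrinking over the same period. Same deals, two valid definitions, two opposite headlines.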

Example 5

Let’s have a look at a simple question: What is a country? Sounds straightforward, right? But how about these concrete cases:
• Are Ceuta, Melilla and the Canary Islands countries on their own or part of Spain?
• Is San Marino a part of Italy?
• Which country does the Gaza Strip belong to? Or is it even a country on its own?
• Is Northern Cyprus part of Cyprus, part of Turkey or a country on its own?
• Does the Crimean Peninsula belong to Russia or Ukraine?
• Would you consider Greenland part of Denmark or a country on its own?

The answer to all of these questions is: It depends! It depends on the purpose of the data. In other words, you do not need to take a political position. You just need to decide what you intend to use the reference data for!

If you want to ship to customers in those countries, you’d be interested in customs borders: shipments to customs exclaves need to be customs cleared. If you want to divide the world into Sales territories, you’d rather follow geographical realities: you simply cannot serve Crimean customers from Kyiv. Eventually, you may decide to break down the world into areas that can be assembled to form different “countries”, depending on attributes. This allows you to tailor geographies differently for different purposes.

To avoid situations of ambiguity, you’d better have a neutral body that manages the organisation’s single glossary, in collaboration with all business functions. Ideally, this body becomes part of your Data Office.
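The idea of small areas that are assembled into different “countries” per purpose could be sketched as follows. The attribute values are illustrative placeholders, not authoritative reference data, and the structure (one attribute per purpose on each area) is just one possible design.

```python
from collections import defaultdict

# Hypothetical reference data: each area carries one attribute per purpose.
# The values are placeholders for illustration only.
areas = {
    "Spain (mainland)": {"customs": "customs area ES", "sales": "Sales territory Spain"},
    "Ceuta": {"customs": "customs area Ceuta", "sales": "Sales territory Spain"},
    "Melilla": {"customs": "customs area Melilla", "sales": "Sales territory Spain"},
}


def assemble(area_data, purpose):
    """Group areas into 'countries' according to the attribute for one purpose."""
    territories = defaultdict(list)
    for area, attributes in area_data.items():
        territories[attributes[purpose]].append(area)
    return dict(territories)


for_sales = assemble(areas, "sales")
for_customs = assemble(areas, "customs")
```

The same three areas form one territory for Sales and three separate units for customs clearance, without anybody having to decide what a “country” is in general.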

A well-managed glossary function would lead to harmonisation where there is no reason for different definitions, and it would lead to different expressions for different definitions where necessary.

2.3.7.7 Underestimating the Influence of Bad Data Quality

You cannot tell from a database table how accurate its data is. Some data users would rather take data of unknown origin or quality than find no data of the required kind at all. Summary: The one thing that is worse than decisions based on gut feeling is decisions based on an incorrect interpretation of data.

2.3.7.8 Confusing Causality with Correlation

Most citizens of Helsinki have a very light skin colour. The average temperature in Helsinki is 5.9 °C. Marseille has a population with slightly darker skin colour, and the average temperature is 11 °C. Cairo in Egypt has an average temperature of 21.3 °C, with even darker average skin colour. Conclusion: The darker the skin, the warmer the climate. Question: How on earth does skin colour influence the temperature? Can we lower the temperature in Cairo by asking Finnish people to relocate to Egypt? You get the point, right?

What we cannot derive from the mere data: does A influence B, or does B influence A, or do A and B not influence each other at all (but are maybe both influenced by an unknown C)? To determine causality, you need additional data. If someone cannot provide the evidence, please don’t believe in causality.

2.3.7.9 The Unconscious Trap of Singularity

You wouldn’t consider something a rule if it happens only once. But if it happens again? There must be a pattern? Well, no, not necessarily! Maybe 80 per cent of your customers are dissatisfied with a certain product. If you check two customers in a row, you may still have caught two members of the minority of satisfied customers, by pure coincidence. Yes, chances are a mere 4 per cent (1/5 × 1/5 = 1/25 = 4/100 = 4%). However, if applied in many cases, 1 in 25 cases shows a false positive. Would you take the risk?
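The 4 per cent figure can be reproduced, and extended to more than two checks, with a few lines. The sketch assumes that the checked customers are picked independently from a large customer base, so that each pick has the same 1-in-5 chance of hitting a satisfied customer. The function name is illustrative.

```python
from fractions import Fraction


def chance_all_from_minority(minority_share: Fraction, checks: int) -> Fraction:
    """Probability that every one of `checks` independent picks hits the minority."""
    return minority_share ** checks


satisfied_share = Fraction(1, 5)  # 20 per cent satisfied, 80 per cent dissatisfied

two_in_a_row = chance_all_from_minority(satisfied_share, 2)    # 1/25
three_in_a_row = chance_all_from_minority(satisfied_share, 3)  # 1/125
```

Each additional check reduces the risk of being fooled by a factor of five, which is a cheap way to decide how many cases you want to look at before calling something a pattern.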

2.3.7.10 Valid Formula - Wrong Context

People love formulas. You put data in, and they spit out a perfect result, as we have learned at school. You don’t even need to understand the formula. How about this one?

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \qquad (\sigma > 0) \]

This is the famous Normal Distribution (or Gaussian Distribution), one of the most frequently used formulas to describe how events distribute around a centre point. You will have come across the equivalent diagram, the Gaussian bell curve. This formula can be applied easily to any one-dimensional distribution, and it is in fact applied in many cases. But it comes with a series of preconditions which hardly ever get validated. You guessed it: Even where the preconditions are not met, the formula delivers perfect-looking results. But these will probably be wrong.

Even if you have a lot of data points that support the selection of a certain formula, it may not match any other data points. It is important to first understand the logic behind what leads to the data points. Only this will allow for the selection of the best possible formula.
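Here is a small sketch of what “perfect-looking but wrong” can mean in practice. It generates hypothetical waiting times that are clearly not normally distributed (they are simulated from an exponential distribution and can never be negative), fits a normal distribution anyway, and asks the fitted model how likely a negative waiting time is. The data, the seed and the variable names are assumptions for illustration.

```python
import random
from statistics import NormalDist

random.seed(42)

# Hypothetical, clearly non-normal data: waiting times are never negative.
waiting_times = [random.expovariate(1.0) for _ in range(10_000)]

# The formula accepts the data without complaint ...
fitted = NormalDist.from_samples(waiting_times)

# ... and happily reports a probability for an impossible event.
p_negative_according_to_model = fitted.cdf(0.0)
share_negative_in_data = sum(t < 0 for t in waiting_times) / len(waiting_times)
```

The fitted bell curve claims that a noticeable share of waiting times is below zero, while the data contains none. Nothing in the calculation itself warns you that the precondition was violated.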

2.3.7.11 Extrapolation Beyond a Model’s Sound Boundaries

I recall a diagram in a German magazine in the early ’90s. It was based on the figures available about HIV infections back then. In those days, there was a significant increase year on year. The magazine had determined a valid mathematical function that reflected these figures well, and they had derived a fairly scary forecast for the year to come by applying that very function. I recall my professor in Applied Statistics showing us the graph of the same function, simply extended by one decade. And, voilà, according to this mathematically valid function, the population of Germany would cease to exist within the next few years…

But even if the formula is correct and can be validated, we need to consider the boundaries of the model. One day my Physics teacher did an experiment with us on the conductivity of iron wire as a function of temperature. There was a clear and reproducible linear dependency. But then he increased the temperature even further. The iron became soft, the iron wire stretched, and the relationship became extremely non-linear. What we learned: You can apply a formula, but only within the range in which you know it to be correct!

Or look at Fig. 2.1, which shows the point averages of basketball star LeBron James in the NBA for his first 15 seasons (NBA.com 2019). It is fairly safe to derive from this diagram that his point average in his 16th season will be somewhere in the range between 25 and 30. A simple formula (called Linear Regression) would, indeed, suggest an average of 26.8 points per game, which turns out to be a mere 0.6 points below his true 2018/2019 performance.

Fig. 2.1  Average PPG per season/LeBron James (chart title: “LeBron James - average points per game”; x-axis: seasons 0 to 16; y-axis: 0 to 40 points; linear trend line: y = -0.0414x + 27.471)

However, the very same formula suggests that LeBron James would score an average of 26.2 points in his 30th season, when he will be around 50 years old… I guess you get the point. A formula doesn’t stay valid forever. But if a formula works within a certain range in our business life, we tend to assume that it is valid beyond that range. It usually isn’t. And this is not always as easy to spot as in the example above.
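The two forecasts can be recomputed directly from the trend line printed in Fig. 2.1. The sketch below adds one small, hypothetical safeguard that is not part of the original example: it reports how far a requested season lies beyond the 15 seasons the line was fitted on.

```python
SLOPE = -0.0414           # trend line from Fig. 2.1: y = -0.0414x + 27.471
INTERCEPT = 27.471
LAST_FITTED_SEASON = 15   # the line is based on seasons 1 to 15


def predicted_ppg(season: int) -> float:
    """Points per game according to the linear trend line."""
    return SLOPE * season + INTERCEPT


def seasons_beyond_data(season: int) -> int:
    """How far the forecast leaves the range the formula was fitted on."""
    return max(0, season - LAST_FITTED_SEASON)


season_16 = round(predicted_ppg(16), 1)   # one season beyond the data
season_30 = round(predicted_ppg(30), 1)   # fifteen seasons beyond the data
```

The formula returns an equally confident number in both cases. Only the distance from the fitted range tells you that the first is a cautious step and the second is a leap of faith.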

2.3.7.12 Purported Accuracy

“1.41421356 inches” sounds very accurate, right? Let’s assume someone reported it as the scientifically calculated length of a certain square’s side. In fact, someone may have estimated the square’s surface area to be 2 square inches. Subsequently, another person may have applied the (mathematically correct) formula that the length of either side is the square root of the surface area. And guess what? 1.41421356 is the calculated square root of 2. You will agree that the calculated value cannot be more accurate than the input number. However, eight digits after the decimal point suggest that it is. Now imagine you don’t know the history behind the result, i.e. you lack the knowledge that the surface area of 2 square inches was an estimate. How would you interpret the result?
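One way to keep the uncertainty visible is to carry the range of the estimate through the calculation instead of a single number. The sketch below assumes, purely for illustration, that “2 square inches” was an estimate rounded to the nearest whole number, so that the true area lies somewhere between 1.5 and 2.5 square inches.

```python
import math


def side_length_range(area_low: float, area_high: float):
    """Range of possible side lengths for a square with an uncertain area."""
    return math.sqrt(area_low), math.sqrt(area_high)


# Assumption: the estimate "2" stands for an area between 1.5 and 2.5
shortest, longest = side_length_range(1.5, 2.5)

falsely_precise = round(math.sqrt(2), 8)   # 1.41421356
honest_report = f"between {shortest:.1f} and {longest:.1f} inches"
```

Under this assumption, the honest statement is “between 1.2 and 1.6 inches”. The eight decimal places of the original figure add nothing but a false sense of accuracy.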

2.3.8 Do You Trust in Algorithms?

We all know that forecasts come with uncertainty, no matter how good the underlying data is. However, more and more people dream of getting to the best possible forecast.

And, yes, mathematical formulas and algorithms are usually well-proven, beyond doubt. But does this allow us to expect even more decisions to be taken by algorithms? Unfortunately not. Even backward-looking analyses will never be assessable in the best possible way. Here’s the reason: Each formula or algorithm is proven and valid only under clearly defined limitations and preconditions. If any of these is violated, you cannot guarantee any accuracy of the result. That is where human experience remains necessary. Let’s look at a concrete algorithm to illustrate the challenge:

Example 6

A well-known statistical method is the Analysis of Variance (also called ANOVA). It is used to find out whether a certain attribute (e.g. a quality measure) of a population (e.g. all instances of product X produced last week) differs sufficiently between subgroups (e.g. production processes).

Interpretation in this production example:
• If ANOVA says the expected average quality is significantly different between these subgroups, you can assume that one production process is superior to another.
• If not, you have no evidence that the choice of the production process has an impact on the quality (and you may well go for the cheapest process).

To understand why even such a broadly accepted method can never be implemented in the best possible way, let’s see how Wikipedia (Wikipedia – ANOVA 2019) describes the applicability of this algorithm. While you don’t need to understand the details, I’d like you to have a look at the underlined words:

A test result (calculated from the null hypothesis and the sample) is called statistically significant if it is deemed unlikely to have occurred by chance, assuming the truth of the null hypothesis. A statistically significant result, when a probability is less than a pre-specified threshold (significance level), justifies the rejection of the null hypothesis, but only if the a priori probability of the null hypothesis is not high.

Do you see how many assumptions we face in this case? In daily Analytics practice, all of these assumptions are judged by human beings, based on their experience. That means, for example, that two experts may decide to set the significance level differently. As a result, the same probability may fall below one expert’s threshold but stay above the other’s. Same data, same algorithm, different conclusion!

But can’t we develop software that judges these assumptions? Can’t we find an algorithm that determines the best possible significance level in the example above?

Let’s take the example further. A typical precondition you find in statistical algorithms is that the underlying process (e.g. the production process) is normally distributed (put simply: its values spread around the average in the form of a bell curve; see Fig. 2.2). We had already looked at this distribution earlier in this part of the book. You won’t be surprised to hear that a perfect normal distribution does not exist in reality. Nature does not follow mathematical formulas. Some distributions, however, are closer to a normal distribution than others. But when is a distribution sufficiently close to a normal distribution so that we can consider this precondition fulfilled? Is there a way NOT to leave the decision to humans’ gut feeling?

Of course, we can think of another algorithm that estimates whether a normal distribution is present. Yes, such algorithms exist! But they come with their own preconditions, which need to be validated as well. And so on… We may end up with a complex construct of algorithms validating the applicability of other algorithms, without ever having full validation in place.
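To make the point about the significance level tangible, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recommended implementation: the quality scores are invented, the function names are hypothetical, and the two critical values are taken from a standard F table for (2, 12) degrees of freedom. The sketch calculates the one-way ANOVA F statistic once and then lets two experts judge it against their own significance levels.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of samples (one list per subgroup)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation between the subgroup means, weighted by subgroup size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own subgroup mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Invented quality scores for three production processes (assumption)
process_a = [8, 9, 10, 11, 12]
process_b = [11, 12, 13, 14, 15]
process_c = [8, 9, 10, 11, 12]

f_value = f_statistic([process_a, process_b, process_c])

# Critical F values for (2, 12) degrees of freedom, from a standard F table
CRITICAL_F = {0.05: 3.885, 0.01: 6.927}


def is_significant(f, alpha):
    """True if the F statistic exceeds the critical value for this level."""
    return f > CRITICAL_F[alpha]


print(f"F = {f_value:.2f}")
for expert, alpha in [("Expert 1", 0.05), ("Expert 2", 0.01)]:
    if is_significant(f_value, alpha):
        verdict = "the processes differ"
    else:
        verdict = "no evidence of a difference"
    print(f"{expert} (significance level {alpha}): {verdict}")
```

Both experts use the same data and the same formula. Only the threshold, a human choice, separates their conclusions.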


Fig. 2.2  A normal distribution curve

But how about the results? Are we at least coming closer to the best possible way? Well, we always get a result. A mathematically correct result, looking at the formulas. But do we still know to what extent the result applies to the real world? Do we have all assumptions and preconditions under control?

OCR (optical character recognition) works well because you can generally validate the results, and you know where your model needs fine-tuning. But how often will you never know whether you could have taken a better decision? Would the other product line have been more profitable? Would the other location for your new plant have come with cheaper operating costs? Wherever you will not be able to find out in hindsight, you should be clear about your assumptions.

On top of all this, we face another, non-technical challenge. Data scientists may be tempted to use the lack of validation options in their favour. They can claim anything without running the risk of being proven wrong, so why should they bother about preconditions?


So, how do we ideally address all of these challenges? You may guess that a Data Office would play a prominent role here, and you can already derive a few requirements for a future Data Office:
• Human judgement cannot be avoided. This is not a matter of maturity of algorithms but inherent in the system. However, all decisions need to be transparent and well-documented.
• Biased data scientists are a good reason for a central Data Science team. Such a team would not only support functional Data Science teams but also set standards and measure quality.

These two moves will pay off in the long term as they foster trust in Data Science.

2.3.9 The Bias of the Investigator

During my university years in the early nineties, I was lucky to have the unforgettable Swiss mathematician Fritz Ulmer as my professor in Applied Statistics. Throughout his career, he was busy unmasking the misuse of statistics in daily life1. A key expression Ulmer coined 35 years ago is the “bias of the investigator”: The missing transparency of statistical methods and the lack of expertise outside the reporting teams allow many manipulated figures to remain undiscovered. And as the investigator is usually aware of this situation, he or she may be tempted to adjust the findings towards his or her own preferences. Have a look at your own organisation:

1Professor Ulmer’s readable website on the shady practice around election polls is still alive, unfortunately in German only. You can download a good document under (Ulmer 1987), with a brief English summary on pages 111–113.


Does the Board get informed through data? Who is preparing the message? Does that party have a stake in it? Would it prefer (or benefit from) certain outcomes?

Ultimately, each hierarchy level gets the reporting honesty it encourages. Whoever is afraid of the consequences of reporting bad figures, bad quality or generally bad news may be tempted to “interpret” data differently. Even worse, data gets aggregated while travelling to the top. Where reporting is already crooked at lower levels, there is no way to fix it at a higher level.

All of this is well-known from the financial reporting area. However, in traditional organisations there is an important difference between financial reporting and data analytics: the former is subject to external auditing (and remember: even Enron got caught), while people who fudge in the latter area may never get unmasked. The results of Analytics usually look plausible, and nobody outside the Analytics department has the skills and insight to validate the results.

That is why, independently of data management, you may wish to encourage honesty, even if it means bad news. You don’t want to be caught in a bubble of fake good news.

Beyond this, you may already guess that a properly designed Data Office will have responses to these kinds of challenges. This comprises neutral data quality management not only for the data itself but also for the quality of reports.

2.4 Why Now

Companies that have been successful for decades often fail to see why they should suddenly introduce a focus on data. After all, success had come without such a focus. The key point is that the world out there is subject to change. This change happens at a faster pace than ever before. What used to be an adequate way of doing business may not be adequate anymore.


In the Digital Age, how are we going to
• improve the speed and accuracy of decision-making
• increase revenue opportunities
• reduce costs
• increase productivity
without actively and systematically managing our data? You may wish to consider the following aspects:

2.4.1 The Amount of Data is Growing Exponentially

The amount of collected data is assumed to be doubling every two years, according to Gartner2. The potential amount of insight grows even faster, as more and more of the collected data becomes statistically relevant. Furthermore, an increasing number of devices produce or collect data. Think of mobile phones, tablets, sensors, and other so-called IoT (Internet of Things) devices. Falling hardware costs allow for those devices to be deployed in large numbers. You can assume that there is far more information about your customers outside your organisation than inside.
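What “doubling every two years” means over a longer planning horizon is easy to underestimate. The following small Python sketch does the arithmetic. The two-year doubling period is the figure quoted above; the function name and the chosen horizons are just illustrative assumptions.

```python
def growth_factor(years, doubling_period=2.0):
    """Factor by which a quantity grows if it doubles every `doubling_period` years."""
    return 2 ** (years / doubling_period)


# Illustrative planning horizons in years (assumption)
for years in (2, 4, 10, 20):
    print(f"after {years:>2} years: x{growth_factor(years):,.0f}")
```

Under this assumption, ten years correspond to five doublings, i.e. 32 times today’s volume, and twenty years to ten doublings, i.e. roughly a thousand times.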

2.4.2 The Expectation of Customers is Changing

Companies face “digital natives” as customers. Those customers know what is possible, and they expect it from others as well.

2Gartner Data & Analytics Summit 2019. Presented by Valerie Logan, Gartner Inc.


2.4.3 Business Decisions Require Data

Back then, when every CEO took decisions based on at least 20 years of experience, those with the best experience and the best interpretation of experience won. Today, the best CEOs combine both: They don’t throw away their gut feeling, but they validate it against data.

2.4.4 The Competition Does it Already

You don’t just fall behind if your performance deteriorates. You may perform as well as you have always done. But the competition may already use data to their advantage. Data is not only an internal advantage. The way an organisation treats data also shapes its reputation, and customers love savvy organisations! You’d better assume that the competition has already understood the message.

2.4.5 The War for Talent has Started

Data experts are a scarce resource. If you need them later, they may no longer be available on the labour market. That is why you may wish to develop a team now. This takes time, so you’d better start soon. Loyalty is difficult to achieve. Data experts and data scientists have a free choice. But if they have become data scientists within your organisation, and if they develop strengths in both data and your particular business, they can be given a unique perspective they wouldn’t find too easily elsewhere.

2.4.6 Data Privacy Sensitivity Increases

More and more people around the world are becoming sensitive to how their data is handled. Not too long ago, people used to share all details of their private life with Facebook. Today they have started to realise, mostly through published cases of data misuse, that there are risks associated with data being spread in an uncontrolled way.

As public opinion changes, legislation follows. An increasing number of countries regulate the handling of data. The resulting laws make data privacy breaches tangible and quantifiable. If a court decides that an organisation has violated data privacy law, it is hard to shy away. And it is not the fine that hurts most, even if, e.g. in the case of GDPR, it can reach a substantial percentage of an organisation’s revenue. It is the negative impact on an organisation’s reputation that does the most damage. The public doesn’t forgive data privacy misbehaviour as easily as in the past.

In response, it is not sufficient to run a GDPR project. Data privacy needs to become an organisational value; controls and governance need to be implemented as a permanent measure, to prevent and to alert early.

2.4.7 Progress in Technology Accelerates Even Further

The past 150 years have seen technology develop faster than in any era before. And it’s not slowing down. While the structure of cars, planes and vessels hasn’t changed too much during the past 50 years, a lot of progress has been observed in other areas. In the area of handling data, we do not only see more powerful hardware, cheaper memory and faster data transmission but also better utilisation of these technologies. Big Data and cloud computing were simply not possible a few years ago. Even previously well-known algorithms can now develop and mature as it becomes possible to implement and apply them to real cases. Parallel processing of sub-tasks allows for performance to increase faster than the mere increase in calculation speed of computer processors. The increase in collected data allows for faster and deeper Machine Learning.


All of this is expected to continue, and some disciplines are even still in their early development phases. Your organisation may not need to be a pioneer, but you would not want to miss the train.

2.4.8 Change has Become the Norm

In the past, the world changed over time. Today, change is part of the world. We cannot stop the changes around us by not changing ourselves. Instead, we may simply be left behind.

References

IFRS (2018) IAS 16. https://www.ifrs.org/issued-standards/list-of-standards/ias-16-property-plant-and-equipment/. Accessed 8 June 2019
IFRS (2019) IAS 38. https://www.ifrs.org/issued-standards/list-of-standards/ias-38-intangible-assets/. Accessed 8 June 2019
NBA.com (2019) www.nba.com. Accessed 30 May 2019
Ulmer F (1987, 10) Wahlprognosen-info.de. http://www.wahlprognoseninfo.de/archiv/GH.pdf. Accessed 8 May 2019
Wikipedia – ANOVA (2019, 07 17) https://en.wikipedia.org/wiki/Analysis_of_variance. Accessed 8 June 2019

3 Designing your Data Office

The Chief Data Office should not just be a cost center. It should proactively identify new business opportunities pertaining to the use of information assets to achieve competitive advantage in the marketplace. The CDO must represent data as a strategic business asset within the executive suites. Sunil Soares, Aditya Kongara, in: Information Management (Soares and Kongara 2014)

According to IBM (The Chief Data Officer Playbook 2016), organisations with a CDO are:
• 1.3 times more likely to outperform peers
• 1.5 times more likely to use data and analytics to stay ahead of competitors
• 1.9 times more likely to have business-driven data and analytics governance
• 1.7 times more likely to have a big data and analytics strategy
• 1.8 times more likely to use big data and analytics technologies pervasively across their organisation

It seems rewarding to embark on that journey.

© Springer-Verlag GmbH Germany, part of Springer Nature 2019 M. Treder, Becoming a data-driven Organisation, https://doi.org/10.1007/978-3-662-60304-8_3


This chapter describes a few topics you may wish to address before searching for a CDO and setting up a Data Office.1 While many of these topics are meant to be addressed by the future CDO, the CEO should develop an opinion as well. The role of data is becoming too critical to blindly rely on a single executive.

3.1 What are the Preconditions to Succeed?

If you intend to get a savvy Chief Data Officer on board, you will of course leave the organisational setup to that person. But a few steps need to be taken before actively searching for a CDO.
• You’d need to create a story so that everybody in your organisation can understand what you are aiming for. Otherwise, it may stay too abstract. Ask ten friends what “Data” means, and you’ll get eleven different answers…
• Your management should have clarity about the impact on the existing structure of your organisation. Bringing in a new person means existing managers may lose influence. They may also interpret your step as mistrust in the existing management.
• Thirdly, you also need clarity in discussion with CDO candidates. They need to know what you expect from them, and you’d like to know whether their vision is compatible with yours.

3.2 What does a Data Office Deal with?

3.2.1 What: The Data Vision

Before a Data Office is tailored, it must be clear what it is supposed to achieve.

1The organisation headed by a CDO can have different names such as Office of the CDO, Data Office or Data Management department. I will subsequently use the term “Data Office”.


Fig. 3.1  A data vision for the CDO

There is no single, universally valid data vision. It may differ from organisation to organisation. But it should be formulated from a stakeholder perspective. You can take Fig. 3.1 as an example.

Such a vision allows you to work backwards: “In order to achieve X, we need to do A, B and C.” And it should not be carved in stone: During the ongoing dialogue with stakeholders, you may come across further aspirations. Add them!

3.2.2 How: Translate the Vision

A good first step from an abstract vision to a concrete plan is a high-level list of targets. See Fig. 3.2 for a real-life example.


Fig. 3.2  The targets of a data office

Such a list allows you (or the CDO) to properly design a Data Office organisation against it: Which activities are required to achieve the desired attributes of your organisation’s data? The typical tasks of a Data Office are around bringing all data to a desired state and around establishing a way of maintaining that state. The latter of these two tasks is challenging as routine kicks in easily once it is all set up. It also emphasises that data management is a permanent function, not just a project! Eventually, data must always be ready to be exploited—be it by the Data Office itself, be it by other data consumers within the organisation.

3.2.3 The Different Kinds of Data

What kind of data would a Data Office deal with? In short, if you want to avoid inconsistencies, you would include all kinds of data. But what does “all kinds of data” mean? Here is an overview, summarised in Fig. 3.3 and explained subsequently:

3.2.3.1 Types of Data

You can generally distinguish between four different types of data, and they are related to each other.


Fig. 3.3  Types of data covered by the data office

• Master Data
• Reference Data
• Transactional Data
• Metadata

Transactional data refers to Master Data or Reference Data, and Metadata describes the structure of the other data types. That is why it makes sense to manage them under the same organisational umbrella, i.e. the Data Office.

In this context, “manage” does not necessarily mean “maintain” or “create”. Reference Data or Master Data may be maintained where the business ownership lies, and transactional data may even be created by customers or by intelligent devices (e.g. through sensors). However, the governance of all types of data needs to be centralised under the Data Office. The same rules need to apply, the change process and data quality definitions need to be valid across all business functions, and adherence to standards needs to be monitored centrally.
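The relationship between the four data types can be sketched in a few lines of Python. This is a simplified illustration only: the classes `Country`, `Customer` and `Order` and the helper `describe` are hypothetical names chosen for this example, not a recommended data model.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Country:
    """Reference Data: a shared code list."""
    iso_code: str
    name: str


@dataclass(frozen=True)
class Customer:
    """Master Data: a core business entity."""
    customer_id: str
    name: str
    country: Country  # Master Data may point to Reference Data


@dataclass(frozen=True)
class Order:
    """Transactional Data: a business event referring to Master Data."""
    order_id: str
    customer: Customer
    amount_eur: float


def describe(record_type):
    """Metadata: the structure (field names and types) of another data type."""
    return {f.name: f.type.__name__ for f in fields(record_type)}


germany = Country("DE", "Germany")
acme = Customer("C-001", "ACME GmbH", germany)
order = Order("O-1001", acme, 249.90)

print(describe(Order))
print(order.customer.country.iso_code)
```

The order only makes sense as long as the customer and country it points to are correct and consistently defined, which is one reason to govern all four types under the same umbrella.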

3.2.3.2 Data Areas

Across all data types, you can distinguish between external and internal data, between structured and unstructured data, or between current, future and past data (data lineage).


There may be different ways of classifying data areas. They all have in common that you would want all areas to be dealt with by one single, integrated team.

3.2.3.3 Confidential Data

Is any data too confidential to be given to the Data Office? Honestly, you would need to trust this office, just as you trust your HR department by letting them know your own salary. As long as the normal Segregation of Duties rules are followed, and as long as you trust your CDO, there is no reason to keep certain data areas out. In addition, you would not want to risk data quality issues by having the most critical data managed by somebody else.

3.3 Shaping the Data Office

3.3.1 The Culture of an Organisation

Before looking at the organisational aspects of a Data Office, it is important to have a look at the cultural aspects of the necessary change. You may agree or disagree with the frequently-quoted statement “Culture eats strategy for breakfast”.2 As a matter of fact, culture plays a key role in any change process. Does this mean that a good culture removes the need for a strategy? Let me borrow the analogy: Don’t let culture starve because it doesn’t get breakfast! In other words: A good culture will help fill a strategy with life.

2This phrase is often attributed to the great management consultant and author Peter Drucker (1909–2005).

The first thing you would want to look at is to what extent your organisation’s culture is already prepared for a change towards a data-driven organisation. More importantly, you will want to know where your organisation’s culture is in need of development, to allow for the introduction of a data-driven approach, or ideally to even foster it.

Not surprisingly, you would hardly find an organisation that is perfectly ready to take a data-driven approach but hasn’t done so yet. This is because preparation for becoming data-driven does not happen accidentally. It needs to be driven actively, with the target in mind. On the other hand, almost no organisation has to start from scratch. Most organisations are somewhat prepared. After all, an organisation consists of human beings, each of whom is part of the changing world we are living in. In many cases, well-prepared cultures of groups and individuals within the organisation just need to be brought together in a structured and well-communicated way.

The first task is to determine the gaps and to develop a plan to address them: Which aspects of our current organisational culture prevent us from becoming a data-driven organisation? Let’s have a look at three examples:

Example 1

A traditional brick & mortar corporation already has a mature governance structure that could be used to organise data management. A Data Office could be set up like one of the established, cross-functional back-office entities such as Human Resources, Legal or Security. If this is played well, the Data Office will “inherit” the positioning and authority of those entities—which are usually well-introduced and generally unquestioned. On the other hand, fairly typical for such a traditional corporation, the need for a Data Office is questioned by other executives. Some of them claim that “we have been successful for decades without such an office” or that a Data Office will inevitably become “another layer of bureaucracy”. This is where this organisation’s culture requires development.


Let’s have a look at an entirely different organisation now:

Example 2

A start-up company is growing. It tries to keep the start-up spirit while maturing its structure to become sustainable as an organisation. The CEO is bringing up the idea of a Data Office. Everybody agrees. “Data” is associated with the dynamic changes of the world the company is operating in. There is no dinosaur around who needs to get convinced that “data is the next big thing”. Great: this part of the company culture is a tick in the box. But then the new CDO talks about documentation, about standardisation, about playing by the (data) rules. This is against all the flexibility that has brought the company where it stands today! People stick to the culture of the times when the company was small enough for everybody to know everybody, and for all knowledge to be readily available in one of the few players’ heads.

This company’s culture obviously needs development as well, to become truly data-driven. Finally, an organisation somewhere between the first two examples:

Example 3

The owner of a medium-sized company asks the management board to consider the opportunities around data. Everybody agrees, and the company is considered sufficiently agile to question old ways of working. Now the board starts a discussion about how to address this challenge. A culture change is required, the right knowledge needs to be acquired, processes need to be questioned, and technology needs to be developed. The only member of the board who “speaks data”, however, is the Head of IT. This experienced leader claims the natural responsibility for the data topic. A new project is set up, and a project team of IT specialists starts to assess solutions for Masterdata Management, for Analytics and Business Process Automation (BPA). The board agrees that progress will be presented to the other functions on a regular basis.

This company is “doing” things. It has both the systematic approach that the start-up was lacking and the agility that was missing with the big corporation from the first example. But what drives the changes here? Yes, it is technology. Back in the 20th century when data was playing a minor role, computers helped calculate faster and with fewer errors. Later, they helped connect people to communicate over long distances and even to store data to a certain extent. The IT departments were the masters of all this, as it was technology that made the difference. The management board members, including the Head of IT, started their own careers back in those days. They happily agree for data management to be “outsourced” to IT.

You may have noticed the similarity between all three cases: People refer to the successes of the past, to justify sticking to the recipes of the past. This is a typical aspect of resistance to change which, as you have seen now, is not limited to the “old folks” in traditional industries.

It is my clear recommendation that you make the culture change towards a data-driven organisation part of a broader culture change within the organisation, to prepare for a rapidly changing world. Data is just one of the various aspects where we observe change these days. To put it in a nutshell, any organisation needs to become ready to address three key challenges:

3.3.1.1 Reality has Changed Dramatically

All recipes that worked in the past need to be questioned. This is valid for both technology and society. Think of the ever-increasing amount of data being collected, of your customers’ increasing maturity in dealing with technology and data, of the increasing risk of algorithms taking decisions without us being able to validate these, to name just a few aspects.

3.3.1.2 The Frequency of Change has Increased

Once you have adjusted to the current reality, you are not done. Expect the next change of reality to be around the corner. Historians talk about what has changed “from the 18th to the 19th century”, suggesting relative stability within each of the two centuries. The 20th century already deserves a finer granularity: People are comparing “the eighties with the seventies”. Meanwhile, we observe almost annual changes, driven by both technology and a truly globalised exchange of thoughts. Trends develop overnight.

3.3.1.3 Society becomes More Complex

As a consequence of the first two points, more and more different “generations”3 co-exist. Unless you run a niche company, you will have to address more different generations (preferences, ways of living) in parallel than at any time in the past.

Such a situation justifies a fundamental change of culture in your organisation. What worked well in the past (which may mean “last year”!) may not work in the future. And if your organisation doesn’t make the change, your competitors will.

Now, which role would IT play in a data-driven culture? Non-technical aspects of data such as logic, definitions and dependencies rapidly gain relevance. As a result, business people do not simply need to get engaged. They even need to take the lead. This has an impact on the role of IT, as part of the necessary cultural change. More and more, we should see a split of the traditional “Information Technology” area into a business-driven “Information” organisation (this is the Data Office’s area) and a “Technology” organisation (which is the classic IT role, remaining indispensable as a solution provider and a trusted advisor). The necessity for this split is independent of the formal reporting line of a CDO.

But it is not sufficient to introduce a culture that addresses the topics of data and other changes to technology and society. You also need a culture of permanent adjustment to a changing world, both within the area of data and beyond.

3Due to the increased speed of change, the traditional, biological meaning of “generation” is no longer adequate in this context. I am using the Wikipedia definition “all of the people born and living at about the same time, regarded collectively” instead (Wikipedia – Generation 2019).

3.3.2 Objectives of the Data Office

There are various options for the setup of a Data Office under a CDO. The choice depends on the data priorities as well as on the organisation’s culture and situation. So before designing the Data Office, it would be good to get clarity about the most important objectives of the Data Office. How about these:
• Encourage Business to take ownership of their data (and ask IT for the right technology)
• Collaborate across Functions (structure and automate the way we work together)
• Enrich standardisation by configurability (e.g. products, locations)
• Make data available across functions (Information at your fingertips)
• Base quality metrics on standardised data definitions; and measure data quality accordingly
• Avoid misunderstandings (through a glossary, definitions, data model, documentation)
• Help exploit data, both operationally and analytically
• Maintain, cleanse, secure and protect data

Your list may look different, and you may even prioritise some points over others. But you should have such a list very early during your planning phase.
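To illustrate the objective of basing quality metrics on standardised data definitions, here is a minimal Python sketch. All rules, attribute names and records are invented for this example; a real Data Office would agree its definitions with the business owners. The idea is simply that every attribute has one shared rule, and quality is measured as the share of records that satisfy it.

```python
import re

# Hypothetical standardised data definitions: one validity rule per attribute
DEFINITIONS = {
    "customer_id": lambda v: isinstance(v, str) and re.fullmatch(r"C-\d{3}", v) is not None,
    "country": lambda v: v in {"DE", "FR", "CH"},
    "email": lambda v: isinstance(v, str) and "@" in v,
}


def quality_score(records, definitions):
    """Share of valid values per attribute (missing values count as invalid)."""
    if not records:
        return {}
    scores = {}
    for attribute, is_valid in definitions.items():
        valid = sum(1 for record in records if is_valid(record.get(attribute)))
        scores[attribute] = valid / len(records)
    return scores


# Invented sample records with a few typical defects
records = [
    {"customer_id": "C-001", "country": "DE", "email": "a@example.com"},
    {"customer_id": "C-002", "country": "XX", "email": "b@example.com"},
    {"customer_id": "2", "country": "FR", "email": None},
    {"customer_id": "C-004", "country": "CH"},
]

scores = quality_score(records, DEFINITIONS)
for attribute, score in scores.items():
    print(f"{attribute}: {score:.0%} valid")
```

Because every function measures against the same definitions, the resulting figures are comparable across the organisation, which would not be the case if each team applied its own rules.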


3.3.3 The Footprint

A Chief Data Officer may be a lone warrior, a consultant to the CEO. But having a team to support the CDO may be advisable. The fundamental approach needs to be looked at before drawing the first org charts.

You can assume that not all data-related activities are new to the organisation. Somebody is already doing them today. That means you face a potential area of conflict: You’d either have to move data authority away from existing functions, or you’d have to keep the CDO powerless. Neither of the two extremes is advisable, and nor is an uneasy compromise. Let me sketch out four different models, admittedly a bit black and white, to illustrate the range of options:

3.3.3.1 King John Lackland

This is the least intrusive model (see Fig. 3.4): a CDO is added, without touching the existing organisational structure. The advantage is a conflict-free setup, and hardly anybody will perceive the CDO as a threat to their own power.

Fig. 3.4  CDO model “King John Lackland”


But such an approach will put the CDO in a disadvantageous position as other functional heads may push the CDO to the side using their mere resource advantage. The CDO may feel like someone in court without a lawyer: His or her valid position will face an uphill battle against a myriad of lawyers on the other side.

3.3.3.2 Voluntary Support

You could still go ahead with a standalone CDO. In this case, you would need to communicate broadly that functional heads are expected to provide all necessary support. One important aspect is the provision of resources for projects run by the CDO (illustrated in Fig. 3.5).

While this setup sounds like a good compromise between the CDO’s need for resources and the attempt not to hurt the other functional heads, it comes with a few significant disadvantages:
• Lack of loyalty: In case of doubt, all team members supporting the CDO will be loyal to their formal line manager.
• Lack of neutrality: The people supporting the CDO would most probably keep their functional focus, while a CDO needs an entirely cross-functional approach.
• Lack of quality: The CDO cannot expect to be provided with the best resources. And in case of full utilisation, only the “remaining time” will be made available to the CDO.

Fig. 3.5  CDO model “voluntary support”

3.3.3.3 Additional Team

You can avoid most conflicts by just letting the new CDO recruit an entirely new team, on top of the existing resources, as illustrated in Fig. 3.6.

This model would obviously incur costs, and the business case for a Data Office would become more of a challenge. But there are also other disadvantages to this model: If additional resources come from outside the organisation, they may lack insight into the business, its culture and its data structures, and they may not have an internal network. If people from other departments apply and are offered a position in the new Data Office, the CDO will face the same conflicts that were intended to be avoided.

Fig. 3.6  CDO model “additional team”


3.3.3.4 Moving People

This model would require an assessment of all existing departments for people who are already doing data work and would thus belong to the new Data Office. The shift of responsibilities from other teams to the Data Office is shown in Fig. 3.7.

An interesting first question would be: Who would be responsible for this exercise?
• The new CDO? This would put this person in a very delicate position, having to take away resources from other departments personally. Furthermore, how can you guarantee that the CDO does not misuse this power to create a new kingdom of his or her own?
• Or the CTO? The CFO? The Head of HR? How should any of these (admittedly very capable) leaders know which persons to select?
• An external consultant? Wouldn’t even such a neutral person depend on information provided by people who are impacted and thus biased?

Fig. 3.7  CDO model “moving teams”


Furthermore, expect many data specialists to have mixed tasks. Maybe just 25 per cent of a person’s daily work is about data. How do you deal with such cases?

You may agree that a plain-vanilla selection of any of these four models is not advisable. A promising approach is to build the Data Office over time. You could have the CDO start with a small team, a mixture of shifted resources and new positions. The key is not to slavishly follow a detailed plan. Over time, you can react individually to recent developments, in close dialogue with the CDO, while listening carefully to the organisation. I recommend that you consider the following aspects:
• the CDO’s concrete plans and projects,
• the preferences of data experts (who may wish to become part of the new Data Office or equally well want to stay “at home” in their business function or IT),
• the perspectives of Senior leaders (who may see a chance in strengthening the CDO position and subsequently creating a win-win situation) and
• the acceptance of the CDO’s strategy by the Board.

3.3.4 The Mandate

The basic question is whether a CDO needs a strong mandate from the CEO, or whether a bottom-up approach works. In other words, can a competent CDO with a team of enthusiastic people turn an organisation data-driven? The short answer, based on the experience in a variety of organisations across all industries, is a straight “No”.

The objective of a Data Office goes beyond supporting single functions (which the functions could achieve on their own by getting the right experts on board). A Data Office starts adding true value as soon as it succeeds in harmonising across functions and in creating synergies for the entire organisation.

How many people in an organisation are naturally interested in such a cross-functional approach? A handful, at best. Sometimes it is the CEO only. Most other people work for one of the various business functions and are primarily interested in optimising their own area.

The right data strategy, however, is what is best for the organisation, across all functions. This is far superior to the sum of what is best for each function (which does not consider synergies and win-win situations). Consequently, a strong mandate from the CEO, representing the owners/shareholders of the organisation, is a precondition of a successful data strategy.

Of course, authority alone won’t be sufficient. For a data-driven approach to develop, all business functions must want to play ball. Neither authority nor a mandate replaces the hard work of convincing stakeholders!

3.3.5 Managing Expectations

Data management cannot cure the world’s hunger or protect us from earthquakes. And even the good things data management can achieve will not be achievable overnight.

Even at an early stage, before a Chief Data Officer can provide a concrete roadmap, expectations need to be managed. A good key message at this point could be that we cannot take the second step before the first. In other words, people need to expect a period where things get tidied up and the foundation is laid. Of course, a CDO needs to deliver first results soon, to keep people confident. However, promising fantastic data insight within the first few months would be dishonest.

A good means of conveying the message is a look at the three dimensions of data, as illustrated in Fig. 3.8: We cannot just focus on future opportunities. The other two areas are usually far more urgent and necessary for sustainable success.

Fig. 3.8  Growing the three dimensions of data

Improving current is necessary to ensure today’s rules are adhered to, and data can be a great help here. Creating a solid foundation is a long journey. But without this foundation, data is useless, or even harmful, as it suggests wrong decisions.

3.3.6 Positioning a Data Office in the Organisation

There is no single natural position for a Data Office in an organisation’s hierarchy. Different options come with their pros and cons, and the best solution also depends on the situation of the organisation. Yet it is important to clarify this question very early in the process, as a CDO must have a clearly defined place and role to be successful.

This sounds like common sense, yet it is far from being the norm. The leadership consultancy Spencer Stuart states “[…] we find many organizations assigning the title without allocating the resources or responsibilities necessary to make this person successful” (spencerstuart.com 2018).

A CDO floating around as a grey eminence or as a member of the CEO’s staff will not be embedded deeply enough in the organisation’s business to be effective.

Fig. 3.9  Is there a space for a new CDO?

A CDO needs to fit into the overall “corporate jigsaw” (Fig. 3.9), both organisationally and culturally. If no such space is created, the CDO will remain disconnected and fail.

Here are a few points you may wish to consider:

• A CDO needs authority. Key players may not listen to a subordinate, no matter how good the story is.
• A CDO and the entire Data Office team need proper incentivisation. If a CDO reports into one of the business functions, the CDO’s targets may get set with a bias, while proper data management requires functional independence.
• Data Office work may benefit from being close to other cross-functional work such as innovation, transformation or change management.
• The effectiveness of data management also depends on the personal engagement of the person a CDO reports into. If this person is a visionary, cross-functionally thinking strategist, even the Chief Marketing Officer may be a good choice. (But remember: people come and go…)

• A new Data Office would not only do new activities. Where do the experts sit today who would ideally work for the CDO in future? What is your story for taking responsibilities away from their managers and giving them to the CDO?
• Who are the key stakeholders of the CDO? Do they easily feel represented by the CDO? If a CDO talks business but reports into the CIO, credibility may be at stake.

Let’s think it through step by step:

3.3.6.1 Should the Data Office be a Business Function?

Most people would intuitively locate a Data Office within IT. But is this really the most logical way of setting it up?

According to my observations, most organisations that struggle to become data-driven do NOT have a big issue with their Information Technology. Instead, they have a gap in bringing the business functions and IT together, for a cross-functional, business-driven data approach.

Look at the following important business roles a CDO could (and should) play:

i. Encourage the business functions to take ownership of their own data: Make them understand that there is a difference between the information technology required to technically process data and the data itself.
ii. Work cross-functionally, i.e. make the business functions work together. Data does not know functional borders.
iii. Represent the (cross-functional) business view towards IT. Formalise the data translation of how the business wants to work, and help IT develop the right database structure. Incorporate flexibility to allow for the future development of the business model.

The third point in particular means a significant change for many organisations. The monopoly on defining the relationship between business and IT is very often with IT, just because they are the only truly cross-functional entity within many organisations! Thank goodness they have taken care of cross-functional business alignment in the past. But are they really the best department to do so?

Of course, all of these points could be addressed by a function within IT as well as outside IT. But reports from various organisations suggest that IT people are generally weaker in approaching topics from a business perspective. They tend to provide answers and search for the corresponding questions in a second step.

Very often, however, it is not IT that requires a mindset change for an organisation to become data-driven. IT folks have dealt with data for decades, and they usually know how powerful it can be. It is the business people who often consider data as something abstract that naturally falls under the responsibility of IT. As a consequence, they easily miss data opportunities. At the same time, most IT specialists do not know the business side as well as the respective business specialists do. That makes them miss opportunities as well.

But where on the business side would a Data Office ideally be placed?

3.3.6.2 Data Office within the Organisational Hierarchy

Is data one of the key drivers of your organisation? In this case, you may want to have a Chief Data Officer at Board level. It particularly helps if the CIO and the CDO are peers.

The same logic also applies to organisations where information is an increasingly important aspect of their service offering. Even brick & mortar organisations realise that the digital accessibility of their services and the information about the status of rendered services become a key differentiator.

But how about organisations where data is expected to function as a classic enabler in the background? You could argue that the same applies to Finance, and yet we consider the CFO a well-accepted function at Board level.

Well, there may be reasons not to have a CDO as a member of the Board, notably if you want to start small and have data management grow with its successes.

Does your organisation have a cross-functional Board member? For instance, a Chief Transformation Officer or a Chief Strategy Officer? This is a great thing, as it allows the CEO to share the burden of cross-functional thinking, while most Board members will naturally focus on their functional area. Such an Officer would be a natural reporting line for a CDO. Cross-functional independence is guaranteed, and data could become part of a shareholder-oriented organisation strategy. The Data Office could work with process management and business capability management, which are cross-functional by nature as well.

Alternatively, the CFO could be a good fit. This applies particularly where a CFO is expected to be more than a bean counter, maybe even the financial counterpart to a more commercially oriented CEO. A CFO tends to understand the power of data by virtue of the role. Finance is traditionally about numbers. Furthermore, Finance folks often find themselves at the end of the data chain: Every day, they process information that most other departments have touched before, and every error or inaccuracy directly impacts Finance. This is a challenge, as regulatory expectations demand high data quality, particularly from Finance functions. As a result, a CFO will have a vested interest in powerful data management. This is not the worst precondition for a successful Data Office.

To sum up, the more an organisation’s operating model and product portfolio are related to data, the higher the CDO’s hierarchy level should be. In extreme cases, you have organisations like Facebook, where data is the product. In this sense, you may consider Mark Zuckerberg himself as Facebook’s Chief Data Officer.

Finally, remember that a CDO’s mandate is more important than the formal position and hierarchy!

3.3.6.3 Relationship to the Other Top Executives

How should you design the relationship of a CDO with the CIO, other Board members and further key players such as a Chief Digital Officer?

Even if you decide against a CDO at Board level, that position requires a certain degree of authority over the Board members, comparable to the authority of a Chief Compliance Officer, the Legal Counsel or a Chief Information Security Officer.

A CDO should regularly meet all Board members. This will support the CDO in understanding the needs and pain points across all functions. Conversely, it will help data considerations play a more prominent role in functional strategies.

3.3.7 The Data Office: Police or Service Provider

What is the biggest value you add as a Data Office? It certainly depends on your organisation’s situation, but in many cases, you will face a dilemma which I call “Police or Service Provider”.

Let me describe a typical situation that you often find in organisations before a Data Office is set up.

Example 4

• IT is running a BI function that supports different business functions. Whenever there is something new in technology, the BI team will get trained on it, introduce it and share it with their business counterparts. If a business function needs something, be it technology or data, it will go to that BI team, and they will support it to the best of their knowledge (and as their bandwidth allows).
• Most departments collaborate with IT on BI and maybe on other data topics (e.g. Masterdata), but they usually don’t work with each other.
• Furthermore, certain departments have traditionally been doing their own thing. This ranges from developing an unprecedented level of sophistication in dealing with Excel to building their own data warehouse, with a departmental team of true data experts.

Sounds familiar? This setup can be found in many organisations that are too big to sit in one single office building. Most managers would confirm this observation. Worryingly, they don’t see it as an issue.

This is where the challenge starts. It comes with two problems that go hand in hand and amplify each other.

The first problem is a subject-specific one. In an environment as described above, activities are duplicated, knowledge is not shared, different departments develop approaches that are not compatible with each other, the source of data may not be validated by business specialists and so on.

The second problem is even more of a challenge: Everybody seems to be happy with the status quo. Why is that? Well, it is human nature to optimise what is important for oneself. The Head of Marketing is NOT focusing on the best possible data management for the organisation: he or she is trying to get the most out of the data to support Marketing.

This is usually amplified by an organisation’s reward system. And here I am not only talking about the bonus scheme. It is about how “success” is defined in an organisation. Senior managers and executives usually have the one area they are associated with: they are successful if this area is (seen as) successful. So, why on earth should they be interested in investing their own resources in supporting other departments’ targets?

This sounds bad, and most organisations have something like “collaboration” in their values or principles. But it simply is human nature. Of course, there are those altruists who always have the success of the organisation in mind. But how often do those managers make it up the promotion ladder?

Consider a manager’s target to be seen as successful, and imagine a project this manager is associated with. Now assume that this project comes under pressure (which probably applies to 95 per cent of all projects during their lifetime). What is going to happen?
First, the managers in charge (the project manager and everybody above) will focus all energy on getting this lighthouse project back on track. They will sacrifice targets that are not supporting this one initiative, including targets of other departments and overall organisation targets.

And now remember the three dimensions of any initiative or project: No matter which project management methodology you apply, it comes down to time, money and quality.

A project in trouble simply means that you cannot make the target in all three dimensions anymore. You’d either have to postpone project milestones (compromising on the time dimension), or you have (had) to add resources to recover (compromising on the money dimension), or you have to compromise on the quality dimension, by reducing scope, simplifying requirements and so on.

So, which dimension would you prefer to compromise on? Remember, you don’t want to look bad!

If you compromise on time, everybody will see the slipped milestones (unless you are a professional cheater, but we can focus on non-criminal behaviour here).

If you compromise on money, you admit that you are running out of budget, for all to see. Again, experienced managers have their spare buckets and buffers, but in a properly run organisation, people usually cannot hide too much.

But wait! Isn’t there this third dimension called “quality”? And, yes, this dimension gives far more flexibility than the other two. It is not only about formally agreeing on a scope reduction with all relevant stakeholders, visible to all of them. It is also about the “quality under the hood”. It is about rotten compromises that can be hidden for a while. It involves dirty workarounds that replace sustainable solutions. In general, most of this may cause issues later, but ideally in other areas or, if in the same area, only by the time it hits the successors of the people in charge.

Why do I describe all of this while hardly ever mentioning the word “data”? Well, data is usually the first victim of such a situation. Data is cross-functional, and proper data management is sustainable, with a long-term perspective.

This leads us to the question in this section’s headline: Police or Service Provider?

• The Police role comes with telling departments what to do and (even more importantly) what not to do. It means explaining rules, insisting on adherence and even monitoring compliance.
• The Service Provider role supports departments in dealing with data. It may comprise activities the departments could also do on their own but not as efficiently as a centralised cross-functional team.

Not surprisingly, most people like the “Service Provider” role of the Data Office more. But the overall perspective of the organisation certainly requires the “Police” role.

Obviously, both roles are important, and they must be part of a Data Office’s mandate. In reality, those parts would need to be executed hand in hand, somewhat following the “carrot and stick” principle.

A first key idea is for a Data Office not to insist on full data compliance immediately everywhere. A typical statement could be “Okay, I support your workaround for now, so that you can complete your project on time, but you have to commit publicly to closing the gap within the next six months.”

A second important aspect is to ensure that people experience “policing” as a positive thing. This is where reciprocity helps:

• “We have ensured that department X adheres to our data standards. This has made life easier for your department Y.” (This suggests that department Y is expected to adhere to the data standards as well.)
• “Since all departments keep their parts of our customer data up to date as per our data rules, people like you have a true customer 360 view.” (This suggests that the addressee is also expected to keep her/his part of the customer data up to date.)

A target can be to position the Data Office similarly to departments like Security or Legal: Not everybody loves following their direction, but everybody involves them, as the consequences of not doing so are not desirable.

3.3.8 Centralising the Management of Data?

Some organisations benefit from centralising the way data is handled. Others succeed with a federated model. For most organisations, a balanced hybrid between the two extremes works best.

Certain areas need to be centralised by all means, to ensure a consistent way of working. The most prominent aspects are Data Governance, data modelling, data strategy and data quality principles. These points make the difference between a federated model and data anarchy.

When finding the right balance, the Subsidiarity principle may help. In this case, it means: Whatever can be done in a decentralised way should be done this way. But “decentralised” does not mean “anarchy” or “ungoverned”! It usually means that people in all geographical areas and all business functions work within a level of freedom that has been determined centrally.

You may consider the following aspects:

• Does your organisation have a single product portfolio or various independent portfolios?
• How complex is your organisation’s business model? If it is easy for data specialists to understand, they could be centralised under a single Data Science and Data Analytics department. Otherwise, it may be wiser to have Analytics teams within the business functions, surrounded by functional specialists and focused on one aspect of the business. In this case, you may foresee a small central team setting the rules and supporting certain cross-functional tasks in data analytics and data science.
• Whenever data reaches across functions or geographic areas, rules should be centralised. Execution may be federated.
• Whenever quality is to be compared across functions or geographic areas, the standards of data quality measurement need to be centralised. The measurement itself may well be federated.

• Centralisation may have a psychological impact. People may disengage, leaving it to “those in the ivory tower”. Or they may boycott “those smart-asses in the head office”. It is usually not required to give up necessary centralisation efforts to prevent this from happening. Instead, make it a natural thing that there is a central team dealing with these topics, just as another central team is dealing with corporate HR topics or Legal stuff. Or, as Gartner’s 2019 Chief Data Officer survey suggests, “for CDOs to be successful, it’s about being the centre of gravity and not control.”⁴
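To make the split between centralised standards and federated measurement more tangible, here is a minimal Python sketch. It is an illustration only, not a recommended tool: the rule names, the record fields (`customer_id`, `country_code`) and the sample records are hypothetical assumptions. The point it shows is that one shared rule set is defined once, while each function applies it to its own data, so that the resulting scores remain comparable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass(frozen=True)
class QualityRule:
    """A centrally defined data quality rule (hypothetical structure)."""
    name: str
    check: Callable[[Record], bool]


# Centralised: one shared rule set, owned by the central data team.
# The field names below are illustrative assumptions.
CENTRAL_RULES: List[QualityRule] = [
    QualityRule("country_code_present",
                lambda r: bool(r.get("country_code", "").strip())),
    QualityRule("customer_id_present",
                lambda r: bool(r.get("customer_id", "").strip())),
]


def measure_quality(records: List[Record]) -> Dict[str, float]:
    """Federated: each function runs this on its own records.

    Returns the share of records passing each central rule.
    An empty data set is reported as 0.0 by convention (an assumption).
    """
    if not records:
        return {rule.name: 0.0 for rule in CENTRAL_RULES}
    return {
        rule.name: sum(rule.check(r) for r in records) / len(records)
        for rule in CENTRAL_RULES
    }


# Two functions measure their own data against the same standards.
sales_records = [
    {"customer_id": "C1", "country_code": "DE"},
    {"customer_id": "C2", "country_code": ""},
]
marketing_records = [
    {"customer_id": "", "country_code": "FR"},
    {"customer_id": "C9", "country_code": "FR"},
    {"customer_id": "C10", "country_code": "US"},
    {"customer_id": "C11", "country_code": "US"},
]

sales_scores = measure_quality(sales_records)
marketing_scores = measure_quality(marketing_records)
```

Because both functions use the same rule definitions, their scores can be compared directly, although each function executes the measurement itself.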

3.3.9 Data Collaboration

Collaboration is an important aspect. Independently of the degree of centralisation, all data players of an organisation should be linked together through a network.

So, what is a “data player”? This term describes all employees who work on data most of the time, executing non-repetitive tasks. It ranges from professional data scientists to people who monitor KPIs to determine patterns and root causes.

Creating such a network should be part of the mandate of a Data Office. It involves asking data players across the organisation to spend a certain (small) percentage of their time working together.

You can expect the following benefits from such networks:

• Avoiding duplicate work
• Avoiding inconsistencies
• Sharing successful practices
• Sharing failures, to avoid repetition
• Learning new stuff together
• Sharing ideas

⁴ Gartner Data & Analytics Summit 2019. Presented by Valerie Logan, Gartner Inc.

Most data players like the idea. It helps them learn from others but also share their own achievements with the rest of the community. A mandate is required, as those data players’ line managers often don’t want them to spend time on activities not directly related to their departmental tasks.

3.4 The Chief Data Officer (CDO)

3.4.1 Authority or Charisma?

In many cases, a CDO has three challenges at the same time: managing data, developing a data organisation and selling the concept to the key stakeholders. The first challenge can be assumed to be the easiest of the three.

So, yes, a CDO needs authority. However, you should search for a CDO who is a good salesperson, a good communicator and a diplomat at the same time.

3.4.2 What are the Key Attributes of a CDO?

Yes, a CDO should know how to treat data properly. But a lot of non-technical skills are required in this role. In fact, a CDO is three roles in one:

3.4.2.1 Data Expert

While a CDO will usually leave data science or data maintenance to the Data Management team, experience and knowledge are paramount, particularly in bigger corporations. A CDO needs to be respected, and respect without knowledge is difficult to achieve.

3.4.2.2 Communicator

A CDO needs to be a storyteller and a salesperson. People at all levels within the organisation need to be convinced, even fascinated. A good CDO makes them buy into the overall concept and the importance behind it, without necessarily having to understand all details. Thank goodness, the idea to be sold is a really good one. There is something in it for everybody.

3.4.2.3 Diplomat

Different business functions have different targets. They may not be willing to invest in cross-functional data. The frequent question “What’s in it for me?” needs to receive a convincing answer.

People may very well understand why proper management of data is important, but they may still resist for personal reasons. What if managers lose influence? What if they are asked to invest in work that other departments benefit from? What if they need to accept compromises where they could previously do it all on their own?

3.4.3 What is the Future of the CDO Role?

Will a CDO have the same scope in ten years? Most probably not! Will the role disappear as it makes itself superfluous? No, that won’t happen either! In fact, the role of a CDO will develop, and it is difficult to forecast the direction.

We can assume, though, that the training and selling aspect will diminish slightly as the understanding of data increases throughout the workforce over time. A lot of pioneer work will also be done at a certain point, e.g. the establishment of a stable data governance framework.

However, the responsibilities that come with managing data as an asset will stay, just as the responsibilities of a CFO won’t go away.

Content-wise, a CDO role will not narrow down to a Chief Analytics Officer, no matter how important Analytics, Machine Learning and Artificial Intelligence are going to become. After all, the success of all these disciplines strongly depends on how data is dealt with before it gets used here. In other words, data governance, data quality, data model, glossary and data architecture will become even more critical the more an organisation does with its data.

Mid-term, proper management of data will need to become one of the pillars of a well-managed organisation, similar to the management of the organisation’s other assets, of its technology and its workforce.

3.5 Data Governance: Review and Decision

3.5.1 Data Councils

Data is cross-functional by nature, and data decisions need to be taken from a shareholder perspective. In many organisations, however, the Board itself is the only cross-functional decision body, and it may have neither the bandwidth nor the expertise to decide on all data matters.

It is, therefore, a good approach to have cross-functional data bodies on multiple levels within an organisation, as illustrated by a real-life example in Fig. 3.10.⁵

A CDO would focus on shaping these bodies, to be able to take all relevant stakeholders with him or her on the data journey. This would also help the different hierarchy levels of an organisation work together as efficiently as possible.

But there is one body that needs to be explicitly mandated by the CEO: The Executive Data Decision Body. This top-level body below the Board would consist of all relevant functional heads, and it would be chaired by the CDO. Its responsibility is to

• Provide strategic direction for the CDO
• Hold the overall responsibility for Data Management
• Authorise data-related programmes, projects, workstreams
• Set priorities
• Liaise with any budgeting and funding approval bodies

Fig. 3.10  The data governance pyramid

⁵ The names of the bodies used here are by no means expected to be strictly adhered to. In fact, naming should follow the respective habits of your organisation. Composition and authority are far more important than names.

Important: This body should be a pure approval body, not a discussion body! It requires well-prepared decision proposals. The ideal statement in such a body is “my team told me that this proposal is okay, so I approve”.

Any concern should be addressed upfront, and requests for approval should only be submitted to this body as soon as all functions have signalled their agreement.

This is ideally achieved through a cross-functional collaboration body below this executive body. Such a collaboration body would consist of departmental heads, and it would:

• Represent Data Producers and Data Consumers across all functions
• Discuss data-related opportunities and issues
• Take tactical decisions on its own
• Propose data-related initiatives
• Track progress of data-related projects and request action in case of problems
• Prepare decisions of the Executive Data Decision Body

Depending on the size of the organisation, such a second-level body would also be chaired by the CDO or by one of the CDO’s direct reports, e.g. the Head of Data Governance.

An important principle for all councils is the Collaborative Approach: Avoid majority votes! Agree jointly what’s best for the organisation. This requires a lot of pre-work and discipline, but it pays off in the long run, as there is no “loser” who may feel inclined to boycott a majority vote during daily business.

To avoid inflation of councils in an organisation, you may agree to set up one single second-level body for multiple purposes, including data. After all, the attendees are not going to be data experts but functional heads with multiple responsibilities.

Another idea is to have a Data Collaboration Council co-chaired by an IT Executive (the CIO or one of the CIO’s direct reports) and the CDO. This is to stress the fact that data discussions are not only about business requirements but also about implementation priorities.

3.5.2 Data Representation on Existing Bodies

Data should be on the agenda of all business and IT initiatives and discussions. That is why the Data Office should be represented on existing collaboration or decision bodies.

3.5.2.1 Business Bodies

This is not only about data-heavy bodies such as a Digital Board or a Finance committee. You should also think about areas like Labour Relations, where representatives of employer and employees regularly review HR topics. Data privacy is usually an important topic here, and the Data Office should represent the employer view.

Setup and level of participation with any such business body need to be determined individually, but the overall direction that the data view is mandatory needs to come from the top.

3.5.2.2 IT Bodies

Collaboration between the Data Office and IT can be fostered through an Architecture Review Board, which IT should have set up independently of any Data discussion. This is usually a place where Data and IT complement each other, and they can form a joint voice towards the business.

In such a board, you would typically have IT representation for Technical Architecture and Application Architecture, while the Data Office would represent Data Architecture.

It is even possible for this Board to jointly develop and enforce architecture principles for the entire organisation. Many of these principles, such as Service Oriented Architecture (SOA) or minimisation of data duplication, traditionally stand on the agenda of both IT and Data anyway.

3.5.3 Data Review and Decision Process

People may not know which kind of decisions to bring to the Data Councils for approval.

3.5.3.1 When to Apply the Data Review and Decision Process

I suggest that concrete criteria be agreed for activities to go through this process:

(i) Potential changes to the structure or logic of data
– Example: Data field mapping when merging two organisations; changes to the structure of the attributes of “customer”.
(ii) Introduction of or changes to data handling (processes, roles)
– Example: Reference data management as part of moving a functional application to the cloud.
(iii) Potential changes to the data lifecycle (sourcing, movement, modification, consumption)
– Example: Replication of Masterdata into the cloud (where a functional solution is migrated to a cloud solution).
(iv) Suspected cases of violated Data Standards (or their IT equivalents)
– Examples: Duplicated reference data maintenance or data sourcing from an unreliable source.

3.5.3.2 How is this Process Triggered?

The best process is of no value if it is not clear when to use it. People may want to circumvent it to avoid resistance. That is why it is important to clearly define all situations that trigger this process.

Furthermore, governance needs to be clear: Who can bring in a topic, who decides whether the process applies and so on. It is very important to ensure that everybody can bring up a topic, either directly or through a clearly defined representative.

Here is a list of typical triggers of the data process:

(i) Project approval or funding request (directly or via Architecture Review Board)
(ii) Issue report: Somebody reports something to be wrong (including findings of data quality assessments executed by the Data Office itself)
(iii) Change request (against data structure, terminology, applications, data handling process, data sources, data flow)
(iv) Expiring data compliance exemption (where a deviation was approved temporarily)
(v) Request for clarification (where an existing policy or guideline may be ambiguous)
(vi) Escalation: All cases where different parties cannot agree voluntarily. (This is not necessarily a negative thing, as different parties naturally have different objectives, and escalation leads to a cross-functional perspective.)
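If you document such a process in a workflow tool, the criteria and triggers above can be written down explicitly. The following Python sketch is a hedged illustration of that idea only: the class names, the `Topic` structure and the function `needs_data_review` are hypothetical and not part of any specific tool. It assumes that a topic enters via exactly one trigger and goes through the process as soon as at least one of the agreed criteria applies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Trigger(Enum):
    """Typical triggers of the data process, as listed above."""
    PROJECT_APPROVAL = "project approval or funding request"
    ISSUE_REPORT = "issue report"
    CHANGE_REQUEST = "change request"
    EXPIRING_EXEMPTION = "expiring data compliance exemption"
    CLARIFICATION = "request for clarification"
    ESCALATION = "escalation"


class Criterion(Enum):
    """Agreed criteria for applying the review and decision process."""
    STRUCTURE_OR_LOGIC = "changes to the structure or logic of data"
    DATA_HANDLING = "introduction of or changes to data handling"
    DATA_LIFECYCLE = "changes to the data lifecycle"
    STANDARDS_VIOLATION = "suspected violation of Data Standards"


@dataclass(frozen=True)
class Topic:
    """A topic brought forward by anybody in the organisation (hypothetical)."""
    title: str
    raised_by: str
    trigger: Trigger
    criteria: FrozenSet[Criterion] = frozenset()


def needs_data_review(topic: Topic) -> bool:
    """Assumption: the process applies if at least one criterion is met."""
    return len(topic.criteria) > 0


cloud_move = Topic(
    title="Replicate Masterdata into the cloud",
    raised_by="Project lead, Logistics",
    trigger=Trigger.PROJECT_APPROVAL,
    criteria=frozenset({Criterion.DATA_LIFECYCLE}),
)
wording_question = Topic(
    title="Unclear wording in a reporting guideline",
    raised_by="Analyst, Finance",
    trigger=Trigger.CLARIFICATION,
)

print(needs_data_review(cloud_move))
print(needs_data_review(wording_question))
```

In practice, the decision of whether a criterion applies remains a governance question for people, not for code. The sketch merely shows how explicit the list of triggers and criteria needs to be for the process to be used consistently.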

3.5.4 Data Ownership

Data ownership needs to be defined for all data elements, ideally at the executive level. This is a task a CDO cannot take on without full Board endorsement as it means concrete responsibilities and work for most of an organisation’s business functions.

3.6 How can a Data Office Wield Influence?

3.6.1 Internal Data Networks

3.6.1.1 Purpose

Networks are meant to enable and encourage teams to work together on projects, find synergies, learn from each other, develop joint solutions and concepts, submit requests etc. You would want to have separate networks for the “Data Folks” (those primarily busy with data) and the business community (those to whom data is just a means of achieving their targets).

3.6.1.2 Challenges

But there should not only be the one data network. The data roles within the departments are simply too different. And: All data domains need business owners, but people work for departments, and data elements are used across departments. The most famous data domain is probably “Customer”, and if it is owned by Sales, who will represent Marketing, Customer Service, Legal, Customs, etc.?

There may be multiple ways of taking all of this into consideration. In other words, there is no single “right answer”. But it should be clear that the design of a good data network is never easy and straightforward.

3.6.1.3 Possible Setup

Here is an example of what the different elements of an organisation’s data network could look like.

(i) Functional Data Champions
Each business function would select one Data Champion to represent the interests and needs of that function in the area of data. This is usually an additional role for a seasoned, data-savvy functional expert.

(ii) Business Data Owners
Business Data Owners represent one Data Domain (e.g. “Customer” or “Product”). They have to align with all stakeholders of that Data Domain, beyond their own function. This is not a dedicated position either. For this role, an organisation needs open-minded people who don’t wear blinkers. They foster data collaboration between business functions.

(iii) Data Creators: Data Stewardship Network
These people have the authority to maintain data on a daily basis, be it the insertion of a new country code or the modification of a customer record. They usually sit across departments. Their work has an impact on other departments, so they should form a community for an ongoing dialogue. People are not “appointed” as a Data Creator; they are Data Creators by virtue of their normal business role.

(iv) Data Consumers: Analytics Network
Most organisations have data analysts, data scientists, reporting specialists and other people who try to get insight from data on a daily basis. These people may belong to different departments, and they may even be spread across locations. Bringing them together in one community bears great opportunities.

It is a huge chance for a Data Office to orchestrate those networks actively. This includes an actively operated collaboration platform such as Yammer. People will use it if they find it helpful, e.g. if their questions get answered, be it by the community or by the Data Office. Once accepted, such a platform can also be used to disseminate messages and information around Data.

3.6.2 Data Guidelines

What is right or wrong in data matters should not only be clear in the head of your CDO. It needs to get documented. This is not about data knowledge in detail but rather about the general direction that can be understood by everybody, from the nerdiest database expert to the Head of Marketing. Think of “The Constitution” rather than a law book.

The creation of such a set of data principles should be one of the first deliverables of a CDO. It is important for the CDO to develop these principles together with key data players across business and IT functions. (Please expect the right people to come from within the data network as described above.) The target is a set of data principles that everybody considers “our data principles”.

3.6.3 Regular Reports to the Board

Data must be on the agenda of the organisation. It must be visible. Reports are an important tool to achieve this. That is why a regular report of a handful of data metrics should be a standing item on the Board’s agenda, ideally in the form of a dashboard.

Key considerations:

• The KPIs should be linked to business performance.
• It should be possible to break down these KPIs, which would only be relevant in case of issues.
• KPIs should be compared to the previous reporting period, to the agreed targets and to industry benchmarks where possible.
• It must always be clear that any data quality targets are not just the CDO’s targets but the entire organisation’s targets.
• Bad KPI results should not just be reported. At the time of presenting them to the Board, they should come with analysis and ideally already first learnings and recommendations.
• Each of the key KPIs should be part of at least one Board Member’s personal objectives.
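The comparison of a KPI with the previous reporting period, the agreed target and an industry benchmark can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `KpiReading` and `summarise`, the field layout and the idea that a KPI may be "higher is better" or "lower is better" are hypothetical and not taken from any specific dashboard tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KpiReading:
    """One data KPI as it might appear on a Board dashboard (hypothetical structure)."""
    name: str
    current: float
    previous: float                    # value in the previous reporting period
    target: float                      # agreed target
    benchmark: Optional[float] = None  # industry benchmark, where one exists
    higher_is_better: bool = True      # e.g. False for a duplicate rate


def summarise(kpi: KpiReading) -> dict:
    """Compare a KPI with the previous period, the target and the benchmark."""
    sign = 1 if kpi.higher_is_better else -1
    summary = {
        "name": kpi.name,
        "change_vs_previous": kpi.current - kpi.previous,
        "gap_to_target": kpi.current - kpi.target,
        # On target if the value is at least as good as the agreed target.
        "on_target": sign * (kpi.current - kpi.target) >= 0,
    }
    if kpi.benchmark is not None:
        summary["gap_to_benchmark"] = kpi.current - kpi.benchmark
    return summary
```

A KPI that misses its target would then be presented together with the analysis and recommendations mentioned above, not just as a number.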

3.6.4 Data needs Internal Marketing

Data will only turn into value if there is a mindset change across all functions. Look at the following checklist to find out how far you are:

• Everybody deals with data. All of us should take ownership of the data aspects of our work.
• Teams should engage the Data Office early. Data aspects should be considered during planning and design.
• Data should not be considered an “IT topic”. IT provides the technology that enables proper data handling.
• Proper data management work should not be considered “additional work”. It will pay off soon after.
• Projects should always think cross-functional. Data is cross-functional in most cases.
• It is not a sign of failure to ask specialists for support. The Data Office can complement functional experts.

References

Soares S, Kongara A (2014) www.information-management.com. 03 04 2014. https://www.information-management.com/news/key-considerations-in-establishing-a-chief-data-office. Accessed 15 June 2019
spencerstuart.com (2018) Data leadership. November 2018. www.spencerstuart.com/-/media/2018/november/data_leadership_2018.pdf. Accessed 23 May 2019
The Chief Data Officer Playbook (2016) IBM Institute for Business Value. May 2016. https://www.ibm.com/analytics/yourdata/cdo-best-practice/. Accessed 30 Apr 2019
Wikipedia - Generation (2019) https://en.wikipedia.org/wiki/Generation. Accessed 31 May 2019

4 Implementing the Data Office

A CDO appointment can trigger a range of problematic responses from business and IT, and sometimes organizational chaos, ranging from simple ignorance to overt conflict. Instead of rallying the troops and leading the charge, the CDO is hemmed in and besieged. Gartner Inc. (Faria and Logan 2018).

© Springer-Verlag GmbH Germany, part of Springer Nature 2019
M. Treder, Becoming a data-driven Organisation, https://doi.org/10.1007/978-3-662-60304-8_4

4.1 Set the Targets—Plan the Journey

As in most areas, “one size fits all” does not work when starting to deal with data. Certain steps have to be taken before setting up a Data Office or employing a CDO. After all, you do not yet know whether a dedicated CDO is the best choice for your organisation, whether data should be dealt with by IT or by a business function, or which hierarchy level data should be managed at.

As soon as you have decided on an adequate setup and put it in place, you can leave the next steps to the CDO, unless you decide to assume that responsibility yourself. But to get there, you may wish to go through a series of steps, to create a clear basis for a Data Office, and to avoid misunderstandings and disappointments. As the role of a CDO is nowhere near as well defined as that of, say, a CFO, you may end up having everybody on the Board expecting something different, possibly including the new CDO. That is why I recommend a diligent process before setting up an organisation that manages data for the organisation.

At this stage, it is helpful to reach out for support: external persons who are entirely unbiased and have gone through this before. And you do not need to pay a fortune for consultants, usually without even knowing whether they can really help you, or whether they create future dependencies, turning a one-off engagement into a permanent (and expensive) support model.

If you are part of a network of leaders, you may check which other organisation has already gone through this exercise and whether they’d be willing to share their learnings. This should include failures, as you don’t want to repeat errors.

Alternatively, you may look inside and pick a good change manager to lead you through the assessment. That person may not even need to be a data expert. This is about organising the journey, not about knowing all answers. The key would be for such a person to be entirely independent of any existing function, including IT. It would therefore typically be a project run under the CEO or the Office of the CEO (depending on how your organisation is structured, of course).

You may also try to find a potential (external or internal) CDO first, maybe even on a temporary assignment, with the option for this person to become the official data lead if the Board likes the resulting concept.

In any case, my recommended approach consists of these four steps:

4.1.1 Stocktaking

Every organisation works with data. Somehow. This step is about understanding what is going on today, across all functions, from business to IT. Stocktaking is not limited to the “technical” aspects such as technology, processes, responsibilities. It also needs to look at the human factor, as the success of a data initiative strongly depends on the buy-in of everybody involved.

Find out what is happening already, and to which degree the different activities are interlinked. Data-immature organisations are usually not data-illiterate altogether. Instead, different players often address different aspects of data independently of each other. This may become a challenge later as a CDO faces resistance when trying to bring it all together. People will claim they are already addressing “their” piece of data in an adequate way, and that they know their particular business area better than any new CDO.

It is important not only to focus on what is happening but to find out why it is happening this way. Perhaps IT is doing all the data stuff as business folks are not interested? Are any business teams doing data on their own as they think IT doesn’t understand their business problems? Are there any shadow IT pockets within departments that don’t want to wait until IT has time for them, or is IT perhaps perceived as too bureaucratic?

You may find certain data aspects that are already well-addressed within your organisation. This is great! Don’t insist on renewing everything. The more things a CDO can re-use later, the easier the other aspects become.

Your Checklist may look like this:

4.1.1.1 Data

• Do you know what data we have?
• Do you know the quality of our data?
• Do you know what data we could have?
• Do you know the value?
• Do you know the opportunities?

4.1.1.2 Organisation

• Do we have the right people?
• Do we approach data the best way?
• (How do YOU approach data?)
• Are we ready for data?
• Do you feel confident today?

4.1.1.3 Strategy

• How about a data strategy?
• How about cross-functional collaboration?
• How about creating awareness amongst ALL team members?
• How about measuring data quality?

4.1.2 Opportunities

Opportunities generally fall into three categories:

• Deviations from how things are supposed to be done
• Potential to improve the prescribed way of doing things
• New activities and approaches

At this point, you wouldn’t want to come up with final solutions. Instead, you’d be after the potential, and after the order of magnitude to realise it. That is why the focus here is on observing, not on solving problems.

A very strong tool known from Six Sigma is called Gemba Walk, after the Japanese word for “scene of the crime” or “where it happens”. You go where activities take place, you watch, you try to understand what is happening. You ask questions, just to understand why things are done the way you observe them.

Surprisingly, such observers often discover great opportunities for improvement. They are unbiased and not burdened by the heritage that has led to the current practice over time. This approach helps determine opportunities in conjunction with data, without requiring a data expert.

Document what you find. Estimate how big it is (do you find details or fundamental flaws?) and how broad it is (do all departments show significant opportunities?). You will not find all opportunities but get a first idea of the potential.

4.1.3 Aspiration

Do you want your organisation to become a data-driven organisation? Should data become a key differentiator for your customers? Or do you “just” want data to help optimise key processes and decision-making in the background?

All of these are valid aspirations which will improve the situation of your organisation. But you should decide what to focus on, before designing a data organisation. If you have found a lot of expertise within the different teams, you may come to the conclusion that a CDO would rather coordinate. If there is a lot of business knowledge within IT, this may be the best function to run data for your organisation.

Once you have a good overview of the status quo, and once you have determined opportunities, you can take such high-level organisational decisions, leaving the detailed organisation to your future CDO.

None of these decisions is taken for eternity. You may find out later that there are other opportunities, or that certain ideas are not realistic. But those decisions should then be taken in a well-managed, conscious process.

4.1.4 Plan and Execution

Not only will a CDO need a target, a plan and a schedule, but so will you, in your plan towards getting a CDO on board. You’d need to decide what you want in preparation of a CDO versus what you’d like that CDO to address. The former is also the framework you will need to share with any CDO candidate to ensure you are on the same page.

Consider the installation of a CDO as a pre-project, where the target is not to solve any data problems but to establish a setup that is ready to do so. Just as with every other project, you will need to define deliverables, milestones and a schedule.

Think of mountaineering: You wouldn’t just get up in the morning, put on your shoes and depart for the mountains. You will first get a general understanding of distances and levels of difficulty, and then you will select a target. You will then plan the trip, using all information available, including the weather forecast. You may even prepare alternative plans, depending on your progress. Nevertheless, you will be ready to take decisions along the way, depending on everything that will happen without being part of your plan.

Same here: Know where you are heading, plan your route, be ready to adjust.

4.2 Stakeholder Management

Developing an organisation into a data-driven organisation means change. Not everybody embraces change. People are often afraid of change as they rate risk higher than the opportunities that come with it. This is not limited to the workforce: leaders may also be afraid of losing “something” (reputation, influence, team size, …).

4.2.1 Be an Active Sponsor

Data Management needs Top Executive sponsorship. Otherwise, there will be too many good reasons for other leaders not to follow. From my discussions with dozens of data leaders over the past years, I have learned that “Lack of Executive sponsorship” is their number one concern, across all industries.

Brent Dykes, Director of Data Strategy at Domo, observes: “Strong executive sponsorship is almost always needed to overcome the organizational inertia that may resist and derail a transformation effort to become more data-driven” (Dykes 2016).

4.2.2 Use Strong Change Management

If your organisation has a developed Change Management in place, this is the time to apply it. If there is no dedicated Change Management, this may be a great opportunity to establish it.

4.2.3 Get People to Play Ball

Exciting the people around you about your data vision is important but not sufficient. They also need to see the added value for themselves, their immediate targets, their team objectives and their own reputation. This is why so many data initiatives fail, although everyone agrees how great they are for the future of the organisation.

An African proverb says “Go alone if you want to go fast - go together if you want to go far”. Most people would agree, while at the same time often being forced to go fast in the short run. Particularly under a culture where people and teams are rewarded based on simplified criteria (such as “project completed on time”), it doesn’t make sense for them to look at the greater good, i.e. the shareholder value.

So, what to do about it?

4.2.3.1 Add Sustainability to the Equation

Even long-term business cases often don’t consider sustainability aspects such as a cross-functional view or long-term implications. While this applies to all of the magic five dimensions Value, Process, People, Technology and Data, the latter two are most critical here.

Typical attributes of sustainability are re-usability, flexibility and durability. This has been well-known for technology but widely ignored in the data space for years. An active Data perspective would spot weaknesses in efficiency and accuracy based on inconsistent, wrong or incomplete data, unknown data quality, error-prone data handling, duplicate work.

But how to consider all of these aspects? Companies often use a “sustainability index” to manage their environmental footprint. You may consider measuring the sustainability of your projects and solutions as well. The key parameters are “How complex would a future next step be?” and “What’s the cross-functional impact?”

To be able to measure sustainability, you’d need two things:

• A target to check against (What does “sustainable” mean to us, and by when do we want to achieve it?)
• Independent expert bodies, for both Technology and Data.

Data Architects can play a pivotal role here, and that is why they should be organisationally independent of any department that delivers solutions. This is one of the motivations behind having Data Architecture report into a Data Office and not into IT.
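How such a measurement might look can be sketched as follows. Everything specific here is an assumption made for illustration: the rating scale from 1 to 5, the equal weighting of the two key parameters, the reading of cross-functional impact as a burden on other functions, and the function names. An organisation would define its own scale and target.

```python
def sustainability_index(next_step_complexity: int, cross_functional_burden: int) -> float:
    """Return an index between 0.0 (least sustainable) and 1.0 (most sustainable).

    Both inputs are ratings by an independent expert body, from 1 (low) to 5 (high).
    The scale and the equal weighting are assumptions of this sketch.
    """
    for rating in (next_step_complexity, cross_functional_burden):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    # Each rating contributes 0 to 4 penalty points, so the maximum penalty is 8.
    penalty = (next_step_complexity - 1) + (cross_functional_burden - 1)
    return 1.0 - penalty / 8


def meets_target(index: float, target: float) -> bool:
    """Check a project's index against the agreed sustainability target."""
    return index >= target
```

A project rated 3 on both parameters would score 0.5 and miss a hypothetical target of 0.75. The result does not decide anything by itself; it makes the trade-off visible.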

4.2.3.2 Add Transparency

To be clear, the most sustainable solution is not always the solution of choice. Speed is often important to react to changes in the markets, or not to miss a unique opportunity. But how can you balance speed with sustainability if you don’t measure the latter? You need transparency.

In this sense, the mandate of expert bodies would not be to stop project approaches or to veto them. Instead, their assessment would add to the information basis for a management decision. This includes, but is not limited to, the financial impact of sustainability issues.

4.2.4 The Data Office and Business Functions

Most business functions have been around forever. The Data Office is new. That means that even the strongest CDO needs the CEO’s ongoing support. Experience shows that most CEOs face the following six behaviours earlier or later:

4.2.4.1 Struggle for Supremacy

As mentioned earlier, most data topics already have a historic “owner”. Those owners may be reluctant to give up their influence.

Variant 1 of the challenge: “Stay out! This is my data!”
Variant 2 of the challenge: “I don’t report into you!”

While a good CDO will be able to manage these reactions, visible backing by the CEO is indispensable.

4.2.4.2 Lack of Awareness

Many seasoned business leaders have no experience in dealing with data. They don’t know what they have been missing so far. A typical question in this context is “What’s the problem?”

4.2.4.3 Functional Silos

Functional leaders are often afraid of being slowed down by having to consider other functions’ needs, which is key for proper data management. Another silo-related behaviour is “We know best what is good for us!” While a good CDO will handle this, it is up to the Board to define “good” in this context. The best approach for one function may not be the best approach for the entire enterprise.

4.2.4.4 Lack of Ownership

Before a Data Office is introduced, most leaders in an organisation assume that data is a natural IT topic. The CDO will have to demonstrate that certain topics are preferably managed by a data organisation on the business side. However, the message that “Data is no longer exclusively with IT” needs to come from the top.

4.2.4.5 Opt-out Attitude

People like to be with the winner. Leaders in an organisation will not know how this new data stuff is going to develop. They will watch closely and then decide whether to follow or whether to continue to do their own thing. That is why it must be made clear very early that “opting out” is not a valid option. This is one of the easier means of giving a CDO an easy start.

4.2.4.6 Disengagement

There is a risk that the appointment of a CDO leads to the assumption that all other leaders can now lean back and leave the data topic to that new person. Obviously, the opposite is true! That is why an early message has to come from the CEO and the CDO that “Data is everybody’s business”. This will later need to be supported by proper governance of “data business ownership”.

4.2.5 The Data Office and IT

4.2.5.1 Starting Point

Traditionally, IT is no stranger to data. Particularly cross-functional aspects of data, such as data modelling, have been covered by IT departments for years. Admittedly, business folks were often glad to be able to “outsource” these activities to IT.

In response, IT departments set up “Architecture” teams that cover the entire path from the business idea up to the design of applications and infrastructure. Unfortunately, this includes the parts for which the business functions have a natural responsibility, i.e. the underlying meaning of data, and the business logic that data structures are supposed to reflect.

When introducing a Data Office that is not part of IT, you can expect discussions to come up around exactly this area. CIOs often use Architecture models to outline that the entire chain falls into their remit. Note that most of these models were originally developed by IT folks. Interestingly, at a closer look, these models usually describe all of these activities, without explicitly stating whether they are IT responsibilities or business responsibilities.

4.2.5.2 Approach

Organisations that have introduced a Data Management Office regularly report that IT and Data Management only started to get along with each other once the Board had announced a data strategy. That strategy should contain data-related targets and related responsibilities.

I’d like to introduce you to two complementary approaches to coming to a joint agreement with IT: Delineation and Joint Forces.

4.2.5.3 Delineation

You will need to clarify which type of decisions should be taken on the business side, as a criterion for delineation. A good approach can be to agree with IT that the future CDO will discuss the delineation with IT based on high-level criteria. These criteria can be agreed early; they should be based on who should have the last word.

The target can be a list of areas with the following classification:

(1) Owned by Business
(2) Co-owned by Business and IT, with Business in the lead
(3) Co-owned by Business and IT, with IT in the lead
(4) Owned by IT

Example of a business responsibility: Where Business Data Modelling and Business Process Modelling are used to prepare software development, they are part of the business requirements, not of the related IT activity.

As a typical example for an IT-driven Architecture approach, let’s have a brief look at the TOGAF model¹ which defines four disciplines of Architecture. You could use this model for a first high-level delineation proposal, as IT will most probably be familiar with the concept:

• Technical Architecture: IT
• Application Architecture: IT
• Data Architecture: Data Office
• Business Architecture: Business Functions (maybe coordinated by a cross-functional business process department)
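The four-level classification lends itself to a simple register that both sides can maintain. The following is a minimal sketch, assuming Python as the notation; the names `Ownership`, `last_word` and `DELINEATION` are hypothetical, and only the first register entry comes from the example above, the others are invented for illustration.

```python
from enum import Enum


class Ownership(Enum):
    """The four classes of the delineation list."""
    BUSINESS = 1                # (1) Owned by Business
    CO_OWNED_BUSINESS_LEAD = 2  # (2) Co-owned, with Business in the lead
    CO_OWNED_IT_LEAD = 3        # (3) Co-owned, with IT in the lead
    IT = 4                      # (4) Owned by IT


def last_word(ownership: Ownership) -> str:
    """Return the side that has the last word for a given class."""
    if ownership in (Ownership.BUSINESS, Ownership.CO_OWNED_BUSINESS_LEAD):
        return "Business"
    return "IT"


# A delineation register: area -> agreed classification.
DELINEATION = {
    "Business Data Modelling": Ownership.BUSINESS,            # example from the text
    "Data quality tooling": Ownership.CO_OWNED_BUSINESS_LEAD,  # hypothetical entry
    "Database operations": Ownership.IT,                       # hypothetical entry
}
```

Looking up `last_word(DELINEATION["Business Data Modelling"])` returns "Business", which is exactly the question the high-level criteria are meant to settle in advance.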

¹ TOGAF stands for The Open Group Architecture Framework. See (togaf.info 2019) for a good introduction.

4.2.5.4 Joint Forces

You can work towards a close partnership between Data Management and IT, jointly supporting business in data matters.

In organisations where functional Analytics groups are traditionally strong, Data Management and IT can jointly create a great foundation for those groups to work efficiently. Instead of taking away their power, responsibilities can be shared.

Strong functional Analytics teams have often built their own “Shadow IT”. (As part of the initial stocktaking, your future CDO will need to search for the “Teradata computer under an Analyst’s desk”.) Functional Analytics teams will only be willing to give up their IT expertise if the official IT department offers at least a comparable service level, particularly in terms of speed. On the other side, they will be happy to have someone else take care of it so that they can concentrate on their core objectives, i.e. gaining insight from Information. If a CDO can promise help in this area, functional buy-in is almost guaranteed.

While IT is providing the technology, Data Management takes care of data logic, data standards, data definitions and the single source of truth. This allows those Analytics groups to concentrate on their core tasks, instead of having to find the right data themselves, discovering the definitions, and so on.

The advantage of this setup is that a lot of work that is often done in duplication by the different business Analytics teams is now done once for everybody. This creates synergies, and the base data will be consistent across all functions.

4.2.6 The Chief Digital Officer

Just like the Chief Data Officer, the Chief Digital Officer represents a fairly new concept. It is not well-defined either, so that scope and mandate of Chief Digital Officers vary strongly between organisations. In most cases, the Chief Digital Officer is looking at “digital channels” for customer interaction, notably the Internet presence.

If the two roles are not unambiguously defined and delineated, there’s potential for conflict. This should be addressed very early.

A good approach may be to consider the Chief Digital Officer a “customer” of the Chief Data Officer. The latter would make life easier for the former, in collaboration with IT.

4.2.7 The Employees

What is a world-class coach without a team that understands the playbook? If the ideas remain in the head of the coach, the team will not improve.

No employee is too unimportant to become the target of data-related training, both technical and motivational training. Not only would you use all methods the organisation has to reach the entire workforce. You would also use existing feedback channels such as employee satisfaction surveys to determine whether the teams feel comfortable and whether they are committed to supporting the new way of managing data.

We discussed the topic “organisation culture” earlier. It is a key channel to reach all employees. As soon as the usage of data becomes normal, and as soon as “everybody does it”, individual employees find it easier to engage.

But a data-driven culture needs to be seen as something positive. Imagine an organisation that starts an initiative to develop a “culture based on honesty” or a “culture based on respect”. Both honesty and respect are almost objectively positive so that it is easy to get the support of the workforce (provided the organisation leaders have credibility in this area).

And now compare such an initiative with the introduction of a “data-driven culture”. “Data” as such is not objectively positive. To some, it may definitely sound negative! Many employees will at least have ambivalent feelings: curiosity or excitement goes hand in hand with anxiety or uncertainty. And, yes, to some people “data” is simply a boring topic, as entertaining as reading a telephone book from the nineties… At best, they believe data is important, but they’d expect “the experts” to deal with it.

That means additional activities are required to address the employees’ valid concerns. You may find out about privacy concerns, about misuse of data, and about other issues frequently associated with data in public.

This is where a “data-driven culture” needs to be specified more concretely: It is the culture of treating data adequately! Your message can be: As part of our new culture, we will consciously manage data from an Information Security and Data Privacy perspective. We are committed to setting up alert mechanisms to discover data privacy issues early. We aim at a data-literate workforce, as proper data handling cannot be limited to the top executives.

It is extremely critical to take data-related concerns seriously. If you hear people associate “Big Data” with “Big Threat”, don’t focus on convincing them of the opposite. Instead, convey the message that a data-driven culture is required to properly address such threats.

All these challenges make it obvious again: To ensure broad acceptance throughout the workforce, responsibility for the creation and implementation of a data-driven culture should not be in the hands of IT, nor in the hands of any single business function. Instead, it is best run by an independent Chief Data Officer reporting into the CEO.

References

Dykes B (2016) forbes.com. 24 August 2016. https://www.forbes.com/sites/brentdykes/2016/08/24/data-driven-success-rests-on-the-shoulders-of-a-strong-executive-sponsor/#31c68cb52233. Accessed 04 July 2019
Faria M, Logan D (2018) Chief Data Officers Should Align Authority and Responsibility to Succeed. Gartner Inc., 28 August 2018

Glossary

© Springer-Verlag GmbH Germany, part of Springer Nature 2019
M. Treder, Becoming a data-driven Organisation, https://doi.org/10.1007/978-3-662-60304-8

This section provides a list of frequently used terms in the world of data, together with short definitions. You may find slightly deviating definitions in literature as the area of data has not yet come to a global consensus on terms and definitions.

The expressions “Data Value Chain”, “Data Supply Chain” and “Return on Data” have not been taken from literature. They were introduced through this book.

1. Data
Data is the translation of reality into a coded form that can be stored and transmitted. It is still considered data if processed or changed. Data does not exist naturally: it is always consciously created or gathered. Data requires context to be understood. The string “WBABB13270A999999” is data, but it could mean anything. The context that this is supposed to be a Vehicle Identification Number (VIN) allows you to translate data into meaningful information.

2. Information
Data plus context equals information. “23” is data. The context that this data reflects a temperature measured in a certain location at a certain time in Celsius turns this data into information. To be useful, databases need to contain information rather than data.
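The temperature example can be written down as a small sketch: the bare value “23” only becomes information once it is stored together with its context. The class name `Context`, its fields, and the sample location and time used in the note below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Context:
    """The context that turns a bare value into information (hypothetical fields)."""
    quantity: str          # what was measured
    unit: str              # unit of measurement
    location: str          # where it was measured
    measured_at: datetime  # when it was measured


def to_information(data: str, context: Context) -> str:
    """Combine raw data with its context into a readable statement."""
    value = float(data)  # without context, "23" is just a string
    timestamp = context.measured_at.strftime("%Y-%m-%d %H:%M")
    return (f"{context.quantity} of {value:g} {context.unit} "
            f"in {context.location} at {timestamp}")
```

With a context of temperature, °C, an invented location “Frankfurt” and a measurement time of 15 June 2019, 12:00, the call yields “temperature of 23 °C in Frankfurt at 2019-06-15 12:00”. A database table that stores only the column of numbers would have lost exactly this context.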

3. Insight
Knowing the temperature of all places on earth at all times is a lot of information. However, the resulting long data tables with geo-coordinates, calendar dates and temperature values do not necessarily help an organisation do its job better. This is where the different data disciplines come into play. They translate performance measurement into indicators, they discover patterns and trends, and they visualise their findings for easier understanding. This is what we call insight.

4. Value
Data insight is not supposed to replace years of business experience, and not everything that takes place in the human brain can be adequately mimicked by algorithms. But if decision-makers use all the insight gained through data to intelligently complement their gut feeling and experience, they can significantly increase the value of the organisation. Action, based on insight, creates value.

5. Metadata
Metadata is data that describes other data, e.g. its format, allowed values or simply the description of how to interpret it. In the VIN example above, metadata would tell you (among other things) that the first three characters must be alphanumeric, and that the resulting code unambiguously identifies a car’s manufacturer (WMI, World Manufacturer Identifier).

6. Masterdata
The example VIN above starts with “WBA”. This WMI code stands for the car manufacturer BMW (Bayerische Motoren Werke AG). The full list of all car manufacturers and their WMI codes is an example of Masterdata. Masterdata is data that does not change frequently. It is used for lookup purposes, to understand other data.

7. Transactional data
An organisation’s daily business consists of transactions. Whether you sell a product or send an invoice, whether you receive a complaint, undergo a tax audit or run a marketing campaign: all of these are transactions. Data that describes such transactions is called transactional data.

8. Data Value Chain
Raw data, no matter where it comes from, has relatively low direct value, similar to other raw materials. Data, however, has potential value, considering what could be done with it. This potential value would also determine the price at which data is sold and, possibly in future, its valuation as an asset on the balance sheet. The potential value of data can be increased by adding context, joining data from different sources, structuring data or cleansing data. The result is information.
Sometimes such information can immediately create value, e.g. by supporting or facilitating operational processes, or by enabling Robotic Process Automation. But increasingly often there is another step before truly capitalising on data: modern algorithms use information to help us understand the past and the present, and to forecast the future. This is what we call gaining insight. As soon as an organisation uses the insight it has gained through data to take well-informed business decisions, it turns potential value into real value. This is the ultimate value-add of data.

9. Data Supply Chain
Data does not appear out of thin air. And just bringing it in is not sufficient either. Many disciplines have to play their roles during the lifecycle of data so that data becomes the foundation of a data-driven organisation. These disciplines validate data, complete data, structure data, fix data, compare data, evaluate data, and so on.


If all parties are doing well in this process, and if they work closely together, an organisation’s Data Supply Chain can be considered well-managed.

10. Chief Data Officer (CDO)
If the management of data is left to individual parties, each of these parties will tend to optimise the handling of data for its own purposes. This approach conflicts with the goal of optimising data handling from the organisation’s overall perspective. It would, for instance, be advisable to increase department A’s project costs by an amount X if this resulted in benefits of three times X for all other departments together. Department A alone may not have been motivated to take such a holistic view. This is one of the main purposes of a Chief Data Officer with a cross-functional mandate and reporting line. Such a role can not only enforce the necessary “shareholder view”; it can also lead the process of determining what is best for the organisation. In the example above, a CDO can even increase acceptance of the move by communicating that the additional funds for department A are not due to a failure to cope with the existing budget. Moves like this should be part of the overall data strategy, which is owned by the CDO as well.

11. Data Office
A CDO may not need a huge department. In particular, it is not necessary for all employees with data roles to report to the CDO. However, a CDO should not be a one-person show either. A Data Office would take over all cross-functional activities, such as the entire Data Governance work. Such activities should never be part of individual functional departments, as they need to be executed from the perspective of the overall organisation.

12. Data Management
The sum of all activities dealing with data is called Data Management. These activities are not limited to a Chief Data Officer, to a Data Science team or even to the entire Data Office. In a data-driven organisation, all employees are incentivised to manage their own data in the best interest of the entire organisation.

13. Data Governance
Even in the most agile environment, handling of data has to follow certain rules. That is why a strong Data Governance organisation needs to develop data policies, principles, standards and a glossary, as well as change processes for both data content and data structures.

14. Data Quality
The quality of data is as important as the orientation of signposts in the desert. Only good quality data will show you the right direction, just as only correctly oriented signposts will lead you to your target. You cannot tell from looking at data whether the quality is good, just as looking at a signpost doesn’t tell you whether it is properly aligned, or whether it leads you deep into the desert.

15. Data Ownership
Data needs to reflect an organisation’s business reality and business model. This is not a technical task but business-related content work. That is why each subset of an organisation’s data needs to be owned by a business person.
Ownership is not to be confused with people being appointed to represent their respective departments in data matters. Instead, data owners require a cross-functional role (and mandate). Take the Data Owner of a data subset such as “customer” or “revenue”: this role needs to consolidate the business needs of all business functions (which are as diverse as Sales and Finance) in that specific area, balance them and ensure an organisation-wide view is achieved.


16. Data Network
As soon as an organisation has more than 100 employees, chances are that people dealing with data are better off being spread across the organisation, rather than being placed together in one single data silo. It is still desirable to have them collaborate on a daily basis. This is possible without a shared reporting line, by establishing a Data Network. Managing such a Data Network should be a defined task within the Data Office. A Data Network manager would provide a shared repository of ideas and solutions, a collaboration platform and regular communication tailored to each target group.
Another important aspect is the organisation of joint activities such as Hackathons (where different, ideally cross-functionally mixed teams compete in solving business problems using data tools), joint visits of production sites or even the occasional “data barbecue”.

17. Data Councils
How do you ensure different functions talk to each other and jointly come to decisions that are optimal from a shareholder perspective? An almost indispensable governing mechanism in the area of data is the establishment of Data Councils. These are bodies at different hierarchy levels where peers discuss data topics across functional borders, ideally moderated by a member of the Data Office. A nice side effect of Data Councils is that parties that don’t play ball become visible right away. The most important and powerful Data Council is what I usually refer to as the “Data Executive Council”, consisting of level 2 executives and chaired by the CDO. This council develops the data strategy and sets priorities. You don’t need a Data Council at Board level: the Board itself should be considered the final approval and escalation body in data matters.


18. Return on Data
This expression is derived from the financial term “Return on Investment” (RoI), which is critical for any move that costs money. The idea behind “Return on Data” is that dealing with data is not an end in itself. All data activities must be judged by the effect they have on the organisation’s financial situation. This includes indirect and long-term effects, as well as the avoidance of potential issues (which means working with risks).

19. Data-driven organisation
When can an organisation be considered “data-driven”? As soon as data is broadly accepted as a relevant contributor to decision-making, at all levels of the organisation.

20. Data as an asset
Many guidelines for the handling of data do not need to be developed from scratch. You can adopt similar guidelines that exist for tangible assets. Relevant aspects include that data has a financial value and that it needs to be maintained. Data can even be acquired (or captured) at a certain cost, subsequently refined and sold at a higher price.

21. Data as an enabler
Business cases should be a standard tool for each organisation, to find out where to put your money or resources. This applies to investments in data as well. However, you need to consider an important additional aspect: data projects often do not provide direct benefits, even if they are successful. They serve as enablers for future projects, which would be more expensive or even impossible without the enabler being put in place. That is why it makes sense to establish dedicated rules for the evaluation of enabler projects.
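The ideas behind entries 18 and 21 can be illustrated with a little arithmetic. The sketch below assumes that “Return on Data” is computed analogously to the usual RoI ratio, (benefit - cost) / cost. The book does not prescribe a formula, and the function names and all figures are hypothetical. An enabler project has no direct benefit of its own, so here it is judged by the cost it saves the projects that build on it.

```python
def return_on_data(benefit: float, cost: float) -> float:
    """RoI-style ratio: net benefit divided by cost (assumed formula)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


def enabler_benefit(costs_without: list[float], costs_with: list[float]) -> float:
    """Indirect benefit of an enabler: savings across the projects it enables."""
    return sum(costs_without) - sum(costs_with)


# Hypothetical figures: an enabler project costing 100 has no direct benefit ...
enabler_cost = 100.0
direct = return_on_data(benefit=0.0, cost=enabler_cost)

# ... but it makes three follow-on projects cheaper.
savings = enabler_benefit([200.0, 150.0, 250.0], [120.0, 90.0, 140.0])
overall = return_on_data(benefit=savings, cost=enabler_cost)

print(direct, savings, overall)
```

Judged on direct benefits alone, the enabler looks like a pure loss; once the savings in follow-on projects are counted, the same project shows a positive return. This is one possible reading of why enabler projects may need dedicated evaluation rules.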

