Digital Enlightenment Yearbook 2013 : The Value of Personal Data [1 ed.] 9781614992950, 9781614992943

The value of personal data has traditionally been understood in ethical terms as a safeguard for personality rights such …


English · 320 pages · 2013



DIGITAL ENLIGHTENMENT YEARBOOK 2013


Digital Enlightenment Yearbook 2013
The Value of Personal Data

Edited by

Mireille Hildebrandt
Radboud University Nijmegen, Vrije Universiteit Brussel, Erasmus University Rotterdam

Kieron O'Hara
University of Southampton

and

Michael Waidner
Technische Universität Darmstadt

Amsterdam • Berlin • Tokyo • Washington, DC


© 2013 The authors. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-61499-294-3 (print)
ISBN 978-1-61499-295-0 (online)
Library of Congress Control Number: 2013946636

Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: [email protected]

Distributor in the USA and Canada
IOS Press, Inc.
4502 Rachael Manor Drive
Fairfax, VA 22032
USA
fax: +1 703 323 3668
e-mail: [email protected]

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.

Foreword


Robert MADELIN
Director General of DG Connect of the European Commission

I welcome the effort of the Digital Enlightenment Forum to harness the humanism, rationality and optimism of the Age of Enlightenment to better shape and to enhance the benefits to society of our evolving Digital Age. I write this foreword in the midst of a warm global debate on the value of Data – a theme for the DEF and for its Davos cousin the WEF – as well as on the balance between Westphalian security needs and personal privacy aspirations.

This yearbook's focus on the 'Value of Personal Data' is apt, because the explosion of data and of computing power over the last few years represents a further step-change in the development of the Internet. The statistics of the data explosion bear repetition: by some estimates, 90% of the world's current data has been produced in the last two years; in turn, this is more than was generated in the previous 2,000 years. Meanwhile, the world generates 1.7 billion bytes every minute. Big data (high volume, high velocity, high variability) is here to stay.

Beyond its purely commercial uses, the societal benefits of Big Data will be more slowly understood and developed. For the public sector, better data allows services that are more efficient, transparent and personalised. In the health sector, open results and open data permit whole new fields of research. For example, scientists at Columbia and Stanford universities analysed millions of online searches to learn about the symptoms and conditions associated with certain drugs. This led to the unexpected medical discovery that the combination of two drugs – paroxetine, an antidepressant, and pravastatin, a cholesterol-lowering drug – causes high blood sugar. Meanwhile, the Global Viral Forecasting Initiative (GVFI) uses advanced data analysis on information mined from the Internet to identify comprehensively the locations, sources and drivers of local outbreaks before they become global epidemics. Such techniques offer guidance up to a week ahead of previous indicators.

Controversy can arise. TomTom, a Dutch manufacturer of satellite navigation devices, ran into problems with the anonymised data that it collected from its users about individual driving behaviour, which it provided to the Dutch government to help improve the national road system. However, it transpired that the data was being used in part to identify the most appropriate sites for speed cameras. Users complained that they were unaware of this application of their data, and were concerned that the police would be able to identify individual speeding violations from it. TomTom assured consumers that the data had indeed been fully anonymised and that the company would prevent such use in the future. The lesson for all Big Data companies is to focus on the perception of data use as much as on actual use.

As that example goes to show, user trust is key to Big Data success. Neelie Kroes, Commission Vice-President for the Digital Agenda, has put this succinctly in many public statements on ensuring such confidence: "Privacy is not just about technical features. Without privacy, consumers will not trust the online world. And without trust, the digital economy cannot reach its full potential". She goes on to identify her three requirements for privacy in the digital age: transparency, so that citizens know exactly what the deal is; fairness, so that citizens are not forced into sharing their data; and user control, so that citizens can decide – in a simple and effective manner – what they allow others to know. These concepts underpin much of the material presented in this Yearbook.

In an emergent field such as Big Data, the Forum's work can inform EU-wide innovation, the EU research agenda and our vision of our common future. On the innovation front, I believe that the right standards can accelerate demand growth. Without interoperability and harmonised formats, large datasets can be too difficult to fit together and use in practice. In this respect the Commission has engaged with stakeholders in the European public sector information ecosystem to forge the lightweight agreements and standards that are needed to enable interoperability and integration of Public Sector Information (PSI). It is also promoting standardisation of data formats on our EC Open Data portal, and one of the goals of its Pan-European Open Data Portal is to drive the harmonisation of data formats and licensing conditions in Europe.

On research and innovation, the Commission has provided on average €76 million p.a. for data and language technologies. In Horizon 2020, it intends to continue to fund innovation in the area of data products and services, and has also set up the Research Data Alliance to help scientific data infrastructure become fully interoperable. Open Data standards are also set to continue as part of the Horizon 2020 activities from 2014.

As to the longer-term vision, this is where "Digital Futures" comes in: a foresight project launched by DG Connect to prepare for the world beyond 2030. The project taps into the collective wisdom and aspirations of stakeholders to co-create long-term visions (on a time horizon of 2040–50) and fresh ideas for policies that can inspire the future strategic choices of DG Connect and the Commission. It draws inspiration from long-term advances at the intersection between ICT and society, economy, environment, politics, humanities and other key enabling technologies and sciences. This is why we are co-hosting with the Forum a Digital Futures Workshop on the "Future of Personal Data and Citizenship".

In terms of the privacy implications of Big Data, these are just some of the issues that could be addressed by Digital Futures:

• Can we achieve better metadata management, so that potential re-users understand better what uses of the data are covered by consent and/or the statutory law grounds on the basis of which data were collected?
• Can we build features into data-management systems that allow a level of anonymisation of personal data compatible with legal requirements?
• How can we develop privacy-enhancing technologies facilitating the process of giving consent to new uses of personal data?
• Can we establish "data banks" – dedicated digital spaces for the management of the personal information of each data subject?
• What kind of role should self-regulation and co-regulation have in ensuring compliance with privacy rules?

Regulators do not have all the answers, but we can at least ask the right questions.


Contents

Foreword (Robert Madelin)

Introduction (Mireille Hildebrandt, Kieron O'Hara and Michael Waidner)

Part I. Background

Chapter 1. Die Aufklärung in the Age of Philosophical Engineering (Bernard Stiegler)
Chapter 2. Personal Data: Changing Selves, Changing Privacies (Charles Ess and Hallvard Fossheim)

Part II. The Need for Privacy

Chapter 3. Not So Liminal Now: The Importance of Designing Privacy Features Across a Spectrum of Use (Lizzie Coles-Kemp and Joseph Reddington)
Chapter 4. Privacy Value Network Analysis (Adam Joinson and David Houghton)
Chapter 5. Personal Data Ecosystem (PDE) – A Privacy by Design Approach to an Individual's Pursuit of Radical Control (Ann Cavoukian)
Chapter 6. Personal Information Markets and Privacy: A New Model to Solve the Controversy (Alexander Novotny and Sarah Spiekermann)

Part III. Architectures for PDMs and PDEs

Chapter 7. Online Privacy – Towards Informational Self-Determination on the Internet (Simone Fischer-Hübner, Chris Hoofnagle, Ioannis Krontiris, Kai Rannenberg, Michael Waidner and Caspar Bowden)
Chapter 8. Personal Information Dashboard: Putting the Individual Back in Control (Johannes Buchmann, Maxi Nebel, Alexander Rossnagel, Fatemeh Shirazi, Hervais Simo and Michael Waidner)
Chapter 9. Towards Effective, Consent Based Control of Personal Data (Edgar A. Whitley)

Part IV. Other Sources of Data

Chapter 10. Open Data Protection: Challenges, Perspectives, and Tools for the Reuse of PSI (Ugo Pagallo and Eleonora Bassi)
Chapter 11. Open Data: A New Battle in an Old War Between Access and Privacy? (Katleen Janssen and Sara Hugelier)
Chapter 12. Midata: Towards a Personal Information Revolution (Nigel Shadbolt)

Part V. Personal Data Management: Examples and Overview

Chapter 13. A User-Centred Approach to the Data Dilemma: Context, Architecture, and Policy (M.-H. Carolyn Nguyen, Peter Haynes, Sean Maguire and Jeffrey Friedberg)
Chapter 14. Life Management Platforms: Control and Privacy for Personal Data (Martin Kuppinger and Dave Kearns)
Chapter 15. Digital Enlightenment, Mydex, and Restoring Control over Personal Data to the Individual (William Heath, David Alexander and Phil Booth)
Chapter 16. Personal Data Management – A Structured Discussion (Jacques Bus and M.-H. Carolyn Nguyen)

Afterword (Kim Cameron)

Biographies

Subject Index

Author Index


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-1

Introduction

Mireille HILDEBRANDT,1 (Radboud University Nijmegen, Vrije Universiteit Brussel, Erasmus University Rotterdam), Kieron O'HARA (University of Southampton) and Michael WAIDNER (Technische Universität Darmstadt)


Keywords. Personal data management, privacy, privacy by design, personal data ecosystems, Enlightenment

[Cartoon: "If you have nothing to hide"2]

1. Was Ist Aufklärung in the Age of Personal Data Monetisation?

1.1. Enlightenment

The 18th century Enlightenment thinkers inspired new relationships with knowledge, power, authority and human values. Critical thought, the balancing of countervailing powers, the rejection of un-scrutinised or unaccountable authority, and a strong emphasis on human autonomy and individual flourishing are often heralded as the unassailable heritage of western civilization. The narratives of empiricism (Bacon), scepticism (Hume), rationalism (Descartes, Voltaire, Kant) and epistemological inquiry (Kant again), as well as theological, political and ethical radicalism (Spinoza), have shaped the course of western traditions and even evoked forms of Enlightenment fundamentalism. Without denying the many contradictions and inconsistencies between different strands of Enlightenment thought, the Digital Enlightenment Forum aims to build on the historical artefacts of critical thinking (daring to contest mainstream knowledge claims), countervailing powers (speaking truth to power), transparent government (as a precondition for accountability) and the protection of civil liberties (as a precondition for individual autonomy and political participation).

These values cannot be taken for granted, and are confronted with novel challenges in the era of Big Data and information-driven government, commerce and science.3 The proliferation of incredibly massive searchable datasets, the increasing use of predictive analytics in almost every domain of public and private life, and the extent to which critical infrastructure has come to depend on information and communication technologies (ICTs) warrant a new Enlightenment discourse to sustain the values we want to preserve and/or need to reinvent. The Age of Reason seems to slip into the Age of the Algorithm, the Age of Correlation or the Age of Data-Driven Nudging. As Bateson, one of the founding fathers of cybernetics, would say: what is the difference that makes a difference here [1, p. 315]? How to reclaim the patience, the prudence, the practical wisdom and the reflective equilibrium that nourished the Rule of Law, in the era of hyper-connectivity and real-time autonomic decision systems? To put it more bluntly, in terms of the subject matter of this volume: Was ist Aufklärung in the Age of Personal Data Monetisation?

1 Corresponding Author.
2 Reproduced with permission from Rob Cottingham. See http://www.robcottingham.ca/cartoon/archive/if-you-have-nothing-to-hide/.
3 Big data is of course not the only novel and challenging factor. We could also cite, for instance, global connectivity, the Internet of Things, online social networks, and cloud computing.


1.2. The Dialectic of Digital Enlightenment

Enlightenment remains an important reference point for the enquiries reported in this volume. The Enlightenment era, firmly rooted in the information infrastructure of the printed word, nourished the practices of systemisation, indexing and codification, as Bernard Stiegler argues in this volume. Our political, business and legal systems are rooted in Enlightenment concepts such as enforceable rights, individualism and the power of reason. In this world, privacy plays several important roles, protecting the autonomy of individuals and governing their relationships with institutions, communities and society as a whole. The advent of digital computing systems, networking, data mining and machine learning is in tension with those concepts, while consistent with the idea of rationalistic mastery over our environment, which is often associated with Enlightenment thought (notably that of Francis Bacon). This tension is an example of how specific Enlightenment ideals initiate a dialectic where their very success threatens their foundations, as Adorno and Horkheimer famously argued [2] (and see also [3]).

The paradigmatic anti-Enlightenment philosopher Nietzsche framed the dialectic of Enlightenment in a characteristically pithy way, which still reflects our own ambivalence. Enlightenment, for Nietzsche, releases the individual from domination: "the priests all become priests with a bad conscience – and the same must be done with regard to the State. That is the task of the Enlightenment: to make princes and statesmen unmistakeably aware that everything they do is sheer falsehood". Equally, he saw, in Enlightenment's promise of non-domination, a tool for manipulation: "The way in which the masses are fooled in this respect, for instance in all democracies, is very useful: the reduction and malleability of men are worked for as 'progress'!"4 This dialectic shows itself today by the very control over our data that the Digital Enlightenment affords,5 creating a resource of great value for governments and corporations who thus impose their own ideas of our well-being upon us, potentially manipulating and even removing our autonomy.

Dynamic interconnected networks of individuals, private and public organisations, identities, credentials, personal and other data often demand massive and real-time processing. Yet our social norms also evolve to accommodate these technological developments. The mindset of a digital immigrant seems at odds with that of a digital native [4]. To some, privacy may appear more costly, as it can be traded off for the benefits that follow from being highly visible to the network, while the volume of data available for processing makes it impossible for the individual to police effectively – even though, as Robert Madelin's Foreword to this volume makes abundantly clear, user trust "is key to Big Data success". There are substantial implications for trust and mutual expectations, as the extent to which individuals are capable of overseeing and foreseeing how their data are processed, shared and put to use, by whom, where, and to what purpose is now unclear. Despite that, assumptions from the pre-digital era still govern current policy. It is becoming increasingly hard to track what knowledge is mined from the proliferation of networked data, how such knowledge will map onto individuals' identities, and what consequences will follow from these matches.

The chapters in this yearbook have been invited not only from scholars across various disciplinary backgrounds, notably computer science, psychology, law and philosophy, but also from a number of authors involved in specific personal data management initiatives, and they investigate how these technologies will affect individuals with regard to privacy, informational self-determination, contextual integrity, and the notions of personal identity and the networked self. What values do the different stakeholders associate with and derive from personal data and individual privacy? What are the options for individuals and society to control the use of personal data in a digital world full of user-generated content, multinational service providers, smart and interconnected devices, and sophisticated Big Data algorithms? How can individuals and civil society organisations use these new technologies for their own benefit and for their own perception of the public benefit, for example, via the exploitation of open data – and, when it comes to open data, can they really exploit without being exploited? To what extent can increasing transparency support trust and privacy? What technical and social infrastructures are needed for supporting control and transparency? Can they be put in place without destroying the social (and commercial) value that Robert Madelin highlights? And what if they can't? To what extent must a Digital Enlightenment live with the monetisation of our personal data?

4 These are quotes from his Nachlass, his unpublished notebooks.
5 Though we do not necessarily agree that the proliferation of digital infrastructures can be qualified as a Digital Enlightenment.

2. The Value of Personal Data

Though the title of this book speaks of The Value of Personal Data, we can no longer take for granted that the concept of value refers to something mental, ethical and invaluable. This volume aims to confront the notion of 'values' in the sense of guiding principles for individual persons and their societies with that of 'value' in the sense of monetary value. In the chapters that follow, the incalculable worth of the value of a person and her data is confronted with the quantifiable worth of manipulable,6 machine-readable data that relate to an identifiable individual person. It thereby builds on a tension that is inherent in the Enlightenment Age, namely between reason and rationality, between what must be argued and what can be calculated. It may be that, to preserve the invaluable value of our personhood, we need to engage with the calculable value of the personal data on which so many business cases in public administration and industry now depend.

Monetisation of personal data is already a fact. To reclaim some degree of autonomy as an individual person, society may have to enable a person to anticipate how her data can be monetised and what could be the consequences of 'leaking' her data:7 loss of a job, lucrative discounts, personalised surveillance, exclusion from social security benefits, rejection of credit, or being registered on whatever blacklist. The point is not so much to provide people with a share of the profits made by the monetisation of personal data; foremost, the idea is to create effective transparency and control over what data are captured, created, mined and aggregated, in order to come to grips with the usage, abuse and exchange of our data.

6 On the meaning of the term manipulation, see Stiegler in this volume.
7 'Leaking' refers to what the World Economic Forum calls 'observed' data; these are mostly behavioural data that register clickstream, mobility, purchasing and other data without our conscious or deliberate intention to share such data.


2.1. Volunteered, Observed and Inferred Data

The World Economic Forum (WEF) has initiated a series of discussions under the heading of Rethinking Personal Data, notably reporting on personal data as a new economic 'asset class'. In its 2011 report on Personal Data: The Emergence of a New Asset Class [5], the WEF discriminated between three types of data: volunteered, observed and inferred data (see Fig. 1). Volunteered data are defined as data that individuals "explicitly share about themselves" through electronic media, for example, when someone creates a social network profile or enters credit card information for online purchases. Observed data are data "captured by recording activities of users" (in contrast to data they volunteer). Examples include Internet browsing preferences, location data when using cell phones, or telephone usage behaviour. Inferred data are "data from individuals, based on the analysis of personal data [such as] credit scores… calculated based on a number of factors relevant to an individual's financial history".

Though the categories may overlap, this mapping of personal data and data that may affect a person is a timely proposal to bring some order to the debate over the sharing of personal and non-personal data. Obviously, volunteered data are indeed created by a person in the sense of her deciding what information to hand over, for whatever reason, to whichever other person, organisation or software programme. Observed data, however, are in fact created by the software that tracks and traces our online and offline behaviours: clickstream, public transport, electronic payment, mobility, jogging or any type of machine-readable behaviour. These data refer to a specific person, and are therefore personal data under EU data protection legislation, but they are 'made' by the software machines of companies or public administration departments. Inferred data are 'made' by data mining technologies, looking for patterns in the volunteered and/or observed data that are aggregated in a database. These inferred data need not refer to an identifiable person, as they may be statistical correlations or other patterns, mined from the anonymised data of millions of people. However, the fact that such patterns are not personal data does not mean they will not impact individual persons. Once a subset of data points match such patterns, a person may be identified as one who is willing to pay a higher price, to take more risk, or one who is prone to develop a specific disease or inclined to aggressive behaviour. Though this is all about statistics, the consequences of applying such probabilities to individual users are real. Protection is warranted, even when inferred data are not personal data [6].

[Figure 1. The Personal Data Ecosystem: A Complex Web From Data Creation to Data Consumption.8]

8 This figure was taken from [5, p. 15], © World Economic Forum (WEF), Bain & Company, and Marc Davis, who developed the concepts of volunteered, observed and inferred data in collaboration with the WEF.
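As a concrete reading of this taxonomy, the following minimal sketch (our own illustration in Python; the class names and example items are assumptions, not definitions from the WEF report) encodes the three origins, and the point made above that an inferred pattern may fall outside the legal category of personal data while still affecting individual persons:

```python
from dataclasses import dataclass
from enum import Enum

class Origin(Enum):
    VOLUNTEERED = "explicitly shared by the individual"
    OBSERVED = "captured by recording the individual's activities"
    INFERRED = "derived by analysing volunteered/observed data"

@dataclass
class DataItem:
    description: str
    origin: Origin
    identifies_person: bool  # does the item relate to an identifiable person?

    @property
    def is_personal_data(self) -> bool:
        # Under EU data protection law what matters is whether the data
        # relate to an identifiable person -- not who 'made' the data.
        return self.identifies_person

items = [
    DataItem("social network profile", Origin.VOLUNTEERED, True),
    DataItem("clickstream log from a browsing session", Origin.OBSERVED, True),
    # Statistical pattern: not personal data, yet it may still be applied
    # to individuals who match it (pricing, risk, health predictions).
    DataItem("pattern: 'profile X tends to accept higher prices'",
             Origin.INFERRED, False),
]

for item in items:
    print(f"{item.description}: {item.origin.name}, "
          f"personal data = {item.is_personal_data}")
```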


2.2. Working Definitions

In this volume, the authors have examined the extent to which privacy and data protection require new technological, legal and organisational architectures in the era of proliferating personal data ecosystems. Since the field we are exploring is an emergent domain of knowledge, we will start with a set of explorative definitions to help the reader navigate a pioneer's landscape.

The value of personal data. As already mentioned above, the value of personal data can be understood in two ways: as an invaluable asset that is intrinsically linked to the individual person, and as a quantifiable variable that can be traded against various types of services or even money.

Identity. In the context of digital security and data protection, the term identity often refers either to the complete set of attributes that define an individual person, or to whatever data can uniquely identify a natural person. This can be said to stand for identity in the technical sense, and it relates to identification and authentication. In the context of privacy and human values, identity refers to the sense of self that individuals develop in the course of their lives, enabling them to develop an 'own' personality and to act as a moral and legal agent.

Privacy and data protection. The human right of privacy is usually seen as a liberty or negative freedom, i.e. imposing an obligation on others not to interfere with the home, private and family life, and communications of a person. As a liberty, it is not easy to define, as definitions restrict its scope. As a human right, privacy protects against infringements by the state, though horizontal effect may oblige states to protect their citizens against privacy infringements by other citizens or private companies. Data protection is a set of legal norms that more strictly defines a set of rights of data subjects (those to whom personal data relate) and obligations of data controllers (those who determine the purpose of data processing). Privacy rights prohibit infringement of privacy; data protection conditions the free flow of information. Privacy protects the opacity of the individual; data protection warrants transparency of personal data processing. Though some would claim that consent is the hallmark of data protection, others will argue that informed consent is becoming illusory, and that purpose limitation is the main protection offered by data protection legislation.

Personal data ecosystems (PDE). In descriptive terms this concept refers to the interacting personal data processing systems (PDPSs) that have been proliferating for some time now. This may concern the vast data servers that store data collected by the National Security Agency (NSA), or the consumer data aggregated and mined by data brokers such as Acxiom or Experian, who sell consumer profiles for marketing or credit rating, or those that offer public or private cloud services to either businesses or individual end users. It may also concern those that offer authentication services (trusted identities in the technical sense) or personal data management services (based on credentials or attributes as pseudonyms). At a higher level of abstraction, a PDE may include Trust Frameworks and Trust Networks that should allow, for example, context-aware personal data management.

Personal data management (PDM). Though PDM is also the acronym for personal data monetisation, we use it to refer to the user-centric management of an individual's own personal data – facilitated by various types of architectures – to make sure that a person can retain a degree of control over who gets access to which of her personal data. In many ways it builds on user-centric Identity Management Systems. PDM can be achieved by securing one's personal data in a digital vault (on one's own device or distributed in a cloud), and the architecture may be offered by one service provider to all of its customers, or in the form of a platform which allows individuals to connect in a secure way with various service providers. The bottom line is that a user's personal data are not shared without their consent or, in the case of necessity (contract, a legal obligation, vital interests, public tasks or the legitimate interests of the data controller), on condition that they will not be used for other purposes than those stipulated when access was provided.

Monetisation can be the effect of personal data management, i.e. if companies are willing to share part of the profits they make on the value of personal data with the data subject. PDM may lead to a situation where the added-value services which are nourished by personal data will only function if data subjects can actively and knowingly participate in the creation of the added value, while getting their piece of the cake that was thus enlarged. One important issue is to what extent PDMs will provide adequate control over the observed and inferred data that nourish most of the business cases of Big Data analytics.


2.3. User-Centric Personal Data Management

The basic mechanisms that the various authors in this volume explore for understanding and managing this delicately balanced ecosystem are user-centric data management tools: methods and business models that are intended to empower the individual by allowing her a measure of control over her own data. This should concern not only her volunteered but also her observed data, and the inferences made from them. User-centric data management is of course only one solution to the problems described above, but it is worth considering as a focus for this enquiry, as it has the potential to address the problems while remaining consistent with the Enlightenment project of self-determination. Both regulators and industry have taken up the idea of developing some form of personal data management as a follow-up to identity management.

The vision of personal data management is quite simple: the data subject controls and curates data about herself (not all data, but the data she is able to collect). If anyone wants it, they have to ask her for it – she gives them whatever she wants to share. If she wishes to keep the data to herself, then she does not give out information. She can give a certain piece of information to one person (her doctor, say), but not another (her insurance broker). This is a complex process, but it is managed by software tools which simultaneously store or access the data (it may be distributed across various repositories) and manage the interface between data subject and data user. She can outsource the management of the data to other organisations and ask them to intercede for her; many of the likely situations where her information is required will be too complex for her to want to manage, and the volume of requests may simply outpace her ability to monitor them personally. There are clearly issues of trust, usability and security raised by such an arrangement. Is it feasible? Is it limited to access management, or could it also enable usage management? The vision is simple – its implementation is hardly so (a toy sketch of the simple part follows at the end of this passage).

How are we to understand user-centric personal data management? This issue is explored at the theoretical level in the chapter by Bus and Nguyen, who provide an abstract specification of the various relationships and positions relevant to the social and technical context of the new digital world as they impact on the individual and her self-determination. They promote the idea of context-aware personal data management, building on ideas from thinkers such as Kim Cameron and Ann Cavoukian (both represented in this volume), in the following terms:

"Context-aware PDM (CPDM) enables an individual to control the access and use of her personal data in a way that gives her sufficient autonomy to determine, maintain, and develop her identity as an individual, which includes presenting aspects (attributes) of her identity dependent on the context of the transactions (communication, data sharing, etc.), and enabling consideration of constraints relevant to personal preferences, cultural, social, and legal norms. Trustworthy data practices are foundational to enabling Context-aware PDM."

Bus and Nguyen unravel this definitional statement in their chapter, but without anticipating their careful discussion, it is useful for us to highlight a couple of its aspects. It sets out an ideal for personal data management in terms familiar to students of the rationalist version of Enlightenment.
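Returning to the ask-her-first vision above: at its core it is a consent lookup keyed on data item and requester. The following minimal sketch is our own illustration (the store, the policy structure and the requester names are assumptions, not a description of any PDM platform discussed in this volume); real systems would add authentication, audit trails, delegation and usage constraints:

```python
# A toy user-centric personal data store: the data subject curates her
# data, and a policy she controls decides who may read which item.

personal_store = {
    "blood_type": "A+",
    "home_address": "...",          # deliberately elided here
    "jogging_route": "river loop",
}

# The subject grants access per data item and per requester.
consent_policy = {
    ("blood_type", "doctor"): True,
    ("blood_type", "insurance_broker"): False,  # same item, different audience
    ("jogging_route", "running_club"): True,
}

def request(item: str, requester: str):
    """Release an item only if the subject has consented for this requester."""
    if consent_policy.get((item, requester), False):  # default: deny
        return personal_store[item]
    raise PermissionError(f"{requester} has no consent to read {item!r}")

print(request("blood_type", "doctor"))        # released
# request("blood_type", "insurance_broker")   # would raise PermissionError
```

Note that a sketch like this covers access management only; usage management – what the doctor may subsequently do with the released item – is precisely the open question raised above.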
The ideal technological solution produces autonomy for the individual, in particular by allowing her to develop her identity (or identities), in the sense of selfhood, and to present different aspects of those identities to different audiences and organisations depending on the requirements of the context and her own preferences (i.e. she identifies herself in a more technical sense). To get a beer she needs to show she is over 18, while to drive a car she needs to show that she has taken an appropriate test, which will clearly require a greater release of information. To access a web service she may be required to accept tracking and tracing, which she could reject by default or accept depending on certain conditions.

This autonomy has limits. For instance, in order for an identity in the technical sense to be useful, it has to serve the purposes not only of the individual but also of the organisations demanding it. The data required to drink a beer, while disclosing minimal information, must be verifiable.
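The point about verifiable minimal disclosure can be pictured as a toy predicate check. This sketch is ours, not drawn from any chapter in this volume, and the attribute names are illustrative assumptions; a real system would rely on certified attributes (e.g. anonymous credentials signed by an issuer) rather than self-asserted values:

```python
from datetime import date

# Toy attribute store: in practice these would be certified claims
# issued by a trusted party, not values the individual asserts herself.
attributes = {"date_of_birth": date(1990, 5, 1), "driving_licence": True}

def prove_over_18(attrs: dict, today: date) -> bool:
    """Answer the yes/no question without releasing the birth date itself."""
    dob = attrs["date_of_birth"]
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return age >= 18

# The bartender learns a single bit -- 'over 18: yes' -- rather than the
# birth date; verifiability would come from the issuer's signature.
print(prove_over_18(attributes, date.today()))
```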
This must inevitably stifle some of the individual's options – for example, she may creatively wish to present herself as she is not (for example, to fake some episodes in the past). There are many reasons why this could be done – most obviously, determining one's own identity (for example as a good father or mother) is not always strongly connected to the facts. Erving Goffman's dramaturgical analysis of the presentation of the self, by evoking the theatre, ipso facto evokes the possibility of creative reconstruction [7]. The function of memory is not the retrieval of facts but rather sensemaking, which is not dependent on the strict truth of what is remembered. More prosaically, Dodge and Kitchin [8] have argued that the best way to protect personal data stores is to falsify random pieces of information, so that anyone snooping in the store cannot be certain that a particular data item retrieved is actually true. Mayer-Schönberger [9] has produced a related argument that the only way to ensure that identities can develop in a non-pathological way is to delete information periodically and automatically. This does somewhat overturn the Enlightenment ideal for personal data management by removing human agency from the decision-making.

There is indeed a flip side to the rationalist version of the Enlightened ideal of personal data management, which is that the collected data, even (perhaps especially) if curated by the data subject herself, are a valuable resource for others. Commercial organisations may pay for the right to process or use them. That at least would allow the individual some rights over how the information is used, but as Hildebrandt has argued [10], without transparency the individual may be unaware of how it will be used and what value will be extracted by the purchasing firm. How can she make an informed decision to provide access? How can her supposed autonomy be reconciled with her profound ignorance of what is likely to happen?

Perhaps more to the point, governments will always give themselves the powers to use collected data, covertly or otherwise [11]. Policing, national security and public health are reasons that will always be cited. And, although one wearies of the extraordinarily dim, wilfully complacent and shockingly bogus argument that "if you have nothing to hide you have nothing to fear", it is routinely trotted out, not only by tabloid newspaper editors with a vested interest in the destruction of privacy, but also by responsible public officials. It was, for instance, the response of the United Kingdom's Foreign Secretary William Hague [12] in a comment about the US government's clandestine PRISM surveillance programme, the existence of which was revealed (a "landmark event", according to Kim Cameron in his Afterword to this volume) by a whistleblower in The Washington Post and The Guardian. Yet the UK Intelligence Services Act 1994 allows electronic surveillance "in the interests of the economic wellbeing of the United Kingdom in relation to the actions or intentions of persons outside the British islands", which has allegedly been interpreted as allowing the UK government to bug foreign diplomats to determine their negotiating positions prior to economic summits [13]. These diplomats had something to hide, sure – but it is perfectly legitimate to want to hide a negotiating position.

The PRISM affair highlights an important failing of the European digital economy. European citizens are at risk from American snoopers because there are very few alternatives to US-led online services. We use Google, Facebook and Twitter because the European alternatives are puny in comparison. This is partly due to the first-mover advantage afforded by network effects, but partly it reflects the greater entrepreneurialism evident in Silicon Valley, which has had the unhappy effect of giving the NSA easy access to 'our' data.9 It is probably a myth that President George W. Bush once declared that the trouble with the French is that they have no word for entrepreneur, but given the US's commercial lead, it has the ring of truth (applied not only to the French, but to Europe as a whole). As demonstrated in this volume, there are plenty of options for the private sector to move into the space of personal data management – and this may be the very time to strike.

The bottom line is simple: if we put all our data eggs in one basket, we should not be surprised when those interested in us make a grab for the basket. Hence we endorse the conclusion of Bus and Nguyen that the functioning of the edifice of personal data management would require appropriate norms, regulations and "trustworthy data practices". To the extent that it provides for data protection by design it might even enable a win-win situation, though if it merely allows consumers to trade their data without inbuilt verification of lawful and fair usage, it is unclear what the advantage will be. We should not allow ourselves to become compromised by our own involvement in the monetisation of personal data in ways that undo the invaluable value of personal data.

9 Though having said that, the colossal Tempora programme of the UK's GCHQ to 'Master the Internet' is also stunning in the quantity of data and metadata it amasses, all on the basis of an innocuous loophole in the Regulation of Investigatory Powers Act 2000, which has allowed GCHQ's lawyers to present a case for Tempora's legality, but has also cast doubt on its democratic legitimacy. GCHQ allegedly shares sensitive personal information with the NSA [14].

3. Autonomy and Heteronomy in the Era of Predictive Analytics

As philosopher Bernard Stiegler observes in our first chapter, 'Die Aufklärung in the Age of Philosophical Engineering':


And while traceability continues to expand, it seems it is mainly being used for behaviour profiling, and thus to increase the heteronomy of individuals rather than their autonomy.

The combined usage of volunteered, observed and inferred data that concerns us – because these data refer to us and/or generate all types of automated decisions about us – confronts us with a novel cognitive economy. The quantity, forms and implications of privately held or publicly shared knowledge have long since reached unprecedented levels of accumulation, while no human mind nor any computing system could claim to 'know' the content of all the data 'out there'. It is important, therefore, to reassess what it means that so much information is being processed by interacting computing systems whose operations are opaque to most of us, and may even be hard to assess by those who designed them.

Stiegler, in fact, describes how the grammatisation (discretisation) of behaviours follows up on the earlier grammatisation of spoken and written language, all depending on what he identifies as retention, i.e. our ability to retain the flux of life as a perception (primary retention), a memory (secondary retention), and by means of technical devices that allow the storing or even the machine-to-machine processing of information (the handwritten manuscript, the printing press, photographs, mass media, and now the digital computer and the Internet; all different forms of tertiary retention). For a tertiary retention to make sense to us, we need to introject it; to make it 'come to mind' – otherwise it remains outside our cognition and cannot inform our actions. The Enlightenment Age, with its emphasis on critical thinking and reasoned discourse, may be understood as the age of the 'reading brain'. As Stiegler indicates, referring to the work of Maryanne Wolf [15], the morphology as well as the behaviour of a 'reading brain' differs substantially from brains that have not been trained to read and write. So, if introjection of written text is a matter of reading, what would be the introjection of Big Data or of predictive analytics – that altogether different tertiary retention? Stiegler thus raises the important question of what politics are involved in the introjection of digital automata that may have been processed machine-to-machine by a number of computing systems before reaching a human 'consumer':


Without such a politics, the inevitable destiny of the digital brain is to find itself slowly but surely short-circuited by automata, and thus find itself incapable of constituting a new form of society with other digital brains.

The question of personal data monetisation is part and parcel of such politics: can we develop a personal data ecosystem that allows for the trading of personal data in a way that is fair and comprehensible for individual human beings – or will the architecture we need to enable such trading take over and replace our individual discernment? This is not merely a question of trusted computing in the technical sense of having a secure environment with the right type of encryption to authenticate access and to guarantee the confidentiality of the content of our personal data. It is not even the question of the confidentiality, integrity and availability of my personal data. It is about the extent to which we can foresee what data derivatives we match, and how this will constrain or enable us to develop our personal identity. Will personal data management, for instance, help to achieve such foreseeability? Will it re-enable a degree of autonomy, or would it reinforce the heteronomy of individuals in the era of Big Data?

This raises another set of questions related to the nature of personal identity. Agre and Rotenberg have defined the right to privacy as:

The freedom from unreasonable constraints on the construction of one's identity [16, p. 7].

This definition is more abstract, but maybe more effective, than the more conventional definitions of privacy as (1) the right to be left alone and (2) the right to decide when, how, and to what extent information about oneself is communicated to others. The first heralds the right to privacy as a liberty, a negative form of freedom; the second heralds the right to privacy as the tool of a sovereign who reigns supreme over her personal data, a positive form of freedom. Both definitions have their drawbacks, and the beauty of Agre and Rotenberg's notion of privacy is that it acknowledges what is at stake with our privacy by referring to one's identity, while it also admits the relational character of privacy and the need to realise that constraints are, in principle, inevitable.

Identity is relational; it is built while interacting with others and subject to all kinds of constraints. This is not a new fact, but perhaps the Enlightenment Age, with its emphasis on the individual person, has come to believe its own myth: that an individual person with an individual identity is an entirely independent and fully autonomous being that admits no interference from the outside. This is a line of thought investigated by the second chapter in this volume, Personal Data: Changing Selves, Changing Privacies, by Charles Ess and Hallvard Fossheim. They explain:

[T]he technologies of literacy-print correlate with high modern conceptions of the self as a primarily individual self – in Charles Taylor's terms, a 'punctual' or radically disengaged self (1989). Such an individual self, understood as a rational autonomy, and the modern liberal democratic state seem non-accidentally suited to each other.

They then suggest that:


By contrast, both orality and secondary orality correlate with more relational conceptions of selfhood.

Secondary orality is a term derived from media studies, identifying certain characteristics in hyperlinked digital environments that compare to those of prehistoric, oral societies. One of these characteristics is a more relational understanding of self and a less unified notion of identity. They highlight the vulnerability of the relational self, and they warn that societies cherishing the relational character of the self tend to make the self dependent on social hierarchies, and seem to foster non-democratic politics. Though this is debatable, considering the egalitarian nature of many face-to-face societies [17], it seems clear that a relational self is more dependent on its environment than the unencumbered self of the Enlightenment's autonomous rational subject.

This has major implications for privacy. According to Ess and Fossheim, societies that thrive on the idea of a relational self often have a negative opinion of privacy, as it is seen as an antisocial characteristic. They suggest that the current tendency to share and expose one's identity indicates a shift from atomistic notions of an independent self to more relational notions of an interdependent self. This, they argue, requires a new, hybrid understanding of the self as both relational and autonomous. This would require a shift towards, for instance, Nissenbaum's theory of contextual privacy [18]. The interesting question of how this relates to the monetisation and management of personal data within the context of the emerging personal data ecosystem remains unanswered; but in order to assess to what extent we can expect individual persons to effectively manage their personal data, we need to come to terms with the issue of individual autonomy and the impact of others, of technological infrastructures and of societal institutions on the self. Can we discuss personal data management in terms of individual consent and rational choice, or should we admit to the bounded rationality that forms the point of departure of behavioural economics? Is the rational liberal subject that forms the hidden premise of much talk of consent as autonomous as liberals proclaim, or should we admit the social nature of the self, and thereby acknowledge the delicate mix of autonomy and heteronomy of individual persons? What does this mean for privacy: is it perhaps more than a private interest [19]?

4. Privacy as a Public Good

In this context, it is worth asking how the evolution of social norms, the development of technology and our normative, political and philosophical assumptions have interacted to create this particular example of the dialectic of Enlightenment. The answer might best be illustrated by considering the perhaps unexpected tension of personal choice and autonomy, as it is played out in the ideological and commercial arenas, where the subtleties of political theory are so easily lost in the noise. So let us consider this dialectic in action.


Note that there is a health warning here: the arguments presented in this section are not normative ones, and this is not a piece of political philosophy. This section reports what happens when people with political, ideological, commercial and technological agendas conflict in a contested and ineffectively regulated space. The point here is not whether one believes that privacy, as related to autonomy and identity, is a precondition for a fair and free market, as well as for a viable democracy that requires individuals capable of making up their own mind. If you are reading this book, you probably subscribe to that complex proposition. In this section we consider how, and how not, to establish that point against those who do not believe it, those who believe it but think that care in this area is unnecessary, and those who believe that it is irrelevant to some of the current trends in technology which have caused concerns.

Obviously, some would claim that the argument for privacy as a public good is a normative issue, and that reformulating it in terms of a descriptive argument hides rather than removes normative assumptions. For the sake of the argument, we will, however, investigate the strength of a liberal position that is often invoked in favour of privacy as autonomy. In a celebrated passage, political philosopher John Stuart Mill argued that:


The sole end for which mankind are warranted, individually or collectively, in interfering with the liberty of action of any of their number, is self-protection. That the only purpose for which power can rightfully be exercised over any member of a civilized community, against his will, is to prevent harm to others. His own good, either physical or moral, is not a sufficient warrant. He cannot rightfully be compelled to do or to forbear because it will be better for him to do so, because it will make him happier, because, in the opinion of others, to do so would be wise or even right [20, p. 14].

Let us call this the Mill Test of whether coercion (whether overt, or of the subtler 'nudge' kind) is justified to prevent harm (it is often referred to as the 'harm principle' [21]). It has become increasingly influential as freedom has become a prized political good, and now defines an area of private life where you have, in the classic account of Warren and Brandeis, the right to be let alone [22].

Ironically, the application of the Mill Test specifies a space for decisional privacy, but in our networked society decisions are often made to sacrifice informational privacy in exchange for free or useful services, even though many commentators believe that to do so is hardly wise or right at all. In less inflammatory terms, a decision is taken to exercise one's data protection rights which results in a loss of control over one's personal data – even if one retains rights of control under data protection law, these rights are seldom exercised and are, in practice, very hard to exercise.

When we apply the Mill Test to a decision to relinquish control over personal information – for example by joining a social networking site (SNS), and thereby colluding in the creation of a large amount of rich data about our social network and other aspects of our activities to the SNS owners – there is a tacit assumption that the privacy that results from being invisible to the SNS benefits only the individual. If that is so, then in a 'civilized community' it is not acceptable to regulate SNSs by preventing or discouraging people from joining them in order to prevent them from carelessly giving their privacy away in a manner they might later regret. People can be informed of the potential for breaches of privacy, but if they continue to have a cavalier attitude towards their personal information, that is their business. If privacy benefits the individual only, then it follows that lack of privacy may harm the individual only; the lack of potential for harm to others means that the Mill Test rules out any interference with the actions of individuals to preserve their privacy.


That analysis is shared by two usually antagonistic ideologies. Many liberals, libertarians and individualists champion the Mill Test; for them, privacy is important because it protects autonomy: the ability to make informed decisions in the absence of coercion. Without control over access to his person, reflections, decisions, and information about him, an individual is not fully informed about his environment, cannot be guaranteed to detect or avoid coercion, and cannot be authentically himself [23]. Against that, so-called communitarians argue that freedoms only make sense against the background of a culture which maintains them. Rights entail responsibilities to ensure that communities function properly and humanely, and when individuals pursue their own rights beyond a certain point, the community suffers. Although they are important, privacy rights can undermine community cohesion, so when a community faces a well-documented threat (not just a theoretical one) to the common good, steps to curb privacy are not ruled out by the Mill Test. Etzioni lists a series of potential areas for intervention, including 'Megan's Law', a group of laws at both federal and state level in the US which require the addresses of convicted sex offenders to be publicly available [24].

Liberals and communitarians are generally in opposition, but in their joint support for the Mill Test they agree implicitly that the gains of privacy accrue to the individual, while its costs are felt by wider society (perhaps in terms of loss of security, or lack of efficiency). Liberals and communitarians thus believe that privacy is a private good, like life, wealth and freedom. They think that, unlike clean air, clean water and democracy, it is not a public good whose benefits accrue to the community at large.

This unusual consensus also finds support from those technological determinists who argue that the extent of privacy is a function of our deployment of information technology to undermine the so-called 'practical obscurity' which results from what Luciano Floridi has characterised as 'ontological friction' in information flow [25]. If technology makes it easier for information about an individual to get from A to B, and if that individual colludes with the deployment of that technology, then privacy is de facto difficult to protect, and some technological determinists – for example, Jeff Jarvis – argue that in that case it is socially harmful to do so [26]. Such determinism, rendered more plausible by the tacit assumption that the technology itself plays only a neutral role in this novel scenario, was famously expressed by Scott McNealy as "you have zero privacy anyway. Get over it." Facebook CEO Mark Zuckerberg has also often argued that the reduction of privacy in the Web-enabled world is solely a result of the unforced take-up of privacy-inhibiting technologies changing social norms.10 These arguments bring with them the unspoken implication that the technology has no effect on social norms, and that users take up privacy-threatening technologies that are already and independently acceptable to them. This argument, which tends to come from technology companies and cyber-enthusiasts, is surely disingenuous, if not self-serving.

Rather, we find ourselves facing a new paradigm. When surveillance, data production and data analysis were the prerogatives of governments and large corporations, individuals gained little from visibility – privacy was the right to be left alone. Data tended to be created in separate and extra processes – for example, data about shopping habits was gleaned by market research companies asking people to fill in questionnaires. Filling in forms was tedious, and there was little incentive for consumers or citizens to cooperate by doing so. Demand for data exceeded supply.

Now, digital transactions increase the supply of data by producing data as an organic by-product of an online transaction. A person looks at a web page, buys a book, uploads a photo, sends a text – the observed data recording the transaction is created naturally and immediately. The massive supply of data has laid the groundwork for the design and production of innovative services (which themselves generate even more data), services which provide value not only for companies and governments but allegedly also for the data subjects. Moore's Law means that processing power has kept pace with this increase in data supply. So for the first time we find ourselves in a world where there are many enticing incentives for consumers and citizens to risk their privacy, waiving their data protection rights and becoming visible to their networks.

If we take a revealed preference view of belief-desire psychology, individuals' behaviour in this new world shows that they (or many of them) prefer the benefits of visibility to the now rather old-fashioned idea of being private. They assess its benefits and costs, and make a decision accordingly. This may not be easy – how do you compare the immediate benefits of providing data about oneself with the theoretical risk, several years down the line, of its being misused? – but that is no different in principle from many similar discounting decisions we make in the ordinary course of events. The benefits can be quantified – in 2010, the value of free services funded by surveillance-based advertising, minus a discount for foregone privacy, was estimated at over €100 billion [28].
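The structure of such a discounting decision can be made explicit with a stylised formalisation – our illustration only, not a calculation proposed anywhere in this volume. On this reading, an individual disclosing data weighs an immediate benefit $B$ against a discounted expected harm, and discloses when

$$B > p \cdot \delta^{t} \cdot H, \qquad 0 < \delta < 1,$$

where $p$ is the probability that the disclosed data is eventually misused, $H$ the harm if it is, $t$ the delay in years, and $\delta$ a personal discount factor. Because $\delta^{t}$ shrinks geometrically as $t$ grows, even a large potential harm is easily outweighed by a modest immediate benefit – one way of seeing why revealed preferences tilt so heavily towards disclosure.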

Liberal arguments about the value of privacy in its protection of autonomy [23] now seem theoretical and unrealistic as more people flock to SNSs and see benefit in playing with identities and self-descriptions, and exploring new types of meaningful interaction. Most people are reasonably clear about the artificialities of SNSs (e.g. they know the difference between a real-world friend and a Facebook friend), and are prepared to experiment – few are completely passive consumers [29]. Loss of autonomy may be compensated for by increased control over identity and self-presentation. To many, the ability to play with identity is a real and present benefit, and the loss of autonomy a complex, theoretical and distant postulate of political philosophy.

So the liberal, by assuming the application of the Mill Test to decisions about data, cedes so much ground to the technological determinists that it becomes difficult in the extreme to defend privacy.11 The liberal arguments to defend privacy tend to point to theoretical harms to society as a whole. For example, [30] argues cogently that privacy is essential to the basic values of the European tradition – self-determination, democratic participation and economic well-being. But these philosophical and normative arguments are not guaranteed to be accepted by opponents who constantly make the demand: show me another individual who has been harmed tangibly by my actions in consenting to allow third party access to my data. Although privacy is immensely important to a liberal society, classical liberalism will undermine itself if it cannot accede to this demand. Is there a way out of the impasse?

10 In honour of Zuckerberg's role in crafting this deterministic position, one author of this introduction has christened it 'Zuckerbollocks' [27]. We should make it clear that Zuckerbollocks is a populist rendering of technological determinism, rather than a carefully thought-out and philosophically-respectable position, and when we talk of technological determinism in this section we specifically refer to Zuckerbollocks.
11 This isn't quite true. There are arguments available that an individual's autonomy has a social value as well as value to the individual (for example, it enables democracy to function more effectively). However, these arguments tend to rest on liberal assumptions, and so do not always appeal widely to non-liberals, and are not effective to defend liberalism against hostile attack.


Fortunately, there is. Arguments that can meet the demands of the opponent are readily available, although not made as often as they should be.12

• Accountability. An individual's autonomy has a social function – only autonomous persons are properly accountable. When a person's privacy is diminished, the question of his responsibility for his actions becomes muddied – and the loser is wider society, not that person himself [19,31].
• Profiling. Many decisions are framed by the use of data to classify people and 'personalise' (or, put another way, 'restrict') choices. When others forego privacy, their data can create a stereotype against which a privacy-sensitive individual may be matched despite her attempts to maintain control [10,32,33]. The stereotypes can be developed from group behaviours, and need not include anything about the individual herself, except to place her in the group (perhaps for demographic reasons – she is a divorced executive aged 41–50 – or because she lives in a particular postcode or zip code area).
• Security. Much writing on privacy assumes a security/privacy trade-off. Privacy is a right, but security is a primary function of the state. Yet even if this trade-off sometimes exists, is it the usual condition? Arguably not – a loss of privacy can result immediately in a loss of security when data become public, or are leaked [34,35].
• Trading data. Because data is economically valuable, there is a case for commodification to allow the data subject to profit alongside data processors [36]. Yet without the measure of control that privacy brings, asymmetries of knowledge would make the functioning of such a market inefficient [37]. Could citizens meaningfully consent to their data being exploited without any idea of how and in what context it will be used?
• Credible signalling and full disclosure. New markets are being created as increased data lowers the costs of credible signalling. For instance, a social network is a good indicator of creditworthiness, so an individual could agree to make their SNS data available to a lender in return for a discounted interest rate.13 This may help the underbanked and those without collateral. However, there is a danger that those who do not wish to disclose data will be penalised on the assumption that their data would not provide a positive signal; in the banking example, a privacy-aware individual would automatically be assumed to be uncreditworthy [38].
• Chilling effects. As privacy decreases, behaviour will adapt. Even in the absence of overt censorship, people will experiment and innovate less, and express themselves less freely [9]. This may be a particular effect of the recent revelations about PRISM.

These various arguments all demonstrate the importance of privacy as a public or social good without relying on liberal premises. If any of these arguments is convincing, then it follows that the Mill Test does not apply to privacy, and that society may take steps to protect privacy even in the face of mass market behaviour which reveals preferences for other goods over privacy.

12 This argument is made in more detail in [27].
13 Examples of companies that are exploring this business opportunity include Kreditech (http://www.kreditech.com/), Neo (https://www.myneoloan.com/) and Lenddo (https://www.lenddo.com/).


In fact we may go further: these arguments confirm that liberal notions of purely autonomous individuals, the communitarian celebration of collective identity, and technological determinism all miss the point. Individuals are always relational: they emerge from the various communities in which they participate, and they also co-constitute these communities. Information and communication technologies mediate the process of identity construction, but do not over-determine either individual or society. These arguments build on a more robust understanding of self, technology and society, acknowledging the inherently relational character of privacy.

For the user-centric data management systems considered by the chapters in this book, then, there are indeed important reasons why privacy protection needs to be built into the architectures and system designs. Personal data management should not simply endow the data subject with the ability to sell to the highest bidder. Many of the chapters in this book discuss regulations, architectures and technologies which will begin to help us negotiate the tricky line between maximising the value of our data and minimising our exposure to unwelcome surveillance, and to help us draw lines (either in law or via social norms) when immediate personal gain threatens to produce long-term social loss.


5. Overview of the Chapters

Such is the political, legal and technical background in which we find ourselves in 2013. We are grasping to find solutions to problems we perhaps only dimly perceive – or worse, we are looking for problems which may not exist while missing the serious issues just around the corner. We do not know whether solutions should be given the full power of law, or whether a quick techie fix will do the trick. Meanwhile, the market presents its own dialectic and generates the solutions for which people will pay. Is the market excluding large sectors of the population, and will the discourses it produces warp or invade public space? And are innovation and social norms in any event moving so quickly that it is futile for institutions to try to keep up?

The Digital Enlightenment Yearbook 2013 has collected a series of chapters exploring the idea of the value of personal data. The different stakeholders in society and the different scientific communities (technology, law, philosophy, social science, economics), as well as entrepreneurs and policymakers, will have very different opinions and perspectives on this theme. Our intention in this book was to bring together these different perspectives to form a basis for inspiring and constructive discussions across disciplines. Most chapters were written especially for this volume, but a small minority are reworkings or repurposings of previously published material.

We are also extremely grateful for a Foreword by Robert Madelin, and an Afterword by Kim Cameron. Both of these pieces place our thinking into the wider context of big data, its promise (Madelin) and its dangers (Cameron). How, the reader may think, is the poor individual to cope, tossed about in the tsunami of data, services and innovation that is engulfing her? Both Madelin and Cameron agree that PDM is an essential part of the story, and the papers in this collection go some way to filling in the detail.

5.1. Part I: Background

The chapters are divided roughly into five main groups – of course in a volume of this nature there is a lot of inter-group overlap and intra-group heterogeneity, which is a roundabout way of saying that the editors (themselves a diverse trio) had some trouble deciding which chapters should go where.


Part I provides a background, and continues with some of the themes discussed in this introduction. The two chapters in this section, by Stiegler, and Ess and Fossheim, have been discussed at length above. They set the scene for the more detailed examination of the concepts of personal data management and valuation in the remaining chapters. Four further parts follow this opening section.


5.2. Part II: The Need for Privacy

Part II moves from the general issues surrounding our new digital world to the more particular considerations and challenges of the problem of personal data and its use and abuse, its protection and commoditisation. Sociology, psychology and policy are all explored here as we consider the ways in which the individual responds to and is shaped by technology. The individual wishes to determine her own identity – how can she use her own data to do that, and how can she control the process? Put another way: how do informational self-determination and privacy interlink so that each needs the other?

Lizzie Coles-Kemp and Joseph Reddington's chapter, entitled Not So Liminal Now: The Importance of Designing Privacy Features Across a Spectrum of Use, pushes beyond the mainstream of computer use to examine how the glut of personal data being created will affect sectors of society that are often neglected by technology markets and data regulators. Assumptions are made about data subjects' cognitive abilities that are simply unrealistic in the general case, and much of the discussion of how the new environment affects the autonomy of individuals is predicated on the possibility of their refusing to use a technology. Coles-Kemp and Reddington argue that neither of these optimistic assertions is necessarily true. They take a specific example – that of people with severe speech impairments who use Augmentative and Alternative Communication (AAC) systems – in which complex problems of data storage and data use crop up that perhaps would not have been anticipated. Yet the solutions proposed by AAC developers do perhaps contain lessons for more widely available and applicable technologies.

In their chapter on Privacy Value Network Analysis, Adam Joinson and David Houghton address the notion of privacy value head on. They use methods from network analysis and the social capital literature to analyse and visualise the creation of value across a network. As we have noted above, there is real value for consumers created by networks, and any privacy-aware individual needs to make highly complex calculations as to how much information she can release – and even then there is an important question as to how informed she can possibly be about its use (a lacuna that a number of papers in this volume attempt to fill). Consumers generally can be disturbed when specific advertising practices are made clear to them (Google's Eric Schmidt once famously described his own company's policy as being "to get right up to the creepy line" – a notion that is in itself so creepy that it raises creepiness to the meta-level). Joinson and Houghton overlay techniques from value network analysis with ideas about managing communications at the boundary of our selves, our personal relationships and our group memberships to visualise notions of information exchange, the goals of interaction and the impacts across the network. In this way they hope to express, and ultimately to influence, decisions made to disclose or not disclose information.


Ann Cavoukian’s chapter, The Personal Data Ecosystem (PDE): A Privacy By Design Approach to the Pursuit of Radical Control, describes the technological components of the evolving idea of a personal data ecosystem (PDE – defined earlier in this introduction). Cavoukian takes the idea of privacy by design (PbD) seriously – this is the notion that, when designing systems that will hold or otherwise deal with personal data, one should design it with the privacy of the data subjects as a first order feature of the system, as opposed to the all too common practice of designing a system that does everything the business model demands, and then tacking a privacy management component onto the design as an afterthought. PbD, one would think, is an obvious way forward, yet it is proving strangely (or not so strangely!) difficult to promote. Cavoukian describes the ideal of ‘radical control’ over our personal data , which she argues should underlie the design of any PDE. In the absence of trust in government to resist pressure from large commercial players and from its own intelligence communities, radical control may well be the only way forward to protect individuals’ data unconditionally. Put another way, if we wish to inject privacy by design into PDEs, Cavoukian argues that it is essential to put control totally in the hands of the individual; if it is contracted out to governmental actors, we are lost. It is certainly interesting, when considered in the context of the rest of the chapters in this book, that many of the components of radical control are addressed separately, which is suggestive of the grand sweep of the ambition of this chapter. But Cavoukian appears to argue that without such ambition, privacy cannot be protected in full. Alexander Novotny and Sarah Spiekermann’s chapter signals equal ambition, in its title: Personal Information Markets AND Privacy: A New Model to Solve the Controversy. In company with many other chapters, Novotny and Spiekermann start from the extraordinary business value derived from personal data, and consider the loss of trust that consumers feel as their data is swapped, shared and analysed outside any apparent control. The control that is afforded, for example, by Privacy Enhancing Technologies, requires more input than the poor consumer is prepared to put into the matter. Life is not only not private, but it is too short to make it private. Novotny and Spiekermann propose a three-tier model for markets in personal data and information, in which money can be made but privacy also protected. The first tier contains the data subjects and those organisations with which they have a direct relationship. The second tier contains the business service providers which support the first tier operations. The third tier is everyone else in the market who cannot deal direct with personal data. Breaking markets down like this, the model is able to specify a set of rights and responsibilities for each actor, and technological and legal enablers for each relationship, many of which already exist in current regulation but which may not always be properly enforced. The authors recognise several challenges to their model, some technical – for example, the need to ensure anonymisation of data when it reaches the third tier – and others more practical, including global enforcement. 
5.3. Part III: Architectures for PDMs and PDEs

Part III brings together papers which describe potential architectures at a relatively high level of abstraction. Here, we look not at proposed systems but at types of technology, considering what issues they raise and what problems can be solved when we consider the functional units within systems and the relations between them. What are the properties that systems will need in order to foster privacy or to support flexible, informed consent?


Online Privacy: Towards Informational Self-Determination on the Net, the chapter by Simone Fischer-Hübner et al., decries the current status of online privacy provision and the way in which we have sleepwalked into a world where privacy is routinely compromised in order to fund free services such as search and social networking support. Informational self-determination appears to have been lost in the rush to exploit and monetise personal data. The chapter, an updated version of a manifesto written and published in 2011, reviews the state of the art in privacy provision, and argues that current PETs lag behind the progress made in unlocking the meaning in data, and often fall down on important characteristics such as usability and scalability. No wonder demand appears to be low. They identify a series of challenges, including introducing transparency about the use of data and the risks to privacy, and the provision of workable tools. Their third challenge is identity management systems that undo information asymmetries and restore control of identifying information to the individual, and this may well be a crucial and neglected part of the picture. Their ten recommendations for regulatory change, and four recommendations for further research, set out a manifesto which puts the individual at the centre of data management.

Johannes Buchmann et al. focus on online social networks in their chapter Personal Information Dashboard: Putting the Individual Back in Control. The self-explanatory title indicates their approach of providing intuitive visualisations and automatic tools for bringing together data from a range of sources to allow the user to understand the nature of the footprint she makes across the range of her online identities – a step toward the transparency which is called for by Fischer-Hübner et al. Techniques such as machine learning and correlation models allow the presentation of options for lowering privacy risk (assuming that is what the individual wants). If she does not, then at least she is informed about the risks to privacy that she runs. Buchmann et al. set out in detail an architecture including a series of privacy-enhancing features and components to calculate current levels of privacy and to aid in decisions about what to make public.

Uninformed or unconditional consent is a problem identified by both Fischer-Hübner et al. and Buchmann et al. Edgar Whitley contributes a study of technical methods for supporting informed consent in his chapter Towards Effective Consent-Based Control of Personal Data. Consent is a cornerstone of existing data protection, although as Whitley shows, current information systems tend to treat it as a simple concept, a black box that either allows or disallows the processing of data. However, as a matter of fact, our views of how our data should be accessed and used are likely to be more nuanced in a real-world context. Furthermore, consent needs to be informed, yet – as a number of papers in this collection make clear throughout – it is far from established that a typical data subject is genuinely informed about what goes on, given the quantity of data floating around cyberspace and the sheer complexity of the various transactions in which it is central.
Indeed, many institutions engineer their interactions with data subjects so that consent decisions are skewed, creating the illusion of informed consent. Does too much hang on the concept of consent, then? Whitley explores the possibility of a more dynamic and user-centred notion of consent supported by technology, but informed by wider social science research.


5.4. Part IV: Other Sources of Data

Part IV is a digression which connects with our main theme through the use of data. We have focused, in this introduction, on data about an individual, whether volunteered, observed or inferred. In managing her own data, the individual is in general likely to be using data she herself has generated, yet this is dwarfed in quantity by data held by others (banks, supermarkets, energy companies). Add to this the extra data to which she arguably has a right (government data), which is relevant to her, if not directly about her (data about her local community, schools, roads, transport timetables), and suddenly her PDE looks very rich and valuable to her personally – if she can get hold of that data and use it effectively.

The push toward open data, which has developed enormous momentum in a very short space of time [39], creates another source of valuable data about the communities in which the individual lives and works. Open data, machine readable and online under highly liberal licences, can be exploited for any purpose, and it is hoped will revolutionise government and commerce. New innovative services will be built on the back of open data stores, and will be part of the big data story that Robert Madelin's Foreword looks forward to. However, open data, with the consequent lack of control that this implies, brings its own issues to the individual as well. No government would publish personal data as open data, but how about open data derived from personal data (aggregated or anonymised)? How about open microdata? The law here is somewhat untested, and two papers explore the interesting nexus between open data and personal data from a legal perspective. A third looks at the provision of data about individuals to those individuals by business.

Ugo Pagallo and Eleonora Bassi, in their chapter Open Data Protection: Challenges, Perspectives and Tools for the Reuse of Public Sector Information, consider the relation between personal data and open data, examining the large amount of public sector information (PSI) which is derived from personal data, such as registrations of vehicles or land. They stress the real possibility of a "divorce" between rights to data and data protection, then explore the role of privacy by design as a marriage guidance counsellor might. There are several mechanisms, including privacy impact assessments and anonymisation techniques, which individually may not be sufficient to prevent a split, but which in aggregate may allow progress; they also would wish to include the notions of control and sensitivity to user preference that several authors have already advocated. They argue that, in the right regulatory and technical context, privacy and openness are not in a zero-sum game.

Katleen Janssen and Sara Hugelier's chapter Open Data: a New Battle in an Old War Between Access and Privacy examines the same question from a more historical perspective, looking at how law has developed to help us balance rights to information with the rights to control, suppress or filter information which seem to constitute the positive privacy right. The drivers for freedom of information and open data seem very different – FoI mechanisms are generally seen as a kind of redress or remedy following a reasonable request for information that has been denied, while open data is a positive decision to release data.
One does detect arguments about rights to data within the open data movement, but, in general, innovation and growth seem to be the main political drivers. Indeed, rights to privacy appear to be much more fully developed in law than rights to information. Janssen and Hugelier revisit older law resolving conflicts between transparency and privacy to gain insights into the potential for conflict in a world of open data.


Sir Nigel Shadbolt reviews the UK government’s midata initiative, working with the private sector to give data subjects access to the data that has been collected about them, in his chapter Midata: Towards a Personal Data Revolution – a title which shows the level of ambition for this programme. Data protection law guarantees an individual access to the information about her, but midata aims to make that routine. As with open data – a programme with which midata has much in common – there is no specific target to aim at. Midata is intended to stimulate new markets, and it is envisaged that individuals could gain a lot of value simply by visualising and then adapting their own consumption patterns. Several case studies are described, such as energy use, which in the era of smart grids, high energy costs and concern about the environment looks like a potential winner. The legal context of midata is complex, and Shadbolt describes the different approaches of the UK, the EU and the US. The midata initiative, like open data, provides another piece of the PDM jigsaw – we need technology to manage our data, but how much more powerful that model will be if we can secure access to some of the extremely rich data about us and our environment held by other organisations.


5.5. Part V: Personal Data Management: Examples and Overview

It is surprising how often a book on technology policy makes assumptions and lays down the law in an abstract way, independent of actual developments. Equally, it is also often the case that a particular form of technology is taken by commentators to be the paradigm for future developments, only for both it and the commentaries to be made obsolete by next year's wonder. It is of course hard to make a book on technology future-proof, but we hope that the debates within these covers will resonate beyond the situation at the time of writing in the summer of 2013.

To that end, we invited a number of key players in the field of user-centric personal data management to write chapters describing their views of the field, how we have got to where we are, and where we might go in the future. In particular, we asked these authors to set out what systems, methods, tools or formalisms they had implemented, to help answer the reader's obvious question: what is out there now? Of course at the time of writing we cannot know the future, but we have sampled the present, and in future years readers will be able to consider how far such systems have been influential. For now, these chapters provide a level of context which allows the reader to understand some of the sometimes very abstract discussions which occur throughout the book, and to get a sense that the politics of this area are very much current.

Carolyn Nguyen et al. look at the collection of information by businesses in their chapter A User-Centered Approach to the Data Dilemma: Context, Architecture and Policy. They point out the dilemma that regulation to protect the individual may threaten the free flow of information, dampening innovation and undermining companies' business models, and argue that the way round this dilemma is itself technological. They describe an architecture to handle metadata which will associate user preferences and permissions with data, allowing users the flexibility to change their policies and to consider unanticipated uses of their data. The core of their chapter is empirical work carried out by Microsoft (the affiliation of the authors) to understand user attitudes toward personal data, identity and trust – this is particularly welcome, as there is a surprising lack of such empirical work being used directly in system design or, as in this chapter, in the principles that form the framework for fair use of data. The metadata architecture is intended to allow data to flow, but with the privacy policies of the data subjects alongside it, with the aim of circumventing the 'data dilemma' described earlier.
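The general idea of data travelling with its policy can be illustrated with a minimal sketch – ours, with hypothetical field names, not the architecture described in the chapter: each datum carries the subject's preferences, and a recipient is expected to check them before any use.

    from dataclasses import dataclass, field

    @dataclass
    class UsagePolicy:
        """Hypothetical user preferences attached to a datum."""
        allowed_purposes: set = field(default_factory=set)
        share_with_third_parties: bool = False

    @dataclass
    class PolicyTaggedDatum:
        value: str
        subject_id: str
        policy: UsagePolicy

    def permitted(datum: PolicyTaggedDatum, purpose: str, third_party: bool) -> bool:
        # The recipient consults the attached policy before using the datum.
        if third_party and not datum.policy.share_with_third_parties:
            return False
        return purpose in datum.policy.allowed_purposes

    d = PolicyTaggedDatum("NL-1234AB", "subject-42", UsagePolicy(allowed_purposes={"delivery"}))
    assert permitted(d, "delivery", third_party=False)
    assert not permitted(d, "advertising", third_party=True)

Because the subject can change the policy object after the data has been collected, unanticipated uses can later be authorised or refused – the flexibility the authors are after.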


Martin Kuppinger and Dave Hearns describe their vision for Life Management Platforms (LMPs), whose purpose is clearly signalled in the subtitle of their chapter (adapted from a previously published report), Control and Privacy for Personal Data. An LMP is a means of consolidating data from various sources, especially sensitive data, and Kuppinger and Hearns argue that we can go beyond current notions of personal data stores in terms of the flexibility of support for security and privacy. A 'personal domain' of data is the metaphor in which ideas of minimal disclosure and retention of control are explored. The authors set out their business model which, if feasible, promises to open up an interesting space for solutions to appear, although they remain aware that there are still many inhibitors which could suppress the sector.

William Heath et al. have a similar agenda, described in their chapter Digital Enlightenment, Mydex and Restoring Control Over Personal Data to the Individual, in which they describe the Mydex system in the context of an idiosyncratic but undeniably recognisable history of the drivers of the erosion of privacy by technology. Like Kuppinger and Hearns, Heath et al. are keen to establish the possibility of a feasible business model for personal data management systems, in their case based on a 'Catherine wheel' model (a type of firework where a series of separate, simultaneous ignitions drives the firework round) in which benefits are seen for the individual and the organisation, as well as contributing to the economy of volunteered personal information. Mydex is a social enterprise in the UK which aims to develop tools to help people realise the value of their personal data online while also supporting informational self-determination, and Heath et al. describe their own system, while welcoming the growth of a vibrant sector in which there are many players and many alternatives. They describe the architecture and design of Mydex, in functional terms but also with a historical perspective, showing how their community prototype led to particular issues and drove particular solutions. The historical, almost Hegelian, perspective of this chapter gives a strong sense of how hard it will be to provide a future-proof roadmap for the development of personal data ecosystems.

To conclude this section, Jacques Bus and Carolyn Nguyen consider the state-of-the-art range of practical examples in the field (including examples from other chapters in this book), and in their chapter, Personal Data Management – A Structured Discussion, they abstract away from the complexities to produce a framework for considering how to frame the debates about the use of personal data. Many basic terms are defined, including – as noted above – personal data management, and a reference model is given for representing the requirements of context-based PDM, which requires the basic infrastructure, the elements which allow management of data, and the elements that allow interaction between the user and external parties. From this reference model, Bus and Nguyen work to create a framework which proposes a set of relations in which trust can be negotiated and placed accurately and rationally. The proposal is intended to support and promote dialogue and discussion without necessarily presupposing particular solutions or market structures.
Nevertheless, even the creation and support of such a trust network would already be a long way along the road from the current state of exploitation and ignorance to the radical control of Cavoukian, or the defined and enforced rights of Novotny and Spiekermann.


6. Research Agenda

It may seem that we fully endorse the idea of user-centric Personal Data Management – though even that would not imply that we would necessarily endorse user-centric Personal Data Monetisation. At this point in time, however, the editors do not agree on whether PDM will enlighten us. There are worries about the further 'datafication' [9, p. 73 ff.] of personal identity, about the increasing discretisation and commodification of invaluable and untradeable dimensions of human agency, and about slipping from PDM as 'management' to PDM as 'monetisation'. What we do endorse, however, is further research into the potential of PDM for user empowerment. That PDM will redress the power imbalances between the 'owners' of Big Data and the individuals those data concern cannot be taken for granted, and such research should also raise the question of what problems cannot be solved by means of data management, and what problems may be created or reinforced by developing tools and frameworks to manage one's data. So we end with a cross-disciplinary research agenda of three pivotal questions:

1. What problems can current models of PDM solve?
2. What problems cannot be solved by PDM systems?
3. What new problems might be created by effective employment of PDMs?

This is not the place to answer these questions, precisely because this will require both empirical investigation and philosophical reflection.

As to the first question, we need to investigate the differences and similarities between identity management and personal data management, and the extent to which PDM facilitates access to and/or usage of personal data. Does PDM provide possibilities of profile transparency, enabling users to foresee how advertisers, law enforcement, potential employers, credit providers and insurance companies will 'rate' them?

As to the second question, as in the case of identity management, it is important to flesh out whether PDM helps in ensuring purpose specification and purpose limitation, and whether it provides intuitive transparency regarding third-party access and usage. How does PDM relate to the novel rights of data portability and the right to be forgotten? Can PDM help in providing control over observed and inferred data, or is this an illusion in the era of Big Data? If so, what does this mean in an age where algorithms increasingly inform a host of decision systems that run on observed and inferred data and hardly require volunteered data? Does PDM protect against the application of inferred profiles that have been derived from anonymised and/or aggregated data, or does this fall outside its scope?

As to the third question, to what extent does PDM require root identities that are 'real identities', thus incentivising increased use of real identities to gain access to all kinds of services that are now easily accessible by means of fake identities? Does PDM ignore the maxim that trust does not scale; would it make us dependent on trust frameworks and vulnerable to the volatility of high-frequency trading with personal data? Might PDM turn our indeterminate personal identity into something that can be measured and mined, thus inviting us to participate in the process of monetisation of our behaviour?

The only correct answer will be: 'it depends'. On how we engineer, design and negotiate existing and emerging personal data ecosystems, and on how we integrate them into our life world. On how we arrange for countervailing powers between Big Data 'owners' and individual persons. The idea of countervailing powers is another Enlightenment idea, going back to Montesquieu. This volume is a call to scrutinise the various types of PDM that are proposed, to develop new ways to empower individual persons, and to reinvent the checks and balances of constitutional democracy in the face of novel knowledge asymmetries.


Acknowledgements

During the editing of this book and the writing of this Introduction, KOH was supported by SOCIAM: The Theory and Practice of Social Machines, funded by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant number EP/J017728/1, and MW was supported by the European Center for Security and Privacy by Design (BMBF) and the Center for Advanced Security Research Darmstadt (LOEWE). Our thanks to the many scholars and professionals who peer reviewed the chapters for us. Thanks also to Rob Cottingham for supplying the illustration on page 1, reproduced here under a Creative Commons Non-Commercial Attribution License.


References

[1] G. Bateson, Steps to an Ecology of Mind, Ballantine, New York, 1972.
[2] T.W. Adorno & M. Horkheimer, Dialectic of Enlightenment, Herder & Herder, New York, 1972.
[3] J. Gray, Enlightenment's Wake: Politics and Culture at the Close of the Modern Age, Routledge, London, 1995.
[4] J. Palfrey & U. Gasser, Born Digital: Understanding the First Generation of Digital Natives, Basic Books, New York, 2008.
[5] K. Schwab, A. Marcus, J.R. Oyola, W. Hoffman & M. Luzi, Personal Data: The Emergence of a New Asset Class, World Economic Forum, 2011.
[6] M. Hildebrandt & K. de Vries (eds.), Privacy, Due Process and the Computational Turn: The Philosophy of Law Meets the Philosophy of Technology, Routledge, Abingdon, 2013.
[7] E. Goffman, The Presentation of Self in Everyday Life, Anchor Books, New York, 1959.
[8] M. Dodge & R. Kitchin, 'Outlines of a World Coming Into Existence': Pervasive Computing and the Ethics of Forgetting, Environment and Planning B: Planning and Design, 34 (2007), 431–445.
[9] V. Mayer-Schönberger, Delete: The Virtue of Forgetting in the Digital Age, Princeton University Press, Princeton, 2009.
[10] M. Hildebrandt, The Dawn of a Critical Transparency Right for the Profiling Era, in: J. Bus, M. Crompton, M. Hildebrandt & G. Metakides (eds.), Digital Enlightenment Yearbook 2012, IOS Press, Amsterdam, 2012, 41–56.
[11] A.L. Allen, Dredging Up the Past: Lifelogging, Memory and Surveillance, University of Chicago Law Review, 75 (2008), 47–74.
[12] N. Watt, PRISM: Claims of GCHQ Circumventing Law Are 'Fanciful Nonsense', Says Hague, The Observer, 9th June, 2013, http://www.guardian.co.uk/world/2013/jun/09/prism-gchq-william-hague-statement.
[13] J. Borger, L. Harding, M. Elder & D. Smith, G20 Summit: Russia and Turkey React with Fury to Spying Allegations, The Guardian, 17th June, 2013, http://www.guardian.co.uk/world/2013/jun/17/turkey-russia-g20-spying-gchq.
[14] E. MacAskill, J. Borger, N. Hopkins, N. Davies & J. Ball, GCHQ Taps Fibre-Optic Cables for Secret Access to World's Communications, The Guardian, 21st June, 2013, http://www.guardian.co.uk/uk/2013/jun/21/gchq-cables-secret-world-communications-nsa.
[15] M. Wolf, Proust and the Squid: The Story and Science of the Reading Brain, Icon Books Ltd, Thriplow, 2008.
[16] P.E. Agre, Introduction, in: P.E. Agre & M. Rotenberg (eds.), Technology and Privacy: The New Landscape, MIT Press, Cambridge, Massachusetts, 2001.
[17] M. Sahlins, Poor Man, Rich Man, Big Man, Chief: Political Types in Melanesia and Polynesia, Comp. Stud. Soc. Hist., 5 (1963), 285–303.


[18] H.F. Nissenbaum, Privacy in Context: Technology, Policy, and the Integrity of Social Life, Stanford Law Books, Stanford, Calif., 2010.
[19] M. Hildebrandt, Privacy and Identity, in: E. Claes, A. Duff & S. Gutwirth (eds.), Privacy and the Criminal Law, Intersentia, Antwerp, 2006, 43–58.
[20] J.S. Mill, On Liberty, in: On Liberty and Other Essays, Oxford University Press, Oxford, 1991, 1–128.
[21] J. Feinberg, The Moral Limits of the Criminal Law Vol. 1, Oxford University Press, New York, 1987.
[22] S.D. Warren & L.D. Brandeis, The Right to Privacy, Harvard Law Review, 4 (1890), 193–220.
[23] B. Rössler, The Value of Privacy, Polity Press, Cambridge, 2005.
[24] A. Etzioni, The Limits of Privacy, Basic Books, New York, 1999.
[25] L. Floridi, The Ontological Interpretation of Informational Privacy, Ethics and Information Technology, 7 (2005), 185–200.
[26] J. Jarvis, Public Parts: How Sharing in the Digital Age Improves the Way We Work and Live, Simon & Schuster, New York, 2011.
[27] K. O'Hara, Are We Getting Privacy the Wrong Way Round?, IEEE Internet Computing, 17 (2013).
[28] IAB Europe, Consumers Driving the Digital Uptake: The Economic Value of Online Advertising-Based Services for Consumers, White Paper, 2010, http://www.iabeurope.eu/media/95855/white_paper_consumers_driving_the_digital_uptake.pdf.
[29] N.B. Ellison & D.M. Boyd, Sociality Through Social Network Sites, in: W.H. Dutton (ed.), The Oxford Handbook of Internet Studies, Oxford University Press, Oxford, 2013, 151–172.
[30] acatech, Internet Privacy: Taking Opportunities, Assessing Risks, Building Trust, National Academy of Science and Engineering, Munich, 2013, http://www.acatech.de/fileadmin/user_upload/Baumstruktur_nach_Website/Acatech/root/de/Publikationen/Stellungnahmen/acatech_Internet_Privacy_Pos_eng_final.pdf.
[31] C. Sunstein, Republic.com, Princeton University Press, Princeton, 2001.
[32] T.Z. Zarsky, 'Mine Your Own Business!': Making the Case for the Implications of Data Mining of Personal Information in the Forum of Public Opinion, Yale J. Law Technol., 5 (2003), 17–47.
[33] M. Hildebrandt & S. Gutwirth (eds.), Profiling the European Citizen: Cross-Disciplinary Perspectives, Springer, Dordrecht, 2008.
[34] J. Waldron, Security and Liberty: The Image of Balance, J. Polit. Philos., 11 (2003), 191–210.
[35] M. Hildebrandt, Balance or Trade-off? Online Security Technologies and Fundamental Rights, Philos. Technol., doi:10.1007/s13347-013-0104-0 (May 2013).
[36] L. Lessig, Code and Other Laws of Cyberspace, Basic Books, New York, 1999.
[37] P.M. Schwartz, Beyond Lessig's Code for Internet Privacy: Cyberspace Filters, Privacy-Control and Fair Information Practices, Wis. Law Rev., (2000), 743–788.
[38] S.R. Peppet, Unraveling Privacy: The Personal Prospectus and the Threat of a Full Disclosure Future, Northwestern University Law Review, 105 (2011).
[39] N. Shadbolt & K. O'Hara, Linked Open Government Data, IEEE Internet Computing, 17 (2013).


Part I


Background


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-29

Die Aufklärung in the Age of Philosophical Engineering


Bernard STIEGLER1

Abstract. Has today’s digital society succeeded in becoming mature? If not, how might a new Enlightenment philosophy and practice for the digital age be constructed that could hope to address this situation? Such a philosophy must take into account the irreducibly ambivalent, ‘pharmacological’ character of all technics and therefore all grammatisation and tertiary retention, and would thus be a philosophy not only of lights but of shadows. Grammatisation is the process whereby fluxes or flows are made discrete; tertiary retention is the result of the spatialisation in which grammatisation consists, a process that began thirty thousand years ago. The relation between minds is co-ordinated via transindividuation, and transindividuation occurs according to conditions that are overdetermined by the characteristics of grammatisation. Whereas for several thousand years this resulted in the constitution of ‘reading brains’, today the conditions of knowledge and transindividuation result in a passage to the ‘digital brain’. For this reason, the attempt to understand the material or hyper-material condition of knowledge must be placed at the heart of a new discipline of ‘digital studies’. The pharmacological question raised by the passage from the reading to the digital brain is that of knowing what of the former must be preserved in the latter, and how this could be achieved. This means developing a ‘general organology’ through which the social, neurological and technical organs, and the way these condition the materialisation of thought, can be understood. Integral to such an organology must be consideration of the way in which neurological automatisms are exploited by technological automatisms, an exploitation that is destructive of what Plato called thinking for oneself. The task of philosophical engineering today should be to prevent this short-circuit of the psychosomatic and social organological layers, a task that implies the need for a thoroughgoing reinvention of social and educational organisations. Keywords. Enlightenment, digitalisation, Nicholas Carr, philosophical engineering, Tim Berners-Lee, Michel Foucault, Walter Ong, Maryanne Wolf

Introduction

Public access to the web is twenty years old. Through it, digital society has developed and spread throughout the entire world. But has this society become mündig, that is, mature, in Immanuel Kant's sense, when he used this term to define the age of Enlightenment as an exit from minority, from Unmündigkeit [1]? Certainly not: contemporary society seems, on the contrary, to have become profoundly regressive. Mental disorders, as well as environmental, economic, political and military problems, do not cease to proliferate and increase. And while traceability continues to expand, it seems it is mainly being used for behaviour profiling, and thus to increase the heteronomy of individuals rather than their autonomy.

1 Corresponding Author. Translated by Daniel Ross.


There are many ways in which digitalisation clearly holds promise, and socialising digitalisation in a reasoned and resolute way is (I am convinced of this) absolutely imperative if the world is to escape from the impasse in which the obsolete consumerist industrial model finds itself. But if this is the case, then this socialisation in turn requires the creation and negotiation of a new legal framework that itself presupposes the formation of new 'Enlightenments'. For this reason, it pleases me greatly that Neelie Kroes has called for a new Enlightenment philosophy for the digital age, just as Tim Berners-Lee and Harry Halpin have argued [2], in dialogue with the contrasting position of Vint Cerf (who developed the TCP-IP protocol) [3], that internet access must become a universal right. But what exactly does access mean here? Or again: what type of access should we claim will bring light or enlightenment, rather than darkness or shadow? And under what conditions will such access prove beneficial for individuals and the societies in which they live? The challenge we face today is to respond to these questions, and in the first place to pose them in the correct terms. And to try to take the measure of these questions, we must see how Nicholas Carr, for example, in his book The Shallows: What the Internet is Doing to Our Brains [4], outlines the context that constitutes the theme of this global summit on the web.

The growth of digitalisation since 1992 has brought with it a genuine chain reaction that has transformed social life at its most public level, and the life of the psychic2 individual at its most intimate level. The Shallows bears witness to the immense distress that has accompanied this meteoric rise – which increasingly seems to resemble a tsunami – and that has, by his own account, significantly disrupted the mental capacities of Nicholas Carr himself. And this tsunami threatens to wipe out all the inherited structures of civilisation on every continent, which may in turn produce immense disillusionment and tremendous disaffection [5].

The negativity of this new state of affairs continues to gain ground. Faced with this, we must assert the necessity and the positivity of a new state of law. And this means turning to the question of the relationship between technology and law. What we refer to as the law is in fact founded on writing. Now, digital technology constitutes the latest stage of writing. And just as in the age of Socrates writing was a pharmakon,3 so too, today, we can say the same about the digital: it can lead either to the destruction of the mind or to its rebirth; to the destruction of spirit or to its renaissance.4

I would like here to elaborate the following points:

1. the constitution of a new state of law, a new rule of law, founded on digital writing, in fact presupposes a new age of Enlightenment(s);
2. these new Enlightenments must, however, conduct a critique of the limits of the philosophy of the Aufklärung itself, notably in relation to the questions raised by 'philosophical engineering',5 developed at W3C at the instigation of Tim Berners-Lee;
3. the new philosophy that must arise from the worldwide experience of the web, and more generally of the digital, across all cultures, an experience that is in this sense universal – this new philosophy, these new Enlightenments, cannot merely be that of digital lights: it must be a philosophy of shadows (and maybe of shallows, in Nicholas Carr's terms), of the shadows that inevitably accompany all light.

2 Translator's note: 'Psychic' is used here in Gilbert Simondon's sense of 'psychic and collective individuation'. See Gilbert Simondon, L'individuation psychique et collective (Paris: Aubier, 2007).
3 Translator's note: In ancient Greek, pharmakon refers both to poison and remedy (and relates to the pharmakos, the scapegoat). Fundamental to the author's position is that all forms of technics and technology possess this ambivalent, 'pharmacological' character, as he explains. Also important to his use of these terms is Jacques Derrida's treatment of this theme in 'Plato's Pharmacy', in Dissemination (London: Athlone, 1981), pp. 61–171.
4 Translator's note: it should at all times be borne in mind that the French word esprit means both mind and spirit.
5 See Brian Runciman, interview with Tim Berners-Lee, 'Isn't it Semantic?' (2006), available at: http://www.bcs.org/content/conWebDoc/3337. And see Harry Halpin and Alexandre Monnin (eds.), Philosophical Engineering: Toward a Philosophy of the Web (forthcoming 2013).

We can no longer ignore the irreducibly ‘pharmacological’ – that is, ambivalent – character of writing, whether alphabetical or digital, etched in stone or inscribed on paper, or in silicon, or on screens of digital light. Writing, and more precisely printed writing, is the condition of (the) ‘enlightenment’, and it is for this reason that Kant says that he is addressing the ‘reading public’ [6]. But there is never light without shadow. And it is for this reason that Theodor Adorno and Max Horkheimer were able to perceive, in 1944, in the rationalisation of the world, the very opposite of reason or the Aufklärung [7]. This irreducible ambivalence applies to technology in general, and is what the twenty-first century, like Nicholas Carr, discovers on a daily basis through a thousand experiences of the limits and ambiguities of technological progress. But this irreducible ambivalence is what neither modern philosophy, nor ancient philosophy, have yet proven themselves capable of thinking. And faced with the new challenges of the digital world, this is what it is today imperative for us to understand: it is with this ambivalence that we must learn to think, and to live, differently.


1.

Unlike Tim Berners-Lee and Harry Halpin, Vint Cerf argues that internet access cannot be subject to legal regulation because digital technology is an artefact that can change – and that never stops changing. But was not writing itself, which in Greece lay at the origin of law as well as of geometrical thinking, equally artefactual? It is true that writing seems to have taken a stable, universal form as alphabetical writing, and it seems as though, as such, through this writing and through its apparent stability, it is the universal structures of language and, beyond language, the universal structures of thought, that have been discovered. This is how things may seem – but they are perhaps not quite so clear. And herein lies what is truly at stake in the debate with that philosophy that we have inherited in the age of generalised digitalisation: what has been the role of writing and, beyond that, of technics, in the constitution of thought, and especially of that thought which was universalised through the Enlightenment and through its emancipatory discourse? The development of the web has made such a debate unavoidable – and it is a debate that I argue must be held in the framework of digital studies, and it is, within that framework, the fundamental question.

Tim Berners-Lee and Harry Halpin propose a universal right to internet access, and that this right should be embodied in a philosophy designed to conceptually underpin the web, a philosophy to which W3C would give rise.

Digital Enlightenment Yearbook 2013 : The Value of Personal Data, IOS Press, Incorporated, 2013. ProQuest Ebook Central,

32

B. Stiegler / Die Aufklärung in the Age of Philosophical Engineering

the web, a philosophy to which W3C would give rise. But if this is what Berners-Lee and Halpin propose, it is precisely because the web is a function of a digital technical system that could be otherwise, or which could even disappear. And if they argue that this is a right, it is because this philosophy and this stability must support the need to ensure not only a certain conception of the internet, its functions and its goals, but the sense of mental, intellectual, spiritual, social and noetic (the latter in Aristotle’s sense) progress that digitalisation in general must constitute. In order to deepen these formidable questions, we must take the measure of the following two points: •

Copyright © 2013. IOS Press, Incorporated. All rights reserved.



first, the digital technical system constitutes a global and contributory publication and editorialisation system that radically transforms the ‘public thing’, given that the res publica, the republic, presupposes a form of publicness, of ‘publicity’ – what the Aufklärung called an Öffentlichkeit – sustained by publication processes second, this publication system is inscribed in the history of a process of grammatisation, which conditions all systems of publication: the concept of grammatisation, as forged by Sylvain Auroux, provides important ingredients for the discussion inaugurated by Tim Berners-Lee around what he referred to as philosophical engineering.

With the concept of grammatisation, Auroux was able to think the technical conditions of the appearance of grammata, of letters of the alphabet, and of their effects on the understanding and practice of language. And he was able to think these conditions from the pre-alphabetic forms of grammata (ideograms and so on), through the printing press, right up to the linguistic technologies that Auroux calls ‘language industries’. I have myself extended this concept by arguing that, more generally, grammatisation describes all technical processes that enable behavioural fluxes or flows to be made discrete (in the mathematical sense) and hence to be reproduced – that is, all those behavioural flows through which the experiences of human beings (speaking, working, perceiving, interacting and so on) are expressed or imprinted. If grammatisation is understood in this way, then the digital is simply the most recent stage of grammatisation, a stage in which all behavioural models can now be grammatised and integrated through a planet-wide industry of the production, collection, exploitation and distribution of digital traces.

2. The grammatisation of behaviour consists in a spatialisation of time, given that behaviour is above all a form of time (a meaningful sequence of words, an operational sequence of gestures, a perceptual flow of sensations, and so on). Spatialising time means, for example, transforming the temporal flow of a speech, such as the one I am delivering to you here and now, into a textual space, a de-temporalised form of this speech: it is thus creating a spatial object. And this is what is going on, from alphabetic writing to digital technology, as Walter Ong made clear:

Writing […] initiated what print and computers only continue, the reduction of dynamic sound to quiescent space [8].


This spatial object can be engraved on a wall, inscribed on a clay tablet, written or printed on paper, metastabilised on a silicon memory chip, and so on – and these various supports make possible operations that are specific to each form of support, that is, proper to each stage of grammatisation. Spoken language is an example of what Edmund Husserl called a temporal object: that is, it is an auditory object that appears only in the course of disappearing. But when speech is written down, it is grammatised and thereby becomes a spatial object, that is, a synoptically visible object. And this synopsis makes possible an understanding that is both analytic (discretised) and synthetic (unified).

This spatialisation is a materialisation. This does not mean that there was initially something ‘immaterial’ that subsequently became material: nothing is immaterial. For example, my speaking is material: it is produced by vocal organs that produce sound waves, which are themselves supported by molecules, composed of atoms, that begin to vibrate in the surrounding air, and so on. One can speak of a visibly spatialising materialisation to the extent that there is a passage from an invisible, and as such indiscernible and unthinkable, material state to another state, a state that can be analysed, criticised and manipulated – in both senses of the verb ‘manipulate’. That is, this is a state:

1. on which analytical operations can be performed, and intelligibility can be produced; and
2. with which one can manipulate minds – Socrates accused the Sophists of doing precisely this with writing, the latter being the spatialisation of the time of what he called ‘living speech’ [9].

If grammatisation is therefore a process of materialisation, hominisation is itself, and in a very general way, a process of materialisation: man is the living being who fabricates tools, and in so doing he transforms the world by never ceasing to materialise anticipations – what Husserl called protentions,6 and I shall explain below why I must express this through the vocabulary of the founder of phenomenology. Grammatisation is a very specific type of materialisation within a much larger process of materialisation of all kinds that Georges Canguilhem called ‘technical life’ – which distinguishes us from other living things.7 Grammatisation begins during the Upper Paleolithic era, some two million years after technical life first arose. It enabled mental and behavioural flows to be made discrete, and thus enabled new mental and behavioural models to be created. In the course of materialisation, and the spatialisation in which it consists, the constitutive elements of grammatised mental and behavioural flows are made discrete, and temporal mental realities, which have become identifiable through lists of finite, analysable and calculable elements, are modified in return.

The visible and tangible reality emerging from this spatialisation constitutes an object that belongs to the class of things that I refer to as tertiary retention. I borrow the term ‘retention’ from Husserl. Retention refers to what is retained, through a mnesic function itself constitutive of a consciousness, that is, of a psychic apparatus. Within this psychic retention, Husserl distinguishes two types of retention, one he refers to as primary and the other as secondary.

6 See Edmund Husserl, On the Phenomenology of the Consciousness of Internal Time (1893–1917) (Dordrecht: Kluwer, 1991).
7 See Georges Canguilhem, The Normal and the Pathological (New York: Zone Books, 1991), pp. 200–201.


Secondary retention, which is the constitutive element of a mental state that is always based on a memory, was originally primary retention: primary in this case means retained in the course of a perception, and through the process of this perception, but in the present, which means that primary retention is not yet a memory, even if it is already a retention. To perceive a phenomenon is to retain and unify, in the course of the perception of the phenomenon, everything that appears as the identical ‘content’ of the perception (of the perceived phenomenon) but which each time presents a different aspect (an Abschattung). A primary retention is what, constituting the course of a present experience, is destined to become a secondary retention for somebody who has lived this experience, which subsequently becomes past experience – secondary because, no longer being perceived, it is imprinted in the memory of the one who had the experience, from where it may be reactivated.

But a retention, as deriving from a flux and emerging from the temporal course of experience, can also become tertiary, through the spatialisation in which the grammatisation of the flow of retentions consists (and in which, more generally, any technical materialisation process consists). This mental reality can thus be projected onto a support that is neither cerebral nor psychic but rather technical. The web grants access to such a space, through which shared, digital tertiary retentions are projected and introjected, constituting as such a new public, global and contributory space, functioning at the speed of light. What light and what shadow, what Enlightenment and what Darkness, can and must this bring us?


3. Michel Foucault spoke about the materialisation of knowledge in The Archaeology of Knowledge – but without placing it in the context of the grammatisation process, and without understanding it in relation to primary and secondary retention – at a time when he was interested in the archives that make possible all knowledge [10]. Knowledge is above all a collection of archived traces, that is, ordered and modelled traces, thereby constituting an order – and submitted to this order and to this model, which orders these traces. Knowledge, modelled in this way, thus conserves the trace of the old from which it comes, and of which it is the rebirth and the transformation, through a process that Plato described as an anamnesis. The conservation of traces of the past is what enables the constitution of circuits of collective individuation across time and in the framework of a discipline. Such disciplines govern the relations between minds, which individuate themselves in concert, and in the course of intergenerational transmission, through which a transindividuation process is concretised, producing what Gilbert Simondon called the transindividual and forming meanings.8 The conditions of this process, however, are over-determined by the characteristics of grammatisation, that is, by the characteristics of the archival supports that are the tertiary retentions of different epochs: ideograms, manuscripts, texts, prints, records, databases, metadata, and so on.

The archive is material, according to Foucault, and knowledge is essentially archived. This means that the materiality of the archive is not something that occurs after the fact in order to record something that would have occurred before its materialisation: the latter is the very production of knowledge. This materialisation doesn’t come after the form that it conserves, and it must be thought beyond the opposition of matter and form: it constitutes a hyper-material. The hyper-materiality of knowledge must, in the epoch of the web and the new transindividuation processes it produces, be studied as the condition of construction of rational forms of knowledge and of knowledge in general. We must situate the study of the hyper-materiality of knowledge within the framework of a general organology that studies the supports and instruments of every form of knowledge. And in the contemporary context, this study of hyper-materiality must be placed at the heart of digital studies, which must itself become the new unifying and trans-disciplinary model for every form of academic knowledge.

General organology studies the relations between the three types of organs characteristic of technical life: physiological organs, technical organs and social organisations. Grammatisation began thirty thousand years ago, inaugurating a specific stage of the process of the co-evolution of these three organological spheres, which are inseparable from one another. This is shown in an extremely clear way by the neurophysiology of reading, through which, as Maryanne Wolf puts it, the brain is literally written by the socio-technical organs, and where our own brains, which she calls ‘reading brains’, were once written by alphabetical writing, but are now written by digital writing:

We were never born to read. Human beings invented reading only a few thousand years ago. And with this invention, we rearranged the very organization of our brain, which in turn expanded the way we were able to think, which altered the intellectual evolution of our species [11].

8 See Gilbert Simondon, ‘The Genesis of the Individual’, in Jonathan Crary and Sanford Kwinter (eds.), Incorporations (New York: Zone Books, 1992), pp. 297–319.

Now, with the web, we are living through a passage from the reading brain to the digital brain, and this raises a thousand questions of rights and duties, in particular with regard to the younger generations:


[W]e make the transition from a reading brain to an increasingly digital one. […] Reading evolved historically […] and […] restructured its biological underpinnings in the brain [of what must be thought of] as a literate species [12].

And during this transition, the point is to know ‘what [it] is important to preserve’ [13]. It is a question of knowing what must be preserved, within the digital brain, of that which characterised the reading brain, given that writing new circuits in the brain can erase or make illegible the old circuits. The writing of the psycho-physiological organs through the socio-technical organs constitutes the reality of the history of thought, that is, of what Hegel called and described as the phenomenology of Geist – except that, within the phenomenology of tertiary retention I am talking about here, technics is the main dynamic factor, and is so precisely insofar as it constitutes a system of tertiary retention, this dynamic being ignored by Hegel [14].9 The emergence of digital technologies, of the internet and the web, which is also the age of industrial tertiary retention, is obviously the new page (a hypertextual and hyper-material page) on which is being inscribed and read (in HTML5) the history of thought – through what must be understood as a new system of publication constituting a new public thing, a new res publica.

9 See Bernard Stiegler, États de choc: Bêtise et savoir au XXIe siècle (Paris: Mille et une nuits, 2012), Ch. 5.


The web is an apparatus of reading and writing founded on automata that enable the production of metadata on the basis of digital metalanguages, modifying what, in The Archaeology of Knowledge, Foucault called enunciative modalities and discursive formations. All this can be thought only on the condition of studying in detail the neurophysiological, technological and socio-political conditions of the materialisation of the time of thinking (and not only of thinking, but also of life and of the unthought of what one calls noetic beings – and also, undoubtedly, of their unconscious, in the Freudian sense). It is for this reason that we must develop a general organology capable of specifying the characteristics, the course and the stakes of a process that began in the Upper Paleolithic as the materialisation of the flux of consciousness, projecting new kinds of mental representations forged through this very projection – and which, we shall now see, is also an intro-jection.


4. From out of this rupestral projection, which is also the birth of art, the exteriorisation of the content of the mind begins to unfurl. After the Neolithic age, specific retentional forms appeared, making it possible for mental content to be controlled: the earliest forms of calculation, then the step-by-step recording of geometric reasoning. The mind or spirit, self-controlling and self-critical, thereby constitutes the origin of logos – this is what Husserl realised in 1936, when he grasped that the origin of geometry was founded on literal (that is, lettered) tertiary retention [15].

From the origin of philosophy, and up until our own time, this process has been concealed: its study was made impossible by the metaphysics of fundamental ontology – conceived as an ontology of pure thought, that is, an ontology of thought prior to the impurity of its exteriorisation. Kant would call this thought ‘a priori’, and metaphysics would take it to be the only true knowledge, in the presumption that being precedes becoming, and is thereby knowable.

Today, grammatisation continues to spread and accelerate, and it transforms all forms of knowledge. And this is occurring at a time when we are also learning from neurophysiology that cerebral plasticity – and the transformation of what Maryanne Wolf calls ‘mental circuitry’ through the introduction of tertiary retentions (literal tertiary retention, for example) – is thinking: thinking consists in the production of new circuits, through the materialisation process that comes to modify existing circuits, and sometimes to destroy them, the question being to know what must be ‘preserved’.

The mind, then, is constituted through the introjection of tertiary retention, and today this has become visible, because neurophysiologists can study it experimentally. These researchers are equipped with tools and apparatus designed to observe mental life, that is, the movements occurring within the cerebral apparatus, such as introjection, but also and above all between this apparatus and the tertiary retentional apparatus deriving from grammatisation, that is, from the projection of the mind outside itself.

The fact that the exteriorisation of the mind is the condition of its constitution means that the mind cannot be some pure substance that, by exteriorising itself, alienates itself through this exteriorisation. The constitution of the mind through its exteriorisation is its expression, resulting from a prior impression. The projection of the mind outside itself constitutes the mind through its materialisation and spatialisation as a movement: the mind is as such mobility, motility and emotion (and this is how we should interpret the theses of Antonio Damasio) [16].

This projection that is constitutive of the mind, however, can also lead to its evacuation: it makes possible what Socrates described – the short-circuiting or bypassing of the life of the mind via an exteriorisation without return, that is, without reinteriorisation. Projection can, in fact, constitute a mind only insofar as it is retemporalised: what has been spatialised must then be individuated and ‘interiorised’ in order to come to life. Tertiary retention is dead, and it remains so if it does not transform, in turn, the secondary retentions of the psychic individual affected by this tertiary retention. This transformation of the individual is possible because the latter has, for example, ‘literalised’ his or her own brain, which has thus become a ‘reading brain’, and is therefore now a fabric woven from literalised secondary retentions, that is, textualised secondary retentions, and becomes as such the object of constant self-interpretation. It is for this reason that Joseph Epstein can write that ‘we are what we read’ [17]. And this is what Walter Ong made comprehensible when he wrote of literate human beings:


[They are] beings whose thought processes do not grow out of simply natural powers but out of these powers as structured, directly or indirectly, by the technology of writing. Without writing, the literate mind would not and could not think as it does, not only when engaged in writing, but normally even when it is composing its thoughts in oral form [18].

In other words, even when they speak, and express themselves orally, literate human beings are reading themselves and interpreting themselves to the letter. That is, they are ‘literally’ in the course of writing themselves, given that everything they read is inscribed within their brains, and given that everything they read reactivates and reinterprets the previously written, and textually written, circuits of their secondary retentions: literate human beings speak like the book that they are and that they read. What Maryanne Wolf adds, however, building on the work of Stanislas Dehaene, is that the acquisition of new retentional competences, through the interiorisation of tertiary retentions, can also recycle ‘existing […] circuits’ [19], that is, destroy them, and that this is the reason it is a matter of knowing what must be ‘preserved’. Moreover, Socrates argued that exteriorisation cannot occur without re-interiorisation, that is, without individuation. Such individuation is required in order to produce real thoughts, and this is why we must, as a law and a duty, struggle against the slide into sophism.

What consequences can we draw from these considerations within the framework of our encounter here, and from the perspective of a reactivation of the Enlightenment project in the age of the web? I will conclude by attempting to give a broad outline of an answer to this question.

5. The writing of the brain is the writing of capacities enabling brains to cooperate – notably through the constitution of communities of reading (that is, lettered or literate) brains, or digital brains. Socrates, however, argued that by enabling souls (and their brains) to be short-circuited or bypassed, the writing of the brain can also destroy both noetic and social capacities, and result in structural incapacitation, that is, lead to an inability to think for oneself.


This means that the interiorisation of technical organs by the cerebral organ, which is thereby reorganised, constitutes a new stage of thinking only on the condition that social organisations exist to ensure this interiorisation – such as, for example, the paideia (education) practised at Plato’s academy. The question of what must be preserved thus involves not just cerebral circuits, but social circuits. We cannot re-constitute internet access, therefore, without completely rethinking the formation and transmission of knowledge with a view to ensuring a historical understanding of the role of tertiary retention in the constitution, as well as the destruction, of knowledge, and with a view to deriving, on this basis, a practical and theoretical understanding of the digital tertiary retentions that transform cerebral and social organisations.

Without such a politics, the inevitable destiny of the digital brain is to find itself slowly but surely short-circuited by automata, and thus find itself incapable of constituting a new form of society with other digital brains. Automatisation makes digitalisation possible, but although it immeasurably increases the power of the mind (as rationalisation), it can also destroy the mind’s knowledge (as rationality). A ‘pharmacological’ thinking of the digital must study the contradictory dimensions of automatisation in order to counteract its destructive effects on knowledge. The point is not merely to ensure that there is a right to access the internet, but to establish a right and a duty to know (through education) that invisible automatisms exist, and that these may elude digital brains – and may manipulate these brains without teaching them how they should themselves be manipulated, how they should be handled.

This question arises in a context in which neuromarketing is today in a position to directly solicit the automatisms of the lower layers of the cerebral organs by short-circuiting the networks inscribed through education in the neo-cortex. The automatisms of the nervous system are in this way combining with technological automatisms: this is the threat (that is, the shadow) against which new Enlightenments must struggle.

Thinking is, above all, the history of grammatisation: the history of the relations of projections and introjections occurring between the cerebral apparatus and tertiary retentions. It is for this reason that the question of philosophical engineering, posed by Tim Berners-Lee, comes to the fore. Philosophical engineering must lead to a close articulation between psychosomatic organs, technological organs and social organisations, while ensuring that the technological layer does not short-circuit the psychosomatic and social layers. It is a question of articulating the social web with the semantic web in an intelligent way. The social web and the semantic web must not be opposed, but rather composed – through social and educational organisations that must themselves be completely rethought on the basis of this perspective. It is thus a question of what we, at the Institut de recherche et d’innovation,10 call transindividuation technologies, through which the organs of contributive society must be constituted.

10 See: http://www.iri.centrepompidou.fr/?lang=en_us.

References

[1] Immanuel Kant, ‘An Answer to the Question: “What is Enlightenment?”’, Political Writings (Cambridge: Cambridge University Press, 1991), pp. 54–60.
[2] Tim Berners-Lee and Harry Halpin, ‘Defend the Web’, Digital Enlightenment Yearbook 2012, available at: http://www.digitalenlightenment.org/upload/pdf/3-7.pdf.


[3] Vinton G. Cerf, ‘Internet Access Is Not a Human Right’, New York Times (4 January 2012), available at: http://www.nytimes.com/2012/01/05/opinion/internet-access-is-not-a-human-right.html?_r=0.
[4] Nicholas Carr, The Shallows: What the Internet is Doing to Our Brains (New York: W.W. Norton & Co., 2010).
[5] See Bernard Stiegler, The Decadence of Industrial Democracies: Disbelief and Discredit, 1 (Cambridge: Polity Press, 2011), and Uncontrollable Societies of Disaffected Individuals: Disbelief and Discredit, 2 (Cambridge: Polity Press, 2013).
[6] Kant, ‘An Answer to the Question: “What is Enlightenment?”’, p. 55.
[7] Max Horkheimer and Theodor W. Adorno, Dialectic of Enlightenment: Philosophical Fragments (Stanford: Stanford University Press, 2002).
[8] Walter Ong, Orality and Literacy: The Technologizing of the Word (New York: Routledge, 2002), p. 81.
[9] Plato, Phaedrus 276a.
[10] Michel Foucault, The Archaeology of Knowledge (London: Routledge, 2002).
[11] Maryanne Wolf, Proust and the Squid: The Story and Science of the Reading Brain (New York: Harper, 2007), p. 3.
[12] Ibid., p. 4.
[13] Ibid.
[14] G.W.F. Hegel, Phenomenology of Spirit (Oxford: Oxford University Press, 1977).
[15] Edmund Husserl, ‘The Origin of Geometry’, in Jacques Derrida, Edmund Husserl’s Origin of Geometry: An Introduction (Lincoln, Nebraska, and London: University of Nebraska Press, 1989).
[16] Antonio Damasio, The Feeling of What Happens: Body, Emotion and the Making of Consciousness (London: Vintage, 2000).
[17] Joseph Epstein, cited in Wolf, Proust and the Squid, p. 5.
[18] Ong, Orality and Literacy, p. 77.
[19] Wolf, Proust and the Squid, p. 12.


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-40

Personal Data: Changing Selves, Changing Privacies

Charles ESS a and Hallvard FOSSHEIM b

a Associate Professor, Department of Media and Communication, University of Oslo, Norway
b Professor II, University of Tromsø, Norway

Abstract. We first use Medium Theory to develop the tension between print and digital media, i.e. as contrasts between literacy-print and the secondary orality of contemporary online communication. Literacy-print facilitates high modern notions of individual selfhood requisite for democratic polities and norms, including equality and gender equality. By contrast, secondary orality correlates with more relational conceptions of selfhood, and thereby with more hierarchical social structures. Recent empirical findings in Internet Studies, contemporary philosophical theory, Western virtue ethics and Confucian traditions elaborate these correlations, as do historical and contemporary practices and theories of “privacy.” Specifically, traditional conceptions of the relational self render individual “privacy” into something solely negative; by contrast, high modern conceptions of autonomous individuals render individual privacy into a foundational positive good and right. Hence, the shift towards relational selves puts the conception of selfhood – at work in current EU efforts to bolster individual privacy – at risk. Nonetheless, contemporary notions of “hybrid selves” (conjoining relational and individual selfhood) suggest ways of preserving individual autonomy. These notions are in play in Helen Nissenbaum’s theory of privacy as contextual integrity [1] and in contemporary Norwegian cultural assumptions, norms, and privacy practices. The implications of these transformations, recent theoretical developments, and contemporary cultural examples for emerging personal data ecosystems and user-centric frameworks for personal data management then become clear. These transformations can increase human agency and individuals’ control over personal data. But to do so further requires us to reinforce literacy and print as fostering the individual autonomy underlying modern democracy and equality norms.

Keywords. Relational selfhood, data privacy, literacy-print, medium theory, gender equality

Introduction

As the Yearbook call for papers makes exquisitely clear, our core notions of selfhood, privacy, and other rights affiliated with democratic processes, which have defined high modernity, are inextricably rooted in the “information infrastructure of the printed word”. Or, as conceptualized in Medium Theory, these notions and practices are prime correlates of the communication technologies of literacy and print. What the call for papers identifies as a tension between print and digital computing systems is explored in Medium Theory as the continuities and differences between the technologies of literacy and print, on the one hand, and, on the other hand, the “secondary orality” (and secondary textuality) of “electronic media” – most especially including the multiple forms of computer-mediated communication which increasingly form the mediated environments of our daily lives. In Medium Theory as well, these technologies are affiliated with nothing less foundational than our core conceptions of selfhood and identity. Most briefly, the technologies of literacy-print correlate with high modern conceptions of the self as a primarily individual self – in Charles Taylor’s terms, a “punctual” or radically disengaged self [2]. Such an individual self, understood as a rational autonomy, and the modern liberal democratic state seem non-accidentally suited to each other. By contrast, both orality and secondary orality correlate with more relational conceptions of selfhood.

We begin here with these correlations from Medium Theory as our first conceptual foundation (Section 1.1).1 We then bolster these correlations – specifically, the predicted turn towards more relational conceptions of selfhood in the contemporary age of electronic media – by way of empirical findings drawn from Internet Studies (Section 1.2) and recent philosophical theories (Section 1.3). As we unfold in Section 2, however, for all of their benefits and affordances, such relational selves tend to be highly dependent on social hierarchies and non-democratic forms of polity. Essentially, relational selves – as explored here in terms of both Western virtue ethics and Confucian traditions – thereby stand at odds with the high modern artefacts of democratic processes and norms, including equality and gender equality (Section 2.1).

More specifically, we will take up as the red thread and primary focus of our paper the implications of these modalities of communication and shifts in understanding of selfhood for our conceptions, expectations, and practices of “privacy”.2 Briefly, strongly individual conceptions of selfhood – emerging contemporaneously with the Enlightenment and in conjunction with literacy-print media – correlate with modern understandings of “privacy” as individual and as something inherently positive (with the status of being both desirable and a right). Indeed, individual privacy and affiliated rights – including current rights to personal data protection – are often argued to be foundational to modern democracies and norms.3 Par contra, more relational selves are conceptually and historically correlated with conceptions of individual privacy as predominantly negative in moral and legal terms. The implications of this contrast for current EU efforts to protect individual privacy even more robustly are clear: the shift towards relational selves, as fostered by electronic media, is treated as a threat to the conception of selfhood at work in those efforts.

All is not lost, however. On the contrary, Section 1 introduces us to examples of hybrid selves – selves that conjoin both individual and relational emphases – beginning precisely with feminist notions of “relational autonomy” [6]. Moreover, we show there how Nissenbaum’s theory of privacy as contextual integrity [1] builds upon James Rachels’ conceptions of such a hybrid self – one that is relational while simultaneously maintaining a strong sense of individual agency and control [7]. Lastly, we argue that these more hybrid conceptions of selfhood and privacy are already in play in cultural assumptions, norms, and privacy practices exemplified by contemporary Norway (Section 2.2). Taken together, then, these first two sections build what we argue to be a framework of empirical, conceptual, historical, cultural, and legal observations that are supplementary to one another in mutually reinforcing ways.

This framework then prepares us to turn in the concluding section to the question: What are the implications of these transformations and new theoretical developments for emerging personal data ecosystems and user-centric (or, as we prefer, human-centric) frameworks for personal data management? We first argue that these developments can – and certainly ought to – work to increase human agency and control over personal data. Here we highlight Norway’s research ethics guidelines as an extant example which at least hints at what a more relational sense of selfhood and privacy ethics, as developed by Nissenbaum, would look like in practice. But second, we recall that the historical affiliation of relational selves with more hierarchical, if not frankly authoritarian, social structures and practices represents a potential challenge to classic modern notions of autonomous selves as foundational for democratic societies and norms, including equality and gender equality. Returning to Medium Theory, we argue that this risk sharpens the urgency of our careful choices regarding media usages. Specifically, Medium Theory would urge us to reinforce the facilities with literacy and print that foster the individual autonomy underlying modern democracy and equality norms, vis-à-vis our ever increasing use of electronic media.

1 We stress here correlation only. That is, an early – and well-justified – criticism of Medium Theory was that it fell too easily into a technological determinism, one that would claim, e.g., that literacy-print (inevitably) caused the emergence of strongly individual conceptions of selfhood and thereby democratic polities. In general, any such mono-causal claim must be immediately rejected: the emergence of both distinctive conceptions of selfhood and democratic polities entails a complex of contributing factors, beginning with the material realities of increasing wealth through industrialization that literally afford new possibilities for individual spaces and thus individual privacies previously available only to the very wealthy. Moreover, to state the obvious, correlation is not causation. Given these critical caveats, however, we take these correlations to be substantive and significant – sufficient to provide at least an initial framework for the analyses, questions, and reflections we seek to raise here. (See also footnote 7, below.)
2 We will note below that “privacy,” as discussed and conceptualized in Denmark and Norway in terms of privatlivet (private life) and intimsfære (intimate sphere), thereby invokes more relational understandings of selfhood, in contrast with strongly individual conceptions. It appears that more individual conceptions have historically defined privacy in European and U.S. traditions, but it is outside the scope of this paper to argue that point in detail. ([3, pp. 125–136] details central aspects of notions of privacy in a data protection context; for one illustration of the plurality of meanings that can be made to hinge on the term, cf. [4, Chapter 2].) Presuming the distinction holds, however, we then place “privacy” in quotation marks in order to signal this ambiguity.
3 Cf. [5].

1. Changing Media, Changing Selves

1.1. Medium Theory

Medium Theory emerges with the work of Marshall McLuhan, Harold Innis, Elizabeth Eisenstein, and Walter Ong; more recent work, for example by Naomi Baron [8], helps develop the applications of Medium Theory to contemporary electronic media – most notably, the multiple modalities of communication made possible by computer networks, including Internet-facilitated communication increasingly taken up through mobile devices (cf. [9]). As elaborated more fully elsewhere [9], Medium Theory highlights a number of key differences between the communication modalities of literacy-print – i.e. the conjunction of literacy with the emergence of print-facilitated communication, ca. 1453 to present (e.g. [10]) – and those of electronic media, beginning with radio, film, and then TV. For our purposes, the most important of these differences is the notion of secondary orality affiliated with electronic media [11]. Medium Theory specifically highlights how each of the communication modalities fosters specific sorts of selfhood and identity.

Most briefly, literacy-print correlates with a (high) modern sense of individual selfhood – including the strongly rational and autonomous senses of selfhood affiliated with early modern philosophers such as John Locke and Immanuel Kant. By contrast, secondary orality – echoing the primary orality affiliated with earlier societies and cultures – fosters the (re-)emergence of more relational senses of selfhood and identity [9]. As we are about to see in a number of ways, such senses of identity and selfhood turn on the multiple relationships that are taken to define precisely who one is – e.g. the child of two particular parents, the sibling of a particular brother or sister, parent to particular children, and so on. The key point here is that Medium Theory thus predicts that the shift from literacy-print to secondary orality will be accompanied by a shift from more individual towards more relational senses of selfhood. We can see that this is so in two important domains of scholarly inquiry, namely Internet Studies and contemporary philosophy. We will see, in particular, that this shift further correlates with fundamental transformations in our understandings and practices of “privacy”.


1.2. Internet Studies

The literatures of Internet Studies have demarcated this shift towards focusing on more relational senses of selfhood across a range of Internet-facilitated communication venues. As a first example: within the first decade of public access to Internet-facilitated communication, Barry Wellman and Caroline Haythornthwaite documented the rise of “the networked self.” Such a self is recognizably individual; but at the same time, the networked self is defined precisely by the extensive webs of relationships facilitated and instantiated by the whole range of networked communications – ranging from email to participation in online communities and virtual worlds – that are thereby incorporated into one’s sense of identity [12]. This notion of a networked self is now robust enough to justify, for example, arguments for radical revisions of modern notions of legal agency and responsibility – i.e. notions based precisely on more atomistic conceptions of selfhood ([13]; see [14, 118f.]).

These empirical findings are further coherent with the prevailing theoretical frameworks taken up in Internet Studies – namely, social theories of selfhood that emphasize the primary roles of diverse relationships in defining our sense of identity. These include Georg Simmel’s notion of the “sociable self” [15,16] and G.H. Mead’s “social theory of consciousness” [17, p. 171]. Perhaps most prominent in contemporary Internet Studies is Erving Goffman’s notion of selfhood as enacting and performing the diversity of relationships that define such a self [18]. As both the theories and the studies based on these theories document (e.g. [19]), relational selfhood appears to be the primary affordance of contemporary digital technologies – most prominently, Social Networking Sites (SNS) that instantiate precisely the Goffmanian performative self, i.e. one that is focused on establishing and sustaining (through role-appropriate performance) the diverse networked relationships that define its identity and possibilities for agency. As we will explore below, whether or not such a self can retain the strong sense of individual identity and agency as established in high modernity emerges as a critical question.


1.3. Contemporary Philosophy

Emerging Philosophical Conceptions

This question is further illuminated by recent philosophical work on online identities and their relationships to offline identities. On the one hand, some contemporary philosophical work reinforces more high modern notions of identity as unitary and consistent over time, in contrast especially with 1990s’ emphases on postmodern conceptions of identity as multiple, fluid, and ephemeral [20]. On the other hand, a range of recent theoretical developments in feminism and Information and Computing Ethics emphasise relationality in various ways. These emerging understandings include C. Mackenzie [6] and others’ work on “relational autonomy,” Judith Simon’s work on distributed epistemic responsibility [21], and Luciano Floridi’s account of distributed morality and distributed responsibility [22]. More broadly, virtue ethics has re-emerged as a stance within moral theory during the last decades, not least by contrasting itself with approaches that have tended to take as their basic premise the individual as an entity considered in abstraction from defining relations like family and friends. (For a classic and influential argument to this effect, cf. [23]; cf. also [14]. We will explore this more fully below by way of Alasdair MacIntyre as a primary example.) Both individually and together, these accounts instantiate a shift towards more clearly relational understandings of ethical responsibility and agency – i.e. understandings appropriate to relational selves more than strongly (high modern) individual selves.


Nissenbaum’s Theory of Privacy as Contextual Integrity

It seems quite clear that the emergence of modern notions of strongly individual senses of selfhood and moral agency correlates with high modern notions of individual privacy (e.g. [14, 62f.]). Broadly speaking, the rise of networked communications has issued in multiple efforts over the past several decades to develop robust law and procedures for protecting individual privacy by way of protecting personal data – what philosophers such as Herman Tavani (among others) identify as informational privacy [24, p. 136].

Perhaps the perspectives afforded by thinking in terms of personal data management (PDM) can help us begin to see more clearly how narrowly conceived issues of privacy constitute a fuzzy zone in a sea of widely differing relations. As the term signifies, PDM concerns the management of personal data, that is, the entire flow of data that together constitutes important dimensions of the individual’s life. At the same time, however, the notion of PDM might be said to highlight, more starkly than talk of privacy, a particular challenge for any ethically responsible handling of personal information: it must find ways of dealing with all those settings where the information, and/or the identities to which the information contributes, are shared in defining relations. We here have occasion merely to note this issue without being able to suggest strategies for solving the dilemmas that follow from it.

Recent developments in European Union data privacy protections demarcate efforts to provide individuals with greater individual privacy in the form of individual data protection [25,26]. At the same time, however, the ongoing shifts towards more relational senses of selfhood immediately imply profound transformations in our expectations and practices regarding privacy. Most briefly, it is clear across a range of online behaviours – most especially in social networking sites – that we are witnessing a shift from an individual sense of privacy to what some have called “publicly private” and “privately public” senses of privacy, i.e. privacy as defined more for specified groups of either close- and/or weak-tie relationships [27].

Within philosophy, these shifts have inspired a number of efforts to reconceptualise privacy in ways more appropriate to more relational senses of selfhood. The most significant of these is Helen Nissenbaum’s account of privacy as a matter of “contextual integrity”. Nissenbaum develops a definition of privacy as a right to an “appropriate” flow of information as defined by a specific context [1, 107ff.]. Contexts here can refer to the marketplace, political life, education, and so on. Within a given context, a specific set of informational norms define the usual or expected flows of information. What is striking – but not, to our knowledge, otherwise emphasised either by Nissenbaum or by those who have taken up her work – is that a core element in defining a context is what we have seen previously as the relational self. That is, Nissenbaum identifies three parameters as defining a given context, beginning with the actors involved. In addition, the attributes (types of information) and “transmission principles” of a context then determine “the constraints under which information flows” [1, p. 33].

Nissenbaum’s example hints here at the relationality of the actors involved: she describes a fictive case of medical information shared between patients and their doctors. What she denotes as the patient – what we denote as the (agent as) patient4 – expects this information, as highly personal and sensitive, to remain confidential. At the same time, the (agent as) patient recognizes that such information can be appropriately shared as needed with other medical professionals – in our terms, other (agents as) medical professionals. By contrast, were the (agent as) physician to sell such information to a marketing company – to follow the informational norms of the market – the (agent as) patient’s expectations of appropriate information flow “would be breached” and “we would say that informational norms for the health care context had been violated” (ibid.).

We use this slightly clumsy circumlocution of (agent as) patient, etc. in order to highlight what Nissenbaum makes more explicit elsewhere: namely, in her understanding of the actors who define a given context, Nissenbaum assumes a strongly relational sense of selfhood – one defined by the wide range of relationships and thereby roles that we take up with one another. This becomes clearest as Nissenbaum invokes the earlier work of James Rachels, who first of all highlights the connection between given roles – “businessman to employee, minister to congregant, doctor to patient, husband to wife, parent to child, and so on” – and specific expectations regarding privacy ([7, p. 328], cited in [1, pp. 65, 123]). In addition, in his account of privacy, Rachels articulates a close connection between our agency and our relationality: “there is a close connection between our ability to control who has access to us and to information about us, and our ability to create and maintain different sorts of social relationships with different people” ([7, p. 326], cited in [1, p. 65]; emphasis added, CE, HF). Again, as we read them, neither Rachels nor Nissenbaum explicitly identifies their understandings of human identity in terms of relational selfhood.
But Rachels clearly highlights our social relationships as critical to defining privacy: in doing so, he equally clearly points to the relational or social selfhood we recognize from Mead, Simmel, and Goffman, for example. So it seems safe to say that Nissenbaum’s account of privacy as contextual integrity – where contexts are defined first of all in terms of actors and their roles, and thereby built on Rachels’ explication of social relationships – is highly appropriate to the relational sense of selfhood we have seen come to the forefront in the age of networked communications. At the same time, Rachels’ account of the relational self highlights the agency of the individual involved in creating and sustaining the diverse relationships contributing to one’s sense of identity; it further emphasizes this agency in terms of the interest in controlling one’s information within a given context. This suggests a “hybrid” self – one that is both individual and relational.

Indeed, we can note here that such a hybrid self is precisely what at least one contemporary researcher has documented to be at work in the communicative processes of a prominent Danish blogger and her audience. As we saw above, Stine Lomborg is an example of contemporary researchers who draw on theories of selfhood as relational and social – in her case, the work of Georg Simmel. Moreover, in her empirical analyses of the communication between the author of Huskebloggen and her readers, Lomborg describes a process of negotiation that recognizes the boundaries of individual privacy in conjunction with the creation of a shared “personal space”, which is neither fully individual nor fully public, but squarely relational [19]. As we will see below, this analysis of privacy in the Danish context holds further interest in that the primary concepts and language for “privacy” in play here – namely, privatlivet and intimsfære – are fully relational terms. In both Danish and Norwegian, privatlivet – “private life” – connotes not only the life of a given individual, but a social private life constituted precisely by one’s close family, friends, and other close relationships. Similarly, intimsfære – “intimate sphere” – demarcates precisely the shared space that is neither fully individual nor fully public, but is typically articulated around the close relationships contributing so profoundly to one’s individual sense of identity. We will argue more fully below for this affiliation between these individual-relational senses of selfhood and identity and the sense of the “individual agent as relational” that we see at work in Rachels’ and thereby Nissenbaum’s account of privacy.

4 We use the formulations “the (agent as) patient” and “(agent as) physician” etc. in order to highlight how a singular self, capable of choice and agency, at the same time is highly relational, precisely as it takes up and interacts with others along the lines defined by specific social roles.


2. Future Selves, Privacies, Data Controls?

What are the implications, in both theory and practice, of these foundational transformations for more relational conceptions of selfhood and privacy – especially for our emerging conceptions of how human beings may appropriately exercise agency and control over their personal data? In this section, we develop further background for broadly addressing this question by first reviewing the implications of relational selfhood for our social and political practices (Section 2.1). Here we draw primarily on the work of communitarian and virtue ethicist Alasdair MacIntyre to show the ways in which relational selves – as Medium Theory initially makes clear – can tend to correlate with non-democratic social structures and practices. This point is reiterated in a brief discussion of the relational self in Confucian traditions. Both examples thus highlight the risks to (high) modern commitments to democratic processes and norms, including basic norms of equality and gender equality, that emerge alongside the rise of the relational self as fostered by contemporary electronic media.

We then turn to Norway as a possible example of how relational selfhood may remain inextricably interwoven with solidly high modern individual notions of selfhood (Section 2.2). This becomes apparent with the ways in which “privacy”5 is conceptualized and discussed – namely in the terms of privatlivet and intimsfære. Moreover, the law and practice of research ethics in Norway provides an example of how such individual-relational privacy can be conceptualized as an object of protection. Finally, contemporary Norway offers a concrete instance of how the anti-democratic potentials of more relational selves can indeed be avoided in practice – at least given specific media choices.


2.1. The Risks of Relational Selfhood: MacIntyre

We have seen how our sensibilities, practices, expectations, and performances – most especially online and most especially with regard to privacy – demarcate a shift towards more relational senses of selfhood and identity in multiple ways. Here we emphasize that such selves are thereby far more dependent upon their networks of relationships, first of all to define their own identity. It is no surprise to discover, then, that historically such selves are at home in social and political structures marked by hierarchy and the non-democratic exercise of power.

As we have now seen, in multiple ways and contexts, relational selves are selves that exist and are defined through the relations into which they enter. The functioning and identity of such a self are inherently co-determined by these inter-individual relationships. Several desirable qualities can be attributed to such a state of affairs. A tendency to promote harmonious social practices is one of the central assets associated with it: if the self already has others as parts of itself, and is part of those others, the potential for social conflict is reduced as compared to settings where more atomistic selves are in play.

However, there are attributes which are sometimes seen as intimately related to such relational harmoniousness, but are not for that reason considered desirable in themselves. One family of claims, which is often brought up in discussions of relational selves, amounts to a criticism of their anti-democratic tendency. Two main expressions of this tendency, putatively inherent to relational selves, are the existence of rigid social hierarchies and a risk of overt corruption or nepotism. Hierarchical structures, it is claimed, tend to correlate with relational selves because their relational status entails dependence and a corresponding recognition of dependence. While this dependence might, in the abstract, be thought of in the form of interdependence, that is, a mutual recognition of community, in practice differences in resources and power may lead to a social system where only a few rule, but all see themselves as obliged to respect the decisions of those in power. Nepotism and downright corruption can easily follow from this state of affairs, because the perspective of those in power is not gainsaid, but respected by all. The claim is thus not (or at least not primarily) that the ones wielding power are corrupted in the sense of developing egotistical motives because they are in a position to do so, but that their perspective comes to be treated as representing everyone in spite of the fact that the majority of individuals involved have not even voiced their opinion. The corruption consists in the ones in power illegitimately taking their own perspectives (which will tend to support their own sense of values, need, and self-worth) to be the voice of reason tout court.

5 See footnote 1, above.

But let us be more concrete. This potential shortcoming, suspected of being inherent to relational selves and to the corresponding social realities, has been remarked upon in very different contexts. Among other things, the tendencies in question have been said to inhere in Confucianism and in Communitarianism. As these two schools of thought are also representative of, respectively, Eastern and Western traditions, they constitute a fruitful pair with which to start diagnosing and evaluating the nature of the claims made.

If one had to choose a Grundschrift for Communitarianism, the main contender would probably be Alasdair MacIntyre's After Virtue: A Study in Moral Theory, first published in 1981. Not least the concept of a tradition which he expounds, and which has been highly influential in Communitarianism more generally, is of interest in detecting elements that can be (and have been) the subjects of attacks on the counts spelled out above. At the same time, since MacIntyre's framework forms part of and has been important to the development of some strands of virtue ethics, this analysis brings out a relational potential in much virtue ethics as well.

MacIntyre introduces his notion of a tradition in the context of expounding the narrative self, which is also a self where "the story of my life is always embedded in the story of those communities from which I derive my identity" [28, p. 221]. To MacIntyre, the self is constituted in its very core by specific types of relationships. This also means that each of us approaches the specifics of our lives:


as bearers of a particular social identity. I am someone’s son or daughter, someone else’s cousin or uncle; I am a citizen of this or that city, a member of this or that guild or profession; I belong to this clan, that tribe, this nation. Hence what is good for me has to be good for one who inhabits these roles [28, p. 220].

The import of the multi-relational identity explained by MacIntyre is perhaps clearest in his discussion of practices.6 In order to qualify as a practice, a set of interrelated activities has to possess a high degree of complexity, and doing well in those activities has to be an achievement requiring long and manifold training. To give an idea of the sort of requirements and potentials involved, MacIntyre's first examples of practices are football, chess, architecture, farming, physics, chemistry, biology, the work of the historian, painting, and music [28, p. 187]. Now part of the trick for the individual being socialized into such a practice, which to MacIntyre is the place where virtue can be developed and sustained as those human qualities required to obtain the goods internal to the practice, is to reach a point where the practice ceases to be merely a means, and acquires the status of an end for that person. Given the complexity of a practice, this is the only way to obtain its goods, as an absence of such understanding excludes one from the sort of effort required to realise virtue, or even to see those goods properly. What is most interesting for our purposes is the fact that, in order to make oneself part of a practice in this way, one has to submit to it:

A practice involves standards of excellence and obedience to rules as well as the achievement of goods. To enter into a practice is to accept the authority of those standards and the inadequacy of my own performance as judged by them. It is to subject my own attitudes, choices, preferences and tastes to the standards which currently and partially define the practice [28, p. 190].

6 MacIntyre defines a practice as "any coherent and complex form of socially established cooperative human activity through which goods internal to that form of activity are realized in the course of trying to achieve those standards of excellence which are appropriate to, and partially definitive of, that form of activity, with the result that human powers to achieve excellence, and human conceptions of the ends and goods involved, are systematically extended" [28, p. 187].

To temper the perhaps even totalitarian impression one might get from the above formulations, MacIntyre adds:

Practices, of course […], have a history: games, sciences and arts all have histories. Thus the standards themselves are not immune from criticism, but nonetheless we cannot be initiated into a practice without accepting the authority of the best standards realized so far [28, p. 190].


Now such submission must importantly include submission to individuals deemed authoritative by those already part of the practice. In other words, the very logic of the thing includes reference to a hierarchical system where some get to decide what counts as the better way of carrying out the practice. And while one might submit that, as one progresses, one develops a sense of judgement for the products and goods of the practice, it remains true that the entire undertaking is built on inequalities and quiescent submission. Furthermore, one's mindset is attuned to such hierarchical thinking and judgment as a premise for even getting started, and for one's entire advancement in the practice. And correspondingly, as one reaches the upper echelons of the practice, one has already been taught in no uncertain terms (in fact: in practice) that this is where authority resides. There can be little doubt – whether we confine ourselves to the argument or look to real-life instantiations of practices – that the risk of hierarchical structures and mindsets bearing fruits of corruption is a real one.

It definitely bears notice that this stern view of the relation between tradition and the individual talent is not MacIntyre's invention. Its perhaps greatest influence has been through hermeneutics, itself a tradition of enormous importance to the last couple of generations' academic work and self-understanding in the West. Hans-Georg Gadamer's magnum opus (originally published in 1960), Truth and Method, proffers a similar understanding of understanding (as has been pointed out by, e.g., Jürgen Habermas). In a passage clearly echoed by MacIntyre, Gadamer writes that

Admittedly, it is primarily persons that have authority; but the authority of persons is ultimately based not on the subjection and abdication of reason but on an act of acknowledgement and knowledge – the knowledge, namely, that the other is superior to oneself in judgment and insight and that for this reason his judgment takes precedence – i.e. it has a priority over one's own. […] It is true that authority implies the capacity to command and be obeyed. But this proceeds only from the authority that a person has. Even the anonymous and impersonal authority of a superior which derives from his office is not ultimately based on this hierarchy, but is what makes it possible [29, p. 281].

Characterising the claims of such authority as being in principle discoverable as true, Gadamer adds that this "is the essence of the authority claimed by the teacher, the superior, the expert" [29, p. 281]. Undermining the Enlightenment stance as one which has set up a false dichotomy between tradition and reason is one of Gadamer's main ambitions with the book. In ethics as in other fields, tradition is nothing less than the ground of its validity [29, p. 282]. And so for Gadamer, no less than for MacIntyre, submission to the authority of tradition, as incorporated in other individuals and expressed (in a well-functioning system) through their positions as well as their activities, becomes the only way to learn, to cultivate oneself, to belong to a field, and of course to reach a position of authority in it.

Similar criticism has often been voiced against Confucianism. One expert on Confucian tradition and scholarship, citing a host of scholars, states that "[t]he common view […] is that rights do not find a congenial home in Confucianism because of its emphasis on community" [30, p. 32]. And while his own more recent work modifies this picture by bringing in a qualification in the form of "the germ of an argument in the idea that the common good is sustained by recognition of a duty to speak" [30, p. 36], this does nothing to alter the basic tendency to see in Confucianism's advocacy of respect for elders and figures of authority also the germ of anti-democratic, hierarchic thinking and practice. In particular, this means, as the Confucian scholar Mary Bockover has summarised, "Western values of free expression, equality and free trade as well as the idea of personal and political autonomy are incompatible with Confucian values" ([31, p. 170]; cited in [32, p. 168]; emphasis added, CE, HF; see [33, pp. 94–97]).

Our intention here is not to criticise "Eastern" or "Western" conceptions and practices. Rather, these observations bring us to a core point: what does "privacy" mean for such relational selves? Most briefly: just as relational selves appear to challenge high modern commitments to individual autonomy, equality, and democratic processes, so such selves stand completely opposed to high modern conceptions of individual privacy as a positive good. As we have seen, it is just such individual privacy that current EU regulations are at pains to protect. By contrast, however, in societies constituted by relational selves, such individual privacy is understood solely in negative terms. For example, until recently in China – reflecting at least in part a strong Confucian tradition rooted in relational selfhood [34, 25ff.] – "privacy" (yinsi) was defined as something bad or hidden [35]. This is not surprising: if our sense of selfhood and identity is defined and enhanced by our multiple relationships, we would, it would seem, only seek solitude if we indeed had something to hide. By contrast, we have seen above that more familiar notions of individual privacy as a positive good arise in conjunction with high modern notions of individual, autonomous selves (cf. [36]).7

In this light, then, it is no surprise that contemporary practices and expectations of "privacy" in online networked environments demonstrate the shifts we have noted away from strongly individual toward more "publicly private/privately public" notions [27]. Especially where individual privacy is a defining concept and value of modern democracies, its potential loss in our shift towards more relational selves may be in lockstep with a threat to affiliated democratic norms and values, including basic commitments to equality. Happily, however, we do not think that these shifts must inevitably end in such losses. On the contrary, we will now argue – by way of the example of Norway – that our futures can include hybrid selves which, as they conjoin individual with relational senses of selfhood, can thereby sustain high modern ethical norms and political commitments.

7 This is of course not to say that privacy, seen as something positive, and Modern atomistic individuality logically imply each other. As one of our reviewers pointed out, certain religious practices, for example, might include positive evaluations of privacy without the selves involved being atomistically Modern. And correspondingly, at least in theory, we can imagine a thoroughly Modern self with no eye for privacy. However, the general trend, both historically and conceptually, seems to have been their co-production and coordination.

2.2. Example: Norway

Norway stands as something of a middle ground between what may otherwise appear as opposite poles. That is, Norwegian culture and society have long fostered a sense of individual selfhood that is at the same time resolutely relational. In doing so, Norway appears to succeed in sustaining high modern conceptions of individual identity, privacy, equality, and democracy alongside conceptions of more relational identity. This is apparent first of all in the ways "privacy" is conceptualized, discussed, and regulated.

To begin with, individual privacy, as an exclusively individual concern and right, is certainly understood and protected. Norway is not a member of the European Union, and the suggested national implementation of the EU's Directive 2006/24/EC occasioned a heated and critical public debate about the prolonged storing of personal information. The directive was nonetheless passed by the Norwegian Parliament. The Director of the Norwegian Data Protection Agency has, on the other hand, expressed sympathy for the notion of a "right to be forgotten" that has come into play in the now ongoing debate about a new European data protection regulation.

At the same time, however, "privacy" is discussed and understood partly in terms of privatlivet (private life) and sometimes the intimsfære (the intimate sphere) – meaning the sphere of close relationships critical to an individual's own identity and personhood. These terms and concepts thus closely echo and reinforce the analysis of privatlivet and intimsfære we saw in the Danish example [19]. As a reminder, these concepts represent precisely the conjunction of the individual and relational selves, including the correlative practices of holding together respect for individual privacy in conjunction with shared or personal spaces (online and off) that are neither purely individual nor purely public.

This conjunction of individual with relational emphases of selfhood can further be discerned in the justifications given for Article 100 of the Norwegian Constitution – an Article that introduced dramatically expanded (high modern) rights to freedom of expression and freedom of speech. One of the primary arguments offered here turns on the principle of autonomy, defined as "the individual's freedom to form opinions…" – a freedom further based on a concept of a "mature human being":

This is neither the collectivist concept of the individual, which states that the individual is subordinate to the community, nor the individualistic view, which states that regard for the individual takes precedence over regard for the community. The conception of "the mature human being" can be said to embody a third standpoint which transcends the other two and assumes that a certain competence (socialization or education) is required in order to function as an autonomous individual in the open society [37, p. 18].

Such a mature human being, we suggest, is thereby at least in part an individual-relational self.

Secondly, these conceptions appear to underpin extant research ethics guidelines. Consider the heading of § 13 of the Norwegian Guidelines for research ethics in the social sciences, law and the humanities: "The obligation to respect individuals' privacy [privatliv] and close relationships" [38, p. 17]. Seen in this light, this Norwegian research code seems to imply that researchers are obliged to protect the privacy and confidentiality of not simply their individual research subjects, but also the privacy and confidentiality of their close relationships – the relationships that help constitute privatlivet. A concern with a level transcending the individual as ethically central might also be gleaned from the same guidelines' § 22, "Respect for vulnerable groups", which singles out groups in addition to persons in stating that "[v]ulnerable and disadvantaged individuals and groups will not always be equipped to defend their own interests" [38, p. 22].


In slightly different terms, we can see here an example of what Nissenbaum's account of privacy as contextual integrity might "look like" in practice. That is, the NESH guidelines appear to address the core importance of the individual-relational self maintaining control over information flows within a close circle of relationships (recall [7]). In any event, while there is generally a gap between theory and practice, both of these ethical requirements are of interest mainly because they articulate theoretical descriptions of how ethical practices tend to work – as it were, in practice – whenever relationality is acknowledged to form a mainspring for them.

In considering the Norwegian example, we would further suggest that the emphasis on relationality apparent in the ways outlined above is balanced by exceptionally strong commitments to individualism and the equality of individuals. As a basis for social interaction, the notion of a basic equality, irrespective of (centrally) class, income, or gender, has a rather strong foothold among much of the population. This is not of course to say that Norwegians are egalitarians through and through, or that they are so in a systematic or rationally coherent manner. It is to say, however, that the equality of individuals is apparent in a striking range of ways. As a primary example, consider Norway's GINI coefficient. The GINI coefficient demarcates the distribution of wealth and income within a society: a GINI coefficient of 0 would mean perfect equality, while a GINI coefficient of 100 would mean complete inequality (a worked example follows below). The GINI coefficient for Norway is 25 – the third lowest in the world (alongside Denmark and Slovenia: [39, p. 80f.]). Moreover, Norway is "considered to be one of the most gender equal countries in the world" ([40], http://www.gender.no/Policies_tools).

Finally, these concepts and practices are strongly coherent with what Medium Theory would predict. That is, in media terms, Norway represents a very strong balance of literacy-print and digital technologies. For example, Norway has one of the highest literacy rates in the world – including very high production and consumption of print media such as newspapers and books. UNESCO reports that newspaper circulation in the United States in 2004 averaged 193 per 1,000 inhabitants;8 in the same year, the average circulation in Norway was 516 per 1,000 inhabitants.9 At the same time, the use of digital media in Norway – supported for decades by strong state investment in infrastructure, etc. – is also among the highest in the world. For example, Internet penetration is measured at 97.2%, second only to Iceland (97.8%); by comparison, the U.S. – despite being the birthplace of the Internet – is ranked 27th in the world with 78.3%.10

To be sure, there are many other cultural, historical, and political factors that make these accomplishments possible. But our central point is that the Norwegian example suggests the possibility of holding together the communication technologies of literacy-print and the secondary orality-textuality of electronic media – alongside notions of selfhood that are resolutely individual and intrinsically relational – with the legal and political correlates of individual and close-relationship privacy protections and ongoing commitments to high modern understandings of democratic processes and the norms of equality and gender equality.
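To make the 0–100 scale concrete, the short sketch below computes a GINI coefficient as the mean absolute difference between all pairs of incomes divided by twice the mean income – one standard discrete formulation, not the OECD's exact estimation procedure – using income figures we have invented purely for illustration:

# Illustrative GINI computation (0 = perfect equality, 100 = complete
# inequality). The formulation and income figures are ours, for
# illustration only; they are not drawn from [39].
def gini(incomes):
    n = len(incomes)
    mean = sum(incomes) / n
    # mean absolute difference over all ordered pairs of incomes
    mad = sum(abs(x - y) for x in incomes for y in incomes) / (n * n)
    return 100 * mad / (2 * mean)

print(gini([30, 30, 30, 30]))  # 0.0   -> everyone earns the same
print(gini([10, 20, 30, 60]))  # ~33.3 -> moderate inequality
print(gini([0, 0, 0, 120]))    # 75.0  -> one person holds everything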

8 http://stats.uis.unesco.org/unesco/TableViewer/document.aspx?ReportId=124&IF_Language=eng&BR_Country=8400&BR_Region=40500
9 http://stats.uis.unesco.org/unesco/TableViewer/document.aspx?ReportId=124&IF_Language=eng&BR_Country=5780&BR_Region=40500
10 http://www.internetworldstats.com/top25.htm
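Nissenbaum's contextual integrity, invoked above for the NESH guidelines, can also be given a minimal operational sketch: informational norms are declared per context, and a flow of personal information is appropriate only if it matches a declared norm. The contexts, roles, and norms below are invented for illustration and are not taken from Nissenbaum or from NESH:

# Toy model of privacy as contextual integrity: a flow of personal
# information is appropriate only if a transmission norm exists for it.
# All norms, contexts, and roles are invented for illustration.
NORMS = {
    # (context, sender role, receiver role, information type)
    ("research", "subject", "researcher", "interview data"),
    ("research", "researcher", "ethics board", "anonymised data"),
    ("family", "subject", "close relation", "health status"),
}

def flow_is_appropriate(context, sender, receiver, info_type):
    return (context, sender, receiver, info_type) in NORMS

# Sharing interview data with the researcher fits the research context...
print(flow_is_appropriate("research", "subject", "researcher", "interview data"))    # True
# ...but passing it onward to an advertiser matches no norm and so
# violates contextual integrity.
print(flow_is_appropriate("research", "researcher", "advertiser", "interview data")) # False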


3. Concluding Remarks


3.1. Implications for Human-Centric Frameworks for Personal Data Management?

Given that the Norwegian example indeed instantiates a notion of an individual-relational self, one closely attuned to Nissenbaum's theory of "privacy", it thereby provides an initial sketch of what our control over personal data might look like in the years to come. To begin with, the primarily individual notions of selfhood, agency and privacy, affiliated with high modernity and strongly protected within current and pending EU guidelines, need not disappear as we shift towards more relational selves. To be clear: we do not believe that the risks of such a loss – and thereby, affiliated losses of democratic rights and norms, including equality and gender equality – should be minimized. Nonetheless, in Rachels' and Nissenbaum's theory of privacy, Lomborg's analysis of online negotiation processes, and the exemplified Norwegian conceptions of and codes for protecting privatlivet and the intimsfære, we see notions of relationality that retain an emphasis on individual agency in the control of one's information, precisely as that information is shared within specified contexts defined by specific relationships.

Again, such agency – as rooted in high modern notions of autonomy intertwined with literacy-print – cannot be assumed or taken for granted, most especially in the face of ubiquitous pressures to render everything digital in an increasingly hyperconnected "onlife" world [43]. Nor, we would argue, should we let good-faith efforts to protect individual privacy in more instrumental ways – e.g. "privacy by design" (http://privacybydesign.ca/) – allow us to become complacent in the assumption that technological design coupled with carefully crafted regulations will be sufficient to protect individual privacy. Rather, alongside such important initiatives and projects, becoming a relational self that simultaneously maintains individual agency and control over privacy and privatlivet requires both individual initiative and nothing less than the intentional and sustained support of the wider society. This is made clear precisely in the Norwegian example, whose account of "the mature human being" is exactly that of an autonomy inextricably interwoven with larger social communities and infrastructures, including those of education.

3.2. Implications for Choices Concerning Our Media Usage?

Indeed, both Medium Theory and the Norwegian example make clear that our future – in terms of our identities, our privacies and private lives, and our polities – will, in no small measure, turn on choices we make regarding media usage and media literacies. Medium Theory highlights the correlations between literacy-print, high modern conceptions of individual selfhood and privacy expectations, and high modern emphases on democratic processes and norms, including equality and gender equality, on the one hand, and, on the other, secondary orality and a relational self at home in hierarchical, if not authoritarian, structures and regimes. In this light, if we aim to sustain and enhance a high modern understanding of selfhood, privacies, and democratic norms and processes, we would be well served to preserve and foster the communicative skills and abilities affiliated with literacy-print. Doing so, as we have seen in the Norwegian example, is compatible with developing a hybrid self that conjoins strongly individual notions of agency and autonomy with more relational sensibilities. Failure to do so, however, especially in the face of multiple pressures to further develop and expand "digital literacies" – i.e. the abilities and skills affiliated primarily with electronic media and thus secondary orality – would seem to include a risk of turning our future away from the balances and equalities required for democratic norms and practices, towards more hierarchical and potentially more authoritarian social structures and regimes.


References

[1] Helen Nissenbaum (2010). Privacy in Context: Technology, Policy, and the Integrity of Social Life. Palo Alto, CA: Stanford University Press.
[2] Charles Taylor (1989). Sources of the Self: The Making of the Modern Identity. Cambridge, Mass.: Harvard University Press.
[3] Lee A. Bygrave (2002). Data Protection Law: Approaching Its Rationale, Logic and Limits. Information Law Series 10. The Hague: Kluwer Law International.
[4] Daniel J. Solove (2008). Understanding Privacy. Cambridge, Mass.: Harvard University Press.
[5] Julie E. Cohen (2013). What Privacy Is for. Harvard Law Review 126: 1904–1933. http://www.harvardlawreview.org/issues/126/may13/Symposium_9476.php.
[6] Catriona Mackenzie (2008). Relational Autonomy, Normative Authority and Perfectionism. Journal of Social Philosophy 39(4), Winter 2008: 512–533.
[7] J. Rachels (1975). Why Privacy Is Important. Philosophy and Public Affairs 4(4): 323–333.
[8] Naomi Baron (2008). Always On: Language in an Online and Mobile World. Oxford: Oxford University Press.
[9] Charles Ess (2010). The Embodied Self in a Digital Age: Possibilities, Risks, and Prospects for a Pluralistic (Democratic/Liberal) Future? Nordicom Information 32(2), June 2010: 105–118.
[10] Elizabeth Eisenstein (1983). The Printing Revolution in Early Modern Europe. Cambridge: Cambridge University Press.
[11] Walter Ong (1988). Orality and Literacy. London: Routledge.
[12] Barry Wellman and Caroline Haythornthwaite (Eds.) (2002). The Internet in Everyday Life. Oxford: Blackwell.
[13] Julie E. Cohen (2012). Configuring the Networked Self: Law, Code, and the Play of Everyday Practice. New Haven: Yale University Press.
[14] Charles Ess (2013a). Digital Media Ethics (2nd edition). Oxford: Polity Press.
[15] G. Simmel (1955). Conflict and the Web of Group-Affiliations. New York: The Free Press.
[16] G. Simmel (1971). Sociability. In: D.N. Levine and Georg Simmel (Eds.), On Individuality and Social Forms: Selected Writings (pp. 127–140). Chicago and London: The University of Chicago Press.
[17] G.H. Mead ([1934] 1967). Mind, Self & Society. Chicago: Chicago University Press.
[18] E. Goffman (1959). The Presentation of Self in Everyday Life. London: Penguin Books.
[19] Stine Lomborg (2012). Negotiating Privacy Through Phatic Communication: A Case Study of the Blogging Self. Philosophy and Technology 25: 415–434. doi: 10.1007/s13347-011-0018-7.
[20] Charles Ess (2012). At the Intersections Between Internet Studies and Philosophy: "Who Am I Online?" (Introduction to special issue). Philosophy & Technology 25(3), September 2012: 275–284. doi: 10.1007/s13347-012-0085-4.
[21] Judith Simon (2013). Distributed Epistemic Responsibility in a Hyperconnected Era. In: Broadbent et al. (Eds.), The Onlife Initiative (pp. 135–151). Brussels: European Commission. https://ec.europa.eu/digital-agenda/en/judith-simon-0.
[22] Luciano Floridi (2012). Distributed Morality in an Information Society. Science and Engineering Ethics. doi: 10.1007/s11948-012-9413-4.
[23] Michael Stocker (1997). The Schizophrenia of Modern Ethical Theories. In: Roger Crisp and Michael Slote (Eds.), Virtue Ethics (pp. 66–78). Oxford Readings in Philosophy. Oxford: Oxford University Press. (The paper originally appeared in 1976.)
[24] Herman Tavani (2013). Ethics and Technology: Ethical Issues in an Age of Information and Communication Technology (4th edition). Hoboken, NJ: Wiley.
[25] Article 29 Data Protection Working Party (2012a). Opinion 04/2012 on Cookie Consent Exemption (00879/12/EN WP 194). http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2012/wp194_en.pdf (accessed December 19, 2012).
[26] Article 29 Data Protection Working Party (2012b). Opinion 08/2012 providing Further Input on the Data Protection Reform Discussions (01574/12/EN WP 199). http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2012/wp199_en.pdf (accessed December 18, 2012).
[27] Patricia G. Lange (2007). Publicly Private and Privately Public: Social Networking on YouTube. Journal of Computer-Mediated Communication 13(1), article 18. http://jcmc.indiana.edu/vol13/issue1/lange.html.
[28] Alasdair MacIntyre (1994). After Virtue: A Study in Moral Theory (2nd edition). Duckworth: Guilford.
[29] Hans-Georg Gadamer (2004). Truth and Method. English translation with revisions by Joel Weinsheimer and Donald G. Marshall. London: Continuum.
[30] David Wong (2004). Rights and Community in Confucianism. In: Kwong-Loi Shun and David B. Wong (Eds.), Confucian Ethics: A Comparative Study of Self, Autonomy, and Community (pp. 31–48). Cambridge: Cambridge University Press.
[31] Mary Bockover (2010). Confucianism and Ethics in the Western Philosophical Tradition I: Foundational Concepts. Philosophy Compass 5(4): 307–316. Cited in [32, p. 168].
[32] Pak Wong (2012). Net Recommendation: Prudential Appraisals of Digital Media and the Good Life. PhD thesis, Department of Philosophy, University of Twente, Enschede, The Netherlands.
[33] Charles Ess (2013b). The Onlife Manifesto: Philosophical Backgrounds, Media Usages, and the Futures of Democracy and Equality. In: Stefana Broadbent, Nicole Dewandre, Charles Ess, Luciano Floridi, Jean-Gabriel Ganascia, Mireille Hildebrandt, Yiannis Laouris, Claire Lobet-Maris, Sarah Oates, Ugo Pagallo, Judith Simon, May Thorseth and Peter-Paul Verbeek (Eds.), The Onlife Initiative (pp. 75–97). Brussels: European Commission. https://ec.europa.eu/digital-agenda/sites/digital-agenda/files/Onlife_Initiative.pdf.
[34] Roger Ames and Henry Rosemont Jr. (1998). The Analects of Confucius: A Philosophical Translation. New York: Ballantine Books.
[35] Yao-Huai Lü (2005). Privacy and Data Privacy Issues in Contemporary China. Ethics and Information Technology 7(1): 7–15.
[36] Bernhard Debatin (2011). Ethics, Privacy, and Self-Restraint in Social Networking. In: S. Trepte and L. Reinecke (Eds.), Privacy Online (pp. 47–60). Berlin: Springer.
[37] "There Shall be Freedom of Expression": Proposed New Article 100 of the Norwegian Constitution. Report of Commission Appointed by Royal Decree on 26 August 1996, Submitted to the Ministry of Justice and the Police on 22 September 1999: Excerpts. www.uio.no/studier/emner/hf/imk/JOUR4330/v13/unesco-report.pdf.
[38] The [Norwegian] National Committee for Research Ethics in the Social Sciences and the Humanities (NESH) (2006). Forskningsetiske retningslinjer for samfunnsvitenskap, humaniora, juss og teologi [Research ethics guidelines for the social sciences, the humanities, law and theology]. http://www.etikkom.no/Documents/Publikasjoner-som-PDF/Forskningsetiske%20retningslinjer%20for%20samfunnsvitenskap,%20humaniora,%20juss%20og%20teologi%20%282006%29.pdf.
[39] OECD (2011). Income Inequality. In: OECD Factbook 2011–2012: Economic, Environmental and Social Statistics. OECD Publishing. http://dx.doi.org/10.1787/factbook-2011-31-en.
[40] Gender in Norway. N.d. http://www.gender.no/Policies_tools.
[41] Charles Ess (2013c). Trust, Social Identity, and Computation. In: Richard Harper (Ed.), The Complexity of Trust, Computing, and Society. Cambridge: Cambridge University Press.
[42] David Wong (2008). Chinese Ethics. Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/ethics-chinese/ (accessed April 2013).
[43] Stefana Broadbent, Nicole Dewandre, Charles Ess, Luciano Floridi, Jean-Gabriel Ganascia, Mireille Hildebrandt, Yiannis Laouris, Claire Lobet-Maris, Sarah Oates, Ugo Pagallo, Judith Simon, May Thorseth and Peter-Paul Verbeek (2013). The Onlife Manifesto. In: The Onlife Initiative: Being Human in a Hyperconnected Era. Brussels: European Commission. http://ec.europa.eu/digital-agenda/en/onlife-manifesto.

Digital Enlightenment Yearbook 2013 : The Value of Personal Data, IOS Press, Incorporated, 2013. ProQuest Ebook Central,

Copyright © 2013. IOS Press, Incorporated. All rights reserved.

This page intentionally left blank

Digital Enlightenment Yearbook 2013 : The Value of Personal Data, IOS Press, Incorporated, 2013. ProQuest Ebook Central,

Part II


The Need for Privacy


Digital Enlightenment Yearbook 2013 M. Hildebrandt et al. (Eds.) IOS Press, 2013 © 2013 The authors. doi:10.3233/978-1-61499-295-0-59


Not So Liminal Now: The Importance of Designing Privacy Features Across a Spectrum of Use

Lizzie COLES-KEMP 1 and Joseph REDDINGTON
Royal Holloway University of London, Egham

Abstract. There are many communities of ubiquitous computing users that are on the periphery of society, and these liminal users are often left to negotiate their relationship with technology without the help and support provided to more mainstream users. One such community is formed around users of Augmentative Alternative Communication (AAC) technology. Changes in the commercial landscape have brought within reach dramatic improvements in AAC and made these devices more accessible and supportive to their user community. These improvements, though overwhelmingly positive, amplify a family of personal data management problems that are similar to those experienced by more typical ubiquitous computing users. This paper argues that the information management practices deployed by the AAC user community are ones that mainstream society may benefit from using. Accordingly, this paper explores a number of personal data management problems that arise during AAC use and considers how AAC users have developed workarounds and information management practices to protect their personal information. Whilst this paper is focused on AAC technology, the responses could be generalised for a broader spectrum of society.


Keywords. AAC, information management practices, personal data, ubiquitous computing

1. Introduction

As we look back on the late 20th Century, immersive and ubiquitous computing that increasingly touches every aspect of our lives might well be regarded as one of the most notable developments of this period. Technological innovations have combined into sociotechnical methods of engagement and delivery such as smarter cities, "digital by default" service delivery and the consumerisation of information technology, and these new digital capabilities affect the lives of many. These themes are often shaped by service delivery paradigm shifts and are fuelled by economic drivers such as consumer choice, just-in-time delivery, personalisation, cost-effectiveness, and latterly government programmes of "austerity measures". Increasingly, the citizen is not only a consumer of data but also a producer of it, and much of that data is personal and related to individuals. The value of this data is dynamic and individual and requires an increasingly sophisticated range of tools to manage it.

1 Corresponding Author: [email protected].


It is sometimes argued that technology design is a process that helps society develop its use of technology. Indeed, it is often argued that envisioning and re-envisioning the human relationship with technology is an important aspect of the design process [1]. Sometimes envisioning takes place through the creation of new technologies, and sometimes envisioning happens more in narrative form, where technologies are developed from those narratives. The cultural roots of technology narratives have been the study of some research [2–4]. These narratives are useful not so much for any predictive quality [2], but for what they tell us about anxieties related to contemporary technology use. When new technologies emerge they can, in some sense, be viewed as a blending of science fact and fiction [5], and users are often required to adapt their lives to include these new technologies. This is particularly true of ubiquitous computing technologies, where users can be observed shaping the technology around their lives and shaping their lives around the technology. The complex interactions between individual, society, technology, and institutions are apparent in the case of ubiquitous computing, and the way in which humans and technologies are enmeshed through ubiquitous computing gives rise to many forms of security and privacy issues.

Similar to the way in which narratives in the arts point to trajectories of technology use, a number of reports have been produced that describe potential trajectories of ubiquitous computing and internet technology use and the potential for security and privacy risks inherent in the immersion of these technologies. Two such examples are reports from ENISA [6] and SWAMI [7] that envisage scenarios where the user is dependent, socially and professionally, on ubiquitous, internet-enabled computing and where the user routinely faces the need to exchange privacy for service delivery. The narratives point to concerns about the loss of user control over their personal data, the lack of awareness of the impacts of digital practices on others, and obfuscation of the true proposition being presented by service providers when exchanging user data for services.


1.1. The Liminal Vanguard

Notably, in the narratives about complex social technology use, whether from the arts or reports from research groups such as ENISA, the user is typically characterised as both the consumer and producer of data, much of it personal. These narratives paint a picture of a society where smarter cities, "digital by default" services, and the consumerisation of technology are all facets of everyday life. The narratives in the ENISA and SWAMI reports illustrate how the perceived value of personal data is shaped by the context and the feelings and responses that the data evokes. The reports also show that, to describe the potential risks that may emerge, the categories of information risk need to be extended to include risks to identity, risks to personal reputation, and risks to psychological well-being.

The ENISA report also makes recommendations as to how to mitigate the risks. The risk management recommendations are made on the assumptions that end-users have the cognitive ability to: make informed choices about the value of their personal data; decide on information disclosure; and use devices unassisted and independently. Crucially, they are also predicated on the notion that end-users can choose not to use the technology. However, what options for personal data management do users have if they do not possess these capabilities?

A technically-mediated world is already the reality for many groups of ubiquitous computing users, and not all of these groups are independent users capable of unassisted use. This paper examines one such group: users who rely on technology to communicate with the world around them. Electronic Augmentative and Alternative Communication (AAC) systems enable individuals with severe speech impairment to verbally communicate their needs. In many cases these devices are life changing. A user's device is designed to give greater independence and improved opportunities for social integration. AAC devices enable users to construct utterances, many of which describe themselves or aspects of their lives, including their interactions with others, and, as such, can be considered 'personal data'. AAC devices can be used in all areas of the user's life and are always-on devices that frequently require assisted, rather than independent, use. It is important to note that this is not a homogeneous user group: a wide range of different needs benefit from AAC use and the wider user community represents users with all levels of cognitive and physical ability. For these groups of users, responses to personal data management problems that are predicated on informed choice, independent use, and the right to opt out will not be effective or useful and, as a community of users, they have needed to develop other ways of responding to these problems.

This paper explores how AAC users develop their information management and technology practices to overcome difficulties with personal data management, and explores what mainstream society might learn from their endeavours. It is structured as follows: Section 2 gives an overview of AAC, including the technical landscape, the nature of the community, the implicit privacy concerns, and a brief survey of related work. Section 3 goes into some detail on the nature of the challenges faced by both mainstream and AAC communities and illustrates how AAC users might be considered in some senses to be the 'liminal vanguard' of mainstream society's movement to ubiquitous computing. Section 4 gives three examples of information management practices performed by AAC users and evaluates how these might be developed into design and practice principles for mainstream technology use. Section 5 gives conclusions, an agenda for future research, and our acknowledgements.

Figure 1. An AAC user with device.

2. Cyborg People. Here! Now!

Augmentative Alternative Communication (AAC) systems (Fig. 1) are used to supplement or replace communication for individuals with severe speech impairments. Often electronic, such systems are frequently the only means that an individual has to be able to communicate needs (both immediate: "I would like a drink", and long term: "I would like to start divorce proceedings"), or interact with society. Modern internet-enabled AAC devices are designed in such a way that AAC users assume the dual role of producers and consumers of personal information. Moreover, this is not a small user community: the Domesday Dataset [8] records over 8,000 purchases of speech aids by the NHS between 2007 and 2012 and estimates the overall spend during this time on speech aids as around fourteen million pounds. However, other than extremely non-standard cases such as Stephen Hawking, this is a hidden community of ubiquitous technology users and one that is rarely, if ever, considered when reflecting on the use of ubiquitous computing in contemporary society.

AAC devices may excel at needs-based communication (e.g. "I am hungry", "I'm cold", "get the phone") but they are limited for real conversation [9]. Typical AAC devices tend towards a hierarchical structure of pages, each of which typically focuses on a context (e.g. shopping) or a category (e.g. clothes, sports), rather than observations or recent personal stories [10]; a sketch of this structure follows below. Work by [11] reports that spontaneous conversation with typical devices is slow and difficult (new utterances are typically constructed at a rate of between 8 and 10 words per minute, slightly more if, for example, word prediction is used). Using pre-programmed phrases (which give a much higher number of words per minute) can reduce the ability for self-expression [12], which limits the range of exposable personal data. In general, new utterances must be prepared in advance either by the user or a carer, which requires a significant overhead in terms of time and effort. It is this implementation of functionality designed to speed up utterance production that restricts the production of personal data, rather than the underlying technology.

So, in the current generation of AAC devices, the implications for both personal data generation and its use are relatively small because the linguistic capabilities are small. Even so, a range of authors [12,13] have acknowledged theoretical issues such as: anonymity; personalisation of services; identity management; autonomy; and the changing of relationship boundaries through mediation. With enhancements brought about through the use of internet-enabled services such as geo-location, more refined and more accurate logging of both speech creation and movement, and the ability to integrate the AAC device into a range of internet-enabled external services ranging from internet banking to on-line gaming, these personal data management challenges gradually cease to be theoretical. Moreover, recent work by [14–16] and [17] makes explicit use of personal data (about both the user and other parties) to improve the functionality of AAC devices; this results in a very direct, although not necessarily explicit, trading of personal data for personal capabilities. Such work puts previous abstract models of user privacy into sharp relief and exposes deep tensions between the aim of user empowerment (which these developments are aimed to promote) and the protection of AAC users from threats to their privacy and safety.
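The hierarchical page structure and the utterance-rate trade-off described above can be made concrete with a small sketch; the pages, phrases, and timing figures are invented for illustration and do not model any particular commercial device:

# Toy model of a hierarchical AAC page structure. Pre-programmed phrases
# are quick to select but fixed; free construction is expressive but slow
# (the literature cited above reports roughly 8-10 words per minute).
PAGES = {
    "home":     {"children": ["needs", "shopping", "clothes"], "phrases": []},
    "needs":    {"children": [], "phrases": ["I am hungry", "I'm cold", "Get the phone"]},
    "shopping": {"children": [], "phrases": ["How much is this?", "I'd like to pay"]},
    "clothes":  {"children": [], "phrases": ["The blue one, please"]},
}

FREE_CONSTRUCTION_WPM = 9  # mid-range of the 8-10 wpm reported above

def minutes_to_say(words, preprogrammed):
    """Rough time to produce an utterance of `words` words."""
    if preprogrammed:
        return 0.1  # navigate to a page and select a stored phrase
    return words / FREE_CONSTRUCTION_WPM

print(minutes_to_say(3, preprogrammed=True))    # 0.1 (about 6 seconds)
print(minutes_to_say(30, preprogrammed=False))  # ~3.3 minutes for a personal story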
2.1. Rapidly Evolving Community

The tension between user empowerment and user protection becomes even more evident when the trajectory of the development of this type of ubiquitous device is examined. The more sophisticated the use of the AAC device becomes, the more able an AAC user is to produce personal information and make it available to those with whom they want to converse. At the same time, the more access an AAC user has to personal information related to other people, the more effective their conversations can become.

The range of ways in which personal data is harnessed in the AAC sector is increasing dramatically. Apple's iPad has caused a huge investment in AAC software for tablet technology. Multiple third party applications (e.g. Proloquo2Go,2 JabTalk,3 and Talkadroid4) already exist that allow this new range of tablets to function as AAC devices. Information from the (AAC) Domesday Dataset [8] shows widespread adoption of tablet technology with implicit, and typically mandatory, usage of cloud storage [18]. The overall effect of the tablet entry into the marketplace is that hardware devices are becoming more standardised in terms of size, shape, capabilities, and design, while at the same time the software that runs on such devices is exploding in terms of variety and capability, particularly in terms of the ways that personal data can be exchanged for enhanced and more efficient functionality.

The lowering of the barrier of entry to AAC technology foreshadows the "panoply of different privacy problems" that privacy theorist [19] envisaged when contemplating worlds that are mediated through technology in an increasingly complex manner. For this community of ubiquitous computing users, the privacy problems include: establishing privacy for the AAC user that is separate from their carers; the communication of privacy options to non-literate users; the balance between the logging of user activity to promote more rapid or richer speech and user activity logging to protect the user; and the ability of the AAC user to communicate and experiment with the identity that they project. AAC devices are designed to increase social interaction in all settings and therefore the devices, and their supporting approaches, must be able to respond to all settings, as is the case with ubiquitous technology. Also, AAC users themselves develop their uses and desires for communication [12] as their identity evolves and their relationship with their AAC device changes. Therefore, any approach to personal data management has to be highly context sensitive and capable of responding to changing requirements.

As the capability for interaction increases, both in terms of the user's capabilities and the functionality of the AAC software, the potential for increased personal data also increases. This has been somewhat exacerbated by a new generation of devices that are capable of using tablet hardware to geo-locate and mine social media to increase language support. The variety of personal information stored on AAC devices includes not only information about the data subjects themselves but also carers and other members of a support structure: for example, devices that log utterances made by a literate user implicitly record the whereabouts of staff members, the exact times of conversations and the subjects they discuss. Similarly, utterances that have been generated from input by teachers, care staff and parents can again potentially contain information about other individuals, as well as increase the range of information about device users themselves. There are also more mainstream issues: internet access as a medium brings a range of issues for personal data use in terms of the methods used to broadcast and replay utterances, and it greatly increases the possibilities for data input (potentially including information about third parties) into the utterances.

2 http://www.proloquo2go.com, retrieved May 2013.
3 http://www.jabstone.com/, retrieved May 2013.
4 http://www.epidream.com/, retrieved May 2013.
The general browsing facility of internet access increases the ability of users to communicate with the wider world, carrying with it a set of personal data management and privacy issues, much of which is the subject of ongoing research [20–22] that is focused on personal data management within mainstream society. Interestingly, such research does not consider the existing practices of ubiquitous technology user communities for whom the standard and existing information management practices are not effective or workable. As has already been identified, the constraints under which personal information management practices and technologies have to operate are considerable, and are constraints that apply to a spectrum of ubiquitous technology users, not just those dependent on AAC technology. One could then argue that this is not so much a specialised use-case but an extreme use-case that has something to offer wider parts of society.
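A toy example makes the third-party problem described above concrete: even a plain utterance log, kept to improve language support, doubles as a timestamped record of a named carer's presence, and redaction is one possible workaround. All names and entries below are invented for illustration:

# An utterance log implicitly records third parties: filtering for a
# carer's name recovers when that person was present; redacting names
# before the log leaves the device is one mitigation. Invented data.
LOG = [
    ("2013-05-01 09:02", "Good morning Anna"),
    ("2013-05-01 12:15", "I am hungry"),
    ("2013-05-02 09:05", "Anna, get the phone"),
]
CARER_NAMES = {"Anna"}

def sightings(log, name):
    return [(ts, text) for ts, text in log if name in text]

def redact(text, names=CARER_NAMES):
    for name in names:
        text = text.replace(name, "[carer]")
    return text

print(sightings(LOG, "Anna"))       # two timestamped records of Anna's presence
print([redact(t) for _, t in LOG])  # the same log with carer names removed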


2.2. Related Work in AAC Literature

Although ethical issues in the context of complex disabilities are well studied, there is little direct research into privacy and personal data management issues in AAC; much of the work appears in small accompanying sections to other research contributions and focuses directly on personal data dissemination. For example, [23] notes that externally displayed lexicons (such as a communications board) violate some aspects of privacy and proposes finding ways to ensure that vocabulary can be delivered discreetly without affecting access. Additionally, there is some meta-work that looks at the ethics of research into AAC rather than AAC itself: [24] notes that the data collected by AAC devices makes identification of the individual trivial, especially when considering the relatively small pool of users, a theme that is also examined in work by [25] on logging output of AAC devices.

Privacy has also been raised explicitly in the AAC community by researchers considering design frameworks for next generation devices, e.g. [12] and [13]. Work on the future of AAC and internet connectivity (in particular the key features highlighted in [13]) has great bearing on personal data management, although privacy and personal data management are not directly discussed. [13] discuss simplified usability, including 'embeddedness' functionality: making AAC devices unobtrusive in their environment. When simplifying usability, there is a tension between requiring user intervention and automating decision making. For example, where should consent mechanisms related to personal information disclosure be placed in such devices? Should such mechanisms be usable by carers as well as AAC users and, if so, how is informed consent ensured?

3. The Design Challenge

The ENISA report perhaps most clearly highlights and exemplifies the best practice advice that is typically given to users and providers of ubiquitous technology. The content of the ENISA report focuses on life-logging applications and was created by a number of specialists with an outstanding track record in personal data management research and practitioner support. The report is a well researched collection of use-cases for typical, mainstream immersive ubiquitous computing use, and the life-logging scenario means that it is particularly relevant to use it as a comparator with AAC-use scenarios.

The ENISA report highlights that three groups of stakeholders have a responsibility for responding to personal information management risks in internet-mediated communications: the individual, the service provider, and the state. These three groups of stakeholders are equally relevant for AAC-use scenarios [26]. However, in the AAC context, the user community represents a significant challenge to designing personal information management functionality as part of internet services: as already outlined in this paper, much of the best practice guidance does not take into account the constraints that face parts of the AAC community, and therefore the focus becomes designing technology and practices for the user rather than to the guidelines.

The ENISA report points out that users must take advantage of and use privacy protection and security functionality, and this is a common thrust of best practice advice when it comes to managing personal information. In terms of the individual, the best practice advice is based on the premise that individuals must be better informed and that, by being better informed, the user has control of their own data. However, the recommendations do not refer to the non-user roles. The non-user is the 'significant other' in the AAC usage story. It is a regular theme in Human Computer Interaction literature that the non-user plays an important role in technology use [27], although they may never become users of that technology [28]. In AAC, the significant non-user is an important role and one that is often adopted by members of the carer team, including family, but also potentially by teachers, healthcare professionals and friends. In many cases an AAC user's interaction with the world around them is mediated not only through the AAC device but also through a carer or other non-user. This poses a design challenge because delegation rights need to be developed and made accessible to both the AAC user and carer, and, at the same time, an empowered user should be able to control access to their personal data – including the ability to forget and hide utterances (a sketch of such a scheme follows below). In addition, the role of the non-user extends the boundaries of the AAC system to include the social protocols that are used to manage and engage with the device.

The recommendations from the ENISA report also characterise the most significant challenges to an individual's take-up of best practice as: being poorly informed; protection technologies being too inaccessible; and a lack of empowerment to take control. In the case of AAC users, these problems exist, but so do problems related to the ability to: forget utterances that are regretted, hide utterances from significant non-users, and control the identity projected of themselves and others. These forms of personal data management pose a design challenge for AAC technology designers because the needs and rights of the AAC user have to be balanced with the duty of care towards the AAC user which carers feel is placed on them.

The report recommends that the service provider designs life-logging services with accessible, privacy-friendly default configurations and settings. The report also recommends that the service provider performs impact assessments and assesses the personal information management risks. Finally, there is a recommendation that the service provider is transparent about access to data and with whom it is shared. Service providers are also called upon to make individuals aware of, and able to control, the privacy risks associated with use of life-logging services. From a technological perspective, use of encryption and stronger authentication is advocated. The service provider is also encouraged to use multiple data stores and to control access to those data stores.
All of these requirements pose design challenges for AAC technology designers who often have little control over these issues, either because of the platforms upon which the AAC software is designed or because of inherent problems with the architecture of bespoke AAC devices. There is also a question of education; AAC designers are not security specialists and have typically not had security engineering training.


The state, on the other hand, is encouraged to create a regulatory environment that provides incentives for privacy-aware or privacy-friendly devices and services, while supporting competition through the promotion of interoperability and interconnection between devices, services and providers. The report also encourages the state to conduct impact assessments on service designs and to make citizens aware of both the benefits and risks of using life-logging services; more importantly, states should also aim to educate individuals about the risks and the ways to protect themselves (e.g. the inclusion of privacy training in computer science education). At a federal level, states are encouraged to harmonise laws and regulations across states. Regulators in general are encouraged to create strong incentives for companies to include user interface 'nudges' towards safer behaviour by customers, as well as to consider privacy requirements in the early stages of product development.

It is important to reflect on the fact that, whilst the AAC user community may be a hidden one, it is nevertheless part of society and resides under the same overall regulatory and legal framework as the rest of society. Whilst the best practice messages of the ENISA community are relevant, the AAC scenarios foreground constraints on the implementation of best practice that affect not only the AAC community but also other user communities that depend on assisted use, are affected by literacy issues, or are isolated from mainstream society. In response to these constraints, the AAC community demonstrates information practices developed to overcome them and to put privacy controls in place. The next section outlines these information practices and considers their usefulness for other parts of society.


4. You Are what You Disclose: Management of Everyday Tensions in Personal Data Management

Many everyday tensions faced by AAC users have parallels in the themes and scenarios projected in the ENISA and SWAMI reports. In particular, the life-logging context described in the ENISA report presents many of the trade-offs that AAC users and their assistors must negotiate. This section examines three such tensions in personal data management. The examples are constructed with input from legal experts, speech and language therapists, youth work practitioners and disability officers, as well as from observations of non-literate AAC users. They focus on non-literate use of AAC, a user context for which traditional privacy and security technologies are not a realistic solution and in which techniques to stimulate informed consent are also not usable. The responses describe forms of information practice that can be seen as a type of security and privacy 'in the wild'. Security and privacy practices in the wild have been observed in typical user communities [29–31] but not previously in disabled communities.

It is important to recognise, when considering the following examples, that the AAC community is very wide and all combinations of different levels of cognitive, social and physical capability are represented. To categorise and map the solution space is far beyond the scope of this work. Instead we choose several recurring solutions that have particular resonance with issues faced by mainstream users. As with all information practices, these examples are of innovative solutions found by a number of individual users, rather than responses from the AAC community as a whole.


4.1. Example 1 – Communicating who You Are

Manufacturers of AAC devices take pains to provide devices with carefully designed sets of pages that they believe are most useful for the communication needs of the general user (users can build their own from scratch, but typically they simply edit an existing framework to better suit their needs). One of the communication needs recognised by manufacturers and Speech and Language Therapists alike is that a user will want to talk about themselves, and many page sets include a templated page to do just that. Examples of phrases on an 'about me' page might include fill-in-the-blank sentences such as "My name is X", "I am X years old" and "I have X brothers and Y sisters", along with space for other personal information, for example "I really like Disney films". A major part of preparing a device for a new user is the completion of such 'about me' pages.

This 'about me' page replicates, almost exactly, the 'about me' pages that users of social media have on services like Facebook and LinkedIn: showing what a user likes, which social groups they belong to, and other aspects of what they want to project about themselves. For some literate AAC users this 'about me' set of utterances effectively represents their 'cocktail party' level of conversation, allowing participation in a social ritual that potentially has no goal other than the social interaction itself. For others, however, such a system can quickly become out of date in terms of their preferences (for example, "My boyfriend's name is Jeff"). Whereas a mainstream user of social media, or a literate AAC user, can periodically and silently alter their interests or other aspects of their social identity, non-literate AAC users lack this luxury: almost all changes to the device require a high level of engagement with care staff, as the user must explain to the care staff some fairly high-level concepts about change and make very explicit their desire to change what biographical data is communicated about them. The carer, of course, has the ability to interpret and adjust what is communicated. When this page represents more than just interaction, it illustrates the limits of the control a non-literate user can have over the functionality to express identity.

An observed work-around adopted by some AAC users in response to this limitation amounts to a wholesale rejection of the concept of digital presentation and of the 'about me' page, because its functionality is not fit for the desired purpose. Instead, the core language in the rest of the voice is used to express likes, dislikes, passions and dreams, which makes for a more fluid identity constructed of feelings and emotions rather than descriptions and attributes. The expression of identity is performed through the expressivity of the language rather than through projected biographical details, and offers non-literate AAC users greater control over the identity that is projected of them. The development of a more expressive language that projects identity is perhaps also of value to mainstream users of social network technology and life-logging applications. From the personal page, to Myspace, to Facebook and LinkedIn, to Twitter, there has been progressively less space devoted to the static display of the user-as-snapshot, and greater expression of what the user has been doing recently.

Personal information practices that encourage greater expressivity of identity are a further means by which users retain control over the identity that is projected about them. Rather than focusing solely on internet safety, education that encourages users to explore their on-line expressivity may also help users to set and control boundaries.


4.2. Example 2 – The Ability to Hide Utterances

The limitations of the static identity discussed above represent a specific case of a more general problem – the ability to hide or remove utterances. If a user has had the personal phrase "Becky is definitely going to win this year's X-factor" or even "My girlfriend Susan loves me very much" permanently added to their device, the non-literate AAC user may not wish to draw attention to it by asking for it to be removed if they change their mind or want to deny it, instead choosing simply to ignore the existence of the phrase.

We can draw parallels between the list of all utterances programmed into an AAC user's device and the list of all comments and statements made by a user on social media. Again, typical users have the luxury of being able to periodically 'curate' their social media stream: removing unwise commentary, deleting things that do not meet the standards of sober reflection, and potentially reclaiming their identity by, for example, removing all photos of themselves with an ex-boyfriend or girlfriend. Such an action is, by nature, a very private one, and again a non-literate AAC user cannot take such a set of actions without drawing direct attention to it (this also reflects the complex and very real factor that so much effort has been directed at putting more capability into AAC devices that any effort to reduce the expressivity of the device is not in line with the design ethos).

Consequently, the response from the AAC user community is typically one of community values and principles rather than of personal information management practice or technology redesign. In general, the wider AAC community holds to the principle that an AAC user does not have to hold a view or want to say a thing simply because they have the option to, in the same way that debaters can argue in favour of a position contrary to the one they hold. It is understood by the community that a statement may have been entered by carers, perhaps long gone, or be part of a joke or a page set, and is therefore not necessarily attributable to the AAC user.5 The overriding principle that the AAC community abides by is thus that AAC users are not solely accountable for the content on their devices, and utterances are confirmed and contextualised by communication partners.

We can draw a parallel here with some social network strategies seen amongst today's social-media users – the cultural tropes of 'vaguebooking', 'schoolboying' and being 'hacked' all, perhaps not intentionally but certainly effectively, give users a degree of deniability for their own posts [32], and acceptance by the community that content is not necessarily attributable to the social-media user. However, wider society has a significant way to go before it accepts that the content of a ubiquitous device, and the content of social and life-logging applications in some contexts, is not necessarily controllable by the user, and that additional corroboration needs to be sought to assess the provenance of published content. In addition to internet safety education and training, the development of programmes to promote better understanding of the limitations of internet publishing and of the control of ubiquitous devices may encourage a reflection on societal values in this area.

4.3. Example 3 – The Ability to Forget

Unlike Section 4.2, which focused on utterances explicitly programmed into a device, this section focuses on the records of phrases spoken, both individual utterances and combinations, and on the challenge of forgetting what has been previously published.

5 The issues around post-literate users are yet more complex and will be explored in future work.


Device: Dad Cat
Partner: Would you like to tell your Dad about a cat Dave?
User: giggles and nods head
Device: Dad Cat
Partner: What is it that you'd like to tell your Dad about the cat?
Device: Dad Cat Car
Partner: Oh – you think we should send Dad to get a cat?
User: nods, grins

Figure 2. Example script between an AAC user and their care staff. 'Device' lines are utterances in the device history, 'User' lines are motions by the user, and 'Partner' lines are speech by the communication partner.

It would be natural to assume that conversations had with AAC users become part of a corpus of information: unlike spoken conversation, AAC devices create embodiments of conversations that can be permanently stored or logged. These conversations then become data that largely concerns living individuals, either the users themselves or their family and friends. The permanent nature of these embodiments means that users can potentially lose the ability to forget their utterances, to delete their utterances, and to configure the ways in which their identity is projected. Indeed, issues of personal data for post-literate AAC users include cases of users wanting to ensure that statements they have given to doctors, police and loved ones are unrecoverable.

Direct 'forget' functionality is difficult for AAC devices because they are memory devices: utterances are stored for the lifetime of the device (and in some cases the devices themselves function as an external memory store for the user). AAC devices are designed this way because of the dependency that a user has on the technology and the severe impact that losing their 'voice' can have. However, the way in which AAC conversations work actually reduces, and potentially removes, the problem of forgetting. In practice, conversations with non-literate AAC users are not subject to anything like the problems of, say, emails or blogging. This is partly because AAC devices do not yet routinely record conversations (logging for research purposes is discussed in [25]) but, even for the ones that do, the nature of the conversations had with non-literate AAC users is such that the information recorded is of little use without the recollections of either the user or the communication partner. Figures 2 and 3 illustrate the degree to which, for some AAC users, the communication partner, the location and the user's physical movements give semantics to the utterances. This is therefore a form of mixed media or mixed mode conversation and, combined with the logging mode options outlined below, provides a natural means of degrading the ability to remember a conversation.

We consider two ways of logging AAC conversations. The first is at the phrase level, keeping track of the phrases used in the last hour, day, or seven days: this is often a useful feature for the user as it helps them repeat recent comments or carefully constructed phrases (for example, one might want to say "We were late because of heavy snow at Junction 14" quite regularly on the evening that one was late, but not necessarily have it on the device for posterity). The second is recording at the level of individual button presses, a feature intended to be used by carers and Speech and Language Therapists to see how much use is being made of various aspects of the device (this is an 'opt in' feature often included in Dynavox brand devices).


Device: Dad Cat
Partner: Would you like to tell your Dad about a cat Dave?
User: laughs and shakes head
Device: Dad Cat
Partner: Is it something that's happened before?
Device: Dad Cat Car
Partner: Oh – you want us to tell the present 3rd party about the time that Dad took the cat in the car?
User: laughs

Figure 3. Another example script between an AAC user and their care staff, using the same conventions as Figure 2.

Interestingly, these logging modes, particularly when used together, replicate the gentle degradation of human memory much more closely than typical electronic systems do. If, for example, a digital camera's memory degraded in a similar way to human memory, then the oldest images would gradually lose their resolution and detail as new images were added; eventually the older images would merge together and then disappear entirely. In the case of AAC devices we see that much of the information degrades in a similarly gradual way. When AAC phrases are first constructed, we can recall both their structure and the fact that they were created in the last hour. After the hour we know only that they were created in the last day, and shortly afterwards only that they were created in the last week.6 Following the end of the week (devices with this capacity often allow users to specify the amount of time before phrases are removed) the information may exist only in the count of button presses on the device, and as time passes and more events are recorded as part of the button press count, the information gracefully degrades.

6 For space reasons we gloss over a certain amount of technical detail of timestamps, but this is programmatically solvable.

In this way, some features of AAC design present an elegant case study of an electronic device whose memory gradually degrades, not by intentional design, but as an emergent property of the interface. This combination of technology design and personal information practice is perhaps something that should be considered in the design of ubiquitous technology and of social and life-logging applications for the wider community. As an approach it offers a range of possibilities, enhances the user's control over the ability to remember and to forget, and is perhaps more natural and sympathetic to the particular technology and service in use.
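To make the degradation mechanism concrete, the following minimal sketch (ours, not drawn from any real AAC product; the class name PhraseLog and its parameters are hypothetical) shows how a phrase-level log with a retention window, combined with an aggregate button-press counter, yields this graceful degradation: expired phrases disappear from the log while only coarse usage counts persist.

import time
from collections import Counter

class PhraseLog:
    # Phrase-level log with a retention window, plus aggregate button counts.
    def __init__(self, retention_seconds=7 * 24 * 3600):  # e.g. one week
        self.retention = retention_seconds
        self.entries = []               # (timestamp, phrase) pairs
        self.button_counts = Counter()  # aggregate counts survive expiry

    def record(self, phrase, now=None):
        now = time.time() if now is None else now
        self.entries.append((now, phrase))
        for button in phrase.split():   # treat each word as one button press
            self.button_counts[button] += 1

    def recent_phrases(self, now=None):
        # Expire old phrases; afterwards only the button-press counts remain,
        # so the record degrades gracefully rather than vanishing outright.
        now = time.time() if now is None else now
        self.entries = [(t, p) for (t, p) in self.entries
                        if now - t < self.retention]
        return [p for (_, p) in self.entries]

log = PhraseLog()
log.record("Dad Cat Car", now=0)
print(log.recent_phrases(now=3600))   # ['Dad Cat Car'] within the window
print(log.recent_phrases(now=10**7))  # [] - the phrase is forgotten after a week
print(log.button_counts["Cat"])       # 1 - only the aggregate count persists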

5. Conclusions

Advances in natural language generation and speech processing techniques, combined with changes in the commercial landscape, have brought dramatic improvements in the design of AAC devices within reach. These improvements, though overwhelmingly positive, amplify a family of personal data use problems. This work has argued that the challenges faced by AAC users in managing their personal information can be generalised to other communities affected by low digital literacy, low literacy levels and cognitive challenges. Accordingly, this paper explored personal data management problems


and considered some of the workarounds that AAC users have developed to protect their personal information. These everyday workarounds point to a different cultural reality, and therefore to a different design fiction, one that security and privacy technology design tends to ignore. A combination of personal information management practices, cultural principles and values, and technology design offers alternatives and workarounds for the difficult problems of personal data management, content attribution and the ability to forget and remember content. Perhaps, in the early part of the 21st century, mainstream society could take a lead from this liminal vanguard of ubiquitous computing users and adopt and extend some of their techniques for everyday technology use.

Acknowledgments

The authors wish to thank the many individuals whose tireless efforts contributed to this work, in particular Kate Williams, Liz Panton, Sarah Moffat, Claude Heath, Stephen Bieniek and many others.

References

[1] L. Bannon, "Reimagining HCI: Toward a more human-centered perspective," Interactions, vol. 18, no. 4, pp. 50–57, 2011.
[2] P. Dourish and G. Bell, ""Resistance is futile": Reading science fiction alongside ubiquitous computing," Personal and Ubiquitous Computing, 2008.
[3] G. Bell and P. Dourish, "Yesterday's tomorrows: Notes on ubiquitous computing's dominant vision," Personal and Ubiquitous Computing, vol. 11, no. 2, pp. 133–143, 2007.
[4] J. Tanenbaum, K. Tanenbaum, and R. Wakkary, "Steampunk as design fiction," in Proceedings of the 2012 ACM Annual Conference on Human Factors in Computing Systems, pp. 1583–1592, ACM, 2012.
[5] J. Bleecker, "Design fiction: A short essay on design, science, fact and fiction," Near Future Laboratory, vol. 29, 2009.
[6] I. Askoxylakis, I. Brown, P. Dickman, M. Friedewald, K. Irion, E. Kosta, M. Langheinrich, P. McCarthy, D. Osimo, S. Papiotis, et al., "To log or not to log? – Risks and benefits of emerging life-logging applications," 2011.
[7] P. Alahuhta, P. De Hert, S. Delaitre, M. Friedewald, S. Gutwirth, R. Lindner, I. Maghiros, A. Moscibroda, Y. Punie, W. Schreurs, et al., "Dark scenarios in ambient intelligence: Highlighting risks and vulnerabilities," SWAMI Deliverable D, vol. 2, 2005.
[8] J. Reddington, "The domesday dataset: Linked open data in disability studies," Journal of Intellectual Disabilities, 2013.
[9] G. Soto, E. Hartmann, and D. Wilkins, "Exploring the elements of narrative that emerge in the interactions between an 8-year-old child who uses an AAC device and her teacher," Augmentative and Alternative Communication, vol. 22, no. 4, pp. 231–241, 2006.
[10] D. Beukelman and P. Mirenda, Augmentative and Alternative Communication: Supporting Children and Adults with Complex Communication Needs, 3rd ed. Paul H. Brookes, Baltimore, MD, 2005.
[11] D. J. Higginbotham, H. Shane, S. Russell, and K. Caves, "Access to AAC: Present, past, and future," Augmentative and Alternative Communication, vol. 23, no. 3, pp. 243–257, 2007.
[12] T. Rackensperger, C. Krezman, D. McNaughton, M. Williams, and K. D'Silva, ""When I first got it, I wanted to throw it off a cliff": The challenges and benefits of learning AAC technologies as described by adults who use AAC," Augmentative and Alternative Communication, vol. 21, no. 3, pp. 165–186, 2005.
[13] F. DeRuyter, D. McNaughton, K. Caves, D. Bryen, and M. Williams, "Enhancing AAC connections with the world," Augmentative and Alternative Communication, vol. 23, no. 3, pp. 258–270, 2007.
[14] R. Patel and R. Radhakrishnan, "Enhancing access to situational vocabulary by leveraging geographic context," Assistive Technology Outcomes and Benefits, p. 99, 2007.
[15] R. Black, J. Reddington, E. Reiter, N. Tintarev, and A. Waller, "Using NLG and sensors to support personal narrative for children with complex communication needs," in Proceedings of the NAACL HLT 2010 Workshop on Speech and Language Processing for Assistive Technologies, Los Angeles, California, pp. 1–9, Association for Computational Linguistics, June 2010.
[16] E. Reiter, R. Turner, N. Alm, R. Black, M. Dempster, and A. Waller, "Using NLG to help language-impaired users tell stories and participate in social dialogues," in European Workshop on Natural Language Generation (ENLG-09), 2009.
[17] J. Reddington and N. Tintarev, "Automatically generating stories from sensor data," in Proceedings of the 15th International Conference on Intelligent User Interfaces, pp. 407–410, ACM, 2011.
[18] L. Coles-Kemp, J. Reddington, and P. Williams, "Looking at clouds from both sides: The advantages and disadvantages of placing personal narratives in the cloud," Information Security Technical Report, 2011.
[19] D. Solove, Understanding Privacy. Harvard University Press, 2008.
[20] E. Kani-Zabihi and L. Coles-Kemp, "Service users' requirements for tools to support effective on-line privacy and consent practices," in Proceedings of the 15th Nordic Conference on Secure IT Systems (NordSec 2010), 2010.
[21] P. Kumaraguru and L. Cranor, "Privacy indexes: A survey of Westin's studies," Institute for Software Research International, 2005.
[22] S. Spiekermann and L. Cranor, "Engineering privacy," IEEE Transactions on Software Engineering, vol. 35, no. 1, pp. 67–82, 2009.
[23] M. Smith, "The dual challenges of aided communication and adolescence," Augmentative and Alternative Communication, vol. 21, no. 1, pp. 67–79, 2005.
[24] L. Pennington, J. Marshall, and J. Goldbart, "Describing participants in AAC research and their communicative environments: Guidelines for research and practice," Disability & Rehabilitation, vol. 29, no. 7, pp. 521–535, 2007.
[25] G. Lesher, G. Rinkus, B. Moulton, and D. Higginbotham, "Logging and analysis of augmentative communication," in Proceedings of the RESNA Annual Conference, 2000.
[26] J. Reddington and L. Coles-Kemp, "Trap hunting: Finding personal data management issues in next generation AAC devices," in Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies, pp. 32–42, 2011.
[27] C. Satchell and P. Dourish, "Beyond the user: Use and non-use in HCI," in Proceedings of the 21st Annual Conference of the Australian Computer-Human Interaction Special Interest Group: Design: Open 24/7, pp. 9–16, ACM, 2009.
[28] J. Redström, "Towards user design? On the shift from object to user as the subject of design," Design Studies, vol. 27, no. 2, pp. 123–139, 2006.
[29] P. Dourish, R. E. Grinter, J. D. De La Flor, and M. Joseph, "Security in the wild: User strategies for managing security as an everyday, practical problem," Personal and Ubiquitous Computing, vol. 8, no. 6, pp. 391–401, 2004.
[30] L. Palen and P. Dourish, "Unpacking privacy for a networked world," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 129–136, ACM, 2003.
[31] J. Lingel, A. Trammell, J. Sanchez, and M. Naaman, "Practices of information and secrecy in a punk rock subculture," in Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, pp. 157–166, ACM, 2012.
[32] A. Lenhart, M. Madden, M. Duggan, and A. Smith, Teens, Social Media, and Privacy. Pew Internet & American Life Project, 2013.
[33] M. Kamp, P. Slotty, S. Sarikaya-Seiwert, H. Steiger, and D. Hänggi, "Traumatic brain injuries in illustrated literature: Experience from a series of over 700 head injuries in the Asterix comic books," Acta Neurochirurgica, pp. 1–5.
[34] P. Golle, F. McSherry, and I. Mironov, "Data collection with self-enforcing privacy," ACM Transactions on Information and System Security (TISSEC), vol. 12, no. 2, pp. 1–24, 2008.
[35] J. Cornwell, I. Fette, G. Hsieh, M. Prabaker, J. Rao, K. Tang, K. Vaniea, L. Bauer, L. Cranor, J. Hong, et al., "User-controllable security and privacy for pervasive computing," in Eighth IEEE Workshop on Mobile Computing Systems and Applications (HotMobile 2007), pp. 14–19, IEEE, 2007.
[36] C. Karat, C. Brodie, and J. Karat, "Usable privacy and security for personal information management," Communications of the ACM, vol. 49, no. 1, pp. 56–57, 2006.
[37] C. Bonnici and L. Coles-Kemp, "Principled electronic consent management: A preliminary research framework," in 2010 International Conference on Emerging Security Technologies, pp. 119–123, IEEE, 2010.
[38] L. Coles-Kemp and E. Kani-Zabihi, "On-line privacy and consent: A dialogue, not a monologue," in Proceedings of the 2010 Workshop on New Security Paradigms, pp. 95–106, ACM, 2010.
[39] S. Balandin, N. Berg, and A. Waller, "Assessing the loneliness of older people with cerebral palsy," Disability & Rehabilitation, vol. 28, no. 8, pp. 469–479, 2006.
[40] J. Todman, N. Alm, J. Higginbotham, and P. File, "Whole utterance approaches in AAC," Augmentative and Alternative Communication, vol. 24, no. 3, pp. 235–254, 2008.
[41] S. Reilly, J. Douglas, and J. Oates, Evidence-based Practice in Speech Pathology. Whurr, London, 2004.
[42] L. Wittgenstein, Philosophical Investigations (trans. G. E. M. Anscombe). Basil Blackwell, 1956.


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-74

Privacy Value Network Analysis1

Adam JOINSONa,2 and David HOUGHTONb
a Bristol Business School, University of the West of England
b Birmingham Business School, University of Birmingham

Abstract. In this chapter we present a new method for visualising the use of personal information by stakeholders, and the transfer of value between those same groups based on the provision of tangible and intangible goods. Value network analysis combines elements of social network analysis with value chains and social capital in order to identify where value is generated in a network of stakeholders. The privacy value network (PVN) approach develops this methodology to identify the ways in which the value of personal information is realised across a network of stakeholders. Privacy value networks also introduce the notion of information costs within the model, and tracking personal information across a network allows for the identification of both exogenous and endogenous costs. At present, the PVN approach is primarily an analytic and visualisation tool, but in the future it should be possible to quantify value and costs across the network, and to calculate the degree of value/cost balance (and imbalance).

Keywords. Privacy, disclosure, network analysis, value, personal information


Introduction

Privacy has been described in the past as "a concept in disarray" [1, p. 1], and as "a chameleon-like word, used denotatively to designate a wide range of wildly disparate interests" [2, p. 458], with Thomson [3, p. 272] concluding "that nobody seems to have any very clear idea what it is". Many of these problems come from the very nature of privacy itself – for instance, it can be a state of isolation as well as enabling people to set boundaries to allow for small group intimacy. It can be conceptualised as a human right or as an individual's interpersonal goals and intra-psychic motivations. These multiple approaches to conceptualising privacy make valuing privacy particularly difficult for both the data subject, and potentially also the data collector.

Moreover, in many interpersonal settings the disclosure or reserve of personal information each has potential gains and losses for the individual. For instance, taking the risk of disclosing a need to another simultaneously opens the possibility of value for the discloser being gained through both additional support options and the intrapsychic value of disclosure itself. However, opening of the self is also risky, as self-disclosure can reveal a personal vulnerability which could later be exploited by the recipient, or by third parties who are informed by the recipient [4].

1 The work in this chapter was supported by EPSRC grant (EP/G002606/1, "Privacy Value Networks"). We are grateful to two anonymous reviewers for their feedback on the chapter, and to Mina Vasalou for her input and insight during discussions about the project.
2 Corresponding Author.


The complex nature and interactions of possible gains and losses make valuing privacy (or openness) in interpersonal interactions challenging – not only for academic researchers, but also for the very people grappling with the decision whether to disclose sensitive personal information or to keep it to themselves. Antaki et al. [5] discuss how disclosure decisions are contextually embedded within day-to-day life and induce socially bound consequences which people must 'weigh up' – something many of us will have experienced when struggling with the decision whether or not to reveal information to another person. More recently, Joinson et al. [6] have described how the design of social network sites may create an experience for users akin to 'digital crowding' – leading to a loss of utility (and value) due to unfettered sharing of information across multiple audiences.

The interactions between individuals and organisations are almost as complicated in terms of the value of personal information and disclosure. That value is generated by the collection and use of people's personal information is undeniable, given the revenue and market capitalisations of organisations such as Google and Facebook. The value creation process has three key stages: the production of the value offering (i.e. building the service), value realisation (i.e. use of the service by a consumer) and worth capture [7,8]. The business model adopted by many of the 'free at point of use' technology companies requires sufficient worth to be captured from the consumer indirectly during value realisation in order to cover the production and maintenance costs, and to provide a profit for shareholders. In the case of many technology companies, this translates into a desire to increase the effectiveness of Internet advertising in order to increase revenue [9] – something that usually relies on access to ever-increasing amounts of personal information from users. One such method is to increase customer profiling and segmentation in order to serve more relevant advertising – for instance, through analysis of clickstream traffic (e.g. [10]), integration of social media into advertising (e.g. [11]) or tracking customers across multiple sites in order to improve targeting. Indeed, one of the promises of 'big data' and the associated analysis and visualisation of data is the additional value that can be found for sales and marketing by mining customer data [12].

However, there is considerable evidence that consumers are opposed to many of the core practices that underpin the extraction of value from free services. For instance, McDonald and Cranor [13] report that 46% of their respondents agreed that "It's creepy to have advertisements based on sites I've visited", and only 9% agreed that behavioural advertising was acceptable in exchange for a free email service. Turow et al. [14] report that 86% of young Americans would reject tailored advertising if it involved tracking them across multiple sites online. For organisations, understanding how users' personal information is used and value is extracted, alongside the potential privacy costs, is vital in order to avoid becoming the next 'privacy blow-out'.

We argue that, due to the complex web of inter-dependencies and interactions between stakeholders when dealing with the value of personal information, a network-centric systems approach is most appropriate. In the following pages, we explain why it is important to adopt a network-centric approach to understanding privacy and value, and how three key theoretical and methodological approaches – social network analysis, value network analysis and communication privacy management – provide the foundations of our proposed method.

1. Networks

Networks, and more specifically network approaches to understanding dynamic systems and interactions, have become central to many scientific endeavours, ranging


from understanding biological systems to power grids, and from behaviour within groups to food production and distribution [15]. Within everyday discourse, the notion of networks (whether it be an old boys' network, or parlour games connecting actors or people together) is well established.

Two key rationales are proposed for adopting a network-centric approach. The first is that any dynamic system requires a multifaceted approach to understanding. According to Wilson, "The greatest challenge today… in all of science, is the accurate and complete description of complex systems. [Scientists] think they know most of the elements and forces. The next task is to reassemble them [to] capture the key properties of the entire ensembles" [16, p. 85]. The second reason is "because structure always affects function" [15, p. 268]. Thus, to fully understand the use of personal information (i.e. how it is exploited, shared and transformed) by stakeholders, and the consequences of such actions, the structure of the network needs to be fully understood.

In the context of privacy, networks are more often implicitly invoked to aid the understanding of personal information sharing. At one end of the scale, Gavison's [17] concept of 'perfect privacy' denotes isolation and the lack of a network, while the theorising of behavioural scholars, e.g. [18], places privacy within the context of interaction between multiple parties. When discussing the sharing of personal information, the variety of possible actors supports a networked approach. Information sharing can occur between two or more individuals, between an individual and a group or organisation, between two or more organisations, or indeed between any combination of such actors. Often, incidents between seemingly isolated parties can involve other stakeholders within the network. For example, Wondracek et al. [19] discuss attacks on social network sites that de-anonymise data held by the service provider to an attacker's advantage. In this relationship it is not just the attacker and the individual who are involved, but also the social network service, Internet service providers, and possibly the contacts of the data subject.

Adopting a network approach can help to visualise and understand the flow of information exchanges amongst multiple stakeholders. However, utilising a network-centric methodology is not without its challenges [15]. Networks are complicated, especially if all possible nodes and connections are shown. For example, a customer using a store loyalty card can involve multiple nodes beyond simply the customer and the store – the customer's family may be invoked in the network, as may third parties who analyse the data provided, or even the postal service if traditional mail is used to send targeted advertising. Not only that, but the links between nodes can vary by content, valence, weight and direction of flow. Although a store loyalty card would seem to involve only the exchange of personal information for a tangible reward of some sort, there are also connections that signify the purchase, and intangible transactions (e.g. trust, loyalty). Alongside multiple types of links, there are also multiple types of nodes (or stakeholders), including individuals, organisations and regulators. Each will be connected to the network in different ways, and may be seeking multiple, perhaps competing, goals.

Networks also evolve over time, with seemingly minor changes sometimes leading to new patterns of interaction, and sometimes to major shifts in the nature of the network which extend well beyond the originator of the change. Taking the above example, a store may choose to sell its loyalty card data to a third party, introducing not only a new node, connection and pattern of information flow but, depending on the response of customers, potentially leading to flux in the network and an entirely new set of relationships and exchange patterns.


Given the complexity and challenges inherent in adopting a network-centric approach to understanding privacy, the approach adopted in the present chapter is to encourage simplification where possible while acknowledging that additional complexity is likely as a network continually evolves.


2. Social Network Analysis

Social Network Analysis (SNA) can be used to map the 'ties' (connections) between 'actors' (nodes), and to evaluate whether a tie exists directly between two actors (a 'strong tie') or whether the connection runs via a series of related actors (a 'weak tie') [20]. Within a social network, actors are interdependent, and the ties between them represent resource flow [21]. Social network analysis allows for the identification of the degrees of separation between actors (i.e. how many steps are required to link two actors) and the density of the network (i.e. the percentage of possible ties in existence). For example, a less agile (i.e. high degrees of separation), sparse network may indicate a reliance on 'weak ties', and social communication in the network may be improved by increasing the number of 'strong' ties. Fundamentally, the benefit of a social network approach is its emphasis on the relationships between actors [21].

Social network analysis has been applied to privacy. For example, Strahilevitz [22] suggests that the number of steps personal information has travelled through a social network may be a useful measure of the degree to which privacy has been violated (working on the assumption that personal information is regularly shared across strong ties, and less often, if at all, via weak ties).

Social network analysis – as applied to privacy – while valuably adding the dynamic nature of networks to our understanding, is also limited by the uni-dimensional nature of the ties mapped between nodes. When visualising the results of SNA, the nature of the ties between actors is ambiguous (in terms of perceived or real value, or how the actors communicate, for instance). From an SNA map, it can only be observed whether actors are connected to one another, not the nature of any connections. SNA visualisations – of the type useful for easily and graphically understanding the complexity of connections (see Fig. 1) – can inform of the ties between actors, but not usually whether they are connected through friendship, rivalry, occupation or family, and they cannot deal with multiple types of transaction and communication between actors [23]. In addition, the overall value of the network, and whether there are intangible as well as tangible transfers between the actors, is not evident [23]. However, a development of SNA in the organisational field – value network analysis – embeds semantics into the network visualisation and identifies multiple ties between actors.
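As a small illustration of the two measures mentioned above (degrees of separation and density), consider the following sketch using the open-source networkx library; the actors and ties are invented for the example and do not come from the chapter.

import networkx as nx

g = nx.Graph()
g.add_edges_from([
    ("Alice", "Bob"), ("Bob", "Carol"),   # Alice reaches Carol via Bob
    ("Carol", "Dave"), ("Alice", "Eve"),
])

# Degrees of separation: the number of steps required to link two actors.
print(nx.shortest_path_length(g, "Alice", "Dave"))   # 3

# Density: the proportion of possible ties actually in existence.
print(nx.density(g))                                 # 0.4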

3. Value Network Analysis

Value Network Analysis (VNA) was originally developed from an organisational perspective as a method to create value within company networks by improving communication and information flow [23]. VNA enables the economics of a network to be visualised, and can inform structural changes that produce value. VNA enables adaptation of the network to changes in its internal or external environment; in addition, it is both dynamic and temporal. In contrast to social network analysis, VNA does visually identify the properties of deliverables (the item transferred in the transaction/tie between actors), both between actors and for the network as a whole.


Figure 1. Example Social Network for Organisation X.

It determines the directionality and sequence of each tie, allows for multiple ties between actors, and determines the type of deliverable. Deliverables can be either tangible, i.e. the formally contractual and expected exchanges between actors, or intangible, i.e. the unexpected, non-contractual and often overlooked informalities of exchange, such as trust or spoken advice [23]. VNA usually refers to two different intangibles that can be transferred between actors – knowledge (e.g. market research) and 'other' intangibles (e.g. credibility).

The value applied to the deliverables can be social, environmental or economic (i.e. a large delivery of coal can be perceived as a negative environmental value as well as a positive economic value). However, perceived value is also important, as not all deliverables can be quantified in specifically agreed terms. While tangible deliverables can be quantified in monetary or purely economic terms, the non-contractual, unexpected intangible deliverables cannot [23]. Intangibles received by an actor must be used to produce a tangible output, to create monetary value, or be passed on to maintain a perceived intangible value. Transactions and deliverables in a value network can be anything, thus enabling a value network to be applied to almost any context. However, the understanding of tangible and intangible deliverables needs to be discussed within the applied context, as each type of transaction has multiple definitions. Figure 2 illustrates an example value network for the fictional 'Organisation X', which retails televisions in the UK.

Regardless of the applied context, VNA ideally requires all actors to be present to initially map the network. Their input is necessary since predicting who they communicate with, i.e. which intangible knowledge they transfer to or receive from other actors, is error prone [23]. From an organisational perspective, it is also vital for actors to be present to determine what occurs in the network as a whole and whether their inputs/outputs can be altered to improve value flow.
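One way such a map might be encoded in software (a sketch under our own assumptions, not a tool described in the chapter) is as a directed multigraph whose edges carry the deliverable, its tangible/intangible type and its sequence number; the actors echo the fictional 'Organisation X' example but are otherwise invented.

import networkx as nx

vna = nx.MultiDiGraph()
# 'seq' records the sequence of each tie; 'kind' types the deliverable.
vna.add_edge("Customer", "Organisation X",
             deliverable="payment", kind="tangible", seq=1)
vna.add_edge("Organisation X", "Customer",
             deliverable="television", kind="tangible", seq=2)
vna.add_edge("Customer", "Organisation X",
             deliverable="loyalty", kind="intangible", seq=3)

# List only the informal, non-contractual (intangible) exchanges.
intangibles = [(u, v, d["deliverable"])
               for u, v, d in vna.edges(data=True)
               if d["kind"] == "intangible"]
print(intangibles)   # [('Customer', 'Organisation X', 'loyalty')]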


Figure 2. Example Value Network for Organisation X.

Value network analysis has not previously been applied to issues of privacy and personal information transfer. However, because of the flexibility of the tangible and intangible connections and transfers, it is well suited to a consideration of the wider impact of personal information flow across networks, and of the impact of such flow on all stakeholders. Since an intangible value can include such diverse constructs as social capital, trust and community, value network analysis provides a method by which the transfer of an intangible can be tracked across a network. It also provides a method for visualising potential losses of value (or costs) not only at the data subject level, but also across the network. For instance, the provision of false information to a social network service provider may provide value to an individual (through the protection of privacy), but may lead to lower overall value across the network due to the potential loss of trust in a system comprising false identities.

While VNA can provide an insight into privacy and the impact of personal information leakage across a network, there are some limitations when it is applied to privacy. For instance, in VNA information is allocated a value (or cost) of a particular type within the network, but there is no method for identifying how (or why) such information exchange may change in either type or polarity. Thus, the depth of value appropriation tends to be ignored. For instance, disclosing personal information to close friends and family may garner positive value in the form of emotional support; if this information leaks outside of this close set of ties, it may lead to negative value for the data subject in the form of humiliation or embarrassment. Neither VNA nor SNA provides a system for understanding the reasons why this might occur. Therefore, we adapt


communication privacy management to contribute the final layer of analysis in our approach to understanding networked personal information flow.


4. Communication Privacy Management

Communication Privacy Management (CPM) is a theory of boundary and dialectical negotiation in the management of privacy. CPM places private information at the epicentre of communication, and proposes that individuals decide to disclose to others based on the implicit or explicit permeability of the boundaries surrounding the self and the potential recipient(s) of the information [4]. In deciding whether or not to disclose private information, a variety of dialectical issues are considered, each acting on an opposing continuum. These include disclosure–privacy, concealing–revealing, public–private, openness–closedness and autonomy–connectedness.

According to CPM, the flow of private information occurs within three boundaries: personal, dyadic or collective. A personal boundary surrounds the individual, preventing private information disclosure to another. A dyadic boundary forms around two individuals, and a collective boundary forms around three or more individuals, all of whom share private information amongst each other. Choosing to reveal or conceal information within the boundaries is subject to a process of rule management, which occurs through privacy rule foundation, boundary coordination operations and boundary turbulence [4].

Privacy rule foundations. Privacy rules are founded in two ways. First, they are shaped by individual characteristics such as culture, gender, disclosure motivations, the context of the disclosure and a risk–benefit assessment. Second, rules are taught when individuals enter pre-existing boundaries (e.g. family), or they are negotiated as new boundaries are formed with others. These acquired rules can remain stable, becoming part of a person's sharing habits, or they can evolve over time [4].

Boundary coordination operations. In our daily information exchanges, we constantly forge new, diverse dyadic and collective boundaries. A dyadic boundary shared with an intimate partner, for example, is very different from a dyadic boundary shared with a parent. Through a process of boundary coordination, members of these contrasting boundaries are able to implement privacy rules. Boundary coordination is conducted through forging links with others, altering the permeability of the boundary itself, and taking or relinquishing boundary ownership [4]. Linkages may occur when two boundaries become connected, deliberately or inadvertently, through the sharing of information. Permeability represents the openness of the boundary and the subsequent bi-directional information flow. Boundary ownership refers to how the discloser and co-owners perceive their privileges and obligations in regard to the private information they co-own.

Boundary turbulence. Key to CPM is the interpersonal nature of disclosure. Information shared by one party is received by another, and becomes co-owned. Co-owners must then negotiate rules to constrain the information within their dyadic boundary. Turbulence occurs when the receiving party does not recognise that the information is co-owned, or disregards the negotiated rules. When turbulence is caused, the owner often applies sanctions against the individual(s) who disclosed the private information, such as excommunication, withholding further information in the future, or reprimands [4].


5. Privacy Value Network Analysis (PVNA)

The Privacy Value Network Analysis (PVNA) method is grounded in both SNA and VNA, with CPM providing a method to identify the boundary conditions of information disclosure. Within PVNA the key elements are nodes (or actors), ties (the connections between nodes), and deliverables (that which is exchanged from one node to another). PVNA builds on SNA and VNA in three ways (a data-structure sketch follows the list):

1. The visualisation of multiple uni-directional exchanges between nodes. Privacy is subjective and dynamic (e.g. [4,18,24]) and differs between individuals and the relationships held with others [4,25]. It is therefore necessary to assimilate the concept of multiple types of information exchange between multiple actors from VNA into PVNA. For instance, the provision of purchase information to a store may be followed by a financial reward passed to the customer.

2. Actors are motivated by personal and collective goals of interaction. The dynamic, subjective and non-monotonic nature of privacy requires the inclusion of individual goals in a PVNA model. Altman [18] suggests that desired levels of privacy differ across individuals, situations and time. For example, a low desire for privacy may be motivated by a goal to socialise, and a change in circumstances may alter the desired state of privacy. The same actor may have different goals dependent on the interaction partner: an organisation may wish to maintain loyalty with a customer while seeking to maximise profit in negotiations with a supplier; an individual can seek support from a relative and intimacy from a romantic partner. PVNA accommodates multiple, potentially competing, goals held by different actors.

3. Transactions across a network impact on the achievement of goals by actors. As multiple exchanges and goals exist between actors, the final assertion of PVNA is that the content of an exchange has implications for the goals of the different actors. In PVNA the key form of transaction is information – e.g. the transfer of personal information. The consequences of information exchange for intangible assets (e.g. trust, social capital) can alter the goals of the actor. For instance, a store and a customer could have competing financial and privacy-related goals. The exchange of money and goods will have implications for the fulfilment of a commercial goal, while the exchange (or non-exchange) of purchase information linked to a personal identifier will have implications for the privacy-related goals of each node, as well as perhaps for the non-privacy-related goals (e.g. the customer's loyalty to the store). In VNA, trust would be treated as an intangible deliverable. However, in reality trust is not something transferred from one node to another; it is a belief held at the individual level. For this reason, it is treated as a goal-related implication of the transaction rather than as something exchanged.
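The following data-structure sketch (our illustration; all names, goals and valuations are hypothetical) captures these three extensions: exchanges are uni-directional and typed, each actor carries its own goals, and the consequence of a transaction is recorded as a change to goal-level valuations at the nodes rather than as a transferred 'intangible'.

from dataclasses import dataclass, field

@dataclass
class Actor:
    name: str
    goals: dict = field(default_factory=dict)   # goal -> running valuation

@dataclass
class Exchange:
    sender: Actor
    receiver: Actor
    deliverable: str   # PVNA limits deliverables to information, goods or money
    kind: str          # 'information' | 'goods' | 'money'
    effects: dict = field(default_factory=dict)  # actor name -> {goal: +/-1}

    def apply(self):
        # Trust and similar intangibles are not transferred between nodes;
        # they are felt as goal-level consequences at the actors involved.
        for actor in (self.sender, self.receiver):
            for goal, delta in self.effects.get(actor.name, {}).items():
                actor.goals[goal] = actor.goals.get(goal, 0) + delta

customer = Actor("Customer", {"privacy": 0, "rewards": 0})
store = Actor("Store", {"profit": 0, "loyalty": 0})
Exchange(customer, store, "purchase history", "information",
         effects={"Customer": {"privacy": -1, "rewards": +1},
                  "Store": {"profit": +1, "loyalty": +1}}).apply()
print(customer.goals)   # {'privacy': -1, 'rewards': 1}
print(store.goals)      # {'profit': 1, 'loyalty': 1}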

6. Applying Privacy Value Network Analysis

In order to develop a PVNA for a specific context, five main steps are proposed: 1) identify the network actors; 2) recognise the goals of these actors; 3) map the connections between actors (including the nature of exchange); 4) connect the outcome of each exchange between actors to the goals of the actors that made the exchange; and 5) graphically visualise the interactions across time and situations.


Figure 3. Deliverables of a PVNA.

6.1. Step 1: Identify the Actors of the Network

It is important to recognise the limitations of the system under investigation by defining the scope and boundaries of the network [26,27]. Ockham's Razor should be used as a guide to identifying the critical actors involved in the phenomena under investigation, given the complexity inherent in adopting a network-centric approach. To determine the nodes in the network it may be necessary to investigate who is involved in the phenomena to be represented.


6.2. Step 2: Determine the Pertinent Information Exchanges (Deliverables) Between All Actors

As noted, PVNA assumes multiple exchanges, and the potential number of exchanges between actors is high (see Fig. 3 for examples of multiple exchange types). In order to contain potential complexity, PVNA limits deliverables to tangible information, goods (or services), or money. Additional deliverables that would be termed 'intangibles' in VNA are located at the node level and treated as an outcome of the deliverable (see Step 3). For instance, if the loss of personal information leads to reduced trust in a state agency, the loss of information is treated as the deliverable, and distrust is the consequence felt at the individual actor level (both the recipient of trust, and the person trusting).

6.3. Step 3: Determine the Goals of Each Actor in the Network

Once the nodes have been determined and the exchanges identified, the goals of each actor need to be identified in order to set a baseline for the mapping of information exchanges and subsequent valuations. As with data collection for SNA and VNA, there are no prescribed methods for gathering data to be used in network analysis [28], and in part the choices made will be determined by the purpose of the PVNA.
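To make Steps 1–3 concrete, the following Python sketch shows one minimal way the actors, goals and deliverables of a small network might be recorded before mapping. This is our own illustrative scaffolding rather than part of the PVNA method itself; every class, field and example value below is an assumption.

```python
# Illustrative only: a minimal record of Steps 1-3 of a PVNA.
# All names and example values are hypothetical.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Actor:
    name: str
    goals: list[str] = field(default_factory=list)  # Step 3: goals per actor

@dataclass
class Deliverable:
    sender: str
    receiver: str
    kind: str     # PVNA limits deliverables to information, goods/services, or money
    content: str

# Step 1: identify the actors within the network boundary
actors = [
    Actor("Customer", goals=["protect privacy", "earn rewards"]),
    Actor("Retailer", goals=["behavioural targeting", "maintain loyalty"]),
]

# Step 2: record the deliverables exchanged between actors
exchanges = [
    Deliverable("Customer", "Retailer", "information", "purchase history"),
    Deliverable("Retailer", "Customer", "money", "reward vouchers"),
]

for d in exchanges:
    print(f"{d.sender} -> {d.receiver}: {d.kind} ({d.content})")
```

Since there are no prescribed data-gathering methods, the same records could equally come from interviews, surveys or observation; the structure simply keeps actors, goals and deliverables separate so they can be connected in Step 4.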




6.4. Step 4: Determine the Valuation of Deliverables (the Achievement of Goals)

The fourth step connects the deliverables to the actors' goals. By connecting the deliverables to the network actors' goals it can be determined whether a particular exchange is beneficial or detrimental to the individual and to the network as a whole. There are no prescribed methodologies to implement in this step; depending on the context, interviews, surveys, or questionnaires can establish whether a node achieved its goal as a result of a deliverable. If similar measures were taken to determine actors' goals (Step 3), the process can be repeated and the data from Steps 3 and 4 compared.
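A minimal, self-contained sketch of Step 4 follows, under the assumption that goal impacts have already been elicited (for example through the surveys mentioned above) and coded as +1 or -1; the impact table and goal names are invented for illustration, not the authors' method.

```python
# Illustrative only: connect deliverables to goals and emit +/- indicators.
from __future__ import annotations

def valuation(deliverable: str, goals: list[str],
              impact: dict[tuple[str, str], int]) -> dict[str, str]:
    """Return a plus/minus indicator per goal; unknown pairs stay neutral."""
    symbols = {1: "+", -1: "-", 0: ""}
    return {g: symbols[impact.get((deliverable, g), 0)] for g in goals}

# Hypothetical coding of survey responses: (deliverable, goal) -> +1 or -1
impact = {
    ("reward vouchers", "earn rewards"): 1,
    ("purchase history", "behavioural targeting"): 1,
    ("purchase history", "protect privacy"): -1,
}

print(valuation("purchase history", ["protect privacy", "earn rewards"], impact))
# -> {'protect privacy': '-', 'earn rewards': ''}
```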


6.5. Step 5: Visualisation and Analysis Across Time and Events

The final step combines the outcomes of steps one to four into a single visualisation in order to convey the key interactions within the PVNA. Developing the autopoietic principle of networks from VNA – the ability of a network to adapt, reproduce and redefine itself based on internal and external environmental changes [23,29] – and the dynamic and temporal constructs of privacy, it is important to map two time points to identify the changes across a network. Subsequently, the network can be adjusted to leverage privacy value from specific interactions and deliverables for the benefit of individual actors and/or the entire network. If the aim of a PVNA is to identify the effects of a change in privacy practices, i.e. altering information flow to protect an individual's personal privacy, multiple PVNAs can be mapped to illustrate the different network states (e.g. pre- and post-privacy violation). Depending on the number of actors, it may be most appropriate to adopt a higher level of analysis (for instance, to study the impact of Facebook privacy policy changes on users' behaviour).

In keeping with VNA, the sequence of exchanges can be added to a network map to provide information on the linear progression of deliverables and their impact on nodes: that is, visualising the order in which deliverables are exchanged between actors, in order to fully appreciate the depth of network connections and to identify problematic or beneficial transactions that affect overall privacy value. If a deliverable contributes positively towards a goal, a positive valuation should be depicted for the recipient node (plus sign). If the goal is not achieved, the deliverable works against goal achievement, or the deliverable affords another unwanted or unexpected goal, then a negative valuation can be depicted (minus sign). Valuations can be displayed on a network map using any desired symbol to represent positive or negative evaluations. In the example in Fig. 4, algebraic symbols are used for simplicity.
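As a sketch of how such a map might be held in software, the snippet below uses the open-source networkx library (an assumption of convenience on our part; any graph tool would do) to store multiple directed exchanges with their sequence numbers and valuation symbols, then walks them in order.

```python
# Illustrative only: a PVNA map with sequence and valuation on each edge.
import networkx as nx  # assumes networkx is installed (pip install networkx)

G = nx.MultiDiGraph()  # allows multiple directed exchanges between two actors
G.add_edge("Customer", "Retailer", deliverable="money", order=1, valuation="+")
G.add_edge("Customer", "Retailer", deliverable="purchase history", order=2, valuation="+")
G.add_edge("Retailer", "Customer", deliverable="reward vouchers", order=3, valuation="+")

# Trace the linear progression of deliverables, as Step 5 suggests.
for u, v, data in sorted(G.edges(data=True), key=lambda e: e[2]["order"]):
    print(f'{data["order"]}. {u} --[{data["deliverable"]}]--> {v} ({data["valuation"]})')
```

From here, networkx's drawing helpers (or any diagramming tool) could render the plus and minus signs next to the edges, and two such graphs can be laid side by side to compare time points.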

7. Example Application of Privacy Value Network Analysis

An example of a PVNA is provided in Figs 5 and 6. The figures illustrate a hypothetical network of actors involved in a retail loyalty card scheme. At time 1 (Fig. 5) the customer is providing personal information to the retailer in return for reward vouchers; the customer is also spending money in the retail outlet. Within the customer and retailer dyad each transaction is perceived positively by the recipient actor. Note also that multiple goals exist for each of these two actors. As well as the customer and retailer dyad, the network shows that an outsourced database system is





Figure 4. Full PVNA example including valuation indicators.

Figure 5. Example PVNA of loyalty cards, time 1.

providing a service to the retailer by storing their customers' details in return for payment. Across the network multiple goals are evident: Behavioural Targeting; Financial Reward; and Service Provision. At time 2 (Fig. 6) the outsourced database system providers have sold the customer data to advertisers. The advertisers are targeting the original customers with leaflets





Figure 6. Example PVNA of loyalty cards, time 2.

posted to them individually. At this time point, the customer takes a protectionist approach and no longer provides accurate information to retailers, as they perceive the leaflets as contrary to their idea of behavioural targeting. The retailer perceives the receipt of false information negatively, as it is detrimental to the success of their behavioural targeting. Also note that the customer has stopped spending, so there is no longer a positive valuation possible and value has been lost.

Multiple networks across time can be mapped if there are numerous changes, or if more patterns are emerging than are visible.
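One crude way to read Figs 5 and 6 computationally is to total the plus and minus indicators per actor at each time point. The sketch below does this with invented values, purely to show how the loss of value at time 2 surfaces; the chapter's valuations are qualitative, so both the +1/-1 coding and the summing are our own assumptions, not measured data.

```python
# Illustrative only: per-actor valuation totals at two time points.
# +1 / -1 stand in for the plus and minus signs on the network map.
time1 = {"Customer": [1, 1], "Retailer": [1, 1], "Database provider": [1]}
time2 = {"Customer": [-1], "Retailer": [-1, -1], "Database provider": [1],
         "Advertiser": [1]}

for label, network in (("time 1", time1), ("time 2", time2)):
    totals = {actor: sum(signs) for actor, signs in network.items()}
    print(label, totals, "net:", sum(totals.values()))
```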

8. Discussion

Within the present chapter, we have demonstrated the benefits of applying social network analysis, value network analysis and communication privacy management in combination for a novel perspective on privacy interactions and their implications for both personal and network value. Using any of the three approaches in isolation restricts the focus to a uni-dimensional paradigm. In synthesising the methods we are able to take a multi-dimensional approach to social communication and privacy, and the implications of such interactions. At present, privacy value network analysis is primarily a qualitative visualisation tool to aid the understanding of static and dynamic privacy behaviour within a network for all stakeholders, and can help actors to understand why privacy costs are incurred within their network.





Privacy value network analysis uses the principles of social networks to show ties between actors and allow the basic network structure to be interpreted. The addition of value network principles introduces multiple ties, the definition of tie types, and definitions of tangible or intangible deliverables. Value networks add significance to the deliverables for the sender, receiver and network whilst interpreting the efficiency of information and value flow. This allows innovation in adapting a network to new environments or circumstances. Communication privacy management principles identify boundaries of private information around the actors in the network and enable identification of private information flow within these boundaries.

A significant feature of a privacy value network analysis (PVNA) is the identification of goals for each actor and of the potential for information leakage outside of dyadic interactions, resulting in negative perceived value for the actors involved. Further use of PVNA can demonstrate where information has leaked and prompt the necessary renegotiation of boundaries amongst co-owners of the disseminated information.

The use of SNA, VNA and CPM hinges on their common construction. Social networks and value networks allow the analysis of groups or networks, and CPM analyses group and personal privacy using boundaries, personal goals and expectations of interaction, and rule management. All three frameworks handle information flow, and each contains an element of analytical subjectivity, such as understanding the network actors involved, perceived value, goals of interaction and the concept of private information.

One advantage of adopting a network-centric approach to visualising privacy (and its potential violation) is that the wider effects of an action can be tracked, so that the ramifications of a particular action can be seen across the network. For organisations this is important, since they may extract greater value by not collecting particular items of personal information where collecting them leads to lower value elsewhere. We can see similar network effects at a policy level too: for instance, if a large amount of Internet data is routinely collected by the security apparatus of a government, this may lead not only to a 'chilling effect' on behaviour, but also to increased use of encryption and privacy-enhancing technologies. The potential gains of increased surveillance are then lost through responses across the network.

Using a network-oriented approach we are also able to identify the micro aspects of a phenomenon. For instance, we can examine the interactions, the private information exchanged and the value generated and lost, and then expand to the network level and the wider implications for other connected networks. Similarly, PVNA can begin from a macro level and narrow to the micro interactions.

Privacy value network analysis reflects the view of CPM that private information and disclosure involve not only the self, but others. Rules and boundaries are negotiated or created to control information flow. The concept of PVNA highlights the role of others' involvement and reflects the view that subsequent rule and boundary coordination are important. Critically, PVNA recognises that the effects of internal information exchanges on both individuals and relationships also matter to external entities otherwise uninvolved, and can impact on their value contribution to the network within which they exist.
For example, a social network site user who posts photographs of other users who were intoxicated may add value amongst the users depicted, but reduce the value held by "friends of friends", such as parents or work colleagues not expecting to see such photographs. We propose that the PVNA method is useful for understanding how each actor can balance the maximisation of user potential and value whilst respecting privacy boundaries. In this sense, at present PVNA is primarily a method of visualisation. With additional work it would be possible to quantify privacy




value exchanges between stakeholders, and to maximise the value across the entire network, although it is expected that such applications would be considerably context-bound.

9. Summary

Privacy value network analysis is a proposed visualisation tool that can help a broad audience understand the nature of personal information flow across a network of connected individuals, groups and organisations. By mapping a network that includes statements about each stakeholder's goals, and the information actually transferred between network connections, it is possible to view the flow of information more widely than is possible when the view is restricted to those immediately involved. For example, disclosing a secret to a friend may typically be viewed from a dyadic perspective; mapping this information flow onto a network can illustrate its potential dissemination, and the value gained and lost by the individual and the wider network of stakeholders.


References

[1] Solove, D.J. (2006). A Taxonomy of Privacy. University of Pennsylvania Law Review, 154(3), 477.
[2] BeVier, L.R. (1995). Information about individuals in the hands of government: Some reflections on mechanisms for privacy protection. William and Mary Bill of Rights Journal, 4(2), 455–506.
[3] Thomson, J.J. (1984). The Right to Privacy. In F.D. Schoeman (Ed.), Philosophical Dimensions of Privacy: An Anthology (pp. 272–290). Cambridge: Cambridge University Press.
[4] Petronio, S. (2002). Boundaries of Privacy. Albany: State University of New York Press.
[5] Antaki, C., Barnes, R., & Leudar, I. (2005). Self-disclosure as a situated interactional practice. British Journal of Social Psychology, 44(2), 181–199.
[6] Joinson, A.N., Houghton, D.J., Vasalou, A., & Marder, B.L. (2011). Digital Crowding: Privacy, Self-Disclosure and Technology. In S. Trepte & L. Reinecke (Eds.), Privacy Online: Perspectives on Privacy and Self-Disclosure in the Social Web (pp. 31–44). Heidelberg and New York: Springer.
[7] O'Cass, A., & Ngo, L.V. (2011). Examining the Firm's Value Creation Process: A Managerial Perspective of the Firm's Value Offering Strategy and Performance. British Journal of Management, 22(4), 646–671.
[8] Osterwalder, A., & Pigneur, Y. (2010). Business Model Generation: A Handbook for Visionaries, Game Changers, and Challengers. New Jersey, USA: John Wiley & Sons, Inc.
[9] Beales, H. (2010). The Value of Behavioral Targeting. National Advertising Initiative.
[10] Montgomery, A.L., Li, S., Srinivasan, K., & Liechty, J.C. (2004). Modeling Online Browsing and Path Analysis Using Clickstream Data. Marketing Science, 23(4), 579–595.
[11] Kaplan, A.M., & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities of Social Media. Business Horizons, 53, 59–68.
[12] LaValle, S., Lesser, E., Shockley, R., Hopkins, M.S., & Kruschwitz, N. (2011). Big Data, Analytics and the Path from Insights to Value. MIT Sloan Management Review, 52(2), 21.
[13] McDonald, A.M., & Cranor, L.F. (2010, October). Americans' attitudes about internet behavioral advertising practices. In Proceedings of the 9th Annual ACM Workshop on Privacy in the Electronic Society (pp. 63–72). ACM.
[14] Turow, J., King, J., Hoofnagle, C., Bleakley, A., & Hennessy, M. (2009). Americans reject tailored advertising and three activities that enable it. SSRN. Retrieved 10/12/12 from: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1478214.
[15] Strogatz, S.H. (2001). Exploring Complex Networks. Nature, 410, 268–276.
[16] Wilson, E.O. (1998). Consilience. New York, USA: Alfred A. Knopf, Inc.
[17] Gavison, R. (1980). Privacy and the Limits of Law. The Yale Law Journal, 89(3), 421–471.
[18] Altman, I. (1975). The Environment and Social Behavior. Belmont, California: Wadsworth Publishing Company, Inc.





[19] Wondracek, G., Holz, T., Kirda, E., & Kruegel, C. (2010). A Practical Attack to De-Anonymize Social Network Users.
[20] Granovetter, M.S. (1973). The Strength of Weak Ties. The American Journal of Sociology, 78(6), 1360.
[21] Wasserman, S., & Faust, K. (1994). Social Network Analysis. Cambridge: Cambridge University Press.
[22] Strahilevitz, L.J. (2004). A Social Networks Theory of Privacy. John M. Olin Law & Economics Working Paper No. 230.
[23] Allee, V. (2003). The Future of Knowledge: Increasing Prosperity Through Value Networks. Elsevier Science.
[24] Westin, A.F. (1967). Privacy and Freedom. New York: Atheneum.
[25] Burgoon, J.K., Parrott, R., le Poire, B.A., & Kelley, D.L. (1989). Maintaining and restoring privacy through communication in different types of relationships. Journal of Social and Personal Relationships, 6(2), 131–158.
[26] Allee, V. (2006). Value Network Mapping Basics. ValueNet Works Fieldbook (pp. 1–5).
[27] Scott, J. (1987). Social Network Analysis: A Handbook. London: Sage Publications.
[28] Haythornthwaite, C. (1996). Social network analysis: An approach and technique for the study of information exchange. Library & Information Science Research, 18(4), 323–342.
[29] Battram, A. (1998). Navigating Complexity. London: The Industrial Society.


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-89


Personal Data Ecosystem (PDE) – A Privacy by Design Approach to an Individual's Pursuit of Radical Control

Ann CAVOUKIAN, Ph.D.∗

Abstract. Personal data in the networked world is considered "the new oil" – its collection is said to enhance user experience, but it takes place under the control and for the profit of others, leading to a lack of transparency and an erosion of privacy. Expectations surrounding what constitutes a healthy privacy-protective relationship between individuals and organizations are being reset under the umbrella of the emerging Personal Data Ecosystem (PDE). The PDE is supported by new technologies and services, such as Personal Data Vaults (PDVs) and data sharing platforms, which allow individuals to control and manage their own information. While PDE developments are positive from a privacy perspective, given the control they provide to the individual, in the wrong hands one's PDV and activities within the PDE could be exploited as a major surveillance tool. The paper introduces Privacy by Design (PbD), which the author sees as essential to the success of the PDE. For several years, the Information and Privacy Commissioner of Ontario, Canada, has examined emerging technologies and best practices that are relevant to the PDE and that can assist in developing the PDE in a manner consistent with PbD. By following PbD, privacy in the PDE can indeed be assured.


Keywords. Personal Data Ecosystem, Privacy by Design, Radical Control, Personal Data Vaults

Introduction

The Personal Data Ecosystem (PDE) is the emerging landscape of companies and organizations that believe individuals should be in control of their personal data, and that make available a growing number of tools and technologies to enable this control. The right to privacy is highly compatible with the notion of the PDE, because the PDE enables the individual to have a much greater degree of control – "Radical Control" – over their personal information than is currently possible. The use of the term Radical Control in this article is meant to foster discussion, and refers to the level of personal control necessary for an individual to exercise "informational self-determination." The concept of informational self-determination originates from the German Constitution ("Grundgesetz") and was elaborated on by the German Federal Constitutional Court in a 1983 decision regarding the German census. Although short of complete mastery over one's personal information, the right of informational self-determination "prevents any processing of personal data that leads to an inspection of or an influence upon a person



∗ Ann Cavoukian, Ph.D., is the Information and Privacy Commissioner of Ontario, Canada.





that is capable of destroying an individual's capacity for self-governance."1 To be clear, my notion of radical control thus relates to an individual's control over their personal information, protected by data protection frameworks, and is not necessarily exercised by data protection authorities or the state on behalf of the individual. Radical control excludes the possibility of an individual infringing another's control over their own personal information, subject to legal processes available under data protection frameworks, etc.2

At the moment, individuals lack the information and options to fully predict the informational outcomes of their actions when interacting with organizations. While empowering the individual in these situations with radical control is highly desirable, we must also be mindful of the risks to privacy posed by an ecosystem which proposes to bring together into one place all data relating to the user in a multi-stakeholder environment. Discussions surrounding the adoption of a Privacy by Design (PbD) framework are essential to the successful implementation of the PDE. The aim of this paper is to ensure that PbD is considered and applied at the early stages of development of a PDE. PbD moves privacy well beyond the normative sphere of laws and best practices and directly into technologies and the marketplace. Privacy must be embedded into each layer of the PDE, accompanied by exemplary systems security and accountable business practices. By following PbD, the success of both the PDE and privacy can be assured.

While this article hopes to encourage discussion on this topic generally, it is important to note that not all aspects in which the PDE intersects with privacy will be explored. There are many discussions that must occur independently of these pages, and it is my hope that much more work in the area of privacy and a PDE will take place. For example, this article does not contain a discussion of possible frameworks for the regulation of a PDE (i.e., legal mechanisms or self-regulation). Also, while this article may draw attention to the existence of some technologies and procedures to be considered when designing a trustworthy system for individual control, it is not meant to be exhaustive. It is acknowledged that much work will need to take place in order to fully embed PbD into the PDE's component technologies. Further, this paper focuses on the changes brought about by a PDE on the commercial uses of personal information in the context of the Internet. For example, instead of receiving spam or behavioural advertising, proponents of a PDE claim that the individual could exercise control over if, when, and what type of message they receive. Currently, PDE structures are very simplistic (e.g., filling out online forms); therefore, questions relating to law enforcement and state management of personal information, while important to consider, are not yet fully developed in relation to the PDE and must be explored as plans for the PDE evolve.

1 Schwartz, Paul. "The Computer in German and American Constitutional Law: Towards an American Right of Informational Self-Determination." The American Journal of Comparative Law 37, no. 4 (1989): 675–701, p. 690.
2 For example, the issue of overlapping privacy interests has been dealt with extensively by my Office. Please see decisions relating to "another individual's personal privacy" at http://www.ipc.on.ca/english/decisions-and-resolutions/Subject-Index-Listing/Subject-Results/?id=980.

1. Personal Information as an Asset Class

Personal data in the networked world has been compared to currency and energy [1], often being called "the new oil" [2]. These metaphors try to capture the real economic




value flows that organizations can derive from online consumer data. With such data, organizations claim they can more effectively market and advertise their services and products, reduce operational costs, and predict product demand. They also use clickstream data to improve website design and communications with their customers. In addition, online reviews by customers are used to inform product design [3]. Many online business models rest on the delivery of free online content, for which individuals supply their personal information in return. In this sense, personal information is very much akin to currency, since data is what the individual uses in "payment" for online services [4].

I first explored the prospect of placing a value on personal information in 1995, when I published my first book, "Who Knows: Safeguarding Your Privacy in a Networked World," with co-author Don Tapscott. In that book, we explored the literature and the notion of one's data having value, and the possibility of collecting royalties. We stated that "with respect to the commercial uses of our personal information, a property right to personal information, which in turn would ensconce the right to control our property, might be a good idea" [5]. We also noted that "the present-day reality is one in which the personal information of the public – both rich and poor – is being freely used, without any financial gain to the individuals involved. Clearly, all consumers would benefit from a royalty system for the commercial uses of their personal information" [6].

There exist longstanding discussions surrounding the ownership of personal data, and proposals for its legal framework [7]. The question as to "who owns personal information" will continue to be asked. In this paper, and in various other outreach materials issued by the Office of the Information and Privacy Commissioner of Ontario, Canada, discussion is refocused towards the question of "control" [8]. Privacy relates primarily to an individual's ability to maintain control over the uses of their personal information, and the notion of control is the basis of the comprehensive privacy laws that are overseen in Ontario, Canada [9].


2. Current Situation

Personal information is generally collected and stored by many different applications and service providers rather than by the individual. This often results in individuals not knowing who, within the service provider's organization, is able to see their uploaded data. In addition, the individual has little, if any, control over the use and disclosure of their personal information by service providers. The value exchange is also uneven: the service provider is able to profit from the use of an individual's personal information, while the individual does not – certainly, not directly.

Part of the reason for this uneven exchange is that the Internet lacks an identity layer. The difficulty faced by websites and Internet users in identifying and authenticating themselves to each other has led to problems we are all familiar with (phishing, account fraud, etc.) [10]. As a result, individuals must repeatedly provide information in order to complete online transactions. As individuals go about their daily activities online, it is said that they release, on average, 700 pieces of personal information a day [11]. It is this stream of data that organizations profit from. While many organizations act as dutiful custodians of the personal information they collect, many others also believe they have the right to control this information so that they can extract its value. They may argue that individuals give away their personal information freely




online, and thus conclude incorrectly that they care little for privacy. It is important to dispel such myths; in fact, I maintain the opposite is true. Indeed, experiments indicate that individuals place a tangible value on privacy when they are first provided with it. For example, subjects in one study were five times more likely to reject a $12 purchase-tracking gift card if they were first given a $10 gift card that did not track purchases [12]. Since marketplace transactions are currently completed with a lack of information provided to consumers, and a poor ability to control personal information, individuals are not always able to make privacy-protective choices:

… once an individual provides personal information to other parties, she literally loses control of that information. That loss of control propagates through other parties and persists for unpredictable spans of time. Being in a position of information asymmetry with respect to the party with whom she is transacting […] the magnitudes of factors that may affect the individual become very difficult to aggregate, calculate, and compare [13].

It is no wonder that individuals routinely think they have little role to play in how their personal information is controlled. This inevitably leads to a relationship of distrust and, at worst, a toxic battleground over personal data, where organizations want more and more, and individuals feel increasingly vulnerable, intruded upon, and unwilling to share their information. To date, few economists have argued that privacy can be protected while also achieving merchants' interests [14].

Enter the Personal Data Ecosystem (PDE), which represents the first market-wide attempt to restructure the informational relationship between organizations and individuals. Expectations surrounding what constitutes a healthy privacy-protective relationship between transacting individuals and organizations are being reset under the umbrella of this emerging ecosystem.


3. "Radical Control" Over One's Own Personal Information in the PDE

While the definition of informational privacy may differ among jurisdictions, the essence of privacy relates to the ability of individuals to have personal control and freedom of choice about the collection, use and disclosure of information about themselves – in short, control over personal data flows. When looking at the meaning of privacy, one can see, right from the outset, that the PDE aligns closely with privacy. In a PDE, individuals would be able to analyze and set priorities or constraints based upon their own behaviour, life goals and events, rather than having to accept the analysis and priorities set by others. The user has the ability to [15]:

• manage his or her personal information;
• create a single, integrated view of their behaviours and activities;
• provide identity and claims verification;
• selectively share this view with organizations of their choice;
• better use their information as a tool;
• receive personal information handed back from organizations;
• use analytics applied to their information to spot trends;
• communicate and share opinions and views with others (e.g., P2P product reviews); and
• set priorities and planning for different aspects of their life (e.g., getting married, planning for retirement, etc.).





The drivers for a PDE arise from a number of factors. Technology processing power has increased, while the cost of processing information has decreased. As a result, it is now possible for individuals to have the same sophisticated data generation, management, sharing, and analysis tools as organizations. These advances support the emerging PDE in placing the control of personal information in the hands of the data subject [16].

In addition to technology, the commercial impetus for the development of the PDE is exemplified by the World Economic Forum (WEF) and the Personal Data Ecosystem Consortium (PDEC), an industry association. The WEF published a series of reports centered on personal information as a "new asset class", identifying "key imperatives for action" for the PDE and encouraging consensus regarding the rules for obtaining individuals' trusted and permissioned flow of data in different contexts [17]. In its 2013 report, "Unlocking the Value of Personal Data: From Collection to Usage," the WEF emphasizes that "Privacy by Design is key to ensuring privacy is proactively embedded into the technology itself." The report also credits Privacy by Design with helping to find the privacy-protective features required for the evolving personal data ecosystem. For its part, the PDEC is composed of over 30 companies that are committed to adopting user-centric models of business [18].

Governments are also taking steps to hand data back to their citizens, and are encouraging other organizations to do the same. For example, the U.S. Government has launched an open data initiative to release mass data sets to the public, and personal records about health care, energy consumption and education to individuals. The U.S. Department of Veterans Affairs created the Blue Button initiative to allow veterans to download their personal health record as a text file or PDF, and it is anticipated that the Blue Button program will be expanded to other U.S. departments [19]. A similar program, Green Button, allows consumers to access their own energy usage information in a downloadable, easy-to-use electronic format, offered by their utility or retail energy service provider [20]. In the U.K., under the project title 'midata,' the government is encouraging organizations to release personal information back to customers in a portable, machine-readable, and reusable format [21]. The recently passed Enterprise and Regulatory Reform Act 2013 allows the U.K. Secretary of State to make regulations requiring businesses to provide customer data, on request, to a customer or to a person authorised by the customer to receive it. The Secretary can also specify the form in which customer data is to be provided and when it is to be provided [22].

4. Personal Data Ecosystem (PDE) Components

The PDE is supported by several component parts and stakeholders, and by existing and new technologies and services that allow individuals to control and manage their own information, such as Personal Data Vaults (PDVs), data analytic services, and data sharing platforms. It is important to note that the stakeholders and technologies proposed as components of a PDE may not yet be at the highest levels of maturity from a privacy perspective. For example, in a PDE comprised of component parts, ID management will be an integral piece, and must necessarily be based on a long line of research and technologies relating to privacy. It is outside the scope of this article to provide a literature review on ID management relevant to the PDE. However, the link between a PDE and user-controlled ID management is found within the seventh foundational principle of PbD, explained in Section 6.7; namely, that systems




should not place an unfair burden on the individual to protect their privacy without a proper user interface.

4.1. Personal Data Vaults (PDVs)


A key concept behind PDVs is 'controlled push' and 'informed pull.' The customer engages in a controlled pushing out of their personal information, but can also pull information in by requesting data from different sources, based on the customer's own criteria (e.g., the best price for car insurance). PDVs help individuals to collect, store, use, share, grant access to, and manage their own personal information. Instead of an individual's information being held by many different organizations, the data would reside with the individual, contained in their PDV.

PDVs act as a central point of control for personal information (e.g., interests, contact information, affiliations, preferences, and friends), including structured or unstructured data, such as text, images, video or sound. The information that a person chooses to put into a PDV may be general in nature, may relate to a specific topic, such as health or education information, or may relate to a particular objective, such as managing their online presence. Currently, individuals must manually share their personal information with different organizations, multiple times (e.g., email address, phone number). With a PDV, personal information is stored once and then shared selectively with organizations. Individuals retain full control over the data in their PDVs, deciding what information to share, with whom, and under what conditions. A PDV may enable users to choose whether to be discovered by other users or third-party service providers, or to share personal information with them, according to criteria set by the individual.

PDV systems use specialized software and distributed hardware for data storage, authentication, and access control mechanisms. Many of them also offer an API (Application Programming Interface) for developer access. A PDV may reside in one location or can be distributed among several federated sources. It can also be self-hosted by the user (i.e., using one's own server) or hosted by a third-party company that acts in the legal role of a personal data agent [23].

4.2. Data Analytics Services

Other aspects of the PDE include information logistics platforms and services that can be used for efficient information delivery and exchanges. Also, personal information management services (PIMS) can assist individuals in researching and coordinating life processes and episodes, such as getting married, moving or retiring. Analytics tools can monitor one's patterns, examine variances, set personal targets and goals, and visualize data. Targeted personal data services, such as those for managing one's online reputation, function to enhance PDVs and could eventually integrate with them [24].

4.3. Data Sharing Platforms

PDV providers must be made interoperable by way of data sharing platforms. The goal is to make multiple PDVs from different providers interoperable with each other and with the businesses that want to connect with them. Such a platform is much like a credit card network that makes electronic payments interoperable between different banks and merchants, and where banks provide structures for digital banking




transactions. However, instead of charging an interchange fee based on the value of a transaction, businesses could pay a relationship fee for the value of each customer relationship maintained over the network. Such a network could also feature a network-wide, peer-to-peer reputation system that applies equally to vendors and customers [25]. For example, we are aware that, in the banking sector, there is already interest in creating a data exchange universe for the various personal information data sharing platforms. The goal of this data sharing platform 'superset' would be to provide a scalable global network – one that operates in parallel to a bank's existing financial infrastructure – for 'digital data banking', i.e., trustworthy transactions of any digital asset between any two parties on the network. This network would be intended to act as a digital map that describes the location of the data, the framework(s) under which access to the data is available, the digital identities that have access to that data, and the access and usage rights of these identities [26].
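The 'controlled push' and 'informed pull' pattern of Section 4.1 can be illustrated with a deliberately simplified sketch. The class below is an invented toy, not the API of any existing PDV product: nothing leaves the vault unless the individual has granted the requesting party those specific attributes.

```python
# Illustrative only: a toy Personal Data Vault with grant-scoped release.
from __future__ import annotations

class PersonalDataVault:
    def __init__(self, data: dict[str, str]):
        self._data = data                        # the data resides with the individual
        self._grants: dict[str, set[str]] = {}   # party -> attributes they may receive

    def grant(self, party: str, attributes: set[str]) -> None:
        """The individual decides what to share, with whom, and can revise it."""
        self._grants[party] = set(attributes)

    def push(self, party: str) -> dict[str, str]:
        """Controlled push: release only the attributes covered by a grant."""
        allowed = self._grants.get(party, set())
        return {k: v for k, v in self._data.items() if k in allowed}

vault = PersonalDataVault({"email": "a@example.org", "phone": "555-0100",
                           "postcode": "N1 9GU"})
vault.grant("insurer", {"postcode"})  # e.g. to pull quotes for car insurance
print(vault.push("insurer"))          # {'postcode': 'N1 9GU'}
print(vault.push("advertiser"))       # {} - no grant, nothing is released
```

An 'informed pull' would run in the opposite direction – the vault issuing a request (the best price for car insurance, say) against the individual's own criteria – and a real system would add authentication, audit logs and revocation on top of this skeleton.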


5. Privacy Concerns with a PDE

While the concept behind the PDE aligns with many aspects of the protection of privacy, there are, nonetheless, some areas which require careful attention to ensure that a PDE successfully protects personal information. If improperly designed, the PDE could raise a number of privacy risks, such as the oversharing of information due to insufficiently granular user options. While it is difficult to pinpoint all such risks within this article, a few are provided as examples. The focus at this stage must be on mitigating privacy issues across the overall framework for a PDE, as well as its component parts. Simply put, all stakeholders and technologies within the PDE will have to address privacy, not just individual PDV operators.

A major example of a privacy risk within the PDE is that one's PDV and activities within the PDE could be exploited as a major surveillance tool by PDE stakeholders in a commercial context (e.g., behavioural data for Internet marketing purposes). While the individual retains full control over his or her raw data in the PDV, this does not preclude third parties in the PDE from retaining copies of that data without the individual's permission. This means that all actors in the PDE must strictly follow the permissions attached to data held by others in the ecosystem – the retention of copies of that data increases the chances of unintended disclosure and the erosion of personal control.3

Questions of interoperability, interactions and information-sharing mechanisms between PDE actors may also have an impact on privacy. Privacy and trustworthiness are almost always more difficult to establish within an ecosystem of multiple stakeholders than within a single enterprise. Across multiple stakeholders, there are likely to be many different policies, different security architectures, and different tools and technologies deployed, all of which need to be both interoperable and consistent with the protections provided for shared data. Different approaches to privacy within the ecosystem could lead to information being collected, used, and disclosed contrary to the individual's preferences.

3 An analysis of covert access to personal information by government (e.g., PRISM) outside of accepted channels such as by judicially authorized warrant is outside the scope of this article. My Office has, however, commented extensively elsewhere on government proposals for warrantless access to personal information, and has made several recommendations on how to ensure transparency and accountability. See http://www.realprivacy.ca.




Strong security measures undertaken by a single stakeholder become meaningless if its data trading partners do not have compatible measures; the policies and technologies of all the members of the ecosystem must satisfy the requirements of the trusting party (the individual). Weak links in security can also lead to data breaches, compromised identity credentials, and identity theft. There is a need to proactively ensure privacy within any federated identity system, given the privacy issues arising from data-in-motion between multiple stakeholders.

6. Privacy by Design and the PDE

Privacy by Design (PbD) is a concept I developed back in the '90s to address the ever-growing and systemic effects of Information and Communication Technologies and large-scale networked data systems. PbD advances the view that the future of privacy cannot be assured solely by compliance with regulatory frameworks; rather, privacy assurance must ideally become an organization's default mode of operation. PbD extends to a trilogy of encompassing applications: 1) IT systems; 2) accountable business practices; and 3) networked infrastructure. The objectives of PbD – ensuring privacy and gaining personal control over one's information and, for organizations, gaining a sustainable competitive advantage – may be accomplished by practicing the following 7 Foundational Principles.

The 7 Foundational Principles of Privacy by Design


1. Proactive not Reactive; Preventative not Remedial
2. Privacy as the Default Setting
3. Privacy Embedded into Design
4. Full Functionality – Positive-Sum, not Zero-Sum
5. End-to-End Security – Full Lifecycle Protection
6. Visibility and Transparency – Keep it Open
7. Respect for User Privacy – Keep it User-Centric

The Office of the Information and Privacy Commissioner of Ontario, Canada, has a long track record of identifying best practices for emerging technologies, which can in turn assist in ensuring that the PDE is built with privacy in mind [27]. Regardless of what these new approaches may be, it is evident that Privacy by Design will be critical to the systems and initiatives that guide the Personal Data Ecosystem. Proponents of a PDE intend to build more sophisticated architectures in the future. Therefore, we must consider the present and future vision for a PDE from a privacy perspective, and encourage others to think of how to build for privacy now. What follows is a high-level appraisal of past work regarding PbD generally as it may relate to an emerging PDE.

6.1. Proactive Not Reactive; Preventative Not Remedial

PbD is predicated on taking proactive steps towards the protection of privacy, rather than reacting to a privacy harm after it has occurred. Multiple stakeholders involved in a PDE should conduct a Federated Privacy Impact Assessment (F-PIA) or implement a similar type of tool that considers issues arising from data-in-motion, where personal




information is transferred among various stakeholders [28]. An F-PIA goes beyond a regular PIA because it captures data flows that occur in a cross-layered cloud computing environment involving multiple parties. To establish privacy and trustworthiness among all stakeholders in the PDE, common policies, technologies and tools should be interoperable and consistent in protecting shared data. Personal information should be handled according to those established rules or trustworthy frameworks, conforming to promises made to the individual at the time of collection.

Engineers should also assess whether the infrastructure to be created could have unintended uses or consequences for privacy. For example, developers may wish to examine as a case study the issue of Media Access Control (MAC) addresses and the creation of Wi-Fi Positioning Systems (WPS). While MAC addresses are necessary for network devices to communicate with each other, they are now widely used as unique device identifiers for, among other things, mapping the location of Wi-Fi signals. We noted that the stakeholders in the mobile space have a contributing role in ensuring end-to-end privacy, including the device manufacturer, the operating system and platform developer, network providers, application developers, data processors, etc. We recommended that, working with the broader research community, Wi-Fi location aggregators and location-based technology and application developers should research and implement alternatives that protect the privacy of individuals, and provide individuals with a choice in whether their devices may be used in the creation and updating of WPS architecture [29].


6.2. Privacy as the Default Setting

Default settings, by and large, tend to remain intact – the default rules! As a result, it is important to ensure that the default settings for IT systems and business practices automatically protect privacy. This is an especially important starting point for the creation of the identity layer of the PDE [30]. If personal information is not needed, it should never be collected in the first place. Data minimization should be practiced at every stage of the information lifecycle. For example, researchers were able to install a set of filters that control and minimize the flow of data from a PDV to third-party mobile applications [31]. In addition, techniques exist to perform analytics on anonymized data, which may further enhance trust in information-sharing environments such as the PDE [32].
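A hedged sketch of what "privacy as the default" can mean in code follows; it is loosely inspired by, but does not reproduce, the PDV filters cited at [31]. Unless a field is explicitly opted in, the filter releases nothing; all field names and values are invented.

```python
# Illustrative only: data minimization as the default setting.
from __future__ import annotations

DEFAULT_ALLOWED: set[str] = set()  # the default releases nothing at all

def minimize(record: dict[str, str],
             allowed: set[str] = DEFAULT_ALLOWED) -> dict[str, str]:
    """Pass through only explicitly allowed fields; drop everything else."""
    return {k: v for k, v in record.items() if k in allowed}

record = {"name": "Alice", "email": "alice@example.org", "steps_today": "8042"}
print(minimize(record))                           # {} - nothing shared by default
print(minimize(record, allowed={"steps_today"}))  # only the opted-in field
```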




6.3. Privacy Embedded into Design

The design and architecture of IT systems and business practices should be structured so that, when they operate, their results or outputs minimize the data and protect the personal information within them. It is difficult to embed privacy after systems and practices have been designed; if the opportunity is missed at the design stage, privacy never becomes fully incorporated. Privacy must be built in at the outset of all aspects and levels of the PDE, including hardware manufacturing, the operating platform, the software and thin-client layer, as well as any outsourced or mobile components. Internal control systems based on essential preconditions can ensure that only authorized stakeholders in the PDE can contribute to and access the data.

The system design principles of a PDE architecture where individuals can manage their own data must take into account a number of additional considerations. First, users should retain control over their raw data and be able to make decisions to share subsets of the data. This does not preclude third parties from keeping copies of the data, if permitted by the individual. Second, the system must provide high-level tools and guidance on the implications of users' decisions about their data. It must allow users to understand the risks of their participation, and help them to make informed decisions about data collection, sharing and retention. Third, the system should encourage the continued engagement of users by allowing them to check whether their data is still visible and relevant, and, through the use of triggers, make continuing and ongoing decisions about their sharing policies [33].

6.4. Full Functionality – Positive-Sum, Not Zero-Sum

Unnecessary trade-offs are not required in order to incorporate privacy while also accommodating legitimate interests and objectives. It is important to be aware of false dichotomies in this regard, and to pursue solutions which seek to provide positive results for all. The same is true regarding the PDE, in which business interests in the economic value of personal information and individual interests in maintaining control over personal information must both be met. If individuals are confident that they are genuinely in control, they will be far more willing to share information, such as their preferences, plans and intentions, with organizations [34]. Also, widely available cryptographic protocols show that privacy can be protected without impeding useful flows of personal information, allowing for equilibrium in the marketplace [35].


6.5. End-to-End Security – Full Lifecycle Protection

Security of personal information is an essential component of privacy and must be designed into all points in the information lifecycle, including the end, at which point information must be securely destroyed in a timely way. The full spectrum of IT security is captured within this principle; however, it is important to emphasize the security of information at the end of its lifecycle. Methods of destruction for paper records range from the mechanical (crosscut shredding, pulping, pulverizing), to pieces millimeters in dimension, to incineration to white ash. For electronic media, methods of secure destruction include mechanical means (degaussing and sanitation/secure erase) to an unusable state. Deleting files or reformatting hard drives is no longer sufficient to ensure that electronic media are securely destroyed [36]. Naturally, security of personal information within a PDE must reach a very high threshold. However, not only must security meet a high standard; a great challenge will be to ensure that all stakeholders are meeting those obligations simultaneously.

6.6. Visibility and Transparency – Keep It Open

The component technical parts and the operation of business practices must be visible and transparent to all users. To achieve such visibility and transparency, assurance must be provided that business practices and technologies are operating according to their stated promises or objectives, and it must be possible to submit such assurances to independent verification. An important objective of the PDE is to provide individuals with a sense of control over their personal information, and to provide such




control to individuals requires the full adoption of visibility and transparency in any PDE business approach. The outcome should be that users of a PDE are clearly made aware of the access and sharing privileges of any service provider or third party. The three layers involved in increasing transparency include the following [37]:



• Policies and Commitment: PDE stakeholders should have solid policies, commitments to protect privacy, meaningful transparency, and a willingness to demonstrate their own capacity to uphold promises and obligations.
• Implementation Mechanisms: PDE stakeholders should have robust internal standards and controls that integrate privacy into the design of their systems, as well as processes to guide and support decision-makers. Practices should incorporate ethics and values-based considerations, and should fully consider a broad range of their performance risks.
• Assurance Practices: PDE stakeholders should have an ability to monitor and evaluate how they are doing in terms of accountability, and to make real-time corrections, where necessary.


6.7. Respect for User Privacy – Keep It User-Centric

The interests of the individual user must be made central to business practices and IT systems by way of strong privacy defaults, appropriate notice, and empowering, user-friendly options. In addition, users must be treated with the same consideration at all touch points within the organization. The starting point of a PDE is user control over one's personal information. The PDE makes respect for user privacy a cornerstone of its systems, practices, design and infrastructure. As such, the PDE aligns with the seventh foundational principle of PbD, including its emphasis on user-centric interfaces. As the PDE evolves, it will be important to design any user interface, including hardware and software components, with 'usability' in mind. Usability is defined as the degree to which the design of a particular user interface takes into account the human psychology and physiology of its users, and makes the process of using the system effective, efficient and satisfying. In order to protect privacy, usability can entail measures to assure transparency, attain informed user consent, provide rights of access and correction, and make effective redress mechanisms available [38].

Conclusion

The aim of this paper has been to foster discussion of the adoption of PbD while the PDE is still in its nascent stages. The PDE represents a paradigm shift in the assumptions about privacy, from organization-centric to fully user-centric. This pushes past the current practice of making organizational promises regarding data protection and elevates it to the level of ecosystem-wide trustworthy frameworks. The PDE transforms this debate by recognizing privacy as a personal setting in which the individual is empowered to choose what information he or she wishes to share. For the first time, individuals can have ultimate control over their own personal information – I call that Radical Control!

Trust and governance are key forces that will shape the PDE, and its success will depend on establishing the privacy and security of the user's personal information. In this regard, now is the time to apply PbD to the PDE, since it is still in the early stages of development. PDE companies will need to devise easy-to-use features, employ data minimization techniques when sharing data, and ensure that the user is fully aware of the privacy implications of his or her data sharing decisions. While it is envisioned that the PDE will improve information exchanges between individuals, governments, and corporations, we must be mindful to develop privacy-protective protocols and platforms that facilitate interoperability between data sets.


References

[1] C. Moiso, R. Minerva, Towards a User-Centric Personal Data Ecosystem, Paper presented at the 16th International Conference on Intelligence in Next Generation Networks, Berlin, Germany, 2012.
[2] M. Kuneva, Keynote Speech: Roundtable on Online Data Collection, Targeting and Profiling, Brussels (31 March 2009).
[3] C. Tucker, The Economic Value of Online Customer Data, The Economics of Personal Data and Privacy: 30 Years after the OECD Privacy Guidelines, OECD, Paris, 2010.
[4] J. Sanders, Personal Data as Currency, Privacy and Data Protection 2:4 (2002).
[5] A. Cavoukian, D. Tapscott, Who Knows: Safeguarding Your Privacy in a Networked World, Toronto, Random House of Canada, 1995, 99.
[6] Ibid, 100.
[7] See generally K. Laudon, Markets and Privacy, Communications of the ACM (1996), 92; and P. Samuelson, Privacy as Intellectual Property? Stanford Law Review 52 (2000) 1125.
[8] E.g., A. Cavoukian, Assets beyond the meter – Who should own them? Electric Light and Power Magazine Sept (2010).
[9] Freedom of Information and Protection of Privacy Act, R.S.O. 1990, c. F.31; Municipal Freedom of Information and Protection of Privacy Act, R.S.O. 1990, c. M.56; Personal Health Information Protection Act, 2004, S.O. 2004, c. 3, Sched. A.
[10] A. Cavoukian, 7 Laws of Identity: The Case for Privacy-Embedded Laws of Identity in the Digital Age, Office of the Information & Privacy Commissioner of Ontario (2006).
[11] Ctrl-Shift, Personal Data Stores: A Market Review (2012) 8.
[12] A. Acquisti, L. John, G. Loewenstein, What Is Privacy Worth, Paper presented at the Workshop on Information Systems and Economics (WISE), Phoenix, Arizona (2009).
[13] A. Acquisti, Privacy in Electronic Commerce and the Economics of Immediate Gratification, Paper presented at the Proceedings of the 5th ACM Conference on Electronic Commerce, New York, NY, 2004.
[14] A. Acquisti, Identity Management, Privacy, and Price Discrimination, IEEE Security & Privacy March/April (2008).
[15] Ctrl-Shift, Personal Data Stores: A Market Review (2012) 8.
[16] Ctrl-Shift, The New Personal Data Landscape (2011) 6.
[17] See generally WEF, Personal Data: The Emergence of a New Asset Class, World Economic Forum (2010).
[18] WEF, Personal Data: The Emergence of a New Asset Class, World Economic Forum (2010); WEF, Rethinking Personal Data: Strengthening Trust, World Economic Forum (2012); WEF, Unlocking the Value of Personal Data: From Collection to Usage, World Economic Forum (2013); Personal Data Ecosystem Consortium, at: http://pde.cc/.
[19] United States Government, What is the Blue Button Initiative, Department of Veterans Affairs, at: http://www.va.gov/bluebutton.
[20] United States White House, Green Button Giving Millions of Americans Better Handle on Energy Costs, White House Blog, at: http://www.whitehouse.gov/blog/2012/03/22/green-button-giving-millions-americans-better-handle-energy-costs.
[21] Mydex, Mydex and the UK government's new midata policy, Mydex, at: http://mydex.org/2011/11/03/mydex-ukgovernments-midata-policy.
[22] Enterprise and Regulatory Reform Act 2013, c. 24, Part 6, s. 89: Supply of customer data.
[23] D. Reed, J. Johnston, S. David, The Personal Network: A New Trust Model and Business Model for Personal Data, ConnectMe Blog (2011), at: http://blog.connect.me/whitepaper-the-personal-network.
[24] See generally Ctrl-Shift, Personal Data Stores: A Market Review (2012).
[25] E.g., The Respect Network, at: http://respectnetwork.com.
[26] Innotribe, Digital Asset Grid, at: http://innotribe.com/tag/digital-asset-grid/.


[27] See generally A. Cavoukian, Privacy by Design and the Emerging Personal Data Ecosystem, Office of the Information and Privacy Commissioner of Ontario (2012), at: http://www.ipc.on.ca/images/Resources/pbd-pde.pdf.
[28] A. Cavoukian, Liberty Alliance Project, The New Federated Privacy Impact Assessment (F-PIA), Office of the Information and Privacy Commissioner of Ontario (2009); A. Cavoukian, NEC Company Ltd., Modelling Cloud Computing Architecture without Compromising Privacy: A Privacy by Design Approach, Information and Privacy Commissioner of Ontario (2010).
[29] A. Cavoukian, K. Cameron, Wi-Fi Positioning Systems: Beware of Unintended Consequences, Information and Privacy Commissioner of Ontario (2011).
[30] A. Cavoukian, 7 Laws of Identity: The Case for Privacy-Embedded Laws of Identity in the Digital Age, Office of the Information & Privacy Commissioner of Ontario (2006).
[31] K. Shilton, J. Burke, D. Estrin, R. Govindan, M. Hansen, J. Kang, M. Mun, Designing the Personal Data Stream: Enabling Participatory Privacy in Mobile Personal Sensing, Ethics in Science and Engineering National Clearinghouse (2009) 7.
[32] A. Cavoukian, J. Jonas, Privacy by Design in the Age of Big Data, Office of the Information and Privacy Commissioner of Ontario (2012).
[33] Shilton et al. [31].
[34] A. Cavoukian, T. Hamilton, Privacy Payoff: How Successful Businesses Build Customer Trust, Toronto, McGraw-Hill Ryerson, 2002.
[35] A. Acquisti, The Economics of Personal Data and the Economics of Privacy, Paper presented at The Economics of Personal Data and Privacy: 30 Years after the OECD Privacy Guidelines, Paris, France (2010).
[36] A. Cavoukian, National Association for Information Destruction Inc., Get Rid of It Securely to Keep It Private: Best Practices for the Secure Destruction of Personal Health Information, Information and Privacy Commissioner of Ontario (2009).
[37] A. Cavoukian, Hewlett-Packard, Centre for Information Policy Leadership, Privacy by Design: Essential for Organizational Accountability and Strong Business Practices, Information and Privacy Commissioner of Ontario (2009).
[38] A. Cavoukian, J. Weiss, Privacy by Design and User Interfaces: Emerging Design Criteria – Keep It User-Centric, Information and Privacy Commissioner of Ontario (2012).


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-102

Personal Information Markets and Privacy: A New Model to Solve the Controversy¹

Alexander NOVOTNY² and Sarah SPIEKERMANN
Vienna University of Economics and Business

Abstract. From the earliest days of the information economy, personal data has been its most valuable asset. Despite data protection laws, companies trade personal information and often intrude on the privacy of individuals. As a result, consumers feel that they do not have control, and lose trust in electronic environments. Technologists and regulators are struggling to develop solutions that meet the demands of business for more personal information while maintaining privacy. However, no promising proposals seem to be in sight. We propose a 3-tier personal information market model with privacy. In our model, clear roles, rights and obligations for all actors re-establish trust. The ‘relationship space’ enables data subjects and visible business partners to build trusting relationships. The ‘service space’ supports customer relationships with distributed information processing. The ‘rich information space’ enables anonymized information exchange. To transition to this model, we show how existing privacy-enhancing technologies and legal requirements can be integrated. Keywords. Informational privacy, personal data markets, privacy regulation


Introduction

The digital economy is faced with a dilemma. From its inception, personal information (PI) has emerged as the digital economy's core asset. PI is "any information relating to an identified or identifiable natural person" [1]. Abundantly leveraged as a free commons, PI is at the core of the Internet economy and is considered the motor for online innovation. "Personal data is the new oil of the Internet and the new currency of the digital world" [2]. It finances the Internet's free content. It strengthens an Internet company's competitive stance. In cases such as social networking it is actually the key ingredient that brings an online service to life.

Besides playing a key economic role, PI is associated with many people's notion of humanity: identity, dignity and privacy. And as PI is increasingly collected, used, packaged, and sold, more conflict arises around how people can retain control of their identities and protect their dignity and privacy. Under the umbrella terms "data protection" and "privacy" – the ability to control both the circulation of PI (out-flowing information) and the access of others to the self (in-flowing information) [3] – a global political debate has emerged. This debate centers on whether people should be enabled to control their PI and which aspects companies should be allowed to use.

¹ An earlier, short version of this essay was published in Alt, R., Franczyk, B. (eds.): Proceedings of the 11th International Conference on Wirtschaftsinformatik, Feb 27th–Mar 01st, pp. 1635–1649, Leipzig, Germany (2013).
² Corresponding Author.


In the meantime, the economic realities of personal data markets on one side and data protection efforts on the other are drifting apart. Companies capitalize on opportunities to collect and trade PI on an unprecedented scale. Uncontrolled PI trading has evolved [4]. Every time a user surfs online, an average of 56 parties track their activities on a website, largely without their consent or knowledge [5]. Companies claim 'legitimate' business interests in the data they collect. They argue that individuals and companies benefit equally, and that, in any case, the data belongs to the companies. It is estimated that at least 1,200 companies currently profile people for advertisements and marketing [6]. The digital marketing association claims that "marketing fuels the world" [7], that profiling enables more relevant ads, and that profiling is the only way for companies to provide free online services and content. Device manufacturers assert that they benefit end users by regularly uploading a myriad of information about hardware usage patterns, thereby increasing product quality [8]. And because companies have created these records, they believe that they own the data [9]. As a result, companies treat the use of PI as an issue of self-regulation. Major self-regulatory efforts, though, such as the Safe Harbor Agreement and the "do not track" initiatives, are failing [7].

Against this background, regulators, privacy rights organizations and scholars are up in arms to protect privacy. Fujitsu's global survey found that 88% of people worry about who has access to their PI, and over 80% expect governments to regulate privacy and impose penalties on companies that don't use PI responsibly [10]. Due to mass media reporting of privacy breaches, executives have had to quit or face lawsuits over data abuses. As a result, almost every regulatory privacy framework in the world (EU data protection directive 95/46/EC, Convention 108, OECD Data Protection Guidelines, US Bill of Rights Proposal, and more) is now being overhauled with the goal of strengthening consumer rights. However, will regulation and self-regulation initiatives achieve what they say they aim for?

With increasing business interest in personal information and an escalating conflict between privacy rights groups, regulators and industry, we believe that the time is ripe to develop a tenable vision of sharing PI with businesses in PI markets. This vision must allow for an innovative, information-rich world while maintaining privacy. We should embrace the fact that ubiquitous accessibility of information about us leads to unprecedented insights into our being and new forms of social interaction, eventually improving our quality of life [11]. Fruitful streams of research and innovation depend on data about people. However, harm to human dignity and privacy must be avoided, and people must remain masters of their identities. Neglecting good governance of PI markets could endanger human self-determination and erode the societal advantages of the digital age [12]. What if we had digital markets that used and traded PI whilst allowing people to control their information and identities?

Because of incongruous technical, economic and legal assumptions, it seems that we are far from shaping such a future. Technology scholars have developed privacy-enhancing technologies (PETs) that could put PI management back into consumers' control [13,14]. However, their technical proposals often build on the assumption that people prefer anonymity in transactions with companies [15–17]. Consumers, in contrast, often don't mind being identified in transactions with business partners, and companies are keen to foster 'personal' relationships [18]. While most PET proposals imply that consumers will invest time into privacy management, people simply expect regulators to protect them and companies to behave in an ethical way [10]. Finally, the PET community operates with concepts such as "data minimization" [19], which are hardly realistic in times when users submit 95 million tweets on Twitter and send about 47 billion (non-spam) e-mails on an average day. The result is a patchwork of PET solutions that are adopted by neither industry nor governments.

Besides the difficulties of deploying easy-to-use PETs, economists disagree about the effects of privacy on welfare [20]. Chicago school proponents argue that PI disclosure benefits society because information asymmetries are reduced [21,22]: as companies learn more about their customers, they can better serve customer preferences. In contrast, critics contend that privacy protection increases social welfare [23]. Everyone acknowledges that people need control over the use of their PI [24,25], but no consensus has been reached on whether people should legally own their PI as a property right [1,25]. Many want to view privacy exclusively as a human rights issue because they are concerned that people could be 'propertized' [3,26], but giving people control over their personal information has driven human rights-based privacy regulation so far [24]. As a result, few scholars have theorized about how PI markets could be organized with privacy in mind [3,9,25,27]. Where scholars have theorized about privacy-enabled PI markets, their models have failed to integrate the current technological and legal landscape, and these models provide no pathway to implementation.

This chapter provides the model that we need to make privacy efforts work in current economic environments. Based on insights about consumer behavior, market mechanisms, existing regulation and privacy technologies, we propose a 3-tier model for PI markets. Our model embraces information richness as the future of a digital economy. 'Social data' originating from people will inevitably be an important resource. We acknowledge that many transactions will be identified; however, the market we propose aims to empower people as much as companies. People and companies are assigned a few core rights and obligations, resulting in a new and simple market structure. Many of these rights and obligations are already established; however, they are either weakly enforced or their importance is not recognized by policy makers. In our model, company obligations vis-à-vis consumers are enforced by the law and supported through privacy-enhancing technologies. To make market rules enforceable, our model combines complementary legal and technical enablers. Our model is limited to the private commercial PI sphere, excluding government activity.

In the next section, we describe our model of a functioning PI market in which privacy can be preserved and consumer trust in PI handling can be re-established. In the subsequent sections, this hypothetical market model is described in detail, including the derivation of technical and legal requirements to enforce it. The chapter closes with a critical discussion of our model's benefits and challenges.

1. A Three-Tier Model for PI Markets

The model builds on the existing PI ecosystem. Currently, this system is complex and opaque, and its players engage in many secondary data use activities that undermine consumer privacy and trust [4]. We create transparency and simplicity by assigning existing players to a simple three-tier market structure (see Fig. 1). The first market tier, which we call "relationship space", includes the business relationship between data subjects and 1st tier partners. For example, a data subject might be a book buyer named Bob, and a 1st tier partner might be an online bookshop called bookshop.com. The second market tier, "service space", includes the distributed computing and service infrastructures that enable today's business relationships.


Figure 1. Three-tier model for PI markets.

It integrates all those processors that need to receive customers' PI to directly enable and enrich 1st tier services. We distinguish between service delivery providers, which are necessary to perform the principal service, and service enhancement providers, which contribute to the 1st tier business relationship. For example, bookshop.com's cloud service partner IBX Cloud Services is regarded as a service delivery provider. In contrast, bookshop.com's clickstream research agency Nilsenix is treated as a service enhancement provider. The third market tier, "rich information space", encompasses those players who do not directly support the 1st tier relationship. Participants in this part of the market can process as much data as they want, but the data they work on needs to be anonymized – to the degree that it cannot be linked with reasonable effort to 1st or 2nd tier transactions or data subjects. Each time PI is transferred to the "rich information space", it has to pass what we call the "anonymity frontier". When information passes the frontier, it loses its personal nature. 3rd tier market participants could, for example, be the traffic monitoring service TraffiMon, which receives anonymized real-time location data from GoogixSmartCars.

The stakeholders in our model are connected by contractual relationships. For any given relationship, market actors (see Table 1) are unambiguously assigned to one of the three tiers. Table 2 summarizes the rights (Right 1–3) and obligations (Obl. 1–9) of all actors in our model. Usually, the data subject and 1st tier partner agree on a contract governing the exchange of service, compensation and PI. Alternatively, their relationship may be governed by legal requirements; for example, mobile operators are legally required to preserve some connection data that they gather about their customers. 1st tier partners arrange service-level agreements with service delivery and enhancement providers specifying the expected service quality. In exchange, service delivery and enhancement providers receive monetary compensation or the right to use and sell anonymized information (AI). Market participants in the 3rd tier close sales contracts over AI with other actors.
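The tier assignment can be made concrete with a small example. The following Python sketch is an illustration only: the class names (Tier, Actor, Relationship) and example instances are our own hypothetical choices, not part of the model's specification. It encodes the actors of Table 1 and the rule that every contractual relationship is unambiguously assigned to exactly one tier.

    from dataclasses import dataclass
    from enum import Enum

    class Tier(Enum):
        RELATIONSHIP_SPACE = 1       # identified 1st tier relationships
        SERVICE_SPACE = 2            # authorized service delivery / enhancement
        RICH_INFORMATION_SPACE = 3   # anonymized information only

    @dataclass(frozen=True)
    class Actor:
        name: str
        role: str  # e.g. "data subject", "1st tier partner", "market participant"

    @dataclass(frozen=True)
    class Relationship:
        party_a: Actor
        party_b: Actor
        tier: Tier  # each contractual relationship maps to exactly one tier

    bob = Actor("Bob", "data subject")
    shop = Actor("bookshop.com", "1st tier partner")
    cloud = Actor("IBX Cloud Services", "service delivery provider")

    relationships = [
        Relationship(bob, shop, Tier.RELATIONSHIP_SPACE),  # service contract + PI usage policy
        Relationship(shop, cloud, Tier.SERVICE_SPACE),     # service-level agreement
        # A 3rd tier participant such as TraffiMon only ever receives data
        # that has already passed the anonymity frontier.
    ]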


Table 1. Actors in the three-tier model

Data subject: Natural person disclosing PI in the course of a service transaction in a business relationship with the 1st tier partner.

1st tier partner: Visible and primary opposite party in the service transaction and, from the viewpoint of the data subject, the party that is responsible for the PI.

Collector: Party that gathers the PI from the data subject either by interrogation or observation.

Controller: "Natural or legal person, public authority, agency or any other body which alone or jointly with others determines the purposes and means of the processing" of PI (Art 2 Directive 95/46/EC).

Service delivery provider: Entity authorized by the 1st tier partner that is necessary to perform the principal service.

Service enhancement provider: Entity authorized by the 1st tier partner that is not the service delivery provider but contributes by sufficiently close enrichment to the relationship between the 1st tier partner and the data subject.

Market participant: Any party, including businesses, private persons, and governments, who exchanges AI with other entities in the marketplace.

Table 2. Rights and obligations of actors in the three-tier model

Rights:
• Right 1: Property right in PI
• Right 2: Right to a privacy-friendly service
• Right 3: Right to alienate AI

Obligations:
• Obl. 1: Obtaining legitimization for PI usage
• Obl. 2: Handling of PI in accordance with the PI usage policy
• Obl. 3: Offering a privacy-friendly service option with minimum PI
• Obl. 4: Acknowledgement of standardized PI usage policies
• Obl. 5: Initiating an accountability system
• Obl. 6: Request for authorization of the 1st tier partner when using PI
• Obl. 7: Separation of PI from multiple data subjects or 1st tier partners
• Obl. 8: Demonstrate PI usage rights to authorities
• Obl. 9: Anonymization of information exchanged in the 3rd tier

The assignment of these rights and obligations to the individual roles of Table 1 is developed in the running text of Sections 1.1–1.3.
1.1. The 1st Market Tier: Relationship Space

The 1st market tier is termed a "relationship space": visible 1st tier partners maintain identified one-to-one relationships with their customers. All PI they receive is the recognized property of their customers and can be used only for purposes set down in PI usage policies, which accompany every PI exchange. The 1st tier has five characteristics: identified business relationships between customers and one visible company, a separation of service and information exchange and the right to a privacy-friendly service, legitimized information collection, people's property rights in their personal information, and liability of the 1st tier partner for any PI abuse. The next paragraphs justify these characteristics from an economic and human rights perspective and describe how they can be technically and legally implemented.


Identified Business Relationships and a Unique, Visible 1st Tier Partner. Because personalized customer relationships have proven effective, companies have invested in CRM solutions. Companies need and want identified customer relationships [18]. And many customers are willing to provide their PI in the service context if they receive appropriate returns. Therefore, we depart from traditional data protection models, which promote the idea of total anonymity vis-à-vis companies [15,16].

However, users want predictable relationships in which they can control the use of their PI [28]. Predictability is supported when users deal with only one visible PI-collecting business partner. We define partner visibility as a state in which data subjects visiting a physical or electronically enabled premise can unambiguously and effortlessly name the commercial entity that they are transacting with. The brand of this 1st tier partner should be signaled to users when they enter an electronic premise. For example, when a user enters a bookseller portal such as bookshop.com, the visible partner is bookshop.com. Customers in a physical retail store such as Walldepot see Walldepot as the 1st tier partner (and not, for example, the shelf suppliers). All parties that have a contractual agreement with data subjects must be visible; otherwise, they are not allowed to collect any PI through mechanisms such as cookies or uploaded software daemons. If data aggregators and brokers want to collect PI from users, they must establish a distinct and visible relationship with data subjects.

The reason for this one-partner rule is that people lose control when multiple parties invisibly collect their PI at the same time. This loss of control promotes distrust on the web [28] and fosters feelings of helplessness in ubiquitous computing environments [29]. From a company perspective, the one-partner rule enables companies to regain the monopoly on PI collection in their transactions. The efforts of companies to build trust in customer relationships are not eroded by a multitude of parallel data collectors. This control increases the power that companies get from competitive information, as unrelated data traders will not have access to identified information.

Selected technical enablers:


• identity and claim assurance
• graphical user interface design.

Legal enablers:
• use of standardized symbols for signaling the 1st tier partner
• mandatory principle of "one visible partner"
• legal liability of the 1st tier partner for PI use.

Separation of Service and Information Exchange and the Right to a Privacy-Friendly Service. Today, most online transactions are of a composite nature. Information is collected as a service spin-off [30] without making the 'information deal' visible to the customer. In our model, companies are asked to distinguish an information layer and a service layer within a business relationship. The service layer includes the delivery of the principal service to the data subject; for example, the service layer might include the sale and delivery of a book. Within the information layer, PI is split into the information that is needed to deliver a service ("minimum information") and additional information that is used to enrich and enhance the service experience ("enrichment information"). Minimum information can be defined as the set of PI that is necessary and sufficient to perform the principal service. For the online book retailer, the minimum information is the name, delivery address and payment information.


The individual's purchase history, date of birth, and affinity profile, in contrast, are what we consider enrichment information. In our model, these two types of PI are consciously separated. PI used for any purpose beyond the bare minimum of service fulfillment should be presented to individuals as part of a separate transaction. Additionally, partners are obliged to offer one service option (Obl. 3) that requires data subjects to disclose only the minimum amount of their PI. Thus, people always have the right to a privacy-friendly service (Right 2). This right repackages the existing concept of "data minimization" [19] but limits its scope to users preferring data minimization over information-rich services.

Consider, for example, a web search engine, look-and-find.com, which offers three service options (illustrated in the sketch after the enabler lists below). Selected by default, the privacy-friendly option requires the individual to pay a subscription fee of € X per month; this option neither records the search queries of the data subject nor shows any personalized ads. In contrast, the second option possibly costs less, for instance € Y per month. This option collects more PI and uses it for an agreed time period to provide a richer service experience, such as individualized search results. The third option commercially leverages users' PI for an agreed time period for purposes such as the targeted placement of ads. This option may be free: the user trades his or her PI in exchange for the free search service. The 'free' mentality governing online business relationships today would make room for a more realistic view of what digital services are actually worth. But PI won't come for free either. Customers would need to knowingly consent to the use of their PI, potentially giving it up in exchange for a free service while being aware that PI is being exchanged for a benefit.

The separation of service options benefits all market participants: competition in the market for PI may be improved because the salience of the information transaction increases [30]. In addition to service quality, marketers could compete on PI usage rights and privacy. They could realize new revenue streams from privacy-friendly service options. And people would finally get a true choice of PI disclosure options. A market challenge is that 1st tier partners could deliberately create opacity by providing a myriad of options, with variations on factors such as retention times or usage purposes for the PI. We therefore see the need for regulators to require standardized PI usage policies, at least for the privacy-friendly baseline offer. To foster comparability and innovation, minimum information policies shall be standardized (Obl. 4). A further challenge concerns prices for the privacy-friendly default option. Businesses could easily price this alternative highly enough to force people into data disclosure. Regulators would need to prohibit this practice.

Technical enabler:

• standards for the presentation of minimum PI service options.

Legal enablers:
• mandatory separation of the service deal from the PI deal
• obligation to offer one service option with minimum information use at reasonable quality and price (Obl. 3)
• mandatory compliance with standardized privacy policies.
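To illustrate the separation of service options, here is a minimal Python sketch. It is hypothetical throughout: the option names, fee values (standing in for the € X and € Y of the example above) and PI categories are our own placeholders, not prescriptions of the model.

    from dataclasses import dataclass

    @dataclass
    class ServiceOption:
        name: str
        monthly_fee_eur: float
        pi_collected: frozenset       # categories of PI this option may use
        usage_purposes: frozenset     # agreed purposes for the collected PI
        is_default: bool = False

    MINIMUM_PI = frozenset({"search query (transient)"})

    options = [
        ServiceOption("privacy-friendly", monthly_fee_eur=5.0,
                      pi_collected=MINIMUM_PI,
                      usage_purposes=frozenset({"service delivery"}),
                      is_default=True),  # Obl. 3 / Right 2: minimum-PI default
        ServiceOption("personalized", monthly_fee_eur=2.0,
                      pi_collected=MINIMUM_PI | {"search history"},
                      usage_purposes=frozenset({"service delivery",
                                                "personalization"})),
        ServiceOption("free", monthly_fee_eur=0.0,
                      pi_collected=MINIMUM_PI | {"search history",
                                                 "affinity profile"},
                      usage_purposes=frozenset({"service delivery",
                                                "personalization",
                                                "targeted advertising"})),
    ]

    # A privacy-friendly, minimum-information option must exist and be the default.
    assert any(o.is_default and o.pi_collected == MINIMUM_PI for o in options)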

Legitimized Information Collection. The legitimization of data collection is probably the most important bridge between US American and European data protection frameworks [31]. Legitimization justifies the collection and use of PI. It can be obtained either through the informed consent (Obl. 1) of a data subject or by legal empowerment; for example, mobile operators are legally required to preserve some connection data. Consent is the voluntary, unambiguous, prior, verifiable agreement of the data subject, given by an affirmative rather than a default action, to the 1st tier partner's PI terms [32]. Those terms need to be explicitly communicated to the data subject [33].

Reconsider the search engine example. The default option is the privacy-friendly version of a service. At one click, customers can explicitly opt into the free version. Whatever service option a customer chooses, all parties handling PI must respect the agreement between data subjects and their 1st tier partners as manifested in electronic PI usage policies (Obl. 2). Software agent solutions, such as P3P agents, enable people to configure their privacy preferences in their client once (i.e., in the browser); for example, people might object to data processing for marketing purposes or request immediate deletion of their data. A client-based architecture gives users more control over settings [34]. The user's software agent matches PI usage preferences with companies' standard usage policies (cf. 'Privacy Bird' presented in [13]) and supports the negotiation of an agreed PI usage policy; a minimal sketch of such matching follows the enabler lists below. People are empowered to take advantage of their legal rights in every transaction, and companies benefit from better data quality and compliance.

Technical enablers:

• standards for the presentation and content of PI usage policies
• privacy policy negotiation supported by software agents.

Legal enablers:
• legitimization for PI usage obtained by 1st tier partners
• handling of PI in accordance with electronic PI usage policies (Obl. 2).
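The following minimal Python sketch illustrates the kind of client-side matching such a software agent could perform. The policy vocabulary and field names are our own assumptions and deliberately simpler than real policy languages such as P3P.

    user_preferences = {
        "allowed_purposes": {"service delivery", "personalization"},
        "max_retention_days": 90,
        "allow_third_party_sharing": False,
    }

    company_policy = {
        "purposes": {"service delivery", "personalization", "targeted advertising"},
        "retention_days": 365,
        "third_party_sharing": True,
    }

    def mismatches(prefs, policy):
        """Return the policy points that exceed what the user consented to."""
        issues = []
        extra = policy["purposes"] - prefs["allowed_purposes"]
        if extra:
            issues.append(f"unconsented purposes: {sorted(extra)}")
        if policy["retention_days"] > prefs["max_retention_days"]:
            issues.append("retention period too long")
        if policy["third_party_sharing"] and not prefs["allow_third_party_sharing"]:
            issues.append("third-party sharing not permitted")
        return issues

    # The agent flags each point on which the 1st tier partner's policy
    # exceeds the user's preferences, so negotiation can start from there.
    for issue in mismatches(user_preferences, company_policy):
        print("negotiation needed:", issue)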

Property Rights to Personal Information. A core component of our model is that data subjects have property rights for their PI (Right 1). The property right to PI cannot be alienated [1,25]. Because of its character as a personal right – similar to moral rights in copyright – seizing PI-related rights shall be prohibited. The characteristic of identifiability inseparably binds PI to an individual. Identifiable information can never be an object separable from a beholder in the way that a book can be divided from its owner. However, usage rights to PI can be transferred. From a human rights perspective, data subjects have the biggest interest in the PI asset. Thus, they are the natural holders of this property right.

The main reason for proposing property rights for PI is a psychological one: property rights would create stronger asset awareness in the minds of all stakeholders. The awareness that PI is an asset of economic value makes data subjects more informed when deciding about disclosing PI [35]. Equally, companies will probably be more cautious and reflective in collecting and using it. To make people aware of this asset, we must label information self-determination rights as "property rights".

From a legal perspective, two characteristics set property rights apart from informational self-determination rights in data protection laws: one is the numerus clausus and the other is the erga omnes effect [24]. The numerus clausus of rights means that, within the finite set of rights in rem, only one type gives the largest extent of control over an object to its holder: the property right. A property right could summarize and simplify the numerous rights of control and access dispersed in data protection laws. The other advantage of property is its erga omnes effect: property rights can be enforced against anyone. Data subjects, consequently, are able to sue any parties infringing on their property right who are not contractual partners or subject to data protection law.

Technical enabler:

• PI usage policy repository on the client side.

Legal enabler:
• recognition of a property right to PI (for an elaborate discussion of the feasibility of this proposal, see [24,25]).

Liability. In our market model, the 1st tier partner is legally liable for any collection and use of PI as well as for its contextual integrity. Liability safeguards the data subject's property right and the contractually agreed PI usage policy. PI is abused if it is handled in discordance with the PI usage policy. Liability of the 1st tier partner is natural from a customer perspective. The 1st tier partner acts as the single point of contact for the data subject. For example, data subjects disclosing their PI to bookshop.com feel that bookshop.com is responsible for any abuse – regardless of whether a subcontractor or any other involved party caused the damage. As the 1st tier partner enjoys the benefits of PI use, the partner is also liable for any damage caused through this use. The 1st tier partner, though, can take redress if another accountable 2nd tier party does not adhere to the policy.

Most importantly, we envision that the 1st tier partner is responsible for implementing a technical accountability system that ensures that the PI usage rights set down in electronic PI usage policies are obeyed (Obl. 5). Accountability ensures that any access, use, disclosure, alteration, and deletion of PI can be traced by technical means back to the party who has done so. The 1st tier partner shall therefore have a technical infrastructure that can demonstrate PI usage rights to authorities and auditors at any time (Obl. 8); a minimal sketch of such a system follows the enabler lists below.

Technical enabler:

• use of an accountability system to enable and monitor policy-compliant use of PI (e.g. sticky policies, audit logs).

Legal enablers:
• legal obligation to have and regularly audit an accountability system
• liability of the 1st tier partner for all PI transactions.
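A minimal sketch of such an accountability system is given below, assuming a simple design in which the usage policy 'sticks' to the PI record and every access attempt is appended to a hash-chained audit log (hash chaining is a common tamper-evidence technique; the class and field names are our own, not part of the model).

    import hashlib, json, time

    class AuditedPI:
        """PI record whose usage policy sticks to the data, with a
        hash-chained audit log so access events cannot be silently altered."""

        def __init__(self, data, allowed_purposes):
            self.data = data
            self.policy = frozenset(allowed_purposes)  # sticky policy
            self.log = []
            self._last_hash = "0" * 64

        def access(self, party, purpose):
            if purpose not in self.policy:
                self._append(party, purpose, granted=False)
                raise PermissionError(f"{party}: purpose '{purpose}' not in policy")
            self._append(party, purpose, granted=True)
            return self.data

        def _append(self, party, purpose, granted):
            # Each entry commits to its predecessor, so tampering breaks the chain.
            entry = {"party": party, "purpose": purpose, "granted": granted,
                     "ts": time.time(), "prev": self._last_hash}
            self._last_hash = hashlib.sha256(
                json.dumps(entry, sort_keys=True).encode()).hexdigest()
            self.log.append(entry)

    record = AuditedPI({"name": "Bob", "address": "..."},
                       allowed_purposes={"order fulfillment"})
    record.access("parcel service", "order fulfillment")   # allowed, logged
    # record.access("ad network", "profiling")             # raises PermissionError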

1.2. The 2nd Market Tier: Service Space

Typically, the 1st tier partner is assisted by subcontractors, outsourcers, and strategic alliances to deliver services and products. This complex service web adds to the insecurity of today's personal information markets. In fact, consumers are most concerned about secondary uses of their data by invisible partners [36]. For this reason, we create a "market chunk" organizing this web of invisible service providers. The 2nd tier includes all companies that contribute to the services delivered in the 1st tier. For instance, Datenix provides data that enables bookshop.com to improve users' personalized book recommendations. As a 1st tier partner, bookshop.com is accountable for Datenix's actions. PI abuse is likely if parties at a greater distance from the initial service perceive less responsibility for the PI they use [37].


To extend the context-based trust between data subjects and 1st tier partners, 2nd tier service providers must be legally tied to the initial business relationships. This tie is created via a chain of accountability that ensures authorization, non-repudiation, separation, and auditability. Since all 2nd tier providers need to serve the 1st tier business relationship with the customer, our model ensures contextual integrity of PI use. PI is used within the boundaries of contextual integrity when the applicable social norms of appropriate PI collection and distribution are upheld in a given situation [38]. The following characteristics enable the 2nd tier.

Tying the Service Space to 1st Tier Relationships. We distinguish between service delivery and service enhancement providers (see Table 1). Service delivery providers, such as the parcel services that deliver book orders, are necessary to perform the principal service. They are always immediately involved in the 1st tier relationship; examples of service delivery providers include entities supporting the accountability and security of transactions. Service enhancement providers might also need to receive PI. These providers are parties that directly and immediately contribute to the 1st tier business relationship. A subcontracting party that receives PI from the 1st tier partner and only uses it for its own interest or the interest of the 1st tier partner is not contributing to the relationship between the data subject and 1st tier partner. Directness addresses the factual relation of the enhancement service to the 1st tier partnership, while immediacy refers to the time dimension. Indirect, long-term enhancement service providers do not have a sufficiently close correspondence to the business relationship between data subject and 1st tier partner to receive PI. For instance, passing on PI to a market research agency that develops a corporate strategy for the 1st tier partner is out of the context of the initial service transaction. As their consultation threatens to de-contextualize the PI that is employed [19,38], the market research agency is not allowed to use PI in the context of the 1st tier business relationship. However, business strategy consultants may act as separate 1st tier partners and acquire the right to use PI. In this scenario, two 1st tier relationships would co-exist: one original service delivery relationship and one additional information collection relationship. If a data subject chooses such an enhanced service option, the service delivery providers can also handle enrichment information and service enhancement providers can process minimum information.

Technical enablers:

• privacy policy language
• accountability system to enable and monitor policy-compliant use of PI.

Legal enabler:
• legal obligation to have and regularly audit an accountability system.

Authorization, Non-repudiation, Separation, and Auditability. For 2nd tier parties, an accountability system must comply with the requirements of authorization, non-repudiation, separation, and auditability. First, authorization requires that access to PI by the service provider is approved by the 1st tier partner on an individual transaction basis (Obl. 6). When a customer purchases goods or services, the online shop must explicitly authorize a credit scoring agency to use customer data for a credit check. Second, non-repudiation prevents service providers from falsely denying that they have accessed, used, altered or deleted PI. Third, separation requires that PI units stemming from different service transactions, data subjects, and 1st tier partners are kept in strict isolation unless the legitimized purpose allows for the combination of PI (Obl. 7). For instance, a billing provider is not allowed to combine the PI of bookshop.com customers and bank customers. This practice safeguards contextual integrity. Fourth, auditability ensures that compliance can be demonstrated at any time to authorities and auditors (Obl. 8). A minimal sketch of the authorization and separation requirements follows the enabler lists below.

Technical enabler:

• use of an accountability system to enable and monitor policy-compliant use of PI (e.g. sticky policies, audit logs).

Legal enablers:
• separation of PI from multiple data subjects or 1st tier partners
• legal obligation to maintain and audit the accountability system.
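The authorization and separation requirements can be sketched in Python as follows. This is an illustration under our own naming assumptions, not a reference implementation; in particular, the token scheme stands in for whatever cryptographic authorization a real accountability system would use.

    from collections import defaultdict

    class ServiceProviderStore:
        """2nd tier PI store that keeps records from different data subjects
        and 1st tier partners in separate compartments (Obl. 7) and accepts
        PI only with a per-transaction authorization token (Obl. 6)."""

        def __init__(self):
            # compartment key: (1st tier partner, data subject, transaction id)
            self._compartments = defaultdict(list)
            self._authorized = set()  # tokens issued by the 1st tier partner

        def authorize(self, token):
            # Called per transaction, on behalf of the 1st tier partner.
            self._authorized.add(token)

        def store(self, token, partner, subject, txn_id, pi_item):
            if token not in self._authorized:
                raise PermissionError("no authorization from 1st tier partner")
            self._compartments[(partner, subject, txn_id)].append(pi_item)

        def combine(self, key_a, key_b, purpose_allows_combination=False):
            # Combining compartments would de-contextualize PI; it is barred
            # unless the legitimized purpose explicitly allows it.
            if not purpose_allows_combination:
                raise PermissionError("combination of PI across contexts is barred")
            return self._compartments[key_a] + self._compartments[key_b]

    store = ServiceProviderStore()
    store.authorize("txn-42-token")
    store.store("txn-42-token", "bookshop.com", "Bob", "txn-42", "delivery address")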

1.3. The 3rd Market Tier: Rich Information Space


The 3rd tier is a market space where businesses, individuals, governments, and other parties not contributing to an identified business relationship freely exchange and trade information. They, however, need to ensure anonymity according to state-of-the-art technical standards. PI may originate from data subjects, but when the anonymity frontier is passed, this information becomes a freely exchangeable good. This asset is usable for innovative services and research, and innovation can be vividly spurred on the basis of this data. We assume that the marginal utility from identification outside of business relationships is so minimal that it does not justify the ensuing privacy risks. Severe sanctions should be imposed on 3rd tier market players who distort competition by holding identifiable or re-identifiable data.

Anonymity. Data subjects want to retain control over the distribution of their PI. They want to share with peace of mind. A straightforward way to create control and peace of mind is to legally enforce anonymity of all data, except in situations where identification is needed or desired by the customer (1st and 2nd tier). People are granted a privacy commons: a shared space of anonymity [25]. In our model, this space is created by ensuring that PI cannot leave the contextual boundaries of the 1st and 2nd tier. When it does, it must be anonymized. What constitutes sufficient anonymization is a dynamic concept dependent on the current state of the art of technology. Regulators should document and update current standards for anonymization in "best available techniques reference documents" (BREFs), which have been applied successfully for integrated pollution prevention and control (IPPC, Directive 2010/75/EU). Currently, the concepts of "k-anonymity" [17], "l-diversity" [39] and "t-closeness" [40] suggest that it is sufficient to have a large anonymity set of individuals, diverse attribute values and similar attribute value distributions; a minimal sketch of a k-anonymity check follows the enabler lists below. Each market participant in the 3rd tier is obliged to respect these anonymity mechanisms (Obl. 9) and is regularly audited for the fulfillment of this requirement. Specific PI that cannot be anonymized, such as genetic information, would require separate legislation.

Technical enabler:

• anonymization.

Legal enabler:
• legal obligation and auditing of the anonymity requirement in the 3rd tier.
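As an illustration of the first of these concepts, the following Python sketch checks k-anonymity over a set of quasi-identifiers before a data set may cross the anonymity frontier. It is deliberately minimal: real anonymization would also have to address l-diversity, t-closeness and re-identification risk, and all field names and records are hypothetical.

    from collections import Counter

    def is_k_anonymous(records, quasi_identifiers, k):
        """True if every combination of quasi-identifier values is shared
        by at least k records, so no individual stands out."""
        combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
        return all(count >= k for count in combos.values())

    records = [
        {"age_band": "30-39", "zip3": "101", "trips": 12},
        {"age_band": "30-39", "zip3": "101", "trips": 7},
        {"age_band": "30-39", "zip3": "101", "trips": 3},
        {"age_band": "40-49", "zip3": "102", "trips": 9},
    ]

    # The last record is unique on its quasi-identifiers, so this data set
    # may not cross the anonymity frontier yet (it needs more generalization).
    print(is_k_anonymous(records, ("age_band", "zip3"), k=3))  # False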

Sanctions. In a trustworthy market regime, anonymity is protected by damages and penalties for the illegal acquisition, possession, use or sale of identifiable information. Any entity in the role of a 3rd tier market participant is not allowed to hold PI. If any such entity is caught engaging in PI storage or processing, it shall pay damages to data subjects, partners and others, as well as substantial punitive damages [41]. Moreover, any persons involved in illegal activities shall face criminal prosecution, as they have encroached upon the fundamental rights of other individuals.

Legal enabler:

• sanctions for breaking the anonymity rule.

Free Exchange. Free trade of anonymized information increases the amount of exchanged information. Our market model reduces the amount of legislation and other barriers restricting the alienation of anonymized information. Unhindered trans-border flows of anonymous information are fostered. Any market participant shall have free access to the 3rd tier market, including data subjects who may want to sell their anonymized information directly. As compensation for the costs 1st tier partners incur in our model, they have the right to anonymize and sell any PI collected, independently of the data subjects' consent. Market participants can resell anonymized data once they acquire it (Right 3).

Legal enabler:

• right to alienate anonymized information.


2. Implementing the Three-Tier Model

As has been outlined throughout Section 1, technical and legal enablers are required to support the implementation and enforcement of our model. Many of these technologies and legal enablers already exist. This section outlines how our model builds on existing enablers and identifies the enablers that need to be developed or changed.

2.1. Technical Enablers

Well-established privacy-enhancing and security technologies enable the enforcement of our model [42]. Table 3 gives an overview of selected technologies and assigns them to the relevant market tiers. To implement the requirement of accountability in the 1st and 2nd tiers, different systems based on sticky PI usage policies and audit logs are available [14,43,44]. Most accountability systems suitable for ensuring contextual integrity are based on cryptographic technologies that can be easily applied in distributed environments [45,46]. Existing identity technology can determine the party responsible for a data breach. Existing security mechanisms, such as SAML, can identify the 1st tier partner and the data subject [47]. To specify the content of PI usage policies, privacy policy languages are necessary. Some privacy policy languages have already been standardized by the W3C consortium (P3P).


Table 3. Existing technologies to support enforcement in the market tiers

Relationship Space (1st tier) and Service Space (2nd tier):
• Accountability systems: Sticky policy, Privacy injector, Privacy-aware access control, Distributed auditing logs
• Identity mechanisms: SAML, OAuth, OpenID
• Contextual integrity-compatible cryptography: Identifier-based encryption, NOYB
• Privacy policy languages: POL, PrimeLife policy language, E-P3P, EPAL, Rei, EnCoRe, PERFORM, Ponder, Contextual integrity language
• Privacy policy negotiation: P3P, PISA
• Web anonymity and pseudonymity agents: LPWA, Crowds, Hordes, Onion Routing, Mixminion
• Do not track
• Human-computer interface: Privacy pictograms, User privacy agent interface design, Visual tagging

Rich Information Space (3rd tier):
• Anonymization: k-anonymity, l-diversity, t-closeness, Graph anonymity
• Privacy-preserving data mining: Randomization, Perturbation, Differential privacy, KD cycle-based data mining
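As one concrete instance of the privacy-preserving data mining techniques listed in Table 3, the following Python sketch implements the Laplace mechanism of differential privacy for a simple counting query. The example figures are hypothetical; the difference of two exponential draws with mean `scale` is a standard way to sample Laplace noise.

    import random

    def laplace_mechanism(true_value, sensitivity, epsilon):
        """Release a noisy statistic: adding Laplace(sensitivity/epsilon)
        noise gives epsilon-differential privacy for a query whose result
        any single person's data can change by at most `sensitivity`."""
        scale = sensitivity / epsilon
        noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
        return true_value + noise

    # A counting query (sensitivity 1): how many users bought a given book.
    print(laplace_mechanism(true_value=1024, sensitivity=1, epsilon=0.5))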

rious and complex task for the data subject and 1st tier partner, architectures can make the task easier by employing software agents that semantically understand policy content [48,49]. The usability of privacy functionality and user agents at the interface between humans and machines is more and more improved [13]. Although data subjects are possibly identified on the application layer, they might want to be anonymous to third parties on the communication layer. To ensure their anonymity, data subjects can employ existing web anonymity technologies that protect the interaction between data subject and business partner [15]. Anonymity on the web can be supported by browser functionality building on the “do not track” concept; this functionality indicates to the communication partner that no PI shall be collected. Additionally, anonymization technologies are needed to realize sufficient anonymization of PI in the 3rd tier [17,39,40]. Data mining technologies that preserve privacy can guarantee endured anonymity [50]. 2.2. Legal Enablers Our model needs not only to be technically feasible, but must also be meaningful to public policy. Policy makers need to know which of the rights and obligations we propose already exist in the current legal framework. One important idea is to consider PI as the private property of data subjects [24]. A property right to PI (Right 1) is reflected in the principles of informed consent (Art 7 Directive 95/46/EC, Art 7 General Data Protection Regulation-draft [51], Para. 7 OECD, Art 2 FTC Fair Information Practices (FIP)) and the right to object (Art 14 Directive 95/46/EC, Art 19 [51]). Informed consent gives data subjects the right to determine what happens to their PI. Data subjects can decide on the usus, the right to use the information. The right to object to the processing of PI resembles the elements of excludability and abusus of a property right. Data subjects can exclude anyone from using their PI, just as the owner of a book can prevent anyone from reading it. Moreover, they can ‘destroy’ PI by completely revok-

Digital Enlightenment Yearbook 2013 : The Value of Personal Data, IOS Press, Incorporated, 2013. ProQuest Ebook Central,

Copyright © 2013. IOS Press, Incorporated. All rights reserved.

A. Novotny and S. Spiekermann / Personal Information Markets and Privacy

115

ing all usage rights for anyone and thus have the PI 'forgotten' (cf. Art 17 [51]). Data subjects cannot have ius abutendi, the right to alienate the property right itself. The property right is always bound to their identity. However, the usage right can be alienated. Current data protection legislation does not recognize a property right's usus fructus: the data subject's right to receive a share of the fruits of the 1st tier partner's usage of PI [52]. The recognition of full property rights to PI is missing in civil law.

So far, a data subject's right to a privacy-friendly service (Right 2) exists only within a very limited scope. For example, Art 8 Directive 2002/58/EC mandates service providers to offer an option blocking the presentation of calling line identification. A German ban on tie-ins of sales and competitions has failed because of concerns about antitrust issues and limits on a data subject's choice (ECJ C-304/08). A data subject's right to object (Art 14 Directive 95/46/EC, Art 19 [51]) prevents usage of the PI but does not grant data subjects a positive right to a PI-minimal service.

All other obligations in our model already exist in legal frameworks. Obtaining legitimization for PI usage (Oblg. 1) is reflected in the principle of informed consent. Three of our model's obligations can be realized by extending the principle of explicit specification of PI usage purposes (Art 6 Directive 95/46/EC, Art 5 [51], Art 5 Convention 108, Para. 9 OECD, Art 1 FIP): policy-compliant PI use (Oblg. 2), standardized PI usage policies (Oblg. 4), and separated PI handling (Oblg. 7). First, the obligation to handle PI in accordance with an agreed PI usage policy is fulfilled when 1st or 2nd tier parties cannot unilaterally extend beyond the purposes that have been agreed upon with data subjects. Second, standardized PI usage policies extend the principle of purpose specification by restricting the contractual freedom of data subject and 1st tier partner to pre-specified modes of PI use. Standardized policies do not imply a legal numerus clausus of PI usage contract types; this is most likely a matter of industry-specific self- and co-regulation. Third, separating the PI of multiple data subjects or 1st tier partners to prevent de-contextualization is automatically ensured if PI cannot be used for purposes other than those specified in the policy.

The obligations to initiate an accountability system (Oblg. 5), to request authorization from the 1st tier partner when using PI (Oblg. 6), and to demonstrate legal compliance (Oblg. 8) are part of the principle of accountability (Art 22 [51], Para. 14 OECD). Implementing an accountability system ensures that 1st and 2nd tier parties are technically capable of complying with the accountability principle. Requesting authorization from the 1st tier partner when using PI is necessary to achieve accountability; this requirement is recognized in Art 2 (f) Directive 95/46/EC, where third parties process PI "under the direct authority of the controller or processor". The accountability principle requires PI users to demonstrate legal compliance. We extend applicable law by requiring companies to ensure policy-compliant data processing by using appropriate technical systems (a minimal illustrative sketch follows below).

Finally, the obligation to anonymize any information exchanged in the 3rd tier (Oblg. 9) already exists, to some extent, in the principle of data quality (Art 6 Directive 95/46/EC). PI should "be kept in a form which permits identification of data subjects for no longer than is necessary […]". To this vague formula, our model adds a clear anonymity frontier that unambiguously determines when anonymization takes place. Best available technique reference documents (BREFs), kept current by data protection authorities, prescribe state-of-the-art anonymization technologies. Moreover, our model supports an information-rich space by imposing more stringent sanctions for the abuse of identified information than are currently in place [41]. To conclude, our model's rights and obligations can be implemented by making only minor adaptations to the current legal framework.
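The interplay of purpose binding (Oblg. 2), standardized policies (Oblg. 4) and an auditable accountability trail (Oblg. 5, 8) can be illustrated in a few lines of code. The following Python fragment is a minimal, hypothetical sketch, not an implementation of the chapter's model; all names (STANDARD_PURPOSES, UsagePolicy, use_pi) and the purpose catalogue are invented for illustration.

    # Hypothetical sketch: purpose-bound PI use with an accountability trail.

    STANDARD_PURPOSES = {"billing", "personalization", "delivery"}  # assumed industry catalogue

    class UsagePolicy:
        def __init__(self, agreed_purposes):
            unknown = set(agreed_purposes) - STANDARD_PURPOSES
            if unknown:
                # Standardized policies (Oblg. 4): only pre-specified modes of PI use.
                raise ValueError("non-standard purposes: %s" % unknown)
            self.agreed_purposes = set(agreed_purposes)

    def use_pi(policy, purpose, audit_log):
        # Purpose binding (Oblg. 2): a use is compliant only if agreed upon.
        compliant = purpose in policy.agreed_purposes
        # Accountability (Oblg. 5, 8): every attempted use leaves a trace.
        audit_log.append((purpose, compliant))
        return compliant

    log = []
    policy = UsagePolicy({"billing"})
    assert use_pi(policy, "billing", log)              # within the agreed policy
    assert not use_pi(policy, "personalization", log)  # refused and logged

The point of the sketch is that the compliance check and the audit record are produced by the same mechanism, so demonstrating legal compliance (Oblg. 8) reduces to inspecting the trail.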


3. Discussion

We are aware that many of the rights, obligations and legal and technical enablers we propose are not new. They have been proposed for over two decades by researchers in privacy, identity, security, and legal studies, and debated by companies and regulators. For example, we do not need new security mechanisms to identify the 1st tier partner; we can build on existing technologies, which were outlined in Section 2.1. However, no one has demonstrated how all of the puzzle pieces could be arranged in a market model that benefits both individuals and companies. For personal information markets to benefit both sides, the enforcement of market rules must be improved. The main design principle of our market model is to combine legal and technical mechanisms, overcoming the weaknesses that each type of mechanism has on its own. A legal property right to PI, backed up by technical accountability for data usage, simplifies access to law enforcement for data subjects.

One burden that companies will have to carry is to finally provide people with a privacy-friendly default service option. But, as we have shown in this chapter, the burden is not that heavy. Companies can re-enter competition on the basis of service quality. Furthermore, our model meets the privacy preferences of different individuals: access to content at potentially lower cost for those who are willing to 'pay' with their PI, and alternative versions for customers who are concerned about their privacy. Privacy rights proponents may argue that this preference-based market structure disadvantages the poor, who may be forced to sell their PI. This argument holds only if marketers choose to have people pay for the privacy-friendly version. Marketers could also make the data-rich version more attractive from a service perspective (with greater functionality and no ads) while offering a baseline: a privacy-friendly service with non-personalized ads.

With respect to the default privacy-friendly version, we suggest regulating the identified market space. One market regulation may be enforcing a price cap and a minimum service quality obligation for privacy-friendly services. Price regulation is common for many service and product areas, including books, public housing, water and electricity, and parks and roads. And even if people were asked to pay more than they do now, we argue that other service areas have seen successful transitions from an initial free offering to paid-for offerings: for example, the short message service (SMS) has become an important source of income for mobile operators even though it was initially a free by-product of telephony services. Finally, even if individuals opted into the usage of their PI in exchange for the service, our market proposal provides privacy protection: companies would be accountable and liable for how they use PI. Limitless reuse and repackaging out of context would be outlawed. Privacy risks would hence be limited, even for those who share. As data subjects will have property rights to their PI, they will also be brought back to the negotiating table. Property rights, a right to privacy-friendly service options and defaults, company accountability and a transparent market structure promise to re-establish the trust we need if we are to see information services flourish.


A core benefit of our model is also its main technical challenge: the creation of a free market space that ensures anonymity. Ensuring anonymity becomes more difficult as technology becomes more powerful, facilitating identification. Anonymization could reduce the entropy of information to such an extent that its utility for information users would vanish. For multidimensional PI that contains many attributes about data subjects, the "curse of dimensionality" forces that information to be extensively aggregated to guarantee reasonable anonymity [50]. Utility-based privacy preservation, however, aims to ensure that the utility of anonymized data does not drop by more than a defined threshold; ε-differential privacy [53,54] provides a formal foundation for such guarantees by bounding, via the parameter ε, how much any single data subject's record can influence released results (a toy sketch follows below). Data protection authorities define the "BAT" (Best Available Techniques) (Directive 2010/75/EU) that guarantee sufficient anonymity. Flourishing service spaces based on "non-identified, social data" instead of "personal data" may be the result. Information buyers typically want to obtain a representative sample of a population of individuals, not the information of identified single data subjects [50]. Some information buyers in the current PI ecosystem cannot use anonymized information and require identities. One example is genetic researchers, who inherently operate on an individual's genome. Such fields, though, would not be excluded from gaining access to PI. Every party requiring PI can take on the role of a 1st tier partner and purchase the information with the informed consent of the data subject.

Another benefit we see is that our model builds on readily available technologies. Technical feasibility depends heavily on whichever infrastructure changes are necessary to put the proposal into effect. Ideally, the enforcement of property rights to PI would require technology that can reliably certify the identities of property right holders and data subjects [55]. Unfortunately, waiting for the missing identity layer of the Internet [56] would probably be like waiting for Godot. Consequently, our proposal would be interoperable with, but does not necessarily require, additional large-scale technical infrastructure (such as an identity layer) on the Internet. The technologies compatible with our model, as described in Section 2, can be readily implemented by 1st tier partners.

Finally, two more fundamental challenges of our model must be considered: the concern of 'monopolizing' information and the international enforceability of our model. The idea that personal data could be recognized as property originated in the US; it has been met by the criticism that people should not be 'propertized' [3,26], as well as by a series of other arguments (for an overview see [25]). Ralph Waldo Emerson once remarked, "As long as our civilization is essentially one of property, of fences, of exclusiveness, it will be mocked by delusions." For these reasons, we view the idea of property rights to PI critically. However, because markets already treat PI as property, we ask only that individuals be accorded the same rights that companies have already claimed for themselves. Moreover, a property right would not substitute for, but enhance, the status of privacy as a basic human right [24]. In Europe, it would provide people with an additional legal instrument, giving them easy access to existing, well-proven enforcement structures. Data subjects would be enabled to effectively claim their rights to PI on their own instead of calling on data protection authorities. Even though data protection authorities have tried to support data subjects in cases of data breach, their effectiveness is limited. Their independence stands on shakier ground than that of ordinary courts. Most importantly, they do not have the capacity to handle the volume of cases that require settlement in personal data markets. It therefore seems more appropriate to give data subjects access to the existing law enforcement infrastructure.
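The ε-differential privacy guarantee mentioned above can be made concrete with the Laplace mechanism of [54]. The following Python fragment is a toy sketch: the data, function names and parameter choices are invented, and it illustrates the general technique rather than any specific component of our model.

    # Toy sketch of the Laplace mechanism for epsilon-differential privacy [54].
    import math
    import random

    def laplace_noise(scale):
        # Inverse-CDF sampling from a Laplace(0, scale) distribution.
        u = random.random() - 0.5
        return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

    def dp_count(values, predicate, epsilon):
        # A count has sensitivity 1: adding or removing one data subject
        # changes it by at most 1, so Laplace noise with scale 1/epsilon
        # yields epsilon-differential privacy for this query.
        true_count = sum(1 for v in values if predicate(v))
        return true_count + laplace_noise(1.0 / epsilon)

    ages = [23, 35, 41, 29, 52, 33]  # invented example data
    print(dp_count(ages, lambda a: a > 30, epsilon=0.5))

Smaller values of ε give stronger privacy (more noise) at the cost of utility, which is precisely the trade-off a regulated anonymity frontier would have to calibrate.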


Another challenge to our model is its international practicability. Recent years have shown how difficult it is to reach international consensus on data protection or privacy, and enforcement is even more difficult. The Safe Harbor Agreement between the US and Europe on data handling practices is a good example of this failure. A more effective path might be to implement and enforce binding, hard law for data protection. For example, property rights are enforceable as well-recognized legal instruments in both the European and US legal orders. If Europe and the US applied property rights to PI [57], the rest of the world would potentially follow suit.


4. Conclusion

For information about individuals to be used effectively, participants in the digital economy must have enough trust to willingly share and exchange information. "Social data" is becoming an increasingly important ingredient for innovative companies; PI is used with astonishing accuracy by applications that predict traffic jams and monitor public health in real time. Notwithstanding the increasing regulation of data protection and privacy, individuals feel their privacy is being progressively undermined by the collection, aggregation, storage, exchange and sale of PI for opaque purposes and with shallow consent. Similarly, organizations using PI are worried about unpredictable consumer backlash. The missing governance of personal data markets threatens the autonomy of individuals and undermines the benefits of having an abundance of information available.

Markets for PI based on property rights have long been recognized as an alternative, but early proposals, which did not clearly delineate themselves from 'propertizing' data subjects, have been viewed with suspicion by data protection proponents. Our vision for a personal information market could pave the way to a consensus between players in the current PI ecosystem and data protection proponents. Our model embraces data richness as the future of the digital economy, and creates room for information-rich services and data trading as well as identified customer relationships. To help people understand their transactions with companies and the value of their PI, we create a new and simple market structure that assigns clear rights and obligations to all market players. Trust built by a clear allocation of rights also aids companies and legal enforcers. At the same time, our technical and legal suggestions advance the ideals of digital enlightenment by empowering people to participate in PI markets while protecting their privacy.

Acknowledgement

We would like to thank Julian Cantella for the editing of the text.

References

[1] Bergelson, V.: It's Personal But Is It Mine? Toward Property Rights in Personal Information. UC Davis Law Review 37, 379–451 (2003).
[2] WEF: Personal Data: The Emergence of a New Asset Class, World Economic Forum, Jan (2011).


[3] Noam, E.M.: Privacy and Self-Regulation: Markets for Electronic Privacy. In: Wellbery, B.S. (ed.) Privacy and Self-Regulation in the Information Age, pp. 21–33. NTIA (1997).
[4] WEF: Rethinking Personal Data: Strengthening Trust, World Economic Forum, May (2012).
[5] Angwin, J.: Online Tracking Ramps Up – Popularity of User-Tailored Advertising Fuels Data Gathering on Browsing Habits. Wall Street Journal, June 18, B1 (2012).
[6] Brock, J.: Introducing Privacyfix: Now it's up to you. Privacychoice, Oct 9, 2012, http://blog.privacychoice.org/2012/10/09/your-privacy-simplified/.
[7] Bott, E.: The Do Not Track Standard has Crossed into Crazy Territory, Oct 9, 2012, http://www.zdnet.com/the-do-not-track-standard-has-crossed-into-crazy-territory-7000005502/.
[8] Tißler, J.: Heftig: iPhone und iPad speichern Location auf Schritt und Tritt. t3n Open Web Business, April 20 (2011).
[9] Laudon, K.C.: Markets and Privacy. Communications of the ACM 39, 92–104 (1996).
[10] Personal Data in the Cloud: A Global Survey of Consumer Attitudes, Fujitsu Research Institute (2010).
[11] Cheng, W.C., Golubchik, L., Kay, D.G.: Total Recall: Are Privacy Changes Inevitable? In: Proceedings of the 1st Workshop on Continuous Archival and Retrieval of Personal Experiences, pp. 86–92. ACM, New York, USA (2004).
[12] Metakides, G.: Foreword. In: Bus, J., Crompton, M., Hildebrandt, M., Metakides, G. (eds.) Digital Enlightenment Yearbook 2012, pp. v–vi. IOS (2012).
[13] Cranor, L.F., Guduru, P., Arjula, M.: User Interfaces for Privacy Agents. ACM Transactions on Computer-Human Interaction 13, 135–178 (2006).
[14] Karjoth, G., Schunter, M., Waidner, M.: Privacy-Enabled Services for Enterprises. In: Proceedings of the 13th International Workshop on Database and Expert Systems Applications (DEXA), pp. 483–487, Aix-en-Provence (2002).
[15] Gritzalis, S.: Enhancing Web Privacy and Anonymity in the Digital Era. Information Management & Computer Security 12, 255–287 (2004).
[16] Bella, G., Giustolisi, R., Riccobene, S.: Enforcing Privacy in E-commerce by Balancing Anonymity and Trust. Computers & Security 30, 705–718 (2011).
[17] Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness & Knowledge-Based Systems 10, 557–570 (2002).
[18] Spiekermann, S., Dickinson, I., Günther, O., Reynolds, D.: User Agents in E-commerce Environments: Industry vs. Consumer Perspectives on Data Exchange. In: Eder, J., Missikoff, M. (eds.) LNCS, vol. 2681, pp. 696–710. Springer, Berlin (2003).
[19] Borcea-Pfitzmann, K., Pfitzmann, A., Berg, M.: Privacy 3.0 := Data Minimization + User Control + Contextual Integrity. IT – Information Technology 53, 34–40 (2011).
[20] Acquisti, A.: The Economics of Personal Data and the Economics of Privacy. 30 Years after the OECD Privacy Guidelines. OECD (2010).
[21] Posner, R.A.: The Economics of Privacy. American Economic Review 71, 405–409 (1981).
[22] Calzolari, G., Pavan, A.: On the Optimality of Privacy in Sequential Contracting. Journal of Economic Theory 130, 168–204 (2006).
[23] Acquisti, A., Varian, H.R.: Conditioning Prices on Purchase History. Marketing Science 24, 367–381 (2005).
[24] Purtova, N.: Property Rights in Personal Data: A European Perspective. Dissertation, Uitgeverij BOXPress, Oisterwijk (2011).
[25] Schwartz, P.M.: Property, Privacy, and Personal Data. Harvard Law Review 117, 2056 (2003).
[26] Cohen, J.E.: Examined Lives: Informational Privacy and the Subject as Object. Stanford Law Review 52, 1373–1437 (1999).
[27] Aperjis, C., Huberman, B.: A Market for Unbiased Private Data: Paying Individuals According to Their Privacy Attitudes. HP Working Paper (2012).
[28] Smith, H.J., Milberg, S.J., Burke, S.J.: Information Privacy: Measuring Individuals' Concerns About Organizational Practices. Management Information Systems Quarterly 20, 167–196 (1996).
[29] Spiekermann, S.: Perceived Control: Scales for Privacy in Ubiquitous Computing. In: Acquisti, A., Vimercati, S.D.C.d., Gritzalis, S., Lambrinoudakis, C. (eds.) Digital Privacy: Theory, Technologies and Practices, pp. 5–25. Taylor & Francis, New York (2005).
[30] Jentzsch, N., Preibusch, S., Harasser, A.: Study on Monetising Privacy: An Economic Model for Pricing Personal Information. ENISA (2012).
[31] Oetzel, M.C., Spiekermann, S.: A Systematic Methodology for Privacy Impact Assessments – A Design Science Approach (forthcoming 2012).
[32] Pachinger, M.M.: Der neue "Cookie-Paragraph" – Erste Gedanken zur Umsetzung des Art 5 Abs 3 E-Privacy-RL in § 96 Abs 3 TKG 2003 idF BGBl I 2011/102. jusIT 2012/8 (2012).
[33] Art29WP: 01197/11/EN WP 187 – Opinion 15/2011 on the Definition of Consent, Article 29 Data Protection Working Party, adopted on 13 July (2011).


[34] Spiekermann, S., Cranor, L.F.: Engineering Privacy. IEEE Transactions on Software Engineering 35, 67–82 (2009).
[35] Spiekermann, S., Korunovska, J., Bauer, C.: Psychology of Ownership and Asset Defense: Why People Value Their Personal Information Beyond Privacy. In: Proceedings of the International Conference on Information Systems (ICIS), Orlando, FL, USA (2012).
[36] Culnan, M.J.: "How Did They Get My Name?": An Exploratory Investigation of Consumer Attitudes toward Secondary Information Use. Management Information Systems Quarterly 17, 341–363 (1993).
[37] Art29WP: 00264/10/EN WP 169 – Opinion 1/2010 on the Concepts of "Controller" and "Processor", Article 29 Data Protection Working Party, adopted on 16 February 2010 (2010).
[38] Nissenbaum, H.: Privacy as Contextual Integrity. Washington Law Review 79, 119 (2004).
[39] Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy Beyond k-Anonymity. ACM Transactions on Knowledge Discovery from Data 1, 3 (2007).
[40] Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: Proceedings of the 23rd IEEE International Conference on Data Engineering (ICDE), pp. 106–115 (2007).
[41] Traung, P.: The Proposed New EU General Data Protection Regulation – Further Opportunities. Journal of Information Law and Technology 2, 33–49 (2012).
[42] Shen, Y., Pearson, S.: Privacy Enhancing Technologies: A Review. Report HPL-2011-113, Hewlett-Packard Laboratories (2011).
[43] Ringelstein, C., Staab, S.: DIALOG: Distributed Auditing Logs. In: Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 429–436, Los Angeles, CA, USA (2009).
[44] Mont, M.C., Pearson, S., Bramhall, P.: Towards Accountable Management of Identity and Privacy: Sticky Policies and Enforceable Tracing Services. In: Proceedings of the 14th International Workshop on Database and Expert Systems Applications (DEXA), pp. 377–382, Prague (2003).
[45] Mont, M.C., Bramhall, P.: IBE Applied to Privacy and Identity Management. Technical Report HPL-2003-101, Hewlett-Packard Laboratories (2003).
[46] Guha, S., Tang, K., Francis, P.: NOYB: Privacy in Online Social Networks. In: Proceedings of the 1st Workshop on Online Social Networks, pp. 49–54. ACM, Seattle, WA, USA (2008).
[47] Recordon, D., Reed, D.: OpenID 2.0: A Platform for User-Centric Identity Management. In: 2nd ACM Workshop on Digital Identity Management, pp. 11–16, Alexandria, VA, USA (2006).
[48] P3P: The Platform for Privacy Preferences 1.1 Specification, W3C, 13 Nov (2006).
[49] Borking, J.: Privacy Incorporated Software Agent (PISA): Proposal for Building a Privacy Guardian for the Electronic Age. In: Federrath, H. (ed.) Anonymity 2000, LNCS, vol. 2009, pp. 130–140. Springer, Berlin (2001).
[50] Aggarwal, C.C., Yu, P.S.: A General Survey of Privacy-Preserving Data Mining Models and Algorithms. In: Aggarwal, C.C., Yu, P.S. (eds.) Privacy-Preserving Data Mining, vol. 34, pp. 11–52. Springer, New York (2008).
[51] COM: Proposal for a Regulation of the European Parliament and of the Council on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data (General Data Protection Regulation), European Commission, Jan 25 (2012).
[52] Demsetz, H.: Toward a Theory of Property Rights. The American Economic Review 57, 347–359 (1967).
[53] Ghosh, A., Roth, A.: Selling Privacy at Auction. In: Proceedings of the 12th ACM Conference on Electronic Commerce (EC), pp. 199–208. ACM, San Jose, CA (2011).
[54] Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating Noise to Sensitivity in Private Data Analysis. In: Halevi, S., Rabin, T. (eds.) Theory of Cryptography (TCC 2006), LNCS, vol. 3876, pp. 265–284. Springer, Berlin (2006).
[55] Sholtz, P.: Economics of Personal Information Exchange. First Monday 5 (2000).
[56] Cameron, K.: Laws of Identity, Microsoft (2005).
[57] Purtova, N.: Property Rights in Personal Data: Learning from the American Discourse. Computer Law & Security Review 25, 507–521 (2009).


Part III


Architectures for PDMs and PDEs



Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-123


Online Privacy – Towards Informational Self-Determination on the Internet

Simone FISCHER-HÜBNER a, Chris HOOFNAGLE b, Ioannis KRONTIRIS c, Kai RANNENBERG c,1, Michael WAIDNER d and Caspar BOWDEN e
a Karlstad University, Sweden
b University of California, Berkeley, USA
c Goethe University Frankfurt, Germany
d TU Darmstadt, Germany
e Independent Privacy Researcher

Abstract. While the collection and monetization of user data has become a main source of funding for "free" services like search engines, online social networks, news sites and blogs, neither privacy-enhancing technologies nor the respective regulations have kept up with user needs and privacy preferences. The aim of this chapter is to raise awareness of the actual state of the art of online privacy, especially in the international research community and in ongoing efforts to improve the respective legal frameworks, and to provide concrete recommendations to industry, regulators, and research agencies for improving online privacy. In particular, we examine how the basic principle of informational self-determination, as promoted by European legal doctrines, could be applied to infrastructures like the Internet, Web 2.0 and mobile telecommunication networks.


Keywords. Online privacy, privacy-enhancing technologies, transparency-enhancing tools, data protection directive

Introduction

The principle of informational self-determination is of special importance for online privacy because of the infrastructural and interactive nature of modern online communication and because of the options that modern computers offer, even though it is much older than the notion of "Online Privacy". Well before the advent of Web 2.0, the term informational self-determination originated in the context of a German constitutional ruling, related to the 1983 census, making Germany the first country to establish the principle of informational self-determination for its citizens. The German Federal Constitutional Court ruled that:2

"[. . . ] in the context of modern data processing, the protection of the individual against unlimited collection, storage, use and disclosure of his/her personal data is encompassed by the general personal rights of the [German Constitution]. This basic right warrants in this respect the capacity of the individual to determine in principle the disclosure and use of his/her personal data. Limitations to this informational self-determination are allowed only in case of overriding public interest."

To put it simply, this provision gave individuals the right to determine what personal data is disclosed, to whom, and for what purposes it is used. Informational self-determination also reflects Westin's description of privacy as "the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others" [1]. Despite this legal development, the path of the Internet did much to undermine these values, and nowadays individuals have effectively lost control over the collection, disclosure and use of their personal data.

With the evolution and commercialization of the Internet and the advent of Web 2.0, including its search engines and social networks, the environment in which we need to support online privacy and informational self-determination has become more complex. Some new business models, like ad-financed "free services" for Internet users, rely on wide-ranging collection of user data for various purposes, such as marketing of online shops or targeted advertising, and include user profiling. It appears that many users of online services are unaware of the implications of this business model. In other contexts, data collected for commercial uses has later been employed for government purposes; this has been made possible by the fact that the rules of "free services" place little restriction on the reuse of data [2].

This chapter is an updated version of the Manifesto that was produced as the main result of the Dagstuhl Perspectives Workshop 11061, which took place in February 2011 [3]. Not all ideas in the Manifesto are necessarily shared by all participants in the workshop. A primary challenge that it deals with is the correction of power imbalances arising from a loss of informational self-determination, as introduced above. The rest of the document is structured as follows: Section 1 discusses the current state of online privacy and existing technologies to protect privacy, as well as transparency in online information systems, which gives users leverage to protect their privacy. Section 2 analyzes what Engineering and Industry can do to improve online privacy. Section 3 gives recommendations on how regulators can improve online privacy, and Section 4 suggests longer-term research topics that are needed to improve online privacy. Finally, the Annex provides further details on these four main sections.

1 Corresponding Author: Institute of Business Informatics, Goethe University Frankfurt, Grüneburgplatz 1, D-60629 Frankfurt am Main, Germany; E-mail: [email protected].
2 BVerfGE 65,1 – Volkszählung, available in English at http://en.wikipedia.org/wiki/Informational_self-determination.

1. State of the Art

The state of the art in online privacy includes extensive work in a broad spectrum of disciplines. The primary purpose of this section is to set the landscape for the chapter and to respond to clusters of specific questions that arise concerning the current state of work in the area of online privacy.

1.1. Understanding Online Privacy

Privacy is subjective and contextual, and therefore hard to evaluate. In this regard, one of the main challenges that researchers are currently exploring is the analysis of individual attitudes to privacy. For instance, research has shown that most users of websites with customizable privacy settings, such as Online Social Networks (OSNs), maintain the default permissive settings, which may lead to unwanted privacy outcomes [4]. The explanation for this behaviour is not necessarily that users do not care about their privacy.3 Instead, existing studies demonstrate an ambivalence in users' attitudes towards privacy [5,6]. What makes it more difficult to interpret people's attitudes to privacy is that the notion of privacy differs or changes depending on the culture individuals come from. So there is still much need for experiments with individuals that allow a broader range of privacy-related hypotheses to be tested and enable a better understanding of people's concerns and of the actions they take to address those concerns.

This analysis becomes particularly difficult since frequently there is no immediate damage to individuals. Even though, in some cases, an individual may directly experience an offence if harassed, manipulated or embarrassed as a result of a prior privacy violation, more frequently the consequences occur only later or not at all, as for example in third-party tracking of online behaviour for targeted advertisements. Even though tools that try to limit this tracking by third parties exist, with varying degrees of effectiveness, these tools do not reveal the impact of processing and usage of personal data by third parties for their own purposes. Evidence also exists to show that third parties are not only receiving browsing-behaviour information about individuals, but are in a position to link this behaviour to personal information via sites such as OSNs [7].

While being subjective and contextual, privacy as a concept has a larger function in society. In this chapter we discuss privacy in the light of collection and usage practices, but a more general discussion of the basic values that are challenged by the changes brought about by a networked society remains open. What are the effects on our democratic societies of massive-scale data collection, trend prediction and individual targeting? Are people forced into higher conformance? Is conformance pressure affecting the building of political opinions? A scientific approach to these questions cannot rely on the repetition of an old mantra saying that data collection is bad, but will undertake research into the new power relations as they form in the new networked landscape.

3 On the contrary, several polls and surveys support the opinion that individuals care about their privacy. For example, such a collection for US consumers is presented in http://www.cdt.org/privacy/guide/surveyinfo.php.

Copyright © 2013. IOS Press, Incorporated. All rights reserved.

1.2. Privacy Technology Landscape and Technology Transfer

In this section, we briefly sketch the current landscape of privacy-enhancing technologies (PETs) and then try to shed light on the reasons for the lack of widespread adoption of PETs. There is a growing amount of research in the field of PETs, proposing technologies that address various aspects of the privacy problem [8]; yet PETs are not widely adopted in practice. One cannot expect a simple explanation for this, as online privacy is a complex and interdisciplinary issue. Therefore, we will revisit this issue in the next sections from the perspectives of different disciplines, with the goal of suggesting specific actions. But first, in this section, we set the landscape from a more general perspective. More specifically, we elaborate on the following reasons, with the understanding that this is an incomplete list of issues:

• the current economic environment fosters personal data collection in some business models,
• user awareness of privacy problems, as well as demand for transparency of data usage and information processing, is low,
• today's PETs still lack usability, scalability and portability in many cases,
• regulatory and technical agendas lag behind new data collection practices and data flows,
• the integration of many new PETs requires costly changes to the existing infrastructure.

After more than 20 years of research in the area of privacy and PETs, there exists a wide variety of mechanisms [9]. Broadly speaking, we can distinguish between opacity tools and tools that enforce other legal privacy principles, such as transparency, security or purpose binding.4 Opacity tools can be seen as the "classical" PETs, which "hide information", i.e. strive for data minimization and unlinkability. They cover a wide variety of technologies, ranging from cryptographic algorithms and protocols (e.g. [homomorphic] encryption, blind and group signatures, anonymous credentials, oblivious transfer, zero-knowledge proofs, etc.) to complex systems like user-centric identity management. Opacity tools can be further characterized depending on whether they focus on data minimization at the network layer or at the application layer. Proposals for achieving sender or recipient anonymity at the network layer comprise protocols such as Chaumian Mixes, DC-Nets, etc. At the application layer, a much greater variety of technology proposals exists, such as private information retrieval, privacy-preserving data mining (random data perturbation, secure multiparty computation), biometric template protection, location privacy, digital pseudonyms, anonymous digital cash, privacy-preserving value exchange, privacy policies, etc.

4 Purpose binding means that personal data should be relevant to the purposes for which they are to be used and to the extent necessary for those purposes, and should not be usable in other contexts.

Transparency-enhancing tools (TETs) belong to the second category of PETs and focus on enforcing transparency in cases where personal data need to be processed. By transparency we mean the informative representation to the user of the legal and technical aspects of the purpose of data collection: how the personal data flows, where and how long it is stored, what type of controls the user will have after submitting the personal data, who will be able to access the information, etc. TETs frequently consist of end-user transparency tools and services-side components enabling transparency [10]. The end-user tools include, among other techniques, (1) tools that provide information about the intended collection, storage and/or data processing to users when personal data are requested from their system (via personalized apps or cookies) and (2) technologies that grant end-users online access to their personal data and/or to information on how their data have been processed and whether this was in line with privacy laws and/or negotiated policies.5 Examples are the Google Dashboard6 or Amazon's Recommendation Service, which grant users online access to their data and allow them to rectify and/or delete their data. However, these are server-side functions rather than user-side tools, and they usually grant users access only to parts of their data, not to all the data that the respective service processes. An example of a user-side transparency-enhancing tool is the Data Track developed in the EU project PrimeLife [12], which gives the user an overview of what data have been sent to different data controllers and also makes it possible for a data subject to access her personal data and see information on how her data have been processed and whether this was in line with privacy laws and/or negotiated policies.

5 A third type of TETs, which has so far only been discussed in the research community, includes tools with "counter profiling" capabilities helping a user to "guess" how her data match relevant group profiles, which may affect her future opportunities or risks [11].
6 https://www.google.com/dashboard/
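As a concrete, if simplified, illustration of such a user-side TET, the Python sketch below logs disclosures in the spirit of the PrimeLife Data Track [12]. The class and method names are invented for illustration; a real tool would additionally record the negotiated policies and support checking processing against them.

    # Hypothetical sketch of a user-side transparency log (cf. the Data Track [12]).
    from datetime import datetime, timezone

    class DataTrack:
        def __init__(self):
            self._records = []

        def log_disclosure(self, controller, attributes, policy):
            # One record per disclosure: what went where, under which policy.
            self._records.append({
                "when": datetime.now(timezone.utc).isoformat(),
                "controller": controller,
                "attributes": list(attributes),
                "policy": policy,
            })

        def disclosures_to(self, controller):
            # Overview for the data subject: what does this controller hold?
            return [r for r in self._records if r["controller"] == controller]

    track = DataTrack()
    track.log_disclosure("shop.example", ["email", "address"], "delivery-only")
    print(track.disclosures_to("shop.example"))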


In the current state, once the data has been submitted to an online information system, individuals are given no knowledge of any further processing. But even if we assume that the data processing of such complex systems as Facebook, Apple iTunes or Google Search could be made transparent to the public, it would be hard or impossible for ordinary individuals to understand what happens to their data. Full transparency of data movements also increases security problems in such environments if misused with malicious intent. Consequently, this limitation leads to the observation that it is more important for individuals to understand the outcome and implications of data flows in complex online information systems than to understand the full data movements. One technique, among others, which can achieve this kind of transparent, outcome-based approach is the creation of ad preferences by some third-party advertisers, where users are allowed to see the set of outcomes on the basis of which their data has been forwarded to the third party (examples here include Google Ad Categories7 and the Deutsche Telekom Privacy Gateway for location-based services).

In general, most, if not all, of the proposed PET solutions lag behind real-world situations. They still need to overcome the shortcomings of current approaches, as real-world solutions require properties like usability, scalability, efficiency, portability, robustness, preservation of system security, etc. Today, only a patchwork of mechanisms exists, far from a holistic approach to solving the privacy problem. The interaction between these mechanisms and their integration in large-scale infrastructures, like the Internet, is not well understood. Our infrastructures have not been designed with privacy in mind, and they evolve continuously and rapidly, integrating new data collection practices and flows. Current privacy mechanisms not only have difficulties in catching up with these developments, but they also collide with some security and business requirements. A redesign of the system in question can often resolve the collision of interests, but this sometimes requires costly investments.

At the same time, the demand of users for PETs is rather low today. One reason for this is the lack of user awareness with respect to privacy problems, which can be partly attributed to the missing transparency of data acquisition and the related information processing, as emphasized above. A second reason lies in the complicated and laborious nature of the control imposed on persons, as no legal standards or general consumer protection rules exist. Finally, PETs do not always take into consideration the evolution of privacy models caused by the rapid creation of new technologies and communication models.

Yet another important reason for the lack of adoption of existing PETs lies in some models of data commerce that are based on access to personal data. In the current ecosystem, doing nothing about privacy or even aggressively collecting data sometimes pays off, as some companies seem to acquire new clients with new features based on creative data use and serendipity. Furthermore, for some players, implementing complex data minimization schemes is costly and time-consuming and makes information filtering catered to the end-user much harder, if not impossible. It is important to note here, however, that this approach is not adopted by all industry players. The next section takes a closer look at the problem of adoption of PETs by industry and suggests addressing specific challenges to overcome this problem.

7 http://www.google.com/ads/preferences


2. Engineering and Industry Options

Generally speaking, there is a lack of clear incentives for enterprises to manage personal data in a privacy-respecting manner, to design privacy-preserving products, or to make the use of personal data transparent to the data subject.8 We identify the following root causes of this current situation:

1. Lack of customer (individuals, business partners) and market demand for privacy-respecting ICTs, systems, services and controls (beyond punishments for breaches and other excesses). Usage models for privacy-enhancing technologies cannot currently be targeted to customer demand;
2. Some industry segments' norms, practices and other competitive pressures that favour exploiting personal data in ways contrary to privacy and the spirit of informational self-determination (resulting in diffusion of transparency and accountability);
3. Poor awareness, desire, or authority within some industry segments regarding the operationalization of privacy (e.g., to integrate existing PETs, to design privacy-respecting technologies and systems, and to establish, measure and evaluate privacy requirements); and
4. Lack of clarity, consistency, and international harmonization in legal requirements governing data privacy within and across jurisdictions (avoided, for example, by migrating data somewhere up in the cloud).

8 We make a disclaimer here that these deficiencies do not apply across the board to all enterprises.

To improve the current environment, we need to increase awareness across users, industry and technologists regarding


• the protection of the privacy of users across different media,
• transparency in the processing of personal data,
• the acceptance and incorporation of improved privacy-enhancing technologies by technologists outside of the "privacy community".

To support this goal, we recommend that industry address three mid-term challenges, which we discuss in the rest of this section.

2.1. Challenge 1: Promoting Transparency

2.1.1. Transparency-Enhancing Tools

Transparency-enhancing technologies (TETs), which have been developed in recent years within research projects and by industry, can help end users better understand privacy implications and thus help to increase user awareness, as demanded above. At the same time, allowing users to control and correct their data as processed on the services side will also lead to better data quality for the respective industries. Challenges that still remain for practical TETs include the following:

• Providing transparency in a privacy-friendly manner means that TETs should work for pseudonymous users. Industry should consider the integration of existing research prototypes and concepts of such privacy-friendly TETs, like the PrimeLife Data Track [8], into real-world processes and IT systems.
• The open-source development of transparency-enhancing technologies and end-user tools needs to increase, in order to lower market entry costs.
• Better user interfaces for transparency tools in complex environments will need to be created. User-friendly display of the data handling practices of "hidden" data processors will also play an important role.

2.1.2. Transparency Within Industrial Organizations

Industry needs to foster in-house transparency and awareness of the risks of system-immanent privacy issues in order to effectively enhance privacy in the products and services it develops. Principles such as data minimization and purpose binding have to become design principles for processes, IT, service and product design. Industry needs to consistently consider privacy issues, risks, and privacy principles in internal guidelines. These guidelines need to be communicated to engineers, developers, etc. to create a "culture of privacy".

2.2. Challenge 2: Designing and Delivering Privacy-Respecting Products to End Users

2.2.1. Demonstrating the Power of PETs by Blueprints and Sample Prototypes


When building applications, engineers often lack practical knowledge on incorporating PETs to achieve security and privacy protection. To support engineers in employing privacy-enhancing technologies, we propose building blueprints and sample prototypes for key scenarios and for different industries. Examples of such prototypes include the following:

• A service that can be delivered to a user on a mobile device, such that the parties involved are able to deliver their parts and are paid for their service, while the user is ensured that every such party receives and stores only minimal data. The user is provided with transparency and control of his own data flows, while data dispersion is minimized, e.g. by attribute-based access control [13] (a toy sketch of such an attribute-based check follows at the end of this subsection).
• A communication platform that offers its users a convenient communication and collaboration environment with simple and secure user privacy controls to set the audience for certain private data, dependent on different social roles, and with support for user pseudonyms. The prototype must further demonstrate its economic viability by proper business models that do not conflict with privacy requirements.

2.2.2. Open or Shared-Source Developments

Sharing source code that can be reused and adopted easily allows market entrants to lower development costs. One example is the WebKit library.9 An open-source suite of privacy-enhancing tools can lower market entry costs for companies that want to offer privacy products and can support the emergence of non-commercial software that integrates privacy-protecting functions.

9 http://www.webkit.org/
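To illustrate the attribute-based access control mentioned in the first prototype above, the following toy Python sketch grants each party only the PI attributes its own attributes entitle it to. The policy contents and all names are invented; a production system would rely on an established ABAC framework and policy language rather than this hand-rolled check.

    # Toy attribute-based access control (ABAC) check; policy invented.
    POLICY = [
        # (required party attributes) -> PI attributes the party may read
        ({"role": "payment-provider"}, {"amount", "billing_token"}),
        ({"role": "courier", "region": "EU"}, {"delivery_address"}),
    ]

    def permitted_attributes(party):
        granted = set()
        for conditions, attrs in POLICY:
            if all(party.get(k) == v for k, v in conditions.items()):
                granted |= attrs  # access follows attributes, not identity
        return granted

    courier = {"role": "courier", "region": "EU"}
    print(permitted_attributes(courier))  # {'delivery_address'} and nothing else

Because the decision depends only on attributes, each party in the delivery chain can be served its minimal data slice without any party learning the user's full profile.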


2.3. Challenge 3: Identity Management as a Key Technique

It has been pointed out that identity management is instrumental to the implementation of online privacy management [10]. We also believe that identity management can be used to manage the handling of data so as to satisfy privacy requirements such as data minimization and transparency. The scope of identity management is quite broad, comprising authoritative information about legal persons, customer or user relationships, self-issued claims, pseudonyms and anonymous credentials. A minimum of personal data must be conveyed to the service in order to authenticate and authorize the accessing subject. The service-side storage of personal information without a transparent and traceable relation to identities creates fundamental asymmetries in the relationship between users and industry and erodes transparency, confidence and trust. We therefore propose user-centric identity management systems, which can restore this balance and confidence.

User-centric identity management in this context implies that personal data, even when created by a service, is always handed back to the user upon completion of the service. If the user desires consistency across service invocations, it is her decision to hand over the data again to the same or another service. This way, individuals can supervise and limit personal data disclosure and exercise rights of access to their data held by third parties. User-centric identity management allows users to detect any linkages to third parties created from the primary relationship. Enterprise policies and procedures should support user-centric identity management as well, to prevent unwanted linkages and inadvertent disclosures of personal data.
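A minimal Python sketch of the user-centric, data-minimizing disclosure described above: the user agent releases only the attributes a service's policy requires and keeps its own disclosure log, so later linkages remain detectable. The class, the attribute names and the service policy are all hypothetical, chosen to illustrate the principle rather than any particular identity management product.

    # Hypothetical sketch of minimal-disclosure, user-centric identity management.
    REQUIRED_BY_SERVICE = {"age_over_18", "country"}  # assumed service policy

    class UserAgent:
        def __init__(self, attributes):
            self._attributes = attributes  # stays under the user's control
            self.disclosure_log = []       # basis for later transparency checks

        def disclose(self, service, required):
            # Data minimization: hand over no more than the policy requires.
            released = {k: v for k, v in self._attributes.items() if k in required}
            self.disclosure_log.append((service, sorted(released)))
            return released

    agent = UserAgent({"name": "Alice", "age_over_18": True, "country": "SE"})
    print(agent.disclose("news-portal", REQUIRED_BY_SERVICE))
    # -> {'age_over_18': True, 'country': 'SE'}; 'name' is never revealed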


3. Recommendations for Improving Regulations

3.1. Current Regulatory Framework Insufficient

Neither the current European legal framework nor the US approach toward private-sector self-regulation has been effective for the protection of privacy online, particularly with regard to new business models such as behavioural targeting, user profiling, social networking and location-based services. Key weaknesses in the EU framework include that: (1) services based predominantly in the US are effectively outside European jurisdiction; (2) European users have little choice but to "consent" to companies' terms of use and privacy policies in the absence of alternatives of comparable functionality; (3) the concept of "personal data" is currently the necessary trigger for the applicability of the Data Protection Directive (DPD); and (4) there seems to be too much reliance on ex post securing of data rather than on ex ante elimination of privacy risks through data minimization (for example, the recent Art. 29 WP Opinion on smart metering10 omitted entirely any consideration of radical data minimization through cryptographic methods11).

10 http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2011/wp183_en.pdf
11 http://research.microsoft.com/en-us/projects/privacy_in_metering/


3.2. Distinctive European Privacy Values

The European culture of privacy incorporates values of democracy, autonomy and pluralism. The European view of privacy as a societal good and a factor of public interest leads to a more prominent role for the State in this domain. This conception is not widely shared outside Europe, where the notion of privacy is strongly linked to "the right to be let alone". Consequently, countries such as the US do not necessarily establish the same balance between economic needs and privacy protection. European approaches protect privacy through consumer protection interventions, instead of reliance upon contract. For instance, it is conceivable that a European national government might prohibit certain extremely privacy-invasive practices, like long-term storage of online search requests for commercial purposes. Unlike contract approaches, such prohibitions can never be waived by acquiring the consent of the users.12 It has to be recognized that this European view is not shared by legislators in other parts of the world.

3.3. Surveillance Society and Blanket Retention of Data

An important issue of principle for the future Internet of things is the legitimacy of the blanket retention of traffic data (or metadata). In so far as such data relates to individuals, it constitutes a "map of private life" [14]. Case law of the European Court of Human Rights (ECtHR) establishes that "merely" storing such data engages the right to privacy. The troubling exception to this rule is the Data Retention Directive (DRD), requiring storage of certain telecommunications and Internet traffic data. However, the legitimacy of the DRD remains controversial, and the concept of indiscriminate continuous retention of data about the entire population has been ruled unconstitutional in its entirety by the Romanian Supreme Court,13 because it "makes the essence of the right disappear".


3.4. A Strong European Legal Instrument Remains Useful

Notwithstanding the fact that global harmonization in this area is not yet possible, a strong and effective European legal instrument has the potential to have an impact on the global online context. The essential question in conceiving a unique, strong and effective European legal instrument is the goal we want to achieve. The first fundamental objective should be the prevention of privacy-endangering information-processing practices at all levels. This instrument must guarantee real protection against actual and potential risks, taking into account technological developments, and not merely offer formalistic legal assurances. Therefore, it is crucial to take maximum advantage of the opportunity offered by the currently proposed review of the European directive to maximize its impact. The path of least resistance is to make incremental changes to the existing directive. However, this may not succeed in rectifying some serious conceptual defects. There is a risk that the review of the directive will merely result in a slightly better but still largely ineffective regulatory solution.

12 Art. 8.2, a) of the Directive provides that in certain cases the prohibition on processing sensitive personal data may not be lifted by the data subject's giving his consent.
13 http://www.legi-internet.ro/english/jurisprudenta-it-romania/decizii-it/romanian-constitutional-court-decision-regarding-data-retention.html


Moreover, recent results in the field of de-anonymization [15,16] suggest that some data may be impossible to anonymize (e.g. social network data), and it is difficult to predict the vulnerabilities and consequences of re-identification when contemplating the release of pseudonymous data [17]. It seems that a better approach would be to make the application of the legal framework dependent on an evaluation of the actual and potential privacy risks related to any data processing.

3.5. Consent Must Not Overrule Everything

From a European perspective, explicitly given user consent should not be accepted as a waiver for privacy-intrusive online practices. A European legal instrument should clearly emphasize that individual privacy is not purely a matter for the individual concerned, but for society as a whole. Moreover, in many situations the voluntariness of the user's consent can be put into question because of the lack of reasonable alternatives for commonplace services that meaningfully adhere to European data protection. US consumer protection law recognizes many situations where consumer consent cannot waive risks. This policy has not yet been widely mobilized into the law of privacy.

Consent should expire, according to the scope and extent of processing. When asking for consent, data controllers should make explicit what is revocable and what is irrevocable, and how it is possible to revoke that consent. Legislation may prohibit processing that would have irrevocable consequences. Otherwise, a European-wide warning system for specific risks or breaches, similar to governmental travel warnings for dangerous regions, might be advisable.


3.6. Effective Implementation and Enforcement Is Crucial

Privacy lobbying has been concentrated amongst a few law firms and trade associations. These interests pursue a maximally anti-regulatory agenda, even in situations where their clients would admit that they could comply with the privacy rules proposed. Retention of information by network advertising companies is a key example: while at the lobbying level it is often argued that this data must be retained for very long periods of time, the engineers at network advertising companies will admit that data becomes less valuable after a very short period and is often unused for ad targeting purposes within a few months. However, since policy makers rarely engage beyond trade associations, they develop a jaundiced view of the actual requirements that businesses have for data.

Too often, regulators pose technical and implementation questions to attorneys rather than to the technical experts who design systems. We recommend that, to the extent practicable, regulators invite relevant actors, rather than their representatives, to public fora, consultations, and other consensus-building events around privacy. The US Federal Trade Commission has expanded its approach of hiring technical experts to assist lawyers in its activities. We believe that technical expertise is increasingly necessary for policy makers and regulators, and we recommend they look more to in-house technical expertise to assist in their rule-making and investigations.

So far, the agenda for DPD reform has not sufficiently considered basic limitations on the effectiveness of enforcement when 27 national authorities must reach consensus. Competence for international enforcement actions should be given to a central authority, advised by the Art. 29 WP, leaving national DPAs better able to focus on national-level issues.


Current technological knowledge must be an indispensable part of the professional competence of DPA administrations. Yet at most a few percent of these officials have any relevant postgraduate scientific competence, and DPAs overwhelmingly have an irredeemably bureaucratic culture. A complete renewal of these institutions is necessary. At least one third of staff should be experts in the computer science of privacy, complemented by first-rate talent in law, economics, political science, sociology and philosophy. Access to justice through privacy litigation is out of reach for most people today. Data Protection Authorities should evolve into Information Privacy Ombudsmen (IPOs), explicitly acting to uphold privacy rights. IPOs must be expected to show intellectual excellence in every relevant field, and earn their authority through merit, or be dissolved.

3.7. Privacy by Design

A privacy by design approach can be mandated (or otherwise encouraged) by legal or regulatory provisions, if scientific discoveries demonstrate that a service can practicably be offered in a more privacy-protecting way. This could involve, for example, requiring that comprehensive and iterative privacy risk and impact assessments be carried out and that state-of-the-art privacy technologies be adopted.


3.8. Transparency for Data Subjects

In order to give meaningful effect to the right to informational self-determination, it is clearly necessary for users to have the possibility of "information self-awareness". Its importance has already been emphasized in all previous sections, together with the limitations of the corresponding transparency-enhancing tools. Because invoking existing "subject access" rights is cumbersome, slow, and often incomplete, these rights should be strengthened to provide a right to comprehensive online access to data about an individual's use of an online service, conveniently, securely, privately, and free of charge. This should henceforth be regarded as an indispensable aspect of the human right to privacy in the 21st century. To provide such data genuinely in "intelligible form", more disclosure of algorithms will be necessary (also for automated processing or anonymization), whether these act on personal data or can affect the individual through the application of statistical data models ("red-lining"). Consumers' ability to designate "privacy agents" as proxies for the exercise of their rights should be recognized by firms and governments. Consumer privacy agents are now a viable business, but they are frustrated by organizations that question the authority of the agent to act for the consumer, and by systems that attempt to obfuscate the invocation of rights to opt out or gain access to personal information. Collective negotiation through "privacy unions" is potentially also an important democratic mode of political expression and must be protected from harassing lawsuits.

3.9. Transparency by Design for Auditors

A further important aspect of transparency is the need to design mechanisms that allow the flows of data in a system to be documented and verified by internal and external auditors, including the algorithms used to perform profiling and social sorting.


3.10. Accountability

A core reason for defining the notion of a data controller was to assign clear legal responsibility for compliance. However, the complex mesh of legal relationships that has since arisen often does not allow a controller to guarantee any effective operational performance of such obligations. Mere logging of system activity is insufficient to counter insider threats – active policing of such logs is required. "The Principle of Accountability" should be understood to mean not merely a passive ability to provide an account, but the creation of an effective deterrent against abuse. Moreover, the creation of detailed logs about data subjects is itself prejudicial to privacy, and therefore all logging activity must be assessed from the point of view of the interests of privacy protection as well as justifiable security goals.
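One possible building block for such active policing, offered here only as a minimal sketch rather than as a mechanism prescribed by this chapter, is a hash-chained audit log: each entry commits to its predecessor, so insider deletion or alteration becomes detectable on verification. The field names below are illustrative, and, in line with the caveat above, the entries record only minimal metadata about data subjects.

```python
# Minimal sketch of a tamper-evident (hash-chained) audit log; field names
# are illustrative, not taken from this chapter.
import hashlib, json, time

def append_entry(log, actor, action, purpose):
    prev = log[-1]["digest"] if log else "0" * 64
    entry = {"ts": time.time(), "actor": actor, "action": action,
             "purpose": purpose, "prev": prev}
    entry["digest"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    log.append(entry)

def verify(log):
    prev = "0" * 64
    for e in log:
        body = {k: v for k, v in e.items() if k != "digest"}
        if (e["prev"] != prev or e["digest"] != hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()):
            return False
        prev = e["digest"]
    return True

log = []
append_entry(log, "analyst-7", "read", "fraud investigation")
append_entry(log, "analyst-7", "export", "court order")
assert verify(log)                  # intact chain
log[0]["action"] = "delete"         # an insider rewrites history...
assert not verify(log)              # ...and verification detects it
```

In a real deployment, verification of such a chain would be run by an auditor independent of the log's operator.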

4. Recommendations for Research

The up-scaling of privacy-enhancing technology to larger systems and its integration with existing systems fails, mainly because systems aspects and the related interdisciplinary issues are not taken into account. In this section we address this by recommending research into:
• scalability and integration on a large scale,
• technologies to support privacy-enhanced systems engineering, and
• systems for privacy protection at the individual level.
Finally, we recommend research into the "known unknowns" of the technological and societal environment in which privacy technology exists.


4.1. Web-Scale Integration, Deployment and Infrastructures

As discussed in Section 1, over the last 20 years the privacy community has developed a large pool of tools and primitives. Yet we do not see deployment in large-scale infrastructures such as the Internet and the World Wide Web, and the interaction between individual technological tools is poorly understood. To address this limitation, we recommend research and experimentation focusing on integration and deployment: how do privacy-enhancing technologies scale, in terms of deployment on large and open-ended networks with decentralized control and governance structures, large populations, and qualitatively different scales of data collection and processing? Research instruments to address this question may include: mathematical modelling of interdependencies and integration effects; test beds and demonstrators that enable research and demonstration; and public-private partnerships focusing on adoption and deployment of experimental technologies. This approach can also foster infrastructure and product development through pre-commercial procurement. Other incentives for deployment should also be analysed. Specific fields in which these approaches should be tried include (but are not limited to):
• Privacy-enhanced identity management infrastructures;
• Techniques for minimal data disclosure;
• Data governance and policy language approaches;


• Accountability in data disclosure and processing, including transparency and auditability, as well as real-time detection and investigation of privacy breaches and data abuse;
• Technological approaches that help to reconcile privacy interests and business models;
• Privacy-protected communications.
Research approaches towards these questions need to be multidisciplinary. Relevant disciplines include economics, psychology, sociology, business administration, law and political studies, as well as various fields within the discipline of computer science (see Annex A for examples). Within this research agenda of understanding large-scale, system-level interactions of technological and social phenomena, empirical data about the evolution of data collection practices and data flows on the Internet and the Web becomes a critical asset. Relevant data flows include data treated by cloud-based applications, sensors that interact with the physical environment, and users' behaviour as they interact with online services. Regulatory and technical agendas need to be informed by an empirical understanding of these data flows. We recommend creating an observatory for these flows and interdependencies, taking existing research work to a systematic new level and creating the basis for more rigorous analysis of the technical status quo.


4.2. Towards Privacy-Friendly System Engineering

The privacy-enhancing building blocks available today need to be integrated into an overall privacy engineering environment, so as to enable adequate evolution of privacy concepts and the required privacy-friendly technology and systems in the future. Requirements for privacy need to be analysed, especially when new systems emerge that might have a negative impact on privacy. Consequently, a Privacy Impact Assessment (PIA) is needed. PIA requires research into methods that develop it from an art into a systematic and transparent process, one which also allows the comparison of different development alternatives and their privacy impact. When privacy-enhancing technologies are deployed on servers, network infrastructures and devices, multiple independently developed technologies are brought together. The way in which the privacy properties of these modules interact with each other and with the surrounding system, through the entire protocol stack and via a number of applications, is often poorly understood. For example, the integration of different systems can lead to surprising effects (e.g. unwanted data flows) resulting in the violation of privacy policies or of assumptions implicit in privacy technologies. Further research into the composability of these tools and systems (e.g. considering the current research in "differential privacy") is needed to develop suitable best practices. First, this research will contribute to the development of methodologies and guidelines that support the engineering of practical privacy in complete systems with the help of a multi-stakeholder community. In order to evaluate the privacy assurances given by these approaches, further research into evaluation methods, criteria and metrics is necessary. Second, the research direction proposed here will also facilitate re-engineering processes that allow deployed systems to take privacy aspects into account.


4.3. Individual Protection

In the area of individual protection, we can frame many privacy concerns in terms of power imbalances between data subjects and data processors, and we can frame privacy-enhancing technologies as tools to assure or restore an adequate power balance. Research should continue towards tools that assist individuals' informational self-determination and permit users to learn, for example, when they are sharing data without knowing the consequences. Those tools should leverage progress in machine learning. We could imagine relevant tools ("care-takers") such as:
• advisers, helping users before they engage in privacy-relevant activities online,
• bodyguards, assisting users as they act online, and
• litigators, helping users redress breaches of their expectations afterwards.
A crucial element of individual protection and autonomy is individuals' ability to understand and act on their context, assisted by appropriate and intuitive tools. Related to the observations on scaling in the previous sections, research should address how the implications of massive-scale data collection and processing can be made comprehensible and practically manageable for individuals. Research topics here range from the usability of technology to the development of philosophical and psychological models of the consequences of data processing. Additionally, empirical experiments should be designed to better understand what users' privacy interests and assumptions are, and to what extent they are (or are not) able to take action using the tools available today.


4.4. Known Unknowns: Possible Changes to the Technological and Societal Environment

Research agendas in privacy need to address the evolution of underlying technologies and the surrounding societal and business landscape, in particular where that evolution might create qualitative changes to the privacy landscape. Cryptographic technologies are the foundation for controlling access to data and are also used as primitives in many privacy-enhancing tools. Where risks cannot yet be articulated, such as whether quantum computing will undermine the cryptographic primitives available today, the question arises whether we are prepared for the consequences of changing the underlying assumptions on which today's technology is built. Other examples can be found in the rapid advances in the availability and quality of face recognition technology and natural language processing. Some of this progress is further aided by the increasing availability of large data sets, and its implications are likely compounded by the availability of ever more powerful mobile devices. Finally, we should also include unpredicted events and disasters among the factors that may change societal attitudes toward privacy in the future. More generally, blue-sky research should be undertaken to identify and prepare for changes in underlying technologies, and in the broader scientific and societal context, that we might not foresee today.


Annex A – Example of Multidisciplinary Questions

As mentioned in Section 4, research approaches on new privacy technologies should be multidisciplinary. Examples of the principles and use cases to check for multidisciplinary questions include:
• Cryptographic feasibility
• Scalability
• Usability/acceptability
• Regulatory aspects
• Business models
Is the new technology approach compatible with the future?

For example, the above principles should be checked against the following upcoming technologies in privacy-preserving or privacy-friendly distributed systems and activities:
• Privacy-preserving distributed data mining and processing. This category includes, for example, the application of peer-to-peer architectures to online social networks, in order to avoid control over user data and behaviour by a single entity such as the service provider. Another example is technologies protecting against spam or DDoS attacks in existing anonymous transport networks.
• Privacy-preserving distributed data collection. This category includes, for example, the sensing and collection of environmental data that is connected with the context of specific people (e.g., their location), as is the case when sensors embedded in mobile devices are used for such collection. While this is an upcoming technology, its privacy implications have hardly been studied.


Annex B – Participants of Dagstuhl Workshop 11061

Andreas Albers, Goethe University Frankfurt, DE
Caspar Bowden, Microsoft WW Technology Office, GB (now independent researcher)
Sonja Buchegger, KTH Stockholm, SE
Johannes A. Buchmann, TU Darmstadt, DE
Jacques Bus, DigiTrust.EU, Brussels, BE
Jan Camenisch, IBM Research, Zürich, CH
Fred Carter, Office of the Information and Privacy Commissioner of Ontario, CA
Ingo Dahm, Deutsche Telekom AG, DE
Claudia Diaz, KU Leuven, BE
Jos Dumortier, KU Leuven, BE
Simone Fischer-Hübner, Karlstad University, SE
Dieter Gollmann, TU Hamburg-Harburg, DE
Marit Hansen, ULD SH, Kiel, DE
Jörg Heuer, Deutsche Telekom AG Laboratories, DE
Stefan Köpsell, TU Dresden, DE
Ioannis Krontiris, Goethe University Frankfurt, DE
Michael Marhöfer, Nokia Siemens Networks, München, DE
Andreas Poller, Fraunhofer SIT, Darmstadt, DE
Kai Rannenberg, Goethe University Frankfurt, DE
Thomas L. Roessler, W3C, FR
Kazue Sako, NEC, JP


Omer Tene, Israeli College of Management School of Law, IL
Hannes Tschofenig, Nokia Siemens Networks, Espoo, FI
Claire Vishik, Intel, London, GB
Michael Waidner, TU Darmstadt, DE
Rigo Wenning, W3C/ERCIM, FR
Alma Whitten, Google London, GB
Craig E. Wills, Worcester Polytechnic Institute, US
Sven Wohlgemuth, National Institute of Informatics, Tokyo, JP
Observer: Jesus Villasante, European Commission, BE


References

[1] A. Westin, Privacy and Freedom. New York: Atheneum, 1970.
[2] C. J. Hoofnagle and J. Whittington, "The price of 'Free': Accounting for the cost of the Internet's most popular price," Forthcoming UCLA Law Review, vol. 61, no. 3, 2014. Available at http://ssrn.com/abstract=2235962.
[3] S. Fischer-Hübner, C. Hoofnagle, I. Krontiris, K. Rannenberg, and M. Waidner, "Online privacy: Towards informational self-determination on the Internet (Dagstuhl Perspectives Workshop 11061)," Dagstuhl Manifestos, vol. 1, no. 1, pp. 1–20, 2011.
[4] B. Krishnamurthy and C. E. Wills, "Characterizing privacy in online social networks," in Proceedings of the 1st Workshop on Online Social Networks (WOSN '08), pp. 37–42, 2008.
[5] J. Turow, C. J. Hoofnagle, D. K. Mulligan, N. Good, and J. Grossklags, "The Federal Trade Commission and consumer privacy in the coming decade," I/S: A Journal of Law & Policy for the Information Society, no. 723, 2007–08.
[6] L. Brandimarte, A. Acquisti, and G. Loewenstein, "Privacy concerns and information disclosure: An illusion of control hypothesis," in Proceedings of the 9th Workshop on the Economics of Information Security (WEIS 2010), June 2010.
[7] B. Krishnamurthy and C. E. Wills, "On the leakage of personally identifiable information via online social networks," ACM SIGCOMM Computer Communication Review, vol. 40, pp. 112–117, January 2010.
[8] F. Kelbert, F. Shirazi, H. Simo, T. Wüchner, J. Buchmann, A. Pretschner, and M. Waidner, "State of online privacy: A technical perspective," in Internet Privacy. A Multidisciplinary Analysis, J. Buchmann (ed.), pp. 189–279, September 2012.
[9] G. Danezis and S. Gürses, "A critical review of 10 years of privacy technology," in Proceedings of Surveillance Cultures: A Global Surveillance Society? (London, UK), April 2010.
[10] K. Rannenberg, D. Royer, and A. Deuker (eds.), The Future of Identity in the Information Society – Challenges and Opportunities, p. 507. Springer, 2009.
[11] M. Hildebrandt, "Behavioural biometric profiling and transparency enhancing tools," Tech. Rep. FIDIS Deliverable D7.12, March 2009.
[12] J. E. Wästlund and S. Fischer-Hübner, "End user transparency tools: UI prototypes," Tech. Rep. PrimeLife Deliverable D4.2.2, June 2010.
[13] J. Camenisch, I. Krontiris, A. Lehmann, G. Neven, C. Paquin, K. Rannenberg, and H. Zwingelberg, "D2.1 Architecture for attribute-based credential technologies – Version 1," ABC4Trust Deliverable D2.1, December 2011.
[14] C. Bowden, "Closed circuit television for inside your head: Blanket traffic data retention and the emergency anti-terrorism legislation," Computer and Telecommunications Law Review, March 2002.
[15] L. Backstrom, C. Dwork, and J. Kleinberg, "Wherefore art thou r3579x?: Anonymized social networks, hidden patterns, and structural steganography," in Proceedings of the 16th International Conference on World Wide Web (WWW '07), (Banff, Alberta, Canada), pp. 181–190, 2007.
[16] A. Narayanan and V. Shmatikov, "Robust de-anonymization of large sparse datasets," in Proceedings of the IEEE Symposium on Security and Privacy, pp. 111–125, 2008.
[17] P. Ohm, "Broken promises of privacy: Responding to the surprising failure of anonymization," UCLA Law Review, vol. 57, p. 1701, 2010.


Digital Enlightenment Yearbook 2013 M. Hildebrandt et al. (Eds.) IOS Press, 2013 © 2013 The authors. doi:10.3233/978-1-61499-295-0-139


Personal Information Dashboard: Putting the Individual Back in Control


Johannes BUCHMANN a, Maxi NEBEL b, Alexander ROSSNAGEL b, Fatemeh SHIRAZI a, Hervais SIMO a,c,1 and Michael WAIDNER a,c
a Department of Computer Science, Technische Universitaet Darmstadt
b Chair of Public Law, University of Kassel
c Fraunhofer Institute for Secure Information Technology, Darmstadt

Abstract. This paper contributes to the discourse on how information technologies can empower individuals to effectively manage their privacy while contributing to the emergence of balanced personal data ecosystems with increased trust between all stakeholders. We motivate our work through the prism of privacy issues in online social networks (OSNs), acknowledging the fact that OSNs have become important platforms for information sharing. We note that on OSN platforms, individuals share very intimate details about different aspects of their lives, often lacking awareness about and understanding of the degree of accessibility/visibility of the information they share, as well as what it may implicitly reveal about them. Furthermore, service providers and other entities participating on OSN platforms are increasingly relying on profiling and analytics to find and extract valuable, often hidden patterns in large collections of data about OSN users. These issues have caused serious privacy concerns. In the light of such issues and concerns, we argue that protecting privacy online implies ensuring information and risk awareness, active control and oversight by the users over the collection and processing of their data, and support in assessing the trustworthiness of peers and service providers. Towards these goals, we propose the Personal Information Dashboard (PID), a system that relies on usable automation tools and intuitive visualizations to empower end-users to effectively manage both their personal data and their self-presentations. With the PID, the user is able to monitor and visualize her social networking footprint across multiple OSN domains. To enable this, the PID aggregates personal data from multiple sources and links the user's multiple OSN profiles. Leveraging various inference and correlation models as well as machine learning techniques, the PID empowers users to assess and understand the level of privacy risk they are facing when sharing information on OSNs. Based on the outputs of the underlying prediction and learning methods, the PID can suggest to the user corrective options aimed at reducing risks of unintended information disclosure. We present a prototype implementation demonstrating the feasibility of our proposal.

Keywords. Personal data, online social networks, privacy, informational self-determination

1 Corresponding Author: Technische Universitaet Darmstadt, Mornewegstrasse 30, D-64293 Darmstadt, Germany; E-mail: [email protected].


Introduction

The increasing collection and analysis of information about individuals have become essential for innovation and economic growth. The emergence of analytic tools, new business models, and new social norms related to the use of data has the potential to create benefits and opportunities not only for the identity owner or the various organizations monetizing this data, but also for society at large. However, it is crucial to guarantee privacy protection, and to establish and maintain an online culture of trust between all stakeholders. Both are very complex and challenging issues in view of the increasing amounts and diversity of data expected to be available and shared within these online ecosystems. Privacy management, in particular, is often a complex and error-prone task. As a consequence, Internet users, even those highly concerned about privacy, routinely fail to match their privacy preferences to their actual online behaviours. Additionally, privacy-enhancing strategies deployed in today's online services such as online social networks (OSNs) mainly focus on control, assuming that users are informed individuals able to make rational disclosure decisions based on an accurate cost-benefit analysis. However, recent work from the fields of behavioural economics and social science has demonstrated that this underlying assumption is wrong [48–50]. These studies point to users being subject to cognitive limitations and behavioural biases. That is, users have limited knowledge of how disparate pieces of information about different aspects of their lives flow across the web; they feel confused about control options and notices, which they usually ignore; and they often lack understanding of the implications of giving up control over their personal information, and are thus unable to accurately estimate privacy risks and compare them to the short-term benefits of sharing personal information. These limitations are the source of various privacy risks, including inference risks, i.e. an adversary relying on the user's observable digital footprints to infer supposedly hidden information about that specific user. We argue that for individuals to become active and informed participants in a data economy, managing privacy requires them to be aware of (i) who knows what of their data and (ii) the context in which they disclose information, and (iii) to be able to estimate trust towards peers and the impact a piece of data will have when shared with a certain audience or entity. To address privacy issues resulting from users' cognitive biases/overload and limited knowledge about data processing and the related privacy policies, effective transparency-enhancing tools (TETs) are required. TETs would indeed enhance users' awareness of their OSN activities and their understanding of the privacy implications of their participation in OSNs. Based on this knowledge, users can then make informed decisions about whether to disclose a particular piece of information, to use a specific OSN service, or to rely on particular privacy control options. Although a step in the right direction, and a great source of reference for our work, today's TETs applicable to OSNs (see Section 4) often address only a portion of the problem. For instance, TETs developed within the EU FP7 project PrimeLife2 mainly focus on providing users with information about privacy policies negotiated with service providers and on granting users online access to their personal data.

2 http://primelife.ercim.eu/results/
These tools leave unaddressed the possibility of leveraging metrics to quantify the privacy risks posed by unintended disclosure of personal information. To overcome the aforementioned limitations, and in an attempt to develop a comprehensive and effective TET tailored to OSNs that addresses more privacy issues and concerns


than the state-of-the-art TETs, this paper proposes a Personal Information Dashboard (PID). This paper presents early research showing how machine learning and natural language processing applied to users' digital social footprints (cf. [34]) may extend the scope of existing TETs, empowering OSN users to effectively perform privacy self-management. Specifically, we describe the PID, a system designed to give OSN users a unified control point for managing the various facets of their online social networking presence, including their different personas, in a privacy-preserving manner. The PID is envisioned as an integration point for various analytic tools that assist OSN users in exercising their right to informational self-determination as they interact across social network sites. Moreover, the PID provides users of OSNs with a practical and easy-to-use means of accessing personal data that is usually kept in isolated domains. It enables users to know and understand (i) which personal data about them is available online, (ii) how this data flows across the various online ecosystems in which they participate, and (iii) what is happening to these data. Access to, and knowledge about, the flow of personal data is leveraged to design feedback and recommendation mechanisms that can help users estimate and understand the risks of personal information disclosure. Furthermore, the feedback and recommendations provided by the PID are meant to support users in making risk-informed disclosure decisions. The trade-off between the personal benefits of sharing one's information online and the risks of doing so should be weighed by individuals themselves; our PID provides means to help them make that rational, informed decision.

Outline: The rest of the paper is organised as follows. Section 1 motivates our work by describing primary factors that prompt concerns about privacy in OSN scenarios in which individuals hold accounts on diverse and heterogeneous platforms. Section 2 provides a detailed description of the PID architecture. Next, the implementation of our proposal is outlined in Section 3. Section 4 discusses related work. The paper concludes with a general evaluation of the proposal and a discussion of important issues for future work.


1. Motivating the Personal Information Dashboard

In this section, we motivate our work through the need for new technological options that empower individuals to interact on OSN platforms in ways they consider beneficial for their privacy. This includes providing them with means to understand and regulate the extent to which they can be profiled and the implications this may have (cf. [7,9,10]), and tools to share information in a way that prevents them from inadvertently giving up their privacy (cf. [9,11]). We describe, at a high level, factors that motivate concern about privacy in OSNs (Section 1.1) and a set of key conditions for privacy protection in OSNs (Section 1.2). Please note that at this point we do not claim that this list of motivating concerns is exhaustive. Here we focus on some of the key concerns relevant to trust and privacy management. The assumption throughout the paper is that, with social network services becoming pervasive, individuals now have accounts on diverse and heterogeneous platforms.

1.1. Motivation and Privacy Issues

Users of Web-based social networks tend to have accounts on various social networking sites, many of which provide different sets of functionalities targeting specific user community audiences.


For instance, a typical social Web user may have a Google+ account to keep in touch and share intimate thoughts with family and friends, use Twitter and other micro-blogging platforms to share her political views, and maintain a LinkedIn account for managing her professional contacts. In doing so, social network users often disclose, willingly or unknowingly, a wealth of personal information, i.e. different pieces of information about various aspects of their private lives. However, previous works on trust and privacy management, and their current implementations in OSNs, are mostly single-domain oriented. Hence, when connecting and interacting in heterogeneous social communities across multiple OSNs, users may easily lose track of which data they have shared on which OSN, with which audiences, and under which conditions/terms of service. In particular, the following issues arise in such a situation.

Issue #One – Inferences and linkability of user profiles: Most social Web users lack a full understanding of how their information is being collected and used, and of the trade-off between having their data collected for the purpose of accessing services at virtually no cost and the consequences that doing so could entail, e.g. in terms of implications for their privacy. This lack of both user awareness and support for transparency can cause or aggravate the privacy problems associated with the unintended/implicit disclosure of private information by the user. That is, when sharing personal information on a social network platform, most users assume that their data will never leave the boundaries of that specific social network and will only be accessible to a restricted group of users, i.e. their "friends". As a matter of fact, an adversary3 can easily invade the privacy of an OSN user by exploring the social network site and gathering seemingly non-sensitive information from publicly available profiles. Based on this information, the adversary can then perform local inference attacks to infer the user's supposedly private and hidden attributes. A more powerful adversary4 can collect, aggregate and correlate disparate pieces of information about that user (and her contacts) from multiple, a priori isolated OSNs. This way, he can construct a far more complete profile of the victim than when inferring the hidden/private attributes from a single social network. We refer to this problem as the global inference problem. In both cases, the aggregation and correlation of seemingly non-sensitive and publicly available information can generate a new set of information that may tell much more about the user than she is aware of disclosing. The victim is often unaware that she and her contacts are, either directly or indirectly, the ones leaking that information. While they allow users, if properly configured, to protect sensitive information from unauthorised direct access, the privacy control models that today's OSN services offer are quite ineffective against indirect access via inference techniques.
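To make the local inference attack concrete, the following minimal sketch guesses a hidden attribute as the most common value among the victim's friends' public profiles; the friends and attribute values are invented for illustration, not drawn from the paper.

```python
# Minimal sketch of a local inference attack: guess a hidden attribute from
# the most common value among the victim's friends. Data is illustrative.
from collections import Counter

def infer_hidden_attribute(friend_profiles, attribute):
    """friend_profiles: list of dicts of publicly visible attributes.
    Returns the adversary's guess and its empirical confidence."""
    visible = [p[attribute] for p in friend_profiles if attribute in p]
    if not visible:
        return None, 0.0
    value, count = Counter(visible).most_common(1)[0]
    return value, count / len(visible)

friends = [{"home_city": "Darmstadt"}, {"home_city": "Darmstadt"},
           {"home_city": "Berlin"}, {}]
print(infer_hidden_attribute(friends, "home_city"))  # ('Darmstadt', 0.666...)
```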
Issue #Two – Collision of social contexts and regrettable posts: There is no question that a growing number of social Web users have privacy concerns that arise out of the collision of social contexts. Recent studies [1,12,13] have reported that many users fear disclosing sensitive information to an unintended audience and posting content that may result in negative consequences for themselves and/or their friends. Paradoxically, OSN users tend to post sensitive content and contentious information about themselves and others while lacking awareness of the potential negative consequences thereof.

3 Any potentially malicious third-party app used by a large enough portion of a social network, including all of the user's contacts.
4 Any entity with the ability to track and monitor user activities across online domains, eventually for surveillance purposes.


Examples of sensitive content and contentious information include messages related to the use of drugs, to religious or political beliefs, to criticisms of workplaces and colleagues, to personal/family issues, and to other controversial topics (cf. [1]). Lack of awareness and control over the disclosure of sensitive information may lead to embarrassment (e.g. a user's unflattering messages posted on her wall might be accessible to, and taken out of context by, others) and to loss of reputation and credibility among friends, colleagues, and business partners. While helping the user to control the visibility of, and accessibility to, the content she posts, the privacy settings offered by today's OSN sites fall short when it comes to supporting the user in automatically assessing the sensitivity of critical/sensitive messages she is about to post, and eventually preventing their dissemination.

Issue #Three – Uninformed consent and privacy-invasive 3rd-party apps: Social network users are increasingly becoming consumers of third-party applications (a.k.a. "apps"), which they use for a variety of purposes, e.g. to manage their contact lists, create personalized multimedia content, or gain access to news online. Once installed, these apps may need access to various types of personal data, depending on the functionalities they are expected to offer. Unfortunately, some design choices and practices by 3rd-party app providers and/or OSN operators make it difficult for users to fully understand the content and implications of the terms of service and privacy policy the provider is presenting [22,23]. As a consequence, most users tend to consent to whatever terms of use apply when installing third-party applications. Thus, third-party apps introduce new sources of privacy and security threats [14,21–23]. However, the control mechanisms provided by today's social network services also fall short when it comes to monitoring and controlling how 3rd-party application providers access personal data. Recall that a large number of 3rd-party apps are poorly designed and that a few are even malicious [28,51].


1.2. Privacy Goals and (Security) Assumptions

Analysing existing legal frameworks for data protection and privacy protection principles from the relevant literature [8,16,17,20,26,44], we have identified four conditions which have to be fulfilled in order to achieve privacy protection in OSNs. These conditions are awareness, control, trustworthiness, and usability (cf. [5]).

Awareness and Transparency aim at making data processing understandable and clear for the individual user of an OSN. The user needs to be aware of the details of her personal data that are available to the OSN and to other OSN users, in order to be able to exercise her rights and to gain control (see below) over her personal information in OSNs. Transparency concerns different aspects of data retention: when and how data is gathered, what categories of data are gathered, the purposes of data retention, and the audience to which her data is visible.

The concept of Control aims at empowering the user to supervise the use of her data. Having control over one's data enables the user to decide about the existence and distribution of her data, its correction in case of false data, and its deletion if necessary or desired. The main instrument for the user to exercise control over her data is her informed consent, together with the ability to revoke consent once given. Informed consent is given when the user has unmitigated awareness of all relevant aspects regarding the handling of the data to which she is consenting. This includes the purpose for which that specific piece of data is necessary, the recipients if the service provider intends to share it with third parties, the duration of storage, and the rights the user can exercise.


The concept of Trustworthiness presupposes that Awareness and Control are met. The knowledge and assurance that the provider has met the privacy standards the user has defined, and that her rights have been complied with, increase the user's trust in OSN providers. Realizing Trustworthiness requires that the involved parties handle privacy-critical issues in a privacy-preserving manner, so that users can rely on their right to informational self-determination being respected in OSNs.

Usability: Technologies aiming to empower the user to protect her personal data must consider usability requirements in order to be of any use to her. Technological options fulfilling the previous three conditions should be designed and implemented in such a way that, by bringing key facts to the user's attention, she may become aware of both her cognitive biases and the possible consequences of data sharing, while avoiding overburdening her. Some of the usability requirements that we have considered during the design of the PID are: ease of learning, task efficiency, ease of remembering, understandability, and subjective satisfaction. In order to verify the mentioned requirements, usability tests have to be carried out, which we plan to do in future work.

2. The Personal Information Dashboard

In this section, we present the design of our proposal: the Personal Information Dashboard (PID). We give a high-level overview of the PID architecture and design characteristics in Section 2.1 and provide a detailed description of its components and the features they provide in Section 2.2.


2.1. Architecture and Design Overview

The goal of our architecture, building on the analysis in the previous sections, is to provide users of OSNs with a unified control point for managing the various facets of their online social networking presence. Providing a single point of control allows individuals who use multiple OSNs (i) to view and control how their personal data flows across the various online ecosystems in which they participate; (ii) to estimate and understand the risk of information disclosure; and (iii) to accurately assess their current level of anonymity, i.e. the level of risk of privacy leakage due to the linkability of their actions (and accounts) across multiple OSN sites. Additionally, the architecture, if implemented in the right way, would not only reduce the inconvenience caused by the need to keep track of and manage different accounts on various OSN sites, but also reduce the cognitive overload that may arise from very detailed analysis and/or fine-grained control over personal data. The PID, designed as a Web-based tool, has means to retrieve and analyse large amounts of data from multiple online social networks in an efficient manner. Examples of such data include the user's username, her profile data, details about her relationships/contacts, and details about content she creates (e.g. posts, likes) or content that has been shared with her. As such, the PID provides users of OSNs with a practical and easy-to-use approach to accessing personal data that is usually kept in seemingly isolated domains. Users can log into the PID and thus give permission to collect and analyse their data. Data collection is performed through the APIs provided by OSN sites and through a distributed Web Crawler. The collected data is then aggregated and analysed using an attribute inference algorithm and well-known machine learning models.
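The API-based data collection step just described might look roughly as follows. This is a sketch only: the endpoint URLs, response fields, and token handling are hypothetical placeholders, not the real APIs of Facebook, Google+, or Twitter.

```python
# Sketch of API-based profile aggregation across OSNs. All endpoints and
# response fields are hypothetical; real OSN APIs differ.
import requests

ENDPOINTS = {
    "osn_a": "https://api.osn-a.example/v1/me",
    "osn_b": "https://api.osn-b.example/v1/profile",
}

def aggregate_profiles(tokens: dict) -> dict:
    """tokens maps an OSN name to the OAuth bearer token the user granted
    the PID at login; returns one profile dict per OSN."""
    profiles = {}
    for osn, url in ENDPOINTS.items():
        resp = requests.get(
            url, headers={"Authorization": f"Bearer {tokens[osn]}"}, timeout=10)
        resp.raise_for_status()
        profiles[osn] = resp.json()   # e.g. {"name": ..., "location": ...}
    return profiles
```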


Figure 1. Overview of the system architecture.

The main purpose of combining an inference algorithm with machine learning models is to be able to accurately compute the probability of sensitive attribute leakage. A Web interface is used to provide end-users with transparency and control. The interface presents to users the results of the aggregation and analysis. These results may include the level of privacy exposure the user is currently facing, a top-level view of her recent activities, and the distribution of her friends across the globe or across social media channels. In the Web interface, visualizations and other plug-ins are used to provide the user with knowledge and an aggregated view of which data about her is available in the various online ecosystems in which she participates, and who can see/access it. The user can rely on the visualizations (providing implementations of interactive graphs, charts, maps, diagrams, data tables, etc.) to easily follow the distribution of her data over time, with respect to where the data is located and the privacy preferences she defined for it. An overview of the architectural framework of our PID is depicted in Fig. 1. The architectural framework consists of three major blocks of components. The first block includes a Web interface for users, designed according to the dashboard metaphor and populated with visualization plug-ins and tools. The second block is the block of backend components. It includes the knowledge analytics engines (inference and correlation models, machine learning algorithms), the visualization libraries, the data retrieval functions (API-based and those provided by the distributed Web Crawler), as well as other data processing modules. The backend components are the backbone of our architecture. The third and last block is composed of the set of ecosystems in which the user may participate, i.e. the different OSNs in which she may have accounts. These ecosystems may include popular OSNs such as Facebook, Google+, and Twitter. Most importantly, however, note that these OSNs provide application programming interfaces (APIs) which third-party application developers can use to provide services that enhance users' online social networking experience.


We leveraged these APIs when designing and implementing the various components, tools, and features of the PID. Three components of our architectural vision for the PID (the Cross OSN Policy & Contact Manager, Secure Cross OSN Messaging, and the Backup & Data Portability Engine) are relevant but will not be discussed further in this paper; their design and implementation are left as subjects of future work.

2.2. Privacy Analytics Tools and Transparency Enhancing Features

This section presents the key components of our system in detail.

2.2.1. Knowledge and Privacy Analytics Engine

One key innovative aspect of our proposal is the Knowledge and Privacy Analytics Engine (KPAE), in which we combine modern machine learning techniques and well-known inference models for the purpose of empowering individuals with the ability to manage both their private data and their self-presentation. These analytics represent an approximation of a limited set of inference mechanisms and re-identification models that potential adversaries (e.g. intrusive OSN providers, malicious 3rd-party application providers, fake friends aiming to collect large amounts of personal information) may employ. The models provided by the KPAE are used by three modules: PrivacyMeter, WallGuard and AtyPiCAL. A description of these three modules is provided in the following subsections.


2.2.2. PrivacyMeter

PrivacyMeter is the module responsible for computing the user's level of privacy exposure. To do so, PrivacyMeter quantifies the risk of information leakage arising from the accessibility of the user's private information in her friends' profiles. The underlying assumption here is that in OSNs, adversaries can infer private information (profile details) that the user shares implicitly, just by analyzing the user's social ties. Indeed, PrivacyMeter can leverage various inference and correlation models to automatically calculate the probability of sensitive information leakage. As input, PrivacyMeter uses information extracted from the user's profile, details about her friendship links, and the values of her friends' profile attributes. We consider a very simple and straightforward model to estimate the user's level of privacy exposure. The level of privacy exposure, which is a measurement of information leakage through inference, is represented as a combined probability of inference risk for each of her profiles in the different OSNs. The privacy inference risk associated with the user's profile in a particular OSN is in turn defined as the arithmetic mean of the risks associated with each of her hidden attributes in that OSN.

Our Inference Quantification Model: Let O = (A, V) be the set of observations an adversary can make. We think of A_i and V_i, with i ∈ {1, 2, ..., n}, as representing the set of profile attributes and the values they can take, respectively. Note that this includes details extracted from both the user's profiles and her friends' profile attributes. n denotes the total number of possible attributes in a user profile, and A_i(j) denotes the i-th attribute of user j's profile. I(j) ⊆ A is the subset of inferable profile attributes. These attributes contain sensitive information that is either not provided by the user j or simply not visible to adversaries. We define the privacy inference risk associated with the user's profile as:


\[
\mathrm{Score}(j) = \left( \frac{1}{|I(j)|} \sum_{i=1}^{|I(j)|} R_i(j) \right) \times 100. \tag{1}
\]

The privacy score R_i(j) of a specific attribute A_i in user j's profile is computed as:

\[
R_i(j) = P\bigl(A_i(j) = V_{ij} \mid O\bigr)\, \omega_{A_i} \tag{2}
\]

where \(\sum_{i=1}^{|I(j)|} \omega_{A_i} = 1\). V_{ij} is the correct and predictable value of the attribute A_i, P(A_i(j) = V_{ij} | O) represents the inference probability for attribute A_i, and ω_{A_i} ∈ [0, 1] is the weight assigned by user j to her attribute A_i. In this paper we assume that attribute weights are given. They could, however, also be obtained by leveraging collaborative filtering approaches [45], thereby incorporating feedback/input from other community members. The actual overall score for the level of privacy exposure is calculated and displayed to the user every time she logs into the PID. The login mechanism leverages the authentication services provided by the OSN ecosystems, i.e. the user can log in using credentials associated with one of the OSNs interfacing with the PID. Upon successful login, the user identifier and the permissions5 granted to the PID are used to automatically query information from both the user's profiles and her contacts' profiles. Note that the inference model in the current PID design considers only the user's direct contacts. In the future, we intend to extend the model to consider information available in friends-of-friends' profiles, parameters other than profile attributes (e.g. messages posted on different platforms), and behavioural patterns in the user's offline communities. The information collected from all the profiles is then analysed to detect possible correlations between the user's different profiles and, most importantly, to determine the set of observations an adversary can make. Based on this outcome, and relying on the inference quantification model provided by the KPAE, the Privacy Exposure Calculator then computes the user's current level of privacy exposure. The computed exposure level is passed to a Notification and Feedback Engine, which in turn leverages the visualisation tools provided by the Web front-end interface to plot the overall privacy score, a detailed analysis of the user's profile, and a set of recommendations. To provide the user with the ability to reduce her level of exposure and possibly mitigate further unintended disclosure, PrivacyMeter offers a list of recommendations. In order to provide meaningful recommendations, PrivacyMeter first determines those of the user's contacts who contribute the most to the disclosure of the exact value of a certain private profile attribute. In a second step, PrivacyMeter uses both the leakage score (the friend's level of contribution to attribute inference) and the ranking of the inference contributors to signal to the user which of her friends are the most vulnerable. Based on the ranking, the user is provided with a list of recommended actions to help limit information leakage. Examples of recommended actions include requesting that those friends who contribute the most to private attribute inference hide a sensitive attribute in their profiles, or simply unfriending one or several vulnerable friends. Figure 2 shows the components of PrivacyMeter.
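As a sanity check on Eqs. (1) and (2), the following minimal sketch computes Score(j) for a toy profile; the attribute names, inference probabilities, and weights are invented for illustration and are not taken from the paper.

```python
# Minimal sketch of the PrivacyMeter scoring model of Eqs. (1) and (2).
# Attribute names, probabilities, and weights below are illustrative only.

def privacy_score(inferable: dict) -> float:
    """inferable maps each hidden attribute in I(j) to a pair
    (inference probability P(A_i(j) = V_ij | O), weight w_i).
    Weights are assumed to sum to 1, as required by Eq. (2)."""
    assert abs(sum(w for _, w in inferable.values()) - 1.0) < 1e-9
    risks = [p * w for p, w in inferable.values()]   # R_i(j), Eq. (2)
    return sum(risks) / len(risks) * 100             # Score(j), Eq. (1)

profile_j = {
    "home_city":      (0.90, 0.2),   # easily inferred from friends' locations
    "political_view": (0.40, 0.5),   # weighted highly by the user
    "relationship":   (0.70, 0.3),
}
print(f"Score(j) = {privacy_score(profile_j):.1f}")  # prints Score(j) = 19.7
```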

5 When the user first installs the PID using a particular account, she has to give the PID permission to access certain information related to her account, which is required for the application to work appropriately.


Figure 2. Privacy Meter – Architectural View.


2.2.3. WallGuard

WallGuard is the component designed to give the user the ability to automatically control the kinds of messages she and others post on her private space (a.k.a. "wall" in Facebook jargon). This feature clearly contributes to raising awareness about, and reducing, the privacy risks associated with implicitly or explicitly sharing messages that may cause negative outcomes [1]. WallGuard makes it easier for the user to decide what information to publish to, or receive from, a certain person or audience/group of people. It automatically mines all the posts the user is about to share, detects which of them may violate the user's privacy expectations, e.g. because they contain controversial language, and gives the user the chance to reconsider her decision before sharing. Figure 3 shows an illustration of a use case for WallGuard. Figure 4 shows an architectural view of WallGuard. It contains a blacklist-based filter that takes the message intended to be posted, together with a set of rules specified by the user, and decides whether the message should be blocked, based on a set of default filtering words, or be processed further. This module relies on a default blacklist and user-defined words. The default blacklist is a compilation of socially unacceptable, indecent words. In addition, the user can add or remove words from the blacklist as she pleases. If the post contains one or more words present in the blacklist, the message is passed to a Message Scheduler. Upon receiving the blacklisted post, the Message Scheduler applies a predefined scheduling policy and signals to the Notification Agent that the post contains indecent/inappropriate words. The Notification Agent then leverages the visualization libraries and plugins in the Web browser to issue notifications, warnings and recommendations to the user. In contrast, if the message intended to be posted does not contain words from the blacklist, semi-supervised text classification techniques are leveraged to automatically assign the post to a predefined category. For this purpose we have used the Distributed Web Crawler to build an initial set of OSN-like messages extracted from various online platforms.


Figure 3. Use Case Overview for WallGuard: User wants to control what she and others post on her wall.

Figure 4. An architectural view of WallGuard.

We then labelled each entry in the collection with categories of sensitive topics that may be a cause of regret on social networks. Relying on feature selection and extraction methods for text classification [18,19], we extracted and compiled a large set of characterising features for each of the categories. Such a set is generally referred to as a corpus. Our corpus is then used to train a high-quality text classifier. The classifier, which is basically a machine learning algorithm, represents a function that takes a message as input and returns a score value representing, for instance, the probability that the message belongs to a given category. Borrowing from Wang et al. [1], we define six categories of sensitive topics: Personal and Family Issues; Work and Company; Religion and Politics; Alcohol and Illegal Drug Use; Profanity and Obscenity; and Offensive or Violent Language.
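The first stage of WallGuard described above (blacklist check, then categorisation) might be sketched as follows. The blacklist contents and the stand-in classifier are invented for illustration; only the category names come from the text.

```python
# Sketch of WallGuard's filtering pipeline: blacklist check first, then
# categorisation of the pending post. Blacklist entries and the stub
# classifier are illustrative; a trained classifier would replace `classify`.
import re

DEFAULT_BLACKLIST = {"indecentword", "slur"}      # placeholder entries
USER_BLACKLIST = {"project-codename"}             # user-defined additions

def contains_blacklisted(post: str) -> bool:
    words = set(re.findall(r"[\w-]+", post.lower()))
    return bool(words & (DEFAULT_BLACKLIST | USER_BLACKLIST))

def route(post: str, classify) -> tuple:
    """classify(post) -> (category, probability); stands in for the trained
    text classifier. Returns what is handed to the Message Scheduler."""
    if contains_blacklisted(post):
        return ("BLACKLISTED", 1.0)
    return classify(post)

stub = lambda post: ("Work and Company", 0.84)    # pretend classifier output
print(route("my boss at project-codename is awful", stub))  # blacklist hit
print(route("my boss is awful", stub))            # ('Work and Company', 0.84)
```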


The Message Scheduler receives as input the original message, details about the intended recipient/audience, and the message category together with the related score value from the classifier. Based on these and according to a set of predefined rules, the Message Scheduler decides whether the message should be posted or not. Our architecture supports both a priori and a posteriori enforcement of the predefined set of rules. In both approaches, the architecture provides individuals with the ability to fully specify rules, expressing policies that should prevent themselves and others from publishing posts on their Walls that may be interpreted as offensive, indecent, abusive, or controversial.

In an a priori enforcement approach, the user defines rules that prevent posting messages, relying on features characterising certain sensitive topics and the level of trust she may hold towards the intended audience. In this paper we assume that trust ratings are assigned to the user's (groups of) contacts. Such ratings (e.g. from one star up to three stars) could either be specified by the user herself or automatically inferred from the details that characterise the strength of the social ties6 between the user and her contact. We define a policy for a priori enforcement as a function that maps sets of elements of the form (<MessCategory>, <AudienceLevelOfTrust>) to the set of actions {ALLOW, DENY}, whereby <MessCategory> refers to the result of the text classifier and <AudienceLevelOfTrust> to the level of trust the user holds towards the intended audience. Rules, in turn, are statements of specific mappings of tuples to either ALLOW or DENY. The statements themselves rely on results from the classifiers and details about the intended audience. An example privacy policy rule would map the tuple ("Personal and Family Issues", "Two Stars") to the value DENY, indicating that the user should be prevented from sharing messages related to personal and family issues with those of her contacts having less than a three-star rating. The policy decision by the Message Scheduler is then passed to the Notification Agent, along with the output of the classifiers and details about the intended audience (i.e. identifier, level of trust). Depending on the outcome of the policy decision, notifications of various types may be displayed to the user. Figure 5 shows an example of such notifications.
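A priori enforcement then reduces to a lookup of the user's rules. The following sketch is illustrative only: the rule values are hypothetical, and defaulting unmatched cases to ALLOW is a design choice of the example, not something the architecture prescribes.

<?php
// Illustrative sketch of the a priori policy check: a rule maps a
// (message category, audience trust level) pair to ALLOW or DENY.

const ALLOW = 'ALLOW';
const DENY  = 'DENY';

// Example rule set; trust is rated from one to three stars as in the text.
$rules = array(
    'Personal and Family Issues' => array(1 => DENY, 2 => DENY, 3 => ALLOW),
    'Religion and Politics'      => array(1 => DENY, 2 => ALLOW, 3 => ALLOW),
);

// $category is the classifier output for the post; $trustStars is the rating
// the user holds towards the intended audience.
function evaluatePolicy(array $rules, $category, $trustStars)
{
    if (isset($rules[$category][$trustStars])) {
        return $rules[$category][$trustStars];
    }
    return ALLOW; // example default for unmatched cases
}

// A post about family matters addressed to a two-star contact is blocked,
// and the Notification Agent is signalled instead.
echo evaluatePolicy($rules, 'Personal and Family Issues', 2); // DENY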

Figure 5. A notification by WallGuard: it illustrates the result of the a priori enforcement of a rule specified by the Wall owner, preventing her from posting messages related to politics and religion.

6 Granovetter [2] defines "strength of a social tie" as a combination of the amount of time, the emotional intensity, the intimacy, and the reciprocal services that characterize such a tie.


In addition, the WallGuard system design relies on an a posteriori enforcement approach to automatically verify whether the messages posted on the user's Wall comply with her pre-defined privacy preferences. For a posteriori enforcement, a preference/policy is defined as a function that maps sets of elements of the form (<MessCategory>, <OriginLevelOfTrust>) to the set {COMPLY, VIOLATE, NEUTRAL}, whereby <MessCategory> refers to the result of the text classifier and <OriginLevelOfTrust> to the level of trust the user holds towards the source/author of the message being analysed. For that, the Web Crawler component is used to extract metadata and the content of recent messages from the user's Wall. Depending on the topic of a given post and the user's trust towards its origin, the Message Scheduler evaluates a set of pre-defined rules and indicates its decision to the Notification Agent. The Notification Agent then uses these indications to signal compliance with, or violation of, the user's actual privacy preferences. For signalling purposes, WallGuard relies on a colour coding inspired by the well-known traffic light metaphor: green for compliance with the user's preferences, red for violation of these preferences, and yellow for neutral decisions. Neutral decisions are indicated if an automatic analysis of the message was not possible. This may be the case for extremely short messages.
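The a posteriori check and its traffic-light signalling can be sketched as follows. This is one possible reading of the design: the preference format (a minimum trust level per sensitive topic) and the example values are assumptions of the sketch, not part of the specification.

<?php
// Illustrative sketch of the a posteriori compliance check with
// traffic-light signalling: green = COMPLY, red = VIOLATE, yellow = NEUTRAL.

function checkPost($category, $originTrustStars, array $preferences)
{
    if ($category === null) {
        return 'NEUTRAL'; // e.g. the post was too short to be analysed
    }
    if (isset($preferences[$category]) && $originTrustStars < $preferences[$category]) {
        return 'VIOLATE'; // the author's trust level is below the required minimum
    }
    return 'COMPLY';
}

$colours = array('COMPLY' => 'green', 'VIOLATE' => 'red', 'NEUTRAL' => 'yellow');

// Hypothetical preference: drug-related posts require a three-star author.
$prefs  = array('Alcohol and Illegal Drug Use' => 3);
$result = checkPost('Alcohol and Illegal Drug Use', 1, $prefs);

echo $colours[$result]; // "red": flag the wall post as violating the preferences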


2.2.4. AtyPiCAL – Apps Terms & PolICy AnaLyser

AtyPiCAL is the component that supports social Web users in making informed decisions when installing third-party applications (a.k.a. "apps"), based on a clear understanding of the related Terms of Service (ToS) and Privacy Policies (PP). ToS and PP for Web applications, e.g. OSN apps such as online games, are legal agreements between the user and the service provider. That is, users of such applications are subject to the terms and conditions stated in these documents whether or not they are aware of or understand them. PP typically provide information about the type of personal data the service provider collects, the purpose of the collection, whether and how the collected information will be shared with others, and the various security measures taken to protect users' personal information. ToS, on the other hand, additionally specify conditions (e.g. copyright licensing) on information inserted by users and limitations of liability with respect to users' online actions, behaviour, and conduct. Despite the importance of ToS and PP for trust and privacy protection, recent research indicates that all too often individuals neither read nor understand them, citing the fact that ToS and PP as plain-text documents typically contain too much legalese and thus are generally difficult to understand [24,25]. As a worrisome consequence, most users tend to consent to whatever terms are offered when installing third-party applications. In addition, third-party apps introduce a new source of privacy and security threats. Indeed, some of them are poorly designed or even malicious apps [28,51].

AtyPiCAL leverages natural language analysis and machine learning techniques provided by the KPAE Engine to automatically detect privacy-invasive third-party apps and support the user's decision process. It relies on functionalities of the Web Crawler to extract raw texts from both the ToS and PP of a given application. It then applies pre-processing techniques such as stop-word removal and tokenisation to transform the crawled texts into a suitable representation. Additional natural language and data mining techniques are then used to extract characterising features from the texts' representation. For the purposes of this work, we have compiled a set of features based on an analysis of various privacy and data protection frameworks, mostly the FTC Fair Information Practice Principles [44], the European Directives 95/46/EC [16], 2002/58/EC [20], and 2006/24/EC [26], and the proposed new European Union Data Protection Regulation [8]. Table 1 shows a brief description and taxonomy of the features.


Table 1. Overview of relevant privacy categories and features for AtyPiCAL.

Notice, Awareness and Transparency
  Availability: The ToS and PP are available and easily accessible to all users.
  Inform about Collection of PII: The provider discloses what user data is going to be collected.
  Reason/Purposes of the Collection: The provider explains why user data is going to be collected and how (for which purpose) it will be used.
  Use of Cookies (→ Tracking): The provider explains which data collection techniques, e.g. cookies, it is using.
  Sharing (+ Conditions) with a Third Party: The provider informs whether it shares user data with third parties. If so, it should explain how and what kind of data are shared, and what rights the individual can exercise against the third party, especially regarding deletion.
  Secondary Use: The provider informs whether it uses information for purposes different from the original one.
  Data Transfer Outside the EU: The provider informs whether and how the data it collects may be transferred or processed outside the European Union.
  Notification of Policy Change: The provider explains how eventual changes to its ToS and PP would be handled and how users would be notified.
  Transparency on Data Request by Government/3rd Party: The provider informs whether the government or any other third party has requested information about the user.

Choice
  Explicit/Consented Sharing; Opt-out/Opt-in by Default: The provider informs about clear options for the user, e.g. ways to opt out of use of personal data, before sharing data with third parties, while giving the opportunity to actively object to it.

Access/Redress
  Easy Data Access: The provider informs about easy-to-use options to access and view the collected personal data.
  Data Consistency Check: The provider explains whether and how the user can review and check the completeness and correctness of information about her.
  Data Update: The provider informs about easy-to-use options to modify the collected personal data if it is considered inaccurate.
  Data Deletion: The provider informs that data that is no longer necessary will be deleted. It explains how the user may ask to delete the collected/shared data from its servers.
  Data Portability Right: The provider informs whether and how the user can freely and completely transfer her data to another service domain.

Security
  Provider-Side Data Protection Measures: The provider informs about the measures it has set in place to protect user data.

Enforcement
  Sanctions against Privacy Violation: The provider explains if and what sanctions and penalties it may impose on its users for violations of its policies and code of conduct, according to the law.
  App (or Provider Itself) is Certified and Monitored: The provider informs whether it has been subjected to a certification and continued compliance monitoring programme, as well as providing references to such a programme.

Child Safety
  The provider informs about its processes, rules, and measures aiming at protecting the privacy and data of users of young ages.

The system design of AtyPiCAL relies on text classification algorithms to automatically decide whether a given pre-computed representation of ToS and PP contains one or several of the feature categories depicted in Table 1. We use and combine different algorithms for different features in order to increase the accuracy of the overall approach. To train our classifiers, we have built a Corpus of excerpts from Terms of Service and Privacy Policies annotated with the categories and features presented in Table 1. We have relied on annotations from two main sources when building our Corpus: the Privacyscore Website from the PrivacyChoice project7 and the "Terms of Service; Didn't Read" (short: ToS;DR) project Website.8 On these Websites, privacy activists and usability experts present results (i.e. scores and labels) of manually assessed ToS and PP against criteria, categories, and features similar to those used by AtyPiCAL.

In order to support the user's decision prior to the installation process, AtyPiCAL computes an overall score which expresses the level of privacy risk associated with the installation of the given App, based on the output of the classification algorithms. Recognising the subjective character of privacy, our model assumes that different users may have different points of view about the necessity of having a certain feature present in the agreement with a service provider. Accordingly, AtyPiCAL provides users with the ability to select and weight categories and features. That is, given the ToS and PP of an App, AtyPiCAL parses both documents, performs a semantic analysis of each taking into account a user-defined set of categories, and computes the overall score. The score is calculated as an arithmetic mean using the following formula:

\[ \mathrm{Score}(App_j) = \frac{100}{|S|} \sum_{i=1}^{|S|} w_i D_i \tag{3} \]

where $|S|$ is the number of user-selected categories (i.e. $|S| = \sum_i w_i$) and $w_i \in \{0, 1\}$ is the weight the user assigns to a category $i$, e.g. "Notification of Policy Change": 1 means the category is important for the user, 0 means it is not. $D_i$ represents the output of the classification and semantic analysis algorithms on a document for category $i$. $D_i$ is either 0 (the category is not present in the document), $\frac{1}{2}$ (the category is present but a semantic analysis was not possible, i.e. the expected feature could not be extracted due to vagueness and ambiguity in the text), or 1 (both the category and the expected feature are present in the text). The overall score, along with detailed per-category results, is passed to the Notification Agent. Relying on features provided by the Visualization Libraries, AtyPiCAL then displays this information in the Web interface. The display of the analysis results is inspired by traffic light and smiley metaphors.

7 http://www.privacyscore.com/
8 http://tosdr.org/about.html
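As an illustration, Eq. (3) amounts to the following computation. The sketch uses hypothetical category selections and classifier outputs; it is not the AtyPiCAL implementation itself.

<?php
// Illustrative sketch of the AtyPiCAL score of Eq. (3):
// Score = (100 / |S|) * sum over user-selected categories of w_i * D_i,
// with w_i in {0, 1} and D_i in {0, 0.5, 1} as defined in the text.

function atypicalScore(array $weights, array $classifierOutput)
{
    $selected = array_filter($weights);     // categories with w_i = 1
    if (count($selected) === 0) {
        return 0.0;                         // nothing selected, nothing to score
    }
    $sum = 0.0;
    foreach ($selected as $category => $w) {
        $d    = isset($classifierOutput[$category]) ? $classifierOutput[$category] : 0.0;
        $sum += $w * $d;
    }
    return 100.0 * $sum / count($selected);
}

// Hypothetical user selection and classifier/semantic-analysis outputs.
$weights = array('Notification of Policy Change' => 1, 'Data Deletion' => 1, 'Use of Cookies' => 0);
$output  = array('Notification of Policy Change' => 1.0, 'Data Deletion' => 0.5);

echo atypicalScore($weights, $output); // 75: both selected categories found, one only partially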

3. System Implementation

We have built a prototype of the Personal Information Dashboard (PID) based on the design described in Section 2. Our current implementation of the PID is to be considered a research prototype, interfacing at the moment with Facebook only. This decision was motivated by the prevalence of Facebook among today's OSNs and the fact that, compared to other OSN platforms, Facebook provides a large set of APIs which enable third-party developers to retrieve data from, or post data to, its platform. Integration with Twitter, Google+ and other OSNs that provide strong support for third-party applications like our PID is part of future work already underway.


Figure 6. Login window to the Personal Information Dashboard.

In this section, we give an overview of the current implementation of the PID. We present the different user interfaces included in the PID and discuss the functionalities they provide to OSN users.

The PID is implemented as a standalone Web application which integrates three Facebook applications: Privacy Meter, WallGuard, and AtyPiCAL. Each of these applications exposes a Web front-end interface to provide the user with transparency and control. Each interface supports TLS and thus can be accessed using HTTPS. The PID's backend components are written in the PHP programming language and use PostgreSQL, an open-source object-relational database system, for storage of persistent data (e.g. the user's preferences for WallGuard). The implementation of the underlying Knowledge and Privacy Analytics Engine (KPAE) leverages the "Learning-Library-for-PHP",9 a PHP-based open-source library of machine learning and natural language processing algorithms.

As illustrated in Fig. 6, the PID first requests users to log in. For that, the Facebook PHP SDK2 is used to request users' Facebook account details and permission to access and download the user's data. In addition, we use the Facebook PHP SDK2 to implement the user's logout. Both the Graph API and the Facebook Query Language (FQL) are used to query the user's and friends' profile data. Please note that the current version of our prototype only works with data from Facebook; integration of other OSNs is further work already underway.

Upon successful login to the prototype, the user is presented (see Fig. 7) with a control panel (a.k.a. "Dashboard view") which provides a summary and visual depiction of her current level of privacy exposure as computed by the PrivacyMeter, her activities (posts, comments) over the last four months, the geographic distribution of her friends, and a word cloud generated from her posts and comments, among other things. In addition, the "Dashboard view" provides users with three additional tabs as shown in Fig. 8. The most important tabs available on the main control panel are "Profile & Resource Management" and "Privacy Analytics Tools". Both are drop-down menu lists which contain links to other user interfaces with detailed and complete analysis results, and links to additional control options. "Profile & Resource Management", for instance, contains links to a panel named "My Data" where the user is provided with detailed analyses of four types of data: profile data, photos, videos, and status.

9 https://github.com/gburtini/Learning-Library-for-PHP
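As a sketch of this login flow (not our production code): the app credentials, file path and requested permissions below are placeholders, and the calls assume the 2013-era Facebook PHP SDK interface (getUser, api, getLoginUrl).

<?php
// Illustrative sketch of the PID login: authenticate via the Facebook PHP
// SDK, then query profile data via the Graph API and status data via FQL.
require_once 'facebook-php-sdk/src/facebook.php';

$facebook = new Facebook(array(
    'appId'  => 'YOUR_APP_ID',      // placeholder
    'secret' => 'YOUR_APP_SECRET',  // placeholder
));

$userId = $facebook->getUser();

if ($userId) {
    try {
        // Graph API: basic profile data
        $profile = $facebook->api('/me');
        // FQL: recent status messages for the dashboard analytics
        $statuses = $facebook->api(array(
            'method' => 'fql.query',
            'query'  => 'SELECT message, time FROM status WHERE uid = me()',
        ));
    } catch (FacebookApiException $e) {
        $userId = null; // token expired or permission revoked: force re-login
    }
}

if (!$userId) {
    $loginUrl = $facebook->getLoginUrl(array('scope' => 'user_status,read_stream'));
    header('Location: ' . $loginUrl);
    exit;
}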


Figure 7. Main Screen of the Personal Information Dashboard: the “Dashboard view”.

Figure 8. Navigation tabs in the “Dashboard View”.

Figure 9 shows results related to status data (i.e. posts, likes, comments) generated by the user. Another control panel button, "Privacy Analytics Tools", lets the user select which of the privacy tools discussed in Section 2 she wants to use. In order to attract a large set of users, we have implemented the current versions of our proofs-of-concept for AtyPiCAL and WallGuard as Google Chrome extensions. The Web front-end of the PID relies on the Google Visualization Gallery10 to provide interactive graphs, charts, and other visualisations. Part of our Web front-end is implemented in Java; the user must therefore have Java applet support enabled to run our prototype.

10 https://developers.google.com/chart/interactive/docs/gallery


Figure 9. The "My Data" control panel presents what content has been shared, where, and with which permissions; here, the view for status data. Please note that the statistics referring to Twitter and Google+ serve demonstration purposes only: they were generated using synthetic data.


4. Related Work

Our work can be viewed as a new way of leveraging inference models and data mining to help users estimate the risks of unintended information disclosure and profiling, improve their self-representation, and make Terms of Service (ToS) and Privacy Policies (PP) more usable. We discuss the most closely related work in these areas below.

Inference Detection and Quantification. Early work on the detection and quantification of inferences focused on statistical databases [3]. Along with the rise of OSNs, researchers started looking at inference-related privacy risks in the context of OSNs. One active line of research in this area is de-anonymisation [4,27], in which the adversary relies on various techniques (e.g. inference techniques) to reconstruct the user's identity. De-anonymisation is often achieved by correlating publicly available information about the user with supposedly obfuscated user data. A prominent work in that area is by Narayanan and Shmatikov [4], who show that it is possible to correctly re-identify a user who wishes to keep private the fact that she has accounts on both Twitter and Flickr. Several well-publicised privacy incidents in recent years have demonstrated the power of de-anonymisation through correlation of data sources.

Aside from work on de-anonymisation, attribute inference [29–33] and quantification [34–38] of privacy risk in OSNs are the two other lines of research most closely related to this paper.


The research question here is how to measure the probability that sensitive, per se hidden user information can be inferred. Unlike in the case of de-anonymisation, researchers have been exploiting OSN data to infer details about what the user shares implicitly, without the need for correlation with external auxiliary information. For instance, Mislove [29] and Zheleva [30] have shown how to infer specifics about the user by leveraging community detection and relational/group-based classification algorithms, respectively, considering features such as group membership and friendship ties. Recently, Kosinski et al. [31] showed how easily accessible information, e.g. Facebook Likes, can be leveraged to automatically and accurately predict highly sensitive user information (e.g. sexual orientation and political views), personality traits, and emotional state. Further similar work includes [32,33].

With regard to work specific to the quantification of privacy risks, Irani et al. [34] and Krishnamurthy and Wills [35] proposed metrics to quantify the size of a user's online social footprint. Their underlying assumption is that the adversary can aggregate user-related information from several sources (social networking sites) and thus build a more complete, potentially more invasive, profile of the user. We note that a problem with the approaches described above is that they are more concerned with deducing specific characteristics of individuals and/or re-identifying OSN users than with leveraging the underlying learning and prediction models to increase OSN users' awareness of, and control over, their privacy. Liu and Terzi [36] proposed a framework for computing the privacy scores of OSN users. Their model captures the sensitivity and visibility of information (profile attributes) explicitly shared by the user. Unlike our proposal, [36] does not measure inference-related privacy risks, i.e. those risks caused by what the user implicitly shares. Work by Becker and Chen [37] and Talukder et al. [38] presents approaches to infer sensitive user identity attributes given the details in her OSN profile and the list of her OSN friends. In both papers the computed overall probability of attribute inference represents the user's privacy risk caused by using third-party Facebook applications. Compared with [37] and [38], our approach measures the privacy risk arising from the different privacy issues facing individuals actively participating in various OSNs. In addition to profile information, our proposal considers the strength of ties between users and the language in shared content.

Leveraging Automation to Thwart Unwanted Posts. Our work also relates to the challenging problem of increasing OSN users' awareness of, and control over, the disclosure of sensitive information, based on the privacy-related regrets associated with users' posts as studied in [1]. The EU FP7 di.me project11 proposed a prototype [47] that employs natural language processing techniques to detect privacy-sensitive online posts and raise users' awareness with respect to their auto-disclosure of privacy-sensitive information online. Unlike our proposal, the prototype by the di.me project does not empower users to define rules for analysing and filtering privacy-sensitive online posts. Recently, Vanetti et al. [39] proposed a user-centric mechanism for content-based message filtering on social networks. Through the use of machine learning techniques, their proposal enables users to automatically analyse messages intended to be posted on their profile walls. The decision to publish or filter a message does not only depend on the relationship between the user and the recipient/audience of her message. It also takes into account the result of the classification algorithm applied to the user's message, as well as her preferences, which are specified either in "Filtering Rules" or in BlackLists of users who are temporarily prevented from posting any kind of message on the user's wall.

11 www.dime-project.eu/


This work focuses on protecting the user from risks related to other people's postings on her wall. Unlike our approach in WallGuard (see Section 2.2.3), it does not consider options to empower the user to protect herself from privacy risks related to her own postings. To the best of our knowledge, the work in [39] and our proposal in this paper are the only works proposing user-centric and automated approaches to thwart unwanted posts on OSNs based on the enforcement of both user-defined relationship-based and content-based access rules.

Making ToS & PP More Usable. Several works [39–43] have explored innovative new ways to provide usable alternatives to the traditional form of notice used as a means to signal privacy policies and terms of service. Ideas proposed so far include non-verbal or "visceral" privacy notices as coined by M. Calo [40]; the Platform for Privacy Preferences (P3P) [41] machine-readable formats and related agents for signalling and negotiating privacy policies; privacy dashboards12 deployed on the user's side; and systems that leverage machine learning techniques to analyse and categorise privacy policies for online services. Recently, Costante et al. [42] presented an approach to (semi-)automatically assess the completeness of privacy policies. Like our AtyPiCAL proposal, their approach leverages machine learning and text classification techniques as well as simple visualisations to address the drawbacks of the traditional form of privacy notice. However, that work does not consider applying natural language processing techniques to additionally extract semantic information out of a privacy policy, and as such does not cover the same scope as the AtyPiCAL module suggested in this paper.

5. Discussion and Conclusion

In this section, we evaluate our proposal with respect to the requirements identified in Section 1.2, summarise our results, and outline our plans for future work.


5.1. Enhancing User Control over Personal Data Sharing

Enhanced Transparency and Privacy Control: By aggregating multi-source user data and finding correlations within this data, our tool can keep users informed about information-sharing risks on OSNs and provide recommendations on how to mitigate such risks. We use data mining techniques to aggregate and analyse information from the user and her contacts, in order to inform or warn her that certain of her attributes intended to remain secret are inferable with high, moderate or low probability. Control options are provided through privacy recommendations that, if followed by the user, would mitigate user profiling or the unintended disclosure of sensitive identity attributes.

Another key benefit of our proposal is that it helps raise a user's awareness with respect to possible collisions of contexts and the risk of posting sensitive information. Our framework relies for this purpose on user-specified preferences. These preferences are modelled as combinations of relationship-based access control rules and content-based access control rules, specifiable through a simple user interface as depicted in Fig. 10. Note that control of information sharing in current OSNs is largely modelled through relationship-based access rules. Disclosure decisions based on the sensitivity of a piece of information are commonly left to the user, who experiences cognitive biases and overload.

the EU FP7 ICT Project PrimeLife Privacy Dashboard: http://www.primelife.eu/images/stories/ primer/dashboard-plb.pdf. Digital Enlightenment Yearbook 2013 : The Value of Personal Data, IOS Press, Incorporated, 2013. ProQuest Ebook Central,


Figure 10. User interface to customise and change WallGuard’s settings.

Our approach is an extension of the widespread but limited relationship-based access control model [6], aiming at preventing unintended and harmful information disclosure. We employ natural language processing and data mining techniques to automatically analyse and extract discriminative features from the user's messages. Such features, along with metadata about the intended recipients/audience or sender, are used by the framework to enforce the user's preferences. Through a simple Web interface, the user is provided with transparency and control options with respect to the availability and level of visibility (i.e. visible to her only or to a subset of her contacts) of her personal data across OSN domains. As a consequence, the user can review which data is stored in which ecosystem and to which audiences it has been made visible. By providing means to automatically assess the terms, conditions and promises that service providers specify in their ToS and PP, our proposal contributes to increased transparency and improved comprehension of privacy disclosures.

Usability, a Key Concern Throughout the Project: Recognising that communicating and controlling privacy risks often introduces additional cognitive load on users, we have designed and implemented the PID with simple user interfaces and intuitive dialogues and recommendation formulations to support transparency and guide the end user as much as possible. By focusing on bringing the key facts to the user's attention, we pursued an approach that avoids overburdening the end user. Additionally, the PrivacyMeter, part of our current implementation, was submitted to a usability evaluation. The user study, in the form of an online survey with 19 participants, yielded encouraging feedback about the PrivacyMeter's user experience and functionality. The results tentatively indicate acceptance of the PrivacyMeter as a tool to manage privacy risks related to vulnerabilities in OSN profiles. However, a large-scale and more complete usability study of the overall PID is left for future work.

5.2. Current Limitations and Challenges Ahead

In designing the PID, we sought to provide a straightforward yet effective and usable approach to ensuring information and risk awareness, controlling privacy online, and thus enabling trust. Towards these high-level goals, we built a system that allows the user to capture her social networking footprint by aggregating data from multiple sources and linking her multiple OSN profiles, to assess and understand the associated level of privacy risk, and to receive suggestions for corrective measures aimed at reducing or preventing the risk of unintended information disclosure.


As of now, the PID system design presented in this paper should, however, be viewed as a first step towards achieving these goals. Two major challenges that demand considerable attention and are topics of future work already underway are improvement of the underlying inference and correlation models on the one hand, and improvement and empirical evaluation of the overall framework through a large-scale usability test on the other. In particular, we intend to focus on the following aspects and sub-challenges:

Efficient algorithms and accurate models. Part of our future work is to test our proposal with more elaborate inference algorithms and models. To improve the accuracy of such models, we are already working on extensions that consider not only profile attributes and data about the user's direct contacts but also other types of OSN data (cf. [46]), such as behavioural data and background knowledge available outside the realm of OSNs. Accordingly, we plan to tailor our proposal to scenarios from the emerging era of smart environments, where interconnected ecosystems, each augmented with embedded and networked sensor systems, are capable of continuously collecting and sharing information about people's emotions, activities and behaviours. We will develop models to quantify the privacy risks associated with intrusive third parties' ability to infer sensitive user behavioural patterns and personally identifiable information, based on the correlation of data disclosed in OSNs with the digital traces the user generates when using technologies and services in public/private smart spaces. Extending our framework to encompass such accurate inference models and data outside social networks is challenging, but a necessary step towards technologies that support individuals' "transparency right for the profiling era" [7].

Integration of inference risk models and trust metrics. Towards addressing the challenge of leveraging elaborate attribute inference models to build usable transparency- and privacy-enhancing technologies, we plan to investigate the interplay between the quantification of inference-related privacy risks and trust metrics. More precisely, we will investigate how such an interplay can inform the design of privacy controls and recommendations in environments where social and pervasive computing converge. Another piece of future work will look at ways to enable the PID to learn from the user's interactions, preferences, and past decisions in order to accurately predict the trustworthiness of services, processes, devices, and peers around her.

Increased accountability. Accountability is, besides transparency, a critical factor in the level of trust between stakeholders within the user's various ecosystems. We intend to further extend our PID with mechanisms to empower users with the ability to: (i) specify and obtain reliable assurance about compliance with their privacy preferences, business policies, and data protection laws, regardless of the ecosystem in which disparate pieces of their personal data reside; and (ii) exercise their right to data portability and right to be forgotten [8].

Extending the PID to mobile applications. As another future research direction, we would like to extend and implement our proposal for resource-constrained devices such as smartphones and tablets, which have become all-in-one tools for everyday living.

Deployment alternatives and incentives. In our current design, the user needs to trust the PID server with the confidentiality and integrity of both her raw data and the information inferred from it.


We acknowledge that relying on a central server (whether provided by a cloud computing provider or hosted by an OSN provider) is, from a privacy point of view, far from perfect. Indeed, since the centralised PID server enables the aggregation of personal data from various sources to infer new sensitive information, it can generate aggregated digital footprints and detailed profiles of its users. This information can in turn be used for unwanted targeted advertising or intrusive surveillance. Hence, an ideal privacy solution would not rely on such central servers at all. However, doing so would clearly be challenging and even counterproductive in many respects. For instance, performing all the analytics in the Web browser is not an effective option. Therefore, a backend server for the PID is necessary, at least for now. In the future we will also look into approaches that ensure the confidentiality and integrity of personal information computed by the PID, given that it runs on semi-trusted servers. We will also study possible deployment incentives for the PID from the points of view of the different stakeholders (i.e. end users, OSN providers, or any other entity). In particular, we will investigate the feasibility of a user-centric deployment approach in which the PID server is a home-based computer, possibly implemented on the wireless router.

Better usability. When tackling the usability challenges, which largely remain future work, we plan to answer the following questions in the context of the proposed PID framework: How can we design and evaluate usable privacy policy/permission/consent management interfaces for the PID? How can we qualitatively measure the effectiveness of signalling privacy risks and recommendations?


5.3. Concluding Remarks

In this work-in-progress paper, we have presented our vision for a Personal Information Dashboard (PID), a service which relies on automation and visualisation techniques to empower individuals to move from the back seat to the driver's seat when it comes to the management of both their personal information and their privacy. We have described how the PID could aggregate and analyse personal data from various domains in order to allow users of social networks to understand and control the privacy risks related to the diffusion of their information. Our Dashboard provides users with options for information and privacy risk awareness as well as control over the dissemination of their personal data. As a consequence, users (i) become aware of the information about them available within and across different ecosystems; (ii) become aware of the context in which they disclose information; and (iii) are able to estimate trust towards other stakeholders and the impact a piece of data will have when shared with a certain audience or entity. In this paper, we conjecture that individuals in control of their data and privacy would ultimately become active and informed participants in a data economy. We have implemented the PID as a research prototype focusing on three of its key components: WallGuard, PrivacyMeter, and AtyPiCAL. However, our work clearly does not end here. We are currently working on implementing and evaluating the rest of our proposed framework. We also plan further research as discussed in the previous section.

Acknowledgements

The research reported in this paper has been supported in part by the German Federal Ministry of Education and Research (BMBF) within EC SPRIDE (www.ec-spride.de),


the Hessian LOEWE excellence initiative within CASED (http://www.cased.de/en.html), and the German National Academy of Science and Engineering project "acatech Internet Privacy – A Culture of Privacy and Trust for the Internet". We would also like to thank Marit Hansen for helpful discussions on an early PID implementation, as well as Ronald Marx and the anonymous reviewers for their useful suggestions.

References

[1] Y. Wang, G. Norcie, S. Komanduri, A. Acquisti, P.G. Leon, and L.F. Cranor, "I regretted the minute I pressed share: A qualitative study of regrets on Facebook", In: Proceedings of the Seventh Symposium on Usable Privacy and Security (SOUPS '11), ACM, New York, NY, USA, Article 10, 16 pages, 2011.
[2] M. Granovetter, "The Strength of Weak Ties", The American Journal of Sociology 78(6) (1973).
[3] C. Farkas and S. Jajodia, "The inference problem: A survey", SIGKDD Explor. Newsl. 4(2), 6–11 (December 2002).
[4] A. Narayanan and V. Shmatikov, "De-anonymizing social networks", IEEE Symposium on Security and Privacy (May 2009).
[5] J. Buchmann et al., "Internet Privacy – Options for adequate realisation", In: J. Buchmann (ed.), Springer Vieweg, 2013 edition, June 2013.
[6] M. Madejski, M. Johnson, and S.M. Bellovin, "The Failure of Online Social Network Privacy Settings", Tech. Report CUCS-010-11, Columbia University, February 2011.
[7] M. Hildebrandt, "The Dawn of a Critical Transparency Right for the Profiling Era", In: Digital Enlightenment Yearbook 2012, J. Bus, M. Crompton, M. Hildebrandt, and G. Metakides (eds.), IOS Press, Amsterdam, 2012.
[8] European Commission, "Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation)", 2012.
[9] S. Fischer-Huebner, C.J. Hoofnagle, K. Rannenberg, M. Waidner, I. Krontiris, and M. Marhöfer, "Online Privacy: Towards Informational Self-Determination on the Internet" (Dagstuhl Perspectives Workshop 11061), Dagstuhl Reports 1(2), 1–15 (2011).
[10] M. Hildebrandt, "Profiling into the future", Future of Identity in the Information Society (FIDIS) 1, 1–20 (2007).
[11] A. Acquisti, "Nudging Privacy: The Behavioral Economics of Personal Information", IEEE Security & Privacy 7(6), 82–85 (Nov.–Dec. 2009). doi:10.1109/MSP.2009.163.
[12] M. Sleeper, R. Balebako, S. Das, A.L. McConahy, J. Wiese, and L.F. Cranor, "The post that wasn't: Exploring self-censorship on Facebook", In: Proceedings of the 2013 Conference on Computer Supported Cooperative Work (CSCW '13), ACM, New York, NY, USA, pp. 793–802, 2013.
[13] J.M. Xu, B. Burchfiel, X. Zhu, and A. Bellmore, "An Examination of Regret in Bullying Tweets", In: Proceedings of NAACL-HLT, pp. 697–702.
[14] Preliminary FTC Staff Report, "Protecting Consumer Privacy in an Era of Rapid Change: A Proposed Framework for Businesses and Policymakers", December 2010.
[15] Y. Cheng, J. Park, and R. Sandhu, "Preserving user privacy from third-party applications in online social networks", In: Proceedings of the 22nd International Conference on World Wide Web Companion, International World Wide Web Conferences Steering Committee, pp. 723–728.
[16] EU Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data, Official Journal of the EC, 23, 1995, L 281, pp. 31–50.
[17] US Federal Trade Commission, "Protecting America's Consumers", 2012.
[18] Y. Yang and J.P. Pedersen, "A comparative study on feature selection in text categorization", ICML-97.
[19] S. Kotsiantis and D. Kanellopoulos, "Data preprocessing for supervised learning", International Journal of Computer Science 1(2), 111–117 (2006).
[20] Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector (Directive on privacy and electronic communications), Official Journal of the EC 2002, L 201, pp. 37–47.
[21] B. Krishnamurthy and C.E. Wills, "Privacy leakage in mobile online social networks", In: Proceedings of the 3rd Conference on Online Social Networks 44, 2010.
[22] N. Wang, H. Xu, and J. Grossklags, "Third-party apps on Facebook: Privacy and the illusion of control", In: Proceedings of the 5th ACM Symposium on Computer Human Interaction for Management of Information Technology (CHIMIT '11), ACM, New York, NY, USA, Article 4, 10 pages. doi:10.1145/2076444.2076448.
[23] P.H. Chia, Y. Yamamoto, and N. Asokan, "Is this app safe?: A large scale study on application permissions and risk signals", In: Proceedings of the 21st International Conference on World Wide Web (WWW '12), ACM, New York, NY, USA, pp. 311–320.
[24] T. Vila, R. Greenstadt, and D. Molnar, "Why we can't be bothered to read privacy policies: Models of privacy economics as a lemons market", In: L. Camp and S. Lewis (eds.), Economics of Information Security (Advances in Information Security, Volume 12), Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 143–153, 2004.
[25] A. McDonald and L. Cranor, "The cost of reading privacy policies", I/S: A Journal of Law and Policy for the Information Society 4(3) (2008).
[26] EU Directive 2006/24/EC of the European Parliament and of the Council on "the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks and amending Directive 2002/58/EC", Official Journal of the EC 2006, L 105, pp. 54–63.
[27] A. Acquisti, R. Gross, and F. Stutzman, "Faces of Facebook: Privacy in the age of augmented reality", BlackHat, USA, 2011.
[28] M. Egele, A. Moser, C. Kruegel, and E. Kirda, "PoX: Protecting users from malicious Facebook applications", In: Pervasive Computing and Communications Workshops (PERCOM Workshops), 2011 IEEE International Conference on, IEEE, pp. 288–294, 2011.
[29] A. Mislove, B. Viswanath, K.P. Gummadi, and P. Druschel, "You are who you know: Inferring user profiles in online social networks", In: WSDM, 2010.
[30] E. Zheleva and L. Getoor, "To join or not to join: The illusion of privacy in social networks with mixed public and private user profiles", In: WWW, 2009.
[31] M. Kosinski, D.J. Stillwell, and T. Graepel, "Private Traits and Attributes are Predictable from Digital Records of Human Behavior", In: Proceedings of the National Academy of Sciences of the USA, 2013 (in press).
[32] M. Pennacchiotti and A.-M. Popescu, "Democrats, republicans and starbucks afficionados: User classification in twitter", In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11), ACM, New York, NY, USA, pp. 430–438, 2011.
[33] A. Chaabane, G. Acs, and M.A. Kaafar, "You are what you like! Information leakage through users' interests", In: Proceedings of the 19th Annual Network & Distributed System Security Symposium (NDSS '12), 2012.
[34] D. Irani, S. Webb, K. Li, and C. Pu, "Large online social footprints – An emerging threat", In: Proceedings of the 2009 International Conference on Computational Science and Engineering, Volume 03 (CSE '09), pp. 271–276, 2009.
[35] B. Krishnamurthy and C.E. Wills, "Generating a privacy footprint on the internet", In: IMC '06: Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement, ACM, New York, NY, USA, pp. 65–70, 2006.
[36] K. Liu and E. Terzi, "A framework for computing the privacy scores of users in online social networks", ACM Transactions on Knowledge Discovery from Data (TKDD) 5(1), 6 (2010).
[37] J. Becker and H. Chen, "Measuring privacy risk in online social networks", In: Proceedings of the 2009 Workshop on Web, Volume 2, Citeseer.
[38] N. Talukder, M. Ouzzani, A. Elmagarmid, H. Elmeleegy, and M. Yakout, "Privometer: Privacy protection in social networks", In: Proceedings of the 26th International Conference on Data Engineering Workshops (ICDEW), IEEE, pp. 266–269, 2010.
[39] M. Vanetti, E. Binaghi, E. Ferrari, B. Carminati, and M. Carullo, "A System to Filter Unwanted Messages from OSN User Walls", IEEE Transactions on Knowledge and Data Engineering 25(2), 285–297 (Feb. 2013). doi:10.1109/TKDE.2011.230.
[40] V. Groom and M.R. Calo, "Reversing the Privacy Paradox: An Experimental Study", Soc. Sci. Res. Network (Sept. 25, 2011).
[41] World-Wide Web Consortium, The Platform for Privacy Preferences 1.1 (P3P1.1) Specification, 2006.
[42] E. Costante, Y. Sun, M. Petkovic, and J. den Hartog, "A machine learning solution to assess privacy policy completeness (short paper)", In: Proceedings of the 2012 ACM Workshop on Privacy in the Electronic Society (WPES '12), ACM, New York, NY, USA, pp. 91–96, 2012.
[43] P.G. Kelley, L.J. Cesca, J. Bresee, and L.F. Cranor, "Standardizing Privacy Notices: An Online Study of the Nutrition Label Approach", CHI 2010.
[44] Federal Trade Commission, "Fair information practice principles", http://www.ftc.gov/reports/privacy3/fairinfo.shtm.
[45] X. Su and T.M. Khoshgoftaar, "A survey of collaborative filtering techniques", Adv. in Artif. Intell. 2009, Article 4 (January 2009).
[46] B. Schneier, "A taxonomy of social networking data", IEEE Security and Privacy 8, 88 (2010).
[47] M. Bourimi, I. Rivera, S. Scerri, M. Heupel, K. Cortis, and S. Thiel, "Integrating multi-source user data to enhance privacy in social interaction", In: Proceedings of the 13th International Conference on Interacción Persona-Ordenador (INTERACCION '12), ACM, New York, NY, USA, Article 51, 7 pages, 2012.
[48] A. Acquisti and J. Grossklags, "Privacy and rationality in individual decision making", IEEE Security and Privacy 3(1), 26–33 (January/February 2005).
[49] M. Johnson, S. Egelman, and S.M. Bellovin, "Facebook and privacy: It's complicated", In: Proceedings of the Eighth Symposium on Usable Privacy and Security (SOUPS '12), ACM, New York, NY, USA, Article 9, 15 pages, 2012.
[50] M. Madejski, M. Johnson, and S.M. Bellovin, "The Failure of Online Social Network Privacy Settings", Tech. Report CUCS-010-11, Columbia University, February 2011.
[51] P.H. Chia, Y. Yamamoto, and N. Asokan, "Is this app safe?: A large scale study on application permissions and risk signals", In: Proceedings of the 21st International Conference on World Wide Web (WWW '12), ACM, New York, NY, USA, pp. 311–320, 2012.

Digital Enlightenment Yearbook 2013 M. Hildebrandt et al. (Eds.) IOS Press, 2013 © 2013 The authors. doi:10.3233/978-1-61499-295-0-165


Towards Effective, Consent Based Control of Personal Data

Edgar A. WHITLEY
Department of Management, London School of Economics and Political Science, Houghton Street, London WC2A 2AE, United Kingdom
[email protected], http://personal.lse.ac.uk/whitley

Abstract. The principle of consent is widely seen as a key mechanism for enabling user-centric data management. Informed consent has its origins in the context of medical research but the principle has been extended to cover the lawful processing of personal data. In particular, the proposed EU regulation on data protection seeks to strengthen the consent requirements, moving them from unambiguous to explicit. Nevertheless, there are a number of limitations to the way that even explicit consent operates in real-life situations, which suggest that an alternative, more dynamic form of consent is needed. This chapter reviews the key concerns with static forms of consent for the control of personal data and proposes a technologically mediated form of dynamic consent instead.

Keywords. Privacy, consent, dynamic consent, biobanking


Introduction: The Problem of Privacy and Consent1

Notions of consent are becoming increasingly important in questions of data protection and privacy. Leading privacy advocate Simon Davies compiled a report on the issues and trends that will dominate the privacy landscape in 2013 [1]. This report drew on a survey of over 180 privacy specialists from 19 countries. They were asked to identify the most influential privacy themes for 2013, and consent was listed third (after data aggregation and regulatory changes). Consent is also a key feature of the proposed EU regulatory changes regarding data protection (where it is understood as "clear, affirmative consent") [2].

Consent is a key feature of many privacy principles and features in many existing data protection guidelines and regulations, including the UK's Data Protection Act [3]. It is found (expressly or implied) in industry best practice and most notions of fair information processing, as well as in their implementation in various data protection regulations and guidelines. As such, informed consent is likely to be a significant issue for most public and private sector enterprises that handle personal data, whether for day-to-day operations or for innovative applications.

Much of our understanding of informed consent has its origins in what is seen as ethical medical practice. The importance of requiring informed consent from patients can be traced back at least to the abuses of medical practice in the Second World War [4].

1 This chapter draws on work undertaken as part of the EnCoRe project funded by the Technology Strategy Board (Grant TP/12/NS/P0501A) and the Engineering and Physical Sciences Research Council and the Economic and Social Research Council (Grant EP/G002541/1).


As a result, obtaining informed consent from patients, and the limitations of what is covered by this informed consent, directly influence the use of medical data and samples. For example, Dame Fiona Caldicott's review of information sharing for the UK Department of Health [5] noted the importance of allowing people to "give, refuse or withdraw explicit consent" and the need to ensure that these decisions are "traceable and communicated to others involved in the individual's direct care" [5, p. 13]. The report continued by reiterating the Helsinki principle that "Patients can change their consent at any time" [5, p. 13].

Consent, however, is a multi-faceted concept that is all too often 'black-boxed' and treated as unproblematic. For example, under UK law, consent is not actually required for all forms of data processing, and there are very significant practical issues around how "informed" consent might be operationalized [6]. From an enterprise perspective, for online consent to work, companies need to ensure that the person who is expressing the consent is in fact the relevant user and not someone else. One illustration of this is Facebook's requirement that all new accounts are opened with the person stating their age. This is intended to allow Facebook to filter age-restricted services, adverts and applications. It also relates to their terms of service, which state that "You will not use Facebook if you are under 13" [7, 4 Registration and account security, condition 5] and is a direct consequence of the US Children's Online Privacy Protection Act (COPPA). However, as Boyd et al. [8] note, Facebook is currently such a desirable space for young children that parents are increasingly being pressured into giving 'parental consent' and setting up Facebook accounts for them (typically by knowingly lying about the child's age, thus totally undermining the safeguards introduced by the service provider).


1. Data Protection and Consent

In January 2012, the EU put forward proposals for the fundamental reform of data protection within Europe [2]. The proposals are wide-ranging and update and modernize the principles enshrined in the 1995 Data Protection Directive [9] to guarantee privacy rights. They focus on reinforcing individuals' rights, strengthening the EU internal market, ensuring a high level of data protection in all areas (including police and criminal justice cooperation), proper enforcement of the rules, facilitating international transfers of personal data and setting global data protection standards [10]. A key feature of the proposed changes is that they are intended "to give people more control over their personal data… Wherever consent is required for data to be processed, it will have to be given explicitly, rather than assumed as is sometimes the case now" [10]. As such, it refines the earlier "Madrid Resolution" on international privacy standards [11].

The influential "Article 29" Working Party of the EU released an "opinion" on consent in 2011 [12]. That opinion "provides a thorough analysis of the concept of consent as currently used in the Data Protection Directive and in the e-Privacy Directive". It notes that the Council Common Position in 1995 introduced a standard definition of consent. Consent is defined as "any freely given specific and informed indication of his wishes by which the data subject signifies his agreement to personal data relating to him being processed". However, even this definition has resulted in areas of confusion and ambiguity.


The opinion gives "numerous examples of valid and invalid consent, focusing on its key elements such as the meaning of 'indication', 'freely given', 'specific', 'unambiguous', 'explicit', 'informed' etc.". These examples, drawn from the experiences of members of the Working Party, attempt to clarify aspects of the notion of consent, including questions about the timing at which consent must be obtained and how the right to object differs from consent. That the Working Party felt it necessary to clarify this fundamental area of data protection law and practice indicates that the concept of consent raises many legal and practical issues which were not necessarily anticipated when the Directive was originally drafted.

Raab and Goold [13] further suggest that "it is debatable whether 'consent' should be a further (data protection) principle in its own right, or whether – because it is so difficult to define and apply in practice – it should only play a supportive role to the package of other principles" [13, p. 63]. They note that, as consent is frequently set aside (see below) and is difficult to obtain, it is open to question whether it should form the basis for an informational self-determination foundation for privacy.

Before focusing on the problematic issues associated with consent, it is important to recognize that consent is not the only basis upon which data may be lawfully processed. For example, under the UK Data Protection Act [3] the fair processing conditions are:

• Processed with the consent of the data subject;
• Required by contract, or pre-contractual negotiations, with the data subject;
• Legal obligation for the data controller to process the personal data;
• Necessary to protect the "vital interests" of the data subject;
• Necessary for the administration of justice, parliament, functions under an Act, crown/government, or the public interest;
• Necessary for the "legitimate interests" of the data controller/third party unless the "processing is unwarranted… by reason of prejudice to the rights and freedoms or legitimate interests of the data subject", unless ordered otherwise by the Secretary of State.

That is, consent is only one (of six) possible conditions for lawful processing of personal data in the United Kingdom.
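Since consent is only one of several alternative grounds, the lawfulness test is disjunctive: any one condition suffices. The following minimal Python sketch encodes that structure; the condition names are paraphrases introduced here purely for illustration, and a real lawfulness assessment is of course a legal judgment, not a boolean check.

```python
# Schematic encoding of the six fair processing conditions (UK DPA 1998,
# Schedule 2, paraphrased). Condition names are illustrative assumptions.
FAIR_PROCESSING_CONDITIONS = {
    "consent_of_data_subject",
    "necessary_for_contract_or_negotiations",
    "legal_obligation_of_controller",
    "vital_interests_of_data_subject",
    "justice_parliament_crown_or_public_interest",
    "legitimate_interests_of_controller",  # unless prejudicial to the subject
}

def processing_may_be_lawful(grounds: set) -> bool:
    """At least one condition must hold; consent is only one possible ground."""
    return bool(grounds & FAIR_PROCESSING_CONDITIONS)

# Processing needed to perform a contract can be lawful without any consent.
print(processing_may_be_lawful({"necessary_for_contract_or_negotiations"}))  # True
print(processing_may_be_lawful(set()))                                       # False
```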

1.1. Article 29 Working Party Concerns

The Article 29 Working Party was set up under Article 29 of Directive 95/46/EC. It is an independent European advisory body on data protection and privacy. Its tasks are described in Article 30 of Directive 95/46/EC and Article 15 of Directive 2002/58/EC [14]. The Working Party issues opinions on a range of issues associated with data protection and privacy, including technological developments such as search engines [15]. It also reviews the changing technological basis of key data protection terms, such as the concepts of data controller and data processor [16]. In 2011 it issued an opinion on the definition of consent [12], and the question of consent also arose in relation to online behavioral advertising [17].

The Working Party argued that a key feature of consent is transparency towards the data subject: "Transparency is a condition of being in control and for rendering the consent valid. Transparency as such is not enough to legitimize the processing of personal data, but it is an essential condition in ensuring that consent is valid" [12].


Another important issue relates to how consent is signified and the timing of this signification. Although not explicitly stated in legislation, the use of consent as a lawful basis implies that consent must generally be given before processing starts. This is related to, but different from, the right of objection [cf. 18]. Consent, broadly defined, can be any form of "indication" of the data subject's wishes, although the Working Party argues that it should really involve some purposeful action (rather than consent being inferred from a lack of action). At present, continued use (for example, of an online service) is frequently taken to be an "indication" that the data subject consents to processing being performed.

Consent must also be 'freely given'. This condition raises the prospect that consent choices might be engineered (see below), and the set of default values for consent options is also important (cf. opt-in versus opt-out considerations [19]). Moreover, to avoid consent being seen as "Hobson's consent" [20], the data subject must have a choice of options available. For example, if consent (perhaps to be added to a marketing mailing list) must be given in order for an online purchase to be completed, then this consent is arguably not freely given.

Another requirement of consent articulated by the Working Party is that it should be "specific". Indeed, the opinion states that "blanket consent without specifying the exact purpose of the processing is not acceptable" [12, p. 17]. That is, consent "should refer clearly and precisely to the scope and the consequences of the data processing. It cannot apply to an open-ended set of processing activities" [12, p. 17], although there are important distinctions between telling a person how you are going to use their personal information and getting their consent for this [21, p. 8]. Informed consent also raises important issues about the intelligibility of the description of the purposes for which data is processed [21–23] and its readability [e.g. 24,25].
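The practical weight of those default values can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that only a fraction of users ever open the consent settings and that those who do always flip the default; neither figure comes from this chapter or the cited studies.

```python
# Why opt-in vs. opt-out defaults matter: under status-quo bias, most users
# keep whatever the choice architect preselects. All numbers are
# illustrative assumptions.
import random

def enrolled_fraction(default_enrolled: bool, n_users: int = 10_000,
                      interaction_rate: float = 0.2, seed: int = 1) -> float:
    """Fraction of users who end up enrolled in data sharing."""
    rng = random.Random(seed)
    enrolled = 0
    for _ in range(n_users):
        interacts = rng.random() < interaction_rate
        # Users who interact flip the default; everyone else keeps it.
        final = (not default_enrolled) if interacts else default_enrolled
        enrolled += final
    return enrolled / n_users

print(enrolled_fraction(default_enrolled=True))   # opt-out regime: ~0.80
print(enrolled_fraction(default_enrolled=False))  # opt-in regime:  ~0.20
```

The same population, facing the same choice, produces radically different consent rates depending solely on the preselected default.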


2. Consent and Control

The Article 29 Working Party review of consent [12] emphasizes the relationship between consent and control. Indeed, many surveys of the privacy literature identify the central role played by an individual's control over the use of their personal data [26,27]. For example, Introna's [28] review suggests that there are three broad categories of privacy definitions: privacy as no access to the person or the personal realm; privacy as control over personal information; and privacy as freedom from judgment or scrutiny by others.

Drawing on earlier discussions of the distinction between public and private realms, legal theorists began drawing out some of the implications of this distinction in terms of legal rights. One of the earliest and most significant was the argument by Samuel Warren and Louis Brandeis [29], who developed a right of privacy, namely "the right to be let alone", based on an earlier judgment by Thomas Cooley, who proposed "the right to one's person and a right of personal immunity" [see 30, p. 14]. That is, they saw privacy as closely related to being able to control actions and information about oneself. Privacy is thus associated with notions of personhood and self-identity. The Warren and Brandeis definition therefore falls within both Introna's first and second categories and raises questions about the kinds of controls that can reasonably be implemented or expected to limit access to the individual. For example, this helps us to distinguish between conversations undertaken in our home and those that take place


in a public space. We can control who enters our home and hence who might overhear our conversations; a level of control we cannot have in a public space.

Introna's second definition highlights what is often described as informational self-determination [31], a notion based on a 1983 ruling by the German Federal Constitutional Court. The argument here is that if an individual cannot reasonably control how their information is used (for example, if it is subject to searches by the authorities), then they may refrain from undertaking socially useful information-based activities such as blogging on particular topics. The third category, freedom from judgment by others, again relates to the disclosure and use of personal data by others. For example, in this category personal health data might reasonably be considered private because its involuntary disclosure may cause others to judge an individual's lifestyle choices [32].

Many scholars see privacy as having intrinsic value as a human right; something that is inextricably linked to one's essence as an (autonomous) human being. For example, Introna considers the hypothetical case of a totally transparent society (i.e. one where there is no privacy). He questions the nature of social relationships in such a space, asking how your relationship with your lover could differ from that with your manager: "What is there to share since everything is already known?" [28, p. 265]. This transparent world also highlights a more instrumental perspective on privacy: in a totally transparent world, competitive advantage (knowing something that your competitors do not) is not possible, or at least not sustainable.

All of these definitions share an implicit and limited view of the kinds of controls that an individual could or should have, particularly with regard to informational privacy. For example, Westin's [33] seminal book defines privacy as "the claim of individuals, groups or institutions to determine for themselves when, how and to what extent information about them is communicated to others" [33, p. 7]. Control, in this context, is seen as something that occurs at the start of a disclosure process, and privacy control is seen solely in terms of limiting what personal data is made available to others.

In practice, however, this is a rather partial view of how personal data is disclosed and shared with others. It is increasingly common for individuals to register with various online services and disclose data about themselves (name, email address, etc.). This data is then stored in enterprise databases for significant periods of time and may be shared with other parts of the enterprise or selected third-party organizations. Whilst in earlier times control over personal data may have been best exercised by preventing the data from being disclosed, in an internet-enabled society it is increasingly important to understand how disclosed data is being used and reused, and what can be done to control this further use and reuse.

2.1. Academic Studies of Informational Privacy Control

In the literature, online privacy concerns have been particularly associated with issues of trust in e-commerce [34], internet use [35] and personalization [36], with many studies noting that concerns about privacy may limit the ways in which individuals interact with organizations online, for example by refusing to disclose data or misrepresenting themselves to the company [35]. Issues of control and procedural fairness [37] are frequently mentioned in these studies. For example, Hann et al. [38] note in a footnote that "Control was commonly operationalized by allowing information to be disclosed only with the subjects' permission" [38, footnote 3], and Alge et al. [39] add a second facet to their model: once data


has been collected, "how much control one believes he or she has over the handling of information (use and dissemination)" [39, p. 222]. Similarly, Son and Kim [35] suggest that online companies "need to give their customers a certain level of control over the collection and use of their personal information" to increase perceived fairness. Unfortunately, the example they give of this level of control is limited to giving consumers the choice of whether to be included in the database used to send targeted marketing messages [see also 40]. Indeed, for some authors consumer control is limited to something that is "communicated on behalf of companies when they state on application forms that any personal information collected will not be shared with any other organization" [41, p. 40].

Copyright © 2013. IOS Press, Incorporated. All rights reserved.

2.2. Engineered Consent Questions about the nature of human agency have attracted the attention of numerous philosophers and social scientists. In relation to privacy, discussions have ranged from Jeremy Bentham’s Panopticon [42] through Foucault’s disciplinary structures [43] to contemporary discussions about behavioral ‘nudges’ [44]. This last category of studies highlights the important role of a “choice architect” setting appropriate default settings and taking advantage of norms of behavior. For example, if the empirical evidence suggests that most people don’t actively monitor and manage their pension funds then a libertarian paternalist position would propose setting up individuals with a well-performing generalist pension as a default whilst also allowing them to change their pension provider if they choose to. In the context of privacy and consent, Kerr et al. [45] examine the ways in which the consent-gathering process is frequently “engineered to skew individual decision making” [45, p. 8]. Whilst such activities create an illusion of free choice, they call into question the underlying premise of truly informed, freely given consent. In particular, they highlight the risk that “the full potential of the consent model may be compromised in practice due to predictable psychological tendencies that prevent people from giving fully considered consent, and withdrawing it once given” [45, p. 15]. To illustrate this point, Kerr et al. [45] present a hypothetical example of a privacyconcerned individual being told about a useful “breaking news story alert” functionality provided by their favorite newspaper. Unfortunately, in order to obtain this functionality, they are required to register with the online newspaper, disclose various pieces of personal data and consent to various uses of this data. Thus, for the individual to gain the immediate benefits of the service, they must accept loss of control over the personal data that is disclosed to the newspaper. Although legally they are able to revise this initially given consent, Kerr et al. [45] argue that people will tend to provide the required personal information and offer their consent if the perceived benefits of the alerting service outweigh the perceived costs. From a ‘nudge’ perspective, it is apparent that subscribing to the alerting service results in an immediate and positive gain. Against this needs to be balanced the subjective loss of control and important future consequences regarding uses of the data. Similarly, revoking consent at some point will result in an immediate loss of benefit (no more alerting messages received) and a potential long term advantage (data not being (mis)used by unknown third parties). In each case, Kerr et al. [45] point out that the subjective utility of the decisions are heavily skewed. A variety of behavioral studies have shown that, in general, the subjective utility of a benefit change is more pronounced when it happens now rather than at

Digital Enlightenment Yearbook 2013 : The Value of Personal Data, IOS Press, Incorporated, 2013. ProQuest Ebook Central,

E.A. Whitley / Towards Effective, Consent Based Control of Personal Data

171

Other academics question the true role of seeking informed consent in many organizational settings. For example, Heimer [47] studies the use of consent forms in the context of HIV/AIDS health care. She suggests that a primary use of the signed form is to act "as a shield for the organization should questions be raised about the study" [47, p. 21]. Moreover, "the 'consenting' of research subjects often follows rather than precedes the decision to participate. People arrive at the point of being 'consented' having already made a considerable investment in research participation" [47, p. 23]. A similar point is made by Anderson and Agarwal [48] who, noting the effect of emotions on health decisions, raise important policy questions regarding the timing of the consent process: "If people's judgments vary with their emotions related to their health at a given point in time, should consent be sought at every interaction with a healthcare professional?" [48, p. 486].

Copyright © 2013. IOS Press, Incorporated. All rights reserved.

3. Towards a Dynamic Model of Consent Data

These concerns about consent shaped a large inter-disciplinary research project into informational privacy, undertaken collaboratively by UK industry and academia. The EnCoRe project [49] ran from June 2008 to April 2012 and included Hewlett Packard Laboratories, HW Communications Ltd, the London School of Economics and Political Science, QinetiQ, and the Computer Science Department and the Centre for Health, Law and Emerging Technologies (HeLEX) at the University of Oxford as partners. The project ran alongside two other projects (VOME [50] and PVNets [51]) within the same funding program.

EnCoRe's work started with what might be termed "natural consumer behaviour". For the reasons outlined above, it soon became clear that, despite the legal and ethical requirements underlying consent, it was unreasonable to expect that all forms of consent would really be informed and freely given; instead, consent is often given without individuals reading the terms and conditions of the proposed service or reflecting on the implications of their choice. As a result, EnCoRe came to consider consent as a dynamic (and changeable), rather than static, process. To achieve this, the project sought to develop scalable, cost-effective and robust consent and revocation methods for controlling the usage, storage, location and dissemination of personal data; methods that would offer effective, consent-based control over that data and would help restore individual confidence in participating in the digital economy. In particular, by recognizing that the initially given consent might not have been fully informed, or that the consent process might have been engineered to encourage the giving of consent, EnCoRe sought to develop mechanisms that would allow individuals to change their consent preferences over time (for example, when they became more


informed about the implications of the choices they had previously made, or when their circumstances changed) rather than leaving them stuck with the initial consent choices made. Technological measures such as cryptographic 'sticky policies' helped ensure that these consent preferences remained associated with the data they referred to.

The underlying technical architecture for EnCoRe contains a number of core components [52] that support easy integration with bespoke or legacy systems. These components include a client-side Consent & Revocation Privacy Assistant that supports data subjects' privacy preferences, and a System Configuration Database that provides a centralized store of the schema, defined by administrators, describing how various types of privacy preferences are associated with personal data and how internal representations of types of personal data (e.g. names of data items within legacy databases, LDAP, etc.) map to the higher-level definitions used by EnCoRe. Another component provides Consent & Revocation Provisioning and is the contact point between the organization's web server/portal and the EnCoRe components; its purpose is to provide workflow-based coordination and provisioning capabilities. The Data Registry Manager is in charge of storing data subjects' privacy preferences, along with associations to the related personal data; this component also keeps track of the whereabouts of personal data, within and across organizations. The Privacy-Aware Access Control Policy Enforcement component is in charge of enforcing security and privacy access control policies on personal data, driven by data subjects' preferences. These policies take account of preferences such as the purposes for which data may be accessed, the entities the data may or may not be disclosed to, etc. It works in conjunction with the Obligation Management System. The system generates Audit Logs, as each EnCoRe component is instrumented with a configurable agent that provides logging data; the Audit Logs component provides key information to support compliance checking. Finally, the Trust Authority component is in charge of checking the fulfillment of sticky policies; the release of cryptographic keys, subject to the fulfillment of policy constraints; audit logging; and forensic analysis [52].
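To give a concrete flavor of this architecture, the sketch below models a revocable consent record and an enforcement gate in the spirit of the Data Registry Manager and the access control component described above. The class and field names are our own illustrative assumptions, not EnCoRe's actual schema or API.

```python
# Illustrative sketch of dynamic, revocable consent in the spirit of the
# EnCoRe components described above; names and structure are assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ConsentRecord:
    data_subject: str
    data_item: str                                  # e.g. "email_address"
    permitted_purposes: set = field(default_factory=set)
    history: list = field(default_factory=list)     # audit trail of changes

    def update(self, purposes: set) -> None:
        """Dynamic consent: the subject may revise or revoke at any time."""
        self.history.append((datetime.now(timezone.utc),
                             set(self.permitted_purposes)))
        self.permitted_purposes = set(purposes)

class PolicyEnforcement:
    """Gate every access to personal data on the *current* consent record."""
    def __init__(self) -> None:
        self.registry: dict = {}                    # (subject, item) -> record

    def register(self, record: ConsentRecord) -> None:
        self.registry[(record.data_subject, record.data_item)] = record

    def may_access(self, subject: str, item: str, purpose: str) -> bool:
        record = self.registry.get((subject, item))
        return record is not None and purpose in record.permitted_purposes

# A donor first allows research use, later revokes it; enforcement follows
# the latest preferences, and the history preserves an auditable trail.
pep = PolicyEnforcement()
record = ConsentRecord("donor42", "tissue_sample", {"approved_research"})
pep.register(record)
print(pep.may_access("donor42", "tissue_sample", "approved_research"))  # True
record.update(set())                                                    # revoke
print(pep.may_access("donor42", "tissue_sample", "approved_research"))  # False
```

A production system would additionally bind such preferences cryptographically to the data itself, along the lines of the sticky policies and Trust Authority component described above.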
The latest iteration of the dynamic consent model is currently being evaluated in the context of biobanking research in Oxford and Manchester. Biobanks are repositories of tissue samples and associated data that are available for researchers to use, subject to the consent of donors. The very nature of biobanks means that it is frequently impossible to specify all potential future uses of tissue samples, and biobanks often rely on broad consent and ethical oversight to determine acceptable uses of a tissue sample.

In this next stage of research, the study has been broadened to include consideration of interface issues. In particular, project partner HW Communications has developed a tablet-based implementation of the dynamic consent mechanism and integrated it with a broader education and awareness portal that offers videos and other information about biobanking to patients, as well as allowing them to manage their consent preferences dynamically; see Fig. 1. Figure 2 shows how the biobank donor can learn more about their specific interactions with the biobank and update their consent choices. Figure 3 illustrates potential consent choices open to donors. The range of available consent choices is determined by the biobank and allows the donor to exercise fine-grained control over the use of their sample and data; decisions that are normally made on behalf of patients by research governance committees.


Figure 1. Interface that helps explain what biobanking involves.

Figure 2. Information for the user.

Figure 3. Sample consent choices.


Such an approach will clearly have consequences for the ethical oversight of research, and so the views of researchers whose practices might be affected by dynamic forms of consent were sought [53]. The attitudes of potential donors to dynamic consent are also being studied.

All too often, questions of consent are left black-boxed or treated as secondary to questions of privacy. This chapter has presented an overview of current thinking that seeks to explore the limits of consent as it is currently understood and operationalized. It has then presented an alternative, dynamic model of consent. The model emerges from a multi-disciplinary research group and combines technical architecture considerations with real-time risk assessment and compliance monitoring, as well as insights into the changing ethical and governance issues surrounding consent.

References

Copyright © 2013. IOS Press, Incorporated. All rights reserved.

All URLs checked 24 June 2013.

[1] S. Davies, The Privacy Surgeon: Predictions for privacy 2013. LSE Enterprise (January) 2013. Archived at http://www.privacysurgeon.org/blog/wp-content/uploads/2013/01/PS-future-issues-full-report.pdf.
[2] EU Commission proposes a comprehensive reform of the data protection rules (25 January) 2012. Archived at http://ec.europa.eu/justice/newsroom/data-protection/news/120125_en.htm.
[3] OPSI Data Protection Act 1998. Archived at http://www.opsi.gov.uk/acts/acts1998/ukpga_19980029_en_1.
[4] K. Hoeyer, Informed consent: the making of a ubiquitous rule in medical practice. Organization 16(2), (2009), 267–288.
[5] Department of Health, Information: To share or not to share: Information Governance Review (26 April) 2013. Archived at https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/192572/2900774_InfoGovernance_accv2.pdf.
[6] Information Commissioner's Office, Guidance on the rules on use of cookies and similar technologies (May) 2012. Archived at http://www.ico.org.uk/for_organisations/privacy_and_electronic_communications/the_guide/~/media/documents/library/Privacy_and_electronic/Practical_application/cookies_guidance_v3.ashx.
[7] Facebook, Legal Terms of Service. Last modified 8 June 2012. Archived at http://www.facebook.com/legal/terms.
[8] D. Boyd, E. Hargittai, J. Schultz and J. Palfrey, Why parents help their children lie to Facebook about age: Unintended consequences of the 'Children's Online Privacy Protection Act'. First Monday 16(11), (2011).
[9] EU Directive 95/46/EC of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (24 October) 1995. Archived at http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:en:HTML.
[10] EU Data protection reform: Frequently asked questions (25 January) 2012. Archived at http://europa.eu/rapid/pressReleasesAction.do?reference=MEMO/12/41&format=HTML&aged=0&language=EN&guiLanguage=en.
[11] Madrid Resolution: Data protection authorities from over 50 countries approve the "Madrid Resolution" on international privacy standards. 31st International Conference of Data Protection and Privacy 2009. Archived at http://www.privacyconference2009.org/media/notas_prensa/common/pdfs/061109_estandares_internacionales_en.pdf.
[12] Article 29 Data Protection Working Party, Opinion 15/2011 on the definition of consent. WP 187 (13 July) 2011. Archived at http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2011/wp187_en.pdf.
[13] C. Raab and B. Goold, Protecting information privacy. Equality and Human Rights Commission 2011. Archived at http://www.equalityhumanrights.com/uploaded_files/research/rr69.pdf.
[14] Article 29 Data Protection Working Party, About us 2013. Archived at http://ec.europa.eu/justice/data-protection/article-29/index_en.htm.
[15] Article 29 Data Protection Working Party, Opinion 1/2008 on data protection issues related to search engines. WP 148 (4 April) 2008. Archived at http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2008/wp148_en.pdf.


[16] Article 29 Data Protection Working Party, Opinion 1/2010 on the concepts of "controller" and "processor". WP 169 (16 February) 2010. Archived at http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2010/wp169_en.pdf.
[17] Article 29 Data Protection Working Party, Opinion 16/2011 on EASA/IAB Best Practice Recommendation on Online Behavioural Advertising. WP 188 (8 December) 2011. Archived at http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2011/wp188_en.pdf.
[18] L. Curren and J. Kaye, Revoking consent: A 'blind spot' in data protection law? Computer Law & Security Review 26(3), (2010), 273–283.
[19] N. Lundblad and B. Masiello, Opt-in Dystopias. script-ed 7(1), (2010), 155–165.
[20] C. Pounder, Facebook passwords and employment: why data protection works and Facebook's promise to take legal action to protect privacy doesn't. Amberhawk (28 March) 2012. Archived at http://amberhawk.typepad.com/amberhawk/2012/03/facebook-passwords-and-employment-why-data-protection-works-and-facebooks-promise-to-take-legal-action-to-protect-privacy.html.
[21] Information Commissioner's Office, Privacy notices code of practice. ICO 2009. Archived at http://www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/privacy_notices_cop_final.pdf.
[22] S. McRobb and S. Rogerson, Are they really listening?: An investigation into published online privacy policies at the beginning of the third millennium. Information Technology and People 17(4), (2004), 442–461.
[23] I. Pollach, A typology of communicative strategies in online privacy policies: Ethics, power and informed consent. Journal of Business Ethics 62(3), (2005), 221–235.
[24] P.G. Kelley, L. Cesca, J. Bresee and L.F. Cranor, Standardizing privacy notices: An online study of the nutrition label approach. CyLab, Carnegie Mellon University (12 January) 2010. Archived at http://www.cylab.cmu.edu/files/pdfs/tech_reports/CMUCyLab09014.pdf.
[25] G.R. Milne, M.J. Culnan and H. Greene, A Longitudinal Assessment of Online Privacy Notice Readability. Journal of Public Policy and Marketing 25(2), (2006), 238–249.
[26] F. Belanger and R.E. Crossler, Privacy in the Digital Age: A Review of Information Privacy Research in Information Systems. MIS Quarterly 35(4), (2011), 1017–1041.
[27] H.J. Smith, T. Dinev and H. Xu, Information Privacy Research: An Interdisciplinary Review. MIS Quarterly 35(4), (2011), 989–1015.
[28] L.D. Introna, Privacy and the computer: Why we need privacy in the information society. Metaphilosophy 28(3), (1997), 259–275.
[29] S. Warren and L. Brandeis, The right to privacy. Harvard Law Review 4, (1890), 193–220.
[30] J. DeCew, In pursuit of privacy: Law, ethics and the rise of technology. Cornell University Press, Cornell, 1997.
[31] P. De Hert, Identity management of e-ID, privacy and security in Europe. A human rights view. Information Security Technical Report 13(2), (2008), 71–75.
[32] D.J. Willison, V. Steeves, C. Charles, L. Schwartz, J. Ranford, G. Agarwal, J. Cheng and L. Thabane, Consent for use of personal information for health research: Do people with potentially stigmatizing health conditions and the general public differ in their opinions? BMC Medical Ethics 10(1), (2009), 10.
[33] A.F. Westin, Privacy and Freedom. Atheneum Press, New York, 1967.
[34] T. Dinev and P. Hart, An extended privacy calculus model for e-commerce transactions. Information Systems Research 17(1), (2006), 61–80.
[35] J.-Y. Son and S.S. Kim, Internet users' information privacy-protective responses: A taxonomy and a nomological model. MIS Quarterly 32(3), (2008), 503–529.
[36] N.F. Awad and M.S. Krishnan, The personalization privacy paradox: An empirical evaluation of information transparency and the willingness to be profiled online for personalization. MIS Quarterly 30(1), (2006), 13–28.
[37] M.J. Culnan and P.K. Armstrong, Information privacy concerns, procedural fairness, and impersonal trust: An empirical investigation. Organization Science 10(1), (1999), 104–115.
[38] I.-H. Hann, K.-L. Hui, T.S. Lee and I.P.L. Png, Online information privacy: Measuring the cost-benefit trade-off. In Twenty-Third International Conference on Information Systems, 2002.
[39] B.J. Alge, G.A. Ballinger, S. Tangirala and J.L. Oakley, Information Privacy in Organizations: Empowering Creative and Extrarole Performance. Journal of Applied Psychology 91(1), (2006), 221–232.
[40] E.A. Whitley, Informational privacy, consent and the "control" of personal data. Information Security Technical Report 14(3), (2009), 154–159.
[41] K.A. Stewart and A.H. Segars, An empirical examination of the concern for information privacy instrument. Information Systems Research 13(1), (2002), 36–49.
[42] J. Bentham, Panopticon; Or, The Inspection-House: Containing The Idea of a New Principle of Construction applicable to any Sort of Establishment, in which Persons of any Description are to be kept under Inspection: And In Particular To Penitentiary-Houses, Prisons, Houses Of Industry, Work-


Houses, Poor Houses, Manufactories, Mad-Houses, Lazarettos, Hospitals, And Schools: With A Plan Of Management adapted to the principle: in a series of letters, written in the year 1787, from Crecheff in White Russia. T. Payne, London, 1791.
[43] M. Foucault, Discipline and Punish. Random House, New York, 1977.
[44] R.H. Thaler and C.R. Sunstein, Nudge: Improving decisions about health, wealth and happiness. Penguin, London, 2008.
[45] I. Kerr, J. Barrigar, J. Burkell and K. Black, Soft surveillance, hard consent: The law and psychology of engineering consent. In Lessons from the identity trail: Anonymity, privacy and identity in a networked society (I. Kerr, V. Steeves and C. Lucock, Eds.), pp. 5–22, Oxford University Press, Oxford, 2009.
[46] A. Acquisti and J. Grossklags, What Can Behavioral Economics Teach Us About Privacy? International Conference on Emerging Trends in Information and Communication Security (ETRICS), (June) 2006.
[47] C.A. Heimer, Inert facts and the illusion of knowledge: Strategic uses of ignorance in HIV clinics. Economy and Society 41(1), (2012), 17–41.
[48] C.L. Anderson and R. Agarwal, The digitization of healthcare: Boundary risks, emotion, and consumer willingness to disclose personal health information. Information Systems Research 22(3), (2011), 469–490.
[49] EnCoRe, About EnCoRe – Ensuring Consent and Revocation 2012. Archived at www.encore-project.info.
[50] VOME, Visualisation and other means of expression 2012. Archived at http://www.vome.org.uk/.
[51] PVNets, Privacy value networks 2012. Archived at http://www.pvnets.org/.
[52] EnCoRe, Technical architecture (18 November) 2011. Archived at http://www.encore-project.info/deliverables.html.
[53] E.A. Whitley, N. Kanellopoulou and J. Kaye, Consent and Research Governance in Biobanks: Evidence from Focus Groups with Medical Researchers. Public Health Genomics 15(5), (2012), 232–242.



Part IV


Other Sources of Data


Digital Enlightenment Yearbook 2013 M. Hildebrandt et al. (Eds.) IOS Press, 2013 © 2013 The authors. doi:10.3233/978-1-61499-295-0-179


Open Data Protection: Challenges, Perspectives, and Tools for the Reuse of PSI

Ugo PAGALLO a and Eleonora BASSI b
a University of Turin
b University of Trento

Abstract. The open data debate and PSI (public sector information) legislation have so far focused on access to public, rather than personal, data for reuse purposes. As a number of cases discussed before European Data Protection Authorities have shown over the past years, however, the conditions of access to PSI for reuse purposes may raise issues of data protection. Consider commercial and land registers, case law databases, public deliberations, vehicle registrations, socioeconomic data, and more, in light of such techniques as big data analytics. Correspondingly, it is likely that the reuse of PSI will increasingly engage the principles and criteria for making personal data processing legitimate. The aim of this paper is to examine today's legal framework and the technical means that may enable the lawful reuse of personal data collected and held by public sector bodies. Whereas openness and data protection are often conceived as opposed in a 'zero sum game', the paper explores whether a 'win-win' scenario is feasible.


Keywords. Anonymisation, data protection, privacy by design, privacy impact assessment, public sector information, transparency, soft law

1. Introduction

The open data debate and the PSI (public sector information) legislation have so far focused on access to public, rather than personal, data for reuse purposes. Attention has been drawn, most of the time, to the factors on which such openness depends, i.e. the availability of information, the conditions of its accessibility, and so forth, rather than to the legal principles and norms that constrain the flow of information so as to protect individual privacy, anonymity, personal data, etc. Over the past years, however, a number of cases discussed before some European Data Protection Authorities have shown that the conditions for the reuse of PSI may raise issues of data protection. Suffice it to mention the Agencia in Madrid and its recommendation 2/2008; the Agencia in Barcelona and its recommendation 1/2008; several decisions by the Italian Garante; etc. [1,2]. This case law pinpoints the legal implications of disclosing information as a major challenge for information providers, namely public sector bodies (PSBs).

On the one hand, informational openness goes hand in hand with the principles that make transparency 'good': in the wording of Turilli and Floridi [3, p. 106], such disclosed information "has the potential to show whether the providers are not only abiding by the legal requirements, but also effectively practising the ethical principles to which they are allegedly committed." For example, consider how disclosed information may


contain details that publicly show whether the activities of PSBs are consistent with such principles as equality, fairness, or environmental care. On the other hand, lawmakers and PSBs have to evaluate the potential illegal uses of disclosed information, e.g. with respect to copyright protection, freedom of speech, anonymity, data protection, and so on. By differentiating between data and information, i.e. that which is produced through the elaboration of data, three further levels of analysis should be mentioned:


(1) access to public information as provided by the different national Freedom of Information (FOI) Acts;
(2) access to personal data as determined by national laws, e.g. EU Member States transposing Article 2 of the data protection directive; and
(3) access to PSI for reuse purposes as established by the national laws of, say, Member States transposing the EU PSI directive 2013/37/EU of the European Parliament and of the Council (the "PSI Amendment") amending D-2003/98/EC on the reuse of public sector information (the "PSI Directive").

Here, the focus is on how levels (2) and (3) are related to each other. More particularly, PSBs have all kinds of data concerning citizens, businesses, vehicles, health, land properties, and the like. In addition, such public organisations collect the information in computerised databases that, most of the time, are accessible through public administration websites. Some have drawn attention to the principles that are endorsed by disclosing information, such as fairness, impartiality, and respect, or to the principles that need to be supported by information, e.g. issues of safety, welfare, or informed consent. For instance, according to Aichholzer and Burkert [4, p. xv], "most of this digitalized information is used to ensure the public tasks or missions of public interest of the public bodies and to provide services to citizens and businesses (eGovernment). Public sector information is also essential for citizens exerting their civic rights and to enable their democratic participation, as well as for governments to legitimise their political choices and actions. At the same time, PSI can furnish raw material and new resources for the creation of value-added information products and services to businesses."

This opinion on the potentialities of PSI as the single largest source of information in Europe, which could be fruitfully reused or integrated into new products and services, such as car navigation systems, weather forecasts, financial and insurance services, etc., has been corroborated time and again. As the European Commission is keen to point out on the EU's "information society" thematic portal, "according to a survey on existing findings on the economic impact of public sector information conducted by the European Commission in 2011 (Vickery study) the overall direct and indirect economic gains are estimated at €140bn throughout the EU. Increase in the reuse of PSI generates new businesses and jobs and provides consumers with more choice and more value for money."1

However, this information should be considered as "personal data" in several cases that concern commercial and land registers, case law databases, public deliberations and provisions, vehicle registrations, socioeconomic data, and more. Consider the cases mentioned by the European Data Protection Supervisor (EDPS) in the Opinion on the "open-data package" of the European Commission, that is, "information about directors and other representatives of companies registered in a trade register, information about salaries of civil servants or expenses of Members of Parliament, and the health records of patients held by a national health service" [6, § 9].

1 See the corresponding webpage on "PSI – Raw Data for New Services and Products" at http://ec.europa.eu/information_society/policy/psi/index_en.htm (last accessed on 10 May 2013). A good introduction in Ricolfi and Sappa [5].


From this latter stance, the focus is neither on principles that necessitate information, nor on principles endorsed by disclosing information. Rather, we are brought back to the field of the legal rules that constrain the flow of information, namely, cases where it is likely that access to public data for reuse purposes will engage principles and provisions of the data protection framework, e.g. norms on restricted access and the protection of individual opaqueness [7]. By taking into account key aspects of this framework, a number of issues follow as a result [8]. Are the principles and provisions of data protection compatible with the reuse of PSI? Alternatively, is it a matter of legal balancing? Would a 'win-win' scenario be feasible thanks to the technical means by which personal data, collected and held by public bodies, can be anonymised?

In order to offer a hopefully comprehensive view of these issues, the paper is presented in four parts. Next, in Section 2, attention is drawn to the current legal framework on both data protection and PSI, so as to stress why such provisions entail the risk of a divorce. Then, in Section 3, the focus is on how the marriage could still be possible in accordance with the suggestions of the European data protection authorities, i.e. both the EDPS and the Article 29 Working Party (WP29). Specifically, Section 4 examines the technical tools through which the protection of personal data can be strengthened within and by public sector bodies. Although "privacy by design" approaches, privacy impact assessments, techniques of anonymisation and soft law tools do not guarantee that all the potentialities of PSI reuse will be exploited, the conclusions of this paper draw on some of these mechanisms to propose at least the basis for a legitimate wedlock.

Copyright © 2013. IOS Press, Incorporated. All rights reserved.

2. PSI and Data Protection: A Dodgy Divorce?

Conditions and clauses for the reuse of PSI in Europe were initially established by D-2003/98/EC. This set of provisions allowed and encouraged the reuse of public sector information, which includes personal data, for commercial or non-commercial purposes. The data could be reused compatibly with the public tasks and reasons for which the personal data was collected and, moreover, in a way that should not affect the level of protection of individuals under D-95/46/EC. In fact, recital 21 of the PSI directive affirms that the directive should be "implemented and applied in full compliance with the principles relating to the protection of personal data." Likewise, Article 1(4) of D-2003/98/EC "leaves intact and in no way affects the level of protection of individuals with regard to the processing of personal data… and in particular does not alter the obligations and rights set out" in D-95/46/EC. Moreover, Article 2(5) of D-2003/98/EC clarifies that "personal data means data as defined in Article 2(a)" of the data protection directive.

This legal framework has not changed with the new PSI Directive 2013/37/EU. According to the WP29 in its Opinion on "open data and PSI reuse" of June 2013 (wp207), the analysis of the interaction between the reuse of PSI and the protection of personal data becomes far "more complex… in light of new legislative and technological developments", and still the conclusion of the Opinion adopted on 12 December 2003, i.e. wp83, remains the same. The new PSI directive "does not create an obligation to disclose personal information" (n. 4.1 of wp207), so that the reuse of PSI is not "automatic… when the right to the protection of personal data is at stake" (ibid.).

In addition to the rights and obligations set out in D-95/46/EC, further rights and interests should be taken into account in the field of PSI, such as freedom of information and the right to knowledge, freedom of expression, access to public documents,


transparency, participatory democracy, and the functioning of the internal market with the free circulation of services and information. Although striking a balance may not be simple, the reuse of PSI is strongly encouraged by the new directive as a means to empower and strengthen such rights and interests. The aim is that both open data and data protection safeguards foster a fair environment for the processing and spread of the data. Yet, the PSI directive includes only a few references to determine how the balance between the rights and interests mentioned above could be struck.

In past years, this lack of normative clarity has induced an inconsistent praxis among PSBs, a poor harmonization of the PSI directive in Europe, and distortions of the information market, e.g. between public and private sector stakeholders, up to the risk of a 'dodgy divorce' between data protection law and the reuse of PSI. Also, the heterogeneity of national PSI rules provokes legal uncertainty for possible trans-border reuses, as well as striking differences in the protection of personal data as a condition for the reuse of PSI. For example, some Member States (MS), e.g. Article 4 of the Belgian Law of 7 March 2007, either impose a complete "anonymisation" of personal data or request a "formal consent" of the data subjects as a condition for the reuse of PSI. Other MS, like France and Slovenia, adopt a mix of such solutions, often on the basis of a national law which specifies the clauses for the reuse of PSI owned by PSBs. Whereas further scenarios can be considered, such as the authorisation of the national data protection authorities as a condition for the reuse of PSI, the outcome of the European Commission's proposal of 12 December 2011 aiming to amend the PSI directive, i.e. D-2013/37/EU, does not avert the risk that the protection of personal data may forestall the reuse of PSI. As stated by § 3.2 of the proposal, "the object of the revision is not to regulate the processing of personal data by public sector bodies [i.e., PSBs] or the status of intellectual property rights, which are not affected beyond what is already the case under the existing rules of the Directive" (COM(2011)877 final 2011/0430 (COD)).

Consequently, it is not necessary to insist on the drawbacks of privacy regulations (e.g. [9]), EU data protection frameworks [10–12], or the European institutions as such [13] to admit that a number of MS and PSBs alike may still invoke current data protection safeguards as a preposterous way to curb manifold legitimate applications of PSI reuse. Rather than dealing with the complex set of rights and obligations put forward by the data protection directive, some PSBs may find it easier to claim that no PSI reuse is possible due to alleged privacy reasons. To make things even worse, new techniques, such as big data analytics [14], increase the risks and challenges for personal data protection and for PSBs making personal data available for reuse. Despite the potential for innovation that relies on the ability of big data analytics to support the collection and storage of huge amounts of data so as to understand and exploit a whole array of informational resources, this scenario makes it even harder to determine what data should be made available for reuse.

For example, in Opinion 03/2013, adopted on 2 April 2013 (wp203), the WP29 insists on "the sheer scale of data collection, tracking and profiling" as a new source of threat for the protection of personal data, concerning, say, the security of the data and the transparency of the processing, as well as the "increased possibilities of government surveillance" (wp203: Annex 2 on Big Data and Open Data, pp. 45–50). On this basis, should the conclusion be that the rights, obligations, and principles of the data protection directive legitimately hamper the great economic potential of PSI reuse? Do big data scenarios necessarily confirm this conclusion or, conversely, might some feasible approaches prevent such a stalemate?


3. PSI vs. Data Protection: A Troublesome Marriage?

As stressed in the previous section, the conditions for the legitimate reuse of PSI shall conform to the rights, obligations, and principles of the data protection directive. On the one hand, both the PSBs and the reusers, conceived as the first and the new data controllers respectively, have to respect the principles of finality, proportionality, data quality, and necessity of the processing, pursuant to Article 6 of D-95/46/EC. Moreover, they should abide by the criteria for making such processing legitimate in accordance with Article 7, i.e. on the basis of the data subject's consent, or in cases where the processing is necessary for the performance of a contract, for compliance with a legal obligation of the controller, for the vital interests of the data subject, for legitimate interests pursued by the controller, and so forth. On the other hand, besides adopting technical and organisational measures for the confidentiality and security of data processing, both PSBs and reusers shall respect the data subjects' rights to information, access, rectification, erasure and blocking, and their right to object. Hence, PSBs should provide clear privacy guidelines on their websites, just as reusers have to provide additional information to the data subjects in accordance with Article 11 of D-95/46/EC, the provision of such information being unnecessary only when it "proves impossible or would involve a disproportionate effort or if recording or disclosure is expressly laid down by law." In any event, it is an open question whether reusers should specify the purpose of their own processing of personal data in order to allow the PSBs to ascertain cases where the reuse is incompatible with the initial purpose of the gathering of personal data. We will return to this below.

In addition to the provisions and principles of the data protection directive, however, the focus of the analysis should be broadened by taking into account the opinions of the European data protection authorities. In particular, the WP29 has delivered some important opinions on the reuse of PSI and the protection of personal data, that is, wp20, wp83, and the aforementioned wp203 and wp207 (see the previous section). For example, in wp83 from December 2003, the WP29 insists on the role that techniques of anonymisation can play in this sector: "with a view to avoiding the disclosure of personal data in the first place, such should be excluded where the purpose of the reuse can be fulfilled with the disclosure of personal data rendered anonymous in such a way that the data subject is no longer identifiable." Ten years later, in Annex 2 of wp203 on purpose limitation, examining the challenges of big data analytics to the rights, obligations, and principles of the European data protection legal framework, the WP29 has confirmed this viewpoint: anonymisation techniques represent the "most definitive solution to minimise the risks of inadvertent disclosure." This is, after all, the approach endorsed by the European directive on the framework for the deployment of Intelligent Transport Systems in the field of road transport and for interfaces with other modes of transport (D-2010/40/EU or ITS directive).2 Recital 12 refers to the need to safeguard privacy rights as regulated by D-95/46/EC and D-2002/58/EC, respecting, inter alia, the principles of purpose limitation and data minimisation.
Hence, the ITS Directive requires the adoption of anonymisation techniques for personal data from ITS databases (Recital 13 and Article 10, specifically referring to rules on privacy, security and reuse of information).

2 See the corresponding website at http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:207:0001:0013:EN:PDF (last accessed on 10 May 2013).
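To give a concrete flavor of what "rendering anonymous" can involve, the sketch below implements a minimal k-anonymity check, one well-known anonymisation criterion. It is offered purely as an illustration: neither the WP29 opinions nor the ITS directive prescribe this particular technique, and the records shown are invented.

```python
# Minimal k-anonymity check for a tabular PSI release. One illustrative
# anonymisation criterion among many; the records below are invented.
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values (e.g. postcode
    and birth year) is shared by at least k records, so that no individual
    stands out within the released table."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

records = [
    {"postcode": "NW1", "birth_year": 1980, "diagnosis": "flu"},
    {"postcode": "NW1", "birth_year": 1980, "diagnosis": "asthma"},
    {"postcode": "SE5", "birth_year": 1975, "diagnosis": "flu"},
]
# The SE5 record is unique on (postcode, birth_year), so the release fails
# 2-anonymity; generalising the postcode or suppressing the row would help.
print(is_k_anonymous(records, ["postcode", "birth_year"], k=2))  # False
```

Even a check like this only addresses re-identification via the chosen quasi-identifiers; linkage with background knowledge is one reason why anonymisation is demanding rather than trivial in practice.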


On the other hand, since its Opinion 7/2003, i.e. wp83, the WP29 deems the principle of finality as crucial: although “the [data protection] Directive does not prohibit the reuse for different, but for incompatible purposes… there are several elements which may have to be taken into account in the evaluation of whether further processing would be compatible with the original purpose.” On this basis, the WP29 suggests a ‘case by case’ approach in connection with the enforcement of the finality principle, the observance of data subjects’ rights and the aim for which data are intended to be reused, e.g. for commercial uses, direct marketing purposes, and so forth. In the wording of wp83, “the question of whether the data protection directive allows the reuse of public sector information that contains personal data requires a careful and case-bycase assessment in order to strike the balance between the right to privacy and the right to public access” (op. cit., at 11). This latter approach, however, raises further problems. In the Policy Recommendation of April 2012, the LAPSI group properly claim “that such case-by-case assessment could easily lead to heterogeneity of practices and solutions either between public bodies (even for the same personal data) or between different levels of public sector bodies [i.e., PSBs], and therefore between Member States. That would create greater legal uncertainty and an additional obstacle to the reuse of personal data gathered by the public sector” (op. cit., at 18). Aside from further remarks on how different national restrictions to the reuse of PSI can affect the decisions of PSBs, what the LAPSI group stress seems appropriate, namely, that some “general assessments” should be made by the WP29. For instance, the “case by case” approach can be improved by taking into account the different categories and nature of the personal data with which we are dealing, much as the multiple types of purposes of the reuse processing. Yet, there is a further way to approach the reuse of PSI in connection with the protection of personal data, that is, what the EDPS presents as a “proactive approach” in the paper on Public Access to Documents Containing Personal Data after the Bavarian Lager Ruling from 24 March 2011. Here, “a proactive approach means that institutions assess and subsequently make clear to data subjects – before or at least at the moment they collect their data – the extent to which the processing of such data includes or might include its public disclosure.” This method reappears in the EDPS Opinion on the “open-data package” from 18 April 2012, in connection with “practical tools” such as data protection impact assessments and “new concepts” like privacy by design and accountability. In the phrasing of Peter Hustinx, “it is crucial that public sector bodies [i.e. PSBs] take a proactive approach when making personal data available for reuse. A proactive approach would make it possible to make the data publicly available with the explicit purpose of reuse, subject to specific conditions and safeguards in compliance with data protection rules” (op. cit., § 39). Admittedly, this approach is particularly fruitful for a new generation of cases concerning the collection and reuse of personal data. Consider the scheme we adopted at the University of Turin, Law School, to enable the reuse of the personal data of students by the job placement services of the faculty. 
The Law School privacy form gives students information on the personal data that might be accessible and disclosed to third parties, e.g. law firms. Moreover, a clear reference is made to the clauses and the finality principle that govern the possible reuse of such processed data and set the proper conditions for an informed consent (see the Law School website). Nonetheless, such a proactive approach does not fit the reuse of all the personal data already gathered and held by PSBs. In this latter case, we should in fact revert to the aforementioned techniques of anonymisation of personal data and such tools as data protection impact assessments, which have been stressed by both the WP29 and the EDPS in their opinions and recommendations. Since these approaches may help us prevent some of the drawbacks of today's legal framework, let us consider each of these ways of striking a balance between the reuse of PSI and the protection of personal data in turn.
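By way of illustration only, such a proactive notice can be thought of as a small piece of metadata captured at collection time and checked again at the moment of reuse. The record layout and field names in the following sketch are hypothetical inventions of ours, not the actual Turin form:

```python
from dataclasses import dataclass

@dataclass
class CollectionNotice:
    """Hypothetical record of what a data subject was told at collection time."""
    subject_id: str
    declared_reuse_purposes: set[str]  # purposes disclosed up front, e.g. {"job placement"}
    consent_given: bool

def reuse_is_covered(notice: CollectionNotice, requested_purpose: str) -> bool:
    """A reuse request is covered only if the purpose was declared at collection
    and the data subject consented: the core of the 'proactive approach'."""
    return notice.consent_given and requested_purpose in notice.declared_reuse_purposes

# Example: a student record collected with job placement declared as a reuse purpose.
notice = CollectionNotice("student-42", {"job placement"}, consent_given=True)
print(reuse_is_covered(notice, "job placement"))     # True
print(reuse_is_covered(notice, "direct marketing"))  # False: never declared at collection
```

The point of the sketch is simply that, under a proactive approach, the admissibility of a reuse can be decided mechanically from what was disclosed to the data subject up front, rather than renegotiated after the fact.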

4. Tools for a Legitimate Wedlock

In recent years, privacy authorities and data protection commissioners have increasingly referred to the principle of "privacy by design," so as to protect the rights of data subjects by embedding legal safeguards into technology [1]. Think of the minimisation and quality of the data, its controllability, transparency, and confidentiality, which design mechanisms can ensure in three different ways. First, design may encourage people to change their conduct, as with user-friendly interfaces that increase or reduce the prominence of a default setting so as to allow individuals to configure and use their software as they deem appropriate. This approach to design mechanisms arguably fits the role that personal choices play in the field of data protection, where individuals modulate different levels of access and control over information in digital environments depending on the context and its circumstances [15]. Second, design mechanisms may aim to decrease the impact of harm-generating behaviour rather than encouraging individuals to change their conduct. A typical instance is given by security measures and the processing of personal data in accordance with Article 17 of D-95/46/EC. Here, once developers of information systems have determined whether the processing of personal data is legitimate or what kind of data should be conceived of as personal, the focus is on the functional efficiency, robustness, reliability, and usability of the design project. On this basis, through the use of prototypes, internal checks among the design team, user tests in controlled environments, surveys, interviews, and more, "verifying the inclusion of values is likely to draw on strategies and methods not unlike those applied to other design criteria like functional efficiency and usability" [16]. Third, design mechanisms may aim to prevent harm-generating behaviour from occurring through the use of self-enforcement technologies. According to Ontario's Privacy Commissioner, for instance, personal data should be automatically protected, in every IT system as its default setting, in such a way that a cradle-to-grave, start-to-finish, or end-to-end lifecycle protection would ensure that privacy safeguards are at work even before a single bit of information has been collected [17]. Although this approach to design mechanisms and the principle of privacy by design are popular in the field of the reuse of PSI, there are reasonable grounds for concern [18]. Think of specific design choices that may result in conflicts between values and, vice versa, conflicts between values that may affect the features of design. Even though legal systems help us overcome a number of conflicts between values, it is likely that the use of self-enforcement technologies in a field such as data protection would make conflicts between values even worse due to specific design choices, e.g. the opt-in vs. opt-out debate over the default setting for users of IT systems. In addition, there is the technical difficulty of applying to a machine concepts traditionally employed by lawyers, through the formalisation of norms, rights, or duties: as a matter of fact, informational protection safeguards involve highly context-dependent notions, such as personal data and data controllers,
which raise a number of relevant problems when reducing the informational complexity of a legal system where concepts and relations are subject to evolution. In particular, attention should be drawn to the difficulty of achieving such total control over the flow of information through the use of self-enforcing technologies. Doubts cast by "a rich body of scholarship concerning the theory and practice of 'traditional' rule-based regulation bear witness to the impossibility of designing regulatory standards in the form of legal rules that will hit their target with perfect accuracy" [19]. Such doubts and concerns reverberate in today's debate on the reuse of PSI that includes personal data. Consider the use of tools of anonymisation supported, as a default rule, by the WP29 in Opinions 7/2003 (wp83) and 3/2013 (wp203), as well as by the EDPS (§ 45 of the Opinion on the "open-data package"). Theoretically, making personal data anonymous in such a manner that data subjects are no longer identifiable is a proper way to avoid the obstacles that the data protection directive raises for the reuse of PSI, because anonymised data should no longer be deemed personal data. Still, some question the privacy-protecting power of anonymisation, insofar as computer scientists "have demonstrated they can often 're-identify' or 'de-anonymize' individuals hidden in anonymised data with astonishing ease" [20]. Others reckon that such stances are exaggerated, since anonymisation remains a strong tool for protecting privacy and the data subjects' rights [21]. In between these extremes, it seems fruitful to follow the aforementioned Opinion on the open-data package, where the EDPS suggests that we should ascertain "whether data protection law permits the data to be made available for reuse after only partial anonymisation and if so, what level of anonymisation is required" (op. cit., § 46). This perspective brings us back to the need to distinguish the categories and nature of the personal data, as well as the multiple types of purposes of the reuse of PSI, with which the WP29's case-by-case approach, as seen in the previous section, should be improved. An effective way forward is offered by work on privacy impact assessment, which aims to evaluate technologies that process personal data even before IT systems are introduced [22]. In order to assess such impact and determine whether, or under what conditions and subject to what specific safeguards, personal data can be made available, experts will have to typify different purposes of PSI reuse, multiple kinds of personal data and alternative security measures, down to different levels of anonymisation. According to Opinion 4/2007 of the WP29 on the concept of personal data, for example, we may envisage cases where pseudo-anonymisation, rather than strict anonymisation, will ensure the minimisation of the risk of re-identification and misuse of personal data: "Using a pseudonym means that it is possible to backtrack to the individual, so that the individual's identity can be discovered, but then only under predefined circumstances. In that case, although data protection rules apply, the risks at stake for the individuals with regard to the processing of such indirectly identifiable information will most often be low, so that the application of these rules will justifiably be more flexible than if information on directly identifiable individuals were processed" (wp136; § 3, at 18).
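Purely by way of illustration, the following sketch shows one common pseudonymisation technique, keyed hashing, which we have chosen ourselves; wp136 does not prescribe any particular mechanism. Possession of the key, held e.g. by the data controller or a trusted third party, is the "predefined circumstance" under which backtracking to the individual remains possible:

```python
import hashlib
import hmac

# The secret key would be held by the data controller or a trusted third party;
# only whoever holds it can link a pseudonym back to an identifier.
SECRET_KEY = b"held-by-trusted-third-party"  # hypothetical placeholder

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).
    Without the key, the pseudonym cannot feasibly be reversed or recomputed."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

def backtrack(pseudonym: str, candidates: list[str]) -> str | None:
    """Re-identification 'under predefined circumstances': whoever holds the key
    can recompute pseudonyms for known identifiers and match them."""
    for identifier in candidates:
        if pseudonymise(identifier) == pseudonym:
            return identifier
    return None

record_owner = pseudonymise("jane.doe@example.org")
print(backtrack(record_owner, ["john.roe@example.org", "jane.doe@example.org"]))
```

The "more flexible" regime that wp136 describes presupposes exactly this kind of controlled backtracking: re-identification is feasible for the key holder, and for nobody else.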
This "more flexible" approach has been confirmed by Opinion 03/2013, i.e. wp203, according to which "anonymisation should be done, prior to making the data available for reuse, by the data controller or by a trusted third party," and which recommends "to conduct an effective data protection impact assessment to decide what data may be made available for reuse, and at what level of anonymisation and aggregation." On the basis of this assessment, a legal framework follows that, in the phrasing of Peter Hustinx, appears "comparable to the data protection impact assessment as foreseen
in the proposal for a general data protection regulation. The EDPS suggests including the main elements of the assessment in the text of the proposed Directive" (§ 40 of the Opinion on the open-data package). Additionally, in order to simplify the management of data protection rules and further requests for access and reuse, PSBs and Member States (MS) could appoint a Data Protection Officer within each unit, or service, that manages personal data, as is already the practice of the EU institutions pursuant to both Article 24 of Reg. 45/2001 and Article 18(2) of D-95/46/EC. Furthermore, we can exploit a number of soft law tools, namely "softer forms of legalized governance when those forms offer superior institutional solutions" [23]. Consider the adoption of codes of conduct for reusers, guidelines for both PSBs and reusers, as well as the standardisation of best practices, which can strengthen the harmonisation of the PSI directive in Europe and prevent the distortion of the information market. However, this win-win scenario has its limits, and crucial political decisions must be taken. Consider both Opinion 03/2013 of the WP29 (op. cit., Annex 2) and the Opinion of the EDPS on the open-data package (op. cit., §§ 40–41), where the European data protection authorities affirm that PSBs, rather than PSI reusers, should carry out the assessment concerning the impact of such reuse on the protection of personal data and, moreover, that such assessment must ensure that the reuse is available only for a set of compatible purposes. Whereas this approach of course makes sense in the case of virtuous PSBs that will cover the costs of the assessment so as to let both start-ups and big companies reuse PSI on the basis of a fair level of data protection, there is nonetheless the risk that, in the case of less virtuous PSBs and MS, such costs, and the need to specify the set of compatible (rather than the incompatible) purposes of the reuse, may add to the inertia of several public organisations due to alleged data protection issues. In light of these and further dilemmas, the time is ripe for the conclusions of this paper.
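Before turning to those conclusions, it may help to picture the typification called for by the WP29 and the EDPS (purposes of reuse, categories of data, levels of anonymisation) as a simple decision table. The categories and outcomes below are our own illustrative inventions and carry no legal authority:

```python
# Hypothetical decision table for a data protection impact assessment:
# (data category, reuse purpose) -> required safeguard before release.
# All categories, purposes and outcomes are invented for illustration.
DPIA_MATRIX = {
    ("contact details", "research"):         "pseudonymise",
    ("contact details", "direct marketing"): "withhold",
    ("health records", "research"):          "full anonymisation and aggregation",
    ("health records", "direct marketing"):  "withhold",
    ("professional CVs", "job placement"):   "release subject to consent clause",
}

def assess(category: str, purpose: str) -> str:
    # Default to the most protective outcome when a combination is untyped,
    # mirroring the case-by-case caution urged by the WP29.
    return DPIA_MATRIX.get((category, purpose), "refer for case-by-case assessment")

print(assess("health records", "research"))
print(assess("vehicle register", "insurance pricing"))
```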

5. Conclusions

Time and again, scholars have presented both privacy and data protection as if they were opposed to openness in a zero-sum game. Think of some approaches to privacy as a condition of 'solitude', 'exclusion', 'secrecy', and so forth [24–26]. Contrary to these outlooks, the starting point of this paper was to insist on the role that personal choices and privacy policies play in the field of data protection, so that a number of issues revolve around the different levels of access and control over information that both individuals and PSBs can determine in digital environments. In the case of the reuse of PSI, the compatibility between openness and personal data safeguards has been illustrated with the "proactive approach" recommended by the EDPS, as seen above in Section 3. Still, this approach does not fit the reuse of personal data that PSBs have collected over past decades: in this latter case, Section 4 has shown how the use of tools of anonymisation, pseudo-anonymisation, soft law and work on data protection impact assessment can make personal data available for the reuse of PSI. By distinguishing multiple purposes of such reuse, alternative security measures, levels of anonymisation and different types of personal data, e.g. the curricula of Turin students vs. the health records of patients, the information held by PSBs can be opened, and thus reused, while abiding by the level of protection established by the EU directives on data protection in a 'win-win' scenario.

Admittedly, the devil is in the detail, and some parameters, such as the different types of personal data, the purposes of PSI reuse, the levels of anonymisation, etc., will probably spark controversy. Leaving aside the technicalities of the problem, e.g. the formalisation of norms, concepts, and their relations through the model of legal ontologies for the reuse of PSI and data protection safeguards, we should not miss a key point. Whilst current proposals of amendment provide for a number of exceptions to the principle that all PSI has to be reusable for both commercial and non-commercial purposes, the PSI directive should not be interpreted in a way that may curb manifold legitimate reuses of such information in the name of alleged data protection issues. Contrary to the EDPS opinion on the open-data package, for example, it seems unnecessary that PSBs should specify the set of compatible (rather than incompatible) purposes of the reuse, since it is hard to predict all the ways in which individuals might reuse that information. Likewise, contrary to the WP29's Opinion 03/2013, it seems unnecessary to determine, once and for all, that the full set of operations and costs for making PSI available for reuse via the interoperability of public databases, anonymisation of personal data, data protection impact assessments, and the like, should fall entirely on the PSBs. Under certain circumstances, it makes sense that the costs of, say, a privacy impact assessment should fall on the private companies aiming to reuse PSI for commercial purposes. At the end of the day, we should avert the risk that MS and PSBs keep closed what could instead be opened. A more flexible interpretation of the rules, together with tools for a legitimate wedlock between PSI reuses and personal data safeguards, would strengthen the free circulation of information, knowledge and services that fosters economic growth. In the light of principles and criteria for making data processing legitimate, the PSI reuse of personal data does not entail a 'zero-sum' game after all.

References

[1] U. Pagallo and E. Bassi (2011). The future of EU working parties' "The future of privacy" and the principle of privacy by design. In: An Information Law for the 21st Century, edited by M. Bottis, pp. 286–309. Athens: Nomiki Bibliothiki Group.
[2] LAPSI (2012). Policy recommendation on privacy (WG 02), available at http://www.lapsi-project.eu/wiki/index.php/Policy_recommendation_on_privacy.
[3] M. Turilli and L. Floridi (2009). The ethics of information transparency. Ethics and Information Technology, 11, 105–112.
[4] G. Aichholzer and H. Burkert (2004). Public Sector Information in the Digital Age. Cheltenham: Elgar.
[5] M. Ricolfi and C. Sappa (2013). Extracting Value from Public Sector Information: Legal Framework and Regional Policies. Quaderni del Dipartimento di Giurisprudenza dell'Università di Torino. Napoli: ESI.
[6] EDPS (2012). Opinion on the open-data package (April 18, 2012), available at http://www.edps.europa.eu/EDPSWEB/webdav/site/mySite/shared/Documents/Consultation/Opinions/2012/12-04-18_Open_data_EN.pdf.
[7] H. Arendt (1958). The Human Condition. Chicago: University of Chicago Press.
[8] C.D. Raab (2004). Privacy issues as limits to access. In: Public Sector Information in the Digital Age, edited by G. Aichholzer and H. Burkert, pp. 23–46. Cheltenham: Elgar.
[9] R.A. Posner (1983). The Economics of Justice. Cambridge, Mass.: Harvard University Press.
[10] Ch. Kuner (2003). European Data Privacy Law and Online Business. Oxford-London: Oxford University Press.
[11] R. Volkman (2003). Privacy as life, liberty, property. Ethics and Information Technology, 5(4), 199–210.
[12] Ch. Reed (2012). Making Laws for Cyberspace. Oxford: Oxford University Press.
[13] H.M. Enzensberger (2011). Brussels, the Gentle Monster. London: Seagull.
[14] V. Mayer-Schönberger and K. Cukier (2013). Big Data: A Revolution That Will Transform How We Live, Work and Think. Boston: John Murray.
[15] H. Nissenbaum (2004). Privacy as contextual integrity. Washington Law Review, 79(1), 119–158.
[16] M. Flanagan, D.C. Howe and H. Nissenbaum (2008). Embodying values in technology: theory and practice. In: Information Technology and Moral Philosophy, edited by J. van den Hoven and J. Weckert, pp. 322–353. New York: Cambridge University Press.
[17] A. Cavoukian (2010). Privacy by design: the definitive workshop. Identity in the Information Society, 3(2), 247–251.
[18] U. Pagallo (2012). On the principle of privacy by design and its limits: technology, ethics, and the rule of law. In: European Data Protection: In Good Health?, edited by S. Gutwirth, R. Leenes, P. De Hert and Y. Poullet, pp. 331–346. Dordrecht: Springer.
[19] K. Yeung (2007). Towards an understanding of regulation by design. In: Regulating Technologies: Legal Futures, Regulatory Frames and Technological Fixes, edited by R. Brownsword and K. Yeung, pp. 79–108. London: Hart.
[20] P. Ohm (2010). Broken promises of privacy: responding to the surprising failure of anonymisation. UCLA Law Review, 57, 1701, available at SSRN: http://ssrn.com/abstract=1450006.
[21] A. Cavoukian and K. El Emam (2011). Dispelling the myths surrounding de-identification, available at http://www.ipc.on.ca/images/Resources/anonymisation.pdf.
[22] D. Wright and P. de Hert (eds.) (2012). Privacy Impact Assessment. Dordrecht: Springer.
[23] K. Abbott and D. Snidal (2000). Hard law and soft law in international governance. International Organization, 54(3), 421–456.
[24] A.F. Westin (1967). Privacy and Freedom. New York: Atheneum.
[25] R. Gavison (1980). Privacy and the limits of the law. Yale Law Journal, 89, 421–471.
[26] A. Allen (1988). Uneasy Access: Privacy for Women in a Free Society. Totowa, N.J.: Rowman and Littlefield.

Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-190

Open Data: A New Battle in an Old War Between Access and Privacy?

Katleen JANSSEN 1 and Sara HUGELIER
ICRI, KU Leuven, iMinds

Abstract. This paper addresses the relationship between freedom of information and privacy. It looks into the privacy exemptions in freedom of information legislation and the way they are applied in national redress mechanisms. It argues that the balancing exercise performed between FOI and privacy can provide important insights into the discussion on the possible conflicts between open data on the one hand and privacy and data protection on the other hand.

Keywords. Freedom of information, privacy, access, data protection

Introduction

In today's information society, the citizen has developed a double-edged relationship with information. On the one hand, she has access to an unprecedented amount of information via the Internet and also expects this information to be freely available for scrutiny, comment or further dissemination. On the other hand, she is also increasingly aware that her online behaviour is a source of information for others: e.g. information posted on Facebook about her personal life or newspaper articles about a one-off event can remain available on the Internet forever and influence her further life or career. Hence, while the citizen may claim a right to know about others, she may also want a right to privacy regarding her own information and a 'right to be forgotten'. While the possible clash between the right to know and the right to privacy causes challenges on many different levels of society, its possible impact is becoming particularly clear in relation to information held by government and public authorities. In today's society, government is one of the largest holders of information, including e.g. statistical data, citizen registries, cultural heritage, and corporate data. Many of these government databases contain personal data, or information with an impact on the individual's privacy, e.g. regarding health, social benefits, or taxes. Civil rights groups and academics have increasingly protested against government's use and dissemination of personal data (e.g. [1,2]). At the other end of the spectrum, citizens and activist groups are also increasingly demanding access to government information, with campaigns such as AsktheEU (www.asktheeu.org), Wobbing Europe (www.wobbing.eu) or the Global Right to Information Rating (www.rtirating.org). Proponents of the right to know claim that access to such information promotes transparency, participation, economic growth and innovation [3,4]. However, if this information also includes personal data or privacy-sensitive information, which interest prevails: the right to know or the right to privacy?

1 Corresponding Author.

Possible conflicts can be imagined around the spending of tax revenue by Members of Parliament (see e.g. [5] on the start of a public expenses scandal in the United Kingdom), but also around information on 'ordinary people' outside of the public eye (see e.g. the Slovakian Fair Play Alliance website on public procurement money received by companies and private persons, http://znasichdani.sk/l?l=en). These conflicts are not just a theoretical possibility; they also materialise. For instance, in 2010, the European Court of Justice had to examine whether the publication of information on the beneficiaries of subsidies under the Common Agricultural Policy on the website www.farmsubsidy.org infringed the protection of personal data [6]. As both the right to know and the right to privacy are fundamental and constitutionally protected rights under the Charter of Fundamental Rights of the European Union (Charter), the European Convention for the Protection of Human Rights and Fundamental Freedoms (ECHR), as well as national constitutions [7,8], a balance needs to be found between these rights, based on clear, transparent and objective criteria that can be applied by public bodies in dealing with access requests or the proactive dissemination of government information. The relationship between these two rights has received particular attention in the last few years, generally linked to two developments. On the one hand, the European Commission, with the introduction of the PSI directive [9], started a legislative process for opening up public sector information for re-use by the private sector for the creation of pan-European information products and services. As the public sector information that falls under the provisions of this directive includes a large amount of personal data (e.g. business registries, vehicle registers, etc.), such re-use may come into conflict with the legal principles for the protection of personal data laid down, among others, in the EU directive on data protection [10]. On the other hand, governments and policy makers have decided to make the datasets they hold available as open data, allowing them to be used for any kind of purpose by all members of the public in order to increase accountability and public participation and stimulate economic growth and government efficiency. Contrary to the PSI directive, national open data policies generally exclude personal data from their scope, entailing that any datasets containing personal data will not be made available via national data portals or other channels. However, the broad availability of these datasets in a machine-readable format enables the combination of many different datasets, with an ever increasing risk of de-anonymisation of the data included in these datasets [11]. Hence, open data may also pose a risk to the individual's privacy. Moreover, the exclusion of personal data may not always benefit the objectives of open data, as in some cases the publication of these personal data may actually be in the public interest for accountability reasons. In another contribution in this volume, Pagallo and Bassi discuss the potential dangers of open data, and particularly re-use of PSI, for data protection and privacy, and they argue that the solution lies in tools such as privacy by design and privacy impact assessments. The need for such an 'open data & PSI v. privacy & data protection' discussion is indeed urgent, as is also shown by the recent Article 29 Working Party's Opinion on Open Data and Public Sector Information (PSI) Reuse [12]. However, it is not a completely new debate. It is part of a broader and older discussion on the balancing of two values: transparency on the one hand, and privacy on the other. In this paper, we explore the potential value of this older discussion for the balancing exercise that is needed today. To avoid drifting off into theoretical discussions about the scope of concepts such as privacy, transparency or openness, we will use the concrete example of the relationship between freedom of information (FOI)
legislation and the protection of privacy and/or data protection, concretised in the exemptions in FOI acts allowing or mandating non-disclosure of documents held by the public sector for reasons of privacy. We will try to assess whether the criteria or balancing processes used can provide input on how to deal with personal data in an open data environment. This paper represents a first exploration of the potential relevance of case law, practice and literature relating to this topic. It is not based on extensive research of a representative number of cases, but rather picks a few examples to highlight some of the concrete questions relating to access and privacy. We argue that in-depth research into this topic could be useful as background for the open data v personal data debate. In the following sections, we first set out the problem of balancing privacy and access rights, after which we examine some examples of how this balance has been translated into legislative provisions and in which institutional settings it is applied by various existing redress mechanisms. Next, we highlight some criteria that have been applied, or problems that have come up in the case law and literature, in order to conclude with a first assessment of the usefulness of these elements in the light of the new developments on open data.

1. The Search for a Balance Between Freedom of Information and Privacy

While the relationship between FOI (or access to information held by the public sector)2 and privacy has often been described as conflicted [13], there is a growing consensus that both rights are complementary and aim to increase the democratic character of society and, at the same time, the information autonomy of the individual [11,14–16]. While in most cases there is no tension between them, situations may arise where the rights do collide [15], i.e. when a document requested under FOI legislation holds information that may harm an individual's privacy. Both the right to privacy and the right to access have been recognised as fundamental rights [7,8,17]. However, the right to privacy is much better established within the European legal order, while the right of access has only more recently overcome the reluctance of the legislators and the European Court of Human Rights to recognise it. As both rights are now indeed considered fundamental, there is no hierarchical order between them [18]. Therefore, the solution lies in finding an appropriate balance between the interests protected by these rights in situations where they come into conflict [19,20]. In this respect, stating that privacy represents the interests of the individual, while access represents the interests of the community, would be too simple: privacy is a public good and also of vital importance in a democratic society [11]. This search for a balance can take place on at least two levels. First, the legislator tries to express the balance in generally applicable rules, most often laid down in FOI legislation. In almost all EU Member States, and in many other countries, FOI legislation establishes a general obligation for public bodies to grant access to their documents "with the exception of, or with due regard being made to, the rules on privacy and data protection" [21]. However, such general provisions do not provide ready answers to public bodies facing a particular request for information. Therefore, a second
balancing exercise is performed by the public bodies themselves, and by redress bodies competent for potential complaints about the decisions of the public bodies. This implies an analysis of the circumstances of each particular question on a case-by-case basis [21]. In the next sections, we will take a short look at both levels of balance seeking.

2 We will use these two terms interchangeably and assume them to be the same, not taking into account the broader concept of freedom of information as it is enshrined in article 10 of the European Convention on Human Rights.

2. The Balance Between Access and Privacy in FOI Legislation

While FOI legislation starts from the fundamental presumption that all information held by government should be accessible to the general public, it has never intended this to be an absolute right. Exceptions are possible, allowing the government to refuse access in order to protect other legitimate rights or interests. One of these interests is the protection of the private life or the privacy of the individual. The balance between access and privacy has been described in many different ways in the legislation [18]. For instance, on an international level, the Council of Europe Convention on Access to Official Documents allows the Council of Europe Member States to limit the right of access to protect "privacy and other legitimate private interests", on condition that these limitations are set down precisely in law, are necessary in a democratic society and are proportionate to the aim of protecting said interests [22]. The Convention is not yet in force, but many national regulations already have comparable exceptions. The French FOI rules have a general wording: documents whose communication would harm the protection of the private life of an individual cannot be disclosed [23]. The Belgian federal and Flemish FOI acts require the public authorities to refuse a request for access if disclosure of the document would harm the private life of an individual, unless that person has consented to the disclosure [24,25]. Contrary to these European examples, the United States 1966 Freedom of Information Act, one of the oldest FOI legislations,3 contains more specific provisions. Two exemptions relate to the protection of privacy: exemption 6 allows agencies to deny access to "personnel and medical files and similar files the disclosure of which would constitute a clearly unwarranted invasion of personal privacy"; and exemption 7(c) relates to "records or information compiled for law enforcement purposes, but only to the extent that the production of such law enforcement records or information […] could reasonably be expected to constitute an unwarranted invasion of personal privacy" [26]. Some FOI legislations also explicitly mention the protection of personal data as an interest that needs to be safeguarded. For instance, the United Kingdom's Freedom of Information Act 2000 exempts, in its section 40(b), personal data if the disclosure thereof would contravene any of the data protection principles of the Data Protection Act 1998 [27]. Article 6 of the Access to Public Information Act of Slovenia requires public bodies to deny access to requested information if the request relates to "[p]ersonal data the disclosure of which would constitute an infringement of the protection of personal data in accordance with the Act governing the protection of personal data" [28]. The reference to personal data rather than privacy in an exemption may have an impact on the balance between the different rights involved. While data protection and privacy are obviously closely related, they are not identical, possibly entailing different
results as to when the exemptions apply and data may not be disclosed. While we will not go into the details of the relationship between privacy and data protection, we will just indicate the relevance of the distinction with the example of the current EU regulation on public access to documents held by the EU institutions (Regulation 1049/2001) [29]. According to the Regulation, the institutions have to refuse access to a document where this would undermine the protection of "privacy and the integrity of the individual, in particular in accordance with Community legislation regarding the protection of personal data". The European Commission and the European Data Protection Supervisor (EDPS) interpreted this provision differently. In short, according to the European Commission, the assessment of whether privacy is undermined should be based on whether the rules of Regulation 45/2001 on data protection within the EU institutions are complied with. If this is not the case, the document should not be accessible. The EDPS believed that it should first be assessed whether there is a possible privacy-sensitive matter, and only then would the data protection rules come into play. If there is no danger to privacy, access should be allowed, without any need for invoking Regulation 45/2001 [14,18]. The Court of Justice followed the Commission's approach in the Bavarian Lager case, which is also mentioned by Pagallo and Bassi in their paper [30]. While the legislator gives guidelines on the relationship between access and privacy, the general character of the provisions obviously does not tell us much about how to deal with a potential conflict in practice. As was mentioned in the previous paragraph, this has to be decided by the public bodies on a case-by-case basis, taking into account the particular circumstances of the case. So if we want to learn anything about how conflicts between access and privacy (and/or data protection) are treated, we need to look into the decision practice of the public bodies, and particularly into the decisions of the redress bodies dealing with complaints about the public bodies' treatment of requests. The arguments used in these cases may set out some de facto standards on the criteria or factors that play a role in the weighing of interests.

3 The oldest FOI legislation is the 1776 Freedom of the Press Act in Sweden.

3. The Balancing of Access and Privacy in Redress Mechanisms

While FOI legislation may provide a very general approach to how to deal with possible conflicts between access and privacy (and in some cases even make specific rules for particular types of information, cf. infra), in most cases it does not provide sufficient support for public bodies to determine their attitude towards disseminating particular documents, information or datasets. An assessment by relevant experts in the context [11,21] of the particular case is therefore still necessary. Guidance for such assessment can be found in the decision practice of the redress bodies that have been established to deal with complaints about public bodies' refusal of access. The redress mechanisms against decisions based on FOI legislation have taken various forms in different countries. This includes internal review and review by Ombudsmen, independent Information Commissions, commissions within the central administration, special tribunals, and national courts [16]. For instance, in France, the Commission d'Accès aux Documents Administratifs (CADA) is competent for any complaints on access and re-use of public sector information [31]. It has a dual role: alongside handling complaints from citizens about refusals of access (or re-use), it also advises public authorities on how to deal with requests for information. The Belgian and Flemish Commissions are not set up as independent authorities, but organised
within the Ministry of Internal Affairs and the Flemish Ministry's Services for the General Government Policy respectively [32,33]. In the United Kingdom, the Information Commissioner's Office is an independent authority competent for both FOI and privacy and data protection [34]. Appeals against notices from the Information Commissioner can be brought before the Information Tribunal [35]. In Slovenia, the choice has also been made to combine the competences for data protection and FOI under the Information Commissioner [36]. In contrast, some countries have chosen not to implement a separate structure for complaints under FOI. For instance, in the United States complaints are dealt with within the general court system. Whatever their structure, the redress bodies have to weigh the different factors involved in the case before them and try to find an appropriate balance in the particular circumstances. As the UK's Information Tribunal has formulated it: "When assessing the public interest balance for the purposes of each exemption we take an approach under which we aggregate all public interests in non-disclosure. We reach our conclusion on the overall balance by assessing the weight of their cumulative effect against the weight we give to the public interests in disclosure" [37]. An interesting topic of research would be whether the decision-making process, and particularly its results, differ between the redress mechanisms that are only competent for FOI complaints and the redress bodies, such as the UK's and Slovenia's Information Commissioners, which also have competence for privacy and data protection matters. In addition to the economic advantage of being more cost-effective, the combination of competences and the necessary expertise to exercise them should presumably lead to a very balanced view of the relationship between privacy and FOI [15,16]. Would redress bodies only competent for FOI be inclined to interpret that freedom more broadly (and hence the exceptions more strictly) or not? As this would require an extensive analysis of the case law of the institutions involved, this question lies beyond the scope of this paper. In the next section, we will just highlight some elements taken from case law in different countries which may provide food for thought for the debate on open data and privacy/personal data.

4. Food for Thought for the 'Open Data v Privacy' Debate

Our research into the case law on FOI and privacy is too preliminary to propose potential criteria used in this case law that might be applicable in the search for the appropriate way to tackle the potential conflicts between open data and privacy. However, we have found some elements and concepts in the case law and literature relating to access and privacy that can be considered at least as 'food for thought' for the open data community. This includes attention to proactive measures and partial access; the concept of 'core purpose'; and the impact of the public character of persons or their information.

4.1. Proactive Approach

The suggestion of a proactive approach has often been heard with regard to preparing datasets for their release as open data on a national open data portal. For each dataset, the potential privacy risks should be assessed before its release (or, ideally, at the moment of its collection). For instance, O'Hara describes a procedure for pre-release screening of data to ensure respect for privacy [11].

This proactive approach was also suggested by the EDPS in relation to access to documents (before the concept of open data came into fashion). The EDPS proposed that public bodies consider adopting internal rules on access to certain documents containing personal data, or at least inform the data subjects beforehand about the way their data was to be used and/or disclosed [14]. Public bodies could, for instance, use their rules of procedure, employment procedures or staff regulations to clarify whether certain documents might be disclosed [14]. This proactive approach has been adopted by some governments, which require that public bodies create a 'publication scheme' for the information they hold [38]. A particular way for public bodies to be proactive regarding potential releases of personal or privacy-sensitive data would be to ask for the individual's consent at the moment at which their data is provided to the public body. However, Kranenborg argues that always making the disclosure of personal data dependent on consent would not be proportionate, as this would allow individuals always to determine themselves whether or not their data could be disclosed, without any regard for a potential overriding public interest in the data (e.g. the spending of allowances by Members of Parliament). Relying on consent might work for some types of data, e.g. where there is a clear privacy interest at stake, but not for all personal data [18]. It would be interesting to gain more insight into how such a proactive approach is built under the terms of FOI legislation, in order to determine how well this holds up for open data. However, the earlier emphasis put on case-by-case assessment has already shown that a proactive approach will not be able to take into account all the potential interests at play in particular cases. In general, this is a problem that any active dissemination of data faces: it is difficult to take all potential circumstances into account beforehand.
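With that caveat, a pre-release screening procedure of the kind O'Hara describes [11] can be pictured as a gate that every dataset must pass before publication on a portal. The checks below are a hypothetical minimum of our own devising, not O'Hara's actual protocol:

```python
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    contains_direct_identifiers: bool
    contains_quasi_identifiers: bool  # e.g. postcode + birth date + gender
    impact_assessment_done: bool

def screen_for_release(ds: Dataset) -> str:
    """Hypothetical pre-release gate: withhold, redact, review, or release."""
    if not ds.impact_assessment_done:
        return "withhold: no privacy impact assessment on file"
    if ds.contains_direct_identifiers:
        return "redact: remove direct identifiers before release"
    if ds.contains_quasi_identifiers:
        return "review: assess jigsaw re-identification risk case by case"
    return "release"

print(screen_for_release(Dataset("planning permits", True, True, True)))
```

Even so, such a gate can only encode the circumstances foreseen at screening time, which is precisely the limit of any proactive approach noted above.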

4.2. Partial Access

Another often-used solution to possible conflicts between access and privacy is the granting of partial access to a document, i.e. by deleting the names of individuals or any other privacy-sensitive information. This solution is explicitly foreseen in many national FOI legislations. According to the EDPS, partial access can be seen as an application of the principle of proportionality: "in those cases where the proportionality test, based on a concrete and individual examination, has shown that full access would undermine the privacy of an individual, the possibility to grant partial access must be considered" [14]. This was also recognised by the Court of Justice in its 2001 Hautala judgment [39]. For instance, in its advice to public authorities, the French CADA has suggested the removal of data such as bank accounts [40], the home addresses of building permit applicants (but not the names of the applicants or the addresses of the building sites) [41], and the address or date of birth of trade union members (but not their names). The United Kingdom's Information Tribunal allowed disclosure of senior public officers' professional activities and interests outside of their public function, but required redaction of their other interests and association memberships [37]. More insight into the types of information or data that are generally required to be removed from documents or files would be interesting as a guideline for trying to redact datasets containing potentially harmful information. This could also be useful in the discussion on the value of anonymisation and the potential risks of de-anonymisation and jigsaw identification [11]. However, there is an obvious limit to the
concept of partial access that also limits the usefulness of anonymisation. Where personal data are at the heart of the document, or where the personal data are actually the purpose of the request, partial access does not help [18]. Still, the examples from the CADA and the Information Tribunal's decision practice show that partial access might allow at least some of the personal data to be made available, while other data could be redacted. Exactly this kind of example might be very helpful for open data.
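Translated to datasets, partial access amounts to field-level redaction. The sketch below hard-codes, purely for illustration, the kind of distinctions drawn in the CADA advice cited above (names and building-site addresses disclosed; bank accounts, home addresses and dates of birth removed); a real redaction policy would of course follow from a case-by-case assessment:

```python
# Fields treated as privacy-sensitive in this illustration, loosely echoing
# the CADA examples above (bank accounts, home addresses, dates of birth).
SENSITIVE_FIELDS = {"bank_account", "home_address", "date_of_birth"}

def partial_access(record: dict) -> dict:
    """Return a copy of the record with privacy-sensitive fields removed,
    leaving e.g. applicant names and building-site addresses disclosed."""
    return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}

permit_application = {
    "applicant_name": "J. Dupont",
    "building_site": "12 rue Exemple",
    "home_address": "34 avenue Privée",  # redacted before release
    "bank_account": "FR76 XXXX",         # redacted before release
}
print(partial_access(permit_application))
```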

4.3. Public Character of the Data

While many FOI Acts enable the disclosure of personal information where it is determined to be generally in the public interest (e.g. Canada, Australia, Ireland), a number of national FOI legislations contain specific provisions for data from people 'in the public eye'. For instance, in the federal FOI Act of Germany the public interest in disclosure of the name, title, academic title, professional function, work address and work phone number of public servants is deemed to outweigh the interest of the public servant's privacy [42]. In Hungary, Poland and Romania, special provisions are included in the legislation for the personal data of persons performing public functions, insofar as this data relates to their public function [43–45]. Under Lithuanian law, "information concerning the private life of a public figure (state political figures, public servants, heads of political parties and public organisations as well as other persons participating in public or political activity) may be made public without their consent if such information discloses the circumstances of the aforementioned person's private life or personal traits which are of public significance" [46]. Redress bodies in several countries have had a chance to develop a practice on the potential disclosure of personal data of people holding a political mandate or a position in public service. An examination of this practice, and of the possible standards arising from it, could contribute to the development of guidelines for the publication of open datasets containing data about 'public' persons. As many datasets holding such data are of interest to the public (as shown by e.g. the public debate raised in the United Kingdom about the spending behaviour of Members of Parliament), such guidelines could already help in the decision on whether to publish many datasets. The French CADA has published a large body of decisions on the publication of public servants' personal data related to their function. For instance, it allowed the disclosure of the names of local authority staff with their function and grade or level, but not information on their holidays or administrative sanctions. Also, for electoral candidates, a set of data could be published which included their names, gender, date and place of birth, profession and domicile [47]. In general, the CADA considers it in the public interest to publish the names of people with a public function attending meetings of local authorities etc., but not the names of the other people involved, e.g. new inhabitants [48]. The United Kingdom's Information Commissioner also takes into account the public function and status of the individual involved in its decisions. As early as 2006, it criticised government departments for routinely blanking out officials' names in documents without a justifiable reason [17]. The Commissioner not only takes into account the professional character of particular data or activities but, within that professional character, still searches for a reasonable balance in each particular case. The same can be said of the Information Tribunal, which deals with appeals against the decisions of the Information Commissioner. For instance, the Information Tribunal found it "fair" to disclose names, departments, sections and job titles for all
Bolton Council Officers included in the register of interest, as this related to "their professional life in public service". However, for other professional elements, such as "other work done" or "consultancies held", the Tribunal made a distinction on the grounds of seniority, stating that it would only be fair to demand such disclosure from "Chief Officers": "the more senior an officer, the greater the need for transparency and confidence in public office as they have the responsibility for decision making" [37]. The key element is the public function: personal data which are related to the "conduct [sic] of public service – irrespective of the actual location and time, i.e. whether it is done during or outside working hours, at the workplace or at home" should be disclosed [15,18]. However, there may also be a lot of personal data that is not related to the public function of the person involved, but which may still be in the public interest. This is, for example, the case when one, as Hopkins calls it, "moves beyond the clear parameters of employment into the more nebulous question of influence" [49]: information about the donors of particular think tanks may very much be in the public interest. This could also be true for data from 'private persons' who are not involved in government at all. For such data, Kranenborg suggests a distinction based on whether the persons involved have given their data in a professional or personal context, e.g. as CEO of a company or as an individual citizen [18].
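The Tribunal's seniority distinction lends itself to a rule-like restatement. The sketch below is our own schematic reading of the Bolton decision [37], not a rule laid down by the Tribunal:

```python
# Fields disclosed for all officers vs. only for senior ("Chief") officers,
# schematically following the Bolton register-of-interests decision [37].
ALWAYS_DISCLOSED = {"name", "department", "section", "job_title"}
SENIOR_ONLY = {"other_work_done", "consultancies_held"}

def disclosable_fields(is_chief_officer: bool) -> set[str]:
    """The more senior the officer, the greater the need for transparency."""
    return ALWAYS_DISCLOSED | (SENIOR_ONLY if is_chief_officer else set())

print(sorted(disclosable_fields(is_chief_officer=False)))
print(sorted(disclosable_fields(is_chief_officer=True)))
```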

4.4. Purpose of Freedom of Information

Could going back to the original purpose of FOI be of any help in deciding which datasets can be made available as open data? FOI legislation was traditionally adopted to allow the public to check up on its government's activities and to increase the accountability of the government towards the general public. Should personal data only be disclosed when it contributes to the fulfilment of the objectives of FOI? And how large should that contribution be? This reasoning has been applied in the United States to requests under the 1966 Freedom of Information Act since the Supreme Court's decision in the Reporters Committee case in 1989. The Supreme Court decreed that the FOI Act's central purpose is to scrutinise government's behaviour; access should be given only to information that relates directly to government operations or performance [19,50–52]. According to some, this would avoid misuse of FOI legislation to learn about competitors or opposing parties in litigation, or to collect information on individuals for commercial purposes [52,53]. While the basic idea – a teleological interpretation of FOI law – may seem attractive, its flaws should also immediately be acknowledged. The "central purpose" of FOI could perhaps serve as a first filter for not publishing datasets which hold personal data but are not of public interest. However, there are many datasets that may not directly inform about government activities, but which could be of high public interest, such as census data, scientific reports, jobs and economic data, and public health or public safety information [19]. A counter-argument to their publication would be that, although in many cases information about individuals would actually say something about the way government is functioning, in most of these cases statistical data, or redacted (partially accessible) data, would be sufficient to address the public interest involved [53]. Of course, a further problem with the "core purpose" approach adopted by the United States Supreme Court is that the current drivers of open data are much broader than only "checking up on government". They also include public participation, innovation, economic growth and administrative efficiency. Against this background, the
core purpose principle will not be of much use. However, it may still be one factor in a broader spectrum of considerations taken into account when releasing personal data.

5. From Freedom of Information to Open Data In the previous paragraphs, we have taken a brief look into the way a balance is sought in the relationship between the interests protected by FOI and privacy. While this only provides some first ideas, we believe that it would be interesting to look more into the very rich case law and decision practice on this subject to draw inspiration for the development of a decision-making process on how to deal with the publication of open data that contains personal information. In such an analysis of FOI practice and policy, some specific elements will have to be taken into account to determine the possible value of the results of the analysis. First, as mentioned earlier, the drivers of open data also include economic growth and innovation, and commercial use of information held by the public sector is also explicitly part of the purpose. This is also shown clearly by the European PSI directive. These drivers are not part of the analysis in FOI redress mechanisms, and are sometimes even seen as particular reasons for not disclosing data. Unfortunately, the decision practice of the redress bodies on re-use of PSI is only in its infancy, so not much insight can be gained there as yet. Second, FOI decision practice is generally based on denied requests for data, while open data starts from the premises of proactive releases of data. Even though FOI has already had to deal with changes relating to the provision of electronic documents rather than paper versions [52,54], this does not yet cover the broad release of data, e.g. via the Internet. While some transparency activists argue that “if data could be released reactively under the terms of [freedom of information law], then they should be released proactively under a transparency programme” [11], the privacy concerns may be of a completely different nature, and may require a separate balancing exercise. As mentioned earlier, this complicates the process of case-by-case assessments as to whether there would be a potential threat to privacy. Therefore, it seems imperative that the need for case-by-case analysis is recognised in the release of open data. One way to at least attempt this would be to include a broader set of stakeholders in the decisionmaking process (although without imposing too many administrative burdens or timeconsuming procedures). Of course, as in FOI, the decision practice on open data and privacy will have to be developed, ideally by institutions with expertise and training in areas such as access, reuse, privacy and data protection. Moreover, education of all stakeholders involved, including citizens, data holders, public officials, IT consultants, developers, re-users, etc. could increase awareness of the interrelation between both rights [15,54]. Such education could be supported by guidelines, check-lists or different forms of supporting documents for decision-making [15,16]. An important element that should take a central place in the balancing exercise (and the education on how to perform this exercise) is the concept of a reasonable expectation of privacy for the individuals involved. While this concept has been offered as a solution in literature [51,55], and we can assume that it plays an implicit role in the weighing of interests in the decision practice on freedom of information, the lack of explicit mention has not allowed the development of a generally adopted ‘reasonable expectation’ approach. In addition, the growing attention for data protection has been




forcing this concept somewhat into the background, moving the focus to whether personal data is disclosed in compliance with the conditions for data processing rather than to whether harm is being done to a person's privacy. The discussion between the European Commission and the European Data Protection Supervisor and its outcome in the Bavarian Lager case are a clear example of this, as is the Court of Justice's judgment in the Farm subsidy case [6,56]. The recent transparency programmes on open data actually reinforce the emphasis on the processing of data rather than on the reasonable expectation of privacy. However, maybe this reasonable expectation is exactly the criterion that could take the situation out of its impasse and support the development of organisational and technical solutions to reconcile open data and privacy?


References

[1] S. Gutwirth and P. De Hert, Privacy, data protection and law enforcement. Opacity of the individual and transparency of power, in E. Claes et al. (eds.), Privacy and the Criminal Law, Intersentia, 2006.
[2] E. Fura and M. Klamberg, The Chilling Effect of Counter-Terrorism Measures: A Comparative Analysis of Electronic Surveillance Laws in Europe and the USA, in J. Casadevall et al. (eds.), Freedom of Expression – Essays in Honour of Nicolas Bratza, Wolf Legal Publishers, 2012.
[3] E. Mayo and T. Steinberg, The power of information: an independent review, 2007, http://www.epractice.eu/en/library/280988 (accessed on 08/03/2013).
[4] Access Info, Beyond Access: Open government data and the 'right to reuse', 2010, http://www.access-info.org/documents/Access_Docs/Advancing/Beyond_Access_7_January_2011_web.pdf (accessed on 08/03/2013).
[5] M. Carpenter, The Accountability of Members for Their Expenses: Legal and Jurisdictional Issues, Journal of Parliamentary and Political Law 5 (2011), 323.
[6] European Court of Justice, Joined cases C-92/09 and C-93/09, Volker und Markus Schecke GbR; Hartmut Eifert, 9 November 2010.
[7] W. Hins and D. Voorhoof, Access to State-Held Information as a Fundamental Right under the European Convention on Human Rights, Eur. Const. L. Rev. 3 (2007), 114–126.
[8] R. Peled and Y. Rabin, The constitutional right to information, Colum. Hum. Rts. L. Rev. 42 (2011), 357–401.
[9] European Parliament and Council, Directive 2003/98/EC on the re-use of public sector information, 2003.
[10] European Parliament and Council, Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data, 1995.
[11] K. O'Hara, Transparent government, not transparent citizens, 2011, http://eprints.ecs.soton.ac.uk/22769/ (accessed on 08/03/2013).
[12] Article 29 Data Protection Working Party, Opinion 06/2013 on open data and public sector information ('PSI') reuse, 2013, http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2013/wp207_en.pdf (accessed on 26/06/2013).
[13] T. Pitt-Payne, Freedom of Information and Data Protection: Creative Tension or Implacable Conflict?, 2007, http://www.franco-british-law.org/pages/ENG/publications/documents/Pitt-Payne.pdf (accessed on 17/10/2010).
[14] European Data Protection Supervisor, Public access to documents and data protection, 2005, http://www.edps.europa.eu/EDPSWEB/edps/cache/off/EDPS/Publications/Papers (accessed on 07/03/2012).
[15] Y. Szekely, Freedom of Information Versus Privacy: Friends or Foes?, in S. Gutwirth et al. (eds.), Reinventing Data Protection?, Springer, 2009, 293–316.
[16] N. Torres, Access to Information and Personal Data: An Old Tension, New Challenges, 2012, http://www.transparencyconference.nl/wp-content/uploads/2012/05/Torres.docx (accessed on 05/05/2013).
[17] P. Birkinshaw, Freedom of information and openness: fundamental human rights?, Administrative Law Review 58 (2006), 178–218.
[18] H. Kranenborg, Toegang tot documenten en bescherming van persoonsgegevens in de Europese Unie. Over de openbaarheid van persoonsgegevens, https://openaccess.leidenuniv.nl/handle/1887/12352 (accessed on 05/05/2013).





[19] M. Halstuk and B. Chamberlin, The Freedom of Information Act 1966–2006: A retrospective on the rise of privacy protection over the public interest in knowing what the government's up to, Comm. Law and Policy 11 (2006), 511–564.
[20] C. Raab, Privacy issues as limits to access, in G. Aichholzer and H. Burkert (eds.), Public Sector Information in the Digital Age. Between Markets, Public Management and Citizens' Rights, Edward Elgar, 2004, 23–47.
[21] Art. 29 Working Party, Opinion 5/2001 on the European Ombudsman Special Report to the European Parliament following the draft recommendation to the European Commission in complaint 713/98/IJH, 2001, http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2001/wp44en.pdf (accessed on 08/03/2013).
[22] Council of Europe Convention on Access to Official Documents, 2008, https://wcd.coe.int/ViewDoc.jsp?id=1377737&Site=CM (accessed on 05/05/2013).
[23] Loi n° 78-753 du 17 juillet 1978 relative à la liberté d'accès aux documents administratifs et à la réutilisation des informations publiques.
[24] Law of 11 April 1994 on freedom of information.
[25] Decree of 27 March 2004 on freedom of information.
[26] FOI Act US, Title 5 of the United States Code (Government Organization and Employees), section 552b.
[27] Freedom of Information Act 2000.
[28] Access to Public Information Act of 22 March 2003.
[29] European Parliament and Council, Regulation 1049/2001 regarding public access to European Parliament, Council and Commission documents, 2001.
[30] European Court of Justice, Case C-28/08 P, European Commission versus The Bavarian Lager Co. Ltd, 29 June 2010.
[31] http://www.cada.fr/.
[32] http://www.ibz.rrn.fgov.be/index.php?id=2441&L=1.
[33] http://openbaarheid.vlaanderen.be/nlapps/default.asp.
[34] http://www.ico.org.uk/.
[35] http://www.justice.gov.uk/tribunals/information-rights.
[36] https://www.ip-rs.si/?id=195.
[37] Information Tribunal, Greenwood & Bolton Metropolitan Borough Council v Information Commissioner (EA/2011/0131).
[38] L. Batista and M. Kornock, Information sharing in e-government initiatives: Freedom of Information and Data Protection issues concerning local government, Journal of Information, Law & Technology 2 (2009), http://go.warwick.ac.uk/jilt/2009_2/bc (accessed on 03/05/2013).
[39] European Court of Justice, Case C-353/99 P, Council v Hautala, 6 December 2001.
[40] CADA, Conseil 20062674, Président du Conseil général de l'Isère, 29 June 2006.
[41] CADA, Conseil 20073182, Maire de Pontivy, 13 September 2007.
[42] Gesetz zur Regelung des Zugangs zu Informationen des Bundes (Informationsfreiheitsgesetz), 5 September 2005.
[43] Act LXIII of 1992 on the Protection of Personal Data and Public Access to Data of Public Interest.
[44] Act 112/2001 of 6 September 2001 on Access to Public Information.
[45] Act 112/2001 of 6 September 2001 on Access to Public Information.
[46] Act I-1418 of 2 July 1996 on Provision of Information to the Public.
[47] CADA, Avis 20123881, Préfecture de l'Orne, 22 November 2012.
[48] CADA, Avis 20122406, Maire de Saint-Paul-de-Vence.
[49] R. Hopkins, Expert Comment, F.O.I. 8 (2012), 2.
[50] United States Supreme Court, United States Department of Justice v. Reporters Committee for Freedom of the Press, 489 U.S. 749, 1989.
[51] C. Davis, Expanding Privacy Rationales Under the Federal Freedom of Information Act: Stigmatization as Talisman, Social Science Computer Review 23 (2005), 453–462.
[52] D. Solove, Access and Aggregation: Privacy, Public Records, and the Constitution, Minn. L. Rev. 86 (2002), 1137–1217.
[53] F. Cate, D. Fields and J. McBain, The Right to Privacy and the Public's Right to Know, Administrative Law Review 46 (1994), 41–74.
[54] D. Byrne, Access to Online Local Government Public Records: The Privacy Paradox, Legal Reference Services Quarterly 29 (2010), 1–21.
[55] G. Barber and F. Corrado, How Transparency Protects Privacy in Government Records, 2011, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1850786 (accessed on 03/05/2013).
[56] O. Lynskey, Data protection and freedom of information: reconciling the irreconcilable?, C.L.J. 70 (2011), 37–39.



Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-202

Midata: Towards a Personal Information Revolution

Nigel SHADBOLT 1

Web and Internet Science Group, University of Southampton, Highfield, Southampton SO17 1BJ


Abstract. There has been an explosion of data on the Web. Much of this data is generated by or else refers to individuals. This emerging area of personal information assets is presenting new opportunities and challenges for all of us. This paper reviews a UK Government initiative called 'midata'. The midata programme of work is being undertaken with leading businesses and consumer groups in order to give consumers access to their personal data in a portable and electronic format. Consumers can then use this data to help them better understand their own consumption behaviours and patterns, as well as make more informed and appropriate purchasing and other decisions. The paper reviews the history and context, principles and progress behind midata. It describes concrete examples and examines some of the challenges in making personal information assets available in this way. The paper reviews some of the key tools and technologies available for managing personal information assets. We also summarise the legislative landscape and various legal proposals under development that are relevant to midata. We review similar efforts elsewhere, in particular those underway in the US under a programme of work called Smart Disclosure. This work seeks to release personal information held by government and business back to citizens and consumers. Finally we discuss likely future developments.

Keywords. Midata, personal information, consumer empowerment, personal information stores, smart disclosure

Introduction

The World Wide Web has had profound effects at all levels of society. It has changed the way that individuals interact with one another, the way governments connect to their citizens, and the way businesses connect to their clients. The development of new services and the creation of new content are occurring at ever increasing rates on the Web. One of the most dramatic examples is the extent to which content is being generated by individual users. We now generate prodigious amounts of data. In 2013 it is estimated that every minute YouTube users upload 72 hours of new video, Google receives over 3.25 million search queries, over half a million corporate emails are sent, over 170,000 tweets are sent and Facebook users share almost 700,000 pieces of content.

1 Corresponding Author: Nigel Shadbolt, Web and Internet Science Group, University of Southampton, Highfield, Southampton SO17 1BJ; E-mail: [email protected].




It is estimated that 2.7 Zettabytes of information were created and stored in 2012 – 2.7 trillion Gigabytes [1]. This is an abundance of information at an unprecedented scale.

This is just one class of data relating to individuals – what we might term explicitly contributed. There is also data that is implicitly contributed. The transactions that we carry out with a whole range of businesses and processes all generate data – the transactions are ones that the user is usually aware of making; however, the resulting data generation is often implicit. There is also data that is third-party sensed – transactions where the user is not aware that they are generating a data stream. Such third-party sensed data can range from switching on a light in a smart-metered household to powering up a mobile phone.

From the search terms we enter, to the cookies placed in our browsers when we visit a website, from the click streams we generate to the tweets we make, from the location data contained in our mobile phones to the Near Field Communication (NFC) data that logs our transactions, the Web and information technology are transforming the way individuals and organisations interact, and these new data sources are increasingly being used to generate new value.

The ability to store large amounts of information relating to our individual activities, transactions and behaviours is leading towards the emergence of personal data as a new asset class [2]. A recent report has estimated the size of the UK market for information volunteered by individuals at £20bn by 2020 [3]. Research in 2012 estimated that the value extracted from European consumers' personal data in 2011 was worth €315bn, with the potential to grow to €1tn annually in 2020 [4,5].

Until recently nearly all attempts to deal with the growing abundance of data belonging to individuals have been made by the supply side. Companies and governments alike still tend to look at individuals, and at data belonging to individuals, as stuff that needs to be "managed". This is the fundamental perspective of many Customer Relationship Management (CRM) systems, leading one commentator to note that the proponents of such systems employ the language of slavery ("acquire", "control", "retain", "manage", "lock in" and "own") in talking about customers [6].

What many now observe is that individuals are in command of far more data than any large supply-side organisation can maintain on their behalf. On the Internet and in the age of the Web, individuals are in the best position to manage their own data.2 In technical terms, each individual is the best point of integration for his or her own data, and the best point of origination for what gets done with it.

The figure below depicts the emerging structure of this market. It will be supported by services helping individuals collect, store and manage their own data (Personal Data Management). Varieties of personal data include: personal attribute data, data that we volunteer, data that is observed in our interactions with systems and processes, and data that is inferred from our behaviours and history. We already see services helping individuals use this data to get a variety of things done (PIMS), and a number of them will be discussed in this chapter. By encouraging increased sharing of information between individuals and organisations, new opportunities are created for existing providers to add value and reduce cost, whilst consumers benefit from new insights and services. In fact this amounts to a new set of opportunities for business management of personal information.

2 Current usage sees the terms data and information used interchangeably – strictly speaking, information is data with an interpretation. The data elements 150 and 56 might be interpreted in many ways; a scheme to provide them with meaning turns them into information – blood pressure and age is one such, another might be cost in pounds and number of items bought.



Figure 1. The Personal Information Data Market – reproduced with permission from Ctrl-Shift.

In the rest of this paper we will review the history and origins, principles and progress behind the UK midata programme – a programme of work that seeks to empower individuals by giving them access to their own data, whilst at the same time enabling businesses to have richer and more profitable interactions with their customers. We will describe concrete examples and examine some of the challenges in making personal information assets available in this way. Finally we review similar efforts elsewhere and look to future developments in this fast-evolving area of our digital lives.


1. Midata Programme

1.1. History and Origins

On 13 April 2011 the UK Government's consumer empowerment strategy Better Choices: Better Deals was launched. This strategy had been jointly developed by the Department for Business, Innovation and Skills (BIS) and the Cabinet Office's Behavioural Insight Team (BIT) [7]. The strategy described voluntary programmes and "nudges" designed to help consumers find and adopt the best choices for their circumstances and needs.

One of the key elements of the empowerment strategy is the midata programme, under the chairmanship of the author. This is a programme of work involving Government, consumer groups and leading businesses, designed to give consumers more control over, and access to, the data that companies hold about them. Midata seeks to give consumers access to their transaction data in a way that is machine-readable, portable and secure. The ambition is to rebalance the current asymmetry that exists between business and consumers. This asymmetry is one in which companies have large amounts of data about the individual transactions we undertake with them, whilst consumers usually have little or no access to this data.




We are all confronted with information that is important to us but that is hard to gain useful access to, and even harder to make sense of. A powerful combination of new technology and new government policy is set to change this landscape and stimulate the growth of new capabilities – choice engines that interpret this data on our behalf [8].

Since April 2011 midata has been a voluntary partnership between the UK Government, businesses, consumer groups, regulators and trade bodies. The programme is being led by a steering group, with working groups tackling specific issues such as interoperability, security, authentication and the like. The initial focus of the midata programme was on four major sectors which account for a substantial amount of the personal information held by corporates in the UK: finance and banking, energy, telcos and retail. The steering group is chaired by the author, who is also involved in the effort to open up non-personal public sector and government data in the UK.

The three main objectives of midata are to:

• secure broad private-sector participation in the project, with a significant number of businesses agreeing to release individual consumers' personal data on request;
• let consumers access and use their data in a safe way; and
• encourage businesses to develop innovative services and applications that will enable consumers to interpret and use the data.


The belief is that the midata programme will: help improve information sharing between organisations and their customers; sharpen incentives for businesses to compete keenly on price, service and quality; build trust; and facilitate the creation of a new market for personal information services that empower individuals to use their own data for their own purposes. By combining data from many different sources and letting consumers add information of their own, businesses have a significant opportunity to help customers create rich, new, "person-centric" data assets.

The Government expects that the data released will help stimulate new markets offering personal data management services. These are likely to include services that:

• help individuals understand their own consumption behaviours and patterns, and help them change them for the better;
• use an individual's data to help them make more informed purchasing decisions; and
• combine personal and other data from a range of different sources, for use by the individual and by organisations to offer new goods and services.

We shall see examples of these services and benefits later in this chapter. However, it is also important to note, and it was always anticipated, that some companies would find this a challenging agenda. A number of them derive considerable business and customer intelligence from the transaction data they hold [9]. Disturbing this status quo is difficult for some. Others could not see a user demand for returning the data. Some were concerned that customers might lose their data and so compromise themselves, or else allow third parties to use the data, and were concerned about liabilities on such use. Many companies were convinced that this would result in a wave of switching in which consumers would move to the lowest priced offerings. Some complained that this would result, in their words, in a "race to the bottom".3

3 Another way of looking at it is that transparent markets are efficient markets, and in most competitive efficient markets most of the benefit goes to the consumers rather than the producers.


Table 1. The TACT stages of personal data sharing

Transparency: Providers are open and transparent about what customer data they hold.
Access: Providers enable individual customers to have secure, personal access to data held about them.
Control: Providers give consumers the ability to correct, update, change settings, preferences, permissions etc.
Transfer: Data is released back to the individual for re-use. Data can be analysed and consumers can take action.

These concerns notwithstanding, many organisations were keen to explore a world in which consumers were empowered and in which a much richer two-way data interaction existed between themselves and their customers. In November 2011, just six months after the release of the consumer empowerment strategy, the Government launched its detailed vision for midata, with 26 organisations signed up to realise the vision and, in particular, agreeing a core set of principles about data release [10].


1.2. Principles

In order to assuage the concerns of some in the corporate world, the midata programme developed a staged approach to the sharing of data. This is known as "TACT" (Transparency, Access, Control and Transfer), and it describes a set of key stages – a journey – that any organisation would follow in the sharing of personal data with consumers (Table 1). The first stage is Transparency: it requires that organisations are transparent about the customer data that they hold. A key benefit flows from establishing trust via a reputation for honesty and openness. The second and third stages of the TACT journey, Access and Control, further enhance an organisation's reputation as a trusted repository for personal information, but they also create new opportunities to improve data quality and cut costs. The fourth stage, data Transfer, provides many opportunities, as we shall see, for the creation of value through service enhancement and innovation.

Using the TACT approach we have been able to map the impact of its various stages to particular business functions and associated business benefits. For example, Transparency, Access and Control should lead to higher levels of trust, and mean that customers are more likely to be willing to volunteer additional information such as future plans, current preferences and priorities, which in turn could lead to increased customer retention and acquisition. Transfer of data to an individual, who has the opportunity to forward it to third parties or other providers, generates significant market research insights to the benefit of all. For example, it can allow an individual to see their financial affairs in the round, rather than in the silos of bank, credit card, savings and other statements that exist today. This more integrated single customer view provides better insight into a company's share of an individual's financial world, where products and services are succeeding or failing, better customer segmentation, and an ability to gain permissioned access to other personally volunteered information. All of this provides an opportunity for an organisation to cross-sell or up-sell, and for an individual to have a much more comprehensive view of their own propensities, choice options and patterns of behaviour.

Of course, as we begin the TACT journey we are only beginning to gather the evidence to support the contentions and quantify the benefits laid out. It is also interesting to note the range of companies that are now embarked on a TACT journey in the UK: from Tesco to Marks and Spencer, Lloyds Bank to British Gas, MasterCard to Telefonica.




Alongside the general stages of TACT there sit nine core midata principles:


1. Data that is released to customers will be in reusable, machine-readable form in an open standard format.
2. Consumers should be able to access, retrieve and store their data securely.
3. Consumers should be able to analyse, manipulate, integrate and share their data as they see fit – including participating in collaborative or group purchasing.
4. Standardisation of terminology, format and data sharing processes will be pursued as far as possible across sectors.
5. Once requested, data will be made available to customers as quickly as possible.
6. The focus will be to provide information or data that may be actionable and useful in making a decision or in the course of a specific activity.
7. Organisations should not place any restrictions on, or otherwise hinder, the retention or reuse of data.
8. Organisations will work to increase awareness amongst consumers of the opportunities and responsibilities that arise from consumer data empowerment.
9. Organisations will provide customers with clear explanations of how the data was collected and what it represents, and who to consult if problems arise.

It is no accident that some of these principles echo the public data principles [11] that apply to the Government's releasing of non-personal public data. Clearly personal information assets are not open data – they are personal to the individual, and the individual should have control over how widely they are disseminated. However, the power of open innovation on any information asset is amplified by the commitment to the information being available in a machine-readable fashion, by non-restrictive conditions of reuse, by the adoption of open standards, by timely release, and by high-quality metadata describing how the data was collected and what it represents.

On the back of these principles are specific commitments to develop online 'personal data inventories' (PDIs) in each sector. PDIs will describe the types of data an organisation holds about each customer. The proposal is that a consumer would log in to a secure website where their personal data inventory would contain a simple explanation of each category of data available, and whether and how it could be accessed. The personal data inventory would contain data such as address and contact details, existing tariffs/contracts, payment methods and a history of items purchased or services used. Protocols will also be established to handle any issues relating to privacy, data security and consumer protection. Midata is also working with companies to develop common approaches that will allow customers to access their data, including their contact details, current tariffs and contracts, etc., and update basic information about themselves. And it is to progress in this area that we now turn.

1.3. Progress and Examples

In October 2011, following an Energy Summit at No. 10 Downing Street, energy companies in the UK agreed to make midata a reality on their sites [12]. This was prompted by ministerial concerns that consumers did not have sufficient access to their data to allow them to make informed decisions about switching supplier. Scottish Power, EdF, First Utility, Npower, Scottish & Southern Electricity and British Gas now provide customers with electronic access to their consumption and tariff data in a structured, machine-readable format that conforms to the midata principles outlined in the previous section.
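To make the shape of these structured hand-backs concrete, here is a minimal Python sketch of how a third-party choice engine might consume a consumption-and-tariff file of this kind. The column layout and figures are illustrative assumptions for the sketch, not any supplier's actual schema.

import csv
import io

# Hypothetical midata-style hand-back. Real supplier files differ, but all
# are structured, machine-readable records of usage and tariff data.
SAMPLE = """date,kwh_used,unit_price_gbp,standing_charge_gbp
2013-01-31,310,0.145,7.50
2013-02-28,290,0.145,7.50
2013-03-31,260,0.152,7.50
"""

def monthly_costs(handback):
    """Yield (month, cost) pairs computed from a usage/tariff hand-back."""
    for row in csv.DictReader(io.StringIO(handback)):
        cost = (float(row["kwh_used"]) * float(row["unit_price_gbp"])
                + float(row["standing_charge_gbp"]))
        yield row["date"], round(cost, 2)

for month, cost in monthly_costs(SAMPLE):
    print("%s £%.2f" % (month, cost))

Once usage is parsed like this, the same loop can price the customer's consumption under a competitor's tariff – which is precisely the comparison a choice engine performs.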




Figure 2. Southern Electric midata download capability.

When Scottish Power launched its new service for online account holders, Neil Clitheroe, CEO of Scottish Power Retail and Generation, said:


"We want to be open and transparent with consumer data, and build greater trust between Scottish Power and our customers. The midata concept is designed to empower customers when making decisions about their account, and we are pleased to have delivered on the commitment we made to customers at the Energy Summit…"

At this stage the data hand-backs are in the form of simple structured csv data files detailing the tariff and usage data. The ambition is that we see this sort of data being used by choice engines to perform comparisons, or else to provide the sort of insight into behaviour change that we see in social enterprises such as CleanWeb [13] or in for-profit companies such as Opower [14], a $50 million venture-backed US company which in 2011 teamed up with the UK's First Utility. Opower provides a range of behavioural insight and social media tools to enable consumers to understand their energy data and take appropriate action, ranging from energy abatement to switching, from collective purchasing to low-income outreach programmes. New entrants to the energy market in the UK are also embracing midata – for example Marks and Spencer Energy [15].

Another sector targeted by the midata programme is the finance, credit card and banking sector. This is a sector in which online accounts are already familiar to consumers, and this familiarity has certainly increased with the rise of mobile banking applications. A good example of consumer insight around financial data is the Lloyds Money Manager product. Spending is categorised, budgets can be set, and spending profiles analysed. Whilst popular, this only represents a first step: for example, it does not enable individuals to aggregate data from different accounts.
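The categorisation behind a product of the Money Manager kind is conceptually simple. The Python sketch below uses an invented transaction format and keyword rules – real services rely on far richer merchant classification – but it shows the pattern:

# A toy version of spend categorisation. The rules and transaction
# descriptions are illustrative assumptions, not any bank's actual schema.
RULES = {
    "groceries": ("tesco", "sainsbury"),
    "energy": ("british gas", "edf"),
    "transport": ("tfl", "rail"),
}

def categorise(description):
    """Assign a transaction to the first category whose keyword matches."""
    desc = description.lower()
    for category, keywords in RULES.items():
        if any(k in desc for k in keywords):
            return category
    return "other"

transactions = [("TESCO STORES 2041", 54.20), ("EDF ENERGY", 61.00),
                ("TFL TRAVEL CHARGE", 28.40), ("BOOKSHOP", 12.99)]

totals = {}
for description, amount in transactions:
    cat = categorise(description)
    totals[cat] = totals.get(cat, 0.0) + amount
print(totals)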





Figure 3. Opower offers insights to energy customers.

Figure 4. Lloyds Money Manager offers customer insights around spending patterns.

This is also an area where concerns around whether consumers and customers can be trusted with their own data [16], and where liabilities lie if such information is disclosed to a third party, have already been addressed. Similar innovation is happening around credit agency data. Noddle, a part of Callcredit, now provides free lifetime access to an individual's credit data. Here the calculation is that there is more to be gained by establishing a strong customer relationship around an individual's data – particularly in terms of cross-selling and up-selling of products and services around the data that is made available free of cost.

A third sector on which midata focused was telcos, and mobile operators in particular. Such companies have access to some of our most sensitive data: high-resolution location data.





Figure 5. Callcredit's Noddle offers lifetime free consumer credit data to users.

With this data they are beginning to engage with their customers to offer location-based services. O2's Priority Moments is one such example – offering a range of discounts, special offers and promotions as one walks down the high street. The question from a midata perspective is what sorts of additional data hand-backs and insights O2 and other operators will provide for their customers. O2's chief executive Ronan Dunne recently described their thinking:

"Our trial running right now gives customers a digital dashboard sharing with them all the information we have about them, why we have it, what services it is used for. [Open access to data] gives customers the opportunity to take more or less of our different services, based on a better understanding of what information we hold [on them]."

Customers will be able to view their location data, phone usage and sites they have browsed, with access tied to existing services such as O2 Priority Moments. Dunne said:

"If you have an interest in a particular area of shopping and we have a special deal with one of the providers in that area, then we can match those two things together to give you a better experience."





Figure 6. O2 Priority Moments uses personal location data to promote offers.

We can expect to see other significant data access services from UK telcos. However, despite these positive developments, it is still the case that routine data describing usage and tariffs – just the sort we have seen emerge from the energy sector – is still far from routinely available. This is one reason why an innovative third-party comparison site, Bill Monitor, still invests substantial effort in scraping the websites of mobile operators to establish tariff data. Consumers, when they engage with Bill Monitor, provide details of their account credentials; Bill Monitor then compares a customer's usage with tariffs and plans in the market. This company, approved by the UK regulator Ofcom, has shown that a very significant amount of a customer's plan is unused [17] and that more appropriate choices can be made with the data in hand.
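Once the tariff and usage data are in hand, the core comparison is straightforward. Below is a minimal Python sketch under deliberately simplified assumptions – flat monthly plans with inclusive minutes and a per-minute overage rate, whereas real operator tariffs are far more intricate:

# Toy choice-engine logic of the Bill Monitor kind: price actual usage
# under each plan and pick the cheapest. The plan structures are invented
# for the sketch, not real operator tariffs.
PLANS = [
    {"name": "Small", "monthly": 10.0, "incl_min": 100, "extra_per_min": 0.25},
    {"name": "Medium", "monthly": 18.0, "incl_min": 300, "extra_per_min": 0.20},
    {"name": "Large", "monthly": 30.0, "incl_min": 900, "extra_per_min": 0.15},
]

def price(plan, minutes_used):
    """Cost of a month's usage under a given plan."""
    extra = max(0, minutes_used - plan["incl_min"])
    return plan["monthly"] + extra * plan["extra_per_min"]

def best_plan(minutes_used):
    return min(PLANS, key=lambda p: price(p, minutes_used))

usage = 220  # minutes actually used last month
chosen = best_plan(usage)
print("%s £%.2f" % (chosen["name"], price(chosen, usage)))

The point the Ofcom-approved analysis underlines is that the input that matters – actual usage – is exactly the data midata would hand back routinely, removing the need for screen scraping.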





Figure 7. Bill Monitor, a third-party application using personal telco data.

A fourth sector highlighted by midata is retail. Here data analytics have been at the heart of many large retail marketing strategies. We have already mentioned the UK supermarket Tesco. Despite initial reluctance to engage with the midata programme, we have seen recent signs that here too there is a recognition that there are real business benefits in data access services [18]. Tesco is developing products and games to give Clubcard holders "simple, useful, fun" access to their own data, to help them "plan and achieve their goals". But as yet there is little evidence of what this amounts to.

The examples above are largely ones where additional business services are being created by the existing holders of the data for their customers, in order to add value to their products and so retain customers. There are still relatively few examples where competitive or third-party services are driven by that data. At this early stage of the midata programme this could be a market failure (see the later section on legislation), or else a market lag.

An area where customer empowerment is happening via third parties is the seemingly innocuous area of eReceipts [19]. With British retailers issuing some 11.2bn paper receipts, at a cost of £32m, there are clearly some efficiencies to be had. The companies offering mobile and cloud-based storage solutions for our receipts outline a number of other distinct benefits to the consumer, such as proof of purchase, spend profiling, managing insurance claims, and warranty and guarantee registration. For the companies issuing the receipts there are new communication channels, promotional routes and ways of staying in contact with customers. The former chairman of Tesco and current chairman of E-Receipts, Lord MacLaurin, is on record as saying:

"Combining the obvious benefits for retailers, consumers and businesses, with the simplicity of the eReceipts system, we will see the end of the paper receipt."

Participating shops will be able to publish geographically located and time-specific promotions on the created eReceipt, allowing them to create targeted promotions based on customer spending habits and visible trends. This, combined with the cost savings in terms of administration streamlining, the lack of receipt paper and ink, and the reduction in consumer returns fraud, is likely to be a compelling offer. Comprehensive eReceipt records will be a rich personal information resource with many applications for both consumers and business.

Despite these various examples of applications and opportunities arising from returning data to consumers, there has been a degree of scepticism in boardrooms. To address this, the UK midata programme has been keen to show how the development of consumer applications can unlock new ways of using consumer data. The recently launched Open Data Institute (theodi.org) has held two midata "hackathons", one using publicly available data in the healthcare sector and the other using individuals' anonymised financial data [20]. These have brought together sector experts, programmers, marketing people and others at the Institute over a weekend to brainstorm ideas for applications and prioritise a few to work up.





Figure 8. eReceipt company paperlessreceipts.co.uk.

These in turn have been shown to political and business leaders, who, once they see the ideas made concrete, begin to appreciate the opportunities that midata presents. Often these opportunities arise through an inversion of existing business assumptions. One of the winning concepts at a midata/ODI event was MyLoan, a service to enable individuals to build a personal Request for Proposal of their loan requirements. In MyLoan, individuals offer potential lenders broad information sets about their financial status and other related information. The service is founded on trust and reputation, and embodies the proposition that in the past my word was my bond; in the future, my data is my bond. The individual is incentivised to provide accurate data, where accuracy forms the foundation for their current and future reputation, not just with the lender but across their network. The idea is that this therefore also helps enable responsible lending and reduced risk for lenders.

These explorations of the opportunities offered by midata have led to the establishment of the midata Innovation Lab, or mIL – joint work between BIS and the ODI – due to launch in July 2013. The mIL will offer a safe haven bringing together consumers, business and platform developers to explore, over an extended period of time, the opportunities for products and services using personal data. The mIL is a secure and safe development environment where sensitive information is respected and kept secure but used for the purposes of experimentation.

These and other developments are all aimed at stimulating the market and applying the midata principles. Another method that has proven successful in the past is competition with other programmes around the world, and this is the topic of the next section.

2. Related Work

The World Economic Forum has issued two reports [2,21] on the subject of personal data assets, and has encouraged detailed discussion through international "Tiger Teams" on the business, legal and technical requirements for a personal data ecosystem to develop. In France the FING think tank (http://www.fing.org/) has created the Mesinfos group [22] to look at developments in personal data, and is running a pilot whereby a range of different datasets from the private sector are made available to explore the opportunities for new applications and services.





The most developed programme of work outside the UK is the US effort – variously called "smart disclosure" and, more recently, echoing the UK effort, relabelled "mydata". An interesting feature of the US work is that, as well as addressing the commercial world of personal data, the Federal authorities have been looking at a systematic programme of work to give control of personal data back to the citizens to whom it relates.

The single most successful example here has been the "Blue Button" initiative. Blue Button, together with the slogan Download my Data, was introduced by the US Veterans Association (VA) in 2010. In August 2012 the millionth download was reached, and the initiative has had high-level backing and support within the US Administration, all the way up to President Barack Obama. Todd Park, the current US CTO, was at the Department of Health when Blue Button began and has been a major advocate. In January 2013, the VA announced plans to significantly expand Blue Button by adding: demographics, active problem lists, discharge summaries, progress notes, lab results, vitals and readings, pathology reports, radiology reports and more. Also announced was a new standardised structure for what is called a VA Continuity of Care Document (CCD). The VA CCD contains a summary of the Veteran's essential health and medical care information in an XML file format – and as such the goal is for it to be a portable and persistent personal information record.

Blue Button has become so successful that the US Department of Health and Human Services now oversees the concept. It is seen as promoting easy, secure electronic access by an individual to their health information, and sharing between that individual's medical practitioners and health services. It has been adopted by a range of public and private healthcare providers. Aetna announced in September 2011 [23] that it had added the Blue Button function to its patient portal, and in addition offered its beneficiaries the ability to share their Blue Button downloads with Aetna providers. At the time, Aetna said it served more than 36 million people. In October 2011, McKesson Corporation's Relay Health Division added Blue Button functions to the patient portals which it offers through its 200,000 physician and 2,000 hospital clients [24]. These represent slightly less than one third of physicians and slightly more than one third of hospitals in the United States [25]. United Health Group began offering Blue Button downloads to its commercial health plan beneficiaries in July 2012, rolling out the capability to its customers [26]. The company expects 26 million plan beneficiaries will have access to Blue Button downloads by mid-2013.

Blue Button's success has led another sector to follow suit in the US. In September 2011, former US CTO Aneesh Chopra challenged the energy industry to model a Green Button [27], whereby energy providers would give energy users their consumption data in an easy to read and use format at the click of a button. Green Button is based on a common technical standard developed in collaboration with a public-private partnership supported by the Commerce Department's National Institute of Standards and Technology (see http://www.greenbuttondata.org/greendevelop.aspx).
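The Green Button data itself is an Atom/XML format defined by the ESPI standard. The flattened sample in the Python sketch below is only a stand-in to illustrate the parsing pattern – the element and attribute names here are invented for the sketch, not the real schema:

import xml.etree.ElementTree as ET

# Simplified stand-in for a Green Button interval feed: a list of metered
# readings (start time, duration in seconds, energy in watt-hours).
SAMPLE = """<usage>
  <reading start="1357027200" duration="3600" value="1540"/>
  <reading start="1357030800" duration="3600" value="1210"/>
  <reading start="1357034400" duration="3600" value="980"/>
</usage>"""

root = ET.fromstring(SAMPLE)
total_wh = sum(int(r.get("value")) for r in root.findall("reading"))
print("total consumption: %.2f kWh" % (total_wh / 1000.0))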
In January 2012, two major California utilities, Pacific Gas & Electric and San Diego Gas & Electric, announced their implementation of Green Button. Recently, nine additional major utilities and electricity suppliers signed on to the initiative, committing to provide more than 15 million households secure access to their energy data with a simple click of an online Green Button.





Figure 9. US Blue Button Initiative.

In total, these commitments ensure that 27 million households will be able to access their own energy information. One consequence is that energy customers can manage their consumption via their smart phones using the standard Green Button data format [28].

We have already noted that US efforts seek to empower both the citizen and the consumer. In May 2013 the White House announced the latest progress on its Smart Disclosure programme [29] and its consumer.data.gov site, which hosts data and resources to enable individuals to make more informed choices [30]. Other governments would do well to note this important development.

3. Legislation and Personal Data

Both the UK and US work described here has had to proceed against a background of different positions and assumptions as to the extent to which legislation should be used to establish our rights and protections with respect to personal data. The law around personal data is complex, fast evolving, and varies from jurisdiction to jurisdiction. Indeed, the fundamental notion of data ownership can be subtle and complex.




In many cases of "explicitly contributed" data the convention is that the user owns the IPR but grants the company a wide licence to use it – for example, this is how Google's Terms and Conditions are set up. In the case of implicitly contributed or third-party sensed data the ownership may not be entirely obvious. In general, data is regarded as non-rivalrous: if a company has data about an individual and makes that data available to the individual, it is still data the company has. "Ownership" of data assets, because of the non-rivalrous nature of data, is less definitive than for other assets. For these and other reasons legislative protection is often cast in terms of privacy.

In Europe the right to privacy is deeply enshrined in law. All EU states are signatories to the European Convention on Human Rights, and Article 8 provides a right to respect for an individual's "private and family life, his home and his correspondence". This has been given a very broad interpretation by the European Court of Human Rights. It is against this context that the European Union developed its Data Protection Directive [31], which regulates the processing of personal data within the member states. The Directive was also concerned to facilitate a single market for the free movement of such data. The Directive embodied a number of principles that had been framed in earlier work by the OECD [32]. The principles were as follows:


1. Notice – data subjects should be given notice when their data is being collected;
2. Purpose – data should only be used for the purpose stated and not for any other purposes;
3. Consent – data should not be disclosed without the data subjects' consent;
4. Security – collected data should be kept secure from any potential abuses;
5. Disclosure – data subjects should be informed as to who is collecting their data;
6. Access – data subjects should be allowed to access their data and make corrections to any inaccurate data;
7. Accountability – data subjects should have a method available to them to hold data collectors accountable for following the above principles.

The Directive has been a cornerstone for the way in which personal information is managed in the EU. It led to member states setting up Information Commissioners of various sorts. It established the concept of a data controller within an organisation who ultimately had obligations and responsibilities to deal with personal information in line with the Directive.

With the passage of time it became evident that the Directive was in need of updating. It had not foreseen a world of Cloud-based computing services where an individual's data may be collected in one jurisdiction, processed in another and stored in a third; nor did it foresee a world in which individuals could be their own data controllers, nor indeed one in which the only practical method of providing access to data was through a machine-readable format. Most subject access requests that are invoked under the current Directive involve letters, fees and the provision of data back to an individual on reams of paper.

On 25 January 2012, the European Commission unveiled a draft European General Data Protection Regulation [33] that is planned to supersede the Data Protection Directive. From a midata perspective, a critical component in the original draft is Article 18. This introduces the data subject's right to data portability, i.e. to transfer data from one electronic processing system to and into another without being prevented from doing so





by the controller. As a precondition, and in order to further improve the access of individuals to their personal data, it provides the right to obtain from the controller those data in a structured and commonly used electronic format. This is extremely close to the midata principles enunciated earlier.

As of May 2013 the new draft Regulation is subject to over 3,000 proposed amendments, and it is likely to take some time before a definitive timetable for its adoption is known – the ambition had been for it to become law in the member states by 2016. There are also concerns around other elements of the Regulation that many fear will impose onerous burdens on those organisations handling personal data [34], the net effect of which may be to impede innovation around personal data services.

In the UK a rather different legislative route has been taken. To date, the midata programme has been proceeding on a voluntary basis. The programme has shown how consumer empowerment through data release can operate. We have seen that progress has been made on establishing a vision and principles. There are now concrete examples of midata being adopted by corporates. Companies have started to make data available. This initial promise convinced the UK Government that more could be done to unlock the benefits of this data revolution. As a consequence the UK Government consulted between July and September 2012 on the possibility of taking an order-making power which, if used, would compel suppliers of goods and services to provide to their customers, on request, historic transaction data [35]. The Government's response to the consultation was published on 19 November [36]. It concluded that there was broad support for the principles of midata and a recognition of the potential of the data that is released to stimulate the market for data services and advice. It became clear that the Government's view was that where businesses choose to collect information about an individual's consumer transaction history which can be linked to that consumer, that individual should be able to access their own transaction data in a portable electronic format.

At the end of 2012 the UK Government was pursuing a twin-track approach: looking for progress with the current voluntary approach, whilst seeking to establish an order-making power4 as soon as the Parliamentary timetable allowed. On 24 April 2013 an amendment was made to the Enterprise Bill which saw the order-making power pass into law. The legislation is initially targeted at specific sectors, namely energy, the mobile phone sector, current accounts and credit cards. The Government retains the possibility of regulating other sectors should there be an evident need. Other than in these specific sectors, the Government intends that before regulations are made in relation to a particular sector, Ministers will need to be satisfied that it is appropriate to regulate that sector. The factors governing such a decision will include a situation where:

1. the market is not working well for consumers, for example where consumers find it difficult to make the right choice, or where an individual's behaviour affects the price they pay but it is difficult for that person to predict what their behaviour will be;
2. there tends to be a one-to-one, long-term relationship between the business and the customer with a stream of ongoing transactions;




3. consumer engagement is currently limited, as evidenced by low levels of switching between tariffs, account types or providers, or where competition needs promoting; and
4. the sector does not voluntarily provide transaction/consumption data to customers at their request in portable electronic format.

4 An order-making power is a power to make provisions by means of secondary legislation if certain circumstances arise.


The provision of this legislation sends a powerful message that the UK Government is serious about empowering consumers with their own personal information. It has explicitly not sought to establish a dependency between this legislation and anything that emerges from the draft Data Protection Regulation – although any revised form of Article 18 would provide additional empowerment to the individual.

In the US, until 2012 there was no Federal legislation proposed or extant around personal data protection. The approach to the use of personal data was not subject to comprehensive Federal statutory protection, because most Federal data privacy statutes apply only to specific sectors, such as healthcare, education, communications and financial services or, in the case of online data collection, to children. On 23 February 2012 the White House published a Consumer Privacy Bill of Rights – here, for the first time, we saw a framework for protecting consumer information and privacy rights. It contained many of the OECD features described above. It holds that, with respect to their personal information, consumers have a right to: (i) Individual Control, (ii) Transparency, (iii) Respect for Context, (iv) Security, (v) Access and Accuracy, (vi) Focused Collection and (vii) Accountability.

Echoing the precepts of midata, the first principle relates to Individual Control. This is understood within the Consumer Privacy Bill of Rights to mean that consumers have a right to exercise control over what personal data companies collect from them and how they use it. It means that companies should provide consumers with appropriate control over the personal data that consumers share with others, and over how companies collect, use, or disclose personal data. In terms of access, the Bill also embodies the Administration's own commitment to:

"publishing data on the Internet in machine-readable formats to advance the goals of innovation, transparency, participation, and collaboration."

At the request of the White House, the Commerce Department was asked to begin convening companies, privacy advocates and other stakeholders to develop and implement enforceable privacy policies based on the Consumer Privacy Bill of Rights. This work is ongoing. One of its principal motivations was:

"the goal of increased international interoperability as a means to provide consistent, low-barrier rules for personal data in the user-driven and decentralized Internet environment."

What is noteworthy is that in both the EU and the US, the work on the legislative protection of personal information, and on the empowerment of individuals through access to that same information, seeks to provide comprehensive privacy and security safeguards.

4. Technology and Personal Data

The personal data asset revolution is being driven by a number of converging technologies. For 50 years the computing power on a specific area of material has doubled roughly every two years – following what has come to be known as Moore's Law. We have witnessed real exponential rates of change: in 1972 the Intel 8008 microprocessor had 3,500 transistors; forty years later the Intel Ivy Bridge processor contains 1.4 billion.
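Those two transistor counts are consistent with the "roughly every two years" doubling just quoted. A quick Python check of the implied doubling period, using only the figures given in the text:

import math

# Doubling period implied by the Intel 8008 (1972, ~3,500 transistors)
# and Ivy Bridge (2012, ~1.4 billion transistors).
t0, t1 = 3500, 1.4e9
years = 2012 - 1972
doublings = math.log(t1 / t0, 2)  # about 18.6 doublings
print("one doubling every %.2f years" % (years / doublings))  # ~2.15 years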


every two years – following what has come to be known as Moore’s Law. We have witnessed real exponential rates of change: in 1972 the Intel 8008 microprocessor had 3,500 transistors; forty years later the Intel Ivy Bridge processor contains 1.4 billion. These rates of change apply at every level in computing, not just to how many components we can fit on a chip or the minimum feature sizes used to build the chips in our computers and electronic devices; they apply to the speed of a microprocessor, and they apply to the amount of information we can store in our memory technology. Just as dramatic is the reduction in cost, which has also been halving at exponential rates for the equivalent amount of capability.

This increase in computing power and storage is one reason that Google is able to index billions of Web pages and serve them up in microseconds, ranked by relevance, using the search terms from millions of users. It is one reason that the Internet and World Wide Web exist at all. It is this increase in computing power that has allowed us to build personal information storage devices with larger capacities than all the printed books in the Library of Congress.

The emergence of mobile computing, and the extent to which we now consume and generate content from the device formerly known as the mobile phone, owes much to Moore’s Law. Ofcom calculates that 58% of the UK population now owns a smartphone: a device with enough computing power and data storage to have rated as a powerful desktop computer 10 years ago. We are seeing an exponential increase in the number of computing devices that are capable of connecting to the Internet and of producing and consuming data. According to one study [37], the number was set to reach 9.6 billion by the end of 2012, and 28 billion by the end of 2020. Ranging from simple sensors to smartphones, individuals are carrying many devices capable of connecting to the Internet – devices that in turn interact with a huge range of other Internet-connected objects. Known as the Internet of Things, this will itself be a deeply disruptive development.

The increase in computing power and its availability has had a dramatic impact on the world of data and information – as we have noted, they have become superabundant resources. It is an era where users are the largest creators of content, far surpassing governments and enterprises. It is the main reason we are even discussing personal data as a new asset class, as a new and disruptive feature of our digital lives.

It is perhaps unsurprising, then, that a number of Personal Data Store technologies and supplier companies have emerged too. However, this is still a relatively immature market [38]. Unpublished data from the UK Government’s midata programme indicates that most individuals are not particularly interested in ‘managing their own data’ for its own sake.5 They are nervous about security and whether they can trust the technology or businesses with their data. However, it is clear that these same individuals are generating large amounts of personal digital information every day. And there is clear evidence that they would welcome services that make it easier for them to manage this data better, and to have more control over their data [39]. The technology of Personal Data Stores is dominated by start-up companies, backed by personal or venture capital.
Some of the key players in the rapidly emerging market for personal data stores include Allfiled, Azigo, Glome, Mydex, MyInfosafe, Paoga, Personal, the Personal Information Brokerage (PIB), Pidder, Privowny, Qiy, Singly and IndX (formerly known as WebBox). Outside of the start-ups there is some, but limited, activity from large corporations: AT&T, O2 (Telefónica) and Microsoft are among those developing some form of personal data service. As a recent report from Ctrl-Shift, a consultancy specialising in the personal information market, states: “Finding the ‘sweet spot’ of convenience, utility and control is a key challenge for PDSs” [38].

5 Such research does not generally reveal, though, the innovations that will make the real differences in our lives: Henry Ford is reputed to have said that his customers wanted a faster horse.


It is not possible to engage in a thorough review of the underlying technologies of these various platforms in this article. However, we can discern very distinctive strands in the various PDS platforms. Some emphasise data security (MyInfosafe) whilst others highlight data sharing (Glome). These are not exclusive categories, and many platforms would emphasise both (IndX, Mydex and Paoga). Some platforms are differentiated in terms of the locus of data sharing: for example, peer-to-peer data sharing (Singly) versus information sharing with organisations/service providers (Azigo, Mydex, Paoga, Personal). The source of the data can also differentiate the focus of a PDS. For example, Glome and Pidder focus on self-generated content such as profiles and logs, whereas Qiy deals with the information gathered and held by organisations about individuals. The use to which the data is put is another differentiator: for Paoga it is about verified credentials, whilst for PIB-d it is education records, and for Azigo it is the data needed to generate targeted offers. Services like Mydex and Personal are more focused on ‘life management’ tasks.

The base technology in these products varies widely: from proprietary protocol stacks with which app developers need to integrate, through to open-standard offerings using HTTPS and the like; from proprietary rule-based knowledge-representation languages, through to open-standard Semantic Web formats like RDF; and from proprietary identity-authentication methods, through to variants of OAuth, the IETF-endorsed open authorization protocol. There is also work on privacy-enhancing technologies, which limit which data is exposed and whether and how that data can be retained after processing [40]. At the moment the market has no clear dominant offer, but there is much activity, and we can expect acquisition and consolidation as well as new market entrants.

5. Concluding Remarks

Within 10 years, most of the world will be blanketed with high-speed mobile networks. There will have been a switch to mobile as the main interaction method for many digital services, including contactless payments. Smartphones will evolve to become personal information storage devices and will themselves be as powerful as today’s laptops. They will continuously and wirelessly connect to a set of IP-connected devices that together form a computing environment that will monitor our health, direct our movements, and deliver video and other multimedia content. Connected continuously to cloud-based computing services, we will have access to voice-activated, context-specific search results offering suggestions of where to eat, what to see, what to buy. Mobile devices will connect me to a wide range of Government services, from my medical records to NHS services, education resources to benefits, welfare and tax. The device will also connect me more widely to the Internet of Things: devices in homes, cars, the workplace, and shops. The device will connect to similar devices carried by others, forming a mesh of social interactions and transactions.


Figure 10. Google Glass.

A foretaste of what this world will look like is the Google Glass project, due to ship a device with many of the features above later in 2013. The prototype shown in Figure 10 is claimed to be capable of voice-activated translation, route finding, video recording and information retrieval. This new type of product will be as disruptive as smartphones and tablets, and health and leisure apps will accelerate the personal information revolution.

The emergence of a Personal Information Asset market will see substantial innovation around Personal Information Management Technology. The data generated will yield insights about an individual’s patterns of behaviour and propensity to buy, and serve to anticipate their intentions [6]. This will build on the work of the midata programme described here. Citizens and consumers will be able to gain machine-readable access to their own data from both governments and corporates, and in many cases exercise control over who uses it for what purpose. Data-empowered consumers and citizens will become a new norm.

It is often argued that as a society we have become less preoccupied and concerned with what happens to our personal data, and indeed with privacy in general [41]. There are studies that show just how easily individuals will part with personal data – often in return for trivial incentives. However, there is equal concern in many quarters that this perfect storm of technology, ubiquity of data and corporate data mining is a fundamental challenge to a human right – the right to privacy – to be maintained and defended at every opportunity.

Against this backdrop, programmes like midata must steer a careful course. The midata principle is to empower people through control of their own data assets. This provides for a notion of securing one’s own data from unauthorised or unwarranted use. The ambition is to equalise what was earlier described as an information asymmetry. However, there exists a plurality of views as to how far this can and should go; these views arise from perceptions of what is culturally and socially desirable. What the technology does do is provide a means to generate, collect, integrate and analyse data at


scales never before imagined.

The midata programme is also about economic and market improvement. We need to allow the individual to get much more of the value of the data about them, and because data is non-rivalrous this can be done without reducing the value of the data to the data holder. Moreover, it can be done in ways which respect the privacy, commercial and other interests of all the parties concerned.

If there is one area where governments could and should take a lead, it is to give citizens access to their data: giving access to data held in the public sector in the same way that they are asking business to do. Currently in the UK there is only the subject access right under the Data Protection Act. This is burdensome for public agencies to operate, and there is no concept of co-creation of data. Moreover, one of the concerns regarding government and personal data is the extent to which sharing occurs with no reference to the individual. A midata approach would enable citizens to see the data about them and to propose corrections if it were wrong. A midata model could serve as a mechanism for making public sector data sharing more effective and more acceptable. It would also allow citizen-mediated sharing – if a citizen wanted Department B to take full regard of the data held by Department A, then the citizen could get the data, or else a token, from Department A under midata and give it to Department B with his application for Department B’s services. This would have the twin benefits of handling many of the “benign” cases which data sharing advocates cite, and allowing more scrutiny of more intrusive data sharing.

The move to more engaged, digitally empowered consumers and citizens continues a trend towards prosumerism. Alvin Toffler’s concept of prosumerism foresaw proactive consumers actively engaged in improving or designing goods and services. This, in some respects, holds the greatest promise for midata: a world in which the products and services offered by organisations have to adapt to, and be adapted by, the individual, group or collective.

The world of data-empowered consumers and citizens is inevitable. The question is how quickly and how equitably this new settlement emerges. The midata programme is simply the beginning of a fundamental change in how individuals connect and relate to those organisations that provide them with products and services.

References

[1] F. Gens, “IDC Predictions 2012: Competing for 2020,” Framingham: IDC, Tech. Rep., 2011.
[2] C. Kalapesi, “Unlocking the value of personal data: From collection to usage,” World Economic Forum, Tech. Rep., 2013.
[3] A. Mitchell, L. Brandt, and W. Heath, “The new personal communication model: The rise of volunteered personal information,” Ctrl-Shift, Tech. Rep., 2009.
[4] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A. H. Byers, “Big data: The next frontier for innovation, competition, and productivity,” McKinsey Global Institute, Tech. Rep., June 2011.
[5] J. Rose, “The value of our digital identity,” Boston Consulting Group, Liberty Global, Tech. Rep., November 2012.
[6] D. Searls, The Intention Economy: When Customers Take Charge. Harvard Business School Publishing, 2012.
[7] BIS, “Better choices: better deals,” Dept. of Business, Innovation and Skills, Tech. Rep., 2011.
[8] R. H. Thaler and W. Tucker, “Smarter information, smarter consumers,” Harvard Business Review, January–February 2012.
[9] C. Humby, T. Hunt, and T. Phillips, Scoring Points: How Tesco Continues to Win Customer Loyalty, 2nd ed. London, UK: Kogan Page Limited, 2007.
[10] BIS/11/2011 (2011, November). The midata vision of consumer empowerment. [Online]. Available: https://www.gov.uk/government/news/the-midata-vision-of-consumer-empowerment.
[11] CO/06/2012 (2012, June). Public sector transparency board: Public data principles. [Online]. Available: http://data.gov.uk/sites/default/files/Public%20Data%20Principles_For%20Data.Gov%20(1).pdf.
[12] No10 (2011, October). PM energy summit. [Online]. Available: https://www.gov.uk/government/news/pm-energy-summit.
[13] J. Neylon (2012, November). Instant personalised energy saving advice using midata. [Online]. Available: http://empowermi.cleanweb.org.uk/.
[14] Opower (2012, June). Providing insight, protecting privacy, and putting you in control. [Online]. Available: http://opower.com/company/data-principles.
[15] MandS (2012, December). The key to your personal data: helping you access and manage your personal data: midata. [Online]. Available: http://www.mandsenergy.com/help/midata/.
[16] K. Oxtoby (2012, March). Midata minefield. [Online]. Available: http://www.utilityweek.co.uk/news/news_story.asp?id=196625&title=Midata+minefield.
[17] Billmonitor (2012, June). The billmonitor.com national mobile report 2012. [Online]. Available: http://www.billmonitor.com/billmonitor-national-mobile-report-2012.
[18] R. Burns (2012, October). Tesco plans to open up data with Clubcard Play scheme. [Online]. Available: http://www.marketingmagazine.co.uk/article/1152799/tesco-plans-open-data-clubcard-play-scheme.
[19] B. Milligan (2012, September). Bid to transfer receipts from paper to online. [Online]. Available: http://www.bbc.co.uk/news/business-19599978.
[20] E. Thwaites (2012, November). Inspiration from personal data at the ODI’s midata hackathon. [Online]. Available: http://theodi.org/news/inspiration-personal-data-odi%E2%80%99s-midata-hackathon.
[21] M. Luzzi, “Personal data: The emergence of a new asset class,” World Economic Forum, Tech. Rep., 2011.
[22] (2012, January). “MesInfos”: Experimenting the sharing of personal data between business and consumers. [Online]. Available: http://doc.openfing.org/MesInfos/MesInfos_Presentation_V5EN.pdf.
[23] Aetna (2011, September). Aetna makes it easier for members to share personal health information with care providers to improve quality care. [Online]. Available: http://investor.aetna.com/phoenix.zhtml?c=110617&p=irol-newsArticle&ID=1606005&highlight=.
[24] RelayHealth (2011, October). RelayHealth gives patients immediate access to patient health record data via Blue Button. [Online]. Available: http://www.relayhealth.com/news-and-events/press/RelayHealth-gives-patients-immediate-access-to-patient-health-record-data-via-Blue-Button.html.
[25] AHA (2013, January). Fast facts on US hospitals. [Online]. Available: http://www.aha.org/research/rc/stat-studies/fast-facts.shtml.
[26] UnitedHealth (2012, July). Blue Button goes viral: UnitedHealthcare promotes importance of personal health records to millions of enrollees. [Online]. Available: http://www.unitedhealthgroup.com/newsroom/articles/news/unitedhealthcare/2012/0705bluebutton.aspx.
[27] A. Chopra (2011, September). Modeling a green energy challenge after a blue button. [Online]. Available: http://www.whitehouse.gov/blog/2011/09/15/modeling-green-energy-challenge-after-blue-button.
[28] J. Witkin (2012, January). Pushing the green button for energy savings. [Online]. Available: http://green.blogs.nytimes.com/2012/01/20/a-phone-app-for-turning-down-the-thermostat/#.
[29] J. Holdren, “Smart disclosure and consumer decision making: Report of the task force on smart disclosure,” Executive Office of the President, National Science and Technology Council, Tech. Rep., May 2013.
[30] S. Raseman and N. Sinai (2013, February). Consumer.data.gov is live! [Online]. Available: http://www.whitehouse.gov/blog/2013/02/11/consumerdatagov-live.
[31] EC/95/46, “Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data,” November 1995. [Online]. Available: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:en:HTML.
[32] OECD1980, “Guidelines on the protection of privacy and transborder flows of personal data,” September 1980. [Online]. Available: http://www.oecd.org/internet/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsofpersonaldata.htm.
[33] EC/2012/0011, “Proposal for the EU General Data Protection Regulation,” January 2012. [Online]. Available: http://ec.europa.eu/justice/data-protection/document/review2012/com_2012_11_en.pdf.
[34] P. Masons (2012, August). UK submits concerns over proposed data protection reforms. [Online]. Available: http://www.out-law.com/en/articles/2012/august/uk-submits-concerns-over-proposed-data-protection-reforms-/.
[35] BIS/12/943 (2012). Midata 2012 review and consultation. [Online]. Available: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/32687/12-943-midata-2012-review-and-consultation.pdf.
[36] BIS/12/1283 (2012). Midata: government response to the 2012 consultation. [Online]. Available: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/43392/12-1283-midata-government-response-to-2012-consultation.pdf.
[37] IMSResearch (2012, October). Internet connected devices approaching 10 billion, to exceed 28 billion by 2020. [Online]. Available: http://www.imsresearch.com/press-release/Internet_Connected_Devices_Approaching_10_Billion_to_exceed_28_Billion_by_2020&cat_id=113&type=LatestResearch.
[38] A. Mitchell, “Personal data stores: a market review,” Ctrl-Shift, Tech. Rep., 2012.
[39] Jigsaw, “Potential consumer demand for midata: Findings of qualitative and quantitative research,” BIS/12/976, Tech. Rep., 2012. [Online]. Available: https://www.gov.uk/government/publications/potential-consumer-demand-for-midata-findings-of-qualitative-and-quantitative-research.
[40] Y. Shen and S. Pearson, “Privacy enhancing technologies: A review,” HP Labs, Tech. Rep. HPL-2011-113, August 2011.
[41] J. Rauhofer, “Privacy is dead, get over it! Information privacy and the dream of a risk-free society,” Information and Communications Technology Law, vol. 17, no. 3, pp. 185–197, 2008.

Part V


Personal Data Management: Examples and Overview


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-227

A User-Centred Approach to the Data Dilemma: Context, Architecture, and Policy

M.-H. Carolyn NGUYEN, Peter HAYNES, Sean MAGUIRE and Jeffrey FRIEDBERG1
Microsoft Corporation, United States

1 1 Microsoft Way, Redmond, WA 98052; E-mail: {cnguyen,peterhay,smaguire,jeffreyf}@microsoft.com.

Abstract. Data is rapidly changing how companies operate, offering them new business opportunities as they generate increasingly sophisticated insights from the analysis of an ever-increasing pool of information. Businesses have clearly moved beyond a focus on data collection to data use, but users have only an inadequate model of notice and consent at the point of data collection to limit inappropriate use. An interoperable metadata-based architecture that allows permissions and policies to be bound to data, and is flexible enough to allow for changing trust norms, can help balance the tension between users and business, satisfy regulators’ desire for increased transparency, and still enable data to flow in ways that provide value to all participants in the ecosystem.

Keywords. Metadata, big data, personal data, data protection, privacy, context, value exchange, trust, trustworthy data practices, interoperability, architecture


Introduction

Data is rapidly becoming the universal currency of our economy, a digital good whose value does not diminish with use, and whose benefits are realized only when it can flow, not when it remains idle. Its contributions to economic growth and societal benefits are limited only by the innovations it inspires. When appropriately used, big data holds extraordinary potential for new innovations, increased investment, economic growth, and greatly improved social welfare. Unleashing this potential will require policy frameworks that enable data – including, as appropriate, personal data – to flow, be analyzed, and be exchanged freely across geopolitical boundaries, while minimizing risks and harms to individuals and enterprises globally. Regulations that restrict this flow could hamper new innovations, services, and business models – resulting in significant missed opportunities for all concerned. Achieving an acceptable balance will require a dialog among all major stakeholders in the data ecosystem, along with new policy and technical approaches to how we think about and appropriately manage the flow of information.

This paper will make the case that technology can help achieve this goal. Using what is known as an interoperable metadata-based architecture, user permissions and policies can be permanently associated with data, enabling any entity handling that data to do so in accordance with a user’s wishes. Such a metadata-based approach can also enable users to change their preferences and permissions over time, prevent undesirable use of previously collected data, address unanticipated uses, and reflect changing norms. It can be a highly effective way to strengthen enforcement of user permissions in a decentralized data ecosystem.

The paper is organized as follows. Sections 1 and 2 discuss the challenges posed by big data, and increasing regulatory concerns over perceived risks to individuals. Section 3 presents evidence of how users actually define the context in which their personal data is being used. Sections 4, 5 and 6 consider how a metadata-based architecture can address some of the challenges posed, while enabling more context-aware data usage. Section 7 pulls all these elements together and shows how such an approach, when implemented as part of an appropriate policy framework, can enable data to flow while helping to satisfy the interests of regulators, users, and industry.


1. The Data Dilemma: Big Data, Analytics, and Their Societal and Economic Value

The world is already awash with data – by one estimate, almost three zettabytes (three billion terabytes) of information had been created by 2012, a digital deluge that is growing at around 50% per year and still accelerating [1]. Recent decades have seen a confluence of multiple technology enablers, including drastic improvements in processing power; ready availability of that power to process data relatively cheaply in vast cloud datacenters; a three-million-fold decrease in storage costs since 1980 [2]; a plethora of smart gadgets (from simple sensors to multimedia-enabled devices) that can be deployed pervasively to collect all types of data; and ubiquitous network access. By the end of 2013, the number of mobile connected devices will exceed the number of people on the planet [3]. By 2020, an estimated 50 billion devices will be wirelessly connected to the Internet [4]. At the same time, from 2012 to 2017, machine-to-machine traffic will grow an estimated 24 times, to 6 × 10^17 bytes per month – a compound annual growth rate of 89% [3]. The majority of big data will be collected passively and automatically, through machine-to-machine transactions, and users will not actively be involved in most of those transactions. This has multiple policy implications that will be addressed below.

There is no official definition of big data, but essentially it refers to petabytes of information that can’t be processed by a single machine, and so require cloud resources to store, manage, and parse. Many think about big data in terms of its defining characteristics: the volume of data being created, the velocity at which it is generated and must be analyzed, and the variety of types of data involved – the three v’s of big data. But increasingly there is also a fourth v – the value derived from new insights and knowledge gained from applying advanced machine learning and analytics to big data sets. As the World Economic Forum observes in a recent report [5], the insights that can be gained from the aggregation and analysis of data demonstrate extraordinary potential for new innovations, societal benefits, and economic growth.

The societal value of big data is readily identified through multiple examples. Predictive models developed from large-scale hospital datasets can be used to identify patients who are at the highest risk of rehospitalization after discharge. Microsoft applied machine learning to a multi-year dataset of patient hospitalizations at the Washington Hospital Center in Washington, DC, to reveal previously undetectable risk factors. For example, if a patient is admitted for congestive heart failure, she is more likely


to be readmitted within 30 days if she is also depressed or taking drugs for gastrointestinal disorders. In another example, the UN Global Pulse, formed in 2009, is developing a Crisis Mapping Data Taxonomy to explore how data can better achieve development goals [6]. Its sources include online search data, blogs, online news services, and mobile phone data, all collected and analyzed in real time. Anticipated uses include tracking population movement patterns in the aftermath of a disaster or a disease outbreak, and real-time feedback on where development programs are not delivering results. And in Germany, Car-to-X (C2X), a collaboration between the government, automotive sector, communications companies, and research institutes, is exploring how vehicles can exchange information with each other and with the road-network infrastructure (e.g., traffic lights) [7]. C2X can alert drivers to potential traffic hazards, while traffic light systems can be reconfigured in real time to improve traffic flow.

It is also worth noting the difference between personal and socially beneficial uses of data. For example, the personal value of the Washington Hospital Center initiative is improved treatment for the patient – and this clearly has direct monetary value in the form of reduced costs and better outcomes. But socially beneficial uses can also create value – e.g., improved medical protocols or identification of new treatment regimes. Parsing exactly how and where big data has the most impact is crucial to policymakers as they consider how potential data uses should be regulated.

The economic value of data is equally evident but harder to enumerate, for two reasons. First, much of the value created doesn’t involve explicit market transactions. The use of personal data to avoid duplicative testing, misdiagnosis, and other diagnostic inefficiencies in the US healthcare sector is clearly producing a real economic benefit, but attributing this benefit directly to data involves some inspired approximation; one estimate puts the savings in this case at up to $300 billion [8]. Second, the methodologies used to value data vary widely. Among those used are revenues or net income per datum/user (the challenges here are obvious: for instance, $4 per personal datum with near-zero profitability is very different from $4 per personal datum with 40% net profit) and the market prices at which personal data is offered and sold. In April 2013, the OECD published a report [9] that examines these issues in depth.

2. The Balance of Power Between Users and Industry – And Regulators’ Growing Interest

For much of the past decade, consumers have proved largely sanguine about what happens to their information online. That said, some people are clearly starting to worry about their growing reliance on technologies that impact their lives in ways they don’t understand, and often can’t even know. A recent poll [10] by Eurobarometer, the Public Opinion Analysis unit of the European Commission, found that although 74% of European Union (EU) citizens consider the disclosure of personal information an increasing part of modern life, 62% don’t trust Internet companies to protect their data. Moreover, 75% believe that they should be able to delete personal information stored on a website – often referred to as “the right to be forgotten.” Today, there is no reliable way for individuals to do this – or, for the most part, to modify previously granted permissions – a situation we believe could be helped by using an approach that


combines advanced metadata-based technologies, a new approach to policy, and industry codes of conduct. This is addressed in more detail below.

Regulators, too, are increasingly concerned at what they see as a growing imbalance between data-dependent companies and individuals. The European Commission’s proposed “regulation on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation)” (GDPR) [11], published in January 2012, would, if adopted, limit collection of personal data to “the minimum possible” (often known as the “data minimization” principle); require “data controllers” (any entity controlling the use of personal data) to give consumers more rights to opt in or out of potential uses of their data; and introduce “right to be forgotten” measures that are, in the reality of today’s digital ecosystems, likely to be difficult both to implement and to enforce.

The EU and a number of nation states have also increasingly moved towards recognizing data privacy as a fundamental human right. Indeed, the right to privacy and data protection is already guaranteed under the International Covenant on Civil and Political Rights and the European Convention on Human Rights; it has also been enshrined in the EU Charter of Fundamental Rights, and as a fundamental right in the constitutions of some nation states (including Argentina, Brazil, and Mexico). Conversely, policymakers in the United States have taken a more laissez-faire and piecemeal approach to data privacy, enacting laws that separately limit the use of medical records, credit scores, video rental data, and other information, and relying on entities that use personal and other data to protect it in the interests of their own business reputations.

The challenge for regulators everywhere is how best to protect individuals without destroying data’s apparent socio-economic potential. For example, much of the EU’s proposed GDPR is predicated on the imposition of additional restrictions when data is collected – effectively extending today’s “notice and consent” approach in a way that could slow both the flow of data and the pace of innovation. In reality, it’s not the collection of data that is the primary source of potential harm, but its unconstrained and/or inappropriate use. Yet what is considered acceptable use is very much a personal preference. User attitudes towards personal data are addressed in more detail in the next section.

3. Context-Aware Data Usage and User Perceptions

Existing approaches to data protection regulation focus on collection, and treat data as binary – it is either considered personal data or not, and users either do or don’t give consent to the use of that data. An alternative approach, “contextual integrity”, developed by Helen Nissenbaum in [12] and [13], describes information sharing in terms of appropriate context and governing social norms within that context. This approach has been gaining acceptance. The US White House included the notion of “respect for context” in the Consumer Privacy Bill of Rights [14]. The US Federal Trade Commission (FTC) extended this with its concept of “context of the interaction” [15]. In its report for businesses and policymakers on consumer privacy, the FTC recommended that companies would “not need to provide choice before collecting and using consumer data for practices that are consistent with the context of the transaction or the company’s relationship with the consumer, or are required or specifically authorized by law.” At Davos 2013, the World


Economic Forum raised the need for context-aware usage of data as a key outcome of global dialogs it had conducted as part of its “Rethinking Personal Data” initiative [5]. With increasing recognition of the importance of context, there is a corresponding need to understand user attitudes towards personal data, identity, and trust. Previous research, such as [16] and [17], has focused more on assessing which types of data people are more concerned about and which types of institutions people trust most, but not necessarily the variables that impact user sensitivity in sharing personal data or trust in the entities they interact with – i.e., how users define the context of their data sharing. Between May 2012 and January 2013, Microsoft sponsored a study to establish insights into these issues to inform development of appropriate policy frameworks, addressing questions such as:


• What factors do users take into account when considering what is acceptable use of their personal data, i.e., how do users define “context”?
• What role does trust play in determining this data context?
• What roles are played by cultural or social norms?
• What do users think about self-accountability in managing their personal data?
• What do users expect industry and government to be accountable for?

The research was divided into two stages: a qualitative study conducted in Canada, China, Germany, and the United States (May–Jun. 2012); and a follow-up quantitative study conducted in the initial countries and Australia, India, Sweden, and the UK, to quantify and confirm Phase I findings (Dec.–Jan. 2013). The results reinforced the relevance of context, indicating that what is considered acceptable use of data is personal, subject to change, and reflects differences in cultural and social norms. Based on the findings, the International Institute of Communications (IIC) concluded that a simplistic, binary, and static data-management policy that dictates a priori whether data is considered personal is insufficiently flexible for the rapidly evolving digital world [18].

The research identifies seven key variables that users take into consideration in determining whether their data should be used – overall, we define this as the data context (a minimal encoding is sketched below the list):
1. Type of data – what type of data it is (e.g., location, financial, medical);
2. Type of entity – who is using the data (e.g., retailer, employer, government);
3. Collection method – how the data is collected (e.g., actively provided by the user, passively collected without user awareness, inferred through analytics);
4. Data usage – level of user involvement in the data use, from express consent to autonomous use;
5. Trust in service provider2 – what relationship, if any, do users have with the service provider;
6. Value exchange – do users perceive benefits received from use of their data as offering a fair value exchange (e.g., personal benefits, benefits to the community);
7. Device type – what kind of device is being used (e.g., mobile phone, PC).

2 “Service provider” is an entity that the user interacts with as an online service or an application. This may or may not require the user to provide personal data, but will generally use personal data.
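Read as a whole, these seven variables amount to a schema for describing a data-sharing scenario. The sketch below encodes them as a simple record type; all field names, enumeration levels, and example values are our own illustrative assumptions, not part of the study’s instrument.

```python
from dataclasses import dataclass
from enum import Enum

class CollectionMethod(Enum):
    ACTIVE = "actively provided by the user"
    PASSIVE = "passively collected without user awareness"
    INFERRED = "inferred through analytics"

class DataUsage(Enum):
    # Level of user involvement, from express consent to autonomous use.
    EXPRESS_CONSENT = "express consent for each use"
    OPT_OUT = "use permitted unless the user opts out"  # assumed middle level
    AUTONOMOUS = "autonomous use, no user involvement"

@dataclass
class DataContext:
    """One data-sharing scenario, described by the seven context variables."""
    data_type: str                       # e.g. "location", "financial", "medical"
    entity_type: str                     # e.g. "retailer", "employer", "government"
    collection_method: CollectionMethod  # how the data is collected
    data_usage: DataUsage                # level of user involvement in the use
    trusted_provider: bool               # existing relationship with the provider?
    fair_value_exchange: bool            # is the perceived value exchange fair?
    device_type: str                     # e.g. "mobile phone", "PC"

# Example: location data passively collected by a retailer from a mobile phone
# and used autonomously, i.e. without any active user involvement.
scenario = DataContext(
    data_type="location",
    entity_type="retailer",
    collection_method=CollectionMethod.PASSIVE,
    data_usage=DataUsage.AUTONOMOUS,
    trusted_provider=False,
    fair_value_exchange=False,
    device_type="mobile phone",
)
```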


Figure 1. Developing a context-aware policy framework.

In developed markets,3 collection method and data usage are the two primary drivers that determine user sensitivity, reflecting a strong user desire to control their data. In emerging markets,3 however, the two primary drivers are collection method and value exchange, with data usage a close third, reflecting the importance of perceived value for the personal data provided. In both cases, trust in service provider is the next most important variable. These results validated the context variables identified in the qualitative phase and the nuances that result from differences in social and cultural norms. Another example of how the results reflect these differences can be seen in how value exchange is perceived by users in different markets: for developed markets, the value defined as “provide a benefit to the community” renders scenarios less acceptable to users, whereas in emerging markets the same value renders scenarios more acceptable. These, and other findings from our study, would appear to reinforce the need to consider new policy frameworks that are more context aware in specifying data use.

3 For the research conducted, developed markets include Australia, Canada, Germany, Sweden, the UK, and the US; emerging markets include China and India.

Figure 1 shows how the results might be used to develop a more holistic, nuanced data policy framework for developed markets. In Fig. 1(a), collection method and data usage are shown as the primary variables in the data context impacting user sensitivity in determining acceptable scenarios of data collection, access, and use. Users respond most positively when data is actively collected and when they give express consent to the data usage. This equates to today’s notice/consent model, where the personal data collected is provided by users who actively participate in the transaction and give consent to the use of their data, even if they generally do not read or understand the terms and conditions they are agreeing to. The notice/consent given is binary and absolute, i.e., no additional context for the data’s use is considered.

In the world of big data, most data will be passively collected or generated, i.e., without active user awareness, and it would be impractical if not impossible for users to give express consent with respect to all the data collected. This situation is portrayed in Fig. 1(b), where the desire is to use the analytics resulting from big data in a way that would result in high socio-economic value, but where a simple extension of today’s notice/consent approach would not be feasible, and a lack of perceived user control over the data flow and usage would result in low user acceptance. Figure 1(b) also represents the tension in today’s data-protection debates, where some policymakers favor imposing additional restrictions when data is collected and/or requiring explicit consent for every use of data – approaches which, while mitigating risk, would deprive individuals and economies of many potential benefits.

An alternative approach, based on the user research results, is shown in Fig. 1(c), where the remaining context variables can be leveraged to increase user acceptance of the scenarios. For example, in both developed and emerging markets, trust in service provider is an important driver of user acceptance. Research shows that users’ trust is enhanced if principles that promote trustworthy data practices are part of the governance of the data ecosystem. Enforceable codes of conduct can also demonstrate companies’ willingness to abide by user permissions on data use. Coupled with technologies such as metadata, this approach can extend the user’s actual control.

In addition, our research shows that users generally have a strong sense of self-accountability and understand that they are taking some level of risk with their personal data when they use free services. However, they also expect accountability from companies and government. They expect companies to use their personal data only in a manner consistent with their expectations (66% in developed markets; 71% in emerging markets). From their government, users expect enforcement of regulations (82% in developed markets, 86% in emerging markets).

4. Ways in Which Technology Can Enable Alternative Policy Frameworks

4.1. High-Level Overview of a Metadata-Based Architecture

In today’s data systems, user permissions and data-use policies are typically made available to a user during the initial notice/consent process. These policies are implicitly associated with the data when it is collected. At the time of use, because the policies are not explicitly associated with the data, the system must intrinsically honor the terms; if the data were accessed, it would be difficult to determine what uses were acceptable just by looking at the data. In such systems, there is no effective governance model: compliance rests on trusting the good behavior of the actors involved.

In a metadata-based architecture, data is logically accompanied by an interoperable “metadata tag” that contains references to the policies associated with the data and related provenance information, specified in an extensible and interoperable markup language. The metadata is logically bound to the data and cannot be unbound or modified for the entire data lifecycle by any parties other than the user or entities authorized by the policy or the rules of the trust framework. An example of such an architecture is described in [19].
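To make the tag concrete, here is a minimal sketch of how a metadata record (provenance, policy references, and a gatekeeper pointer) might be logically bound to a data item. The field names, the keyed-hash binding, and the example URLs are all our own assumptions for illustration; the chapter prescribes no concrete format, and a deployed design (cf. [19]) would rely on a trust framework’s key infrastructure rather than a shared key.

```python
import hashlib
import hmac
import json

BINDING_KEY = b"placeholder-key"  # stands in for keys managed by the trust framework

def make_metadata_tag(data: bytes, provenance: dict, policy_refs: list, gatekeeper: str) -> dict:
    """Build an interoperable metadata tag logically bound to `data`.

    Policies are carried *by reference* (URIs), so the latest versions can be
    fetched and evaluated at time of use rather than frozen at collection.
    """
    tag = {
        "provenance": provenance,    # who collected the data, when, and how
        "policy_refs": policy_refs,  # pointers to the governing policies
        "gatekeeper": gatekeeper,    # service that evaluates the latest policies
        "data_digest": hashlib.sha256(data).hexdigest(),
    }
    # The binding: altering the data or any tag field invalidates this MAC.
    payload = json.dumps(tag, sort_keys=True).encode()
    tag["binding"] = hmac.new(BINDING_KEY, payload, hashlib.sha256).hexdigest()
    return tag

def verify_binding(data: bytes, tag: dict) -> bool:
    """Check that the tag still belongs to this data and was not modified."""
    claimed = dict(tag)
    mac = claimed.pop("binding")
    if hashlib.sha256(data).hexdigest() != claimed["data_digest"]:
        return False
    payload = json.dumps(claimed, sort_keys=True).encode()
    return hmac.compare_digest(mac, hmac.new(BINDING_KEY, payload, hashlib.sha256).hexdigest())

data = b'{"postcode": "SO17 1BJ"}'
tag = make_metadata_tag(
    data,
    provenance={"collector": "service.example", "collected": "2013-05-01",
                "method": "actively provided"},
    policy_refs=["https://service.example/policies/privacy/v3"],
    gatekeeper="https://gatekeeper.example/evaluate",
)
assert verify_binding(data, tag)
assert not verify_binding(b"tampered", tag)
```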


Conceptually, the policy expressed would reflect user-specified permissions, corporate policies, and applicable regulations. In such an architecture, users could also change their preferences and permissions, or set different preferences for the same data in different contexts, including revoking a use previously granted. This would be a useful building block for enabling the “right to be forgotten” discussed earlier. Metadata thus strengthens the enforcement of user permissions in a decentralized data ecosystem, and provides a foundation for trustworthy data exchange. By helping address these fundamental weaknesses, it stabilizes the overall ecosystem.

A formidable challenge lies in the specification of user permissions and policies that would govern how data can be used within – and shared across – trust boundaries, and how those permissions and policies would be negotiated among the multiple parties with claims on the data. The metadata enables interoperability when expressed through structured languages such as XML and interpreted with common schemas. Because metadata provides a layer of abstraction from the data, processing and interpretation would occur first on the metadata; the data itself can only be accessed once the entity has been properly validated as having the right to access it.

Such a metadata-based architecture offers value to all stakeholders in the data ecosystem, not only users. Data processors can more easily interpret, understand, and comply with permissions and policies defined for specific data. Companies can develop metadata schemas that fully describe data, and relevant policies to meet industry and regulatory requirements.4 Solution providers can create applications and services that produce new business value yet still use data in privacy-preserving ways. And regulators can take advantage of greatly improved auditability of data, along with a stronger and better-defined connection between the data and those policies that govern its use.

Achieving this will require the specification of an interoperable metadata-based architecture that can function at Internet scale. The development of such an architecture needs to be a collaboration between multiple data stakeholders to ensure its feasibility, as well as its ability to enable alternative policy frameworks. The architecture would focus on enabling data flows, and assume an underlying secured infrastructure, including interoperable identity services5 to permit validation of data controllers’ identities across multiple trust boundaries. Using the layered reference model proposed in the chapter by Bus and Nguyen in this volume, such an architecture would focus on delivering the functionality included in the “data management” layer. An example metadata-based architecture is described in detail in Section 6.

4 When multiple parties have rights to the same data, these rights may conflict or result in restrictions that could potentially render new and innovative uses of such data difficult or impossible. Section 6.1 below discusses how policies may be reconciled in cases of conflict.
5 “Interoperable identity service” denotes an identity-management service that agrees, at a minimum, to compatible assurance levels, exchange protocols, and data formats to enable cross-platform authentication and authorization of digital identities.

4.2. Related and Past Efforts

The approach of a metadata-based architecture has been informed by a number of related efforts to improve privacy management for users and enterprises, including:
• Associating policy with data: Hewlett-Packard, among others, presented the concept of sticky policies, which provide mechanisms for ensuring that user consent policies are tied to collected data in an enterprise privacy-management system and included as data flows through an ecosystem [20]. IBM developed the Platform for Enterprise Privacy Practices (E-P3P), which enabled enterprises to manage data more effectively, including associating policy with data [21]. A metadata-based architecture embraces this concept of associating policy with data and extends it by moving beyond tracking consent to representing any policy that is bound to the data. It can also extend these concepts into the broader data ecosystem.
• Handling multiple policies on data by defining shared rules, flexible hierarchies, and evaluation with precedence and conflict resolution to help resolve policies [22]: A metadata-based architecture would include logic to help resolve multiple policies, arbitrate policy differences, and provide access to the latest versions of policies to ensure valid compliance (a toy illustration of precedence-based resolution follows this list).
• Proposing tools to help users manage their privacy: Suggested software capabilities include managing profiles for users, abstracting privacy levels to provide different views into data, managing negotiation between different entities, and controlling access to data through authorization [23]. These controls are extended in a metadata-based architecture by enabling users to change their preferences and permissions after data has been collected if the trust relationship changes.
• Improving context awareness by automatically inferring privacy, sharing, and interest (PSI) parameters by analyzing user behavior in different contexts: This analysis could provide a way to “set defaults, scan for likely sharing errors and similar mistakes, and validate data from forms or more traditional sources” [24]. As datasets are commingled, re-identification of previously de-identified data becomes more likely. Such data could be protected with a metadata-based architecture that applies policies reflecting commonly accepted norms for data in similar contexts.
• Expressing policy through computational languages: Privacy policy and privacy preference languages, such as EPAL and APPEL, can be used to describe policy, especially as the policies affect downstream usage [25]. A metadata-based architecture can be language agnostic, so it can provide interoperable components that can take different languages as inputs.
• Prototyping and testing systems with metadata components to enable key end-to-end consumer privacy and trust scenarios: The PrimeLife and EnCoRe projects were both robust multiyear efforts between industry and academia. PrimeLife extended the work of the previous FP6 project PRIME to “bring sustainable privacy and identity management to future networks and services” and provided extensive prototyping and analysis [26]. EnCoRe focused on the dynamic nature of consent and used it as the basis for applying and developing technological solutions (see the chapter by Whitley in this volume). A metadata-based architecture can apply lessons from these projects to data-use-based instead of data-collection-based models, which would be more applicable in the world of big data, with its increased emphasis on supporting context-aware data use.
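The precedence-based conflict resolution mentioned in the second item can be made concrete with a toy resolver. The precedence ordering, the default-deny rule, and the policy record shape below are illustrative assumptions of ours, not the scheme of [22].

```python
# Illustrative precedence: a higher number prevails when policies conflict.
PRECEDENCE = {
    "regulatory": 4,       # jurisdictional law
    "global_user": 3,      # e.g. an entry in a do-not-contact registry
    "user_permission": 2,  # permissions granted at the point of collection
    "data_collector": 1,   # terms of service, privacy policy
}

def resolve(policies, action):
    """Decide whether `action` is permitted under the attached policies.

    Each policy is {"type": ..., "action": ..., "allow": bool}. The applicable
    policies of highest precedence decide; among those, any deny wins, and an
    action that no policy mentions is denied by default.
    """
    applicable = [p for p in policies if p["action"] == action]
    if not applicable:
        return False
    top = max(PRECEDENCE[p["type"]] for p in applicable)
    return all(p["allow"] for p in applicable if PRECEDENCE[p["type"]] == top)

policies = [
    {"type": "data_collector", "action": "marketing", "allow": True},
    {"type": "user_permission", "action": "marketing", "allow": False},  # opt-out
]
assert resolve(policies, "marketing") is False  # the user's opt-out prevails
assert resolve(policies, "analytics") is False  # unanticipated use: default deny
```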

5. Other Elements Required in the Policy Framework to Complement Technology

Although a metadata-based architecture can facilitate and strengthen the enforcement of user permissions, it cannot ensure enforcement. Implemented as part of a data policy


framework, it would enable policymakers to focus on what the policy achieves, rather than how. This is one way of ensuring that a policy framework can remain relevant and flexible for the foreseeable future.

Principles are an essential part of governance if we are to enable user trust, especially in a decentralized data ecosystem, where there is no single authority, owner, or user of the data. Most data protection policy refers to the Fair Information Practices Principles and/or the OECD Privacy Principles. Although some of these principles remain applicable, the global dialogs reported in [5] revealed the need for an emerging set of principles that takes into account the changes discussed in Section 3. New imperatives for consideration include the need to reflect a more empowered and engaged role for the individual, a more meaningful approach to transparency, and a context-aware approach to data use. These findings also reinforce the user research results discussed above.

Codes of conduct address the how in realizing the principles and can contain both business and technical procedures. Companies voluntarily adopting enforceable codes of conduct that decrease plausible risks for users might be granted regulatory relief. For example, if technology such as metadata is implemented to conclusively reduce the risk of consumer data being used contrary to user-specified permissions, compliance requirements could potentially be lowered.

At the ecosystem level, when one or more companies agree to abide by a common set of legal rules, codes of conduct, other business and technical rules, and operational rules, they are generally referred to as belonging to the same trust framework.6 If the identities of entities in a trust framework are verified, then data can be shared more easily among companies in the same framework, since there is greater assurance that those entities will abide by the policies expressed in the metadata associated with the data. Trust frameworks are also a potential means of enabling cross-border data flow: governments may be more willing to enable seamless data flow across jurisdictions within a trust framework if the governance of such a framework can be made legally enforceable and is consistent with established laws.

This latter point is crucial. Although metadata can help facilitate compliance with the policies expressed, by making the latest policies readily available, it cannot guarantee that entities handling the data will honor those policies. Compliance would also need to be pursued through other means, such as regulation, audit, or binding rules within a trust framework.

6 For a more extensive discussion about trust frameworks, see the chapter by Bus and Nguyen in this volume.

6. Metadata-Based Architecture Components

A metadata-based architecture assumes the following tenets:

• To provide key information about the data itself, the metadata describes the origins of the data (the provenance) and the conditions under which the data can be used (the policies). The metadata also provides a pointer to a gatekeeper, a service that accesses and evaluates the latest version of the relevant policies.
• Data and metadata are logically inseparable. To respect the policies under which the data is governed, metadata is bound to the data at the point of collection and must stay with the data as it flows throughout the ecosystem. The binding is logical rather than physical because there may be efficiencies in operating on large datasets by temporarily separating the metadata from the data. To maintain the logical connection, an entity can process data in a trusted processing container; such an entity must be able to make claims about the security and integrity of the container and the policies under which it operates on the data (e.g., whether it is certified). When the data leaves the container, the metadata must be reattached.
• To maximize utility and flexibility, components are policy-language agnostic: there is no set requirement for the language used to describe and interpret policy. Instead, we recommend describing policy with structured languages such as XML and interpreting metadata with common schemas.
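As a rough, concrete rendering of these tenets, the sketch below models a metadata envelope bound to a data record. All field names, values, and the gatekeeper URL are our own illustrative assumptions; the architecture prescribes no particular schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Provenance:
    collector: str        # name of the data collector
    collected_at: str     # ISO 8601 timestamp of collection
    location: str         # where the data was collected
    method: str           # "active", "generated", or "inferred"

@dataclass
class Metadata:
    provenance: Provenance
    policy_refs: List[str]   # pointers to governing policies, by reference
    gatekeeper: str          # service that evaluates the latest policy versions

@dataclass
class BoundRecord:
    data: dict               # the payload itself
    metadata: Metadata       # logically inseparable from the data

record = BoundRecord(
    data={"blood_pressure": "120/80"},
    metadata=Metadata(
        provenance=Provenance(
            collector="Example Clinic",
            collected_at="2013-05-01T10:30:00Z",
            location="NL",
            method="active",
        ),
        policy_refs=["https://policies.example.org/clinic-tos/v3"],
        gatekeeper="https://gatekeeper.example.org",
    ),
)
```

A trusted processing container could strip `record.metadata` while operating on a large dataset, provided it reattaches the same envelope before the data leaves the container.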

6.1. Specifying Policy and Provenance

Policy represents the requirements that can be set on data and can include any or all of the following:

• Data-collector policies: These are authored by the data collector and govern its operation of a service or offering, e.g., terms of service/use, privacy policies, codes of conduct, and software license terms.
• User-specified permissions: These are exposed by the data collector and controlled by the user. For example, users may be permitted to opt out of marketing communications from the data collector or related third parties.
• Global user policies: These are set by the user outside the scope of the point of collection but still apply because of corporate or legal requirements. For example, users may add their names to the US FTC National Do Not Call Registry, a do-not-contact policy prohibiting telemarketing calls.
• Regulatory and jurisdictional policies: These specify requirements that depend on where data is collected and used. For example, there may be requirements to attach policies to data collected from minors in a specific country/region.
• Legal agreements: These are entered into by the data collector and other third parties and govern the relationship between them. For example, an agreement may specify the frequency with which a data collector must refresh data it obtains from a third party.
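Because several policies of different types can govern the same record, the attached references might look as follows; the type names and URLs are invented for illustration.

```python
from enum import Enum

class PolicyType(Enum):
    DATA_COLLECTOR = "data-collector"
    USER_PERMISSION = "user-permission"
    GLOBAL_USER = "global-user"
    REGULATORY = "regulatory"
    LEGAL_AGREEMENT = "legal-agreement"

# Multiple policies of different types attached to one record, by reference.
attached_policies = [
    (PolicyType.DATA_COLLECTOR, "https://policies.example.org/service-tos/v7"),
    (PolicyType.USER_PERMISSION, "https://policies.example.org/user/42/marketing-opt-out"),
    (PolicyType.REGULATORY, "https://policies.example.org/eu/minors"),
]
```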

Because the policies attached to data must represent the interests of all parties, it is possible to have multiple policies of each type attached to the same data. A data collector may have terms of service, a privacy policy, and a code of conduct for its Internet service, with all of these policies applying to data as it is collected and used. Typically, policies bound to data are included by reference, with pointers to the most current policy, rather than by value, with all policies enumerated alongside the data itself. Trust is ephemeral, and the trust relationship between parties can change over time; a policy-by-reference model gives users the ability to change their permissions when the relationship has changed. When an entity uses data, the policy can specify a time-to-live for the data, and the entity is then obligated to check for the latest policy version and respect that specific version.

As described, there is typically no simple one-to-one correspondence between data and policy. For example, an online service may allow users to upload and share photos. The user, the online service, and any subjects in the picture may all be able to specify policies on that data, and those policies may conflict. Reconciliation capabilities and processes must therefore be part of any trust framework, to ensure that policies bound to the data can be respected and reconciled. This may be achieved through technological approaches, such as defining policy with logic, precedence, and hierarchy rules, or through physical-world approaches, such as enabling negotiation between users and data controllers. More research is required to investigate these issues further.

Provenance describes the origin of the data, such as the time, location, data-collector name, and data-collector contact information; it can also specify whether the data was actively collected, generated, or inferred. Conceptually, provenance can also include the same information about the policy or policies bound to the data. Attaching provenance to data makes it possible to track where data originated as it flows through an ecosystem. Provenance could also be used to track data controllers during onward transfers, showing the chain of custody of the data as well as any additional policies attached to it.
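The policy-by-reference model with a time-to-live might be honored along the following lines. This is a sketch under our own assumptions: the policy endpoint is taken to return JSON containing ttl_seconds and use_permitted fields, neither of which is specified by the architecture itself.

```python
import json
import time
import urllib.request

def resolve_latest_policy(policy_ref: str) -> dict:
    """Fetch the current version of a policy from its authoritative location.
    Illustrative only; a real deployment would authenticate, validate, and cache."""
    with urllib.request.urlopen(policy_ref) as response:
        return json.load(response)

def may_use(record, policy_cache: dict) -> bool:
    """Re-fetch any referenced policy whose cached copy has outlived its
    time-to-live, and proceed only if every policy still permits the use."""
    now = time.time()
    for ref in record.metadata.policy_refs:
        entry = policy_cache.get(ref)
        if entry is None or now > entry["fetched_at"] + entry["ttl_seconds"]:
            policy = resolve_latest_policy(ref)   # assumed to carry ttl_seconds
            entry = {"fetched_at": now, **policy}
            policy_cache[ref] = entry
        if not entry.get("use_permitted", False):
            return False
    return True
```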


6.2. Enforcing Policy

Binding policy to data can help make the obligations and requirements for data use clear. For less sensitive data, communicating the obligations may in itself be sufficient. For more sensitive data, parties may want more assurance and better security, so that only trusted parties can access and use the data and there is no ambiguity about the requirements for use. Because the metadata includes a pointer to a gatekeeper, which can provide authoritative information about the policies bound to the data, additional functionality can be enabled depending on the sensitivity of the data.7 When an entity wants to use data, it goes to the gatekeeper to evaluate the latest policy. For less sensitive data, that may satisfy the data controller's requirements. For more sensitive data, this may be the first step in a longer authentication and authorization process: the data controller may require the requestor to provide additional claims from a trusted claims provider, such as the requestor's verified identity and the health of its system. The data controller can also choose to encrypt data and only permit an exchange of keys once the requestor has met certain criteria.

In addition to technical means of enforcing policies, a trust framework or a jurisdiction may also define requirements about how policies are enforced. It could, for instance, be a legal rule that policy bound to data must be respected, with civil or criminal penalties for violating such policies.

7 The gatekeeper can be a federated service that involves multiple entities.

6.3. Describing Use with a Taxonomy

To ensure that data is used in accordance with the policies bound to it, a data-use taxonomy must be defined and understood by the parties within a given trust framework. The taxonomy should describe common access, sharing, use, and disposal patterns, and it can be as complex as needed to satisfy the requirements of trust framework participants. A simple taxonomy to support use decisions could cover the following questions:

• Who can the data be shared with, if anyone?
• What use is the data intended for?
• How long can the data be kept?
• Where, geographically, can the data be transferred?
• How identifiable are the data subject and other parties?
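A minimal encoding of such a taxonomy, with field names of our own choosing, might look as follows; a gatekeeper could compare a requested use against an instance like this one.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DataUsePolicy:
    share_with: List[str]          # e.g., ["medical-researchers"]; [] means no one
    intended_use: str              # e.g., "diagnosis" or "de-identified-research"
    retention_days: Optional[int]  # None means no fixed retention limit
    allowed_regions: List[str]     # geographic transfer restrictions
    identifiability: str           # "identified", "pseudonymous", or "de-identified"

research_use = DataUsePolicy(
    share_with=["medical-researchers"],
    intended_use="de-identified-research",
    retention_days=365,
    allowed_regions=["EU"],
    identifiability="de-identified",
)
```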

6.4. Making the Architecture Interoperable

As described, the architecture could in theory exist within a single entity. While many features would be useful to that entity and its users, the architecture can be extended to provide value to all entities that participate in a given trust framework, as previously described. At a minimum, the following services would need to be defined for an interoperable ecosystem:

• Policy-management services, which provide authoritative locations where parties can obtain the latest versions of stored policies. Much as DNS translates domain names to IP addresses and stores the information in servers around the world, a policy-management service would distribute policies among multiple servers, replicated across geographic areas, to improve the latency and robustness of policy lookups.
• An interoperable data-use taxonomy, which provides a common way to describe the uses attached to data. With a taxonomy that allows for interoperability, a trust framework does not have to require that all parties use a specific taxonomy in their own data operations; instead, they can map from their own data-use taxonomies to the interoperable one through common interfaces and schemas.
• Identity services, which allow parties to register and authenticate in order to use data across multiple trust frameworks. Users could utilize such services to access other services; service providers would need to accept claims from multiple identity services to enable proper binding of policy to data. Identity services also make other aspects of the architecture interoperable by connecting data and actions to authenticated parties across the ecosystem.
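The second of these services could be realized with a simple mapping layer from an organization's internal use labels to the shared vocabulary. Every term below is invented for illustration; the point is only that parties translate at the boundary rather than adopt one internal taxonomy.

```python
# Proprietary, internal data-use labels mapped to an assumed shared vocabulary.
INTERNAL_TO_INTEROP = {
    "mkt-email": "marketing/direct-contact",
    "clinical-review": "health/diagnosis",
    "cohort-study": "health/de-identified-research",
}

def to_interoperable(internal_label: str) -> str:
    """Translate an internal data-use label into the trust framework's shared
    taxonomy, refusing labels that have no agreed equivalent."""
    try:
        return INTERNAL_TO_INTEROP[internal_label]
    except KeyError:
        raise ValueError(f"no interoperable mapping for {internal_label!r}")
```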

There could also be a service that enables users to control and renegotiate policies for personal data use across all services. An example of this is the concept of “personal clouds,” also known as personal data stores and personal data ecosystems, which usually involve a secure store for a variety of personal information. In personal clouds, users exercise complete control, e.g., what personal data should be stored, who should have access, how it should be used, etc. Doc Searls makes an eloquent case for this approach in [27]. Ira Rubinstein describes the eight elements of personal data stores in [28].

7. Putting Everything Together – Technology and Policy as a Holistic Ecosystem

Thus far, this paper has described the big data revolution, its impact on existing policy practices, user attitudes towards personal data, the components of a potential policy framework, and a metadata-based architecture that facilitates and strengthens such frameworks. This section brings these components together and shows how, holistically, they can address the challenges posed while enabling new capabilities. To illustrate how this could work, consider the following scenario.


In today's healthcare systems, many clinicians have access only to point-in-time diagnostics, except within siloed hospital networks. Because most complex health issues involve multiple medical specialties, this reduces the effectiveness and efficiency of patient care. A health IT system with access to a patient's comprehensive medical history and all of his current records, along with relevant medical analytics, would enable a more holistic approach to treatment. For example, imagine that a patient has lived in several different cities and, over the years, has had respiratory problems that have been diagnosed at various hospitals and clinics. If his physician can retrieve images of his diagnostics and lung X-rays over the previous decades, she can determine whether a current problem should be of concern.

In a metadata-based health ecosystem, once the patient has consented to the use of his records and the doctor's identity is verified, the data elements the doctor could retrieve would include diagnostics from all the various places the patient has sought treatment, regardless of provider network or geographic location. With her identity proven and access policy granted, the physician could securely search for, retrieve, and display these records in much the same way we retrieve results from a search engine with a simple query. The provenance information would be updated to include the fact that the doctor and her institution accessed the records. In addition, the patient, either at the time of initial treatment or at any subsequent time, can indicate permission to use the data in a de-identified format for medical researchers, using the data-use taxonomy indicated above. These permissions can be retracted at any time during the data lifecycle. Depending on the policy indicated, the metadata associated with his records would persist and be accessible to all permissioned parties in the trust framework of entities that have agreed to abide by the policies indicated in the metadata.

A metadata-based architecture thus enables the data to flow and be used appropriately to benefit the user, or society, as deemed valuable by the user, while at the same time facilitating enforcement of the permissions expressed. These capabilities would not be easily supportable with today's approach, especially where explicit consent is required but data access is unenforceable except by contracts. With such approaches, data flow is either overly restricted or susceptible to inappropriate use. Although metadata cannot ensure that entities processing the data will abide by the permissions and policies specified, it strengthens and extends the initial consent by requiring that these entities access the latest policy before processing the data. This can increase users' trust in the ecosystem and their willingness to let data flow. Regulations and enforceable codes of conduct would need to provide the legal framework and the ramifications in the event of such policies not being followed.

The patient's treatment can be further improved if other information can also be included in the diagnosis. For example, if the patient has recently started a more stringent exercise routine, or has traveled to an area with a tuberculosis outbreak or with high traffic pollution, these are relevant factors that might affect a diagnosis. Some of this information comes from data that the patient is aware is being collected; some comes from data that is passively collected.
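The retrieval flow in this scenario might be sketched as follows. The service objects (identity_service, gatekeeper, registry) and their methods are hypothetical stand-ins for services a trust framework would define, not a specified API.

```python
def retrieve_patient_records(doctor, patient, identity_service, gatekeeper, registry):
    """Cross-provider record retrieval: verify the requesting clinician,
    evaluate the patient's consent policies, fetch matching records, and
    append the access to each record's provenance."""
    claims = identity_service.verify(doctor)           # proven identity of the requestor
    if not gatekeeper.evaluate(patient.policy_refs, claims, purpose="diagnosis"):
        raise PermissionError("patient policy does not permit this access")
    records = registry.search(patient_id=patient.id)   # regardless of provider or region
    for record in records:
        record.metadata.provenance_log.append(         # record who accessed the data
            {"accessed_by": claims["subject"], "institution": claims["institution"]}
        )
    return records
```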
There are multiple ways in which this information can be integrated into the diagnosis, using a combination of technology and policy. Conceptually, all of this information, together with the patient's preferences, can be kept in the patient's personal cloud, as described in Section 6.4 above. This would provide a centralized means for users to control the use of their personal data across all services. There might also be "proxy" services that learn users' preferences over time and act on their behalf, simplifying their interaction with service providers across the ecosystem. The level of autonomy and granularity of control could be determined by each user. For the current scenario, the patient can use his personal cloud to temporarily give the doctor permission to access relevant personal data solely for diagnostic purposes. This policy change would in turn be reflected in the policy store, affecting not just data residing in the personal cloud but potentially data residing anywhere else, including passively collected data that has become identifiable.

In Section 1, we described the issues posed by the amount of passively generated data and the appropriate usage of such data. If the metadata includes the class of data, then usage for these data can be specified in a variety of ways. Policy can be set for certain classes of usage, e.g., disaster response. Users can specify permissions using services such as a personal data cloud – e.g., "for disaster response, my location data can always be used". Alternatively, commonly accepted norms can be applied by leveraging something akin to a "norms service" that is dynamically updated. Such a service would be analogous to abiding by what is specified by a consumer seal of approval or by best common practices; its advantage is that it might better reflect changing social and cultural norms. Whichever method or methods are chosen, the resulting metadata would be bound to the information generated, facilitating appropriate use while enabling data to flow in the ecosystem.
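The temporary, purpose-bound grant described here might be expressed as a simple policy-store entry. The field names and the personal-cloud interface are our own illustration.

```python
from datetime import datetime, timedelta

def grant_temporary_access(personal_cloud, grantee_id: str, purpose: str, hours: int):
    """Record a time-limited, purpose-bound permission in the personal cloud's
    policy store; services that re-read the policy will see it expire."""
    grant = {
        "grantee": grantee_id,
        "purpose": purpose,    # e.g., "diagnosis"
        "expires": (datetime.utcnow() + timedelta(hours=hours)).isoformat() + "Z",
        "revocable": True,     # the patient can retract the grant at any time
    }
    personal_cloud.policy_store.add(grant)
    return grant

# e.g., grant_temporary_access(cloud, "dr-jansen@clinic.example", "diagnosis", hours=48)
```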


8. Conclusion

The emergence of a data-driven economy has resulted in an increasingly urgent need for a flexible policy framework that can drive new business models and innovation by enabling data to flow, while at the same time protecting the rights of individuals. Such a framework must engender trust on the part of the user by emphasizing trustworthy data practices by businesses and other stakeholders in the data ecosystem – something we believe is essential if the data-driven economy is to thrive.

This paper has posited a metadata-based architecture as a technology that can potentially address this need. By binding user permissions and policies to the data over its lifetime, metadata enables that data to flow and, with it, the user's preferences. Entities that use the data are required to access the latest policies and permissions, and a record of their interaction is established. Although metadata cannot guarantee that entities will abide by the specified policies, it can facilitate their enforcement by making them readily accessible. A metadata-based architecture can also enable users to modify their permissions and preferences at any point in time, or to specify different preferences for different contexts. In this way, the architecture facilitates context-aware data usage.

When implemented as part of a principles-based policy framework that provides guidance on trustworthy data practices, and supplemented by voluntary but enforceable codes of conduct, this flexible approach can help satisfy the interests of regulators, users, and industry. It could also help prevent an immensely promising and innovative driver of socio-economic benefits from stalling.

References

[1] IDC, IDC Predictions 2012: Competing for 2020, December 2011.
[2] M. Komorowski. (2009). A History of Storage Cost [Online]. Available: http://www.mkomo.com/cost-per-gigabyte.
[3] Cisco, Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2012–2017, Feb. 2013.
[4] Ericsson, More Than 50 Billion Connected Devices, Feb. 2011.
[5] World Economic Forum, Unlocking the Value of Personal Data: From Collection to Usage, Geneva, 2013.
[6] United Nations. (2013). Global Pulse [Online]. Available: http://unglobalpulse.org.
[7] R. Stahlmann, A. Festag, A. Tomatis, I. Radusch and F. Fischer, Starting European Field Tests for Car-2-X Communication: The Drive C2X Framework, 18th ITS World Congress & Exhibition, Orlando, 2011.
[8] McKinsey, Big Data: The Next Frontier for Innovation, Competition, and Productivity, May 2011.
[9] OECD, Exploring the Economics of Personal Data: A Survey of Methodologies for Measuring Monetary Value, OECD Digital Economy Papers, 220 ed., OECD Publishing, Paris, 2013.
[10] European Parliament. (2012, Jan. 27). Protect your privacy: think twice before you give yourself away [Online]. Available: http://www.europarl.europa.eu/news/en/headlines/content/20120120STO35905/html/Protect-your-privacy-think-twice-before-you-give-yourself-away.
[11] European Commission, General Data Protection Regulation, European Commission, Brussels, 2012.
[12] H. Nissenbaum, Privacy as Contextual Integrity, Washington Law Review, 79 (2004), 119–157.
[13] H. Nissenbaum, A Contextual Approach to Privacy Online, Daedalus, the Journal of the American Academy of Arts & Sciences, 140 (2011), 32–48.
[14] The White House, Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy, U.S. Government, Washington, 2012.
[15] Federal Trade Commission, Protecting Consumer Privacy in an Era of Rapid Change, U.S. Government, Washington, 2012.
[16] TNS Opinion & Social, Special Eurobarometer 359: Attitudes on Data Protection and Electronic Identity in the European Union, European Commission, Brussels, 2011.
[17] Nokia Siemens Networks Corporation, Privacy Survey 2009, Espoo, 2009.
[18] International Institute of Communications, Personal Data Management: The User's Perspective, London, 2012.
[19] President's Council of Advisors on Science and Technology, Report to the President – Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward, Executive Office of the President, Washington, 2010.
[20] P. Ashley, C. Powers and M. Schunter, From Privacy Promises to Privacy Management: A New Approach for Enforcing Privacy Throughout an Enterprise, New Security Paradigms Workshop, Virginia Beach, 2002.
[21] G. Karjoth, M. Schunter and M. Waidner, Platform for Enterprise Privacy Practices: Privacy-Enabled Management of Customer Data, Privacy Enhancing Technologies: Lecture Notes in Computer Science, vol. 2482, Berlin Heidelberg, 2003.
[22] K. Bohrer, S. Levy, X. Liu and E. Schonberg, Individualized Privacy Policy Based Access Control, Proceedings 6th International Conference on Electronic Commerce Research (ICECR-6), Dallas, 2003.
[23] K. Bohrer, X. Liu, D. Kesdogan, E. Schonberg, M. Singh and S. L. Spraragen, Personal Information Management and Distribution, The Fourth International Conference on Electronic Commerce Research (ICECR-4), Dallas, 2001.
[24] A. Pentland, J. Gips, W. Dong and W. Stoltzman, Human Computing for Interactive Digital Media, MULTIMEDIA '06 Proceedings of the 14th Annual ACM International Conference on Multimedia, New York, 2006.
[25] L. Bussard, G. Neven and F.-S. Preiss, Downstream Usage Control, IEEE International Symposium on Policies for Distributed Systems and Networks (POLICY 2010), Washington, 2010.
[26] PrimeLife. (2011, Oct.). Project Fact Sheet [Online]. Available: http://primelife.ercim.eu.
[27] D. Searls, The Intention Economy: When Customers Take Charge, Harvard Business Review Press, Boston, 2012.
[28] I. Rubinstein. (2012, Oct. 5). Big Data: the End of Privacy or a New Beginning? [Online]. Available: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2157659.


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-243


Life Management Platforms: Control and Privacy for Personal Data1

Martin KUPPINGER a and Dave KEARNS b
a Founder and Principal Analyst, KuppingerCole
b Senior Analyst, KuppingerCole


Abstract. Life Management Platforms (LMPs) combine personal data stores, personal cloud-based computing environments, and trust frameworks. They allow individuals to manage their daily life in a secure, privacy-aware, and device-independent way. In contrast to pure personal data stores, they support concepts which allow interacting with other parties in a meaningful way without unveiling data. This concept of 'informed pull' allows apps to consume information from the personal data store as well as from third parties, and to use that information for decision making, without unveiling data to any of the other parties. Think about comparing offers for insurance contracts, or about pulling articles from various sites for a 'personalised newspaper' without unveiling the full list of current interests. The problem with sharing data is: 'Once it's out, it's out of control'. Even with more granular approaches to sharing data, the problem remains – once data has left the personal data store, the individual has no grip on that data anymore. LMPs go beyond this, through new concepts for sharing information with security and privacy in mind, but also by relying on trust frameworks. The latter are important, for instance, for relying on a set of relations and contracts once it comes to sharing data – and there is no way to fully avoid sharing data. Once a decision about the preferred insurance company has been made, there needs to be a contract: some data has to flow. However, defined contracts can reduce the risk of abuse of that data. The chapter explains the need for LMPs, the underlying concepts, and potential benefits. It also looks at some real-world use cases.

Keywords. Life management platforms, trust frameworks, privacy, personal data, personal data store, social networks, social computing

Introduction

Life Management Platforms are a concept for cloud-based infrastructures that have the potential to change the way individuals deal with sensitive information such as their health data, insurance data, and many other types of information – information that is today frequently paper-based or, when it comes to personal opinions, exists only in the minds of individuals. They will potentially enable new approaches to the privacy- and security-aware sharing of that information, without the risk of losing control of it. A key concept is 'informed pull', which allows the consumption of information from other parties while violating neither the individual's interest in protecting his information nor the interests of the related party or parties.

1 This chapter is a revised version of the KuppingerCole Advisory Note ‘Life Management Platforms: Control and Privacy for Personal Data’, available at www.kuppingercole.com/reports.


Life Management Platforms, with the related standards, protocols, business models, applications, etc., might become the one technology with the strongest influence on our everyday life (and, on the other side, on enterprise infrastructures and Internet architecture) over the next ten years.

We expect that, some years from now, your car will be accessible through a virtual key stored in your private domain (an infrastructure for secure storage of personal data), together with all information relevant to the usage and maintenance of that car. This will be a kind of digital logbook, which could even report an engine failure to your garage if you wished it to do so (and only then). If LMPs succeed, which we expect, they will become a key enabler for, among other things, the truly connected car of the future.

We foresee that, some years from now, it will be possible to store on LMPs all the information required to find the best insurance. Individuals will be able to request offers from insurance brokers without unveiling all that data and then pick the policy which fits best – without details from each insurance company leaking to other insurance companies, and without sensitive personal data from the individual leaking to insurance brokers or to insurance companies he doesn't choose. Similarly, LMPs could allow the receipt of truly targeted information, based on the current personal interests, wishes and desires of a person – all the details people will never unveil in a social network or on any platform owned by a content provider.

Obviously, LMPs are far more than Personal Data Stores. They not only support a secure store for sensitive personal information; they allow for making better use of that information. The real value lies in the various ways of sharing that information supported by LMPs. Virtually all business models that rely on sharing sensitive information with individuals will fundamentally change if and once LMPs become established. That will challenge existing business models and IT infrastructures, but it provides fantastic new opportunities – not only for new business models, but also for cost savings and better service for virtually all organisations. We expect this fundamental shift towards the availability of privacy-aware infrastructures such as LMPs to be the foundation for new future business models. However, this change won't come about easily: there are obstacles to overcome. Yet we believe that they will be overcome, and that LMPs will be used by most people in the coming years.

1. Definition

What are Life Management Platforms? And how do they relate to Personal Data Management or Social Networks? Every time a new concept hits the market, defining its scope matters: that definition will shape innovation in the field for quite a while.

A Life Management Platform (LMP) allows individuals to manage and access all relevant information from their daily life, in particular data that is sensitive and typically paper-bound today, like bank account information, insurance information, health information, or the key number of their car. Notably, LMPs are not limited to such data but support everything which should be shared with, for example, the car manufacturers, the dealers, and the garages (and maybe some other parties). So too, other retailers, insurers, government, schools, fraternal organisations, employers, friends and family all need to share in some of our information at some time and for some duration. LMPs are the way to control this. Notably, an LMP can access such information from other data stores such as Personal Data Stores, which provide secure storage capabilities but lack the required flexibility in the secure and privacy-aware sharing of information.

LMPs support the privacy- and security-aware sharing of such information, following the concept of minimal disclosure while avoiding the loss of control of this data. They must support the concept of 'informed pull', which allows sharing information with other parties in a way that avoids any data leakage, based mainly on a new concept of privacy- and security-aware apps. These apps can receive information from various parties. They can process the data and share only the result with the user; they share the input data neither with the user of the LMP nor with other providers of data. This does not require specific technologies, just good design and testing of such apps. However, these apps can benefit from more advanced tools such as Zero Knowledge protocols or Homomorphic Encryption.2 In general, these apps must follow the design principles of Privacy by Design.3

Notably, LMPs go far beyond Personal Data Stores because they not only provide a secure store for that information but also a secure way of sharing it. From a technical perspective, the fundamental problem is that once information is disclosed, it is out of control. Today we have no means to encrypt our information and apply access controls, nor to attach a 'sticky policy'.4,5 As a consequence, no policy for using the data travels with it. This allows the recipient to share the data, while the original provider of the data has no reliable way to control its flow or to prohibit its further use. Therefore, today, a lot of information is simply leaked, unknowingly or unwillingly, in various ways.

Most new approaches to sharing information, including so-called Personal Data Stores, are also rather limited. They allow for a more or less granular approach to controlling access to information. However, they do not address the fundamental issue of minimizing the risk of sharing information. An ideal solution could follow the concepts found in Information Rights Management,6,7 encrypting information and applying usage policies that must be enforced by the recipient. However, that concept today only works in limited environments and for documents and e-mails, not for structured information such as data records. And even in these environments there is a potential for abuse, for instance by ignoring the access policies after decryption.

LMPs will not be able to fully solve this challenge in the near future. They might benefit from advancements such as Information Rights Management technologies for structured data, or technical achievements such as homomorphic encryption that allow the processing of encrypted data. However, the concepts within LMPs are able to minimize the remaining risks today, through a combination of technology and agreements. As mentioned above, one part of the solution is the apps on LMPs, which must follow well-defined concepts of privacy and security by design – design concepts that are rarely followed today. These apps must support the concept of 'informed pull', which allows the receiving of information from various parties without disclosing it to any of the other parties. Processing that information allows for decision-making in a private way.

2 http://researcher.watson.ibm.com/researcher/view_project.php?id=1548
3 http://coe.privacybydesign.ca/
4 http://blogs.kuppingercole.com/kuppinger/2011/03/10/we-need-a-policy-standard-for-the-use-of-data/
5 http://blogs.kuppingercole.com/kuppinger/2011/07/21/how-to-deal-with-data-sprawl-could-a-sticky-policy-standard-help/
6 http://en.wikipedia.org/wiki/Information_Rights_Management
7 http://www.prostep.org/?id=585


However, there will still be a need to disclose some information to other parties. While the concept of informed pull might be used for selecting an insurance provider, some information still needs to be shared to make a contract with the insurance company of choice. That is where Trust Frameworks, which define a set of parties and agreements between those parties, come into play. Based on these agreements, the minimum of information can be shared. We have named this 'controlled push', in contrast to the rather uncontrolled push that is common today: it is about disclosing a minimum of information under agreements that define the usage policies. There would still be room for abuse, but that might be reduced over time by new technical capabilities. Compared to common scenarios, however, LMPs massively reduce the need to share information.

The concept of LMPs is thus based on the combination of a 'personal domain', holding all information securely, and the ability to use this data in a privacy- and security-aware way. Approaches which lack either of these two core features are not understood as LMPs by us. Trust Frameworks are required for managing relationships and agreements. The Life Management Platform itself might provide a secure data store such as a Personal Data Store, but the personal domain might also be partially or fully virtual, integrating various data stores. LMPs might provide an infrastructure for setting up and managing Trust Frameworks but might also rely on external frameworks.
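As a rough illustration of informed pull, the sketch below compares insurance offers inside a privacy-aware app: the user's profile and each insurer's pricing function enter the computation, but only the decision leaves it. All names and figures are invented; a real LMP app would additionally rely on certified, audited execution and possibly techniques such as homomorphic encryption.

```python
def informed_pull_best_offer(user_profile: dict, insurers: dict) -> str:
    """Evaluate each insurer's quote against the private profile and return
    only the name of the best offer; neither the profile nor any insurer's
    pricing leaves the computation."""
    quotes = {name: quote_fn(user_profile) for name, quote_fn in insurers.items()}
    return min(quotes, key=quotes.get)   # disclose the decision, not the inputs

# The profile stays in the personal data store; each insurer supplies only a
# pricing function, not its full tariff tables.
profile = {"car": "compact", "annual_km": 12000, "no_claims_years": 6}
insurers = {
    "Insurer A": lambda p: 480 - 20 * min(p["no_claims_years"], 5),
    "Insurer B": lambda p: 350 + 0.01 * p["annual_km"],
}
print(informed_pull_best_offer(profile, insurers))   # -> Insurer A
```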


2. Use Cases and Benefits

When looking at today's Internet, it becomes clear that many of the approaches we find there fulfil the requirements of neither the users nor their counterparts, such as vendors, providers, and other parties. The three fundamental changes in IT, the Computing Troika of Cloud, Mobile, and Social Computing, drive further evolutions, and we regard the rise of Life Management Platforms as one of the potentially most important evolutions of the next decade.

One of the challenges will be privacy, which is (again) becoming increasingly important to many people. Another is the enablement of new business models in a way in which users can start using them quickly without needing to worry about privacy, while removing the burden of entering too much information that might then be widely distributed, such as all the data about one's properties. A third is the vital interest of service providers – from vendors to governments – in providing their services to their users only, and not to others (to protect their business models). These parties are interested in gaining a competitive advantage by keeping their relations and the related data undisclosed, but also by not unveiling too much information about their offerings. Examples are insurance companies that are reluctant to provide too much information about their terms, conditions, and related business models to freelance insurance brokers who also work with their competitors. Another example is the complex relationships in the automotive industry, where, for instance, vendors and dealers do not share all information about their customers – if they share any at all.

LMPs allow for better control of information while enabling new and better forms of collaboration. They do this by enabling new, restricted forms of information sharing ('informed pull') and by mitigating the risk of giving away data, minimising the amount of data shared and sharing it in the context of agreements based on Trust Frameworks ('controlled push').


2.1. The Life Management Platform Use Cases


When looking at the factors which will drive the evolution that Life Management Platforms stand for, it is best to start with particular areas of life and the use cases within each area. From today's perspective, there are many use cases for Life Management Platforms in various areas. In the area of Income and Wealth Management, examples include virtual salary statements from employers, tax optimisation applications, and property management. The first example will allow employers to deliver the monthly income statement electronically instead of in a paper-based format. In many countries, privacy or data protection requirements still prohibit this, but LMPs should enable compliance using electronic means. With an LMP, the employee will have the relevant information available in his personal data store for re-use, which will make it far easier to share such information, e.g. with banks requiring it for credit decisions. An LMP could, for instance, instead of sharing the income details, provide the information on whether the employee is above a defined income level. The last example, property management, would allow for the secure handling of all contracts around properties, plus – in combination with the first – confirmation of the ability to pay eventual debts.

There are a large number of examples from various other areas, such as healthcare, the connected vehicle, the connected home, or the management of relationships between consumers and vendors, so-called Vendor Relationship Management (VRM). These cases are all fairly self-explanatory. And they all share a common theme: they concern very important, very sensitive, private information. The information is valuable not only for its owner but also for the many organisations which have to deal with the individual, ranging from the government to insurers, different types of banks and other financial service providers, vendors of services and goods, the local garage, and so on. However, it is obvious that many people aren't willing to share all that information in the way many of today's social networks suggest it should be shared. And this is true for many other use cases as well. You might add to this list:

• your Family Album
• your Personal Interests (including the hidden ones)
• the Preferences of your friends (as you see them)
• and many more.

When looking at the extended list, it becomes obvious that LMPs would greatly affect the way social networks operate today. There is a lot of information in any person's life which needs to be managed and shared. Much of this information has to be managed in a secure way and shared in a controlled, directed way. Some people might decide differently about some of that information, and some cultures might rate the need for privacy and security differently for some of it. However, there will always be information which is used too frequently to remain paper-based and which is too sensitive to be handled in the way today's social networks operate. That is where LMPs come into play: providing the tools to manage the essential information in every person's life and making it usable for other parties through privacy-enhanced applications, thus meeting both the privacy and the security requirements.

A fundamental consequence of this evolution is the so-called 'upside down effect'. Individuals decide which information they wish to provide to whom. They decide what is shared and what is not. They can opt for more privacy based on future privacy-enhanced apps which ensure that only the part of their information they wish to share (if any) is given to the other party. Having such an opportunity could, if it takes off, create massive pressure on existing 'providers' (again in the broadest sense) to adapt the way in which they deal with customers to that model. Once critical mass is reached, providers will have to support that model and adapt their interaction with individuals. This will have a major impact on both business models and IT infrastructures.

One interesting issue in this context is that the business models of many providers, such as Google or Facebook, today rely on collecting and mining or selling the personal data of their customers. LMPs are clearly disruptive to such business models. However, these business models rely on there being buyers for such information, for instance in the field of advertising. If those buyers can gather better information directly from the customer, they will no longer need to pay the providers mentioned above. The entire VRM (Vendor Relationship Management) concept describes that shift for the area of customer relationships. Notably, LMPs enable a mass of other business cases not supported today, so they do not depend on the success of concepts such as VRM or on the change of provider business models in that area.

Most likely, customers will become more selective in sharing information. Given increasing privacy concerns, this appears to be a valid assumption. From the business model perspective, this change is an enabler. The inhibiting factors – for example, entering details dozens of times to find the cheapest car insurance – will be much lower than today. Only if the insurance broker can convince the prospect that his is a fair offer will he receive sufficient information to contact that prospect. This is just one – and a simple one at that – example of the sort of changes we expect to observe. Even more interesting is that the information flow might move towards a bidirectional approach, depending on the LMPs and the concepts of vendors. And this approach will ensure privacy as well.

The art of dealing successfully with LMPs from a provider perspective is in fact simple: provide services and offers that are sufficiently attractive, and don't rely on knowing things about individuals that you shouldn't know or do not need to know. This becomes clear when looking at VRM – providing the ability for the end user to share information with vendors of choice in a controlled way, one of the most prominent though limited cases. In addition, this example showcases several of the shortcomings of today's approaches, including CRM systems and social networks – and especially most of the marketing and customer interaction initiatives relying on social networks. VRM allows the customer to share what he currently assumes to be relevant, which might be very different from what he found relevant in the past; the other approaches try to predict what he might find interesting based on incomplete historical information. Organisations today (and tomorrow) need to:

• know their customer;
• interact closely with him;
• ensure that their competitors don't know too much about him and their relationship with him;
• ensure that they stay in touch with him over time, building a customer relationship/bond;
• tighten those relationships.


However, if you look at today's social networks, their privacy-ignorant approach violates the third of these bullet points: if you know your customer, your competitor will almost certainly be able to gain knowledge about them easily as well. With respect to the fourth bullet point, staying in touch with a customer might quickly become a one-way exchange, where organisations put in a lot of effort and no one is listening. It might even become a dead end once the social network loses popularity. And does anyone believe that social networks really help in tightening relationships with individual customers in a targeted way?

When looking at customers' requirements, there are some additional challenges which organisations are facing today, or will face soon, according to our own continuous research:

• people want to keep their life data managed in both the digital and the non-digital world;
• they want to ensure privacy;
• they are starting to think about which price to pay: privacy or money;
• they want to control their relationships and their data.

Simply put: organisational requirements and customer challenges are not only changing, they are not even being met by what is provided today.


2.2. The Benefit to the Enterprise

Much of what has been written about Life Management Platforms emphasises the benefit to the user, typically called the 'consumer'. There is a lot of information available on the concept of VRM (Vendor Relationship Management) [1], in which the corporation takes on the role of a vendor to that user. But LMPs also benefit the enterprise in two very different ways. First, the enterprise is itself a consumer, and can use its own Life Management Platform to smooth operations, increase efficiency and lower the cost of doing business with vendors, government, other regulatory authorities, partners, etc. Second, LMPs allow employees to interact with the organisation. Obvious areas such as wages and benefits come to mind, as do healthcare and other insurance issues. Beyond that, the employee – using APIs designed by or for the enterprise – can take over much of the identity management work (office location, phone, technical inventory, etc.) that is currently often handled by armies of data entry clerks. The benefit to the enterprise is more accurate information, delivered in a more timely manner, at a much lower cost.

2.3. Why Life Management Platforms Provide a Solution

The obvious mismatch between the currently available solutions, the customer challenges and the user requirements is driving change. Besides this, regulatory compliance in several regions, especially the European Union, will greatly influence the way organisations deal with privacy. Needless to say, this is a long-term journey, due to the need for business models that work while providing adequate privacy, for a fundamental understanding of how relationships with other parties can be managed successfully, and for IT approaches to change. There are, however, so many visible indicators showing that change has begun that this is just a matter of time. And given the fact that many of the use cases mentioned above allow for new or improved business models, this change might happen more quickly than many expect today.

2.4. How to Earn Money with Life Management Platforms

The success of all new economic ideas depends on the success of their business model. The entry advantage makes it highly probable that LMPs will quickly reach a critical mass. This is not about inventing new business models. It is not even about taking business models which have been discussed for years but never became really successful and creating a winning approach now – even though there are many examples which are likely to work once LMPs are established. The entry advantage is far simpler: it is about money no longer being spent on complex, sometimes still analog and paper-based processes – for example, the monthly salary statement sent to thousands of employees in different locations as a traditional postal letter. That helps to reduce the costs of either complex IT solutions or paper-based processes. And it will make the fulfilment of regulatory compliance much easier, given that LMPs will ensure that this sensitive information remains private. It has also been shown that there are multiple parties who might be willing to pay. Some potential revenue streams are:

• fees from individuals using the Life Management Platform and paying for the services provided;
• fees from companies which want to interface apps or run apps on the platform, who might be interested in connecting to a larger audience and understand the value of LMPs as a perfect way to connect with potential customers;
• directed adverts, a concept which allows users to decide which adverts they are interested in at a given point in time – adverts are not bad per se; they might even provide value, and they can provide higher value for both sides, advertiser and recipient, in a model where adverts are only shown to genuinely interested people;
• new media concepts, given that LMPs will finally allow personalised content delivery in the way content providers have always had in mind.

Thus it is, as always, about reaching critical mass on both sides – users (simple if there are no fees) and services (which then also becomes simple).

3. Underlying Concepts

Life Management Platforms are, despite their immature status, increasingly visible in various forms and under various names. The vendors closest to the concept of LMPs appear to be Qiy and Kynetx, while several other offerings, such as personal.com, focus on Personal Data Stores only. Unfortunately, none of the platforms available today fulfils all the requirements we have identified, so it will still be some time before we see broad adoption of these concepts. However, there are good examples, fundamental concepts to understand, and points about valid business models, which are covered in this section.

A key concept of LMPs will be the support of what we call 'controlled push' and 'informed pull', as explained above. These concepts are like two sides of the same coin.


Furthermore, they are the essence of why LMPs are far more than just a store of personal data. Storing personal data is only a small part of the value proposition of LMPs. Nor is a Life Management Platform made simply by sharing this information, i.e., by allowing some parties to access it without further control and without keeping a grip on that data (through at least policy agreements/contracts or, more advanced, sticky policies as metadata). That would be nothing more than a social network with somewhat better access control capabilities. The key capability of LMPs is precisely the enabling of the two concepts mentioned; it is about new types of privacy-aware apps which allow sensitive information to be used in a way that provides value to its owner. Two examples of this are:

(1) the insurance app mentioned above, which collects current data about rates from different insurance companies and calculates the best rate based on the 'virtual car key' and the related detailed information about the car. It might even use information about family status, properties, and other matters to find the best applicable rate. This app neither leaks the current rates of insurer A to insurer B nor leaks any sensitive information about the potential customer to any of the insurance companies – unless the prospect decides to become a lead or customer of one of them;

(2) the eHealth app, which uses the private health history to create an individual fitness programme, without leaking information.

In both cases, there are interested parties. There are those creating the app and providing the details, knowing that they might and can participate in the event that a deal is made; there are the users, who can rely on information that they have stored once, without fearing data leakage to all the insurance companies; there are the health insurers or governments who want to reduce costs in the health system; and so on. Obviously, even though no one knows everything about the others, there are business models.

Informed pull is the ability of an LMP-based app – following the concepts of security and privacy by design – to pull data from various sources, including Personal Data Stores, and use that data without unveiling it to any of the other parties. Controlled push, on the other hand, is more than just a more or less granular method of granting access to personal data in a personal data store. It is about pushing out such data in the context of contracts or other forms of agreement which limit the use of that data. Ideally, this is technically enforced. However, due to limitations in technical enforcement – there is no such technology as Information Rights Management at the attribute level – it will be based mainly on legal or other forms of agreement. Notably, Trust Frameworks that define relationships and agreements will help LMPs, because they provide the foundation for implementing the concept of controlled push.

Clearly, the success of LMPs requires such apps to be built in a secure, privacy-enhancing way. The technology for doing so is available; it just needs to be used. It does require a change in attitude: an understanding that masses of data are not necessarily what is needed for successful business. Most importantly, the principles of Privacy by Design must be included in the app planning.

Just as there is a great deal of interest and enthusiasm for LMPs, so too there are any number of potential inhibitors to their adoption. Inertia alone – the resistance to change – is something that every new technology has to overcome. But there are other problems which could get in the way, such as under-educated consumers, reluctant vendors, unconnected apps and siloed data, not to mention regulation and governance.
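Controlled push could then be sketched as disclosing a minimal, agreement-referenced slice of data once the contract is made. The structure below illustrates the idea under our own assumptions and is not a specification.

```python
def controlled_push(personal_store: dict, fields: list, agreement_ref: str) -> dict:
    """Disclose only the named fields, wrapped with a reference to the trust
    framework agreement that limits what the recipient may do with them."""
    disclosed = {key: personal_store[key] for key in fields}
    return {
        "data": disclosed,
        "agreement": agreement_ref,   # usage limits rest on the agreement, not technology
    }

# After choosing an insurer via informed pull, push only what the contract needs.
store = {"name": "A. Example", "car": "compact", "income": 52000, "health": "..."}
payload = controlled_push(store, ["name", "car"],
                          "https://trustframework.example/agreements/insurance-17")
```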


4. Summary

We believe that Life Management Platforms, and related concepts such as Trust Frameworks and Personal Data Stores, are the next big trend in IT and the Internet. If they take off, they will fundamentally change the interaction between individuals and other parties, and they will drive innovation in security and privacy. The reasons for that success are twofold: LMPs provide a better business model for most companies than traditional approaches such as proprietary solutions or today's social networks, and they provide measurable cost savings. They also enable communication with privacy and security in mind, taking into account the concerns of many users and accepting the limits of openness.


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-253


Digital Enlightenment, Mydex, and Restoring Control over Personal Data to the Individual

William HEATH, David ALEXANDER and Phil BOOTH
Mydex CIC

Abstract. The present online economy is based on organisation-centric control of personal data, identity and proof of claims. The benefits of adding individual-centric control over personal data and identity appear significant. The paper describes the research, development and deployment to mid-2013 of a platform to achieve this by the London-based social enterprise tech startup Mydex CIC.

Keywords. Social enterprise, vendor relationship management, VRM, personal data, identity, ID assurance, enlightenment


1. Introduction

The Digital Enlightenment Forum (DEF) Yearbook editors have this year called for a broad exploration of the value of personal data. It's a timely topic: the value of personal data has driven some of the most spectacular growth and fluctuations in stock market value seen since 2000. Shortcomings in how we protect and value personal data are behind profound and emerging distortions in economics and human rights. The authors of this article are entrepreneurs working for Mydex CIC to restore control over personal data, including identifiers and proofs of claims, to individuals. Our experience is that this field is every bit as interdisciplinary as the DEF suggests in its commissioning brief.

DEF suggests a tension between core enlightenment concepts such as enforceable rights, individualism and the power of reason on the one hand, and digital computing systems, networking and data mining on the other. Mydex' work helping individuals to realise the value of their own personal data seeks directly to address that tension, with a carefully researched, deliberately designed and financially sustainable market solution. DEF's brief says: "In this world, privacy plays several important roles, protecting the autonomy of individuals and governing their relationships with institutions, communities and society as a whole." In an information society privacy is essential; it is perhaps the key right on which others are built. The right to privacy is protected in Europe under Article 8 of the ECHR (and equivalent legislation in the Member States, such as the Human Rights Act in the UK). But in the present context people lack the tools to manage and control their own personal data, and the ability to control just what is shared, with whom and in what situation. This article describes the work of Mydex CIC, which has researched, designed, prototyped, constructed and is now deploying services to help individuals acquire, manage and control the sharing of their personal data.


The insight that restoring control over personal data to individuals will have profound effects is now reasonably widespread, and spreading. The benefits promise to be significant (transformative, even). The challenge lies in initiating the change: getting it seriously under way in a manner that benefits three types of actor: individuals; existing organisations that use personal data; and new developers and entrepreneurs. Mydex offers a case study in how the governance, technical, commercial and design challenges can be addressed.

Mydex Community Interest Company is a social enterprise based in the UK. By mid-2013 Mydex had won a significant degree of market acceptance from UK public-service organisations that rely on personal data. This has started a process which will allow market value to be applied to flows of permissioned personal data emanating from the individual. The starting point for this market in the UK is the ID assurance accepted for UK public services.

There is a growing number of similar initiatives worldwide, some predating Mydex, which address the range of challenges outlined above in several different ways. We welcome that diversity, and believe that the market will apply its own evolutionary law over time. What individuals need is the establishment, the sustainability and the interoperability of a range of new personal data services. Europe, with its millennia-long history of intellectual innovation to meet pressing social needs – including in recent years its determination to establish data protection laws to regulate what can be done with new technologies – is as good a place as any to start to rectify the problems caused by the initial phase of online exploitation of personal data by organisations. And "enlightenment" seems to us a fitting Leitmotiv for this narrative. So we are grateful to the editors for the chance to contribute this chapter to the Digital Enlightenment Forum 2013 Yearbook.


2. Value of Personal Data: The Present Context

The present context – the state of the art of personal data in the online economy – is based on effective organisation-centric control over all personal data. In what should be a two-sided relationship between the individual and the organisation, only one side has meaningful data collection, management and analytic capabilities. This imbalance causes problems for both sides: it disempowers consumers and citizens, and forces expensive work-arounds and counter-productive incentives on to organisations.

Organisations – businesses and government – assume an entitlement, a right to beneficial use of and to trade in the value of the personal data they accumulate. This includes marketing and credit agencies, phone companies, retailers, utilities, online e-commerce and social networking businesses, and health and financial services. They treat it as their own asset, in many cases a very significant asset which for some businesses forms a substantial portion of their market value. Yet it is not clear that it is routinely treated, protected and accounted for properly as a commercial asset. If it were, stock exchange reporting rules would contribute an interesting analysis to the story about the growth in the value of personal data as control shifts to individuals newly equipped with the ability to protect and manage it.

The norm is that to get anything useful done online you have to hand over control of your personal data. People do it en masse because it is the only convenient way to get things done, irrespective of its impact on privacy, control and the value of their personal data. People wish to connect to each other to share pictures and stories, and over 1bn use Facebook to do so. People need federated ID to log in to online services, and over 500m use Facebook Connect to do so.

It is unlikely that they understand – and unproven whether they care about – the full implications of doing so. What is proven is adoption on a vast scale despite a wide range of privacy concerns.1 The credit agency Equifax, for example, holds 26 petabytes of data, with 800bn records on 500m people. "We know more about you than you would care for us to know," CIO Dave Webb said in a 2012 CIO.com interview. About aggressive mining of that data he says: "The morality question is another dimension, but we have the technology to do this and if it's legal we should".2

Under the banner of building closer relationships with customers, a wide range of businesses get sucked into actions and policies which undermine relationships and undermine trust. Examples abound: the collection, without users' consent, of iPhone users' geolocation data by Apple, and of Android users' geolocation data by Google, are two among many.3 O2's collection of location-based data to provide targeted advertising on behalf of business customers without subscribers' consent is another. Broader problems of data protection abuse are not new, nor confined to one country or sector. In 2007 the then UK Information Commissioner, Richard Thomas, described as "inexcusable" and "horrifying" the range of data protection breaches by internet firms, banks, direct marketing organisations and telecoms firms. This takes traditional businesses away from their core relationship of providing conventional services to customers (such as energy, financial and health services), dragging them towards a manipulative and non-consensual relationship based on control over personal data.

Many problems – high levels of guesswork, waste and inefficiency, a restricted ability to provide relevant, appropriate services, and low levels of trust – flow from the solely organisation-centric approach of today's personal data infrastructure. One result of this is that personal data, despite its already immense value (evidenced by the market value of Web 2.0 companies such as social-network businesses, or the profits of credit agencies), is nothing like as valuable today as it would be if it were accurate, more complete and properly protected. Wholesale trade in inaccurate and incomplete data erodes trust, and thereby erodes value. Furthermore, as Doc Searls of Customer Commons argues, the value of using personal data properly is set to be far greater than its sale price.4

For all its inefficiencies and clear limitations, the organisation-centric model has not run out of steam; far from it. Evidence of its momentum is everywhere. The specific rationale of what is currently called "Big Data" is well set out by Ken Cukier and Viktor Mayer-Schonberger in their 2013 book.5 They explain that the sheer abundance of big data, coupled with the processing power now available, creates a qualitative shift: it allows important decisions to be made on the basis of correlation alone, without bothering with causality. They give the example of Wal-Mart discovering that hurricane warnings are followed by increased sales not just of flashlights, but also of Pop Tarts.

1 See e.g. "7 big privacy concerns for new Facebook and the Open Graph" by Lauren Hockenson, Mashable.com, 27 Jan 2012.
2 "Big data, big brother, big bucks", CIO Magazine, 2 June 2012.
3 See e.g. http://www.zdnet.com/blog/perlow/google-android-apple-iphone-geolocation-tracking-flap-disclosure-is-everything/16893.
4 Doc Searls, "For personal data, use value beats sale value", Customer Commons, May 2013, at http://customercommons.org/2013/05/23/for-personal-data-use-value-beats-sale-value/.
5 K. Cukier, V. Mayer-Schonberger, Big Data: A Revolution That Will Transform How We Live, Work and Think, available on Kindle, 2013.


The first is understandable, the second perhaps surprising. But that doesn't matter; what matters is the correlation which supports increased sales. It is a characteristic of "big data" to be messy, they point out, and it is not necessary for it to be exact for the purposes of demonstrating correlation. "Exactness requires carefully curated data", they say. Such curation is expensive, but the nature of big data also makes it unnecessary.

Their argument is clear and sound from the organisation's point of view. But it leaves out the perspective of the individual, for whom precise and careful curation is often essential: in a world of digital services, you could be unjustifiably barred from boarding a flight, or imprisoned, on the basis of false 'messy' data. Even mundane acts such as ordering an anniversary outfit for your partner, or going into hospital for a medical intervention, illustrate that from the perspective of the individual, carefully curated data is essential. This is hard and expensive for organisations to do, yet they persist with the notion that they can do it successfully, satisfying the infinite nuances, preferences and eccentricities of every individual they deal with. Only careful curation of personal data can support specific, appropriate and personalised services that are sensitive to individuals' real needs.

Our working assumption is that, for reasons of privacy, and also because individuals know more about their preferences and intentions than anyone else, each and every individual should be properly supported with the tools and policies necessary to take control of their personal data; that this is a more enlightened approach; and that it will further increase the value of personal data.


3. Drawbacks and Shortcomings of Purely Organisation-Centric Control

This assumption that it is organisations that as a rule should control personal data has to date rarely been challenged outside specialist circles (such as privacy-friendly NGOs, specific fora such as the Internet Identity Workshops, Project VRM at Harvard, other academic projects, and entrepreneurs seeking to provide services on the side of the individual).6 As technology enables ever greater exploitation of large data sets, companies mine and cross-correlate this data as far as the law will in practice allow, irrespective of what people would be comfortable with. The assumption behind this is rarely challenged in law or by consumer behaviour. There is, as far as we are aware, relatively little avoiding or boycotting of services with poor privacy policies or personal data practices. Occasional online campaigns attempt this; Wired cites a 42% reduction in user numbers when the photo-sharing app Instagram made aggressive changes to its terms of service and privacy policy in January 2013.7 This trend may grow as the emergence of online social networks at scale makes it easier for disgruntled users to reach out effectively to each other.

What we see is that the solely organisation-centric model brings with it not just inefficiencies, but also some darker aspects. That which is technically possible may still be lawful, yet contrary to the spirit of what enlightened individuals want. You could call this a form of pre-enlightenment legal servitude.

6 See for example the Naked Citizens EU campaign by Open Rights Group and others at https://www.nakedcitizens.eu/, http://www.internetidentityworkshop.com/, http://cyber.law.harvard.edu/projectvrm/Main_Page.
7 http://www.wired.com/gadgetlab/2013/01/instagram-terms-users/


Former Google CEO Eric Schmidt spoke in 2010 of a "creepy line", and said: "Google policy is to get right up to the creepy line but not cross it".8 It is a subjective barrier, but one which has been repeatedly tested, with the threat of more to come. Verizon has patented a set-top box with a camera to deliver home TV viewers targeted advertising.9 Google has patented targeted ads based on listening to the background noise in your phone call.10 Long-term privacy activist Simon Davies regularly sets out key global privacy issues.11 The overall picture is of an enduring and irreconcilable tension between how Web 2.0 businesses work and what people actually want; this point is underlined by Stutzman, Grossy and Acquisti in their study of Facebook and social networks.12

There is a specific cause for the underlying tension for many online businesses. The legal obligation on directors in most advanced economies is to maximise returns to shareholders. If a major asset of the business is the personal data of individuals, however acquired, then that obligation sits uneasily with any duty of stewardship of the interests of the individual. Gartner suggests that 30% of businesses will be making money directly from user data by 2016, and has coined the neologism "infonomics".13 This tension can be seen even in the case of services which start out with the noblest motives, best intentions and policies. One example is XY, the magazine and online service for gay teens, which went bankrupt with highly sensitive personal data as its only asset.14 It can be seen in services whose superficial appeal to users is matched by an enduring insouciance about their effects on privacy. Facebook founder Mark Zuckerberg's reported dismissive early throwaway remark about users ("They trust me – dumb f*cks")15 at least has the merit of being more succinct than any written privacy policy Facebook or anyone else has since produced. But perhaps it points to an essentially similar underlying reality. The tension can be seen in the evolution of Google's popular mantra from a straightforward "Don't be evil"16 to a stance that is considerably more nuanced; many would say compromised.17

On the government side, failed UK Government schemes such as a National Identity Register ('ID cards'), ContactPoint and the £12.4 billion National Programme for IT in the NHS were the very opposite of the political promise of "technology to give citizens choice, with personalised services designed around their needs, not the needs of the provider."18 The darker side of this was explored at the time in the Joseph Rowntree Reform Trust-commissioned 'Database State' report.19

8 Repeated for example at http://blogs.wsj.com/digits/2011/01/21/top-10-the-quotable-eric-schmidt/.
9 http://arstechnica.com/tech-policy/2012/12/how-to-get-targeted-ads-on-your-tv-a-camera-in-your-set-top-box/
10 "Google wants to serve you ads based on the background noise of your phone calls", http://thenextweb.com/google/2012/03/21/google-wants-to-serve-you-ads-based-on-the-background-noise-of-your-phone-calls/.
11 Privacy Surgeon: Predictions for Privacy 2013, http://www.privacysurgeon.org/blog/wp-content/uploads/2013/01/PS-future-issues-full-report.pdf.
12 "Silent Listeners: The Evolution of Privacy and Disclosure on Facebook", Journal of Privacy and Confidentiality (2012).
13 http://www.gartner.com/newsroom/id/2299315
14 "One million gay teens' data could be at risk after US magazine goes bankrupt", http://www.pinknews.co.uk/2010/07/13/gay-teenagers-data-could-be-sold-after-magazine-goes-bankrupt.
15 "Facebook CEO Admits to Calling Users 'Dumb F*cks'", http://gawker.com/5636765/facebook-ceo-admits-to-calling-users-dumb-fucks.
16 Prospectus (aka "S-1") of Google's 2004 IPO.
17 See for example "Eric Schmidt on whether Google is getting evil", ZDNet, http://www.zdnet.com/blog/btl/eric-schmidt-on-whether-google-is-getting-evil/27268.
18 Tony Blair: Foreword to 'Transformational Government – Enabled by Technology', 2005.


The UK Centre for Policy Studies looked at the dark side in its report Who Do They Think We Are?20 and then started to explore the individual-centric paradigm in a report by Liam Maxwell, who went on to become deputy CIO for the UK Coalition government.21 Meanwhile, far-reaching legislation from the United States of America, such as FISAAA22 and the Patriot Act,23 extends that nation's influence into the personal data of people all round the world who bank, travel, communicate or use data online.

The question arises of how people at large react and adjust their behaviour in the face of this organisation-centric control over their personal data. Loss of disclosure control affects trust and can harm relationships, and it can cause strong negative emotions.24 Research by Customer Commons suggests that 92% of people engage in lying or concealment in an attempt to protect their personal data.25

4. Personal Control over Personal Data

The antithesis to this organisation-centric status quo is the emergence of personal control over personal data. This requires technology and terms and conditions ("tools & rules"). It is quite possible these may be best delivered by dedicated new institutions specifically designed for the purpose. That is the hypothesis behind the setting up of the personal data store (PDS) service Mydex.

Personal control over personal data is a constructive addition and disruption to the prevailing Web 2.0, or solely organisation-centric, control model. It would be simplistic to present the individual-centric model as a straight replacement for something that, for all its evident limitations, has made immense progress and created huge wealth. Doc Searls, author of The Intention Economy and leading light of Harvard University's Project VRM, likes to say: "the logic is AND not OR".26 There are many compelling reasons one could put forward for the emergence of a person-centric model. These include:


• economic: the logistical cost of the inefficiencies of the organisation-centric model; the problem of poor return on CRM investment; the reduction in value of personal data markets because of loss of trust;
• better meeting the needs of the individual with respectful and privacy-friendly personalised services;
• solving the problem encountered by entrepreneurs trying to develop person-centric new services – such as Mint, the now-defunct Wesabe, Workdocx and Digital Life Sciences – who all face the same problem of needing access to deeply personal information in order to provide new services on a trusted and effective basis;
• social: media and public perceptions about personal data misuse and loss, intrusion and loss of privacy;
• legal/legislative: in data protection and consumer law and regulation, and also in a broader human rights framework (especially in the EU) supported by the activities of NGOs and "digital rights" civil society organisations;
• redistribution of power in a more enlightened manner, to protect the self-determination of the individual with enforceable rights.

19 Anderson, Brown et al.: Database State, Joseph Rowntree Reform Trust, 2009. See http://www.jrrt.org.uk/publications/database-state-full-report.
20 http://www.jillkirby.org/wp-content/uploads/2011/11/who-do-they-think-we-are.pdf (Kirby, 2008).
21 L. Maxwell: It's Ours – why we, not government, should own our own data, Centre for Policy Studies, London, 2009. See http://www.ictu.nl/archief/noiv.nl/files/2011/02/its_ours.pdf.
22 See e.g. "Foreign clouds in the European sky: How U.S. laws affect the privacy of Europeans", at http://policyreview.info/tags/fisaa.
23 See e.g. the analysis of the Patriot Act by EPIC at http://epic.org/privacy/terrorism/usapatriot/.
24 Brown, I.: How will surveillance and privacy technologies impact on the psychological notions of identity? Evidence paper for the UK Foresight programme, 2013.
25 Mary Hodder and Elizabeth Churchill: Lying and Hiding in the Name of Privacy, http://customercommons.org/wp-content/uploads/2013/05/CCResearchSurvey1Paper_Final.pdf.
26 Doc Searls: The Intention Economy: When Customers Take Charge, May 2012; and see e.g. http://www.searls.com/time2grow.html.

Those studying the value of personal data can see Mydex as a case study or action research project to test this hypothesis about the overwhelming political, philosophical, ethical, practical, commercial and technical case for individual control over personal data. The results might be measured by the setting of a market value for personal data, and by improvements in efficiency from improved data logistics, economic benefits, human rights, well-being and trust. As founders and entrepreneurs, the authors are more focused on making Mydex, a high-tech social enterprise start-up, viable and successful. Keen as we are to learn lessons, our focus is on viability and economic sustainability. Key proof points will be market acceptance by organisations and entrepreneurs (which has started) and public use, which can only begin once the connections and applications that come with that acceptance are available.

5. Challenges in Getting the New Model Started


Sceptics and critics of the notion of personal control over personal data (or what is called by Harvard's Berkman Centre "vendor relationship management" or "VRM") make a series of powerful points:

• people show little interest in making the effort to curate their own data;
• people may not read existing Ts&Cs, but they sign them anyway, so where's the problem?;
• "you have zero privacy anyway, so get over it";27
• people like being marketed to: the fact that they buy is evidence that Marketing works;
• "push" will win, because only sellers have the analytical resources to make sense of the world's data.

Above all, the defenders of the status quo argue: if this were really such a good idea in so many respects, why hasn't it already happened? It's a question any entrepreneur does well to take seriously, but also a question that, by definition, every successful new business has answered with action, not words. Suffice it to say at this stage that there are undoubtedly substantial challenges in getting the 'antithesis' of individual control over personal data established and under way.

27 Phrase used in 1999 by Scott McNealy, Sun Microsystems. See e.g. http://www.wired.com/politics/law/news/1999/01/17538.


These challenges include:

• How does it benefit individuals when no organisations are connected?
• How does it benefit businesses and organisations when no individuals are connected (the "first fax machine" problem)?
• Why would independent developers produce apps or create new services and journeys based on volunteered personal information when the number of users is negligible?

We sometimes illustrate this threefold challenge with the image of a three-pointed Catherine wheel: a firework that needs to be lit at three points if it is to spin. It needs to:

1. make life simpler for the individual, solving a pressing problem with an immediate appeal that endures and deepens the more the individual understands it;
2. appeal to organisations by rapidly tackling clearly identified problems, creating quick and sustained return on investment, and then new opportunities for innovation that help the organisation meet its key aims better; and
3. solve a pressing challenge that is holding back the developers and entrepreneurs who will make their fortunes when a new trusted economy can be built on flows of volunteered personal information.

Each prong of the Catherine wheel is tricky to light by itself. It’s necessary to light all three simultaneously if this is to work. But when you succeed, it promises to be spectacular.


6. Mydex CIC Approach

First lessons: the idea of Mydex as a social enterprise platform to address these issues was conceived by the original founders with the support of the Young Foundation,28 under the sympathetic guidance of the then director, Geoff Mulgan. The decision was taken to give Mydex the UK legal form of a Community Interest Company (CIC) limited by shares. A CIC must have a stated community which it serves; for Mydex CIC that is people who use personal data online. A CIC must have a statement of community purpose; in Mydex' case, that is to help people realise the value of their personal data, in order to support better and healthier relationships. The majority of any CIC profits made have to go back towards pursuing that community purpose (there can also be a regulated return to shareholders). And a CIC is asset-locked: its assets cannot be sold or transferred other than to another entity which is similarly asset-locked.

The social enterprise Mydex was predated by projects with broadly similar aims and vision, including the Qiy Foundation in Holland, Paoga, Allfiled and eDentity (now PiB-d) in the UK, and Singly in the US.29 Other services such as Personal and Respect Networks have since emerged and are also making strong progress.30 What follows presents Mydex as one case study. There is no suggestion that the path described is the only effective way to solve the problem identified.

28 See http://youngfoundation.org/.
29 See https://www.qiy.nl/en, http://www.paoga.com/, https://www.allfiled.com/, http://www.pib-d.net/ and http://singly.com/.
30 See https://www.personal.com/, http://respectnetwork.com/.


On the contrary, our strong belief is that only the emergence of a diverse, interoperable ecosystem of enlightened personal data service operators – governed, designed and in some way regulated to work in the interests of the individual, and adhering to credible standards as they emerge – will properly serve the community of people who use personal data online. That is the community Mydex is set up, and bound by its Articles, to help.


7. Mydex Community Prototype

In 2009 Mydex created a working proof of concept which showed an individual able to store their own personal data, and to connect to an external service (in that case BT URU) to validate that data. Attitudes towards the prevailing personal-data context and to Mydex' proposition were explored in 2010 with ethnographic research undertaken by Veronica Massoud, a Master of Design student at the London College of Communications. Massoud filmed and spoke with nine individuals of varying ages. She identified six attitudes: cynical, unaware, obliviously participating, unacquainted, resigned, and apathetic but strategic. In all their diversity, and regardless of age or profile, the research described people who were somewhere between depressed and in denial about what happens to their personal data 'out there', resigned to deception and to coping strategies that are self-contradictory or that even they themselves can see are hopeless. They hide their needle in a haystack, use misleading information, and pretend not to care. Above all, their top priority was saving time: they would look first and above all for convenience. In all cases it was found that specific episodes brought the issues to life: recovering from a stolen wallet, applying for work, planning a wedding, spam about pizzas or unstoppable junk mail from Sky. Specific participants offered memorable turns of phrase. On the value of her personal data, one mused: "A pound? Can't be that much… 50p?" She thought she had given her data to perhaps 50 companies, but then reflected that each had perhaps given it to two or three more, or perhaps to 10 or 20: "There would be 1000s of people with your data… it's almost like getting an STD." (This interviewee has since reported having stopped filling out online forms for free pizzas.)

The case for personal information empowerment was then set out by Alan Mitchell and others in a Mydex White Paper in September 2010.31 This described an intention and rationale for transforming the personal data landscape by helping individuals manage their own data, turn information into a tool in their own hands, and act as the natural integrating point for data about their own lives. It argues that secure personal data stores will save time and hassle for individuals, while driving efficiencies for organisations. To further test this hypothesis it was necessary to secure the committed involvement of organisations to work in this way with real people, and to examine issues of organisational and individual motivation, suitable technologies, security, design and legal compliance – especially data protection. To do this Mydex contracted with a series of organisations to participate in a Community Prototype in which lessons would be shared. Participating organisations included the UK government's Directgov portal, the research firm YouGov, the Direct Marketing Association (DMA), three local authorities (LB Croydon, LB Brent, RB Windsor & Maidenhead), the Department for Work & Pensions (via its TellUsOnce project) and BT.

31 Alan Mitchell and others: The case for personal information empowerment, Mydex CIC white paper, available at http://mydex.org/wp-content/uploads/2010/09/The-Case-for-Personal-Information-Empowerment-The-rise-of-the-personal-data-store-A-Mydex-White-paper-September-2010-Final-web.pdf.


Experian participated and provided verification services.32 DWP lawyers conducted a full-day legal workshop with Olswang to consider the legal implications. The Open Society Foundation contracted an LSE team to assess the privacy implications of working in this way. A UCL team led by Dr Sasha Brostoff, under the supervision of Professor Angela Sasse, examined the user experience and the impressions of trust and security.

The Mydex Community Prototype33 established a working link between local authorities' data systems and an individual's personal data store, allowing two-way data transfer. This faced a number of hurdles, starting with identity verification, and a key finding was that current approaches to identity assurance are slow, expensive and not always accurate. But by bringing together multiple tokens of verification from a wide range of different parties, a personal data store can reduce the hassle and costs of existing identity verification processes for both organisations and individuals, while increasing levels of assurance. The first step of data sharing in the prototype was the local authority 'showing' the individual the information it held on its database. As soon as the local authority did this, individuals could see fields that lay empty, and identify and rectify errors and inaccuracies. The prototype therefore confirmed some working assumptions (for example, that the PDS would have to be so easy to use as to be almost transparent) and highlighted other questions yet to be addressed: for example, for an organisation to accept a genuine data-sharing relationship with customers requires internal change to processes and systems. How, for example, should the organisation treat, and verify, an address-change notification from a PDS?

The UCL team found that a central driver for adoption was trust. Trust in Mydex correlated strongly with perceptions of benefits and ease of use, and carried with it improved trust in the Internet and in organisations. The findings led the UCL team to suggest that, approached correctly, introducing Mydex could create a "halo of trust" that would benefit all participants. It also found that, despite Mydex' legal team's efforts to produce the shortest and sweetest privacy policy possible, no one read it: eyeballs stayed on it for an average of a third of a second.

The Mydex prototype raised important questions. European Data Protection law recognises three actors: data subject, data controller and data processor. But an individual with a personal data store – in complete control of their own data – plays all three roles. Those drafting the current legislation do not seem to have foreseen this, and it is far from clear that currently suggested amendments address it clearly. On the other hand, giving individuals clear control over data sharing, and the ability to show informed consent to specific data sharing for specific purposes, is clearly a huge step forward in implementing the spirit and the letter of European data protection law. The OSF-funded work by Simon Davies of LSE concluded that the Mydex framework provides a robust contribution to privacy. These changes in interaction between organisations, new entrepreneurial developers and individuals in online networks, as well as the morphing of the data subject into data controller and data processor, might usefully be examined in terms of Bruno Latour's actor-network theory.34 That remains to be explored.
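The finding that combining verification tokens raises assurance can be illustrated with a deliberately simplified Python sketch. The token sources and the scoring scheme below are invented for illustration; neither the Mydex platform nor the UK ID assurance scheme is specified this way in this chapter.

```python
# Invented illustration: combining independent verification tokens to
# reach a higher level of identity assurance than any single token gives.
TOKEN_WEIGHTS = {
    "bank_kyc": 3,           # hypothetical weights, not any real standard
    "local_authority": 2,
    "utility_bill": 1,
}

def assurance_level(tokens: set[str]) -> str:
    """Map a set of verification tokens held in a PDS to a coarse level."""
    score = sum(TOKEN_WEIGHTS.get(token, 0) for token in tokens)
    if score >= 5:
        return "high"
    if score >= 3:
        return "medium"
    return "low"

# A single utility bill verifies little; tokens from independent
# parties, brought together by the individual, add up.
assert assurance_level({"utility_bill"}) == "low"
assert assurance_level({"bank_kyc", "local_authority"}) == "high"
```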

32 DirectGov has since been disbanded and replaced by Gov.uk. For the other organisations mentioned see https://yougov.co.uk, http://www.dma.org.uk/, http://www.croydon.gov.uk/, http://www.rbwm.gov.uk/, https://www.gov.uk/government/organisations/department-for-work-pensions and https://www.gov.uk/tell-us-once.
33 For more information see http://mydex.org/prototype/.




8. Design and Construction of the Live Mydex Service Platform

A key finding underlined by the Mydex Community Prototype was that convenience and ease of use were prerequisites to adoption and to gaining the trust of individuals. The personal data store had to add value beyond basic data storage; it would need to be populated via connections with organisations. Individuals would value control over and access to their data, but were unlikely to key it all in manually. Therefore a central design decision was to start early with organisations, making it possible for them to connect to the platform and act as the initial attribute providers and verifiers of data for the individual, who would then correct, augment and enrich that data as they brought data from multiple sources together within their personal data store.

The focus for individual utility has been on providing services that support everyday life, such as easy storage, organisation, retrieval and sharing of data, and in particular basic utilities such as secure bookmarking, browsing history, and the ability to add tags and annotations: a familiar, browser-based capability, and one that has attracted developers. The challenge for all alternative approaches in this area is that the data is either stored locally (unconnected, device-centric) or within a commercial organisation's environment, with one or more data rights being granted as a consequence of using the service. Demonstrating the difference of personal data storage by using familiar functionality is a key element of the Mydex live service design, as it then opens new opportunities for participation under the individual's control.

In the same way as organisations wishing to send data to and receive data from their customers, application developers are looking for secure, low-cost, privacy-friendly mechanisms for accessing and storing the rich datasets needed for the applications or services they operate, and for securing consent for access to them. Exposing a standards-based OpenAPI at the earliest opportunity was seen as a key component of the live service. Mydex adopted encrypted private networks using a RESTful API supporting XML, JSON and YAML. Market reaction to date suggests this was the right approach. A key decision was to register the Mydex Trust Framework from the outset with the Open Identity Exchange,35 with a clear set of terms for all participants. These terms are designed to ensure that Mydex' mission, vision, values and strategy of empowering individuals can be codified in contractual commitments that could be tested against the Mydex Charter at the centre of the Trust Framework.36 The key elements of importance in the design were as follows.

• The individual is in control of giving consent to any form of connection or data sharing, and can revoke it at any point without recourse to anything more than simply switching it off within their PDS.
• All data and communications are encrypted uniquely per individual, using a Private Encryption Key only the individual knows and controls. Mydex has no means of resetting this key or accessing their data.

34 See e.g. Reassembling the Social, at http://dss-edit.com/plu/Latour_Reassembling.pdf.
35 See http://openidentityexchange.org/press-releases/mydex-trust-framework-recognized-open-identity-exchange.
36 The Mydex charter is available online at https://pds.mydex.org/mydex-charter.

• All data sharing between an individual and an organisation or application is subject to a data sharing agreement between the individual and the organisation or legal entity providing the application. This was designed to afford contractual agreement on consistent terms for the individual which set out precisely the use cases and the data being shared. This significant step up from the normal protections afforded by the Data Protection Act allows individuals (or others acting on their behalf) to take action directly should their contractual agreement be breached. Mydex would also be able to take action as the platform provider if any evidence of breach of the data sharing agreement also showed breach of the terms of connection signed by the connecting organisation or application. (A toy model of this agreement pattern is sketched after this list.)
• All connecting organisations and applications have to prove their identity as a legal entity, and verify their endpoints for service connection to the Mydex platform, before they can operate on the live service. This is a novel and critical element of the Trust Framework, as it enables individuals to trust the identity of any organisation available on the platform in the service directory.
• Mydex' commercial model places the cost of connecting to the platform on organisations and application developers, but at a very low-cost entry level called "connection fees". Individuals are provided with the service by Mydex free of charge.

These one-off fees scale as the number of individuals the organisation or application connects to increases, and as the number of services they make available via the platform broadens. Organisations pay a one-off connection fee which covers basic set-up and first-year support, and then an annual support fee thereafter based on 25% of the aggregate value of all connection fees. This works out at about 4 UK pence (US$0.058, €0.045) per year per individual per organisation. The rationale for such low pricing is rapid adoption and a compelling return-on-investment calculation: if it costs 40p to print and post a single letter, simply saving an organisation the cost of one letter a year would pay for the service for ten years.

As for transaction fees: Mydex makes no charge for data volumes moved across the platform, but if an organisation or application developer charges an individual for their services or, as we envisage in the future, pays an individual for specific data rights or consents, then Mydex levies a small, single-digit percentage transaction fee on that payment, charged to the organisation or application developer. Again, no charge is made to the individual by Mydex; to them the service is free at all times.
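As promised above, here is a toy model of the data sharing agreement pattern, written in Python purely for illustration. The field names and checks are ours, not Mydex's; real agreements are legal documents and platform-enforced policies, not objects in memory. The sketch captures two properties the text emphasises: sharing is scoped to named use cases and named attributes, and the individual can revoke unilaterally.

```python
from dataclasses import dataclass

@dataclass
class DataSharingAgreement:
    """Toy model of a scoped, revocable data sharing agreement."""
    organisation: str        # a verified legal entity (hypothetical name below)
    use_cases: list[str]     # precisely which uses are permitted
    attributes: set[str]     # precisely which attributes may flow
    active: bool = True

    def revoke(self) -> None:
        # The individual switches the connection off; nothing else is needed.
        self.active = False

    def permits(self, attribute: str, use_case: str) -> bool:
        return (self.active
                and attribute in self.attributes
                and use_case in self.use_cases)

agreement = DataSharingAgreement(
    organisation="Example Borough Council",   # invented for illustration
    use_cases=["address-verification"],
    attributes={"address", "postcode"},
)
assert agreement.permits("postcode", "address-verification")
assert not agreement.permits("date_of_birth", "address-verification")
agreement.revoke()
assert not agreement.permits("postcode", "address-verification")
```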

9. Technical Architecture

From a technical architecture perspective, Mydex wanted to take advantage of cloud infrastructure whilst maintaining the flexibility of federated data storage options for the individual. The Mydex platform architecture is entirely based on open-source software components, which we have configured uniquely to meet our needs and for which we have in some cases written specific extensions to deliver security or data architecture requirements. The key headlines of our architecture are as follows.

• A high-availability, horizontally and vertically scalable platform using open-source components, running in a UK-owned and operated data centre service on a cloud infrastructure that is, like Mydex itself, ISO 27001 certified. Our systems are monitored across multiple dimensions focusing on availability, resilience, security and serviceability, with automated redundancy and failover procedures operating 24 hours a day, all year round.
• All Mydex Personal Data Stores are discrete, independent collections of files, encrypted and controlled by the individual. Our architecture enables the individual to choose the location in which they want to store their PDS: this could be locally, or in another data centre. (A rough sketch of this encryption property follows the list.)
• The notions of federation of data and interoperability are core to our design, enabling an individual either to maintain a copy elsewhere in another service, or to export the entire contents of their PDS should they wish to move to another service. The resulting export is in a machine-readable format.
• We operate a zero-touch deployment environment as part of our efficiency and security measures. No access to our platforms is allowed via individual logins; all new software and services are managed by scripted, automated routines which are themselves subject to audit and review. All software is subject to static code analysis and hardening processes to ensure an extensive list of threats is tested for and protected against.
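As flagged in the list above, the property that the operator cannot read or reset a store can be illustrated with a few lines of Python using the widely available cryptography package. This is only a sketch of the general idea of individually keyed encryption; the chapter does not disclose Mydex's actual ciphers or key management, and the parameters below are illustrative.

```python
import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    """Derive a symmetric key from a secret only the individual knows."""
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                     salt=salt, iterations=480_000)
    return base64.urlsafe_b64encode(kdf.derive(passphrase))

# The platform stores only the salt and the ciphertext, never the
# passphrase, so it has no means of reading or 'resetting' the store.
salt = os.urandom(16)
key = derive_key(b"known-only-to-the-individual", salt)
ciphertext = Fernet(key).encrypt(b'{"address": "1 Example Street"}')

# Only re-deriving the key from the same passphrase recovers the data.
plaintext = Fernet(derive_key(b"known-only-to-the-individual", salt)).decrypt(ciphertext)
assert plaintext == b'{"address": "1 Example Street"}'
```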

10. Master Data Schema and API – The Core of Open Scalability and Transparency


Undoubtedly one of the most important innovations, both technically and commercially, was the Master Data Schema and Open API. Mydex maintains the Master Data Schema and publishes it openly to anyone who wishes to view it. The OpenAPI uses web standards, and we provide a free-to-use sandbox to enable anyone to experiment with our platform. Critical to our open philosophy are the following:



We will extend the Master Schema in any direction to meet the needs of individuals, organisations and application developers. We manage a programme of work and have designed it so it can be extended with relative ease. We do not charge for extension of the schema but it is a resourceconstrained activity so we manage the pipeline of requests for new datasets and attributes based on demand and priority from the stakeholder communities of developers, organisations and individuals. This work can be accelerated by the level of analysis and work done by requesting organisations. No one has any exclusivity of datasets or attributes. It is an open platform; the schema is for all and the individual cannot be locked into any one provider due to data schema constraints. This is fundamental.

11. User Experience

Mydex undertook extensive user experience research during the Community Prototype, and also saw a significant shift away from device-centric environments to web-based solutions using open standards. What is clear is that there are many different contexts in which human beings see and understand personal data, and this is affected by age, background and level of understanding of the internet and technology.


We undertook a design study to help communicate the potential of personal data stores, which we used successfully at a number of external events and which we termed Katherine's Journey. We translated this work into an initial hypercard design for the core personal data store. This tested well across all demographics and backgrounds. We also recognised that, in a lot of cases, individuals would be using mobile apps which accessed their personal data store, and therefore the UX would often be specific to the process they were using.

On our initial launch to early adopters we provided no specific help or members' guide, as we wished to see how easily people understood what the service was and how to use it. The results of this programme fed into the creation of a members' guide driven by a context-sensitive help model, so that we could provide assistance covering any area of the PDS and its operation.

11.1. Market Adoption

The process of selling participation in the Mydex Community Prototype also taught us important lessons about the motivation and behaviour of organisations in moving to the new model of personal data control. For us as Mydex founders, the rights and wrongs of organisation-centric control of personal data, and addressing the data security risks, seemed paramount and were a primary motivation. But we have referred above to how we found that for individuals, regardless of such considerations, convenience and getting things done are the key motivations for changing behaviour. For organisations, we learned almost the opposite lesson. It seemed almost self-evident to us that the vast return on investment for organisations of working with individuals equipped with structured and secure personal data stores would change their behaviour.


The technical, economic and pragmatic case is overwhelming, and so – it seemed to us – any clear-headed and rational large organisation would acknowledge the case and change course. What we learned during the prototype was, first, that any such decision may take a very long time: perhaps several years. But second, we learned that there are organisations with leadership capable of recognising the appropriateness of what one might call "fairtrade for personal data", and of immediately recognising that this is how the organisation they lead would want to relate to the individuals it serves. We have learned that this is a relatively quick decision: perhaps 10 minutes. Once that decision is taken, there is still the need for ready and persuasive answers to the lengthy and detailed technical, commercial and legal questions which arise, but that is then a process of months, not years. Adoption has to be, as far as possible, seamless and entirely convenient for individuals. Mydex' sales effort focuses on the organisations that will bring individuals. The organisations need to be businesslike and to deal with large numbers of individuals, but they also need to want to treat individuals in a responsible and emancipated manner.

Market adoption of Mydex by organisations started in a small way with the BBC. It is a peculiarly British institution which communicates and achieves a great deal without significant intrusion into people's privacy. It has innovated online without yet, perhaps, having had the colossal impact on the Internet that it had on earlier media technologies such as radio and TV. The BBC identified Mydex as a suitable way for viewers to do their own bookmarking, and in due course record their own viewing habits, for its experimental cultural service TheSpace.37

Another early adoption is by the UK Coalition government, which contracted Mydex to act as one of eight identity assurance providers for its "digital by default" public services.38 Mydex did not set out with the specific purpose of being an identity assurance provider, but it became clear that the personal data store provides an elegant and cost-effective solution to the need for individuals to prove trustworthiness by storing and producing proofs of claims. When the government invited tenders from prospective ID assurance providers, Mydex bid and won, the only SME and the only social enterprise to do so. This framework contract, placed originally by DWP but then novated to the Cabinet Office, which coordinates cross-government ID assurance, values ID assurance for public services at £25m for 18 months for the UK. The first alpha pilots of UK ID assurance are being rolled out during 2013. A third example is Thames Valley Housing Association, which will offer residents services via a Mydex personal data store, initially for registration and identity authentication processes.

These three services are contracted at the time of writing and due to be deployed during 2013. They prove a first stage of market adoption, but user adoption is not yet proven. In a global context these steps by Mydex are relatively insignificant, but they fit into a wider pattern of a fundamental global shift to individual control over personal data. This starts with awareness and policy work, such as has been done for several years now by the World Economic Forum.39 As part of that, Boston Consulting estimated the value of our digital identities alone at €20tr across Europe by 2020.40

37 See TheSpace.org.
38 See https://www.gov.uk/government/news/providers-announced-for-online-identity-scheme.
39 See e.g. http://www.weforum.org/issues/rethinking-personal-data.
40 See https://www.bcgperspectives.com/content/articles/digital_economy_consumer_insight_value_of_our_digital_identity/.


Ctrl-Shift's estimate of the future value of volunteered personal information by the same date is more modest, putting it at £20bn for the UK alone. And Ctrl-Shift puts our digital ID as but one of five categories of volunteered personal information, of which by far the most valuable is our future purchasing intentions.41 The list below offers further evidence of awareness, and of real progress: the world is moving in this direction, with:

• global-scale data-giveback availability from Twitter, Facebook and Google's "Data Liberation Front";42
• the US Veterans' Administration Blue Button,43 usage of which passed 1m in August 2012;44
• the US "yellow button" for education data and "green button" for energy data;45
• the UK government's Midata policy on data givebacks by business, which became law in April 2013 (Midata held its first hackathon in December 2012,46 and is now setting up a Midata Innovation Lab);47
• moves in the UK towards data givebacks from TfL48 and from the NHS for patient records;49
• the emergence of the first consumer-facing VRM services, such as MoneySavingExpert's cheap energy club.50


12. What Does This Lead To?

The outcome of this change will be a new personal data ecosystem in which the person-centric model also plays a full and effective role, in mutually beneficial co-existence with the still-evolving organisation-centric model. The person-centric model will make possible new things which were not previously possible. It will take over, and perform better, some functions for which we currently rely on the organisation-centric model, and there promises to be an area of rich interaction between the new model and the old.

It is not realistic to predict precisely what this synthesis will look like in the medium and long term. We can offer simplistic images, for example of building a bridge with support at both ends, or of walking on two legs instead of one, but the way new information logistics play out in every sector of activity where individuals interact with organisations has a fractal quality in its complexity. For any sub-sector within any sector one looks at (whether self-management of tax affairs by freelance programmers in financial services, or self-management of a specific type of diabetes in health), the possibilities discussed will remain every bit as complex as this 'helicopter view' discussion of the whole change.

41 See https://www.ctrl-shift.co.uk/index.php/research/product/37 – subscription required.
42 http://www.dataliberation.org/
43 http://www.va.gov/bluebutton/
44 http://www.markle.org/publications/1679-video-blue-button-download-capability
45 http://energy.gov/data/green-button
46 See description at http://mydex.org/why-midata-matters/midata-hackathon-overviews/.
47 Described in a blog post by the BIS official leading this work at http://blogs.bis.gov.uk/midata/tag/midata-innovation-lab/.
48 For TfL see http://www.tfl.gov.uk/tickets/20785.aspx.
49 For the NHS see e.g. http://www.ehi.co.uk/news/EHI/8444/cerner-and-tpp-show-nhs-blue-button.
50 http://www.moneysavingexpert.com/cheapenergyclub


Various terms have been coined for the person-centric model: VRM (vendor relationship management), buyer-centric commerce. The seismic change from organisation-centric to person-centric has been dubbed a polarity shift or a control shift. Analyst Craig Burton has repeatedly told the 'true believers' gathered at the twice-yearly Internet Identity Workshop unConference events that until a common language is sorted out, no meaningful progress is possible. In this Yearbook, published September 2013, we would observe that after a decade of discussion and speculation this change appears now to be under way and accelerating. It is not yet far enough or fast enough to prove Burton wrong; indeed, he has often been right before and may well be right again. We are at a Wendepunkt – the pivotal moment – but still need this common language to describe it.

Any narration of specific current developments – evidence of problems with the organisation-centric model, progress on data givebacks, ID assurance, the emergence of individual-centric services – would date quickly. This year, perhaps for the first time, you could narrate the change on a basis of weekly evidence; the pace of change is better suited to a 'live update' weekly news service than a Yearbook.

Personal data is making progress towards individual-centric solutions. Having business, public services, developers and individuals start to work within a complementary, individual-centric model of control is a profound change, and it won't happen overnight. As Susan Landau argues about federated identity management, there are challenges in finding short-term incentives.51 There will be casualties in the transition: not just dinosaurs that fail to evolve, but also start-ups whose timing or positioning is wrong (such as the financial management service Wesabe,52 or the initially promising "quantified self" start-up Zeo).53 Like enlightenment itself, the switch to personal control over personal data is conceptually a simple turning point, but one whose profound impact and implications will take years to play out.


13. Conclusion

The antithesis to more than a decade of organisation-centric personal data and Web 2.0-type services is now well thought out, articulated, researched and prototyped, built and ready to take hold. The case for a person-centric model of control over personal data is very strong, and there is market evidence that it is starting to establish itself.

Several questions remain. How far will it go, and how fast? What will enlightened co-existence look like when it is established and the two models are working together? The initial benefits to organisations of ceding a measure of control and trust to individuals seem compelling, but how far down that path will they go? Actions and market reactions will, as ever, speak louder than words. But the possibility of a basis of respect for individuals, for individual control and for trust in an online society is worth working hard for. Enlightened individuals demand and require autonomy in an information society. To achieve that, they need the tools and rules necessary to give them control over their personal data.

51 Susan Landau and Tyler Moore, Economic Tussles in Federated Identity Management, First Monday. See http://firstmonday.org/ojs/index.php/fm/article/view/4254.
52 http://wesabe.com/
53 http://telecareaware.com/quantified-self-fail-nighty-night-for-zeo/


Digital Enlightenment Yearbook 2013
M. Hildebrandt et al. (Eds.)
IOS Press, 2013
© 2013 The authors.
doi:10.3233/978-1-61499-295-0-270

Personal Data Management – A Structured Discussion

Jacques BUS a,1 and M.-H. Carolyn NGUYEN b
a Digital Enlightenment Forum
b Microsoft

Abstract. In the coming decade, a seamless integration of our on- and offline lives will be necessary for a sustainable digital society. This requires urgent multidisciplinary debate on the collection, control and use of personal data in society. This paper proposes a framework that can be used to shape this dialogue. It is based on a neutral and consistent terminology that may support a constructive and fruitful debate, avoiding terms that have too often led to controversy and confusion. We attempt to position state-of-the-art technology developments within this structure, including context-awareness, user-centred data management, trust networks and personal data ecosystems. This demonstrates clear relations between ongoing work in various groups, and hence an urgent need for cooperation to achieve common goals. We argue that such cooperation can lead to the emergence of a personal data ecosystem that may truly support a sustainable digital society.

Keywords. Personal data management, identity, privacy, context-awareness, trust networks, user-centricity, personal data ecosystem


Introduction

In the coming decade, the seamless integration of our on- and offline lives will become an increasingly important issue facing society. It will require that individuals be able to manage how information related to them is used, whether directly or through other means, in a way that meets their preferences, contexts, and values, all within existing social and legal boundaries. This will be a departure from current practice, where users give near-automatic consent to their data being processed, often without any true understanding of the full context in which the data will be used. Understanding the impact of this change is crucial, and broad societal discussion is necessary to ensure sustainable social and economic development. This can be challenging, especially in a world of big data, where individuals may not be aware of the majority of the data that exists – or can reveal information – about them. Enabling individuals to exert some control over how these data are used will be an important aspect of an overall solution.

Big data poses other challenges to existing approaches to data privacy, pointing to the need for dialogue among stakeholders on a policy framework that can help create a sustainable data ecosystem – one that will drive new business models and innovation while also strongly protecting individual rights. The World Economic Forum highlighted some of these issues in its report [1].

1 Corresponding Author.


The Digital Enlightenment Forum, with this Yearbook and its 2013 Forum, is bringing together experts to develop a coherent framework and consistent policy recommendations to help address them. This paper aims to facilitate and further these discussions by presenting a possible framework that can be used to shape the dialogue, and by selecting the technological developments and issues that should be taken into consideration. We first try to develop language and terminology that might help to establish some structure and make progress in the discussions: words like identity, privacy, personal data, trust, context and many more are used by different people in different ways, leading to disagreement and confusion, and priorities are weighed differently in different disciplines. In a second step, we describe ongoing technologies and developments that can contribute to a sustainable structure for personal data management.


1. Privacy, Personal Data Management, and Identity

Privacy, personal data management, and identity are controversial issues in policy development for the digital society. Unfortunately, the debate suffers from a persistently confusing use of terminology by various stakeholders, and any progress will require some mutual agreement around these concepts. We start, therefore, with some discussion of the nuances of the terminology, before integrating the concepts into a layered model that can be used to address issues related to the use of personal data in the digital society, taking individual preferences and context into consideration.

First, although privacy and personal data management (PDM) are rightly seen as overlapping, they are not the same. Some perceive privacy to be broader than PDM, as it also includes physical privacy, "the right to be left alone", the dignity of the person and other aspects of social behaviour. For others, privacy is narrower than PDM, since the latter includes the management of the complete lifecycle of data that may relate to a person (e.g. attributes of relevance to an individual, the context for sharing personal data, and decisions on fair value exchange); for them, privacy is just one aspect of a holistic view of PDM. In addition, whereas privacy often connotes withholding personal information to ensure the "right to be left alone", PDM connotes managing the flow, access, and use of data to ensure that it can deliver perceived value to the individual. Depending on the individual, this value can be at the personal, social, economic or global level. In the context of the consequences of digitisation, it therefore seems more relevant to discuss PDM than privacy.

Thinking about PDM, it must be noted that what is considered personal is a complex concept, interpreted very differently in different contexts, cultures, and social and political environments (see, e.g., James Q. Whitman [2] and the report from the International Institute of Communications [3]). This means that if we want to consider personal data management as a universal concept, its definition should not depend on the specific social or cultural environment in which we apply it. Furthermore, an operational definition must integrate the contexts of data use, including social and cultural norms. Such a definition would also need to reflect the fact that preferences and norms can change over time.

Another source of confusion is the term identity. In the governmental context, identity is often interpreted as the set of data (attributes) required to uniquely and authoritatively identify a person, which is then often related to a unique identifier (e.g. a citizen registration number). The proposed EC eID regulation, for example, demands assurances from Member States that their "notified eID systems" for authentication lead to a single, unique citizen.


However, modern technology for authentication based on attributes aims at data minimisation, as described in the OECD guidelines for privacy, and does not require a unique identifier – only that the attributes relevant to the given transaction be verified to allow access. When applied cross-border, standards would thus be needed to ensure that such assurances, when used to access a public service (including passing a national border), could also satisfy the uniqueness requirement mandated at a national level.

In many situations, it is sufficient and desirable to claim only one or a few attributes – often without name and address – to obtain access to a public or private service (e.g. the age of a person purchasing liquor, or the assurance that a credit card is valid) or to communicate with others. Moreover, social behaviour in the physical world shows us that people present themselves differently (sharing different data, emphasising particular personal details) in different environments (family, work, sports club, friends, public places, etc.). This "multiple identity" behaviour, including the ability to be anonymous or pseudonymous, must be supported in the digital world if the concept of a digital society is to be acceptable and trusted.
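To make the attribute-based approach concrete, here is a minimal sketch of minimal-disclosure claim verification. It is illustrative only: the names are invented, and a shared-secret HMAC stands in for the asymmetric or zero-knowledge credential schemes a real Trusted Identity Provider would use.

```python
import hmac, hashlib, json

# Hypothetical trusted identity/claim provider (TIP): it signs individual
# attribute claims, so a relying service can verify a single attribute
# without ever seeing a name or address. In a real scheme the verifier
# would hold only a public key, not this shared secret.
TIP_KEY = b"demo-signing-key"

def issue_claim(attribute: str, value) -> dict:
    """TIP side: sign one attribute (e.g. 'over_18': True) for a persona."""
    payload = json.dumps({"attribute": attribute, "value": value}, sort_keys=True)
    sig = hmac.new(TIP_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"attribute": attribute, "value": value, "sig": sig}

def verify_claim(claim: dict) -> bool:
    """Relying-party side: check the signature; learn only this one attribute."""
    payload = json.dumps({"attribute": claim["attribute"], "value": claim["value"]},
                         sort_keys=True)
    expected = hmac.new(TIP_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, claim["sig"])

# A liquor shop needs only the 'over_18' claim -- no identifier is exchanged.
claim = issue_claim("over_18", True)
assert verify_claim(claim) and claim["value"] is True
```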


For clarity of the discussion that we want to start here, we will avoid the word "privacy" and use the terms "context-aware PDM" and "identity" as described below:

Context-aware PDM (CPDM) enables an individual to control the access and use of her personal data in a way that gives her sufficient autonomy to determine, maintain, and develop her identity as an individual, including the choice of which attributes of her identity to present depending on the context of the transaction.

CPDM will enable consideration of constraints relevant to personal preferences and cultural, social, and legal norms. Trustworthy data practices are foundational to enabling CPDM, and hence issues of security, integrity, and protection of personal data, as well as data governance, auditability and other aspects related to trust, must be recognised and addressed.

This description contains undefined elements (such as sufficient and context). These depend on the implementing environment (e.g. the purpose of data use, the personal value derived, social norms, cultural rules) and make it possible to discuss the concept of CPDM at a universal level. One person in a specific context may make personal choices different from another person in the same context; the same person may also make different choices regarding the same data in different contexts. This does not change the essence of the right to privacy, but rather reinforces it as including the right to such choices, and hence the right to a certain socially and culturally acceptable autonomy.

Also note that, although we discuss the concept of data use in the context of a transaction, it is not necessary for the individual to be actively engaged in the transaction. For example, she could agree that her health data may be used in medical research; in this case, the research is the "transaction". Regardless of her level of engagement in the transaction, the use of her data must still meet the constraints defined above.

Although the description of CPDM is applicable across jurisdictions, it does not explicitly address cross-border or cross-jurisdictional aspects; in some ways, it assumes these. For the definition of CPDM, we have chosen to emphasise one's choice to develop a personal identity and manage personal data within given contexts, as we feel that this is a critical enabler for integrating our on- and offline lives.


2. User Control of Personal Data

Let us first come back to the concept of personal data. In the EC proposal for a Data Protection Regulation, it is defined as follows:


“Personal Data” means any information relating to a data subject, being an identified natural person or a natural person who can be identified, directly or indirectly, by means reasonably likely to be used by a data controller or by any other natural or legal person, in particular by reference to an identification number, location data, online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that person [4].

In the US, the term Personally Identifiable Information (PII) is used, which is essentially the same. Until recently, a commonly accepted approach for preserving the anonymity of personal records, and for providing some minimal privacy protection, was to remove any "personal data" such as names and social security numbers from such records before storing them in large databases or sharing them for public use. However, recent work has demonstrated that by aggregating multiple databases, the records can be easily "re-identified" or "de-anonymised" [5].

An early, quite well-known case involved a Netflix contest, which ran from October 2006 to September 2009, in which $1 million was to be awarded to the team that could best improve the company's movie-recommendation algorithm [6]. Contestants were given access to members' movie ratings to develop an algorithm that could accurately predict movies those same users would enjoy. Two massive data sets were released: 100 million movie ratings from 480,000 customers, along with the date of the rating, a unique subscriber ID, and the movie information – all data that were considered anonymous, unable to lead to identification, and therefore not traditionally considered personal data. Nevertheless, Netflix was sued for breach of consumer privacy after researchers at the University of Texas proved that the movie ratings could be de-anonymised and traced to identifiable individuals [7].

For big data, where diverse data sets are commingled to derive new insights, these research findings make it less clear how personal data should be defined. If taken to the extreme, they may lead one to conclude that any data may potentially be classified as personal data. Since many data protection frameworks today limit inappropriate use of personal data by restricting its processing, and do not differentiate among the different types of personal data, the combination of these two approaches would have a serious impact on enabling the value of big data to be realised.

In the remainder of this paper, we use the term personal data (PD) generically to denote any data that can be related to an individual, since any such data has the ability to impact an individual's identity. However, we believe that an alternative policy approach to CPDM is needed to ensure that the value potential of big data can be realised – one that is based on data use rather than data collection or processing. This is a driver in our development of this paper and is addressed further below.
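A toy illustration of the re-identification risk just described: joining an "anonymised" ratings table to a public, named dataset on shared quasi-identifiers (here, matching items and dates). All records and names are invented for the example.

```python
from collections import Counter

# "Anonymised" release: no names, only an opaque subscriber ID.
anonymised_ratings = [
    {"subscriber_id": 8471, "movie": "Brazil", "rating": 5, "date": "2005-03-12"},
    {"subscriber_id": 8471, "movie": "Network", "rating": 4, "date": "2005-04-02"},
]
# Public dataset: e.g. reviews posted under a real name on a public site.
public_reviews = [
    {"name": "A. Smith", "movie": "Brazil", "date": "2005-03-12"},
    {"name": "A. Smith", "movie": "Network", "date": "2005-04-02"},
]

def link(anon, public, min_matches=2):
    """Count overlapping (movie, date) pairs per (subscriber, name) pair;
    a few matches suffice to tie the opaque ID back to a named person."""
    hits = Counter()
    pub = {(r["movie"], r["date"]): r["name"] for r in public}
    for row in anon:
        name = pub.get((row["movie"], row["date"]))
        if name:
            hits[(row["subscriber_id"], name)] += 1
    return [pair for pair, n in hits.items() if n >= min_matches]

print(link(anonymised_ratings, public_reviews))  # [(8471, 'A. Smith')]
```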

In discussing CPDM in an increasingly data-driven world, it is relevant to distinguish:

• Actively collected PD, provided by the user as part of a transaction (e.g. address, credit-card information);
• Passively generated PD, which include data that are either (a) collected without user awareness as part of an active transaction (e.g. location information while completing a purchase); (b) collected without user awareness, where users are also unaware of any ongoing transaction (e.g. video camera capture as individuals walk through a train station); or (c) inferred data resulting from analytics of data that can be aggregated about the user (e.g. a credit-report rating).2


For actively collected PD, users can control the flow and use of their data by giving or withholding consent at the point of data collection – this is the notice/consent model in use today. As indicated above, for most users consent is given without any true understanding of the full context in which the data will be used, or of the potential benefits obtained beyond the immediate service. With big data, the majority of data collected will increasingly be passively generated. For these data, it is neither practical nor, in some cases, possible for users to exercise explicit control.

The aforementioned World Economic Forum report discussed the need for a new approach to managing PD in a decentralised data-driven ecosystem, to balance the socio-economic value that can be unleashed by what we term CPDM. It also discussed the shift to a use-based approach for these data that takes personal context into consideration, along with new ways to engage individuals. Underlying these changes are a set of updated principles for CPDM, complemented by enforceable codes of conduct and by technology that can facilitate the implementation of trustworthy data practices and contextually aware data usage in such an ecosystem.

In addition to controlling the use of data, another aspect of CPDM is protection against unauthorised access to data. This includes preventing entities that cannot adequately verify their identities from accessing and using the data, and preventing a breach of the data while at rest (e.g. in storage) or in transit. These measures are necessary, but not sufficient, to ensure the fundamental trustworthiness of the system, regardless of whether the personal data are actively collected or passively generated. When the data are stored in users' personal storage devices, the user is in control and accountable for keeping the data secured, using technologies such as data encryption. When the data are stored in a cloud, or in databases managed or owned by third parties, the appropriate service providers are accountable for keeping the data secured, and conditions would be agreed for enforcement. If the service provider offers a secured personal data storage service, it may also agree with the user to keep the data encrypted and to provide the user with a private key. Of course, keeping stored data secured does not mean secured data processing. Some possible ways of improving the secure management of data in transit are discussed later.
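As a minimal sketch of the client-side encryption arrangement just described, the following assumes the third-party Python cryptography package (its Fernet construction provides authenticated symmetric encryption). The storage provider only ever sees ciphertext, while the individual holds the key.

```python
import json
from cryptography.fernet import Fernet

# The key is held by the individual, not by the storage provider.
key = Fernet.generate_key()
box = Fernet(key)

record = {"attribute": "home_address", "value": "10 Example Street"}
ciphertext = box.encrypt(json.dumps(record).encode())

# The provider stores and returns opaque ciphertext; only the key holder
# can decrypt it. (Processing over encrypted data is a separate problem.)
restored = json.loads(box.decrypt(ciphertext).decode())
assert restored == record
```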

3. A Reference Model for Personal Data Management

The issues that need to be addressed in achieving CPDM can be structured into three "layers":


• Infrastructure
• Data management
• User interaction.

2 The chapter by Nguyen et al. in this volume discusses how, in the world of big data, the number of devices, sensors, and other objects that can collect data will vastly exceed the number of people on the planet, and how the majority of big data will be collected passively and automatically, through machine-to-machine transactions, without user involvement. The choice of terminology used in that chapter, as well as here – actively collected PD and passively generated PD – emphasises these facts and reinforces the need for new approaches towards policy development. The World Economic Forum uses the terms volunteered, observed, and inferred data to refer to the same in [1].

This model can also provide a useful frame of reference for debating questions regarding future developments on technological, economic, social, and policy issues relevant to PDM. These elements must be balanced and must co-exist at every level of the model. The essence of the debate on the management of personal data and citizenship in the digital society is driven by how the elements of the three layers in this model should be specified and implemented; what the guiding principles are for enabling them to provide trustworthy data practices; and how they can balance the needs of individuals, industry, and regulators.


3.1. Infrastructure

The infrastructure layer contains the services and applications required to assure the integrity and security of the data, both in transit and at rest, including support for the appropriate protocols, service bindings, etc. Processes facilitated by the infrastructure should be trustworthy. This entails an architecture designed with security and data protection in mind (Security and DP by Design). Technical considerations include encryption, and the use of techniques such as differential privacy to protect against unintended personal data breaches and system attacks.3 The infrastructure should facilitate logging, monitoring and random control of compliance with the basic rules agreed for it and for the processes it facilitates. Services provided by these elements should be available through Application Programming Interfaces (APIs) as part of the platform. Authentication (verifying identity, or claims4 in general), and the services provided by Trusted Identity (or Claim) Providers (TIP) at various levels of assurance, including anonymous or pseudonymous claims, are essential elements of this layer, as they provide a foundation for trustworthy practices.

Trustworthiness and acceptability of the infrastructure may be achieved through market mechanisms (reputation, brands, price); through regulation, certification, control and enforcement; or through other, more direct forms of governance directly supervising (parts of) the infrastructure. The choices here depend on the culture and the political environment. In western societies, governance would usually include sufficient checks and balances, aiming at separation of concerns and the avoidance of conflicts of interest.

3 Differential privacy refers to techniques that enable analysts to extract useful insights about a population as a whole from a database containing personal information, while protecting the individuals in the sample from being identified [19]. These techniques work by introducing small distortions into the results in a way that does not invalidate the results extracted, but protects individual privacy. As such, they prevent unwanted re-identification of the personal data contained in the database.
4 Claims are defined here as assertions about one or more attributes for a given persona.
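As a concrete illustration of the differential privacy technique described in footnote 3, the sketch below answers a counting query with Laplace noise whose scale is calibrated to the query's sensitivity and a privacy parameter ε. It is a toy under stated assumptions, not a production mechanism; the data and function names are invented.

```python
import random

def dp_count(records, predicate, epsilon=0.5):
    """Differentially private count. A counting query has sensitivity 1
    (adding or removing one person changes the count by at most 1), so
    Laplace noise with scale 1/epsilon masks any individual's presence."""
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two exponentials with rate epsilon is Laplace(0, 1/epsilon).
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# How many people in the database are over 65? The noisy answer is useful
# in aggregate but reveals (almost) nothing about any single record.
people = [{"age": a} for a in (23, 41, 67, 70, 34, 81)]
print(round(dp_count(people, lambda p: p["age"] > 65), 1))
```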


3.2. Data Management

The data management layer includes the elements (apps, services, etc.) required to enable individuals and service providers to effectively control the flow and use of personal data, based on specified use permissions and policies. Elements of this layer include mechanisms that enable individuals to specify data use permissions and service providers to communicate data use policies, data policy enforcement services, and capabilities to audit compliance with data use rules. Trustworthy data practices are critical at this layer to enhance user trust in the overall data ecosystem. One method of improving trustworthiness would be a contract between a data controller and the user, under which the controller is accountable for abiding by the use permissions specified by the user. Technology such as an interoperable metadata-based architecture that (logically) binds permissions and policies to the data can provide the means for enforcing such contracts (see the chapter by Nguyen et al. in this volume). Service providers would leverage services offered by the infrastructure and use the API management system(s) provided. For example, a TIP can provide the claims needed to authenticate an entity and enable it to access and utilise a given piece of data, based on the data policy specified in the metadata. Services that can be offered based on the elements provided at this layer include user reputation management, trusted information sharing in social networks, consumer-trusted credit ratings, consent management, advertising management, auditing, data access requests, and many more we cannot yet envisage.
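A minimal sketch of the metadata-based, policy-bound ("sticky policy") idea described above, with invented class and field names: the use policy travels with the data, and an enforcement point checks and logs every access.

```python
from dataclasses import dataclass, field

@dataclass
class PolicyBoundData:
    """A datum with its use policy (logically) bound to it as metadata."""
    value: object
    allowed_purposes: set = field(default_factory=set)  # e.g. {"care"}
    allowed_parties: set = field(default_factory=set)   # e.g. {"gp_practice"}
    audit_log: list = field(default_factory=list)

    def access(self, party: str, purpose: str):
        """Enforcement point: grant or refuse, and log either way for audit."""
        granted = (party in self.allowed_parties
                   and purpose in self.allowed_purposes)
        self.audit_log.append((party, purpose, "granted" if granted else "refused"))
        if not granted:
            raise PermissionError(f"{party} may not use this data for {purpose}")
        return self.value

record = PolicyBoundData("blood pressure: 120/80",
                         allowed_purposes={"care"},
                         allowed_parties={"gp_practice"})
record.access("gp_practice", "care")        # succeeds, and is logged
# record.access("advertiser", "marketing")  # would raise PermissionError
```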


3.3. User Interaction and Context

The user interaction layer includes the elements that enable end users to have a meaningful interaction with service providers regarding the permissions and policies associated with the use of their personal data. Clearly, what is meaningful and intuitive in one culture or jurisdiction might not be so elsewhere, and this goes further than simply implementing local legal compliance and cultural norms. The user experience and interaction models to be developed need to take into consideration users' mental models of personal data, reflecting potential differences in shifting personal preferences as well as social and cultural norms. And they should allow for the emotional and autonomous choices of the user in the given context.

Today, data use is normally expressed as a binary decision: either the user agrees to the use of the data collected, or not. In the physical world, data use is highly nuanced and contextual. The same data that are considered sensitive by one person may not be considered so by another. Even for data that are considered sensitive, their uses are highly dependent on the context and on the assumed identity or persona at a given time. Consider location data as an example. Some users may never want their location data to be displayed. Others may need location data to be visible to their employers during working hours, while they are on premise, but not outside working hours or when they are off premise. Some users may want location data to be visible to their immediate families at all times. Others may want location data to be visible to all retail shops in the vicinity that offer deals of special interest.
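The contextual location-sharing preferences just described can be made concrete with a toy rule evaluator. The rule set and context fields are invented for the example; a real system would bind such rules to the data as policy metadata.

```python
def location_visible(viewer: str, ctx: dict) -> bool:
    """Evaluate one user's context-dependent location-sharing rules."""
    if viewer == "family":
        return True                                    # visible at all times
    if viewer == "employer":
        return ctx["working_hours"] and ctx["on_premise"]
    if viewer == "retailer":
        return ctx.get("offers_relevant_deal", False)  # only for relevant deals
    return False                                       # default: share nothing

ctx = {"working_hours": True, "on_premise": False}
assert location_visible("family", ctx) is True
assert location_visible("employer", ctx) is False      # off premise -> hidden
```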


Context can be loosely defined as the factors that may affect a user's consideration of what is an acceptable use of personal data. This can include a number of elements, such as the type of data, the nature of the interaction, how the data is used, and whether the use is perceived to be of value to the user and/or the community. In the physical world, people use a repertoire of personas as they interact with a variety of different people; the information they share depends on the persona they want to portray in each context. When these different personas are not recognised in the digital world, contextual integrity [8] is violated: the wrong data is shared in the wrong context. Sometimes users want to maintain the integrity of contexts; sometimes they want to traverse contexts. What they choose will depend on their preferences, norms, culture, and political and legal environment. These and other factors define a nuance of data use and a granularity of control that are not yet integrated into today's approach to privacy and personal data management.

The lack of finer control may seem justified when one considers the overwhelming evidence that, although users express a desire to control the use of their personal data, few actually do so. But there can be many reasons for this: lack of simple and intuitive tools, lack of technical knowledge, lack of perceived value in doing so, too much information being presented, and so on. Moreover, this lack of finer control can adversely impact the ability of data to flow, as discussed in Section 2 above.

There are multiple conceptual approaches to enabling simple but effective context-aware setting of data use permissions – not all of which require explicit control by the user. One simple approach is to enable users to set data use permissions not at the point of data collection, but immediately prior to the data being used, thus giving the user a better sense of the benefit received. Recommender systems that rely on contextual and demographic data to provide personalised privacy settings offer another approach [9]. To accommodate changing norms, it is conceivable that "norms databases" could be established that capture how similar classes of data are used in different contexts; such services could be provided by third-party consumer organisations, similar to the services that consumers rely on today for ratings of everyday household products. Proxy services that learn user preferences and act on a user's behalf can also help facilitate context-aware data permission settings. All these approaches would rely on the underlying data management and infrastructure layers to provide the mechanisms necessary for the specification and enforcement of such context-aware data use policies. In the next sections, we discuss how, together, the elements in these layers enable different types of trust networks and new business models for CPDM.


4. Building a Context-Aware Data Ecosystem

4.1. The Need for Trust

As the volume of digital data generated increases, and as more businesses depend on data analytics to drive their operations, people are growing increasingly concerned about their loss of control. This loss of control, and the asymmetry of power between individuals and businesses, are causing a crisis of trust. For CPDM, as defined above, to have a positive effect, the user must be able to trust the various stakeholders in the data ecosystem to implement CPDM satisfactorily; as a result, she can trust that she can develop her social and private life as an individual. Let us therefore first examine the concept of "trust" in more detail.

Trust is bestowed on an entity (whether a person or a system) if that entity is perceived to be trustworthy in the given context and/or for the given task (see [8] and the contributions of Bus and O'Hara in [11] for more extensive discussions of trust). For example, an airline pilot and a surgeon would, in most cases, be considered trustworthy while performing their professional tasks. Most importantly, trust is strongly dependent on the context in which it is given.


When applied in the digital environment, we often do not realise that many of the factors that play a role in trusting people or organisations offline are not available online – e.g. the experience of seeing a person's behaviour and presentation, or hearing her talk (see also [10]). Although it is clear to many that trust is not derived only from classical rational arguments, it is worth noting that, in order to be trustworthy, systems must address two sets of elements in their design:



• Rational elements that provide concrete evidence to support trustworthiness. This includes ensuring mutual interest; providing insurance; being accountable and accepting liability for damage; complying with legal obligations and contractual conditions; providing technical assurances through Privacy by Design, Privacy Impact Assessment and quality of development; ensuring transparency, auditability and understandable information; being certified by trusted entities; ensuring good corporate practices with visible and enforceable codes of conduct; and, in general, engaging in behaviour that ensures a good reputation. Note, however, that this might be culturally determined.
• Emotional elements that appeal to users' perception of the entity. This includes building sympathy and a positive opinion, for example through being friendly or charming; providing consistent value, convenience and usefulness; offering an environment that feels familiar and friendly; enjoying wide social and/or cultural acceptance; or providing good customer support and maintenance services.


4.2. Organising Trust in the Digital Domain

Similar to the terminological confusion around identity, privacy and personal data management discussed in Section 1, we see confusion among the various concepts of "trust" in connection with services, frameworks, networks, providers, etc. We will therefore first clarify the language we use here.

In offline life, people (or more generally, stakeholders,5 which can include enterprises, public services, institutions, etc.) generally work and live in communities that share some common beliefs and benefits. We indicated above that people tend to adopt different personae or identities in these communities, examples of which include families, states, sports clubs, healthcare systems, or financial networks. In such communities, participating stakeholders organise their social and economic transactions according to the rules and norms underpinning the trust that has been built up within the community. Given this, online CPDM and the digital ecosystem that facilitates it must have capabilities that can reflect the social structure of offline living.

We use the term Trust Framework (TF) for a set of rules and practices on PDM and digital transactions that aims to create trust between the stakeholders in one or more communities within a social structure. Figure 1 shows the components that can be included in a TF. The enforcement element in the TF ensures compliance with the rules specified. Entities engage with a TF via contracts that specify the set of relevant technical, business, and legal rules for the desired transactions. For example, a TF for healthcare in a country may specify the appropriate regulations, codes of conduct for data use and data sharing with others in the sector, the level of data security that must be supported, the business practices, and audit reporting. We call a TF a context-aware TF if it supports CPDM.

5 A "stakeholder" in a community is used here to mean a natural person or legal entity that has an interest in cooperating with others in this community.


Figure 1. Definition of a trust framework.


Figure 2. Example of a service provider-centred trust network.

Finally, a Trust Network (TN) is a community of stakeholders that organises its digital transactions in accordance with a TF. As such, a TN can be considered an instantiation of a TF for a specific community. Figure 2 shows an example of a service provider-centred TN. In this TN, a group of service providers (SPs) have agreed to abide by a set of rules specified by a given TF. The TN also includes additional rules that may be specific to a given context, e.g. an application, a sector, or a specific community; these additional rules may also be adopted for reasons of standardisation and interoperability. Users can enter into contractual relationships with one or more SPs in the TN if they trust the assurances that all SPs will abide by the rules specified by the TF, as well as by the additional rules included in this specific TN.

An example of a TN is a regional healthcare system, with hospitals, doctors, nurses and patients, that adopts a TF specified for healthcare; it might have additional rules regarding the use of personal data for research purposes. Other examples of TNs include bookshops and their customers; companies providing public transport services and their customers; banks and their customers; and also families or groups of friends communicating through social networks and other digital means. A TN which is an instantiation of a context-aware TF is called a context-aware TN.

We want to focus on context-aware TNs, and in particular on those that give stakeholders sufficient means to control their data.


Figure 3. Example of an O2M service provider-centred trust network.

Moreover, we are particularly interested in TNs that ensure sufficiently high trust levels between the stakeholders regarding regulatory compliance, context-aware PDM, and compliance with the norms, rules and other service-provision conditions agreed between the stakeholders in the context-aware TN. Such a TN would need to implement appropriate services and applications, at each of the layers discussed above, in its implementation stack.

For some time now, service providers have been trying to build platforms that can support trust management in TNs. However, they often focus only on specific aspects (authentication, data security, reputation, etc.), leaving many other functions required for CPDM to be implemented by the TN itself. More recent developments have taken different and more promising directions; these are discussed in the following sections.
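Before turning to those sections, a minimal data-structure sketch of the TF/TN relationship defined above may help fix the terminology. All names and rules here are illustrative, not drawn from any particular framework.

```python
from dataclasses import dataclass, field

@dataclass
class TrustFramework:
    """A TF: a named set of rules and practices on PDM and transactions."""
    name: str
    rules: set  # e.g. {"codes-of-conduct", "audit-reporting"}

@dataclass
class TrustNetwork:
    """A TN: one community's instantiation of a TF, possibly with additions."""
    framework: TrustFramework
    members: set = field(default_factory=set)
    additional_rules: set = field(default_factory=set)

    def effective_rules(self) -> set:
        # A member must abide by the TF's rules plus the TN's additions.
        return self.framework.rules | self.additional_rules

healthcare_tf = TrustFramework("healthcare", {"codes-of-conduct", "audit-reporting"})
regional_tn = TrustNetwork(healthcare_tf,
                           members={"hospital", "gp", "patient-group"},
                           additional_rules={"research-use-consent"})
assert "audit-reporting" in regional_tn.effective_rules()
```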


4.3. Trust Networks Involving Service Providers

A TN might use a digital platform from a third-party service provider to execute its transactions – such a service provider is termed a TN Platform Provider (TNPP). A TNPP is considered one of the stakeholders in the TN and is responsible for ensuring PDM and service provisioning for the digital transactions in the TN, in accordance with the TF as well as with any additional rules and conditions adopted in the TN. The platform provided by the TNPP often implements the services in the infrastructure and data management layers discussed above; it may or may not implement parts of the user interaction layer. Within the TN, "trust" is established when the TNPP can provide, through technology and contract, sufficient confidence about this to all other stakeholders in the TN. There are two primary types of TN involving service providers.

1. One2Many (O2M): This TN is a transaction network between one service organisation and many end-users, supported by a TNPP. Figure 3 shows an example of an O2M service provider-centred TN. The service organisation can be a group of organisations if there is sufficient agreement between the members of such a group to act together as a single organisation. The service organisation agrees, by contract with the end-users, how the PD of the end-users in the TN can be used. Based on this contract, a technical interface can be implemented and managed by the TNPP, which is accountable to the TN, by agreement, for ensuring that PDM and service management follow the contractual obligations. This could be achieved partly by technical means and partly through governance, depending on the agreement between the TN and the TNPP. The Dutch start-up QIY [12] is an example of this type of TNPP. Examples of TNs that would leverage this type of TNPP include a large retailer with its loyalty customers, a large company with its employees, or a group of banks with their account holders. The TNPP could bring a number of such TNs, based on the same TF, together on one platform. This would simplify governance, and allow standardisation and replicable implementation of (parts of) the interfaces. Note that one customer may also relate to several organisations on such a platform.


2. Many2Many (M2M): This TN reflects a community that may include service providers, individual users (e.g. customers, patients) and possibly other relevant organisations or institutions (e.g. consumer organisations, human rights organisations, insurance companies, workers' associations). Figure 2 and the discussion above describe an example of an M2M service provider-centred TN. The TNPP provides, and takes accountability for, the infrastructure that ensures compliance of PDM and service transactions in the TN with the agreed rules and standards of trust. It may offer an API management portal for the SPs in the TN to act in a trustworthy way, plus tools and services such as "sticky policies", logging, auditing and random conformity tests, thus supporting trust as pre-agreed in the TN. Synergetics [13] is an example of such a TNPP.

Clearly, both types of TN create trust between their stakeholders, as all are bound by the rules specified, including those in the TF. The strength of that trust will depend on how the contracts are executed, on the technology deployed by the chosen platform to facilitate enforcement and auditability of the contracts, on how the TN is governed, and so on. The choice of the type of TN to use depends on the service(s) and providers involved:




• If a large service organisation (e.g. a retailer or bank), for commercial reasons, does not want to cooperate with competitors, then O2M is likely to be the better solution.
• If a number of service providers in a given sector would like to cooperate to provide more comprehensive services to their customers (e.g. a healthcare network, a frequent-flier programme), an M2M would be more suitable.

In most cases, the M2M type is more efficient, as multiple service providers can share the same platform and functions, both for the backend transactions and in the interactions with users. Despite their differences, the O2M and M2M TNs could use the same TF; thus the rules implemented for trust in PDM and service management could be the same.

4.4. User-Centred Trust Networks

The asymmetry of power between individuals and data-driven businesses that led to the emergence of unease and loss of trust has also led to the development of a specific type of TN, one focused on putting users in charge of the use of their data and on motivating the development of new business models around personal data ecosystems. Variously termed vendor relationship management, personal clouds, personal data stores, and reputation systems, these ecosystems aim to correct the imbalance of power and to enable users to be in control of their data.


Figure 4. Example of a user-centred trust network.

This control includes specifying what data can be accessed, how they can be used and who can use them, and renegotiating policies for data use across all services with which users interact ([14]–[16] and the chapters by Heath et al. and Shadbolt (on midata) in this volume). The basic concept is a paradigm shift from businesses being in control of personal data to users being in control of personal data. In the former, businesses collect data to intuit user intent; in the latter, users, either individually or in trusted communities, come together to express a need that can then be satisfied by interested vendors. Figure 4 shows an example of an M2M user-centred TN, where users come together to define a TF which specifies the policies under which their data can be used; service providers that agree to be part of the user-centred TN must abide by these policies contractually. Conceptually, there can also be an O2M user-centred TN, where a single user would establish a TF that service providers would need to abide by if they want access to the user's data. Realistically, this is an unlikely structure: individuals are more likely to form a community around a TF than to act on their own, as this increases their bargaining power in attracting service providers to enter into contracts with them. Most of these TNs involve the use of a secured store for personal data at the infrastructure layer, and strict user-permission management at the data management layer. An example of this is the "Life Management Platform" discussed elsewhere in this volume by Searls and Kuppinger.

The user-centred TNs and the service provider-centred TNs discussed above share common objectives, in that both are concerned with the trustworthy enforcement of policies on data access and use. The difference lies in who defines the data policies and permissions and the rules applied in the TF: service providers or users. Current discussions of user-centred TNs require that vendors join the TNs (as shown in Fig. 4), since otherwise users have no real control over the use of their data by service providers. Moreover, independent monitoring of service provider actions would need to be addressed in this type of TN. This would require the service providers to cede control of data use to the users – a development that will probably require some time for widespread adoption. In a service provider-centred TN, the service provider binds itself to a pre-described use of the data, and is monitored and controlled on this by the TNPP and the governance structure. However, the contract to which users agree on entry to the TN presupposes informed choices about permitted data use – which might be an unrealistic assumption in many cases today.

Each of these classes of TN operating on its own would lead to a suboptimal realisation of the value of data in the overall ecosystem.


Figure 5. Co-existence of service provider-centred TN and user-centred TN.

In practice, we see the two classes of TN co-existing, since they complement each other and together can realise more value for all stakeholders in the ecosystem. Figure 5 shows this hybrid scenario, where a service provider-centred TN would negotiate with a user-centred TN over acceptable data policies, data permissions and operational rules; a toy sketch of such a negotiation follows.
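This sketch assumes, purely for illustration, that policies can be represented as simple labelled rules: the user-centred TN states its requirements, and agreement requires the provider-centred TN's offer to cover them.

```python
# Toy negotiation between the two TN classes in the hybrid scenario of
# Figure 5. The policy labels are invented; a real negotiation would cover
# structured permissions, not flat strings.
user_tn_requirements = {"purpose-limited-use", "audit-trail", "right-to-revoke"}
provider_tn_offer = {"purpose-limited-use", "audit-trail",
                     "right-to-revoke", "breach-notification"}

def negotiate(required: set, offered: set):
    """Agreement needs every requirement covered; otherwise, renegotiate."""
    missing = required - offered
    return ("agreed", offered) if not missing else ("renegotiate", missing)

print(negotiate(user_tn_requirements, provider_tn_offer))  # ('agreed', {...})
```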


4.5. Governance of TN

The essence of a TN is that the stakeholders can trust the network and the transactions – this cannot be based only on a one-time technology implementation, or on a contract signed by the TNPP. As such, governance is needed to ensure proper oversight, auditing, decision-making about membership and monitoring procedures, and the adaptation of rules and conditions to changing circumstances. Governance will depend on culture, legislation and the political environment; transparency, checks and balances, and accountability are relevant elements to consider. It seems clear that the TNPP should be subject to oversight by a representation of the community which forms the TN. Trust will be stronger if all stakeholders in the TN have sufficient representation in the control bodies. In addition, in many TNs it might be strongly advisable to ensure regular auditing by an independent party, or to establish the right of members to request investigation and reporting by independent experts in specific cases. A TN should also be open to the use of various trust providers – e.g. identity or claim providers, personal data storage providers, reputation providers – in order to stimulate competition, innovation and scalability.

4.6. Interoperability – Building the Ecosystem

Together, the various types of TF, the many kinds of communities using TNs, and other (context-aware) PDM services (including those for identity, attribute and claim management) form a data ecosystem. This would reflect and integrate with offline life in society in a dynamic way, and grow and adapt with it. Interoperability between TNs in the ecosystem can be greatly facilitated by using the same TF (i.e. the same trust principles and rules) and agreed-upon APIs. Certification of TNs and the use of generally accepted standards and contracts will also support interoperability and the building of an ecosystem that allows users to switch easily from one environment (TN) to another – something which reflects the way people normally live.


Note that in an ecosystem of many TNs, it is desirable that an individual be able to manage the use of her personal data with contextually consistent policies across the multiple TNs to which she may belong. This could be done within each TN, but would be more logically realised by a user-centred TN that can interoperate with the service provider-centred TNs to which she currently belongs. In the latter TNs, she may want to specify policies for the use of her health data – for the health network that she belongs to – and these must be abided by and enforced; but for other personal data, she may want to leverage the commonly accepted data use policies that others in the user-centred TN have specified. In effect, the user-centred TN can act as a proxy for the user as appropriate and, furthermore, can reflect the changing norms of the "user community" to the service providers.

As explained above, TNs will operate within jurisdictions, societal norms and cultures, whilst existing power structures may also limit users' choices of the TN they want. In addition to parsing the identity dimension in terms of the range of identities or personas a user can have (e.g. anonymous, pseudonymous, etc.), one can also consider the strength of the claims required. There are now alternative forms of verification where, instead of providing a single strong verified claim, users are asked to answer a number of weak claims based on information that is available about them (e.g. last purchase, last location visited, first employer, etc.); a minimal sketch of this idea closes this section. Moreover, it is obvious that certain transactions do not need strong claims, and "Facebook-like" authentication suffices. Everything depends on the context, on the choices of the user, and on her knowledge of the consequences.

An ecosystem as described here could facilitate all three layers described in Section 3 above: the infrastructure layer, through the interoperable building blocks provided by TNs; the data management layer, through trust services for identity and claim management, auditing, compliance enforcement, etc.; and a first step towards the user-centred context management layer, through the various Trust Networks that allow the definition of context and of the procedures that support it.
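The weak-claims sketch referred to above, with invented weights and data: several low-strength answers are combined into an assurance score, and each transaction sets the threshold it needs.

```python
# Toy weak-claims verification: no single answer identifies or authenticates
# the user strongly, but together they can reach a context-appropriate
# assurance level. Weights, claims and threshold are invented for the example.
WEAK_CLAIM_WEIGHTS = {"last_purchase": 0.3, "last_location": 0.3, "first_employer": 0.4}

def assurance_score(answers: dict, expected: dict) -> float:
    """Sum the weights of the weak claims the user answered correctly."""
    return sum(WEAK_CLAIM_WEIGHTS[k]
               for k, v in answers.items()
               if expected.get(k) == v)

expected = {"last_purchase": "book", "last_location": "Gent", "first_employer": "ACME"}
answers = {"last_purchase": "book", "last_location": "Gent", "first_employer": "Apex"}

score = assurance_score(answers, expected)   # 0.6: two of three matched
print("grant low-risk access" if score >= 0.5 else "escalate to a strong claim")
```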


5. Interdisciplinary Dependencies

In Section 4 we have suggested a number of elements that, together, can lead to the building of a trusted PD ecosystem for our increasingly digital society. Such an ecosystem would be based on the implementation of CPDM for all members of this society. It is obvious that this needs to be a multi-disciplinary exercise. We can easily distinguish, but not disconnect, a number of perspectives or relevant disciplines that need consideration if an overall acceptable and trustworthy CPDM is to be achieved.

Technical: we see strong trends towards trust networks, personal data ecosystems, policy-compliant data management, and personal data vaults, as described in Section 4. We also see "multiple identity" management and data minimisation through attribute-based verification and crypto-based credentials [17],[18]. Other issues relate to toolboxes for privacy by design, privacy impact assessment, and the broader social impact assessment. The role of technology in enabling alternative policy frameworks is only now starting to be examined.

Economic: discussions are ongoing on the valorisation and propertisation of personal data; the current book is an example of this and contains many references. New and innovative business models are being discussed, based on different relationships between customer and service provider, as suggested above in the discussion of the various types of TN.


Legal: this relates to many issues, including privacy as a human right, profiling, the newly proposed EC DP Regulation, law enforcement, and regulation in general. The relationship between technology and law is becoming more and more important: technology may facilitate lawful processing, but may also lead to inscrutable systems imposing constraints which have no political or legal basis. Important in this context is the balance needed in the technology tools, so as to allow an international framework that accommodates many different jurisdictions.

Socio-political: this includes the general trustworthiness of the infrastructure, institutions and applications that are the products of the three aspects above. Can we construct a digital world in which individual members can have confidence that they will find their place as respected members of the many online communities within and across jurisdictions? Some discussion of this can be found in the papers by Jacques Bus and Kieron O'Hara in [10].

Intergovernmental: the concepts of identity, PDM and privacy come from a physical world governed by sovereign states within national borders. Dealing with them in a "borderless" digital world requires international agreements and standards, and hence a debate that also considers the differences in the appreciation of privacy and personal data management across the various global cultural blocs.

Any policy (public or private) will need to address these various aspects at the three levels mentioned above, and to stimulate societal debate to develop policy that will both engender and strengthen CPDM, taking account of the perspectives we have discussed. The challenges to an international policy framework are enormous and the stakes high; the growth of international trade in the digital era will depend on it. The current work on an EU regulation for the interoperability of authentication and digital signatures is only a first step. The rules of governance of TNs, the establishment of electronic passports for international travel, and many international trade practices will be heavily affected by the development of a proper and sustainable personal data ecosystem.

It might be commendable to establish an international expert group, bringing nations together with the mission of establishing guidelines for context-aware personal data management, as a follow-up and extension to the OECD Guidelines on Privacy presented in 1980. In addition to experts representing the different perspectives discussed here, such a group must also include representatives of user-centred and service provider-centred TNs if a more optimal policy framework is to be developed.

6. Conclusions

The increasing integration of our on- and offline lives and the potential of big data are creating a pressing need for dialogues among stakeholders on a policy framework that can help realise the value of a data-driven ecosystem while also ensuring adequate rights and protection for individuals. A common language and reference model would greatly facilitate and promote such dialogues by providing consistency and context for these discussions.

In this paper, we have proposed a terminology framework that incorporates critical aspects of this conversation, including personal data management, context, and trust. We have also introduced a layered model to structure the dialogues on those elements that are required to develop ecosystems that can appropriately support context-aware personal data management. This is essential to the self-determination of individuals in the digital world, and to the development of trustworthy user-centred data ecosystems that can enable it. We have also discussed new technologies and platforms that could support the development of such ecosystems.

Making this a reality will require unprecedented collaboration and cooperation among business, policy makers, and civil society, in addition to dialogues with individuals. These need to be multi-dimensional discussions that incorporate all aspects of the ecosystem: technical, scientific, economic, legal, and political. Moreover, this needs to be carried out on an international scale. These dialogues would be enhanced if there were a strong base of evidence that could inform the process and identify emerging issues along the way. And many dimensions will continue to evolve: technology, social norms, business models, and so on. Any policy framework must be able to accommodate this dynamic and evolving landscape.

The stakes are high. Tomorrow's economy will be data-driven. The discussions and recommendations we describe in this paper should be addressed urgently in countries that want to lead this global revolution and unleash the full potential of the digital economy.

List of Acronyms


CPDM – Context-aware Personal Data Management
PD – Personal Data
PDM – Personal Data Management
SP – Service Provider
TF – Trust Framework
TIP – Trusted Identity (or Claim) Provider
TN – Trust Network
TNPP – Trust Network Platform Provider

References

[1] World Economic Forum, Rethinking Personal Data, Report 300512 (2012).
[2] James Q. Whitman, The Two Western Cultures of Privacy: Dignity versus Liberty, The Yale Law Journal 113 (2004), 1152–1223.
[3] International Institute of Communications, Personal Data Management: The User's Perspective, London (2012).
[4] European Commission, Proposal for a Regulation on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation), COM(2012) 11/4 draft.
[5] Paul Ohm, Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization, UCLA Law Review 57 (2010), 1701; U of Colorado Law Legal Studies Research Paper No. 9–12.
[6] Ryan Singel, Netflix Cancels Recommendation Contest After Privacy Lawsuit, Wired Magazine (March 12, 2010).
[7] Arvind Narayanan, Vitaly Shmatikov, Robust De-anonymization of Large Sparse Datasets. In: SP '08 Proceedings of the 2008 IEEE Symposium on Security and Privacy, IEEE Computer Society (2008).
[8] Helen Nissenbaum, Privacy in Context – Technology, Policy and the Integrity of Social Life, Stanford (2010).
[9] Bart P. Knijnenburg, Alfred Kobsa, Making Decisions about Privacy: Information Disclosure in Context-Aware Recommender Systems. To appear in ACM Transactions on Interactive Intelligent Systems (2013).
[10] Helen Nissenbaum, Securing Trust Online: Wisdom or Oxymoron? Working document (2001).


[11] Jacques Bus, Malcolm Crompton, Mireille Hildebrandt, George Metakides (Eds.), Digital Enlightenment Yearbook 2012, IOS Press, Amsterdam (2012).
[12] QIY: see https://www.qiyfoundation.org/en/.
[13] Synergetics: see http://synergetics.be/.
[14] Doc Searls, The Intention Economy: When Customers Take Charge, Harvard Business Review Press, Boston (2012).
[15] Personal Clouds Wiki (n.d.). Retrieved from http://personal-clouds.org/wiki/Main_Page.
[16] Respect Network (n.d.). Retrieved from www.respectnetwork.com.
[17] ABC4Trust: see https://abc4trust.eu/.
[18] Kim Cameron, Reinhard Posch, Kai Rannenberg, Proposal for a Common Identity Framework: A User-centric Identity Meta-System. In: Kai Rannenberg et al. (Eds.), The Future of Identity in the Information Society, Springer (2009).
[19] Cynthia Dwork, Differential Privacy. In: Automata, Languages and Programming, pp. 1–12, Springer, Berlin Heidelberg (2006).


Afterword


Kim CAMERON
Distinguished Engineer, Architect of Identity and Access, Microsoft

We who founded the Digital Enlightenment Forum wanted to create a deeper discussion of digital technology and its relation to society than was taking place around us. We had common interests but different professional backgrounds, which led to exchanges of ideas we thought valuable, not just for ourselves but, potentially, for others. We wanted especially to bring together four groups who would normally never spend much time together – scientists, technology innovators, legal experts and policy makers – and have them collaborate on understanding what was happening to our society.

We are living in a period of accelerating technological change. But there is relatively little understanding or discussion, even by people otherwise keenly aware of the world around them, of how our new technologies actually work, of what they can potentially effect, or of how they might impact the tenets of our society. Nor are technologies generally seen as things that can be managed: they are perceived as inevitabilities, even though they are increasingly social and are in fact carefully managed by all those who deploy them to achieve conscious outcomes.

Given the absence of a conversation through which the public – or even influential elites – can begin to approach these issues, society's concept of itself risks being stuck in the past while reality changes underfoot. Sadly, this would mean democratic institutions would have no chance to shape their relation to these technologies. Just as ominous, the public may wake up one day in a very bad mood as changes that have become a fait accompli seep into their consciousness.

We saw omens of this even as this volume was being written, when a man as eminent as the President of the United States was swept up in a political whirlwind: revelations of digital surveillance shocked many by what they taught about contemporary uses of digital technology. It is worth pausing to understand why Mr. Obama's handling of this political challenge stands as an omen rather than just a passing headline.

To put some limits on the discussion, let's remember that surveillance isn't new. I leave it to the many experts pro and con to lead what is absolutely an important debate about its necessity and legitimacy. Nor is digital surveillance unique to the U.S. The Guardian and Le Monde have asserted that many countries, including France, the United Kingdom and Germany, have digital surveillance programs similar to those operated in the U.S., while most countries that do not have such programs have constituencies within government that wish they did.

The situation challenging Mr. Obama was unique because what had been secret suddenly became public, concerned everyone using a telephone or the Internet, and was authoritative, consisting of actual, very detailed government documents. The U.S. Administration judged that it was necessary to present the government's case for digital surveillance to the world.


As Mr. Obama began to do this, one of his statements stood out: 'This debate has gotten cloudy very quickly.' Yes it did – and there are historical as well as political and ideological reasons why. Passage to the digital epoch involves many kinds of rupture, each creating inevitable confusion. In digital technology as in all else, things are not necessarily as they appear. We need to look beyond the surface of this new reality and arrive at an understanding of what explains it. As an example of what I mean by this, let's turn to one of the cloudier moments in Mr. Obama's own televised statement of why he supported the surveillance programs:


'(The 215 Program)… gets data from the service providers like a Verizon in bulk, and basically you have call pairs. You have my telephone number connecting with your telephone number. There are no names. There is no content in that database. All it is, is the number pairs, when those calls took place, how long they took place…'

Why cloudy? Because Mr. Obama, no matter how reputed as a law professor and talented as a politician, finds it convincing to first describe what is in the database and then say it has no content, in spite of the fact that anyone who has ever used a spreadsheet will spot an oxymoron. One might wonder if the koan of 'contentless content' were just a rhetorical device gone bad. But the expert communicator in Mr. Obama thinks he is connecting with his listeners. How is that possible?

Mr. Obama counts on a vocabulary in which artifacts that pre-exist the digital age are 'content' while those that are digital are not. He addresses his listeners from the pre-digital side of the chasm. Names are content, but phone numbers are not; phone calls are content, but the record of their time and length is not; the text of mail (even email, because it is still… mail) is content, but the identifiers of the email recipients are not. Mr. Obama's discourse explicitly rejects making the transition to defining 'content' as including the artifacts that define digital life – including digital metadata.

Yet all those millions who have 'gone digital' in their daily personal lives know that their phone calls and the SMSs they send, combined with the times of day and durations of calls, reveal their social network. They know instinctively that the composition and usage of our social networks is as much 'content' as is a telephone call or video. As a result, the Internet seethed with descriptions of what was called Mr. Obama's duplicitousness: some polls showed disapproval reaching 61%; one found Americans under 30, who, according to conventional wisdom, care little about privacy, flipped against Obama by 17 points in a matter of days.

We need to consider the semiotics of Mr. Obama's challenge to see how the underlying issues should have been framed "now that we are digital". Social consensus around technology and society can no longer be fudged using the categories and conveniently naïve mental models of the pre-digital world: too many people in all age groups have digital insights, inchoate as they may be, and have begun to stumble upon new paradigms.
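To see why 'number pairs, when those calls took place, how long they took place' is itself content, consider a minimal Python sketch (all numbers below are invented) that reconstructs a social graph from exactly those fields:

```python
from collections import defaultdict

# Invented 'contentless' records: (caller, callee, timestamp, duration_seconds)
call_records = [
    ("555-0101", "555-0202", "2013-06-01T09:12", 340),
    ("555-0101", "555-0303", "2013-06-01T22:45", 1200),
    ("555-0202", "555-0303", "2013-06-02T08:05", 60),
]

# Build an undirected contact graph from the number pairs alone.
graph = defaultdict(set)
for caller, callee, _time, _duration in call_records:
    graph[caller].add(callee)
    graph[callee].add(caller)

for number, contacts in sorted(graph.items()):
    print(number, "talks to", sorted(contacts))
```

Three records suffice to expose who talks to whom; times and durations would further reveal intimacy and routine.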

1. Identifiers and Databases

Let us return to the question of 'numbers' with no content. In a digital world, of course, everything is numbers, so numbers must be content. But some numbers are special in that they are identifiers for things that exist in the physical world. The digital world can't exist without these identifiers.

For example, when we get a cell phone, our personal information is added to a directory and linked to our cell phone number – which is an identifier. Such directories are typically available to the public and are always available to law enforcement. When a program like Program 215 collects cell phone 'numbers but no names', it is because a simple directory lookup or 'join' with the phone directory can be used to convert any number to a name (a minimal sketch of such a join closes this section). There is no advantage to storing a name when the name is already associated with each number and can be retrieved as needed. This reality is widely enough understood that Mr. Obama's 'numbers without names' approach didn't convince anybody.

For the same reason, a press release put out by the NSA a little later, claiming good behavior for abstaining from collecting geolocation information, rang hollow for many: an ever-increasing number of people intuitively understand that databases are interconnected. In reality, as is widely known, it was unnecessary to collect geolocation information because, like all law enforcement in the United States, the NSA already effectively has a directory that provides access to it. Geolocation information is universally collected and is readily available in the U.S. for lookup, at an hourly fee and without a warrant, using the phone identifiers that are being collected. A recent ACLU study indicates that most law enforcement agencies regularly rely on cell phone companies to track the physical location of their customers.

So to return to the question of why Mr. Obama's challenge was an omen: beyond the surveillance itself, there were two serious problems with the way the U.S. Administration tried to explain it to the world. First, enough people have enough intuitions about the digital epoch to doubt the administration's explanations and render them counter-productive. Second, some of their explanations were intentionally framed to obscure the nature of digital reality, instead of helping people understand it. In this new, confusing epoch, understanding is essential to society's wellbeing and trust.

This said, Mr. Obama made a fundamental breakthrough in political discourse when at one point he spoke to his television audience about the potentially identifying aspects of 'so-called metadata'. Who could have previously envisioned a head of state trading views on metadata with a popular talk-show host? In the wake of this conversation, metadata is becoming a household word that helps people understand the world they really live in.

Meanwhile, many others take refuge in the sleight of hand that identifiers are 'just numbers' with no connection to a vast web of content. The same premise is used by many of the world's governments, and governments are not necessarily the worst offenders. Treating identifiers as 'just numbers' is all the rage among various big businesses that benefit from the currently veiled structure of the disparate databases which the numbers weave together.
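The join mentioned above takes one line of SQL. A minimal sketch, using invented records throughout, shows how little work re-identification requires:

```python
import sqlite3

# Invented data: a 'numbers but no names' call dataset plus a public directory.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE call_pairs (caller TEXT, callee TEXT, duration_s INTEGER);
    CREATE TABLE directory (number TEXT, name TEXT);
    INSERT INTO call_pairs VALUES ('555-0101', '555-0202', 340);
    INSERT INTO directory VALUES ('555-0101', 'A. Jones'), ('555-0202', 'B. Smith');
""")

# A single JOIN against the directory converts every number to a name.
rows = db.execute("""
    SELECT a.name, b.name, c.duration_s
    FROM call_pairs c
    JOIN directory a ON a.number = c.caller
    JOIN directory b ON b.number = c.callee
""").fetchall()
print(rows)  # [('A. Jones', 'B. Smith', 340)]
```

This is exactly why there is no advantage to storing the name: the directory already stores it for you.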

2. Supercontent and Big Data

The identifying numbers that are used to connect people and things to the content located in a staggering maze of databases are the keys to the structure of the digital world.


To properly think about the digital age, we need both to recognize this identifying metadata as content and, I propose, to promote it to the category of supercontent. Identifiers – numbers without names – are the supercontent that links all other content.

There are many different kinds of identifiers. More precisely, unique identifiers exist in different 'namespaces' or domains. For example, telephone numbers identify cell phones and land lines. Postal addresses identify places where people live and work. IP addresses identify computing devices connected to the Internet. Social Security Numbers identify individuals in a given country. Email addresses identify email recipients and originators. 'Login names' identify users of computers and applications. Customer numbers identify users of products and services. Social networking 'handles' identify members of social networks. Credit card numbers identify credit accounts. Web addresses identify Internet services. All of these identify either a person or a device or resource used by a person or persons.

In the pre-digital world, namespaces were generally disconnected from each other. One of the key transformations we have seen in the digital epoch has been the stitching together of these identifiers across more and more domains. When using a browser to place an order on the Internet, one enters, at a minimum, one's name, address, phone numbers, email address, and credit card number. Meanwhile, one's computer transmits its current IP address, the website likely adds a cookie (an identifier) in case the IP address changes, and more often than not web beacons associate the transaction with a set of other transactions made by the same user at completely unrelated websites. In that single transaction, eight namespaces are 'joined' together. If the identifying information is stored and can be consulted during subsequent transactions, then any transaction that reveals even one of these identifiers can do a lookup to determine the other seven. Further, if the new transaction contains additional identifiers, the new ones can be stitched in with those already assembled. In this sense, identifiers are 'contagious' supercontent; a short sketch below makes the contagion concrete.

As discussed in the chapters of this volume, our digital epoch is synonymous with the storage of the records of people's daily lives, their activities, transactions, profiles, appraisals, test results and evaluations, in databases distributed throughout the world. The identifiers we have been discussing are what connect each of these database records to the person it describes. Until now the records themselves have been distributed through space and controlled by different entities. They have been difficult and expensive to access and assemble because they are built on incompatible platforms, protected through disconnected security systems, and governed by data usage restrictions.

Against this background the World Economic Forum (WEF) has put forward a program that promotes 'Big Data' as an economic driver, and a plan to 'rethink' (critics say 'eliminate') restrictions on the collection and use of personal data. The WEF report argues that 'The discovery and insights derived from linking previously disparate bits of data have become essential for innovation'.
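Before turning to what the report urges, the 'contagion' just described can be sketched concretely: a tiny union-find over invented transaction logs shows how identifiers seen together once become linked for good.

```python
# Minimal sketch of 'contagious' identifier supercontent (all identifiers
# below are invented): any two identifiers observed in the same transaction
# end up in one linked cluster, so revealing one reveals the rest.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps lookups cheap
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

transactions = [  # each list: identifiers disclosed together in one order
    ["email:ann@example.org", "phone:555-0101", "card:4111-1111"],
    ["phone:555-0101", "ip:203.0.113.7", "cookie:af39c2"],
    ["cookie:af39c2", "handle:@ann"],
]
for ids in transactions:
    for other in ids[1:]:
        union(ids[0], other)

# A lookup on any single identifier now yields the whole cluster.
root = find("ip:203.0.113.7")
print(sorted(i for i in parent if find(i) == root))
```

Running this prints all six identifiers: one IP address, observed once, is enough to recover the email address, phone number, card number, cookie and social handle.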
The report urges allowing – in fact, promoting – the co-location and assembly of billions of records from innumerable databases, including the development of new interoperable mechanisms for doing so, to facilitate data mining and the search for patterns that might hasten advances in technology and science. In this plan, identifier supercontent (identity metadata) would be leveraged in order to 'unlock' the economic value of these interconnected fields of data.

From a technical point of view, it is not clear how many significant differences exist between the uses of identifiers and databases proposed by the World Economic Forum report and what is currently being done in various surveillance programs. This may be why Mr. Obama, as part of his discussion of the issues, called for a national conversation on big data, arguing that 'the general problem of data, big datasets… is not going to be restricted to government entities.'

There is no doubt that the discussion of big data has also 'gotten cloudy very quickly'. Statements like 'Analytics have become the new engine of social value creation' are made without hard evidence, and need to be evaluated empirically. There has also been much overgeneralization – for example, using a useful outcome resulting from the connection of two specific datasets to argue against restrictions on connecting any data to any other.

The essential point is that any data containing identifier supercontent – an identifier that can be linked to a person directly or through contagion – is clearly personal data. The question to be answered by the Digital Enlightenment must be: under what conditions can personal data about people and their possessions be used for purposes other than the transactions they were collected to enable? In particular, is the premise that inference engines might detect associations that propel the human race forward – while bringing in the bacon – sufficient for unleashing what some see as a data free-for-all in which any data can be connected with any other?

This is also an area where philosophers of science could contribute to the Enlightenment discussion. The scientific method, which has led to virtually all scientific breakthroughs until now, begins with the elaboration of theories based on models that can then be expressed as hypotheses which are 'disprovable' if the theory is not correct. The hypotheses to be tested thus select the data that is to be generated or collated and analyzed for a specific purpose. In this paradigm, the use of data can be subject to rational evaluation of risks and benefits.

However, the premise of the WEF report is that innovation is now no longer possible except through analytics, meaning the use of inference engines that graze the data collected (potentially by every device in the world) for associations and statistical patterns leading to huge new breakthroughs and value creation, eliminating the need for a priori hypotheses. The contention is that the scientific method as we have known it has been superseded – so control over data should be abandoned as we get on with the grazing epoch. However, since no empirical evidence is provided for this assertion, it seems worthy of more discussion before rushing to comply.

3. Data Reciprocity

One related area no one seems to be discussing is the idea of "Data Reciprocity". Rather than eliminating restrictions on the collection of data, Data Reciprocity would encourage restrictions on collection, combined with a requirement that any data collected from an individual that contains identifying supercontent must be made available to that individual through an Application Programming Interface (API). Such APIs would attract developers to build applications that let end users 'join' their own information records for their own use.

For example, if someone opens an account at a retailer and places orders, it should, given Data Reciprocity, be possible for that person to retrieve all the information he or she has generated with the retailer through APIs. The development of Data Reciprocity APIs would be many orders of magnitude less onerous than what is proposed by the WEF report, which calls for all information anywhere to be sucked into Big Data systems.
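No Data Reciprocity API exists yet, so the following is only a hedged sketch of what one could return; the endpoint shape, field names and the token check are all our assumptions, not an existing specification:

```python
import json

# Invented example: what a retailer has stored, keyed by the subject's identifier.
RECORDS = {
    "ann@example.org": [
        {"type": "order", "id": 1017, "items": ["kettle"], "total_eur": 39.5},
        {"type": "profile", "segment": "frequent-buyer"},  # derived data
    ],
}

def reciprocity_api(subject_id: str, token: str) -> str:
    """Return everything held about the subject as machine-readable JSON.
    'token' stands in for whatever authentication a real deployment would
    need to prove the caller is the subject (or a service they designate)."""
    if token != "demo-token":
        raise PermissionError("caller must prove they are the data subject")
    return json.dumps({"subject": subject_id, "records": RECORDS.get(subject_id, [])})

print(reciprocity_api("ann@example.org", "demo-token"))
```

Whether the derived 'segment' field must also be returned is exactly the open question about derived data raised below.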


Data Reciprocity should include a requirement that governments and private enterprises collecting email addresses register the location of their Data Reciprocity API in a registry associated with each person's email address, accessible only to the individual identified and the software services he or she designates. This would make it possible to develop systems to assess compliance with Data Reciprocity requirements.

A rich area for research and discussion within the Digital Enlightenment community could then be whether and under what circumstances Data Reciprocity should apply to information derived from personally identifiable information and made available to third parties. For example, if categorization of an individual's economic status is done using personally identifiable information subject to Data Reciprocity, should it be a requirement that the derived economic status categorization also be made available to the subject concerned via an API?

In any case, establishing the principle of Data Reciprocity is one of the most important enablers of "Life Management Platforms". It also serves to attach a cost to the collection of personally identifiable information without being punitive in any way.

The articles the Digital Enlightenment Forum has brought together in this volume provide much insight that can be applied to these and other related issues. But while we continue to deepen our own understanding of the world around us, we must also address the question of how we connect with the increasing number of people who are spontaneously coming to understand more about how digital society works. Digital enlightenment can mean nothing less than a society fully conscious of its digital world and the social issues that characterize it. Standing for a democratic society that is fully digital yet consistent with our values, Digital Enlightenment is more than a name. It is more than a goal. Digital Enlightenment is a necessity.


Editors' Biographies

Mireille Hildebrandt

Mireille Hildebrandt is one of the Founding Members of the Digital Enlightenment Forum. She holds the Chair of Smart Environments, Data Protection and the Rule of Law at the Institute for Computing and Information Sciences (iCIS) at Radboud University Nijmegen; she is an Associate Professor of Jurisprudence at the Erasmus School of Law, Rotterdam and a senior researcher at the centre for Law Science Technology & Society at Vrije Universiteit Brussel. The focus of her research is the nexus of philosophy of law and of technology, investigating the implications of smart environments for democracy and the Rule of Law. She co-edited Profiling the European Citizen: Cross-Disciplinary Perspectives (2008) with Serge Gutwirth; Law, Human Agency and Autonomic Computing (2011) with Antoinette Rouvroy; Privacy and Due Process after the Computational Turn (2013) with Katja de Vries; and Human Law and Computer Law (2013) with Jeanne Gaakeer. Her roots reside in cultural anthropology, law, and the philosophy and history of criminal law.


Kieron O'Hara

Kieron O'Hara is a senior research fellow in electronics and computer science at the University of Southampton. His research interests are the philosophy and politics of technology, in particular the World Wide Web and the Semantic Web. He writes on privacy, trust and memory, and has a particular interest in transparency and open data. As a political philosopher, he has also written extensively on the philosophy of conservatism, and is a research fellow at the Centre for Policy Studies. He has written several books, including The Spy in the Coffee Machine (with Nigel Shadbolt), Trust: From Socrates to Spin, and The Enlightenment: A Beginner's Guide. His latest book, The Devil's Long Tail (with David Stevens), is about online religious extremism. His report for the UK government on privacy and open data, Transparent Government, Not Transparent Citizens, appeared in 2011, and he currently chairs the Transparency Sector Panel for Crime and Criminal Justice for the UK Home Office and Ministry of Justice.

Michael Waidner

Michael Waidner is the Director of the Fraunhofer Institute for Secure Information Technology (Fraunhofer SIT) in Darmstadt, Germany. At the Technische Universität Darmstadt he holds the chair for Security in Information Technology and is the Director of the Center for Advanced Security Research Darmstadt (CASED) and the European Center for Security and Privacy by Design (EC SPRIDE), two cybersecurity research centres supported by the state and federal government, respectively. Michael Waidner received his PhD in 1991 from the University of Karlsruhe. In 1994 he joined the IBM Zurich Research Laboratory, where he was responsible for the research activities in security and privacy. In 2006 he moved to IBM in New York. Until 2010, he was an IBM Distinguished Engineer and the IBM Chief Technology Officer for Security, responsible for the technical security strategy and architecture of the IBM Corporation. Michael Waidner is a Fellow of the IEEE and a Distinguished Scientist of the ACM.

Author Biographies

David Alexander

David Alexander is Chief Executive Officer and co-founder of Mydex CIC. David is the co-author of the CRM Pocketbook and a fellow of the Sales Leadership Alliance. He is a member and active supporter of the Electronic Frontier Foundation and the Open Rights Group and is passionate about protecting the individual's privacy, identity and personal data and their associated rights. David is the designer of the Mydex platform and trust framework and has an extensive technical and commercial background spanning over 30 years. David is active in support of community endeavours outside of Mydex, and is an active coach and mentor.


Eleonora Bassi

Eleonora Bassi is a Research Fellow of the Department of Information Engineering and Computer Science of the University of Trento and a Fellow at the Nexa Center for Internet and Society of the Polytechnic University of Torino. She holds a Degree in Law and a PhD in Philosophy of Law from the University of Torino. After her PhD she focused her interest on Information Law, Fundamental Rights and Data Protection issues, and later on the European legal framework for Public Sector Information and on regional policies. Currently, her research follows two main directions. First, the focus is on the new European Data Protection framework that will have a strong impact on privacy rights in digital environments and the circulation of personal data within the information market. Second, her work is policy-oriented research on Open Data and Big Data.

Phil Booth

Phil Booth is an active social entrepreneur and privacy advocate. Phil has worked on projects ranging from systems to improve quality of life through incremental behaviour modification, to helping build BBC Schools online and designing secure online spaces for looked-after children. He advises a number of organisations that seek to provide individuals with greater control over their own personal data. Phil led the non-partisan NO2ID campaign, campaigning successfully to defeat the introduction of ID cards in the UK and other 'database state' initiatives; his work has been recognised by awards from Privacy International and Liberty.


Caspar Bowden

Caspar Bowden is an independent advocate for information privacy, and public understanding of privacy research in computer science. He is a specialist in data protection policy, EU and US surveillance law, PET research, Cloud computing, identity management, and information ethics. For nine years he was Chief Privacy Adviser for Microsoft, and previously was co-founder and director of the Foundation for Information Policy Research. He was an expert adviser to the UK Parliament on privacy and surveillance legislation, and co-organized six public conferences on encryption, data retention, and interception policy. He has previous careers in financial engineering and risk management, and software engineering (systems, 3D games, applied cryptography), including work with Goldman Sachs, Microsoft Consulting Services, Acorn, Research Machines, and IBM. He is a fellow of the British Computer Society, and is a member of the advisory bodies of several civil society associations.

Johannes Buchmann

Johannes Buchmann is a professor of computer science and mathematics at the Technische Universität Darmstadt and vice-director of the Center for Advanced Security Research Darmstadt (CASED). His research focuses in particular on cryptography and its applications. He received his PhD from Cologne University and was a professor at Saarland University for eight years. Since 1996 he has been the head of the 'Theoretische Informatik – Kryptographie und Computeralgebra' group at TU Darmstadt. Johannes Buchmann has been awarded the Leibniz prize of the DFG, the Karl-Heinz Beckurts prize, the German IT Security award and the Tsungming Tsu Award from the National Science Council Taiwan. He is also a member of the German national academy Leopoldina and the German academy of sciences acatech.


Jacques Bus

Jacques Bus is Secretary General of the Digital Enlightenment Forum. He studied Mathematics and obtained his PhD at the University of Amsterdam. He worked as a researcher for 12 years and then as a research programme manager for 5 years at the Centre for Mathematics and Computer Science (CWI) in Amsterdam (NL). In 1988 he joined the European Commission and has since worked in various parts of the research programmes Esprit and IST. He was strongly involved in the establishment of the Security Theme in FP7. From March 2004 till March 2010 Jacques was Head of Unit ICT Trust and Security, which includes Security of Network and Service infrastructures, Critical Information Infrastructure Protection, Identity and Privacy management, and enabling technologies for trust and security. In that function he was actively involved in the work of the RISEPTIS Advisory Board and organised, with the Spanish EU Presidency, the Conference Trust in the Information Society. Since March 2010 he has been working as Secretary General of the Digital Enlightenment Forum, as Business Development Director of the Privacy and Identity Lab and as an independent consultant (see http://www.digitrust.eu) in related areas. He is a Research Fellow at the University of Luxembourg and a senior adviser at the Centre for Science, Society and Citizenship.


Kim Cameron

Kim Cameron is Architect of Identity in the Cloud and Enterprise Division at Microsoft, USA, working to help establish a privacy-enhancing Identity Metasystem reaching across vendors, industries, technologies, continents and cultures. He has spent three decades solving the problems of digital identity and directories of people and things. His early innovation in the startup ZOOMIT Corporation focused on how to link and manage people's identifiers across different contexts so as to build a 'joined view' of the individual. This research resulted in the first Metadirectory. With the emergence of the Internet, he was one of the first to see that the joined view he had enabled could, in the context of the Web, also be privacy invasive. In response he began developing compensating privacy technology. Microsoft bought ZOOMIT Corporation in 1999 and shipped its Metadirectory (now called FIM). Since then, Mr. Cameron has played a leading role in the evolution of Active Directory, Federation Services, CardSpace, Azure Active Directory and Microsoft's other Identity Metasystem products. In 2009 he was appointed a Microsoft Distinguished Engineer. He grew up in Canada, attending King's College at Dalhousie University and l'Université de Montréal. He served on RISEPTIS, the high-level European Union advisory body providing vision and guidance on security and trust in the Information Society. He has won many industry awards, including Digital Identity World's Innovation Award (2005), Network Computing's Top 25 Technology Drivers Award (1996), Network World's 50 Most Powerful People in Networking (2005) and Lifetime Achievement Awards from the European Association of e-Identity and Security (EEMA) and Kuppinger Cole's Cloud Identity Conference. In 2010 King's College recognised his work on digital identity by awarding him an honorary Doctor of Civil Law degree. Mr. Cameron blogs at www.identityblog.com, where he has published the Laws of Identity and a number of other documents on identity and privacy issues, and reports on his work.


Lizzie Coles-Kemp

Lizzie Coles-Kemp is a qualitative researcher interested in information security and privacy practices within communities. She is a Senior Lecturer in the Information Security Group at Royal Holloway, University of London. Lizzie's particular focus is the interaction between humans and security and privacy technologies, how each influences the other, and the communities of practice that emerge. Current interdisciplinary work includes: value sensitive design in public service delivery, cultural analysis in organisational security, and the use of visual research methods in interdisciplinary research.

Charles M. Ess

Charles M. Ess is Professor in Media Studies, Department of Media and Communication, and Director, Centre for Research on Media Innovations, University of Oslo. He has held several guest professorships in Europe and Scandinavia – most recently as Professor MSO (med særlige opgaver), Media Studies, Aarhus University, Denmark (2009–2012). He has received awards for excellence in both teaching and scholarship.


Emphasising cross-cultural perspectives, Dr. Ess has published extensively in information and computing ethics, e.g., Digital Media Ethics (Polity Press, 2009; 2nd edition, 2013) and (with May Thorseth) Trust and Virtual Worlds: Contemporary Perspectives (Peter Lang, 2011), and in Internet studies, e.g., (with Mia Consalvo) The Handbook of Internet Studies (Wiley-Blackwell, 2011) and (with Pauline Cheong, Peter Fischer-Nielsen, and Stefan Gelfgren) Digital Religion, Social Media and Culture: Perspectives, Practices and Futures (Peter Lang, 2012).


Simone Fischer-Hübner

Simone Fischer-Hübner has been a Full Professor at the Computer Science Department of Karlstad University, Sweden, since June 2000, where she is the head of the PriSec (Privacy & Security) research group. She received a Diploma Degree in Computer Science with a minor in Law (1988), and Doctoral (1992) and Habilitation (1999) Degrees in Computer Science from Hamburg University. Her research interests include IT security, privacy and privacy-enhancing technologies. She was a research assistant/assistant professor at Hamburg University (1988–2000) and a Guest Professor at the Copenhagen Business School (1994–1995) and at Stockholm University/Royal Institute of Technology (1998–1999). She is the chairperson of IFIP (International Federation for Information Processing) Working Group 11.6 on "Identity Management" and the Swedish IFIP TC11 representative. She is also a member of the NordSec (Nordic Workshop on Secure IT Systems) steering committee, a steering committee member of STINT (the Swedish Foundation for International Cooperation in Research and Higher Education), coordinator of the Swedish IT Secure Network for PhD students (funded by MSB in Sweden), and a member of the International Editorial Boards of the Springer International Journal of Information Security and the Springer International Journal of Trust Management. She has been appointed by the Swedish government as a member of the advisory board for the Swedish Data Protection Commissioner (Datainspektionens Insynsråd). She is also a member of the IT Security Advisory Board of the Swedish Civil Contingency Agency (MSB informationssäkerhetsråd). She has contributed to several privacy- and security-related national and European research projects, including the EU Celtic project BUGYO, the FP6 projects PRIME, FIDIS and Newcom and the FP7 projects PrimeLife, Newcom++, A4Cloud and SmartSociety. She received the Silver Core Award from the International Federation for Information Processing (IFIP) in 2001 for services rendered to IFIP.

Hallvard J. Fossheim

Hallvard J. Fossheim is Professor II in Philosophy at the University of Tromsø. He has written mainly on the practical philosophy of Plato and Aristotle and on the virtue ethical tradition going back to them. He has also focused on film theory and on the philosophy of computer games. Among his current activities are a paper on Aristotelian collective agency and a project dealing with the relation between dialogue, rationality, and identity. Fossheim also holds a position as Director of The Norwegian National Research Ethics Committee for the Social Sciences and Humanities.


Jeffrey Friedberg

Jeffrey Friedberg is Chief Trust Architect for Microsoft. He focuses on new ways that people, organisations, and governments can benefit from the ever-expanding flows of data that fuel our digital lifestyle and economy, while at the same time reducing the risks to personal freedoms, intellectual property, and critical infrastructure. This effort includes investigating ways to make privacy and security features more usable for consumers and businesses. He speaks publicly on strategies for reducing Internet threats such as identity theft and has testified before Congress on protecting users from spyware. He co-authored the Microsoft Privacy Standard for Development and was responsible for Windows Privacy. Previously at Microsoft he focused on privacy and legal issues relating to the Windows Media Platform and was a Group Program Manager for Microsoft's graphics software. Jeffrey Friedberg has over 30 years of software development experience, and has delivered products that range from graphics supercomputers used in medical imaging to next-generation gaming devices. As VP of Engineering at Silicon Gaming, he helped launch an IPO and chaired the Gaming Manufacturers Association. At Digital Equipment Corporation, he co-architected the industry-standard 3D graphics extensions for the MIT X Window System. In addition to being a Certified Information Privacy Professional, he has a formal background in Computer Graphics and a B.S. degree in Computer Science from Cornell University.


Peter Haynes

Peter Haynes is a senior fellow at the Atlantic Council, and senior director, advanced strategies and research at Microsoft Corporation. His focus is on long-term strategy and policy in areas including cybersecurity and privacy, big data and data sovereignty, the internet of things, and the economic impact of digital technologies. He has provided expert advice to government, industry, academia, and policymakers, both directly and via institutions such as the President's Council of Advisors on Science and Technology, the US State Department, and the Organisation for Economic Co-operation and Development. Previously, Peter Haynes was a writer and editor at The Economist magazine, where he wrote more than 700 articles and served as energy correspondent, technology correspondent, management editor, deputy business editor, and New York bureau chief/U.S. business editor. He has also been a senior editor at Forbes, and has broadcast extensively for the BBC and NPR. He started his career as a senior researcher at two leading British think tanks, the Institute for Fiscal Studies and the National Institute of Economic and Social Research. He helped found Oxford Economic Research Associates, now a leading European economic consultancy, and is co-founder of the early-stage high-tech incubator Seattle Feed & Livestock. Peter Haynes holds an MA in Philosophy, Politics and Economics from Keble College, Oxford.

William Heath

William Heath is chairman of Mydex CIC. He is an entrepreneur. Besides Mydex CIC, he co-founded Ctrl-Shift Ltd, as well as the digital rights campaign Open Rights Group and the publishing and research group Kable Ltd, both of which he chaired. He is a Fellow of the Young Foundation and an adviser (former vice chair) of the Foundation for Information Policy Research and of DXW.com. In 2013 he co-founded The Bell Inn (Bath) Co-op, which now owns and runs the leading music pub in south west England.

Chris Hoofnagle

Chris Jay Hoofnagle is a lecturer at UC Berkeley Law, where his research focuses upon the structure of legal and economic relationships that lead to tensions between firms and individuals, manifested through information privacy problems, gaps in understanding of legal protections, deficits in consumer law protections, and the problem of financial fraud. Hoofnagle has written extensively in the fields of information privacy, the law of unfair and deceptive practices, consumer law, and identity theft. His recent work includes The Price of Free, 61 UCLA L. Rev. (2014) (with Jan Whittington); Unpacking Privacy's Price, 90 North Carolina L. Rev. 1327 (2012) (with Jan Whittington); and Behavioral Advertising: The Offer You Cannot Refuse, 6 Harvard L. & Policy Rev. 273 (2012). He has also written on payments technologies with a focus upon mobile payments, consumer attitudes toward and knowledge of privacy law, identity theft, the first amendment, and the government's reliance on private-sector databases to investigate citizens.


David J. Houghton

David Houghton is currently a Lecturer in Marketing at Birmingham Business School, University of Birmingham. He has a PhD in Management from the University of Bath, and a degree in Psychology. His research centres on the psychology of communication online, with a specific interest in the management of privacy, self-disclosure, uncertainty reduction and the formation, maintenance and deterioration of relationships. This research is combined with social marketing and social media to develop an understanding of how best to promote healthy social practices for individuals and society more generally. Dr. Houghton holds a general research interest in psychology, technology and social marketing.

Sara Hugelier

Sara Hugelier has been a legal researcher at the Interdisciplinary Centre for Law and ICT (ICRI) of KU Leuven – iMinds since March 2013. During her studies at KU Leuven, she spent a one-year exchange at Queen Mary University of London, United Kingdom, where she focused on intellectual property and competition law. Under the supervision of G. Van Bueren, she wrote a dissertation on the interface between freedom of expression and freedom of information, which was later published in the KU Leuven law review (Jura Falconis) in 2011. She obtained her Master of Laws degree in 2011 at the same university, focusing on European and economic law. After her LL.M. in European studies at the Institut d'Etudes Européennes of the Université Libre de Bruxelles (ULB), which she combined with internships at the European Parliament as well as a legal traineeship at a Brussels-based firm, she completed a five-month traineeship at the European Commission's Directorate General Competition in Brussels, working on cartels and settlement procedures. At ICRI, Sara conducts research in the fields of data protection, privacy, the re-use of public sector information and open data. She is currently engaged in several European projects. For the European Location Framework project (E.L.F.) she deals with the legal aspects of spatial data and the re-use of public sector information. In the LAPSI 2.0 project, she conducts research on the legal aspects of public sector information. Finally, in the OpenScienceLink (OSL) project, she focuses on the legal aspects of sharing raw research data for open access.

Katleen Janssen

Katleen Janssen is a postdoctoral legal researcher at the Interdisciplinary Centre for Law and ICT of KU Leuven – iMinds. She specialises in access to and use of Public Sector Information, and in SDI and GIS matters. This includes policies promoting the availability of information and policies restricting such availability, e.g. privacy protection, intellectual property rights, etc. In 2009, Katleen obtained her PhD with a thesis on the legal framework for the availability of public sector spatial data, mainly dealing with the relationship between INSPIRE, PSI and access to environmental information. Katleen works on a number of national and European projects relating to open data, PSI and spatial data infrastructures. Currently, she is the project manager of the LAPSI 2.0 thematic network on legal aspects of public sector information (www.lapsiproject.eu). From 2005 to 2012 she was co-chair of the INSPIRE drafting team on data and service sharing, and she is currently co-chair of the Legal and Socio-Economic Committee of the Global Spatial Data Infrastructure Association. In the spring of 2013, she developed open data licenses for the Flemish government.


Adam N. Joinson

Adam Joinson is Professor of Behavioural Change at UWE Bristol. His research focuses on the interaction between psychology and technology, with a particular focus on measurable behaviour. Recently this work has taken in privacy attitudes and behaviours, the social impact of monitoring technology, computer-mediated communication and communication behaviours, and the use and impact of social media. Since September 2012 Prof. Joinson's research attention has been directed towards the ways in which technology can be used to change behaviour to achieve a social good. He also spent 18 months on secondment working with the UK Government on cyber-security and behavioural science.

Dave Kearns

Dave Kearns spent 10 years as a network manager, ending up as Information Services Manager for the former Thomas-Conrad Corporation (now part of Compaq). In 1987, he was a founding SysOp of Novell's Novell Support Connection service on Compuserve and served as the first president of the Association of NetWire SysOps. He's a past recipient of NetWare Users International's Nuggie award for "outstanding contributions". Dave was formerly Technical Editor of Networking Solutions magazine. His "Wired Windows" column appears in Network World magazine. He also writes frequently for PC World, The Novell Companion, World Wide Windows and NC World. He's written, edited and contributed to a number of books on networking and is a frequent speaker before both trade and business groups. Dave has been a Senior Analyst at KuppingerCole since 2008 and focuses on various areas of IAM (Identity and Access Management), including the Future of Authentication and Authorization and the topic of Life Management Platforms and all the related themes such as Trust Frameworks.

Ioannis Krontiris

Ioannis Krontiris is a senior researcher at the Deutsche Telekom Chair of Mobile Business & Multilateral Security at Goethe University Frankfurt, Germany. Ioannis holds a PhD in Computer Science from Mannheim University, Germany, an M.S. in Information Networking from Carnegie Mellon University, U.S.A. and a Diploma in Computer Engineering from the Technical University of Crete, Greece. His research focuses on Identity Management, Online Privacy, and Security and Privacy in Smart Environments (Mobile Computing, Pervasive Computing, Sensor Networks, and Ubiquitous Computing). He is technical coordinator of ABC4Trust, an EU project that investigates common architectures for privacy-respecting attribute-based credential systems (Privacy-ABCs) and their deployment in practice. Besides ABC4Trust, Ioannis has contributed to several other privacy- and security-related European research projects, including PICOS and GINI S.A. Since June 2013 Ioannis has been the chair of the IFIP WG 11.2 on Pervasive Systems Security.


Martin Kuppinger

Martin Kuppinger is Founder and Principal Analyst at KuppingerCole. He studied economics at the Berufsakademie Stuttgart and the University of Tübingen. Martin wrote more than 50 IT-related books and is known as a widely-read columnist and author of technical articles and reviews in some of the most prestigious IT magazines in Germany, Austria, Switzerland, and the UK. He is also a well-established speaker and moderator at seminars and congresses. His interest in Identity Management dates back to the 80s, when he also gained considerable experience in software architecture development. Over the years, he has added several other fields of research, including virtualisation, cloud computing, overall IT security, and others. Having studied economics, he combines in-depth IT knowledge with a strong business perspective. His role at KuppingerCole is research and advisory. His current areas of interest span most of Information Security and Cloud Computing, plus related areas. He has a special interest and expertise in technologies and innovations that help to improve privacy and security for individuals and for organisations. That includes his research in the area of Life Management Platforms.

Robert Madelin

Robert Madelin is the Director-General of DG CONNECT of the European Commission, responsible for the Digital Agenda for Europe. He was educated in England at the Royal Grammar School, High Wycombe and at Magdalen College, Oxford. He has also studied at the Ecole Nationale d'Administration in Paris. Born in 1957, and a British civil servant since 1979, Robert has served in the Commission since 1993: in his current post since 2010; as Director General for Health and Consumer Policies from 2004 to 2010; and on trade and investment policy before that.

Sean Maguire

Sean Maguire is a senior programme manager in Trustworthy Computing Governance for Microsoft. He works on developing and implementing privacy-enhancing technologies for Microsoft devices and services. He is interested in the application of trust frameworks and personas to better respect privacy while still providing meaningful value and controls to consumers, industry, and regulators. Previously, Sean worked at The Boeing Company doing writing, editing, and desktop publishing for the EASY5 dynamic systems modeling and analysis product, and was copy and production editor for AERO, an award-winning quarterly technical magazine for Boeing airplane customers. Sean Maguire is a Certified Information Privacy Professional/United States. He received his Bachelor of Arts degree in French and Linguistics with honors from the University of Washington and his Master of Business Administration degree from Western Governors University.

Maxi Nebel


Maxi Nebel studied law at Friedrich-Alexander-University Erlangen-Nuremberg. She was a legal research assistant in the Project Group for Constitutionally Compatible Technology Design (provet) at Kassel University until February 2013. She is currently a recipient of the CASED scholarship programme. Her research topic is data protection in Internet services.

Carolyn Nguyen, PhD

Carolyn Nguyen is a Director in Microsoft's Technology Policy Group, responsible for policy initiatives related to data governance and personal data management. Her work focuses on helping to shape relevant long-term technology policies globally in these areas by engaging with stakeholders and raising awareness of the potentially disruptive impacts of emerging technologies, such as big data and the Internet of Things, on existing social, economic, and policy frameworks. She works at the intersection of these disciplines, taking a holistic approach to policy and developing a supporting evidence base that can inform and steer policy makers and other stakeholders towards innovations in policy. Prior to joining Microsoft, Carolyn Nguyen held a number of positions with Research in Motion, Avaya Communications, Lucent Technologies, and Bell Laboratories. Her experience includes a range of business and technical responsibilities in developing and bringing to market innovative telecommunications solutions globally. She received her PhD from the Center for Telecommunications Research at Columbia University and completed Executive Business Management Programs at Harvard Business School and the London Business School.


Alexander Novotny, MSc

Alexander Novotny is a research and teaching associate at the Institute for Management Information Systems at Vienna University of Economics and Business (WU). His research focuses on electronic privacy. In particular, he is interested in questions relating to control over personal data and digital forgetting from a technical, social, and economic perspective. Alexander received a master's degree in Information Systems from Vienna University of Economics and Business in March 2012. He was awarded the appreciation prize of the Austrian Federal Ministry for Science and Research in 2012. Recently, Alexander has been designated as a standardisation expert for digital marketing and privacy at the Austrian Standards Institute.

Ugo Pagallo

Ugo Pagallo has been Professor of Jurisprudence at the Department of Law, University of Turin, since 2000. He is faculty at the Center for Transnational Legal Studies (CTLS) in London and a faculty fellow at the NEXA Center for Internet and Society at the Politecnico of Turin. A member of the Group of Experts for the Onlife Initiative set up by the European Commission (2011–2013), he is chief editor of the Digitalica series published by Giappichelli in Turin and co-editor of the AICOL series by Springer. His main interests are AI & law, network theory, robotics, and information technology law (especially data protection law and copyright).


Kai Rannenberg

Kai Rannenberg has held the Deutsche Telekom Chair (formerly T-Mobile Chair) of Mobile Business & Multilateral Security since 2002. Before that he was with the System Security Group at Microsoft Research Cambridge, UK, focusing on "Personal Security Devices & Privacy Technologies". Between 1993 and 1999, Kai worked at Freiburg University and coordinated the interdisciplinary "Kolleg Security in Communication Technology", sponsored by the Gottlieb Daimler & Karl Benz Foundation, researching Multilateral Security. After a Diploma in Informatics at TU Berlin, he focused his PhD at Freiburg University on IT Security Evaluation Criteria and their potential and limits regarding the protection of users and subscribers. Since 1991 Kai has been active in the ISO/IEC standardisation of IT Security and Criteria (JTC 1/SC 27/WG 3 "Security evaluation criteria"). Since March 2007 he has been Convenor of SC 27/WG 5 "Identity management and privacy technologies". Since May 2007 Kai has chaired the IFIP TC-11 "Security and Privacy Protection in Information Processing Systems", after having been its Vice-Chair since 2001. Since September 2009, Kai has been an IFIP Councillor. Kai is active in the Council of European Professional Informatics Societies (CEPIS), where he has chaired its Legal & Security Issues Special Interest Network (CEPIS LSI) since 2003. From July 2004 until June 2013 Kai served as the academic expert on the Management Board of the European Network and Information Security Agency (ENISA). Kai's awards include the IFIP Silver Core, the Alcatel SEL Foundation Dissertation Award, and the Friedrich-August-von-Hayek-Preis of Freiburg University and Deutsche Bank. Kai's current research interests include Mobile applications and Multilateral Security in e.g. M-Business, M-Commerce, M-Banking, and Location Based Services; Privacy and identity management, especially attribute-based authorisation; Communication infrastructures and devices, such as personal security assistants and services; and ICT security and privacy standardisation, evaluation, and certification.

Joseph Reddington

Joseph Reddington is currently a researcher in programming language semantics at Royal Holloway, University of London. As part of his research portfolio, he maintains a strong interest in AAC (augmentative and alternative communication) at the technological, governance, and policy levels. He maintains the Domesday Dataset of AAC use in the UK; the dataset is available from Joseph's home page under the Open Data Commons Attribution Licence.

Daniel Ross

Daniel Ross is co-director of the film The Ister (2004), author of Violent Democracy (Cambridge University Press, 2004), and translator of many works by Bernard Stiegler, including the books Acting Out (Stanford University Press, 2009), For a New Critique of Political Economy (Polity Press, 2010), The Decadence of Industrial Democracies (Polity Press, 2011), Uncontrollable Societies of Disaffected Individuals (Polity Press, 2013), What Makes Life Worth Living (Polity Press, 2013), and The Lost Spirit of Capitalism (Polity Press, 2013 forthcoming).


Alexander Roßnagel

Alexander Roßnagel holds the chair for public law, with a focus on the law of technology and environmental protection, at Kassel University. Among other functions, he heads the Project Group for Constitutionally Compatible Technology Design (provet). Alexander Roßnagel is also executive director of the Interdisciplinary Research Center for Information System Design (ITeG) at Kassel University and Contributing Professor at the Center for Advanced Security Research Darmstadt (CASED).

Sir Nigel Shadbolt

Nigel Shadbolt is Head of the Web and Internet Science Group within Electronics and Computer Science at the University of Southampton. He has made significant contributions to Artificial Intelligence, Computer Science, Open Data and Web Science. The Web and Internet Science group that he leads comprises 140 staff, researchers and PhD students. He is currently Principal Investigator on a £6.14M UK EPSRC-funded Programme Grant researching the theory and design of social machines – Web-scale problem-solving systems comprising large numbers of humans and computers. He is also Chairman and Co-Founder of the Open Data Institute (ODI). Launched in December 2012, the ODI focuses on unlocking supply and stimulating demand for open data; it promotes the creation of economic, environmental and societal value from open data releases. Since 2009, Nigel has acted as an Information Advisor to the UK Government, helping transform public access to Government information, including the widely acclaimed data.gov.uk site. In May 2010, he was appointed to the UK Public Sector Transparency Board, responsible for setting open data strategy across the public sector. He chairs the UK midata programme, whose goal is to empower consumers through access to their data. In 2013 he was appointed a member of the UK's Information Economy Council. He has also advised EU and US policy makers on Open Data. He has founded successful technology companies, one of which, Garlik, was awarded Technology Pioneer status by the Davos World Economic Forum and won the BT Flagship IT Award in 2008. Nigel is the author of the critically acclaimed The Spy in the Coffee Machine: The End of Privacy as We Know It (with Kieron O'Hara). In June 2013 he was knighted in the Queen's Birthday Honours "for services to science and engineering".

Fatemeh Shirazi

Fatemeh Shirazi is a PhD candidate advised by Prof. Johannes Buchmann at Technische Universität Darmstadt. Her research focus is privacy and, in particular, anonymous communication. She holds a Bachelor's degree in software engineering from AZAD University, Iran, and a Master's degree in computer science from Saarland University.

Hervais Simo


Hervais Simo is a PhD candidate in the Department of Computer Science at the Technische Universität Darmstadt, working with Prof. Michael Waidner. His research interests lie in the area of privacy and identity management, privacy engineering, and applied cryptography. Hervais Simo is also a member of the Cloud Computing, Identity and Privacy Group at the Fraunhofer Institute for Secure Information Technology in Darmstadt. He received an MSc degree (Dipl. Inf.) in Computer Science from the Technische Universität Darmstadt.

Sarah Spiekermann

Sarah Spiekermann has been a professor of Information Systems since 2009 and chairs the Institute for Management Information Systems at Vienna University of Economics and Business (WU Wien). Before her tenure in Vienna, she was an assistant professor at the Institute of Information Systems at Humboldt University Berlin (Germany) and a Visiting Scholar at the Heinz College of Public Policy at Carnegie Mellon University in Pittsburgh. Sarah has published over 70 articles in leading IS, marketing and computer science journals and conferences, in particular in the domain of electronic privacy and electronic marketing.

Bernard Stiegler

Bernard Stiegler is a French philosopher, Director of the Institut de recherche et d'innovation, and founder of the School of Philosophy at Épineuil-le-Fleuriel. Since 1994 he has published some thirty books, including Technics and Time (3 vols.; Stanford University Press, 1998–2011), Acting Out (Stanford University Press, 2009), The Decadence of Industrial Democracies (Polity Press, 2011), Uncontrollable Societies of Disaffected Individuals (Polity Press, 2013), and What Makes Life Worth Living (Polity Press, 2013). His most recent work is Pharmacologie du Front national (Flammarion, 2013).

Edgar Whitley


Edgar Whitley is a Reader in Information Systems in the Information Systems and Innovation Group of the Department of Management at the London School of Economics and Political Science. He has a BSc (Econ) and a PhD in Information Systems, both from the LSE. Edgar was the research coordinator of the influential LSE Identity Project on the UK's proposals to introduce biometric identity cards, proposals that were scrapped following the 2010 General Election. Together with Gus Hosein, he has recently published the book Global Challenges for Identity Policies (Palgrave, 2010). This analysis of the UK identity policy proposals has resulted in Edgar advising other governments about the political, technological and social challenges of identity policies. Edgar's research draws on his interests in social theories and their application to information systems. He is currently involved in developing the concept of dynamic consent, which emerged from the interdisciplinary research project EnCoRe (www.encoreproject.info) addressing the role of consent (and the revocation of consent) as a mechanism for providing control over the use of personal data by commercial and public-sector organisations. Edgar is co-editor of the journal Information Technology & People and a member of the Cabinet Office's Identity Assurance Privacy and Consumer Advisory Group, the BCS Information Privacy Expert Panel, the Commonwealth Telecommunications Organisation Cybersecurity Advisory Group and the Information Assurance Advisory Council's Academic Liaison Panel. Further information about Edgar can be found at http://personal.lse.ac.uk/whitley/.




Subject Index

AAC 59
access 190
anonymisation 179
architecture 227
big data 227
biobanking 165
consent 165
consumer empowerment 202
context 227
context-awareness 270
data privacy 40
data protection 179, 190, 227
data protection directive 123
digitalisation 29
disclosure 74
dynamic consent 165
enlightenment 1, 29, 253
freedom of information 190
gender equality 40
ID assurance 253
identity 253, 270
informational privacy 102
informational self-determination 139
information management practices 59
interoperability 227
life management platforms 243
literacy-print 40
Maryanne Wolf 29
medium theory 40
metadata 227
Michel Foucault 29
midata 202
network analysis 74
Nicholas Carr 29
online privacy 123
online social networks 139
personal data 59, 139, 227, 243, 253
personal data ecosystem 1, 89, 270
personal data management 1, 270
personal data markets 102
personal data store 243
personal data vaults 89
personal information 74, 202
personal information stores 202
philosophical engineering 29
privacy 1, 74, 139, 165, 190, 227, 243, 270
privacy by design 1, 89, 179
privacy impact assessment 179
privacy regulation 102
privacy-enhancing technologies 123
public sector information 179
radical control 89
relational selfhood 40
smart disclosure 202
social computing 243
social enterprise 253
social networks 243
soft law 179
Tim Berners-Lee 29
transparency 179
transparency-enhancing tools 123
trust 227
trust frameworks 243
trust networks 270
trustworthy data practices 227
ubiquitous computing 59
user-centricity 270
value 74
value exchange 227
vendor relationship management 253
VRM 253
Walter Ong 29



Author Index

Alexander, D. 253
Bassi, E. 179
Booth, P. 253
Bowden, C. 123
Buchmann, J. 139
Bus, J. 270
Cameron, K. 289
Cavoukian, A. 89
Coles-Kemp, L. 59
Ess, C. 40
Fischer-Hübner, S. 123
Fossheim, H. 40
Friedberg, J. 227
Haynes, P. 227
Heath, W. 253
Hildebrandt, M. 1
Hoofnagle, C. 123
Houghton, D. 74
Hugelier, S. 190
Janssen, K. 190
Joinson, A. 74
Kearns, D. 243
Krontiris, I. 123
Kuppinger, M. 243
Madelin, R. v
Maguire, S. 227
Nebel, M. 139
Nguyen, M.-H.C. 227, 270
Novotny, A. 102
O'Hara, K. 1
Pagallo, U. 179
Rannenberg, K. 123
Reddington, J. 59
Rossnagel, A. 139
Shadbolt, N. 202
Shirazi, F. 139
Simo, H. 139
Spiekermann, S. 102
Stiegler, B. 29
Waidner, M. 1, 123, 139
Whitley, E.A. 165
