
Privacy, Confidentiality, and Health Research

The potential of the e-health revolution, increased data sharing and interlinking, research biobanks, translational research, and new techniques such as geolocation and genomics to advance human health is immense. For the full potential to be realized, though, privacy and confidentiality will have to be carefully protected. Problematically, many conventional approaches to such pivotal matters as consent, identifiability, safeguarding, security, and the international transfer of data and biospecimens are inadequate. The difficulties are aggravated in many countries by the fact that research activities are hobbled by thickets of laws, regulations, and guidance that serve neither research nor privacy well. The challenges are being heightened by the increasing use of biospecimens, and by the globalization of research in a world that has not globalized privacy protection. Drawing on examples from many developed countries and legal jurisdictions, William Lowrance critiques the issues, summarizes various ethics, policy, and legal positions (and revisions underway), describes innovative solutions, provides extensive references, and suggests ways forward. Dr. William W. Lowrance is a consultant in health research policy and ethics, based in La Grande Motte, France. After earning a Ph.D. from The Rockefeller University in the life sciences with a concentration in organic chemistry, he shifted his attention to the social aspects of science, technology, and medicine. He has been a faculty member or fellow, teaching and conducting research on health policy, environmental policy, and risk decisionmaking, at Harvard, Stanford, and Rockefeller Universities. He has served as the Director of the Life Sciences and Public Policy Program of Rockefeller University, and as the Executive Director of the International Medical Benefit/Risk Foundation, headquartered in Geneva. His books include Of Acceptable Risk: Science and the Determination of Safety and Modern Science and Human Values. In recent years he has focused on the issues of privacy and confidentiality in health research, and he chaired the advisory committee that drafted the Ethics and Governance Framework of UK Biobank.

Cambridge Bioethics and Law

This series of books was founded by Cambridge University Press with Alexander McCall Smith as its first editor in 2003. It focuses on the law’s complex and troubled relationship with medicine across both the developed and the developing world. In the past twenty years, we have seen in many countries increasing resort to the courts by dissatisfied patients and a growing use of the courts to attempt to resolve intractable ethical dilemmas. At the same time, legislatures across the world have struggled to address the questions posed by both the successes and the failures of modern medicine, while international organizations such as the WHO and UNESCO now regularly address issues of medical law.

It follows that we would expect ethical and policy questions to be integral to the analysis of the legal issues discussed in this series. The series responds to the high profile of medical law in universities, in legal and medical practice, as well as in public and political affairs. We seek to reflect the evidence that many major health-related policy debates in the UK, Europe and the international community over the past two decades have involved a strong medical law dimension. With that in mind, we seek to address how legal analysis might have a transjurisdictional and international relevance. Organ retention, embryonic stem cell research, physician-assisted suicide and the allocation of resources to fund health care are but a few examples among many. The emphasis of this series is thus on matters of public concern and/or practical significance. We look for books that could make a difference to the development of medical law and enhance the role of medico-legal debate in policy circles. That is not to say that we lack interest in the important theoretical dimensions of the subject, but we aim to ensure that theoretical debate is grounded in the realities of how the law does and should interact with medicine and health care.

Series Editors
Professor Margaret Brazier, University of Manchester
Professor Graeme Laurie, University of Edinburgh
Professor Richard Ashcroft, Queen Mary, University of London
Professor Eric M. Meslin, Indiana University

Marcus Radetzki, Marian Radetzki, Niklas Juth, Genes and Insurance: Ethical, Legal and Economic Issues
Ruth Macklin, Double Standards in Medical Research in Developing Countries
Donna Dickenson, Property in the Body: Feminist Perspectives
Matti Häyry, Ruth Chadwick, Vilhjálmur Árnason, Gardar Árnason, The Ethics and Governance of Human Genetic Databases: European Perspectives
Ken Mason, The Troubled Pregnancy: Legal Wrongs and Rights in Reproduction
Daniel Sperling, Posthumous Interests: Legal and Ethical Perspectives
Keith Syrett, Law, Legitimacy and the Rationing of Health Care
Alastair Maclean, Autonomy, Informed Consent and the Law: A Relational Challenge
Heather Widdows, Caroline Mullen, The Governance of Genetic Information: Who Decides?
David Price, Human Tissue in Transplantation and Research
Matti Häyry, Rationality and the Genetic Challenge: Making People Better?
Mary Donnelly, Healthcare Decision-Making and the Law: Autonomy, Capacity and the Limits of Liberalism
Anne-Maree Farrell, David Price and Muireann Quigley, Organ Shortage: Ethics, Law and Pragmatism
Sara Fovargue, Xenotransplantation and Risk: Regulating a Developing Biotechnology
John Coggon, What Makes Health Public?: A Critical Evaluation of Moral, Legal, and Political Claims in Public Health
Mark Taylor, Genetic Data and the Law: A Critical Perspective on Privacy Protection
Anne-Maree Farrell, The Politics of Blood: Ethics, Innovation and the Regulation of Risk
Stephen Smith, End-of-Life Decisions in Medical Care: Principles and Policies for Regulating the Dying Process
Michael Parker, Ethical Problems and Genetics Practice
William W. Lowrance, Privacy, Confidentiality, and Health Research

Privacy, Confidentiality, and Health Research

cambridge university press Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Mexico City Cambridge University Press The Edinburgh Building, Cambridge CB2 8RU, UK Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9781107020870 © William W. Lowrance 2012 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 2012 Printed in the United Kingdom at the University Press, Cambridge A catalogue record for this publication is available from the British Library Library of Congress Cataloguing in Publication data Lowrance, William W., 1943– Privacy, confidentiality, and health research / William W. Lowrance. p. ; cm. – (Cambridge bioethics and law) Includes bibliographical references and index. ISBN 978-1-107-02087-0 (hardback) I. Title. II. Series: Cambridge bioethics and law. [DNLM: 1. Health Services Research – legislation & jurisprudence. 2. Privacy – legislation & jurisprudence. 3. Computer Security. 4. Confidentiality – ethics. 5. Confidentiality – legislation & jurisprudence. 6. Databases, Factual – legislation & jurisprudence. 7. Health Services Research – ethics. W 84.3] 610.2850 58–dc23 2011053267 ISBN 978-1-107-02087-0 Hardback Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

To my wonderfully supportive wife, Catherine Couttenier-Lowrance

Contents

Preface and acknowledgments   page xiii

1  Introduction   1
   Health research as a public-interest cause   1
   Privacy protection as a public-interest cause   3
   Health data vulnerability   4
   The challenges   4

2  Data, biospecimens, and research   7
   A few essential notions   7
   The e-health revolution   10
   Data and databases   11
   Biospecimen collections   13
   Public research resource platforms   18
   Degrees of sensitivity   18
   Data and biospecimen ownership   22
   What is research, anyway?   24

3  Privacy, confidentiality, safeguards   29
   Privacy   29
   Confidentiality   33
   Safeguards   34

4  Broad privacy and data protection regimes   35
   The plethora of controls   35
   Two sorts of regimes   36
   Early concerns about computers and privacy   36
   The OECD privacy principles   38
   Council of Europe Convention 108   40
   The European strategy   40
   The US approach   47
   The Canadian strategy   48
   Australia, Japan, APEC   49

5  Healthcare, public health, and research regimes   52
   Healthcare and payment regimes   52
   Public health regimes   53
   Human-subject protection regimes   54
   Clinical trial and product postmarketing regimes   57
   Other, specialized laws and regulations   59
   Research ethics review systems   60
   “Rules” thickets in four countries   62

6  Consent   67
   Consent as it is applied now   69
   Legitimately sought, meaningfully informed, willingly granted   71
   The casting of consent   74
   Right to withdraw   78
   Community engagement   79
   Searching for research candidates   80
   Research without consent   81
   Some reflections   84

7  Identifiability and person-specific data   87
   “Personal data”   87
   Identifiers   92
   De-identification for research   93
   Ways non-identified data can become identified   96
   Retaining the possibility to re-identify   98
   The HIPAA Privacy Rule approach   99
   Key-coding   104
   Identifiability terminology   105
   Identifiable to whom?   107
   No personal data: no human subject?   108
   Some reflections   109

8  Genetics and genomics   111
   Genetics/genomics   112
   GWAS   114
   EHR-driven genomic discovery   115
   Genotype-driven recruitment   115
   Notice and consent   116
   Genetic and genomic identifiability   118
   The Personal Genome Project   121
   Some reflections   122

9  Safeguards and responsibilities   125
   Operational safeguards   125
   Formal responsibilities   126
   Stewardship   129
   Data and biospecimen retention   129
   Security   130
   Privacy risk assessment   131
   Requests for non-research access   132
   Enforcement and sanctions   135

10  Data sharing, access, and transfer   138
    Data sharing   138
    Access   140
    The two basic modes of access   140
    Access agreements   141
    Terms of restricted access   142
    Privacy-preserving data linkage   146
    Extremely restricted access   149
    Oversight and governance   151
    International transfer   152
    Some reflections   156

11  Ways forward   158

Bibliography   160
Index   176

Preface and acknowledgments

The potential of the e-health revolution, increased data sharing and interlinking, research biobanks, translational research, and new scientific techniques such as genomics to advance human health is immense. For the full potential to be realized, though, privacy and confidentiality will have to be carefully protected. Problematically, many current approaches to such pivotal matters as consent, identifiability, safeguarding, security, and the international transfer of data and biospecimens are inadequate. The difficulties are aggravated in many countries by the fact that research activities are hobbled by thickets of laws, regulations, guidance, and governance that serve neither research nor privacy well.

For reasons that will be discussed, much of the legal context is in flux. The EU Data Protection Directive (adopted in 1995) is being thoroughly revised, after which the Member States will revise their laws to implement the new provisions. The Council of Europe Convention 108 on Protection of Personal Data (1981) is being revised, as are the EU Clinical Trials Directive (2001), the Australian Privacy Act (1988), and the US Federal Common Rule on Protection of Human Subjects (1991). And it is inevitable that at least the research-related requirements of the Privacy Rule under the US Health Insurance Portability and Accountability Act (2002) will be revised before long. Similar changes are occurring in Asia and elsewhere as well.

This book can’t resolve all of the issues; no-one has, and no book could. It is concerned not just with information recorded in the course of medical care, central though that is, but with any kind of information, from any source, that can be brought to bear as scientific evidence in health research. It uses selected examples to illustrate thematic points, not to develop comprehensive comparisons. For practical realism, it describes many of the ways research proceeds and how existing data are accessed and used and new data are generated. It is oriented to research in the developed world, and it draws on anglophone examples and sources, but most of the considerations are relevant everywhere health research is pursued.


Every effort has been made to keep the discussion concise. Complementing this, extensive references are provided, not only to support the exposition but also to help novices get their bearings and help experienced readers think about how things are done in technical specialties and ethico-legal cultures other than their own.

What this book attempts to do is identify the central issues, review them, describe ways they have been, or are being, addressed in various situations, provide resources, and suggest ways forward. Its purpose is to stimulate and inform reflection and discussion. In particular it focuses on issues, currently handled in rather different ways in different places, that must be worked on as both privacy protection and health research take on more international dimensions, and for which more internationally accepted rules and practices are urgently needed.

All web references were current as of March 2012.

Acknowledgments

The work on this book was supported by the Wellcome Trust, grant number [086258], for which the author is of course grateful. No-one at the Trust influenced the writing or saw the book until it was published. The author alone is responsible for the contents.

The material on p. 25 is reprinted with kind permission from the Council of State and Territorial Epidemiologists. The material in Box 3 from the Institute of Medicine report, Beyond the HIPAA Privacy Rule, is reprinted with permission from the National Academies Press, copyright 2009, National Academy of Sciences. The figure, “Data flow via a research resource platform,” is a modification of a figure drawn by the author for a Policy Forum article by him and Francis S. Collins, “Identifiability in genomic research,” Science, 317 (2007), 600.

1

Introduction

This book examines the ways society can best promote the use of personal information for health research, and at the same time protect informational privacy and confidentiality. Much is at stake.

Occasionally skeptics frame the overall issue as privacy versus research, casting research as mainly serving the intellectual curiosity and self-advancement of scientists. This is shortsighted. Like everyone else, scientists hope for fame and fortune, or at least a good reputation and reasonable income. And having an intense sense of curiosity is a requisite for being a scientist. But the work is awfully demanding, and true discovery moments are rare. Researchers work hard to solve health-related problems, as do the institutions that host and support them. To the extent that there is a balancing in properly established research, the issue is privacy versus the advancement of health through research.

Throughout, the book reflects the author’s conviction that both health research and privacy protection are public-interest causes. The realizations that lead to such a conviction are familiar and the logic is commonsensical, but the reasoning deserves to be summarized at the outset.

Health research as a public-interest cause

From conception onward, everyone is exposed to myriad health risks. Many of the risks are either attracted, caused, intensified, or transmitted by human activity, and many can be mitigated or compensated for by human activity. Society at every scale of organization, whether village, city, provincial, national, or supranational, accords high priority to the reduction of health threats, the promotion of health, and the provision of health care. What good health means for any person at any stage of life is relative to his or her risks and resources. But despite the difficulty in defining it precisely, health is universally valued as a core human need and a condition for living life in dignity.

Nowadays we live thoroughly interconnected lives in which illness risks and costs are widely shared, although not equally or fairly. The burdens of


illness and disability suffered in resource-poor and unstable countries are burdens on the world in the toughest economic, social, and security senses, as well as in high moral senses. The poor of the world share vectors of contagion with the prosperous, and the prosperous share vectors of unhealthy lifestyles with the less prosperous. Pathogens can propagate at the speed of airline travel or food transport. Epidemics and natural disasters strike with no respect for political boundaries. We are vulnerable together.

None of us can know what diseases and disabilities we or our families or friends will fall victim to as our lives go on – and when those afflictions occur, we tend to hope fervently that a great many people’s experiences have been studied in depth and have led to the development of proven effective diagnostic and curative, or at least palliative, techniques and products for coping with them. As we all stand to benefit from the knowledge commons, we should all contribute to the knowledge commons to help others, including people we will never know. Participation in research is a moral opportunity.

Relating to all this are costs. In all developed countries and increasingly in the developing countries, health care is an industrial-scale activity, the costs of which are paid for at least partly, and in many cases almost entirely, by the state. These costs, and cost-effective provision of prevention and care, are and always will be of crushing concern to states, and so states depend on research to understand causes, evaluate what works best and is most cost-effective, innovate, and improve. Although private sector healthcare providers and insurers have different financial considerations, they too must worry about costs and quality, and they too depend on research for understanding and improvement.

In many ways, it is the poor of the world who stand to gain the most from research, relative to their health and socioeconomic burdens – not only as regards such infectious scourges as tuberculosis, malaria, AIDS, schistosomiasis, leprosy, and river blindness, but also as regards such draining noncommunicable afflictions as infant malnutrition, diabetes, cardiovascular diseases, and cancer.

An encouraging thing is that the fruits of health research, whether basic knowledge, practice guidance, techniques, or products, tend to propagate extraordinarily efficiently and widely. There will always be budgetary and cultural limitations to the application of research results and new healthcare technologies, even in the wealthiest communities. But no other universal progressive endeavor comes close to the critical screening and efficiency with which scientific and medical ideas and information are generated, distributed through journals, the web, and conferences, evaluated for quality and relevance, and translated into practice. Knowledge


developed anywhere about mastitis, burns, glaucoma, migraine, organic solvent toxicity, wood dust allergies, hospital hygiene, and countless other matters can be put to use around the world.

Health research, then, is of great public-interest importance, and this is underscored by the fact that much of it is financially supported or conducted by government bodies, by organizations to which governments grant nonprofit tax-exempt status, and by international organizations in which governments participate. At many points this book will refer to the public-interest justification of research policies and practices that serve the common human good, and to the contributions to the common good that members of the public make by volunteering to participate in research or allowing data about themselves to be studied.

Privacy protection as a public-interest cause

Simply and profoundly: Privacy should be respected because people should be respected. Despite its personal and contextual relativity and the difficulty in defining it, as will be discussed in Chapter 3, privacy is widely valued as a core human need and a condition for living life in dignity.

For the research enterprise – researchers and all the institutions that support, regulate, govern, and disseminate the results of research – attending assiduously to privacy and the relations of confidentiality that serve it is essential to earning the trust that encourages the public to become involved in research, be candid and generous in answering questions or allowing information about themselves to be used, be unselfish in providing biospecimens, and stay involved in projects as long as is scientifically useful.

If these matters are not carefully attended to, the people to whom data relate may be offended on principle by a violation of confidentiality promises. They may resent intrusions following wrongful disclosure, such as improper clinical trial recruitment approaches or unwanted disease-targeted marketing. They may be, or fear that they may be, exposed to embarrassment, defamation, stigmatization, harassment, extortion, identity theft, or financial fraud, or denial of access to health or life insurance, employment, job promotion, or loans. And they and their sympathizers may drop their goodwill toward the institutions or research programs. Offending researchers or institutions may suffer negative publicity, litigation, or financial losses. A breach of confidence can cost months or years of remediating effort by university administrators or clinical or corporate managers and their lawyers and public affairs


staff, and a sullied reputation can be a burden for a long time. A research team may be denied access to data or data-collection opportunities, and even a whole line of research may suffer by association.

Health data vulnerability

For calibration one might well wonder how many privacy intrusions have occurred and what harms people have incurred. Thousands of unwarranted disclosures, losses, and thefts of health care data, and some successful hacking attacks, have occurred in a number of countries; however, relatively few direct material harms to data-subjects have been documented and not many lawsuits have been pursued in the courts (although some incidents may have been pursued out of court, off the public record). Probably credit card abuse and identity theft have been the main harms. Most of the violations that have been confirmed have resulted from careless, incompetent, or illegal actions of the sorts that professional care and organizational safeguards should be able to prevent. But the scale and sensitivity of most healthcare data systems, and the fact that they carry patient payment details, will always make them a potential target of attack.1

Health research data have mostly been spared so far. This may be because of precaution, or luck, or because healthcare records are more tempting targets for intrusion in that they tend to be easier to pry into than most research databases, reveal more readily comprehensible health information, and carry more exploitable financial details. None of this justifies being lackadaisical: Any of the dread events recited above could be incurred by health research data. Several research databases have been lost, stolen, or hacked into in the last few years. And health research is changing in ways that are creating new vulnerabilities.

The challenges

Ethical, legislative, regulatory, technical, administrative, and day-to-day operational adjustments are being made everywhere to try to cope with the issues raised in this book. It is important to have the assortment of challenges in mind from the beginning, partly because most of the issues

1 For documentation of incidents, see Privacy Rights Clearinghouse, “Chronology of data breaches, 2005–present,” healthcare subset: www.privacyrights.org/data-breach. US Department of Health and Human Services, healthcare data security breach notifications: www.hhs.gov/ocr/privacy/hipaa/administrative/breachnotificationrule/breachtool.html.


manifest themselves as issue-clusters and have to be dealt with in concert. For concision they are set out here in bulleted lists, with a few examples.

New scientific opportunities. We are in an era of unprecedented scientific opportunity, and information technologies are providing reach, speed, memory capacity, and search flexibility as never before. For the health sciences, it’s an exhilarating time. Among the many advances, research is benefiting from:
□ generation of entirely new kinds of data (real-time digital images of brain activity, genomic and other -omic data, human microbiome data, i.e., detailed ecological characterization of the trillions of creatures that colonize everyone’s body . . .);
□ development of new modes of data capture, storage, and transmission (networked electronic health records, wearable biosensors, Internet-mediated gathering of input from dispersed research participants . . .);
□ reclassification of many health problems based on deeper understanding of causal factors instead of, or in addition to, their clinical appearance;
□ integration of social and behavioral research with biomedical and health services research (large life-course and multigenerational studies, behavioral toxicology, mapping of health and healthcare disparities across populations . . .);
□ amassing of ever-growing mountains of data in healthcare administrative databases, payment databases, and disease and condition registries, most of which can be accessed, under various conditions, for research;
□ construction of large-scale research platforms (health-data linkage systems, research biobanks, genome-wide association databases, clinical trial networks, social research data archives . . .).

Chronic policy problems becoming worse. Serious privacy and confidentiality impediments continue to hamper research, however, notably among them:
□ uncertainties and disagreements around the formal construal of “personal data” or “personally identifiable information,” and the notion of identifiability generally;
□ debate about the ethical and legal appropriateness of consent to complicated research and the sharing of data via research platforms, at least as consent tends to be applied currently;
□ dispute over the acceptability of broad consent to unspecified, perhaps indeed unspecifiable, future research;
□ lack of clarity about how to deal with privacy and confidentiality implications for relatives of people involved in research;


□ public – and researcher – apprehension about the legal power of researchers to resist forcible access to sensitive research data by the police, courts, banks, health or life insurers, or other external parties;
□ inconsistencies and redundancies among the multitude of laws, regulations, and guidelines, many of them vague, duplicative, outdated, or just not relevant for contemporary research;
□ onerous, inefficient, and costly procedural requirements for complying with all the laws, regulations, and guidelines.

Exacerbations of scale. Many challenges are growing in complexity as scale increases, as with:
□ in many situations, reduction of direct control by patients over how data are used, and reduction of direct control by physicians and medical institutions that provide access to patient data and biospecimens for research;
□ increased risks to privacy from the growth of data and biospecimen holdings, pooling and interlinking of data-sets, and the generally increasing geographic and institutional dispersing of data and biospecimens;
□ decentralization and outsourcing of much data storage and analysis, potentially diffusing accountability.

Recently arising or intensifying concerns. Among the newer concerns are ones having to do with:
□ the use of kinds of data and biospecimens that in many jurisdictions have been ethically or legally off-limits for research until fairly recently (residual newborn blood screening specimens, abortion registry data, stem cells . . .);
□ the availability of some entirely new kinds of data that can be useful for research but that can carry clues to individuals’ identities, habits, and connections (geospatial tracing of people’s movements and exposures, broadcasting of personal details via online social networking . . .);
□ privacy risks collateral to the rapid and extensive data sharing that is being widely promoted;
□ the security of computerized data generally;
□ genomic science’s headlong assembly line decoding of personal origin-and-fate factors that society is not well prepared to interpret or make judicious use of;
□ genotype data as a potentially identifying or tracing tag when linked with otherwise non-identified data;
□ the possibility of aggressive demands for access to research data under freedom of information or anti-terrorism laws;
□ threats to privacy as torrents of data are transmitted across borders as research continues to globalize, in a world that has hardly globalized privacy protections.

2

Data, biospecimens, and research

A few essential notions

Clarity of conception and vocabulary is essential when discussing the subjects of this book. Most of the following notions may seem commonplace and hardly in need of definition, but most of them become the subject of scholarly debate or close legal scrutiny from time to time. (Unless otherwise noted, the definitions are ones consistent with common usage but are the author’s attempt to be precise and clear for the purposes of this book.)

Data are records of observations or actions, or, stated slightly more formally, patterns of symbols that stand for observed values or actions. They may be instrument readings, x-ray or scanner images, voice recordings, family lineage charts, interview responses, hospital billing records, or countless other results of looking, asking, listening, measuring, recording, or analyzing. In research, almost all data are now handled in digital form, even if this requires transcription or translation from nondigital formats. This greatly facilitates computerized analysis, of course. It also allows the distribution of data from site to site at close to the speed of light and at very low cost, which can be either wonderful or troublesome, depending on how the data are managed and used.

Information is data set within an interpretive context to generate meaning. Often information and data are taken to mean the same thing, but there is some advantage in using them differently in the context of research. Raw numbers, graphs, images, or long strings of digital bits – data – “mean” nothing until they are understood as representing hormone level, biopsy photomicrograph, quality-of-life score, model number of an implanted device, genetic sequence, cost of surgical episode, or whatever, and the sampling scheme, data collecting circumstances, observational system, descriptive scale, and framework of scientific or clinical understanding are taken into account. Data quality is always a concern, of course, as is the quality of the methodological and ethical documentation (metadata). Obviously, data can be incorrect, and information can be false.
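The data/information distinction can be made concrete with a minimal sketch in code (all field names and values below are invented for illustration): the bare number means nothing until it is bundled with the interpretive context the paragraph above describes.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    value: float        # the raw datum: a bare pattern of symbols
    variable: str       # what the value stands for
    unit: str           # descriptive scale
    method: str         # observational system and collecting circumstances
    subject_code: str   # key-coded reference to the data-subject

# 19.7 alone is data; with its interpretive context it becomes information:
obs = Observation(value=19.7, variable="serum cortisol", unit="ug/dL",
                  method="morning venous draw, immunoassay",
                  subject_code="S-00421")
print(f"{obs.variable}: {obs.value} {obs.unit} ({obs.method})")
```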


Knowledge, as it is taken to mean when research is defined in regulations as “the pursuit of generalizable knowledge,” can be thought of as widely accepted understanding, based on verified information, compatible with other knowledge, and perhaps proven useful in practical experience.

Data-subjects are the persons whom data are about, the people to whom data pertain. In the UK Data Protection Act, for example, “data subject means an individual who is the subject of personal data,” with personal data defined immediately afterward. (Just to indicate how quickly such a definition can become contested: What about an easily contactable but noninvolved relative of a research volunteer about whom the study results hold implications? Has she or he become a data-subject de facto?)

Personal data, or individually or personally identifiable data, are data that are about real persons or that can be related to real persons by deduction from partial descriptors or linking with other data. How to formally distinguish data that are somehow person-related from those that are not, though, is a much more subtle matter than one might assume it to be. Identifiability and data-subject status will be discussed in Chapter 7.

(A note to readers: Whenever this book mentions either “data-subjects,” “subjects,” or “research participants,” the alternatives should be understood as appropriate. Usually not much distinction is made, but “participants” suggests more aware and active involvement than “data-subject” does, and one can be a data-subject without being aware of it.)

Data handling, in ethical or legal senses relating to privacy, has to be construed comprehensively, in order to include any action that provides an opportunity to take the data into knowledge or affect them. Thus it must include such actions as collecting, receiving, holding, examining, analyzing, linking, altering, transferring, archiving, and destroying.

Data processing is the term used in the EU countries and many others for data handling. It connotes far more than the outmoded sense of rote data entry and analysis. The EU Data Protection Directive defines processing as “any operation or set of operations which is performed upon personal data, whether or not by automatic means, such as collection, recording, organization, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, blocking, erasure or destruction” (Article 2b).

Data disclosure, as it will be used in this book, means the divulging or transfer of, or provision of access to, data. It should not be confused with some statisticians’ usage of disclosure to mean the revealing of


data-subjects’ identities; the passing-on of data may or may not lead to the recipient’s learning who the data are about. Whether a recipient actually looks at the data, does anything with them, or retains or destroys them is usually considered irrelevant as to whether disclosure has occurred. Any showing or passing along of data, whether authorized or unauthorized, whether careful, or careless, or malicious, whether orally, or via paper or electronic media, can amount to disclosure. (An exception might be if data are sent in error and the recipient proves, or at least reliably attests, that the data were not looked at or copied.) Often the act of disclosure in itself is regulated by ethics or law, regardless of what happens once the data are in the hands of recipients.

Data sharing means the deliberate provision of access to data for use by others. This is a very important activity in research, as Chapter 10 will discuss.

Secondary use of data has long meant the use of existing data for a purpose different from the originally declared purposes. Although the observational techniques, choice of variables, and data quality can’t be controlled the way they can in prospective studies, studies of existing data analyze (messy) real experience, and so they can help in understanding and improving (messy) real experience. Much constructive research depends on using existing data, such as clinical data. If the purposes, users, or auspices are different from the original ones, the implications for the data-subjects’ privacy must be taken into account and compensating actions taken as necessary.

Differentiating secondary from primary uses can require judgment. This may hinge, for example, on how different the use is from the initial use, or on whether research is viewed under law or policy as being integral to the work of a healthcare system that is the source of the data. Some organizations, wanting to emphasize the integrated nature of their activities and make it clear that research is part of their mission, prefer to speak of “additional” instead of “secondary” use or to make little distinction, a stance that surely will become more prevalent, at least within large healthcare systems, as time goes on. In many situations, “secondary” can still be useful to connote that a formal threshold – such as approval by an ethics or other governance body, or de-identification of the data – has to be surmounted before the data can be used beyond the specified primary purposes or used by researchers outside the custodial circle. What is important is not the rubric but clarity about the nature of the use, the possible risks associated with the use, and requirements relating to the use.

Three important derivative uses of data-sets that not everyone thinks of right away are statistical case-controlling, searching for potential


candidates for research, and development and testing of analytic or privacy-protection methods.
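The first of those derivative uses, statistical case-controlling, can be sketched in a few lines. The code below is only an illustration (the records are invented, and real studies match on many more variables under statistical guidance): for each case it draws a control of the same sex and similar age from the same data-set.

```python
import random

# Invented, de-identified records: (code, age, sex, has_condition)
records = [
    ("A1", 54, "F", True),  ("A2", 61, "M", False), ("A3", 52, "F", False),
    ("A4", 70, "M", True),  ("A5", 68, "M", False), ("A6", 55, "F", False),
]
cases = [r for r in records if r[3]]
pool = [r for r in records if not r[3]]

def matched_control(case, pool, age_window=5):
    """Pick a control of the same sex whose age is within the window."""
    candidates = [c for c in pool
                  if c[2] == case[2] and abs(c[1] - case[1]) <= age_window]
    return random.choice(candidates) if candidates else None

for case in cases:
    control = matched_control(case, pool)
    if control:
        pool.remove(control)  # use each control at most once
        print(f"case {case[0]} matched to control {control[0]}")
```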

The e-health revolution

e-health – the use of informatics in collecting and managing health-related data – takes many forms and can perform many functions, from medical recordkeeping, to practice management and procedure scheduling, to online prescription ordering (e-pharmacy), to automatic bedside or home collection and transmission of patient data, to the provision of clinical decision guidance based on accumulated medical experience, to the provision of care at a distance (telemedicine), to billing and reimbursing, to amassing research databases and facilitating the kinds of database research discussed throughout this book. This set of transformations in health care and research is now fully underway, even if some elements are developing faster than others.1

Electronic health records (EHRs) are comprehensive digital patient records carrying, or linked with, other health-related data. They are intended to be much more than just digital versions of conventional paper charts. The vision for EHRs includes their accumulating information from a person’s health and healthcare experience over a long term, ideally throughout life, and being linkable with pharmacy dispensing data, biospecimens, medical images, family health and reproductive history, genomic data, disease and other registries, and other data. The challenge is to develop them so that they can manage the enormous variety of data that are handled in health care and payment, function efficiently as networked or at least intercommunicating systems among diverse settings, make information accessible at authorized point of need for a great many points and needs, and do all of this securely. Whether and how EHR systems will accommodate patient input and choices is being explored; they will remain at base formal accounts of medical encounters.

1 A European overview is empirica GmbH for the European Commission, Karl A. Stroetmann, Jörg Artmann, Veli N. Stroetmann, et al., European Countries on their Journey towards National eHealth Infrastructures (2011): www.ehealth-strategies.eu/report/eHealth_Strategies_Final_Report_Web.pdf. A Canadian overview is Don Willison, Elaine Gibson, and Kim McGrail, “A roadmap to research uses of electronic health information,” in Colleen M. Flood (ed.), Data Data Everywhere: Access and Accountability? (Montreal, Quebec and Kingston, Ontario: McGill-Queen’s University Press, 2011), pp. 233–251. In the US, much e-health activity is being shaped and supported under the US Health Information Technology for Economic and Clinical Health (HITECH) Act and can be followed via: http://healthit.hhs.gov/portal/server.pt/community/healthit_hhs_gov_home/1204.
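As a rough sketch of the linkage idea behind that EHR vision (the structure and field names below are invented for illustration, not taken from any real EHR standard), an EHR entry can be thought of as a record that carries keys into pharmacy, registry, and biospecimen holdings:

```python
from dataclasses import dataclass, field

@dataclass
class EHREntry:
    patient_key: str                  # stable identifier within the system
    encounters: list = field(default_factory=list)
    pharmacy_ids: list = field(default_factory=list)     # dispensing records
    registry_ids: list = field(default_factory=list)     # e.g., disease registries
    biospecimen_ids: list = field(default_factory=list)  # banked samples

record = EHREntry(patient_key="PK-1003")
record.encounters.append({"date": "2011-05-02", "dx": "J45", "note": "asthma review"})
record.pharmacy_ids.append("RX-88412")       # link to a dispensing database
record.registry_ids.append("ASTHMA-REG-77")  # link to a registry entry
print(record)
```

Each added key widens what an authorized user can assemble about the patient, which is precisely why networked EHRs raise the privacy questions discussed throughout this book.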


(A note on nomenclature: Generally, “electronic medical records” (EMRs) refers to conventional medical practice documentation. “Electronic health records” (EHRs) usually refers to more robust systems that conform at least partially to the vision just described, which because of their broad scope and interconnectedness can be used more powerfully for healthcare service evaluation, public health purposes, and research as well as for the provision of individual patient care. “Personal health records” (PHRs) refers to records maintained by either a healthcare system or a commercial service but controlled by the patients themselves. Depending on the model, patients may keep track of their immunizations, medications, blood pressure, and other information, enter observations and comments, consult or import information such as test results from their medical records, or communicate with care providers or with other patients who agree to be in contact. Because PHRs are accessed via home computers or other personal digital devices, can be shown to anyone, and medical confidentiality doesn’t apply to them, there is much reason to be concerned about how well they protect privacy.)

The development of EHRs for healthcare systems has been slow and awkward, but in part this has resulted from the overwhelming complexity of the task. There is fairly wide agreement as to what their principal functions should be, and in a number of places some elements are working well in practice. In addition to their efficiency for patient care, if properly structured they can provide nearly real-time data for communicable disease tracking, drug adverse event surveillance, and other communal public health purposes, and of course rich clinical data for research.

Data and databases

We mustn’t let the seeming blandness of the word “data” numb our appreciation of the range and value of information that is potentially derivable from the data and biospecimen collections that are now, or will soon become, available for research. Indeed, many collections have accumulated so much potential as to deserve being considered international treasures. It isn’t necessary to develop a taxonomy, and the categories overlap, but it is important to recognize the diversity of data handled and think about how the implications for privacy and confidentiality can differ depending on the purposes and auspices. Databases are a useful indication of the categories.

Databases are ordered collections of data oriented to purposes. Some databases used in research serve highly specialized purposes; some are maintained as resources for a changing multiplicity of uses. Some are small; some are enormous. Some accumulate one-off observations; some


are “longitudinal,” following a set of people in a uniform way over time. The databases used in research may relate to:
□ health care and promotion (dental office, hospital, ambulance, laboratory, pharmacy, blood supply, speech therapy, home care, genetic counseling . . .);
□ social care (support for the blind, school nutrition supplement programs, hospice care . . .);
□ payment for health care (government funds, private insurance . . .);
□ vital statistics (certification of birth, marriage, adoption, death . . .);
□ demographics (Manitobans, Alabama retirement home residents, Londoners of Cypriot descent . . .);
□ social or behavioral background (educational attainment, employment history, military experience . . .);
□ environmental or lifestyle hazard exposure (residents near cell telephone transmission towers, asphalt pavers, Chernobyl survivors . . .);
□ military hazard exposure and health risks (exposure to depleted-uranium munitions, artillery troop hearing loss, post-combat stress . . .);
□ diseases or disabilities (hepatitis C, Stevens-Johnson syndrome, carpal tunnel syndrome, dyslexia . . .);
□ genetics or genomics (family health history, genetic test, genotype . . .);
□ medical or sociomedical interventions (renal dialysis, silicone breast implantation, support for new mothers, cognitive behavior therapy . . .);
□ public health monitoring, surveillance, or intervention (newborn cleft palate registration, nursing home quality-of-life survey, adverse vaccine reaction surveillance, inner city knife-wound mapping . . .);
□ experimental investigations (experimental oral surgery, sleep deprivation studies, nutrition experiments, comparison of home care with hospital outpatient care . . .);
□ medical product development and regulation (diagnostic test evaluation, drug cost-effectiveness . . .);
□ long-term health experience (birth and other cohorts, occupational exposure monitoring and health surveillance, chronic disease or disability experience . . .).

Registries are systematic records of specified phenomena occurring with individuals in a defined population, accumulating entries over time, and maintained for a purpose. Variously, they record vital statistics (such as birth and death), conditions (being an identical twin, being a pregnant epileptic woman), preventions (mammography), exposures (to congenital syphilis, to nanoparticles in manufacturing), diseases or disabilities (chlamydia infection, end-stage renal disease), or interventions (heart valve transplantation, neonatal intensive care). Many are maintained by public


health agencies, and others by long-term research projects or pharmaceutical R&D units. Registries differ from clinical databases, in that registries log prespecified occurrences within defined populations, usually based on uniform and obligatory notifications by physicians, hospitals, coroners, public health officials, or others, whereas clinical databases accumulate much more diverse but far less uniform data about assorted patients who present for care.

Patient outcome registries, a hybrid model, have been defined as systems that “collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition, or exposure, and that serve one or more predetermined scientific, clinical, or policy purposes.”2

Research projects often depend on using registry data to gauge the incidence or prevalence of a phenomenon in a population, trace the progression of disease episodes, establish the natural history of a disease, search for associations between lifestyle, genetic, or exposure factors and disease or mortality, or evaluate the outcomes of diagnostic or treatment regimens. (One can imagine that in the future, virtual registries may be compiled as needed, by selecting and tracking cases recorded in networked EHRs.)
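The parenthetical idea of a virtual registry can be sketched briefly. Assuming, hypothetically, that networked EHR systems could expose case records as (patient key, diagnosis code, year) tuples, a registry could be compiled on demand and crude yearly case counts derived:

```python
from collections import Counter

# Hypothetical extracts from three networked EHR sites (invented records)
ehr_feeds = [
    [("p01", "E10", 2010), ("p02", "J45", 2011)],   # site 1
    [("p03", "E10", 2011), ("p04", "E10", 2011)],   # site 2
    [("p05", "J45", 2010), ("p06", "E10", 2012)],   # site 3
]

def virtual_registry(feeds, code):
    """Compile, on demand, all cases bearing the requested diagnosis code."""
    return [rec for feed in feeds for rec in feed if rec[1] == code]

registry = virtual_registry(ehr_feeds, "E10")
for year, count in sorted(Counter(y for _, _, y in registry).items()):
    print(f"{year}: {count} case(s)")  # raw counts; true incidence needs denominators
```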

Biospecimen collections

Biospecimens (also called specimens, samples, or human materials) used in research can be derived from every body component, product, and waste material. The following list is a reminder of their diversity and the differing circumstances of their collection. Given the current public and policy sensitivities about genomic data, it is worth keeping in mind that all biospecimens except red blood cells contain or carry DNA, although possibly of degraded quality. The categories overlap to some extent, but the main sorts of biospecimens are:
□ organs or tissues (including blood, skin, bones, marrow, and teeth);
□ blood derivatives (such as plasma);
□ dried newborn blood screening spots;
□ cell cultures;
□ sloughed-off epidermis (inner cheek cells, dandruff);
□ hair, nail;

2 US Agency for Healthcare Research and Quality, Richard E. Gliklich and Nancy A. Dreyer (senior eds.), Registries for Evaluating Patient Outcomes: A User’s Guide, second edn. (2010): www.ncbi.nlm.nih.gov/books/NBK49444/pdf/TOC.pdf.


Box 1. The case for data: Thomas Percival, Medical Ethics, 1849*

Hospital registers usually contain only a simple report of the number of patients admitted and discharged. By adopting a more comprehensive plan they might be rendered more subservient to Medical science and to mankind. The following sketch is offered to the gentlemen of the Faculty. Let the register consist of three tables: the first specifying the number of patients admitted, cured, relieved, discharged, or dead; the second, the several diseases of the patients, with their events; the third, the sexes, ages, and occupations of the patients. The ages should be reduced into classes; and the tables adapted to the four divisions of the year. By such an institution, the increase or decrease of sickness; the attack, progress, and cessation of epidemics; the comparative healthiness of different situations, climates, and seasons; the influence of particular trades and manufactures on health and life; with many other curious circumstances, not more interesting to Physicians than to the community, would be ascertained with sufficient precision.

* Medical Ethics, or A Code of Institutes and Precepts Adapted to the Professional Conduct of Physicians and Surgeons, third edn. (London: John Henry Parker, 1849), pp. 33–34, available gratis from Google Books.

□ secretions (milk, earwax, saliva), concretions (gallstones), excretions (urine, feces, sweat);
□ eggs, semen, sperm;
□ products of conception (embryos, fetal tissue, placentas, umbilical cords, amniotic fluid);
□ DNA, proteins, or other subcellular components.

Some laws or regulations cover only tissues, i.e., groups of related cellular materials, and not subcellular materials. For example, extracted DNA is not considered a “relevant material” under the UK Human Tissue Act.3

Biological samples can be said to literally carry or convey latent information. Hair, teeth, and urine carry trace metallic compounds, which can imply some environmental or dietary exposure. Urine carries metabolites, which can indicate pharmaceutical or illicit drug use, or occupational exposure to a chemical agent. DNA carries coding for an enormous amount of information, from indications of family origins to factors of health destiny.

3 UK Human Tissue Act, Supplementary list of materials: www.hta.gov.uk/_db/_documents/Supplementary_list_of_materials_200811252407.pdf.


Biospecimens collected for medical purposes are protected by medical confidentiality. Those used for research but not for care are usually covered by human-subject research regulations. Those collected in clinical trials, which involve care and research simultaneously, are regulated under both regimes and by clinical trial regimes as well.

Special issues arise when biospecimens are collected as a part of social survey or cohort projects, in that the combining of individual-level social data with data derived from biospecimens can lead to the identification of data-subjects by deduction. Because social scientists may not have the expertise or facilities required to properly manage biospecimens or safeguard or share the combined holdings, usually they must collaborate closely with biomedical scientists having the complementary capabilities. All must attend to the identifiability and privacy issues.4

(A note to readers: Whenever this book mentions “data,” “and/or biospecimens” should be understood as appropriate.)

Newborn blood screening specimens (also called Guthrie spots, after Robert Guthrie, who developed the technique) are spots of blood collected by heel-prick soon after babies’ birth and dried on special filter paper, on which they remain stable for years. Small punch-outs are analyzed for biochemical evidence of a number of heritable diseases, and sometimes are screened for HIV positivity or other conditions as well. In many countries such screening tests are mandatory under public health regulations, both to serve babies’ and families’ health interests and to inform population-level analyses. Most babies born in developed countries are screened, although the set of conditions screened-for varies widely. The residual dried blood can be put to beneficial use in many kinds of research, such as analyzing the distribution of genetic conditions in populations, or surveying exposure during gestation to nicotine, caffeine, cocaine, hepatitis B, syphilis, mercury, pesticides, or other harmful agents. And the DNA can be used in detailed genomic studies. Policies on research access and use vary among jurisdictions, especially as to what parental agreement is required, and whether the materials and data must first be de-identified.5

4 National Research Council (US), Panel on Collecting, Storing, Accessing, and Protecting Biospecimens and Biodata in Biosocial Surveys, Robert M. Hauser, Maxine Weinstein, Robert Pool, and Barney Cohen (eds.), Conducting Biosocial Surveys: Collecting, Storing, Accessing, and Protecting Biospecimens and Biodata (Washington, DC: National Academies Press, 2010); available at: www.nap.edu.
5 US Department of Health and Human Services, Secretary’s Advisory Committee on Heritable Disorders in Newborns and Children, briefing paper, “Considerations and recommendations for national guidance regarding the retention and use of residual dried blood spot specimens after newborn screening” (2009): www.cchconline.org/pdf/HHSRectoBankBabyDNA042610.pdf; Institute of Medicine (US), Roundtable on Translating Genomic-Based Research for Health, workshop summary, “Challenges and opportunities in using residual newborn screening samples for translational research” (Washington, DC: National Academies Press, 2010), available at: www.nap.edu.
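What de-identification of such materials and data might involve in practice can be suggested with a toy example (invented records and rules; actual standards, such as the HIPAA Privacy Rule approach taken up in Chapter 7, are far more demanding): direct identifiers are dropped and the birth date is coarsened to a year.

```python
DIRECT_IDENTIFIERS = ("name", "mother_name", "address", "id_number")

def deidentify(record):
    """Drop direct identifiers and generalize the birth date to a year."""
    kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "birth_date" in kept:
        kept["birth_year"] = kept.pop("birth_date")[:4]  # "2011-03-09" -> "2011"
    return kept

spot = {"name": "Baby Jones", "mother_name": "Mary Jones", "address": "12 High St",
        "id_number": "NBS-55512", "birth_date": "2011-03-09",
        "screen_result": "PKU negative"}
print(deidentify(spot))  # {'screen_result': 'PKU negative', 'birth_year': '2011'}
```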


Biobanks, for present purposes, are collections of biological materials used for research, often, although not always, linked to health and other information.6 In recent years the rubric has been allowed to balloon, and many formerly passive specimen collections of modest technical quality and documentation have been rebranded as biobanks. Adding to the ambiguity, some units that use the name are repositories of pathology material having no orientation to research, and others are simply contract sample-storage facilities. An important distinction is whether the collection is a deliberate sampling of a population defined by demographics, exposures, diseases, or other characteristics, or is a less systematic accumulation of material saved in the course of providing care.

The Council of Europe’s definition is useful: “A population biobank is a collection of biological materials that has the following characteristics: the collection has a population basis; it is established, or has been converted, to supply biological materials or data derived therefrom for multiple future research projects; it contains biological materials and associated personal data, which may include or be linked to genealogical, medical and lifestyle data and which may be regularly updated; it receives and supplies materials in an organised manner.”7

Because biospecimens alone, or data alone, tend to be far less useful for many lines of research than the combination, many aspiring biobanks are struggling, retrospectively, to gain access to research-quality health data about the people from whom the materials came, and many databases are struggling to gain access to research-quality biospecimens. In either direction, it can be necessary to try to recontact the subjects to obtain new data or samples, or to seek consent to access existing ones. Often such efforts are onerous and only partially successful. New research projects, especially those meant to become resources for broad use for many years, should seriously consider whether they should collect a combination of data and biospecimens from the outset.

6 Two solid technical overviews are Paul R. Burton, Isabel Fortier, and Bartha M. Knoppers, “The global emergence of epidemiological biobanks: opportunities and challenges” in Muin J. Khoury, Sara R. Bedrosian, Marta Gwinn, et al. (eds.), Human Genome Epidemiology, second edn. (New York: Oxford University Press, 2010), pp. 77–99; and Madeleine J. Murtagh, Ipek Demir, Jennifer R. Harris, and Paul R. Burton, “Realizing the promise of population biobanks: a new model for translation,” Human Genetics, 130 (2011), 333–345.
7 Council of Europe, “Recommendation of the Committee of Ministers to member states on research on biological materials of human origin” (2006), CM/Rec(2006)4: https://wcd.coe.int/wcd/ViewDoc.jsp?id=977859.
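One common mechanism for keeping the re-linkage described above possible without exposing identities is key-coding, taken up in Chapter 7: direct identifiers are replaced by an arbitrary code, and the code-to-identity key is held separately under tight control. A minimal sketch, with invented names and numbers:

```python
import secrets

identified = [
    {"name": "Ann Smith", "nhs_no": "943 476 5919", "dx": "asthma"},
    {"name": "Joe Bloggs", "nhs_no": "950 123 4567", "dx": "gout"},
]

key_table = {}     # code -> identifiers; held separately by the custodian
research_set = []  # key-coded records releasable to researchers

for rec in identified:
    code = "B-" + secrets.token_hex(4)  # arbitrary, meaningless code
    key_table[code] = {"name": rec["name"], "nhs_no": rec["nhs_no"]}
    research_set.append({"code": code, "dx": rec["dx"]})

print(research_set)  # researchers see diagnoses and codes, not identities
# Only the custodian can re-link, e.g., to recontact a participant:
print("re-linked:", key_table[research_set[0]["code"]]["name"])
```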


Most of what can be considered sophisticated, robust, population-based, public research biobanks have been purpose-built as banks, systematically developed resources on which investigators who are not otherwise involved, as well as those who are, can draw for a variety of purposes. They are development banks, in that they manage entrusted informational capital, which, shared for research, helps generate informational returns for society. Results are published and, often, derived data are returned to the bank or other accessible databases for reinvestment. One of the most ambitious examples is UK Biobank, a project involving some 500,000 volunteers aged 40–69 from across the UK. Upon enrollment the participants answered an extensive health history and lifestyle questionnaire; underwent a full physical examination; provided blood, urine, and saliva samples; and granted permission for the project to access their National Health Service (NHS) and other health-related records, link to other databases such as registries, and perform genotyping. Researchers anywhere can apply to use the data, under restrictions.8

Among the privacy and confidentiality issues for biobanks like UK Biobank are ones having to do with the selecting of candidates for recruitment, the contacting and recruitment process, the necessary reliance on broad consent to unspecified future research, the protection of identifiability, the terms of eventual data and biospecimen access and use, and safeguards and governance.9 Many initiatives are underway to network diverse biobanks, promote the harmonization of standards and procedures, and facilitate data and sample sharing. Examples are the international Public Population Project in Genomics10 and the European Biobanking and Biomolecular Resources Research Infrastructure project.11

8 UK Biobank: www.ukbiobank.ac.uk.
9 Jane Kaye and Mark Stranger (eds.), Principles and Practice in Biobank Governance (Farnham: Ashgate, 2009); Herbert Gottweis and Alan Petersen (eds.), Biobanks: Governance in Comparative Perspective (London: Routledge, 2008); Heather Widdows and Caroline Mullen (eds.), Governance of Genetic Information: Who Decides? (Cambridge University Press, 2009); Organisation for Economic Co-operation and Development, "Guidelines on Human Biobanks and Genetic Research Databases" (2009): www.oecd.org/dataoecd/41/47/44054609.pdf; Margaret Sleeboom-Faulkner (ed.), Human Genetic Biobanks in Asia: Politics of Trust and Scientific Advancement (London: Routledge, 2009); Western Australian Department of Health, Office of Population Health Genomics, "Guidelines for human biobanks, genetic research databases and associated data" (2010): www.genomics.health.wa.gov.au/publications/docs/guidelines_for_human_biobanks.pdf.
10 Public Population Project in Genomics (P3G): www.p3g.org.
11 Biobanking and Biomolecular Resources Research Infrastructure: www.bbmri.eu/index.php/home.


Public research resource platforms

Platforms are large-scale projects that accumulate, organize, and store data and sometimes biospecimens or links to them, and then distribute these, as appropriate, for research. They may be narrowly purpose-specific or broad and flexible as to purpose. (Several are sketched in Box 2.) Those discussed in this book are maintained for noncommercial purposes, and most are in principle open to wide use. Many research-based pharmaceutical and biotechnology companies maintain their own platforms and/or use other proprietary resources in order to maintain technical control and protect company secrets and intellectual property.

Longitudinal, or cohort, studies, which record uniform data about the health-related experience of sets of people over a long period, are a classic platform model. Birth cohorts, i.e., projects that follow sets of people in a defined population born at around the same time, are a prominent and highly productive example. Other models of platforms are population research biobanks, twin databases, large genomic databases, and social science data archives; these may or may not continue to collect or assemble uniform data about the same people over time.

Platforms can collect data themselves and/or tap data from multiple other sources, and they can distribute data for multiple research uses. Among their advantages in addition to scale are that they can serve as stages for screening data quality, fostering uniformity of data format, maintaining metadata on analytic variables and ethical restrictions, de-identifying data or certifying non-identifiability, providing and tracking data access, linking to other data-sets and possibly biospecimens, and assisting with complex data analysis.

Because platforms assemble large amounts of data, usually from many sources, the projects must rigorously attend to privacy and confidentiality and the related issues of consent and identifiability, and maintain tight security. Usually they share individual-level data, in de-identified form, via restricted arrangements with applying researchers, as sketched below. Ethics review is usually conducted at the original sources before they provide data to the platform; additional review may or may not be required before the platform releases data to particular applicants. The decisions on applications are made by data access committees. (Data sharing and access are the subject of Chapter 10.)
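To make the tiered-release logic concrete, the following Python sketch models how a data access committee's routing decision might be structured. It is purely illustrative: the field names, tier labels, and decision rules are assumptions for exposition, not the policy of any actual platform.

```python
# A minimal, hypothetical sketch of tiered data release; the tiers and
# rules are illustrative assumptions, not any platform's actual policy.
from dataclasses import dataclass

@dataclass
class AccessRequest:
    identifiable: bool        # could the requested data identify participants?
    deidentified: bool        # has de-identification been applied for this use?
    consent_covers_use: bool  # does the original consent cover the proposed use?
    ethics_approved: bool     # has any required ethics review been completed?

def decide_release(req: AccessRequest) -> str:
    """Return the access route a data access committee might assign."""
    if not (req.consent_covers_use and req.ethics_approved):
        return "defer: seek re-consent or further ethics review"
    if not req.identifiable:
        return "unrestricted access"
    if req.deidentified:
        return "restricted access via secure Internet or portable data media"
    return "highly restricted access via physical or virtual data enclave"

# De-identified but potentially re-identifiable data goes out under restrictions.
print(decide_release(AccessRequest(True, True, True, True)))
```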

[Figure: Data flow via a research resource platform. Identifiable (de-identification reversible?) or non-identifiable data and/or biospecimens pass, subject to scientific review, consent, ethics review, and data contribution agreements, into the research resource platform (database, biorepository, management, governance). Following any further scientific review, further consent, and further ethics review, data release decisions route material to one of three tiers: unrestricted access; restricted access via secure Internet or portable data media; or highly restricted access via physical or virtual data enclaves.]

Degrees of sensitivity

It is very difficult, often even pointless, to try to generalize about the relative emotional and ethical sensitivities of different kinds of data, for either individual data-subjects or groups of people. Sensitivities are highly relative to personality and the sociocultural context. Data that are considered intensely private by one person may simply not be by others. Data that seem innocuous to a person when young may become an embarrassment later, or vice versa. Over the years, types of data gain or lose sensitivity as science progresses and new factual implications become evident or current interpretations are discredited, or as personal or social values change.

Many kinds of data can reveal intimate aspects of people's lives. Prescription data can suggest the disease, or at least the kind of disease, being treated. Merely the fact that a person has entered into a relationship with a drug abuse, alcoholism, domestic violence, or HIV counseling center – as revealed, say, by appointment logs or billing records – can be held against the person by relatives, partners, friends, employers, or others. Sensitivity may stem from resentment at some maltreatment or ill fortune suffered in the general lottery of life or in the pursuit of adventure, lust, or military service. It may relate to socially marginal or illegal behavior, perhaps in a past that the person wishes to distance himself or herself from. It may arise from a wish not to have elective cosmetic, sexual, or reproductive surgery revealed. It may reflect a fear of negative discrimination. Or it may have to do with the fact that the person is struggling to overcome a health problem and simply doesn't want anyone, or anyone but close family, friends, or caregivers, to be aware of it.

Among the categories fairly universally taken to be more sensitive than others are data about:
□ reproduction (fertility, contraception, sterilization, pregnancy, ova or sperm donation, surrogate mothering, miscarriage, elective abortion . . .);
□ sexual orientation, practices, or functioning, or sexually transmitted diseases;
□ mental problems (bulimia, self-mutilation, suicide attempt . . .);
□ alcohol or drug abuse;
□ health-related criminal accusations, convictions, or victimization (child battering, rape, forced prostitution . . .);
□ embarrassing problems (urinary or fecal incontinence, circumstances of emergency-room admission . . .);
□ genetics or genomics.

It can be exceedingly difficult to construct formal definitions of such categories for legislative or regulatory purposes, for weighting possible harms in privacy risk assessments, or for compartmentalizing and managing selective access to data in electronic health records.12

12 The inherent difficulty is evidenced in the US National Committee on Vital and Health Statistics, letter to the Secretary of Health and Human Services, "Recommendations regarding sensitive health information" (November 10, 2010): www.ncvhs.hhs.gov/101110lt.pdf.

Moreover, although those listed above are among the more obviously delicate kinds of data, a person may just as well have anxieties about their employer becoming aware of debilitations of asthma or migraine, or menopausal stress, or a bad back – conditions that the person may believe she or he can cope with without anyone else's needing to know about. (Of course, an employer may have good reasons for wanting to know, especially if the problem might affect the health, safety, or wellbeing of other people.)

Sensitivity has traditionally been recognized with data about exceptionally vulnerable people (such as infants, children, mentally disabled people, prisoners, or refugees), people under violent threat (victims of domestic violence or sex trafficking), and people in security-sensitive roles (prison guards or undercover police). Elevated data protection may be needed or even mandated by law or a court order. Research institutions are usually aware that they must take special precautions with information about political leaders and celebrities, starting with admonishing and safeguarding against snooping by curious staff and students.

A special sensitivity, heightening now, is that surrounding the collection and analysis of data by geographic ancestry, ethnic, or racial categories. In many circumstances the validity of such distinctions is diminishing, both because of migration and interbreeding and because genomic science is profoundly revising our understanding of human variation. "Race" is losing currency, in part because of its vagueness and sociocultural relativity, and "ethnicity" and "ancestry" tend to be imprecise and unreliable, especially when self-reported. Genomic generalizations have to be handled carefully, because even if a genomic variant is known to occur with an unusually high frequency in a population, the variant can't be assumed to occur in all members of that population, and besides, what appears by superficial characteristics to be a distinct and uniform population may well be neither distinct nor uniform in genomic constitution.13 Nonetheless, our physical makeups, habits, exposures, and vulnerabilities are heavily influenced by our origins and the cultures in which we tend to cluster. Health research always has to focus on particular sorts of people, analyzing by age, sex, environment, lifestyle, and other categories. Taking ethnicity or lineage into account, even just as tentative proxy indicators for such variables as diet, can help identify risk factors, focus research on the health problems of particular sorts of people, and target the translation of findings into practice.

13 Timothy Caulfield, Stephanie M. Fullerton, Sarah E. Ali-Khan, et al., "Race and ancestry in biomedical research: exploring the challenges," Genome Medicine, 1, 8.1–8.8 (2009): http://genomemedicine.com/content/pdf/gm8.pdf; Charles N. Rotimi and Lynn B. Jorde, "Ancestry and disease in the age of genomic medicine," New England Journal of Medicine, 363 (2010), 1551–1558.

All of this leads to serious policy questions. Under what circumstances should special data sensitivities be recognized? Should different kinds of data be protected differently? These questions have to be dealt with by ethics and law in local context. Obviously, if some special sensitivity is recognized when data are initially collected, tailored safeguards may have to be imposed.

The EU Data Protection Directive avoids the subtleties and defines as "special categories" all "personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life" (Article 8). Data in those categories are now widely referred to as "sensitive data," as they are in the UK Data Protection Act transposing the Directive into national law (Article 1.2):

In this Act "sensitive personal data" means personal data consisting of information as to –
(a) the racial or ethnic origin of the data subject,
(b) his political opinions,
(c) his religious beliefs or other beliefs of a similar nature,
(d) whether he is a member of a trade union . . .
(e) his physical or mental health or condition,
(f) his sexual life,
(g) the commission or alleged commission by him of any offence, or
(h) any proceedings for any offence committed or alleged to have been committed by him, the disposal of such proceedings or the sentence of any court in such proceedings.

Most of these sorts of data are handled in one health research circumstance or another – not only the most obvious categories (e) and (f), but often (a) as a demographic identifier and an indicator of possible health factors, (c) when religious tenets are factors in a patient's lifestyle or medical history or decisions, or (g) or (h) when a patient's injury is attributed to a criminal act, or criminal behavior is a consideration in mental health care, or a data-subject is a prisoner. A sketch of how such category labels might drive selective access follows.
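The sketch below is a deliberately simplified Python illustration of the compartmentalization idea: once records carry sensitivity tags, selective access reduces to filtering on those tags. The tag names and the access rule are hypothetical assumptions; as the preceding discussion stresses, deciding what deserves a tag is the genuinely hard, contested step.

```python
# Hypothetical sensitivity tags and access rule; real electronic health
# record systems and the legal categories discussed above are far less tidy.
SENSITIVE_CATEGORIES = {
    "reproduction", "sexual_health", "mental_health",
    "substance_abuse", "criminal_history", "genetics",
}

def records_visible_to(records, clearances):
    """Yield records whose sensitivity tags all fall within the caller's
    clearances; untagged (routine) records are always visible."""
    assert clearances <= SENSITIVE_CATEGORIES, "unknown category label"
    for record in records:
        if set(record.get("tags", ())) <= clearances:
            yield record

records = [
    {"id": 1, "tags": ["genetics"]},
    {"id": 2, "tags": []},  # routine, untagged data
    {"id": 3, "tags": ["mental_health", "substance_abuse"]},
]

# A researcher cleared only for genetic data sees records 1 and 2.
print([r["id"] for r in records_visible_to(records, {"genetics"})])
```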

Data and biospecimen ownership

Occasionally there is confusion or dispute about who owns some data or biospecimens, and what ownership implies in the circumstances. Even though data in a healthcare or research database may be about a person, the person does not usually own the data in the sense that he or she can take controlling possession and alter, delete, sell, give away, or destroy the data. Most privacy and data protection laws defend a right of people to know what information is held about themselves. Research regimes protect participant withdrawal rights. In many places, patient-access laws guarantee the right of patients to be provided with a copy of their medical records and to request correction or amendment. Clinical trial investigators often discuss observations with participants after the conclusion of a trial, especially with patients for whom the findings might be medically useful. For a variety of medical, scientific, regulatory, and liability reasons, though, healthcare providers, payers, research units, and pharmaceutical R&D units must not relinquish control over the substance of person-related data, unless, of course, they formally transfer the curation and responsibility to other competent parties.

Ownership in the intellectual property sense is beyond the scope of this book, but obviously, data or biospecimens can be owned in the sense of being exclusively possessed, such as by a biotech company that has collected them from consenting donors. Whether a possessor of data can transfer them, sell access to them, or destroy them depends on the commitments made when the data were collected originally and on any subsequent changes in the conditions or relevant laws or regulations. Diligence is required if a biomedical company holding data or biospecimens merges with another, or if a research unit becomes unable to protect data, such as when its funding runs out. One thing is clear: Privacy and confidentiality obligations must accompany data wherever they go. One of the strengths of privacy and data protection regimes is that they focus on the consequences of use of data – and having paid for use, or having exclusive rights to use, does not negate responsibilities or liabilities.

One of the knottiest sets of ethico-legal issues of today has to do with whether and what intellectual property rights should be viewed as adhering to materials once they have been disassociated from the human body, or to data once they are no longer associated with real persons. Most legal theories hold that one does not "own" one's living body in any of the classic property senses, despite (ideally) having control over it. But debate is raging over whether and what sorts of exploitive rights, if any, continue to attach to tissues, DNA, or other materials after they are no longer part of the body, and what consent, if any, should apply. Related to this are issues of rights in bodily material after death. At stake for research is the status not only of materials explicitly donated for research, but also of such materials as archived pathology specimens, living materials being held for transplantation or fertilization that are in excess of demand, and medical waste scheduled to be destroyed.


Some commentators argue that research participants or data-subjects are owed some sort of financial payback or privileged access, especially if the data or material contributes to an eventual commercial development. This overlooks the fact that it is virtually impossible to trace back and estimate in any meaningful sense the relative contribution that any person's participation, data, or biospecimens might have made to a discovery or product development. And it overlooks the point, argued at the opening of this book, that we are all in effect rewarded in advance as we benefit from the contributions to the knowledge commons that countless selfless others have made. Many cultural and personal interests must be respected, of course. But this author believes that research would be greatly helped if the notion that people generally should hold a continuing, partially controlling interest in data or biospecimens were abandoned. There can be both a moral and a public interest in letting go.14

14 For exploration of the issues, see Michael Steinmann, Peter Sýkora, and Urban Wiesing (eds.), Altruism Reconsidered: Exploring New Approaches to Property in Human Tissue (Farnham: Ashgate, 2009); David Price, Human Tissue in Transplantation and Research: A Model Legal and Ethical Donation Framework (Cambridge University Press, 2010); Christian Lenk, Nils Hoppe, Katharina Beier, and Claudia Wiesemann (eds.), Human Tissue Research: A European Perspective on the Ethical and Legal Challenges (Oxford University Press, 2011); and Nuffield Council on Bioethics, Human Bodies: Donation for Medicine and Research (2011): www.nuffieldbioethics.org/sites/default/files/Donation_full_report.pdf.

What is research, anyway?

Research, under US human subject protection regulations, is "a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge."15 The emphasis on generalizability is meant to help distinguish research from communicable disease surveillance, routine collection of population statistics, and healthcare service evaluation. Similar definitions are employed elsewhere. Many policy interpretations insist that true research must involve testing a hypothesis, as compared with open-ended trawling through data to look for regularities or singularities, which may suggest but cannot confirm hypotheses. A complication is that alert trawling with a research problem in mind often suggests a hypothesis and at the same time suggests how the data can be brought to bear on the idea, thus segueing into research. Some data-intense studies, such as genome-wide association studies (described in Chapter 8), are obviously research even though they may be close to being hypothesis-neutral at the start. Too, generalizable implications can emerge from public health investigations, as they also can from carefully structured service evaluations in healthcare organizations whose initial purpose is to improve internal practices.

15 US Department of Health and Human Services, Federal Policy on Protection of Human Subjects ("Common Rule"), 45 Code of Federal Regulations 46: www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html. The research definition is at §46.102(d).

Nonetheless, the necessary judgments can be made. Usually a borderline matter with respect to privacy can be resolved by focusing on how the data use might affect the data-subjects, rather than on what the research questions are or what analytic methods are employed. This can help decide, for instance, whether research ethics review is necessary. Needing to distinguish human-subject research, which requires research ethics review, from healthcare audit and evaluation, which usually doesn't, the UK National Research Ethics Service makes this succinct distinction: "The primary aim of research is to derive generalizable new knowledge, whereas the aim of audit and service evaluation projects is to measure standards of care. Research is to find out what you should be doing; audit is to find out if you are doing a planned activity and assess whether it is working."16

16 UK National Research Ethics Service, "Defining research" (2008): www.nres.npsa.nhs.uk/news-and-publications/publications/general-publications. The website provides criteria for distinguishing among research, service evaluation, clinical audit, and public health surveillance.

James Hodge and Lawrence Gostin summarized the differences between public health practice and human subject research in useful detail:

Essential characteristics of public health practice include:
– involves specific legal authorization for conducting the activity as public health practice at the federal, state or local levels;
– includes a corresponding governmental duty to perform the activity to protect the public's health;
– involves direct performance or oversight by a governmental public health authority (or its authorized partner) and accountability to the public for its performance; and
– may legitimately involve persons who did not specifically volunteer to participate (i.e., they did not provide informed consent); and
– supported by principles of public health ethics that focus on populations while respecting the dignity and rights of individuals.

Some of the essential characteristics of human subject research include:
– involves living individuals;
– involves, in part, identifiable private health information;
– involves research subjects who are selected and voluntarily participate (or participate with the consent of their guardians), absent a waiver of informed consent; and
– supported by principles of bioethics that focus on the interests of individuals while balancing the communal value of research.17

17 James G. Hodge and Lawrence O. Gostin, Public Health Practice vs. Research, a report to the Council of State and Territorial Epidemiologists (2004): www.cste.org/pdffiles/newpdffiles/CSTEPHResRptHodgeFinal.5.24.04.pdf.

Less valid criteria than these, Hodge and Gostin remarked, are: who is performing the activity, whether or not the findings are published, the urgency, the source of funding, the technical methods used. Such boundary drawing may sometimes seem like making differences without much distinction. But with respect to privacy and confidentiality, it can be formally important for at least three reasons, each of which will be addressed in relevant context later. It may:
□ determine whether an activity falls under human-subject research ethics regulation;
□ influence whether an activity qualifies for some research exemption under law;
□ help define a protective cordon around research data as compared with other data.

Box 2. Sketches of a few large research resource platforms

Framingham Heart Study.a A project begun in 1948 studying factors contributing to cardiovascular disease that has followed a cohort of 5,200 people originally living around Framingham, Massachusetts, and many of their children, and now grandchildren. More than 2,200 research articles have been published based on the data.

Avon Longitudinal Study of Parents and Children (ALSPAC).b Also known as the Children of the 90s Study. An ongoing study of child health and development, based on more than 14,000 pregnancies in the Bristol and Bath area enrolled during 1991–1992. Having generated masses of data and holding biospecimens from umbilical cord slices to babies' urine to mothers' toenail clippings, the ALSPAC trove has been the subject of hundreds of studies, most of them collaborative efforts with external investigators.


US National Health and Nutrition Examination Study (NHANES).c An in-depth survey that interviews and conducts physical examinations of thousands of people every year to assess the health and nutritional status of adults and children across the nation. The program has been running in evolving form since the early 1960s. The data are used by many researchers to gauge the prevalence of major diseases and their risk factors, examine how nutrition relates to health promotion and disease prevention, and pursue other inquiries.

Wellcome Trust Case Control Consortium.d A collaboration of 50 research groups across the UK that conducted genome-wide analyses of de-identified DNA from thousands of patients suffering from common chronic diseases, and, as controls, samples from the 1958 British Birth Cohort and donors recruited by the UK Blood Services. Much analysis is continuing.

UK Biobank.e A project involving 500,000 people aged 40–69 from across the UK, who upon enrollment answered an extensive health history and lifestyle questionnaire, underwent a full physical examination, gave blood, urine, and saliva samples, and granted permission for the project to access their past and future NHS and other health-related records, link to other databases such as registries, and perform genotyping. Ophthalmologic examinations have now been conducted on 100,000 participants. Researchers anywhere can apply to use the data, under restrictions.

Million Women Study.f A questionnaire-based study, started in 1997, coordinated by researchers at Oxford. The study involved 1,300,000 women who joined at age 50 or older, recruited via NHS Breast Screening Centres. The main focus has been on the effects – positive and negative – of hormone replacement therapy at menopause, but the size of the database has allowed many other women's health issues to be studied.

Database of Genotypes and Phenotypes (dbGaP).g A data-sharing platform that manages data contributed by more than 150 genome-wide association and other studies correlating genomic data with phenotypes (observable traits). The platform also holds such data as 72,000 photographs of eye structures from the National Eye Institute's Age-Related Eye Disease Study.

UK Data Archive.h An archive that acquires, curates, and manages access to more than 5,000 economic and social databases, including those of many health surveys and several large birth cohort projects.

Kaiser Permanente Research Program on Genes, Environment and Health.i A database of DNA samples and survey responses, linkable with electronic health records, from members of the largest nonprofit health plan in the US, Kaiser Permanente Northern California. Currently has 200,000 participants, growing toward a goal of at least 500,000. The database is beginning to provide data for research on a wide variety of health problems.

Data resource programs described elsewhere in the book. The International Cancer Genome Consortium is described on p. 114; the Western Australian Data Linkage System on p. 147; the Manitoba Centre for Health Policy's Population Health Research Data Repository on p. 148; the UK General Practice Research Database on p. 148; the Vanderbilt University BioVU Program on p. 148.

a A joint project of Boston University and the US National Heart, Lung, and Blood Institute: www.framinghamheartstudy.org.
b Hosted by the University of Bristol: www.bristol.ac.uk/alspac.
c A program of the US National Center for Health Statistics: www.cdc.gov/nchs/nhanes.htm.
d www.wtccc.org.uk.
e www.ukbiobank.ac.uk and www.egcukbiobank.org.uk.
f www.millionwomenstudy.org.
g Managed by the US National Center for Biotechnology Information: www.ncbi.nlm.nih.gov/gap.
h www.data-archive.ac.uk.
i www.dor.kaiser.org/external/DORExternal/rpgeh/index.aspx.

3 Privacy, confidentiality, safeguards

Privacy

Privacy is a notion familiar to everyone, yet is one frustratingly difficult to define precisely. "A concept in disarray," Daniel Solove called it in his book, Understanding Privacy:

Currently, privacy is a sweeping concept, encompassing (among other things) freedom of thought, control over one's body, solitude in one's home, control over personal information, freedom from surveillance, protection of one's reputation, and protection from searches and seizures.1

Defined as a prerogative over the flow of information, privacy is "the claim of individuals, groups and institutions to determine for themselves when, how, and to what extent information about them is communicated to others" (Alan Westin).2 Relating to health, and reaching back to include the act of collecting, "Health informational privacy is an individual's claim to control the circumstances in which personal health information is collected, used, stored, and transmitted" (Lawrence Gostin).3 More abstractly, privacy is "the interest that individuals have in sustaining a 'personal space,' free from interference by other people and organisations" (Roger Clarke).4 Viewed as self-chosen seclusion, privacy is "a state in which an individual is apart from others, either in a bodily or psychological sense or by reference to the inaccessibility of certain intimate adjuncts to their individuality, such as personal information" (Graeme Laurie).5

1 Daniel J. Solove, Understanding Privacy (Cambridge, MA: Harvard University Press, 2008), p. 1. Contending that "privacy must be determined on the basis of its importance to society, not in terms of individual rights," Solove develops and discusses a taxonomy of sixteen activities that can impinge on privacy, and argues that "the value of privacy in a particular context depends upon the social value of the activities that it facilitates."
2 Alan F. Westin, Privacy and Freedom (New York: Atheneum, 1967), p. 7.
3 Lawrence O. Gostin, Public Health Law: Power, Duty, Restraint, second edn. (Berkeley, CA: University of California Press, 2009), p. 316.
4 Roger Clarke, in informal postings at www.rogerclarke.com since at least 1997.
5 Graeme Laurie, Genetic Privacy (Cambridge University Press, 2002), p. 6. Exploration of the implications of this definition in depth is a theme and contribution of the book.

Asserted as a right, and extending to groups, "privacy is a domain within which individuals and groups are entitled to be free from the scrutiny of others" (Australian National Health and Medical Research Council).6

The grandest defense of privacy is as a fundamental human right, such as is proclaimed by the European Convention on Human Rights (Article 8.1): "Everyone has the right to respect for his private and family life, his home and his correspondence."7 The Convention has been ratified by all 47 Members of the Council of Europe and its provisions have been incorporated into the laws of many European nations, thereby becoming enforceable. The articles, verbatim, are Schedule 1 to the UK Human Rights Act, for instance.8 Legal arguments through which such a capacious assertion can be applied to information, as compared with actions or physical spaces, are now beginning to be tested in courts.

The recent and more detailed Charter of Fundamental Rights of the European Union (2000, in force 2009) echoes the Convention: "Everyone has the right to respect for his or her private and family life, home and communications" (Article 7).9 The Charter continues, tersely but a bit more specifically, in the data protection tradition (Article 8):

1 Everyone has the right to the protection of personal data concerning him or her.
2 Such data must be processed fairly for specified purposes and on the basis of the consent of the person concerned or some other legitimate basis laid down by law. Everyone has the right of access to data which has been collected concerning him or her, and the right to have it rectified.
3 Compliance with these rules shall be subject to control by an independent authority.

6 Australian National Health and Medical Research Council, Australian Research Council, and Australian Vice-Chancellors' Committee, National Statement on Ethical Conduct in Human Research (updated 2009), glossary: www.nhmrc.gov.au/_files_nhmrc/publications/attachments/e72.pdf.
7 Council of Europe, Convention for the Protection of Human Rights and Fundamental Freedoms: http://conventions.coe.int/treaty/en/Treaties/Html/005.htm. The Convention was based on the Universal Declaration of Human Rights, adopted by the United Nations General Assembly in 1948, which proclaimed that "No one shall be subjected to arbitrary interference with his privacy, family, home, or correspondence" (Article 12). It was designed to be more directly transposable into national laws and it established the European Court of Human Rights to which disputes can be referred.
8 UK Human Rights Act: www.legislation.gov.uk/ukpga/1998/42/schedule/1. Generally, the Act requires that public bodies, such as the courts, act in ways compatible with the rights declared by the Convention, and it establishes the right of individuals to seek remedy in UK courts for a breach of a Convention right without having to endure the expense and delay of taking cases to Strasbourg.
9 European Union, Charter of Fundamental Rights: www.europarl.europa.eu/charter/pdf/text_fn.pdf. The Council of Europe Convention applies to all Members of the Council of Europe, which includes the 27 EU Member States and 20 other countries. The EU Charter, a more ambitious document, applies to the EU Member States only.

A semantic complication is that in everyday speech privacy can refer either to decisions or actions taken without interference by others (such as a private decision to undergo a vasectomy or terminate a pregnancy), or to the close holding of thoughts or information (such as the private reflections in a military veteran's post-combat rehabilitation diary).

Another complication is that privacy – taken as a penumbra of personal space, which is a way it can be thought of in its broadest sense – may surround a person's perceptions, attitudes, or emotions before even the person himself or herself has mentally shaped them into what can be considered information, much less their being documented. Privacy often obtains preinformationally, so to speak, as with private memory, envy, pride, desire, pleasure, affection, suspicion, apprehension, regret, pain, or grief. Thinking of privacy as applying only to existing, tangible, documented information misses this point. Laws in a number of countries are justified in being constructed and titled as "privacy" rather than as "data protection" laws.

As has been dramatically evident in disputes over such matters as video surveillance of public spaces, whole body imaging at airport security gates, and police retention of DNA samples, in many aspects of life the legal obligations to respect privacy and confidentiality can be exceedingly controversial, even when the purposes are to protect public order and personal security. Privacy claims may or may not be conceded by others or guaranteed by law. They must be negotiated against countering claims such as rights of other people or collective societal interests. Accordingly, for example, poised against Article 8.1 of the European Convention on Human Rights quoted above is Article 8.2: "There shall be no interference by a public authority with the exercise of this right except [emphasis added] such as is in accordance with the law and is necessary in a democratic society in the interests of national security, public safety or the economic well-being of the country, for the prevention of disorder or crime, for the protection of health or morals, or for the protection of the rights and freedoms of others."

Privacy is a highly relative matter, culturally and personally, as are even the possibilities of having privacy. An illness or medical experience that one person never wants to mention will be recounted by another at a dinner party in more clinical detail than anyone wants to hear. A genetic condition that in one country is discussed openly as a fact of life to be coped with may be a matter of family shame in another.

Uneasiness about trying to defend privacy as a fundamental right applying to actions, relationships, spaces, documents, and communications that can only be generically specified – as with "private and family life, home, and correspondence" – is a reason that some legal systems, such as that of the US, tend to cast informational privacy protections as fair information practices, rather than as rights, and scale protections in proportion to the risks.

It simply has to be conceded that in contemporary life there are few matters that can't at least to some extent be observed in some way by others, to positive and negative effect.10 We don't live isolated lives. (Nor, for that matter, is this an entirely new social phenomenon: Watching, snooping, gossiping, intruding, and advising have always been part of life in villages, urban quarters, schools, congregations, and workplaces.) Health information privacy and confidentiality are profoundly affected by the scale and interconnectedness of modern life and healthcare delivery. Almost all health care in the developed world, and increasingly elsewhere, is provided and paid for via organizations and sprawling networks of services, many components of which, such as specialist laboratories, information technology units, data management and analysis services, and auditing and reimbursement bureaucracies, may be far from local. As for health research, many of its activities proceed in such dispersed but cooperative fashion, routinely crossing geopolitical boundaries and sectoral divides, that they can be said to be truly global. Claims to a right to control information about oneself can only be exerted so far.

Sometimes privacy is taken to be the same as confidentiality, but this sacrifices the important distinction that something can be private, known or felt exclusively by an individual or intimate group, but then deliberately extended into a confidential realm if it is revealed, with restraints on use and further disclosure, to another party.11 Thus, a person may be struggling with a deeply private problem, but choose to disclose this to a psychiatrist in medical confidence.

10 Among our constant observers are state and police agents using surveillance technologies. See, for instance, UK House of Lords, Select Committee on the Constitution, Surveillance: Citizens and the State, 2nd Report of Session 2008–09: www.publications.parliament.uk/pa/ld200809/ldselect/ldconst/18/1802.htm.
11 The notions of privacy and confidentiality, and the differences between them, are discovered by children at a young age as they realize the value in keeping some thoughts and information to themselves and the importance of judging trustworthiness when sharing secrets.

Confidentiality

Confidentiality is the respectful handling of information disclosed within relationships of trust, especially as regards further disclosure. Even if an explicit or formal relationship of trust doesn't exist, often the circumstances or roles imply an assumption of confidentiality. Some court decisions have referred to "reasonable expectations of confidentiality," whether or not the expectation was determinative, and it is hard to imagine that legal theories won't increasingly have to interpret obligations this way.12

Confidentiality is in many ways a less abstract, more concrete and more manageable notion than privacy. But privacy is closer to the core of personhood. Confidentiality serves privacy.

Much health research works, under restrictions and safeguards, with information collected under one of the most classic, universal, and uncontroversial duties of confidence – medical confidentiality – which is enshrined in law, enforced by professional regulation, and universally expected by the public as the default duty: to which exceptions can be made, but only with justification.13 Much other health research proceeds within relationships of confidentiality between projects and participants.

The practical meaning of confidentiality is established in part by what its breaching, or violation, means. Hazel Biggs has listed four criteria that must all be met for a breach of confidence to be established under common law in the UK: The information "must be of a private, personal or intimate nature"; the information must originally have been "imparted in circumstances that import an obligation to maintain confidence"; the disclosure alleged to have been improper must have been to a person "not authorized to have access"; and it must be shown that the subject of the information "would suffer some harm" from the disclosure.14 Similar criteria are embraced elsewhere.

12 A stimulating review of the differing evolution of the US and English laws of privacy and confidentiality is Neil M. Richards and Daniel J. Solove, "Privacy's other path: recovering the law of confidentiality," Georgetown Law Journal, 96 (2007), 123–182, available at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=969495.
13 General Medical Council (UK), "Good medical practice guidance: Confidentiality" (2009): www.gmc-uk.org/static/documents/content/Confidentiality_0910.pdf.
14 Hazel Biggs, Healthcare Research Ethics and Law: Regulation, Review and Responsibility (Abingdon and New York: Routledge-Cavendish, 2010), pp. 101–103.

Despite centuries of legal recognition of confidentiality, there continue to be uncertainties of interpretation, such as over what should qualify as "harm," especially if this has to do with an intangible such as emotional distress or damage to personal relationships or reputation. And there are important differences among legal systems, such as over whether lawsuits can be brought for breach of privacy (focusing on ensuing harm to person) or, alternatively, for breach of confidentiality (focusing on violation of the relationship of trust per se).

Recognizing that physicians become privy to many aspects of patients' lives and those of their families, even incidentally, whether they want to or not, the American Medical Association's code of ethics admonishes doctors to respect privacy as well as confidentiality:15

In the context of health care, emphasis has been given to confidentiality, which is defined as information told in confidence or imparted in secret. However, physicians also should be mindful of patient privacy, which encompasses information that is concealed from others outside of the patient–physician relationship. Physicians must seek to protect patient privacy in all of its forms, including (1) physical, which focuses on individuals and their personal spaces, (2) informational, which involves specific personal data, (3) decisional, which focuses on personal choices, and (4) associational, which refers to family or other intimate relations. Such respect for patient privacy is a fundamental expression of patient autonomy and is a prerequisite to building the trust that is at the core of the patient–physician relationship.

15 American Medical Association, "Code of Medical Ethics," Opinion 5.059, "Privacy in the context of health care": www.ama-assn.org/ama/pub/physician-resources/medical-ethics/code-medical-ethics.

Safeguards

Safeguards are the many measures taken to protect data or biospecimens while they are being collected or received, and when they are curated and used thereafter. As Chapter 9 will discuss, safeguarding involves an array of intersecting administrative, physical, and information technology measures, including those often thought of as security measures. It is not helpful to think of safeguards, or security, as being synonymous with privacy or confidentiality; the distinctions imply very different actions, roles, and responsibilities. Safeguards serve privacy and confidentiality.

4 Broad privacy and data protection regimes

The plethora of controls

A great complication and source of frustration for health research is the number and complexity of the societal controls impacting on the work. Most of the controls have, in themselves, been developed for good reasons and, in themselves, if reasonable flexibility is allowed, can be followed to good ends. But taken all together – which is the only way researchers can take them – they seem onerous and almost overwhelming. Complying with them requires substantial effort, as enforcing them does. They impose bureaucratic and legal compliance burdens, financial costs, and delays. Most reports on the state of health research, in many countries, complain about their dampening effect.1 A review in 2011 by the Academy of Medical Sciences of the many sets of rules impinging on research in the UK, for example, concluded that:

Access to patient data for research is currently hampered by a fragmented legal framework, inconsistency in interpretation of the regulations, variable guidance and a lack of clarity among investigators, regulators, patients and the public.2

"What this [situation] has bred," Andrew Morris testified to a committee of the House of Lords, "is a culture of caution, confusion, uncertainty, and inconsistency."3

1 Infectious Diseases Society of America: William Burman and Robert Daum (authors), "Grinding to a halt: the effects of the increasing regulatory burden on research and quality improvement efforts," Clinical Infectious Diseases, 49 (2009), 328–335; Daniel Wartenberg and W. Douglas Thompson, "Privacy versus public health: the impact of current confidentiality rules," American Journal of Public Health, 100 (2010), 407–412.
2 Academy of Medical Sciences (UK), A New Pathway for the Regulation and Governance of Health Research (2011), p. 4: www.acmedsci.ac.uk/p99puid209.html.
3 Andrew Morris, oral evidence, UK House of Lords, Science and Technology Committee, Genomic Medicine, 2nd Report of Session 2008–2009, §6.15: www.publications.parliament.uk/pa/ld200809/ldselect/ldsctech/107/107i.pdf.


In their own frames of reference most of the regimes make sense, and most actually facilitate research by assuring the public and its leaders that care is being taken, thereby encouraging trust. The difficulties are that some of the regimes are very general and not focused on health research, or not even on health information, but must be conformed to nonetheless; some are way behind the scientific times; some simply cannot be applied to contemporary information technology and data flows; and many are implemented in unpredictable and inconsistent ways. And there are just an awful lot of them. These difficulties are not merely nuisances to research. To the extent that they unjustifiably impede research, they impede the realization of the health benefits that might accrue from the research, and their complexity complicates the protection of privacy and confidentiality.

Two sorts of regimes

The "rules" (for short) take different forms in different locales and situations, but they tend to be of two sorts as regards scope: broad privacy and data protection regimes; and regimes specific to health care, public health, and health research.

Broad privacy and data protection regimes include omnibus and sectoral privacy and data protection laws and many telecommunication regulations. Although they may apply to health research, they are drafted broadly to apply to banking, tax reporting, insurance, education, labor, travel, online commerce, direct marketing, public security, and many other activities as well. They will be discussed later in this chapter.

Healthcare, public health, and health research regimes include professional guidance and regulations, medical information laws and regulations, human subject protection and research ethics review systems, clinical trial regulations, public health confidentiality provisions, and laws and guidelines pertaining to biospecimens. They are the subject of the next chapter.

The rules are enforceable, variously, by statutes, regulations issued under statutes, professional licensing and censure, contract law, tort law, common law, and other means, and interpretive decisions may make reference to international instruments such as the Declaration of Helsinki or the Council of Europe Recommendation on Research on Biological Materials of Human Origin.

Early concerns about computers and privacy

Starting around 1970, responding to rising apprehension about the effects that "automated data systems," then coming into practical use, would have on privacy, several organizations surveyed the prospects and drafted policy recommendations and principles.4 It isn't necessary here to recount the history in detail, engaging though it is, but the evolution deserves to be noted because it indicates the status of the instruments that emerged and holds implications for their revision.5 In the UK and the US, three committee reports were especially influential. One thing they all did was draft guiding principles.

4 Credit for stimulating concern should be accorded to the writing of Alan F. Westin, notably Privacy and Freedom (New York: Atheneum, 1967) and, with Michael A. Baker, Databanks in a Free Society (New York: Quadrangle Books, 1972).
5 For commentary on the early days, see Viktor Mayer-Schönberger, "Generational development of data protection in Europe," in Philip E. Agre and Marc Rotenberg (eds.), Technology and Privacy: The New Landscape (Cambridge, MA: MIT Press, 1997), pp. 219–241; Colin J. Bennett, Regulating Privacy: Data Protection and Public Policy in Europe and the United States (Ithaca, NY: Cornell University Press, 1992); Priscilla M. Regan, Legislating Privacy: Technology, Social Values, and Public Policy (Chapel Hill, NC: University of North Carolina Press, 1995). A broad review from a political science perspective is Colin J. Bennett and Charles D. Raab, The Governance of Privacy: Policy Instruments in a Global Perspective (Cambridge, MA: MIT Press, 2006).

In 1972, a Report of the Committee on Privacy prepared for Parliament (Younger Report) reviewed commercial sector privacy issues and proposed principles relating to purpose limitation, minimization of data collection and retention, notice to data-subjects, accuracy, security, and the separation of identifiers from data collected for statistical purposes.6 Incidentally, in opening the discussion of the report in the House of Lords, Lord Byers remarked: "The first difficulty we faced as a Committee was in trying to define 'privacy', and in the event we decided that it could not satisfactorily be done. We looked at many earlier attempts, and we noted that they either went very wide, equating the right to privacy with the right to be let alone, or that they amounted to a catalogue of assorted values to which the adjectives 'private' or 'personal' could reasonably be applied."7

6 UK House of Lords, Committee on Privacy, Report of the Committee on Privacy, Kenneth Younger, chair (Home Office, Cmnd 5012, H.M. Stationery Office, 1972).
7 UK House of Lords, Session of June 6, 1973: http://hansard.millbanksystems.com/lords/1973/jun/06/privacy-younger-committees-report, §106.

In 1973, Records, Computers and the Rights of Citizens, a report from the US Health, Education and Welfare Secretary's Advisory Committee on Automated Personal Data Systems, reviewed with concern the shift "from record keeping to data processing" in statistical reporting and research. It recommended legislative adoption of what it called "fair information practices" covering openness about the existence of databases, rights of data-subjects to be informed of data about themselves and to request correction or amendment, prohibition of use of data for purposes other than the original ones without consent, and data holder obligations to keep data reliable for their intended use and to prevent misuse.8

8 US Secretary of Health, Education and Welfare, Advisory Committee on Automated Personal Data Systems, Records, Computers and the Rights of Citizens (US Government Printing Office, Washington, DC, 1973), available at: http://aspe.hhs.gov/datacncl/1973privacy/tocprefacemembers.htm.

Broad privacy and data protection regimes

than the original ones without consent, and data holder obligations to keep data reliable for their intended use and to prevent misuse.8 In 1977, Personal Privacy in an Information Society, a report from a study commission established by Congress, reviewed the issues at length and made recommendations for the protection of privacy in several areas, including health care, health benefit programs, and research and statistics activities. It adopted the stance that to be effective, a privacy protection policy must “minimize intrusiveness, maximize fairness of decisions made on the basis of the information, and create legitimate, enforceable expectations of confidentiality.”9 Despite the constructive reports and the fact that in 1974 it had passed a Privacy Act protecting personal information held in government record systems, Congress did not seriously consider broader legislation. Nor did the UK. Through the 1970s and early 1980s several countries, such as France, Sweden, and West Germany, passed laws relating to privacy of government records such as census and tax data, and laws protecting credit and telecommunication data. But none passed laws covering all kinds of data, and applying to both the private and public sectors, until the mid-1980s, when countries started adopting broad data protection acts. In 1984, for example, the UK passed a Data Protection Act covering personal data wherever held (with some exemptions such as for small businesses).

The OECD privacy principles A major step toward universal codification was taken in 1980 when the Organisation for Economic Co-operation and Development (OECD) adopted “Guidelines on the Protection of Privacy and Transborder Flows of Personal Data.” The Guidelines are not legally binding, but among the OECD nations, and indeed elsewhere, they have been considered useful and authoritative, and they have influenced countless policies, including those of corporations and research institutions. Adopting recommendations developed in the review exercises described above, the Guidelines set out eight data protection principles (hereafter the OECD privacy principles), which still 8

9

US Secretary of Health, Education and Welfare, Advisory Committee on Automated Personal Data Systems, Records, Computers and the Rights of Citizens (US Government Printing Office, Washington, DC, 1973), available at: http://aspe.hhs.gov/datacncl/ 1973privacy/tocprefacemembers.htm. US Privacy Protection Study Commission, Personal Privacy in an Information Society (US Government Printing Office, Washington, DC, 1977), available at: http://aspe.hhs.gov/ datacncl/1977privacy/toc.htm.

Adopting recommendations developed in the review exercises described above, the Guidelines set out eight data protection principles (hereafter the OECD privacy principles), which still stand. Some aspects need to be revised now, but because of their importance the principles are worth quoting in full:

Collection limitation principle. There should be limits to the collection of personal data and any such data should be obtained by lawful and fair means and, where appropriate, with the knowledge or consent of the data subject.

Data quality principle. Personal data shall be relevant to the purposes for which they are to be used, and, to the extent necessary for those purposes, should be accurate, complete and kept up-to-date.

Purpose specification principle. The purposes for which personal data are collected should be specified not later than at the time of data collection and the subsequent use limited to the fulfilment of purposes or such others as are not incompatible with those purposes and as are specified on each occasion of change of purpose.

Use limitation principle. Personal data should not be disclosed, made available or otherwise used for purposes other than those specified in accordance with [the preceding principle] except: (a) with the consent of the data subject; or (b) by the authority of law.

Security safeguards principle. Personal data shall be protected by reasonable security safeguards against such risks as loss or unauthorized access, destruction, use, modification or disclosure of data.

Openness principle. There should be a general policy of openness about developments, practices and policies with respect to personal data. Means should be readily available for establishing the existence and nature of personal data, and the main purposes of their use, as well as the identity and usual residence of the data controller.

Individual participation principle. An individual should have the right: (a) to obtain from the data controller, or otherwise, confirmation of whether or not the data controller has data relating to him; (b) to have communicated to him, data relating to him: within a reasonable time; at a charge, if any, that is not excessive; in a reasonable manner; and in a form that is readily intelligible to him; (c) to be given reasons if a request made under subparagraphs (a) and (b) is denied, and to be able to challenge such denial; and (d) to challenge data relating to him and, if the challenge is successful, to have the data erased, rectified, completed or amended.

Accountability principle. A data controller should be accountable for complying with measures which give effect to the principles stated above.10

10 Organisation for Economic Co-operation and Development, "Recommendation of the Council concerning guidelines governing the protection of privacy and transborder flows of personal data" (1980): www.oecd.org/document/18/0,3746,en_2649_34255_1815186_1_1_1_1,00&&en-USS_01DBC.html.
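The purpose specification and use limitation principles, taken together, amount to a compatibility test applied to every proposed use of the data. The following Python sketch is only one schematic reading of that test; the function name, arguments, and the simplified handling of compatible purposes are assumptions made for illustration, not OECD text.

```python
# A schematic, assumed reading of the OECD purpose specification and use
# limitation principles; names and structure are illustrative only.
def use_permitted(proposed_purpose, specified_purposes,
                  data_subject_consents=False, authorized_by_law=False):
    """Permit a use that matches the purposes specified at collection,
    or that falls under the principle's two stated exceptions."""
    if proposed_purpose in specified_purposes:
        return True
    # Exceptions named in the use limitation principle:
    return data_subject_consents or authorized_by_law

specified = {"diabetes research", "cardiovascular research"}
print(use_permitted("diabetes research", specified))         # True: specified purpose
print(use_permitted("direct marketing", specified))          # False: outside purposes
print(use_permitted("direct marketing", specified,
                    data_subject_consents=True))             # True: consent exception
```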


Accountability principle. A data controller should be accountable for complying with measures which give effect to the principles stated above.10

The principles have the advantage of being terse and commonsensical, and the disadvantage of all terse rules: they leave much leeway for interpretation.

10 Organisation for Economic Co-operation and Development, “Recommendation of the Council concerning guidelines governing the protection of privacy and transborder flows of personal data” (1980): www.oecd.org/document/18/0,3746,en_2649_34255_1815186_1_1_1_1,00&&en-USS_01DBC.html.

Council of Europe Convention 108

At the same time that the OECD was developing its Guidelines, the Council of Europe was drafting a much more detailed Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data, the articles of which are consonant with the OECD principles.11 Not surprisingly, the two organizations coordinated their efforts.

11 Council of Europe, Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data, European Treaty Series No. 108 (1981): http://conventions.coe.int/Treaty/EN/Treaties/Html/108.htm. The Council of Europe (CoE) should not be confused with the European Union or its governing Council. The CoE is an organization of 47 countries, including all of the EU countries, having a strong concern for human rights and the rule of law. Although it has no enforcement powers, countries ratifying its conventions are expected to comply with them, and this commitment establishes a moral and legal benchmark.

The Convention is a multilateral instrument declaring a fundamental right to data protection. It goes further than the OECD principles do, such as by requiring that special attention be paid to highly sensitive categories, for example, racial, religious, and sexuality data, and it addresses many procedural and transborder data transfer issues. As with the OECD principles, some aspects are outdated – after all, these instruments were drafted more than 30 years ago – but the essence remains pertinent and the Convention has served as the scaffold for much national legislation. It has now been ratified by more than 40 Member States, making it binding in those countries, and it is open, by invitation, to ratification by countries that are not members of the Council. A process of full revision is underway.

The European strategy

The current legal framework in Europe, which is built around the core elements of Convention 108, is comprehensive data protection, based on laws interpreted and enforced by independent supervisory authorities.


The scheme is worth describing at some length because it applies throughout Western Europe and parts of Eastern Europe; it is the model upon which the laws of Argentina, Israel, and some other countries are fairly closely based; it is a model that many Asian and Latin American countries are drawing elements from as they develop their laws; and it is a regime that must be accommodated by all health research activities that process personal data in the EU or that import personal data from the EU to non-EU countries.

The overarching instrument, adopted by the Council and Parliament of the EU in 1995, is a directive whose title indicates dual purposes: the Directive on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data (hereafter the EU Data Protection Directive).12

12 EU Directive on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data (Directive 95/46/EC): http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=celex:31995L0046:en:html. The portal to information about the Directive and its implementation by the Member States is: http://ec.europa.eu/justice/data-protection/index_en.htm.

The Directive has been transposed by all 27 EU Member States into their national laws, such as the Italian Codice in materia di protezione dei dati personali, the German Bundesdatenschutzgesetz, the Spanish Ley Orgánica de Protección de Datos de Carácter Personal, and the UK Data Protection Act. Transposition of a directive – an obligation to which all EU countries are committed as Member States – means that the requirements in the Directive, which are not in themselves enforceable, are implemented in national law and so are enforceable. Variations in detail and procedure are allowed at national discretion, such as to mesh with other laws, respect cultural sensitivities, or add provisions, as long as these don’t conflict with the tenets of the Directive. As members of the European Economic Area (EEA), Iceland, Liechtenstein, and Norway are also subject to the Directive. Switzerland has a parallel law.

The Directive is complex because it must address the two fundamental purposes, cover a staggering variety of ever-changing activities and problems, and be flexible enough to be accommodated by the legal systems of many and differing countries. The Member States’ laws transposing it are complex because they must accommodate the provisions of the Directive and at the same time conform to their other laws and the local legal system. A succinct overview of a few salient features (repeating several definitions here from earlier mentions in order to summarize the points in one place) is that the Directive:

□ applies to all personal data held in organized systems in the EU, irrespective of the origin or topic of the data (except for a few exempt categories such as national security data) and irrespective of the citizenship or residency of the data-subjects;
□ defines personal data as “any information relating to an identified or identifiable natural person . . . who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity”;
□ focuses on acts and consequences of data processing, which include “any operation or set of operations which is performed upon personal data, whether or not by automatic means, such as collection, recording, organization, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, blocking, erasure or destruction”;
□ places accountability on “data controllers,” defined as the persons (which may be so-called legal persons such as corporate bodies or universities) who determine the purposes and means of data processing, and who in turn supervise “data processors,” agents who work with data on behalf of data controllers;
□ embraces the OECD / European Convention 108 principles, casting them as requirements – there are sections on fair and lawful data collection, data quality, purpose specification, use limitation, security, openness, data-subject participation, and accountability;
□ emphasizes transparency and the informing of data-subjects as to intentions to collect data, the purposes, and the identity of the data controllers;
□ accords exceptionally high protection to health, sexual, and some other data considered to be unusually sensitive;
□ prohibits the exportation of personal data from the EU unless the receiving country affords “an adequate level of protection,” or the data-subject consents, or certain other criteria are met;
□ requires the appointment of independent national supervisory authorities having powers of inspection and enforcement;
□ establishes a European Article 29 Data Protection Working Party, comprising representatives of all of the national data protection authorities, a representative of the EU institutions, and a representative of the European Commission, which works to harmonize implementation of the Directive, share experiences, and advise the European Commission and the public.13

13 So-called because it was established by that article of the Directive. Various of the Working Party’s documents are discussed in this book: regarding the future of privacy on p. 46; the concept of personal data on pp. 89 and 107; geolocation on p. 91, footnote 7; controller and processor on pp. 127–128; consent on pp. 44 and 73; and accountability on p. 45, footnote 19.


Inevitably, the Member State laws vary considerably among themselves and the countries employ a variety of implementation mechanisms. For instance, some require registration of data collections with the supervisory authority, while others don’t, and some require prior approval for the processing of sensitive data, while others don’t. All have national commissioners, commissions, committees, or agencies charged with implementing and enforcing the Acts. Variously these authorities are appointed by, and report to, a legislative body or a minister. All have staff support and a considerable degree of independence.

Four provisions of the Directive that are especially relevant to health research should be noted. First, regarding the special categories of data, or “sensitive data” (discussed in Chapter 2), the Directive requires that:

Member States shall prohibit the processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life [Article 8.1] [except] where processing of the data is required for the purposes of preventive medicine, medical diagnosis, the provision of care or treatment or the management of health-care services, and where those data are processed by a health professional subject under national law or rules established by national competent bodies to the obligation of professional secrecy or by another person also subject to an equivalent obligation of secrecy [Article 8.3].

One option through which such processing is allowed is explicit consent by the data-subject (Article 8.2(a)).

Second, regarding research and statistics activities, Member States may suspend data-subjects’ rights of access to personal data “when data are processed solely for purposes of scientific research or are kept in personal form for a period which does not exceed the period necessary for the sole purpose of creating statistics” (Article 13.2). This should have implications at least for the aggregation of individual-level data into public health or other population statistics, and possibly for suspending clinical trial participants’ access to information until the trial is concluded.

Third, regarding secondary use of data, if safeguards are in place, Member States may suspend requirements of notification and fair processing where, “for processing for statistical purposes or for the purposes of historical or scientific research, the provision of such information proves impossible or would involve a disproportionate effort or if recording or disclosure is expressly laid down by law” (Article 11.2).


And fourth, regarding public interest, Member States may consider the processing of personal data to be legitimate if it “is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller or in a third party to whom the data are disclosed” (Article 7). Specifically for “sensitive data” such as health data, “Subject to the provision of suitable safeguards, Member States may, for reasons of substantial public interest, lay down exemptions in addition to those laid down in paragraph 2 either by national law or by decision of the supervisory authority” (Article 8.4). In theory, these clauses could be viewed as justifying the processing of data in at least some research and public health work.

A serious problem is that what constitutes “scientific research” under Articles 11 and 13, or exactly what kinds of research exemptions can be inferred, has never been explained by the Article 29 Working Party or critically tested by the research community or the courts. Nor has what might qualify as the public interest in such activities as developing disease registries and using registry data in research. Nor had, until recently, the nuances of consent been explored beyond the Directive’s definition, “any freely given specific and informed indication of his wishes by which the data subject signifies his agreement to personal data relating to him being processed” (Article 2(h)). In 2011, the Article 29 Working Party issued an “Opinion on the definition of consent.”14 But the Opinion even disapproves of consent as a basis for patients’ authorizing the use of electronic health records in the provision of care, arguing that such consent would be too broad.15 Not surprisingly, the document doesn’t provide any guidance on the acceptability of broad consent of the kind necessary for longitudinal studies or population biobanks, but its astringent interpretation of “specific” and “explicit” at various points suggests no receptivity to such a notion. This group of interrelated issues must be addressed as the Directive is revised and then as the Member States change their laws to transpose the new version into national law.

14 European Article 29 Data Protection Working Party, “Opinion on the definition of consent” (2011): http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2011/wp187_en.pdf.
15 The Opinion declares that “a ‘general agreement’ of the data subject – e.g. to the collection of his medical data for an electronic health record and to any future transfers of these medical data to health professionals involved in treatment – would not constitute consent in the terms of Article 2(h) of the Directive” (p. 18). It points out that Member States may invoke Article 8.4 and enact exemptions in their laws to allow the use of EHRs on public interest grounds.

The Directive’s language is necessarily generic. The national interpretations tend to be more specific and detailed, and a few countries’ regimes implementing the Directive have mechanisms specific to health research.


France’s data protection act, for example, has a constructive section on medical research (Articles 53–61).16 One arrangement is that researchers wanting to undertake non-interventional studies using identifiable health data must submit their proposals to a committee of methodological experts (Comité consultatif sur le traitement de l’information en matière de recherche dans le domaine de la santé). The committee advises the data protection authority, the Commission nationale de l’informatique et des libertés (CNIL), of its assessments. Taking any other considerations into account, the CNIL then makes approval decisions.17

16 France, Loi relative à l’informatique, aux fichiers et aux libertés. English translation: www.cnil.fr/fileadmin/documents/en/Act78-17VA.pdf.
17 Frédérique Claudot, François Alla, Jeanne Fresson, et al., “Ethics and observational studies in medical research: various rules in a common framework,” International Journal of Epidemiology, 38 (2009), 1104–1108.

The UK Data Protection Act has a section on the use of data for historical, research, and statistical purposes (Section 33), and its definition of “medical purposes” includes “preventative medicine, medical diagnosis, medical research [emphasis added], the provision of care and treatment and the management of healthcare services” (Schedule 3, Article 8.2).18 (Medical research per se is not mentioned in the Directive.) In the UK, researchers wanting to use National Health Service (NHS) data without the patients’ consent can apply to a statutory board for dispensation, a mechanism that will be described in Chapter 6.

18 UK Data Protection Act: www.legislation.gov.uk/ukpga/1998/29/contents.

Most of the EU authorities recognize that they have not been able to focus accountability and enforce their laws in a way that fully satisfies the intent of the Directive.19 Data uses, privacy values, information technology, and of course the biomedical sciences have all changed in ways that were unpredictable when the Directive was passed in 1995. Moreover, in their defense it must be acknowledged that the authorities have been inundated by a flood of controversial issues – demands for personally identified information to combat terrorism, drug trafficking, and money laundering, video surveillance of public places, biometric recognition, radiofrequency identification, police retention of DNA samples, retention and questionable uses of Internet activity data by service providers, privacy issues in online social networking, geosurveillance of people’s movements, and all the rest – and at the same time have had to devote a lot of effort to developing practical guidance, workable across the 30 EEA countries and acceptable elsewhere, on how to comply with the Directive. And some nearly intractable difficulties stem from concepts in the Directive itself.20

19 European Article 29 Data Protection Working Party, “Opinion on the principle of accountability” (2010): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2010/wp173_en.pdf.
20 For some bases for this criticism, see among other sections later in this book the discussion of problems with the concept of “personal data” in Chapter 6 and the discussion of the softness of the arrangements relating to the international transfer of data in Chapter 10.


A process of revision is underway.21 Informed in part by public consultations, in late 2009 the authorities summarized their reflections in a report, The Future of Privacy, which among other things called for:

□ clarification of some core concepts and requirements, such as those regarding consent and transparency;
□ encouragement of “privacy by design” in systems and projects;
□ focusing of accountability;
□ introduction of the possibility of class action (group) lawsuits;
□ requiring notification of data security breaches;
□ general reduction of bureaucratic burdens;
□ harmonization of the differing Member State interpretations;
□ redesigning of the process for determining whether non-EU countries afford “adequate protection” to qualify to receive personal data from EU sources.22

21 Neil Robinson, Hans Graux, Maarten Botterman, and Lorenzo Valeri, Review of the European Data Protection Directive, a report prepared by RAND Europe for the UK Information Commissioner’s Office (2009): www.rand.org/pubs/technical_reports/2009/RAND_TR710.pdf; LRDP Kantor Ltd, “New challenges to data protection,” a review, with provocative working papers and country reports, prepared for the European Commission’s Directorate-General Freedom, Security, and Justice (2010): http://ec.europa.eu/justice/policies/privacy/docs/studies/new_privacy_challenges/final_report_en.pdf; European Commission, “Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation)” (2012): http://ec.europa.eu/justice/data-protection/document/review2012/com_2012_11_en.pdf.
22 European Article 29 Data Protection Working Party jointly with the Working Party on Police and Justice, “The future of privacy” (2009): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2009/wp168_en.pdf.

The report urges pushing toward global standards and says that if this does not succeed, the feasibility of a binding international framework should be explored. Furthermore, it notes that since “international agreements can be appropriate instruments for the protection of personal data in a global context, the future legal framework could mention the conditions for agreements with third countries” (meaning non-EU countries).

The health research community exerted little if any influence on the drafting of the 1995 Directive. It should be engaged, now, with the revision process and subsequent national transposition, as regards such issues as the concept of personal data, consent, public interest considerations relating to health research, the use of de-identified data without consent for research, the status of de-identified data when in the hands of researchers who cannot themselves re-identify the data, genotype data as potentially identifying data, and the international movement of data.

A modest but clear example of the kind of issue that must be confronted is this conclusion of the comprehensive review prepared for the European Commission mentioned above: “Blatantly in violation of the Directive, the UK Data Protection Act adds ‘medical research’ to the list of medical purposes set out in Article 8(3) of the Directive, thus circumventing purpose-limitation in that regard (contrary to the clear guidance on this from the [Article 29 Working Party]).”23 This is not just a question of legislative completeness, but one of the very framing of health research, and not only for the UK.

23 LRDP Kantor Ltd, “New challenges,” p. 29.

The US approach

Although it attends to privacy in many ways, the US cannot be said to have a coherent strategy. Informational privacy is not explicitly addressed by the Constitution. The US has never seriously considered establishing a comprehensive privacy or data protection law like those in Europe, but has instead adopted laws and regulations specific to particular sectors, kinds of information, or privacy risks, and has relied heavily on professional and industry self-regulation and on tort and other avenues of legal redress.

One broad federal law is the Privacy Act, which protects personally identifiable information maintained in systems of records in the possession and control of federal government organizations.24 This covers data held by agencies and institutes of the Department of Health and Human Services, for instance. Although the Act establishes a high fence against uses of the data by nongovernmental parties (unless authorized), it carries a “routine use” provision allowing use of data for purposes compatible with the original purpose, which permits relatively wide leeway for the use of data within the government.

24 US Department of Justice, “Overview of the Privacy Act of 1974”: www.justice.gov/opcl/1974privacyact-overview.htm.

Another law, not strictly a privacy law but one that provides avenues of enforcement, is the Federal Trade Commission Act, which can be applied to unfair or deceptive business practices and thus to such matters as conformance of a biotechnology company to its published privacy policies.


The US has many federal laws relating specifically to the privacy of health care and public health data, which will be discussed in the next chapter. In addition to the federal laws, the fifty states have hundreds of laws protecting, variously, healthcare, pharmacy, mental health, communicable disease, genetic test, and other categories of health-related data, and laws requiring notification of data security breaches. This inevitably leads to inconsistencies across the states, and occasionally to questions as to whether federal or state law prevails.

US businesses tend to prefer schemes of self-regulation, self-publication of privacy policies, certification by quasi-independent organizations, trustmarking of websites, and other nonregulatory approaches. As will be discussed in Chapter 10, for transfers of personal data from the EU to the US, American companies can self-certify to a Safe Harbor Framework based on a treaty-like agreement with the EU administered by the US Commerce Department.

Thus, it is not that the US has no privacy or confidentiality protections. But what it has is an assortment – almost always referred to as a patchwork or mosaic (or, sometimes, a briar patch) – of statutes, regulations, guidance, and professional and business self-regulation, at both federal and state levels. And again, it relies on the deterrent of possible after-the-fact legal liability much more than other countries do.

The Canadian strategy

Canada has since 1983 had a Privacy Act applying to the handling of data by the federal government, and since 1999 a Personal Information Protection and Electronic Documents Act (PIPEDA) applying to commercial organizations regulated by the federal or provincial governments.25 PIPEDA was based on the Canadian Standards Association’s “Model Code for the Protection of Personal Information,” which was itself based on the OECD privacy principles. The Act goes further than the OECD principles do by adding pragmatic specifics, such as focusing accountability by requiring that organizations designate a named officer responsible for compliance, and embracing a risk-based approach by saying that “information shall be protected by safeguards appropriate to the sensitivity of the information.”

25 Canada, “Personal Information Protection and Electronic Documents Act”: http://laws-lois.justice.gc.ca/PDF/P-8.6.pdf; Canada, Office of the Privacy Commissioner, “Leading by example: Key developments in the first seven years of the Personal Information Protection and Electronic Documents Act” (2008): www.priv.gc.ca/information/pub/lbe_080523_e.cfm.


A challenge in Canada is that the provinces all have privacy or data protection laws, and several have health-information privacy laws. Unsurprisingly, they are not uniform. The intersections between the provincial laws and PIPEDA are complicated. PIPEDA prevails unless the provincial law offers similar protection, and it governs inter-provincial and international data handling by commercial organizations. Another challenge is that the Privacy Act is not fully aligned with the more modern PIPEDA.

Avner Levin and Mary Jo Nicholson have argued that Canada’s “coalescing identity as a multicultural haven” has led Canada to settle on a “safe middle ground” between the US and EU approaches, generalizing that the US emphasizes liberty, the EU dignity, and Canada autonomy. “As concepts, ‘liberty’ and ‘dignity’ may seem distinct,” they argue, “yet they can both be understood as manifestations of autonomy, one in the political arena, the other in the social field.”26 The structure of the Canadian privacy laws and their implementation by commissioners resemble the EU scheme, however, and although the US certainly values liberty, its approaches to informational privacy tend to emphasize fairness and the practical minimization of privacy risks.

26 Avner Levin and Mary Jo Nicholson, “Privacy law in the United States, the EU and Canada: The allure of the middle ground,” University of Ottawa Law & Technology Journal, 2 (2005), 357–394: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=894079.

Australia, Japan, APEC

Australia. The Commonwealth has since 1988 had a Privacy Act protecting personal information in the possession of the federal government, and in 2000 this was amended to extend coverage to most private sector organizations. Three states have their own privacy laws and all of the states and territories regulate privacy one way or another. Some of the legal provisions are not mutually consistent. At least minor confusion is added by the fact that the Commonwealth’s Privacy Act is keyed to two sets of principles – 11 Information Privacy Principles for the public sector and 10 National Privacy Principles for the private sector.27

27 Entrée to the Australian Privacy Act 1988 and privacy protection in Australia generally can be gained via: www.privacy.gov.au.

In 2008, the Australian Law Reform Commission issued a comprehensive consultation-based review, For Your Information: Australian Privacy Law and Practice. Among many other revisions, the report recommended consolidating the two sets of principles into Unified Privacy Principles. It addressed the use of data in health research in considerable detail and discussed such issues as consent, identifiability, and data linkage.28

28 Australian Law Reform Commission, For Your Information: Australian Privacy Law and Practice (ALRC Report 108, 2008): www.austlii.edu.au/au/other/alrc/publications/reports/108.


At the time of writing, the government had not fully responded to the recommendations, and although Parliament had not revised the Act, there was general expectation that it would do so before long. The Federal Privacy Commissioner has the power to approve binding guidelines and has approved guidelines developed by the National Health and Medical Research Council on the use of data from Commonwealth agencies for research, and on Health Research Ethics Committees’ evaluation of applications to use personal data without consent.

Japan. Japan has since 2003 had a Personal Information Protection Act aligned with the OECD privacy principles and covering both the private and public sectors. But differently from the practices of other countries, the interpretation, guidance, regulation, and enforcement of the Act are delegated to local governments and to sectoral ministries such as the Ministry of Health, Labor, and Welfare and the Ministry of Economy, Trade and Industry, which develop guidelines specific to their domains.29 Japan has no national privacy or data protection authority and there seems to be no movement toward establishing one.30 It must be said that to many Japanese doctors, scientists, and lawyers, as well as to baffled Western observers and international corporations, it is unclear how, or how effectively, the law is implemented and enforced in the research arena (which involves several ministries, many government institutes, the pharmaceutical and other industries, and many academic, scientific, and medical organizations). The uncertainty from the lack of unified guidance is debilitating.31 Given the country’s important participation in global commerce and international cooperative health research projects, it is hard to imagine that Japan will not soon have to develop a more coherent and transparent privacy protection regime.

29 Japan, Personal Information Protection Act, English translation: www.cas.go.jp/jp/seisaku/hourei/data/APPI.pdf.
30 Graham Greenleaf, “Country study B.5 – Japan,” an appendix to LRDP Kantor Ltd, “New challenges” (2010): http://ec.europa.eu/justice/policies/privacy/docs/studies/new_privacy_challenges/final_report_country_report_B5_japan.pdf.
31 This is the author’s perception from personal and seminar discussions in Japan. It does not imply that privacy or confidentiality are not respected, just that it is not clear how the Act and related regulations are applied in practice, or whether they are applied consistently across the implementing jurisdictions.

APEC. The Asia–Pacific Economic Cooperation (APEC) is an organization of 21 Member Economies, including the eastern Pacific Rim countries Canada, the US, and Mexico.


Several of the members, such as Australia and Hong Kong, have long had data protection laws generally similar to those of the EU, although differing in details and enforcement provisions. But most of the APEC members are still in the early stages of developing serious informational privacy law, and as they feel their way toward it – and toward larger roles in the global economy, which requires attending to privacy so data can confidently be exchanged – many are sizing up the issues and coordinating their actions through an APEC Privacy Framework. The Framework “aims to promote a consistent approach to information privacy protection, to avoid the creation of unnecessary barriers to information flows and to remove impediments to trade across APEC Member Economies.”32 A number of “pathfinder projects” are underway.

External commentators tend to find the Framework and the steps toward implementing it weak compared to the European model, but better than the situation was in most Asian countries previously.33 The initiative is strongly oriented to business interests and much of it is devoted to the development of “cross-border privacy rules.” Perhaps this will facilitate the movement of data in pharmaceutical and biotechnology R&D, although this isn’t yet clear, but it is hard to see how it might relate to noncommercial research activities.

32 Asia–Pacific Economic Cooperation, “Privacy framework”: www.apec.org/Groups/Committee-on-Trade-and-Investment/Electronic-Commerce-Steering-Group.aspx.
33 Nigel Waters, “The APEC Asia-Pacific privacy initiative – A new route to effective data protection or a trojan horse for self-regulation?,” Script-ed, 6(1) (2009), 75: www.law.ed.ac.uk/ahrc/SCRIPT-ed/vol6-1/waters.asp; Chris Connolly, “Asia-Pacific region at the privacy crossroads” (2008): www.galexia.com/public/research/assets/asia_at_privacy_crossroads_20080825/.

5 Healthcare, public health, and research regimes

Omnipurpose laws can hardly be specific enough to deal with the complexities of contemporary health care and research. Some countries, notably the US among the leading research nations, don’t have such broad laws anyway. And besides, healthcare and research data have long been protected in many other ways, as will now be discussed.

Healthcare and payment regimes

Hippocratic medical confidence, the respectful guarding of medical secrets by healthcare practitioners and those under their supervision, has a long and laudable history that need not be recounted here. Commitment to the ideal is required as a condition of professional licensing and the accreditation of medical institutions, and the obligation is enforceable under law. Serious inadequacies are having to be compensated for now, however, given the dramatic expansion of the circle of care, the scale and indirectness of modern payment systems, and the handling of data via networked electronic systems. For this reason the past two decades have brought much sector-specific legislation, regulation, and guidance.

For example, under the Health Insurance Portability and Accountability Act of 1996, the US issued a comprehensive Privacy Rule protecting personally identifiable information transmitted electronically in the course of health care or payment (hereafter the HIPAA Privacy Rule).1 The Rule has provisions on the use of healthcare data for research, reviewing of clinical databases to identify candidates for recruitment to research, authorization by patients of research use of data, de-identification of data, and other research-related issues, which will be discussed at various points as the book proceeds.2

1 US Department of Health and Human Services, Standards for Privacy of Individually Identifiable Health Information (Privacy Rule): www.hhs.gov/ocr/privacy/hipaa/administrative/privacyrule/index.html.
2 US National Institutes of Health, “Protecting personal health information in research: Understanding the HIPAA Privacy Rule”: http://privacyruleandresearch.nih.gov/pr_02.asp.


Other important protections in the US healthcare system are the confidentiality provisions in the statutes of the large reimbursement systems Medicare (for people age 65 and over, and people who are permanently disabled and unable to work) and Medicaid (for low-income and otherwise disadvantaged people), and of the Veterans Health Administration.

To cite another example, the Canadian province of Ontario, recognizing the need for a law complementary to the federal Personal Information Protection and Electronic Documents Act (PIPEDA) – which, recalling from the previous chapter, applies only to organizations that collect, use, and disclose personal information in the course of commercial activities and is not specific to health – has a highly regarded Personal Health Information Protection Act (PHIPA).3 PHIPA focuses responsibilities on designated “health information custodians,” whoever they are and wherever they are situated, provides guidance on the assessment of privacy issues by Research Ethics Boards, and carries practical provisions on the sharing of healthcare data for research. Several other Canadian provinces also have health information laws.

3 Ontario, Personal Health Information Protection Act (PHIPA): www.e-laws.gov.on.ca/html/statutes/english/elaws_statutes_04p03_e.htm. The relation between PHIPA and PIPEDA is described in Ontario, Ministry of Health and Long-Term Care, “Declaration of PHIPA as substantially similar to PIPEDA”: www.health.gov.on.ca/english/providers/legislation/priv_legislation/phipa_pipeda_qa.html.

Public health regimes

Public health agencies have a long tradition of respecting privacy and confidentiality. They simply could not otherwise carry out their mandate to conduct disease surveillance, investigate disease outbreaks, intervene with individuals when necessary, register conditions and diseases, assemble health statistics, regulate medical products, and pursue research. Although the objective of most of their work is population-level understanding of occurrences, distributions, associations, and trends, this can only be built up by analyzing individual-level data.4

4 A classic overview of the widely disparate activities that are conducted under the rubric of public health is Roger Detels, Robert Beaglehole, Mary Ann Lansang, and Martin Gulliford, Oxford Textbook of Public Health, fifth edn. (Oxford University Press, 2009).

The work is often intimate and charged with emotional or cultural significance for the affected people, for their relations, and perhaps for people similarly at risk. Sexually transmitted disease surveillance and intervention are poignant examples, as are the registration of birth defects, abortions, knife wounds, and domestic violence.


Less dramatic, perhaps, but at least as important and sensitive are campaigns against problems affecting situationally disadvantaged people, such as undiagnosed active or latent tuberculosis in recent immigrants from high-prevalence countries, and stigmatizing problems such as endemic alcoholism in aboriginal groups. For these reasons, the legal auspices change when data, such as medical or immigration records, are examined for official public health purposes, and most public health programs are founded on statutes requiring strict protection of privacy and confidentiality. Examples of this in the US are provisions in the Public Health Service Act relating to the handling of potentially identifiable data by the Centers for Disease Control and Prevention and its constituent centers, such as the National Center for Health Statistics, and by the Agency for Healthcare Research and Quality. All US states and most countries have similar statutory protections.5

5 Legal dimensions of public health work are reviewed in Richard A. Goodman, editor-in-chief, Law in Public Health Practice, second edn. (New York: Oxford University Press, 2007); and in Lawrence O. Gostin, Public Health Law: Power, Duty, Restraint, second edn. (Berkeley, CA: University of California Press, 2009).

Lisa Lee and Lawrence Gostin have argued that, at least in the US, public health authorities, especially state and local authorities, would be more active and responsible stewards of public health data if guidelines were developed complementary to the laws and regulations protecting healthcare data.6

6 Lisa M. Lee and Lawrence O. Gostin, “Ethical collection, storage, and use of public health data,” Journal of the American Medical Association, 302 (2009), 82–84.

Public health and health statistics agencies, especially the larger ones, are adept at converting personal data, whether collected directly from individuals or from healthcare providers or local public health agencies, into impersonal, i.e., de-identified, data. They conduct and support research themselves, and they provide essential data and biospecimens for research by others.
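In computational terms, that conversion is straightforward to sketch, even though real statistical disclosure control is not. The following minimal Python illustration (with invented field names and a crude small-cell suppression threshold; actual agencies apply far more sophisticated disclosure-control rules) shows individual-level records being reduced to impersonal counts:

    from collections import Counter

    # Hypothetical individual-level surveillance records (all fields invented).
    records = [
        {"name": "A. Dupont", "postcode": "34280", "age": 47, "diagnosis": "TB"},
        {"name": "B. Haddad", "postcode": "34280", "age": 51, "diagnosis": "TB"},
        # ... many more records ...
    ]

    def age_band(age, width=10):
        # Coarsen an exact age into a band, e.g. 47 -> "40-49".
        low = (age // width) * width
        return "%d-%d" % (low, low + width - 1)

    def aggregate(recs, min_cell=5):
        # Tabulate diagnoses by area and age band. Direct identifiers are
        # never carried over, and cells smaller than min_cell are suppressed
        # because very small counts might allow re-identification.
        counts = Counter((r["postcode"], age_band(r["age"]), r["diagnosis"])
                         for r in recs)
        return {cell: n for cell, n in counts.items() if n >= min_cell}

    table = aggregate(records)  # only impersonal, aggregate counts remain

The design point is that the identifying fields never leave the first step: what is released is a table, not people.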

Human-subject protection regimes

For research, the international document of reference is the Declaration of Helsinki, a statement of ethical principles issued by the World Medical Association, the confederation of national medical organizations.7 Initially adopted in 1964, it has been revised six times.

7 World Medical Association, Declaration of Helsinki, “Ethical Principles for Medical Research Involving Human Subjects”: www.wma.net/en/30publications/10policies/b3/17c.pdf.


It attempts to stretch Hippocratic ethics into the modern era, in tenets that can be accepted globally. The more recent revisions have struggled, not entirely successfully, to guide the conduct of clinical trials, especially in resource-poor countries. The Declaration enjoins that: “Every precaution must be taken to protect the privacy of research subjects and the confidentiality of their personal information and to minimize the impact of the study on their physical, mental and social integrity” (Article 23). It is strongly oriented to consent and to the protection of disadvantaged subjects. It recognizes the importance of research to improve medical care, but it offers little guidance on the conditions under which individual-level data might be used for research, and it says nothing about identifiability or the de-identification of data. Although the Declaration itself carries few operational details and has no force of law, its themes have been codified, in differing ways, in laws, regulations, and guidance in most countries.

Subsidiary instruments have attempted to elaborate the Helsinki principles rather directly into research guidance. For example, the Council for International Organizations of Medical Sciences has done this for epidemiological studies.8

8 Council for International Organizations of Medical Sciences, International Guidelines for Epidemiological Studies (Geneva, 2009), available at: www.cioms.ch.

In the US, one of the most sustained influences has been the Belmont Report, Ethical Principles and Guidelines for the Protection of Human Subjects,9 prepared by a national commission in 1979.10 Among many other accomplishments, the Report crystallized three principles (paraphrasing):

□ respect for persons – treating subjects as autonomous agents, and giving special protection to subjects whose autonomy is reduced;
□ beneficence – minimizing harm to the subjects and maximizing benefits for the subjects and society;
□ justice – avoiding unfair selection and recruitment of research subjects, and having concern for the fairness of the social distribution of burdens and benefits.

9 US National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects (1979): www.hhs.gov/ohrp/humansubjects/guidance/belmont.html.
10 “Belmont” refers to the conference center where the report was drafted. For background on the Commission’s work, see Ruth R. Faden, Thomas L. Beauchamp, and Nancy M. P. King, A History and Theory of Informed Consent (New York: Oxford University Press, 1986), pp. 200–221. A collection of articles that “look back to look forward” is James F. Childress, Eric M. Meslin, and Harold Shapiro (eds.), Belmont Revisited: Ethical Principles for Research with Human Subjects (Washington, DC: Georgetown University Press, 2005).


The Belmont principles were mostly intended to protect subjects of direct medical and psychological experimentation. They have strong resonance and continue to be widely referred to as benchmarks, including outside the US, and not only with respect to clinical research but also to database research and other lines of inquiry involving little or no contact between researchers and the subjects. Despite their philosophical and legal generality, because of their extensive history of interpretation in real circumstances, they are among the ur-principles of ethics review.

In the US, the Belmont and other principles have been codified since 1991 in a Federal Common Rule on Protection of Human Research Subjects (hereafter the Federal Common Rule, or simply the Common Rule).11 Fifteen government departments and agencies that conduct, support, or regulate research involving humans have adopted the Common Rule as their own regulations, with only minor variations, thus making for uniformity.

11 US Department of Health and Human Services, Federal Policy on Protection of Human Subjects.

The Common Rule mainly addresses consent and Institutional Review Board (IRB) (independent ethics committee) review. In several places it briefly addresses privacy, but it provides little guidance, which was one reason the HIPAA Privacy Rule was developed in the late 1990s. For example, the Common Rule says without explication that it exempts “research, involving the collection or study of existing data, documents, records, pathological specimens, or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects.” The HIPAA Privacy Rule provides detailed guidance on identifiability (although with mixed success, as will be discussed in Chapter 7).

The Common Rule has wide reach. Institutions wanting to perform research on humans under federal funding or other federal auspices must enter into a Federalwide Assurance, which commits all of the research in the institution to following the Common Rule and requires registration of the responsible IRBs.

Some serious policy and procedural problems arise from inconsistencies between the Common Rule and the HIPAA Privacy Rule, such as regards the allowable breadth of consent to future research, the construal of personal identifiability of data, and the use of identifiable records in selecting possible participants for research.


Efforts are being made now to reconcile the two rules and also some regulations issued under the Federal Food, Drug, and Cosmetic Act.12

Canadian research is guided by a broad Tri-Council Policy Statement, Ethical Conduct for Research Involving Humans, which centers on respect for human dignity as it is expressed through the core principles of respect for persons, concern for the welfare of participants and implicated groups, and justice in the sense of fair involvement in research and equitable distribution of resulting benefits.13 Compliance with the Statement is a condition for receiving funding from the councils. Research organizations key their regulations and guidance to it, as, for example, the Canadian Institutes of Health Research did in its “Best Practices for Protecting Privacy in Health Research.”14

Clinical trial and product postmarketing regimes

Clinical trials are highly structured prospective studies conducted on volunteers to evaluate healthcare interventions. They are sponsored – i.e., initiated, managed, and usually financed – by pharmaceutical, biotechnology, or medical-device companies, or government agencies, or occasionally individual investigators. Most are evaluations of the benefits and risks of medicines, vaccines, medical devices, diagnostic tests, or instruments, with the findings used to support applications for regulatory approval to market new products or improve existing products and their use.15 Not all have to do with products; they are used to evaluate interventions as diverse as skin grafting, gene therapy, prevention of pressure ulcers in wheelchair users, effectiveness of sunscreen use, and behavior modification for weight reduction.


13

14

15

The Common Rule and its intersections with the HIPAA Privacy Rule were analyzed extensively in Institute of Medicine (US), Committee on Health Research and the Privacy of Health Information, Sharyl J. Nass, Laura A. Levit, and Lawrence O. Gostin (eds.), Beyond the HIPAA Privacy Rule: Enhancing Privacy, Improving Health Through Research (Washington, DC: National Academies Press, 2009), pp. 126–131 and 162–191, available at: www.nap.edu. A process of revision of the Common Rule was initiated in July 2011. Canadian Institutes of Health Research, Natural Sciences and Engineering Research Council of Canada, and Social Sciences and Humanities Research Council of Canada, Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (second edn, 2010): www.pre.ethics.gc.ca/pdf/eng/tcps2/TCPS_2_FINAL_Web.pdf. Canadian Institutes of Health Research, “Best Practices for Protecting Privacy in Health Research” (2005, and amended since, but now in need of revision): www.cihr-irsc.gc.ca/ e/29072.html. A respected technical treatise is Lawrence M. Friedman, Curt D. Furberg, and David L. DeMets, Fundamentals of Clinical Trials, fourth edn. (New York: Springer, 2010).

58

Healthcare, public health, and research regimes

any one time some 115,000 trials are underway around the world,16 varying in scale and quality.17 Most trials are conducted by units in hospitals and other clinical centers on behalf of the sponsors. Some routine but demanding tasks, such as designing protocols, training study personnel, monitoring progress, analyzing results, and preparing technical dossiers for submission to regulatory authorities, may be conducted for the sponsors by companies called contract research organizations. Because the enterprise is so complex – often single trials are conducted in coordinated fashion in many countries at a time, with the data routinely flowing back-and-forth among clinical centers, research laboratories, contract research organizations, and regulatory authorities – and the stakes are so high, medically and financially, that a high degree of standardization and regimentation is essential.18 Product trials are regulated by government agencies such as the UK Medicines and Healthcare products Regulatory Agency, the Japanese Pharmaceuticals and Medical Devices Agency, and the US Food and Drug Administration. They are managed under clinical auspices, so medical privacy and confidentiality obtain. They amount to experimentation, so elaborate versions of human subject protections called clinical trial regulations, interpreted through Good Clinical (Trial) Practices, obtain. All clinical trial regimes require protection of privacy and confidentiality. But for specific policies, governance, and enforcement, they tend to defer to medical confidentiality, ethics committee oversight, and privacy and data protection laws. For example, the EU Clinical Trials Directive simply says (Article 3.2(c)): “A clinical trial may be undertaken only if . . . the rights of the subject to privacy and to the protection of the data concerning him in accordance with [the Data Protection Directive] are safeguarded.”19 The US Food and Drug Administration enforces its version of the Federal Common Rule, which relies heavily on consent, IRB (ethics committee) review, and data security, and it exercises extreme 16 17 18

19

A continually updated catalogue and results database is ClinTrials.Gov: http://clinicaltrials.gov/ct2/search. This was the source of the 115,000 trials figure. A portal to trial registrations is the WHO International Clinical Trials Registry Platform: http://apps.who.int/trialsearch/Default.aspx. An important mechanism has been the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use, an ongoing collaboration of the pharmaceutical regulators of Europe, Japan, and the US, and representatives of the regulated companies: www.ich.org/home.html. EU Directive on Good Clinical Practice in the Conduct of Clinical Trials on Medicinal Products for Human Use (2001/20/EC): http://eur-lex.europa.eu/LexUriServ/LexUri Serv.do?uri=OJ:L:2001:121:0034:0044:en:PDF. The Directive is being revised at the time of writing, but the data protection provision is unlikely to be changed.

Other, specialized laws and regulations

59

caution in releasing individual-level information in response to requests submitted under the Privacy Act or Freedom of Information Act. As was said, most clinical trials have to do with products. After a product is approved and released onto the market, a variety of follow-up activities are conducted to evaluate how well it performs in routine use with a variety of patients in diverse circumstances, investigate adverse events, fine-tune the applications or dosing, extend or reduce the range of allowable uses, improve the information provided to physicians and patients, and assess cost-effectiveness.20 Two sets of privacy issues perennially attend clinical trial and postmarketing follow-up work. The first has to do with how subjects (or better, cases) are tracked as trials progress and adverse event reports are investigated. Usually this is managed by assigning random study numbers or case numbers through which the sponsors and regulators can assemble and analyze records but which obscure the identity of the subjects, a weak form of key-coding. Because reports of a suspected adverse event may be sent to the product manufacturer and the regulatory agency by several medical sources and possibly by the patient as well, regulators usually agree that in order to reduce ambiguities and duplicates, reports can be tagged with the patient’s name initials and birthdates. How well this protects the patient’s identity depends on how securely and carefully the data are used. The other set of issues has to do with whether data collected during a clinical trial or postmarketing surveillance can be used later, by disparate divisions of the sponsoring corporation, a contract research organization that worked with the data, or others, to study the data for research purposes other than those of the trial, invite the subjects to participate in a new research project, or contact the subjects for market research. All of this should be anticipated in trial protocols, consent negotiations, and ethics review.

Other, specialized laws and regulations In various jurisdictions, specialized laws and regulations address issues that are not well covered by the basic human-subject protection regime, such as research involving prisoners or other especially vulnerable subjects; the use of human tissues, DNA, newborn screening samples, or stem cells in research; the development and use of research biobanks; or 20

A classic text that among many other things demonstrates the value of database research is Brian L. Strom, Stephen E. Kimmel, and Sean Hennessy (eds.), Pharmacoepidemiology, fifth edn. (Chichester: Wiley-Blackwell, 2012).

60

Healthcare, public health, and research regimes

research on such topics as mental health, assisted reproduction, gene therapy, or the effects of ionizing radiation.21

Research ethics review systems Since the mid-1970s, most health research on humans or on personally identifiable data everywhere has been subject to scrutiny by nominally independent committees, called research ethics committees, human subjects boards, Institutional Review Boards (IRBs), or similar.22 The IRB system in the US was established by the National Research Act in 1974. Shortly thereafter, a requirement for independent review was incorporated in the first revision of the Declaration of Helsinki (Article 1.2): “The design and performance of each experimental procedure involving human subjects should be clearly formulated in an experimental protocol which should be transmitted to a specially appointed independent committee for consideration, comment and guidance.” Under these regimes, researchers submit detailed descriptions of their proposed research (“protocols”) and request either approval or a waiver of the need for detailed review. The committees’ mandates are to protect the rights and welfare of research subjects by judging the proposals in light of recognized ethical standards and any special local concerns. Among many other things, they review the research purposes, how participants will be selected, recruited, and informed, what data and biospecimens will be collected and how they will be managed, what the benefit/risk calculus seems to be, and how the project will conform to legal and regulatory requirements. Privacy and confidentiality, and the related factors of notice, consent, identifiability, safeguards, and data sharing, are among the issues most often subjected to scrutiny. Given the complexity of current datahandling systems, though, and the technical expertise required to review such matters as the adequacy of data de-identification, increasingly questions are being raised as to whether ethics committees have the 21

22

Over 1,000 laws, regulations, and guidelines in more than 100 countries and from some international organizations are catalogued in US Department of Health and Human Services, Office for Human Research Protections, “International compilation of human research protections” (2012): www.hhs.gov/ohrp/international/intlcompilation/intlcompilation.html. The portal to information about Institutional Review Boards in the US and extensive guidance on compliance with the Federal Common Rule is: www.hhs.gov/ohrp. The portal to information about Research Ethics Committees in the UK is: www.nres.npsa. nhs.uk. A monograph mainly relating to the UK Research Ethics Committee system is Biggs, Healthcare Research Ethics and Law.

Research ethics review systems

61

competence to make the required judgments. An option can be to refer issues to ancillary expert committees. The activity has burgeoned beyond anything imaginable when it was started. Around 3,000 IRBs are active in the US and around 1,000 research ethics committees in Europe. Their composition, size, training, and mandate vary considerably. Usually the committees’ decisions are addressed to both the researchers and their institutions. Review may be thorough and full-dress for projects having novel features or possibly posing more than minimal risk; or it may be cursory or fast-tracked for those of a familiar genre and considered to be of low risk. A critique is beyond this book. Because it is inherently a judgmental activity, ethics review evokes controversy, and the literature on it is voluminous. There is no question that ethics review often improves the way projects are conducted and their acceptance by participants and society. It serves as a check against possible conflicting interests on the part of researchers, deters misconduct, and helps certify the correctness of procedures. And, of course, the anticipation of ethics review, and awareness of the criteria that will have to be met, induces researchers to take the ethical and legal issues into account as they plan projects in the first place. Because the committees usually include members having, variously, expertise in assorted scientific and healthcare specialties, statistics, information technology, ethics, or law, and lay public members as well, different committees often arrive at idiosyncratically differing conclusions regarding a single project. Multisite projects may have to seek approval from many ethics committees, perhaps in many countries. Meeting documents sometimes run through hundreds of pages, and there have been instances of projects having to obtain approval from 100 ethics committees. Approval can be a long and tedious process. In general, ethics review is a mechanism for bringing widely held values and special local concerns to bear on research plans, and for reassuring research subjects, the general public, funders, journal editors, and research and political leaders that proper precautions are being taken. In the extreme, ethics review can be viewed as an institutionalization of mistrust of researchers. Basically it is a governance and public-assurance mechanism. Apart from the inefficiencies and inconsistencies, a detriment for society, one has to say, is that ethics review, and the regulations that establish it, tend to be highly individualistic, focusing more on protecting research subjects and less than they could on helping advance knowledge for the common human good.


"Rules" thickets in four countries

The following sketches indicate the kinds of problems that research programs in many developed countries are having to deal with. They also indicate some system-building pitfalls that countries whose health research capacities are now maturing should try to avoid if they can.

The US. As has been described, research on human subjects is generally regulated by the Common Rule, and research using information recorded in the course of health care or payment is regulated by the HIPAA Privacy Rule. But the Privacy Rule "does not protect privacy as well as it should," a committee of the Institute of Medicine concluded in 2009, "and as currently implemented, it impedes important health research." The overall problems, in the view of the committee, are that the Privacy Rule:
□ is not uniformly applicable to all health research;
□ overstates the ability of informed consent to protect privacy rather than incorporating comprehensive privacy protections;
□ conflicts with other federal regulations governing health research;
□ is interpreted differently across institutions; and
□ creates barriers to research and leads to biased research samples, which generate invalid conclusions.23

23. Institute of Medicine, Beyond the HIPAA Privacy Rule, p. 2.

The committee recommended that Congress authorize an entirely revised approach, one that would exempt health research from coverage by the HIPAA Privacy Rule but at the same time establish statutory protection of privacy in all health research. Apparently such comprehensive, statutory – that is, legislated – protection has never been enacted anywhere. The proposal should be kept in view and sustained efforts made to refine it. (See Box 3.)

Realizing that Congress might not adopt this ambitious recommendation, the committee proposed as a fallback a menu of revisions to the Privacy Rule. It also urged reconciliation of the Privacy Rule with the Common Rule, argued that database research should be regulated separately from interventional research, and recommended that interventional research be made subject only to the Common Rule. No doubt the situation in the US will be in flux for some time. Similar problems exist elsewhere; other countries should benefit from observing the fray in the US.

Box 3. Institute of Medicine recommendation for protection of privacy in all health research*

Congress should authorize the Department of Health and Human Services (HHS) and other relevant federal agencies to develop a new approach to protecting privacy in health research that would apply uniformly to all health research. When this new approach is implemented, HHS should exempt health research from the HIPAA Privacy Rule . . . The new approach should do all of the following:
□ Apply to any person, institution, or organization conducting health research in the United States, regardless of the source of data or funding.
□ Entail clear, goal-oriented, rather than prescriptive, regulations.
□ Require researchers, institutions, and organizations that store health data to establish strong data security safeguards.
□ Make a clear distinction between the privacy considerations that apply to interventional research and research that is exclusively information based.
□ Facilitate greater use of data with direct identifiers removed in health research, and implement legal sanctions to prohibit unauthorized reidentification of information that has had direct identifiers removed.
□ Require ethical oversight of research when personally identifiable health information is used without informed consent. HHS should develop best practices for oversight that should consider:
– Measures taken to protect the privacy, security, and confidentiality of the data;
– Potential harms that could result from disclosure of the data; and
– Potential public benefits of the research.
□ Certify institutions that have policies and practices in place to protect data privacy and security in order to facilitate important large-scale information-based research for clearly defined and approved purposes, without individual consent.
□ Include federal oversight and enforcement to ensure regulatory compliance.

* Institute of Medicine, Beyond the HIPAA Privacy Rule, p. 3.

Canada. Canadian research is guided by the Tri-Council Policy Statement described above, and projects are overseen by a Research Ethics Board system. The Canadian federal and provincial data protection regimes are respected and are generally sensitive to the special issues of research. Much productive research is conducted in Canada using existing data.24 But, like other countries, Canada is finding itself ill-prepared to make efficient research use of health-related data that are handled electronically, especially the data that in the future will accumulate in electronic health records. A review preparatory to drafting a national roadmap for the use of electronic data for research identified "numerous challenges that are both complex and intertwined":25 confusion and uncertainty regarding law and policy; absence of clarity regarding consent for research use of personal information; heterogeneity in institutional policies and procedures and in the ethics review processes; insufficient capacity for secure management of data; low comparability of data; failure to design research use into the common interoperable electronic health record infrastructure; the proliferation of electronic databases; and political hurdles.

24. Flood (ed.), Data Data Everywhere.
25. Willison, Gibson, and McGrail, "A roadmap," p. 241.

Australia. Australian research ethics are founded on a National Statement on Ethical Conduct in Human Research, which is based on the values of respect, research merit and integrity, justice, and beneficence.26 It emphasizes sensitivity to cultural diversity and develops guidance in some detail. Although the National Statement is not legally binding, it bears the approval of the Federal Privacy Commissioner's Office, which is a good example of how the high-level requirements of an omnibus privacy law can be projected through the specifics of a leading research organization's guidance. The Statement guides the work of the Health Research Ethics Committees, and compliance with it is a condition for receiving National Health and Medical Research Council and other research grants. All research conducted in institutions administering Council funding must comply with the National Statement, regardless of other sources of funds.

26. Australian National Health and Medical Research Council, Australian Research Council, and Australian Vice-Chancellors' Committee, National Statement on Ethical Conduct in Human Research (updated 2009): www.nhmrc.gov.au/_files_nhmrc/publications/attachments/e72.pdf.

In its 2008 review of the Privacy Act 1988, the Australian Law Reform Commission recognized the special nature of contemporary health research and decided that the omnibus Privacy Act was too high-level an instrument to guide research practice. Accordingly, it recommended that the Federal Privacy Commissioner develop special regulatory "Research Rules" covering a range of requirements and exemptions, addressed to agencies and organizations conducting research. At the time of writing, the government's stance has been that it prefers the issues to be covered in whatever new Privacy Act emerges.27

27. Australian Law Reform Commission, For Your Information, pp. 2152–2199.

The UK. How complicated – one might say baroque – the research "rules" matrix can become is illustrated by the situation in the UK. Like other European countries, the UK relies on its Data Protection Act as the legal bedrock; it does so, for example, for National Health Service (NHS) data. But it also has many specialized laws and regulations, such as the Human Tissue Act, the Clinical Trials Regulations, and the Human Fertilisation and Embryology Act, which are consistent with the Data Protection Act but which establish rules, procedures, and governance at a level of detail that cannot be accommodated in an omnibus law. Officers called Caldicott Guardians oversee the use of clinical data in NHS operating units. Regulations relating to the use of NHS patient data without consent, issued under the NHS Act and the Health and Social Care Act, are administered by a National Information Governance Board. Research Ethics Committees apply guidance in reviewing project protocols. Rules are imposed as conditions of funding by the Medical Research Council (the government's principal biomedical research funding body), the Economic and Social Research Council (the government's funding body for those sciences), and charitable organizations such as the Wellcome Trust and Cancer Research UK. Guidance is issued by the General Medical Council (the medical profession's regulatory body), the royal medical colleges (societies of pathologists, psychiatrists, and other specialties), and other authoritative organizations such as the Human Genetics Commission and the Nuffield Council on Bioethics. The regulatory structures and rules in the four countries of the UK resemble each other, but they differ in specifics and procedures; cross-UK research can require considerable negotiation and coordination.28

28. A recent effort to increase consistency was the adoption by the four UK Health Departments of "Governance arrangements for research ethics committees: A harmonised edition" (2011): www.dh.gov.uk/prod_consum_dh/groups/dh_digitalassets/documents/digitalasset/dh_126614.pdf.

Confronting all this at the request of the government, the Academy of Medical Sciences conducted a broad consultation-based review and in 2011 published a report, A New Pathway for the Regulation and Governance of Health Research, which remarked that:

The existing regulation and governance pathway has evolved in a piecemeal manner over several years. New regulatory bodies and checks have been introduced with good intentions, but the sum effect is a fragmented process characterised by multiple layers of bureaucracy, uncertainty in the interpretation of individual legislation and guidance, a lack of trust within the system, and duplication and overlap in responsibilities. Most importantly, there is no evidence that these measures have enhanced the safety and wellbeing of either patients or the public.29

The report then set out a number of recommendations, relating to four objectives: creating a new Health Research Agency to rationalize the regulation and governance of all health research; improving the UK environment for clinical trials; providing access to patient data that protects individual interests and allows approved research to proceed effectively; and embedding a culture that values research within the NHS. The government received the report favorably and is considering how to pursue the recommendations.

It is not simply coincidental that the US Institute of Medicine, the Canadian roadmap project, the Australian Law Reform Commission, the UK Academy of Medical Sciences, and groups like them elsewhere are all pressing for regime reforms, and fairly similar ones at that. Countries' laws, regulations, and guidance differ in their traditions, coverage, criteria, and mechanisms. Elements of each have been developed in stages over long periods, responding to advances in science or health care, changes in the nature of data or data handling, shifts in national values or local concerns, and, in some instances, scandals. But because they inevitably confront similar research issues – and similar privacy and confidentiality issues – they resemble each other in many ways, including the thickets of impedances they have ended up presenting. These thickets need to be thinned, and at the same time the rules made more uniform internationally. Some countries may now want to explore the possibility of comprehensive statutory protection of the personally identifiable data used in health research, regardless of the source, topic, or research use of the data, and possibly also regardless of the funding or institutional setting.

29. Academy of Medical Sciences, A New Pathway, p. 6.

6 Consent

Ever since the formulation of the Nuremberg Code – the first article of which is "The voluntary consent of the human subject is absolutely essential" – consent to being systematically observed, experimented on, or having data about oneself analyzed has been broadly viewed as a right and a reassurance in research. It is often called the cornerstone of research ethics.

Consent is a priority requirement in the Declaration of Helsinki. Article 24 recites physicians' consent-related duties:

In medical research involving competent human subjects, each potential subject must be adequately informed of the aims, methods, sources of funding, any possible conflicts of interest, institutional affiliations of the researcher, the anticipated benefits and potential risks of the study and the discomfort it may entail, and any other relevant aspects of the study. The potential subject must be informed of the right to refuse to participate in the study or to withdraw consent to participate at any time without reprisal. Special attention should be given to the specific information needs of individual potential subjects as well as to the methods used to deliver the information. After ensuring that the potential subject has understood the information, the physician or another appropriately qualified individual must then seek the potential subject's freely-given informed consent, preferably in writing. If the consent cannot be expressed in writing, the non-written consent must be formally documented and witnessed.

Despite all the endorsement of consent, though, there are serious shortcomings. The influences of Nuremberg and Helsinki have been, and continue to be, profound. Both incorporate high moral principles and provide commonsensical ethical guidance. But both have limitations stemming from their historic origins and the passage of time. The Nazi doctors trial, United States v. Karl Brandt et al., was prosecuting the "brutalities, tortures, disabling injury, and death" inflicted by so-called medical experimentation on prisoners viewed by the Third Reich as being subhuman.1 Each of the ten points of the emerging Nuremberg Code refers generically to "the experiment" or "the experimental subject." The Code, drafted by the staff and advisors to assist the Military Tribunal in determining criminal culpability and punishment, was not addressed to anything as benign and life-enhancing as most research conducted today, and certainly not to non-experimental research.2

1. The experiments involved exposing concentration camp prisoners to high-altitude conditions, freezing, malaria, typhus, epidemic jaundice, poisonous gases and other noxious substances, heavy saltwater consumption, and incendiary bomb explosions; performing sterilization, bone transplantation, and bone, muscle, and nerve regeneration trials; and inflicting wounds and infecting them with streptococcus and tetanus to test the antibiotic effects of sulfanilamide.
2. The Code is available from many sources, including: www.hhs.gov/ohrp/archive/nurcode.html. The principles of "permissible medical experimentation" set out in the verdict that later, verbatim, became the Code are recorded in Trials of War Criminals before the Nuremberg Military Tribunals under Control Council Law No. 10 (October 1946–April 1949), vol. 2, pp. 181–183, available at: www.loc.gov/rr/frd/Military_Law/NTs_war-criminals.html. See also Paul Weindling, Nazi Medicine and the Nuremberg Trials: From Medical War Crimes to Informed Consent (Basingstoke: Palgrave Macmillan, 2004); Jochen Vollmann and Rolf Winau, "Informed consent in human experimentation before the Nuremberg code," BMJ, 313 (1996), 1445–1447.

As for Helsinki, the original versions of the Declaration, drafted by doctors for doctors in the early 1960s and 1970s, simply could not have anticipated today's health research opportunities and privacy challenges, and later versions, making conservative changes, have not fully kept up with the changes in either science or society. These two codes, and many guidelines and regulations evolved from them, are off-focus for much contemporary research to the extent that they:
□ are directed mainly at physicians, and have far less bearing for many members of health research teams who are neither medically licensed nor, in reality in many circumstances, working under close medical supervision;
□ focus on experimentation, i.e., direct manipulation of people's bodies, minds, or environments to observe what happens, and not on less intrusive modes of research;
□ hinge everything on the construct of informed consent; and
□ emphasize the protection of individuals over the facilitation of acceptably low-risk, well-governed research that has potential to benefit humankind.

It is becoming increasingly clear that the nearly absolute dependence of research ethics on consent is not justified. In their trenchant 2007 critique, Rethinking Informed Consent in Bioethics, Neil Manson and Onora O'Neill said: "We conclude that standard accounts of informed consent, standard arguments for requiring consent in clinical and research practice and standard ways of implementing consent requirements lead to intractable problems."3 Notice that their objections are to "standard" accounts, arguments, and implementation. It now has to be declared that in many situations, consent – as conventionally applied – is being stretched beyond legitimacy, may not actually protect privacy or confidentiality, and, when construed as an uncompromising expression of individual autonomy, often obstructs the pursuit of health research for the common good.4

3. Neil C. Manson and Onora O'Neill, Rethinking Informed Consent in Bioethics (Cambridge University Press, 2007), p. viii. See also Onora O'Neill, Autonomy and Trust in Bioethics (Cambridge University Press, 2002).
4. Many problems are probed in Oonagh Corrigan, John McMillan, Kathleen Liddell, et al. (eds.), The Limits of Consent: A Socio-ethical Approach to Human Subject Research in Medicine (Oxford University Press, 2009). A review of fundamentals, and one that in passing takes exception to aspects of Manson and O'Neill's analysis, is Roger Brownsword, "Consent in data protection law: Privacy, fair processing, and confidentiality," in Serge Gutwirth, Yves Poullet, Paul De Hert, et al. (eds.), Reinventing Data Protection? (Dordrecht and London: Springer, 2009), pp. 83–110. A diversity of views is expressed in Franklin G. Miller and Alan Wertheimer (eds.), The Ethics of Consent: Theory and Practice (New York: Oxford University Press, 2010); the chapter by Tom L. Beauchamp, "Autonomy and consent," pp. 55–78, discusses Manson and O'Neill's analysis.

Consent as it is applied now

Informed consent, in the conventional interpretation, is a mentally competent person's understanding, willing, unforced concession to some act that otherwise could be contrary to his or her interests. In research, informed consent can be viewed as a waiving of prohibitions protected by law or custom, such as breach of medical confidentiality, unreasonable search, nonconsensual touching, battery, or intrusion into personal or family life generally.

The somewhat ominous tone of the preceding sentence aside, consent often amounts to an enthusiastic "sign me up" for research. People volunteer to participate in research, and agree to have data about themselves used for research, all the time, without objection when involved and without regret afterward. Fortunately for us all.

Among other points, consent negotiations almost always address:
□ the purposes, overall plan, and an indication of the hoped-for eventual benefits for society;
□ the project leadership, institutional setting, and funding source;
□ any data or specimen collecting that may involve the person directly, the procedures involved, and the commitment of time and effort required;
□ any proposed use of existing data or biospecimens;
□ any foreseeable physical or emotional risks, and whether any resulting harms will be cared for or compensated for;
□ privacy and confidentiality assurances, data sharing expectations, and whatever can be said about privacy risks;
□ the possibility of future recontacting by researchers or intermediaries, and for what sorts of purposes;
□ reassurance that participation is voluntary, and that withdrawal is allowed via simple, nonprejudicial notice;
□ the safeguards, ethics review, governance arrangements, and experience on the part of the researchers and their organizations that support trusting;
□ anything asked about.

This list is a selection, emphasizing points relating to privacy, confidentiality, and trustworthiness. Often consent defines purpose limits, such as restriction of the uses of data or biospecimens to specified research topics (such as cancer research), classes of users (such as not pharmaceutical companies), or legal jurisdictions (such as not outside the country). Such limits may be set by the researchers, or requested by the data-subjects, or imposed by funders, ethics review bodies, or laws or regulations. An issue increasingly being addressed is the possibility that data or biospecimens will be transferred to other countries, and the protections that will accompany this.

Consent negotiations in clinical trials, which by definition involve direct intervention or manipulation, must make the subjects aware of what is entailed by explaining the procedures involved and the possible physical or psychological risks. And in order to temper patients' expectations, they may explain aspects of study design, such as double-blind random assignment to the experimental treatment or a placebo or other control.

Many other issues can be covered that don't relate to the protection of informational privacy or confidentiality, such as whether participants will be informed of findings about themselves, or whether anyone will be able to assert intellectual property rights in the results. Consent formulations vary with the kind of research and the cultural, ethical, and legal circumstances. There is an enormous and ever-expanding literature on consent and consent-seeking, and on ethics committee review of these matters.5

5. A classic text is Faden et al., Informed Consent; a legally oriented text is Jessica W. Berg, Paul S. Appelbaum, Lisa S. Parker, and Charles W. Lidz, Informed Consent: Legal Theory and Clinical Practice (New York: Oxford University Press, 2001). Much detailed guidance is available via the US Office for Human Research Protections, policy and guidance portal: www.hhs.gov/ohrp/policy/index.html.


Legitimately sought, meaningfully informed, willingly granted

These three, nearly universal, conditions for consent are in essence expressions of the more fundamental of the Helsinki and OECD privacy principles. In practice, viewed realistically, they can be difficult to attain.

Legitimately sought. Ethical and legal legitimacy relates at base to the nature of the relationship between the researchers seeking consent and the people from whom it is sought; the framing and context of the consent request; the communication language, level, and dynamics; and opportunities for the potential subjects to ask questions and get satisfactory answers. Other criteria of legitimacy are whether fair information practices are followed, such as making clear that consenting is not obligatory. An indication of legitimacy is being able to state that necessary approvals have been secured from ethics bodies or regulatory agencies.

What is most important is that consent exercises be conducted as negotiations, dialogues, two-way communicative transactions (as Manson and O'Neill call them) – far more than just pro forma routines for getting people to fill out forms.6 And they must never be deceptive or coercive, or involve the offering of inappropriate incentives. An important prohibition in the US, as in most countries, is that: "No informed consent, whether oral or written, may include any exculpatory language through which the subject or the representative is made to waive or appear to waive any of the subject's legal rights, or releases or appears to release the investigator, the sponsor, the institution or its agents from liability for negligence."7 Having a subject's signature on a consent form does not absolve researchers of responsibility.

Consent must be assumed to continue unless it is revoked, or until the activity consented to changes and the consent no longer fits. The most recent valid consent must be viewed as the definitive expression of the person's wishes. To ensure that consent continues to be meaningfully informed and willingly granted over time, especially as a long-term project evolves, it may be desirable to update the participants on the project's progress from time to time, and possibly also to seek reaffirmation or modification of the consent. Obviously, whether "reconsenting" (as it is sometimes called) is possible or desirable depends on the circumstances.

6. Manson and O'Neill, Rethinking Informed Consent.
7. US Department of Health and Human Services, Federal Policy on Protection of Human Subjects ("Common Rule"), §46.116.

Meaningfully informed. In consent negotiations, prospective participants usually are given leaflets and perhaps referred to a website, and are provided with the opportunity to discuss the project in person or by telephone or email with a qualified communicator on the project team. Recruitment campaigns, especially for projects searching for a large number of participants, may advertise via posters, newspapers, television, or a website, and they may seek news coverage. For projects hoping to involve many local or regional residents, or members of a group healthcare practice, or patients in the care of particular kinds of specialists, well-designed posters and brochures in waiting rooms can be useful. Sometimes patients' own physicians or other care providers help explain. (The willingness and competence of otherwise uninvolved physicians to explain projects cannot be assumed, however; projects may need to recruit and educate selected doctors or research nurses for this.)

Extensive experience has been accumulated regarding the readability of texts, the comprehensibility and ease of use of documents and websites, and the usefulness of layered information, and generally in recent years enrollment communications have become clearer. Researchers have realized that some basic points that they tend to take for granted may need to be explained, such as that most research is not expected to benefit participants themselves but other people in the long run, or that medical journals don't publish information or images that allow participants to be identified.

When data are collected via websites, as they are, for example, in some social or pharmaceutical survey research, the privacy notices to which the consent request refers should be clearly drafted and prominently displayed on the sites. One would think this to be obvious. But until recently notices have tended to be long, small-print, lawyerly – and ignored. The public and regulators are now demanding that notices crystallize points succinctly and be more readable and less legalistic.8 Many healthcare organizations are now taking advantage of the potential of websites to inform their constituents about how data are used and protected in research as well as in the provision of care.9

8. UK Information Commissioner's Office, "Privacy notices code of practice" (2009): www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/privacy_notices_cop_final.pdf.
9. One website that does so straightforwardly is US Veterans Health Administration, "Notice of privacy practices": www1.va.gov/vhapublications/ViewPublication.asp?pub_ID=1090.

Until recently the aspiration endorsed was "fully" informed consent. A problem is that becoming fully informed about the substance of a project often would require a depth of comprehension of the science, the health issues, information technology, ethics, and law, and a detailed appraisal of the privacy and confidentiality risks, that few research candidates could attain, or would want to try to attain. Even projects that make serious, sustained efforts to inform people have to simplify, and prospective participants differ widely as to how much and what kinds of information they want. Some people want to understand a lot; for them, detailed information should be made available, including the research protocol if they want it. But most others are content with sizing up the overall circumstances. Research nurses and others who conduct enrollments observe that people's questions and decisions tend to relate more to the general purposes and the trustworthiness of the arrangements than to the scientific details or technical plans.

Kathleen Liddell and Martin Richards put it bluntly in closing a collection of essays on consent: "The idea that researchers can and should provide complete and rigorous information relevant to the cool-headed, rational research participant who then proceeds to analyze it competently, confidently, and sceptically is utopian."10 There is no reason here to go on about this. One only has to look at a few research protocols or the resulting publications to appreciate the difficulty of comprehending them.

The problem was recognized by the EU data protection, police, and justice authorities in their joint report in 2009, The Future of Privacy. Although the Data Protection Directive supports "freely given, specific, and informed" consent as a legitimate ground for processing personal data, they had to concede that today's "complexity of data collection practices, business models, vendor relationships, and technological applications in many cases outstrips the individual's ability or willingness to make decisions to control the use and sharing of information through active choice."11

10. Kathleen Liddell and Martin Richards, "Consent and beyond," in Corrigan et al., Limits of Consent, p. 213. Liddell's own chapter in the book ("Beyond a rebarbative commitment to consent," pp. 79–97) is bracing and useful.
11. European Article 29 Data Protection Working Party and Working Party on Police and Justice, §67. But see the discussion on p. 44 above of the Article 29 Working Party's recent, and to this author disappointing, "Opinion on the definition of consent."

A more realistic and useful aspiration is meaningful consent: consent that helps people become aware of the plan generally and assess the circumstances and the trustworthiness of the project's leadership and governance. More will be said about this at the end of the chapter.

Willingly granted. The formulation "freely granted" is the one most often used, but "willingly" or "voluntarily" may be more appropriate adverbs, at least when reflecting about consent, as these imply freedom (from coercion, deception, and unfair persuasion) but, beyond that, positive affinity with the research opportunity on offer.


Because consenting to research must not be pressured or posed as a condition for medical treatment, a common challenge is how to proceed, sensitively, toward consent when it is sought soon after a patient has been diagnosed with a health problem, which is a natural time to start collecting data or biospecimens for research but a vulnerable and distracted moment for the patient. And surely, at least subtle pressure is always felt when physicians ask patients in their care to consent to research, whether their own projects or colleagues'. Approaches in such situations have to be handled carefully.

Special precautions must be taken with infants and children under the age of consent; with adolescents being questioned about sexual practices, pregnancy, abortion, or illicit drug use; with adults of diminished or fluctuating mental capacity; with women, in some cultures; with aboriginal or indigenous people whose values may not coincide with those of the mainstream culture; with patients in emergency or intensive care; with people who are socially marginal or constrained, such as drug addicts, sex workers, prisoners, street youth, or questionably documented immigrants; and with people who may find it awkward to refuse to consent, such as military personnel, or employees or students of the institution hosting the research. Most human-subject research regimes have developed guidance on how consent should be handled in such situations.12 Among possible strategies are proxy permission by family members or legal guardians, participation by a clinical psychologist or a chaplain in the consent discussion, or assurance that the data will be thoroughly de-identified before anyone other than the data collector can see them and that access to the data will be firmly restricted. Children in birth cohorts may be given an opportunity to consent for themselves when they reach the age of maturity.

12. See US Department of Health and Human Services, Office for Human Research Protections, "Policy and guidance."

Box 4. A plea by Neil Manson and Onora O'Neill*

The National Health Service – and similar healthcare institutions in other countries – could stop trying to implement ever more rigorous and numerous informed consent requirements, and could remove requirements that are either dysfunctional or unjustifiable (or both).

Research Councils and other funding bodies could stop funding work on "improving" consent procedures to make them fit for unachievable purposes, and could stop demanding the use of such procedures where they cannot or need not be used.

Manuals for Research Ethics Committees could be rewritten to ensure that the point and limits of informed consent and the standards it must meet are spelled out, and to deter inflationary elaboration of these requirements.

Regulators could insist that it is communication to relevant audiences rather than disclosure and dissemination that matters. They could judge medical and research performance by the quality of the communication achieved, and not by compliance with informed consent protocols whose use cannot be justified.

In the UK medical and scientific institutions could open an urgent and unaccommodating dialogue with the Information Commissioner, in the hope of securing agreement on an interpretation of the Data Protection Act 1998 for biomedical practice that supports justifiable rather than illusory conceptions of privacy . . .

Patient support groups could insist on forms of accountability that support rather than undermine the intelligent placing and refusal of trust by patients, and could challenge regulatory demands that impose dysfunctional forms of accountability.

Both individuals and institutions could do more to strengthen and support the parts of government that argue for – but so rarely achieve – "light touch" regulation.

* Manson and O'Neill, Rethinking Informed Consent, p. 199.

The casting of consent

Narrow versus broad consent. Often consent is tightly focused on the study of a particular illness or medical technique or product. This may be necessary, such as because of the burden of participation sought or the nature of the health or privacy risks involved. But many projects derive data that can later be put to constructive uses beyond those initially focused on, either by the original data collectors or by others. For several compelling reasons, it is important now that policies be amenable, for the benefit of society, to allowing at least fairly broad casting of consent even in some narrowly focused projects, and to justifying this in law and regulation. Broad consent can:
□ allow the use of data or biospecimens for purposes that simply cannot be foreseen initially, as for example is essential with research resource platforms;
□ accommodate the fact that remarkably often, data turn out to have relevance for – or even partially answer – research questions different from those initially focused on (or not even imagined), in effect altering the "purpose" in retrospect;
□ reduce the number of independent projects needed, thus helping avoid duplication of data and biospecimen collecting, conserving resources, and reducing recruitment fatigue in volunteers;
□ facilitate data sharing.

Many surprises emerge from long-term cohort projects and genomewide association studies. Many also emerge from studies of neurological, hormonal, or immunological features that have broad controlling, amplifying, or defending functions. Broad consent, or assent or authorization, will be essential if the data in networked electronic health record systems – linked to registries, genomic data, other data, and biospecimens – are to be tapped to optimal research advantage.13 The same is true for population research biobanks, social data archives, and other large research platforms.

13. A review of the situation in Canada but with resonance elsewhere, and regrettably not yet outdated, is Patricia Kosseim and Megan Brady, "Policy by procrastination: Secondary use of electronic health records for health research purposes," McGill Journal of Law & Health, 2 (2008), 5–45: http://mjlh.mcgill.ca/pdfs/vol2-1/MJLH_vol2_Kosseim-Brady.pdf.

Computer programs can be designed to filter research uses of data in accordance with participants' preferences (if uses and their relation to purposes and preferences can be defined), but if many people impose differing restrictions, the research utility can drop off (a sketch follows at the end of this section). Just as it is difficult to generalize about the relative personal and cultural sensitivities of different kinds of data, as was discussed in Chapter 2, it is almost impossible to define criteria for automatically filtering data uses according to generalized degrees of sensitivity. In practice, it appears that not many patients ask for restrictions on use. These considerations should be kept in mind as "patient centeredness" of electronic health records is promoted and systems are designed.

Formal support for broad consent has been growing. As far back as a decade ago the UK Information Commissioner's Office explained that "researchers engaged in open-ended studies are not prevented by the Data Protection Act from soliciting patient data on the grounds that their fair processing notices cannot be sufficiently detailed. Fair processing notices in this case simply need to make clear that the research in question is indeed open-ended, leaving the individual to assess the risk."14

14. UK Information Commissioner's Office, "Use and disclosure of health data" (2002), p. 7: www.ico.gov.uk/upload/documents/library/data_protection/practical_application/health_data_-_use_and_disclosure001.pdf.

Under the US HIPAA Privacy Rule, patients can authorize an institution involved with their health care or payment and covered by the Rule to allow data about themselves to be used in research. It then becomes the responsibility of the organization to evaluate access requests and make the data sharing decisions. A limitation in the Rule is that it requires separate authorization for each research project. The committee of the Institute of Medicine reviewing the Rule in 2009 recommended that the Privacy Rule (and, as necessary, also the Common Rule on Protection of Human Subjects) be revised to allow unspecified future research to proceed if the authorization describes the sorts of research that can be conducted and an Institutional Review Board or Privacy Board determines that any proposed new research is compatible with the consent and authorization and poses no more than minimal privacy risks.15

15. Institute of Medicine, Beyond the HIPAA Privacy Rule, pp. 166 and elsewhere.

Broad consent, complemented by appropriate safeguards and governance, should be much more widely supported. Such consent has long been accepted for well-managed longitudinal studies and research platforms, to little apparent detriment and much public benefit.

Explicit versus implicit consent. The issue here is the extent to which people thinking about becoming involved as research subjects must be asked to provide a distinctive and perhaps documented expression of consent, or whether consent can be inferred from actions or attitudes. In general, of course, explicit consent is considered a more reliable indication of acceptance. Implicit consent can be acceptable in situations in which people's views and expectations are made clear by their actions, such as when they send an unsolicited question or complaint about a medication to a drug manufacturer or regulatory agency, or continue to respond without objection to questionnaires from a research project. In some circumstances consent may be assumed if the foreseeable risks and burdens are low and it can confidently be believed, from opinion surveys or experience in similar situations, that almost everyone would consent if asked. This might apply with some health-related observational social research, for example.

Obviously consent can be formulated as any combination of narrow or broad, and explicit or implicit. A starting consideration always should be what people's reasonable expectations can be taken to be.

Opting-in versus opting-out. This issue is familiar from everyday experience with commercial transactions. Opting-in is more readily justified ethically: the default assumption is out, so the person's attention must be caught, an opportunity must be presented to become informed and reflect, and an in decision must be actively solicited. Opting-out can be more questionable, because the default option is in, and whether the person even considers the out option depends on whether their attention to the choice is caught. Opt-out is usually considered more justifiable if the choice is presented in a highly visible manner, the decision up for choice is explained clearly, and opting-out requires minimal effort, imposes no cost, and brings no consequences for the person. Opt-out is not necessarily a second-rate option; it can serve well, for instance, as a reassurance in trusted programs that presume consent in pursuing a collective public interest and using de-identified data.
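Here is the sketch promised above: a minimal Python rendering of a consent-preference filter of the kind just described. All field names and categories are hypothetical; a working system would need an agreed vocabulary of research purposes, user classes, and jurisdictions.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class ConsentPreferences:
    permitted_topics: Optional[Set[str]] = None   # None = no topic restriction
    excluded_user_types: Set[str] = field(default_factory=set)
    domestic_only: bool = False

def use_permitted(prefs: ConsentPreferences, topic: str,
                  user_type: str, user_country: str, home_country: str) -> bool:
    """Return True if a proposed research use respects the participant's
    recorded restrictions; each check defaults to permissive."""
    if prefs.permitted_topics is not None and topic not in prefs.permitted_topics:
        return False
    if user_type in prefs.excluded_user_types:
        return False
    if prefs.domestic_only and user_country != home_country:
        return False
    return True

prefs = ConsentPreferences(permitted_topics={"cancer"},
                           excluded_user_types={"pharmaceutical company"})
print(use_permitted(prefs, "cancer", "university", "FR", "FR"))               # True
print(use_permitted(prefs, "cancer", "pharmaceutical company", "FR", "FR"))   # False
```

The sketch also makes the drawback visible: every participant-specific branch narrows the pool of records available to a given study, which is why heavily divergent preferences erode research utility.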

Right to withdraw

A correlate to consent – the right to rescind consent or withdraw from participation – is endorsed in the Declaration of Helsinki and assured under most countries' laws and regulations. For example, the US Federal Common Rule requires that when seeking consent, it must be explained that "participation is voluntary . . . and the subject may discontinue participation at any time without penalty or loss of benefits to which the subject is otherwise entitled."16 But what does withdrawal imply? Surprisingly, many research plans and consent documents don't specify. One project that does is UK Biobank, which offers three options, as the "Ethics and Governance Framework" and information leaflet explain:

(a) No further contact. This means that UK Biobank would no longer contact you directly, but would still have your permission to retain and use information and samples provided previously and to obtain and use further information from your health records.
(b) No further access. This means that UK Biobank would no longer contact you or obtain further information from your health records in the future, but would still have your permission to use the information and samples provided previously.
(c) No further use. This means that, in addition to no longer contacting you or obtaining further information about you, any information and samples collected previously would no longer be available to researchers. UK Biobank would destroy your samples (although it may not be possible to trace all distributed sample remnants) and would only hold your information for archival audit purposes. Your signed consent and withdrawal would be kept as a record of your wishes. Such a withdrawal would prevent information about you from contributing to further analyses, but it would not be possible to remove your data from analyses that had already been done.17

The Framework adds the explanation that "UK Biobank will need to retain some minimal personal data for a number of reasons, which include: ensuring that participants who have withdrawn are not recontacted; and assessing the determinants of withdrawal and any impact on research findings. Participants who withdraw will be assured that this administrative record will not be part of the main database that is available to others."

A decision to discontinue some involvement may have to do with unease about the way a project seems to be going, or altered concern about privacy risks, or just weariness and not wanting to travel to a study center again or give another blood sample or answer any more questions. But whatever the reason, the act of withdrawing should be easy and not require justification (although, in order to manage more effectively, the project may want to ask in an unpressuring way whether any reasons might be offered). Projects should give serious thought to what withdrawal implies and how decisions to withdraw should be communicated, documented, acknowledged, and respected.

16. US Department of Health and Human Services, Federal Policy on Protection of Human Subjects ("Common Rule"), §46.116(a)(8); US Department of Health and Human Services, Office for Human Research Protections, "Guidance on withdrawal of subjects from research: Data retention and other related issues" (2010): www.hhs.gov/ohrp/policy/subjectwithdrawal.html.
17. UK Biobank, "Ethics and Governance Framework": www.ukbiobank.ac.uk/wp-content/uploads/2011/05/EGF20082.pdf. Recognizing young families' need for flexibility, the Avon Longitudinal Study of Parents and Children (ALSPAC) offers a varied menu of options, including temporary suspension of involvement: "ALSPAC withdrawal of consent policy": www.bristol.ac.uk/alspac/documents/ethics-full-withdrawal-of-consent-policy07022011.pdf.
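Because the three options are strictly nested, a data repository can enforce them as ordered tiers. A minimal sketch in Python, with hypothetical names; note that, as the Framework concedes, even "No further use" cannot retract analyses already completed.

```python
from enum import IntEnum

# Hypothetical encoding of the three UK Biobank-style withdrawal options.
class Withdrawal(IntEnum):
    NONE = 0                 # full participation
    NO_FURTHER_CONTACT = 1   # keep and use data/samples; stop contacting
    NO_FURTHER_ACCESS = 2    # also stop drawing on health records
    NO_FURTHER_USE = 3       # destroy samples; no new analyses of the data

def may_contact(w: Withdrawal) -> bool:
    return w < Withdrawal.NO_FURTHER_CONTACT

def may_fetch_health_records(w: Withdrawal) -> bool:
    return w < Withdrawal.NO_FURTHER_ACCESS

def may_use_stored_data(w: Withdrawal) -> bool:
    return w < Withdrawal.NO_FURTHER_USE

w = Withdrawal.NO_FURTHER_ACCESS
print(may_contact(w), may_fetch_health_records(w), may_use_stored_data(w))
# -> False False True: previously provided data remain usable
```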

Community engagement

Many projects and their prospective or active participants can benefit from community consultation, perhaps even starting before the project plan is fully shaped. Often this is conducted via informal discussions at places where the community tends to work or gather socially. It may proceed via focus group discussions, opinion surveys, open-house laboratory or clinic tours, online chats, or liaison committee representation. Sometimes it continues through the life of a project, to considerable benefit for everyone involved.


Whether community consultation is desirable, and what kind of engagement makes sense, depends on the nature of the project. It may also depend on whether a cohesive social community, or accepted representation from a community of interest, can be discerned, as with a definable group of residents or workers, patient advocates, an indigenous or tribal community recognized by law or custom, a group defining itself by religion or ethnicity, membership in a healthcare plan, or active participants in the project. Community engagement can elicit special concerns, foster mutual understanding, reduce a sense of exploitation, help improve project plans or operations, and build trust. Although it can be a useful adjunct to ethics review and consent negotiations, community acceptance is rarely considered a substitute for individual-level informed consent, at least in developed liberal societies.18

18. For discussion of some exceptional circumstances, see "Research involving the First Nations, Inuit and Métis peoples of Canada," in the Canadian Institutes of Health Research, Natural Sciences and Engineering Research Council of Canada, and Social Sciences and Humanities Research Council of Canada, Tri-Council Policy Statement, pp. 105–133; "Aboriginal and Torres Strait Islander Peoples," in the Australian National Health and Medical Research Council, Australian Research Council, and Australian Vice-Chancellors' Committee, National Statement on Ethical Conduct in Human Research, pp. 69–71; and US National Institutes of Health, "Points to consider when planning a genetic study that involves members of named populations" (2008): http://bioethics.od.nih.gov/named_populations.html.

Searching for research candidates

An issue that can be seen as a dilemma is whether and how permission must be sought to select and approach people to ask them to consent to become involved in a study or to have data about them studied – "consent to consenting," as it is sometimes referred to, a little too breezily. This should be dealt with easily by regulations. Trained medical-record reviewers, sensitized and sworn to the obligation to protect medical confidentiality, have long performed such a service in identifying potential participants for clinical trials, with contact with the patients then made, perhaps by letter, by the patients' physicians or by the healthcare organization with which the patients are affiliated. Similar approaches can serve well for other sorts of research.

Such "activities preparatory to research" are allowed under the US HIPAA Privacy Rule, which permits researchers to comb through medical records in hospitals or other covered institutions to identify appropriate study candidates – if the researchers formally assert that the information is necessary for the research and will not be used for any other purpose, and if they promise not to take away any protected health information, which includes personal identifying data. It is then up to the data-curating institution to decide whether to allow use of the records and make arrangements through which patients are invited to become involved.19 Recognizing such a need in the UK, a broad data sharing review led by Richard Thomas and Mark Walport recommended that "the National Health Service should develop a system to allow approved researchers to work with healthcare providers to identify potential patients, who may then be approached to take part in clinical studies for which consent is needed."20 Presumably this would apply to nonclinical studies as well. Again, although procedural and some ethical issues attend any such approach, especially as regards the eventual contacting of candidates, regulations should be able to establish acceptable policies.

19. US Department of Health and Human Services, Standards for Privacy of Individually Identifiable Health Information (Privacy Rule), §164.512(ii). For commentary, see Institute of Medicine, Beyond the HIPAA Privacy Rule, pp. 170–172.
20. Richard Thomas and Mark Walport, Data Sharing Review Report (2008): www.justice.gov.uk/reviews/docs/data-sharing-review-report.pdf.

Research without consent

Studies are often contemplated, especially retrospective secondary studies, for which appropriate consent does not exist. Efforts can be made to contact the data-subjects and request consent. This may, however, entail considerable effort, expense, and time, and if a sufficiently high proportion of the subjects are not found, or many of those who are found and contacted don't consent, or those who do consent aren't representative of the original population, the study may turn out to be statistically underpowered or skewed.21

21. Una Macleod and Graham C. M. Watt, "The impact of consent on observational research: A comparison of outcomes from consenters and non consenters to an observational study," BMC Medical Research Methodology, 8:15 (2008): www.biomedcentral.com/1471-2288/8/15; Michelle E. Kho, Mark Duffett, Donald J. Willison, et al., "Written informed consent and selection bias in observational studies using medical records: Systematic review," BMJ, 338:b866 (2009): www.bmj.com/content/338/bmj.b866.full.pdf; Khaled El Emam, Elizabeth Jonker, and Anita Fineberg, "The case for de-identifying personal health information" (2011), pp. 25–28, reviewing the issue of consent biases: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1744038.

Often it is highly impractical or simply impossible to contact the data-subjects, because such a long time has passed since the data were collected that the expectation of finding the people is low, or because of lack of resources. Tracing and contacting can be a project in itself, and can be expensive and slow, sometimes taking years. In many jurisdictions, ethics review bodies or regulators have the authority to waive obligations to obtain informed consent. The accepted criteria in Australia, Canada,

Consent

the UK, and the US for using personally identifiable data without consent include combinations of the following: □ The proposed project holds promise, and the data are essential for the project. □ Years have passed and many members of the study population are likely to be uncontactable because, for instance, they have relocated, changed their names upon marriage or divorce, severed their connections with the healthcare providers or researchers who originally collected the data, or died. □ The logistical effort and/or cost required for attempting to contact and seek consent is forbiddingly high. □ It is undesirable to seek new consent because recontacting, or attempting to recontact such as by inquiring of relatives or neighbors, could induce emotional or social stress or resentment. □ There is reason to believe, perhaps from an opinion survey, that the willingness among the contactable members of the population to consent is biased in ways that could skew the analysis. □ The privacy risks to the data-subjects will be very low, or at least proportionate to the expected personal or societal benefit, and safeguarding is assured. □ The research won’t affect the subjects’ rights or welfare. □ Any known preferences of data-subjects will be respected. □ Any identifying data will be protected and will be destroyed when feasible. □ Additional use will be restricted unless formally authorized.22 Three recent English policies are worth noticing together. First, the use of National Health Service (NHS) patient data can be approved by a central committee when a research project needs to use data in identified or partially identified form and it is “not practicable” to seek consent. Under Section 251 of the National Health Service Act 2006, an Ethics and Confidentiality Committee (ECC) of the National Information Governance Board for Health and Social Care can review project applications and advise the Secretary of State for Health as to whether to set aside the common law duty of confidentiality for the application. Obligations under the Data Protection Act still apply, but Section 251 dispensation can allow such activities as locating, identifying, and contacting patients to invite them to 22

Australian National Statement, p. 24. Canadian Tri-Council Policy Statement, Article 5.5. Ontario, Personal Health Information Protection Act, §44(1). US Federal Common Rule, §46.116(d), and US HIPAA Privacy Rule, §164.512(i)(2)(ii). UK policies are cited in the following pages.

Research without consent

83

participate in research or allow their data or tissues to be used for research; de-identifying NHS data in preparation for use in research; and linking data from multiple sources for research.23 Second, the General Medical Council’s confidentiality guidance (2009) is aligned with this, advising doctors: 41. If it is not practicable to anonymise or code the information or to seek or obtain patients’ consent without unreasonable effort, and the likelihood of distress or harm to patients is negligible, disclosure for an important secondary purpose may be proportionate. You should respect patients’ objections to disclosure. 42. You may disclose identifiable information without consent if it is required by law, if it is approved under section 251 of the NHS Act 2006, or if it can be justified in the public interest and it is either: (a) necessary to use identifiable information, or (b) not practicable to anonymise or code the information and, in either case, not practicable to seek consent (or efforts to seek consent have been unsuccessful). 43. In considering whether it is practicable to seek consent you must take account of: (a) the age of records and the likely traceability of patients (b) the number of records, and (c) the possibility of introducing bias because of a low response rate or because particular groups of patients refuse, or do not respond to, requests to use their information.24 And third, the Human Tissue Authority’s code of practice on research (2009), interpreting the Human Tissue Act, provides that: 41. There is a . . . statutory consent exception for the use and storage of human tissue for research where all of the following criteria apply: 23

24

UK National Information Governance Board: www.nigb.nhs.uk/ecc. The Board is established by the Health and Social Care Act 2008. The ECC replaces the former Patient Information Advisory Group. The “advising” is a formal construction; it would be very rare for the busy Secretary to question the Committee’s decisions, or even be aware of them, unless some decision were truly controversial. A sense of the mechanism and decisions can be gained by scanning the ECC meeting minutes posted on the website. General Medical Council (UK), “Good Medical Practice Guidance: Confidentiality” (2009): www.gmc-uk.org/guidance/ethical_guidance/confidentiality_40_50_research_and_secondary_issues.asp.

84

Consent

– tissue is from a living person; and – the researcher is not in possession, and not likely to come into possession of, information that identifies the person from whom it has come; and – where the material is used for a specific research project approved by a recognised research ethics committee . . . 52. The results of DNA analysis can be used for research without consent, providing the bodily material from which the DNA is extracted: – is from a living person; and – the researcher is not in possession, and not likely to come into possession of information that identifies the person from whom it has come; and – the material is used for a specific research project with recognised ethical approval.25 These policies show a convergence that will doubtless benefit research. A similar US policy is described on p. 108. It would be a substantial advance if principles of this sort were more widely adopted. They lead to a lot of good.

Some reflections

Consent is a hallowed notion. But notions, however hallowed, should not be perpetuated beyond their genuine enactment of the moral, ethical, or legal purposes they are intended to serve. Consent negotiations, and the informing and notice-providing that they involve, serve many ends. They induce researchers to attend carefully to privacy and other ethical issues as they design projects, and then to provide carefully formulated information to ethics committees and prospective subjects. They engage the awareness of potential subjects and stimulate questioning. All this helps everyone become aware of the project circumstances and implications, reduces the chances of surprise, deception, coercion, and exploitation, and sets the stage for decisions. In many situations consent has served well, and it may continue to serve well – but not necessarily in the ways, or for the reasons, assumed by many current regulations or argued for in the drier reaches of bioethics. It is at the very least an overstretched notion with too much reliance placed on it.


Consent, based on candid and comprehensible information and sometimes rather extensive explanation, is essential for clinical or behavioral studies that amount to experimentation in which participants are asked to voluntarily take on some risk to body or mind. Consent is also important for many birth, twin, occupational, disease, or treatment cohort projects in which participants are asked to agree to sustained involvement and perhaps repeated contact, and so in which the consent exercise leads to the participants’ making commitments, as well as granting permissions. And consent can be a reassurance in projects that use existing data or biospecimens, although, as was argued earlier, it should be much less necessary for these than for interventional studies or those collecting new data, and for the reasons discussed on pp. 81–82, it may even be undesirable.

Probably consent serves less well in conventional terms, or, rather, serves in a derivative or proxy fashion, for some large cohort projects or research resource platforms such as biobanks, in part because of the communicative distance, in several senses, between the data-subjects and the data users. Often the scientists who eventually study the data or biospecimens will never meet the physicians or other intermediaries who contributed the materials to the resource, much less meet the data-subjects, and they may be far away physically, perhaps in foreign legal jurisdictions. Some studies may commence a decade or two after the data and/or biospecimens were originally collected, and the uses of collections may shift over time, as may the privacy issues. And there is always the difficulty of communicating scientific and medical concepts. The consent exercise in such situations just cannot be viewed as pursuing the same ethical and legal ends as when confronting, say, a proposal to agree to experimental surgery. Reliance on “informed consent,” as the catchphrase goes, as an indication that a participant comprehends in some depth the research purposes and plans, the information security measures, the compliance with laws and regulations, the risks to privacy and confidentiality, and so on, and that signing a consent form serves autonomy, can amount to a charade. Equally flimsy is reliance on a pablum of overly simplified information. The most serious ethical fallacy is when consent is posed as empowering people to “control” the use of “their” data or biospecimens, if in fact what they are doing is conceding to some study of data or biospecimens that are not really theirs in any strong sense in the first place. These limitations must be faced and policies rethought.

A conjecture, to invite reaction. In many circumstances now, frank recognition that consent amounts at its core to delegation of trust to a project’s leadership and governance is the only interpretation of consent that doesn’t amount to proceeding under false pretenses. Realistically what should be sought is general understanding of a project’s purposes, procedures, and risks, in as much depth as prospective subjects want, and then informed deference to the project’s setting, leadership, safeguards, and governance – i.e., consent construed as entrusting – which is what consenting often amounts to in reality. This fits with the conviction that consent must be meaningful and address people’s concerns, and it emphasizes the correlate responsibilities of researchers and their institutions. Perhaps this should be called assent, or authorization, to connote the revised sense. If this suggestion finds resonance, its implications for policy and practice should be explored. Change could be gradual, initially involving discussion of the issues by research leaders, research ethics and governance bodies, various regulators, and patient and privacy advocates; then asking legislators and regulators to consider tailoring policies to accommodate the special needs of health research without diminishing protections; and then over time revising ethics review and recruitment strategies, communications, and procedures to engage more fully with the circumstances that make trusting reliable. All along, the public interest in health research and public health activities should be emphasized.

Casting consent as entrusting accommodates several of the shifts argued for in this chapter – assuming in each case that proper safeguards are maintained and robust governance is exerted:
□ being amenable to supporting broad consent or authorization for unspecified uses of data or biospecimens;
□ justifying the pursuit of research without explicit permission when it is impractical or inappropriate to seek consent;
□ allowing qualified personnel to search through clinical or other records to identify potential research candidates;
□ reducing or waiving consent requirements for non-interventional, low-risk research such as research on de-identified existing data or biospecimens.
Each of these is practiced in various places, but none is practiced universally. Where they are not practiced, they deserve to be considered, and where they are practiced, they deserve to be refined and sanctioned in policy and law. The next chapter will suggest several policies that should be added to the list if identifiability issues are managed appropriately.

7

Identifiability and person-specific data

For research, identifiability is a watershed issue for two formal reasons:
□ if data or biospecimens are considered identifiable, they may be “personal data” or “individually identifiable health information” under legal interpretations, and a variety of privacy or data protection laws, professional duties, and custodial obligations may apply;
□ if identifiable data or biospecimens are used in research, or are being considered for use in research, human-subject research regulations may apply, imposing requirements relating to such matters as consent, ethics board review, and identifiability protection.

“Personal data”

Like privacy, to which it closely relates, identifiability is a familiar but surprisingly elusive notion. Probably it is best to think of identifiability as having to do with the extent to which data point to and are about an individual in a way that allows inferences to be made or actions to be taken that might affect the person. The EU Data Protection Directive and the national laws transposing it define identifiability as the extent to which data relate to a person. Although relating to is only a degree of difference from being about, if taken literally it would sweep in far more kinds of data, and softer and more mundane data, than formal regimes could usefully capture; so it tends not to be pressed to extremes. The legal dispute in Durant v. Financial Services Authority, to be sketched shortly, illustrates the difficulty with relate to. (For a time, concerned about partial genomic sequence data, this author was tempted to think of identifiability as having to do with the extent to which data are associable with a person, but has realized that this is unhelpfully more sweeping than relating to is. A problem clearly illustrated by genomic data, though, is that data that are merely associable with at one time can become revealingly about as science advances or the inquiry context changes.)


The challenge is to construe identifiability in a pragmatic manner, one that helps delineate data that deserve to be collected, handled, or protected in special ways – in particular because their use might affect judgments of, decisions about, or actions pertaining to people – from those that do not. An inherent problem is that identifiability runs a spectrum: from overtly identified, to indirectly identifiable, to non-identifiable for all practical purposes. All formulations have to accommodate the fact that even if data are not person-labeled en face, what is important is whether, with some effort such as consulting other information or taking circumstances into account, the data can be understood to be about a real person. Invariably the concern has to be with potentially identifiable as well as overtly identified data.

The EU Data Protection Directive says (Article 2(a)):
“Personal data” shall mean any information relating to an identified or identifiable natural person (“data subject”); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity.

Transposing the EU Directive into national law, the UK Data Protection Act sharpens the reference to indirect identifiability and relates identifiability to all that the data controller – the person who determines the purposes for which and manner in which the data are processed – might come to know (Article I.(1)):
“Personal data” means data which relate to a living individual who can be identified – (a) from those data, or (b) from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller . . .

In contrast to this crisp definition, the Australian Privacy Act’s definition is remarkably spongy (Article II.6.1):
Personal information means information or an opinion (including information or an opinion forming part of a database), whether true or not, and whether recorded in a material form or not, about an individual whose identity is apparent, or can reasonably be ascertained, from the information or opinion.1

1 Australia, Privacy Act 1988: www.comlaw.gov.au/Details/C2011C00157.


In a lapse of its generally very sensible perspective, in 2008 the Australian Law Reform Commission proposed revising this to read:
Personal information is information or an opinion, whether true or not, and whether recorded in a material form or not, about an identified or reasonably identifiable individual.2

This would be a slight improvement as regards indirect identifiability, but in including unrecorded information and opinions it would still be spongy. How can a mere opinion, possibly untrue and not recorded in material form, be regulated by informational privacy law?

The difficulty in applying such definitions in practice was evidenced by the fact that in 2007, 12 years after the EU Data Protection Directive was passed, the Article 29 Working Party found it necessary to prepare an “Opinion on the concept of personal data,” explicating at some length the four elements of the definition, i.e., “any information relating to an identified or identifiable natural person.” Personal data, it advised, should be viewed as being (partly paraphrasing here):
Any information – subjective as well as objective, “any sort of statement about a person,” including possibly incorrect information, and including biometric information defined to encompass not only fingerprints and voiceprints but “even some deeply ingrained skill or other characteristic (such as handwritten signature, keystrokes, particular way to walk and speak, etc.).”
Relating to – being “about” an individual and having to do with “content, or purpose, or result,” giving special attention to data that might be used to determine or influence how a person is evaluated or treated.
An identified or identifiable – distinguishable from other persons either directly or indirectly “by all means reasonably likely to be used either by the data controller or any other person,” with the “reasonably likely” proviso emphasized.
Natural person – mainly the living, but possibly also the not-yet-born or deceased, depending on implications for living persons.3

The Opinion and some other recent reports and court decisions have extended the notion of personal data to profiling, and even to what can only be called weak characterizations or simply clues, surely a development for which the clichéd expression “slippery slope” is justified. The Opinion says that “a name may itself not be necessary in all cases to identify an individual. This may happen when other ‘identifiers’ are used to single someone out.” It cites, among other examples, house asset value, Internet Protocol address of home computer, and a child’s drawing of her family submitted in court custody proceedings and cast as personal data about the character of the parents. “Simply because you do not know the name of an individual does not mean you cannot identify that individual,” the UK Information Commissioner explained. “Many of us do not know the names of all our neighbours, but we are still able to identify them.”4

Such expansive construals pose serious challenges to de-identification for health research, as well as to other processing of data. Moreover, in softening the focus of law and regulation and broadening the scope, they weaken the practical protection of privacy. It would be ridiculous, as a general practice, to have to treat all statements that are sort-of-about individuals, obtainable from any source by any means, and casually recorded or even unrecorded observations, and iffy opinions as well, as deserving formal protection.

A landmark legal judgment that hinged largely on the sense in which identified data related to a plaintiff and thereby comprised personal data was the case of Michael John Durant v. Financial Services Authority in the UK. Mr. Durant had demanded under the Data Protection Act to be provided with copies of certain paper records held by the regulatory Authority that pertained to a dispute he had had with Barclays Bank. The Authority had refused to provide them. No-one denied that the data involved his name; the issue was whether they comprised personal data under the Data Protection Act. In its decision, which continues to be debated, the Court of Appeal ruled that:
Not all information retrieved from a computer search against an individual’s name or unique identifier is personal data within the Act . . . Mere mention of the data subject in a document held by a data controller does not necessarily amount to his personal data. Whether it does so in any particular instance depends on where it falls in a continuum of relevance or proximity to the data subject [and whether] the information is biographical in a significant sense . . . The information should have the putative data subject as its focus . . . In short, it is information that affects his privacy, whether in his personal or family life, business or professional capacity.5

One limitation of such an interpretation is that data that relate to a person in one context may be squarely about them in another. This was dramatically illustrated by the case of a 15-year-old boy who, using online genealogy and person-tracing service data and some clever logic, identified and contacted the previously anonymous sperm donor who was his biological father. After the inferential linking by the boy, data that had merely, and separately, related to the man and the boy became clearly about them, and together.6

A current wave of issues has to do with whether data recorded in the course of everyday electronic transactions are personal data. Examples are data about movements, purchases, and communications recorded when people use airline tickets, passports or other national identity cards, public transport passcards, highway tollgate passes, or credit cards, and about telecommunication exchanges and web searches as these are recorded in service provider logs. If access is had to the relevant central databases, the data, many recorded in time-and-location detail, can usually, although not always, be referenced to the users. Should such data be treated as personal data? In the EU they are, if they are traceable to the users by anyone, including the service providers.7 Among the complications are that passes may be used by people other than those who paid for them by named credit card or check; passes, credit cards, or mobile telecommunication devices may be counterfeit or stolen ones; and computers may be manipulated by people other than the registered users.

Another wave of issues has to do with the extent to which the ordering and retrievability of data are relevant to whether they fall under regulations. The EU Data Protection Directive applies to “the processing of personal data wholly or partly by automatic means, and to the processing otherwise than by automatic means of personal data which form part of a filing system or are intended to form part of a filing system” (Article 3.1, emphasis added). The US Privacy Act, and some laws elsewhere, apply to “systems of records.” Many laws apply only to databases from which data can be retrieved by name, social security number, or other specific reference to individuals. Most of this is a legacy of the concerns decades ago about data in large paper archives and primitive computer databases, especially data held by government agencies. From time to time questions arise as to whether data are filed in a way that brings them under the coverage of a law.8 The issue is arising in new form now as regards the application of law to the delocalized and fragmented handling of data in the Internet cloud and the fog of online social networking. Almost all data used in health research are handled via highly structured and searchable systems, and thus fall squarely within current laws in this respect.

The last few pages have discussed “personal data” as they are defined by the EU Data Protection Directive and implemented by the Member States, and as they are at least approximately defined in a number of other countries’ laws. Some systems elsewhere, however, employ other strategies, such as focusing coverage specifically on data used or generated in the course of providing health care or in pursuing health research, referring to the data as “personally identifiable data” or similar. An example of the latter, the US HIPAA Privacy Rule, will be discussed later in this chapter.

2 Australian Law Reform Commission, recommendation 6–1.
3 European Article 29 Data Protection Working Party, “Opinion on the concept of personal data” (2007): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2007/wp136_en.pdf.
4 UK Information Commissioner’s Office, “Determining what is personal data” (2007), p. 6: www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/personal_data_flowchart_v1_with_preface001.pdf. Profiling per se is now attracting legal attention, mainly with respect to market research and behavioral advertising based on surveillance of people’s telecommunications, web habits, and GPS tracks. Notably see Council of Europe, “Recommendation of the Committee of Ministers to member states on the protection of individuals with regard to automatic processing of personal data in the context of profiling” (2010), CM/Rec(2010)13: https://wcd.coe.int/wcd/ViewDoc.jsp?id=1710949&Site=CM. The Recommendation defines profiling as “an automatic data processing technique that consists of applying a ‘profile’ to an individual, particularly in order to take decisions concerning her or him or for analysing or predicting her or his personal preferences, behaviours and attitudes” (Article 1(e)).
5 Michael John Durant v. Financial Services Authority [2003] EWCA Civ 1746, para. 28: www.bailii.org/ew/cases/EWCA/Civ/2003/1746.html.
6 For discussion, see Mark J. Taylor, “Data protection: too personal to protect?” SCRIPT-ed, 3(1) (2006): www.law.ed.ac.uk/ahrc/script-ed/vol3-1/taylor.asp.
7 An analysis of such a situation is European Article 29 Data Protection Working Party, “Opinion on geolocation services on smart mobile devices” (2011): http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2011/wp185_en.pdf.
8 UK Information Commissioner’s Office, “Determining what information is ‘data’ for the purposes of the DPA” (2009): www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/what_is_data_for_the_purposes_of_the_dpa.pdf.

Identifiers

Identifiers, or identifying data, are data that, at least figuratively, can be used to make contact with individuals. Prime among them are legal name and residential address, of course. Also prime is birthdate, which is especially strong because it doesn’t change and can be verified via birth certificates, and it can help confirm or refute whether a person is the person he or she is thought to be. A great many other data can point to a person, as the EU “Opinion on the concept of personal data” made clear. Whether a particular piece of data should be considered in any formal sense an identifier, and how strong an identifier, depends very much on the context.


A multitude of identifiers can be present in the kinds of data that are routinely entered in healthcare, public health, or research records, or in data derived from or linked to them, and used in research:
□ legal, administrative, communication, or demographic data (name, residential address or postal code, telephone and other telecommunication addresses, sex, birthdate, place of birth, age, marital status, number of children, citizenship, national identification number, military identification number, university student number, prisoner number, health or social welfare number, private insurance number, bank account number, credit card number . . .);
□ event dates (clinic visit, hospital admission or discharge, scan, prescription, blood draw, ambulance run . . .);
□ general descriptive attributes (height, weight, blood type, iris color, ethnicity, hair color, tattoos, piercing, scars, lisp, limp . . .);
□ biometric attributes (fingerprints, iris scan, facial photograph, voiceprint, genotype . . .);
□ certificates or licenses (school or university graduation, driver’s license, professional license or registration . . .);
□ relationships, roles, or social status (family, household, tribal, occupational, educational, social club, public figure or celebrity/notoriety status . . .);
□ health data (immigrant medical examination, drug prescription, genetic test, blood type, serial number of implanted medical device . . .);
□ indirect clues (language spoken, religious affiliation, names of physicians or other healthcare providers, hospital or pharmacy name and address, involvement in a clinical trial or longitudinal study, circumstances of hospital emergency room admission . . .).
Day-to-day operations clearly have to give priority attention to protecting what can be called the persistent identifiers, such as legal name, name initials, and birthdate, and to other strong identifiers such as sex, residential address, national identity or passport numbers, and Social Security numbers. Depending on the context and risks, attention may also have to be given to protecting less permanent, partial descriptors, called by some experts “quasi-identifiers.”

De-identification for research

Almost all health data originate as identified or lightly de-identified data. But in most circumstances researchers don’t need, and don’t want, to have to deal with the ethical and legal complications of working with identifiable data. Nor do their organization’s lawyers. Research is about cases, categories, and phenomena. So de-identification (often referred to colloquially but imprecisely in communications with the lay public as “anonymization”) is a crucial strategy. But because, as was remarked earlier, identifiability runs a spectrum from overtly identified, to indirectly identifiable, to non-identifiable for all practical purposes, saying merely that “the data will be anonymized,” as researchers too often do, is simply to begin a discussion. (A note to readers: Because many issues in this book have to do with the conversion of data from one identifiability status to another, for clarity the words de-, re-, non-, and un-identified or identifiable are hyphenated.)

Whether, how, and to what degree to de-identify data depends on the character and sensitivity of the data, requirements set by regulations, the risks, the safeguards, and what the de-identifying would lose for research. De-identifying is a craft, often a highly technical one, and within organizations it is a matter of discipline. Many techniques can be employed to de-identify data (two of them are sketched in code at the end of this discussion):
□ stripping-off names, addresses, and other overt identifiers, and either permanently destroying the identifiers, or arranging for the identifiers to be held separately and maintaining the ability, by an intermediary independent of the researchers, to link the substantive data back to them (i.e., key-coding, a technique to be discussed shortly);
□ aggregating, or averaging, data across records in various ways to generate a general representation, which may be used when individual-level data are not necessary for the research, or when it is necessary to indicate the general characteristics of a data-set;
□ broadening, coarsening, or rounding data, such as by converting birthdate to age or age range, or deleting the last digits of postal code so as to broaden the geographic zone;
□ obscuring date clues but preserving event spacing by offsetting dates, such as dates of diagnosis, lab tests, hospital discharge, and so on for medical episodes by a random but uniform interval within each individual’s record, but by differing random intervals across different people’s records (i.e., adding 119 days to subject A’s events, adding 44 to subject B’s, and so on); or, if appropriate, simply counting from a starting date such as date of birth in a study relating to early childhood, or date of first ingestion of a drug in a clinical trial (so, start +2, start +17 . . .);
□ obscuring location clues but preserving relative distances from a focal interest, such as a perinatal care clinic or a cell telephone transmission tower, by simply recording the radius and not the compass orientation from the center;
□ degrading, or fuzzing, the data, such as by adding statistical “noise” as though the data have become blurred during transmission, or randomly swapping similar bits of data among similar records;


□ releasing only a fraction of the records from a data-set, so that even if it is known that a person is in the data-set, it isn’t possible to infer whether any of the released data represent that person.9
Whenever feasible after any of the above, the data should be examined record-by-record by an expert, checking generally and suppressing or obscuring any data that occur with such rarity that they might be clues to identity (the unique case of xeroderma pigmentosum, the 15-year-old mother, the dermatological photograph that is partially masked but still shows a necklace carrying a religious symbol . . .).

Increasingly, de-identification is being performed at least partially by computers. Algorithms comb through the data-set and search for capitalized words (thus catching personal names, addresses, hospital names, pharmaceutical brand names, places of employment, etc.), scan for alphanumerical sequences (catching dates, postal codes, telecommunication addresses, license numbers, social security and healthcare system numbers, etc.), and so on; and, depending on the program and the data, either mask the identifiers, or substitute data randomly picked from large dictionaries of similar data, or signal for a human to make a judgment.10 Obviously such programs can complement and aid the work of human experts. The programs are becoming much better at scouring free text such as medical notes and patient or study-participant narratives, and some reflexively learn over time, such as how to discern from context and syntax to preserve personal names that are part of disease names, such as Hodgkin or Huntington, but to mask names preceded by Dr. or followed by MD. Their development should be pressed and their effectiveness evaluated. For now, though, there is probably still no substitute for scrutiny by an experienced statistician bent on searching for trouble.
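To make two of these techniques concrete, coarsening of quasi-identifiers and per-person date offsetting, here is a minimal sketch in Python. It operates on a simple dictionary-based record; the field names, the five-year age bands, and the offset range are illustrative assumptions, not prescriptions drawn from any of the policies discussed here.

import random
from datetime import date, timedelta

def deidentify_record(rec, offset_days):
    """Return a copy of one record with overt identifiers dropped,
    quasi-identifiers coarsened, and event dates uniformly offset."""
    out = {}
    # Overt identifiers (name, address) are simply not carried over.
    # Broaden birthdate to a five-year age band (approximate age in years).
    age = (date.today() - rec["birthdate"]).days // 365
    band = (age // 5) * 5
    out["age_band"] = f"{band}-{band + 4}"
    # Coarsen the postal code to its initial area portion only.
    out["postcode_area"] = rec["postcode"][:3]
    # Shift every event date in this record by the same random interval,
    # preserving the spacing between events but obscuring the true dates.
    out["events"] = [(name, d + timedelta(days=offset_days))
                     for name, d in rec["events"]]
    # Substantive clinical data pass through untouched.
    out["diagnosis_code"] = rec["diagnosis_code"]
    return out

record = {
    "name": "A. Smith", "postcode": "EH8 9YL",
    "birthdate": date(1961, 3, 14), "diagnosis_code": "L40",
    "events": [("biopsy", date(2011, 5, 2)), ("discharge", date(2011, 5, 9))],
}
# A different random offset is drawn for each person's record.
print(deidentify_record(record, random.randint(30, 365)))

Because every event in a record is shifted by the same interval, the intervals between events, often the analytically important quantity, survive intact, while the true calendar dates do not.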

9 Khaled El Emam and Anita Fineberg, “An overview of techniques for de-identifying personal health information” (2009): www.ehealthinformation.ca/documents/DeidTechniques.pdf; US National Heart, Lung, and Blood Institute, “Guidelines for NHLBI data set preparation” (2005): www.nhlbi.nih.gov/funding/setpreparation.htm; NHS National Services Scotland, Information Services Division, “Statistical disclosure control protocol,” version 2 (2010): www.isdscotland.org/isd/files/isd-statistical-disclosure-protocol.pdf; UK Office of National Statistics, “Disclosure control of health statistics” (2006): www.ons.gov.uk/ons/guide-method/best-practice/disclosure-control-of-health-statistics/index.html. Technical reviews are published in Josep Domingo-Ferrer and Emmanouil Magkos (eds.), Privacy in Statistical Databases (Berlin: Springer-Verlag, 2010), and earlier volumes in the series. See also National Research Council (US), Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data, Myron P. Gutmann and Paul C. Stern (eds.), Putting People on the Map: Protecting Confidentiality with Linked Social-Spatial Data (Washington, DC: National Academies Press, 2007), available at: www.nap.edu.
10 Stephane M. Meystre, F. Jeffrey Friedlin, Brett R. South, et al., “Automatic de-identification of textual documents in the electronic health record: a review of recent research,” BMC Medical Research Methodology, 10(70) (2010): www.biomedcentral.com/1471-2288/10/70.


Ways non-identified data can become identified

Now for the reverse direction. Data that are not identified – whether because they weren’t identified when initially collected, or were derived from non-identified biospecimens, or have purposely been de-identified – can become identified through three basic avenues:
□ Matching with identified or identifiable reference data. Matching with intent to identify involves searching for correspondence between elements of the data in question and analogous or identical elements in comparison collections. This is familiar from such routines as matching un-identified fingerprints against reference fingerprints. Whether clear identification results depends, obviously, on whether the match is with data that themselves reveal personal identities, such as name and address, or that can lead to revealing them if recourse is made to yet other data such as telephone directories or voter registration databases. Direct matching is the most certain of the identifying avenues.
□ Linking with external data and deducing the identity. In linking, data-sets are coupled to either create a partially merged data-set or search among the variables in the data-sets to look for identical or very similar data, such as age and age, sex and sex, disease and disease, biopsy date and biopsy date (or surgery date, which may closely follow a biopsy date). Because linking increases the number and diversity of data on view, it tends to increase the odds of identifying by deduction, even if it doesn’t identify conclusively. (Data linking for research is discussed in Chapter 10; a small illustration in code follows at the end of this section.)
□ Profiling. Profiling with intent to identify involves building up a sketch, which may suggest likenesses even if it doesn’t yield unique identities. As is familiar from television shows about profiling by clever and attractive police detectives, an accretion of probabilistic details may not directly identify an individual, but, especially if the candidate is known to be a member of a population of interest, it can narrow down the list of possible individuals.
Vulnerability through all of these avenues – matching, linking, profiling – has to be assessed before data are linked, or released into the public domain, or shared for research. And it has to be considered continually as data security is managed.

A recent analysis is worth mentioning, not because of the kind of data involved but because it serves notice as to the potential (for bad or good) in linking diverse data, even seemingly innocuous data. An identifier in the US that has long been considered vulnerable is the nine-digit Social Security number, which was established long ago to track financial earnings for taxation and pension purposes but which has become the de facto national identification number for lack of any other. In 2009 Alessandro Acquisti and Ralph Gross announced that:
Using only publicly available information, we observed a correlation between individuals’ Social Security Numbers (SSNs) and their birth data and found that for younger cohorts the correlation allows statistical inference of private SSNs. The inferences are made possible by the public availability of the Social Security Administration’s Death Master File and the widespread accessibility of personal information from multiple sources, such as data brokers or profiles on social networking sites.11

Deductive re-identification is facilitated by the vast range of data about people that can now be consulted without breaking laws. Among the sorts of information that can be readily accessed in most developed countries, either free or for a fee, are birth, marriage, divorce, adoption, and death registrations, home and business street addresses, telephone numbers, and newspaper obituaries. Publicly accessible databases may include those recording voter registration, motor vehicle, boat, or firearm registration, property ownership, court proceedings, criminal convictions, organization membership, professional licensing, government employment, or family histories. Mailing list brokers sell name-and-address targets in many lifestyle, health, and other marketing categories.12 Commercial people-search services use multiple databases to find, among others, birth parents and adopted children. Online social networking sites carry more clues to identity – including friends’, or “friends’,” identities, characteristics, and connections – than most participants realize. And of course healthcare databases hold uncountably more conditionally accessible data, including family information and payment and reimbursement details.

Novel thinking is needed. One problem is how promises not to attempt to identify shared data can be enforced. A proposal, by Robert Gellman, is that the US consider enacting a Personal Information Deidentification Procedures Act, meant to bind recipients of de-identified data to keeping the data de-identified. Under the statute and referring to it, a data provider and a data recipient could choose to enter into a contract defining responsibilities and agreeing to externally enforceable terms specified in the statute. The contract would recognize the data-subjects as third-party beneficiaries to the agreement, thus enabling them to seek legal damages in the event of negligence by the data provider or recipient (an action not available in most jurisdictions now, because data-subjects are rarely legal parties to data sharing agreements). The scheme could have particular applicability to the sharing of health research data. Whether or not the proposed law is adopted in the US, elements may deserve consideration when broader legislation is drafted in the US or elsewhere.13

11 Alessandro Acquisti and Ralph Gross, “Predicting Social Security numbers from public data,” Proceedings of the National Academy of Sciences, 106 (2009), 10975–10980.
12 Examples of data brokers in the US are NextMark, at http://lists.nextmark.com; and infoUSA, at www.infoUSA.com. In the UK, a commercial assembler of publicly available data, including electoral roll data, is 192.com People Finder: www.192.com.
13 Robert Gellman, “The deidentification dilemma: A legislative and contractual proposal,” Fordham Intellectual Property, Media, and Entertainment Law Journal, 21 (2011), 33–61, available online at: http://iplj.net/blog/wp-content/uploads/2010/11/C02_Gellman_010411_Final.pdf.
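The linking avenue can be illustrated in a few lines of Python. The records, the register entries, and the field names are all invented for the example; real attacks work the same way, only against far larger sources of the kinds just listed.

# Toy illustration of re-identification by linking: match records that were
# "de-identified" (names removed) against a public register on the
# quasi-identifiers that the two sources happen to share.
deidentified = [
    {"sex": "F", "birth_year": 1954, "postcode_area": "G12", "diagnosis": "C50"},
    {"sex": "M", "birth_year": 1987, "postcode_area": "EH8", "diagnosis": "F32"},
]
public_register = [  # e.g., an electoral roll, with names attached
    {"name": "Janet Doe", "sex": "F", "birth_year": 1954, "postcode_area": "G12"},
    {"name": "John Roe", "sex": "M", "birth_year": 1987, "postcode_area": "EH8"},
    {"name": "James Poe", "sex": "M", "birth_year": 1987, "postcode_area": "EH8"},
]
QUASI_IDENTIFIERS = ("sex", "birth_year", "postcode_area")

for rec in deidentified:
    candidates = [p["name"] for p in public_register
                  if all(p[q] == rec[q] for q in QUASI_IDENTIFIERS)]
    if len(candidates) == 1:
        # A unique match: the health record is now, in effect, identified.
        print(rec["diagnosis"], "->", candidates[0])
    else:
        # Several candidates: identity narrowed but not (yet) resolved.
        print(rec["diagnosis"], "-> ambiguous among", len(candidates))

The first record matches a single register entry and is thereby identified; the second narrows the field to two candidates, which a further linked source, or a bit of profiling, might then resolve.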

Retaining the possibility to re-identify

Irreversible de-identification, i.e., permanent anonymization, is not always, perhaps not even often, desirable. It can be very important to be able to trace back to original data or biospecimens, or to data-subjects themselves, such as to:
□ allow quality control and validation of source data in efforts to catch and eliminate duplicate cases, reduce errors, investigate suspected scientific misconduct, or confirm data submitted in regulatory or liability proceedings;
□ enable recontacting and follow-up if more data or biospecimens need to be collected or accessed for research;
□ ensure accurate correspondence when linking data with other data or with biospecimens;
□ review consent or authorization records or ethics board decisions relating to the data;
□ inform data-subjects, their physicians, or their relatives of findings that might be useful to them;
□ enable selection and recruitment of volunteers for future studies.
Sometimes reconnection needs to be made with the data-subjects personally, but usually it just needs to be made with medical records, a database, or biospecimens. Tracing back is much easier if the possibility is anticipated in the study protocol and enrollment, or in the design of the data sharing platform. Policies should make clear: what circumstances can justify tracing back; exactly which data can be sought, and what they can be used for; the conditions, such as whether prior approval must be obtained from an ethics or other oversight body; and who will do the tracing and via what informational trail. If contact is to be made with the data-subjects themselves, careful thought should be given as to who will do the contacting and what it will involve.

The HIPAA Privacy Rule approach

Perhaps the only law anywhere that addresses identifiability in any technical detail is the US HIPAA Privacy Rule.14 The Rule applies to “individually identifiable health information,” defined as (§160.103):
Information that is a subset of health information, including demographic information collected from an individual, and: (1) Is created or received by a health care provider, health plan, employer, or health care clearinghouse; and (2) Relates to the past, present, or future physical or mental health or condition of an individual; the provision of health care to an individual; or the past, present, or future payment for the provision of health care to an individual; and (i) That identifies the individual; or (ii) With respect to which there is a reasonable basis to believe the information can be used to identify the individual.

The Rule provides the following two options through which data can be deemed not identifiable and thus exempt from its coverage (§164.514).

Certification of low identifiability risk by a statistician. The first option is to have a statistician review the data and certify that “the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information.” This approach has been used on occasion, but apparently not often, because the Rule does not specify who can be considered a qualified statistician, or how to judge what information might be reasonably available or declare that a risk is very small. Also, statisticians have been concerned that in performing the service they could be subject to legal liability in the event of a dispute over some disclosure, with liability hinging on those vague criteria. (To be clear: Such identifiability assessment is performed by statisticians all the time for informal purposes, such as for managing data within organizations, but far less for certifying publicly that data are not individually identifiable under the HIPAA Privacy Rule.)

Removal of prohibited identifiers. The other option, sometimes called the “safe harbor” option, is to remove 17 listed types of identifiers and “any other unique identifying number, characteristic, or code” to derive a data-set for which the organization “does not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information.”15 (See the list in Box 5.) The List, as it is often called, comprises identifiers that are linked fairly directly somewhere to (in effect) name-and-address. Knowing a few of the elements on the List may or may not allow identification, and even knowing a person-unique fact such as a patient record number may allow identification only if it can be looked up in a master administrative file. The List does not include all of the usual descriptors of persons. Extraordinarily, there is no element for sex, although this is usually easy to infer from other data. Nor is there an element for health, illness, or disability characteristics, even those that may be enduring and evident to simple perception such as blindness, albinism, autism, partial facial paralysis, cleft lip, or cauliflower ear. Presumably the assumption is that these will be caught by element (R), “Any other unique identifying number, characteristic, or code,” even though this relegates a lot to judgment and the qualifier “unique” is subject to interpretation. Surely (R) must cover, for example, an International Classification of Diseases code; for instance, ICD-10 L40 = psoriasis vulgaris, a chronic skin condition often evident even to passing view of a short-sleeved arm in public, and one that is likely to be under long-term treatment.16 But why not illnesses generally, or at least fairly rare ones? And surely genomic data will be an element in future versions of the list, if there are any; even now they would seem to be covered, in that they are both biometric identifiers and unique identifying characteristics.

14 The Rule was described on pp. 52 and 62 above. See also US Department of Health and Human Services, Office of Civil Rights, Workshop on the HIPAA Privacy Rule’s De-Identification Standard (May 2010): www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-identification/deidentificationagenda.html.

15 Notice that the term “safe harbor” is used for four very different arrangements relevant to health research: for the de-identification option under the HIPAA Privacy Rule described here; for highly restricted data-access systems, in some British usage; for the scheme under which US companies can self-certify to receive personal data from EU sources; and for dispensation under some data security breach laws from having to report a breach if the data are encrypted and thus can’t be read.
16 Grigorios Loukides, Joshua C. Denny, and Bradley Malin, “The disclosure of diagnosis codes can breach research participants’ privacy,” Journal of the American Medical Informatics Association, 17 (2010), 322–327, available online at: www.ncbi.nlm.nih.gov/pmc/articles/PMC2995712/pdf/amiajnl2725.pdf.


Box 5. The HIPAA Privacy Rule identifier list, §164.514(b)(2)*
(i) A covered entity may determine that health information is not individually identifiable health information only if . . . the following identifiers of the individual or of relatives, employers, or household members of the individual, are removed:
(A) Names;
(B) All geographic subdivisions smaller than a State, including street address, city, county, precinct, zip [five-digit postal] code, and their equivalent geocodes, except for the initial three digits of a zip code if, according to the current publicly available data from the Bureau of the Census: (1) The geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and (2) the initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000;
(C) All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
(D) Telephone numbers;
(E) Fax numbers;
(F) Electronic mail addresses;
(G) Social security numbers;
(H) Medical record numbers;
(I) Health plan beneficiary numbers;
(J) Account numbers;
(K) Certificate/license numbers;
(L) Vehicle identifiers and serial numbers, including license plate numbers;
(M) Device identifiers and serial numbers;
(N) Web Universal Resource Locators (URLs);
(O) Internet Protocol (IP) address numbers;
(P) Biometric identifiers, including finger and voice prints;
(Q) Full face photographic images and any comparable images; and
(R) Any other unique identifying number, characteristic, or code . . . and
(ii) The covered entity does not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information.
* US Department of Health and Human Services, Standards for Privacy of Individually Identifiable Health Information, §164.514.
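A sketch of what mechanical application of the safe harbor option can look like for a single structured record follows. The field names are illustrative, and the set of restricted three-digit zip codes is a placeholder; a real implementation would derive that set from current Census population counts, and element (R) would still call for human judgment.

from datetime import date

DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "phone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_number", "account_number",
    "certificate_number", "vehicle_id", "device_id", "url", "ip_address",
}
LOW_POPULATION_ZIP3 = {"036", "059", "102"}  # placeholder set, not the real list

def safe_harbor(record):
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # listed identifiers are removed outright
        if field == "zip":
            zip3 = value[:3]
            # Keep only the initial three digits, and only if the area they
            # denote contains more than 20,000 people; otherwise use 000.
            out["zip3"] = "000" if zip3 in LOW_POPULATION_ZIP3 else zip3
        elif field == "birth_date":
            out["birth_year"] = value.year  # dates reduced to year only
        elif field == "age":
            out["age"] = "90+" if value > 89 else value  # ages over 89 pooled
        else:
            out[field] = value  # substantive health data pass through
    return out

print(safe_harbor({"name": "A. Smith", "zip": "03602",
                   "birth_date": date(1920, 7, 1), "age": 91,
                   "diagnosis": "L40"}))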


Although other countries’ laws don’t address de-identification in any detail, identifier “strip-lists” are routinely, although not necessarily expertly, drawn up by units that use, share, or broker data for research. Recently Iain Hrynaszkiewicz and his colleagues elaborated upon the HIPAA list in suggesting guidance for de-identifying clinical data for publication, such as adding rare disease, occupation, and family structure as identifiers.17

Limited Data Sets. Recognizing the restrictive nature of the List, the Privacy Rule makes a concession to research by allowing data custodians to release Limited Data Sets for research, data from which many but not all core identifiers have been stripped. Names, electronic communication addresses, and biometric identifiers must not be present, for example, but birthdate, treatment dates, sex, cities, states, zip (postal) codes, and some other potentially identifying clues can remain, as well as substantive health information. Researchers applying for access to a Limited Data Set must specify which data they want and the intended uses, name who will be using the data, commit to enforcing safeguards, and promise that they will not attempt to identify the data-subjects or contact them. (See Box 6.)

Box 6. The HIPAA Privacy Rule Limited Data Set list*
A limited data set is protected health information that excludes the following direct identifiers of the individual or of relatives, employers, or household members of the individual:
(i) Names;
(ii) Postal address information, other than town or city, State, and zip code;
(iii) Telephone numbers;
(iv) Fax numbers;
(v) Electronic mail addresses;
(vi) Social security numbers;
(vii) Medical record numbers;
(viii) Health plan beneficiary numbers;
(ix) Account numbers;
(x) Certificate/license numbers;
(xi) Vehicle identifiers and serial numbers, including license plate numbers;
(xii) Device identifiers and serial numbers;
(xiii) Web Universal Resource Locators (URLs);
(xiv) Internet Protocol (IP) address numbers;
(xv) Biometric identifiers, including finger and voice prints; and
(xvi) Full face photographic images and any comparable images.
* US Department of Health and Human Services, Standards for Privacy of Individually Identifiable Health Information, §164.514.

Enough experience has been accumulated that the advantages and weaknesses of the List and Limited Data Sets can be evaluated. The Institute of Medicine review of the Rule concluded that:
Even the Privacy Rule’s deidentification standard may not be stringent enough to protect the anonymity of data in today’s technological environment. However, strong security measures . . . and the implementation of legal sanctions against the unauthorized reidentification of deidentified data . . . may be more effective in protecting privacy than more stringent deidentification standards.18

Identifiability risk assessment. Critical statistical work is now being done to improve assessment. Modeling and empirical testing are being performed to probe the relative contributions that different variables (such as postcode population size) can make to identifiability. Correspondingly, statistical tools are being developed that can reduce the identifiability of records to selected levels of acceptability.19

17 Iain Hrynaszkiewicz, Melissa L. Norton, Andrew J. Vickers, and Douglas G. Altman, “Preparing raw clinical data for publication: Guidance for journal editors, authors, and peer reviewers,” BMJ, 340 (2010), 304–307.

18 Institute of Medicine, Beyond the HIPAA Privacy Rule, p. 175.
19 Kathleen Benitez and Bradley Malin, “Evaluating re-identification risks with respect to the HIPAA privacy rule,” Journal of the American Medical Informatics Association, 17 (2010), 169–177. A method for tailoring patient data-set disclosure policies so as to reduce identifiability risks to levels no higher than those that the Safe Harbor method achieves, at least for demographic identifiers, is presented in Bradley Malin, Kathleen Benitez, and Daniel Masys, “Never too old for anonymity: A statistical standard for demographic data sharing via the HIPAA Privacy Rule,” Journal of the American Medical Informatics Association, 18 (2011), 3–10, available online at: http://jamia.bmj.com/content/18/1/3.full.pdf. See also Khaled El Emam, “Methods for the de-identification of electronic health records for genomic research,” Genome Medicine, 3(25) (2011), available online at: http://genomemedicine.com/content/3/4/25. Two groups that produce steady streams of work on identifiability and whose websites repay following are Khaled El Emam and his colleagues at the Electronic Health Information Laboratory, University of Ottawa and Children’s Hospital of Eastern Ontario (www.ehealthinformation.ca); and Bradley Malin and his colleagues at the Health Information Privacy Laboratory, School of Medicine, Vanderbilt University (www.hiplab.org).
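One simple, widely used ingredient of such risk assessment can be shown in a few lines: computing, for each record, the number of records that share its combination of quasi-identifiers (its “equivalence class,” the k of k-anonymity) and flagging those below a chosen threshold for further coarsening or suppression. The sketch is illustrative only; the field names, data, and threshold are invented, and the published methods cited above weigh many more factors.

from collections import Counter

def equivalence_class_sizes(records, quasi_identifiers):
    # For each record, count how many records share its quasi-identifier values.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [counts[k] for k in keys]

def flag_risky(records, quasi_identifiers, k=5):
    # Records in an equivalence class smaller than k are candidates for
    # further generalization or suppression before release.
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return [r for r, size in zip(records, sizes) if size < k]

records = [
    {"age_band": "50-54", "sex": "F", "postcode_area": "G12"},
    {"age_band": "50-54", "sex": "F", "postcode_area": "G12"},
    {"age_band": "85-89", "sex": "M", "postcode_area": "KW1"},  # rare combination
]
print(flag_risky(records, ("age_band", "sex", "postcode_area"), k=2))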


Key-coding

The method most widely used in health research for de-identifying data is key-coding, i.e., reversibly de-identifying. Overt identifiers are separated from the substantive data and locked away in secure physical or cyber storage, and the potential to reconnect them is maintained via an arbitrary linking code – the key. Held independently and securely, the key makes it possible to reassociate the substantive data with the identifiers if necessary.

A simplified illustration: substantive data about the data-subject are assigned the meaningless code {994HN3}, and the identifiers the code {TZ916Q}, and the substantive data and identifiers are then separated. The key to reassociating them is thus {994HN3↔TZ916Q}. In some techniques, barcodes are used as keys. For high-sensitivity data, the codes themselves can be further encrypted.

The key and the responsibility for its use can be lodged with whoever originally collected the data, or entrusted to an ethics board, a data access committee, a designated government unit, or other reliable disinterested intermediary, sometimes called an “honest broker.” Use of the key can be guided by agreed criteria. Key-holders can be bound by solemn confidentiality commitments. Unauthorized use of the key can be made subject to penalty. And the whole process can be subjected to oversight.20 Key-coding can be used across multiple databases and between data and biospecimens, and it can be used to keep an individual’s data and biospecimens cross-referenced to each other even if the links to the person or intermediary data and specimen providers are irreversibly severed.

20 An example of a practical key-coding policy is Duke University Health System Institutional Review Board, “Policy on IRB determination of research not involving human subjects for research using coded specimens or coded identifiable private information” (2008): http://irb.duhs.duke.edu/wysiwyg/downloads/Coded_Specimens_and_Coded_Identifiable_PHI_Policy_05-15-08.pdf.

Use of the term “key-coded” avoids such awkward expressions as “pseudonymized,” and even worse, “pseudoanonymized.” “Encryption” is now taken in everyday speech to mean the transforming of information into ciphertext to keep it secret, as when credit card data are sent through the Internet. “Coding” is universally used in medical practice to refer to the classification of diagnoses, drugs, and procedures to standard categories. The central feature of a system that maintains the potential to reassociate substantive data with identifying data is the key: hence, key-code. The term has the advantage that it is easily understood by the public.

The HIPAA Privacy Rule allows, and in effect encourages, key-coding, although it doesn’t use the term (§164.514(c)):



A covered entity may assign a code . . . to allow information deidentified under this section to be re-identified by the covered entity, provided that: (1) Derivation. The code . . . is not derived from or otherwise related to information about the individual and is not otherwise capable of being translated so as to identify the individual; and (2) Security. The covered entity does not use or disclose the code . . . for any other purpose, and does not disclose the mechanism for re-identification.

Aware that assiduous key-code management can be tedious and technically demanding, the Institute of Medicine review of the Privacy Rule recommended that “a better approach would be to establish secure, trusted, nonconflicted intermediaries that could develop a protocol, or key, for routinely linking data without direct identifiers from different sources and then provide more complete and useful de-identified datasets to researchers.”21 (The equivalent of key-coding can be performed by privacy-preserving data linkage systems, several of which will be described in Chapter 10.)

21 Institute of Medicine, Beyond the HIPAA Privacy Rule, p. 178.
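To make the {994HN3↔TZ916Q} illustration concrete, here is a minimal sketch of key-coding in Python. It is a sketch under stated assumptions, not a production design: the record, field names, and code format are invented, and the codes are generated randomly so that – in the spirit of §164.514(c) – they are not derived from information about the individual.

```python
import secrets

def random_code():
    # An 8-character random code, e.g. "9A4F01BC". Because it is random,
    # it is not derived from, or translatable back to, the data-subject.
    return secrets.token_hex(4).upper()

def key_code(record, identifier_fields=("name", "address")):
    """Separate a record into identifiers and substantive data, file each
    under an independent random code, and return the key linking them."""
    identifiers = {k: record[k] for k in identifier_fields}
    substantive = {k: v for k, v in record.items() if k not in identifier_fields}
    data_code, id_code = random_code(), random_code()
    key_entry = (data_code, id_code)  # held separately, e.g. by an "honest broker"
    return (data_code, substantive), (id_code, identifiers), key_entry

record = {"name": "A. Subject", "address": "1 High Street",
          "diagnosis": "E11.9", "year_of_birth": 1954}
data_item, id_item, key = key_code(record)
# Researchers receive only data_item; re-association requires the key,
# whose use is governed by the agreed criteria described above.
```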

Identifiability terminology

Viewed from the perspective of a researcher, data are either: (a) identified or easily identifiable, i.e., it is possible without extraordinary or illegal effort to know who the data-subject is; or (b) key-coded, i.e., it is nearly impossible to know who the subject is as long as the identifiers are separated and kept beyond the key, agreed-upon restrictions such as promises not to attempt to identify the data-subjects are respected, and strong safeguards are maintained; or (c) non-identifiable, i.e., it is impossible for all practical purposes to know who the person is. Additional categories are not needed and are unhelpful. Synonyms of course are okay; “linked anonymised,” as is used in the UK, is equivalent to “key-coded.” Box 7 shows approximate synonyms used in various research cultures. “Nominative” is rarely used now. “Encrypted,” and “coded” without the prefix “key-,” should be avoided because they are popularly used with other meanings. “Anonymous” is sometimes used if no identifiers are collected in the first place (such as in a blind survey of adolescents’ condom use), but this can be confused with “anonymized,” meaning that the data were once identified and have been purposely de-identified to some particular standard. “Anonymized” can be used in communications with the lay public; however, used alone it is imprecise in that it doesn’t indicate whether the de-identification is reversible, and so generally should be avoided in professional discourse. “Un-identified,” a term not in the table but one familiar from domestic life, just means that the status isn’t clear at the moment, as with an obscurely labeled sample turning up in the bottom of a freezer.

Box 7. Concordance of data identifiability terms

Identified or easily identifiable: personal; individually identifiable; person-identifiable; personally identifiable; nominative.
Key-coded: reversibly de-identified; linked anonymized; pseudonymized; pseudoanonymized; re-identifiable; encrypted; coded.
Non-identifiable: irreversibly de-identified; unlinked anonymized; anonymous.

A related concern about language. Too often, it is stated that a particular combination of partial identifiers makes some substantive data “identifiable,” when what is actually meant is that the combination of data is unique in the data-set or is unique or very rare in the population to which the data pertain, and so perhaps the data are only a stage or two from being identified in the (for short) name-and-address sense. Some statisticians refer to such cases, inelegantly but clearly, as “uniques.” A combination of descriptors, even a unique combination, may well not allow contacting or affecting a real person. Whether it does depends on how precise and accurate the pieces of data are, on what external information they would have to be examined with before a name-and-address identity might emerge, and on how accessible that external information is. The quality at issue is degree of distinctness.

Two examples of the alleged offense will illustrate. Genomicists routinely remark that genotype data are identifiable or self-identifying, when what they mean is that the combination of sequence-related data is in theory distinguishable from all other people’s genotypes. A report from the National Research Council on socio-spatial data referred throughout to the social data part of the combination as being self-identifying, without ever explaining why this should be so for social research data, which normally are carefully de-identified before being used in research. Aren’t such remarks misleading, at least to audiences who are not members of the technical in-group? Mightn’t a notion such as “individuated” be closer to a true depiction, despite its awkwardness? At the very least, scientists should be urged to carefully consider what they mean when they say that some data or biospecimens are identifiable.
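For readers who want the notion of “uniques” pinned down, here is a minimal sketch – with invented records and quasi-identifier fields – of how sample uniqueness can be counted. It illustrates the distinction drawn above: uniqueness in a data-set is a property of the data, not yet an identification.

```python
from collections import Counter

# Invented records; the quasi-identifier fields are illustrative only.
records = [
    {"year_of_birth": 1954, "sex": "F", "postcode_prefix": "CB2"},
    {"year_of_birth": 1954, "sex": "F", "postcode_prefix": "CB2"},
    {"year_of_birth": 1971, "sex": "M", "postcode_prefix": "EH8"},
]
quasi_identifiers = ("year_of_birth", "sex", "postcode_prefix")

# Count how often each combination of quasi-identifiers occurs.
combinations = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
uniques = [combo for combo, count in combinations.items() if count == 1]
print(f"{len(uniques)} of {len(records)} records are sample uniques")
# Whether a sample unique can actually be identified depends on what
# external information the combination can be matched against.
```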

Identifiable to whom?

A policy that needs to be vigorously promoted is this: If researchers who have access to carefully de-identified data cannot know the individuals’ identities and promise to safeguard the data and not attempt to re-identify them, then the data should not be considered personally identifiable data for those researchers. Failure to concede this causes many losses of research opportunity. Surely what is important is not whether a link of some sort exists somewhere, inaccessible to the researchers. This is a very important, practical issue. Any jurisdictions not currently endorsing such a policy should consider adopting it. The UK Information Commissioner’s Office has endorsed such a view:

The pragmatic line taken by the Article 29 Working Group is that where an organisation holds records which it cannot link, nor is ever likely to be able to link, to particular individuals, the records it holds will not be personal data. This will only be the case where it is unlikely that anyone else to whom the records may be released will be able to make such links. This will remain the case even if there is one organisation that would be able to make such a link as long as that organisation will not release information enabling such links to be made and adopts appropriate security. Where there is no likelihood of records being linked to individuals there is no real risk of any impact on those individuals.22

22 UK Information Commissioner’s Office, “Determining what is personal data,” p. 20.

Relating to this, the UK National Health Service is piloting a Proportionate Review Service, which allows qualified subcommittees of Research Ethics Committees to make a determination that a research application has “no material ethical issues” – such as because it will impose very low risk, burden, or intrusion on data-subjects – and thus does not require full review. One of the criteria favoring the proportionate, fast-track review is if the data or tissues will be kept “anonymous to the researcher.”23

23 UK National Patient Safety Agency, Proportionate Review Service: www.nres.npsa.nhs.uk/applications/proportionate-review.

No personal data: no human subject?

Returning to the second of the watershed points cited at the opening of this chapter, a crucial affirmation is this policy of the US Office for Human Research Protections (OHRP), with “coding” clearly meaning key-coding:24

OHRP considers private information or specimens not to be individually identifiable when they cannot be linked to specific individuals by the investigator(s) either directly or indirectly through coding systems. For example, OHRP does not consider research involving only coded private information or specimens to involve human subjects if the following conditions are both met: (1) the private information or specimens were not collected specifically for the currently proposed research project through an interaction or intervention with living individuals; and (2) the investigator(s) cannot readily ascertain the identity of the individual(s) to whom the coded private information or specimens pertain because, for example: (a) the investigators and the holder of the key enter into an agreement prohibiting the release of the key to the investigators under any circumstances, until the individuals are deceased . . . (b) there are Institutional Review Board (IRB)-approved written policies and operating procedures for a repository or data management center that prohibit the release of the key to the investigators under any circumstances, until the individuals are deceased; or (c) there are other legal requirements prohibiting the release of the key to the investigators, until the individuals are deceased.

24 US Department of Health and Human Services, Office for Human Research Protections, “Guidance on research involving coded private information or biological specimens” (2008): www.hhs.gov/ohrp/policy/cdebiol.html.

“Human subject,” according to the Common Rule on Protection of Human Research Subjects, “means a living individual about whom an investigator (whether professional or student) conducting research obtains (1) data through intervention or interaction with the individual, or (2) identifiable private information.”25 Thus the logical implication and a constructive policy is this: If research does not involve interacting with individuals, identifiable data are not involved, rules are complied with, and appropriate safeguards are maintained, then no real person is subjected to anything and the research does not constitute human subject research.

25 US Department of Health and Human Services, Federal Policy on Protection of Human Subjects (“Common Rule”), §46.102(f).

The OHRP Guidance “recommends that institutions should have policies in place that designate [an] individual or entity [other than the investigator] authorized to determine whether research involving coded private information or specimens constitutes human subjects research.” Among other things, such dispensation can obviate the need for consent. This policy should be defended in the US and considered for adoption elsewhere.26

26 Scenarios regarding consent and the use of biospecimens under the OHRP policy were discussed in US Department of Health and Human Services, Secretary’s Advisory Committee on Human Research Protections, letter of January 24, 2011 to the Secretary, attachment A: www.hhs.gov/ohrp/sachrp/20110124attachmentatosecletter.html.

Some reflections

Because identifiability is situationally relative and because records can carry many and diverse bits of data, it is difficult for laws and regulations to prescribe acceptable, quantitative thresholds of identifiability. And because of the availability of all the techniques and databases that can be applied in efforts to identify data, de-identification is often criticized now as a losing strategy. But it isn’t necessary to give up on de-identification; in fact, it is important not to. All depends on the degree of identifiability in each situation, the benefit–risk calculus, and the surrounding safeguards.27 Even many highly restricted data-access arrangements (such as those to be discussed in Chapter 10) provide access to data only after they have been key-coded or otherwise made non-identifiable to the researchers. If regulations and research ethics review allow, many projects can proceed confidently with careful de-identification and careful but not extreme safeguards gauged to the risks. Measures that should be pursued to advance this include:
□ further refining de-identification techniques and privacy risk assessment approaches;
□ continuing to strengthen physical, administrative, and cyber security in research centers;
□ discouraging the release into the public domain of potentially identifying healthcare data such as hospital discharge records;
□ supporting the policy that if researchers who have access to carefully de-identified data cannot know the identities and promise to safeguard the data and not attempt to re-identify them, then the data should not be considered personally identifiable data for those researchers;
□ supporting the policy that if research does not involve interacting with individuals, identifiable data are not involved, rules are complied with, and appropriate safeguards are maintained, then no real person is subjected to anything and the research does not constitute human subject research;
□ continuing to work on the cross-influences between identifiability and consent, and on the effectiveness that various versions of consent and de-identification in combination have for privacy protection.

27 Ann Cavoukian and Khaled El Emam, “Dispelling the myths surrounding de-identification: Anonymization remains a strong tool for protecting privacy,” discussion paper on the website of the Ontario Information and Privacy Commissioner (2011): www.ipc.on.ca/images/Resources/anonymization.pdf.

8

Genetics and genomics

The quandary of genetic exceptionalism – that is, whether genetic and genomic data should be handled and protected very differently from other kinds of health-related data – has been debated for several decades. In one view, data about heritable factors are becoming better understood and more useful for both research and health care, and so should simply be used along with other data and handled in the same ways. Besides, both research and care have long taken into account the way some illnesses tend to run in families, and medical genetics is a well-established field. In the opposing view, the transformational insights and exploitive potential, especially of genomics – for regrettable ends as well as for life-enhancing ones – and the newness of it all, demand exceptional treatment.

The author’s view is that genomics is both a scientific revolution and an ethics and policy game-changer. Powerful but as yet fairly uncertain personal implications can emerge from the research, and neither most research participants or patients, nor most ethics committees, doctors, judges, or governmental bodies are well prepared yet to comprehend and deal firmly with the personal or social consequences. (Despite many years of Nobel Prizes and magazine cover stories featuring the double helix, even many highly educated people are not yet able to recount even the most rudimentary DNA-to-RNA-to-protein story line, or say what proteins are, or explain why differences among protein structures might matter in bodily functioning.) Threats of genetic discrimination will have to be confronted, as will confusion about how determinative the genome is of one’s fate, as will apprehensions about the development of techniques that can be thought of as eugenics.

So, probably at least through some transition years as the science matures, the implications become clearer, and robust social controls are developed, genetic and genomic research should proceed cautiously and the data be handled with special care. Among other things, the informational richness holds significant consequences for identifiability and privacy.



The challenges arise from the facts. Each person’s genome is consolidated at conception and hardly changes during his or her lifetime, is staggeringly extensive and fine-grained, is identically present in every cell of the body except red blood cells, directly or indirectly influences all physical and behavioral attributes, and – in comprising some 3,000,000,000 code-bits (nucleotide base pairs) assembled in myriad combinations from the maternal and paternal strands of code-bits – is unique to the individual, except for identical twins. It indicates aspects of one’s ancestral origins, foreshadows possible aspects of one’s future, holds implications about one’s blood relatives, and becomes part of the makeup of one’s descendants.

Genetics/genomics

It is important to recognize the distinction between these two endeavors, because they work in generally different contexts, involve different sorts of data, and raise different issues.

Medical genetics, a discipline with a long and beneficent history, is the study of the inheritance of disease-related factors and application of the science in diagnosing and treating genetic disorders. Until recently it has focused on diseases mediated by a single or only a few genes (“Mendelian” inheritance), such as phenylketonuria, congenital hypothyroidism, sickle cell anemia, and cystic fibrosis, and on a few arising from abnormal arrangement, or loss or gain of segments, of chromosomal material, such as Down syndrome. These are conditions for which the disease manifestations are evident, genes are highly determinative, diagnosis is reliable, and clinical care and counseling can be provided. It has also made progress on thousands of rarer diseases that have more complex genetic origins, and on some disorders, such as cleft palate, that have several causes but which seem in many cases to result from interaction between genes and the cellular environment.1 It is probably fair to say that classical genetic science is limited in its ability to tease out the factors in multifactorial conditions.

1 The authoritative compendium, updated daily, is the website “Online Mendelian inheritance in man”: www.ncbi.nlm.nih.gov/omim. The website is a monument to dedicated research by many biomedical investigators and sustained participation by patients and families affected by genetic disorders.

More and more clinical genetic tests are being developed that can usefully inform treatment, lifestyle, and reproductive decisions. Newborn blood screening is widely conducted to detect some of the more prevalent Mendelian disorders like those just mentioned. In some places embryos are checked for deleterious mutations before being used for in vitro fertilization. Now genetic testing is approaching a challenging phase-change as genomic science is making a different kind of (almost) routine “testing” possible via sequencing and other genotyping, raising issues as to whether testing or screening should be broadened to include illnesses influenced by many heritable factors, each of which exerts only a modest effect on health.

Obviously, the results of genetic tests ordered by a physician and performed under certified clinical laboratory auspices should be protected as medical data. Just as obviously, the results of direct-to-consumer tests performed for genealogy, paternity, health, or other purposes but outside of the medical context cannot be (unless, perhaps, they are submitted by the person for their medical record). It should be remarked that not all genetic research is mechanistic. Much has to do with exploring ways patients and their families can better compensate for and cope with the conditions.

Genomics, the science that coalesced around the Human Genome Project in the 1990s, is the dynamic study of the whole genetic apparatus and its functioning. Among other things, it can generate a detailed description of an individual’s genes, called a genotype. Although it is refining the understanding of classic genetic phenomena, for the most part it is probing the genome generally to understand how variations relate to health problems or drug responses, and studying interactions between genetic and environmental factors. Much of the current work is focusing on carcinogenesis and cancer, because cancer is basically a result of genomic malfunctioning.2

Genomic data are now being derived from a diversity of DNA sources in production-line fashion. Genomic research is being performed in many countries. The cost of sequencing is rapidly falling toward the goal of $1,000 per full genome.3 Biospecimens and genomic data are being linked with health, family history, and social data for research. The foundations of genomically personalized medicine and public health genomics are being laid.4 All of this is wonderfully promising. But again, it is raising novel concerns regarding such matters as notice, consent, identifiability, and data sharing, which affect privacy.

2 For a vision of the future, see Eric D. Green, Mark S. Guyer, and the US National Human Genome Research Institute, “Charting a course for genomic medicine from base pairs to bedside,” Nature, 470 (2011), 204–213.
3 Kevin Davies, The $1,000 Genome: The Revolution in DNA Sequencing and the New Era of Personalized Medicine (New York: Free Press, 2010).
4 Francis Collins, The Language of Life: DNA and the Revolution of Personalized Medicine (New York: HarperCollins, 2010); Khoury, Bedrosian, Gwinn, et al. (eds.), Human Genome Epidemiology.



GWAS

Genome-wide association studies (GWAS) are among the most common analyses being conducted today. These scan as many as a million independent sequence markers called single nucleotide polymorphisms (SNPs) across the genomes of thousands of people known to have a disease or disease factor, or a pharmaceutical response factor such as a necessary metabolic enzyme, and compare these with the markers in the genomes of people not having the disease or factor, searching for associations. How strongly causal the factors are, and what the mechanisms of causality are, must then be investigated using more in-depth techniques. The explorations are scientifically illuminating, and they are starting to generate insights relevant to health care.5

For statistical power, they must examine data about very many people, as Paul Burton and his colleagues have calculated: “Any research infrastructure aimed at providing a robust platform for exploring genomic association will typically require several thousands of cases to study main effects and several tens of thousands of cases to properly support the investigation of gene–gene or gene–environment interaction.”6

In order to assemble the requisite number and diversity of cases, many genome projects exchange biospecimens and data internationally. For example, an International Cancer Genome Consortium involving many funding organizations and hundreds of scientists in Asia, Australia, Europe, and North America is coordinating the sharing of high-quality data and tumor samples globally in an effort “to generate comprehensive catalogues of genomic abnormalities . . . in tumors from 50 different cancer types and/or subtypes which are of clinical and societal importance across the globe and make the data available to the entire research community as rapidly as possible, and with minimal restrictions, to accelerate research into the causes and control of cancer.”7

5 Teri A. Manolio, “Genomewide association studies and assessment of the risk of disease,” New England Journal of Medicine, 363 (2010), 166–176. A continually updated resource is L. A. Hindorff, J. MacArthur, A. Wise, et al., “A catalog of published genome-wide association studies”: www.genome.gov/gwastudies. At the time of writing, the catalogue had recorded almost 1,000 studies.
6 Paul R. Burton, Anna L. Hansell, Isabel Fortier, et al., “Size matters: Just how big is BIG? Quantifying realistic sample size requirements for human genome epidemiology,” International Journal of Epidemiology, 38 (2009), 263–273. See also Timothy Caulfield, Amy L. McGuire, Mildred Cho, et al., “Research ethics recommendations for whole-genome research: Consensus statement,” PLoS Biology, 6(3), e73. doi:10.1371/journal.pbio.0060073 (2008).
7 International Cancer Genome Consortium: http://icgc.org.
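To give a sense of the core computation – though certainly not of the consortium-scale pipelines, with their quality control, covariate adjustment, and population-structure corrections – here is a minimal per-SNP association test in Python. The genotype data are invented, and the significance threshold mentioned in the comment is a common convention rather than anything prescribed in this book.

```python
from scipy.stats import chi2_contingency

def allele_association(case_genotypes, control_genotypes):
    """Chi-squared test on a 2x2 minor/major allele-count table.
    Genotypes are minor-allele counts per person (0, 1, or 2)."""
    def allele_counts(genotypes):
        minor = sum(genotypes)
        return [minor, 2 * len(genotypes) - minor]
    _, p_value, _, _ = chi2_contingency(
        [allele_counts(case_genotypes), allele_counts(control_genotypes)])
    return p_value

# Toy data for one SNP: the minor allele looks enriched among cases.
cases = [2, 1, 1, 2, 1, 0, 2, 1, 1, 2]
controls = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
print(f"p = {allele_association(cases, controls):.4f}")
# A GWAS repeats this for every SNP; with ~1,000,000 markers tested,
# stringent genome-wide thresholds (commonly p < 5e-8) are applied.
```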



EHR-driven genomic discovery

In what surely will become routine in the future, increasingly the data in electronic health records (EHRs) are being tapped for genomic investigations. As Isaac Kohane has explained, this enables two approaches:

In the first, patients with characteristics matching those of interest – for example, a disease category such as rheumatoid arthritis or lack of a clinical response to serotonin-specific reuptake inhibitor antidepressants – are selected via EHRs using a combination of structured, codified, and narrative text. The populations of patients thereby characterized are then recruited to provide samples or have their discarded clinical samples analyzed for genomic research. In the second, EHRs are used to provide additional clinical characterization or to fill in missing details on subjects whose samples have already been collected for either a biobank or other cohort study.8

An Electronic Medical Records and Genomics (eMERGE) Network of seven clinical centers in the US is exploring how well this can work. Linking medical-center EHR data with genotype data derived from DNA extracted from clinical specimens, they are searching for genomic factors relating to cataracts, dementia, diabetes, and other diseases, and factors relating to patients’ response to the lipid-lowering drugs called statins. In a coming phase the project will take stock of the subjects’ perceptions, but so far it appears that most are comfortable with the way the projects are collecting and sharing the data for research.9 (One of the projects, BioVU, is described on p. 148 below.)

8 Isaac S. Kohane, “Using electronic health records to drive discovery in disease genomics,” Nature Reviews Genetics, 12 (2011), 417–428.
9 eMERGE: www.mc.vanderbilt.edu/victr/dcc/projects/acc/index.php/Main_Page; Catherine A. McCarty, Rex L. Chisholm, Christopher G. Chute, et al., “The eMERGE Network: A consortium of biorepositories linked to electronic medical records data for conducting genomic studies,” Medical Genomics, 4(13) (2011), available online at: www.biomedcentral.com/content/pdf/1755-8794-4-13.pdf.
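As a toy illustration of the first approach Kohane describes – and emphatically not the eMERGE implementation – the sketch below selects an EHR cohort by combining a coded diagnosis with a crude narrative-text criterion. The patient records, field names, and code values are invented.

```python
def matches_phenotype(patient):
    """Combine structured diagnosis codes with a naive narrative-text check."""
    has_ra_code = "714.0" in patient["icd9_codes"]  # rheumatoid arthritis
    note_mentions = any("rheumatoid arthritis" in note.lower()
                        for note in patient["notes"])
    return has_ra_code and note_mentions

patients = [
    {"id": "P1", "icd9_codes": {"714.0", "401.9"},
     "notes": ["Seropositive rheumatoid arthritis, on methotrexate."]},
    {"id": "P2", "icd9_codes": {"401.9"},
     "notes": ["Hypertension follow-up; no joint complaints."]},
]
cohort = [p["id"] for p in patients if matches_phenotype(p)]
print(cohort)  # ['P1'] - candidates for genotyping under approved protocols
```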

Genotype-driven recruitment

A vexing challenge can arise in genomic research. The scenario is that researchers notice, either when running a study or when reviewing existing data, what seems to be an association between some gene variants and a disease phenomenon. They want to go back to the patients to collect more data or biospecimens, do sequencing, and investigate whether the variants are indeed risk factors. To broaden the coverage they may want to involve genetic relatives. Such recontacting, though, has the potential to reveal – or even suggest incorrectly, as it may turn out – that the participants or their relatives have the condition or a genomic predisposition to it. The process may transgress a “right not to know,” if such a right is recognized. And it can severely alarm people before the science becomes clear or any medical or counseling response is possible.

Sean McGuire and Amy McGuire have recommended that in such situations, subjects who agree to be recontacted should be “told that the follow-up study is genotype-driven and what that means, what the genotype and biological pathway of interest are, what the procedures for collecting additional information will be, that half of the participants are controls with no particular genetic variation, and that an invitation to participate is not contingent on the presence of any known phenotype [i.e., clinical phenomenon].”10

Just such a scenario confronted a team at Duke University when, in an epilepsy genomics study, it was noticed that in some patients a rare anomaly (large heterozygous deletions) seemed to confer an elevated risk of seizures. A follow-up study was initiated to investigate. More blood samples were needed, as was the participation of relatives. Although eventually the team was able to pursue the study, it first had to work its way through a series of awkward communication and consent issues. As is evident in the letter it sent to the original study participants requesting further involvement, it basically followed McGuire and McGuire’s suggested approach (Box 8). Even so, many reluctances and misunderstandings had to be dealt with. In a reflective article afterward, the team strongly urged the development of guidelines on genotype-driven recontacting and recruitment.11

10 Sean E. McGuire and Amy L. McGuire, “Don’t throw the baby out with the bathwater: Enabling a bottom-up approach in genome-wide association studies,” Genome Research, 18 (2008), 1683–1685: http://genome.cshlp.org/content/18/11/1683.full.
11 Laura M. Beskow, Kristen N. Linney, Rodney A. Radtke, et al., “Ethical challenges in genotype-driven research recruitment,” Genome Research, 20 (2010), 705–709: http://genome.cshlp.org/content/20/6/705.full.

Box 8. Letter from Duke University project team to epilepsy study participants (excerpt)*

We have discovered that among patients with many different types of epilepsy, some are missing large sections of their DNA . . . Even people without a disease sometimes are missing sections of their DNA (a deletion). However, we have discovered that some patients with epilepsy have larger sections missing and we think it is possible that this might contribute to why they have seizures.

At this point, there are many things we do not know. We do not know how such a deletion affects epilepsy clinically. We do not know how it affects the risk for getting epilepsy in an individual patient. We do not know what it means for inheriting epilepsy or if it even has any impact on inheritance . . . We hope in the future that we will learn more about these deletions and therefore be able to advance the treatment of epilepsy but we are not there at this point.

In order to learn more about our findings we would like to contact some of you to obtain additional blood samples. We may also want to contact your family members, but will not do so without permission from you and from them. If we contact you about this follow up study, you should not assume it means that you have the deletion in question. Because we do not know what this deletion means and it will not affect your care right now, we will not be able to confirm whether or not you have the deletion we are studying.

If you DO NOT want to be contacted for follow up please call [study coordinator] at [number]. If we do not hear from you by [date] we will assume you would like to hear more about helping us with this next exciting step.

* The letter in full is published in Beskow, Linney, Radtke, et al., “Ethical challenges”, p. 708.

Notice and consent

Genetic and genomic projects can pose problems for consent that are special, if not necessarily unique. Genomic science is very, very complex, making “fully informed” consent in the sense of understanding the science impossible for most people. Grasping the sense of a proposed project may depend on having at least a basic comprehension of the mechanism of inheritance. Many genomic studies are best pursued, or can only be pursued, via platforms that assemble large amounts of high-quality genotype and health data from different sources and provide data for multiple research uses over time. The users and uses of the resource data can’t be specified in advance, nor can the eventual users or uses of the new data generated. Moreover, the informational benefits and risks can be difficult to predict and communicate. As with any kind of exploratory mapping, unanticipated revelations can emerge. In studies involving family members, unexpected other facts of life, such as ones having to do with parentage, can be revealed, a potentially traumatic matter for everyone involved. Sequence or other data recorded in passing may amount to health-relevant “incidental findings,” i.e., ones other than those intentionally searched for, or will come to be so in the future after science and medicine advance. More than in most other kinds of research, genetic and genomic studies can turn out to have implications for uninvolved blood relatives, and relatives may even unwittingly become, in a sense, de facto research subjects. Narrow consent simply can’t accommodate all this. Broad consent by the people involved or whose materials are being studied, or waiving by an ethics body of the requirement for consent, may be essential, with compensating protective measures taken.

Genetic and genomic identifiability12

As was mentioned earlier, it has become a habit to refer to genotype data as being self-identifying. This can be misleading. What is true is that a genotype can be taken as a kind of hyper-barcode, an intrinsic tracing tag, and perhaps the rarity of tagging sequences or combinations of SNPs can be estimated. Genomic data can help single out, identify in the name-and-address sense by matching, or suggest or confirm relations among people. Some may even indicate likely personal characteristics and thus point to a person.

12 This section builds upon a project the author conducted with the US National Human Genome Research Institute and a resulting policy forum article written with Francis S. Collins, “Identifiability in genomic research,” Science, 317 (2007), 600–602. The author remains indebted to Dr. Collins for the stimulating experience and to Science for publishing our work.

De-identification of genomic data. Genomic data, which often are recorded with associated data such as study number, disease type, and other clinical data, can be de-identified for research by:
□ Separating-off or obscuring the identifiers and key-coding. This is widely done, as with other kinds of data. The equivalent of key-coding can be performed by privacy-preserving data linkage systems. Some identifiability risk can remain if genomic data are examined along with clinical data, especially if the study is a very local one in which some participants are known to some of the researchers. Vulnerability to the special approaches through which genomic data can become identified (described in the next section) has to be taken into account and safeguards maintained.
□ Statistically degrading the data. This is done occasionally, such as by randomly altering or exchanging a small percentage of SNPs. But usually it severely degrades usefulness, because genomic research almost always needs exact molecular details.

□ Releasing only a short sequence, or a limited number of SNPs. The problem with this is that it is hard to know in advance whether any particular run will have special structural or functional characteristics that might contribute to identifying.
□ Irreversibly de-identifying biospecimens and the related data. Sometimes the substantive data and the biospecimens for an individual are cross-referenced to each other via a code of some kind soon after being collected, and then the identifying data and all direct and indirect links to the individual are destroyed. Under proper safeguards the data–biospecimen tandems can be scientifically useful and present relatively low privacy risk. But obviously the technique has the same disadvantages for research that any irreversible de-identification does.

Ways non-identified genomic data can become identified. Genomic data can become identified using methods similar to those used with other kinds of data, but with twists:13
□ Matching genotype with identified or identifiable genotype data. Although the method is not perfect, the reliability is much, much higher than with conventional matches such as with fingerprints. Several million SNPs occur in at least one percent of the human population, and more occur to a lesser extent. Given the enormous number of possible combinations, only 30 to 80 SNPs must be confirmed as being identical to conclude that two DNA samples are from the same person.14 (A back-of-envelope illustration follows below.) Other features also can be matched. The potential for matching is growing rapidly as hospital, police, military, research, and other genotype databases and DNA archives grow. Again, whether name-and-address identity can be deduced depends on whether the data matched-to are identified or can be identified through further effort.
□ Linking genomic and associated data with external data and deducing the identity. The possibility of building up pointers to individuals through such linking is growing as data of all the sorts listed in the preceding chapter become computerized and available. This can often narrow the candidates down to a few possibilities.
□ Profiling. Increasingly, genomic data can be used to derive a partial description of a person. It is now possible not only to infer sex and basic blood type, which are easy, but also to infer – to differing degrees of confidence – probable skin pigmentation, freckling, iris color, hair thickness, curl, and color, male pattern baldness, and other physical attributes; vulnerability to a variety of diseases; and, far less deterministically, possibly some behavioral traits. The descriptions can only be “probabilistic,” as traits are influenced by many factors other than genomic ones, but the techniques are gaining scope and power as genomic science progresses.15

13 A technical review that is sensitive to the inferential uncertainties is Manfred Kayser and Peter de Knijff, “Improving human forensics through advances in genetics, genomics and molecular biology,” Nature Reviews Genetics, 12 (2011), 179–192.
14 Zhen Lin, Art B. Owen, and Russ B. Altman, “Genomic research and human subject privacy,” Science, 303 (2004), 183. For many years a matching technique based on a different selection of markers (short tandem repeat polymorphisms) has been used for criminal investigations. In the future, other sequence features may be used.
15 A novel approach to assembling data regarding such associations was described in Nicholas Eriksson, J. Michael Macpherson, Joyce Y. Tung, et al., “Web-based, participant-driven studies yield novel genetic associations for common traits,” PLoS Genetics, 6(6), e1000993. doi:10.1371/journal.pgen.1000993 (2010). Incidentally, the journal editors registered reservations about ethical aspects of the approach: Greg Gibson and Gregory P. Copenhaver, “Consent and Internet-enabled human genomics,” PLoS Genetics, 6(6), e1000965. doi:10.1371/journal.pgen.1000965 (2010).

Project enrollment, consent or other forms of permission, and data sharing arrangements must take account of the identification risks posed by the availability of these techniques. One strand of issues arises from the fact that because genetic relatives’ genomes are very, very similar to each other, genotyping of one person can hold informational implications about relatives. Thus, the finding of a health-related heritable factor in one person may suggest that genetic relatives should have themselves checked for it. And triangulating from relatives’ genotypes can help identify criminal suspects or the remains of the victims of homicide, accidents, natural disasters, terrorism, or war.16

16 Leslie G. Biesecker, Joan E. Bailey-Wilson, Jack Ballantyne, et al., “DNA identifications after the 9/11 World Trade Center attack,” Science, 310 (2005), 1122–1123; Frederick K. Bieber, Charles H. Brenner, and David Lazar, “Finding criminals through DNA of their relatives,” Science, 312 (2006), 1315–1316. An article on legal forensics that expresses both technical and legal caveats regarding attempts to identify people via relatives’ genotypes is Erin Murphy, “Relative doubt: Familial searches of DNA databases,” Michigan Law Review, 109 (2010), 291–348: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1498807. See also Christopher A. Cassa, Brian Schmidt, Isaac S. Kohane, and Kenneth D. Mandl, “My sister’s keeper? Genomic research and the identifiability of siblings,” Biomed Central Medical Genomics, 1(32) (2008): www.biomedcentral.com/content/pdf/1755-8794-1-32.pdf.
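The promised back-of-envelope calculation – ours, not the cited authors’ – shows why a few dozen SNPs suffice. Assume idealized, independent SNPs whose two alleles each occur at frequency 0.5, so the three genotypes occur at 0.25, 0.5, and 0.25, and two unrelated people share a genotype at one SNP with probability 0.25² + 0.5² + 0.25² = 0.375.

```python
WORLD_POPULATION = 7e9  # rough early-2010s figure

# Probability that two unrelated people share a genotype at one
# idealized SNP, per the assumptions stated above.
p_match_one_snp = 0.25**2 + 0.5**2 + 0.25**2  # = 0.375

for k in (10, 30, 80):
    p_random = p_match_one_snp ** k
    expected_matches = WORLD_POPULATION * p_random
    print(f"{k:>2} SNPs: chance-match probability {p_random:.1e}; "
          f"expected coincidental matches worldwide {expected_matches:.1e}")
```

Ten such SNPs would leave hundreds of thousands of coincidental matches worldwide; thirty leave essentially none. Real SNPs are neither fully independent nor all at 50 percent frequency, which is why the practical figure is a range (30 to 80) rather than a point.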

An unanticipated complication for genomic research became evident in 2008. For years, aggregate data (summary allele frequency and genotype counts) had been posted on publicly accessible websites to provide an overview of the data-sets, and it had been thought that such résumés would not allow anyone to discern whether a particular person’s genotype was included in the data-sets. But then a team led by David Craig demonstrated that a certain forensic-style statistical analysis can distinguish among individual genotypes in mixtures of DNA from 100 or more people, and so can detect whether a query genotype (SNP profile) is present in a mixture.17 Thus if a person’s genotype is known in sufficient detail, it may be possible to confirm whether he or she is in the data-set. Moreover, with databases involving both cases and controls, the method can reveal whether a genotype – or that of a genetic relative – is a case or a control, and being a case implies that the person has the disease or disease predisposition that is the focus of the study. Concerned about privacy, a number of projects, such as some run by the Wellcome Trust and the US National Institutes of Health, moved much aggregate genotype data from open access to restricted access. It has to be said that the bioinformatic analysis required is far from trivial, requires having a dense genotype profile of the query DNA in hand, is sensitive to analytic assumptions, and does not in itself reveal name-and-address identity.18 But from now on this will be a consideration with sets of genomic data.

17 Nils Homer, Szabolcs Szelinger, Margot Redman, et al., “Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays,” PLoS Genetics, 4(8), e1000167. doi:10.1371/journal.pgen.1000167 (2008). Perhaps relevant to this author’s concern about imprecise use of “identifiability” (p. 106 above), Craig speaks of the method’s “resolving representation” in a mixture, not “identifying” anyone.
18 Kevin B. Jacobs, Meredith Yeager, Sholom Wacholder, et al., “A new statistic and its power to infer membership in a genome-wide association study using genotype frequencies,” Nature Genetics, 41 (2009), 1253–1257.
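The sketch below gives the flavor of that forensic-style analysis – a simplified rendering of the distance statistic the Craig team reported, not their code – using simulated genotypes. For each SNP it asks whether the query genotype sits closer to the published mixture frequencies than to reference population frequencies, and aggregates the minuscule per-SNP differences into a z-like score.

```python
import numpy as np

def membership_z(query, mixture_freq, reference_freq):
    """Positive, large scores suggest the query genotype contributed
    to the published mixture frequencies."""
    d = np.abs(query - reference_freq) - np.abs(query - mixture_freq)
    return d.mean() / (d.std(ddof=1) / np.sqrt(d.size))

rng = np.random.default_rng(0)
n_snps, n_people = 100_000, 100
reference_freq = rng.uniform(0.05, 0.95, n_snps)
# Simulated allele dosages (0, 0.5, 1) for the mixture's contributors.
dosages = rng.binomial(2, reference_freq, (n_people, n_snps)) / 2.0
mixture_freq = dosages.mean(axis=0)  # the published aggregate data

member_score = membership_z(dosages[0], mixture_freq, reference_freq)
outsider = rng.binomial(2, reference_freq, n_snps) / 2.0
outsider_score = membership_z(outsider, mixture_freq, reference_freq)
print(f"member: {member_score:.1f}  outsider: {outsider_score:.1f}")
# The member scores far higher; real analyses calibrate the statistic
# against known non-members rather than reading scores naively.
```

The sketch also shows why the attack needs a dense genotype profile of the query DNA: the per-SNP signal is tiny and becomes detectable only when summed over very many markers.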

The Personal Genome Project

The Personal Genome Project (PGP) is an unprecedented experiment with genomic privacy, led by George Church at Harvard University. It is worth taking note of because of its extreme nature. The Project is recruiting volunteers who provide extensive personal health information and biospecimens, and allow the data, including potentially their entire genome sequences, to be posted openly on the web. The eventual recruitment goal is 100,000 participants.19

19 PGP: www.personalgenomes.org. As of September 2011, more than 2,500 volunteers had achieved 100 percent on the entrance exam, the medical data on about 1,000 were online, and the DNA of some 500 were moving through whole-genome sequencing. (Personal communication from Professor Church.) Participant data are posted at: https://my.personalgenomes.org/enrolled.

Church and his colleagues are convinced that many conventional consent agreements, including ones having to do with non-genomic research, understate the privacy risks, fail to emphasize that some risks may simply be unimaginable at the time of consenting, irresponsibly reassure the data-subjects, and so are illegitimate. Saying “We feel the most ethical and practical solution to the risks at this time is for volunteers to be recruited, consented, and enrolled based on the expectation of full public data release and to purposefully exclude any promises of permanent confidentiality or anonymity,” the Project makes an impressive effort to help potential participants understand the risks, which it recognizes may be substantial.20

Because the PGP model, which the Project calls “public genomics,” may well presage a future in which such surrendering of privacy is common – a future that is possible but by no means inevitable – and because it is the opposite of the highly restricted data sharing approaches that are now moving into wide adoption, as will be described in Chapter 10, it is worth thinking about the warnings in the PGP consent information (Box 9).21

The PGP is aware that it is operating in new and uncertain legal territory and may in the future become vulnerable to tort liability or other claims having to do with whether its consent warnings were sufficient, whether it was reasonable to have concluded that participants understood and internalized the warnings, or whether in releasing findings to the participants and the public it took on a duty of medical care that it didn’t adequately follow through on.22

20 “Important considerations” section of the PGP website.
21 It is worth remarking that despite their expressive drama, the PGP warnings are not very different in substance from those in the US National Human Genome Research Institute’s template form, “Informed consent elements tailored to genomic research”: www.genome.gov/pfv.cfm?pageID=27026589. Also, the abusive acts described could probably be based on DNA abandoned on a drinking glass.
22 John M. Conley, Adam K. Doerr, and Daniel B. Vorhaus, “Enabling responsible public genomics,” Health Matrix, 20 (2010), 325–385: www.genomicslawreport.com/wp-content/uploads/2011/02/Health_Matrix_-_Journal_of_Law-Medicine_Vol_20_2010.pdf.

Some reflections

Three trends seem inevitable. First, research projects have often had difficulty in collecting extensive and reliable pedigree data and associating them with detailed health-related data, but this is changing:



We should expect in the near future great improvements in the following areas: (i) the collection, storage, and retrieval of family history information through the use of computers; (ii) the interpretation of the health risks associated with family history for major diseases; (iii) the development of algorithms to capture the distribution of familial risk in populations; (iv) the incorporation of familial risk into electronic medical records; (v) the organization and delivery of evidence-based advice for the prevention of major diseases where familial risk is elevated; and (vi) the performance of cost-effectiveness studies to evaluate the use of family history in the clinical and the public health settings.23

Whether researchers should communicate genetic research findings to the relatives of research subjects, though, and what the specific obligations of tracing, informing, and possibly counseling and caring should be, and how privacy should be respected, have tended to elude clear policy and legal formulations. So has the obligation of people who become aware that they carry heritable factors of disease to inform family members. So has any right of family members to refuse to be informed. All of these issues have been addressed before with respect to classical genetics, but the penetrating nature of genomic analysis is making it imperative that they be addressed more formally.24

Second, as genomic science becomes ever more sophisticated, the cost of sequencing and other genotyping techniques continues to drop, genotyping becomes more routine, and genealogy and genomic databases continue to grow and be interlinked, the identifiability of genomic data generally will increase.25

And third, as genetic and genomic data become integrated with, or linked to, EHRs, disease and other registries, social databases, and research databases, this will tend to increase the identifiability of the data in those collections.

23 Rodolfo Valdez, Muin J. Khoury, and Paula W. Yoon, “The use of family history in public health practice: The epidemiologic view,” in Khoury, Bedrosian, Gwinn, et al. (eds.), Human Genome Epidemiology, p. 589.
24 Gillian Nycum, Bartha Maria Knoppers, and Denise Avard, “Intra-familial obligations to communicate genetic risk information: What foundations? What forms?” McGill Journal of Law and Health, 3 (2009), 21–48: http://mjlh.mcgill.ca/pdfs/vol3-1/NycumKnoppersAvard.pdf; Richard R. Fabsitz, Amy McGuire, Richard Sharp, et al., “Ethical and practical guidelines for reporting genetic research results to study participants: Updated guidelines from a National Heart, Lung, and Blood Institute working group,” Circulation: Cardiovascular Genetics, 3 (2010), 574–580: http://circgenetics.ahajournals.org/content/3/6/574.full.
25 An illustration of what can be deduced by overlaying genomic and genealogy data is Jane Gitschier, “Inferential genotyping of Y chromosomes in Latter-Day Saints founders and comparison to Utah samples in the HapMap project,” American Journal of Human Genetics, 84 (2009), 251–258.



Box 9. Extract from the PGP’s consent form 6.1(a)*

(i) The public disclosure of your genetic and trait data could cause you to learn – either directly, from a family member or from another individual – certain unexpected genealogical features about you and/or your family . . . In particular, this could include inferences of non-paternity, as well as inferences or allegations of paternity made by individuals you did not previously know or suspect were related to you.

(ii) Anyone with sufficient knowledge and resources could take your DNA sequence data and/or posted trait information and use that data, with or without changes, to: (A) accurately or inaccurately reveal to you or a member of your family the possibility of a disease or other trait or propensity for a disease or other trait; (B) claim statistical evidence, including with respect to your genetic predisposition to certain diseases or other traits, that could affect the ability of you and/or your family to obtain or maintain employment, insurance or financial services; (C) claim relatedness to criminals or other notorious figures or groups on the part of you and/or your family; (D) correctly or incorrectly associate you and/or your relatives with ongoing or unsolved criminal investigations on the basis of your publicly available genetic data; or (E) make synthetic DNA and plant it at a crime scene, or otherwise use it to falsely identify or implicate you and/or your family.

(iii) Whether or not it is lawful to do so, you could be subject to actual or attempted employment, insurance, financial, or other forms of discrimination or negative treatment due to the public disclosure of your genetic and trait information by the PGP or by a third party . . .

(iv) If you have previously made available or intend to make available genetic or other medical or trait information in a confidential setting, for example in another research study, the data that you provide to the PGP may be used, on its own or in combination with your previously shared data, to identify you as a participant in otherwise private and/or confidential research . . .

* www.personalgenomes.org/consent/PGP_Consent_Approved_12152011.pdf.

9

Safeguards and responsibilities

People participating in research projects or allowing information or biospecimens about themselves to be used need to be assured that reliable protections are employed. So do family members, research ethics bodies, and others who make decisions on their behalf. So do organizations that share data. Researchers and their institutions must be able to provide this assurance and back it up with measures that justify trusting. This is a leadership responsibility; if principal investigators, unit heads, project leaders, or R&D directors don’t care about privacy and confidentiality, others are unlikely to.

Many of the propositions discussed in this book are conditioned on the proviso, “if appropriate safeguards are in place.” Among other things, reliable safeguards can justify seeking broad consent, using data without consent, being allowed access to data curated by others, or using partially de-identified data under restricted conditions. Safeguards are a prime determinant of trustworthiness.

As was said earlier, safeguards include the many measures taken to protect data or biospecimens while they are being collected or received, and when they are curated and used thereafter. Safeguarding involves an array of intersecting administrative, physical, and information technology measures, including those often thought of as security measures.

Operational safeguards

At the day-to-day working level, many precautionary practices must be employed. Research teams should be able to attest with confidence, formally and publicly, that they:



□ stay aware of current laws, regulations, and guidance relating to data acquisition and use, and have up-to-date, defensible policies and procedures in place;
□ keep senior leadership actively aware of the challenges and obligations;
□ deal carefully with such core matters as notice, consent, identifiability, data retention, data transfer, and data and biospecimen destruction;
□ engage with external ethics and other oversight bodies as appropriate, and comply with the decisions;
□ sensitize and train research staff, students, and technicians, and resensitize and update training as necessary;
□ know what data and biospecimens they hold and stay aware of data flows;
□ manage internal data access and use via need-to-know assignments;
□ use enforceable contracts or similar agreements to protect any handling of personal data or biospecimens by external agents working on behalf of the unit;
□ use enforceable access and material transfer agreements to manage any sharing of personal data with external parties to be used for those parties’ own purposes;
□ rigorously enforce effective administrative, physical, and cyber security;
□ audit as needed, and make sure that affected units respond to audit findings;
□ have contingency plans in place to respond to privacy incidents, investigate causes and consequences, and take remedial action.

Each of these items is an agenda. Organizations must elaborate each according to the research they perform, the data and biospecimens they use, data flows, the privacy and other risks involved, all the “rules” that apply, and institutional and professional best practice. The biggest difficulties usually have to do with managing the ultimately complex hardware–software hybrid, Homo sapiens.

Formal responsibilities Lead responsibilities can be assigned in various ways, depending on the context. Principal investigators (PIs) in the UK, the US, and many other countries are senior scientists, usually grant recipients or laboratory heads, who are responsible and accountable for the actions of a research team. As they direct planning, manage budgets, set policies, recruit staff and students, reward performance, and so on, they are expected to make sure that privacy and confidentiality safeguards are working and that strong incentives are in place to keep them working well. Usually the PI role is defined by a funding organization or employing research institution. Custodians are the persons – principal investigators, the managers of a research resource platform, or perhaps an institution, i.e., a legal person – formally responsible for the protection, maintenance, and control of access to and use of data. Custodianship is defined by the ethical and legal

Formal responsibilities

127

Box 10. Manitoba Centre for Health Policy’s “Pledge of Privacy”* We promise to: Respect privacy: MCHP through the University of Manitoba is a public trustee of sensitive de-identified information. We are bound by legislation, professional ethical standards and moral responsibility to never share, sell under any circumstance or use the data for purposes other than approved research. We ensure that all data under our management are anonymized, and that their use adheres to strict procedures, practices and policies. Safeguard confidentiality: We respect the confidentiality of sensitive and private information. All staff and collaborating researchers must sign an oath of confidentiality; anyone who breaches this oath faces immediate loss of access to data and possible dismissal. All records are anonymized before we receive them. Before findings are released, all MCHP publications are reviewed by Manitoba Health and/or appropriate data providing agency and our own management to further ensure individual privacy. Provide security: The environment in which research is conducted is tightly controlled. We restrict access to our workplace with additional levels of security for access to data spaces. The security of data is further protected through state-of-the-art technology, including but not limited to firewalls, encryption, password access and monitoring of users. The databases are housed on computers that are isolated to prevent access by unauthorized persons. * http://umanitoba.ca/faculties/medicine/units/mchp/privacy.html. The Centre’s work is described on p. 148 below.

circumstances surrounding the original collecting of data or biospecimens. It can be formally passed on to others, in what is often called a chain of custody.
Data controllers and data processors are roles defined by the EU Data Protection Directive (Article 2):
(d) “Controller” shall mean the natural or legal person, public authority, agency or any other body which alone or jointly with others determines the purposes and means of the processing of personal data . . .
(e) “Processor” shall mean a natural or legal person, public authority, agency or any other body which processes personal data on behalf of the controller.1

1 The UK Data Protection Act adds “. . . other than an employee of the data controller.”



Sometimes under the national laws implementing the Directive there is uncertainty over who the controllers and processors are in complicated situations, such as for data shuttled around in the large networks of global pharmaceutical companies, with multiple data sources, multiple internal data-handling centers, and multiple external data-service providers. Organizations subject to EU law must be clear about the roles internally, and they must take account of how the roles are managed in organizations to which they are considering transferring data for any reason.2
Privacy officers are well-placed staff members tasked with making sure that privacy and confidentiality are attended to by their organizations. They are expected to champion the privacy-protection cause and promote it by organizing and perhaps leading training programs; working with various units to develop policies, standard operating procedures, and contracts; initiating reviews and audits; and working with regulatory compliance officers, legal counsel, and information technology services to be sure that issues are properly dealt with. They must be empowered to inquire, investigate, intervene, and call for managerial attention as necessary. In order to do all this they must stay aware of data flows and uses, and stay current with the external legal and policy context. Privacy officers are employed by thousands of corporations and many governmental and academic organizations. Their effectiveness depends strongly on where they are placed in the organization, on whether they have adequate resources, and on whether they are genuinely supported by top management.3
In an effort to focus responsibility, the data protection laws of several European countries, such as Germany, require that companies over a certain size designate personally named privacy officers (by whatever title), and it is likely that in the future this will become an EU-wide obligation. Canada’s Personal Information Protection and Electronic Documents Act (PIPEDA) requires that organizations “designate an individual or individuals who are accountable for the organization’s compliance” with the principles of the Act.
Caldicott Guardians are senior staff members appointed by UK National Health Service operating units to promote, monitor, guide, coordinate, and be responsible for clinical information governance in



2 European Article 29 Data Protection Working Party, “Opinion on the concepts of ‘controller’ and ‘processor’” (2010): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2010/wp169_en.pdf.
3 A leading organization is the International Association of Privacy Professionals: www.privacyassociation.org.



those units, in effect acting as privacy officers. One of their roles is to enable appropriate data sharing.4

Stewardship

This author has long promoted the notion of stewardship of data and biospecimens, which has a more active meaning and tone than custodianship or guardianship. Stewardship means being responsible and accountable for preserving, growing, and judiciously sharing important holdings, not just hoarding and protecting them. It implies adding value to the holdings, and helping others derive value from them. (Parallels are forest stewards, seed bank stewards, stewards of religious artifacts and edifices, and stewards of rare manuscripts and other cultural treasures. And yes, wine stewards.) People in any of the above roles can serve as stewards, whether using the title or not. So can organizations, such as data archives and biobanks. But the stewardship notion and steward title should be reserved for instances in which they are deserved, and not simply used as a synonym for passive custodianship.

Data and biospecimen retention

Many guidelines require holding personally identifiable data and biospecimens for only as long as is necessary. Health research policies usually recognize the need to keep data and biospecimens for many years, and tend to stretch the requirement “only as long as is necessary [for the current project]” to “as long as might be needed for validation, reanalysis, and possible further research use.” Many regulations set time limits for retention, usually an arbitrary period such as 30 years, but not all discuss how long safeguards, or what kinds of safeguards, should be maintained. Surely a reasonable rule is simply: As long as a research organization holds personally identifiable data or biospecimens, it should safeguard them; when it no longer has the wish or the ability to protect the holdings, it should transfer them to other appropriate custody or destroy them.


4 The title reflects the fact that the role was recommended by a report from a committee chaired by Fiona Caldicott. The portal to information about the system is: www.connectingforhealth.nhs.uk/systemsandservices/infogov/caldicott. See also UK Department of Health, The Caldicott Guardian Manual (2010): http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_114509.



Security

Security is the maintaining of integrity and control of access, use, and transfer after data or biospecimens have been acquired. Security requires continual vigilance and “What if . . .?” thinking. It also requires coordination of research units’ protective measures with those maintained by general facility security and central information technology departments. Security comprises:
□ an overall orientation to risk (vulnerability assessment, scaling of protections to risks, contingency preparedness . . .);
□ physical protections (barriers, entrance controls, intrusion detection systems, patrolling . . .);
□ administrative discipline (staff selection, confidentiality-related clauses in employment contracts, standard procedures that make roles and obligations explicit, serious training, auditing of conformance with approved policies and procedures . . .);
□ day-to-day operational discipline (password hygiene, computer screen timeouts, logging of access to biospecimens, secure and documented destruction of sensitive documents and biospecimens . . .);
□ cyber security (data access control, intrusion blocking, encryption of data in storage or transfer, secure management of external data backup . . .).
Much of security practice is commonsensical, but it has to evolve continually. For example, the epidemic of laptop thefts, losses of data tapes in the course of delivery, and so on in recent years made it imperative for organizations to protect portable data more seriously, such as by restricting the carrying of personally identifiable data off-premises on laptops, portable hard disks, USB keys, and other devices, and by encrypting data so that they cannot be read even if the storage medium falls into unauthorized hands (a minimal encryption sketch follows at the end of this section). And of course vigilance has to be maintained against the ever-changing threat of hacking. Many cyber security resources and standards are available, of varying focus and detail, from many sources.5 State-of-the-art standards are promulgated by such bodies as the International Organization for Standardization (usually known as the ISO), and research units can work to become certified as meeting them. The broad standard, ISO 27001, “Information security management systems,” for example, is an authoritative one that at least fairly large research organizations can strive

5 Extensive guidance and technical information can be accessed via the US National Institute of Standards and Technology, Computer Security Resource Center: http://csrc.nist.gov.



to qualify for.6 Conformance to recognized standards is increasingly becoming a requirement for being allowed access to data – it is much more specific, after all, than merely stating that “data will be held securely.”7 Moreover, a research project’s ability to attest that its system is certified to rigorous standards can be a convincing point in recruiting volunteers: worry about the security of electronic data is one of the factors that members of the public say most strongly discourages them from agreeing to participate in research.
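To make the encryption point concrete: the following is a minimal sketch – not a vetted security implementation – of symmetric encryption of a small data extract before it travels on a portable device, using the Fernet recipe of the widely used Python cryptography library. The record content is invented, and in practice key management, not the encryption call itself, is the hard part:

    from cryptography.fernet import Fernet

    # Generate the key once and store it separately from the portable medium
    # (in the institution's key-management system, never on the device itself).
    key = Fernet.generate_key()
    cipher = Fernet(key)

    record = b"subject_0001,1957-03-14,hypertension"  # stand-in for an extract

    ciphertext = cipher.encrypt(record)   # this is what travels on the device
    assert cipher.decrypt(ciphertext) == record

Without the key the ciphertext is unreadable, so losing the laptop or USB key exposes no personal data – which, as discussed later in this chapter, is also why many breach-notification laws exempt encrypted data.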

Privacy risk assessment

It is simply not possible to protect equally against all conceivable threats. Nor would it make sense to try to. Increasingly, data custodians are conducting privacy risk assessments (or privacy impact assessments) in order to identify and mitigate vulnerabilities, and to set protections proportionate to the risks. And increasingly, regulators are requiring or strongly encouraging this. The core concern should be whether commitments made to protect privacy and confidentiality can be violated – by outsiders or insiders, innocently or maliciously – and whether data can be improperly used to reveal the identity of data-subjects, harm or appear to harm data-subjects, or bring the research endeavor into disrepute.8 Like all risk assessments, privacy risk assessments must estimate the probability and magnitude of possible harms, and attempt to weigh the values that affected individuals, society, and the research enterprise place on the imaginable harms. This is not at all easy, and at best it can only be semi-quantitative. But such judgments are made informally all the time. (“Probably this is our biggest everyday weakness.” “That would be a disaster if it ever happened; are we doing all we can to minimize the odds?” “The service vendor is avoiding our questions about destroying the records; ask the legal group to investigate.” “Given what happened last week, shouldn’t we move that database to a higher security level?”) Such


6 Related to this is a code of practice, ISO 27002. The portal to the standards is: www.27000.org/index.htm.
7 Applications to the UK National Information Governance Board for Health and Social Care for approval to use NHS data for research without consent (as was described on p. 82 above) are required to submit a System Level Security Policy, for which the Board provides a template: www.nigb.nhs.uk/s251/forms.
8 UK Information Commissioner’s Office, Privacy Impact Handbook, version 2.0 (2009): www.ico.gov.uk/upload/documents/pia_handbook_html_v2/index.html; Australian Office of the Privacy Commissioner, Privacy Impact Assessment Guide (2010): www.privacy.gov.au; US Department of Health and Human Services, Office of Civil Rights, “Guidance on risk analysis requirements under the HIPAA Security Rule” (2010): www.hhs.gov/ocr/privacy/hipaa/administrative/securityrule/rafinalguidancepdf.pdf.



judgments often can be refined if they are examined in a structured way and involve technical, legal, and other experts, perhaps including some from outside the operating unit.
Vulnerabilities might range from staff members peeking at data without permission, to a contract service provider’s system in a faraway country being hacked into, to biospecimens being passed on for unpermitted purposes. Potential tangible harms to data-subjects, such as identity theft or financial losses, must be considered, but so should intangible harms, such as emotional distress. Account must also be taken of the risks to the laboratory, project, university, company, or research endeavor more generally, such as public resentment, participant bailout, denial of access to data, cancellation of ethics approval, litigation, or expulsion from collaborative projects.
Risk assessment exercises can be especially illuminating if they induce interaction among different functions of the organization, or between the organization and external experts. “You do what?” and “Who authorized that?” are not uncommonly exclaimed as assessments proceed. Obviously the findings of assessments should be used as necessary to assign responsibilities, train personnel, revise procedures, allocate resources, review external relationships, and reinforce defenses.

Requests for non-research access

Several kinds of standard and generally uncontroversial requests to examine research data for non-research purposes can arise, such as for official public health or social service investigations, auditing by data protection authorities, or validation of data by clinical trial sponsors or medical product regulators. Such requests should be anticipated by internal policies.
More troubling to data custodians, as well as to the public, are demands for access to personally identifiable data by the police, courts, state intelligence services, insurers, banks, or employers. Researchers and their institutions cannot be above the law. But they should firmly deny access where it is not required by law or ordered by a court. And where provision of data is compelled, such as by a court subpoena, researchers should negotiate for specification of a clear data-analytic objective, release of only the data necessary to pursue that objective, de-identification and masking of data elements to the fullest extent acceptable, and maintenance of confidentiality around the data when they are released to the court or other recipient.
Research projects can declare a default stance of resistance. UK Biobank’s “Ethics and Governance Framework,” for example, asserts that “access to the resource by the police or other law enforcement



agencies will be acceded to only under court order, and UK Biobank will resist such access vigorously in all circumstances.”
Freedom of information requests. Many countries and provinces have freedom of information laws (or access to administrative documents laws) that entitle members of the public to see or obtain copies of documents held by government institutions. The provisions vary; for instance, some require that requesters have a legal stake in obtaining the data, while others say that no reason for interest need be expressed. Usually the institution receiving the request must either provide the information in a timely fashion or respond with a rationale for not providing it and be prepared to defend the denial. Most such laws cover information held by government bodies, not just information collected by them, and so cover much research information submitted to government agencies or handled via databases or research platforms managed by government agencies. When the release of personally identifiable information under freedom of information law is not barred by countervailing law (as it usually is barred, for instance, for personally identifiable data submitted to medical product regulatory agencies), researchers and their institutions should negotiate for minimal necessary and restricted release, as was recommended above for other non-research requests.9 Institutions that think they might ever be exposed to such requests should have an internal policy in place to guide responding, and should designate a knowledgeable response coordinator.
Certificates of Confidentiality. In the US, a statutory instrument designed to protect against compelled disclosure is the Certificate of Confidentiality, a legal assurance that the National Institutes of Health (NIH), the Centers for Disease Control and Prevention, the Food and Drug Administration, and some other federal agencies can issue for projects that use personally identifiable information. Usually Certificates are issued for specified projects only, and are issued to the hosting institutions, not the investigators. Institutions receiving Certificates are urged to invoke them if challenges arise, but are not required to do so. Federal funding is not a prerequisite for applying for Certificate protection of a project; any project approved by an Institutional Review Board can apply. The NIH policy in full reads:
Certificates of Confidentiality are issued by the National Institutes of Health to protect identifiable research information from forced disclosure. They allow the investigator and others who have access

9 The US Freedom of Information Act exempts from reach “personnel and medical files and similar files the disclosure of which would constitute a clearly unwarranted invasion of personal privacy”: 5 United States Code §552(b)(6).



to research records to refuse to disclose identifying information on research participants in any civil, criminal, administrative, legislative, or other proceeding, whether at the federal, state, or local level. Certificates of Confidentiality may be granted for studies collecting information that, if disclosed, could have adverse consequences for subjects or damage their financial standing, employability, insurability, or reputation. By protecting researchers and institutions from being compelled to disclose information that would identify research subjects, Certificates of Confidentiality help achieve the research objectives and promote participation in studies by assuring confidentiality and privacy to participants.10
This is a strong statement. But limitations must be noted. Certificates apply only to “identifying information,” the disclosure of which could have adverse consequences; however, identifiability, adverse consequences, and the boundaries of “research information” are all matters of judgment. What is shielded is identifying information – mainly meaning the names, addresses, and other overt identifiers of people involved in a project, or confirmation that a named person is a participant – not all of the data held. Whether a researcher can notify public health authorities of a communicable disease or other medical condition observed during the course of the project apparently depends on whether the subject consents to this. Certificates do not prohibit the subjects themselves from disclosing their participation to anyone, or from consenting to disclosure.
The shortcomings and uncertainties notwithstanding, the NIH and other agencies continue to encourage projects to apply for Certificates, and around 1,000 are in effect at the NIH at any time. The major source of uncertainty is that the Certificate arrangement has been tested in court only a few times (and possibly out of court, though not publicly documented), and it is not clear how successfully it would stand up against a determined legal challenge. At the very least, Certificates provide formal justification for resisting informal requests, demanding a court order, and negotiating over exactly what information must be provided to whom and under what continuing protections. A review of a criminal court case that tested Certificate protection, but in the end left its inviolability unclear, called for rigorous evaluation of the Certificates’ practical effectiveness in resisting forcible disclosure, noting four problems in particular:

10 The statutory authority is §301(d) of the Public Health Service Act, 42 United States Code 241(d). See US National Institutes of Health, Certificates of Confidentiality Kiosk: http://grants.nih.gov/grants/policy/coc.



First, requests for research data may arise from legal proceedings unrelated to a study’s focus . . . Second, a Certificate is granted to the research institution, not the Principal Investigator, and their interests may not be identical . . . Third, seeking to enforce a Certificate may result in some disclosure, even if data are not released . . . Fourth, parties in both criminal and civil lawsuits have rights to obtain material relevant to their case . . . [and] courts may give insufficient weight to society’s interest in protecting research records.11
The approach deserves to be scrutinized and strengthened to the extent legally possible. Other countries may want to consider adopting a similar policy.

Enforcement and sanctions

Given that data are now being shared and transferred with great fluidity, that identities are becoming ever harder to conceal, and that security can never be absolute, in the coming years organizational and legal strategies inevitably will have to complement regulation of data-handling processes with prohibition of certain data uses, enforcement of data use agreements, and penalization of abuses.
Notification of data security breaches. In many places in recent years, laws have been passed requiring that if organizations become aware that impermissible access probably has been gained to person-specific data in their care, they must notify the data-subjects and/or regulatory authorities of the breach, and take actions such as assisting the affected people in monitoring any impact on their financial credit rating. Some of the laws require notification only if there is reason to believe that “serious” or “significant” material (usually meaning financial) losses might have been incurred, or might be in the future. Others require notification regardless of possible harm. Some apply specifically to protected health information.12


11 Laura M. Beskow, Lauren Dame, and E. Jane Costello, “Certificates of Confidentiality and compelled disclosure of data,” Science, 322 (2008), 1054–1055, and two responses and the authors’ reply: Science, 323 (2009), 1288–1290. The case had to do with the subpoena of data in a psychiatric disorder study at Duke University by the defense in a rape case, in an effort to impeach the testimonial credibility of a prosecution witness thought to be a participant in the study. (A subpoena is a court order to testify or provide documents in a legal proceeding.)
12 The US Health Information Technology for Economic and Clinical Health Act requires the Secretary of Health and Human Services to post on a website the occurrence of breaches of unsecured, i.e., unencrypted, personally identifiable information that affect 500 or more individuals (§13402(e)(4)). The site is: www.hhs.gov/ocr/privacy/hipaa/administrative/breachnotificationrule/breachtool.html.



Policy difficulties include specifying the kinds, chances, and magnitudes of potential or verifiable harm that trigger notification. Practical difficulties include deciding what actions must be taken after breaches occur, and what and how to communicate about the breach with the victims. There is much leeway for uncertainty and false alarm with respect to actual invasion of privacy, as many thefts are committed (just) for the sale value of the laptop or other hardware, and many hacking incidents are (just) pranks to show off cleverness or to harass organizations. Although breach laws mainly are meant to limit damage after breaches occur and to chasten the data-leakers, because most of the laws exempt notification if the data are thoroughly encrypted (meaning that they can’t be read without access to a unique decryption key), they set strong incentives to encrypt data in transfer and storage (a schematic illustration of this exemption logic follows the list below).
Enforcement of data use agreements. The ability to enforce access and other undertakings is an important safeguard. Most of the arrangements are informal. Serious offenses, such as outright theft of biospecimens, can be prosecuted by the police. Abuse of national statistics data and many kinds of healthcare data is subject to civil and criminal penalties specified by statutes or regulations, as is violation of human-subject protections. Negligent disclosures may activate data security breach penalties. Sanctions that can be contemplated if data use agreements are violated – such as if recipient researchers violate conditions of consent, improperly attempt to identify or contact subjects, pass data on to unauthorized recipients, or publicly disclose confidential information – include, in generally increasing order of severity:
□ asking the researcher to stop using the data, at least until a review of the situation is conducted;
□ blocking access to additional data or biospecimens;
□ requiring the researcher to return or destroy the data and/or biospecimens, certify the action, and submit to an audit;
□ as appropriate, reporting the incident to the researcher’s employer, the office responsible for scientific integrity in the researcher’s institution, the ethics committee that approved the project, the project funders, or journals that have published or are in the process of publishing results based on the questionable use;
□ cancelling funding;
□ denying funding to the researcher in the future;
□ if privacy or data protection law is broken, reporting the incident to the regulatory authority;
□ taking action in a court of law, possibly for breach of contract.
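The encryption exemption mentioned above is, at bottom, a simple conditional, which the following schematic Python sketch illustrates. It is a caricature of no particular statute – thresholds and triggers vary by jurisdiction, and the field names are invented:

    from dataclasses import dataclass

    @dataclass
    class BreachEvent:
        records_affected: int
        data_encrypted: bool     # was the medium thoroughly encrypted?
        key_also_exposed: bool   # did the decryption key travel with it?

    def notification_required(event: BreachEvent, threshold: int = 1) -> bool:
        # Many (not all) breach laws exempt notification when the data were
        # encrypted and the decryption key was not itself compromised.
        if event.data_encrypted and not event.key_also_exposed:
            return False
        return event.records_affected >= threshold

    # A stolen laptop holding an encrypted extract, key held elsewhere:
    print(notification_required(BreachEvent(5000, True, False)))   # False
    # The same extract unencrypted:
    print(notification_required(BreachEvent(5000, False, False)))  # True

The design point is the incentive such rules create: because the exemption turns on encryption, routine encryption of data in transfer and storage becomes the cheapest way to stay on the safe side of the law.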



A set of largely untested issues, to be discussed in the next chapter, concerns the enforcement of obligations when data and/or biospecimens are transferred internationally.
GINA. One law that prohibits inappropriate uses of a particular kind of data is the US Genetic Information Nondiscrimination Act (GINA).13 GINA prohibits health insurance companies and healthcare plans from requesting or requiring genetic information about people or their family members, and from using genetic information in making decisions about coverage or rates. It also prohibits employers from using genetic information in making decisions about the hiring, promoting, or firing of employees. It covers family histories of diseases and disorders, applies to data about individuals and their first-, second-, third-, and fourth-degree relatives, and covers data from genetic “tests” very broadly defined as including analyses of DNA, RNA, chromosomes, proteins, or metabolites that detect or indicate genotypes, mutations, or chromosomal changes. Although it is not specific to research, it applies to many kinds of data and biospecimens used in or generated by research. A variety of penalties can be imposed for violations.
GINA complements the genetic nondiscrimination laws of the states, setting a legal floor of protection. It has many shortcomings, starting with its limited range of prohibited uses; it has no relevance for life insurance or long-term care insurance, for example. But in time it may be amended to rectify some of those deficiencies. Other laws of this sort may be needed, both in the US and elsewhere.


13 US Genetic Information Nondiscrimination Act: www.eeoc.gov/laws/statutes/gina.cfm. For helpful interpretation, see US Department of Health and Human Services, Office for Human Research Protections, “Guidance on the Genetic Information Nondiscrimination Act: Implications for investigators and Institutional Review Boards” (2009): www.hhs.gov/ohrp/policy/gina.pdf.

10 Data sharing, access, and transfer

Data sharing

The sharing of data, biospecimens, rare molecules, organisms, and access to special instruments and reference collections has always been part of the ethos of science. As with all sharing, it is never perfect: it can be impeded by logistical constraints, proprietary reservations, or interpersonal frictions. But few, if any, universal human endeavors proceed in as communal a fashion as science, and with community comes sharing. The impetus is increased when the sharing serves a core public interest. In health research, recent years have brought organized pressures for increased sharing, in the pursuit of:
□ reduced duplication of data and biospecimen collecting and management, and reduced competition for access to research participants;
□ compilation of masses of data on resource platforms, making statistically more powerful analyses possible;
□ eased access to existing data and biospecimens for use as statistical controls;
□ increased linking of data and biospecimens across projects and data types;
□ scientifically more diverse interrogation of data for more purposes, including independent validation of findings;
□ and therefore, overall, more productive use, for the common good, of government and charity research funds and of the data and biospecimens contributed by research participants.
Many funding organizations have put their weight behind data sharing, and it has become a campaign. Among others, the US National Institutes of Health,1 the UK Medical Research Council,2 and the Wellcome Trust3

US National Institutes of Health, data sharing portal: http://grants.nih.gov/grants/policy/ data_sharing/. Medical Research Council (UK), data sharing portal: www.mrc.ac.uk/Ourresearch/ Ethicsresearchguidance/Datasharinginitiative/Recentactivities/index.htm. Wellcome Trust, “Policy on data management and sharing” (2010): www.wellcome.ac.uk/ About-us/Policy/Policy-and-position-statements/WTX035043.htm.

138

Data sharing

139

have adopted policies requiring accessible archiving and data sharing. The Thomas and Walport data sharing report in the UK strongly encouraged the exchanging of data among government agencies, including for health research.4 On behalf of the many depositors of the social and economic data that it manages, the UK Data Archive encourages and facilitates the sharing of data-sets for research, many of which relate to health.5 In the US, academic health centers have been urged to make data sharing a more important part of their culture and educational programs.6 Recently 17 major internationally active funding organizations made a joint commitment to increasing the availability of data for health research.7 Most longitudinal projects, such as birth cohorts, twin cohorts, and chronic disease cohorts, naturally lend themselves to multiple uses, and most have an admirable record of sharing. Many specialized, closely focused projects also share data, of course, although often preferring to do so only with colleagues personally known to them; until the recent push by funders, much has depended on the collaborative spirit of project leaders and whether they anticipated mutual benefit from the sharing. Many large-scale projects initiated in recent years have been structured from their inception to be broad community resources. UK Biobank, for example, “aims to encourage and provide wide access to the resource for researchers from the academic, commercial, charity and public sectors, both nationally and internationally, in order to maximise its value for health.”8 Other examples include those sketched in Box 2, such as the Million Women Study, the International Cancer Genome Consortium, and the Kaiser Permanente Research Program on Genes, Environment and Health. Increasingly now, project grants include support for the work of data sharing. An unprecedented mode of data sharing was pioneered by the Human Genome Project. Genomicists in a number of countries, facing the daunting challenge of mapping the Code of Life, agreed to coordinate their work, assign chromosomal territories in order to minimize duplication (except to confirm each other’s findings), and post sequences on the web 4 5

6

7 8

Thomas and Walport, Data Sharing Review Report. UK Data Archive, Veerle Van den Eynden, Louise Corti, Matthew Woollard, et al., “Managing and sharing data: Best practice for researchers” (2011): www.data-archive. ac.uk/media/2894/managingsharing.pdf. Heather A. Piwowar, Michael J. Becich, Howard Bilofsky, and Rebecca S. Crowley, on behalf of the caBIG Data Sharing and Intellectual Capital Workspace, “Towards a data sharing culture: Recommendations for leadership from academic health centers” PLoS Medicine, 5(9), e183. doi:10.1371/journal.pmed.0050183 (2008). Mark Walport and Paul Brest on behalf of 17 signatory organizations, “Sharing research data to improve public health,” The Lancet, 377 (2011), 537–539. UK Biobank: www.ukbiobank.ac.uk/docs/UKBProtocolfinal.pdf.

140

Data sharing, access, and transfer

as soon as they were “read.” Week by week anyone in the world with access to the Internet could see the draft sequence being pieced together. Surely this was the fastest and most open data sharing, ever, in large-scale biological research. Elements of this model, notably the rapid prepublication release of data, are now being adopted by other lines of research. Access9 Data sharing implies access, and there are pressures now to manage access more carefully, efficiently, openly, and fairly, and to improve the consistency among policies so as to make the rules and rewards of exchange clear and encourage mutual sharing. Access means being allowed to see and/or use data or materials. With respect to privacy, access is any circumstance allowing the taking of information into knowledge. The media of data storage and communication are irrelevant. Under protective regimes, provision of access can amount to data “release,” “transfer,” or “disclosure,” with legal implications. In research, access is achieved by being shown or sent paper records, portable electronic data storage devices, or biospecimens; downloading data via the web or other telecommunication conduit; analyzing data from a distance via secure connections; or visiting a center and using data or biospecimens onsite. A caution: Showing protected data to colleagues in one’s own institution who are not members or supervisees of the custodial team, or who are not otherwise authorized, usually constitutes improper granting of access. “She’s a friend who does similar research” may not be a dispensation.

The two basic modes of access Provision of access to information for health research proceeds via two basic modes: open and restricted (or controlled). Often a combination of the two is employed. Open access tends to be provided via posting on publicly accessible websites. This can be efficient, inexpensive, and egalitarian. Masses of useful data are made available this way. But usually such open distribution 9

This section builds on the author’s work in preparing a report for the Medical Research Council and the Wellcome Trust, Access to Collections of Data and Materials for Health Research (2006): www.wellcome.ac.uk/stellent/groups/corporatesite/@msh_grants/documents/web_document/wtx030842.pdf.

Access agreements

141

must be limited to data that are not personally identifiable, or to data for which consent to such public release has been granted. Open release of data that have been de-identified. This is commonplace. Before any such release, the identifiability risks must be assessed and reduced to an acceptable level. A drawback is that thoroughly stripped or aggregated data tend to lack the necessary fine detail. And if the de-identification is truly irreversible, it is impossible for anyone to recontact the data-subjects or databases to validate data, seek additional information or biospecimens, pass findings back to participants, or invite participation in other research projects. Open release of identifiable data with consent. This is fairly rare, although it is done with some data-sets judged to be of very low sensitivity, with a few for which the participants or a genuinely representative advocacy group urge open access, and with a few, such as those of the Personal Genome Project described in Chapter 8, for which open release is intended and announced from the inception. Restricted access of one sort or another is used for most privacy sensitive data. Access is mediated by contract-like agreements between the requesters and the data holders, who may be either the original data collectors, or curators of archived data, or research resource platforms. Access agreements Access agreements take into account legal and ethical requirements, funders’ policies, professional guidance, international conventions, and common decency, and specify terms. The same sorts of terms apply whether the application for access is made directly to a data steward or access committee, or made indirectly and less personally online. The agreements are legal undertakings. On the providing side, agreements are executed by data stewards (who may be senior investigators, project managers, archive custodians, funders, or corporations); data access committees; hospitals, universities or other institutions; or a combination. Data stewards must manage the terms in a manner that respects the obligations and commitments made to the data-subjects, but at the same time they should consider serving the broader public interest in sharing the data, via whatever mechanisms are appropriate. On the receiving side, agreements are executed by principal investigators or other research leaders, who assume responsibility for ensuring that the conditions are complied with, including by staff and students. The data requesters must decide whether they and their teams can and will comply with the terms, and whether they are willing to be held

142

Data sharing, access, and transfer

accountable. Often, in order to have legal force the agreements must be signed by the employing or hosting institutions as well as by the responsible investigators. Institutions considering co-signing with requesters who are their employees or affiliates must decide whether the reputations, circumstances, and safeguards warrant their taking on the responsibility and legal liability. Funders, research ethics boards, and various governance bodies must exercise duties having to do with optimizing the use of data, protecting data-subjects’ interests, ensuring that access is efficacious and fair, and watching over the research enterprise generally. Agreements take a variety of forms, from memoranda of understanding, to website clickthroughs, to complex contracts, to treaties. Biospecimens are usually shared via what are called “material transfer agreements.” Agreements may refer to strictures imposed by grant conditions, the terms of reference of a data sharing consortium, biospecimen bestpractice guidance, or other policies.

Terms of restricted access Access agreements must address a package of issues, many of them interrelated.10 The full range of matters that agreements cover is sketched in Box 11. The following are the terms most relevant to privacy and confidentiality. Their purpose is to maintain a chain of responsibility and accountability. Screening of professional competence. Requesters may be asked to provide evidence of experience with database research or the kind of immunological, psychiatric, or other technically demanding or ethically sensitive research involved, or of having published articles on the health topic of concern. This is meant to protect the resource from inept analyses that might insult or unduly alarm the data-subjects or impugn the resource, and to avoid wasting the resource team’s efforts. But the obligation of data or biospecimen providers to evaluate the competence of requesters is a matter of debate. Scientific freedom argues against it, and over time the scientific intellectual marketplace rejects or ignores incompetent work. The difficulty of such vetting can be higher if the requesters are far away or in a different culture, or are relatively unestablished investigators.

10

A template of potentially wide utility is National Cancer Research Institute (UK), “Samples and data for cancer research: Template for access policy development” (2009): www.ncri.org.uk/default.asp?s=1&p=8&ss=9.

Terms of restricted access

143

Conformance with consent. Aspects of access may hinge on the original consent, or on some subsequent consent to data sharing, or on the waiving of consent requirements by an ethics review body or by legislation. Among other things, consent may specify to whom access may be granted or the purposes for which the data can be used. Obviously the access agreement must reflect whatever limitations or safeguarding promises exist. Responding if consent is withdrawn. It may be required that if a research participant withdraws or revokes consent, or if consent is invalidated for any other reason, holders of data or biospecimens, including recipients further downstream, must return or destroy the data or materials and/or sever links and certify the actions to the data providers. (Issues surrounding withdrawal were discussed in Chapter 6.) Purpose limitation. A common purpose issue is whether a database can be used to identify potential subjects for other projects. Probably the most common limitation, either stated by project policies or requested by some data-subjects, is use of the data only for the study of certain health conditions or factors. But, as was described in Chapter 6, it can be difficult to define purposes precisely, and defining them too narrowly may block potentially useful studies. It has to be admitted that purpose limitation can be very difficult to audit or enforce once data or biospecimens have been transferred to faraway sites (or even, for that matter, to colleagues close by). Confidentiality. Solemn reminders may be made as to confidentiality obligations, privacy-protection restraints, human tissue regulations, or other laws or professional guidance. Identifiability protection may be discussed. If reversibly de-identified data are to be provided, the agreement should address how the identifiers will be held and by whom, and the criteria and procedures governing use of the key to re-identify. Agreements should always require that no attempt will be made to re-identify, trace, or contact the data-subjects without authorization from the data-subjects or data providers, or to use the received data or materials in any way that could infringe the rights of the subjects or otherwise affect them adversely. Confidentiality restrictions may also apply to information about healthcare providers or institutions, other researchers, or relatives of the data-subjects. Recontacting. Almost always it is forbidden to contact the datasubjects or biospecimen sources without going through an intermediary acceptable to the data-subjects, such as their physicians or physicians working with the health program in which the data were collected.

144

Data sharing, access, and transfer

Guidance may be included as to what to do if findings emerge that perhaps should be communicated to the data-subjects. Research ethics approval. Ethics committee approval is often a condition of access. Data providers, data recipients, or both, may have to obtain approval. Access policies may specify the stage(s) in the application process at which approval must be sought. Security. Reference may be made to physical, administrative, or information technology security standards or guidance. Special requirements may be imposed, such as requiring that portable devices carrying the data, such as laptops, never be allowed to leave secure premises unless the data are encrypted. If access must be effected through a data enclave or other extremely restricted route, the conditions must be referred to. Usually it is required to notify the data provider immediately if de-identified data become identified, whatever the means, intentions, or possible consequences; this could apply in the event of simple inadvertent recognition of a data-subject in a local cohort data-set, for example. Requirements may be imposed regarding actions that must be taken if a data security breach occurs. Limiting onward transfer. Data recipients must always promise not to pass the data or biospecimens on to unauthorized parties. An authorized party might be, for instance, a researcher in another institution who has signed a similar agreement as part of a consortium. As was said earlier, colleagues employed in the recipient’s own department, university, or corporation cannot be assumed as authorized de facto. Linking. Conditions may be imposed on the linking of the provided data with other data or with biospecimens. Increasingly agreements are having to discuss the linking of the shared data with biospecimens that the applicants have access to, or vice versa. Returning or destroying data or biospecimens. This may be required at the end of the project, or in the event of noncompliance with the terms of the access agreement. Relocation or termination. Conditions may be included as to what may or must happen to the transferred holdings if the recipients move to another institution, and what must be done if the recipient’s curatorial responsibility has to end, such as if the unit closes or runs out of the necessary funds. Transborder enforcement. If data or materials are being sent to recipients outside of the local legal jurisdiction, special ethics review, special safeguards, or auditing provisions may be included, as may a statement as to which jurisdiction’s laws will apply.

Terms of restricted access

145

Box 11. General terms of access agreements*

Screening of scientific . . . in order to protect the competence or project merit resource? To conserve effort? Specification of what is on offer data, biospecimens, analytic service, linking? Mode of access data tape, website, data enclave? Conformance with consent coverage, tracked how? Responding if a subject destroy data, specimens, links? withdraws Purpose limitation limits? Confidentiality de-identify? How? Promise not to try to re-identify? Recontacting data-subjects what would justify? How to be done? Research ethics approval at point of data collection? At research platform stage? Security standards? Any special safeguards? Limiting onward transfer restrictions? Linking . . . of what to what? Expectations, restrictions Maintaining data quality rectify errors, deal with contamination Publication and/or data release requirements? Timing? Protect identities Acknowledgment of providers “this work was based on data from . . .” Co-authoring required for control or credit-sharing? Depositing findings with return data? Documentation? the resource Informing data-subjects . . . of progress? Of findings about individuals? Archiving how? Who pays? Conditions of others’ access? Intellectual property (IP) rights IP assignments or waiving Prioritization of access . . . if biospecimens or other resources must be rationed Fees cost recovery? Does fee depend on IP profit prospects? Returning or destroying . . . when finished? If materials commitments are broken? Transborder enforcement legal constraints, ethics approval, liabilities Monitoring, oversight, or audit expectations, plans

146

Data sharing, access, and transfer

Box 11. (continued ) Contingencies if project is terminated Disclaimers Legal liability

destroy the resources? Pass on to a qualified institution? . . . not responsible for quality or consequences . . . of recipients if they don’t comply with commitments

* Modified from this author’s report, Access to Collections of Data and Materials for Health Research: www.wellcome.ac.uk/stellent/groups/corporatesite/@msh_grants/documents/web_document/wtx030842.pdf.

Privacy-preserving data linkage Linkage is the bringing together of data from individual-level data-sets, matching variables in the records that appear to pertain to the same people, events, dates, or phenomena, and constructing either partial or consolidated descriptions. Matching can proceed via identifiers common to the databases, such as healthcare identifier numbers or an encrypted version of them. Or it can proceed via probabilistic matching (i.e., “these data, and these data, and oh, these data too, all have a high chance of being about the same person”), which is often necessary because of changes in personal details over time, or because of variations in data recording, such as slight misspelling of the family name, or recording middle name instead of first name or leaving off a “Jr.”, or simply because of glitches in data entry. Such linking has become technically sophisticated and is now indispensable for many lines of research. Linking is a form of access – a reaching across the databases, and then to the linked-up data. The challenge is to protect the identities of the datasubjects while assembling a richer picture of each. With much linking (although not all) it has to be assumed that the linking increases the risk of deductive identification. Because the risk depends on the detailed nature of the data, specifying exactly what linking might be too risky relative to the safeguards is very difficult to codify in broad guidance or regulations and has to be judged in context. As with all databases, linkage arrangements can be screened by statistical re-identification risk assessment programs.11 Now, computer algorithms are being explored that allow the analysis of voluminous individual-level data held securely in separate 11

Such as those discussed on pp. 93–5 and 103 above.

Privacy-preserving data linkage

147

but interlinked computers as though the data are pooled, even though they are not.12 Much linking proceeds, productively, project by project. But increasingly the value in building enduring, multipurpose, privacy-preserving systems is being recognized. One highly respected example is the Western Australian Data Linkage System, a “chain of links” that allows data from a multitude of data-sets to be drawn upon as needed, customized to each project. Core data-sets, each under its own custodianship, include hospital admission and emergency presentation data, midwife notifications, cancer registrations, mental health contacts, birth and death registrations, and electoral records. Linkages can be made to many satellite databases, such as birth defects registration, Medicare enrollment, and pharmaceutical benefit data. Probabilistic links are made by using names, birthdates, sex, and components of residential address. Essentially the system operates through four distinct steps: 1. Linkage staff create, store and manage links in a dynamic Linkage System using confidential personal demographic information. 2. Linkage staff extract subsets of links from the Linkage System, then encrypt these “linkage keys” differently for each particular project. 3. Encrypted “linkage keys” are provided to the custodians (of the separate data-sets) so they can add them to their clinical or service details for that particular project. 4. Lastly, researchers receive clinical or service details from each data custodian and use the encrypted keys to connect the details needed for their analyses. “In this way,” the System explains, “access to identifying information is restricted to a specialised linkage team who perform the first and second steps. Data custodians are involved in the third step. Researchers are only involved in the last step and therefore do not need to access any personal identifying information.”13 12

13

Pooling, i.e., assembling all the data in a single database, makes data easier to analyze, but it tends to increase identifiability risks. A technique that can be thought of as virtual pooling is Michael Wolfson, Susan E Wallace, Nicholas Masca, et al., “DataSHIELD: Resolving a conflict in contemporary bioscience – performing a pooled analysis of individual-level data without sharing the data,” International Journal of Epidemiology, 39 (2010), 1372–1382: http://ije.oxfordjournals.org/content/39/5/1372.full.pdf+html?sid= 0579e27f-86f5–4311-adea-bfa2ae32df67. Western Australian Data Linkage System: www.datalinkage-wa.org. The description is from the website. See also C. D’Arcy, J. Holman, A. John Bass, Diana L. Rosman, et al., “A decade of data linkage in Western Australia: strategic design, applications and benefits of

148

Data sharing, access, and transfer

A very different, centralized model is one employed by the Manitoba Centre for Health Policy (MCHP) at the University of Manitoba.14 The Centre is the steward of a Population Health Research Data Repository of more than 60 healthcare, public health, education, justice, and social program databases periodically updated by the public agencies that collect the data. Individual-level data are de-identified before being entered into the central repository but remain linkable through a strict key-coding system managed by MCHP. Selected data are linked by authorized personnel at the Centre for each project, and remain linked only for the duration of that project. Applications for access must have ethics and other approvals, and the researchers must be accredited by the Centre. Access is provided in a high-security data laboratory. Reach is now being extended to secure satellite sites. (MCHP’s “Pledge of Privacy” was shown in Box 10.) A resource extensively used for studies of pharmaceutical safety and effectiveness is the UK General Practice Research Database (GPRD).15 The program collects and regularly updates data on more than 5 million currently active patients from 625 primary care practices, and now holds over 66 million patient-years of records. Names, addresses, and some other identifying data are masked at the data-providing practices, and the data, especially free-text data, are further screened at GPRD. Linkage is controlled via two keys: the first held by the data-providing practices, the second by GPRD. The practice key is known only to the practice. External researchers have no way of knowing where the practice is located. Additional linkage, such as with hospitalization and disease registry data, is performed by the NHS Information Centre for Health and Social Care. Derived data-sets are provided for approved research, which has led to more than 900 peer-reviewed publications so far. A novel system is Vanderbilt University’s DNA databank, BioVU. The program accrues DNA from blood samples scheduled to be discarded after clinical testing in the university medical center. Patients are presented with clear notice and opportunities to opt-out. Using a secure computer program (a one-way hash algorithm), the project cross-indexes the DNA samples with “synthetic derivative data,” data that have

14

15

the WA data linkage system,” Australian Health Review, 32 (2008), 766–777: www.publish. csiro.au/?act=view_file&file_id=AH080766.pdf; Emma L. Brook, Diana L. Rosman, and C. D’Arcy Holman, “Public good through data linkage: Measuring research outputs from the Western Australian Data Linkage System,” Australian and New Zealand Journal of Public Health, 32 (2008), 19–23. Manitoba Centre for Health Policy: www.umanitoba.ca/faculties/medicine/units/community_health_sciences/departmental_units/mchp. See also Patricia Martens, “How and why does it ‘work’ at the Manitoba Centre for Health Policy? A model of data linkage, interdisciplinary research, and scientist/user interactions,” in Flood (ed.), Data Data Everywhere, pp. 137–150. General Practice Research Database: www.gprd.com/home.

Extremely restricted access

149

been extracted from patients’ electronic health records and irreversibly de-identified. No ongoing link to the records is maintained. The system allows known disease-related genomic data to be associated with detailed diagnostic and other medical data, and through this the replication of genome–phenome associations observed in research cohorts. Because neither the samples nor the data are identifiable to the researchers, BioVU is considered by both the local Institutional Review Board and the US Office for Human Research Protections not to be human-subject research.16

Extremely restricted access Data enclaves (also called research data centers, data safe havens, or data safe harbors) are arrangements for providing tightly restricted access to sensitive individual-level data held in isolated databases. Applicants must confirm their research bona fides, have a justified research need, gain whatever ethics committee approval is required, and commit to working in the enclave and conforming to tight restrictions on using, copying, and transferring data. A contract-like license may be required, as may completion of a training course. Data enclaves are not new – they have long been used by census and public statistics agencies, for instance, and several of the linkage centers mentioned above are data enclaves – but the need has been growing and the technologies have been improving rapidly.17 Working “in” an enclave can be effected by visiting a center and using the database onsite, or it can be effected by remotely accessing the database via virtual private network or other secure data conduit. The database system may monitor the user’s keystrokes in real time and audit patterns of use, especially when the user is linking across databases. Many such enclaves require that analytic outputs be reviewed by staff statisticians for privacy risks before being removed from the center either physically or electronically. Some require that resulting article manuscripts similarly be reviewed before being submitted for publication. An alternative approach is remote service query, in 16


16 BioVU is part of the eMERGE Network described on p. 115 above. Jill Pulley, Ellen Clayton, Gordon R. Bernard, et al., “Principles of human subjects protections applied in an opt-out, de-identified biobank,” Clinical and Translational Science, 3 (2010), 42–48; Grigorios Loukides, Aris Gkoulalas-Divanis, and Bradley Malin, “Anonymization of electronic medical records for validating genome-wide association studies,” Proceedings of the National Academy of Sciences, 107 (2010), 7898–7903.
17 US National Center for Health Statistics, Research Data Center: www.cdc.gov/rdc; Statistics Canada, Research Data Centres Program: www.statcan.gc.ca/rdc-cdr/index-eng.htm.


Several national statistics agencies allow approved researchers to come inside the organization, in effect – thereby bringing the researcher’s actions under the restraints and penalties of the agency’s statutes – by signing a non-disclosure contract, being sworn in as a “deemed employee” or “designated agent” of the agency, and agreeing to be supervised by a technical employee of the agency. Statistics Canada does this under the Statistics Act, and the US National Center for Health Statistics does it under the Confidential Information Protection and Statistical Efficiency Act.

The need for highly restricted mechanisms for accessing sensitive data was recognized by the Thomas and Walport data sharing report, which recommended that:

“Safe havens” should be developed as an environment for population-based research and statistical analysis in which the risk of identifying individuals is minimised; and furthermore we recommend that a system of approving or accrediting researchers who meet the relevant criteria to work within those safe havens is established. We think that implementation of this recommendation will require legislation, following the precedent of the Statistics and Registration Service Act 2007. This will ensure that researchers working in “safe havens” are bound by a strict code, preventing disclosure of any personally identifying information, and providing criminal sanctions in case of breach of confidentiality.18

Genuinely critical accreditation would involve nontrivial issues. Who would be responsible for the accrediting? What criteria would be applied, and what pledges required? Would accreditation grant broad passport-like access, or would accreditation be just one among other qualifications to be considered when the researcher applies for access to a data source? Would applicants’ employing institutions have to endorse the application? Could scientists outside the accrediting country qualify? Could an accredited senior investigator delegate data analysis to technicians, students, postdoctoral fellows, or visiting scientists? Enforcement would be an issue, and although many census and statistics programs have statutory privacy-protection bases, few research programs anywhere do.

18 Thomas and Walport, Data Sharing Review Report, recommendation 15.


It will be instructive to see how the government and Parliament respond to the legislative recommendation, and whether and how accreditation might be developed elsewhere.

Clearly, extremely restricted arrangements can ensure high confidentiality protection. But they can exclude researchers who for various reasons are unable to travel to the centers or cannot afford the technology to access the databases securely from a distance. They can exclude younger researchers and students. And they go against the scientific tradition of placing the data upon which conclusions are based out in public – truly out in public – so that other researchers, journal editors, and medical product regulators can validate the calculations or probe the source data using other statistical methods or different analytic assumptions. Every effort should be made to compensate for these shortcomings.

Oversight and governance

Until fairly recently, decisions about access to many collections have been made by the curating principal investigators, perhaps aided by a few colleagues reviewing applications. This is changing.

Data access committees. In efforts to foster data sharing, increase transparency, and surround the sharing with governance reassurances, decisions about access to collections, especially larger and more complex collections, tend now to be made or overseen by committees. The committees go by different names, but for simplicity here they will be referred to as data access committees (DACs). The most common model is a DAC that is fairly independently constituted, involving some members from outside the hosting institution and perhaps a few lay members or representatives of advocacy groups. The curating principal investigators or institutions may be participating and voting members, or they may be arm’s-length recipients of the committee’s decisions. The true degree of independence of DACs varies – few if any include extreme critics of the project, and many include biased representatives or friends of the collection’s funders or hosting institution. However they are constituted, though, the committees’ purpose is to make stewardship decisions that faithfully reflect the collection’s mandate and restrictions, and that can stand up to public scrutiny.

Often it is the project funders who establish the DAC’s mandate, appoint its members, and pay travel and other committee operating costs. They may embed trusted insiders as committee members, or at least retain a right to send observers to DAC meetings. For tight control, some governmental programs use DACs composed mainly or only of government employees.


Some DACs are formally constituted and appointed, while some are more casual. Some publish their criteria, decisions, and decision rationales, but most don’t. Some basically advise the data custodians, who then make the yes/no (or revise-and-reapply) access decisions. But many DACs make binding decisions.

Other forms of oversight. In addition to DACs, many higher-level governance arrangements affect policy development and adherence. Boards of directors, trustees, or funders exert fiduciary oversight, and they may serve as the resort of appeal in the event of a serious access dispute. A few research platforms have developed novel arrangements. For example, the UK Biobank project is watched over by an Ethics and Governance Council – independent of both the Board of Directors and the project management, and not a regulatory body – charged with monitoring and reporting publicly on the conformity of UK Biobank with an Ethics and Governance Framework and subsidiary policies. Among many other activities the Council has advised in depth on the recruitment, consent, and research access policies, and now it is monitoring how and for what purposes access to the resource is granted. Records of its deliberations are published on its website.19

International transfer

Data and biospecimens are shuttled around the world for research all the time, and the volume of material and diversity of routes and destinations will only continue to increase. The principal legal arrangements are the following.

Transfer under contract. Cross-border transfers are often made under contracts or contract-like agreements, sometimes mutually signed memoranda of understanding. Usually these carry terms similar to those in other access agreements, but in addition carry provisos regarding jurisdiction. For example, the data requesters may promise to comply with the research ethics regulations of the data-providing country, and agree that any legal dispute will be referred to a court of law in that country. Or the data providers may defer to protections imposed in the recipient country. Options to inspect or audit may be included.

19 UK Biobank Ethics and Governance Council: www.egcukbiobank.org.uk. UK Biobank was described in Chapter 2 and Box 2. See also Martin Richards, Adrienne Hunt, and Graeme Laurie, “UK Biobank Ethics and Governance Council: An exercise in added value,” in Kaye and Stranger (eds.), Biobank Governance, pp. 229–242; Graeme Laurie, “Reflexive governance in biobanking: On the value of policy led approaches and the need to recognize the limits of law,” Human Genetics, 130 (2011), 347–356.


The legal guaranty may be increased by involving the data recipient’s institution as a party to the agreement, or by having a government agency or ministry endorse it. In some instances, transfers or classes of transfers are sanctioned by intergovernmental agreements.

For transfers both within Canada and from Canada to other countries, the Canadian Personal Information Protection and Electronic Documents Act (PIPEDA) takes an organization-to-organization approach – not jurisdiction-to-jurisdiction – which can be based on contracts (Schedule 1, clause 4.1.3): “An organization is responsible for personal information in its possession or custody, including information that has been transferred to a third party for processing. The organization shall use contractual or other means to provide a comparable level of protection while the information is being processed by a third party.” The Privacy Commissioner has interpreted this to mean that the data recipient “must provide protection that can be compared to the level of protection the personal information would receive if it had not been transferred.”20

Examples of straightforward contractual provisions are those in the Wellcome Trust Case Control Consortium’s Data Access Agreement:

You accept that the Data is protected by and subject to international laws, including but not limited to the UK Data Protection Act 1998, and that you are responsible for ensuring compliance with any such applicable law. The Consortium Data Access Committee reserves the right to request and inspect data security and management documentation to ensure the adequacy of data protection measures in countries that have no national laws comparable to that which pertain in the European Economic Area. This agreement shall be construed, interpreted and governed by the laws of England and Wales and shall be subject to the nonexclusive jurisdiction of the English courts.21

Surely, though, such inspecting is unlikely to be undertaken unless grave allegations of abuse arise, and given the legal distances and costs, formal enforcement could be difficult. Nonetheless, the prospect of signing such an agreement usually engages the attention of the privacy officers and legal advisors of the requester’s institution, and it leads to a documented promise to which data recipients can be held accountable, at the very least in the courts of public opinion and future collaboration.

20 Canada, Office of the Privacy Commissioner, “Guidelines for processing personal data across borders”: www.priv.gc.ca/information/guide/2009/gl_dab_090127_e.pdf.
21 Wellcome Trust Case Control Consortium, Data Access Agreement: www.wtccc.org.uk/docs/Data_Access_Agreement_v18.pdf, clauses 16 and 17.


Transfer within the EU/EEA. As was mentioned in Chapter 4, the EU Data Protection Directive and the national laws that transpose it establish baseline conditions for transfer of personal data among the 30 EEA countries.22 The provisions do not negate additional national or provincial restrictions such as those imposed by ethics review bodies or data access committees. There is considerable inconsistency among the countries’ laws.

Transfer from the EU/EEA to non-EEA countries (called “third countries” by the Data Protection Directive). This continues to be a contentious and uncertain set of issues, despite the fact that several mechanisms have been approved. The Directive provides that personal data may be transferred from the EU countries if the recipient country “ensures an adequate level of protection” by virtue of its domestic law or international commitments (Articles 25 and 26). The European Commission determines the adequacy status of applicant countries through a formal evaluation process led by the Article 29 Working Party. This mainly involves an assessment of the laws, regulatory mechanisms, and court decisions. Since the Directive’s entry into force in 1998, the Commission has approved adequacy status for only five non-EEA entities, several of them very small countries mainly interested in the movement of financial data, as well as for transfers to US commercial entities subscribing to a Safe Harbor Framework, a mechanism described below.23

Personal data can be transferred to a country that does not have adequacy status if the data controller, the person who determines the purposes and means of data processing, can “adduce,” i.e., guarantee at risk of legal enforcement, that either: (a) the data-subject has consented to the transfer; or (b) approved sorts of safeguards are employed; or (c) defensible exceptions apply. The principal safeguards that are recognized are contractual guarantees, binding corporate rules, and the EU–US Safe Harbor Agreement.24

Contractual guarantees, such as those set out in standard contract clauses that the Commission has approved, may be asserted for transfers from EU data sources to recipients in countries not certified as affording adequate protection, and for other transfers.


22 The European Economic Area comprises the 27 EU Member States, plus Iceland, Liechtenstein, and Norway, which, although not members of the EU, are committed to enacting laws conforming with those of the EU in such areas as data protection.
23 As of March 2012 the EC had formally approved Argentina, Canada (for commercial organizations subject to the Personal Information Protection and Electronic Documents Act), Guernsey, the Isle of Man, and Switzerland. It had approved several other countries with reservations.
24 These mechanisms are discussed in helpful detail in European Commission, Directorate-General for Justice, Freedom and Security, “Frequently asked questions relating to transfers of personal data from the EU/EEA to third countries” (2010): http://ec.europa.eu/justice/policies/privacy/docs/international_transfers_faq/international_transfers_faq.pdf. The status and evolution of the EU policies and Member State data protection activities can be followed at: http://ec.europa.eu/justice/data-protection/index_en.htm.


Binding corporate rules can be approved for transfers among the units of “closely-knit, highly hierarchically structured multinational companies” that are able to, and promise to, monitor data transfers and other processing and enforce their internal rules. The company rules must be approved by national data protection authorities, until recently country-by-country. A system is now developing in which, for efficiency and uniformity, approval by one EU country is recognized by other countries.

The EU–US Safe Harbor Agreement is a treaty-level accord entered into between the European Commission and the US Department of Commerce under which US businesses can self-certify as conforming to a Safe Harbor Framework and its principles (basically the OECD privacy principles) and thereby be considered to afford an adequate level of protection. Failure to comply with the commitments is actionable under US federal and state laws prohibiting unfair and deceptive acts in commerce.25 At the time of writing, many biotechnology and pharmaceutical companies have joined, although not all of the large ones, and some have joined for transfer of human resources data, or just security camera recordings, but not research data.26 Some large research-based pharmaceutical and other medical product manufacturers have resisted joining because they are concerned that the generality of the criteria could expose them to uncertain legal liability. Others have declined to join because they believe that their R&D work does not involve the importation of personal data from the EU/EEA. Still others use other legal mechanisms. But again, many companies have signed on to Safe Harbor. Enforcement has been light.

Consent and public interest exemption are two other grounds recognized by the Directive for transfer of personal data to countries not recognized as ensuring adequate protection (Article 26). But what proper informed consent should consist in for transfers to destinations beyond the routine enforcement reach of a data-subject’s home legal system has never been clear. And although communicable disease tracking, drug safety monitoring, and many other public health activities are widely recognized as public interest activities, the Directive provides no criteria for making public interest determinations.


25 US Department of Commerce, Safe Harbor certifications: http://export.gov/safeharbor/eu/index.asp. Among other things, the website lists the companies that have joined and the kinds of data transfers for which they have certified themselves. (It also shows that a lot of former subscribers have allowed their certification to lapse.) Switzerland has an identical Safe Harbor arrangement with the US.
26 Transfer of personal data for pharmaceutical and medical products research under Safe Harbor is specifically commented on at: www.export.gov/safeharbor/eu/eg_main_018386.asp.


US academic and other noncommercial organizations can receive personal data from Europe under contractual guarantees, and US government agencies can exchange data and biospecimens under diplomatic agreements with their EU counterparts. But because they are not regulated as companies engaged in commerce, most academic and noncommercial organizations are not eligible to use the binding corporate rule or Safe Harbor mechanisms.

Some reflections

Among the issues of international transfer that keep literally thousands of lawyers in many organizations in many countries fretting are:
□ the general lack of uniformity among countries’ laws and their implementation and enforcement;
□ the difficulty, in multisite, multicountry processing, of specifying exactly who the data controller is, or controllers are, and thus where responsibility and accountability lie;
□ the complexity of the chain of contracts (sometimes hundreds) that can be required for transfers of personal data from data controllers to processors to subprocessors, across borders, and the need to revise them as research operations evolve;
□ the unpredictable liability exposure created by the generality, even vagueness, of the provisions in several of the mechanisms;
□ the procedural and cost impediments, in reality, to a data-subject’s pursuing a complaint across international borders, and even whether the laws empower individuals to take action in their own right, as compared with having to petition their data protection authority to take on the case;
□ interest in, and apprehension about, the possible future acceptance of class action (group) privacy or data protection lawsuits;
□ the fact that health research, with its very special issues but low lobbying presence, just isn’t a central concern for most data protection authorities.

There is some movement toward universal standards. In 2009 the International Conference of Data Protection and Privacy Commissioners endorsed a “Joint proposal for a draft of international standards on the protection of privacy with regard to the processing of personal data” as a “step toward the development of a binding international instrument in due course.”27


27 International Conference of Data Protection and Privacy Commissioners (2009), “International standards on the protection of personal data and privacy”: www.agpd.es/portalweb/canaldocumentacion/conferencias/common/pdfs/31_conferencia_internacional/estandares_resolucion_madrid_en.pdf.


Depending on how it emerges from the political process, this should be constructive, but inevitably such standards or instrument will be high-level and generic. It may be desirable now for the health research community to consider establishing an international convention specifically on transfer of data and/or biospecimens for biomedical research. Mutual recognition of ethics review or other governance arrangements would have to be considered, as would such matters as cross-national accreditation of institutions, and certification and authentication of researchers. An initial task would be to take stock of the existing protections, and then think about the kinds of data and biospecimen transfers that are likely to challenge them in the future.28


28 Possibly some lessons of procedure and political dynamics can be learned from the experience of the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, an initiative described in footnote 18 on p. 58 above.

11

Ways forward

In the author’s view the following actions and policies discussed in the book would advance health research while neither disrespecting people’s privacy nor losing their trust. Although local receptivity to each proposition will depend on the character of the research system and the ethical and legal context, universal movement on all should be both desirable and possible in the medium term. Progress on them would aid – and, for that matter, help keep up with – the globalization of health research and the international movement of data and biospecimens. Rigorous safeguards and governance should be assumed to be in place as appropriate.

Constructive steps would include:
□ rethinking the fundamental notion of consent and recasting it to connote more authentically that what is involved is entrusting, with consequences for the framing, leadership, recruitment, conduct, public engagement, and oversight of research projects;
□ encouraging fuller acceptance of broad consent or authorization for unspecified research uses of data or biospecimens;
□ defending the case for pursuing research on existing data or biospecimens without explicit consent when it is impractical or inappropriate to seek consent;
□ continuing to think about the cross-influences between identifiability and consent or other expressions of permission;
□ refining de-identification techniques and privacy risk assessment methods, and scaling protections to appraised privacy risks;
□ supporting the policy that if data are not identifiable to a researcher and the researcher commits to not attempting to identify or contact the data-subjects, then the data are not in legal senses “personal data” or “personally identifiable data” for that researcher;
□ allowing qualified personnel to search through clinical or other records to identify potential research candidates;
□ generally reducing consent and procedural requirements for noninterventional, low-risk research such as studies of thoroughly de-identified data or biospecimens;


□ supporting the policy that if research does not involve interacting with individuals and identifiable data are not involved, the research does not constitute human-subject research;
□ developing statutes or regulations that specifically protect identified or potentially identifiable data used or generated in health research, to complement or supplant the provisions in omnibus privacy and data protection regimes;
□ making sure that the barriers against access to research data for nonresearch purposes by police, courts, insurers, banks, and other external parties are as high as possible;
□ continuing to strengthen physical, administrative, and cyber security in research centers;
□ shifting regulatory priorities from regulating data-handling procedures toward enforcing data access and use agreements and penalizing data-subject abuses;
□ in various research specialties, developing criteria for deciding whether and how to inform data-subjects, and possibly their relatives as well, of research findings (including incidental observations) and what any follow-up obligations of counseling and care should be;
□ attending to the issues accompanying the international movement of data and biospecimens, and exploring the potential usefulness and feasibility of developing an international legal convention or code of conduct on cross-border transfer of data and biospecimens in health research;
□ in law and public discourse, promoting health research as a public-interest cause and emphasizing the public interest in protecting privacy and confidentiality while pursuing research.

Bibliography

Academy of Medical Sciences (UK), A New Pathway for the Regulation and Governance of Health Research (2011): www.acmedsci.ac.uk/p99puid209.html.
Acquisti, Alessandro and Ralph Gross, “Predicting Social Security numbers from public data,” Proceedings of the National Academy of Sciences, 106 (2009), 10975–10980.
American Medical Association, Code of Medical Ethics: www.ama-assn.org/ama/pub/physician-resources/medical-ethics/code-medical-ethics.
Asia–Pacific Economic Cooperation, “Privacy framework”: www.apec.org/Groups/Committee-on-Trade-and-Investment/Electronic-Commerce-Steering-Group.aspx.
Australian Law Reform Commission, For Your Information: Australian Privacy Law and Practice (ALRC Report 108, 2008): www.austlii.edu.au/au/other/alrc/publications/reports/108.
Australian National Health and Medical Research Council, Australian Research Council, and Australian Vice-Chancellors’ Committee, National Statement on Ethical Conduct in Human Research (updated 2009): www.nhmrc.gov.au/_files_nhmrc/publications/attachments/e72.pdf.
Australian Office of the Privacy Commissioner, Privacy Impact Assessment Guide (2010): www.privacy.gov.au.
Avon Longitudinal Study of Parents and Children (ALSPAC), “ALSPAC withdrawal of consent policy”: www.bristol.ac.uk/alspac/documents/ethics-full-withdrawal-of-consent-policy-07022011.pdf.
Benitez, Kathleen, Grigorios Loukides, and Bradley Malin, “Beyond Safe Harbor: Automatic discovery of health information de-identification policy alternatives,” Proceedings of the 1st ACM International Health Informatics Symposium (New York: Association for Computing Machinery, 2010), pp. 163–172: http://hiplab.mc.vanderbilt.edu/people/malin/Papers/benitez_ihi.pdf.
Benitez, Kathleen and Bradley Malin, “Evaluating re-identification risks with respect to the HIPAA privacy rule,” Journal of the American Medical Informatics Association, 17 (2010), 169–177.
Bennett, Colin J., Regulating Privacy: Data Protection and Public Policy in Europe and the United States (Ithaca, NY: Cornell University Press, 1992).
Bennett, Colin J. and Charles D. Raab, The Governance of Privacy: Policy Instruments in a Global Perspective (Cambridge, MA: MIT Press, 2006).


Berg, Jessica W., Paul S. Appelbaum, Lisa S. Parker, and Charles W. Lidz, Informed Consent: Legal Theory and Clinical Practice (New York: Oxford University Press, 2001).
Beskow, Laura M., Lauren Dame, and E. Jane Costello, “Certificates of Confidentiality and compelled disclosure of data,” Science, 322 (2008), 1054–1055, and two responses and the authors’ reply, Science, 323 (2009), 1288–1290.
Beskow, Laura M., Kristen N. Linney, Rodney A. Radtke, et al., “Ethical challenges in genotype-driven research recruitment,” Genome Research, 20 (2010), 705–709: http://genome.cshlp.org/content/20/6/705.full.
Bieber, Frederick K., Charles H. Brenner, and David Lazar, “Finding criminals through DNA of their relatives,” Science, 312 (2006), 1315–1316.
Biesecker, Leslie G., Joan E. Bailey-Wilson, Jack Ballantyne, et al., “DNA identifications after the 9/11 World Trade Center attack,” Science, 310 (2005), 1122–1123.
Biggs, Hazel, Healthcare Research Ethics and Law: Regulation, Review and Responsibility (Abingdon and New York: Routledge-Cavendish, 2010).
Brook, Emma L., Diana L. Rosman, and C. D’Arcy Holman, “Public good through data linkage: Measuring research outputs from the Western Australian Data Linkage System,” Australian and New Zealand Journal of Public Health, 32 (2008), 19–23.
Brownsword, Roger, “Consent in data protection law: Privacy, fair processing, and confidentiality,” in Serge Gutwirth, Yves Poullet, Paul De Hert, et al. (eds.), Reinventing Data Protection? (Dordrecht and London: Springer, 2009).
Burton, Paul R., Isabel Fortier, and Bartha M. Knoppers, “The global emergence of epidemiological biobanks: Opportunities and challenges,” in Muin J. Khoury, Sara R. Bedrosian, Marta Gwinn, et al. (eds.), Human Genome Epidemiology, second edn. (New York: Oxford University Press, 2010), pp. 77–99.
Burton, Paul R., Anna L. Hansell, Isabel Fortier, et al., “Size matters: Just how big is BIG? Quantifying realistic sample size requirements for human genome epidemiology,” International Journal of Epidemiology, 38 (2009), 263–273.
Canada, Office of the Privacy Commissioner, “Guidelines for processing personal data across borders”: www.priv.gc.ca/information/guide/2009/gl_dab_090127_e.pdf.
“Leading by example: Key developments in the first seven years of the Personal Information Protection and Electronic Documents Act” (2008): www.priv.gc.ca/information/pub/lbe_080523_e.cfm.
Canadian Institutes of Health Research, “Best Practices for Protecting Privacy in Health Research” (2005, and amended since): www.cihr-irsc.gc.ca/e/29072.html.
Canadian Institutes of Health Research, Natural Sciences and Engineering Research Council of Canada, and Social Sciences and Humanities Research Council of Canada, Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (second edn., 2010): www.pre.ethics.gc.ca/pdf/eng/tcps2/TCPS_2_FINAL_Web.pdf.


Cassa, Christopher A., Brian Schmidt, Isaac S. Kohane, and Kenneth D. Mandl, “My sister’s keeper? Genomic research and the identifiability of siblings,” Biomed Central Medical Genomics, 1(32) (2008): www.biomedcentral.com/content/pdf/1755-8794-1-32.pdf.
Caulfield, Timothy, Stephanie M. Fullerton, Sarah E. Ali-Khan, et al., “Race and ancestry in biomedical research: Exploring the challenges,” Genome Medicine, 1, 8.1–8.8 (2009): http://genomemedicine.com/content/pdf/gm8.pdf.
Caulfield, Timothy, Amy L. McGuire, Mildred Cho, et al., “Research ethics recommendations for whole-genome research: Consensus statement,” PLoS Biology, 6(3), e73. doi:10.1371/journal.pbio.0060073 (2008).
Cavoukian, Ann and Khaled El Emam, “Dispelling the myths surrounding deidentification: Anonymization remains a strong tool for protecting privacy,” discussion paper on the website of the Ontario Information and Privacy Commissioner (2011): www.ipc.on.ca/images/Resources/anonymization.pdf.
Childress, James F., Eric M. Meslin, and Harold Shapiro (eds.), Belmont Revisited: Ethical Principles for Research with Human Subjects (Washington, DC: Georgetown University Press, 2005).
Claudot, Frédérique, François Alla, Jeanne Fresson, et al., “Ethics and observational studies in medical research: Various rules in a common framework,” International Journal of Epidemiology, 38 (2009), 1104–1108.
Collins, Francis, The Language of Life: DNA and the Revolution of Personalized Medicine (New York: HarperCollins, 2010).
Conley, John M., Adam K. Doerr, and Daniel B. Vorhaus, “Enabling responsible public genomics,” Health Matrix, 20 (2010), 325–385: www.genomicslawreport.com/wp-content/uploads/2011/02/Health_Matrix_-_Journal_of_LawMedicine_Vol_20_2010.pdf.
Connolly, Chris, “Asia-Pacific region at the privacy crossroads” (2008): www.galexia.com/public/research/assets/asia_at_privacy_crossroads_20080825/.
Corrigan, Oonagh, John McMillan, Kathleen Liddell, et al. (eds.), The Limits of Consent: A Socio-ethical Approach to Human Subject Research in Medicine (Oxford University Press, 2009).
Council for International Organizations of Medical Sciences, “International Guidelines for Epidemiological Studies” (Geneva, 2009), available via: www.cioms.ch.
Council of Europe, “Recommendation of the Committee of Ministers to member states on research on biological materials of human origin” (2006), CM/Rec(2006)4: https://wcd.coe.int/wcd/ViewDoc.jsp?id=977859.
“Recommendation of the Committee of Ministers to member states on the protection of individuals with regard to automatic processing of personal data in the context of profiling” (2010), CM/Rec(2010)13: https://wcd.coe.int/wcd/ViewDoc.jsp?id=1710949&Site=CM.
Davies, Kevin, The $1,000 Genome: The Revolution in DNA Sequencing and the New Era of Personalized Medicine (New York: Free Press, 2010).
Detels, Roger, Robert Beaglehole, Mary Ann Lansang, and Martin Gulliford, Oxford Textbook of Public Health, fifth edn. (Oxford University Press, 2009).
Domingo-Ferrer, Josep and Emmanouil Magkos (eds.), Privacy in Statistical Databases (Berlin: Springer-Verlag, 2010).


Duke University Health System Institutional Review Board (IRB), “Policy on IRB determination of research not involving human subjects for research using coded specimens or coded identifiable private information” (2008): http://irb.duhs.duke.edu/wysiwyg/downloads/Coded_Specimens_and_Coded_Identifiable_PHI_Policy_05-15-08.pdf.
El Emam, Khaled, “Methods for the de-identification of electronic health records for genomic research,” Genome Medicine, 3(25) (2011): http://genomemedicine.com/content/3/4/25.
El Emam, Khaled and Anita Fineberg, “An overview of techniques for de-identifying personal health information” (2009): www.ehealthinformation.ca/documents/DeidTechniques.pdf.
El Emam, Khaled, Elizabeth Jonker, and Anita Fineberg, “The case for de-identifying personal health information” (2011): http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1744038.
empirica GmbH for the European Commission, Karl A. Stroetmann, Jörg Artmann, Veli N. Stroetmann, et al., European Countries on their Journey towards National eHealth Infrastructures (2011): www.ehealth-strategies.eu/report/eHealth_Strategies_Final_Report_Web.pdf.
Eriksson, Nicholas J., Michael Macpherson, Joyce Y. Tung, et al., “Web-based, participant-driven studies yield novel genetic associations for common traits,” PLoS Genetics, 6(6), e1000993. doi:10.1371/journal.pgen.1000993 (2010).
European Article 29 Data Protection Working Party, “Opinion on the concept of personal data” (2007): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2007/wp136_en.pdf.
“Opinion on the concepts of ‘controller’ and ‘processor’” (2010): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2010/wp169_en.pdf.
“Opinion on the definition of consent” (2011): http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2011/wp187_en.pdf.
“Opinion on geolocation services on smart mobile devices” (2011): http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2011/wp185_en.pdf.
“Opinion on the principle of accountability” (2010): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2010/wp173_en.pdf.
“Working document on the processing of personal data relating to health in electronic health records” (2007): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2007/wp131_en.pdf.
European Article 29 Data Protection Working Party jointly with the Working Party on Police and Justice, “The Future of Privacy” (2009): http://ec.europa.eu/justice/policies/privacy/docs/wpdocs/2009/wp168_en.pdf.
European Commission, “Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation)” (2012): http://ec.europa.eu/justice/data-protection/document/review2012/com_2012_11_en.pdf.
European Commission, Directorate-General for Justice, Freedom and Security, “Frequently asked questions relating to transfers of personal data from the EU/EEA to third countries” (2010): http://ec.europa.eu/justice/policies/privacy/docs/international_transfers_faq/international_transfers_faq.pdf.


Fabsitz, Richard R., Amy McGuire, Richard Sharp, et al., “Ethical and practical guidelines for reporting genetic research results to study participants: Updated guidelines from a National Heart, Lung, and Blood Institute working group,” Circulation: Cardiovascular Genetics, 3 (2010), 574–580: http://circgenetics.ahajournals.org/content/3/6/574.full.
Faden, Ruth R., Thomas L. Beauchamp, and Nancy M. P. King, A History and Theory of Informed Consent (New York: Oxford University Press, 1986).
Flood, Colleen M. (ed.), Data Data Everywhere: Access and Accountability? (Montreal, Quebec and Kingston, Ontario: McGill-Queen’s University Press, 2011).
Friedman, Lawrence M., Curt D. Furberg, and David L. DeMets, Fundamentals of Clinical Trials, fourth edn. (New York: Springer, 2010).
Gellman, Robert, “The deidentification dilemma: A legislative and contractual proposal,” Fordham Intellectual Property, Media, and Entertainment Law Journal, 21 (2011), 33–61: http://iplj.net/blog/wp-content/uploads/2010/11/C02_Gellman_010411_Final.pdf.
General Medical Council (UK), “Good Medical Practice Guidance: Confidentiality” (2009): www.gmc-uk.org/guidance/ethical_guidance/confidentiality_40_50_research_and_secondary_issues.asp.
Gibson, Greg and Gregory P. Copenhaver, “Consent and Internet-enabled human genomics,” PLoS Genetics, 6(6), e1000965. doi:10.1371/journal.pgen.1000965 (2010).
Gitschier, Jane, “Inferential genotyping of Y chromosomes in Latter-Day Saints founders and comparison to Utah samples in the HapMap project,” American Journal of Human Genetics, 84 (2009), 251–258.
Goodman, Richard A. (ed.), Law in Public Health Practice, second edn. (New York: Oxford University Press, 2007).
Gostin, Lawrence O., Public Health Law: Power, Duty, Restraint, second edn. (Berkeley, CA: University of California Press, 2009).
Gottweis, Herbert and Alan Petersen (eds.), Biobanks: Governance in Comparative Perspective (London: Routledge, 2008).
Green, Eric D., Mark S. Guyer, and the US National Human Genome Research Institute, “Charting a course for genomic medicine from base pairs to bedside,” Nature, 470 (2011), 204–213.
Greenleaf, Graham, “Country study B.5 – Japan,” an appendix to LRDP Kantor Ltd, “New challenges” (2010): http://ec.europa.eu/justice/policies/privacy/docs/studies/new_privacy_challenges/final_report_country_report_B5_japan.pdf.
Groebner, Valentin, Who Are You? Identification, Deception, and Surveillance in Early Modern Europe (New York: Zone Books, 2007).
Gutwirth, Serge, Yves Poullet, Paul De Hert, et al. (eds.), Reinventing Data Protection? (Dordrecht and London: Springer, 2009).
Hardcastle, Rohan, Law and the Human Body: Property Rights, Ownership and Control (Oxford and Portland, OR: Hart Publishing, 2009).
Hindorff, L. A., J. MacArthur, A. Wise, et al., “A catalog of published genome-wide association studies”: www.genome.gov/gwastudies.


Hodge, James G. and Lawrence O. Gostin, “Public Health Practice vs. Research. A report to the Council of State and Territorial Epidemiologists” (2004): www.cste.org/pdffiles/newpdffiles/CSTEPHResRptHodgeFinal.5.24.04.pdf.
Holman, C. D’Arcy J., A. John Bass, Diana L. Rosman, et al., “A decade of data linkage in Western Australia: Strategic design, applications and benefits of the WA data linkage system,” Australian Health Review, 32 (2008), 766–777: www.publish.csiro.au/?act=view_file&file_id=AH080766.pdf.
Homer, Nils, Szabolcs Szelinger, Margot Redman, et al., “Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays,” PLoS Genetics, 4(8), e1000167. doi:10.1371/journal.pgen.1000167 (2008).
Hrynaszkiewicz, Iain, Melissa L. Norton, Andrew J. Vickers, and Douglas G. Altman, “Preparing raw clinical data for publication: Guidance for journal editors, authors, and peer reviewers,” BMJ, 340 (2010), 304–307.
Infectious Diseases Society of America, William Burman and Robert Daum, “Grinding to a halt: The effects of the increasing regulatory burden on research and quality improvement efforts,” Clinical Infectious Diseases, 49 (2009), 328–335.
Institute of Medicine (US), Committee on Health Research and the Privacy of Health Information, Sharyl J. Nass, Laura A. Levit, and Lawrence O. Gostin (eds.), Beyond the HIPAA Privacy Rule: Enhancing Privacy, Improving Health Through Research (Washington, DC: National Academies Press, 2009).
Institute of Medicine (US), Roundtable on Translating Genomic-Based Research for Health, workshop summary, “Challenges and opportunities in using residual newborn screening samples for translational research” (Washington, DC: National Academies Press, 2010).
International Conference of Data Protection and Privacy Commissioners (2009), “International standards on the protection of personal data and privacy”: www.agpd.es/portalweb/canaldocumentacion/conferencias/common/pdfs/31_conferencia_internacional/estandares_resolucion_madrid_en.pdf.
Jacobs, Kevin B., Meredith Yeager, Sholom Wacholder, et al., “A new statistic and its power to infer membership in a genome-wide association study using genotype frequencies,” Nature Genetics, 41 (2009), 1253–1257.
Kaye, Jane and Mark Stranger (eds.), Principles and Practice in Biobank Governance (Farnham: Ashgate, 2009).
Kayser, Manfred and Peter de Knijff, “Improving human forensics through advances in genetics, genomics and molecular biology,” Nature Reviews Genetics, 12 (2011), 179–192.
Kho, Michelle E., Mark Duffett, Donald J. Willison, et al., “Written informed consent and selection bias in observational studies using medical records: Systematic review,” BMJ, 338:b866 (2009): www.bmj.com/content/338/bmj.b866.full.pdf.
Khoury, Muin J., Sara R. Bedrosian, Marta Gwinn, et al. (eds.), Human Genome Epidemiology, second edn. (New York: Oxford University Press, 2010).
Kohane, Isaac S., “Using electronic health records to drive discovery in disease genomics,” Nature Reviews Genetics, 12 (2011), 417–428.


Kosseim, Patricia and Megan Brady, “Policy by procrastination: Secondary use of electronic health records for health research purposes,” McGill Journal of Law & Health, 2 (2008), 5–45: www.mjlh.mcgill.ca/pdfs/vol2-1/MJLH_vol2_Kosseim-Brady.pdf.
Laurie, Graeme, Genetic Privacy (Cambridge University Press, 2002).
“Reflexive governance in biobanking: On the value of policy led approaches and the need to recognize the limits of law,” Human Genetics, 130 (2011), 347–356.
Lee, Lisa M. and Lawrence O. Gostin, “Ethical collection, storage, and use of public health data,” Journal of the American Medical Association, 302 (2009), 82–84.
Lenk, Christian, Nils Hoppe, Katherina Beier, and Claudia Wiesemann (eds.), Human Tissue Research: A European Perspective on the Ethical and Legal Challenges (Oxford University Press, 2011).
Levin, Avner and Mary Jo Nicholson, “Privacy law in the United States, the EU and Canada: The allure of the middle ground,” University of Ottawa Law & Technology Journal, 2 (2005), 357–394: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=894079.
Lin, Zhen, Art B. Owen, and Russ B. Altman, “Genomic research and human subject privacy,” Science, 303 (2004), 183.
Loukides, Grigorios, Joshua C. Denny, and Bradley Malin, “The disclosure of diagnosis codes can breach research participants’ privacy,” Journal of the American Medical Informatics Association, 17 (2010), 322–327: www.ncbi.nlm.nih.gov/pmc/articles/PMC2995712/pdf/amiajnl2725.pdf.
Loukides, Grigorios, Aris Gkoulalas-Divanis, and Bradley Malin, “Anonymization of electronic medical records for validating genome-wide association studies,” Proceedings of the National Academy of Sciences, 107 (2010), 7898–7903.
Lowrance, William W., report to the Medical Research Council (UK) and the Wellcome Trust, Access to Collections of Data and Materials for Health Research (2006): www.wellcome.ac.uk/stellent/groups/corporatesite/@msh_grants/documents/web_document/wtx030842.pdf.
Lowrance, William W. and Francis S. Collins, “Identifiability in genomic research,” Science, 317 (2007), 600–602.
LRDP Kantor Ltd in association with the Centre for Public Reform, “New challenges to data protection,” prepared for the European Commission’s Directorate-General Freedom, Security, and Justice (2010): http://ec.europa.eu/justice/policies/privacy/docs/studies/new_privacy_challenges/final_report_en.pdf.
Macleod, Una and Graham C. M. Watt, “The impact of consent on observational research: A comparison of outcomes from consenters and non consenters to an observational study,” BMC Medical Research Methodology, 8:15 (2008): www.biomedcentral.com/1471-2288/8/15.
Malin, Bradley, Kathleen Benitez, and Daniel Masys, “Never too old for anonymity: A statistical standard for demographic data sharing via the HIPAA Privacy Rule,” Journal of the American Medical Informatics Association, 18 (2011), 3–10: http://jamia.bmj.com/content/18/1/3.full.pdf?sid=de503861-7d0a-4992-813a-f887353f8c21.


Manolio, Teri A., “Genomewide association studies and assessment of the risk of disease,” New England Journal of Medicine, 363 (2010), 166–176.
Manson, Neil C. and Onora O’Neill, Rethinking Informed Consent in Bioethics (Cambridge University Press, 2007).
Martens, Patricia, “How and why does it ‘work’ at the Manitoba Centre for Health Policy? A model of data linkage, interdisciplinary research, and scientist/user interactions,” in Colleen M. Flood (ed.), Data Data Everywhere: Access and Accountability? (Montreal, Quebec and Kingston, Ontario: McGill-Queen’s University Press, 2011), pp. 137–150.
Mayer-Schönberger, Viktor, “Generational development of data protection in Europe,” in Philip E. Agre and Marc Rotenberg (eds.), Technology and Privacy: The New Landscape (Cambridge, MA: MIT Press, 1997), pp. 219–241.
McCarty, Catherine A., Rex L. Chisholm, Christopher G. Chute, et al., “The eMERGE Network: A consortium of biorepositories linked to electronic medical records data for conducting genomic studies,” Medical Genomics, 4(13) (2011): www.biomedcentral.com/content/pdf/1755-8794-4-13.pdf.
McGuire, Sean E. and Amy L. McGuire, “Don’t throw the baby out with the bathwater: Enabling a bottom-up approach in genome-wide association studies,” Genome Research, 18 (2008), 1683–1685: http://genome.cshlp.org/content/18/11/1683.full.
Meystre, Stephane M., F. Jeffrey Friedlin, Brett R. South, et al., “Automatic de-identification of textual documents in the electronic health record: A review of recent research,” BMC Medical Research Methodology, 10(70) (2010): www.biomedcentral.com/1471-2288/10/70.
Miller, Franklin G. and Alan Wertheimer (eds.), The Ethics of Consent: Theory and Practice (New York: Oxford University Press, 2010).
Morris, Andrew (oral evidence), UK House of Lords, Science and Technology Committee, Genomic Medicine, 2nd Report of Session 2008–09, Section 6.16: www.publications.parliament.uk/pa/ld200809/ldselect/ldsctech/107/107i.pdf.
Murphy, Erin, “Relative doubt: Familial searches of DNA databases,” Michigan Law Review, 109 (2010), 291–348: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1498807.
Murtagh, Madeleine J., Ipek Demir, Jennifer R. Harris, and Paul R. Burton, “Realizing the promise of population biobanks: A new model for translation,” Human Genetics, 130 (2011), 333–345.
National Cancer Research Institute (UK), “Samples and data for cancer research: Template for access policy development” (2009): www.ncri.org.uk/default.asp?s=1&p=8&ss=9.
National Research Council (US), Panel on Collecting, Storing, Accessing, and Protecting Biospecimens and Biodata in Biosocial Surveys, Robert M. Hauser, Maxine Weinstein, Robert Pool, and Barney Cohen (eds.), Conducting Biosocial Surveys: Collecting, Storing, Accessing, and Protecting Biospecimens and Biodata (Washington, DC: National Academies Press, 2010).


National Research Council (US), Panel on Confidentiality Issues Arising from the Integration of Remotely Sensed and Self-Identifying Data, Myron P. Gutmann and Paul C. Stern (eds.), Putting People on the Map: Protecting Confidentiality with Linked Social-Spatial Data (Washington, DC: National Academies Press, 2007).
NHS National Services Scotland, Information Services Division, “Statistical disclosure control protocol,” version 2 (2010): www.isdscotland.org/isd/files/isd-statistical-disclosure-protocol.pdf.
Nuffield Council on Bioethics, Human Bodies: Donation for Medicine and Research (2011): www.nuffieldbioethics.org/sites/default/files/Donation_full_report.pdf.
Nuremberg Trial proceedings, Trials of War Criminals before the Nuernberg Military Tribunals under Control Council Law No. 10 (October 1946–April 1949): www.loc.gov/rr/frd/Military_Law/NTs_war-criminals.html.
Nycum, Gillian, Bartha Maria Knoppers, and Denise Avard, “Intra-familial obligations to communicate genetic risk information: What foundations? What forms?” McGill Journal of Law and Health, 3 (2009), 21–48: http://mjlh.mcgill.ca/pdfs/vol3-1/NycumKnoppersAvard.pdf.
O’Neill, Onora, Autonomy and Trust in Bioethics (Cambridge University Press, 2002).
Ontario, Ministry of Health and Long-Term Care, “Declaration of PHIPA as substantially similar to PIPEDA”: www.health.gov.on.ca/english/providers/legislation/priv_legislation/phipa_pipeda_qa.html.
Organisation for Economic Co-operation and Development, “Guidelines on Human Biobanks and Genetic Research Databases” (2009): www.oecd.org/dataoecd/41/47/44054609.pdf.
“Recommendation of the Council concerning guidelines governing the protection of privacy and transborder flows of personal data” (1980): www.oecd.org/document/18/0,3746,en_2649_34255_1815186_1_1_1_1,00&en-USS_01DBC.html.
Percival, Thomas, Medical Ethics, or A Code of Institutes and Precepts Adapted to the Professional Conduct of Physicians and Surgeons, third edn. (London: John Henry Parker, 1849), available gratis from Google Books.
Piwowar, Heather A., Michael J. Becich, Howard Bilofsky, and Rebecca S. Crowley, on behalf of the caBIG Data Sharing and Intellectual Capital Workspace, “Towards a data sharing culture: Recommendations for leadership from academic health centers,” PLoS Medicine, 5(9), e183. doi:10.1371/journal.pmed.0050183 (2008).
Price, David, Human Tissue in Transplantation and Research: A Model Legal and Ethical Donation Framework (Cambridge University Press, 2010).
Privacy Rights Clearinghouse, “Chronology of data breaches, 2005–present,” healthcare subset: www.privacyrights.org/data-breach.
Pulley, Jill, Ellen Clayton, Gordon R. Bernard, et al., “Principles of human subjects protections applied in an opt-out, de-identified biobank,” Clinical and Translational Science, 3 (2010), 42–48.
Regan, Priscilla M., Legislating Privacy: Technology, Social Values, and Public Policy (Chapel Hill, NC: University of North Carolina Press, 1995).
Richards, Martin, Adrienne Hunt, and Graeme Laurie, “UK Biobank Ethics and Governance Council: An exercise in added value,” in Jane Kaye and Mark Stranger (eds.), Principles and Practice in Biobank Governance (Farnham: Ashgate, 2009), pp. 229–242.


Richards, Neil M. and Daniel J. Solove, “Privacy’s other path: Recovering the law of confidentiality,” Georgetown Law Journal, 96 (2007), 123–182: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=969495.
Robinson, Neil, Hans Graux, Maarten Botterman, and Lorenzo Valeri, Review of the European Data Protection Directive, a report prepared by RAND Europe for the UK Information Commissioner’s Office (2009): www.rand.org/pubs/technical_reports/2009/RAND_TR710.pdf.
Rotimi, Charles N. and Lynn B. Jorde, “Ancestry and disease in the age of genomic medicine,” New England Journal of Medicine, 363 (2010), 1551–1558.
Royal College of Physicians, Royal College of Pathologists and British Society for Human Genetics, “Consent and confidentiality in clinical genetic practice: Guidance on genetic testing and sharing genetic information,” Report of the Joint Committee on Medical Genetics, second edn. (2011): www.rcplondon.ac.uk/sites/default/files/consent_and_confidentiality_2011.pdf.
Sleeboom-Faulkner, Margaret (ed.), Human Genetic Biobanks in Asia: Politics of Trust and Scientific Advancement (London: Routledge, 2009).
Solove, Daniel J., Understanding Privacy (Cambridge, MA: Harvard University Press, 2008).
Steinmann, Michael, Peter Sýkora, and Urban Wiesing (eds.), Altruism Reconsidered: Exploring New Approaches to Property in Human Tissue (Farnham: Ashgate, 2009).
Strom, Brian L., Stephen E. Kimmel, and Sean Hennessy (eds.), Pharmacoepidemiology, fifth edn. (Chichester: Wiley-Blackwell, 2012).
Taylor, Mark J., “Data protection: Too personal to protect?” SCRIPT-ed, 3(1) (2006): www.law.ed.ac.uk/ahrc/script-ed/vol3-1/taylor.asp.
Thomas, Richard and Mark Walport, Data Sharing Review Report (2008): www.justice.gov.uk/reviews/docs/data-sharing-review-report.pdf.
UK Biobank, “Ethics and Governance Framework”: www.ukbiobank.ac.uk/wp-content/uploads/2011/05/EGF20082.pdf.
UK Data Archive, Veerle Van den Eynden, Louise Corti, Matthew Woollard, et al., “Managing and sharing data: Best practice for researchers” (2011): www.data-archive.ac.uk/media/2894/managingsharing.pdf.
UK Department of Health, The Caldicott Guardian Manual (2010): www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_114509.
UK Health Departments, “Governance arrangements for research ethics committees: A harmonised edition” (2011): www.dh.gov.uk/prod_consum_dh/groups/dh_digitalassets/documents/digitalasset/dh_126614.pdf.
UK House of Lords, Committee on Privacy, Report of the Committee on Privacy, Kenneth Younger, chair (Home Office, Cmnd 5012, H. M. Stationery Office, 1972).
UK House of Lords, Committee on Privacy, Session of June 6, 1973: http://hansard.millbanksystems.com/lords/1973/jun/06/privacy-younger-committees-report.
UK House of Lords, Select Committee on the Constitution, Surveillance: Citizens and the State, 2nd Report of Session 2008–09: www.publications.parliament.uk/pa/ld200809/ldselect/ldconst/18/1802.htm.


UK Human Tissue Authority, “Code of Practice 9: Research” (2009): www.hta.gov.uk/legislationpoliciesandcodesofpractice/codesofpractice/code9research.cfm.
UK Information Commissioner’s Office, “Determining what information is ‘data’ for the purposes of the DPA” (2009): www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/what_is_data_for_the_purposes_of_the_dpa.pdf.
“Determining what is personal data” (2007): www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/personal_data_flowchart_v1_with_preface001.pdf.
Privacy Impact Handbook, version 2.0 (2009): www.ico.gov.uk/upload/documents/pia_handbook_html_v2/index.html.
“Privacy notices code of practice” (2009): www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/privacy_notices_cop_final.pdf.
“Use and disclosure of health data” (2002): www.ico.gov.uk/upload/documents/library/data_protection/practical_application/health_data_-_use_and_disclosure001.pdf.
UK National Information Governance Board, “System level security policy,” template: www.nigb.nhs.uk/ecc/applications/SLSP.pdf.
UK National Research Ethics Service, “Defining research” (2008): www.nres.npsa.nhs.uk/news-and-publications/publications/general-publications.
UK Office for National Statistics, “Disclosure control of health statistics” (2006): www.ons.gov.uk/ons/guide-method/best-practice/disclosure-control-of-health-statistics/index.html.
US Agency for Healthcare Research and Quality, Richard E. Gliklich and Nancy A. Dreyer (senior eds.), Registries for Evaluating Patient Outcomes: A User’s Guide, second edn. (2010): www.ncbi.nlm.nih.gov/books/NBK49444/pdf/TOC.pdf.
US Department of Health and Human Services, Federal Policy on Protection of Human Subjects (“Common Rule”), DHHS version, 45 Code of Federal Regulations 46: www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html.
US Department of Health and Human Services, Office of Civil Rights, “Guidance on risk analysis requirements under the HIPAA Security Rule” (2010): www.hhs.gov/ocr/privacy/hipaa/administrative/securityrule/rafinalguidancepdf.pdf.
US Department of Health and Human Services, Office of Civil Rights, Workshop on the HIPAA Privacy Rule’s De-Identification Standard (May 2010): www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-identification/deidentificationagenda.html.
US Department of Health and Human Services, Office for Human Research Protections, “Guidance on research involving coded private information or biological specimens” (2008): www.hhs.gov/ohrp/policy/cdebiol.html.
“Guidance on the Genetic Information Nondiscrimination Act: Implications for investigators and Institutional Review Boards” (2009): www.hhs.gov/ohrp/policy/gina.pdf.

Bibliography

171

  “Guidance on withdrawal of subjects from research: Data retention and other related issues” (2010): www.hhs.gov/ohrp/policy/subjectwithdrawal.html.
  “International compilation of human research protections” (2012): www.hhs.gov/ohrp/international/intlcompilation/intlcompilation.html.
US Department of Health and Human Services, Secretary’s Advisory Committee on Heritable Disorders in Newborns and Children, briefing paper, “Considerations and recommendations for national guidance regarding the retention and use of residual dried blood spot specimens after newborn screening” (2009): www.cchconline.org/pdf/HHSRectoBankBabyDNA042610.pdf.
US Department of Health and Human Services, Secretary’s Advisory Committee on Human Research Protections, letter of January 24, 2011 to the Secretary, attachment A: www.hhs.gov/ohrp/sachrp/20110124attachmentatosecletter.html.
US National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects (1979): www.hhs.gov/ohrp/humansubjects/guidance/belmont.html.
US National Committee on Vital and Health Statistics, letter to the Secretary of Health and Human Services, “Recommendations regarding sensitive health information” (November 10, 2010): www.ncvhs.hhs.gov/101110lt.pdf.
US National Heart, Lung, and Blood Institute, “Guidelines for NHLBI data set preparation” (2005): www.nhlbi.nih.gov/funding/setpreparation.htm.
US National Human Genome Research Institute, template form, “Informed consent elements tailored to genomic research”: www.genome.gov/pfv.cfm?pageID=27026589.
US National Institutes of Health, “Points to consider when planning a genetic study that involves members of named populations” (2008): http://bioethics.od.nih.gov/named_populations.html.
  “Protecting personal health information in research: Understanding the HIPAA Privacy Rule”: http://privacyruleandresearch.nih.gov/pr_02.asp.
US Privacy Protection Study Commission, Personal Privacy in an Information Society (Washington, DC: US Government Printing Office, 1977): http://aspe.hhs.gov/datacncl/1977privacy/toc.htm.
US Secretary of Health, Education and Welfare, Advisory Committee on Automated Personal Data Systems, Records, Computers and the Rights of Citizens (Washington, DC: US Government Printing Office, 1973): http://aspe.hhs.gov/datacncl/1973privacy/tocprefacemembers.htm.
US Veterans Health Administration, “Notice of privacy practices”: www1.va.gov/vhapublications/ViewPublication.asp?pub_ID=1090.
Valdez, Rodolfo, Muin J. Khoury, and Paula W. Yoon, “The use of family history in public health practice: The epidemiologic view,” in Muin J. Khoury, Sara R. Bedrosian, Marta Gwinn, et al. (eds.), Human Genome Epidemiology, second edn. (New York: Oxford University Press, 2010), pp. 579–593.
Visscher, Peter M. and William G. Hill, “The limits of individual identification from sample allele frequencies: Theory and statistical analysis,” PLoS Genetics, 5(10) (2009), e1000628, doi:10.1371/journal.pgen.1000628.
Vollmann, Jochen and Rolf Winau, “Informed consent in human experimentation before the Nuremberg code,” BMJ, 313 (1996), 1445–1447.
Wacks, Raymond, Personal Information: Privacy and the Law (Oxford: Clarendon Press, 1989).
Walport, Mark and Paul Brest, on behalf of 17 signatory organizations, “Sharing research data to improve public health,” The Lancet, 377 (2011), 537–539.
Warner, Malcolm and Michael Stone, The Data Bank Society: Organizations, Computers and Social Freedom (London: George Allen & Unwin, 1970).
Wartenberg, Daniel and W. Douglas Thompson, “Privacy versus public health: The impact of current confidentiality rules,” American Journal of Public Health, 100 (2010), 407–412.
Waters, Nigel, “The APEC Asia-Pacific privacy initiative – a new route to effective data protection or a trojan horse for self-regulation?” SCRIPT-ed, 6(1) (2009), 75: www.law.ed.ac.uk/ahrc/SCRIPT-ed/vol6-1/waters.asp.
Weindling, Paul, Nazi Medicine and the Nuremberg Trials: From Medical War Crimes to Informed Consent (Basingstoke: Palgrave Macmillan, 2004).
Wellcome Trust, “Policy on data management and sharing” (2010): www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.htm.
Wellcome Trust Case Control Consortium, Data Access Agreement: www.wtccc.org.uk/docs/Data_Access_Agreement_v18.pdf.
Western Australian Department of Health, Office of Population Health Genomics, “Guidelines for human biobanks, genetic research databases and associated data” (2010): www.genomics.health.wa.gov.au/publications/docs/guidelines_for_human_biobanks.pdf.
Westin, Alan F., Privacy and Freedom (New York: Atheneum, 1967).
Westin, Alan F. and Michael A. Baker, Databanks in a Free Society (New York: Quadrangle Books, 1972).
Widdows, Heather and Caroline Mullen (eds.), Governance of Genetic Information: Who Decides? (Cambridge University Press, 2009).
Willison, Don, Elaine Gibson, and Kim McGrail, “A roadmap to research uses of electronic health information,” in Colleen M. Flood (ed.), Data Data Everywhere: Access and Accountability? (Montreal, Quebec and Kingston, Ontario: McGill-Queen’s University Press, 2011), pp. 233–251.
Wolfson, Michael, Susan E. Wallace, Nicholas Masca, et al., “DataSHIELD: Resolving a conflict in contemporary bioscience – performing a pooled analysis of individual-level data without sharing the data,” International Journal of Epidemiology, 39 (2010), 1372–1382: http://ije.oxfordjournals.org/content/39/5/1372.full.pdf+html.
World Medical Association, Declaration of Helsinki, “Ethical Principles for Medical Research Involving Human Subjects”: www.wma.net/en/30publications/10policies/b3/17c.pdf.

GENERAL RESOURCES

Australia, Privacy Act 1988: www.comlaw.gov.au/Details/C2011C00157.
Australian Office of the Privacy Commissioner: www.privacy.gov.au.
Biobanking and Biomolecular Resources Research Infrastructure: www.bbmri.eu/index.php/home.
Canada, Office of the Privacy Commissioner: http://priv.ca.
Canada, Personal Information Protection and Electronic Documents Act: http://laws-lois.justice.gc.ca/eng/acts/P-8.6.
Clarke, Roger: www.rogerclarke.com.
ClinicalTrials.gov: http://clinicaltrials.gov/ct2/search.
Council of Europe, Convention for the Protection of Human Rights and Fundamental Freedoms, European Treaty Series No. 5: http://conventions.coe.int/treaty/en/Treaties/Html/005.htm.
  Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data, European Treaty Series No. 108 (1981): http://conventions.coe.int/Treaty/EN/Treaties/Html/108.htm.
El Emam, Khaled, and the Electronic Health Information Laboratory, University of Ottawa and Children’s Hospital of Eastern Ontario: www.ehealthinformation.ca.
Electronic Medical Records and Genomics project (eMERGE): www.mc.vanderbilt.edu/victr/dcc/projects/acc/index.php/Main_Page.
European Commission, Directorate-General for Justice, data protection portal: http://ec.europa.eu/justice/data-protection/index_en.htm.
European Union, Charter of Fundamental Rights: www.europarl.europa.eu/charter/pdf/text_en.pdf.
  Directive on Good Clinical Practice in the Conduct of Clinical Trials on Medicinal Products for Human Use (2001/20/EC): http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2001:121:0034:0044:en:PDF.
  Directive on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data (Directive 95/46/EC): http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=celex:31995L0046:en:html.
France, Loi relative à l’informatique, aux fichiers et aux libertés, English translation: www.cnil.fr/fileadmin/documents/en/Act78-17VA.pdf.
General Practice Research Database: www.gprd.com/home.
infoUSA: www.infoUSA.com.
International Association of Privacy Professionals: https://www.privacyassociation.org.
International Cancer Genome Consortium: http://icgc.org.
International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use: www.ich.org/home.html.
International Organization for Standardization: www.27000.org/index.htm.
Japan, Personal Information Protection Act, English translation: www.cas.go.jp/jp/seisaku/hourei/data/APPI.pdf.
Malin, Bradley, and the Health Information Privacy Laboratory, School of Medicine, Vanderbilt University: www.hiplab.org.
Manitoba Centre for Health Policy: www.umanitoba.ca/faculties/medicine/units/community_health_sciences/departmental_units/mchp.
Medical Research Council (UK), data sharing portal: www.mrc.ac.uk/Ourresearch/Ethicsresearchguidance/Datasharinginitiative/Recentactivities/index.htm.
Michael John Durant v. Financial Services Authority [2003] EWCA Civ 1746, para. 28: www.bailii.org/ew/cases/EWCA/Civ/2003/1746.html.
NextMark: http://lists.nextmark.com.
Nuremberg Code: www.hhs.gov/ohrp/archive/nurcode.html.
Online Mendelian Inheritance in Man: www.ncbi.nlm.nih.gov/omim.
Ontario, Personal Health Information Protection Act (PHIPA): www.e-laws.gov.on.ca/html/statutes/english/elaws_statutes_04p03_e.htm.
Personal Genome Project: www.personalgenomes.org.
Public Population Project in Genomics (P3G): www.p3g.org.
Statistics Canada, Research Data Centres Program: www.statcan.gc.ca/rdc-cdr/index-eng.htm.
UK Biobank: www.ukbiobank.ac.uk.
UK Biobank Ethics and Governance Council: www.egcukbiobank.org.uk.
UK Data Protection Act 1998: www.legislation.gov.uk/ukpga/1998/29/contents.
UK Department of Health, Caldicott Guardian system, portal: www.connectingforhealth.nhs.uk/systemsandservices/infogov/caldicott.
UK Human Rights Act: www.legislation.gov.uk/ukpga/1998/42/contents.
UK Human Tissue Act, Supplementary list of materials: www.hta.gov.uk/_db/_documents/Supplementary_list_of_materials_200811252407.pdf.
UK National Information Governance Board: www.nigb.nhs.uk.
UK National Patient Safety Agency: www.nres.npsa.nhs.uk.
UK National Patient Safety Agency, Proportionate Review Service: www.nres.npsa.nhs.uk/applications/proportionate-review.
UK National Research Ethics Committees, portal: www.nres.npsa.nhs.uk.
US Department of Commerce, Safe Harbor certification portal: http://export.gov/safeharbor/eu.
US Department of Health and Human Services, healthcare data-security breach notifications: www.hhs.gov/ocr/privacy/hipaa/administrative/breachnotificationrule/breachtool.html.
US Department of Health and Human Services, Office for Human Research Protections, policy and guidance portal: www.hhs.gov/ohrp/policy/index.html.
US Department of Health and Human Services, Standards for Privacy of Individually Identifiable Health Information (Privacy Rule): www.hhs.gov/ocr/privacy/hipaa/administrative/privacyrule/index.html.
US Department of Justice, “Overview of the Privacy Act of 1974”: www.justice.gov/opcl/1974privacyact-overview.htm.
US Genetic Information Nondiscrimination Act: www.eeoc.gov/laws/statutes/gina.cfm.
US Health Information Technology Economic and Clinical Health Act: http://healthit.hhs.gov/portal/server.pt/community/healthit_hhs_gov__home/1204.
US National Center for Health Statistics, Research Data Center: www.cdc.gov/rdc.
US National Institute of Standards and Technology, Computer Security Resource Center: http://csrc.nist.gov.
US National Institutes of Health, Certificates of Confidentiality Kiosk: http://grants.nih.gov/grants/policy/coc.
  http://grants.nih.gov/grants/policy/data_sharing.
Western Australian Data Linkage System: www.datalinkage-wa.org.
World Health Organization, International Clinical Trials Registry Platform: http://apps.who.int/trialsearch/Default.aspx.

Index

Academy of Medical Sciences, UK, 35, 65, 66
access to data, 140–152
  agreements regarding, 141–144
  Certificates of Confidentiality, US, 133–135
  colleagues, allowing access to data by, 140
  consent
    conformance of access arrangements to, 143
    open release of identifiable data with, 141
    withdrawal of, with shared data, 143
  court, legal, and police demands for, 6, 132, 134, 159
  data access committees (DACs), 141, 151–152
  de-identified data, open release of, 141
  defined, 140
  ethics committee approval for, 144
  extremely restricted, 149–151
  freedom of information laws, demands under, 6, 133
  individual participation principle, OECD guidelines, 39
  jurisdictional conditions regarding, 144
  limitations on onward transfer, 144
  linking conditions, 144
  material transfer agreements for biospecimen access, 142
  non-research access, requests for, 132–135, 159
  open access, 140–141
  oversight and governance of, 151–152
  privacy-preserving data linkage systems, 105, 146–149
  professional competency requirements for accessing, 142
  progress in future, actions and policies likely to contribute to, 159
  purpose limitation continuance, 143
  recontacting, conditions on, 143
  relocation or termination of project, conditions regarding, 144
  restricted access, 141
  return or destruction of data or biospecimens at conclusion of project, 144
  security requirements for, 144, 159
  terms and conditions of restricted access, 142–144
accountability
  EU Data Protection Directive on, 42
  as OECD privacy principle, 40
Acquisti, Alessandro, 97
activities preparatory to research, 80
Agency for Healthcare Research and Quality, US, 54
aggregate genomic data, resolution of DNA contributions in, 120
ALSPAC (Avon Longitudinal Study of Parents and Children), 26
American Medical Association (AMA) ethics code regarding privacy and confidentiality, 34
ancestry, and data sensitivity, 21–22
“anonymization” of data, 94, 105, 106
APEC see Asia-Pacific Economic Cooperation
Article 29 Data Protection Working Party, European, 42, 154
  on accountability, 45
  on concepts of controller and processor, 128
  on definition of consent, 44
  determination of adequacy of protection from data transfers from EU/EEA, 154
  The Future of Privacy, 46
  on geolocation services, 91
  mandate of, 42
  on personal data, 89–90, 107
Asia-Pacific Economic Cooperation (APEC)
  broad privacy and data protection regimes in, 50–51
  Privacy Framework, 51
Australia
  broad privacy and data protection regimes in, 49–50
  health-related privacy regimes in, 64–65
  research without consent, conditions allowing for, 81
  Western Australia Data Linkage System, 28, 147
Australian Law Reform Commission, 49, 65, 66, 89
Australian National Health and Medical Research Council, 30, 50, 64
Australian National Statement on Ethical Conduct in Human Research, 64
Australian Privacy Act (1988), 49–50
  definition of personal data in, 88
  review and revision of, xiii, 49, 64, 89
Avon Longitudinal Study of Parents and Children (ALSPAC), 26
behavioral/social research integrated with health research, 15
Belmont Report (Ethical Principles and Guidelines for the Protection of Human Subjects, 1979), 55–56
Biggs, Hazel, 33
binding corporate rules, international transfers of data under, 155
Biobanking and Biomolecular Resources Research Infrastructure project, 17
biobanks, 16–17
biospecimens
  collections of, 13–17
  defined, 13–14
  see also data and biospecimens
BioVU, Vanderbilt University, 28, 115, 148
birth cohorts, defined, 18
  see also cohort and other longitudinal studies
breach of confidence, legal definition of, 33–34
breaches of data security, notification requirements for, 135–136
breaches of security, notification requirements for, 135–136
Burton, Paul, 114
Byers, Lord, 37
Caldicott Guardians, 65, 128
Canada
  broad privacy and data protection regimes in, 48–49
  ethics review system in, 63
  health-related privacy regimes in, 53, 57, 63–64
  Ontario Personal Health Information Protection Act (PHIPA), 53
  Personal Information Protection and Electronic Documents Act (PIPEDA), 48–49, 53, 128, 153
  Privacy Act, 48
  privacy officers in, 128
  research without consent, conditions allowing for, 81
  roadmap on research use of electronic health information, 64, 65, 66
  Statistics Act, 150
  Statistics Canada, 150
  transfers of data within and from, 153
Canadian Institutes of Health Research, 57
Canadian Standards Association, Model Code for the Protection of Personal Information, 48
Canadian Tri-Council Policy Statement, Ethical Conduct for Research Involving Humans, 57, 63
Cancer Research UK, 65
Centers for Disease Control and Prevention (CDC), US, 54, 133–135
Certificates of Confidentiality, US, 133–135
certification of low identifiability risk under HIPAA Privacy Rule, 99–100
Charter of Fundamental Rights of the European Union (2000), 30
Church, George, 121
Clarke, Roger, 29
clinical trial regimes, 57–59
Clinical Trials Regulations, UK, 65
CNIL (Commission national de l’informatique et des libertés), 45
cohort and other longitudinal studies
  consent and, 74, 76
  data sharing by, 139
  defined, 18
colleagues, allowing access to data by, 140
collection limitation principle, as OECD principle, 39
collections see under data and biospecimens
Commerce Department, US, and Safe Harbor agreement, 155
Commission national de l’informatique et des libertés (CNIL), 45
community consultation and engagement, 79–80
competency requirements for access to data, 142
computer linking of data, 146
Confidential Information Protection and Statistical Efficiency Act, US, 150
confidentiality
  breach of, 33–34
  defined, 33–34
  distinguished from privacy, 32, 33
  Hippocratic medical, 52–53
  see also privacy, confidentiality, and health research
consent, 67–86
  in access agreements see under access to data
  agreement process, importance of, 71
  broad consent, preferability of, 5, 74–77, 118
  cohort studies and, 74, 76
  community consultation and engagement, 79–80
  in complicated research projects, 5
  conventional application of, 69–70
  Declaration of Helsinki on, 67, 68, 71, 78
  duration/continuation/renewal of, 71
  empowerment or control, questionably viewed as providing, 85
  entrusting, viewed as, 85–86
  European Article 29 Working Party on, 44
  explicit versus implicit, 77
  freely/willingly/voluntarily granted, 73–74
  “fully” informed, problems with, 72, 116
  future progress, actions and policies likely to contribute to, 158
  in genetic and genomic studies, 116–118, 121–122, 123–124
  inadequacy of standard accounts now, 68–69, 84–86
  international transfers from EU/EEA on basis of, 155
  legitimately sought, 71
  meaningfully informed, 71–73
  narrow consent, 74
  Nuremberg Code on, 67
  open release of identifiable data with, 141
  opt-in versus opt-out agreements, 77–78
  regulations and requirements, 75
  research without, conditions allowing for, 81–84
  as right, 67
  to sharing of research data, 5
  of vulnerable populations, 74
  waiver of legal rights prohibited in agreements, 71
  withdrawal of, 78–79, 143
contractual agreements, international transfers under, 152–153, 154, 156
cost of health care, and public interest in health research, 2
Council of Europe
  Convention 108 on Protection of Personal Data (1981)
    current revising of, xiii
    development of, 40
    EU Data Protection Directive embracing principles of, 42
  distinguished from EU and its governing Council, 40
  Recommendation on Profiling, 90
  Recommendation on Research on Biological Materials of Human Origin, 36
Council of International Organizations of Medical Sciences, 55
court access to data, 6, 132–135, 159
Craig, David, 120
custodians, 126
cyber security, concept of, 130–131
data access committees (DACs), 141, 151–152
data and biospecimens, 7–26
  collections of, 11–17
    biospecimen collections, 13–17
    databases, 11–12
    registries, 12–13
  definitions pertaining to, 7–10, 13–14
  direct control over, reduction of, 6
  e-health revolution and, xiii–xiv, 10–11
  electronic health records (EHRs), 10–11
  electronic medical records (EMRs), 11
  financial or other rewarding of subjects for use of, 24
  information versus data, 7
  as intellectual property, 23
  metadata, 7
  ownership of, 22–24
  personal health records (PHRs), 11
  primary versus secondary use of, 9–10
  secondary use of data
    EU Data Protection Directive on, 43
    versus primary use, 9–10
  vulnerability of health care versus health research data, 4
  see also access to data, de-identification of data, future use of data, linking of data, personal data and identifiability, platforms, security of data, sensitivity of data, transfers of data
data controllers, 42, 127
data disclosure, defined, 8
data enclaves, 149–151
data handling or processing, defined, 8
data processors, 127
Data Protection Act, UK, 38
  contractual transfers of data under, 153
  coverage of research as a medical purpose, 45
  data-subjects in, 8
  EU Data Protection Directive, transposition of, 41, 45, 47
  health-related privacy regimes and, 65
  personal data and identifiability under, 88, 90
data quality principle, OECD guidelines, 39
data safe havens/safe harbors/enclaves, 149–151
data sharing, 138–140
  access agreements, onward transfer limitations in, 144
  advantages of, 138–139
  by cohort or longitudinal studies, 139
  defined, 9
  organized pressures for, 138–139
  Thomas and Walport report on, 81, 139, 150
data stewards, 141
data-subjects, definition of, 8
  see also subjects
data use agreements, enforcement of, 136–137
Database of Genotypes and Phenotypes (dbGaP), 27
databases, 11–12
dbGaP (Database of Genotypes and Phenotypes), 27
de-identification of data
  certification of low identifiability risk under HIPAA Privacy Rule, 99
  genetic and genomic data, 118–119
  importance of, 109–110
  open release of de-identified data, 141
  reduced consent and procedural requirements for de-identified data, 158
  techniques for, 93–95
Declaration of Helsinki, 36, 54–55, 60, 67, 68, 71, 78
Department of Commerce, US, 155
destruction or return of data or biospecimens at conclusion of project, 144
disclosure of data, defined, 8
discrimination, genetic, 137
Duke University epilepsy genomics study, 116, 117
Durant v. Financial Services Authority (UK), 87, 90
e-health revolution, 10–11
Economic and Social Research Council, UK, 65
electronic health records (EHRs)
  described, 10–11
  genomic discovery, EHR-driven, 115
  integration of genetic and genomic data with, 123
electronic medical records (EMRs), 11
Electronic Medical Records and Genomics (eMERGE) Network, 115
encryption of data, 104, 105, 136
enforcement and sanctions, 135–137
entrusting, consent viewed as, 85–86
Ethical Conduct for Research Involving Humans (Canadian Tri-Council Policy Statement), 57, 63
Ethical Principles and Guidelines for the Protection of Human Subjects (Belmont Report, 1979), 55–56
ethics
  access to data, ethics committee approval for, 144
  Australian National Statement on Ethical Conduct in Human Research, 64
  consent as cornerstone of, 67
    see also consent
  of genetics and genomics
    exceptional treatment, need for, 111
    mismatch of scientific advance and personal and social understanding, 6, 111
  health, value placed on, 1
  importance of privacy protection for research, 3–4
  privacy, value placed on, 3
  review systems, 60–61, 63, 65, 75, 107
  sensitivity of data and biospecimens see sensitivity of data
Ethics and Governance Council of UK Biobank, 152
ethnicity, and data sensitivity, 21–22
EU Clinical Trials Directive (2001), xiii, 58
EU Data Protection Directive (1995), 41–47
  on consent, 73
  current revising of, xiii, 46
  on data controllers and data processors, 127
  data processing, definition of, 8
  EU Clinical Trials Directive deference to, 58
  national transpositions of, 41, 44
  personal data and identifiability under, 87, 88, 89–90, 91
  problems with, 44–47
  on public interest, 44
  on secondary use of data, 43
  on sensitivity of data, 22, 42, 43, 44
  suspension of subjects’ right of access to data, on circumstances allowing, 43
  on transfers from EU/EEA to other countries, 154–156
  on transfers within EU/EEA, 153
Europe, Council of see Council of Europe
European Article 29 Data Protection Working Party see Article 29 Data Protection Working Party, European
European Commission, 154, 155
European Convention on Human Rights
  applicability of, 30
  as based on the Universal Declaration of Human Rights, 30
  on exceptions to right to privacy, 31
  on privacy as fundamental right, 30
European Union/European Economic Area (EU/EEA)
  broad privacy and data protection strategies in, 37, 38, 41
  Canadian approach to privacy compared, 49
  Charter of Fundamental Rights of the European Union (2000)
    applicability of, 30
    on privacy, 30
  Council of Europe distinguished from EU and its governing Council, 40
  countries judged to provide adequate protection for personal data imported from, 154
  countries modeling privacy and data protection strategies on, 41, 51
  ethics review systems in, 61
  personal data, data from everyday electronic transactions as, 91
  privacy officers in, 128
  transfers of data
    under EU-US Safe Harbor Agreement, 154, 155
    within Europe, 153
    from Europe to other countries, 154–156
explicit versus implicit consent, 77
external access to data see access to data
fair information practices, 32
families of research subjects see relatives of research subjects
Federal Common Rule on Protection of Human Subjects, US (1991), 56–57
  clinical trials and, 58
  on consent, 77
  current revising of, xiii
  HIPAA Privacy Rule and, 56, 62
  human subjects as defined by, 108
  research as defined by, 24, 25–26
  on withdrawal of consent, xiii, 78
Federal Trade Commission Act, US, 47
FOIA (Freedom of Information Act), US, 59, 135
Food and Drug Administration, US, 58, 133–135
Food, Drug, and Cosmetic Act, US, 57
For Your Information: Australian Privacy Law and Practice (Australian Law Reform Commission, 2008), 49
Framingham Heart Study, 26
France
  CNIL (Commission national de l’informatique et des libertés), 45
  early privacy legislation in, 38
  EU Data Protection Directive, national transposition of, 45
Freedom of Information Act (FOIA), US, 59, 135
freedom of information laws, demands for access to data under, 133
“fully” informed consent, 72, 116
The Future of Privacy (European Article 29 Data Protection Working Party, 2009), 46, 73
future progress, actions and policies likely to contribute to, 158–159
future use of data
  broad consent agreements and, 5, 74–77
  clinical trial or product postmarketing surveillance, data collected during, 59
  research without consent, conditions allowing for, 81–84
Gellman, Robert, 97
General Medical Council, UK, 65, 83
General Practice Research Database (GPRD), UK, 28, 148
generalizable knowledge, in definition of research, 8, 24, 25–26
Genetic Information Nondiscrimination Act (GINA), US, 137
genetics, defined, 112–113
genetics and genomics, 111–123
  aggregate data, resolution of DNA contributions in, 120
  consent in, 116–118, 121–122, 123–124
  de-identification of data, 118–119
  defined and distinguished, 112–113
  discrimination based on, 137
  electronic health records (EHRs)
    genomic discovery, EHR-driven, 115
    integration of genetic and genomic data with data from, 123
  exceptional treatment, need for, 111
  genome-wide association studies (GWAS), 114
  genotype-driven recruitment of subjects, 115–116
  identifiability of genomic data, 6, 118–121
  incidental findings in, 117
  Mendelian versus multifactor conditions, 112–113
  mismatch of scientific advance and personal and social understanding, 6, 111
  nature of the genome, 112
  non-identified genomic data, means of identifying, 119–120
  Personal Genome Project (PGP), 121–122
  relatives of research subjects and see relatives of research subjects
genome-wide association studies (GWAS), 114
genomics, defined, 113
genotype, defined, 113
Germany
  early privacy legislation in, 38
  EU Data Protection Directive, national transposition of, 41
  privacy officers in, 128
GINA (Genetic Information Nondiscrimination Act), US, 137
Good Clinical (Trial) Practices, 58
Gostin, Lawrence, 24, 25–26, 29, 54
GPRD (General Practice Research Database), UK, 28, 148
Gross, Ralph, 97
guidelines see laws, regulations, and guidelines
Guidelines on the Protection of Privacy and Transborder Flows of Personal Data, OECD see Organisation for Economic Co-operation and Development (OECD) privacy principles
Guthrie, Robert, 15
Guthrie spots (newborn blood screening specimens), 15
GWAS (genome-wide association studies), 114
Harvard University, Personal Genome Project (PGP), 121–122, 123–124, 141
Health and Social Care Act, UK, 65, 83
health data see data and biospecimens
Health Information Technology Economic and Clinical Health (HITECH) Act, US, 10
Health Insurance Portability and Accountability Act, US (HIPAA Privacy Rule, 2002)
  on authorization of research use of data, 76
  expected revision of, xiii
  identifiability under, 99–103, 104
  Limited Data Sets under, 95, 102–103
  subjects, on approaching and selecting, 80
  US Federal Common Rule on Protection of Human Subjects and, 56, 62
health problems, reclassification of, via new research, 5
health research see privacy, confidentiality, and health research; research
healthcare costs and public-interest nature of health research, 2
Helsinki, Declaration of, 36, 54–55, 60, 67, 68, 71, 78
HIPAA, US see Health Insurance Portability and Accountability Act
Hippocratic medical confidentiality, 52–53
HITECH, US see Health Information Technology Economic and Clinical Health Act
Hodge, James, 24, 25–26
Hrynaszkiewicz, Iain, 102
Human Fertilisation and Embryology Act, UK, 65
Human Genetics Commission, UK, 65
Human Genome Project, 113, 139
human right, privacy viewed as, 30–32
Human Rights Act, UK, 30
human subjects see subjects
Human Tissue Act, UK, 14, 65, 83–84
Human Tissue Authority, UK, code of practice on research (2009), 83–84
Iceland, subject to EU Data Protection Directive, 41
identifiability, 87–110
  categories of identifiability, 105–107
  certification of low identifiability risk, 99–100
  concordance of identifiability terms, 106
  future actions and policies likely to increase identifiability, 158
  genetic and genomic identifiability, 6, 118–121
  identifiers or identifying data
    concept of, 92–93
    US HIPAA Privacy Rule identifier list, 100–102
  key-coding, 59, 104–105, 108, 118, 148
  legal protections for identified or potentially identifiable data, need for, 159
  linking data-sets with intent to identify, problem of, 96–98
  no human subject if data not identifiable, policy of, 108–109, 159
  non-identified data, means of identifying, 96–98, 119–120
  ordering and retrievability of data affecting status of data, 91
  personal data, defined, 8, 42, 87–92
  personal data, under EU Data Protection Directive, 87, 88, 89–90, 91
  personally identifiable data, under US HIPAA Privacy Rule, 99–103, 104
  rare data as identifiers, 95, 106
  researchers, when data are not identifiable to, 107–108, 158
  risk assessments, 99–100, 103
  social/behavioral research integrated with health research, 15
  terminology pertaining to, 105–107
  US Office for Human Research Protections (OHRP) on, 108–109
  see also de-identification of data, re-identification of data
implicit versus explicit consent, 77
individual participation principle, OECD guidelines, 39
information
  data versus, 7
  see also data and biospecimens
  fair information practices, US tendency to cast privacy protections as, 32
Information Commissioner, UK, 76, 90, 107
informed consent see consent
Institute of Medicine, US, 62, 63, 65, 66, 77, 103, 105
Institutes of Health Research, Canada, 57
Institutional Review Board (IRB) system, US, 56, 60, 61, 77
integration of different fields of health research, 5
intellectual property, data and biospecimens as, 23
International Cancer Genome Consortium, 28, 114, 139
International Conference of Data Protection and Privacy Commissioners, 156
International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use, 58
international transfers of data, 152–156
  under binding corporate rules, 155
  from Canada, 153
  with consent, 155
  under contractual agreements, 152–153, 154, 156
  EU Data Protection Directive on exportation of personal data from EU, 42, 154–156
  EU-US Safe Harbor Agreement, 154, 155
  future progress, actions and policies likely to contribute to, 159
  public interest exemption for, 155
  special conditions in access agreements regarding, 144
  universal standards, importance of developing, 156–157
  US Safe Harbor self-certification framework for, 48
IRB (Institutional Review Board) system, US see Institutional Review Board (IRB) system, US
Japan
  broad privacy and data protection regimes in, 50
  health-related privacy regimes in, 58
Japanese Personal Information Protection Act, 50
Japanese Pharmaceuticals and Medical Devices Agency, 58
jurisdictional conditions in access agreements, 144
Kaiser Permanente Research Program on Genes, Environment, and Health, 27, 139
key-coding, 59, 104–105, 108, 118, 148
knowledge
  defined, 8
  generalizable, 8, 24, 25–26
Kohane, Isaac, 115
Laurie, Graeme, 30
law enforcement access to data, 6, 132–135, 159
laws, regulations, and guidelines
  breaches of security, notification requirements for, 135–136
  broad privacy and data protection regimes, 35–51
    defined, 36
    in APEC countries, 50–51
    in Australia, 49–50
    in Canada, 48–49
    of Council of Europe, 40
    of EU, 40–47
    European strategies regarding, 37, 38, 41
    historical background to development of, 36–38
    in Japan, 50
  confidentiality, as legal concept, 33–34
  consent
    legitimacy in seeking, 71
    regulations and requirements, 75
    waiver of rights not legally allowed in, 71
  data use agreements, enforcement of, 136–137
  enforcement and sanctions, use of, 135–137
  health specific privacy regimes, 52–66
    in Australia, 64–65
    in Canada, 53, 57, 63–64
    clinical trials, 57–59
    described, 36
    ethics review systems, 60–61, 63, 65
    health care and payment regimes, 52–53
    human-subject protection, 54–57
    inconsistencies, complexities, and redundancies, problem of, 62–66
    product postmarketing data, 58–59
    public health regimes, 53–54
    specialized laws and regulations, 59–60
    in UK, 58, 65–66
    in US, 52–53, 54, 55–57, 58, 60, 61, 62
  identified or potentially identifiable data, specific protections for, 159
  inconsistencies, complexities, and redundancies, 6, 35–36, 62–66
  privacy, as legal concept, 30–32
  see under particular items by name or country
Lee, Lisa, 54
Levin, Avner, 49
Liddell, Kathleen, 73
Liechtenstein, subject to EU Data Protection Directive, 41
Limited Data Sets under US HIPAA Privacy Rule, 95, 102–103
linking of data
  access agreement limitations on, 144
  defined, 146
  as form of access, 146
  with intent to identify, 96–98, 119
  privacy-preserving data linkage systems, 105, 146–149
longitudinal studies see cohort and other longitudinal studies
Manitoba Centre for Health Policy (MHCP), University of Manitoba
  “Pledge of Privacy,” 127
  Population Health Research Data Repository, 28, 148
matching of data, 96, 119, 146
material transfer agreements for biospecimen sharing, 142
McGuire, Amy, 116
McGuire, Sean, 116
Medicaid, US, 53
Medical Ethics (Percival, 1849), 14
Medical Research Council, UK, 65, 138
Medicare, US, 53
Medicines and Healthcare Products Regulatory Agency, UK, 58–59
Mendelian genetic conditions, 112–113
metadata, 7
MHCP see Manitoba Centre for Health Policy
Million Women Study, 27
Morris, Andrew, 35
multifactor genetic conditions, 112–113
National Center for Health Statistics, US, 54, 150
National Health and Nutrition Examination Study (NHANES), US, 27
National Health Service (NHS), UK
  Caldicott Guardians, 65, 128
  consent requirements, 75
  Data Protection Act and, 65
  Information Centre for Health and Social Care, 148
  Million Women Study, based on NHS Breast Screening Centres, 27
  Proportionate Review Service, fast-track review, 107
  research without consent, conditions allowing for, 82
  subjects, approaching and selecting, 81
  UK Biobank data from, 17
  use of data without patient consent, 45
National Health Service (NHS) Act, UK, 65, 82, 83–84
National Information Governance Board, UK, 65, 82, 83, 131
National Institutes of Health, US, 121, 133–135, 138
National Research Act, US (1974), 60
National Research Council, US, 106
Nazi doctors, medical experiments prosecuted, 67, 68
A New Pathway for the Regulation and Governance of Health Research (Academy of Medical Sciences, UK, 2011), 65, 66
newborn blood screening, 15, 112
NHANES (National Health and Nutrition Examination Study), US, 27
Nicholson, Mary Jo, 49
non-identified data, means of identifying, 96–98, 119–120
non-research access, requests for, 132–135, 159
Norway, subject to EU Data Protection Directive, 41
notification requirements
  breaches of security, in case of, 135–136
  de-identified data, in case of incidental identification of, 144
Nuffield Council on Bioethics, UK, 65
Nuremberg Code, 67
OECD see Organisation for Economic Co-operation and Development (OECD) privacy principles
Office for Human Research Protections (OHRP), US, 108–109, 149
O’Neill, Onora, 68, 71, 75
Ontario Personal Health Information Protection Act (PHIPA), Canada, 53
openness and transparency
  access to data, open, 140–141
  EU Data Protection Directive on, 42
  as OECD privacy principle, 39
opt-in versus opt-out consent agreements, 77–78
Organisation for Economic Co-operation and Development (OECD) privacy principles, 38–40
  Canadian Standards Association, Model Code for the Protection of Personal Information based on, 48
  Council of Europe Convention 108 compared to, 40
  Japanese Personal Information Protection Act aligned with, 50
ownership of data and biospecimens, 22–24
participants see subjects
patient outcome registries, 13
penalties and enforcement, 135–137
Percival, Thomas, 14
personal data, identifiability of see identifiability
Personal Genome Project (PGP), Harvard University, 121–122, 123–124, 141
personal health records (PHRs), 11
Personal Information Protection and Electronic Documents Act (PIPEDA), Canada, 48–49, 53, 128, 153
Personal Privacy in an Information Society (US Congressional Commission, 1977), 38
PHIPA (Ontario Personal Health Information Protection Act), Canada see Ontario Personal Health Information Protection Act (PHIPA), Canada
PIPEDA (Personal Information Protection and Electronic Documents Act), Canada see Personal Information Protection and Electronic Documents Act (PIPEDA), Canada
platforms, 18
  consent problems with, 5, 85
  data flow via, 18
  defined, 18
  examples of, 26–28
  opportunities afforded by, 5
police access to data, 6, 132–135
pooling data, 146, 147
Population Health Research Data Repository, MHCP, 28, 148
primary versus secondary use of data, 9–10
principal investigators (PIs), 126, 141
Privacy Act, Australia see Australian Privacy Act
Privacy Act, Canada, 48
Privacy Act, US, 38, 47, 59, 92
privacy, confidentiality, and health research, 1–6, xiii–xiv
  access to data, 140–152
    see also access to data
  conflict between privacy and research, problem with framing issue as, 1
  consent issues, 67–86
    see also consent
  counterclaims to right to privacy, 31
  data and biospecimens, 7–26
    see also data and biospecimens
  data sharing, 138–140
    see also data sharing
  definitions
    confidentiality, 33–34
    distinguishing privacy and confidentiality, 32, 33
    privacy, 29–33
    research, 24–26
    safeguards, 34
  e-health revolution and, 10–11, xiii–xiv
  fundamental human right, privacy as, 30–32
  future progress, actions and policies likely to contribute to, 158–159
  genetics and genomics, 111–123
    see also genetics and genomics
  importance of privacy protection to successful research process, 3–4
  integration of different types of health research, 5
  laws, regulations, and guidelines on, 35–51, 52–66
    see also laws, regulations, and guidelines
  personally identifiable data, 87–110
    see also personal data and identifiability
  as public-interest cause see public interest
  safeguards and responsibilities regarding, 125–137
    see also safeguards and responsibilities
  scale-based problems with, 5
  scientific opportunities, new developments in, 5
  vulnerability of health care versus health research data, 4
privacy officers, 128
privacy-preserving data linkage systems, 105, 146–149
privacy risk assessments, 131–132
product postmarketing regimes, 58–59
professional competency requirements for access to data, 142
profiling, 89, 90, 96, 119
“pseudonymized” or “pseudoanonymized” data, 104, 106
public access to data see access to data
public health practice distinguished from research, 25–26
public health regimes, 53–54
Public Health Service Act, US, 54
public interest
  cost of health care and, 2
  EU Data Protection Directive on, 44
  future progress, actions and policies likely to serve, 159
  health research as, 1–3
  international transfers from EU/EEA on grounds of, 155
  privacy protection in, 3–4
Public Population Project in Genomics, 17
public research resource platforms see platforms
purpose limitations on access to data, 143
purpose specification principle, OECD guidelines, 39
race, and data sensitivity, 21–22
re-identification of data
  non-identified data, means of identifying, 96–98
  retaining possibility of, 98–99
recontacting subjects
  access agreements, conditions in, 143
  genotype-driven recontact, 115–116
records see electronic health records; electronic medical records; personal health records
Records, Computers, and the Rights of Citizens (US HEW Advisory Committee on Automated Personal Data Systems, 1973), 37
registries, 12–13
  defined, 12–13
  EU Data Protection Directive failing to accommodate, 44
  Medical Ethics (Percival, 1849), on research use of hospital, 14
  patient outcome registries, 13
regulatory regimes see laws, regulations, and guidelines
relatives of research subjects
  communication of genetic findings to, 123, 159
  familial risk data, improvements in collection, storage, and retrieval of, 122–123
  implications of a person’s genotype for, 120
  privacy and confidentiality protections needed for, 5
  recruitment, genotype-driven, 115–116
relocation of projects, access agreement conditions regarding, 144
Report of the Committee on Privacy (Younger Report, UK House of Lords, 1972), 37
research
  activities preparatory to, 80
  defined, 24–26
  EU Data Protection Directive’s failure to define, 44
  public health practice distinguished, 25–26
  see also privacy, confidentiality, and health research
research data centers (data enclaves), 149–151
research participants/subjects see subjects
research platforms see platforms
researchers
  data controllers and processors, 127
  non-identifiability of data in hands of, 107–108, 158
  responsibilities of see safeguards and responsibilities
  third-party access to data, legal power to resist, 6
resource platforms see platforms
responsibilities see safeguards and responsibilities
restricted access to data, 141
retention of data and biospecimens, 129, 144
Rethinking Informed Consent in Bioethics (Manson and O’Neill), 68
return or destruction of data or biospecimens at conclusion of project, 144
Richards, Martin, 73
rights
  consent as, 67
  disassociated data or biospecimens, adhering to, 67
  to know data about self, 23, 39, 43
  privacy, 30–32
  waiving of legal rights not allowed in consent agreements, 71
  to withdraw consent, 67
risk assessments
  identifiability, 99–100, 103
  privacy, 131–132
  security, 130
rules see laws, regulations, and guidelines
safe harbors
  data safe harbors/data havens/data enclaves/data research centers, 149–151
  EU-US Safe Harbor Agreement on transfers of data, 154, 155
  four different arrangements termed as, 100
  US HIPAA Privacy Rule, removal of prohibited identifiers under, 100–102
safeguards and responsibilities, 125–137
  accountability principle, OECD guidelines, 40
  broad consent, as supporting, 77
  day-to-day operational safeguards, listing of, 125–126
  definition of safeguard, 34, 125
  enforcement and sanctions, 135–137
  formal responsibility roles, 126–129
  MHCP pledge of privacy, 127
  non-research access, requests for, 132–135
  privacy risk assessments, 131–132
  retention of data and biospecimens, 129
  security and cybersecurity, importance of attention to, 130–131
  security safeguards principle, OECD guidelines, 39
  stewardship, concept of, 129
sanctions and enforcement, 135–137
secondary use of data
  distinguishing from primary use, 9–10
  EU Data Protection Directive on, 43
security of data
  access to data, security as condition of, 144, 159
  breaches, documented incidents of, 4
  concept and elements of, 130–131
  as distinguished from privacy and confidentiality, 34
  as OECD privacy principle, 39
sensitivity of data, 18–22
  ancestry, race, and ethnicity as controversial categories, 21–22
  Canadian PIPEDA on, 48
  categories of sensitive data, 20
  de-identification considerations and, 94
  EU Data Protection Directive on, 22, 42, 43, 44
  key-coding, encryption of, 104
  as policy issue, 22
sharing data see data sharing
single nucleotide polymorphisms (SNPs), 114
social/behavioral research integrated with health research, 15
Social Security Numbers (SSNs), US, 96–98
Solove, Daniel, 29
Statistics Act, Canada, 150
Statistics Canada, 150
stewardship, 129
subjects
  access agreements, conditions regarding recontacting subjects in, 143
  access to data about self
    EU Data Protection Directive on suspension of, 43
    as general principle, 23
  approaching and selecting for research, 80–81, 158
  in clinical trials, 57–59
  communication of genetic findings to, 123, 159
  consent of see consent
  data-subjects, definition of, 8
  Declaration of Helsinki on protection of, 36, 54–55, 60, 67, 68, 71, 78
  financial or other payback for use of data, issue of, 24
  genotype-driven recontacting and recruitment of, 115–116
  individual participation principle, OECD guidelines, 39
  policy of no human subject if no personal data involved, 108–109, 159
  terminology alternatives for, 8
  see also relatives of research subjects
Sweden, early privacy legislation in, 38
Switzerland, data protection law parallel to EU data protection laws, 41
termination of projects, access agreement conditions regarding, 144
third-party access to data see access to data
Thomas, Richard, 81, 139, 150
transfers of data
  access agreements, onward transfer limitations in, 144
  within Canada, 153
  within EU/EEA, 153
  see also international transfers of data
transparency see openness and transparency
trust, consent viewed as delegation of, 85–86
UK Biobank, 17, 27, 78–79, 132, 139, 152
UK Biobank Ethics and Governance Council, 152
UK Data Archive, 27, 139
Understanding Privacy (Solove, 2008), 29
unique bits of data, and identifiability, 95, 106
United Kingdom
  breach of confidence in English common law, 33
  Caldicott Guardians in, 65, 128
  health-related privacy regimes in, 58, 65–66
  principal investigators (PIs) in, 126
  Report of the Committee on Privacy (1972, Younger Report), 37
  research without consent, conditions allowing for, 82–84
United Kingdom laws and regulations
  Clinical Trials Regulations, 65
  Data Protection Act see Data Protection Act, UK
  Health and Social Care Act, 65, 83
  Human Fertilisation and Embryology Act, 65
  Human Rights Act, 30
  Human Tissue Act, 14, 65, 83–84
  National Health Service (NHS) Act, 65, 82, 83–84
United Kingdom organizations and agencies
  Academy of Medical Sciences, 35, 65, 66
  Cancer Research UK, 65
  Economic and Social Research Council, 65
  General Practice Research Database (GPRD), 28, 148
  Human Genetics Commission, 65
  Human Tissue Authority, 83–84
  Information Commissioner, 76, 90, 107
  Medical Research Council, 65, 138
  Medicines and Healthcare Products Regulatory Agency, 58–59
  National Health Service (NHS) see National Health Service (NHS), UK
  National Information Governance Board, 65, 82, 83, 131
  Nuffield Council on Bioethics, 65
United States
  Canadian approach to privacy compared, 49
  Certificates of Confidentiality in, 133–135
  consent not involving waiver of legal rights in, 71
  few omnibus privacy and data protection regimes in, 47–48
  health-related privacy regimes in, 52–53, 54, 55–57, 58, 60, 61, 62
  IRB system, 56, 60, 61, 77, 149
  Medicare and Medicaid, 53
  Personal Privacy in an Information Society (1977), 38
  principal investigators (PIs) in, 126
  privacy protections generally cast as fair information practices in, 32
  Records, Computers, and the Rights of Citizens (1973), 37
  research without consent, conditions allowing for, 81, 84
  Social Security Numbers (SSNs), 96–98
  transfers of data from EU/EEA
    to academic, noncommercial, and governmental organizations, 156
    EU-US Safe Harbor arrangement for commercial organizations, 154, 155
United States laws and regulations
  Confidential Information Protection and Statistical Efficiency Act, 150
  Federal Common Rule see Federal Common Rule on Protection of Human Subjects, US
  Federal Trade Commission Act, 47
  Food, Drug, and Cosmetic Act, 57
  Freedom of Information Act (FOIA), 59, 135
  Genetic Information Nondiscrimination Act (GINA), 137
  Health Information Technology Economic and Clinical Health (HITECH) Act, 10
  Health Insurance Portability and Accountability Act (HIPAA) see Health Insurance Portability and Accountability Act
  National Research Act, 60
  Privacy Act, 38, 47, 59, 92
  Public Health Service Act, 54
United States organizations and agencies
  Agency for Healthcare Research and Quality, 54
  Centers for Disease Control and Prevention (CDC), 54, 133–135
  Food and Drug Administration, 58, 133–135
  Institute of Medicine, 62, 63, 65, 66, 77, 103, 105
  National Center for Health Statistics, 54, 150
  National Institutes of Health, 121, 133–135, 138
  National Research Council, 106
  Office for Human Research Protections (OHRP), 108–109, 149
  Veterans Health Administration, 53
United States v. Karl Brandt et al., 67
Universal Declaration of Human Rights, 30
universal standards for international transfers, importance of developing, 156–157
University of Manitoba see Manitoba Centre for Health Policy (MHCP), University of Manitoba
use limitation principle, OECD guidelines, 39
Vanderbilt University BioVU Program, 28, 115, 148
Veterans Health Administration, US, 53
vulnerable people
  consent by, 74
  data sensitivity and, 21
  laws and regulations protecting, in research, 59–60
  public health regimes aimed to help, 54
waiver of legal rights prohibited in consent agreements, 71
Walport, Mark, 81, 139, 150
Wellcome Trust, 65, 121, 138
Wellcome Trust Case Control Consortium, 27, 153
Western Australia Data Linkage System, 28, 147
withdrawal of consent, 78–79, 143
World Medical Association, 54
Younger Report (Report of the Committee on Privacy, UK House of Lords, 1972), 37