The Law and Ethics of Data Sharing in Health Sciences (Perspectives in Law, Business and Innovation) 981996539X, 9789819965397


Table of contents:
Preface
Contents
Editors and Contributors
The Dynamic Context and Multiple Challenges of Data Sharing
1 Introduction
2 A Fast-Moving Regulatory Landscape
3 Book Structure and Chapters
4 Conclusion
References
The GA4GH Regulatory and Ethics Work Stream (REWS) at 10: An Interdisciplinary, Participative Approach to International Policy Development in Genomics
1 Introduction
2 Creation of the GA4GH’s Framework for Responsible Sharing of Genomic and Health-Related Data
3 GA4GH Policies, Standards, and Tools
4 Conclusion: GA4GH 10th Anniversary—A Shift Towards Maturity?
Appendix 1
References
Assessing Public and Private Rights of Action to Police Health Data Sharing
1 Introduction
2 Policing Medical Data Sharing Using Public Rights of Action
3 Policing Medical Data Sharing Using Private Rights of Action
4 Designing Regulation for Health Data Through Public and Private Enforcement
4.1 Public Versus Private Enforcement Under Public Law
4.2 Private Enforcement Under Private Law
4.3 Changing Public Laws to Cover More Health Data
5 Conclusion
References
Patient Perspectives on Data Sharing
1 Introduction
2 Patient Perspective on Data Sharing
2.1 Patient Motivations for Data Sharing
2.2 Patient Concerns About Data Sharing
2.3 Patient Views on Privacy, Trust, Distrust and Conditions for Sharing
2.4 Patient Views on Sharing of Data for Secondary Uses
2.5 Patient Views on Data Sharing for Artificial Intelligence or Machine Learning
3 The Patient Perspective in a Changing Context of Data Sharing
3.1 The Changing Context
3.2 The Fit of Patient Perspectives with the Changing Context of Data Sharing
4 Concluding Remarks and Future Perspectives
References
Operationalizing the Use of Existing Data in Support of Biomedical Research and Innovation: An Inclusive and Sustainable Approach
1 Introduction
2 The Problem of Under Utilization
3 European Data Strategy—the Legal Approach
4 Citizen Science and RRI
5 Discussion
6 Conclusion
References
Dobbs in a Technologized World: Implications for US Data Privacy
1 Introduction
2 PHI, ePHI, and Other Health Care Data
3 Financial Data
4 Tracking
5 Social Media
6 Conclusion
References
Consent and Retrospective Data Collection
1 Introduction
2 Retrospective and Prospective Data Collection
3 Legal Implications
4 Consent (Article 6(1)(a) GDPR)
5 Exemptions for Processing of Health Data in Research
6 Further Processing
7 An Ethical Perspective on Consent
7.1 Definition of Ethical Consent
7.2 Relationship Between GDPR-Based Consent and Ethical Consent
7.3 Looking at Specific Elements of Informed Consent from an Ethical Perspective
7.4 Practical Application
8 Conclusion
References
Enabling Secondary Use of Health Data for the Development of Medical Devices Based on Machine Learning
1 Introduction: Why the Development of ML-Based Medical Devices Needs Secondary Use of Health Data
2 Secondary Use of Health Data for ML Under the GDPR
2.1 The Concept of “Scientific Research” and Its Applicability to the Development of ML-Based Medical Devices
2.2 Compatibility of Initial Purpose and Purpose of Secondary Use
2.3 Why the Obligations of the GDPR on Secondary Use for ML Are Insufficient
3 Enabling Secondary Use for the Development of ML-Based Medical Devices: Conducive Approaches
3.1 Explicit Legal Bases for the Processing of Health Data for the Development of ML-Based Medical Devices
3.2 Infrastructure and Intermediary: Extension of the Concept for the Use of Research Data
3.3 Standards of Data Preparation
4 EHDS: A Way Forward?
4.1 Secondary Use of Health Data for the Development of ML-Based Medical Devices: An Overview of Legal Bases and Procedures
4.2 Enabling Secondary Use of Health Data for ML: Implications from the EHDS
4.3 Does the EHDS Introduce a New Perspective on Data (Sharing)?
5 Conclusion
References
Supplementary Measures and Appropriate Safeguards for International Transfers of Health Data After Schrems II
1 Introduction
2 A Critical Appraisal of Schrems II
3 Post-Schrems II Developments
3.1 National Regulatory Responses
3.2 The EDPB Guidelines
3.3 The New Standard Contractual Clauses
4 A Critical Analysis of Supplementary Measures
5 Conclusion
References
The Internal Network Structure that Affects Firewall Vulnerability
1 Introduction
2 The Intrinsic Vulnerabilities of a Firewall
2.1 Reasons for Using Models to Study Firewall Vulnerabilities
2.2 The Assumptions of the Simple Models
2.3 Examination of Firewall Vulnerabilities Using Simple Models
2.4 Suggestions
3 Graphical Model of Information Flow
3.1 Simple Examples
3.2 A Model of Firewalls
3.3 Numerical Simulation
3.4 Limitation of Our Model
4 An Information Flow Model with Information Source and Monitor
4.1 Simple Example
4.2 Numerical Simulation: Effects of Source and Monitor Position
5 Conclusion
References
Index


Perspectives in Law, Business and Innovation

Marcelo Corrales Compagnucci · Timo Minssen · Mark Fenwick · Mateo Aboy · Kathleen Liddell, Editors

The Law and Ethics of Data Sharing in Health Sciences

Perspectives in Law, Business and Innovation

Series Editor
Toshiyuki Kono, Kyushu University, Fukuoka, Japan

Editorial Board
Hans-Wolfgang Micklitz, Robert Schuman Centre for Advanced Studies, European University Institute, Florence, Italy
Ryu Kojima, Graduate School of Law, Kyushu University, Fukuoka, Japan
Erik P. M. Vermeulen, Tilburg University, Tilburg, The Netherlands
Urs Gasser, Technical University of Munich, Munich, Germany
Jane C. Ginsburg, Columbia University Law School, New York, USA

Over the last three decades, interconnected processes of globalization and rapid technological change—particularly, the emergence of networked technologies—have profoundly disrupted traditional models of business organization. This economic transformation has created multiple new opportunities for the emergence of alternate business forms, and disruptive innovation has become one of the major driving forces in the contemporary economy. Moreover, in the context of globalization, the innovation space increasingly takes on a global character. The main stakeholders—innovators, entrepreneurs and investors—now have an unprecedented degree of mobility in pursuing economic opportunities wherever they arise. As such, frictionless movement of goods, workers, services, and capital is becoming the “new normal”. This new economic and social reality has created multiple regulatory challenges for policymakers as they struggle to come to terms with the rapid pace of these social and economic changes. Moreover, these challenges impact across multiple fields of both public and private law. Nevertheless, existing approaches within legal science often struggle to deal with innovation and its effects. Paralleling this shift in the economy, we can, therefore, see a similar process of disruption occurring within contemporary academia, as traditional approaches and disciplinary boundaries—both within and between disciplines—are being re-configured. Conventional notions of legal science are becoming increasingly obsolete or, at least, there is a need to develop alternative perspectives on the various regulatory challenges that are currently being created by the new innovation-driven global economy. The aim of this series is to provide a forum for the publication of cutting-edge research in the fields of innovation and the law from a Japanese and Asian perspective. The series will cut across the traditional sub-disciplines of legal studies but will be tied together by a focus on contemporary developments in an innovation-driven economy and will deepen our understanding of the various regulatory responses to these economic and social changes. The series editor and editorial board carefully assess each book proposal and sample chapters in terms of their relevance to law, business, and innovative technological change. Each proposal is evaluated on the basis of its academic value and distinctive contribution to the fast-moving debate in these fields. All books and chapters in the Perspectives in Law, Business and Innovation book series are indexed in Scopus.

Marcelo Corrales Compagnucci · Timo Minssen · Mark Fenwick · Mateo Aboy · Kathleen Liddell, Editors

The Law and Ethics of Data Sharing in Health Sciences

Editors Marcelo Corrales Compagnucci CeBIL University of Copenhagen København S, Denmark

Timo Minssen CeBIL University of Copenhagen Copenhagen K, Denmark

Mark Fenwick Kyushu University Fukuoka, Japan

Mateo Aboy University of Cambridge Cambridge, UK

Kathleen Liddell University of Cambridge Cambridge, UK

ISSN 2520-1875  ISSN 2520-1883 (electronic)
Perspectives in Law, Business and Innovation
ISBN 978-981-99-6539-7  ISBN 978-981-99-6540-3 (eBook)
https://doi.org/10.1007/978-981-99-6540-3

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024

Chapters "Patient Perspectives on Data Sharing" and "Supplementary Measures and Appropriate Safeguards for International Transfers of Health Data After Schrems II" are licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). For further details see the license information in the chapters.

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Paper in this product is recyclable.

Preface

This work represents a collaborative effort, drawing upon the extensive global network of the scientifically independent Collaborative Research Program in Biomedical Innovation Law (the CeBIL Program) and its successor, the Inter-CeBIL Program (together, the Programs), which commenced on September 1, 2023. Generously supported by grants from the Novo Nordisk Foundation and hosted by the Center for Advanced Studies in Biomedical Innovation & Life Science Law (CeBIL), these Programs bring together esteemed researchers and research centers, including the University of Copenhagen, Harvard Law School, Harvard Medical School, University of Michigan Law School, and the University of Cambridge. The collective mission is to collaborate on cutting-edge legal research at the forefront of health and life science innovation.

The Programs' overall aim and ambition is to contribute to the translation of ground-breaking biomedical and life science research into safe, effective, affordable and accessible therapies and biosolutions. This is done by analyzing the most significant legal challenges in pharmaceutical innovation, public health and life science innovation from a holistic cross-disciplinary perspective. The scientifically independent Programs address fundamental legal challenges in the pharma and life sciences, taking into account the ecosystem of bio-pharmaceutical innovation and healthcare. The focus is on biomedical and life science innovation law that cuts across various legal disciplines. It brings in interdisciplinary, industry and policy perspectives, providing a much-coveted contribution to the future of the biomedical and life science innovation system.

Over the course of the initial five and a half years, the CeBIL Program dedicated its efforts to addressing inefficiencies in innovation within the realms of biomedical advancement. This was achieved through a series of five interconnected studies encompassing precision medicine and digital innovation, antimicrobial resistance, biologics, new applications and orphan drugs. These studies were further complemented by a sixth synergy study, collectively exploring the forefront of biomedical innovation. Since September 2023, the Inter-CeBIL Program has been continuing this mission with additional studies and use cases focused on advanced medical computing, pandemic preparedness, sustainable health and life science innovations, as well as incorporating the areas of NGTs and novel biosolutions in the food and plant sciences. What unifies and has always unified these studies and areas of innovation is the undeniable need for effective and legally robust frameworks to facilitate data sharing.

Focusing on these intersections and the significance of data sharing, this volume is part of study six of the initial CeBIL Program: Synergy & Policy Solutions. It is a first step towards developing a multi-level conceptual framework that integrates law, ethical considerations, economics, science, business and policy. It forms the basis for a holistic trans-disciplinary perspective to address and synthesize the most significant legal challenges in pharmaceutical innovation and public health. The aim is to contribute legal insights to the translation of ground-breaking biomedical research into beneficial therapies that reach patients. To achieve this, study six synthesized the scientific, business and policy aspects of bio-pharmaceutical innovation challenges into a synergistic framework. The framework acknowledges similarities and differences across the challenges, providing better solutions. In that way, study six helped develop a platform for interaction between different stakeholders within the complex ecosystem of biomedical innovation.

Accordingly, contributors to this book include philosophers, social scientists, pharmacists, ethicists, mathematicians, legal scholars and practitioners from Europe, East Asia and the Americas. They offer some of the latest thinking on the current challenges and opportunities of data sharing in the health sciences, with special reference to the associated legal and ethical issues.

This book was supported by a Novo Nordisk Foundation grant for a scientifically independent Collaborative Research Program in Biomedical Innovation Law (grant agreement number NNF17SA0027784), which will continue as the Inter-CeBIL Program from September 1, 2023 (grant agreement number NNF23SA0087056). The editors are indebted to the authors and co-authors of each chapter for their hard work, patience and cooperation throughout the whole process, from the initial concept to the final manuscript. Finally, the editors are grateful to the Springer staff for their support and efforts in ensuring final and timely publication.

Marcelo Corrales Compagnucci, Copenhagen, Denmark
Timo Minssen, Copenhagen, Denmark
Mark Fenwick, Fukuoka, Japan
Mateo Aboy, Cambridge, UK
Kathleen Liddell, Cambridge, UK

Contents

The Dynamic Context and Multiple Challenges of Data Sharing (p. 1)
Marcelo Corrales Compagnucci, Mark Fenwick, Timo Minssen, and Mateo Aboy

The GA4GH Regulatory and Ethics Work Stream (REWS) at 10: An Interdisciplinary, Participative Approach to International Policy Development in Genomics (p. 13)
Yann Joly, Edward Dove, Bartha Maria Knoppers, and Dianne Nicol

Assessing Public and Private Rights of Action to Police Health Data Sharing (p. 33)
David A. Simon, Carmel Shachar, and I. Glenn Cohen

Patient Perspectives on Data Sharing (p. 51)
Louise C. Druedahl and Sofia Kälvemark Sporrong

Operationalizing the Use of Existing Data in Support of Biomedical Research and Innovation: An Inclusive and Sustainable Approach (p. 69)
Helen Yu

Dobbs in a Technologized World: Implications for US Data Privacy (p. 85)
Jheel Gosain, Jason D. Keune, and Michael S. Sinha

Consent and Retrospective Data Collection (p. 99)
Tima Otu Anwana, Katarzyna Barud, Michael Cepic, Emily Johnson, Max Königseder, and Marie-Catherine Wagner

Enabling Secondary Use of Health Data for the Development of Medical Devices Based on Machine Learning (p. 127)
Lea Köttering

Supplementary Measures and Appropriate Safeguards for International Transfers of Health Data After Schrems II (p. 151)
Marcelo Corrales Compagnucci, Mark Fenwick, Mateo Aboy, and Timo Minssen

The Internal Network Structure that Affects Firewall Vulnerability (p. 173)
Shinto Teramoto, Shizuo Kaji, and Shota Osada

Index (p. 199)

Editors and Contributors

About the Editors

Marcelo Corrales Compagnucci is an Associate Professor and Associate Director at the Center for Advanced Studies in Biomedical Innovation Law (CeBIL), Faculty of Law, University of Copenhagen in Denmark. He specializes in information technology, privacy and data protection law. His research interests are the legal issues involved in disruptive innovation technologies and biomedicine. His past activities have included working as a consultant and lawyer for law firms, IT companies and international organizations such as the OECD and the World Bank. He was also a research associate with the Institute for Legal Informatics (IRI) at Leibniz Universität Hannover in Germany, and a visiting research fellow at various research centers around the world, including the Petrie-Flom Center for Health Law Policy, Biotechnology and Bioethics at Harvard Law School, the Max Planck Institute for Comparative and International Private Law (Hamburg), the Max Planck Institute for Innovation and Competition (Munich), the SCRIPT research center at the University of Edinburgh in Scotland, and the Academia Sinica in Taiwan. He has a Doctor of Laws (LL.D.) degree from Kyushu University in Japan. He also holds a Master of Laws (LL.M.) in international economics and business law from Kyushu University, an LL.M. in law and information technology and an LL.M. in European intellectual property law, both from the University of Stockholm in Sweden. He has published extensively in the field of IT law. His most recent book collections, co-edited with various authors, include AI in eHealth: Human Autonomy, Data Governance and Privacy in Healthcare (Cambridge University Press, 2022); Smart Contracts: Technological, Business and Legal Perspectives (Hart Publishing, 2021); and Legal Design: Integrating Business, Design and Legal Thinking with Technology (Edward Elgar Publishing, 2021). A full list of publications is available at: https://research.ku.dk/search/result/?pure=en%2Fpersons%2F662698.

Timo Minssen is a Professor of Health and Life Science Innovation Law at the University of Copenhagen (UCPH), the Founding Director of UCPH's Center for Advanced Studies in Biomedical Innovation Law (CeBIL) and an LML Research Affiliate at the University of Cambridge. Professor Minssen is also the principal investigator and the Novo Nordisk Foundation grant-holder of both the CeBIL program and the Inter-CeBIL program. His research, supervision, teaching and part-time advisory practice concentrate on intellectual property, competition and regulatory law, with a special focus on the law and ethics of new technologies, big data and artificial intelligence in the health and life sciences. Originating from Germany, he holds master's and doctoral degrees from the Universities of Göttingen, Uppsala and Lund. In addition to gaining practical experience in German courts and international law firms, Timo has been a visiting scholar and research fellow at Harvard University, the Universities of Oxford and Cambridge, the Chicago-Kent College of Law, the Max Planck Institute for Innovation and Competition Law and the Pufendorf Institute for Advanced Studies. He is a member of several international committees and regularly advises the WHO, WIPO, the EU Commission, various research organizations, companies, national governments and law firms. He has presented his research at international symposia, major law firms, the Universities of Oxford, Cambridge, Hong Kong and Tokyo, Harvard Law School, Harvard Business School, Stanford Law School, Yale, MIT, the Broad Institute, as well as at the WHO, WIPO, the European Medicines Agency and national ethics councils. His publications comprise six books, as well as more than 200 articles, book chapters and internet publications. Timo's research has been featured in, inter alia, The Economist, The Financial Times, El Mundo, Politico, the WHO Bulletin, the Times of India and Times Higher Education, and is published both in leading legal journals and in top "wet-science" journals, such as Science, NEJM Catalyst, Harvard Business Review, JAMA, Nature Biotechnology, Nature Genetics, Nature Electronics, npj Digital Medicine, The Lancet Digital Health, and PLoS Computational Biology. He is also a regular contributor to Harvard Law School's "Bill of Health" blog.

Mark Fenwick is a Professor of International Business Law at the Faculty of Law, Kyushu University, Fukuoka, Japan. His primary research interests are in the fields of business regulation in a networked age and white-collar and corporate crime. Recent publications include New Technology, Big Data & the Law (Springer, 2017: co-edited with M. Corrales Compagnucci and N. Forgó), Robotics, AI and the Future of Law (Springer, 2018: co-edited with M. Corrales Compagnucci and N. Forgó), Smart Contracts: Business, Legal & Technological Perspectives (Hart/Bloomsbury, 2021: co-edited with M. Corrales Compagnucci and S. Wrbka), and Organizing-for-Innovation: Corporate Governance in a Digital Age (Springer, 2023: co-authored with E.P.M. Vermeulen and T. Kono). He has a master's degree and Ph.D. from the Faculty of Law, University of Cambridge (Queens' College) and has been a visiting professor at Cambridge University, Chulalongkorn University, Duke University, the University of Hong Kong, Shanghai University of Finance & Economics, the National University of Singapore, Tilburg University and Vietnam National University. He has also conducted research for the EU, OECD and the World Bank and is a contributor to the International Corporate Governance Network Yearbook.

Mateo Aboy is the Director of Research in Biomedical Innovation, AI & Law at the Faculty of Law, University of Cambridge. He is a member of the Centre for Law, Medicine & Life Sciences (LML) and the Centre for IP & Information Law (CIPIL) at the University of Cambridge. His multidisciplinary background includes a combination of law, engineering, regulatory science and management experience. He holds degrees in electrical & computer engineering (BS, BSEE, MSECE, MPhil/DEA, PhD ECE), law (LLB, SJD/PhD/LLD) and international management (MBA), as well as professional registrations as a Professional Chartered Engineer (CEng, EU/ES COIT), Certified Licensing Professional (CLP), Patent Practitioner with a Bar Admission/licensed to practice in patent cases before the United States Patent Office (USPTO), Fellow of Information Privacy (FIP, IAPP), Certified Privacy Information Professional (CIPP/E), Certified Privacy Manager (CIPM), Lead Implementer of Information Security Management Systems (ISMS, ISO 27001), Lead Auditor of Medical Device Quality Management Systems (ISO 13485), Lead Implementer of Privacy Information Management Systems (PIMS, ISO 27701) and Certified Data Protection Officer (C-DPO). His research investigates the intersection of digital innovation, IP policy and the economics of healthcare by exploring the key tenets of innovation, with a focus on the digital health, biotech and pharmaceutical industries. This includes investigating the transformation of medical technology, drug development and healthcare delivery, as well as the associated legal, regulatory, policy and strategy questions raised by the growth of medical AI/ML/QC and biotech innovations for personalized medicine. Mateo's research seeks to understand the drivers of technology-based innovation, IP incentives and the determinants of how emergent medical technologies are protected, regulated, funded, developed, adopted and used in practice. He is the author of more than 150 scholarly articles published in leading scientific, engineering and legal journals. His professional experience includes work in various senior roles both in the private sector and in academia. As a Licensed Patent Practitioner, he has successfully prosecuted numerous cases before the USPTO, focusing primarily on medical device and computer-implemented inventions. He holds over 20 patents as an inventor.

Kathleen Liddell is the Director of the Centre for Law, Medicine and Life Sciences (LML), and the Herschel Smith Professor of Intellectual Property and Medical Law at the Faculty of Law, University of Cambridge. Her research focuses on innovations in health, medicine and society, with the aim of understanding and improving the legal frameworks that govern and support these fields. Professor Liddell has been the principal investigator for several large projects on intellectual property and information law, including in relation to genomics, precision medicine, repurposing pharmaceuticals and antimicrobial resistance. She has worked on policy reports for national health departments, national ethical advisory commissions and the European Commission. She is the recipient of grants from (for example) the Wellcome Trust, the Philomathia Foundation, the Cambridge ESRC Impact Acceleration Account and the Novo Nordisk Foundation. Her expertise extends to other areas of the life sciences, including national and international regulation of clinical trials, medical negligence, biomaterials (including human tissue, cells and organs), pharmaceuticals, diagnostics and personalized and regenerative medicine. She studied law and natural sciences at the University of Melbourne before undertaking her doctorate in law at the University of Oxford and a master's degree in Bioethics at Monash University. In addition to academia, Professor Liddell has worked in private legal practice and in the civil service.

Contributors

Mateo Aboy, Center for Law, Medicine and Life Sciences, University of Cambridge, Cambridge, UK
Tima Otu Anwana, Department of Innovation and Digitalisation in Law, University of Vienna, Wien, Austria
Katarzyna Barud, Department of Innovation and Digitalisation in Law, University of Vienna, Wien, Austria
Michael Cepic, Department of Innovation and Digitalisation in Law, University of Vienna, Wien, Austria
Marcelo Corrales Compagnucci, Center for Advanced Studies in Biomedical Innovation Law (CeBIL), University of Copenhagen, Copenhagen, Denmark
Edward Dove, Edinburgh Law School, University of Edinburgh, Edinburgh, UK
Louise C. Druedahl, Faculty of Law, Centre for Advanced Studies in Biomedical Innovation Law (CeBIL), University of Copenhagen, Copenhagen, Denmark
Mark Fenwick, Faculty of Law, Kyushu University, Fukuoka, Japan
I. Glenn Cohen, Petrie-Flom Center for Health Law Policy, Biotechnology, & Bioethics, Harvard Law School, Cambridge, MA, USA
Jheel Gosain, Saint Louis University School of Law, Saint Louis, MO, USA
Emily Johnson, Department of Innovation and Digitalisation in Law, University of Vienna, Wien, Austria
Yann Joly, Center of Genomics and Policy, Department of Human Genetics, Faculty of Medicine and Health Sciences, McGill University, Montreal, Canada
Shizuo Kaji, Institute of Mathematics for Industry, Kyushu University, Fukuoka, Japan
Sofia Kälvemark Sporrong, Faculty of Pharmacy, Social Pharmacy, Department of Pharmacy, Uppsala University, Uppsala, Sweden
Jason D. Keune, Saint Louis University School of Law, Saint Louis, MO, USA; Albert Gnaegi Center for Health Care Ethics, Saint Louis University, Saint Louis, MO, USA
Bartha Maria Knoppers, Center of Genomics and Policy, Department of Human Genetics, Faculty of Medicine and Health Sciences, McGill University, Montreal, Canada
Max Königseder, MLL Meyerlustenberger Lachenal Froriep AG, Zurich, Switzerland
Lea Köttering, University of Hamburg, Hamburg, Germany
Timo Minssen, Center for Advanced Studies in Biomedical Innovation Law (CeBIL), University of Copenhagen, Copenhagen, Denmark
Dianne Nicol, Centre for Law and Genetics, Faculty of Law, University of Tasmania, Tasmania, Australia
Shota Osada, Faculty of Education, Kagoshima University, Kagoshima, Japan
Carmel Shachar, Center for Health Law and Policy Innovation, Harvard Law School, Cambridge, MA, USA
David A. Simon, Northeastern University School of Law, Cambridge, MA, USA
Michael S. Sinha, Center for Health Law Studies, Saint Louis University School of Law, Saint Louis, MO, USA
Shinto Teramoto, Faculty of Law, Kyushu University, Fukuoka, Japan
Marie-Catherine Wagner, Department of Innovation and Digitalisation in Law, University of Vienna, Wien, Austria
Helen Yu, Value-Based Health and Care Academy, Swansea University, Swansea, Wales

The Dynamic Context and Multiple Challenges of Data Sharing

Marcelo Corrales Compagnucci, Mark Fenwick, Timo Minssen, and Mateo Aboy

Abstract  This chapter outlines the dynamic context and multiple challenges of data sharing in the contemporary data ecosystem, specifically as it relates to healthcare. Here, we define "data sharing" as the practice of sharing health-related data between a number of data controllers and processors. Data shared in this manner can come from the provision of healthcare, clinical trials, observational studies, public health surveillance programs, and other health data collection methods. Several justifications for such sharing are introduced. Our main contention is that the regulatory environment today is an increasingly complex and rapidly evolving combination of norms and principles. Navigating this environment successfully requires careful analysis and judgment from all stakeholders across diverse fields of technology and the law. The purpose of this volume, therefore, is to offer a series of case studies that integrate theoretical and practical perspectives and illustrate how to navigate this complex and rapidly evolving space effectively.

Keywords  Data sharing · Data protection · Clinical trials · Health science · Innovation incentives · Open data

M. C. Compagnucci · T. Minssen, Center for Advanced Studies in Biomedical Innovation Law (CeBIL), University of Copenhagen, Copenhagen, Denmark
M. Fenwick, Faculty of Law, Kyushu University, Fukuoka, Japan
M. Aboy, Center for Law, Medicine and Life Sciences, University of Cambridge, Cambridge, UK

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_1


1 Introduction

Data sharing in health sciences, as defined in this book, is the sharing of data between a plurality of data controllers and processors.[1] Data concerning health is considered a special category of data under the General Data Protection Regulation (GDPR). Thus, the processing (including sharing) of health data is generally prohibited by Article 9(1) GDPR unless one of the Article 9(2) GDPR legal bases for processing applies. In general, the legal bases for the processing of health data under the GDPR are: (1) Article 9(2)(h), "processing is necessary for the purposes of preventive or occupational medicine, for the assessment of the working capacity of the employee, medical diagnosis, the provision of health or social care or treatment or the management of health"; (2) Article 9(2)(i), "processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices"; (3) Article 9(2)(j), "processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes"; and (4) Article 9(2)(a), "the data subject has given explicit consent to the processing of those personal data for one or more specified purposes."

This often raises challenges for secondary uses of personal health data. For instance, the legal basis for processing personal data in the course of a clinical trial falls within the "legal obligation(s) to which the controller is subject" (Article 6(1)(c)) in conjunction with Article 9(2)(i), "ensuring high standards of quality and safety of health care and of medicinal products or medical devices." This is because the processing operations are expressly required by the Clinical Trials Regulation (CTR) to ensure the safety and reliability of medicinal products. However, secondary processing operations of the data collected as part of a clinical trial purely related to research activities cannot be based on a legal obligation. A similar situation is encountered when data that was originally collected by a hospital for the provision of healthcare or treatment (primary use) is later desired to be used to support the development of a medical technology (e.g., an AI-enabled medical device). In a typical scenario, the complexities and challenges include (1) a different legal basis for processing between primary and secondary uses of the data, (2) additional controllers and processors for the secondary uses that differ in legal nature (e.g., public hospitals vs. private corporations), and (3) the need to share data across cloud-based systems, which may also give rise to international data transfers between jurisdictions.

Despite these challenges, data sharing is often necessary to achieve desirable social goals. The importance of data sharing in health sciences can be attributed to several factors. To begin with, it allows other researchers to validate and replicate the results of a study, which is an integral element of evidence-based medicine and the scientific method, and establishes a substantive basis for future research.[2] An additional benefit of sharing is that it allows researchers to build upon the work of others and combine evidence from multiple studies in order to address different and more complex issues. Finally, the sharing of data can promote collaboration and the exchange of ideas between researchers, accelerating the general pace of scientific discovery and innovation in the health research field.[3] As such, sharing data from original research can facilitate a re-purposing and exploration of novel lines of inquiry, enhance the quality of research output, and reduce the amount of waste generated in research.[4] International organizations (such as the World Health Organization[5] and the Organization for Economic Co-operation and Development[6]) and research funders (such as the Bill and Melinda Gates Foundation[7] or the UK Medical Research Council[8]) have explicitly recognized the need for and value of such sharing and have called for even greater levels of data sharing. Increasingly, high-impact medical journals, such as the BMJ and The Lancet, are also requiring clinical trial data to be shared as a condition of publication, increasing the trend toward wider dissemination and circulation of data.[9] The cross-border character of most data transfers, as well as the centrality of technology and the need for technology-oriented solutions, merely adds to this complexity.[10]

The emergence, development, and proliferation of digital health technologies greatly facilitate, but also add to, the regulatory challenges of data sharing in health science. These technologies now present new opportunities for all stakeholders operating in the ecosystem. It is common practice for clinical research organizations and hospitals to use cloud-based systems for data storage, computation, and sharing.[11] Big data algorithms, AI and predictive analytics are also now being used to help diagnose diseases, develop more effective drugs and treatments, and improve patient care. To develop these techniques, large datasets are usually required. Patients are also able to access medical information more quickly, conveniently, and remotely in real-time, creating an additional layer of pressure.[12] However, many challenges remain in operationalizing these new opportunities, and researchers must now navigate a complex data ecosystem.[13]

The purpose of this edited collection is, therefore, twofold. First, we offer an international perspective on recent regulatory and legal developments in data sharing in the health sciences. Second, the chapters integrate theoretical and practical perspectives to explain how to traverse the various legal complexities that arise in a fast-changing regulatory landscape. Navigating this increasingly complex and rapidly evolving mosaic of norms and principles requires careful analysis and judgment from all stakeholders. Here, by way of introduction to the contributions, we briefly introduce some of the regulatory complexity of data sharing in the health sciences (Sect. 2), before reviewing the substantive chapters and the main themes they address (Sect. 3).

Notes
[1] A 'controller' is the entity who 'defines the means and purposes of the processing' (Article 4(7) GDPR). A 'processor' is the entity who processes personal data on behalf of the controller, if the controller did not process personal data directly themselves but outsourced the task (Article 4(8) GDPR). For more information regarding the concepts of controllers and processors, see Dahi and Corrales Compagnucci (2022).
[2] Tenopir et al. (2011).
[3] Lee et al. (2016).
[4] Yoong et al. (2022).
[5] World Health Organisation, 'Policy on use and sharing of data collected in Member States by the World Health Organization (WHO) outside the context of public health emergencies (Provisional)', available at: https://www.who.int/about/policies/publishing/data-policy. Accessed 19 June 2023.
[6] Organisation for Economic Co-operation and Development (OECD), 'Data governance: Enhancing access to and sharing of data', available at: https://www.oecd.org/sti/ieconomy/enhanced-data-access.htm. Accessed 19 June 2023.
[7] Bill and Melinda Gates Foundation, Information Sharing Approach, available at: https://www.gatesfoundation.org/about/policies-and-resources/information-sharing-approach. Accessed 19 June 2023.
[8] UK Research and Innovation, Data sharing, available at: https://www.ukri.org/councils/mrc/guidance-for-applicants/policies-and-guidance-for-researchers/data-sharing/. Accessed 19 June 2023.
[9] Loder and Groves (2015).
[10] See, for example, Corrales Compagnucci et al. (2021), Jurcys et al. (2022), Minssen et al. (2020), Corrales Compagnucci et al. (2020).
[11] For the legal and technical requirements in cloud-based architectures see, for example, Kirkham et al. (2012), Kousiouris et al. (2013), pp. 61–72, Barnitzke et al. (2011), Barnitzke et al. (2012), Kiran et al. (2013).
[12] Corrales Compagnucci et al. (2022), pp. 1–15.
[13] Corrales Compagnucci et al. (2022), pp. 1–15.
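As an editorial aside, the Article 9(2) GDPR "menu" of legal bases set out in Sect. 1 lends itself to a simple machine-readable checklist. The following is a minimal, hypothetical sketch of such a checklist (our illustration, not the authors' method, and not legal advice): the purpose categories and the purpose-to-basis mapping are simplifying assumptions, and any real assessment requires case-by-case legal analysis.

```python
# Illustrative sketch: mapping processing purposes to candidate Article 9(2)
# GDPR legal bases for health data, as enumerated in the introduction above.
# The purpose categories and the mapping are simplifying assumptions; a real
# assessment requires case-by-case legal analysis (and often a DPIA).

CANDIDATE_BASES = {
    "medical_diagnosis_or_treatment": ["Art. 9(2)(h)"],
    "public_health_protection": ["Art. 9(2)(i)"],
    "scientific_research": ["Art. 9(2)(j)", "Art. 9(2)(a)"],  # research purposes or explicit consent
    "ml_product_development": ["Art. 9(2)(a)"],  # secondary use; research status contested
}

def candidate_bases(purpose: str, explicit_consent: bool) -> list[str]:
    """Return the candidate Art. 9(2) bases for a proposed processing purpose."""
    bases = list(CANDIDATE_BASES.get(purpose, []))
    if not explicit_consent:
        # Without explicit consent, Art. 9(2)(a) drops out of the candidate set.
        bases = [b for b in bases if b != "Art. 9(2)(a)"]
    return bases

if __name__ == "__main__":
    # Secondary use of trial data for pure research, without fresh consent:
    print(candidate_bases("scientific_research", explicit_consent=False))
    # -> ['Art. 9(2)(j)']  (subject to Member State law and safeguards)
```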

2 A Fast-Moving Regulatory Landscape

To illustrate this regulatory complexity, this section briefly examines the fast-moving changes that have affected data security and privacy, two core principles that must be satisfied if data are to be collected and shared in a responsible and sustainable manner. Healthcare organizations must adhere to strict regulations such as the Health Insurance Portability and Accountability Act (HIPAA)[14] in the US, the GDPR[15] in the EU, or the Act on the Protection of Personal Information (APPI)[16] in Japan. Such legislation requires specific privacy protections, including encryption and sharing restrictions, when handling health records. Failing to comply with these regulations can result in hefty fines and reputational damage.[17] Furthermore, the protection of patient data is crucial for maintaining high levels of patient trust.[18]

Legal and ethical requirements are often technically embedded, following privacy-by-design and ethics-by-design approaches, to facilitate the sharing and use of data in a non-discriminatory manner. For this reason, several chapters in this book examine privacy and security issues, particularly the scope of informed consent and its withdrawal, the information of data subjects, international data transfers, data sharing and control, transparency, explainability, as well as secondary uses of data.
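To make concrete what "technically embedding" such requirements can look like, here is a minimal sketch (our illustration, not drawn from the book) of encrypting a health record at rest with symmetric authenticated encryption from the widely used Python cryptography package; the record content and key handling are deliberately simplified assumptions.

```python
# Minimal sketch: encrypting a health record at rest with authenticated
# symmetric encryption (Fernet, from the "cryptography" package).
# Key handling is deliberately simplified for illustration; production
# systems would add key management, rotation, access control and auditing.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice: fetched from a KMS/HSM
fernet = Fernet(key)

record = b'{"patient_id": "P-0001", "diagnosis": "I10"}'  # hypothetical record
token = fernet.encrypt(record)       # ciphertext safe to store or transmit

# Only holders of the key can recover (and authenticate) the record:
assert fernet.decrypt(token) == record
```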

Privacy and data security are also affected, in quite a different way, by laws and draft regulations from the European Union that seek to facilitate data sharing. One example is the proposal for a regulation on the European Health Data Space (EHDS). The EHDS is an ecosystem of health-specific rules and standards, together with a governance framework, designed to empower individuals by enabling them to gain greater digital access to, and control of, their electronic personal health data, at both the national and European levels. This can facilitate the free movement of such data, in addition to encouraging a genuine single market for electronic health record systems, medical devices and high-risk AI systems (the primary use of data). In addition, the EHDS is intended to ensure that health data are used consistently, responsibly and efficiently for research, innovation, policymaking, and regulation (i.e., the secondary use of data).[19]

The recent EU proposal for a regulatory framework on artificial intelligence (the AI Act) offers another example, and an additional layer of regulatory complexity. The AI Act focuses on AI systems that have the potential to affect citizens negatively. It takes a risk-based approach and requires that each AI technology and application be categorized according to its risk level, from unacceptable to minimal. Various criteria are used to categorize AI, including how the data is used, the security of the system, its intended purpose, and its potential harm to individuals.[20]

Finally, the Medical Device Regulation 2017/745 (MDR), applicable from 2021, and the In Vitro Diagnostic Regulation 2017/746 (IVDR), applicable from 2022, offer a third cluster of examples. Due to the changes in requirements, existing medical devices must be updated to comply with the shifting standards. The interplay between regulators, device designers, manufacturers, physicians, and patients is a critical theme that needs to be addressed.[21] Several manufacturers reportedly struggled to meet the requirements during the transitional period, and the requirement that all devices be resubmitted for compliance made it more difficult for manufacturers to compete in Europe.[22] The MDR now also applies to some items that were previously not considered medical devices. For example, computer software used to aid in the diagnosis of a disease is now considered a medical device and must adhere to the new regulatory standards.[23]

Notably, these regulations often interact, resulting in increased regulatory complexity. For instance, health data obtained by a hospital subject to the GDPR may then be used in the development of a cloud-based, AI-enabled medical device subject to the MDR. This AI medical device (relying on the health data for its training) may, in turn, be used to support a clinical trial subject to the GDPR and the CTR, with subjects across jurisdictions requiring international cross-border transfers. In addition to the data security and privacy issues discussed above, there are a number of other legal challenges associated with data sharing in health science, including issues related to intellectual property and the costs and logistics of making such data available to others.

Notes
[14] The Health Insurance Portability and Accountability Act of 1996 (HIPAA) is a United States Act of Congress enacted by the 104th United States Congress and signed into law by President Bill Clinton on August 21, 1996. For a case study in the US, see, e.g., Celedonia et al. (2021), pp. 242–250.
[15] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC, OJ 2016 L 119, 1 (General Data Protection Regulation, GDPR).
[16] Japan Act on the Protection of Personal Information, Act No. 57 of 2003 (APPI).
[17] Corrales Compagnucci et al. (2019), pp. 144–145.
[18] Corrales Compagnucci (2020), pp. 159, 228, 229, 291.
[19] See European Health Data Space, available at: https://health.ec.europa.eu/ehealth-digital-health-and-care/european-health-data-space_en. Accessed 20 June 2023.
[20] See Proposal for a Regulation of the European Parliament and of the Council laying down harmonized rules on artificial intelligence (Artificial Intelligence Act) and amending certain union legislative acts, COM/2021/206 final.
[21] See, e.g., Cohen et al. (2022).
[22] See Lucido (2022).
[23] Dræbye Gantzhorn and Bjerregaard Bjerrum (2021).
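The AI Act's risk-based approach described above can be sketched as a simple classification routine. The following toy example is our own illustration: the four tiers mirror the proposal's structure (unacceptable through minimal), but the classification criteria encoded below are simplifying assumptions, not the legal tests.

```python
# Illustrative sketch of the AI Act's risk-based approach described above.
# The four tiers reflect the proposal's structure; the classification rules
# below are simplifying assumptions for illustration, not the legal tests.
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "prohibited practices (e.g., certain social scoring)"
    HIGH = "strict obligations (e.g., many medical AI systems)"
    LIMITED = "transparency obligations (e.g., chatbots)"
    MINIMAL = "no additional obligations"

def classify(intended_purpose: str, safety_component: bool) -> RiskTier:
    """Toy classifier: maps a system description to an AI Act risk tier."""
    if intended_purpose == "social_scoring":
        return RiskTier.UNACCEPTABLE
    if safety_component or intended_purpose in {"medical_diagnosis", "triage"}:
        return RiskTier.HIGH          # cf. the MDR interplay discussed above
    if intended_purpose == "chatbot":
        return RiskTier.LIMITED
    return RiskTier.MINIMAL

print(classify("medical_diagnosis", safety_component=True))  # RiskTier.HIGH
```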

3 Book Structure and Chapters

This volume contains the following chapters, addressing and illustrating the above themes and issues.

The chapter by Yann Joly, Edward Dove, Bartha Maria Knoppers and Dianne Nicol examines the Global Alliance for Genomics and Health (GA4GH). GA4GH is an international not-for-profit organization dedicated to the development of standards and policies that seek to expand the use of genomic data within a human rights framework, improving health care for everyone in an ethically and legally responsible manner. The GA4GH benefits from the participation of more than five hundred organizations in healthcare, patient advocacy, research and ethics, government, life science, and information technology. The chapter examines the key accomplishments of the Regulatory and Ethics Work Stream (REWS) of the GA4GH. The REWS is a founding Work Stream of the GA4GH, responsible for the landmark Framework for Responsible Sharing of Genomic and Health-Related Data (2014/19). On the organization's tenth anniversary, the chapter highlights what, from the perspective of the authors as present or former leaders of the REWS, they consider to be its major contributions to interdisciplinary, participative, and international policy development in genomics.

David A. Simon, Carmel Shachar and I. Glenn Cohen also proceed from the suggestion that data is an integral part of healthcare delivery and that the growth of digital technologies has produced enormous quantities of health data that contain individuals' personal, and often highly sensitive, information. A key question for policymakers is, therefore, how to regulate the collection, storage, sharing, and disclosure of this information. The authors evaluate two different types of regulatory enforcement mechanisms, namely, public rights of action (where the government sues) and private rights of action (where private persons sue). They use recent cases to illustrate the advantages and drawbacks of private rights of action in health data privacy cases and then use this analysis to contrast them with public rights of action. Their analysis suggests that public and private rights of action should be viewed as complementary regulatory tools rather than competing alternatives. As such, both public and private rights of action have an important role to play in regulating health data. To ensure private rights are effective regulatory tools, policymakers must pay particular attention to how those rights of action are designed and implemented.

Louise C. Druedahl and Sofia Kälvemark Sporrong examine how data sharing is central to the development and deployment of artificial intelligence in the healthcare systems of the future. However, they note how the perspectives of patients are seldom included in the larger debates of how, when, and what data to share. This chapter, therefore, provides an overview of research on patient perspectives on data sharing and associated aspects, including patients' motivations, concerns, and views on privacy and conditions for sharing. Moreover, these perspectives are put into the evolving context of informed consent and today's European context of the GDPR and the Data Governance Act. Overall, there seems to be a discrepancy between patients' perspectives on data sharing and the reality in which their data are to be shared. The reality of data use is, however, moving towards the continued re-use of data for secondary purposes, and questions remain regarding how patients perceive such sharing; seemingly, patient views are lost or, at least, minimized in the wider push for innovation and jurisdictional competitiveness. Ensuring that patients' voices are heard is essential for public acceptance of data sharing and, consequently, for greater inclusiveness and equity of the results and innovations originating from patients' shared data.

Helen Yu focuses on how advances in science and technology have created an expectation and demand for more research and innovation, particularly in the health and biomedical fields. There is an inherent promise associated with the potential of breakthrough technologies, particularly when combined with quality health-related data, namely that such innovations can deliver significantly improved health outcomes globally. However, science and innovation alone are not sufficient to achieve societal transformation toward global health. There is an observed reluctance to operationalize the use of existing data, mainly due to privacy and security concerns, as well as apprehension around how, for what purpose, and by whom data will be used. Research and innovation, therefore, need to be supported by behavior and attitude changes in order to foster inclusive participation and effective societal uptake of the resulting solutions. This chapter explores how the principles of Responsible Research and Innovation might be applied to provide a legally supported, inclusive, and sustainable approach to operationalizing the use of existing data in support of health-related innovations. By incorporating a deliberative and responsive process into citizen science practices, the root causes underlying this observed reluctance can be identified and addressed. The aim of the chapter is to gain a fundamental understanding of the real and perceived barriers to utilizing data for research and innovation purposes, which can then be used to proffer solutions to create a responsive and inclusive culture that supports the ongoing, sustainable and responsible use of data.

Jheel Gosain, Jason D. Keune and Michael S. Sinha start with the June 2022 decision of the U.S. Supreme Court in Dobbs v. Jackson Women's Health Organization, in which the Court overturned fifty years of precedent by eliminating the federal constitutional right to abortion care established by the Court's 1973 decision in Roe v. Wade. The Dobbs decision left decisions about abortion services in the hands of the states, which created enormous diversity of access to women's healthcare across the country. This, in turn, also revealed the profusion of privacy issues that emanate from our technology-driven world. This chapter, therefore, reviews these privacy issues, focusing in particular on healthcare data, financial data, website tracking and social media. The authors then offer potential future legislative and regulatory pathways that balance privacy with law enforcement goals in women's health and in any domain that shares this structural feature.

Tima Otu Anwana, Katarzyna Barud, Michael Cepic, Emily Johnson, Max Königseder, and Marie-Catherine Wagner explore how the secondary use of health data offers great potential for health research. Technological developments, for instance, the progress in the field of artificial intelligence, have improved the reusability of datasets. However, the authors argue that the GDPR and ethical guidelines routinely restrict the reuse of personal data when the data subject has not given informed or explicit consent. In retrospective studies, where researchers use personal data and sensitive data from previous medical examinations, the retrospective collection of the patient's consent can be challenging. This chapter, therefore, focuses on the potential legal and practical hurdles associated with obtaining consent from the data subject for a new processing purpose. In addition, it presents the ethical considerations associated with consent and retrospective data collection in health and medical research. The chapter discusses several Horizon 2020-funded research projects in the fields of health and medical research. These research projects are used as practical examples to demonstrate the issues faced with consent as a legal basis in this type of retrospective research.

Lea Köttering discusses how medical devices based on machine learning promise to have a significant impact on developments in healthcare. The chapter asks to what extent data protection law, de lege lata and de lege ferenda, enables the development of machine learning-based medical devices. A key aspect of this is the processing of health data, which does not originate with the developers but with the healthcare providers, as ML-based medical devices are trained with a large amount of health data. According to the current legal situation under the GDPR, secondary use of health data is possible in principle; however, the consent of the data subjects faces certain difficulties, and, as the chapter shows, the development of an ML-based medical device does not necessarily constitute scientific research within the meaning of the GDPR.
The Dynamic Context and Multiple Challenges of Data Sharing

9

benefits from the research and/or deployment of the ML-based medical device. In addition, there is a need for infrastructural measures such as the establishment or expansion of intermediary bodies, given the lack of incentives, personnel capacity, and expertise among healthcare providers to share health data with a broad range of interested parties. Furthermore, to ensure a reliable output from ML-based medical devices, standards for data preparation must be established. Finally, this contribution emphasizes the European Health Data Space proposal and briefly examines whether this is a step in the right direction. Marcelo Corrales Compagnucci, Mark Fenwick, Mateo Aboy and Timo Minssen examine the July 2020 decision of the Court of Justice of the European Union (CJEU) in Data Protection Commissioner v Facebook Ireland Limited, Maximillian Schrems (Schrems II), which invalidated the EU-US Privacy Shield adequacy decision but found that Standard Contracting Clauses (SCCs) are a valid mechanism to enable GDPR-compliant transfers of personal data from the EU to jurisdictions outside the EU/EEA, as long as various unspecified “supplementary measures” are in place to compensate for any gaps in data protection arising from the third country law or practices. The effect of this decision has been to place regulators, scholars, and data protection professionals under greater pressure to identify and explain these supplementary measures to facilitate cross-border transfers of personal data. This chapter critically examines the current framework for cross-border transfers after Schrems II, including the new SCCs adopted by the European Commission, as well as the current European Data Protection Board guidance on “supplementary measures.” The chapter suggests that the so-called “supplementary measures” are not supplementary and that the CJEU’s characterization undermines the original clarity of the GDPR with regard to the required standards for the security of processing as well as the available mechanisms for cross-border transfers of personal data. The chapter concludes that despite the legal uncertainty introduced by the CJEU several post-Schrems II developments have been helpful to raise awareness and improve the overall safeguards associated with cross-border transfers of personal data. These include the new SCCs and an increased understanding of the capabilities and limitations of the technical and organizational measures, including encryption, pseudonymization, and multi-party processing. Technical solutions such as multiparty homomorphic encryption that combine these three technical measures while still allowing for the possibility to query and analyze encrypted data without decrypting it has significant potential to provide effective security measures that facilitate cross-border transfers of personal data in high-risk settings. Finally, Shinto Teramoto, Shizuo Kaji and Shota Osada focus on the issue of how sharing extensive healthcare information is essential for the development of medicine and the formulation of effective public health policies. However, such data often contains sensitive or personal information or trade secrets. Consequently, safety measures are needed to strike a balance between the sharing of data and the protection of such information. 
Finally, Shinto Teramoto, Shizuo Kaji and Shota Osada focus on how the sharing of extensive healthcare information is essential for the development of medicine and the formulation of effective public health policies. However, such data often contains sensitive or personal information or trade secrets. Consequently, safety measures are needed to strike a balance between the sharing of data and the protection of such information. A firewall is one of the major safety technologies designed to prevent the delivery of protected information by severing harmful connections or limiting the formation of new connections between relevant parties in an information exchange network. The chapter examines some simple models that identify vulnerabilities in firewalls, but argues that such models often oversimplify real-world scenarios, neglecting factors like internal connections among nodes and the influence of other information held by nodes. Therefore, the chapter proposes several improved models and uses them to explore some of the reasons why firewalls fail. The chapter suggests that firewalls become less effective as the number of network nodes increases and that both high- and low-degree nodes pose non-negligible risks. The study also raises awareness about the role of internal monitors in preventing leaks: the effectiveness of information leakage control increases with the monitor’s proximity to the information source. This necessitates a greater focus on internal monitoring, perhaps using information and communication technology.
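To make the flavor of such network models concrete, the toy simulation below is our own sketch (not the authors’ models), assuming the networkx library. Information spreads from a source node over a random graph, an imperfect firewall severs most links between an internal and an external partition, and an internal monitor near the source is assumed to block any leak path running through it. Graph size, edge probability, and monitor placement are illustrative assumptions only.

    # Toy leak simulation (our sketch, not the authors' models): a firewall
    # severs most cross-boundary links, and an internal monitor within
    # `monitor_distance` hops of the source is assumed to intercept any
    # leak path that passes through it. All parameters are illustrative.
    import random
    import networkx as nx

    def leak_rate(n=60, p=0.08, monitor_distance=1, trials=500):
        leaks = 0
        for _ in range(trials):
            g = nx.erdos_renyi_graph(n, p)
            internal = set(range(n // 2))   # nodes behind the firewall
            source = 0                      # holder of the protected data
            # Imperfect firewall: all but one cross-boundary edge severed.
            cross = [e for e in g.edges
                     if (e[0] in internal) != (e[1] in internal)]
            random.shuffle(cross)
            g.remove_edges_from(cross[1:])
            # Place a monitor on a random internal node near the source and
            # remove it, modeling interception of anything passing through.
            near = [v for v in internal if v != source
                    and nx.has_path(g, source, v)
                    and nx.shortest_path_length(g, source, v) <= monitor_distance]
            if near:
                g.remove_node(random.choice(near))
            # A leak occurs if the data can still reach any external node.
            reachable = nx.node_connected_component(g, source)
            if any(v not in internal for v in reachable):
                leaks += 1
        return leaks / trials

    # A monitor close to the source tends to sit on more leak paths:
    print(leak_rate(monitor_distance=1), leak_rate(monitor_distance=3))

Under these assumptions, tightly placed monitors intercept a larger share of leak paths, which illustrates the chapter’s point that leakage control improves with the monitor’s proximity to the information source.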

4 Conclusion

The recurring theme in all the contributions to this volume is the need for comprehensive data-sharing solutions that align with both legal and ethical principles and embrace and integrate cutting-edge technologies in a socially responsible manner. This is not always easy to operationalize, however, and demands a multi- or transdisciplinary approach that cuts across different fields of law and technology. The aim of this volume, therefore, is to offer a number of case studies that integrate a theoretical and practical perspective, illustrating how best to navigate this emerging environment and achieve the ultimate goal, which is to facilitate the innovative and responsible use of data in the pursuit of socially valuable knowledge that can improve the provision of healthcare services.

Acknowledgements This research was supported by a Novo Nordisk Foundation grant for a scientifically independent Collaborative Research Program in Biomedical Innovation Law (grant agreement number NNF17SA0027784) and the Inter-CeBIL Program (grant agreement number NNF23SA0087056). The opinions expressed are the authors’ own and not of their respective affiliations. The authors declare no conflicts of interest.

References

Barnitzke B et al (2011) Legal restraints and security requirements on personal data and their technical implementation in clouds. In: Cunningham P, Cunningham M (eds) eChallenges e2011 conference proceedings. Florence
Barnitzke B, Corrales M, Forgó N (2012) Aspectos Legales de la Computación en la Nube: Seguridad de Datos y Derechos de Propiedad Sobre los Mismos, vol 2. Editorial Albremática, Buenos Aires
Bill and Melinda Gates Foundation, Information Sharing Approach, available at: https://www.gatesfoundation.org/about/policies-and-resources/information-sharing-approach. Accessed 20 June 2023
Bruzzone G, Debackere K (2021) As open as possible, as closed as needed: challenges of the EU strategy for data. Les Nouvelles J Licens Exec Soc LVI(1):41–49
Celedonia K et al (2021) Community-based health care providers as research subject recruitment gatekeepers: ethical and legal issues in a real-world case example. Res Ethics 17(2):242–250
Cohen IG et al (2022) The future of medical device regulation: innovation and protection. Cambridge University Press, Cambridge
Corrales Compagnucci M et al (2019) Homomorphic encryption: the ‘holy grail’ for big data analytics and legal compliance in the pharmaceutical and healthcare sector? Eur Pharm Law Rev 3(4):144–155
Corrales Compagnucci M (2020) Big data, databases and ‘ownership’ rights in the cloud. Springer, Singapore
Corrales Compagnucci M et al (2020) Lost on the high seas without a safe harbor or a shield? Navigating cross-border data transfers in the pharmaceutical sector after Schrems II invalidation of the EU-US privacy shield. Eur Pharm Law Rev 4(3):153–160
Corrales Compagnucci M, Aboy M, Minssen T (2021) Cross-border transfers of personal data after Schrems II: supplementary measures and new standard contractual clauses (SCCs). Nordic J Eur Law 4(2):37–47
Corrales Compagnucci M et al (2022) AI in e-health: human autonomy, data governance and privacy in healthcare. Cambridge University Press, Cambridge
Dahi A, Corrales Compagnucci M (2022) Device manufacturers as controllers: expanding the concept of ‘controllership’ in the GDPR. Comput Law Secur Rev 47:105762. https://doi.org/10.1016/j.clsr.2022.105762
Dræbye Gantzhorn M, Bjerregaard Bjerrum EK (2021) When is software regulated as medical devices? (22 June 2021), available at: https://www.bechbruun.com/en/news/2021/when-is-software-regulated-as-medical-devices. Accessed 20 June 2023
Jurcys P, Corrales Compagnucci M, Fenwick M (2022) The future of international data transfers: managing new legal risk with a ‘user-held’ data model. Comput Law Secur Rev 46:105691. https://doi.org/10.1016/j.clsr.2022.105691
Kiran M et al (2013) Managing security threats in clouds, digital research 2012. In: The 8th international conference for internet technology and secured transactions, London, UK
Kirkham T et al (2012) Assuring data privacy in cloud transformations. In: Proceedings of the 11th IEEE international conference on trust, security and privacy in computing and communications (IEEE TrustCom-12). IEEE. https://doi.org/10.1109/TrustCom.2012.97
Kousiouris G, Vafiadis G, Corrales M (2013) A cloud provider description schema for meeting legal requirements in cloud federation scenarios. In: Douligeris C, Polemi N, Karantjias A, Lamersdorf W (eds) Collaborative, trusted and privacy-aware e/m-services. Springer, Berlin
Lee JE et al (2016) User-friendly data-sharing practices for fostering collaboration within a research network: roles of a vanguard center for a community-based study. Int J Environ Res Public Health 13(1):34
Loder E, Groves T (2015) The BMJ requires data sharing on request for all trials. BMJ 350:h2373
Lucido S (2022) EU medical device regulation still presents challenges and opportunities, available at: https://www.assurx.com/eu-medical-device-regulation-still-presents-challenges-and-opportunities/. Accessed 20 June 2023
Minssen T et al (2020) The EU-US privacy shield regime for cross-border transfers of personal data under the GDPR: what are the legal challenges and how might these affect cloud-based technologies, big data, and AI in the medical sector? Eur Pharm Law Rev 4(1):34–50
Minssen T, Gerke S (2023) Ethical and legal challenges of digital medicine in pandemics. In: Reis A, Schmidhuber M, Frewer A (eds) Pandemics and ethics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-66872-6_12. Accessed 20 June 2023
Organisation for Economic Co-operation and Development (OECD), ‘Data governance: Enhancing access to and sharing of data’, available at: https://www.oecd.org/sti/ieconomy/enhanced-dataaccess.htm. Accessed 20 June 2023
Tenopir C et al (2011) Data sharing by scientists: practices and perceptions. PLoS ONE 6(6):e21101
UK Research and Innovation, Data sharing, available at: https://www.ukri.org/councils/mrc/guidance-for-applicants/policies-and-guidance-for-researchers/data-sharing/. Accessed 20 June 2023
World Health Organisation, ‘Policy on use and sharing of data collected in Member States by the World Health Organization (WHO) outside the context of public health emergencies (Provisional)’, available at: https://www.who.int/about/policies/publishing/data-policy. Accessed 20 June 2023
Yoong SL et al (2022) The benefits of data sharing and ensuring open sources of systematic review data. J Public Health 44(4):e582–e587

The GA4GH Regulatory and Ethics Work Stream (REWS) at 10: An Interdisciplinary, Participative Approach to International Policy Development in Genomics

Yann Joly, Edward Dove, Bartha Maria Knoppers, and Dianne Nicol

Abstract The Global Alliance for Genomics and Health (GA4GH) is an international not-for-profit organization dedicated to the development of standards and policies to expand the use of genomic data within a human rights framework, improving health for everyone. The GA4GH benefits from the participation of more than 500 leading organizations in healthcare, patient advocacy, research and ethics, government, life science, and information technology. This chapter charts the key accomplishments of the Regulatory and Ethics Work Stream (REWS) of the GA4GH. The REWS is a founding Work Stream of the GA4GH responsible for the landmark Framework for Responsible Sharing of Genomic and Health-Related Data (2014/19). On the organization’s tenth anniversary, the authors highlight what, in their unique perspective as present or former leaders of the REWS, they consider to be its major contributions to interdisciplinary, participative, and international policy developments in genomics. Considerations for future REWS objectives and outputs are presented in closing.

Keywords Data governance · Data sharing · GA4GH · Genomic policy · Genomic research

Y. Joly (B) · B. M. Knoppers
Center of Genomics and Policy, Department of Human Genetics, Faculty of Medicine and Health Sciences, McGill University, Montreal, Canada
E. Dove
Edinburgh Law School, University of Edinburgh, Edinburgh, UK
D. Nicol
Centre for Law and Genetics, Faculty of Law, University of Tasmania, Tasmania, Australia
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_2


1 Introduction

As the costs associated with human genomic sequencing continue to decline, genomic assays are increasingly used in both research and healthcare.…[W]e expect tens of millions of whole-exome or whole-genome sequences to be generated within the next decade, with a high proportion of that data coming from the healthcare setting and therefore associated with clinical information.1 […] If they can be shared, these data-sets hold great promise for research into the genetic basis of disease and will represent more diverse populations than have traditionally been accessible in research; however, data from individual healthcare systems are rarely accessible outside of institutional boundaries.2

Will the ongoing global “accessibility” challenge affecting health data ever be remedied? Perhaps one positive development and potential remedy is the EU Health Data Space (EHDS), first proposed in May 2022, which will create a common infrastructure to let health data flow more freely for access by patients, researchers, and policymakers in Europe. Already in 1990, such an open data-sharing ethos underpinned the Human Genome Project (HGP), leading to the completion and public release of the sequence map of the human genome in 2003. A decade later, it also inspired the creation of the Global Alliance for Genomics and Health (GA4GH) in 2013. The mission of the GA4GH is to create the policy frameworks and open tools for integrating genomic and health data. Today, GA4GH serves as an exemplar of open access IT tools and polycentric, multi-stakeholder governance. Founded on a unique human rights approach, its Framework for the Responsible Sharing of Genomic and Health-Related Data (the Framework),3 as we discuss further below, has been translated into policy and implementation tools for the international scientific community.

It bears remembering that the characteristics and unique history of the notion of a “genomic commons”4 can be traced to the data-sharing norms instilled by the 1996 Bermuda principles,5 requiring that all DNA sequence data generated by the HGP be released to the public 24 h after generation. Today, 35 years of international and regional normative, ethical, and legal frameworks for genomic data, exemplified by the GA4GH, are founded on this ethic of open-orientated but responsible data sharing. The goal is to serve the public good while responsibly circumscribing the conditions of data and sample collection, use, access, and oversight. Concomitantly, it should not be forgotten that since 2005, the creation of large population databases, as well as national biobanks interoperable across jurisdictions, has further developed the infrastructures for open science, privileging the use of cloud computing today.6

However, to move genomic data from the biomedical research and biobanking environments to the clinic, that is, to support its further translation into medical care, still remains the challenge of this century. Necessarily, a multidisciplinary, contextual, and more precise approach to patients’ genomic-health data integration also brings equitable health care and economic issues to the fore, to say nothing of health data standardization. Most importantly, do (or will) clinicians and patients adhere to this data-sharing ethos?7 Health care is, after all, the translation of the right for citizens to benefit from scientific progress for both prevention and treatment of disease.8 From a health systems perspective, developing the concept of research and clinical data sharing for the public good will thus depend on effective oversight and public trust. Health data literacy and public engagement are also key. A decade of altruistic participation of citizens providing their data for biobanking has contributed to more robust and better-quality databases for everyone. This success augurs well for patients doing the same within health care systems by fostering health data altruism for the creation of aggregate and anonymized clinical support databases as well as the secondary uses of their health data.9

The Driver Projects of the GA4GH, which are real-world genomic data initiatives, will serve as testing grounds for the GA4GH and its internally developed bioinformatics, ethics, and policy tools and standards, while contributing to their validation and refinement. Perhaps the most daunting challenge, however, for scientists, clinicians, regulators, and the public alike, is the sheer volume of data, contexts, and cultures to be integrated into understandable and useful interpretation for human health of concepts such as diversity, ancestry, and equity. The human genome expressed in “humanity” is not static but evolving.10 Moreover, the language of stratification of polygenic risk scores in whole genome screening, for example, is “public” not individual, making the genomic understanding of public health a new “political” field of endeavor. The COVID-19 pandemic of 2020–2023 reinforced the need for more international, systems-wide health data interoperability and streamlined, proportionate, and harmonized regulation. Recognizing individual equivalence in genetic diversity may well serve to move the debate away from classical, binary concepts. It eliminates the idea of the “individual versus public” in the concept of citizenry, to reflect social consciousness and desires to contribute to the overall public good. Hopefully, international genomic health data standardization may make for a more equitable future, since data could be more easily shared and understood, but the socio-ethical policies needed to govern this shift would need to be more dynamic and anticipatory.

In what follows, we proceed to chart the key work of the Regulatory and Ethics Work Stream (the REWS) of the GA4GH. In so doing, we highlight what, in our capacities as present or former leaders of the REWS, we consider to have been our committee’s contribution to the operationalization of an interdisciplinary, participative approach to international policy development in genomics.

1 Birney et al. (2017, pp. 1–20).
2 Smith (2021, pp. 183–184).
3 Knoppers (2014a, pp. 3–8).
4 Contreras and Knoppers (2018, pp. 429–453).
5 Contreras and Knoppers (2018, pp. 429–453).
6 Stein et al. (2015, pp. 149–151).
7 Watson (2022, p. 853).
8 British Medical Association (2019).
9 Shabani (2022, pp. 1357–1359).
10 See Universal Declaration on the Human Genome and Human Rights (1997).


2 Creation of the GA4GH’s Framework for Responsible Sharing of Genomic and Health-Related Data

Those involved directly and indirectly with data sharing in the life sciences know all too well that genomic research and clinical practice generate and rely on secondary use of data. Sharing this data often aligns with a moral imperative to reduce the burden of disease while, at the same time, promoting the development of innovations that prolong health and wellbeing.11 To that end, health improvements can be achieved more safely and efficiently by accessing and combining datasets from across disciplines, institutions, and locations. Genomic and clinical data sharing also requires safeguarding the interests of participants and patients who have provided their data to generate research discoveries and translational outcomes. One of the challenges in data sharing, however, and one of the main reasons for the establishment of the GA4GH a decade ago, is the lack of a common ethical and legal framework to bring patients, physicians, participants, regulators, funders, and researchers together in the goal of promoting responsible genomic and clinical data sharing.12 Thus, one of the first commitments of the GA4GH upon its inception in January 2013, and as set out in its foundational White Paper, Creating a Global Alliance to Enable Responsible Sharing of Genomic and Clinical Data (the White Paper), in June 2013, was to partner with research consortia and other stakeholders around the world to develop an International Code of Conduct for Genomic and Clinical Data Sharing (the Code of Conduct).13 This idea was inspired by an international code of conduct for data sharing in genomic research proposed by colleagues in 2011 that could provide common guidance on the basis of two fundamental values: (i) mutual respect and trust among scientists, stakeholders, and research participants; and (ii) a commitment to safeguarding public trust, participation, and investment.14 Unlike the 2011 data sharing code of conduct, however, the GA4GH initiative sought to situate a data sharing code explicitly within a human rights framework.15 We shall proceed to explain the basis and genesis of this idea.

In 1948, the General Assembly of the United Nations adopted the Universal Declaration of Human Rights (UDHR) to guarantee the rights of every individual in the world. Included were twin rights “to share in scientific advancement and its benefits” and “to the protection of the moral and material interests resulting from any scientific… production of which [a person] is the author” (Article 27). For those of us in the GA4GH, a two-pronged question arose: could this right “to benefit from” and “to be recognized for” have direct application to collaborative genomic and clinical data sharing internationally, and could it be activated through an international code of conduct? We were firmly of the view that the answer to both was “yes”: a code of conduct based on universally recognized and actionable human rights would constitute a novel and actionable approach to promoting the responsible sharing of data, by way of using already-adopted universal human rights that resonate around the world. The particular human right that the putative Code of Conduct would be based on—the human right to science—was largely dormant until then, and yet would form a sure and shared foundation for GA4GH membership and commitment to the organization’s mission. As some of us wrote in the foundational paper setting out this vision, three reasons stood out to situate the proposed Code of Conduct within a human rights framework:

First, because human rights have both political and legal dimensions, they reach beyond the moral appeals of bioethics and can provide a more robust governance framework for the regulation of genomics research. Because they carry international legal force, they can better promote and delineate the contours of responsible access, sharing, and attribution of both research and clinical data. Indeed, if health care becomes a primary location for collecting the phenotypic and genetic data needed to create learning systems for research and clinical care, we need to reinforce the self-regulatory codes of ethics of genomic researchers and clinicians with legally recognized human rights, that is, a co-regulatory system. Second, human rights belong to groups as well as individuals (spurring a reciprocity between the individual and public level) and reach beyond classic negative duties (i.e. forbidding State actors from interfering with the rights of individuals, such as the freedom of expression or the right to privacy) to positive, more progressive duties, urging action by governments (and ideally, industry, funders, and researchers) to share the data, technologies and knowledge that are the fruits of our science to achieve a goal desired by all, such as health. Another advantage is that human rights can foster responsible translational genomic research by offering stronger protection in three critical areas: privacy; anti-discrimination and fair access; and procedural fairness.16

11 Rehm et al. (2021, pp. 1–33) and Knoppers et al. (2014, pp. 1–9).
12 Rehm et al. (2021, pp. 1–33).
13 GA4GH (2013, pp. 1–14). The White Paper committed the organization, among other things, to “Promote harmonization of regulatory frameworks, lower barriers to data sharing by developing policies for informed consents, align guidelines across jurisdictions, while respecting privacy and engaging individuals, families, and communities” (p. 27). See also the mission of the GA4GH, which is “to accelerate progress in genomic research and human health by cultivating a common framework of standards and harmonized approaches for effective and responsible genomic and health-related data sharing”. See GA4GH, “About Us”, available at: https://www.ga4gh.org/about-us/ Accessed 20 March 2023.
14 Knoppers et al. (2011, pp. 1–4).
15 See Knoppers et al. (2014, p. 895).

Development of the Framework began in late 2013. At that stage, several members at the Centre of Genomics and Policy (CGP) at McGill University held detailed discussions directed by B.M. Knoppers about how we might develop the regulatory and ethical underpinnings of the GA4GH, with a particular view to situate the advancement of responsible data sharing within a human rights framework. There was little precedent for doing so, yet this was for us even more of a reason to explore it further, given the universalizing force of human rights and their potential to have actionability. In these discussions, we considered the value in an unexplored human right situated within the UDHR as well as the United Nations’ 1966 International Covenant on Economic, Social and Cultural Rights (ICESCR), specifically the human right to science.

16 See Knoppers et al. (2014, p. 897).

This set off a period of intensive in-house legal research in which we sought to better understand the development and use of science through a review of various sources. We quickly discovered there was very little in existence, whether jurisprudence or commentary, despite the right to science’s existence in international human rights texts for over half a century. We also sought to look at precedents regarding international, national, or regional codes of conduct in the field of science, particularly in the life sciences, including ethical and legal codes and policies guiding data sharing behavior (which can be seen in Appendix 1 of the final version of the Framework). Again, we found relatively few documents in existence or on point. This meant that, for the most part, we needed to rely on our own pool of expertise within the GA4GH to chart an innovative path for the Framework.

From October 2013 through September 2014, the GA4GH’s Regulatory and Ethics Working Group (the REWG), as it was then called,17 formed a team of experts to iteratively draft the Framework. This involved a nearly year-long period of preparation, consultation, and revision, which took place among a core group led by the 15 members of the REWG18 and E. Dove (who served as the REWG Coordinator from 2013 to 2015), as well as steady involvement from multiple international consortia, such as H3Africa, the Human Variome Project, the International Rare Disease Research Consortium (IRDiRC), and the International Society for Biological and Environmental Repositories (ISBER).

Drafting work began in earnest in January 2014, with the first draft comprising only an introductory section that made the case for the Code of Conduct to be situated within the human right to science. We argued that through a human rights framework, this Code of Conduct would: (1) interpret the right to enjoy the benefits of scientific progress and its applications as being the right to access and share genomic and clinical data across the translation continuum, from basic research through practical, material application (e.g., diagnostics and therapeutics); and (2) apply the right to benefit from the protection of the moral and material interests resulting from scientific production to genomic research by developing actionable moral rights (i.e., a right of attribution and a right to integrity of the production) for data generators. Further, this Code of Conduct would establish a set of principles and procedures for responsible research conduct, founded on and guided by complementary human rights principles such as privacy, anti-discrimination, and procedural fairness.

17 The REWG was renamed the Regulatory and Ethics Work Stream in 2017, see Appendix 1.
18 Bartha Knoppers, Centre of Genomics Policy, Montreal, Canada (Chair); Partha Majumder, National Institute of Biomedical Genomics, India (Co-Chair); Martin Bobrow, University of Cambridge, United Kingdom; Paul Burton, University of Bristol, United Kingdom; Don Chalmers, University of Tasmania, Australia; Thomas Hudson, AbbVie, United States; Terry Kaan, University of Hong Kong, Hong Kong; Kazuto Kato, Osaka University, Graduate School of Medicine, Japan; Michael Parker, University of Oxford, United Kingdom; Jennifer Stoddart, former Privacy Commissioner of Canada, Montreal, Canada; Sharon Terry, Genetic Alliance, Washington D.C., United States; David Townend, Maastricht University, Netherlands; Jantina de Vries, University of Cape Town, South Africa; John Wilbanks, Sage Bionetworks, United States; Eva Winkler, University of Heidelberg, Germany.

The Code, then, would constitute a set of fundamental international principles and implementation practices to guide the actions and management of genomic and clinical data sharing, as well as to achieve translational genome science. It would be cognizant of the need to incentivize data providers to share their data, to afford researchers proportionate regulatory treatment of research projects, and to address possible patient-participant concerns about the impact of data sharing on, inter alia, privacy, anti-discrimination, and fair access. Broadly speaking, we envisioned the Code of Conduct as a rhetorical and yet aspirational instrument for GA4GH members, stakeholders in the life sciences, and the broader public. It would seek to enhance the sense of community among GA4GH members, of belonging to a group with common values and a common mission. As we saw it, the benefit of this Code of Conduct was to provide a widely accepted and predictable framework for responsible genomic and clinical data sharing. It would harmonize divergent interpretations of ethical guidelines and various laws pertaining to genomic and clinical data and provide a “safe harbor” for researchers, clinicians, and organizations that seek to share genomic and clinical data in an ethically and legally compliant manner. The primary goals of the Code of Conduct thus would be to: (1) protect the welfare of groups and individuals with whom genomic researchers and clinicians work or who are involved in research efforts or clinical practice for data sharing; (2) define and guide accepted and acceptable behaviors; (3) promote high standards of practice; (4) provide a benchmark for subscribing members to use for self-evaluation; and (5) establish a framework for professional behavior and responsibilities, all in a spirit of greater international cooperation, collaboration, openness, and transparency. Those who accepted its principles would be expected to interpret them in good faith, to respect them, to make sure appropriate sanctions are in place, and to make them widely known. At the same time, we recognized that each person adopting the Code of Conduct would need to supplement it in ways based on local values, culture, and experience. All along, we sought to ensure that the document was neither exhaustive nor overly rigid; it was purposely written in relatively broad language. The fact that a particular conduct was not addressed specifically by the Code of Conduct would not mean the conduct was necessarily either ethical or unethical, legal or illegal. It was designed to be a dynamic instrument that could grow and change in response to future developments in the practice and science of genomic and clinical data sharing. It also bears emphasizing that we knew the Code of Conduct could only apply to uses of data that were consented to by donors (or their legal representatives) and/or approved for use by competent bodies or institutions in compliance with national and international laws and that respect restrictions on downstream uses. To operate on an alternative premise was, to put it mildly, a recipe for unending controversy.

By February 2014, a first comprehensive draft was in place, comprising preambulatory text and distinct sections on purpose, scope, future amendments to the Code of Conduct, and “Principles and Implementation Practices”. The latter section, which was by far the most comprehensive and significant, comprised principles that would serve as a guide in determining courses of action in various contexts, and implementation practices to apply the Code of Conduct in a practical, proportionate manner to genomic and clinical data sharing activities. Two drafts of the Code of Conduct emerged over the next two months in 2014, including a 3rd draft for discussion at the first GA4GH Plenary meeting at the Wellcome Trust in London in March 2014. The main substantial progress occurred, however, at a workshop that took place one month later in Paris, on 2nd April of that year. At a gathering of approximately 20 leading experts in genomic and health-related data sharing,19 we dedicated a full day to working through, line by line, the text of the Code of Conduct, which by then was on the 4th draft. Ultimately, we agreed that crafting and setting apart “Foundational Principles” would be a better approach than combining principles with implementation practices. We also agreed that a separate section on “Guidelines” following the Foundational Principles would be helpful in enabling adherents to the Code to practically apply the Foundational Principles in a proportionate manner. The following day, D. Chalmers (University of Tasmania) and E. Dove worked for many hours in the Paris hotel lobby to further refine the Code of Conduct, which was later circulated to the workshop participants and REWG members for comment—an impromptu form of working that we suspect many drafters of policies would recognize.20 This led to a fourth Foundational Principle and ultimately the final formulation, being: (1) Respect Individuals, Families and Communities; (2) Advance Research and Scientific Knowledge; (3) Promote Health, Wellbeing and the Fair Distribution of Benefits; and (4) Foster Trust, Integrity and Reciprocity. It also led to more fleshing out of the Guidelines, of which nine were identified (closely tracking the final version), with elements under each guideline. The 5th draft, developed towards the end of April, recognized the need for implementation mechanisms to consider how the Code of Conduct should be adopted by organizations and regulatory bodies involved in data sharing. The 6th draft, completed one week later, transformed the “Guidelines” into “Guidelines and Core Elements” to better distinguish the specific nine headings (guidelines) from the core elements intended to apply the Foundational Principles to individuals and organizations involved in the sharing of genomic and health-related data. The feedback also suggested creating clearer sections in the Framework on “Purpose and Interpretation” and “Application”. At this stage, the draft began to be externally circulated to organizations for members to share thoughts with us on the draft over a two-month period until June 2014, with the understanding that the Code was still a work in progress and thus intended “for information purposes only” until a final draft was produced and disseminated.

19 Among the attendees, special thanks go to Jennifer Harris, Georges Dagher and Kazuto Kato.
20 See also Knoppers (2014c, pp. 1–3).

Many helpful comments were received over those two months, including a suggestion that titling the document as a “framework” rather than a “code of conduct” might have better international traction; thus, from June 2014 onwards, we renamed the document as the Framework for Responsible Sharing of Genomic and Health-Related Data, which ultimately became the final title of the document (the Framework). The feedback also suggested transforming the “Guidelines and Core Elements” into only “Core Elements” to better clarify that it is good practice for those involved in genomic and health-related data sharing to have core elements of responsible data sharing in place (as “guidelines” might confuse stakeholders as to which and how many to devise). Those core elements within the Framework aid in the interpretation of the Foundational Principles for individuals and organizations involved in the sharing of genomic and health-related data. The feedback was collated into a 7th draft over July, and this was discussed and approved by the GA4GH REWG and the Transitional Steering Committee in late July 2014; it was then uploaded on the GA4GH website from 1 August 2014 for public comment over the course of August and into early September. As public comments from around the world were collected and integrated into the 8th draft, the Framework was discussed with case study examples (pediatric research; rare diseases; ageing research; and research in low- and middle-income countries) at the International Biobanking Summit-III (IBS) in Helsinki on 24 September 2014 for what we hoped would be real-world validation. Finally, after final revision in a 9th draft in late September and early October 2014, the REWG announced the launch of the Framework in final form (the 10th draft) at the GA4GH second plenary meeting on 18 October 2014 in San Diego, capping the nearly year-long drafting effort. The Framework itself was then approved by the GA4GH Steering Committee and published on the website in late 2014, where it remains in effect to this day.

Approaching a decade following the Framework’s creation and implementation, we can reflect on its wider impact. First, as one of us has noted, the Framework reflects the hard work and dedication of individuals around the world to bring this foundational policy document to life, without a doubt reflecting a “labor of love”:

The most essential ingredient for what was (and is, with all policymaking) inevitably a long and arduous process of discussion, drafting and re-drafting is the involvement of people working closely together. Even more important than the expertise and experience of individuals is their goodwill and capacity for consensus. In short, egos have no place in this environment of mutual trust and respect. Openness to criticism, comments and discussion of everyone’s favorite “wish list” of definitions, clauses, and so on is the ultimate test of policymaking mettle. Moreover, validation of the content of any policy developed through broad exposure and input strengthens eventual buy-in and use by the scientific community.21

Second, the Framework has been translated into multiple languages—currently 14, comprising many of the world’s most spoken languages—enabling it to be widely understood and embedded in local organizations that work with genomic and health-related data. This helps fulfil the GA4GH’s mission to be as global in reach as possible. Third, the Framework has been recognized and endorsed by multiple other organizations. We did not expect it to be immediately impactful and change cultural practices, as policy impact can take time:

Policymaking also requires patience. One should not expect any immediate tangible impact. Real-world application and citation of what are often self-regulatory approaches occur over the long term. A self-regulatory approach takes time to enter into the scientific culture since it lacks the imprimatur and legitimacy of policies emanating from recognized international bodies such as UNESCO or WHO. But the impact is eventually felt.22

21 See also Knoppers (2014c, pp. 1–3).

And indeed, as early as 2014, a US National Cancer Institute (NCI) Cancer Genomics Cloud Pilot grant awarded to the Broad-University of California Cloud Pilot (BUCCP) explicitly recognized that a system enabling large-scale analysis of The Cancer Genome Atlas (TCGA) data and other datasets in the cloud ought to be built in accordance with the Framework’s foundational principles and core elements.23 As another example, the Framework has been cited by organizations such as UK Health Data Research Alliance (HDR UK), which requires that every organization involved in the HDR UK’s Digital Innovation Hub Program “[a]dhere to the Foundation Principles and Core Elements for Responsible Data Sharing set out in the Global Alliance for Genomics and Health Framework for Responsible Sharing of Genomic and Health-Related Data.”24 Finally, we also note that the Framework itself set in motion the development of multiple GA4GH policies in the years following its launch, which seek to elaborate the general principles and guidance offered in the Framework and provide specific guidance on particular issues. These policies, detailed in the next section of this chapter, help both individuals and organizations make improvements or adopt specific best practices for responsible data sharing and governance processes. Both the Framework and several of these underlying policies have been cited in policy documents from leading global organizations, including, for example, in a white paper published in July 2020 by the World Economic Forum, entitled “Genomic Data Policy Framework and Ethical Tensions”.25 We are proud that the Framework was reaffirmed at the GA4GH Plenary meeting in September 2019 without any amendments. We continue to view it as a living document and an aspirational instrument, not only for setting out the work of the GA4GH itself across all its Work Streams, but also for the millions of patients, research participants, scientists, clinicians, and policymakers endeavoring to share global genomic and clinical data in a responsible manner and within a framework that views genomic and clinical databases as global public goods that must be respected, protected, and promoted. To that end, we do not wish or envision the Framework to remain unamended for the next five years; indeed, after a decade in place, we suspect some changes are necessary to account for advances in basic research and technology, and ethical and legal developments, to ensure that it remains fit for purpose for years to come.

22 See also Knoppers (2014c, pp. 1–3).
23 See Broad Institute (2014).
24 UK Health Data Research Alliance (2020, pp. 1–7).
25 World Economic Forum (2020, pp. 1–29).

3 GA4GH Policies, Standards, and Tools

From the outset, the GA4GH’s main purpose was to develop specific policies, tools, and products to assist the international genomics research community in sharing its information-rich data. The GA4GH’s strong focus on developing tools and standards to promote legal and technical interoperability for data sharing clearly comes across in the foundational White Paper. This included no fewer than 27 mentions of the word “tools” and 34 of the word “standards” in an otherwise succinct document.26 This orientation was further emphasized in the public policy context by B.M. Knoppers in her 2014 commentary on “International ethics harmonization and the Global Alliance for Genomics and Health”.27 In this commentary, she states that the GA4GH has three goals and relates the first two of those to the creation of standards, guidelines, and procedures to facilitate data sharing.28 Thus, tool building for data sharing, in a broad sense, was clearly the initial “raison d’être” of the GA4GH and has remained so through its first decade of existence.

The GA4GH’s more technically-oriented Work Streams focused on promoting extensible technology platforms by developing open standards and formats to support the storage, curation, and sharing of genomic and health data in accordance with the Findability, Accessibility, Interoperability, and Reuse (FAIR) Guiding Principles for scientific data management and stewardship.29 The inaugural REWS, termed the Regulatory, Ethics and Law Division in the White Paper,30 was intended to support these activities by providing policies and tools for researchers and regulatory authorities to harmonize and streamline privacy and ethics requirements, with the intent of facilitating responsible data sharing. As previously explained, the public policy and human rights foundation of those tools was set forth by the foundational Framework.31

The composition of the initial regulatory and ethics toolkit was determined through a discursive process that involved the REWS community, the original Regulatory, Ethics and Law Division Co-Chairs B.M. Knoppers and K. Kato, and the GA4GH Steering Committee. In 2016, following some strategic planning (which included renaming the Regulatory, Ethics and Law Division as the REWG), Work Streams identified their planned projects and tools in 5-year roadmaps that were made openly available on the GA4GH website.

26 GA4GH (2013, pp. 1–34).
27 Knoppers (2014b, pp. 1–3).
28 Knoppers (2014b, pp. 1–3). The third one was to engage the community on the importance of data sharing in genomics.
29 Wilkinson et al. (2016, pp. 1–9).
30 The Regulatory, Ethics and Law Division was the original name of the REWS. See Appendix 1 for more information on the different names and leaders of the REWS across the years.
31 The GA4GH REWS tools presented in this section of the chapter do not represent all of the tools developed by this work stream. Readers interested in viewing the complete range of REWS tools can visit the Regulatory and Ethics Toolkit, available at: www.ga4gh.org, where all REWS tools are featured and openly accessible.

As has always been the case, proposed policies and tools were developed through the voluntary collaboration of interested REWG experts from a variety of fields, countries, and professional affiliations. It is no surprise that the first two policies developed by the REWG were intended to address two central aspects of data governance: consent and data privacy/security.32 These policies recommended best practices to ensure a more consistent interpretation of requirements applicable in these major health policy domains. The Data Privacy and Security Policy also provided an extensive lexicon of specialized privacy/security terms and expressions. It bodes well that both these policies stood the test of time and are still in use following minor reviews and updates in 2019.

Once these two foundational policies were in place, the REWG members focused their attention on ethical challenges specific to the genomics and health context and on developing policies that would directly address the needs of the GA4GH’s Technical Work Streams. For example, the Accountability Policy (2016) (currently in revision) was developed to provide guidance to stakeholders responsible for the oversight of genomic and health data sharing initiatives.33 This policy outlined best practices for monitoring and responding to events of non-compliance with data sharing standards. The Machine Readable Consent Guidance (2020) is a good example of a policy developed collaboratively with a GA4GH Technical Work Stream, namely the Data Use & Researcher Identities (DURI) Work Stream.34 DURI had developed the Data Use Ontology (DUO), an international standard consisting of a structured, controlled vocabulary of data use terms that describe the scope of permitted research purposes for using a scientific resource.35 The Machine Readable Consent Guidance explains how to create a consent form that maps directly and unambiguously to the GA4GH DUO, in a way that renders the consent machine-readable; a simple illustration of what such a mapping enables is sketched after the notes below.

In parallel to these new tools, the GDPR and International Health Data Sharing Forum was created by the REWS in 2018 to examine the European Union General Data Protection Regulation (GDPR), a substantial data protection law, and assess its implications for international health data sharing. The Forum publishes monthly “GDPR Briefs” to address policy questions about the GDPR’s impact on various aspects of international health research and genomic and health-related data sharing, and to further explore the various issues raised in the GDPR literature.36 The Forum has been highly productive, publishing 40 briefs over the past four years, and is now considering enlarging its scope to address issues created by data privacy laws and policies beyond the GDPR.

In 2017, at the GA4GH Strategic Planning meeting in Hinxton, the GA4GH community identified a need for a more sophisticated product certification process.37

32 GA4GH, Consent Policy (2019); see also GA4GH, Data Privacy and Security Policy (2019).
33 GA4GH, Accountability Policy (2016).
34 GA4GH, Machine Readable Consent Guidance (2020).
35 Lawson et al. (2021, pp. 1–12).
36 See GA4GH GDPR Forum.
37 See GA4GH Product Approval Process (2017).
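As a concrete, hypothetical illustration of machine-readable consent, the sketch below encodes a consent form’s permissions as DUO terms and checks a data access request against them. The two term IDs are drawn from the published DUO (DUO:0000042, general research use; DUO:0000007, disease specific research) but should be verified against the current ontology; the record layout, identifiers, and matching rule are our own simplification, not part of the GA4GH guidance.

    # Hypothetical sketch of machine-readable consent mapped to DUO terms.
    # DUO:0000042 ("general research use") and DUO:0000007 ("disease
    # specific research") are published DUO terms; the record layout and
    # matching rule below are our own simplification, not GA4GH guidance.
    GRU = "DUO:0000042"   # general research use
    DS = "DUO:0000007"    # disease specific research

    # Consent recorded for a dataset: disease-specific research only,
    # restricted to cardiovascular disease (a MONDO ontology term).
    dataset_consent = {
        "dataset_id": "EXAMPLE-0001",   # hypothetical identifier
        "permitted_use": DS,
        "disease": "MONDO:0004995",     # cardiovascular disease
    }

    def request_allowed(consent: dict, use: str, disease: str = "") -> bool:
        """Naive check: GRU permits any research use; disease-specific
        consent requires a matching disease term; anything else refused."""
        if consent["permitted_use"] == GRU:
            return True
        if consent["permitted_use"] == DS and use == DS:
            return disease == consent["disease"]
        return False

    print(request_allowed(dataset_consent, DS, "MONDO:0004995"))  # True
    print(request_allowed(dataset_consent, DS, "MONDO:0005015"))  # False

Because both the consent and the request are expressed in controlled vocabulary rather than free text, a Data Access Committee (or an automated service) can evaluate compatibility consistently and at scale, which is the point of the Guidance.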


An approval process involving the newly coined “Foundational Work Streams”, consisting of the Data Security Work Stream and the REWS, along with the Steering Committee, was suggested. The proposed approach specified a five-stage process for product development: (1) proposed, (2) submitted for approval, (3) under review, (4) approved, and (5) retired. Under this certification process, interested developers had to complete a “Product Proposal Form” outlining the product scope, expected impact, and affected stakeholders. The form was then subjected to review before receiving the go-ahead from the GA4GH Steering Committee. An internal consultation (at the level of the concerned Work Stream) and a public consultation (open to all) were required to seek all necessary feedback and to ensure the quality and ethical compliance of the product before it could be submitted for approval. In addition to this refined process, the GA4GH also adopted policies on copyright (2019/rev 2020) and intellectual property (IP) (in development) to secure the necessary legal protection for GA4GH outputs and products to remain openly accessible to members of the research community.38

Following the GA4GH’s overhaul of its product development process, the REWS proposed a new product roadmap for 2020–2021. Featured tools included a Consent Toolkit consisting of model consent provisions that could be used in different consent contexts, such as for large-scale initiatives, projects involving several members of a family, clinical whole genome sequencing, and rare disease research.39 A participant-patient engagement framework was also suggested and ultimately would be completed by July 2021.40 The REWS also established a Data Access Committee Review Standards (DACReS) working group. The purpose of this group was to draft procedural standards and guidance to improve consistency in Data Access Committee reviews, as well as their quality and effectiveness in ensuring adequate research data protections. The completed DACReS policy was made available on the REWS tool page in late 2021. New REWS working groups were also created in 2021 to address several topics, including achieving greater diversity in genomics datasets, preventing genetic discrimination,41 sharing genetic information between hospitals for patient care, disseminating research findings on public attitudes in genetics, and tracking the ethical provenance of genomics datasets.42

In 2021, GA4GH hired S. Fairley as the organization’s first Chief Standards Officer (CSO).43 The focus of Fairley’s work was to develop a GA4GH Core Technical Team to support the work of external contributors and ensure consistent, ongoing standards development. Her arrival further strengthened the standards development process, including REWS output and product development.

38 GA4GH Copyright Policy (2019/rev 2020); GA4GH, IP Policy.
39 GA4GH, Consent Clauses for Large Scale Initiatives; GA4GH, Familial Consent Clauses; GA4GH, Consent Clauses for Genomic Research; GA4GH, Pediatric Consent Clauses; GA4GH, Model Consent Clauses for Rare Disease Research.
40 GA4GH, Framework for Involving and Engaging Participants, Patients and Publics in Genomics Research and Health Implementation (2021).
41 GA4GH, Genetic Discrimination: Implications for Data Sharing Projects (GeDI) (2022).
42 REWS Strategic Roadmap 2022/2023, available at: www.ga4gh.org (Accessed 20 March 2023).
43 GA4GH Names Susan Fairley, Ph.D., as Chief Standards Officer (1 March 2021).

In mid-2022, a draft Product Development and Approval Process was circulated to members of the GA4GH community for comments.44 This process emphasized the need to avoid duplicating efforts and instead better align new GA4GH outputs and products with the work of other standards-setting organizations and, internally, across the GA4GH’s different standards. It also provided details for the product development steps. This last reform was intended to consolidate the position of the GA4GH as a bona fide international standard-setting organization in human genomics and data sharing. In line with this priority, the GA4GH helped establish a consortium of standard development organizations, the Cross Standards Development Organization (xSDO), which included representatives from the GA4GH, Health Level Seven (HL7), and the International Organization for Standardization (ISO).45 An initiative called the Global Policy Forum was also recently initiated to promote networking, collaboration, and standardization of policy development between the various ethics and policy groups of large-scale international research and governance organizations in genomics.

While the REWS’s contribution to GA4GH tools and standards development has been exemplary so far, challenging questions will need to be addressed in the coming years for the REWS standard development process to remain rigorous yet streamlined and fully aligned with that of the other GA4GH Work Streams. Three observations illustrate these questions. First, because many governance and ethics policies do not have a technical component, the REWS produces policies, tools, and standards at a faster pace than the GA4GH’s Technical Work Streams. While this should be viewed positively, a potential consequence is that a rather large number of policies, tools, and products in existence will need to be revised, updated, and classified to remain valid. In the future, such time-consuming tasks could be delegated to a core ethics team rather than be included as part of the voluntary work expected from the larger REWS community. Second, the number of implementation tools resulting from interdisciplinary work undertaken between the REWS and one or more Technical Work Streams remains small compared to the REWS’s standalone outputs and products. This is a sign that the REWS’s work could be better aligned with that of the seven other Work Streams of the GA4GH. The formidable challenge of achieving true interdisciplinarity between such distinct disciplines as social sciences, humanities, genomics, and information technology is often underestimated by funding agencies and policymakers. However, the GA4GH has a strong administrative core and a culture of collaboration that make it possible to envision it being one of the first organizations to successfully achieve truly interdisciplinary product and tool development in genomics. Third, unlike the more technical GA4GH Work Streams, the REWS has so far deferred the task of identifying quantitative or qualitative metrics that would allow the GA4GH to assess the frequency of use and performance of its tools and policies.

44 Draft Product Development and Approval Process.
45 Page et al. (2023).

While an argument can certainly be made that universal ethics and governance standards and guidelines are different from IT or genomics standards, this different nature does not a priori seem to prevent the implementation of a robust performance measurement process that would apply specifically to REWS outputs and products. In fact, if achievable, such a process could enable the REWS to gather data to demonstrate the importance of ethics and governance products for the wider genomic and data sharing community, as well as for funding agencies. While waiting for more rigorous indicators of performance, the membership growth in the REWS, multiple citations of its outputs and products, and frequent positive feedback from the genomic community can be considered strong indicators that such tools are important and useful. Yet, it will be interesting to see how the REWS approaches ethics and governance policy development as the GA4GH evolves toward a more structured standard-setting entity in the coming years.

4 Conclusion: GA4GH 10th Anniversary—A Shift Towards Maturity?

As noted earlier in this chapter, the mission of the GA4GH is to create policy frameworks and open tools for integrating genomic and health data. We have come a long way in the past ten years to overcome the challenge raised by the lack of a common international ethical and legal framework to bring patients, physicians, participants, regulators, funders, and researchers together. The driving force behind the REWS’ success unquestionably has been the Framework. As explained above, the REWS has been a most productive Work Stream. With guidance from the Framework, the REWS toolkit includes a body of work addressing the major ethical, legal, and social implications of genomics that have been canvassed in the literature for the past 25 years or so. In particular, we have shown how particular attention has been paid to the central, yet delicate, issue of consent, with a consent policy, model consent clauses on the general topics of genomics research and clinical genomics, and more specifically on large-scale initiatives, familial research, pediatrics, and rare diseases research, and finally, machine readable consent guidelines. Privacy has also received extensive consideration, with a Data Privacy and Security Policy and, as noted above, a bespoke forum for discussing the GDPR and international health data sharing, given its broad implications for European researchers and indeed more globally. More recently, the REWS has addressed important questions of equity (including preventing genetic discrimination), diversity and inclusion in genomics, reflected in the work of the Genetic Discrimination Observatory (see also https://gdo.global/). Other guidelines and policies cover such broad topics as: Involving and Engaging Participants, Patients and Publics in Genomics Research and Health Implementation; Clinically Actionable Genomic Research Results; Copyright; Pediatric Pharmacogenetics; Ethics Review Recognition; and Tracking Ethical Provenance. In sum, then, much has been achieved in the REWS context over the past 10 years. Nevertheless, there remains considerable work to be done. As genomic technology advances, new regulatory and ethical issues naturally come to the fore.

28

Y. Joly et al.

This is reflected in the 2022–2023 REWS Roadmap. The motivation and mandate remain the same: to activate the right to science and its applications for the benefit of everyone. The REWS continues providing assistance to researchers and institutions in meeting this mandate by addressing genomic data sharing issues through appropriate data governance policies that can apply internationally. The 2022–2023 Roadmap recognizes that more work is needed on genetic discrimination, data access committee standards, and data protection law and international health data sharing. In addition, new work is commencing on diversity in datasets, ethical provenance, clinical data sharing, and consent and newborn screening. The REWS is also continuing to support dissemination of the outcomes of Your DNA, Your Say (YDYS), a questionnaire survey that collected responses between 2017 and 2019. The YDYS dataset includes responses from 37,000 individuals across 22 countries. The study examined a broad range of issues associated with genomic data sharing. A REWS subgroup has been established to create Public Attitudes for Genomics Policy Briefs based on the findings from YDYS, which continue to be published in the academic literature. After undertaking a review process, the GA4GH as a whole, recently released an updated version of its strategic plan.46 Some of the major challenges that will continue into the next decade for the GA4GH community include integration and alignment with the Technical Work Streams, and support for further implementation of GA4GH standards. Perhaps the biggest challenge that lies ahead, however, is engagement with the clinical community. Ultimately, the goal of international genomic data sharing is to ensure that genomics becomes part of mainstream clinical care, hence the name GA4GH, which fosters and frames both genomics and health data sharing. However, as the strategic refresh report notes, “the ultimate targets of implementation—individuals with clinical backgrounds and clinical genomic experience in the areas of patient care, clinical laboratory roles, and clinical research— are still underrepresented within the community.”47 Addressing this challenge will require greater focus on real-world clinical implementation challenges and increased focus on the development of clinical standards and policies. The REWS has already started to address some of the issues associated with clinical implementation through its subgroup work on clinical data sharing and consent and newborn screening. It remains to be seen whether this GA4GH policy and tool building momentum can be maintained and carried through another decade. However, the broad participation by the ethics and policy community worldwide and an already impressive ethical toolkit openly available for the global community augur well for the future of data governance and sharing across the world. Acknowledgements The authors are most thankful for the outstanding contribution they have received over the years from the REWS support team: Adrian Thorogood, Lindsay Smith, Stephanie O.M. Dyke, Michael Beauvais, Kristina Kekesi-Lafrance, Maili Raven-Adams, and Beatrice Kaiser. They also acknowledge and thank the wider GA4GH support team, including Angela Page, Susan Fairley, Neerjah Skantharajah, Stephanie Li, Connor Graham, and Justina Chung. The authors 46 47

GA4GH Strategic Plan 2020–2021, available at: www.ga4gh.org (Accessed 20 March 2023). GA4GH Connect: 2022 Strategic Refresh (2022).

The GA4GH Regulatory and Ethics Work Stream (REWS) at 10 …

29

thank Profs. Kazuto Kato and Madeleine Murtagh for their important leadership in GA4GH as former REWS co-leads. Thanks also to GA4GH CEO Peter Goodhand for his vision and for having campaigned for the REWS since its inception. The voluntary participation of all members of the REWS community, which was essential to the development of the REWS toolkit, should also not go unrecognized. Dianne Nicol would also like to acknowledge the support of her colleagues at The Centre for Law and Genetics at the University of Tasmania. Bartha M. Knoppers and Yann Joly would like to recognize the helpful assistance of their colleagues at the Centre of Genomics and Policy at McGill University (including the helpful editing of Katherine Huerne and Mei-Chen Chang). Bartha M. Knoppers and Yann Joly recognize the funding support from Genome Canada, Genome Quebec, and the Canadian Institutes of Health Research. Dianne Nicol acknowledges the funding support of the Australian Research Council (DP180100269).

Appendix 1

Ethics and Policy at GA4GH: key names, leadership, and example milestones

Phases:
– Regulatory Ethics and Law Division, 01/06/2013–04/03/2014
– Regulatory and Ethics Work Group (REWG), 04/03/2014–01/05/2017
– Regulatory and Ethics Work Stream (REWS), 01/05/2017–ongoing

Co-leads:
– Bartha Knoppers, 01/06/13–13/01/20
– Kazuto Kato, 01/06/13–01/10/17
– Madeleine Murtagh, 01/10/17–13/01/20
– Edward Dove, 13/01/20–17/03/22
– Yann Joly, 13/01/20–ongoing
– Dianne Nicol, 17/03/22–ongoing

Example milestones:
– GA4GH, Framework for Responsible Sharing of Genomic and Health-Related Data (2014)
– GDPR and International Health Data Sharing Forum (2018)


References

Birney E, Vamathevan J, Goodhand P (2017) Genomics in healthcare: GA4GH looks to 2022. bioRxiv. https://doi.org/10.1101/203554
British Medical Association (2019) Medicine's social contract. https://www.bma.org.uk/media/2606/bma-medicines-social-contract-oct-19.pdf. Accessed 20 Oct 2022
Broad Institute (2014) Broad Institute and University of California team awarded NCI cancer genomics cloud pilot contract. https://www.prnewswire.com/news-releases/broad-institute-and-university-of-california-team-awarded-nci-cancer-genomics-cloud-pilot-contract-281076612.html. Accessed 17 Jan 2023
Contreras JL, Knoppers BM (2018) The genomic commons. Annu Rev Genomics Hum Genet 19:429–453
GA4GH Accountability Policy (2016) https://www.ga4gh.org/wp-content/uploads/Accountability_Policy_FINAL_v1_Feb10-1.pdf. Accessed 17 Jan 2023
GA4GH Connect: 2022 Strategic Refresh (2022) https://docs.google.com/document/d/13_4E_08FHWmX_sFri7zXREZrGDG9phGezSd9cim2coM/edit#heading=h.cs8uigja2d5i. Accessed 17 Jan 2023
GA4GH Copyright Policy (2020) https://www.ga4gh.org/wp-content/uploads/GA4GH-Copyright-Policy-Updated-Formatting.pdf. Accessed 17 Jan 2023
GA4GH Product Approval Processes (2017) https://www.ga4gh.org/wp-content/uploads/GA4GH-Product-Approval-Processes-v2.pdf. Accessed 17 Jan 2023
GA4GH (2013) Creating a global alliance to enable responsible sharing of genomic and clinical data. https://www.ebi.ac.uk/sites/ebi.ac.uk/files/shared/images/News/Global_Alliance_White_Paper_3_June_2013.pdf. Accessed 17 Jan 2023
GA4GH, Machine Readable Consent Guidance (2020) https://www.ga4gh.org/wp-content/uploads/Machine-readable-Consent-Guidance_6JUL2020-1.pdf. Accessed 17 Jan 2023
GA4GH (2021) Framework for involving and engaging participants, patients and publics in genomics research and health implementation. https://www.ga4gh.org/wp-content/uploads/GA4GH_Engagement-policy_V1.0_July2021-1.pdf. Accessed 17 Jan 2023
GA4GH, Consent Clauses for Genomic Research (2020) https://www.ga4gh.org/wp-content/uploads/Consent-Clauses-for-Genomic-Research.pdf. Accessed 17 Jan 2023
GA4GH, Consent Clauses for Large Scale Initiatives (2022) https://www.ga4gh.org/wp-content/uploads/Large-Scale-Initiatives-Consent-Clauses.docx-2.pdf. Accessed 17 Jan 2023
GA4GH, Consent Policy (2019) https://www.ga4gh.org/wp-content/uploads/GA4GH-Final-Revised-Consent-Policy_16Sept2019.pdf. Accessed 17 Jan 2023
GA4GH, Data Privacy and Security Policy (2019) https://www.ga4gh.org/wp-content/uploads/GA4GH-Data-Privacy-and-Security-Policy_FINAL-August-2019_wPolicyVersionsUPDATED.docx-2.pdf. Accessed 17 Jan 2023
GA4GH, Familial Consent Clauses (2021) https://www.ga4gh.org/wp-content/uploads/Familial-Consent-Clauses-6.pdf. Accessed 17 Jan 2023
GA4GH, GDPR and International Health Data Sharing Forum. https://www.ga4gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/gdpr-forum/. Accessed 17 Jan 2023
GA4GH, Genetic Discrimination: Implications for Data Sharing Projects (GeDI) (2022b) https://www.ga4gh.org/wp-content/uploads/Genetic-Discrimination-Dec.-2-2021.docx.pdf
GA4GH, Regulatory & Ethics Toolkit (2023) https://www.ga4gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/. Accessed 17 Jan 2023
GA4GH, Strategic Plan (2020) https://www.ga4gh.org/document/strategic-roadmap-2020/. Accessed 5 Jun 2023
Knoppers BM (2014a) Framework for responsible sharing of genomic and health-related data. HUGO J 8(1):3
Knoppers BM (2014b) International ethics harmonization and the global alliance for genomics and health. Genom Med 6(2):13
Knoppers BM (2014c) Does policy grow on trees? BMC Med Ethics 15(1):1–3
Knoppers BM et al (2011) Towards a data sharing code of conduct for international genomic research. Genom Med 3(7):1–4
Knoppers BM et al (2014) A human rights approach to an international code of conduct for genomic and clinical data sharing. Hum Genet 133(7):895–903
Lawson J et al (2021) The data use ontology to streamline responsible access to human biomedical datasets. Cell Genom 1(2):100028
Nguyen MT et al (2019) Model consent clauses for rare disease research. BMC Med Ethics 20(1):1–7
Page A, Haendel M, Freimuth R (2023) A community approach to standards development. In: McCormick JB, Pathak J (eds) Genome data sharing. Elsevier, London
Rehm HL, Page AJ, Smith L, Adams JB, Alterovitz G, Babb LJ, Barkley MP, Baudis M, Beauvais MJ, Beck T, Beckmann JS (2021) GA4GH: international policies and standards for data sharing across genomic research and healthcare. Cell Genom 1(2):100029
Shabani M (2022) Will the European health data space change data sharing rules? Science 375(6587):1357–1359
Smith J (2021) The next 20 years of human genomics must be more equitable and more open. Nature 590:183–184
Stein LD, Knoppers BM, Campbell P, Getz G, Korbel JO (2015) Data analysis: create a cloud commons. Nature 523:149–151
UK Health Data Research Alliance (2020) Digital innovation hub programme prospectus: principles for participation. https://www.hdruk.ac.uk/about-us/policies/digital-innovation-hub-programme-prospectus-principles-for-participation/. Accessed 17 Jan 2023
Watson C (2022) Many researchers say they'll share data—but don't. Nature 606(7916):853
Wilkinson M, Dumontier M, Aalbersberg I et al (2016) The FAIR guiding principles for scientific data management and stewardship. Sci Data 3:160018
World Economic Forum (2020) Genomic data policy framework and ethical tensions. https://www.weforum.org/whitepapers/genomic-data-policy-framework-and-ethical-tensions/. Accessed 17 Jan 2023

Assessing Public and Private Rights of Action to Police Health Data Sharing

David A. Simon, Carmel Shachar, and I. Glenn Cohen

Abstract Data is an integral part of healthcare delivery. A growth in digital technologies has produced large swaths of health data that contain individuals' personal, and often sensitive, information. A key question for policymakers is how to regulate the collection, storage, sharing, and disclosure of this information. In this chapter, the authors evaluate two different types of regulatory enforcement mechanisms: public rights of action (where the government sues) and private rights of action (where private persons sue). They use a recent case to illustrate the advantages and drawbacks of private rights of action in health data privacy cases, and then use this analysis to contrast them with public rights of action. Their analysis suggests that public and private rights of action should be viewed as complementary regulatory tools, rather than competing alternatives. In short, both public and private rights of action have important roles in regulating health data. To ensure private rights of action are effective regulatory tools, policymakers should pay particular attention to how those rights are designed and implemented.

Keywords Public and private enforcement · Regulation · Sharing · Data

D. A. Simon (B)
Northeastern University School of Law, Cambridge, MA, USA

C. Shachar
Center for Health Law and Policy Innovation, Harvard Law School, Cambridge, MA, USA

I. Glenn Cohen
Petrie-Flom Center for Health Law Policy, Biotechnology, & Bioethics, Harvard Law School, Cambridge, MA, USA

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_3

1 Introduction

In the United States, as in many countries, the privacy of health data has become a frequent flashpoint for litigation. For example, when Google partnered with the University of Chicago Medical Center ("UCMC") in 2017, hopes were high. UCMC claimed the collaboration would "us[e] new machine-learning techniques to create predictive models that could help prevent unplanned hospital readmissions, avoid costly complications and save lives."1 But the partnership soon came under legal fire after it was revealed that UCMC shared with Google electronic health records that contained sensitive patient information.2 This purportedly included the following data from patient encounters:

– patient demographics;
– provider orders;
– diagnoses, procedures, medications, lab values, and vital signs;
– dates of service; and
– free-text medical notes.3

Upon discovering these disclosures, one affected patient, Matt Dinerstein, brought a class action lawsuit in 2019 on behalf of all adult patients whose records from 2010 to mid-2016 were shared.4 The main allegation in the lawsuit, Dinerstein v. Google, was that UCMC violated Dinerstein's privacy rights. Specifically, Dinerstein pled numerous state law claims against both UCMC and Google, including violations of the Illinois Consumer Fraud and Deceptive Business Practices Act (against UCMC), express and implied breach of contract (against UCMC), tortious interference (against Google), intrusion on seclusion (against Google), and unjust enrichment (against UCMC and Google).5 In particular, Dinerstein claimed that UCMC disclosed records without sufficient anonymization and therefore put patient privacy at risk.6 Although the lawsuit was ultimately dismissed before it reached the merits stage, an appeal from that dismissal is pending in the U.S. Court of Appeals for the Seventh Circuit. The questions the case raises—about how sharing of medical data should be regulated—continue to be asked and debated by legislators, courts, and federal and state regulators in the U.S. Regulating privacy of medical information, of course, can be done by a variety of governmental entities, such as state attorneys general and federal agencies like the Federal Trade Commission, through state and federal statutes. But Dinerstein—as well as cases in other data breach and data sharing contexts7—raises the prospect that some of this regulation may come from claims brought on behalf of private litigants (sometimes under private law) as opposed to lawsuits brought by government bodies.

1 Wood (2017), (Accessed 14 February 2023).
2 Dinerstein v. Google, LLC, 484 F. Supp. 3d 561 (2020). The case is currently on appeal at the Seventh Circuit.
3 Dinerstein, 484 F. Supp. 3d at 569–570.
4 Dinerstein, 484 F. Supp. 3d at 566, 568.
5 Dinerstein, 484 F. Supp. 3d at 570.
6 Dinerstein, 484 F. Supp. 3d at 569.
7 E.g., Calhoun v. Google LLC, 526 F. Supp. 3d 605, 617 (N.D. Cal. 2021); In re Anthem, Inc. Data Breach Litig., 162 F. Supp. 3d 953, 979 (N.D. Cal. 2016); Austin-Spearman v. AARP & AARP Servs. Inc., 119 F. Supp. 3d 1, 7 (D.D.C. 2015).


In what follows we use the Dinerstein case to assess how private rights of action (i.e., those brought by private plaintiffs) compare to and complement the more common regulatory framework that uses public rights of action (i.e., those brought by arms of the state) to police privacy violations in the U.S. First, we describe the primary regulatory framework for privacy at the federal level, focusing on public rights of action brought by federal agencies. Then we return to Dinerstein to discuss how private rights of action function to police privacy of medical information, highlighting the differences from regulation through public rights of action under federal law. Finally, we compare these two regimes and suggest that private rights of action can play an important complementary role in regulating sharing of health data. To be effective, these private rights of action must be attentive to system design considerations, such as cost, fee structure, the definition of a violation, and their structural relation to existing public rights of action and enforcement priorities. While our focus is on data privacy in the U.S., we are hopeful that some of what we have to say about the optimal mix of public and private enforcement will have implications for other regimes as well.

2 Policing Medical Data Sharing Using Public Rights of Action

In the United States, several federal laws regulate the sharing of health information. In this section we discuss three that allow the government to sue—usually called public rights of action—when the laws are violated: the Health Insurance Portability and Accountability Act (HIPAA); the Federal Trade Commission Act (FTCA); and the Children's Online Privacy Protection Act (COPPA). Perhaps the best-known of these laws is HIPAA, which focuses exclusively on information appearing in medical records and applies to "covered entities" (health plans, health care clearinghouses, and health care providers such as physicians and hospitals) and their "business associates" that possess "protected health information" (PHI).8 HIPAA contains two central provisions—one related to privacy (the Privacy Rule)9 and one related to security (the Security Rule).10 While the Privacy Rule regulates when and how PHI can be disclosed and used, the Security Rule regulates how information must be protected to ensure confidentiality, integrity, and security. The Department of Health and Human Services enforces HIPAA, violations of which can lead to significant civil penalties, potentially running into the millions or tens of millions of dollars.11 HIPAA notably does not contain a private right of action, meaning it is only enforceable by state actors.

8 Health Insurance Portability and Accountability Act of 1996, § 262 et seq., 42 U.S.C.A. § 1320d, et seq.
9 45 C.F.R. §§ 160, 164.102–164.106, 164.500–164.534.
10 45 C.F.R. §§ 160, 164.102–164.106, 164.302–164.318.
11 E.g., Healthcare Finance News (2023) (Accessed 14 February 2023).


Although HIPAA's protections are not unimportant, they are quite limited. For one thing, the law applies to only a small set of entities (covered entities and business associates) that have specific health-related information (PHI). For another, a covered entity that collects PHI but removes enough identifying information is limited by neither the Privacy Rule nor the Security Rule when it uses the de-identified data. It is also not a violation of HIPAA when an individual provides authorization to use his or her data in particular ways. Individuals may not appreciate that authorization can allow their data to be widely shared, and many provide authorization without fully appreciating the ramifications. Finally, HIPAA contains a variety of exceptions, such as complying with law enforcement or public health activities.12

Even when HIPAA does not apply, however, other federal laws may. The FTCA is one example.13 It is a federal statute that prohibits unfair and deceptive business practices. Unlike HIPAA, the FTCA applies broadly to companies engaged in commerce. Unfair and deceptive practices include privacy-related issues. Companies that do not sufficiently safeguard data, or that promise to do one thing with consumer data but do another, can be liable for civil penalties in actions brought by the Federal Trade Commission (FTC).14 For example, the FTC sued and eventually settled with Easy Healthcare Corporation, which marketed and distributed the Premom ovulation tracking app to women trying to become pregnant, for "repeatedly and falsely promis[ing] Premom users in [its] privacy policies" that the company would not share customer information without consent, would share only non-identifiable customer data, and would use any customer data for only its own "analytics or advertising."15 The FTC also has authority to enforce other statutes, such as the Fair Credit Reporting Act (FCRA), which requires credit reporting agencies to use "reasonable procedures to assure maximum possible accuracy of the information concerning the individual about whom the report relates."16 The FTC has used this authority to regulate privacy practices. For example, it brought an enforcement action against AppFolio, a popular property management software company, for failing to use reasonable procedures to ensure the accuracy of the information about prospective tenants.17

A more targeted law is COPPA, which applies to websites and online services that collect information about children. It requires the FTC to promulgate and enforce rules relating to how and under what conditions these websites can collect and maintain data about children. Under its COPPA Rule, the FTC requires online services and websites to disclose what information will be collected from and about children and how it will be used.18 It also requires parental consent before collecting information and a reasonable means for parents to review what information will be collected.19 Websites that offer games or access cannot condition that access on the disclosure of information. Finally, the company must "maintain reasonable procedures to protect the confidentiality, security, and integrity" of data collected.20 The FTC, which enforces the statute through its rules, has the authority to issue civil penalties for violations.21

Other regulations can also apply. For example, the American Recovery and Reinvestment Act of 2009 created a new requirement for "vendor[s] of personal health records" to take certain corrective actions following breaches of health information.22 The FTC implemented this provision in its Health Breach Notification Rule.23 The FTC enforces this rule in a manner similar to its COPPA Rule.24

Several features of public enforcement are noteworthy.25 First and most obviously, these laws create a "public right of action." In other words, it is the government that sues violators. Under HIPAA (typically), the FTCA, the Health Breach Notification Rule, and COPPA, the federal government, through various agencies and departments, sues to enforce the statute.26 Of course, some public laws also create a right that empowers private litigants to sue violators (as discussed in the next section). But HIPAA and the FTCA, two of the most noteworthy American data protection regulations, do not. Governmental entities, not private citizens, bring claims to redress violations of these statutes.27

12 Hall et al. (2018), pp. 124–127.
13 15 U.S.C. § 41 et seq.
14 15 U.S.C. §§ 45(a)(1), 45(m)(1)(A), 53(b), 56(a)(1), 57b.
15 Complaint, U.S. v. Easy Healthcare Corp., 23-cv-3107, at *2 (filed May 17, 2023); Stipulated Order, U.S. v. Easy Healthcare Corp., 23-cv-3107 (filed May 17, 2023).
16 15 U.S.C. §§ 1681e(b), 1681s.
17 United States v. AppFolio, 1:20-cv-03563, Complaint, at *5-6 (filed Dec. 12, 2020); United States v. AppFolio, 1:20-cv-03563, Stipulated Judgment (January 12, 2021).
18 16 C.F.R. § 312.
19 16 C.F.R. § 312.3.
20 16 C.F.R. § 312.3.
21 The FTC may develop information in a civil investigation that can support a criminal prosecution by the DOJ. https://www.ftc.gov/enforcement/criminal-liaison-unit (Accessed 14 February 2023).
22 Pub. L. 111-5, Feb. 17, 2009, 123 Stat. 115, § 1307.
23 16 C.F.R. § 318.7 et seq.
24 16 C.F.R. § 318.7. We emphasize that the laws discussed are federal and apply to patients and health systems anywhere in the U.S. Some individual U.S. states, like California and Virginia, have passed their own privacy laws with implications for health data.
25 These are not the only features of public enforcement of public law, but they are a few salient ones we highlight.
26 The Department of Health and Human Services' (HHS) Office for Civil Rights enforces the HIPAA Privacy and Security Rules. If a possible criminal violation has occurred, HHS may refer the case to the Department of Justice (DOJ). State attorneys general may also bring civil actions on behalf of state residents to enforce violations of the Privacy and Security Rules. 42 U.S.C. § 1320d-5(d). For unfair and deceptive practices, the Federal Trade Commission brings enforcement actions. Like HHS in the context of HIPAA, the FTC may work with or refer civil cases to DOJ for criminal prosecution. The FTC is also charged with enforcing COPPA and the Health Breach Notification Rule, bringing civil enforcement actions against companies that may have violated the law.
27 Of course, federal statutes can also serve as the basis for argument in a private lawsuit, such as in the tort doctrine of negligence per se. For a discussion of this intersection, see, e.g., Geistfeld (2014).


Second, enforcement power brings discretion, enabling agencies to ratchet the effect of a law or regulation up or down.28 For example, the Health Breach Notification Rule had largely lain fallow until Lina Khan took over as Chair of the FTC and issued a statement articulating her intent to cultivate and use it.29 In early 2023, she started to live up to this promise by suing GoodRx—a company that offers direct-to-consumer prescription drug discounts for patients who pay with cash and the "free" GoodRx card—for alleged violations of the rule.30 Although there are limits and downsides to agency discretion (with some deploying it more successfully than others), the tool is widely used by agencies like the FTC and FDA to address policy challenges the agency views as important and to de-emphasize those areas it thinks less important.31 Such discretion is not limitless, though. Statutory grants of authority typically limit agency claims both in subject matter and scope, which can help focus agency resources on particular problems in the market. COPPA, for example, confines agency actions to a specific problem. Consider the FTC's recent lawsuit against WW International (formerly Weight Watchers, a weight loss program) ("WW") under the COPPA Rule for designing and marketing a phone application and website to children as young as 8 years old.32 WW allegedly used games and videos to entice children to use the app and undertake health-related actions, like tracking food intake and eating habits.33 According to the complaint filed by the FTC, WW did not properly provide notice or obtain parental consent, and retained information it collected "for longer than reasonably necessary."34 The FTC's action was targeted and limited by the statute—it did not have the authority to pursue WW for making an app targeted at children; rather, it had the authority to regulate only specific data practices of WW.

28 Agencies can do this using a variety of tools, including rulemaking, adjudication, licensing, enforcement, and policy-setting (Ruhl and Robisch 2016). Agencies also use tools to maintain internal control over discretion and keep bureaucrats accountable to the agency (Metzger and Stack 2017).
29 Remarks by Chair Lina M. Khan on the Health Breach Notification Rule Policy Statement, Commission File No. P205405, September 15, 2021, available at: https://www.ftc.gov/system/files/documents/public_statements/1596360/remarks_of_chair_lina_m_khan_regarding_health_breach_notification_rule_policy_statement.pdf (Accessed 14 February 2023).
30 Federal Trade Commission (2023), (Accessed 15 February 2023). For a more recent case demonstrating a commitment to enforcement of the Health Breach Notification Rule, see the Easy Healthcare case, discussed above.
31 Ruhl and Robisch (2016) (describing how agencies use discretion to claim they have no authority over a problem).
32 US v. Kurbo, Inc. & WW International, Inc., 22-CV-946, Complaint (filed February 16, 2022), available at https://www.ftc.gov/system/files/ftc_gov/pdf/filed_complaint.pdf (Accessed 19 June 2023).
33 Federal Trade Commission (2022), (Accessed 15 February 2023).
34 U.S. v. Kurbo, Inc. & WW International, Inc., 22-CV-946, Complaint at *14-15 (filed February 16, 2022), available at https://www.ftc.gov/system/files/ftc_gov/pdf/filed_complaint.pdf (Accessed 19 June 2023).


Importantly, however, the more specific the regulation, the more likely the regulation is to fail to anticipate or solve related (but unforeseen) problems—and sometimes those problems may be better addressed through different or more comprehensive regulation. Consider the FTC's actions with respect to WW. The FTC's power is constrained by statute, and it is directed to police unfair practices and to ensure competition, not to completely quash all data sharing. The FTC could not order WW to stop providing an application or website directed toward children, even if that might be a better outcome than simply ordering the company to stop tracking data in a particular manner.35 At the same time, limitations on the scope of regulatory authority provide important checks on state power. The fact that the FTC has to stretch the meaning of the Health Breach Notification Rule to target conduct indirectly shows that Congress must act in some cases before the agency has the power to address what it thinks is problematic behavior. While the FTC was able to use its regulatory weight to influence GoodRx, there are limits to what the Health Breach Notification Rule allows.

Another key aspect of public enforcement is its notice function. One obvious way in which this happens is with the publication of the lawmaking process, the law, and administrative regulations (including the notice and comment process). But another, less obvious way is through the series of decisions that accumulate at an agency over time.36 As agencies take more public enforcement actions, the rules they articulate become clearer, giving parties notice about what kinds of actions an agency is likely to pursue in the future.

Public enforcement also has other benefits. First, it reflects democratic principles by allowing accountable governmental officials to effectuate the will of the people, though they may be captured by industry.37 Even if not elected, public officials are subject to transparency forced by laws like the Freedom of Information Act.38 Second, public rights of action ensure a consistent enforcement policy tied to the relevant political regime.39 When administrations change (e.g., the president, a state governor), enforcement policy can change to reflect the will of the voters. Third, by centralizing and coordinating enforcement actions within a specialized body, public enforcement efficiently administers a regulatory scheme.40 Private rights of action, by contrast, may not be brought when they are socially beneficial because they will not generate an economic benefit to the litigants (or their attorneys).41 Fourth, the government may be in a better position to pursue programmatic remedies—changes to company behavior. This may be because a statute provides special remedies or perhaps because of the government's leverage as the regulating entity.

35 Even without tracking children's data, WW might still make money by selling services or products to children who use the app (or to their parents if the app is "shared" or has parental oversight).
36 Solove and Hartzog (2014).
37 Mance (2023), (Accessed 19 June 2023).
38 5 U.S.C. § 552.
39 5 U.S.C. § 552. Engstrom (2013, pp. 630–631).
40 Engstrom (2013).
41 Mance (2023).


Private litigation, on the other hand, may be more apt to seek monetary compensation, either because it is expedient for the parties or because the law makes it difficult to obtain other remedies (or both). As this Part has shown, public enforcement has a number of benefits. It:

– notifies the public of a potential area of interest and rules relating to conduct in that area;
– marshals government resources to address a specific problem;
– limits agency authority to address that problem by statute;
– fulfills a democratic function by using accountable officials to enforce rules directed by the political process;
– provides consistent enforcement through a governmental body; and
– enables programmatic changes to regulated actors through litigation.

Public enforcement, however, is not always beneficial. For example, agencies may have broad authority that is difficult for Congress to rein in, or may be captured by industry. Government information and resources are also limited, potentially driving enforcement to a small subset of (the wrong) cases. In the next Part, we explore an alternative to public enforcement through public rights of action: private enforcement through private rights of action.

3 Policing Medical Data Sharing Using Private Rights of Action

We return now to Dinerstein to contrast the public-right-of-action approach with the enforcement of legal rights through private lawsuits. Using the Dinerstein court's analysis as an example, we first distinguish between two types of claims by private litigants: one based on private law and one based on public law. We then discuss how various aspects of the case highlight features of claims based on purely private law. We conclude by discussing different issues that arise when using private rights of action based in public law.

Types of Claims. Dinerstein's class action claims against Google and UCMC sounded mostly in common law actions like contract and tort, what is commonly referred to as "private law." But they also included a claim under an Illinois statute meant to regulate deceptive business practices, which is a "public law." That statute, though, provided a private right of action to plaintiffs, such that all of the claims asserted were private rights of action, whether based in statutory or common law.42 Finally, Dinerstein attempted to assert claims under HIPAA, which provides public but not private rights of action—i.e., only the Department of Health and Human Services and, in some cases, state attorneys general can enforce HIPAA directly.

42 Federal law can also create private rights of action. E.g., 15 U.S.C. § 1681n-o (creating a private right of action under the Fair Credit Reporting Act).


These three types of lawsuits reflect the different types of private and public rights of action. While public rights of action are created by statute,43 private rights of action can be authorized by statute or created by common law. When the legislature authorizes private rights of action, it addresses certain system design considerations, such as costs and funding (e.g., damages, fee-shifting) and procedural issues (e.g., discovery).44 Private rights of action under common law, however, typically are not part of a planned legislative scheme. They therefore may lack the design aspect central to private rights of action created by public law.

Standing. Simply because Congress creates a private right of action does not necessarily mean a claim can proceed when the statute is violated. When Congress authorizes a private cause of action under federal law,45 the plaintiff must show some legally cognizable injury.46 In legal parlance, the suing party must have "standing."47 When the law in question is federal, as a general matter standing requires the plaintiff to show that she "(1) suffered an injury in fact, (2) that is fairly traceable to the challenged conduct of the defendant, and (3) that is likely to be redressed by a favorable judicial decision."48 Frequently, the question before courts is whether an "injury in fact" occurred, which the Supreme Court has held must be "concrete and particularized."49

In Dinerstein, the plaintiff made the case that he was injured in several ways: UCMC breached its contract (even though no monetary harm occurred);50 invaded his privacy;51 and "stole" his medical information.52 Judge Pallmeyer concluded the first two counted as legally cognizable injuries that sounded in common law but rejected the latter because using the information did not decrease its value.53 Importantly, however, the court noted that the mere fact that the defendant allegedly violated a federal statute did not give rise to a legally cognizable claim.54 For Dinerstein, this meant it was not enough to show that UCMC violated HIPAA (which did not contain a private right of action)—he had to demonstrate that his rights in contract or tort had been violated by particular actions of UCMC or Google.

Dinerstein, along with two recent Supreme Court cases, illustrates the importance of designing a statute to redress wrongs through private litigation. Consider Spokeo v. Robins. There a credit reporting agency allegedly violated the FCRA, a law requiring it to take measures to, among other things, ensure the accuracy of consumer information. Although the law created a private right of action, the Court noted that the standing requirement consisted of two elements: harm to a particular person ("particularized") and harm that actually occurred in fact ("concrete"). Because the statute created a right to redress substantive and not procedural harm, a "bare procedural violation" might cause "particularized" harm that was not "concrete." While it left open the ultimate decision of whether a concrete "intangible" harm occurred, the Court wrote that it was "difficult to imagine" that simply violating the statute without the substantive harm the statute protected against—for example, under the FCRA, merely providing accurate information to third parties without proper notice or providing inaccurate information that had no effect—would present any cognizable material risk of harm.55

In 2021, however, the U.S. Supreme Court in Ramirez provided an answer to the question of "concrete" intangible harm it left open in Spokeo. There the claimants filed a class action alleging that a credit agency violated the FCRA by failing to use reasonable procedures to ensure the accuracy of their information (which they alleged was inaccurate). Some members of the class had their information disclosed to third parties; others did not. According to the Court, it was not possible to suffer a concrete injury (as opposed to the risk of an injury) from inaccurate information without disclosure; therefore, it found standing for only those whose information the credit agency disclosed to third parties. Unlike the bare procedural violation suffered by those whose information was not disclosed, the claimants in Ramirez whose information was disclosed did suffer an injury, albeit an intangible one—here akin to the reputational harm traditionally protected by defamation.56 By narrowing the standing doctrine to limit claimants of "intangible" harms or to hook them to those already recognized in tort (or other private law), the Court solidified the logic of Judge Pallmeyer's decision in Dinerstein. And it refocused attention on the importance of designing statutes whose underlying purposes can survive a standing analysis.

Substantive Violation. Standing, however, is just the beginning hurdle for private rights of action. Consider private rights of action based on private law. Once Dinerstein had established standing by alleging violation of his rights in contract and tort, he still needed to show that UCMC and Google actually violated those rights. For his contract claim, this meant proving that UCMC failed to keep the promises it made in the privacy practices that it required Dinerstein to sign and that doing so caused him damage. And it meant showing two other elements: first, that the contract was supported by consideration, which requires the parties to a contract to give up something; and second, that the nonbreaching party suffered economic damages. Like most contract claims, this one turned on a court interpreting specific contract language. For example, UCMC pledged to use "all efforts" to protect privacy.57 Here Judge Pallmeyer found that Illinois law did not displace the otherwise clear (though broad) contractual language that allowed UCMC to share data with Google.58 Still other contractual language, however, required UCMC to obtain written permission to sell patients' medical information.59 While the former could not sustain a contract claim, the latter could. Despite this, the court ultimately dismissed the action because contract law did not recognize any economic damages for the use of Dinerstein's information.60 His tort claims failed for similar reasons: the conduct of UCMC and Google did not match other scenarios where the tort typically was recognized.61 Private rights of action based on private law, then, may not be ideal if legal rules lag behind technological development.

43 Some claims are authorized explicitly; other times courts find them to exist implicitly (Davis 2014).
44 Burbank et al. (2013, pp. 648–661).
45 A federal court applies the rules of Article III standing whenever "Congress creates new private causes of action to vindicate private or public rights." Spokeo, Inc. v. Robins, 578 U.S. 330, 348 (2016), as revised (May 24, 2016) (Thomas, J., concurring). We note that standing is typically dictated by forum (Article III court versus state court) rather than cause of action. And, although we use the terms "private enforcement" and "private rights of action" interchangeably, we note the distinction between the former, which may arise under common law or federal law, and the latter, which scholars have used to describe enforcement of public law through a legislatively authorized cause of action. We also mean to exclude from discussion "private rights of initiation": suits by private parties against agencies seeking to require agency action (Stewart and Sunstein 1982, p. 1197).
46 States also have separate standing doctrines, which do not necessarily follow federal requirements. Wexler v. Wirtz Corp., 809 N.E.2d 1240, 1243 (Ill. 2004) (stating standing requirements); Lebron v. Gottlieb Mem'l Hosp., 930 N.E.2d 895, 917 (Ill. 2010) (noting not required to follow federal standing principles).
47 Standing is also required in public rights of action, but the statute giving the right typically matches the cognizable injury needed. But see below.
48 Spokeo, 578 U.S. at 338 (quoting Friends of the Earth, Inc. v. Laidlaw Environmental Services (TOC), Inc., 528 U.S. 167, 180–181 (2000)).
49 Spokeo, 578 U.S. at 334 (quoting Friends of the Earth, Inc. v. Laidlaw Environmental Services (TOC), Inc., 528 U.S. 167, 180–181 (2000)).
50 Dinerstein, 484 F. Supp. 3d at 571. Dinerstein concerned only the standing of the individual plaintiff, not the class members. TransUnion LLC v. Ramirez, 141 S. Ct. 2190 (2021).
51 E.g., Burbank et al. (2013, p. 639, n. 2).
52 Dinerstein, 484 F. Supp. 3d at 577–578.
53 Dinerstein, 484 F. Supp. 3d at 578.
54 Dinerstein, 484 F. Supp. 3d at 575 (quoting Bryant v. Compass Grp. USA, Inc., 958 F.3d 617, 619–20 (7th Cir. 2020) (discussing Spokeo)).
55 Although the court suggested the statute in Spokeo did not create "a degree of risk sufficient to meet the concreteness requirement," it refrained from ruling on the subject, remanding the case to the Ninth Circuit for further consideration. Spokeo, 578 U.S. at 343.
56 Ramirez, 141 S. Ct. at 2200, 2204–07.

4 Designing Regulation for Health Data Through Public and Private Enforcement

In this Part we round out our discussion of regulating health information by emphasizing the different features of, on the one hand, public and private rights of action based on public law and, on the other, private rights of action based on private law. We frame our discussion by noting that regulatory enforcement mechanisms are a choice where public and private enforcement can be either substitutes or complements.62

57 Dinerstein, 484 F. Supp. 3d at 580.
58 Dinerstein, 484 F. Supp. 3d at 582.
59 Dinerstein, 484 F. Supp. 3d at 588.
60 Dinerstein, 484 F. Supp. 3d at 591–592.
61 Dinerstein, 484 F. Supp. 3d at 594 (quoting Lovgren v. Citizens First Nat. Bank of Princeton, 534 N.E.2d 987, 989 (Ill. 1989)).
62 Burbank et al. (2013).


We take the view that “[p]rivate enforcement and public enforcement are complements, not substitutes.”63 Given this stance, we highlight the intersections and relative tradeoffs involved in using one or the other, rather than delineate the best possible regulatory framework. We emphasize that private enforcement requires legislatures to pay careful attention to system design considerations and note several potential ways that Congress could use federal law to incorporate private rights of action to better regulate health data privacy.

4.1 Public Versus Private Enforcement Under Public Law

We start by noting that public regulation of health data in the United States is lacking. The primary and most specific regulation is HIPAA. The government uses HIPAA and its accompanying regulation, as well as other privacy laws, to police behavior that the statute covers. Public rights of action here have several potential benefits. First, because they are centralized and organized around a particular mission, regulators enforcing HIPAA and related statutes are likely to have a more comprehensive view of the regulatory environment than private individuals. Using its vantage point, the regulator can pick out cases that are likely to have the most social value and push only for interpretations of the underlying statute that suit its overall purpose as opposed to merely yielding a victory in a particular case. Second, regulators enforcing HIPAA and other federal privacy laws can use their resources to target behaviors in line with political priorities, using enforcement discretion to ratchet enforcement up or down with changing political priorities. We can see this with the FTC's recent announcement regarding the Health Breach Notification Rule. Political will is trending toward more concern for privacy of health data, and the agency is responding by expanding enforcement to new areas. Third, statutes target specific behavior that the legislature has identified as problematic. HIPAA targets the behavior of physicians and hospitals. If additional regulation is needed for other actors handling health information, the legislature can create specific rules to address it through the democratic process. Congress has responded to privacy concerns by passing laws such as COPPA, and it could do the same to address new privacy risks.

These benefits also have downsides that suggest private enforcement may complement public enforcement. First, centralizing and organizing regulation around a particular statute increases the chance of regulators acting in ways that undermine the regulatory process. Regulators may act in their own interests rather than in the public's. Two examples are regulatory capture (where industry drives agency decision making) and reputational aggrandizement (where an agency seeks to burnish its own standing or the power it holds).64 If captured, regulators enforcing HIPAA could seek to enforce against competitors, or reduce regulation to a suboptimal level. When regulators seek power, on the other hand, they may become heavy-handed, regulating too much rather than too little (though under-regulation is theoretically possible). Second, while public regulators have significant resources, those resources are not unlimited. Regulators must choose which cases to target, favoring high-impact cases over ones that, while less impactful, may present significant social problems. Through overuse and aging regulation, they may develop regulatory cataracts that cloud their view of emerging problems. Third, because laws and regulations fail to anticipate new problems, regulators may try to extend the reach of statutes to cover the behavior in question. This risks subverting one of the potential benefits of the public law approach to either public or private rights of action—namely, the democracy-enhancing function. We can see this with the FTC's new position on the Health Breach Notification Rule, which it has interpreted quite broadly.

Private enforcement of public laws can help to reduce these problems. Private enforcement relies on those with the best information about violations (claimants) to bring them, providing a kind of regulatory LASIK. Private parties also supplement public enforcement by bringing claims that the regulators do not view as worth pursuing. Although we think private claimants here can pick up the regulatory slack, we stop short of claiming that "stickler private plaintiffs [that] insist on enforcing the law" necessarily "leads to ongoing, thoughtful enforcement of privacy law."65 The reason is that private rights of action will proceed only to the extent that they "[yield] an expected positive net return."66 And law needs to constrain these claims, as well, to avoid overenforcement—which can be achieved through system design considerations (by limiting claim type, scope, damages, etc.). Nevertheless, private enforcement can provide a check against some of the problems associated with public enforcement. For private enforcement to be effective, it must address other considerations, which we discuss in the next section.

Finally, private enforcement may be less likely to overreach than public enforcement. Admittedly, private claimants may be as likely as regulators to stretch the bounds of law to obtain economic rewards. But courts may be less sympathetic to such efforts precisely because this is such a prominent element in these claims. This suggests that while private enforcement can be a valuable tool for regulating health data privacy, its effectiveness will depend on how well existing private rights of action fit the harms at issue. Because many of these harms may present novel factual or legal questions, it may be best for legislatures to create private rights of action that (1) address specific harms and (2) pay attention to system design considerations, which is the subject of the next section.

63 Scholz (2022).
64 E.g., Carpenter (2010).
65 Scholz (2022).
66 Cheng (1985).


4.2 Private Enforcement Under Private Law

While complementing public rights of action with private ones authorized under public law may help enforce existing health data privacy laws, there will still be gaps in the law that deserve attention. Here we should consider how private rights of action under private law might fill them. As Dinerstein illustrates, public rights of action will not reach every suit. Private law provides some flexibility to reach claims even when there is no private right of action under public law. In Dinerstein, for example, although HIPAA contained no private right of action, it provided the basis for a contract claim because UCMC incorporated it into its agreement with Dinerstein.

Limits of Private Law. Private law, like contract and tort, is also open-ended, leaving the parties to determine what behavior they want to limit or prescribe. Where the enforcement flexibilities of public law run out, private law claims may be able to step in. As Dinerstein showed, however, courts may be conservative in recognizing new types of injuries or actions based upon them, potentially limiting the usefulness of private law claims. Both the parties' ability to change contract language and negotiate around public rights of action, and their uneven negotiating power, can make private causes of action insufficient as well. Patients have effectively no ability to bargain for contract terms in the health context, making the agreements into "contracts of adhesion" that one party forces on another. These terms may reflect the power dynamics of the contracting parties. In Dinerstein, for example, the contract provided that "any use of my medical information will be in compliance with federal and state laws, including all laws that govern patient confidentiality, and the University of Chicago Medical Center Notice of Privacy Practices."67 Without an ability to effectively bargain, contract law may provide a poor fit for regulating health data privacy.

System Design Considerations. Dinerstein also illustrated that, beyond claims based in private law, any system of private enforcement will have to grapple with system design questions. These are of two types. The first relates to doctrinal questions, such as standing, the elements of a cause of action, and proof of damages. For example, substantive law may limit individual claims brought under purely private law. Publicly authorized private rights of action may avoid this problem in part but must still confront issues like standing. This was true for Dinerstein's claim under Illinois law. Public rights of action, by contrast, typically take violations of state or federal law as the injury to be rectified—none of the individuals actually harmed are party to the lawsuit. Consider the standing challenges raised in Spokeo and Ramirez. Because the Supreme Court has linked standing to the substantive violations targeted by statute, plaintiffs suing over violations of HIPAA (as an example) might have to show, as in Ramirez, that an unauthorized disclosure occurred, not just that a covered entity failed to protect information in accordance with the statutory prescriptions. For example, a legislature drafting a statute authorizing a private right of action for improperly securing health data should carefully define the violations and harms at issue in terms of security. Even so, statutes that conceptualize harm as failing to protect information may face standing challenges unless courts agree that such failures themselves constitute substantive violations, something to which the Supreme Court seems unreceptive.

The second relates to how to structure the administrative regime governing private rights of action.68 For example, how will costs be structured? Courts may devise systems to limit litigation, potentially blunting claims. Fee-shifting provisions may move the ultimate cost of a suit to the losing party, potentially sweetening the prospect of litigation for claimants. The statutory payout may also dictate how attorneys structure their fees and, hence, whether certain clients access legal services. Statutory damages may incentivize firms to work on cases they otherwise would see as too small to merit attention. An analysis of fee-shifting and damages provisions would also have to consider the potential for procedural issues, including those concerning certification of class actions, along with the standing issues discussed above. Finally, a legislature must consider the appropriate role of judges and judicial discretion within the system, be it state or federal.

While Dinerstein (appropriately) did not evaluate these design considerations (since that was not the job of the court), they are an important component of any regulatory system that opts for (some) private enforcement. To incentivize claims like Dinerstein's, private enforcement must be accompanied by sufficient financial incentives to generate lawsuits. Attorneys' fees provisions, statutory damages, and perhaps even treble damages could entice attorneys to take cases on contingency, creating access to the courts that might otherwise be absent. Private enforcement, whether through individual lawsuits or class actions like Dinerstein, may not substantially alter health data policy, either by deterring conduct or requiring changes.69 Since the enforcement budgets of public regulators are likely to be lower than they would be if based on returns in the private market,70 it is important that these incentives are calibrated correctly so that private enforcement picks up the slack. This is why private enforcement likely serves better as a complement to public enforcement rather than a replacement for it. Despite these hurdles, private enforcement of public law has some benefits. First, it marshals resources and information from the private sector, which may more efficiently target violations than public regulators with less information. Second, it creates a structure similar to public regulation for administering claims, but off the balance sheets of government agencies.71

68 These are discussed at length in Burbank et al. (2013). See also Scholz (2022) (arguing for using both in a privacy regime).
69 Landes and Posner (1975).
70 Landes and Posner (1975, p. 36).
71 Donahue and Witt (2020).
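To make the incentive calculus concrete, consider the following minimal sketch (in Python; the win probabilities, damages, class size, and costs are hypothetical figures of our own invention, drawn from neither the chapter nor any actual case) of how statutory damages and fee-shifting can turn an individually worthless claim into one that attracts contingency counsel:

```python
# Hypothetical illustration of how statutory damages and fee-shifting can
# change whether a health data privacy claim is worth bringing. Every figure
# is invented; none reflects actual damages, costs, or win rates anywhere.

def expected_value(win_prob: float, damages: float,
                   litigation_cost: float, fee_shifting: bool) -> float:
    """Expected net recovery to the plaintiff's side for one suit."""
    if fee_shifting:
        # A winning plaintiff shifts the cost to the defendant;
        # a losing plaintiff still bears it.
        return win_prob * damages - (1 - win_prob) * litigation_cost
    # Without fee-shifting, the plaintiff bears the cost win or lose.
    return win_prob * damages - litigation_cost

# One small claim: provable harm is low, so no rational attorney files.
print(expected_value(0.5, damages=1_000, litigation_cost=20_000,
                     fee_shifting=False))        # -19500.0

# Statutory damages of $500 across a 10,000-member class, plus fee-shifting,
# make the very same conduct worth litigating on contingency.
print(expected_value(0.5, damages=500 * 10_000,
                     litigation_cost=200_000, fee_shifting=True))  # 2400000.0
```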


4.3 Changing Public Laws to Cover More Health Data

Because health data regulation is policed through a patchwork of public and private enforcement, efforts to increase data privacy should include adjunctive private rights of action for privacy violations. Consistent with the previous sections, however, such changes must pay careful attention to system design considerations.

One might, for example, increase enforcement of existing regimes by adding a private right of action. Congress could also expand HIPAA by altering the definition of “covered entities” or the terms on which they must provide notice and obtain authorization, perhaps expanding requirements for privacy. It could then authorize private rights of action for either the narrower or broader definitions to calibrate enforcement of the relevant interests. With these system design considerations in mind, Congress might decide that it would be prudent for the government to test the new enforcement powers before authorizing private rights of action over new types of claims. It could even write a “sunrise” provision into the statute, automatically allowing private rights of action to proceed after a certain period of time.

Congress would also have to address damages, fee-shifting, and other costs of administration. To incentivize claims, for example, it might include a fee-shifting provision that awards attorneys’ fees to successful plaintiffs. If damage awards are high enough, private claimants may bring claims large enough to fund remedial actions in addition to judgment amounts. Alternatively, Congress may decide that fee-shifting would focus enforcement on a small number of egregious violations rather than a large number of problematic ones; here it might instead include a small statutory damage award and eliminate fee-shifting, encouraging class actions for minor violations. While these changes may be politically difficult to implement, they can serve as important support beams for a larger network of public enforcement of health data privacy regulation.

While changes to federal law may improve the regulation of health privacy, public state law may provide a needed gap-filler where federal action is not feasible or advisable. California and other states, for example, have attempted to strengthen privacy protections by enacting new laws. California enacted privacy legislation in 2018 that provided consumers with rights to know, to opt out, to delete, and to non-discrimination in relation to how companies collect their data.72 In 2023, it supplemented this law by adding rights to correct inaccurate information and to limit the use and disclosure of sensitive information.73 These rights are enforced by the California Attorney General. Looking outside privacy-specific statutes is also an option. State laws such as Illinois’ Consumer Fraud and Deceptive Business Practices Act could be revised, or a new statute could be enacted, as Illinois did (and other states have done) with the Illinois Human Rights Act, a law that protects individuals against various kinds of discrimination.74

72 California Consumer Privacy Act of 2018 (CCPA), Cal. Civ. Code § 1798.100 et seq.
73 Proposition 24, California Privacy Rights Act of 2020, amending Cal. Civ. Code §§ 1798.100–1798.199.100.
74 775 Ill. Comp. Stat. 5/ et seq.


Of course, states may face similar barriers to legislative change. This means that changes to private rights of action may be more tenable. If these changes come from legislatures, some of the same considerations may apply, though they could be blunted by the limited effects (as stated above) of private law on the sharing of health information. Legislative fixes could, for example, take a page from tort and contract law and make certain kinds of contracts requiring consent to the sharing of all kinds of (medical) information void or voidable as unconscionable when the parties’ bargaining power is asymmetrical and the consumer has no real alternative.75

Finally, our discussion has been about how public and private causes of action influence firm behavior. We do not mean to suggest that public and private causes of action are the only potential ways to regulate the sharing of medical data. Nevertheless, public and private enforcement are important tools to regulate how health data is shared and protected.

75 E.g., Waggoner v. Nags Head Water Sports, Inc., 141 F.3d 1162 (4th Cir. 1998); Copeland v. Healthsouth/Methodist Rehab. Hosp., LP, 565 S.W.3d 260, 274 (Tenn. 2018).

5 Conclusion

In this chapter, we focused on the mechanisms used to regulate the privacy of health data in the United States. Using a recent case brought by a private litigant, we described the potential benefits and drawbacks of enforcing privacy rights through lawsuits brought by private litigants or the government. We showed the limits of using private law to enforce public rights, and we suggested ways for legislatures to complement public enforcement with private enforcement using private rights of action based on public law.

References

Burbank SB, Farhang S, Kritzer HM (2013) Private enforcement. Lewis & Clark Law Rev 17(3):637–722
Carpenter DP (2010) Reputation and power: organizational image and pharmaceutical regulation at the FDA. Princeton Studies in American Politics, Princeton University Press, Princeton
Cheng C (1985) Important rights and the private attorney general doctrine (Comment). Calif Law Rev 73(6):1929–1955
Davis S (2014) Implied public rights of action. Columbia Law Rev 114(1):1–84
Donahue N, Witt JF (2020) Tort as private administration. Cornell Law Rev 105(4):1093–1170
Engstrom DF (2013) Agencies as litigation gatekeepers. Yale Law J 123(3):616–713
Federal Trade Commission (2022) FTC takes action against company formerly known as Weight Watchers for illegally collecting kids’ sensitive health data. https://www.ftc.gov/news-events/news/press-releases/2022/03/ftc-takes-action-against-company-formerly-known-weight-watchers-illegally-collecting-kids-sensitive. Accessed 18 Jun 2023


Federal Trade Commission (2023) First FTC health breach notification rule case addresses GoodRx’s not-so-good privacy practices. https://www.ftc.gov/business-guidance/blog/2023/02/first-ftc-health-breach-notification-rule-case-addresses-goodrxs-not-so-good-privacy-practices. Accessed 18 Jun 2023
Geistfeld MA (2014) Tort law in the age of statutes. Iowa Law Rev 99(3):957–1020
Hall MA, Orentlicher D, Bobinski MA, Bagley N, Cohen IG (2018) Health care law and ethics. Wolters Kluwer Law & Business, pp 127–154
Healthcare Finance News (2023) Anthem pays $16 million in record HIPAA settlement for data breach. https://www.healthcarefinancenews.com/news/anthem-pays-16-million-record-hipaa-settlement-data-breach. Accessed 19 Jun 2023
Landes WM, Posner RA (1975) The private enforcement of law. J Leg Stud 4(1):1–46
Mance A (Forthcoming 2023) How private enforcement exacerbates climate change. Cardozo Law Rev. https://papers.ssrn.com/abstract=4204954. Accessed 18 Jun 2023
Metzger GE, Stack KM (2017) Internal administrative law. Mich Law Rev 115(8):1239–1307
Ruhl JB, Robisch K (2016) Agencies running from agency discretion. William & Mary Law Rev 58(1):97–182
Scholz LH (2022) Private rights of action in privacy law. William & Mary Law Rev 63(5):1639–1694
Solove DJ, Hartzog W (2014) The FTC and the new common law of privacy. Columbia Law Rev 114(3):583–676
Stewart RB, Sunstein CR (1982) Public programs and private rights. Harv Law Rev 95(6):1193–1322
Wood M (2017) UChicago Medicine, The Forefront. UChicago Medicine collaborates with Google to use machine learning for better health care. https://www.uchicagomedicine.org/forefront/research-and-discoveries-articles/uchicago-medicine-collaborates-with-google-to-use-machine-learning-for-better-health-care. Accessed 18 Jun 2023

Patient Perspectives on Data Sharing

Louise C. Druedahl and Sofia Kälvemark Sporrong

Abstract Data sharing is key for artificial intelligence and for future healthcare systems, but the perspectives of patients are seldom included in the larger debates about how, when, and what data to share. This chapter provides an overview of research on patient perspectives on data sharing and associated aspects, including patients’ motivations, concerns, and views on privacy and conditions for sharing. Moreover, these perspectives are put into the evolving context of informed consent and today’s European context of the General Data Protection Regulation (GDPR) and the Data Governance Act (DGA). Overall, there seems to be a discrepancy between the patient perspective on data sharing and the reality in which patients’ data are to be shared. Current patient views are researched within relatively ‘local’ contexts, where the patient consents to data collection for primary use, and concern patients’ preferences regarding consent and what they see as barriers to and motivators for data sharing. However, the reality of data use is moving towards re-use of data for secondary purposes and a context of more altruistic consent, such as under the DGA. Questions remain regarding how patients perceive sharing and the role of their data in the larger governance of data; seemingly, patient views are lost in the wider debate on innovation and jurisdictional competitiveness. Ensuring that patients’ voices are heard is essential for public acceptance of data sharing, and thus for the inclusiveness and equity of results and innovations originating from patients’ shared data.

Keywords Artificial intelligence · Consent · Data sharing · Data Governance Act · GDPR · Patient perspective

L. C. Druedahl (B) Faculty of Law, Centre for Advanced Studies in Biomedical Innovation Law (CeBIL), University of Copenhagen, Copenhagen, Denmark S. Kälvemark Sporrong Faculty of Pharmacy, Social Pharmacy, Department of Pharmacy, Uppsala University, Uppsala, Sweden © The Author(s) 2024 M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_4


1 Introduction

Data and the sharing of data are actively pursued with the specific goal of increasing medical knowledge1 and the more general goal of promoting science.2 Data and data sharing are viewed as keys to unlocking the potential of artificial intelligence (AI) in healthcare in order to improve the detection, diagnosis, and treatment of disease.3 Data sharing within healthcare is evolving and becoming a sought-after practice in the wider scientific community. It is covered in the Lindau Guidelines,4 scientific medical journals,5 the pharmaceutical industry,6 and also in regulatory settings, such as the European Medicines Agency (EMA)’s policy of publishing clinical trial data to increase transparency and foster innovation.7 Rationales for data sharing include the fact that existing data can be used in a broader context to address issues other than those originally intended.8 Accordingly, data already collected, or collected in the future, can be used for advancement and innovation.

The public picture differs slightly from the healthcare/research perspective and is colored by media scandals about data use and storage portraying the industries using data as reckless cowboys.9 One example is Google’s Project Nightingale, which collected personal, identifiable healthcare data without patient knowledge and led to a controversy with the potential to undermine public trust in data sharing.10 These events happen while regulators work on rules to ensure the safety, efficacy, and quality of products that rely on large amounts of data, such as the guidance of the US Food and Drug Administration (FDA) on AI-based medical devices11 and the EU AI Act,12 as well as regulations safeguarding patient privacy (such as the General Data Protection Regulation 2016/679).13

1 Flanagin et al. (2022), Forbes. Brian Foy (2022).
2 Council for the Lindau Nobel Laureate Meetings/Foundation Lindau Nobel Laureate Meetings (2020).
3 Elmore and Lee (2021).
4 Council for the Lindau Nobel Laureate Meetings/Foundation Lindau Nobel Laureate Meetings (2020).
5 Flanagin et al. (2022), Taichman et al. (2017).
6 European Federation of Pharmaceutical Industries and Associations (EFPIA) and Pharmaceutical Research and Manufacturers of America (PhRMA) (2013).
7 European Medicines Agency (2022).
8 Sydes et al. (2015).
9 The Lancet Digital Health (2019).
10 Ledford (2019).
11 US Food and Drug Administration (2022).
12 European Commission (2021).
13 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).


Artificial intelligence is “a generic term that refers to any machine or algorithm that is capable of observing its environment, learns, and based on the knowledge and experience gained, takes intelligent action or proposes decisions.”14 Hence AI is not one technology; instead, it covers a diverse range of technologies. A commonly used AI technology today is machine learning (ML), which covers the training of algorithms to identify patterns in data; it includes neural networks (NN), which can use more unstructured data and are less dependent on human intervention.15

The broader debate on health data sharing has many facets, including data ownership, data protection, data security, trust, confidentiality, privacy, and data quality.16 While these are intensely debated from legal, regulatory, and ethical perspectives, there is less focus on how patients perceive the sharing of their data, and barely any on how to incorporate their perspectives into policy and legal frameworks. It is nonetheless imperative to understand patients’ perspectives, as well as their hopes and concerns, because there can be disparity between patients’ attitudes toward data sharing in general, sharing of their own data, and what data they eventually give permission to share. Such an understanding is essential not only to fulfill the ethical principles set out by the World Health Organization (WHO) to ensure inclusiveness and equity of innovations relying on AI,17 but also to fulfill the ultimate goals of increasing adoption and supporting the progress of data sharing in societies.18 These issues are presented and discussed here.

The chapter is structured as follows: Sect. 2 provides an overview of the patient perspective on health data sharing; Sect. 3 puts the patient perspective into the evolving context of data sharing; and Sect. 4 gives brief concluding remarks and future perspectives.
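For readers unfamiliar with the terminology, the following minimal sketch (in Python with scikit-learn, on entirely synthetic records; the features, labels, and risk rule are our own invention for illustration only) shows what “training an algorithm to identify patterns in data” means in practice:

```python
# Minimal illustration of machine learning in the sense used above: an
# algorithm is trained on example records and learns a pattern it can
# apply to new cases. All "patient" data here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic records: [age, systolic blood pressure]; label 1 = "high risk".
X = np.column_stack([rng.uniform(20, 90, 500), rng.uniform(90, 200, 500)])
y = ((X[:, 0] > 60) & (X[:, 1] > 140)).astype(int)  # invented rule to be learned

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)  # the "training" step
print(f"accuracy on unseen records: {model.score(X_test, y_test):.2f}")
```

The policy-relevant point is that the performance of such a model is bounded by the volume and representativeness of the patient records shared for its training.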

2 Patient Perspective on Data Sharing

“Patients want their data used responsibly, however, so the question is really: Who should control how data are distributed and used by others? The patients themselves? Doctors and researchers? Research institutions or governments?”19

Unsurprisingly, there is not just one patient perspective on the sharing of health data: various debates exist for its different aspects. In this chapter, we unfold some of these.

14 Iglesias et al. (2019).
15 Wainberg et al. (2018), Eda Kavlakoglu (2020).
16 Cohen and Mello (2019), Price and Cohen (2019), Kostick-Quenet et al. (2022), Minssen et al. (2020), World Health Organization (2021).
17 World Health Organization (2021).
18 Lounsbury et al. (2021).
19 Haug (2017).


An essential part of the patient perspective on data sharing is what patients perceive as health data and whether they know what is shared when they agree to share data about themselves. Data on this are scarce, but a survey (n = 1,191) showed that 52% of US patients did not know that they shared health data when sharing personal information in healthcare settings.20 Moreover, a study among vulnerable patients showed that it is unclear to patients what data are valuable to share and what data are actually shared, for example, that this can include electronic health record (EHR) data.21 If generalizable to wider populations, this is truly problematic, for example, for whether patients’ consent really is informed. However, despite patients’ seemingly limited knowledge of what constitutes personal and shareable data, some perceive health data as sensitive information.22 Patients with rare diseases expressed that sensitive information could include information about disability, genetic information about their disease, and physiological data.23 Further, the patients who viewed their data as sensitive also wanted more control over how their health information was used, irrespective of whether they were familiar with sharing information online in social networks.24 Nonetheless, systematic reviews have found widespread patient support for data sharing for medical research,25 although this support can differ depending on data origin, such as clinical trials, EHRs, or biobanking.26

2.1 Patient Motivations for Data Sharing

Patients’ motivation and willingness to share their data often connect to altruism, such as a wish to contribute to medical research, to advance disease understanding or the development of new treatments, to help others, or, more broadly, to serve ‘the common good.’27 An additional motivation found among clinical trial participants is a general belief that the benefits of data sharing outweigh any potential concerns.28 The purpose of data sharing also matters in a wider context, however, as the public is significantly more comfortable with data sharing for patient purposes than for business purposes.29 Data sharing and a willingness to share data also carry an expectation of outcomes, for example, that sharing data leads to important improvements in research and care.30 Multiple studies have shown that patients hope that sharing their data will enable personal access to and ownership of their data; increase collaboration and evidence generation for better and safer care; improve diagnostics and treatments; and improve equality and the delivery of personalized care.31

20 Health IT Analytics. Shania Kennedy (2022).
21 Bernaerdt et al. (2021).
22 Bernaerdt et al. (2021).
23 Courbier et al. (2019).
24 Courbier et al. (2019).
25 Kalkman et al. (2022), Hutchings et al. (2021).
26 Mello et al. (2018).


Overall, the general picture of patient and public motivation and willingness to share data is that people want to contribute their data for purposes they want to support.

2.2 Patient Concerns About Data Sharing

Despite the general support for data sharing, patients also have concerns. Studies have found that patients’ frequent concerns about data sharing are breach of confidentiality, commercial use or exploitation of data for profit, and abuse of data.32 Other patient concerns include the risk of their information being stolen, that their data are used for marketing rather than research purposes, and that fewer persons would enroll in clinical trials if they knew their data would be shared.33 Furthermore, patients are less likely to consent to sharing their medical records if the records contain sensitive topics,34 if they lack a usual source of care, or if they experience cost barriers to care.35 Nonetheless, patients consider not only aspects of their own data, but also how sharing of their data could influence care for other patients, such as how increased health data sharing in resource-high regions could introduce or perpetuate biases for patients in resource-low regions.36

Concerns particularly related to confidentiality include the risks of reidentification and discrimination.37 A survey of patients and the public showed that, overall, they are very or extremely concerned about discriminatory use of health data or data being used against them in relation to employment or opportunities for quality healthcare.38 This is echoed in studies with vulnerable patients and patients with rare diseases, who fear becoming victims of discrimination in the labor market or having to pay higher insurance premiums.39 Thus, many patients, as well as the general public, are concerned about the risk of discrimination as a consequence of data sharing.

Associations have been found between concern about reidentification and/or lower trust in other people in general and an increased likelihood of believing that the negative aspects of data sharing outweigh its benefits.40 However, this might be mitigated by anonymization, as two qualitative studies among cancer patients and the general public found that willingness to share healthcare data increases if data are anonymized.41

31 Lounsbury et al. (2021), Kalkman et al. (2022), Broes et al. (2020).
32 Kalkman et al. (2022), Mello et al. (2018).
33 Mello et al. (2018).
34 Hutchings et al. (2021).
35 Grande et al. (2013).
36 Lounsbury et al. (2021).
37 Mello et al. (2018), Kalkman et al. (2022).
38 American Medical Association (2022a), Lounsbury et al. (2021).
39 Bernaerdt et al. (2021), Courbier et al. (2019).
40 Mello et al. (2018).


Patients’ concerns about data sharing relate not only to the content of the data itself: the entity receiving the shared data also matters to patients.42 Studies have pointed toward patients wanting to share their data with an entity they trust,43 and patients with rare diseases worry about their data being shared with third parties or being used for a purpose they did not choose.44 Further, patients are relatively willing to share their data with university researchers or not-for-profit organizations45 but are more cautious about sharing with pharmaceutical companies,46 employers,47 insurance companies,48 or tech companies.49 The reported concerns about sharing data with insurers are rooted in worries that the insurance company would, based on the data, restrict treatment for financial reasons.50 However, despite the skepticism about the entity receiving the data, some patients think that data should be shared as much as possible, even if companies profit from them, because companies also bring new treatments to the market.51 Overall, despite the many different views, the general picture is that patients are concerned about discrimination and about who can gain access to or receive their data.

2.3 Patient Views on Privacy, Trust, Distrust and Conditions for Sharing

The patient perspective includes more than motivation and willingness to share data and the concerns about such sharing. It also includes, for example, views on privacy, trust, and distrust, as well as views on responsible data sharing and on how conditions for sharing, such as opt-in/opt-out, can influence patient willingness.

If privacy is lacking in data sharing, patients can associate it with potential exploitation of their data.52 A study among US patients found that privacy was perceived to be a right (92%, N = 1,000) and that patients generally believed that health data should not be available for purchase.53

41 Lounsbury et al. (2021), Broes et al. (2020).
42 Jones et al. (2022), Mello et al. (2018), Kim et al. (2015), Kim et al. (2019), Broes et al. (2020), Grande et al. (2013), Courbier et al. (2019), Kalkman et al. (2022).
43 Jones et al. (2022), Mello et al. (2018), Kim et al. (2015), Kim et al. (2019), Broes et al. (2020), Grande et al. (2013), Courbier et al. (2019), Kalkman et al. (2022).
44 Courbier et al. (2019).
45 Mello et al. (2018), Courbier et al. (2019), Aggarwal et al. (2021).
46 Bernaerdt et al. (2021), Lounsbury et al. (2021), Jones et al. (2022), Broes et al. (2020), Aggarwal et al. (2021), Mello et al. (2018).
47 Bernaerdt et al. (2021).
48 Jones et al. (2022), Courbier et al. (2019).
49 Lounsbury et al. (2021), Aggarwal et al. (2021).
50 Jones et al. (2022).
51 Broes et al. (2020).
52 Lounsbury et al. (2021).


However, another study, of the general US public, showed that only a minority (24.4%) agreed that their de-identified healthcare data could be sold to a pharmaceutical company.54 Apart from patients’ views on selling data, another aspect of privacy is whether data sharing can worsen the privacy and cybersecurity of their data. Patients are generally in favor of strict control of their data.55 Studies have found that patients see control over their data as an expression of their right to privacy, and that many value individual control of data over the social benefit of data sharing, prioritizing individual control over who can access their data and for what purposes.56 Nonetheless, despite the focus on control of data and data sharing among both the general public and patients, few know how their data are currently shared; one study showed that only 20% of US patients (N = 1,000) had knowledge of this.57 A related aspect is that patients can be in situations where they are frail, and data control and data access are not necessarily priorities.58 Further, patients may think it important to control their data sharing not only with third parties, but also with healthcare professionals.59 Patients often trust their own doctors, but less so other members of the healthcare team or the healthcare system in general, according to a qualitative study.60 A survey found that 33.9% of patients (N = 850) were more willing to share data when they could select with whom to share.61

A systematic review concluded that the following conditions influence what patients view, overall, as responsible data sharing:62

– Value. The data sharing should reflect the participants’ values and be in the public’s interest.63
– Trust. Trust in the ability of the institution handling their data is particularly important for patients.64
– Privacy, data security, and minimizing risks. Protection of individuals’ privacy and reducing reidentification are essential. Data must be securely stored, and studies that offer value and minimize risks are more supported by patients.65

53 American Medical Association (2022b).
54 Trinidad et al. (2020).
55 Courbier et al. (2019), American Medical Association (2022a), Kim et al. (2015), Bernaerdt et al. (2021).
56 Kim et al. (2015), Bernaerdt et al. (2021).
57 American Medical Association (2022b).
58 Bernaerdt et al. (2021).
59 Bernaerdt et al. (2021).
60 Sanyer et al. (2021).
61 Kim et al. (2019).
62 Kalkman et al. (2022).
63 Kalkman et al. (2022).
64 Kalkman et al. (2022).
65 Kalkman et al. (2022).


– Transparency, information and control. Transparency regarding how data are shared, with whom, and for what purpose is key, along with who will perform the research and what policies exist for monitoring and governing databases. Patients can accept lower levels of control but require factors such as adequate transparency, control over use and commercialization, and the right to object to the processing of their data. Information about what their data are used for is also essential.66
– Responsibility and accountability. Patients “place responsibility for data sharing practices on the shoulders of the researchers,”67 and also believe that the researchers of the original study should monitor data use by other researchers. Accountability is also important to patients, such as sanctions or penalties if data were misused or data protection was breached.68

Research suggests that a majority of patients and the public want the possibility to consent through opt-in or opt-out options for data sharing. A survey found that 83.7% of the public (N = 800) would require that their permission be sought for data sharing for both healthcare and research.69 The study also found that 11% of respondents preferred an opt-out approach, where data sharing is allowed unless specifically prohibited by the patient; 23% preferred an opt-in approach, where patients need to explicitly consent to data sharing; and 66% preferred an opt-in approach but with the possibility of accessing data without consent in emergency cases.70 Another study showed that the majority of patients (N = 1,000) wished to decide whether to opt in or out before their data were shared with a company.71 It is essential to look into these preferences because patients seem less willing to share when they must opt in.72 However, patients’ data sharing preferences can vary for different types of data.73 In one study, 76.6% of US patients (N = 1,246) selected at least one item that they did not want to share; this could result in refusal to share if only an all-or-nothing option is available when consenting to data sharing.74 In a study among vulnerable patients, preferences for consent varied between patients, as “some preferred a onetime consent, others implicit consent, some participants thought no consent was needed if data were shared anonymously and still others wanted to see a consent procedure for each individual episode of care.”75 These aspects are crucial to consider because there is a risk that patients choosing to opt out of data sharing might belong to certain groups with particular health situations or demographics.76

Overall, patients have views on when data sharing is appropriate, how their privacy should be respected, and under what conditions they find data sharing acceptable and responsible. Moreover, patients generally want to be able to opt in and opt out, but their preferences diverge.

66 Kalkman et al. (2022).
67 Kalkman et al. (2022).
68 Kalkman et al. (2022).
69 Kim et al. (2015).
70 Kim et al. (2015).
71 American Medical Association (2022b).
72 Kim et al. (2019).
73 Bernaerdt et al. (2021).
74 Kim et al. (2019).
75 Bernaerdt et al. (2021).



2.4 Patient Views on Sharing of Data for Secondary Uses

Data sharing is not only a matter of the terms under which data are collected for a primary purpose: use or sharing of such data is also possible for secondary purposes. A few studies have specifically addressed this and found that patients generally support secondary use of their data for medical research purposes.77 Similar aspects are relevant for sharing data for both secondary and primary use. However, patients want even more control over secondary uses of their data by researchers other than those who originally collected them, although opinions diverge here as well.78 Some interviewed cancer patients in one study said that they would like to be informed of a secondary use of their health data,79 while another survey showed that more than 75% of patients in general (N = 1,000) want to receive a request prior to secondary use by a company.80 Reasons for wanting to be involved in the decision of who can use their data for a secondary purpose include curiosity; transparency; a wish for information about who uses their data and the purpose of the secondary use; a wish to ensure no misuse of their data and samples; and a wish to know what cause they are contributing to with their data.81 Nonetheless, some patients express strong willingness to give broad consent for secondary use of their data.82 In a US study, a survey of the general public found that, for the majority, willingness to share was influenced more by the actual secondary use of the data than by who was using the data or by the sensitivity of the health data.83 Sharing data solely for marketing use lowered willingness to share the most,84 and university hospitals were more favored as recipients of data than, for example, pharmaceutical companies.85

76 Watts (2019).
77 Hutchings et al. (2021).
78 Broes et al. (2020).
79 Broes et al. (2020).
80 American Medical Association (2022a).
81 Broes et al. (2020).
82 Kalkman et al. (2022).
83 Grande et al. (2013).
84 Grande et al. (2013).
85 Grande et al. (2013).


2.5 Patient Views on Data Sharing for Artificial Intelligence or Machine Learning

With the rise of new technologies such as AI and ML, it is relevant to consider what is known of the patient perspective, particularly regarding data sharing. Many aspects of data sharing considered earlier in this chapter are also relevant for AI and ML, but few studies relate particularly to data sharing for AI and ML. In one study, more than 85% of patients (N = 408) supported the use of anonymized health data and health images for research using ML, and 45.9% believed that the benefits outweighed the risks when using AI in healthcare.86 However, some patients worry that AI could negatively influence patient-centeredness because a disproportionate focus on data would harm the doctor-patient relationship.87

3 The Patient Perspective in a Changing Context of Data Sharing

Current patient views on data sharing are linked to preferences regarding consent and to perceived barriers and motivators for sharing data, such as those outlined above. There is less knowledge about how patients perceive the sharing of their data within the larger governance of data and about the role their data play in strategies for innovation and jurisdictional competitiveness. At the same time, legislation is seemingly changing its perspective from a protective patient consent to a more altruistic one in order to facilitate data sharing.

3.1 The Changing Context

The traditional way of perceiving patients’ consent to the collection of their data is in relation to participation in clinical trials or in relation to the hospital where the patient receives treatment or a medical procedure, that is, in specific and somewhat limited contexts. This concept of consent is historically still fairly new; in the US, for example, informed consent grounded in patient autonomy emerged only in the early twentieth century.88 Nonetheless, patient autonomy prevails as a key ethical principle in healthcare and healthcare-related research.89 As technology has evolved, patient autonomy allows patients to decide whom they permit to access their health data.

86 Aggarwal et al. (2021).
87 Lounsbury et al. (2021).
88 Bazzano et al. (2021).
89 Beauchamp and Childress (2001).


This includes aspects such as the collection, processing, and storage of their data, as well as for what purpose their data are used and by whom. This autonomy is protected as a right to privacy via data protection laws in many jurisdictions. One such example is the EU GDPR,90 which updated the 1995 Data Protection Directive (Directive 95/46/EC) to keep pace with technological development by creating minimum data privacy and security standards.91 The remainder of this section focuses on the EU context, which can serve as an illustration of, and inspiration for, one way of regulating and governing data sharing in other jurisdictions.

According to the GDPR, health data can be handled only with the patient’s consent or where the handling falls under the exemptions of the GDPR. Individuals in the EU, and thus patients, are secured certain rights through the GDPR, such as the right to access information about data processing (Art. 15) and the right to restrict processing (Art. 18).92

A more recent development in the EU is the Data Governance Act (DGA), which came into force on June 23, 2022.93 The GDPR must still be complied with under the DGA, but the DGA aims to make more data available. This is done through “regulating the re-use of publicly held, protected data, by boosting data sharing through the regulation of novel data intermediaries and by encouraging the sharing of data for altruistic purposes”.94 The DGA’s scope includes data in the public sector, which also comprises healthcare data.95 From a European policymaker perspective, a rationale for introducing the DGA was to realize the potential of EU data for the innovation of new products and services as well as for social and economic benefits, with the aim of keeping the EU at the forefront of data-based innovation.96 These outcomes can also have tremendous value for patients, for example, by bringing better treatments and care to the market.

With the DGA, there is a change of perspective regarding data sharing because it encourages individuals and other stakeholders, such as companies, to share data voluntarily for altruistic reasons, i.e., without reward and for use in the public’s interest.97 This signifies a shift regarding patient consent for the use of their data: from protective to altruistic. This does not mean that data privacy and patient consent are less important under the DGA, but the way of giving consent differs: consent under the GDPR is given for a specific purpose, in contrast to the “altruistic” consent under the DGA. Moreover, the DGA aims to enable the use of data already existing in public sector bodies in EU member states, meaning there is an underlying assumption that the consent given for the collection, storage, and use of these data also covers secondary data sharing via the DGA.

90 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).
91 GDPR.EU (2023).
92 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).
93 European Commission (2022a).
94 European Commission (2022b).
95 European Commission (2022b).
96 European Commission (2022a).
97 European Commission (2022b).


However, in situations where this is not the case, the DGA sets out that “[i]f a public sector body cannot grant access to certain data for re-use, it should assist the potential re-user in seeking the individual’s consent to re-use their personal data or the data holder’s permission whose rights or interests may be affected by the re-use.”98 Nonetheless, it appears very labor-intensive to secure consent from large numbers of citizens in a country. Thus, while in theory the DGA relies on the individual consenting to the use of their data, it is difficult to see how that would play out in practice, although it probably will in connection with the research exemptions mentioned in the GDPR (Art. 9(2)). Under these, EU Member States can enact laws to allow the processing of data necessary for public health reasons, such as ensuring a high standard of quality and safety in healthcare, medicines, and medical devices.99

Despite the challenge of determining how each individual in the EU should consent, or has consented, to the use of their data for altruistic purposes in alignment with the DGA, Shabani,100 for example, has argued that the DGA will empower individuals by recognizing data intermediaries as a support and a guide to how their data are shared. Data intermediaries can mean different things but can, for example, be a consent management clearinghouse, i.e., an entity allowing access to personal data with consent from individual patients for specific purposes.101 Data intermediaries could provide an interface for their members to customize with whom and for what purpose they share their personal data; their role could thus include negotiating terms for the sharing on behalf of the individual, serving as a dialog partner, and making sure that the individual’s data are shared as the individual wishes. However, even with data intermediaries, the negotiating power of individuals is limited if they do not have some knowledge of the technicalities of data sharing, making it difficult to reach nuanced decisions on when one does or does not wish to share data. Thus, “[i]n reality, individuals are facing a so-called consent dilemma or a take-it-or-leave-it option when consenting to the use and re-use of their data.”102 It is also important that patient data intermediaries are impartial: an intermediary “should not depend on the services it shares data to and from in order to guarantee its neutrality.”103 Accordingly, rather than influencing the individual to consent or not for a specific purpose, intermediaries should mediate the patient’s voice.

98 European Commission (2022b).
99 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).
100 Shabani (2021).
101 Wernick et al. (2020), World Economic Forum (2022).
102 Shabani (2021).
103 World Economic Forum (2022).
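As an illustration only, the kind of preference interface described above could be represented roughly as follows (a minimal Python sketch; the record fields, recipient types, and purposes are our own invention, and neither the DGA nor any existing intermediary prescribes such a schema):

```python
# Hypothetical sketch of how a data intermediary might record and apply a
# member's sharing preferences. Field names, recipient types, and purposes
# are invented; the DGA does not prescribe any such schema.
from dataclasses import dataclass, field

@dataclass
class ConsentPreferences:
    patient_id: str
    # Recipient types the member permits, keyed by purpose of use.
    allowed: dict[str, set[str]] = field(default_factory=dict)

    def permits(self, recipient_type: str, purpose: str) -> bool:
        """Check an access request against the stored preferences."""
        return recipient_type in self.allowed.get(purpose, set())

prefs = ConsentPreferences(
    patient_id="member-0042",
    allowed={
        "medical_research": {"university", "public_hospital"},
        "product_development": {"university"},  # e.g., no pharma or tech firms
    },
)

print(prefs.permits("university", "medical_research"))         # True
print(prefs.permits("pharma_company", "product_development"))  # False
print(prefs.permits("insurer", "marketing"))                   # False: purpose never consented to
```

The design point is that the intermediary, not the data user, holds and applies the patient’s choices, so a request falling outside the recorded purposes fails by default.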


3.2 The Fit of Patient Perspectives with the Changing Context of Data Sharing

The altruistic consent that is part of the DGA aligns well with patients’ altruistic motivations and values for sharing their data, for medical research in general and specifically for new and better treatments. Furthermore, many of the concerns that patients have regarding data sharing are countered by the focus on anonymity, privacy, and data security in the DGA, whereby the public sector must have the means (including technical capacities) in place to ensure the privacy and confidentiality of the data to be shared, ranging from “anonymisation, pseudonymisation or accessing data in secure processing environments (e.g., data rooms) supervised by the public sector, to contractual means such as confidentiality agreements concluded between the public sector body and the re-user.”104 This also potentially addresses the concern about discrimination, which could be reduced by anonymizing data prior to sharing. However, data must be sufficiently anonymized that patients do not risk reidentification (and hence possible discrimination) with both present and future technologies, the latter being, by nature, a challenge because it is unpredictable; the sketch at the end of this subsection illustrates why merely pseudonymized data carry residual reidentification risk.

Generally, all data to be shared via the DGA already exist, so the DGA will primarily concern the use of data for secondary purposes. Research on patient perspectives on secondary use of data shows that patients are concerned about the actual use of their data (see Sect. 2.4), an aspect that could be accommodated in an altruistic consent if it were possible to consent to sharing data only for some specific, but not all, future purposes. However, not all parts of the patient perspective align with the altruistic consent laid out in the DGA. For example:

– Patients consider control of data sharing necessary not only regarding third parties, but also regarding healthcare professionals.105 The partner receiving the shared data matters to patients and is generally described as important.106 Thus, patients should be allowed to differentiate their consent depending on who, or what type of entity, can receive and use their data.
– It is problematic if many patients are unaware of what their data are and where they are held, as well as what is shared; for example, some patients did not know that their EHR data were of interest for data sharing.107
– In general, patients want to receive a request before a company uses their health data for a secondary purpose.108
– Patients prefer an opt-in approach to data sharing and would particularly like an opt-out possibility if their data were to be shared with companies.109

104 European Commission (2022b).
105 Bernaerdt et al. (2021).
106 Jones et al. (2022), Mello et al. (2018), Kim et al. (2015), Kim et al. (2019), Broes et al. (2020), Grande et al. (2013), Courbier et al. (2019), Kalkman et al. (2022).
107 Bernaerdt et al. (2021).
108 American Medical Association (2022a).
109 Kim et al. (2015), American Medical Association (2022b).


– Many patients value individual control of data over the social benefit of data sharing.110

There is a risk that individuals are held “captive” by the policy argument that data are a common good, something to be used at the “will” of societies, with regulations put in place in line with this perspective. This reduces the focus on what patients think and on their options for controlling their personal health data.
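To illustrate the reidentification point made at the start of this subsection, here is a minimal Python sketch of pseudonymization (entirely our own illustration; neither the GDPR nor the DGA prescribes this or any particular technique), which removes direct identifiers but leaves residual linkage risk:

```python
# Pseudonymization, not anonymization: the direct identifier is replaced by
# a keyed token, but anyone holding the salt can reproduce the mapping, and
# the remaining quasi-identifiers may still allow linkage to other datasets.
# Purely illustrative; no regulation prescribes this particular scheme.
import hashlib

SALT = b"secret-held-by-data-controller"  # hypothetical key material

def pseudonymize(national_id: str) -> str:
    return hashlib.sha256(SALT + national_id.encode()).hexdigest()[:16]

record = {"national_id": "010190-1234", "age": 67, "postcode": "2200",
          "diagnosis": "type 2 diabetes"}
shared = {**record, "national_id": pseudonymize(record["national_id"])}
print(shared)
# The token hides the identifier, yet age + postcode + diagnosis may be
# enough to reidentify someone in a small population: the residual risk
# the text says must be assessed against present and future methods.
```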

4 Concluding Remarks and Future Perspectives

There seems to be a discrepancy between the patient perspective on data sharing as currently researched and the reality in which patients’ data are to be shared. Current patient views are researched within relatively “local” contexts, where the patient consents to data collection for primary use, and concern patients’ preferences regarding consent and what they see as barriers to and motivators for data sharing. Nonetheless, the reality of data use is moving towards re-use of data for secondary purposes, and towards contexts, such as the DGA, where an altruistic consent would not, by default, allow patients to know who receives their data or for what purpose they are used. Little is known about how patients perceive such data sharing and how they want their data shared. There is also a lack of knowledge of patients’ views on the role their data play in the larger governance of data, and of the extent to which the patient perspective on data sharing in general extends to data sharing for AI/ML purposes. Thus, there currently appears to be a gap between the reality and what is known of the patient perspective on data sharing. Patient views seem to be lost in the wider debate on AI, innovation, and jurisdictional competitiveness.

The aspects not currently taken into consideration in an altruistic consent (such as under the DGA) include control over sharing with some, but not all, types of recipient; patients’ lack of detailed knowledge about what information and data are shared; possibilities for patients to receive a request before data are used for secondary purposes; and, overall, individual control of data sharing. The most problematic concern from the patient perspective regarding altruistic consent comes from those patients who value their control more highly than the social benefit, because altruistic consent builds on exactly the opposite: the social benefit outweighs the control. From the perspective of stakeholders who aim to encourage data sharing, the main concern is that some patients may choose to opt out completely if their voice is not heard and their preferences for data sharing are not met. In such cases, AI training data would lack diversity in patient groups.111 Going forward, this is crucial because it poses a real and significant challenge due to the consequent bias, lower prediction accuracy, and potentially wrong conclusions of AI/ML-based healthcare products for diagnoses or treatments for these patients.

110 Kim et al. (2015).
111 World Health Organization (2021), Minssen et al. (2020).


Hence, this would affect the inclusiveness of AI for patient health. We hope this chapter can foster reflection on how data governance structures align with the patient perspective. We have shown that the patient perspective is essential, not only in itself, but also for moving forward to obtain future innovations that benefit all types of patients, and that it can thus contribute to achieving the WHO’s sustainable development goals on good health and wellbeing and on reducing inequalities. This starts with closing the gap between patients’ views of data sharing in a local context and how they perceive and consent to sharing of their data in the larger framework of jurisdictional innovation strategies and competitiveness. This is needed so that the consent given by the individual patient is truly informed, even if it is an altruistic consent given without knowledge of particular purposes. While moving toward altruistic consent, patient autonomy should not be left behind.

Acknowledgements L.C.D.’s work was supported by the Collaborative Research Program for Biomedical Innovation Law, a scientifically independent research program supported by the Novo Nordisk Foundation (grant NNF17SA0027784). L.C.D.’s work was also funded by the European Union (Grant Agreement no. 101057321). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the Health and Digital Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

References

Aggarwal R, Farag S, Martin G et al (2021) Patient perceptions on data sharing and applying artificial intelligence to health care data: cross-sectional survey. J Med Internet Res 23(8):e26162
American Medical Association (2022a) Patient perspectives around data privacy. https://www.ama-assn.org/system/files/ama-patient-data-privacy-survey-results.pdf. Accessed 6 Jan 2023
American Medical Association (2022b) Patient survey shows unresolved tension over health data privacy. https://www.ama-assn.org/press-center/press-releases/patient-survey-shows-unresolved-tension-over-health-data-privacy. Accessed 22 Mar 2023
Bazzano LA, Durant J, Brantley PR (2021) A modern history of informed consent and the role of key information. Ochsner J 21(1):81–85
Beauchamp TL, Childress JF (2001) Principles of biomedical ethics, 5th edn. Oxford University Press Inc., New York
Bernaerdt J, Moerenhout T, Devisch I (2021) Vulnerable patients’ attitudes towards sharing medical data and granular control in patient portal systems: an interview study. J Eval Clin Pract 27(2):429–437
Broes S, Verbaanderd C, Casteels M et al (2020) Sharing of clinical trial data and samples: the cancer patient perspective. Front Med 7(33)
Cohen IG, Mello MM (2019) Big data, big tech, and protecting patient privacy. J Am Med Assoc 322(12):1141–1142
Council for the Lindau Nobel Laureate Meetings/Foundation Lindau Nobel Laureate Meetings (2020) Lindau Guidelines. https://lindauguidelines.org/. Accessed 22 Mar 2023
Courbier S, Dimond R, Bros-Facer V (2019) Share and protect our health data: an evidence based approach to rare disease patients’ perspectives on data sharing and data protection - quantitative survey and recommendations. Orphanet J Rare Dis 14(1):1–15


Eda Kavlakoglu (2020) AI vs. machine learning vs. deep learning vs. neural networks: what’s the difference? https://www.ibm.com/cloud/blog/ai-vs-machine-learning-vs-deep-learning-vs-neural-networks. Accessed 22 Mar 2023
Elmore JG, Lee CI (2021) Data quality, data sharing, and moving artificial intelligence forward. JAMA Netw Open 4(8):e2119345–e2119345
European Commission (2021) Proposal for harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A52021PC0206. Accessed 22 Mar 2023
European Commission (2022a) European data governance act. Shaping Europe’s digital future. https://digital-strategy.ec.europa.eu/en/policies/data-governance-act. Accessed 22 Mar 2023
European Commission (2022b) Data governance act explained. Shaping Europe’s digital future. https://digital-strategy.ec.europa.eu/en/policies/data-governance-act-explained. Accessed 22 Mar 2023
European Federation of Pharmaceutical Industries and Associations (EFPIA), Pharmaceutical Research and Manufacturers of America (PhRMA) (2013) EFPIA and PhRMA release joint principles for responsible clinical trial data sharing to benefit patients. https://www.efpia.eu/news-events/the-efpia-view/statements-press-releases/130724-efpia-and-phrma-release-joint-principles-for-responsible-clinical-trial-data-sharing-to-benefit-patients/. Accessed 22 Mar 2023
European Medicines Agency (2022) Clinical data publication. https://www.ema.europa.eu/en/human-regulatory/marketing-authorisation/clinical-data-publication. Accessed 22 Mar 2023
Flanagin A, Curfman G, Bibbins-Domingo K (2022) Data sharing and the growth of medical knowledge. J Am Med Assoc 328(24):2398–2399
Forbes. Brian Foy (2022) Healthcare data sharing is essential to the future of medicine. https://www.forbes.com/sites/forbestechcouncil/2022/07/21/healthcare-data-sharing-is-essential-to-the-future-of-medicine/. Accessed 22 Mar 2023
GDPR.EU (2023) What is GDPR, the EU’s new data protection law? https://gdpr.eu/what-is-gdpr/. Accessed 22 Mar 2023
Grande D, Mitra N, Shah A et al (2013) Public preferences about secondary uses of electronic health information. JAMA Intern Med 173(19):1798–1806
Haug CJ (2017) Whose data are they anyway? Can a patient perspective advance the data-sharing debate? N Engl J Med 376(23):2203–2205
Health IT Analytics. Shania Kennedy (2022) Data sharing knowledge gaps widespread among patients. https://healthitanalytics.com/news/data-sharing-knowledge-gaps-widespread-among-patients. Accessed 22 Mar 2023
Howe N, Giles E, Newbury-Birch D, McColl E (2018) Systematic review of participants’ attitudes towards data sharing: a thematic synthesis. J Health Serv Res Policy 23(2):123–133
Hutchings E, Loomes M, Butow P, Boyle FM (2021) A systematic literature review of attitudes towards secondary use and sharing of health administrative and clinical trial data: a focus on consent. Syst Rev 10:132
Iglesias M, Shamuilia S, Anderberg A (2019) Intellectual property and artificial intelligence. https://publications.jrc.ec.europa.eu/repository/handle/JRC119102. Accessed 22 Mar 2023
Jones RD, Krenz C, Griffith KA et al (2022) Patient experiences, trust, and preferences for health data sharing. JCO Oncol Pract 18(3):e339–e350
Kalkman S, Van Delden J, Banerjee A et al (2022) Patients’ and public views and attitudes towards the sharing of health data for research: a narrative review of the empirical evidence. J Med Ethics 48(1):3–13
Kim J, Kim H, Bell E et al (2019) Patient perspectives about decisions to share medical data and biospecimens for research. JAMA Netw Open 2(8):e199550
Kim KK, Joseph JG, Ohno-Machado L (2015) Comparison of consumers’ views on electronic data sharing for healthcare and research. J Am Med Inform Assoc 22(4):821–830
Kostick-Quenet K, Mandl KD, Minssen T et al (2022) How NFTs could transform health information exchange. Science 375(6580):500–502


Ledford H (2019) Google health-data scandal spooks researchers. https://www.nature.com/articles/d41586-019-03574-5. Accessed 15 Feb 2023
Lounsbury O, Roberts L, Goodman JR et al (2021) Opening a ‘can of worms’ to explore the public’s hopes and fears about health care data sharing: qualitative study. J Med Internet Res 23(2):e22744
Mello MM, Lieou V, Goodman SN (2018) Clinical trial participants’ views of the risks and benefits of data sharing. N Engl J Med 378(23):2202–2211
Minssen T, Gerke S, Aboy M et al (2020) Regulatory responses to medical machine learning. J Law Biosci 7(1):1–18
Price WN, Cohen IG (2019) Privacy in the age of medical big data. Nat Med 25(1):37–43
Sanyer O, Butler JM, Fortenberry K et al (2021) Information sharing via electronic health records in team-based care: the patient perspective. Fam Pract 38(4):468–472
Shabani M (2021) The Data Governance Act and the EU’s move towards facilitating data sharing. Mol Syst Biol 17(3):e10229
Sydes MR, Johnson AL, Meredith SK et al (2015) Sharing data from clinical trials: the rationale for a controlled access approach. Trials 16(1):1–6
Taichman DB, Sahni P, Pinborg A et al (2017) Data sharing statements for clinical trials: a requirement of the International Committee of Medical Journal Editors. Ethiop J Health Sci 27(4):315–318
The Lancet Digital Health (2019) Unicorns and cowboys in digital health: the importance of public perception. Lanc Dig Health 1(7):e319
Trinidad MG, Platt J, Kardia SLR (2020) The public’s comfort with sharing health data with third-party commercial companies. Human Soc Sci Commun 7(1):1–10
US Food and Drug Administration (2022) Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices. Accessed 22 Mar 2023
Wainberg M, Merico D, Delong A, Frey BJ (2018) Deep learning in biomedicine. Nat Biotechnol 36(9):829–838
Watts G (2019) Data sharing: keeping patients on board. Lanc Dig Health 1(7):e332–e333
Wernick A, Olk C, Von Grafenstein M (2020) Defining data intermediaries. Technol Regul 2020:65–77
World Economic Forum (2022) Advancing digital agency: the power of data intermediaries. https://www3.weforum.org/docs/WEF_Advancing_towards_Digital_Agency_2022.pdf. Accessed 22 Mar 2023
World Health Organization (2021) Ethics and governance of artificial intelligence for health. https://www.who.int/publications/i/item/9789240029200. Accessed 22 Mar 2023

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Operationalizing the Use of Existing Data in Support of Biomedical Research and Innovation: An Inclusive and Sustainable Approach

Helen Yu

Abstract Advancements in science and technology have created an expectation and demand for research and innovation to address some of the greatest societal challenges, particularly in the health and biomedical fields. There is an inherent promise associated with the potential of breakthrough technologies, particularly when combined with quality health-related data, to deliver significantly improved health outcomes globally. However, science and innovation alone are not sufficient to achieve societal transformation towards global health. There is an observed reluctance to operationalize the use of existing data, mainly due to privacy and security concerns, as well as a palpable apprehension around how, for what purpose, and by whom data will be used. Research and innovation need to be supported by behavior and attitude change in order to foster inclusive participation and effective societal uptake of the resulting solutions. This chapter explores how the principles of Responsible Research and Innovation can be applied to provide a legally supported, inclusive, and sustainable approach to operationalizing the use of existing data in support of health-related innovations. By incorporating a deliberative and responsive process into citizen science practices, the root causes underlying this observed reluctance can be identified and addressed. The overall aim is to gain a fundamental understanding of the real and perceived barriers to utilizing data for research and innovation purposes, which can then be used to proffer solutions that create a responsive and inclusive culture to sustainably support the ongoing responsible use of data.

Keywords Responsible research and innovation · Health data · Privacy and security · Citizen science · Trust and accountability · Legally supported frameworks · Innovation

H. Yu (B)
Value-Based Health and Care Academy, Swansea University, Swansea, Wales
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_5


1 Introduction

Despite the numerous scientific advances and innovations made in the fields of health and medicine for the prevention, detection, and treatment of human diseases, recent demographic and health data indicate that the need for ongoing biomedical research and innovation remains high.1 The aging population and related increase in the number of people suffering from long-term chronic conditions, as well as an increase in lifestyle diseases among the younger population, all contribute to an urgent demand for medical innovations.2 The associated rising healthcare costs and spending globally mean there is significant pressure on all parts of the system to deliver quality, accessible, and affordable healthcare.3 Although responses to global health challenges have shifted from being primarily focused on health and science based solutions to include more social and economic considerations,4 research shows that sustained efforts and ongoing investment in an inclusive and holistic approach to health innovations need to continue in order to reduce the social and economic impact of the ubiquitous disease burden on society.5

The potential of breakthrough health and care technologies, particularly when combined with quality health-related data, promises to deliver significantly improved health outcomes.6 For example, data-driven solutions are key to understanding and addressing the increasing global threat posed by the rapid spread of antimicrobial resistance (AMR).7 Without effective antibiotics to prevent and treat an increasing range of infectious diseases, an estimated 10 million lives per year may be lost by 2050 as a consequence of the rise in drug-resistant infections, at a cost of US$100 trillion in lost global production.8 Data collected from electronic health records could be used in clinical research to reduce cost and speed up drug discovery and development. Integrated health data could be used to inform care and prevention management pathways, as well as empower patients to play a more active role in improving their own health and wellness.

Despite the general recognition that data is crucial to the advancement of health related innovations, there is an observed reluctance among stakeholders to actually operationalize and put existing data into use.9 The literature repeatedly cites the need for an acute awareness of the challenges, as well as opportunities, associated with the use of health data and large data sets to improve health and care.10

1. See, e.g., Vos et al. (2020), pp. 1204–1222; Goldman et al. (2005), pp. W5–R5.
2. Murray et al. (2020), pp. 1135–1159.
3. World Health Organization (2022).
4. Florent (2020), p. 1129.
5. See, e.g., Abegunde et al. (2007), pp. 1929–1938; Proksch et al. (2019), pp. 169–179.
6. See, e.g., Coorevits et al. (2013), pp. 547–560; Groves et al. (2016), pp. 1–20; Adibuzzaman et al. (2017), p. 384; Bonander and Gates (2010), p. e1346.
7. See, e.g., World Health Organization (2015); World Health Organization (2020b).
8. O'Neill (2016).
9. See, e.g., Bax et al. (2001), pp. 316–325; Thakur et al. (2012), pp. 562–569; Jirotka et al. (2005), pp. 369–398.

Some of the often cited barriers to the use of health data include security and privacy concerns, the need for trustworthy and transparent governance and management structures to ensure proper use of data for research and innovation purposes, and entitlement to benefit sharing.11 Despite the strong government and industry push for the efficient use of available knowledge and data to accelerate innovation, other stakeholder interests, such as public and individual patient concerns, need to be managed with equal consideration as part of the innovation process.12

Science and innovation alone are not sufficient to achieve societal transformation. Research and innovation efforts need to be supported by behavior and attitude change in order to foster inclusive participation by stakeholders (i.e., government, industry, society, academia, scientists, and data providers and stewards) to ensure effective societal uptake of the resulting solutions.13 This chapter will explore how the principles of Responsible Research and Innovation (RRI) can be applied to provide a legally supported, inclusive and sustainable approach to operationalizing the use of existing data in support of innovation. By understanding the existing legal frameworks relating to the use of data in the RRI policy context, there is a basis to better understand the source of the disconnect between legal and policy intentions and stakeholder hesitance to realize the full potential of existing data by enabling its use.14 For example, stakeholders have expressed confusion, anxiety and uncertainty arising from being unable to reconcile the principles of open science, open innovation, and knowledge sharing in accordance with FAIR data principles on the one hand with the need to preserve intellectual property (IP) rights and the commercial potential of innovations on the other.15 This leads to mistrust, questions of potential proprietary interests in data, and privacy and security considerations, all of which negatively impact the overall willingness (particularly of the public and patients) to operationalize the use of data.16

To successfully and sustainably encourage the sharing and use of available health data, practices with the object of earning and maintaining the trust of the public and patients need to be embedded in the research and innovation process. Technology-based protection and information governance measures are part of the solution (see the illustrative sketch at the end of this section) but fail to address the psychosocial obstacles that impede the broader use of data. By adopting a legal framework that supports a responsive citizen science approach that genuinely reflects the interests of the relevant stakeholders, the root causes underlying the observed reluctance to operationalize the use of existing data can be identified. It is necessary to first acquire a fundamental understanding of the real and perceived barriers to utilizing data for research and innovation purposes so that legal and policy tools can be leveraged to facilitate change in behavior and attitude towards the use of data. The overall aim is to gain insight into the socio-legal origin of the data under-utilization problem, which can then be used to proffer solutions to create a responsive and inclusive culture to support the sustainable use of data for research and innovation.

10. See, e.g., Kalra et al. (2017), pp. 1–8; Issa, Byers and Dakshanamurthy (2014), pp. 293–298; Hemerly (2013), pp. 25–31.
11. See, e.g., Kalra et al. (2017), pp. 1–8; Issa, Byers and Dakshanamurthy (2014), pp. 293–298; Hemerly (2013), pp. 25–31.
12. Yu (2016), pp. 611–635.
13. World Health Organization (2020a).
14. Houe (2019), pp. 1–8.
15. Yu (2016), pp. 611–635.
16. Yu (2016), pp. 611–635.
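To make concrete what such technology-based protection measures can look like in practice, the following minimal sketch illustrates pseudonymization, in which a direct identifier is replaced with a keyed hash and quasi-identifiers are generalized before a record is shared. The sketch is purely illustrative and is not drawn from any framework discussed in this chapter; the field names, key-handling scheme, and retained fields are hypothetical assumptions.

# Illustrative sketch only: pseudonymization as one "technology-based
# protection measure" applied before a health record leaves the data
# controller. Field names and key handling are hypothetical assumptions.
import hashlib
import hmac

SECRET_KEY = b"held-only-by-the-data-controller"  # never shared with recipients

def pseudonymize(record: dict) -> dict:
    """Replace the direct identifier with a keyed hash and generalize
    quasi-identifiers, keeping only the fields needed for research."""
    token = hmac.new(SECRET_KEY, record["patient_id"].encode(), hashlib.sha256)
    return {
        "pseudonym": token.hexdigest(),                # stable, non-reversible token
        "year_of_birth": record["date_of_birth"][:4],  # generalized, not copied
        "diagnosis_code": record["diagnosis_code"],
    }

shared_record = pseudonymize({
    "patient_id": "PT-123-456",
    "date_of_birth": "1958-03-14",
    "diagnosis_code": "E11",  # ICD-10: type 2 diabetes
})
print(shared_record)

A keyed hash, rather than a plain hash, is used here so that a recipient cannot re-identify individuals by hashing candidate identifiers themselves. Yet, as argued above, even a technically sound safeguard of this kind leaves untouched the psychosocial questions of purpose, oversight, and benefit sharing that underlie the observed reluctance to share data.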

2 The Problem of Under-Utilization

Data sources are seen as vital assets, but their fundamental value lies in the critical step of being able to access the information in a responsible and sustainable manner to develop data-driven solutions.17 For example, the pharmaceutical industry advocates for the use of available health data to better understand disease progression, target clinical research and development, and evaluate the effect of different products on health outcomes.18 The assumption is that patient participation in the innovation process, through permitting the use of their data, will empower them to be more involved in monitoring, managing, and improving their own personal health and well-being.19 Technology such as wearable devices capable of gathering and providing health data is becoming increasingly user friendly and accessible, allowing the public to manage their health and well-being while generating useful health related data.20 Innovative companies are regularly devising new AI-assisted technologies capable of analyzing a multitude of different data points and sources to provide insights on disease and health management.21 The political will to create new and flexible laws and regulations in an attempt to keep up with the pace of innovation demonstrates a notable level of government support for data driven innovations. Collectively, current technological capabilities, the availability of diverse health data, and the political will to permit the use of data to advance research and innovation all align to help generate significant health outcomes, which in turn should reduce the long-term demand on healthcare systems if the disease burden of chronic illnesses can be addressed.

However, despite the range of data sources available to help better understand different diseases, there is still an aversion to permitting access to and use of data, preventing available information from being integrated in a meaningful and secure manner for research and innovation purposes.22 Oftentimes, health innovations require access to data that is considered deeply personal and private to individuals and that is guarded with assurances of confidentiality to engender trust and a sense of security.23 The lack of public trust, real or perceived, has proven to be an ongoing barrier leading to the under-utilization of existing data to realize the potential benefits of data-driven innovations.24 Examples in popular media of misuse by industry of personal health data, as well as security breaches by third parties to gain access to private and personal health data, continue to instill fear and concern in the public.25 The question is what drives this mistrust.

Without the ability to integrate and extract knowledge from available data, the potential benefits directly and indirectly associated with data-driven innovations cannot be realized. From personalized medicine to AI-enabled treatment protocols to developing value-based outcomes in health and care, all require access to data from patients and the public. Current computational abilities mean there is significant capacity to process a substantial amount of data, but how to make the most use of which type and what combination of data to gain the necessary insights to drive innovation remains a highly competitive field. Both tangible and knowledge innovations derived from data can impact decision-making, leading to improved health outcomes and healthcare efficiency to achieve the greatest value from available resources.26 In addition to the mistrust, there is the further practical and technical challenge of identifying and curating the applicable types of data from multiple sources and integrating them in a meaningful and useable manner for research and development purposes.27 However, without a clear path to respond to the psychosocial barriers, all other practical efforts to address the downstream challenges become moot if this particular bottleneck is not properly addressed. The significant potential of optimizing the use of data to drive scientific understanding, purposive innovation, and economic growth calls attention to the lack of legal and policy coherence on the protection and use of data that continues to plague the innovation ecosystem.28

17. Kaye and Hawkins (2014), pp. 1–8.
18. Dias and Duarte (2015), pp. 230–236.
19. See, e.g., Den Broeder et al. (2018), pp. 505–514; Ciasullo et al. (2022), pp. 365–392; Heyen et al. (2022), pp. 1–24.
20. Dunn et al. (2018), pp. 429–448.
21. Piwek et al. (2016), pp. 1–9.
22. See, e.g., Saura et al. (2021), pp. 1–13; Pisani and AbouZahr (2010), pp. 462–466; Van Panhuis et al. (2014), pp. 1–9.

3 European Data Strategy—the Legal Approach

Concerns regarding the privacy and ownership of personal health data have been explored extensively in the literature.29 Policies, legal regulations, and guidelines have been proposed and adopted to address public concerns while trying to encourage private interests in developing and commercializing data driven innovations.30 For example, the European Data Strategy explicitly recognizes the need to "creat[e] a single market for data [to] allow it to flow freely within the EU and across sectors for the benefit of businesses, researchers and public administrations".31 Furthermore, both the proposed Data Act and the Data Governance Act (which came into force in June 2022) aim to facilitate the use and sharing of data across all sectors in the EU, while providing a governance structure to assure the public that their interests and values are reflected in a legal framework on developing trustworthy data sharing platforms for the benefit of society and businesses.32

Despite the laudable aims and intent of the European Data Strategy and associated legal frameworks, the practical and pragmatic question is whether legal frameworks such as the Data Governance Act and the proposed Data Act will translate into practices that help overcome the psychosocial barrier to optimizing the use of health related data. The literature on the use of rights and laws to create a sense of accountability, control, and government responsiveness to public concerns points to the challenges of translating the aims and objectives of the law into implementable strategies to achieve social change.33 While legal and governance frameworks may set out the rights and responsibilities of different societal actors and the associated consequences of not playing by the rules, societal accountability focuses on empowering the public to engage and be part of the process of defining the "rules of engagement" to avoid having to reactively seek justice and legal redress afterwards.34

What the literature and research have not adequately explored to date is how to address patient and public concerns based on their understanding of and experiences with the relevant legal frameworks. How the law takes effect in the real world significantly impacts actual behavior, despite the express intent or willingness, albeit conceptually, of patients and the public to share and operationalize data.35 The misalignment between what the law intends and how the law is put into action and experienced by the public36 is an overlooked area of research for developing solutions that are not only legally sound but also responsive to the needs and perceptions of patients and the public. The regulations governing the use of data from a legal standpoint may not be in step with the public's perceptions and understanding of what their rights are with respect to their data, which has a direct effect on trust and perceived risk.37

23. Hemerly (2013), pp. 25–31.
24. Hemingway et al. (2018), pp. 1481–1495.
25. See, e.g., Seh et al. (2020), p. 133.
26. NEJM Catalyst (2018).
27. Hemerly (2013), pp. 25–31.
28. European Commission (2018).
29. See, e.g., Appari and Johnson (2010), pp. 279–314; Malin et al. (2013), pp. 2–6; Haas et al. (2011), pp. 26–31; Kruse et al. (2017), pp. 1–9; Chhanabhai and Holt (2007), p. 8; Barrows and Clayton (1996), pp. 139–148; OECD (2021).
30. See, e.g., European Commission (2020); European Commission (2017); Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation); Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC, PE/51/2019/REV/1; Proposal for a Regulation of the European Parliament and of the Council on harmonised rules on fair access to and use of data (Data Act) COM/2022/68 final.
31. European Commission (2022a).
32. European Commission (2022b).
33. Johnston (2006), pp. 1–32; Hickey and King (2016), pp. 1225–1240.
34. Joshi (2017), pp. 160–172; Boydell et al. (2019), pp. 1–6.
35. Stockdale et al. (2018), pp. 1–25; Howe et al. (2018), pp. 123–133.
36. Halperin (2011), pp. 45–76.
37. Visschers and Siegrist (2008), pp. 156–167.

More specifically, how do patients and the public understand and experience concepts found in legal and policy instruments related to operationalizing data, such as "data altruism," "responsibility," "privacy-by-design," and "openness"? By understanding and identifying the types of risks and benefits that contribute to the hesitance and apprehension, there is a basis to develop responsive legal tools that precisely address specific pressure points of concern, which in turn helps garner and build ongoing trust with patients and the public for future innovations.

4 Citizen Science and RRI

Broadly, the concept of citizen science has been understood in the literature as public participation in the scientific process to develop socially relevant and acceptable innovations.38 Citizen science in the context of RRI is intended to empower the public, as the intended consumers of new innovations, through participation in the shaping of research agendas to ensure that resulting innovations reflect the values of society.39 The objective is to develop a shared responsibility amongst all stakeholders through meaningful and responsive dialogue to collectively inform decision making on research processes and outcomes.40 The premise is that research done responsibly through meaningful engagement with different stakeholders, including patients and the public, will ensure that the resulting innovations from the research process incorporate social considerations, values, and interests to meet societal needs.41 The concept of citizen science within the RRI framework envisions a mutually responsive participatory approach between stakeholders to ensure that resulting innovations are "ethically acceptable, socially desirable, and sustainable."42

The main challenge and criticism of RRI continue to revolve around the issue of how to implement the principles of RRI to achieve the desired alignment of research and innovation outcomes with societal values.43 "Stakeholder engagement" and creating opportunities for "meaningful participation by the public" often describe the types of activities that are incorporated into the research and innovation process to demonstrate that a project has adopted RRI practices. However, the perceived lack of influence by stakeholder groups such as patients and the public on agenda setting and the research and innovation process is in part why citizen science, when done in a superficial manner, may lead to a sense of mistrust as opposed to a greater alignment with societal needs.44 RRI describes meaningful stakeholder engagement as being inclusive, reflective, responsive, adaptive, and open and transparent.45 In other words, RRI also contemplates societal readiness as part of the social desirability and ethical acceptability of whether new innovations will be successfully adopted as a solution to a societal problem.46

It is recognized in the literature that human behavior plays an important role in the acceptability of technological solutions, suggesting that positive change can be facilitated by aligning and responding to human interactions with artefacts.47 According to the technology acceptance model, individual acceptance and use of new innovations, as well as a willingness to participate in the innovation process, depend on psychosocial factors such as perceived security, benefit, and ease of use.48 In the context of operationalizing data for health innovation, because the contribution and use of data raise legal questions associated with privacy, data protection, IP, data attribution, and data security, as well as concerns relating to transparency, accountability, and responsibility, all these factors collectively manifest as barriers to adopting data-driven solutions.49 By identifying and managing the real and perceived risks of stakeholders through meaningful dialogue, trust and confidence can be gained to foster behavior change.50

Citizen science has been linked to the concept of "deliberative democracy" in the literature as a way to include and recognize the public voice as a collective perspective arising from a deliberative process and mutual understanding, as opposed to an aggregate of individual responses.51 The deliberative process can be adapted to suit different purposes and stakeholder groups, and it has been recognized as a beneficial method to consult with the public in an inclusive manner on complex or value-based matters that involve compromises to come up with long term solutions.52 Research and evidence demonstrate deliberative democracy to be an effective and sound way to engage the public, though not without the practical challenges of organizing and implementing a well-informed and balanced deliberation forum.53 Although deliberation takes time, the process is expected to produce more sustainable and enduring results and help build legitimacy and trust for the development of future innovations.54 Most importantly, adopting a deliberative process will help facilitate an informed dialogue on key pressure points. Accountability, fairness and permissible use of data to advance public health interests need to be balanced against the private interests of industry to ensure the willingness of different stakeholders to engage in collaborative R&I efforts.55 Because of the potential for conflicting objectives and priorities among stakeholders, adopting an inclusive RRI approach to align research objectives, processes, and outcomes with stakeholder needs and societal expectations will increase the likelihood of generating socially and economically acceptable results for effective societal uptake.56

38. Strasser et al. (2019), pp. 52–76.
39. RRI Tools, https://www.rri-tools.eu/about-rri. Accessed 4 September 2022.
40. Macq et al. (2020), pp. 489–512.
41. Zwart et al. (2014), p. 11.
42. Von Schomberg and Blok (2021), pp. 309–323.
43. See, e.g., Blok and Lemmens (2015), pp. 19–35; Åm (2019), pp. 163–178; Owen and Pansera (2019), pp. 26–48.
44. Smallman (2018), pp. 241–253.
45. Stilgoe et al. (2013), pp. 1568–1580.
46. Novitzky et al. (2020), pp. 39–41.
47. Brown and Wyatt (2010), pp. 29–43; Roberts et al. (2016), pp. 11–14.
48. Park and Kim (2014), pp. 376–385; Jelsma (2006).
49. Miller and Sim (2004), pp. 116–126.
50. Jakku et al. (2019), pp. 1–13; Holden and Karsh (2009), pp. 21–38.
51. Dryzek et al. (2019), pp. 1144–1146.
52. OECD (2020).
53. Bächtiger et al. (2018).
54. Ercan et al. (2019), pp. 19–36.
55. Von Schomberg (2013), pp. 51–74.

5 Discussion

Patient-centered approaches are often discussed alongside citizen science in the context of health innovation. Both concepts have obvious overlaps with respect to recognizing patient preferences, values, and interests. The traditional model of healthcare has often been criticized for being too paternalistic, and there are calls for it to give way to a more collaborative approach in which patients share in the decision making about their own health.57 However, there is literature that questions whether patient-centered approaches, in which patient involvement rises to the level of acting as partners alongside scientifically trained researchers, clinicians and healthcare providers to determine the best treatment pathway, actually create better health outcomes.58 The literature even warns of "tokenism in patient engagement"59 and the dangers of masking the traditional paternalistic approach with superficial and disingenuous engagement initiatives, which would likely erode trust and discourage participation in future genuine efforts to include the public.60

In order for deliberative democracy to work in the context of citizen science on facilitating the use of data to drive health innovations, there is a need to understand, reflect, and respond with legally supported frameworks to build trust and accountability. Without the appropriate legal and policy framework to support the use of data for research and innovation purposes, there will always be implementation challenges, no matter how scientifically, socially, and economically sound the objectives are for the use of data.61

The proactive approach of RRI to critically assess the broader impact of innovation on society in a responsive and reflexive manner makes the RRI framework particularly relevant to the healthcare context as a way to align the values, needs and expectations of relevant stakeholders in a holistic and balanced manner.62 However, in practice, the notion of alignment is often confounded with agreement, making the challenge of imparting the core values and reasoning of RRI and social impact into traditional science and technology research practices even more difficult to implement.63 The translation of policy into practice requires the reassembling of policy ideas into context specific settings.64 By adopting an RRI approach to engage patients and the public to better understand how best to address their concerns through shared responsibility, a holistic innovation framework that incentivizes collaborative participation can be developed.65

Translating citizen science into practice can be challenging if steps are not proactively taken to identify and create the conditions necessary to adopt engagement practices that reflect and acknowledge the specific dynamics, culture, and concerns of the R&I context.66 For example, by instilling a sense of ownership, control, and self-determination of what RRI and citizen science mean in a specific research environment, there is a greater likelihood that an RRI model co-created by the stakeholders through an inclusive dialogue to better understand the reasoning and objectives underlying engagement will be successfully adopted. To achieve this, the generally accepted practices of stakeholder engagement activities, such as focus groups, interdisciplinary involvement, public outreach, and semi-structured interviews, remain the same. Although the literature suggests these alone do not lead to successful implementation of science society policies,67 the background training and education leading to and in support of the engagement activities, aimed at contextualizing the core values and principles of RRI, will create a common basis for stakeholders to engage in meaningful and responsive dialogue.

The responsiveness of a more deliberative model of engagement is key to earning the trust that there will be accountability and a genuinely inclusive approach to citizen science. For example, creating conditions of meaningful participation that contemplate a process to modify and adapt structures, if necessary, to respond to changing circumstances would significantly enhance accountability. There is a risk that engagement activities become token gestures, leaving patients feeling unheard and let down if there is no follow through or explanation of how their efforts ultimately contributed to the process. If researchers are perceived to treat stakeholder engagement as merely a box-checking exercise and misrepresent the deliberative democracy purpose of citizen science, patients and the public will not tolerate the dishonesty and will avoid future attempts at engagement if they do not feel valued or heard. A genuine dialogue, followed up by re-engagement to demonstrate how patient feedback has been incorporated, along with a meaningful examination of whether the responsive action taken has addressed the patients' concerns, demonstrates good faith and a sincere attempt to identify, understand and respond to the root cause underlying the hesitance to operationalize data.

From this thoughtful effort, there is an informed basis to design, develop, and implement policy and legal structures that are responsive to stakeholder interests and concerns. This in turn will foster greater trust, accountability, and willingness to collaborate. The use of legal assurances and mechanisms introduced by way of a contract to govern the use of data could be a solution to reflect and safeguard shared interests and concerns.

56. Irwin (2006), pp. 299–320.
57. World Health Organization (2020a, b).
58. Shay and Lafata (2015), pp. 114–131; Elwyn et al. (2015), pp. 1–10.
59. Hahn et al. (2017), pp. 290–295.
60. Frahm et al. (2022), pp. 174–216; Domecq et al. (2014), pp. 1–9.
61. Irwin (2001), pp. 1–18; van Oudheusden (2014), pp. 67–86; Owen et al. (2021), pp. 217–233.
62. Yu (2016), pp. 611–635; Lehoux et al. (2008), pp. 251–254.
63. Ribeiro et al. (2018), pp. 316–331.
64. Griggs et al. (2014).
65. Yu (2016), pp. 611–635; Lehoux et al. (2008), pp. 251–254.
66. Bajmócy and Pataki (2019).
67. Hartley et al. (2017), pp. 361–377.

This is where hard law plays a key role in creating the infrastructure necessary to support, incentivize, and hold stakeholders accountable to do their share in the responsible use of data to drive health innovations. The reluctance to share and operationalize the use of existing data can only be understood and solved through meaningful engagement with the relevant parties that are the source of, or have control over, the data. Based on the principles of RRI, if there is a mutual understanding by all stakeholders of both "law according to the books" and "law in reality," there is an informed basis to articulate and shape legally supported data sharing agreements to better align RRI principles with stakeholder interests and concerns.68 For example, patients and the public have expressed concerns related to privacy and what can be done with their data in the event of a security breach.69 They are also concerned about the commercialization of their data or relinquishing interests and entitlements to benefit should the use of data lead to some commercializable success.70 There is also confusion between IP rights and the broadly accepted policy mandate for data sharing, open science, and open access to support the advancement of scientific research and the free exchange of knowledge. Data is commonly recognized as proprietary information that has significant potential value.71 Because of this gap in knowledge and limited access to resources, patients and guardians of valuable data are often paralyzed into inaction and opt for the safest option, which is to withhold consent or not permit access to data under their control.72

By involving and consulting with patients and the public at the outset, there is an opportunity to identify and characterize the observed hindrances and opportunities from the perspective of the relevant stakeholders, and to gain an informed understanding of how each stakeholder group perceives the role of the laws and regulations that impact the use and sharing of existing data. This type of engagement will allow for a better alignment of objectives with the expectations of stakeholders. Legal mechanisms can be used to support what RRI initiatives want to achieve by defining the "rules of engagement" to safeguard the interests and concerns underlying the reluctance to operationalize the use of available data. The law can be a useful tool to help strike a balance and define a legally protected framework within which stakeholders can reach an understanding on what conditions need to be in place to continue to function within the new norm. There is an ability to tailor contracts to clarify rights and responsibilities while establishing expectations to specifically tackle the concerns that give rise to the psychosocial barriers. Without a bespoke legal framework to alleviate real and perceived challenges, data driven innovations will encounter challenges no matter how sound the research may theoretically be. The sustainable implementation of RRI practices requires the law to de-risk and incentivize the partnership between stakeholders to achieve RRI outcomes, capturing the value of citizen science and stakeholder engagement while protecting against free-riding and unauthorized use of the data.

68. Hartley et al. (2017), pp. 361–377.
69. Van Panhuis et al. (2014), pp. 1–9.
70. Andanda (2013), pp. 140–177.
71. Günther et al. (2017), pp. 191–209.
72. Hulsen (2020), p. 3046.


6 Conclusion

Science and technology alone are not sufficient to address the health challenges we currently face. The responsible use of health data for research and innovation purposes requires a thoughtful and responsive framework to address the express concerns of patients and the public with respect to operationalizing and optimizing the use of their data. The engagement and inclusion of patients and the public in developing such a framework is often neglected in practice and in the literature. As healthcare costs continue to rise globally due to a growing and aging population, the potential of successfully developing technologies and innovation in health promises to provide much needed relief from budgetary pressures and public demand for better health and care outcomes.

The expression "it takes a village" aptly applies to the challenge of operationalizing the use of data for health innovations. As discussed, the inclination reported by the public to share and use their data for research and innovation purposes should not be mistakenly interpreted to mean that active measures to allay their concerns are not needed in order to garner their consent and collaboration. A genuine RRI approach to citizen science needs to be adopted to support the necessary behavior and attitude change in order to drive innovation in a sustainable manner—that is, to ensure the R&I activities society invests time, effort, and resources into developing stand a better than average chance of being adopted as long-term solutions. This is where legal mechanisms can be leveraged to specifically address the concerns of the public. Committing to the creation of a responsive and legally supported framework will not only garner trust, credibility and accountability but will also ensure that stakeholder interests and concerns are reflected and taken into account as part of the R&I process. The law can be used to provide legally supported and incentivized ways to help implement the shared "responsible" participation of stakeholders to facilitate the use of health data to drive innovation.

Acknowledgements This research was supported by a Novo Nordisk Foundation grant for a scientifically independent Collaborative Research Program in Biomedical Innovation Law (grant agreement number NNF17SA0027784) and by a grant from the Center for Digital Life Norway and the Research Council of Norway (grant agreement number 294594).

References

Abegunde DO, Mathers CD, Adam T, Ortegon M, Strong K (2007) The burden and costs of chronic diseases in low-income and middle-income countries. Lancet 370(9603):1929–1938
Adibuzzaman M, DeLaurentis P, Hill J, Benneyworth BD (2017) Big data in healthcare–the promises, challenges and opportunities from a research perspective: a case study with a model database. In: AMIA annual symposium proceedings. p 384
Åm H (2019) Limits of decentered governance in science-society policies. J Responsible Innov 6(2):163–178
Andanda P (2013) Managing intellectual property rights over clinical trial data to promote access and benefit sharing in public health. IIC-Int Rev Intellect Prop Compet Law 44(2):140–177
Appari A, Johnson ME (2010) Information security and privacy in healthcare: current state of research. Int J Internet Enterp Manag 6(4):279–314


Bächtiger A, Dryzek JS, Mansbridge J, Warren ME (eds) (2018) The oxford handbook of deliberative democracy. Oxford University Press, Oxford Bajmócy Z, Pataki G (2019) Responsible research and innovation and the challenges of co-creation. In: Bammé A, Getzinger G (eds) Yearbook 2018 of the institute for advanced studies on science, technology and society. Profil Verlag, München–Wien Barrows RC Jr, Clayton PD (1996) Privacy, confidentiality, and electronic medical records. J Am Med Inform Assoc 3(2):139–148 Bax R et al (2001) Surveillance of antimicrobial resistance—what, how and whither? Clin Microbiol Infect 7(6):316–325 Blok V, Lemmens P (2015) The emerging concept of responsible innovation. Three reasons why it is questionable and calls for a radical transformation of the concept of innovation. In: Responsible innovation 2: concepts, approaches, and applications. pp 19–35 Bonander J, Gates S (2010) Public health in an era of personal health records: opportunities for innovation and new partnerships. J Med Internet Res 12(3):e1346 Boydell V, McMullen H, Cordero J, Steyn P, Kiare J (2019) Studying social accountability in the context of health system strengthening: innovations and considerations for future work. Health Res Policy Syst 17(34):1–6 Brown T, Wyatt J (2010) Design thinking for social innovation. Dev Outreach 12(1):29–43 Chhanabhai P, Holt A (2007) Consumers are ready to accept the transition to online and electronic records if they can be assured of the security measures. Medscape Gen Med 9(1):8 Ciasullo MV, Carli M, Lim WM, Palumbo R (2022) An open innovation approach to co-produce scientific knowledge: an examination of citizen science in the healthcare ecosystem. Eur J Innov Manag 25(6):365–392 Coorevits P, Sundgren M, Klein GO, Bahr A, Claerhout B, Daniel C, Dugas M, Dupont D, Schmidt A, Singleton P, De Moor G (2013) Electronic health records: new opportunities for clinical research. J Intern Med 274(6):547–560 Den Broeder L, Devilee J, Van Oers H, Schuit AJ, Wagemakers A (2018) Citizen Science for public health. Health Promot Int 33(3):505–514 Dias JA, Duarte P (2015) Big data opportunities in healthcare. How can medical affairs contribute. Rev Port Farmacoter 7(4):230–236 Domecq JP, Prutsky G, Elraiyah T, Wang Z, Nabhan M, Shippee N, Brito JP, Boehmer K, Hasan R, Firwana B, Erwin P (2014) Patient engagement in research: a systematic review. BMC Health Serv Res 14(1):1–9 Dryzek JS, Bächtiger A, Chambers S, Cohen J, Druckman JN, Felicetti A, Fishkin JS, Farrell DM, Fung A, Gutmann A, Landemore H (2019) The crisis of democracy and the science of deliberation. Science 363(6432):1144–1146 Dunn J, Runge R, Snyder M (2018) Wearables and the medical revolution. Pers Med 15(5):429–448 Elwyn G, Frosch DL, Kobrin S (2015) Implementing shared decision-making: consider all the consequences. Implement Sci 11(1):1–10 Ercan SA, Hendriks CM, Dryzek JS (2019) Public deliberation in an era of communicative plenty. Policy Polit 47(1):19–36 European Commission (2017) Building a European data economy. Brussels 10.1.2017 COM (2017) 9 final. European Commission (2018) Study on emerging issues of data ownership, interoperability, (re-) usability and access to data, and liability—final report European Commission (2020) European data strategy. Brussels, 19.2.2020 COM (2020) 66 final European Commission (2022a) European data strategy. https://commission.europa.eu/strategy-andpolicy/priorities-2019-2024/europe-fit-digital-age/european-data-strategy_en. 
Accessed 27 Dec 2022
European Commission (2022b) European data governance act. https://digital-strategy.ec.europa.eu/en/policies/data-governance-act. Accessed 27 Dec 2022
Florent V (2020) Global health: time for radical change. Lancet 396(874):1129


Frahm N, Doezema T, Pfotenhauer S (2022) Fixing technology with society: the coproduction of democratic deficits and responsible innovation at the OECD and the European commission. Sci Technol Human Values 47(1):174–216 Goldman DP, Shang B, Bhattacharya J, Garber AM, Hurd M, Joyce GF, Lakdawalla DN, Panis C, Shekelle PG (2005) Consequences of health trends and medical innovation for the future elderly: when demographic trends temper the optimism of biomedical advances, how will tomorrow’s elderly fare? Health Aff 24(Suppl2):W5-R5 Griggs S, Norval A, Wagenaar H (2014) Practices of freedom. Decentred governance, conflict and democratic participation. Cambridge University Press, Cambridge Groves P, Kayyali B, Knott D, Kuiken SV (2016) The ‘big data’ revolution in healthcare: accelerating value and innovation. McKinsey & Company, Center for US Health System Reform Business Technology Office, pp 1–20 Günther WA, Mehrizi MHR, Huysman M, Feldberg F (2017) Debating big data: a literature review on realizing value from big data. J Strateg Inf Syst 26(3):191–209 Haas S, Wohlgemuth S, Echizen I, Sonehara N, Müller G (2011) Aspects of privacy for electronic health records. Int J Med Inf 80(2):e26–e31 Hahn DL, Hoffmann AE, Felzien M, LeMaster JW, Xu J, Fagnan LJ (2017) Tokenism in patient engagement. Fam Pract 34(3):290–295 Halperin JL (2011) Law in books and law in action: the problem of legal change. Maine Law Rev 64(1):45–76 Hartley S, Pierce W, Taylor A (2017) Against the tide of depoliticisation: the politics of research governance. Policy Polit 45(3):361–377 Hemerly J (2013) Public policy considerations for data-driven innovation. Computer 46(6):25–31 Hemingway H, Asselbergs FW, Danesh J, Dobson R, Maniadakis N, Maggioni A, Van Thiel GJ, Cronin M, Brobert G, Vardas P, Anker SD (2018) Big data from electronic health records for early and late translational cardiovascular research: challenges and potential. Eur Heart J 39(16):1481–1495 Heyen NB, Gardecki J, Eidt-Koch D, Schlangen M, Pauly S, Eickmeier O, Wagner T, Bratan T (2022) Patient science: citizen science involving chronically ill people as co-researchers. J Particip Res Methods 3(1):1–24 Hickey S, King S (2016) Understanding social accountability: politics, power and building new social contracts. J Dev Stud 52(8):1225–1240 Holden RJ, Karsh BT (2009) A theoretical model of health information technology usage behaviour with implications for patient safety. Behav Inf Technol 28(1):21–38 Houe H, Nielsen SS, Nielsen LR, Ethelberg S, Mølbak K (2019) Opportunities for improved disease surveillance and control by use of integrated data on animal and human health. Front Vet Sci 6(301):1–8 Howe N, Giles E, Newbury-Birch D, McColl E (2018) Systematic review of participants’ attitudes towards data sharing: a thematic synthesis. J Health Serv Res Policy 23(2):123–133 Hulsen T (2020) Sharing is caring—data sharing initiatives in healthcare. Int J Environ Res Public Health 17(9):3046 Irwin A (2001) Constructing the scientific citizen: Science and democracy in the biosciences. Public Underst Sci 10(1):1–18 Irwin A (2006) The politics of talk: coming to terms with the “new” scientific governance. Soc Stud Sci 36(2):299–320 Issa NT, Byers SW, Dakshanamurthy S (2014) Big data: the next frontier for innovation in therapeutics and healthcare. 
Expert Rev Clin Pharmacol 7(3):293–298
Jakku E, Taylor B, Fleming A, Mason C, Fielke S, Sounness C, Thorburn P (2019) "If they don't tell us what they do with it, why would we trust them?" Trust, transparency and benefit-sharing in smart farming. NJAS—Wagening J Life Sci 90:1–13
Jelsma J (2006) Designing 'Moralized' products. In: Verbeek PP, Slob A (eds) User behavior and technology development: shaping sustainable relations between consumers and technologies. Springer, Berlin


Jirotka M, Procter R, Hartswood M, Slack R, Simpson A, Coopmans C, Hinds C, Voss A (2005) Collaboration and trust in healthcare innovation: The eDiaMoND case study. Comput Support Coop Work (CSCW) 14(4):369–398
Johnston M (2006) Good governance: rule of law, transparency, and accountability. United Nations Public Administration Network, New York, pp 1–32
Joshi A (2017) Legal empowerment and social accountability: Complementary strategies toward rights-based development in health? World Dev 99:160–172
Kalra D, Stroetmann V, Sundgren M, Dupont D, Schlünder I, Thienpont G, Coorevits P, De Moor G (2017) The European institute for innovation through health data. Learn Health Syst 1(1):1–8
Kaye J, Hawkins N (2014) Data sharing policy design for consortia: challenges for sustainability. Genome Med 6(1):1–8
Kruse CS, Smith B, Vanderlinden H, Nealand A (2017) Security techniques for the electronic health records. J Med Syst 41(8):1–9
Lehoux P, Williams-Jones B, Miller F, Urbach D, Tailliez S (2008) What leads to better health care innovation? Arguments for an integrated policy-oriented research agenda. J Health Serv Res Policy 13(4):251–254
Macq H, Tancoigne É, Strasser BJ (2020) From deliberation to production: public participation in science and technology policies of the European Commission (1998–2019). Minerva 58(4):489–512
Malin BA, Emam KE, O'Keefe CM (2013) Biomedical data privacy: problems, perspectives, and recent advances. J Am Med Inform Assoc 20(1):2–6
Miller RH, Sim I (2004) Physicians' use of electronic medical records: barriers and solutions. Health Aff 23(2):116–126
Murray CJ, Abbafati C, Abbas KM, Abbasi M, Abbasi-Kangevari M, Abd-Allah F, Abdollahi M, Abedi P, Abedi A, Abolhassani H, Aboyans V (2020) Five insights from the global burden of disease study 2019. Lancet 396(10258):1135–1159
NEJM Catalyst (2018) Healthcare big data and the promise of value-based care. NEJM Catalyst 4(1)
Novitzky P, Bernstein MJ, Blok V, Braun R, Chan TT, Lamers W, Loeber A, Meijer I, Lindner R, Griessler E (2020) Improve alignment of research policy and societal values. Science 369(6499):39–41
Organization for Economic Cooperation and Development (2021) Report on the implementation of the recommendation of the council concerning guidelines governing the protection of privacy and transborder flows of personal data, C 42. https://one.oecd.org/document/C(2021)42/en/pdf. Accessed 28 Dec 2022
O'Neill J (2016) Tackling drug-resistant infections globally: final report and recommendations. https://amr-review.org/sites/default/files/160518_Final%20paper_with%20cover.pdf. Accessed 21 Dec 2022
Organization for Economic Cooperation and Development (2020) Innovative citizen participation and new democratic institutions: catching the deliberative wave. https://www.oecd-ilibrary.org/sites/339306da-en/index.html?itemId=/content/publication/339306da-en. Accessed 4 Jan 2023
Owen R, von Schomberg R, Macnaghten P (2021) An unfinished journey? Reflections on a decade of responsible research and innovation. J Responsible Innov 8(2):217–233
Owen R, Pansera M (2019) Responsible innovation and responsible research and innovation. Handbook on science and public policy. pp 26–48
Park E, Kim KJ (2014) An integrated adoption model of mobile cloud services: exploration of key determinants and extension of technology acceptance model. Telemat Inform 31(3):376–385
Pisani E, AbouZahr C (2010) Sharing health data: good intentions are not enough.
Bull World Health Organ 88(6):462–466 Piwek L, Ellis DA, Andrews S, Joinson A (2016) The rise of consumer health wearables: promises and barriers. PLoS Med 13(2):1–9 Proksch D, Busch-Casler J, Haberstroh MM, Pinkwart A (2019) National health innovation systems: clustering the OECD countries by innovative output in healthcare using a multi indicator approach. Res Policy 48(1):169–179


Ribeiro B, Bengtsson L, Benneworth P, Bührer S, Castro-Martínez E, Hansen M, Jarmai K, Lindner R, Olmos-Peñuela J, Ott C, Shapira P (2018) Introducing the dilemma of societal alignment for inclusive and responsible research and innovation. J Responsible Innov 5(3):316–331
Roberts JP, Fisher TR, Trowbridge MJ, Bent C (2016) A design thinking framework for healthcare management and innovation. Healthcare 4(1):11–14
Saura JR, Ribeiro-Soriano D, Palacios-Marqués D (2021) From user-generated data to data-driven innovation: a research agenda to understand user privacy in digital markets. Int J Inf Manage 60:1–13
Seh AH, Zarour M, Alenezi M, Sarkar AK, Agrawal A, Kumar R, Ahmad Khan R (2020) Healthcare data breaches: insights and implications. Healthcare 8(2):133
Shay LA, Lafata JE (2015) Where is the evidence? A systematic review of shared decision making and patient outcomes. Med Decis Making 35(1):114–131
Smallman M (2018) Citizen science and responsible research and innovation. UCL Press, pp 241–253
Stilgoe J, Owen R, Macnaghten P (2013) Developing a framework for responsible innovation. Res Policy 42(9):1568–1580
Stockdale J, Cassell J, Ford E (2018) "Giving something back": a systematic review and ethical enquiry into public views on the use of patient data for research in the United Kingdom and the Republic of Ireland. Wellcome Open Res 3(6):1–25
Strasser B, Baudry J, Mahr D, Sanchez G, Tancoigne E (2019) "Citizen science"? Rethinking science and public participation. Sci Technol Stud 32:52–76
Thakur R, Hsu SH, Fontenot G (2012) Innovation in healthcare: Issues and future trends. J Bus Res 65(4):562–569
Van Oudheusden M (2014) Where are the politics in responsible innovation? European governance, technology assessments, and beyond. J Responsible Innov 1(1):67–86
Van Panhuis WG, Paul P, Emerson C, Grefenstette J, Wilder R, Herbst AJ, Heymann D, Burke DS (2014) A systematic review of barriers to data sharing in public health. BMC Public Health 14(1):1–9
Visschers VH, Siegrist M (2008) Exploring the triangular relationship between trust, affect, and risk perception: a review of the literature. Risk Manag 10(3):156–167
Von Schomberg R (2013) A vision of responsible research and innovation. In: Owen R, Bessant J, Heintz M (eds) Responsible innovation: managing the responsible emergence of science and innovation in society. Wiley, London, pp 51–74
Von Schomberg L, Blok V (2021) Technology in the age of innovation: responsible innovation as a new subdomain within the philosophy of technology. Philos Technol 34:309–323
Vos T, Lim SS, Abbafati C, Abbas KM, Abbasi M, Abbasifard M, Abbasi-Kangevari M, Abbastabar H, Abd-Allah F, Abdelalim A, Abdollahi M (2020) Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the global burden of disease study 2019. Lancet 396(10258):1204–1222
World Health Organization (2015) Global action plan on antimicrobial resistance. https://www.who.int/antimicrobial-resistance/global-action-plan/en/. Accessed 21 Aug 2022
World Health Organization (2020b) Global antimicrobial resistance and use surveillance system (GLASS) report. https://apps.who.int/iris/bitstream/handle/10665/332081/9789240005587-eng.pdf. Accessed 21 Aug 2022
World Health Organization (2020a) Challenges to tackling antimicrobial resistance: economic and policy responses. OECD Publishing
World Health Organization (2022) World Health Organization Strategy (2022–2026) for the national action plan for health security. https://www.who.int/publications/i/item/9789240061552. Accessed 10 Dec 2022
Yu H (2016) Redefining responsible research and innovation for the advancement of biobanking and biomedical research. J Law Biosci 3(3):611–635
Zwart H, Landeweerd L, van Rooij A (2014) Adapt or perish? Assessing the recent shift in the European research funding arena from 'ELSA' to 'RRI.' Life Sci Soc Policy 10(1):11

Dobbs in a Technologized World: Implications for US Data Privacy

Jheel Gosain, Jason D. Keune, and Michael S. Sinha

Abstract In June 2022, the U.S. Supreme Court issued its opinion in Dobbs v. Jackson Women's Health Organization, overturning 50 years of precedent by eliminating the federal constitutional right to abortion care established by the Court's 1973 decision in Roe v. Wade. The Dobbs decision leaves decisions about abortion services in the hands of the states, which immediately created a variegated checkerboard of access to women's healthcare across the country. This, in turn, laid bare a profusion of privacy issues that emanate from our technologized world. Here, we review these privacy issues, including healthcare data, financial data, website tracking, and social media. We then offer potential future legislative and regulatory pathways that balance privacy with law enforcement goals in women's health and in any domain that shares this structural feature.

Keywords Abortion · Data privacy · Dobbs · Reproductive rights · Supreme Court · United States

1 Introduction

On June 24, 2022, the U.S. Supreme Court issued its opinion in Dobbs v. Jackson Women's Health Organization, overturning nearly 50 years of precedent established by the Court's 1973 decision in Roe v. Wade.1 By eliminating a federal constitutional right to abortion care, Dobbs effectively returned the decision to the states.2

J. Gosain · J. D. Keune · M. S. Sinha (B)
Saint Louis University School of Law, Saint Louis, MO, USA
J. D. Keune
Albert Gnaegi Center for Health Care Ethics, Saint Louis University, Saint Louis, MO, USA
M. S. Sinha
Center for Health Law Studies, Saint Louis University School of Law, 100 N. Tucker Blvd., 63101 Saint Louis, MO, USA

1. Dobbs v. Jackson Women's Health Organization (2023).

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_6


right to abortion care, Dobbs effectively reverted the decision to the states.2 A sea change in national abortion policy occurred almost immediately. Though some states took action to offer greater protection for abortion services, several other states passed laws going as far as to ban all abortion care while criminalizing anyone who aids or abets the process—including physicians.3 In states like Texas, ordinary citizens are now empowered to surveil pregnant persons through the provision of bounties in exchange for information that leads to prosecution.4 Social media conversations have led to criminal charges.5 People using period-tracking or other fertility mobile device applications have rushed to delete the programs, though others counter that their data will remain accessible to law enforcement officials in other ways.6 These instances are not isolated and have raised valid concerns about the extent to which our private data can be misused for malicious purposes.7 Over the last few decades, as technology and social media have evolved, privacy protections have lagged behind.8 This is especially true for healthcare privacy, and this has concerning implications for reproductive privacy post-Dobbs.9 The Health Insurance Portability and Accountability Act of 1996 (HIPAA), which applies to protected health information (PHI) held within certain “covered entities,” also extends to common means of patient access to electronic protected health information (ePHI), such as patient portals.10 However, the tools used to access those portals, such as web browsers and mobile devices, may not have sufficient privacy protections in place.11 Although HIPAA may provide some protection for PHI and ePHI, the law contains broad exceptions for law enforcement access, which can be problematic when state law enforcement officials seek access to health records.12 In January 2023, two California Congresswomen introduced the Secure Access for Essential Reproductive 2

2 Dobbs v. Jackson Women's Health Organization (2023). It is worth noting that some Republicans want a federal ban on abortion, including U.S. Senator Lindsey Graham (R-SC). See Graham (2022).
3 Guttmacher Institute (2023).
4 Texas Heartbeat Act (2021).
5 Kaste (2022).
6 Hill (2022); see also Harwell (2019); see also Prince (2023), pp. 1085–1086 (describing the "reproductive health data ecosystem").
7 Cohen (2022).
8 Bloomberg Law (2019) ("The regulatory framework is really trying to keep up with the technology," said Geoffrey Starks, commissioner on the Federal Communications Commission); see also Theodos and Sittig (2020), p. 7 ("With no major updates in the last 20 years, HIPAA remains the preeminent comprehensive health information privacy law. HIPAA was written and passed in the late twentieth century when the health information environment was primarily paper based and before the explosion of digital health tools. Two decades later, the health information industry has transformed, leaving substantial gaps between advancements in digital health and privacy laws").
9 Cohen (2022).
10 Shachar (2022), p. 417; see also Prince (2023), pp. 1096–1097.
11 Theodos and Sittig (2020), p. 3.
12 45 C.F.R. § 164.512(e)(1) (2022); see also Shachar (2022); see also Boodman et al. (2022).


This follows HHS guidance issued in June 2022 and an Executive Order signed by President Joe Biden in July 2022.14 However, other data—such as internet search history, GPS tracking, and financial payment information—can potentially be triangulated to infer that a pregnancy termination or loss occurred, even though none of these data points relates specifically to health care.15

These data privacy concerns existed prior to the Dobbs decision, but the outcome of the case, and particularly the malicious state laws that followed, has highlighted broad gaps in the US data privacy "infrastructure," many of which have far-reaching consequences beyond abortion policy.16 Even if health care data were optimally protected, other forms of non-health care data could still be triangulated to infer that health care transactions or procedures suggestive of pregnancy termination have occurred. The fact that some forms of data are so widely accessible should concern policymakers within the United States and abroad.

In this chapter, we examine some problematic developments in data sharing and data privacy in the United States through the lens of the Dobbs decision. We summarize key issues, highlighting their relevance to women's health care and reproductive services while considering broader implications for health care privacy writ large. We then make the case that health data privacy reform must occur alongside broader data privacy reforms. PHI and ePHI no longer exist solely in the setting of a hospital or physician's office, and privacy policies should therefore reflect the myriad ways in which health care data can be accessed and compromised. We conclude that the US needs new data privacy laws and regulations broad enough to encompass any form of healthcare data or information—perhaps something similar to the European Union's General Data Protection Regulation (GDPR) or the two consumer privacy laws recently passed in California.17

13 Eshoo (2023).
14 U.S. Department of Health and Human Services (2022); The White House (2022).
15 Prince (2023), p. 1081; see also Vergano et al. (2022) ("'HIPAA doesn't reach this whole other world of data that says a lot about your health, but isn't found in a traditional medical record,' [Carmel] Shachar said.").
16 Gajda (2022).
17 Bloomberg Law (2023) ("The California Consumer Privacy Act (CCPA), signed into law on June 28, 2018, creates an array of consumer privacy rights and business obligations with regard to the collection and sale of personal information. The CCPA went into effect Jan. 1, 2020. The California Privacy Rights Act (CPRA), also known as Proposition 24, is a ballot measure that was approved by California voters on Nov. 3, 2020. It significantly amends and expands the CCPA, and it is sometimes referred to as 'CCPA 2.0.'"). Four other states have passed GDPR-inspired laws taking effect in 2023: Colorado, Connecticut, Utah, and Virginia. See Bellamy (2023).


2 PHI, ePHI, and Other Health Care Data

In the United States, numerous laws have been passed over the years to protect health care privacy, yet we currently rely on obsolete statutes. In 1996, when HIPAA was passed, medical records in the United States were primarily in paper form; electronic health records, where they existed, were not robust enough to interact with one another.18 Subsequent legislation, including the Health Information Technology for Economic and Clinical Health Act of 2009 (HITECH), was passed to foster interoperability of health care records and technologies.19 Both laws focus primarily on PHI and ePHI held within health care systems like hospitals or physician offices. Yet health care data have become increasingly accessible: patients can now access their own medical records remotely on their home computers or even their mobile devices. Those records, which previously existed behind hospital firewalls and could be released only with written patient permission, may now be susceptible to data scraping by any program or code that extracts information from personal browsing data, even where ad blockers or other limitations on access are in place.20 With this in mind, individuals may be less able to hide personal information from those who seek to discover it—whether for profit or for other illicit purposes.

Telemedicine has introduced new concerns in this regard. Video calls, which became pervasive during the COVID-19 pandemic, often occurred via platforms, such as Zoom, that lacked end-to-end encryption.21 Hackers can access cameras and track keystrokes, search history can be captured, and the resulting data can be sold in aggregate on the dark web.22 These threats are not unique to the United States, but other countries have implemented stronger privacy protections that limit this sort of conduct. For example, the European GDPR applies to all personally identifiable information (PII), which is broader than HIPAA's coverage of PHI.23

After the Dobbs decision was issued, several major companies came forward offering to cover the costs of out-of-state travel for abortion care.24 Yet taking up such an offer would require disclosing both a pregnancy and an intent to terminate it to the employer. In the US, private health insurance is obtained primarily through employment, which could create a conflict of interest.25 Though it is unlawful, companies may still attempt to discriminate against employees, either on the basis of their pregnancy or their decision to terminate it.26 In particular, privately-owned companies with religious ties may morally object and seek to humiliate or otherwise punish employees who terminate a pregnancy.

18 Health Insurance Portability and Accountability Act (1996).
19 Health Information Technology for Economic and Clinical Health Act (2009).
20 Schieszer (2022).
21 Emerson (2020).
22 Li (2021); see also Nadrag (2021).
23 Tovino (2017), p. 974 ("Identified differences reflect the [HIPAA] Privacy Rule's original, narrow focus on health industry participants and individually identifiable health information compared to the GDPR's broad focus on data controllers and personal data.").
24 Goldberg (2022). Other companies, like Apple, declined to allow remote work options for employees living in states that banned or restricted abortion access. See Harrington (2022).
25 Indeed, it has created conflicts in the past, and courts have sided with Christian employers seeking to decline coverage for certain health care services on religious grounds. See Burwell v. Hobby Lobby Stores, Inc. (2014) (the U.S. Supreme Court sided in favor of Hobby Lobby, which declined to provide insurance coverage for contraception on religious grounds). See also Braidwood Management Inc. v. Becerra (2022) (the US District Court for the Northern District of Texas overturned a mandate to cover pre-exposure prophylaxis (PrEP) medications for HIV prevention).
26 Moreno (2022).



3 Financial Data

Any health care transaction, including those involving women's health care or reproductive services, necessarily involves a complex set of financial transactions.27 Recent technological disruption in the financial industry, marked by waves of innovation, has led to nearly continuous change in the way financial transactions are carried out, from early digitization to heterogeneous cloud mechanisms to blockchain applications.28 The US Supreme Court decision in Dobbs was issued in the midst of this "FinTech" disruption, bringing a multitude of financial privacy concerns into view. These concerns are inextricable from the health privacy issues raised by the Dobbs decision and from the state-by-state variability in the regulation of reproductive services that the case engendered.

In this rapidly changing financial environment, innovations are not always fully mature when they are implemented.29 As a result, complexity, uncertainty, and lack of control over certain elements create targets for threats.30 The threats themselves evolve just as rapidly as the technology that is meant to block them, and sometimes only a hairsbreadth separates the two. Privacy is one of several goods that the financial industry must offer. In a recent survey, however, only 35% of companies were confident about their security.31 In the offerings of financial firms, privacy must be weighed against—and budgeted against—other goods, including integrity, authorization, and access. The way such goods are distributed is partially a product of regulatory efforts; however, market mechanisms also play a major role. As an indication of where society's concern lies, consumers seem more troubled by media reports of monetary losses than by the loss of privacy.32

Patient billing data might be used to infer that a procedural abortion has taken place. A large bill from Planned Parenthood might be suggestive and, when paired with location data from a suspect's mobile device, can make a very convincing argument. Similarly, credit card statements with large transactions or ATM withdrawals proximal to a women's health center might be viewed as suspicious by law enforcement officials.


Later-generation cash transfer applications also provide an avenue for law enforcement to track financial transactions that might be associated with abortion. While society has come to expect privacy in financial transactions with "brick and mortar" banks, loan companies, insurers, and other old-world financial establishments, there is little legal precedent for upholding privacy in every financial transaction, and any privacy protection that does apply would need to be found in the user agreement, not the law. Firms such as Venmo™, PayPal™, Cash App™, and Zelle™ all have lengthy user agreements in which privacy may not be protected at all costs.33 Privacy here is achieved contractually between the user and the company, and we hypothesize that many users assume that some regulatory protection exists. However, most users of technological applications do not read the user agreements that they sign.34 Imagine a case in which law enforcement seeks to discover financial transactions made through one of these later-generation apps. One might well imagine such a company disclosing the information rather than fighting a lengthy legal battle at great cost, given its relative valuation of privacy. Businesses weigh privacy as one good amongst others, and users should be wary of this.

The pandemic had a remarkable impact on the acceleration of growth of the financial services industry, largely driven by social distancing.35 It has been demonstrated that blockchain applications in the financial sector have the potential for lower costs and higher accessibility.36 The migration of FinTech apps from cloud to blockchain provides new opportunities for regulatory efforts focused on privacy, as well as potential for threats to outpace security and for new understandings of user privacy to arise. Though yet to be thoroughly vetted and proven through experience, blockchain applications are considered to have a higher privacy capability than cloud-based applications. This is rooted in the ability of blockchain applications to reveal only a minimal amount of a user's personal information in transactions while keeping linkages in a separate chain.37

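The privacy property gestured at here can be made concrete with a toy sketch. The following is an assumption-laden simplification, not a description of how any particular FinTech product or blockchain actually works: a public ledger records each payment only as a one-way hash of its details under a fresh pseudonym, while the linkage between pseudonyms and the underlying account is kept in a separate, private store. All names and data are hypothetical.

    import hashlib
    import os

    def digest(data: bytes) -> str:
        """One-way SHA-256 commitment to some bytes."""
        return hashlib.sha256(data).hexdigest()

    private_linkage = {}  # held off-ledger (e.g., by the user or a custodian)
    public_ledger = []    # visible to everyone

    def record_payment(account: str, details: str) -> None:
        # A fresh pseudonym per transaction: same account, unlinkable entries.
        pseudonym = digest(account.encode() + os.urandom(16))
        private_linkage[pseudonym] = account  # linkage kept separately
        public_ledger.append({"from": pseudonym,
                              "commitment": digest(details.encode())})

    record_payment("alice-checking", "$160 to Clinic X on 2022-07-01")
    record_payment("alice-checking", "$40 groceries on 2022-07-02")

    # An observer of the public ledger sees neither the account nor the payee,
    # and cannot tell that the two entries belong to the same person.
    print(public_ledger)

Whether deployed systems actually achieve this separation depends entirely on implementation choices; as noted above, these designs remain to be vetted through experience.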


Online donations to nonprofit organizations can also be discovered. In 2021, the data-driven journalism group The Markup found "28 ad trackers and 40 third-party cookies tracking visitors, in addition to so-called 'session recorders' that could be capturing the mouse movements and keystrokes of people visiting the homepage in search of things like information on contraceptives and abortions" at the Planned Parenthood website.38 Though the organization has since suspended the use of marketing trackers on some portions of its website,39 the problem is a complex one, and the potential still exists for tracking user activity across a variety of related websites.

27 Gottlieb et al. (2018), p. 619.
28 Maechler and Moser (2019).
29 This is true of many innovations but is particularly concerning in the realm of finance. See Catinas et al. (2019).
30 Vishwanat et al. (2016).
31 Gai et al. (2017).
32 Ablon et al. (2016).
33 Fingas (2023); see also Anderson (2022).
34 Auxier et al. (2019) (just 9% of adults say they always read a company's privacy policy before agreeing to the terms and conditions, while an additional 13% say they do this often; 38% of Americans say they sometimes read these policies; and more than a third (36%) say they never read a privacy policy before agreeing to it); see also Cakebread (2017) (a Deloitte survey of 2,000 consumers in the U.S. found that 91% of people consent to legal terms and service conditions without reading them).
35 Renduchintala (2022), p. 1.
36 Dorfleitner and Braun (2019), pp. 207–237.
37 Renduchintala (2022), p. 38.

4 Tracking

Location tracking is another area of data privacy susceptible to breach in a post-Dobbs world. Location data, when observed and collected, may reveal information about health care activities, visits to specific health clinics, trips to fast-food restaurants, and more. Such data can be bought and sold to third parties and could be used maliciously. US health privacy laws do not address these types of problems. For instance, Google™ technology can track locations through devices such as mobile phones and tablets. The same technology can be used to collect location-based search history which, in the setting of data breaches, could be used maliciously. These breaches of privacy may become dangerous for those seeking reproductive health care in the current US abortion landscape.

Google notes that between 2018 and 2020, the company received 5,864 "geofence" warrants from police in the states that had banned abortion as of July 5, 2022.40 These warrants ask for lists of the mobile devices present in a specific area, and law enforcement could use this information to track individuals present at a location of interest, such as a Planned Parenthood facility.41 Some companies, like Apple, cannot respond to geofence warrants because they do not store location data.42 Pro-choice lawmakers worry that geolocation searches will increase tremendously after the Dobbs decision. In July 2022, Google declared that it would delete location history entries for visits to abortion clinics or fertility centers.43 However, Google is not likely to be aware of every location where abortion services are performed. Law enforcement is likely keeping a closer eye on these locations, which may make such blackouts less helpful. Further, Google Maps will continue to track all other locations, which may reveal to law enforcement that an individual has entered and exited a "blacked out" area.44 Geofence warrants have also been sent to companies like Uber™, Apple™, and Snapchat™.45
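To make concrete the kind of filtering a geofence request implies, consider the minimal sketch below. The data model, field names, and thresholds are hypothetical illustrations; real responses are produced from providers' internal systems, not from anything like this.

    from dataclasses import dataclass
    from datetime import datetime
    from math import asin, cos, radians, sin, sqrt

    @dataclass
    class Ping:
        device_id: str  # pseudonymous identifier (hypothetical)
        lat: float
        lon: float
        ts: datetime

    def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
        """Great-circle distance in meters between two latitude/longitude points."""
        dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
        a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
        return 2 * 6371000 * asin(sqrt(a))

    def devices_in_geofence(pings, center, radius_m, start, end):
        """Distinct devices with at least one ping inside the circle and time window."""
        return {
            p.device_id
            for p in pings
            if start <= p.ts <= end
            and haversine_m(p.lat, p.lon, center[0], center[1]) <= radius_m
        }

Anyone holding a large location dataset can run exactly this kind of query, which is part of why these warrants are controversial: the filter sweeps in every device present in the area, not only a suspect's.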

38 Ng and Varner (2021).
39 Kelley (2022).
40 Ng (2022).
41 Ng (2022).
42 Ng (2022).
43 Elias (2022).
44 Elias (2022).
45 Elias (2022).


Location tracking can also occur in less obvious ways. The New York Times Magazine reported in 2012 that companies like Target keep vast amounts of data on people who shop in their stores.46 Target assigns each shopper a Guest ID number to keep tabs on everything they buy.47 This ID stores an individual's demographic information, such as age, marital status, estimated salary, and how long it takes them to drive to the store.48 Companies can and have bought individual data about job history, bank history, and the number of cars a person has owned. Purchases like vitamins or unscented lotions may also be suggestive of pregnancy.49 Target can use age, sex, and product search history to predict whether an individual is pregnant.50 In 2012, an angry father in Minneapolis asked Target managers why his teenage daughter was receiving coupons in her name for baby cribs and maternity clothes.51 Managers received an apology later that week, once the father discovered that his daughter was, indeed, pregnant.52

Anya Prince, a law professor and data privacy scholar, went to extreme lengths to hide her pregnancy from what she called "the advertising ecosystem."53 She turned off her phone's Global Positioning System or left the phone at home during appointments; she paid for prenatal vitamins and pregnancy tests with cash; she used a virtual private network (VPN) when searching online. Yet, in the same week she miscarried, she received diaper advertisements in the mail. This sort of poorly timed targeted advertising is a reality for many individuals experiencing miscarriages, stillbirths, and other devastating complications of pregnancy.

Because location data can be bought and sold, individuals or companies can obtain and misuse it. One post-Dobbs reality is uterus surveillance: states like Texas are providing financial incentives to help enforce abortion bans and suing abortion providers.54 Until recently, one location data company had been selling information about groups of people visiting the more than 600 Planned Parenthood facilities, including duration of stay and where they went afterward. A week's worth of data cost only a little over $160.55
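The mechanics of the purchase-based inference described in the Target example above are simple enough to sketch in a few lines. The toy model below uses synthetic data and invented features; it is not Target's actual method, only an illustration of how a plain logistic regression can turn innocuous purchase flags into a pregnancy score.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 1000
    # Hypothetical binary features per shopper:
    # [bought unscented lotion, bought supplements, bought cotton balls in bulk]
    X = rng.integers(0, 2, size=(n, 3)).astype(float)
    # Synthetic "ground truth": pregnancy is more likely when these co-occur.
    logits = -3.0 + 1.5 * X[:, 0] + 2.5 * X[:, 1] + 1.0 * X[:, 2]
    y = rng.random(n) < 1.0 / (1.0 + np.exp(-logits))

    model = LogisticRegression().fit(X, y)
    shopper = np.array([[1.0, 1.0, 1.0]])  # a shopper who bought all three
    print(model.predict_proba(shopper)[0, 1])  # estimated pregnancy probability

The point of the sketch is how little is required: no medical record is touched, yet the output is, functionally, health information.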

46 Duhigg (2012).
47 Duhigg (2012).
48 Duhigg (2012).
49 Duhigg (2012).
50 Hill (2012).
51 Hill (2012).
52 Hill (2012).
53 Prince (2022).
54 Sotomayor (2021).
55 Cox (2022).


5 Social Media

With the surge in innovation across the internet, mobile devices, and health care, widespread use of technology has increased access to information, improved general connectivity, and facilitated communication across the globe. Social media connects the world through platforms like Facebook, Snapchat, Instagram, and Twitter, which serve as sources of communication, information, and entertainment. However, data privacy has slipped through people's fingers, as mobile devices collect a wide range of personal data through location and browsing history, social media apps, and search queries. Companies can use these data to target advertising and personalize social media feeds. Third parties, however, can also access these data for various reasons, often without the user's knowledge or consent.

Social media has become so ubiquitous that many people forget about the lack of privacy associated with "private" messaging. In a post-Dobbs world, this has become an increasingly alarming problem, particularly for those seeking an abortion or other reproductive health care services. In June 2022, a woman in Nebraska was charged with helping her teenage daughter terminate her pregnancy.56 The transcript of a Facebook Messenger conversation, revealing that the two had been conversing about using medication to induce an abortion, was given to investigators by Meta, the parent company of Facebook, without much resistance. The mother and daughter were charged with a felony for removing, concealing, or abandoning a body, a felony on abortion-related grounds, and two misdemeanors for concealing the death of another person and false reporting. The daughter, who had recently turned 18, is being charged as an adult. The case has caused considerable controversy, with Facebook and Meta releasing statements noting that they "always scrutinize every government request." Still, they have given investigators information in about 88% of the 59,996 instances in which the government requested data in the second half of 2022.57

Meta owns the social media messaging app WhatsApp, in which privacy is protected through end-to-end encryption. This ensures that no one—not even WhatsApp—can access what is being sent. Encryption is one of the only preventive tools stopping third parties from accessing "private" conversations, but few social media apps use the technology. In fact, the most commonly used social media apps, like Instagram, Twitter, Facebook, and Snapchat, are not protected in this way. Email is another unprotected technology. Google announced the addition of end-to-end encryption for Gmail on the web in December 2022, launching a beta test on January 20, 2023. However, Google has stated that while it can encrypt email, the other provider must also support the encryption. In other words, not all emails will be protected.
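The guarantee that end-to-end encryption provides can be illustrated with a minimal sketch using the widely available Python `cryptography` package: only the endpoints holding the key can read a message, while the relaying platform sees unintelligible bytes. Real messengers such as WhatsApp use the far more elaborate Signal protocol with per-message key rotation; this sketch illustrates the property, not their implementation.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    # A key known only to the two endpoints. In real protocols it is negotiated
    # through an authenticated key exchange and never passes through the server.
    key = AESGCM.generate_key(bit_length=256)

    nonce = os.urandom(12)  # 96-bit nonce, unique per message
    ciphertext = AESGCM(key).encrypt(nonce, b"appointment moved to 3pm", None)

    # The platform relays only (nonce, ciphertext): opaque bytes it cannot read,
    # and therefore cannot hand over in intelligible form.
    # The recipient, holding the same key, recovers the plaintext.
    print(AESGCM(key).decrypt(nonce, ciphertext, None))

This is precisely why an end-to-end encrypted service cannot produce message content in response to a warrant in the way Meta did with unencrypted Messenger transcripts: the ciphertext is all it holds.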

56 Funk (2023).
57 Funk (2023).


6 Conclusion

The Dobbs case represents a key moment in history, one in which the ground beneath what had heretofore been thought of as privacy shifted. As we have described above, with the stroke of a judicial pen, the spectrum of privacy in the technological era changed dramatically because of its linkages to abortion. The concerns here, though focused on women obtaining abortion services, should be thought of as generalizable to any structurally similar problem: individuals taking actions whose legality varies from state to state, under an umbrella of the Fourth Amendment and a variety of asymmetric federal statutes that govern privacy,58 actions whose digital traces are both discernible and storable on proprietary servers indefinitely. Activities related to guns, alcohol, and marijuana, for example, share the same structure.

The technologization of the world has raised new legal questions in the wake of the Dobbs decision. As Jürgen Habermas has pointed out, "…the breadth of biotechnological interventions raises moral questions that are not simply difficult in the familiar sense but are of an altogether different kind."59 This shift in the understanding of the spectrum of privacy has that quality, at least along the contours of women's health. What is needed, then, is not regulation and legislation as usual, but a rich engagement with a new problem. Without it, we are heading for an escalating interplay of advancing threats and privacy intrusions on the one hand and legislative and regulatory responses on the other, one that will embroil both those opposed to and those in favor of abortion access.

Technological advancement, meanwhile, proceeds apace. One important question for law, then, is whether it will develop the dimensionality to allow for analysis, and subsequent regulatory and legislative developments, that rationally benefit individuals and appropriate law enforcement alike. The question has begun to be answered with the adoption of the European Union's GDPR. The Regulation requires that digital users be told, in "concise, easy-to-understand and clear language," the content of the data collected, the justification for collection, the processing, the length of retention of the data, and the data controller's contact information for the purposes of personal data removal.60 While the GDPR certainly represents a pillar of light in an otherwise gray patchwork of privacy regulation, as well as a bellwether for the rest of the world, we envision a regulatory and legislative future that is richer in its engagement with privacy: one that takes into consideration not only user understanding but the entire privacy landscape in aggregate.

Acknowledgements Professor Sinha would like to thank the Bander Center for Medical Business Ethics at Saint Louis University School of Medicine, the SLU LAW Summations Podcast at Saint Louis University School of Law, and the Center for Governance and Markets at the University of Pittsburgh for opportunities to discuss the subject matter of this paper with public audiences.

58 Murphy (2013).
59 Habermas (2003).
60 Renaud and Shepherd (2018), p. 20.


References

Ablon L, Heaton P, Lavery DC, Romanosky S (2016) Consumer attitudes toward data breach notifications and loss of personal information. https://www.rand.org/content/dam/rand/pubs/research_reports/RR1100/RR1187/RAND_RR1187.pdf. Accessed 20 Sep 2023
Anderson M (2022) Payment apps like Venmo and Cash App bring convenience—and security concerns—to some users. https://www.pewresearch.org/fact-tank/2022/09/08/payment-apps-like-venmo-and-cash-app-bring-convenience-and-security-concerns-to-some-users/. Accessed 20 Sep 2023
Auxier B, Rainie L, Anderson M, Perrin A, Kumar M, Turner E (2019) Americans' attitudes and experiences with privacy policies and laws. https://www.pewresearch.org/internet/2019/11/15/americans-attitudes-and-experiences-with-privacy-policies-and-laws/. Accessed 20 Sep 2023
Bellamy FD (2023) U.S. data privacy laws to enter new era in 2023. https://www.reuters.com/legal/legalindustry/us-data-privacy-laws-enter-new-era-2023-2023-01-12/. Accessed 20 Sep 2023
Bloomberg Law (2019) Regulation and legislation lag behind constantly evolving technology. https://pro.bloomberglaw.com/brief/regulation-and-legislation-lag-behind-technology/. Accessed 20 Sep 2023
Bloomberg Law (2023) CCPA vs CPRA: what's the difference? Bloomberg Law, January 23, 2023. https://pro.bloomberglaw.com/brief/the-far-reaching-implications-of-the-california-consumer-privacy-act-ccpa/. Accessed 20 Sep 2023
Boodman E, Bannow T, Herman B, Ross C (2022) HIPAA won't protect you if prosecutors want your reproductive health records. https://www.statnews.com/2022/06/24/hipaa-wont-protect-you-if-prosecutors-want-your-reproductive-health-records/. Accessed 20 Sep 2023
Braidwood Management Inc. v. Becerra (2022) 627 F. Supp. 3d 624 (N.D. Tex.)
Burwell v. Hobby Lobby Stores Inc (2014) 573 U.S. 682
Cakebread C (2017) You're not alone, no one reads terms of service agreements. https://www.businessinsider.com/deloitte-study-91-percent-agree-terms-of-service-without-reading-2017-11. Accessed 20 Sep 2023
Catinas D, Cunningham C, Herriford C, Wood J (2019) Data privacy, security, and regulation in financial technology. https://jsis.washington.edu/news/data-privacy-security-and-regulation-in-financial-technology/. Accessed 20 Sep 2023
Cohen K (2022) Location, health, and other sensitive information: FTC committed to fully enforcing the law against illegal use and sharing of highly sensitive data. https://www.ftc.gov/business-guidance/blog/2022/07/location-health-and-other-sensitive-information-ftc-committed-fully-enforcing-law-against-illegal. Accessed 20 Sep 2023
Cox J (2022) Data broker is selling location data of people who visit abortion clinics. https://www.vice.com/en/article/m7vzjb/location-data-abortion-clinics-safegraph-planned-parenthood. Accessed 20 Sep 2023
Dobbs v. Jackson Women's Health Organization (2023) 142 S. Ct. 2228
Dorfleitner D, Braun D (2019) Fintech, digitalization, and blockchain: possible applications for green finance. In: The rise of green finance in Europe. Palgrave Macmillan, Cham
Duhigg C (2012) How companies learn your secrets. https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html. Accessed 20 Sep 2023
Elias J (2022) Google says it will delete location history for visits to abortion clinics after overturning of Roe v. Wade. https://www.cnbc.com/2022/07/01/google-will-delete-location-history-for-visits-to-abortion-clinics.html. Accessed 20 Sep 2023
Emerson R (2020) Transforming the medical landscape: telehealth before, during & after COVID-19. https://blog.zoom.us/transforming-medical-landscape-telehealth-before-during-after-covid-19/. Accessed 20 Sep 2023
Eshoo A (2023) On 50th anniversary of Roe, Eshoo and Jacobs introduce legislation to protect reproductive healthcare. https://eshoo.house.gov/media/press-releases/50th-anniversary-roe-eshoo-and-jacobs-introduce-legislation-protect. Accessed 20 Sep 2023
Fingas J (2023) US law enforcement has warrantless access to many money transfers. https://www.engadget.com/us-money-transfer-mass-surveillance-trac-183552282.html. Accessed 20 Sep 2023
Funk J (2023) Nebraska woman charged with helping her daughter have an abortion. https://apnews.com/article/abortion-health-nebraska-government-and-politics-b94abeeed9a8c486cf479d6ae78c62aa. Accessed 20 Sep 2023
Gai K, Qiu M, Sun X, Zhao H (2017) Smart computing and communication. In: Lecture notes in computer science. Springer, Cham
Gajda A (2022) How Dobbs threatens to torpedo privacy rights in the US. https://www.wired.com/story/scotus-dobbs-roe-privacy-abortion/. Accessed 20 Sep 2023
Goldberg E (2022) These companies will cover travel expenses for employee abortions. https://www.nytimes.com/article/abortion-companies-travel-expenses.html. Accessed 20 Sep 2023
Gottlieb JD, Shapiro AH, Dunn A (2018) The complexity of billing and paying for physician care. Health Aff 37(4):619–626
Graham L (2022) Graham introduces legislation to protect unborn children, bring U.S. abortion policy in line with other developed nations. https://www.lgraham.senate.gov/public/index.cfm/2022/9/graham-introduces-legislation-to-protect-unborn-children-bring-u-s-abortion-policy-in-line-with-other-developed-nations. Accessed 20 Sep 2023
Guttmacher Institute (2023) An overview of abortion laws, January 1, 2023. https://www.guttmacher.org/state-policy/explore/overview-abortion-laws. Accessed 20 Sep 2023
Habermas J (2003) The future of human nature. Polity Press, Cambridge
Harrington C (2022) Apple won't let staff work remotely to escape Texas abortion limits. https://www.wired.com/story/apple-wont-let-staff-work-remotely-to-escape-texas-abortion-limits/. Accessed 20 Sep 2023
Harwell D (2019) Is your pregnancy app sharing your intimate data with your boss? https://www.washingtonpost.com/technology/2019/04/10/tracking-your-pregnancy-an-app-may-be-more-public-than-you-think. Accessed 20 Sep 2023
Health Information Technology for Economic and Clinical Health Act of 2009. Pub Law 111–5
Health Insurance Portability and Accountability Act of 1996. Pub Law 104–191
Hill K (2012) How Target figured out a teen girl was pregnant before her father did. https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/. Accessed 20 Sep 2023
Hill K (2022) Deleting your period tracker won't protect you. https://www.nytimes.com/2022/06/30/technology/period-tracker-privacy-abortion.html. Accessed 20 Sep 2023
Kaste M (2022) Nebraska cops used Facebook messages to investigate an alleged illegal abortion. https://www.npr.org/2022/08/12/1117092169/nebraska-cops-used-facebook-messages-to-investigate-an-alleged-illegal-abortion. Accessed 20 Sep 2023
Kelley J (2022) Nonprofit websites are full of trackers. That should change. https://www.eff.org/deeplinks/2022/08/tracking-ubiquitous-within-nonprofits-it-doesnt-have-be. Accessed 20 Sep 2023
Komando K (2019) How to stop your smartphone from tracking your every move, sharing data and sending ads. https://www.usatoday.com/story/tech/columnist/komando/2019/02/14/your-smartphone-tracking-you-how-stop-sharing-data-ads/2839642002/. Accessed 20 Sep 2023
Li TC (2021) Privacy in pandemic: law, technology, and public health in the COVID-19 crisis. Loy Univ Chicago LJ 52(3):767–865
Maechler AM, Moser T (2019) The evolution of payment systems in the digital age: a central bank perspective. Speech to the Swiss National Bank's Money Market Event. https://www.bis.org/review/r190328g.pdf. Accessed 20 Sep 2023
Moreno JE (2022) Abortion-related workplace discrimination still banned post-Roe. https://news.bloomberglaw.com/daily-labor-report/abortion-related-workplace-discrimination-still-banned-post-roe. Accessed 20 Sep 2023
Murphy E (2013) The politics of privacy in the criminal justice system: information disclosure, the Fourth Amendment, and statutory law enforcement exemptions. Michigan L Rev 111(4):485–546
Nadrag P (2021) Industry voices—forget credit card numbers. Medical records are the hottest items on the dark web. https://www.fiercehealthcare.com/hospitals/industry-voices-forget-credit-card-numbers-medical-records-are-hottest-items-dark-web. Accessed 20 Sep 2023
Ng A (2022) A uniquely dangerous tool: how Google's data can help states track abortions. https://www.politico.com/news/2022/07/18/google-data-states-track-abortions-00045906. Accessed 20 Sep 2023
Ng A, Varner M (2021) Nonprofit websites are riddled with ad trackers. https://themarkup.org/blacklight/2021/10/21/nonprofit-websites-are-riddled-with-ad-trackers. Accessed 20 Sep 2023
Prince AER (2022) I tried to keep my pregnancy secret. https://www.theatlantic.com/ideas/archive/2022/10/can-you-hide-your-pregnancy-era-big-data/671692/. Accessed 20 Sep 2023
Prince AER (2023) Reproductive health surveillance. Boston College L Rev 64(5):1077
Renaud K, Shepherd L (2018) GDPR: its time has come. Netw Secur 2018(2):20
Renduchintala T (2022) A survey of blockchain applications in the FinTech sector. J Open Innov Technol Mark Complex 8(4):185
Schieszer J (2022) HIPAA may not cover personal health data patients disclose online. https://www.renalandurologynews.com/home/departments/hipaa-compliance/hipaa-may-not-cover-personal-health-data-patients-disclose-online/. Accessed 20 Sep 2023
Shachar C (2022) HIPAA, privacy, and reproductive rights in a post-Roe era. J Am Med Ass'n 328(5):417–418
Sotomayor S (2021) Texas now has abortion 'bounty hunters': read Sonia Sotomayor's scathing legal dissent. https://www.theguardian.com/commentisfree/2021/sep/02/sonia-sotomayor-dissent-texas-abortion-ban-law-supreme-court. Accessed 10 Feb 2023
Texas Heartbeat Act (2021) S.B. 8
Theodos K, Sittig S (2020) Health information privacy laws in the digital age: HIPAA doesn't apply. Perspect Health Inf Manag 18(Winter):1l
The White House (2022) FACT SHEET: President Biden to sign executive order protecting access to reproductive health care services. https://www.whitehouse.gov/briefing-room/statements-releases/2022/07/08/fact-sheet-president-biden-to-sign-executive-order-protecting-access-to-reproductive-health-care-services/. Accessed 20 Sep 2023
Tovino SA (2017) The HIPAA privacy rule and the EU GDPR: illustrative comparisons. Seton Hall L Rev 47(4):973–993
U.S. Department of Health and Human Services (2022) HHS issues guidance to protect patient privacy in wake of Supreme Court decision on Roe. https://www.hhs.gov/about/news/2022/06/29/hhs-issues-guidance-to-protect-patient-privacy-in-wake-of-supreme-court-decision-on-roe.html. Accessed 20 Sep 2023
Uses and disclosures for which an authorization or opportunity to agree or object is not required (2022) 45 C.F.R. § 164.512(e)(1)
Vergano D, Powers B, Lambert J (2022) How the Dobbs abortion ruling reshaped America's privacy debate, from health to politics and law. https://www.grid.news/story/360/2022/10/13/how-the-dobbs-abortion-ruling-reshaped-americas-privacy-debate-from-health-to-politics-and-law/. Accessed 20 Sep 2023
Vishwanat S, Bhat A, Chhonkar A (2016) Security challenges in the evolving Fintech landscape. https://www.pwc.in/assets/pdfs/consulting/cyber-security/banking/security-challenges-in-the-evolving-fintech-landscape.pdf. Accessed 20 Sep 2023
Winder D (2022) Gmail message encryption confirmed by Google. https://www.forbes.com/sites/daveywinder/2022/12/19/new-gmail-encrypted-email-feature-confirmed-by-google-will-you-get-it/. Accessed 20 Sep 2023

Consent and Retrospective Data Collection

Tima Otu Anwana, Katarzyna Barud, Michael Cepic, Emily Johnson, Max Königseder, and Marie-Catherine Wagner

Abstract The secondary use of health data offers great potential for health research. Technological developments, for instance progress in the field of artificial intelligence, have improved the reusability of datasets. However, the GDPR and ethical guidelines regularly restrict the reuse of personal data when the data subject has not given their informed or explicit consent. In retrospective studies, where researchers use personal and sensitive data from previous medical examinations, obtaining the patient's consent after the fact can be challenging. This chapter focuses on the potential legal and practical hurdles associated with obtaining consent from the data subject for a new processing purpose. In addition, it presents the ethical considerations associated with consent and retrospective data collection in health and medical research. The chapter discusses several Horizon 2020 funded research projects in the areas of health and medical research, which serve as practical examples of the issues faced when relying on consent as a legal basis in retrospective research.

Keywords Consent · Data protection · Ethics · Further processing · Medical research · Retrospective data collection

1 Introduction Information about people is an essential aspect of health and medical research, whether that is for prevention, treatment, or expanding the knowledge sphere. The datafication of health and medical discoveries paired with technological advancements has magnified the necessity for more data and in turn, the insights that may be garnered from them. The need for ongoing research is recognized in Europe T. O. Anwana · K. Barud · M. Cepic · E. Johnson · M.-C. Wagner (B) Department of Innovation and Digitalisation in Law, University of Vienna, Schenkenstraße 4/2. Stock, 1010 Wien, Austria M. Königseder MLL Meyerlustenberger Lachenal Froriep AG, Schiffbaustrasse 2, 8031 Zurich, Switzerland © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_7


through various funding schemes, such as the EU’s research and innovation funding program, Horizon Europe (former Horizon 2020).1 Under this initiative, many scientific research projects are utilizing personal data as a way of “improving our health and care systems together”2 in a variety of medical research areas. The aim of this funding initiative is to prevent diseases, support the development of better diagnostics and more effective therapies, use personalized medicine approaches to improve healthcare and wellbeing, and take up innovative health technologies, with a focus on digitalization (See footnote 2). The program is a key funding initiative of the European Union and contributes to research and innovation with a budget of e95.5 billion.3 Many of these research initiatives include the prospective collection and processing of personal data, including data concerning health. From both a practical and legal perspective, gaining consent for this data processing is usually straightforward as the data subject is accessible. However, the same cannot be said about retrospective data processing, that is, the use of data collected in the past and so prior to the realization of the projects. While there are ongoing efforts within the research community to collect and process new data sets, the advancement of technologies and the expansion of knowledge also brings about the reuse of retrospective datasets making data-driven research sustainable. For researchers, there are a number of practical and legal obstacles to overcome to ethically and lawfully process these retrospective data sets. This chapter will focus on the potential hurdles associated with gaining consent from the data subject for a new processing purpose. Consent in data protection law “implies real choice and control for the data subjects”4 and given the pre-processing awareness and control consent provides the data subject, as a legal basis, consent offers the data subject the most autonomy out of all of the available legal bases in Article 6(1) GDPR.5 As personal data collected and processed in the areas of health and medicine often consist of data concerning health, controllers are also required to obtain an exception to the prohibition on processing special categories of personal data as outlined in Articles 9(1) and 9(2) GDPR. The explicit consent of the data subject not only offers transparency and autonomy to the data subject, but when read together with Article 6(1)(a), legal compliance is easier for controllers in a practical sense. Furthermore, given the ethical implications of processing health data, many organizations also require ethical consent when participating in research, and the 1

1 European Commission (2021b).
2 European Commission, Why the EU supports health research and innovation. Available at: https://ec.europa.eu/info/research-and-innovation/research-area/health-research-and-innovation_en. Accessed 27 March 2023.
3 European Commission (2021b).
4 European Data Protection Board, 'Guidelines 05/2020 on Consent Under Regulation 2016/679', adopted on 4 May 2020, p. 7.
5 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).


The requirements for ethical consent and its relationship with GDPR-based consent are further outlined below.

This chapter begins by defining retrospective data in relation to EU data protection law. By outlining the practical and legal distinctions between prospective and retrospective data collection, the difficulties associated with obtaining consent for retrospective data will become evident. Following this, consent in data protection law is discussed: in particular, consent as the lawful basis for processing, as well as explicit consent as an exception to the prohibition on processing special categories of personal data. After that, the lawfulness of further processing is touched upon, as it sets out rules that may be relevant to retrospective data processing depending on the circumstances. To complement the legal discussion, the final section of this chapter sets out the ethical considerations associated with consent and retrospective data collection in the context of health and medical research. In sum, this chapter aims to provide a legal and ethical overview of the applicability of consent in a research context when utilizing retrospective personal data, thereby assisting researchers.

To provide real-world examples of the issues faced when gaining consent for retrospectively collected personal data, this chapter refers to several Horizon 2020 funded research projects in the areas of health and medical research. The authors of this chapter are involved in these research projects; their firsthand knowledge of the challenges and opportunities associated with consent thus ensures a better understanding of domain- and project-related issues. Below is a brief outline of each of these research projects: BIOMAP,6 InteropEHRate,7 ProCAncer-I,8 VBC,9 and KATY.10

The BIOMAP project investigates the causes and mechanisms of atopic dermatitis and psoriasis by identifying the biomarkers responsible for the variation in disease outcome. For this, a data portal will be provided where researchers can examine data by performing cross-study comparisons, slicing and dicing cohorts based on certain clinical features, and running in-built workflows.11

The InteropEHRate project aims to develop communication protocols which support the exchange of health records between patients, healthcare practitioners, and researchers in a safe and efficient manner. The exchange of health data is made possible through the S-EHR mobile application, which places the management of health data into the hands of citizens. One key aspect of the InteropEHRate protocol is that, by using the S-EHR App, citizens are able to share data directly with research centers, thus facilitating decentralised research studies.12

6 BIOMAP, https://biomap-imi.eu/about/data-portal. Accessed 27 March 2023.
7 InteropEHRate, https://www.interopehrate.eu/. Accessed 27 March 2023.
8 ProCAncer-I, https://www.procancer-i.eu/about/. Accessed 27 March 2023.
9 VBC, https://virtualbraincloud-2020.eu/tvb-cloud-main.html. Accessed 27 March 2023.
10 KATY, https://katy-project.eu/. Accessed 27 March 2023.
11 This project has received funding from the Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant agreement No. 821511.


The ProCAncer-I project's objective is to improve the diagnosis and treatment of prostate cancer (PCa), especially by avoiding the overdiagnosis and overtreatment of indolent tumors, which is a common issue in today's PCa care. A key element in achieving this aim is to establish the largest collection of anonymized PCa multi-parametric (mp)MRI image data worldwide (>17,000 cases). This image repository, consisting of 60% retrospective and 40% prospective mpMRI cases, will be used to train, test, and validate AI algorithms that are set to assist clinicians in future practice.13

The Virtual Brain Cloud project aims to develop a platform for the personalized prevention and treatment of dementia. To achieve this, the Virtual Brain Cloud integrates the data of large cohorts of patients and healthy controls through multi-scale brain simulation using The Virtual Brain (TVB) simulator.14

The KATY project aims to build a precise personalized medicine system empowered by Artificial Intelligence (AI), which predicts the response of kidney cancer to targeted therapies. The research group strives to develop a solution aimed at facilitating a simple conversation about treatment choices between patient and clinician, leveraging an effective user experience. The data used in the development of the project are collected retrospectively and prospectively from publicly available data repositories as well as by the consortium members. As a stress test, the KATY project will initially experiment with data from patients with a rare and complex form of kidney cancer, renal cell carcinoma.15

Each of these projects processes retrospective data and has faced similar legal challenges, particularly regarding the obtaining of consent from the data subject. It is on this basis that this chapter discusses the main legal considerations associated with the use of retrospectively obtained personal data in the context of health and medical research.

2 Retrospective and Prospective Data Collection

Data collection can be a crucial part of a research study design involving human participants. In the case of medical research, researchers may choose to adopt one of two approaches. One possibility is to conduct a study and enroll participants by collecting data directly; this can be defined as a prospective study.

12 This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 826106.
13 This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 952159.
14 This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 826421.
15 This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 101017453.


Another approach is to look at previously collected data, for instance, data maintained in biobanks or existing medical records; this can be defined as a retrospective study. There are significant differences between the two study designs from a scientific perspective, as well as from a legal and ethical perspective. This section gives an overview of the differences between prospective and retrospective studies and highlights the respective legal implications.

In contrast to the terms "clinical trial" and "clinical study", which are defined by the Clinical Trials Regulation,16 it is important to highlight that retrospective and prospective data collection are not legal terms, but are commonly used to describe different types of study designs in (medical) research.17 In a prospective study, the research questions, techniques, and other essential elements of the study design are defined prior to the data collection. The outcome of interest has not occurred at the time of the initiation of the study.18 Retrospective studies, as the name suggests, look into the past. Data that were not originally collected for research purposes or for a specific project can be "recycled" to gain new insights at a later date. Especially in medical research, information from past clinical events can be a time- and budget-saving alternative to expensive prospective studies,19 while making use of rich pre-existing sources of information.

Prospective studies have several advantages, making this study design attractive to researchers. In contrast to retrospective studies, prospective studies are not simply observational in nature. This study design allows for greater interventions by the researchers, for example, testing the effectiveness of a specific medication.20 Furthermore, a prospective study can be tailored to the specific research question so that only relevant information is gathered. Multiple outcomes can therefore be assessed and analyzed at different time frames.21 The major disadvantage of prospective studies is that they are expensive and time-consuming, thus placing additional burdens on the researcher.22

Retrospective studies are often considered the less reliable option because of several disadvantages as compared to a prospective study design.23 For instance, in retrospective studies the researcher is wholly dependent on the usability and accuracy of the available records. Moreover, the researcher selects the cases or data with the research question in mind, which can lead to bias in the findings.24 Furthermore, since the retrospective data collection was not designed to fit the specific requirements of the study, some information may be missing.25 Despite these shortcomings, there are many positive examples of retrospective studies.

16 Clinical Trials Regulation (EU) 536/2014.
17 See Talari and Goyal (2020), p. 409.
18 See Talari and Goyal (2020), p. 398.
19 Hess (2004), p. 1171.
20 See Talari and Goyal (2020), p. 398.
21 See Talari and Goyal (2020), p. 401; Euser et al. (2009), pp. 215–216.
22 See Euser et al. (2009), p. 216.
23 See Talari and Goyal (2020), p. 399.
24 Hess (2004), p. 1172.
25 Talari and Goyal (2020), p. 399.


For instance, a prominent retrospective study with significant impact described the correlation between smoking and lung cancer.26 The hypothesis that the risk of getting lung cancer is significantly higher for smokers than for non-smokers could not have been confirmed by a (prospective) randomized study, wherein the study subjects are randomly assigned either to the drug/exposure group (in this case, smoking) or to the placebo/non-exposure group (see footnote 27). Since asking randomly selected people to smoke in order to check whether they develop lung cancer is neither ethical nor practical, this hypothesis could not be put to the test of a prospective study.

The potential benefits of large volumes of retrospective data can be better exploited nowadays through machine learning techniques and the use of Artificial Intelligence. For example, several Horizon 2020 projects, such as ProCAncer-I and KATY, use (retrospective) clinical data to train algorithms to help improve cancer diagnosis. By testing and validating the results of the retrospective study with smaller samples of prospective data, potential biases or errors can be discovered.27
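The following is a minimal sketch of that validation pattern, with entirely synthetic data and hypothetical variables: a model is fitted on a large retrospective cohort whose records under-sample one patient stratum (a common form of selection bias), then scored on a small, freshly collected prospective sample. A gap between the two performance numbers is the signal that something in the historical data deserves scrutiny.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(1)

    def make_cohort(n: int):
        """Synthetic patients: two clinical features and a binary outcome."""
        X = rng.normal(size=(n, 2))
        y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)) > 0
        return X, y

    # Retrospective set: historical records that under-sampled one stratum.
    X, y = make_cohort(8000)
    keep = (X[:, 1] < 1.0) | (rng.random(8000) < 0.2)
    X_retro, y_retro = X[keep], y[keep]

    # Small prospective validation sample, collected without that bias.
    X_pro, y_pro = make_cohort(500)

    model = LogisticRegression().fit(X_retro, y_retro)
    print("retrospective AUC:", roc_auc_score(y_retro, model.predict_proba(X_retro)[:, 1]))
    print("prospective AUC: ", roc_auc_score(y_pro, model.predict_proba(X_pro)[:, 1]))

Projects such as ProCAncer-I, with its 60/40 split of retrospective and prospective cases, follow this general logic at scale, though of course with far more elaborate pipelines than this sketch.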

3 Legal Implications

The processing of retrospective and/or prospective data has different legal and ethical implications. One principal consideration is the establishment of a legal basis and adherence to the rules on processing special categories of personal data in accordance with Articles 6 and 9 of the GDPR.

The collection of clinical data in the course of a prospective study is, in most cases, less complicated from a data protection standpoint. For instance, it is easily possible to obtain the patient's explicit consent to participate in the respective study, because the participant can be informed about the purposes and aims of the data processing before enrolling. Establishing the data protection legal basis for the processing of retrospective data can be more onerous. Depending on the time of collection, particularly if collection took place pre-GDPR, there may be concerns regarding the legal validity of the consent or other legal basis, and finding and assessing the original legal basis may present practical challenges.

One possible legal basis for the processing of available clinical data for the purposes of a retrospective study is obtaining the former patient's consent for the specific research purposes pursuant to Article 6(1)(a) GDPR. As described above, one of the major advantages of retrospective studies is that they are less time-consuming, as the data have already been collected. To preserve this benefit, researchers might prefer to rely either on the consent given at the time of the initial data collection or on another legal basis for data processing, as the process of obtaining the consent of possibly thousands of former patients can significantly delay the start of a research project and can prove to be practically unfeasible.

26 Doll and Hill (1950), pp. 739–748.
27 Talari and Goyal (2020), p. 401.


Relying on the initial consent could be an option in some research projects, but it is often not easy to achieve in practice. For instance, it could be the case that the retrospective data were collected before the entry into force of the GDPR. According to Recital 171, pre-GDPR consent can still be valid if it fulfils the conditions set out by the GDPR. It therefore has to be reviewed whether the consent forms used at the time of the initial data collection are in line with the conditions of Article 4(11) GDPR. Even though the wording of the definition in Article 2(h) of the Data Protection Directive is similar to Article 4(11) GDPR, with the exception of the term "unambiguous" and the formal requirements, in practice pre-GDPR consent often does not fulfil the GDPR requirements, since data protection compliance was not always taken as seriously as it is today. Furthermore, the consent has to be specific, which is why new research projects are usually not covered by the scope of the initial consent. However, this issue could, in some cases, be solved by "broad consent" pursuant to Article 6(4) GDPR.28

Alternatively, if the conditions are met in the specific case, researchers could argue that their task is carried out in the public interest (Article 6(1)(e) GDPR) or in their own or a third party's legitimate interests (Article 6(1)(f) GDPR).

In addition to establishing a legal basis in Article 6(1) GDPR, a specific consideration in the context of medical research is the requirement for an applicable exemption from the prohibition on processing special categories of personal data in Article 9 GDPR. When processing personal data concerning health, controllers need to ensure that one of the exceptions in Article 9(2) applies. The so-called "research exemption" in Article 9(2)(j) states that the prohibition on the processing of special categories of personal data does not apply where the processing is necessary for scientific research purposes. In addition, processing has to be "in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject."29 Article 9(2)(j) has to be read together with the standards for lawful processing under Article 6(1) GDPR.30 The research exemption and the cumulative application of Articles 6 and 9 GDPR will be explained in more detail in the section "Exemptions for Processing of Health Data in Research".

It is important to point out that relying on the research exemption can be rather complicated, particularly in international research projects, as Article 89(2) GDPR includes what is known as an "opening clause," which can lead to deviations in the respective Member States. In principle, EU regulations such as the GDPR are directly applicable in all Member States and therefore do not leave any room for national lawmakers; opening clauses, however, are the exception to that rule and allow Member States to deviate in certain regulatory areas. For instance, the German legislator made use of the opening clause provided in Article 89(2) GDPR by allowing the processing of special categories of personal data for research purposes without prior consent, if certain conditions are met.31

28 More information on broad consent is provided in Sect. 4.
29 GDPR, Article 9(2)(j).
30 Donnelly and McDonagh (2019), p. 112.


For instance, the German legislator made use of the opening clause provided in Article 89(2) GDPR by allowing the processing of special categories of personal data for research purposes without prior consent, if certain conditions are met.31 Another example is the Italian data protection code, according to which special categories of personal data may only be processed for research purposes with the prior authorization of the data protection authority, and only if the provision of information to the data subject proves impossible or involves a disproportionate effort, or would make it impossible or seriously jeopardize the attainment of the research objectives.32 In addition to a legal basis in data protection, researchers regularly have to comply with high ethical standards.33 As will be described below in the section "An Ethical Perspective on Consent", these guidelines often demand that patients receive transparent and comprehensive information prior to their participation. First, the patient should be informed about the reasons for their participation; on that basis, participants can decide freely and in an informed manner about all processes, including the actual study and what will happen to the study results (e.g., commercial exploitation). As with consent within the meaning of the GDPR, obtaining ethical consent is regularly unfeasible in retrospective study designs. In these cases, consideration and approval by the respective ethics committee is usually required before the retrospective data can be used.

4 Consent (Article 6(1)(a) GDPR)

From the perspective of the principle of lawfulness as stated in Article 5(1)(a) GDPR and the wording of Article 6(1) GDPR, there must always be a legal basis applicable to a given processing activity. This, of course, also holds true for the processing of personal data for research purposes. The GDPR follows a prohibitive approach: the processing of personal data is generally forbidden unless it is specifically allowed under the Regulation. Article 6(1) GDPR exhaustively34 outlines the conditions under which processing of personal data shall be lawful. The first possible legal basis is Article 6(1)(a), which states that processing is lawful if "the data subject has given consent to the processing of his or her personal data for one or more specific purposes." Pursuant to Article 4(11) GDPR, consent is defined as "any freely given, specific, informed and unambiguous indication of the data subject's wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her".

31 Bundesdatenschutzgesetz (BDSG), paragraph 27(1).
32 Article 110-bis(1) D.Lgs 196/2003 (as modified by D.Lgs. 101/2018).
33 This will be described in more detail in Sect. 7.
34 ECJ 11.11.2020, C-61/19 (ECLI:EU:C:2020:901, Orange România SA) 34, with further case law.
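For readers who build consent-management tooling around these rules, the four defining elements just quoted can be made concrete in software. The following is a minimal illustrative sketch only, not a statement of what the GDPR requires of an implementation: reducing each element to a boolean is a deliberate simplification, and all identifiers are invented for this example.

```python
from dataclasses import dataclass

@dataclass
class ConsentElements:
    """The four Article 4(11) elements, each simplified to a yes/no flag."""
    freely_given: bool  # no coercion or clear power imbalance (cf. Recital 43)
    specific: bool      # tied to one or more identified purposes
    informed: bool      # consequences clearly comprehensible to the data subject
    unambiguous: bool   # statement or clear affirmative action

    def valid(self) -> bool:
        # All four elements must be present; none can substitute for another.
        return all((self.freely_given, self.specific,
                    self.informed, self.unambiguous))

# Example: consent that is broad rather than specific fails the check,
# mirroring the specificity problem discussed in this section.
broad_research_consent = ConsentElements(
    freely_given=True, specific=False, informed=True, unambiguous=True)
assert not broad_research_consent.valid()
```

In a real system, each flag would of course be backed by documented evidence (information sheets, consent records, timestamps) rather than a bare boolean.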


In determining whether consent is freely given, Recital 43 GDPR states that, considering all the circumstances, where there is a clear imbalance of power between the data subject and the controller, such as in a public authority-citizen relationship, freely given consent cannot be assumed, and consent therefore cannot serve as the legal basis for the processing of personal data. Furthermore, as required by Article 7(4) GDPR, the conditions under which consent is given, and whether these conditions leave the data subject with sufficient freedom to consent, must be taken into account. Specificity requires that the data subject indicate their wishes with regard to a particular processing operation; consent for other purposes may not be inferred from that indication.35 Consent is informed if the data subject is in a position to easily determine the consequences of their consent; the information on the processing must therefore be clearly comprehensible and sufficiently detailed.36 An unambiguous indication is the declaration of a data subject's wishes by means of a statement (explicit) or clear affirmative action (implicit).37 When it comes to efforts to make research data sustainable, in the sense that they may be used for future, as yet undetermined research, it might be difficult to comply with the elements just pointed out. This is because, in these cases, certain aspects inherent to consent, such as the (final) purpose, may not even be known. This is also reflected in Recital 33 GDPR, which states that data subjects should be allowed to give their consent to certain areas of scientific research or research projects when in keeping with recognized ethical standards for scientific research. A fundamental problem emerges here: an adequate balance must be struck between the self-determination of the data subject, who may wish to allow science to use their data even for not yet defined (research) purposes, and their protection under data protection law, which ties consent to specific processing operations. When retrospective data are used in research, it is not uncommon for them to be used for purposes different from those for which they were initially collected, leading to conflicts with informed consent. All of the projects mentioned in the introduction of this chapter encounter this issue to some extent. For this reason, but also in an effort to illuminate this aspect for future research endeavors, two questions arise: (1) how can "specific" consent be reconciled with consent to certain areas of research, and (2) how can future use of retrospective data be enabled? In practice, many research institutions address issue (1) by relying on what is commonly known as broad consent,38 which covers research in general or a certain research area but is not specified to a particular processing purpose.

35 ECJ 11.11.2020, C-61/19 (ECLI:EU:C:2020:901, Orange România SA) 38.
36 ECJ 1.10.2019, C-673/17 (ECLI:EU:C:2019:801, Planet49 GmbH) 74.
37 Article 29 Data Protection Working Party, Guidelines on consent under Regulation 2016/679, 17/EN, WP259 rev.01 (2018) 15 f.
38 On the fundamentals of the broad consent concept see Strech et al. (2016), pp. 295–309.


In light of Recital 33 GDPR and from a teleological perspective, this seems expedient to enable data-driven research; however, if we look at the wording of Article 4(11), Article 6(1)(a) and, even more so, Article 9(2)(a) GDPR, which require verbatim specific or specified purposes, the legal-dogmatic justification becomes problematic.39 While the other components (freely given, informed, unambiguous) can likely be fulfilled, the central anchor point of the consent problem in scientific research therefore becomes specificity.40 The European Data Protection Board (EDPB)41 is of the opinion that the GDPR cannot be interpreted in a way that allows a controller to circumvent specifying the purpose when asking for consent.42 In cases where the research purpose cannot be fully specified, it suggests allowing data subjects to consent to a general research purpose and to specific stages of a research project that are already known to the researcher.43 In contrast, the German Conference of independent data protection authorities approved44 a broad consent template45 that allows for the future use of data for diseases not yet known at the time consent was given.46 Along the same lines, some scholars47 attribute such normative force to Recital 33 GDPR that they apply a lower standard of specificity in the case of research. We can therefore see two rather diametrically opposed approaches to the notion of giving consent in the context of research (specific vs. broad). The situation becomes even more complex when sensitive data are used. Depending on the importance one attaches to Recital 33, the specificity of purpose comes to the fore in interpreting the notion of consent in scientific research. If the wording of Recital 33 is interpreted strictly, the normative basis for broad consent must be questioned.48 When it comes to issue (2), a similar approach could be taken. Depending on which point of view one follows, it seems feasible to use broad consent at the time of data collection and thereby enable future, by then retrospective, data use. In doing so, a broad research agenda should be reflected in the informed consent form. However, it should again be kept in mind that the normative basis for broad consent can be questioned.

39 Cepic (2021).
40 Hallinan (2020).
41 European Data Protection Board, "EDPB Document on Response to the Request from the European Commission for Clarifications on the Consistent Application of the GDPR, Focusing on Health Research," February 2, 2021, https://edpb.europa.eu/sites/default/files/files/file1/edpb_replyec_questionnaireresearch_final.pdf. Accessed 27 March 2023.
42 In this way also Donnelly and McDonagh (2019).
43 Likewise tending to reject broad consent: Article 29 Working Party, Guidelines on consent under Regulation 2016/679, 17/EN WP259 rev.01 (2017) 28.
44 Datenschutzkonferenz (2020).
45 Medizininformatik-Initiative (2020).
46 Hänold (2020).
47 Quinn (2021).
48 Cepic (2021).


5 Exemptions for Processing of Health Data in Research

As already mentioned above, the GDPR only permits processing of special categories of personal data, including data concerning health or genetic data, under certain circumstances. For example, the processing may be permitted for research purposes when one of the lawful bases in Article 6(1) applies and the data controller meets one of the derogations set out in Article 9(2) GDPR.49 The exemption set out in Article 9(2)(a) appears to be one of the most relevant for health research. It allows for the lawful, fair and transparent processing of personal data required by Article 5 GDPR and, at the same time, gives effect to the autonomous decision of the data subject on whether to share their data for health research purposes. Under this exemption, processing is lawful if the data subject has given explicit consent to the processing of the personal data for one or more specified purposes, except where Union or Member State law provides that the prohibition referred to in paragraph 1 may not be lifted by the data subject. In terms of a lawful basis for processing such data, it is always a matter of case-by-case analysis whether Article 9 "in itself provides for stricter and sufficient conditions, or whether a cumulative application of both Article [6] and [9] is required to ensure full protection of data subjects".50 In this regard, the EDPB stresses that only consent collected pursuant to Article 6(1)(a) and Article 9(2)(a) GDPR may provide a legal basis for the processing of data concerning health.51 The wording of Article 6 speaks in favor of this approach: it provides that processing shall "be lawful only if and to the extent that at least one of the following applies". The literal interpretation of this provision leads to the conclusion that only Article 6 contains the legal bases which ensure lawful processing of personal data, whereas Article 9 provides additional requirements applicable to special categories of data. Taking this further, in terms of the consent required for processing special categories of personal data, it is not sufficient to fulfil only the conditions laid down for consent in Article 6(1)(a) in conjunction with Article 4(11) GDPR, described in the section "Consent (Article 6(1)(a) GDPR)". Consent for processing special categories of personal data must be explicit. This is because the processing of special categories of personal data may give rise to serious data processing risks, which is why "a high level of individual control over personal data is deemed appropriate."52

49 European Commission (2021a), p. 58.
50 Article 29 Data Protection Working Party (2014), margin number 15. The articles referenced in Opinion 06/2014 are arts 7 and 8 of the Data Protection Directive, which are the forerunners of arts 6 and 9 respectively of the GDPR.
51 EDPB, Guidelines 03/2020 on the processing of data concerning health for the purpose of scientific research in the context of the COVID-19 outbreak (2020), margin numbers 17 and 18.
52 EDPB, Guidelines 05/2020 on consent under Regulation 2016/679, Version 1.1 (2020), margin number 91.
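The cumulative application of Articles 6 and 9 just described can be expressed as a simple decision rule. The sketch below is illustrative only and assumes the EDPB's cumulative reading; the enum members cover only the legal bases and exceptions discussed in this chapter, and all identifiers are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class Art6Basis(Enum):
    CONSENT = auto()               # Article 6(1)(a)
    PUBLIC_INTEREST = auto()       # Article 6(1)(e)
    LEGITIMATE_INTERESTS = auto()  # Article 6(1)(f)

class Art9Exception(Enum):
    EXPLICIT_CONSENT = auto()      # Article 9(2)(a)
    RESEARCH_EXEMPTION = auto()    # Article 9(2)(j), with Article 89(1) safeguards

@dataclass
class ProcessingActivity:
    concerns_health_data: bool
    art6_basis: Optional[Art6Basis]
    art9_exception: Optional[Art9Exception]

def is_lawful(activity: ProcessingActivity) -> bool:
    """Cumulative test: Article 6 always; Article 9 additionally for health data."""
    if activity.art6_basis is None:    # no legal basis: never lawful
        return False
    if activity.concerns_health_data:  # special categories need an exception too
        return activity.art9_exception is not None
    return True

# Example: explicit consent satisfying both layers (Art 6(1)(a) + Art 9(2)(a)).
study = ProcessingActivity(True, Art6Basis.CONSENT, Art9Exception.EXPLICIT_CONSENT)
assert is_lawful(study)
```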


The term "explicit" refers to the manner in which consent is expressed by the data subject: it cannot be implied or tacit.53 Explicitness can be achieved when the data subject expressly confirms in a written statement that he or she agrees to the processing of his or her data. The best example of appropriate fulfilment of this requirement is a signature of the data subject under the written statement, which excludes any doubt and enables the demonstration of evidence whenever required. Nonetheless, in times of digitalization, it is also crucial to ensure that explicit consent may be expressed by means other than a written statement. A data subject might be given the possibility to provide their statement by filling in an electronic form, by sending an email, by uploading a scanned document carrying the data subject's signature, by using an electronic signature, or through a telephone conversation, provided that a specific confirmation was requested from the data subject (e.g., an oral confirmation). Theoretically, an oral statement could be valid as explicit consent; nevertheless, it may be difficult to prove that all conditions for a valid consent were met at the time it was given. For example, it may be challenging to assess whether the data subjects had been appropriately informed about the processing of their data before consenting to it, and it is difficult to envisage many circumstances in which health researchers could safely rely on oral consent to data processing. One of the reasons why explicit consent is not the most favorable exception for the processing of data concerning health and genetic data in scientific research is the individual's right to withdraw consent at any time.54 When the data subject exercises this right, the data processing operations carried out on the basis of consent before its withdrawal remain lawful. Nevertheless, the controller is obliged to stop the processing concerned and, where no other lawful basis can justify the retention of the data for further processing,55 must delete it. This kind of dependency on the data subject's decision, and the uncertainty related to it, may have a negative impact on the research and its uninterrupted development and, where the data subject exercises their right, may also potentially devalue some scientific research results.56 Nonetheless, the GDPR does not restrict the application of Article 6 to consent with regard to processing data for research purposes. There are other legal bases defined in Article 6 which might be taken into account by the scientific research community. These legal bases are defined in Article 6(1)(b)–(f) GDPR and may be applied to health and genetic data in conjunction with any of the exceptions defined in Article 9(2) GDPR. There are also other potentially applicable exceptions in Article 9(2) GDPR. For instance, in the context of health research, another exemption that could constitute a valid legal basis for the processing of special categories of data is the exemption set forth in Article 9(2)(j) GDPR.

53 See Mester in Taeger/Gabel (2022), margin number 18; Schiff in Ehmann/Selmayr (2017), margin number 30.
54 GDPR, Article 7(3).
55 This interpretation is supported by the European Commission's GDPR Q&A website, https://ec.europa.eu/info/law/law-topic/data-protection/reform/rules-business-and-organisations/principles-gdpr/purpose-data-processing/can-we-use-data-another-purpose_en. Accessed 27 March 2023.
56 Chico (2018), p. 117.


Scientific research which requires the use of special categories of data may dispense with consent altogether, provided that the necessary and justified measures are implemented to safeguard the fundamental interests of the data subject. When applying the exemption indicated in Article 9(2)(j), the processing shall be performed "in accordance with Article 89(1) based on Union or Member State law, proportionate to the aim pursued, respecting the essence of the right to data protection and providing for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject".57 In other words, if Article 9(2)(j) is relied upon in the course of health research, national legislation and the safeguards required by Article 89(1) GDPR must be considered and applied. Health research initiatives should also note that, according to Article 9(4) GDPR, Member States may maintain or introduce further conditions, including limitations, with regard to the processing of data concerning health or genetic data.58 For example, Austrian legislation specifies that genetic analyses on human beings for scientific purposes may only be carried out on de-identified samples, that linkage may be conducted only by entities which have obtained the consent of the data subject in accordance with Article 4(11) GDPR, and that the results of genetic analyses may only be published where appropriate measures to prevent re-identification are implemented.59 Which legal basis is applicable or required in health research differs from Member State to Member State. Studies show that France requires informed consent, but not consent as defined in the GDPR, treating informed consent instead as a national safeguard for the participation of individuals in research. In contrast, Ireland requires explicit consent under the GDPR for both primary and secondary research.60 Considering the wording of the GDPR, it is rather clear that it was not the intent of the lawmakers to limit the legal basis for the processing of health data for research purposes only to explicit consent; quite the opposite, the legislator acknowledged that processing could also be based on the other exceptions defined in Article 9(2) GDPR, and in the above example this has been acknowledged by France. It goes without saying that these differences in approach may significantly impact the conduct of research, in particular research conducted at the transnational level. There are also doubts and issues emerging around prescreening, capacity and the use of biobank or archival materials.61 All of these aspects demonstrate the need for further discussion and clarification of the understanding and application of consent across Member States. With regard to research involving retrospectively collected data, the possibility of processing may become complicated and may depend on the scope of the initial consent.

57 Staunton et al. (2019), p. 1161.
58 GDPR, Article 9(4).
59 European Commission (2021a), p. 76; Austrian Federal Gene Technology Act (Gentechnikgesetz), paragraph 66.
60 European Commission (2021a), p. 77.
61 European Commission (2021a), p. 77.


Current research that wishes to process retrospective data must ensure that the consent given by the data subject in the initial research covers permission to process his or her data in the current research and that it covers its objectives, i.e., that explicit consent has been obtained from the data subject for the processing of his or her personal data for the purpose of specified health research, either in relation to a particular area or more generally in that area or a related area of health research, or part thereof.62 It may be that the initial consent does not cover the above-mentioned points. In such cases, it must be considered whether re-consenting is required, whether another legal basis and another exception defined in Article 9 GDPR are applicable, and whether the processing of retrospective data could be recognized as "further processing" as defined in the GDPR and described in the following section.

6 Further Processing

In addition to establishing the legal basis for processing and, where relevant, an applicable Article 9(2) exception, one must also consider whether further processing is taking place. Depending on the origin of the data being processed, the GDPR may categorize processing as "further processing", in which case an additional set of legal considerations must be observed by the controller. The question of whether further processing is taking place is particularly relevant when processing data that was collected in the past, possibly for another purpose and by another controller. As such, this section will describe what further processing is and what some of the legal and academic ambiguities are, before going on to discuss how the rules on further processing apply in the context of processing retrospective data. Further processing principles apply to both prospective and retrospective data processing. Processing retrospective data requires the controller(s) to gather knowledge of the initial legal basis and to assess compatibility requirements for processing activities that may have taken place some time ago. A discussion of further processing is thus relevant in the context of this chapter. As already discussed, Article 6 of the GDPR sets out the requirements for lawful processing.63 The Article begins by setting out the six legal bases available to controllers, but the requirements of lawfulness do not end there. Article 6(4) delivers an additional lawfulness requirement in the case of what the GDPR describes as "further processing". The GDPR defines further processing as "processing for a purpose other than that for which the personal data have been collected."64 The existence of further processing therefore hinges on the presence of new purposes, distinct from the original purposes of processing. Article 5(1)(b) on "purpose limitation" requires that personal data be "collected for specified, explicit and legitimate purposes".

62 Clarke et al. (2019), p. 1132.
63 GDPR, Article 6(1).
64 GDPR, Article 6(4).


Therefore, once the controller deviates from or goes beyond those purposes, they must, if they wish to continue processing in a lawful manner, apply the requirements of Article 6(4). Under Article 6(4), if the further processing is not based on the consent of the data subject or on Union or Member State law, the controller is required to carry out a compatibility assessment. For further processing based on consent, and in line with the concepts underpinning consent, such as data subject autonomy, knowledge and freedom, controllers should return to the data subject to obtain consent for the further processing if consent under Article 6(1)(a) was the initial legal basis. Recital 50 of the GDPR confirms that where the data subject has provided consent for the further processing, "the controller should be allowed to further process the personal data irrespective of the compatibility of the purposes."65 Gaining consent in the context of retrospective data collection may be problematic when the data was processed some time ago. That is to say, it may be challenging from a practical perspective to gain consent for further processing if, for example, the initial consent was obtained some time ago, the data subject is difficult to contact, or there are conflicting ethical considerations. Thus, when relying on consent as the initial legal basis, controllers should, where possible, consider at the outset whether further processing is envisioned as a future possibility. Alternatively, if the controller is relying on one of the other five legal bases in Article 6(1), they must perform a compatibility assessment. When performing the compatibility assessment, controllers are required to observe the non-exhaustive list set out in Article 6(4). The GDPR asks that controllers take into account the link between the initial purposes of collection and the new purposes of the planned further processing; the context of the processing, particularly with regard to the relationship between the controller and the data subject; the nature of the processing, particularly where special categories of personal data are being processed; the possible consequences of the further processing; and the existence of appropriate safeguards.66 This assessment asks controllers to weigh up the possible detrimental consequences of the further processing for the data subject (these factors are sketched schematically below). Moreover, the compatibility assessment applies to all further processing, regardless of whether it relates to prospective/newly collected data or to retrospective data. Each compatibility assessment will be unique to the controller and their processing and may include additional compatibility considerations. The WP29 notes that "compatibility needs to be assessed on a case-by-case basis".67 Additionally, the GDPR allows for near-automatic compatibility for scientific research. Article 5(1)(b) on the principle of purpose limitation makes the following statement: "Further processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes ('purpose limitation')."68

65 GDPR, Recital 50.
66 GDPR, Article 6(4); Recital 50.
67 Article 29 Data Protection Working Party (2013), p. 3.
68 GDPR, Article 5(1)(b).
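For implementers who want to document such an assessment systematically, the Article 6(4) factors and the Article 5(1)(b) research carve-out can be captured in a simple checklist structure. This is an illustrative sketch under stated assumptions, not a legal test: weighing the factors is a contextual legal judgement, the boolean fields are simplifications, and the "all factors satisfied" rule is an invented heuristic.

```python
from dataclasses import dataclass

@dataclass
class CompatibilityAssessment:
    """Checklist mirroring the Article 6(4)(a)-(e) factors."""
    purposes_linked: bool             # link between initial and new purposes
    context_meets_expectations: bool  # controller-data subject relationship
    data_sensitivity_addressed: bool  # nature of the data, e.g. health data
    consequences_acceptable: bool     # possible consequences for the subject
    safeguards_in_place: bool         # e.g. pseudonymization, encryption
    scientific_research: bool = False # triggers the Article 5(1)(b) carve-out

    def compatible(self) -> bool:
        # Article 5(1)(b): research purposes are presumed compatible,
        # provided the Article 89(1) safeguards actually apply.
        if self.scientific_research and self.safeguards_in_place:
            return True
        return all((self.purposes_linked, self.context_meets_expectations,
                    self.data_sensitivity_addressed,
                    self.consequences_acceptable, self.safeguards_in_place))
```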


The European Commission's online statement on further processing reiterates this research carve-out, stating that "[i]f your company/organization wants to use the data for statistics or for scientific research it is not necessary to run the compatibility test."69 However, the absence of the compatibility test does not reduce or eliminate the pre-existing data protection safeguards in the GDPR. The WP29 states that the research exemption "should not be read as providing an overall exception from the requirement of compatibility, and it is not intended as a general authorization to further process data in all cases for historical, statistical or scientific purposes."70 As with all processing of personal data, all of the circumstances must be taken into account to ensure the rights, freedoms and interests of the data subjects are maintained. There exists contention over whether Article 6(4) can act as a legal basis without the requirement for an Article 6(1) provision.71 This chapter takes the view that Article 6(4) is not a legal basis on its own, for the following reasons. Article 6(4) must be read in combination with Article 6(1), as can be seen from its requirement for either consent or another legal basis and compatibility. As stated in Article 6(1), processing is only lawful if one of its six provisions applies, and this applies to all processing scenarios. As such, Article 6(4) should be interpreted not as a standalone legal basis for further processing, but as a provision whose contents presuppose a legal basis, whether consent or another Article 6(1) basis, in combination with a compatibility assessment. This conclusion is supported by the wording of Recital 50, which states that in the case of further processing, "no legal basis separate from that which allowed the collection of the personal data is required." Two points can be deduced from this sentence. Firstly, further processing assumes that a legal basis under Article 6(1) already exists; secondly, if this is the case, then further processing only applies to the "initial controller", that is, the controller who started the initial processing and already had a legal basis for it but now wants to process the personal data for new purposes that are different from the original purposes. A further consideration is that, for the processing of special categories, in addition to the legal basis and adherence to the Article 6(4) requirements, one must also consider whether one of the exceptions in Article 9(2) GDPR applies. Thus, the requirement for an applicable exception to the prohibition on the processing of special categories of personal data applies in the context of further processing just as it does for initial processing. Furthermore, as discussed earlier in this section, the further processing requirements apply to the controller who already processed the personal data and who now intends to process that data for new and different purposes. Retrospective data processing is the processing of data that has already been processed, whether by the same controller or a new one.

69 European Commission (2022).
70 Article 29 Data Protection Working Party (2013), p. 28.
71 Albers and Veit in Wolff/Brink (2020), Article 6, margin number 71 ff.


In accordance with the definition of further processing and the details in Recital 50 GDPR, retrospective data processing is only further processing when the same controller who processed the same data for one purpose in the past now wishes to process the data for new purposes. For a new and separate controller who has not yet processed the retrospective data set, the Article 6(4) further processing requirements would not apply. Instead, this controller would be considered a new controller and would therefore be required to obtain a new legal basis and adhere to all of the other requirements of the GDPR. As such, in determining the application of Article 6(4), the controller collecting the retrospective data must consider whether it is data they initially collected or processed. In addition to the GDPR considerations of legal basis, processing of special categories and further processing, there are, particularly in the context of health and medical research, additional ethical considerations. The following section will complement the legal discussion to provide a fuller picture of the necessary procedural considerations for researchers by looking at the ethical aspects of processing retrospectively collected personal data.

7 An Ethical Perspective on Consent

In previous sections of this chapter, the GDPR requirements pertaining to consent for the processing of retrospective data in the context of medical research have been explored. This section will focus on ethical consent in the context of retrospective data, as well as its relationship with GDPR-based consent in scientific research. As discussed above (see the section on retrospective and prospective data collection), researchers face several issues when seeking GDPR-based consent for the use of retrospective data as a result of the strict conditions for consent outlined in the GDPR. These issues are also evident in the case of ethical consent, although compliance with ethical guidelines is not mandatory in the same way as the provisions of the GDPR. This section examines the differences and interplay between ethics-based consent and GDPR-based consent, the conditions for ethical consent, as well as the consequences of consent withdrawal.

7.1 Definition of Ethical Consent

Consent is an essential ethical requirement derived from the ethical principle of patient autonomy and basic human rights.72 Ethical consent requires that a human subject voluntarily and explicitly indicate their willingness to participate in a research study, even in the case of retrospective data. In the context of this chapter, ethical consent is based on the ethics requirements outlined in ethical guidelines on medical research studies such as the World Medical Association (WMA) Declaration of Helsinki. This Declaration is not legally binding; however, it is internationally recognized as the global ethical standard for medical research involving human subjects.

72 Horton (2002), p. 107.


The WMA is an international organization that currently represents 115 National Medical Associations and is in official relations with the World Health Organization (WHO).73 Paragraphs 25–32 of the Helsinki Declaration specifically deal with the requirements regarding informed consent from an ethical perspective. The Declaration considers the informed consent of the subject to be an essential requirement for enrolling an individual in a research study, unless the person is incapable of giving his or her consent. Paragraph 25 states that "no individual capable of giving informed consent may be enrolled in a research study unless he or she freely agrees." The consent must be given voluntarily, meaning the patient must agree freely to participate in the study and to all requirements associated with participation. In order to fulfil the requirement of being "informed", the patient must be provided with the relevant information relating to the study, such as the aims, methods, sources of funding and his or her rights. Researchers must ensure that the patient has fully understood the information and the consequences of his or her consent. These requirements are also applicable in the context of retrospective studies. In accordance with Paragraph 32 of the Helsinki Declaration, researchers must obtain the informed consent of the individuals whose personal information is stored in biobanks, databases or repositories before using it for research purposes. In contrast to the GDPR,74 the Helsinki Declaration does not provide a definition of "consent"; moreover, researchers in this field do not provide a definitive definition of "ethical consent".75 However, taking into consideration the conditions for ethical consent mentioned in the Helsinki Declaration and in the ethics literature, in this chapter ethics-based consent is defined as a voluntary, unambiguous and informed expression by human subjects of their acceptance to participate in a research study. The core elements of ethical consent and of consent stemming from data protection laws are similar and have similar definitions. However, it must be noted that these two consents have different objectives and backgrounds. In the literature, the two topics are mostly treated separately; in practice, however, confusion may arise because they are closely related. The enrollment of a study participant often requires the processing of personal data; it is therefore essential to understand the relationship between the two forms of consent.

7.2 Relationship Between GDPR-Based Consent and Ethical Consent

Ethical considerations are embedded in the provisions of the GDPR.

73 World Medical Association (2022).
74 GDPR, Article 4(11).
75 Wu et al. (2019).


The GDPR contains elements of principle-based regulation: Recital 33 clearly indicates that ethical behavior and compliance with ethical standards are essential when processing personal data, especially in the context of scientific research and the consent of human participants. In the context of health-related studies where personal data is processed, GDPR-based consent and ethical consent are connected but distinct. The two must therefore be distinguished and treated separately, as they relate to different activities that take place in a study. In terms of the GDPR, consent acts as a legal basis for the processing of personal data (Article 6(1)(a)). Where consent forms the legal basis, the conditions for consent outlined in Article 7 of the GDPR must be complied with. In the context of ethics, the consent of participants in a study is obtained to confirm their willingness to take part in the study and their acknowledgement of the requirements and risks associated with participation.76 It is often necessary for researchers to obtain both forms of consent: participants must indicate their willingness to participate in the study as well as their acceptance of all intended personal data processing activities in accordance with the GDPR.

7.3 Looking at Specific Elements of Informed Consent from an Ethical Perspective

In this section of the chapter, the relationship between GDPR-based consent and ethical consent is explored in the context of retrospective health data, looking specifically at the conditions for freely given consent and at the withdrawal of consent. The GDPR requires that consent be "freely given" (Article 4(11); see also Article 7(4)). The concept of freely given consent suggests "real choice and control for data subjects."77 As such, when data subjects feel compelled to give their consent, this does not constitute a valid legal basis for the processing of personal data. In accordance with Recital 43, consent is not a valid legal basis in instances where there is an obvious power imbalance between the data controller and the data subject. In such cases, it is highly unlikely that the data subject is able to express a real choice. Furthermore, the likelihood of "deception, intimidation, coercion or significant negative consequences" for the data subject increases when there is an obvious power imbalance.78 This concern is equally evident in the context of ethical consent. In medical research, there is often an imbalance of power between the researcher and the human subjects being researched. This imbalance occurs because the researcher often has more authority and knowledge in the field than the research subject. Furthermore, the researcher is often in control of the experiment and thus in control of the subject, their actions and their data. This is also evident in the case of retrospective data, where researchers may have more authority over data contained in databases and repositories. This disparity in knowledge, power and control could affect whether consent is freely given or not.

76 Declaration of Helsinki, paragraph 26.
77 Article 29 Data Protection Working Party (2017), p. 5.
78 Article 29 Data Protection Working Party (2017), p. 7.


The problem is further exacerbated if the researcher and patient have a dependent relationship, as is evident between patients and doctors. Patients have an intimate and vulnerable relationship with their doctors;79 it is therefore of the utmost importance that the patient trusts the doctor, especially in the context of research.80 The imbalance of power between the researcher and the human subject could result in instances of coercion, where consent is given under duress or to avoid adverse consequences. In terms of the GDPR, a legal basis other than consent may be relied on where applicable; from an ethical perspective, however, the consent of human participants is necessary for participation in a study. As such, ethical guidelines impose several conditions on consent beyond those imposed by the GDPR. Paragraph 31 of the Helsinki Declaration contains one such condition: if a patient declines to give their consent to participate in a study, this decision "must never adversely affect the patient-physician relationship."81 In this way, the Declaration acknowledges the power imbalance between patients and doctors and places the burden on the physicians conducting the research to ensure that it does not result in adverse consequences for the patient. Furthermore, in accordance with Paragraph 31, the physician is obligated to fully inform the patient about which aspects of their care are related to the research. In addition, Paragraph 27 of the Declaration of Helsinki seeks to address the issue of coercion which may arise from the disparities between the researcher and the subject. Paragraph 27 clearly states that the physician must ensure that the patient whose consent for participation in a study is sought is free from any duress. If the patient is in a dependent relationship with the physician, an appropriately qualified individual who is completely independent of that relationship should seek the informed consent. Scholars argue that a coerced consent is worse than no consent.82 A valid consent can never be sought with coercion, because this undermines the voluntary requirement of the consent (see footnote 82). From an ethical perspective, there are exceptional circumstances where consent cannot be freely given because it would be impossible or impracticable to obtain. This is often the case when dealing with retrospective data, in particular when medical research is conducted with identifiable human data collected from biobanks, databases and repositories. In these cases, it may be impossible or impractical for researchers to contact each human subject in order to obtain their consent. Under these circumstances, Paragraph 32 of the Declaration of Helsinki requires consideration and approval by the research ethics committee before the research can commence. As a consequence, human subjects may be enrolled in retrospective studies without their consent.

79 Habiba (2000).
80 Chipidza et al. (2015).
81 Helsinki Declaration, paragraph 31.
82 Wendler and Wertheimer (2017).
83 The UK Biobank is a large-scale biomedical database and research resource, which aims to help the research community to improve public health.


To avoid such circumstances, biobanks such as the UK Biobank83 adopt broad language in their consent forms, which allows for the long-term storage and use of information for all health-related research purposes, even in the case of the subject's incapacity or death.84 GDPR-based consent and ethical consent both place emphasis on the right of withdrawal. Paragraph 26 of the Declaration of Helsinki states clearly that participants in research must be able to withdraw their consent at any time without reprisal. Similarly, Article 7(3) of the GDPR states that the "data subject shall have the right to withdraw his or her consent at any time" and that "it should be as easy to withdraw as it is to give consent." Furthermore, the data subject shall be informed of the right to withdraw consent prior to giving consent. As discussed earlier in this chapter, consent in terms of the GDPR must be treated separately from ethical consent; this principle also applies to the withdrawal of consent. When GDPR-based consent is withdrawn, this affects the legal validity of ongoing personal data processing. In contrast, when ethical consent is withdrawn, this indicates that the subject no longer wishes to participate in the study. Furthermore, the withdrawal of ethical consent might not always result in the withdrawal of consent for the processing of personal data. This distinction in the context of retrospective data is well illustrated by the UK Biobank Ethics and Governance Framework.85 The UK has left the EU; however, in order to obtain and retain its adequacy decision, it implemented the UK GDPR under the post-Brexit agreements. The UK Biobank is a good example of providing different levels of withdrawal of consent, an approach that could also be adopted by European controllers. Human subjects registered in the UK Biobank are free to withdraw their consent at any time. However, the policy is more specific with respect to the content of the withdrawal and distinguishes between three types of withdrawal:

No further contact: By choosing this level of consent withdrawal, the participant indicates that they no longer wish to be directly contacted by the researcher. However, the researcher would still have permission to use the information and samples previously provided. Additionally, the participants continue to consent to indirect contact by allowing the researcher to obtain further information in the future from their health-relevant records. In our view, by choosing this option, participants withdraw the ethical consent to a specific aspect of the study, meaning that they no longer wish to be directly contacted. However, this withdrawal does not necessarily affect the consent given for the processing of personal data. This specific withdrawal of the (ethical) consent to direct contact allows researchers to further process the participant's personal data and even collect more data through health records. This is ethically justifiable because the withdrawal of consent specifically relates to direct contact and not to the continued use of data. One should consider that if consent for a specific activity is withdrawn, this does not consequently mean that the processing activity is unethical. By withdrawing the ethical consent in this context, the participant only wishes no longer to be contacted in relation to the research study. Consequently, all future contact would be unethical.

84 UK Biobank (n.d.).
85 UK Biobank (2007).


However, the processing of personal data per se is, in our view, not necessarily unethical, because the simple processing of personal data does not require ethical consent. Clear and understandable communication with the data subject is key to ensuring that they understand the content and effect of withdrawal.

No further access: By choosing this level of consent withdrawal, the participant indicates that they no longer wish to be directly contacted by the researcher and that they do not permit the future collection of further information from their health-relevant records. However, the researcher would still have permission to use information and samples provided prior to the withdrawal of consent. This option may be classified as a withdrawal of ethical consent, since the volunteer no longer wishes to participate in the study. In this case, the participant also withdraws consent in accordance with the GDPR, as no further collection of personal data is permitted. However, the use of the data already collected remains possible; in accordance with Article 7(3) GDPR, the withdrawal of consent does not affect the lawfulness of processing based on consent before its withdrawal.

No further use: By choosing this level of consent withdrawal, the participant indicates that they no longer wish to be directly contacted by the researcher and that they do not permit the future collection of further information from their health-relevant records. In addition, any information and samples previously collected would no longer be available to researchers. In this case, the researcher would destroy all data which is not anonymized and would hold the participants' information only for archival audit purposes. This option not only includes a withdrawal of consent for processing but also obliges the controller to erase all personal data. It would, therefore, not only effect the withdrawal of both ethical and data protection consent, but also give effect to the right to erasure stemming from Article 17 GDPR.

The policy does not specify how a withdrawal shall be interpreted where its extent has not been specified. In our view, however, in case of doubt the controller should consider a withdrawal of consent to be comprehensive and therefore to cover both ethical consent and consent for the processing of personal data. It is not reasonable to require the data subject to clearly understand the distinction between these two consents and their implications. Nonetheless, the controller must clearly explain all of these consequences to the data subject when seeking consent.
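The three tiers just described lend themselves to a compact schematic summary. The sketch below paraphrases this chapter's reading of the UK Biobank policy; it is not the Biobank's actual implementation, and the function and field names are invented for illustration.

```python
from enum import Enum

class WithdrawalLevel(Enum):
    NO_FURTHER_CONTACT = "no further contact"
    NO_FURTHER_ACCESS = "no further access"
    NO_FURTHER_USE = "no further use"

def permitted_after_withdrawal(level: WithdrawalLevel) -> dict:
    """Which operations remain permitted after each withdrawal tier."""
    return {
        "direct_contact": False,  # barred at every level
        "collect_new_record_data": level is WithdrawalLevel.NO_FURTHER_CONTACT,
        "use_previously_collected_data": level is not WithdrawalLevel.NO_FURTHER_USE,
        # the strictest tier additionally triggers erasure of identifiable
        # data (cf. Article 17 GDPR), save for archival audit copies
        "erase_identifiable_data": level is WithdrawalLevel.NO_FURTHER_USE,
    }

# In case of doubt, the chapter recommends the most comprehensive reading:
DEFAULT_IF_UNSPECIFIED = WithdrawalLevel.NO_FURTHER_USE
```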

7.4 Practical Application

In the literature, GDPR-based consent is often treated separately from ethical consent; in practice, however, the two are often incorrectly conflated. In the context of scientific research, to ensure that patients are aware that they are consenting to two different activities, it is suggested that both types of consent be treated separately. This could be achieved through the use of separate consent forms, one focused on GDPR-based consent and the other on ethical consent.


Alternatively, researchers could take additional steps to ensure that the patient consents to two separate activities within a single consent form. This approach has been adopted in the Horizon 2020 project InteropEHRate. During the project piloting phase, patients will be requested to participate in the InteropEHRate pilots by physically testing the tools and protocols being developed in the project.86 In addition, the personal data of patients will be processed during the pilots in order to validate the InteropEHRate tools and protocols. To ensure that participants understand that they are consenting to two different activities, the project has adopted a "tick-the-box" approach in its consent form. Using this method, patients are required to express their consent for each activity by ticking boxes that relate individually to ethical and GDPR consent. In this way, the patient consents both to participating in the pilot and to all activities relating to personal data processing. The practical application of the distinction between GDPR-based consent and ethical consent is also necessary in the context of the withdrawal of consent. It has been demonstrated that the withdrawal of ethical consent, or stopping participation in a research project, does not necessarily mean that all processing of personal data must stop; this follows from the understanding that the two forms of consent relate to different activities. As discussed, it is not always clear to participants in medical studies that they have consented to different activities stemming from different obligations (ethical and legal). In practice, it is the controller's obligation, in accordance with the principle of transparency and ethical standards, to clearly inform participants of the outcomes associated with the withdrawal of consent. If the controller fails to do so, an expression of withdrawal should be considered comprehensive, covering both the ethical and the data protection sense.
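The "tick-the-box" separation described above can also be mirrored in a data model that records the two declarations independently, so that one can be withdrawn without silently extinguishing the other. Again, this is only an illustrative sketch: the field names are invented, and the InteropEHRate project's actual forms and systems are not reproduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsentDeclaration:
    given: bool = False
    last_changed: Optional[datetime] = None

    def record(self, given: bool) -> None:
        # store the decision and when it was made
        self.given = given
        self.last_changed = datetime.now(timezone.utc)

@dataclass
class ParticipantConsent:
    # one tick box per activity: study participation vs. data processing
    ethical: ConsentDeclaration = field(default_factory=ConsentDeclaration)
    gdpr: ConsentDeclaration = field(default_factory=ConsentDeclaration)

    def may_process_data(self) -> bool:
        # Withdrawing ethical consent alone does not automatically end lawful
        # data processing; the GDPR declaration governs that question.
        return self.gdpr.given

# Example: the participant leaves the study, but data processing may continue.
p = ParticipantConsent()
p.ethical.record(True); p.gdpr.record(True)
p.ethical.record(False)
assert p.may_process_data()
```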

8 Conclusion

The processing of retrospective health and genetic data is a central building block of many health and medical research projects. As evident in the above-mentioned Horizon 2020 projects, the use of previously collected data is crucial for developing new technologies in the medical field. This not only saves time and money but also increases the variety and amount of available data. Considering the sensitive nature of the data, it is of the utmost importance for researchers to consider several aspects related to the processing of this type of data, to ensure that it is handled in a lawful and ethical manner. As outlined in this chapter, it is essential to distinguish between the GDPR consent requirements and the ethical consent obligations, even though they are complementary and feature overlapping rationales. In practice, consent is often used as a legal basis for processing personal data, as it offers researchers some advantages: it gives the data subject the most control and autonomy in comparison to other legal bases.

86 InteropEHRate Project (2020).


From an ethical perspective, consent is often required by national and international guidelines in order to involve participants in research studies. Nonetheless, consent also harbors some risks and should be carefully considered. In the context of processing retrospective data, it may occur that the consent given by a research participant in the initial research, from which the data originate, does not cover the purpose of the current research. In this regard, it should be emphasized that the GDPR sets out specific requirements for consent and for defining the purpose of processing, which are interpreted in various ways. As discussed in this chapter, there are two different approaches to this matter. One holds that consent must be specific, i.e., cover the purposes of the research for which the data are collected in a clear and specific manner. On the other hand, this may undermine the possibility of processing the data in future research, which is why some researchers argue that consent should be as broad as possible ("broad consent"). The processing of data concerning health, i.e., special categories of data, must fulfil not only the requirements set forth in Articles 5(1) and 6(1), but also the conditions outlined in Article 9(2). When processing health data, researchers have to identify an applicable exception defined in Article 9(2) GDPR. If the processing is based on consent, it must be ensured that the consent is explicit, as required by Article 9(2)(a). Additionally, choosing consent as the legal basis for the processing of retrospective data may lead to a situation in which processing must be stopped due to the withdrawal of consent by a data subject. Moreover, in the case of retrospective data, it may be difficult to reach the data subjects to request additional consent that would allow for the processing of the data in current or future health research initiatives which could benefit from the previously collected data. Therefore, to ensure that research is uninterrupted, it may be worth considering another basis for processing instead of relying on the consent expressed by the data subject. For all legal bases, purpose limitation is an intrinsic part of processing required by the GDPR. If research goes beyond the initially defined purpose, and this may be the case for the processing of retrospective data, the application of Article 6(4) may be necessary. Article 6(4) defines the circumstances under which "further processing" of retrospective data is allowed. This chapter argues that Article 6(4) GDPR is not a separate legal basis and that a legal basis defined in Article 6(1) GDPR should therefore always be identified. If the further processing cannot be based on the consent originally given by the data subject, because that consent, as the initial legal basis, did not envisage further processing as a possibility, a compatibility assessment must in principle be performed. Nevertheless, scientific research may benefit from the special rule defined in Article 5(1)(b) GDPR, which sets forth that the processing of data for scientific research purposes should not be considered incompatible with the initial purposes. Even in such an event, this does not reduce the pre-existing data protection safeguards in the GDPR. This chapter has set out the main legal and ethical aspects which should be taken into consideration when the scientific community decides to process retrospectively collected data in research.


Nevertheless, there is no golden rule applicable to all such situations. Establishing the applicable legal considerations, including the applicability of consent as a legal basis, should be done on a case-by-case basis. It is crucial to note that the GDPR recognizes the importance of research and includes a series of legal arrangements that are favorable to research and that allow researchers to build upon what has already been achieved with the data collected, so that data processing remains possible and can contribute to further scientific development.

References

Albers and Veit in Wolff/Brink (eds) (2020) BeckOK Datenschutzrecht, Artikel 6 Rechtmäßigkeit der Verarbeitung
Article 29 Data Protection Working Party (2013) Opinion 03/2013 on Purpose Limitation, April 2, 2013
Article 29 Data Protection Working Party (2014) Opinion 06/2014 on the Notion of Legitimate Interests of the Data Controller under Article 7 of Directive 95/46/EC, April 19, 2014
Article 29 Data Protection Working Party (2017) Guidelines on Consent under Regulation 2016/679, November 28, 2017
Austrian Federal Gene Technology Act (Gentechnikgesetz—GTG), paragraph 66
BIOMAP (n.d.) BIOMAP data and analysis portal | Biomarkers in atopic dermatitis and psoriasis. https://biomap-imi.eu/about/data-portal. Accessed 27 Mar 2023
Cepic M (2021) Broad consent: Die erweiterte Einwilligung in der Forschung. Beck-Online ZD-Aktuell 05214
Chico V (2018) The impact of the general data protection regulation on health research. Br Med Bull 128(1):109–118
Chipidza FE, Wallwork RS, Stern TA (2015) Impact of the doctor-patient relationship. Primary Care Companion for CNS Disord 17(5):10.4088
Clarke N, Vale G, Reeves EP, Kirwan M, Smith D, Farrell M, Hurl G, McElvaney NG (2019) GDPR: an impediment to research? Ir J Med Sci 188(4):1129–1135
Datenschutzkonferenz (2020) Pressemitteilung der Konferenz der unabhängigen Datenschutzaufsichtsbehörden des Bundes und der Länder: Datenschutzbehörden des Bundes und der Länder akzeptieren die Einwilligungsdokumente der Medizininformatik-Initiative. https://www.datenschutzkonferenz-online.de/media/pm/20200427_Einwilligungsdokumente_der_Medizininformatik-Initiative.pdf. Accessed 27 Mar 2023
Doll R, Hill AB (1950) Smoking and carcinoma of the lung. BMJ 2(4682):739–748
Donnelly M, McDonagh M (2019) Health research, consent and the GDPR exemption. Eur J Health Law 26(2):97–119
EU General Data Protection Regulation (GDPR): Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC
European Commission (n.d.) Health research and innovation. https://ec.europa.eu/info/research-and-innovation/research-area/health-research-and-innovation_en. Accessed 27 Mar 2023
European Commission (n.d.) Why the EU supports health research and innovation. https://ec.europa.eu/info/research-and-innovation/research-area/health-research-and-innovation_en. Accessed 27 Mar 2023
European Commission (2021a) Assessment of the EU Member States' Rules on Health Data in the Light of GDPR. Specific Contract No SC 2019 70 02 in the Context of the Single Framework Contract Chafea/2018/Health/03. DG Health and Food Safety, February 12, 2021

124

T. O. Anwana et al.

European Commission (2021b) Horizon Europe. https://ec.europa.eu/info/research-and-innova tion/funding/funding-opportunities/funding-programmes-and-open-calls/horizon-europe_en. Accessed 27 Mar 2023 European Commission (2022) Can we use data for another purpose? https://ec.europa.eu/info/law/ law-topic/data-protection/reform/rules-business-and-organisations/principles-gdpr/purposedata-processing/can-we-use-data-another-purpose_en. Accessed 27 Mar 2023 European Data Protection Board. Guidelines 05/2020 on Consent under Regulation 2016/679, Version 1.1, May 4, 2020 European Data Protection Board. Guidelines 03/2020 on the Processing of Data Concerning Health for the Purpose of Scientific Research in the Context of the COVID-19 Outbreak, April 21, 2021 European Data Protection Board (2021) EDPB document on response to the request from the European Commission for clarifications on the consistent application of the GDPR, focusing on health research, February 2, 2021. https://edpb.europa.eu/sites/default/files/files/file1/edpb_ replyec_questionnaireresearch_final.pdf. Accessed 27 Mar 2023 Euser AM, Zoccali C, Jager KJ, Dekker FW (2009) Cohort studies: prospective versus retrospective. Nephron Clin Pract 113(3):214–217 Habiba MA (2000) Examining consent within the patient-doctor relationship. J Med Ethics 26(3):183–187 Hallinan D (2020) Broad consent under the GDPR: an optimistic perspective on a bright future. Life Sci Soc Policy 16(1):2–18 Hänold S (2020) DSK: Zustimmung Zu Broad-Consent-Formular Der Medizininformatik-Initiative. Beck-Online ZD-Aktuell 07198 Hess DR (2004) Retrospective studies and chart reviews. Respir Care J 49(10):1171–1174 Horton J (2002) Principles of biomedical ethics. Trans R Soc Trop Med Hyg 96(1):107 InteropEHRate (n.d.) Home. InteropEHRate. https://www.interopehrate.eu/. Accessed 27 Mar 2023 InteropEHRate Project (2020) Deliverable 7.1. https://www.interopehrate.eu/wp-content/uploads/ 2021/09/InteropEHRate-D7.1-Experimentation-scenarios-and-validation-plan.pdf. Accessed 27 Mar 2023 KATY. https://katy-project.eu/. Accessed 27 Mar 2023 Mester in Taeger/Gabel (eds) (2022) DSGVO/BDSG/TTDSG, Artikel 9 Medizininformatik-Initiative. Arbeitsgruppe Consent Mustertext Patienteneinwilligung, April 16, 2020. https://www.medizininformatik-initiative.de/sites/default/files/2020-04/MII_AG-Con sent_Einheitlicher-Mustertext_v1.6d.pdf Musmade PB, Nijhawan LP, Udupa N, Bairy KL, Bhat KM, Janodia MD, Muddukrishna BS (2013) Informed consent: issues and challenges. J Adv Pharm Technol Res 4(3):134 ProCAncer-I. About the project|ProCAncer-I. https://www.procancer-i.eu/about/. Accessed 27 Mar 2023 Quinn P (2021) Research under the GDPR—a level playing field for public and private sector research? Life Sci Soc Policy 17(1):3–33 Regulation (EU) No 536/2014 of the European Parliament and of the Council of 16 April 2014 on clinical trials on medicinal products for human use, and repealing Directive 2001/20/EC Schiff A in Ehmann/Selmayr (eds) (2017). Datenschutz-Grundverordnung: DS-GVO, Artikel 9 Verarbeitung besonderer Kategorien personenbezogener Daten Staunton C, Slokenberga S, Mascalzoni D (2019) The GDPR and the research exemption: considerations on the necessary safeguards for research biobanks. Eur J Hum Genet 27(8):1159–1167 Strech D, Bein S, Brumhard M, Eisenmenger W, Glinicke C, Herbst T, Jahns R et al (2016) A template for broad consent in biobank research. Results and explanation of an evidence and consensus-based development process. 
Eur J Med Genet 59(6–7):295–309 Talari K, Goyal M (2020) Retrospective studies—utility and caveats. J Royal Coll Phys Edinb 50(4):389–402 UK Biobank (n.d.) Consent Form: UK Biobank. https://www.ukbiobank.ac.uk/media/05ldg1ez/con sent-form-uk-biobank.pdf. Accessed 6 Apr 2023

Consent and Retrospective Data Collection

125

UK Biobank (2007) Ethics and Governance Framework. https://www.ukbiobank.ac.uk/media/0xs bmfmw/egf.pdf. Accessed 6 Apr 2023 Virtual Brain Cloud. TVB-Cloud Main—TVB_Cloud. https://virtualbraincloud-2020.eu/tvb-cloudmain.html. Accessed 27 Mar 2023 Wendler D, Wertheimer A (2017) Why is coerced consent worse than no consent and deceived consent? J Med Philos: Forum Bioeth Philos Med 42(2):114–131 World Medical Association (2018) WMA—the World Medical Association-WMA Declaration of Helsinki—Ethical Principles for Medical Research Involving Human Subjects. https://www. wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-researchinvolving-human-subjects/. Accessed 6 Apr 2023 World Medical Association (2022) WMA—the World Medical Association-about us. WMA: About Us. https://www.wma.net/who-we-are/about-us/ Wu Y, Howarth M, Zhou C, Hu M, Cong W (2019) Reporting of ethical approval and informed consent in clinical research published in leading nursing journals: a retrospective observational study. BMC Med Ethics 20(1):2–10

Enabling Secondary Use of Health Data for the Development of Medical Devices Based on Machine Learning

Lea Köttering

Abstract Medical devices based on machine learning (ML) promise to have a significant impact on, and to drive advances in, healthcare. This chapter analyzes to what extent data protection law, de lege lata and de lege ferenda, enables the development of ML-based medical devices. A key aspect of this is the processing of health data, which does not originate with the developers but with the healthcare providers. ML-based medical devices are trained with a large amount of health data. According to the current legal situation under the General Data Protection Regulation (GDPR), secondary use of health data is possible in principle (Article 6 (4) GDPR). However, the consent of the data subjects faces certain difficulties, and as the following analysis shows, the development of an ML-based medical device does not necessarily constitute scientific research within the meaning of the GDPR. Therefore, this chapter argues that a separate legal basis is needed. This must be accompanied by technical-organizational measures that safeguard the rights of the data subject to a large extent, and processing should only be allowed if the general public benefits from the research on and/or deployment of the ML-based medical device. In addition, there is a need for infrastructural measures such as the establishment or expansion of intermediary bodies, given the lack of incentives, personnel capacity, and expertise among healthcare providers to share health data with a broad range of interested parties. Furthermore, to ensure a reliable output from ML-based medical devices, standards for data preparation must be established. Finally, this chapter discusses the proposal of the European Health Data Space (EHDS) and briefly examines whether this is a step in the right direction.

Keywords European health data space · General data protection regulation · Machine learning · Medical devices · Scientific research · Secondary use

L. Köttering, University of Hamburg, Hamburg, Germany
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_8


1 Introduction: Why the Development of ML-Based Medical Devices Needs Secondary Use of Health Data

Machine learning (ML)1 promises to have a significant impact on, and to drive advances in, healthcare.2 ML-based medical devices can be developed to carry out specific tasks, e.g., in the field of medical diagnostics.3 In this context, ML-based medical devices have proven to be particularly effective in medical fields such as dermatology, pathology and radiology (image recognition).4 An ML-based medical device can, for example, detect and diagnose strokes in a radiological image.5 However, a large amount of health data is required to develop an ML-based medical device.6 This data is used to train, test and validate a model (training phase).7 The training phase refers to the process of determining the ideal parameters comprising an ML model.8 In this way, the model learns patterns that it can later apply to data beyond the training examples. The aim of the development is that the model can generate a reliable output for unknown data. Returning to the example of the radiological image, the final model should be able to reliably detect and diagnose a stroke. The accuracy of the model, and thus of its output, is centrally related to the amount of training data.9 In addition, the output depends on other characteristics of the training data, such as balance and representativeness.10 Thus, depending on the task of the model, and for reasons of representativeness and to avoid potential bias, the developer might require data from different institutions and population groups (age, gender, etc.). When developing an ML-based medical device for radiological diagnostics, this may require training data from different device manufacturers and from devices with different field strengths.11 In addition, data may vary depending on differing hospital procedures.12

Footnotes:
1. In this chapter, the term machine learning is used as an umbrella term for several sub-disciplines such as deep learning, convolutional neural networks, etc.
2. For an overview, see Panesar (2019).
3. The largest group of ML-based medical tools in the EU comes from the field of diagnostics according to EIT Health and McKinsey (2020), p. 65.
4. Hosny et al. (2018), p. 500; Esteva et al. (2017), p. 115; Madabhushi et al. (2016), p. 170.
5. This is feasible, for example, using the decision support tool e-ASPECTS: https://www.brainomix.com/stroke/e-aspects/ accessed 1 March 2023.
6. Russell and Norvig (2021), pp. 45, 723. In the context of ML in radiology: Kleesiek et al. (2020), p. 64; Richter and Slowinsky (2019), pp. 5, 26; Schneider (2020), para 6.
7. Alpaydin (2016), p. 155.
8. Kop (2021), p. 3 with further reference.
9. In the context of ML in radiology, see Tang et al. (2018), Fig. 2; Langs et al. (2020), p. 7; Kleesiek et al. (2020), p. 64.
10. On unbalanced datasets, see Russell and Norvig (2021), p. 725. In the context of ML in radiology and regarding representativeness, see Langs et al. (2020), p. 7. Also, there is sometimes striking talk of "garbage in, garbage out", see Barocas (2016), p. 683.
11. Kooi (2020), para 2.3. Also highly discussed with regard to transfer learning: Choudhary et al. (2020), p. 129.
12. Kooi (2020), para 2.3; Choudhary et al. (2020), p. 129.
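To make the training phase sketched above concrete, the following minimal Python example illustrates the split of a dataset into training, validation and test sets and the fitting of a simple classifier using the scikit-learn library. The data is synthetic, and the features, labels and model choice are hypothetical stand-ins, not the pipeline of any actual medical device.

# Minimal sketch of the "training phase": split data into training,
# validation and test sets, fit a model, and check how well it generalizes
# to unseen data. All data here is synthetic; in practice the training data
# would be health data obtained from healthcare providers.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(1000, 10))                          # e.g., image-derived features
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)    # e.g., stroke yes/no

# Hold out a test set, then carve a validation set out of the remainder.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

model = LogisticRegression().fit(X_train, y_train)        # determine the model parameters
print("validation accuracy:", model.score(X_val, y_val))  # used to tune the model
print("test accuracy:", model.score(X_test, y_test))      # estimate reliability on unknown data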


Fig. 1 Data flow

The majority of health data is generated, gathered and stored by healthcare providers. In principle, however, it is not the healthcare provider who wants to develop a medical device based on ML. The institutions interested and involved in the development primarily include universities, public and private research institutions, and especially companies such as medical device producers.13 For the developer, it would be particularly resource-efficient to have access to and utilize already existing data collected by the healthcare provider. Also, depending on the task of the model, for symptomatic or medical reasons, or to ensure the reliability of outcomes when using the final model, it is essential to use data from the context of healthcare. In this respect, health data could be transferred from the healthcare context to the involved institution. For analytical reasons, and to understand what this encompasses, one should take a closer look at the individual processing steps (in the following: data flow)14: Initially, health data is processed by healthcare providers, e.g., for diagnostic purposes (initial processing). Then, the provider prepares and transfers certain health data to enable another institution to process it for its purpose (processing A). This is followed by the actual processing by the institution for its specific purpose (processing B). Against this backdrop, in this analysis the term 'secondary use' is used to describe the process of using health data collected for a specific purpose (e.g., diagnostics) in the primary context (e.g., hospital) in a temporally subsequent context for other purposes (e.g., research). This understanding also overlaps with the notion of further processing under the GDPR.15 Semantically, however, the notion of further processing may also encompass any other operation for the same purpose. This, however, is not what is intended by further processing according to the GDPR.16 If the processing is repeated for the same purpose, e.g., billing a health insurance fund,17 then there is no change of purpose that would make it further processing. Therefore, the term secondary use seems more pertinent. At the same time, the proposal of the European Health Data Space (EHDS)18 slightly blurs the understanding of the terminology just outlined. Accordingly, secondary use of electronic health data means the processing of electronic health data for the purposes set out in Chapter IV of the EHDS; the data used may include personal electronic health data initially collected in the context of primary use, but also electronic health data collected for the purpose of secondary use. In this regard, the EHDS does not necessarily require a change from an initial to a subsequent purpose. What remains is the processing of health data for a purpose. This understanding seems rather misleading in light of the term 'secondary use'.19 In this chapter, the term secondary use is ultimately used as an umbrella term particularly characterized by a change in the purpose of processing.

Footnotes:
13. In the field of radiology, particularly medical device manufacturers for MRI-/CT-devices have an increasing interest in providing software that is based on ML and supports the use of the equipment. In this context, manufacturers are also working on so-called app stores, which will enable the implementation of software from external providers in the future. See for example: syngo.via OpenApps from Siemens Healthineers.
14. See Fig. 1.
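The three-step data flow and the definition of secondary use as a change of purpose can also be expressed in a short sketch. The following Python fragment is purely illustrative (the controllers and purposes are hypothetical); it simply flags every processing operation whose purpose differs from the initial one.

# Sketch of the data flow described above: each processing operation has a
# controller and a purpose; a change of purpose relative to the initial
# processing marks "secondary use" and triggers the need for a new legal basis.
from dataclasses import dataclass

@dataclass
class ProcessingStep:
    label: str        # e.g., "initial processing", "processing A", "processing B"
    controller: str   # who processes the data
    purpose: str      # purpose of this operation

flow = [
    ProcessingStep("initial processing", "hospital", "diagnostics"),
    ProcessingStep("processing A", "hospital", "ML development"),        # preparing/transferring data
    ProcessingStep("processing B", "device producer", "ML development"),
]

initial_purpose = flow[0].purpose
for step in flow[1:]:
    if step.purpose != initial_purpose:
        print(f"{step.label}: change of purpose "
              f"('{initial_purpose}' -> '{step.purpose}'): secondary use, new legal basis required")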

2 Secondary Use of Health Data for ML Under the GDPR

The processing of health data requires a legal basis. This legal basis is determined by the purpose of the processing. Therefore, a new legal basis is required whenever health data are processed for a purpose other than the initial one (secondary use). Article 6 (4) GDPR sets out that where the processing for a purpose other than that for which the personal data have been collected is not based on the data subject's consent or on a Union or Member State law which constitutes a necessary and proportionate measure in a democratic society to safeguard the objectives referred to in Article 23 (1) GDPR, the controller, in order to ascertain whether processing for another purpose is compatible with the purpose for which the personal data were initially collected, shall take the specific aspects mentioned in points (a) to (e) into account.

First, this raises the question of whether only processing A is relevant in the light of Article 6 (4) GDPR (see Fig. 1). The wording of Article 6 (4) GDPR, however, implies that every purpose other than that for which the personal data have been collected falls within its scope. Furthermore, the legislative process has shown that Article 6 (4) GDPR is not limited to secondary use by the initial controller: Austria had criticized20 that the processing is not only open to the processor who collects the data for the first time, but also to any other processor. This criticism has not been further reflected in the law. Therefore, Article 6 (4) GDPR also applies to new controllers. Moreover, as implied above, processing A and B can be combined for one purpose, e.g., the transfer and processing of data for the development of a medical device based on ML.

Often, data controllers obtain the data subject's consent. Consent is generally considered a secure legal basis and "the golden rule" of medical ethics and of the protection of data subjects.21 However, it is not always an appropriate legal basis for processing health data,22 and it also poses a number of legal and practical problems, e.g., the accessibility of the data subject, the lawfulness of broad consent, and time and costs.23 Given the amount of data required and the parties involved, consent does not appear suited to driving the development of ML-based medical devices.

Apart from this, Article 6 (4) GDPR also provides that secondary use may be based on a Union or Member State law. At first glance, one could consider Article 9 (2) (h) or (i) GDPR. For instance, Article 9 (2) (h) GDPR provides a legal basis for medical diagnostics and Article 9 (2) (i) GDPR for ensuring high standards of quality and safety of healthcare and of medicinal products or medical devices. Although this corresponds to the broader purpose of ML-based medical devices in diagnostics, two aspects have to be taken into consideration: First, the use of ML-based medical devices for diagnostics, and their quality and safety in use, must be distinguished from their technical development; these are two different purposes. Secondly, these legal bases still require a concretizing legal basis in Union or Member State law. Currently, there is no legal basis in Union law that has already entered into force and explicitly addresses sharing health data for the development of medical devices based on ML.24 With regard to Member State law, it is questionable whether Article 6 (4) GDPR is to be understood as an opening clause allowing Member States to adopt a wide range of norms on the secondary use of data.25 In any case, Member States can enact specific legal norms for secondary use within the opening clauses in Articles 6 and 9 GDPR.26 Additionally, Recital 50 GDPR clarifies that every legal basis provided by Union or Member State law for the processing of personal data may also provide a legal basis for further processing in the meaning of Article 6 (4) GDPR.

Many Member States have made use of the opening clause in Article 9 (2) (j) GDPR. According to this, processing is lawful where it is necessary for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes in accordance with Article 89 (1) GDPR, based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection, and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject. In this regard, some Member States explicitly refer to data sharing, secondary use and further processing of health data in their legislation.27 Germany has also used this opening clause to regulate secondary use. However, the legal situation in Germany proves to be extremely fragmented and characterized by referral loops.28 This is particularly evident in the fact that the applicability of the legal provisions on secondary use of health data depends on the type of institution processing the data: a distinction is made not only between public and private entities, but also between public and private hospitals, as well as those in religious ownership. Accordingly, finding the relevant specific rules requires working through the applicable laws with considerable coordination and administrative effort.29

Apart from that, the GDPR also states that health data processing for scientific purposes shall not be considered incompatible with the initial purpose. This phrase is enshrined in Article 5 (1) (b) GDPR and refers to the compatibility test in Article 6 (4) GDPR; compatibility is therefore assumed. Accordingly, processing health data for scientific research is generally recognized as compatible with the original purpose.

As has been shown above, the data protection law in force provides for the secondary use of health data, mainly in the field of scientific research. Against this backdrop, and in light of the research question at hand, the applicability of the data protection regulations on secondary use to the development of an ML-based medical device is analyzed in the following paragraphs. In this respect, it will first be examined whether and to what extent the development of an ML-based medical device constitutes scientific research. Then, it will be analyzed to what extent the development might be compatible with the initial purpose, and it will be outlined why the obligations provided by the GDPR on secondary use are insufficient for the development of ML-based medical devices.

Footnotes:
15. Article 6 (4) GDPR.
16. Article 6 (4) GDPR: "for a purpose other than that for which the personal data have been collected".
17. On this example, see also Custer and Uršič (2016), p. 8.
18. European Commission, COM (2022) 197 final, 2022/0140.
19. It remains to be seen whether this definition will survive the legislative process without change.
20. Council of the European Union (2016), p. 5.
21. Wouters (2021), p. 210.
22. With further references: European Commission (2021), p. 77; Peloquin et al. (2020), p. 700.
23. EHDS, p. 13.
24. However, this could change when the EHDS comes into force.
25. Critically in the 36th edition: Albers and Veit (2020), para 77. Broad interpretation by the Bundesgerichtshof (Federal Supreme Court of Germany): Decision of 24.09.2019, Case No. VI ZB 39/18, para 38: "In fact, the regulatory powers of the Union and the Member State diverge with regard to initial collection and processing for a purpose compatible with the original purpose of collection pursuant to Article 6 (1) and (4) (a-e) of the GDPR, on the one hand, and with regard to further processing for a purpose incompatible with the original purpose of collection pursuant to Article 6 (4) of the GDPR, on the other." (translated by the author).
26. Kühling et al. (2016), pp. 43 et seq.
27. For an overview, see: European Commission (2017), pp. 42–82; Molnár-Gábor et al. (2022).
28. Köttering (forthcoming) Part 2 Chap. 3.
29. Molnár-Gábor et al. (2022), p. 273; Karaalp (2017), p. 284.
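For illustration only, the non-exhaustive criteria of Article 6 (4) (a) to (e) GDPR can be collected in a structured checklist, as in the Python sketch below. The field values are hypothetical, and the sketch deliberately encodes no legal conclusion: the weighing itself remains a human, case-by-case assessment.

# Illustrative checklist of the Article 6 (4) (a)-(e) GDPR factors.
# This encodes no legal conclusion: the factors feed a human, case-by-case
# balancing assessment; the weighing itself cannot be automated.
from dataclasses import dataclass

@dataclass
class CompatibilityAssessment:
    link_between_purposes: str   # (a) link between initial and new purpose
    context_of_collection: str   # (b) context, incl. controller-subject relationship
    nature_of_data: str          # (c) e.g., special categories such as health data
    possible_consequences: str   # (d) consequences for the data subjects
    safeguards: str              # (e) e.g., encryption or pseudonymization

    def documented_factors(self) -> list[str]:
        return [
            f"(a) link: {self.link_between_purposes}",
            f"(b) context: {self.context_of_collection}",
            f"(c) nature of data: {self.nature_of_data}",
            f"(d) consequences: {self.possible_consequences}",
            f"(e) safeguards: {self.safeguards}",
        ]

assessment = CompatibilityAssessment(
    link_between_purposes="diagnostics vs. technology development: weak link",
    context_of_collection="medical treatment, doctor-patient confidentiality",
    nature_of_data="health data (Article 9 GDPR special category)",
    possible_consequences="re-identification risk if training data leaks",
    safeguards="pseudonymization, access controls",
)
for factor in assessment.documented_factors():
    print(factor)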


2.1 The Concept of "Scientific Research" and Its Applicability to the Development of ML-Based Medical Devices

The development of a medical device based on machine learning is characterized by several steps. A solution approach is sought for a specific problem. For this purpose, data is collected, and a model is trained. During the training process, weights and parameters are adjusted. Depending on the choice of the task and the model, the process is either well or only poorly understood. In any case, however, the process is regularly characterized by assumptions and experiments. In this respect, there are parallels to the notion of scientific research in the context of data protection law.

The GDPR privileges scientific research in several legal provisions. This serves to balance the protection of personal data and the interests in research. Indeed, scientific research is legally protected under Article 13 of the Charter of Fundamental Rights (CFR). In this respect, the understanding of the term "scientific research" used in the GDPR must be interpreted in light of Article 13 CFR. In addition, Recital 159 GDPR and Article 179 of the Treaty on the Functioning of the European Union (TFEU) must be taken into account. Accordingly, the concept of scientific research must be understood in a broad sense.30 Discussed in particular, and formative for the concept of scientific research, are the aspects of gaining knowledge, applied research, technological developments, commercial interests, public interest in research, publication, and scientific discourse. These aspects shall be explored in the context of the development of a medical device using machine learning.

Gaining knowledge. Scientific research is characterized by the attributes of gaining insights and generating knowledge through scientific methods.31 It is described as a project that is based on methodological and ethical standards.32 This includes preparatory actions such as determining the (current) state of research, compiling, preparing and processing existing findings, and formulating hypotheses and questions, as well as experiments, interpreting data and recording them for publication.33 The development of a model can meet elements of these attributes.34 The process of development is characterized by scientific methods of data preparation, as well as training and adjusting the model.35 With reference to the data flow (Fig. 1), processing A, for example, can be considered a preparatory measure. Depending on the medical specialty, clinical context, and type of health data, the development, training and adjusting of the model (e.g., processing B) differ. While a gain in knowledge does not result from the data used per se, it may result from processing the data to develop the model. Similar to many research projects, this development process is incremental and, in a certain way, iterative.36 In this respect, not every step of development might generate new knowledge.37 Yet, it is in the inherent nature of research projects that they build on prior knowledge and existing findings.38 When it comes to developing a model for the medical field, the gain in knowledge might lie in new findings about the data preparation or the modelling process needed to develop a trustworthy model.39

Applied research. The development of a model can also be attributed to applied research. When developing a model for diagnostics, the aim is often to develop a medical device that is suitable for use in practice and generates reliable and trustworthy results.40 Considering Recital 159 GDPR, the notion of scientific research also encompasses applied research. This type of research aims at gaining knowledge that can be realized through economic application and, in this respect, is geared towards solving practical problems, developing new products, or significantly improving existing products.41 Numerous models are directed at improving efficiency and quality in diagnostics and therapy.42 However, a closer distinction must be made: the process described as development of a model does not include the mere production of a product ready for sale. Rather, the term development used in this context describes the process of finding a way to produce a model using scientific methods and means.43 Thus, product development may constitute scientific research, while mere product manufacturing does not. Accordingly, the latter is not privileged under the provisions of the GDPR, but could be protected, for example, by the freedom to conduct a business (Article 16 CFR).

Technological developments. Recital 159 GDPR states that technological developments are encompassed by the term "scientific research". In line with the aforementioned analysis, this only applies to activities employing scientific methods, not to mere product manufacturing. Developing a model for diagnostic purposes involves a training process tailored to the specific task. While some models based on machine learning are well understood, others, e.g., deep learning techniques, are not. The functions, capabilities, and logic involved in such a model can therefore be considered a subject of research. This becomes apparent not least in the numerous scientific contributions on this broad topic.

Commercial interests. A highly disputed aspect is the question of whether commercial interests conflict with the concept of research and would preclude benefiting from the privileges that the GDPR provides for scientific research. In fact, some research projects are integrated into economic functional contexts.44 However, according to Recital 159 GDPR, scientific research includes privately funded research. Also, irrespective of the type and ownership of the research institution, there is a need for (re-)funding of research projects.45 Although the type of funding may have a guiding effect on the choice of research topics, it does not affect the standards of science.46 Whether the development of a model can be considered scientific research should instead depend on whether scientific standards are met and the generation of scientific knowledge is possible. To rephrase, it is to be considered scientific research if the fruits of the research are utilized for commercial interests but the scientific process of gaining knowledge is not influenced.47 If private companies were developing a medical device based on machine learning, scientific research would likely be assumed if they had a separate research department.48

Public interest. The aspects of scientific research described and analyzed so far display a wide understanding of scientific research. Against this backdrop, it is argued that scientific research must be limited to research projects that serve the public interest.49 In addition to the scholar's personal interest in research, there is usually also an interest of the general public in the increase in knowledge that accompanies research.50 This does not follow from the wording of Article 89 (1) GDPR, according to which the public interest is only required in conjunction with archiving purposes.51 It can, however, be understood as the reason for the privileged treatment in the GDPR.52 Public interest may include diverse aspects. The crucial factor is that a project does not only serve particular or private interests, but makes a contribution to the community and serves the interest of the general public.53 In this context, for example, the development of a model to assess the purchasing interests of individual customers would not be covered.54 The development of a model for the public health system, for diagnostic purposes, or for monitoring diseases, in contrast, can be covered.55

Publication and scientific discourse. It must, however, be questioned whether and to what extent the increase in knowledge must be accessible to the general public.56 It might be required to publish the results of research and make them available to the scientific discourse.57 In reality, however, research projects may have an interest in not making findings available to the public, or only after a considerable delay.58 From an activity-related perspective, research can at least be distinguished from publishing, evaluating, and communicating.59 The aspect of gaining knowledge, which decisively shapes the concept of research, results primarily from research and not from publishing.60 Especially in consideration of the individual right under Article 13 CFR, it is furthermore questionable to what extent the research process and scientific findings are less entitled to protection if they are not made available for collective benefit.61 At the beginning of a research project, at least when keeping a critical distance from the research subject, it may not always be clear whether research results can be published.62 This might also be relevant for the development of an ML-based medical device and apply to the research activities involved.

In summary, this section provides an incentive for clarifying the concept of scientific research with regard to the development of an ML-based medical device. In doing so, various aspects that shape the concept of scientific research are examined in more detail. Although some methods of ML are well understood, others (e.g., deep learning) seem in many ways to remain a subject of research. The necessity to (re-)fund research cannot in principle preclude an activity from being considered scientific research; rather, it is crucial that the process meets scientific standards. The interpretation under the GDPR additionally suggests that the processing must be in the public interest. However, this does not necessarily require that the research results are published. The development of ML-based medical devices is designed to enhance individual health and, in this context, the social health system; it can therefore be considered a project of public interest. Although parallels can thus be found between the understanding of scientific research and the development of a medical device, it cannot generally be assumed that such development is scientific research within the meaning of the GDPR. A precise analysis must be carried out in each individual case, examining in particular the aspect of knowledge generation. Against this background, the question arises as to which parties carry out this analysis and whether sufficient legal certainty is provided. In any case, such detailed analysis does not seem apt to enable or unleash development and innovation through ML.

Footnotes:
30. Recital 159 of the GDPR.
31. Ruffert (2022), para 6; Hofmeister (2022), p. 300 with further references.
32. Article 29 Data Protection Working Party (2017), WP 259 rev.01, p. 28.
33. Trute (1994), p. 146; Hofmeister (2022), p. 300.
34. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
35. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
36. Roßnagel (2019a, b), p. 158.
37. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
38. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
39. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
40. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
41. Meszaros and Ho (2021), p. 7.
42. EIT Health and McKinsey (2020), pp. 9, 15.
43. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
44. On industry research: Trute (1993), p. 104.
45. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
46. Trute (1994), p. 427; Köttering (forthcoming) Part 3 Chaps. 8 & 16.
47. Gärditz (2022a), para 2.
48. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
49. European Data Protection Supervisor (2020), pp. 2, 26; Shabani and Borry (2018), p. 152.
50. European Data Protection Supervisor (2020), p. 2; Recital 113 sentence 4 GDPR.
51. "Processing for archiving purposes in the public interest, scientific research […]", Article 89 (1) GDPR.
52. European Data Protection Supervisor (2020), p. 2; Spiecker gen. Döhmann (2022), p. 164.
53. Schrader (2022), p. 349 with further references.
54. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
55. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
56. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
57. Schneider (2019), p. 268.
58. Trute (1994), p. 106.
59. Trute (1994), p. 722.
60. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
61. Köttering (forthcoming) Part 3 Chaps. 8 & 16.
62. Gärditz (2022a, b), para 104.

2.2 Compatibility of Initial Purpose and Purpose of Secondary Use

The GDPR holds another possibility for secondary use, data sharing, and further processing. According to Article 6 (4) GDPR, secondary use is also permitted if the initial purpose is compatible with the new purpose. This compatibility results from a balancing analysis primarily based on the criteria set out in Article 6 (4) GDPR. The criteria are not exhaustively listed but encompass the link between both purposes, the context of the processing, the nature of the personal data, possible consequences, and the existence of appropriate safeguards. In order to analyze these aspects in more detail, reference is made back to the data flow outlined above (Fig. 1). According to this, data collected for diagnostic purposes in a healthcare context are transferred to another controller for the development of an ML-based medical device. Although the medical device may later be used for diagnostic purposes, this must be distinguished from technology development. While the secondary context aims at technology development, the primary context processes the data for individual purposes. When assessing the link, the perspective of the data subject is crucial.63 From this perspective, these two purposes cannot be considered intuitively linked. In addition, when health data is collected and processed within the medical treatment context, it is characterized by a special relationship of trust and confidentiality between the doctor and the patient.64 Recital 50 sentence 10 GDPR highlights that professional obligations of secrecy can prohibit processing for secondary purposes. This results not least from the particular sensitivity of health data. However, the data subject as an individual is not the focus during the development of an ML-based medical device.65 There is no interest in the health status of the individual person, yet technical-organizational measures, as well as data subject rights, would still have to be respected. This alone, however, should not be sufficient for secondary use to be compatible for the purpose of developing an ML-based medical device. Therefore, the compatibility test does not seem to be a real alternative for using health data for the development of an ML-based medical device.

2.3 Why the Obligations of the GDPR on Secondary Use for ML Are Insufficient

Although the GDPR and Member State law allow the processing of health data for secondary purposes, and in particular for purposes of scientific research, the provisions are ambiguous even for legal professionals. Thus, when healthcare providers apply data protection law to determine whether health data sharing is permissible, the provider bears a significant responsibility and liability risk. This is especially so since healthcare providers are neither designed nor equipped to share health data with a wide range of researchers and other interested parties in a privacy-compliant manner. This would require not only the preparation and secure transfer of health data, but also a detailed legal examination. For example, health data must be pseudonymized/anonymized, and it must be ascertained whether the other party's project is scientific research or to what extent the purposes are compatible within the meaning of the GDPR. In this regard, healthcare providers lack personnel capacities and adequate expertise. Admittedly, most institutions have a legal department. However, healthcare providers simply lack an incentive to make the effort to prepare and share data appropriately.66 Rather, it may be observed that data protection law is also used as a pretext not to share data. The underlying reasons for this range from personal researcher interests67 (e.g., to obtain a certain qualification (dissertation) or to be first to publish on a specific topic68) to the interest of physicians in not being compared in their performance and in not violating the duty of confidentiality. The exact reasons seem to be manifold and intertwined. When data sharing is considered, the interpretation of the regulations varies widely among institutions and personnel. This leads to further practical problems when it comes to the development of ML-based medical devices. These problems range from organizational and legal effort to a lack of accurate data.69 Different institutions follow different medical procedures and data collection practices. It might also be necessary to verify the reliability of the data and compare processes in order to detect inconsistencies, because inconsistencies can lead to (un)noticed bias in the model.70

To sum up, the obstacles to secondary use of health data do not only lie in the rules provided by the GDPR but also in other characteristics of this sector. While consent might be dysfunctional for some research projects71 or for the development of an ML-based medical device, the other provisions on secondary use are too vague and undercomplex to properly address secondary use for ML. Healthcare providers lack the capacity, equipment, processes, and legal expertise to meet the demand for health data in a privacy-compliant manner. There is no legally secure basis on which to share health data for the development of a medical device based on ML. In any case, the current framework does not unleash the potential of ML-based medical devices within the EU to compete with other innovations, e.g., from Silicon Valley.72

Footnotes:
63. Roßnagel (2019a), para 34 with further references.
64. This already goes back to the Hippocratic Oath.
65. Köttering (forthcoming) Part 3 Chaps. 8 & 17.
66. OECD (2019), on misaligned incentives, limitations of current business models and markets, and externalities of data sharing and re-use.
67. See also Devriendt (2022), p. 3008.
68. OECD (2019), on misaligned incentives, limitations of current business models and markets, and externalities of data sharing and re-use.
69. This is at least to be assumed when it comes to the functionality of an ML-based medical device.
70. Russell and Norvig (2021), pp. 40, 672 et seq.; Hildebrandt (2023), para 4.1. On unbalanced datasets, see Russell and Norvig (2021), p. 725.
71. See also Peloquin et al. (2020), p. 700.
72. Kop (2021), p. 5.


3 Enabling Secondary Use for the Development of ML-Based Medical Devices: Conducive Approaches

To achieve the EU's goal of a data single market and data ecosystem,73 the barriers outlined with regard to the GDPR must be overcome. To this end, an appropriate balance must be found between data protection rights and the interests in secondary use. This is also envisioned for ML-based medical devices.74 Therefore, it is necessary to work on conducive approaches. The following section highlights three aspects that arise from the ongoing debate about secondary use for research, the development of ML-based medical devices and, in particular, the problems posed by the GDPR in this context.

3.1 Explicit Legal Bases for the Processing of Health Data for the Development of ML-Based Medical Devices

So far, neither consent nor Article 6 (4) GDPR provides a suitable legal basis to enable the development of ML-based medical devices.75 Also, such development does not in all cases clearly constitute scientific research in the sense of the GDPR. Certainly, the lack of a specific legal basis may have been in the legislator's mind: the legislator may have intended to allow processing for the purpose of developing an ML-based device only under the existing conditions. Additionally, a need for regulation cannot, in principle, be derived from the mere absence of a legal basis. However, as a consequence of at least four aspects, a regulatory gap can be identified: these include the opening clauses of the GDPR, which in principle allow secondary use, the announcement of the EHDS,76 as well as the European Data Strategy and the European AI Strategy.77 The EU is seeking innovation and appropriate regulatory options in this area. In addition, the development may be protected by the freedom of research (Article 13 CFR) and the freedom to conduct a business (Article 16 CFR). So, a balance must be reached between data protection law (Article 8 CFR) and these interests. Finally, especially in light of significant advances in prevention, diagnosis, and treatment,78 it is in the public interest that ML-based medical devices are developed and deployed in practice.

In this regard, health data have emerged as a valuable resource,79 especially for the development of an ML-based medical device. Therefore, health data should not be stored in inaccessible data silos, but used for the public and common good.80 Data, moreover, does not lose its content however often it is processed.81 Against this background, we should not conceptualize exclusive rights or ownership of data.82 Nor can the GDPR be interpreted as allocating partial and/or limited ownership rights in personal data to any legal subject.83 Moreover, there is no need for exclusive or ownership rights in data.84 In fact, the GDPR offers space and flexibility to start sharing personal data and processing it.85 We should rather think of, and enable, infrastructural use of health data.86 This requires appropriate regulatory structures that consider content aspects (such as the identifiability of individuals, healthcare status, etc.), the domain of the data (e.g., private hospitals, public research institutions, personal wearable devices), and the data origins (machine-generated, observed, derived, inferred or provided data).87 In addition, this should include frameworks for access rights, processing rights, and appropriate safeguards.88 In line with the GDPR, access and processing rights, as well as the regulatory framework in general, should be less focused on specific authorized persons and stakeholders or on types of processing, but rather provide rules and safeguards related to the purpose and context of processing.89 Given the development of ML-based medical devices, and to realize their potential, the regulatory structure should provide a legal basis for processing health data.90 This legal basis must be compatible with Article 8 CFR. For example, the regulation could include explicit legal bases for development and production, as well as for specific research projects using machine learning. To effectively enable the development of ML-based medical devices, it may not be possible to distinguish strictly between processing steps that follow scientific methodology and those that represent mere production of a product. However, processing purposes should also not be defined too broadly; otherwise, they can neither adequately serve the protected interests nor provide legal certainty. In any case, processing should be limited to those purposes that serve the public at large.91 Accordingly, it should not only serve the individual patient, but in particular the general public. To this end, standards and solid criteria must be established to assess the contribution to the general public. The ML-based development of medical devices promises significant advancement in prevention, diagnostics, and treatment.92 Such devices are therefore likely not only to serve individual health issues, but also to speed up workflows and reduce costs.

Footnotes:
73. European Commission, COM (2020) 66 final, pp. 1, 5.
74. European Commission, COM (2020) 66 final, p. 1; see also "Health Research and Innovation", available at: https://research-and-innovation.ec.europa.eu/research-area/health_en accessed 2 March 2023.
75. See above, Sect. 2 Secondary Use of Health Data for ML Under the GDPR.
76. European Commission, DG Health and Food Safety (2021), p. 139; European Commission, COM (2020) 66 final, p. 22.
77. European Commission, COM (2020) 66 final, p. 29; European Commission, COM (2018) 237 final, p. 3.
78. Schuessler et al. (2022), pp. 338–339.
79. Kop (2021), pp. 1, 8; Drexl (2019), p. 19.
80. Kop (2021), pp. 1, 8.
81. Trute (2017), p. 84.
82. Kop (2021), pp. 1, 8; Jurcys et al. (2020), p. 7; European Commission, SWD (2017) 2 final, p. 17.
83. Kop (2021), p. 8. Rather, a rethinking of classical property law is necessary. See in this regard, Kop (2021), p. 8.
84. Drexl et al. (2016), p. 2.
85. Kop (2021), p. 8.
86. Further on the topic of data as infrastructure, see OECD (2015), pp. 177–206; Trute (2017), pp. 87 et seq.
87. Kop (2021), p. 3 with further references.
88. Kop (2021), p. 1.
89. Kop (2021), pp. 11–12 states that in some cases, it may be necessary to designate authorized persons. However, this might be less future-proof, and it also does not necessarily guarantee sufficient protection.
90. Kop (2021), p. 11; see Article 8 (2) CFR.
91. Kop (2021), p. 10.
92. Schuessler et al. (2022), pp. 338 et seq.

3.2 Infrastructure and Intermediary: Extension of the Concept for the Use of Research Data

Another aspect that should facilitate secondary use of health data is the establishment of infrastructures and intermediaries for secondary use. Under current law, a relatively large amount of responsibility falls on healthcare providers. Therefore, it seems necessary to redistribute responsibilities and institutionalize the governance of sharing health data for the development of ML-based medical devices.93 Agents, intermediaries, data trustees, or data-sharing bodies that have legal and medical expertise can contribute to this effort.94 Such institutions can track and enable a systematic secondary use of health data among many stakeholders.95 Thus, secondary use would no longer primarily depend on the consent of individual data subjects or the willingness of healthcare providers to make health data available. However, healthcare providers would need to collaborate with these bodies, institutions, or data trustees. Yet, healthcare providers have few incentives or interests of their own to share data; this requires a legal obligation. The bodies do not necessarily have to be government institutions. It is also feasible to entrust private and/or certified bodies with structured data sharing for secondary use. However, private bodies might lack the financial basis.96

Given the legal uncertainties of the GDPR, it also seems essential to develop consistent decision-making routines. Such bodies could develop transparent decision-making practices that further flesh out the legal bases and criteria for secondary use. To do so, however, these bodies need to be equipped with legal, scientific and medical expertise. Through this interdisciplinary expertise, and across a variety of data inquiries, agreements and collaborations among institutions, understandable, consistent, and constructive solutions can be found.

Some approaches for such bodies or institutes already exist. However, data access is often limited to certain authorized persons or certain types of data (e.g., health insurance data). Moreover, these approaches focus on health data sharing for research purposes and often on access through research institutions.97 Thus, the deployment of research results/products and the development of ML-based medical devices are not addressed. Although there are several parallels between the development of ML-based medical devices and scientific research, there may be some projects that can contribute to the efficiency of healthcare but cannot be considered scientific research. Even if one considers a project to be scientific research within the meaning of the GDPR, the necessary legal certainty is lacking. This brings one back to the necessity of a legal basis for health data sharing for the development of ML-based medical devices. However, due to the parallels with typical research projects, it seems resource-efficient if these bodies or institutions also manage secondary use for the development of ML-based medical devices. In any case, the infrastructure already developed should inspire and shape new ideas for intermediaries and procedures.98 Ideas that concentrate on federated health data sharing procedures are promising because they reduce the number of parties involved and, thus, the risks to privacy.99 However, this does not eliminate the need for intermediaries or data-sharing bodies. The previous analyses of the obligations of the GDPR and the practical problems show that healthcare providers are not able to provide adequate decision-making routines, and the multitude of stakeholders has not been able to establish standards. Therefore, compared to the anticipated demand for health data, a decision-making body seems necessary to enable secondary use for the development of ML-based medical devices.

3.3 Standards of Data Preparation

Following on from the decision-making routines mentioned earlier, this aspect requires standards not only in terms of data access, but also in terms of how the data is prepared, e.g., gathered, pseudonymized and anonymized. The risk-based approach of the GDPR regarding pseudonymized and anonymized data has been increasingly discussed.100 Consistent legal assessment and appropriate data preparation could be developed through collaboration among data trustees or data-sharing bodies. In this respect, institutions should work with secure methods, such as differential privacy and k-anonymity.101 The provision of personal data should only be possible in exceptional cases, considering the risks for data subjects and the sensitivity of health data. In this context, it must also be considered that, depending on the task of the ML model, training data can be retrieved through targeted attacks.102

Another problem is the comparability and representativeness of data from different institutions. This is not only due to the lack of standardized electronic health records.103 People with similar health problems do not necessarily seek medical advice or treatment, due to differences in access to healthcare, income, and education.104 Furthermore, different work processes affect the content of health data.105 Especially in the development of ML-based medical devices, the accuracy of the data can have a major impact, and shortcomings can cause incompleteness and bias.106 Therefore, methods must be found to make differences in the data apparent to developers. For example, information could be provided about the domain the data originates from and, if necessary, about special processes. However, the risk of individual persons being identified must be kept very low. If necessary, re-identification prohibitions could also be introduced.

Footnotes:
93. Panagopoulos et al. (2022), p. 3.
94. On this subject in general, see also Devriendt et al. (2022).
95. Richter et al. (2019), p. 10; Devriendt et al. (2022), p. 3009.
96. However, they could be remunerated for certain services, such as the preparation of data.
97. European Commission, DG Health and Food Safety (2021), pp. 98 et seq. Also, the secondary use of health data for typical research projects is not free of difficulties; there are a number of limitations and conditions.
98. For regulatory mechanisms developed by Member States, see European Commission, DG Health and Food Safety (2021), pp. 98 et seq.
99. Kaissis et al. (2020), pp. 305 et seq.; Rossello et al. (2021).
100. See, e.g., Peloquin et al. (2020), p. 698; Mourby (2020).
101. Article 29 Data Protection Working Party (2014) WP216; see also Kolain et al. (2022) and Wood et al. (2018).
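As a purely technical illustration of the preparation standards discussed in this subsection, the following Python sketch checks a simple k-anonymity condition over quasi-identifiers and attaches provenance metadata about the data's origin. The column names and threshold are hypothetical, and passing such a check does not by itself amount to anonymization in the legal sense.

# Illustrative data preparation step: check k-anonymity over quasi-identifiers
# and record provenance so developers can see where the data originates.
# A passing check here does NOT by itself establish GDPR-grade anonymization.
import pandas as pd

records = pd.DataFrame({
    "age_band":  ["60-70", "60-70", "60-70", "70-80", "70-80", "70-80"],
    "sex":       ["f", "f", "f", "m", "m", "m"],
    "diagnosis": ["stroke", "stroke", "no stroke", "stroke", "no stroke", "stroke"],
})
records.attrs["provenance"] = {
    "domain": "public hospital",                 # domain of the data
    "device": "MRI, 3T",                         # e.g., manufacturer/field strength
    "workflow_notes": "standard acute stroke protocol",
}

def satisfies_k_anonymity(df: pd.DataFrame, quasi_identifiers: list[str], k: int) -> bool:
    """Every combination of quasi-identifier values must occur at least k times."""
    return int(df.groupby(quasi_identifiers).size().min()) >= k

print(satisfies_k_anonymity(records, ["age_band", "sex"], k=3))  # True for this toy data
print(records.attrs["provenance"])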

4 EHDS: A Way Forward?

The European Commission seems to have recognized the legal uncertainties and practical problems that persist in the secondary use of health data, as it proposed the EHDS in May 2022. The EHDS is based on the Data Governance Act (DGA) and the European Data Strategy.107 Against this backdrop, the EHDS envisions an ecosystem of health data.108 It addresses data processing in the primary context, interoperability measures regarding electronic health records, and health data processing in the secondary context. In the following, the proposed process of secondary use under the EHDS is outlined. The analysis addresses the first draft of the EHDS, which will most likely change in some respects during the legislative process. Accordingly, the following paragraphs focus on the main regulatory approaches to secondary use of health data for ML. The text then briefly outlines to what extent the EHDS responds to issues within the GDPR and thus enables the development of ML-based medical devices. Finally, the question of whether the EHDS introduces a new perspective on data (sharing) is elaborated on.

Footnotes:
102. Shokri et al. (2017).
103. Kaissis et al. (2020), p. 305; EIT Health and McKinsey (2020), p. 91; OECD (2015), p. 340.
104. Hildebrandt (2023), para 4.1.
105. Kooi (2020), para 2.3; Choudhary et al. (2020), p. 129.
106. Hildebrandt (2023), para 4.1.
107. European Commission, COM (2022) 197 final, pp. 1, 4.
108. European Commission, COM (2020) 66 final, p. 5.


4.1 Secondary Use of Health Data for the Development of ML-Based Medical Devices: An Overview of Legal Bases and Procedures The European Commission has formed a concept that pursues the enabling of secondary use of health data. The concept is characterized by infrastructural and procedural approaches. These approaches aim to interact with existing bodies, e.g., data protection agencies, the European Data Innovation Board. Furthermore, the EHDS provides for central platforms, e.g., MyHealth@EU and HealthData@EU, to transfer and disclose health data e.g., through healthcare providers. It is a certainly ambitious undertaking not only regarding the infrastructural and procedural implementation but especially in light of interoperability measures and affordability.109 Beyond that, the concept of secondary use follows a rather simple procedure (Fig. 2): Specific types of data applicants are allowed to request data for defined purposes.110 The request is considered and verified by a health data access body (HDAB), which is established by the Member States.111 After they review and examine the data request, they issue a data permit if the data applicant meets the requirements.112 The health data is particularly gathered in the primary context and obliged to be disclosed by data holders.113 With regard to the outlined data flow, the HDAB functions as an administrative, organizing, and intermediary institute. Against this backdrop, the data flow could be abstracted as shown in Fig. 3. In Recital 37, the EHDS elaborates on the compatibility with the GDPR. Accordingly, the EHDS constitutes the legal basis under Article 9 (2) (g), (h), (i) and (j) GDPR. While the EHDS creates the legal obligation in the sense of Article 6 (1) (c) GDPR for disclosing the data by the data holder to the HDAB, the data applicant shall demonstrate a legal basis pursuant to Article 6 (1) (e) or (f) GDPR. In respect of Article 6 (1) (e) GDPR, another reference is made to EU or national law, which is different from the EHDS, mandating the user to process personal health data for

Fig. 2 Secondary use of health data under the EHDS
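To make the permit procedure depicted in Fig. 2 concrete, the following minimal Python sketch models the request-review-permit flow. It is an illustration only: the class and function names, the purpose categories, and the single-check review are hypothetical simplifications of the purposes listed in Article 34 EHDS and of the permit procedure before the HDAB.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Purpose(Enum):
    # Hypothetical, simplified stand-ins for the purposes in Article 34 EHDS.
    SCIENTIFIC_RESEARCH = auto()
    ALGORITHM_TRAINING = auto()   # e.g., training an ML-based medical device
    ADVERTISING = auto()          # not a permitted secondary-use purpose

PERMITTED_PURPOSES = {Purpose.SCIENTIFIC_RESEARCH, Purpose.ALGORITHM_TRAINING}

@dataclass
class DataRequest:
    applicant: str
    purpose: Purpose
    datasets: list[str]

def hdab_review(request: DataRequest) -> bool:
    """HDAB review: issue a data permit only for permitted purposes.
    A real review would check far more (safeguards, data minimization,
    applicant eligibility, etc.)."""
    return request.purpose in PERMITTED_PURPOSES

request = DataRequest(applicant="ml-device-developer",
                      purpose=Purpose.ALGORITHM_TRAINING,
                      datasets=["ehr-imaging-cohort"])
print("data permit issued" if hdab_review(request) else "request refused")
```

The point of the sketch is the separation of roles: the applicant never approaches the data holder directly; the HDAB sits in between as the reviewing and intermediary body, mirroring the data flow abstracted in Fig. 3.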

109 European Commission, COM (2022) 197 final, p. 16.
110 Article 34, 35, 47 EHDS.
111 Article 36, 37 EHDS.
112 Article 46 EHDS.
113 However, as the definition of "secondary use" (Article 2 (2) (e) EHDS) indicates, health data may also be gathered for the purposes outlined in Article 34 EHDS.



Fig. 3 Data flow under the EHDS

If the lawful ground for processing is Article 6 (1) (f) GDPR, it is the EHDS that provides the safeguards.115 The proposal leaves it open whether and to what extent the development of ML-based medical devices can be considered scientific research or should instead be assessed in the context of the public interest in the area of public health, medical diagnosis, or the provision of health, social care, or treatment (Article 9 (2) (g), (h), (i) GDPR).116 The legislator has, in any case, recognized the importance and innovation potential of ML-based medical devices through a separate legal basis. In addition to secondary use for development and innovation activities for products and services, the proposal provides a legal basis for the training, testing, and evaluation of algorithms, including medical devices, AI systems, and digital health applications. Compared to the scientific research exception in the GDPR, the proposal explicitly requires that the device, system, or application contribute to public health or social security, or ensure high levels of quality and safety of healthcare, of medicinal products, or of medical devices. This also applies to medical products and devices intended to increase the efficiency and quality of diagnostics and therapy, e.g., clinical decision support systems.

4.2 Enabling Secondary Use of Health Data for ML: Implications from the EHDS

The proposal of the EHDS so far promises some solution-oriented approaches.

114 Recital 37.
115 Recital 37.
116 Recital 37 seems rather to assume that the legal bases overlap.



Four aspects of how the EHDS addresses barriers to the secondary use of health data for the development of ML-based medical devices, compared to the GDPR, can be highlighted. First, the EHDS establishes legal bases and requirements for secondary use of and access to data based on the purpose of the processing. Second, the EHDS introduces HDABs, which can be considered a type of intermediary. Third, the preparation and quality of data are addressed by the EHDS: HDABs are obliged to prepare data in a pseudonymized or anonymized manner,117 and data sets shall be assigned data quality labels118 (a minimal pseudonymization sketch follows below). Fourth, healthcare providers can no longer evade data sharing behind pretexts, as sharing is legally mandated.119 However, numerous HDABs can be established by the Member States and, in addition, data can continue to be requested from individual data holders. It therefore remains to be seen whether the final EHDS will enable a consistent legal assessment and guarantee high-quality health data for the development of ML-based medical devices.
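The pseudonymization duty mentioned in the third point can be illustrated with a short sketch. A keyed hash is one common technique for deriving stable pseudonyms; it is not mandated by the EHDS, and the key handling and field names below are hypothetical.

```python
import hmac
import hashlib

# In this sketch the secret key is held only by the HDAB; whoever holds it
# can re-link pseudonyms, so the output remains personal data under the GDPR.
SECRET_KEY = b"replace-with-a-securely-managed-secret"

def pseudonymize(identifier: str) -> str:
    """Derive a stable pseudonym: identical inputs map to the same token
    (preserving linkage across data sets), but the token cannot be reversed
    without the key."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

record = {"patient_id": "PID-1234567", "diagnosis": "I10"}
prepared = {"pseudonym": pseudonymize(record["patient_id"]),
            "diagnosis": record["diagnosis"]}
print(prepared)  # e.g., {'pseudonym': '9f2c...', 'diagnosis': 'I10'}
```

Note that pseudonymized data remain personal data under the GDPR, since the key holder can re-identify individuals; only anonymized data fall outside the GDPR's scope, which is why the EHDS distinguishes the two forms of preparation.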

4.3 Does the EHDS Introduce a New Perspective on Data (Sharing)?

Overall, the proposal leaves the impression of a significant advancement in terms of secondary use of health data. While the discussion under the GDPR focused primarily on the rights of the data subject,120 under the EHDS the influence of the data subject on secondary use is rather diminished.121 Health data shall be able to be shared without consent.122 Furthermore, neither a right of objection nor any reciprocal benefit for data subjects is foreseen. In this respect, the legislator is refraining from concepts such as data ownership or licensing. The EHDS indicates that health data are instead understood as a common good. In this regard, the proposal reinforces concepts of sharing,123 access rights, and intermediary bodies. These bodies pursue their own interests neither in obtaining access nor in granting it. These approaches align with other data sharing proposals in the EU, such as the Data Act. However, in view of increasing criticism, it is very likely that at least an opt-out option will be provided in the further legislative process in order to address data protection concerns.

117 Article 44 EHDS.
118 Article 56 EHDS. Critically: Hildebrandt (2023), para 4.2.
119 Article 33 EHDS.
120 This is particularly expressed in the frequently chosen legal basis of consent. Also, the right to data portability gives the data subject rights of disposal.
121 This primarily concerns secondary use. In general, the EHDS shall also ensure that data subjects can exercise their rights. European Commission, COM (2022) 197 final, p. 1.
122 Article 33 EHDS.
123 The EHDS also provides for altruism rules. Whether these rules become applicable next to the obligation to share data by the data holder seems unclear.



5 Conclusion

The significant impact and advancements in healthcare promised by ML-based medical devices are being thwarted, if not prevented, by current law and practical challenges. The GDPR provides only a few rules for the secondary use of health data, and these are characterized by broad terms and legal uncertainties. Although consent as a legal basis would particularly protect data subjects while enabling the secondary use of health data, it is fraught with a number of legal and practical problems. The development of ML-based medical devices partially fulfills the characteristics of scientific research within the meaning of the GDPR. However, the meaning and scope of that term in European law have not been conclusively clarified. Moreover, ML procedures are very diverse, and some are already well explored. Overall, there is no explicit, enacted legal basis for the development of an ML-based medical device.

Aside from these legal issues, healthcare providers who handle large amounts of data lack incentives to share them. They also lack personnel capacity and expertise. Therefore, new regulatory structures are needed. Data should be less characterized by exclusive rights and more viewed as a common good. The protection of data subjects can still be ensured. An explicit legal basis is needed for the development of ML-based medical devices; an essential criterion should be that the development contributes to the public good. It seems most promising to expand existing approaches to sharing health data for research purposes to include use for developing ML-based medical devices. It is also necessary that intermediary bodies or trustees review data requests and prepare the data. In this way, a consistent decision-making practice can emerge. Healthcare providers must be obligated to collaborate with these intermediary bodies. In addition, standards for health data collection and processing must be established to achieve comparability and to identify, or better yet eliminate, bias.

The EHDS is a significant step in the right direction. However, it remains to be seen to what extent the proposal will survive the mills of legislation and provide more uniformity beyond the borders of the Member States. In addition to the EHDS, a number of new legislative proposals and infrastructure projects are being proposed and designed by the EU. These interlock and build on each other. All of them aim at making data available, establishing intermediaries, and enabling data sharing within the EU and between different sectors. In this respect, the EU has recognized that it is no longer just about data protection law: data law is much more diverse.

References

Albers M, Veit RD (2021) Artikel 6 Rechtmäßigkeit der Verarbeitung. In: Brink S, Wolff HA (eds) Beck'scher Online-Kommentar, 36. Edition. C.H. Beck, Munich
Alpaydin E (2016) Machine learning. The MIT Press, Cambridge, Massachusetts



Article 29 Data Protection Working Party (2014) Opinion 5/2014 on anonymization techniques, WP216
Article 29 Data Protection Working Party (2017) Guidelines on consent under Regulation 2016/679, 17/EN WP259 rev.01
Barocas S, Selbst AD (2016) Big data's disparate impact. Calif Law Rev 104(3):671–732
Bundesgerichtshof (Federal Supreme Court of Germany) (2019) Decision of 24.09.2019—Case No. VI ZB 39/18
Choudhary A, Tong L, Zhu Y, Wang MD (2020) Advancing medical imaging informatics by deep learning-based domain adaptation. IMIA Yearb Med Inform 2020:129–138
Council of the European Union, General Secretariat (2016) CM 2213/16, pp 1–6
Custers B, Uršič H (2016) Big data and data reuse: a taxonomy of data reuse for balancing big data benefits and personal data protection. Int Data Priv Law 6(1):4–15
Devriendt T, Shabani M, Lekadir K, Borry P (2022) Data sharing platforms: instruments to inform and shape science policy on data sharing? Scientometrics 127(6):3007–3019
Drexl J, Hilty RM, Desaunettes L, Greiner F, Kim D, Richter H, Surblyte G, Wiedemann K (2016) Data ownership and access to data—Position statement of the Max Planck Institute for Innovation and Competition of 16 August 2016 on the current European debate, pp 1–12. https://www.ip.mpg.de/fileadmin/ipmpg/content/stellungnahmen/positionspaper-dataeng-2016_08_16-def.pdf Accessed 1 March 2023
Drexl J (2019) Legal challenges of the changing role of personal and non-personal data in the data economy. In: De Franceschi A, Schulze R (eds) Digital revolution—New challenges for law: data protection, artificial intelligence, smart products, blockchain technology and virtual currencies. Nomos, Baden-Baden, pp 19–42
Eit Health, McKinsey (2020) Transforming healthcare with AI—The impact on the workforce and organisations. https://eithealth.eu/wp-content/uploads/2020/03/EIT-Health-and-McKinsey_Transforming-Healthcare-with-AI.pdf Accessed 1 March 2023
Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature 542:115–118
European Commission, SWD (2017) 2 final, Commission Staff Working Document on the free flow of data and emerging issues of the European data economy, accompanying document to the Communication "Building a European Data Economy" (COM (2017) 9 final), pp 1–49
European Commission, COM (2018) 237 final, Artificial Intelligence for Europe
European Commission, COM (2020) 66 final, A European Strategy for Data
European Commission, DG Health and Food Safety (2021) Assessment of the EU Member States' rules on health data in the light of GDPR
European Commission, COM (2022) 197 final, 2022/0140 (COD), Proposal for a Regulation of the European Parliament and of the Council on the European Health Data Space
European Data Protection Supervisor (2020) A preliminary opinion on data protection and scientific research
Gärditz KF (2022a) Artikel 5 Absatz 3 Freiheit von Wissenschaft, Forschung und Lehre. In: Dürig G, Herzog R, Scholz R (eds) Grundgesetz-Kommentar, 99. Edition. C.H. Beck, Munich
Gärditz KF (2022b) Kapitel 11 Recht der medizinischen Forschung. In: Huster S, Kingreen T (eds) Handbuch Infektionsschutzrecht. C.H. Beck, Munich
Hildebrandt M (2023) Ground-truthing in the European Health Data Space. In: Keynote, 13th International Joint Conference BIOSTEC 2023, Biomedical Systems and Technologies (forthcoming)
Hofmeister H (2022) The protection of scientific freedom under the European Charter of Fundamental Rights: a critical analysis. In: De Gennaro I, Hofmeister H, Lüfter R (eds) Academic freedom in the European context: legal, philosophical and institutional perspectives. Palgrave Macmillan, London
Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL (2018) Artificial intelligence in radiology. Nat Rev Cancer 18(8):500–510



Jurcys P, Donewald C, Globocnik J, Lampinen M (2020) My data, my terms: a proposal for personal data use licenses. Harv J Law Technol 30 (Digest Spring 2020):1–14. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3555188 Accessed 1 March 2023
Kaissis GA, Makowski MR, Rückert D, Braren RF (2020) Secure, privacy-preserving and federated machine learning in medical imaging. Nat Mach Intell 2:305–311
Karaalp RN (2017) Der Schutz von Patientendaten für die medizinische Forschung in Krankenhäusern—Eine rechtsvergleichende Untersuchung der Regelungen in Deutschland und Frankreich. Springer, Wiesbaden
Kleesiek J, Murray JM, Kaissis G, Braren R (2020) Künstliche Intelligenz und maschinelles Lernen in der onkologischen Bildgebung. Onkologe 26(1):60–65
Kolain M, Gradenauer C, Ebers M (2022) Anonymity assessment—A universal tool for measuring anonymity of data sets under the GDPR with a special focus on smart robotics. Rutgers Comput Technol Law J 48(2):176–223
Kooi T (2020) Why skin lesions are peanuts and brain tumors harder nuts. The Gradient. https://thegradient.pub/why-skin-lesions-are-peanuts-and-brain-tumors-harder-nuts/ Accessed 1 March 2023
Kop M (2021) The right to process data for machine learning purposes in the EU. Harv J Law Technol 34 (Digest Spring 2021):1–23
Köttering L (forthcoming) Datenschutzrechtliche Regulierung maschineller Lernverfahren durch die DSGVO
Kühling J, Martini M, Heberlein J, Kühl B, Nink D, Weinzierl Q, Wenzel M (2016) Die Datenschutz-Grundverordnung und das nationale Recht—Erste Überlegungen zum innerstaatlichen Regelungsbedarf. https://dopus.uni-speyer.de/frontdoor/deliver/index/docId/1539/file/Kuehling_Martini_et_al_Die_DSGVO_und_das_nationale_Recht_2016.pdf Accessed 1 March 2023
Langs G, Attenberger U, Licandro R, Hofmanninger J, Perkonigg M, Zusag M, Röhrich S, Sobotka D, Prosch H (2020) Maschinelles Lernen in der Radiologie—Begriffsbestimmung vom Einzelpunkt bis zur Trajektorie. Radiologe 60(1):6–14
Madabhushi A, Lee G (2016) Image analysis and machine learning in digital pathology: challenges and opportunities. Med Image Anal 33:170–175
Meszaros J, Ho C (2021) AI research and data protection: can the same rules apply for commercial and academic research under the GDPR? Comput Law Secur Rev 41(105532):1–10
Molnár-Gábor F, Sellner J, Pagil S, Slokenberga S, Tzortzatou-Nanopoulou O, Nyström K (2022) Harmonization after the GDPR? Divergences in the rules for genetic and health data sharing in four member states and ways to overcome them by EU measures: insights from Germany, Greece, Latvia and Sweden. Semin Cancer Biol 84:271–283
Mourby M (2020) Anonymity in EU health law: not an alternative to information governance. Med Law Rev 28(3):1–24
OECD (2015) Data-driven innovation: big data for growth and well-being
OECD (2019) Enhancing access to and sharing of data—Reconciling risks and benefits for data re-use across societies
Panagopoulos A, Minssen T, Sideri K, Yu H, Corrales Compagnucci M (2022) Incentivizing the sharing of healthcare data in the AI era. Comput Law Secur Rev 45:105670
Panesar A (2019) Machine learning and AI for healthcare—Big data for improved health outcomes. Apress
Peloquin D, DiMaio M, Bierer B, Barnes M (2020) Disruptive and avoidable: GDPR challenges to secondary research uses of data. Eur J Hum Genet 28:697–705
Richter H, Slowinski PR (2019) The data sharing economy: on the emergence of new intermediaries. IIC Int Rev Intellect Prop Compet Law 50:4–29



Rossello S, Díaz Morales R, Munoz Gonzalez L (2021) Benefits and challenges of federated learning under the GDPR. https://kuleuven.limo.libis.be/discovery/fulldisplay?docid=lirias3611958&context=SearchWebhook&vid=32KUL_KUL:Lirias&lang=en&search_scope=lirias_profile&adaptor=SearchWebhook&tab=LIRIAS&query=any,contains,lirias3611958 Accessed 1 March 2023
Roßnagel A (2019a) Datenschutz in der Forschung—Die neuen Datenschutzregelungen in der Forschungspraxis von Hochschulen. Zeitschrift für Datenschutzrecht 9(4):157–164
Roßnagel A (2019b) Artikel 6 Rechtmäßigkeit der Verarbeitung. In: Simitis S, Hornung G, Spiecker gen Döhmann I (eds) Datenschutzrecht. Nomos, Baden-Baden
Ruffert M (2022) Artikel 13 EU-GRCharta. In: Calliess C, Ruffert M (eds) EUV/AEUV: das Verfassungsrecht der Europäischen Union mit Europäischer Grundrechtecharta: Kommentar, 6. Auflage. C.H. Beck, Munich
Russell S, Norvig P (2021) Artificial intelligence: a modern approach, global edition. Pearson Education Limited, London
Schneider G (2019) Disentangling health data networks: a critical analysis of Articles 9(2) and 89 GDPR. Int Data Priv Law 9(4):253–271
Schneider G (2020) Health data pools under European policy and data protection law: research as a new efficiency defence? J Intellect Prop Inf Technol Electron Commer Law 50(11):49–67
Schrader LF (2022) Datenschutz im Gesundheitswesen unter der Europäischen Datenschutz-Grundverordnung. Duncker & Humblot, Berlin
Schuessler M, Bärnighausen T, Jani A (2022) Organisational readiness for the adoption of artificial intelligence in hospitals. In: Corrales Compagnucci M, Wilson ML, Fenwick M, Forgó N, Bärnighausen T (eds) AI in eHealth—Human autonomy, data governance and privacy in healthcare. Cambridge University Press, Cambridge, pp 334–377
Shabani M, Borry P (2018) Rules for processing genetic data for research purposes in view of the new EU General Data Protection Regulation. Eur J Hum Genet 26:149–156
Shokri R, Stronati M, Song C, Shmatikov V (2017) Membership inference attacks against machine learning models. https://arxiv.org/pdf/1610.05820.pdf Accessed 1 March 2023
Spiecker gen. Döhmann I (2022) Die Regulierungsperspektive von KI/Big Data in der Wissenschaft. In: Gethmann CF, Buxmann P, Distelrath J, Humm BG, Lingner S, Nitsch V, Schmidt JC, Spiecker gen Döhmann I (eds) Künstliche Intelligenz in der Forschung—Neue Möglichkeiten und Herausforderungen für die Wissenschaft. Springer, Berlin
Tang A, Tam R, Cadrin-Chênevert A, Guest W, Chong J et al (2018) Canadian Association of Radiologists white paper on artificial intelligence in radiology. Can Assoc Radiol J 69(2):120–135
Trute H-H (1994) Die Forschung zwischen grundrechtlicher Freiheit und staatlicher Institutionalisierung—Das Wissenschaftsrecht als Recht kooperativer Verwaltungsvorgänge. Mohr Siebeck, Tübingen
Trute H-H (2017) Industry 4.0 in Germany and the EU—Data between property and access in the data-driven economy. J Law Econ Regul 10(2):69–92
Wood A, Altman M, Bembenek A, Bun M, Gaboardi M, Honaker J, Nissim K, O'Brien DR, Steinke T, Vadhan S (2018) Differential privacy: a primer for a non-technical audience. Vand J Ent Tech Law 21(1):209–276
Wouters B, Shaw D, Sun C, Ippel L, van Soest J, van den Berg B, Mussmann O, Koster A, van der Kallen C, van Oppen C, Dekker A, Dumontier M, Townend D (2021) Putting the GDPR into practice: difficulties and uncertainties experienced in the conduct of big data health research. Eur Data Prot Law Rev 7(2):206–216

Supplementary Measures and Appropriate Safeguards for International Transfers of Health Data After Schrems II

Marcelo Corrales Compagnucci, Mark Fenwick, Mateo Aboy, and Timo Minssen

Abstract In July 2020, the Court of Justice of the European Union (CJEU) in Data Protection Commissioner v Facebook Ireland Limited, Maximillian Schrems ("Schrems II") invalidated the EU-US Privacy Shield adequacy decision but found that Standard Contractual Clauses (SCCs) are a valid mechanism to enable GDPR-compliant transfers of personal data from the EU to jurisdictions outside the EU/EEA, as long as various unspecified "supplementary measures" are in place to compensate for any gaps in data protection arising from the third country's law or practices. The effect of this decision has been to place regulators, scholars, and data protection professionals under greater pressure to identify and explain these "supplementary measures" to facilitate cross-border transfers of personal data. This chapter critically examines the current framework for cross-border transfers after Schrems II, including the new SCCs adopted by the European Commission, as well as the current European Data Protection Board (EDPB) guidance on "supplementary measures." We argue that the so-called "supplementary measures" are not "supplementary" and that the CJEU's characterization of such measures as "supplementary" undermines the original clarity of the GDPR with regard to the required standards for the security of processing as well as the available mechanisms for cross-border transfers of personal data. We conclude that despite the legal uncertainty introduced by the CJEU, several post-Schrems II developments have been helpful in increasing awareness and improving the overall safeguards associated with cross-border transfers of personal data. These include the new SCCs and an increased understanding of the capabilities and limitations of technical and organizational measures, including encryption, pseudonymization, and multi-party processing.

M. Corrales Compagnucci (B) · T. Minssen
CeBIL, University of Copenhagen, Copenhagen, Denmark
M. Fenwick
Kyushu University, Fukuoka, Japan
M. Aboy
University of Cambridge, Cambridge, UK

© The Author(s) 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_9




Technical solutions such as multiparty homomorphic encryption (HE), which combine these three technical measures while still allowing encrypted data to be queried and analyzed without being decrypted, have significant potential to provide effective security measures that facilitate cross-border transfers of personal data in high-risk settings.

Keywords Appropriate safeguards · Cross-border transfers of personal data · Multiparty homomorphic encryption · New standard contractual clauses (SCCs) · Schrems II · Supplementary measures
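As a rough, non-authoritative illustration of the homomorphic encryption idea invoked in the abstract, the following sketch uses the open-source TenSEAL library to add two encrypted vectors without decrypting them. It shows single-party CKKS encryption rather than the multiparty variant discussed in this chapter, and the scheme parameters follow the library's tutorial defaults, not a vetted production configuration.

```python
import tenseal as ts  # pip install tenseal

# Encryption context for the CKKS scheme (approximate arithmetic on reals);
# parameter values are the TenSEAL tutorial defaults.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40

# Two toy health measurements, encrypted before leaving the controller.
readings_site_a = ts.ckks_vector(context, [120.5, 80.2, 95.0])
readings_site_b = ts.ckks_vector(context, [118.0, 79.5, 97.5])

# A processor can compute on ciphertexts without ever seeing plaintext.
encrypted_sum = readings_site_a + readings_site_b

# Only the secret-key holder (here, the EU exporter) can decrypt the result.
print(encrypted_sum.decrypt())  # approximately [238.5, 159.7, 192.5]
```

In a multiparty setting, the decryption capability itself would be split among several parties, so that no single actor, including a third-country importer, could recover plaintext on its own.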

1 Introduction

The Court of Justice of the EU (CJEU) in its decision in Data Protection Commissioner v Facebook Ireland Ltd and Maximillian Schrems (Schrems II)1 significantly affected the landscape for cross-border transfers of personal data. Crucially, Schrems II invalidated the adequacy decision recognizing the EU-US Privacy Shield2 that had facilitated data flows between Europe and the US under Article 45 of the General Data Protection Regulation (GDPR).3 Although the Court upheld the validity of standard contractual clauses (SCCs), which had provided an alternative means for such cross-border transfers, it established conditions for their continued use. The decision introduced a new and additional layer of uncertainty into this important area of law, at a time when cross-border transfers of data between the EU and US are increasing and where transfers of personal data play a key role. In October 2022, President Joe Biden signed the Executive Order to implement the "EU-US Data Privacy Framework," which introduced new binding safeguards to address concerns raised in the Schrems II case.4 These safeguards include, among others, a two-layer redress mechanism. Subsequently, in December 2022, the European Commission initiated the process of adopting an adequacy decision for the EU-US Data Privacy Framework.5 As of the writing of this chapter, the framework has not yet been published.6

1 Case C-311/18 Data Protection Commissioner v Facebook Ireland Limited and Maximillian Schrems [2020] ECLI:EU:C:2020:559 ["Schrems II"].
2 Commission Implementing Decision (EU) 2016/1250 of 12 July 2016 pursuant to Directive 95/46/EC of the European Parliament and of the Council on the adequacy of the protection provided by the EU-US Privacy Shield, OJ 2016 L 207, 1.
3 Regulation (EU) 2016/679 of the EP and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC, OJ 2016 L 119, 1 (General Data Protection Regulation, GDPR).
4 Fact Sheet: President Biden Signs Executive Order to Implement the European Union-U.S. Data Privacy Framework, The White House (7 October 2022), available at: https://www.whitehouse.gov/briefing-room/statements-releases/2022/10/07/fact-sheet-president-biden-signs-executive-order-to-implement-the-european-union-u-s-data-privacy-framework/. Accessed 20 June 2023.
5 Data Protection: Commission Starts Process to Adopt Adequacy Decision for Safe Data Flows with the US (13 December 2022), available at: https://ec.europa.eu/commission/presscorner/detail/en/ip_22_7631. Accessed 20 June 2023.



Therefore, our focus in this chapter is on supplementary measures and appropriate safeguards for the international transfer of health data post-Schrems II. This encompasses data transfers to third countries outside the European Economic Area (EEA). Schrems II introduced uncertainty regarding the conditions for the continued use of SCCs, and how firms can meet them. The CJEU found that SCCs were valid and "appropriate safeguards" for personal data transfers from EU controllers to processors outside the EEA if "supplementary measures" were in place to compensate for the lack of protection in a non-EEA country. However, the Court did not identify or define these measures, and the European Data Protection Board's (EDPB) subsequent guidance suggests the options for companies looking to utilize SCCs are quite limited.7 The privacy and data protection community—including data protection officers (DPOs), the EDPB, Data Protection Supervisory Authorities, and scholars—is now under increasing pressure to identify these appropriate safeguards and supplementary measures to enable routine cross-border data transfers to continue. This pressure has only increased in the context of a global pandemic that often required the large-scale transfer of personal data to manage outbreaks and to research and test clinical responses to the virus. As discussed in a previous article,8 some commentators and regulatory agencies have interpreted the decision in Schrems II, and the requirement for safeguards, as halting the use of SCCs to transfer personal data outside of the EEA.9 In contrast with this more restrictive approach, we argue that SCCs provide a useful instrument to support cross-border data transfers—notwithstanding the CJEU ruling in Schrems II—especially in the context of health data in regulated settings such as clinical trials.10

6 We considered developments until May 2023.
7 EDPB Recommendations 01/2020 on measures that supplement transfer tools to ensure compliance with the EU level of protection of personal data (10 November 2020) ["EDPB Recommendations 01/2020"].
8 Bradford et al. (2021, p. 1).
9 For example, the French data protection agency, CNIL, on October 9, 2020 recommended that post-Schrems II the hosting and the management of the public "Health Data Hub" should be "reserved for entities exclusively under the jurisdiction of the European Union," because transfer of personal data to such entities would not risk any exposure of personal data to any jurisdiction outside the EU. Following this intervention, the French health authority revised its emergency COVID-19 declaration to ban any transfer of personal data outside the EU, irrespective of whether an SCC was in place. See Commission Nationale Informatique & Libertés, National Council of Free Software vs. Ministry of Solidarities and Health, Conseil d'Etat, Section du Contentieux, Ref. L. 521–2 CJA (9 October 2020); Arrêté du 9 octobre 2020 modifiant l'arrêté du 10 juillet 2020 prescrivant les mesures générales nécessaires pour faire face à l'épidémie de covid-19 [Announcement of 9th October modifying the declaration of 10 July 2020 proscribing general measures necessary to address the epidemic of Covid-19], Journal Officiel De La République Française [JO] p. 143 (10 October 2020).
10 See, e.g., Corrales Compagnucci et al. (2020, pp. 153–160).



New SCCs were adopted on June 4, 2021, by the European Commission to modernize the clauses under the GDPR. The terms of the new SCCs point to mechanisms that, when combined with third-party beneficiary contractual rights, are designed to ensure "appropriate data protection safeguards," and such mechanisms are particularly useful in highly regulated contexts such as clinical trials and public health research.11 The new SCCs introduced a "modular approach," which includes additional possible arrangements of data transfers previously not possible.12 It is, therefore, expected that the new SCCs will enable organizations to account for a variety of complex data transfer scenarios,13 which may need the implementation of additional safeguards. The new SCCs have also adopted, inter alia, a risk-based approach, the so-called "Transfer Impact Assessment" (TIA), which is based on the EDPB Recommendations 01/2020 and the Schrems II ruling. As such, adopting and complying with the new SCCs may require considerable effort and transaction costs for companies, in particular any organization handling health-related data.

The chapter is structured as follows. Section 2 offers a critical review of the Schrems II decision. Section 3 describes post-Schrems II developments, including the new SCCs and the EDPB's recommendations on measures to support transfer tools. Section 4 argues that the so-called "supplementary measures" are not "supplementary" and that the CJEU's characterization of such measures as "supplementary" undermines the original clarity of the GDPR with respect to the required standards for the security of processing as well as the available mechanisms for cross-border transfers of personal data. Section 5 concludes.

2 A Critical Appraisal of Schrems II

Schrems II14 is the latest installment of a long-running case concerning objections by an Austrian national, Maximilian Schrems, to the transfer of his personal data from Facebook Ireland to its US parent Facebook Inc. for processing. Schrems claimed that US law requires Facebook Inc. to make any personal data transferred to it available to the US authorities, including the National Security Agency (NSA) and the Federal Bureau of Investigation (FBI), for national security monitoring.15

11 Bradford et al. (2021, p. 1).
12 Processor-to-processor (Module 3) and processor-to-controller (Module 4).
13 European Commission, 'European Commission adopts new tools for safe exchange of personal data' (4 June 2021) https://ec.europa.eu/commission/presscorner/detail/en/ip_21_2847. Accessed 17 August 2022.
14 There was a previous case, Schrems I, decided in 2015, which led to the invalidation of the 'Safe Harbor' framework. See C-362/14 Maximillian Schrems v Data Protection Commissioner of 6 October 2015 [2015] ECLI:EU:C:2015:650 (Schrems I). See also CJEU Press Release No. 117/15 (6 October 2015). In order to partially alleviate concerns, the so-called 'EU-US Privacy Shield' agreement was enacted in 2016. See Commission Implementing Decision (EU) 2016/1250 of 12 July 2016 pursuant to Directive 95/46/EC of the European Parliament and of the Council on the adequacy of the protection provided by the EU-US Privacy Shield, OJ 2016 L 207, 1.



Schrems argued that these blanket monitoring programs violated Articles 7, 8 and 47 of the EU Charter, and made it impossible for Facebook Inc. to comply with both US and EU law. Schrems, therefore, asked the Commissioner to prohibit or suspend the transfer of his personal data. Moreover, he argued that the SCCs in effect between Facebook Ireland and Facebook Inc. did not constrain the US authorities and could not, therefore, be used to justify the transfer of the data under Article 46.16 Finally, he sought a ruling from the Irish data protection authority that the Article 45 adequacy finding for the US, the so-called "EU-US Privacy Shield," was in error.

The CJEU agreed with Schrems and struck down the EU-US Privacy Shield adequacy decision.17 The Court ruled that SCCs could be used to transfer data to jurisdictions without an adequacy ruling if "essentially equivalent" protections for EU personal data could be assured.18 Unsurprisingly, given the scope of the judgement, Schrems II introduced a new and significant degree of uncertainty into this field of law and into the question of what is now necessary to legally continue cross-border data transfers. The critical view of Schrems II would be that it failed to clarify how to safely transfer data based on "accountability" when there is a conflict between EU and local law, and instead introduced a new degree of uncertainty into the two principal mechanisms used for data transfers to the US (see Footnote 5). The result was, in effect, to reduce the mechanisms for transferring personal data from the EU to third countries. The ruling limited the meaning of "adequacy" under Article 45, and it seems to define accountability under Article 46 so broadly as to include responsibility for ensuring the adequacy of the law of third countries (see Footnote 5).

There are several curious features of Schrems II, and two in particular are worth noting. First, with respect to Article 45 adequacy, the CJEU had many legitimate reasons to conclude that the Privacy Shield was inadequate, but the ground relied on in the decision, namely national security, was probably the weakest. Second, the decision undermined the use of Article 46 "appropriate safeguards" as a justification for data transfers. Instead of regarding the issue of organizational safeguards as a distinct legal question, the CJEU appears to have collapsed the adequacy pathway of Article 45 and the appropriate safeguards test of Article 46 into a single question. The effect of this move has been to reduce all the Chapter V mechanisms into a single adequacy test that is unworkable in practice (see Footnote 5).

15 Schrems II at 55.
16 Schrems II at 151–153.
17 Schrems II at 201.
18 Schrems II at 203(b) ("Article 46(1) and Article 46(2)(c) of Regulation 2016/679 must be interpreted as meaning that the appropriate safeguards, enforceable rights and effective legal remedies required by those provisions must ensure that data subjects whose personal data are transferred to a third country pursuant to standard data protection clauses are afforded a level of protection essentially equivalent to that guaranteed within the European Union by that regulation, read in the light of the Charter of Fundamental Rights of the European Union.").



It seems clear that the Privacy Shield—a general agreement concluded between the EC and the US—is problematic under both EU and US law.19 Most obviously, as a voluntary program, the Privacy Shield lacked the force of law. The framework allowed data transfers to any entity that self-certified compliance with GDPR-like standards. This seems to conflict with the essence of an Article 45 adequacy inquiry, which should examine whether a third country's legal framework is sufficiently protective of data subject rights.20 In addition, the Privacy Shield failed to meet the Article 46 standard for voluntary mechanisms like codes of conduct or certifications, which—in form, at least—it resembles.21 The US Department of Commerce and the Federal Trade Commission promised to enforce the Privacy Shield, but the Commission found that US agencies often failed to audit participating entities and check whether they were in compliance.22 Many organizations, including Facebook, claimed to comply but continued to use personal data for purposes contrary to the principles of the GDPR. At the same time, the Privacy Shield as understood by the EC was incompatible with US law because it required federal agencies to conduct protective audits for the sole benefit of EU citizens, an activity that was arguably beyond the scope of their lawful powers.23

Instead of focusing on these other grounds, however, Schrems II focused on the question of US government access to personal data for national security purposes, and whether EU citizens enjoyed an equivalent right of judicial review and redress to that available under EU law.24 Under EU law, access to personal data for national security purposes that infringes upon privacy rights must be "necessary and proportionate."25 At the same time, however, national security policy remains the sole responsibility of the Member States. In effect, each EU Member State is afforded discretion to balance national security needs with data privacy rights.26 And yet, although the CJEU ruled in Schrems II that third countries, such as the US, were not entitled to the same degree of discretion, the Court then went on to find that the US approach to national security monitoring was not "necessary and proportionate."27

Schrems II threatens the viability of the Article 45 legal adequacy test. Under the GDPR, the EC determines whether a country outside the EU offers an adequate level of data protection. For the level of protection in a third country to be considered adequate, it must offer guarantees to the data subject "essentially equivalent" to those offered in the EU.28

19 Bradford et al. (2020, p. 12).
20 See, e.g., Court of Justice of the European Union Press Release No 165/19, Advocate General's Opinion in Case C-311/18 Data Protection Commissioner v Facebook Ireland Limited, Maximillian Schrems (Luxembourg, 19 December 2019).
21 Bradford et al. (2020, p. 14).
22 Commission Staff Working Document Accompanying the document, Report from the Commission to the European Parliament and the Council on the third annual review of the functioning of the EU-US Privacy Shield, 25–27, COM (2019) 495 final (23.10.2019).
23 Bradford et al. (2020, p. 14).
24 Schrems II at 178–200.
25 EU Charter of Fundamental Rights (CFR) Article 52(1); GDPR Article 23.
26 Meltzer (2020).
27 Schrems II at 81, 178–200.



The means of protection, however, may differ from those in the EU, so long as they are as effective in practice.29 When assessing whether a third country's law and practice are adequate under the GDPR, the EC has also taken into account a number of other considerations, including the significance of a trading partner, both commercially and in terms of cultural ties to the EU, and strategic objectives in continuing data flows and encouraging legal reform.30 In other words, the EC has to weigh the benefits of continued data flows, as well as the risks, in deciding on adequacy.31 To date, the EC has recognized 14 countries as providing adequate protection under this test, namely Andorra, Argentina, Canada (for commercial organizations), the Faroe Islands, Guernsey, Israel, the Isle of Man, Japan, Jersey, New Zealand, the Republic of Korea, Switzerland, the United Kingdom (under the GDPR and the Law Enforcement Directive (LED)), and Uruguay.32

The CJEU, in contrast to this more nuanced approach, only asked whether the third country provides privacy protections consistent with the Charter of Fundamental Rights. In Schrems II, for example, the Court engaged in a detailed review of the provisions of US national security law and concluded that these laws were deficient.33 This approach transforms the character of an "adequacy inquiry" from one of general effectiveness to a fine-grained search for equivalency. An important effect of this change is to undermine other existing Article 45 adequacy rulings (potentially requiring so-called "supplementary measures" even for transfers to adequacy countries). It seems highly unlikely that Israel, for example, or Argentina or Japan would meet Schrems II's new threshold for law enforcement surveillance. The focus in Schrems II on compatibility with the EU Charter for Article 45 adequacy seems to exclude different approaches to privacy in other countries and restricts the scope for Article 45 adequacy findings more generally.34

28 Recital 104 of the GDPR; Case C-362/14, Schrems v. Data Prot. Comm'r, 2015 E.C.R. 650, para 73 (6 October 2015) ("while the term 'adequate' cannot require a third country to ensure a level of protection identical to that guaranteed in the EU legal order, … [it still] must be understood as requiring the third country in fact to ensure, by reason of its domestic law or its international commitments, a level of protection of fundamental rights and freedoms that is essentially equivalent to that guaranteed within the European Union by virtue of Directive 95/46 read in the light of the Charter.").
29 Schrems I at 74; Article 29 Data Protection Working Party, Adequacy Referential (updated), WP 254 at 2 (6 February 2018) https://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=614108. Accessed 17 August 2022.
30 European Commission Memo/17/15, 'Digital Single Market—Communication on Exchanging and Protecting Personal Data in a Globalized World' (10 January 2017) http://europa.eu/rapid/press-release_MEMO-17-15_en.htm. Accessed 10 December 2022.
31 Roth (2017, pp. 49–60), Stoddart et al. (2016, pp. 143, 147–49).
32 Except for the United Kingdom, these adequacy decisions do not cover data exchanges in the law enforcement sector, which are governed by the Law Enforcement Directive (Article 36 of Directive (EU) 2016/680). See European Commission, Adequacy Decisions https://ec.europa.eu/info/law/law-topic/data-protection/international-dimension-data-protection/adequacy-decisions_en. Accessed 10 December 2022.
33 Schrems II at 180, 184.



Thus, in practice, Schrems II collapses Articles 45 (adequacy) and 46 (appropriate safeguards) into a single transfer option.

Moreover, Schrems II undermined the use of Article 46 "appropriate safeguards" as well. In the absence of an adequacy decision, Article 46(1) of the GDPR allows appropriate safeguards to be taken by the controller or processor that "compensate for the lack of data protection in a third country."35 One might, for example, interpret the word "compensate" in Recital 108 to mean alternative technical means, or legal remedies available in the host country. However, the CJEU construed it more narrowly to mean capable of ensuring "a level of protection essentially equivalent to that which is guaranteed within the EU," i.e., specific legal adequacy as under Article 45.36 The basis of this obligation is not completely clear. At times, the Court suggested that the EU Charter itself requires this standard.37 At other points, the Court cited specific clauses of the SCCs requiring the parties to warrant full compliance with the GDPR as the source of this equivalency obligation.38

The CJEU ruled that SCCs remain an option where the controller offers "supplementary measures" to rectify legal problems that undermine equivalency.39 Crucially, the Court failed to provide any meaningful guidance about what kinds of measures might be required in this regard. The issue with SCCs (as well as BCRs, certification mechanisms, etc.) is that they are contractual mechanisms between individual private parties, and they do not bind other governments. If third country law or practice is incompatible with the GDPR, then SCCs cannot address that issue. Moreover, the Court stated that data controllers, and Data Protection Authorities, must suspend transfers to any country if it is concluded that "an obligation [allowing processing of personal data] prescribed by the law of the third country of destination... goes beyond what is necessary" and conflicts with the GDPR.40 As such, Schrems II could be read as a prohibition on any transfer, whether under Articles 45, 46 or another provision of Chapter V, unless the legal rights of EU citizens in third countries are equivalent on every point to those available under EU law. The effect of this approach is to collapse all of Chapter V into a narrowly constructed legal adequacy determination, rather than treating its provisions as a series of different options. But unlike Article 45 adequacy, which must be determined by the EC and the EDPB after an investigation, Article 46 legal adequacy would need to be assessed ex ante by individual controllers.41

34 Meltzer (2020). For a detailed EU law perspective and relevant precedents concerning the conclusion of law enforcement, data related, or other international agreements, see Christakis and Terpan (2021, pp. 81–86).
35 Recital 108 of the GDPR.
36 Schrems II at 96.
37 Schrems II at 99–101.
38 Schrems II at 140–142 (citing 2010 SCCs 4(a), 5(a)&(b)).
39 Schrems II at 103, 134.
40 Schrems II at 41–42.



This is a complex task, as whether the regulations of a third country allow processing "beyond what is necessary" when compared to the EU is not always a simple question. And a mistaken decision would leave controllers and processors liable to the full extent of Article 83(5)'s penalties, including fines of up to €20 million or 4 percent of worldwide annual turnover.42 Furthermore, all pre-Schrems II adequacy decisions are now suspect (i.e., they could be invalidated on the same grounds as the EU-US Privacy Shield adequacy decision). As such, the "supplementary measures" referred to by the CJEU may now be necessary for all cross-border transfers of personal data. The Court did, however, retain the possibility of transfers in specific circumstances justified on the basis of an Article 49 derogation, but this is only available for non-repetitive transfers of limited data, and it is unclear on what basis the Court considers that EU fundamental rights are justifiably compromised in these circumstances.43

3 Post-Schrems II Developments

Given how narrowly the Court defined adequacy, arguably no transfers to jurisdictions with active national security data collection, such as the US, could ever occur, raising concerns over many other countries with which data transfers are currently permitted.

3.1 National Regulatory Responses

Some EU regulators reached exactly this conclusion. For example, on October 10, 2020, the French Ministry for Health and Solidarity made an emergency change to its COVID-19 law and prohibited the sharing of any French public health data beyond the European Union. The change was in response to an action brought by the French National Commission for Informatics and Freedoms (CNIL), the French data protection authority, requesting that data from the French national public health registry no longer be entrusted to servers run by Microsoft, or any of its subsidiaries, because these companies were subject to US national security laws.

41 Schrems II at 134, 142 ("It is therefore, above all, for that controller or processor to verify, on a case-by-case basis and, where appropriate, in collaboration with the recipient of the data, whether the law of the third country of destination ensures adequate protection, under EU law, of personal data transferred pursuant to standard data protection clauses, by providing, where necessary, additional safeguards to those offered by those clauses…It follows that a controller established in the EU and the recipient of personal data are required to verify, prior to any transfer, whether the level of protection required by EU law is respected in the third country concerned. The recipient is, where appropriate, under an obligation, under Clause 5(b), to inform the controller of any inability to comply with those clauses, the latter then being, in turn, obliged to suspend the transfer of data and/or to terminate the contract.").
42 Article 83(5) of the GDPR.
43 Schrems II at 202.



CNIL dismissed the use of Article 46 SCCs and supplementary measures as a corrective because such measures could never prevent direct access by US intelligence.44 The Conseil d'Etat, the highest French administrative court, agreed that after Schrems II, no personal data transfer to the US would be possible under either Article 45 or 46.45 However, the Court was willing to permit Microsoft's Irish subsidiary to continue hosting and processing data in the EU on one condition. The Court required Microsoft to insert contract clauses to the effect that it would only follow the law of the EU, and not the US, in granting public authorities access to any data.

Other Data Protection Authorities (DPAs) have not gone quite so far but have, nevertheless, greatly narrowed the range of permissible transfers. The DPA for the German State of Baden-Württemberg is, so far, the only Member State DPA to provide official guidance on transfers in the wake of Schrems II. The guidance requires controllers first to determine whether a cross-border transfer is necessary, or whether another solution, such as processing the data within the EU, is available.46 The guidance allows the use of Article 46 mechanisms, such as SCCs and BCRs, alongside "supplementary measures" to protect the data if transfer to the third country cannot be avoided and the controller determines that the legal protections in the third country are sufficiently adequate. Post-Schrems II, it is difficult to see how any data controller could reasonably conclude that legal protections in the US are adequate (especially if the US importer is subject to FISA 702). The European Data Protection Supervisor, the regulator responsible for ensuring the compliance of EU agencies with the GDPR, "strongly encourage[d]" its agencies to avoid any processing activities that involve transfers of personal data to the US.47

3.2 The EDPB Guidelines

On November 10, 2020, the EDPB issued Guidelines on measures that supplement transfer tools to ensure compliance with the EU level of protection of personal data (EDPB Guidelines). These guidelines gave effect to the CJEU's decision on adequacy, appropriate safeguards, and "supplementary measures," but provided a more pragmatic, risk-based approach to help guide organizations through the challenges associated with cross-border transfers. The EDPB Guidelines require those who rely on Article 46 appropriate safeguards to guarantee an equivalent level of protection in the third country, if necessary through the use of "supplementary measures."48

44 Commission Nationale Informatique & Libertés, National Council of Free Software vs. Ministry of Solidarities and Health, Conseil d'Etat, Section du Contentieux, Ref. L. 521–2 CJA (9 October 2020).
45 Bradford et al. (2021, p. 1).
46 Malik and Khan (2020).
47 European Data Protection Supervisor, Strategy for Union institutions, offices, bodies and agencies to comply with the Schrems II Ruling (29 October 2020), https://edps.europa.eu/sites/edp/files/publication/2020-10-29_edps_strategy_schremsii_en_0.pdf. Accessed 17 August 2022.



The EDPB also does not distinguish whether this equivalency is required to protect fundamental rights under the CFR, or merely to ensure that importers do not breach their obligations under the wording of the existing SCCs.49 The Guidelines state that "in principle, supplementary measures may have a contractual, technical or organizational nature" but that generally "only technical measures" such as secure encryption will be effective to impede access by foreign authorities.50 This guidance is less restrictive than the French approach but would still seem to preclude sharing data with any entity requiring unencrypted access to the raw data, even for the purpose of health research or drug discovery.51 The EDPB Guidelines proposed a six-step approach designed to assist in implementing the requirements established by the CJEU in Schrems II.52

• Step 1—Know Your Data Transfers. The EDPB recommends that data exporters make themselves fully aware of all aspects of transfers of personal data outside the EU. While logging all data transfers can be complicated, especially in a world of cloud computing, this is necessary to establish an equivalent level of protection for data subjects. Data exporters are, therefore, expected to record all processing activities, keep data subjects informed, and comply with the principle of minimization. All such requirements also apply to onward transfer situations.53

• Step 2—Identify the Transfer Tools. Data handlers are expected to clearly identify all transfer tools, as laid out in Chapter V of the GDPR. This includes adequacy decisions54; transfer tools containing "appropriate safeguards" of a contractual nature in the absence of adequacy decisions (for example, SCCs, BCRs, codes of conduct, certification mechanisms and ad hoc contractual clauses)55; and derogations.56 It is worth noting that derogations have an exceptional character and are subject to strict conditions.

48 EDPB Recommendations 01/2020, pp. 28–29.
49 EDPB Recommendations 01/2020, pp. 29–30, 34.
50 EDPB Recommendations 01/2020, pp. 45, 48.
51 EDPB Recommendations 01/2020, Annex 2, Use Cases 6 and 7.
52 Corrales Compagnucci et al. (2021, p. 41), Jurcys et al. (2022, pp. 4–5).
53 EDPB Recommendations 01/2020, pp. 8–9.
54 See Article 45 of the GDPR. Adequacy decisions may cover a country as a whole or be limited to a part of it. If you transfer data to any of these countries, there is no need to take any further steps described in this section. The EU Commission has so far recognized only twelve countries which can offer an adequate level of protection. These countries are Andorra, Argentina, Canada (commercial organizations), Faroe Islands, Guernsey, Israel, Isle of Man, Japan, Jersey, New Zealand, Switzerland and Uruguay. As of March 2021, adequacy talks were concluded with South Korea. See European Commission, Adequacy decisions, How the EU determines if a non-EU country has an adequate level of data protection, https://ec.europa.eu/info/law/law-topic/data-protection/international-dimension-data-protection/adequacy-decisions_en. Accessed 17 August 2022.
55 Article 46 of the GDPR. The transfer tools may require additional 'supplementary measures' to ensure an essentially equivalent level of protection. See Schrems II at 130 and 133.
56 Article 49 of the GDPR.



If a data transfer does not fall under either (a) an "adequacy decision" or (c) "derogations," then data handlers must proceed to Step 3.57

• Step 3—Examine if Article 46 is Effective Considering All Circumstances. Utilizing a transfer tool under Article 46 may not be sufficient if the transfer tool is not "effective" in practice. "Effective," in this context, means that the level of protection is equivalent to that afforded inside the EEA.58 To make this determination, data exporters should carry out a transfer impact assessment (TIA) to assess—in collaboration with the importer—whether the law and practice of the third country to which the data is being transferred impact the effectiveness of the appropriate safeguards of Article 46 in the context of the specific transfer. The legal context refers to several issues and will depend on the specific circumstances, in particular: the actors involved,59 the purpose of the transfer,60 the types of entities involved,61 the sector in which the transfer takes place,62 the categories of data,63 the format of the data,64 the possibility of onward transfers, and whether the data will be stored in the third country or whether there is only remote access to data stored within the EU/EEA.65 In implementing this assessment, various considerations regarding the third country's legal system need to be examined, in particular whether any state authorities can access personal data and, more generally, whether those elements enumerated in Article 45(2) of the GDPR66 are present, i.e., respect for the rule of law and human rights in the relevant jurisdiction or jurisdictions.67

• Step 4—Supplementary Measures. If the TIA reveals that an Article 46 tool is not effective, data exporters—in conjunction with importers—need to consider whether any "supplementary measures" are necessary. Such measures are "supplementary" to the safeguards if they ensure that the data transferred is afforded an adequate level of protection in the third country, essentially equivalent to the European standard. This exercise needs to be conducted on a case-by-case basis, considering which specific supplementary measures would be most effective based on the analysis conducted in the context of Steps 1, 2 and 3. In principle, supplementary measures may have a complex and multi-layered character, i.e., they may combine contractual, technical, and organizational measures. Combining them so that they support each other may improve the overall level of protection and thus help to ensure that an adequate level is achieved. Contractual and organizational measures in isolation will generally not be enough to block access to personal data by public authorities, most obviously in cases of surveillance. Therefore, technical measures (such as encryption and anonymization) may be necessary to raise the level of protection to a satisfactory standard (a minimal sketch of such a technical measure follows this list).68

57 EDPB Recommendations 01/2020, pp. 9–11.
58 See Schrems II at 105 and second finding.
59 For example, in a cloud federation scenario, the participating actors: controllers, processors and sub-processors. The more actors involved, the more complex the TIA will be.
60 For instance, for marketing purposes or for storage, back-up, clinical trials, etc.
61 Such as public or private entities; controllers or processors.
62 Such as telecommunications, financial, medical, etc.
63 For example, sensitive data relating to children which may fall under specific legislation in the third country.
64 In plain text, pseudonymized or encrypted.
65 EDPB Recommendations 01/2020, p. 12.
66 Schrems II at 104.
67 EDPB Recommendations 01/2020, p. 12.


supplementary measures may have a complex and multi-layered character, i.e., they may be a mixture of contractual, technical, and organizational measures. Combining them so that they support one another can improve the overall level of protection and thus help to achieve the objective of an adequate level. Contractual and organizational measures in isolation will generally not be enough to block access to personal data by public authorities, most obviously in cases of surveillance. Therefore, technical measures (such as encryption and anonymization) may be necessary to raise the level of protection to a satisfactory standard.68

• Step 5—Formal Procedural Steps. This step ensures that any supplementary measures are implemented in a procedurally correct manner, which may differ depending on the transfer tool that has been adopted. For example, if data exporters intend to put in place supplementary measures, there is no need to request approval from any regulatory authority insofar as the supplementary measures do not contradict, directly or indirectly, the SCCs and are sufficient to ensure that the level of protection guaranteed by the GDPR is not undermined.69

• Step 6—Re-Evaluation at Appropriate Intervals. The final step proposed by the EDPB is to continually review the situation, e.g., whether there are developments in the third country to which the data was transferred that affect the initial assessment and any measures taken in response to it. This conforms with the principle of accountability, an ongoing obligation specified in Article 5(2) of the GDPR. Data exporters, in collaboration with importers, should put in place sufficiently robust mechanisms to ensure that any transfer relying on the SCCs is suspended or prohibited if the supplementary measures are not effective in the third country, or where the clauses are breached or become impossible to implement.70

3.3 The New Standard Contractual Clauses

The new SCCs adopted on June 4, 2021, involve a combination of legal, technological, and organizational tools, comprising impact assessments, default contractual promises and Article 32 technical security safeguards. The purpose of such an approach is to safeguard data subject rights even in the absence of full protection under the law of the third country to which data is transferred. In contrast to Schrems II, this chapter defends the central and ongoing importance of such an approach, particularly in a regulated health context, for example, clinical trials under the Clinical Trials Regulation (CTR). The main innovations of the new SCCs can, therefore, be summarized as follows:71

68 EDPB Recommendations 01/2020, pp. 15–17.
69 EDPB Recommendations 01/2020, pp. 17–18.
70 EDPB Recommendations 01/2020, pp. 18–19.
71 Corrales Compagnucci et al. (2021, p. 43), Jurcys et al. (2022, pp. 5–6).


• A Modular Approach. In contrast to the first-generation SCCs, which provided only limited transfer scenarios, each with a separate set of clauses, the new SCCs offer more flexibility in complex data chains by adopting a "modular approach." This means that data exporters and data importers can identify the most appropriate module for their needs.72 In addition to the already existing options for data transfers from "controller to controller" and "controller to processor," two additional modules have been added covering data transfers from "processor to processor"73 and "processor to controller."74

• Geographic Scope of Application. The new SCCs have a wider scope of application compared to the earlier clauses. The earlier clauses only allowed the data exporter to be a party if it was legally established within the EEA. This led to difficulties in cases where a data exporter was established outside of the EU but subject to the GDPR because of the GDPR's extraterritorial scope under Article 3(2). This was changed in the new SCCs, and the data exporter can now be a non-EEA entity. This provision, along with the modular approach, allows parties to deal with any kind of data transfer situation, irrespective of the character of the transfer or the parties' place of establishment.75

• Multipartite Clauses and Docking Clauses. The new SCCs also permit multiple data exporting parties to contract—within a corporate group, for instance. Moreover, new parties can easily be added via the newly included "docking clause" (Clause 7).76 This optional clause allows third parties that are not part of the agreement to be added to the existing agreement between the other parties without the need for separate contracts. Third parties can join by completing an Appendix—with additional details of the transfer, the technical and organizational measures implemented, and a list of sub-processors where relevant.77 This mechanism offers a more dynamic solution for existing data processing practices in cases involving corporate acquisitions, additional corporate entities, and sub-processors.78

• Data Transfer Impact Assessment (TIA). As a result of Schrems II, firms are now required to conduct a mandatory TIA, which must be made available to the relevant supervisory authority upon request. The TIA should assess and warrant, in particular, whether the laws of the third country into which the data is imported are compatible with the SCCs and the GDPR, and whether additional safeguards are necessary to bolster data protection. A relevant example here would

72 Module 1: controllers to controllers; Module 2: controllers to processors; Module 3: processors to processors; and Module 4: processors to controllers.
73 This is, for example, when a processor, such as a cloud service provider located in the European Economic Area (EEA), transfers data to another processor, such as an infrastructure provider in the US.
74 In this case, the data is transferred back to the controller (back to 'its origin'). This is also sometimes referred to as a 'reverse transfer'.
75 Lee (2021).
76 Lee (2021).
77 Milner-Smith (2021).
78 Braun et al. (2021).


be whether the data importer is subject to Section 702 of the US Foreign Intelligence Surveillance Act (FISA 702).79 Moreover, any TIA should be monitored continuously and updated if there are any changes in the laws of the third country.

• Security Measures. Annex II of the new SCCs offers a list of examples of the technical and organizational measures necessary to help ensure an adequate level of protection, especially measures to ensure data security. While the list is non-exhaustive, it enumerates multiple options which set out the instructions from the controllers and the measures required to assist them. According to Annex II, the technical and organizational measures must be described in "specific (and not generic) terms." This covers any pertinent "certifications" to ensure an appropriate level of security, taking into consideration "the nature, scope, context and purpose of the processing, and the risks for the rights and freedoms of natural persons." Most significantly, measures aimed at pseudonymization and encryption are at the top of the list, followed by other measures such as: ensuring the confidentiality, integrity, availability and resilience of processing systems and services; and measures for ensuring user identification and authorization, physical security of locations at which personal data are processed, event logging, internal IT and IT security governance and management, data minimization, etc.80

As such, there are several practical features in the new SCCs toolbox which provide more flexibility and generate greater legal certainty for all parties. This seems particularly important when the SCCs cover complex, multiple international data transfers involving servers in several countries. However, the new SCCs impose more requirements than was previously the case, particularly where data importers act as controllers—for example, the obligation to give notice to data subjects and to notify data breaches to the relevant EU authorities.

4 A Critical Analysis of Supplementary Measures

Nevertheless, we would conclude that the so-called "supplementary measures" are not "supplementary." Unfortunately, the CJEU's characterization of such measures as "supplementary" undermines the original clarity of the GDPR with regard to the required standards for security of processing, as well as the available mechanisms for cross-border transfers of personal data.
The introduction of the concept of "supplementary measures" by the CJEU in its Schrems II judgment is unhelpful and unfortunate. It introduced unnecessary uncertainty because, inter alia, the CJEU (1) did not define or provide examples of effective "supplementary measures"; (2) erroneously considered these undefined measures to be "supplementary" when such measures were already core requirements necessary to comply with the GDPR (e.g., measures to ensure security of processing

79 Fennessy (2021).
80 Annex II, Standard Contractual Clauses.


under Article 32); (3) significantly blurred the lines between the different transfer mechanisms (Article 45 adequacy versus Article 46 appropriate safeguards); and (4) introduced a new and alien concept (i.e., "supplementary measures") not defined by the GDPR that undermines the native GDPR concept of "appropriate safeguards" (since "appropriate safeguards" under Article 46 are now no longer "appropriate" unless accompanied by "supplementary measures").
In all situations, Article 32 GDPR (security of processing) already requires all controllers and processors to conduct a risk assessment and implement the necessary technical and organizational measures to ensure a level of security appropriate to the risk. Furthermore, the original SCCs always required the measures to be specified in their Annex to ensure "appropriate safeguards." Accordingly, controllers and processors were already required to evaluate the various risks and implement the necessary technical and organizational safeguards to comply with the GDPR. The necessity of a thorough risk assessment, the implementation of technical and organizational measures, and the documentation of such measures as part of the SCCs have always been "core baseline measures" as opposed to "supplementary measures."
A more legally sound approach might have been for the CJEU to refer to these core requirements (e.g., Article 32 security of processing, Article 25 data protection by design and by default, Article 35 data protection impact assessments, Article 46 "appropriate safeguards" and the documentation requirements in the SCCs' Annexes) and rule that they needed to be complied with in order to satisfy obligations under the GDPR. In particular, the CJEU could have held that the specific SCC measures implemented by Facebook (and the level of documentation provided in the SCCs' Annex) were not sufficient to satisfy the Article 32 (security of processing) and Article 46 (appropriate safeguards) requirements, particularly in the case of US data importers that fall under 50 USC 1881a of the Foreign Intelligence Surveillance Act of 1978 (FISA 702). Given the significant risks posed by FISA 702, the CJEU might, instead, have emphasized that the "security of processing" requirements of Article 32(1)(a) ("pseudonymization and encryption of personal data"), Article 32(1)(b) ("the ability to ensure the ongoing confidentiality, integrity, availability and resilience of processing systems and services"), and Article 32(1)(d) ("a process for regularly testing, assessing and evaluating the effectiveness of technical and organizational measures for ensuring the security of the processing") were foundational to ensuring the effectiveness of the Article 46 "appropriate safeguards" in these cases.
Instead, the CJEU introduced the unnecessary and alien concept of "supplementary measures" (rather than relying on the established Article 46 GDPR "appropriate safeguards" standard and Article 32 security measures) and left it to the EDPB to define and provide guidance on such measures.
Perhaps not surprisingly, the much anticipated measures included as part of the EDPB's "recommendations on measures that supplement transfer tools to ensure compliance with the EU level of


protection of personal data"81 were simply the ordinary measures and baseline practices employed by controllers and processors to ensure compliance with Article 32 GDPR, and the measures expected to be implemented and documented as part of the Article 46 requirement to satisfy the "appropriate safeguards" standard with SCCs.
Pre-Schrems II, the cross-border transfer framework was relatively clear and settled and might be represented as follows:

Legal Cross-Border Transfer (T) = Country Laws (A) + Appropriate Safeguards (S)
Appropriate Safeguards (S) = GDPR Legal Obligations + Technical/Org. Measures

Thus, Article 46 GDPR "appropriate safeguards" (S) were designed to ensure that a GDPR level of data protection applied to the personal data by requiring controllers and processors to comply with all GDPR obligations regardless of the jurisdiction (A). The spectrum varied from jurisdictions (A) where the EC had recognized an adequate level of protection (Article 45 GDPR adequacy), for which no additional safeguard was needed (S = 0), to jurisdictions where the national data protection laws needed to be supplemented with the SCCs in order to ensure an equivalent level of protection. Notably, the SCCs included the legal obligations, as well as the technical and organizational measures specified in the Annex (security measures). Thus, the appropriate safeguards already included the measures that post-Schrems II are now referred to as "supplementary." This made sense since, for the transfer to take place under Article 46, the safeguards needed to be "appropriate." However, post-Schrems II, we have a new and more complicated configuration:

Legal Cross-Border Transfer (T) = Country Laws (A) + Appropriate Safeguards (S) + Supplementary Measures (SM)
Supplementary Measures (SM) = Technical/Org. Measures previously documented in the SCC Annex

Our view is that the pre-Schrems II understanding is legally superior since the legal basis for such transfers is Article 46 GDPR and consequently the safeguards must be "appropriate" (i.e., they must implement all the necessary measures considering the risk to fill any data protection gaps). Our argument that the new so-called supplementary measures are simply the old baseline requirements to satisfy core GDPR obligations (e.g., Articles 25, 32, 46 GDPR) is consistent with the EDPB's recent acknowledgement that: "[C]ontrollers may have to apply some or all of the measures described here irrespective of the level of protection provided for by the laws applicable to the data importer because they are needed to comply with Articles

81 EDPB Recommendations 01/2020.


25 and 32 GDPR in the concrete circumstances of the transfer […] exporters may be required to implement the measures described in this paper even if their data importers are covered by an adequacy decision, just as controllers and processors may be required to implement them when data is processed within the EEA."82
Unsurprisingly, the "supplementary measures" proposed by the EDPB mirror the baseline core measures already contemplated in Article 32 GDPR, i.e., pseudonymization and encryption of personal data. The guidance includes seven use cases to illustrate what it considers to be effective supplementary measures. Predictably, the only effective supplementary measures are those where (1) the data transferred is fully encrypted and the keys are retained solely under the control of the data exporter, or of an entity trusted by the exporter, in the EEA; (2) the data transferred is effectively pseudonymized; or (3) the transfer relies on split or multi-party processing (e.g., secure multi-party computation). The EDPB stated that in situations where the third-country importer has access to the encryption keys (i.e., access to the unencrypted data in cleartext) and the public authorities may have the power to access the transferred data, the "EDPB is incapable of envisioning an effective technical measure."83
Given that the only effective supplementary measures are full encryption, pseudonymization or multi-party processing, we anticipate that multiparty homomorphic encryption (HE) will emerge as a dominant solution, combining secure multiparty computation and homomorphic encryption to overcome their respective limitations and provide scalable and effective data protection measures in distributed cross-border data sharing settings (e.g., health research, clinical trials).
As noted by the EDPB, personal data needs to be encrypted both at "rest" (i.e., when data is stored) and "in transit" (i.e., when transferred from one place to another). While modern encryption algorithms are secure, they make it difficult to process and analyze the data without first decrypting it—and this process of decryption exposes the data to data protection risks, cyber-attacks, and unauthorized third-party interception.84 HE solutions have significant potential because they make it possible to query and analyze encrypted data without decrypting and thereby compromising it. HE might therefore solve many of the vulnerabilities inherent in other approaches to data protection and data security. This could also allow the processing of health data for secondary use and research, for example. HE has been labelled the "Holy Grail" of cryptography.85 Although HE is not a new technology, it is still in the early stages of development. In a previous article,86 we introduced a new automated tool for searching and analyzing encrypted data using

82 EDPB Recommendations 01/2020, Paragraph 83.
83 EDPB Recommendations 01/2020, Paragraphs 80, 94–95 (Use Case 6), 96–97 (Use Case 7).
84 Corrales Compagnucci et al. (2019, p. 144).
85 Tourky et al. (2016, p. 196).
86 Corrales Compagnucci et al. (2019, pp. 144–155).


HE techniques, which is being developed within the scope of the Energy Shield project.87 In the context of medical research, these features of HE could have important applications in enabling healthcare providers, research centers or pharmaceutical sponsors to analyze their data securely within an encrypted domain. For example, HE could be used in a private cloud medical record storage system (e.g., Patient Controlled Encryption), in which all data for a patient's medical record is encrypted by healthcare providers before being uploaded to the patient's record in the cloud storage system.88
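By way of illustration only, the following sketch shows the core property that makes HE attractive here: arithmetic performed directly on ciphertexts. It uses the open-source python-paillier library (phe), which implements Paillier encryption, an additively homomorphic scheme rather than the fully or multiparty homomorphic solutions discussed above. The scenario, values and variable names are our own assumptions, and the sketch is unrelated to the Energy Shield tool mentioned in the text.

```python
from phe import paillier  # pip install phe (python-paillier)

# The data exporter in the EU/EEA generates the keypair and never
# releases the private key to the third-country importer.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Hypothetical patient measurements, encrypted before export.
readings = [120, 135, 128]
ciphertexts = [public_key.encrypt(x) for x in readings]

# The importer aggregates ciphertexts without ever seeing cleartext:
# Paillier supports addition of ciphertexts and multiplication by scalars.
encrypted_sum = ciphertexts[0] + ciphertexts[1] + ciphertexts[2]
encrypted_mean = encrypted_sum * (1 / len(readings))

# Only the key holder inside the EU/EEA can decrypt the aggregate result.
print(private_key.decrypt(encrypted_mean))  # approx. 127.67
```

Multiparty HE schemes of the kind cited above additionally split the decryption key among several parties, so that no single entity inside or outside the EEA can decrypt alone.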

5 Conclusion

In this chapter, we explored the current framework for cross-border transfers post-Schrems II, specifically the new SCCs adopted by the EC to provide "appropriate safeguards" for cross-border transfers, as well as the EDPB guidance on "supplementary measures." We argued that the so-called "supplementary measures" are not, in fact, "supplementary" in any meaningful sense, and that the CJEU's characterization of such measures as "supplementary" undermines the original clarity of the GDPR with regard to the required standards for security of processing, as well as the available mechanisms for cross-border transfers of personal data.
That said, our view is that, despite the legal uncertainty introduced by the CJEU, several post-Schrems II developments have been helpful in clarifying and improving the safeguards associated with cross-border data transfers. These include the new SCCs and the EDPB guidance on measures supplementing transfer tools and its assessment of their effectiveness. The SCCs provide a practical toolbox to comply with the Schrems II requirements and contain a more pragmatic approach, with examples of possible security measures that companies can implement in high-risk situations. Encryption, pseudonymization and multi-party processing are at the top of the list. HE can be considered a combination of these three security measures. Multiparty HE could emerge as a dominant solution, combining secure multiparty computation and homomorphic encryption to overcome their respective limitations and provide scalable and effective data protection measures in distributed data sharing settings (e.g., health research, clinical trials). HE solutions have significant potential because they make it possible to query and analyze encrypted data without decrypting and thereby compromising it. Thus, HE enables organizations to provide the same level of protection without compromising the usability of the data, and multi-party HE approaches—with an independent

87 This project has received funding from the European Union's H2020 research and innovation programme under Grant Agreement No. 832907, https://energy-shield.eu/. Accessed 17 August 2022. See also https://medco.epfl.ch, a start-up that now commercializes a multi-party homomorphic encryption solution: https://tuneinsight.com. Accessed 17 August 2022.
88 See, e.g., Scheibner et al. (2021), Froelicher et al. (2021).


trusted third party that holds the key in the EU/EEA—provide a whole new layer of protection for cross-border transfers of personal data.
We finally note that it is important to monitor the effect that further technological developments may have on the significance of cross-border data transfers and, ultimately, of Schrems II. These developments range from new technologies giving momentum to decentralized innovation models, such as swarm intelligence technologies, blockchain, and decentralized data-centric AI, to new quantum encryption and decryption algorithms.

Acknowledgements This research was co-funded by the CLASSICA Horizon Europe project (Grant Agreement no. 101057321), by a Novo Nordisk Foundation grant for a scientifically independent Collaborative Research Program in Biomedical Innovation Law (grant agreement number NNF17SA0027784), and by the Inter-CeBIL program (grant agreement number NNF23SA0087056). The opinions expressed are the authors' own and not those of their respective affiliations. The authors declare no conflicts of interest.

References

Article 29 Data Protection Working Party, Adequacy Referential (updated), WP 254 at 2 (6 February 2018). https://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=614108. Accessed 17 Aug 2022
Bradford L, Aboy M, Liddell K (2020) International transfers of health data between the EU and USA: a sector-specific approach for the USA to ensure an 'adequate' level of protection. J Law Biosci 7(1):1–33
Bradford L, Aboy M, Liddell K (2021) Standard contractual clauses for cross-border transfers of health data after Schrems II. J Law Biosci 8(1):1–36
Braun M et al (2021) European Commission adopts and publishes new standard contractual clauses for international transfers of personal data. https://www.wilmerhale.com/en/insights/blogs/wilmerhale-privacy-and-cybersecurity-law/20210607-european-commission-adopts-and-publishes-new-standard-contractual-clauses-for-international-transfers-of-personal-data. Accessed 17 Aug 2021
Christakis T, Terpan F (2021) EU-US negotiations on law enforcement access to data: divergences, challenges and EU law procedures and options. Int Data Priv Law 11(2):81–106
Commission Nationale Informatique & Libertés, National Council of Free Software vs. Ministry of Solidarities and Health, Conseil d'Etat, Section du Contentieux, Ref. L. 521–2 CJA (9 October 2020)
Commission staff working document accompanying the document, Report from the Commission to the European Parliament and the Council on the third annual review of the functioning of the EU-US Privacy Shield, 25–27, COM (2019) 495 final (23.10.2019)
Corrales Compagnucci M et al (2019) Homomorphic encryption: the 'Holy Grail' for big data analytics & legal compliance in the pharmaceutical and healthcare sector? Eur Pharm Law Rev 3(4):144–155
Corrales Compagnucci M, Minssen T, Seitz C, Aboy M (2020) Lost on the high seas without a safe harbor or a shield? Navigating cross-border data transfers in the pharmaceutical sector after Schrems II invalidation of the EU-US privacy shield. Eur Pharm Law Rev 4(3):153–160
Corrales Compagnucci M, Aboy M, Minssen T (2021) Cross-border transfers of personal data after Schrems II: supplementary measures and new standard contractual clauses (SCCs). Nordic J Eur Law 4(2):37–47


Court of Justice of the European Union (CJEU) Press Release No. 117/15 (6 October 2015)
Court of Justice of the European Union (CJEU) Press Release No. 165/19, Advocate General's Opinion in Case C-311/18 Data Protection Commissioner v Facebook Ireland Limited, Maximillian Schrems (Luxembourg, 19 December 2019)
EDPB, Recommendations 01/2020 on measures that supplement transfer tools to ensure compliance with the EU level of protection of personal data (10 November 2020) ["EDPB Guidelines"]
European Commission Memo/17/15, 'Digital single market—communication on exchanging and protecting personal data in a globalized world' (10 January 2017). http://europa.eu/rapid/press-release_MEMO-17-15_en.htm. Accessed 17 Aug 2022
European Commission, 'European Commission adopts new tools for safe exchange of personal data' (4 June 2021). https://ec.europa.eu/commission/presscorner/detail/en/ip_21_2847. Accessed 17 Aug 2021
European Commission, Adequacy decisions. https://ec.europa.eu/info/law/law-topic/data-protection/international-dimension-data-protection/adequacy-decisions_en. Accessed 17 Aug 2022
European Data Protection Board (EDPB) (2020) Recommendations 01/2020 on measures that supplement transfer tools to ensure compliance with the EU level of protection of personal data. https://edpb.europa.eu/our-work-tools/our-documents/recommendations/recommendations-012020-measures-supplement-transfer_en. Accessed 17 Aug 2022
European Data Protection Supervisor (2020) Strategy for Union institutions, offices, bodies and agencies to comply with the Schrems II ruling. https://edps.europa.eu/sites/edp/files/publication/2020-10-29_edps_strategy_schremsii_en_0.pdf. Accessed 17 Aug 2022
Fennessy C (2021) Data transfers: questions and answers abound, yet solutions elude. https://iapp.org/news/a/data-transfers-questions-and-answers-abound-yet-solutions-elude/. Accessed 17 Aug 2022
Froelicher D et al (2021) Truly privacy-preserving federated analytics for precision medicine with multiparty homomorphic encryption. Nat Commun 12:5910
Jurcys P, Corrales Compagnucci M, Fenwick M (2022) The future of international data transfers: managing legal risk with a 'user-held' data model. Comput Law Secur Rev 46(105691):1–25
Lee P (2021) The updated standard contractual clauses: a new hope? https://iapp.org/news/a/the-updated-standard-contractual-clauses-a-new-hope/. Accessed 17 Aug 2021
Malik O, Khan M (2020) Understanding Baden Wurttemberg's updated guidance on international data transfers. IAPP.org. https://iapp.org/news/a/post-schrems-ii-understanding-baden-wurttembergs-updated-guidance-on-international-data-transfers/. Accessed 17 Aug 2022
Meltzer J-P (2020) The Court of Justice of the European Union in Schrems II: the impact of the GDPR on data flows and national security. Brookings report. https://www.brookings.edu/research/the-court-of-justice-of-the-european-union-in-schrems-ii-the-impact-of-gdpr-on-data-flows-and-national-security/. Accessed 17 Aug 2022
Milner-Smith A (2021) New standard contractual clauses—what do you need to know? https://www.lewissilkin.com/en/insights/new-standard-contractual-clauses-what-do-you-need-to-know. Accessed 17 Aug 2022
Roth P (2017) Adequate level of data protection in third countries post-Schrems and under the general data protection regulation. J Law Inf Sci 25(1):49–67
Scheibner J et al (2021) Revolutionizing medical data sharing using advanced privacy-enhancing technologies: technical, legal, and ethical synthesis. J Med Internet Res 23(2):e25120
Stoddart J, Chan B, Joly Y (2016) The European Union's adequacy approach to privacy and international data sharing in health research. J Law Med Ethics 44(1):143–155
Tourky D, El Kawkagy M, Keshk A (2016) Homomorphic encryption: the 'Holy Grail' of cryptography. In: 2nd IEEE International Conference on Computer and Communications (ICCC), pp 196–201


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The Internal Network Structure that Affects Firewall Vulnerability

Shinto Teramoto, Shizuo Kaji, and Shota Osada

Abstract Sharing extensive healthcare information is essential for the advancement of medicine and the formulation of effective public health policies. However, it often contains sensitive or personal information, or trade secrets. Certain safety measures are needed to strike a balance between the sharing of data and the protection of such information. A firewall is one of the major safety measures designed to prevent the delivery of protected information by severing harmful connections or limiting the formation of new connections between relevant parties in an information exchange network. Although very simple models suggest firewall vulnerabilities, such models often oversimplify real-world scenarios, neglecting factors like internal connections among nodes and the influence of other information held by nodes. Therefore, we propose several improved models and use them to explore some of the reasons why firewalls fail. Our study finds that firewalls are less effective as the number of network nodes increases, and that both high- and low-degree nodes pose non-negligible risks. The study also raises awareness about the role of internal monitors in preventing leaks. The effectiveness of information leakage control could be increased with the monitor's proximity to the information source. This necessitates a greater focus on internal monitoring, perhaps using information and communication technology.

Keywords Communication network · Graph theory · Information security · Mathematical model · Social network

S. Teramoto (B) Faculty of Law, Kyushu University, Fukuoka, Japan
S. Kaji Institute of Mathematics for Industry, Kyushu University, Fukuoka, Japan
S. Osada Faculty of Education, Kagoshima University, Kagoshima, Japan
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
M. Corrales Compagnucci et al. (eds.), The Law and Ethics of Data Sharing in Health Sciences, Perspectives in Law, Business and Innovation, https://doi.org/10.1007/978-981-99-6540-3_10


1 Introduction

It is widely recognized that the compilation and sharing of extensive data sets, composed of individual patients' and citizens' medical and health records, is vital for advancing medical science and practice, as well as for the design and implementation of public health policy.1 Unprocessed records, often intended for big data sharing, frequently encompass sensitive, personal, or secret information. This can include medical and health records, or trade secrets, referred to as "protected information" in this document. However, many laws and regulations limit the sharing of such information among multiple entities. For instance, in Japan, the Unfair Competition Prevention Act (the UCPA) defines the acquisition, use and disclosure of trade secrets2 through fraudulent, coercive, or other wrongful means, as well as related use and disclosure, as "unfair competition" (Article 2, Para. 1, Items 4 through 10), which are subject to civil remedies such as injunctions (Article 3), compensation for damages (Article 4) and criminal sanctions (Article 21). The UCPA also provides similar protection to "shared data with limited access"3 (Article 2, Para. 1, Items 11 through 16). The Act on the Protection of Personal Information (the APPI) imposes strict restrictions on businesses regarding the collection and use of personal information. It also prohibits administrative bodies from using or providing personal information in their possession unless authorized by laws and regulations. Undoubtedly, individual information contained in medical and health records falls within the domain of typical personal information.
Data sharing can conflict with the protection of protected information, which is frequently contained in the shared data, unless the data is processed to remove such information. Moreover, it is important to note that anonymizing personal information does not guarantee protection against forensic analysis, particularly when aided by artificial intelligence. To build and share big data that contains protected information, while safeguarding such information, it is crucial to implement measures to prevent its leakage and dissemination. Such measures are often referred to as "firewalls".
Suppose there is an entity, or a division of an entity (referred to as "E"), that is responsible for collecting, handling, and/or analyzing big data, including protected information. According to the general knowledge of legal practitioners, a typical firewall allows the nodes belonging to E, including employees of E and devices

1 See, e.g., Directorate-General for Health and Food Safety (May 3, 2022), the Strategic Headquarters for the Promotion of an Advanced Information and Telecommunications Network Society (June 15, 2021).
2 Article 2, Para. 6 of the UCPA provides: The term "trade secret" as used in this Act means technical or business information useful for business activities, such as manufacturing or marketing methods, that is kept secret, and is not publicly known.
3 Article 2, Para. 7 of the UCPA provides: The term "shared data with limited access" as used in this Act means technical or business information that is accumulated to a significant extent and is managed by electronic or magnetic means (meaning an electronic form, magnetic form, or any other form that is impossible to perceive through the human senses alone; the same applies in the following paragraph) as information to be provided to specific persons on a regular basis (excluding information that is kept secret).

The Internal Network Structure that Affects Firewall Vulnerability

175

such as PCs, tablets, cameras, and video recorders (referred to as “inside nodes”), to access a specific class of information, while prohibiting any nodes not belonging to E (referred to as “outside nodes”) from accessing such information by contacting any of the inside nodes. Consider a general physician at a clinic or a hospital in Japan examining a patient who may be infected with COVID-19 during the COVID-19 pandemic. For the sake of simplicity and convenience, this example does not take into account any paramedical staff involved in the same patient’s care. The physician is likely to access and collect the patient’s personal information, such as their recent activities and disease history, generate new personal information through diagnosis, and document it in the clinic’s electronic health records (“EHR”). The physician is obligated to protect the confidentiality of the patient’s personal information under the Penal Code4 and the APPI. However, Article 12 of the Act on the Prevention of Infectious Diseases and Medical Care for Patients with Infectious Diseases requires the physician to report the name, age, gender, and other items specified by the Ministry of Health, Labor and Welfare (MHLW) of the COVID-19 patient to the head of the public health center nearest to the clinic. The result of these regulations is that the physician must disclose specific personal information of the patient to the public health center, as mandated by the laws and regulations for controlling infectious diseases. However, the physician must also ensure that any other personal information of the patient remains confidential and is not shared with the public health center. Accordingly, the physician is not allowed to merely share the patient’s record in the EHR with the public health center. To fulfill this requirement, the MHLW has established HER-SYS (Health Center Realtime Information-Sharing System on COVID-19), which allows physicians to notify the public health center of COVID-19 cases.5 Physicians input only the information specified in Article 12, as mentioned above, into HER-SYS, which is taken from the records of individual patients. This means that a firewall has been installed between the clinic or hospital where the physician works and the adjacent public health center. Only the information specified by law is permitted to pass through the firewall and be shared with the public health center. This is considered a typical firewall, as it allows nodes within to share specific information with nodes outside while preventing the sharing of other information. The primary function of a typical firewall is to act as a filter. Such a filter is crucial in achieving a balance between sharing big data containing protected information and protecting that information.

4 Article 134, Para. 1 of the Penal Code of Japan provides: When a physician, pharmacist, pharmaceuticals distributor, midwife, attorney, defense counsel, notary public or any other person formerly engaged in such profession, without just cause, discloses another person's confidential information which has come to be known in the course of such profession, imprisonment for not more than 6 months or a fine of not more than 100,000 yen is imposed.
5 See https://www.mhlw.go.jp/stf/seisakunitsuite/bunya/0000121431_00129.html (Accessed 17 June 2023).


2 The Intrinsic Vulnerabilities of a Firewall

From the perspective of a social network, a firewall can be seen as designed to prevent the delivery of protected information from the nodes inside (V1 in Fig. 1) to the nodes outside (V2 in Fig. 1) the firewall by severing the edge (represented by the dotted line in Fig. 1) connecting them. However, even after the firewall is installed, friendships, comradeships, and other edges that do not involve the delivery of protected information can persist or be newly formed. A seemingly innocent edge may restore the severed edge, or even be transformed into a harmful edge that conveys protected information, causing the firewall to fail.

2.1 Reasons for Using Models to Study Firewall Vulnerabilities

Given the swift evolution of data management policies and the advent of new technologies and services, organizations may find it compelling to consider the implementation of an 'experimental firewall', one that diverges from the structure of conventional firewalls. However, if the experimental firewall causes a material leak of protected information, the organization may become legally responsible for compensating the affected individuals for any damages incurred as a result. Given these legal concerns, it is beneficial to examine the vulnerabilities and the effectiveness of an experimental firewall through models and simulations before conducting social experiments that require real organizations to implement such firewalls for examination purposes.

2.2 The Assumptions of the Simple Models

We make the following assumptions when designing our models to represent a social network with a firewall installed. While there are many other ways to represent a network, these are common and straightforward methods:

[Fig. 1 A representation illustrating the edges connecting V1 and V2]


(1) A model can be represented by a graph;
(2) The graph consists of nodes and edges;
(3) A node represents an individual actor participating in the social network. The actor can be a natural person, a digital device, an entity, a division, or anything else; and
(4) An edge connecting two nodes represents the flow of information between them.

2.3 Examination of Firewall Vulnerabilities Using Simple Models

2.3.1 Model 1 (As Shown in Fig. 2)

(1) Assumptions made for Model 1
(i) The model is represented by a network consisting of two nodes (V1 and V2);
(ii) V1 holds a piece of information, referred to as "I";
(iii) A firewall is installed to prevent V2 from accessing I by severing the edge (V1, V2); and
(iv) The probability that the edge (V1, V2) survives after the installation of the firewall is p.

(2) Results
The probability that I will not be accessible to V2 through the surviving edge (V1, V2) after the firewall installation is 1 − p. If p = 0.2, then the resulting probability is 0.8.

Event: (V1, V2) survives, with probability p; or (V1, V2) is severed, with probability 1 − p.
Result: I is accessible to V2 (firewall failed), with probability p; or I is not accessible to V2 (firewall successful), with probability 1 − p.
With p = 0.2: firewall failed, 0.2; firewall successful, 0.8.

[Fig. 2 A representation illustrating Model 1]


2.3.2 Model 2 (As Shown in Fig. 3)

(1) Assumptions made for Model 2
(i) The model is represented by a network consisting of three nodes (V1, V2 and V3);
(ii) There are two edges, (V1, V2) and (V1, V3);
(iii) V1 holds a piece of information, referred to as "I";
(iv) A firewall is installed to prevent V2 and V3 from accessing I by severing the edges (V1, V2) and (V1, V3); and
(v) For each of the edges (V1, V2) and (V1, V3), the probability that it survives after the installation of the firewall is p.

(2) Results
The probability that I will be accessible to neither V2 nor V3 through a surviving edge after the firewall installation is (1 − p)^2. If p = 0.2, the resulting probability is 0.64.

Event: (V1, V2) survives, with probability p; or (V1, V2) is severed, with probability 1 − p.
Event: (V1, V3) survives, with probability p; or (V1, V3) is severed, with probability 1 − p.
Result: I is accessible to V2 and/or V3 (firewall failed), with probability 1 − (1 − p)^2; or I is accessible to neither V2 nor V3 (firewall successful), with probability (1 − p)^2.
With p = 0.2: firewall failed, 0.36; firewall successful, 0.64.

[Fig. 3 A representation illustrating Model 2]


2.3.3 Model 3 (As Shown in Fig. 4)

(1) Assumptions made for Model 3
(i) The model is represented by a network consisting of nine nodes (V1, V2, V3, …, and V9);
(ii) Each of the edges (V1, V2), (V1, V3), …, and (V1, V9) connects V1 to one of the other nodes;
(iii) V1 holds a piece of information, referred to as "I";
(iv) A firewall is installed to prevent V2, V3, …, and V9 from accessing I by severing the edges (V1, V2), (V1, V3), …, and (V1, V9); and
(v) For each of the aforementioned edges, the probability that it survives after the installation of the firewall is p.

(2) Results
The probability that I will not be accessible to any of V2, V3, …, and V9 through any surviving edge after the firewall installation is (1 − p)^8. If p = 0.2, the resulting probability is 0.167.

Result: firewall failed, with probability 1 − (1 − p)^8; or firewall successful, with probability (1 − p)^8.
With p = 0.2: firewall failed, 0.832; firewall successful, 0.167.

[Fig. 4 A representation illustrating Model 3]


[Fig. 5 A representation illustrating Model 4]

2.3.4 Model 4 (As Shown in Fig. 5)

(1) Assumptions made for Model 4
(i) The model is represented by a network consisting of 8 × 2 nodes, labeled Vi, where i ranges from 1 to 8, and Vj, where j ranges from 9 to 16;
(ii) Each of the edges (Vi, Vj) connects Vi and Vj;
(iii) Each of the nodes Vi holds a piece of information, referred to as "I"; and
(iv) For each of the aforementioned edges (Vi, Vj), the probability that it will survive after the installation of the firewall is p.

(2) Results
The probability that I will not be accessible to any of the nodes Vj is (1 − p)^(8×8). If p = 0.2, the resulting probability is 6.27 × 10^−7.

Result: firewall failed, with probability 1 − (1 − p)^(8×8); or firewall successful, with probability (1 − p)^(8×8).
With p = 0.2: firewall failed, 0.999; firewall successful, 6.27 × 10^−7.
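The closed-form values above are easy to check numerically. The following sketch is our own illustration, not part of the authors' simulation code: it computes the firewall success probability (1 − p)^k for the k independently surviving edges of Models 1 through 4 and cross-checks it by Monte Carlo sampling. All function names are ours.

```python
import random

def success_probability(p: float, k: int) -> float:
    """Closed form: the firewall holds only if all k edges are severed,
    each edge surviving independently with probability p."""
    return (1.0 - p) ** k

def monte_carlo_success(p: float, k: int, trials: int = 200_000) -> float:
    """Estimate the same probability by sampling edge survival."""
    holds = sum(
        all(random.random() >= p for _ in range(k))  # no edge survives
        for _ in range(trials)
    )
    return holds / trials

# k = number of edges the firewall must sever: 1, 2, 8 and 8 x 8.
for label, k in [("Model 1", 1), ("Model 2", 2), ("Model 3", 8), ("Model 4", 64)]:
    print(label, success_probability(0.2, k), round(monte_carlo_success(0.2, k), 4))
# Closed-form values: 0.8, 0.64, ~0.17 and ~6.3e-07, matching the tables above.
```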

2.4 Suggestions

The models referred to above suggest that a firewall is highly vulnerable, particularly when there are numerous nodes both inside and outside of the firewall. However, it must be acknowledged that these models may not fully capture some of the conditions frequently present in real-world businesses, which can either increase or decrease the vulnerability of a firewall. For example:


(1) The aforementioned models assume that every node within the firewall has access to I from the outset. However, it is quite possible that only one node or a limited number of nodes within the firewall initially have access to I, and that I is disseminated to other nodes within the firewall subsequently. If the number of nodes within the firewall that have access to I is limited, the likelihood of the nodes outside the firewall gaining access to I decreases, and vice versa. Therefore, the speed and extent of the dissemination of I within the firewall is likely to affect the probability of a successful (or unsuccessful) firewall;
(2) The aforementioned models assume that the nodes within the firewall are isolated from each other. However, in real-world business scenarios, it is quite likely that these nodes are connected to each other, though they do not necessarily form a complete graph. Given that these connections have the potential to affect the spread of information within the firewall, it is important to incorporate them in any models used to examine the vulnerabilities and improvements of a firewall;6 and
(3) The aforementioned models do not take into account any information held by nodes within the firewall other than the protected information. However, in real-world scenarios, various types of information are held and disseminated among the nodes within the firewall, and some of this information may discourage a node within the firewall from sharing protected information with nodes outside the firewall. For instance, if a particular node within the firewall is widely known to be careless in handling protected information, other nodes within the firewall are likely to be discouraged from sharing such information with that node. Furthermore, other nodes within the firewall are highly likely to monitor that node to prevent the leakage of protected information outside the firewall. To make the models more reflective of real-world scenarios, the spread of this type of information should be incorporated into the models.

The limitations of the aforementioned models highlight the need for improvement by incorporating the communication networks present within the firewall. This is addressed in Sects. 3 and 4, which examine the improved models.

3 Graphical Model of Information Flow

In this section, we introduce a simple graphical model that can be used to measure the relationship between the network, the firewall, and the flow of information. We note that a variety of mathematical models have been proposed for representing information diffusion, such as the independent cascades (IC) and the

6 A firewall is considered to have failed when any protected information is accessed by any of the nodes outside the firewall. Accordingly, for the purpose of discussing the vulnerabilities and improvement of a firewall, we need to consider neither the spread of such information outside the firewall nor the connections among the nodes outside the firewall.


linear threshold (LT) models.7 There are several notable differences between these models and ours. Most importantly, our model does not consider time evolution, as we are interested in the asymptotic behavior of the network; for example, we look at the probability of information leakage within a fixed period. We do not care when the leakage happens; rather, we assess the risk of a leakage incident within a predefined term. This simplifies our model compared to previous ones while still meeting the modeling goal.
We model the network as a directed graph, where nodes represent agents and directed edges represent information pathways. Our fundamental assumptions are as follows:
(1) We consider a single type of protected information at a time; as the network structure depends on the type of information, we consider different graphs for each type of information, if necessary.
(2) The network consists of a number of agents, each of whom obtains (the fixed type of) information from outside the network or creates it with a certain probability.
(3) The information is delivered from one agent to another stochastically according to the relationship between them.
(4) We are interested in the probability of each agent possessing the information and the probability of information leakage outside the network.
This abstraction allows for modeling flexibility. The network can represent entities of various scales, ranging from a section of a company to a large system involving many institutions with a hierarchical structure. An agent can be a person, a group, an information terminal, or a server such as an EHR, depending on the granularity of the modeling.
The set of nodes is denoted by V, which consists of agents and two special nodes described later. Each directed edge connecting a node u to another node v is denoted by (u, v). An edge represents the existence of information transmission from u to v based on professional or personal relations between them. We assume the graph is simple in the sense that there is no self-loop (an edge from a node to itself) and no multiple edges between a single pair of nodes. Each edge is associated with a weight p(u, v), ranging from 0 to 1, that represents the probability of delivering the information from the transmitter u to the recipient v. The value p(u, v) could be influenced by various conditions, including but not limited to the intimacy between the persons denoted by u and v. We call p(u, v) the delivery probability for the edge from u to v.
There exists a distinguished node i ∈ V that has edges to every agent. This special node i can be considered a source of information, so that p(i, v) is understood as the probability for the agent v to obtain or create (the fixed type of protected) information in the course of his or her own work. We call p(i, v) the acquisition probability for the node v. There exists another distinguished node e ∈ V to which every agent has an edge. The special node e represents the external network (outside the firewall), so that p(u, e) is understood as the probability for an agent u who possesses the information to divulge it outside. We call p(u, e) the divulging probability for the node u. We assume

7 Guille et al. (2013).


that there is no edge between i and e, which means that we are not concerned with information delivery that does not involve the network under consideration. We use the symbols u and v to denote agent nodes, while w and w' represent any node, including the special ones. In our model, p(u, v), p(i, v), and p(u, e) are the model parameters that characterize the network.
Now, the probability p(w) for a node w to hold the information after the (predefined) fixed period is defined by the following equations:

p(i) = 1,
1 − p(w) = ∏_{w' ∈ N_in(w)} (1 − p(w') p(w', w))   (w ≠ i),   (1)

where N_in(w) = {w' ∈ V | ∃(w', w) ∈ E} is the set of nodes (including i) that have an edge to w. Equations (1) can be understood intuitively as follows. An agent v does not obtain the information only when no node (including i) delivers the information to v. A node u possesses the information with a probability of p(u) and delivers it to w through an edge (u, w) with a probability of p(u)p(u, w). Note that the distinguished node i delivers the information to u with a probability of p(i)p(i, u) = p(i, u), which is understood as the probability of u obtaining the information by itself. Assuming that delivery between any pair of nodes occurs independently, the probability 1 − p(w) of w not possessing the information is computed as the product of the probabilities of the adjacent nodes failing to deliver the information to w. The same procedure computes the probability p(e) of information leakage outside the network after the fixed period by setting w = e and p(i, e) = 0; a leakage incident occurs when some agent u inside the network delivers the information to e.
To summarize, given the parameters p(u, v), p(i, v), and p(u, e) that define the information network, the probability p(u) of an agent u holding the information and the probability p(e) of information leakage outside the network are computed by solving the system of Eqs. (1). This amounts to solving a polynomial equation of high degree, which is generally infeasible analytically. We have implemented simulation codes8 relying on the belief propagation algorithm,9 which iteratively solves (1). When the network has cycles, the algorithm is not guaranteed to converge to the solution, but it gives a good approximation of the solution in practice. It should be noted that p(u)p(u, e) can be considered the risk of the agent u divulging the information. This value serves as an indicator to identify the weak points of the network and to design prevention measures.

8 Available at: https://github.com/shizuo-kaji/InformationNetwork.
9 Frey (1998).
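As a concrete illustration of how Eqs. (1) can be solved, the sketch below iterates the equations to a fixed point, starting from p(w) = 0 for every agent. It is a minimal reimplementation under the assumptions stated above (independent deliveries, special nodes i and e), written by us rather than taken from the repository cited in footnote 8, and every function and variable name is ours.

```python
def solve_possession(agents, edges, p_acq, p_div, iters=500, tol=1e-12):
    """Solve Eqs. (1) by fixed-point iteration.

    agents : list of agent node names
    edges  : dict mapping (u, v) -> delivery probability p(u, v)
    p_acq  : dict mapping v -> acquisition probability p(i, v)
    p_div  : dict mapping u -> divulging probability p(u, e)
    Returns (p, p_e): possession probability per agent, leak probability p(e).
    """
    p = {v: 0.0 for v in agents}
    for _ in range(iters):
        new_p = {}
        for v in agents:
            # Probability that nobody (the source i included) delivers I to v.
            not_delivered = 1.0 - p_acq.get(v, 0.0)
            for (u, w), p_uv in edges.items():
                if w == v:
                    not_delivered *= 1.0 - p[u] * p_uv
            new_p[v] = 1.0 - not_delivered
        converged = max(abs(new_p[v] - p[v]) for v in agents) < tol
        p = new_p
        if converged:
            break
    # Leakage: some agent delivers the information to the external node e.
    not_leaked = 1.0
    for u in agents:
        not_leaked *= 1.0 - p[u] * p_div.get(u, 0.0)
    return p, 1.0 - not_leaked
```

On the single-agent network of Example 3.1.1 below, solve_possession(["u"], {}, {"u": 0.6}, {"u": 0.4}) returns p(u) = 0.6 and p(e) = 0.24, matching the hand computation.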


3.1 Simple Examples

Before diving into a complex situation, we look at a few simple examples to illustrate the essence of the model.

Example 3.1.1 The simplest example consists of a single agent along with the two distinguished nodes i and e. The model parameters are set as depicted in the diagram in Fig. 6 (Left); that is, p(i, u) = 0.6 and p(u, e) = 0.4. In this case, we can compute the model by hand.
• From 1 − p(u) = 1 − p(i)p(i, u) = 0.4, we see that the probability for u to hold the information after a fixed period is p(u) = 0.6.
• From 1 − p(e) = 1 − p(u)p(u, e) = 0.76, we calculate that the probability of information leakage after a fixed period is p(e) = 0.24.
• Assume that a newly employed measure (a firewall) halves the probability p(u, e) of u divulging the information. Then, we can easily see that the probability of information leakage p(e) is also halved.
In this simplest case with a single agent, reducing the information exchange between the inside and outside of the network has a linear impact on preventing information leakage.

Example 3.1.2 The next example involves two agents with different characteristics, as depicted in the diagram in Fig. 6 (Right). The agent u transmits the information to the other agent v, but not vice versa.
• We immediately see p(u) = 0.3.
• From 1 − p(v) = (1 − 0.4)(1 − 0.5p(u)) = 0.51, we see p(v) = 0.49. In addition to acquiring the information by itself (from i) with a probability of 0.4, the agent v receives the information delivered from u, which results in the final probability p(v) = 0.49.
• From 1 − p(e) = (1 − 0.8p(u))(1 − 0.1p(v)) = 0.76 · 0.951, we calculate p(e) = 0.277. Notice that this probability is smaller than the simple addition 0.8 · p(u) + 0.1 · p(v) = 0.289 of the probabilities of divulgence by u and v. This is

[Fig. 6 Example of simple networks with a single agent (Left) and two agents (Right)]


because both u and v divulge the information simultaneously with a probability of 0.289 − 0.277 = 0.012.
• Assume that a newly employed measure cuts the communication between the inside and outside of the network, halving the probability p(x, e) of divulging the information for x = u, v. From 1 − p(e) = (1 − 0.4p(u))(1 − 0.05p(v)) = 0.88 · 0.9755, we see p(e) = 0.142. Compared with the probability of 0.277 before implementing the measure, the leakage probability is reduced to approximately 51% of its former value. Even though the divulging probability of every agent is halved, the reduction in the final leakage probability is smaller than that. Not only the communication between the inside and outside of the network, but also the information flow inside the network matters. This illustrates the importance of the internal network structure in designing a prevention measure with respect to the target reduction in the leakage probability.

3.2 A Model of Firewalls

In our formalism, a prevention measure against information leakage, which we call a firewall, can be abstractly modeled as a change in the network parameters p(u, v), p(i, v), and p(v, e), or in the network topology. More precisely, let f be an increasing function from the interval [0, 1] to itself. A firewall is modeled by an application of f to certain network parameters. For example, applying f(x) = 0.5x to the edge delivery probabilities yields p'(u, v) = 0.5p(u, v), meaning that the information exchange between every pair of agents is halved by the firewall. In reality, it is difficult to design an actual measure, such as restricting access to the information through a new guideline, that achieves the effect of a given function f; conversely, it is difficult to estimate the effect of a given measure in the form of a function f. The aim of the analysis with our model is therefore more qualitative than quantitative; for example, we would like to know whether cutting connections between agents is more effective than restricting agents from acquiring the information. It should also be noted that smooth information exchange within the network, which is desirable for efficient operation, and the risk of undesirable leakage form a trade-off. We keep this trade-off in mind when interpreting the simulation results.
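As a minimal sketch of this abstraction (the names are ours), a firewall is nothing more than an elementwise application of an increasing map to a chosen family of parameters:

def apply_firewall(params, f):
    """Apply an increasing map f: [0, 1] -> [0, 1] elementwise to network parameters."""
    return f(params)

# e.g. the f(x) = 0.5x firewall from the text, applied to the delivery matrix P above
P_after = apply_firewall(P, lambda x: 0.5 * x)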

3.3 Numerical Simulation

We investigate the impact of the topological and quantitative network properties on information diffusion through a series of numerical simulations using our model. We emphasize that the experiments in this section are not meant to be comprehensive; they showcase the potential of our model for analyzing various scenarios. Since the effect of a firewall is modeled by a change of the network properties, we can see how the network responds to a particular firewall measure. The leak probability p(e)


is the main indicator to look at, but emphasis should also be put on the information possession probability p(u) of each agent. Naturally, a leakage incident is more likely when many agents possess the information. Since sharing information is vital to the efficient operation of the institution, assessing the trade-off between p(u) and p(e) is practically important in designing a firewall. We use the aforementioned computer code for the simulations.

Example 3.3.1 We look at complete graphs with various configurations. In this setting the network is homogeneous and every agent plays an equal role; this type of network is likely to be seen only in a small institution.

First, we look at the effect of the number of agents. The acquisition probability p(i, u) for each agent u is drawn from the normal distribution with mean 0.1 and standard deviation 0.1. The divulging probability p(u, e) for each agent u is drawn from the normal distribution with mean 0.01 and standard deviation 0.01. The delivery probability p(u, v) for each edge is drawn from the normal distribution with standard deviation 0.1 and varying means. We run 30 simulations and plot the distribution of the leak probability p(e) in Fig. 7 (Left). We observe that the leak probability increases with the size of the network. This is to be expected, since the amount of information obtained from outside sources or created within the network grows with the number of agents. When the network gets bigger (n ≥ 20), the edge delivery probability matters less. This observation agrees with Fig. 7 (Right), in which the distribution of p(u) is shown to saturate for n ≥ 10 once the mean delivery probability is at least 0.4. In a dense network, where every agent communicates with every other, information diffusion is quickly accelerated by the size of the network, and so is the risk of information leakage.

Next, we look at the effect of each agent's divulging and acquisition probabilities with the number of agents fixed at 10. The divulging probability for each agent is drawn from the normal distribution with standard deviation 0.01 and various means. The acquisition probability for each agent is drawn from the normal distribution with standard deviation 0.1 and various means. The edge delivery probabilities are drawn from the normal distribution with mean 0.1 and standard deviation 0.1. We run 30 simulations and plot the distribution of the leak probability in Fig. 8

Fig. 7 Results for complete graphs with various sizes and delivery probability. (Left) distribution of the leak probability p(e). (Right) distribution of the information possession probability p(u)


Fig. 8 Results for complete graphs of size 10 with various divulging and acquisition probabilities for agents. (Left) distribution of the leak probability p(e). (Right) distribution of the information possession probability p(u). The divulging probability does not affect p(u) and is omitted from that plot

(Left). We observe that the leak probability increases with the mean divulging and acquisition probabilities, but the former has a more significant effect. This is reasonable, since the divulging probability is directly connected to the leak probability. Figure 8 (Right) shows the change in the distribution of p(u) as the mean acquisition probability is varied. We observe that the median of p(u) increases in an upward-convex manner with respect to the mean acquisition probability, meaning that the effect of an increase in an individual's information acquisition is amplified by the network.

Note that the leak and possession probabilities obviously increase as the inflow of information in the form of acquisition increases. To conduct a controlled experiment separating the effect of acquisition, we consider the following adjustment. We define the total acquisition p(i, G) of the network by

1 − p(i, G) = ∏_{u∈V} (1 − p(i, u)).

This can be interpreted as the acquisition probability of a single imaginary agent combining all the agents. We then vary p(i, u) while keeping p(i, G) fixed. For example, we set p(i, u) to a constant value for a certain ratio of the agents and to zero for the rest. Figure 9 shows that when the ratio of agents with a positive acquisition probability is lower, the leak and possession probabilities are both higher, especially when the network is small. This means that information diffusion is promoted when the inflow is concentrated on a small number of agents. Also, the maldistribution of the agents who acquire the information affects the variance of p(u) more markedly than that of the leak probability. These results indicate that the popular strategy of restricting the number of people who create or acquire sensitive information from outside works in a small network: the leak probability p(e) can be kept comparatively low while maintaining a high level of information sharing p(u) within the network.
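A sketch of this experimental setup follows (our reading of the text; clipping the sampled values to [0, 1] is our assumption, since a normal distribution can stray outside the unit interval):

import numpy as np
rng = np.random.default_rng(0)

def sample_complete_graph(n, mean_deliver=0.1):
    """Parameters of Example 3.3.1 on a complete graph of n agents."""
    p_i = np.clip(rng.normal(0.10, 0.10, n), 0.0, 1.0)        # acquisition p(i, u)
    p_e = np.clip(rng.normal(0.01, 0.01, n), 0.0, 1.0)        # divulging p(u, e)
    P = np.clip(rng.normal(mean_deliver, 0.10, (n, n)), 0.0, 1.0)
    np.fill_diagonal(P, 0.0)                                  # no self-loops
    return P, p_i, p_e

def concentrated_acquisition(n, total, ratio):
    """Fix the total acquisition p(i, G) while concentrating it on a fraction of agents:
    1 - total = (1 - c)^k  =>  c = 1 - (1 - total)**(1/k) for the k active agents."""
    k = max(1, round(ratio * n))
    c = 1.0 - (1.0 - total) ** (1.0 / k)
    p_i = np.zeros(n)
    p_i[:k] = c
    return p_i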


Fig. 9 Results for complete graphs of various sizes and with various ratios of agents with a positive acquisition probability. (Left) distribution of the leak probability p(e). (Right) distribution of the information possession probability p(u)

Fig. 10 (Left) An example of the NWS network with 50 agents. (Middle) Correlation between the node degree and p(u) for agents. (Right) Correlation between the eigenvector centrality of the nodes and p(u) for agents

Example 3.3.2 As a real-world network model, we look at the Newman–Watts¹⁰ variant of the Watts–Strogatz small-world network (NWS, for short). In this setting, the average distance between agents grows logarithmically with the number of agents. We fix the number of neighbors and the rewiring probability to 5 and 0.8, respectively. Figure 10 (Left) shows an example of an NWS network with 50 agents, where the delivery probability is fixed at 0.1 for all edges, and the divulging and acquisition probabilities for each agent are fixed at 0.1 and 0.01, respectively. The node color intensity indicates p(u) for each agent u. We observe from Fig. 10 (Middle) that higher-degree agents, who have connections to many other agents, have higher p(u), as expected. From Fig. 10 (Right), a similar correlation is seen between the eigenvector centrality¹¹ of the nodes and p(u), where the eigenvector centrality is computed with respect to the edge weights p(u, v).

Following the case of the complete graph, we look at the effect of the number of agents. The acquisition probability for each agent is drawn from the normal distribution with mean 0.1 and standard deviation 0.1. The divulging probability for each agent is drawn from the normal distribution with mean 0.01 and standard deviation 0.01. The delivery probabilities for edges are drawn from the normal distribution with

10 Newman and Watts (1999).
11 Bonacich (1986).


Fig. 11 Results for NWS graphs with various sizes and delivery probability. (Left) distribution of the leak probability p(e). (Right) distribution of the information possession probability p(u)

standard deviation 0.1 and varying means. We run 30 simulations and plot the distribution of the leak probability p(e) in Fig. 11 (Left). We observe that the leak probability increases with the size of the network, but the distribution of p(u) is stable against the network size, as seen in Fig. 11 (Right). In a small-world network, the information possession probability p(u) of each node is less affected by the global topology of the network, in stark contrast to the case of the complete graph.

As the network is inhomogeneous and some agents have a higher degree of influence on the information diffusion than others, it would be natural to expect prevention measures to be more effective when applied to those agents with higher degrees. Figure 12 shows that this is not always the case. The network configuration is the same as the previous one except that n is fixed to 100, and the divulging or acquisition probability is reduced to zero for some agents. We choose 0–90% of the agents, from the highest degree downwards, and set their divulging probability to zero. This experiment emulates a prevention measure that prioritizes hub agents. Figure 12 (Left) shows that to reduce the leak probability effectively, divulgence must be prevented not only for the high-degree agents but for most (90%) of the agents. Figure 12 (Right) shows that preventing the acquisition of the information has only a small impact on the leak probability.

Fig. 12 Distribution of the leak probability for NWS graphs. (Left) A proportion of the agents’ divulging probability is cut to zero. (Right) A proportion of the agents’ acquisition probability is cut to zero
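The NWS topology is available off the shelf; a sketch of the setup used here, including the degree-targeted cut of Fig. 12 (Left), might look as follows (the helper names and the bidirectional treatment of edges are our assumptions):

import networkx as nx
import numpy as np

def nws_parameters(n, k=5, beta=0.8, deliver=0.1, seed=0):
    """NWS topology; each undirected edge becomes two directed edges with a fixed delivery probability."""
    G = nx.newman_watts_strogatz_graph(n, k, beta, seed=seed)
    P = np.zeros((n, n))
    for u, v in G.edges():
        P[u, v] = P[v, u] = deliver
    return G, P

def cut_top_divulgers(G, p_e, fraction):
    """Emulate a measure silencing the highest-degree agents first (Fig. 12, Left)."""
    order = sorted(G.nodes(), key=G.degree, reverse=True)
    p_e = p_e.copy()
    for u in order[: round(fraction * len(order))]:
        p_e[u] = 0.0
    return p_e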


Fig. 13 Results for NWS graphs with various sizes and delivery probability of the central agent. (Left) distribution of the leak probability p(e). (Right) distribution of the information possession probability p(u)

Lastly, we investigate how the existence of a central agent affects the overall information diffusion. We add a single agent that is connected to every other agent by bidirectional edges with varying delivery probability; this agent could represent a shared data server without proper access control, so that every member can store or retrieve information on the server. We set the network parameters as in Fig. 11, except that the delivery probability of the existing edges is fixed at 0.1. Figure 13 shows that the existence of a single central node drastically promotes information sharing. Such an agent can substantially increase the vulnerability of the whole network.
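The central agent (the shared-server scenario) can be sketched by enlarging the delivery matrix with one node wired bidirectionally to all others (a helper of our own):

import numpy as np

def add_central_agent(P, deliver_hub):
    """Append one agent with bidirectional edges to every existing node."""
    n = P.shape[0]
    P2 = np.zeros((n + 1, n + 1))
    P2[:n, :n] = P
    P2[:n, n] = deliver_hub   # every agent stores to the server
    P2[n, :n] = deliver_hub   # every agent retrieves from the server
    return P2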

3.4 Limitations of Our Model

Our model has several major limitations. First, its assumptions are too simple to accommodate various aspects of real-world networks. Second, it is virtually impossible to estimate the network parameters from data; this is a generic and fundamental limitation of model-based studies, due mainly to the lack of reliable real-world data. To deal with multiple types of protected information, we can consider one network for each type of information independently in our current formulation. However, we cannot capture the interaction between different types of information in this way, even though it is often the case that different sets of information have positive and negative impacts on each other's flow.


4 An Information Flow Model with Information Source and Monitor

In this section, we consider a network model of agents, such as EHRs inside a firewall, with two special agents: the source of the protected information and the monitor of information leakage. In reality, even if many agents within the firewall are aware of the risk of information leakage, a situation may naturally arise in which no one points it out. On the other hand, if one agent reacts to such a situation, other agents will follow, and the impact spreads, deterring agents from leaking information. In this section, we model such a situation: there is a monitor inside the firewall whose function is to prevent agents from divulging protected information externally. We are interested in how the locations of the source and the monitor affect the risk of information leakage. Since we do not care when information leakage occurs, we do not consider time evolution.

The basic assumptions of this section are as follows:
(1) We consider a single type of protected information at a time, as the network structure depends on the type of information.
(2) The network consists of many agents, among which two distinguished agents act as the information source and the monitor of information leakage. For simplicity, we call these two special agents the source and the monitor.
(3) The source spreads the protected information over the network. The monitor spreads monitoring over the network to prevent information leakage by other agents.
(4) An agent that the monitoring has not yet reached may leak information during the interval between receiving the information and the arrival of the monitoring.

We now introduce the formulation of the network model of this section. We model the relationship of information delivery between agents with a directed, edge-weighted graph. We assume that the graph has no multiple edges or self-loops and is strongly connected in the sense that each pair of distinct nodes u and v has a directed path from u to v. We denote the set of nodes by V, the number of nodes by N, and a directed edge from agent u to agent v by (u, v). Each edge (u, v) has a weight w(u, v) ≥ 0, which can be interpreted as the time taken by node u to deliver information to node v. We assume that, for each node, the protected information and its monitoring are delivered through the shortest paths from the source and the monitor, respectively. Here, the shortest path from u to v is the directed path from u to v that minimizes the total weight of the included edges. In our model, w(u, v) is the model parameter.

We now introduce an information leakage index based on the time taken for information to be transmitted. For two nodes u and v, we define the cost c(u, v) as the total weight of the shortest path from u to v. We define the information leakage index I(s, m, v) of a node v for the source s and the monitor m as follows:

I(s, m, v) = max{0, c(m, v) − c(s, v)}    (2)


where max{x, y} returns the larger of x and y. Note that if m = v in Eq. (2), then I(s, m, v) = 0; in this sense, the monitor itself has no possibility of information leakage. On the right-hand side of Eq. (2), c(m, v) (resp. c(s, v)) can be regarded as the expected time until the monitoring transmitted from m (resp. the information transmitted from s) is delivered to v. Therefore, c(m, v) − c(s, v) can be interpreted as the time between the delivery of the protected information to v and the delivery of the monitoring to v. If c(m, v) − c(s, v) ≤ 0 in Eq. (2), the monitoring is delivered to v before the protected information; hence there is no possibility of information leakage from v, and we set I(s, m, v) = 0. If we let p(v, e) = q be the probability that v divulges information outside per unit time, then the probability p(v) that v divulges information outside over the whole period is estimated as

1 − p(v) = (1 − q)^{I(s, m, v)}.

As I(s, m, v) approaches zero, p(v) approaches zero; as I(s, m, v) grows, p(v) approaches one. Therefore, I(s, m, v) is understood as an index of how easily the protected information can be leaked from v.

In sum, the model parameter w(u, v) defines an information flow model without time evolution. Selecting the positions of the source and the monitor, Eq. (2) gives the information leakage index of each node. By considering shortest paths, we model the delivery of the protected information and of the monitoring to each node without using time evolution. In the following, we compare the information leakage index for NWS network models with several rewiring probabilities.
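Since the index only involves shortest-path costs, it can be computed with two runs of Dijkstra's algorithm; a minimal sketch using networkx (the function name is ours):

import networkx as nx

def leakage_indices(G, s, m, weight="weight"):
    """I(s, m, v) = max{0, c(m, v) - c(s, v)} for every node v of a weighted digraph G.

    Under the strong-connectivity assumption both costs are finite; the inf default
    merely guards against missing paths in partial graphs.
    """
    inf = float("inf")
    c_s = nx.single_source_dijkstra_path_length(G, s, weight=weight)
    c_m = nx.single_source_dijkstra_path_length(G, m, weight=weight)
    return {v: max(0.0, c_m.get(v, inf) - c_s.get(v, inf)) for v in G.nodes()}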

4.1 Simple Example

Before presenting numerical simulations, we check the above concepts on a simple graph.

Example 4.1.1 As a simple example, we consider the weighted complete graph with four nodes V = {0, 1, 2, 3} shown in Fig. 14. The shortest paths from 0 to 1 and from 1 to 0 are {(0, 1)} and {(1, 3), (3, 0)}, respectively. Hence, c(0, 1) = w(0, 1) = 1.58 and c(1, 0) = w(1, 3) + w(3, 0) = 2.66 + 1.14 = 3.8. Let 0 and 2 be the source and the monitor, respectively. Then we compute the information leakage indices of 0, 1, 2, and 3 as follows:
• c(2, 0) = w(2, 0) = 3 and c(0, 0) = 0. Hence, I(0, 2, 0) = max{0, c(2, 0) − c(0, 0)} = max{0, 3 − 0} = 3.
• c(2, 1) = w(2, 1) = 1.07 and c(0, 1) = w(0, 1) = 1.58. Hence, I(0, 2, 1) = max{0, c(2, 1) − c(0, 1)} = max{0, −0.51} = 0.
• c(2, 2) = 0. Hence, I(0, 2, 2) = max{0, c(2, 2) − c(0, 2)} = max{0, −c(0, 2)} = 0.
• c(2, 3) = w(2, 1) + w(1, 3) = 1.07 + 2.66 = 3.73 and c(0, 3) = w(0, 3) = 1.21. Hence, I(0, 2, 3) = max{0, c(2, 3) − c(0, 3)} = max{0, 3.73 − 1.21} = 2.52.


Fig. 14 Example of a complete graph with four agents
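Feeding only the edge weights quoted in Example 4.1.1 into the helper above reproduces these indices (assuming, as the text's computations imply, that the edges of Fig. 14 omitted here do not create shorter paths):

G = nx.DiGraph()
for u, v, w in [(0, 1, 1.58), (1, 3, 2.66), (3, 0, 1.14),
                (2, 0, 3.00), (2, 1, 1.07), (0, 3, 1.21)]:
    G.add_edge(u, v, weight=w)

print(leakage_indices(G, s=0, m=2))
# ≈ {0: 3.0, 1: 0.0, 2: 0.0, 3: 2.52}, matching I(0, 2, v) above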

4.2 Numerical Simulation: Effects of Source and Monitor Position

We investigate the effects of the relative positions of the source and the monitor by calculating information leakage indices. For a fixed source s and monitor m, we define the mean Ī(s, m) and the variance I^Var(s, m) of I(s, m, v) over the nodes as follows:

Ī(s, m) = (1/N) ∑_{v∈V} I(s, m, v)

I^Var(s, m) = (1/N) ∑_{v∈V} (I(s, m, v) − Ī(s, m))²

If we let p(e) be the leak probability of the entire graph for a fixed source and monitor, then p(e) is estimated as

1 − p(e) = ∏_{v∈V} (1 − p(v)) = ∏_{v∈V} (1 − q)^{I(s, m, v)} = (1 − q)^{∑_{v∈V} I(s, m, v)}.

Hence, ∑_{v∈V} I(s, m, v) = N·Ī(s, m) is interpreted as the information leakage index of the entire graph. Since I(s, m, v), the information leakage index of an individual node v, can be greater than the mean Ī(s, m), the variance I^Var(s, m) is regarded as a measure of the range of values taken by the individual nodes.

Example 4.2.1 As a real-world network model, we generate an NWS network. For each edge (u, v), we independently choose a value p(u, v) at random between 0 and 1 and set its reciprocal as the edge weight, w(u, v) = p(u, v)^{−1}. Here, p(u, v) and w(u, v) are interpreted as the probability of delivering information from u to v through the edge (u, v) and the expected number of trials until success, respectively. For ease of numerical simulation, we choose each p(u, v) independently and uniformly at random from {0.01, 0.02, 0.03, …, 0.99}.
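A sketch of this construction and of the graph-level statistics (names ours; leakage_indices is the helper introduced above):

import numpy as np
import networkx as nx

def weighted_nws(n=25, k=3, beta=0.5, seed=0):
    """NWS digraph with w(u, v) = 1 / p(u, v), p(u, v) uniform on {0.01, ..., 0.99}."""
    rng = np.random.default_rng(seed)
    G = nx.newman_watts_strogatz_graph(n, k, beta, seed=seed).to_directed()
    for u, v in G.edges():   # each direction gets an independent weight
        G[u][v]["weight"] = 1.0 / (rng.integers(1, 100) / 100.0)
    return G

def index_mean_var(G, s, m):
    """Mean Ī(s, m) and variance I^Var(s, m) of I(s, m, v) over the nodes v."""
    I = np.fromiter(leakage_indices(G, s, m).values(), dtype=float)
    return I.mean(), I.var()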


Fig. 15 Results for NWS networks of size 25 with 3 neighbors and various rewiring probabilities. Means over the choices of s and m of Ī(s, m) (Left) and of I^Var(s, m) (Right). Here, Ī(s, m) and I^Var(s, m) are the expectation and variance of I(s, m, v) over v

We look at the relationship between the rewiring probability and the information leakage index of the entire graph. Figure 15 (Left) shows the relation between the rewiring probability and the mean of Ī(s, m) over the choices of the source and the monitor. Here, the graph size is 25, the number of neighbors is set to 3, and the rewiring probability, denoted by beta in Fig. 15, is varied from 0 to 1 in increments of 0.1. As expected, we observe that the mean of Ī(s, m) decreases as the rewiring probability increases. In the same setting, Fig. 15 (Right) shows that the mean of I^Var(s, m) also decreases as the rewiring probability increases.

Example 4.2.2 Under the same setting as Example 4.2.1, we look at the effect of the distance between the source and the monitor. We fix a graph and examine the relationship between the distance between s and m on the one hand and Ī(s, m) and I^Var(s, m) on the other. We set d(u, v) = c(u, v) + c(v, u) as the distance between u and v. Figure 16 shows the correlations between the distance d(s, m) and both Ī(s, m) and I^Var(s, m)^{1/2} for rewiring probabilities 0.1, 0.5, and 1. We observe that both Ī(s, m) and I^Var(s, m)^{1/2} increase as d(s, m) increases.


Fig. 16 Results for NWS networks of size 25 with 3 neighbors and rewiring probabilities 0.1, 0.5, and 1.0. Correlations between d(s, m) and Ī(s, m) (Left) and I^Var(s, m)^{1/2} (Right) for all choices of s and m

5 Conclusion

Legal practitioners can draw several suggestions from the examinations in Sects. 3 and 4 to help them implement effective measures against firewall failures in their clients' systems.

Based on the examination of Example 3.3.1, it is likely that in a group with dense mutual connections between nodes (such as a complete graph or a similar network), an increase in the number of nodes raises the risk of divulging protected information. Limiting the number of nodes, which includes both employees and computer terminals, in the organization responsible for handling protected information would therefore be a simple but effective measure. For industries such as medicine or finance, where a large number of nodes are involved, it would be advisable to divide the nodes into multiple groups, such as medical teams or business divisions, each with a limited number of nodes handling the protected information of a specific set of subjects.

Based on the examination of Example 3.3.2, it is likely that in a small-world network, an increase in the number of nodes has less impact on the risk of divulging protected information. That will be a relief for the management of many organizations whose employees or computer terminals form such networks. However, the existence of a node that acts as a central hub, disseminating protected information to many nodes within the firewall, would substantially increase the risk of protected information leaking outside the firewall. A shared data server may be an example of such a node, and proper access control should be implemented that permits access only to the need-to-know information of each node. Contrary to expectations, to effectively prevent the divulgence of protected information, preventive measures should be applied not only to the nodes with higher degree but also to most of the


nodes. Given this, it still makes sense to follow conventional practice and educate employees of all ranks on how to prevent the disclosure of protected information.

In places such as companies, government offices, and hospitals, it is indeed possible to formulate plans in which personnel with the duty to prevent information leaks are positioned. In reality, however, such instances are rather rare. Additionally, the purpose of workplace training focused on confidentiality and personal information protection may be not only for each employee to prevent their own potential leaks, but also to equip them with the ability to prevent information leaks caused by their colleagues. Indeed, it is part of our common understanding that each of us is more likely to refrain from leaking information when we consider that we are being watched by others. Moreover, the awareness that 'we are being watched' is itself a different kind of 'information', separate from confidential or personal information, and this information can be expected to propagate through the network inside the firewall. Considering these factors, it seems to have practical significance to explore, as proposed in Sect. 4, a model where not only the source or initial transmitter of information but also monitors aiming to prevent the leakage of confidential or personal information exist within the network inside the firewall. This model incorporates the propagation of "deterrent information", which plays a role in preventing information leaks.

The leakage of confidential or personal information beyond the firewall invariably occurs through communication between a node inside the firewall and a node outside of it, and any node within the firewall can potentially become one side of such communication. In light of this, it aligns with common understanding to assume that if we can restrict the propagation of confidential or personal information within the firewall, we can also limit the likelihood of such leakage.

The model proposed in Sect. 4 focuses on the degree of commonality between the propagation paths of confidential or personal information originating from the source or initial transmitter (s) and the propagation paths of deterrent information originating from the monitor (m). It attempts to estimate how effective this commonality can be in suppressing the propagation of confidential or personal information within the network of the firewall. However, it is not practical to identify individual paths and scrutinize the similarities between them. Instead, the discussion in Sect. 4 focuses on the distance between s and m, as it can reasonably be assumed that the closer s and m are, the closer the propagation paths originating from each become. The discussion then explores the correlation between the distance between s and m and the likelihood of information leakage. The model suggests that the spread of confidential or personal information can be restrained when the distance between s and m is short, that is, when there is a high level of commonality between the propagation paths of the confidential or personal information originating from s and the propagation paths of the deterrent information originating from m.


From this suggestion, legal practitioners can obtain several insights. Legal practitioners have not actively considered appointing individuals who monitor specific information at risk of leakage and raise awareness about information handling; while we have diligently provided organizational education to prevent information leakage, we have not given sufficient thought to this aspect. The individuals who take on such roles may be considered a type of whistleblower. However, while we have been enthusiastic about establishing systems to protect whistleblowers who publicly disclose concealed information (for example, the Whistleblower Protection Act), we have underestimated the role of whistleblowers who warn of the risks of information leakage. Of course, it is not realistic to have a large number of monitors within an organization, and increasing the number of monitors could itself increase the risk of them becoming sources of leakage. However, considering that much information transmission occurs through electronic communication networks, it may be worth considering mechanisms that assign a monitoring role to individual nodes within the telecommunications network. This could be a viable approach worthy of careful consideration.

References

Act on the Prevention of Infectious Diseases and Medical Care for Patients with Infectious Diseases ("Kansenshō no Yobō oyobi Kansenshō no Kanja ni taisuru Iryō ni kansuru Hōritsu" in Japanese. Act No. 114 of 1998, as amended)
Act on the Protection of Personal Information (APPI) ("Kojinjōhō no Hogo ni kansuru Hōritsu" in Japanese. Act No. 57 of 2003, as amended)
Act on the Protection of Personal Information Held by Administrative Organs ("Gyōsei Kikan no Hoyū suru Kojinjōhōhogo ni kansuru Hōritsu" in Japanese. Act No. 58 of 2003, as amended)
Bankruptcy Act ("Hasan Hō" in Japanese. Act No. 75 of 2004, as amended)
Bonacich P (1986) Power and centrality: a family of measures. Am J Sociol 92(5):1170–1182
Directorate-General for Health and Food Safety, Proposal for a regulation—the European Health Data Space (May 3, 2022), https://health.ec.europa.eu/publications/proposal-regulation-european-health-data-space_en. Accessed 17 June 2023
Frey BJ (1998) Graphical models for machine learning and digital communication. MIT Press, Cambridge MA
Guille A et al (2013) Information diffusion in online social networks: a survey. ACM SIGMOD Rec 42(2):17–28
Newman MEJ, Watts DJ (1999) Renormalization group analysis of the small-world network model. Phys Lett A 263(4–6):341–346
Penal Code ("Kei Hō" in Japanese. Act No. 45 of 1907, as amended)
The Strategic Headquarters for the Promotion of an Advanced Information and Telecommunications Network Society [Kōdojōhōtsūshin Nettowāku Shakai Suishin Senryaku Honbu], Basic Guidelines for Open Data [Ōpundēta Kihon Shishin] (June 15, 2021), available at https://cio.go.jp/sites/default/files/uploads/documents/data_shishin.pdf. Accessed 17 June 2023


Unfair Competition Prevention Act (UCPA) ("Fusei Kyōsō Bōshi Hō" in Japanese. Act No. 47 of 1993)
Whistleblower Protection Act of Japan ("Kōeki Tsūhōsha Hogo Hō" in Japanese. Act No. 122 of 2004)

Index

A
Abortion, 8, 85–94
Accountability, 4, 24, 35, 58, 74, 76–78, 80, 86, 88, 155, 163
Act on the Protection of Personal Information (the APPI), 174
Antimicrobial Resistance (AMR), 70
Appropriate safeguards, 113, 137, 140, 153, 155, 158, 160, 161, 166, 167, 169
Artificial Intelligence (AI), 5, 7, 8, 51, 52, 60, 99, 102, 104, 174

B
Biomedical research, 14, 70

C
Charter of Fundamental Rights (CFR), 133, 157
Children's Online Privacy Protection Act (COPPA), 35
Citizen science, 7, 69, 71, 75–80
Code of conduct, 16–20
Communication network, 181, 197
Consent, 2, 5, 7, 8, 16, 24, 25, 27, 28, 36–38, 49, 51, 54, 55, 58, 60–65, 80, 87, 90, 93, 99–101, 104–122, 127, 131, 138, 139, 141, 146, 147
Court of Justice of the EU (CJEU), 9, 151, 152
COVID-19 pandemic, 15, 88, 175
Cross-border transfers of personal data, 9, 151, 154, 159, 165, 169, 170

D
Data Act, 74, 146
Data governance, 24, 28, 65, 74
Data Governance Act (DGA), 7, 51, 61, 74, 143
Data privacy, 7, 24, 27, 33, 35, 44–46, 48, 61, 87, 91–93, 152, 156
Data protection, 8, 9, 24, 25, 28, 37, 52, 53, 58, 61, 76, 100, 101, 104–108, 111, 114, 116, 120–122, 127, 132, 133, 137–139, 144, 147, 151, 153–156, 158–160, 164, 166–169
Data security, 4–6, 24, 53, 63, 76, 165, 168
Dinerstein v. Google (Dinerstein), 34, 40
Dobbs v. Jackson Women's Health Organization (Dobbs), 8, 85

E
Electronic Health Record (EHR), 54
Electronic Protected Health Information (ePHI), 86
Ethics, 5, 6, 13, 15, 17, 23, 26–28, 94, 101, 106, 115–119, 131
EU AI Act, 52
European Data Protection Board (EDPB), 9, 108, 151, 153
European Health Data Space (EHDS), 5, 9, 127, 130
EU-US Data Privacy Framework, 152

F
Fair Credit Reporting Act (FCRA), 36
FAIR principles, 71


Federal Trade Commission (FTC), 34, 36, 37, 156
Federal Trade Commission Act (FTCA), 35
Financial data, 8, 85, 89
Firewall, 9, 10, 88, 174–182, 184, 185, 191, 195, 196
Further processing, 101, 110, 112–115, 122, 130–132, 136

G
General Data Protection Regulation (GDPR), 2, 24, 51, 87, 127, 152
Genomic policy, 14
Genomic research, 16–18, 27
Global Alliance for Genomics and Health (GA4GH), 6, 13–30
Graph theory, 174

H
Health data, 1, 2, 5–9, 14, 15, 23, 24, 27, 28, 30, 33, 35, 44, 46–49, 53–55, 57, 59, 61, 63, 70–73, 80, 87, 99–101, 105, 111, 117, 122, 127–132, 137, 138, 140–147, 151, 153, 159
Health Data Access Body (HDAB), 144
Health Insurance Portability and Accountability Act (HIPAA), 4, 35
Health science, 2–4, 6
Horizon Europe, 100
Human Genome Project (HGP), 14

I
Information security, xi
Innovation, 3, 5, 7, 10, 22, 51–53, 61, 64, 65, 69–77, 79, 80, 89, 93, 100, 102, 138, 139, 145, 163, 169, 170
International data transfers, 2, 5, 165

L
Legal framework, 14, 16, 27, 53, 71, 74, 79, 156

M
Machine Learning (ML), 8, 53, 60, 104, 127, 128, 133, 134, 140
Mathematical model, 181
Medical devices, 2, 5, 6, 8, 9, 52, 62, 127–129, 131–143, 145–147
Multiparty homomorphic encryption, 9, 151, 168

N
National Security Agency (NSA), 154
New Standard Contractual Clauses (SCCs), 163

P
Patient perspective, 7, 51, 53, 56, 60, 63–65
Private enforcement, 35, 40, 41, 43–49
Protected Health Information (PHI), 35, 86
Public enforcement, 37, 39, 40, 44, 45, 47–49

R
Responsible Research and Innovation (RRI), 7, 69, 71
Retrospective data collection, 8, 99, 101, 103, 113
Roe v. Wade, 8, 85

S
Schrems II, 9, 151–161, 163–165, 167, 169, 170
Scientific research, 8, 79, 100, 105, 107–110, 113, 115, 117, 120, 122, 127, 132–137, 139, 142, 145, 147
Secondary use (of data), 2, 5, 15, 16, 59, 63, 131
Social network, 54, 176, 177
Supplementary measures, 9, 151, 153, 154, 157–163, 165–169
Supreme Court, 8, 41, 42, 46, 47, 85, 88, 89, 131

T
Transfer Impact Assessment (TIA), 154, 162, 164
Trust, 5, 15, 16, 20, 21, 52, 53, 55–57, 71, 73, 75–78, 80, 118, 137

U
University of Chicago Medical Center (UCMC), 33, 34, 40–43, 46

W
Working Party (WP29), 107, 108, 157
World Health Organization (WHO), 3, 53, 116