Patient Reported Outcomes and Quality of Life in Surgery [1st ed. 2023] 3031275969, 9783031275968

This book provides a guide to the assessment of quality of life and patient reported outcomes measures in general surger

English Pages 251 [245] Year 2023


Table of contents :
Contents
1: Quality of Life Theory
What Is Quality of Life?
Measuring Quality of Life and Associated Challenges
Conclusion (Fig. 1.3)
References
2: Statistical Methods for PROMS and QoL
Introduction
Models for Longitudinal Data Analysis
Repeated Measures Model
Growth Curve Models
Statistical Analytical Methods
Basic Scoring Systems
Basic Statistical Analyses
Mixed-Effects Models
Generalised Estimating Equations
Minimally Important Difference
Ceiling-Floor Effects
Missing Data and Imputation Methods
Quality Adjusted Life Years (QALYs)
Quality Adjusted Life Years
Cost-Utility Analysis
Non-traditional Quality of Life Assessment Methods
Limitations
Conclusion
References
3: Research Methods for PROMS and QoL
Introduction
Clinical Need for PROMs
Reported Outcome Measures
Measurement Scales for PROMs
Types of PROMs Instrument
Types of PROMs Instruments
Generic Instruments
Disease-Specific Instruments
Other
Establishing a PROMs Instrument
Conclusion
References
4: Methodology for Systematic Reviews on Measurement Properties of Patient Reported Outcome Measures (PROMS)
Introduction
Aim of the Chapter
What Are Patient Reported Outcomes (PROs) and Patient Reported Outcome Measures (PROMs)
What Are the Measurement Properties of PROMs
Why Perform Systematic Reviews on Measurement Properties of PROMs
Current Methodology: The COSMIN Initiative
Definitions and Taxonomy
Performing a Systematic Review
General
Literature Search
Evaluation of Measurement Properties
Evaluation of Content Validity
Evaluation of Internal Structure
Evaluation of Reliability, Measurement Error, Criterion Validity, Hypotheses Testing for Construct Validity and Responsiveness
Report and Selection of Most Suitable PROM
Limitations and Considerations
References
5: Quality of Life as Endpoint in Surgical Randomised Controlled Trials
Introduction
Cardiac Surgery
Gastrointestinal Surgery
Gynecological Surgery
Discussion
Conclusion (Fig. 5.2)
References
6: The Role of Patient Reported Outcomes Measures (PROMS) and Health-Related Quality-of-Life (HRQoL) in Economic Analysis
An Introduction to Economic Evaluation
Types of Economic Analysis
The Quality-Adjusted Life-Year
Empirical Methods to Measure Quality-of-Life Directly
Generic Patient-Reported Outcome Measures (PROMS) to Measure Quality-of-Life
The EQ-5D
The SF-36
Other Generic Instruments
Disease-Specific Patient-Reported Outcome Measures (PROMS) to Measure Quality-of-Life
Examples of Disease Specific PROMS Used in Economic Analysis
Discussion
References
7: Quality of Life Following Bariatric and Metabolic Surgery
Introduction
Quality of Life (QoL)
HRQoL After Surgery
HRQoL After Bariatric and Metabolic Surgery
Methods
Search Study and Inclusion Criteria
Data Extraction and Results Reporting
Results
Summary of Studies
Quality of Life (QoL) Assessments Tools
Study Heterogeneity
Physical QoL Changes
Mental QoL Changes
QoL Post Different Procedures
Discussion
References
8: Quality of Life after Upper GI Surgery
Introduction
Esophagus
Gastric
Discussion
Conclusions (Fig. 8.2)
References
9: Patient-Reported Quality of Life After Pancreatic and Liver Surgery
Introduction
Methods
Literature Search
Inclusion/Exclusion Criteria
Data Extraction
Data Analysis
Results
Study Characteristics of Studies in Pancreatic Cancer
Quality of Life Assessment in Pancreatic Cancer
Study Characteristics of Studies in Liver Cancer
Quality of Life Assessment in Liver Cancer
Discussion
Patient-Reported Global Quality of Life After Pancreatic Resection
Patient-Reported Global Quality of Life After Liver Resection (LR)
Study Limitations
Conclusion
References
10: Quality of Life in Head & Neck Surgical Oncology and Thyroid Surgery
Quality of Life (QoL) Instruments in Head & Neck and Thyroid Cancer
Introduction
Quality of Life (QoL) Instruments
Discussion
Organ-specific Quality of Life (QoL) Considerations in Head & Neck Surgical Oncology
Quality of Life (QoL) in Laryngeal Cancer
Quality of Life (QoL) in Oropharyngeal Cancer
Quality of Life (QoL) in Thyroid Cancer
Comparison of Quality of Life Between Different Treatments for Thyroid Cancer
Robotic Versus Open Thyroidectomy
Radioactive Iodine Ablation Versus Surgery
Radiofrequency Ablation Versus Surgery
Hemithyroidectomy Versus Total Thyroidectomy
Active Surveillance Versus Surgery
Conclusion
References
11: Quality of Life and Patient Reported Outcomes in Breast Cancer
Introduction
Material and Methods
Search Strategy
Inclusion and Exclusion Criteria
Outcomes of Interest
Results
Selected Studies
Factors Influencing HRQOL in Breast Cancer Patients
Patient and Lesion Characteristics
Treatment
Social and Psychological Factors
Physical Factors
Intervention Specific Evidence of HRQOL
Treatment
Social and Psychological Interventions
Physical Interventions
Discussion
Patient and Lesion Characteristics
Treatment
Social and Psychological Factors
Physical Factors
Limitations
Future Research and Clinical Practice
Conclusion
References
12: Quality of Life After Colorectal Surgery
Introduction
QOL Tools in Colorectal Surgery
Modular Questionnaires
Specific Questionnaires
Bowel Dysfunction
Urogenital Dysfunction
Biopsychosocial Impact of Ostomy
QOL Considerations in the Management of Specific Colorectal Conditions
Colorectal Cancer
Inflammatory Bowel Disease (IBD)
Inherited Cancer Syndromes
Diverticular Disease
Perianal Surgery
Discussion
References
13: Quality of Life After Lung Cancer Surgery
Introduction
Materials and Methods
Search Strategy
Inclusion/Exclusion Criteria
Outcomes of Interest and Data Extraction
Results
Selected Studies
Study Objectives, Designs and Population
Studies Comparing HRQOL Results as per Surgical Approach [Thoracotomy vs. Video-Assisted Thoracoscopic (VATS) Lung Resection vs. Robotic-Assisted Thoracoscopic Surgery (RATS)]
Studies Comparing Uniportal vs. Multi-Portal VATS Lobectomy
Studies Comparing Surgical Lung Resection Vs. Stereotactic Ablative Radiotherapy (SABR)
Studies Comparing HRQOL Results as Per Extent of Resection (Sublobar, Lobectomy, Sleeve, Bilobectomy, Pneumonectomy)
Studies Comparing QoL Results After Lung Resection Among Different Age Groups
Studies Comparing HRQOL After Lung Resection Against Control Population or No Comparison at All
Health-Related Quality-of-Life Measures Used
Limitations of the Studies
Conclusion
References
14: Health-Related Quality of Life and Patient Reported Outcome Measures Following Transplantation Surgery
Introduction
Background
The Role of HRQOL-PROMS in Transplantation
Assessment Tools for Measuring HRQOL-PROMS in Transplantation
Materials and Methods
Search Strategy
Inclusion/Exclusion Criteria
Outcomes of Interest and Data Extraction
Quality Score
Results
HRQOL and PROMS in Kidney Transplantation
Background
Overall QoL Outcomes
Disease-Related Symptom Burden
Physical Health Outcomes
Mental Health After Kidney Transplantation
Employability and Social-Wellbeing Outcomes
Donor Quality of Life Post-Donation
Paediatric Transplantation
HRQOL & PROMS in Liver Transplantation
Background
Overall QoL
Disease-Related Symptom Burden
Physical Health
Mental Health Outcomes
Employment and Social Outcomes
Paediatric Transplantation
Donor Outcomes
Heart and Lung Transplantation
Background
Symptom-Burden and Physical Functionality After Cardiac Transplantation
Psychosocial Outcomes After Cardiac Transplantation
Paediatric Cardiac Transplantation
Lung Transplantation
Symptom-Burden and Physical Health Outcomes After Lung Transplantation
Psychosocial Outcomes After Lung Transplantation
Discussion & Conclusion
The Patient-Specific Balance of HRQOL-PROMS and Its Significance for Value-Based Care in Transplantation
References
Index

Patient Reported Outcomes and Quality of Life in Surgery Thanos Athanasiou Vanash Patel Ara Darzi Editors

Editors Thanos Athanasiou Surgery and Cancer Imperial College London London, UK

Vanash Patel Surgery and Cancer Imperial College London London, UK

Ara Darzi Surgery and Cancer Imperial College London London, UK

ISBN 978-3-031-27596-8    ISBN 978-3-031-27597-5 (eBook) https://doi.org/10.1007/978-3-031-27597-5 © Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

1: Quality of Life Theory
   Esha Khanderia and Vanash Patel
2: Statistical Methods for PROMS and QoL
   Bhamini Vadhwana and Munir Tarazi
3: Research Methods for PROMS and QoL
   Bhamini Vadhwana and Munir Tarazi
4: Methodology for Systematic Reviews on Measurement Properties of Patient Reported Outcome Measures (PROMS)
   Orestis Argyriou, Michail Chatzikonstantinou, Vanash Patel, and Thanos Athanasiou
5: Quality of Life as Endpoint in Surgical Randomised Controlled Trials
   Athina A. Samara
6: The Role of Patient Reported Outcomes Measures (PROMS) and Health-Related Quality-of-Life (HRQoL) in Economic Analysis
   Wilfred Ifeanyi Umeojiako, Ahmer Mansuri, Katherine-Helen Hurndall, and Christopher Rao
7: Quality of Life Following Bariatric and Metabolic Surgery
   Alan Askari, Chanpreet Arhi, and Ravikrishna Mamidanna
8: Quality of Life after Upper GI Surgery
   Grigorios Christodoulidis, Athina A. Samara, and Michel B. Janho
9: Patient-Reported Quality of Life After Pancreatic and Liver Surgery
   Nicole E. James, Eliana Kalakouti, Swathikan Chidambaram, Tamara M. H. Gall, and Mikael H. Sodergren
10: Quality of Life in Head & Neck Surgical Oncology and Thyroid Surgery
   George Garas, Keshav Gupta, and Sameer Mallick
11: Quality of Life and Patient Reported Outcomes in Breast Cancer
   Kim Borsky and Fiona Tsang-Wright
12: Quality of Life After Colorectal Surgery
   Niamh A. Moynagh, George Malietzi, and Ailín C. Rogers
13: Quality of Life After Lung Cancer Surgery
   Thomas Tsitsias and Thanos Athanasiou
14: Health-Related Quality of Life and Patient Reported Outcome Measures Following Transplantation Surgery
   Zoe-Athena Papalois and Vassilios Papalois
Index

1: Quality of Life Theory
Esha Khanderia and Vanash Patel

E. Khanderia (*): Watford General Hospital, West Hertfordshire Teaching Hospitals NHS Trust, Watford, UK
V. Patel: Watford General Hospital, West Hertfordshire Teaching Hospitals NHS Trust, Watford, UK; Imperial College London, London, UK

What Is Quality of Life?

Quality of life (QOL) was first described as “a state of complete physical, mental and social well-being, and not merely the absence of disease and infirmity” by the World Health Organisation (WHO) in 1948 [1]. This definition demonstrates the multidimensional nature of QOL and highlights the wide range of factors that contribute to it, including physical and psychological well-being and social circumstances such as education, access to healthcare, standard of living, income, political climate and environment [2–4]. This is illustrated by Maslow’s Hierarchy of Needs (Fig. 1.1), which identifies eight human needs that must be fulfilled in order to feel happy, healthy and able to function. Lindstrom’s Quality of Life Model (Fig. 1.2) outlines the four main spheres of life experienced by all individuals that may impact quality of life [5, 6]. However, there is no single universal definition of QOL, and it is therefore subjective and difficult to measure. In view of its varying and wide-ranging parameters, QOL is often considered to be a vague and inconsistent concept.

The definition of QOL has evolved since it was first described by the WHO in 1948 [1]. It is a dynamic concept, as it is “modified by the developments, experiences and changes” that occur throughout life [7]. The demographics of society in the western world have also changed since QOL was first defined, which has perhaps in turn influenced the evolution of QOL theory. The evolution of QOL theory as a concept from 1948 to the modern day is illustrated in Table 1.1. In addition to the evolving definitions of QOL over the years, the significance of QOL in medicine has increased exponentially over time, as demonstrated by the number of publications on MEDLINE containing the term “quality of life” from 1977 to date [14, 15].

The terms QOL and health related quality of life (HRQOL) are often used interchangeably in the literature; however, their meanings are distinct [16]. HRQOL was first described as a concept in the 1990s as “a functional effect of illness and its treatment as perceived by the patient” [17]. Therefore, while QOL encompasses spheres of life and general parameters within them such as work, family and education, HRQOL focuses more specifically on the impact of disease on a wide range of factors in a patient’s life, which include physical function and the social and psychological burden of health and disease [18, 19].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 T. Athanasiou et al. (eds.), Patient Reported Outcomes and Quality of Life in Surgery, https://doi.org/10.1007/978-3-031-27597-5_1

Fig. 1.1 Maslow’s hierarchy of needs [5]

Sphere | Dimension | Examples
I. Global | 1. Macro-environment; 2. Human rights; 3. Politics | Clean environment, democratic rights, etc.
II. External | 1. Work; 2. Family standard of living; 3. Residence, housing | Inheritance, parent background – knowledge provided to a child, influence for child’s further education and dependence to social class; family income, nutrition, residence, type of dwelling, etc.
III. Interpersonal | 1. Family; 2. Close relationships; 3. Interpersonal relationships | Structure and function of social relationships – relationships with parents, other family members, relatives, friends, society, etc.
IV. Personal | 1. Physical; 2. Psychological; 3. Spiritual | Growth, personality development, activeness, self-respect, meaning of life, etc.

Fig. 1.2 Lindstrom’s general quality of life model, 1992 [6]

Traditionally, clinical outcomes such as morbidity and mortality were measured and considered to be indicators of population health. However, with increased access to medical treatment, education and technology, life expectancy has risen and is no longer thought to be a representative measure of well-being. Given the changing demographics and availability of resources, considering HRQOL in disease and treatment, together with the impact on carers and life expectancy, is imperative in making decisions that are in the best interest of patients [13].

Table 1.1 Quality of life theory

Source | Year | QOL theory
World Health Organisation [1] | 1948 | “A state of complete physical, mental and social well-being, and not merely the absence of disease and infirmity”
Abraham Maslow [5] | 1962 | QOL theory developed from self-actualisation theory and a theory of human motivation. Maslow’s hierarchy of needs describes the eight main needs of humans to be fulfilled in order to feel happy, healthy and able to function. It is suggested therefore that QOL is dependent on these eight main needs (Fig. 1.1)
Ventegodt [8] | 1970 | Integrative Quality of Life Theory which uses subjective factors such as encompassing wellbeing, satisfaction with life, happiness, meaning in life and objective factors such as biological order, realising of life potential, fulfilment of needs and cultural norms to describe QOL
Calman [4] | 1984 | “Quality of life measures the difference, or the gap, at a particular period of time between the hopes and expectations of the individual and that individual’s present experiences. Quality of life can only be described by the individual, and must take in to account many aspects of life.”
Cutter [9] | 1985 | “An individual’s happiness or satisfaction with life and environment, including needs and desires, aspirations, lifestyle preferences and other tangible and intangible factors which determine overall well-being.”
Feinstein [10] | 1987 | “Quality of life seems to be an umbrella term covering a variety of concepts such as functioning, health status, perceptions, life conditions, behaviour, happiness, lifestyle, symptoms, etc.”
Lindstrom [6] | 1992 | There are four main spheres of life that encompass a number of different dimensions. Each of these dimensions contributes to quality of life (Fig. 1.2)
Felce and Perry [11] | 1995 | “An overall general wellbeing that comprises objective descriptors and subjective evaluations of physical, material, social and emotional wellbeing together with the extent of personal development and purposeful activity, all weighed up by a personal set of values”
Kagawa-Singer et al. [12] | 2010 | “QOL is a subjective, multidimensional experience of well-being that is culturally constructed as individuals seek safety and security, a sense of integrity and meaning in life, and a sense of belonging in one’s social network”
World Health Organisation [13] | 2014 | “An individual’s perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns”

Over the last 70 years since QOL was first described by the WHO, the population of the United Kingdom (UK) has increased from 50 million to approximately 66.4 million [20]. Life expectancy in the UK has also increased by an average of 13 years since 1948 and the number of co-morbid patients with chronic illnesses has also increased. In the UK in 2016, 1.3 billion prescriptions were issued, reflecting the burden of long-term conditions [21–23] on an increasingly ageing population. Long-term health conditions are known to significantly affect HRQOL as they can alter the ability of individuals to retain their way of life prior to developing the illness. However, HRQOL is a subjective descriptor, as personal assessment of health determines HRQOL, and therefore patients with the same condition or illness can have a very different perception of their own HRQOL [24].

Disease burden and therefore healthcare needs have changed since 1948. Since post-war Britain,
mortality from conditions such as heart disease, tuberculosis and stroke has fallen [20, 21, 25, 26]. Public health initiatives on preventative medicine and healthy living aimed at improving HRQOL through diet, exercise, patient education and accessibility of resources may be partially responsible for this reduction in mortality. Despite the health improvement and reduction of inequalities seen since the inception of the NHS, cancer related deaths have risen from 16.8% in 1948 to 27.8% in 2017 [20, 21, 25, 26]. New initiatives introduced in the last few decades for cancer screening have enabled earlier detection, prompt treatment, reduced mortality and improved HRQOL with better outcomes for patients. Due to an ageing population, deaths from “senility and dementia” have also risen [20, 21, 25, 26].

Better access to healthcare has meant that people are living longer with disease due to early diagnosis and treatment. As the age of the population increases, patients develop greater comorbidities, with an accompanying decline in cognitive ability and functional status. Physical ability, independence and being able to carry out activities of daily living (ADL) are important contributors to HRQOL [27]. From a social perspective, many face isolation with age due to immobility, frailty and comorbidities [28–30]. Similarly, life events such as retirement or loss of life partners can have a significant impact on mental health and emotional wellbeing. All of these factors contribute to HRQOL.

In managing long-term conditions, people rely increasingly on the support systems that have been set up around them to enable them to lead a fuller life. These can vary in magnitude between countries and cultures and are therefore often difficult to quantify. Micro-environmental factors are also important for older people, as many are accustomed to a routine and their own surroundings. Uprooting or effecting a change in the environment of older people, especially those with cognitive impairment, can affect HRQOL and cause considerable distress. Older individuals who have a level of education, and therefore an understanding of their own health, tend to have a better perception of HRQOL [31]. In addition, personal satisfaction with health is also associated with a better perception of HRQOL [32]. Therefore, in unwell elderly patients, it is important to look at the patient holistically whilst taking into account comorbidities, age and issues around recovery [33].

In addition, there is improved understanding, diagnosis and treatment of mental illnesses and increased recognition of the link between mental health and physical and social health [34]. Mental health has become a greater focus of healthcare agendas and is of increasing importance as the cost to society due to related morbidity and mortality escalates. Support for those with mental illness, treatment of conditions and the impact on QOL have become more recognised with an increased awareness in society. Certain surgical subspecialties have recognised this and implemented screening procedures prior to surgery. For example, in bariatric and gender reassignment surgery, preoperative psychological assessments of suitability for surgery have been introduced after noting the impact of individuals’ mental health on HRQOL and on suitability for life changing medical interventions [35].

Measuring Quality of Life and Associated Challenges

Measurement of HRQOL enables healthcare professionals to determine the impact of illness and healthcare interventions on patients by examining differing aspects of their lives, not mortality alone [36]. Different methods to measure HRQOL have been developed over a number of years. HRQOL measurement tools used today are based on three main dimensions: social, functional and psychological indicators [37]. Social indicators are factors such as education, poverty, employment and life expectancy that are used to measure social progress within society [38]. Functional indicators are factors such as mobility and the ability to continue normal life despite illness. Functional indexes in medicine started in the 1930s to assess parameters such as the ability to perform ADL. The New York Heart Association score was one of the first functional indices developed in order to evaluate the functional capacity of patients with heart disease [39]. Many specialties have adapted this functional index and formulated similar tools to determine the impact of chronic disease, cancer and surgical procedures on function [40–44]. Psychological indicators are factors such as mood, happiness, anxiety and loneliness [45]. Psychologists use scoring systems to assess the quality of life in patients surviving with long term illnesses [37, 46–48]. This shows the multifaceted input required to measure HRQOL, which has made it challenging to quantify and compare HRQOL across patient populations.

HRQOL questionnaires were developed in the 1970s and were adapted for chronic illnesses from a quality of life score first created by John Flanagan, an American psychologist [49–51]. A number of validated HRQOL questionnaires have since been developed and are commonly used by healthcare professionals to calculate HRQOL [3]. These questionnaires are self-administered and provide an overall numerical score that can be compared in patients with similar health conditions and treatment across a population, combining objective and subjective indicators.

HRQOL is generally only measured during points of direct contact between patients and their healthcare providers and therefore refers to a set point in time. This means that healthcare providers do not have a dynamic insight into HRQOL following new diagnosis, treatment or through a period of chronic illness and therefore lack data that represents the spectrum of disease or illness. In addition, patient questionnaires measuring QOL are distributed at varying times along the patient journey, which makes it difficult to compare QOL among patient groups even with the same illness. Moreover, individuals have different expectations of illness and wellness, which can be dependent on baseline function and health. It is also important to remember that, as QOL is a dynamic concept, patient perceptions and expectations may change over time [52].

Furthermore, it can be argued that QOL measurement tools are not patient centred, as they describe QOL through markers developed by healthcare professionals to provide a qualitative or quantitative marker of QOL. Questionnaires are usually limited in their scope as they also restrict the patient’s ability to answer outside of the pre-determined options [53]. In addition, due to translation and language differences, it is conceivable that the essence and nuances of QOL questionnaires may be lost in particular parts of a population. This lack of comparability and transferability of data is often compounded by cultural variations in health-related behaviour, which make certain questions incompatible or irrelevant due to different models of health beliefs in some populations [54].

Conclusion (Fig. 1.3)

Quality of life is a dynamic concept that has evolved through time. The advent of modern technology and a changing population demographic will lead to changes in the definition of QOL to encompass factors pertinent to the society of the time. HRQOL measurement tools consequently play an important role as indicators of patient preference and opinion regarding their own health and wellbeing. They are therefore also potentially useful markers of the cost effectiveness of healthcare interventions. Thus, modern day medicine must consider HRQOL and look at the patient holistically when making decisions about the management of patients, to ensure that high quality patient care and outcomes are delivered.

There is no universal definition of QOL; it is subjective and encompasses many parameters such as social, physical and mental well-being, cultural norms and personal perspectives

HRQOL focuses specifically on the impact of disease on a wide range of factors in a patient’s life

Traditionally, outcomes such as morbidity and mortality were considered to be indicators of population health. However, the importance of HRQOL in patients due to the changing demographics of society and availability of treatment is imperative in making decisions that are in the best interest of patients

Measurement of HRQOL enables healthcare professionals to determine the impact of illness and healthcare interventions on patients through examination of differing aspects of their life and not at mortality alone

Validated HRQOL questionnaires are used by healthcare professionals to collect information about HRQOL during the patient journey and are indicators of patient opinion and preference

However, when using HRQOL tools in clinical practice, it is important to bear in mind that these can be unreliable due to language barriers, cultural differences, differing expectations and responses provided at varying times by different patients

Fig. 1.3 Conclusions

References

1. Post MW. Definitions of quality of life: what has happened and how to move on. Top Spinal Cord Inj Rehabil. 2014;20(3):167–80.
2. Siegrist J, Junge A. Conceptual and methodological problems in research on the quality of life in clinical medicine. Soc Sci Med. 1989;29(3):463–8.
3. Porcu S, Mandas A. How to evaluate quality of life. Monaldi Arch Chest Dis. 2019;89(1). https://doi.org/10.4081/monaldi.2019.1033.
4. Calman KC. Quality of life in cancer patients—an hypothesis. J Med Ethics. 1984;10(3):124–7.
5. Maslow AH. Toward a psychology of being. New York: Simon and Schuster; 2013.
6. Lindström B. Quality of life: a model for evaluating health for all. Conceptual considerations and policy implications. Soz Praventivmed. 1992;37(6):301–6.
7. Holmes S. Assessing the quality of life—reality or impossible dream? A discussion paper. Int J Nurs Stud. 2005;42(4):493–501.
8. Ventegodt S, Merrick J, Andersen NJ. Quality of life theory I. The IQOL theory: an integrative theory of the global quality of life concept. ScientificWorldJournal. 2003;3:1030–40.
9. Cutter SL. Rating places: a geographer’s view on quality of life. Washington, DC: Association of American Geographers; 1985.
10. Feinstein AR. Clinimetric perspectives. J Chronic Dis. 1987;40(6):635–40.
11. Felce D, Perry J. Quality of life: its definition and measurement. Res Dev Disabil. 1995;16(1):51–74.
12. Kagawa-Singer M, Padilla GV, Ashing-Giwa K. Health-related quality of life and culture. Semin Oncol Nurs. 2010;26(1):59–67.
13. World Health Organization. WHOQOL: measuring quality of life 2012. Available from https://www.who.int/toolkits/whoqol
14. Wood-Dauphinee S. Assessing quality of life in clinical research: from where have we come and where are we going? J Clin Epidemiol. 1999;52(4):355–63.
15. Moons P, Budts W, De Geest S. Critique on the conceptualisation of quality of life: a review and evaluation of different conceptual approaches. Int J Nurs Stud. 2006;43(7):891–901.
16. Karimi M, Brazier J. Health, health-related quality of life, and quality of life: what is the difference? PharmacoEconomics. 2016;34(7):645–9.
17. Schipper H. Quality of life. J Psychosoc Oncol. 1990;8(2–3):171–85.
18. Etxeberria I, Urdaneta E, Galdona N. Factors associated with health-related quality of life (HRQoL): differential patterns depending on age. Qual Life Res. 2019;28(8):2221–31.
19. Benito-León J, Rivera-Navarro J, Guerrero AL, de Las HV, Balseiro J, Rodríguez E, et al. The CAREQOL-MS was a useful instrument to measure caregiver quality of life in multiple sclerosis. J Clin Epidemiol. 2011;64(6):675–86.
20. ONS. Overview of the UK population. August 2019. Available from https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/articles/overviewoftheukpopulation/august2019
21. Nuffield Trust. Facts and figures on the NHS at 70 2018. Available from https://www.nuffieldtrust.org.uk/files/2018-07/facts-and-figs-website.pdf
22. Hawe E, Cockcroft L. OHE guide to UK health and health care statistics. London: Office of Health Economics; 2013.
23. NHS Digital. Prescriptions dispensed in the community—statistics for England, 2006–2016. 2017. Available from https://digital.nhs.uk/data-and-information/publications/statistical/prescriptions-dispensed-in-the-community/prescriptions-dispensed-in-the-community-statistics-for-england-2006-2016-pas
24. Albrecht GL, Devlieger PJ. The disability paradox: high quality of life against all odds. Soc Sci Med. 1999;48(8):977–88.
25. ONS. The 20th century mortality files 2013. Available from https://data.gov.uk/dataset/2548e46b-873e-4668-968c-25d6c155dd73/the-20th-century-mortality-files
26. ONS. Deaths registered in England and Wales—21st century mortality. 2020. Available from https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/the21stcenturymortalityfilesdeathsdataset
27. Levasseur M, Desrosiers J, St-Cyr TD. Do quality of life, participation and environment of older adults differ according to level of activity? Health Qual Life Outcomes. 2008;6:30.
28. Schoene D, Heller C, Aung YN, Sieber CC, Kemmler W, Freiberger E. A systematic review on the influence of fear of falling on quality of life in older people: is there a role for falls? Clin Interv Aging. 2019;14:701–19.
29. Vanleerberghe P, De Witte N, Claes C, Schalock RL, Verté D. The quality of life of older people aging in place: a literature review. Qual Life Res. 2017;26(11):2899–907.
30. Schwartz RM, Ornstein KA, Liu B, Alpert N, Bevilacqua KG, Taioli E. Change in quality of life after a cancer diagnosis among a nationally representative cohort of older adults in the US. Cancer Investig. 2019;37(7):299–310.
31. Elsous AM, Radwan MM, Askari EA, Abu AM. Quality of life among elderly residents in the Gaza Strip: a community-based study. Ann Saudi Med. 2019;39(1):1–7.
32. Levasseur M, St-Cyr Tribble D, Desrosiers J. Meaning of quality of life for older adults: importance of human functioning components. Arch Gerontol Geriatr. 2009;49(2):e91–e100.
33. Barata A, Martino R, Gich I, García-Cadenas I, Abella E, Barba P, et al. Do patients and physicians agree when they assess quality of life? Biol Blood Marrow Transplant. 2017;23(6):1005–10.
34. WHO. Mental health. Available from https://www.who.int/health-topics/mental-health#tab=tab_1
35. Snyder AG. Psychological assessment of the patient undergoing bariatric surgery. Ochsner J. 2009;9(3):144–8.
36. Addington-Hall J, Kalra L. Who should measure quality of life? BMJ. 2001;322(7299):1417–20.

7 37. Prutkin JM, Feinstein AR.  Quality-of-life measurements: origin and pathogenesis. Yale J Biol Med. 2002;75(2):79–93. 38. Palys TS.  Measuring social well-being: a progress report on the development of social indicators. JSTOR; 1979. 39. Association CCotNYH. Nomenclature and criteria for diagnosis of diseases of the heart and great vessels. Boston: Little, Brown; 1994. 40. Zeman FD.  The functional capacity of the aged; its estimation and practical importance. J Mt Sinai Hosp N Y. 1947;14(3):721–8. 41. Karnofsky D, editor. The clinical evaluation of chemotherapeutic agents in cancer. New York: Columbia University Press; 1949. 42. Moskowitz E, McCann CB.  Classification of disability in the chronically ill and aging. J Chronic Dis. 1957;5(3):342–6. 43. Mahoney FI, Wood OH, Barthel DW. Rehabilitation of chronically ill patients: the influence of complications on the final goal. South Med J. 1958;51(5):605–9. 44. Lawton MP, Brody EM. Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist. 1969;9(3):179–86. 45. Shiovitz-Ezra S, Leitsch S, Graber J, Karraker A. Quality of life and psychological health indicators in the national social life, health, and aging project. J Gerontol B Psychol Sci Soc Sci. 2009;64(Suppl 1):i30–i7. 46. Lebo D. Some factors said to make for happiness in old age. J Clin Psychol. 1953;9(4):386–7. 47. Eisenberg H, Goldenberg I. A measurement of quality of survival of breast cancer patients. In: Clinical evaluation in breast cancer. London: Academic Press; 1966. p. 93–108. 48. Costanza R, Fisher B, Ali S, Beer C, Bond L, Boumans R, et al. An integrative approach to quality of life measurement, research, and policy. SAPI EN S Surveys and Perspectives Integrating Environment and Society 2008(1.1). 49. Burckhardt CS, Anderson KL.  The Quality of Life Scale (QOLS): reliability, validity, and utilization. Health Qual Life Outcomes. 2003;1:60. 50. 
Meyers S, Walfish JS, Sachar DB, Greenstein AJ, Hill AG, Janowitz HD.  Quality of life after surgery for Crohn’s disease: a psychosocial survey. Gastroenterology. 1980;78(1):1–6. 51. Meyers S. Assessing quality of life. Mt Sinai J Med. 1983;50(2):190–2. 52. Carr AJ, Gibson B, Robinson PG. Measuring quality of life: is quality of life determined by expectations or experience? BMJ. 2001;322(7296):1240–3. 53. Carr AJ, Higginson IJ.  Are quality of life measures patient centred? BMJ. 2001;322(7298):1357–60. 54. Danielsen AK, Pommergaard HC, Burcharth J, Angenete E, Rosenberg J.  Translation of questionnaires measuring health related quality of life is not standardized: a literature based research study. PLoS One. 2015;10(5):e0127050.

2

Statistical Methods for PROMS and QoL Bhamini Vadhwana and Munir Tarazi

Introduction

Statistical analyses for patient reported outcome measures (PROMs), predominantly health related quality of life (QoL) assessments, are varied and can affect the overall conclusions. Psychometric evaluations from PROMs tools can be open to statistical interpretation. A fundamental understanding of the PROMs instrument scales, scoring methods, and handling of multiple longitudinal data is crucial in generating valuable results to guide patient-centred care. PROMs are used to assess symptoms, functional domains, general health perceptions, and QoL [1]. Statistical analyses of PROMs are used in research to provide insights into the impacts of disease and its treatment, and are used clinically to enhance patient-centred care and incorporate the patient’s perspective in health system performance evaluation [1, 2].

B. Vadhwana (*) · M. Tarazi, Department of Surgery and Cancer, Imperial College London, London, UK

Models for Longitudinal Data Analysis

Repeated Measures Model

Repeated measures describe multiple assessments following a clinical treatment at discrete time points, where time is treated as a categorical variable. Longitudinal studies of this nature typically follow a model of 2–4 assessments with a fixed follow-up duration, for example pre-treatment, then 6 weeks and 12 weeks post-treatment. The schedule is normally dictated by the timing of patient visits and hospital protocols. Risk of bias must be considered and minimised. For example, health measures taken immediately after surgery can give a false representation because of the pain and anxiety associated with surgery itself. On the other hand, if the assessment windows are too wide, there is a risk of introducing irrelevant variables which can influence the statistical power of the study. Analysis of the repeated measures model must account for missing data and for varying time periods between assessments. The data are analysed with a linear mixed model in which time is a categorical factor and a variance-covariance structure describes the repeated measures. Multiple imputation using the Markov Chain Monte Carlo (MCMC) technique can be used to account for missing data.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 T. Athanasiou et al. (eds.), Patient Reported Outcomes and Quality of Life in Surgery, https://doi.org/10.1007/978-3-031-27597-5_2


Growth Curve Models

A growth curve model treats quality of life measures as a function of time. It is suited to studies where the timing of assessments varies between patients, or where there is a large number of assessments. Health related changes are modelled over time as a continuous variable. A mixed-effects model analytical approach is used.

Statistical Analytical Methods

Basic Scoring Systems

Many PROMs instruments are based on basic scoring systems where an ordered Likert scale is used to compute a score for a particular outcome. For example, stronger agreement with a positively phrased statement such as “I enjoy my hobbies” is scored more highly. Negatively phrased items such as “I am in pain” need particular attention, because a higher score in this case reflects a more negative outcome. In these scenarios, reverse scoring should be performed, where the reported value is subtracted from the maximum score for that statement (for example, on a 4-point Likert scale scored 1 to 4 the maximum score is 4). If the lowest response is scored above 0, the minimum score is added back so that the reversed item keeps the same range: on a scale of 1 to 4, a reported 1 becomes 4 + 1 − 1 = 4.
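The reverse scoring step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the scoring rule of any particular instrument: the function names, the item wording and the 0 to 4 scoring range are assumptions made for the example, and published instruments come with their own scoring manuals.

```python
def reverse_score(value, minimum, maximum):
    """Reverse a Likert response so that a higher score always means a better outcome."""
    if not minimum <= value <= maximum:
        raise ValueError("response outside the scale range")
    # Adding the minimum back keeps the reversed item on the same range.
    return maximum + minimum - value


def total_score(responses, negative_items, minimum, maximum):
    """Sum item scores, reversing the negatively phrased items first.

    responses: dict mapping item text -> reported value
    negative_items: set of item texts that are negatively phrased
    """
    total = 0
    for item, value in responses.items():
        if item in negative_items:
            value = reverse_score(value, minimum, maximum)
        total += value
    return total


# Hypothetical two-item questionnaire, each item scored 0-4.
responses = {"I enjoy my hobbies": 3, "I am in pain": 1}
score = total_score(responses, {"I am in pain"}, minimum=0, maximum=4)
print(score)  # 3 + (4 - 1) = 6
```

Passing the minimum explicitly avoids the common slip of subtracting from the maximum on a scale that starts at 1, which would shift the reversed item onto a different range from the other items.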

Basic Statistical Analyses

PROMs comprise predominantly Likert or visual analogue scales. Both methods produce scores where higher scores correspond to a better health related quality of life and lower scores to a more negative perception. Statistical analysis can be performed for individual dimensions and for general quality of life ratings. It is important to determine whether the data are parametric or non-parametric to inform which statistical test to use. Parametric data follow a “normal distribution”, where the spread of data is similar on either side of the mid-point, resembling symmetry. In these cases, parametric tests include t-tests and analysis of variance (ANOVA), with repeated measures ANOVA for repeat measures from the same population; categorical data, such as response frequencies, are compared with the Chi squared test. Where data do not follow a “normal distribution” and are skewed towards one end of the scale, non-parametric tests must be selected, such as the Mann-Whitney U test, Wilcoxon signed rank test, Kruskal-Wallis test, and Friedman test for repeated measures. Data normality can be assessed using the Kolmogorov-Smirnov or Shapiro-Wilk tests. For paired datasets which follow a normal distribution, for example pre- and post-treatment scores, paired t-tests can be utilised. When data are non-parametric and two independent treatment groups are compared, the Mann-Whitney U test can be used. Repeated measures in a longitudinal model can use regression analysis (linear or multiple), and studies comparing different groups can use ANOVA tests [3].
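To make two of the tests named above concrete, the sketch below computes the paired t statistic and the Mann-Whitney U statistic using only the Python standard library. The scores are invented for illustration, and the functions stop at the test statistic: turning it into a p-value needs the reference distribution, for which a statistical package would normally be used.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Paired t statistic and degrees of freedom for two measurements on the same patients."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic for two independent groups (a tie counts 0.5)."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical pre- and post-treatment scores for five patients.
pre = [50, 55, 60, 45, 52]
post = [58, 60, 66, 50, 61]
t_value, dof = paired_t_statistic(pre, post)
print(round(t_value, 3), dof)

# Hypothetical scores from two independent treatment groups.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

The paired test works on within-patient differences, which is why it suits pre- and post-treatment data, whereas the U statistic only compares ranks between two separate groups.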

Mixed-Effects Models

A general linear model (GLM) approach is adaptable to both models, where time can be categorical or continuous. A mixed model for repeated measures presents a measured health outcome and known covariates over a fixed time period. Growth curve models also use this analytical strategy, as varying outcomes are allowed over time. A mixed-effects model combines fixed and random effects. The fixed component represents the mean trajectory and the expected response of the outcome measure over time (Fig. 2.1). The random component reflects the variance of patients’ responses around the average response.

Fig. 2.1  Mixed effects model. Black line  =  average response of all participants. Coloured lines  =  individual participant responses, with circles representing actual scores


The variability between patients is twofold: the initial value (for example, pre-treatment) and the degree of change (for example, perceived improvement in mobility). Finally, the residual effects encompass variability within individuals, and address potential outliers. For growth curve models, two modelling approaches can be used: (1) polynomial models and (2) piecewise linear models. Polynomial models fit an approximate curve through the multiple assessed time points. The more time points there are, the higher the likelihood that the trajectory will deviate from a linear path. However, the rate of change then varies at different points of the curve. Piecewise linear regression models use linear patterns over short time periods. A change in slope between segments represents a change in quality of life measures at a defined time point following clinical intervention (Fig. 2.2).
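The fixed, random and residual components described above can be written out for the simplest case, a linear growth model with a random intercept and a random slope. The notation is an assumption introduced here for illustration and does not come from the chapter; the second line shows one common way of writing a piecewise linear model with a single change point c.

```latex
% Linear growth model for patient i at assessment j
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + \varepsilon_{ij}
% \beta_0, \beta_1 : fixed effects (mean initial value and mean rate of change)
% b_{0i}, b_{1i}   : random effects (patient-specific deviation in initial value and in rate of change)
% \varepsilon_{ij} : residual (variability within the individual)

% Piecewise linear alternative with a change of slope at time c
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 \max(0,\; t_{ij} - c) + b_{0i} + \varepsilon_{ij}
```

In the first equation the two random effects correspond to the twofold between-patient variability (initial value and degree of change), and in the second the coefficient on the max term is the change in slope after time c.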

Generalised Estimating Equations

Generalised estimating equations (GEE) were developed by Liang and Zeger to analyse longitudinal data (repeated measures) from large cohorts [4]. Missing data must be missing at random to avoid statistical bias. GEE is considered an extension of GLM, and models the average response of the population rather than the within-subject variation. The model demonstrates how much the average population response would change with a one-unit increase in a covariate. For example, with an additional 1 month post treatment, GEE could estimate the average change in health outcome measures [5].

Minimally Important Difference

The minimally important difference (MID) describes the lowest threshold at which a difference in an outcome measure is perceived to be important to the patient, even without reaching statistical significance [6]. This can be a measure of improvement or harm. Although there is no accepted consensus, an effect size of 0.2–0.5 is considered sufficient [7]. This may vary between instruments.
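As a rough illustration of the effect size threshold mentioned above, the sketch below standardises the mean change in score by the standard deviation of the baseline scores. This is only one of several effect size definitions in use and is an assumption made for the example, as are the function names and the scores; the 0.2 and 0.5 thresholds are the bounds quoted in the text.

```python
import statistics


def standardised_effect_size(baseline, follow_up):
    """Mean change in score divided by the standard deviation of the baseline scores."""
    mean_change = statistics.mean(follow_up) - statistics.mean(baseline)
    return mean_change / statistics.stdev(baseline)


def reaches_mid(effect_size, threshold=0.2):
    """True if the magnitude of the effect size meets the chosen MID threshold."""
    return abs(effect_size) >= threshold


# Hypothetical scores for five patients before and after treatment.
baseline = [40, 50, 60, 70, 80]
follow_up = [45, 55, 65, 75, 85]
es = standardised_effect_size(baseline, follow_up)
print(round(es, 3), reaches_mid(es, 0.2), reaches_mid(es, 0.5))
```

Taking the absolute value means the same check covers both improvement and harm, in line with the definition above.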

Ceiling-Floor Effects

Multi-attribute based PROMs are successfully used in clinical research trials for assessment of quality of life following clinical interventions. However, the data can be skewed by ceiling and floor effects. The ceiling effect describes a situation where the majority of patients score at the highest end of the scale, so that the instrument can no longer discriminate between patients on the quality measured [8]. Similarly, the floor effect describes a situation where the majority of values lie at the lowest end of the scale. Affected domains may not accurately reflect the real-life situation in specific diseases.
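A simple way to screen a domain for these effects is to count how many patients sit exactly on the lowest or highest possible score. The sketch below does this; the 0 to 100 scale, the scores and the 0.5 cut-off (taken from the word "majority" above) are assumptions for illustration, and other cut-offs are used in practice.

```python
def ceiling_floor_proportions(scores, scale_min, scale_max):
    """Return (floor, ceiling): the share of patients on the lowest and highest possible score."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    floor = sum(1 for s in scores if s == scale_min) / n
    ceiling = sum(1 for s in scores if s == scale_max) / n
    return floor, ceiling


# Hypothetical domain scores on a 0-100 scale.
scores = [100, 100, 100, 90, 100, 80, 100, 100]
floor, ceiling = ceiling_floor_proportions(scores, scale_min=0, scale_max=100)
print(floor, ceiling)   # 0.0 0.75
print(ceiling > 0.5)    # True: a majority of patients score at the top of the scale
```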

Fig. 2.2  Growth curve models; Polynomial and Piecewise linear models. Blue line = control group. Green line = interventional group


Missing Data and Imputation Methods

Missing data presents difficulties in statistical analyses and raises questions about the value of the outcomes. The two main types of missing data are: (1) patients omitting certain questions, and (2) a large quantity of missing data from multiple variables overall. Missing data can be classified into one of three categories. Firstly, data missing completely at random (MCAR) does not correlate with the observed data [9–11]. For example, it may be due to an administrative error. Secondly, for data missing at random (MAR), there is a systematic relationship between the observed data and whether an outcome is missing. Finally, for data missing not at random (MNAR), the missingness is strongly associated with the unobserved values themselves: the cause of the missing values depends on the patient and their environment. It is crucial to identify the latter to inform analytical strategies [11]. When more than 10% of data are missing, the nature of the missingness should be examined. Missing data can lead to a selection bias in onward analysis, and therefore imputation techniques must be used with caution. For example, missed questionnaires due to treatment toxicity or post-operative complications can bias the health related quality of life towards a more positive representation.

Single and multiple imputation methods exist for different levels of missing data. Several single imputation methods can be performed for a single missing value. Most commonly, the mean of the observed data, the last value carried forward, or the minimum value carried forward is used to replace a missing value. With a small number of missing items, the half rule can be utilised, which substitutes the mean of the completed items for the missing ones, with the caveat that the patient has completed at least 50% of the questionnaire. A limitation of single imputation is that the imputed value is treated as if it were a true observed value. The variance of the data is reduced, which increases the likelihood of type 1 errors. Multiple imputation methods integrate a level of uncertainty into the statistical calculation and therefore address the underestimation of variance in single imputation analyses. Multiple imputation is of maximum benefit when there is strong additional clinical data to correlate with PROMs outcomes. Typically 3–20 sets of data values are analysed and the results combined using Rubin’s rules to achieve precision estimates [9].
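Two of the techniques above lend themselves to a short sketch: the half rule for a partly completed questionnaire and the pooling step of multiple imputation by Rubin's rules. The code assumes the half rule uses the mean of the items the patient did answer, and that each imputed dataset has already been analysed to give an estimate and its variance; the function names and numbers are hypothetical.

```python
import statistics


def half_rule_score(items):
    """Mean item score under the half rule; None marks an unanswered item."""
    answered = [x for x in items if x is not None]
    if len(answered) * 2 < len(items):
        return None  # fewer than 50% of items answered: do not impute
    return statistics.mean(answered)


def pool_rubin(estimates, variances):
    """Combine results from m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from at least 2 imputations")
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


print(half_rule_score([3, None, 4, 5]))        # 4
print(half_rule_score([3, None, None, None]))  # None

# Hypothetical mean scores and variances from three imputed datasets.
pooled, total = pool_rubin([10.0, 12.0, 11.0], [4.0, 4.0, 4.0])
print(pooled, round(total, 3))
```

The between-imputation term is what single imputation leaves out: it widens the total variance to reflect uncertainty about the values that were never observed.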

Quality Adjusted Life Years (QALYs)

Quality Adjusted Life Years

Health related quality of life (HRQoL) outcomes provide a platform for: (1) assessment of patient-centred quality of life, and (2) a metric of time to inform quality adjusted survival (QAS) and quality adjusted life years (QALYs). Both components can be combined to assess the quality adjusted time without symptoms and toxicity (Q-TWiST). The advantage of the Q-TWiST is the integrated evaluation of risk (measured as QoL; quality) and benefit (measured as survival; quantity). These can be translated into an economic scale to analyse the cost utility of clinical interventions. Scoring systems such as the time-trade-off (TTO) or standard gamble (SG) convert patient multi-attribute measures to utility values [12, 13]. Common examples of well-established questionnaires with multi-attribute measures include the EuroQoL EQ-5D-5L and SF-36 forms [14, 15]. Here, patients put a direct value on their own health state. TTO demonstrates how much time a patient would sacrifice to be in perfect health for a given time period [16]. SG utility is based on the patient valuing two treatment options at an equal level [17]. These valuations are based on formulas derived from health related quality of life scores from the general population, specific to geographical location. An area under the curve is generated (utility value versus time) to estimate the average expected course of each patient treatment group. Alternatively, QALYs can be calculated from individual patient-specific values of health status for use in univariate analysis. In general, QALYs gained are an adjustment in the utility value (quality of life compared to the general population) as a direct result of clinical treatment, multiplied by the length of the treatment effect. This provides a tangible value to understand the improvement or regression a clinical intervention brings to a patient’s life (Fig. 2.3).

Fig. 2.3 Graph demonstrating quality-adjusted life years with a targeted clinical intervention

\[ \text{QALY} = Q \times y \]

where Q is the utility value (maximum of 1 = perfect health) and y is the years of life.
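A short worked example may help fix the arithmetic. The utility values and duration below are hypothetical and chosen only to keep the numbers simple: a treatment that raises a patient's utility from 0.6 to 0.8 and sustains that gain for 5 years.

```latex
% Hypothetical values: Q_pre = 0.6, Q_post = 0.8, y = 5 years
\text{QALYs gained} = (Q_{\text{post}} - Q_{\text{pre}}) \times y
                    = (0.8 - 0.6) \times 5
                    = 1.0
```

In other words, five years lived at a utility 0.2 higher is counted as equivalent to one additional year in perfect health.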

Two methods for QALY assessment can be employed: the recall period and the trapezoidal approximation. Ideally, quality of life measures completed at regular intervals until death would yield the most accurate data. Alternatively, data can be collected by asking patients to recall measures over the past week or month, from which an average utility is calculated. Both methods work towards the final time point of death. The major challenges of these methodologies are missing data and limited follow-up. Clinical trials assessing quality of life have a limited follow-up period postoperatively, with the majority of patients not receiving follow-up until death. Therefore, QALYs must be adjusted to reflect a relatively short follow-up time. The minimum follow-up time encompasses all possible post-surgery scenarios, but for fair approximation, the median follow-up period is recommended. Repeated measures over time, for example consecutive quality of life measures to assess the longitudinal benefits in the post-operative period, can be valuable in identifying the time points at which benefits from surgery can be seen.
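The trapezoidal approximation mentioned above estimates QALYs as the area under the utility-versus-time curve, joining consecutive assessments with straight lines. The sketch below is a minimal version under that assumption; the assessment times and utility values are hypothetical.

```python
def qaly_trapezoidal(times_years, utilities):
    """Area under the utility curve between the first and last assessment, in QALYs."""
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need at least two assessments with matching times and utilities")
    total = 0.0
    for k in range(1, len(times_years)):
        width = times_years[k] - times_years[k - 1]
        if width < 0:
            raise ValueError("assessment times must be in increasing order")
        # Area of one trapezoid: interval length times the mean of the two utilities.
        total += width * (utilities[k] + utilities[k - 1]) / 2
    return total


# Hypothetical assessments at baseline, 6 months, 1 year and 2 years.
times = [0.0, 0.5, 1.0, 2.0]
treated = qaly_trapezoidal(times, [0.6, 0.7, 0.8, 0.8])
untreated = qaly_trapezoidal(times, [0.6, 0.6, 0.6, 0.6])
print(round(treated, 3), round(untreated, 3), round(treated - untreated, 3))
```

Because the sum stops at the last assessment, the result covers only the observed follow-up period, which is the limitation the paragraph above describes for trials with short follow-up.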

Cost-Utility Analysis

In recent decades, cost effectiveness and cost utility analyses have steered allocation of health resources. With emerging novel treatment strategies, economic evaluations have become more important to determine the relationship between cost and patient benefit. More specifically, a cost-utility analysis (CUA) compares costs with outcome measures in the form of QALYs, which are built on a utility value between 0 and 1 (1 = perfect health, 0 = death) [18]. From a clinical perspective, CUA provides a broad platform to understand where funding is directed and how valuable a particular intervention is for the population concerned. For example, breast cancer screening is known to be cost-effective for early detection and treatment, whereas endoscopic screening for gastric cancer is expensive, invasive, and not without risks. A fundamental understanding of this can stimulate novel research in diagnostics. It is important to note that CUA should not override clinical need. For example, it is more cost effective to treat appendicitis with antibiotics than to perform an appendicectomy; however, clinical judgement and practice take precedence [19]. In a similar fashion, CUA recommends that EVAR should not be selected over open AAA surgery; however, this has not translated to clinical practice [20].

CUA is selected when the outcome measure is QALYs. It is useful when comparing different treatment strategies, for example surgery versus curative chemotherapy in cancer. A Markov model decision tree incorporates all possible health outcomes for both treatment groups, including costings. An incremental cost effectiveness ratio (ICER) value is calculated to inform the treatment leading to the most desirable health outcome:

\[ \text{ICER} = \frac{\text{cost of surgery} - \text{cost of chemotherapy}}{\text{QALY of surgery} - \text{QALY of chemotherapy}} \]

There are differences globally in how healthcare systems justify the costs in relation to QALYs. The UK threshold is approximately £20,000–30,000 per QALY, the USA threshold US$50,000–100,000 per QALY, and the Australian threshold AU$35,000–50,000 per QALY [21]. Considering these thresholds, an ICER less than or equal to the relevant threshold is cost-effective (Fig. 2.4).

Fig. 2.4 Cost effectiveness acceptability curve demonstrating the willingness to pay for a more cost-effective treatment strategy; Treatment A > Treatment B
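The ICER calculation and the threshold comparison described in this section can be sketched as follows. The costs and QALYs are invented for illustration, the function names are hypothetical, and the £20,000 threshold is the lower UK bound quoted in the text.

```python
def icer(cost_new, cost_comparator, qaly_new, qaly_comparator):
    """Incremental cost effectiveness ratio: extra cost per extra QALY gained."""
    delta_cost = cost_new - cost_comparator
    delta_qaly = qaly_new - qaly_comparator
    if delta_qaly == 0:
        raise ValueError("treatments yield equal QALYs; the ICER is undefined")
    return delta_cost / delta_qaly


def is_cost_effective(icer_value, threshold):
    """True if the ICER does not exceed the willingness-to-pay threshold.

    Only meaningful when the new treatment yields more QALYs than the comparator.
    """
    return icer_value <= threshold


# Hypothetical example: surgery costs 30,000 and yields 6.0 QALYs,
# chemotherapy costs 20,000 and yields 5.5 QALYs.
ratio = icer(30_000, 20_000, 6.0, 5.5)
print(ratio)                             # 20000.0 per QALY gained
print(is_cost_effective(ratio, 20_000))  # True
```

A negative ratio needs separate interpretation, since it arises both when a treatment is cheaper and better and when it is dearer and worse, which is why the threshold check is restricted to the case of a positive QALY gain.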

Non-traditional Quality of Life Assessment Methods

More recent innovative methods of obtaining real-time health related quality of life data via social media monitoring are being trialled [22]. A wealth of information relating to symptomatology, treatments, and effects on daily activities and lifestyle is shared within online communities. Renner et al. generated a social media listening algorithm to assimilate relevant data based on specific domains already established in quality of life instruments: physical, psychological, activities, social and financial. General impact on quality of life was identified at a sensitivity of 0.83 and specificity of 0.74. It is important to consider the wider resources available in the development of PROMs and consider the ease of access for patients to contribute data. However, this approach may exclude the technology-averse population.
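For readers less familiar with the two accuracy measures quoted above, the sketch below shows how sensitivity and specificity are derived from a table of classification results. The counts are hypothetical, chosen only so that the output matches the reported 0.83 and 0.74; they are not the data from the cited study.

```python
def sensitivity(true_positives, false_negatives):
    """Share of posts that truly mention a quality of life impact that the algorithm detects."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of posts with no quality of life impact that the algorithm correctly leaves out."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: 100 relevant posts and 100 irrelevant posts.
sens = sensitivity(true_positives=83, false_negatives=17)
spec = specificity(true_negatives=74, false_positives=26)
print(sens, spec)
```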

Limitations

Quality of life research is wholly dependent on the target patient population and there are many limitations which must be considered. Many PROMs instruments are questionnaire based and are therefore largely subjective. Differing opinions can be due to geographical location, confounding patient factors and other lifestyle influences. Patient populations across the world with the same disease process may prioritise quality of life indicators differently, largely influenced by socioeconomic status and education. Patient-related confounders include acute life events, concurrent illnesses and life stressors which may not be identified by the questionnaires. PROMs may also be biased towards patients who understand the need for research and are willing to participate. Other biases include cultural and language barriers. Many PROMs instruments consist of at least 20–30 items, and in some cases, multiple questionnaires. Respondent fatigue is a known phenomenon where participants become tired while completing the questionnaires, and the quality of the responses deteriorates. Therefore, the design of new PROMs tools and of studies should address this. Studies using multiple questionnaires can randomise the order of questionnaires given to patients to reduce bias from respondent fatigue. In addition, recent decades have seen evolutions in the relationship between cost analyses and QALYs in health economic modelling to guide health resources and understand patient impact. However, there is a deficiency in tools to marry up surgical outcomes with PROMs. There is a need to develop robust methods to relate surgery specific health outcomes to patient perceived health outcomes.

Conclusion

PROMs represent a multi-dimensional evaluation of health related quality of life including symptomatology, treatments, functional status and socioeconomics. All domains assessed will have diverse trajectories over time, and therefore the primary outcome measures, including timepoints, must be clearly defined. Conclusions generated from PROMs instruments can facilitate clinical decisions and ultimately improve patient care. This multi-faceted approach to patient care includes treatment regimens, adverse effects, and patient experiences.

Summary Points
• Statistical analyses of PROMs used in research provide valuable insights into the impacts of disease and treatment and incorporate the patient’s perspective in health system performance evaluation.
• Longitudinal data analyses, using a repeated measures model or a growth curve model, assess health related quality of life changes over time in response to a treatment.
• Health related quality of life outcomes can be used as a metric of time to inform quality adjusted survival (QAS) and quality adjusted life years (QALYs).
• Quality adjusted life years gained are an adjustment in utility value, demonstrating the quality of life compared to the general population, as a result of clinical treatment multiplied by the length of its effect.
• Cost effectiveness and cost utility analyses steer allocation of health resources and consider how valuable a particular intervention is for a target population.
• An incremental cost effectiveness ratio (ICER) is calculated to inform the treatment leading to the most desirable health outcome.
• Quality of life research is wholly dependent on patients’ voluntary participation, and can be biased by geographical location, confounding patient factors and lifestyle influences.

References

1. McKenna SP, Heaney A, Wilburn J, Stenner AJ. Measurement of patient-reported outcomes. 1: the search for the Holy Grail. J Med Econ. 2019;22:516–22.
2. Al Sayah F, Jin X, Johnson JA. Selection of patient-reported outcome measures (PROMs) for use in health systems. J Patient Rep Outcomes. 2021;5:99.
3. Hamel JF, Saulnier P, Pe M, Zikos E, Musoro J, Coens C, Bottomley A. A systematic review of the quality of statistical methods employed for analysing quality of life data in cancer randomised controlled trials. Eur J Cancer. 2017;83:166–76.
4. Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22.
5. Ballinger GA. Using generalized estimating equations for longitudinal data analysis. Organ Res Methods. 2004;7:127–50.
6. Johnston BC, Ebrahim S, Carrasco-Labra A, Furukawa TA, Patrick DL, Crawford MW, Hemmelgarn BR, Schunemann HJ, Guyatt GH, Nesrallah G. Minimally important difference estimates and methods: a protocol. BMJ Open. 2015;5:e007953.
7. King MT. A point of minimal important difference (MID): a critique of terminology and methods. Expert Rev Pharmacoecon Outcomes Res. 2011;11:171–84.
8. Bindman AB, Keane D, Lurie N. Measuring health changes among severely ill patients: the floor phenomenon. Med Care. 1990;28:1142–52.
9. Rubin DB. Multiple imputation for nonresponse in surveys. New York: Wiley; 1987.
10. Rubin DB, Schenker N. Multiple imputation for interval estimation from simple random samples with ignorable nonresponse. J Am Stat Assoc. 1986;81:366–74.
11. Rubin DB, Schenker N. Multiple imputation in health-care data bases: an overview and some applications. Stat Med. 1991;10:585–98.
12. Franks P, Lubetkin EI, Gold MR, Tancredi DJ. Mapping the SF-12 to preference-based instruments: convergent validity in a low-income, minority population. Med Care. 2003;41:1277–83.
13. Feeny D. Preference-based measures: utility and quality-adjusted life years. In: Assessing quality of life in clinical trials. 2nd ed. Oxford: Oxford University Press; 2005. p. 405–29.
14. Brooks R. EuroQoL: the current state of play. Health Policy. 1996;37:53–72.
15. Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30:473–83.
16. McNeil BJ, Weichselbaum R, Pauker SG. Tradeoffs between quality and quantity of life in laryngeal cancer. N Engl J Med. 1981;305:982–7.
17. Torrance GW, Thomas WH, Sackett DL. A utility maximizing model for evaluation of health care programs. Health Serv Res. 1971;7:118–33.
18. Sassi F. Calculating QALYs, comparing QALY and DALY calculations. Health Policy Plan. 2006;21:402–8.
19. Wu JX, Dawes A, Sacks GD, Brunicardi FC, Keeler EB. Cost effectiveness of nonoperative management versus laparoscopic appendectomy for acute uncomplicated appendicitis. Surgery. 2015;158:712–21.
20. Takayama Y. A cost-utility analysis of endovascular aneurysm repair for abdominal aortic aneurysm. Ann Vasc Dis. 2017;10:185–91.
21. Bertram MY, Lauer JA, De Joncheere K, Edejer T, Hutubessy R, Kieny MP, Hill S. Cost–effectiveness thresholds: pros and cons. Bull World Health Organ. 2016;94:925–30.
22. Renner S, Marty T, Khadhar M, Foulquié P, Voillot P, Mebarki A, Montagni I, Texier N, Schück S. A new method to extract health-related quality of life data from social media testimonies: algorithm development and validation. J Med Internet Res. 2022;24:e31528.

3

Research Methods for PROMS and QoL Bhamini Vadhwana and Munir Tarazi

Introduction

Advancements in medical technology have facilitated improved measurable clinical outcomes for patients. Medical innovations in biochemical, physiological, and radiological techniques have led to more accurate clinical diagnoses. In recent years, developments in surgical techniques such as minimally invasive access, hybrid and robotic surgery have shown potential to improve patients’ postoperative outcomes. However, desirable clinical outcomes may not correlate with patient perceptions. To align treatment strategies and patient satisfaction, information specific to the patient journey is fundamental. Physical and psychological symptoms pertinent to the patient may not be clear and it is important to ascertain the severity of these. The post-operative impact on quality of life is a comprehensive multi-faceted assessment that can define treatment satisfaction [1–3]. Characterisation includes psychosocial functioning, social well-being, activities of daily living, personal satisfaction with healthcare, health related quality of life (HRQoL), adherence to medical treatments and clinical trial outcomes [4–8]. Therefore, Patient Reported Outcome Measures (PROMs) serve as a valuable tool to reveal patient specific symptoms and their influence on quality of life [9].

B. Vadhwana (*) · M. Tarazi, Department of Surgery and Cancer, Imperial College London, London, UK

Clinical Need for PROMs

The face of medical treatment is evolving rapidly, from the traditional paternalistic approach to the current patient-centred approach. Involving patients in their own treatment journeys has become the standard practice of care [10]. Assessment of clinical parameters provides information about the pathological status and the treatment administered; however, it does not address whether these actions influence patients’ perceived quality of life. Impact on quality of life is variable between individuals. Patient reported outcomes are becoming key in understanding how disease affects quality of life, and how treatments can improve or adversely affect this. They have become an important part of holistic patient care, alongside clinical parameters. This is particularly evident for benign surgery such as antireflux surgery, where the need for intervention is guided by symptoms and quality of life. Global health policies work towards promoting PROMs. In the UK, NICE endorsed the Oxford Hip Score (1996) and the Oxford Knee Score (1998) to quantify the functional gains for individuals [11]. The Aberdeen Varicose Vein Questionnaire (AVVQ) was used in clinical practice to ascertain the severity of reported symptoms to guide the need for invasive treatment [11, 12]. Similarly, the Cancer Patient Experience Survey is recognised by Public Health England to encourage transparency in cancer care, with a view to improving cancer services and support. Appropriate selection of a PROMs tool is crucial to ascertain valuable target information. Beyond the initial purpose of improving treatment related outcomes, the application of PROMs has seen wider benefits in health economics to quantify and justify allocation of resources to certain surgical procedures, supporting clinical decision making, encouraging quality improvements and producing relevant health policies [13–15]. However, generic PROMs can be difficult to interpret for specific conditions, and disease specific instruments are not always available. Commonly, questionnaires can be exhausting to complete, with at least 30 items to rate. Globally, diversity in culture, socioeconomic status and education means that generalisability and applicability of the tools are a problem. They hold good internal validity, but poor external validity. Overall, many PROMs tools have been internationally validated for prospective studies, local audits, national registries and for general holistic assessment of patient post-operative outcomes.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 T. Athanasiou et al. (eds.), Patient Reported Outcomes and Quality of Life in Surgery, https://doi.org/10.1007/978-3-031-27597-5_3

Reported Outcome Measures Patient reported outcomes (PROs) are characterised by patients’ perceptions and experiences both in and out of a healthcare setting [16, 17]. Many tools exist to provide objective measures associated with mobility, daily activities, and symptoms such as pain and sleep patterns, amongst others [5]. Psychometric testing is more intricate and its ability to validate patient satisfaction remains uncertain. Nonetheless, PROs are crucial to providing holistic, high-quality, patient-centred care.


More commonly used are observer reported outcomes (ObsRO), where a person other than the patient reports on the outcomes. Views from patients’ family, friends and colleagues involved in their support network are also considered ObsRO. Examples include the patient’s nutritional intake and functional status in performing daily activities. A less commonly used term is a proxy observed outcome, which describes reports given on behalf of the patient, as an advocate.

PROs and ObsRO can be used to define the management of patients across three broad categories: (1) clinical care, (2) personal and social well-being, and (3) health economic status.

Clinical care includes a medical assessment, diagnosis, establishing treatment strategies and monitoring both short- and long-term survival outcomes. In addition to mortality, assessing morbidity is crucial as it has a more subtle influence on patients’ quality of life. These objective assessments are made by clinicians and the wider health care profession by determining measurable parameters such as treatment response and biochemical and radiological results. In particular, the work-up for staging a cancer requires radiological assessment of the cancer size and location, evidence of distant spread, the histopathological and immunological characteristics of the cancer, and the physiological status of the patient to undergo surgery. This continues in the post-operative period, where the immediate outcome of cancer surgery is defined by histopathological analysis of tumour margins and lymph node assessment, and in the medium to long term by surveillance imaging. These are observer reported outcomes (ObsRO), used routinely in clinical practice.

Personal and social well-being can be reported by observers in an objective manner and by patients directly through their experiences. This includes executing routine daily tasks, performing hobbies and psychological status, all contributing to overall quality of life. For example, the EuroQoL EQ-5D-5L provides a platform for clinicians to measure impact on quality of life following surgery; however, it is restricted by a set framework [18, 19]. Patient reported outcomes (PROs) are extremely valuable in ascertaining health-related quality of life outcomes important to the patient.

Health economics plays a vital role in quantifying efficiency and cost-effectiveness of the use of health care resources in order to achieve the maximum value and benefit to the users. Cost effectiveness analyses (CEA) provide a measurable cost of the clinical intervention and the subsequent impact on patients’ lives. They help to validate the clinical effectiveness of interventions.

Measurement Scales for PROMs

PROMs tools can assimilate information in dichotomous, categorical, and continuous scales of measurement. Dichotomous values (i.e. yes or no) provide basic information requiring minimal interpretation. However, the majority of individuals fall between these two points, at an intermediate level. Therefore, many PROMs instruments offer more than two responses, covering a range of values that incorporates the views of the population. Many such scales have been created, comprising ordinal categories, numbers, and occasionally pictures. Commonly used response scales include the Likert scale, semantic differential, visual analogue scale, pictorial scale, rating scale, and categorical checklist [16, 20].

Likert Scale  The Likert scale is the most commonly used rating scale. It comprises a continuum of categories over a 5–7-point scale in response to a given statement. The most frequently used scales are: (1) strongly disagree, disagree, neutral, agree, strongly agree, and (2) very frequently, frequently, occasionally, rarely, never. Additional points on the scale can be incorporated, and one is selected to reflect the individual’s experience.

Semantic Differential  The semantic differential scale is an ordinal scale of 5–7 points between two contrasting meanings. The two ends of the scale typically reflect two opposing feelings or thoughts, for example strong and weak, fair and unfair, happy and unhappy.

Visual Analogue Scale  The visual analogue scale is a well-established method of determining an outcome which typically lies on a continuous scale. The generic tool EQ-5D-5L uses a visual analogue scale from 0 (poor health) to 100 (excellent health) to assess how patients personally rate their health-related quality of life. Consecutive assessments can provide a timeline over weeks to months of when the benefits of surgery were perceived by the patient.

Pictorial Scale  Pictorial scales are visually stimulating, easy to understand and acceptable to all populations, as they avoid language barriers. This type of scale is, however, suited to certain questions only, for example scales running from (1) bad to excellent, (2) unhappy to happy and (3) no pain to a lot of pain.

Rating Scale  Rating scales are used to ascertain the frequency of certain symptoms over a defined period of time, mostly a week or a month. They allow assessment of targeted symptoms and quantitation of symptom frequency. This can be used to infer the impact on day-to-day quality of life.

Categorical Checklist  The checklist addresses a breadth of symptoms, but in a binary fashion. Patients are asked to indicate whether any of the symptoms were experienced over a given time period. The frequency, nature or severity of symptoms does not form part of this tool.
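As a rough illustration of how the ordinal and binary response scales described above might be turned into numbers for analysis, the sketch below codes the five-point Likert agreement scale as 1 to 5 and summarises a categorical checklist as a count of symptoms ticked. The 1 to 5 coding, the function names and the example symptom names are assumptions made for illustration; they are not taken from any particular PROMs instrument.

```python
from typing import Mapping

# Five-point agreement scale as listed in the text, coded 1..5 (assumed coding).
LIKERT_AGREEMENT = [
    "strongly disagree",
    "disagree",
    "neutral",
    "agree",
    "strongly agree",
]
LIKERT_CODES = {label: position + 1 for position, label in enumerate(LIKERT_AGREEMENT)}


def code_likert(response: str) -> int:
    """Return the ordinal code (1-5) for a Likert agreement label."""
    return LIKERT_CODES[response.strip().lower()]


def checklist_count(ticked: Mapping[str, bool]) -> int:
    """Count the symptoms reported as present on a binary checklist."""
    return sum(1 for present in ticked.values() if present)


if __name__ == "__main__":
    # Hypothetical answers from one patient.
    print(code_likert("Agree"))
    print(checklist_count({"pain": True, "nausea": False, "fatigue": True}))
```

Because Likert codes are ordinal, the numbers preserve order only; whether they may be summed or averaged depends on the scoring rules of the specific instrument.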

Types of PROMs Instrument Types of PROMs Instruments PROMs instruments are primarily based on (1) symptom assessment and (2) functional status specific to the patient. These two domains encompass the majority of quality of life assessments. Health-related quality of life is normally defined by symptom burden and functionality affecting day-to-day living and behavioural patterns. Two formats of PROMs are commonly used: (1) the multi-attribute utility instrument (MAUI), and (2) the visual analogue scale (VAS). MAUIs typically incorporate dimensions on a physical and mental scale and are used most effectively in chronic conditions where symptoms may be subtle. VAS is most valuable in an acute setting to express immediate benefits from an intervention. However, VAS can also be used in chronic cases to depict the overall health perception on a scale of 0 (poor health) to 100 (excellent health).

Various PROMs tools exist with different intended objectives and primary endpoints. Generic quality of life assessment tools can be implemented in any disease type and provide an overall assessment of the impact on quality of life. Disease-specific questionnaires for surgical procedures highlight symptoms specific to the pathology. PROMs can be used in clinical and research settings. Two types of tools exist: validated tools and unvalidated tools. Validated QoL questionnaires are normally utilised in a clinical setting, as they have proven to be reliable and reproducible, having been exposed to rigorous validation methods [13]. Unvalidated tools such as local surveys may not be applicable to the wider population.

Generic Instruments Generic PROMs tools are standardised measures of a patient’s quality of life and can be used in any surgical setting [21, 22]. On an international level, the generalisability and accessibility of these tools can allow comparisons across datasets in clinical and research settings. However, the quality of life measures deemed important in a generic tool may not be applicable across the world and may be non-discriminative in certain types of surgery.

The EuroQoL 5-dimension (EQ-5D-5L) tool is a standardised validated questionnaire to be completed pre- and postoperatively to assess the impact of the surgery on quality of life [18, 19]. The EQ-5D-5L comprises two components: (1) a 5-item descriptive system comprising mobility, self-care, ability to perform daily activities, pain and anxiety, and (2) a visual analogue scale for perceived health rating from 0 (poor health) to 100 (excellent health). It is used globally, with healthcare systems in Sweden and Alberta adopting it in national registries [23, 24].

Another well-established tool is the Short-Form-36 health survey (SF-36), which assesses overall health status with 36 items including functional limitations, physical and emotional health, pain, and psychosocial outcomes [25, 26]. Examples of other generic tools include the Schedule for the Evaluation of Individual Quality of Life (SEIQoL) questionnaire, which utilises the visual analogue scale; the Hospital Anxiety and Depression Scale (HADS), which is a 14-item list to assess the level of psychological impact on patients; and the Nottingham Health Profile (NHP), comprising two parts: (1) a 38-item list categorised into six domains including sleeping patterns, energy, emotional status, pain, mobility, and social interactions and (2) seven statements about lifestyle affected by health including employment, housework, social interactions, personal relationships, sex life, hobbies, and holidays [27–29].
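To make the two-part structure of the EQ-5D-5L concrete, the sketch below stores one completed questionnaire as five dimension levels plus a visual analogue scale rating. It assumes the usual coding of each dimension from 1 (no problems) to 5 (extreme problems) and the conventional five-digit profile string; the class, field and function names are illustrative only. Converting a profile into a utility index requires a country-specific value set, which is deliberately not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQ5D5LResponse:
    """One completed EQ-5D-5L questionnaire (illustrative structure)."""

    mobility: int
    self_care: int
    usual_activities: int
    pain_discomfort: int
    anxiety_depression: int
    vas: int  # perceived health, 0 (poor) to 100 (excellent)

    def __post_init__(self) -> None:
        # Each dimension is assumed to be answered on five levels, coded 1..5.
        for level in self.levels():
            if level not in range(1, 6):
                raise ValueError("dimension levels must be between 1 and 5")
        if not 0 <= self.vas <= 100:
            raise ValueError("VAS rating must be between 0 and 100")

    def levels(self) -> tuple:
        return (
            self.mobility,
            self.self_care,
            self.usual_activities,
            self.pain_discomfort,
            self.anxiety_depression,
        )

    def profile(self) -> str:
        """Five-digit health state, e.g. '11111' for no problems on any dimension."""
        return "".join(str(level) for level in self.levels())


def vas_change(pre: EQ5D5LResponse, post: EQ5D5LResponse) -> int:
    """Change in self-rated health between two assessments (post minus pre)."""
    return post.vas - pre.vas


if __name__ == "__main__":
    # Hypothetical pre- and postoperative responses for one patient.
    before = EQ5D5LResponse(2, 1, 3, 3, 2, vas=55)
    after = EQ5D5LResponse(1, 1, 1, 2, 1, vas=80)
    print(before.profile(), after.profile(), vas_change(before, after))
```

Keeping the descriptive profile and the VAS rating in one record mirrors the pre- and postoperative comparison described above.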

Disease-Specific Instruments Disease-specific tools were established to target specific symptoms related to a disease process. They benefit from being focussed and add immense value to the holistic assessment of a patient, including monitoring of quality of life, and potentially serve as a guide to clinical decision making [21]. There are various categories of disease-specific PROMs. Cancer-related PROMs cover generic cancer-related symptoms shared between different cancer types [bone metastases (QLQ-BM22), cancer related fatigue (QLQ-FA12), elderly cancer patients (QLQ-ELD14)] and specific cancers [lung (QLQ-LC13), colorectal (QLQ-CR29), gastric (QLQ-STO22)] [30]. Currently, the European Organisation for Research and Treatment of Cancer (EORTC) provides a comprehensive platform of quality of life questionnaires for specific cancer types, with the majority designed using a Likert rating scale. Although many resources are available, the limitation lies in the length of the questionnaire and the time invested in completing it accurately.

In addition, PROMs can be used to assess the functional capacity of individuals, limitations in which can adversely affect lifestyle. Orthopaedic surgery uses these to assess improvements after surgery. Instruments include the Western Ontario and McMaster Universities Arthritis Index (WOMAC), which is a 24-item questionnaire measuring functional status in patients undergoing hip or knee arthroplasty, and the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire, which is a 30-item targeted list for patients with upper limb functional limitations.

Other tools are used in benign conditions, where symptomatology is the main driver for surgery. For example, the AVVQ cited earlier was used to select patients with severe quality of life impact for surgery. Other benign PROMs include the gastrointestinal quality of life index (GI-QLI), the digestive symptoms questionnaire, and the gastroesophageal reflux disease-health related quality of life (GERD-HRQL) questionnaire for reflux. Serial monitoring using these PROMs can help to identify candidates who would benefit from surgery.

Fig. 3.1  A conceptual framework model for developing a PROMS instrument

Other The gold-standard format for PROMs assessments has been paper-based. Advances in digital health technologies have led to the introduction of electronic PROMs (ePROMs) [31]. They can be widely adopted internationally, offering ease of access, efficient gathering and analysis of information, and lower overall costs than paper-based PROMs. However, socioeconomic and linguistic barriers can pose a challenge to their uptake.

Establishing a PROMs Instrument The conceptual framework of a PROMs instrument describes the relationship between the items for evaluation and the target endpoints (Fig. 3.1). The end point of a PROMs tool should lead to a perceptible outcome that can be used for overall clinical care and/or health economics. Therefore, the intended objective of the tool, its design and the data analysis of the scoring should lead to quantifiable outcome measures. The end point model design demonstrates how PROMs fit into the holistic assessment of the patient. This includes biochemical parameters, physiological/physical parameters, radiological assessment (i.e. treatment response, regression, spread) and finally patient reported outcomes focussing on quality of life (Fig. 3.2).

Fig. 3.2  The end point model to assess response to a targeted clinical intervention

A step-by-step approach is adopted to develop a PROMs instrument based on quality of life. The development process involves five phases (Fig. 3.3).

Phase 1: Identification of relevant aspects of quality of life specific to the disease/condition
Phase 2: Categorisation of similar quality of life aspects into domains/themes
Phase 3: Pre-testing the preliminary item list
Phase 4: International field-testing of the refined model
Phase 5: Validation of the PROMs instrument

Fig. 3.3  Five phases for developing a PROMs instrument

Phase 1: Identification of Relevant Aspects of Quality of Life Specific to the Disease/Condition Many validated tools exist universally, and therefore a novel tool must be relevant and be considered an adjunct to platforms already available. The target population should be defined as a disease-specific group to allow accurate measures of quality of life indicators. Imposing additional parameters may reduce the eligible target cohort and limit comparability between datasets. At least 5–10 patients representative of the population should form the focus group. Designing a cell matrix can help to pick a representative patient population (Table 3.1). Three sources can be accessed: (1) a comprehensive systematic literature review of the quality of life impact from disease-specific surgery, to identify important areas for potential improvement; (2) semi-structured interviews within focus groups of patients with the relevant condition, to gather qualitative data, identify themes and ultimately inform end points of the study model; and (3) review of an initial list of items by clinical experts, incorporating at least five health care professionals who have experience in managing this condition. The framework of outcomes generated must be translated from qualitative to quantitative scores for data interpretation. It is recommended that three languages and countries are selected for global representation [32–34]. The suggested groups are: (1) an English-speaking country, (2) a Northern European country, and (3) a Southern European country.

Table 3.1  Example of cell matrix comprising patient groups to identify the target cohort for the study question

Time point                      Male    Female
Pre-neoadjuvant chemotherapy
Pre-surgery                     X       X
Post-surgery                    X       X
Post-adjuvant chemotherapy
Long-term quality of life

Phase 2: Categorisation of Similar Quality of Life Aspects into Domains/Themes A rich pool of relevant items should have been collated. Iterations of quality of life measures can be grouped into domains. For example, mobility can incorporate daily activities, hobbies, and housework. The scales of measurement are commonly polytomous, utilising the Likert scale or visual analogue scale. The responses should be representative of the entire population. It is recommended that all items within a domain are either all positively or all negatively phrased to allow ease of scoring and data interpretation. Item reduction is performed by psychometric analysis and expert input to ensure content validity. At the end of this phase, a preliminary item list should be presented.

Phase 3: Pre-testing the Preliminary Item List Members of the target population are invited to test the preliminary item list, including understanding of the questions and statements, appropriateness of the rating scales, and the format and clarity of what is expected. The length of the questionnaire and the time taken to complete it in full are important to note. It is recommended that at least six countries are included to incorporate the breadth of cultures and interpretation of the questions. A minimum of 15 patients should be involved. The item list is then revised and re-reviewed: questions can be adapted or removed, and new items added. The first version of the PROMs instrument can then be generated for field testing.


Phase 4: International Field-Testing of the Refined Model The instrument is administered to a large sample of the target population to assess the reliability, reproducibility, accessibility and validity of the items. The reliability and consistency of the items measured is determined by a Cronbach’s alpha coefficient greater than 0.70. Known-groups validity can be used to compare the outcomes of subgroups of patients, for example patients at different stages of disease or with different performance status. Following the responses, the final modifications can be made to produce the final version.

Phase 5: Validation of the PROMs Instrument Psychometric validation of the instrument requires a calculated number of patients. Fayers and Machin (2007) have suggested that a minimum of 10 patients per item is required [35]. The majority of questionnaires have a minimum of 30 items, which translates to a minimum of 300 patients for the validation cohort. Test-retest repeatability is crucial to ensure repeatable scores in the same group of patients, with a correlation coefficient of 0.70 regarded as acceptable. Item response theory (IRT) is useful for reducing items and confirming essential items for inclusion [36, 37].

Psychometric evaluation of PROMs instruments in the development process can be modelled on the Classical Test Theory (CTT) and the Rasch Measurement Theory (RMT) [38, 39]. CTT is commonly based on the summation of true values and true correlations between items, with an assessment of the tool as a whole. CTT is limited to non-parametric analyses and may not be adequate for objective PROMs models. RMT is an advanced method modelling relationships between individual items and participants, with true clinical expectations. It allows monitoring of the quality and precision of outcome calculations for high quality, reproducible PROMs tools.
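The numerical checks mentioned in Phases 4 and 5 can be sketched in a few lines. The example below computes Cronbach’s alpha with the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), a test-retest correlation, and the validation cohort size implied by the 10-patients-per-item rule. The 0.70 thresholds and the per-item rule come from the text; the function names and toy scores are hypothetical, and Pearson’s correlation is used here only as one possible form of the “correlation analysis” (the intraclass correlation coefficient is a common alternative).

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a patients-by-items table of numeric item scores."""
    n_items = len(scores[0])
    # Sample variance of each item (column) across patients.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each patient's total score.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def pearson_r(first, second):
    """Pearson correlation between two administrations to the same patients."""
    mean_first, mean_second = mean(first), mean(second)
    covariation = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    spread_first = sum((a - mean_first) ** 2 for a in first)
    spread_second = sum((b - mean_second) ** 2 for b in second)
    return covariation / sqrt(spread_first * spread_second)


def minimum_validation_cohort(n_items, patients_per_item=10):
    """Smallest validation cohort under the patients-per-item rule."""
    return n_items * patients_per_item


if __name__ == "__main__":
    # Toy data: five patients answering three items (hypothetical scores).
    toy_scores = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 3]]
    alpha = cronbach_alpha(toy_scores)
    print(f"Cronbach's alpha = {alpha:.2f}; above 0.70: {alpha > 0.70}")

    # Toy total scores from the same patients on two occasions.
    retest = pearson_r([10, 14, 9, 17, 12], [11, 13, 9, 18, 12])
    print(f"Test-retest r = {retest:.2f}; at least 0.70: {retest >= 0.70}")

    print(minimum_validation_cohort(30))
```

With real data, such figures would normally be produced by a statistical package with confidence intervals; the sketch is only meant to show what the thresholds refer to.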


Conclusion PROMs have become an integral part of improving patient care globally, allowing appropriate resource allocation in healthcare systems and driving innovation for future health care practices. It is crucial to consider the optimal research methodology to generate the most valuable and clinically translatable results to address the study aim.

Summary Points
• Treatment strategies and associated desirable clinical outcomes may not align with patients’ perceived quality of life.
• Post-operative physical and psychological impact can be assessed with a comprehensive multifaceted tool, as an aid to define treatment satisfaction.
• PROMs are based on symptom assessment and functional status specific to the patient.
• Two common formats of PROMs instruments are (1) the multi-attribute utility instrument, and (2) the visual analogue scale.
• Two types of PROMs instruments are: (1) generic tools, offering generalisability and accessibility for comparisons across global datasets, and (2) disease-specific tools, where cancer-related PROMs are well established.
• The implementation of PROMs has a wider benefit in health economics to quantify and justify resource allocation for procedures, support clinical decision-making, encourage quality improvement and inform health policies.
• Globally, cultural diversity and discrepancies in socioeconomic status and education mean that the generalisability and applicability of the tools present a problem.

References
1. WHO. World health organization constitution. Basic doc. 1984. http://www.who.int/governance/eb/who_constitution_en.pdf
2. WHO. International classification of impairments, disabilities, and handicaps: a manual of classification relating to the consequences of disease. 1980. http://apps.who.int/iris/handle/10665/41003
3. Wilson IB, Cleary P. Linking clinical variables with health-related quality of life. A conceptual model of patient outcomes. JAMA. 1995;273:59–65.
4. Greenhalgh J. The applications of PROs in clinical practice: what are they, do they work, and why? Qual Life Res. 2009;18:115–23.
5. Kozma CM, Reeder C, Schulz RM. Economic, clinical, and humanistic outcomes: a planning model for pharmacoeconomic research. Discussion 1120. Clin Ther. 1993;15:1121–32.
6. Fitzpatrick R, Davey C, Buxton MJ, Jones DR. Evaluating patient-based outcome measures for use in clinical trials. Health Technol Assess. 1998;2:1–74.
7. Chen H, Taichman DB, Doyle RL. Health-related quality of life and patient-reported outcomes in pulmonary arterial hypertension. Proc Am Thorac Soc. 2008;5:623–30.
8. Chao J, Nau DP, Aikens JE. Patient-reported perceptions of side effects of antihyperglycemic medication and adherence to medication regimens in persons with diabetes mellitus. Clin Ther. 2007;29:177–80.
9. Weldring T, Smith SMS. Patient-reported outcomes (PROs) and patient-reported outcome measures (PROMs). Health Serv Insights. 2013;6:61–8.
10. International Alliance of Patients’ Organizations. What is patient centred health care? A review of definitions and principles. 2nd ed. London: IAPO; 2007. p. 1–34.
11. Kingsley C, Patel S. Patient-reported outcome measures and patient-reported experience measures. BJA Educ. 2017;17(4):137–44.
12. NHS Digital. Patient reported outcome measures (PROMs). https://digital.nhs.uk/data-and-information/data-tools-and-services/data-services/patient-reported-outcome-measures-proms. Accessed 1 Apr 2022.
13. Churruca K, Pomare C, Ellis LA, Long JC, Henderson SB, LED M, Leahy CJ, Braithwaite J. Patient-reported outcome measures (PROMs): a review of generic and condition-specific measures and a discussion of trends and issues. Health Expect. 2021;24:1015–24.
14. Ahmed S, Berzon RA, Revicki DA, Lenderking WR, Moinpour CM, Basch E, et al. The use of patient-reported outcomes (PRO) within comparative effectiveness research: implications for clinical practice and health care policy. Med Care. 2012;50:1060–70.
15. Valderas JM, Kotzeva A, Espallargues M, Guyatt G, Ferrans CE, Halyard MY, et al. The impact of measuring patient-reported outcomes in clinical practice: a systematic review of the literature. Qual Life Res. 2008;17:179–93.
16. Chin R, Lee BY. Economics and patient reported outcomes, principles and practice of clinical trial medicine. London: Elsevier; 2008. p. 145–66.
17. U.S. Department of Health and Human Services FDA Center. Guidance for industry. Patient-reported outcome measures: use in medical product development to support labeling claims. U.S. FDA, Clinical/Medical. 2009.
18. EuroQol Group. EuroQol: a new facility for the measurement of health related quality of life. Health Policy. 1990;16:199–208.
19. Brooks R. EuroQol: the current state of play. Health Policy. 1996;37:53–72.
20. Hjermstad MJ, Fayers PM, Haugen DF, Caraceni A, Hanks GW, Loge JH, et al. Studies comparing numerical rating scales, verbal rating scales, and visual analogue scales for assessment of pain intensity in adults: a systematic literature review. J Pain Symptom Manag. 2011;41:1073–93.
21. Guyatt GH, Feeny DH, Patrick DL. Measuring health-related quality of life. Ann Intern Med. 1993;118:622–9.
22. Patrick DL, Deyo R. Generic and disease-specific measures in assessing health status and quality of life. Med Care. 1989;27:S217–32.
23. Al Sayah F, Jin X, Johnson JA. Selection of patient-reported outcome measures (PROMs) for use in health systems. J Patient Rep Outcomes. 2021;5:99.
24. Ernstsson O, Janssen MF, Heintz E. Collection and use of EQ-5D for follow-up, decision-making, and quality improvement in health care: the case of the Swedish National Quality Registries. J Patient Rep Outcomes. 2020;4:78.
25. Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care. 1992;30:473–83.
26. https://www.rand.org/health-care/surveys_tools/mos/36-item-short-form.html. Accessed 1 Apr 2022.
27. Joyce CR, Hickey A, HM MG, O’Boyle CA. A theory-based method for the evaluation of individual quality of life: the SEIQoL. Qual Life Res. 2003;12:275–80.
28. Snaith RP. The hospital anxiety and depression scale. Health Qual Life Outcomes. 2003;1:29.
29. Wiklund I. The Nottingham Health Profile: a measure of health-related quality of life. Scand J Prim Health Care Suppl. 1990;1:15–8.
30. European Organisation for Research and Treatment of Cancer (EORTC). https://qol.eortc.org/questionnaires/. Accessed 1 Apr 2022.
31. Meirte J, Hellemans N, Anthonissen M, Denteneer L, Maertens K, Moortgat P, Van Daele U. Benefits and disadvantages of electronic patient-reported outcome measures: systematic review. JMIR Perioper Med. 2020;3:e15588.
32. Bullinger M. In: Kuykken W, editor. Ensuring international equivalence of quality of life measures. Quality of life assessment: international perspectives. Berlin: Springer; 1994. p. 33–40.
33. Leplege A, Verdier A. The adaptation of health status measures: methodological aspects of the translation procedure. In: Shumaker S, Berzon R, editors. The international assessment of health-related quality of life: theory, translation, measurement and analysis. Oxford: Rapid Communications of Oxford; 1995. p. 93–101.
34. Sprangers MA, Cull A, Bjordal K, Groenvold M, Aaronson NK. The European organization for research and treatment of cancer. Approach to quality of life assessment: guidelines for developing questionnaire modules. EORTC study group on quality of life. Qual Life Res. 1993;2:287–95.
35. Fayers P, Machin D. Quality of life: the assessment analysis and interpretation of patient reported outcomes. Chichester: Wiley; 2007.
36. Hambleton R, van der Linden WJ. Advances in item response theory and applications: an introduction. Appl Psychol Meas. 1982;6:373–8.
37. Lord FM. Applications of item response theory to practical testing problems. Hillsdale: Erlbaum; 1980.
38. McKenna SP, Heaney A, Wilburn J, Stenner AJ. Measurement of patient-reported outcomes. 1: the search for the Holy Grail. J Med Econ. 2019;22:516–22.
39. Cappelleri JC, Lundy JJ, Hays RD. Overview of classical test theory and item response theory for quantitative assessment of items in developing patient-reported outcome measures. Clin Ther. 2014;36:648–62.

4  Methodology for Systematic Reviews on Measurement Properties of Patient Reported Outcome Measures (PROMS)
Orestis Argyriou, Michail Chatzikonstantinou, Vanash Patel, and Thanos Athanasiou

O. Argyriou (*) Health Education England – North West London, London, UK The Hillingdon Hospitals NHS Foundation Trust, Uxbridge, UK e-mail: [email protected] M. Chatzikonstantinou Bariatric Centre for Weight Management and Metabolic Surgery, University College Hospital, London, UK e-mail: [email protected]

V. Patel West Hertfordshire Teaching Hospitals NHS Trust, Watford, UK Imperial College London, London, UK e-mail: [email protected] T. Athanasiou Department of Surgery and Cancer, Imperial College London, London, UK Imperial College Healthcare NHS Trust, London, UK e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 T. Athanasiou et al. (eds.), Patient Reported Outcomes and Quality of Life in Surgery, https://doi.org/10.1007/978-3-031-27597-5_4


Patient Reported Outcome Measures can be assessed by evaluating their Measurement Properties. A systematic review can be performed in order to compare and evaluate PROMs, to make recommendations regarding their use, and to identify any gaps or the need for the design of a new instrument. The COSMIN initiative (Consensus-based Standards for the selection of health Measurement Instruments) has provided thorough methodological guides for performing such a systematic review. This involves a step-wise approach, assessing content validity, internal structure and the remaining measurement properties separately. Following the current advancements and increased scientific interest in research relating to quality of life, particularly with the use of patient reported outcome tools, clinicians are frequently involved in relevant studies. A clinician may be interested in investigating which tool is most appropriate for their practice, and this is the purpose of this methodological overview. Nevertheless, although a clinician can benefit greatly from a more in-depth understanding of this methodology, it is strongly advised that such studies be undertaken in close collaboration with Epidemiologists and Biostatisticians.

Introduction Aim of the Chapter This chapter aims to discuss and present the currently used methodology for performing studies and systematic reviews on the measurement properties of PROMs. It first provides some insight into the most common terms utilised in designing and interpreting reported papers and results on PROMs. The process of PROMs design and the generation of a new PROM are beyond the scope of this chapter and are only discussed as part of the assessment and evaluation of studies for a systematic review.

What Are Patient Reported Outcomes (PROs) and Patient Reported Outcome Measures (PROMs) Patient-reported Outcomes (PROs) have long been established in medical research, as both primary and secondary outcomes of studies. According to the FDA, a Patient-Reported Outcome (PRO) is any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else [1]. Patient-Reported Outcome Measures (PROMs), or alternatively PRO instruments, are defined as the instruments used to measure PROs or capture PRO data, such as questionnaires that are completed by patients [1]. In the relevant literature, when referring to a PROM or a PROM instrument, authors may be discussing a questionnaire as a whole or a single question.

What Are the Measurement Properties of PROMs

Measurement properties are essential criteria in the design and evaluation of a PROM. Broadly, these are Validity, Reliability, Responsiveness and Interpretability. Detailed definitions will be discussed below.

4  Methodology for Systematic Reviews on Measurement Properties of Patient Reported Outcome…

Why Perform Systematic Reviews on Measurement Properties of PROMs

Provided that PROMs covering an area of interest already exist (developed and/or validated), a systematic review may be performed in order to compare the measurement properties of these PROMs, evaluate the quality of each PROM, identify the advantages and disadvantages of each PROM and, ultimately, recommend which PROMs should be used in future studies. In addition, if the results indicate a rather low quality of the available PROMs, or inadequate measurement of the area of interest, then the systematic review may inform and guide the design of a new PROM.

Current Methodology: The COSMIN Initiative

The vast majority of guidance and tools on the interpretation of PROMs has been provided by the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) initiative [2]. The COSMIN initiative, after initially identifying the lack of clear definitions and widely accepted methodology [3], has specified the definitions of the measurement properties of PROMs [4], and also provides comprehensive guidance for performing a systematic review of outcome measurements, as well as handbooks for the interpretation and assessment of each measurement property in PROMs.

Definitions and Taxonomy

In order to perform a systematic review on measurement properties of PROMs, the researcher must be familiar with the measurement properties and their definitions. As mentioned previously, the COSMIN initiative, following a Delphi study, has recommended definitions for the measurement properties [4]. Most importantly, the initiative agreed on a taxonomy incorporating the measurement properties [4]. According to this taxonomy, COSMIN identifies three main domains of measurement properties in assessing the quality of a PROM: Validity, Reliability and Responsiveness (Fig. 4.1 and Table 4.1). A fourth domain, Interpretability, is also considered [4].

Fig. 4.1  Three plus one domains of assessment of the quality of a PROM. Mokkink, L. B. et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J. Clin. Epidemiol. 63, 737–745 (2010)

O. Argyriou et al.

Table 4.1  Definitions of the three domains of the assessment of a PROM

Validity: The degree to which an HR-PRO instrument measures the construct(s) it purports to measure
  Content validity: The degree to which the content of an HR-PRO instrument is an adequate reflection of the construct to be measured
  *Face validity: The degree to which (the items of) an HR-PRO instrument indeed looks as though they are an adequate reflection of the construct to be measured
  Construct validity: The degree to which the scores of an HR-PRO instrument are consistent with hypotheses (for instance with regard to internal relationships, relationships to scores of other instruments, or differences between relevant groups) based on the assumption that the HR-PRO instrument validly measures the construct to be measured
  Structural validity: The degree to which the scores of an HR-PRO instrument are an adequate reflection of the dimensionality of the construct to be measured
  Cross-cultural validity: The degree to which the performance of the items on a translated or culturally adapted HR-PRO instrument are an adequate reflection of the performance of the items of the original version of the HR-PRO instrument
  Criterion validity: The degree to which the scores of an HR-PRO instrument are an adequate reflection of a “gold standard”
Reliability: The degree to which the measurement is free of measurement error, i.e. the extent to which scores for patients who have not changed are the same for repeated measurement under several conditions
  Internal consistency: The degree of the interrelatedness among the items
  Reliability: The proportion of the total variance in the measurements which is because of “true” differences among patients
  Measurement error: The systematic and random error of a patient’s score that is not attributed to true changes in the construct to be measured
Responsiveness: The ability of an HR-PRO instrument to detect change over time in the construct to be measured
Interpretability: The degree to which one can assign qualitative meaning (that is, clinical or commonly understood connotations) to an instrument’s quantitative scores or change in scores
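The definition of reliability as a measurement property in Table 4.1 (the proportion of total variance due to “true” differences among patients) can be written as a variance ratio. The following is only a sketch: the symbols are our own notation, and it assumes that the total variance splits into a true-score component and an error component.

```latex
\text{Reliability}
  = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{total}}}
  = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}
```

Under this assumption the ratio approaches 1 as the error variance shrinks relative to the true differences among patients, and approaches 0 when most of the observed variance is error.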

Performing a Systematic Review

General

A systematic review on measurement properties of PROMs shares some common methodological features with any other systematic review. We will focus on the process of assessing the measurement properties. The COSMIN initiative has provided summary guidelines for performing a systematic review [5], as well as a more detailed user manual describing the methodology in more depth [6]. In this section, we present and discuss the processes recommended in these documents. All tables and figures are adopted from these sources. The overall process, and the steps that need to be followed, are shown in the flowchart in Fig. 4.2 [5]. As shown in the flowchart, a systematic review consists of three stages. Initially, as per routine practice, a literature search is performed, followed by a thorough assessment of the measurement properties. Finally, recommendations are formulated and the review is reported.

Literature Search

The initial stage consists of the standard steps (steps 1–4) for performing systematic reviews.

Fig. 4.2  The first four stages of a literature search. Prinsen, C. A. C. et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual. Life Res. 27, 1147–1157 (2018)

[Flowchart content]
A. Perform the literature search
   1. Formulate the aim of the review
   2. Formulate eligibility criteria
   3. Perform a literature search
   4. Select abstracts and full-text articles
B. Evaluate the measurement properties
   5. Evaluate content validity
   6. Evaluate internal structure (structural validity, internal consistency, cross-cultural validity)
   7. Evaluate the remaining measurement properties (reliability, measurement error, criterion validity, hypotheses testing for construct validity, responsiveness)
   Evaluate the quality of the PROM:
   - Evaluate the methodological quality of the included studies by using the COSMIN Risk of Bias checklist
   - Apply criteria for good measurement properties by using quality criteria
   - Summarize the evidence and grade the quality of the evidence by using the GRADE approach
C. Select a PROM
   8. Evaluate interpretability and feasibility
   9. Formulate recommendations
   10. Report the systematic review


–– Step 1: Formulating the aim. When deciding and developing the aim of the review, the four key elements that need to be included are the construct of interest, the population, the type of instrument and the measurement properties of interest.
–– Step 2: Formulating the eligibility criteria. Not all studies mentioning the PROMs of interest are to be included. Eligible studies should fulfil the aforementioned four key elements. Most importantly, given the large number of studies on different PROMs, the main focus should be on studies assessing and evaluating one (or more) of the measurement properties of the PROM, and certainly not on studies that merely use the PROM as an outcome measurement.
–– Step 3: Performing the literature search. Standard Cochrane methodology should be followed for performing the literature search. The four key elements of the aim need to be included, as shown in the flowchart depicting the search strategy and terms described by the COSMIN initiative [5] (Fig. 4.3).
–– Step 4: Selection of abstracts and full-text articles. Selection and review of the abstracts and full texts is performed in a routine manner, with the general recommendation that this be performed by two reviewers independently.
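As an illustration of Step 3, the way the four key elements are combined into one search string can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions: the function names and all search terms are hypothetical placeholders, they are not the validated COSMIN search or exclusion filters, and real database syntax (field tags, truncation) is omitted.

```python
def or_block(terms):
    """Join the synonyms for one search element with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_search(construct, population, instrument, measurement_properties,
                 exclusion=None):
    """Combine the key elements with AND; an optional exclusion block is attached with NOT.

    Empty elements are skipped, e.g. when no search terms are used for one block.
    """
    elements = (construct, population, instrument, measurement_properties)
    query = " AND ".join(or_block(e) for e in elements if e)
    if exclusion:
        query += " NOT " + or_block(exclusion)
    return query


# Hypothetical example terms, for illustration only.
query = build_search(
    construct=["quality of life", "bowel function"],
    population=["colorectal surgery"],
    instrument=["questionnaire", "patient reported outcome"],
    measurement_properties=["reliability", "validity", "responsiveness"],
    exclusion=["case report"],
)
print(query)
```

In practice the measurement-property and exclusion blocks would be replaced by the published search filters referred to in Fig. 4.3, and the strategy should be developed together with an information specialist.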

Evaluation of Measurement Properties

As demonstrated in the flowchart in Fig. 4.2, this is done in three main stages. Given the significance of content validity and internal structure, these are assessed separately, followed by assessment of the remaining properties.
1. Content Validity
2. Internal Structure
3. Remaining Properties (Reliability, Measurement error, Criterion validity, Hypotheses testing for construct validity, Responsiveness)

Evaluation of Content Validity

The COSMIN initiative, given the significance and complexity of the evaluation of content validity, provides a separate user manual with the relevant methodology [7]. According to the COSMIN recommendations, there are three aspects of content validity in a PROM:
• Relevance
• Comprehensiveness
• Comprehensibility

In order to assess these, COSMIN recommends ten criteria for good content validity, which were formulated following a Delphi study [8], as shown in Table 4.2. These are assessed using a stepwise process:
Step 1: Evaluation of the quality of the PROM development
Step 2: Evaluation of the quality of content validity studies on the PROM
Step 3: Evaluation of the content validity of the PROM

A more detailed description of the steps is provided below, although not in full length and detail. For each step, COSMIN provides boxes that summarise the process succinctly. These are also presented below.

Step 1: Evaluating the Quality of the PROM Development

This step is further subdivided into steps 1a and 1b. In step 1a, the quality of the PROM design is assessed (evaluating relevance). In step 1b, the quality of any cognitive interview studies or pilot studies assessing the PROM is examined (evaluating comprehensibility and comprehensiveness) (Table 4.3). To perform these steps, a number of items/questions need to be answered, as per the flowchart in Fig. 4.4.

Fig. 4.3  Step 3, performing the literature search. Prinsen, C. A. C. et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual. Life Res. 27, 1147–1157 (2018)

[Flowchart: search blocks for three review scopes (all PROMs; all “validated” PROMs; one or more PROMs). Blocks shown: 1. Construct (construct of interest: comprehensive search terms; all outcomes: no search terms); 2. Population (population of interest: comprehensive search terms; age: e.g. child filter); 3. Type of instrument (preferably no search terms; PROMs: PROM filter; other instruments: comprehensive search terms) or 3. Name(s) of instrument(s); 4. Measurement properties (search filter, Terwee). Blocks are combined with AND, and the exclusion filter (Terwee) is applied with NOT.]

Table 4.2  Criteria for good content validity

Relevance
1  Are the included items relevant for the construct of interest?
2  Are the included items relevant for the target population of interest?
3  Are the included items relevant for the context of use of interest?
4  Are the response options appropriate?
5  Is the recall period appropriate?
Comprehensiveness
6  Are no key concepts missing?
Comprehensibility
7  Are the PROM instructions understood by the population of interest as intended?
8  Are the PROM items and response options understood by the population of interest as intended?
9  Are the PROM items appropriately worded?
10  Do the response options match the question?

Terwee, C. B. et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual. Life Res. 27, 1159–1170 (2018)

Table 4.3  COSMIN box 1

COSMIN box 1. Standards for evaluating the quality of studies on the development of a PROM
1a. Standards for evaluating the quality of the PROM design to ensure relevance of the PROM
    General design requirements
    Concept elicitation (relevance and comprehensiveness)
1b. Standards for evaluating the quality of a cognitive interview study or other pilot test performed to evaluate comprehensibility and comprehensiveness of a PROM
    General design requirements
    Comprehensiveness
    Comprehensibility

COSMIN box 1 describes 13 items/questions for Part 1a, and 22 items/questions for Part 1b. The detailed items are not presented here, and we recommend reading the full manual, where the items are presented along with further explanations and examples.

Step 2: Evaluating the Quality of Content Validity Studies on the PROM

In this step, we assess how patients and professionals were asked about relevance, comprehensibility and comprehensiveness, either as part of the PROM design process or as a separate content validity study (Table 4.4).

This step can be broadly separated into steps 2a, 2b and 2c (asking patients about relevance, comprehensiveness and comprehensibility), and steps 2d and 2e (asking professionals about relevance and comprehensiveness), as shown in the respective flowchart (Fig. 4.5). Overall, there are 31 items/questions to be assessed.

For Steps 1–2

As mentioned previously, the exact items that are utilised in each step are not presented here. What is important to note is how ratings are provided for each item. A 4-point rating scale is utilised, as listed below.

Fig. 4.4  Evaluating the quality of the PROM development. Caroline B Terwee et al. COSMIN methodology for assessing the content validity of PROMs

[Flowchart. Part 1a: complete general design requirements (items 1-5); “Was a sample from the target population involved in the development of the PROM (item 5)?”; complete items 6-13. Part 1b: “Was a cognitive interview study or other pilot test conducted (item 14)?”; complete item 15; “Were patients asked about the comprehensibility (item 16)?”, complete items 16-25; “Were patients asked about the comprehensiveness (item 26)?”, complete items 26-35. Branches are labelled YES, NO or ?, and some lead to “PROM development inadequate”. Final box: determine final rating (worst score counts).]

Table 4.4  COSMIN box 2: Standards for evaluating the quality of studies on the content validity of a PROM

2a. Asking patients about the relevance of the PROM items
2b. Asking patients about the comprehensiveness of the PROM
2c. Asking patients about the comprehensibility of the PROM
2d. Asking professionals about the relevance of the PROM items
2e. Asking professionals about the comprehensiveness of the PROM

Caroline B Terwee et al. COSMIN methodology for assessing the content validity of PROMs

Fig. 4.5  Evaluating the quality of content validity studies on the PROM. Caroline B Terwee et al. COSMIN methodology for assessing the content validity of PROMs

[Flowchart. Part 2a,b,c: “Were patients asked about relevance?”, complete items 1-7; “Were patients asked about comprehensiveness?”, complete items 8-14; “Were patients asked about comprehensibility?”, complete items 15-21. Part 2d,e: “Were professionals asked about relevance?”, complete items 22-26; “Were professionals asked about comprehensiveness?”, complete items 27-31. Each question has branches labelled YES, NO or ?.]

• Very good
• Adequate
• Doubtful
• Inadequate

For each item, the COSMIN manuals provide detailed examples of the criteria that should be fulfilled to achieve each rating. Below we provide an example, Item 5 from step 1a (Table 4.5). To ensure high quality, COSMIN recommends using a ‘worst score counts’ method, where the lowest rating is utilised as the overall rating. For Step 1, the lowest rating in the respective items will correspond to the overall rating for the PROM development.
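The ‘worst score counts’ rule can be sketched as follows. This is a minimal illustration rather than COSMIN software: the function name is our own, and treating items rated “not applicable” as ignored is an assumption made for this sketch.

```python
# The 4-point scale, ordered from worst to best.
RATING_ORDER = ["inadequate", "doubtful", "adequate", "very good"]


def worst_score_counts(item_ratings):
    """Return the overall rating of a step as the lowest rating among its items.

    Assumption for this sketch: items rated "not applicable" are ignored.
    """
    applicable = [r for r in item_ratings if r != "not applicable"]
    if not applicable:
        raise ValueError("no applicable item ratings")
    # min() over the position in RATING_ORDER picks the worst rating.
    return min(applicable, key=RATING_ORDER.index)


# Hypothetical ratings for the items of one step.
overall = worst_score_counts(["very good", "adequate", "doubtful", "very good"])
print(overall)
```

A single low rating therefore determines the overall rating of the step, which is what makes the method conservative.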

Table 4.5  Example of the COSMIN manuals

Item 5: Was the PROM development study performed in a sample representing the target population for which the PROM was developed?
Very good: Study performed in a sample representing the target population
Adequate: Assumable that the study was performed in a sample representing the target population, but not clearly described
Doubtful: Doubtful whether the study was performed in a sample representing the target population
Inadequate: Study not performed in a sample representing the target population (SKIP standards 6–12)
Not applicable: (no entry)

Caroline B Terwee et al. COSMIN methodology for assessing the content validity of PROMs

Table 4.6  COSMIN criteria and rating system for evaluating the content validity of PROM

Name of the PROM or subscale: ..................................

Columns (each criterion below is rated in every column):
PROM development study: + / - / ± / ? (note 1)
Content validity study 1: + / - / ± / ?
Content validity study 2 (note 2): + / - / ± / ?
Rating of reviewers: + / - / ± / ?
OVERALL RATINGS PER PROM (note 3) (see step 3b): + / - / ±
QUALITY OF EVIDENCE (see step 3c): High, moderate, low, very low

Rows, criteria (see Table 2):
Relevance
1  Are the included items relevant for the construct of interest? (note 4)
2  Are the included items relevant for the target population of interest? (note 4)
3  Are the included items relevant for the context of use of interest? (note 4)
4  Are the response options appropriate?
5  Is the recall period appropriate?
RELEVANCE RATING (see Table 3)
Comprehensiveness
6  Are all key concepts included?
COMPREHENSIVENESS RATING (see Table 3)
Comprehensibility
7  Are the PROM instructions understood by the population of interest as intended?
8  Are the PROM items and response options understood by the population of interest as intended?
9  Are the PROM items appropriately worded?
10  Do the response options match the question?
COMPREHENSIBILITY RATING (see Table 3)
CONTENT VALIDITY RATING (see Table 4)

Notes:
1  Ratings for the 10 criteria can only be + / - / ?. The RELEVANCE, COMPREHENSIVENESS, COMPREHENSIBILITY, AND CONTENT VALIDITY ratings can be + / - / ± / ?
2  Add more columns if more content validity studies are available
3  If ratings are inconsistent between studies, consider using separate tables for subgroups of studies with consistent results.
4  These criteria refer to the construct, population, and context of use of interest in the systematic review.

Caroline B Terwee et al. COSMIN methodology for assessing the content validity of PROMs

For Step 2, the lowest rating in the respective items will correspond to the overall rating of the content validity studies on the PROM.

Step 3: Evaluating the Content Validity of the PROM

For this step, the content validity of the PROM is evaluated by examining the quality and results of already performed studies on the PROM. This, again, is further subdivided into three steps.

For step 3a, ratings need to be provided for relevance, comprehensiveness and comprehensibility, using the ten criteria for good content validity (presented previously), for three different sources (Table 4.6):
• Methods and results of the PROM development study
• Content validity studies on the PROM
• Reviewers’ own ratings of the PROM

Table 4.7  GRADE criteria

Study design and starting quality of evidence:
At least 1 content validity study: High
No content validity studies: Moderate
(lower levels: Low, Very low)

Lower if:
Risk of bias: -1 Serious; -2 Very serious; -3 Very serious
Inconsistency: -1 Serious; -2 Very serious
Indirectness: -1 Serious; -2 Very serious

https://www.gradeworkinggroup.org/
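The downgrading logic of Table 4.7 can be sketched in code. This is a minimal sketch under stated assumptions, not an official GRADE or COSMIN tool: the function and parameter names are our own, and we assume that the downgrades for the three factors are simply added together and that the result cannot fall below “very low”.

```python
# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(n_content_validity_studies, risk_of_bias=0,
                  inconsistency=0, indirectness=0):
    """Grade the quality of evidence from a starting level and downgrade steps.

    Each factor is the number of levels to downgrade (0 = no downgrade), with
    the ranges of Table 4.7: risk of bias 0-3, inconsistency 0-2, indirectness 0-2.
    Assumption: downgrades are summed and the result is floored at "very low".
    """
    if not 0 <= risk_of_bias <= 3:
        raise ValueError("risk_of_bias must be between 0 and 3")
    if not 0 <= inconsistency <= 2 or not 0 <= indirectness <= 2:
        raise ValueError("inconsistency and indirectness must be between 0 and 2")
    # Start at "high" with at least one content validity study, else "moderate".
    start = "high" if n_content_validity_studies >= 1 else "moderate"
    level = LEVELS.index(start) - (risk_of_bias + inconsistency + indirectness)
    return LEVELS[max(level, 0)]


# Hypothetical example: two content validity studies, serious risk of bias
# and serious indirectness.
print(grade_quality(2, risk_of_bias=1, indirectness=1))
```

The numeric downgrades are judgements made by the reviewers; the code only records how those judgements translate into a final level under the assumptions above.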

Essentially, the ratings for the methods and results of the PROM development studies, and the content validity studies, are the ones already assessed in steps 1 and 2, according to the respective COSMIN boxes, and are utilised in this table. With regards to the potential ratings of each criterion, these can be: –– Sufficient (+): ≥85% of the items of the PROM (or sub-scale) fulfil the criterion –– Insufficient (−):