Patient Reported Outcomes : An Overview [1 ed.] 9788897419600, 9788897419709


English Pages 43 Year 2016


Patient Reported Outcomes
An overview

Annabel Nixon, Diane Wild, Willie Muehlhausen


© SEEd srl. All rights reserved.
Piazza Carlo Emanuele II, 19, 10123 Torino, Italy
Tel. 011.566.02.58, Fax 011.518.68.92
www.edizioniseed.it
First edition June 2015
ISBN 978-88-9741-960-0

Although the information about medication given in this book has been carefully checked, the author and publisher accept no liability for the accuracy of this information. In every individual case the user must check such information by consulting the relevant literature. This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the Italian Copyright Law in its current version, and permission for use must always be obtained from SEEd Medical Publishers Srl. Violations are liable to prosecution under the Italian Copyright Law.

Contents

Introduction to Patient Reported Outcomes
  Definitions
  Application
  Taxonomy
  Format

Applications of PRO data
  Clinical trials
  Regulatory approvals
  Real world evidence studies
  Clinical practice
  Health technology assessment
  Prescribers and patients

Collecting PRO data to support product evaluation
  Develop a PRO strategy considering issues important to patients, the target product profile and stakeholder requirements
  Select validated PRO instruments
  Ensure the trial or study design is appropriate for PRO collection

Linguistic Validation
  The Need for Linguistically Validated PROs
  Methodology
  Same Language Different Country
  Measurement Equivalence
  Regulators

ePRO Technology Overview
  Introduction
  Technologies
  Patient & Site Staff Acceptance
  Regulatory Perspective
  Migration & Validation

Analysing and reporting PRO data
  Statistical analysis plan
  Reporting PRO results

References
Authors


Patient Reported Outcome. An overview

Introduction to Patient Reported Outcomes

Definitions

Patient Reported Outcome (PRO) is an umbrella term that has become widely accepted to refer to «a measurement based on a report that comes directly from the patient (i.e. study subject) about the status of a patient’s health condition without amendment or interpretation of the patient’s response by a clinician or anyone else» [FDA, 2009]. Similarly, the European Medicines Agency (EMA) defines a PRO as «any outcome evaluated directly by the patient himself and based on patient’s perception of a disease and its treatment(s)» [EMA, 2005]. Some agencies (e.g. the UK National Health Service, NHS) use the term PROM (Patient Reported Outcome Measure) interchangeably with PRO. Throughout this booklet the term PRO is used.

A PRO instrument includes the standardized format for data collection, as well as all the information and documentation that support the use of the standardized form. The standardized format could be self-report on paper, electronically (e.g. online, tablet, mobile phone) or through a telephone Interactive Voice Response System (IVRS), or it could be an interview, provided that the interviewer records only the patient’s response without interpretation.

A PRO instrument can comprise a single question (item), such as the pain Numerical Rating Scale (NRS) shown in Figure 1, or it can have many items that are grouped together to form a total score and/or domain scores. For example, the EORTC QLQ-C30, a measure of Health-Related Quality of Life (HRQL) used widely in oncology, comprises 30 items which are grouped together into 15 domains covering symptoms commonly reported in oncology, such as pain and fatigue, as well as areas of functioning important to cancer patients, such as physical function and social function.

Figure 1. Example pain Numerical Rating Scale (NRS).

PROs should be used to measure a concept that is relevant to, and experienced by, the patient. The concept might be a symptom experienced by the patient, such as pain or fatigue. Symptoms are considered to be concepts that are proximal to the patient experience (Figure 2). The concept might be more distal to the patient experience, such as the impact of a symptom on an aspect of the patient’s functioning, for example physical function, cognitive function or sexual function. The concept might also be health-related quality of life (HRQL), defined as the patient’s subjective perception of the impact of his disease and its treatment(s) on daily life, physical, psychological and social functioning and well-being. The concept can be measured either in absolute terms, for example pain severity at a specified time point, or in terms of change from a previous measurement.

Figure 2. Distal and proximal concept measurement using PROs.
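As a rough illustration of how items are grouped into domain scores, and of the difference between an absolute score and a change score, the sketch below scores a hypothetical five-item instrument. The item identifiers, the domain grouping and the use of a simple mean are assumptions made for illustration only; a real instrument such as the EORTC QLQ-C30 has its own published scoring manual, which should always be followed.

```python
from statistics import mean

# Hypothetical scoring key: item IDs grouped into domains.
# This is NOT the scoring key of any real instrument.
DOMAINS = {
    "pain": ["q1", "q2"],
    "fatigue": ["q3", "q4", "q5"],
}


def domain_scores(responses, domains=DOMAINS):
    """Return the mean item response for each domain (an absolute score)."""
    return {
        name: mean(responses[item] for item in items)
        for name, items in domains.items()
    }


def change_from_baseline(baseline, follow_up):
    """Return follow-up minus baseline for each domain (a change score)."""
    return {name: follow_up[name] - baseline[name] for name in baseline}


# Invented responses from one patient at two time points.
baseline = domain_scores({"q1": 3, "q2": 4, "q3": 2, "q4": 3, "q5": 4})
week_12 = domain_scores({"q1": 2, "q2": 2, "q3": 2, "q4": 2, "q5": 2})
change = change_from_baseline(baseline, week_12)
```

Here `baseline` and `week_12` hold absolute domain scores at each time point, while `change` holds the change from the earlier measurement; whether a negative change means improvement depends on the direction of the instrument's scale.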

Application

PROs have many wide-reaching applications. They are used in clinical trials to measure the effect of a medical intervention on one or more concepts that are relevant to the patient and expected to be influenced by the intervention, with PRO data being submitted to regulatory agencies such as the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) to support regulatory decision making. PROs are playing an increasing role in Health Technology Assessment (HTA) decision making, particularly in the UK (National Institute for Health and Care Excellence, NICE), France (Transparency Committee, TC) and Germany (Federal Joint Committee, G-BA). PROs are used widely in real world evidence or observational studies in order to capture the impact of a medical intervention on patients in a real world setting. PROs are also used in clinical practice to inform discussions between the physician and the patient. In the UK NHS, all patients having hip or knee replacements, varicose vein surgery, or groin hernia surgery are invited to fill in PROs. In addition, PROs influence prescribing decisions at the clinician level and influence patient demand for treatments, particularly in the US, where direct-to-consumer advertising is allowed (it is not permitted in Europe).

Taxonomy

There is no single catalogue of all valid and reliable PRO instruments currently in use. Several PRO databases exist, listing several thousand PRO instruments, and new instruments are always being developed. It is therefore important that PRO instruments are selected carefully from the very many that are available.

Generic vs. disease-specific

Generic PRO instruments are those that can be used in the general population and/or across different diseases. This enables comparison in relation to societal norms and between disparate groups of patients. Such measures are usually multi-dimensional, relating to many areas of life. Examples of the most commonly used generic measures are the Medical Outcomes Study Short Form 36 (SF-36) [Ware, 1992] and the EQ-5D [Brooks, 1996]. However, generic measures may be uni-dimensional (e.g. Female Sexual Function Index [Rosen, 2000]) or limited by age group (e.g. PedsQL generic core scale [Varni, 1999]). The advantages and disadvantages of generic and disease-specific PRO instruments are presented in Table I.

Table I. Generic vs. disease-specific PRO instruments

Generic
Advantages:
• Allows for comparison with general population data
• Allows for comparison across different diseases
• Allows collection of more common health domains
Disadvantages:
• May include less relevant items or exclude relevant items
• May be less sensitive to changes within the domains specific to the disease

Disease-specific
Advantages:
• Allows greater sensitivity to the domains most pertinent to the disease
Disadvantages:
• May fail to identify general domains which are relevant to the specific disease
• Cannot be used for comparison to the general population

Disease-specific PRO instruments are those that have been developed for use in specific patient populations. The population may be broadly defined, e.g. the European Organisation for Research and Treatment of Cancer QLQ-C30 (core questionnaire) [Aaronson, 1993] is for use with cancer patients in general. Broad disease-specific measures often also have bolt-on modules where many forms of a disease exist, e.g. the EORTC lung cancer module [Bergman, 1994] and breast cancer module [Sprangers, 1996]. Additionally, measures may be specific to a treatment type, e.g. the Urethral stricture surgery patient-reported outcome measure [Jackson, 2011], or to a disease stage, e.g. the Needs Assessment for Advanced Cancer Patients [Rainbird, 2005]. Disease-specific PRO instruments have the advantage of being tailored to issues specific to a given condition which generic PRO instruments fail to address adequately. This can translate into increased responsiveness to clinically important changes in a patient’s condition.

Examples of instruments

Table II summarizes some examples of instruments, grouped by the concept measured.

Table II. Examples of instruments

Concept: Signs and symptoms
Description: Signs and symptoms of disease (e.g. pain, fatigue and nausea): reports of physical and psychological symptoms or sensations not directly observable and therefore only known by the patient.
Examples: A numeric rating scale to assess pain or fatigue.

Concept: Function
Description: Physical function: impaired physical activity and functioning (e.g. self-care, walking, mobility, sleep, sexual function, disability).
Examples: The physical function domain of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) [Bellamy, 1988].
Description: Psychological and emotional function: positive or negative affect and cognition (e.g. anger, alertness, self-esteem, sense of well-being, distress, coping).
Examples: The Hospital Anxiety and Depression Scale (HADS) [Zigmond, 1983] and the Beck Depression Inventory (BDI) [Beck, 1961].

Concept: Treatment satisfaction
Description: Patient satisfaction: usually an evaluation of treatments, patients’ preferences, health care delivery systems and professionals, patient education programs and medical devices.
Examples: Treatment Satisfaction Questionnaire for Medication (TSQM) [Atkinson, 2004] [Atkinson, 2005].

Concept: Activities of Daily Living (ADLs)
Description: Instruments measuring basic ADLs cover daily activities that are within the individual’s usual environment, such as bathing, showering, bowel/bladder management, dressing, eating and personal hygiene. Also available are instruments that measure Instrumental ADLs, which cover areas of life relating to an individual living independently within their community, such as housework, taking medications, managing money, shopping, and using transport.
Examples: The Knee Outcomes Survey Activities of Daily Living Scale [Irrgang, 1998].

Concept: HRQL
Description: HRQL instruments are a type of PRO which attempt to capture a broader perspective on the well-being of a patient. HRQL can be defined as the patient’s subjective perception of the impact of their disease and its treatment(s) on daily life, physical, psychological and social functioning and well-being [Leidy, 1999]. HRQL measures therefore assess a broad range of different concepts and are typically referred to as being multidimensional in nature.
Examples: The SF-36 [Ware, 1992] and the EQ-5D [Brooks, 1996].

Format

PRO measures can be administered by self-report, by an interviewer or by proxy report. A self-report PRO is completed by the patient directly. When possible, self-report administration is considered the gold standard of PRO data collection because data are collected from the patient directly. Interviewer-administered PROs rely on interviewers to collect PRO data directly from the patient, without interpretation. Proxy report involves someone other than the patient (e.g. a caregiver or healthcare provider) responding on behalf of the patient, as if he or she were the patient. Proxy report can be used to gain the patient’s perspective in situations where self-report or interview is not possible because the population of interest has a limited ability to communicate and/or complete the PRO (e.g. severe symptoms, cognitive impairment, infants). Proxy-reported PROs are at times discouraged because they require a subjective judgment to be made about the patient without patient validation. Self-report and interviewer administration are the two most common modes of PRO administration.

9

Patient Reported Outcome. An overview

Applications of PRO data

PRO data provide essential information in many settings, including clinical trials, regulatory approvals, real-world evidence studies, clinical practice and the provision of decision-making information to prescribers and patients. They are particularly valuable for capturing data that cannot be measured by objective clinical outcomes because the concept of interest, such as pain, fatigue or HRQL, is known only to the patient.

Clinical trials

It is increasingly recognised that clinical trials should incorporate patient-reported measures of health outcome. Valid and reliable PRO instruments are able to provide a standardised, quantifiable measure of treatment benefit, upon which the outcomes of interventions and treatment effect from the patients’ perspective can be judged. The increasing use of PRO instruments in clinical trials reflects the wide range of medical and surgical interventions that are aimed primarily at improving patients’ well-being, HRQL and symptoms of disease.

In some instances PROs provide the best evidence of a treatment’s effectiveness and as such represent primary endpoints in trials. Examples where this is often the case include the experience of pain, gastrointestinal and urological symptoms, or psychological well-being. More frequently PRO instruments provide data for secondary endpoints, which provide supportive evidence of the efficacy of a treatment.

PRO instruments should be included in most pivotal trials to enable us to understand the treatment experience from the patient’s perspective. It is also sensible to include PRO instruments in earlier, phase II trials, because the information gathered from these instruments builds on the knowledge about the investigational product and allows the PRO measurement strategy to be refined prior to the pivotal trials, ensuring that the maximum benefit is obtained from the PRO data in the pivotal trial. Including PRO instruments in earlier trials, if these are correctly designed, can also provide data that allow the measurement properties of the PRO instruments (i.e. validity, reliability, sensitivity to change) to be evaluated and/or confirmed, and help determine how PRO change scores should be interpreted.

PRO instruments can be incorporated into international trials as long as the PRO instrument has been appropriately prepared for use in each target language according to accepted guidelines [Wild, 2005]. Many PRO instruments have multi-language versions readily available, and contacting the instrument developer can quickly establish this. Where a language version is not available, and the instrument developer supports the development of different language versions, a new version can be produced in approximately three months (see Chapter ‘Linguistic Validation’).

Regulatory approvals

Increasingly, regulators who approve new drug products for human use require pharmaceutical companies developing new treatments to demonstrate treatment benefit from the patient’s perspective, as well as traditional data relating to safety, efficacy, and survival rates. Treatment benefits include how a patient feels and functions, and often the best way to capture how a patient feels and functions is by asking the patient directly.

PRO instruments are used in clinical trials to help pharmaceutical companies obtain a label claim from regulators when submitting a new drug for regulatory approval. A product label claim is a statement or implication of treatment benefit that appears in any section of a product’s approved labelling. PRO measures used in clinical trials for the purpose of obtaining regulatory label claims tend to be based on symptoms, signs, or any aspect of functioning which is directly related to disease status (proximal concepts).

The way in which regulatory bodies view PROs, including HRQL measures, varies. The FDA has a preference for instruments which assess symptoms, signs, or an aspect of functioning directly related to disease status. The FDA regards HRQL as too unspecific and broad to be meaningfully assessed in a clinical trial for the purpose of obtaining a label claim. The EMA, however, takes a more inclusive view of what treatment changes may be captured successfully and treats HRQL measures as valid for inclusion in trials in consideration of labelling. Guidance on the appropriate use of PROs is provided by individual regulatory agencies, and careful consideration is needed in the choice and application of PROs.

When PRO measures are used for label claims in regulatory submissions, it is important that the role of the PRO in the endpoint hierarchy is understood. This hierarchy is set out in an endpoint model, which shows the relationship among all the endpoints in the clinical trial (both PRO and non-PRO). The PRO instrument used in the clinical trial should relate to the concept(s) that the new drug is intended to target. For example, if developing a new Non-Steroidal Anti-Inflammatory Drug (NSAID) in rheumatoid arthritis that aims to reduce inflammation and pain, investigators might include validated PRO instruments measuring pain as well as biochemical markers of disease activity and physician examination of the number of swollen joints (Table III).

Table III. Endpoint model for a fictional NSAID treatment for rheumatoid arthritis (RA).

Endpoint: Signs and symptoms of RA
Assessment: American College of Rheumatology 20 (ACR20) response (clinician reported)
Hierarchy: Primary

Endpoint: Swollen joints
Assessment: Number of swollen joints (clinician reported)
Hierarchy: Secondary

Endpoint: Pain
Assessment: Brief Pain Inventory (BPI) 11-point numeric rating scale (patient reported)
Hierarchy: Secondary

Endpoint: HRQL
Assessment: Short-form 36 (patient reported)
Hierarchy: Exploratory

It is worth noting that, specifically in oncology trials, HRQL data are reviewed by regulators in order to contextualise the survival benefit and to understand the quality as well as the quantity of survival, although the HRQL data are not usually reported on the approved product label.
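For readers who manage endpoints programmatically, the endpoint model in Table III can be written down as a small data structure. The sketch below is only one possible representation; the class name, field names and helper functions are hypothetical and are not part of any regulatory guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    concept: str     # what is being measured
    assessment: str  # how it is measured
    reporter: str    # "clinician" or "patient"
    hierarchy: str   # "primary", "secondary" or "exploratory"


# The fictional NSAID endpoint model from Table III.
ENDPOINT_MODEL = [
    Endpoint("Signs and symptoms of RA", "ACR20 response", "clinician", "primary"),
    Endpoint("Swollen joints", "Number of swollen joints", "clinician", "secondary"),
    Endpoint("Pain", "BPI 11-point numeric rating scale", "patient", "secondary"),
    Endpoint("HRQL", "Short-form 36", "patient", "exploratory"),
]


def pro_endpoints(model):
    """Return only the endpoints that are reported by the patient."""
    return [e for e in model if e.reporter == "patient"]


def concepts_at(model, level):
    """Return the concepts sitting at one level of the endpoint hierarchy."""
    return [e.concept for e in model if e.hierarchy == level]
```

Listing `pro_endpoints(ENDPOINT_MODEL)` makes it easy to see where the PROs sit relative to the clinician-reported endpoints, which is the question the endpoint model is meant to answer.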

Real world evidence studies

Real world evidence is becoming an increasingly important element of healthcare decision-making as HTA and reimbursement agencies and payors become more demanding in terms of the relevance of clinical evidence and its connection to the delivery of care in clinical practice.

In real world evidence population studies, PRO instruments can be incorporated to provide information on the health profile and health care needs of the population. Data from such surveys, especially when accompanied by patient demographic data such as socio-economic status, sex, ethnicity, and age, can provide important information on the type of services needed and on whom they should be targeted.

Clinical practice

PRO instruments are routinely administered in clinical settings to assess the effectiveness of interventions and treatment benefit to patients. The information gathered through PRO instruments can guide clinicians in making clinical decisions, and as such PRO instruments offer important supplementary information to guide the care of patients.

Valid and reliable PRO instruments offer an efficient way for patients to provide feedback on how they view their health, which can complement existing clinical evidence such as biological markers of disease activity. PRO instruments can be used to screen for health problems and to monitor the progress of health problems identified as well as the outcomes of any treatment. PRO instruments can also sometimes be used in the selection of treatment options for patients.

Since 2009, the NHS in the UK has routinely used PRO instruments to assess the quality of care delivered to NHS patients from the patient perspective. The programme currently covers four clinical procedures: hip replacement, knee replacement, varicose vein surgery and groin hernia surgery. The PRO instruments are used to calculate the health gains after surgical treatment using pre- and post-operative surveys. National headline data are published every month, with more granular data published every quarter.
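The idea of a health gain calculated from pre- and post-operative surveys can be illustrated with a few lines of code. This is a simplified sketch under stated assumptions: the records are invented, the 0-100 score is a hypothetical scale on which higher is better, and the unadjusted mean used here is not a description of how the NHS actually computes or adjusts its published figures.

```python
from collections import defaultdict
from statistics import mean

# Invented records: (procedure, pre-operative score, post-operative score).
records = [
    ("hip replacement", 40, 80),
    ("hip replacement", 50, 70),
    ("knee replacement", 45, 75),
]


def mean_health_gain(rows):
    """Return the mean (post - pre) score difference for each procedure."""
    gains = defaultdict(list)
    for procedure, pre, post in rows:
        gains[procedure].append(post - pre)
    return {procedure: mean(values) for procedure, values in gains.items()}


gain_by_procedure = mean_health_gain(records)
```

A positive value indicates that, on average, patients reported better health after surgery than before it.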

Health technology assessment

There is increasing recognition of the role that economic evaluation plays in reimbursement decision making, and PRO instruments can have significant value in supporting such efforts. Reimbursement agencies are now seeking value propositions that go beyond traditional safety and clinical efficacy messages. Due to the rising costs of health care and the increased demand to demonstrate value for money, pharmaceutical companies are now charged with generating evidence on the patient’s perspective of treatment.

When investigators need evidence for the overall value of a health care intervention in a way that permits economic comparisons with other interventions, either within or across treatment areas, outcomes are often expressed in the form of utilities. Health utilities are a measure of the preference for (or satisfaction with) a particular state of health, and PRO instruments such as the EQ-5D [Brooks, 1996] allow utilities to be derived from patients. The most widely known form of summary value of treatments for comparative purposes is the Quality Adjusted Life Year (QALY). Within the QALY methodology the outcomes of treatment, including quality and quantity of life gains, are expressed as a single index that can be used to inform decisions on the allocation of health care resources. Many existing HRQL scales are not preference-based and so cannot be used in cost-effectiveness analyses to estimate a cost per QALY. A potential solution to this lack of generic preference-based measures is a method called “mapping” or “cross walking”, which can potentially be used to estimate cost per QALY when a preference-based measure has not been used in a clinical trial or study.

Non-utility PRO data are increasingly valued by payors, who recognise that such data directly reflect the patient experience of a disease and its treatment, contribute to a more holistic understanding of the potential value of a new product, and may also provide the basis for differentiation between products. In oncology particularly, payors are increasingly interested in the quality as well as the duration of survival.

Recent benefit assessments by the national HTA agency in Germany, the G-BA (Federal Joint Committee), demonstrate how PRO data can support the value proposition, and how lack of such data can compromise pricing and reimbursement outcomes. In the assessment of Xalkori (crizotinib) in non-small cell lung cancer (NSCLC), the manufacturer provided additional data during the hearing procedure with the G-BA that allowed the evaluation of patient-reported symptoms and HRQL. The assessment result was subsequently increased from ‘no additional benefit proven’ to a ‘hint of significant additional benefit’ versus chemotherapy in one patient subgroup. By contrast, in the assessment of Stivarga (regorafenib) in metastatic colorectal cancer, the G-BA and the GKV-Spitzenverband (the key pricing agency in Germany) considered that the efficacy and survival benefit could not be put into context in the absence of morbidity and HRQL data. The benefit assessment resolution was reduced to ‘hint of minor additional benefit’, and was to be re-evaluated after 1½ years.

The G-BA is not the only payor to expect non-utility PRO data. Other leading HTA agencies are increasingly criticising the lack, or poor quality or relevance, of PRO data as they seek to determine the true value of a drug and to put survival, efficacy and safety in a broader context. For example, the Transparency Commission in France is clear that it will not accept a survival benefit that is to the detriment of HRQL. Although no formal HTA agencies exist in the US, WellPoint, one of the largest US health benefits companies, issued formulary guidance to pharmaceutical companies requesting more detailed information on the effectiveness of new compounds in improving patients’ HRQL [Steike, 2008].
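The QALY arithmetic referred to earlier in this section can be made concrete with a short worked sketch. It assumes the usual convention that a utility of 1 represents full health and 0 represents death, treats each health state as having a constant utility for its duration, and ignores discounting; the function names, utilities and costs are invented for illustration and do not come from any real assessment.

```python
def qalys(health_states):
    """Sum utility x duration (in years) over a sequence of health states."""
    return sum(utility * years for utility, years in health_states)


def cost_per_qaly(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost divided by incremental QALYs gained."""
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("no QALY difference between the interventions")
    return (cost_new - cost_old) / delta_qaly


# Invented example: (utility, years spent in that state).
new_treatment = [(0.75, 2.0), (0.5, 2.0)]  # 1.5 + 1.0 = 2.5 QALYs
comparator = [(0.5, 3.0)]                  # 1.5 QALYs

incremental_cost_per_qaly = cost_per_qaly(
    cost_new=30000, qaly_new=qalys(new_treatment),
    cost_old=10000, qaly_old=qalys(comparator),
)
```

In this invented case the new treatment yields one additional QALY for an additional cost of 20,000 currency units, which is the kind of single-index figure that reimbursement agencies can compare across interventions.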

Prescribers and patients

PRO data captured by pharmaceutical companies are of interest to prescribers and to patients themselves. Reporting the results of PRO data from clinical trials is an effective way of communicating treatment benefits, improved tolerability, treatment satisfaction and patient experience to prescribers and clinicians. Publishing in peer-reviewed journals, or disseminating the results from clinical trials at industry and medical conferences, enables access to a wider audience, including prescribing physicians, patients and patient advocacy groups. This is particularly important when physicians can choose between several drugs with similar efficacy and safety profiles, given that patients ultimately decide whether or not to take prescribed medication and that drug information is shared within patient groups. Inclusion of PRO data may create product differentiation, enabling prescribers to confidently prescribe one approved medicine over another, and may alleviate patients’ concerns about whether the drug they have been prescribed is actually going to help them, thus perhaps improving adherence rates.

Additionally, published information including PRO data is of value to patient advocacy groups. With published PRO data, patient advocacy groups can be powerful allies in lobbying payors and politicians to gain access to newer, and often more expensive, medicines for patients. The decision by the National Institute for Health and Care Excellence (NICE) to approve Herceptin for early stage breast cancer as a treatment to be offered by the UK National Health Service (NHS) is widely known to have been influenced by patient and patient advocacy pressure [Berg, 2006].


Collecting PRO data to support product evaluation

Once the decision has been made that capturing the patient’s voice is going to be an important part of the research being planned, the next step is to consider how PROs are going to be effectively incorporated into that research. Figure 3 sets out the key steps for ensuring PROs are successfully incorporated. In this chapter we will look closely at boxes 1-3; the final chapter will examine boxes 4-5.

Figure 3. Steps to successful PRO measurement.

Develop a PRO strategy considering issues important to patients, the target product profile and stakeholder requirements

The PRO strategy needs to be well defined, with a clear understanding of the link between disease and treatment outcomes. Developing a PRO measurement strategy should begin with a clear understanding of the disease and the outcomes relevant to patients. This can be best understood through engagement with patients, patient advocates, informal caregivers, clinicians, and nurses. Any PRO strategy should be developed alongside the other intended endpoints to ensure a comprehensive measurement strategy for the endpoints of interest, and framing this in the context of an endpoint model is useful.

Investigators should understand the investigational product: the mechanisms by which the product works, the likely impact this might have on the disease, and how this might benefit the patient. This product knowledge needs to be evaluated alongside the understanding of the disease and the outcomes relevant to patients in order to identify concepts that would be relevant to measure in any evaluation of the product in patients.

Alongside this, the requirements of the various stakeholders (regulators, payors, clinicians and patients) need to be understood; these are set out in Chapter ‘Applications of PRO data’. Unfortunately, complexity is added because the various stakeholders are not always aligned in their preferences. For example, the FDA has shown a preference for purpose-designed PRO instruments and has no concerns that the pharmaceutical company developing and evaluating the treatment may also be spearheading and funding the development of a PRO instrument to evaluate that treatment. By contrast, many payors express concern over these bespoke instruments and instead prefer to see PRO data provided on established PRO instruments that are well known to them. Navigating these multiple requirements early helps to ensure a comprehensive PRO strategy that will be well received by the various stakeholders.

Select validated PRO instruments

It is essential that the selected PRO instruments have been appropriately validated for use in the target population. The PRO instrument needs to measure concepts that are relevant to the patient with the disease of interest, and the statistical measurement properties of the PRO instrument need to have been established in patients with the target indication [Nixon, 2014]. We will cover the following characteristics of PRO instruments:
-- Reliability.
-- Validity.
-- Ability to detect change (responsiveness).
-- Recall period.
-- Patient burden.

Reliability

A PRO instrument needs to produce results that are reproducible and internally consistent, and to be as free from measurement error as possible. As the measurement error of an instrument increases, so does the sample size required to obtain precise estimates of the effects of an intervention. Reliability estimates of 0.7-0.9 are recommended.

Reproducibility

For a PRO instrument to be reproducible, it has to be able to provide the same score on repeated administrations when respondents have not changed. This is assessed by test-retest reliability. There is no exact agreement about the length of time between administrations; this depends on the stability of the concept being measured. Consider, for example, pain in rheumatoid arthritis, which is more variable than a person’s health-related quality of life. In practice, test-retest reliability is usually evaluated over an interval of 2 to 14 days, and is usually evaluated statistically using the intra-class correlation coefficient.

Other types of reproducibility, which can be evaluated for PRO instruments that are interviewer led rather than self-completed by the patient, are intra-interviewer reliability and inter-interviewer reliability. Intra-interviewer reliability evaluates the stability of scores when a PRO instrument is administered by the same interviewer at two different time points. Inter-interviewer reliability is the agreement among responses when the PRO instrument is administered by two or more different interviewers.

Internal consistency

An evaluation of internal consistency reveals how well the items within a scale measure a single underlying concept. This is usually assessed using Cronbach’s alpha, which measures the overall correlation between items within a scale.
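The two reliability statistics named above can be illustrated with a short sketch. This is a minimal standard-library Python example, not a validated statistical routine. The function names and data layout are hypothetical, and the ICC form used here (two-way random effects, absolute agreement, single measurement, often written ICC(2,1)) is an assumption, since the text does not say which form is intended.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of respondents, each a list of item scores
    (same number of items for every respondent, no missing values).
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def icc_2_1(scores):
    """Intra-class correlation, assumed form ICC(2,1).

    scores: list of respondents, each a list with one score per
    administration (e.g. [test, retest]).
    """
    n = len(scores)
    k = len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Under these assumptions, a retest that shifts every respondent's score by the same amount lowers the ICC, because the absolute-agreement form penalises systematic differences between administrations.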
Very high levels of correlation between items may indicate redundancy or that the items are measuring a very narrow aspect of the construct.

Validity


It is essential to know that the PRO instrument is able to measure what it purports to measure; this is referred to as validity. There are many different types of validity, and the key ones are addressed below.


Content validity

Content validity is the extent to which an instrument measures the concept of interest. Establishing the content validity of an instrument for its intended use is of critical importance. Evidence for content validity is most frequently obtained from qualitative studies demonstrating that the items and domains of an instrument are appropriate and comprehensive relative to its intended measurement concept, population, and use. Content validity is specific to the population, condition, and treatment to be studied. For example, a PRO instrument measuring pain may be shown to be content valid for rheumatoid arthritis patients, but this does not mean that the same instrument will also be content valid for bone pain as a result of multiple myeloma.

Content validity can be demonstrated through discussion with clinical experts, literature review and interviews with the target patient population. As well as the item and domain content of the PRO instrument, it is also important to ensure that an instrument has content validity in terms of instrument format, response options, and recall period. There are well documented research methods for establishing the content validity of a PRO instrument (see for example [Patrick, 2011a; Patrick, 2011b; Kerr, 2010]). There is also a growing movement towards mixed methods for establishing content validity, involving a combination of qualitative and quantitative research methods.

Construct validity

Construct validity is tested by looking at evidence that scores on the PRO instrument of interest conform to a priori hypotheses concerning logical relationships that are expected to exist with related measures. These related measures could be, for example, other PRO instruments or clinical measures.

Convergent validity is one type of construct validity, and is the extent to which a PRO instrument relates to other measures based on theoretical content, or the expected relationship with a chosen variable. For example, it would be quite reasonable to hypothesise that a PRO instrument that measures patient reported pain in rheumatoid arthritis would be related to an already validated instrument for measuring physical function in patients with rheumatoid arthritis. This would be evaluated statistically by examining the strength of correlation between the PRO instrument measuring pain and the PRO instrument measuring physical function.

Divergent validity is the reverse of convergent validity, i.e. the extent to which a PRO instrument does not relate to other measures based on theoretical content. Divergent validity is rarely evaluated in PRO research.

Known groups validity is a form of construct validity. In known groups validity, PRO instrument scores from groups of patients that differ by a known indicator, for example disease severity, are compared. This is evaluated statistically by grouping patients by the key indicator, and comparing the PRO scores of the different sub-groups of patients to evaluate whether they are statistically different in the way that was hypothesised.
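The two statistical checks described above, correlation for convergent validity and a between-group comparison for known groups validity, can be sketched as follows. This is an illustrative Python example only. The function names are hypothetical, and the use of Pearson's correlation and Welch's t statistic is an assumption, since the text does not name specific tests. A real analysis would also derive p-values or confidence intervals with a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two sets of paired scores,
    e.g. a pain score and a physical function score per patient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def welch_t(group_a, group_b):
    """Welch's t statistic comparing mean PRO scores of two known
    groups (e.g. mild vs. severe disease). Returns the statistic only."""
    var_a = stdev(group_a) ** 2 / len(group_a)
    var_b = stdev(group_b) ** 2 / len(group_b)
    return (mean(group_a) - mean(group_b)) / sqrt(var_a + var_b)
```

The direction and approximate size of the expected correlation or group difference should be stated before the data are examined, in keeping with the a priori hypotheses the text calls for.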



Ability to detect change (responsiveness)

This evaluation is undertaken to determine whether the PRO instrument can capture important changes in health. It is usually evaluated by looking at changes in PRO instrument scores for groups of patients whose health is known to have changed, for example following a health intervention of known efficacy. Patients may be asked how their current health compares to a previous point in time. There is no single agreed method for assessing the ability of a PRO instrument to detect change, and various statistical techniques are used to quantify this property, although most usually it is tested using the effect size statistic.

Recall period

PROs differ in the time period that patients are asked to consider when completing the instrument. The variability, duration, frequency and intensity of the concept being measured influence what the most appropriate recall period should be, and this should be taken into consideration when selecting a PRO instrument for use in research. PRO instruments require patients to rely on their memory, and the person completing the PRO instrument needs to be able to accurately recall the information requested. This is particularly an issue if the patient is being asked to recall information over a long time period, compare their current state with an earlier period, or average their response over a period of time. For these reasons, there is increasing preference for PRO instruments with a short recall period that ask patients to describe their current or recent state, and for this to be captured in the form of a daily diary.

Patient burden


Although PRO instruments should be considered an important component of the study objectives, they can be lengthy and it is important not to burden the patients, who may be very unwell, with unnecessary activities. It is important to ensure that the questions they are completing appear reasonable and relevant. Careful consideration should be given to the number and length of PRO instruments participants will be expected to complete. Guidelines suggest limiting data collection so that the average patient can complete the process ideally within 20 minutes at baseline, and 10-15 minutes at follow-up time points [Basch, 2012].

Further, careful consideration should be given to the particulars of the patient population being investigated and whether the PRO instruments are appropriate or whether an alternative source of data might be more appropriate and valid, for example in studies involving young children or patients with cognitive impairment.
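The effect size statistic mentioned under ‘Ability to detect change’ can be illustrated with a small sketch. The definition used here, mean change from baseline divided by the standard deviation of baseline scores, is one common form and is an assumption; the text does not specify a formula. The function name is hypothetical.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Effect size for responsiveness (assumed definition):
    mean change from baseline / SD of baseline scores.

    baseline, follow_up: paired score lists, one entry per patient.
    """
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)
```

Other indices, such as dividing by the standard deviation of the change scores, would give different values for the same data, so the chosen definition should be stated when results are reported.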


Ensure the trial or study design is appropriate for PRO collection

A randomised and blinded clinical trial design is preferable because patients who know they are on an active treatment may overstate their treatment benefit, and patients who know they are not receiving active treatment may underreport any improvement they experience. The quality of the clinical trial can be improved by specifying in the study protocol procedures for minimising inconsistencies in trial conduct related to PRO data collection. It is recommended that PRO instruments are completed by patients in the same order, using standardised instructions, prior to undertaking any other visit activity including receiving treatment.

The goal of assessment is usually to understand how the patient experience changes from baseline to future time points of interest. PRO data should be captured at baseline and at selected follow-up time points which are the same for every patient. PRO data should be collected as frequently as necessary to meet research objectives but without overburdening patients. Consider electronic data collection approaches for the PRO data (see Chapter ‘ePRO Technology Overview’).

It is important to employ methods that minimise missing PRO data, and these should be documented in the study protocol. These can include electronic data capture, training of site staff and patients, and monitoring of adherence to PRO data completion.
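Monitoring completion adherence, as recommended above, can be as simple as tracking the proportion of expected PRO assessments actually received at each time point. The sketch below is illustrative only: the function names, the visit labels and the 80% follow-up threshold are hypothetical and are not taken from any guideline.

```python
def completion_rates(expected, received):
    """Proportion of expected PRO assessments received per time point.

    expected, received: dicts mapping a time point label to a count.
    """
    rates = {}
    for visit, n_expected in expected.items():
        if n_expected > 0:
            rates[visit] = received.get(visit, 0) / n_expected
    return rates


def visits_needing_follow_up(rates, threshold=0.8):
    """Time points whose completion rate falls below the threshold
    (the 0.8 default is an arbitrary placeholder)."""
    return sorted(visit for visit, rate in rates.items() if rate < threshold)


# Hypothetical counts for three scheduled assessments.
expected = {"baseline": 50, "week 4": 48, "week 12": 45}
received = {"baseline": 50, "week 4": 43, "week 12": 30}
flagged = visits_needing_follow_up(completion_rates(expected, received))
```

Reviewing such figures during the study, rather than after database lock, leaves time to retrain sites or remind patients.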



Linguistic Validation

For a PRO instrument to be considered valid for use with a population who speak a different language from the one in which it was originally developed, it needs to go through a specific translation process referred to as linguistic validation. Linguistic validation is a process of translating PROs that ensures that the translated PRO has conceptual and semantic equivalence. Conceptual equivalence refers to whether the same concept exists in another language or culture. Semantic equivalence refers to whether the same linguistic expression exists in another language or culture. Where conceptual and semantic equivalence are both present, the items in a PRO are considered to be linguistically valid. Linguistic validation is distinct from translation in that it refers to a specific methodology, which usually includes testing the translation with native language speakers from the target country.

The Need for Linguistically Validated PROs


It is becoming increasingly common for PROs to be used in multi-national studies. This is due in part to the development of global perspectives on health and health care. Clinical trials are increasingly international, resulting in the need to aggregate data across sample populations from different countries. Provision and regulation of healthcare, although still primarily a national responsibility, are affected by transnational agreements on provider licensing and certification, standards of care and applications of medical technology. The approval process and market for treatments, particularly pharmaceuticals, are becoming more international.

Interest in PRO measures that can be used internationally is furthered by the desire to strengthen causal inference in the evaluation of treatments. As diagnostic criteria become more international, treatment is becoming more standardized in different cultures, thus prompting attempts to compare outcomes of treatment in different cultural settings, particularly using randomized clinical trials. Causal inference and external validity may be strengthened if the same effect is found cross-culturally.

The majority of PROs are developed in Europe and the US but are required for use across Europe, Asia, South America and Africa, and therefore need to be translated for use in those countries and cultures.


It is essential that careful attention is paid to the way in which PROs are translated. This is particularly challenging because of the inherent potential ambiguity of language generally and the possible cultural sensitivity of some questions and concepts in particular. Correct interpretation of every part of the PRO, not just the items, is vital for the validity to be maintained across languages.

Methodology

There has been much discussion over the years regarding the optimal methodology for the linguistic validation of PROs. In 2008, Acquadro et al. conducted a literature review of methods used in the translation and linguistic validation of PROs [Acquadro, 2008]. They identified 17 sets of methods, most recommending a multi-step approach involving a centralised review process. However, each group proposes its own sequence of translation events and weights each step differently.

The standard approach

One of the published sets of methods identified in the search was Wild et al. [Wild, 2005], which was a collaboration between practitioners in the field to develop a ‘best practice’ methodology, based on requirements for accuracy and cultural relevance within the context of real world experience. This paper outlines the recommended methodology of the International Society of Pharmacoeconomics and Outcomes Research (ISPOR) and has subsequently been adopted as the standard approach by those sponsoring multi-national clinical studies.

The process usually takes between 2 and 3 months and should be conducted by individuals and companies who are experienced specifically in linguistic validation. The steps are presented below:

1. Preparation. Once the PRO has been selected and before any translation work commences, it is essential that permission to translate and use the instrument be sought from the instrument developer or copyright holder. It is also advisable that an explanation of all the concepts included in the measure is developed, in conjunction with the developer if possible, which can be made available throughout the translation process to reduce any risk of misinterpretation by the translators.
2. Forward Translation. The source (usually US or UK English) PRO is translated into the target language independently by two native speakers of that language. One of these acts as the ‘in-country investigator’ and should be resident in the target country. The second forward translator does not necessarily need to reside in their native country, but it is preferable.
3. Reconciled translation. The two forward translations are then reconciled to a ‘best of both’ translation. This could be carried out by the in-country investigator, but it can also be carried out by a third translator.
4. Backward translation. Two independent back translations should be carried out by native speakers of the original language who are fluent in the target language. Ideally they should reside in the source language country, but this is not essential. The back translators should have no previous knowledge of the PRO.
5. Back translation review. The two back translations should be compared with the source text to determine points of difference. Any discrepancies are highlighted and revised until the translation is deemed to match the source. This review needs to be carried out by someone who is familiar with the PRO and who can compare the English versions and identify any differences.
6. Developer/clinician review. For some PROs, the developer may require that they review the translation. Any changes suggested by the developer should be discussed with the in-country investigator before finalizing the translation. In addition, some developers may require that a clinician is also asked to review the translation to ensure that the language used is appropriate for the target patient population and understandable by all.
7. Harmonisation. Harmonisation is the process by which conceptual equivalence across different translations is checked, which helps to ensure inter-translation validity and allow reliable pooling of data from trials and other multi-national studies. It is not always conducted as a specific step in the process. Where it is considered to be necessary, for example at the request of the developer, it is possible to hold a harmonisation meeting to which in-country investigators or back translators are invited to represent each language.
8. Cognitive Debriefing. Cognitive debriefing interviews are a key element of the linguistic validation process. The translated instrument is tested on a number of respondents, who ideally should be part of the target patient population. For example, if the PRO is intended to be used in a diabetic population in Denmark, Danish-speaking diabetics residing in Denmark will be interviewed. Generally, 5 respondents are considered to be adequate. The patients are asked to complete the translated PRO and are interviewed to determine their level of understanding and ability to complete it. The data from these interviews can be used as evidence of the content validity of the translation. The process also ensures multi-lingual harmonisation of the translations, by ensuring that the PRO is understood in the same way by target populations across all language and cultural groups.
9. Cognitive Debriefing review. The cognitive debriefing results are reviewed and any revisions to the original translation should be agreed with the in-country investigator.


10. Proofreading. As a final quality control step, it is recommended that two proofreadings are carried out in order to catch and correct any remaining minor errors. It is recommended that the proofreaders are native speakers of the target language who are not familiar with the questionnaire and have not previously been involved in the translation process.

Alternative Approaches

Some researchers believe that, rather than developing a PRO in one language and translating it into target languages, it would be better to develop measures concurrently in each culture [McKenna, 2005]. This approach has been employed on a few occasions; for example, the World Health Organisation worked collaboratively around the world to develop the WHOQOL, a PRO which can be used in multiple settings with all patients and the general population [Saxena, 1997]. The WHOQOL was developed over a period of 12 years and is now available in 40 countries and most majority languages [Skevington, 2004]. A shorter, hybrid approach was taken with an obesity PRO measure (the OWLQOL), which was developed partially in the US and partially in Europe, taking a total of 6 months and resulting in a measure which was culturally valid in 6 cultures [Niero, 2002].

However, given time and financial constraints as well as the availability of a large number of well developed and validated PROs, it continues to be more common for a PRO to be developed in English and then translated into a variety of other languages using the methods described above.

Same Language Different Country

A language can differ in usage, vocabulary, and even grammar depending on where it is spoken, to the extent that a translation developed for use in a particular country may not be suitable for use in other countries. Common examples of languages spoken in multiple countries are English, French, Spanish, Chinese and Arabic. When a translation of the same language is required for more than one country, a decision needs to be made as to the optimal methodology for achieving this, depending on the required end result. There are at least three options to consider:

1. The first option is to produce entirely separate translations simultaneously for each target country. This is advantageous in terms of timing but may lead to texts that are more dissimilar than is necessary.
2. The second approach is to produce a translation in one country and then to review the translation for use in each additional target country. This methodology reduces the differences between the finished translations, but takes longer.
3. The third approach is to work with translators from each country to develop a translation which works in all countries, a global translation. It is not always possible or feasible to produce a global translation, and the feasibility depends on the content of the PRO and the languages and countries involved.

There is no consensus on whether one of these approaches is superior to another, and the recommendation by Wild et al. [Wild, 2008] is that each study and situation should be evaluated on a case-by-case basis.

Measurement Equivalence

Most of the literature focuses on the quality of the individual language versions to ensure semantic and conceptual equivalence, but does not extend to addressing the issue of measurement equivalence.

A wide range of techniques have been and can be used to assess measurement equivalence of translations, including classical test theory, factor analysis, structural equation modelling (SEM) and Differential Item Functioning (DIF). Each of these techniques requires large samples of patients from each language group, which can often be logistically difficult to achieve. No matter which method is used, there is no guidance about what is considered to be an acceptable level for data pooling or what should be done if differences are found. Aside from some published studies using generic PROs [Wild, 2008], it is currently unusual for this type of data to be published and it is not routinely required by regulatory bodies.

Regulators


The FDA has taken an increasing interest in how PROs are translated. They require evidence to support claims of linguistic equivalence and content validity between the source and translated text when a PRO has been used to collect data for making a product label claim [Eremenco, 2005]. They have accepted the methodology presented above and described in Wild et al. [Wild, 2005] as the standard approach. The EMA clearly voiced their concerns a number of years ago by asking the question: “Are HRQL instruments internationally validated?” [Chassagny, 2001]. Other bodies, e.g. payors, have not outlined their own criteria for translation and linguistic validation of PROs but will follow the lead of the FDA and EMA.

ePRO Technology Overview

Introduction

The evolution of technology and the rapid global penetration of mobile devices within all potential patient populations drive the adoption rate of the use of electronic clinical outcome assessments (eCOA) in clinical trials. Different technologies have been deployed in many projects during the last 20 years and have their individual pros and cons. Most electronic systems support patients in complying with the protocol by enforcing strict time windows for diaries completed at home and by providing audible reminders. In this chapter we will introduce commonly used technologies and their individual benefits and challenges.
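To make the idea of an enforced time window concrete, the sketch below shows one way a diary application might decide whether an entry is allowed. It is a hypothetical Python illustration, not the logic of any particular eCOA product; the function name and the example window are assumptions.

```python
from datetime import datetime, time


def within_window(entry_time, window_start, window_end):
    """Return True if a diary entry falls inside the allowed daily window.

    entry_time: datetime of the attempted entry.
    window_start, window_end: datetime.time bounds (inclusive).
    Windows that span midnight (start later than end) are supported.
    """
    t = entry_time.time()
    if window_start <= window_end:
        return window_start <= t <= window_end
    # Window wraps past midnight, e.g. 22:00 to 02:00.
    return t >= window_start or t <= window_end


# Hypothetical evening diary that may be completed between 18:00 and 23:00.
allowed = within_window(datetime(2015, 6, 1, 20, 30), time(18, 0), time(23, 0))
```

A real system would also need to handle time zones, device clock changes and the scheduling of reminders, none of which are shown here.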

Technologies

Handheld Device

This technology comprises personal digital assistants (PDAs, e.g. iPod Touch, iPad mini), consumer smart phones (e.g. iPhone®, Blackberry®, Samsung Galaxy®) and commercial B2B devices (e.g. Bluebird Pideon®). Handheld devices are mostly used for electronic patient diaries. The main advantages include the ability to alert patients and drive compliance through the use of alarms and time windows, and the option to answer questions in an offline mode (not being connected to the internet).

Tablet

Tablets come in a variety of different sizes and are mainly used to capture questionnaire data from patients or clinicians during site visits. This technology also allows for offline data entry and, due to the size of the screens, for the display and capture of wordier questionnaires as well as images (e.g. a body diagram).



Web/IWRS

Web-based systems, like some of the common electronic data capture (EDC) systems, allow questionnaires to be presented in a browser on a laptop, tablet or computer. These systems only work when the device is connected to the internet. This is probably the main drawback of these systems and has been a key reason for their limited adoption, despite their relatively low cost. The representation of the questionnaires will look different depending on the operating system (e.g. Apple iOS®, Microsoft Windows®, Android®) and the browser (e.g. Internet Explorer®, Safari®, Chrome®, Firefox®). There are still some open scientific and regulatory questions around the validity of these instrument versions, which will be addressed in the near future.

Mobile Web

Mobile web is different from the standard web/IWRS in that the presentation can be specific to mobile devices such as mini tablets or smartphones. The need to be online is again a major drawback for this technology, and therefore mobile web is currently not commonly recommended for use in projects with patient diaries.

Interactive Voice Response Systems (IVRS)

Interactive Voice Response Systems (IVRS) allow patients to use any regular phone to dial into a (toll-free) number to answer questions via touchtone keypads. The main advantage of this technology is that no specific hardware is required: almost any phone will do, and no devices need to be purchased, provisioned and shipped to sites. For illiterate patient populations this may also be the most suitable solution. Key drawbacks are the relatively high validation effort currently required after a migration from paper and the fact that lengthy instruments may be too burdensome for patients. IVRS is used regularly for short daily diaries.

Digital Pen


The digital pen is a hybrid solution that uses the original paper questionnaires, printed on special paper (with micro dots), and an electronic system (infrared camera) to record the answers digitally. Therefore there is no need to re-validate the instrument for these projects. However, the digital pen does not allow for edit checks while the patient uses it, and therefore the error rate of the questionnaire data is similar to that seen with other paper-based systems. The cost of this technology is also often higher than that of a tablet solution.


Patient & Site Staff Acceptance

Capturing PRO data electronically (ePRO) is not a new concept: it has been done as far back as 1987, and results, challenges and successes have been widely published. The current literature shows that patients, in most cases, prefer the electronic over the paper versions and indicates that patients spend less time completing the electronic versions. This is important to consider when evaluating patient burden. One of the key advantages that patients highlight during the use of an electronic diary is the ability to incorporate audible alarms to remind them to complete their response, a support which is not available with paper questionnaires.

The use of ePRO has also been shown to be accepted in essentially all patient populations. Current published data provide no evidence that, for example, elderly patients cannot use eDiaries. Site personnel have also indicated that a well-designed ePRO system is advantageous and reduces their burden of having to transcribe the PRO data into an EDC system.

Regulatory Perspective

The use of ePRO in clinical trials is regulated by the individual authorities such as FDA (US) and EMA (EU). Guidelines on the use of PROs in clinical studies have been published. In addition to the regulators’ guidelines, there are professional groups and institutions that have published best practices on the use of eCOA (e.g. ISPOR, ePRO Consortium). Here is a list of guidelines and other documents that regulate and support the use of eCOA in clinical trials.

Guidelines

US Food and Drug Administration (FDA)
-- Guidance for Industry - Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims (Section 4F)
   http://www.fda.gov/downloads/Drugs/Guidances/UCM193282.pdf
-- Guidance for Industry - Qualification Process for Drug Development Tools FDA Draft Guidance for Electronic Source Data, FDA Guidance
   http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM230597.pdf
-- Guidance for Industry - Electronic Source Documentation in Clinical Investigations
   http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM328691.pdf



-- Draft Guidance for Industry and Food and Drug Administration Staff - Mobile Medical Applications
   http://www.fda.gov/downloads/MedicalDevices/DeviceRegulationandGuidance/GuidanceDoGuidanc/UCM263366.pdf
-- Guidance for Industry - Computerized Systems Used in Clinical Investigations
   http://www.fda.gov/ohrms/dockets/98fr/04d-0440-gdl0002.pdf

EMA
-- Reflection paper on expectations for electronic source data and data transcribed to electronic data collection tools in clinical trials
   http://www.ema.europa.eu/docs/en_GB/document_library/Regulatory_and_procedural_guideline/2010/08/WC500095754.pdf
-- Reflection paper on the regulatory guidance for the use of health-related quality of life (HRQL) measures in the evaluation of medicinal products
   http://www.ispor.org/workpaper/emea-hrql-guidance.pdf

Best Practices

ISPOR Taskforce Reports
http://www.ispor.org/workpaper/practices_index.asp
-- Recommendations on Evidence Needed to Support Measurement Equivalence between Electronic and Paper-Based Patient-Reported Outcome (PRO) Measures
   http://www.ispor.org/workpaper/patient_reported_outcomes/Coons.pdf
-- Principles of Good Practice for the Translation and Cultural Adaptation Process for Patient-Reported Outcomes (PRO) Measures: Report of the ISPOR Task Force for Translation and Cultural Adaptation
   http://www.ispor.org/workpaper/patient_reported_outcomes/WildHouchin.pdf
-- Validation of Electronic Systems to Collect Patient-Reported Outcome (PRO) Data - Recommendations for Clinical Trial Teams: Report of the ISPOR ePRO Systems Validation Good Research Practices Task Force
   http://www.ispor.org/sigs/eprosystemvalidationsg.asp

ePRO Consortium Recommendations


http://c-path.org
-- Best Practices for the Electronic Implementation of Clinical Outcome Assessment Response Scale Options
   http://c-path.org/wp-content/uploads/2013/10/best-practices-for-the-electronic-implementation-of-COA-response-scales.pdf


-- Migration Principles for Existing Clinical Outcome Assessment (COA) Instruments
   http://c-path.org/pdf/ePRO-MigrationPrinciplesforExistingCOAInstruments.pdf
-- Principles for the Development of New Clinical Outcome Assessments
   http://c-path.org/pdf/ePRO-DevelopmentPrinciplesNewClinicalOutcomeAssessmentInstruments.pdf

Migration & Validation

The FDA Guideline from 2009 introduced the need to validate questionnaires and their electronic versions. The concern was that a migration from paper to any electronic platform may introduce a change in how patients would answer the questions. Any migration from paper to electronic will require some change to the original paper version. At a minimum, the number of items (questions) per screen will be different from the layout on paper. Most mobile devices (handhelds) only allow one item to be placed per screen.

Current best practice [Coons, 2009] recommends that evidence of equivalence between the original (often paper) and electronic versions of an instrument should be sought. The ISPOR report suggests three levels of change and three different study types to establish equivalence between the original and the modified version. The migration from paper to electronic (screen-based) versions is considered a minor change, and in many cases equivalence can be established via a cognitive debriefing and usability testing study with a small sample of patients from the target population (5-10 patients). Where more significant changes have been made during migration, such as paper to IVRS, a larger, quantitative equivalence study may be necessary in addition to a cognitive debriefing and usability study. If the instrument has been modified beyond these scenarios and item stems or item responses were adapted, the new version will need to be treated as a new questionnaire and all development processes described in the earlier chapters will need to be considered.

Some researchers have opted to conduct a feasibility project with the new or migrated daily instrument (Diary). During this project a small sample of patients (about 20) will be asked to use the device with the questionnaire at home. This research will then show whether patients can comply with the Diary questions and schedule during their daily life. It does not test the validity of the instrument, but verifies that the study protocol can be adhered to and complied with.

Analysing and reporting PRO data

PRO data should be handled, analysed and interpreted to the same high standard as other trial data. Once the principles for PRO data analysis have been set out in the study protocol, the statistical analysis plan needs to provide the detail for analysis of PRO data alongside the other trial endpoints. The analyses set out in the statistical analysis plan need to ensure that the objectives of the study are addressed. This section details the PRO-specific analysis considerations that need to be documented in the statistical analysis plan and provides guidelines for reporting PRO results.

Statistical analysis plan

The statistical analysis plan should be finalised prior to database lock. PRO data analysis is often unplanned or conducted post-hoc; such analysis is considered exploratory and is useful only for generating hypotheses for further study.

PRO instrument scoring

The PRO instrument score provides a number derived from a patient’s responses to items in the instrument, and usually results in domain and/or total scores. The score should be based on a validated scoring algorithm which should be documented in the user manual provided with the PRO instrument. For example, the SF‑36 [Ware, 1992] is scored in eight HRQL domains, with these scores also being used to calculate overall mental and physical health components. Score calculation, and imputation of any missing data, should be performed according to the developers’ instructions as set out in the user manual.
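
As an illustration of what a documented scoring algorithm might look like in code, the sketch below scores a single domain on a 0-100 scale. Everything in it is hypothetical: the function name `score_domain`, the 1-5 item range and the rule that a domain is scored only when no more than half of its items are missing are assumptions made for the example, and are not taken from the SF-36 or any other user manual. In a real study the developers' instructions always take precedence.

```python
from typing import Optional, Sequence


def score_domain(responses: Sequence[Optional[int]],
                 item_min: int = 1,
                 item_max: int = 5,
                 max_missing_fraction: float = 0.5) -> Optional[float]:
    """Return a 0-100 domain score, or None if the domain cannot be scored.

    Hypothetical rule (not from any real user manual): if no more than
    `max_missing_fraction` of the items are missing (None), the missing
    items are treated as equal to the mean of the answered items;
    otherwise the domain score is itself set to missing.
    """
    n_items = len(responses)
    answered = [r for r in responses if r is not None]
    if n_items == 0 or not answered:
        return None

    n_missing = n_items - len(answered)
    if n_missing / n_items > max_missing_fraction:
        return None

    for r in answered:
        if not item_min <= r <= item_max:
            raise ValueError(f"response {r} outside {item_min}-{item_max}")

    # Mean of answered items; this is equivalent to imputing each missing
    # item with the person-specific mean before averaging.
    mean_item = sum(answered) / len(answered)

    # Linear transformation of the item mean onto a 0-100 scale.
    return 100.0 * (mean_item - item_min) / (item_max - item_min)


# Hypothetical patient: four-item domain with one unanswered item.
print(score_domain([4, 5, None, 3]))
```

Keeping the missing-item rule as an explicit parameter makes it easy to check that the implementation matches the rule written in the statistical analysis plan.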

Handling missing PRO data

Patients may, for various reasons, leave part of a PRO instrument incomplete, fail to complete whole instruments, or withdraw from a clinical trial early. The resulting missing PRO data can introduce bias into the analysis and interfere with the ability to draw meaningful conclusions from it.

The statistical analysis plan needs to detail the approach for analysing and reporting missing PRO data, and should include sensitivity analyses for the imputation methods. Missing data should be addressed both for missing items within a domain or instrument and for entire instruments missing at specific time points. Limiting analysis to complete cases is not advised. A repeated measures mixed effects model, which uses all available data over time, is an appropriate way of handling missing data when the data can be assumed to be missing at random, even if they are not missing completely at random. If this is the selected analysis approach, ensure that important covariates, including baseline PRO score, are included in the model. Pattern mixture models can be used where it is known that the PRO data are not missing at random.

Multiplicity

There are often multiple endpoints in an evaluation of a treatment. This is particularly pertinent in PRO analysis, where instruments frequently have multiple domains and therefore multiple scores. When a positive result on any one of several PRO endpoints would be considered evidence of the effectiveness of treatment as reported by the patient, the probability of a false positive finding (the Type 1 error rate, i.e. the probability of detecting a treatment effect that is not present) is inflated. This should be controlled through a prospectively planned multiplicity adjustment, for example by testing endpoints sequentially or through a statistical procedure such as Bonferroni step-down or step-up tests. The planned approach to handling multiplicity of endpoints should be set out in the statistical analysis plan.
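
To make the multiplicity adjustment concrete, the following sketch implements the Holm procedure, which is the usual step-down version of the Bonferroni test. It is offered only as an illustration: the function name `holm_adjust` and the three domain p-values are invented for the example, and the choice of procedure in a real trial belongs in the statistical analysis plan.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm (step-down Bonferroni) adjusted p-values.

    The smallest p-value is multiplied by m, the next by m - 1, and so
    on; a running maximum keeps the adjusted values monotone, and each
    value is capped at 1. An endpoint is declared significant when its
    adjusted p-value is at or below the chosen alpha.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted p-values for three PRO domains.
domain_p = {"physical": 0.01, "emotional": 0.04, "social": 0.03}
adjusted = holm_adjust(list(domain_p.values()))
for name, p_adj in zip(domain_p, adjusted):
    print(f"{name}: adjusted p = {p_adj:.3f}, significant at 0.05: {p_adj <= 0.05}")
```

In this invented example only the first domain remains significant at the 0.05 level after adjustment, although all three unadjusted p-values are below 0.05.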
PRO results interpretation

Traditionally, analyses of PRO data have focused on statistically significant comparisons of means between study groups. It is more useful to report the change from baseline, but PRO scores are usually expressed as units on an abstract scale, so guidance is needed on how to interpret the reported PRO change scores. Statistical significance does not necessarily mean that observed differences between treatments or within an individual over time are important or meaningful to patients [Wyrwich, 2013].

Minimal important difference

The minimal important difference is a benchmark by which PRO change scores can be interpreted at the group level, for example when comparing treatment arms. It is defined as the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects, a change in the patient's management [Jaeschke, 1989]. The term minimal clinically important difference is adopted only when clinical assessments or judgements are incorporated. The minimal important difference allows for a comparison of the average change from baseline across all patients in the analytic group of interest. Not all PRO instruments have a published minimal important difference benchmark and, importantly, the benchmark is context-specific and may not be applicable to the situation of interest.

Responder definition

The FDA recommends the use of a responder definition when interpreting PRO change scores. A responder definition is an individual patient's PRO score change over a predetermined time period that should be interpreted as treatment benefit [FDA, 2009]. So, in contrast to the minimal important difference, which is for interpreting change scores at the group level, a responder definition allows each patient to be categorised as a PRO responder or non-responder. The study groups of interest are compared in terms of the proportion of patients reaching the specified responder definition value. The responder definition is defined a priori for each trial and study population and needs to be specified in the statistical analysis plan. Any value proposed as a responder definition needs to be at least as large as a minimal important difference, to rule out the possibility of patients being classified as responders by chance. Guidance on approaches to developing a responder definition can be found in Wyrwich et al. [Wyrwich, 2013].
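
The contrast between the two approaches can be sketched in a few lines of code. The example below is a minimal illustration on invented data: the change scores, the responder threshold of 8 points and the function names are all assumptions, and a higher score is taken to mean improvement. The group-level view compares mean change from baseline, which would then be read against a minimal important difference; the responder view classifies each patient and compares proportions.

```python
from typing import Sequence


def mean_change(changes: Sequence[float]) -> float:
    """Group-level summary: average change from baseline."""
    if not changes:
        raise ValueError("no change scores supplied")
    return sum(changes) / len(changes)


def responder_proportion(changes: Sequence[float], threshold: float) -> float:
    """Patient-level summary: share of patients whose improvement is at
    least `threshold` (assumes a higher change score means improvement)."""
    if not changes:
        raise ValueError("no change scores supplied")
    n_responders = sum(1 for c in changes if c >= threshold)
    return n_responders / len(changes)


# Hypothetical change-from-baseline scores for two study arms.
active = [12, 3, 8, 15, -2, 10]
placebo = [4, 0, 9, -5, 2, 11]

# Hypothetical pre-specified responder definition (in scale points).
RESPONDER_THRESHOLD = 8

# Group level: difference in mean change between arms.
print("difference in mean change:", mean_change(active) - mean_change(placebo))

# Patient level: proportion of responders in each arm.
print("responders, active :", responder_proportion(active, RESPONDER_THRESHOLD))
print("responders, placebo:", responder_proportion(placebo, RESPONDER_THRESHOLD))
```

For an instrument where a lower score means improvement, the comparison in `responder_proportion` would need to be reversed.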
Cumulative distribution of responses

The cumulative distribution of responses is a useful approach as it shows the spectrum of responses across a study population. It is defined as the proportion of patients who experience every magnitude of change in a PRO instrument score at a time point of interest compared with baseline. It is a continuous plot of the proportion of patients at each point along the scale score continuum who experience change at that level or lower. A further benefit of this approach is that it avoids the need to develop a responder definition.

Figure 4 shows a cumulative distribution frequency plot taken from an FDA approved product label for Aricept, indicated for Alzheimer's disease, where the cognitive part of the Alzheimer's Disease Assessment Scale (ADAS-Cog) was used to measure cognitive impairment in patients, with a reduction in ADAS-Cog score equating to improvement. In Figure 4 we can see that Aricept showed treatment benefit when compared to placebo in terms of improving a patient's cognitive impairment at a variety of possible cut-off points, such as a 4 point improvement or a 7 point improvement.

Figure 4. Cumulative distribution frequency plot for the ADAS-Cog for Aricept.
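
The calculation behind such a plot is simple, and can be sketched as follows. The change scores are invented for illustration and are not the Aricept data; the function name is likewise an assumption. As in the ADAS-Cog example, a negative change is taken to mean improvement, so "at that level or lower" corresponds to "improved by at least that much".

```python
from typing import List, Sequence, Tuple


def cumulative_distribution(changes: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (change, proportion) pairs, where proportion is the share of
    patients whose change from baseline is at that level or lower."""
    n = len(changes)
    if n == 0:
        raise ValueError("no change scores supplied")
    points = sorted(set(changes))
    return [(x, sum(1 for c in changes if c <= x) / n) for x in points]


# Hypothetical change-from-baseline scores for one study arm
# (negative values indicate improvement).
arm_changes = [-7, -4, -4, 0, 2]

for change, proportion in cumulative_distribution(arm_changes):
    print(f"change <= {change:>3}: {proportion:.0%} of patients")
```

Computing the same pairs for each study arm and plotting them on shared axes gives the kind of display shown in Figure 4, from which the proportions at any cut-off of interest can be read directly.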

Reporting PRO results

PRO data have often been analysed and reported separately from other clinical trial outcomes, if at all. As a result, important information about the patient experience is not accessible to stakeholders reviewing the primary publication. It is therefore recommended that clinical trial PRO results are reported at the same time as the main clinical trial results. The Consolidated Standards of Reporting Trials (CONSORT) PRO extension provides evidence-based recommendations to improve the reporting of PRO data from randomised controlled trials [Calvert, 2013]. Table IV provides a summary of their PRO-specific guidance for reporting. For an example of reporting PRO data see Dimopoulos et al. [Dimopoulos, 2013].

Topic: PRO-specific reporting

Title and abstract
• The PRO should be identified in the abstract as a primary or secondary outcome.

Introduction
• Include background and rationale for PRO assessment in the RCT.
• PRO hypothesis should be stated and relevant domains identified, if applicable.

Methods
• Provide or cite evidence of PRO instrument validity and reliability if available, including the person completing the PRO and methods of data collection (paper, telephone, electronic, other).

Statistical methods
• Statistical approaches for dealing with missing data to be explicitly stated for PROs specified as primary or important secondary outcomes.

Results
• The number of PRO outcome data at baseline and at subsequent time points should be made transparent.
• Include a table that shows baseline demographic and clinical characteristics for each group.
• For each group, the number of participants (denominator) included in each analysis and whether the analysis was by original assigned groups.
• Provide the results for each group, the estimated effect size, and its precision (such as 95% confidence interval); for multidimensional PRO results provide this for each domain and time point.
• Provide the results for any other ancillary PRO analysis performed, including subgroup analysis and adjusted analysis, distinguishing pre-specified from exploratory.

Discussion
• Provide PRO-specific limitations and implications for generalizability and clinical practice.
• PRO data should be interpreted in relation to clinical outcomes including survival data.

Table IV. CONSORT PRO extension guidelines for PRO reporting in RCTs.

References

-- Aaronson NK, Ahmedzai S, Bergman B, et al. The European Organization for Research and Treatment of Cancer QLQ-C30: a quality-of-life instrument for use in international clinical trials in oncology. J Natl Cancer Inst 1993; 85: 365-76
-- Acquadro C, Conway K, Hareendran A, et al. Literature Review of Methods to Translate Health-Related Quality of Life Questionnaires For Use in Multi-National Clinical Trials. Value Health 2008; 11: 509-21
-- Atkinson MJ, Sinha A, Hass SL, et al. Validation of a general measure of treatment satisfaction, the Treatment Satisfaction Questionnaire for Medication (TSQM), using a national panel study of chronic disease. Health Qual Life Outcomes 2004; 2: 12
-- Atkinson MJ, Kumar R, Cappelleri JC, et al. Hierarchical construct validity of the treatment satisfaction questionnaire for medication (TSQM version II) among outpatient pharmacy consumers. Value Health 2005; 8 Suppl 1: S9-S24
-- Basch E, Abernethy AP, Mullins CD, et al. Recommendations for incorporating patient-reported outcomes into clinical comparative effectiveness research in adult oncology. J Clin Oncol 2012; 30: 4249-55
-- Beck AT, Ward CH, Mendelson M, et al. An Inventory for Measuring Depression. Arch Gen Psychiatry 1961; 4: 561-71
-- Bellamy N, Buchanan WW, Goldsmith CH, et al. Validation study of WOMAC: A health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol 1988; 15: 1833-40
-- Berg S. Herceptin: Was patient power key? BBC News, 2006. Available at: http://news.bbc.co.uk/1/hi/health/5063352.stm (last accessed September 2009)
-- Bergman B, Aaronson S, Ahmedzai S, et al. The EORTC QLQ-LC13: a modular supplement to the EORTC core quality of life questionnaire (QLQ-C30) for use in lung cancer clinical trials. Eur J Cancer 1994; 30: 635-42
-- Brooks R. EuroQol: the current state of play. Health Policy 1996; 37: 53-72

-- Calvert M, Blazeby J, Altman DG, et al.; CONSORT PRO Group. Reporting of patient-reported outcomes in randomized trials: the CONSORT PRO extension. JAMA 2013; 309: 814-22
-- Chassany O, Sagnier P, Marquis P, et al., for the European Regulatory Issues on Quality of Life Assessment (ERIQA) Group. Patient-reported outcomes; the example of Health-Related Quality of Life – A European Guidance Document for the Improved Integration of Health-Related Quality of Life Assessment in the Drug Approval Process. DIA Journal 2001; 36: 209-38
-- Coons S, Gwaltney C, Hays R, et al. Recommendations on Evidence Needed to Support Measurement Equivalence between Electronic and Paper-Based Patient Reported Outcome (PRO) Measures: ISPOR ePRO Good Research Practices Task Force Report. Value Health 2009; 12: 419-29
-- Dimopoulos MA, Delforge M, Hajek R, et al. Lenalidomide, melphalan, and prednisone followed by lenalidomide maintenance improves health-related quality of life in newly diagnosed multiple myeloma patients aged 65 years or older: results of a randomized phase III trial. Haematologica 2013; 98: 784-8
-- EMA (European Medicines Agency). Reflection paper on the regulatory guidance for the use of health-related quality of life (HRQL) measures in the evaluation of medicinal products. Committee for medicinal products for human use (CHMP), 2005. Available at: http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003637.pdf
-- Eremenco S, Cella D, Arnold B. A comprehensive method for the translation and cross-cultural validation of health status questionnaires. Eval Health Prof 2005; 28: 212-32
-- FDA (Food and Drug Administration). Guidance for Industry. Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. December, 2009. Available at: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM193282.pdf
-- Irrgang JJ, Snyder-Mackler L, Wainner RS, et al. Development of a patient-reported measure of function of the knee. J Bone Joint Surg Am 1998; 80: 1132-45
-- Jackson MJ, Sciberras J, Mangera A, Brett A, et al. Defining a Patient-Reported Outcome Measure for Urethral Stricture Surgery. Eur Urol 2011; 60: 60-8

-- Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials 1989; 10: 407‑15

-- Kerr C, Nixon A, Wild D. Assessing and demonstrating data saturation in qualitative inquiry supporting patient-reported outcomes research. Expert Rev Pharmacoecon Outcomes Res 2010; 10: 269-81
-- Leidy NK, Revicki DA, Geneste B. Recommendations for evaluating the validity of quality of life claims for labelling and promotion. Value Health 1999; 2: 113-27
-- McKenna SP, Doward LC. The Translation and Cultural Adaptation of Patient Reported Outcome Measures. Value Health 2005; 8: 89-91
-- Niero M, Martin M, Finger T, et al. A New Approach to Multi-Cultural Item Generation in the Development of Two Obesity-Specific Measures: The Obesity and Weight Loss Quality of Life (OWLQOL) Questionnaire and the Weight-Related Symptom Measure. Clin Ther 2002; 24: 690-700
-- Nixon A, Kerr C, Doll H, et al. Osteoporosis Assessment Questionnaire – physical function (OPAQ-PF): a psychometrically validated osteoporosis-targeted patient reported outcome measure of daily activities of physical function. Osteoporos Int 2014; 25: 1775-84
-- Patrick DL, Burke LB, Gwaltney CJ, et al. Content validity - Establishing and reporting the evidence in newly-developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 1 - Eliciting concepts for a new PRO instrument. Value Health 2011a; 14: 967-77
-- Patrick DL, Burke LB, Gwaltney CJ, et al. Content validity - Establishing and reporting the evidence in newly-developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 2 - Assessing respondent understanding. Value Health 2011b; 14: 978-88
-- Rainbird KJ, Perkins JJ, Sanson-Fisher RW. The Needs Assessment for Advanced Cancer Patients (NA-ACP): a measure of the perceived needs of patients with advanced, incurable cancer. A study of validity, reliability and acceptability. Psychooncology 2005; 14: 297-306
-- Rosen R, Brown C, Heiman J, et al. The Female Sexual Function Index (FSFI): A Multidimensional Self-Report Instrument for the Assessment of Female Sexual Function. J Sex Marital Ther 2000; 26: 191-208
-- Saxena S, Orley J; WHOQOL Group. Quality of Life Assessment: The World Health Organisation Perspective. Eur Psychiatry 1997; 12 Suppl 3: 263s-6s
-- Skevington S, Sartorius N, Amir M. Developing Methods for Assessing Quality of Life in Different Cultural Settings. The History of the WHOQOL Instruments. Soc Psychiatry Psychiatr Epidemiol 2004; 39: 1-8
-- Sprangers MA, Groenvold M, Arraras JI, et al. The European Organization for Research and Treatment of Cancer breast cancer-specific quality-of-life questionnaire module: first results from a three-country field study. J Clin Oncol 1996; 14: 2756-68

-- Steinke S. WellPoint Seeks More Quality Of Life, Cost Data In Formulary Submissions. The Pink Sheet 2008
-- Varni JW, Seid M, Rode CA. The PedsQL: measurement model for the pediatric quality of life Inventory. Med Care 1999; 37: 126-39
-- Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36) 1: conceptual framework and item selection. Med Care 1992; 30: 473-83
-- Wild D, Grove A, Martin M. Principles of good practice for the translation and cultural adaptation process for patient-reported outcome (PRO) measures: Report of the ISPOR task force for translation and cultural adaptation. Value Health 2005; 8: 94-102
-- Wild D, Eremenco S, Mear I, et al. Multinational Trials – Recommendations on the Languages Required, Approaches to using the Same Language in Different Countries, and the Approaches to Support Pooling the Data. The ISPOR Patient-Reported Outcomes Translation and Linguistic Validation Good Research Practices Task Force Report. Value Health 2009; 12: 430-40
-- Wyrwich KW, Norquist JM, Lenderking WR, et al., the Industry Advisory Committee of International Society for Quality of Life Research ISOQOL. Methods for interpreting change in patient-reported outcome measures. Qual Life Res 2013; 22: 475-83
-- Zigmond AS, Snaith RP. The hospital anxiety and depression scale. Acta Psychiatr Scand 1983; 67: 361-70

Authors

Dr Annabel Nixon is a Patient Reported Outcomes expert with 19 years' international experience in this field. Dr Nixon has specialist knowledge of FDA, EMA and health technology agency requirements for Patient Reported Outcomes and other clinical outcome assessments to support label claims and product reimbursement. Dr Nixon has presented at international conferences and published widely in peer-reviewed publications. Previously Dr Nixon completed a PhD at the University of Manchester focused on measurement of health-related quality of life in a Kenyan population, and was a Research Fellow at the London School of Hygiene and Tropical Medicine before moving into consultancy, initially at Quintiles in Europe and the US, then as Director of the Patient Reported Outcomes group at Oxford Outcomes and subsequently as Director of Patient Centered Outcomes at PRMA Consulting. Currently Dr Nixon is working as an independent patient reported outcomes consultant at Chilli Consultancy and is co-chair of the DIA Study Endpoints community.

Diane Wild is a researcher with particular expertise in the translation and linguistic validation of patient reported outcome measures. She has published widely in the areas of patient reported outcomes and linguistic validation. She was the lead author on a translation and linguistic validation best practices paper in 2005 which has become the industry standard for methods in this area. Diane was the founding Director of Oxford Outcomes, a multi-national outcomes research consultancy. Currently Diane is working as an independent patient reported outcome and linguistic validation consultant whilst studying for an MSc in Medical Anthropology.

Willie Muehlhausen is a recognized eClinical Technology expert with 17 years' experience in this field of clinical research. He has researched and published with a specific focus on usability of patient-facing technologies and has developed several medical devices for patient use. In recent years he has focused his research on issues and challenges around the development and migration of electronic Clinical Outcomes Assessments and the implementation of BYOD strategies. From 2011 to 2013 he served as the founding Vice Director of C-Path's ePRO consortium and still chairs the Scientific Subcommittee. Currently Willie is Head of Innovation at ICON plc, where his team develops next generation data capture and analytics systems which include ePRO solutions.
