Diagnostic Test Accuracy Studies in Dementia: A Pragmatic Approach [2nd ed.] 978-3-030-17561-0;978-3-030-17562-7



Diagnostic Test Accuracy Studies in Dementia
A Pragmatic Approach
Second Edition


A. J. Larner
Walton Centre for Neurology and Neurosurgery
Liverpool, UK

ISBN 978-3-030-17561-0    ISBN 978-3-030-17562-7 (eBook)
https://doi.org/10.1007/978-3-030-17562-7

© Springer Nature Switzerland AG 2015, 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

To Martin
Loyal cousin (and supporter of Gloucestershire County Cricket Club) and dedicated healthcare professional

Preface to the First Edition

This book has evolved from more than a decade of personal experience in conducting diagnostic test accuracy studies of clinical signs and cognitive and non-cognitive screening instruments in a dedicated cognitive disorders clinic. Many of these studies have been published, and are summarised elsewhere (Larner 2014a, b). The ageing of the human population, with the consequent increase in the number of individuals afflicted with cognitive impairment and dementia, mandates diagnostic test accuracy studies to identify and eventually treat these patients effectively. If, as seems likely, population-based testing for early identification becomes the future policy norm, the requirement for tests with established high diagnostic test accuracy is self-evident. A rigorous methodology for generating meaningful data from such studies is therefore imperative.

As well as giving a general overview, the book gives particular emphasis to, and argues in favour of, what I have previously termed "pragmatic diagnostic accuracy studies" (Larner 2012a, 2014a, pp. 33–5). This methodology seems to me to correspond largely with Sackett and Haynes' (2002) nomenclature of addressing a "Phase III question", i.e. among patients in whom it is clinically sensible to suspect the target disorder, does the test result distinguish those with and without the target disorder? Consecutive patients should be studied to answer such questions, and hence this approach, with certain limitations, would seem to fit very well with the idiom of day-to-day clinical practice. Such pragmatic diagnostic test accuracy studies, far from being the preserve of large, well-funded, often international, collaborative groups composed of an intellectual elite (as for most randomised controlled trials), may fall relatively easily within the ambit of jobbing workaday clinicians (like myself), a phenomenon which I have elsewhere ventured to term "micro-research" (Larner 2012b, p. xv).
It is hoped that a brief exposition of some of the practicalities of pragmatic diagnostic test accuracy studies will persuade readers of their feasibility without the necessity for a large research infrastructure or funding, and hence encourage them to identify suitable research questions and undertake the appropriate empirical studies. No in-depth mathematical expertise is required for the application of the few equations found in the text (moreover, statistical programmes are available), nor any familiarity with probability theory, for which reason probability notation is eschewed. Lest a discourse on method be deemed too arid, the text is leavened with some examples taken from the literature on diagnostic studies in dementia.

The book is structured as for a research publication, i.e. Introduction, Methods, Results, Discussion, with a few digressions where necessary. Since methodology is paramount, it of necessity comprises the longest overall section of the book (Chaps. 2 and 3). The chosen structure also follows the published guidelines for assessing the quality of studies examining the diagnostic accuracy of clinical tests, such as the Standards for Reporting Diagnostic Accuracy (STARD; Bossuyt et al. 2003) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS and QUADAS-2; Whiting et al. 2004, 2011). More recently, STARD guidelines for reporting diagnostic test accuracy studies in dementia (STARDdem) have emerged from the Cochrane Dementia and Cognitive Improvement Group (Noel-Storr et al. 2014), so this is an apposite moment to produce a book-length treatment of these issues as related to dementia practice. However, the content here is rather more discursive, and perhaps less prescriptive, than in the aforementioned guidelines, reflecting individual practice. Readers will (hopefully) therefore be helped to negotiate the sometimes bumpy path between the aspirations of principles and the messy realities of practice, and hence be able to be "doing science" in some sense without being superficial.

Studies examining cognitive screening instruments are particularly emphasized, reflecting the author's area of particular interest (greater detail on some of the instruments examined may be found elsewhere: Larner 2013), but as the detection of disease biomarkers has become of increasing importance to diagnosis, as reflected in more recent sets of diagnostic criteria for neurodegenerative disorders, these too will be considered.
Although the focus of this book is dementia and cognitive disorders, the approach described may be applicable not only to other areas of neurological practice but also to medicine and even surgery.

References

Bossuyt PM, Reitsma JB, Bruns DE et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clin Chem. 2003;49:7–18.
Larner AJ. Pragmatic diagnostic accuracy studies. http://bmj.com/content/345/bmj.e3999?tab=responses, 28 August 2012a.
Larner AJ. Dementia in clinical practice: a neurological perspective. Studies in the dementia clinic. London: Springer; 2012b.
Larner AJ (ed.). Cognitive screening instruments. A practical approach. London: Springer; 2013.
Larner AJ. Dementia in clinical practice: a neurological perspective. Pragmatic studies in the cognitive function clinic. 2nd ed. London: Springer; 2014a.
Larner AJ. Neurological signs of possible diagnostic value in the cognitive disorders clinic. Pract Neurol. 2014b;14:332–5.
Noel-Storr AH, McCleery JM, Richard E et al. Reporting standards for studies of diagnostic test accuracy in dementia: the STARDdem Initiative. Neurology. 2014;83:364–73.
Sackett DL, Haynes RB. The architecture of diagnostic research. In: Knottnerus JA (ed.). The evidence base of clinical diagnosis. London: BMJ Books; 2002. p. 19–38.
Whiting P, Rutjes AW, Dinnes J, Reitsma J, Bossuyt PM, Kleijnen J. Development and validation of methods for assessing the quality of diagnostic accuracy studies. Health Technol Assess. 2004;8(iii):1–234.
Whiting PF, Rutjes AW, Westwood ME et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529–36.

A. J. Larner

Preface to the Second Edition

This new edition abides by the principles which informed the first edition (see Preface to the First Edition), aiming to serve as a propaedeutic to diagnostic test accuracy studies in dementia, based around the structure of the STARD guidelines for reporting diagnostic test accuracy studies in dementia (STARDdem) published by the Cochrane Dementia and Cognitive Improvement Group (Noel-Storr et al. 2014; Quinn and Takwoingi 2017). It incorporates new material on publications related to diagnostic test accuracy which have appeared since the first edition (Larner 2015), notably the updated STARD guidance (Bossuyt et al. 2015; Cohen et al. 2016, 2017), as well as on the use of biomarkers of cognitive disorders as increasingly enshrined in diagnostic criteria. As before, it continues to reflect the author's own experience in diagnostic test accuracy studies, particularly in the sphere of cognitive screening instruments (Larner 2017, 2018). The Results section has now been split into two chapters to mirror the bipartite division of the Methods section. Previous errors have been corrected.

It is hoped that this approach will encourage clinicians to undertake high-quality pragmatic diagnostic test accuracy studies in dementia and cognitive disorders, rooted in day-to-day clinical practice.

References

Bossuyt PM, Reitsma JB, Bruns DE et al. STARD 2015. BMJ. 2015;351:h5527.
Cohen JF, Korevaar DA, Altman DG et al. STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. BMJ Open. 2016;6(11):e012799.
Cohen JF, Korevaar DA, Gatsonis CA et al. STARD for Abstracts: essential items for reporting diagnostic accuracy studies in journal or conference abstracts. BMJ. 2017;358:j3571.
Larner AJ. Diagnostic test accuracy studies in dementia: a pragmatic approach. London: Springer; 2015.
Larner AJ (ed.). Cognitive screening instruments. A practical approach. 2nd ed. London: Springer; 2017.
Larner AJ. Dementia in clinical practice: a neurological perspective. Pragmatic studies in the Cognitive Function Clinic. 3rd ed. London: Springer; 2018.
Noel-Storr AH, McCleery JM, Richard E et al. Reporting standards for studies of diagnostic test accuracy in dementia: the STARDdem Initiative. Neurology. 2014;83:364–73.
Quinn TJ, Takwoingi Y. Assessment of the utility of cognitive screening instruments. In: Larner AJ (ed.). Cognitive screening instruments. A practical approach. 2nd ed. London: Springer; 2017. p. 15–34.

A. J. Larner

Contents

1 Introduction
  1.1 Prologue
  1.2 Title/Abstract/Keywords
  1.3 Introduction
    1.3.1 Research Question
  1.4 Some Brief Notes on Bias
    1.4.1 Patient-Based Biases
    1.4.2 Test Performance Biases
  1.5 Summary and Recommendations
  References

2 Methods (1): Participants and Test Methods
  2.1 Participants
    2.1.1 Study Population
    2.1.2 Recruitment: Study Design (Cross-Sectional vs Longitudinal)
    2.1.3 Sampling
    2.1.4 Data Collection (Retrospective vs Prospective); Missing Data
  2.2 Test Methods
    2.2.1 Target Condition(s) and Reference Standard(s)
    2.2.2 Technical Specifications and Test Administration
    2.2.3 Calibration: The Definition of Cut-Offs
    2.2.4 Blinding
  2.3 Summary and Recommendations
  References

3 Methods (2): Statistical Methods
  3.1 The 2 × 2 Table; Table of Confusion; Confusion Matrix
  3.2 Measures of Discrimination; Confidence Intervals
    3.2.1 Correct Classification Accuracy; Inaccuracy; Net Reclassification Improvement (NRI)
    3.2.2 Sensitivity and Specificity
    3.2.3 Error Terms: False Positive, Negative
    3.2.4 Youden Index (Y)
    3.2.5 Predictive Values
    3.2.6 Error Terms: False Alarm (Discovery), Reassurance (Omission)
    3.2.7 Predictive Summary Index (PSI)
    3.2.8 Likelihood Ratios; Bayes' Theorem
    3.2.9 Diagnostic Odds Ratio or Cross-Product Ratio; Error Odds Ratio
    3.2.10 Clinical Utility Indexes
    3.2.11 "Number Needed to": Diagnose, Predict, Misdiagnose, Screen
    3.2.12 Receiver Operating Characteristic (ROC) Curve; Q* Index
    3.2.13 Effect Size: Cohen's d
  3.3 Comparative Measures
    3.3.1 Correlation
    3.3.2 Tests of Agreement: Cohen's Kappa Statistic
    3.3.3 Limits of Agreement: Bland-Altman Method
    3.3.4 Combination Using Simple Logical Rules
    3.3.5 Weighted Comparison (WC) and Equivalent Increase (EI)
    3.3.6 Effect Size: Cohen's d
  3.4 Reproducibility
  3.5 Significance Testing
    3.5.1 Null Hypothesis Testing
    3.5.2 Correcting for Skewed Data
  3.6 Summary and Recommendations
  References

4 Results (1): Participants and Test Results
  4.1 Participants
    4.1.1 Study Duration and Setting
    4.1.2 Demographics
    4.1.3 Participant Loss
  4.2 Test Results
    4.2.1 Interval Between Diagnostic Test and Reference Standard
    4.2.2 Distribution of Disease Severity
    4.2.3 Cross Tabulation and Dropouts
    4.2.4 Adverse Effects of Testing
    4.2.5 Indeterminate Results
    4.2.6 Variability Between Subgroups
  4.3 Summary and Recommendations
  References

5 Results (2): Estimates of Diagnostic Accuracy
  5.1 Measures of Discrimination
    5.1.1 Correct Classification Accuracy; Net Reclassification Improvement (NRI)
    5.1.2 Sensitivity and Specificity
    5.1.3 Error Terms: False Positive, Negative; Alarm, Reassurance
    5.1.4 Youden Index (Y)
    5.1.5 Predictive Values; Predictive Summary Index
    5.1.6 Likelihood Ratios
    5.1.7 Diagnostic Odds Ratio or Cross-Product Ratio
    5.1.8 Clinical Utility Indexes
    5.1.9 "Number Needed to" Metrics: Diagnose, Predict, Misdiagnose
    5.1.10 Receiver Operating Characteristic (ROC) Curve; Q* Index
    5.1.11 Effect Size: Cohen's d
  5.2 Comparative Measures
    5.2.1 Correlation
    5.2.2 Tests of Agreement: Cohen's Kappa Statistic
    5.2.3 Limits of Agreement: Bland-Altman Method
    5.2.4 Combination Using Simple Logical Rules
    5.2.5 Weighted Comparison (WC) and Equivalent Increase (EI)
    5.2.6 Effect Size: Cohen's d
  5.3 Reproducibility
  5.4 Significance Testing
    5.4.1 Null Hypothesis Testing
    5.4.2 Correcting for Skewed Data
  5.5 Summary and Recommendations
  References

6 Discussion
  6.1 Summary of Key Results
  6.2 Clinical Applicability
  6.3 Shortcomings/Limitations
    6.3.1 Participants
    6.3.2 Test Results
    6.3.3 Estimates of Diagnostic Accuracy
  6.4 Conclusion
  6.5 References/Bibliography
  6.6 Epilogue: The Publication Process
  6.7 Summary and Recommendations
  References

7 Future Prospects for Diagnostic Test Accuracy Studies in Dementia
  7.1 Problems and Pitfalls
    7.1.1 Bias
    7.1.2 Index Test and Reference Standard
    7.1.3 Blinding
    7.1.4 Reproducibility
    7.1.5 Wrong Paradigm
  7.2 Opportunities
    7.2.1 Questions Which Might Be Addressed
    7.2.2 Settings
    7.2.3 Analysis
  7.3 Pragmatic Diagnostic Test Accuracy Studies: A Proposal
  7.4 Summary and Recommendations
  References

Index

Chapter 1

Introduction

Contents
  1.1 Prologue
  1.2 Title/Abstract/Keywords
  1.3 Introduction
    1.3.1 Research Question
  1.4 Some Brief Notes on Bias
    1.4.1 Patient-Based Biases
    1.4.2 Test Performance Biases
  1.5 Summary and Recommendations
  References

Abstract This chapter examines the introductory elements in the report of a diagnostic test accuracy study. Central to this is the definition of the research question to be examined. An important distinction needs to be drawn between proof-of-concept or experimental studies, which are particularly appropriate for new diagnostic tests, and which may be undertaken in ideal or extreme-contrast settings; and pragmatic studies, which generally recruit consecutive patients and hence are more reflective of the idiom of day-to-day clinical practice. Awareness of the various sources of bias which may influence the outcomes of diagnostic test accuracy studies is important from the outset, since these may limit study utility.

1.1 Prologue

The need for diagnostic test accuracy studies is self-evident to any clinician. Although some diagnoses can be made on history from patient and informant alone (perhaps particularly in neurology and psychiatry), more often than not further testing by means of examination and investigation is needed to confirm or refute diagnostic hypotheses emerging from the history (Larner et al. 2011). Clinicians need to know the diagnostic accuracy of such examination signs and diagnostic tests. Hence the requirement for diagnostic test accuracy studies is well-recognised (Cordell et al. 2013). Studies to generate such data require methodological rigour to ensure their utility and applicability. This is not some sterile academic exercise in arid numeration, but a vital process to appreciate the benefits and limitations of diagnostic tests and to promote their intelligent, rather than indiscriminate, use. Evidently, reliable diagnosis will pave the way for many processes, including, but not limited to, the giving of information to patients and their relatives, the initiation of symptomatic and/or disease-modifying treatment, and the planning of care needs.

The quality of diagnostic test accuracy studies may be evaluated using methodological quality assessment tools (e.g. Scottish Intercollegiate Guidelines Network 2007), of which the best known and most widely adopted are the STAndards for the Reporting of Diagnostic accuracy studies (STARD; Bossuyt et al. 2003; Ochodo and Bossuyt 2013) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS; Whiting et al. 2003, 2004) and its revision (QUADAS-2; Whiting et al. 2011). STARD is a prospective tool which may be used to plan and implement well-designed studies, relatively free of bias. The original publication (Bossuyt et al. 2003) included a checklist of 25 items and a flow chart to be followed to optimise study design and reporting. An updated version of STARD (Bossuyt et al. 2015; Cohen et al. 2016) has a checklist of 30 items. QUADAS is a retrospective instrument used to assess the methodological rigour of diagnostic accuracy studies, using 14 criteria to assess the quality of research studies.
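The accuracy estimates such studies report all derive from cross-tabulating index test results against the reference standard, the 2 × 2 table treated in detail in Chaps. 3 and 5. As a minimal sketch of the measures involved, using entirely hypothetical cell counts (not data from any study cited here):

```python
# Minimal sketch: standard test accuracy measures from a 2x2 table.
# All counts are hypothetical, for illustration only.

def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive accuracy measures from 2x2 cell counts, where tp/fp/fn/tn
    are true/false positives/negatives against the reference standard."""
    sens = tp / (tp + fn)              # sensitivity
    spec = tn / (tn + fp)              # specificity
    return {
        "sensitivity": sens,
        "specificity": spec,
        "PPV": tp / (tp + fp),         # positive predictive value
        "NPV": tn / (tn + fn),         # negative predictive value
        "LR+": sens / (1 - spec),      # positive likelihood ratio
        "LR-": (1 - sens) / spec,      # negative likelihood ratio
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "Youden": sens + spec - 1,     # Youden index (Y)
    }

# Hypothetical cognitive screening instrument in 200 clinic attendees:
print(accuracy_measures(tp=80, fp=30, fn=20, tn=70))
```

Note that the predictive values, unlike sensitivity and specificity, depend on the prevalence of the target condition in the study sample, one reason the pragmatic, consecutive-recruitment designs discussed in this book matter.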
These initiatives were in part a consequence of the perception that diagnostic test accuracy study methodology was of poorer quality than that used in studies of therapeutic interventions (randomised double-blind placebo-controlled studies). High-quality diagnostic test accuracy studies not only inform clinicians in their decision making but also may be suitable for inclusion in meta-analyses, which have their own guidelines for performance and reporting (the PRISMA statement; Liberati et al. 2009; Moher et al. 2009; Shamseer et al. 2015). Guidelines for diagnostic test accuracy studies specific to dementia which are based on the original STARD guidelines have been published by the Cochrane Dementia and Cognitive Improvement Group (Noel-Storr et al. 2014). This STARDdem initiative acknowledged areas where revisions of STARD pertinent to dementia and cognitive disorders were required, as well as highlighting areas in which reporting has hitherto been poor.

The diagnosis of dementia poses many challenges. Dementia and cognitive impairment are syndromes with many potential causes (Mendez and Cummings 2003; Kurlan 2006; Ames et al. 2010; Dickerson and Atri 2014; Quinn 2014), including many neurological and non-neurological diseases (Larner 2013a). The clinical heterogeneity of the casemix in clinics dedicated to the assessment of cognitive disorders is a given, unless significant selection by clinicians at the referral stage, for example by the imposition of exacting clinical inclusion and exclusion criteria, is permitted.


Moreover, cognitive impairment is a process rather than an event (with the possible exception of strategic infarct dementia, a fairly rare occurrence) and hence often of changing severity over time. The evolution of cognitive decline (illustrated both beautifully and harrowingly in the novel We are not ourselves by Thomas 2014) means that early signs are often passed off or explained away, and hence delay in presentation for clinical assessment is common. Patients with dementia disorders may therefore present at different stages of disease, with variable degrees of clinical severity.

An added complication in diagnosis, and one brought more sharply into focus by the drive to early diagnosis and initiation of disease-modifying drugs (when these become available), is correct identification of patients in early disease stages, before criteria for dementia are fulfilled. Various terminologies have been used for such states, such as mild cognitive impairment (MCI), cognitive impairment no dementia, mild cognitive dysfunction, and minor neurocognitive disorder; indeed, a lexicon has been proposed (Dubois et al. 2010). Certainly the old binary classification for the diagnosis of Alzheimer's disease (Is it dementia? If so, is it Alzheimer's disease? McKhann et al. 1984) has been rejected in favour of diagnosis based on disease biomarkers (Dubois et al. 2007; McKhann et al. 2011), a move from understanding AD as a clinicopathological entity to a clinicobiological entity (Dubois et al. 2014). Disease biomarkers may be positive long before clinical features become apparent (Bateman et al. 2012; Jack Jr et al. 2013; Yau et al. 2015; Dubois et al. 2018), prompting consideration of "pre-MCI" or "subjective cognitive impairment" stages (Reisberg et al. 2008; Garcia-Ptacek et al. 2016). Diagnostic studies in dementia may therefore be either cross-sectional, the typical paradigm of clinical practice, or longitudinal, the delayed verification paradigm.
Passage of time is certainly one of the most informative diagnostic tests for dementia syndromes, but its application may result in opportunities for early treatment being missed.

Diagnostic test accuracy studies which score highly on the STARD/QUADAS ratings may not necessarily reflect the situations encountered by clinicians in daily practice. For example, such studies may have been undertaken in populations selected for a known diagnosis and compared with normal controls, a situation alien to day-to-day clinical practice. Pragmatic diagnostic test accuracy studies (Larner 2012a, 2014a, pp. 33–5, 2018a, pp. 37–9) may therefore also be required, to provide information supporting or refuting a given diagnosis suspected on clinical assessment. This is analogous to the need for pragmatic studies of treatments to supplement the findings of randomised double-blind placebo-controlled trials (Marson et al. 2005; Larner and Marson 2011). This book examines some of the practicalities of performing diagnostic test accuracy studies, particularly from a pragmatic perspective.

A note should be appended here about whether tests are being used for diagnosis or for screening. Some authorities appear to envisage screening as a process applied to asymptomatic individuals with early disease (Sackett and Haynes 2002a, p. 33), although the widely accepted World Health Organisation (WHO) Screening Criteria (Wilson and Jungner 1968) do not seem to require that the condition being screened for is asymptomatic, merely that it has a "recognised latent or presymptomatic stage". Many tests used in the evaluation of patients with memory complaints which may possibly be a consequence of brain disease are not diagnostic per se, but may indicate those patients who are, or might be ("at risk"), in an early symptomatic phase and require further investigation to confirm or refute a suspected diagnosis. This is perhaps particularly true of cognitive screening instruments (Larner 2017a), hence this nomenclature. Many factors other than the presence of a dementia disorder may conspire to produce poor patient performance on these measures, such as primary sensory deficits, sleep disturbance, affective disorder, drug use (therapeutic, recreational), lack of application, or any combination of these. In other words, tests which are not examining biomarkers may be influenced by factors other than the disease per se. Hence these tests may be able to do no more than screen patients for the possible presence of a dementing disorder (although some claim to be relatively specific for Alzheimer's disease).

Debate continues about the value of screening whole populations for cognitive impairment, which would inevitably include testing large numbers of asymptomatic individuals (e.g. Fox et al. 2013), and other strategies focusing on at-risk groups may be feasible (Larner 2018b). With increasing efforts to define neurodegenerative disorders, such as Alzheimer's disease, as clinicobiological, rather than clinicopathological, entities (Dubois et al. 2007, 2014; McKhann et al. 2011), it may be that truly diagnostic tests, addressing the biology of disease, will be forthcoming, such as CSF and neuroimaging biomarkers (some of which are considered in Chap. 5). Even if this is so, such sophisticated tests may not be universally, or indeed widely, available, and hence the use of cognitive screening instruments rather than diagnostic (biomarker) tests may persist.
Both screening and biomarker tests require assessment using test accuracy studies, but in these circumstances the former may be better denoted "screening test accuracy studies" rather than "diagnostic test accuracy studies". In the interests of simplicity the latter term has been used throughout this book (although the screening utility of clinical signs and cognitive instruments has been acknowledged in previous publications, e.g. Larner 2007a, 2012b, c, 2014b). The issue of developing tests to screen asymptomatic individuals who may be harbouring dementing disorders, and the nature of the test accuracy studies required for them, is one of the key areas for the future (Sect. 7.2.1.2).
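The screening role described above can be made concrete with the likelihood-ratio form of Bayes' theorem (covered in Sect. 3.2.8): a screening result does not diagnose, but shifts the probability that further investigation is warranted. A minimal sketch, in which the pretest probability and likelihood ratio values are invented purely for illustration:

```python
# Sketch of the likelihood-ratio form of Bayes' theorem:
# post-test odds = pretest odds x likelihood ratio.
# Prevalence and LR values below are hypothetical.

def post_test_probability(pretest_p: float, lr: float) -> float:
    """Convert a pretest probability and a likelihood ratio into a
    post-test probability via the odds form of Bayes' theorem."""
    pretest_odds = pretest_p / (1 - pretest_p)
    post_odds = pretest_odds * lr
    return post_odds / (1 + post_odds)

# Hypothetical clinic prevalence of dementia of 20%: a positive screen
# with LR+ = 4 raises the probability of disease to 50%.
print(post_test_probability(0.20, 4.0))
```

A negative result with a likelihood ratio below 1 lowers the probability in the same way, which is precisely the "screening, not diagnosis" distinction drawn above.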

1.2 Title/Abstract/Keywords

Little advice need be proffered on the subject of the title of an article reporting a diagnostic test accuracy study. Whereas in case reports or case series a catchy, alliterative, interrogative, ambiguous, or teasing title may be important in order to garner reader attention for what is essentially anecdotal evidence (Ghadiri-Sani and Larner 2014a), in diagnostic test accuracy studies no such gimmickry is required (or desired). The article title should be entirely informative, and should perhaps use the exact terminology ("a diagnostic test accuracy study") to alert potential readers and to avoid all ambiguity. However, at the time of writing (December 2018), searching PubMed title words with the term "diagnostic accuracy" coupled with either "dementia" or "Alzheimer's" achieves few hits (