A Guide to Assessments That Work [2 ed.] 0190492244, 9780190492243

The first edition of A Guide to Assessments That Work provided a much-needed resource on evidence-based psychological assessment.


Table of contents:
Contents
Foreword to the First Edition • Peter E. Nathan
Preface
About the Editors
Contributors
Part I: Introduction
1. Developing Criteria for Evidence-Based Assessment: An Introduction to Assessments That Work • John Hunsley, Eric J. Mash
2. Dissemination and Implementation of Evidence-Based Assessment • Amanda Jensen-Doss, Lucia M. Walsh, Vanesa Mora Ringle
3. Advances in Evidence-Based Assessment: Using Assessment to Improve Clinical Interventions and Outcomes • Eric A. Youngstrom, Anna Van Meter
Part II: Attention-Deficit and Disruptive Behavior Disorders
4. Attention-Deficit/Hyperactivity Disorder • Charlotte Johnston, Sara Colalillo
5. Child and Adolescent Conduct Problems • Paul J. Frick, Robert J. McMahon
Part III: Mood Disorders and Self-Injury
6. Depression in Children and Adolescents • Lea R. Dougherty, Daniel N. Klein, Thomas M. Olino
7. Adult Depression • Jacqueline B. Persons, David M. Fresco, Juliet Small Ernst
8. Depression in Late Life • Amy Fiske, Alisa O’Riley Hannum
9. Bipolar Disorder • Sheri L. Johnson, Christopher Miller, Lori Eisner
10. Self-Injurious Thoughts and Behaviors • Alexander J. Millner, Matthew K. Nock
Part IV: Anxiety and Related Disorders
11. Anxiety Disorders in Children and Adolescents • Simon P. Byrne, Eli R. Lebowitz, Thomas H. Ollendick, Wendy K. Silverman
12. Specific Phobia and Social Anxiety Disorder • Karen Rowa, Randi E. McCabe, Martin M. Antony
13. Panic Disorder and Agoraphobia • Amy R. Sewart, Michelle G. Craske
14. Generalized Anxiety Disorder • Michel J. Dugas, Catherine A. Charette, Nicole J. Gervais
15. Obsessive–Compulsive Disorder • Shannon M. Blakey, Jonathan S. Abramowitz
16. Post-Traumatic Stress Disorder in Adults • Samantha J. Moshier, Kelly S. Parker-Guilbert, Brian P. Marx, Terence M. Keane
Part V: Substance-Related and Gambling Disorders
17. Substance Use Disorders • Damaris J. Rohsenow
18. Alcohol Use Disorder • Angela M. Haeny, Cassandra L. Boness, Yoanna E. McDowell, Kenneth J. Sher
19. Gambling Disorders • David C. Hodgins, Jennifer L. Swan, Randy Stinchfield
Part VI: Schizophrenia and Personality Disorders
20. Schizophrenia • Shirley M. Glynn, Kim T. Mueser
21. Personality Disorders • Stephanie L. Rojas, Thomas A. Widiger
Part VII: Couple Distress and Sexual Disorders
22. Couple Distress • Douglas K. Snyder, Richard E. Heyman, Stephen N. Haynes, Christina Balderrama-Durbin
23. Sexual Dysfunction • Natalie O. Rosen, Maria Glowacka, Marta Meana, Yitzchak M. Binik
Part VIII: Health-Related Problems
24. Eating Disorders • Robyn Sysko, Sara Alavi
25. Insomnia Disorder • Charles M. Morin, Simon Beaulieu-Bonneau, Kristin Maich, Colleen E. Carney
26. Child and Adolescent Pain • C. Meghan McMurtry, Patrick J. McGrath
27. Chronic Pain in Adults • Thomas Hadjistavropoulos, Natasha L. Gallant, Michelle M. Gagnon
Assessment Instrument Index
Author Index
Subject Index


A GUIDE TO ASSESSMENTS THAT WORK

Second Edition

EDITED BY

John Hunsley and Eric J. Mash


Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and certain other countries.

Published in the United States of America by Oxford University Press, 198 Madison Avenue, New York, NY 10016, United States of America.

© Oxford University Press 2018

First Edition published in 2008

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by license, or under terms agreed with the appropriate reproduction rights organization. Inquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above. You must not circulate this work in any other form and you must impose this same condition on any acquirer.

CIP data is on file at the Library of Congress

ISBN 978-0-19-049224-3

1 3 5 7 9 8 6 4 2

Printed by Sheridan Books, Inc., United States of America

Contents

Foreword to the First Edition by Peter E. Nathan  vii
Preface  xi
About the Editors  xv
Contributors  xvii

Part I: Introduction
1. Developing Criteria for Evidence-Based Assessment: An Introduction to Assessments That Work  3 • John Hunsley, Eric J. Mash
2. Dissemination and Implementation of Evidence-Based Assessment  17 • Amanda Jensen-Doss, Lucia M. Walsh, Vanesa Mora Ringle
3. Advances in Evidence-Based Assessment: Using Assessment to Improve Clinical Interventions and Outcomes  32 • Eric A. Youngstrom, Anna Van Meter

Part II: Attention-Deficit and Disruptive Behavior Disorders
4. Attention-Deficit/Hyperactivity Disorder  47 • Charlotte Johnston, Sara Colalillo
5. Child and Adolescent Conduct Problems  71 • Paul J. Frick, Robert J. McMahon

Part III: Mood Disorders and Self-Injury
6. Depression in Children and Adolescents  99 • Lea R. Dougherty, Daniel N. Klein, Thomas M. Olino
7. Adult Depression  131 • Jacqueline B. Persons, David M. Fresco, Juliet Small Ernst
8. Depression in Late Life  152 • Amy Fiske, Alisa O’Riley Hannum
9. Bipolar Disorder  173 • Sheri L. Johnson, Christopher Miller, Lori Eisner
10. Self-Injurious Thoughts and Behaviors  193 • Alexander J. Millner, Matthew K. Nock

Part IV: Anxiety and Related Disorders
11. Anxiety Disorders in Children and Adolescents  217 • Simon P. Byrne, Eli R. Lebowitz, Thomas H. Ollendick, Wendy K. Silverman
12. Specific Phobia and Social Anxiety Disorder  242 • Karen Rowa, Randi E. McCabe, Martin M. Antony
13. Panic Disorder and Agoraphobia  266 • Amy R. Sewart, Michelle G. Craske
14. Generalized Anxiety Disorder  293 • Michel J. Dugas, Catherine A. Charette, Nicole J. Gervais
15. Obsessive–Compulsive Disorder  311 • Shannon M. Blakey, Jonathan S. Abramowitz
16. Post-Traumatic Stress Disorder in Adults  329 • Samantha J. Moshier, Kelly S. Parker-Guilbert, Brian P. Marx, Terence M. Keane

Part V: Substance-Related and Gambling Disorders
17. Substance Use Disorders  359 • Damaris J. Rohsenow
18. Alcohol Use Disorder  381 • Angela M. Haeny, Cassandra L. Boness, Yoanna E. McDowell, Kenneth J. Sher
19. Gambling Disorders  412 • David C. Hodgins, Jennifer L. Swan, Randy Stinchfield

Part VI: Schizophrenia and Personality Disorders
20. Schizophrenia  435 • Shirley M. Glynn, Kim T. Mueser
21. Personality Disorders  464 • Stephanie L. Rojas, Thomas A. Widiger

Part VII: Couple Distress and Sexual Disorders
22. Couple Distress  489 • Douglas K. Snyder, Richard E. Heyman, Stephen N. Haynes, Christina Balderrama-Durbin
23. Sexual Dysfunction  515 • Natalie O. Rosen, Maria Glowacka, Marta Meana, Yitzchak M. Binik

Part VIII: Health-Related Problems
24. Eating Disorders  541 • Robyn Sysko, Sara Alavi
25. Insomnia Disorder  563 • Charles M. Morin, Simon Beaulieu-Bonneau, Kristin Maich, Colleen E. Carney
26. Child and Adolescent Pain  583 • C. Meghan McMurtry, Patrick J. McGrath
27. Chronic Pain in Adults  608 • Thomas Hadjistavropoulos, Natasha L. Gallant, Michelle M. Gagnon

Assessment Instrument Index  629
Author Index  639
Subject Index  721

Foreword to the First Edition

I believe A Guide to Assessments that Work is the right book at the right time by the right editors and authors. The mental health professions have been intensively engaged for a decade and a half and more in establishing empirically supported treatments. This effort has led to the publication of evidence-​based treatment guidelines by both the principal mental health professions, clinical psychology (Chambless & Ollendick, 2001; Division 12 Task Force, 1995), and psychiatry (American Psychiatric Association, 1993, 2006). A substantial number of books and articles on evidence-​ based treatments have also appeared. Notable among them is a series by Oxford University Press, the publishers of A Guide to Assessments that Work, which began with the first edition of A Guide to Treatments that Work (Nathan & Gorman, 1998), now in its third edition, and the series includes Psychotherapy Relationships that Work (Norcross, 2002)  and Principles of Therapeutic Change that Work (Castonguay & Beutler, 2006). Now we have an entire volume given over to evidence-​ based assessment. It doesn’t appear de novo. Over the past several years, its editors and like-​minded colleagues tested and evaluated an extensive series of guidelines for evidence-​based assessments for both adults and children (e.g., Hunsley & Mash, 2005; Mash & Hunsley, 2005). Many of this book’s chapter authors participated in these efforts. It might well be said, then, that John Hunsley, Eric Mash, and the chapter authors in A Guide to Assessments that Work are the right editors and authors for this, the first book to detail the assessment evidence base.

There is also much to admire within the pages of the volume. Each chapter follows a common format prescribed by the editors and designed, as they point out, “to enhance the accessibility of the material presented throughout the book.” First, the chapters are syndrome-​ focused, making it easy for clinicians who want help in assessing their patients to refer to the appropriate chapter or chapters. When they do so, they will find reviews of the assessment literature for three distinct purposes:  diagnosis, treatment planning, and treatment monitoring. Each of these reviews is subjected to a rigorous rating system that culminates in an overall evaluation of “the scientific adequacy and clinical relevance of currently available measures.” The chapters conclude with an overall assessment of the limits of the assessments available for the syndrome in question, along with suggestions for future steps to confront them. I believe it can well be said, then, that this is the right book by the right editors and authors. But is this the right time for this book? Evidence-​based treatments have been a focus of intense professional attention for many years. Why wouldn’t the right time for this book have been several years ago rather than now, to coincide with the development of empirically supported treatments? The answer, I think, reflects the surprisingly brief history of the evidence-​based medical practice movement. Despite lengthy concern for the efficacy of treatments for mental disorders that dates back more than 50  years (e.g., Eysenck, 1952; Lambert & Bergin, 1994; Luborsky, Singer, & Luborsky, 1976; Nathan, Stuart, & Dolan, 2000), it took the appearance of a Journal of the
American Medical Association article in the early 1990s advocating evidence-based medical practice over medicine as an art to mobilize mental health professionals to achieve the same goals for treatments for mental disorders. The JAMA article “ignited a debate about power, ethics, and responsibility in medicine that is now threatening to radically change the experience of health care” (Patterson, 2002). This effort resonated widely within the mental health community, giving impetus to the efforts of psychologists and psychiatrists to base treatment decisions on valid empirical data. Psychologists had long questioned the uncertain reliability and utility of certain psychological tests, even though psychological testing was what many psychologists spent much of their time doing. At the same time, the urgency of efforts to heighten the support base for valid assessments was limited by continuing concerns over the efficacy of psychotherapy, for which many assessments were done. Not surprisingly, then, when empirical support for psychological treatments began to emerge in the early and middle 1990s, professional and public support for psychological intervention grew. In turn, as psychotherapy’s worth became more widely recognized, the value of psychological assessments to help in the planning and evaluation of psychotherapy became increasingly recognized. If my view of this history is on target, the intense efforts that have culminated in this book could not have begun until psychotherapy’s evidence base had been established. That has happened only recently, after a lengthy process, and that is why I claim that the right time for this book is now. Who will use this book? I hope it will become a favorite text for graduate courses in assessment so that new generations of graduate students and their teachers will come to know which of the assessment procedures they are learning and teaching have strong empirical support. I also hope the book will become a resource for practitioners, including those who may not be used to choosing assessment instruments on the basis of their evidence base. To the extent that this book becomes as influential in clinical psychology as I hope it does, it should help precipitate a change in assessment test use patterns, with an increase in the utilization of tests with strong empirical support and a corresponding decrease in the use of tests without it. Even now, there are clinicians who use assessment instruments because they learned them in graduate school, rather than because there is strong evidence that they work. Now, a different and better standard is available. I am pleased the editors of this book foresee it providing an impetus for research on assessment instruments
that currently lack empirical support. I  agree. As with a number of psychotherapy approaches, there remain a number of understudied assessment instruments whose evidence base is currently too thin for them to be considered empirically supported. Like the editors, I believe we can anticipate enhanced efforts to establish the limits of usefulness of assessment instruments that haven’t yet been thoroughly explored. I also anticipate a good deal of fruitful discussion in the professional literature—​and likely additional research—​on the positions this book’s editors and authors have taken on the assessment instruments they have evaluated. I  suspect their ratings for “psychometric adequacy and clinical relevance” will be extensively critiqued and scrutinized. While the resultant dialogue might be energetic—​even indecorous on occasion—​as has been the dialogue surrounding the evidence base for some psychotherapies, I am hopeful it will also lead to more helpful evaluations of test instruments. Perhaps the most important empirical studies we might ultimately anticipate would be research indicating which assessment instruments lead both to valid diagnoses and useful treatment planning for specific syndromes. A  distant goal of syndromal diagnosis for psychopathology has always been diagnoses that bespeak effective treatments. If the system proposed in this volume leads to that desirable outcome, we could all celebrate. I congratulate John Hunsley and Eric Mash and their colleagues for letting us have this eagerly anticipated volume. Peter E. Nathan (1935–2016)

References

American Psychiatric Association. (1993). Practice guidelines for the treatment of major depressive disorder in adults. American Journal of Psychiatry, 150(4 Supplement), 1–26.
American Psychiatric Association. (2006). Practice guidelines for the treatment of psychiatric disorders: Compendium, 2006. Washington, DC: Author.
Castonguay, L. G., & Beutler, L. E. (2006). Principles of therapeutic change that work. New York: Oxford University Press.
Chambless, D. L., & Ollendick, T. H. (2001). Empirically supported psychological interventions: Controversies and evidence. In S. T. Fiske, D. L. Schacter, & C. Zahn-Waxler (Eds.), Annual review of psychology (Vol. 52, pp. 685–716). Palo Alto, CA: Annual Review.
Division 12 Task Force. (1995). Training in and dissemination of empirically-validated psychological treatments: Report and recommendations. The Clinical Psychologist, 48, 3–23.
Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.
Hunsley, J., & Mash, E. J. (Eds.). (2005). Developing guidelines for the evidence-based assessment (EBA) of adult disorders (special section). Psychological Assessment, 17(3).
Lambert, M. J., & Bergin, A. E. (1994). The effectiveness of psychotherapy. In S. L. Garfield & A. E. Bergin (Eds.), Handbook of psychotherapy and behavior change (4th ed., pp. 143–189). New York: Wiley.
Luborsky, L., Singer, B., & Luborsky, L. (1976). Comparative studies of psychotherapies: Is it true that “everybody has won and all must have prizes?” In R. L. Spitzer & D. F. Klein (Eds.), Evaluation of psychological therapies (pp. 3–22). Baltimore, MD: Johns Hopkins University Press.
Mash, E. J., & Hunsley, J. (Eds.). (2005). Developing guidelines for the evidence-based assessment of child and adolescent disorders (special section). Journal of Clinical Child and Adolescent Psychology, 34(3).
Nathan, P. E., & Gorman, J. M. (1998, 2002, 2007). A guide to treatments that work. New York: Oxford University Press.
Nathan, P. E., Stuart, S. P., & Dolan, S. L. (2000). Research on psychotherapy efficacy and effectiveness: Between Scylla and Charybdis? Psychological Bulletin, 126, 964–981.
Norcross, J. C. (Ed.). (2002). Psychotherapy relationships that work: Therapist contributions and responsiveness to patients. New York: Oxford University Press.
Patterson, K. (2002). What doctors don’t know (almost everything). New York Times Magazine, May 5, 74–77.

Preface

BACKGROUND

Evidence-​based practice principles in health care systems emphasize the importance of integrating information drawn from systematically collected data, clinical expertise, and patient preferences when considering health care service options for patients (Institute of Medicine, 2001; Sackett, Rosenberg, Gray, Haynes, & Richardson, 1996). These principles are a driving force in most health care systems and have been endorsed as a necessary foundation for the provision of professional psychological services (American Psychological Association Presidential Task Force on Evidence-​Based Practice, 2006; Dozois et al., 2014). As psychologists, it is difficult for us to imagine how any type of health care service, including psychological services, can be provided to children, adolescents, adults, couples, or families without using some type of informal or formal assessment methods. Nevertheless, until relatively recently, there was an almost exclusive focus on issues related to developing, disseminating, and providing evidence-​based interventions, with only cursory acknowledgment of the role that evidence-​based assessment (EBA) activities play in the promotion of evidence-​ based services. Fortunately, much has changed with respect to EBA since the publication of the first edition of this volume in 2008. A growing number of publications are now available in the scientific literature that address the importance of solid assessment instruments and methods. Special sections on EBA have been published in recent issues of top

clinical psychology journals (e.g., Arbisi & Beck, 2016; Jensen-​Doss, 2015). The evidence base for the value of monitoring treatment progress has increased substantially, as have calls for the assessment of treatment progress to become standard practice (e.g., Lambert, 2017). There is also mounting evidence for assessment as a key component for engaging clients in effective mental health services (Becker, Boustani, Gellatly, & Chorpita, 2017). Unfortunately, some long-​ standing problems evident in the realm of psychological assessment remain. Many researchers continue to ignore the importance of evaluating the reliability of the assessment data obtained from their study participants (e.g., Vacha-​Haase & Thompson, 2011). Despite the demonstrated impact of treatment monitoring, relatively few clinicians systematically and routinely assess the treatment progress of their clients (Ionita & Fitzpatrick, 2014), although it appears that students in professional psychology programs are receiving more training in these assessment procedures than was the case in the past (e.g., Overington, Fitzpatrick, Hunsley, & Drapeau, 2015). All in all, though, when viewed from the vantage point of the early years of the 21st century, it does seem that steady progress is being made with respect to EBA. As was the case with the first edition, the present volume was designed to complement the books published by Oxford University Press that focus on bringing the best of psychological science to bear on questions of clinical importance. These volumes, A Guide to Treatments that Work (Nathan & Gorman, 2015) and Psychotherapy
Relationships that Work (Norcross, 2011), address intervention issues; the present volume specifically addresses the role of assessment in providing evidence-​based services. Our primary goal for the book was to have it address the needs of professionals providing psychological services and those training to provide such services. A secondary goal was to provide guidance to researchers on scientifically supported assessment tools that could be used for both psychopathology research and treatment research purposes. Relatedly, we hope that the summary tables provided in each chapter will provide some inspiration for assessment researchers to try to (a)  develop instruments for specific assessment purposes and disorders for which, currently, few good options exist and (b) expand our limited knowledge base on the clinical utility of our assessment instruments.

ORGANIZATION

All chapters and tables in the second edition have been revised and updated by our expert authors to reflect recent developments in the field, including the publication of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013). For the most part, the general coverage and organization of the first edition, which our readers found useful, have been retained in the second edition. Consistent with a growing developmental psychopathology perspective in the field, the scope of some chapters has expanded in order to provide more coverage of assessment issues across the lifespan (e.g., attention-deficit/hyperactivity disorder in adults). The most important changes in organization involve the addition of two new chapters, one dealing with the dissemination and implementation of EBA (Chapter 2) and the other dealing with new developments in EBA (Chapter 3). The contents of these chapters highlight both the important contributions that assessment can make to the provision of psychological services and the challenges that mental health professionals face in implementing cost-effective and scientifically sound assessment strategies.

Consistent with evidence-based psychology and evidence-based medicine, the majority of the chapters in this volume are organized around specific disorders or conditions. Although we recognize that some clients do not have clearly defined or diagnosable problems, the vast majority of people seeking psychological services do have identifiable diagnoses or conditions. Accurately assessing these disorders and conditions is a prerequisite to (a) understanding the patient’s or client’s needs and (b) accessing the scientific literature on evidence-based treatment options. We also recognize that many patients or clients will present with multiple problems; to that end, the reader will find frequent references within a chapter to the assessment of common co-occurring problems that are addressed in other chapters in the volume. To be optimally useful to potential readers, we have included chapters that deal with the assessment of the most commonly encountered disorders or conditions among children, adolescents, adults, older adults, and couples. Ideally, we want readers to come away from each chapter with a sense of the best scientific assessment options that are clinically feasible and useful. To help accomplish this, we were extremely fortunate to be able to assemble a stellar group of contributors for this volume. The authors are all active contributors to the scientific literature on assessment and share a commitment to the provision of EBA and treatment services.

To enhance the accessibility of the material presented throughout the book, we asked the authors, as much as possible, to follow a common structure in writing their chapters. Without being a straitjacket, we expected the authors to use these guidelines in a flexible manner that allowed for the best possible presentation of assessment work relevant to each disorder or clinical condition. The chapter format generally used throughout the volume is as follows:

Introduction: A brief overview of the chapter content.

Nature of the Disorder/Condition: This section includes information on (a) general diagnostic considerations, such as prevalence, incidence, prognosis, and common comorbid conditions; (b) evidence on etiology; and (c) contextual information such as relational and social functioning and other associated features.

Purposes of Assessment: To make the book as clinically relevant as possible, authors were asked to focus their review of the assessment literature on three specific assessment purposes: (a) diagnosis, (b) case conceptualization and treatment planning, and (c) treatment monitoring and evaluation. We fully realize the clinical and research importance of other assessment purposes but, rather than attempting to provide a compendium of assessment measures and strategies, we wanted authors to target these three key clinical assessment purposes. We also asked authors to consider ways in which age, gender, ethnicity, and other relevant characteristics may influence both the assessment measures and the process of assessment for the disorder/condition.

For each of the three main sections devoted to specific assessment purposes, authors were asked to focus on assessment measures and strategies that either have demonstrated their utility in clinical settings or have a substantial likelihood of being clinically useful. Authors were encouraged to consider the full range of relevant assessment methods (interviews, self-report, observation, performance tasks, computer-based methods, physiological, etc.), but both scientific evidence and clinical feasibility were to be used to guide decisions about methods to include.

Assessment for Diagnosis: This section deals with assessment measures and strategies used specifically for formulating a diagnosis. Authors were asked to focus on best practices and were encouraged to comment on important conceptual and practical issues in diagnosis and differential diagnosis.

Assessment for Case Conceptualization and Treatment Planning: This section presents assessment measures and strategies used to augment diagnostic information to yield a full psychological case conceptualization that can be used to guide decisions on treatment planning. Specifically, this section addresses the domains that the research literature indicates should be covered in an EBA to develop (a) a clinically meaningful and useful case conceptualization and (b) a clinically sensitive and feasible service/treatment plan (which may or may not include the involvement of other professionals).

Assessment for Treatment Monitoring and Treatment Outcome: In this third section, assessment measures and strategies were reviewed that can be used to (a) track the progress of treatment and (b) evaluate the overall effect of treatment on symptoms, diagnosis, and general functioning. Consistent with the underlying thrust of the volume, the emphasis is on assessment options that have supporting empirical evidence.

Within each of the three assessment sections, standard tables are used to provide summary information about the psychometric characteristics of relevant instruments. Rather than provide extensive psychometric details in the text, authors were asked to use these rating tables to convey information on the psychometric adequacy of instruments. To enhance the utility of these tables, rather than presenting lists of specific psychometric values for each assessment tool, authors were asked to make global ratings of the quality of the various psychometric indices (e.g., norms, internal reliability, and construct validity) as indicated by extant research. Details on the rating system used by the authors are presented in the introductory chapter. Our goal is to have these tables serve as valuable summaries for readers. In addition, by using the tables to present psychometric information, the authors were able to focus their chapters on both conceptual and practical issues without having to make frequent detours to discuss psychometrics.

At the conclusion of each of these three main sections there is a subsection titled Overall Evaluation that includes concise summary statements about the scientific adequacy and clinical relevance of currently available measures. This is where authors comment on the evidence (if any) for the scientific value of following the assessment guidance they have provided.

Conclusions and Future Directions: This final section in each chapter provides an overall sense of the scope and adequacy of the assessment options available for the disorder/condition, the limitations associated with these options, and possible future steps that could be taken to remedy these limitations. Some authors also used this section to raise issues related to the challenges involved in trying to ensure that clinical decision-making processes underlying the assessment process (and not just the assessment measures themselves) are scientifically sound.

ACKNOWLEDGMENTS

To begin with, we express our gratitude to the authors. They diligently reviewed and summarized often-​ voluminous assessment literatures and then presented this information in a clinically informed and accessible manner. The authors also worked hard to implement the guidelines we provided for both chapter structure and the ratings of various psychometric characteristics. Their efforts in constructing their chapters are admirable, and the resulting chapters consistently provide invaluable clinical guidance. We also thank Sarah Harrington, Senior Editor for clinical psychology at Oxford University Press, for her continued interest in the topic and her ongoing support for the book. We greatly appreciate her enthusiasm and her efficiency throughout the process of developing and producing this second edition. We are also indebted to Andrea Zekus, Editor at Oxford University Press, who helped us with the process of assembling the book from start to finish. Her assistance with the myriad issues associated with the publication process and her rapid response to queries was invaluable. Finally, we thank all the colleagues and contributors to the psychological assessment and measurement literatures who, over the years, have shaped our thinking about assessment issues. We are especially appreciative of the input from those colleagues who have discussed with us the host of problems, concerns, challenges, and promises associated with efforts to promote greater awareness of the need for EBA within professional psychology.

References

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.
American Psychological Association Presidential Task Force on Evidence-Based Practice. (2006). Evidence-based practice in psychology. American Psychologist, 61, 271–285.
Arbisi, P. A., & Beck, J. G. (2016). Introduction to the special series “Empirically Supported Assessment.” Clinical Psychology: Science and Practice, 23, 323–326.
Becker, K. D., Boustani, M., Gellatly, R., & Chorpita, B. F. (2017). Forty years of engagement research in children’s mental health services: Multidimensional measurement and practice elements. Journal of Clinical Child & Adolescent Psychology. Advance online publication.
Dozois, D. J. A., Mikail, S., Alden, L. E., Bieling, P. J., Bourgon, G., Clark, D. A., . . . Johnston, C. (2014). The CPA Presidential Task Force on Evidence-Based Practice of Psychological Treatments. Canadian Psychology, 55, 153–160.
Institute of Medicine. (2001). Crossing the quality chasm: A new health system for the 21st century. Washington, DC: National Academies Press.
Ionita, F., & Fitzpatrick, M. (2014). Bringing science to clinical practice: A Canadian survey of psychological practice and usage of progress monitoring measures. Canadian Psychology, 55, 187–196.
Jensen-Doss, A. (2015). Practical, evidence-based clinical decision making: Introduction to the special series. Cognitive and Behavioral Practice, 22, 1–4.
Lambert, M. J. (2017). Maximizing psychotherapy outcome beyond evidence-based medicine. Psychotherapy and Psychosomatics, 86, 80–89.
Nathan, P. E., & Gorman, J. M. (Eds.). (2015). A guide to treatments that work (4th ed.). New York, NY: Oxford University Press.
Norcross, J. C. (Ed.). (2011). Psychotherapy relationships that work: Evidence-based responsiveness (2nd ed.). New York, NY: Oxford University Press.
Overington, L., Fitzpatrick, M., Hunsley, J., & Drapeau, M. (2015). Trainees’ experiences using progress monitoring measures. Training and Education in Professional Psychology, 9, 202–209.
Sackett, D. L., Rosenberg, W. M. C., Gray, J. A. M., Haynes, R. B., & Richardson, W. S. (1996). Evidence based medicine: What it is and what it is not. British Medical Journal, 312, 71–72.
Vacha-Haase, T., & Thompson, B. (2011). Score reliability: A retrospective look back at 12 years of reliability generalization studies. Measurement and Evaluation in Counseling and Development, 44, 159–168.

About the Editors

John Hunsley, PhD, is Professor of Psychology in the School of Psychology at the University of Ottawa and is a Fellow of the Association of State and Provincial Psychology Boards and the Canadian Psychological Association. He has served as a journal editor, an editorial board member for several journals, and an editorial consultant for many journals in psychology. He has published more than 130 articles, chapters, and books related to evidence-​based psychological practice, psychological assessment, and professional issues.

Eric J.  Mash, PhD, is Professor Emeritus in the Department of Psychology at the University of Calgary. He is a Fellow of the American Psychological Association, the Canadian Psychological Association, and the American Psychological Society. He has served as an editor, editorial board member, and consultant for many scientific and professional journals and has written and edited many books and journal articles related to child and adolescent mental health, assessment, and treatment.

Contributors

Jonathan S. Abramowitz,  PhD: Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina Sara Alavi: Eating and Weight Disorders Program, Icahn School of Medicine at Mt. Sinai, New York, New York

Simon P. Byrne,  PhD: Yale Child Study Center, Yale School of Medicine, New Haven, Connecticut Colleen E. Carney,  PhD: Department of Psychology, Ryerson University, Toronto, Ontario, Canada

Martin M.  Antony, PhD: Department of Psychology, Ryerson University, Toronto, Ontario, Canada

Catherine A. Charette: Département de psychoéducation et de psychologie, Université du Québec en Outaouais, Gatineau, Quebec, Canada

Christina Balderrama-​Durbin,  PhD: Department of Psychology, Binghamton University—State University of New York, Binghamton, New York

Sara Colalillo, MA: Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada

Simon Beaulieu-​Bonneau, PhD: École de psychologie, Université Laval, Quebec City, Quebec, Canada

Michelle G. Craske,  PhD: Department of Psychology, University of California at Los Angeles, Los Angeles, California

Yitzchak M. Binik,  PhD: Department of Psychology, McGill University, Montreal, Quebec, Canada Shannon M. Blakey, MS: Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina Cassandra L. Boness, MA: Department of Psychological Sciences, University of Missouri, Columbia, Missouri

Lea R. Dougherty,  PhD: Department of Psychology, University of Maryland, College Park, Maryland Michel J. Dugas, PhD: Département de psychoéducation et de psychologie, Université du Québec en Outaouais, Gatineau, Québec, Canada Lori Eisner, PhD: Needham Psychotherapy Associates, LLC


Juliet Small  Ernst: Cognitive Behavior Therapy and Science Center, Oakland, California

Richard E. Heyman, PhD: Family Translational Research Group, New York University, New York, New York

Amy Fiske,  PhD: Department of Psychology, West Virginia University, Morgantown, West Virginia

David C. Hodgins,  PhD: Department of Psychology, University of Calgary, Calgary, Alberta, Canada

David M. Fresco,  PhD: Department of Psychological Sciences, Kent State University, Kent, Ohio; Department of Psychiatry, Case Western Reserve University School of Medicine, Cleveland, Ohio

John Hunsley, PhD: School of Psychology, University of Ottawa, Ottawa, Ontario, Canada

Paul J. Frick, PhD: Department of Psychology, Louisiana State University, Baton Rouge, Louisiana; Learning Sciences Institute of Australia; Australian Catholic University; Brisbane, Australia Michelle M. Gagnon,  PhD: Department of Psychology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada

Amanda Jensen-​Doss, PhD: Department of Psychology, University of Miami, Coral Gables, Florida Sheri L. Johnson,  PhD: Department of Psychology, University of California Berkeley, Berkeley, California Charlotte Johnston,  PhD: Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada

Natasha L.  Gallant,  MA: Department of Psychology, University of Regina, Regina, Saskatchewan, Canada

Terence M. Keane, PhD: VA Boston Healthcare System, National Center for Posttraumatic Stress Disorder, and Boston University School of Medicine, Boston, Massachusetts

Nicole J. Gervais,  PhD: Department of Psychology, University of Toronto, Toronto, Ontario, Canada

Daniel N. Klein, PhD: Department of Psychology, Stony Brook University, Stony Brook, New York

Maria Glowacka: Department of Psychology and Neuroscience, Dalhousie University, Halifax, Nova Scotia, Canada

Eli R. Lebowitz,  PhD: Yale Child Study Center, Yale School of Medicine, New Haven, Connecticut

Shirley M. Glynn,  PhD: VA Greater Los Angeles Healthcare System and UCLA Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, Los Angeles, California Thomas Hadjistavropoulos,  PhD: Department of Psychology, University of Regina, Regina, Saskatchewan, Canada Angela M. Haeny,  MA: Department of Psychological Sciences, University of Missouri, Columbia, Missouri

Kristin Maich, MA: Department of Psychology, Ryerson University, Toronto, Ontario, Canada Brian P. Marx,  PhD: VA Boston Healthcare System, National Center for Posttraumatic Stress Disorder, and Boston University School of Medicine, Boston, Massachusetts Eric J. Mash, PhD: Department of Psychology, University of Calgary, Calgary, Alberta, Canada

Alisa O’Riley Hannum, PhD, ABPP: VA Eastern Colorado Healthcare System, Denver, Colorado

Randi E. McCabe,  PhD: Anxiety Treatment and Research Clinic, St. Joseph’s Healthcare, Hamilton, and Department of Psychiatry and Behavioral Neurosciences, McMaster University, Hamilton, Ontario, Canada

Stephen N. Haynes, PhD: Department of Psychology, University of Hawai’i at Mānoa, Honolulu, Hawaii

Yoanna E. McDowell, MA: Department of Psychological Sciences, University of Missouri, Columbia, Missouri


Patrick J. McGrath,  PhD: Centre for Pediatric Pain Research, IWK Health Centre; Departments of Psychiatry, Pediatrics and Community Health & Epidemiology, Dalhousie University; Halifax, Nova Scotia, Canada Robert J. McMahon,  PhD: Department of Psychology, Simon Fraser University, Burnaby, British Columbia,  Canada; BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada C. Meghan McMurtry, PhD: Department of Psychology, University of Guelph, Guelph; Pediatric Chronic Pain Program, McMaster Children’s Hospital, Hamilton; Department of Paediatrics, Schulich School of Medicine & Dentistry, Western University, London; Ontario, Canada Marta Meana,  PhD: Department of Psychology, University of Nevada Las Vegas, Las Vegas, Nevada Christopher Miller, PhD: VA Boston Healthcare System, Center for Healthcare Organization and Implementation Research, and Harvard Medical School Department of Psychiatry, Boston, Massachusetts Alexander J. Millner, PhD: Department of Psychology, Harvard University, Cambridge, Massachusetts Charles M. Morin,  PhD: École de psychologie, Université Laval, Quebec City, Quebec, Canada Samantha J. Moshier, PhD: VA Boston Healthcare System and Boston University School of Medicine, Boston, Massachusetts Kim T. Mueser,  PhD: Center for Psychiatric Rehabilitation and Departments of Occupational Therapy, Psychological and Brain Sciences, and Psychiatry, Boston University, Boston, Massachusetts


Thomas H. Ollendick, PhD: Department of Psychology, Virginia Polytechnic Institute and State University, Blacksburg, Virginia Kelly S. Parker-​Guilbert, PhD: Psychology Department, Bowdoin College, Brunswick, ME and VA Boston Healthcare System, Boston, Massachusetts Jacqueline B. Persons,  PhD: Cognitive Behavior Therapy and Science Center, Oakland, California and Department of Psychology, University of California at Berkeley, Berkeley, California Vanesa Mora Ringle: Department of Psychology, University of Miami, Coral Gables, Florida Damaris J. Rohsenow,  PhD: Center for Alcohol and Addiction Studies, Brown University, Providence, Rhode Island Stephanie L.  Rojas, MA: Department of Psychology, University of Kentucky, Lexington, Kentucky Natalie O. Rosen, PhD: Department of Psychology and Neuroscience, Dalhousie University, Halifax, Nova Scotia, Canada Karen Rowa,  PhD: Anxiety Treatment and Research Clinic, St. Joseph’s Healthcare, Hamilton, and Department of Psychiatry and Behavioral Neurosciences, McMaster University, Hamilton, Ontario, Canada Amy R. Sewart, MA: Department of Psychology, University of California Los Angeles, Los Angeles, California Kenneth J. Sher,  PhD: Department of Psychological Sciences, University of Missouri, Columbia, Missouri Wendy K. Silverman,  PhD: Yale Child Study Center, Yale School of Medicine, New Haven, Connecticut

Matthew K. Nock,  PhD: Department of Psychology, Harvard University, Cambridge, Massachusetts

Douglas K. Snyder,  PhD: Department of Psychology, Texas A&M University, College Station, Texas

Thomas M. Olino,  PhD: Department of Psychology, Temple University, Philadelphia, Pennsylvania

Randy Stinchfield,  PhD: Department of Psychiatry, University of Minnesota, Minneapolis, Minnesota


Jennifer L. Swan: Department of Psychology, University of Calgary, Calgary, Alberta, Canada

Lucia M. Walsh: Department of Psychology, University of Miami, Coral Gables, Florida

Robyn Sysko,  PhD: Eating and Weight Disorders Program, Icahn School of Medicine at Mt. Sinai, New York, New York

Thomas A. Widiger,  PhD: Department of Psychology, University of Kentucky, Lexington, Kentucky

Anna Van Meter,  PhD: Ferkauf Graduate School of Psychology, Yeshiva University, New York, New York

Eric A. Youngstrom,  PhD: Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina

Part I

Introduction

1. Developing Criteria for Evidence-Based Assessment: An Introduction to Assessments That Work

John Hunsley
Eric J. Mash

For many professional psychologists, assessment is viewed as a unique and defining feature of their expertise (Krishnamurthy et al., 2004). Historically, careful attention to both conceptual and pragmatic issues related to measurement has served as the cornerstone of psychological science. Within the realm of professional psychology, the ability to provide assessment and evaluation services is typically seen as a required core competency. Indeed, assessment services are such an integral component of psychological practice that their value is rarely questioned but, rather, is typically assumed. However, solid evidence to support the usefulness of psychological assessment is lacking, and many commonly used clinical assessment methods and instruments are not supported by scientific evidence (e.g., Hunsley, Lee, Wood, & Taylor, 2015; Hunsley & Mash, 2007; Norcross, Koocher, & Garofalo, 2006). Indeed, Peterson’s (2004) conclusion from more than a decade ago is, unfortunately, still frequently true: “For many of the most important inferences professional psychologists have to make, practitioners appear to be forever dependent on incorrigibly fallible interviews and unavoidably selective, reactive observations as primary sources of data” (p. 202). Furthermore, despite the current emphasis on evidence-based practice, professional psychologists report that the least common purpose for which they use assessment is to monitor treatment progress (Wright et al., 2017). In this era of evidence-based health care practices, the need for scientifically sound assessment methods and instruments is greater than ever (Barlow, 2005). Assessment is the key to the accurate identification of clients’ problems and strengths. Whether construed as individual client monitoring, ongoing quality assurance efforts, or program evaluation, assessment is central to efforts to gauge the impact of health care services provided to ameliorate these problems (Brown, Scholle, & Azur, 2014; Hermann, Chan, Zazzali, & Lerner, 2006). Furthermore, the increasing availability of research-derived treatment benchmarks holds out great promise for providing clinicians with meaningful and attainable targets for their intervention services (Lee, Horvath, & Hunsley, 2013; Spilka & Dobson, 2015). Importantly, statements about evidence-based practice and best-practice guidelines have begun to specifically incorporate the critical role of assessment in the provision of evidence-based services (e.g., Dozois et al., 2014). Indeed, because the identification and implementation of evidence-based treatments rests entirely on the data provided by assessment tools, ignoring the quality of these tools places the whole evidence-based enterprise in jeopardy.

DEFINING EVIDENCE-​BASED ASSESSMENT

There are three critical aspects that should define evidence-based assessment (EBA; Hunsley & Mash, 2007; Mash & Hunsley, 2005). First, research findings and scientifically supported theories on both psychopathology and normal human development should be used to guide the selection of constructs to be assessed and the assessment process. As Barlow (2005) suggested,
EBA measures and strategies should also be designed to be integrated into interventions that have been shown to work with the disorders or conditions that are targeted in the assessment. Therefore, while recognizing that most disorders do not come in clearly delineated neat packages, and that comorbidity is often the rule rather than the exception, we view EBAs as being disorder-​or problem-​specific. A problem-​specific approach is consistent with how most assessment and treatment research is conducted and would facilitate the integration of EBA into evidence-​ based treatments (cf. Mash & Barkley, 2007; Mash & Hunsley, 2007; Weisz & Kazdin, 2017). This approach is also congruent with the emerging trend toward personalized assessment and treatment (e.g., Fisher, 2015; Ng & Weisz, 2016; Sales & Alves, 2016; Seidman et al., 2010; Thompson-​Hollands, Sauer-​ Zavala, & Barlow, 2014). Although formal diagnostic systems provide a frequently used alternative for framing the range of disorders and problems to be considered, commonly experienced emotional and relational problems, such as excessive anger, loneliness, conflictual relationships, and other specific impairments that may occur in the absence of a diagnosable disorder, may also be the focus of EBAs. Even when diagnostic systems are used as the framework for the assessment, clinicians need to consider both (a) the potential value of emerging transdiagnostic approaches to treatment (Newby, McKinnon, Kuyken, Gilbody, & Dalgleish, 2015) and (b) that a narrow focus on assessing symptoms and symptom reduction is insufficient for treatment planning and treatment evaluation purposes (cf. Kazdin, 2003). Many assessments are conducted to identify the precise nature of the person’s problem(s). It is, therefore, necessary to conceptualize multiple, interdependent stages in the assessment process, with each iteration of the process becoming less general in nature and increasingly problem-​specific with further assessment (Mash & Terdal, 1997). In addition, for some generic assessment strategies, there may be research to indicate that the strategy is evidence-​based without being problem-​specific. Examples of this include functional assessments (Hurl, Wightman, Haynes, & Virues-​Ortega, 2016) and treatment progress monitoring systems (e.g., Lambert, 2015). A second requirement is that, whenever possible, psychometrically strong measures should be used to assess the constructs targeted in the assessment. The measures should have evidence of reliability, validity, and clinical utility. They should also possess appropriate norms for norm-​ referenced interpretation and/​or replicated supporting evidence for the
accuracy (sensitivity, specificity, predictive power, etc.) of cut-​scores for criterion-​referenced interpretation (cf. Achenbach, 2005). Furthermore, there should be supporting evidence to indicate that the EBAs are sensitive to key characteristics of the individual(s) being assessed, including characteristics such as age, gender, ethnicity, and culture (e.g., Ivanova et  al., 2015). Given the range of purposes for which assessment instruments can be used (i.e., screening, diagnosis, prognosis, case conceptualization, treatment formulation, treatment monitoring, and treatment evaluation) and the fact that psychometric evidence is always conditional (based on sample characteristics and assessment purpose), supporting psychometric evidence must be considered for each purpose for which an instrument or assessment strategy is used. Thus, general discussions concerning the relative merits of information obtained via different assessment methods have little meaning outside of the assessment purpose and context. Similarly, not all psychometric elements are relevant to all assessment purposes. The group of validity statistics that includes specificity, sensitivity, positive predictive power, and negative predictive power is particularly relevant for diagnostic and prognostic assessment purposes and contains essential information for any measure that is intended to be used for screening purposes (Hsu, 2002). Such validity statistics may have little relevance, however, for many methods intended to be used for treatment monitoring and/​or evaluation purposes; for these purposes, sensitivity to change is a much more salient psychometric feature (e.g., Vermeersch, Lambert, & Burlingame, 2000). Finally, even with data from psychometrically strong measures, the assessment process is inherently a decision-​ making task in which the clinician must iteratively formulate and test hypotheses by integrating data that are often incomplete or inconsistent. Thus, a truly evidence-​based approach to assessment would involve an evaluation of the accuracy and usefulness of this complex decision-​making task in light of potential errors in data synthesis and interpretation, the costs associated with the assessment process, and, ultimately, the impact that the assessment had on clinical outcomes. There are an increasing number of illustrations of how assessments can be conducted in an evidence-​ based manner (e.g., Christon, McLeod, & Jensen-​Doss, 2015; Youngstrom, Choukas-​ Bradley, Calhoun, & Jensen-​Doss, 2015). These provide invaluable guides for clinicians and provide a preliminary framework that could lead to the eventual empirical evaluation of EBA processes.
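
To make these statistics concrete, here is a brief, purely illustrative Python sketch; it is not part of the chapter, and the counts, variable names, and cut-score scenario are invented rather than drawn from any instrument reviewed in this volume. It shows how sensitivity, specificity, and the two predictive power values are computed from a 2 x 2 cross-classification of screening decisions against a reference-standard diagnosis.

# Hypothetical counts for a screening cut-score compared against a
# reference-standard diagnosis; the numbers are invented for illustration.
true_pos, false_neg = 40, 10    # individuals with the disorder: screen positive / screen negative
false_pos, true_neg = 20, 130   # individuals without the disorder: screen positive / screen negative

sensitivity = true_pos / (true_pos + false_neg)   # proportion of true cases the cut-score detects
specificity = true_neg / (true_neg + false_pos)   # proportion of non-cases correctly screened out
ppv = true_pos / (true_pos + false_pos)           # positive predictive power
npv = true_neg / (true_neg + false_neg)           # negative predictive power

print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
print(f"positive predictive power = {ppv:.2f}, negative predictive power = {npv:.2f}")

Because positive and negative predictive power depend on the base rate of the disorder in the sample, the same cut-score can yield quite different predictive values across settings, one concrete reason why psychometric evidence is always conditional on the sample and the purpose of the assessment.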

FROM RESEARCH TO PRACTICE: USING A “GOOD-ENOUGH” PRINCIPLE

Perhaps the greatest single challenge facing efforts to develop and implement EBAs is determining how to start the process of operationalizing the criteria we just outlined. The assessment literature provides a veritable wealth of information that is potentially relevant to EBA; this very strength, however, is also a considerable liability, for the size of the literature is beyond voluminous. Not only is the literature vast in scope but also the scientific evaluation of assessment methods and instruments can be without end because there is no finite set of studies that can establish, once and for all, the psychometric properties of an instrument (Kazdin, 2005; Sechrest, 2005). On the other hand, every single day, clinicians must make decisions about what assessment tools to use in their practices, how best to use and combine the various forms of information they obtain in their assessment, and how to integrate assessment activities into other necessary aspects of clinical service. Moreover, the limited time available for service provision in clinical settings places an onus on using assessment options that are maximally accurate, efficient, and cost-​effective. Thus, above and beyond the scientific support that has been amassed for an instrument, clinicians require tools that are brief, clear, clinically feasible, and user-​friendly. In other words, they need instruments that have clinical utility and that are good enough to get the job done (Barlow, 2005; Lambert & Hawkins, 2004; Weisz, Krumholz, Santucci, Thomassin, & Ng, 2015; Youngstrom & Van Meter, 2016). As has been noted in the assessment literature, there are no clear, commonly accepted guidelines to aid clinicians or researchers in determining when an instrument has sufficient scientific evidence to warrant its use (Kazdin, 2005; Sechrest, 2005). The Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014) sets out generic standards to be followed in developing and using psychological instruments but is silent on the question of specific psychometric values that an instrument should have. The basic reason for this is that psychometric characteristics are not properties of an instrument per se but, rather, are properties of an instrument when used for a specific purpose with a specific sample. Quite understandably, therefore, assessment scholars, psychometricians, and test developers have been reluctant to explicitly indicate the minimum psychometric values or evidence necessary to indicate that an

5

instrument is scientifically sound (cf. Streiner, Norman, & Cairney, 2015). Unfortunately, this is of little aid to the clinicians and researchers who are constantly faced with the decision of whether an instrument is good enough, scientifically speaking, for the assessment task at hand. Prior to the psychometric criteria we set out in the first edition of this volume, there had been attempts to establish criteria for the selection and use of measures for research purposes. Robinson, Shaver, and Wrightsman (1991), for example, developed evaluative criteria for the adequacy of attitude and personality measures, covering the domains of theoretical development, item development, norms, inter-​ item correlations, internal consistency, test–​retest reliability, factor analytic results, known groups validity, convergent validity, discriminant validity, and freedom from response sets. Robinson and colleagues also used specific psychometric criteria for many of these domains, such as describing a coefficient α of .80 as exemplary. A different approach was taken by the Measurement and Treatment Research to Improve Cognition in Schizophrenia Group to develop a consensus battery of cognitive tests to be used in clinical trials in schizophrenia (Green et al., 2004). Rather than setting precise psychometric criteria for use in rating potential instruments, expert panelists were asked to rate, on a nine-​ point scale, each proposed tool’s characteristics, including test–​retest reliability, utility as a repeated measure, relation to functional outcome, responsiveness to treatment change, and practicality/​tolerability. An American Psychological Association Society of Pediatric Psychology task force used a fairly similar strategy. The task force efforts, published at approximately the same time as the first edition of this volume, focused on evaluating psychosocial assessment instruments that could be used in health care settings (Cohen et al., 2008). Instrument characteristics were reviewed by experts and, depending on the available empirical support, were evaluated as promising, approaching well-​established, or well-​established. These descriptors closely resembled those that had been used to identify empirically supported treatments. Clearly, any attempt to develop a method for determining the scientific adequacy of assessment instruments is fraught with the potential for error. The application of criteria that are too stringent could result in a solid set of assessment options, but one that is so limited in number or scope as to render the whole effort clinically worthless. Alternatively, using excessively lenient criteria could undermine the whole notion of an instrument or process being evidence based. So, with a clear awareness of this assessment equivalent of Scylla and Charybdis, a


decade ago we sought to construct a framework for the chapters included in the first edition of this volume that would employ good-​enough criteria for rating psychological instruments. In other words, rather than focusing on standards that define ideal criteria for a measure, our intent was to provide criteria that would indicate the minimum evidence that would be sufficient to warrant the use of a measure for specific clinical purposes. We assumed, from the outset, that although our framework is intended to be scientifically sound and defensible, it is a first step rather than the definitive effort in designing a rating system for evaluating psychometric adequacy. Our framework, described later, is unchanged from the first edition because there have been no developments in the measurement and assessment literatures that have caused us to reconsider our earlier position. Indeed, as we indicate in the following sections of this chapter, several critical developments have served to reinforce our views on the value of the framework. In brief, to operationalize the good-​enough principle, specific rating criteria are used across categories of psychometric properties that have clear clinical relevance; each category has rating options of adequate, good, and excellent. In the following sections, we describe the assessment purposes covered by our rating system, the psychometric properties included in the system, and the rationales for the rating options. The actual rating system, used in this volume by all authors of disorder/​problem-​oriented chapters to construct their summary tables of instruments, is presented in two boxes later in this chapter.

ASSESSMENT PURPOSES

Although psychological assessments are conducted for many reasons, it is possible to identify a small set of interrelated purposes that form the basis for most assessments. These include (a) diagnosis (i.e., determining the nature and/​or cause[s]‌of the presenting problems, which may or may not involve the use of a formal diagnostic or categorization system), (b) screening (i.e., identifying those who have or who are at risk for a particular problem and who might be helped by further assessment or intervention), (c) prognosis and other predictions (i.e., generating predictions about the course of the problems if left untreated, recommendations for possible courses of action to be considered, and their likely impact on the course of the problems), (d)  case conceptualization/​ formulation (i.e., developing a comprehensive and clinically relevant understanding of the client, generating hypotheses

regarding critical aspects of the client’s biopsychosocial functioning and context that are likely to influence the client’s adjustment), (e)  treatment design/​planning (i.e., selecting/​ developing and implementing interventions designed to address the client’s problems by focusing on elements identified in the diagnostic evaluation and the case conceptualization), (f)  treatment monitoring (i.e., tracking changes in symptoms, functioning, psychological characteristics, intermediate treatment goals, and/​or variables determined to cause or maintain the problems), and (g) treatment evaluation (i.e., determining the effectiveness, social validity, consumer satisfaction, and/​or cost-​ effectiveness of the intervention). The chapters in this volume provide summaries of the best assessment methods and instruments available for commonly encountered clinical assessment purposes. While recognizing the importance of other possible assessment purposes, chapters in this volume focus on (a)  diagnosis, (b)  case conceptualization and treatment planning, and (c)  treatment monitoring and treatment evaluation. Although separable in principle, the purposes of case conceptualization and treatment planning were combined because they tend to rely on the same assessment data. Similarly, the purposes of treatment monitoring and evaluation were combined because they often, but not exclusively, use the same assessment methods and instruments. Clearly, there are some overlapping elements, even in this set of purposes; for example, it is relatively common for the question of diagnosis to be revisited as part of evaluating the outcome of treatment. In the instrument summary tables that accompany each chapter, the psychometric strength of instruments used for these three main purposes are presented and rated. Within a chapter, the same instrument may be rated for more than one assessment purpose and thus appear in more than one table. Because an instrument may possess more empirical support for some purposes than for others, the ratings given for the instrument may not be the same in each of the tables. The chapters in this volume present information on the best available instruments for diagnosis, case conceptualization and treatment planning, and treatment monitoring and evaluation. They also provide details on clinically appropriate options for the range of data to collect, suggestions on how to address some of the challenges commonly encountered in conducting assessments, and suggestions for the assessment process. Consistent with the problem-​specific focus within EBA outlined previously, most chapters in this volume focus on one or more specific disorders or conditions. However, many clients


present with multiple problems and, therefore, there are frequent references within a given chapter to the assessment of common co-occurring problems that are addressed in other chapters in the volume. To be optimally useful to potential readers, the chapters are focused on the most commonly encountered disorders or conditions among children, adolescents, adults, older adults, and couples. With the specific focus, within each disorder or condition, on the three critical assessment purposes of diagnosis, case conceptualization and treatment planning, and treatment monitoring and treatment evaluation, the chapters in this volume provide readers with essential information for conducting the best EBAs currently possible.

PSYCHOMETRIC PROPERTIES AND RATING CRITERIA

Clinical assessment typically entails the use of both idiographic and nomothetic instruments. Idiographic measures are designed to assess unique aspects of a person’s experience and, therefore, to be useful in evaluating changes in these individually defined and constructed variables. In contrast, nomothetic measures are designed to assess constructs assumed to be relevant to all individuals and to facilitate comparisons, on these constructs, across people. Most chapters include information on idiographic measures such as self-​monitoring forms and individualized scales for measuring treatment goals. For such idiographic measures, psychometric characteristics such as reliability and validity may, at times, not be easily evaluated or even relevant (but see Weisz et al., 2011). It is crucial, however, that the same items and instructions are used across assessment occasions—​without this level of standardization it is impossible to accurately determine changes that may be due to treatment (Kazdin, 1993). The nine psychometric categories rated for the instruments in this volume are norms, internal consistency, inter-​rater reliability, test–​retest reliability, content validity, construct validity, validity generalization, sensitivity to treatment change, and clinical utility. Each of these categories is applied in relation to a specific assessment purpose (e.g., case conceptualization and treatment planning) in the context of a specific disorder or clinical condition (e.g., eating disorders, self-​injurious behavior, and relationship conflict). Consistent with our previous comments, factors such as gender, ethnicity, and age must be considered in making ratings within these categories. For each category, a rating of less than adequate, adequate, good, excellent, not reported, or not applicable was


possible. The precise nature of what constituted adequate, good, and excellent varied, of course, from category to category. In general, however, a rating of adequate indicated that the instrument meets a minimal level of scientific rigor; good indicated that the instrument would generally be viewed as possessing solid scientific support; and excellent indicated there was extensive, high-quality supporting evidence. Accordingly, a rating of less than adequate indicated that the instrument did not meet the minimum level set out in the criteria. A rating of not reported indicated that research on the psychometric property under consideration had not yet been conducted or published. A rating of not applicable indicated that the psychometric property under consideration was not relevant to the instrument (e.g., inter-rater reliability for a self-report symptom rating scale).

When considering the clinical use of a measure, it would be desirable to only use those measures that would meet, at a minimum, the criteria for good. However, because measure development is an ongoing process, the rating system provides the option of the adequate rating in order to fairly evaluate (a) relatively newly developed measures and (b) measures for which comparable levels of research evidence are not available across all psychometric categories in the rating system. In several chapters, authors explicitly commented on the status of some newly developed measures, but by and large, the only instruments included in chapter summary tables were those that had adequate or better ratings on the majority of the psychometric dimensions. Thus, the instruments presented in these tables represent only a subset of available assessment tools.

Despite the difficulty inherent in promulgating scientific criteria for psychometric properties, we believe that the potential benefits of fair and attainable criteria far outweigh the potential drawbacks (cf. Sechrest, 2005). Accordingly, reasoned arguments from respected psychometricians and assessment scholars, along with summaries of various assessment literatures, guided the selection of criteria for rating the psychometric properties associated with an instrument. Box 1.1 presents the criteria used in rating norms and reliability indices; Box 1.2 presents the criteria used in rating validity indices and clinical utility.


BOX 1.1  

Criteria at a Glance: Norms and Reliability

NORMS
Adequate = Measures of central tendency and distribution for the total score (and subscores if relevant) based on a large, relevant, clinical sample are available.
Good = Measures of central tendency and distribution for the total score (and subscores if relevant) based on several large, relevant samples (must include data from both clinical and nonclinical samples) are available.
Excellent = Measures of central tendency and distribution for the total score (and subscores if relevant) based on one or more large, representative samples (must include data from both clinical and nonclinical samples) are available.

INTERNAL CONSISTENCY
Adequate = Preponderance of evidence indicates α values of .70–.79.
Good = Preponderance of evidence indicates α values of .80–.89.
Excellent = Preponderance of evidence indicates α values ≥ .90.

INTER-RATER RELIABILITY
Adequate = Preponderance of evidence indicates κ values of .60–.74; the preponderance of evidence indicates Pearson correlation or intraclass correlation values of .70–.79.
Good = Preponderance of evidence indicates κ values of .75–.84; the preponderance of evidence indicates Pearson correlation or intraclass correlation values of .80–.89.
Excellent = Preponderance of evidence indicates κ values ≥ .85; the preponderance of evidence indicates Pearson correlation or intraclass correlation values ≥ .90.

TEST–RETEST RELIABILITY
Adequate = Preponderance of evidence indicates test–retest correlations of at least .70 over a period of several days to several weeks.
Good = Preponderance of evidence indicates test–retest correlations of at least .70 over a period of several months.
Excellent = Preponderance of evidence indicates test–retest correlations of at least .70 over a period of a year or longer.

BOX 1.2  

Criteria at a Glance: Validity and Utility

CONTENT VALIDITY
Adequate = The test developers clearly defined the domain of the construct being assessed and ensured that selected items were representative of the entire set of facets included in the domain.
Good = In addition to the criteria used for an adequate rating, all elements of the instrument (e.g., instructions and items) were evaluated by judges (e.g., by experts or by pilot research participants).
Excellent = In addition to the criteria used for a good rating, multiple groups of judges were employed and quantitative ratings were used by the judges.

CONSTRUCT VALIDITY
Adequate = Some independently replicated evidence of construct validity (e.g., predictive validity, concurrent validity, and convergent and discriminant validity).
Good = Preponderance of independently replicated evidence, across multiple types of validity (e.g., predictive validity, concurrent validity, and convergent and discriminant validity), is indicative of construct validity.
Excellent = In addition to the criteria used for a good rating, there is evidence of incremental validity with respect to other clinical data.

VALIDITY GENERALIZATION
Adequate = Some evidence supports the use of this instrument with either (a) more than one specific group (based on sociodemographic characteristics such as age, gender, and ethnicity) or (b) in multiple contexts (e.g., home, school, primary care setting, and inpatient setting).
Good = Preponderance of evidence supports the use of this instrument with either (a) more than one specific group (based on sociodemographic characteristics such as age, gender, and ethnicity) or (b) in multiple settings (e.g., home, school, primary care setting, and inpatient setting).
Excellent = Preponderance of evidence supports the use of this instrument with more than one specific group (based on sociodemographic characteristics such as age, gender, and ethnicity) and across multiple contexts (e.g., home, school, primary care setting, and inpatient setting).

TREATMENT SENSITIVITY
Adequate = Some evidence of sensitivity to change over the course of treatment.
Good = Preponderance of independently replicated evidence indicates sensitivity to change over the course of treatment.
Excellent = In addition to the criteria used for a good rating, evidence of sensitivity to change across different types of treatments.

CLINICAL UTILITY
Adequate = Taking into account practical considerations (e.g., costs, ease of administration, availability of administration and scoring instructions, duration of assessment, availability of relevant cutoff scores, and acceptability to clients), the resulting assessment data are likely to be clinically useful.
Good = In addition to the criteria used for an adequate rating, there is some published evidence that the use of the resulting assessment data confers a demonstrable clinical benefit (e.g., better treatment outcome, lower treatment attrition rates, and greater client satisfaction with services).
Excellent = In addition to the criteria used for an adequate rating, there is independently replicated published evidence that the use of the resulting assessment data confers a demonstrable clinical benefit.
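To make the numeric thresholds in Box 1.1 concrete, the following sketch (in Python with NumPy; purely illustrative and not part of the rating system itself) computes coefficient α for a small, hypothetical item-response matrix and maps α and inter-rater values onto the rating labels used in this volume. The data, function names, and example values are assumptions made for the illustration.

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Coefficient alpha for an (n_respondents x n_items) matrix of item scores."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]
    item_variances = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = item_scores.sum(axis=1).var(ddof=1)   # variance of the total score
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

def rate_internal_consistency(alpha: float) -> str:
    """Map an alpha value onto the Box 1.1 labels."""
    if alpha >= 0.90:
        return "excellent"
    if alpha >= 0.80:
        return "good"
    if alpha >= 0.70:
        return "adequate"
    return "less than adequate"

def rate_inter_rater(kappa: float = None, r: float = None) -> str:
    """Map a kappa value, or a Pearson/intraclass correlation, onto the Box 1.1 labels."""
    if kappa is not None:
        cut_points = [(0.85, "excellent"), (0.75, "good"), (0.60, "adequate")]
        value = kappa
    else:
        cut_points = [(0.90, "excellent"), (0.80, "good"), (0.70, "adequate")]
        value = r
    for cut, label in cut_points:
        if value >= cut:
            return label
    return "less than adequate"

# Hypothetical responses from six clients to a four-item symptom scale
scores = np.array([[3, 2, 3, 3],
                   [1, 1, 2, 1],
                   [4, 3, 4, 4],
                   [2, 2, 2, 3],
                   [3, 3, 3, 2],
                   [0, 1, 1, 0]])
alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f} ({rate_internal_consistency(alpha)})")
print(rate_inter_rater(kappa=0.78))  # falls in the .75-.84 band, so "good"
```

Note that the same .70, .80, and .90 bands recur for internal consistency and for Pearson or intraclass inter-rater values, whereas κ uses the lower cut points listed in Box 1.1.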

Norms

When using a standardized, nomothetically based instrument, it is essential that norms, specific criterion-related cutoff scores, or both are available to aid in the accurate interpretation of a client's test score (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014). For example, norms can be used to determine the client's pre- and post-treatment levels of functioning and to evaluate whether any change in functioning is clinically meaningful (Achenbach, 2001; Kendall, Marrs-Garcia, Nath, & Sheldrick, 1999). Selecting the target population(s) for the norms and then ensuring that the norms are adequate can be difficult tasks, and several sets of norms may be required for a measure. One set of norms may be needed to determine the meaning of the obtained score relative to the general population, whereas a different set of norms could be used to compare the score to specific subgroups within the population (Cicchetti, 1994).
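As one illustration of how norms support such judgments, the brief sketch below converts hypothetical pre- and post-treatment raw scores to T scores using an assumed nonclinical normative mean and standard deviation, then checks whether the post-treatment score has returned to within one standard deviation of the normative mean. The normative values and the T < 60 cutoff are assumptions for the example, not standards prescribed in this chapter.

```python
def to_t_score(raw_score: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to a T score (mean 50, SD 10) using normative data."""
    return 50.0 + 10.0 * (raw_score - norm_mean) / norm_sd

# Assumed normative values for a symptom scale in a nonclinical sample
NORM_MEAN, NORM_SD = 12.0, 6.0
NONCLINICAL_CUTOFF_T = 60.0   # assumed convention: within 1 SD of the nonclinical mean

pre_raw, post_raw = 30.0, 16.0
pre_t = to_t_score(pre_raw, NORM_MEAN, NORM_SD)
post_t = to_t_score(post_raw, NORM_MEAN, NORM_SD)

print(f"Pre-treatment T = {pre_t:.0f}, post-treatment T = {post_t:.0f}")
if pre_t >= NONCLINICAL_CUTOFF_T and post_t < NONCLINICAL_CUTOFF_T:
    print("The client has moved back into the nonclinical range on this normative comparison.")
```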


Regardless of the population to which comparisons are to be made, a normative sample must be truly representative of the population with respect to demographics and other important characteristics (Achenbach, 2001; Wasserman & Bracken, 2013). Ideally, whether conducted at the national level or the local level, this would involve probability-sampling efforts in which data are obtained from the majority of contacted respondents. As those familiar with psychological instruments are aware, such a sampling strategy is rarely used for the development of test norms. The reliance on data collected from convenience samples with unknown response rates reduces the accuracy of the resultant norms. Therefore, at a minimum, clinicians need to be provided with an indication of the quality and likely accuracy of the norms for a measure. Accordingly, the ratings for norms required, at a minimum for a rating of adequate, data from a single, large clinical sample. For a rating of good, normative data from multiple samples, including nonclinical samples, were required; when normative data from large, representative samples were available, a rating of excellent was applied.

Reliability

Reliability is a key psychometric element to be considered in evaluating an instrument. It refers to the consistency of a person's score on a measure (Anastasi, 1988; Wasserman & Bracken, 2013), including whether (a) all elements of a measure contribute in a consistent way to the data obtained (internal consistency), (b) similar results would be obtained if the measure was used or scored by another clinician (inter-rater reliability),1 or (c) similar results would be obtained if the person completed the measure a second time (test–retest reliability or test stability). Not all reliability indices are relevant to all assessment methods and measures, and the size of the indices may vary on the basis of the samples used.

Despite the long-standing recognition of the centrality of reliability to all forms of psychological measurement, there is a persistent tendency in psychological research to make unwarranted assumptions about reliability. For example, numerous reviews have found that almost three-fourths of research articles failed to provide information on the reliability estimates of the measures completed by participants in the studies (e.g., Barry, Chaney, Piazza-Gardner, & Chavarria, 2014; Vacha-Haase & Thompson, 2011). Inattention to reliability, or the use of an inappropriate statistic to estimate reliability, has the potential to undermine the validity of conclusions drawn from research studies. Concerns have been


raised about the impact of these errors in a broad range of research domains, including communication (Feng, 2015), psychopathology (Rodebaugh et  al., 2016), and clinical diagnosis (Chmielewski, Clark, Bagby, & Watson, 2015). As emphasized throughout this volume, a careful consideration of reliability values is essential when selecting assessment instruments for clinical services or clinical research. With respect to internal consistency, we focused on coefficient alpha (α), which is the most widely used index (Streiner, 2003). Although there have been repeated calls to abandon the use of coefficient α in favor of more robust and accurate alternatives (e.g., Dunn, Baguley, & Brunsden, 2014; Kelley & Pornprasertmanit, 2016), it is rare to find an internal consistency coefficient other than α used in the clinical assessment literature. Recommendations in the literature for what constitutes adequate internal consistency vary, but most authorities seem to view .70 as the minimum acceptable value for α (e.g., Cicchetti, 1994), and Charter (2003) reported that the mean internal consistency value among commonly used clinical instruments was .81. Accordingly, a rating of adequate was given to values of .70–​.79; a rating of good required values of .80–​.89; and, finally, because of cogent arguments that an α value of at least .90 is highly desirable in clinical assessment contexts (Nunnally & Bernstein, 1994), we required values ≥ .90 for an instrument to be rated as having excellent internal consistency. Note that it is possible for α to be too (artificially) high, as a value close to unity typically indicates substantial redundancy among items (cf. Streiner, 2003). These value ranges were also used in rating evidence for inter-​rater reliability when assessed with Pearson correlations or intraclass correlations. Appropriate adjustments were made to the value ranges when κ statistics were used, in line with the recommendations discussed by Cicchetti (1994; see also Charter, 2003). Note that although a number of statistics are superior to κ, it continues to be the most commonly used inter-​rater reliability statistic (Xu & Lorber, 2014). Importantly, evidence for inter-​ rater reliability could only come from data generated among clinicians or clinical raters—​estimates of cross-​informant agreement, such as between parent and teacher ratings, are not indicators of reliability. In establishing ratings for test–​retest reliability values, our requirement for a minimum correlation of .70 was influenced by summary data reported on typical test–​ retest reliability results found with clinical instruments (Charter, 2003)  and trait-​ like psychological measures (Watson, 2004). Of course, not all constructs or measures are expected to show temporal stability (e.g., measures

of state-like variables and life stress inventories), so test–retest reliability was only rated if it was relevant. A rating of adequate required evidence of correlation values of .70 or greater, when reliability was assessed over a period of several days to several weeks. We then faced a challenge in determining appropriate criteria for good and excellent ratings. In order to enhance its likely usefulness, the rating system should be relatively simple. However, test–retest reliability is a complex phenomenon that is influenced by (a) the nature of the construct being assessed (i.e., it can be state-like, trait-like, or influenced by situational variables), (b) the time frame covering the reporting period instructions (i.e., whether respondents are asked to report their current functioning, functioning over the past few days, or functioning over an extended period, such as general functioning in the past year), and (c) the duration of the retest period (i.e., whether the time between two administrations of the instrument involved days, weeks, months, or years). In the end, rather than emphasize the value of increasingly large test–retest correlations, we decided to maintain the requirement for .70 or greater correlation values but require increasing retest period durations of (a) several months and (b) at least 1 year for ratings of good and excellent, respectively.

Validity

Validity is another central aspect to be considered when evaluating psychometric properties. Recent editions of the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999, 2014) explicitly state that validity is a unitary concept and that it is not appropriate to consider different types of validity. Despite these admonitions, research on validity continues to use concepts such as content validity, predictive validity, and incremental validity. Setting aside the wide range of conceptual and practical issues associated with the lack of consensus on the framing of test validity (for a detailed discussion, see Newton & Shaw, 2013), there is a very simple reason for incorporating several types of validity into the rating system used in this book: The vast majority of the literature on clinical assessment, both historically and currently, does not treat validity as a unitary concept (cf. Strauss & Smith, 2009). To strike a balance between the unitary approach advocated by the Standards and the multiplicity of validity types used in the literature, we focused on content validity, construct validity, validity generalization, and treatment sensitivity. In the following


paragraphs, we explain further the rationale for our use of four types of validity. As is readily apparent by reviewing the summary tables of instruments in the following chapters, the extent and strength of research evidence across these types of validity can vary substantially for a given assessment instrument. Foster and Cone (1995) drew an important distinction between “representational” validity (i.e., whether a measure really assesses what it purports to measure) and “elaborative” validity (i.e., whether the measure has any utility for measuring the construct). Attending to the content validity of a measure is a basic, but frequently overlooked, step in evaluating representational validity (Haynes, Richard, & Kubany, 1995). As discussed by Smith, Fischer, and Fister (2003), the overall reliability and validity of scores on an instrument is directly affected by the extent to which items in the instrument adequately represent the various aspects or facets of the construct the instrument is designed to measure. Assuming that representational validity has been established for an assessment purpose, it is elaborative validity that is central to clinicians’ use of a measure. Accordingly, replicated evidence for a measure’s concurrent, predictive, discriminative, and, ideally, incremental validity (Hunsley & Meyer, 2003)  should be available to qualify a measure for consideration as evidence based. We have indicated already that validation is a context-​sensitive concept—​inattention to this fact can lead to inappropriate generalizations being made about a measure’s validity. Thus, there should be replicated elaborative validity evidence for each purpose of the measure and for each population or group for which the measure is intended to be used. This latter point is especially relevant when considering an instrument for clinical use, and thus it is essential to consider evidence for validity generalization—​that is, the extent to which there is evidence for validity across a range of samples and settings (cf. Messick, 1995; Schmidt & Hunter, 1977). In the rating system used in subsequent chapters, ratings of content validity evidence required explicit consideration of the construct facets to be included in the measure and, as the ratings increased, involvement of content validity judges to assess the measure (Haynes et al., 1995). Unlike the situation for reliability, there are no commonly accepted summary statistics to evaluate construct validity (but see Markon [2013] and Westen & Rosenthal [2003]). As a result, our ratings were based on the requirement of increasing amounts of replicated evidence of elements of construct validity such as predictive validity, concurrent validity, convergent validity, and discriminant validity; in addition, for a rating of excellent, evidence of incremental


validity was also required. As was the case when we introduced the rating system in the first edition of this book, we were unable to find any clearly applicable standards in the literature to guide us in developing criteria for validity generalization or treatment sensitivity (a dimension rated only for instruments used for the purposes of treatment monitoring and treatment evaluation). Therefore, adequate ratings for these dimensions required some evidence of, respectively, the use of the instrument with either more than one specific group or in multiple contexts and evidence of sensitivity to change over the course of treatment. Consistent with ratings for other dimensions, good and excellent ratings required increasingly demanding levels of evidence in these areas.

Utility

It is also essential to know the utility of an instrument for a specific clinical purpose. The concept of clinical utility, applied to both diagnostic systems (e.g., Keeley et al., 2016; Kendell & Jablensky, 2003; Mullins-Sweatt, Lengel, & DeShong, 2016) and assessment tools (e.g., di Ruffano, Hyde, McCaffery, & Bossuyt, 2012; Yates & Taub, 2003), has received increasing attention in recent years. Although definitions vary, they have in common an emphasis on garnering evidence regarding actual improvements in both decisions made by clinicians and service outcomes experienced by clients. Unfortunately, despite thousands of studies on the reliability and validity of psychological instruments, there is only scant attention paid to matters of utility in most assessment research studies (McGrath, 2001). This has directly contributed to the current state of affairs in which there is very little replicated evidence that psychological assessment data have a direct impact on improved provision and outcome of clinical services. Currently, therefore, for the majority of psychological instruments, a determination of clinical utility must often be made on the basis of likely clinical value rather than on empirical evidence.

Compared to the criteria for the psychometric dimensions presented thus far, our standards for evidence of clinical utility were noticeably less demanding. This was necessary because of the paucity of information on the extent to which assessment instruments are acceptable to clients, enhance the quality and outcome of clinical services, and/or are worth the costs associated with their use. Therefore, we relied on authors' expert opinions to classify an instrument as having adequate clinical utility. The availability of any supporting evidence of utility was sufficient for a rating of good, and replicated evidence of utility was necessary for a rating of excellent.


The instrument summary tables also contain one final column, used to indicate instruments that are the best measures currently available to clinicians for specific purposes and disorders and, thus, are highly recommended for clinical use. Given the considerable differences in the state of the assessment literature for different disorders/conditions, chapter authors had some flexibility in determining their own precise requirements for an instrument to be rated, or not rated, as highly recommended. However, to ensure a moderate level of consistency in these ratings, a highly recommended rating could only be considered for those instruments having achieved ratings of good or excellent in the majority of their rated psychometric categories. Although not required in our system, if several instruments had comparable psychometric merits for a given assessment purpose, some chapter authors considered the cost and availability of an assessment instrument when making this recommendation (see also Beidas et al., 2015).
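As a rough sketch of how the highly recommended convention could be operationalized, the code below stores one hypothetical instrument's ratings across the nine psychometric categories and checks whether ratings of good or excellent form a majority of the categories that were actually rated; in practice, chapter authors also applied substantive judgment beyond this simple tally.

```python
STRONG_RATINGS = {"good", "excellent"}

def highly_recommendable(ratings: dict) -> bool:
    """True when good/excellent ratings form a majority of the rated categories."""
    rated = [value for value in ratings.values() if value != "not applicable"]
    strong = sum(1 for value in rated if value in STRONG_RATINGS)
    return strong > len(rated) / 2

# Hypothetical summary-table entry for a single instrument
example_ratings = {
    "norms": "good",
    "internal consistency": "excellent",
    "inter-rater reliability": "not applicable",
    "test-retest reliability": "adequate",
    "content validity": "good",
    "construct validity": "excellent",
    "validity generalization": "good",
    "treatment sensitivity": "not reported",
    "clinical utility": "adequate",
}
print(highly_recommendable(example_ratings))  # 5 of the 8 rated categories are strong -> True
```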

SOME FINAL THOUGHTS

We are hopeful that the rating system described in this chapter, and applied in each of the chapters of this book, will continue to aid in advancing the state of evidence-based psychological assessment. We also hope that it will serve as a stimulus for others to refine and improve upon our efforts (cf. Jensen-Doss, 2011; Youngstrom et al., in press). Whatever the possible merits of the rating system, we close this chapter by drawing attention to three critical issues related to its use.

First, although the rating system used for this volume is relatively simple, the task of rating psychometric properties is not. Results from many studies must be considered in making such ratings, and precise quantitative standards were not set for how to weight the results from studies. Furthermore, in the spirit of evidence-based practice, it is also important to note that we do not know whether these ratings are, themselves, reliable. Reliance on individual expert judgment, no matter how extensive and current the knowledge of the experts, is not as desirable as basing evidence-based conclusions and guidance on systematic reviews of the literature conducted according to a consensually agreed upon rating system. However, for all the potential limitations and biases inherent in our approach, reliance on expert review of the scientific literature is the current standard in psychology and, thus, was the only feasible option for the volume at this time. For information on important developments on rating systems used in many areas of health care research, the interested reader can consult the website of the Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group (http://www.gradeworkinggroup.org).

The second issue has to do with the responsible clinical use of the guidance provided by the rating system. Consistent with evaluation and grading strategies used through evidence-based medicine and evidence-based psychology initiatives, many of our rating criteria relied on the consideration of the preponderance of data relevant to each dimension. Such a strategy recognizes both the importance of replication in science and the fact that variability across studies in research design elements (including sample composition and research setting) will influence estimates of these psychometric dimensions. However, we hasten to emphasize that reliance on the preponderance of evidence for these ratings does not imply or guarantee that an instrument is applicable for all clients or clinical settings. Our intention is to have these ratings provide indications about scientifically strong measures that warrant consideration for clinical and research use. As with all evidence-based efforts, the responsibility rests with the individual professional to determine the suitability of an instrument for the specific setting, purpose, and individuals to be assessed.

Third, as emphasized throughout this volume, focusing on the scientific evidence for specific assessment tools should not overshadow the fact that the process of clinical assessment involves much more than simply selecting and administering the best available instruments. Choosing the best, most relevant, instruments is unquestionably an important step. Subsequent steps must ensure that the instruments are administered in an appropriate manner, accurately scored, and then individually interpreted in accordance with the relevant body of scientific research. However, to ensure a truly evidence-based approach to assessment, the major challenge is to then integrate all the data within a process that is, itself, evidence-based. This will likely require both (a) a reframing of the assessment process within the larger health and social system context in which it occurs and (b) the use of new technologies to enable complex decision-making and integration of large amounts of assessment information in both traditional and nontraditional health service delivery settings (Chorpita, Daleiden, & Bernstein, 2015). Much of our focus in this chapter has been on evidence-based methods and instruments, in large part because (a) methods and specific measures are more easily identified than are processes and (b) the main emphasis in the assessment literature has been on psychometric properties of methods and instruments. As we indicated early in the chapter, an evidence-based approach to assessment should be developed in light of evidence on the accuracy and usefulness of this complex, iterative decision-making task. Although the chapters in this volume provide considerable assistance for having the assessment process be informed by scientific evidence, the future challenge will be to ensure that the entire process of assessment is evidence based.

Note 1. Although we chose to use the term “inter-​rater reliability,” there is some discussion in the assessment literature about whether the term should be “inter-​rater agreement.” Heyman et  al. (2001), for example, suggested that because indices of inter-​rater reliability do not contain information about individual differences among participants and only contain information about one source of error (i.e., differences among raters), they should be considered to be indices of agreement, not reliability.

References Achenbach, T. M. (2001). What are norms and why do we need valid ones? Clinical Psychology:  Science and Practice, 8, 446–​450. Achenbach, T. M. (2005). Advancing assessment of children and adolescents: Commentary on evidence-​based assessment of child and adolescent disorders. Journal of Clinical Child and Adolescent Psychology, 34, 541–​547. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. Anastasi, A. (1988). Psychological testing (6th ed.). New York, NY: Macmillan. Barlow, D. H. (2005). What’s new about evidence based assessment? Psychological Assessment, 17, 308–​311. Barry, A. E., Chaney, B. H., Piazza-​ Gardner, A. K., & Chavarria, E. A. (2014). Validity and reliability reporting in the field of health education and behavior:  A


review of seven journals. Health Education & Behavior, 41, 12–​18. Beidas, R. S., Stewart, R. E., Walsh, L., Lucas, S., Downey, M. M., Jackson, K.,  .  .  .  Mandell, D. S. (2015). Free, brief, and validated:  Standardized instruments for low-​ resource mental health settings. Cognitive and Behavioral Practice, 22, 5–​19. Brown, J., Scholle, S. H., & Azur, M. (2014). Strategies for measuring the quality of psychotherapy:  A white paper to inform measure development and implementation (Prepared for the U.S. Department of Health and Human Services). Retrieved from https://​aspe.hhs. gov/​report/​strategies-​measuring-​quality-​psychotherapy-​ white- ​ p aper- ​ i nform- ​ m easure- ​ d evelopment- ​ a nd-​ implementation Charter, R. A. (2003). A breakdown of reliability coefficients by test type and reliability method, and the clinical implications of low reliability. Journal of General Psychology, 130, 290–​304. Chmielewski, M., Clark, L. A., Bagby, R. M., & Watson, D. (2015). Method matters: Understanding diagnostic reliability in DSM-​IV and DSM-​5. Journal of Abnormal Psychology, 124, 764–​769. Chorpita, B. F., Daleiden, E. L., & Bernstein, A. D. (2015). At the intersection of health information technology and decision support:  Measurement feedback systems  .  .  .  and beyond. Administration and Policy in Mental Health and Mental Health Services Research, 43, 471–​477. Christon, L. M., McLeod, B. D., & Jensen-​Doss, A. (2015). Evidence-​based assessment meets evidence-​based treatment: An approach to science-​informed case conceptualization. Cognitive and Behavioral Practice, 22, 74–​86. Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 6, 284–​290. Cohen, L. L., La Greca, A. M., Blount, R. L., Kazak, A. E., Holmbeck, G. N., & Lemanek, K. L. (2008). Introduction to the special issue:  Evidence-​ based assessment in pediatric psychology. Journal of Pediatric Psychology, 33, 911–​915. di Ruffano, L. F., Hyde, C. J., McCaffery, K. J., & Bossuyt, P. M. M. (2012). Assessing the value of diagnostic tests: A framework for designing and evaluating trials. British Medical Journal, 344, e686. Dozois, D. J.  A., Mikail, S., Alden, L. E., Bieling, P. J., Bourgon, G., Clark, D. A., . . . Johnston, C. (2014). The CPA presidential task force on evidence-​based practice of psychological treatments. Canadian Psychology, 55, 153–​160. Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem


of internal consistency estimation. British Journal of Psychology, 105, 399–​412. Feng, G. C. (2015). Mistakes and how to avoid mistakes in using intercoder reliability indices. Methodology, 11, 13–​22. Fisher, A. J. (2015). Toward a dynamic model of psychological assessment:  Implications for personalized care. Journal of Consulting and Clinical Psychology, 83, 825–​836. Foster, S. L., & Cone, J. D. (1995). Validity issues in clinical assessment. Psychological Assessment, 7, 248–​260. Green, M. F., Nuechterlein, K. H., Gold, J. M., Barch, D. M., Cohen, J., Essock, S.,  .  .  .  Marder, S. R. (2004). Approaching a consensus cognitive battery for clinical trials in schizophrenia:  The NIMH-​ MATRICS conference to select cognitive domains and test criteria. Biological Psychiatry, 56, 301–​307. Haynes, S. N., Richard, D. C. S., & Kubany, E. S. (1995). Content validity in psychological assessment:  A functional approach to concepts and methods. Psychological Assessment, 7, 238–​247. Hermann, R. C., Chan, J. A., Zazzali, J. L., & Lerner, D. (2006). Aligning measure-​based quality improvement with implementation of evidence-​ based practices. Administration and Policy in Mental Health and Mental Health Services Research, 33, 636–​645. Heyman, R. E., Chaudhry, B. R., Treboux, D., Crowell, J., Lord, C., Vivian, D., & Waters, E. B.. (2001). How much observational data is enough? An empirical test using marital interaction coding. Behavior Therapy, 32, 107–​123. Hsu, L. M. (2002). Diagnostic validity statistics and the MCMI-​III. Psychological Assessment, 14, 410–​422. Hunsley, J., Lee, C. M., Wood, J., & Taylor, W. (2015). Controversial and questionable assessment techniques. In S. O. Lilienfeld, S. J. Lynn, & J. Lohr (Eds.), Science and pseudoscience in clinical psychology (2nd ed., pp. 42–​82). New York, NY: Guilford. Hunsley, J., & Mash, E. J. (2007). Evidence-​based assessment. Annual Review of Clinical Psychology, 3, 57–​79. Hunsley, J., & Meyer, G. J. (2003). The incremental validity of psychological testing and assessment: Conceptual, methodological, and statistical issues. Psychological Assessment, 15, 446–​455. Hurl, K., Wightman, J., Haynes, S. N., & Virues-​Ortega, J. (2016). Does a pre-​interventions functional assessment increase intervention effectiveness? A  meta-​analysis of within-​subjects interrupted time-​series studies. Clinical Psychology Review, 47, 71–​84. Ivanova, M. Y., Achenbach, T. M., Rescorla, L. A., Turner, L. V., Ahmeti-​Pronaj, A., Au, A., . . . Zasepa, E. (2015). Syndromes of self-​ reported psychopathology for ages 18–​59 in 29 societies. Journal of Psychopathology and Behavioral Assessment, 37, 171–​183.

Jensen-​Doss, A. (2011). Practice involves more than treatment: How can evidence-​based assessment catch up to evidence-​based treatment? Clinical Psychology: Science and Practice, 18, 173–​177. Kazdin, A. E. (1993). Evaluation in clinical practice:  Clinically sensitive and systematic methods of treatment delivery. Behavior Therapy, 24, 11–​45. Kazdin, A. E. (2003). Psychotherapy for children and adolescents. Annual Review of Psychology, 54, 253–​276. Kazdin, A. E. (2005). Evidence-​based assessment of child and adolescent disorders:  Issues in measurement development and clinical application. Journal of Clinical Child and Adolescent Psychology, 34, 548–​558. Keeley, J. W., Reed, G. M., Roberts, M. C., Evans, S. C., Medina-​Mora, M. E., Robles, R., . . . Saxena, S. (2016). Developing a science of clinical utility in diagnostic classification systems field study strategies for IDC-​11 mental and behavioral disorders. American Psychology, 71, 3–​16. Kelley, K., & Pornprasertmanit, S. (2016). Confidence intervals for population reliability coefficients: Evaluation of methods, recommendations, and software for composite measures. Psychological Methods, 21, 69–​92. Kendall, P. C., Marrs-​Garcia, A., Nath, S. R., & Sheldrick, R. C. (1999). Normative comparisons for the evaluation of clinical significance. Journal of Consulting and Clinical Psychology, 67, 285–​299. Kendell, R., & Jablensky, A. (2003). Distinguishing between the validity and utility of psychiatric diagnoses. American Journal of Psychiatry, 160, 4–​12. Krishnamurthy, R., VandeCreek, L., Kaslow, N. J., Tazeau, Y. N., Miville, M. L., Kerns, R.,  .  .  .  Benton, S. A. (2004). Achieving competency in psychological assessment: Directions for education and training. Journal of Clinical Psychology, 60, 725–​739. Lambert, M. J. (2015). Progress feedback and the OQ-​ system:  The past and the future. Psychotherapy, 52, 381–​390. Lambert, M. J., & Hawkins, E. J. (2004). Measuring outcome in professional practice: Considerations in selecting and using brief outcome instruments. Professional Psychology: Research and Practice, 35, 492–​499. Lee, C. M., Horvath, C., & Hunsley, J. (2013). Does it work in the real world? The effectiveness of treatments for psychological problems in children and adolescents. Professional Psychology:  Research and Practice, 44, 81–​88. Markon, K. E. (2013). Information utility:  Quantifying the total psychometric information provided by a measure. Psychological Methods, 18, 15–​35. Mash, E. J., & Barkley, R. A. (Eds.). (2007). Assessment of childhood disorders (4th ed.). New York, NY: Guilford. Mash, E. J., & Hunsley, J. (2005). Evidence-​based assessment of child and adolescent disorders: Issues and challenges.


Journal of Clinical Child and Adolescent Psychology, 34, 362–​379. Mash, E. J., & Hunsley, J. (2007). Assessment of child and family disturbance: A developmental systems approach. In E. J. Mash & R. A. Barkley (Eds.), Assessment of childhood disorders (4th ed., pp. 3–​ 50). New  York, NY: Guilford. Mash, E. J., & Terdal, L. G. (1997). Assessment of child and family disturbance:  A behavioral-​ systems approach. In E. J. Mash & L. G. Terdal (Eds.), Assessment of childhood disorders (3rd ed., pp. 3–​ 68). New  York, NY: Guilford. McGrath, R. E. (2001). Toward more clinically relevant assessment research. Journal of Personality Assessment, 77, 307–​332. Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–​749. Mullins-​Sweatt, S. N., Lengel, G. J., & DeShong, H. L. (2016). The importance of considering clinical utility in the construction of a diagnostic manual. Annual Review of Clinical Psychology, 12, 133–​155. Newby, J. M., McKinnon, A., Kuyken, W., Gilbody, S., & Dalgleish, T. (2015). Systematic review and meta-​ analysis of transdiagnostic psychological treatments for anxiety and depressive disorders in adulthood. Clinical Psychology Review, 40, 91–​110. Newton, P. E., & Shaw, S. D. (2013). Standards for talking and thinking about validity. Psychological Methods, 18, 301–​319. Ng, M. Y., & Weisz, J. R. (2016). Building a science of personalized intervention for youth mental health. Journal of Child Psychology and Psychiatry, 57, 216–​236. Norcross, J. C., Koocher, G. P., & Garofalo, A. (2006). Discredited psychological treatments and tests:  A Delphi poll. Professional Psychology: Research and Practice, 37, 515–​522. Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York, NY: McGraw-​Hill. Peterson, D. R. (2004). Science, scientism, and professional responsibility. Clinical Psychology: Science and Practice, 11, 196–​210. Robinson, J. P., Shaver, P. R., & Wrightsman, L. S. (1991). Criteria for scale selection and evaluation. In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of personality and social psychological attitudes (pp. 1–​16). New York, NY: Academic Press. Rodebaugh, T. L., Sculling, R. B., Langer, J. K., Dixon, D. J., Huppert, J. D., Bernstein, A., . . . Lenze, E. J. (2016). Unreliability as a threat to understanding psychopathology: The cautionary tale of attentional bias. Journal of Abnormal Psychology, 125, 840–​851. Sales, C. M. D., & Alves, P. C. G. (2016). Patient-​centered assessment in psychotherapy: A review of individualized


tools. Clinical Psychology:  Science and Practice, 23, 265–​283. Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529–​540. Sechrest, L. (2005). Validity of measures is no simple matter. Health Services Research, 40, 1584–​1604. Seidman, E., Chorpita, B. F., Reay, B., Stelk, W., Kutash, K., Mullican, C., & Ringeisen, H. (2010). A framework for measurement feedback to improve decision-​making in mental health. Administration and Policy in Mental Health and Mental Health Services Research, 37, 128–​131. Smith, G. T., Fischer, S., & Fister, S. M. (2003). Incremental validity principles in test construction. Psychological Assessment, 15, 467–​477. Spilka, M. J., & Dobson, K. S. (2015). Promoting the internationalization of evidence-​based practice: Benchmarking as a strategy to evaluate culturally transported psychological treatments. Clinical Psychology:  Science and Practice, 22, 58–​75. Strauss, M. E., & Smith, G. T. (2009). Construct validity:  Advances in theory and methodology. Annual Review of Clinical Psychology, 5, 89–​113. Streiner, D. L. (2003). Starting at the beginning:  An introduction to coefficient alpha and internal consistency. Journal of Personality Assessment, 80, 99–​103. Streiner, D. L., Norman, G. R., & Cairney, J. (2015). Health measurement scales:  A practical guide to their development and use (5th ed.). New York, NY:  Oxford University Press. Thompson-​Hollands, J., Sauer-​Zavala, S., & Barlow, D. H. (2014). CBT and the future of personalized treatment: A proposal. Depression and Anxiety, 31, 909–​911. Vacha-​Haase, T., & Thompson, B. (2011). Score reliability:  A retrospective look back at 12  years of reliability generalization studies. Measurement and Evaluation in Counseling and Development, 44, 159–​168. Vermeersch, D. A., Lambert, M. J., & Burlingame, G. M. (2000). Outcome Questionnaire:  Item sensitivity to change. Journal of Personality Assessment, 74, 242–​261. Wasserman, J. D., & Bracken, B. A. (2013). Fundamental psychometric considerations in assessment. In J. R. Graham & J. A. Naglieri (Eds.), Handbook of psychology:  Volume 10. Assessment psychology (2nd ed., pp. 50–​81). Hoboken, NJ: Wiley. Watson, D. (2004). Stability versus change, dependability versus error: Issues in the assessment of personality over time. Journal of Research in Personality, 38, 319–​350. Weisz, J. R., Chorpita, B. F., Frye, A., Ng, M. Y., Lau, N., Bearman, S. K., & Hoagwood, K. E. (2011). Youth Top Problems:  Using idiographic, consumer-​guided assessment to identify treatment needs and to track change


during psychotherapy. Journal of Consulting and Clinical Psychology, 79, 369–​380. Weisz, J. R., & Kazdin, A. E. (Eds.). (2017). Evidence-​based psychotherapies for children and adolescents (3nd ed.). New York, NY: Guilford. Weisz, J. R., Krumholz, L. S., Santucci, L., Thomassin, K., & Ng, M. Y. (2015). Shrinking the gap between research and practice:  Tailoring and testing youth psychotherapies in clinical care contexts. Annual Review of Clinical Psychology, 11, 139–​163. Westen, D., & Rosenthal, R. (2003). Quantifying construct validity:  Two simple measures. Journal of Personality and Social Psychology, 84, 608–​618. Wright, C. V., Beattie, S. G., Galper, D. I., Church, A. S., Bufka, L. F., Brabender, V. M., & Smith, B. L. (2017). Assessment practices of professional psychologists:  Results of a national survey. Professional Psychology: Research and Practice, 48, 73–​78. Xu, S., & Lorber, M. F. (2014). Interrater agreement statistics with skewed data: Evaluation of alternatives to Cohen’s

kappa. Journal of Consulting and Clinical Psychology, 82, 1219–​1227. Yates, B. T., & Taub, J. (2003). Assessing the costs, benefits, cost-​ effectiveness, and cost–​ benefit of psychological assessment: We should, we can, and here’s how. Psychological Assessment, 15, 478–​495. Youngstrom, E. A., Choukas-​Bradley, S., Calhoun, C. D., & Jensen-​Doss, A. (2015). Clinical guide to the evidence-​ based approach to diagnosis and treatment. Cognitive and Behavioral Practice, 22, 20–​35. Youngstrom, E. A., & Van Meter, A. (2016). Empirically supported assessment of children and adolescents. Clinical Psychology: Science and Practice, 23, 327–​347. Youngstrom, E. A., Van Meter, A., Frazier, T. W., Hunsley, J., Prinstein, M. J., Ong, M.-​L., & Youngstrom, J. K. (in press). Evidence-​based assessment as an integrative model for applying psychological science to guide the voyage of treatment. Clinical Psychology:  Science and Practice.


2

Dissemination and Implementation of Evidence-Based Assessment

Amanda Jensen-Doss, Lucia M. Walsh, and Vanesa Mora Ringle

During the past two decades, there has been a major push to increase the use of evidence-based practices in clinical settings. The American Psychological Association Presidential Task Force on Evidence-Based Practice (2006) defines evidence-based practice in psychology (EBPP) as "the integration of the best available research with clinical expertise in the context of patient characteristics, culture, and preferences" (p. 271) and states that the goal of EBPP is to improve public health through the application of research-supported assessment, case formulation, therapeutic relationship, and treatment approaches. Although EBPP is defined broadly, many efforts to improve practice have focused on treatment, with less attention paid to other aspects of practice. There is a particular need to focus on increasing use of evidence-based assessment (EBA), as assessment cuts across all of these other areas of practice. For example, assessment results should form the foundation of case conceptualization; inform decisions about which treatments to use; and provide data about whether treatment is working, whether therapy alliance is strong, and when to end treatment. There are several reasons why a focus on EBA will increase the likelihood that EBPP will lead to improved public health. First, EBA can improve the accuracy of diagnoses, which is one important component of case conceptualization and treatment selection (Christon, McLeod, & Jensen-Doss, 2015). Research linking diagnostic accuracy to improved treatment engagement and outcomes (Jensen-Doss & Weisz, 2008; Klein, Lavigne, & Seshadri, 2010; Kramer, Robbins, Phillips, Miller, &


Burns, 2003; Pogge et al., 2001) suggests that improved diagnostic assessment could have a positive effect across the treatment process. As detailed throughout this book, evidence-based diagnostic assessment typically involves the use of structured diagnostic interviews or rating scales. Studies have demonstrated that when clinicians use structured diagnostic interviews, they assign more accurate diagnoses (Basco et al., 2000), better capture comorbidities (Matuschek et al., 2016), assign more specific diagnoses (Matuschek et al., 2016), reduce psychiatrist evaluation time (Hughes et al., 2005), and decrease the likelihood that a psychiatrist will increase a patient's medication dose (Kashner et al., 2003).

Second, using EBA for progress monitoring can support clinical judgment by creating an ongoing feedback loop that informs case conceptualization (Christon et al., 2015) and, if data suggest a client is at risk for treatment failure, prompts revision of the treatment plan (Claiborn & Goodyear, 2005; Lambert, Hansen, & Finch, 2001; Riemer, Rosof-Williams, & Bickman, 2005). Gold-standard progress monitoring of this nature typically involves administering rating scales every session or two and then incorporating the resulting feedback into clinical decisions. This differs from what we refer to here as "outcome monitoring," or administering outcome measures before and after treatment to determine treatment effectiveness. Although useful for many purposes, this type of outcome monitoring does not support clinical decision-making during service provision. Several monitoring and feedback systems (MFSs) have been developed to support ongoing progress monitoring; they typically include a battery of progress measures and
generate feedback reports that often include warnings if a client is not on track for positive outcomes (Lyon, Lewis, Boyd, Hendrix, & Liu, 2016). Extensive research with adult clients suggests that clinician use of MFSs can improve outcomes, particularly for those “not on track” for positive outcomes (Krägeloh, Czuba, Billington, Kersten, & Siegert, 2015; Shimokawa, Lambert, & Smart, 2010); similar results have been found for youth clients (Bickman, Kelley, Breda, De Andrade, & Riemer, 2011; Stein, Kogan, Hutchison, Magee, & Sorbero, 2010), although effects vary based on organizational support for progress monitoring (Bickman et al., 2016). The purpose of this chapter is to make the case that significant work is needed to encourage the dissemination of information about EBA and its implementation in clinical practice. First, we discuss “assessment as usual,” how it differs from EBA, and reasons for these differences. Then, we describe efforts to increase use of EBA through dissemination and implementation efforts. Finally, we present some ideas for future work needed to further advance the use of EBA. Consistent with the other chapters in this book, we focus on assessment of psychopathology and its application to clinical diagnosis and progress monitoring. Although similar issues likely exist for other forms of assessment, such as psychoeducational assessment, discussion of those is beyond the scope of this book. Finally, although assessment tools to support case conceptualization are described in the subsequent chapters of this book, most of the literature studying clinician case conceptualization practices and how to improve them has focused on whether clinicians can apply specific theoretical models to client data and the validity of case conceptualizations (e.g., Abbas, Walton, Johnston, & Chikoore, 2012; Flinn, Braham, & das Nair, 2015; Persons & Bertagnolli, 1999) rather than on how to collect and integrate EBA data to generate a case conceptualization. As mentioned previously, both diagnostic assessment and progress monitoring can support case conceptualization; therefore, much of the literature we discuss here has implications for case conceptualization. However, in the Future Directions section, we address steps needed to advance assessment-​based case conceptualization.
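To make the feedback logic described above more concrete, the sketch below shows the kind of "not on track" check an MFS feedback report might perform: each session's symptom score is compared with an expected improvement trajectory, and a warning is raised when the client falls behind it. This is a minimal, hypothetical illustration in Python; the expected-change slope and alert margin are placeholders rather than values from any of the systems cited above, which derive their trajectories and alert rules empirically.

```python
# Minimal sketch of a "not on track" check of the kind an MFS feedback
# report might perform. The weekly_change slope and alert_margin are
# hypothetical placeholders, not values from any published system.

def expected_score(baseline: float, session: int, weekly_change: float = -1.5) -> float:
    """Expected symptom score at a given session if the client improves
    at a typical rate (lower scores = fewer symptoms)."""
    return baseline + weekly_change * session

def on_track(baseline: float, observed_scores: list, alert_margin: float = 3.0) -> list:
    """Flag each session where the observed score falls behind the
    expected trajectory by more than the alert margin."""
    flags = []
    for session, observed in enumerate(observed_scores, start=1):
        gap = observed - expected_score(baseline, session)
        flags.append("not on track" if gap > alert_margin else "on track")
    return flags

# Example: a client whose scores worsen after an initial improvement
print(on_track(baseline=60, observed_scores=[58, 55, 59, 61]))
# -> ['on track', 'on track', 'not on track', 'not on track']
```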

IS THERE A RESEARCH–PRACTICE GAP IN ASSESSMENT?

Despite the proliferation of excellent assessment tools, available data suggest there are significant training and practice gaps in both diagnostic assessment and progress monitoring. As discussed in the following sections, these gaps have important implications for the accuracy of clinician-generated diagnoses and of clinician judgments about treatment progress.

Research–Practice Gaps in Diagnostic Assessment

As detailed throughout the chapters in this book, evidence-based diagnostic assessment for most disorders relies on standardized diagnostic interviews and/or rating scales. Unfortunately, surveys of training programs suggest that clinicians are not being prepared to conduct these assessments during their graduate training. Several surveys of psychology programs have been conducted in the past three decades (e.g., Belter & Piotrowski, 2001; Childs & Eyde, 2002), with the two most recent (Mihura, Roy, & Graceffo, 2016; Ready & Veague, 2014) finding that training in assessment has generally remained constant, although training focused on assessment of treatment outcomes, psychometrics, and neuropsychology has increased. However, the two studies reported inconsistent results regarding training in clinical interviewing. Ready and Veague reported that only one-half to three-fourths of programs included clinical interviewing as a focus of training, whereas Mihura et al. found that 92% of the programs queried included clinical interviewing as a required topic. These differences might reflect a change during the 3 years that passed between the two studies. It is also likely, however, that the two studies obtained information from different programs: Mihura and colleagues included more programs than Ready and Veague, and each study obtained data from only approximately one-third of all American Psychological Association (APA)-accredited programs.

Two studies have examined training in diagnostic assessment specifically. Ponniah et al. (2011) surveyed clinical training directors from social work, clinical psychology PhD and PsyD, and psychiatric residency programs regarding training in structured assessment based on Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria. Only one-third of surveyed programs reported providing both didactics and supervision in diagnostic assessment, with clinical psychology PhD and psychiatry residency programs being most likely to do so and social work programs the least likely. These results are concerning because master's-level clinicians represent the majority of those providing services to individuals with mental health disorders in the United States (Garland, Bickman, & Chorpita, 2010). Focusing on clinical psychology PhD and PsyD programs exclusively, Mihura et al. (2016)
found that fewer than half of the programs required a course on any structured diagnostic interview and fewer than one-fourth required an applied practicum. Differences in required structured diagnostic interview courses were found between training models: 73% of clinical science and 63% of scientist-practitioner programs required a course, whereas only 35% of practitioner-focused programs had a similar requirement.

Not surprisingly, given these training gaps, available data suggest that clinicians are not engaging in EBA for diagnostic assessment. Existing surveys across a range of clinicians indicate that unstructured interviews are commonly relied on for diagnosis (e.g., Anderson & Paulosky, 2004), that evidence-based tools are infrequently used (e.g., Gilbody, House, & Sheldon, 2002; Whiteside, Sattler, Hathaway, & Douglas, 2016), and that diagnostic practices often do not map onto best-practice guidelines (e.g., Demaray, Schaefer, & Delong, 2003; Lichtenstein, Spirito, & Zimmermann, 2010). Unfortunately, these gaps between "best practice" and "as usual" assessment practices have implications for the accuracy of diagnoses generated in routine practice. Studies comparing clinician-generated diagnoses to those generated through comprehensive, research-supported methods consistently find low rates of agreement between the two (Rettew, Lynch, Achenbach, Dumenci, & Ivanova, 2009; Samuel, 2015). Studies examining the validity of clinician-generated diagnoses also suggest that these diagnoses are less valid than evidence-based diagnoses (Basco et al., 2000; Jewell, Handwerk, Almquist, & Lucas, 2004; Mojtabai, 2013; Samuel et al., 2013; Tenney, Schotte, Denys, van Megen, & Westenberg, 2003).

Research–Practice Gaps in Progress Monitoring

Most of what is known about graduate training in progress monitoring focuses on trainee psychologists. As described throughout this volume, most progress monitoring tools are standardized rating scales, so many of the assessment training gaps discussed previously are also relevant for progress monitoring. However, other surveys have focused on whether trainees are taught to use these scales for ongoing progress monitoring, rather than for diagnostic assessment. With regard to APA-accredited psychology programs, Ready and Veague (2014) found that only approximately half of programs offer courses focused on progress monitoring, and Mihura et al. (2016) found that only 10% of programs require their students to routinely use outcome measures in their practica. Differing rates of training in progress monitoring have been found between
different types of doctoral programs (e.g., counseling vs. PsyD; Overington, Fitzpatrick, Hunsley, & Drapeau, 2015) and training program models (e.g., practitioner-scholar models vs. clinical-scientist models; Overington et al., 2015), although little is known about progress monitoring training in master's programs. Similar to training programs, fewer than half of internship directors report having their trainees use progress monitoring (Ionita, Fitzpatrick, Tomaro, Chen, & Overington, 2016; Mours, Campbell, Gathercoal, & Peterson, 2009), and nearly one-third of directors have never heard of progress monitoring measures (Ionita et al., 2016).

As with diagnostic assessment, there are low rates of progress monitoring among practicing clinicians. Much of the research in this area has focused on outcome monitoring (Cashel, 2002; Hatfield & Ogles, 2004), with relatively little attention given to ongoing progress monitoring. Surveys suggest that many clinicians report tracking client progress (Anderson & Paulosky, 2004; Gans, Falco, Schackman, & Winters, 2010). However, many of them are not using validated measures, relying instead on tools developed within the clinic, unstructured interviews, reports from clients, and clinical judgment (Anderson & Paulosky, 2004; Gans et al., 2010; Johnston & Gowers, 2005). This finding has been supported in two recent large surveys of psychologists and master's-level clinicians, in which fewer than 15% of clinicians engaged in ongoing progress monitoring (Ionita & Fitzpatrick, 2014; Jensen-Doss et al., 2016). Clinicians who do administer progress measures appear to use them primarily for internal tracking and administrative purposes; they rarely report using the results to plan treatment or to monitor progress over the course of care (Garland, Kruse, & Aarons, 2003).

This lack of formal progress monitoring is concerning in light of data showing that it is difficult for clinicians to accurately judge client progress based on clinical judgment alone. For example, when Walfish, McAlister, O'Donnell, and Lambert (2012) asked a multidisciplinary sample of clinicians to rate their own level of skill, none ranked themselves as below average, and one-fourth rated themselves at the 90th percentile of clinical skill relative to their peers. These therapists estimated that three-fourths of their clients improved in therapy, with less than 5% deteriorating; nearly half said that none of their clients ever deteriorated. The study authors point out that these estimates deviated markedly from published estimates of improvement and deterioration rates. Interestingly, available data suggest that even when clinicians are trained to engage in progress monitoring, this
does not improve their ability to rate progress based on clinical judgment alone. Hannan and colleagues (2005) removed feedback reports from a setting that had been using an MFS and asked clinicians to predict their clients’ outcomes. Those clinicians underestimated how many clients would deteriorate or not improve, and they overestimated how many would improve.

WHY IS THERE A RESEARCH–PRACTICE GAP IN ASSESSMENT?

As detailed previously, lack of training is likely one factor that contributes to clinicians’ not utilizing best assessment practices. In addition, research has identified other clinician and organizational variables that might be contributing to this research–​practice gap. Several studies have focused on clinician attitudes that might be driving assessment practices. Jensen-​Doss and Hawley (2010, 2011)  conducted a national, multidisciplinary survey to assess clinicians’ attitudes toward standardized assessment tools, with a particular focus on diagnostic assessment. On average, clinicians reported neutral to positive attitudes toward standardized assessment tools, although this varied by discipline, with psychologists reporting more positive attitudes compared to psychiatrists, marriage and family therapists, social workers, and mental health counselors (Jensen-​Doss & Hawley, 2010). Attitudes, particularly beliefs about the practicality of standardized assessment tools, predicted self-​reported use of these tools. Other studies have found that clinicians have concerns that structured diagnostic interviews would be unacceptable to clients (Bruchmüller, Margraf, Suppiger, & Schneider, 2011), although data gathered directly from clients do not support this view (Suppiger et al., 2009). Studies have also examined clinician attitudes toward progress monitoring. Across studies, attitudes toward these measures have varied from neutral to positive, although concerns regarding the validity of the measures (e.g., whether they accurately reflected client progress) are common (Cashel, 2002; Gilbody et al., 2002; Hatfield & Ogles, 2007; Ionita et  al., 2016; Johnston & Gowers, 2005). As with diagnostic assessment, clinicians often report practical concerns about progress monitoring, including limited access to affordable measures, measures being too long, difficulties reaching clients to fill out measures, and little time to administer measures and keep track of when to fill out measures (Gleacher et  al., 2016; Hatfield & Ogles, 2004; Ionita et al., 2016; Johnston & Gowers, 2005; Kotte et al., 2016; Meehan, McCombes, Hatzipetrou, &

Catchpoole, 2006; Overington et al., 2015). Clinicians also report anxiety about progress monitoring data being used for performance evaluation, use of these measures ruining rapport, concern regarding how to present results correctly to clients, and a general lack of knowledge about progress monitoring (Ionita et al., 2016; Johnston & Gowers, 2005; Meehan et al., 2006). In a study that separately asked about attitudes toward the practice of progress monitoring and attitudes toward standardized progress measures, clinicians reported very positive attitudes toward the idea of monitoring progress but more neutral attitudes toward the measures themselves (Jensen-​Doss et al., 2016), suggesting that clinicians are open to engaging in the practice if their concerns about the measures themselves can be addressed. Consistent with research on diagnostic assessment, more positive attitudes toward progress monitoring are associated with higher rates of self-​reported progress monitoring practices (Hatfield & Ogles, 2004; Jensen-​Doss et al., 2016; Overington et al., 2015). A number of organizational barriers and facilitators of EBA have also been identified in the literature. Lack of organizational support is a barrier frequently mentioned by clinicians, including both active discouragement from supervisors and administration regarding the use of measures and little guidance given by organizational leaders of when and how often to use them (Connors, Arora, Curtis, & Stephan, 2015; Gilbody et al., 2002; Ionita et al., 2016; Overington et al., 2015). Many of the practical concerns described previously also speak to organizational factors, such as the amount of time clinicians are allowed to spend on assessment and the budget available for purchasing assessment tools. Clinicians also often report that administrators are more interested in tracking administrative outcomes (e.g., length of wait list, client turnover, and number of sessions) than outcomes such as functioning and symptom reduction (Gilbody et al., 2002; Johnston & Gowers, 2005). Conversely, clinicians who indicate their organizations have policies or rules about assessment are more likely to report using progress monitoring (Jensen-​Doss et  al., 2016). Clinician assessment practices also vary across organizational settings; providers working in private practice settings are less likely to use standardized diagnostic and progress monitoring tools than are those working in other settings (Jensen-​Doss et al., 2016; Jensen-​Doss & Hawley, 2010).

EFFORTS TO IMPROVE ASSESSMENT PRACTICES

The studies reviewed previously indicate that although effective assessment tools exist, they often are not making their way into practice settings. As such, several efforts have been made to bridge this research–practice gap, some focused on specific evidence-based measures (e.g., rating scales for trauma; National Child Traumatic Stress Network, 2016) and others focused on EBA processes (e.g., using an MFS to gather data and using feedback to make decisions about treatment; Bickman et al., 2016). Efforts to improve clinician assessment practices can be divided into dissemination efforts, or efforts to inform clinicians about EBA tools, and implementation efforts that seek to support clinicians in their use of such tools. Implementation efforts can be subdivided into those focused on training clinicians in EBA; those focused on implementing EBA in individual organizations; and those focused on integrating EBA into mental health systems, such as state public mental health systems. Although a comprehensive review of all of these efforts is beyond the scope of this chapter, we highlight some illustrative examples of each approach.

Dissemination Efforts

Assessment-focused dissemination efforts have typically created sources for clinicians to identify evidence-based measures or guides for them to engage in EBA processes. This volume is an example of an EBA dissemination effort, as are publications in EBA special journal issues (Hunsley & Mash, 2005; Jensen-Doss, 2015; Mash & Hunsley, 2005) and review papers, such as Leffler, Riebel, and Hughes' (2015) review of structured diagnostic interviews for clinicians. The DSM board has also embarked on efforts to improve diagnostic practices and accuracy by outlining steps for diagnosis and creating decision trees to support differential diagnosis (First, 2013). Although these dissemination efforts have typically focused on what clinicians should do, Koocher and Norcross have also published articles identifying discredited assessment methods (Koocher, McMann, Stout, & Norcross, 2015; Norcross, Koocher, & Garofalo, 2006). There are also efforts to disseminate EBA information online. For example, there is a website dedicated to information about measures relevant to the assessment of traumatized youth (http://www.nctsn.org/resources/online-research/measures-review; National Child Traumatic Stress Network, 2016), a repository of information about assessment tools relevant to child welfare populations (http://www.cebc4cw.org/assessment-tools/measurement-tools-highlighted-on-the-cebc; The California Evidence-Based Clearinghouse for Child Welfare, 2017), and the PROMIS website with measures
that assess outcomes across various health domains and that promote and facilitate outcome and progress monitoring (http://www.healthmeasures.net/explore-measurement-systems/promis; Cella et al., 2010; HealthMeasures, 2017; Pilkonis et al., 2011). In a novel approach to dissemination, the APA has recently funded a grant to update assessment pages on Wikipedia, with a focus on assessments that are freely available (Youngstrom, Jensen-Doss, Beidas, Forman, & Ong, 2015–2016).

Training Efforts

Some groups are moving beyond dissemination to provide training in EBA to clinicians. Relative to the numerous studies focused on training clinicians in treatments (Herschell, Kolko, Baumann, & Davis, 2010), there are fewer EBA training studies. Documented EBA training efforts to date have consisted of workshops, workshops plus ongoing consultation, and courses. Didactic training workshops have helped improve clinicians' attitudes toward progress monitoring (Edbrooke-Childs, Wolpert, & Deighton, 2014; Lyon, Dorsey, Pullmann, Silbaugh-Cowdin, & Berliner, 2015), their self-efficacy (Edbrooke-Childs et al., 2014), and their use of progress monitoring (Persons, Koerner, Eidelman, Thomas, & Liu, 2016). Another training approach is to follow workshops with ongoing consultation. For example, a training effort in Washington State included 6 months of expert-led phone consultation, and training was found to affect clinician attitudes, skill, and implementation of standardized assessment tools (Lyon et al., 2015). Finally, online formats have recently been applied to EBA training. For example, Swanke and Zeman (2016) created an online course in diagnostic assessment for master's-level social work students. The course was based on a problem-based learning approach wherein students were given diagnostic problems to solve by identifying symptoms and matching them to DSM diagnoses. At the end of the course, the average student grade on content quizzes was 78.7% and the class was well received by the students; however, students' levels of knowledge prior to the course are not known, so it is difficult to determine whether the course actually increased knowledge.

Organizational-Level Implementation Efforts

Another approach to increasing use of EBA is for organizations to attempt to change assessment practices organization-wide. Several examples of such efforts have been documented in the literature, including studies
examining the impact of organizations incorporating structured diagnostic interviews (e.g., Basco et al., 2000; Lauth, Levy, Júlíusdóttir, Ferrari, & Pétursson, 2008; Matuschek et al., 2016) and progress monitoring systems (e.g., Bickman et al., 2011, 2016; Bohnenkamp, Glascoe, Gracey, Epstein, & Benningfield, 2015; Strauss et al., 2015; Veerbeek, Voshaar, & Pot, 2012).

One illustrative example of organizational-level implementation work focused on progress monitoring is the work of Bickman and colleagues. Following an initial successful randomized effectiveness trial showing that using an MFS called the Contextualized Feedback System (CFS) improved client outcomes (Bickman et al., 2011), Bickman and colleagues (2016) conducted a second randomized trial within two mental health organizations. All clinicians within the agencies were required to administer CFS, and cases were randomly assigned either to receive feedback as soon as measures were entered into the system (i.e., clinicians immediately received feedback reports summarizing the CFS data) or to receive feedback every 6 months. Before the trial began, the investigators conducted a "pre-implementation contextualization phase," during which they held workgroups to understand existing clinic procedures and to brainstorm about how CFS would fit into those procedures. Training and ongoing consultation in CFS were provided to clinicians and to agency administrators to ensure both clinical (i.e., using it with individual clients) and organizational (e.g., ongoing review of aggregated data to identify problems with CFS implementation) use of CFS. After finding that only one clinic demonstrated enhanced outcomes with CFS, the authors determined that the two agencies differed in their rates of questionnaire completion and viewing of feedback reports. To better understand these findings, they then conducted qualitative interviews with the participating clinicians (Gleacher et al., 2016). Clinicians at the clinic with better implementation and outcomes reported more barriers to using CFS with their clients than did clinicians at the other clinic, perhaps because they were using it more often. However, they also reported fewer barriers at the organizational level and more support from their organizational leadership. The authors concluded that organizational factors are strong drivers of implementation success.

System-Level Efforts

Another approach to implementation is for mental health systems, such as state public mental health agencies or agencies like the Veterans Administration, to enact policies
requiring evidence-​based assessment. System-​level implementations documented in the literature have primarily focused on progress monitoring. An early example was the state of Michigan’s use of the Child and Adolescent Functional Assessment Scale (CAFAS; Hodges & Wong, 1996). As described by Hodges and Wotring (2004), clinicians in the public mental health system were required to use the CAFAS to track client outcomes. Data were then used to provide clinicians and agencies feedback on individual client and agency-​wide outcomes, including comparison to agency and state averages. Hawaii, which has been a pioneer in the advancement of evidence-​based treatments (EBTs) in the public sector, has supported these efforts by developing and implementing an MFS that is used statewide (Higa-​McMillan, Powell, Daleiden, & Mueller, 2011; Kotte et  al., 2016; Nakamura et al., 2014). To date, both clinicians and caseworkers across various agencies in the state have been trained in and are implementing the MFS. In an effort to encourage the use of EBA, Higa-​McMillan et al. reported the use of “Provider Feedback Data Parties” during which client progress and clinical utilization of the data are discussed. Other studies on Hawaii’s EBA efforts observed that the fit between the MFS and case manager characteristics facilitated MFS implementation, whereas provider concerns about the clinical utility and scientific merit of the MFS were reported as barriers (Kotte et al., 2016). Internationally, system-​ level efforts to implement progress monitoring have been reported in the United Kingdom and Australia. Efforts to implement routine monitoring throughout the United Kingdom have been ongoing for well over a decade (Fleming, Jones, Bradley, & Wolpert, 2016; Hall et al., 2014; Mellor-​Clark, Cross, Macdonald, & Skjulsvik, 2016). The Child Outcomes Research Consortium (CORC; http://​www.corc.uk.net), a learning and planning collaboration of researchers, therapists, managers, and funders, has spearheaded most of this work. CORC has made valid, reliable, brief, and free measures available to all clinicians working in the United Kingdom, provided training in the measures, and created an MFS to support their use. These measures are reported to be widely implemented, but not at an optimal level (Mellor-​Clark et al., 2016), so efforts are now focused on adopting more theory-​driven approaches to implementing the system (Mellor-​Clark et al., 2016; Meyers, Durlak, & Wandersman, 2012). In Australia, efforts to implement progress monitoring have been ongoing since the late 1990s and include training and development of computer systems to support data collection and analysis (Meehan et  al., 2006; Trauer, Gill, Pedwell, & Slattery, 2006).

Outcome data are collected at all public clinics and are aggregated at a national level to be used for comparison by local clinics.

Finally, note that policies focused on other aspects of care can also have implications for assessment. For example, the Precision Medicine Initiative (The White House Office of the Press Secretary, 2015) focuses on increasing personalized medical treatments that take individual differences in genes and environment into account. Such tailored approaches will likely require increased use of psychosocial assessment in health care settings. Similarly, the US Medicare and Medicaid system is moving increasingly toward value-based payment, in which reimbursement is based on quality, rather than quantity, of care (Centers of Medicare & Medicaid Services, 2016). As such, assessment of quality indicators within publicly funded behavioral health settings will become increasingly important. In addition, initiatives to implement EBTs often lead to the development of assessment processes to support those treatments, as evidenced by the Hawaii initiative described previously.

FUTURE DIRECTIONS

As we hope this review has made clear, the literature on EBA contains both good and bad news. On the one hand, a number of excellent EBA tools exist and some efforts are underway to encourage clinician use of those tools. On the other hand, significant gaps continue to exist between assessment best practices and what the average clinician does in practice. To address these gaps, we have several suggestions for future directions the field should take. 1. Increase graduate-​level training in evidence-​based diagnostic assessment and progress monitoring. Most of the training and implementation efforts described previously have primarily focused on retraining clinicians whose graduate training likely did not include in-​depth training in structured diagnostic assessment or progress monitoring. Researchers focused on EBTs have called for an increased focus on training at the graduate level because training people well at the outset is likely easier and more cost-​effective than trying to retrain them (e.g., Bearman, Wadkins, Bailin, & Doctoroff, 2015). One avenue for improving graduate training is increasing the specificity of accreditation guidelines for training programs (Dozois et al., 2014; Ponniah et al., 2011). For both psychology and psychiatry training programs, past accreditation standards stressed the need for students

to attain competence in diagnosis of clients via measurement and interviews and to assess treatment effectiveness, but they gave little guidance regarding what constitutes appropriate assessment (APA, 2006; Canadian Psychological Association, 2011; Ponniah et al., 2011). A similar picture exists in accreditation guidelines for mental health counseling (American Mental Health Counselors Association, 2011), marriage and family therapy (Commission on Accreditation for Marriage and Family Therapy Education, 2014), and bachelor's and master's level social work programs (Commission on Accreditation & Policy, 2015), although these guidelines do include training in progress monitoring as a way to perform program evaluation. In January 2017, a new set of APA accreditation guidelines went into effect that include EBA as a core competency (APA Commission on Accreditation, 2015). A recent Canadian task force focused on increasing EBP use (Task Force on Evidence-Based Practice of Psychological Treatments; Dozois et al., 2014) emphasized monitoring progress and outcomes throughout treatment. However, the Canadian Psychological Association accreditation guidelines for doctoral programs have not been updated to reflect this change as of this publication. 2. Increase "best practice" training strategies in EBA dissemination and implementation efforts. Although exceptions exist (Lyon et al., 2015), the primary approach that has been taken to training clinicians in EBA is what is sometimes referred to as a "train and pray" approach: Bring clinicians together for a workshop and then hope they take what they have learned and apply it in practice. The literature on training in EBTs suggests that such an approach is unlikely to lead to sustained practice changes (Herschell et al., 2010). Rather, training needs to involve active learning strategies, ongoing consultation in the practice, and attention to contextual variables such as whether clinicians have adequate organizational support to continue using the practice (Beidas & Kendall, 2010; Herschell et al., 2010). Examples of strategies that could be incorporated into EBA trainings include engaging clinicians in behavioral rehearsal during training (Beidas, Cross, & Dorsey, 2014); providing ongoing consultation after initial training (e.g., Bickman et al., 2016; Lyon et al., 2015); increasing the sustainability of assessment practices through "train the trainer" models that prepare agency supervisors to provide ongoing supervision of assessment (Connors et al., 2015); and incorporating all levels of an agency into training through learning collaborative models that address implementation at the clinician, supervisor, and administrator levels (e.g., Ebert,
Amaya-Jackson, Markiewicz, Kisiel, & Fairbank, 2012; Nadeem, Olin, Hill, Hoagwood, & Horwitz, 2014). 3. Increase our focus on pragmatic assessment. Studies conducted with clinicians consistently suggest that perceived lack of practicality is a major barrier to clinician use of EBA (e.g., Ionita et al., 2016; Jensen-Doss & Hawley, 2010). In addition, the fact that many clinicians who do gather assessment data do not actually incorporate those data into clinical decisions (Garland et al., 2003; Johnston & Gowers, 2005) suggests that they may not find the data clinically useful. Glasgow and Riley (2013) have called for the field to focus on pragmatic measures, which they define as measures "that [have] relevance to stakeholders and [are] feasible to use in most real-world settings to assess progress" (p. 237). They propose criteria for determining whether a measure is pragmatic, including that it is important to stakeholders, such as clients, clinicians, or administrators; that it is low burden to complete; that it generates actionable information that can be used in decision-making; and that it is sensitive to change over time. Expanding our reviews of EBA tools to include dimensions such as these might help identify measures most likely to make their way into practice. One example of such a review was conducted by Beidas and colleagues (2015), who identified brief, free measures and rated their psychometric support for a range of purposes, including screening, diagnosis, and progress monitoring. Another opportunity for increasing the practicality of assessment is to take advantage of recent policies emphasizing increased data collection and accountability in health care settings; if progress monitoring can be built into the daily workflow of clinicians, its practicality increases greatly. We return to this point in the next suggestion. 4. Leverage technology to increase the use of EBA. Another avenue for increasing the practicality of assessment is to incorporate technologies such as electronic health care record platforms and smartphone applications into the assessment process. With the rise of policies emphasizing increased data collection and accountability in health care settings (e.g., "Patient Protection and Affordable Care Act," 2010), mental health settings are increasingly relying on health information technologies, such as electronic health care records, to meet data
reporting requirements. Lyon and Lewis (2016) point out that these shifts provide an opportunity to increase the use of progress monitoring. In a recent review, Lyon, Lewis, Boyd, Hendrix, and Liu (2016) identified 49 digital MFSs that could be used by clinicians with access to computers or tablets to administer progress measures and rapidly receive feedback. However, fewer than one-third of these could be directly incorporated into electronic health care records, and Lyon and colleagues concluded that additional work is needed to develop digital MFSs that can be incorporated into the daily workflow of practice in a way that is sustainable. Another technological advance with great potential to enhance assessment is smartphone technology that supports data collection. Researchers have developed applications to support real-time data collection (Trull & Ebner-Priemer, 2009) and have begun to examine the clinical utility of such applications for gathering information such as mood (e.g., Schwartz, Schultz, Reider, & Saunders, 2016) or pain ratings (Sánchez-Rodríguez, de la Vega, Castarlenas, Roset, & Miró, 2015). Such applications could facilitate self-monitoring of symptoms between sessions or efficient collection and scoring of progress monitoring data in session. Many smartphone applications to track psychological well-being are already commercially available (e.g., a November 15, 2016, search of the Google Play store yielded more than 50 results for "mood tracking"), and an important next step is to determine how these applications can be ethically developed and incorporated into clinical practice (Jones & Moffitt, 2016). 5. Develop theoretical models of organizational support for EBA. Despite numerous studies suggesting that organizational context is critical to EBA (e.g., Gleacher et al., 2016; Jensen-Doss et al., 2016), there is a need for conceptual models that can guide organizational approaches to improving assessment practices. Models of organizational culture and climate have been developed to explain use of EBTs (e.g., Williams & Glisson, 2014) and have been translated into organizational interventions that improve EBT uptake and client outcomes (e.g., Glisson, Williams, Hemmelgarn, Proctor, & Green, 2016). Many aspects of these models are likely applicable to the use of EBA, but the constructs within them may need to be elaborated. Although existing models might be helpful to guide EBA implementation in agency settings such as clinics or schools, these models are not as applicable to clinicians working in private practice, who seem to be the clinicians least likely to engage in EBA (Jensen-Doss et al., 2016). Additional work is needed to understand the needs of this population.
6. Conduct more work focused on EBA processes. Although EBA consists of both psychometrically supported assessment measures and the processes by which those measures are applied, there has historically been a dearth of research focused on EBA processes (Hunsley & Mash, 2007). The rise in studies of MFSs, which consist of measures, guidelines for how often to administer them, actionable feedback about clinical results, and, increasingly, clinical guides suggesting next steps to take in treatment (Krägeloh et al., 2015), is a welcome advance on this front. However, additional work is needed on diagnostic assessment processes and on approaches to integrating assessment data to form a case conceptualization. In terms of diagnostic assessment, Youngstrom's work on grounding assessment decisions in probability nomograms (e.g., Youngstrom, Choukas-Bradley, Calhoun, & Jensen-Doss, 2015) is an interesting example of how researchers can further develop and study the assessment process. Drawing from approaches used in evidence-based medicine (Strauss et al., 2015), Youngstrom and colleagues have examined the diagnostic utility of various risk factors and assessment tools (e.g., Van Meter et al., 2014; Youngstrom, 2014; Youngstrom et al., 2004), generating data that can then be applied via a tool called a nomogram, which helps clinicians translate assessment information into estimated probabilities that a client meets criteria for a disorder (for an illustration, see Youngstrom et al., 2015; a brief worked sketch of the underlying arithmetic also appears at the end of this section). One benefit of this approach is that it can be done sequentially, starting with lower-burden assessment strategies and moving on to more intensive assessment only for diagnoses that are not ruled in or out at earlier stages. Clinicians have been successfully trained to use the nomogram in two studies (Jenkins & Youngstrom, 2016; Jenkins, Youngstrom, Washburn, & Youngstrom, 2011), although research is needed to determine whether clinicians go on to apply the nomogram in their work and whether its use improves their diagnostic accuracy with clients. Another assessment process in need of additional research is assessment-driven case conceptualization. EBA case conceptualization models have been proposed (Christon et al., 2015), and many theoretical conceptualization models, such as the cognitive–behavioral model, indicate that assessment should support the conceptualization process (Persons & Davidson, 2010). However, most of the research on case conceptualization has focused on whether clinicians who review the same clinical vignettes or session recordings generate the same conceptualizations (Flinn et al., 2015) or whether clinicians
can be trained to apply a particular conceptualization approach to vignettes or recordings (Abbas et al., 2012). To our knowledge, no studies have focused on whether clinicians can be trained to gather assessment data and use them to generate an accurate case conceptualization, whether such training could lead to actual changes in clinician conceptualization practices, and whether those practice changes might improve client outcomes. This is clearly an area in critical need of additional research. Finally, some chapters in this volume highlight the utility of functional assessment for case conceptualization and ongoing progress monitoring. However, little is known about whether clinicians are trained in this practice, view it favorably, utilize it in practice, or find it feasible. In education, the requirement to conduct functional behavioral assessment in the Individuals with Disabilities Education Act Amendments of 1997 led to the need for widespread implementation of functional assessment in schools (Scott, Nelson, & Zabala, 2003). Surveys suggest that this practice is acceptable to school personnel (Crone, Hawken, & Bergstrom, 2007; Nelson, Roberts, Rutherford, Mathur, & Aaroe, 1999), although concerns have been raised about its feasibility (Nelson et al., 1999). Future research is needed to understand how this practice is viewed and utilized in other settings.
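At its core, the nomogram logic described in the sixth recommendation is Bayes' theorem applied with diagnostic likelihood ratios: a pretest probability is converted to odds, multiplied by the likelihood ratio attached to each assessment result, and converted back to a posttest probability. The minimal Python sketch below works through that arithmetic for a sequential assessment; the base rate, likelihood ratios, and decision thresholds are illustrative placeholders rather than values drawn from the studies cited above.

```python
# Illustrative arithmetic behind a probability nomogram: convert a pretest
# probability to odds, multiply by the diagnostic likelihood ratio (DLR)
# attached to an assessment result, and convert back to a probability.
# The base rate, DLRs, and thresholds below are hypothetical examples.

def update_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Apply one assessment result's likelihood ratio to a pretest probability."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

# Sequential assessment: start from a clinic base rate, apply a low-burden
# screener first, and continue to a more intensive interview only if the
# disorder is neither ruled in nor ruled out.
prob = 0.10                               # hypothetical clinic base rate
prob = update_probability(prob, 7.0)      # positive screener, assumed DLR+ = 7
print(f"After screener: {prob:.2f}")      # ~0.44, still indeterminate
if 0.10 < prob < 0.80:                    # illustrative "keep assessing" zone
    prob = update_probability(prob, 4.0)  # diagnostic interview, assumed DLR+ = 4
print(f"After interview: {prob:.2f}")     # ~0.76
```

In practice, clinicians read the same result off a printed nomogram rather than computing it, but the example makes the sequential logic explicit: the more intensive interview is administered only when the screener leaves the diagnosis indeterminate.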

CONCLUSIONS

As this volume illustrates, decades of excellent research have generated a rich body of clinically useful EBA tools. Unfortunately, many of these tools have not yet made their way into practice settings, limiting their public health impact. Fortunately, researchers and policymakers are increasingly attending to the dissemination of these tools, as well as to their implementation in mental health organizations and systems. Through this work, the field will progress toward a more fully realized application of EBPP that goes beyond treatment, ideally improving mental health outcomes for clients.

References

Abbas, M., Walton, R., Johnston, A., & Chikoore, M. (2012). Evaluation of teaching an integrated case formulation approach on the quality of case formulations: Randomised controlled trial. The Psychiatrist, 36, 140–145. American Mental Health Counselors Association. (2011). Recommended AMHCA training. Standards for the practice of clinical mental health counseling. Retrieved from https://amhca.site-ym.com/?Copyofstandardsofp
American Psychological Association. (2006). Guidelines and principles for accreditation programs in professional psychology (G&P). Washington, DC:  Author. Retrieved from https://​www.apa.org/​ed/​accreditation/​about/​policies/​guiding-​principles.pdf Anderson, D. A., & Paulosky, C. A. (2004). A survey of the use of assessment instruments by eating disorder professionals in clinical practice. Eating and Weight Disorders, 9, 238–​241. American Psychological Association Presidential Task Force on Evidence-​Based Practice. (2006). Evidence-​based practice in psychology. American Psychologist, 61, 271–​285. American Psychological Association, Commission on Accreditation (2015) Standards of accreditation for health service psychology. Retrieved from http://​www. apa.org/​ed/​accreditation/​about/​policies/​standards-​of-​ accreditation.pdf. Basco, M. R., Bostic, J. Q., Davies, D., Rush, A. J., Witte, B., Hendrickse, W., & Barnett, V. (2000). Methods to improve diagnostic accuracy in a community mental health setting. American Journal of Psychiatry, 157, 1599–​1605. Bearman, S. K., Wadkins, M., Bailin, A., & Doctoroff, G. (2015). Pre-​practicum training in professional psychology to close the research–​practice gap:  Changing attitudes toward evidence-​ based practice. Training and Education in Professional Psychology, 9, 13–​20. Beidas, R. S., Cross, W., & Dorsey, S. (2014). Show me, don’t tell me: Behavioral rehearsal as a training and analogue fidelity tool. Cognitive and Behavioral Practice, 21, 1–​11. Beidas, R. S., & Kendall, P. C. (2010). Training therapists in evidence-​based practice:  A critical review of studies from a systems-​ contextual perspective. Clinical Psychology: Science and Practice, 17, 1–​30. Beidas, R. S., Stewart, R. E., Walsh, L., Lucas, S., Downey, M. M., Jackson, K.,  .  .  .  Mandell, D. S. (2015). Free, brief, and validated:  Standardized instruments for low-​ resource mental health settings. Cognitive and Behavioral Practice, 22, 5–​19. Belter, R. W., & Piotrowski, C. (2001). Current status of doctoral-​level training in psychological testing. Journal of Clinical Psychology, 57, 717–​726. Bickman, L., Douglas, S. R., De Andrade, A. R.  V., Tomlinson, M., Gleacher, A., Olin, S., & Hoagwood, K. (2016). Implementing a measurement feedback system:  A tale of two sites. Administration and Policy in Mental Health and Mental Health Services Research, 43, 410–​425. Bickman, L., Kelley, S. D., Breda, C., De Andrade, A. R. V., & Riemer, M. (2011). Effects of routine feedback to clinicians on mental health outcomes of youths: Results of a randomized trial. Psychiatric Services, 62, 1423–​1429.

Bohnenkamp, J. H., Glascoe, T., Gracey, K. A., Epstein, R. A., & Benningfield, M. M. (2015). Implementing clinical outcomes assessment in everyday school mental health practice. Child and Adolescent Psychiatric Clinics of North America, 24, 399–​413. Bruchmüller, K., Margraf, J., Suppiger, A., & Schneider, S. (2011). Popular or unpopular? Therapists’ use of structured interviews and their estimation of patient acceptance. Behavior Therapy, 42(4), 634–​643. Canadian Psychological Association. (2011). Accreditation standards and procedures for doctoral programmes and internships in professional psychology. Ottawa, Ontario, Canada: Author. Cashel, M. L. (2002). Child and adolescent psychological assessment:  Current clinical practices and the impact of managed care. Professional Psychology: Research and Practice, 33, 446–​453. Cella, D., Riley, W., Stone, A., Rothrock, N., Reeve, B., Yount, S.,  .  .  .  Choi, S. (2010). The Patient-​ Reported Outcomes Measurement Information System (PROMIS) developed and tested its first wave of adult self-​ reported health outcome item banks: 2005–​2008. Journal of Clinical Epidemiology, 63, 1179–​1194. Centers of Medicare & Medicaid Services. (2016). Medicare FFS physician feedback program/​ value-​ based payment modifier. Retrieved from https://​www.cms. gov/​ M edicare/​ M edicare-​ Fee-​ f or-​ S ervice-​ Payment/​ PhysicianFeedbackProgram/​index.html Childs, R. A., & Eyde, L. D. (2002). Assessment training in clinical psychology doctoral programs:  What should we teach? What do we teach? Journal of Personality Assessment, 78, 130–​144. Christon, L. M., McLeod, B. D., & Jensen-​Doss, A. (2015). Evidence-​ based assessment meets evidence-​ based treatment:  An approach to science-​ informed case conceptualization. Cognitive and Behavioral Practice, 22, 36–​48. Claiborn, C. D., & Goodyear, R. K. (2005). Feedback in psychotherapy. Journal of Clinical Psychology, 61, 209–​217. Commission on Accreditation & Policy, Council on Social Work Education. (2015). Educational policy and accreditation standards for baccalaureate and master’s social work programs. Retrieved from https://​www. cswe.org/​getattachment/​Accreditation/​Accreditation-​ Process/ ​ 2 015- ​ E PAS/ ​ 2 015EPAS_ ​ Web_ ​ F INAL.pdf. aspx Commission on Accreditation for Marriage and Family Therapy Education. (2014). Accreditation standards: Graduate and post-​graduate marriage and family therapy training programs. Retrieved from http://​dx5br1z4f6n0k.cloudfront. net/​imis15/​Documents/​COAMFTE/​Version%2012/​ COAMFTE_​Accreditation_​Standards_​Version_​12.pdf

Connors, E. H., Arora, P., Curtis, L., & Stephan, S. H. (2015). Evidence-​ b ased assessment in school mental health. Cognitive and Behavioral Practice, 22, 60–​73. Crone, D. A., Hawken, L. S., & Bergstrom, M. K. (2007). A demonstration of training, implementing, and using functional behavioral assessment in 10 elementary and middle school settings. Journal of Positive Behavior Interventions, 9, 15–​29. Demaray, M. K., Schaefer, K., & Delong, L. K. (2003). Attention-​deficit/​hyperactivity disorder (ADHD):  A national survey of training and current assessment practices in the schools. Psychology in the Schools, 40, 583–​597. Dozois, D. J., Mikail, S. F., Alden, L. E., Bieling, P. J., Bourgon, G., Clark, D. A.,  .  .  .  Hunsley, J. (2014). The CPA Presidential Task Force on Evidence-​Based Practice of Psychological Treatments. Canadian Psychology, 55, 153. Ebert, L., Amaya-​Jackson, L., Markiewicz, J. M., Kisiel, C., & Fairbank, J. A. (2012). Use of the Breakthrough Series Collaborative to support broad and sustained use of evidence-​based trauma treatment for children in community practice settings. Administration and Policy in Mental Health and Mental Health Services Research, 39, 187–​199. Edbrooke-​Childs, J., Wolpert, M., & Deighton, J. (2014). Using patient reported outcome measures to improve service effectiveness (UPROMISE):  Training clinicians to use outcome measures in child mental health. Administration and Policy in Mental Health and Mental Health Services Research, 43, 302–​308. First, M. B. (2013). DSM-​5 handbook of differential diagnosis: Arlington, VA: American Psychiatric Publishing. Fleming, I., Jones, M., Bradley, J., & Wolpert, M. (2016). Learning from a learning collaboration:  The CORC approach to combining research, evaluation and practice in child mental health. Administration and Policy in Mental Health and Mental Health Services Research, 43, 297–​301. Flinn, L., Braham, L., & das Nair, R. (2015). How reliable are case formulations? A  systematic literature review. British Journal of Clinical Psychology, 54, 266–​290. Gans, J., Falco, M., Schackman, B. R., & Winters, K. C. (2010). An in-​depth survey of the screening and assessment practices of highly regarded adolescent substance abuse treatment programs. Journal of Child & Adolescent Substance Abuse, 19, 33–​47. Garland, A. F., Bickman, L., & Chorpita, B. F. (2010). Change what? Identifying quality improvement targets by investigating usual mental health care. Administration and Policy in Mental Health and Mental Health Services Research, 37, 15–​26.

Garland, A. F., Kruse, M., & Aarons, G. A. (2003). Clinicians and outcome measurement: What’s the use? Journal of Behavioral Health Services & Research, 30, 393–​405. Gilbody, S. M., House, A. O., & Sheldon, T. A. (2002). Psychiatrists in the UK do not use outcomes measures: National survey. British Journal of Psychiatry, 180, 101–​103. Glasgow, R. E., & Riley, W. T. (2013). Pragmatic measures: What they are and why we need them. American Journal of Preventive Medicine, 45, 237–​243. Gleacher, A. A., Olin, S. S., Nadeem, E., Pollock, M., Ringle, V., Bickman, L.,  .  .  .  Hoagwood, K. (2016). Implementing a measurement feedback system in community mental health clinics: A case study of multilevel barriers and facilitators. Administration and Policy in Mental Health and Mental Health Services Research, 43, 426–​440. Glisson, C., Williams, N. J., Hemmelgarn, A., Proctor, E., & Green, P. (2016). Aligning organizational priorities with ARC to improve youth mental health service outcomes. Journal of Consulting and Clinical Psychology, 84, 713–​725. Hall, C., Moldavsky, M., Taylor, J., Sayal, K., Marriott, M., Batty, M.,  .  .  .  Hollis, C. (2014). Implementation of routine outcome measurement in child and adolescent mental health services in the United Kingdom: A critical perspective. European Child & Adolescent Psychiatry, 23, 239–​242. Hannan, C., Lambert, M. J., Harmon, C., Nielsen, S. L., Smart, D. W., Shimokawa, K., & Sutton, S. W. (2005). A lab test and algorithms for identifying clients at risk for treatment failure. Journal of Clinical Psychology, 61, 155–​163. Hatfield, D. R., & Ogles, B. M. (2004). The use of outcome measures by psychologists in clinical practice. Professional Psychology:  Research and Practice, 35, 485–​491. Hatfield, D. R., & Ogles, B. M. (2007). Why some clinicians use outcome measures and others do not. Administration and Policy in Mental Health and Mental Health Services Research, 34, 283–​291. HealthMeasures (2017). PROMIS (Patient-​Reported Outcomes Measurement Information System). Retrieved from http://​www.healthmeasures.net/​explore-​ measurement-​systems/​promis Herschell, A. D., Kolko, D. J., Baumann, B. L., & Davis, A. C. (2010). The role of therapist training in the implementation of psychosocial treatments:  A review and critique with recommendations. Clinical Psychology Review, 30, 448–​466. Higa-​McMillan, C. K., Powell, C. K., Daleiden, E. L., & Mueller, C. W. (2011). Pursuing an evidence-​based culture through contextualized feedback:  Aligning

youth outcomes and practices. Professional Psychology: Research and Practice, 42, 137–​144. Hodges, K., & Wong, M. M. (1996). Psychometric characteristics of a multidimensional measure to assess impairment: The Child and Adolescent Functional Assessment Scale. Journal of Child and Family Studies, 5, 445–​467. Hughes, C. W., Emslie, G. J., Wohlfahrt, H., Winslow, R., Kashner, T. M., & Rush, A. J. (2005). Effect of structured interviews on evaluation time in pediatric community mental health settings. Psychiatric Services, 56, 1098–​1103. Hunsley, J., & Mash, E. J. (2005). Introduction to the special section on developing guidelines for the evidence-​based assessment (EBA) of adult disorders. Psychological Assessment, 17, 251–​255. Hunsley, J., & Mash, E. J. (2007). Evidence-​based assessment. Annual Review of Clinical Psychology, 3, 29–​51. Ionita, G., & Fitzpatrick, M. (2014). Bringing science to clinical practice:  A Canadian survey of psychological practice and usage of progress monitoring measures. Canadian Psychology, 55, 187–​196. Ionita, G., Fitzpatrick, M., Tomaro, J., Chen, V. V., & Overington, L. (2016). Challenges of using progress monitoring measures:  Insights from practicing clinicians. Journal of Counseling Psychology, 63, 173–​182. Jenkins, M. M., & Youngstrom, E. A. (2016). A randomized controlled trial of cognitive debiasing improves assessment and treatment selection for pediatric bipolar disorder. Journal of Consulting and Clinical Psychology, 84, 323–​333. Jenkins, M. M., Youngstrom, E. A., Washburn, J. J., & Youngstrom, J. K. (2011). Evidence-​ based strategies improve assessment of pediatric bipolar disorder by community practitioners. Professional Psychology:  Research and Practice, 42, 121–​129. Jensen-​ Doss, A. (2015). Practical, evidence-​ based clinical decision making:  Introduction to the special series. Cognitive and Behavioral Practice, 22, 1–​4. Jensen-​Doss, A., Becker, E. M., Smith, A. M., Lyon, A. R., Lewis, C. C., Stanick, C. F., & Hawley, K. M. (2016). Monitoring treatment progress and providing feedback is viewed favorably but rarely used in practice. Administration and Policy in Mental Health and Mental Health Services Research. https://​doi.org/​10.1007/​ s10488-​016-​0763-​0 Jensen-​Doss, A., & Hawley, K. M. (2010). Understanding barriers to evidence-​based assessment:  Clinician attitudes toward standardized assessment tools. Journal of Clinical Child & Adolescent Psychology, 39, 885–​896. Jensen-​Doss, A., & Hawley, K. M. (2011). Understanding clinicians’ diagnostic practices:  Attitudes toward the utility of diagnosis and standardized diagnostic tools. Administration and Policy in Mental Health and Mental Health Services Research, 38, 476–​485.

Jensen-​Doss, A., & Weisz, J. R. (2008). Diagnostic agreement predicts treatment process and outcomes in youth mental health clinics. Journal of Consulting and Clinical Psychology, 76, 711–​722. Jewell, J., Handwerk, M., Almquist, J., & Lucas, C. (2004). Comparing the validity of clinician-​ generated diagnosis of conduct disorder to the Diagnostic Interview Schedule for Children. Journal of Clinical Child and Adolescent Psychology, 33, 536–​546. Johnston, C., & Gowers, S. (2005). Routine outcome measurement: A survey of UK child and adolescent mental health services. Child and Adolescent Mental Health, 10, 133–​139. Jones, N., & Moffitt, M. (2016). Ethical guidelines for mobile app development within health and mental health fields. Professional Psychology:  Research and Practice, 47, 155–​162. Kashner, T. M., Rush, A. J., Surís, A., Biggs, M. M., Gajewski, V. L., Hooker, D. J., . . . Altshuler, K. Z. (2003). Impact of structured clinical interviews on physicians’ practices in community mental health settings. Psychiatric Services, 54, 712–​718. Klein, J. B., Lavigne, J. V., & Seshadri, R. (2010). Clinician-​ assigned and parent-​report questionnaire-​derived child psychiatric diagnoses:  Correlates and consequences of disagreement. American Journal of Orthopsychiatry, 80, 375–​385. Koocher, G. P., McMann, M. R., Stout, A. O., & Norcross, J. C. (2015). Discredited assessment and treatment methods used with children and adolescents: A Delphi poll. Journal of Clinical Child & Adolescent Psychology, 44, 722–​729. Kotte, A., Hill, K. A., Mah, A. C., Korathu-​Larson, P. A., Au, J. R., Izmirian, S.,  .  .  .  Higa-​McMillan, C. K. (2016). Facilitators and barriers of implementing a measurement feedback system in public youth mental health. Administration and Policy in Mental Health and Mental Health Services Research, 43, 861–​878, Krägeloh, C. U., Czuba, K. J., Billington, D. R., Kersten, P., & Siegert, R. J. (2015). Using feedback from patient-​ reported outcome measures in mental health services: A scoping study and typology. Psychiatric Services, 66, 224–​241. Kramer, T. L., Robbins, J. M., Phillips, S. D., Miller, T. L., & Burns, B. J. (2003). Detection and outcomes of substance use disorders in adolescents seeking mental health treatment. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 1318–​1326. Lambert, M. J., Hansen, N. B., & Finch, A. E. (2001). Patient-​focused research:  Using patient outcome data to enhance treatment effects. Journal of Consulting and Clinical Psychology, 69, 159–​172. Lauth, B., Levy, S. R.  A., Júlíusdóttir, G., Ferrari, P., & Pétursson, H. (2008). Implementing the semi-​structured

interview Kiddie-​SADS-​PL into an in-​patient adolescent clinical setting: Impact on frequency of diagnoses. Child and Adolescent Psychiatry and Mental Health, 2(14). https://​capmh.biomedcentral.com/​articles/​ 10.1186/​1753-​2000-​2-​14 Leffler, J. M., Riebel, J., & Hughes, H. M. (2015). A review of child and adolescent diagnostic interviews for clinical practitioners. Assessment, 22, 690–​703. Lichtenstein, D. P., Spirito, A., & Zimmermann, R. P. (2010). Assessing and treating co-​occurring disorders in adolescents: Examining typical practice of community-​based mental health and substance use treatment providers. Community Mental Health Journal, 46, 252–​257. Lyon, A. R., Dorsey, S., Pullmann, M., Silbaugh-​Cowdin, J., & Berliner, L. (2015). Clinician use of standardized assessments following a common elements psychotherapy training and consultation program. Administration and Policy in Mental Health and Mental Health Services Research, 42, 47–​60. Lyon, A. R., & Lewis, C. C. (2016). Designing health information technologies for uptake:  Development and implementation of measurement feedback systems in mental health service delivery. Administration and Policy in Mental Health and Mental Health Services Research, 43, 344–​349. Lyon, A. R., Lewis, C. C., Boyd, M. R., Hendrix, E., & Liu, F. (2016). Capabilities and characteristics of digital measurement feedback systems: Results from a comprehensive review. Administration and Policy in Mental Health and Mental Health Services Research, 43, 441–​466. Mash, E. J., & Hunsley, J. (2005). Evidence-​based assessment of child and adolescent disorders: Issues and challenges. Journal of Clinical Child and Adolescent Psychology, 34, 362–​379. Matuschek, T., Jaeger, S., Stadelmann, S., Dölling, K., Grunewald, M., Weis, S.,  .  .  .  Döhnert, M. (2016). Implementing the K-​SADS-​PL as a standard diagnostic tool: Effects on clinical diagnoses. Psychiatry Research, 236, 119–​124. Meehan, T., McCombes, S., Hatzipetrou, L., & Catchpoole, R. (2006). Introduction of routine outcome measures:  Staff reactions and issues for consideration. Journal of Psychiatric and Mental Health Nursing, 13, 581–​587. Mellor-​ Clark, J., Cross, S., Macdonald, J., & Skjulsvik, T. (2016). Leading horses to water:  Lessons from a decade of helping psychological therapy services use routine outcome measurement to improve practice. Administration and Policy in Mental Health and Mental Health Services Research, 43, 279–​285. Meyers, D. C., Durlak, J. A., & Wandersman, A. (2012). The quality implementation framework: A synthesis of critical steps in the implementation process. American Journal of Community Psychology, 50, 462–​480.


Mihura, J. L., Roy, M., & Graceffo, R. A. (2017). Psychological assessment training in clinical psychology doctoral programs. Journal of Personality Assessment, 99(2), 153–​164. Mojtabai, R. (2013). Clinician-​identified depression in community settings: Concordance with structured-​interview diagnoses. Psychotherapy and Psychosomatics, 82, 161–​169. Mours, J. M., Campbell, C. D., Gathercoal, K. A., & Peterson, M. (2009). Training in the use of psychotherapy outcome assessment measures at psychology internship sites. Training and Education in Professional Psychology, 3, 169–​176. Nadeem, E., Olin, S. S., Hill, L. C., Hoagwood, K. E., & Horwitz, S. M. (2014). A literature review of learning collaboratives in mental health care: Used but untested. Psychiatric Services, 65, 1088–​1099. Nakamura, B. J., Mueller, C. W., Higa-​McMillan, C., Okamura, K. H., Chang, J. P., Slavin, L., & Shimabukuro, S. (2014). Engineering youth service system infrastructure:  Hawaii’s continued efforts at large-​ scale implementation through knowledge management strategies. Journal of Clinical Child and Adolescent Psychology, 43, 179–​189. National Child Traumatic Stress Network. (2016). Measures review database. Retrieved from http://​www.nctsn.org/​ resources/​online-​research/​measures-​review Nelson, J. R., Roberts, M. L., Rutherford, R. B., Jr., Mathur, S. R., & Aaroe, L. A. (1999). A statewide survey of special education administrators and school psychologists regarding functional behavioral assessment. Education and Treatment of Children, 22, 267–​279. Norcross, J. C., Koocher, G. P., & Garofalo, A. (2006). Discredited psychological treatments and tests:  A Delphi poll. Professional Psychology:  Research and Practice, 37, 515–​522. Overington, L., Fitzpatrick, M., Hunsley, J., & Drapeau, M. (2015). Trainees’ experiences using progress monitoring measures. Training and Education in Professional Psychology, 9, 202–​209. Patient Protection and Affordable Care Act, Pub. L. No. 111–​ 148, § 6301, 124, 727 Stat. (2010). Persons, J. B., & Bertagnolli, A. (1999). Inter-​rater reliability of cognitive–​ behavioral case formulations of depression: A replication. Cognitive Therapy and Research, 23, 271–​283. Persons, J. B., & Davidson, J. (2010). Cognitive–​behavioral case formulation. In K. S. Dobson & K. S. Dobson (Eds.), Handbook of cognitive–​behavioral therapies (3rd ed., pp. 172–​195). New York, NY: Guilford. Persons, J. B., Koerner, K., Eidelman, P., Thomas, C., & Liu, H. (2016). Increasing psychotherapists’ adoption and implementation of the evidence-​ based practice of progress monitoring. Behaviour Research and Therapy, 76, 24–​31.


Pilkonis, P. A., Choi, S. W., Reise, S. P., Stover, A. M., Riley, W. T., & Cella, D. (2011). Item banks for measuring emotional distress from the Patient-​Reported Outcomes Measurement Information System (PROMIS): Depression, anxiety, and anger. Assessment, 18, 263–​283. Pogge, D. L., Wayland-​Smith, D., Zaccario, M., Borgaro, S., Stokes, J., & Harvey, P. D. (2001). Diagnosis of manic episodes in adolescent inpatients:  Structured diagnostic procedures compared to clinical chart diagnoses. Psychiatry Research, 101, 47–​54. Ponniah, K., Weissman, M. M., Bledsoe, S. E., Verdeli, H., Gameroff, M. J., Mufson, L., . . . Wickramaratne, P. (2011). Training in structured diagnostic assessment using DSM-​IV criteria. Research on Social Work Practice, 21, 452–​457. Ready, R. E., & Veague, H. B. (2014). Training in psychological assessment:  Current practices of clinical psychology programs. Professional Psychology:  Research and Practice, 45, 278–​282. Rettew, D. C., Lynch, A. D., Achenbach, T. M., Dumenci, L., & Ivanova, M. Y. (2009). Meta-​analyses of agreement between diagnoses made from clinical evaluations and standardized diagnostic interviews. International Journal of Methods in Psychiatric Research, 18, 169–​184. Riemer, M., Rosof-​ Williams, J., & Bickman, L. (2005). Theories related to changing clinician practice. Child and Adolescent Psychiatric Clinics of North America, 14, 241–​254. Samuel, D. B. (2015). A review of the agreement between clinicians’ personality disorder diagnoses and those from other methods and sources. Clinical Psychology: Science and Practice, 22, 1–​19. Samuel, D. B., Sanislow, C. A., Hopwood, C. J., Shea, M. T., Skodol, A. E., Morey, L. C., . . . Grilo, C. M. (2013). Convergent and incremental predictive validity of clinician, self-​report, and structured interview diagnoses for personality disorders over 5 years. Journal of Consulting and Clinical Psychology, 81, 650–​659. Sánchez-​Rodríguez, E., de la Vega, R., Castarlenas, E., Roset, R., & Miró, J. (2015). An APP for the assessment of pain intensity:  Validity properties and agreement of pain reports when used with young people. Pain Medicine, 16, 1982–​1992. Schwartz, S., Schultz, S., Reider, A., & Saunders, E. F.  H. (2016). Daily mood monitoring of symptoms using smartphones in bipolar disorder:  A pilot study assessing the feasibility of ecological momentary assessment. Journal of Affective Disorders, 191, 88–​93. Scott, T. M., Nelson, C. M., & Zabala, J. (2003). Functional behavior assessment training in public schools facilitating systemic change. Journal of Positive Behavior Interventions, 5, 216–​224. Shimokawa, K., Lambert, M. J., & Smart, D. W. (2010). Enhancing treatment outcome of patients at risk of

treatment failure: Meta-​analytic and mega-​analytic review of a psychotherapy quality assurance system. Journal of Consulting and Clinical Psychology, 78, 298–​311. Stein, B. D., Kogan, J. N., Hutchison, S. L., Magee, E. A., & Sorbero, M. J. (2010). Use of outcomes information in child mental health treatment: Results from a pilot study. Psychiatric Services, 61, 1211–​1216. Strauss, B. M., Lutz, W., Steffanowski, A., Wittmann, W. W., Boehnke, J. R., Rubel, J.,  .  .  .  Kirchmann, H. (2015). Benefits and challenges in practice-​ oriented psychotherapy research in Germany:  The TK and the QS-​ PSY-​ BAY projects of quality assurance in outpatient psychotherapy. Psychotherapy Research, 25, 32–​51. Suppiger, A., In-​Albon, T., Hendriksen, S., Hermann, E., Margraf, J., & Schneider, S. (2009). Acceptance of structured diagnostic interviews for mental disorders in clinical practice and research settings. Behavior Therapy, 40, 272–​279. Swanke, J., & Zeman, L. D. (2016). Building skills in psychiatric assessment through an online problem-​based learning course. Journal of Practice Teaching and Learning, 14, 6–​18. Tenney, N. H., Schotte, C. K.  W., Denys, D. A.  J. P., van Megen, H. J. G. M., & Westenberg, H. G. M. (2003). Assessment of DSM-​ IV personality disorders in obsessive–​compulsive disorder: Comparison of clinical diagnosis, self-​report questionnaire, and semi-​structured interview. Journal of Personality Disorders, 17, 550–​561. The California Evidence-​ Based Clearinghouse for Child Welfare (2017). Measurement tools. Retrieved from http://​ www.cebc4cw.org/​assessment-​tools/​measurement-​tools-​ highlighted-​on-​the-​cebc The White House Office of the Press Secretary. (2015). Fact sheet:  President Obama’s precision medicine initiative. Retrieved from https://​www.whitehouse.gov/​ the-​press-​office/​2015/​01/​30/​fact-​sheet-​president-​obama-​ s-​precision-​medicine-​initiative Trauer, T., Gill, L., Pedwell, G., & Slattery, P. (2006). Routine outcome measurement in public mental health—​What do clinicians think? Australian Health Review, 30, 144–​147. Trull, T. J., & Ebner-​Priemer, U. W. (2009). Using experience sampling methods/​ ecological momentary assessment (ESM/​ EMA) in clinical assessment and clinical research:  Introduction to the special section. Psychological Assessment, 21, 457–​462. Van Meter, A., Youngstrom, E., Youngstrom, J. K., Ollendick, T., Demeter, C., & Findling, R. L. (2014). Clinical decision making about child and adolescent anxiety disorders using the Achenbach system of empirically based assessment. Journal of Clinical Child and Adolescent Psychology, 43, 552–​565. Veerbeek, M. A., Voshaar, R. C.  O., & Pot, A. M. (2012). Clinicians’ perspectives on a web-​ based system for

routine outcome monitoring in old-​age psychiatry in the Netherlands. Journal of Medical Internet Research, 14(3). http://​www.jmir.org/​2012/​3/​e76/​?trendmd-​shared=0 Walfish, S., McAlister, B., O’Donnell, P., & Lambert, M. J. (2012). An investigation of self-​assessment bias in mental health providers. Psychological Reports, 110, 639–​644. Whiteside, S. P., Sattler, A. F., Hathaway, J., & Douglas, K. V. (2016). Use of evidence-​based assessment for childhood anxiety disorders in community practice. Journal of Anxiety Disorders, 39, 65–​70. Williams, N. J., & Glisson, C. (2014). Testing a theory of organizational culture, climate and youth outcomes in child welfare systems:  A United States national study. Child Abuse & Neglect, 38, 757–​767. Youngstrom, E. A. (2014). A primer on receiver operating characteristic analysis and diagnostic efficiency statistics

for pediatric psychology: We are ready to ROC. Journal of Pediatric Psychology, 39, 204–​221. Youngstrom, E. A., Choukas-​Bradley, S., Calhoun, C. D., & Jensen-​Doss, A. (2015). Clinical guide to the evidence-​ based assessment approach to diagnosis and treatment. Cognitive and Behavioral Practice, 22, 20–​35. Youngstrom, E. A., Findling, R. L., Calabrese, J. R., Gracious, B. L., Demeter, C., Bedoya, D. D., & Price, M. (2004). Comparing the diagnostic accuracy of six potential screening instruments for bipolar disorder in youths aged 5 to 17 years. Journal of the American Academy of Child & Adolescent Psychiatry, 43, 847–​858. Youngstrom, E. A., Jensen-​Doss, A., Beidas, R. S., Forman, E., & Ong, M.-​L. (2015–​2016). Increasing global access to evidence-​based assessment. Washington, DC: American Psychological Association.

3

Advances in Evidence-Based Assessment: Using Assessment to Improve Clinical Interventions and Outcomes

Eric A. Youngstrom
Anna Van Meter

"Assessment" is the application of a measurement method to support a particular goal. In the clinical enterprise, measurement is not an end in itself. We are not trying simply to describe our clients. They are seeking change, and assessment should help identify problems, guide the choice of solutions, and indicate whether things are moving in the right direction (Hunsley & Mash, 2007). Assessment plays a central role in psychoeducational evaluation, custody evaluations, and forensic evaluations as well as clinical evaluation. In each case, assessment provides the data to guide recommendations and actions. Our discussion focuses most on assessment in the clinical context, recognizing that many of the principles and concepts apply more generally as well.

Focusing on assessment as the application of measurement to guide effective intervention distills evidence-based assessment (EBA) to a core principle. The potential value added by assessment changes depending on the type of intervention and the stage of treatment. Rather than being separate clinical activities, assessment and treatment are transactional and linked: Treatment provides the questions and the context for EBA (Hunsley & Mash, 2007; Norcross, Hogan, & Koocher, 2008). At the beginning of treatment, assessment may most helpfully focus on screening, scoping, and predicting diagnoses or key issues. Once refined into a formulation, assessment shifts to prescribing an intervention, with potential alternatives and moderating factors defined. As treatment gets underway, assessment shifts to measuring progress, including shifts in severity, movement toward goals, and sometimes measurement of process variables that serve as mechanisms in the treatment.

Our goal is to lay out a practical model of EBA as a transactional integration of assessment with treatment, providing scaffolding for incorporating the different content and techniques presented in subsequent chapters. In our model, perhaps best considered as EBA 2.0, we augment the "3 Ps" of EBA (Youngstrom, 2013)—prediction, prescription, and process—with a preparation phase that lays the groundwork for a successful installation of these upgraded practices. For concepts and principles to help anyone, they need to be feasible. Evidence-based medicine (EBM) often uses the metaphor of a "leaky pipeline" that connects the best research evidence with the person who would benefit (Glasziou & Haynes, 2005). The research only helps if the clinician is aware of it, accepts that it is valid, views it as applicable to the client, has the necessary resources to be able to implement it, acts on it, and secures the client's agreement and adherence. The chapters in this volume address the first half of the potential leaks: Anthologizing the information about measures and their psychometrics and utility directly tackles the problems of awareness and critical appraisal and also guides choices about applicability. EBA 2.0 pushes the information even further down the pipeline by building a strategic approach to assessment that makes it easier to evaluate common issues. It also combines research and pragmatism to sequence the order of measurements, minimizing redundancy or unnecessary testing that will not inform key questions guiding care. As a result, EBA 2.0 can often choreograph assessment sequences that deliver better results in the same time or less than has been spent in traditional evaluations (cf. Camara, Nathan, & Puente, 2000).

DIAGNOSIS AND TREATMENT FORMULATION AS USUAL

A brief review of typical practices sets a counterpoint that highlights contrasts with EBA. Surveys indicate that most practicing clinicians have been doing minimal assessment beyond an unstructured interview, with the exception of those instances in which clinicians administer, score, and interpret a battery of assessments and write a report (and then rarely provide treatment; Garb, 1998; Jensen-Doss & Hawley, 2011). The multiplicity of factors involved in each clinical scenario forces clinicians to rely on impressionistic, pattern recognition approaches (Kahneman, 2011). Although our evolved cognitive strategies tended to do well in the environment of evolutionary adaptation, the complexity of modern life creates mismatches where our fast, intuitive system often leaps to wrong clinical conclusions, and we may not recover via our slower, effortful processing strategies. Clinical assessment appears to be a paragon of all that can be problematic with our cognitive wiring. Our efforts at empathy focus on emotionally salient material, processing it swiftly to arrive at a hypothesis that we then seek to confirm (and fail to systematically try to disconfirm; Croskerry, 2003). We underestimate complexity, calling off searches when we find a plausible suspect (Jenkins & Youngstrom, 2016; Rettew, Lynch, Achenbach, Dumenci, & Ivanova, 2009). Cultural differences in beliefs about causes and framing of the problem lead to errors in hypotheses that do not get corrected easily (Carpenter-Song, 2009; Yeh et al., 2005). As a result, studies of clinical decision-making accuracy are consistently humbling. Vignette studies show tremendous variation across clinicians in formulations of the same presenting problem and assessment data (Dubicka, Carlson, Vail, & Harrington, 2008; Jenkins, Youngstrom, Washburn, & Youngstrom, 2011). Even video-recorded sessions intended as an inter-rater reliability exercise show massive differences in scoring depending on culture and training (Mackin, Targum, Kalali, Rom, & Young, 2006). In contrast, IBM and other companies are betting that machine learning may prove helpful in decision-making, feeding the multivariate data to artificial intelligence robots to create decision support tools (Susskind & Susskind, 2015). They are using machine learning to mine complex relationships from staggering numbers of variables and honing feedback to the provider and consumer in formats that can lead care. These can result in surprisingly large gains in predictive accuracy, although they are still not a complete solution (James, Witten, Hastie, & Tibshirani, 2013).

PREPARATION PHASE

EBA 2.0 need not wait for the robots to fix everything. Techniques ranging from the simple to the sophisticated are available that would upgrade our practice. The first step is an easy one: Take stock of the most common presenting issues at our practice and make sure that we are well prepared for them. Depression, anxiety, and attention problems are all pervasive problems that will present to any clinical practice. Other core issues may vary with age range and practice setting. Externalizing behavior or learning disabilities may be more common among school-​ aged referrals, whereas personality disorders or substance misuse become more likely with advancing age. The initial step in EBA 2.0 is to identify the half-​dozen to dozen most common issues. Given the sheer volume of cases affected, even a small upgrade in assessment could pay large dividends if it improves results for one of these frequent referral issues. A second step is to benchmark our local rates against other clinics and settings. Benchmarking can reveal gaps in our practice. If we see many clients with anxiety but few with depression, that would be a surprising pattern based on epidemiological studies and clinic surveys (Rettew et  al., 2009). It is possible that our practice has become so specialized that we mostly get referrals for a narrow set of issues, but it is worth considering whether we unknowingly have blinders that eclipse our view of common comorbidities or competing explanations for similar behaviors. We can formally cross-​check our most common diagnoses and case formulations against lists drawn from meta-​ analyses, epidemiological studies, or billing records. The key point is to make sure that we are not overlooking a common scenario. If we are, then that becomes a priority for continuing education, additional reading, professional supervision and consultation, and updates in assessment practices. Table 3.1 lists chapters in this volume that focus on some of the most common conditions, along with prevalence benchmarks based on different sources. Epidemiological studies from the general population probably provide a lower bound for rates that would be

seen at a clinic. Rettew et al.'s (2009) meta-analysis provides rates from an assortment of outpatient clinics. The rates are a helpful starting point but are not etched in stone. Prevalence estimates in each chapter may vary as authors integrate different epidemiological studies or clinical samples; for inpatient settings or specialty clinics, it is likely that the rates of some conditions will be even higher.

Table 3.1  Prevalence Benchmarks for Common Clinical Issues Discussed in This Volume

| Condition | Chapter | Diagnosis as Usual | More Structured Diagnostic Interview | General Population* |
| --- | --- | --- | --- | --- |
| ADHD | 4 | 23% | 38% | 5% in children, 2.5% in adults |
| Externalizing problems | 5 | 17% CD, 37% ODD | 25% CD, 38% ODD | 4% CD, 3% ODD |
| Mood disorders | 6–9 | 17% MDD, 10% dysthymia | 26% MDD, 8% dysthymia | 7% MDD, 1.5% persistent depressive disorder, 2.5% bipolar spectrum |
| Anxiety | 11–14 |  |  |  |
|   Child and adolescent | 11 | 8% | 18% | – |
|   Social anxiety disorder/phobia | 12 | 6% | 20% | 7% |
|   Panic | 13 | 12% | 11% | 3% |
|   Generalized anxiety disorder | 14 | 5% | 10% | 3% |
| Post-traumatic stress disorder | 16 | 3% | 9% | 3.5% |
| Substance use disorders | 17 | 14% | 17% | – |
| Alcohol use disorder | 18 | 10% | 13% | 5% in adolescents, 8.5% in adults |

Note: The Diagnosis as Usual and More Structured Diagnostic Interview columns are clinical rates from Rettew et al. (2009).
* The estimates are 12-month prevalence rates as reported in DSM-5 (American Psychiatric Association, 2013). Epidemiological rates refer to the general population, not treatment-seeking samples, and so often represent a lower bound of what might be expected at a clinic.
ADHD, attention-deficit/hyperactivity disorder; CD, conduct disorder; MDD, major depressive disorder; ODD, oppositional defiant disorder.
Source: Adapted from Youngstrom and Van Meter (2016) and https://en.wikiversity.org/wiki/Evidence_based_assessment
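Because this benchmarking step is just counting and comparing, it is easy to script. The following Python sketch, which is not part of the chapter, shows one way a practice might tally primary diagnoses from its own records and flag conditions whose local rate falls far below a benchmark; the diagnoses, counts, flagging rule, and benchmark values are hypothetical placeholders rather than the published figures.

```python
from collections import Counter

# Hypothetical primary diagnoses pulled from billing or chart records.
chart_diagnoses = ["ADHD", "ADHD", "MDD", "GAD", "ADHD", "Conduct disorder",
                   "MDD", "Social anxiety", "ADHD", "MDD", "GAD", "ADHD"]

# Benchmark clinic rates (proportions); placeholders to be replaced with
# values such as those summarized in Table 3.1 or local payer data.
benchmarks = {"ADHD": 0.38, "MDD": 0.26, "Conduct disorder": 0.25,
              "Social anxiety": 0.20, "GAD": 0.10}

counts = Counter(chart_diagnoses)
total = sum(counts.values())

print(f"{'Condition':<20}{'Local rate':>12}{'Benchmark':>12}")
for condition, benchmark in benchmarks.items():
    local = counts.get(condition, 0) / total
    # Flag conditions seen at less than half the benchmark rate as possible blind spots.
    flag = "  <- check for blind spot" if local < benchmark / 2 else ""
    print(f"{condition:<20}{local:>11.0%} {benchmark:>11.0%}{flag}")
```

Running such a tally once or twice a year, with the placeholder rates swapped for published or local benchmarks, can be enough to prompt the kind of cross-check described above.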

With our personal list of top referral questions in hand, we can then organize our assessments by topic and check whether there is a better method than the incumbent measurement we are using for each. It need not be a huge amount of work. This edited volume creates an easy opportunity to start at a high level: Cross-reference this list of common issues with Table 3.1. Review each relevant chapter to determine if there are measures that fill gaps in our current tool kit or offer greater utility than what we already are using. That strategy capitalizes on the expert review of the literature that informed each chapter to create a strong foundation of assessment methods for the common issues.

The book can also be helpful in updating established and thriving clinical practices. One could pick a "topic of the month" and spend an hour checking if there are better assessment options available for use in the practice. At a training clinic or large practice, the topic of the month could be the focus of a brown bag lunch

seminar; in a private practice, it could be a good use of a cancelled appointment slot. Over the course of a year, cycling through the different topics will update the whole practice while keeping the focus fresh and challenging each month. Avoid perfectionism—​the object is not to find “the best” in any particular category but, rather, to make sure that your practice is good enough (Brighton, 2011; Hunsley, 2007) and that you ratchet it steadily upwards. The list of common issues also helps guide individual assessments. At least screening or inquiring briefly about each of the frequent topics, even if that is not what the client first mentions, leverages the base rates. The simple technique of asking about three to six common problems instead of focusing on the first obvious topic avoids well-​ documented pitfalls of confirmation bias, failing to seek disconfirming evidence, and search “satisficing” (calling off the search as soon as one hypothesis seems confirmed rather than continuing to explore other possibilities; Jenkins & Youngstrom, 2016). Remember that comorbidity is the rule, not the exception, and undetected comorbid problems can undermine treatment. More systematic approaches to assessment also help broach awkward topics—​such as substance misuse, sexual dysfunction, suicidal ideation, or physical abuse—​that may be difficult for clients to spontaneously volunteer (Lucas, Gratch, King, & Morency, 2014).

PREDICTION PHASE

Considering our common issues also informs our choice of core measures. Start with broad measures that cover the common issues, and augment with checklists about risk factors. In the therapeutic context, the first wave of assessment is a scouting exercise to discern the areas to explore in more depth. For adults, there are a range of broad coverage instruments available, including checklists (e.g., Derogatis & Lynn, 1999) and personality inventories (e.g., Minnesota Multiphasic Personality Inventory-​ 2 [MMPI-​ 2] interpretive systems, in addition to self-​report options; Sellbom & Ben-​Porath, 2005). If we are working with adolescents, then it makes sense to start with a broad assessment instrument such as the Achenbach System of Empirically Based Assessment (ASEBA; Achenbach & Rescorla, 2003) or the Adolescent Symptom Inventory (Gadow & Sprafkin, 1997). Scores on these measures have shown good psychometric properties across a variety of samples, and they provide broad coverage of most of the common issues in childhood and adolescence. Compared to the more comprehensive personality tests and interviews, checklists are inexpensive and fairly quick to score, and some provide good normative data to help tease apart what is developmentally typical from the more extreme or problematic levels of behavior. There also are free alternatives to many of these instruments (e.g., Goodman, 1999; Ogles, Melendez, Davis, & Lunnen, 2001), although the lower cost is often achieved by reduced breadth of scales or sacrificing the quality of the normative data (but for exceptions to this in the assessment of adult depression, see Chapter 7, this volume). Often, practitioners fall into the “rule of the tool,” giving every client their favorite assessment instrument without thinking much about how it matches up with the presenting problem or the common issues. No measure is perfect. Considering strengths and shortcomings of each measure compared to the common problems list will help build assessment batteries that are much more comprehensive and balanced without adding unnecessary components that burden the client. For example, the ASEBA, MMPI-​2, and Symptom Checklist 90 (SCL-​90) (Derogatis & Lynn, 1999)  all omit scales that directly assess body image or disordered eating patterns, which could be a prevalent and serious issue in teen or adult women (Wade, Keski-​ Rahkonen, & Hudson, 2011). Alternate scoring systems that rationally select items or use analyses with distinct criterion groups may be needed to cover other issues, such as post-​traumatic stress disorder (You, Youngstrom, Feeny, Youngstrom, & Findling, 2015) or substance misuse.
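As a concrete illustration of checking a core battery against the common-problems list, the short Python sketch below compares the constructs a battery is assumed to cover with the issues a practice sees most often; the instrument names and coverage assignments are hypothetical placeholders, not psychometric claims about any actual scale.

```python
# Hypothetical mapping of core instruments to the constructs they are assumed to cover.
battery_coverage = {
    "Broad checklist": {"depression", "anxiety", "attention", "aggression"},
    "Personality inventory": {"depression", "anxiety", "psychosis", "substance misuse"},
}

# The practice's list of most common presenting issues (placeholder list).
common_issues = {"depression", "anxiety", "attention", "disordered eating",
                 "substance misuse", "trauma"}

covered = set().union(*battery_coverage.values())
gaps = common_issues - covered

print("Covered constructs:", ", ".join(sorted(covered)))
print("Gaps to fill with a targeted measure:", ", ".join(sorted(gaps)))
```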


After the default or core assessment package is set, the next step is to think through the interpretation of each piece with regard to the common issues. If the goal were an exhaustive review of the literature, then the project would quickly become unmanageable (Youngstrom & Van Meter, 2016). However, a comprehensive approach is not necessary or particularly helpful; not all possible permutations of assessment and construct are clinically relevant:  We do not need to know how an attention-​ deficit/​ hyperactivity disorder (ADHD) scale would do at detecting depression, for example. We can match the goal with the scale to focus our interpretive attention, and we can use the “good enough” principle to keep moving (Brighton, 2011). At the prediction phase, a major source of value for an assessment tool would be changing the probability of the client having a diagnosis or problem. In a detective story, successive clues raise or lower suspicion about each suspect. The same is true with clinical assessment: Accumulating risk factors raise the probability, as would high scores on a valid measure of the same construct. Low scores on tests might also reduce the probability, as would protective factors. It is possible to integrate such information in a much more systematic way than just intuitive, impressionistic interpretation. Bayes’ theorem offers an algorithm for updating a probability estimate on the basis of new information. Although it is centuries old, and authorities such as Meehl (1954) have advocated for its use for decades, its time is finally arriving. A combination of shifting winds—​ with EBM, politics, and sports all incorporating it (for popular examples, see http://​fivethirtyeight.com)—​and technology making it more accessible have made it feasible to start using these methods in real time to integrate information and guide decisions. The improvements are profound, in terms of not just increased overall accuracy but also improved consistency (i.e., a constructive reduction in the range of interpretations of the same data) and reduced bias (protecting us from systematic misinterpretations of the same data) (Jenkins & Youngstrom, 2016; Jenkins et  al., 2011). Tools for synthesizing assessment information now include websites and smartphone applications (search for “Evidence-​Based Medicine Calculator” and choose from among the current best reviewed options) as well as probability nomograms—​an analog to old slide rules that used geometric spacing to accomplish various computations. We include a probability nomogram as Figure 3.1 because it helps illustrate the concepts and represents a least common denominator in terms of technological requirements. For readers who are interested in learning more about how to use this approach in clinical

practice, we recommend the article by Van Meter et al. (2014), in which the authors provide extensive details on how to integrate various types of clinical data in order to inform the diagnostic decision-making process.

FIGURE 3.1  Probability nomogram used to combine prior probability with likelihood ratios to estimate revised, posterior probability. (The nomogram's three axes are pretest probability, likelihood ratio, and posttest probability.) Straus et al. (2011) provide the rationale and examples. Youngstrom (2014) and Van Meter et al. (2014, 2016) provide examples both of how to estimate diagnostic likelihood ratios from raw data and how to use a nomogram to apply them to a case.

Probabilistic interpretation involves the following series of steps: (a) Decide the starting, or prior, probability for a particular hypothesis; (b) combine it with

the information added by a specific assessment finding; and (c) review the updated probability and decide on the next clinical action. The information about base rates and common issues provides a starting estimate for step (a). In a probability nomogram, the prior probability gets plotted on the left-​hand column. The information from the assessment finding gets plotted on the middle line, and


then connecting the dots to cross the right-​hand line provides the graphical estimate of the revised probability. For the probability nomogram to work, the information from the assessment needs to be scaled using an effect size called a diagnostic likelihood ratio (DLR). The DLR is a ratio of how common a given finding would be in the population of interest divided by how common it would be in the comparison group. For example, the DLR attached to an implicit association task for risk of self-​injury would be a ratio of how common the result (i.e., a “positive” test result) was among those who self-​injured compared to how common a similar result would be among those who did not (Nock & Banaji, 2007). In older terminology, the DLR for a high risk score would be the diagnostic sensitivity of the result (the “true positive rate”; e.g., out of 100 cases with history of self-​injury, how many had a positive test result and were correctly classified as engaging in self-​injurious behavior) compared to the false-​positive rate (the complement of diagnostic specificity; e.g., out of 100 people who do not self-​injure, how many had a positive test result and were incorrectly classified). A DLR can also be estimated for low risk, or “negative” test results; for example, how many people with a history of self-​injury had the low risk (negative) result (the false-​negative rate, or 1-​sensitivity) divided by the number of people who do not self-​injure and correctly got a low risk (negative) test result (diagnostic specificity). The algebraic relationship means that it is possible to take the sensitivity and specificity for assessments reported in the chapters of this volume and quickly calculate the DLRs for low risk (negative) and high risk (positive) scores. Although academic standards are starting to require greater detail, including the sensitivity and specificity, in articles reporting on diagnostic tools (e.g., Bossuyt et al., 2003), finding the necessary information to calculate DLRs can be challenging. However, this only needs to be done once if we write it down, either as marginalia or on a cheat sheet of measures that we routinely use in our practice. It also is not necessary to do this for all measures—​only the ones that we are going to use regularly. The DLR approach is omnivorous, and it can be fed any assessment result or data about risk or protective factors, as long as they are re-​expressed as DLRs. With a little effort, almost any effect size can be converted (Hasselbad & Hedges, 1995; Viechtbauer, 2010), along with inputs such as percentiles from normative data (Frazier & Youngstrom, 2006). Another advantage of the approach is that it can add information sequentially, in a flexible order, and as it becomes available. To add information about second input, take the revised probability from

37

the first assessment, use it as the next prior probability (i.e., put it on the leftmost line of the nomogram or in the starting field of a calculator), connect it with the next DLR, and get the updated probability. If several DLRs are available at the same time, then they can be multiplied to get a single combined DLR. The method trades the assumption that the correlation between inputs is modest for the flexibility of input sequence. Regression-​based approaches work in the opposite way, optimally adjusting for the degree of covariation among inputs, but at the cost of greater complexity and an inability to work if any one of the variables in the model is missing for a particular case (Kraemer, 1992). More often, the ability to add new data as they become available is a better match for the unfolding process of the clinical encounter. The third part of the EBA cycle is to consider the updated probability of a given outcome or diagnosis and then decide on the next clinical action. EBA 2.0 adapts the EBM concept of two decision thresholds defining three zones of clinical action. The low probability, intermediate, and high probability zones signify watchful waiting, assessment, and acute treatment in the EBM formulation (Straus, Glasziou, Richardson, & Haynes, 2011). With EBA 2.0, there are distinct assessment strategies and titrated interventions for each zone (Youngstrom, 2013). The low probability zone could still warrant a surveillance or monitoring plan to detect worrisome changes, and it could also be a place for primary preventions that are so low risk and low cost that they make sense to deploy regardless of personal circumstances. The intermediate zone is not just the place for more focused assessment targeting the key hypotheses but also may be the realm for using broad-​spectrum, low-​risk interventions such as many forms of therapy. This is the arena in which targeted prevention, peer counseling, bibliotherapy, and generic supportive counseling all could be appropriate, along with changes in sleep hygiene, diet, and other lifestyle factors. The high probability zone may be the place where treatment shifts to specialist interventions, acute pharmacotherapy, and other tertiary interventions. At this stage, assessment shifts to monitoring treatment response, searching for cues of progress (and using failure to progress as a sign that the case formulation should be revisited; Lambert, Harmon, Slade, Whipple, & Hawkins, 2005). Neither threshold—​between low probability and intermediate or between intermediate and high—​has a rigid location on the probability scale. This is by design. The threshold should shift depending on the relative risks and benefits attached to the treatment, or the costs associated with a false negative (i.e., missing a case that truly has the


target problem) or false positive (i.e., overdiagnosis). With very low-​risk, low-​cost interventions, the treatment threshold could drop so low that everyone gets the intervention: This is the primary prevention model, with inoculation and iodized salt to prevent thyroid problems as widespread public health examples. Although there are models to algebraically weight costs and benefits and precisely shift the threshold (for four different but conceptually related models, see Kraemer, 1992; Straus et al., 2011; Swets, Dawes, & Monahan, 2000; Yates & Taub, 2003), these are complicated to implement without computer support. They also probably are not sufficient in themselves. Ultimately, the decisions about when and how to treat are informed by clinical expertise and patient values, and the decision-​making should be shared with the client (Harter & Simon, 2011).
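Because the nomogram is simply a graphical shortcut for Bayes' theorem in odds form, the same arithmetic can be scripted in a few lines. The Python sketch below, offered as a minimal illustration rather than a clinical tool, converts a measure's sensitivity and specificity into DLRs, updates a starting probability through one or more findings, and maps the revised probability onto the three zones of clinical action. The sensitivity, specificity, base rate, and zone thresholds are hypothetical placeholders, and, as noted above, the thresholds should shift with the risks, costs, and benefits of the next clinical action.

```python
def dlrs(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Diagnostic likelihood ratios for a positive and a negative test result."""
    dlr_positive = sensitivity / (1 - specificity)   # true-positive rate / false-positive rate
    dlr_negative = (1 - sensitivity) / specificity   # false-negative rate / true-negative rate
    return dlr_positive, dlr_negative

def update_probability(prior: float, *likelihood_ratios: float) -> float:
    """Combine a prior probability with one or more DLRs via Bayes' theorem in odds form."""
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

def action_zone(posterior: float, wait_threshold: float = 0.10,
                treat_threshold: float = 0.70) -> str:
    """Map a posterior probability onto the three zones of clinical action (placeholder cutoffs)."""
    if posterior < wait_threshold:
        return "low probability: watchful waiting / monitoring"
    if posterior < treat_threshold:
        return "intermediate: more focused assessment, low-risk interventions"
    return "high probability: confirm and move to targeted treatment"

# Hypothetical example: clinic base rate of 20%, a positive screen
# (sensitivity .80, specificity .90), then a second finding with a DLR of 2.0.
dlr_pos, _ = dlrs(sensitivity=0.80, specificity=0.90)
posterior = update_probability(0.20, dlr_pos, 2.0)
print(f"Posterior probability: {posterior:.2f} -> {action_zone(posterior)}")
```

With a 20% starting probability, the positive screen contributes a DLR of 8, and multiplying in the second finding's DLR of 2 yields a posterior of .80, the same answer the nomogram gives graphically.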

PRESCRIPTION PHASE

Returning to the flow through the EBA process, the combination of risk factors and screening or initial assessments will probably be enough to move hypotheses into the mid-range "assessment zone" or demote them from further consideration, but they will not suffice to confirm hypotheses on their own. Nor will they push revised probabilities high enough to guide treatment in isolation. If the EBA system is working, then the initial test results serve to revise the list of hypotheses that are candidates for further intensive evaluation.

Assess More Focused Constructs and Add Collateral Informants

The next stages involve gathering more focused measures and collateral perspectives, as well as perhaps selecting a semi-structured approach for confirming diagnoses. The more focused measures include not just self-report scales and checklists, of which there is an abundance reviewed in the following chapters, but also in many cases performance measures such as neurocognitive tests. Collateral informants are a routine part of evaluations for youths, where parents or teachers may be initiating the referral. Although less commonly used, they can play a valuable role not just in couples counseling but also in assessing behaviors when individuals may lack insight (e.g., mania, psychosis, or adult autism; Dell'Osso et al., 2002) or when they may not be motivated to provide accurate reports (as might be the case with substance misuse, antisocial behavior, or food intake with eating disorders). Treat each chapter topic as a portfolio of options for a particular diagnostic hypothesis, and then select an assessment instrument that is "highly recommended" for evaluating each. If there is information about collateral report options as well, it is worth picking one of the top-tier ones and having it available, too. Although collaterals provide converging perspectives, the correlations tend to be low to moderate (r = .2 to .4 in adults, based on an extensive meta-analysis; Achenbach, Krukowski, Dumenci, & Ivanova, 2005). Disagreements also are informative in terms of gauging insight, motivation for treatment, and other valuable contextual information (for a detailed review and suggestions, see De Los Reyes et al., 2015).

Semi-Structured Diagnostic Interviews

If the goal is to establish a formal diagnosis, then a semi-structured diagnostic interview is the next step indicated in the process. In contrast, the standard of practice for decades has been an unstructured interview, where the clinician listens to the presenting problem, generates a hypothesis, and seeks confirming evidence. Clinicians like this approach because it should employ our training and expertise to be able to recognize complex patterns of information and to sniff out key moderating variables. Unfortunately, studies repeatedly show that rather than a set of virtuoso diagnostic performances, what we accomplish with unstructured interviews are formulations with near-chance inter-rater agreement. That state of affairs guided the decision of the third and subsequent revisions of the Diagnostic and Statistical Manual (DSM; American Psychiatric Association, 2013) to emphasize improving reliability, and it also was the impetus for developing structured diagnostic interviews. Fully structured interviews are highly scripted, to the point that they could be delivered via computer. The scripting and automation push inter-rater reliability to nigh perfection, at the expense of sacrificing clinical judgment. Semi-structured interviews offer a middle way. They are structured in the sense that they include the same set of topics regardless of presenting problem or clinical intuition, and they also embed the algorithms to satisfy specific diagnostic criteria. A semi-structured interview about depression, for instance, should ask about at least the nine symptoms in the criteria for a major depressive episode, as well as include questions checking that the symptoms are part of an episodic change in functioning lasting at least 2 weeks and causing impairment in at least one setting. The "semi" aspect means that the interviewer need not stick exactly to a script but instead can paraphrase, or reword using the patient's own terms. The clinician also can re-inject clinical judgment into the process, but now at the level of leaves and roots, rather than starting with sweeping decisions about choice of branch in the decision-making tree. In practice, compared to fully structured interviews, semi-structured approaches tend to take longer to learn to administer reliably, and they may yield lower reliability estimates. If that price affords better clinical validity and more uptake, it is well worth paying.

Clinicians cling to unstructured interviews. We offer a set of rationalizations: The more structured interviews will take too long; they will damage rapport with our clients; clients will not like the interview. Surveys decisively rebut the issues of patient preference. Patients prefer the more thorough approaches, believing that clinicians have a more comprehensive and accurate understanding of the situation afterwards (Bruchmuller, Margraf, Suppiger, & Schneider, 2011; Suppiger et al., 2009). The issue of time could be handled in any of at least three ways. First, use the previous information from the EBA 2.0 process to select specific modules. Rather than grinding through an entire interview, choose semi-structured interview components focused on the hypotheses still in contention. This method uses the prior assessment data to accomplish what many interviews implement with gating logic and skip-out questions. The selective approach also offers the possibility of choosing modules from different interviews that are optimized for particular conditions. The interviews reviewed in subsequent chapters provide the list of options, and a practitioner could build an eclectic and modular follow-up interview, taking the best from each category. Second, spend longer on the interview. Data show that clients do not mind, and insurance companies are willing to reimburse for the more focused follow-up interview because the prior EBA steps have documented medical necessity. Third, technology is now making it possible to offload the structured interview as an activity that the client does before meeting the practitioner (Susskind & Susskind, 2015). Completely computer-administered interviews are decreasing in cost and increasing in sophistication. The structured interview could become another input in the assessment process, leading to a set of most likely diagnoses, which the clinician then probes before deciding on a final formulation.

Other More Intensive Testing

An EBA approach would deploy other assessments with incremental or confirmatory value at this stage. These are methods that are more burdensome or expensive, precluding use in a universal screening or core battery approach. In the diagnostic arena, they may also put more of a premium on specificity, even at the expense of lower sensitivity, because now the goal is confirmation of a hypothesis that has already passed through the earlier stages of detection (a high-sensitivity filter) and evaluation (Straus et al., 2011). This is the realm of systematic behavioral observation with targeted hypotheses, of neurocognitive testing, of drug testing kits, and of polysomnography to evaluate the potential presence of a formal sleep disorder. This could become the province of wearable consumer devices and health-related smartphone applications that measure sleep, activity, heart rate, and other physiological and behavioral parameters.

Treatment Planning and Goal Setting

The assessments should serve to identify treatment targets by pushing the probability high enough to warrant corresponding intervention, by direct confirmation using a sufficiently structured interview, or by a combination of these. EBA should not only establish a treatment target but also detect secondary targets, such as comorbidities or areas of impaired functioning. It should also provide alerts to factors that would change the choice of intervention. Comorbid substance misuse, low verbal ability, or a personality disorder all could significantly complicate treatment and lead to poorer prognosis if not addressed. Having arrived at a case formulation, the next step is to negotiate a treatment plan and set measurable goals. We view this as a negotiation because collaborative approaches to care are desirable on ethical and utilitarian grounds. When clients buy into the plan, they are more invested in treatment and more likely to follow through on recommendations and achieve better outcomes. Client beliefs and preferences should be considered throughout the assessment and treatment process, but they deserve extra attention here. Many areas of medicine have developed decision aids to help the patient understand the risks and benefits of different treatment options. This is an area for growth in clinical psychology. At a minimum, a direct and culturally sensitive discussion should occur, and the provider should explicitly link elements of treatment to the stated preferences and provide a meaningful rationale for how treatment would promote attaining the goals. The client may not be ready or motivated to work on everything that the assessment process reveals. When it is possible to focus on shared goals, engagement and rapport will be at a substantial advantage. With targets agreed upon, assessments also establish a baseline measure of severity, and many can add nomothetic benchmarks against which to measure progress.
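Pulling these prescription-phase steps together, the sketch below illustrates one way to turn revised probabilities into a short follow-up plan: demoting low-probability hypotheses, queuing focused measures or interview modules for those still in the assessment zone, and carrying high-probability hypotheses forward as candidate treatment targets. Every name, probability, and threshold in the example is a hypothetical placeholder.

```python
# Revised probabilities after the prediction-phase updating (hypothetical values).
posteriors = {"ADHD": 0.62, "depression": 0.35, "bipolar spectrum": 0.06,
              "substance misuse": 0.15}

# Hypothetical portfolio: a focused measure and an interview module per hypothesis.
portfolio = {
    "ADHD": ("ADHD rating scale plus collateral report", "ADHD interview module"),
    "depression": ("depression severity scale", "mood disorders interview module"),
    "bipolar spectrum": ("hypomania checklist plus caregiver report", "mood disorders interview module"),
    "substance misuse": ("substance use screen", "substance use interview module"),
}

WAIT, TREAT = 0.10, 0.70  # placeholder thresholds carried over from the prediction phase

for hypothesis, probability in sorted(posteriors.items(), key=lambda kv: -kv[1]):
    if probability < WAIT:
        plan = "demote; keep on surveillance"
    elif probability < TREAT:
        focused, module = portfolio[hypothesis]
        plan = f"assess further: {focused}; consider {module}"
    else:
        focused, module = portfolio[hypothesis]
        plan = f"confirm with {module}; carry forward as a treatment target"
    print(f"{hypothesis:<18} p={probability:.2f}  {plan}")
```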


Tools such as behavior checklists that have standardization data offer normative comparisons in the form of percentiles, T scores, and the like. Interestingly, the scores that are the most elevated are not always the most impairing or distressing (Weisz et al., 2011), and so yet again it is valuable to get the client’s input. Selecting one or more scales as an operational definition of a treatment outcome will provide a more quantifiable and perhaps objective indication of progress.
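Because T scores are scaled to a mean of 50 and a standard deviation of 10, translating a baseline score into a percentile for feedback takes one line of arithmetic, as in the brief sketch below; the score is hypothetical and a normal distribution is assumed.

```python
import math

def t_score_to_percentile(t_score: float) -> float:
    """Percentile of a T score (mean 50, SD 10) under a normal distribution."""
    z = (t_score - 50) / 10
    return 50 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical baseline on a norm-referenced checklist scale.
print(f"T = 68 falls at about the {t_score_to_percentile(68):.0f}th percentile")
```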

PROCESS: TREATMENT MONITORING AND TREATMENT OUTCOME

Therapy, like going on a diet, is a challenging form of behavior change. The chances of success increase with explicit goals and regular brief measures of progress—like weighing in on a bathroom scale—and process. The psychometric qualities and practical parameters are quite different for a progress or process measure compared to a diagnostic assessment (Youngstrom et al., 2017). Brevity is a major consideration. Although loss of diagnosis may be a goal of treatment, few practitioners or clients would want to repeat a full structured interview several times over the course of treatment. Sensitivity to treatment effects is another key function; in part for this reason, personality or general cognitive ability tests are not used as outcome measures. Treatment sensitivity requires a blend of enough retest stability to indicate when problems persist, yet also malleability that can indicate if the intervention has the desired effect. Indices of retest reliability are not adequate in isolation to judge suitability for measuring outcome. Conceptually, generalizability coefficients or intraclass correlations quantifying the amount of variance attributable to treatment would be ideal, although they are rarely reported in the literature.

Nomothetic Goal Setting

Norm-referenced measures create an opportunity for nomothetic definitions of treatment milestones. Jacobson and colleagues developed an influential model for this, framing clinically significant change as requiring psychometrically reliable improvement along with transiting an a priori benchmark (Jacobson, Roberts, Berns, & McGlinchey, 1999). Jacobson and colleagues used a reliable change index (RCI) as a way of showing that individual treatment response was unlikely to be due to measurement error or instability. The RCI converts raw change scores into a z-score-type metric, using the standard error of the difference as the scale. Values greater than 1.65 would connote 90% confidence that the change was reliable, and 1.96 would demarcate 95% confidence. In practice, retest stabilities are rarely reported, and even less likely to match the naturalistic length of treatment, so people often use the internal consistency reliability as the basis for estimating the standard error of the measure and then the standard error of the difference (Ogles, 1996). Research reports and reviews tend to focus on group statistics and not the standard errors, so it may be necessary to calculate these for the outcome measures we use regularly. For each common treatment target, select one assessment instrument that will be feasible to use, and make a cheat sheet with the standard error of the difference score; or, even more conveniently, jot down the number of points required for 90% or 95% confidence in the change. A more recent alternative to the RCI is the minimally important difference (MID) method, which uses patient preferences to define the smallest increment of change that they would find meaningful (Thissen et al., 2016). MID milestones tend to be smaller than RCI ones, making them easier to achieve and also indicating that more subtle changes can still be important to the individual.

The second part of Jacobson and colleagues' (1999) definition involves passing a benchmark defined by normative data. There are three operational definitions: moving Away from the clinical range, moving Back into the nonclinical range, and moving Closer to the nonclinical than clinical average. The Back definition requires normative data in a nonclinical sample, and the Away definition needs a relevant clinical sample to generate the benchmark; the Closer definition needs both the nonclinical and the clinical samples for estimation. The requirements create a practical barrier to implementation: Many assessments lack the requisite normative data (Youngstrom et al., 2017). The thresholds are also rarely reported, although they are relatively simple to calculate if the data are accessible. Jacobson and colleagues recommended using two standard deviations (SDs) as the rule of thumb for defining the Away and Back thresholds (e.g., moving beyond 2 SDs from the clinical mean or back within 2 SDs of the nonclinical mean), and the Closer threshold is the weighted average of the clinical and nonclinical means. Again, these are worth calculating for the primary outcome measure we select for each common treatment target. Writing them down leverages the few minutes of work involved, providing a resource for treatment across many cases.

From a psychometric perspective, measures best suited for the nomothetic definitions of clinically significant change will have high reliability—translating into precise estimates of the client's true score in classical test theory—coupled with large separation between the clinical and nonclinical distributions, most often indexed as Cohen's d effect size. The high reliability is often achieved via increasing scale length, as the number of items is part of the internal consistency reliability formula. As a result, the tools precise enough to measure change well may be too long to repeat frequently. The nomothetic benchmarks may work best as midterm and final exams—panels of evaluation that are used less often but that provide fairly deep evaluation of progress (Youngstrom et al., 2017).
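Because the standard errors and benchmarks are rarely published, it can help to compute them once per routine outcome measure and keep them on the cheat sheet described above. The Python sketch below is a worked example of that arithmetic: it uses internal consistency to approximate the standard error of measurement, derives the reliable change thresholds, and computes Jacobson-style Away, Back, and Closer benchmarks (here weighting the Closer cutoff by the two standard deviations, one common implementation). The reliability, means, standard deviations, and client scores are hypothetical placeholders for whatever the chosen measure's manual or normative studies report.

```python
import math

# Hypothetical psychometrics for a routine outcome scale (lower scores = improvement).
reliability = 0.90          # internal consistency used in place of retest stability
mean_nonclinical = 50.0
sd_nonclinical = 10.0
mean_clinical = 75.0
sd_clinical = 10.0

# Standard error of measurement and of the difference between two scores.
sem = sd_nonclinical * math.sqrt(1 - reliability)
se_diff = math.sqrt(2) * sem

# Raw change needed for 90% and 95% confidence that change is reliable.
change_90 = 1.65 * se_diff
change_95 = 1.96 * se_diff

# Jacobson-style benchmarks: Away from the clinical mean, Back within the
# nonclinical range, and Closer to the nonclinical than the clinical mean.
away = mean_clinical - 2 * sd_clinical
back = mean_nonclinical + 2 * sd_nonclinical
closer = (sd_clinical * mean_nonclinical + sd_nonclinical * mean_clinical) / (sd_clinical + sd_nonclinical)

print(f"SEM = {sem:.1f}, SE(diff) = {se_diff:.1f}")
print(f"Reliable change: {change_90:.1f} points (90%), {change_95:.1f} points (95%)")
print(f"Benchmarks: Away < {away:.1f}, Back < {back:.1f}, Closer < {closer:.1f}")

# Hypothetical client: pretreatment score 72, posttreatment score 58.
pre, post = 72, 58
reliable = (pre - post) >= change_95
clinically_significant = reliable and post < closer
print(f"Change of {pre - post} points: reliable={reliable}, clinically significant={clinically_significant}")
```

An MID-based milestone could be substituted for the reliable change threshold in the final check when patient-derived values are available.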


Idiographic Goal Setting

A complementary approach to goal setting and tracking is an idiographic approach, in which the client defines targets of interest and uses a simple way of scaling and recording them to provide frequent feedback. Often, these are single-item scales, with simple Likert-type scoring. The Youth Top Problems approach asks the youth and the caregiver to each pick three things that they want therapy to improve and then report on them at every session using a 0–10 scale (Weisz et al., 2011). The reliability of the approach derives from the repeated measurement. One could think of the number of repetitions as the functional length of the scale. The brevity and the salience of the content (because the client chose it) make the approach feasible. It can be remarkably sensitive to treatment effects. It also is likely to enhance treatment effects, much as stepping regularly on a bathroom scale increases the effectiveness of the diet. Measurement-based care advocates using these sorts of short, focused evaluations. These also can provide feedback in real time, allowing for course corrections during treatment if there is failure to progress or if there are iatrogenic effects.

Process Measurement

Many interventions are skill based, and it is possible to track the behaviors that are components of the therapeutic process. The possibilities are broad and include examples such as daily report cards when evaluating interventions for impulsive or externalizing behaviors (see Chapter 4, this volume), completion of three- and five-column charts in cognitive–behavioral therapy, use of coping or diary cards in dialectical behavioral therapy, or counting the number of core conflictual relational themes surfaced during a session of psychodynamic therapy (Luborsky, 1984). Tracking the number of cancellations or no-shows also provides a behavioral measure of engagement, and other measures of adherence are possible. Process variables can include mediational variables in treatment models, and some may be worth measuring during the course of therapy to ensure that the intervention is starting to produce the desired changes, even if the more global outcomes may take some weeks to achieve. The burgeoning number of mental health applications for smartphones and other devices will create ways of tracing utilization without requiring additional work on the part of the client. These variables are more tied to the particular intervention used, and so they are less likely to be covered in a chapter devoted to assessment. They are important, nonetheless, and will repay any investment in planning and gathering them.

Maintenance Monitoring

When treatment goes well, termination planning should celebrate the success, and a plan should also be developed for maintenance and for relapse prevention (Ward, 1984). The reality is that many conditions are recurrent (e.g., mood disorders), chronic (e.g., ADHD and personality disorders), or prone to relapse (e.g., substance misuse). There also may be predictable triggers and stressful events, such as moving or separating from a partner, that create opportunities to plan ahead and promote the generalization of successful behaviors. As we conclude a course of therapy, it makes sense to have an assessment strategy that will monitor gains and provide early warning of things worsening. Kazdin and Weisz (1998) discussed a "dental model" of care, in which routine check-ups are scheduled without waiting for a crisis. These promote prevention as well as early intervention. For clients to use the monitoring strategies, the strategies need to be low friction, convenient, and focused on things that the clients care about (Youngstrom et al., 2017). Here, too, phone applications and wearable technology are making innovations possible. Daily items tracking substance use or stress, or wearables tracking exercise and sleep, create new opportunities for monitoring long-term health.
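As one hypothetical way to operationalize early warning during maintenance, the brief sketch below flags a check-up when a weekly 0–10 rating stays at or above an agreed warning level for two consecutive weeks; the ratings and threshold are invented for illustration.

```python
# Hypothetical weekly 0-10 ratings of a top problem collected after termination.
weekly_ratings = [2, 1, 2, 3, 5, 6, 4, 7, 8]
WARNING_LEVEL = 5      # placeholder threshold agreed on with the client
CONSECUTIVE_WEEKS = 2  # how many elevated weeks before reaching out

elevated_streak = 0
for week, rating in enumerate(weekly_ratings, start=1):
    # Extend the streak of elevated weeks, or reset it when the rating drops.
    elevated_streak = elevated_streak + 1 if rating >= WARNING_LEVEL else 0
    if elevated_streak >= CONSECUTIVE_WEEKS:
        print(f"Week {week}: ratings elevated {elevated_streak} weeks running -> schedule a check-up")
```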

UTILITY: HOW MUCH WILL IT COST?

Psychological assessment looks different viewed through the lens of EBA 2.0, with different techniques woven through the intervention process from before the


start of treatment to after its conclusion. Although full implementation of EBA adds several new techniques and moves assessment out of its traditional box at the beginning of treatment (or even separated entirely from treatment, as often happens with the assessment report model), it does not usually demand extra time from the client. Sequencing is key. If everyone were screened for all conditions, regardless of prevalence, or if all completed comprehensive neurocognitive batteries along with structured interviews, then time and cost would balloon (Kraemer, 1992; Youngstrom & Van Meter, 2016). Using knowledge of base rates lets us configure our assessment sequence to cover common scenarios first. Then we selectively add an assessment only when it has the potential to answer questions about prediction, prescription, or process. Use of short forms, hybrid models that blend rating scales with modular semi-structured interviews, and brief idiographic items all promote feasibility. Gathering the nuts and bolts in advance—making the cheat sheet with the diagnostic likelihood ratios, reliable change indices, and normative benchmarks—is a one-time investment in enhancing the assessment and treatment for all subsequent clients.

Fiscal costs are in flux, as test publishers are now experimenting with subscription or fee-for-scoring models. There also are a plethora of public domain and free options, many of which have accumulated evidence of reliability and validity across a range of populations and settings (Beidas et al., 2015; see also Chapter 7, this volume). Professional societies are currently reviewing and anthologizing many of these (e.g., the American Academy of Child & Adolescent Psychiatry's Practice Toolbox pages [http://www.aacap.org/aaCaP/Clinical_Practice_Center/Home.aspx] and the Society of Clinical Psychology's assessment pages [http://www.div12.org]), making them more convenient to find, and a growing number are now available on Wikipedia and Wikiversity (Youngstrom et al., 2017). The proliferating mental health-related software applications also are low cost or free.

As a result, neither time nor cost is a serious obstacle to implementing EBA 2.0. The gains in accuracy of diagnosis are profound. Inasmuch as diagnosis and formulation guide the effective choice of treatment, better outcomes should follow. Measurement-based care also is showing that progress and process evaluation enhance the implementation of treatment and provide a small to medium-sized boost in outcomes (e.g., Guo et al., 2015). Considering the evidence of utility, perhaps the better question is whether we can afford not to adopt EBA 2.0.
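As one concrete example of the "nuts and bolts" that can be assembled ahead of time, a reliable change index in the tradition of Jacobson and colleagues (see Jacobson et al., 1999) can be pre-computed for any measure once its normative standard deviation and test-retest reliability are known. The sketch below is a minimal illustration in Python; the standard deviation, reliability, and scores are hypothetical placeholders rather than figures from any specific instrument.

```python
import math

def reliable_change_index(pre: float, post: float, sd: float, r_xx: float) -> float:
    """Jacobson-style reliable change index: the pre-post difference divided
    by the standard error of the difference score."""
    s_diff = math.sqrt(2.0) * sd * math.sqrt(1.0 - r_xx)
    return (post - pre) / s_diff

# Hypothetical rating scale: normative SD = 10, test-retest reliability = .85.
rci = reliable_change_index(pre=72.0, post=61.0, sd=10.0, r_xx=0.85)
print(round(rci, 2))  # -2.01; a magnitude above 1.96 is conventionally read as reliable change
```

A one-page cheat sheet of such thresholds, alongside clinical cutoffs and diagnostic likelihood ratios for the clinic's usual battery, is the kind of one-time investment described above.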

References Achenbach, T. M., Krukowski, R. A., Dumenci, L., & Ivanova, M. Y. (2005). Assessment of adult psychopathology: Meta-​analyses and implications of cross-​informant correlations. Psychological Bulletin, 131, 361–​382. Achenbach, T. M., & Rescorla, L. A. (2003). Manual for the ASEBA adult forms & profiles. Burlington, VT: University of Vermont. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Beidas, R. S., Stewart, R. E., Walsh, L., Lucas, S., Downey, M. M., Jackson, K.,  .  .  .  Mandell, D. S. (2015). Free, brief, and validated: Standardized instruments for low-​ resource mental health settings. Cognitive & Behavioral Practice, 22, 5–​19. Bossuyt, P. M., Reitsma, J. B., Bruns, D. E., Gatsonis, C. A., Glasziou, P. P., Irwig, L. M., . . . Lijmer, J. G. (2003). The STARD statement for reporting studies of diagnostic accuracy:  Explanation and elaboration. Annals of Internal Medicine, 138(1), W1–W12. Brighton, H. (2011). The future of diagnostics:  From optimizing to satisficing. In G. Gigerenzer & J. A. Muir Gray (Eds.), Better doctors, better patients, better decisions (pp. 281–​294). Cambridge, MA: MIT Press. Bruchmuller, K., Margraf, J., Suppiger, A., & Schneider, S. (2011). Popular or unpopular? Therapists’ use of structured interviews and their estimation of patient acceptance. Behavior Therapy, 42, 634–​643. Camara, W. J., Nathan, J. S., & Puente, A. E. (2000). Psychological test usage:  Implications in professional psychology. Professional Psychology:  Research and Practice, 31, 141. Carpenter-​ Song, E. (2009). Caught in the psychiatric net:  Meanings and experiences of ADHD, pediatric bipolar disorder and mental health treatment among a diverse group of families in the United States. Culture, Medicine, & Psychiatry, 33, 61–​85. Croskerry, P. (2003). The importance of cognitive errors in diagnosis and strategies to minimize them. Academic Medicine, 78, 775–​780. De Los Reyes, A., Augenstein, T. M., Wang, M., Thomas, S. A., Drabick, D. A., Burgers, D. E., & Rabinowitz, J. (2015). The validity of the multi-​informant approach to assessing child and adolescent mental health. Psychological Bulletin, 141, 858–​900. Dell’Osso, L., Pini, S., Cassano, G. B., Mastrocinque, C., Seckinger, R. A., Saettoni, M., . . . Amador, X. F. (2002). Insight into illness in patients with mania, mixed mania, bipolar depression and major depression with psychotic features. Bipolar Disorders, 4, 315–​322. Derogatis, L. R., & Lynn, L. L. (1999). Psychological tests in screening for psychiatric disorder. In M. E. Maruish (Ed.), The use of psychological testing for treatment


planning and outcomes assessment (2nd ed., pp. 41–​79). Mahwah, NJ: Erlbaum. Dubicka, B., Carlson, G. A., Vail, A., & Harrington, R. (2008). Prepubertal mania:  Diagnostic differences between US and UK clinicians. European Child & Adolescent Psychiatry, 17, 153–​161. Frazier, T. W., & Youngstrom, E. A. (2006). Evidence-​based assessment of attention-​ deficit/​ hyperactivity disorder:  Using multiple sources of information. Journal of the American Academy of Child & Adolescent Psychiatry, 45, 614–​620. Gadow, K. D., & Sprafkin, J. (1997). Adolescent Symptom Inventory:  Screening manual. Stony Brook, NY: Checkmate Plus. Garb, H. N. (1998). Studying the clinician:  Judgment research and psychological assessment. Washington, DC: American Psychological Association. Glasziou, P. P., & Haynes, B. (2005). The paths from research to improved health outcomes. ACP Journal Club, 142, A8–​A10. Goodman, R. (1999). The extended version of the Strengths and Difficulties Questionnaire as a guide to child psychiatric caseness and consequent burden. Journal of Child Psychology & Psychiatry, 40, 791–​799. Guo, T., Xiang, Y. T., Xiao, L., Hu, C. Q., Chiu, H. F., Ungvari, G. S.,  .  .  .  Wang, G. (2015). Measurement-​ based care versus standard care for major depression: A randomized controlled trial with blind raters. American Journal of Psychiatry, 172, 1004–​1013. Harter, M., & Simon, D. (2011). Do patients want shared decision making and how is this measured? In G. Gigerenzer & J. A. Muir Gray (Eds.), Better doctors, better patients, better decisions (pp. 53–​58). Cambridge, MA: MIT Press. Hasselbad, V., & Hedges, L. V. (1995). Meta-​ analysis of screening and diagnostic tests. Psychological Bulletin, 117, 167–​178. Hunsley, J. (2007). Training psychologists for evidence-​based practice. Canadian Psychology, 38, 32–​42. Hunsley, J., & Mash, E. J. (2007). Evidence-​based assessment. Annual Review of Clinical Psychology, 3, 29–​51. Jacobson, N. S., Roberts, L. J., Berns, S. B., & McGlinchey, J. B. (1999). Methods for defining and determining the clinical significance of treatment effects:  Description, application, and alternatives. Journal of Consulting and Clinical Psychology, 67, 300–​307. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning: With applications in R. New York, NY: Springer. Jenkins, M. M., & Youngstrom, E. A. (2016). A randomized controlled trial of cognitive debiasing improves assessment and treatment selection for pediatric bipolar disorder. Journal of Consulting and Clinical Psychology, 84, 323–​333.


Jenkins, M. M., Youngstrom, E. A., Washburn, J. J., & Youngstrom, J. K. (2011). Evidence-​ based strategies improve assessment of pediatric bipolar disorder by community practitioners. Professional Psychology:  Research and Practice, 42, 121–​129. Jensen-​Doss, A., & Hawley, K. M. (2011). Understanding clinicians’ diagnostic practices:  Attitudes toward the utility of diagnosis and standardized diagnostic tools. Administration and Policy in Mental Health, 38, 476–​485. Kahneman, D. (2011). Thinking, fast and slow. New  York, NY: Farrar, Straus & Giroux. Kazdin, A. E., & Weisz, J. R. (1998). Identifying and developing empirically supported child and adolescent treatments. Journal of Consulting and Clinical Psychology, 66, 19–​36. Kraemer, H. C. (1992). Evaluating medical tests:  Objective and quantitative guidelines. Newbury Park, CA: Sage. Lambert, M. J., Harmon, C., Slade, K., Whipple, J. L., & Hawkins, E. J. (2005). Providing feedback to psychotherapists on their patients’ progress:  Clinical results and practice suggestions. Journal of Clinical Psychology, 61, 165–​174. Luborsky, L. (1984). Principles of psychoanalytic psychotherapy. New York, NY: Basic Books. Lucas, G. M., Gratch, J., King, A., & Morency, L.-​P. (2014). It’s only a computer: Virtual humans increase willingness to disclose. Computers in Human Behavior, 37, 94–​100. Mackin, P., Targum, S. D., Kalali, A., Rom, D., & Young, A. H. (2006). Culture and assessment of manic symptoms. British Journal of Psychiatry, 189, 379–​380. Retrieved from https://​www.ncbi.nlm.nih.gov/​entrez/​query.fcgi ?cmd=Retrieve&db=PubMed&dopt=Citation&list_​ uids=17012663 Meehl, P. E. (1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence. Minneapolis, MN: University of Minnesota Press. Nock, M. K., & Banaji, M. R. (2007). Prediction of suicide ideation and attempts among adolescents using a brief performance-​based test. Journal of Consulting and Clinical Psychology, 75, 707–​715. Norcross, J. C., Hogan, T. P., & Koocher, G. P. (2008). Clinician’s guide to evidence based practices: Mental health and the addictions. London, UK: Oxford University Press. Ogles, B. M. (1996). Assessing outcome in clinical practice. Boston, MA: Allyn & Bacon. Ogles, B. M., Melendez, G., Davis, D. C., & Lunnen, K. M. (2001). The Ohio Scales: Practical outcome assessment. Journal of Child & Family Studies, 10, 199–​212. Rettew, D. C., Lynch, A. D., Achenbach, T. M., Dumenci, L., & Ivanova, M. Y. (2009). Meta-​analyses of agreement between diagnoses made from clinical evaluations and standardized diagnostic interviews. International Journal of Methods in Psychiatric Research, 18, 169–​184.


Sellbom, M., & Ben-​ Porath, Y. S. (2005). Mapping the MMPI-​2 Restructured Clinical scales onto normal personality traits: Evidence of construct validity. Journal of Personality Assessment, 85, 179–​187. Straus, S. E., Glasziou, P., Richardson, W. S., & Haynes, R. B. (2011). Evidence-​based medicine: How to practice and teach EBM (4th ed.). New York, NY: Churchill Livingstone. Suppiger, A., In-​Albon, T., Hendriksen, S., Hermann, E., Margraf, J., & Schneider, S. (2009). Acceptance of structured diagnostic interviews for mental disorders in clinical practice and research settings. Behavior Therapy, 40, 272–​279. Susskind, R., & Susskind, D. (2015). The future of the professions: How technology will transform the work of human experts. New York, NY: Oxford University Press. Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Psychological science can improve diagnostic decisions. Psychological Science in the Public Interest, 1, 1–​26. Thissen, D., Liu, Y., Magnus, B., Quinn, H., Gipson, D. S., Dampier, C., . . . DeWalt, D. A. (2016). Estimating minimally important difference (MID) in PROMIS pediatric measures using the scale-​judgment method. Quality of Life Research, 25, 13–​23. Van Meter, A., Youngstrom, E., Youngstrom, J. K., Ollendick, T., Demeter, C., & Findling, R. L. (2014). Clinical decision making about child and adolescent anxiety disorders using the Achenbach System of Empirically Based Assessment. Journal of Clinical Child & Adolescent Psychology, 43, 552–​565. Van Meter, A. R., You, D. S., Halverson, T., Youngstrom, E. A., Birmaher, B., Fristad, M. A.,  .  .  .  Lams Group, T. (2016). Diagnostic efficiency of caregiver report on the SCARED for identifying youth anxiety disorders in outpatient settings. Journal of Clinical Child and Adolescent Psychology, 2, 1–​15. Viechtbauer, W. (2010). Conducting meta-​analyses in R with the metafor packages. Journal of Statistical Software, 36, 1–​48. Wade, T. D., Keski-​Rahkonen, A., & Hudson, J. I. (2011). Epidemiology of eating disorders. In M. Tsuang, M. Tohen, & P. B. Jones (Eds.), Textbook of psychiatric epidemiology (pp. 343–​360). New York, NY: Wiley.

Ward, D. E. (1984). Termination of individual counseling: Concepts and strategies. Journal of Counseling and Development, 63, 21–​25. Weisz, J. R., Chorpita, B. F., Frye, A., Ng, M. Y., Lau, N., Bearman, S. K., . . . Hoagwood, K. E. (2011). Youth Top Problems:  Using idiographic, consumer-​guided assessment to identify treatment needs and to track change during psychotherapy. Journal of Consulting and Clinical Psychology, 79, 369–​380. Yates, B. T., & Taub, J. (2003). Assessing the costs, benefits, cost-​ effectiveness, and cost–​ benefit of psychological assessment: We should, we can, and here’s how. Psychological Assessment, 15, 478–​495. Yeh, M., Hough, R. L., Fakhry, F., McCabe, K. M., Lau, A. S., & Garland, A. F. (2005). Why bother with beliefs? Examining relationships between race/​ethnicity, parental beliefs about causes of child problems, and mental health service use. Journal Consulting and Clinical Psychology, 73, 800–​807. You, S. D., Youngstrom, E. A., Feeny, N. C., Youngstrom, J. K., & Findling, R. L. (2017). Comparing the diagnostic accuracy of five instruments for detecting posttraumatic stress disorder in youth. Journal of Clinical Child & Adolescent Psychology, 46, 511–​522. Youngstrom, E. A. (2013). Future directions in psychological assessment:  Combining evidence-​based medicine innovations with psychology’s historical strengths to enhance utility. Journal of Clinical Child & Adolescent Psychology, 42, 139–​159. Youngstrom, E. A. (2014). A primer on receiver operating characteristic analysis and diagnostic efficiency statistics for pediatric psychology: We are ready to ROC. Journal of Pediatric Psychology, 39, 204–​221. Youngstrom, E. A., & Van Meter, A. (2016). Empirically supported assessment of children and adolescents. Clinical Psychology: Science and Practice, 23, 327–​347. Youngstrom, E. A., Van Meter, A., Frazier, T. W., Hunsley, J., Prinstein, M., Ong, M.-​L., & Youngstrom, J. K. (2017). Evidence-​based assessment as an integrative model for applying psychological science to guide the voyage of treatment. Clinical Psychology:  Science and Practice, 24, 331–363.


Part II

Attention-​Deficit and Disruptive Behavior Disorders


4

Attention-Deficit/Hyperactivity Disorder

Charlotte Johnston
Sara Colalillo

This chapter focuses on the assessment of attention-deficit/hyperactivity disorder (ADHD) in clinical settings and on measures appropriate for youth. Six- to 12-year-old children are the group most frequently referred for assessment and treatment of ADHD, and therefore literatures regarding assessment at other ages are not as well developed and are not reviewed in this chapter. However, consistent with the recent adoption of a lifespan perspective on ADHD (American Psychiatric Association [APA], 2013), in this chapter we do include brief information pertaining to the assessment of ADHD in adulthood. Research focused on the assessment of ADHD earlier in life, particularly in the preschool years, is mounting (e.g., Ghuman & Ghuman, 2014; Harvey, Lugo-Candelas, & Breaux, 2015; Rabinovitz, O'Neill, Rajendran, & Halperin, 2016). There are a number of challenges to the identification of ADHD in this younger age range, including less consistency in the contexts in which children are assessed (e.g., preschool, day care, and home care) and less distinctiveness of ADHD symptoms from other problem behaviors. However, the potential benefits of early identification of the disorder make this area of work an important frontier. Similarly, although most youth are diagnosed with ADHD prior to adolescence, some symptom presentations (e.g., primary problems with inattention) or some circumstances may result in ADHD escaping earlier detection. In addition, the increased autonomy or academic demands associated with adolescence often necessitate a renewed focus on ADHD assessment as a precursor to developing or modifying treatment plans. Readers are referred to Barkley (2006) for an overview of issues related to assessment of ADHD in adolescents. The relatively high prevalence of ADHD, combined with the pernicious nature of the problems associated with it and the persistence of the disorder over time (APA, 2013),

make comprehensive and accurate clinical assessment an imperative for guiding clinical care in this population. In addition, perhaps more than many diagnoses, the ADHD diagnosis has been the subject of considerable controversy. Much of this controversy is fueled by frequent, and at times sensationalistic, media reports. Many individuals, including parents of children who undergo assessments for ADHD, express fear that this is an overused diagnostic label designed merely to control children’s naturally rambunctious or extroverted nature and to justify the use of psychotropic medications. Contrary to these concerns, the scientific community has provided ample evidence to support the validity of the disorder and its associated treatments (Barkley, 2002; Kooij et  al., 2010; National Institutes of Health, 2000). Furthermore, evidence suggests that although the diagnosis may sometimes be overused, it is just as frequently missed (e.g., Angold, Erkanli, Egger, & Costello, 2000; Levy, 2015; Sayal, Goodman, & Ford, 2006). However, for each individual child there is no substitute for careful, evidence-​based assessment to provide the best possible clinical service and to assist parents and children in understanding the meaning of the diagnostic label, the link between assessment and treatment recommendations, and the need to monitor impairments and treatment effects over time. We begin the chapter with an overview of ADHD, providing a sense of the core characteristics of the disorder that need to be assessed. We then review assessment measures for children that serve three purposes, along with the unique challenges that may accompany each purpose: (a) measures used for diagnostic purposes, (b)  measures useful for case formulation and treatment planning, and (c) assessments for monitoring the course and outcome of interventions. For each purpose, we have constructed a table indicating measures that meet psychometric criteria


set out by the editors in Chapter 1 of this volume. In the text, we offer brief descriptions of these measures and occasionally mention other promising assessment tools that do not, as yet, meet the criteria used for including measures in the tables. Following the review of assessment tools appropriate for children, we consider the best tools available for the assessment of ADHD in adults. Finally, we conclude with an overview of the state-​of-​the-​art with regard to the assessment of ADHD, with a focus on the challenges that remain for research and clinical practice.

THE NATURE OF ADHD

The study of ADHD is one of the largest empirical literatures in child psychopathology and encompasses evidence regarding the genetic, biological, neurological, psychological, social, and cultural characteristics of the disorder. Significant advances are being made in our understanding of ADHD, including exciting theoretical and empirical works probing the core causes and nature of the disorder (e.g., Gallo & Posner, 2016; Karalunas et al., 2014; Musser, Galloway-​Long, Frick, & Nigg, 2013; Nigg, Willcutt, & Doyle, 2005; Sonuga-​Barke, Cortese, Fairchild, & Stringaris, 2016). The vibrant nature of research on ADHD bodes well for advancing our ability to clinically assess, treat, and potentially even prevent this disorder. However, the rapidly expanding and dynamic nature of the research also means that evidence-​based assessment of ADHD must continually change as it incorporates new evidence. Thus, one challenge to the assessment of ADHD is the need for clinicians to constantly update their knowledge about the disorder and to revise assessment tools and methods accordingly. The first and perhaps most critical recommendation we offer for the assessment of ADHD is that the information in this chapter has an expiry date, and only by keeping abreast of the science of ADHD can clinical practice in this area remain appropriate. ADHD is defined in the most recent edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​5; APA, 2013)  as a neurodevelopmental disorder characterized by developmentally inappropriate and maladaptive levels of inattention, impulsivity, and hyperactivity occurring in multiple settings with an onset prior to age 12  years. So defined, ADHD has a prevalence rate among school-​aged children of approximately 5%, with more boys than girls affected. ADHD symptoms are persistent over time, and at least two-​thirds of children who meet diagnostic criteria will continue either to meet

diagnostic criteria or to suffer impairment due to symptoms into adolescence and adulthood (e.g., Kooij et  al., 2010). Beyond the core symptoms of the disorder, individuals with ADHD frequently experience difficulties in areas such as academic or job performance, interpersonal relations, oppositional and conduct problems, and internalizing problems (anxiety and mood disorders). Depending on the type of symptoms that an individual displays at the time of assessment, ADHD diagnoses are assigned as predominantly inattentive, predominantly hyperactive–​ impulsive, or combined presentations. Individuals with the predominantly inattentive presentation have problems such as difficulties in paying close attention to details or sustaining attention. The predominantly hyperactive–​impulsive presentation is characterized by behaviors such as motor overactivity or restlessness and also difficulties inhibiting behavior. The combined presentation includes both types of problems. The two symptom dimensions, inattention and hyperactivity–​ impulsivity, are highly related (e.g., Martel, von Eye, & Nigg, 2012; Toplak et  al., 2009), and most individuals with the diagnosis show elevations in both types of symptoms. The predominantly hyperactive–​impulsive presentation appears most common in younger children and may reflect a developmental stage of the disorder (e.g., Hart et  al., 1995). The overlap between the predominantly inattentive presentation and what has been called sluggish cognitive tempo or concentration deficit disorder remains somewhat unclear, although recent evidence suggests these may be distinct disorders (e.g., Becker et al., 2016). Although some research shows differential links between the type of ADHD symptom presentation and patterns of comorbidity or elements of treatment response (e.g., MTA Cooperative Group, 1999; Pliszka, 2015), other work suggests poor stability and specificity related to which type of symptom is most prevalent in an individual (e.g., Willcutt et al., 2012), and DSM-​5 has moved away from subtyping ADHD to the more descriptive focus on symptom presentation.

ASSESSMENT OF ADHD IN CHILDREN

The assessment of ADHD in childhood shares the conundrum assessors face with many childhood disorders, where multiple sources of information must be considered. As defined by DSM, ADHD is characterized by symptoms and impairment that occur cross-​situationally. In the practicalities of assessment, this means that information from both home and school contexts is considered essential


to the assessment process. Given the limitations of child self-​report (e.g., Loeber, Green, Lahey, & Stouthamer-​ Loeber, 1991), the assessment of childhood ADHD places a heavy reliance on parent and teacher reports of the child’s behavior. Although information from multiple informants and contexts is viewed as critical to the assessment of ADHD, there is abundant evidence that these sources frequently show only minimal convergence (e.g., Achenbach, McConaughy, & Howell, 1987). In addition, evidence is meager with respect to the best methods for combining this information (for exceptions, see Gadow, Drabick, et  al., 2004; Martel, Schimmack, Nikolas, & Nigg, 2015)  or specifying which combinations of information offer the best incremental validity in the assessment process (Johnston & Murray, 2003). The influence of rater (e.g., depressed mood or ADHD symptoms in the parent) or situational (e.g., classroom structure and home routines) characteristics must also be considered in evaluating the information provided by the multiple sources (e.g., De Los Reyes, 2013; Dirks, De Los Reyes, Briggs-​Gowan, Cella, Wakschlag, 2012). Thus, the puzzle of how to best combine multiple, often discrepant, pieces of information remains a challenge for assessment.

PURPOSES OF ADHD ASSESSMENT

Clinical assessments of childhood ADHD serve a variety of purposes, ranging from confirming an ADHD diagnosis to ruling out differential diagnoses such as anxiety disorders or learning problems to assessing the response of a child’s ADHD symptoms and functioning to a psychosocial treatment or change in medication regimen. Varied assessment approaches and tools may be needed for addressing each of these different purposes. In this chapter, we focus on assessments for the purpose of diagnosis, treatment planning, and treatment monitoring. In selecting and evaluating assessment tools for each of these purposes, we employed the rating system used throughout the chapters of this volume, as described in Chapter 1. At this point, we offer a caveat regarding our selection and evaluation of the assessment measures included in our tables. We searched broadly for measures and information supporting their use. However, we used practical criteria that limited this search. To meet the dual goals of accessibility and independent research validation of the measures, we prioritized measures that are currently commercially or publicly available but that also have evidence of reliability, validity, or both reported by independent investigators in published studies. Given the breadth of


the assessment literature, we acknowledge that we may have missed a small number of measures or information that would allow measures to meet the psychometric criteria required for inclusion in the tables. Within the text of the chapter, we occasionally describe other measures that do not meet the psychometric criteria required for table entry but that hold promise in the assessment of ADHD. For such measures, although we continue in an attempt to be comprehensive, the sheer number of measures with limited psychometric information requires a selective approach to inclusion.

ASSESSMENT FOR DIAGNOSIS

Although most evidence supports a dimensional view of ADHD symptoms (e.g., Marcus & Barry, 2011), assessment for diagnosis requires a categorical decision. There are no objective neurological, biological, or other diagnostic markers for ADHD, and the diagnostic decision rests on perceptions of the child, typically offered by parents and teachers. These reports of whether or not the child shows particular symptoms will be influenced by variables such as the context in which the child is observed (e.g., home vs. school), characteristics of the rater (e.g., expectations and mood), and clarity of the assessment questions. In making diagnostic decisions, the clinician must remain aware of the assumptions underlying not only diagnostic categories but also the use of informant perceptions and the multiple possible explanations for discrepancies across informants. Research remains sorely needed to guide and improve the diagnostic validity of such decisions, and clinicians are best advised to resist unwarranted adherence to the use of arbitrary cut-​offs or algorithms for combining information. According to DSM-​5 (APA, 2013), an ADHD diagnosis in childhood requires not only that at least six of the nine symptoms of either inattention or hyperactivity–​ impulsivity be present but also that these symptoms have existed for at least 6 months, at a level that is maladaptive and inconsistent with developmental level. The symptoms must have presented before the age of 12 years and lead to clinically significant impairment in social and/​ or academic functioning evidenced in two or more settings. In addition, the symptoms should not be better explained by other conditions such as oppositional defiant disorder or anxiety disorders. Thus, the assessment of ADHD requires not only measuring symptoms but also their onsets and their associated impairments in multiple settings and gathering information regarding co-​occurring


problems. Each of these requirements presents an assessment challenge. Defining symptoms as developmentally inappropriate requires that assessment tools permit comparisons to a same-​aged normative group. In addition, consideration should be given to the gender and ethnic composition of the normative sample. DSM-​5 criteria do not specify gender or ethnic differences in how the disorder is displayed and would suggest the use of norms combined across child gender and based on samples with representative numbers of ethnic-​minority children (as well as population characteristics). However, studies have revealed differences in the rates and severity of ADHD symptoms across genders and ethnic groups (e.g., Arnett, Pennington, Willcutt, Defries, & Olson, 2015; DuPaul et  al., 2016; Morgan, Staff, Hillemeier, Farkas, & Maczuga, 2013). Although such evidence would encourage the use of gender-​or ethnicity-​specific norms, such use carries a strong caveat given that the DSM diagnostic criteria are specified without regard to such child characteristics. Where possible, clinicians would be wise to consider comparisons to both specific and general norms; where specific norms do not exist, clinicians should at least acknowledge the possible role of culture, gender, or other characteristics in interpreting assessment information regarding the relative level of ADHD symptoms presented by the child. Assessing the diagnostic criteria related to the age of symptom onset and duration of symptoms also can be challenging. Few established measures tap these aspects of the diagnosis, and clinicians typically rely on more informal parent interviews to provide this information. This reliance on unstandardized retrospective recall carries an obvious psychometric liability (e.g., Angold, Erkanli, Costello, & Rutter, 1996; Russell, Miller, Ford, & Golding, 2014). Given that ADHD is defined by its presence in multiple situations, strategies are needed for combining assessment information from parent and teacher reports into a single diagnostic decision. The most common methods employ either an “or” rule, counting symptoms as present if they are reported by either the parent or the teacher, or alternately an “and” rule, counting symptoms as present only if endorsed by both parent and teacher. Evidence suggests that of these two options, the “or” rule for combining information may have the greatest validity, but either method of combination of informants generally outperforms the reliance on a single reporter (e.g., Shemmassian, & Lee, 2016). Other combinatorial methods, including averaging across raters to reduce the influence of any one informant, also show promise (e.g.,

Martel et  al., 2015). In addition, studies from our lab (Johnston, Weiss, Murray, & Miller, 2011, 2014) demonstrate that the convergence between parent and teacher reports of child ADHD symptoms can be improved by providing parents with instructional materials that clarify the nature of ADHD behaviors and how to rate them (e.g., distinguishing between behaviors that occur only when the child is tired versus those that are more pervasive and distinguishing between age-​appropriate and age-​ inappropriate behaviors). Still, we know that rater or source variance is substantial and often accounts for more variance in rating scale scores than the inattentive and hyperactive–​impulsive dimensions of behavior (e.g., Gadow, Drabick, et  al., 2004; Gomez, Burns, Walsh, & De Moura, 2003). Until further evidence is available, clinicians must rely on clinical judgment, grounded in a solid knowledge of the empirical literature, in combining information from multiple sources and methods to arrive at a final diagnostic decision in childhood ADHD. Finally, in assessments intended to offer a diagnosis of ADHD, the clinician must have a working knowledge of other childhood disorders in order to make informed differential and comorbid diagnoses. The process of teasing apart whether inattentive or impulsive behaviors are best accounted for by ADHD or by problems such as fetal alcohol effects, autism, learning problems, or anxiety remains a challenge. Given the space limitations of this chapter, we do not cover measures useful for assessing these other childhood disorders and instead refer the reader to other child assessment resources (Frick, Barry, & Kamphaus, 2010; Mash & Barkley, 2007) and the relevant chapters in this volume. However, we note that the limitations of our current knowledge and diagnostic systems often contribute to the difficulties of discriminating among disorders, and the clinician may need to assign an ADHD diagnosis as a “working hypothesis” rather than as a confirmed decision. To the extent that the core nature of ADHD remains under debate, best practices for discriminating this condition from other related conditions will remain somewhat elusive. A related problem of discriminating among disorders arises in the use of assessment measures, especially older measures, in which conceptualizations of ADHD are confounded with symptoms of other disorders. For example, the hyperactivity scales of earlier versions of the Conners Parent and Teacher Rating Scales (Goyette, Conners, & Ulrich, 1978)  included items more characteristic of oppositional problems. Similarly, the hyperactivity subscale of the 1982 version of the Personality Inventory for Children-​Revised (Lachar, 1982) assesses behaviors such


as cheating and peer relations, which are not core ADHD symptoms. Clinicians are reminded to not judge the appropriateness of measures on the basis of titles or scale names but, rather, to give careful consideration to actual item content and whether this content is congruent with current conceptualizations of ADHD.
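Before turning to specific measures, the "or" and "and" rules for combining parent and teacher reports, described earlier in this section, can be made concrete with a short sketch. The endorsements below are hypothetical, and the snippet is purely illustrative; it is not the scoring algorithm of any particular instrument.

```python
# Hypothetical parent and teacher endorsements of the nine DSM-5 inattention
# symptoms (True = symptom endorsed). Illustrative values only.
parent  = [True, True, False, True, True, False, True, True, False]
teacher = [True, False, False, True, False, False, True, True, True]

count_or  = sum(p or t for p, t in zip(parent, teacher))   # present if either informant endorses
count_and = sum(p and t for p, t in zip(parent, teacher))  # present only if both endorse

print(count_or, count_and)  # 7 4
# Against the six-symptom DSM-5 threshold, the "or" rule crosses it here and
# the "and" rule does not, which is why the choice of combination rule matters.
```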

Overview of Measures for Diagnosis

Narrowband ADHD Checklists

Among measures designed to assess ADHD symptoms, we include only those that map onto the symptoms as described in DSM. A number of rating scales have been produced that are tied, more or less directly, to DSM symptoms of ADHD, either those contained in DSM-IV or the essentially unchanged symptom list in DSM-5. One of the most widely used of these is the ADHD Rating Scale-5 (DuPaul, Power, Anastopoulos, & Reid, 2016; DuPaul, Reid, et al., 2016). This recently updated brief rating scale, which can be completed by parents or teachers, lists the 18 DSM-5 symptoms of ADHD, along with a six-item scale assessing the impairment associated with these symptoms. The ADHD Rating Scale-5 provides a total score and has inattentive and hyperactivity–impulsivity subscales, supported by factor analysis, that are useful in determining ADHD presentation type. The impairment scale is an addition to this most recent version of the ADHD Rating Scale, and it is advantageous given that impairment due to symptoms is a diagnostic requirement for ADHD. For both parent and teacher ratings, age- and gender-specific norms are available for large representative samples. Limited information on norms combined across genders is available. The manual outlines evidence of small, but potentially meaningful, differences in scores across ethnic groups, and these demand attention when using the measure with minority group children. The reliability and validity of scores on the measure, either in the current DSM-5 or in earlier DSM-IV versions, are generally good (Table 4.1). The ADHD Rating Scale-5 is the only measure in Table 4.1 with evidence of test–retest reliability over a period of months, in contrast to the shorter test–retest intervals for other measures. Scores on the ADHD Rating Scale-5 correlate with other ADHD measures and discriminate children with ADHD from nonproblem controls and from clinical controls. Sensitivity and specificity information is available, with some evidence that teacher ratings on the ADHD Rating Scale provide greater specificity and parent ratings provide greater sensitivity in making ADHD diagnoses (e.g., DuPaul, Power, et al., 2016).

In addition to the ADHD Rating Scale-5, a number of very similar questionnaires exist, all with items listing the DSM symptoms of ADHD, and in some cases associated problems such as sluggish cognitive tempo (e.g., the Disruptive Behavior Scale [Gomez, 2012] and the Child and Adolescent Disruptive Behavior Inventory [Lee, Burns, Snell, & McBurnett, 2014]). These measures range in the extent of psychometric and normative information available to support their use. Other measures

Table 4.1  Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability(a) | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
Narrowband ADHD Rating Scales
  ADHD Rating Scale-5, Parent | E | E | NA | G | A | G | E | A | ✓
  ADHD Rating Scale-5, Teacher | E | E | NA | G | A | G | E | A | ✓
  Conners 3 DSM-IV-TR Symptom Scales, Parent | E | E | NA | A | G | G | E | A | ✓
  Conners 3 DSM-IV-TR Symptom Scales, Teacher | E | E | NA | A | G | G | E | A | ✓
  ADDES-4, Parent | E | E | NA | A | G | A | E | A |
  ADDES-4, Teacher | E | E | NA | A | G | A | E | A |
Structured Interviews
  DISC-IV | NR | A | NA | A | G | G | G | A |

(a) This column reflects inter-rater agreement between clinical judges, and this information is not available for most measures where, instead, parent and teacher agreement is more commonly assessed.

Note: ADDES-4 = Attention-Deficit Disorder Evaluation Scales; DISC-IV = Diagnostic Interview Schedule for Children-IV; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


that are described for assessing ADHD offer content that is not entirely consistent with DSM criteria and are not recommended for diagnostic purposes. For example, the Brown Attention-​Deficit Disorder Scales for Children and Adolescents (Brown, 2001) is a parent and teacher report measure of the deficits in executive functioning that are thought to be associated with ADHD. The DSM-​ IV-​ TR Inattentive and Hyperactive/​ Impulsive Symptom Scales of the Conners 3rd Edition (Conners 3; Conners, 2008) are derived from the longer parent (110 items) and teacher (115 items) forms and map onto the DSM symptoms of ADHD. Although not all of these items are worded exactly as the DSM symptoms, they appear synonymous. The third edition of this measure also includes validity scales to assess the accuracy and integrity of responses, as well as brief yes/​no items assessing impairment due to symptoms. The normative sample is large and representative, and information regarding the scores of a large clinical group of children with ADHD is available. Normative percentiles for the symptom scales are available for the genders separately and combined. The scales have good psychometric properties (Conners, 2008; see Table 4.1) and are well validated. The long history of the Conners Rating Scales in the study of ADHD provides an extensive research background for this measure. The Attention-​ Deficit Disorder Evaluation Scales (ADDES-​4; McCarney & Arthaud, 2013a, 2013b) are updated versions of parent (46 items) and teacher (60 items) forms that yield inattention and hyperactive–​ impulsive subscale scores reflecting DSM symptoms of ADHD. Items were developed with input from diagnostic and educational experts. The normative samples are quite large and generally representative. Information from a sample of children with ADHD (although method of diagnosis is not clearly specified) also is available for the parent and teacher versions. Separate age and gender scores are calculated. The reliability and validity information for the measure as reported in the manual is generally good (McCarney & Arthaud, 2013a, 2013b; see Table 4.1); however, a limited number of independent validation studies are available, particularly for the most recent fourth edition of the measure. We note that several briefer measures of ADHD symptoms also exist and are used primarily for screening purposes (e.g., Conners 3 ADHD Index). One prominent example is the Strengths and Difficulties Questionnaire (SDQ) Hyperactivity/​ Inattention Subscale (Goodman, 1997). This five-​item scale has reasonable psychometric properties and normative data and appears useful

in detecting possible cases of ADHD (Algorta, Dodd, Stringaris, & Youngstrom, 2016; Goodman, 2001). The SDQ is available free of charge, is easy to score, and includes subscales reflecting other problems—clear advantages in a screening measure.

Structured Interviews

We included one structured interview, the Diagnostic Interview Schedule for Children-IV (DISC-IV; Shaffer, Fisher, Lucas, Dulcan, & Schwab-Stone, 2000), in Table 4.1. It is recognized that structured interviews often have limited psychometric information. In particular, the categorical model underlying these measures means that normative information is considered unnecessary. However, given the heavy reliance on structured interviews in many research and medical settings, we opted to include at least one such measure. We caution the clinician to consider carefully the costs of such interviews (e.g., heavy investment of clinician and family time) in contrast to the relatively low incremental validity offered by these measures compared to parent and teacher ratings of ADHD symptoms (e.g., Pelham, Fabiano, & Massetti, 2005; Vaughn & Hoza, 2013). The DISC-IV (Shaffer et al., 2000) maps directly onto DSM-IV diagnostic criteria for a range of child disorders, including ADHD, and it includes both symptom and impairment questions. Given that DSM-5 criteria for ADHD are essentially unchanged from DSM-IV criteria, the interview remains appropriate for assessment. The DISC-IV is available in multiple languages and in parent and youth versions. The child version has limited psychometric properties, although some studies support the use of combined responses across parents and children (Shaffer et al., 2000). The highly structured nature of the DISC-IV diminishes the importance of estimating inter-rater reliability or inter-judge agreement for this measure. Psychometric information for the fourth version of the DISC is somewhat limited; however, combined with information on earlier versions, support is generally adequate for the reliability of the measure for making ADHD diagnoses (Shaffer et al., 2000). Similarly, evidence supports the convergent validity of ADHD diagnoses made using the DISC-IV (e.g., de Nijs et al., 2004; Derks, Hudziak, Dolan, Ferdinand, & Boomsma, 2006; McGrath, Handwerk, Armstrong, Lucas, & Friman, 2004; Sciberras et al., 2013). It is noteworthy that there is heavy reliance on this measure in many large research studies. Other structured and semi-structured interviews used in the assessment of ADHD include the Kiddie Schedule


for Affective Disorders and Schizophrenia (K-SADS; Kaufman et al., 1997) and the Child and Adolescent Psychiatric Assessment (CAPA; Angold & Costello, 2000). As with the DISC-IV, these interviews typically have not been subjected to extensive psychometric study.

Measures Not Useful in the Assessment of ADHD Diagnoses

The current diagnostic criteria for ADHD remain relatively subjective, and the drive to develop and access more objective indicators of the disorder has been strong. A number of cognitive performance measures have been proposed as useful in this regard, many of which are versions of continuous performance tests. Some of these measures have come considerable distances in providing normative information, evidence of stability over time, and sensitivity to the effects of medication treatments (e.g., the Conners CPT II [Conners & MHS Staff, 2000] and the Objective QbTest [Ramtvedt, Røinås, Aabech, & Sundet, 2013]), yet they remain limited in their clinical utility (Hall et al., 2016). Although these measures offer the promise of objective measurement of ADHD symptoms (in contrast to the subjectivity inherent in parent and teacher reports), their relations to other measures of ADHD symptoms often are modest, and there is limited evidence to support their predictive or discriminant validity. In particular, scores on these measures produce high rates of false-negative diagnoses such that normal range scores are often found in children who meet diagnostic criteria for ADHD according to other measures. Again, none of these measures are, as yet, sufficiently developed to meet the designated psychometric criteria for this volume or to be useful in making diagnostic decisions for individual children (Duff & Sulla, 2015). Similarly, neither patterns of subscale scores on intelligence tests nor biological markers such as blood tests or brain imaging have been of demonstrated use in the clinical assessment of ADHD (e.g., Kasper, Alderson, & Hudec, 2012; Koocher, McMann, Stout, & Norcross, 2015).

Overall Evaluation

Based on ease of use and predictive power, combining information from teacher and parent versions of brief DSM-5-based rating scales appears to offer the best available option in the diagnosis of ADHD. Although child self-report versions exist for several of the measures reviewed, the validity of child report is typically lower than that of parent or teacher reports, and for this reason


we have not included these versions. Structured and semi-​ structured diagnostic interviews are a mainstay in research on ADHD; however, evidence suggests that they may not add incrementally to the diagnostic information gathered more efficiently with rating scales (e.g., Ostrander, Weinfurt, Yarnold, & August, 1998; Pelham et al., 2005; Vaughn & Hoza, 2013; Wolraich et al., 2003). We do note, however, consistent with recommended pediatric and psychiatric assessment guidelines (American Academy of Child and Adolescent Psychiatry, 2007; American Academy of Pediatrics, 2011), that there is a definite need for additional information, perhaps gathered through parent interviews or child self-​report, to supplement rating scales in order to fully assess for possible comorbid or differential diagnoses, age of onset and history of symptoms, and other important clinical information relevant to the diagnosis of ADHD.
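One way to see why combining brief parent and teacher rating scales works well is to treat each informant's result as a diagnostic test and update the probability of ADHD sequentially, in the spirit of the evidence-based assessment framework described in Chapter 3 of this volume. The sensitivity, specificity, and base-rate values below are hypothetical placeholders, not published figures for any instrument, and the sketch treats the two informants as independent, which real parent and teacher ratings are not; it is a heuristic illustration rather than a formal diagnostic procedure.

```python
def update_probability(prob: float, sensitivity: float, specificity: float,
                       positive: bool) -> float:
    """Update the probability of a diagnosis with one test result,
    using that result's likelihood ratio."""
    lr = sensitivity / (1.0 - specificity) if positive else (1.0 - sensitivity) / specificity
    posterior_odds = (prob / (1.0 - prob)) * lr
    return posterior_odds / (1.0 + posterior_odds)

p = 0.10  # hypothetical base rate of ADHD in a particular clinic
p = update_probability(p, sensitivity=0.85, specificity=0.75, positive=True)  # parent scale elevated
print(round(p, 2))  # 0.27
p = update_probability(p, sensitivity=0.70, specificity=0.90, positive=True)  # teacher scale elevated
print(round(p, 2))  # 0.73
```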

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Three treatments have received empirical support for childhood ADHD (Evans, Owens, & Bunford, 2014): pharmacotherapy, behavioral treatment, and their combination. In assessments for treatment planning, the clinician is seeking information to assist with (a) developing a conceptualization of the factors contributing to the child’s difficulties and prioritizing treatment targets or goals (e.g., Which ADHD symptoms are most impairing or most likely to respond quickly to treatment?), (b) matching difficulties to recommended treatments (e.g., Do this child’s primary difficulties match the ADHD problems that have been targeted with behavioral or medication treatments?), or (c)  identifying environmental elements that may be used in treatment (e.g., Does the teacher offer rewards for academic work completed?). Information regarding factors that may interfere with treatment success (e.g., Does this child have a physical condition that may limit the utility of medication?) or the child’s interests and strengths (e.g., sports interests or skills) also will be useful. In this section, we review measures that provide information relevant to conceptualizing the nature of the problems experienced by children with ADHD and the planning of treatments specifically targeting ADHD symptoms, symptom-​ related impairment, or possible comorbid conditions. However, we caution the reader that this focus is narrow and that much case conceptualization and treatment planning for ADHD involves broader


consideration of co-occurring difficulties in child, family, academic, or peer functioning. Pelham and colleagues (2005), in their excellent review of evidence-based assessments for ADHD, offer a cogent and convincing argument that adaptations and impairments in functioning, rather than ADHD symptoms per se, should form the basis for treatment planning in ADHD. Thus, adequate treatment planning for ADHD necessitates gathering and integrating information far beyond symptom or diagnostic status. Information from a variety of sources, regarding a wide range of child and family functioning, is necessary to inform treatments that match the needs and resources of each child and family. For example, the clinician must consider the child's family, social and cultural context, relevant medical and educational history and concerns, the child's and family's goals for treatment, and available treatment options. Although difficulties in domains such as academics and social relationships are often closely linked to ADHD (and may even be the result of ADHD symptoms), assessment methodologies in these areas are only briefly considered here. The parent–child relationship or parenting style, the parent's psychological or marital functioning, and the child's peer relationships or self-esteem are among the areas that might be considered in a more comprehensive definition of treatment planning for ADHD. We refer the reader to chapters within this volume and to other excellent child assessment resources (Frick et al., 2010; Mash & Barkley, 2007) for detailed information regarding assessment of the problems and conditions that are frequently associated with ADHD and that often figure prominently in conceptualizing the problems and planning treatment for children with this disorder. We cannot state strongly enough how important these other domains of assessment are in planning treatments for children with ADHD that will be maximally sensitive to the child's and the family's needs and concerns and that will also hold the greatest potential for altering not only the child's current functioning but also long-term outcomes.

Overview of Measures for Case Conceptualization and Treatment Planning

Broadband Checklists

Parent and teacher reports on broadband measures of child psychopathology provide useful information in planning treatments for children with ADHD (see Table 4.2). These measures provide insight into a range of difficulties,

in addition to ADHD, and may direct the clinician to more in-​depth assessments of coexisting disorders or disorders that may account for ADHD-​like symptoms. Scores on these broadband measures also allow the clinician to incorporate knowledge of potential comorbidities into treatment planning as appropriate. For example, some evidence suggests that behavioral treatments for ADHD may have better outcomes among children with comorbid anxiety disorders (MTA Cooperative Group, 1999), and behavioral treatments are empirically supported for addressing the oppositional or conduct disorder problems or both that are frequently comorbid with ADHD (e.g., Powell et al., 2014). We include only broadband rating scales with subscales specifically targeting ADHD symptoms or behaviors. These measures vary in the extent to which their subscales map directly onto DSM ADHD criteria or symptom dimensions. For example, both the Attention Problems subscale of the Child Behavior Checklist and the ADHD Index of the Conners 3 include a mixture of inattention and impulsivity/​ hyperactivity items and are not comprehensive in covering DSM symptoms. Thus, these subscale measures typically cannot be substituted for the narrowband checklists described previously. However, the subscales relevant to attention or hyperactivity–​ impulsivity found on many broadband checklists will offer supplemental information that may be useful in arriving at diagnostic decisions, particularly in complex cases. Because the role of these broadband measures in treatment planning is to provide a screening-​ level assessment of a range of behavior problems, we require satisfactory psychometric properties at the level of subscale scores (as well as total scores). The parent (Children Behavior Checklist [CBCL]) and teacher (Teacher Report Form [TRF]) versions from the Achenbach System of Empirically Based Assessment (ASEBA; Achenbach & Rescorla, 2001) are well-​known and widely used measures, available in several languages, that have lengthy clinical and research traditions. A Youth Self-​ Report form is available for children aged 11 to 18 years, but it is not described here. The parent and teacher checklists are used for children 6 to 18 years of age (a version for younger children also is available), and norms are based on large representative normative samples, as well as samples of clinic-​referred children (although norms specific to different clinical diagnoses are not generally available). There are 118 items, requiring 15  to 20 minutes to complete, as well as subscales assessing competence (although the psychometric properties of the competence subscales are generally not as strong as the behavior problem scales).


Table 4.2  Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability(a) | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
Broadband Rating Scales
  ASEBA, Parent (CBCL) | E | G | NA | G | G | G | E | A | ✓
  ASEBA, Teacher (TRF) | E | E | NA | G | G | G | E | A | ✓
  BASC-3, Parent | E | G | NA | A | G | G | E | A | ✓
  BASC-3, Teacher | E | E | NA | A | G | G | E | A | ✓
  Conners 3, Parent | E | E | NA | G | G | G | E | A | ✓
  Conners 3, Teacher | E | E | NA | G | G | G | E | A | ✓
  Vanderbilt, Parent | G | E | NA | A | A | G | E | A |
  Vanderbilt, Teacher | G | E | NA | NR | A | G | E | A |
Measures of Impairment
  VABS-II, Parent | E | E | NA | A | G | G | G | A | ✓
  VABS-II, Teacher | E | E | NA | NR | G | G | G | A | ✓
  CAFAS | NR | A | E | NR | A | G | G | A |
  IRS, Parent | NR | NR | NA | G | A | G | A | A |
  IRS, Teacher | G | NR | NA | G | A | G | G | A |
  COSS, Parent | E | E | NA | A | G | A | G | A | ✓
  COSS, Teacher | E | E | NA | A | G | A | G | A | ✓

(a) This column reflects inter-rater agreement between clinical judges, and this information is not available for most measures where, instead, parent and teacher agreement is more commonly assessed.

Note: ASEBA = Achenbach System of Empirically Based Assessment; CBCL = Child Behavior Checklist; TRF = Teacher Report Form; BASC-3 = Behavior Assessment System for Children-3; Vanderbilt = Vanderbilt ADHD Diagnostic Parent and Teacher Rating Scales; VABS-II = Vineland Adaptive Behavior Scales, 2nd Edition; CAFAS = Child and Adolescent Functional Assessment Scale; IRS = Impairment Rating Scale; COSS = Children's Organizational Skills Scale; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

The ASEBA provides empirically derived subscales that are similar across the multiple informant versions of the measure and assess a variety of emotional and behavior problems, such as attention, rule breaking, and aggression. The measures also yield overall Internalizing and Externalizing scores, as well as rationally derived subscales that map onto DSM diagnostic categories. The similarity in item content across informants allows for the calculation of inter-​rater agreements, and information is available to compare levels of agreement to those in the normative sample. Considerable validity evidence is presented in the ASEBA manual, and numerous reviews provide additional evidence of the convergent, discriminant, and content validity of the measures (e.g., Frick et al., 2010; Gladman & Lancaster, 2003; McConaughy, 2001; Pelham et  al., 2005). As indicated in Table 4.2, both parent and teacher versions have solid psychometric properties, and evidence supports the incremental validity of gathering information from both sources (e.g., Ang et al., 2012; Hanssen-​Bauer,

Langsrud, Kvernmo, & Heyerdahl, 2010). However, as with many of the measures reviewed in this chapter, few studies have examined the incremental validity or clinical utility of ASEBA scores. The Behavior Assessment System for Children, 3rd Edition (BASC-​ 3; Reynolds & Kamphaus, 2015)  is a multidimensional measure of adaptive and problem behaviors that has teacher and parent versions for children aged 6 to 11  years (as well as preschool and adolescent versions not considered here). The measure takes approximately 10 to 20 minutes to complete and has multiple language versions. The BASC-​3 provides rationally derived clinical subscales including Hyperactivity and Attention Problems, as well as a number of other problem dimensions and composite scores for Adaptive Behavior, Externalizing and Internalizing Problems, and a total Behavioral Symptoms Index. The teacher version also has scales related to School Problems. One advantage of the BASC-​3 is that it offers validity checks to assist the clinician


in detecting careless or untruthful responding, misunderstanding, or other threats to validity. BASC-​3 norms are based on a large representative sample and are available both in aggregate form and differentiated according to the age, gender, and clinical status of the child. As noted previously, not only does this measure evaluate behavioral and emotional problems but also it identifies the child’s positive attributes, an aspect with obvious use in planning treatment. Current psychometric information is available in the measure’s manual (Reynolds & Kamphaus, 2015). Given the relative recency of this measure, in some cases in Table 4.2 we have relied on the psychometric information available for earlier parent and teacher versions, specifically the BASC-​ 2 (e.g., Kamphaus, Reynolds, Hatcher, & Kim, 2004; Pelham et al., 2005; Sandoval & Echandia, 1994). The Conners 3 (Conners, 2008)  is the most recent revision to a set of scales that have been closely allied with research and clinical work in ADHD for many years. The Conners 3 has multiple language versions and there are parent and teacher versions (as well as a youth self-​report not described here), each with both short (5 to 10 minutes) and long (15 to 20 minutes) forms available. The short forms focus on a range of behavior problems, whereas the longer forms also include subscales assessing DSM symptom criteria for ADHD and oppositional defiant and conduct disorders, as well as screening and impairment scales. Norms are based on a large representative sample of 6-​to 18-​year-​old children and are also available for a clinical sample. Norms are available for the genders combined, with some scales also having gender-​specific information. The Conners 3 manual and published reviews of the measure outline the strong psychometric properties of both the current and earlier versions of the measure (e.g., Conners, 2008; Kao & Thomas, 2010; Pelham et al., 2005). Finally, the Vanderbilt ADHD Diagnostic Parent and Teacher Rating Scales (Bard, Wolraich, Neas, Doffing, & Beck, 2013; Wolraich et al., 1998, 2003; Wolraich, Bard, Neas, Doffing, & Beck, 2013) are another DSM-​based set of symptom rating scales that include ADHD symptoms, oppositional and conduct problems, as well as anxiety and depression items. Norms are based on a relatively large sample, but of limited representativeness. Preliminary psychometric evidence is available, although further validation is needed. Other broadband questionnaires have been developed that may prove useful in treatment planning for ADHD, although these measures require further research. For example, the Child Symptom Inventory-​4 (CSI-​4; Gadow

& Sprafkin, 2002) assesses a variety of DSM-IV emotional and behavioral disorders in children between ages 5 and 12 years. Although a DSM-5 version of this scale is listed on the authors' webpage, this version has not been fully evaluated.

Measures of Impairment

As noted previously, there is a growing and appropriate focus on adaptive functioning as central to understanding and treating ADHD, with efforts underway to develop a core set of ability and disability concepts relevant to ADHD within the International Classification of Functioning, Disability and Health (Schipper et al., 2015). Global and multidimensional measures of impairment are valuable in a comprehensive assessment of the functioning of children with ADHD. In particular, these measures are likely to be useful in decisions regarding the need for treatment and in identifying appropriate treatment foci. We concur with arguments made by others (e.g., Pelham et al., 2005) that impairments in adaptive behavior must figure prominently in treatment planning and monitoring for children with ADHD, more so than absolute levels of ADHD symptoms. As noted in our description of measures useful for diagnosis of ADHD, several of these measures now include items tapping impairment, although these are typically brief ratings. Thus, currently, the clinician must choose between brief but promising measures specific to ADHD (e.g., the ADHD Rating Scale-5 impairment items) and well-established measures of adaptive behavior that are broad and may not be particularly appropriate to ADHD-related difficulties (e.g., the Vineland Adaptive Behavior Scales; Sparrow, Cicchetti, & Balla, 2005).

The Vineland Adaptive Behavior Scales, Second Edition (VABS-II; Sparrow et al., 2005) has been a leading measure of the personal and social skills needed for everyday living. A 2016 revision of the measure (VABS-3; Sparrow, Cicchetti, & Saulnier, 2016) includes updated items, forms, and norms. However, the revision is not yet widely available or used extensively in research; therefore, we focus our comments on the VABS-II. Although typically used to identify individuals with developmental problems, some evidence supports the use of the VABS in groups of children with ADHD (e.g., Craig et al., 2015; Ware et al., 2014). Consisting of a Survey Interview Form, Parent/Caregiver Rating Form, Expanded Interview Form, and a Teacher Rating Form, the VABS-II requires 20 to 60 minutes to complete. It is organized around four behavior domains (communication, daily living skills,


socialization, and motor skills) and has demonstrated strong psychometric properties. Norms for the parent and teacher rating scale forms are based on large representative groups, including a variety of clinical groups, and the reliability and validity of scores on the measure range from adequate to excellent, as reported in the manual (Sparrow et al., 2005; see Table 4.2). The Child and Adolescent Functional Assessment Scale (CAFAS; Hodges & Wong, 1996)  is an additional multidimensional measure of impairment that may serve as an aid in case conceptualization and treatment planning for children with ADHD. The CAFAS uses interviewer ratings to assess a child’s (ages 7 to 17 years) degree of impairment due to emotional, behavioral, or psychiatric problems. Consisting of 315 items and measuring functioning in areas such as school, home, and community and behaviors such as emotional regulation, self-​harm, and substance use, the CAFAS requires only 10 minutes to complete. Although normative data are not available, reliability and validity information for this measure are generally satisfactory, as indicated in Table 4.2. The Impairment Rating Scale (IRS; Fabiano et  al., 2006)  was developed specifically to assess the areas of functioning that are frequently problematic for children with ADHD. Parent and teacher versions are available in the public domain, with questions pertaining to areas such as academic progress, self-​esteem, peer relations, problem behavior, impact on the family, and overall functioning. Preliminary norms are available only for the teacher version. Test–​retest reliability has been established over periods up to 1 year. Within samples of ADHD and control children, convergent and discriminant validity have been demonstrated, and evidence suggests that parent and teacher IRS ratings accounted for unique variance in predicting child outcomes beyond ADHD symptoms (Fabiano et al., 2006; see Table 4.2). A recent, useful addition to measures assessing difficulties related to ADHD, particularly in the academic domain, is the Children’s Organizational Skills Scales (COSS; Abikoff & Gallagher, 2009). With parent, teacher, and child (not reported here) versions, this measure taps children’s difficulties with task planning, organized actions, and memory and materials management, and also includes questions specifically measuring the impairment caused by these organizational difficulties. The measure has good psychometric properties, and norm information is available based on a large, representative sample. Thus, it will offer useful information, particularly for assessing and planning for school interventions.


Observational Measures

Informal observations of children in clinical settings have little clinical utility in detecting ADHD or planning for its treatment (e.g., Edwards et al., 2005). However, more structured observational measures do have potential utility in treatment planning. Such measures can clearly identify a child's ADHD symptoms and the impairments that ensue from these symptoms, both of which should be targeted in treatment plans. Unfortunately, despite variability in the psychometric information available, none of the observational measures we located demonstrated adequate levels on the criteria used for inclusion in the tables. For example, these observational measures seldom have norms or report the temporal stability of scores. These limitations preclude the inclusion of these measures in the tables; however, we do offer some suggestions regarding available observational measures designed for classroom use or for assessing parent–child interactions.

The Direct Observation Form (DOF) is an observational component of the ASEBA (Achenbach & Rescorla, 2001) and uses a 10-minute observation of the child's behavior in a classroom context, recommended to be repeated on three to six occasions. Although the measure includes a narrative and ratings of the child's behavior, psychometric information is reported primarily for the time sampling of 96 behaviors (the behaviors overlap with items on the CBCL and TRF). For normative comparisons, the DOF recommends that two nonproblem children be observed simultaneously with the target child in order to provide individualized norms. Although the manual also presents norms based on moderate-size samples of clinic-referred and nonproblem children, the value of these norms is likely to be limited by the variability across classroom contexts (e.g., variables such as classroom rules, physical structure, and the ratio of problem to nonproblem children will undoubtedly influence the rates of problem behaviors displayed by children). The manual reports moderate to high levels of inter-rater reliability using the DOF, and DOF scores correlate in expected ways with other measures and with clinical status (Achenbach & Rescorla, 2001). In combination with an ASEBA form used to facilitate observations of child behavior in psychological test situations (the Test Observation Form), some evidence points to the ability of these observations to assess unique variance in child behavior beyond parent or teacher ratings (McConaughy et al., 2010).

Another potential measure useful in tapping the classroom difficulties of children with ADHD is the Behavioral Observation of Students in Schools (BOSS;


Shapiro, 2011). This measure, with many computerized and interactive features, taps task engagement and off-task behaviors (both inattentive and hyperactive) during classroom activities. Evidence of inter-rater reliability is provided (although several hours of training are required), and the observations have been shown to discriminate between children with ADHD and typically developing classmates (DuPaul et al., 2004).

To assess aspects of ADHD that are problematic within parent–child interactions, a number of observational systems developed in research contexts are available, although most are too complex to provide reliable estimates in clinical practice. Perhaps one exception to this is the Behavioral Coding System (BCS; McMahon & Forehand, 2003). Using the BCS, the clinician codes parent and child behaviors in two 5-minute interactions: a free-play situation and a situation in which the parent directs the interaction. The presence of six parent behaviors (rewards, commands, time out, etc.) and three child behaviors (compliance, noncompliance, etc.) is recorded every 30 seconds, and the sequence of behaviors specifying parental antecedents, child responses, and parental consequences can be analyzed. Such information is readily translated into treatment goals, particularly for behavioral treatments. Interobserver agreement and test–retest reliability of the BCS are adequate, and the system is sensitive to differences in compliance between clinic-referred and nonreferred children (evidence reviewed in McMahon & Forehand, 2003).

Finally, we highlight that observations of individualized behavioral targets, by parents or teachers, are likely to be useful in conceptualizing and planning for the treatment of each child's difficulties. For example, with clear and simple behavioral definitions, frequency counts of problematic behaviors that are relevant for each particular child (e.g., times out of seat in the classroom and failure to complete assigned household chores) can be made and may serve as an integral part of treatment planning.

Overall Evaluation

Broadband parent and teacher checklists provide essential information regarding behavior problems that may accompany or result from ADHD and that may inform treatment planning. These measures are typically well developed and possess solid psychometric properties, and the clinician can feel confident in the information they provide. However, even more relevant information for treatment planning is likely to be derived from assessment of the child's functioning and impairments in daily

home and classroom situations. Emerging measures of impairment, particularly those designed to be sensitive to the aspects of functioning most closely linked to ADHD, have clear potential for identifying appropriate treatment targets and assisting the clinician in prioritizing these targets. In a similar fashion, the context-specific and objective nature of observational assessments of the child's behavior, both in school and at home, has great potential for treatment planning. These measures may also assess environmental antecedents and consequences of the child's behaviors, yielding information of immediate relevance to the planning of behavioral interventions. An important future direction in the development of any of these assessment measures will be to work to establish their incremental validity and clinical utility within the context of multiple sources and types of assessment information.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT EVALUATION

In conducting assessments to monitor and evaluate treatment implementation or progress in children with ADHD, there is a need for measures that are reliable over time, sensitive to relatively small changes in behavior or symptoms, and practical to use on a frequent basis (e.g., brief and inexpensive). In monitoring medication treatments, measurement of side effects also is recommended (e.g., Barkley Side Effects Rating Scale; Barkley & Murphy, 2006), although standardized measures for this purpose are not available. One prominent issue in considering assessment measures to be used in treatment monitoring is the stability of scores over time and the vulnerability of measures to the effects of repeated assessments (Solanto & Alvir, 2009). For example, does a decrease in symptom severity on a measure over time reflect the benefits of treatment, or could the change be predicted solely on the basis of regression of scores to the mean? If treatment effects are to be assessed over a longer period, the availability of age norms also will be important in order to place score changes within the appropriate context of developmental changes in the behavior. As with disagreements in diagnostic information gathered from multiple sources, discrepancies in reports of treatment-​related changes in child behaviors are expected across informants and settings. Again, clinicians must struggle with how to combine or prioritize the multiple bits of information in reaching an overall conclusion regarding the progress of treatment.
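One common way to frame this question, although not one formally required by the measures reviewed here, is the reliable change index (RCI; Jacobson & Truax, 1991), which scales an observed pre–post difference against the measurement error of the instrument. A minimal sketch follows, using hypothetical values rather than the norms or reliability of any specific scale.

# Illustrative only: the reliable change index (RCI; Jacobson & Truax, 1991),
# one common way to ask whether a pre-post change on a rating scale exceeds
# what measurement error alone would produce. The numbers below are
# hypothetical, not norms for any instrument discussed in this chapter.
import math

def reliable_change_index(pre, post, sd, reliability):
    """RCI = (post - pre) / SEdiff, where SEdiff = SD * sqrt(2) * sqrt(1 - r)."""
    se_measurement = sd * math.sqrt(1.0 - reliability)
    se_diff = math.sqrt(2.0) * se_measurement
    return (post - pre) / se_diff

# Hypothetical T-scores on a symptom scale before and after treatment.
rci = reliable_change_index(pre=72.0, post=61.0, sd=10.0, reliability=0.85)
print(f"RCI = {rci:.2f}")  # |RCI| > 1.96 is conventionally read as reliable change

With the hypothetical values above, the 11-point drop exceeds the conventional |RCI| > 1.96 cutoff; the same change on a less reliable scale would not, which is one reason the stability of scores matters for monitoring.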


In this section, we consider measures that have demonstrated not only basic psychometric properties but also sensitivity to change due to medication, psychosocial interventions, or both. Although several measures meet these criteria, almost all of the evidence of this sensitivity is derived from studies aggregating across groups of children, and information regarding the performance of the measures in individual cases awaits investigation. Furthermore, it is common in research studies to amalgamate multiple measures into composite scores to create more reliable scores for use in treatment comparisons (e.g., MTA Cooperative Group, 2004). Although advantageous from a research perspective, this approach limits the ability of such studies to inform us regarding the sensitivity to treatment of any of the measures used in isolation or with individual children.

Overview of Measures for Treatment Monitoring and Treatment Evaluation

Narrowband ADHD Checklists

No evidence of treatment sensitivity has yet been published based on either the ADHD Rating Scale-5 or the Conners 3 DSM-IV-TR Symptom Scales. However, for the ADHD Rating Scale-5, evidence from the ADHD Rating Scale-IV version (relatively unchanged) indicates sensitivity to medication treatment, at a group level, in numerous studies (e.g., Huss et al., 2016; Kollins et al., 2011). Other symptom-level measures, although lacking in some psychometric characteristics, may bear consideration for treatment monitoring depending on the specific clinical needs of each case. For example, the IOWA (Loney & Milich, 1982) is a 10-item measure derived from an older version of the Conners' Teacher Rating Scale that assesses inattentive–overactive and aggressive symptoms. Considerable evidence supports the construct validity, internal consistency, and stability of scores on the measure (Johnston & Pelham, 1986; Loney & Milich, 1982; Nolan & Gadow, 1994; Waschbusch & Willoughby, 2008). At a group level, the measure has proven useful in multiple studies assessing the effectiveness of medication treatments for ADHD (e.g., Maneeton, Maneeton, Intaprasert, & Woottiluk, 2014).

The BASC-3 Flex Monitor (Reynolds & Kamphaus, 2016), which includes items tapping behaviors associated with ADHD (as well as other problems), was designed to allow frequent and individually tailored assessment to monitor the effectiveness of treatments for ADHD. Teacher, parent, and child forms are available, with digital versions and graphical depiction of change in a child's scores over time. Normative performance on the Monitor can be estimated from the BASC norms. Unfortunately, despite being developed with the explicit purpose of treatment monitoring, there is little published evidence of the validity of the scale for this purpose. A similar measure, the SKAMP (Swanson, 1992), is a brief 10-item scale assessing academic impairment related to inattention and disruptive behavior. Murray and colleagues (2009) reported means and standard deviations for the measure from a large sample, divided by gender, ethnicity, and grade level, and documented good internal consistency. Satisfactory single-day stability also has been demonstrated (e.g., Wigal, Gupta, Guinta, & Swanson, 1998). The SKAMP has repeatedly demonstrated sensitivity to the effects of medication or combined medication and psychosocial treatment (e.g., Greenhill et al., 2001; Manos et al., 2015; Wigal et al., 2014). Unfortunately, the SKAMP is not widely or easily accessible.

Broadband Checklists

As indicated in Table 4.3, the parent and teacher versions of the ASEBA have demonstrated sensitivity to behavioral, medication, and combined interventions for children with ADHD or disruptive behaviors (e.g., Ialongo et al., 1993; Kazdin, 2003; Masi et al., 2016; Wang, Wu, Lee, & Tsai, 2014). Earlier versions of the Conners 3, both parent and teacher forms, have consistently demonstrated sensitivity to medication treatments for children with ADHD (e.g., Gadow, Sverd, Sprafkin, Nolan, & Grossman, 1999; Weiss et al., 2005), and some evidence supports their sensitivity to behavioral interventions as well (e.g., Horn, Ialongo, Popovich, & Peradotto, 1987; Pisterman et al., 1989).

Measures of Impairment

Among the measures of impairment, the CAFAS has generally adequate psychometric properties, as indicated in Table 4.3, and has demonstrated sensitivity to behavioral or mental health interventions in general clinical samples (e.g., Puddy, Roberts, Vernberg, & Hambrick, 2012; Timmons-Mitchell, Bender, Kishna, & Mitchell, 2006). However, this sensitivity has not been examined specifically within ADHD samples. Both the IRS (Fabiano et al., 2006) and the Weiss Functional Impairment Rating Scale (Weiss et al., 2005; available online at http://naceonline.com/AdultADHDtoolkit/assessmenttools/wfirs.pdf) have demonstrated evidence of treatment sensitivity, for both behavioral and medication treatments, specifically


Table 4.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability a | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended

Narrowband ADHD Rating Scales
ADHD Rating Scale-5
  Parent | E | E | NA | G | A | G | E | E | A | ✓
  Teacher | E | E | NA | G | A | G | E | E | A | ✓
IOWA
  Parent | A | G | NA | A | A | G | G | G | A
  Teacher | A | G | NA | A | A | G | G | G | A

Broadband Rating Scales
ASEBA
  Parent: CBCL | E | G | NA | E | G | G | E | E | A | ✓
  Teacher: TRF | E | E | NA | G | G | G | E | E | A | ✓
Conners 3
  Parent | E | E | NA | G | G | G | E | E | A | ✓
  Teacher | E | E | NA | G | G | G | E | E | A | ✓

Measures of Impairment
CAFAS | NR | A | E | NR | A | G | G | G | A
IRS
  Parent | NR | NR | NA | G | A | G | A | E | A
  Teacher | G | NR | NA | G | A | G | G | E | A
Weiss | A | G | NA | A | NR | A | A | E | A

a This column reflects inter-rater agreement between clinical judges; this information is not available for most measures, for which parent and teacher agreement is more commonly assessed instead.

Note: ASEBA = Achenbach System of Empirically Based Assessment; CBCL = Child Behavior Checklist; TRF = Teacher Report Form; CAFAS = Child and Adolescent Functional Assessment Scale; IRS  =  Impairment Rating Scale; Weiss  =  Weiss Functional Impairment Rating Scale; A  =  Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

in samples of children with ADHD (Hantson et al., 2012; Owens, Johannes, & Karpenko, 2009; Stein et al., 2015; Waxmonsky et al., 2010). Both measures have a parent version, and the IRS also has a teacher version, which is useful in assessing intervention effects within the classroom. The COSS does appear to be sensitive to classroom interventions (Abikoff et al., 2013); however, as yet, no independent replications of this sensitivity are available.

Observational Measures

As noted previously, observational measures may be useful in treatment planning, and, similar to the procedures of a daily report card, such observations can yield ongoing assessment of treatment progress and documentation of treatment outcome. For example, frequency counts of problematic behavior in either home or school contexts that are individualized for each child have an obvious utility in monitoring treatment and guiding decisions regarding needed changes in regimens. Such observations have proven sensitive to the effects of both medication and behavior management strategies (Pelham et al., 2005),

and evidence suggests that functional assessments with observable targets improve treatment effectiveness (Miller & Lee, 2013). Structured parent–child interaction observational measures, such as the BCS, have demonstrated sensitivity to the effects of behavioral parent training (evidence reviewed in McMahon & Forehand, 2003). Despite the clear relevance of these observational measures for assessing treatment-related change, their advantages are offset by a lack of information regarding expected normative changes in scores over time and by a lack of traditional validity evidence (Kollins, 2004).

Overall Evaluation

As with measures useful for treatment planning, the measures with the strongest psychometric properties (i.e., ADHD symptom scales and broadband checklists), although potentially useful in monitoring treatment outcomes, are more limited in their ability to assess the details of each child's impairments or to be sensitive to the relatively rapid changes in child behavior that are common in medication and behavioral interventions. In addition,


the length of the broadband checklists is often prohibitive for repeated assessments. Clinicians are advised to give careful consideration to supplementing these measures with others that may more directly assess the child’s daily functioning (e.g., impairment scales or observational measures), with appropriate caution in the use of these measures due to their psychometric limitations. Clinical research is urgently needed to expand the evidence of the reliability and validity of scores on these measures and, most important, to provide empirical support for the clinical utility they are assumed to possess.
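To make the kind of individualized monitoring described above more concrete, the brief sketch below tallies daily frequency counts of a single target behavior across baseline and treatment phases; the behavior, phases, and counts are hypothetical and are not drawn from any measure reviewed in this chapter.

# Illustrative only: summarizing individualized frequency counts (e.g., times out
# of seat, incomplete chores) across baseline and treatment phases, in the spirit
# of the observational monitoring described in this section. All data are hypothetical.
from statistics import mean

# Daily counts of a single, clearly defined target behavior recorded by a teacher.
baseline = [9, 7, 8, 10, 9]          # week before the behavioral plan
treatment_week_1 = [6, 7, 5, 6, 4]   # first week of the plan
treatment_week_2 = [4, 3, 5, 3, 2]   # second week of the plan

def phase_summary(label, counts):
    """Print the mean daily frequency for one phase of monitoring."""
    print(f"{label}: mean = {mean(counts):.1f} per day over {len(counts)} days")

for label, counts in [("Baseline", baseline),
                      ("Treatment week 1", treatment_week_1),
                      ("Treatment week 2", treatment_week_2)]:
    phase_summary(label, counts)

Even a summary this simple can indicate whether a new regimen is moving the target behavior in the expected direction and when a change in the plan may be warranted, although it carries none of the normative or validity information provided by standardized scales.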

ASSESSMENT OF ADHD IN ADULTHOOD

Evidence supporting the lifespan persistence of ADHD symptoms is strong (e.g., Turgay et  al., 2012), and the DSM-​ 5 made revisions explicitly designed to address assessment issues within the adult population. Specifically, symptom examples were provided that are more appropriate for adults (e.g., feelings of restlessness rather than overt motor activity and forgetful in paying bills or keeping appointments) and, reflecting the normative decrease in symptoms across age, only five symptoms of either inattention or hyperactivity–​impulsivity are required for a diagnosis in adulthood. Assessment of ADHD in adulthood presents some challenges that overlap with those present in child assessments, but also some that are unique to the adult stage. As in childhood, it is important that multiple sources of information be considered in the assessment of symptoms. In contrast to childhood, in adulthood there is a reliance on self-​reports as one source of information, and these are considered alongside the perceptions of others who know the individual well (e.g., a spouse). However, as in childhood, the reports from these different sources seldom converge completely (e.g., Barkley, Knouse, & Murphy, 2011). Moreover, not only are there few guidelines for how to reconcile these reports in adulthood compared to childhood, but also there are greater obstacles to obtaining useful perceptions from other informants (e.g., there is no close other available or the client may be reluctant to consent to the gathering of this information). ADHD in adults, as in children, is highly comorbid with a range of other disorders (Kooij et al., 2012), and forming clear, differential diagnoses is often a challenge. More so than in childhood, the possibility of adults overreporting symptoms, perhaps in order to receive special services or dispensations, also must be considered (e.g., Sollman, Ranseen, & Berry, 2010). Finally, although emerging


evidence suggests the possibility that ADHD can arise in adults who were not so diagnosed in childhood (Moffitt et  al., 2015), the prevailing view continues to be consistent with that of DSM-​5, which requires evidence of an onset of symptoms and impairment prior to the age of 12 years to substantiate an ADHD diagnosis. Thus, in assessing ADHD in adults, evidence must be gathered regarding the childhood occurrence of symptoms/​impairment, and again, multiple sources of information (e.g., self-​reports, reports from parents or siblings, and school records) are expected to provide the best approximation of this information. Several measures are available to assess current and retrospective reports of ADHD symptoms in adults, although few are well developed or, as yet, widely used. We have focused our comments on the most recent, most widely studied, and most easily accessible of these. One set of measures, useful for diagnosis, case conceptualization, and treatment monitoring, has been developed by Russell Barkley. The set includes both self-​and other-​reports, for both symptoms and impairment, in both adulthood and retrospectively for childhood. The Barkley Adult ADHD Rating Scale-​IV (BAARS; Barkley, 2011a) contains both self-​and other-​reports of adult and childhood symptoms as well as single-​item measures of age of symptom onset and yes/​no assessments of impairment in four domains. The items were developed to map onto DSM criteria, and an additional nine items were added to tap the newer construct of sluggish cognitive tempo (concentration deficit disorder). Norms, based on a large sample representative of the US population, exist for the self-​report versions of the scale (allowing calculation of age-​referenced percentile scores). Norms for the other-​report versions are not available. The BAARS-​IV yields scores for Inattention, Hyperactivity, Impulsivity, as well as sluggish cognitive tempo, and a screener version using the items that best discriminate clinic-​referred adults with ADHD from community and psychiatric controls also is available. The subscale and total scores demonstrate internal consistencies in the .78 to .90 range and 2-​to 3-​week test–​retest reliabilities in the .66 to .88 range. Across a number of studies, scores on the BAARS-​IV have demonstrated convergent validity with other measures of adult ADHD symptoms (Kooij et al., 2008) and with a range of occupational and relationship outcomes (Barkley, 2011a). Versions of the BAARS-​IV for use in non-​US populations also have been presented (e.g., Vélez-​Pastrana et  al., 2016). Finally, the BAARS-​IV has been used successfully to monitor outcomes of both psychosocial (Safren et al., 2010) and medication (Spencer et al., 2007) treatments for adult ADHD.
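The age-referenced percentile scores mentioned above are typically obtained by locating a raw score within the relevant normative distribution; the sketch below illustrates the usual normal-approximation calculation with hypothetical values, not the actual BAARS-IV norms.

# Illustrative only: how a raw scale score is typically converted to an
# age-referenced percentile when a normative mean and SD are available.
# The mean and SD below are hypothetical placeholders, not BAARS-IV norms.
from statistics import NormalDist

def percentile_from_norms(raw_score, norm_mean, norm_sd):
    """Return the percentile rank implied by a normal approximation to the norm group."""
    z = (raw_score - norm_mean) / norm_sd
    return 100.0 * NormalDist().cdf(z)

# Hypothetical: an inattention raw score of 24 against a norm-group mean of 15 (SD 6).
print(f"{percentile_from_norms(24, 15, 6):.1f}th percentile")  # approximately 93.3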


The clinical utility of the measure is enhanced by a publisher policy that grants limited permission to make copies of the measure from the manual. Similar ADHD symptom checklists for adults include the Adult ADHD Rating Scale, developed in conjunction with the World Health Organization (Kessler et al., 2005)  and available online (https://​www.hcp.med.harvard.edu/​ncs/​asrs.php), and the Conners Adult ADHD Rating Scale (Conners et al., 1999), which includes long, short, and screener forms and is normed with satisfactory psychometric information. Beyond self-​and other-​reported rating scales, clinical interviews specific to adult ADHD also have been developed and may be useful for diagnostic purposes. These include the Conners Adult ADHD Diagnostic Interview for DSM-​IV (CAADID-​IV; Epstein, Johnson, & Conners, 2001) and the Diagnostic Interview for ADHD in Adults (DIVA 2.0; Kooij, 2013), which is available online (http://​ www.divacenter.eu/​DIVA.aspx?id=499). Both measures assess DSM symptoms of ADHD and, as is typical of diagnostic interviews, neither is normed. Preliminary evidence of inter-​rater reliability and convergent/​predictive validity is available for both measures (e.g., Kooij, 2013; Solanto, Wasserstein, Marks, & Mitchell, 2012). The DIVA 2.0 is available in several languages, free of charge, and includes a computer application to facilitate ease of administration and scoring. The CAADID-​IV is composed of two parts. The first portion covers developmental and demographic history, including comorbidities and psychosocial stressors, and can be completed as a self-​ report measure prior to review with the clinician. The second part covers both adult and childhood symptoms, with useful prompts and adult-​appropriate symptom examples provided to guide the assessment. Impairment, pervasiveness, and age of onset are assessed. Assessment of the impairments associated with ADHD symptoms is critical, particularly for case conceptualization, and sometimes for treatment monitoring. Several of the rating scales and interview measures described previously incorporate the assessment of impairment, given its role in diagnostic criteria, and, as for children, efforts are underway to develop a core set of concepts relevant to adult ADHD for the International Classification of Functioning, Disability and Health (Schipper et al., 2015). Currently, assessment of impairment associated with ADHD can be undertaken with the Barkley Functional Impairment Scale (BFIS; Barkley, 2011b). This measure, developed to reflect a clearly defined construct of psychosocial impairment, has both self-​and other-​report forms. The self-​report version has norms derived from

the same representative normative sample used for the BAARS-​IV. The BFIS items cover 15 domains of functioning (e.g., home, community, occupational, and daily responsibilities), and ratings load on a single factor with strong internal consistency (alpha  =  .97) and test–​retest reliability (r = .72). Evidence for convergent and discriminant validity is presented (e.g., correlations with symptom severity, disability status, and clinical group membership). Of course, in addition to impairment, as with children, assessment of a range of possible comorbid conditions and other aspects of functioning is critical in forming a comprehensive case formulation of ADHD in adults, and these constructs also may be important in monitoring treatment progress. Given the nascent nature of the adult ADHD assessment literature, we do not review such measures here, but we encourage clinicians to follow sound clinical practice guidelines (e.g., those provided by the European Consensus on Adult ADHD, the National Institute for Health and Care Excellence [NICE] from the United Kingdom, or the Canadian ADHD Resource Alliance).
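As an illustration of the informant-combination issue raised in this section, the sketch below applies a simple "or" rule, one option commonly discussed in the child ADHD literature, to hypothetical self- and other-report symptom endorsements; it is not a rule endorsed by the measures described above, and the five-symptom adult threshold simply restates the DSM-5 criterion noted earlier.

# Illustrative only: an "or" rule, one commonly discussed way of combining symptom
# reports from two informants (a symptom counts as present if either informant
# endorses it). Endorsement patterns are hypothetical; the threshold of five
# symptoms for adults follows the DSM-5 criterion described in this chapter.
self_endorsed = {1, 3, 4, 6, 8}       # inattention symptoms endorsed on self-report
other_endorsed = {2, 3, 4, 9}         # symptoms endorsed by a spouse or other informant

combined = self_endorsed | other_endorsed  # set union implements the "or" rule
print(f"Symptoms counted as present: {sorted(combined)} (n = {len(combined)})")
print("Adult threshold of 5+ symptoms met:", len(combined) >= 5)

An "and" rule, requiring both informants to endorse a symptom, would be implemented with set intersection instead and yields a more conservative count; evaluating which rule performs better is exactly the kind of combination question identified as a research need in the concluding section.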

CONCLUSIONS AND FUTURE DIRECTIONS

A multitude of tools for assessing ADHD across the lifespan are available, both commercially and in the public domain, and new additions emerge regularly. In contrast to this abundant quantity of measures, few measures are available that possess substantial research on their psychometric qualities or that have been validated for uses beyond diagnostic questions. In this final section of the chapter, we draw attention to prominent unanswered questions regarding assessments for ADHD diagnoses and for treatment planning and monitoring. We again note that our focus on assessment measures should not overshadow the fact that the process of assessing an individual with ADHD involves much more than simple administration of a standard set of measures. Clinicians must make client-​specific decisions regarding which measures are best suited for each individual client and family (e.g., Is this child represented in the measure’s normative group?), at which point in the assessment process (e.g., Is the measure needed primarily for assigning a diagnosis or for monitoring the child’s response to a new medication?), and how information from multiple sources and measures is best combined to answer the assessment question (e.g., Is a sibling an adequate reporter of childhood symptoms in an adult client?). In addition, information derived from the measures presented here must be supplemented with


clinical judgments regarding each individual’s situation and context (e.g., cultural factors) and must be employed within the context of a caring and supportive therapeutic relationship between clinician and client. In diagnosing ADHD, the use of unstructured interviews as a guide for identifying general areas of concern (in terms of both ADHD and comorbid disorders), developmental and treatment history, and information specific to the client’s circumstances remains common, despite the known limitations of this assessment method. Further efforts to develop and evaluate more structured and semi-​structured tools that could couple the gathering of this information in a systematic manner with a sensitivity to individual client differences and the need to establish a strong working relationship between clinician and client would be clinically valuable. Similarly, although a few standardized measures with adequate psychometric properties have proven their value in planning and monitoring treatment progress in children, the most promising measures in this area originate from a behavioral perspective but lack standardization, norm development, and broad psychometric evaluation. We believe that these measures have the greatest potential for enhancing the selection of appropriate treatment targets for children with ADHD and for providing careful, continuous, and objective feedback regarding treatment progress. However, one cannot ignore the inadequacies of these measures in terms of traditional psychometric properties. Continued research is much needed to address these limitations and to develop and test clinically useful measures appropriate to assessing and monitoring change in the functional impairments that form the core of ADHD treatment planning. Technological advances, such as online data collection platforms, computerized scoring and reporting templates, and portable recording options, offer exciting possibilities in moving forward with the development of assessment tools, but they are perhaps particularly applicable within the realm of treatment monitoring. Turning to the more common and psychometrically tested assessment methods commonly used in diagnosis, particularly rating scales, consensus appears to be that for both children and adults, information from multiple informants and contexts is necessary (e.g., Barkley, 2011a; Pelham et al., 2005). What is now needed is greater concentration on evaluating methods for combining this information and establishing the relative incremental validity of different informants and contexts. Similarly, much further research is needed to clarify the relative merits of different assessment methods (e.g., symptom-​specific


rating scales, structured interviews, and observations) for arriving at diagnostic or treatment decisions. We know exceptionally little about which types of information are the most crucial in determining which types of assessment and treatment to administer. To maximize the extent to which our assessments can boast of being both evidence-​ based and cost-​effective, research with a clear focus on the clinical utility or incremental validity of how each piece of assessment information fits (or does not) within the puzzle of an optimally designed assessment process for ADHD is urgently needed. Beyond the need to refine the measures and process of assessing ADHD, we have been struck by two significant gaps that exist in this area. First, there often appears to be a disconnect between assessments of ADHD diagnoses and assessments with greater relevance to the treatment of the disorder. As we have repeatedly noted, among individuals referred with ADHD, it is often the case that the most pressing clinical problems are those related to functional impairments (e.g., in interpersonal relationships or academic/​vocational functioning) or to comorbid conditions (e.g., learning problems or depression). Symptom severity, the target of diagnostic assessment, is clearly related to these impairments but not synonymous with them. Knowledge of an individual’s level of ADHD symptoms offers little treatment guidance because changes in these symptom levels may not mirror changes in the functional problems that instigated help-​seeking. Second, as in many areas, there remains a significant gap between research on ADHD assessment and treatment and the delivery of these services outside of research settings. The dissemination and uptake of the most evidence-​based assessment tools (and treatments) lags woefully behind the advancing scientific knowledge. Recent work in the development and evaluation of clinical care pathways for ADHD offers an important bridge over this gap (e.g., Carroll et al., 2013; Coghill & Seth, 2015; Vander Stoep et al., 2017) and holds promise as a future direction in improving the assessment (and subsequent treatment) of ADHD. In closing, we acknowledge a number of resources relevant to the assessment of ADHD and refer clinicians to these resources for additional guidelines and information useful in this endeavor. Recent books by Barkley (2015) and Anastopoulos and Shelton (2001) provide excellent coverage of assessment issues in ADHD. Clinical guidelines for assessing ADHD have been provided by the American Academy of Pediatrics (2011) and the American Academy of Child and Adolescent Psychiatry (2007). Pelham and colleagues’ (2005) contribution on


evidence-​based assessment for ADHD continues to be an excellent resource. We trust that this chapter, along with these additional resources, provides the clinician with an overview of the issues prominent in the assessment of ADHD and with a guide to currently available and useful measures.

References Abikoff, H., & Gallagher, R. (2009). Children’s organizational skills scale. Tonawanda, NY: Multi-​Health Systems. Abikoff, H., Gallagher, R., Wells, K. C., Murray, D. W., Huang, L., Lu, F., & Petkova, E. (2013). Remediating organizational functioning in children with ADHD:  Immediate and long-​term effects from a randomized controlled trial. Journal of Consulting and Clinical Psychology, 81, 113–​128. Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/​adolescent behavioral and emotional problems: Implications of cross-​informant correlations for situational specificity. Psychological Bulletin, 101, 213–​232. Achenbach, T. M., & Rescorla, L. A. (2001). Manual for the ASEBA school-​ age forms & profiles. Burlington, VT:  University of Vermont, Research Center for Children, Youth, & Families. Algorta, G. P., Dodd, A. L., Stringaris, A., & Youngstrom, E. A. (2016). Diagnostic efficiency of the SDQ for parents to identify ADHD in the UK: A ROC analysis. European Child & Adolescent Psychiatry, 25, 949–​957. American Academy of Child and Adolescent Psychiatry. (2007). Practice parameter for the assessment and treatment of children and adolescents with attention-​deficit/​ hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 46, 894–​921. American Academy of Pediatrics. (2011). ADHD: Clinical practice guideline for the diagnosis, evaluation, and treatment of attention-​deficit/​hyperactivity disorder in children and adolescents. Pediatrics, 128, 1007–​1022. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Anastopoulos, A. D., & Shelton, T. L. (2001). Assessing attention-​deficit/​hyperactivity disorder. Dordrecht, the Netherlands: Kluwer. Ang, R. P., Rescorla, L. A., Achenbach, T. M., Ooi, Y. P., Fung, D. S. S., & Woo, B. (2012). Examining the criterion validity of CBCL and TRF problem scales and items in a large Singapore sample. Child Psychiatry and Human Development, 43, 70–​86. Angold, A., & Costello, J. (2000). The Child and Adolescent Psychiatric Assessment (CAPA). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 49–​58.

Angold, A., Erkanli, A., Costello, E. J., & Rutter, M. (1996). Precision, reliability and accuracy in the dating of symptom onset in child and adolescent psychopathology. Journal of Child Psychology and Psychiatry, 37, 657–​664. Angold, A., Erkanli, A., Egger, H. L., & Costello, E. J. (2000). Stimulant treatment for children:  A community perspective. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 975–​984. Arnett, A. B., Pennington, B. F., Willcutt, E. G., DeFries, J. C., & Olson, R. K. (2015). Sex differences in ADHD symptom severity. Journal of Child Psychology and Psychiatry, 56, 632–​639. Bard, D. E., Wolraich, M. L., Neas, B., Doffing, M., & Beck, L. (2013). The psychometric properties of the Vanderbilt Attention-​ Deficit Hyperactivity Disorder Diagnostic Parent Rating Scale in a community population. Journal of Developmental and Behavioral Pediatrics, 34, 72–​82. Barkley, R. A. (2002). International consensus statement on ADHD, January 2002. Clinical Child and Family Psychology Review, 5, 89–​111. Barkley, R. A. (2006). Attention-​deficit/​hyperactivity disorder. In D. A. Wolfe & E. J. Mash (Eds.), Behavioral and emotional disorders in adolescents: Nature, assessment, and treatment (pp. 91–​152). New York, NY: Guilford. Barkley, R. A. (2011a). Barkley Adult ADHD Rating Scale-​IV (BAARS-​IV). New York, NY: Guilford. Barkley, R. A. (2011b). Barkley Functional Impairment Scale (BFIS). New York, NY: Guilford. Barkley, R. A. (2015). Attention-​deficit/​hyperactivity disorder:  A handbook for diagnosis and treatment (4th ed.). New York, NY: Guilford. Barkley, R. A., Knouse, L. E., & Murphy, K. R. (2011). Correspondence and disparity in the self-​and other ratings of current and childhood ADHD symptoms and impairment in adults with ADHD. Psychological Assessment, 23, 437–​446. Barkley, R. A., & Murphy, K. R. (2006). Attention-​deficit hyperactivity disorder:  A clinical workbook (3rd ed.). New York, NY: Guilford. Becker, S. P., Leopold, D. R., Burns, G. L., Jarrett, M. A., Langberg, J. M., Marshall, S. A.,  .  .  .  Willcutt, E. G. (2016). The internal, external, and diagnostic validity of sluggish cognitive tempo:  A meta-​analysis and critical review. Journal of the American Academy of Child & Adolescent Psychiatry, 55, 163–​178. Brown, T. E. (2001). Brown attention-​deficit disorder scales for children and adolescents. San Antonia, TX: Psychological Corporation. Carroll, A. E., Bauer, N. S., Dugan, T. M., Anand, V., Saha, C., & Downs, S. M. (2013). Use of a computerized decision aid for ADHD diagnosis: A randomized controlled trial. Pediatrics, 132, e623–​e629.


Coghill, D., & Seth, S. (2015, November 19). Effective management of attention-​ deficit/​ hyperactivity disorder (ADHD) through structured re-​ assessment:  The Dundee ADHD Clinical Care Pathway. Child and Adolescent Psychiatry and Mental Health, 9, 1–​11. Conners, C. K. (2008). Conners 3rd Edition:  Manual. Tonawanda, NY: Multi-​Health Systems. Conners, C. K., Erhardt, D., Epstein, J. N., Parker, J. D. A., Sitarenios, G., & Sparrow, E. (1999). Self-​ratings of ADHD symptoms in adults: I. Factor structure and normative data. Journal of Attention Disorders, 3, 141–​151. Conners, C. K., & MHS Staff. (2000). Conners Continuous Performance Test II. Tonawanda, NY:  Multi-​ Health Systems. Craig, F., Lamanna, A. L., Margari, F., Matera, E., Simone, M., & Margari, L. (2015). Overlap between autism spectrum disorders and attention deficit hyperactivity disorder: Searching for distinctive/​common clinical features. Autism Research, 8, 328–​337. De Los Reyes, A. (2013). Strategic objectives for improving understanding of informant discrepancies in developmental psychopathology research. Development and Psychopathology, 25, 669–​682. de Nijs, P. F.  A., Ferdinand, R. F., de Bruin, E. I., Dekker, M. C.  J., van Duijn, C. M., & Verhulst, F. C. (2004). Attention-​ deficit/​ hyperactivity disorder (ADHD):  Parents’ judgment about school, teacher’s judgment about home. European Child and Adolescent Psychiatry, 13, 315–​320. Derks, E. M., Hudziak, J. J., Dolan, C. V., Ferdinand, R. F., & Boomsma, D. I. (2006). The relations between DISC-​IV DSM diagnoses of ADHD and multi-​informant CBCL-​AP syndrome scores. Comprehensive Psychiatry, 47, 116–​122. Dirks, M., De Los Reyes, A., Briggs-​ Gowan, M., Cella, D., & Wakschlag, L. S. (2012). Annual research review: Embracing not erasing contextual variability in children’s behavior—​Theory and utility in the selection and use of methods and informants in developmental psychopathology. Journal of Child Psychology and Psychiatry, 53, 558–​574. Duff, C. T., & Sulla, E. M. (2015). Measuring executive function in the differential diagnosis of attention-​deficit/​ hyperactivity disorder:  Does it really tell us anything? Applied Neuropsychology: Child, 4, 188–​196. DuPaul, G. J., Power, T. J., Anastopoulos, A. D., & Reid, R. (2016). ADHD Rating Scale-​5 for children and adolescents:  Checklists, norms, and clinical interpretation. New York, NY: Guilford. DuPaul, G. J., Reid, R., Anastopoulos, A. D., Lambert, M. C., Watkins, M. W., & Power, T. J. (2016). Parent and teacher ratings of attention-​deficit/​hyperactivity disorder symptoms:  Factor structure and normative data. Psychological Assessment, 28, 214–​225.


DuPaul, G. J., Volpe, R. J., Jitendra, A. K., Lutz, J. G., Lorah, K. S., & Gruber, R. (2004). Elementary school students with AD/​ HD:  Predictors for academia achievement. Journal of School Psychology, 42, 285–​301. Edwards, M. C., Schulz, E. G., Chelonis, J., Gardner, E., Philyaw, A., & Young, J. (2005). Estimates of the validity and utility of unstructured clinical observations of children in the assessment of ADHD. Clinical Pediatrics, 44, 49–​56. Epstein, J. N., Johnson, D. E., & Conners, C. K. (2001). Conners’ adult ADHD diagnostic interview for DSM-​IV. North Tonawanda, NY: Multi-​Health Systems. Evans, S. W., Owens, J. S., & Bunford, N. (2014). Evidence-​ base psychosocial treatments for children and adolescents with attention-​ deficit/​ hyperactivity disorder. Journal of Clinical Child and Adolescent Psychology, 43, 527–​551. Fabiano, G. A., Pelham, W. E., Waschbusch, D. A., Gnagy, E. M., Lahey, B. B., Chronis, A. M.,  .  .  .  Burrows-​ Mclean, L. (2006). A practical measure of impairment:  Psychometric properties of the impairment rating scale in samples of children with attention deficit hyperactivity disorder and two school-​based samples. Journal of Clinical Child and Adolescent Psychology, 35, 369–​385. Frick, P. J., Barry, C. T., & Kamphaus, R. W. (2010). Clinical assessment of child and adolescent personality and behavior. New York, NY: Springer. Gadow, K. D., Drabick, D. A.  G., Loney, J., Sprafkin, J., Salisbury, H., Azizian, A.,  .  .  .  Schwartz, J. (2004). Comparison of ADHD symptom subtypes as source-​ specific syndromes. Journal of Child Psychology and Psychiatry, 45, 1135–​1149. Gadow, K. D., & Sprafkin, J. (2002). Child Symptom Inventory-​ 4 norms manual. Stony Brook, NY: Checkmate Plus. Gadow, K. D., Sprafkin, J., Salisbury, H., Schneider, J., & Loney, J. (2004). Further validity evidence for the teacher version of the Child Symptom Inventory-​ 4. School Psychology Quarterly, 19, 50–​71. Gadow, K. D., Sverd, J., Sprafkin, J., Nolan, E. E., & Grossman, S. (1999). Long term methylphenidate therapy in children with comorbid attention deficit hyperactivity disorder and chronic multiple tic disorder. Archives of General Psychiatry, 56, 334–​336. Gallo, E. F., & Posner, J. (2016). Moving towards causality in attention-​deficit hyperactivity disorder:  Overview of neural and genetic mechanisms. Lancet Psychiatry, 3, 555–​567. Ghuman, J., & Ghuman, H. (2014). ADHD in preschool children: Overview and diagnostic considerations. In J. K. Ghuman & H. S. Ghuman (Eds.), ADHD in preschool children:  Assessment and treatment (pp. 3–​22). Oxford: Oxford University Press.


Gladman, M., & Lancaster, S. (2003). A review of the Behavior Assessment System for Children. School Psychology International, 24, 276–​291. Gomez, R. (2012). Item response theory analyses of adolescent self-​ ratings of the ADHD symptoms in the Disruptive Behavior Rating Scale. Personality and Individual Differences, 53, 963–​968. Gomez, R., Burns, G. L., Walsh, J. A., & De Moura, M. A. (2003). Multitrait–​multisource confirmatory factor analytic approach to the construct validity of ADHD rating scales. Psychological Assessment, 15, 3–​16. Goodman, R. (1997). The Strengths and Difficulties Questionnaire:  A research note. Journal of Child Psychology and Psychiatry, 38, 581–​586. Goodman, R. (2001). Psychometric properties of the Strengths and Difficulties Questionnaire. Journal of the American Academy of Child & Adolescent Psychiatry, 4, 1337–​1345. Goyette, C. H., Conners, C. K., & Ulrich, R. F. (1978). Normative data on revised Conners Parent and Teacher Rating Scales. Journal of Abnormal Child Psychology, 6, 221–​236. Greenhill, L. L., Swanson, J. M., Vitiello, B., Davis, M., Clevenger, W., Wu, M.,  .  .  .  Wigal, T. (2001). Impairment and deportment responses to different methylphenidate doses in children with ADHD:  The MTA titration trial. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 180–​187. Hall, C. L., Valentine, A. Z., Groom, M. J., Walker, G. M., Sayal, K., Daley, D., & Hollis, C. (2016). The clinical utility of the continuous performance test and objective measures of activity for diagnosing and monitoring ADHD in children: A systematic review. European Child and Adolescent Psychiatry, 25, 677–​699. Hanssen-​ Bauer, K., Langsrud, Ø., Kvernmo, S., & Heyerdahl, S. (2010). Clinician-​ rated mental health in outpatient child and adolescent mental health services: Associations with parent, teacher and adolescent ratings. Child and Adolescent Psychiatry and Mental Health, 4, 1–​12. Hantson, J., Wang, P. P., Grizenko-​Vida, M., Ter-​Stepanian, M., Harvey, W., Joober, R., & Grizenko, N. (2012). Effectiveness of a therapeutic summer camp for children with ADHD:  Phase I  clinical intervention trial. Journal of Attention Disorders, 16, 610–​617. Hart, E. L., Lahey, B. B., Loeber, R., Applegate, B., Green, S. M., & Frick, P. J. (1995). Developmental change in attention-​deficit hyperactivity disorder in boys:  A four-​ year longitudinal study. Journal of Abnormal Child Psychology, 23, 729–​749. Harvey, E. A., Lugo-​Candelas, C. I., & Breaux, R. P. (2015). Longitudinal changes in individual symptoms across the preschool years in children with ADHD. Journal of Clinical Child and Adolescent Psychology, 44, 580–​594.

Hodges, K., & Wong, M. M. (1996). Psychometric characteristics of a multidimensional measure to assess impairment:  The Child and Adolescent Functional Assessment Scale. Journal of Child and Family Studies, 5, 445–​467. Horn, W. F., Ialongo, N., Popovich, S., & Peradotto, D. (1987). Behavioral parent training and cognitive–​ behavioral self-​ control therapy with ADD-​ H children:  Comparative and combined effects. Journal of Clinical Child Psychology, 16, 57–​68. Huss, M., Sikirica, V., Hervas, A., Newcorn, J. H., Harpin, V., & Robertson, B. (2016). Guanfacine extended release for children and adolescents with attention-​ deficit/​ hyperactivity disorder:  Efficacy following prior methylphenidate treatment. Neuropsychiatric Disease and Treatment, 12, 1085–​1101. Ialongo, N. S., Horn, W. F., Pascoe, J. M., Greenberg, G., Packard, T., Lopez, M.,  .  .  .  Puttler, L. (1993). The effects of a multimodal intervention with attention-​ deficit hyperactivity disorder children:  A 9-​ month follow-​up. Journal of the American Academy of Child & Adolescent Psychiatry, 32, 182–​189. Johnston, C., & Murray, C. (2003). Incremental validity in the psychological assessment of children and adolescents. Psychological Assessment, 15, 496–​507. Johnston, C., & Pelham, W. E. (1986). Teacher ratings predict peer ratings of aggression at 3-​year follow-​up in boys with attention deficit disorder with hyperactivity. Journal of Consulting and Clinical Psychology, 54, 571–​572. Johnston, C., Weiss, M. D., Murray, C., & Miller, N. V. (2011). The effects of instructions on mothers’ ratings of child attention-​deficit/​hyperactivity disorder symptoms. Journal of Abnormal Child Psychology, 39, 1099–​1110. Johnston, C., Weiss, M. D., Murray, C., & Miller, N. V. (2014). The effects of instructions on mothers’ ratings of attention-​ deficit/​ hyperactivity disorder symptoms in referred children. Journal of Abnormal Child Psychology, 42, 479–​488. Kamphaus, R. W., Reynolds, C. R., Hatcher, N. M., & Kim, S. (2004). Treatment planning and evaluation with the Behavior Assessment System for Children (BASC). In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (pp. 331–​ 354). Mahwah, NJ: Erlbaum. Kao, G. S., & Thomas, H. M. (2010). Review of Conners 3rd Edition. Journal of Psychoeducational Assessment, 28, 598–​602. Karalunas, S. L., Fair, D., Musser, E. D., Aykes, K., Iyer, S. P., & Nigg, J. T. (2014). Subtyping attention-​deficit/​ hyperactivity disorder using temperament dimensions:  Toward biologically based nosologic. JAMA Psychiatry, 71, 1015–​1024. Kasper, L. J., Alderson, R. M., & Hudec, K. L. (2012). Moderators of working memory deficits in children

with attention-​deficit/​hyperactivity disorder (ADHD): A meta-​analytic review. Clinical Psychology Review, 32, 605–​617. Kaufman, J., Birmaher, B., Brent, D., Rao, U., Flynn, C., Moreci, P., . . . Ryan, N. (1997). Schedule for affective disorders and schizophrenia for School-​Aged Children–​ Present and Lifetime version (K-​ SADS-​ PL):  Initial reliability and validity data. Journal of the American Academy of Child & Adolescent Psychiatry, 36, 980–​988. Kazdin, A. E. (2003). Problem-​ solving skills training and parent management training for conduct disorder. In A. E. Kazdin & J. R. Weisz (Eds.), Evidence-​based psychotherapies for children and adolescents (pp. 241–​262). New York, NY: Guilford. Kessler, R. C., Adler, L., Ames, M., Demler, O., Faraone, S., Hiripi, E., . . . Walters, E. E. (2005). The World Health Organization adult ADHD Self-​Report Scale (ASRS): A short screening scale for use in the general population. Psychological Medicine, 35, 245–​256. Kollins, S. H. (2004). Methodological issues in the assessment of medication effects in children diagnosed with attention deficit hyperactivity disorder (ADHD). Journal of Behavioral Education, 13, 247–​266. Kollins, S. H., Jain, R., Brams, M., Segal, S., Findling, R. L., Wigal, S. B., & Khayrallah, M. (2011). Clonidine extended-​release tablets as add-​on therapy to psychostimulants in children and adolescents with ADHD. Pediatrics, 127, 1406–​1413. Koocher, G. P., McMann, M. R., Stout, A. O., & Norcross, J. C. (2015). Discredited assessment and treatment methods used with children and adolescents: A Delphi poll. Journal of Clinical Child and Adolescent Psychology, 44, 722–​729. Kooij, J. J.  S. (2013). Adult ADHD:  Diagnostic assessment and treatment (3rd ed.). New York, NY: Springer-​Verlag. Kooij, J. J.  S., Bejerot, S., Blackwell, A., Caci, H., Casas-​ Brugué, M., Carpentier, P. J., . . . Asherson, P. (2010). European consensus statement on diagnosis and treatment of adult ADHD:  The European Network Adult ADHD. BMC Psychiatry, 10, 67–​90. Kooij, J. J. S., Boonstra, A. M., Swinkels, S. H. N., Bekker, E. M., de Noord, I., & Buitelaar, J. K. (2008). Reliability, validity, and utility of instruments for self-​report and informant report concerning symptoms of ADHD in adult patients. Journal of Attention Disorders, 11, 445–​458. Kooij, J. J. S., Huss, M., Asherson, P., Akehurst, R., Beusterien, K., French, A., . . . Hodgkins, P. (2012). Distinguishing comorbidity and successful management of adult ADHD. Journal of Attention Disorders, 16(5, Suppl.), 3S–​19S. Lachar, D. (1982). Personality Inventory for Children-​Revised (PIC-​R). Los Angeles, CA:  Western Psychological Services.

Lee, S., Burns, G. L., Snell, J., & McBurnett, K. (2014). Validity of the sluggish cognitive tempo symptom dimension in children:  Sluggish cognitive tempo and ADHD–​ inattention as distinct symptom symptoms. Journal of Abnormal Child Psychology, 42, 7–​19. Levy, F. (2015). Attention deficit hyperactivity disorder:  40  years consistent work. Australian and New Zealand Journal of Psychiatry, 49, 573–​573. Loeber, R., Green, S. M., Lahey, B. B., & Stouthamer-​ Loeber, M. (1991). Differences and similarities between children, mothers, and teachers as informants on disruptive child behavior. Journal of Abnormal Child Psychology, 19, 75–​95. Loney, J., & Milich, R. (1982). Hyperactivity, inattention, and aggression in clinical practice. In D. K. Routh (Ed.), Advances in developmental and behavioral pediatrics (pp. 113–​147). New York, NY: Plenum. Maneeton, N., Maneeton, B., Intaprasert, S., & Woottiluk, P. (2014). A systematic review of randomized controlled trials of bupropion versus methylphenidate in the treatment of attention-​deficit/​hyperactivity disorder. Neuropsychiatric Disease and Treatment, 10, 1439–​1449. Manos, M. J., Caserta, D. A., Short, E. J., Raleigh, K. L., Giuliano, K. C., Pucci, N. C., & Frazier, T. W. (2015). Evaluation of the duration of action and comparative effectiveness of lisdexamfetamine dimesylate and behavioral treatment in youth with ADHD in a quasi-​ naturalistic setting. Journal of Attention Disorders, 19, 578–​590. Marcus, D. K., & Barry, T. D. (2011). Does attention-​ deficit/​hyperactivity disorder have a dimensional latent structure? A  taxometric analysis. Journal of Abnormal Psychology, 120, 427–​442. Martel, M. M., Schimmack, U., Nikolas, M., & Nigg, J. T. (2015). Integration of symptom ratings from multiple informants in ADHD diagnosis: A psychometric model with clinical utility. Psychological Assessment, 27, 1060–​1071. Martel, M. M., von Eye, A., & Nigg, J. (2012). Developmental differences in structure of attention-​deficit/​hyperactivity disorder (ADHD) between childhood and adulthood. International Journal of Behavioral Development, 36, 279–​292. Mash, E. J., & Barkley, R. A. (2007). Assessment of childhood disorders (4th ed.). New York, NY: Guilford. Masi, G., Milone, A., Manfredi, A., Brovedani, P., Pisano, S., & Muratori, P. (2016). Combined pharmacotherapy–​ multimodal psychotherapy in children with disruptive behavior disorders. Psychiatry Research, 238, 8–​13. McCarney, S. B. & Arthaud, T. J. (2013a). Attention Deficit Disorder Evaluation Scale, fourth edition:  Home version technical manual. Columbia, MO:  Hawthorne Educational Services.

McCarney, S. B., & Arthaud, T. J. (2013b). Attention Deficit Disorder Evaluation Scale, fourth edition:  School version technical manual. Columbia, MO:  Hawthorne Educational Services. McConaughy, S. H. (2001). The Achenbach System of Evidence Based Assessment. In J. J.  W. Andrews, D. H. Saklofsky, & H. L. Jensen (Eds.), Handbook of psychoeducational assessment:  Ability, achievement, and behavior in children (pp. 289–​324). San Diego, CA: Academic Press. McConaughy, S. H., Harder, V. S., Antshel, K. M., Gordon, M., Eiraldi, R., & Dumenci, L. (2010). Incremental validity of test session and classroom observations in a multimethod assessment of attention deficit/​hyperactivity disorder. Journal of Clinical Child and Adolescent Psychology, 39, 650–​666. McGrath, A. M., Handwerk, M. L., Armstrong, K. J., Lucas, C. P., & Friman, P. C. (2004). The validity of the ADHD section of the Diagnostic Interview Schedule for Children. Behavior Modification, 28, 349–​374. McMahon, R. J., & Forehand, R. L. (2003). Helping the noncompliant child, second edition: Family-​based treatment for oppositional behavior. New York, NY: Guilford. Miller, F. G., & Lee, D. L. (2013). Do functional behavioral assessments improve intervention effectiveness for students diagnosed with ADHD? A  single-​subject meta-​ analysis. Journal of Behavioral Education, 22, 253–​282. Moffitt, T. E., Houts, R., Asherson, P., Belsky, D. W., Corcoran, D. L., Hammerle, M., . . . Caspi, A. (2015). Is adult ADHD a childhood-​onset neurodevelopmental disorder? Evidence from a four-​decade longitudinal cohort study. American Journal of Psychiatry, 172, 967–​977. Morgan, P. L., Staff, J., Hillemeier, M. M., Farkas, G., & Maczuga, S. (2013). Racial and ethnic disparities in ADHD diagnosis from kindergarten to eighth grade. Pediatrics, 132, 85–​93. MTA Cooperative Group. (1999). Moderators and mediators of treatment response of children with attention-​deficit/​ hyperactivity disorder: The multimodal treatment study of children with attention-​deficit/​hyperactivity disorder. Archives of General Psychiatry, 56, 1088–​1096. MTA Cooperative Group. (2004). National Institute of Mental Health multimodal treatment study of ADHD follow-​up:  24-​month outcomes of treatment strategies for attention-​deficit/​hyperactivity disorder. Pediatrics, 113, 754–​761. Murray, D. W., Bussing, R., Fernandez, M., Hou, W., Garvan, C. W., Swanson, J. M., & Eyberg, S. M. (2009). Psychometric properties of teacher SKAMP ratings from a community sample. Assessment, 16, 193–​208. Musser, E. D., Galloway-​Long, H. S., Frick, P. J., & Nigg, J. T. (2013). Emotion regulation and heterogeneity in attention-​deficit/​hyperactivity disorder. Journal of the

American Academy of Child & Adolescent Psychiatry, 52, 163–​171. National Institutes of Health. (2000). Consensus Development Conference Statement:  Diagnosis and treatment of attention deficit hyperactivity disorder (ADHD). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 182–​193. Nigg, J. T., Willcutt, E. G., & Doyle, A. E. (2005). Causal heterogeneity in attention-​ deficit/​ hyperactivity disorder:  Do we need neuropsychological impaired subtypes? Biological Psychiatry, 57, 1224–​1230. Nolan, E. E., & Gadow, K. D. (1994). Relation between ratings and observations of stimulant drug response in hyperactive children. Journal of Clinical Child Psychology, 23, 78–​90. Ostrander, R., Weinfurt, K. P., Yarnold, P. R., & August, G. J. (1998). Diagnosing attention deficit disorders with the Behavioral Assessment System for Children and the Child Behavior Checklist: Test and construct validity analyses using optimal discriminant classification trees. Journal of Consulting and Clinical Psychology, 66, 660–​672. Owens, J. S., Johannes, L. M., & Karpenko, V. (2009). The relation between change in symptoms and functioning in children with ADHD receiving school-​based mental health services. School Mental Health, 1, 183–​195. Pelham, W. E., Fabiano, G. A., & Massetti, G. M. (2005). Evidence-​based assessment of attention deficit hyperactivity disorder in children and adolescents. Journal of Clinical Child and Adolescent Psychology, 34, 449–​476. Pisterman, S., McGrath, P. J., Firestone, P., Goodman, J. T., Webster, I., & Mallory, R. (1989). Outcome of parent-​ mediated treatment of preschoolers with attention deficit disorder with hyperactivity. Journal of Consulting and Clinical Psychology, 59, 628–​635. Pliszka, S. R. (2015). Comorbid psychiatric disorders in children with ADHD. In R. A. Barkley (Eds.), Attention-​ deficit hyperactivity disorder:  A handbook for diagnosis and treatment (4th ed., pp. 140–​ 168). New York, NY: Guilford. Powell, N. P., Lochman, J. E., Boxmeyer, C. L., Jimenez-​ Camargo, L. A., Crisler, M. E., & Stromeyer, S. L. (2014). Treatment of conduct problems and disruptive behavior disorders. In C. A. Alfano & D. C. Beidel (Eds.), Comprehensive evidence based interventions for children and adolescents (pp. 195–​212). Hoboken, NJ: Wiley. Puddy, R. W., Roberts, M. C., Vernberg, E. M., & Hambrick, E. P. (2012). Service coordination and children’s functioning in a school-​ based Intensive Mental Health Program. Journal of Child and Family Studies, 21, 948–​962. Rabinovitz, B. B., O’Neill, S., Rajendran, K., & Halperin, J. M. (2016). Temperament, executive control, and

attention-​ deficit/​ hyperactivity disorder across early development. Journal of Abnormal Psychology, 125, 196–​206. Ramtvedt, B. E., Røinås, E., Aabech, H. S., & Sundet, K. S. (2013). Clinical gains from including both dextroamphetamine and methylphenidate in stimulant trials. Journal of Child and Adolescent Psychopharmacology, 23, 597–​604. Reynolds, C. R., & Kamphaus, R. W. (2015). Behavior Assessment System for Children, Third Edition (BASC-​ 3). New York, NY: Pearson. Reynolds, C. R., & Kamphaus, R. W. (2016). BASC-​3 flex monitor. New York, NY: Pearson. Russell, G., Miller, L. L., Ford, T., & Golding, J. (2014). Assessing recall in mothers’ retrospective reports: Concerns over children’s speech and language development. Journal of Abnormal Child Psychology, 42, 825–​830. Safren, S. A., Sprich, S., Mimiaga, M. J., Surman, C., Knouse, L., Groves, M., & Otto, M. W. (2010). Cognitive behavioral therapy vs. relaxation with educational support for medication-​treated adults with ADHD and persistent symptoms: A randomized controlled trial. Journal of the American Medical Association, 304, 875–​880. Sandoval, J., & Echandia, A. (1994). Behavior Assessment System for Children. Journal of School Psychology, 32, 419–​425. Sayal, K., Goodman, R., & Ford, T. (2006). Barriers to the identification of children with attention deficit/​hyperactivity disorder. Journal of Child Psychology and Psychiatry, 47, 744–​750. Schipper, E., Mahdi, S., Coghill, D., Vries, P. J., Gau, S. S., Granlund, M., . . . Bölte, S. (2015). Toward an ICF core set for ADHD: A worldwide expert survey on ability and disability. European Child and Adolescent Psychiatry, 24, 1509–​1521. Sciberras, E., Efron, D., Schilpzand, E. J., Anderson, V., Jongeling, B., Hazell, P., . . . Nicholson, J. M. (2013). The Children’s Attention Project: A community-​based longitudinal study of children with ADHD and non-​ ADHD controls. BMC Psychiatry, 13, 18–​29. Shaffer, D., Fisher, P., Lucas, C. P., Dulcan, M. K., & Schwab-​ Stone, M. E. (2000). NIMH Diagnostic Interview Schedule for Children Version IV (NIMH DISC-​ IV):  Description, differences from previous versions, and reliability of some common diagnoses. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 28–​38. Shapiro, E. S. (2011). Behavior observations of students in schools. In E. S. Shapiro (Ed.), Academic skills problems fourth edition workbook (pp. 35–​56). New  York, NY: Guilford. Shemmassian, S. K., & Lee, S. S. (2016). Predictive utility of four methods of incorporating parent and teacher

symptom ratings of ADHD for longitudinal outcomes. Journal of Clinical Child and Adolescent Psychology, 45, 176–​187. Solanto, M. V., & Alvir, J. (2009). Reliability of DSM-​IV symptom ratings of ADHD:  Implications for DSM-​V. Journal of Attention Disorders, 13, 107–​116. Solanto, M. V., Wasserstein, J., Marks, D. J., & Mitchell, K. J. (2012). Diagnosis of ADHD in adults: What is the appropriate DSM-​ 5 symptom threshold for hyperactivity–​ impulsivity? Journal of Attention Disorders, 16, 631–​634. Sollman, M. J., Ranseen, J. D., & Berry, D. T.  R. (2010). Detection of feigned ADHD in college students. Psychological Assessment, 22, 325–​335. Sonuga-​Barke, E. J. S., Cortese, S., Fairchild, G., & Stringaris, A. (2016). Annual research review: Transdiagnostic neuroscience of child and adolescent mental disorders—​ Differentiating decision making in attention-​ deficit/​ hyperactivity disorder, conduct disorder, depression, and anxiety. Journal of Child Psychology and Psychiatry, 57, 321–​349. Sparrow, S. S., Cicchetti, D. V., & Bala, D. A. (2005). Vineland Adaptive Behavior Scales, Second Edition. Circle Pines, MN: American Guidance Service. Sparrow, S. S., Cicchetti, D. V., & Saulnier, C. A. (2016). Vineland Adaptive Behavior Scales, Third Edition (Vineland-​3). Toronto, Ontario, Canada: Pearson. Spencer, T. J., Adler, L. A., McGough, J. J., Muniz, R., Jiang, H., & Pestreich, L. (2007). Efficacy and safety of dexmethylphenidate extended-​ release capsules in adults with attention-​deficit/​hyperactivity disorder. Biological Psychiatry, 61, 1380–​1387. Stein, M. A., Sikirica, V., Weiss, M. D., Robertson, B., Lyne, A., & Newcorn, J. H. (2015). Does guanfacine extended release impact functional impairment in children with attention-​deficit/​hyperactivity disorder? Results from a randomized controlled trial. CNS Drugs, 29, 953–​962. Swanson, J. M. (1992). School based assessments and interventions for ADD students. Irvine, CA: K.C. Timmons-​Mitchell, J., Bender, M. B., Kishna, M. A., & Mitchell, C. C. (2006). An independent effectiveness trail of multisystemic therapy with juvenile justice youth. Journal of Clinical Child and Adolescent Psychology, 35, 227–​236. Toplak, M. E., Pitch, A., Flora, D. B., Iwenofu, L., Ghelani, K., Jain, U., & Tannock, R. (2009). The unity and diversity of inattention and hyperactivity/​impulsivity in ADHD:  Evidence for a general factor with separable dimensions. Journal of Abnormal Child Psychology, 37, 1137–​1150. Turgay, A., Goodman, D. W., Asherson, P., Lasser, R. A., Babcock, T. F., Pucci, M. L., & Barkley, R. (2012). Lifespan persistence of ADHD:  The Life Transition Model and its application. Journal of Clinical Psychiatry, 73, 192–​201.

Vander Stoep, A., McCarty, C. A., Zhou, C., Rockhill, C. M., Schoenfelder, E. N., & Myers, K. (2017). The Children’s Attention-​ Deficit Hyperactivity Disorder Telemental Health Treatment Study:  Caregiver outcomes. Journal of Abnormal Child Psychology, 45, 27–​43. Vaughn, A. J., & Hoza, B. (2013). The incremental utility of behavioral rating scales and a structured diagnostic interview in the assessment of attention-​deficit/​hyperactivity disorder. Journal of Emotional and Behavioral Disorders, 21, 227–​239. Vélez-​Pastrana, M. C., González, R. A., Rodríguez Cardona, J., Purcell Baerga, P., Alicea Rodríguez, Á., & Levin, F. R. (2016). Psychometric properties of the Barkley Deficits in Executive Functioning Scale:  A Spanish-​ language version in a community sample of Puerto Rican adults. Psychological Assessment, 28, 483–​498. Wang, L., Wu, C., Lee, S., & Tsai, Y. (2014). Salivary neurosteroid levels and behavioural profiles of children with attention-​deficit/​hyperactivity disorder during six months of methylphenidate treatment. Journal of Child and Adolescent Psychopharmacology, 24, 336–​340. Ware, A. L., Glass, L., Crocker, N., Deweese, B. N., Coles, C. D., Kable, J. A., . . . Mattson, S. N. (2014). Effects of prenatal alcohol exposure and attention-​deficit/​hyperactivity disorder on adaptive functioning. Alcoholism: Clinical and Experimental Research, 38, 1439–​1447. Waschbusch, D. A., & Willoughby, M. T. (2008). Parent and teacher ratings on the Iowa Conners Rating Scale. Journal of Psychopathology and Behavioral Assessment, 30, 180–​192. Waxmonsky, J. G., Waschbusch, D. A., Pelham, W. E., Draganac-​Cardona, L., Rotella, B., & Ryan, L. (2010). Effects of atomoxetine with and without behavior therapy on the school and home functioning of children with attention-​deficit/​hyperactivity disorder. Journal of Clinical Psychiatry, 71, 1535–​1551. Weiss, M., Tannock, R., Kratochvil, C., Dunn, D., Velex-​ Borras, J., Thomason, C.,  .  .  .  Allen, A. J. (2005). A

randomized, placebo-​controlled study of once-​daily atomoxetine in the school setting in children with ADHD. Journal of the American Academy of Child & Adolescent Psychiatry, 44, 647–​655. Wigal, S. B., Greenhill, L. L., Nordbrock, E., Connor, D. F., Kollins, S. H., Adjei, A.,  .  .  .  Kupper, R. J. (2014). A randomized placebo-​controlled double-​blind study evaluating the time course of response to methylphenidate hydrochloride extended-​ release capsules in children with attention-​ deficit/​ hyperactivity disorder. Journal of Child and Adolescent Psychopharmacology, 24, 562–​569. Wigal, S. B., Gupta, S., Guinta, D., & Swanson, J. (1998). Reliability and validity of the SKAMP rating scale in a laboratory school setting. Pharmacological Bulletin, 34, 47–​53. Willcutt, E. G., Nigg, J. T., Pennington, B. F., Solanto, M. V., Rohde, L. A., Tannock, R., . . . Lahey, B. B. (2012). Validity of DSM-​IV attention deficit/​hyperactivity disorder symptom dimensions and subtypes. Journal of Abnormal Psychology, 121, 991–​1010. Wolraich, M. L., Bard, D. E., Neas, B., Doffing, M., & Beck, L. (2013). The psychometric properties of the Vanderbilt Attention-​ Deficit Hyperactivity Disorder Diagnostic Teacher Rating Scale in a community population. Journal of Developmental and Behavioral Pediatrics, 34, 83–​93. Wolraich, M. L., Lambert, E. W., Baumgaertal, A., Garcia-​ Torner, S., Feurer, I. D., Bickman, L., & Doffing, M. A. (1998). Teachers’ screening for attention deficit/​hyperactivity disorder: Comparing multinational samples on teacher ratings of ADHD. Journal of Abnormal Child Psychology, 31, 445–​455. Wolraich, M. L., Lambert, E. W., Doffing, M. A., Bickman, L., Simmons, T., & Worley, K. (2003). Psychometric properties of the Vanderbilt ADHD Diagnostic Parent Rating Scale in a referred population. Journal of Pediatric Psychology, 28, 559–​568.

5

Child and Adolescent Conduct Problems

Paul J. Frick
Robert J. McMahon

Conduct problems (CP) in youth are one of the most common reasons that children and adolescents are referred to mental health clinics (Kimonis, Frick, & McMahon, 2014). This is not surprising given that CP often causes significant disruptions for the child at home and school, and it is the form of psychopathology that has been most strongly associated with delinquency and violence (Odgers et al., 2007). An extensive body of research has led to an increased understanding of the many processes that may be involved in the development of severe CP (Frick & Viding, 2009). This research has many important implications for designing more effective interventions to prevent or treat these problems (Conduct Problems Prevention Research Group, 2000; Frick, 2012) and for improving the methods for assessing children and adolescents with severe CP (McMahon & Frick, 2005). The focus of this chapter is on the implications for assessment.

In the next section, we provide a brief overview of several key findings from research on CP in children and adolescents and highlight several findings that we believe have the most direct relevance to the assessment process. Specifically, we focus on research illustrating the great heterogeneity in the types, severity, and course of CP in youth, as well as the frequent co-occurring problems in adjustment that often accompany CP. We also summarize research showing important dispositional and contextual risk factors that have been related to CP and that could play an important role in the development or maintenance of CP. We then review some recent causal models that have been proposed to explain how these many risk factors could affect the development of the child and lead to CP. After the brief overview of these select but critical areas of research, we then focus on the implications of this research for three types of assessments that are often conducted for children with CP. First, we focus on methods

for determining whether the level of CP is severe, impairing, and developmentally inappropriate enough to be considered “disordered” and in need of treatment. Second, we focus on assessments that can be used for developing case conceptualizations, which can guide comprehensive and individualized treatment plans for children with CP. Using comprehensive interventions that rely on multiple components tailored to the child’s individual needs has proven to be most effective for treating children and adolescents with CP (Conduct Problems Prevention Research Group, 2000; Frick, 2012). Third, we focus on measures that can be used to monitor and evaluate treatment progress and outcomes. Unfortunately, the availability of measures for this crucial assessment purpose is quite limited. After summarizing research on CP and its implications for assessment, we conclude this chapter with a section highlighting some overriding issues related to assessing children with CP, such as the need to assess children with multiple measures that provide information on their adjustment in multiple contexts. We also provide a summary of some of the major limitations in the existing assessment technology and make recommendations for future work to overcome these limitations.

THE NATURE OF CP

Types and Severity of CP and Common Co-Occurring Conditions

CP constitutes a broad spectrum of "acting-out" behaviors, ranging from relatively minor oppositional behaviors such as yelling and temper tantrums to more serious forms of antisocial behavior such as physical destructiveness, stealing, and physical violence. There have been numerous

methods used to divide CP into more discrete and homogeneous types of behaviors (for comprehensive reviews, see Frick & Marsee, 2006; Kimonis, Frick, & McMahon, 2014). For example, the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​5; American Psychiatric Association [APA], 2013)  includes CP in the category of disruptive, impulse control, and conduct disorders. The DSM-​ 5 makes a distinction between the categories of oppositional defiant disorder (ODD) and conduct disorder (CD). ODD is a pattern of angry/​irritable (e.g., often loses temper), argumentative/​ defiant (e.g., defying or not complying with grown-​ups’ rules or requests), and vindictive (e.g., has been spiteful or vindictive) behaviors. CD consists of more severe antisocial and aggressive behavior that involves serious violations of others’ rights or deviations from major age-​ appropriate norms. The behaviors are categorized into four groups:  aggressiveness to people and animals (e.g., bullying and fighting), property destruction (e.g., fire-​setting and other destruction of property), deceptiveness or theft (e.g., breaking and entering, stealing without confronting victim), and serious rule violations (e.g., running away from home or being truant from school before age 13 years). In addition to this division in the DSM-​5, factor analyses have resulted in another method for differentiating among types of CP. In a meta-​analysis of more than 60 published factor analyses, Frick et  al. (1993) found that CP could be described by two bipolar dimensions. The first dimension was an overt–​covert dimension. The overt pole consisted of directly confrontational behaviors such as oppositional defiant behaviors and aggression. In contrast, the covert pole consisted of behaviors that were nonconfrontational in nature (e.g., stealing and lying; see also Tiet, Wasserman, Loeber, Larken, & Miller, 2001; Willoughby, Kupersmidt, & Bryant, 2001). The second dimension divided the overt behaviors into those that were overt-​destructive (aggression) and those that were overt-​ nondestructive (oppositional), and it divided the covert behaviors into those that were covert-​destructive (property violations) and those that were covert-​nondestructive (status offenses; i.e., those behaviors that are illegal because of the child’s or adolescent’s age). One way in which this clustering of CP is useful is that the four symptom patterns are fairly consistent with the distinctions made in many legal systems for differentiating types of delinquent behaviors, which generally distinguish between violent offenses (overt-​destructive), status offenses (covert-​nondestructive), and property offenses (covert-​destructive; e.g., Office of Juvenile Justice and Delinquency Prevention, 1995).
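The two-dimensional scheme just described can be summarized in a compact form. The sketch below is purely an expository device and is not part of any published instrument; the quadrant labels and example behaviors are drawn from the research reviewed above, and the dictionary and function names are illustrative assumptions of our own.

```python
# Expository summary of the two bipolar dimensions described above
# (overt vs. covert; destructive vs. nondestructive). The quadrant labels
# and example behaviors come from the research reviewed in the text;
# the data structure and function are illustrative only.

CP_QUADRANTS = {
    ("overt", "destructive"): ("aggression", ["fighting", "bullying"]),
    ("overt", "nondestructive"): ("oppositional behavior",
                                  ["temper tantrums", "arguing", "defiance"]),
    ("covert", "destructive"): ("property violations",
                                ["stealing", "fire-setting", "destruction of property"]),
    ("covert", "nondestructive"): ("status offenses",
                                   ["truancy", "running away from home"]),
}

def classify_cp(confrontational, destructive):
    """Return the quadrant label implied by the two dimensions."""
    key = ("overt" if confrontational else "covert",
           "destructive" if destructive else "nondestructive")
    label, _examples = CP_QUADRANTS[key]
    return label

print(classify_cp(confrontational=False, destructive=True))  # property violations
```

For instance, a nonconfrontational but destructive behavior such as stealing falls in the property violations quadrant, mirroring the distinction many legal systems draw between property offenses and status offenses.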

Two specific forms of CP—​ noncompliance and aggression—​deserve additional attention. Noncompliance (i.e., excessive disobedience to adults) appears to be important as one of the earliest predictors of the development of CP, and it seems to play an important role in many of the subsequent academic and social problems exhibited by children with CP (Chamberlain & Patterson, 1995; McMahon & Forehand, 2003). Most important, however, research has shown that when child noncompliance is improved as a result of intervention, there is often concomitant improvement in other CP behaviors and a subsequent reduction in later risk for CP (e.g., Russo, Cataldo, & Cushing, 1981; Wells, Forehand, & Griest, 1980). There is also evidence that aggression is an important dimension of CP (Burt, 2013). By its very nature, aggression results in harm to another child (Crick & Dodge, 1996). Furthermore, research has consistently shown that aggressive behavior in children and adolescents is often quite stable after the preschool years (Broidy et  al., 2003). Importantly, research has found that there appears to be several different forms of aggressive behavior (Crick & Dodge, 1996; Poulin & Boivin, 2000). The first type of aggression is often referred to as retaliatory aggression, hostile aggression, or reactive aggression, in which aggression is viewed as a defensive reaction to a perceived threat and is characterized by anger and hostility (Little, Jones, Henrich, & Hawley, 2003). The second type of aggressive behavior is generally unprovoked and is used for personal gain (instrumental) or to influence and coerce others (bullying and dominance). This type of aggressive behavior is referred to as instrumental aggression, premeditated aggression, or proactive aggression (Poulin & Boivin, 2000). Importantly, although these different types of aggression are often correlated (e.g., correlations ranging from r = .40 to .70 in school-​aged samples; Little et al., 2003), studies have consistently documented different correlates to the two forms of aggression (for reviews, see Dodge & Pettit, 2003; Marsee & Frick, 2010). For example, reactive but not proactive aggression has been consistently linked to a tendency to misinterpret ambiguous behaviors as hostile provocation (Crick & Dodge, 1996; Hubbard, Dodge, Cillessen, Coie, & Schwartz, 2001) and to poorly regulated responses to emotional stimuli (Marsee & Frick, 2007; Vitaro, Brengden, & Tremblay, 2002). In contrast, proactive but not reactive aggression has been associated with the tendency to view aggression as an effective means to reach goals (Crick & Dodge, 1996) and with reduced levels of emotional reactivity (i.e., skin conductance and

heart rate acceleration; Hubbard et al., 2002; Muñoz, Frick, Kimonis, & Aucoin, 2008). In addition to proactive and reactive forms of aggression, both of which are overt in nature, several researchers have identified a form of indirect aggression, called relational aggression, that involves strategies that attempt to harm another child through harming his or her social relationships (Marsee & Frick, 2010). These behaviors include excluding a child from groups, rumor spreading, and friendship manipulation. Several studies have shown that when girls behave aggressively, they are more likely to use relational aggression than overt aggression (e.g., Crapanzano, Frick, & Terranova, 2010; Marsee et al., 2014). Furthermore, research has suggested that it may be possible to divide relational aggression into instrumental and reactive forms, similar to overt aggression (Little et al., 2003; Marsee et al., 2014). Importantly, children who show relational aggression show many of the same social (e.g., peer rejection) and dispositional (e.g., impulsivity and callousness) risk factors as physically aggressive youth (Marsee et al., 2014).

Epidemiology of CP

A meta-analysis of epidemiological studies estimated that the worldwide prevalence of ODD among children and adolescents ages 6 to 18 years is 3.3% and the prevalence of CD is 3.2% (Canino, Polanczyk, Bauermeister, Rohde, & Frick, 2010). These prevalence estimates did not vary significantly across countries or continents, although the vast majority of studies included in the meta-analysis were conducted in North America and Europe. There is, however, evidence for differences in prevalence rates of CP for children of different ages. The level of CP tends to decrease from the preschool to school-age years (Maughan, Rowe, Messer, Goodman, & Meltzer, 2004) and increase again in adolescence (Loeber, Burke, Lahey, Winters, & Zera, 2000). For example, Loeber et al. reported prevalence rates for CD of 5.6%, 5.4%, and 8.3% for boys aged 7, 11, and 13 years, respectively, and prevalence rates for ODD of 2.2%, 4.8%, and 5.0% for boys of the same age in a sample of 1,517 youth in a large urban area. However, the increase in the prevalence of CP from childhood to adolescence may not be consistent for all types of CP. Specifically, there is evidence that mild forms of physical aggression (e.g., fighting) show a decrease in prevalence rates across development, whereas nonaggressive and covert forms of antisocial behavior (e.g., lying and stealing) and serious aggression (e.g., armed robbery and sexual assault) show

an increase in prevalence rates from childhood to adolescence (Loeber & Hay, 1997). There also appear to be sex differences in the prevalence of CP. Overall estimates of the sex ratio for boys and girls with CP range from 2:1 to 4:1 (Loeber et al., 2000). However, this overall ratio hides several important developmental differences. Specifically, there are few sex differences between boys and girls in the prevalence rates of most types of CP prior to age 5 years (Maughan et al., 2004). However, after age 4 years the rate of girls' CP decreases, whereas the rate of CP for boys either increases or stays at the same rate, leading to a male predominance of CP throughout much of childhood (Loeber et al., 2000). Numerous studies have also noted that the sex ratio between girls and boys with CP narrows dramatically from approximately 4:1 in childhood to approximately 2:1 in adolescence due to an increase in the number of girls engaging in CP in adolescence (for a review, see Silverthorn & Frick, 1999).

CP and Co-Occurring Problems in Adjustment

A consistent finding in research with children who show CP is that they often have a number of problems in adjustment, in addition to their CP, and these problems are critical to address in assessment and intervention. Attention-deficit/hyperactivity disorder (ADHD) is one of the most common comorbid conditions associated with CP. In a meta-analytic study, Waschbusch (2002) reported that 36% of boys and 57% of girls with CP had comorbid ADHD. Importantly, this review also suggested that the presence of ADHD often signals the presence of a more severe and more chronic form of CP in children. Internalizing disorders, such as depression and anxiety, also co-occur with CP at rates higher than expected by chance (Zoccolillo, 1992). In most cases, CP precedes the onset of depressive and anxiety symptoms, and these symptoms are often viewed as consequences of the many adjustment problems experienced by a child with CP (Frick, Lilienfeld, Ellis, Loney, & Silverthorn, 1999; Loeber & Keenan, 1994). In addition, children who present with the angry/irritable mood symptoms of ODD are more likely to develop internalizing types of difficulties (e.g., Burke, Hipwell, & Loeber, 2010; Rowe, Costello, Angold, Copeland, & Maughan, 2010; Stringaris & Goodman, 2009). CP is also related to substance use (e.g., Hawkins, Catalano, & Miller, 1992). The comorbidity between CP and substance abuse is important because when youths with CP also abuse substances, they tend to show an early onset of substance use and they are more

likely to abuse multiple substances (Lynskey & Fergusson, 1995). With preschool-aged children, language impairment may be associated with CP (Wakschlag & Danis, 2004), and in older children, CP is often associated with academic achievement below a level predicted by their intellectual level (Hinshaw, 1992).

Multiple Risks Associated with CP

Most researchers agree that CP is the result of a complex interaction of multiple causal factors (Kimonis, Frick, & McMahon, 2014). These factors can be summarized in five categories: biological factors, cognitive correlates, family context, peer context, and the broader social ecology (e.g., neighborhood and community). Although a number of biological correlates (e.g., neurochemical and autonomic irregularities) of CP have been identified and are likely important for causal theories (Frick & Viding, 2009), they are not reviewed here because the current state of knowledge is not sufficiently developed to have clear implications for assessment. In contrast, there are several aspects of the youth's cognitive and learning styles that have been associated with CP that may be important to the assessment process. First, compared to others, youths with CP tend to score lower on intelligence tests, especially in the area of verbal intelligence (Loney, Frick, Ellis, & McCoy, 1998; Moffitt, 2006). Furthermore, these scores are predictive of the persistence of CP and engagement in delinquent behaviors during adolescence (Frick & Loney, 1999). Second, many children and adolescents with CP tend to show a learning style that is more sensitive to rewards than punishments. This has been labeled as a reward-dominant response style, and it could explain why many of these youths persist in their maladaptive behaviors, despite the threat of serious potential consequences (Frick et al., 2003; O'Brien & Frick, 1996). Third, many youths with CP show a variety of deficits in their social cognition—that is, the way they interpret social cues and use them to respond in social situations (Crick & Dodge, 1994; Webster-Stratton & Lindsay, 1999). For example, children and adolescents with CP have been shown to have deficits in encoding social cues (e.g., lack of attention to relevant social cues), to make more hostile attributional biases and errors in the interpretation of social cues, to have deficient quantity and quality of generated solutions to social conflict, and to evaluate aggressive solutions more positively (Dodge & Pettit, 2003). The critical role of parenting practices in the development and maintenance of CP has been well

established (e.g., Chamberlain & Patterson, 1995; Loeber & Stouthamer-Loeber, 1986). Types of parenting practices that have been closely associated with the development of CP include inconsistent discipline, irritable explosive discipline, poor supervision, lack of parental involvement, and rigid discipline (Chamberlain, Reid, Ray, Capaldi, & Fisher, 1997). In addition to parenting practices, various other risk factors that may have an impact on the family and may serve to precipitate or maintain CP have been identified. These familial factors include parental social cognitions (e.g., perceptions of the child), parental personal and marital adjustment (e.g., depression, ADHD, antisocial behavior, substance abuse), and parental stress (McMahon & Estes, 1997; McMahon & Frick, 2005). Research suggests that the child's relationship with peers can also play a significant role in the development, maintenance, and escalation of CP. Research has documented a relationship between peer rejection in elementary school and the later development of CP (Chen, Drabick, & Burgers, 2015). In addition, peer rejection in elementary school is predictive of an association with a deviant peer group (i.e., one that shows a high rate of antisocial behavior and substance abuse) in early adolescence (Chen et al., 2015). This relationship is important because association with a deviant peer group leads to an increase in the frequency and severity of CP (Patterson & Dishion, 1985), and it has proven to be a strong predictor of later delinquency (Monahan, Steinberg, Cauffman, & Mulvey, 2009) and substance abuse (Dishion, Capaldi, Spracklen, & Li, 1995; Fergusson, Swain, & Horwood, 2002). Finally, there are factors within the youth's larger social ecology that have been associated with CP. One of the most consistently documented of these correlates has been low socioeconomic status (SES; Frick, Lahey, Hartdagen, & Hynd, 1989). However, several other ecological factors, many of which are related to low SES, such as poor housing, poor schools, and disadvantaged neighborhoods, have also been linked to the development of CP (Ray, Thornton, Frick, Steinberg, & Cauffman, 2016). In addition, the high rate of violence witnessed by youths who live in impoverished inner-city neighborhoods has also been associated with CP (Howard, Kimonis, Munoz, & Frick, 2012; Oberth, Zheng, & McMahon, 2017).

Causal Theories of CP

Although there is general agreement that CP in children and adolescents is associated with multiple risk factors, there is less agreement as to how these risk factors play

a role in the development of CP. Also, in addition to accounting for the large number of risk factors, causal theories of CP need to consider research suggesting that there may be many different causal pathways through which youth develop these behaviors, each involving a different constellation of risk factors and each involving somewhat different causal processes (Frick & Viding, 2009). The most widely accepted model for delineating distinct pathways in the development of CP distinguishes between childhood-​onset and adolescent-​onset subtypes of CP. That is, the DSM-​5 (APA, 2013) makes the distinction between youths who begin showing CP before age 10 years (i.e., childhood onset) and those who do not show CP before age 10 years (i.e., adolescent onset). This distinction is supported by a substantial amount of research documenting important differences between these two groups of youths with CP (for reviews, see Fairchild, van Goozen, Calder, & Goodyer, 2013; Frick & Viding, 2009; Moffitt, 2006). Specifically, youths in the childhood-​ onset group show more serious aggression in childhood and adolescence and are more likely to continue to show antisocial and criminal behavior into adulthood (Odgers et al., 2007). More relevant to causal theory, many of the dispositional (e.g., temperamental risk and low intelligence) and contextual (e.g., family dysfunction) correlates that have been associated with CP are more strongly associated with the childhood-​onset subtype. In contrast, the youths in the adolescent-​onset subtype show lower rates of these same risk factors. If they do differ from other youths, it seems primarily to be in showing greater affiliation with delinquent peers and scoring higher on measures of rebelliousness and authority conflict (Dandreaux & Frick, 2009; Moffitt & Caspi, 2001; Moffitt, Caspi, Dickson, Silva, & Stanton, 1996). The different characteristics of youths in the two subtypes of CP have led to theoretical models that propose very different causal mechanisms operating across the two groups. For example, Moffitt (2006) has proposed that youths in the childhood-​onset group develop CP behavior through a transactional process involving a difficult and vulnerable child (e.g., impulsive, with verbal deficits, and with a difficult temperament) who experiences an inadequate rearing environment (e.g., poor parental supervision and poor-​quality schools). This dysfunctional transactional process disrupts the child’s socialization, leading to poor social relations with persons both inside (i.e., parents and siblings) and outside (i.e., peers and teachers) the family, which further disrupts the child’s socialization. These disruptions lead to enduring vulnerabilities that can negatively affect the child’s psychosocial

adjustment across multiple developmental stages. In contrast, Moffitt views youths in the adolescent-​onset pathway as showing an exaggeration of the normative developmental process of identity formation that takes place in adolescence. Their engagement in antisocial and delinquent behaviors is conceptualized as a misguided attempt to obtain a subjective sense of maturity and adult status in a way that is maladaptive (e.g., breaking societal norms) but encouraged by an antisocial peer group. Given that their behavior is viewed as an exaggeration of a process specific to the adolescent developmental stage and not due to enduring vulnerabilities, their CP is less likely to persist beyond adolescence. However, they may still have impairments that persist into adulthood due to the consequences of their CP (e.g., a criminal record, dropping out of school, and substance abuse; Moffitt & Caspi, 2001). This distinction between childhood-​ onset and adolescent-​onset trajectories to severe CP has been very influential for delineating different pathways through which youths may develop CP, although it is important to note that the applicability of this model to girls requires further testing (Fairchild et al., 2013; Silverthorn & Frick, 1999). Furthermore, several authors have argued that the distinction should be considered more quantitative than qualitative (Fairchild et al., 2013; Lahey et al., 2000). That is, a review by Fairchild et al. (2013) supports the contention that dispositional factors play a greater role in CP when the onset is earlier. However, their review suggested that this effect continues into adolescence. Furthermore, this review noted that although the childhood-​onset pathway tended to show a more chronic course across the lifespan, there was still substantial variability in the outcomes within each pathway. The authors concluded that the timing and severity of exposure to environmental adversity in vulnerable individuals seem to account for the differences in age of onset and differences in outcome. Researchers have also begun extending this conceptualization in a number of important ways. For example, research has identified a subgroup of youths (approximately 25%–​30%) within the childhood-​onset pathway who show high rates of callous and unemotional (CU) traits (e.g., lacking empathy and guilt) (Kahn, Frick, Youngstrom, Findling, & Youngstrom, 2012). Despite designating only a minority of children in the childhood-​ onset pathway, the subgroup is important for a number of reasons. First, youth with CP who also show significant levels of CU show a more stable pattern of behavior problems and more severe aggression that results in greater harm to their victims (Frick, Ray, Thornton, & Kahn, 2014a). In addition to showing more severe aggression,

youth with elevated CU traits display more instrumental (i.e., for personal gain or dominance) and premeditated aggression compared to other children and adolescents with severe CP (Frick, Cornell, Barry, Bodin, & Dane, 2003; Lawing, Frick, & Cruise, 2010). Second, research suggests that CU traits define a group of youth with serious CP who show very different genetic, cognitive, emotional, and social characteristics from those of other children and adolescents with serious CP (Frick et al., 2014a). Third, treatment outcome studies suggest that children with CP who are high on CU traits show a poorer response to many types of treatment compared to other children with CP (Frick et al., 2014a; Hawes, Price, & Dadds, 2014). To briefly summarize some of the key findings from this research, children and adolescents with CP and CU traits (compared to other youth with CP) show an insensitivity to punishment cues, which includes responding more poorly to punishment cues after a reward-​dominant response set is primed, responding more poorly to gradual punishment schedules, underestimating the likelihood that they will be punished for misbehavior, and being less sensitive to potential punishment when peers are present relative to other youth with serious CP (Blair, Colledge, & Mitchell, 2001; Frick, Cornell, Bodin, et  al., 2003; Muñoz-​Centifanti & Modecki, 2013; Pardini, Lochman, & Frick, 2003). Children and adolescents with CP and elevated CU traits also show reduced emotional responsiveness in a number of situations, including showing weaker responses to cues of distress in others, less reactivity to peer provocation, less fear to novel and dangerous situations, and less anxiety over the consequences of their behavior relative to other youth with serious CP (Fanti, Panayiotou, Lazarou, Michael, & Georgiou, 2016; Kimonis et al., 2008; Munoz, Frick, Kimonis, & Aucoin, 2008; Viding et  al., 2012). Finally, CP tends to have a different association with parenting practices depending on whether or not the child or adolescent shows elevated levels of CU traits. Specifically, harsh, inconsistent, and coercive discipline is more strongly associated with CP in youths with normative levels of CU traits relative to youths with elevated CU traits, whereas low warmth in parenting appears to be more highly associated with CP in youths with elevated CU traits (Pasalich, Dadds, Hawes, & Brennan, 2012; Pasalich et al., 2016; Wootton, Frick, Shelton, & Silverthorn, 1997). The research on the different characteristics of children with CP depending on their level of CU traits has led to a number of theories to account for these differences by hypothesizing different causal processes underlying the CP in children with and without elevated CU traits. For

example, Frick, Ray, Thornton, and Kahn (2014b) have proposed that children with CP and elevated CU traits have a temperament (i.e., fearless, insensitive to punishment, and low responsiveness to cues of distress in others) that can interfere with the normal development of conscience and place these children at risk for a particularly severe and aggressive pattern of antisocial behavior. In contrast, children and adolescents with childhood-​onset CP who have normative levels of CU traits display higher levels of emotional reactivity to distress in others and to provocation from others. Furthermore, the CP in this group is strongly associated with hostile/​coercive parenting. Based on these findings, it appears that children in this group show a temperament characterized by strong emotional reactivity combined with inadequate socialization experiences that lead to a failure to develop the skills needed to adequately regulate their emotional reactivity (Frick & Morris, 2004). The resulting problems in emotional regulation can result in the child committing impulsive and unplanned aggressive and antisocial acts, for which he or she may feel remorseful afterwards but which he or she may still have difficulty controlling in the future. Based on this research supporting both the clinical and the etiological importance of the presence of elevated CU traits, the DSM-​5 (APA, 2013)  included a specifier to the diagnosis of CD to designate those youths with CP who also show elevated rates of CU traits. The specifier of “with limited prosocial emotions” (LPE) is given if the individual (a) meets criteria for CD and (b) shows two or more of the following CU traits persistently over 12 months in more than one relationship or setting: lack of remorse or guilt; callous—​lack of empathy; unconcern about performance at school, work, or in other important activities; and shallow or deficient affect.
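Because the LPE specifier reduces to a counting rule, it can be illustrated concisely. The following sketch assumes that each CU trait has been documented with the settings in which it was observed and how long it has persisted; the data structure and function are hypothetical illustrations of the rule as summarized above, not an official DSM-5 scoring algorithm and not a substitute for clinical judgment.

```python
# Illustrative sketch of the DSM-5 "with limited prosocial emotions" (LPE)
# specifier rule summarized above. The data structure and function are
# hypothetical; this is not an official scoring algorithm or a diagnostic tool.

CU_TRAITS = (
    "lack of remorse or guilt",
    "callousness (lack of empathy)",
    "unconcern about performance",
    "shallow or deficient affect",
)

def meets_lpe_specifier(meets_cd_criteria, trait_reports):
    """trait_reports maps each CU trait to a list of (setting, months_observed)
    entries gathered during the assessment (e.g., from parent and teacher reports)."""
    if not meets_cd_criteria:  # the specifier applies only when CD criteria are met
        return False
    persistent_traits = 0
    for trait in CU_TRAITS:
        # A trait counts only if it has persisted for at least 12 months
        # and has been shown in more than one relationship or setting.
        settings = {setting for setting, months in trait_reports.get(trait, [])
                    if months >= 12}
        if len(settings) >= 2:
            persistent_traits += 1
    # Two or more qualifying CU traits are required for the specifier.
    return persistent_traits >= 2

# Example: CD criteria met; remorse deficits and shallow affect documented
# at home and at school for more than a year.
example = {
    "lack of remorse or guilt": [("home", 14), ("school", 13)],
    "shallow or deficient affect": [("home", 18), ("school", 15)],
}
print(meets_lpe_specifier(True, example))  # True under this reading
```

Under this reading, a youth who meets criteria for CD and who has shown, for example, lack of remorse and shallow affect at both home and school for more than 12 months would receive the specifier.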

ASSESSMENT FOR DIAGNOSIS

When a child or adolescent with CP is referred for assessment, there are four primary goals for the assessment. First, it is important to determine whether or not the youth is, in fact, demonstrating significant levels of CP to rule out the possibility of the occasional inappropriate referral due to unrealistic parental or teacher expectations. Second, it is important to identify the type and severity of the youth’s CP and to determine the degree and types of impairment associated with them. Some level of CP is normative and, as noted previously, there can be quite a range of CP that varies greatly in terms of how severe and impairing the

behaviors are for the child. Assessing the level and severity of CP displayed by the child is critical to determine whether treatment is indicated and how intensive it needs to be. Third, given the high degree of comorbidity associated with CP, it is critical to at least screen for a wide variety of emotional, behavioral, social, and academic problems that can further influence the child’s adjustment. Fourth, given the large number of risk factors that can contribute to the development and maintenance of CP, and that could be important targets of intervention, it is critical to assess the many dispositional and contextual risk factors that research has linked to CP in children and adolescents. There are three primary assessment methods that can be used to accomplish these goals: behavior rating scales, structured diagnostic interviews, and behavioral observations. All of these methods have specific strengths and weaknesses that they bring to the assessment process, and we summarize these in the following sections. Table 5.1 lists some of the most commonly used empirically supported instruments for each method of assessment and

provides summary evaluations of their adequacy in terms of normative data, reliability, validity, generalizability, and clinical utility.

Behavior Rating Scales

Behavior rating scales are a core part of an assessment battery for assessing children and adolescents with CP. As noted in Table 5.1, a number of rating scales are commercially available, and they have a number of useful characteristics for meeting the goals outlined previously. First, most scales have subscales assessing different types of CP, and they can be completed by adults who observe the youth in important psychosocial contexts (i.e., parents and teachers) and by the youth himself or herself. By having multiple informants who see the child in different settings, this can provide important information on the pervasiveness of the child's behavior problems and can help detect potential biases in the report of any single informant. Most of the scales listed in Table 5.1 provide analogous content across the different raters. One notable

Table 5.1  Ratings of Instruments Used for Diagnosis

Instrument         Norms   Internal      Inter-Rater   Test–Retest   Content    Construct   Validity         Clinical
                           Consistency   Reliability   Reliability   Validity   Validity    Generalization   Utility

Rating Scales
  ASEBA            E       E             A             E             G          E           E                A
  BASC-3a          E       E             A             E             E          G           E                A
  CRS-3a           E       E             A             E             G          G           E                A
  ECBI/SESBI-R     G       E             A             G             E          E           G                A
  ECI-5/CASI-5a    G       A             A             A             E          G           G                A

Structured Interviews
  DICA             NA      NA            G             G             E          E           G                A
  DISC             NA      NA            G             G             E          E           G                A

Behavioral Observations
  BCS              NR      NA            A             NR            A          G           G                A
  DPICS            L       NA            A             L             A          G           E                A
  Compliance Test  L       E             E             A             A          G           A                A
  BASC-SOS         NA      NA            A             G             E          E           A                A
  ASEBA-DOF        NA      NA            G             G             E          E           A                A
  REDSOCS          L       NA            G             NR            A          A           A                A

Impairment Indices
  CAFAS            G       NA            G             G             E          E           G                G
  CGAS             A       NA            G             G             E          E           G                G

Highly Recommended (✓): two of ASEBA, BASC-3, and CRS-3; two of BCS, DPICS, and Compliance Test.

a Ratings for this instrument were made on the basis of research conducted with the previous version of the instrument.

Note: ASEBA = Achenbach System of Empirically Based Assessment; BASC-3 = Behavior Assessment System for Children, 3rd Edition; CRS-3 = Conners Rating Scales, 3rd Edition; ECBI = Eyberg Child Behavior Inventory; SESBI-R = Sutter–Eyberg Child Behavior Inventory-Revised; ECI-5 = Early Childhood Inventory-5; CASI-5 = Child & Adolescent Symptom Inventory-5; DICA = Diagnostic Interview for Children and Adolescents; DISC = Diagnostic Interview Schedule for Children; BCS = Behavioral Coding System; DPICS = Dyadic Parent–Child Interaction Coding System; BASC-SOS = BASC Student Observation System; ASEBA-DOF = ASEBA Direct Observation Form; REDSOCS = Revised Edition of the School Observation Coding System; CAFAS = Child and Adolescent Functional Assessment Scale; CGAS = Children's Global Assessment Scale; L = Less than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

exception is the Behavior Assessment System for Children, Third Edition (BASC-​3; Reynolds & Kamphaus, 2015). In this scale, the teacher and parent versions are fairly similar in content, with the main difference being that teachers also rate behaviors indicative of learning problems and study skills. The content of the self-​report version, however, is quite different. For example, the child does not rate his or her own level of CP but, instead, the self-​report version provides more extended coverage of the child’s attitudes (e.g., attitudes toward parents and teachers), his or her self-​concept (e.g., self-​esteem and sense of inadequacy), and his or her social relationships. Second, rating scales provide some of the best norm-​ referenced data on a child’s behavior. The most widely used rating scales (see Table 5.1) have large standardization samples that allow the child’s ratings to be compared to the ratings of other children of the same age and sex. This provides critical information to aid in determining whether the child’s behavior is abnormal, given the child’s age and sex. For example, the standardization sample for the Achenbach System of Empirically Based Assessment (ASEBA; Achenbach & Rescorla, 2000, 2001)  is representative of the 48 contiguous United States for SES, sex, ethnicity, region, and urban–​ suburban–​ rural residence (Achenbach & Rescorla, 2000, 2001). In addition, the factor structures of the various ASEBA instruments have been found to be comparable across multiple societies (e.g., Achenbach & Rescorla, 2007, 2010). Third, most rating scales contain additional subscales, over and above those assessing CP. These typically include scales assessing anxiety, depression, social problems, and family relationships. Thus, these rating scales can be very helpful in providing a broad screening of many of the most common co-​occurring problems that are often found in children with CP and many of the risk factors that can play a role in the development and maintenance of CP. However, rating scales can vary on how well they assess the various co-​occurring conditions. For example, the ASEBA does not include separate depression and anxiety scales, nor does it include a hyperactivity scale. A related issue has been a lack of correspondence of some of the scales to their DSM counterparts. However, the ASEBA (Achenbach, 2013) and the Conners Rating Scales (CRS-​ 3; Conners, 2008)  now include scoring algorithms for DSM-​5-​oriented scales. Also, the rating scales developed by Gadow and Sprafkin (2002), such as the Child Symptom Inventory (CSI-​4), Early Childhood Inventory-​ 5 (ECI-​ 5), and the Child and Adolescent Symptom Inventory-​5 (CASI-​5), were specifically developed to correspond to DSM criteria. With the exception

of the CSI-4, these DSM scales reflect the current DSM-5 classification. Although an advantage of these rating scales is the breadth of their coverage of multiple areas of child functioning, the cost is that they often have only minimal coverage of CP. There are, however, several rating scales that focus solely on CP and provide a more comprehensive coverage of various types of CP. For example, the Eyberg Child Behavior Inventory (ECBI) and Sutter–Eyberg Student Behavior Inventory-Revised (SESBI-R) (Eyberg & Pincus, 1999) are completed by parents and teachers, respectively. Both scales include 36 items describing specific CP behaviors and are scored on both a frequency-of-occurrence (Intensity) scale and a yes–no problem identification (Problem) scale. The inclusion of both frequency and problem ratings is very helpful in the diagnostic process to determine the level of impairment associated with the child's or adolescent's CP.

Interviews

The second major method used for the diagnosis of CP is interviews. Interviews can be divided into two general categories: unstructured clinical interviews and structured diagnostic interviews. The clinical interview with the parent is important in the assessment of CP for a number of reasons. In addition to providing a method for assessing the type, severity, and impairment associated with CP, the clinical interview with the parent helps assess typical parent–child interactions that may be contributing to the CP, the antecedent conditions that may make CP behaviors more likely to occur, and the consequences that accompany such behaviors and either increase or decrease the likelihood that CP will reoccur. A number of interview formats are available to aid the clinician in obtaining information from the parents about their child's behavior and parent–child interactions (e.g., Barkley, 2013; McMahon & Forehand, 2003; Patterson, Reid, Jones, & Conger, 1975; Wahler & Cormier, 1970). An individual interview with the child or adolescent may also be useful in providing the clinician with an opportunity to assess the child's perception of why he or she has been brought to the clinic and the child's subjective evaluation of his or her cognitive, affective, and behavioral characteristics (e.g., Bierman, 1983).

One criticism of the unstructured interview has been the difficulty in obtaining reliable information in this format. Structured interviews were developed in an attempt to improve the reliability of the information that is obtained. As listed in Table 5.1, two structured diagnostic


interviews that are frequently used in the assessment of children with CP are the Diagnostic Interview Schedule for Children (DISC-​IV; Shaffer, Fisher, Lucas, Dulcan, & Schwab-​Stone, 2000) and the Diagnostic Interview for Children and Adolescents (DICA; Reich, 2000). These and other similar interviews (for a review, see Loney & Frick, 2003) provide a structured format for obtaining parent and youth reports on the symptoms that constitute the criteria for ODD and CD according to DSM-​IV-​TR (APA, 2000). They are both currently being updated with the changes in criteria made in the new DSM-​5 (APA, 2013). Similar to behavior rating scales, these interviews provide very structured question-​ and-​ answer formats and, thus, often lead to very reliable scores. The questions are typically asked in a stem and follow-​up format. That is, a stem question is asked (e.g., “Does your child get into fights?”), and follow-​up questions are only asked if the stem question is answered affirmatively (e.g., “Is this only with his or her brothers and sisters?” and “Does he or she usually start these fights?”). Also similar to behavior rating scales, most structured interviews assess many other types of problems in adjustment, in addition to CP. Thus, they can be very helpful for providing an assessment of possible comorbid conditions that are often present in youth with CP. However, as noted in Table 5.1, unlike behavior rating scales, structured interviews often do not provide strong normative information on the child’s behavior. Instead, structured interviews typically focus on assessing how much CP and other problems in adjustment impair a child’s or adolescent’s social and academic functioning. Also, unlike behavior ratings scales, most interview schedules provide standard questions that assess the age at which a child’s behavioral difficulties began to emerge and how long they have caused problems for the child. Also, the assessment of age of onset of CP and other problems in adjustment allows for some estimate of the temporal ordering of a child’s problems, such as whether the child’s CP predated his or her emotional difficulties. Such information could help in determining whether the emotional distress is best conceptualized as being a result of the impairments caused by the CP. However, there are a number of limitations in the information provided by structured interviews (Frick, Barry, & Kamphaus, 2010). If the child has a number of problems, and many stem questions are answered affirmatively requiring the administration of extensive follow-​up questions, the interviews can be very lengthy. That is, their administration time can range from 45 minutes for youths with few problems to more than 2 hours


for youths with many problems in adjustment (Frick et al., 2010). Furthermore, most structured interviews do not have formats for obtaining teacher information, and obtaining reliable information from young children (younger than age 9 years) has been difficult with most structured interviews (Frick et al., 2010). Perhaps one of the most important limitations in the use of structured interviews, however, is evidence that the number of symptoms reported declines within an interview schedule. That is, parents and youths tend to report more symptoms for diagnoses assessed early in the interview, regardless of which diagnoses are assessed first (Jensen, Watanabe, & Richters, 1999; Piacentini et al., 1999). This finding calls into question the validity of diagnoses assessed later in the interview. Unfortunately, CP is often assessed last in most of the available interview schedules and, as a result, could be most influenced by this limitation.

Behavioral Observation

Behavioral observations provide a third common way of assessing CP behaviors. Behavioral observations in a child's or adolescent's natural setting (e.g., home, school, playground) can make an important contribution to the assessment process by providing an assessment of the youth's behavior that is not filtered through the perceptions of an informant and by providing an assessment of the immediate environmental context of the youth's behavior. For example, behavioral observations can indicate how others in the child's environment (e.g., parents, teachers, peers) respond to the child's CP; this could be very important for identifying factors that may be maintaining these behaviors. Two widely used, structured, microanalytic observation procedures available for assessing CP and parental responses to these behaviors in younger (3 to 8 years) children in the clinic and the home are the Behavioral Coding System (BCS; Forehand & McMahon, 1981) and the Dyadic Parent–Child Interaction Coding System (DPICS; Eyberg, Nelson, Ginn, Bhuiyan, & Boggs, 2013). The BCS and the DPICS are modifications of the assessment procedure developed by Hanf (1970) for the observation of parent–child interactions in the clinic. As employed in clinic settings, both the BCS and the DPICS place the parent–child dyad in standard situations that vary in the degree to which parental control is required, ranging from a free-play situation (i.e., Child's Game and Child-Directed Interaction) to one in which the parent directs the child's activity, either in the context of parent-directed play (i.e., Parent's Game and Parent-Directed


Interaction) or in cleaning up the toys (i.e., Clean Up). Each task typically lasts 5 to 10 minutes. In the home setting, observations usually occur in a less structured manner (e.g., the parent and child are instructed to “do whatever you would normally do together”). In each coding system, a variety of parent and child behaviors are scored, many of which emphasize parental antecedents (e.g., commands) or consequences (e.g., use of verbal hostility) to the child’s behavior. Both the BCS and the DPICS have been shown to differentiate clinic-​referred from nonreferred children (Eyberg et  al., 2013; Griest, Forehand, Wells, & McMahon, 1980). One of the main limitations of these observational systems is the very intensive training (e.g., 20 to 25 hours for the BCS) required of observers so that they reliably code the parent and child behaviors. This characteristic often limits the usefulness of these systems in many clinical settings (Frick et al., 2010). However, simplified versions of both the DPICS and the BCS have been developed to reduce training demands and may ultimately prove to be more useful to clinicians (Eyberg, Bessmer, Newcomb, Edwards, & Robinson, 1994; McMahon & Estes, 1994). For example, parental negative attention (coded from the simplified version of the BCS) during a structured child-​directed play task predicted higher levels of parent-​ reported CP concurrently and at a 6-​year follow-​up, supporting the predictive validity of this abbreviated coding system (Fleming, McMahon, & King, 2016). As noted previously, an important type of CP, especially in young children, is noncompliance. A  direct observational assessment of child noncompliance can also be obtained in the clinic with the Compliance Test (CT; Roberts & Powers, 1988). In the CT, the parent is instructed to give a series of 30 standard commands without helping or following up on the commands with other verbalizations or nonverbal cues. In one version of the CT, two-​part commands are given (e.g., “[Child’s name], put the [toy] in the [container].”). In another version, the commands are separated into two codeable units (e.g. “[Child’s name], pick up the [toy]. Put it in the [container].”). The CT takes between 5 and 15 minutes to complete. The CT has proven useful in identifying noncompliant preschool children in research and clinical settings (Roberts & Powers, 1990). Many common CP behaviors are by nature covert (e.g., lying, stealing, and fire-​setting), which makes them more difficult to capture through observational techniques. However, Hinshaw and colleagues have developed and evaluated an analogue observational procedure to assess stealing, property destruction, and cheating in children

ages 6 to 12  years (Hinshaw, Heller, & McHale, 1992; Hinshaw, Simmel, & Heller, 1995; Hinshaw, Zupan, Simmel, Nigg, & Melnick, 1997). Samples of boys (ages 6 to 12 years) with ADHD (most of whom also had ODD or CD) and a comparison group were asked to complete an academic worksheet alone in a room that contained a completed answer sheet, money, and toys. Stealing was measured by conducting a count of objects in the room immediately following the work session, whereas property destruction and cheating were assessed by ratings derived from observing the child’s behavior during the session. Each of these observational measures of covert CP was correlated with parental ratings of covert CP. Stealing and property destruction were also associated with staff ratings. There are also several behavioral observational systems that have been developed for use in school settings (Nock & Kurtz, 2005). For example, both the BCS (Forehand & McMahon, 1981) and the DPICS (Eyberg et  al., 2013)  have been modified for use in the classroom to assess child behavior (e.g., Breiner & Forehand, 1981; Jacobs et  al., 2000). Psychometric properties of the Revised Edition of the School Observation Coding System (REDSOCS; Jacobs et  al., 2000), which is the adaptation of the DPICS, have been reported with both clinic-​referred and nonreferred samples (Bagner, Boggs, & Eyberg, 2010; Jacobs et  al., 2000). For example, the REDSOCS discriminated between nonreferred children and children referred for school behavior problems (Jacobs et al., 2000). The BASC-​ 3 Student Observation System (SOS; Reynolds & Kamphaus, 2015)  provides a system for observing children’s behavior in the classroom using a momentary time-​sampling procedure. With the purchase of an application for a smartphone, tablet, or laptop, the observations can be entered directly into a digital database that can be integrated with the results of the parent and teacher ratings on the BASC-​3. The SOS specifies 65 behaviors that are common in classroom settings and includes both adaptive (e.g., “follows directions” and “returns material used in class”) and maladaptive (e.g., “fidgets in seat” and “teases others”) behaviors. The observation period in the classroom involves 15 minutes that is divided into 30 intervals of 30 seconds each. The child’s behavior is observed for 3 seconds at the end of each interval, and the observer codes all behaviors that were observed during this time window. Although the newest version of the SOS has not been extensively tested, scores from the earlier version of this observation system differentiated students with CP from other children (Lett & Kamphaus, 1997).
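To make the momentary time-sampling arithmetic concrete, the minimal sketch below (written in Python; it is not the BASC-3 SOS application) tallies hypothetical interval-by-interval codes of the kind described above and reports the percentage of intervals in which each behavior was coded. The behavior labels are taken from the examples in the preceding paragraph, and the percentage-of-intervals summary is an assumed convention rather than the instrument's published scoring procedure.

```python
# Minimal illustrative sketch (not the BASC-3 SOS software): tallying a
# momentary time-sampling observation of 30 intervals of 30 seconds each,
# in which behaviors are coded only if seen in the brief window ending each interval.
from collections import Counter

N_INTERVALS = 30  # 15-minute observation split into 30-second intervals

# Hypothetical observer codes; each set lists the behaviors checked for that interval.
interval_codes = [
    {"follows directions"},
    {"fidgets in seat"},
    {"fidgets in seat", "teases others"},
    set(),  # nothing coded during this interval's observation window
] + [set()] * (N_INTERVALS - 4)

def percent_of_intervals(codes):
    """Percentage of intervals in which each behavior was coded at least once."""
    counts = Counter(behavior for interval in codes for behavior in interval)
    return {behavior: 100.0 * n / len(codes) for behavior, n in counts.items()}

if __name__ == "__main__":
    for behavior, pct in sorted(percent_of_intervals(interval_codes).items()):
        print(f"{behavior}: coded in {pct:.1f}% of intervals")
```

In practice, such tallies would be entered through the BASC-3 application and integrated with the parent and teacher ratings, as noted above.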


Another classroom observational system, the ASEBA Direct Observation Form (ASEBA-DOF; McConaughy & Achenbach, 2009), was designed to observe students, ages 5 to 14 years, for 10-minute periods in the classroom. Three types of information are recorded. First, at the end of each minute during the observational period, the child's behavior is coded as being on or off task for 5 seconds. Second, at the end of the observational period, the observer writes a narrative of the child's behavior throughout the 10-minute observational period, noting the occurrence, duration, and intensity of specific problems. Third, and also at the end of the observational period, the observer codes 96 behaviors on a 4-point scale (0 = "behavior was not observed" through 3 = "definite occurrence of behavior with severe intensity or for greater than 3 minutes duration"). These ratings can be summed into Total Problem, Internalizing, and Externalizing behavior composites. The ASEBA-DOF has been shown to discriminate between referred and nonreferred children in the classroom (e.g., Reed & Edelbrock, 1983), as well as between children with CP and children with other behavior problems (e.g., McConaughy, Achenbach, & Gent, 1988).

One limitation in observational systems is the potential for reactivity, whereby the child's behavior can change because the child knows that he or she is being observed (Aspland & Gardner, 2003). An alternative to observations by independent observers that can reduce reactivity is to train significant adults in the child's or adolescent's environment to observe and record certain types of behavior. The most widely used procedure of this type is the Parent Daily Report (PDR; Chamberlain & Reid, 1987), a parent observation measure that is typically administered during brief (5 to 10 minutes) telephone interviews. Parents are asked which of a number of overt and covert behaviors have occurred in the past 24 hours. The PDR has shown moderate convergent validity with other parent report measures of child CP (Chamberlain & Reid, 1987; Webster-Stratton & Spitzer, 1991).

Functional Impairment

Most of the measures described previously focus on the type, frequency, and severity of the child's CP. However, the child's or adolescent's level of functional impairment can vary greatly, even with similar levels of CP (Bird, 1999; Bloomquist & Schnell, 2002). Knowledge of impairment is important for a number of reasons. First, it can determine how intensive an intervention may need to be for a child and the most appropriate setting for this treatment,


it can provide useful information to the clinician concerning possible intervention targets, and it may also serve as an important indicator of intervention outcome (Frick et al., 2010; Hodges, Xue, & Wotring, 2004). As noted previously, structured interviews based on the DSM-IV-TR (APA, 2000) allow for the assessment of impairment. Table 5.1 lists two measures designed specifically to assess the youth's level of impairment: the Children's Global Assessment Scale (CGAS; Bird et al., 1993; Shaffer et al., 1983) and the Child and Adolescent Functional Assessment Scale (CAFAS; Hodges, 2000). Also, several of the broad rating scales summarized in Table 5.1 include subscales that assess important areas of potential impairment of children with CP. For example, the BASC-3 (Reynolds & Kamphaus, 2015) contains scales assessing the child's academic adjustment (e.g., learning problems, attitude toward school and teacher, study skills), social adjustment (e.g., social stress, interpersonal relations), and self-concept (e.g., sense of inadequacy).

Overall Evaluation

In summary, assessing the types and severity of CP displayed by the child, as well as assessing common co-occurring problems in adjustment, is critical to the assessment of children and adolescents with CP. Behavior rating scales, unstructured and structured interviews, and behavioral observations all can help in this process, and each has its unique strengths and weaknesses. Thus, typical assessments of children with CP would include multiple methods of assessment that utilize the strengths of these different approaches. Behavior rating scales, such as the BASC-3 and ASEBA, typically provide the best norm-referenced information that allows for the comparison of a child's level of CP to a normative comparison group. Rating scales also typically have formats for obtaining information from several different informants who see the child in different settings (e.g., parents and teachers), and they provide a time-efficient method for assessing a number of possible co-occurring problems that may be present in youths with CP. In contrast, structured interviews, such as the DICA and DISC, tend to be more time-consuming and are often limited in the normative information that they provide. However, they typically provide more information on the level of impairment associated with the child's CP and the age at which the problem behavior began. Finally, behavioral observation systems, such as the BCS and DPICS, provide an assessment of the child's behavior that is not filtered through the perceptions of an informant, and


they provide a method for assessing the environmental contingencies that can be involved in the development or maintenance of CP. However, many behavioral observation systems require extensive training to reliably code the child’s behavior, and they are often limited in the normative information they provide.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

The research reviewed previously indicated that children with CP often have multiple comorbid conditions that are important to consider in treatment planning, and there are often numerous risk factors that can be involved in the development or maintenance of CP. As a result, many of the rating scales and structured interviews described in the previous section on diagnosis are also included in Table 5.2 because they are also critical for case conceptualization and treatment planning purposes. These measures provide a broad assessment of the child’s functioning and capture the many important co-​occurring

problems in adjustment and risk factors that can be used in treatment planning. A key area of research for guiding the assessment process is the research documenting various potential developmental pathways to CP. As reviewed previously, children with CP can fall into childhood-​onset or adolescent-​onset pathways, depending on when their level of severe antisocial and aggressive behavior started. Also, there seem to be important differences between those children with CP who do and those who do not show high levels of CU traits. Knowledge of the characteristics of children in these different pathways, and the different causal mechanisms involved, can serve as a guide for structuring and conducting the assessment (Frick et  al., 2010; McMahon & Frick, 2005). Furthermore, interventions can be tailored to the unique needs of youth in these different pathways (Frick, 2012). These developmental pathways can aid case conceptualizations by providing a set of working hypotheses concerning the nature of the CP behavior, the most likely comorbid conditions, and the most likely risk factors (McMahon & Frick, 2005). For example, for a youth

Table 5.2  Ratings of Instruments Used for Case Conceptualization and Treatment Planning Norms

Internal Consistency

Inter-​Rater Reliability

Test–​Retest Reliability

Content Validity

Construct Validity

Validity Generalization

Clinical Utility

APSD

A

L

A

G

E

G

A

A

ASEBA BASC-​3a ECBI/​SESBI-​R

E E G

E E E

A A A

E E G

G E E

E G E

E E G

A A A

ECI-​5/​CASI-​5a

G

A

A

A

E

G

G

A

ICU

L

A

G

G

E

E

E

A

NA

G

G

E

E

G

A

NA

G

G

E

E

G

A



NA NA E

A A E

U L A

A A A

G G G

G E A

A A A

✓ ✓ ✓

Instrument

Highly Recommended

Rating Scales

Structured Interviews DICA NA DISC

NA

Behavioral Observations BCS U DPICS L Compliance Test L PDR BASC-​SOS

L L

NR NA

E A

A G

A E

G E

E A

A A

ASEBA-​DOF

L

NA

G

G

E

E

A

A

REDSOCS

L

NA

G

NR

A

A

A

A

✓ ✓



a Ratings for this instrument were made on the basis of research conducted with the previous version of the instrument.

Note: APSD = Antisocial Process Screening Device; ASEBA = Achenbach System of Empirically Based Assessment; BASC-3 = Behavior Assessment System for Children, 3rd Edition; ECBI = Eyberg Child Behavior Inventory; SESBI-R = Sutter–Eyberg Student Behavior Inventory-Revised; ECI-5 = Early Childhood Inventory-5; CASI-5 = Child & Adolescent Symptom Inventory-5; ICU = Inventory of Callous–Unemotional Traits; DICA = Diagnostic Interview for Children and Adolescents; DISC = Diagnostic Interview Schedule for Children; BCS = Behavioral Coding System; DPICS = Dyadic Parent–Child Interaction Coding System; PDR = Parent Daily Report; BASC-SOS = BASC-3 Student Observation System; ASEBA-DOF = ASEBA Direct Observation Form; REDSOCS = Revised Edition of the School Observation Coding System; L = Less than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


whose CP appears to onset in adolescence, one would hypothesize based on the available literature that he or she is less likely to be aggressive, to have intellectual deficits, to have temperamental vulnerabilities, and to have comorbid ADHD. However, the youth’s association with a deviant peer group and factors that may contribute to this deviant peer group affiliation (e.g., lack of parental monitoring and supervision) would be especially important to assess for youths in this pathway. In contrast, for a youth whose serious CP began prior to adolescence, one would expect more cognitive and temperamental vulnerabilities, comorbid ADHD, and more serious problems in family functioning. For those youths in this childhood-​ onset group who do not show CU traits, the cognitive deficits would more likely be verbal deficits and the temperamental vulnerabilities would more likely be problems regulating emotions, leading to higher levels of anxiety, depression, and aggression involving anger. In contrast, for a youth with childhood-​onset CP who shows high levels of CU traits, the cognitive deficits are more likely to involve a lack of sensitivity to punishment, and the temperamental vulnerabilities are more likely to involve a preference for dangerous and novel activities and a failure to experience many types of emotion (e.g., guilt and empathy). Furthermore, assessing the level and severity of aggressive behavior, especially the presence of instrumental aggression, would be critical for youths in this group. As most clinicians recognize, people do not often fall neatly into the prototypes that are suggested by research (see also Fairchild et al., 2013). Therefore, these descriptions are meant to serve as hypotheses around which to organize an evidence-​based assessment. They also highlight several specific important pieces of information that are needed when assessing children and adolescents with CP. One of the most critical pieces of information in guiding assessment, and perhaps ultimately intervention, is determining the age at which various CP behaviors began. This information provides some indication as to whether or not the youth may be on the childhood-​onset pathway. Unfortunately, there has been little consistency in the literature concerning the most appropriate operational definition of childhood onset versus adolescent onset or even whether this distinction should be based on chronological age or on the pubertal status of the child (Moffitt, 2006). For example, the DSM-​5 (APA, 2013) makes the distinction between children who begin showing severe CP behaviors before age 10 years (i.e., childhood onset) and those who do not show severe CP before age 10 years (i.e., adolescent onset) in its definition of CD. However, other research studies have used age 11  years (Robins,


1966) or age 14 years (Patterson & Yoerger, 1993; Tibbetts & Piquero, 1999) to define the start of adolescent onset. Thus, onset of severe CP before age 10  years seems to be clearly considered childhood onset and onset after age 13 years clearly adolescent onset. However, how to classify children whose CP onset between the ages of 11 and 13 years is less clear and probably dependent on the level of physical, cognitive, and social maturity of the child. Based on this research, it is therefore important for treatment planning to assess the age at which the child began showing serious CP. An important advantage that many structured interviews have over behavior rating scales and behavioral observations is that they provide a structured method for assessing when a youth first began showing serious CP, thereby providing an important source of information on the developmental trajectory of the CP behavior. For example, in the DISC-​IV (Shaffer et  al., 2000), any question related to the presence of a CD symptom that is answered affirmatively is followed by questions asking the parent or youth to estimate at what age the first occurrence of the behavior took place. Obviously, such questions can also be integrated into an unstructured interview format as well. In either case, however, there is always some concern about how accurate the parent or youth is in reporting the timing of specific behaviors. There are three findings from research that can help in interpreting such reports. First, the longer the time frame involved in the retrospective report (e.g., a parent of a 17-​year-​old reporting on preschool behavior vs. a parent of a 6-​year-​old reporting on preschool behavior), the less accurate the report is likely to be (Green, Loeber, & Lahey, 1991). Second, although a parental report of the exact age of onset may not be very reliable over time, typical variations in years are usually small and the relative rankings within symptoms (e.g., which symptom began first) and within a sample (e.g., which children exhibited the earliest onset of behavior) seem to be fairly stable (Green et al., 1991). As a result, these reports should be viewed as rough estimates of the timing of onset and not as exact dating procedures. Third, there is evidence that combining informants (e.g., a parent and youth) or combining sources of information (e.g., self-​report and record of police contact), and taking the earliest reported age of onset from any source, provide an estimate that shows somewhat greater validity than any single source of information alone (Lahey et al., 1999). Assessment to examine the extent to which CU traits may also be present is important, especially if the youth’s history of CP is consistent with the childhood-​onset pathway (but also see Fairchild et  al. [2013] concerning the


relevance of also assessing for CU traits in youth with adolescent-​ onset CP). To illustrate the importance to treatment planning, Hawes et al. (2014) reviewed 16 treatment outcomes studies and reported that CU traits were a strong predictor of poor treatment outcomes across studies (see also Frick et al., 2014a). For example, children with CP and elevated CU traits seem to be less responsive to the discipline components (e.g., time out) of parent management training (Hawes & Dadds, 2005). Furthermore, although many learning-​ based parenting interventions lead to improvements in CU traits, children with these traits often started treatment with the most severe levels of CP and still ended treatment with the most severe levels of CP (White, Frick, Lawing, & Bauer, 2013). Thus, it is important to include a measure of CU traits as part of the treatment planning process (Manders, Dekovic, Asscher, van der Laan, & Prins, 2013). The Antisocial Process Screening Device (APSD; Frick & Hare, 2001), included in Table 5.2, is a behavior rating scale completed by parents and teachers to identify children with CP who also exhibit CU traits (Christian, Frick, Hill, Tyler, & Frazer, 1997; Frick, Bodin, & Barry, 2000; Frick, O’Brien, Wootton, & McBurnett, 1994). A self-​report version of this scale is also available for older children and adolescents, and it has been validated in a number of studies (Muñoz & Frick, 2007). Unfortunately, the APSD only includes six items directly assessing CU traits, and it only has three response options for rating the frequency of the behaviors. The few items, the limited range in response options, and the fact that ratings of CU traits are negatively skewed in most samples resulted in the APSD scores showing poor internal consistency in many formats (Poythress, Dembo, Wareham, & Greenbaum, 2006). To overcome these limitations in the assessment of CU traits, the Inventory of Callous–​unemotional Traits (ICU) was developed to provide a more extended assessment of CU traits (Kimonis et al., 2008). The ICU was developed from the four items on the APSD that most consistently loaded on the CU traits factor across various samples (Frick et  al., 2000). To form the items on the ICU, six items (three positively and three negatively worded items) were developed to assess a similar content to each of the four core traits. These 24 items were then placed on a 4-​point Likert scale that could be rated from 0 (Not at all true) to 3 (Definitely true). Versions for parent, teacher, and self-​report were developed to encourage multi-​informant assessments. The ICU has a number of positive qualities for assessing CU traits. The larger number of items and its extended response format has resulted

in a 24-​item total score that is internally consistent in many samples, with Cronbach’s alpha ranging between .77 and .89 (Frick & Ray, 2015). Furthermore, there is a preschool version for use with children as young as age 3  years (Ezpeleta, de la Osa, Granero, Penelo, & Domenech, 2013), and the ICU has been translated into more than 20 languages with support for its validity across these translations (e.g., Ciucci, Baroncelli, Franchi, Golmaryami, & Frick, 2014; Fanti, Frick, & Georgiou, 2009; Kimonis et al., 2008). Also, the ICU is one of the few measures that include items that directly assess the content included in the new “with limited prosocial emotions” specifier in the DSM-​5 (Kimonis et al., 2015). However, these positive qualities need to be weighed against the lack of a large and representative normative sample being available for the ICU and with empirically derived cut-​offs being available for only certain versions of the scale (Kimonis, Fanti, & Singh, 2014). The key implication from research on the developmental pathways to CP is that the most appropriate treatment for a child or adolescent with CP may differ depending on characteristics of the child and factors in his or her environment that are operating to maintain these behaviors. This approach is very consistent with functional behavioral assessment (FBA) methods that focus on conducting an individualized assessment of each child’s needs and matching intervention strategies to those needs (LaRue & Handelman, 2006; Walker, Ramsey, & Gresham, 2004). The typical FBA involves a specification of problem behaviors in operational terms (e.g., what types of CP are being exhibited in the classroom), as well as identification of events that reliably predict and control behavior through an examination of antecedents and consequences. For example, an FBA at school would determine whether the child’s CP is occurring only in certain classes or situations (e.g., during class change and at lunch) and if there are certain factors that reliably lead to the CP (e.g., teasing by peers and disciplinary confrontations with teachers). It would also determine the consequences that are associated with the CP that may contribute to their likelihood of occurring in the future (e.g., getting sent home from school and preventing further teasing). Information relevant to an FBA can be gathered through interviews with significant others in the child’s environment or through direct observations of the child in his or her natural environment. Thus, several of the behavioral observation systems described previously are also quite important for case conceptualization and treatment planning for the child with CP.
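As a rough illustration of how ABC information gathered for an FBA might be organized, the sketch below uses hypothetical Python records (not a validated coding scheme) to log antecedents, behaviors, and consequences and to count which antecedents and consequences most often accompany each operationally defined behavior.

```python
# Hypothetical sketch of organizing antecedent-behavior-consequence (ABC) records
# collected for a functional behavioral assessment; the events below are invented
# examples, not items from any standardized instrument.
from collections import Counter
from dataclasses import dataclass

@dataclass
class ABCRecord:
    setting: str      # situation in which the behavior occurred (e.g., lunch)
    antecedent: str   # event immediately preceding the behavior
    behavior: str     # operationally defined conduct problem behavior
    consequence: str  # what followed the behavior

records = [
    ABCRecord("class change", "teased by peer", "pushing", "teasing stopped"),
    ABCRecord("lunch", "teased by peer", "pushing", "sent to office"),
    ABCRecord("math class", "disciplinary confrontation", "arguing", "sent home"),
    ABCRecord("class change", "teased by peer", "pushing", "teasing stopped"),
]

def co_occurrence(records, field):
    """Count how often each antecedent or consequence accompanies each behavior."""
    return Counter((getattr(record, field), record.behavior) for record in records)

if __name__ == "__main__":
    print("Antecedents:", co_occurrence(records, "antecedent"))
    print("Consequences:", co_occurrence(records, "consequence"))
```

A pattern such as pushing that reliably follows teasing and reliably ends it would flag exactly the kind of maintaining contingency the FBA is intended to identify.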


Overall Evaluation



In summary, this section highlighted several critical issues for using assessment information for planning treatment for children with CP. First, because children with CP often have many co-​occurring problems in adjustment that are important to address in treatment, it is critical that methods for assessing potential comorbid problems, such as behavior rating scales and structured interviews, can be used in treatment planning. Second, because children who show different developmental trajectories of their CP may require different approaches to treatment, it is critical to assess key characteristics that distinguish among children in these trajectories. Specifically, assessing the age at which the child began to exhibit CP, through either structured or unstructured interviews, and assessing the presence of CU traits are both critical to the treatment planning process. Third, because environmental contingencies have proven to be very important for understanding factors that can either lead to or maintain CP in children and adolescents, assessment of these contingencies through unstructured interviews or behavioral observations is also critical for the treatment planning process.
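Because assessing age of onset draws on multiple informants, a minimal sketch of the combination rule discussed earlier (take the earliest reported age of onset from any source) is given below; the sources and ages are hypothetical.

```python
# Minimal sketch of the combination rule described above: when multiple informants
# or records give different ages of onset for a CP behavior, take the earliest
# reported age from any source as the working estimate. Data are hypothetical.
reported_onsets = {
    "parent interview": {"fighting": 9, "lying": 7},
    "youth self-report": {"fighting": 11},
    "police/school records": {"fighting": 12, "lying": 10},
}

def earliest_onset(reports):
    """Earliest reported age of onset per behavior across all available sources."""
    combined = {}
    for source_report in reports.values():
        for behavior, age in source_report.items():
            combined[behavior] = min(age, combined.get(behavior, age))
    return combined

if __name__ == "__main__":
    print(earliest_onset(reported_onsets))  # {'fighting': 9, 'lying': 7}
```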

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

Most of the applications of research for guiding the assessment process have focused on making diagnostic decisions (e.g., determining whether CP should be the primary source of concern and whether it is severe and impairing enough to warrant treatment) and on treatment planning (e.g., determining what types of intervention may be needed by the child; McMahon & Frick, 2005). However, an important third goal of the assessment process is monitoring the progress of intervention and evaluating treatment outcome. That is, evidence-​based assessments should provide a means for testing whether interventions have brought about meaningful changes in the child’s or adolescent’s adjustment, either for better or for worse (i.e., an iatrogenic effect). This is particularly important in the treatment of CP, given a number of documented cases in which treatments have led to increases, rather than decreases, in problem behavior for some youth with CP (Dishion, McCord, & Poulin, 1999; Dodge, Dishion, & Lansford, 2006). Several of the behavior rating scales and observational measures described previously have demonstrated sensitivity to intervention outcomes. These are described in Table 5.3.

Table 5.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation Internal Norms Consistency

Inter-​Rater Test–​Retest Reliability Reliability

Content Validity

Construct Validity

Validity Treatment Generalization Sensitivity

Clinical Utility

Highly Recommended

ASEBA ECI-​5/​CASI-​5a

E G

E A

A A

E A

G E

E G

E G

G G

A A



ECBI/​SESBI-​R

G

E

A

G

E

E

G

G

A



NA NA E

A A E

NR L A

A A A

G G G

G E A

G G G

A A A

✓ ✓

L L

NR NA

E G

A NR

A A

G A

E A

E L

A A



G A

NA NA

G G

G G

E E

E E

G G

G G

G G



NA

NR

A

A

NR

NA

A

NA

G

A

G

A

NA

A

Instrument Rating Scales

Behavioral Observations BCS NR DPICS L Compliance Test L PDR REDSOCS Impairment Indices CAFAS CGAS

Treatment Satisfaction Surveys PCSQ NR NR TAI

NR

E



a Ratings for this instrument were made on the basis of research conducted with the previous version of the instrument.

Note: ASEBA = Achenbach System of Empirically Based Assessment; ECI-5 = Early Childhood Inventory-5; CASI-5 = Child & Adolescent Symptom Inventory-5; ECBI = Eyberg Child Behavior Inventory; SESBI-R = Sutter–Eyberg Student Behavior Inventory-Revised; BCS = Behavioral Coding System; DPICS = Dyadic Parent–Child Interaction Coding System; PDR = Parent Daily Report; REDSOCS = Revised Edition of the School Observation Coding System; CAFAS = Child and Adolescent Functional Assessment Scale; CGAS = Children's Global Assessment Scale; PCSQ = Parent's Consumer Satisfaction Questionnaire; TAI = Therapy Attitude Inventory; L = Less than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


For example, scores from the ASEBA have proven to be sensitive to changes brought about by the treatment of youth with CP (e.g., DeGarmo, Patterson, & Forgatch, 2004; Eisenstadt, Eyberg, McNeil, Newcomb, & Funderburk, 1993; McCabe & Yeh, 2009). Also, scores on the ECI-4 and CASI-4R scales have shown changes in response to parenting and psychopharmacological interventions (e.g., Brock, Kochanska, O'Hara, & Grekin, 2015; Gadow et al., 2014). Scores on the ECBI/SESBI-R scales have been shown to change after parent management training interventions with young children (e.g., Eisenstadt et al., 1993; Jones, Forehand, Cuellar, Parent, & Honeycutt, 2014; McCabe & Yeh, 2009; Nixon, Sweeney, Erickson, & Touyz, 2003; Scott et al., 2010; Webster-Stratton & Hammond, 1997). Importantly, because these rating scales often provide norm-referenced scores, these scales can be critical for determining not only whether or not the intervention has led to significant decreases in the child's level of CP but also whether the behavior has been brought within a level that is normative for the child's age. However, behavior rating scales completed by parents who are involved in treatment could be influenced by expectancy effects on the part of the parents who anticipate positive responses to an intervention. Thus, it is important to include ratings of the child's behavior from others who may not have been involved in the treatment or to include behavioral observations of treatment effects whenever possible, especially if the observer is unaware whether the child and his or her parents were involved in treatment or whether the observation is pre- or post-treatment. Two observational systems described previously, the BCS and the DPICS, have been used in this way as outcome measures for parenting interventions for CP (e.g., Eisenstadt et al., 1993; Herschell, Calzado, Eyberg, & McNeil, 2002; McCabe & Yeh, 2009; McMahon, Forehand, & Griest, 1981; Peed, Roberts, & Forehand, 1977; Webster-Stratton & Hammond, 1997). The PDR, which uses the parent as an observer, has also been used as a treatment outcome indicator but, similar to behavior rating scales, the observations by parents who are involved in treatment could be biased (Bank, Marlowe, Reid, Patterson, & Weinrott, 1991; Chamberlain & Reid, 1991; Webster-Stratton & Hammond, 1997). The REDSOCS (Jacobs et al., 2000) school observation system has shown treatment sensitivity with respect to the classroom generalization effects of parent management training (Bagner et al., 2010). However, to our knowledge, it has not yet been employed to assess the effects of classroom-based interventions.

As noted previously in the discussion of measures used to diagnose severe levels of CP, children with the

same level of CP can vary greatly on the level of impairment associated with their CP. Thus, assessing the child’s level of functional impairment after treatment is also an important assessment goal. The two measures of functional impairment included in Table 5.3, the CAFAS and the CGAS, have both proven to be sensitive to treatment effects (Hodges et al., 2004; Shaffer et al., 1983). Also, a number of the rating scales noted in Table 5.3, such as the ASEBA and BASC-​3, assess important areas of potential impairment for children with CP, such as the child’s academic and social adjustment. Although many measures have been used to assess treatment outcome, there has been very little research on the use of assessment measures to monitor the effects of ongoing intervention for CP. Exceptions to this are the structured observational analogues employed in some parent management training programs for young oppositional children that are employed repeatedly throughout the course of treatment, not only to monitor progress but also to determine whether the parent has met specific behavioral performance criteria necessary for progression to the next step of the parenting intervention (Herschell et al., 2002; McMahon & Forehand, 2003). A final assessment domain related to treatment outcome that has had only minimal research focus is in the assessment of treatment satisfaction. This is a form of social validity that may be assessed in terms of satisfaction with the outcome of treatment, therapists, treatment procedures, and teaching format (McMahon & Forehand, 1983). Given the diversity of treatments that are needed for youth with CP, no single consumer satisfaction measure is appropriate for use with all types of interventions for youth with CP and their families. The Therapy Attitude Inventory (TAI; Brestan, Jacobs, Rayfield, & Eyberg, 1999; Eyberg, 1993) and the Parent’s Consumer Satisfaction Questionnaire (PCSQ; McMahon & Forehand, 2003; McMahon, Tiedemann, Forehand, & Griest, 1984) are examples of measures designed to evaluate parental satisfaction with parent management training programs (e.g., Eyberg & Funderburk, 2011; McMahon & Forehand, 2003). Importantly, these measures largely focus on the parents’ satisfaction with treatment. Children and adolescents themselves have rarely been asked about their satisfaction with treatment, with the exception of some evaluations of Multisystemic Therapy with adolescents (e.g., Henggeler et al., 1999). There are several important issues involved in selecting measures suitable for treatment monitoring and outcome evaluation (McMahon & Frick, 2005; McMahon & Metzler, 1998). First, the way questions on a rating


scale are framed could affect its sensitivity to change. For example, the response scale on a behavior rating scale may be too general (e.g., "never" vs. "sometimes" vs. "always"), or the time interval for reporting the frequency of a behavior (e.g., the past 6 months) may not be discrete enough to detect changes brought about by treatment. Second, a consistent finding when using structured interviews is that parents and children often report fewer symptoms on the second administration of the interview (Jensen et al., 1999; Piacentini et al., 1999). Thus, structured interviews are typically not good measures of treatment outcome because it is unclear whether any reductions in CP between pre- and post-treatment measures are due to the treatment or due to this normal decrease in symptoms over repeated administrations. Third, assessment-by-intervention interactions may occur when evaluating treatment outcomes. For example, as a function of intervention, parents may learn to become more effective monitors of their children's behavior. As a consequence, they may become more aware of their children's CP. Comparison of parental reports of their children's behavior prior to and after the intervention may actually suggest that parents perceive deterioration in their children's behavior, when in reality the parents have simply become more accurate reporters of such behavior (Dishion & McMahon, 1998).

Overall Evaluation

Unfortunately, the development of measures to adequately monitor treatment progress and treatment outcome for children and adolescents with CP has not advanced as far as the development of measures for diagnosis and treatment planning. This is a particularly unfortunate state of affairs in the treatment of CP given that several treatments have proven to have potentially harmful effects on youth by leading to increases in behavior problems after treatment. However, several behavior rating scales, most notably the ASEBA and ECBI, have proven to be sensitive to the effects of treatment, and both the ASEBA and the ECBI provide norm-referenced scores to determine whether the child's level of CP was brought within a level that is normative for his or her age. Several behavioral observation systems, such as the BCS and DPICS, have also been used both to monitor the progress of treatment and to evaluate treatment outcome. A few measures have been developed to assess child or parental satisfaction with treatment. However, development of better evidence-based measures for this purpose is a critical area for future research.
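To illustrate how norm-referenced scores can be used to judge whether treatment-related change is meaningful, the sketch below applies the Jacobson–Truax reliable change index alongside a normative cutoff. This is one standard approach rather than a procedure prescribed in this chapter, and the reliability value, cutoff, and scores shown are hypothetical placeholders that would be replaced with values from the instrument's manual.

```python
# Illustrative sketch of evaluating treatment-related change on a norm-referenced
# rating scale. The reliability, cutoff, and scores are hypothetical placeholders;
# the reliable change index follows Jacobson and Truax, which is one common
# approach and not a method specified in this chapter.
import math

RETEST_RELIABILITY = 0.85   # hypothetical test-retest reliability of the scale
CLINICAL_CUTOFF_T = 65.0    # hypothetical borderline/clinical T-score threshold
NORM_SD = 10.0              # T-scores have a standard deviation of 10

def reliable_change_index(pre_t, post_t, sd=NORM_SD, reliability=RETEST_RELIABILITY):
    """|RCI| > 1.96 suggests change larger than expected from measurement error alone."""
    s_diff = math.sqrt(2.0) * sd * math.sqrt(1.0 - reliability)
    return (post_t - pre_t) / s_diff

if __name__ == "__main__":
    pre_t, post_t = 72.0, 58.0  # hypothetical pre- and post-treatment T-scores
    rci = reliable_change_index(pre_t, post_t)
    print(f"RCI = {rci:.2f}; reliable decrease: {rci < -1.96}")
    print(f"Within normative range after treatment: {post_t < CLINICAL_CUTOFF_T}")
```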


CONCLUSIONS AND FUTURE DIRECTIONS

In this chapter, we have summarized several areas of research that have important implications for guiding assessments for youth with CP and summarized some recommended methods for accomplishing three primary assessment goals: diagnosis of non-​normative and impairing forms of CP, case conceptualization and treatment planning, and monitoring and evaluating treatment outcome. In this concluding section, we seek to highlight some overarching issues that influence methods for meeting all of these assessment goals and to highlight some important areas for future research. The first overarching issue is the need for a comprehensive assessment in most cases when assessing youth with CP. That is, an adequate assessment of a youth with CP must assess multiple aspects of the child’s or adolescent’s adjustment (e.g., CP, anxiety and learning problems) in multiple settings (e.g., home and school; Frick et  al., 2010; McMahon & Estes, 1997; McMahon & Frick, 2005). However, it is also important to note that all of the individual assessment techniques summarized in Tables 5.1 have limitations. Thus, it is critical to assess the child using multiple methods whenever possible (Frick et al., 2010). Because of issues of time, expense, and practicality, how best to acquire and interpret this large array of information become important issues. One approach is to use a multistage method, which starts with more time-​efficient measures (e.g., broadband behavior rating scales and unstructured clinical interviews) that are followed by more time-​intensive measures (e.g., structured interviews and behavioral observations) when indicated (McMahon & Estes, 1997; McMahon & Frick, 2005; Nock & Kurtz, 2005). Whether or not a multistage method is used, there are few guidelines available to guide clinicians as to how to integrate and synthesize the multiple pieces of information that are obtained in the assessment to make important clinical decisions. This endeavor is made more complicated by the fact that information from different informants (Achenbach, McConaughy, & Howell, 1987; De Los Reyes & Kazdin, 2005) and information from different methods (Barkley, 1991)  often show only modest correlations with each other. As a result, after collecting multiple sources of information on a youth’s adjustment, the assessor then must make sense out of an array of often conflicting information. Several strategies for integrating and interpreting information from comprehensive assessments have been proposed (Frick et  al., 2010; McMahon & Forehand,


2003; Wakschlag & Danis, 2004). For example, Frick et al. (2010) outlined a multistage strategy for integrating results from a comprehensive assessment into a clear case conceptualization to guide treatment planning. At the first step, the assessor documents all clinically significant findings regarding the youth’s adjustment (e.g., elevations on ratings scales, diagnoses from structured interviews, and problem behaviors from observations). At the second step, the assessor searches for convergent findings across these methods. At the third step, the assessor attempts to explain, using available research as much as possible, any discrepancies in the assessment results. For example, a finding that a child and a parent, but not the teacher, are reporting high rates of anxiety may be explained by research suggesting that teachers may not be aware of a student’s level of anxiety in the classroom (Achenbach et  al., 1987). At the fourth step, the assessor develops a profile of the areas of most concern for the child and also develops a coherent explanation for the child’s CP, again using existing research as much as possible. This process was illustrated previously in using research on the developmental pathways to CP to guide a case conceptualization. Although this approach to interpreting results of a comprehensive assessment is promising, much more research is needed to guide this process of integrating data from comprehensive assessments. Another issue that requires further attention is the great need to enhance the clinical utility of evidence-​based assessment tools (Frick, 2000; Hodges, 2004). Many of the assessment measures that have been used in research have not been developed in such a way that makes them useful in clinical practice. For example, Frick and Loney (2000) reviewed a number of performance-​based measures that have been used in research with children with CP. They concluded that few of these measures have been used in the same format across multiple samples that would allow for the development of meaningful cut-​off scores that could be used in clinical assessments. Also, as noted previously, many of the observational systems used to assess parent–​child interactions require such intensive training of observers that their potential utility in many clinical assessments is also limited. Although we did review a few attempts to develop brief and clinically useful assessment methods, there are still too few such methods available. Perhaps the most important limitation to evidence-​ based assessments of CP is the remaining disconnect between assessment concerning case conceptualization and treatment planning, on the one hand, and the availability of evidence-​ based interventions that map onto those assessment findings, on the other hand. For

example, interventions for youth who are engaging primarily in covert forms of CP (e.g., stealing, fire-​setting) are much less developed than those for more overt types of CP such as noncompliance and aggression (McMahon, Wells, & Kotler, 2006). Similarly, subtype-​specific interventions for reactive, proactive aggression and relational aggression (e.g., Leff, Angelucci, Grabowski, & Weil, 2004; Levene, Walsh, Augimeri, & Pepler, 2004), and for youths with and without CU traits (Hawes et al., 2014), are in relatively early stages of development. Of note, however, is the clear evidence suggesting that high levels of noncompliance in a preschool-​age child are best treated using one of several well-​validated parent management training interventions (McMahon et al., 2006). A critical issue in advancing the link between evidence-​based assessment and treatment planning involves emerging research on the different developmental pathways to CP. As noted previously, this area of research may be the most important for understanding youths with CP because it may explain many of the variations in severity, the multiple co-​occurring conditions, and the many different risk factors that have been associated with CP. This research could also be very important for designing more individualized treatments for youths with CP, especially older children and adolescents with more severe antisocial behaviors (Frick, 2012). However, in order for research on developmental pathways to be translated into practice, it is critical that better assessment methods for reliably and validly designating youths in these pathways be developed. This is especially the case for girls and for ethnically diverse youth (McMahon & Frick, 2005). Furthermore, the different causal processes and developmental mechanisms (e.g., lack of empathy and guilt, poor emotion regulation) that may be involved in the different pathways need to be assessed, and this typically involves translating measures that have been used in research into forms that are appropriate for clinical practice (Frick & Ray, 2015). In conclusion, it is difficult to make a summary evaluation of the state of evidence-​based practice related to the assessment of CP. In some areas, there have been major improvements during the past several decades, such as in the development of behavior rating scales with large and representative normative samples. In other areas, such as in the development of measures to assess satisfaction with treatment, there have been fewer advances. Also, as the research base for understanding CP grows and evolves, so too must the guidelines for using this research in practice. Thus, evidence-​based assessment is a moving target. However, the hallmark of an evidence-​based approach to


assessment is the commitment to never quit attempting to hit this moving target. The goal of this chapter is to highlight what we believe are currently some critical ways in which research on CP can inform the assessment process and to provide a structure whereby future advances in this research can be used to further enhance the process.

References Achenbach, T. M. (2013). DSM-​ oriented guide for the Achenbach System of Empirically Based Assessment (ASEBA). Burlington, VT:  University of Vermont Research Center for Children, Youth, and Families. Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child–​adolescent behavioral and emotional problems: Implications of cross-​informant correlations for situational specificity. Psychological Bulletin, 101, 213–​232. Achenbach, T. M., & Rescorla, L. A. (2000). Manual for the ASEBA Preschool Forms & Profiles. Burlington, VT: University of Vermont, Department of Psychiatry. Achenbach, T. M., & Rescorla, L. A. (2001). Manual for the ASEBA School-​age Forms & Profiles. Burlington, VT:  University of Vermont, Research Center for Children, Youth, & Families. Achenbach, T. M., & Rescorla, L. A. (2007). Multicultural supplement to the manual for the ASEBA School-​age Forms & Profiles. Burlington, VT: University of Vermont, Research Center for Children, Youth, & Families. Achenbach, T. M., & Rescorla, L. A. (2010). Multicultural supplement to the manual for the ASEBA Preschool Forms & Profiles. Burlington, VT: University of Vermont, Research Center for Children, Youth, & Families. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: Author. Aspland, H., & Gardner, F. (2003). Observational measures of parent–​ child interaction:  An introductory review. Child and Adolescent Mental Health, 8, 136–​143. Bagner, D. M., Boggs, S. R., & Eyberg, S. M. (2010). Evidence-​ based school behavior assessment of externalizing behavior in young children. Education and Treatment of Children, 33, 65–​83. Bank, L., Marlowe, J. H., Reid, J. B., Patterson, G. R., & Weinrott, M. R. (1991). A comparative evaluation of parent training interventions for families of chronic delinquents. Journal of Abnormal Child Psychology, 19, 15–​33. Barkley, R. A. (1991). The ecological validity of laboratory and analogue assessment methods of ADHD. Journal of Abnormal Child Psychology, 19, 149–​178.

Barkley, R. A. (2013). Defiant children: A clinician’s manual for assessment and parent training (3rd ed.). New York, NY: Guilford. Bierman, K. L. (1983). Cognitive development and clinical interviews with children. In B. B. Lahey & A. E. Kazdin (Eds.), Advances in clinical child psychology (Vol. 6, pp. 217–​250). New York, NY: Plenum. Bird, H. R. (1999). The assessment of functional impairment. In D. Shaffer, C. P. Lucas, & J. E. Richters (Eds.), Diagnostic assessment in child and adolescent psychopathology (pp. 209–​229). New York, NY: Guilford. Bird, H. R., Shaffer, D., Fisher, P., Gould, M. S., Staghezza, B., Chen, J. & Hoven, C. (1993). The Columbia Impairment Scale (CIS):  Pilot findings on a measure of global impairment for children and adolescents. International Journal of Methods in Psychiatric Research, 3, 167–​176. Blair, R. J.  R., Colledge, E., & Mitchell, D. G.  V. (2001). Somatic markers and response reversal:  Is there orbitofrontal cortex dysfunction in boys with psychopathic tendencies. Journal of Abnormal Child Psychology, 29, 499–​511. Bloomquist, M. L., & Schnell, S. V. (2002). Helping children with aggression and conduct problems: Best practices for interventions. New York, NY: Guilford. Breiner, J. L., & Forehand, R. (1981). An assessment of the effects of parent training on clinic-​referred children’s school behavior. Behavioral Assessment, 3, 31–​42. Brestan, E. V., Jacobs, J. R., Rayfield, A. D., & Eyberg, S. M. (1999). A consumer satisfaction measure for parent–​ child treatments and its relation to measures of child behavior change. Behavior Therapy, 30, 17–​30. Brock, R. L., Kochanska, G., O’Hara, M. W., & Grekin, R. S. (2015). Life satisfaction moderates the effectiveness of a play-​based parenting intervention in low-​income mothers and toddlers. Journal of Abnormal Child Psychology, 43, 1283–​1294. Broidy, L. M., Nagin, D. S., Tremblay, R. E., Bates, J. E., Brame, B. U., Dodge, K. A.,  .  .  .  Vitaro, F. (2003). Developmental trajectories of childhood disruptive behaviors and adolescent delinquency: A six-​site, cross-​ national study. Developmental Psychology, 39, 222–​245. Burke, J. D., Hipwell, A. E., & Loeber, R. (2010). Dimensions of oppositional defiant disorder as predictors of depression and conduct disorder in preadolescent girls. Journal of the American Academy of Child and Adolescent Psychiatry, 49, 484–​492. Burt, S. A. (2013). Do etiological influences on aggression overlap with those on rule breaking? A  meta-​analysis. Psychological Medicine, 43, 1801–​1812. Canino, G., Polanczyk, G., Bauermeister, J. J., Rohde, L. A., & Frick, P. J. (2010). Does the prevalence of CD and ODD vary across cultures? Social Psychiatry and Psychiatric Epidemiology, 45, 695–​704.

Chamberlain, P., & Patterson, G. R. (1995). Discipline and child compliance in parenting. In M. H. Bornstein (Ed.), Handbook of parenting: Vol. 4. Applied and practical parenting (pp. 205–​225). Hillsdale, NJ: Erlbaum. Chamberlain, P., & Reid, J. B. (1987). Parent observation and report of child symptoms. Behavioral Assessment, 9, 97–​109. Chamberlain, P., & Reid, J. B. (1991). Using a specialized foster care community treatment model for children and adolescents leaving the state mental health hospital. Journal of Community Psychology, 19, 266–​276. Chamberlain, P., Reid, J. B., Ray, J., Capaldi, D. M., & Fisher, P. (1997). Parent inadequate discipline (PID). In T. A. Widiger, A. J. Frances, H. A. Pincus, R. Ross, M. B. First, & W. Davis (Eds.), DSM-​IV sourcebook (Vol. 3, pp. 569–​629). Washington, DC: American Psychiatric Association. Chen, D., Drabick, D. A.  G., & Burgers, D. E. (2015). A developmental perspective on peer rejection, deviant peer affiliation, and conduct problems among youth. Child Psychiatry and Human Development, 46, 823–​838. Christian, R. E., Frick, P. J., Hill, N. L., Tyler, L., & Frazer, D. R. (1997). Psychopathy and conduct problems in children:  II. Implications for subtyping children with conduct problems. Journal of the American Academy of Child & Adolescent Psychiatry, 36, 233–​241. Ciucci, E., Baroncelli, A., Franchi, M., Golmaryami, F. N., & Frick, P. J. (2014). The association between callous–​ unemotional traits and behavioral and academic adjustment in children: Further validation of the Inventory of Callous–​unemotional Traits. Journal of Psychopathology and Behavioral Assessment, 36, 189–​200. Conduct Problems Prevention Research Group. (2000). Merging universal and indicated prevention programs:  The Fast Track model. Addictive Behaviors, 25, 913–​927. Conners, C. K. (2008). Conners’ Comprehensive Rating Scales–​3rd edition. Toronto, Ontario, Canada:  Multi-​ Health Systems. Crapanzano, A. M., Frick, P. J., & Terranova, A. M. (2010). Patterns of physical and relational aggression in a school-​ based sample of boys and girls. Journal of Abnormal Child Psychology, 38, 433–​445. Crick, N. R., & Dodge, K. A. (1994). A review and reformulation of social information-​processing mechanisms in children’s social adjustment. Psychological Bulletin, 115, 74–​101. Crick, N. R., & Dodge, K. A. (1996). Social information-​ processing mechanisms in reactive and proactive aggression. Child Development, 67, 993–​1002. Dandreaux, D. M., & Frick, P. J. (2009). Developmental pathways to conduct problems:  A further test of the childhood and adolescent-​onset distinction. Journal of Abnormal Child Psychology, 37, 375–​385.

De Los Reyes, A., & Kazdin, A. E. (2005). Informant discrepancies in the assessment of childhood psychology: A critical review, theoretical framework, and recommendations for further study. Psychological Bulletin, 131, 483–​509. DeGarmo, D. S., Patterson, G. R., & Forgatch, M. S. (2004). How do outcomes in a specified parent training intervention maintain or wane over time? Prevention Science, 5, 73–​89. Dishion, T. J., Capaldi, D., Spracklen, K. M., & Li, F. (1995). Peer ecology of male adolescent drug use. Development and Psychopathology, 7, 803–​824. Dishion, T. J., McCord, J., & Poulin, F. (1999). When interventions harm:  Peer groups and problem behavior. American Psychologist, 54, 755–​764. Dishion, T. J., & McMahon, R. J. (1998). Parental monitoring and the prevention of child and adolescent problem behavior: A conceptual and empirical formulation. Clinical Child and Family Psychology Review, 1, 61–​75. Dodge, K. A., Dishion, T. J., & Lansford, J. E. (2006). Deviant peer influences in programs for youth: Problems and solutions. New York, NY: Guilford. Dodge, K. A., & Pettit, G. S. (2003). A biopsychosocial model of the development of chronic conduct problems in adolescence. Developmental Psychology, 39, 349–​371. Eisenstadt, T. H., Eyberg, S., McNeil, C. B., Newcomb, K., & Funderburk, B. (1993). Parent–​Child Interaction Therapy with behavior problem children:  Relative effectiveness of two stages and overall treatment outcome. Journal of Clinical Child Psychology, 22, 42–​51. Eyberg, S. (1993). Consumer satisfaction measures for assessing parent training programs. In L. VandeCreek, S. Knapp, & T. L. Jackson (Eds.), Innovations in clinical practice: A source book (Vol. 12, pp. 377–​382). Sarasota, FL: Professional Resource Press. Eyberg, S., Bessmer, J., Newcomb, K., Edwards, D., & Robinson, E. (1994). Dyadic Parent–​Child Interaction Coding System II: A manual. Unpublished manuscript, University of Florida, Gainesville, FL. Eyberg, S. M., & Funderburk, B. (2011). Parent-​Child Interaction Therapy protocol. Gainesville, FL:  PCIT International. Eyberg, S. M., Nelson, M. M., Ginn, N. C., Bhuiyan, N., & Boggs, S. R. (2013). Dyadic Parent–​Child Interaction Coding System: Comprehensive manual for research and training (4th ed.). Gainesville, FL: PCIT International. Eyberg, S. M., & Pincus, D. (1999). The Eyberg Child Behavior Inventory and Sutter–​Eyberg Student Behavior Inventory: Professional manual. Lutz, FL: Psychological Assessment Resources. Ezpeleta, L., de la Osa, N., Granero, R., Penelo, E., & Domenech, J. M. (2013). Inventory of Callous–​unemotional Traits in a community sample of preschoolers. Journal of Clinical Child and Adolescent Psychology, 42, 91–​105.

Fairchild, G., van Goozen, S. H.  M., Calder, A. J., & Goodyer, I. M. (2013). Research review:  Evaluating and reformulating the developmental taxonomic theory of antisocial behaviour. Journal of Child Psychology and Psychiatry, 54, 924–​940. Fanti, K. A., Frick, P. J., & Georgiou, S. (2009). Linking callous–​ unemotional traits to instrumental and non-​instrumental forms of aggression. Journal of Psychopathology and Behavioral Assessment, 31, 285–​298. Fanti, K. A., Panayiotou, G., Lazarou, C., Michael, R., & Georgiou, G. (2016). The better of two evils? Evidence that children exhibiting continuous problems high or low on callous–​unemotional traits score on opposite directions on physiological and behavioral measures of fear. Development and Psychopathology, 28, 185–​198. Fergusson, D. M., Swain, N. R., & Horwood, L. J. (2002). Deviant peer affiliations, crime and substance use:  A fixed effects regression analysis. Journal of Abnormal Child Psychology, 30, 419–​430. Fleming, A. P., McMahon, R. J., & King, K. M. (2016). Structured parent–​child observations predict development of conduct problems: The importance of parental negative attention in child-​directed play. Prevention Science, 18, 257–​267. Forehand, R., & McMahon, R. J. (1981). Helping the noncompliant child:  A clinician’s guide to parent training. New York, NY: Guilford. Frick, P. J. (2000). Laboratory and performance-​based measures of childhood disorders. Journal of Clinical Child Psychology, 29, 475–​478. Frick, P. J. (2012). Developmental pathways to conduct disorder:  Implications for future directions in research, assessment, and treatment. Journal of Clinical Child & Adolescent Psychology, 41, 378–​389. Frick, P. J., Barry, C. T., & Kamphaus, R. W. (2010). Clinical assessment of child and adolescent personality and behavior (3rd ed.). New York, NY: Springer. Frick, P. J., Bodin, S. D., & Barry, C. T. (2000). Psychopathic traits and conduct problems in community and clinic-​ referred samples of children:  Further development of the Psychopathy Screening Device. Psychological Assessment, 12, 382–​393. Frick, P. J., Cornell, A. H., Barry, C. T., Bodin, S. D., & Dane, H. A. (2003). Callous–​unemotional traits and conduct problems in the prediction of conduct problem severity, aggression, and self-​report of delinquency. Journal of Abnormal Child Psychology, 31, 457–​470. Frick, P. J., Cornell, A. H., Bodin, S. D., Dane, H. A., Barry, C. T., & Loney, B. R. (2003). Callous–​unemotional traits and developmental pathways to severe conduct problems. Developmental Psychology, 39, 246–​260. Frick, P. J., & Hare, R. D. (2001). The Antisocial Process Screening Device (APSD). Toronto, Ontario, Canada: Multi-​Health Systems.

Frick, P. J., Lahey, B. B., Hartdagen, S. E., & Hynd, G. W. (1989). Conduct problems in boys: Relations to maternal personality, marital satisfaction, and socioeconomic status. Journal of Clinical Child Psychology, 18, 114–​120. Frick, P. J., Lahey, B. B., Loeber, R., Tannenbaum, L. E., Van Horn, Y., Christ, M. A.  G.,  .  .  .  Hanson, K. (1993). Oppositional defiant disorder and conduct disorder: A meta-​analytic review of factor analyses and cross-​validation in a clinic sample. Clinical Psychology Review, 13, 319–​340. Frick, P. J., Lilienfeld, S. O., Ellis, M. L., Loney, B. R., & Silverthorn, P. (1999). The association between anxiety and psychopathy dimensions in children. Journal of Abnormal Child Psychology, 27, 381–​390. Frick, P. J., & Loney, B. R. (1999). Outcomes of children and adolescents with conduct disorder and oppositional defiant disorder. In H. C. Quay & A. Hogan (Eds.), Handbook of disruptive behavior disorders (pp. 507–​ 524). New York, NY: Plenum. Frick, P. J., & Loney, B. R. (2000). The use of laboratory and performance-​based measures in the assessment of children and adolescents with conduct disorders. Journal of Clinical Child Psychology, 29, 540–​554. Frick, P. J., & Marsee, M. A. (2006). Psychopathic traits and developmental pathways to antisocial behavior in youth. In C. J. Patrick (Ed.), Handbook of psychopathic traits (pp. 355–​374). New York, NY: Guilford. Frick, P. J., & Morris, A. S. (2004). Temperament and developmental pathways to conduct problems. Journal of Clinical Child and Adolescent Psychology, 33, 54–​68. Frick, P. J., O’Brien, B. S., Wootton, J. M., & McBurnett, K. (1994). Psychopathy and conduct problems in children. Journal of Abnormal Psychology, 103, 700–​707. Frick, P. J., & Ray, J. V. (2015). Evaluating callous-​ unemotional traits as a personality construct. Journal of Personality, 83, 710–​722. Frick, P. J., Ray, J. V., Thornton, L. C., & Kahn, R. E. (2014a). Can callous–​unemotional traits enhance the understanding, diagnosis, and treatment of serious conduct problems in children and adolescents? A comprehensive review. Psychological Bulletin, 140, 1–​57. Frick, P. J., Ray, J. V., Thornton, L. C., & Kahn, R. E. (2014b). A developmental psychopathology approach to understanding callous–​unemotional traits in children and adolescents with serious conduct problems. Journal of Child Psychology and Psychiatry, 55, 532–​548. Frick, P. J., & Viding, E. M. (2009). Antisocial behavior from a developmental psychopathology perspective. Development and Psychopathology, 21, 1111–​1131. Gadow, K. D., Arnold, L. E., Molina, B. S., Findling, R. L., Bukstein, O. G., Brown, N. V., . . . Aman, M. G. (2014). Risperidone added to parent training and stimulant medication:  Effects on attention-​deficit/​hyperactivity

disorder, oppositional defiant disorder, conduct disorder, and peer aggression. Journal of the American Academy of Child & Adolescent Psychiatry, 53, 948–​959. Gadow, K. D., & Sprafkin, J. (2002). CSI-​4 combined manual. Stony Brook, NY: Checkmate Plus. Green, S. M., Loeber, R., & Lahey, B. B. (1991). Stability of mothers’ recall of the age of onset of their child’s attention and hyperactivity problems. Journal of the American Academy of Child & Adolescent Psychiatry, 30, 135–​137. Griest, D. L., Forehand, R., Wells, K. C., & McMahon, R. J. (1980). An examination of differences between nonclinic and behavior problem clinic-​referred children. Journal of Abnormal Psychology, 89, 497–​500. Hanf, C. (1970). Shaping mothers to shape their children’s behavior. Unpublished manuscript, University of Oregon Medical School, Portland, OR. Hawes, D. J., & Dadds, M. R. (2005). The treatment of conduct problems in children with callous–​ unemotional traits. Journal of Consulting and Clinical Psychology, 73, 737–​741. Hawes, D. J., Price, M. J., & Dadds, M. R. (2014). Callous–​ unemotional traits and the treatment of conduct problems in childhood and adolescence: A comprehensive review. Clinical Child and Family Psychology Review, 17, 248–​267. Hawkins, J. D., Catalano, R. F., & Miller, J. Y. (1992). Risk and protective factors for alcohol and other drug problems in adolescence and early adulthood: Implications for substance abuse prevention. Psychological Bulletin, 112, 64–​105. Henggeler, S. W., Rowland, M. D., Randall, J., Ward, D. M., Pickrel, S. G., Cunningham, P. B.,  .  .  .  Santos, A. B. (1999). Home-​ based Multisystemic Therapy as an alternative to the hospitalization of youths in psychiatric crisis:  Clinical outcomes. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 1331–​1339. Herschell, A., Calzada, E., Eyberg, S. M., & McNeil, C. B. (2002). Clinical issues in Parent–​Child Interaction Therapy:  Clinical past and future. Cognitive and Behavioral Practice, 9, 16–​27. Hinshaw, S. P. (1992). Externalizing behavior problems and academic underachievement in childhood and adolescence:  Causal relationships and underlying mechanisms. Psychological Bulletin, 111, 127–​155. Hinshaw, S. P., Heller, T., & McHale, J. P. (1992). Covert antisocial behavior in boys with attention-​deficit hyperactivity disorder:  External validation and effects of methylphenidate. Journal of Consulting and Clinical Psychology, 60, 274–​281. Hinshaw, S. P., Simmel, C., & Heller, T. L. (1995). Multimethod assessment of covert antisocial behavior in children:  Laboratory observation, adult ratings, and child self-​report. Psychological Assessment, 7, 209–​219.

Hinshaw, S. P., Zupan, B. A., Simmel, C., Nigg, J. T., & Melnick, S. (1997). Peer status in boys with and without attention-​deficit hyperactivity disorder: Predictions from overt and covert antisocial behavior, social isolation, and authoritative parenting beliefs. Child Development, 68, 880–​896. Hodges, K. (2000). Child and Adolescent Functional Assessment Scale (2nd rev. ed.). Ypsilanti, MI: Eastern Michigan University. Hodges, K. (2004). Using assessment in everyday practice for the benefit of families and practitioners. Professional Psychology: Research and Practice, 35, 449–​456. Hodges, K., Xue, Y., & Wotring, J. (2004). Use of the CAFAS to evaluate outcomes for youths with severe emotional disturbance served by public mental health. Journal of Child and Family Studies, 13, 325–​339. Howard, A. L., Kimonis, E. R., Munoz, L. C., & Frick, P. J. (2012). Violence exposure mediates the relation between callous–​ unemotional traits and offending patterns in adolescents. Journal of Abnormal Child Psychology, 40, 1237–​1247. Hubbard, J. A., Dodge, K. A., Cillessen, A. H.  N., Coie, J. D., & Schwartz, D. (2001). The dyadic nature of social information processing in boys’ reactive and proactive aggression. Journal of Personality and Social Psychology, 80, 268–​280. Hubbard, J. A., Smithmyer, C. M., Ramsden, S. R., Parker, E. H., Flanagan, K. D., Dearing, K. F., . . . Simons, R. F. (2002). Observational, physiological, and self-​report measures of children’s anger:  Relations to reactive versus proactive aggression. Child Development, 73, 1101–​1118. Jacobs, J., Boggs, S. R., Eyberg, S. M., Edwards, D., Durning, P., Querido, J., . . . Funderburk, B. (2000). Psychometric properties and reference point data for the Revised Edition of the School Observation Coding System. Behavior Therapy, 31, 695–​712. Jensen, P. S., Watanabe, H. K., & Richters, J. E. (1999). Who’s up first? Testing for order effects in structured interviews using a counterbalanced experimental design. Journal of Abnormal Child Psychology, 27, 439–​445. Jones, D. J., Forehand, R. L., Cuellar, J., Parent, J., & Honeycutt, A. A. (2014). Technology-​enhanced program for child disruptive behavior disorders:  Development and pilot randomized control trial. Journal of Clinical Child and Adolescent Psychology, 43, 88–​101. Kahn, R. E., Frick, P. J., Youngstrom, E., Findling, R. L., & Youngstrom, J. K. (2012). The effects of including a callous–​unemotional specifier for the diagnosis of conduct disorder. Journal of Child Psychology and Psychiatry, 53, 271–​282. Kimonis, E. R., Fanti, K. A., Frick, P. J., Moffitt, T. E., Essau, C., Bijjtebier, P., & Marsee, M. A. (2015). Using self-​ reported callous–​unemotional traits to cross-​nationally

assess the DSM-​5  “With Limited Prosocial Emotions” specifier. Journal of Child Psychology and Psychiatry, 56, 1249–​1261. Kimonis, E. R., Fanti, K., & Singh, J. P. (2014). Establishing cut-​ off scores for the parent-​ reported Inventory of Callous–​unemotional Traits. Archives of Forensic Psychology, 1, 27–​48. Kimonis, E. R., Frick, P. J., & McMahon, R. J. (2014). Conduct and oppositional defiant disorders. In E. J. Mash & R. A. Barkley (Eds.), Child psychopathology (3rd ed., pp. 145–​179). New York: Guilford. Kimonis, E. R., Frick, P. J., Skeem, J., Marsee, M. A., Cruise, K., Muñoz, L. C.,  .  .  .  Morris, A. S. (2008). Assessing callous–​ unemotional traits in adolescent offenders: Validation of the Inventory of Callous–​unemotional Traits. Journal of the International Association of Psychiatry and Law, 31, 241–​252. Lahey, B. B., Goodman, S. H., Waldman, I. D., Bird, H., Canino, G., Jensen, P.,  .  .  .  Applegate, B. (1999). Relation of age of onset to the type and severity of child and adolescent conduct problems. Journal of Abnormal Child Psychology, 27, 247–​260. Lahey, B. B., Schwab-​Stone, M., Goodman, S. H., Waldman, I. D., Canino, G., Rathouz, P. J., . . . Jensen, P. S. (2000). Age and gender differences in oppositional behavior and conduct problems:  A cross-​ sectional household study of middle childhood and adolescence. Journal of Abnormal Psychology, 109, 488–​503. LaRue, R. H., & Handleman, J. (2006). A primer on school-​ based functional assessment. the Behavior Therapist, 29, 48–​52. Lawing, K., Frick, P. J., & Cruise, K. R. (2010). Differences in offending patterns between adolescent sex offenders high or low in callous–​unemotional traits. Psychological Assessment, 22, 298–​305. Leff, S. S., Angelucci, J., Grabowski, L., & Weil, J. (2004). Using school and community partners to design, implement, and evaluate a group intervention for relationally aggressive girls. In S. S. Leff (Chair), Using partnerships to design, implement, and evaluate aggression prevention programs. Symposium conducted at the meeting of the American Psychological Association, Honolulu. Lett, N. J., & Kamphaus, R. W. (1997). Differential validity of the BASC Student Observation System and the BASC Teacher Rating Scale. Canadian Journal of School Psychology, 13, 1–​14. Levene, K. S., Walsh, M. M., Augimeri, L. K., & Pepler, D. J. (2004). Linking identification and treatment of early risk factors for female delinquency. In M. M. Moretti, C. L. Odgers, & M. A. Jackson (Eds.), Girls and aggression:  Contributing factors and intervention principles (pp. 147–​163). New York, NY: Kluwer. Little, T. D., Jones, S. M., Henrich, C. C., & Hawley, P. H. (2003). Disentangling the “whys” from the “whats” of

aggressive behavior. International Journal of Behavioural Development, 27, 122–​133. Loeber, R., Burke, J. D., Lahey, B. B., Winters, A., & Zera, M. (2000). Oppositional defiant and conduct disorder: A review of the past 10 years, Part I. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 1468–​1482. Loeber, R., & Hay, D. F. (1997). Key issues in the development of aggressive and violence from childhood to early adulthood. Annual Review of Psychology, 48, 371–​410. Loeber, R., & Keenan, K. (1994). Interaction between conduct disorder and its comorbid conditions: Effects of age and gender. Clinical Psychology Review, 14, 497–​523. Loeber, R., & Stouthamer-​Loeber, M. (1986). Family factors as correlates and predictors of juvenile conduct problems and delinquency. In M. Tonry & N. Morris (Eds.), Crime and justice (Vol. 7, pp. 29–​149). Chicago, IL: University of Chicago Press. Loney, B. R., & Frick, P. J. (2003). Structured diagnostic interviewing. In C. R. Reynolds & R. W. Kamphaus (Eds.), Handbook of educational assessment of children (2nd ed., pp. 235–​247). New York, NY: Guilford. Loney, B. R., Frick, P. J., Ellis, M., & McCoy, M. G. (1998). Intelligence, psychopathy, and antisocial behavior. Journal of Psychopathology and Behavioral Assessment, 20, 231–​247. Lynskey, M. T., & Fergusson, D. M. (1995). Childhood conduct problems, attention deficit behaviors, and adolescent alcohol, tobacco, and illicit drug use. Journal of Abnormal Child Psychology, 23, 281–​302. Manders, W. A., Dekovic, M., Asscher, J. J., van der Laan, P. H., & Prins, P. J.  M. (2013). Psychopathy as a predictor and moderator of Multisystemic Therapy outcomes among adolescents treated for antisocial behavior. Journal of Abnormal Child Psychology, 41, 1121–​1132. Marsee, M. A., & Frick, P. J. (2007). Exploring the cognitive and emotional correlates to proactive and reactive aggression in a sample of detained girls. Journal of Abnormal Child Psychology, 35, 969–​981. Marsee, M. A., & Frick, P. J. (2010). Callous–​unemotional traits and aggression in youth. In W. Arsenio & E. Lemerise (Eds.), Emotions, aggression, and morality in children:  Bridging development and psychopathology (pp. 137–​ 156). Washington, DC:  American Psychological Association. Marsee, M. A., Frick, P. J., Barry, C. T., Kimonis, E. R., Munoz-​ Centifanti, L. C., & Aucoin, K. J. (2014). Profiles of the forms and functions of self-​ reported aggression in three adolescent samples. Development and Psychopathology, 26, 705–​720. Maughan, B., Rowe, R., Messer, J., Goodman, R., & Meltzer, H. (2004). Conduct disorder and oppositional defiant disorder in a national sample: Developmental epidemiology. Journal of Child Psychology and Psychiatry, 45, 609–​621.

McCabe, K., & Yeh, M. (2009). Parent–​Child Interaction Therapy for Mexican-​Americans:  A randomized clinical trial. Journal of Clinical Child and Adolescent Psychology, 38, 753–​759. McConaughy, S. H., & Achenbach, T. M. (2009). Manual for the Direct Observation Form. Burlington, VT: University of Vermont, Center for Children, Youth, and Families. McConaughy, S. H., Achenbach, T. M., & Gent, C. L. (1988). Multiaxial empirically based assessment: Parent, teacher, observational, cognitive, and personality correlates of Child Behavior Profile types for 6-​to 11-​year-​ old boys. Journal of Abnormal Child Psychology, 16, 485–​509. McMahon, R. J., & Estes, A. (1994). Fast Track parent–​child interaction task: Observational data collection manuals. Unpublished manuscript, University of Washington, Seattle, WA. McMahon, R. J., & Estes, A. M. (1997). Conduct problems. In E. J. Mash & L. G. Terdal (Eds.), Assessment of childhood disorders (3rd ed., pp. 130–​193). New  York, NY: Guilford. McMahon, R. J., & Forehand, R. (1983). Consumer satisfaction in behavioral treatment of children: Types, issues, and recommendations. Behavior Therapy, 14, 209–​225. McMahon, R. J., & Forehand, R. L. (2003). Helping the noncompliant child: Family based treatment for oppositional behavior (2nd ed.). New York, NY: Guilford. McMahon, R. J., Forehand, R., & Griest, D. L. (1981). Effects of knowledge of social learning principles on enhancing treatment outcome and generalization in a parent training program. Journal of Consulting and Clinical Psychology, 49, 526–​532. McMahon, R. J., & Frick, P. J. (2005). Evidence-​based assessment of conduct problems in children and adolescents. Journal of Clinical Child and Adolescent Psychology, 34, 477–​505. McMahon, R. J., & Metzler, C. W. (1998). Selecting parenting measures for assessing family-​based preventive interventions. In R. S. Ashery, E. B. Robertson, & K. L. Kumpfer (Eds.), Drug abuse prevention through family interventions (NIDA Research Monograph No. 177, pp. 294–​ 323). Rockville, MD:  National Institute on Drug Abuse. McMahon, R. J., Tiedemann, G. L., Forehand, R., & Griest, D. L. (1984). Parental satisfaction with parent training to modify child noncompliance. Behavior Therapy, 15, 295–​303. McMahon, R. J., Wells, K. C., & Kotler, J. S. (2006). Conduct problems. In E. J. Mash & R. A. Barkley (Eds.), Treatment of childhood disorders (3rd ed., pp. 137–​268). New York, NY: Guilford. Moffitt, T. E. (2006). Life-​course persistent versus adolescence-​ limited antisocial behavior. In D. Cicchetti & D. J. Cohen (Eds.), Developmental psychopathology:  Risk,

disorder, and adaptation (2nd ed., Vol. 3, pp. 570–​598). New York, NY: Wiley. Moffitt, T. E., & Caspi, A. (2001). Childhood predictors differentiate life-​course persistent and adolescence-​limited antisocial pathways in males and females. Development and Psychopathology, 13, 355–​376. Moffitt, T. E., Caspi, A., Dickson, N., Silva, P., & Stanton, W. (1996). Childhood-​onset versus adolescent-​onset antisocial conduct problems in males:  Natural history from ages 3 to 18  years. Development and Psychopathology, 8, 399–​424. Monahan, K. C., Steinberg, L., Cauffman, E., & Mulvey, E. (2009). Trajectories of antisocial behavior and psychosocial maturity from adolescence to young adulthood. Developmental Psychology, 45, 1654–​1668. Muñoz, L. C., & Frick, P. J. (2007). The reliability, stability, and predictive utility of the self-​report version of the Antisocial Process Screening Device. Scandinavian Journal of Psychology, 48, 299–​312. Muñoz, L. C., Frick, P. J., Kimonis, E. R., & Aucoin, K. J. (2008). Types of aggression, responsiveness to provocation, and callous–​unemotional traits in detained adolescents. Journal of Abnormal Child Psychology, 36, 15–​28. Muñoz Centifanti, L. C., & Modecki, K. (2013). Throwing caution to the wind:  Callous–​unemotional traits and risk taking in adolescents. Journal of Clinical Child and Adolescent Psychology, 42, 106–​119. Nixon, R. D. V., Sweeney, L., Erickson, D. B., & Touyz, S. W. (2003). Parent–​Child Interaction Therapy: A comparison of standard and abbreviated treatments for oppositional defiant preschoolers. Journal of Consulting and Clinical Psychology, 71, 251–​260. Nock, M. K., & Kurtz, S. M.  S. (2005). Direct behavioral observation in school settings: Bringing science to practice. Cognitive and Behavioral Practice, 12, 359–​370. Oberth, C., Zheng, Y., & McMahon, R. J. (2017). Violence exposure subtypes differentially mediate the relation between callous-​ unemotional traits and adolescent delinquency. Journal of Abnormal Child Psychology, 45, 1565–1575. O’Brien, B. S., & Frick, P. J. (1996). Reward dominance:  Associations with anxiety, conduct problems, and psychopathy in children. Journal of Abnormal Child Psychology, 24, 223–​240. Odgers, C. L., Caspi, A., Broadbent, J. M., Dickson, N., Hancox, R. J., . . . Moffitt, T. E. (2007). Prediction of differential adult health burden by conduct problem subtypes in males. Archives of General Psychiatry, 64, 476–​484. Office of Juvenile Justice and Delinquency Prevention. (1995). Juvenile offenders and victims:  A focus on violence. Pittsburgh, PA:  National Center for Juvenile Justice.

Pardini, D. A., Lochman, J. E., & Frick, P. J. (2003). Callous/​ unemotional traits and social cognitive processes in adjudicated youth. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 364–​371. Pasalich, D. S., Dadds, M. R., Hawes, D. J., & Brennan, J. (2012). Do callous–​ unemotional traits moderate the relative importance of parental coercion versus warmth in child conduct problems? An observational study. Journal of Child Psychology and Psychiatry, 52, 1308–​1315. Pasalich, D. S., Witkiewitz, K., McMahon, R. J., Pinderhughes, E. E., & the Conduct Problems Prevention Research Group. (2016). Indirect effects of the Fast Track intervention on conduct disorder symptoms and callous–​unemotional traits: Distinct pathways involving discipline and warmth. Journal of Abnormal Child Psychology, 44, 587–​597. Patterson, G. R., & Dishion, T. J. (1985). Contributions of family and peers to delinquency. Criminology, 23, 63–​79. Patterson, G. R., Reid, J. B., Jones, R. R., & Conger, R. E. (1975). A social learning approach to family intervention: Vol. 1. Families with aggressive children. Eugene, OR: Castalia. Patterson, G. R., & Yoerger, K. (1993). Developmental models for delinquent behavior. In S. Hodgins (Ed.), Mental disorder and crime (pp. 140–​ 172). Newbury Park, CA: Sage. Peed, S., Roberts, M., & Forehand, R. (1977). Evaluation of the effectiveness of a standardized parent training program in altering the interaction of mothers and their noncompliant children. Behavior Modification, 1, 323–​350. Piacentini, J., Roper, M., Jensen, P., Lucas, C., Fisher, P., Bird, H.,  .  .  .  Dulcan, M. (1999). Informant-​ based determinants of symptom attenuation in structured child psychiatric interviews. Journal of Abnormal Child Psychology, 27, 417–​428. Poulin, F., & Boivin, M. (2000). Reactive and proactive aggression:  Evidence of a two-​ factor model. Psychological Assessment, 12, 115–​122. Poythress, N. G., Dembo, R., Wareham, J., & Greenbaum, P. E. (2006). Construct validity of the Youth Psychopathic Traits Inventory (YPI) and the Antisocial Process Screening Device (APSD) with justice-​ involved adolescents. Criminal Justice and Behavior, 33, 26–​55. Ray, J. V., Thornton, L. C., Frick, P. J., Steinberg, L., & Cauffman, E. (2016). Impulse control and callous–​ unemotional traits distinguish patterns of delinquency and substance use in justice involved adolescents:  Examining the moderating role of neighborhood context. Journal of Abnormal Child Psychology, 44, 599–​611.

Reed, M. L., & Edelbrock, C. (1983). Reliability and validity of the Direct Observation Form of the Child Behavior Checklist. Journal of Abnormal Child Psychology, 11, 521–​530. Reich, W. (2000). Diagnostic Interview for Children and Adolescents (DICA). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 59–​66. Reynolds, C. R., & Kamphaus, R. W. (2015). Behavior Assessment System for Children-​3rd Edition (BASC-​3). Bloomington, MN: Pearson. Roberts, M. W., & Powers, S. W. (1988). The Compliance Test. Behavioral Assessment, 10, 375–​398. Roberts, M. W., & Powers, S. W. (1990). Adjusting chair timeout enforcement procedures for oppositional children. Behavior Therapy, 21, 257–​271. Robins, L. N. (1966). Deviant children grown up. Baltimore, MD: Williams & Wilkins. Rowe, R., Costello, J., Angold, A., Copeland, W. E., & Maughan, B. (2010). Developmental pathways in oppositional defiant disorder and conduct disorder. Journal of Abnormal Psychology, 119, 726–​738. Russo, D. C., Cataldo, M. F., & Cushing, P. J. (1981). Compliance training and behavioral covariation in the treatment of multiple behavior problems. Journal of Applied Behavior Analysis, 14, 209–​222. Scott, S., Sylva, K., Doolan, M., Price, J., Jacobs, B., Crook, C., & Landau, S. (2010). Randomised controlled trial of parent groups for child antisocial behaviour targeting multiple risk factors: The SPOKES Project. Journal of Child Psychology and Psychiatry, 51, 48–​57. Shaffer, D., Fisher, P., Lucas, C. P., Dulcan, M. K., & Schwab-​ Stone, M. E. (2000). NIMH Diagnostic Interview Schedule for Children version IV (NIMH DISC-​ IV):  Description, differences from previous versions, and reliability of some common diagnoses. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 28–​38. Shaffer, D., Gould, M. S., Brasic, J., Ambrosini, P., Fisher, P., Bird, H., & Aluwahlia, S. (1983). A Children’s Global Assessment Scale (CGAS). Archives of General Psychiatry, 40, 1228–​1231. Silverthorn, P., & Frick, P. J. (1999). Developmental pathways to antisocial behavior:  The delayed-​onset pathway in girls. Development and Psychopathology, 11, 101–​126. Stringaris, A., & Goodman, R. (2009). Longitudinal outcome of youth oppositionality: Irritable, headstrong, and hurtful behaviors have distinct predictions. Journal of the American Academy of Child and Adolescent Psychiatry, 48, 404–​412. Tibbetts, S. G., & Piquero, A. R. (1999). The influence of gender, low birth weight, and disadvantaged environment in predicting early onset of offending: A test of Moffitt’s interactional hypothesis. Criminology, 37, 843–​877.

Tiet, Q. Q., Wasserman, G. A., Loeber, R., Larken, S. M., & Miller, L. S. (2001). Developmental and sex differences in types of conduct problems. Journal of Child and Family Studies, 10, 181–​197. Viding, E., Sebastian, C. L., Dadds, M. R., Lockwood, P. L., Cecil, C. A., De Brito, S. A., & McCrory, E. J. (2012). Amygdala response to preattentive masked fear in children with conduct problems: The role of callous–​unemotional traits. American Journal of Psychiatry, 169, 1109–​1116. Vitaro, F., Brendgen, M., & Tremblay, R. E. (2002). Reactively and proactively aggressive children:  Antecedent and subsequent characteristics. Journal of Child Psychology and Psychiatry and Allied Disciplines, 43, 495–​506. Wahler, R. G., & Cormier, W. H. (1970). The ecological interview:  A first step in out-​ patient child behavior therapy. Journal of Behavior Therapy and Experimental Psychiatry, 1, 279–​289. Wakschlag, L. S., & Danis, B. (2004). Assessment of disruptive behaviors in young children: A clinical–​developmental framework. In R. Del Carmen & A. Carter (Eds.), Handbook of infant and toddler mental health assessment (pp. 421–​440). New York, NY: Oxford University Press. Walker, H. M., Ramsey, E., & Gresham, F. M. (2004). Antisocial behavior in school:  Evidence-​based practice. Belmont, CA: Wadsworth/​Thomas Learning. Waschbusch, D. A. (2002). A meta-​analytic examination of comorbid hyperactive–​impulsive–​attention problems and conduct problems. Psychological Bulletin, 128, 118–​150. Webster-​Stratton, C., & Hammond, M. (1997). Treating children with early-​onset conduct problems: A comparison of

child and parent training programs. Journal of Consulting and Clinical Psychology, 65, 93–​109. Webster-​Stratton, C., & Lindsay, D. W. (1999). Social competence and conduct problems in young children: Issues in assessment. Journal of Clinical Child and Adolescent Psychology, 28, 25–​43. Webster-​Stratton, C., & Spitzer, A. (1991). Development, reliability, and validity of the Daily Telephone Discipline Interview. Behavioral Assessment, 13, 221–​239. Wells, K. C., Forehand, R., & Griest, D. L. (1980). Generality of treatment effects from treated to untreated behaviors resulting from a parent training program. Journal of Clinical Child Psychology, 8, 217–​219. White, S. F., Frick, P. J., Lawing, K., & Bauer, D. (2013). Callous–​unemotional traits and response to Functional Family Therapy in adolescent offenders. Behavioral Science and the Law, 31, 271–​285. Willoughby, M., Kupersmidt, J., & Bryant, D. (2001). Overt and covert dimensions of antisocial behavior in early childhood. Journal of Abnormal Child Psychology, 29, 177–​187. Wootton, J. M., Frick, P. J., Shelton, K. K., & Silverthorn, P. (1997). Ineffective parenting and childhood conduct problems: The moderating role of callous–​unemotional traits. Journal of Consulting and Clinical Psychology, 65, 301–​308. Zoccolillo, M. (1992). Co-​occurrence of conduct disorder and its adult outcomes with depressive and anxiety disorders:  A review. Journal of the American Academy of Child & Adolescent Psychiatry, 31, 547–​556.

Part III

Mood Disorders and Self-​Injury

6

Depression in Children and Adolescents

Lea R. Dougherty, Daniel N. Klein, and Thomas M. Olino

This chapter provides a review of evidence-based assessments of depression in children and adolescents. We focus on three phases of assessment: diagnosis, case conceptualization and treatment planning, and treatment monitoring/evaluation. Our goal is to outline the parameters of a general assessment strategy and evaluate the efficacy of various assessment tools. Nevertheless, we acknowledge that additional areas will have to be explored for particular cases or contexts.

Several changes were made to the depressive disorders section of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association [APA], 2013). First, DSM-IV categories of chronic major depression and dysthymic disorder were integrated into a new category, persistent depressive disorder. Second, disruptive mood dysregulation disorder (DMDD) and premenstrual dysphoric disorder were added. Third, depressive disorder not otherwise specified (DD-NOS) was removed from DSM-5 and replaced with "other specified depressive disorder" (which includes recurrent brief depression, short-duration depressive episode, and depressive episode with insufficient symptoms) and "unspecified depressive disorder." Last, DSM-5 removed the bereavement exclusion and added "with anxious distress" and "with mixed features" specifiers for MDD. These changes are not without controversy and require further investigation into their validity and clinical utility. In this chapter, we focus primarily on DSM-IV-TR (APA, 2000) major depressive disorder (MDD) and, to a lesser extent, dysthymic disorder (DD), given the scant research on the DSM-5 changes. Nevertheless, we highlight any literature evaluating DSM-5 depressive disorders.

We believe that the diagnoses of the depressive disorders have a moderate degree of clinical utility and construct validity in children and adolescents. However, as understanding of the etiology and development of depression increases, the classification of depression in young people will undoubtedly change in significant ways. For instance, the National Institute of Mental Health (NIMH) initiated the Research Domain Criteria (RDoC) project to provide a new framework for studying mental disorders. RDoC integrates many levels of information (from genomics to self-report) to better understand basic biobehavioral dimensions underlying the full range of human behavior (Insel et al., 2010). As the RDoC framework is investigated, it may have a substantial impact on how we classify depressive and other mental disorders across the lifespan.

An assessment strategy should be driven by the available data on the clinical features, associated characteristics, course, and treatment of depression, as well as what is known about the processes involved in the maintenance and recurrence of episodes. Hence, we begin with a brief overview of the literature on the psychopathology and treatment of depressive disorders in children and adolescents. This is followed by a review and evaluation of the tools used in each phase of assessment.

THE NATURE OF DEPRESSION

Psychopathology

In the DSM-IV-TR and DSM-5, MDD in children and adolescents is defined by a period of at least 2 weeks characterized by the presence of depressed or irritable mood or loss
of interest or pleasure, and at least five of nine symptoms. DSM-​IV-​TR DD and DSM-​5 persistent depressive disorder in children and adolescents are defined as a period of at least 1 year characterized by depressed or irritable mood and at least two of six symptoms. Although DSM-​IV-​TR MDD and DD are not mutually exclusive (i.e., they often co-​occur, a phenomenon referred to as “double depression”), DSM-​ 5 persistent depressive disorder is mutually exclusive from MDD because the latter diagnosis reflects episodic depressive episodes only. Although evidence supporting developmental differences in the factor structure for depressive symptoms has been mixed, there is evidence for age-​related increases in cognitive symptoms, anhedonia, hypersomnia, weight gain, decreased energy, and social withdrawal, which likely reflect the age-​related increases in the rates of depressive disorders rather than their changing presentations across development (Gibb, 2014). However, there is evidence that the duration criterion for MDD should be reduced for very young children (Luby et al., 2003).
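
To make the symptom-count thresholds described above concrete, the following minimal sketch encodes them in Python. It is illustrative only, not a diagnostic instrument: the function names and arguments are our own, and the sketch omits the exclusion criteria, impairment requirements, and clinical judgment that actual diagnosis requires.

# Minimal sketch of the symptom-count thresholds described above (illustrative
# only; actual diagnosis also requires impairment, exclusion criteria, and
# clinical judgment).

MDD_MIN_SYMPTOMS = 5          # at least five of the nine MDD symptoms
MDD_MIN_DURATION_WEEKS = 2    # present for at least 2 weeks
PDD_MIN_SYMPTOMS = 2          # at least two of six symptoms (DD/persistent depressive disorder)
PDD_MIN_DURATION_MONTHS = 12  # at least 1 year in children and adolescents


def meets_mdd_symptom_count(total_symptoms: int,
                            depressed_or_irritable_mood: bool,
                            anhedonia: bool,
                            duration_weeks: float) -> bool:
    """Simplified MDD rule: a cardinal symptom (depressed/irritable mood or
    loss of interest/pleasure) plus at least five of nine symptoms overall,
    lasting at least 2 weeks."""
    cardinal = depressed_or_irritable_mood or anhedonia
    return (cardinal
            and total_symptoms >= MDD_MIN_SYMPTOMS
            and duration_weeks >= MDD_MIN_DURATION_WEEKS)


def meets_pdd_symptom_count(total_symptoms: int,
                            depressed_or_irritable_mood: bool,
                            duration_months: float) -> bool:
    """Simplified DD/persistent depressive disorder rule for youth: depressed
    or irritable mood plus at least two of six symptoms for at least 1 year."""
    return (depressed_or_irritable_mood
            and total_symptoms >= PDD_MIN_SYMPTOMS
            and duration_months >= PDD_MIN_DURATION_MONTHS)


# Example: irritable mood, anhedonia, and four other symptoms (six of nine)
# lasting three weeks satisfies the simplified MDD count.
print(meets_mdd_symptom_count(6, True, True, 3))   # True
print(meets_pdd_symptom_count(3, True, 14))        # True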

Prevalence

Depressive disorders are relatively uncommon in children but are more frequent in adolescents. In community samples, the point prevalence of depressive disorders is 0.5% to 2% in preschool-aged children, 1% to 3% in school-age children, and 5% to 6% in adolescents; the lifetime prevalence in adolescents is 15% to 20% (Gibb, 2014; Lewinsohn & Essau, 2002). Not surprisingly, the prevalence of depression is much higher in clinical settings, with estimated rates of 8% to 15% in children and greater than 50% in adolescents (Garber & Horowitz, 2002). There is no consistent gender difference in the prevalence of depressive disorders in children; however, the rates diverge in early adolescence, and by age 15 years the prevalence is approximately two times higher in females than in males (Hankin et al., 1998).

Associated Features

Two associated features that are important to consider in assessing depression are functional impairment and comorbidity, as both may influence course and treatment response, as well as constituting important treatment targets in their own right.

Functional Impairment

Depressive disorders in children and adolescents are associated with significant problems with psychosocial functioning. Depressed children and adolescents often exhibit significant impairment in family, school, and peer functioning, and some degree of impairment may persist after recovery from the depressive episode (Garber & Horowitz, 2002; Lewinsohn & Essau, 2002). Depression is the leading risk factor for youth suicide, and it may be a risk factor for the development of other disorders such as substance abuse (Birmaher, Arbelaez, & Brent, 2002). The causal relationship between depression and functional impairment is complex: Depression causes significant impairment, but poor functioning may also be a risk factor for depression.

Comorbidity

Depressive disorders are comorbid with other disorders across the lifespan. In school-aged children and adolescents, approximately two-thirds of depressed youth have at least one comorbid disorder (Avenevoli, Swendsen, He, Burstein, & Merikangas, 2015; Ford, Goodman, & Meltzer, 2003). In a meta-analysis of studies using community samples, Angold, Costello, and Erkanli (1999) reported that the median odds ratios for the associations of depression with anxiety, conduct, and attention deficit disorder were 8.2, 6.6, and 5.5, respectively. Depression is also often comorbid with eating, reading, and developmental disorders, and general medical conditions. Even after adjusting for the presence of other diagnoses and their comorbidity among each other, youth depression continues to evidence significant comorbidity with generalized anxiety, social anxiety, oppositional defiant disorder (ODD), conduct disorder (CD), and attention-deficit/hyperactivity disorder (ADHD) (adjusted median odds ratios were 37.9, 9.9, 10.9, 2.5, and 1.5, respectively) (Copeland, Shanahan, Erkanli, Costello, & Angold, 2013). In depressed preschoolers, rates of comorbidity may be higher, and the overlap with anxiety disorders and the behavioral disorders is present even at this young age (Egger & Angold, 2006; Maughan, Collishaw, & Stringaris, 2013).
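
The odds ratios cited above are the effect size metric these comorbidity analyses report. As a reminder of how such a ratio is computed, the formula below is standard; the 2 x 2 counts in the worked example are hypothetical and are not drawn from the studies cited.

%% Odds ratio for the association between depression and a comorbid disorder,
%% computed from a 2 x 2 table (counts a, b, c, d are hypothetical placeholders):
%%                    comorbid present   comorbid absent
%%   depressed               a                 b
%%   not depressed           c                 d
\[
\mathrm{OR} \;=\; \frac{a/b}{c/d} \;=\; \frac{a\,d}{b\,c}
\]
%% Hypothetical example: a = 40, b = 60, c = 10, d = 90 gives
%% OR = (40)(90) / (60)(10) = 6, i.e., the odds of the comorbid disorder are
%% six times higher among depressed youth than among nondepressed youth.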

Course

Almost all children and adolescents with an episode of MDD recover, although many continue to experience subsyndromal (or residual) symptomatology. The length of episodes varies. The mean duration of episodes of MDD is approximately 7 or 8 months in clinical samples, and episodes of DD last an average of 48 months (Birmaher et al., 2002; Kovacs, 1996). Rates of relapse and recurrence of MDD are high, with the majority of
depressed juveniles experiencing another episode within several years (Birmaher et al., 2002; Kovacs, 1996). With respect to homotypic continuity, long-​ term follow-​ up studies indicate that adolescents with MDD are at high risk for experiencing depressive episodes (Copeland, Shanahan, Costello, & Angold, 2009; Lewinsohn, Rohde, Klein, & Seeley, 1999) and significant functional impairment (Copeland, Wolke, Shanahan, & Costello, 2015) in adulthood. Preschool-​onset depression has been found to predict school-​age and early adolescent depression (Luby, Gaffrey, Tillman, April, & Belden, 2014), but long-​term follow-​up studies into adolescence and adulthood are unavailable. Evidence for childhood depression predicting adolescent and adult depression is less consistent (Birmaher et  al., 2002). Depressive disorders also evidence significant heterotypic continuity (i.e., one disorder predicts another disorder) with anxiety, ODD/​ CD, ADHD, and substance use disorders. The temporal sequence between disorders appears to change across development and is often bidirectional (Maughan et al., 2013). For example, although the temporal sequencing of the association between anxiety and depression is bidirectional by late adolescence, anxiety often precedes depression in childhood and early adolescence. There is also evidence of an increased risk for depression associated with the irritable subcomponent of ODD. The mechanisms and processes that serve to maintain depressive episodes and cause recurrences are poorly understood. However, longitudinal studies of the course of depression in children and adolescents have identified a number of factors that appear to predict the duration of MDD episodes and the probability of recurrence. Variables that are associated with a longer time to recovery include an early age of onset, greater severity of depression, suicidality, double depression, the presence of comorbid anxiety or disruptive behavior disorders, depressotypic cognitions, and an adverse family environment. Variables that have been associated with an increased risk of recurrence include greater severity, psychotic symptoms, suicidality, a prior history of recurrent MDD, double depression, the presence of subthreshold symptoms after recovery, a depressotypic cognitive style, recent stressful life events, an adverse family environment, and a family history of MDD (particularly if it is recurrent) (Birmaher et al., 2002). Children and adolescents with MDD and DD are also at risk for developing manic and hypomanic episodes. The probability of “switching” to bipolar disorder is higher in patients with psychotic symptoms, psychomotor retardation, a family history of bipolar disorder, and/​or a

high familial loading for mood disorders (Birmaher et al., 2002; Geller, Fox, & Clark, 1994).

The processes and mechanisms involved in increased risk for the onset of depression are likely multifaceted and include a number of inherited, biological, and psychosocial risk factors, including (but not limited to) a family history of depression, stressful life events, family separation and conflict, child maltreatment, peer difficulties, child temperament, early mood and behavioral dysregulation, neuroendocrine and neurocognitive processes, neural circuitry involved in the processing of threat and reward, and genetic pathways involving gene–environment correlations and interactions (Gibb, 2014; Klein, Goldstein, & Finsaas, 2017; Maughan et al., 2013; Thapar, Collishaw, Pine, & Thapar, 2012).

Treatment

There is relatively strong support for the efficacy of cognitive–behavioral therapy (CBT) and interpersonal therapy (IPT) for depressed adolescents, but effects are modest (effect sizes of Cohen's d = 0.37 and 0.26 for CBT and IPT, respectively; for a review, see Maalouf & Brent, 2012). Fewer data are available on the efficacy of psychosocial interventions in school-aged children. Although the findings have varied, the majority of studies have reported evidence supporting the efficacy of CBT (Maalouf & Brent, 2012). Data on the treatment of depression in young children are very sparse. Luby and colleagues (Lenze, Pautsch, & Luby, 2011; Luby, Lenze, & Tillman, 2012) adapted parent–child interaction therapy (PCIT), originally developed for early childhood externalizing problems, for preschool-onset depression by adding an emotional development module (termed PCIT-ED); preliminary findings suggest PCIT-ED may be a promising treatment for preschool-onset depression.

Some evidence also suggests that CBT may be effective in preventing the onset of depression and reducing symptoms, particularly in high-risk youth, but treatment effects decrease substantially over time (Maalouf & Brent, 2012; Stockings et al., 2016). The effects of family therapy, either alone or in conjunction with treatment for adolescents, have been mixed; however, treatment of parents' depression, alone or in conjunction with youths' treatment, shows benefit (Maalouf & Brent, 2012). Despite the efficacy of psychosocial interventions for depressed children and adolescents in clinical trials, there is evidence that the types of treatments routinely provided in community settings are less successful than these evidence-based treatments (Weersing & Weisz, 2002). Treatments that are adapted for or developed in

community settings are needed (for a review, see Weisz, Krumholz, Santucci, Thomassin, & Ng, 2015). Controlled pharmacotherapy clinical trials in children and adolescents are also limited. The available evidence indicates that the cyclic antidepressants are not efficacious. Several double-​blind placebo-​controlled trials have reported benefits for selective serotonin reuptake inhibitors (SSRIs) in adolescents or mixed samples of children and adolescents (average effect size, Cohen’s d  =  0.25; Bridge et al., 2007), although effects are modest and some published and unpublished studies have failed to find differences (Maalouf & Brent, 2012; Vasa, Carlino, & Pine, 2006; Vitiello, 2011). Some evidence suggests that the combination of medication and CBT is superior to medication alone, particularly for moderate to severe depression and treatment-​resistant depression, although there are negative findings; nevertheless, combined treatment appears to provide greater improvement of functional status (Dubicka, et  al., 2010; Vitiello, 2009). Questions have also been raised about whether SSRIs are associated with increased suicidal ideation and behavior in children and adolescents (Bridge et  al., 2007; Maalouf & Brent, 2012). Currently, fluoxetine and escitalopram are the only SSRIs approved for the treatment of adolescent depression in the United States, and only fluoxetine is cautiously recommended for use with preadolescent children; thus, the effects of medication use in youth must be closely monitored. Even following an acute phase of one of these effective treatments, approximately 30% to 50% of depressed youth do not improve (Vitiello, 2009), and of those who do improve, rates of relapse and recurrence are high when psychosocial and pharmacological treatments are terminated. Although continued treatment with an SSRI has been shown to lower relapse rates compared to placebo (Emslie et al., 2008), the combination of medication management and CBT demonstrates lower relapse rates compared to medication management alone (Emslie et  al., 2015; Kennard et al., 2014). Moreover, in youth with treatment-​ resistant depression who received an acute SSRI treatment, switching to a combination of CBT and another antidepressant resulted in greater clinical response compared to switching to another medication without CBT (Brent et al., 2008). Continued monitoring of treatment response and pursuing other treatment avenues when patients are not responding to treatment are critical. Data on predictors of treatment response in depressed children and adolescents are limited. Although findings are mixed, data suggest that greater baseline symptom severity, comorbid anxiety, anhedonia, hopelessness, nonsuicidal self-​injurious behavior, subsyndromal manic

symptoms, and severe parent–child conflict predict poorer treatment response (Emslie, Kennard, & Mayes, 2011). Moreover, combined treatment may be more effective for certain adolescents, including those with comorbid conditions and moderate to severe depression (Emslie et al., 2011).

DSM-5 DMDD

DSM-5 DMDD is characterized by severe temper tantrums and persistently angry/irritable mood that are present for at least 12 months and across contexts. DMDD cannot be diagnosed in children before age 6 years and must be observed by age 10 years. Emerging research shows that DMDD may be relatively common in clinical settings (26.0%–30.5%) (Axelson et al., 2012; Margulies, Weintraub, Basile, Grover, & Carlson, 2012) but fairly uncommon in community samples (with 3-month prevalence rates ranging from 0.8% to 8.2%; Copeland, Angold, Costello, & Egger, 2013; Dougherty et al., 2014, 2016). DMDD frequently co-occurs with another disorder (60.5% to 92.0% in the community-based studies), and the highest rates of co-occurrence are with depression and ODD (Copeland, Angold, et al., 2013; Dougherty et al., 2014). The course and stability of DMDD across childhood are largely unknown. Findings suggest that rates of DMDD decrease across childhood (Copeland, Angold, et al., 2013; Dougherty et al., 2016), and the majority of children with DMDD no longer meet criteria for the diagnosis at 3- or 4-year follow-up (Deveney et al., 2015; Dougherty et al., 2016). However, these children are at high risk for continued impairment and other forms of psychopathology across childhood and into adulthood, including adult depressive and anxiety disorders (Copeland, Shanahan, Egger, Angold, & Costello, 2014; Dougherty et al., 2016). No randomized controlled clinical trials for DMDD have been conducted to date.

PURPOSES OF ASSESSMENT

Clinical assessment can be thought of as a sequence including at least three phases: diagnosis, case conceptualization and treatment planning, and treatment monitoring and evaluation. The major goal of the first phase is to develop a preliminary diagnosis and prognosis. For depression, this includes determining whether criteria are met for MDD or persistent depressive disorder and ruling out exclusionary diagnoses such as bipolar disorder and depression due to a general medical condition
or substance. As part of the assessment of depression, the clinician must assess key symptoms (e.g., suicidal ideation and psychotic symptoms) that might influence treatment decisions. In addition, it is important to carefully assess the previous course of the depression (e.g., prior episodes and chronicity) due to its prognostic value and possible implications for long-​term treatment. It is also important to assess comorbid psychiatric, developmental, and general medical disorders, and areas of significant functional impairment (e.g., family, school, and peers), in order to determine whether depression is the principal diagnosis that should be the primary target of intervention and because of their prognostic implications. Given the high comorbidity between the mood and anxiety disorders, we refer the reader to Chapter 11 in this volume on the assessment of child and adolescent anxiety disorders. The second phase of assessment involves developing a case conceptualization and treatment planning. In addition to variables already described, a comprehensive assessment of personal, interpersonal, or systemic dynamics is crucial in order to provide clues to the development and maintenance of symptoms and dysfunctional life patterns and to provide the focus of treatment. First, it is important to assess the child’s family environment, school functioning, peer relationships, significant stressors and traumas, and family history of psychopathology because these factors have considerable prognostic value and may be involved in the development and/​or maintenance of the disorder. Second, it is important to consider other social factors such as race, culture, ethnicity, and socioeconomic status. Poverty, race, social stressors, and ethnicity have all been linked to greater depression symptomatology in youth (Taylor & Turner, 2002; Wight, Aneshensel, Botticello, & Sepulveda, 2005). Furthermore, because current views of depression are primarily shaped by Western culture, depression may manifest itself differently across cultures and ethnicity. This is suggested by differences in the phenomenology and prevalence of depression across cultures and ethnic groups (Chentsova-​Dutton, Ryder, & Tsai, 2014). Moreover, we need to examine the validity of assessment tools across cultures because evidence suggests that they may also vary (e.g., Dere et al., 2015). Third, data on the severity and prior course of depression, key symptoms such as suicidal ideation/​behavior and psychotic symptoms, comorbidity, and functional impairment are important for determining the appropriate treatment setting (e.g., inpatient vs. outpatient), the intensity and duration of treatment, and perhaps the treatment modality. As noted previously, however, few data
are available to guide these decisions. Information on comorbidity is also necessary to determine whether other disorders should be monitored or targeted for treatment. Finally, it is critical to take a detailed history of previous treatment and assess the goals, attitudes, and motivation of the child and parents with respect to the relevant treatment options. This information is critical both for treatment selection and for engaging the child and family in treatment. Because children and parents often disagree on the selection of treatment targets (Hawley & Weisz, 2003), it may take considerable negotiation in order to develop a treatment plan that is acceptable to all parties. The third phase of assessment involves treatment monitoring and evaluation. This entails systematically assessing the degree of change in target symptoms and impairments in order to determine whether treatment should be continued, intensified, augmented, changed, or terminated. Although few guidelines are available to help clinicians determine when treatment should be modified, recent work has begun development and preliminary evaluation of "adaptive interventions" for use in child and adolescent mental health services, which provide a sequence of decision rules that determine whether, how, or when to alter the type, dosage, or delivery of service over the course of treatment (Almirall & Chronis-Tuscano, 2016; Gunlicks-Stoessel, Mufson, Westervelt, Almirall, & Murphy, 2016). Research in this area is critically needed.

Information Source

It is important to obtain data from multiple informants, including the child, parents, and teachers. Child report is critical because parents and teachers tend to report lower levels of depressive and other internalizing symptoms in children than youths report themselves (Jensen et al., 1999). However, it is useful to supplement youths' reports with information from collaterals to assess externalizing disorders. Parent reports are particularly important for preschool and school-age children. Due to developmental limitations in cognitive processes and language abilities, children are less reliable reporters of psychopathology than adolescents (Edelbrock, Costello, Dulcan, Kalas, & Conover, 1985). In addition, younger children have difficulty reporting on information regarding temporal parameters; therefore, parents must be relied on for information on course such as age of onset, previous episodes, and duration of current episode (Kovacs, 1986). Finally, parents are more involved in the day-to-day lives of children than adolescents and therefore are more knowledgeable about their behavior and activities.

Although obtaining data from multiple informants is optimal, agreement between informants is only fair to moderate (Achenbach, McConaughy, & Howell, 1987). Informants tend to agree more when they observe youths in the same context, when the target behavior is easy to observe (e.g., externalizing vs. internalizing), and when a dimensional measure (vs. a categorical measure) is used (Achenbach et al., 1987; De Los Reyes et al., 2015). Nevertheless, evidence suggests that informant discrepancies provide meaningful and valid information, such as the situational specificity of the child’s emotional and behavioral problems. Several studies have demonstrated that child, parent, teacher, and clinician ratings all account for significant unique variance in predicting subsequent outcomes (Ferdinand et  al., 2003; Verhulst, Dekker, & van der Ende, 1997). In addition, depressed parents appear to have a lower threshold for detecting depression in their children; hence, their reports tend to yield higher rates of both true and false positives (i.e., increased sensitivity but decreased specificity) (Richters, 1992; Youngstrom, Izard, & Ackerman, 1999). The low agreement between data sources presents a significant challenge for clinicians who must decide how to interpret and integrate conflicting information. A variety of approaches to integrating data from multiple informants have been discussed in the literature, including assuming that the feature or diagnosis is present if any informant reports it (the “or” rule), requiring several informants to confirm the feature or diagnosis (the “and” rule), relying on the informant who is judged to be the most valid source of information for the feature or diagnosis, or developing various statistical procedures for optimizing prediction (for a review, see De Los Reyes, Thomas, Goodman, & Kundey, 2013). The approach that most closely mirrors clinical practice is the “best estimate” procedure, in which the clinician uses his or her best judgment to evaluate the informant’s credibility and integrate and resolve conflicting reports. This raises the possibility of introducing the unreliability and idiosyncrasy that structured interviews and standardized ratings scales were developed to prevent (discussed later). However, there is evidence from the adult literature that, when applied following appropriate guidelines (e.g., self-​ report takes precedence for internalizing disorders; informant report is given priority for externalizing disorders), the reliability of best estimate diagnoses can be very high (Klein, Ouimette, Kelly, Ferro, & Riso, 1994). Nevertheless, clinical science has not established “best practices” for using and interpreting multi-​informant assessments, and work in this area is particularly scarce for the assessment
of youths' internalizing problems (De Los Reyes et al., 2015). However, recent theoretical work on interpreting multi-informant assessment outcomes in research (e.g., operations triad model; De Los Reyes et al., 2013) may assist in future efforts toward developing evidence-based assessment practices in clinical practice.

Attenuation Effect

Studies of interviews and rating scales for both juvenile and adult psychopathology have often found that rates of diagnoses and ratings of symptom severity tend to decrease with repeated administrations, a phenomenon referred to as the "attenuation effect" (Egger et al., 2006). Because this has been observed in nonclinical samples, it cannot be attributed to treatment or regression to the mean. This has important implications for treatment monitoring and evaluation because it is difficult to distinguish the attenuation effect from a positive response to treatment for the individual patient. Although there is no solution to this problem at present, it behooves the clinician to be aware of this phenomenon and to consider alternative explanations for what appears to be improvement on rating scales.

Psychometric Considerations

In reviewing available instruments for each of the assessment phases described previously in this section, accompanying tables are used to present general information on a measure's psychometric properties and clinical utility. Thus, the presentation of specific psychometric data is kept to a minimum in the text and tables. As a general rule, we chose to include more widely used assessment tools that have been independently examined by at least two research groups. We made exceptions to this rule when a new measure appeared exceptionally promising due to unique features of the instrument. However, these newer measures are not included in the tables because there are insufficient data to evaluate their efficacy at this time. Measures were evaluated according to the criteria presented in Hunsley and Mash's introductory chapter in this volume. Nevertheless, we mention several factors that influenced our ratings. First, inter-rater reliability can be examined by raters independently rating a case vignette, a videotaped or audiotaped assessment, a live assessment (paired-rater design), or by two examiners administering the same instrument at two different time points usually spanning only a few days (test–retest design). The first three approaches hold information constant across raters;
hence, reliability should be higher than that of test–​retest designs, in which information presented to each examiner can vary substantially. In making the ratings, we tried to take the type of design into account. In addition, examining the test–​retest reliability of depression in youth over several months or a year is relatively uncommon because depression in youth is often intermittent/​ episodic. Therefore, most ratings of test–​retest reliability cannot receive more than an adequate rating due to the shorter time frames assessed. Second, evaluating convergent and divergent validity of an instrument can be difficult because depression tends to co-​occur with many other forms of psychopathology. Although depression measures should correlate more highly with other depression measures than with measures of other forms of psychopathology, there should be substantial correlations between measures of depression and measures of anxiety and/​or behavior problems. Similarly, depressed youth are likely to differ from nondepressed youth not only on measures of depression but also on other measures of psychological dysfunction. Thus, modest discriminant validity may not be a limitation of the instrument but instead might reflect the comorbidity between depression and other disorders. Last, very little work has examined the clinical utility of youth depression measures. We are aware of only one such study (Hughes et al., 2005) that used the Schedule for Affective Disorders and Schizophrenia in School-​Age Children (K-​SADS); therefore, all other measures did not receive above an adequate rating on this criterion. Obviously, this is an area of research that needs much attention.

ASSESSMENT FOR DIAGNOSIS

The two major approaches to diagnosing and assessing depression in children and adolescents involve interviews and rating scales. Interviews can be unstructured, semi-​structured, or fully structured. Unstructured clinical interviews are variable across clinicians, who often fail to inquire about key aspects of psychopathology, particularly if it is inconsistent with their initial diagnostic impressions (Angold & Fisher, 1999), and formulate fewer diagnoses than clinicians using structured interviews (Zimmerman, 2003). With semi-​structured interviews, the interviewer is responsible for rating the criteria as accurately as possible, using all available information, and improvising additional questions or confronting the respondent with inconsistencies when necessary. In contrast, the interviewer’s role in fully structured interviews is limited to
reading the questions as written and recording the respondent's answers. As a result, semi-structured interviews were designed for use by mental health professionals or well-trained and supervised technicians, and seek to capitalize on their clinical training and experience, whereas fully structured interviews were developed for lay interviewers in large-scale epidemiological studies in which the cost of interviewers with clinical training is prohibitive. There have been few direct comparisons of the validity of semi- versus fully structured interviews, but some data support their concordance (e.g., Green et al., 2012). Nevertheless, given the limited data, we generally assume that the semi-structured approach yields higher quality data compared to the structured approach because the interviewer presumably has a better sense of the constructs being assessed than does the respondent.

Rating scales include clinician-administered, self-report, and parent- and teacher-report measures. Clinician-administered rating scales are semi-structured interviews that focus on a circumscribed area of symptomatology (e.g., depression). Self-report and parent and teacher rating scales are rated by the designated informant, although they can be read to younger children. Unlike diagnostic interviews, rating scales do not collect sufficient information to make a diagnosis (e.g., duration and exclusion criteria are generally not assessed). Due to their economy, self and informant rating scales can be especially valuable as screening instruments, with elevated scores leading to a more intensive evaluation. Self-rating scales are generally superior to parent and teacher rating scales in screening for internalizing disorders due to their greater sensitivity. However, even the best self-rating scales have only moderate sensitivity and specificity, producing a substantial number of false positives and false negatives (Kendall, Cantwell, & Kazdin, 1989). Because the prevalence of child and adolescent depression tends to be fairly low in most screening contexts, the number of false positives greatly outnumbers true positives (Matthey & Petrovski, 2002; Roberts, Lewinsohn, & Seeley, 1991). Thus, the potential economy and efficiency of screening must be weighed against the costs of unnecessary extended evaluations for false-positive cases and the risks associated with missing false-negative cases.
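
To make the base-rate problem concrete, the following sketch works through hypothetical screening numbers; the prevalence, sensitivity, and specificity values are illustrative assumptions rather than figures drawn from the studies cited above.

    # Illustrative only: how a low base rate inflates false positives in screening.
    # All values below are assumptions chosen for the example.
    prevalence = 0.04    # proportion of screened youths who are depressed
    sensitivity = 0.80   # proportion of depressed youths who screen positive
    specificity = 0.80   # proportion of nondepressed youths who screen negative
    n = 1000             # number of youths screened

    depressed = prevalence * n                              # 40
    true_positives = sensitivity * depressed                # 32
    false_positives = (1 - specificity) * (n - depressed)   # 192
    ppv = true_positives / (true_positives + false_positives)

    print(f"true positives: {true_positives:.0f}")
    print(f"false positives: {false_positives:.0f}")
    print(f"positive predictive value: {ppv:.2f}")          # roughly 0.14

Under these assumptions, false positives outnumber true positives by about six to one, which is why an elevated screening score is best treated as a prompt for a more intensive evaluation rather than as a diagnosis.
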
In the next section, we briefly describe several of the better researched and more widely used semi-structured diagnostic interviews, fully structured diagnostic interviews, and rating scales. We also chose to include a few promising assessment tools that are worth noting due to some unique features of the instrument. We have been highly selective, and there are a number of equally good, but less widely used, measures that we have not included. For more information on this broader range of instruments, readers are referred to some excellent reviews (Brooks & Kutcher, 2001; D'Angelo & Augenstein, 2012; Leffler, Riebel, & Hughes, 2015; Myers & Winters, 2002; Simmons, Wilkinson, & Dubicka, 2015). There are also a number of measures of specific components of the depressive syndrome, such as self-esteem, hopelessness, depressive cognitions, and suicidality (for a review, see Winters, Myers, & Proud, 2002) that may be useful for particular cases but are not reviewed here.

Semi-Structured Diagnostic Interviews

In this section, we briefly review the four most widely used semi-structured diagnostic interviews for child and adolescent psychopathology: the K-SADS (Puig-Antich & Chambers, 1978), the Child and Adolescent Psychiatric Assessment (CAPA; Angold, Prendergast, et al., 1995) and its downward extension, the Preschool Age Psychiatric Assessment (PAPA; Egger & Angold, 2004), and the Diagnostic Interview for Children and Adolescents (DICA; Herjanic & Reich, 1982). Information on these instruments is provided in Table 6.1. Each interview assesses the criteria for most of the major child and adolescent psychiatric disorders and provides parallel versions for children and parents, with the exception of the PAPA, which has a parent version only. Although some of the instruments are used to interview 6- and 7-year-old
children, it is questionable whether children younger than 8 or 9 years can provide valid information in a diagnostic interview (Angold & Fisher, 1999). Evaluating the validity of semi-structured diagnostic interviews is complex because they are usually used as the "gold standard" that other measures are compared against. Construct validity is probably the best standard, but given the current state of the literature, it is impossible to distinguish the construct validity of semi-structured interviews from the diagnoses that they are designed to assess. In order to try to disentangle the construct validity of interviews from diagnostic constructs, it is necessary to conduct head-to-head comparisons of several interviews using the same sample and the same criteria for construct validation (e.g., family history and course). Unfortunately, such studies have not been conducted. Although the distinction between MDD and DD has important prognostic implications (Kovacs, 1996), the majority of studies combine them in a higher order depressive disorder category or focus solely on MDD. Hence, for present purposes, we focus on depressive disorders as broadly conceived.

The K-SADS (Puig-Antich & Chambers, 1978) is the most widely used semi-structured interview for children and adolescents (6–18 years), and promising preliminary data suggest that it could possibly be extended to children as young as preschool age (Birmaher et al., 2009). It is the least structured of the semi-structured interviews, and therefore it requires the greatest amount of clinical training and experience.

Table 6.1  Ratings of Instruments Used for Diagnosis and Prognosis

Instrument   Norms   Internal      Inter-Rater   Test–Retest   Content    Construct   Validity         Clinical   Highly
                     Consistency   Reliability   Reliability   Validity   Validity    Generalization   Utility    Recommended

Diagnostic Instruments
K-SADS       NA      NA            G             A             G          E           E                G          ✓
CAPA         NA      NA            G             A             G          G           G                A          ✓
PAPA         NA      NA            G             A             G          G           G                A          ✓
DICA         NA      NA            A             A             G          A           E                A
DISC         NA      NA            A             A             G          A           E                A

Screening and Symptom Severity Instruments
CDRS-R       A       G             E             A             G          G           E                A          ✓
CDI          E       G             NA            A             G          G           E                A          ✓
MFQ          A       E             NA            A             A          G           E                A          ✓
RCDS         G       E             NA            A             A          G           A                A
RADS         E       E             NA            A             G          G           E                A

Note: K-SADS = Schedule for Affective Disorders and Schizophrenia in School-Age Children; CAPA = Child and Adolescent Psychiatric Assessment; PAPA = Preschool Age Psychiatric Assessment; DICA = Diagnostic Interview for Children and Adolescents; DISC = Diagnostic Interview Schedule for Children; CDRS-R = Children's Depression Rating Scale-Revised; CDI = Children's Depression Inventory; MFQ = Mood and Feelings Questionnaire; RCDS = Reynolds Child Depression Scale; RADS = Reynolds Adolescent Depression Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

The K-SADS was modeled after the adult Schedule for Affective Disorders and Schizophrenia (SADS). There are a number of versions of the K-SADS that assess DSM-IV criteria. These versions vary in format, whether they assess lifetime as well as current psychopathology, and whether they also provide dimensional measures of symptom severity. Ratings are based on all sources of information and clinical judgment. Administration time of the parent and child interviews ranges from 35 minutes to 2.5 hours each, depending on the severity and breadth of the child's psychopathology. Inter-rater reliability has been reported to be adequate to excellent for depressive disorders in several studies, and it has been particularly impressive with the more recent versions of the K-SADS (Ambrosini, 2000; Kaufman et al., 1997). Evidence for convergent validity derives from numerous studies reporting correlations between the K-SADS and a variety of clinician, self, and parent rating measures of depression and internalizing behavior problems (Ambrosini, 2000; Kaufman et al., 1997). In addition, youths diagnosed with MDD using the K-SADS differed from controls on psychosocial impairment, familial aggregation of mood disorders, and numerous neurobiological parameters, and K-SADS diagnoses of depression predicted continued risk for recurrence of affective disorders (Ambrosini, 2000). Nevertheless, it has been suggested that MDD K-SADS diagnoses may identify a more severe clinical group than other assessment tools (Hamilton & Gillham, 1999). In addition, when K-SADS items are coded dichotomously and summed, this K-SADS depression scale provides information that is restricted to more severe symptom levels (Olino et al., 2012). Finally, recent data reveal clear psychometric advantages for the K-SADS depression scale, including better coverage of the construct of depression, when assessment algorithms incorporate item response theory-based estimates of symptom severity, discriminability, and subclinical levels compared to a raw symptom count score (Cole et al., 2011).

The CAPA (Angold, Prendergast, et al., 1995) assesses the criteria for most major diagnoses in children aged 9 to 17 years. The time frame for symptom assessment is the preceding 3 months, and administration takes 1 to 2 hours (Angold & Costello, 2000). The interview has several attractive features. First, it is unique in that it includes an extensive glossary defining specific symptoms and distress and frequency ratings. As a result, the CAPA can be used by interviewers with minimal clinical experience, as long as they adhere closely to the definitions and conventions in the glossary. Second, it includes a section for assessing impairment in a number
of areas, including family, peers, school, and leisure activities, and it also includes sections assessing the family environment, life events, and trauma. One test-​retest reliability study of the CAPA by its developers (Angold & Costello, 1995) reported kappas for MDD and DD of .90 and .85, respectively, and an intraclass correlation for the MDD symptom scale of .88. Studies, not conducted by the CAPA developers, also reported good to excellent inter-​rater reliability values using audiotaped interviews for diagnoses of MDD and DD (Jozefiak et  al., 2016; Wamboldt, Wamboldt, Gavin, & McTaggart, 2001) and for parent-​and child-​rated depression symptoms (Hammerton, Thapar, & Thapar, 2014; Mars et  al., 2012). Angold et  al. (2012) compared diagnoses generated from the CAPA to those generated using the Diagnostic Interview Schedule for Children-​ IV (DISC-​IV), and the rates of MDD/​DD obtained using the CAPA (9.5%) and the DISC-​IV (5.3%) did not significantly differ (κ = .56), suggesting that these measures are relatively comparable. Supporting its convergent validity, depression as diagnosed by the CAPA is associated with significant levels of functional impairment, higher concordance among monozygotic than among dizygotic twins, lower income, nonsupportive parenting, maternal history of depression, and similar psychosocial risk factors as those observed in adult-​onset depression (Angold & Costello, 2000; Luby et al., 2014; Shanahan, Copeland, Costello, & Angold, 2011). The PAPA (Egger, Ascher, & Angold, 1999)  was developed as a downward extension of the CAPA that incorporates developmental modifications for children 2-​to 5-​years-​old. Thus, the time frame, assessment time, and other unique features of the CAPA described previously also apply to the PAPA. Given the limited diagnostic assessment measures for children younger than age 8  years, the PAPA has been used in children up to age 8 years (e.g., Luby et al., 2014). Adequate test–​retest reliability values for a depressive disorder diagnosis (κ = .62 to .72) and for depressive symptoms (intraclass correlation coefficient [ICC] = .71 to .88) have been reported using independent interviews (Egger et  al., 2006; Luby et al., 2014). In addition, in a sample of 14 children, similar diagnoses were generally derived from the PAPA and the K-​SADS (Birmaher et al., 2009). Inter-​rater reliability values using audiotaped interviews of the parent interview of the PAPA ranged from .63 to 1.00 for a depressive disorder diagnosis and from .85 to .98 for depressive symptoms scale (Danzig et al., 2013; Gaffrey, Barch, Singer, Shenoy, & Luby, 2013; Luby et  al., 2014; Wichstrøm et  al., 2012; Wichstrøm & Berg-​Nielsen, 2014). Much
of the research establishing the construct and criterion validity of the PAPA depressive disorder diagnosis was performed by Luby and colleagues, although other groups have recently provided similar support. Depression diagnosed with the PAPA using developmentally modified diagnostic criteria for preschoolers is associated with significant functional impairment across multiple domains (Bufferd, Dougherty, Carlson, & Klein, 2011; Danzig et  al., 2013; Luby, Belden, Pautsch, Si, & Spitznagel, 2009), demonstrates homotypic continuity over both 12-​ and 24-​month follow-​up (Luby, Si, Belden, Tandon, & Spitznagel, 2009), and predicts depression in school-​age children (Luby et al., 2014). In addition, characteristics and patterns of risk similar to those reported in depressed older children and adults have been observed in preschool depression diagnosed with the PAPA, including rates of comorbidity, patterns of heterotypic continuity, early predictors, and associations with temperament and neurobiological correlates (e.g., Bufferd et  al., 2014; Dougherty et al., 2011; Gaffrey et al., 2013; Luby, Belden, et  al., 2009; Wichstrøm et  al., 2012). Finally, several studies examining the structure of preschool psychopathology using the PAPA yield a relatively similar structure observed in older youth and adults (Olino, Dougherty, Bufferd, Carlson, & Klein, 2014; Sterba, Egger, & Angold, 2007; Wichstrøm et al., 2014). The DICA (Herjanic & Reich, 1982) was originally designed as a fully structured interview, but recent versions have been semi-​structured in nature. The most recent version of the DICA (Reich, 2000) assesses both DSM-​III-​R and DSM-​IV criteria, and it includes separate interviews for children (6–​12  years), adolescents (13–​17 years), and parents. The interview adopts a lifetime time frame and takes approximately 1 to 2 hours to complete. Data on inter-​rater reliability has varied across studies, ranging from poor to good (Boyle et al., 1993; Brooks & Kutcher, 2001; Reich, 2000). DICA diagnoses are moderately correlated with clinicians’ diagnoses and also clinician and self-​rated measures of depressive symptoms (Brooks & Kutcher, 2001; Reich, 2000), providing evidence of convergent validity. DICA MDD specificity rates are generally high, but its sensitivity rates are low, which suggests that the DICA tends to underdiagnose MDD compared to other measures (Ezpeleta et  al., 1997; Olsson & von Knorring, 1997). A downward extension of the DICA has been developed for parents of preschool-​aged children (Ezpeleta, de la Osa, Granero, Domenech, & Reich, 2011); however, data are too limited to recommend its use for the assessment of depression.

Fully Structured Diagnostic Interviews

In this section, we review the DISC (Costello, Edelbrock, Dulcan, Kalas, & Klaric, 1984), the most widely used fully structured diagnostic interview for child and adolescent psychopathology. There are other fully structured interviews, such as the Children's Interview for Psychiatric Syndromes (ChIPS; Weller, Weller, Fristad, Rooney, & Schecter, 2000), the Dominic-R (Valla, Bergeron, & Smolla, 2000), and the Development and Well-Being Assessment (DAWBA; Goodman, Richards, Ford, Gatward, & Meltzer, 2000), although the DAWBA also allows respondents to enter open-ended responses, which can be reviewed by a clinician to modify final diagnoses. These instruments are not reviewed here given the limited data for depression.

The DISC (Costello et al., 1984) assesses a broad range of psychiatric disorders that, in the latest version (DISC-IV), reflect DSM-IV and International Statistical Classification of Diseases, 10th revision (ICD-10; World Health Organization, 1992), criteria (Shaffer, Fisher, Lucas, Dulcan, & Schwab-Stone, 2000). The DISC includes separate interviews for youth (9–17 years) and parents of 6- to 17-year-olds. The time frame includes the past 12 months and the past 4 weeks, and the DISC takes between 1 and 2 hours to complete. Inter-rater reliability of the DISC was adequate in the initial DISC study (Costello et al., 1984). Test–retest reliability estimates for MDD for the earlier versions range from poor to good (Hodges, 1994; Shaffer et al., 2000). However, results obtained with the DISC-IV suggest that it has better test–retest reliability than its predecessors, especially in clinical samples (Shaffer et al., 2000). Concordance between DISC diagnoses and clinicians' diagnoses (Hodges, 1994; Lewczyk, Garland, Hurlbert, Gearity, & Hough, 2003; Schwab-Stone et al., 1996) and self-rated measures of depressive symptoms (Angold, Costello, Messer, & Pickles, 1995; Hodges, 1994) ranges from poor to good, providing only limited evidence of convergent validity. Moreover, the DISC (original version) evidenced very low concordance with the K-SADS for MDD and poor discriminant validity (Hodges, 1994). Prevalence studies have also suggested that the DISC (original version) has good sensitivity but poor specificity, leading to overdiagnosis (Hodges, 1994). However, a recent study (Angold et al., 2012) comparing the DISC-IV and the CAPA found that the instruments were relatively comparable overall (with the exception of specific phobia) and in their diagnosis of depression. Finally, Lucas, Fisher, and Luby (1998) used a downward extension of the DISC-IV for preschoolers
(DISC-​IV Young Child), which demonstrated encouraging results for the diagnosis of depression, as well as data on the external validity of MDD DISC-​IV diagnoses in this age group (Luby, Mrakotsky, Heffelfinger, Brown, & Spitznagel, 2004; Luby et al., 2006).
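
Because chance-corrected agreement statistics such as kappa appear throughout this literature, a brief sketch of the computation may be helpful; the cell counts below are hypothetical and are not taken from any of the studies cited in this chapter.

    # Cohen's kappa for two raters assigning a dichotomous diagnosis (hypothetical counts).
    # a = both positive, b = only rater 1 positive, c = only rater 2 positive, d = both negative
    a, b, c, d = 18, 6, 4, 72
    n = a + b + c + d

    observed = (a + d) / n                                # proportion of cases with agreement
    expected = ((a + b) / n) * ((a + c) / n) + ((c + d) / n) * ((b + d) / n)  # chance agreement
    kappa = (observed - expected) / (1 - expected)

    print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")   # 0.90, 0.72

In this example, kappa (.72) is noticeably lower than the raw 90% agreement because both raters identify relatively few positive cases, which is also why kappa values for low-prevalence diagnoses can look modest even when raters rarely disagree.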

Rating Scales

In this section and in Table 6.1, we review some of the more widely used clinician, self-report, and multi-informant rating scales for depression. Information on rating scales designed for adults that are often used with older adolescents can be found in Chapter 7 in this volume. The Hamilton Rating Scale for Depression (HAM-D), Beck Depression Inventory (BDI), and Center for Epidemiological Studies-Depression Scale (CES-D) have similar psychometric properties in adolescent and adult samples (Olino et al., 2013; Roberts et al., 1991), have comparable reliability and validity values compared to measures that were specifically designed for juveniles, and have been sensitive to treatment effects (Weisz, McCarty, & Valeri, 2006). These measures appear to be acceptable alternatives for older adolescents.

The most widely used clinician scale for rating depression in children is the Children's Depression Rating Scale (CDRS; Poznanski, Cook, & Carroll, 1979). Based on the HAM-D, the CDRS was developed to assess current severity of depression in children aged 6 to 12 years and is often used for adolescents as well. The revised version (CDRS-R; Poznanski & Mokros, 1999) contains 17 items assessing cognitive, somatic, affective, and psychomotor symptoms and draws both on the respondent's report and the interviewer's behavioral observations. It takes 20 to 30 minutes to administer. It is designed to be administered separately to the child and an informant (typically the parent), with the clinician subsequently integrating the data using clinical judgment. Cut-off scores are provided to aid in interpreting levels of depression severity. Scores on the CDRS have good internal consistency and good inter-rater reliability (Brooks & Kutcher, 2001; Myers & Winters, 2002). Its convergent validity has been supported by moderate to high correlations with the HAM-D, several self-rated depression scales, and K-SADS MDD diagnosis (Brooks & Kutcher, 2001; Mayes, Bernstein, Haley, Kennard, & Emslie, 2010; Myers & Winters, 2002). The CDRS-R achieved moderate to good discriminative validity in classifying depressive disorders compared to other disorders (Yee et al., 2015). However, some data suggest that the CDRS scores may not distinguish between depression and anxiety and that the CDRS may overestimate depression severity in children with general medical conditions due to its emphasis on somatic symptoms (Brooks & Kutcher, 2001; Myers & Winters, 2002).

There are a number of widely used self-rating scales for child and adolescent depression. We briefly review four: Children's Depression Inventory (CDI; Kovacs, 1992), Mood and Feelings Questionnaire (MFQ; Angold, Costello, et al., 1995), Reynolds Child Depression Scale (RCDS; Reynolds, 1989), and Reynolds Adolescent Depression Scale (RADS; Reynolds, 1987). Some of these measures, such as the MFQ, are based on older versions of the DSM. However, their use is still warranted because there have been few changes in symptoms and criteria for depressive disorders from DSM-III to DSM-5.

The CDI (Kovacs, 1992) is the most widely used depression rating scale for children and adolescents. Developed as a modified version of the BDI, it assesses severity of depression during the previous 2 weeks in children aged 7 to 17 years. The original CDI includes 27 items, and the revised version (CDI-2; Kovacs, 2011) includes 28 items. The CDI and CDI-2 have shorter versions with 10 and 12 items, respectively. Items cover a broad range of depressive symptoms and associated features, with a particular emphasis on cognitive symptoms, and the CDI takes 10 to 20 minutes to complete. A number of studies have reported that the CDI (original version) has good internal consistency, and many, but not all, studies have also reported good short-term test–retest reliability estimates (Brooks & Kutcher, 2001; Kovacs, 1992; Silverman & Rabian, 1999). Studies of the factor structure of the CDI (original version) have produced inconsistent findings, with some indication that the factor structure varies by age and non-English versions (Cole, Hoffman, Tram, & Maxwell, 2000; Huang & Dong, 2014; Weiss & Garber, 2003). The CDI (original version) is moderately to highly correlated with the CDRS, a number of other self-rated depression scales, and other measures of related constructs supporting its convergent validity (Brooks & Kutcher, 2001; Myers & Winters, 2002; Silverman & Rabian, 1999). However, the discriminant validity of the CDI (original version) is questionable because it is almost as highly correlated with measures of anxiety as it is with other measures of depression, and studies examining its ability to distinguish depressed from nondepressed patients have yielded conflicting findings (Myers & Winters, 2002; Silverman & Rabian, 1999). The CDI-2 demonstrated good internal consistency, test–retest reliability, construct validity, and discriminant validity in differentiating the MDD from a control group and other psychiatric groups, and it correlated with other self-report measures of depression (Kovacs, 2011).

The MFQ (Angold, Costello, et al., 1995) was developed to assess depression during the past 2 weeks in youths aged 8 to 18  years. It consists of 32 items covering the DSM-​III-​R criteria for depression and additional symptoms, such as loneliness and feeling unloved or ugly. Angold, Costello, et  al. also developed a shorter 13-​item version (SMFQ) by selecting items that yielded optimal discriminating power and internal consistency. The MFQ takes approximately 10 minutes to complete. Scores on the measure have been found to have excellent internal consistency and adequate to good test–​retest reliability (Angold, Costello, et  al., 1995; Daviss et  al., 2006; Wood, Kroll, Moore, & Harrington, 1995). In addition, it has demonstrated good convergent validity with respect to the CDI, DISC, CAPA, and K-​SADS (Angold, Costello, et al., 1995; Thapar & McGuffin, 1998; Wood et al., 1995). The MFQ was also relatively successful in discriminating youths with diagnoses of depression from those with non-​mood disorders (Daviss et al., 2006; Kent, Vostanis, & Feehan, 1997; Thapar & McGuffin, 1998). The RCDS (Reynolds, 1989)  and RADS (Reynolds, 1987)  are 30-​item scales designed to assess depressive symptomatology (as represented in DSM-​III) during the previous 2 weeks in youths aged 8 to 12 years and 13 to 18  years, respectively. Each scale takes approximately 10 minutes to complete. The Reynolds scales have been used primarily with school, rather than clinical, samples. Scores on both scales have excellent internal consistency and adequate test–​retest reliability (Brooks & Kutcher, 2001; Myers & Winters, 2002). In addition, both are correlated with interview diagnoses and other depression rating scales, such as the CDRS, HAM-​D, CDI, BDI, and CES-​D (Brooks & Kutcher, 2001; Myers & Winters, 2002). Discriminant validity has not been well-​studied, although like most depression rating scales, the RADS is moderately correlated with measures of anxiety (Myers & Winters, 2002). Revised versions of the RCDS (RCDS-​ 2; Reynolds, 2010)  and RADS (RADS-​ 2; Reynolds, 2004)  have been developed, and initial reports support similar psychometric properties as their predecessors (Osman, Gutierrez, Bagge, Fang, & Emmerich, 2010; Reynolds, 2004; Reynolds, 2010). Both the RCDS-​2 and the RADS-​2 also expanded the age range to include youth aged 7 to 13 years and 11 to 20 years, respectively. Two new instruments are worth mentioning. First, the NIMH recently initiated a Patient Reported Outcomes Measurement Information System (PROMIS) network that used psychometric methods to develop instruments to address multiple domains of psychological and physical health. The PROMIS network developed
a unidimensional 28-​ item depression measure for use with adults, which has also been used in adolescents (PROMIS-​Depression; Pilkonis et  al., 2011), and a 14-​ item depression pediatric measure for youth aged 8 to 17 years (Irwin et al., 2010). Both the adult and the pediatric versions have a short 8-​item depression measure. Preliminary evidence supports their construct validity and reliability and their ability to assess the full range (none/​ mild to severe) of depressive symptoms (Irwin et al., 2010; Olino et al., 2013; Pilkonis et al., 2011). Second, the Beck Depression Inventory for Youth (BDI-​Y; Beck, Beck, & Jolly, 2001)  assesses DSM-​IV symptoms of depression, with a focus on cognitive features of depression, in youth aged 7 to 14  years. The BDI-​Y includes 20 items, and preliminary evidence demonstrates good internal consistency, high correlations with the CDI, and successful differentiation between youth with depression and controls (Beck et al., 2001; Stapleton, Sander, & Stark, 2007). As noted previously, it is important to obtain information about child and adolescent depression from informants other than the youths themselves. Several of the self-​rating scales, such as the CDI, have been reworded for use by parents and, in some cases, by teachers and peers. Some psychometric data have been reported on these adaptations. Kovacs (2003) reported data on the norms and factor structure of the parent and teacher versions of the CDI, as well as good internal consistency for these measures. In addition, Cole et al. (2000) compared child-​ and parent-​report versions of the CDI. They reported that the two versions had similar internal consistencies and test–​retest reliabilities and that the factor structure of the CDI was relatively similar, although not identical, across informants. There are also a number of multi-​informant rating scales that were designed to assess a broad range of child and adolescent psychopathology using instruments that are comparable across informants (Hart & Lahey, 1999). The most widely used is the parent-​report Child Behavior Checklist for ages 6 to 18 years (CBCL/​6-​18; Achenbach & Rescorla, 2001a) and the CBCL for ages 1½ to 5 years (CBCL/​1½-​5; Achenbach & Rescorla, 2001b) and their accompanying teacher report (Teacher Report Form [TRF]) and youth report for ages 11 to 18  years (Youth Self-​Report [YSR]) versions. The CBCL and YSR assess the child’s behavior during the past 6 months, whereas the TRF uses a 2-​month time frame. All three measures take approximately 10 to 15 minutes to complete. The CBCL includes 118 items assessing two broadband and eight narrowband scales identified using factor analysis, as well as a social competence scale. Extensive
norms for the CBCL, TRF, and YSR are available for both clinical and community samples, and favorable psychometric properties of the instruments have been documented in hundreds of studies. Unfortunately, the CBCL’s utility in assessing depression, at least as conceptualized in the DSM, is limited. The scale that is most relevant to depression is the narrowband Anxious/​Depressed scale, which combines symptoms of anxiety and depression. In addition, some other depressive symptoms are included on other narrowband scales. Indeed, a latent class analysis of the Anxiety/​Depression scale was unable to distinguish distinct classes for depression and anxiety (Wadsworth, Hudziak, Heath, & Achenbach, 2001). A set of diagnostic scales that are more closely geared to DSM diagnoses has been added to the CBCL. Recent findings suggest the Affective Problems DSM-​oriented scale, intended to correspond to DSM depressive disorders, demonstrated good internal consistency and convergent validity (Nakamura, Ebesutani, Bernstein, & Chorpita, 2009), with significant associations with measures of depression, anxiety, and oppositionality; however, the scale was more strongly associated with measures of depression than oppositionality (Nakamura et  al., 2009). In addition, although the Affective Problems scale corresponded with a depressive disorder diagnosis derived from a parent-​based structured interview and differentiated depressed from nondepressed youth, the Affective Problems scale did not add incremental clinical validity above the empirically derived CBCL syndrome scale (i.e., Withdrawn/​ Depressed) in these analyses (Ebesutani et al., 2010). The Child Symptom Inventory (CSI-​ 4; Gadow & Sprafkin, 2002) and the corresponding Early Childhood Inventory-​4 (ECI-​4; Gadow & Sprafkin, 2000) are rating scales that assess symptoms of the most relevant DSM-​IV-​ TR psychiatric disorders in children aged 5 to 12 years and 3 to 5  years, respectively. There are parent and teacher versions and both categorical and dimensional scoring procedures. Scores on both the CSI-​4 and the ECI-​4 have been found to have acceptable internal consistency, test–​ retest reliability, and convergent validity; however, the discriminant validity, especially for the internalizing disorders, has not been well documented (Gadow & Sprafkin, 2000, 2002). A group of investigators sponsored by the McArthur Foundation have developed a broad-​ band battery of assessment instruments for children in the early school-​ age period (ages 4–​8  years). It includes a parent and teacher rating scale, the McArthur Health and Behavior Questionnaire (HBQ; Essex et  al., 2002), and a child-​ reported semi-​structured interview, the Berkeley Puppet
Interview (BPI; Ablow et al., 1999) that uses puppets in order to provide a more developmentally sensitive assessment. Both measures include scales tapping various domains of symptomatology (including a subscale for depressive symptoms), physical health, and peer and school functioning. In the initial reports from this group, the depression scale score from the HBQ parent and teacher forms had adequate internal consistency and good test–retest reliability, and it discriminated clinic from community subjects (Ablow et al., 1999). Although a categorical measure of depression from the HBQ parent form was not correlated with diagnoses of MDD derived from the parent version of the DISC, it was associated with a number of teacher-rated indices of impairment (Luby et al., 2002). Similarly, a categorical measure of internalizing symptoms from the HBQ parent form demonstrated low to moderate agreement with a DISC internalizing diagnosis (κ = .30), but the parent HBQ internalizing composite score was significantly associated with parent ratings of child impairment and global physical health and child-reported BPI scores (Lemery-Chalfant et al., 2007). The child-rated BPI depression scale scores demonstrated adequate internal consistency (α = .75) in a clinic sample but poor internal consistency (α = .36) in a community sample, adequate test–retest reliability in both samples (r = .42 to .43), and discriminated clinic from community youth (Ablow et al., 1999). Other research groups demonstrated a similarly low internal consistency estimate for the depression scale (α = .44) in a Dutch community sample (Ringoot et al., 2013), adequate internal consistency for the composite internalizing scale (α = .72 to .86) (Ringoot et al., 2013; Stone et al., 2014), and adequate 1-year test–retest reliability for the depression scale (r = .29) (Stone et al., 2014). Inter-rater reliability of the depression scale based on independent coders' review of the videotaped BPI interviews has been found to be good (ICC = .74 to .86) (Stone et al., 2014). In addition, Luby, Belden, Sullivan, and Spitznagel (2007) identified a three-item child-rated BPI "core" depression symptoms scale and found that the MDD group based on the DISC had more BPI core depression symptoms compared to the no disorder group, and the scale was related to DISC depression severity scores and parent-reported CBCL internalizing symptoms.

Assessment of DMDD

There are currently no published clinical interviews updated for DSM-5. Although the current clinical interviews do not assess DMDD, researchers have applied post
hoc algorithms to several interviews, including the CAPA, PAPA, and K-​SADS, using items from the depression and ODD sections that correspond with DMDD criteria. These post hoc DMDD diagnoses have provided some of the first data on DMDD in community and clinical samples. Several scales assessing the severity or frequency of youth irritability have been developed, including the Affective Reactivity Index (ARI; Stringaris et  al., 2012)  and empirically derived scales from the CBCL (e.g., Roberson-​Nay et al., 2015), with promising psychometric properties.
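
To illustrate what such a post hoc algorithm involves, the sketch below flags cases from item-level interview data. The variable names are hypothetical and do not correspond to actual CAPA, PAPA, or K-SADS item labels, and the thresholds approximate DSM-5 DMDD features (severe temper outbursts occurring, on average, three or more times per week; persistent irritability across settings; a 12-month duration; and the age restrictions noted earlier).

    # Hypothetical post hoc DMDD proxy built from interview item-level data.
    # Field names are invented for illustration only.
    def meets_dmdd_proxy(record: dict) -> bool:
        return (
            record["severe_outbursts_per_week"] >= 3     # frequent severe temper outbursts
            and record["irritable_mood_most_days"]       # persistently angry/irritable mood
            and record["settings_with_symptoms"] >= 2    # present across contexts
            and record["duration_months"] >= 12          # at least 12 months
            and 6 <= record["age"] <= 18                 # not diagnosed before age 6
            and record["age_of_onset"] < 10              # observed by age 10
        )

    example = {"severe_outbursts_per_week": 4, "irritable_mood_most_days": True,
               "settings_with_symptoms": 2, "duration_months": 14,
               "age": 9, "age_of_onset": 7}
    print(meets_dmdd_proxy(example))   # True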

Overall Evaluation

For the purpose of diagnosis, we recommend the use of semi-structured interviews because they provide greater flexibility and allow for the clarification of questions and responses and also the clinical judgment of the interviewer. In particular, we recommend the K-SADS because of its fairly strong psychometric properties. In addition, the CAPA and the PAPA are both good options for the assessment of depression, particularly because they collect additional information (severity, frequency, and duration) and assess functional impairment and life stress (discussed later). Fully structured interviews, such as the DISC-IV, can also play an important role in large-scale epidemiological studies and in screening.

As with fully structured interviews, we advise that rating scales not be used alone when formulating diagnoses because many of the scales appear to measure general distress rather than depression specifically. In addition, they do not provide the necessary information to make a diagnosis (e.g., onset, duration, and frequency of symptoms), and when they are used to approximate diagnoses, they tend to overidentify youths as depressed. However, rating scales provide useful information on the level of symptom severity and can be used for screening. Our recommendations for rating scales differ depending on sample characteristics. The CDI, MFQ, RCDS, and RADS have been widely used in community and school samples. They exhibit generally good psychometric properties and function well as screening tools in such populations. In clinical samples, we recommend the use of the clinician-rated CDRS-R along with the self-report CDI and MFQ. Although these instruments lack good discriminant validity, they have functioned well in numerous studies of depressed youth, and when used in conjunction, they tend to yield prevalence rates that are consistent with studies using diagnostic interviews (Myers & Winters, 2002).

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

In this section, we briefly discuss the assessment of constructs that are useful for prognosis, case conceptualization, and treatment planning. We emphasize constructs that have shown some value in predicting treatment response, either in general or differentially for some treatments and not others (i.e., moderators; for reviews of predictors and moderators of treatment response, see Emslie, Kennard, & Mayes, 2011; Nilsen, Eisemann, & Kvernmo, 2013; Weersing, Schwartz, & Bolano, 2015), are important in determining the appropriate treatment setting, or provide additional critical treatment targets. These constructs include severity of depressive symptoms, comorbid psychopathology, selected depressive symptoms and clinical features, psychosocial functioning, stress and trauma, and family history of psychopathology.

Greater initial severity of depressive symptoms predicts a poorer course and poorer treatment response (Emslie et al., 2011; Nilsen et al., 2013). There is also some, albeit inconsistent, evidence that the benefits of targeted psychosocial or pharmacological treatments are evident only at higher levels of severity (Weersing et al., 2015). Thus, depression severity can inform choices of the appropriate type, intensity, and setting (e.g., outpatient or inpatient) of treatment. As discussed in the previous section and summarized in Table 6.1, there are a number of well-studied rating scales that can be used to assess the severity of depression for the purposes of case conceptualization and treatment planning.

The presence of comorbid psychopathology also has substantial prognostic value and may moderate treatment response (Emslie et al., 2011; Nilsen et al., 2013; Weersing et al., 2015); in addition, it may indicate the need to incorporate additional intervention approaches and/or a longer duration of treatment. Concurrent anxiety disorders have been consistently shown to predict poorer treatment response, although several studies have also reported that anxiety is associated with a relatively better response to evidence-based psychotherapies than other treatment approaches (Weersing et al., 2015). Coexisting substance use disorders and subthreshold manic symptoms also appear to predict a poorer response to treatment (Maalouf & Brent, 2012; Weersing et al., 2015). The semi- and fully structured diagnostic interviews reviewed in the previous section and summarized in Table 6.1 assess most relevant comorbidities. Most diagnostic interviews also assess important clinical characteristics, such as duration of the current
depressive episode and history of prior episodes, which predict a poorer treatment response and greater likelihood of recurrence (Emslie et al., 2011). Thus, these data can play an important role in planning the intensity and duration of treatment, as well as determining the need for continuation and maintenance treatment. Finally, diagnostic interviews and rating scales can provide information about key symptoms that are relevant to decisions regarding treatment intensity and setting (e.g., suicidal ideation and behavior), as well as other features that are associated with a poorer response to treatment and might suggest incorporating treatment components designed to target specific symptoms (e.g., insomnia and anhedonia/social withdrawal; Maalouf & Brent, 2012). The assessment of hopelessness and suicidality is of particular importance, both to prevent self-harm and because they predict a poorer response to treatment (Emslie et al., 2011). Almost all the diagnostic interviews and depression rating scales discussed previously assess hopelessness and suicidality. In addition, the Hopelessness Scale (Beck, Weissman, Lester, & Trexler, 1974) is a well-validated self-report measure that can be used with adolescents. Moreover, there are several widely used measures of suicidal ideation and behavior in adolescents (see Chapter 10, this volume), including self-report scales such as the Suicidal Ideation Questionnaire (Reynolds & Mazza, 1999) and the Columbia Suicide Screen (Shaffer et al., 2004) and also clinician rating scales such as the Columbia–Suicide Severity Rating Scale (Posner et al., 2011).

Psychosocial Functioning

Depression in children and adolescents is associated with significant impairment in family and peer relationships and academic performance. The families of depressed youths are often characterized by a lack of cohesion and high levels of disengagement and conflict. The parents of depressed children and adolescents exhibit less warmth and support and greater control, criticism, and rejection compared to parents of controls. Depressed youths have significant social skills deficits, difficulties with peers, and may be involved in problematic romantic relationships. In addition, they often exhibit academic underachievement, school attendance problems, and school failure (Garber & Horowitz, 2002; Hammen, Rudolph, Weisz, Rao, & Burge, 1999; Lewinsohn & Essau, 2002). Moreover, there is evidence that family and peer problems predict a poorer response to treatment but may be associated with a preferential response to interpersonal therapy (Emslie et al.,
2011; Weersing et al., 2015). Hence, identifying areas of impaired functioning may be useful in understanding the factors contributing to the youth’s depression and selecting areas to be monitored or targeted in treatment. There are a variety of approaches and instruments for assessing impairments and competencies in psychosocial functioning (for reviews, see Canino, 2016; John, 2001; Winters, Collett, & Myers, 2005). Parents’ reports of children’s impairment appear to have greater validity compared to children’s reports (Kramer et  al., 2004). Some of the instruments discussed previously include subscales assessing functional impairments and competencies. For example, the CAPA and PAPA include comprehensive assessments of the major areas of child psychosocial functioning (Angold & Costello, 2000), the CBCL (Achenbach & Rescorla, 2001a) has a 16-​ item social competence scale, and the HBQ (Essex et al., 2002) and Berkeley Puppet Interview (Ablow et  al., 1999)  include scales tapping social and school functioning. In this section, we discuss measures specifically designed to assess psychosocial functioning in children and adolescents (Table 6.2). However, due to limitations or lack of data, particularly on depressed samples, we have not recommended any of the measures above the others. One group of measures consists of global or unidimensional scales. Global measures provide information on the severity of, and extent of impairment from, the disorder, which may influence the choice of treatment setting (e.g., inpatient vs. outpatient), intensity and duration of treatment, and treatment modality. The Child Global Assessment Scale (C-​GAS; Shaffer et  al., 1983)  and the Columbia Impairment Scale (CIS; Bird et al., 1993) are two widely used global measures of functional impairment. The C-​GAS, adapted from the Global Assessment Scale for adults, is a single 100-​point scale designed for clinicians to rate the severity of symptomatology and functional impairment. A cut-​off of 70 is often used to indicate clinically significant problems. The rating is based on information collected through other means (i.e., a diagnostic interview with parent and/​or child) because the C-​GAS does not provide questions. The CIS is a questionnaire that can be completed by a lay interviewer or parent (for children older than age 4 years) or by children aged 7 to 17  years. It includes 13 items tapping a variety of domains of social functioning and symptomatology that are aggregated into a single score. It was recently recommended for use as a common measure across studies funded by NIMH (Barch et al., 2016). Both the C-​GAS and the CIS are economical, have demonstrated good convergent validity, and differentiate


relevant populations (e.g., clinical vs. community samples). However, they each yield only one score; hence, they do not provide information on the nature of impairment in specific areas of functioning. In addition, both measures combine symptoms and functioning so that a child's score could reflect problems in either or both domains. A number of multidomain instruments assessing youths' functioning across several areas, such as school, family, and peer functioning, are also available (see Table 6.2). The Child and Adolescent Functional Assessment Scale (CAFAS; Hodges, 1999) is an interview that takes approximately 30 to 45 minutes to complete, although administration time can be shorter if it is administered in conjunction with a diagnostic interview. Three of its eight subscales assess functional impairment (role performance at school/work, home, and community); the others assess emotional and behavioral problems (for reviews, see Bates, 2001; Canino, 2016; Winters et al., 2005). However, the impairment scales include some symptom items. The CAFAS was designed for youth aged 5 to 19 years, but a preschool and early childhood version is also available (the Preschool and Early Childhood Functional Assessment Scale; see Murphy et al., 1999). The CAFAS has demonstrated good inter-rater reliability and adequate test–retest reliability; it correlates with other measures of impairment, distinguishes child inpatients from outpatients, and predicts later functioning. The Social Adjustment Inventory for Children and Adolescents (SAICA; John, Gammon, Prusoff, & Warner, 1987) assesses school functioning, peer relations, home life, and spare time activities in youths aged 6 to 18 years.

It takes 30 minutes or longer to complete and is administered separately to the parent and the child. The SAICA has demonstrated acceptable levels of inter-rater reliability, good test–retest reliability, and good convergent validity, and it discriminates relevant clinical and nonclinical groups (Winters et al., 2005). One study has reported that the SAICA performed better than a global functioning scale (i.e., C-GAS) in predicting the course of depression in adolescents (Sanford et al., 1995). The Behavioral and Emotional Rating Scale (BERS; Epstein, 1999; Epstein, Mooney, Ryser, & Pierce, 2004) is a 52-item scale that focuses on children's strengths rather than impairments. It assesses five domains: interpersonal strengths, involvement with family, intrapersonal strengths, school functioning, and affective strengths. It takes approximately 20 minutes to administer, and it has both parent and youth self-rating versions. The parent version is appropriate for children aged 0 to 18 years. The BERS has been normed on a national sample, has shown good test–retest reliability and convergent validity, and distinguishes groups of children with and without psychopathology (Canino, 2016). The Brief Impairment Scale (BIS; Bird et al., 2005) is a highly economical parent interview that takes only 3 to 5 minutes and assesses functioning in the areas of school/work, interpersonal relations, and self-fulfillment in youth aged 4 to 17 years. It has shown good internal consistency and test–retest reliability, is correlated with the C-GAS, and distinguishes clinical and community samples. The Psychosocial Schedule for School Age Children-Revised (PSS-R; Puig-Antich, Lukens, & Brent, 1986) assesses

Table 6.2  Ratings of Instruments Used to Assess Psychosocial Functioning for Case Conceptualization and Treatment Planning (a)

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended |
|---|---|---|---|---|---|---|---|---|---|
| Global Scales of Functioning | | | | | | | | | |
| C-GAS | E | NA | A | G | A | G | E | A | |
| CIS | E | A | NA | A | A | G | E | A | |
| Multidimensional Scales of Functioning | | | | | | | | | |
| CAFAS | G | A | G | A | A | G | E | A | |
| SAICA | A | A | A | A | A | G | A | A | |
| BERS | E | E | G | G | A | G | E | A | |
| BIS | E | A | NA | A | A | G | E | A | |

(a) A number of measures presented in Table 6.2 are also relevant for the purposes of case conceptualization and treatment planning. See text for a discussion of these measures.

Note: C-GAS = Child Global Assessment Scale; CIS = Columbia Impairment Scale; CAFAS = Child and Adolescent Functional Assessment Scale; SAICA = Social Adjustment Inventory for Children and Adolescents; BERS = Behavioral and Emotional Rating Scale; BIS = Brief Impairment Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.


school functioning, relationships with parents, siblings, and peers, and the parents’ marital relationship in children aged 6 to 16 years. Like the SAICA, it is administered separately to the parent and child. Although it has good psychometric properties (Lukens et al., 1983), it has not been used much in recent years. Finally, a new instrument, the Child World Health Organization Disability Assessment Scale (C-​ WHO-​ DAS), is worth mentioning. The C-​ WHO-​ DAS was adapted for children from the adult WHO-​DAS by the DSM-​ 5 Impairment/​ Disability Work Group (Canino, Fisher, Alegria, & Bird, 2013). It has clinician, parent report (for children aged 0–​17 years), and youth self-​report (for adolescents aged 12  years or older) versions, and it assesses six domains: understanding and communicating, getting around (mobility), self-​care, getting along with people, life activities (school and nonschool), and participating in society. To date, only one study, conducted in Rwanda, has examined the psychometric properties of the C-​WHO-​DAS. In this study, internal consistency was good and test–​retest reliability was adequate, confirmatory factor analysis supported the scale’s structure, and the instrument distinguished children who did and did not meet criteria for a psychiatric disorder (Canino, 2016). In addition to the more comprehensive instruments noted previously, there are many measures designed to assess specific areas of functioning. For example, there are a number of widely used inventories assessing key dimensions of family functioning (e.g., Epstein, Baldwin, & Bishop, 1983) and parenting behavior (e.g., Schaefer, 1965), laboratory tasks that have been used to examine the interaction patterns of families of depressed children (Garber & Kaminski, 2000), and interview and laboratory measures of expressed emotion (Sher-​Censor, 2015). Finally, there are a variety of peer nomination measures and teacher ratings of peer functioning that can be used to assess children’s social status and functioning in school (e.g., Huesmann, Eron, Guerra, & Crawshaw, 1994; Ladd, Herald-​Brown, & Andrews, 2009).


Stressful Life Events

Prospective studies in children have shown that stress—particularly events related to loss, rejection, disappointment, and conflict—and traumas, such as childhood maltreatment, predict the onset and persistence of depressive symptoms, as well as poorer response to treatment (Gibb, 2014; Emslie et al., 2011; Rudolph & Flynn, 2014). Life stressors and traumas prior to the depressive episodes may be precipitating factors, and subsequent or continuing stressors may contribute to the maintenance of the disorder and should be addressed in the treatment plan. It is important to assess whether stressors are episodic or chronic and whether the stressor is independent of or dependent on the child's behavior. For example, if a stressor is chronic, such as marital conflict, the clinician may recommend marital counseling for the parents, and the child's treatment may incorporate ways to cope with the ongoing stress. If the youth is "generating" life stress through his or her actions, it would be important to focus on changing these problematic behaviors (Hammen, 2006). Life stress can be assessed through self-administered questionnaires (Vanaelst, De Vriendt, Huybrechts, Rinaldi, & De Henauw, 2012), which have the advantage of economy. However, semi-structured interviews have a number of significant strengths, including the ability to assess the temporal relationship between the stressor and the depressive episode; distinguish potentially important features of events such as long-term threat and whether the event is independent of, versus dependent on, the child's behavior; and minimize idiosyncratic interpretations of items (Harkness & Monroe, 2016). Two of the most widely used semi-structured interviews for life stress in children and adolescents are the UCLA Life Stress Interview (LSI; Hammen et al., 1999) and the Stressful Life Events Schedule (SLES; Williamson et al., 2003). Both interviews assess episodic life events across a variety of domains (e.g., family, peers, romantic relationships, school, health, and family finances). The LSI also provides an extensive assessment of chronic stressors in each of these areas. There is considerable overlap between the LSI's conceptualization of chronic stress and social functioning, so this part of the interview can also be viewed as a measure of functional impairment (Harkness & Monroe, 2016). Most life events inventories and interviews also assess traumatic stressors. Similarly, some of the more comprehensive measures of functional impairment discussed previously assess some traumas, such as child maltreatment (e.g., the PSS-R). Many of the diagnostic interviews discussed previously also assess traumatic events in the context of evaluating post-traumatic stress disorder (PTSD) (e.g., the K-SADS). Among diagnostic interviews, the CAPA and PAPA are particularly noteworthy in providing a broad assessment of life events and traumas (Costello, Angold, March, & Fairbank, 1998). Finally, there are a number of instruments that focus specifically on traumatic stressors (e.g., the Childhood Trauma Questionnaire and the UCLA PTSD index; for a review,
see Strand, Sarmiento, & Pasquale, 2005) and on child maltreatment (e.g., the Child Abuse Potential Inventory and the Conflict Tactics Scale—Parent–Child Version; for a discussion of issues in assessing child maltreatment and a review of screening instruments, see Slep, Heyman, & Foran, 2015).

Family History of Psychopathology

A number of studies have reported elevated rates of depression, and often other forms of psychopathology, in the relatives of depressed children and adolescents (e.g., Klein, Lewinsohn, Seeley, & Rhode, 2001). Clinicians should be particularly cautious when youth have a family history of bipolar disorder or psychosis, and they should be alert to emerging signs of mania and/or psychosis. In addition, parental depression has been related to prolonged depressive episodes in their children and poorer treatment response (Brent et al., 1998). Moreover, there is growing evidence that treating maternal depression can reduce children's symptoms (Cuijpers, Weitz, Karyotaki, Garber, & Andersson, 2015). Therefore, if a parent is suffering from a psychiatric disorder and is not in treatment, a referral for mental health services can benefit both parent and child. Family history data can be elicited using diagnostic interviews conducted directly with family members (the family interview method) or by interviewing key informants about the other relatives (the family history method). Although direct interviews are more accurate, in most instances clinicians and researchers must rely on informants to obtain family histories. The most widely used interviews for eliciting family history information from informants are the Family History Research Diagnostic Criteria (Andreasen, Endicott, Spitzer, & Winokur, 1977), the Family Informant Schedule and Criteria (Chapman, Mannuzza, Klein, & Fyer, 1994), the Family Interview for Genetic Studies (Nurnberger et al., 1994), and the Family History Screen (Milne et al., 2009; Weissman et al., 2000). Family history data collected from informants tend to have high specificity but only moderate sensitivity. Hence, it is best to obtain information from at least two informants when possible to increase the probability of detecting psychopathology in relatives.

Overall Evaluation

Assessment for the purpose of case conceptualization and treatment planning requires a combination of measures

assessing factors related to the disorder (i.e., severity and duration of the depressive episode, comorbid psychopathology, and family history of psychopathology) and a variety of social domains (e.g., relationships with family, peers, and romantic partners; schoolwork; and life stressors and trauma). In addition, it is necessary to obtain information from multiple informants (the child and a parent, as well as additional informants such as teachers when possible). It is also important to consider the context of the youth’s behavior and the cultural milieu because what is considered maladaptive in one context or culture may be adaptive in another. First, we recommend the use of one of the rating scales and diagnostic interviews discussed previously to obtain information about the severity and clinical features of the depressive episode and comorbid conditions. Second, we recommend using a multidimensional measure of functional impairment. The SAICA has been mostly used by clinicians to assess functioning rather than by mental health planners to determine service needs. The CAFAS and BERS are reasonable options for a multidimensional scale to guide selection of the level and types of services. The BIS is a good option when time is limited. In addition, the CAPA and PAPA assess impairment in key domains. However, none of the measures of functional impairment were highly recommended in Table 6.2 because their psychometric properties have not been examined as thoroughly as our rating criteria require, and only the SAICA and PSS-​R have been used specifically in studies of child and adolescent depression. Third, it is important to assess stressful life events and traumas. If time permits, the best option for assessing life events is with an interview such as the LSI or the SLES that provides qualitative information regarding the stressor (e.g., the severity and independence/​dependence of stressor, whether it is episodic or chronic, and the timing of the stressor in relation to symptoms). Alternatively, the CAPA and PAPA provide assessments of life events and traumas. Finally, there are a number of good interviews to assess family history of psychopathology. Although it may be challenging, we recommend using multiple informants to increase sensitivity. In summary, assessment for case conceptualization and treatment planning should provide the clinician with information to assess the prognosis of the disorder, areas of impairment and strength, and factors that appear to contribute to the onset and/​or maintenance of the disorder.
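The recommendation to rely on multiple family history informants can be made concrete with a small illustration. The sketch below is not drawn from the studies cited above: the sensitivity and specificity values are hypothetical, it assumes a simple "positive if any informant reports the disorder" decision rule, and it treats informants' errors as independent, which will rarely hold exactly for relatives describing the same person.

```python
# Illustrative calculation only: how an "any informant positive" rule changes
# detection of psychopathology in relatives, under assumed (hypothetical)
# per-informant sensitivity and specificity and independent errors.

def combined_sensitivity(sensitivity: float, n_informants: int) -> float:
    """P(at least one of n informants reports a truly affected relative)."""
    return 1 - (1 - sensitivity) ** n_informants

def combined_specificity(specificity: float, n_informants: int) -> float:
    """P(no informant falsely reports a disorder in an unaffected relative)."""
    return specificity ** n_informants

if __name__ == "__main__":
    sens, spec = 0.55, 0.95  # assumed values: moderate sensitivity, high specificity
    for n in (1, 2, 3):
        print(n, round(combined_sensitivity(sens, n), 2),
              round(combined_specificity(spec, n), 2))
    # With these assumed values, sensitivity rises from 0.55 with one informant
    # to roughly 0.80 with two and 0.91 with three, while specificity declines
    # modestly from 0.95 to about 0.90 and 0.86.
```

Under these assumed numbers, a second informant buys a substantial gain in detection at a modest cost in false positives, which is the trade-off implied by the high-specificity, moderate-sensitivity pattern of informant reports noted above.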


ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

To evaluate the assessment tools used in the treatment literature, we examined the various measures' sensitivity to treatment effects (Table 6.3). This included significant change on depression measures, such as MDD diagnoses and rating scales, as well as change on measures of functional impairment using both global and multidimensional scales. Furthermore, we included only published pharmacological and psychotherapy treatment research that reported at least one significant treatment effect on at least one outcome measure of depression. Otherwise, it is not possible to distinguish the treatment's lack of efficacy from the measure's insensitivity in detecting change. All studies examined compared depression treatment to a wait-list control or another active treatment. In larger clinical trials involving psychopharmacology, psychotherapy, or combined treatments for older children and adolescents with depression, the predominant index of treatment response in adolescence is the CDRS-R (Brent et al., 2008; Brooks & Kutcher, 2001; Goodyer et al., 2007; March et al., 2004; Myers & Winters, 2002). In the initial intervention studies of young children with depression, dimensional depression scores derived from the PAPA have indexed treatment outcome. Initial evidence (Lenze et al., 2011; Luby, Lenze, & Tillman, 2012) suggests that these scores demonstrate treatment sensitivity. Relative to the first edition of this book, there has been less reliance on reporting changes in MDD diagnoses (i.e., no longer meeting criteria). Previously, many studies used the K-SADS, which was sensitive to change in all studies (Clarke et al., 1995, 2001; Clarke, Rohde, Lewinsohn, Hops, & Seeley, 1999; Diamond, Reis, Diamond, Siqueland, & Isaacs, 2002; Lewinsohn, Clarke, Hops, & Andrews, 1990; McCauley et al., 2016; Stark, 1990; Vostanis, Feehan, Grattan, & Bickerton, 1996; Wood, Harrington, & Moore, 1996). Fewer studies rely on other diagnostic instruments (e.g., the DISC) for indexing treatment response (Weisz et al., 2009). A number of self-administered depression rating scales have also been shown to be sensitive to treatment effects. The CDI has been widely used and has been shown to be sensitive to change in several treatment studies (Brooks & Kutcher, 2001; Myers & Winters, 2002; Rosselló, Bernal, & Rivera-Medina, 2012). One study reported that it was more sensitive than the RCDS in detecting the effects of group CBT in school-aged children (Stark, Reynolds, & Kaslow, 1987); however, another study found that it


was less sensitive to the effects of medication compared to the CDRS (Emslie et al., 1997). Both the RCDS and the RADS have detected treatment effects in controlled clinical trials (March et al., 2004; Rawson & Tabb, 1993). Although the MFQ has been used less frequently than the other measures, it has demonstrated sensitivity to change in some (e.g., Goodyer et  al., 2007; McCauley et al., 2016), but not all, clinical trials (Brooks & Kutcher, 2001). Last, many of the youth self-​report rating scales reviewed here (i.e., CDI, RADS, and MFQ) have been shown to be more sensitive to treatment effects compared to the parent versions (Kahn, Kehle, Jenson, & Clark, 1990; Wood et al., 1996). Some treatment studies included parent versions of the CBCL-​Anxious/​Depressed scale, CBCL Internalizing scale, and an adapted depression scale from the CBCL, with few of these studies being published recently. Overall, few of these studies reported post-​ treatment effects using these measures (Weisz et al., 2009), despite other youth-​reported depression measures demonstrating treatment effects (Clarke et al., 1999, 2001; De Cuyper, Timbremont, Braet, De Backer, & Wullaert, 2004; Rosselló & Bernal, 1999; Stark et al., 1987). Interestingly, however, some of these studies later found post-​treatment follow-​ up effects using parent-​ report measures (Clarke et  al., 1999; De Cuyper et  al., 2004). This suggests that parents and youths may be focusing on different indicators of improvement (i.e., parents may rely more heavily on behavioral, rather than mood, changes). Nevertheless, it appears that parents are less sensitive than youths to the more immediate changes in the youths’ depressive symptomatology. Last, it is important to emphasize that treatment monitoring and outcome should include psychosocial functioning in addition to symptom reduction/​ remission. There has been an increased emphasis on examining improvement in functioning. The Clinical Global Impression–​Improvement (CGI-​I) and Clinical Global Impression–​ Severity (CGI-​ S) scores have been used in larger adolescent depression trials (Atkinson et  al., 2014; Brent et  al., 2008; Goodyer et  al., 2007; March et  al., 2004), and each was sensitive to changes in levels of function across treatment. However, much of the work in demonstrating validity of these assessments has come from studies of adults and other disorder populations. Thus, additional work is needed to demonstrate construct validity and reliability in child and adolescent depression. The C-​GAS has been widely used to assess functional impairment in treatment studies, and it has


Table 6.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended |
|---|---|---|---|---|---|---|---|---|---|---|
| K-SADS | NA | NR | G | A | G | E | E | E | G | ✓ |
| CDRS-R | G | G | G | A | G | G | G | E | A | ✓ |
| CDI | E | G | NA | A | G | G | E | A | A | ✓ |
| MFQ | E | E | NA | A | A | G | E | A | A | |
| RCDS | E | E | NA | A | G | A | A | A | A | |
| RADS | E | E | NA | G | G | G | E | G | A | ✓ |
| C-GAS | E | NA | G | G | A | G | E | E | A | ✓ |
| SAICA | A | A | A | E | G | G | A | A | A | |

Note: K-SADS = Schedule for Affective Disorders and Schizophrenia in School-Age Children; CDRS-R = Children's Depression Rating Scale-Revised; CDI = Children's Depression Inventory; MFQ = Mood and Feelings Questionnaire; RCDS = Reynolds Child Depression Scale; RADS = Reynolds Adolescent Depression Scale; C-GAS = Child Global Assessment Scale; SAICA = Social Adjustment Inventory for Children and Adolescents; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

been sensitive to treatment of depression in children and adolescents (Goodman, Schwab-Stone, Lahey, Shaffer, & Jensen, 2000; Goodyer et al., 2007; Muratori, Picchi, Bruni, Patarnello, & Ramagnoli, 2003). The SAICA is one of the few multidimensional measures of social functioning to be used in youth depression treatment studies. It demonstrated sensitivity to post-treatment effects and 9-month follow-up effects (Vostanis et al., 1996). The CAFAS has also been shown to be sensitive to treatment gains (Winters et al., 2005); however, these studies did not focus on depression. Finally, the BIS failed to show treatment effects in one large clinical trial (Chorpita et al., 2013). Future research should focus on the effects of treatment on adaptive functioning and determine whether youths' functioning targeted in treatment actually improves.

Overall Evaluation

Determining which measures to use for monitoring and evaluating treatment will depend on a number of factors, including the nature of the patient's condition and the form of treatment. For patients with diagnoses of MDD, the K-SADS MDD section can be used to assess remission. In addition to the K-SADS, the clinician-rated CDRS-R, which has been sensitive to change in both pharmacological and psychotherapy treatment studies, is highly recommended. However, if a clinician-rating scale is too costly, the youth-rated CDI, which is sensitive to change and widely used with child and adolescent populations, is highly recommended; the RCDS, RADS, and MFQ are also acceptable options. These measures have all been shown to be sensitive to the effects of psychotherapy, and they are all self-report measures, which appear to be more sensitive

than parent-report measures of depression. It may be of value to use a combination of measures that are intended to assess both modest/normative and clinically significant levels of symptoms (Olino et al., 2012, 2013) so that higher levels of severity are accurately assessed early and more modest levels of severity are assessed later in treatment. For example, the CDI and MFQ may be jointly administered such that higher severity symptoms are assessed via the CDI earlier in treatment and lower severity symptoms are assessed via the MFQ later in treatment. This may help provide additional sensitivity in tracking symptom severity through to complete remission. In Table 6.3, of the self-report measures, only the CDI and RADS are considered highly recommended because they have been most widely used and consistently shown to be sensitive to treatment effects. Additional studies of depression in early childhood are needed in order to make recommendations about specific measures that are sensitive to treatment in this developmental period. Parent-report measures may be more useful in assessing and monitoring the youth's functional impairment (Kramer et al., 2004). In addition, youth and teacher reports on multidimensional measures of functional impairment would also be advisable because multiple informants may be required to get a comprehensive assessment of functioning across different contexts and relationships. We recognize that it is often difficult to obtain information from multiple sources; nevertheless, we strongly recommend that information from multiple informants be obtained during the initial assessment and treatment planning phase and in evaluating treatment outcome. However, it would be acceptable to monitor the youth's functioning over the course of treatment using only parent and/or youth reports because these are more


easily obtainable in clinical settings. We recommend that the C-​GAS be used as the global measure of functional impairment because the CIS (discussed previously) has not been evaluated for sensitivity to change. Reliance on the CGI-​I and CGI-​S is promising but is currently only considered adequate. Finally, we suggest that future research examine the clinical utility and sensitivity to change of a variety of multidimensional scales because recommending one over another for treatment monitoring and outcome seems premature at this time.
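Where clinics want to operationalize this kind of repeated measurement, the bookkeeping itself is straightforward, as the sketch below illustrates. The scores, the 50% reduction criterion for "response," and the remission cutoff are hypothetical placeholders rather than validated values for any particular instrument, and they would need to be replaced with thresholds appropriate to the chosen measure (e.g., the CDI or MFQ) and population.

```python
# Minimal sketch of session-by-session symptom monitoring.
# All scores and thresholds below are hypothetical; substitute validated
# cutoffs for the specific rating scale and population being assessed.
from dataclasses import dataclass

@dataclass
class MonitoringRecord:
    session: int
    score: float  # total score on the chosen depression rating scale

def summarize_progress(records, response_pct=0.50, remission_cutoff=12):
    """Flag 'response' (assumed >= 50% drop from baseline) and 'remission'
    (score below an assumed cutoff); both criteria are illustrative only."""
    baseline = records[0].score
    latest = records[-1].score
    pct_change = (baseline - latest) / baseline if baseline else 0.0
    return {
        "baseline": baseline,
        "latest": latest,
        "percent_reduction": round(pct_change, 2),
        "response": pct_change >= response_pct,
        "remission": latest < remission_cutoff,
    }

history = [MonitoringRecord(s, x) for s, x in
           [(0, 28), (2, 24), (4, 19), (6, 13), (8, 10)]]
print(summarize_progress(history))
# -> {'baseline': 28, 'latest': 10, 'percent_reduction': 0.64,
#     'response': True, 'remission': True}
```

Charted over successive administrations, the same records give an objective index of progress and a basis for comparison with published treatment benchmarks.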


CONCLUSIONS AND FUTURE DIRECTIONS

In this chapter, we reviewed the major approaches and measures for diagnosing and assessing depression in children and adolescents. We also identified a number of additional variables that should be considered for prognosis, treatment planning, case conceptualization, and treatment monitoring and evaluation, and we briefly discussed their assessment. In summary, a comprehensive assessment of child and adolescent depression should include (a) determining whether criteria are met for a diagnosis of depressive disorder and assessing the severity of depressive symptoms; (b) assessing key symptoms such as hopelessness, suicidal ideation, and psychotic symptoms that might influence treatment decisions; (c) carefully assessing the previous course of the depression (e.g., prior episodes and chronicity); (d) evaluating comorbid psychiatric, developmental, and general medical disorders; (e) assessing family, school, and peer functioning; (f) exploring significant stressors, traumas, and social factors, including ethnicity and culture; and (g) assessing family history of psychopathology.

Our recommendations are as follows. First, the assessment of psychopathology should include a semi-structured diagnostic interview because less systematic approaches frequently overlook key areas of psychopathology and also because respondent-based interviews pose several limitations, including overdiagnosis, poor discriminant validity, and the inability to clarify questions or responses. Second, data should be obtained from multiple informants, including the child (if he or she is older than age 8 years) and primary caregiver. Finally, regular monitoring and evaluation of treatment using clinician, self-rating, and parent-rating scales assessing depressive symptoms and functional impairment is critical. Although a reduction in test scores when monitoring treatment effects should be viewed cautiously in light of the possibility of attenuation effects, this provides a means of objectively assessing progress and allows for comparison to published treatment benchmarks.

Issues for Future Research

Evaluating empirically supported assessments of child and adolescent depression is a challenging task, and a number of issues must be resolved in order to provide firm grounds for recommendations. Unfortunately, most of the gaps in the development of empirically supported assessments for youth depression identified in the previous edition of this volume remain. However, these gaps continue to define an agenda for future research.

First, there are fundamental questions about the validity of depression as a diagnostic construct and its relationship to other conditions, such as anxiety disorders. For example, structural models of psychopathology indicate that much of the liability to psychiatric disorders can be explained by one or two higher order dimensions, with depression loading on a general factor and an internalizing factor (Lahey, Van Hulle, Singh, Waldman, & Rathouz, 2011; Olino et al., 2014). This literature also suggests that depression is almost indistinguishable from generalized anxiety disorder, and the two conditions should be collapsed into a single "distress" disorder (Lahey et al., 2008). Whereas these models focus on clinical phenotypes, the NIMH RDoC initiative takes a different approach, seeking to identify transdiagnostic biobehavioral dimensions that can be assessed at multiple units of analysis. It will be important to watch these developments because they may fundamentally alter our core diagnostic constructs, treatment targets, and possibly intervention approaches.

Second, there is a need for longitudinal studies focusing specifically on the processes associated with the maintenance, recovery, and recurrence of depression in children and adolescents. In addition, there is a need to expand the surprisingly limited literature on predictors of differential treatment response (Weersing et al., 2015). This should provide valuable information regarding potential targets for assessment and treatment and also for choosing between treatment options.

Third, we need to determine the best method for integrating data from multiple informants for diagnosis and treatment evaluation (De Los Reyes et al., 2015). Fourth, there is a need for methodologically rigorous comparisons between different diagnostic interviews or rating scales and also a need to determine the cost-effectiveness, incremental validity (Hunsley & Meyer, 2003), and treatment utility of these measures. Hayes, Nelson, and Jarrett


(1987) and Nelson-​Gray (2003) have described a number of research designs that can be used to test treatment utility and that are easily implemented. Fifth, assessing the reliability and validity of case formulations and treatment planning is a critical area in which little work has been done for child and adolescent depression (Kuyken, Fothergill, Musa, & Chadwick, 2005). Sixth, during the treatment monitoring phase of assessment, there are few guidelines regarding whether or when treatment with depressed youth should be intensified, changed, or discontinued. Fortunately, there is a growing body of work on these issues that can be drawn upon (e.g., Shimokawa, Lambert, & Smart, 2010). Finally, disseminating and implementing evidence-​based assessment for depressed youth in community settings remains a significant challenge (Garland et al., 2013; Weisz et al., 2015). These gaps in the literature present us with many critical research tasks that will further the development of evidence-​based assessment tools and procedures for child and adolescent depression and facilitate the development of evidence-​based treatments for depressed youth.

ACKNOWLEDGMENTS

Daley DiCorcia’s assistance in preparing the manuscript is gratefully acknowledged. Writing of this chapter was supported by NIMH grants RO1 MH 069942 (Klein) and R01 MH107495 (Olino).

REFERENCES

Ablow, J. C., Measelle, J. R., Kraemer, H. C., Harrington, R., Luby, J., Smider, N., . . . Kupfer, D. J. (1999). The MacArthur Three-​ City Outcome Study:  Evaluating multi-​ informant measures of young children’s symptomatology. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 1580–​1590. Achenbach, T. M., McConaughy, S. H., & Howell, C. T (1987). Child/​ adolescent behavioral and emotional problems:  Implications of cross-​informant correlations for situational specificity. Psychological Bulletin, 101, 213–​232. Achenbach, T. M., & Rescorla, L. A. (2001a). Manual for the ASEBA School-​Age Forms & Profiles. Burlington, VT: University of Vermont. Achenbach, T. M., & Rescorla, L. A. (2001b). Manual for ASEBA Preschool Forms and Profiles. Burlington, VT:  University of Vermont, Research Center for Children, Youth, and Families.

Almirall, D., & Chronis-​Tuscano, A. (2016). Adaptive interventions in child and adolescent mental health. Journal of Clinical Child & Adolescent Psychology, 45, 383–​395. Ambrosini, P. J. (2000). Historical development and present status of the Schedule for Affective Disorders and Schizophrenia for School-​ Age Children (K-​ SADS). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 49–​58. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Andreasen, N. C., Endicott, J., Spitzer, R. L., & Winokur, G. (1977). The family history method using diagnostic criteria. Archives of General Psychiatry, 34, 1229–​1235. Angold, A., & Costello, E. J. (1995). A test–​retest reliability study of child-​reported psychiatric symptoms and diagnoses using the Child and Adolescent Psychiatric Assessment (CAPA-​C). Psychological Medicine, 25, 755–​762. Angold, A., & Costello, E. J. (2000). The Child and Adolescent Psychiatric Assessment (CAPA). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 39–​48. Angold, A., Costello, E. J., & Erkanli, A. (1999). Comorbidity. Journal of Child Psychology & Psychiatry, 40, 57–​87. Angold, A., Costello, E. J., Messer, S. C., & Pickles, A. (1995). Development of a short questionnaire for use in epidemiological studies of depression in children and adolescents. International Journal of Methods in Psychiatric Research, 5, 237–​249. Angold, A., Erkanli, A., Copeland, W., Goodman, R., Fisher, P. W., & Costello, E. J. (2012). Psychiatric diagnostic interviews for children and adolescents:  A comparative study. Journal of the American Academy of Child & Adolescent Psychiatry, 51, 506–​517. Angold, A., & Fisher, P. W. (1999). Interviewer-​based interviews. In D. Shaffer, C. P. Lucas, & J. E. Richters (Eds.), Diagnostic assessment in child and adolescent psychopathology (pp. 34–​64). New York, NY: Guilford. Angold, A., Prendergast, M., Cox, A., Harrington, R., Simonoff, E., & Rutter, M. (1995). The Child and Adolescent Psychiatric Assessment (CAPA). Psychological Medicine, 25, 739–​753. Atkinson, S. D., Prakash, A., Zhang, Q., Pangallo, B. A., Bangs, M. E., Emslie, G. J., & March, J. S. (2014). A double-​ blind efficacy and safety study of duloxetine flexible dosing in children and adolescents with major depressive disorder. Journal of Child and Adolescent Psychopharmacology, 24, 180–​189. Avenevoli, S., Swendsen, J., He, J., Burstein, M., & Merikangas, K. R. (2015). Major depression in the National Comorbidity Survey–​ Adolescent Supplement:  Prevalence, correlates,


and treatment. Journal of the American Academy of Child & Adolescent Psychiatry, 54, 37–​44. Axelson, D., Findling, R. L., Fristad, M. A., Kowatch, R. A., Youngstrom, E. A., Horwitz, S. M.,  .  .  .  Birmaher, B. (2012). Examining the proposed disruptive mood dysregulation disorder diagnosis in the longitudinal assessment of mania symptoms study. Journal of Clinical Psychology, 73, 1342–​1350. Barch, D. M., Gotlib, I. H., Bilder, R. M., Pine, D. S., Smoller, J. W., Brown, C. H., . . . Farber, G. K. (2016). Common measures for National Institute of Mental Health funded research. Biological Psychiatry, 79(12), e91–​e96. Bates, M. P. (2001). The Child and Adolescent Functional Assessment Scale (CAFAS): Review and current status. Clinical Child & Family Psychology Review, 4, 63–​84. Beck, A. T., Weissman, A., Lester, D., & Trexler, L. (1974). The measurement of pessimism:  The Hopelessness Scale. Journal of Consulting and Clinical Psychology, 42, 861–​865. Beck, J., Beck, A., & Jolly, J. (2001). Beck Youth Inventories of Emotional & Social Impairment manual. San Antonio, TX: Psychological Corporation. Bird, H. R., Canino, G., Davies, M., Ramírez, R., Chavez, L., Duarte, C., & Shen, S. (2005). The Brief Impairment Scale (BIS):  A multidimensional scale of functional impairment for children and adolescents. Journal of the American Academy of Child & Adolescent Psychiatry, 44, 699–​707. Bird, H. R., Shaffer, D., Fisher, P., Gould, M. S., Staghezza, G., Chen, J. Y., et al. (1993). The Columbia Impairment Scale (CIS):  Pilot findings on a measure of global impairment for children and adolescents. International Journal of Methods in Psychiatric Research, 3, 167–​176. Birmaher, B., Arbelaez, C., & Brent, D. (2002). Course and outcome of child and adolescent major depressive disorder. Child and Adolescent Clinics of North America, 11, 619–​638. Birmaher, B., Ehmann, M., Axelson, D. A., Goldstein, B. I., Monk, K., Kalas, C., . . . Brent, D. A. (2009). Schedule for Affective Disorders and Schizophrenia for School-​ Age Children (K-​SADS-​PL) for the assessment of preschool children—​ A preliminary psychometric study. Journal of Psychiatric Research, 43, 680–​686. Boyle, M. H., Offord, D. R., Racine, Y., Sanford, M., Szatmari, P., Fleming, J. E., & Price-​Munn, N. (1993). Evaluation of the Diagnostic Interview for Children and Adolescents for use in general population samples. Journal of Abnormal Child Psychology, 21, 663–​681. Brent, D., Emslie, G., Clarke, G., Dineen, K., Asarnow, J. R., Keller, M., . . . Zelazny, J. (2008). Switching to another SSRI or to venlafaxine with or without cognitive behavioral therapy for adolescents with SSRI-​resistant depression: The TORDIA randomized controlled trial. JAMA, 299, 901–​913.


Brent, D., Kolko, D., Birmaher, B., Baugher, M., Bridge, J., Roth, C., & Holder, D. (1998). Predictors of treatment efficacy in a clinical trial of three psychosocial treatments for adolescent depression. Journal of the American Academy of Child & Adolescent Psychiatry, 37, 906–​909. Bridge, J. A., Iyengar, S., Salary, C. B., Barbe, R. P., Birmaher, B., Pincus, H. A.,  .  .  .  Brent, D. A. (2007). Clinical response and risk for reported suicidal ideation and suicide attempts in pediatric antidepressant treatment; A meta-​analysis of randomized controlled trials. JAMA, 297, 1683–​1696. Brooks, S. J., & Kutcher, S. (2001). Diagnosis and measurement of adolescent depression: A review of commonly utilized instruments. Journal of Child & Adolescent Psychopharmacology, 11, 341–​376. Bufferd, S. J., Dougherty, L. R., Carlson, G. A., & Klein, D. N. (2011). Parent-​reported mental health in preschoolers:  Findings using a diagnostic interview. Comprehensive Psychiatry, 52, 359–​369. Bufferd, S. J., Dougherty, L. R., Olino, T. M., Dyson, M. W., Laptook, R., Carlson, G. A., & Klein, D. N. (2014). Predictors of the onset of depression in young children:  A multi-​ method, multi-​ informant longitudinal study from ages 3 to 6. Journal of Child Psychology and Psychiatry, 55, 1279–​1287. Canino G. (2016). The role of measuring functional impairment. National Academies of Sciences, Engineering, and Medicine. 2016. Measuring serious emotional disturbance in children: Workshop summary. Washington, DC:  The National Academies Press. https://​doi.org/​ 10.17226/​21865. Canino, G. J., Fisher, P. W., Alegria, M., & Bird, H. R. (2013). Assessing child impairment in functioning in different contexts:  Implications for use of services and the classification of psychiatric disorders. Open Journal of Medical Psychology, 2(1), 29–​34. Chapman, T. F., Mannuzza S., Klein, D. F., & Fyer, A. J. (1994). Effects of informant mental disorder on psychiatric family history data. American Journal of Psychiatry, 151, 574–​579. Chentsova-​Dutton, Y. E., Ryder, A. G., & Tsai, J. (2014). Understanding depression across cultures. In I. H. Gotlib & C. L. Hammen (Eds.), Handbook of depression (3rd ed., pp. 337–​354). New York, NY: Guilford. Chorpita, B. F., Weisz, J. R., Daleiden, E. L., Schoenwald, S. K., Palinkas, L. A., Miranda, J., . . . Ward, A. (2013). Long-​ term outcomes for the Child STEPs randomized effectiveness trial:  A comparison of modular and standard treatment designs with usual care. Journal of Consulting and Clinical Psychology, 81, 999–​1009. Clarke, G. N., Hawkins, W., Murphy, M., Sheeber, L. B., Lewinsohn, P. M., & Seeley, J. R. (1995). Targeted prevention of unipolar depressive disorder in an at-​risk


sample of high school adolescents: A randomized trial of group cognitive intervention. Journal of the American Academy of Child & Adolescent Psychiatry, 34, 312–​321. Clarke, G. N., Hornbrook, M., Lynch, F., Polen, M., Gale, J., Beardslee, W., . . . Seeley, J. (2001). A randomized trial of group cognitive intervention for preventing depression in adolescent offspring of depressed parents. Archives of General Psychiatry, 58, 1127–​1134. Clarke, G. N., Rhode, P., Lewinsohn, P., Hops, H., & Seeley, J. R. (1999). Cognitive–​behavioral treatment of adolescent depression: Efficacy of acute group treatment and booster sessions. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 272–​279. Cole, D. A., Cai, L., Martin, N. C., Findling, R. L., Youngstrom, E. A., Garber, J.,  .  .  .  Forehand, R. (2011). Structure and measurement of depression in youths: Applying item response theory to clinical data. Psychological Assessment, 23, 819–​833. Cole, D. A., Hoffman, K., Tram, J. M., & Maxwell, S. E. (2000). Structural differences in parent and child reports of children’s symptoms of depression and anxiety. Psychological Assessment, 12, 174–​185. Copeland, W. E., Angold, A., Costello, E. J., & Egger, H. (2013). Prevalence, comorbidity, and correlates of DSM-​5 proposed disruptive mood dysregulation disorder. American Journal of Psychiatry, 170, 173–​179. Copeland, W. E., Shanahan, L., Costello, E. J., & Angold, A. (2009). Childhood and adolescent psychiatric disorders as predictors of young adult disorders. Archives of General Psychiatry, 66, 764–​774. Copeland, W. E., Shanahan, L., Egger, H., Angold, A., & Costello, E. J. (2014). Adult diagnostic and functional outcomes of DSM-​5 disruptive mood dysregulation disorder. American Journal of Psychiatry, 171, 688–​774. Copeland, W. E., Shanahan, L., Erkanli, A., Costello, E. J., & Angold, A. (2013). Indirect comorbidity in childhood and adolescence. Frontiers in Psychiatry, 4, 1–​8. Copeland, W. E., Wolke, D., Shanahan, L., & Costello, J. (2015). Adult functional outcomes of common childhood psychiatric problems: A prospective, longitudinal study. JAMA Psychiatry, 72, 892–​899. Costello, A. J., Edelbrock, C., Dulcan, M. K., Kalas, R., & Klaric, S. H. (1984). Report of the NIMH Diagnostic Interview Schedule for Children (DISC). Washington, DC: National Institute of Mental Health. Costello, E. J., Angold, A., March, J., & Fairbank, J. (1998). Life events and post-​ traumatic stress:  The development of a new measure for children and adolescents. Psychological Medicine, 28, 1275–​1288. Cuijpers, P., Weitz, E., Karyotaki, E., Garber, J., & Andersson, G. (2015). The effects of psychological treatment of maternal depression on children and parental functioning: A meta-​analysis. European Child & Adolescent Psychiatry, 24, 237–​245.

D’Angelo, E. J., & Augenstein, T. M. (2012). Developmentally informed evaluation of depression:  Evidence-​ based instrument. Child and Adolescent Psychiatric Clinics of North America, 21, 279–​298. Danzig, A. P., Bufferd, S. J., Dougherty, L. R., Carlson, G. A., Olino, T. M., & Klein, D. N. (2013). Longitudinal associations between preschool psychopathology and school-​ age peer functioning. Child Psychiatry and Human Development, 44, 621–​632. Daviss, B., Birmaher, B., Melhem, N. A., Axelson, D. A., Michaels, S. M., & Brent, D. A. (2006). Criterion validity of the Mood and Feelings Questionnaire for depressive episodes in clinic and non-​clinic subjects. Journal of Child Psychology and Psychiatry, 47, 927–​934. De Cuyper, S., Timbremont, B., Braet, C., De Backer, V., & Wullaert, R. (2004). Treating depressive symptoms in school children:  A pilot study. European Child & Adolescent Psychiatry, 13, 105–​114. De Los Reyes, A., Augenstein, T. M., Wang, M., Thomas, S. A., Drabick, D. A., Burgers, D. E., & Rabinowitz, J. (2015). The validity of the multi-​informant approach to assessing child and adolescent mental health. Psychological Bulletin, 141, 858–​900. De Los Reyes, A., Thomas, S. A., Goodman, K. L., & Kundey, S. (2013). Principles underlying the use of multiple informants’ reports. Annual Review of Clinical Psychology, 9, 123–​149. Dere, J., Watters, C. A., Yu, S., Bagby, R. M., Ryder, A., G., & Harkness, K. (2015). Cross-​cultural examination of measurement invariance of the Beck Depression Inventory-​ II. Journal of Abnormal Psychology, 27, 68–​81. Deveney, C. M., Hommer, R. E., Reeves, E., Stringaris, A., Hinton, K. E., Haring, C. T., . . . Leibenluft, E. (2015). A prospective study of severe irritability in youths: 2-​and 4-​year follow-​up. Depression and Anxiety, 32, 364–​372. Diamond, G. S., Reis, B. F., Diamond, G. M., Siqueland, L., & Isaacs, L. (2002). Attachment-​based family therapy for depressed adolescents:  A treatment development study. Journal of the American Academy of Child & Adolescent Psychiatry, 41, 1190–​1196. Dougherty, L. R., Bufferd, S. J., Carlson, G. A., Dyson, M., Olino, T. M., Durbin, C. E., & Klein, D. N. (2011). Preschoolers’ observed temperament and psychiatric disorders assessed with a parent diagnostic interview. Journal of Clinical Child and Adolescent Psychology, 40, 295–​306. Dougherty, L. R., Smith, V. C., Bufferd, S. J., Carlson, G. A., Stringaris, A., Leibenluft, E., & Klein, D. N. (2014). DSM-​ 5 disruptive mood dysregulation disorder:  Correlates and predictors in young children. Psychological Medicine, 44, 2239–​2350. Dougherty, L. R., Smith, V. C., Bufferd, S. J., Kessel, E. M. Carlson, G. A., & Klein, D. N. (2016). Disruptive mood dysregulation disorder at age six and clinical and


functional outcomes three years later. Psychological Medicine, 46, 1103–​1114. Dubicka, B., Elvins, R., Roberts, C., Chick, G., Wilkinson, P., & Goodyer, I. M. (2010). Combined treatment with cognitive behavioral therapy in adolescent depression:  Meta-​analysis. British Journal of Psychiatry, 197, 433–​440. Ebesutani, C., Bernstein, A., Nakamura, B. J., Chorpita, B. F., Higa-​McMillan, C. K., & Weisz, J. R. (2010). Concurrent validity of the Child Behavior Checklist DSM-​oriented scales: Correspondence with DSM diagnoses and comparison to syndrome scales. Journal of Psychopathology and Behavioral Assessment, 32, 373–​384. Edelbrock, C., Costello, A. J., Dulcan, M. K., Kalas, R., & Conover, N. C. (1985). Age differences in the reliability of the psychiatric interview of the child. Child Development, 56, 265–​275. Egger, H. L., & Angold, A. (2004). The Preschool Age Psychiatric Assessment (PAPA):  A structured parent interview for diagnosing psychiatric disorders in preschool children. In R. DelCarmen-​Wiggins & A. Carter (Eds.), Handbook of infant, toddler, and preschool mental health assessment (pp. 223–​ 243). New  York, NY: Oxford University Press. Egger, H. L., & Angold, A. (2006). Common emotional and behavioral disorders in preschool children: Presentation, nosology, and epidemiology. Journal of Child Psychology and Psychiatry, 47, 313–​337. Egger, H. L., Ascher, B. H., & Angold, A. (1999). The Preschool Age Psychiatric Assessment:  Version 1.1. Durham, NC:  Center for Developmental Epidemiology, Department of Psychiatry and Behavioral Sciences, Duke University Medical Center. Egger, H. L., Erkanli, A., Keeler, G., Potts, E., Walter, B., & Angold, A. (2006). Test–​retest reliability of the Preschool Age Psychiatric Assessment (PAPA). Journal of the American Academy of Child & Adolescent Psychiatry, 45, 538–​549. Emslie, G. J., Kennard, B. D., & Mayes, T. L. (2011). Predictors of treatment response in adolescent depression. Psychiatric Annals, 41, 212–​219. Emslie, G. J., Kennard, B. D., Mayes, T. L., Nightingale-​ Teresi, J., Carmody, T., Rush, A. J.,  .  .  .  Rintelmann, J. W. (2008). Fluoxetine versus placebo in preventing relapse of major depression in children and adolescents. American Journal of Psychiatry, 165, 459–​467. Emslie, G. J., Kennard, B. D., Mayes, T. L., Nakonezny, P. A., Moore, J., Jones, J. M., . . . King, J. (2015). Continued effectiveness of relapse prevention cognitive–​behavioral therapy following fluoxetine treatment in youth with major depressive disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 54, 991–​998. Emslie, G. J., Rush, A. J., Weinberg, W. A., Kowatch, R. A., Hughes, C. W., Carmody, T., & Rintelmann, J. (1997).


Double-​blind placebo-​controlled trial of fluoxetine in depressed children and adolescents. Archives of General Psychiatry, 54, 1031–​1037. Epstein, M. H. (1999). The development and validation of a scale to assess the emotional and behavioral strengths of children and adolescents. Remedial and Special Education, 20, 258–​262. Epstein, M. H., Mooney, P., Ryser, G., & Pierce, C. D. (2004). Validity and reliability of the Behavioral and Emotional Rating Scale (2nd edition):  Youth Rating Scale. Research on Social Work Practice, 14, 358–​367. Epstein, N. B., Baldwin, L. M., & Bishop, D. S. (1983). The McMaster Family Assessment Device. Journal of Marital & Family Therapy, 9, 171–​180. Essex, M. J., Boyce, W. T., Goldstein, L. H., Armstrong, J. M., Kraemer, H. C., & Kupfer, D. J. (2002). The confluence of mental, physical, social, and academic difficulties in middle childhood: II. Developing the MacArthur Health and Behavior Questionnaire. Journal of the American Academy of Child & Adolescent Psychiatry, 41, 588–​603. Ezpeleta, L., de la Osa, N., Domenech, J. M., Navarro, J. B., Losilla, J. M., & Judez, J. (1997). Diagnostic agreement between clinicians and the Diagnostic Interview for Children and Adolescents—​DICA-​R—​in an outpatient sample. Journal of Child Psychology & Psychiatry, 38, 431–​440. Ezpeleta, L., de la Osa, N., Granero, R., Domènech, J. P., & Reich, W. (2011). The Diagnostic Interview of Children and Adolescents for Parents of Preschool and Young Children: Psychometric properties in the general population. Psychiatry Research, 190, 137–​144. Ferdinand, R. F., Hoogerheide, K. N., van der Ende, J., Visser, J. H., Koot, H. M., Kasius, M. C., & Verhulst, F. C. (2003). The role of the clinician: Three-​year predictive value of parents’, teachers’, and clinicians’ judgments of childhood psychopathology. Journal of Child Psychology and Psychiatry, 44, 867–​876. Ford, T., Goodman, R., & Meltzer, H. (2003). The British Child and Adolescent Mental Health Survey 1999: The prevalence of DSM-​ IV disorders. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 1203–​1211. Gadow, K. D., & Sprafkin, J. N. (2000). Early Childhood Inventory 4—​Screening manual. Stony Brook, NY: Checkmate Plus. Gadow, K. D., & Sprafkin, J. N. (2002). Child Symptom Inventory 4—​ Screening and norms manual. Stony Brook, NY: Checkmate Plus. Gaffrey, M. S., Barch, D. M., Singer, J., Shenoy, R., & Luby, J. L. (2013). Disrupted amygdala reactivity in depressed 4-​ to 6-​year-​old children. Journal of the American Academy of Child & Adolescent Psychiatry, 52, 737–​746. Garber, J., & Horowitz, J. L. (2002). Depression in children. In I. H. Gotlib & C. L. Hammen (Eds.), Handbook of depression (pp. 510–​540). New York, NY: Guilford.


7

Adult Depression

Jacqueline B. Persons
David M. Fresco
Juliet Small Ernst

We begin this chapter with an overview of the current diagnostic criteria for major depressive disorder (MDD), the epidemiology of MDD, and current theories of and therapies for MDD. We review assessment tools for obtaining a diagnosis, developing a case conceptualization and treatment plan, and monitoring change in therapy. We conclude with a brief discussion of some future directions of assessment of depression. We focus this review on MDD because space is limited and because the empirical support for the tools and theories and therapies we describe focuses most frequently on MDD. However, many other disorders, including persistent depressive disorder (dysthymia), premenstrual dysphoric disorder, substance/medication-induced depressive disorder, adjustment disorders, schizoaffective disorder, and bipolar and related disorders, as well as phenomena that are not disorders (e.g., grief), share features with MDD, and many of the assessment tools described here will be helpful in those cases. Chapter 9 in this volume addresses the assessment of bipolar disorder.

THE NATURE OF MAJOR DEPRESSIVE DISORDER

Diagnostic Criteria

MDD is an episodic mood disorder characterized by depressed mood or anhedonia (loss of interest and pleasure in life) that has persisted for most of the day, nearly every day, for at least 2 weeks, together with additional symptoms from the following list for a total of at least five symptoms: weight gain or weight loss not associated with dieting, decrease or increase in appetite, insomnia or hypersomnia, psychomotor agitation or retardation, fatigue or loss of energy, feelings of worthlessness, excessive or inappropriate guilt, diminished ability to think or concentrate, indecisiveness, or suicidality (American Psychiatric Association, 2013). The symptoms cause clinically significant distress or impairment in functioning, and they are not due to the direct physiological effects of a substance or a general medical condition.

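To make the structure of these criteria concrete, the following minimal Python sketch encodes the symptom-count, duration, impairment, and exclusion checks described above. The symptom labels and the helper function are illustrative stand-ins rather than a validated diagnostic algorithm; a structured interview such as those described later in this chapter remains the appropriate way to assign a diagnosis.

```python
# Illustrative sketch only: encodes the DSM-5 MDD episode logic summarized above.
# Symptom labels and the helper are hypothetical; they do not replace a diagnostic interview.

CORE = {"depressed_mood", "anhedonia"}
ALL_SYMPTOMS = CORE | {
    "appetite_or_weight_change", "insomnia_or_hypersomnia",
    "psychomotor_agitation_or_retardation", "fatigue_or_loss_of_energy",
    "worthlessness_or_inappropriate_guilt",
    "impaired_concentration_or_indecisiveness", "suicidality",
}

def meets_mdd_episode_criteria(symptoms, duration_weeks, causes_impairment,
                               due_to_substance_or_medical_condition):
    """Return True if the reported picture matches the episode criteria sketched above."""
    present = set(symptoms) & ALL_SYMPTOMS
    has_core = bool(present & CORE)        # depressed mood or anhedonia is required
    enough_symptoms = len(present) >= 5    # at least five symptoms in total
    long_enough = duration_weeks >= 2      # most of the day, nearly every day, >= 2 weeks
    return (has_core and enough_symptoms and long_enough
            and causes_impairment and not due_to_substance_or_medical_condition)

# Example: five symptoms including a core symptom, 3 weeks' duration, with impairment
print(meets_mdd_episode_criteria(
    {"depressed_mood", "insomnia_or_hypersomnia", "fatigue_or_loss_of_energy",
     "impaired_concentration_or_indecisiveness", "worthlessness_or_inappropriate_guilt"},
    duration_weeks=3, causes_impairment=True,
    due_to_substance_or_medical_condition=False))   # -> True
```
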
Epidemiology of Major Depressive Disorder

MDD is a prevalent and debilitating national health problem. The National Comorbidity Survey Replication (NCS-R; Kessler, Chiu, Demler, Merikangas, & Walters, 2005) reported the lifetime prevalence of MDD in the United States at 16.2%, the highest rate among the 14 major psychiatric disorders assessed. The 2014 National Survey on Drug Use and Health found that 6.6% of adults suffered at least one major depressive episode in the past year, a figure that equates to roughly 15.7 million Americans (Center for Behavioral Health Statistics and Quality, 2015). Many patients with MDD experience multiple episodes, with rates of recurrence up to 85% within a 15-year period (Hardeveld, Spijker, De Graaf, Nolen, & Beekman, 2010). The prevalence of depressive symptoms in the United States is widespread; 20.1% of the adults sampled in the National Health and Nutrition Examination Survey reported significant depressive symptoms (Shim, Baltrus, Ye, & Rust, 2011). Depression is a leading cause of disability. MDD accounts for the third greatest burden of all diseases worldwide and the greatest burden for middle- and high-income nations (World Health Organization, 2008). In the United States, estimates of the monetary burden of MDD, whether through direct (e.g., medical services) or indirect costs (e.g., workplace presenteeism, or the act of working while sick), approached $210.5 billion in 2010 (Greenberg, Fournier, Sisitsky, Pike, & Kessler, 2015).

The lifetime prevalence of MDD is higher in women than in men in every age group (Pratt & Brody, 2014). MDD is more likely to occur in Whites compared to Hispanics or non-Hispanic Blacks (Kessler et al., 2003), although this pattern is reversed in dysthymia (called persistent depressive disorder in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders [DSM-5]; American Psychiatric Association, 2013; Riolo, Nguyen, Greden, & King, 2005) and may become insignificant when the factor of poverty is controlled for (Pratt & Brody, 2014). MDD is associated with high rates of comorbidity with other psychiatric disorders; the NCS-R reported rates of comorbidity as high as 59.2% with anxiety disorders, 24% with substance use disorder, and 30% with impulse control disorders. Other common comorbid conditions include pain and other somatoform disorders, eating disorders, dementias, and personality disorders.

Theories of Depression

A variety of systems of psychotherapy with ostensibly different mechanisms of action have been shown to be effective in treating major depression and/or reducing the likelihood of a relapse. Here, we briefly describe the major behavioral, cognitive, affect science, and interpersonal theories of depression and the therapies based on them. These theories and therapies identify mechanisms that cause and maintain symptoms of depression and that clinicians will want to assess to inform their case conceptualization and treatment plan and also to monitor the patient's progress during therapy. Comprehensive reviews of this literature are provided by Craighead, Johnson, Carey, and Dunlop (2015), DeRubeis, Siegle, and Hollon (2008), and Hollon, Stewart, and Strunk (2006).

We describe theories and mechanisms of depression using a "silo" approach that emphasizes distinctions among the theories and therapies of depression. However, as Mennin, Ellard, Fresco, and Gross (2013) note, these therapies are "blunt instruments"—that is, although they are intended to target certain mechanisms, they likely produce change in many others. Thus, for example, many change principles in our treatments, such as cognitive change (e.g., decentering and cognitive reframing), have a bidirectional relationship with behavior change (e.g., exposure and behavioral activation). Our motivation for emphasizing the distinctions among the models is to help clinicians solve clinical problems. For instance, many patients do not respond to treatment (response rates for evidence-based treatments range from 25% to 64%; Craighead et al., 2015). When treatment fails, using an alternate conceptual model can provide new intervention ideas (Persons, 1990, 2008; Persons, Beckner, & Tompkins, 2013).

Behavioral Models

Behavioral models of depression focus primarily on positive and negative reinforcement. For instance, Ferster (1973) conceptualized that depression arises and is maintained when individuals orient their lives in service of escape or avoidance instead of in the pursuit of positive reinforcement. Ferster proposed a functional analytic approach to treating depression that focused on decreasing the depressed individual's reliance on escape or avoidance behaviors and expanding the individual's behavioral repertoire to increase the availability of positive reinforcements. Similarly, Lewinsohn (Lewinsohn & Gotlib, 1995) posited that depressed individuals lack positive reinforcement or have experienced life events or stressors that caused them to lose positive reinforcers and that until they learn to obtain positive reinforcement, they will be inactive, withdrawn, and dysphoric. Lewinsohn's therapy helps depressed individuals increase the positive reinforcement they experience by learning to identify and carry out pleasant activities, practice relaxation, and improve their social skills. These early behavioral models gave rise to evidence-based treatments of depression, including behavioral therapy (Lewinsohn & Gotlib, 1995), behavioral activation (BA) (Dimidjian, Barrera, Martell, Muñoz, & Lewinsohn, 2011; Martell, Addis, & Jacobson, 2001), and the rumination-focused cognitive–behavior therapy developed by Watkins and colleagues (Watkins, 2016).

Cognitive Content Models

Beck and Bredemeier (2016) propose that depression results when individuals with negative and distorted schemas experience life events that activate those schemas. Beck defines schemas as organized, enduring representations of knowledge and experience, generally formed in childhood, which guide the processing of current information. Beck's model posits that emotions, automatic thoughts, and behaviors are connected and influence one another. Cognitive therapy of depression (Beck, Rush, Shaw, & Emery, 1979) helps the depressed patient modify distorted automatic thoughts and maladaptive behaviors and change or replace the problematic schemas to reduce depressive symptoms and the person's
vulnerability to future episodes of depression. The therapy may also help patients change their life circumstances so as to reduce activation of problematic schemas.

McCullough (2000) proposed a cognitive theory of chronic depression that states that the chronically depressed person lacks "perceived functionality," or the ability to perceive a "contingency relationship between one's behavior and consequences" (p. 71). Without perceived functionality, the person loses the motivation to take action, with the result that he or she suffers a dearth of positive reinforcers and an excess of punishers. To address this deficit, McCullough developed the Cognitive–Behavioral Analysis System of Psychotherapy (CBASP). In CBASP, the therapist guides the patient through detailed examinations (assessment) of specific interpersonal interactions and helps the patient learn to identify and remediate his or her passive and ineffectual behaviors. The goal is to teach patients that they actually do have the power to get what they want in interpersonal transactions.

Cognitive Process Models

A signature characteristic of many forms of psychopathology, including MDD, is repetitive or perseverative thought or negative self-referential processing (NSRP) (e.g., Mennin & Fresco, 2013; Olatunji, Naragon-Gainey, & Wolitzky-Taylor, 2013; Watkins, 2008). The tendency to engage in repetitive negative thinking may reflect a maladaptive cognitive reactivity associated with the inability to disengage from aversive and conflicting emotional and somatic experiences (Borkovec, Alcaine, & Behar, 2004; Mennin & Fresco, 2014; Newman & Llera, 2011; Nolen-Hoeksema, Wisco, & Lyubomirsky, 2008), which in turn further reinforces the use of these self-evaluative processes. NSRPs, in turn, can result in considerable deficits in cognitive and behavioral responding (e.g., Lissek, 2012; Whitmer & Gotlib, 2012), as well as an inferior treatment response and more frequent relapse (e.g., Jones, Siegle, & Thase, 2008). Here, the problem is not so much the content of the thought but, rather, the process of thinking and the individual's rigidity or difficulty regulating where to place his or her attention. Essentially, these processes are enacted to create control and predictability, but instead these individuals can find themselves vacillating between a worried or ruminative mind and chronically distressed body and, subsequently, reinforcing the use of these self-evaluative processes when they are momentarily effective at staving off the aversive experience of strong emotional responses (Borkovec et al., 2004; Mennin & Fresco, 2013,
2014; Newman & Llera, 2011; Nolen-Hoeksema et al., 2008; Olatunji et al., 2013; Watkins, 2008). Perfectionism and self-criticism are additional forms of NSRPs that confer vulnerability for depression, maintain depressive symptoms, and interfere with treatment. Behavioral activation (Martell et al., 2001), cognitive therapy (Beck et al., 1979), and rumination-focused cognitive–behavioral therapy (CBT; Watkins, 2016) target NSRP in MDD.

One biobehavioral capacity associated with reductions in destructive self-referentiality and that can be enhanced with treatment is decentering, defined as a metacognitive capacity to observe items that arise in the mind (e.g., thoughts, feelings, and memories) with healthy psychological distance, greater self-awareness, and perspective-taking (Bernstein et al., 2015; Fresco, Moore, et al., 2007; Fresco, Segal, Buis, & Kennedy, 2007; Safran & Segal, 1990). Bernstein and colleagues (2015) proposed that decentering is composed of three interrelated metacognitive processes: meta-awareness, disidentification from internal experience (i.e., experiencing sensations, emotions, and thoughts from a third-person perspective), and reduced reactivity to thought content (i.e., less impact on attention, emotion, cognitive elaboration, motivation, etc.). Most of the evidence supporting the construct of decentering is derived from a well-validated self-report measure that we describe later in the chapter (Fresco, Moore, et al., 2007). Decentering is associated with acute and enduring treatment effects for patients suffering from MDD (Fresco, Segal, et al., 2007) and generalized anxiety disorder (GAD; with and without MDD) (Hoge et al., 2015; Mennin, Fresco, Heimberg, & O'Toole, 2017; Mennin, Fresco, Ritter, & Heimberg, 2015; Renna, Quintero, Mennin, & Fresco, 2017).

Emotion Models

Emotion models of psychopathology draw from basic and translational findings in affective neuroscience that identify two core systems that regulate thoughts and behaviors (e.g., Gray & McNaughton, 2000). The approach or reward system motivates actions toward goals and rewards, and produces positive emotions such as enthusiasm and pride. By contrast, the security system motivates avoidance of aversive outcomes or punishments and is linked with negative emotions. Optimal reward learning requires us to assign value to possible rewarding and punishing stimuli, make predictions about when and where we might encounter these stimuli, and take behavioral actions that are informed by these predictions (O'Doherty, 2004).

Reward learning is further defined in terms of consummatory pleasure (i.e., "liking"), which refers to the hedonic impact that a reward produces, and anticipatory pleasure (i.e., "wanting"), which refers to the incentive salience associated with a particular reward (Berridge, Robinson, & Aldridge, 2009; Sherdell, Waugh, & Gotlib, 2012). Reward learning is impaired in individuals suffering from MDD. For example, depressed individuals fail to distinguish between options yielding large versus small rewards (Forbes, Shaw, & Dahl, 2007). Similarly, depressed individuals, especially when they are ruminating, are more prone to misconstrue the likelihood and intensity of a potentially punishing situation (Whitmer, Frank, & Gotlib, 2012). Finally, depressed patients, especially when their clinical presentation includes comorbid anxiety disorders, may struggle with the valuation of stimuli in their lives given that most situations are marked with cues for both threat and reward (Stein & Paulus, 2009).

Two additional neurobehavioral systems are commonly impaired in MDD. The default network (DN; e.g., Raichle et al., 2001), which serves autobiographical, self-monitoring, and social cognitive functions, is associated with adaptive and maladaptive forms of self-referential mentation. Psychiatric disorders are often marked by excessive activation of the DN, thereby reducing activation of neural regions associated with executive control (e.g., Whitfield-Gabrieli & Ford, 2012) and emotion regulation (e.g., Brewer et al., 2011; Whitfield-Gabrieli & Ford, 2012). In addition, the salience network (SN; e.g., Craig, 2009; Menon, 2015)—which governs our attention to the external and internal world (Menon & Uddin, 2010), integrates sensory, emotional, and cognitive information, and is associated with optimal communication, social behavior, and self-awareness (Menon, 2015)—is disrupted in many forms of psychopathology, especially when there is excessive activity in the neural regions associated with the DN (e.g., Hamilton, Chen, & Gotlib, 2013; Paulus & Stein, 2010; Yuen et al., 2014). Thus, depression is marked by abnormalities in the interplay of the reward, default, and salience networks, which lead to the clinical features that are commonly the targets of treatment.

This neurobehavioral model of depression opens many doors for clinicians who use empirically supported treatments to treat MDD. The behavioral and cognitive approaches, described previously, all possess intervention principles that target threat and reward deficits (e.g., exposure and behavioral activation), salience network deficits (e.g., cue detection and self-monitoring), and excessive default network activation (e.g., cognitive interventions). In addition, building from a solid foundation of traditional and contemporary CBT principles and informed by basic and translational findings in affect science, emotion regulation therapy (ERT; Fresco, Mennin, Heimberg, & Ritter, 2013; Mennin & Fresco, 2013, 2014) was developed to specifically target the hypothesized neurobehavioral deficits of commonly co-occurring disorders such as GAD and MDD. ERT is a theoretically derived, evidence-based treatment that teaches clients skills of attention and metacognitive regulation so they can develop optimal behavioral repertoires associated with threat and reward learning. ERT has demonstrated promising preliminary clinical efficacy in open-label and randomized clinical trials (Mennin, Fresco, Heimberg, & O'Toole, 2017; Mennin, Fresco, Ritter, & Heimberg, 2015; Renna et al., 2017).

Interpersonal Models

Interpersonal psychotherapy (IPT) was developed by Klerman, Weissman, and their colleagues as a treatment for MDD (Klerman, Weissman, Rounsaville, & Chevron, 1984). The interpersonal model of depression emphasizes the reciprocal relations between biological and interpersonal factors in causing and maintaining depression. The IPT theory proposes that problems or deficits in one or more of four areas of interpersonal functioning (unresolved grief, interpersonal disputes, role transitions, and interpersonal deficits [e.g., social skills deficits or social isolation]) contribute to the onset and/or maintenance of depression, and the IPT therapist intervenes to address the patient's deficits in that area. Lewinsohn's behavioral model and McCullough's CBASP also included proposals that depressed individuals have interpersonal skills deficits, and the therapies based on those models include skills training elements.

Relapse Prevention Models

Depression is a recurrent disorder, and relapse rates are high (Hollon et al., 2006). Mindfulness-based cognitive therapy (MBCT; Segal, Williams, & Teasdale, 2013) is predicated on the premise that intervention principles that are effective in eliminating symptoms of depression may not be ideally suited to prevent future episodes. MBCT posits that previously depressed individuals are vulnerable for relapse or recurrence because dysphoria can reactivate patterns of thinking that maintain and intensify the dysphoric states through escalating and self-perpetuating cycles of ruminative cognitive–affective
processing (Teasdale, 1988, 1997). MBCT combines elements of traditional CBT for depression with components of the mindfulness-based stress reduction program (MBSR) developed by Kabat-Zinn and colleagues (e.g., Kabat-Zinn, 1990) to provide individuals with ways to ward off emotion-cued spirals into rumination. In particular, MBCT seeks to improve formerly depressed patients' focused and flexible attention and ability to decenter (van der Velden et al., 2015).

PURPOSES OF ASSESSMENT

We discuss assessment for diagnosis, for case conceptualization and treatment planning, and for monitoring progress in treatment. The clinician working with a depressed patient is likely to choose one or more of the behavioral, cognitive, emotion-focused, interpersonal, or relapse-prevention models to guide the therapy, and the choice of assessment tools for case conceptualization and treatment planning and progress monitoring will likely depend on the model or models the clinician chooses. Assessment tools for diagnosis, in contrast, are independent of the model guiding treatment. There is significant overlap in the tools we describe for assessing diagnosis, conceptualization and treatment planning, and treatment monitoring. For example, measures of depressive symptoms are useful for diagnosis, conceptualization and treatment planning, and monitoring progress in treatment.

ASSESSMENT FOR DIAGNOSIS

Semi-Structured Interviews

The most frequently used instrument for assigning a diagnosis is the Structured Clinical Interview (SCID), recently updated for DSM-5 (First, Williams, Karg, & Spitzer, 2015). The SCID-5 requires between 60 and 90 minutes to administer and allows the clinician to identify current and lifetime psychiatric disorders. The SCID-5 was fashioned after the traditional interview in which clinicians consider and test several diagnostic hypotheses simultaneously. Each section begins with a YES/NO probe followed by queries that ask for elaborations. This strategy has two main advantages: (1) diagnostic decisions are known to the interviewer during the interview, and (2) interviews are shorter because irrelevant sections are not exhaustively probed. The SCID-5 allows the clinician to assess the lifetime course of the disorder, not just a snapshot at one point in time, and this is particularly important because without a longitudinal assessment, it can be difficult or impossible to distinguish between a unipolar and bipolar mood disorder. The DSM-5 version of the SCID is still relatively new, and studies evaluating its psychometric properties are not yet available. In a study of the use of the SCID to diagnose MDD based on the DSM-IV-TR (American Psychiatric Association, 2000), Ventura (1998) reported high inter-rater agreement for current diagnosis based on the DSM-IV-TR SCID, with an overall weighted κ of .82. Kappas for MDD have been found to be good to excellent (range = .80 to .91; Ventura, 1998). A streamlined clinician version of the SCID-5 is available exclusively from American Psychiatric Publishing (https://www.appi.org/products/structured-clinical-interview-for-dsm-5-scid-5).

The Anxiety and Related Disorders Interview Schedule for DSM-5–Lifetime Version (ADIS-5L; Brown & Barlow, 2014) is a semi-structured interview for the diagnosis of current and past DSM-5 anxiety, mood, obsessive–compulsive, trauma, and related disorders (e.g., somatic symptom and substance use). A 0 to 8 clinician severity rating (CSR) is assigned for each diagnosis based on the severity of the patient's distress about his or her symptoms and the degree of interference in daily functioning due to the symptoms. A CSR of 4 or higher is considered clinically significant. A disorder is designated as the principal diagnosis if it is given a CSR that is at least one point higher than any other clinically significant diagnosis. If the goal of the interview is simply to confirm the presence of current and lifetime diagnoses, the ADIS-5L takes roughly the same amount of time to administer as the SCID-5. However, the clinician may want to make use of the extensive probes for assessing the specific impairment associated with a particular disorder, the client's strengths, hypothesized etiological factors and situational antecedents, and a "Diagnostic Timeline" approach to track the onset, remission, and temporal ordering of diagnoses that are unique features of the ADIS-5L. Studies evaluating the psychometric properties of the ADIS-5L are not yet available, but as detailed in Table 7.1, the norms of the ADIS-IV are adequate; the inter-rater reliability, content validity, construct validity, and validity generalization are good; and clinical utility is excellent.

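The clinician severity rating conventions just described reduce to a simple decision rule. The short Python sketch below is only an illustration of that bookkeeping; the helper function and data structure are our own and are not part of the published ADIS-5L materials. Diagnoses with a CSR of 4 or higher are treated as clinically significant, and the principal diagnosis is the clinically significant diagnosis whose CSR exceeds all others by at least one point.

```python
# Illustrative sketch of the ADIS-5L clinician severity rating (CSR) conventions described
# above; this helper is hypothetical and is not part of the published interview.

def summarize_csrs(csrs, threshold=4):
    """csrs: mapping of diagnosis label -> CSR on the 0-8 scale."""
    significant = {dx: csr for dx, csr in csrs.items() if csr >= threshold}
    principal = None
    if significant:
        ranked = sorted(significant.items(), key=lambda item: item[1], reverse=True)
        # Principal diagnosis: highest CSR, at least one point above the next highest.
        if len(ranked) == 1 or ranked[0][1] - ranked[1][1] >= 1:
            principal = ranked[0][0]
    return significant, principal

significant, principal = summarize_csrs(
    {"Major depressive disorder": 6, "Generalized anxiety disorder": 4, "Specific phobia": 2})
print(significant)  # {'Major depressive disorder': 6, 'Generalized anxiety disorder': 4}
print(principal)    # Major depressive disorder
```
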
Table 7.1 Ratings of Instruments Used for Diagnosis

Instrument            Norms   Internal      Inter-Rater   Test–Retest   Content    Construct   Validity          Treatment     Clinical   Highly
                              Consistency   Reliability   Reliability   Validity   Validity    Generalization    Sensitivity   Utility    Recommended

Diagnosis
SCID-5/SCID-5-PD      A       NA            G             NA            G          G           G                 E             E
ADIS-5L               A       NA            G             NA            G          G           G                 E             E

Depression Severity
QIDS                  E       E             NA            E             E          E           E                 E             E
PHQ-9                 E       E             NA            E             E          E           E                 E             E

Note: SCID-5 = Structured Clinical Interview for DSM-5; ADIS-5L = Anxiety and Related Disorders Interview Schedule for DSM-5–Lifetime Version; QIDS = Quick Inventory for Depression Severity; PHQ-9 = Patient Health Questionnaire 9; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

assesses suicidal ideation and intent, and it is useful for risk assessment and intervention. A 10th item (nonscored) assesses the degree of functional interference from depressive symptoms. Clinical interpretation guidelines categorize a score of 0 to 4 as normal, 5 to 9 as mild, 10 to 14 as moderate, 15 to 19 as moderately severe, and 20+ as severe depressive symptoms. The psychometric properties of the PHQ-​9 have been evaluated in two studies of 3,000 patients in eight primary care clinics (Spitzer et  al., 1999)  and 3,000 patients in seven obstetric clinics (Spitzer, Williams, Kroenke, Hornyak, & McMurray, 2000). Scores on the PHQ-​9 have demonstrated high internal consistency, test–​retest reliability, and diagnostic validity (Kroenke et  al., 2001), and the measure shows good specificity and sensitivity in grading and diagnosing depression severity (Pettersson, Boström, Gustavsson, & Ekselius, 2015). It is available copyright-​free at http://​ www.phqscreeners.com. In addition to the traditional paper-​and-​pencil method, measures of depressive symptoms can be administered electronically with software programs or through mobile apps downloaded from the web. Electronic assessment can offer advantages, such as automated scoring and charting of the data and remote data collection. However, limitations include risks of loss of privacy and confidentiality. In addition, if patients complete depressive inventories remotely, the clinician must have a plan for alerting the patient of the need to contact the clinician directly if immediate intervention is needed to address suicidality. Overall Evaluation Excellent measures with strong psychometric properties are available for diagnostic assessment of the depressed patient. Although it is tempting to minimize or omit


diagnostic assessment altogether, we encourage the clinician to take the time to do this because diagnosis has treatment implications. In particular, it is important to distinguish between MDD, a unipolar mood disorder for which psychotherapy alone is often sufficient, and bipolar mood disorder, which generally requires pharmacotherapy plus psychotherapy (Craighead et al., 2015).
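Because the PHQ-9 and QIDS-SR scoring rules described above are simple arithmetic, clinicians who administer the scales electronically can automate them. The sketch below (in Python) sums the nine scored PHQ-9 items and maps the total onto the published interpretation bands, then applies the QIDS-SR screening cutoff of 13 reported by Lamoureux et al. (2010) to an already-computed QIDS-SR total. The function names are illustrative assumptions, and the snippet is a minimal illustration of the scoring logic rather than official scoring software.

```python
def score_phq9(items):
    """Sum the nine scored PHQ-9 items (each rated 0-3) and label severity.

    `items` is a sequence of nine integers; the unscored tenth item
    (functional interference) is reviewed separately by the clinician.
    """
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("Expected nine item scores, each rated 0-3.")
    total = sum(items)
    if total <= 4:
        severity = "normal"
    elif total <= 9:
        severity = "mild"
    elif total <= 14:
        severity = "moderate"
    elif total <= 19:
        severity = "moderately severe"
    else:
        severity = "severe"
    return total, severity


def qids_sr_flags_mdd(total, cutoff=13):
    """Apply the screening cutoff of 13 (sensitivity .77, specificity .82 in
    Lamoureux et al., 2010) to a QIDS-SR total computed per the scale's own
    scoring rules (not reproduced here)."""
    return total >= cutoff


total, severity = score_phq9([2, 2, 1, 2, 1, 1, 2, 1, 0])
print(total, severity)          # 12 moderate
print(qids_sr_flags_mdd(14))    # True
```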

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Assessment for case conceptualization and treatment planning requires two types of translation. One is from disorder-​level (and sometimes symptom-​level) conceptualizations and treatments to the case-​level conceptualization and treatment plan. Most of the models we reviewed previously are conceptualizations and therapies for the disorder of MDD. A few of the models also provide conceptualizations and interventions for symptoms (e.g., the BA formulation of rumination as avoidance behavior). A  conceptualization (or formulation) at the level of the case is a hypothesis about the causes of all of the patient’s symptoms, disorders, and problems and how they are related, and the treatment plan describes all of the therapies the patient is receiving for those symptoms, disorders, and problems. The second translation is from nomothetic to idiographic. A nomothetic formulation and treatment plan is stated in general terms (e.g., that depression results from a dearth of positive reinforcers and can be treated by increasing the individual’s positive reinforcers). An idiographic case formulation and treatment plan describes a particular individual. Case Conceptualization A case conceptualization is a hypothesis about the mechanisms causing and maintaining a particular patient’s symptoms, disorders, and problems; the precipitants of the symptoms/​disorders/​problems; and the origins of the mechanisms. We focus here on psychological mechanisms, but the formulation might also include biological mechanisms. We describe tools and strategies for assessing all the elements of the formulation. Symptoms/​Disorders/​Problems The case conceptualization accounts for all of the patient’s symptoms, problems, and disorders. We


recommend that the clinician conduct a broad-​based assessment of the following domains: psychiatric symptoms and disorders and treatment difficulties (e.g., multiple providers or inadequate treatment); medical symptoms and disorders and treatment difficulties; and interpersonal, occupational/​school/​homemaking satisfaction and functioning, financial difficulties, housing difficulties, and legal problems. To obtain a comprehensive diagnostic assessment, the clinician can use the measures described in the Assessment for Diagnosis section. Additional tools for assessing many of the depressed patient’s comorbid psychiatric disorders, and symptoms that may not meet full criteria for a disorder, are described in other chapters of this volume. Many MDD patients have a medical problem (Moussavi et  al., 2007). MDD and medical problems can cause or exacerbate one another, and MDD often impedes the patient’s ability to obtain and adhere to treatment for the medical problems. Thus, we recommend that clinicians ask their patients to obtain a physical examination if they have not had one in the past year. MDD is also commonly comorbid with psychosocial and environmental problems, such as marital problems, occupational dissatisfaction, and similar, which can cause, exacerbate, and/​or result from MDD. Lack of satisfaction and difficulties functioning in domains such as work, relationships, and leisure can appear on the problem list element of the case conceptualization and/​or might be precipitants. We recommend three tools that assess functioning difficulties. The first is the Outcome Questionnaire-​45 (OQ-​45; Lambert et al., 1996), which is described in the section titled Assessment for Treatment Monitoring and Treatment Outcome. The second is the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0; World Health Organization [WHO], 2001), a 36-​item self-​report assessment of patient difficulties during the past 30 days in six domains: understanding and communicating, getting around, self-​care, getting along with people, life activities (household/​school/​work), and participation in society. The measure was designed for both initial assessment and progress monitoring. The WHODAS 2.0 is designed to be simple and relatively quick to administer (5–​20 minutes, depending on whether the 12-​or the 36-​ item form is used). The WHODAS 2.0 has been administered to diverse global populations and has demonstrated excellent test–​ retest reliability (intraclass correlation coefficient  =  .98), internal consistency, and concurrent


validity, both with similar measures and with clinician ratings of functioning (Üstün et al., 2010). This measure is free for clinicians to reproduce and use with their clients. Scoring guidelines are provided both in the DSM-5 and on the WHO website (http://www.who.int/classifications/icf/whodasii/en). Third, item 10 of the PHQ-9 provides a quick assessment of global functioning by inquiring about the degree of functional interference of the individual's depressive symptoms. Ratings range from not difficult at all to extremely difficult. Psychological Mechanisms We describe here and summarize in Table 7.2 several measures for assessing the mechanisms from many of the theories of depression reviewed above. Behavioral Mechanisms The Activity Schedule (presented originally in Beck et al., 1979; see also pp. 126–127 of Persons, Davidson, & Tompkins [2001] for a version that clinicians may reproduce) is essentially a week-long hourly calendar in which patients log or plan activities. It is ideal for assessing how the patient spends time, as well as for tracking behavioral homework assignments, such as recording pleasant activities.
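For clinicians who want an electronic rather than paper version of such a schedule, the sketch below shows one way the week-long hourly structure might be represented; the field names, the 0–10 pleasure rating, and the helper functions are illustrative assumptions rather than part of the published form.

```python
from collections import defaultdict

# Illustrative electronic stand-in for a week-long hourly activity schedule:
# the patient logs (or plans) an activity for a given day and hour, optionally
# with a pleasure rating. Structure and field names are assumptions.
activity_schedule = defaultdict(dict)   # {day: {hour: {"activity": ..., "pleasure": ...}}}

def log_activity(day, hour, activity, pleasure=None):
    """Record what the patient did (or plans to do) during one hour."""
    activity_schedule[day][hour] = {"activity": activity, "pleasure": pleasure}

def hours_logged(day):
    """Number of hours with an entry -- handy when reviewing homework completion."""
    return len(activity_schedule[day])

log_activity("Monday", 9, "walked to the cafe", pleasure=6)
log_activity("Monday", 14, "stayed in bed", pleasure=1)
print(hours_logged("Monday"))   # 2
```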

The Pleasant Events Schedule (PES; MacPhillamy & Lewinsohn, 1982), published in Lewinsohn, Munoz, Youngren, and Zeiss (1986), is a self-report inventory of 320 potentially reinforcing activities. Respondents rate each event for its frequency of occurrence during the past 30 days on a 3-point scale ranging from 0 (not happened) to 2 (happened often; seven or more times) and for its pleasantness on a 3-point scale ranging from 0 (not pleasant) to 2 (very pleasant). The PES scores have good reliability and adequate to good validity (Grosscup & Lewinsohn, 1980; MacPhillamy & Lewinsohn, 1982; Nezu, Ronan, Meadows, & McClure, 2000). The PES and supporting materials can be downloaded free of charge at http://www.ori.org/scientists/peter_lewinsohn. The Snaith–Hamilton Pleasure Scale (SHAPS; Snaith et al., 1995) is a 14-item, self-report measure designed to assess an individual's hedonic capacity. It assesses "liking" as opposed to "wanting" (discussed previously). The SHAPS asks the patient to rate his or her ability to experience pleasure in the past few days with items such as "I would enjoy my favorite television or radio program" or "I would enjoy being with my family or close friends." Ratings range from definitely agree to strongly disagree. Nakonezny et al. (2015) found that in a large sample of adults meeting criteria for MDD, SHAPS scores demonstrated high internal consistency (α = .91). The measure showed good construct validity; it was significantly

Table 7.2  Ratings for Instruments Used for Case Conceptualization and Treatment Planning

The table rates each instrument on norms, internal consistency, inter-rater reliability, test–retest reliability, content validity, construct validity, validity generalization, treatment sensitivity, and clinical utility, and indicates whether it is highly recommended. The instruments rated, grouped under Symptoms/Disorders/Problems and under Behavioral, Cognitive, Emotion-Focused, and Interpersonal Mechanisms, are the WHODAS 2.0, PES, SHAPS, PTQ, FMPS, EQ, ACS, ERQ, AIM, and SAS-SR.

Note: WHODAS 2.0 = World Health Organization Disability Assessment Schedule 2.0; PES = Pleasant Events Schedule; SHAPS = Snaith–Hamilton Pleasure Scale; PTQ = Perseverative Thinking Questionnaire; FMPS = Frost Multidimensional Perfectionism Scale; EQ = Experiences Questionnaire; ACS = Attentional Control Scale; ERQ = Emotion Regulation Questionnaire; AIM = Affect Intensity Measure; SAS-SR = Social Adjustment Scale–Self-Report.


negatively correlated (r = –.65) with ratings of quality of life. SHAPS totals were only modestly positively correlated with four measures of depressive symptoms (r = .48 to .55), a finding that may indicate that hedonic capacity reflects a "related but distinct construct from depression" (Nakonezny et al., 2015, p. 6). The measure was sensitive to change (Snaith et al., 1995). The SHAPS is published in Snaith et al. (1995), and the publisher gives permission to readers to reproduce the scale from the journal article for personal use or research. The Perseverative Thinking Questionnaire (PTQ; Ehring et al., 2011) is a 15-item self-report scale that assesses content-neutral repetitive negative thinking, including rumination and worry. The PTQ assesses five characteristics of perseverative thinking: repetitive ("The same thoughts keep going through my mind again and again"), intrusive ("Thoughts come to my mind without me wanting them to"), difficult to disengage from ("I can't stop dwelling on them"), unproductive ("I keep asking myself questions without finding an answer"), and capturing mental capacity ("My thoughts prevent me from focusing on other things"). Scores on the PTQ have demonstrated excellent internal consistency (α = .95 in both the German and English language versions), satisfactory test–retest reliability (r = .69 at 4-week retest for the German language version of the scale), good convergent validity compared to similar measures of rumination or worry, and good predictive validity when correlated with measures of anxiety and depression (Ehring et al., 2011). The PTQ is reproduced in the appendix of Ehring et al. (2011), which is available online at http://www.sciencedirect.com/science/article/pii/S000579161000114X; clicking the link on that web page that reads "under a creative commons license" provides access to the PTQ under a Creative Commons license. To identify the antecedents and consequences of a target behavior, and thereby the function the behavior serves, clinicians can devise a paper-and-pencil or a computer-based/smartphone-based log. The patient can track each instance of the target behavior (e.g., exacerbation of depressed mood, rumination, or suicidality), the antecedents of the behavior (events, thoughts, emotions, bodily sensations, and behaviors), and the consequences of the behavior (again, events, thoughts, emotions, bodily sensations, and behaviors), and then review the log with the therapist to develop a hypothesis about the function the target behavior might serve. Guidance on collecting assessment data for a functional analysis is provided in multiple sources, including Haynes, O'Brien, and Kaholokula (2011) and Kazdin (2013).
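A minimal sketch of such a log appears below; each entry mirrors the antecedent-behavior-consequence structure just described, and the summary step simply tallies the most frequently recorded antecedents as a starting point for discussing the behavior's possible function. The class and field names are illustrative assumptions, not a standard instrument.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

# Illustrative antecedent-behavior-consequence (ABC) log for a functional analysis.
# Field names and the summary step are assumptions for illustration only.

@dataclass
class ABCEntry:
    behavior: str                 # e.g., "rumination episode"
    antecedents: List[str]        # events, thoughts, emotions, sensations before the behavior
    consequences: List[str]       # what followed the behavior

@dataclass
class ABCLog:
    entries: List[ABCEntry] = field(default_factory=list)

    def add(self, behavior, antecedents, consequences):
        self.entries.append(ABCEntry(behavior, antecedents, consequences))

    def common_antecedents(self, n=3):
        """Tally antecedents across entries to suggest hypotheses about the behavior's function."""
        counts = Counter(a for e in self.entries for a in e.antecedents)
        return counts.most_common(n)

log = ABCLog()
log.add("rumination", ["criticism at work", "thought: 'it's my fault'"], ["skipped evening walk"])
log.add("rumination", ["criticism at work"], ["cancelled plans with friend"])
print(log.common_antecedents())  # [('criticism at work', 2), ...]
```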


Cognitive Mechanisms A self-​ monitoring diary, such as the Daily Record of Dysfunctional Thoughts (Beck et al., 1979) or the forms provided by Greenberger and Padesky (1995) or Persons et al. (2001), can be used to assess the automatic thoughts described by Beck’s theory. Emotions, behaviors, and automatic thoughts are typically obtained by simply asking the patient to report them while recalling the specific concrete event that triggered them. J.  S. Beck (1995) offered strategies for eliciting this information when a direct and straightforward approach fails, including asking patients to report images and asking them to vividly imagine and re-​create the event that triggered negative painful emotions. The Frost Multidimensional Perfectionism Scale (FMPS; Frost, Marten, Lahart, & Rosenblate, 1990)  is a 35-​item measure grouped into six subscales:  Concern Over Mistakes, Personal Standards, Parental Expectations, Parental Criticism, Doubts About Actions, and Organization. Respondents rate on a scale ranging from 1 (strongly disagree) to 5 (strongly agree) such items as “If I fail at work/​school, I am a failure as a person” or “Even when I  do something very carefully, I  often feel that it is not quite right.” The FMPS scores have demonstrated good internal consistency (α = .77 to .93; Frost et  al., 1990)  and good convergent validity compared to other similar measures of perfectionism (Stober, 2000). The measure is reprinted in Appendix B of Antony, Orsillo, and Roemer (2001). The Experiences Questionnaire (EQ; Fresco et  al., 2007)  is an 11-​item self-​report measure of decentering. This measure asks the patient to rate the frequency with which he or she is currently having certain experiences, such as “I remind myself that thoughts aren’t facts” or “I can observe unpleasant feelings without being drawn into them.” Ratings range from 1 (never) to 5 (all the time). Fresco et al. used both exploratory and confirmatory factor analysis techniques to examine the EQ factor structure in two large samples of college students and a sample of depressed patients. Scores on the measure showed good internal consistency, ranging from α  =  .81 to .90, and good concurrent and discriminant validity. The EQ has consistently shown sensitivity to treatment change in trials for MDD and GAD (Fresco, Segal, et al., 2007; Hoge et al., 2015; Mennin, Fresco, Ritter, & Heimberg, 2015; Mennin, Fresco, et  al., 2017; Renna et  al., 2017). The EQ is also correlated with a recently developed objective measure of distancing that complements the assessment of decentering (Shepherd, Coifman, Matt, & Fresco,


2016). The EQ is available from Fresco upon request via e-mail ([email protected]). The Attentional Control Scale (ACS; Derryberry & Reed, 2002) is a 20-item self-report measure that assesses an individual's ability to focus and shift attention. The items of the ACS are divided among the capacities to (a) focus attention ("When concentrating, I can focus my attention so that I become unaware of what's going on in the room around me"), (b) shift attention ("It is easy for me to alternate between two different tasks"), and (c) control thought flexibly ("I can become interested in a new topic very quickly when I need to"). The client rates these items on a scale of 1 (almost never) to 4 (always); higher scores indicate greater overall attentional control. ACS scores have been found to be negatively correlated with trait anxiety and positively correlated with indices of positive emotionality, such as extraversion (Derryberry & Reed, 2001). Scores on the measure have demonstrated good internal consistency (α = .88; Derryberry & Reed, 2001), good content validity, and adequate test–retest reliability (r = .61; Fajkowska & Derryberry, 2010). The ACS is available in Derryberry and Reed (2002) and is free for clinicians. Emotion-Focused Mechanisms The Emotion Regulation Questionnaire (ERQ; Gross & John, 2003) is a 10-item rationally derived measure of two aspects of emotion regulation: reappraisal and suppression. The reappraisal subscale, consisting of 6 items, assesses the ability to modify or change the emotions one experiences (e.g., "I control my emotions by changing the way I think about the situation I'm in"). The suppression subscale, consisting of 4 items, assesses the ability to avoid or prevent the expression of emotions (e.g., "I control my emotions by not expressing them"). Fresco et al. (2007) reported that internal consistency was good for scores on both the reappraisal subscale (α = .84) and the suppression subscale (α = .82). The reappraisal scale was significantly and positively correlated with decentering (r = .25), but it was uncorrelated with depression symptoms (r = .14) and depressive rumination (r = .14). Conversely, the suppression subscale was significantly and negatively correlated with decentering (r = –.31) and significantly and positively correlated with depression symptoms (r = .39) and depressive rumination (r = .31). The ERQ is available free on the Internet at https://www.ocf.berkeley.edu/~johnlab/measures.htm. The Affect Intensity Measure (AIM; Larsen, 1984) is a self-report measure designed to assess the intensity of an

individual’s characteristic emotional reactions to typical life events. The items of the AIM describe such events as “I get upset easily” or “When I’m happy, I feel like I’m bursting with joy.” The individual rates how often he or she experiences such reactions on a scale from 1 (never) to 6 (always). Weinfurt, Bryant, and Yarnold (1994) conducted factor analyses and described the four basic factors of the AIM as positive affectivity, negative reactivity, negative intensity, and serenity (or positive intensity). Rubin, Hoyle, and Leary (2012) found that scores for items comprising the negative reactivity and negative intensity factors were positively correlated with measures of neuroticism, negative affect, and depression and negatively correlated with self-​compassion. The AIM scores have good internal consistency, test–​retest reliability, and criterion-​related validity (Larsen, Diener, & Emmons, 1986). The scale is available to clinicians and researchers for free at http://​internal.psychology.illinois.edu/​~ediener/​AIM. html. Interpersonal Mechanisms Weissman and Bothwell (1976) developed the Social Adjustment Scale–​Self-​Report (SAS-​SR), a 54-​item self-​ report measure that assesses six social role domains: work/​ homemaker/​student, social and leisure activities, relationships with extended family, marital partner role, parental role, and role within the family unit. Internal consistency of scores on the measure has been found to be adequate (α =.74). The measure has good known-​groups validity, distinguishing samples from the community, patients with depression, and patients with schizophrenia from one another on the basis of total score. The SAS-​SR is available for purchase from Multi-​Health Systems, Inc. (https://​ www.mhs.com/​MHS-​Assessment?prodname=sas-​sr). Precipitants Precipitants of episodes of MDD can be internal, external, biological, or psychological stressors or some combination of these. The WHODAS 2.0 and the Social Adjustment Scale, described previously, can be used to assess precipitants. The clinician can also use the illness history timeline as described in Frank (2005) to identify events that precipitated episodes of illness. Origins The origins part of the formulation offers a hypothesis about how the patient learned or acquired the



hypothesized mechanisms of the formulation. Origins can be one or more external environmental events (e.g., the death of a parent or early abuse or neglect), cultural factors, or biological factors (e.g., an unusually short stature that might elicit teasing from peers), including genetics. Information about origins can point to mechanism hypotheses (e.g., early abuse can lead to views of self as bad or worthless). To generate hypotheses about how the patient acquired the conditioned maladaptive responses, learned the faulty schemas, or developed an emotion regulation difficulty, the therapist can conduct a clinical interview that asks the patient to identify key events and factors in his or her upbringing and development, including early trauma, neglect, and abuse (e.g., Wiersma et al., 2009) and early loss, that are known to serve as vulnerability factors for depression. In addition, the clinician will want to obtain a family history of depression and other psychiatric disorders, which can shed light on both biological and psychosocial causes of the patient’s symptoms.

Developing an Initial Case Conceptualization

After assessing all the elements of the case conceptualization using the methods described previously, the clinician works with the patient to build a model describing how all the elements are related. The model is a hypothesis, and one that is revised frequently as treatment proceeds. Figure 7.1 provides an example for the case of Thea that was developed using this strategy, with notes about some of the standardized assessment tools that were used to develop the formulation of her case. Alternate strategies for developing a case conceptualization have also been developed. Kuyken, Fothergill, Musa, and Chadwick (2005) showed that clinicians who used the method described by J. S. Beck (1995) to develop a case conceptualization agreed fairly well with one another and with a benchmark formulation created by Judith Beck when they were given the task of identifying the patient's presenting problems, but agreement was worse when the clinicians were called on to

FIGURE 7.1  Conceptualization of the Case of Thea. The formulation links Thea's origins (sixth of seven children; mother died when Thea was 11; father self-involved and alcoholic) and the precipitant (loss of an important relationship) to the hypothesized mechanisms of self-criticism and repetitive negative thinking ("It's my fault"; "I shouldn't need nurturing. I'm a grown woman."), taking no action to get her needs met, and loss of reinforcers, which are linked to her depressive symptoms. Key of corresponding measures: 1 = Perseverative Thinking Questionnaire (repetitive negative thinking); 2 = Patient Health Questionnaire and Quick Inventory of Depressive Symptomatology (depressive symptoms); 3 = Activity Schedule and Pleasant Events Schedule (loss of reinforcers).


make inferences (e.g., about the patient’s schemas). In an initial assessment of the psychometric properties of the Collaborative Case Conceptualization Rating Scale (CCC-​RS) developed by Christine Padesky, Kuyken et al. (2016) reported that the scale had excellent internal consistency, split-​half, and inter-​rater reliability and that the scores were moderately correlated with other measures of related phenomena. The Treatment Plan A treatment plan includes several elements:  the goals of treatment; the frequency and modalities of treatment provided by the clinician who is writing the treatment plan; and adjunct therapies, if any, that are provided by other clinicians. We describe tools for assessing treatment goals and progress toward the goals in the section titled Assessment for Treatment Monitoring and Treatment Outcome. Overall Evaluation Many psychometrically sound standardized measures, described previously, are available to assess patients’ symptoms and problems and the psychological mechanisms described by the major current evidence-​based theories of depression in order to develop an idiographic case conceptualization. As discussed previously, the clinician may also elect to use idiographic tools such as a log to monitor antecedents and consequences of target behaviors in order to develop a functional analysis of a problem behavior or symptom. However, the psychometric qualities of idiographic assessment tools are rarely studied (Haynes & O’Brien, 2000), and it can also be challenging for the clinician to incorporate nomothetic data into an idiographic formulation. Figure 7.1, which describes the case of Thea, provides an illustration of the clinician’s use of nomothetic measures to assist in developing the formulation of the case. Additional details are provided in Persons, Brown, and Diamond (in press). Another challenge is that there is little information about the reliability and validity of the case conceptualization, although contributions in this area are increasing (Bucci, French, & Berry, 2016; Persons & Hong, 2016). To strengthen their idiographic assessment data and the conclusions they draw from them, we recommend that clinicians rely on basic principles of behavioral assessment (Haynes et al., 2011) and collect data (as described in the next section) to test their formulation hypotheses and monitor treatment progress for each case they treat.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

As therapy proceeds, the therapist monitors the outcome of therapy to evaluate the patient’s progress and identify the need for a change in the treatment plan if the patient is not responding. The therapist also monitors the process of therapy to evaluate whether the therapy is being delivered as planned and the targeted psychological mechanisms are changing. Monitoring Outcome To monitor changes in depressive symptoms during treatment, we recommend the QIDS-​SR (described previously) and the Depression Anxiety Stress Scales (DASS; described later in this section) because they are brief, free, and have been demonstrated to have treatment sensitivity. Whatever tool the clinician uses to monitor outcome, it is essential to use it starting in the very first session because there is good evidence that a large proportion of the change in depressive symptoms happens very early in treatment (Ilardi & Craighead, 1994), and some evidence that patients who do not show early change (Crits-​ Christoph et  al., 2001)  or who remain severely symptomatic at week 4 of treatment (Persons & Thomas, 2016)  are very unlikely to remit. Evidence that sudden gains, a large shift in symptoms between one session and the next, predict outcome (Aderka, Nickerson, Boe, & Hoffman, 2012) also highlights the usefulness of monitoring outcome at every session. The clinician likely will also want to monitor symptoms of anxiety, substance use, and other comorbid difficulties identified as goals to change during treatment. Sources of measures for this purpose include other chapters in this volume, Nezu et al. (2000), and Beidas et al. (2014). The DASS (Lovibond & Lovibond, 1995)  is a self-​ report scale that includes three subscales assessing symptoms of depression (low positive affect, hopelessness, and anhedonia; e.g., “felt downhearted and blue” and “difficult to work up the initiative to do things”), anxiety (panic and physiological arousal; e.g., “felt I was close to panic” and “trembling”), and stress (high negative affect; e.g., “hard to wind down” and “rather touchy”). Respondents rate each item to reflect how much it applies to their experience over the preceding week on a Likert scale ranging from 0 (“did not apply to me at all”) to 3 (“applied to me very much”). The scale is available in two versions, one with 21 items and one with 42 items. The DASS is quick to complete,


suitable for most adult outpatients, and responsive to changes due to treatment (Brown, Chorpita, Korotitsch, & Barlow, 1997). The DASS scores have been reported to have good test–​retest reliability, high internal consistency, and adequate convergent and discriminant validity with other measures of anxiety and depression (Antony, Bieling, Cox, Enns, & Swinson, 1998; Brown et  al., 1997). The three subscales measure largely independent constructs, which is consistent with the tripartite model (Clark & Watson, 1991)  on which the DASS is based (Brown et al., 1997). The measure is in the public domain. Detailed information can be found in the DASS manual (Lovibond & Lovibond, 1995) as well as at http://​www2.psy.unsw.edu. au/​groups/​dass. The measure’s sensitivity to change and coverage of the three domains of positive affect, negative affect, and physiological arousal/​panic make it especially useful for monitoring progress. Its main weakness as a progress-​monitoring tool for the depressed patient is the fact that it does not assess suicidality. Combined measures of symptoms and functioning have been developed to monitor change during psychotherapy for adult psychiatric patients receiving treatment for any problem or disorder, including depression. The most studied of these is the Outcome Questionnaire (OQ-45;  Lambert et  al., 1996), a 45-​ item self-​ report scale that assesses subjective discomfort, interpersonal relations, social role performance, and positive aspects of satisfaction and functioning. The measure includes an item that assesses suicidality, which is particularly important when working with depressed patients. Respondents answer each question in the context of their experience during the past week using a 5-​point Likert scale. The scoring manual or software package classifies each client, at each assessment point, as an improver, nonresponder, or deteriorator based on benchmarking data from a very large sample of clients. The software tool plots the score over time. Internal consistency for a sample of 504 Employee Assistance Program clients was .93 (Lambert et  al., 1996). The total score on the measure has good test–​retest reliability (.84) over an interval of 3 weeks for a sample of 157 undergraduates. The measure is sensitive to change in clients and stable in untreated individuals (Vermeersch, Lambert, & Burlingame, 2000). The measure has good treatment utility, as Lambert and Shimokawa (2011) have shown that psychotherapy patients have better treatment outcome when clinicians use the information to adjust treatment as necessary (i.e., when the patient is classified as a nonresponder or deteriorator). Using the Clinical


Support Tool that the measure provides to help the clinician assess factors that are known to be tied to poor outcome of psychotherapy (the therapeutic alliance, social support, and the patient’s readiness for change) has been shown to lead to improved outcomes of cases classified as deteriorators (Whipple et al., 2003). Measures that assess a broad spectrum of the adult patient’s treatment goals and monitor progress toward the goals, and have been shown to be psychometrically sound, are rare. We located two measures:  one that was designed for this purpose and one that was designed for monitoring treatment progress in youths. Goal Attainment Scaling (GAS; Kiresuk & Sherman, 1968) measures changes in idiographic goals due to mental health treatment. GAS calls for patient and therapist to identify, at the outset of treatment, three to five goals that will be the focus of treatment, and the expected level of progress on each goal, and to evaluate later in treatment whether the expected progress has been made. GAS is widely used in program evaluation, has both nomothetic and idiographic features, and allows for assessment of affirmatives (goals and objectives that are positively valued by the patient). Limitations of the measure include the fact that the GAS measures the amount of change relative to what was expected or predicted, and its psychometric properties are not consistently impressive (Kiresuk, Smith, & Cardillo, 1994). The Top Problems measure was created by Weisz et al. (2011) to identify problems and monitor severity of those problems over the course of treatment in a sample of multiply comorbid youths receiving psychotherapy for anxiety, mood, and/​or conduct problems. Weisz et al. reported that the measure had good psychometric properties in their sample, and the measure appears easy to adapt to adults. Monitoring Process Process has two parts: the elements of the therapy that are viewed as important to producing changes in mechanisms and symptoms and the psychological mechanisms that are hypothesized to cause and maintain the symptoms of depression (e.g., engagement in pleasant activities and self-​distance). Elements of the Therapy The therapist can use his or her clinical record to document and monitor the degree to which the treatment plan is being delivered as planned (e.g., to monitor the


frequency of sessions and the patient’s participation in recommended adjunctive therapies). Homework compliance has been shown to predict outcome of psychotherapy (Kazantzis, Whittington, & Dattilio, 2010), indicating the importance of monitoring that aspect of therapy. To monitor homework, the therapist can work with the patient to develop a paper-​and-​pencil or other tool, locate an app, or develop his or her own tracking form. The Therapeutic Relationship A large body of evidence shows that the therapeutic relationship predicts outcome of psychotherapy (Norcross, 2011)  and thus points to the importance of monitoring this aspect of treatment. We review two measures of the therapeutic relationship. The Revised Helping Alliance Questionnaire (HAq-​ II; Luborsky et  al., 1996)  is a 19-​item self-​report scale assessing the alliance between patient and therapist. Both patient and therapist versions of the scale have been developed. Internal consistency for both patient and therapist versions of the scale has been found to be excellent (α  =  .90 to .93), and test–​retest reliability of the patient version has been found to be r = .78 over three sessions (Luborsky et al., 1996). Concurrent validity demonstrated by correlations between the HAq-​II and the California Psychotherapy Alliance Scale ranged between r = .59 and .71. In a demonstration of the measure’s treatment utility, Whipple et al. (2003) showed that outcome of psychotherapy (on the OQ-​45) was positively related to the clinician’s obtaining weekly feedback on the patient’s HAq-​II scores. The HAq-​II is available for download on the Internet at http://​www.med.upenn.edu/​cpr/​instruments.html.
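Given the Whipple et al. (2003) finding that weekly feedback on patients' HAq-II scores was associated with better outcomes, some clinicians may wish to chart alliance ratings alongside symptom scores and flag sessions where the alliance drops. The sketch below illustrates one such check; the five-point drop rule is an arbitrary threshold chosen for illustration, not a validated cutoff.

```python
# Illustrative tracker for weekly alliance ratings (e.g., HAq-II totals).
# The flagging rule (a drop of 5+ points from the prior session) is an
# arbitrary threshold chosen for illustration, not a published criterion.

def flag_alliance_drops(weekly_scores, drop=5):
    """Return session numbers (1-based) where the alliance score fell by `drop` or more."""
    flagged = []
    for i in range(1, len(weekly_scores)):
        if weekly_scores[i - 1] - weekly_scores[i] >= drop:
            flagged.append(i + 1)
    return flagged

print(flag_alliance_drops([88, 90, 84, 83, 76]))  # [3, 5]
```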

The Session Alliance Inventory is a six-item measure developed by Falkenström, Hatcher, Skjulsvik, Larsson, and Holmqvist (2015) and is designed for administration at every psychotherapy session. The measure is a shortened version of Horvath and Greenberg's (1989) Working Alliance Inventory. Falkenström et al. reported that the measure has good psychometric properties (Table 7.3), and Falkenström, Ekeblad, and Holmqvist (2016) showed that improvements in the alliance during one therapy session predicted reductions in depressive symptoms in the subsequent therapy session. The measure is published in Falkenström et al. (2015). Psychological Mechanisms The measures described in the section titled Psychological Mechanisms can be used to monitor changes in mechanisms, particularly the measures that are rated in Table 7.3 as sensitive to change. Simple counts and logs can also be used. For example, when Thea was working in therapy on increasing her positive thoughts about herself and her experiences, she tallied them on a golf-score counter each day and wrote the daily tally on a log that she brought to her therapy session. Overall Evaluation Many measures are available to monitor the outcome and process of treatment. Monitoring both process and outcome allows the therapist to test hypotheses about the relationships between process and outcome that guide clinical decision-making. For example, the therapist can assess whether an increase in a depressed patient's

Table 7.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

The table rates each instrument on norms, internal consistency, inter-rater reliability, test–retest reliability, content validity, construct validity, validity generalization, treatment sensitivity, and clinical utility, and indicates whether it is highly recommended. The instruments rated are the QIDS, OQ-45, DASS, and GAS (outcome measures) and the HAq-II and SAI (therapeutic relationship measures).

Note: QIDS = Quick Inventory of Depressive Symptomatology; OQ-45 = Outcome Questionnaire-45; DASS = Depression Anxiety Stress Scales; GAS = Goal Attainment Scaling; HAq-II = Helping Alliance Questionnaire-II; SAI = Session Alliance Inventory.


pleasurable activities is associated with a decrease in severity of depressive symptoms. Monitoring outcome and process during treatment is demanding; however, it is particularly important when treating depression because the nonresponse rate is high, even for the evidence-​ based treatments, and patients appear to have better outcomes when their therapists collect and review symptom-​monitoring data during treatment (Lambert, Harmon, Slade, Whipple, & Hawkins, 2005; Whipple et  al., 2003). Hence, we recommend that therapists monitor depressive symptoms, including suicidality, at every session and review a plot of the data. A visual record of the data on a plot that clearly displays the time course of symptom change is a key part of the use of monitoring data. Without it, the therapist can easily accumulate a stack of measures in the clinical record that does not inform the treatment process. The therapist will likely elect to assess mechanisms less frequently, depending on the sensitivity of the measure (see Table 7.3) and the therapist’s hypothesis about how quickly the mechanism is likely to change. Measures with strong psychometric properties that can be used to monitor changes in symptoms and the psychological mechanisms that the therapist conceptualizes as causing and maintaining the patient’s symptoms and problems are available, and we summarize them in Table 7.3. However, almost no measures with strong psychometric properties are available to monitor the patient’s progress toward accomplishing his or her idiographic treatment goals. In part, this lack reflects the challenges of evaluating the psychometric properties of idiographic tools. However, even the standardized measures that are available do not quite measure progress toward therapeutic goals; as described previously, GAS assesses the discrepancy between expected and actual goal attainment, and the Top Problems measure assesses the severity of the problems for which the patient seeks treatment; neither assesses the degree to which the patient has accomplished his or her treatment goals.
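A minimal sketch of the kind of session-by-session plot recommended above is shown below: it prints one bar per session for a series of weekly symptom totals (e.g., PHQ-9 or QIDS-SR scores) so that the time course of change is visible at a glance. The text-bar format is only an illustration; a spreadsheet chart or graphing package serves the same purpose.

```python
def plot_scores(scores):
    """Print a simple session-by-session bar chart of symptom totals."""
    for session, score in enumerate(scores, start=1):
        print(f"Session {session:2d} | {'#' * score} {score}")

plot_scores([21, 19, 18, 14, 15, 11, 9])
# Session  1 | ##################### 21
# Session  2 | ################### 19
# ...
```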

CONCLUSIONS AND FUTURE DIRECTIONS

Many strong measures of symptoms, diagnosis, and psychological mechanisms are available to aid the clinician who is treating the depressed patient. Here, we describe several key gaps in the field. One is the dearth of measures available to assess idiographic phenomena, including the case conceptualization and the patient’s treatment goals. The field’s slowness to develop measures


for these phenomena and to develop strategies for evaluating idiographic assessment tools may have its origin in the tradition of treatment development that has stressed the creation of standardized therapies that target single disorders. As a result, researchers have developed tools to assess disorders and symptoms, but they have been slow to develop measures to assess functioning and a broad spectrum of patient goals. The field’s recent shift to focus less on disorders and more on transdiagnostic mechanisms (e.g., Cuthbert & Insel, 2013) and to highlight the importance of personalizing treatment (Fisher & Bosley, 2015)  has already led to positive developments in this arena, as shown by the Top Problems tool developed by Weisz et  al. (2011) to identify and monitor progress in problems identified in a sample of multiply-​comorbid youths receiving psychotherapy. Another important gap is that few clinicians use assessment tools in psychotherapy to monitor their patients’ progress in treatment (Hatfield & Ogles, 2004). The importance of clinicians’ monitoring of their patients’ progress is highlighted by a meta-​ analysis (Harkin et al., 2016) showing that monitoring goal progress promoted goal attainment, especially when outcomes were reported to another person or made public and when information was physically recorded in some way. This gap likely results from a failure to train clinicians to do progress monitoring. Research to learn more about why clinicians do not monitor their patients’ progress and how obstacles to monitoring progress can be overcome is needed. Finally, clinicians encounter many impediments to gaining access to evidence-​based assessment tools. Many tools are difficult to learn about and retrieve, and are copyright protected and expensive, and some ask the clinician to submit evidence of expertise in testing that is purportedly needed to administer and interpret the measure. One element of a solution to this problem might include the requirement that researchers who develop an assessment tool using federal funding be asked to post it on an easily accessible website, in the same way that data and manuscripts produced by federally funded grants are disseminated. The future of assessment is likely the Internet. Free and inexpensive web-​based measures with excellent psychometric properties that are easy for clinicians to access and use are urgently needed.

ACKNOWLEDGMENT

We thank Jenna Carl for her assistance.

REFERENCES

Aderka, I., Nickerson, A., Boe, H., & Hoffman, S. (2012). Sudden gains during psychological treatments of anxiety and depression: A meta-​analysis. Journal of Consulting and Clinical Psychology, 80, 93–​101. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Antony, M. M., Bieling, P. J., Cox, B. J., Enns, M. W., & Swinson, R. P. (1998). Psychometric properties of the 42-​item and 21-​item versions of the Depression Anxiety Stress Scales in clinical groups and a community sample. Psychological Assessment, 10, 176–​181. Antony, M. M., Orsillo, S. M., & Roemer, L. (Eds.). (2001). Practitioner’s guide to empirically based measures of anxiety. New York, NY: Kluwer/​Plenum. Beck, A. T., & Bredemeier, K. (2016). A unified model of depression:  Integrating clinical, cognitive, biological, and evolutionary perspectives. Clinical Psychological Science, 4, 596–​619. Beck, A. T., Rush, J. A., Shaw, B. F., & Emery, G. (1979). Cognitive therapy for depression. New  York, NY: Guilford. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for Beck Depression Inventory-​ II. San Antonio, TX: Psychological Corporation. Beck, J. S. (1995). Cognitive therapy:  Basics and beyond. New York, NY: Guilford. Beidas, R. S., Stewart, R. E., Walsh, L., Lucas, S., Downey, M. M., Jackson, K.,  .  .  .  Mandell, D. S. (2014). Free, brief, and validated:  Standardized instruments for low-​ resource mental health settings. Cognitive and Behavioral Practice, 22, 5–​19. Bernstein, A., Hadash, Y., Lichtash, Y., Tanay, G., Shepherd, K., & Fresco, D. M. (2015). Decentering and related constructs:  A critical review and metacognitive processes model. Perspectives on Psychological Science, 10, 599–​617. Berridge, K. C., Robinson, T. E., & Aldridge, J. W. (2009). Dissecting components of reward: “Liking,” “wanting,” and learning. Current Opinion in Pharmacology, 9, 65–​73. Borkovec, T. D., Alcaine, O., & Behar, E. (2004). Avoidance theory of worry and generalized anxiety disorder. In R. G. Heimberg, C. L. Turk, & D. S. Mennin (Eds.), Generalized anxiety disorder:  Advances in research and practice (pp. 77–​108). New York, NY: Guilford.

Brewer, J. A., Worhunsky, P. D., Gray, J. R., Tang, Y., Weber, J., & Kober, H. (2011). Meditation experience is associated with differences in default mode network activity and connectivity. Proceedings of the National Academy of Sciences of the USA, 108, 20254–​20259. Brown, T. A., & Barlow, D. H. (2014). Anxiety and Related Disorders Interview Schedule for DSM-​ 5—​ Adult and Lifetime Version: Clinical manual. New York, NY: Oxford University Press. Brown, T. A., Chorpita, B. F., Korotitsch, W., & Barlow, D. H. (1997). Psychometric properties of the Depression Anxiety Stress Scales (DASS) in clinical samples. Behaviour Research and Therapy, 35, 79–​89. Bucci, S., French, L., & Berry, K. (2016). Measures assessing the quality of case conceptualization:  A systematic review. Journal of Clinical Psychology, 72, 517–​533. Center for Behavioral Health Studies and Quality. (2015). Behavioral health trends in the United States:  Results from the 2014 National Survey on Drug Use and Health (HHS Publication No. SMA 15-​4927, NSDUH Series H-​50). Retrieved from http://​www.samhsa.gov/​data Clark, L. A., & Watson, D. (1991). Tripartite model of anxiety and depression: Evidence and taxonomic implications. Journal of Abnormal Psychology, 100, 316–​336. Craig, A. D. (2009). How do you feel—​Now? The anterior insula and human awareness. Nature Reviews: Neuroscience, 10, 59–​70. Craighead, W. E., Johnson, B. N., Carey, S., & Dunlop, B. W. (2015). Psychosocial treatments for major depressive disorder. In P. E. Nathan & J. M. Gorman (Eds.), A guide to treatments that work (4th ed., pp. 318–​408). New York, NY: Oxford University Press. Crits-​Christoph, P., Connolly, M. B., Gallop, R., Barber, J. P., Tu, X., Gladis, M., & Siqueland, L. (2001). Early improvement during manual-​ guided cognitive and dynamic psychotherapies predicts 16-​ week remission status. Journal of Psychotherapy Practice and Research, 10, 145–​154. Cuthbert, B. N., & Insel, T. R. (2013). Toward the future of psychiatric diagnosis: The seven pillars of RDoC. BMC Medicine, 11, 126–​126. Derryberry, D., & Reed, M. A. (2001). A multidisciplinary perspective on attentional control. Advances in Psychology, 133, 325–​347. Derryberry, D., & Reed, M. A. (2002). Anxiety-​related attentional biases and their regulation by attentional control. Journal of Abnormal Psychology, 111, 225–​236. DeRubeis, R. J., Siegle, G. J., & Hollon, S. D. (2008). Cognitive therapy vs. medications for depression: Treatment outcomes and neural mechanisms. Nature Reviews: Neuroscience, 9, 788–​796.


Dimidjian, S., Barrera, M., Jr., Martell, C., Muñoz, R. F., & Lewinsohn, P. M. (2011). The origins and current status of behavioral activation treatments for depression. Annual Review of Clinical Psychology, 7, 1–​38. Ehring, T., Zetsche, U., Weidacker, K., Wahl, K., Schönfeld, S., & Ehlers, A. (2011). The Perseverative Thinking Questionnaire (PTQ): Validation of a content-​independent measure of repetitive negative thinking. Journal of Behavior Therapy and Experimental Psychiatry, 42, 225–​232. Fajkowska, M., & Derryberry, D. (2010). Psychometric properties of Attentional Control Scale: The preliminary study on a Polish sample. Polish Psychological Bulletin, 41, 1–​7. Falkenström, F., Ekeblad, A., & Holmqvist, R. (2016). Improvement of the working alliance in one treatment session predicts improvement of depressive symptoms by the next session. Journal of Consulting and Clinical Psychology, 84, 738–​751. Falkenström, F., Hatcher, R. L., Skjulsvik, T., Larsson, M. H., & Holmqvist, R. (2015). Development and validation of a 6-​item working alliance questionnaire for repeated administrations during psychotherapy. Psychological Assessment, 27, 169–​183. Ferster, C. B. (1973). A functional analysis of depression. American Psychologist, 28, 857–​870. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015). Structured Clinical Interview for DSM-​5—​Research version. Washington, DC: American Psychiatric Press. Fisher, A. J., & Bosley, H. G. (2015). Personalized assessment and treatment of depression. Current Opinion in Psychology, 4, 67–​74. Forbes, E. E., Shaw, D. S., & Dahl, R. E. (2007). Alterations in reward-​related decision making in boys with recent and future depression. Biological Psychiatry, 61, 633–​639. Frank, E. (2005). Treating bipolar disorder: A clinician’s guide to interpersonal and social rhythm therapy. New  York, NY: Guilford. Fresco, D. M., Mennin, D. S., Heimberg, R. G., & Ritter, M. (2013). Emotion regulation therapy for generalized anxiety disorder. Cognitive and Behavioral Practice, 20, 282–​300. Fresco, D. M., Moore, M. T., van Dulmen, M., Segal, Z. V., Ma, H., Teasdale, J. D., & Williams, J. M. G. (2007). Initial psychometric properties of the Experiences Questionnaire:  A self-​ report survey of decentering. Behavior Therapy, 38, 234–​246. Fresco, D. M., Segal, Z. V., Buis, T., & Kennedy, S. (2007). Relationship of posttreatment decentering and cognitive reactivity to relapse in major depression. Journal of Consulting and Clinical Psychology, 75, 447–​455.


Frost, R. O., Marten, P., Lahart, C., & Rosenblate, R. (1990). The dimensions of perfectionism. Cognitive Therapy and Research, 14, 449–​468. Gray, J. A., & McNaughton, N. (2000). The neuropsychology of anxiety: An enquiry into the functions of the septo-​ hippocampal system (2nd ed.). Oxford, UK:  Oxford University Press. Greenberg, P. E., Fournier, A.-​A., Sisitsky, T., Pike, C. T., & Kessler, R. C. (2015). The economic burden of adults with major depressive disorder in the United States (2005 and 2010). Journal of Clinical Psychiatry, 76, 155–​162. Greenberger, D., & Padesky, C. A. (1995). Mind over mood: A cognitive therapy treatment manual for clients. New York, NY: Guilford. Gross, J. J., & John, O. P. (2003). Individual differences in two emotion regulation processes:  Implications for affect, relationships, and well-​being. Journal of Personality and Social Psychology, 85, 348–​362. Grosscup, S. J., & Lewinsohn, P. M. (1980). Unpleasant and pleasant events and mood. Journal of Clinical Psychology, 36, 252–​259. Hamilton, J. P., Chen, M. C., & Gotlib, I. H. (2013). Neural systems approaches to understanding major depressive disorder:  An intrinsic functional organization perspective. Neurobiology of Disease, 52, 4–​11. Hardeveld, F., Spijker, J., De Graaf, R., Nolen, W. A., & Beekman, A. T.  F. (2010). Prevalence and predictors of recurrence of major depressive disorder in the adult population. Acta Psychiatrica Scandinavica, 122, 184–​191. Harkin, B., Webb, T. L., Chang, B. P.  I., Prestwich, A., Conner, M., Kellar, I.,  .  .  .  Sheeran, P. (2016). Does monitoring goal progress promote goal attainment? A  meta-​analysis of the experimental evidence. Psychological Bulletin, 142, 198–​229. Hatfield, D. R., & Ogles, B. M. (2004). The use of outcome measures by psychologists in clinical practice. Professional Psychology:  Research and Practice, 35, 485–​491. Haynes, S. N., & O’Brien, W. H. (2000). Principles and practice of behavioral assessment. New York, NY: Kluwer /​Plenum. Haynes, S. N., O’Brien, W. H., & Kaholokula, J. K. (2011). Behavioral assessment and case formulation. Hoboken, NJ: Wiley. Hoge, E. A., Bui, E., Goetter, E., Robinaugh, D. J., Ojserkis, R. A., Fresco, D. M., & Simon, N. M. (2015). Change in decentering mediates improvement in anxiety in mindfulness-​ based stress reduction for generalized anxiety disorder. Cognitive Therapy and Research, 39, 228–​235. Hollon, S. D., Stewart, M. O., & Strunk, D. R. (2006). Enduring effects for cognitive behavior therapy in the


treatment of depression and anxiety. Annual Review of Psychology, 57, 285–​315. Horvath, A. O., & Greenberg, L. S. (1989). Development and validation of the Working Alliance Inventory. Journal of Counseling Psychology, 36, 223–​233. Ilardi, S. S., & Craighead, W. E. (1994). The role of nonspecific factors in cognitive–​ behavior therapy for depression. Clinical Psychology Science and Practice, 1, 138–​156. Jones, N. P., Siegle, G., & Thase, M. (2008). Effects of rumination and initial severity on remission to cognitive therapy for depression. Cognitive Therapy and Research, 32, 591–​604. Kabat-​Zinn, J. (1990). Full catastrophe living: Using the wisdom of your body and mind to face stress, pain, and illness. New York, NY: Dell. Kazantzis, N., Whittington, C., & Dattilio, F. (2010). Meta-​ analysis of homework effects in cognitive and behavioral therapy:  A replication and extension. Clinical Psychology: Science and Practice, 17, 144–​156. Kazdin, A. E. (2013). Behavior modification in applied settings (7th ed.). Long Grove, IL: Waveland. Kessler, R. C., Berglund, P., Demler, O., Jin, R., Koretz, D., Merikangas, K. R.,  .  .  .  Wang, P. S. (2003). The epidemiology of major depressive disorder:  Results from the National Comorbidity Survey Replication (NCS-​ R). Journal of the American Medical Association, 289, 3095–​3105. Kessler, R. C., Chiu, W. T., Demier, O., Merikangas, K. R., & Walters, E. E. (2005). Prevalence, severity, and comorbidity of 12-​month DSM-​IV disorders in the National Comorbidity Survey Replication [erratum published in Arch Gen Psychiatry. 2005;62(7):709. Meikangas, Kathleen, R. added]. Archives of General Psychiatry, 62, 617–​627. Kiresuk, T. J., & Sherman, R. E. (1968). Goal attainment scaling:  A general method for evaluating comprehensive community mental health programs. Community Mental Health Journal, 4, 443–​453. Kiresuk, T. J., Smith, A., & Cardillo, J. E. (1994). Goal attainment scaling:  Applications, theory, and measurement. Hillsdale, NJ: Erlbaum. Klerman, G. L., Weissman, M. M., Rounsaville, B. J., & Chevron, E. S. (1984). Interpersonal psychotherapy for depression. New York, NY: Basic Books. Kroenke, K., Spitzer, R. L., & Williams, J. B. W. (2001). The PHQ-​9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16, 606–​613. Kuyken, W., Dudley, R., Abel, A., Gorg, N., Gower, P., McManus, F., & Padesky, C. (2016). Assessing competence in collaborative case conceptualization: Development and preliminary psychometric properties of the Collaborative Case Conceptualization Rating Scale (CCC-​RS). Behavioural and Cognitive Psychotherapy, 44, 179–​192.

Kuyken, W., Fothergill, C. D., Musa, M., & Chadwick, P. (2005). The reliability and quality of cognitive case formulation. Behaviour Research and Therapy, 43, 1187–​1201. Lambert, M. J., Burlingame, G. M., Umphress, V. J., Hansen, N. B., Vermeersch, D. A., Clouse, G., & Yanchar, S. (1996). The reliability and validity of the Outcome Questionnaire. Clinical Psychology and Psychotherapy, 3, 106–​116. Lambert, M. J., Harmon, C., Slade, K., Whipple, J. L., & Hawkins, E. J. (2005). Providing feedback to psychotherapists on their patients’ progress:  Clinical results and practice suggestions. Journal of Clinical Psychology, 61, 165–​174. Lambert, M. J., & Shimokawa, K. (2011). Collecting client feedback. In J. C. Norcross (Ed.), Psychotherapy relationships that work (2nd ed., pp. 203–​223). New York, NY: Oxford University Press. Lamoureux, B. E., Linardatos, E., Haigh, E. A. P., Fresco, D. M., Bartko, D., Logue, E., & Milo, L. (2010). Using the QIDS-​ SR16 to identify major depressive disorder in primary care medical patients. Behavior Therapy, 41, 423–​431. Larsen, R. J. (1984). Theory and measurement of affect intensity as an individual difference characteristic. Dissertation Abstracts International, 85, 2297B (University Microfilms No 84-​22112). Larsen, R. J., Diener, E., & Emmons, R. A. (1986). Affect intensity and reactions to daily life events. Journal of Personality and Social Psychology, 51, 803–​814. Lewinsohn, P. M., & Gotlib, I. H. (1995). Behavioral theory and treatment of depression. In E. E. Beckham & W. R. Leber (Eds.), Handbook of depression (2nd ed., pp. 352–​375). New York, NY: Guilford. Lewinsohn, P. M., Munoz, R. F., Youngren, M. A., & Zeiss, A. M. (1986). Control your depression. New  York, NY: Simon & Schuster. Lissek, S. (2012). Toward an account of clinical anxiety predicated on basic, neurally mapped mechanisms of Pavlovian fear-​learning: The case for conditioned overgeneralization. Depression and Anxiety, 29, 257–​263. Lovibond, P. F., & Lovibond, S. H. (1995). The structure of negative emotional states:  Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behavior Research and Therapy, 33, 335–​343. Luborsky, L., Barber, J., Siqueland, L., Johnson, S., Najavits, L., Franks, A., & Daley, D. (1996). The Revised Helping Alliance Questionnaire (HAq-​II). Journal of Psychotherapy Practice and Research, 5, 260–​271. MacPhillamy, D. J., & Lewinsohn, P. M. (1982). The Pleasant Events Schedule: Studies on reliability, validity, and scale intercorrelation. Journal of Consulting and Clinical Psychology, 50, 363–​380.


Martell, C. R., Addis, M. E., & Jacobson, N. S. (2001). Depression in context:  Strategies for guided action. New York, NY: Norton. McCullough, J. J.  P. (2000). Treatment for chronic depression:  Cognitive Behavioral Analysis System of Psychotherapy (CBASP). New York, NY: Guilford. Mennin, D. S., Ellard, K. K., Fresco, D. M., & Gross, J. J. (2013). United we stand:  Emphasizing commonalities across cognitive–​behavioral therapies. Behavior Therapy, 44, 234–​248. Mennin, D. S., & Fresco, D. M. (2013). What, me worry and ruminate about DSM-5 and RDoC?: The importance of targeting negative self-referential processing. Clinical Psychology: Science and Practice, 20, 259–268.  http:// doi.org/10.1111/cpsp.12038. Mennin, D. S., & Fresco, D. M. (2014). Emotion regulation therapy. In J. J. Gross (Ed.), Handbook of emotion regulation (2nd ed., pp. 469–​ 490). New  York, NY: Guilford. Mennin, D. S., Fresco, D. M., Heimberg, R. G., & O’Toole, M. (2017). A randomized controlled trial of emotion regulation therapy for generalized anxiety and co-​occurring depression. Manuscript under review. Mennin, D. S., Fresco, D. M., Ritter, M., & Heimberg, R. G. (2015). An open trial of emotion regulation therapy for generalized anxiety disorder and cooccurring depression. Depression and Anxiety, 32, 614–​623. Menon, V. (2015). Salience network. In A. W. Toga (Ed.), Brain mapping:  An encyclopedic reference (pp. 597–​ 611). Waltham, MA: Academic Press. Menon, V., & Uddin, L. Q. (2010). Saliency, switching, attention and control: A network model of insula function. Brain Structure and Function, 214, 655–​667. Moussavi, S., Chatterji, S., Verdes, E., Tandon, A., Patel, V., & Ustun, B. (2007). Depression, chronic diseases, and decrements in health:  Results from the World Health Surveys. Lancet, 370, 851–​858. Nakonezny, P. A., Morris, D. W., Greer, T. L., Byerly, M. J., Carmody, T. J., Grannemann, B. D.,  .  .  .  Trivedi, M. H. (2015). Evaluation of anhedonia with the Snaith Hamilton Pleasure Scale (SHAPS) in adult outpatients with major depressive disorder. Journal of Psychiatric Research, 65, 124–​130. Newman, M. G., & Llera, S. J. (2011). A novel theory of experiential avoidance in generalized anxiety disorder: A review and synthesis of research supporting a contrast avoidance model of worry. Clinical Psychology Review, 31, 371–​382. Nezu, A. M., Ronan, G. F., Meadows, E. A., & McClure, K. S. (Eds.). (2000). Practitioner’s guide to empirically based measures of depression. New York, NY: Kluwer /​Plenum. Nolen-​Hoeksema, S., Wisco, B., & Lyubomirsky, S. (2008). Rethinking rumination. Association for Psychological Science, 3, 400–​424.

149

Norcross, J. C. (Ed.). (2011). Psychotherapy relationships that work:  Evidence-​ based responsiveness (2nd ed.). New York, NY: Oxford University Press. O’Doherty, J. P. (2004). Reward representations and reward-​ related learning in the human brain:  Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–​776. Olatunji, B. O., Naragon-​ Gainey, K., & Wolitzky-​ Taylor, K. B. (2013). Specificity of rumination in anxiety and depression:  A multimodal meta-​ analysis. Clinical Psychology: Science and Practice, 20, 225–​257. Paulus, M. P., & Stein, M. B. (2010). Interoception in anxiety and depression. Brain Structure and Function, 214, 451–​463. Persons, J. B. (1990). Disputing irrational thoughts can be avoidance behavior:  A case report. The Behavior Therapist, 13, 132–​133. Persons, J. B. (2008). The case formulation approach to cognitive–​behavior therapy. New York, NY: Guilford. Persons, J. B., Beckner, V. L., & Tompkins, M. A. (2013). Testing case formulation hypotheses in psychotherapy:  Two case examples. Cognitive and Behavioral Practice, 20, 399–​409. Persons, J. B., Brown, C., & Diamond, A. (in press). A cognitive–​behavioral case formulation. In K. Dobson (Ed.), Handbook of cognitive–​behavioral therapies (4th ed.). New York, NY: Guilford. Persons, J. B., Davidson, J., & Tompkins, M. A. (2001). Essential components of cognitive–​behavior therapy for depression. Washington, DC:  American Psychological Association. Persons, J. B., & Hong, J. J. (2016). Case formulation and the outcome of cognitive behavior therapy. In N. Tarrier & J. Johnson (Eds.), Case formulation in cognitive behaviour therapy (2nd ed., pp. 14–​37). London, UK: Routledge. Persons, J. B., & Thomas, C. (2016). BDI score at week 4 of cognitive behavior therapy predicts depression remission. Paper presented at the annual meeting of the Association for Behavioral and Cognitive Therapies, New York, NY. Pettersson, A., Boström, K. B., Gustavsson, P., & Ekselius, L. (2015). Which instruments to support diagnosis of depression have sufficient accuracy? A  systematic review. Nordic Journal of Psychiatry, 69, 497–​508. Pratt, L. A., & Brody, D. J. (2014). Depression in the U.S. household population, 2009–​ 2012 (Data Brief No. 172). Hyattsville, MD: National Center for Health Statistics. Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). A default mode of brain function. Proceedings of the National Academy of Sciences of the USA, 98, 676–​682. Renna, M. E., Quintero, J. M., Mennin, D. S., & Fresco, D. M. (2017). Emotion regulation therapy:  A

150

150

Mood Disorders and Self-Injury

mechanism-​targeted treatment for disorders of distress. Frontiers in Psychology, 8, 98. Riolo, S. A., Nguyen, T. A., Greden, J. F., & King, C. A. (2005). Prevalence of depression by race/​ ethnicity:  Findings from the National Health and Nutrition Examination Survey III. American Journal of Public Health, 95, 998–​1000. Rubin, D. C., Hoyle, R. H., & Leary, M. R. (2012). Differential predictability of four dimensions of affect intensity. Cognition and Emotion, 26, 25–41. Rush, A. J., Trivedi, M. H., Ibrahim, H. M., Carmody, T. J., Arnow, B., Klein, D. N., . . . Keller, M. B. (2003). The 16-​item Quick Inventory of Depressive Symptomatology (QIDS), Clinician Rating (QIDS-​C), and Self-​Report (QIDS-​SR): A psychometric evaluation in patients with chronic major depression. Biological Psychiatry, 54, 585. Safran, J. D., & Segal, Z. V. (1990). Interpersonal process in cognitive therapy. New York, NY: Basic Books. Segal, Z. V., Williams, J. M.  G., & Teasdale, J. D. (2013). Mindfulness-​based cognitive therapy for depression (2nd ed.). New York, NY: Guilford. Shepherd, K. A., Coifman, K. G., Matt, L. M., Fresco, D. M. (2016). Development of a self-​distancing task and initial validation of responses. Psychological Assessment, 28, 841–​855. Sherdell, L., Waugh, C. E., & Gotlib, I. H. (2012). Anticipatory pleasure predicts motivation for reward in major depression. Journal of Abnormal Psychology, 121, 51–​60. Shim, R. S., Baltrus, P., Ye, J., & Rust, G. (2011). Prevalence, treatment, and control of depressive symptoms in the United States:  Results from the National Health and Nutrition Examination Survey (NHANES), 2005–​2008. Journal of the American Board of Family Medicine, 24, 33–​38. Snaith, R. P., Hamilton, M., Morley, S., Humayan, A., Hargreaves, D., & Trigwell, P. (1995). A scale for the assessment of hedonic tone the Snaith–​ Hamilton Pleasure Scale. British Journal of Psychiatry, 167, 99–​103. Spitzer, R. L., Kroenke, K., Williams, J. W., & Patient Health Questionnaire Primary Care Study Group. (1999). Validation and utility of a self-​report version of PRIME-​MD: The PHQ primary care study. JAMA, 282, 1737–​1744. Spitzer, R. L., Williams, J. B. W., Kroenke, K., Hornyak, R., & McMurray, J. (2000). Validity and utility of the PRIME-​ MD Patient Health Questionnaire in assessment of 3000 obstetric–​gynecologic patients: The PRIME-​MD Patient Health Questionnaire Obstetrics–​ Gynecology Study. American Journal of Obstetrics and Gynecology, 183, 759–​769. Stein, M. B., & Paulus, M. P. (2009). Imbalance of approach and avoidance: The yin and yang of anxiety disorders. Biological Psychiatry, 66, 1072–​1074.

Stober, J. (2000). Frost Multidimensional Perfectionism Scale. In J. Maltby, C. A. Lewis, & A. Hill (Eds.), Commissioned reviews of 250 psychological tests (Vol. 1, pp. 310–​314). Lampeter, UK: Mellen. Teasdale, J. D. (1997). The relationship between cognition and emotion: The mind-​in-​place mood disorders. In D. M. Clark & C. G. Fairburn (Eds.), Science and practice of cognitive behaviour therapy (pp. 67–​93). Oxford, UK: Oxford University Press. Teasdale, J. D. (1988). Cognitive vulnerability to persistent depression. Cognition and Emotion, 2, 247–​274. Üstün, T. B., Chatterji, S., Kostanjsek, N., Rehm, J., Kennedy, C., Epping-​Jordan, J., . . . in collaboration with, WHO/​ NIH Joint Project. (2010). Developing the World Health Organization Disability Assessment Schedule 2.0. Bulletin of the World Health Organization, 88, 815–​823. van der Velden, A. M., Kuyken, W., Wattar, U., Crane, C., Pallesen, K. J., Dahlgaard, J.,  .  .  .  Piet, J. (2015). A systematic review of mechanisms of change in mindfulness-​ based cognitive therapy in the treatment of recurrent major depressive disorder. Clinical Psychology Review, 37, 26–​39. Ventura, J. (1998). Training and quality assurance with the structured clinical interview for DSM-​IV (SCID-​I/​P). Psychiatry Research, 79, 163–​173. Vermeersch, D. A., Lambert, M. J., & Burlingame, G. M. (2000). Outcome Questionnaire:  Item sensitivity to change. Journal of Personality Assessment, 74, 242–​261. Watkins, E. R. (2008). Constructive and unconstructive repetitive thought. Psychological Bulletin, 134, 163–​206. Watkins, E. R. (2016). Rumination-​focused cognitive behavioral therapy for depression. New York, NY: Guilford. Weinfurt, K. P., Bryant, F. B., & Yarnold, P. R. (1994). The factor structure of the Affect Intensity Measure:  In search of a measurement model. Journal of Research in Personality, 28, 314–​331. Weissman, M. M., & Bothwell, S. (1976). Assessment of social adjustment by patient self-​ report. Archives of General Psychiatry, 33, 1111–​1115. Weisz, J. R., Chorpita, B. F., Frye, A., Ng, M. Y., Lau, N., Bearman, S. R., . . . Hoagwood, K. E. (2011). Youth top problems:  Using idiographic, consumer-​ guided assessment to identify treatment needs and to track change during psychotherapy. Journal of Consulting and Clinical Psychology, 79, 369–​380. Whipple, J. L., Lambert, M. J., Vermeersch, D. A., Smart, D. W., Nielsen, S. L., & Hawkins, E. J. (2003). Improving the effects of psychotherapy: The use of early identification of treatment failure and problem solving strategies in routine practice. Journal of Counseling Psychology, 58, 59–​68. Whitfield-​Gabrieli, S., & Ford, J. M. (2012). Default mode network activity and connectivity in psychopathology. Annual Review of Clinical Psychology, 8, 49–​76.

 15

Adult Depression

Whitmer, A. J., Frank, M. J., & Gotlib, I. H. (2012). Sensitivity to reward and punishment in major depressive disorder: Effects of rumination and of single versus multiple experiences. Cognition and Emotion, 26, 1475–​1485. Whitmer, A. J., & Gotlib, I. H. (2012). Switching and backward inhibition in major depressive disorder: The role of rumination. Journal of Abnormal Psychology, 121, 570–​578. Wiersma, J. E., Hovens, J. G., van Oppen, P., Giltay, E. J., van Schaik, D. J., Beekman, A. T., & Penninx, B. W. (2009). The importance of childhood trauma and childhood life events for chronicity of depression in adults. Journal of Clinical Psychiatry, 70, 983–​989.

151

World Health Organization. (2001). WHO disability assessment schedule. Retrieved from http://​who.int/​classifications/​icf/​whodasii/​en World Health Organization. (2008). The global burden of disease:  2004 update. Retrieved from http://​www.who. int/​ h ealthinfo/​ g lobal_​ b urden_​ disease/​ 2 004_​ r eport_​ update/​en Yuen, G. S., Gunning-​ Dixon, F. M., Hoptman, M. J., AbdelMalak, B., McGovern, A. R., Seirup, J. K., & Alexopoulos, G. S. (2014). The salience network in the apathy of late-​life depression. International Journal of Geriatric Psychiatry, 29, 1116–​1124.

152

8

Depression in Late Life

Amy Fiske and Alisa O'Riley Hannum

Assessing depression in older adults presents unique challenges to the clinician for several reasons. First, depression may be underreported in older adults because clients and their families (and, unfortunately, sometimes their physicians) often assume depressive symptoms are normal in late adulthood (Karel, Ogland-Hand, & Gatz, 2002). Second, depression can sometimes be difficult to differentially diagnose in older adulthood because of the prevalence of comorbid physical and cognitive problems. Finally, it may be difficult to diagnose depression in older adulthood because older adults often demonstrate presentations of the disorder that differ from typical presentations in other age groups (Hegeman, Kok, van der Mast, & Giltay, 2012). Given these challenges, it may be unwise to assess depression in older adults with the same methods and instruments used for younger adults. Even instruments that are well validated and empirically supported for the assessment of depression in younger adults may lack measurement equivalence across the lifespan (Karel et al., 2002). In this chapter, we examine the utility of current measures of depression for adults older than age 60 years. We begin by elaborating on the nature of depression in older adulthood. We then examine depression instruments in terms of their utility for purposes of diagnosis, case conceptualization and treatment planning, and treatment monitoring and outcome measurement for older adults.

THE NATURE OF DEPRESSION IN LATE LIFE

Depression in late life is commonly defined as meeting diagnostic criteria for one of several depressive disorders. Categories within the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​5; American
Psychiatric Association [APA], 2013) include major depressive disorder, persistent depressive disorder (dysthymia), and adjustment disorder with depressed mood. The 10th edition of the International Classification of Diseases (ICD-10; World Health Organization [WHO], 1992) specifies categories of mild, moderate, and severe recurrent depressive disorders as well as dysthymia. ICD-10 moderate and severe depressive disorders are largely equivalent to DSM-5 major depressive disorder. Bipolar disorder, which is seen infrequently in older adults and which differs in important ways from unipolar depression (Depp & Jeste, 2004), is not discussed in this chapter (see Chapter 9, this volume).

The diagnosis of major depressive disorder requires either dysphoria (depressed mood) or anhedonia (diminished interest or pleasure in activities) most of the day, nearly every day, for at least 2 weeks, with symptoms totaling at least five overall, the remainder drawn from appetite disturbance, sleep disturbance, psychomotor retardation or agitation, low energy, feelings of worthlessness or inappropriate guilt, inability to concentrate, and thoughts of death or suicide. Additional diagnostic criteria require impairment in social, occupational, or other important areas of functioning and exclude symptoms that can be attributed to another medical condition or a substance. Persistent depressive disorder (dysthymia) is diagnosed when symptoms are present for at least 2 years, without a break of 2 months or more. To meet criteria for persistent depressive disorder, symptoms must include pervasive dysphoria plus two additional symptoms from among the following: appetite disturbance, sleep disturbance, low energy, low self-esteem, inability to concentrate, and hopelessness. The diagnosis of adjustment disorder with depressed mood is assigned when symptoms occur in response to a specific stressor if the symptoms cause significant distress
or impairment but do not meet criteria for major depressive disorder or persistent depressive disorder. The 12-​month prevalence of major depressive disorder among adults age 65  years or older in the National Comorbidity Study Replication was 2.3% compared to 7.7% for adults age 18–​ 64  years (Kessler, Petukhova, Sampson, Zaslavsky & Wittchen, 2012). Similarly, the prevalence of dysthymia among older adults is approximately 2% (Devenand, 2014). Depressive syndromes that do not meet diagnostic criteria for major depressive disorder are at least two or three times more prevalent than major depression (Meeks, Vahia, Lavretsky, Kulkarni, & Jeste, 2011). Note that the meaning and accuracy of these prevalence estimates are affected by issues that are described next. Existing diagnostic criteria may not be appropriate for the classification of depression among older adults because older adults frequently present a symptom picture that differs from the profiles most often reported by younger and middle-​aged adults. Although reports vary somewhat regarding specifics, an emerging consensus is that older adults are less likely to report certain ideational symptoms, such as dysphoria, guilt, and suicidal ideation (Blazer, Bachar, & Hughes, 1987; Gallo, Anthony, & Muthén, 1994; Gallo, Rabins, & Anthony, 1999; Hegeman, Kok, et al., 2012). Exceptions include findings that reports of hopelessness, helplessness, and nonsuicidal thoughts about death may be more common in older than in younger adults (Christensen et al., 1999; Gallo et al., 1994). In contrast to their general tendency to report fewer ideational symptoms, older adults are more likely to report somatic symptoms, such as fatigue, insomnia, psychomotor retardation, agitation, or diminished appetite and weight loss (Blazer et al., 1987; Brodaty et al., 1991; Christensen et  al., 1999; Gallo et  al., 1994; Hegeman, Kok, et al., 2012). This symptom pattern has been referred to variously as “masked depression” (Blumenthal, 1980), “depression without sadness” (Gallo et  al., 1999), and “non-​ dysphoric depression” (Onwuameze & Paradiso, 2014). Yet evidence shows that these somatic symptoms cannot be attributed entirely to increases in physical illness (Nguyen & Zonderman, 2006) and (with the exception of changes in appetite and libido) are indicative of depression in older adults (Norris, Arnau, Bramson, & Meagher, 2004). Some reports also indicate that anhedonia is increasingly common with age (Mora et al., 2012). Nonetheless, there is evidence of substantial heterogeneity in the presentation of depressive symptoms among older adults (Mora et al., 2012).


Furthermore, the very nature of late adult life may affect diagnostic classification. For example, diagnostic criteria require that impairment be evident in social, occupational, or other important areas of functioning. However, the definition of normal functioning in these domains has not been well operationalized for older adults, raising the possibility that older adults may be less likely than younger and middle-​aged adults to be seen as functionally impaired. In addition, exclusionary criteria may lead to underdetection of depression in older adults. Symptoms that can be attributed to physical illness or medication use are not to be considered when diagnosing depression, but distinguishing the cause of these symptoms may not be straightforward. Taken together, these factors suggest that existing diagnostic categories may lack sensitivity for detecting depression among older adults. Numerous new categories have been proposed to classify depressive symptoms that do not meet diagnostic criteria (for a discussion, see Kumar, Lavretsky, & Elderkin-​ Thompson, 2004), but a single standard has not emerged. A widely used alternative for identifying cases of depression in older adults is use of a cut-​off score on a depressive symptom checklist to indicate the presence of clinically significant depressive symptoms. This dimensional approach identifies older adults who are experiencing an elevated level of depressive symptoms without excluding individuals whose symptoms do not include dysphoria or anhedonia, those without evidence of impaired functioning, or those with comorbid physical illness. Thus, this method overcomes limitations of diagnostic criteria that do not map well onto depressive experience in old age. Because this approach lacks syndromal criteria, however, it lacks specificity for ruling out causes of symptoms that may not represent depression, such as those that are the direct effects of physical illness. Cognitive deficits may also complicate the measurement of depression in late life. Evaluating whether these deficits are symptoms of depression or dementia can be challenging and may require the use of informants as well as longitudinal assessments (Wang & Blazer, 2015). Alexopoulos et al. (1997) proposed that cognitive deficits, primarily deficits in executive functioning, accompanied by cerebrovascular risk factors and a late age of depression onset, may indicate an etiologically distinct form of depression, which they term “vascular depression.” A substantial minority of cases meeting proposed criteria for vascular depression go on to develop dementia (Potter et  al., 2013). Furthermore, rates of depression are elevated among individuals with dementia (Vilalta-​Franch
et al., 2006). Thus, cognitive dysfunction may represent a symptom of depression, or depression may be a prodromal symptom of, or reaction to, dementia. Other psychiatric and medical comorbidities should also be taken into account. As at other ages, anxiety is highly comorbid with depression in late life, although there is no evidence that it is more common in depression with late onset (Janssen, Beekman, Comijs, Deeg, & Heeren, 2006). Comorbid physical illness may also complicate the assessment of depression in older adults. Physical illness may represent a cause of depression (Alexopoulos et  al., 1997; Zeiss, Lewinsohn, & Rohde, 1996), an effect of depression (Frasure-​Smith, Lesperance, & Talajic, 1993), or simply co-​occurrence. Furthermore, illness may lead to depression as a result of organic mechanisms (Alexopoulos et al., 1997) or as a psychological reaction. Zeiss and colleagues concluded that functional impairment largely mediates the relationship between illness and depression, suggesting that depression is a psychological response to limitations imposed by the illness. Whatever the direction of causation or mechanism, the comorbidity of depression and physical illness makes assessment more challenging because certain symptoms are shared by both. There is heterogeneity in the prognosis of late life depression. Psychotherapy for depression is as efficacious in older adults as it is in younger adults (Cuijpers, Andersson, Donker, & Van Straten, 2011). Nonetheless, older adults who have recovered from depression appear to be at risk of earlier relapse (Mueller et  al., 2004). Earlier relapse is predicted by residual symptoms following treatment (Chopra et  al., 2005), which suggests that incomplete resolution of a depressive episode may predispose to another episode. Time to relapse is also predicted by the presence of executive dysfunction (Alexopoulos et  al., 2000), consistent with a neurobiological explanation such as vascular depression (Alexopoulos et al., 1997). Finally, depression in late life has been linked to many of the same risk and protective factors as at other points in the lifespan, although the prevalence of these factors, and the strength of their association with depression, may vary by age. Age of depression onset has been examined as a potential marker for vascular depression, an etiologically distinct subtype of the disorder (Alexopoulos et al., 1997). There is little epidemiologic research on the proportion of older adults with depression who experienced the first episode after age 60 years. Some research suggests that half of cases are late onset (e.g., Steingart & Herrmann, 1991). The proportion of older adults with a lifetime history of depression who
met criteria for vascular depression (with onset at age 50 years or older) has been estimated at 22% (Gonzalez, Tarraf, Whitfield, & Gallo, 2012). Genetic factors have been implicated in depressive symptoms among older adults. Estimates of heritability in one study were .14 for men and .29 for women (Jansson et al., 2004). Family studies suggest that genetic influences play a greater role in depression earlier in life (Baldwin & Tomenson, 1995). In contrast, other biological factors, such as cerebrovascular risk factors (Alexopoulos et  al., 1997; Nemeth, Haroon, & Neigh, 2014), may be more influential in late life depression. Stressors also contribute to the risk of depression in older adults, as at other ages (for a meta-​analysis, see Kraaij, Arensman, & Spinhoven, 2002). Among the specific stressors most frequently examined in older populations are physical illness and disability (as discussed previously), bereavement, and caregiving. Bereavement may be a risk factor for depression in late life, particularly among individuals with a history of depressive episodes (Zisook & Shuchter, 1993). Prigerson and colleagues (e.g., Latham & Prigerson, 2004)  have argued that abnormal distress following bereavement generally does not resemble depression, and it should instead be considered a different syndrome, which they term “complicated grief.” Consistent with this logic, the bereavement exclusion for major depression was removed in DSM-​5 (APA, 2013). Caregiving for someone with dementia or disability is a potentially stressful experience that occurs with greater frequency in late life. High rates of depression among caregivers have been reported (for a review, see Schulz & Martire, 2004). Social factors may act as either protective or risk factors for depression in late life. Perceived social support buffers the effects of stressors in older adults, but support that is too intensive, or perceived as unsupportive, may also contribute to risk (for a review, see Hinrichsen & Emery, 2005). Thus, depression in late life differs from depression earlier in the lifespan in terms of presentation, comorbidities, course, and risk factors. These differences imply a need for special care in assessing depression in an older adult. Assessment should consider the possibility of medical comorbidity or declines in cognitive functioning. Furthermore, due to the unique presentation of depression in late life, diagnostic classification may underestimate pathology in this group, whereas symptom checklists may overestimate problems, suggesting that categorical and dimensional measurements may both be important.
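
To make the contrast between categorical and dimensional approaches concrete, the sketch below checks a DSM-5-style symptom count alongside a checklist cut-off. It is an illustration only: the symptom labels, item responses, and cut-off value are hypothetical placeholders rather than any published instrument's scoring rules, and duration, impairment, and exclusion criteria are not modeled.

```python
# Illustrative sketch only; symptom names, responses, and the cut-off are hypothetical.

CORE = {"dysphoria", "anhedonia"}
OTHER = {
    "appetite_disturbance", "sleep_disturbance", "psychomotor_change",
    "low_energy", "worthlessness_or_guilt", "poor_concentration",
    "thoughts_of_death_or_suicide",
}

def meets_symptom_count(present, weeks):
    """Categorical check: at least 5 of the 9 symptoms, at least one of them core,
    for at least 2 weeks (impairment and exclusion criteria are not modeled)."""
    total = len(present & (CORE | OTHER))
    return weeks >= 2 and total >= 5 and bool(present & CORE)

def exceeds_cutoff(item_scores, cutoff=16):
    """Dimensional check: sum of checklist item ratings against an illustrative cut-off."""
    return sum(item_scores) >= cutoff

# A largely somatic presentation can miss the categorical threshold
# while still exceeding a checklist cut-off.
symptoms = {"sleep_disturbance", "low_energy", "appetite_disturbance", "poor_concentration"}
print(meets_symptom_count(symptoms, weeks=3))             # False: no core symptom, count < 5
print(exceeds_cutoff([2, 3, 2, 3, 2, 2, 3], cutoff=16))   # True: total of 17
```

The point of the contrast is the one made above: the two approaches can disagree for the same older adult, which is why categorical and dimensional information may both be worth collecting.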


PURPOSES OF ASSESSMENT

In the following sections, we review assessment instruments with a focus on three specific clinical purposes: diagnosis, case conceptualization and treatment planning, and treatment monitoring and the assessment of treatment outcome. We do not evaluate instruments for use in screening. Much empirical work has focused on the evaluation of instruments to screen for depression in older adults, particularly within primary care settings. For a review of specific screening instruments, the interested reader is referred to Watson and Pignone (2003). For a discussion of the benefits and harms of screening for depression in primary care, see O'Connor, Whitlock, Beil, and Gaynes (2009).

A specialized literature exists with respect to the assessment of depression within dementia. Some individuals with dementia may be able to provide accurate information about their own depressive symptoms, but the validity of self-report varies with the level of awareness of deficits (Snow et al., 2005). As a result, instruments that have been developed specifically for this task are largely observer rated, to be completed by a clinician or lay interviewer, and some incorporate information from a caregiver or other proxy as well. Information on the use of these measures for the assessment purposes discussed previously is included in the relevant sections.

Because older adults, specifically older men, are at the highest risk of death by suicide of any demographic group (Curtin, Warner, & Hedegaard, 2016), a clinician assessing depression in this population must also be prepared to assess suicide risk.

ASSESSMENT FOR DIAGNOSIS

Structured Interviews

Structured clinical interviews address all information needed for a diagnosis. Table 8.1 summarizes the
properties of structured clinical interviews when used to diagnose depression in older adults. The primary advantage of structured compared to unstructured clinical interviews is reliability, as seen in the table. As previously mentioned, however, a consideration when assessing older adults is whether diagnostic criteria themselves may lead to underdetection of depression. Lengthy administration represents a challenge with respect to use of these instruments in clinical settings with older adults. However, administration time depends on the person’s responses. Furthermore, increasingly brief versions have been published in recent years. A further challenge is the extensive training required to reach proficiency in the administration of structured clinical interviews, ranging from days to weeks. Although training requirements may be viewed as a burden, training and experience using structured interviews can be especially helpful for new clinicians (Segal, Kabacoff, Hersen, Van Hasselt, & Ryan, 1995). Note that for some of these measures, a DSM-​5 version is not yet published or widely used. The most widely used structured clinical interview in the United States is the Structured Clinical Interview for DSM (SCID; First, Williams, Karg, & Spitzer, 2015). The current revision yields diagnoses according to DSM-​ 5 (APA, 2013)  criteria. The current revision is available in either research or clinical versions, with the clinical version (SCID-​5-​CV; First et  al., 2015)  abbreviated to minimize administration time. The SCID-​5-​CV assesses the most common DSM-​ 5 disorders, including mood disorders, and requires 45 to 90 minutes to administer, but individual modules can be administered separately. Although the current revision of the SCID-​CV has not yet been evaluated in older adult samples, a previous form of the SCID, which was based on DSM-​III-​R diagnostic criteria (APA, 1987), has been found to have good inter-​ rater reliability for the diagnosis of major depressive disorder in older adult samples (Segal et al., 1995). Notably, inter-​rater reliability appears to be lower for diagnosis of

Table 8.1  Ratings of Instruments Used for Diagnosis

Instrument  Norms  Internal Consistency  Inter-Rater Reliability  Test–Retest Reliability  Content Validity  Construct Validity  Validity Generalization  Clinical Utility  Highly Recommended

Structured Clinical Interviews
SCID  NA  NA  E  NR  A  NR  G  A
SADS  NA  NA  E  NR  A  NR  G  A
GMS   NA  NA  G  G   G  G   G  A

Note: SCID = Structured Clinical Interview for DSM; SADS = Schedule for Affective Disorders and Schizophrenia; GMS = Geriatric Mental State Schedule; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

dysthymia (Segal et  al., 1995). Extensive pilot and field testing (First et al., 2015) suggests good content validity, but the measure was not developed specifically for use with older adults. There is ample evidence of construct validity for the SCID in mixed age samples (First, Spitzer, Gibbon, & Williams, 2015), and available evidence suggests content validity in older adult samples as well (Stukenberg, Dura, & Kiecolt-​Glazer, 1990), but data are too sparse to permit conclusions to be drawn. Evidence from a mixed-​age sample suggests that using the SCID (DSM-​III-​R version) improves diagnostic accuracy compared to routine diagnostic procedures, and it appears to improve clinical management of cases (Basco et  al., 2000). Thus, the SCID-​CV may be useful for diagnosing major depressive disorder in older adults, but training and administration requirements are substantial and more empirical work is needed in older adult samples, specifically using the SCID-​5-​CV version, before it can be highly recommended for use in this population. The Schedule for Affective Disorders and Schizophrenia (SADS; Endicott & Spitzer, 1978)  yields diagnoses according to the Research Diagnostic Criteria (Spitzer, Endicott, & Robins, 1978). The SADS requires extensive training (“weeks,” according to Dozois & Dobson, 2002)  and takes 90 to 120 minutes to administer (Dozois & Dobson, 2002). In mixed-​age samples, the SADS demonstrates excellent inter-​ rater reliability (Endicott & Spitzer, 1978). Although use of the SADS with older adults has not often been evaluated, one study reported excellent inter-​rater reliability (Rapp, Smith, & Britt, 1990). As with the SCID, development of the SADS involved thorough evaluation of the content (Endicott & Spitzer, 1978), but the measure was not designed specifically for older adults. The SADS has demonstrated good efficiency in detecting “cases” of depression in older adults as defined by a cut-​off score on the Beck Depression Inventory (Gallagher, Breckenridge, Steinmetz, & Thompson, 1983), but it has been tested too infrequently in this population to support any conclusions regarding construct validity. Thus, the SADS may be a reliable and valid method of diagnosing depression in older adults, but training and administration costs are high, and further evaluation of its validity in this age group is needed before it can be highly recommended. Whereas the SCID and SADS were designed for administration by a trained clinician, the Composite International Diagnostic Interview (CIDI; Robins et  al., 1988)  was initially developed for administration by trained laypersons for use in research and has since been used in clinical settings. The CIDI is a composite of the
Diagnostic Interview Schedule (DIS; Robins, Helzer, Croughan, & Ratcliff, 1981) and the Present State Exam (PSE; Wing, Birley, Cooper, Graham, & Isaacs, 1967). It was designed to produce current and lifetime diagnoses according to both DSM-​III-​R (APA, 1987)  and ICD-​10 (WHO, 1992) criteria. Although a DSM-​5 version of the CIDI has not yet been developed, the previous version has been extensively validated in mixed-​ age samples. There is some evidence of reliability of the CIDI in the elderly (Heun, Müller, Freyberger, & Maier, 1998), but concerns have been raised about validity in this age group. Specifically, older adults are less likely than their younger counterparts to endorse the gateway items of dysphoria and anhedonia (Trainor, Mallett, & Rushe, 2013), possibly due to the additional complexity of these questions (O’Connor & Parslow, 2009). To address these concerns, a revised version of the interview, the CIDI65+, has been developed (Wittchen et al., 2015). Questions were shortened and the format was simplified in order to be more appropriate for older adults with cognitive difficulties. The CIDI65+ has demonstrated good test–​ retest reliability; validity has yet to be examined (Wittchen et  al., 2015). An epidemiologic study that used the CIDI65+ found greater prevalence of depression and other mental disorders in older adults compared to other studies (Andreas et  al., 2017), suggesting that the use of age-​ appropriate measures may improve detection of depression and other disorders in this age group. Several short forms of the CIDI have also been developed. One short form (UM-​CIDI-​SF; Kessler & Mroczek, 1993) was evaluated in a large sample of older adults and found to be as strongly related to physician diagnosis as was the Center for Epidemiological Studies–​Depression Scale (Turvey, Wallace, & Herzog, 1999). Nonetheless, properties of the CIDI, the CIDI65+, and various short forms of the CIDI in older adult samples have yet to be well examined and, therefore, these measures cannot yet be recommended for clinical use with this population. In contrast to most structured interview protocols, the Geriatric Mental State Schedule (GMS; Copeland et al., 1976) was developed specifically with older adults in mind. A classification system (known as AGECAT) was empirically derived for use with the GMS and is implemented through a computer-​based algorithm. The GMS with the AGECAT system assesses for eight psychiatric syndromes in older adults, including neurotic and psychotic depression. Ratings indicate level of diagnostic confidence, from 0 to 5, with 3 or greater indicating the presence of a case. The GMS has demonstrated good to excellent inter-​rater and test–​retest reliability (Copeland et  al., 1988). The
GMS was derived from previous scales, including the PSE (Wing et al., 1967), in consultation with experts and with extensive field testing in older adult samples, suggesting good content validity. Construct validity has been demonstrated through good correspondence with DSM-III diagnosis in community (Copeland, Dewey, & Griffiths-Jones, 1990) and medical samples (Ames, Flynn, Tuckwell, & Harrigan, 1994), although GMS/AGECAT is more inclusive than DSM-IV major or minor depression (de la Cámara et al., 2008). Although the GMS has shown validity in US and UK samples (Copeland et al., 1976), a study involving 26 sites in India, China, Latin America, and Africa showed that sensitivity to depression varied widely by country (Prince et al., 2004). Taken together, these findings indicate that the GMS is a reliable and valid tool for diagnosing depression in older adults. A possible limitation is that it yields diagnoses based on the empirically derived AGECAT diagnostic criteria and not the more widely accepted DSM or ICD criteria.

Overall Evaluation

In summary, structured interviews require more time and training to administer than do unstructured interviews, but they can yield highly reliable diagnoses and may be particularly useful in the training of new clinicians. The SCID and the SADS require the most training and are among the lengthiest to administer, but they can also yield extremely reliable results. Neither has been evaluated fully in older adult samples. The CIDI and GMS offer flexibility because they can be administered by trained interviewers who are not clinicians. Thus, no single structured interview is clearly superior: The choice should depend on who will administer it and how much time can be invested.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

It is important to note that all the instruments described next and in the sections on treatment monitoring and evaluation focus on the nature and severity of depressive symptoms.

Self-Report Measures

Several self-report depressive symptom measures may prove useful for the development of case conceptualizations and treatment plans for older adults (Table 8.2). One
of the most well-​known depressive symptom measures in use today is the Beck Depression Inventory-​II (BDI-​II; Beck, Steer, & Brown, 1996). The BDI-​II is a 21-​item measure, scored using a 4-​point Guttman scale asking respondents to indicate how they felt during the course of the past 2 weeks (including the day of administration). The BDI-​II takes approximately 10 minutes to administer and has been translated into 14 languages. In terms of case conceptualization and treatment planning, the BDI-​II provides information about somatic–​affective and cognitive dimensions of depression (Steer, Ball, Ranieri, & Beck, 1999). In mixed-​age samples (using participants aged 19–​80 years), the BDI-​II scores have demonstrated very good internal consistency and test–​retest reliability, and they are highly correlated with other measures of distress and psychopathology. The BDI-​II scores have also demonstrated very good internal consistency and very good test–​retest reliability in samples of older adults hospitalized in a geriatric psychiatry unit (Steer, Rissmiller, & Beck, 2000)  and in samples of community-​dwelling older adults (Segal, Coolidge, Cahill, & O’Riley, 2008), although one study found that the BDI-​II did not perform as well as the Geriatric Depression Scale in a sample of older women (Jefferson, Powers, & Pope, 2001). In addition, in examining the use of the original BDI with older adults, some researchers have posited that older women may be more hesitant to complete the measure (especially a question related to sexual interest) than other measures of depression (Jefferson et al., 2001) and that the somatic items may be confounded with physical illness in older adults (Clark, Cavanaugh, & Gibbons, 1983). In addition, some researchers have suggested that the complexity of the Guttman-​type response options may limit the assessment’s utility in older adults with any cognitive dysfunction (Clark et  al., 1983). Given these concerns and the paucity of research examining the use of the BDI-​II in older adults, the BDI-​II is not highly recommended at this time. Another self-​ report measure is the Center for Epidemiological Studies–​ Depression Scale (CES-​ D; Radloff, 1977). The CES-​D is a 20-​item measure in which individuals are asked to respond to items on a 4-​point Likert-​type scale based on how they felt during the past week. In terms of case conceptualization and treatment planning, factor analyses show that the CES-​D can provide clinicians information about depressed mood, psychomotor retardation, the absence of well-​ being, and interpersonal difficulties (Gatz, Johansson, Pederson, Berg, & Reynolds, 1993). Scores on the CES-​D have demonstrated good internal consistency, test–​retest reliability,

Table 8.2  Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument  Norms  Internal Consistency  Inter-Rater Reliability  Test–Retest Reliability  Content Validity  Construct Validity  Validity Generalization  Clinical Utility  Highly Recommended

Self-Report Measures
BDI-II  A   G   NA  G   A   A   A   A
CES-D   E   G   NA  G   A   G   G   A
GDS     G   G   NA  G   A   A   A   L
PHQ-9   E   G   G   G   A   G   G   E
SDS     A   A   NA  NR  A   A   A   A

Structured Clinical Interviews
SCID    NA  NA  E   NR  A   NR  G   A
SADS    NA  NA  E   NR  A   NR  G   A
GMS     NA  NA  G   G   G   G   G   A

Clinician Rating Scales
GDRS    A   E   G   NR  A   NR  A   A
MADRS   A   A   E   NR  A   A   A   A

Measures to Assess Depression in Dementia
CSDD    A   E   A   A   E   E   E   A
DMAS    A   A   A   NR  NR  A   A   A

Note: BDI-II = Beck Depression Inventory-II; CES-D = Center for Epidemiological Studies–Depression Scale; GDS = Geriatric Depression Scale; PHQ-9 = Patient Health Questionnaire-9; SDS = Zung Self-Rating Depression Scale; SCID = Structured Clinical Interview for DSM; SADS = Schedule for Affective Disorders and Schizophrenia; GMS = Geriatric Mental State Schedule; GDRS = Geriatric Depression Rating Scale; MADRS = Montgomery–Åsberg Depression Rating Scale; CSDD = Cornell Scale for Depression in Dementia; DMAS = Dementia Mood Assessment Scale; L = Less Than Acceptable; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

convergent validity, and criterion validity with respect to diagnostic instruments (Head et al., 2013; Radloff, 1977). Many researchers have demonstrated that it has measurement equivalence across the lifespan (e.g., Gatz et  al., 1993). A concern with this measure is that reverse-​scored items have been shown to be problematic for older adults (Carlson et al., 2011). The authors recommend imputing reversed items based on responses to non-​reversed items. Eaton, Muntaner, Smith, Tien, and Ybarra (2004) developed a revised version, the CESD-​R, that contains no reverse-​scored items. The CESD-​R also includes several symptoms of major depression that were not included in the CES-​D (weight changes and thoughts of suicide) and eliminates items in the CES-​D not related to the current definition of major depression. Scores on the CESD-​R have been found to have good reliability and some convergent validity (it is highly correlated with the CES-​ D; Eaton et  al., 2004; Van Dam & Earleywine, 2011). Although promising, this measure has not yet been thoroughly evaluated with older adults. Despite the strengths of the CES-​D (and, possibly, the CESD-​R), researchers have found problems with the use of this measure in older adults. Some investigators have suggested that the measure may contain items that are biased against being
older, female, widowed, or having a physical disorder (Grayson, MacKinnon, Jorm, Creasey, & Broe, 2000). Other researchers have demonstrated that the CES-​D is only modestly related to symptoms reported in structured interviews and that it often identifies individuals who either do not meet criteria for depression or meet criteria for other diagnoses (Myers & Weissman, 1980). This last criticism may be reflecting the fact that older adults generally exhibit more subsyndromal rather than syndromal symptoms of depression (Newman, 1989). When examining the criticisms made against this measure, it appears that these issues are most likely problems related to the use of the CES-​D as a diagnostic tool. The evidence suggests that the CES-​D can be recommended as an instrument for case conceptualization and treatment planning, especially as part of a multimethod assessment that also includes a thorough clinical interview. A widely used self-​report measure designed to assess depression in older adults is the Geriatric Depression Scale (GDS; Yesavage et  al., 1983). The GDS is a 30-​ item measure with a yes/​ no answer format. There is also a shortened version of the GDS consisting of 15 items (Sheikh & Yesavage, 1986). In response to concerns that depression measures are often confounded
by physical illness in older adults, the developers of the GDS excluded somatic items from the scale. In terms of its use for case conceptualization and treatment planning, the measure provides information about unhappiness, apathy, anxiety, loss of hope, and energy loss (Onishi, Suzuki, & Umegaki, 2006). The GDS scores have shown good internal consistency (Yesavage et  al., 1983), good test–​retest reliability (Lesher, 1986), and good convergent validity (Lesher, 1986; Yesavage et  al., 1983). Scores on the measure have been shown to have good reliability and validity in several settings, including primary care clinics (Mitchell, Bird, Rizzo, & Meader, 2010), home care settings (Marc, Raue, & Bruce, 2008), and long-​term care facilities (Li et al., 2015). Interestingly, in a meta-​analysis of primary care studies, the 15-​item version of the GDS had better sensitivity and specificity than the full-​length GDS (Mitchell et al., 2010). In terms of its use with older adults, numerous researchers have questioned the strategy of eliminating somatic items from a depression measure (Karel et al., 2002). As mentioned previously, older adults tend to endorse cognitive items less frequently than do younger adults. Thus, eliminating the somatic items may reduce the sensitivity of the test. Furthermore, when individual somatic symptoms are examined, only appetite disturbance seems to be entirely confounded with age (Norris et  al., 2004), which suggests there is no reason to eliminate the valuable information the somatic items on depression scales provide. In addition, there are both strengths and limitations inherent in the yes/​no answer format of this measure. On the positive side, the format is not too cognitively demanding. As such, it may be useful for older adults with cognitive dysfunction; however, several studies have demonstrated that the Cornell Scale for Depression in Dementia has better sensitivity and specificity for older adults with cognitive impairment in the United States and China (e.g., Kørner et  al., 2006), and one study demonstrated that the GDS had very poor sensitivity and specificity for detecting depression in older adults with dementia (Li et al., 2015). In addition, in one survey, older adults indicated they did not like the forced-​ choice aspect of this measure (Fischer, Rolnick, Jackson, Garrard, & Luepke, 1996). Because this measure may not be sensitive to somatic presentations of depression and because there may be some issues with the format of the measure, the GDS should be used with caution for case conceptualization and treatment planning in older adults. Another self-​report measure that may be useful in case conceptualization and treatment planning for older adults is the Patient Health Questionnaire-​9 (PHQ-​9; Kroenke & Spitzer, 2002; Kroenke, Spitzer, & Williams, 2001).
The PHQ-9 is a 9-item self-report depression measure that asks respondents to indicate how frequently during the past 2 weeks they have experienced each of the nine symptoms specified in the DSM-5 criteria for a major depressive disorder. The PHQ-9 takes 5 to 10 minutes to administer. It was originally developed for use in medical settings (outpatient primary care and obstetrics and gynecology offices), and older adults were included in the initial normative samples (Kroenke et al., 2001; Kroenke & Spitzer, 2002). The PHQ-9 scores have demonstrated reliability and construct validity in a wide variety of settings and populations (Kroenke et al., 2001; Kroenke & Spitzer, 2002; Lamers et al., 2008; Phelan et al., 2010). The PHQ-9 scores have demonstrated very high sensitivity and specificity (compared to diagnostic interview—structured and unstructured) in samples of older adults (Lamers et al., 2008; Phelan et al., 2010), with greater sensitivity and specificity than GDS scores for older primary care patients (Phelan et al., 2010). Because the PHQ-9 was found to be a more effective depression screener than the Schedule for Affective Disorders and Schizophrenia and the observational items of the Minimum Data Set 2.0 for long-term care residents, the PHQ-9 was incorporated into the Minimum Data Set 3.0 (the data collected on all long-term care residents in the United States; Saliba et al., 2012). Overall, the PHQ-9 is highly recommended for case conceptualization and treatment planning in older adults.

A final self-report measure that might be useful in case conceptualization and treatment planning for older adults is the Zung Self-Rating Depression Scale (SDS; Zung, 1965). The SDS is a 20-item measure scored on a 4-point Likert-type scale. The SDS provides information about a lack of well-being and depressive affect (Schafer, 2006). Scores on the SDS have been reported to have good reliability and validity in mixed-age samples, including a normative sample with adults up to age 69 years (Zung, 1965). Scores on the SDS have shown adequate internal consistency in older adult samples (Dunn & Sacco, 1989), and there is some evidence for their validity in this population, especially as a screening measure (Dunn & Sacco, 1989); however, more research is needed before it can be recommended for case conceptualization and treatment planning in older adults.
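
As a concrete illustration of how brief self-report measures of this kind are typically scored, the sketch below sums item ratings of 0 to 3 and maps the total to the severity bands commonly cited for the PHQ-9 (Kroenke et al., 2001). The item responses are made up, the missing-data rule is an assumption added for illustration, and the published scoring instructions for whichever measure is used should always take precedence.

```python
# Hypothetical responses to nine items rated 0-3; these are not actual PHQ-9 items.
responses = [1, 2, 0, 3, 1, 0, 2, 1, 0]

def total_score(items, n_items=9, max_missing=0):
    """Sum the ratings; the missing-data rule here is an illustrative assumption."""
    answered = [r for r in items if r is not None]
    if len(answered) < n_items - max_missing:
        raise ValueError("too many missing items to score")
    return sum(answered)

# Severity bands as commonly cited for the PHQ-9 (Kroenke et al., 2001);
# confirm against the scoring instructions that accompany the measure.
BANDS = [(20, "severe"), (15, "moderately severe"), (10, "moderate"),
         (5, "mild"), (0, "minimal")]

def severity(score):
    return next(label for floor, label in BANDS if score >= floor)

score = total_score(responses)
print(score, severity(score))  # 10 moderate
```

A banded total of this kind flags severity for conceptualization and monitoring; consistent with the cautions above, it is not by itself a diagnosis.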

Structured Interviews

Because evidence suggests that structured interviews may result in a more thorough and comprehensive picture of a client's presenting problem compared to unstructured interviews (Segal et al., 1995), they may be useful for case conceptualization and treatment planning, especially when used in conjunction with other assessment methods. Structured interviews that could be utilized for this purpose include the SCID-CV (First et al., 2015), the SADS (Endicott & Spitzer, 1978), and the GMS (Copeland et al., 1976). All of these interviews were described previously, and Table 8.2 summarizes the utility of these measures for the purposes of case conceptualization and treatment planning in older adults. In addition, a version of the GMS developed specifically to assess depression severity (GMS-DS; Ravindran, Welburn, & Copeland, 1994) is particularly promising. It takes only 15 minutes to administer and has shown high internal consistency and good convergent validity with respect to self-report and clinician ratings (Ravindran et al., 1994). Replication is needed before the measure can be highly recommended. Overall, all of these interviews have good potential for the purposes of treatment planning and case conceptualization with older adults; however, each needs more empirical evaluation before it can be recommended for these purposes.

Clinician Rating Scales

Clinician rating scales are generally developed to assess severity of depression among individuals who have been diagnosed with the disorder; however, the kinds of questions asked in these scales may also provide useful information for case conceptualization and treatment planning. Table 8.2 summarizes properties of clinician rating scales that may prove useful for the development of case conceptualizations and treatment plans for older adults. Perhaps the most popular clinician rating scale is the Hamilton Rating Scale for Depression (HRSD; Hamilton, 1960, 1967), which is often utilized in treatment outcome studies. A comprehensive review of the HRSD across a wide range of samples concluded, however, that the measure's weaknesses outweigh its strengths (Bagby, Ryder, Schuller, & Marshall, 2004). In particular, the authors noted that individual items are poorly designed, the total score does not reflect a unidimensional structure, and the measure has not been updated despite numerous revisions in accepted diagnostic criteria for depression. There are multiple versions of the HRSD, which vary in length (17–27 items) and even in which domains of depression are addressed; all versions assess somatic symptoms, whereas only some versions include items to assess cognitive symptoms of helplessness, hopelessness,
and worthlessness. The HRSD requires a trained clinician and takes 30 minutes to administer to depressed older adults (Moberg et al., 2001). A meta-​analysis demonstrated that scores on the HRSD have acceptable internal consistency, inter-​rater reliability, and test–​retest reliability (Trajkovic et  al., 2011). Nonetheless, scores on the scales have questionable internal consistency in older adults (e.g., Cronbach’s α = .46; Hammond, 1998). In terms of validity, reports vary widely. Studies have repeatedly failed to confirm the unidimensionality of the HRSD (Bech, Paykel, Sireling, & Yiend, 2015; Cole et al., 2004). Concurrent validity has been reported to be equivalent to that of other clinician rating instruments in a sample of depressed older adults (Mottram, Wilson, & Copeland, 2000) but no better than that of the BDI in a community sample (Stukenberg et al., 1990). Rapp and colleagues (1990) used an extracted version of the HRSD and reported moderate to high correlations with the BDI, the SDS, and the GDS and better concurrent validity than these measures in a sample of older male medical inpatients; however, Baker and Miller (1991), also using a medical sample, reported that concurrent validity was lower than for the GDS. In a sample of older adults with dementia, validity was particularly problematic (sensitivity was 8% for the HRSD compared to 82% for the GDS; Lichtenberg, Marcopulos, Steiner, & Tabscott, 1992). Indeed, some researchers have suggested that the HRSD does not actually measure depression at all because factor analysis has shown that the scale may instead measure aspects of anxiety and insomnia (Cole et al., 2004; Hammond, 1998; Stukenberg et al., 1990). Finally, the disproportionate number of somatic items may make the measure especially problematic for use with older adults (Jamison & Scogin, 1992). Given all this, the HRSD is not recommended as a measure for treatment planning or case conceptualization with older adults. The Geriatric Depression Rating Scale (GDRS; Jamison & Scogin, 1992)  was developed in response to problems with the HRSD. It may provide information useful for treatment planning and case conceptualization in older adults because it is based on the GDS, with the addition of somatic items that are considered only if responses are not attributable to physical illness. The GDRS is a 35-​item measure that requires a trained interviewer and takes 35 minutes to administer. Scores have been found to be highly internally consistent, with fairly high inter-​rater reliability. Finally, scores on the measure have also been found by the developers to have some validity for older adults based on correlations with the HRSD, BDI, and GDS (Jamison & Scogin, 1992).
However, more research must be conducted before this measure can be recommended for case conceptualization and treatment planning. Another clinician rating scale that may be useful in case conceptualization and treatment planning is the Montgomery–​Åsberg Depression Rating Scale (MADRS; Montgomery & Åsberg, 1979). The MADRS is a 10-​item clinician rating scale of depression severity. Scores on the MADRS have been shown to have good inter-​rater reliability (Zimmerman, Posternak, & Chelminski, 2004), adequate construct validity (e.g., correlated with HRSD), and good sensitivity and specificity (Engedal et al., 2012; Hammond, 1998; Mottram et  al., 2000; Zimmerman et  al., 2004)  in older adult samples. Compared to the HRSD, the MADRS contains fewer somatic items and has a factor structure that more clearly measures aspects of the depression construct (dysphoria and anhedonia; Hammond, 1998). Like the HRSD, however, scores on the MADRS have only fair internal consistency in this population (Bent-​Hansen et al., 2003; Hammond, 1998). Thus, the MADRS may be a better alternative than the HRSD as a clinician rating instrument, and with modification, it could be a useful measure in this population, but it cannot be highly recommended at this time. The Inventory of Depressive Symptomatology (IDS; Rush et  al., 1986)  initially contained 28 items, but it was revised to include 30 items (Rush, Gullion, Basco, Jarrett, & Trivedi, 1996). Items are rated on a scale of 0 to 3.  Clinician-​rated (IDS-​C) and self-​report (IDS-​ SR) versions have equivalent item content. The IDS provides useful information for case conceptualization and treatment planning in terms of information about severity of symptoms. In younger and mixed-​age samples, the IDS scores have been found to be highly internally consistent (Rush et al., 1986, 1996) and have good inter-​rater reliability (Rush et  al., 1996). Finally, the measure demonstrates convergent validity (it is correlated with the HRSD and the BDI; Rush et al., 1986, 1996). There has been little research examining the IDS in older adults. A factor analysis of the IDS-​SR in older adults identified three factors—​mood, motivation, and somatic—​differing from the factor structure found in younger adults (Hegeman, Wardenaar, Comijs, de Waal, Kok, & van der Mast, 2012). Internal consistency was good for scores on the mood and motivation factors, but it was marginal for the somatic items. Although evidence supports the reliability and validity of the IDS in younger populations, there is not yet enough information about the use of the IDS in older adults for this scale to be highly recommended.
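
Internal consistency figures such as the Cronbach's α = .46 cited above are straightforward to compute from item-level data. The sketch below applies the standard formula to a small made-up response matrix (rows are respondents, columns are items); it is illustrative only and not tied to any particular instrument.

```python
# Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total scores).

def variance(values):
    # Sample variance (n - 1 denominator), applied consistently to items and totals.
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def cronbach_alpha(matrix):
    k = len(matrix[0])                          # number of items
    item_columns = list(zip(*matrix))           # transpose: one tuple of scores per item
    item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in matrix])
    return (k / (k - 1)) * (1 - item_var / total_var)

# Made-up 0-3 ratings from five respondents on four items.
data = [
    [2, 3, 2, 3],
    [1, 1, 2, 1],
    [3, 3, 3, 2],
    [0, 1, 1, 1],
    [2, 2, 3, 3],
]
print(round(cronbach_alpha(data), 2))  # 0.91 for this toy matrix
```

Values in the .46 range, as reported for the HRSD in older adults, would indicate that the items are not functioning well together as a single scale.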


Measures to Assess Depression in Dementia

Several measures that assess depression in individuals with dementia may be useful for case conceptualization and treatment planning (see Table 8.2). The most frequently used measure of this type is the Cornell Scale for Depression in Dementia (CSDD; Alexopoulos, Abrams, Young, & Shamoian, 1988a). The scale includes 19 items that are rated by a mental health professional on a 3-point scale (absent, mild or intermittent, and severe). Ratings are based on observation of the client as well as interviews with the client and a caregiver. Administration requires 30 minutes. The CSDD provides information about general depression, rhythm disturbance (including insomnia), agitation/psychosis, and negative symptoms (Ownby, Harwood, Acevedo, Barker, & Duara, 2001); however, factor analysis has demonstrated other factor structures in various settings (Barca, Selbæk, Laks, & Engedal, 2008; Harwood, Ownby, Barker, & Duara, 1998; Kurlowicz, Evans, Strumpf, & Maislin, 2002). Scores on the scale have good internal consistency and adequate inter-rater reliability (Alexopoulos et al., 1988a). Scores have been shown to distinguish individuals with dementia who meet criteria for depression from those who do not meet criteria (Alexopoulos et al., 1988a), and they are significantly correlated with other measures in the expected directions (Mack & Patterson, 1994). When comparing the CSDD to other depression measures, the CSDD has been shown to have better specificity and sensitivity than the GDS when administered to older adults with and without cognitive impairment (Kørner et al., 2006), and the CSDD was generally comparable to the MADRS in terms of sensitivity and specificity (Leontjevas, Gerritsen, Vernooij-Dassen, Smalbrugge, & Koopmans, 2012; Leontjevas, van Hooren, & Mulders, 2009). However, in one study, the MADRS was better at distinguishing depressed and nondepressed patients in a memory care clinic (Knapskog, Barca, & Engedal, 2011). Similarly, in one study, the CSDD had specificity and sensitivity similar to those of the observation version of the PHQ-9 (Phillips, 2012), and another study demonstrated that the sensitivity and specificity of the CSDD were equivalent to those of the Hamilton Depression Scale in older adults (Vida, Des Rosiers, Carrier, & Gauthier, 1994). The CSDD scores have been shown to reliably measure depression in individuals with Parkinson's disease (with and without cognitive impairment; Williams & Marsh, 2008) and in older adults without cognitive impairment (Alexopoulos, Abrams, Young, & Shamoian, 1988b). The CSDD has also been shown to be a valid

The CSDD has also been shown to be a valid measure of depression across a variety of settings, including memory care clinics (Hancock & Larner, 2015), long-term care facilities (Jeon et al., 2015), and inpatient settings (Barca, Engedal, & Selbæk, 2010). Furthermore, the CSDD is not confounded by cognitive status (Maixner, Burke, Roccaforte, Wengel, & Potter, 1995). In examining the CSDD critically, experienced interviewers who evaluated the instrument reported that instructions lack detail in places and that the focus on behaviors occurring within the past week may limit the measure's sensitivity, but the option to indicate "unable to rate" was particularly helpful (Mack & Patterson, 1994). One study demonstrated low concordance between answers on the CSDD completed by proxies and depressive symptoms endorsed when the CSDD was administered as a self-report measure (Towsley, Neradilek, Snow, & Ersek, 2012), but another study did not replicate this finding (Wongpakaran, Wongpakaran, & van Reekum, 2013), especially when the instrument was used with older adults with cognitive impairment. Some public health researchers have found that screening with the CSDD in large-scale public health initiatives aimed at long-term care residents does not result in improvements in depression care and creates undue burden on nursing staff (Davison et al., 2012; Jeon et al., 2015; Snowdon, Rosengren, Daniel, & Suyasa, 2010). Despite these criticisms, the CSDD is considered a psychometrically sound instrument for measuring depression severity in individuals with dementia for purposes of case conceptualization and treatment planning.

The Dementia Mood Assessment Scale (DMAS; Sunderland et al., 1988) includes 17 items that assess depressive symptoms over the past week in individuals with dementia, each rated on a 0 to 6 scale. Administered by a trained clinician, the measure involves a semi-structured interview and observation of the patient, along with input from collateral sources. Factor analyses vary slightly in summarizing the domains assessed by the DMAS; in the largest reported study, factors were depressed affect, environmental interaction, diurnal patterns, agitation or suspicion, and somatic indicators (Onega & Abraham, 1997). Adequate internal consistency (Camus, cited in Perrault, Oremus, Demers, Vida, & Wolfson, 2000), inter-rater reliability (Sunderland et al., 1988), and construct validity (Camus, cited in Perrault et al., 2000; Sunderland et al., 1988) have been reported for scores on the measure, but sample sizes have generally been small and little detail has been provided in some of the reports (Sunderland & Minichiello, 1996). Thus, this measure is promising, but more evidence of reliability and validity is needed before it could be highly recommended.
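
To make the CSDD rating scheme described above more concrete, the sketch below sums clinician ratings across the 19 items (0 = absent, 1 = mild or intermittent, 2 = severe) while tracking items marked "unable to rate." The handling of unratable items and the example ratings are assumptions for illustration; this is not an official scoring program, and no clinical cutoffs are implied.

```python
# Illustrative only: 0 = absent, 1 = mild or intermittent, 2 = severe, None = unable to rate.
def score_csdd(ratings):
    """Sum clinician ratings across the 19 CSDD items, tracking 'unable to rate' items."""
    if len(ratings) != 19:
        raise ValueError("The CSDD has 19 items.")
    rated = [r for r in ratings if r is not None]
    if any(r not in (0, 1, 2) for r in rated):
        raise ValueError("Each rated item must be 0, 1, or 2.")
    return {
        "total": sum(rated),
        "items_rated": len(rated),
        "unable_to_rate": ratings.count(None),
    }

example = [1, 0, 2, 1, 0, None, 1, 1, 0, 0, 2, 1, 0, 0, 1, None, 0, 1, 0]
print(score_csdd(example))  # {'total': 11, 'items_rated': 17, 'unable_to_rate': 2}
```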

In addition to scales that measure depression specifically, several instruments have been developed to assess for depressive symptoms among other disturbances in persons with dementia. Although these instruments may be useful in case conceptualization and treatment planning, in most cases, psychometric information is not given specifically for the depression subscale. One exception is the Neuropsychiatric Inventory (NPI; Cummings et al., 1994), a semi-structured interview that is conducted with a knowledgeable informant. The measure includes questions to screen for the presence of depressed mood, apathy, and 10 other psychiatric symptoms. Each screening question is followed by a series of questions to confirm the presence or absence of the symptom, along with ratings of symptom frequency and severity. Although psychometric information specific to subscales is limited, scores on the NPI as a whole demonstrate acceptable to good reliability and good validity (Cummings et al., 1994). Employing a version developed for use in nursing homes (NPI-NH), Wood and colleagues (2000) showed that the depression and apathy subscales correlated moderately with research observations. Thus, the NPI or NPI-NH may be useful in detecting depression among individuals with dementia, but further work is needed to establish the validity of the relevant subscales specifically for this purpose.

Overall Evaluation

Formalized assessment can provide clinicians with invaluable information for case conceptualization and treatment planning when working with older adults who are demonstrating symptoms of depression; however, for measurement devices to be useful for this purpose, they must be chosen based on reliability, validity, and utility in older adult populations. The preceding section examined several different types of measures commonly used in the assessment of depression in older adults, and in terms of fulfilling the goal of case conceptualization and treatment planning, several measures stand out as being particularly useful for these purposes. When using a self-report measure, it is recommended that the PHQ-9 (Kroenke et al., 2001; Kroenke & Spitzer, 2002) or the CES-D (Radloff, 1977) be considered before any other measures because they are well validated in older adults and provide information across several domains of depressive symptoms. Several structured clinical interviews may also provide useful information for case conceptualization and treatment planning for older adults with depression.

The GMS-DS (Ravindran et al., 1994) is a promising structured interview because it was developed specifically for use with older adults, preliminary data show good reliability and validity, and it requires the least time to administer, but further evaluation is needed. Although there are some promising clinician rating scales, at present it is difficult to recommend a specific measure for treatment planning and case conceptualization. Finally, when assessing depression in older adults with dementia, it would be worth considering the CSDD because it has adequate to good reliability and good validity in this population.

Whether self-report, structured interview, or clinician rating scales are used for treatment planning and case conceptualization, it is important to keep in mind that each type of measure is subject to biases and, therefore, a multimethod approach may be the most effective way to formulate case conceptualizations and treatment plans. In particular, it is important to address other key aspects of case conceptualization within an informal clinical interview. For older adults, it is particularly important to assess general social functioning, medical health and medications, and the ability to perform activities of daily living (e.g., walking, dressing, and eating) and instrumental activities of daily living (e.g., balancing a checkbook, grocery shopping, and cooking; Karel et al., 2002). In addition, it may be useful to utilize a more formal measure of overall functioning. For example, the Short-Form Health Survey (SF-36; Ware & Sherbourne, 1992) is a well-validated measure of several components of functioning, including physical function, role limitations, bodily pain, general health, vitality, social functioning, emotional functioning, and mental health. Finally, every evaluation of an older adult client for the purposes of case conceptualization should include some sort of evaluation of cognitive functioning, such as the Mini-Mental State Examination (Folstein, Folstein, & McHugh, 1975) or the Neurobehavioral Cognitive Status Examination (COGNISTAT; Kiernan, Mueller, Langston, & Van Dyke, 1987).

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

Self-Report Measures

Because of their efficiency and cost-effectiveness, self-report measures can be very useful for treatment monitoring and measuring treatment outcomes.

Most self-report measures for depression can be administered in less than 20 minutes, and several of the instruments described here have shorter versions that can be administered in just a few minutes. Despite the usefulness of self-report measures in terms of their inherent efficiency, there are some limitations to the use of these measures for treatment monitoring and outcome assessment. Specifically, when used repeatedly, they may be subject to internal validity problems because of carryover effects (Whitley, 2002). There are several self-report measures that may prove useful for treatment monitoring and measuring treatment outcomes in older adults with depressive symptoms (see Table 8.3). For a complete discussion of self-report measures, see the previous section on measures used for case conceptualization and treatment planning. In this section, we discuss in detail only the two measures most recommended for treatment monitoring and measuring treatment outcomes.

One self-report measure that may prove particularly useful for treatment monitoring and assessing outcomes in older adults is the CES-D (Radloff, 1977; discussed previously). The CES-D may be a particularly advantageous self-report measure because it explicitly instructs respondents to reflect on symptoms during the past week, which may lessen testing effects. In addition, the CES-D may be a useful measure of change because its scores have demonstrated good test–retest reliability (Radloff, 1977). Researchers have also demonstrated that changes in CES-D scores are significantly related to changes in older patients' self-report ratings of change in their depressive symptoms (Datto, Thompson, Knott, & Katz, 2006). In terms of efficiency, there are several CES-D short forms available. For example, 8- and 10-item versions, respectively known as the Boston and the Iowa forms, yield scores that have both demonstrated good reliability and validity with older adults (Kohout, Berkman, Evans, & Cornoni-Huntley, 1993). All in all, the CES-D is recommended as a good assessment instrument for tracking treatment progress and outcomes in older adults with depressive symptoms.

In addition, the PHQ-9 (Kroenke et al., 2001; Kroenke & Spitzer, 2002; see the previous section for information about reliability and validity) is a very useful tool for assessing treatment progress and outcomes in older adults. As mentioned previously, scores on the PHQ-9 have demonstrated very good reliability and validity in a variety of older adult populations (Lamers et al., 2008; Phelan et al., 2010). The PHQ-9 has also been successfully used for treatment monitoring in several large-scale randomized controlled trials of depression treatment in older adults (Bernd, Unützer, Callahan, Perkins, & Kroenke, 2004; Ciechanowski et al., 2004).
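
As a concrete illustration of monitoring with a brief self-report measure, the sketch below totals the nine PHQ-9 items (each rated 0 to 3), labels the conventional severity bands reported by Kroenke et al. (2001), and tracks change across sessions. The five-point change flag is an assumption included only for illustration, a benchmark sometimes used in practice rather than a criterion endorsed in this chapter, and the session data are hypothetical.

```python
# Severity bands follow Kroenke et al. (2001); the >= 5-point change flag is an
# illustrative assumption, not a rule taken from this chapter.
SEVERITY_BANDS = [(0, "minimal"), (5, "mild"), (10, "moderate"),
                  (15, "moderately severe"), (20, "severe")]

def phq9_total(item_scores):
    """Total score for one administration: nine items, each rated 0-3."""
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("Expected nine item scores, each rated 0-3.")
    return sum(item_scores)

def severity(total):
    """Return the highest severity band whose floor the total reaches."""
    label = SEVERITY_BANDS[0][1]
    for floor, band in SEVERITY_BANDS:
        if total >= floor:
            label = band
    return label

def monitor(sessions):
    """Print total, severity band, and change from the previous session."""
    previous = None
    for week, items in sessions:
        total = phq9_total(items)
        change = "" if previous is None else f", change {total - previous:+d}"
        flag = " (>= 5-point change)" if previous is not None and abs(total - previous) >= 5 else ""
        print(f"Week {week}: {total} ({severity(total)}){change}{flag}")
        previous = total

monitor([
    (0, [2, 2, 1, 2, 1, 1, 2, 1, 0]),  # hypothetical baseline, total 12 (moderate)
    (4, [1, 1, 1, 1, 1, 0, 1, 0, 0]),  # total 6 (mild)
    (8, [0, 1, 0, 1, 0, 0, 0, 0, 0]),  # total 2 (minimal)
])
```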

Table 8.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended

Self-Report Measures
BDI-II | A | G | NA | G | A | A | A | A | A |
CES-D | E | G | NA | G | A | G | G | G | A |
GDS | G | G | NA | G | A | A | A | L | L |
PHQ-9 | E | G | G | G | A | G | E | E | E |

Clinician Rating Scales
GDRS | A | E | G | NR | A | NR | A | G | A |
MADRS | A | A | E | NR | A | A | A | G | A |

Measure to Assess Depression in Dementia
CSDD | A | E | E | E | A | A | A | E | A |

Note: BDI-II = Beck Depression Inventory-II; CES-D = Center for Epidemiological Studies–Depression Scale; GDS = Geriatric Depression Scale; PHQ-9 = Patient Health Questionnaire-9; GDRS = Geriatric Depression Rating Scale; MADRS = Montgomery–Åsberg Depression Rating Scale; CSDD = Cornell Scale for Depression in Dementia; L = Less Than Acceptable; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

There is also evidence that this measure has excellent treatment utility (i.e., staff were able to use the PHQ-9 results to appropriately and successfully address depressive symptoms in long-term care facilities) and is easy to use (Saliba et al., 2012). The PHQ-9 is highly recommended as a self-report measure of treatment response in older adults.

Structured Interviews

Unlike self-report measures, structured interviews are often time-intensive and require extensive training for administration; as such, they are generally not used for treatment monitoring. However, as noted previously, administration times vary and short forms are available. In addition, structured interviews may provide useful information about treatment outcomes. The structured interview that appears to have the most utility for the assessment of treatment outcomes is the GMS-DS (Ravindran et al., 1994; discussed previously), which was specifically designed to assess changes in depression levels in older adults, although further evaluation is needed. It takes 15 minutes to administer to depressed older adults and has demonstrated good reliability and validity (Ravindran et al., 1994). As a measure of treatment outcomes, the GMS-DS performs well, with pre- to post-treatment change scores correlating .89 and .85, respectively, with clinician and patient ratings of improvement (Ravindran et al., 1994). Unfortunately, there is little research examining the utility of the GMS-DS, but available evidence suggests it is a promising measure for assessing outcomes in depressed older adults.

Clinician Rating Scales

Several clinician rating scales have been evaluated for the purposes of treatment monitoring and assessment of treatment outcomes in older adults (see Table 8.3). The HRSD (Hamilton, 1960; discussed previously) is often touted as a good measure for both treatment monitoring and assessments of outcome, but many problems have been noted (e.g., Bagby et al., 2004). The HRSD scores do not have good reliability in older adults and may not actually be measuring depression (Baker & Miller, 1991; Hammond, 1998; Lichtenberg et al., 1992), which makes this measure's usefulness in tracking changes in depression in older adults questionable. Also, some researchers have found that it can be time-consuming and difficult to administer to some populations of older adults (Baker & Miller, 1991), which suggests that this measure may be inconvenient to administer repeatedly. Given these concerns, the HRSD is not recommended for treatment monitoring and assessment of treatment outcome in depressed older adults.

A measure that may prove useful for treatment monitoring and outcome assessments in depressed older adults is the GDRS (Jamison & Scogin, 1992; discussed previously). The GDRS scores have fairly good inter-rater reliability (Jamison & Scogin, 1992), but test–retest reliability has not been assessed.

It is somewhat lengthy to administer (35 minutes), so it may be more appropriate for outcome assessment than for monitoring treatment progress. Although there is not enough empirical support for the GDRS (particularly as a measure of change) to recommend its use at this time, it appears to be promising as an outcome measure.

The MADRS (Montgomery & Åsberg, 1979) was specifically designed to be sensitive to change with treatment. However, MADRS scores have only fair internal consistency in older adult samples (Bent-Hansen et al., 2003) and thus may not be stable enough to assess change reliably. On the positive side, scores on the measure have shown very good inter-rater reliability (Zimmerman et al., 2004) and fairly good efficiency (Mottram et al., 2000) when used with older adults. Finally, the measure has been shown to differentiate significantly between placebo and the maintenance phase of treatment in older adults (Bent-Hansen et al., 2003). Despite these promising findings, the problems with this measure's score reliability and the relative dearth of research examining this measure in older adults suggest that it cannot be recommended for the assessment of progress and outcomes in the treatment of older adults with depression until more research evaluating the MADRS is conducted.

The IDS (Rush, Gullion, Basco, Jarrett, & Trivedi, 1996; discussed previously) may prove useful for measuring treatment progress and outcomes, although it has not yet been empirically examined in older adults. In particular, the measure's fairly good internal consistency (Rush et al., 1996) suggests that it is stable enough to be used as a repeat measure. However, the IDS has not been assessed for test–retest reliability, which is problematic in terms of its use as a measure for treatment monitoring. Despite the promising evidence from research in mixed-age samples, the IDS cannot be recommended for use with older adults until its properties are evaluated in this population.

Measures to Assess Depression in Dementia

The CSDD (Alexopoulos et al., 1988a; discussed previously) may be useful for monitoring the outcome of depression treatment in individuals who have dementia. Scores on this rating scale have demonstrated adequate to good reliability and good validity, as described previously, and the ability to detect treatment effects has been demonstrated (Mayer et al., 2006).

As Perrault and colleagues (2000) cautioned, however, most evidence for the reliability and validity of scores on this measure derives from inpatient samples and may not be generalizable to community-dwelling individuals with dementia. The DMAS (Sunderland et al., 1988; discussed previously) scores have adequate reliability and validity, but test–retest reliability and sensitivity to change have not yet been demonstrated. Consequently, the use of this scale for outcome monitoring is not recommended until further research is conducted.

Overall Evaluation

Given the measurement considerations described previously in this section, several measures stand out as potentially useful for treatment monitoring and assessment of treatment outcomes in older adults with depression. Among self-report measures, the CES-D (Radloff, 1977) and the PHQ-9 (Kroenke et al., 2001; Kroenke & Spitzer, 2002) may be the most useful assessment tools for this purpose. For structured interviews, the GMS-DS (Ravindran et al., 1994) is the most promising assessment tool, but it is not yet supported by sufficient research. In terms of clinician rating scales, several scales appear promising for treatment monitoring and evaluation, but more research is needed before any can be recommended. The CSDD can provide a reliable and valid measure of change in depression severity among individuals with dementia. In general, it is important to keep in mind that each type of instrument is vulnerable to testing effects and other threats to internal validity if administered repeatedly. Thus, it would be important to obtain data from multiple measures when making decisions about how effective treatment was for a particular client.
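
One general way to act on this caution is to ask whether an observed pre- to post-treatment difference exceeds what measurement error alone could produce. The sketch below implements the Jacobson and Truax reliable change index as one such check; it is offered as a generic illustration rather than a procedure recommended in this chapter, and the standard deviation and test–retest reliability values are hypothetical placeholders that would need to come from data on an appropriate older adult sample.

```python
import math

def reliable_change_index(pre, post, sd, test_retest_r):
    """Jacobson & Truax RCI: raw difference divided by the standard error of the difference."""
    se_measurement = sd * math.sqrt(1 - test_retest_r)
    se_difference = math.sqrt(2 * se_measurement ** 2)
    return (post - pre) / se_difference

# Hypothetical values: a CES-D drop from 28 to 16; sd and r are placeholders, not published norms.
rci = reliable_change_index(pre=28, post=16, sd=9.0, test_retest_r=0.85)
print(round(rci, 2),
      "reliable change" if abs(rci) > 1.96 else "within measurement error")
```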

CONCLUSIONS AND FUTURE DIRECTIONS

Our examination of the assessment of depression in late life leads to several conclusions. First, assessing depression in older adults poses unique challenges to clinicians. Many older adults suffer from physical illnesses that result in symptoms similar to somatic symptoms of depression. Given this fact, differential diagnosis of depression in this population can be difficult. Failure to account for this difficulty may result in overidentification of depression in this age group. Eliminating somatic symptoms from measures of depression seems less than ideal because somatic symptoms are often prominent in late life depression. It appears that the best solution at this time is to incorporate information from different types of measures into the assessment process.

Another challenge posed when assessing late life depression centers on differentially diagnosing depression and dementia. Cognitive symptoms of depression are prominent in late life, and it can often be difficult to determine whether these symptoms truly reflect depression or whether the patient is experiencing cognitive decline. Again, measures often fail to take this difficulty into account. In addition, due to the complexity of their format, some measures of depression can present real difficulties for older adults experiencing mild cognitive decline. This difficulty may limit the validity of such measures in older adults.

A further challenge is that older adults often display symptoms of depression that differ from those presented by younger and middle-aged adults; thus, measures of depression need to be validated and normed with older adults specifically before they can be considered valid in this population. Unfortunately, this research step has not yet been taken for many measures. Similarly, different subgroups should be considered when measuring depression in older adults. Although many instruments originally developed for younger adults have demonstrated validity in healthy older adults, it may be imprudent to utilize these same instruments with individuals with either physical illness or cognitive impairment. Before using any measure of depression with an older adult, it is essential to obtain information about medical illnesses and cognitive functioning and to use measures validated specifically with these populations.

More research is needed on the assessment of depression in older adults. Most depression instruments need to be more thoroughly evaluated for their utility with older adult clients. Some instruments that have been designed specifically for this population appear promising, yet they, too, require further validation. It may be useful to develop measures specifically to assess depression in older adults for the purposes of diagnosis, case conceptualization, treatment planning, treatment monitoring, and the assessment of treatment outcomes. An area that remains to be addressed is the empirical examination of clinical utility. Despite the challenges of assessing depression in older adults, some instruments show evidence of good reliability and validity—it now remains to be established whether their use improves clinical outcomes.

References

Alexopoulos, G. S., Abrams, R. C., Young, R. C., & Shamoian, C. A. (1988a). Cornell Scale for Depression in Dementia. Biological Psychiatry, 23, 271–284.

Alexopoulos, G. S., Abrams, R. C., Young, R. C., & Shamoian, C. A. (1988b). Use of the Cornell Scale in nondemented patients. Journal of the American Geriatrics Society, 36, 230–​236. Alexopoulos, G. S., Meyers, B. S., Young, R. C., Campbell, S., Silbersweig, D., & Charlson, M. (1997). “Vascular depression” hypothesis. Archives of General Psychiatry, 54, 915–​922. Alexopoulos, G. S., Meyers, B. S., Young, R. C., Kalayam, B., Kakuma, T., Gabrielle, M.,  .  .  .  Hull, J. (2000). Executive dysfunction and long-​term outcomes of geriatric depression. Archives of General Psychiatry, 57, 285–​290. American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Ames, D., Flynn, E., Tuckwell, V., & Harrigan, S. (1994). Diagnosis of psychiatric disorder in elderly general and geriatric hospital patients:  AGECAT and DSM-​III-​R compared. International Journal of Geriatric Psychiatry, 9, 627–​633. Andreas, S., Schulz, H., Volkert, J., Dehoust, M., Sehner, S., Suling, A.,  .  .  .  Härter, M. (2017). Prevalence of mental disorders in elderly people:  The European MentDis_​ICF65+ study. British Journal of Psychiatry, 210, 125–​131. Bagby, R. M., Ryder, A. G., Schuller, D. R., & Marshall, M. B. (2004). The Hamilton Depression Rating Scale:  Has the gold standard become a lead weight? American Journal of Psychiatry, 161, 2163–​2177. Baker, F. M., & Miller, C. L. (1991). Screening a skilled nursing home population for depression. Journal of Geriatric Psychiatry and Neurology, 4, 218–​221. Baldwin, R. C., & Tomenson, B. (1995). Depression in later life: A comparison of symptoms and risk factors in early and late onset cases. British Journal of Psychiatry, 167, 649–​652. Barca, M. L., Engedal, K., & Selbæk, G. (2010). A reliability and validity study of the Cornell Scale among elderly inpatients, using various clinical criteria. Dementia and Geriatric Cognitive Disorders, 29, 438–​447. Barca, M. L., Selbæk, G., Laks, J., & Engedal, K. (2008). The pattern of depressive symptoms and factor analysis of the Cornell Scale among patients in Norwegian nursing homes. International Journal of Geriatric Psychiatry, 23, 1058–​1065. Basco, M. R., Bostic, J. Q., Davies, D., Rush, A. J., Witte, B., Hendrickse, W., & Barnett, V. (2000). Methods to improve diagnostic accuracy in a community mental health setting. American Journal of Psychiatry, 157, 1599–​1605.

Bech, P., Paykel, E., Sireling, L., & Yiend, J. (2015). Rating scales in general practice depression:  Psychometric analyses of the Clinical Interview for Depression and the Hamilton Rating Scale. Journal of Affective Disorders, 171, 68–​73. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Comparison of Beck Depression Inventories-​IA and -​II in psychiatric outpatients. Journal of Personality Assessment, 67, 588–​597. Bent-​ Hansen, J., Lunde, M., Klysner, R., Andersen, M., Tanghøj, P., Solstad, K., & Bech, P. (2003). The validity of the depression rating scales in discriminating between Citalopram and placebo in depression recurrence in the maintenance therapy of elderly unipolar patients with major depression. Pharmacopsychiatry, 36, 313–​316. Bernd, L., Unützer, J., Callahan, C., Perkins, A., & Kroenke, K. (2004). Monitoring depression treatment outcomes with the Patient Health Questionnaire-​9. Medical Care, 42, 1194–​1201. Blazer, D., Bachar, J. R., & Hughes, D. C. (1987). Major depression with melancholia: A comparison of middle-​ aged and elderly adults. Journal of the American Geriatrics Society, 35, 927–​932. Blumenthal, M. D. (1980). Depressive illness in old age: Getting behind the mask. Geriatrics, 35, 34–​43. Brodaty, H., Peters, K., Boyce, P., Hickie, I., Parker, G., Mitchell, P., & Wilhelm, K. (1991). Age and depression. Journal of Affective Disorders, 23, 137–​149. Carlson, M., Wilcox, R., Chou, C.-​P., Chang, M., Yang, F., Blanchard, J.,  .  .  .  Clark, F. (2011). Psychological Assessment, 23, 558–​562. Chopra, H. P., Zubritsky, C., Knott, K., Ten Have, T., Hadley, T., Coyne, J. C., & Oslin, D. W. (2005). Importance of subsyndromal symptoms of depression in elderly patients. American Journal of Geriatric Psychiatry, 13, 597–​606. Christensen, H., Jorm, A. F., MacKinnon, A. J., Korten, A. E., Jacomb, P. A., Henderson, A. S., & Rodgers, B. (1999). Age differences in depression and anxiety symptoms: A structural equation modelling analysis of data from a general population sample. Psychological Medicine, 29, 325–​339. Ciechanowski, P., Wagner, E., Schmaly, K., Schwartz, S., Williams, B., Diehr, P.,  .  .  .  LoGerfo, J. (2004). Community integrated home-​ based depression treatment in older adults:  A randomized controlled trial. Journal of the American Medical Association, 291, 1569–​1577. Clark, D. C., Cavanaugh, S. V., & Gibbons, R. D. (1983). The core symptoms of depression in medical and psychiatric patients. Journal of Nervous and Mental Disease, 17, 705–​713. Cole, J. C., Motivala, S. J., Dang, J., Lucko, A., Lang, N., Levin, M. J.,  .  .  .  Irwin, M. R. (2004). Structural
validation of the Hamilton Depression Rating Scale. Journal of Psychopathology and Behavioral Assessment, 26, 241–​254. Copeland, J. R. M., Dewey, M. E., & Griffiths-​Jones, H. M. (1990). Dementia and depression in elderly persons: AGECAT compared with DSM III and pervasive illness. International Journal of Geriatric Psychiatry, 5, 47–​51. Copeland, J. R. M., Dewey, M. E., Henderson, A. S., Kay, D. W. K., Neal, C. D., Harrison, M. A. M., . . . Shiwach, R. (1988). The Geriatric Mental State (GMS) used in the community:  Replication studies of the computerized diagnosis AGECAT. Psychological Medicine, 18, 219–​223. Copeland, J. R. M., Kelleher, M. J., Kellett, J. M., Gourlay, A. J., Gurland, B. J., Fleiss, J. L., & Sharpe, L. (1976). A semi-​structured clinical interview for the assessment of diagnosis and mental state in the elderly: The Geriatric Mental State Schedule: I. Development and reliability. Psychological Medicine, 6, 439–​449. Cuijpers, P., Andersson, G., Donker, T., & Van Straten, A. (2011). Psychological treatment of depression:  Results of a series of meta-​analyses. Nordic Journal of Psychiatry, 65, 354–​364. Cummings, J. L., Mega, M., Gray, K., Rosenberg-​ Thompson, S., Carusi, D. A., & Gornbein, J. (1994). The Neuropsychiatric Inventory:  Comprehensive assessment of psychopathology in dementia. Neurology, 44, 2308–​2314. Curtain, S. C., Warner, M., & Hedegaard, H. (2016). Increase in suicide in the United States, 1999–​2014 (NCHS Data Brief No 241). Hyattsville, MD:  National Center for Health Statistics. Datto, C. J., Thompson, R., Knott, K., & Katz, I. R. (2006). Older adult report of change in depressive symptoms as a treatment decision tool. Journal of the American Geriatrics Society, 54, 627–​631. Davison, T. E., Snowdon, J., Castle, N., McCabe, M. P., Mellor, D., Karantzas, G., & Allan, J. (2012). An evaluation of a national program to implement the Cornell Scale for Depression in Dementia into routine practice in aged care facilities. International Psychogeriatrics, 24, 631–​641. de la Cámara, C., Saz, P., López-​Antón, R., Ventura, T., Día, J., & Lobo, A. (2008). Depression in the elderly community:  I. Prevalence by different diagnostic criteria and clinical profile. European Journal of Psychiatry, 22, 131–​140. Depp, C. A., & Jeste, D. V. (2004). Bipolar disorder in older adults: A critical review. Bipolar Disorders, 6, 343–​367. Devenand, D. P. (2014). Dysthymic disorder in the elderly population. International Psychogeriatrics, 26, 39–​48. Dozois, D. J.  A., & Dobson, K. S. (2002). Depression. In M. M. Antony & D. H. Barlow (Eds.), Handbook of
assessment and treatment planning for psychological disorders (pp. 259–​299). New York, NY: Guilford. Dunn, V. K., & Sacco, W. P. (1989). Psychometric evaluation of the Geriatric Depression Scale and the Zung Self-​Rating Scale using an elderly community sample. Psychology and Aging, 4, 125–​126. Eaton, W. W., Muntaner, C., Smith, C., Tien, A., & Ybarra, M. (2004). Center for Epidemiologic Studies Depression Scale:  Review and revision (CESD and CESDR). In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcome assessments: Volume 3. Instruments for adults (pp. 363–​377). Mahwah, NJ: Erlbaum. Endicott, J., & Spitzer, R. L. (1978). A diagnostic interview. The Schedule for Affective Disorders and Schizophrenia. Archives of General Psychiatry, 35, 837–​844. Engedal, K., Kvaal, K., Korsnes, M., Barca, M. L., Borza, T., Selbaek, G., & Aakhus, E. (2012). The validity of the Montgomery–​Aasberg Depression Rating Scale as a screening tool for depression in late life. Journal of Affective Disorders, 141, 227–​232. First, M. B., Williams, J. B. W., Karg, R. S., & Spitzer, R. L. (2015). User’s guide for the Structured Clinical Interview for DSM-​5 Disorders–​Clinician Version (SCID-​5-​CV). Arlington, VA: American Psychiatric Association. Fischer, L. R., Rolnick, S. J., Jackson, J., Garrard, J., & Luepke, L. (1996). The Geriatric Depression Scale: A content analysis of respondent comments. Journal of Mental Health and Aging, 2, 125–​135. Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). Mini-​Mental State: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189–​198. Frasure-​Smith, N., Lesperance, F., & Talajic, M. (1993). Depression following myocardial infarction:  Impact on 6-​month survival. Journal of the American Medical Association, 270, 1819–​1825. Gallagher, D., Breckenridge, J., Steinmetz, J., & Thompson, L. (1983). The Beck Depression Inventory and Research Diagnostic Criteria:  Congruence in an older population. Journal of Consulting and Clinical Psychology, 51, 945–​946. Gallo, J. J., Anthony, J. C., & Muthén, B. O. (1994). Age differences in the symptoms of depression: A latent trait analysis. Journal of Gerontology: Psychological Sciences, 49, P251–​P264. Gallo, J. J., Rabins, P. V., & Anthony, J. C. (1999). Sadness in older persons: 13-​year follow-​up of a community sample in Baltimore, Maryland. Psychological Medicine, 29, 341–​350. Gatz, M., Johansson, B., Pederson, N., Berg, S., & Reynolds, C. (1993). A cross-​national self-​report measure of depressive symptomatology. International Psychogeriatrics, 5, 147–​156.

Gonzalez, H. M., Tarraf, W., Whitfield, K., & Gallo, J. J. (2012). Vascular depression prevalence and epidemiology in the United States. Journal of Psychiatric Research, 46, 456–​461. Grayson, D. A., MacKinnon, A., Jorm, A. F., Creasey, H., & Broe, G. A. (2000). Item bias in the Center for Epidemiologic Studies Depression Scale:  Effects of physical disorders and disability in an elderly community sample. Journal of Gerontology, 55B, 273–​282. Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56–​62. Hamilton, M. (1967). Development of a rating scale for primary depressive illness. British Journal of Social and Clinical Psychology, 6, 278–​296. Hammond, M. F. (1998). Rating depression severity in the elderly physically ill patient:  Reliability and factor structure of the Hamilton and the Montgomery–​ Åsberg depression rating scales. International Journal of Geriatric Psychiatry, 13, 257–​261. Hancock, P., & Larner, A. J. (2015). Cornell Scale for Depression in Dementia: Clinical utility in a memory clinic. International Journal of Psychiatry in Clinical Practice, 19, 71–​74. Harwood, D. G., Ownby, R. L., Barker, W. W., & Duara, R. (1998). The factor structure of the Cornell Scale for Depression in Dementia among probable Alzheimer’s disease patients. American Journal of Geriatric Psychiatry, 6, 212–​220. Head, J., Stansfeld, S. A., Ebmeier, K. P., Geddes, J. R., Allan, C. L., Lewis, G., & Kivimäki, M. (2013). Use of self-​administered instruments to assess psychiatric disorders in older people: Validity of the General Health Questionnaire, the Center for Epidemiologic Studies Depression Scale and the self-​completion version of the revised Clinical Interview Schedule. Psychological Medicine, 43, 2649–​2656. Hegeman, J. M., Kok, R. M., van der Mast, R. C., & Giltay, E. J. (2012). Phenomenology of depression in older compared with younger adults:  Meta-​analysis. British Journal of Psychiatry, 200, 275–​281. Hegeman, J. M., Wardenaar, K. J., Comijs, H. C., de Waal, M. W. M., Kok, R. M., & van der Mast, R. C. (2012). The subscale structure of the Inventory of Depressive Symptomatology Self Report (IDS-​SR) in older persons. Journal of Psychiatric Research, 46, 1383–​1388. Heun, R., Müller, H., Freyberger, H. J., & Maier, W. (1998). Reliability of interview information in a family study in the elderly. Social Psychiatry and Psychiatric Epidemiology, 33, 140–​144. Hinrichsen, G. A., & Emery, E. E. (2005). Interpersonal factors and late-​life depression. Clinical Psychology: Science & Practice, 12, 264–​275. Jamison, C., & Scogin, F. (1992). Development of an interview-​ based geriatric depression rating scale. International Journal of Aging and Human Development, 35, 193–​204.

Janssen, J., Beekman, A. T.  F., Comijs, H. C., Deeg, D. J. H., & Heeren, T. J. (2006). Late-​life depression: The differences between early-​and late-​onset illness in a community-​based sample. International Journal of Geriatric Psychiatry, 21, 86–​93. Jansson, M., Gatz, M., Berg, S., Johansson, B., Malmberg, B., Mcclearn, G. E., . . . Pedersen, N. L. (2004). Gender differences in heritability of depressive symptoms in the elderly. Psychological Medicine, 34, 471–​479. Jefferson, A. L., Powers, D. V., & Pope, M. (2001). Beck Depression Inventory-​ II (BDI-​ II) and the Geriatric Depression Scale (GDS) in older women. Clinical Gerontologist, 22(3–​4), 3–​12. Jeon, Y. H., Li, Z., Chenoweth, L., O’Conner, D., Beattie, E., Liu, Z., & Brodaty, H. (2015). The clinical utility of the Cornell Scale for Depression in Dementia as routine assessment in nursing homes. American Journal of Geriatric Psychiatry, 23, 784–​793. Karel, M. J., Ogland-​ Hand, S., & Gatz, M. (2002). Assessing and treating late life depression. New  York, NY: Basic Books. Kessler, R., & Mroczek, D. (1993). UM-​CIDI Short-​Form [Memo]. Ann Arbor, MI: University of Michigan. Kessler, R. C., Petukhova, M., Sampson, N. A., Zaslavsky, A. M., & Wittchen, H.-​U. (2012). Twelve-​month and lifetime prevalence and lifetime morbid risk of anxiety and mood disorders in the United States. International Journal of Methods in Psychiatric Research, 21, 169–​184. Kiernan, R. J., Mueller, J., Langston, J. W., & Van Dyke, C. (1987). The Neurobehavioral Cognitive Status Examination: A brief but quantitative approach to cognitive assessment. Annals of Internal Medicine, 107, 481–​485. Knapskog, A. B., Barca, M. L., & Engedal, K. (2011). A comparison of the validity of the Cornell Scale and the MADRS in detecting depression among memory clinic patients. Dementia and Geriatric Cognitive Disorders, 32, 287–​294. Kohout, F. J., Berkman, L. F., Evans, D. A., & Cornoni-​ Huntley, J. (1993). Two shorter forms of the CES-​D Depression Symptoms Index. Journal of Aging and Health, 5, 179–​193. Kørner, A., Lauritzen, L., Abelskov, K., Gulmann, N., Brodersen, A. M., Wedervang-​Jensen, T., & Kjeldgaard, K. M. (2006). The Geriatric Depression Scale and the Cornell Scale for Depression in Dementia:  A validity study. Nordic Journal of Psychiatry, 60, 360–​364. Kraaij, V., Arensman, E., & Spinhoven, P. (2002). Negative life events and depression in elderly persons:  A meta-​ analysis. Journal of Gerontology: Psychological Sciences, 57B, 87–​94. Kroenke, K., & Spitzer, R. L. (2002). The PHQ 9:  A new depression diagnostic and severity measure. Psychiatric Annals, 32, 509–​515.

Kroenke, K., Spitzer, R. L., & Williams, J. B.  W. (2001). Validity of a brief depression severity measure. Journal of General Internal Medicine, 169, 606–​613. Kumar, A., Lavretsky, H., & Elderkin-​Thompson, V. (2004). Nonmajor clinically significant depression in the elderly. In S. P. Roose & H. A. Sackeim (Eds.), Late-​ life depression (pp. 64–​ 80). New  York, NY:  Oxford University Press. Kurlowicz, L. H., Evans, L. K., Strumpf, N. E., & Maislin, G. (2002). A psychometric evaluation of the Cornell Scale for Depression in Dementia in frail, nursing home population. American Journal of Geriatric Psychiatry, 10, 600–​608. Lamers, F., Jonkers, C. C. M., Bosma, H., Penninx, B. W. J. H., Knottnerus, J. A., & van Eijk, J. T.  M. (2008). Summed score of the Patient Health Questionnaire-​9 was a reliable and valid method for depression screening in chronically ill elderly patients. Journal of Clinical Epidemiology, 61, 679–​687. Latham, A. E., & Prigerson, H. G. (2004). Suicidality and bereavement: Complicated grief as a psychiatric disorder presenting greatest risk for suicidality. Suicide and Life-​Threatening Behavior, 34, 350–​362. Leontjevas, R., Gerritsen, D. L., Vernooij-​Dassen, M. J.  F. J., Smalbrugge, M., & Koopmans, R. T. C. M. (2012). Comparative validation of proxy-​based Montgomery–​ Åsberg Depression Rating Scale and Cornell Scale for Depression in Dementia in nursing home residents with dementia. American Journal of Geriatric Psychiatry, 20, 985–​993. Leontjevas, R., van Hooren, S., & Mulders, A. (2009). The Montgomery–​Åsberg Depression Rating Scale and the Cornell Scale for Depression in Dementia:  A validation study with patients exhibiting early-​onset dementia. American Journal of Geriatric Psychiatry, 17, 56–​64. Lesher, E. L. (1986). Validation of the Geriatric Depression Scale among nursing home residents. Clinical Gerontologist, 4, 21–​28. Li, Z., Jeon, Y.-​H., Low, L.-​F., Chenoweth, L., O’Connor, D. W., Beattie, E., & Brodaty, H. (2015). Validity of the Geriatric Depression Scale and the collateral source version of the Geriatric Depression Scale in nursing homes. International Psychogeriatrics, 27, 1495–​1504. Lichtenberg, P. A., Marcopulos, B. A., Steiner, D. A., & Tabscott, J. A. (1992). Comparison of the Hamilton Depression Rating Scale and the Geriatric Depression Scale:  Detection of depression in dementia patients. Psychological Reports, 70, 515–​521. Mack, J. L., & Patterson, M. B. (1994). The evaluation of behavioral disturbances in Alzheimer’s disease: The utility of three rating scales. Journal of Geriatric Psychiatry and Neurology, 7, 99–​115. Maixner, S. M., Burke, W. J., Roccaforte, W. H., Wengel, S. P., & Potter, J. F. (1995). A comparison of two depression
scales in a geriatric assessment clinic. American Journal of Geriatric Psychiatry, 3, 60–​67. Marc, L. G., Raue, P. J., & Bruce, M. L. (2008). Screening performance of the 15-​item Geriatric Depression Scale in a diverse elderly home care population. American Journal of Geriatric Psychiatry, 16, 914–​921. Mayer, L. S., Bay, R. C., Politis, A., Steinberg, M., Steele, C., Baker, A. S., . . . Lyketsos, G. (2006). Comparison of three rating scales as outcome measures for treatment trials of depression in Alzheimer disease: Findings from DIADS. International Journal of Geriatric Psychiatry, 21, 930–​936. Meeks, T. W., Vahia, I. V., Lavretsky, H., Kulkarni, G., & Jeste, D. V. (2011). A tune in “a minor” can “b major”: A review of epidemiology, illness course, and public health implications of subthreshold depression in older adults. Journal of Affective Disorders, 129, 126–​142. Mitchell, A. J., Bird, V., Rizzo, M., & Meader, N. (2010). Diagnostic validity and added value of the Geriatric Depression Scale for depression in primary care:  A meta-​analysis of GDS30 and GDS15. Journal of Affective Disorders, 125(1–​3), 10–​17. Moberg, P. J., Lazarus, L. W., Mesholam, R. I., Bilker, W., Chuy, I. L., Neyman, I., & Markvart, V. (2001). Comparison of the standard and structured interview guide for the Hamilton Depression Rating Scale in depressed geriatric inpatients. American Journal of Geriatric Psychiatry, 9, 35–​40. Montgomery, S., & Åsberg, M. (1979). A new depression scale designed to be sensitive to change. British Journal of Psychiatry, 134, 382–​389. Mora, P. A., Beamon, T., Preuitt, L., DiBonaventura, M., Leventhal, E. A., & Leventhal, H. (2012). Heterogeneity in depression symptoms and health status among older adults. Journal of Aging and Health, 24, 879–​896. Mottram, P., Wilson, K., & Copeland, J. (2000). Validation of the Hamilton Depression Rating Scale and Montgomery and Åsberg Rating Scales in terms of AGECAT rating cases. International Journal of Geriatric Psychiatry, 15, 1113–​1119. Mueller, T. I., Kohn, R., Leventhal, N., Leon, A. C., Solomon, D., Coryell, W.,  .  .  .  Alexopoulos, G. S. (2004). The course of depression in elderly patients. American Journal of Geriatric Psychiatry, 12, 22–​29. Myers, J. K., & Weisman, N. M. (1980). Use of a self-​report symptom scale to detect depression in a community sample. American Journal of Psychiatry, 137, 1081–​1084. Nemeth, C. L., Haroon, E., & Neigh, G. N. (2014). Heartsick: Psychiatric and inflammatory implications of cerebromicrovascular disease. International Journal of Geriatric Psychiatry, 29, 577–​585. Newman, J. P. (1989). Aging and depression. Psychology and Aging, 4, 150–​165. Nguyen, H. T., & Zonderman, A. B. (2006). Relationship between age and aspects of depression: Consistency and
reliability across two longitudinal studies. Psychology and Aging, 21, 119–​126. Norris, M. P., Arnau, R. C., Bramson, R., & Meagher, M. W. (2004). The efficacy of somatic symptoms in assessing depression in older primary care patients. Clinical Gerontologist, 27, 43–​57. O’Connor, D. W., & Parslow, R. A. (2009). Different responses to K-​10 and CIDI suggest that complex structured psychiatric interviews underestimate rates of mental disorder in old people. Psychological Medicine, 39, 1527–​1531. O’Connor, E. A., Whitlock, E. P., Beil, T. L., & Gaynes, B. N. (2009). Screening for depression in adult patients in primary care settings: A systematic evidence review. Annals of Internal Medicine, 151, 793–​803. Onega, L. L., & Abraham, I. L. (1997). Factor structure of the Dementia Mood Assessment Scale in a cohort of community-​ dwelling elderly. International Psychogeriatrics, 9, 449–​457. Onishi, J., Suzuki, Y., & Umegaki, H. (2006). A comparison of depressive mood of older adults in a community, nursing home, and a geriatric hospital:  Factor analysis of Geriatric Depression Scale. Journal of Geriatric Psychiatry and Neurology, 19, 26–​31. Onwuameze, O. E., & Paradiso, S. (2014). Social adaptive functioning, apathy, and nondysphoric depression among nursing home-​ dwelling very old adults. Psychopathology, 47, 319–​326. Ownby, R. L., Harwood, D. G., Acevedo, A., Barker, W., & Duara, R. (2001). Factor structure of the Cornell Scale for Depression in Dementia for Anglo and Hispanic patients with dementia. American Journal of Geriatric Psychiatry, 9, 217–​224. Perrault, A., Oremus, M., Demers, L., Vida, S., & Wolfson, C. (2000). Review of outcome measurement instruments in Alzheimer’s disease drug trials: Psychometric properties of behavior and mood scales [Original article in French]. Journal of Geriatric Psychiatry and Neurology, 13, 181–​196. Phelan, E., Williams, B., Meeker, K., Bonn, K., Fredrick, J., LoGerfo, J., & Snowden, M. (2010). A study of the diagnostic accuracy of the PHQ-​9 in primary care elderly. Biomed Central Family Practice, 11, 63–​72. Phillips, L. J. (2012). Measuring symptoms of depression:  Comparing the Cornell Scale for Depression in Dementia and the Patient Health Questionnaire-​ 9–​Observation Version. Research in Gerontological Nursing, 5, 34–​42. Potter, G. G., Wagner, H. R., Burke, J. R., Plassman, B. L., Welsh-​Bohmer, K. A., & Steffans, D. C. (2013). Neuropsychological predictors of dementia in late-​life major depressive disorder. American Journal of Geriatric Psychiatry, 21, 297–​306. Prince, M., Acosta, D., Chiu, H., Copeland, J., Dewey, M., Scazufca, M., & Varghese, M. (2004). Effects of
education and culture on the validity of the Geriatric Mental State and its AGECAT algorithm. British Journal of Psychiatry, 185, 429–​436. Radloff, L. S. (1977). The CES-​D scale: A self-​report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–​401. Rapp, S. R., Smith, S. S., & Britt, M. (1990). Identifying comorbid depression in elderly medical patients:  Use of the Extracted Hamilton Depression Rating Scale. Psychological Assessment, 2, 243–​247. Ravindran, A. V., Welburn, K., & Copeland, J. R. M. (1994). Semi-​ structured depression scale sensitive to change with treatment for use in the elderly. British Journal of Psychiatry, 164, 522–​527. Robins, L. N., Helzer, J. E., Croughan, J., & Ratcliff, K. S. (1981). National Institute of Mental Health Diagnostic Interview Schedule:  Its history, characteristics, and validity. Archives of General Psychiatry, 38, 381–​389. Robins, L. N., Wing, J., Wittchen, H. U., Helzer, J. E., Babor, T. F., Burke, J., . . . Towle, L. H. (1988). The Composite International Diagnostic Interview: An epidemiological instrument suitable for use in conjunction with different diagnostic systems and in different cultures. Archives of General Psychiatry, 45, 1069–​1077. Rush, A. J., Giles, D. E., Schlesser, M. A., Fulton, C. L., Weissenburger, J. E., & Burns, C. T. (1986). The Inventory for Depressive Symptomatology (IDS): Preliminary findings. Psychiatry Research, 18, 65–​87. Rush, A. J., Gullion, C. M., Basco, M. R., Jarrett, R. B., & Trivedi, M. H. (1996). The Inventory of Depressive Symptomatology (IDS):  Psychometric properties. Psychological Medicine, 26, 477–​486. Saliba, D., DiFilippo, S., Edelen, M. O., Kroenke, K., Buchanan, J., & Streim, J. (2012). Testing the PHQ-​ 9 Interview and Observational versions (PHQ-​9 OV) for MDS 3.0. Journal of American Medical Directors Association, 13, 618–​625. Schafer, A. B. (2006). Meta-​ analysis of the factor structures of four depression questionnaires: Beck, CES-​D, Hamilton, & Zung. Journal of Clinical Psychology, 62, 123–​146. Schulz, R., & Martire, L. M. (2004). Family caregiving of persons with dementia:  Prevalence, health effects, and support strategies. American Journal of Geriatric Psychiatry, 12, 240–​249. Segal, D. L., Coolidge, F. L., Cahill, B. S., & O’Riley, A. A. (2008). Psychometric properties of the Beck Depression Inventory-​II (BDI-​II) among community-​dwelling older adults. Behavior Modification, 32, 3–​20. Segal, D. L., Kabacoff, R. I., Hersen, M., Van Hasselt, V. B., & Ryan, C. F. (1995). Update on the reliability of diagnosis in older psychiatric outpatients using the Structured Clinical Interview for DSM-​III-​R. Journal of Clinical Geropsychology, 1, 313–​321.

Sheikh, J. I., & Yesavage, J. A. (1986). Geriatric Depression Scale (GDS):  Recent evidence and development of a shorter version. Clinical Gerontologist, 5, 165–​173. Snow, A. L., Kunik, M. E., Molinari, V. A., Orengo, C. A., Doody, R., Graham, D. P., & Norris, M. P. (2005). Accuracy of self-​reported depression in persons with dementia. Journal of the American Geriatrics Society, 53, 389–​396. Snowdon, J., Rosengren, D., Daniel, F., & Suyasa, M. (2010). Australia’s use of the Cornell Scale to screen for depression in nursing homes. Australasian Journal on Ageing, 30, 33–​36. Spitzer, R. L., Endicott, J., & Robins, E. (1978). Research diagnostic criteria: Rationale and reliability. Archives of General Psychiatry, 35, 773–​782. Steer, R. A., Ball, R., Ranieri, W. F., & Beck A. T. (1999). Dimensions of the Beck Depression Inventory-​ II in clinically depressed outpatients. Journal of Clinical Psychology, 55, 117–​128. Steer, R. A., Rissmiller, D. J., & Beck, A. T. (2000). Use of the Beck Depression Inventory-​II with depressed geriatric inpatients. Behaviour Research and Therapy, 38, 311–​318. Steingart, A., & Herrmann, N. (1991). Major depressive disorder in the elderly:  The relationship between age of onset and cognitive impairment. International Journal of Geriatric Psychiatry, 6, 593–​598. Stukenberg, K. W., Dura, J. R., & Kiecolt-​ Glaser, J. K. (1990). Depression screening scale validation in an elderly, community-​dwelling population. Psychological Assessment, 2, 134–​138. Sunderland, T., Alterman, I. S., Yount, D., Hill, J. L., Teriot, P. N., Newhouse, P. A., . . . Cohen, R. M. (1988). A new scale for the assessment of depressed mood in demented patients. American Journal of Psychiatry, 148, 955–​959. Sunderland, T., & Minichiello, M. (1996). Dementia Mood Assessment Scale. International Psychogeriatrics, 8(Suppl. 3), 329–​331. Towsley, G., Neradilek, M. B., Snow, A. L., & Ersek, M. (2012). Evaluating the Cornell Scale for Depression in Dementia as a proxy measure in nursing home residents with and without dementia. Aging and Mental Health, 16, 892–​901. Trainor, K., Mallett, J., & Rushe, T. (2013). Age related differences in mental health scale scores and depression diagnosis: Adult responses to the CIDI-​SF and MHI-​5. Journal of Affective Disorders, 151, 639–​645. Trajkovic, G., Starcevic, V., Latas, M., Lestarevic, M., Ille, T., Bukumiric, Z., & Marinkovic, J. (2011). Reliability of the Hamilton Rating Scale for Depression: A meta-​ analysis over a period of 49 years. Psychiatry Research, 189, 1–​9. Turvey, C. L., Wallace, R. B., & Herzog, R. (1999). A revised CES-​D measure of depressive symptoms and
a DSM-​based measure of major depressive episodes in the elderly. International Psychogeriatrics, 11, 139–​148. Van Dam, N. T., & Earleywine, M. (2011). Validation of the Center for Epidemiologic Studies Depression Scale-​ Revised (CESD-​R):  Pragmatic depression assessment in the general population. Psychiatry Research, 186, 128–​132. Vida, S., Des Rosiers, P., Carrier, L., & Gauthier, S. (1994). Depression in Alzheimer’s disease:  Receiver operating characteristics of the Cornell Scale for Depression in Dementia and the Hamilton Depression Scale. Journal of Geriatric Psychiatry and Neurology, 7, 159–​162. Vilalta-​Franch, J., Garre-​Olmo, J., Lopez-​Pousa, S., Turon-​ Estrada, A., Lozano-​Gattego, M., Hernandez-​Ferrandiz, M., . . . Feijoo-​Lorza, R. (2006). Comparison of different clinical diagnostic criteria for depression in Alzheimer disease. American Journal of Geriatric Psychiatry, 14, 589–​597. Wang, S., & Blazer, D. G. (2015). Depression and cognition in the elderly. Annual Review of Clinical Psychology, 11, 331–​360. Ware, J. E., & Sherbourne, C. D. (1992). The MOS 36-​ item Short-​Form Health Survey (SF-​36): I. Conceptual framework and item selection. Medical Care, 30, 473–​483. Watson, L. C., & Pignone, M. P. (2003). Screening accuracy for late-​life depression in primary care:  A systematic review. Journal of Family Practice, 52, 956–​964. Whitley, B. E. (2002). Principles of research in behavioral science. Boston, MA: McGraw-​Hill. Williams, J. B. W. (2001). Standardizing the Hamilton Depression Rating Scale: Past, present, and future. European Archives of Psychiatry and Clinical Neuroscience, 251(Suppl. 2), 6–​12. Williams, J. R., & Marsh, L. (2008). Validity of the Cornell Scale for Depression in Dementia in Parkinson’s disease with and without cognitive impairment. Movement Disorders, 24, 433–​437. Wing, J. K., Birley, J. L.  T., Cooper, J. E., Graham, P., & Isaacs, A. D. (1967). Reliability of procedure for
measuring and classifying “Present Psychiatric State.” British Journal of Psychiatry, 113, 499–​515. Wittchen, H.-​ U., Strehle, J., Gerschler, A., Volkert, J., Dehoust, M. C., Sehner, S.,  .  .  .  Andreas, S. (2015). Measuring symptoms and diagnosing mental disorders in the elderly community:  The test–​ retest reliability of the CIDI65+. International Journal of Methods in Psychiatric Research, 24, 116–​129. Wongpakaran, N., Wongpakaran, T., & van Reekum, R. (2013). Discrepancies in Cornell Scale for Depression in Dementia (CSDD) items between residents and caregivers, and the CSDD’s factor structure. Clinical Interventions in Aging, 8, 641–​648. Wood, S., Cummings, J. L., Hsu, M.-​A., Barclay, R., Wheatley, M. V., Yarema, K. T., & Schnelle, J. F. (2000). The use of the Neuropsychiatric Inventory in nursing home residents. American Journal of Geriatric Psychiatry, 8, 75–​83. World Health Organization. (1992). International classification of diseases (10th ed.). Geneva, Switzerland: Author. Yesavage, J. A., Brink, T. L., Rose, T. L., Lum, O., Huang, V., Ade, M., & Leirer, V. O. (1983). Development and validation of a geriatric screening scale: A preliminary report. Journal of Psychiatric Research, 17, 37–​49. Zeiss, A. M., Lewinsohn, P. M., & Rohde, P. (1996). Functional impairment, physical disease, and depression in older adults. In P. M. Kato & T. Mann (Eds.), Handbook of diversity issues in health psychology (pp. 161–​184). New York, NY: Plenum. Zimmerman, M., Posternak, M. A., & Chelminski, I. (2004). Derivation of a definition of remission on the Montgomery–​ Åsberg Depression Rating Scale corresponding to the definition of remission on the Hamilton Rating Scale for Depression. Journal of Psychiatric Research, 38, 577–​582. Zisook, S., & Shuchter, S. R. (1993). Major depression associated with widowhood. American Journal of Geriatric Psychiatry, 1, 316–​326. Zung, W. W.  K. (1965). A self-​ rating depression scale. Archives of General Psychiatry, 12, 63–​70.

9

Bipolar Disorder

Sheri L. Johnson
Christopher Miller
Lori Eisner

The goal of this chapter is to review measures that are relevant for the clinical evaluation and treatment of bipolar disorder. Specifically, we focus on assessment measures relevant to diagnosis, treatment planning, and treatment monitoring. In each area, we focus on those few assessment measures that have gained at least moderate psychometric support. Only a small number of measures meet established psychometric criteria, perhaps as a consequence of the limited amount of psychological research on bipolar disorder compared to other psychopathologies. With the advent of lithium treatment and the recognition of the genetic basis of the disorder, psychological researchers all but abandoned the study of this disorder for several decades, and the development of new assessment instruments languished. Psychological research on the disorder entered a renewed phase of interest in the 1990s, with the volume of research increasing each year since then. Nonetheless, research on bipolar disorder lags far behind that available on other psychopathologies, and many of the assessment needs for conducting research and clinical work within this field remain relatively unaddressed. Despite this relative dearth of exhaustively validated measures of bipolar disorder, several existing measures have been translated into different languages to serve the needs of clinicians worldwide. Given the difficulties of comparing validation studies across different languages, however, we generally limit our consideration to English versions of assessment tools throughout this chapter.

NATURE OF BIPOLAR DISORDER

The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​ 5; American Psychiatric

Association, 2013) defines several different forms of bipolar disorders, differentiated by the severity and duration of manic symptoms. Bipolar I  disorder is diagnosed on the basis of a single lifetime episode of mania. A  manic episode is diagnosed on the basis of euphoric or irritable mood with accompanying increases in energy or activity, the presence of at least three other symptoms (four if mood is only irritable), and marked social or occupational impairment. The inclusion of energy or activity as a required criterion is new to DSM-​5. Criteria specify that symptoms must last at least 1 week or require hospitalization. Bipolar II disorder is diagnosed on the basis of hypomania and episodes of major depression. Hypomania is less severe than mania: Criteria specify a distinct change in functioning rather than severe impairment. Hypomanic episodes can be diagnosed with 4 days of symptoms. A third form of bipolar disorder, cyclothymia, is diagnosed based on recurrent mood swings, both high and low, which do not meet the severity of bipolar I or bipolar II disorder. Criteria for cyclothymia specify that numerous mood swings must be present. Manic symptoms may be secondary to drugs (e.g., cocaine and amphetamines) and medical conditions (e.g., thyroid conditions). The use of antidepressants without mood-​stabilizing medication can trigger episodes of mania or hypomania, particularly among those with an individual or family history of bipolar disorder (Ghaemi, Lenox, & Baldessarini, 2001). Such episodes are not considered in making a diagnosis of bipolar I or bipolar II disorder but, rather, can contribute to a diagnosis of medication-induced bipolar disorder. Prevalence rates are approximately 1% for bipolar I disorder and 3.9% for bipolar I  and II disorders combined (Kessler, Berglund, Demler, Jin, & Walter, 2005). Rates

of comorbidity within bipolar disorder are quite high, and treatment planning will require consideration of these syndromes. Although not required for diagnosis of bipolar I disorder, as many as 66% to 75% of people with bipolar I  disorder in community surveys experience episodes of major depression (Karkowski & Kendler, 1997; Kessler, Rubinow, Holmes, Abelson, & Zhao, 1997). Similarly, as many as 93% of people with bipolar disorder meet lifetime diagnostic criteria for at least one anxiety disorder (Kessler et al., 1997), and as many as 61% do so for alcohol or substance abuse (Reigier et al., 1990). Indeed, in a Veterans Administration sample, 78% met criteria for comorbid conditions during their lifetime (Bauer et  al., 2005). Hence, initial assessments should consider the possible presence of comorbid syndromes. Estimates from twin studies suggest that heritability accounts for as much as 93% of the variability in whether or not this disorder develops (Kieseppa, Partonen, Haukka, Kaprio, & Lonnqvis, 2004). For those affected by the disorder, though, psychosocial variables predict the course of symptoms. Depression within bipolar disorder appears triggered by negative life events, deficits in social support, and negative cognitive styles (Johnson & Kizer, 2002), whereas mania has been found to be predicted by sleep dysregulation (Leibenluft, Albert, Rosenthal, & Wehr, 1996)  and variables relevant to excessive goal engagement (Johnson, Edge, Holmes, & Carver, 2012). The evidence for genetic contributions to bipolar disorder led to a focus on medication approaches, such as lithium and other mood-​stabilizing medications (Prien & Potter, 1990). With increased evidence that psychosocial variables influence the course of disorder, adjunctive psychosocial treatments have become more common (Johnson & Leahy, 2004).

ASSESSMENT FOR DIAGNOSIS

There is no biological assay for bipolar disorder, so diagnosis is based entirely on review of symptoms and of potential organic explanations. In practice, most clinicians review the DSM symptoms in an informal manner, although clinicians using unstructured diagnostic interviews tend to miss approximately half of all diagnoses (Shear et al., 2000; Zimmerman & Mattia, 1999). Even though many people with a history of major depression will meet diagnostic criteria for bipolar disorder, most practitioners report that they do not routinely screen for bipolar disorder among people with depression (Brickman, LoPiccolo, & Johnson, 2002). Perhaps as a consequence of poor screening, people with bipolar disorder may wait as long as 6 to 8  years on average to be correctly diagnosed (Drancourt et al., 2013; Lish, Dime-​ Meenan, Whybrow, Price, & Hirschfeld, 1994). Failure to detect this diagnosis can have serious repercussions, in that antidepressant treatment without mood-​stabilizing medication can trigger iatrogenic mania (Ghaemi et al., 2001; Tondo, Vázquez, & Baldessarini, 2010). In this section, we discuss diagnostic instruments for bipolar disorder (for a summary of relevant measures, see Table 9.1). This material should be considered in the context of the aforementioned modification to the diagnosis of bipolar disorder in DSM-​5. Unfortunately, very little literature has validated diagnostic or screening instruments against DSM-​5 criteria. We therefore focus on diagnostic and screening measures that have been validated against DSM-​IV (American Psychiatric Association, 1994, 2000) or other older bipolar diagnoses. For adults, two semi-​structured diagnostic instruments have been most commonly used: the Structured Clinical

Table 9.1  Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
Clinician Rated
SCID | E | NA | G | G | G | G | E | A | ✓
SADS | A | NA | G | G | G | G | E | A | ✓
Pediatric Clinician Rated
K-SADS-PL | A | NA | E | E | A | G | G | A |
Self-Report
GBI | NA | E | NA | A | A | G | NA | NA |
MDQ | NA | G | NA | A | A | A | L | NA |

Note: SCID = Structured Clinical Interview for DSM-IV; SADS = Schedule for Affective Disorders and Schizophrenia; K-SADS-PL = Kiddie Schedule for Affective Disorders and Schizophrenia–Present and Lifetime Version; GBI = General Behavior Inventory; MDQ = Mood Disorder Questionnaire; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

Interview for DSM-​ IV (SCID) and the Schedule for Affective Disorders and Schizophrenia (SADS). Both provide interview probes, guidelines for symptom thresholds, and information about potential exclusionary criteria (e.g., medical and pharmacological conditions that could provoke manic symptoms). The SCID is designed to assess DSM-​ IV diagnoses, whereas the SADS is designed to assess the Research Diagnostic Criteria (RDC). Although the two diagnostic systems are similar for mania, RDC criteria are slightly stricter than the DSM criteria about the nature of psychotic symptoms that can be manifested within bipolar disorder, in that certain psychotic symptoms are considered indicative of schizoaffective rather than bipolar disorder. We begin by describing these measures as tools for assessing bipolar disorder in adults, and then we discuss some concerns regarding the diagnosis of bipolar II disorder. Next, we discuss issues and tools for the diagnosis of child and adolescent bipolar disorder. We conclude our discussion of diagnostic assessment with a description of self-​report measures designed to aid diagnostic screening.

Diagnostic Assessment of Bipolar I Disorder in Adults

The Structured Clinical Interview for DSM-IV-TR

The SCID is recommended as a part of clinical intake procedures (Spitzer, Williams, Gibbon, & First, 1992), and a clinician's version is available through American Psychiatric Publishing (First, Spitzer, Gibbon, & Williams, 1997). The recent transition to DSM-5 has been accompanied by a new version of the SCID (First, Williams, Karg, & Spitzer, 2016), but validation data for the SCID-5 are not yet widely available. The clinician's version includes less detail about subtype and course distinctions than is provided within the research version. The SCID is a semi-structured interview with recommended probes, but diagnosticians are expected to rephrase probes and ask clarifying questions as needed to determine whether a given criterion is met. Inter-rater reliability for the SCID diagnoses has been established in a large international multisite trial (Williams et al., 1992) and at least 10 other major trials (Rogers, Jackson, & Cashel, 2001). Initial attempts to test the mania module within a community sample were thwarted by the low base rates of the disorder (Williams et al., 1992). Diagnoses of bipolar disorder based on the SCID, however, were substantially more reliable than those obtained by clinicians who were not using a diagnostic interview or by paraprofessionals using more structured interviews such as the Composite International Diagnostic Interview (CIDI; World Health Organization, 1990). The SCID has achieved high concordance for bipolar diagnoses between twins and has been validated in a number of countries (Kieseppa et al., 2004).

The Schedule for Affective Disorders and Schizophrenia

Considerable evidence has accrued for the reliability of SADS (Endicott & Spitzer, 1978) diagnoses across 21 studies (for a review, see Rogers et al., 2001). SADS diagnoses of bipolar disorder have been found to robustly correlate with other measures of mania (Secunda et al., 1985), and the SADS appears to accurately capture diagnoses across different cultural and ethnic groups within the United States (Vernon & Roberts, 1982). Lifetime mania diagnoses have achieved good test–retest reliability over 5 years among samples of adults (Rice et al., 1986) and adolescents (Strober et al., 1995), as well as 10 years among a sample of adults (Coryell et al., 1995).
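
The inter-rater and test–retest figures reported for these interviews are chance-corrected agreement statistics, most often Cohen's kappa or an intraclass correlation. As a brief illustration of what a kappa value reflects, the sketch below computes Cohen's kappa from a 2 x 2 table of two raters' present/absent diagnostic calls; the counts are hypothetical and are not taken from any study cited in this chapter.

```python
def cohens_kappa(both_present, only_rater1, only_rater2, both_absent):
    """Cohen's kappa for two raters making present/absent diagnostic calls."""
    n = both_present + only_rater1 + only_rater2 + both_absent
    observed = (both_present + both_absent) / n
    # Chance agreement is computed from each rater's marginal rate of "present" calls
    p1 = (both_present + only_rater1) / n
    p2 = (both_present + only_rater2) / n
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    return (observed - expected) / (1 - expected)

# Hypothetical counts for 100 interviews rated independently by two clinicians
print(round(cohens_kappa(both_present=8, only_rater1=4, only_rater2=4, both_absent=84), 2))  # 0.62
```

Note that raw agreement in this example is 92% yet kappa is only about .62: when a diagnosis has a low base rate, as mania does in community samples, even a small number of disagreements pulls chance-corrected reliability down, which is one reason the low base rates noted above complicate reliability studies of the mania module.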

Diagnostic Assessment of Bipolar II Disorder in Adults

Bipolar II disorder was not recognized within the DSM as a diagnostic category until the fourth edition. It is worth noting that hypomania is the only major syndrome within the DSM in which functional impairment is not a criterion for diagnosis. That is, persons can qualify for hypomanic episodes with a relatively mild shift in functioning. Perhaps because of the minimal severity, bipolar II disorder is not diagnosed unless the person also suffers from episodes of major depression. Intriguingly, considerable debate still exists about the criteria for hypomania; whereas the DSM-​5 criteria specify four symptoms with duration of at least 4 days, the RDC criteria are less stringent, specifying three symptoms with duration of at least 2  days. Given ongoing debates about the diagnostic threshold, it is not surprising that assessment tools for bipolar II disorder are less well established. Despite debate about diagnostic criteria and instruments, though, there is evidence that the diagnosis itself may be important to capture. Diagnoses of bipolar II disorder using the SADS show expected correlations with trait measures of mood lability and energy/​activity (Akiskal et al., 1995), as well as family history of bipolar II disorder (Rice et al., 1986). Several studies have also suggested that people with bipolar II disorder are at higher risk for suicide than are persons with bipolar I disorder or unipolar depression (Dunner, 1996; Undurraga, Baldessarini, Valenti,

Pacchiarotti, & Vieta, 2012). Hence, identification of bipolar II disorder is important in planning treatment. The lower threshold for this disorder appears to create difficulty in reliably capturing symptoms, an issue that is particularly well documented for the SADS. Even when interviewers rate the same recordings, reliability estimates for bipolar II disorder within the SADS are quite inadequate and much lower than the estimates for bipolar I disorder reflected in Table 9.1 (Keller et al., 1981), although some teams achieved higher estimates (Simpson et  al., 2002; Spitzer, Endicott, & Robins, 1978). In addition to poor reliability for the diagnostic category, interviewers also have been found to have very poor agreement on mild symptoms of mania (Andreasen et al., 1981). Test–​ retest reliability over a 6-​month period was quite poor for bipolar II disorder, intraclass r = .06, and even poorer for cyclothymia (Andreasen et  al., 1981). In a 5-​ year test–​retest study, SADS diagnoses of bipolar II disorder achieved kappa scores of only .09 (Rice et al., 1986), and in a 10-​year study, only 40% of persons initially diagnosed with bipolar II disorder on the SADS experienced further episodes of hypomania or mania (Coryell et al., 1995). Inter-​rater reliability can be limited by either a lack of specificity or a lack of sensitivity. Both the SCID and the SADS have been found to have inadequate sensitivity in detecting cases of bipolar II disorder. Despite some evidence for high inter-​rater agreement of unstructured expert clinical interviews in one study, approximately one-​ third of cases that were diagnosed through expert clinical interview with bipolar II disorder were not identified as such within SCID interviews (κ = .67) (Dunner & Tay, 1993; Simpson et al., 2002). In summary, a set of issues mar the diagnostic assessment of bipolar II disorder, including difficulties identifying hypomanic symptoms that do not cause impairment and broad questions about the duration of hypomanic episodes. Based on this, it is perhaps not surprising that available tools do not produce reliable diagnoses of bipolar II disorder. Given that people who meet criteria for bipolar II disorder may be at high risk for suicidality, improving the detection of this disorder remains an important goal for the field. Diagnostic Assessment of Bipolar Disorder in Children and Adolescents The DSM-​5 diagnostic criteria for juvenile bipolar disorder are the same as those for adult bipolar disorder, with the exception that cyclothymic disorder can be diagnosed

within 1  year, rather than 2  years, of symptoms. There is considerable debate in the field about the diagnostic criteria as some researchers have argued many diagnoses of bipolar disorder among children and adolescents are missed because the criteria are too stringent. Some have argued that episodes of shorter duration or diminished symptom severity should be diagnosable, particularly given that children may not have the same opportunities to exhibit symptomatic behavior in domains such as hypersexuality or overspending. On the other hand, recent dramatic increases in the sheer number of bipolar diagnoses among youth and adolescents (e.g., Blader & Carlson, 2007; Moreno et  al., 2007)  have raised the possibility that bipolar disorder is now overdiagnosed in these populations. This has in turn been associated with a rapid increase in the number of prescriptions of second-​ generation antipsychotics—​which can cause dangerous cardiovascular side effects—​ in young people (Fraguas et al., 2011). In one attempt to address this concern, DSM-​ 5 introduced a separate diagnosis of disruptive mood dysregulation disorder, characterized by severe and recurrent temper outbursts in children that do not coalesce into full manic or hypomanic episodes (Axelson et al., 2012). Taken together, these developments emphasize that accurately diagnosing bipolar disorder in children and adolescents is notoriously difficult. Here, we focus briefly on the key issues involved in diagnosing bipolar disorder in youth. Readers interested in more in-​depth coverage are referred to more detailed works (e.g., Jenkins, Youngstrom, Washburn, & Youngstrom, 2011; Youngstrom, Findling, Youngstrom, & Calabrese, 2005). In diagnosing juvenile bipolar disorder, there is value in using multiple sources of data, including youths, parents, and teachers (Youngstrom, Findling, & Calabrese, 2003). Youths can be poor reporters of hyperactivity, inattention, and oppositional behaviors (Youngstrom, Loeber, & Stouthamer-​Loeber, 2000). To the extent that mania involves externalizing symptoms, youths may be poor reporters of manic symptoms. For internalizing problems, youth and caregiver reports are preferable (Loeber, Green, & Lahey, 1990). Teacher reports are often discrepant with the reports of parents and youths (Youngstrom et al., 2000) because children may show different behaviors across different settings. Given that impairment may not be equal across all settings, averaging scores from different sources appears to enhance reliability (Youngstrom, Gracious, Danielson, Findling, & Calabrese, 2003). Parent report offers several advantages in making accurate psychiatric diagnoses, especially among

younger children. Parents are more psychologically minded than youths (Anastasi & Urbina, 1997), and they are aware of a child’s developmental history and family functioning (Richters, 1992), as well as low base rate phenomena (e.g., fire-​setting and suicide attempts; Kazdin & Kagan, 1994). Not surprisingly, then, parent report tends to be more accurate in predicting diagnostic status than either youth or teacher reports (Youngstrom et al., 2004). Youth report should not be discounted in the diagnostic process, however. Parent and youth reports have been shown to be more discrepant for externalizing disorders than internalizing disorders, for girls than boys, and for older children than younger children (Verhulst & van der Ende, 1992). Adolescents, especially as they grow older, are important informants on their own problem behaviors given that internalizing behaviors and concealed high-​ risk behaviors may go unnoticed by their parents (Loeber, Green, & Lahey, 1990). One way to approach clinical interviewing with parents and their children is by using diagnostic interviews, described next. Kiddie Schedule for Affective Disorders and Schizophrenia for School-​Age Children—​Present and Lifetime Version Many different versions of the Kiddie Schedule for Affective Disorders and Schizophrenia for School-​ Age Children (K-​SADS) have been developed. The K-​SADS-​ PL, however, is the only instrument that provides global and diagnosis-​specific impairment ratings (Kaufman et al., 1997). Excellent estimates of inter-​rater reliability (98% to 100%) and test–​retest reliability (κ for current and lifetime diagnosis both = 1.00) have been documented for bipolar disorders with the K-​SADS-​PL (Frazier et al., 2007). Several groups have attempted to refine the mania section of the K-​SADS. Axelson et al. (2003) developed a Child Mania Rating Scale module that, in their sample, demonstrated excellent inter-​rater reliability (intraclass correlation  =  .97), excellent internal consistency (α  =  .94), and, using a cut-​off score of 12 or higher, demonstrated sensitivity of 87% and specificity of 81% with clinical judgments of mania (Axelson et al., 2003). These results suggest that the K-​SADS-​MRS holds promise as a rating scale for manic symptoms in children and adolescents. Geller and colleagues (2001) at Washington University in St. Louis developed a more detailed version of the K-​SADS (WASH-​U-​KSADS). Although scores on

the measure have achieved good to excellent inter-​rater reliability for mania symptoms (κ range from .82 to 1.00; Geller et al., 2001), the training and time burdens may be too extensive for general clinical practice. Self-​Report Measures Detailed assessment by a trained clinician is considered the most reliable and valid way to obtain a diagnosis of bipolar disorder (Akiskal, 2002). Several self-​ report screeners have been developed, however, to aid in detecting potential diagnoses of bipolar disorder. At this point in their development, information on psychometric adequacy is limited (see Table 9.1). Of these measures, the General Behavior Inventory (GBI; Depue et  al., 1981)  has demonstrated promising psychometric properties (Ratheesh, Berk, Davey, McGorry, & Cotton, 2015). GBI items were designed to cover symptom intensity, duration, and frequency using a response scale that ranges from 1 (“never or hardly ever”) to 4 (“very often or almost constantly”). The original GBI consisted of 69 items, chosen to cover the core symptoms of bipolar disorder by the consensus of three item writers. Modified versions have been developed, as well, that tap both the depressed and the manic poles of bipolar disorder (e.g., Depue & Klein, 1988; Mallon, Klein, Bornstein, & Slater, 1986). The variety of different versions, ranging from 52 to 73 items, makes generalizations regarding psychometric properties difficult. Normative data have not been reported for the GBI in any large clinical samples, but its scores have generally demonstrated excellent internal consistency and adequate test–​retest reliability, with initial evidence of structural invariance in Black and White young adults (Pendergast et al., 2015). Several studies have assessed the GBI’s ability to discriminate bipolar cases from noncases. In general, the GBI scores have demonstrated sensitivity to bipolar disorder of approximately 75%, and specificity greater than 97% (Depue & Klein, 1988; Depue, Krauss, Spoont, & Arbisi, 1989; Klein, Dickstein, Taylor, & Harding, 1989; Mallon et  al., 1986), in both clinical and nonclinical samples. Scores have also demonstrated the ability to predict development of bipolar disorder in young adults (e.g., Alloy et al., 2012). Unfortunately, generalizability is limited because cut-​off scores were not consistent across studies but, rather, were determined within each study to maximize predictive power. The GBI has also been adapted for use with parents to capture mood symptoms in children aged 5 to 17 years

and has been shown to be diagnostically informative, especially for young children (Findling et  al., 2002; Youngstrom, Findling, Danielson, & Calabrese, 2001). Parallel to the original GBI, the Parent GBI (P-​GBI) consists of depressive and hypomanic/​biphasic subscales, both of which demonstrate excellent internal consistency. The scale also demonstrated strong validity in differentiating children with mood disorders from those with disruptive behavior disorders (80.6% accuracy), as well as distinguishing children with bipolar disorder from those with other mood disorders (86.1% accuracy; Youngstrom et al., 2001). Additional work has demonstrated promise for a 10-​item version of the P-​GBI focused specifically on mania (Youngstrom et al., 2008, 2012). Several studies have examined the validity of youth report on the GBI (e.g., Danielson, Youngstrom, Findling, & Calabrese, 2003). The GBI depression scale demonstrates good discriminative validity distinguishing between those with Axis I mood disorders and those with disruptive behavior disorders or no diagnosis, and the hypomanic/​ biphasic scale distinguishes between children with bipolar spectrum diagnoses and those with other disorders (depression, disruptive behavior disorder, and no diagnosis). Not surprisingly, the GBI is better at differentiating children with bipolar disorder from healthy controls than it is at differentiating children with bipolar disorder from those with unipolar depression (Pendergast et al., 2014). Overall, then, the GBI is a promising screening tool for identifying bipolar disorder among adult and pediatric populations. Nonetheless, more research is needed to establish norms and to evaluate this scale using consistent items and cut-​off scores. One other measure that has been increasingly popular is the Mood Disorder Questionnaire (MDQ; Hirschfeld et  al., 2000). The first 13 items of the MDQ are yes–​no questions covering the full range of manic symptoms; at least 7 must be answered “yes” to achieve a positive screen. Additional items query as to whether the symptoms identified co-​occurred and caused at least moderate problems. The MDQ scores have attained good internal consistency ranging from .79 (Isometsä et al., 2003) to .90 (Hirschfeld et  al., 2000), adequate 1-​month test–​retest reliability in clinical samples (Weber Rouget et al., 2005), and fair sensitivity in differentiating bipolar disorder from unipolar disorder clinical samples (.73 to .90). Nonetheless, specificities have been low in many studies, with considerable variability across settings (.47 to .90; Hirschfeld et al., 2000, 2003; Isometsä et al., 2003; Miller, Klugman, Berv, Rosenquist, & Ghaemi, 2004; Weber Rouget et al., 2005). Poor psychometric properties for the MDQ have emerged especially in

nonclinical samples (Dodd et al., 2009; Hirschfeld et al., 2003; Miller, Johnson, Kwapil, & Carver, 2011) and clinical settings that include disorders and comorbidities other than bipolar disorder and unipolar depression (van Zaane, van den Berg, Draisma, Nolen, & van den Brink, 2012; Zimmerman, Galione, Chelminski, Young, & Dalrymple, 2011). In other cases, researchers have applied different cut-​ points and modified scoring algorithms, in several cases concluding that no cut-​off adequately balances sensitivity and specificity (Zimmerman, 2012; Zimmerman & Galione, 2011). Despite these limitations, the MDQ has been translated into numerous languages and has been tested in many countries throughout the world (e.g., de Sousa Gurgel, Rebouças, de Matos, Carneiro, & Souza, 2012; Gervasoni et al., 2009; Meyer et al., 2011; Sanchez-​ Moreno et al., 2008). Other scales await more testing. Scores on the Hypomanic Personality Scale (HPS; Eckblad & Chapman, 1986)  have been found to predict the development of manic episodes at multiyear follow-​up in two samples of undergraduates (Kwapil et  al., 2000; Walsh, DeGeorge, Barrantes-​ Vidal, & Kwapil, 2015), and a Spanish language version is available (Ruggero, Johnson, & Cuellar, 2004), but the scale has been subjected to minimal study in clinical populations (e.g., Parker, Fletcher, McCraw, & Hong, 2014). The Bipolar Spectrum Disorder Scale (BSDS; Ghaemi et  al., 2005)  and the Mood Spectrum Self-​Reports (MOODS-​SR; Dell’Osso et al., 2002) have only been tested in a handful of studies each (e.g., Carvalho et al., 2015; Miniati et al., 2009). The Hypomania Checklist (HCL-​32; Angst, Adolfsson, et al., 2005), as its name implies, was designed to detect hypomania specifically, but it has predominantly been tested outside of the United States (Carta et al., 2006; Wu, Angst, Ou, Chen, & Lu, 2008). The Temperament Evaluation of Memphis, Pisa, Paris, and San Diego (TEMPS; Akiskal & Akiskal, 2005)  is a measure to which an issue of the Journal of Affective Disorders was dedicated, but to our knowledge it has not been compared with a diagnostic interview, and more than one version or cut-​offs have appeared across studies. The cyclothymia subscale of a TEMPS self-​report version (the autoquestionnaire) has shown promising correlations with diagnostic interviews (Mahon, Perez-​ Rodriguez, Gunawardane, & Burdick, 2013; Mendlowicz, Kelsoe, & Akiskal, 2005). The Child Mania Rating Scale (Pavuluri, Henry, Devineni, Carbray, & Birmaher, 2006) and the Parent-​Young Mania Rating scale (P-​ YMRS; Gracious, Youngstrom, Findling, & Calabrese, 2002)  are both designed to assess current symptoms of mania among youths, but few studies have

Table 9.2  Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Treatment Sensitivity | Highly Recommended
FAD | A | A | NA | A | G | G | A | A | A | ✓
SAI-E | A | A | G | NA | A | A | A | A | NA | ✓

Note: FAD = Family Assessment Device; SAI-E = Schedule for Assessment of Insight-Expanded Version; G = Good; A = Adequate; NA = Not Available.

investigated their psychometric properties (e.g., Serrano, Ezpeleta, Alda, Matalí, & San, 2011). The Inventory of Depression and Anxiety Symptoms (IDAS) was recently updated (now labeled the IDAS-​II) to incorporate assessment of manic and euphoric symptoms, but it has not yet been extensively studied (Watson et al., 2012). Overall Evaluation To date, two measures of diagnosis are dominant in diagnosing bipolar disorder among adults: the SCID and the SADS. Both have excellent psychometric characteristics for the assessment of bipolar I disorder but function poorly in identifying bipolar II disorder. It is not currently clear whether the limits in detection of bipolar II are strictly a measurement issue or reflect underlying issues in the definitions of hypomanic episodes. These results should be considered in the context of the recent DSM-​5 changes to diagnostic criteria for bipolar disorder that promote increased energy/​ activity to a cardinal symptom. We know very little about how these changes to the diagnostic code affect the psychometric performance of diagnostic measures. Although there is much debate regarding the diagnostic criteria for pediatric bipolar disorder, assessment should include a detailed clinical interview that assesses family history, as well as the intensity and duration of any mood symptoms. The mania modules of the K-​SADS have achieved psychometric support, and obtaining information from multiple sources (e.g., parents or teachers) may be useful as well. The potential benefits of robustly validated screening tools for bipolar disorder are recognized by clinicians and researchers alike but developing such tools has been extremely challenging. When considering self-​ report scales as screening tools, several issues must be kept in mind. For instance, Phelps and Ghaemi (2006) demonstrated that the usefulness of a screening tool varies depending on clinicians’ previous estimates of the probability of the disorder in question. Thus, clinician knowledge about a disorder’s prevalence in the population of

interest may be more important than a screening tool’s sensitivity or specificity. Second, different measures have been used as reference standards. Third, several authors have expanded the diagnostic interviews used as a reference standard to capture milder forms of bipolar spectrum disorder, yet provide only vague information about the modifications. Fourth, suppressor effects—​by which the inclusion of some items in a scale may boost the predictor power of other items—​may be especially relevant for bipolar disorder given that many people with bipolar disorder experience high levels of both positive and negative affect (Watson, Clark, Chmielewski, & Kotov, 2013). Each of these issues complicates comparisons between measures.
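
The point about prior probability can be made concrete with Bayes' rule: the probability that a positive screen reflects a true case depends on sensitivity, specificity, and the base rate in the setting where the screen is used. The sketch below is purely illustrative; the sensitivity and specificity values are loosely modeled on the ranges reported above for self-report screens rather than taken from any single validation study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive screen is a true case, via Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Illustrative screen (sensitivity .75, specificity .90) applied at different base rates
for prevalence in (0.01, 0.10, 0.30):
    ppv = positive_predictive_value(0.75, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.0%}")
```

Under these illustrative values, a positive screen is correct only about 7% of the time at a 1% base rate but about 76% of the time at a 30% base rate. The same instrument that is reasonably informative in a mood disorders clinic will therefore generate mostly false positives in a community sample, consistent with the poor performance of the MDQ in nonclinical settings described above.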

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

A growing body of work suggests that psychological treatments can be helpful when added to pharmacological treatment for bipolar disorder. Well-​studied psychological treatments include family-​focused therapy, cognitive therapy, interpersonal psychotherapy, and psychoeducation. Nonetheless, several trials have suggested comparable effects of these active psychological treatments (Miklowitz et al., 2007). Given this, a key question is how to determine which treatments would be most helpful. Unfortunately, research on the predictors of treatment outcome remains in its infancy within bipolar disorder (Miklowitz & Johnson, 2014). Hence, there are few measures available to predict response to a given treatment approach (for a review of key measures, see Table 9.2). Certainly, there is evidence that severity of symptom history, whether defined by multiple episodes per year, earlier age of onset, severity of depressive symptoms during manic periods, or comorbid medical and psychiatric conditions, will predict poorer outcome. Hence, clinicians will do well to gather a good clinical history to document the severity of the manic episodes, as well as the presence of comorbid complications. Reviewing the

history of episodes can be somewhat bewildering because the median time to relapse, even on adequate medication levels, is approximately 1  year (Keller et  al., 1992). Hence, most patients will have had many episodes, and the episodes will have varied in their triggers, severity, and consequences. One strategy that can be very helpful in organizing the complex information is the Life Chart (Denicoff et al., 1997), a graphing procedure developed at the National Institute of Mental Health, which can provide a collaborative tool for helping a patient describe the pattern of episodes over time, potential triggers, and effectiveness of different treatment approaches. Although frequently used, little psychometric information on the Life Chart is available. Other measures are relevant for tracking specific dimensions related to outcome. One of the best predictors of poor outcome is treatment nonadherence, with substantial evidence that treatment dropout increases risk of relapse, suicide, and hospitalization (Keck et al., 1998). It is also well established that treatment nonadherence is normative within bipolar disorder—​less than 25% of patients remain continuously adherent with medication (Merikangas et al., 2011). Hence, predicting treatment nonadherence would be a primary goal of any baseline assessment. The Scale to Assess Unawareness of Mental Disorder (SUMD; Amador et al., 1993), a semi-​ structured interview to assess awareness of symptoms of mental disorder, symptoms, social consequences of disorder, and misattributions for symptoms, has been shown to differentiate people with bipolar disorder from those without bipolar disorder (Varga Magnusson, Flekkoy, Ronneberg, & Opjordsmoen, 2006). However, baseline scores have not been found to predict treatment success over time (Ghaemi, Boiman, & Goodwin, 2000), particularly when baseline function is considered (Novick et al., 2015). On the other hand, the Schedule for Assessment of Insight-​Expanded Version (SAI-​E; Kemp & David, 1996)  has been found to predict treatment adherence at 1-​year follow-​up among people with bipolar disorder (Yen et al., 2005), to differentiate people with and without bipolar disorder (Sanz Constable, Lopez-​lbor, Kemp, & David, 1998), and to achieve a cross-​sectional correlation of .70 with other indices of treatment adherence (Sanz et al., 1998). The self-​report Insight and Treatment Attitudes Questionnaire has been used in several studies of persons with bipolar disorder, but little psychometric or predictive information is available. Inter-​ rater reliabilities of .92 and higher have been reported (Ghaemi, Stoll, & Pope, 1995; Michalakeas et al., 1994; Sajatovic et al., 2009).

Choices regarding which other risk factors to assess will likely depend on the treatment being employed. Hence, if offering cognitive therapy to address maladaptive negative cognitions about the self, clinicians may want to draw from the measures of negative cognition routinely used within the unipolar depression literature (for a review, see Chapter  7, this volume), such as the Dysfunctional Attitudes Scale (DAS; Weissman & Beck, 1978) or the Automatic Thoughts Questionnaire (Hollon & Kendall, 1980). These measures have been extremely well tested in both unipolar depression and general populations. Scores on both measures are elevated during depressive episodes of bipolar disorder compared to those of healthy control groups (Cuellar, Johnson, & Winters, 2005; Pavlickova et al., 2013), and DAS scores have been found to predict increased depressive symptoms over time (Fletcher, Parker, & Manicavasagar, 2014; Johnson & Fingerhut, 2004). Psychometric data within bipolar disorder is available for a sample of more than 300 participants (Reilly-​Harrington et  al., 2010). Nonetheless, the factor structure of the DAS appears to differ among people with bipolar disorder compared with the general population, and studies have varied widely in which subscales they used (Lam, Wright, & Smith, 2004). For clinicians who offer interpersonal and social rhythm psychotherapy (Frank, 2005), a substantial component of treatment focuses on helping clients develop a more regular schedule of daily activities. One measure, the social rhythm metric, has been most widely used to test the constancy of the daily schedule (Monk, Flaherty, Frank, Hoskinson, & Kupfer, 1990). The scale has been shown to correlate with indices of sleep (Monk, Reynolds, Buysse, DeGrazia, & Kupfer, 2003) and to be lower among persons with rapid cycling bipolar disorder compared to healthy controls (Ashman et al., 1999). Nonetheless, the scale correlates with, rather than predicts, mania symptom fluctuations disorder over time (Frank et al., 2005), so it is currently not recommended  for treatment outcome prediction. Several researchers have shown that the Pittsburgh Sleep Quality Index (PSQI) is predictive of symptom changes over time within bipolar disorder (Murray & Harvey, 2010; Saunders, Fernandez-​Mendoza, Kamali, Assari, & McInnis, 2015). Because evidence is more limited regarding how this instrument predicts change for those in a given treatment, it is not recommended at this time for treatment outcome prediction. Drawing on expressed emotion (EE) theory, family treatment programs in bipolar disorder aim to help families become less critical of their ill relative (Miklowitz & Goldstein, 1997). The most feasibly administered scale

of family criticism is the Perceived Criticism scale (PCS; Hooley & Teasdale, 1989). Patients rate on a scale of 1 to 10 how critical they think they are of their relative and how critical they think their relative is of them. Scores on this scale have demonstrated temporal stability as well as concurrent validity with other validated measures of EE, such as the Camberwell Family Interview (r = .45, p < .01; van Humbeeck et al., 2004). Unfortunately, the scale has not been found to predict the outcome of family therapy (Miklowitz, Wisniewski, Miyahara, Otto, & Sachs, 2005). In contrast, the Family Assessment Device (FAD; Epstein, Baldwin, & Bishop, 1983) is a 60-item self-report measure. FAD scores have been found to be elevated among families of those with bipolar disorder compared to controls (Du Rocher Schudlich, Youngstrom, Calabrese, & Findling, 2008; Young et al., 2013) and to predict more gain in family therapy compared to pharmacotherapy (Miller et al., 2008), but FAD scores do not appear to predict more general change in outcomes (Weinstock & Miller, 2010). Six subscales have attained factor analytic support, and a general function score aggregates those subscales (Kabacoff, Miller, Bishop, Epstein, & Keitner, 1990). Adequate construct validity, compared to other measures of family function, as well as indices of social desirability, has been demonstrated (Miller, Epstein, Bishop, & Keitner, 1985), and the scale has been used in a wide range of samples, with large normative data sets available for those with psychiatric diagnoses and controls (Friedmann et al., 1997). A concern, however, is that 1-week test–retest reliability scores are only modest (r = .66 to .75, Mn = .71; Miller et al., 1985).

Overall Evaluation Because clinical severity and comorbidity are important predictors of poorer outcome, clinicians should assess these parameters during intake interviews. The Life Chart provides a way of organizing clinical history. Beyond clinical severity, the SAI-​E has been found to predict poorer medication adherence. Although the current state of research does not offer clinicians much guidance on how to choose treatment predictors, there are some promising developments in measuring constructs relevant to case conceptualization and treatment planning, particularly in the domain of insight and family function.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

In this section, we consider measures that can be used to track the progress of treatment (Table 9.3). Currently, well-validated measures exist only for documenting changes in symptom levels. The Young Mania Rating Scale and the Bech–Rafaelsen Mania Rating Scale are among the most widely used scales for this purpose. The Young Mania Rating Scale (YMRS) was designed to be administered by a trained clinician in a 15- to 30-minute patient interview that captures the patient's report of manic symptoms during the past 48 hours as well as the clinician's observations during the interview (Young, Biggs, Ziegler, & Meyer, 1978). The 11 items, which assess elevated mood, increased energy, sexual interest, sleep,

Table 9.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
YMRS | A | E | E | NA | A | G | A | G | A | ✓
MAS | A | E | E | NA | A | G | A | G | G | ✓
SADS-C | A | NA | G | NA | A | G | G | E | G |
ASRM | A | A | NA | A | A* | G | A | A | G |
SRMI | A | G | NA | A | G | A | A | A | A |
WASSUP | A | A | NA | NA | A | A | A | A | A |
SHPSS | A | G | NA | A | A | A | A | G | A |
Brief QoL.BD | A | G | NA | A | G | G | NA | NA | A |

* But missing grandiosity.
Note: YMRS = Young Mania Rating Scale; MAS = Bech–Rafaelsen Mania Scale; SADS-C = Schedule for Affective Disorders and Schizophrenia-Change Mania Scale; ASRM = Altman Self-Rating Mania Scale; SRMI = Self-Rating Mania Inventory; WASSUP = Willingly Approached Set of Statistically Unlike Pursuits; SHPSS = Sense of Hyper-Positive Self Scale; Brief QoL.BD = Brief Quality of Life in Bipolar Disorder; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

irritability, speech, thought disorder, content, aggressive behavior, appearance, and insight, are rated on a severity scale of 0 to 4, and 4 items are given twice the weight of other items. Total scores range from 0 to 60. In the original study, the YMRS showed excellent inter-​rater reliability for total scores (intraclass correlation = .93). The YMRS is sensitive to changes in severity but may not be suitable to assess hypomania, the milder form of mania (Vieta, 2010). A  score on the YMRS greater than 25 is suggestive of severe or marked illness (Lukascwiez, 2013). The YMRS (as opposed to the parent-​ YMRS) has also been used to measure manic symptoms in children. When the YMRS was administered by a clinician combining impressions from child and parent interviews, the total YMRS score showed good validity in differentiating bipolar disorder from attention-​deficit/​hyperactivity disorder (Fristad, Weller, & Weller, 1992, 1995; Serrano et al., 2011)  and other diagnoses (Frazier et  al., 2007). The YMRS still demonstrated good discriminative validity when administered to children in a quicker and “unfiltered way” that did not differentiate chronic from episodic symptoms nor account for symptom onset or duration (Yee et al., 2015). The Bech–​ Rafaelsen Mania Rating Scale (MAS; Bech, Bolwig, Kramp, & Rafaelsen, 1979) is an 11-​item rating scale. Each item is rated on a 5-​point scale (0–​4), and the total score, ranging from 0 to 44, is obtained by summing the items. Scores of 15 or higher are indicative of mania. The MAS has been used in many different trials of anti-​manic therapies due to its strong psychometric characteristics (see Table 9.3). The scale has strong validity in detecting changes with treatment and discriminating between active and placebo therapy groups (Bech, 2002). The Schedule for Affective Disorders and Schizophrenia-​Change Version (SADS-​C) for mania is a 5-​item interview designed to assess current severity of manic symptoms. Each item is rated on a 6-​point scale based on behavioral anchors. Inter-​rater reliability estimates have been reported in a range of settings, including forensic settings (Rogers, Jackson, Salekin, & Neumann, 2003). One exception to a pattern of good inter-​reliability results was found in a sample of patients referred for emergency evaluation (intraclass correlation = .63 for mania; Rogers et  al., 2003). The scale has been found to show expected elevations within a bipolar sample compared to patients with other psychiatric disorders, and it also shows robust correlations with another interview to assess manic severity, the MAS (r  =  .89; Johnson, Magaro, & Stern, 1986). In factor analytic studies, all items load on a single scale that is distinct from dysphoria, insomnia, and
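
The scoring conventions and interpretive thresholds just described can be encoded directly, for example when checking chart-review or research data. The following sketch simply restates the ranges and cut-offs given in the text (YMRS totals of 0 to 60, with scores above 25 suggestive of severe or marked illness; MAS totals of 0 to 44, with scores of 15 or higher indicative of mania); it is an illustration, not part of either instrument.

```python
def interpret_mania_ratings(ymrs_total=None, mas_total=None):
    """Check clinician-rated mania totals against the ranges and cut-offs cited in the text."""
    notes = []
    if ymrs_total is not None:
        if not 0 <= ymrs_total <= 60:
            raise ValueError("YMRS totals range from 0 to 60")
        if ymrs_total > 25:
            notes.append("YMRS > 25: suggestive of severe or marked illness")
    if mas_total is not None:
        if not 0 <= mas_total <= 44:
            raise ValueError("MAS totals range from 0 to 44")
        if mas_total >= 15:
            notes.append("MAS >= 15: indicative of mania")
    return notes

print(interpret_mania_ratings(ymrs_total=28, mas_total=12))
```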

psychosis (Rogers et  al., 2003). Nonetheless, less factor analytic support was obtained in a study that considered the item loadings for the SADS-​C and a nurse observation scale for mania (Swann et al., 2001). Self-​Report Measures Several self-​ report measures have been used to track patients’ symptoms throughout the course of treatment. Of these, only two have a broad range of supporting psychometric evidence—​the Altman Self-​Rating Mania Scale and the Self-​Rating Mania Inventory (see Table 9.3)—​with others in development. The Altman Self-​ Rating Mania Scale (ASRM; Altman, Hedeker, Peterson, & Davis, 1997) consists of five items. For each item, participants choose from a set of five statements to best capture their feelings or behavior during the past week. Each item is scored on a scale of 0 (absent) to 4 (severe) so that total scores can range from 0 to 20. Additional items assess psychosis and irritability, but they are not included in the total score. Grandiosity is not covered. Two studies have reported comparable norms for the ASRM in patient samples (Altman et  al., 1997; Altman, Hedeker, Peterson, & Davis, 2001), but few data are available regarding scores for nonpatients. Scores on the ASRM have demonstrated adequate internal consistency and concurrent validity compared with several different reference standards, including SADS-​based diagnoses, the YMRS (Young et  al., 1978), and the Clinician-​Administered Rating Scale for Mania (CARS-​M; Altman et  al., 1997, 2001; Altman, Hedeker, Janicak, Peterson, & Davis, 1994). On the basis of an area under the curve analysis (Hanley & McNeil, 1982), Altman and colleagues concluded that a cut-​off score of 5.5 resulted in an optimal combination of sensitivity and specificity (85% and 86%, respectively), although this cut-​off might result in lower specificity (Altman et al., 2001). Finally, the ASRM has demonstrated good sensitivity to treatment, with scores dropping an average of 5 points after discharge from the hospital in one study (Altman et al., 2001). Overall, the ASRM has demonstrated good psychometric properties. However, the scale covers fewer symptoms compared to other mania indices. The Self-​ Report Manic Inventory (SRMI; Braünig, Shugar & Kruger, 1996; Shugar, Schertzer, Toner, & Di Gasbarro, 1992) is a 47-​item true/​false inventory covering a range of manic symptoms, with one additional item covering insight. Expert clinicians reviewed each item during

the development phase. In its original design, the time frame for items was the past month; later editions assessed symptoms during the previous week. Normative data have been reported for the SRMI in three small studies of inpatients, and it has demonstrated good internal consistency (Altman et al., 2001; Braünig et al., 1996; Shugar et al., 1992). Two studies have found that the scale differentiates people with bipolar disorder from those with other psychopathologies (Braünig et  al., 1996; Shugar et  al., 1992), although one other study found the SRMI to have low concurrent validity (Altman et  al., 2001). The scale appears sensitive to change in symptoms, but eight of the SRMI items capture behaviors that would not be possible within a hospital setting (Altman et al., 2001). Hence, it has been argued that the content of the scale may not be well-​suited to inpatient assessment. The Internal State Scale (ISS; Bauer et  al., 1991)  is a 17-​item scale designed to assess the severity of manic and depressive symptoms. Of its four subscales, only the 5-​ item activation subscale has correlated significantly with mania ratings. These items were designed to cover behavioral activation (e.g., restlessness and impulsivity) but not other mania symptoms (e.g., euphoria). The ISS also does not assess some other behavioral symptoms that are characteristic of mania, such as decreased sleep or rapid speech (Altman et al., 2001). The ISS has demonstrated correlations with other measures of mania ranging from .21 to .60 and rates of correct classification ranging from .55 to .78 (Altman et al., 2001; Bauer et al., 1991; Bauer, Vojta, Kinosian, Altshuler, & Glick, 2000; Cooke, Krüger, & Shugar, 1996). The ISS has demonstrated sensitivity to treatment change, in that scores diminish appreciably post-​treatment (Altman et  al., 2001; Bauer et  al., 1991; Cooke et al., 1996). Despite these strengths, scoring algorithms have varied substantially across studies, as have means and standard deviations (Altman et al., 2001; Bauer et al., 1991; Cooke et al., 1996). The scale has also been found to have a low sensitivity to manic symptoms at the time of hospitalization (Altman et al., 2001). Given these concerns, the ISS is not currently recommended. There is a growing literature that demonstrates that people with bipolar disorder set high goals and focus on achievement independent of mood state, and these may be important to assess in the context of treatment planning and outcome. The Willingly Approached Set of Statistically Unlike Pursuits scale (WASSUP; Johnson & Carver, 2006) is a self-​report measure designed to assess highly ambitious life goals. Thirty items assessing high goals in the seven domains of popular fame, friendships, world well-​ being, political influence, family, financial

success, and creativity are rated from 1 (no chance I will set this goal for myself) to 5 (definitely will set this goal for myself). Persons at risk for mania and people diagnosed with bipolar spectrum disorder show consistent elevations on the Popular fame and Financial success subscales, and elevations on these two subscales predicted manic symptoms over time (Carver & Johnson, 2009; Johnson & Carver, 2006; Fulford, Johnson, & Carver, 2008; Gruber & Johnson, 2009; Johnson & Jones, 2009). WASSUP scores have been shown to decrease in a pilot trial of a cognitive–​ behavioral treatment focused on preventing mania in bipolar patients by directly addressing goal dysregulation, including overly ambitious goals (Johnson & Fulford, 2009). Individuals with bipolar disorder who are likely to set unrealistically high goals may also view themselves in a distinct way, and this may influence how they respond to psychosocial treatment. The Sense of Hyper-​Positive Self Scale (SHPSS; Lam, Wright, & Sham, 2005) assesses positive attributes that bipolar patients believe they possess when their mood state is mildly “high.” These include confident, dynamic, adorable, entertaining, outgoing, optimistic, and creative and are rated on a 7-​point scale from 1 (not at all) to 7 (extremely) as to (a) how well these words describe the patient most of the time and (b) ideally how the patient would like him-​or herself to be. High scores on this scale identify positive mood, increased arousal, and increased behavioral activation characteristic of mild elation and distinct from clinical hypomania or mania. Scores on the scale have demonstrated good internal reliability and test–​ retest reliability. Patients who value and perceive themselves as possessing these attributes demonstrated a poorer response to cognitive therapy in the absence of a relationship between SHPSS scores and current manic symptom scores (Lam, Wright, & Sham, 2005). Although symptom measures are the primary outcomes typically seen in studies of bipolar disorder, inclusion of measures of quality of life may provide a richer picture of the patient and be a useful tool for both treatment planning and assessment of treatment outcomes (Murray & Michalak, 2012). Given the growing evidence that people with bipolar disorder experience severe decline in quality of life and recognizing a need for a disorder-​specific assessment, Michalak and Murray (2010) developed a quality of life measure drawn from both qualitative interviews with bipolar patients, caregivers, and research and treatment experts and literature review. The QoL.BD assesses 14 domains during the prior 7 days using a 5-​point Likert scale ranging from strongly disagree (1) to strongly

agree (5). The domains assessed include physical, sleep, mood, cognition, leisure, social, spirituality, finances, household, self-​esteem, independence, identity, and the optional domains of work and education. The brief form of the scale draws one item from each domain and demonstrated moderate to large correlations with the 12 basic scales of the QoL.BD. Finally, clinicians should bear in mind that naturalistic studies suggest that people with bipolar disorder experience at least some depressive symptoms one-​third of the weeks in a year (Judd et al., 2002). Higher risk of suicide has been documented during depression within bipolar disorder (Angst, Angst, Gerber-​Werder, & Gamma, 2005). Given this, it is recommended that clinicians track not only manic symptoms but also depression and suicidality. Overall Evaluation Ideally, outcome assessments should incorporate both interview and self-​report measures. Clinicians have several interview-​based measures available for tracking change in symptoms over time:  the SADS-​C, the YMRS, and the MAS. Although more data are available to support the YMRS and the MAS, the SADS-​C has the advantage of brevity. Self-​report measures such as the ASRM and the SRMI can also be completed quickly. This brevity and ease of use can come with the price, however, of reduced precision. To track progress, many clients find it helpful to create their own self-​monitoring forms or to complete brief checklists. Comparing results of the interview with self-​reported symptoms can be helpful for clients in building a greater awareness of symptoms. As treatment progresses, clients often find it helpful to begin to attend to smaller fluctuations in symptoms, such that they can implement early intervention strategies to promote calm and good medical care before symptoms intensify (Lam & Wong, 2005). Given the common problems with insight within this disorder, research is needed on how best to integrate self and clinician ratings of manic symptoms. Beyond traditional symptom measures, examining tendencies toward highly ambitious goal setting, sense of hyper-​positive self, and quality of life may provide for a richer picture of patients and prove to be useful assessments for treatment planning and outcome.
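
For clinicians who ask clients to self-monitor between sessions, even a very simple log of weekly self-report totals can support the early-intervention strategies described above. The sketch below flags weeks in which an ASRM total exceeds the 5.5 cut-off discussed earlier; the log format and the example values are hypothetical and not part of the ASRM itself.

```python
from datetime import date

ASRM_CUTOFF = 5.5  # cut-off discussed above for a positive ASRM screen

# Hypothetical weekly self-monitoring log: (date, ASRM total on the 0-20 scale)
asrm_log = [
    (date(2017, 3, 6), 2),
    (date(2017, 3, 13), 3),
    (date(2017, 3, 20), 7),
]

def flag_weeks(log, cutoff=ASRM_CUTOFF):
    """Return the weeks whose self-rated mania total exceeds the cut-off,
    as a prompt to review early-intervention plans with the client."""
    return [(week, total) for week, total in log if total > cutoff]

print(flag_weeks(asrm_log))  # [(datetime.date(2017, 3, 20), 7)]
```

In practice such a log would complement, not replace, clinician ratings, and, as noted above, depressive symptoms and suicidality should be tracked alongside manic symptoms.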

CONCLUSIONS AND FUTURE DIRECTIONS

In this chapter, we have considered assessment tools that are effective or promising in diagnosis, treatment

planning, and treatment outcome monitoring of bipolar disorder. In evaluating current measures, the need for ongoing research is quite apparent. In regard to diagnostic assessment, there is ongoing discussion about the requisite severity and duration of symptoms for hypomania. Similarly, substantial debate exists concerning the best criteria for the diagnosis of bipolar disorder among children and adolescents. Hence, diagnostic instruments are likely to be modified over time to increase their applicability for milder forms of the disorder and for younger age groups. Beyond the need for better diagnostic measures, there is a fundamental need for research on the predictors of outcome within psychological forms of treatment. Measures that could help define the best choice of therapy would be extremely helpful for clinicians. Finally, there is a need for measures that are specifically developed to capture the types of social dysfunctions that are most prevalent in bipolar disorder. Although many researchers apply social functioning measures developed for depression and schizophrenia, it will be important to consider ways in which manic symptoms can damage relationships. Currently, however, several excellent resources for assessment of bipolar disorder are available. For diagnosis, the SCID and the SADS allow for reliable and valid diagnosis of bipolar I disorder. For case conceptualization and treatment planning, the SAI-​E predicts medication nonadherence in bipolar disorder, and the Life Chart can help assess the history of episodes and triggers. The SHPSS, although relatively new, has predicted outcomes of cognitive therapy in one study. Once treatment commences, interview measures such as the YMRS and the MAS, as well as self-​report measures such as the ASRM and the SRMI, are available for monitoring symptom severity. We hope that this review stimulates clinical use of the available measures and encourages research focused on addressing the gaps in the assessment literature.

References

Akiskal, H. S. (2002). Classification, diagnosis, and boundaries of bipolar disorders: A review. In M. Maj, H. S. Akiskal, J. J. Lopez-Ibor, & N. Sartorius (Eds.), Bipolar disorder (pp. 1–52). Chichester, UK: Wiley. Akiskal, H. S., & Akiskal, K. K. (2005). TEMPS: Temperament Evaluation of Memphis, Pisa, Paris and San Diego. Journal of Affective Disorders, 85, 1–2. Akiskal, H. S., Maser, J. D., Zeller, P. J., Endicott, J., Coryell, W., Keller, M., . . . Goodwin, F. (1995). Switching from “unipolar” to bipolar II: An 11-year prospective study of

clinical and temperamental predictors in 559 patients. Archives of General Psychiatry, 52, 114–​123. Alloy, L. B., Urošević, S., Abramson, L. Y., Jager-​Hyman, S., Nusslock, R., Whitehouse, W. G., & Hogan, M. (2012). Progression along the bipolar spectrum: A longitudinal study of predictors of conversion from bipolar spectrum conditions to bipolar I and II disorders. Journal of Abnormal Psychology, 121, 16–​27. Altman, E. G., Hedeker, D., Peterson, J. L., & Davis, J. M. (1997). The Altman Self-​Rating Mania Scale. Biological Psychiatry, 42, 948–​955. Altman, E. G., Hedeker, D., Peterson, J. L., & Davis, J. M. (2001). A comparative evaluation of three self-​rating scales for acute mania. Biological Psychiatry, 50, 468–​471. Altman, E. G., Hedeker, D. R., Janicak, P., Peterson, J. L., & Davis, J. M. (1994). The Clinician-​Administered Rating Scale for Mania (CARS-​M):  Development, reliability, and validity. Biological Psychiatry, 36, 124–​134. Amador, X. F., Strauss, D. H., Gorman, J. M., Endicott, J., Yale, S. A., & Flaum, M. (1993). The assessment of insight in psychosis. American Journal of Psychiatry, 150, 873–​879. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). New York, NY: Macmillan. Andreasen, N. C., Grove, W. M., Shapiro, R. W., Keller, M. B., Hirschfeld, R. M., & McDonald-​Scott, P. (1981). Reliability of lifetime diagnosis:  A multicenter collaborative perspective. Archives of General Psychiatry, 38, 400–​405. Angst, J., Adolfsson, R., Benazzi, F., Gamma, A., Hantouche, E., Meyer, T. D.,  .  .  .  Scott, J. (2005). The HCL-​ 32: Towards a self-​assessment tool for hypomanic symptoms in outpatients. Journal of Affective Disorders, 88, 217–​233. Angst, J., Angst, F., Gerber-​Werder, R., & Gamma, A. (2005). Suicide in 406 mood-​disorderd patients with and without long-​term medication: A 40 to 44 years’ follow-​up. Archives of Suicide Research, 9, 279–​300. Ashman, S. B., Monk, T. H., Kupfer, D. J., Clark, C. H., Myers, F. S., Frank, E., & Leibenluft, E. (1999). Relationship between social rhythms and mood in patients with rapid cycling bipolar disorder. Psychiatry Research, 86, 1–​8. Axelson, D., Birmaher, B. J., Brent, D., Wassick, S., Hoover, C., Bridge, J., & Ryan, N. (2003). A preliminary study of the Kiddie Schedule for Affective Disorders and Schizophrenia for School-​Age Children mania rating

scale for children and adolescents. Journal of Child and Adolescent Psychopharmacology, 13, 463–​470. Axelson, D., Findling, R. L., Fristad, M. A., Kowatch, R. A., Youngstrom, E. A., Horwitz, S. M.,  .  .  .  Birmaher, B. (2012). Examining the proposed disruptive mood dysregulation disorder diagnosis in children in the Longitudinal Assessment of Manic Symptoms study. Journal of Clinical Psychiatry, 73, 1342–​1350. Bauer, M. S., Altshuler, L., Evans, D. R., Beresford, T., Williford, W. O., & Hauger, R. (2005). Prevalence and distinct correlates of anxiety, substance, and combined comorbidity in a multi-​site public sector sample with bipolar disorder. Journal of Affective Disorders, 85, 301–​315. Bauer, M. S., Crits-​Christoph, P., Ball, W. A., Dewees, E., McAllister, T., Alahi, P.,  .  .  .  Whybrow, P. C. (1991). Independent assessment of manic and depressive symptoms by self-​ rating:  Scale characteristics and implications for the study of mania. Archives of General Psychiatry, 48, 807–​812. Bauer, M. S., Vojta, C., Kinosian, B., Altshuler, L., & Glick, H. (2000). The Internal State Scale: Replication of its discriminating abilities in a multisite, public sector sample. Bipolar Disorders, 2, 340–​346. Bech, P. (2002). The Bech–​Rafaelsen Mania Scale in clinical trials of therapies for bipolar disorder. CNS Drugs, 16, 47–​63. Bech, P., Bolwig, T. G., Kramp, P., & Rafaelsen, O. J. (1979). The Bech–​Rafaelsen Mania Scale and the Hamilton Depression Scale. Acta Psychiatrica Scandinavica, 59, 420–​430. Blader, J. C., & Carlson, G. A. (2007). Increased rates of bipolar disorder diagnoses among US child, adolescent, and adult inpatients, 1996–​2004. Biological Psychiatry, 62, 107–​114. Braünig, P., Shugar, G., & Krüger, S. (1996). An investigation of the Self-​Report Mania Inventory as a diagnostic and severity scale for mania. Comprehensive Psychiatry, 37, 52–​55. Brickman, A., LoPiccolo, C., & Johnson, S. L. (2002). Screening for bipolar disorder by community providers [Letter to the editor]. Psychiatric Services, 53, 349. Carta, M. G., Hardoy, M. C., Cadeddu, M., Murru, A., Campus, A., Morosini, P. L.,  .  .  .  Angst, J. (2006). The accuracy of the Italian version of the Hypomania Checklist (HCL-​32) for the screening of bipolar disorders and comparison with the Mood Disorder Questionnaire (MDQ) in a clinical sample. Clinical Practice and Epidemiology in Mental Health, 2, 1. Carvalho, A. F., Takwoingi, Y., Sales, P. M. G., Soczynska, J. K., Köhler, C. A., Freitas, T. H., . . . Vieta, E. (2015). Screening for bipolar spectrum disorders:  A comprehensive meta-​analysis of accuracy studies. Journal of Affective Disorders, 172, 337–​346.

Carver, C. S., & Johnson, S. L. (2009). Tendencies toward mania and tendencies toward depression have distinct motivational, affective, and cognitive correlates. Cognitive Therapy and Research, 33, 552–​569. Cooke, R. G., Krüger, S., & Shugar, G. (1996). Comparative evaluation of two self-​ report mania rating scales. Biological Psychiatry, 40, 279–​283. Coryell, W., Endicott, J., Maser, J. D., Keller, M. B., Leon, A. C., & Akiskal, H. S. (1995). Long-​term stability of polarity distinctions in the affective disorders. American Journal of Psychiatry, 152, 385–​390. Cuellar, A. K., Johnson, S. L., & Winters, R. (2005). Distinctions between bipolar and unipolar depression. Clinical Psychology Review, 25, 307–​339. Danielson, C. K., Youngstrom, E. A., Findling, R. L., & Calabrese, J. R. (2003). Discriminative validity of the General Behavior Inventory using youth report. Journal of Abnormal Child Psychology, 31, 29–​39. de Sousa Gurgel, W., Rebouças, D. B., de Matos, K. J. N., Carneiro, A. H.  S., & Souza, F. G.  d. M.; Affective Disorders Study Group. (2012). Brazilian Portuguese validation of Mood Disorder Questionnaire. Comprehensive Psychiatry, 53, 308–​312. Dell’Osso, L., Armani, A., Rucci, P., Frank, E., Fagiolini, A., & Corretti, G., . . . Cassano, G. B. (2002). Measuring mood spectrum:  Comparison of interview (SCI-​ MOODS) and self-​report (MOODS-​SR) instruments. Comprehensive Psychiatry, 43, 69–​73. Denicoff, K. D., Smith-​ Jackson, E. E., Disney, E. R., Suddath, R. L., Leverich, G. S., & Post, R. M. (1997). Preliminary evidence of the reliability and validity of the prospective life-​chart methodology (LCM-​p). Journal of Psychiatric Research, 31, 593–​603. Depue, R. A., & Klein, D. N. (1988). Identification of uni-​ polar and bipolar affective conditions in nonclinical and clinical populations by the General Behavior Inventory. In D. L. Dunner, E. S. Gershon, & J. E. Barrett (Eds.), Relatives at risk for mental disorder (pp. 179–​ 204). New York, NY: Raven Press. Depue, R. A., Krauss, S., Spoont, M. R., & Arbisi, P. (1989). General Behavior Inventory identification of unipolar and bipolar affective conditions in a non-​clinical university population. Journal of Abnormal Psychology, 98, 117–​126. Depue, R. A., Slater, J. F., Wolfstetter-​Kausch, H., Klein, D., Goplerud, E., & Farr, D. (1981). A behavioral paradigm for identifying persons at risk for bipolar depressive disorder:  A conceptual framework and five validation studies. Journal of Abnormal Psychology, 90, 381–​437. Dodd, S., Williams, L. J., Jacka, F. N., Pasco, J. A., Bjerkeset, O., & Berk, M. (2009). Reliability of the Mood Disorder Questionnaire:  Comparison with the Structured Clinical Interview for the DSM-​ IV-​ TR

in a population sample. Australian and New Zealand Journal of Psychiatry, 43, 526–​530. Drancourt, N., Etain, B., Lajnef, M., Henry, C., Raust, A., Cochet, B.,  .  .  .  Bellivier, F. (2013). Duration of untreated bipolar disorder:  Missed opportunities on the long road to optimal treatment. Acta Psychiatrica Scandinavica, 127, 136–​144. Du Rocher Schudlich, T. D., Youngstrom, E. A., Calabrese, J. R., & Findling, R. L. (2008). The role of family functioning in bipolar disorder in families. Journal of Abnormal Child Psychology, 36, 849–​863. Dunner, D. L. (1996). Bipolar depression with hypomania (bipolar II). In T. A. Widiger, A. J. Frances, H. A. Pincus, R. Ross, M. B. First, & W. W. Davis (Eds.), DSM-​IV sourcebook:  Vol. 2 (pp. 53–​64). Washington, DC: American Psychiatric Association. Dunner, D. L., & Tay, L. K. (1993). Diagnostic reliability of the history of hypomania in bipolar II patients and patients with major depression. Journal of Comprehensive Psychiatry, 34, 303–​307. Eckblad, M., & Chapman, L. J. (1986). Development and validation of a scale for hypomanic personality. Journal of Abnormal Psychology, 95, 214–​222. Endicott, J., & Spitzer, R. L. (1978). A diagnostic interview:  The schedule for affective disorders and schizophrenia. Archives of General Psychiatry, 35, 837–​844. Epstein, N.B., Baldwin, L.M., & Bishop, D.S. (1983). The McMaster family assessment device. Journal of Marital and Family Therapy, 9, 171–​180. Findling, R. L., Youngstrom, E. A., Danielson, C. K., DelPorto-​Bedoya, D., Papish-​David, R., Townsend, L., & Calabresse, J. R. (2002). Clinical decision-​making using the General Behavior Inventory in juvenile bipolarity. Bipolar Disorders, 4, 34–​42. First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1997). Structured Clinical Interview for DSM-​ IV (SCID). Washington, DC:  American Psychiatric Press. First, M. B., Williams, J. B., Karg, R. S., & Spitzer, R. L. (2016). Structured Clinical Interview for DSM-​ 5 disorders:  SCID-​5-​CV clinician version. Arlington, VA: American Psychiatric Publishing. Fletcher, K., Parker, G., & Manicavasagar, V. (2014). The role of psychological factors in bipolar disorder: Prospective relationships between cognitive style, coping style and symptom expression. Acta Neuropsychiatrica, 26, 81–​95. Fraguas, D., Correll, C. U., Merchán-​Naranjo, J., Rapado-​ Castro, M., Parellada, M., Moreno, C., & Kendall, T. (2011). Efficacy and safety of second-​generation antipsychotics in children and adolescents with psychotic and bipolar spectrum diseases:  Comprehensive review of prospective head-​ to-​ head and placebo-​ controlled comparisons. European Neuropsychopharmacology, 21, 621–​645.

Frank, E. (2005). Treating bipolar disorder: A clinician’s guide to interpersonal and social rhythm therapy. New  York, NY: Guilford. Frank, E., Kupfer, D. J., Thase, M. E., Mallinger, A. G., Swartz, H. A., Fagiolini, A. M.,  .  .  .  Monk, T. (2005). Two-​year outcomes for interpersonal and social rhythm therapy in individuals with bipolar I disorder. Archives of General Psychiatry, 62, 996–​1004. Frazier, T. W., Demeter, C. A., Youngstrom, E. A., Calabrese, J. R., Stansbrey, R. J., McNamara, N. K., & Findling, R. L. (2007). Evaluation and comparison of psychometric instruments for pediatric bipolar spectrum disorders in four age groups. Journal of Child and Adolescent Psychopharmacology, 17, 853–​866. Friedmann, M. S., McDermut, W. H., Solomon, D. A., Ryan, C. E., Keitner, G. I., & Miller, I. W. (1997). Family functioning and mental illness:  A comparison of psychiatric and nonclinical families. Family Process, 36, 357–​367. Fristad, M. A., Weller, E. B., & Weller, R. A. (1992). The Mania Rating Scale:  Can it be used in children? Journal of the American Academy of Child & Adolescent Psychiatry, 31, 252–​257. Fristad, M. A., Weller, E. B., & Weller, R. A. (1995). The Mania Rating Scale (MRS):  Further reliability and validity studies with children. Annals of Clinical Psychiatry, 7, 127–​132. Fulford, D., Johnson, S. L., & Carver, C. S. (2008). Commonalities and differences in characteristics of persons at risk for narcissism and mania. Journal of Research in Personality, 42, 1427–​1438. Geller, B., Zimerman, B., Williams, M., Bolhofner, K., Craney, J. L., DelBello, M. P., & Soutullo, C. (2001). Reliability of the Washington University in St. Louis Kiddie Schedule for Affective Disorders and Schizophrenia (WASH-​ U-​ KSADS) mania and rapid cycling sections. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 450–​455. Gervasoni, N., Rouget, B. W., Miguez, M., Dubuis, V., Bizzini, V., Gex-​Fabry, M.,  .  .  .  Aubry, J.-​M. (2009). Performance of the Mood Disorder Questionnaire (MDQ) according to bipolar subtype and symptom severity. European Psychiatry, 24, 341–​344. Ghaemi, S. N., Boiman, E., & Goodwin, F. K. (2000). Insight and outcome in bipolar, unipolar and anxiety disorders. Journal of Comprehensive Psychiatry, 41, 161–​171. Ghaemi, S. N., Lenox, M. S., & Baldessarini, R. J. (2001). Effectiveness and safety of long-​ term anti-​ depressant treatment in bipolar disorder. Journal of Clinical Psychiatry, 62, 565–​569. Ghaemi, S. N., Miller, C. J., Berv, D. A., Klugman, J., Rosenquist, K. J., & Pies, R. W. (2005). Sensitivity and specificity of a new bipolar spectrum diagnostic scale. Journal of Affective Disorders, 84, 273–​277.

Ghaemi, S. N., Stoll, A. L., & Pope, H. G., Jr. (1995). Lack of insight in bipolar disorder: The acute manic episode. Journal of Nervous and Mental Disease, 183, 464–​467. Gracious, B. L., Youngstrom, E. A., Findling. R. L., & Calabrese, J. R. (2002). Discriminative validity of a parent version of the Young Mania Rating Scale. Journal of the American Academy of Child & Adolescent Psychiatry, 41, 1350–​1359. Gruber, J., & Johnson, S. L. (2009). Positive emotional traits and ambitious goals among people at risk for bipolar disorder: The need for specificity. International Journal of Cognitive Therapy, 2, 176–​187. Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143, 29–​36. Hirschfeld, R. M. A., Holzer, C., Calabrese, J. R., Weissman, M., Reed, M., Davies, M.,  .  .  .  Hazard, E. (2003). Validity of the Mood Disorder Questionnaire: A general population study. American Journal of Psychiatry, 160, 178–​180. Hirschfeld, R. M.  A., Williams, J. B.  W., Spitzer, R. L., Calabrese, J. R., Flynn, L., Keck, P. E. Jr., . . . Zajecka, J. (2000). Development and validation of a screening instrument for bipolar spectrum disorder:  The Mood Disorder Questionnaire. American Journal of Psychiatry, 157, 1873–​1875. Hollon, S. D., & Kendall, P. C. (1980). Cognitive self-​ statements in depression:  Development of an automatic thoughts questionnaire. Cognitive Therapy and Research, 4, 383–​395. Hooley, J. M., & Teasdale, J. D. (1989). Predictors of relapse in unipolar depressives:  Expressed emotion, marital distress, and perceived criticism. Journal of Abnormal Psychology, 98, 229–​235. Isometsä, E., Suominen, K., Mantere, O., Valtonen, H., Leppämäki, S., Pippingsköld, M., & Arvilommi, P. (2003). The Mood Disorder Questionnaire improves recognition of bipolar disorder in psychiatric care. BMC Psychiatry, 3, 8. Jenkins, M. M., Youngstrom, E. A., Washburn, J. J., & Youngstrom, J. K. (2011). Evidence-​ based strategies improve assessment of pediatric bipolar disorder by community practitioners. Professional Psychology:  Research and Practice, 42, 121. Johnson, S. L., & Fulford, D. (2009). Preventing mania:  A preliminary examination of the GOALS program. Behavior Therapy, 40, 103–​113. Johnson, M. H., Magaro, P. A., & Stern, S. L. (1986). Use of the SADS-​C as a diagnostic and symptom severity measure. Journal of Consulting and Clinical Psychology, 54, 546–​551. Johnson, S. L., & Carver, C. (2006). Extreme goal setting and vulnerability to mania among undiagnosed young adults. Cognitive Therapy and Research, 30, 377–​395.

Johnson, S. L., Edge, M. D., Holmes, M. K., & Carver, C. S. (2012). The behavioral activation system and mania. Annual Review of Clinical Psychology, 8, 243–​267. Johnson, S. L., & Fingerhut, R. (2004). Negative cognitions predict the course of bipolar depression, not mania. Journal of Cognitive Psychotherapy, 18, 149–​162. Johnson, S. L., & Jones, S. (2009). Cognitive correlates of mania risk:  Are responses to success, positive moods, and manic symptoms distinct or overlapping? Journal of Clinical Psychology, 65, 891–​905. Johnson, S. L., & Kizer, A. (2002). Bipolar and unipolar depression:  A comparison of clinical phenomenology and psychosocial predictors. In I. H. Gotlib & C. Hammen (Eds.), Handbook of depression (pp. 141–​ 165). New York, NY: Guilford. Johnson, S. L., & Leahy, R. L. (Eds.). (2004). Psychological treatment of bipolar disorder. New York, NY: Guilford. Judd, L. L., Akiskal, H. S., Schettler, P. J., Endicott, J., Maser, J., Solomon, D. A., . . . Keller, M. B. (2002). The long-​ term natural history of the weekly symptomatic status of bipolar I disorder. Archives of General Psychiatry, 59, 530–​538. Kabacoff, R. I., Miller, I. W., Bishop, D. S., Epstein, N. B., & Keitner, G. I. (1990). A psychometric study of the McMaster family assessment device in psychiatric, medical, and nonclinical samples. Journal of Family Psychology, 3, 431–​439. Karkowski, L. M., & Kendler, K. S. (1997). An examination of the genetic relationship between bipolar and unipolar illness in an epidemiological sample. Psychiatric Genetics, 7, 159–​163. Kaufman, J., Birmaher, B., Brent, D., Rao, U., Flynn, C., Moeci, P., . . . Ryan, N. (1997). Schedule for Affective Disorders and Schizophrenia for School-​Age Children–​ Present and Lifetime Version (K-​ SADS-​ PL):  Initial reliability and validity data. Journal of the American Academy of Child & Adolescent Psychiatry, 36, 980–​988. Kazdin, A. E., & Kagan, J. (1994). Models of dysfunction in developmental psychopathology. Clinical Psychology: Science and Practice, 1, 35–​52. Keck, P. E., McElroy, S. L., Strakowski, S. M., West, S. A., Sax, K. W., Hawkins, J. M.,  .  .  .  Haggard, P. (1998). Twelve-​month outcome of patients with bipolar disorder following hospitalization for a manic or mixed episode. American Journal of Psychiatry, 155, 646–​652. Keller, M. B., Lavori, P. W., Kane, J. M., Gelenberg, A. J., Rosenbaum, J. F., Walzer, E. A., . . . Baker, L. A. (1992). Subsyndromal symptoms in bipolar disorder:  A comparison of standard and low serum levels of lithium. Archives of General Psychiatry, 49, 371–​376. Keller, M. B., Lavori, P. W., McDonald-​Scott, P., Scheftner, W. A., Andreasen, N. C., Shapiro, R. W., . . . Croughan, J. (1981). Reliability of lifetime diagnoses and symptoms

in patients with current psychiatric disorder. Journal of Psychiatry Research, 16, 229–​240. Kemp R., & David A. (1996). Psychological predictors of insight and compliance in psychotic patients. British Journal of Psychiatry, 169, 444–​450. Kessler, R. C., Berglund, P., Demler, O., Jin, R., & Walters, E. E. (2005). Lifetime prevalence and age-​ of-​ onset distributions of DSM-​ IV disorders in the National Comorbidity Survey replication. Archives of General Psychiatry, 62, 593–​602. Kessler, R. C., Rubinow, D. R., Holmes, C., Abelson, J. M., & Zhao, S. (1997). The epidemiology of DSM-​ III-​R bipolar I disorder in a general population survey. Psychological Medicine, 27, 1079–​1089. Kieseppa, T., Partonen, T., Haukka, J., Kaprio, J., & Lonnqvis, J. (2004). High concordance of bipolar I disorder in a nationwide sample of twins. American Journal of Psychiatry, 161, 1814–​1821. Klein, D. N., Dickstein, S., Taylor, E. B., & Harding, K. (1989). Identifying chronic affective disorders in out-​ patients: Validation of the General Behavior Inventory. Journal of Consulting and Clinical Psychology, 57, 106–​111. Kwapil, T. R., Miller, M. B., Zinser, M. C., Chapman, L. J., Chapman, J., & Eckblad, M. (2000). A longitudinal study of high scorers on the Hypomanic Personality Scale. Journal of Abnormal Psychology, 109, 222–​226. Lam, D., & Wong, G. (2005). Prodromes, coping strategies, and psychological interventions in bipolar disorders. Clinical Psychology Review, 25, 1028–​1042. Lam, D., Wright, K., & Smith, N. (2004). Dysfunctional assumptions in bipolar disorder. Journal of Affective Disorders, 79, 193–​199. Lam, D. H., Wright, K., & Sham, P. (2005). Sense of hyper-​ positive self and response to cognitive therapy in bipolar disorder. Psychological Medicine, 35, 69–​77. Leibenluft, E., Albert, P. S., Rosenthal, N. E., & Wehr, T. A. (1996). Relationship between sleep and mood in patients with rapid-​cycling bipolar disorder. Journal of Psychiatric Research, 63, 161–​168. Lish, J. D., Dime-​Meenan, S., Whybrow, P. C., Price, R. A., & Hirschfeld, R. M. (1994). The National Depressive and Manic–​ Depressive Association (DMDA) survey of bipolar members. Journal of Affective Disorders, 31, 281–​294. Loeber, R., Green, S. M., & Lahey, B. B. (1990). Mental health professionals’ perception of the utility of children, mothers, and teachers as informants on childhood psychopathology. Journal of Clinical Child Psychology, 19, 136–​143. Lukasiewicz, M., Gerard, S., Besnard, A., Falissard, B., Perrin, E., Sapin, H., Tohen, M., Reed, C., Azorin, J. M., & The emblem study group (2013). Young mania rating scale: How to interpret the numbers? Determination of

a severity threshold and of the minimal clinically significant difference in the EMBLEM cohort. International Journal of Methods in Psychiatric Research, 22, 46–​58. Mahon, K., Perez-​ Rodriguez, M., Gunawardane, N., & Burdick, K. (2013). Dimensional endophenotypes in bipolar disorder:  Affective dysregulation and psychosis proneness. Journal of Affective Disorders, 151, 695–​701. Mallon, J. C., Klein, D. N., Bornstein, R. F., & Slater, J. F. (1986). Discriminant validity of the General Behavior Inventory:  An outpatient study. Journal of Personality Assessment, 50, 568–​577. Mendlowicz, M., Jean-​Louis, G., Kelsoe J., & Akiskal H. (2005). A comparison of recovered bipolar patients, healthy relatives of bipolar probands, and normal controls using the short TEMPS-​ A. Journal of Affective Disorders, 85, 147–​151. Merikangas, K. R., Jin, R., He, J.-​P., Kessler, R. C., Lee, S., Sampson, N. A.,  .  .  .  Zarkov, Z. (2011). Prevalence and correlates of bipolar spectrum disorder in the World Mental Health Survey Initiative. Archives of General Psychiatry, 68, 241–​251. http://​doi.org/​10.1001/​ archgenpsychiatry.2011.12 Meyer, T. D., Bernhard, B., Born, C., Fuhr, K., Gerber, S., Schaerer, L.,  .  .  .  Bauer, M. (2011). The Hypomania Checklist-​ 32 and the Mood Disorder Questionnaire as screening tools—​Going beyond samples of purely mood-​disordered patients. Journal of Affective Disorders, 128, 291–​298. Michalak, E. E., & Murray, G. (2010). Development of the QoL.BD: A disorder-​specific scale to assess quality of life in bipolar disorder. Bipolar Disorders, 12, 727–​740. Michalakeas, A., Skoutas, C., Charalambous, A., Peristeris, A., Marinos, V., Keramari, E., & Theologou, A. (1994). Insight in schizophrenia and mood disorders and its relation to psychopathology. Acta Psychiatrica Scandinavica, 90, 46–​49. Miklowitz, D. J., & Goldstein, M. J. (1997). Bipolar disorder:  A family-​focused treatment approach. New  York, NY: Guilford. Miklowitz, D. J., & Johnson, S. L. (2014). Bipolar and related disorders. In D. Beidel (Ed.), Adult psychopathology and diagnosis (7th ed. pp. 217–​252). New York, NY: Wiley. Miklowitz, D. J., Otto, M. W., Frank, E., Reilly-​Harrington, N. A., Wisniewski, S. R., Kogan, J. N.,  .  .  .  Araga, M. (2007). Psychosocial treatments for bipolar depression:  A 1-​year randomized trial from the Systematic Treatment Enhancement Program. Archives of General Psychiatry, 64, 419–​426. Miklowitz, D. J., Wisniewski, S. R., Miyahara, S., Otto, M. W., & Sachs, G. S. (2005). Perceived criticism from family members as a predictor of the one-​year course of bipolar disorder. Psychiatry Research, 136, 101–​111. Miller, C. J., Johnson, S. L., Kwapil, T. R., & Carver, C. S. (2011). Three studies on self-​report scales to detect

bipolar disorder. Journal of Affective Disorders, 128, 199–​210. Miller, C. J., Klugman, J., Berv, D. A., Rosenquist, K. J., & Ghaemi, S. N. (2004). Sensitivity and specificity of the Mood Disorder Questionnaire for detecting bipolar disorder. Journal of Affective Disorders, 81, 167–​171. Miller, I. W., Epstein, N. B., Bishop, D. S., & Keitner, G. I. (1985). The McMaster Family Assessment Device: Reliability and validity. Journal of Marital and Family Therapy, 11, 345–​356. Miller, I. W., Keitner, G. I., Ryan, C. E., Uebelacker, L. A., Johnson, S. L., & Solomon, D. A. (2008). Family treatment for bipolar disorder:  Family impairment by treatment interactions. Journal of Clinical Psychiatry, 69, 732. Miniati, M., Rucci, P., Frank, E., Oppo, A., Kupfer, D. J., Fagiolini, A., & Cassano, G. B. (2009). Sensitivity to change and predictive validity of the MOODS-​ SR questionnaire, last-​month version. Psychotherapy and Psychosomatics, 78, 116–​124. Monk, T. H., Flaherty, J. F., Frank, E., Hoskinson, K., & Kupfer, D. J. (1990). The social rhythm metric:  An instrument to quantify the daily rhythms of life. Journal of Nervous and Mental Disease, 178, 120–​126. Monk, T. H., Reynolds, C. F., III, Buysse, D. J., DeGrazia, J. M., & Kupfer, D. J. (2003). The relationship between lifestyle regularity and subjective sleep quality. Chronobiology International, 20, 97–​107. Moreno, C., Laje, G., Blanco, C., Jiang, H., Schmidt, A. B., & Olfson, M. (2007). National trends in the outpatient diagnosis and treatment of bipolar disorder in youth. Archives of General Psychiatry, 64, 1032–​1039. Murray, G., & Harvey, A. (2010). Circadian rhythms and sleep in bipolar disorder. Bipolar Disorders, 12, 459–​472. Murray, G., & Michalak, E. E. (2012). The quality of life construct in bipolar disorder research and practice: Past, present, and possible futures. Bipolar Disorders, 14, 793–​796. Novick, D., Montgomery, W., Treuer, T., Aguado, J., Kraemer, S., & Haro, J. M. (2015). Relationship of insight with medication adherence and the impact on outcomes in patients with schizophrenia and bipolar disorder:  Results from a 1-​ year European outpatient observational study. BMC Psychiatry, 15, 189. Parker, G., Fletcher, K., McCraw, S., & Hong, M. (2014). The Hypomanic Personality Scale:  A measure of personality and/​or bipolar symptoms? Psychiatry Research, 220, 654–​658. Pavlickova, H., Varese, F., Turnbull, O., Scott, J., Morriss, R., Kinderman, P., . . . Bentall, R. P. (2013). Symptom-​ specific self-​referential cognitive processes in bipolar disorder:  A longitudinal analysis. Psychological Medicine, 43, 1895–​1907.

Pavuluri, M. N., Henry, D. B., Devineni, B., Carbray, J. A., & Birmaher, B. (2006). Child Mania Rating Scale: Development, reliability, and validity. Journal of the American Academy of Child & Adolescent Psychiatry, 45, 550–​560. Pendergast, L. L., Youngstrom, E. A., Brown, C., Jensen, D., Abramson, L. Y., & Alloy, L. B. (2015). Structural invariance of General Behavior Inventory (GBI) scores in Black and White young adults. Psychological Assessment, 27, 21–​30. Pendergast, L. L., Youngstrom, E. A., Merkitch, K. G., Moore, K. A., Black, C. L., Abramson, L. Y., & Alloy, L. B. (2014). Differentiating bipolar disorder from unipolar depression and ADHD: The utility of the General Behavior Inventory. Psychological Assessment, 26, 195–​206. Phelps, J. R., & Ghaemi, S. N. (2006). Improving the diagnosis of bipolar disorder: Predictive value of screening tests. Journal of Affective Disorders, 92, 141–​148. Prien, R. F., & Potter, W. Z. (1990). NIMH workshop report on treatment of bipolar disorder. Psychopharmacology Bulletin, 26, 409–​427. Ratheesh, A., Berk, M., Davey, C. G., McGorry, P. D., & Cotton, S. M. (2015). Instruments that prospectively predict bipolar disorder—​A systematic review. Journal of Affective Disorders, 179, 65–​73. Reigier, D. A., Farmer, M. E., Rae, D. S., Locke, B. Z., Keith, S. J., Judd, L. L., & Goodwin, F. K. (1990). Comorbidity of mental disorders with alcohol and other substance abuse:  Results from the Epidemiological Catchment Area (ECA) study. Journal of the American Medical Association, 264, 2511–​2518. Reilly-​ Harrington, N. A., Miklowitz, D. J., Otto, M. W., Frank, E., Wisniewski, S. R., Thase, M. E., & Sachs, G. S. (2010). Dysfunctional attitudes, attributional styles, and phase of illness in bipolar disorder. Cognitive Therapy and Research, 34, 24–​34. Rice, J. P., McDonald-​Scott, P., Endicott, J., Coryell, W., Grove, W. M., Keller, M. B., & Altis, D. (1986). The stability of diagnosis with an application to bipolar II disorder. Journal of Psychiatry Research, 19, 285–​296. Richters, J. E. (1992). Depressed mothers as informants about their children:  A critical review of the evidence for distortion. Psychological Bulletin, 112, 485–​499. Rogers, R., Jackson, R. L., & Cashel, M. (2001). The Schedule for Affective Disorders and Schizophrenia (SADS). In R. Rogers (Ed.), Handbook of diagnostic and structural interviewing (pp. 84–​ 102). New  York, NY: Guilford. Rogers, R., Jackson R. L., Salekin, K. L., & Neumann, C. S. (2003). Assessing Axis I  symptomatology on the SADS-​C in two correctional samples:  The validation of subscales and a screen for malingered presentations. Journal of Personality Assessment, 81, 281–​290.

Ruggero, C., Johnson, S. L., & Cuellar, A. K. (2004). Spanish language measures for mania and depression. Psychological Assessment, 16, 381–​385. Sajatovic, M., Ignacio, R. V., West, J. A., Cassidy, K. A., Safavi, R., Kilbourne, A. M., & Blow, F. C. (2009). Predictors of nonadherence among individuals with bipolar disorder receiving treatment in a community mental health clinic. Comprehensive Psychiatry, 50, 100–​107. Sanchez-​Moreno, J., Villagran, J., Gutierrez, J., Camacho, M., Ocio, S., Palao, D., . . . Vieta, E. (2008). Adaptation and validation of the Spanish version of the Mood Disorder Questionnaire for the detection of bipolar disorder. Bipolar Disorders, 10, 400–​412. Sanz, M., Constable, G., Lopez-​Ibor, I., Kemp, R., & David, A. S. (1998). A comparative study of insight scales and their relationship to psychopathological and clinical variables. Psychological Medicine, 28, 437–​446. Saunders, E. F., Fernandez-​Mendoza, J., Kamali, M., Assari, S., & McInnis, M. G. (2015). The effect of poor sleep quality on mood outcome differs between men and women: A longitudinal study of bipolar disorder. Journal of Affective Disorders, 180, 90–​96. Secunda, S. K., Katz, M. M., Swann, A., Koslow, S. H., Maas, J. W., & Chuang, S. (1985). Mania:  Diagnosis, state measurement and prediction of treatment response. Journal of Affective Disorders, 8, 113–​121. Serrano, E., Ezpeleta, L., Alda, J. A., Matalí, J. L., & San, L. (2011). Psychometric properties of the Young Mania Rating Scale for the identification of mania symptoms in Spanish children and adolescents with attention deficit/​ hyperactivity disorder. Psychopathology, 44, 125–​132. Shear, M. K., Greeno, C., Kang, J., Ludewig, D., Frank, E., Swartz, H. A., & Hanekamp, M. (2000). Diagnosis of nonpsychotic patients in community clinics. American Journal of Psychiatry, 157, 581–​587. Shugar, G., Schertzer, S., Toner, B. B., & Di Gasbarro, I. (1992). Development, use, and factor analysis of a self-​ report inventory for mania. Comprehensive Psychiatry, 33, 325–​331. Simpson, S. G., McMahon, F. J., McInnis, M. G., MacKinnon, D. F., Edwin, D., Folstein, S. E., . . . DePaulo, R. (2002). Diagnostic reliability of bipolar II disorder. Archives of General Psychiatry, 59, 746–​740. Spitzer, R. L., Endicott, J., & Robins, E. (1978). Research diagnostic criteria: Rationale and reliability. Archives of General Psychiatry, 35, 773–​782. Spitzer, R. L., Williams, J. B. W., Gibbon, M., & First, M. B. (1992). The structured clinical interview for DSM-​III-​R (SCID): I. History, rationale, and description. Archives of General Psychiatry, 49, 624–​629. Strober, M., Schmidt-​Lackner, S., Freeman, R., Bower, S., Lampert, C., & DeAntonio, M. (1995). Recovery and relapse in adolescents with bipolar affective illness:  A five-​year naturalistic, prospective follow-​up. Journal of

the American Academy of Child & Adolescent Psychiatry, 34, 724–​731. Swann, A. C., Janicak, P. L., Calabrese, J. R., Bowden, C. L., Dilsaver, S. C., Morris, D. D., . . . Davis, J. M. (2001). Structure of mania: Depressive, irritable, and psychotic clusters with different retrospectively-​ assessed course patterns of illness in randomized clinical trial participants. Journal of Affective Disorders, 67, 123–​132. Tondo, L., Vázquez, G., & Baldessarini, R. (2010). Mania associated with antidepressant treatment:  Comprehensive meta-​analytic review. Acta Psychiatrica Scandinavica, 121, 404–​414. Undurraga, J., Baldessarini, R. J., Valenti, M., Pacchiarotti, I., & Vieta, E. (2011). Suicidal risk factors in bipolar I and II disorder patients. Journal of Clinical Psychiatry, 73, 778–​782. Van Humbeeck, G., Van Audenhove, C., Storms, G., De Hert, M., Pieters, G., & Vertommen, H. (2004). Expressed emotion in the professional–​ client dyad:  A comparison of three expressed emotion instruments. European Journal of Psychological Assessment, 20, 237–​246. van Zaane, J., van den Berg, B., Draisma, S., Nolen, W. A., & van den Brink, W. (2012). Screening for bipolar disorders in patients with alcohol or substance use disorders: Performance of the Mood Disorder Questionnaire. Drug and Alcohol Dependence, 124, 235–​241. Varga, M., Magnusson, A., Flekkoy, K., Ronneberg, U., & Opjordsmoen, S. (2006). Insight, symptoms and neurocognition in bipolar I  patients. Journal of Affective Disorders, 91, 1–​9. Verhulst, F. C., & van der Ende., J. (1992). Agreement between parents’ reports and adolescents’ self-​reports of problem behavior. Journal of Child Psychology and Psychiatry and Allied Disciplines, 33, 1011–​1023. Vernon, S. W., & Roberts, R. E. (1982). Use of the SADS-​ RDC in a tri-​ ethnic community survey. Archives of General Psychiatry, 39, 47–​52. Vieta, E. (2010). Guide to assessment scales in bipolar disorder (2nd ed.). London, UK: Springer. Walsh, M. A., DeGeorge, D. P., Barrantes-​ Vidal, N., & Kwapil, T. R. (2015). A 3-​year longitudinal study of risk for bipolar spectrum psychopathology. Journal of Abnormal Psychology, 124, 486. Watson, D., Clark, L. A., Chmielewski, M., & Kotov, R. (2013). The value of suppressor effects in explicating the construct validity of symptom measures. Psychological Assessment, 25, 929. Watson, D., O’Hara, M. W., Naragon-​Gainey, K., Koffel, E., Chmielewski, M., Kotov, R., . . . Ruggero, C. J. (2012). Development and validation of new anxiety and bipolar symptom scales for an expanded version of the IDAS (the IDAS-​II). Assessment, 19, 399–​420. Weber Rouget, B., Gervasoni, N., Dubuis, V., Gex-​Fabry, M., Bondolfi, G., & Aubry, J. (2005). Screening for

bipolar disorders using the French version of the Mood Disorder Questionnaire (MDQ). Journal of Affective Disorders, 88, 103–​108. Weinstock, L. M., & Miller, I. W. (2010). Psychosocial predictors of mood symptoms 1 year after acute phase treatment of bipolar I  disorder. Comprehensive Psychiatry, 51, 497–​503. Weissman, A., & Beck, A. T. (1978). Development and validation of the dysfunctional attitude scale: A preliminary investigation. Paper presented at the annual meeting of the American Educational Research Association, Toronto. Williams, J. B.  W., Gibbon, M., First, M. B., Spitzer, R. L., Davies, M., Borus, J., . . . Wittchen, H.-​U. (1992). The structured clinical interview for the DSM-​ III-​ R (SCID):  II. Multisite test–​retest reliability. Archives of General Psychiatry, 49, 630–​636. World Health Organization. (1990). Composite international diagnostic interview (CIDI, Version 1.0). Geneva, Switzerland: World Health Organization. Wu, Y.-​ S., Angst, J., Ou, C.-​ S., Chen, H.-​ C., & Lu, R.-​ B. (2008). Validation of the Chinese version of the Hypomania Checklist (HCL-​32) as an instrument for detecting hypo(mania) in patients with mood disorders. Journal of Affective Disorders, 106, 133–​143. Yee, A. M., Algorta, G. P., Youngstrom, E. A., Findling, R. L., Birmaher, B., & Fristad, M. A. (2015). Unfiltered administration of the YMRS and CDRS-​R in a clinical sample of children. Journal of Clinical Child and Adolescent Psychology, 44, 992–​1007. Yen, C. F., Chen, C. S., Ko, C. H., Yeh, M. L., Yang, S. J., Yen, J. Y., . . . Wu, C. C. (2005). Relationships between insight and medication adherence in outpatients with schizophrenia and bipolar disorder. Psychiatry and Clinical Neurosciences, 59, 403–​409. Young, M. E., Galvan, T., Reidy, B. L., Pescosolido, M. F., Kim, K. L., Seymour, K., & Dickstein, D. P. (2013). Family functioning deficits in bipolar disorder and ADHD in youth. Journal of Affective Disorders, 150, 1096–​1102. Young, R. C., Biggs, J. T., Ziegler, V. E., & Meyer, D. A. (1978). A rating scale for mania:  Reliability, validity and sensitivity. British Journal of Psychiatry, 133, 429–​435. Youngstrom, E. A., Findling, R. L., & Calabrese, J. R. (2003). Who are the comorbid adolescents? Agreement between psychiatric diagnosis, youth, parent, and teacher report. Journal of Abnormal Child Psychology, 31, 231–​245. Youngstrom, E. A., Findling, R. L., Calabrese, J. R., Gracious, B. L., Demeter, C., Bedoya, D. D., & Price, M. (2004). Comparing the diagnostic accuracy of six potential screening instruments for bipolar disorder in youths aged 5 to 17 years. Journal of the American Academy of Child & Adolescent Psychiatry, 43, 847–​858.

Youngstrom, E. A., Findling, R. L., Danielson, C. K., & Calabrese, J. R. (2001). Discriminative validity of parent report of hypomanic and depressive symptoms on the General Behavior Inventory. Psychological Assessment, 13, 267–​276. Youngstrom, E. A., Findling, R. L., Youngstrom, J. K., & Calabrese, J. R. (2005). Toward an evidence-​ based assessment of pediatric bipolar disorder. Journal of Clinical Child and Adolescent Psychology, 34, 433–​448. Youngstrom, E. A., Frazier, T. W., Demeter, C., Calabrese, J. R., & Findling, R. L. (2008). Developing a ten item mania scale from the Parent General Behavior Inventory for children and adolescents. Journal of Clinical Psychiatry, 69, 831. Youngstrom, E. A., Gracious, B. L., Danielson, C. K., Findling, R. L., & Calabrese, J. (2003). Toward an integration of parent and clinician report on the Young Mania Rating Scale. Journal of Affective Disorders, 77, 179–​190. Youngstrom, E., Loeber, R., & Stouthamer-​ Loeber, M. (2000). Patterns and correlates of agreement between

parent, teacher, and male adolescent ratings of externalizing and internalizing problems. Journal of Consulting and Clinical Psychology, 68, 1038–​1050. Zimmerman, M. (2012). Misuse of the Mood Disorders Questionnaire as a case-​finding measure and a critique of the concept of using a screening scale for bipolar disorder in psychiatric practice. Bipolar Disorders, 14, 127–​134. Zimmerman, M., & Galione, J. N. (2011). Screening for bipolar disorder with the Mood Disorders Questionnaire: A review. Harvard Review of Psychiatry, 19, 219–​228. Zimmerman, M., Galione, J. N., Chelminski, I., Young, D., & Dalrymple, K. (2011). Psychiatric diagnoses in patients who screen positive on the Mood Disorder Questionnaire:  Implications for using the scale as a case-​finding instrument for bipolar disorder. Psychiatry Research, 185, 444–​449. Zimmerman, M., & Mattia, J. I. (1999). Psychiatric diagnosis in clinical practice:  Is comorbidity being missed? Comprehensive Psychiatry, 40, 182–​191.

10

Self-Injurious Thoughts and Behaviors

Alexander J. Millner
Matthew K. Nock

Self-injurious thoughts and behaviors (SITB) are an enormous global public health problem. Suicide is a leading cause of death worldwide (Lozano et al., 2012), and cross-national studies estimate that the prevalence of nonlethal SITBs ranges from 3% to 9% (Nock, Borges, et al., 2008). Given the possibility of death associated with SITB and that nearly all mental health clinicians will assess and treat patients with suicidal thoughts and behaviors during their career (Dexter-Mazza & Freeman, 2003; Kleespies, Penk, & Forsyth, 1993), it is crucial that SITB assessment instruments have a strong evidence base. In this chapter, we provide basic background information on the assessment of SITB and summarize the evidence supporting instruments that assess SITB. Note that assessment of suicide risk—determining the likelihood that a person will actually try to kill him- or herself in the near future—is an overlapping but distinct process. An important part of determining suicide risk is the direct assessment of SITB history, presence, and severity using measures with empirical support. This chapter provides guidance in the selection of an instrument for determining risk. However, suicide risk assessment also includes the consideration of other factors (e.g., hopelessness and impulsiveness) associated with suicidal behaviors that are beyond the scope of this chapter. For readers interested in suicide risk assessment, we provide several useful sources containing guidelines and practical recommendations in the context of clinical care (Berman & Silverman, 2014; Fowler, 2012; Jacobs et al., 2010; Silverman & Berman, 2014).

This chapter builds on the chapter with the same title from the first edition of this volume (Nock, Wedig, Janis, & Deliberto, 2008) by updating the psychometric evidence for the instruments included in the first edition and introducing and evaluating several recently established interviews to assess SITB. To provide context, the chapter begins with a discussion of two fundamental issues in the assessment of SITB: classification and measurement. We also discuss prevalence and conditional probability of the behaviors as well as the goals and challenges of SITB assessment. We then review instruments appropriate for determining the presence and frequency of SITB, case conceptualization and treatment planning, and treatment monitoring and outcome evaluation. In our review, we (a) separate measures suitable for assessing children and adolescents from those for assessing adults, (b) state precisely which thoughts and behaviors each measure assesses (e.g., suicidal thoughts and nonsuicidal self-injury [NSSI]), and (c) make recommendations based on which instruments have the strongest empirical support.

NATURE OF SITB

Classification and Measurement

SITB consists of a broad array of thoughts and behaviors that involve imagined or actual intentional physical injury to one's body. During the past several decades, one of the major obstacles facing research and clinical practice has been the lack of a consistent classification of SITB. In many instances, people failed to distinguish among distinct behaviors—for example, referring to suicidal and nonsuicidal behaviors collectively as “deliberate self-harm” or combining suicidal thoughts and behaviors under the umbrella term “suicidality.” Fortunately, these practices have largely been abandoned following the publication of consensus articles (Silverman, Berman, Sanddal, O’Carroll, & Joiner, 2007) and the introduction of new classification systems
and measurement strategies. For example, beginning in the early 2000s, pertinent classification systems were implemented in U.S. government agencies, such as the U.S. Food and Drug Administration (FDA), the Centers for Disease Control and Prevention, and the Department of Defense (Brenner et  al., 2011; Posner, Oquendo, Gould, Stanley, & Davies, 2007; FDA, 2012). Although largely similar, the classification systems across these agencies contain some minor differences (Matarazzo, Clemans, Silverman, & Brenner, 2013), and despite the clear advancement in this area, they continue to receive criticism (Sheehan, Giddens, & Sheehan, 2014b). The particular issues regarding classification and measurement systems are beyond the scope of this chapter, but they generally hinge on the level of granularity with which behaviors should be classified (Sheehan, Giddens, et al., 2014b). In general, consensus classification makes a distinction between suicidal self-​injury, in which people have some intent (i.e., non-​zero) to die from their behavior, and nonsuicidal self-​injury, in which people injure themselves with no intent to die. Within suicidal self-​injury, there are three major categories:  suicidal ideation (i.e., thoughts), which refers to thinking about engaging in a behavior to end one’s life; suicide plan, which includes thinking about how (i.e., method) and where (i.e., place) one intends to injure oneself; and suicide attempt, which refers to engaging in a potentially harmful or lethal behavior with some intention of dying from the behavior. More recently, researchers and clinicians have defined a spectrum of more subtle suicidal thoughts and behaviors as well, including passive suicidal ideation, which includes thoughts such as wishing one were dead; preparatory behaviors, which include actions either to prepare for one’s suicide attempt (e.g., obtaining a gun) or to prepare for the possibility that one might be dead soon (e.g., preparing a will); aborted attempt, in which people take steps to attempt suicide but stop themselves prior to engaging in a potentially harmful or lethal behavior; and interrupted attempt, in which someone or something prevents a person from attempting suicide. Another behavior that is related to suicidal behaviors but is considered a nonsuicidal behavior is a suicide gesture, in which a person carries out an action to give the appearance of a suicide attempt for some purpose (e.g., to communicate pain) but with zero intention of dying. Currently, most suicidal behaviors have consensus definitions with one important exception: a suicide plan (Millner, Lee, & Nock, 2017). As stated previously, a suicide plan consists of, at the very least, formulating a method, but there is no settled definition (e.g., Does a

plan require thinking of a place and/or a time to attempt suicide?). Some have differentiated between a “plan” and a “specific plan,” with the latter defined as “details of a plan fully or partially worked out,” but there is no precise operationalization (Posner et al., 2011). One problem with poor operationalization is that it can lead to inaccurate measurement. For example, one study that examined the use of a single-item question to assess the presence of a suicide plan among people who had thought about suicide but never made a suicide attempt found that 40% of those who denied having a plan had engaged in at least four out of five planning steps (e.g., settling on a method and settling on a place to attempt suicide) compared with 52% among those who endorsed having a plan (Millner, Lee, & Nock, 2015). These results suggest respondents interpreted the term “suicide plan” inconsistently, questioning the validity of suicide planning items (similar problems exist for other aspects of SITB, such as the presence of “suicidal intent”).

Similar measurement problems can occur for terms with consensus definitions if respondents or questioners (e.g., interviewers or clinicians) do not clearly understand the criteria for the behavior in question or there is no option for a behavior in which a respondent engaged (e.g., no item for aborted attempts). Studies have reported that when participants answer a single question regarding the presence or absence of a past suicide attempt, 10% to 40% incorrectly endorse making a prior attempt (Hom, Joiner, & Bernert, 2015; Millner et al., 2015; Nock & Kessler, 2006; Plöderl, Kralovec, Yazdi, & Fartacek, 2011). In addition to inaccurate endorsement of past suicide attempts, prior research has also found that among people with suicide ideation, 10% deny having made a suicide attempt even though their description of a prior behavior fits the definition of a suicide attempt (Millner et al., 2015). This problem extends to clinical settings, in which researchers found that medical notes incorrectly labeled a behavior as a suicide attempt 6% of the time and failed to identify a suicide attempt 18% of the time (Brown, Currier, Jager-Hyman, & Stanley, 2015). Millner et al. (2015) found that classification can be improved by (a) increasing the clarity of the question (e.g., asking “Have you ever engaged in a potentially harmful or lethal behavior with some intention of dying?” rather than asking “Have you ever tried to kill yourself?”) and (b) increasing the coverage by providing several thoughts or behaviors that people can choose from (e.g., asking about NSSI, suicide gestures, aborted and interrupted attempts, as well as attempts).

For interview-based assessments, it is important that assessors are trained in the definitions of different terms to accurately classify behavior. The government agency classification systems discussed previously can aid with training, and some instruments, such as the Columbia-Suicide Severity Rating Scale (Posner et al., 2011), offer free web-based trainings (http://cssrs.columbia.edu).
Note that these measurement problems are unlikely to be resolved by the standard practice of testing validity because most studies validate instruments using other measures with similar wording. For example, a measure that inquires about prior “suicide attempts” is frequently validated with another measure that assesses the same outcome with the same terminology. Given these circumstances, it would be surprising if scales using similar terms (“suicide attempt”) and measuring similar or identical outcomes were uncorrelated. However, strong validity metrics alone do not necessarily indicate an absence of the types of measurement error discussed previously. Research into misclassification and its reduction is very recent, and continued work in this area will help improve the validity and reliability of the assessment of SITB.

Prevalence and Conditional Probability

The prevalence of SITB is important to consider during assessment. Studies with large-scale, representative samples suggest that the prevalence of suicidal ideation, plans, and attempts within the United States is 16%, 5%, and 5%, respectively (Nock, Borges, et al., 2008). In a cross-national study of people from 17 different countries throughout the world, these estimates are 9%, 3%, and 3%, respectively (Nock, Borges, et al., 2008). Most people who attempt suicide have thought about suicide prior to their attempt, and many have made a plan as well. Therefore, it is useful to understand the rates at which people transition from one behavior to more severe SITB. Among people who think about suicide, 34% will make a suicide plan, and 29% will make a suicide attempt. Among those with a plan, 56% will make a suicide attempt. Importantly, most people who transition to a plan or an attempt do so within the first year after the onset of suicide ideation (Nock, Borges, et al., 2008).
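To make these transition rates concrete, the figures just cited can be combined into a rough decomposition of attempt risk among people with suicidal ideation. The calculation below is illustrative only and is not an analysis reported by the cited studies; it assumes the percentages describe the same population and treats planned and unplanned attempts as mutually exclusive routes from ideation to attempt.

```latex
% Illustrative decomposition using rounded rates from the text
% (Nock, Borges, et al., 2008):
% P(plan | ideation) = .34, P(attempt | ideation) = .29, P(attempt | plan) = .56
\begin{align*}
P(\text{attempt} \mid \text{ideation})
  &= P(\text{plan} \mid \text{ideation})\, P(\text{attempt} \mid \text{plan})
   + P(\text{no plan} \mid \text{ideation})\, P(\text{attempt} \mid \text{no plan}) \\
.29 &\approx (.34)(.56) + (.66)\, P(\text{attempt} \mid \text{no plan}) \\
P(\text{attempt} \mid \text{no plan}) &\approx \frac{.29 - .19}{.66} \approx .15
\end{align*}
```

Under these assumptions, roughly two-thirds of the attempt risk among ideators runs through a plan, yet a person who reports ideation without a plan still carries an appreciable (about 15%) estimated probability of eventually making an attempt, consistent with the recommendation later in this section to assess SITB directly in all patients.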

The prevalence of NSSI is unknown because representative epidemiological studies have not included this behavior. Furthermore, rates of NSSI vary depending on how it is assessed; checklists of differing behaviors elicit higher rates than does a single-item question (Swannell, Martin, Page, Hasking, & St. John, 2014). After taking into account assessment approach and other methodological considerations, a recent meta-analysis examining cross-national, fairly large-scale (although not truly representative) studies among nonclinical samples reported NSSI rates of 17.2% among adolescents, 13.4% among young adults, and 5.5% among adults (Swannell et al., 2014). Rates reported by studies conducted in U.S. samples vary widely but are generally consistent with cross-national rates (Jacobson & Gould, 2007; Klonsky, 2011; Whitlock, 2010; Whitlock, Eckenrode, & Silverman, 2006).

Purposes of Assessment

Although there is no official diagnosis for SITB, the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013) provisionally established suicidal behaviors disorder and NSSI disorder as conditions that require further study. A small number of studies have started to investigate the clinical utility and validity of these disorders, but currently, assessment is not intended to establish the presence of a diagnosis. Instead, the primary purpose of assessment of SITB is to determine (a) the presence or absence of SITB itself; (b) characteristics of SITB, such as frequency and severity; and (c) whether SITB change over time and, if so, how. Compared with the first edition of this chapter, we have altered the inclusion criterion to review measures in which the majority of items assess SITB outcomes or aspects of SITB (frequency, severity, functions, etc.). For example, scales such as the Suicide Probability Scale, which has 6 items that assess suicidal ideation and 30 items that assess potential SITB risk factors such as hopelessness and hostility, are excluded from this chapter. We selected this inclusion criterion because of the large number of scales measuring the same outcomes and several new scales focused on assessing multiple SITB exclusively. We provide tables in which we rate each measure on several psychometric categories based on the evidence for each measure and guidelines provided by the editors of this volume (see Chapter 1).

Assessing SITB

We recommend that the direct assessment of SITB be included within any comprehensive clinical interview (e.g., intake or discharge interview) with all patients, even those who appear to be low risk. Often, people who lack key risk factors still engage in SITB. We further emphasize the need for direct expression of SITB or self-injurious intentions and recommend that clinicians not judge SITB risk based on ancillary “warning signs,” such as giving things away, which lack empirical support (Rudd et al., 2006).
Table 10.1a  Ratings of Instruments Used for Assessing SITB in Adults

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended

Adults: Structured and Semi-Structured Interviews
SITBI | G | NA | E | A | G | A | G | A | ✓
SASII | A | G | E | E | A | G | A | G | ✓
C-SSRS | G | G | G | NR | G | G | E | G | ✓
S-STS | G | G | NR | G | G | A | G | G | ✓
SSI | G | | G | NR | G | E | E | E | ✓
SSI-W | NR | G | G | A | G | E | G | A | ✓
MSSI | NR | G | E | L | G | E | G | E | ✓

Self-Report Measures
BSI | A | E | NA | A | G | E | A | G
ASIQ | NR | E | NA | A | NR | E | E | A
SIS | NR | G | NA | NR | A | A | A | A
SBQ | NR | G | NA | A | NR | A | A | G
SBQ (4-items) | NR | G | NA | A | NR | G | G | G | ✓
SBQ-R | NR | E | NA | A | NR | G | G | G | ✓
SHBQ | NR | G | NA | A | A | G | L | A
DSHI | NR | G | NA | G | A | A | A | A
SHI | A | A | NA | NR | A | G | A | A

Note: SITBI = Self-Injurious Thoughts and Behaviors Interview; SASII = Suicide Attempt Self-Injury Interview; C-SSRS = Columbia-Suicide Severity Rating Scale; S-STS = Sheehan-Suicidality Tracking Scale; SSI = Scale for Suicide Ideation; SSI-W = Scale for Suicide Ideation-Worst; MSSI = Modified Scale for Suicide Ideation; BSI = Beck Scale for Suicide Ideation; ASIQ = Adult Suicide Ideation Questionnaire; SIS = Suicide Ideation Scale; SBQ = Suicidal Behaviors Questionnaire; SBQ-R = Suicidal Behaviors Questionnaire-Revised; SHBQ = Self-Harm Behavior Questionnaire; DSHI = Deliberate Self-Harm Inventory; SHI = Self-Harm Inventory; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

There is a natural concern that the direct assessment of SITB will increase the risk of the person actually engaging in SITB. There is a robust literature, including three randomized controlled trials, suggesting that there are no harmful effects of assessing SITB, such as an increase in suicidal ideation or suicide risk (Gould et al., 2005; Harris & Goh, 2017; Husky et al., 2014; Law et al., 2015). Nevertheless, discussing SITB is a sensitive topic, and we recommend that clinicians start with less sensitive constructs (e.g., a history of depression) and less severe constructs (e.g., a history of suicidal ideation) before moving on to more severe behaviors (e.g., attempts). Next, we review the wide array of instruments available to assess the presence of SITB. The psychometric ratings for these instruments are presented in Table 10.1a for adults and Table 10.1b for children and adolescents.

MEASURES FOR USE WITH ADULTS

Structured and Semi-Structured Interviews

Although some instruments are referred to as structured interviews (Linehan, Comtois, Brown, Heard, & Wagner,

2006; Nock, Holmberg, Photos, & Michel, 2007), instructions included with these scales generally require that interviewers be knowledgeable about the classification of SITB and encourage interviewers to probe with unstructured follow-up questions to clarify details of a behavior in question in order to ensure accurate measurement. Given these circumstances, we do not distinguish between structured and semi-structured interviews in this section.
The Self-Injurious Thoughts and Behaviors Interview (SITBI; Nock et al., 2007) is a structured interview (long form: 169 items; short form: 72 items) that assesses the presence of several SITB, including suicidal ideation, plans, and attempts as well as NSSI. In 2010, the interview was modified to include interrupted and aborted attempts as well as to assess knowledge of others with a suicide history; however, the reliability and validity of these items have not been tested. If the respondent endorses the lifetime presence of an outcome, the interviewer enters a longer module to assess the age of onset, frequency, severity, methods used, function of the behavior, degree to which external stressors (e.g., “work/school” or “relationships”) or internal stressors (e.g., “mental state”) contributed to the behavior, use of alcohol or drugs, experience of pain,


Table 10.1b  Ratings of Instruments Used for Assessing SITB in Children and Adolescents

Instrument       Norms  IC   IRR  TRR  CNT  CST  VG   CU   HR

Structured and Semi-Structured Interviews
SITBI            G      NA   E    A    G    A    G    A    ✓
SSI              NR     G    NR   NR   E    G    G    G    ✓
SBI              NR     E    E    NR   G    A    G    A
CSPS             NR     A    E    NR   A    G    E    A

Self-Report Measures
BSI              A      E    NA   NR   E    G    A    G
SBQ-R            NR     G    NA   A    NR   A    NR   A
SIQ              E      G    NA   A    A    G    G    G    ✓
SIQ-JR           E      G    NA   A    A    G    G    G    ✓
HASS             NR     E    NA   NR   A    G    A    A

Note: IC = Internal Consistency; IRR = Inter-Rater Reliability; TRR = Test–Retest Reliability; CNT = Content Validity; CST = Construct Validity; VG = Validity Generalization; CU = Clinical Utility; HR = Highly Recommended; SITBI = Self-Injurious Thoughts and Behaviors Interview; SSI = Scale for Suicide Ideation; SBI = Suicide Behaviors Interview; CSPS = Child Suicide Potential Scales; BSI = Beck Scale for Suicide Ideation; SBQ-R = Suicidal Behaviors Questionnaire-Revised; SIQ = Suicidal Ideation Questionnaire; SIQ-JR = Suicidal Ideation Questionnaire Junior; HASS = Harkavy–Asnis Suicide Scale; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

or impulsiveness of the SITB. Respondents are also asked to predict the likelihood they will engage in the behavior again in the future. The authors of the SITBI state that the interview is to be administered exactly as worded but that trained interviewers may use follow-​up questions to clarify behaviors. Thus, to ensure accurate measurement, interviewers require several hours of training and ongoing supervision to be adequately knowledgeable about categories of SITB. Furthermore, note that the excellent inter-​ rater reliability metrics stated in the following discussion occurred in the context of well-​trained and supervised interviewers. Administration requires 3 to 75 minutes depending on the number of modules administered. The one study that tested the reliability and validity of the English version of the SITBI among young adults and adolescents (aged 12–​19  years) reported excellent inter-​ rater reliability and adequate test–​retest reliability for the presence of each self-​injurious outcome assessed during a 6-​month period (Nock et al., 2007). In addition, foreign language versions of the SITBI have also shown strong psychometric properties (Fischer et al., 2014; García-​Nieto, Blasco-​ Fontecilla, Paz Yepes, & Baca-​ García, 2013). Multiple sections of the SITBI have been converted into a self-​report form, but the psychometric properties have not been assessed (Bryan & Bryan, 2014; Muehlenkamp, Walsh, & McDade, 2010). The Suicide Attempt Self-​ Injury Interview (SASII; Linehan, Comtois, Brown, et  al., 2006)  is a structured interview with 31 items that assesses in-​ depth characteristics of and motivations for a self-​injurious event (or

“cluster” of events). For each episode, the following are assessed:  intent and outcome expectation of the self-​ injury, method used, degree to which the action was impulsive, lethality, medical consequences of the injury and treatment, communication of self-​injurious intent, context, function, and other mental characteristics (e.g., being “disconnected from feelings”). The SASII focuses on the circumstances around an event in which self-​injury actually occurred and does not assess suicidal thoughts or plans unrelated to a self-​injurious event, interrupted or aborted attempts, or suicide gestures. The SASII is intended to assess detailed characteristics for each self-​injurious event that a respondent can remember and therefore is comprehensive but potentially time-​intensive if the respondent has an extensive history of self-​injury. Alternatively, one can choose to cover the self-​ injury history within a given time period. Scores on the SASII show excellent inter-​rater reliability and adequate validity metrics. Similar to the SITBI, interviewers should state the questions as worded but are encouraged to ask follow-​up questions to obtain specific details or clarify a response (Bland & Murray-​Gregory, 2006). Given that interviewers are instructed to use clinical judgment, they should be trained to ensure that they collect valid data. The Columbia-​ Suicide Severity Rating Scale (C-​ SSRS; Posner et al., 2011) is a semi-​structured interview that assesses the presence of lifetime SITB. The first section contains five suggested prompts to assess suicidal mental states (i.e., ideation, plans, and intent), ordered in increasing severity, starting with passive ideation (e.g.,


“I wish I was dead”) and ending with active ideation with a specific plan and intent to act. The second section assesses the frequency, intensity, controllability, and deterrents of suicidal ideation as well as reasons for ideation. In the final section, the interviewer assesses the presence and frequency of suicide attempts, interrupted and aborted attempts, preparatory actions, and, finally, if there was a suicide attempt, the actual and potential lethality. Response options for most items are yes/no, although ideation is rated on a 1 to 5 scale, depending on the number of items endorsed in the first section. Items assessing frequency are free response, and each item in the second section has a unique response scale. The C-SSRS scores have been found to have high internal consistency and moderate to good convergent validity for each section. In addition, a “since last visit” version (which was used in studies assessing SITB outcomes every 4–6 weeks) shows strong convergent validity and sensitivity to change. There is also an electronic version of the C-SSRS (eC-SSRS; Mundt et al., 2010) that yields scores with adequate reliability and good convergent and predictive validity (Greist, Mundt, Gwaltney, Jefferson, & Posner, 2014; Mundt et al., 2013). There are also other versions of the C-SSRS, including a pediatric form, and several screeners that contain condensed versions of the sections, although no psychometric evidence is available for these alternative versions. The authors of the C-SSRS provide several options for training on scale administration, and the measure provides definitions for each SITB in question. The scale’s questions are intended as helpful guidelines, but emphasis is placed on gathering enough information to correctly classify the behavior rather than on reading the specified questions precisely. Studies have found that the C-SSRS takes between 5 and 11 minutes to administer (Sheehan, Alphs, et al., 2014). Finally, it is worth noting that the FDA and other government agencies endorse the C-SSRS as a scale for clinical trials.
The Sheehan–Suicidality Tracking Scale (S-STS; Sheehan, Giddens, & Sheehan, 2014a) is a structured interview (although there is an identical self-report version as well) with 16 items that assess a wide range of SITB, including “accidental” overdoses, several forms of passive ideation (within a single question), active ideation, suicidal command hallucinations, specific planning steps, intention to act on suicidal thoughts, intention to die from the act itself, feeling an impulse to kill oneself, preparatory actions, NSSI, and suicide attempts. Each item is rated on a scale from 0 (not at all) to 4 (extremely), and some items collect frequency information. Although interrupted

and aborted attempts are not explicitly assessed, they are inferred through a combination of selecting a time to attempt suicide and taking active steps to prepare for an attempt, although this has been found to result in imprecise measurement (Youngstrom et al., 2015). In some studies, the authors of the S-STS used computerized self-report and clinician interview assessments in which, at the conclusion of the interview, the clinician was alerted to deviations between the interview and self-reported ratings. The clinician and patient then returned to those items and continued the interview until those items were reconciled (Sheehan, Alphs, et al., 2014; Sheehan, Giddens, et al., 2014a). The main study evaluating the reliability and validity of the S-STS was conducted with a sample of young Italian adults on an 8-item earlier version of the scale with questions that were similar but not identical to those of the 16-item version (Preti et al., 2013; Sheehan, Giddens, et al., 2014a). In the 8-item version, scores on the section assessing ideation and also the global score obtained by adding all items showed acceptable internal consistency and test–retest reliability. Scores on the section assessing just suicidal behaviors showed acceptable internal consistency but moderate to poor test–retest reliability. S-STS sections showed acceptable convergent and criterion validity as well. The S-STS has a patient-rated version, a clinician-rated version, and a “clinically meaningful change measure” version, and all versions have flexibility regarding the time period assessed. The authors recommend that interviewers be trained in definitions of suicidal behaviors similar to those used in the C-CASA classification system. Interviewers are encouraged to use data from all available sources. The administration time is 4 to 13 minutes for the S-STS self-report scale, 3 to 15 minutes for the S-STS interview, and 1.5 to 3.5 minutes for the reconciliation form (Sheehan, Alphs, et al., 2014).
The Scale for Suicide Ideation (SSI; Beck, Kovacs, & Weissman, 1979) is a 21-item semi-structured interview that assesses past-week thoughts of suicide. The SSI measures different aspects of suicidal ideation (e.g., presence, frequency, and severity) as well as reasons for suicide, planning, and the presence and intent of prior attempts. All items are rated on unique 0 to 2 scales, and the first 19 items (excluding items regarding prior attempts) are summed to determine a total score. Administration takes approximately 10 minutes. The SSI has been validated across a wide array of samples, including adolescents (Holi et al., 2005), adults (Beck et al., 1979), older adults (Witte et al., 2006), and diverse racial and ethnic groups (Beck et al., 1979), as well as within clinical


settings (Vuorilehto, Melartin, & Isometsä, 2006). Given the widespread use and psychometric evidence, we recommend the SSI as a general measure to assess suicidal ideation.
The Scale for Suicide Ideation-Worst (SSI-W; Beck, Brown, & Steer, 1997) is identical to the SSI except that participants rate items within the context of their most severe suicidal ideation (i.e., their “worst point”). Scores on the SSI-W have been reported to have good internal consistency and inter-rater reliability, and importantly, the SSI-W is predictive of suicide attempts and suicide death (Beck et al., 1997; Beck, Brown, Steer, Dahlagaard, & Grisham, 1999; Joiner et al., 2003). The Modified Scale for Suicidal Ideation (MSSI) is an altered version of the SSI that assesses different aspects of suicidal ideation (13 of 18 items are from the original SSI), has an increased rating range, provides standardized prompts, and utilizes screening items (Miller, Norman, Bishop, & Dow, 1986). Scores on the MSSI have good to excellent reliability, and the measure has shown good convergent and divergent validity (Clum & Yang, 1995; Joiner et al., 2005; Joiner, Rudd, & Rajab, 1997; Miller et al., 1986; Pettit et al., 2009; Rudd, Joiner, & Rajab, 1996).

Self-Report Measures

Batterham and colleagues (2015) provide a comprehensive review of easy-to-administer, self-report measures that assess suicidal SITB within adults for use in population-based research. The PhenX Toolkit also provides recommended instruments to assess SITB and related risk factors for epidemiological studies (Suicide Specialty Collection at https://www.phenxtoolkit.org; PhenX Toolkit Suicide Workgroup, 2014).
The Beck Scale for Suicidal Ideation (BSI, BSS, or BSSI; Beck & Steer, 1991) is a self-report version of the SSI. Like the SSI, it contains 21 items, each of which is rated on a 0 to 3 scale. It has been found to have excellent internal consistency, good construct validity (Beck, Steer, & Ranieri, 1988), and other beneficial psychometric features (de Beurs, Fokkema, de Groot, de Keijser, & Kerkhof, 2015). The BSI has been used across clinical and research settings (Healy, Barry, Blow, Welsh, & Milner, 2006).
The Adult Suicide Ideation Questionnaire (ASIQ; Reynolds, 1991a) contains 25 items assessing passive and active suicidal ideation, intent, social aspects of suicide, and planning steps. Scores on the ASIQ have shown good reliability across multiple populations, such as college students (Reynolds, 1991b), adults in the general population (Reynolds, 1991a), adult outpatients (Reynolds,


1991a), and psychiatric patients (Horon, McManus, Schmollinger, Barr, & Jimenez, 2013; Osman et al., 1999). In one study, the ASIQ predicted suicide attempts over a 3-month period (Osman et al., 1999).
The Suicide Ideation Scale (SIS; Rudd, 1989) contains 10 items and assesses the presence and intensity of suicidal ideation as well as suicide attempt history. The reliability and validity of SIS scores have been supported in a college sample (Rudd, 1989) and in a military clinical sample (Luxton, Rudd, Reger, & Gahm, 2011).
The Suicidal Behaviors Questionnaire (SBQ; Linehan, 1981) is a 34-item measure that assesses the presence and frequency of suicidal ideation, attempts, and NSSI. Scores from the SBQ have good to excellent reliability and supported validity (Linehan, Camper, Chiles, Strosahl, & Shearin, 1987; Simon et al., 2007), although some of the most cited evidence is unpublished (Addis & Linehan, 1989; Linehan, 1990). In addition, scores from an abbreviated 4-item version of the SBQ (also referred to as the SBQ) have demonstrated adequate to good internal consistency, adequate test–retest reliability, and convergent validity (Cole, 1988; Cotton, Peters, & Range, 1995). Finally, another 4-item derivation of the SBQ, the SBQ-Revised (SBQ-R), also has demonstrated strong psychometric properties and has been used widely (Osman et al., 2001; Pedrelli et al., 2014; Rudd, Goulding, & Bryan, 2011).
The Self-Harm Behavior Questionnaire (SHBQ; Gutierrez, Osman, Barrios, & Kopper, 2001) is a 32-item scale that assesses the presence and characteristics (e.g., age of onset, frequency, lethality, method, and intent) of nonsuicidal self-harm, suicidal ideation, suicide attempts, and suicide threats. The validity and reliability of the SHBQ scores have been supported in multiple studies (Fliege et al., 2006; Gutierrez et al., 2001; Muehlenkamp, Cowles, & Gutierrez, 2009), and the SHBQ has been used across ethnically and racially diverse samples, different age groups, and veteran samples (Andrews, Martin, Hasking, & Page, 2013; Brausch & Gutierrez, 2010; Kleespies et al., 2011; Muehlenkamp & Gutierrez, 2004; Muehlenkamp et al., 2009).
The Deliberate Self-Harm Inventory (DSHI; Gratz, 2001) is a 17-item questionnaire that assesses the presence and characteristics (e.g., frequency, severity, duration, and method used) of NSSI. The reliability and validity of scores on the DSHI were supported in a study among college students (Gratz, 2001) and a study among a German-speaking clinical sample using a German version of the scale (Fliege et al., 2006).


The Self-Harm Inventory (SHI; Sansone, Wiederman, & Sansone, 1998) is a 22-item self-report measure that assesses the presence or absence of 22 SITB, such as overdosing, cutting, burning, and suicide attempts. Studies support the reliability and validity of scores on the instrument among psychiatric outpatients (Sansone, Pole, Dakroub, & Butler, 2006), psychiatric inpatients (Sansone, Songer, & Miller, 2005), and community samples (Sansone et al., 1998).
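Two of the properties rated throughout Tables 10.1a and 10.1b, internal consistency and test–retest reliability, are commonly summarized with Cronbach's alpha and a correlation between administrations, respectively. The following is a minimal illustrative sketch in Python using synthetic data; the item count, scoring range, and variable names are arbitrary and are not taken from any of the measures reviewed above.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - sum_item_variances / total_variance)

def test_retest_r(totals_time1: np.ndarray, totals_time2: np.ndarray) -> float:
    """Pearson correlation between total scores from two administrations."""
    return float(np.corrcoef(totals_time1, totals_time2)[0, 1])

# Synthetic data: 200 respondents, a 21-item scale scored 0-3 (illustrative only).
rng = np.random.default_rng(seed=0)
trait = rng.normal(size=(200, 1))                 # shared latent severity
items_t1 = np.clip(np.round(1.5 + trait + rng.normal(scale=0.8, size=(200, 21))), 0, 3)
items_t2 = np.clip(np.round(1.5 + trait + rng.normal(scale=0.8, size=(200, 21))), 0, 3)

print("Cronbach's alpha:", round(cronbach_alpha(items_t1), 2))
print("Test-retest r:", round(test_retest_r(items_t1.sum(axis=1), items_t2.sum(axis=1)), 2))
```

In practice, the published values cited for each instrument should be consulted; the sketch only shows how the two statistics underlying the table ratings are typically obtained.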

MEASURES FOR USE WITH CHILDREN AND ADOLESCENTS

Structured and Semi-Structured Interviews

Some of the measures discussed previously have been validated in samples of children and adolescents, and some have been specifically designed for this population. The multiple studies supporting the validity and reliability of the SITBI scores were conducted among samples of adolescents and young adults (12–19 years; Fischer et al., 2014; Nock et al., 2007). The SITBI has also shown good concurrent validity among a sample of adolescents in a psychiatric inpatient setting (Venta & Sharp, 2014) and has been used to assess SITB in children as young as age 7 years (Barrocas, Hankin, Young, & Abela, 2012). For the SSI, scores have good to excellent internal consistency, and their validity has been supported in studies of psychiatric inpatient children and adolescents (Allan, Kashani, Dahlmeier, Taghizadeh, & Reid, 1997; Nock & Kazdin, 2002) as well as outpatient adolescents (Holi et al., 2005).
The Suicide Behaviors Interview (SBI; Reynolds, 1990) is a semi-structured interview with 22 items rated on 0 to 2 or 0 to 4 scales to measure suicidal behaviors among adolescents. The first section of the SBI assesses risk factors of suicidal behaviors, such as general distress, chronic stress, level of social support, and major negative life events. The second section assesses suicidal SITB, including multiple items to assess ideation and suicide planning and suicide attempts as well as follow-up questions to assess some details regarding the most recent attempt (e.g., confidence of death). Scores on the SBI have good internal consistency and excellent inter-rater reliability, as well as adequate content and good convergent validity (Reynolds, 1990; Reynolds & Mazza, 1999).
The Child Suicide Potential Scales (CSPS; Pfeffer, Conte, Plutchik, & Jerrett, 1979) is a semi-structured

interview with eight scales that measure suicidal behavior (ranging from nonsuicidal to serious attempts on a 5-point spectrum), precipitating events, affect and behavior within the past 6 months and, in a separate scale, more than 6 months prior, family background, one's concept of death, ego functioning, and ego defense. The psychometric properties of the CSPS are relatively strong, with evidence of adequate to excellent internal consistency for all but one scale (precipitating events), excellent inter-rater reliability (Ofek, Weizman, & Apter, 1998; Pfeffer et al., 1979), and concurrent validity demonstrated in numerous studies across both clinical and typical populations (Pfeffer, Conte, Plutchik, & Jerrett, 1980; Pfeffer, Newcorn, Kaplan, Mizruchi, & Plutchik, 1988; Pfeffer, Solomon, Plutchik, Mizruchi, & Weiner, 1982; Pfeffer, Zuckerman, Plutchik, & Mizruchi, 1984).

Self-Report Measures

Several self-report measures that were reviewed in the adult assessment section have also been evaluated for their use with youth. The validity and reliability of BSI scores (Beck & Steer, 1991) have been supported among adolescent psychiatric inpatients (Steer, Kumar, & Beck, 1993) and outpatients (Rathus & Miller, 2002). Multiple studies have supported the reliability and validity of scores on the SBQ-R among adolescents (Glenn, Bagge, & Osman, 2013; Osman et al., 2001). The Suicidal Ideation Questionnaire (SIQ; Reynolds, 1988) and Suicidal Ideation Questionnaire Junior (SIQ-JR; Reynolds, 1987), which were created by the same researcher who created the ASIQ, were developed specifically for use in grades 10 through 12 and 7 through 9, respectively, and both have well-supported psychometric characteristics (Gutierrez & Osman, 2009; Huth-Bocks, Kerr, Ivey, Kramer, & King, 2007; Pinto, Whisman, & McCoy, 1997; Reynolds & Mazza, 1999).
The Harkavy–Asnis Suicide Scale (HASS; Harkavy Friedman & Asnis, 1989) is a three-part questionnaire that assesses the presence and frequency of active and passive suicidal thoughts, plans, and attempts, as well as substance abuse history and exposure to suicidal behavior. The reliability and validity of scores on the HASS have been supported in studies with high school students (Harkavy Friedman & Asnis, 1989) and adolescents drawn from a psychiatric outpatient clinic (Wetzler et al., 1996), a treatment study (Rathus & Miller, 2002), and a pediatric emergency department (Asarnow, McArthur, Hughes, Barbery, & Berk, 2012).

OVERALL EVALUATION

There is a large assortment of instruments to assess SITB. The selection of an instrument should be based on the purpose and focus of the assessment, as well as the psychometric support (summarized in Tables  10.1a and 10.1b). There is large variation in the characteristics assessed by the different instruments. Some instruments collect in-​depth characteristics (e.g., presence and frequency) of an array of SITB (e.g., SITBI, SASII, C-​SSRS, and S-STS), whereas others focus exclusively on collecting detailed information about a specific outcome, such as NSSI (e.g., DSHI and SHBQ) or suicide ideation (e.g., SSI and BSI). We encourage the reader to carefully consider the goals of assessment and the outcomes that need to be assessed to achieve those goals. For clinical settings, given the co-​occurrence of many SITB, and that less severe forms of SITB predict more severe outcomes, we recommend that each form of SITB be comprehensively assessed. The few studies that have examined agreement between self-​ report and interview assessment have found poor agreement (Klimes-​ Dougan, 1998; Prinstein, Nock, Spirito, & Grapentine, 2001), particularly among adolescents, with self-​report showing higher rates of SITB compared to parent and clinician reported measures. The causes of this poor agreement are unknown but could be related to respondents being more forthcoming in self-​ report format rather than having to tell a face-​to-​face interviewer about their SITB history and/​ or respondents erroneously altering their responses due to subtle wording differences between instruments, although discrepancies have been found even when the wording and format of the question are similar (Prinstein et  al., 2001). Another SITB measurement issue, discussed previously, is that structured interviews and self-​ report questionnaires require respondents to rely on their own interpretations of SITB terms, such as “suicide attempt,” that may differ from researchers’ consensus definitions, leading to misclassification. Although no research has clarified reasons for discrepancies between self-​report and interview-​based assessment, research on misclassification suggests that questions that include a longer stem with an embedded definition and provide multiple response options for more subtle suicidal behaviors can reduce, but not eliminate, misclassification (Millner et  al., 2015). However, none of the instruments reviewed here contain questions that have these


characteristics, and little research has examined the measurement and misclassification issues discussed previously.
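To illustrate the question format described above (a longer stem with an embedded definition and separate response options for more subtle suicidal behaviors), the sketch below shows one way such an item might be represented in an electronic assessment. The wording, options, and class names are hypothetical and are not drawn from any instrument reviewed in this chapter.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SurveyItem:
    """A self-report item whose stem embeds a definition of the behavior assessed."""
    construct: str
    stem: str
    options: List[str] = field(default_factory=list)

# Hypothetical item illustrating the format; the wording is not from any published measure.
attempt_item = SurveyItem(
    construct="suicide attempt",
    stem=(
        "A suicide attempt means hurting yourself with at least some intent to die, "
        "even if you were not sure you wanted to die. With that definition in mind, "
        "have you ever made a suicide attempt?"
    ),
    options=[
        "No, never",
        "No, but I hurt myself on purpose without any intent to die",
        "No, but I started to do something and stopped myself (aborted attempt)",
        "No, but someone or something stopped me before I did anything (interrupted attempt)",
        "Yes, I made a suicide attempt",
    ],
)

# Print the response scale as it might appear to a respondent.
for value, label in enumerate(attempt_item.options):
    print(value, label)
```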

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

In addition to measuring the presence or absence of each SITB when conceptualizing a case or planning treatment, it is also important to assess patient-reported factors that influence the occurrence of each SITB. The causes of SITB are multidetermined, and there is a wide range of risk factors, but there is also a lack of clarity about how these factors work together to produce specific SITB within an individual. We provide a brief list of some risk factors in this section, but note that a recent meta-analysis suggests that most risk factors are relatively weak prospective predictors of suicidal thoughts and behaviors (Franklin et al., 2017). This finding could be due to SITB risk factors varying greatly among individuals such that there could be completely distinct risk factors for different people. Therefore, clinicians should assess individuals' specific reasons and circumstances that precede instances of SITB. We review measures to accomplish this goal in this section and provide ratings for each instrument in Table 10.2.
Although it is impossible to calculate precisely the risk of SITB for an individual, there are a few factors and issues worth considering. First, mental disorders are associated with all forms of suicidal SITB (Nock, Borges, et al., 2008). An important consideration is that the disorders that are among the largest cross-sectional predictors of suicidal ideation, such as major depressive disorder, differ from disorders that are among the largest predictors of which people with ideation transition to attempting suicide (Nock et al., 2009; Nock, Hwang, Sampson, & Kessler, 2010). These results suggest that it is important to identify patients' severity of SITB and understand that risk factors may change as behavior becomes more or less severe.

Structured and Semi-Structured Interviews

Two of the interviews reviewed previously, the SITBI and the SASII, collect information pertaining to individuals' self-reported reasons for engaging in SITB and circumstances around occurrences of SITB, such as preceding stressful events or triggers.


Table 10.2  Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument  Norms  IC   IRR  TRR  CNT  CST  VG   CU   HR

Interviews
SITBI       G      NA   E    A    G    A    G    A    ✓
SASII       A      G    E    E    A    G    A    G    ✓
FASM        G      A    NR   NR   A    A    G    A

Self-Report Measures
RFL         E      E    NA   G    A    E    E    G    ✓
B-RFL       G      G    NA   NR   A    E    G    A    ✓
RFL-A       G      E    NA   NR   A    A    G    G    ✓
CSRLI       G      G    NA   NR   A    A    E    A
RFL-YA      A      E    NA   NR   A    A    A    A
RFL-OA      G      E    NA   NR   A    A    A    A
RSAQ        E      A    NA   NR   A    A    E    A
MAST        G      G    NA   NR   A    G    E    A
ISAS        G      G    NA   NR   G    G    E    A

Note: IC = Internal Consistency; IRR = Inter-Rater Reliability; TRR = Test–Retest Reliability; CNT = Content Validity; CST = Construct Validity; VG = Validity Generalization; CU = Clinical Utility; HR = Highly Recommended; SITBI = Self-Injurious Thoughts and Behaviors Interview; SASII = Suicide Attempts Self-Injury Interview; FASM = Functional Assessment of Self-Mutilation; RFL = Reasons for Living Inventory; B-RFL = Brief Reasons for Living Inventory; RFL-A = Reasons for Living for Adolescents; CSRLI = College Student Reasons for Living Inventory; RFL-YA = Reasons for Living for Young Adults; RFL-OA = Reasons for Living for Older Adults; RSAQ = Reasons for Suicide Attempts Questionnaire; MAST = Multi-Attitude Suicide Tendency Scale for Adolescents; ISAS = Inventory of Statements About Self-Injury; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

The Functional Assessment of Self-Mutilation (FASM; Lloyd, Kelley, & Hope, 1997) is an interview that assesses characteristics and functions of NSSI exclusively. The FASM requires a respondent to endorse or deny 12 NSSI methods and, for each endorsed item, to report the frequency and whether medical treatment was necessary. Other items assess additional characteristics of NSSI, including the age of onset, the impulsiveness of the behaviors, substance use, and the amount of pain. It also provides 22 different reasons for engaging in NSSI. Studies have found adequate to excellent internal consistency for scores on the FASM (Guertin, Lloyd-Richardson, Spirito, Donaldson, & Boergers, 2001; Klonsky, May, & Glenn, 2013) and excellent convergent validity with the SITBI (Nock et al., 2007).

Self-Report

The Reasons for Living Inventory (RFL; Linehan, Goodstein, Nielsen, & Chiles, 1983) contains 48 items that assess different reasons for living. There is an expanded version that contains 72 items. The RFL is made up of six subscales based on factor analyses: survival and coping beliefs, responsibility to family, child concerns, fear of suicide, fear of social disapproval, and moral objections (to suicide). A recent review concluded that the RFL is inversely associated with both suicide ideation and

attempts (Bakhiyi, Calati, Guillaume, & Courtet, 2016). There are age and gender differences, as well as an age by gender interaction, on the RFL, with older participants and women reporting higher scores (Ellis & Lamis, 2007; McLaren, 2011); however, the gender gap decreases as people age (Bakhiyi et  al., 2016). There are also differences between racial and ethnic groups (Morrison & Downey, 2000; Walker, Alabi, Roberts, & Obasi, 2010). The RFL has adequate psychometric characteristics (Osman et al., 1993). The RFL has inspired several RFL-​ related instruments specifically tailored for different circumstances (e.g., when a briefer scale is required) and populations (e.g., adolescents, young adults, college students, and older adults). These RFL variants are reviewed briefly next. The Brief Reasons for Living Inventory (BRFL; Ivanoff, Jang, Smyth, & Linehan, 1994)  consists of 12 items all drawn from the original RFL. The BRFL subscales show high correlations with the corresponding RFL subscales, and the BRFL retains the same factor structure (Ivanoff et  al., 1994). Scores on the BRFL have shown adequate to good internal consistency and demonstrated validity (Bryant & Range, 1997; Kovac & Range, 2002; Marion & Range, 2003). The Reasons for Living for Adolescents (RFL-​ A; Osman et  al., 1998)  contains 32 items, none of which overlap with the RFL, to assess five factors:  future


optimism, suicide-​related concerns, family alliance, peer acceptance and support, and self-​acceptance. The RFL-​A scores show good reliability and predictive validity within both high school students (Gutierrez, Osman, Kopper, & Barrios, 2000; Osman et al., 1998) and psychiatric inpatient adolescents (Osman et al., 1998). The College Student Reasons for Living Inventory (CSRLI; Westefeld, Cardin, & Deaton, 1992)  contains 46 items that differ from those in the original RFL. Five of the six original RFL factors were retained, and a new factor reflecting college and future-​ related concerns replaced the original child concerns factor. Studies have supported good psychometric properties for the CSRLI (Rogers & Hanlon, 1996; Westefeld, Badura, Kiel, & Scheel, 1996; Westefeld et  al., 1992; Westefeld, Scheel, & Maples, 1998). The Reasons for Living for Young Adults (RFL-​YA; Gutierrez et  al., 2002)  contains 32 items, including 12 items from the RFL-​A, to assess reasons for living specifically among those aged 17 to 30  years. Factor analyses suggest factors that largely overlap with those found for the RFL-​A (Gutierrez et  al., 2002). Studies among college students support the reliability and validity of scores on this measure (Bagge, Lamis, Nadorff, & Osman, 2014; Gutierrez et  al., 2002; Wang, Nyutu, & Tran, 2012). No study has directly compared the RFL-​ YA and the CSRLI, which have overlapping objectives. Clinicians and researchers seeking to assess reasons for living among college students should review both scales to determine which scale best fits their needs. The Reasons for Living Scale–​Older Adult version (RFL-​OA; Edelstein et  al., 2009)  contains 69 items, 28 of which come from the original RFL, and it contains five of the six subscales of the RFL (child concerns is omitted). Scores on the RFL-​OA show excellent internal consistency and good convergent and divergent validity (Edelstein et al., 2009; Heisel, Neufeld, & Flett, 2016). Several other measures, rather than assessing reasons for living, assess reasons for making a suicide attempt. The Reasons for Suicide Attempt Questionnaire (RASQ; Holden, Kerr, Mendonca, & Velamoor, 1998)  contains 14 items that assess motivations for attempting suicide. The RASQ has two subscales: extrapunitive/​manipulative reasons (8 items) and internal perturbation based motivations (6 items). The RASQ has shown good psychometric properties within several populations, including adults in psychiatric crisis care (Holden et al., 1998), adults with a prior suicide attempt (Holden & Delisle, 2006), and male prisoners (Holden & Kroner, 2003).
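Measures in the reasons-for-living and reasons-for-attempt families are typically scored by summing or averaging items within each subscale. The sketch below shows generic subscale scoring with a hypothetical item-to-subscale key; it does not reproduce the actual items or scoring keys of the RFL, RASQ, or any other copyrighted instrument.

```python
from typing import Dict, List

def subscale_means(responses: Dict[int, int], scoring_key: Dict[str, List[int]]) -> Dict[str, float]:
    """Mean item score per subscale; items are referenced by their (1-indexed) numbers."""
    return {
        subscale: sum(responses[item] for item in items) / len(items)
        for subscale, items in scoring_key.items()
    }

# Hypothetical 6-item key and responses for illustration only (not an actual scoring key).
example_key = {
    "survival_and_coping": [1, 2, 3],
    "responsibility_to_others": [4, 5, 6],
}
example_responses = {1: 5, 2: 4, 3: 6, 4: 2, 5: 3, 6: 2}  # e.g., 1-6 Likert ratings

print(subscale_means(example_responses, example_key))
```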

203

The Multi-​ Attitude Suicide Tendency Scale for Adolescents (MAST; Orbach et  al., 1991)  is a 30-​item measure that examines attraction and repulsion to life and death, respectively. Scores on the MAST have demonstrated adequate to excellent reliability (Orbach et al., 1991; Osman et al., 1994) and concurrent validity (Cotton & Range, 1993; Gutierrez, Osman, Kopper, Barrios, & Bagge, 2000; Muehlenkamp & Gutierrez, 2004). The Inventory of Statements About Self-​ Injury (ISAS; Klonsky & Glenn, 2009)  mirrors many of the same items as the FASM, such as providing 12 NSSI methods, age of onset, impulsiveness of the behaviors, experience of physical pain, and reasons for engaging in self-​injury (or behavioral functions served by the behavior). Factor analyses of the FASM (Nock & Prinstein, 2004)  and rationally derived subscales of the SASII (Brown, Comtois, & Linehan, 2002)  converge on a four-​function model of self-​injury engagement, including whether the behavior is negatively or positively reinforced (i.e., to terminate a negative experience or trigger a positive experience) and either intrapersonal (i.e., carried out to affect one’s own emotions) or interpersonal (i.e., to affect others). This model has received empirical support in dozens of studies of developmentally disabled and typically developing samples (Bentley, Nock, & Barlow, 2014). The authors of the ISAS stated that they believe there are not 4 but, rather, 13 behavioral functions of NSSI, and they created this new scale to assess them. Factor analysis of the ISAS suggests that it captures only 2 functions:  the interpersonal and intrapersonal functions of NSSI (Klonsky & Glenn, 2009; Klonsky et al., 2013). Nevertheless, scores on the ISAS have good internal consistency (Klonsky & Glenn, 2009; Klonsky et al., 2013) and adequate to good test–​retest reliability (Glenn & Klonsky, 2011). Overall Evaluation Overall, as with the general initial assessment of SITB, the selection of an instrument for case conceptualization and treatment planning should be based on the evidence supporting it and the purpose of the assessment. There are several instruments that measure reasons for living. Although the strength of the evidence varies among these measures, one should also consider the population being assessed. For example, items referring to one’s children on the original RFL are usually not applicable to college students or adolescents. However, no studies have directly

204

204

Mood Disorders and Self-Injury

compared, for example, the RFL and the CSRLI in a college sample to determine the degree to which the two scales correspond.
A number of interview and self-report measures are available for assessing why a person engages in suicidal behavior or NSSI, and such measures can help clinicians with case conceptualization and point toward treatment targets. For example, if assessment reveals that a person engages in a suicide attempt or NSSI to escape from intolerable emotional suffering, then treatment focused on emotion regulation or distress tolerance skills might serve a similar function and help replace these behaviors. If, on the other hand, self-injury is intended to communicate psychological pain to family or friends, then interpersonal effectiveness skills might provide an adaptive approach to achieving similar ends. Although this approach seems sensible and is incorporated in some empirically supported treatments, such as dialectical behavior therapy, no studies have explicitly tested the utility of these scales to improve case conceptualization or guide treatment.
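The kind of function-to-target reasoning described above can be made explicit. The sketch below pairs the four functions of the four-function model with illustrative skill targets; the pairings are examples consistent with skills-based treatments such as dialectical behavior therapy, not a validated or prescriptive decision rule.

```python
# Four-function model of self-injury: reinforcement valence x social orientation.
# The skill targets below are illustrative pairings only, not a validated decision rule.
FUNCTION_TO_TARGET = {
    ("negative reinforcement", "intrapersonal"): "emotion regulation / distress tolerance skills",
    ("positive reinforcement", "intrapersonal"): "alternative ways to generate desired internal states",
    ("negative reinforcement", "interpersonal"): "interpersonal effectiveness skills (e.g., making direct requests)",
    ("positive reinforcement", "interpersonal"): "skills for eliciting support and communicating distress without self-injury",
}

def candidate_target(reinforcement: str, orientation: str) -> str:
    """Return an illustrative treatment target for an assessed SITB function."""
    return FUNCTION_TO_TARGET.get(
        (reinforcement, orientation),
        "function unclear; reassess with a functional measure (e.g., SITBI, SASII, FASM, or ISAS)",
    )

# Example: NSSI maintained by escape from intolerable negative emotion.
print(candidate_target("negative reinforcement", "intrapersonal"))
```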

ASSESSMENT FOR TREATMENT MONITORING AND OUTCOME EVALUATION

Several randomized controlled trials conducted during the past two decades have provided some empirical support for treatments that aim to reduce SITB directly (Bateman & Fonagy, 2009; Brown et al., 2005; Linehan, Comtois, Murray, et al., 2006; Linehan et al., 2015). On balance, meta-analyses have indicated that there is modest support for such treatments (Brent et al., 2009; Glenn, Franklin, & Nock, 2015; Kliem, Kröger, & Kosfelder, 2010) and have suggested that there is some degree of publication bias in the treatment literature (Tarrier, Taylor, & Gooding, 2008). Moreover, prior representative studies in the United States suggest that even when a large percentage of people receive treatment for SITB, it does not change the overall rate of these outcomes (Kessler, Berglund, Borges, Nock, & Wang, 2005). This could be due, in part, to a lack of evidence-based treatments available to those in community settings (McHugh & Barlow, 2010). Treatment monitoring and outcome evaluation approaches focus on assessing changes in the presence, frequency, and severity of SITB. Therefore, many of the measures listed in Tables 10.1a and 10.1b are appropriate for treatment monitoring and outcome evaluation, and several have been used in this role in prior studies.

However, there is little psychometric evidence supporting instruments specifically designed for assessing changes in SITB outcomes over time. Therefore, one of the first steps in selecting a measure for this purpose is to make sure that the time period assessed maps on to the time between assessment sessions. Obviously, it is inappropriate to use measures that assess cumulative outcomes, such as lifetime presence of a behavior, or a set period of time that does not correspond with the amount of time between monitoring sessions because this could cause single SITB occurrences either to be counted more than once (e.g., if monitoring occurs weekly but an instrument assesses monthly) or to be missed entirely (e.g., if monitoring occurs monthly but an instrument assesses only the past week). For these reasons, the C-​SSRS has a “since last visit” version (Posner et al., 2011), and the C-​SSRS, S-​STS, and SASII provide flexibility regarding the time period being assessed (Bland & Murray-​Gregory, 2006; Sheehan, Giddens, et  al., 2014a). The SASII has been used in several studies to monitor treatment and evaluate outcome (Bryan et al., 2014; Linehan, Comtois, Murray, et al., 2006; Linehan et al., 2015; McMain et al., 2009). The SITBI has a section that asks about past month and therefore would be appropriate if monitoring sessions occurred at that interval. In addition to the standard versions, the C-​SSRS also has abbreviated screeners for assessment of past month or “since last contact,” and the S-​STS has a “clinically meaningful change” version that assesses a broader array of factors potentially related to SITB as well as the severity of self-​injurious thoughts and capacity to not engage in SITB (Sheehan, Giddens, et al., 2014a). Among these various instruments and versions, however, only the C-​SSRS “since last visit” scale has been formally tested for sensitivity to change (i.e., it was correlated with other measures assessing SITB over time) and only in two studies (Greist, et al., 2014; Posner et al., 2011). Given the sparse data in this area, we do not present a table to review the measures but, rather, recommend the C-​SSRS and the SASII based on their evidence and use for the purposes of treatment evaluation and outcome monitoring. Overall Evaluation There are many instruments with strong psychometric properties that assess the presence, frequency, and other characteristics of SITB and measure these outcomes over varying periods of time. Given that the primary goal of treatment monitoring and outcome evaluation is to assess changes in SITB over time, one would assume


that these instruments would detect such fluctuations of these outcomes, but there is a dearth of studies examining whether this is the case. However, repeated assessments of SITB may alter participants’ self-​report, causing them to be more or less forthcoming. If this occurred, then the psychometric properties of instruments tested during one assessment session may not be applicable to use with repeated assessments. Ecological momentary assessment (EMA) studies (i.e., in which participants report thoughts, behaviors, or feelings on a mobile device soon after they occur) repeatedly assess SITB but, generally, they have used single questions—​ not instruments—​to directly assess the presence of SITB (Armey, Crowther, & Miller, 2011; Nock, Prinstein, & Sterba, 2009). Regardless of which instrument is used, it is recommended that clinicians conduct rigorous and comprehensive clinical assessment of primary outcomes of interest in order to best inform treatment planning. It is further recommended that these primary outcomes be assessed repeatedly throughout treatment and used to inform treatment modifications when necessary. From a utility perspective, it is crucial that researchers and clinicians examine the clinical utility of the information collected by these assessment instruments. Although generally assumed to be the case, there is sparse evidence confirming that these tools provide valuable information and enhance clinical care and decision-​making. In lieu of a list of instruments with the specific purpose of treatment monitoring and outcome evaluation, Box 10.1 contains an overview of general recommendations for SITB assessment.
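The earlier point about matching an instrument's recall window to the interval between monitoring sessions can also be made concrete with a simple check; the function and labels below are hypothetical and are not taken from any instrument's manual.

```python
def check_recall_window(recall_window_days: int, session_interval_days: int) -> str:
    """Flag recall windows that could double-count or miss SITB occurrences.

    A window longer than the interval between sessions can count the same
    occurrence at successive sessions; a shorter window leaves part of the
    interval unassessed.
    """
    if recall_window_days > session_interval_days:
        return "risk of double-counting: the window overlaps earlier sessions"
    if recall_window_days < session_interval_days:
        return "risk of missed occurrences: part of the interval is never assessed"
    return "window matches the monitoring interval (or use a 'since last visit' format)"

# Example: weekly monitoring with a past-month questionnaire.
print(check_recall_window(recall_window_days=30, session_interval_days=7))
```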

BOX 10.1  Clinical Recommendations and Research Questions for Evidence-Based Assessment of SITB

1. Identification of SITB
   (a) Assess the presence of each type of SITB in all patients.
   (b) Use multiple assessment methods (interview, questionnaire) and informants (patient, clinician, parent) whenever possible.
   (c) If SITB is identified on any measure, conduct a more thorough SITB evaluation and risk assessment.
2. Case conceptualization and treatment planning
   (a) Assess risk and protective factors for future SITB.*
   (b) Assess the function of SITB.
   (c) Treatment should target SITB directly.
3. Treatment monitoring and outcome evaluation
   (a) Assessment should begin before treatment and continue as frequently as feasible.
   (b) Measure multiple forms of SITB and select measures with evidence of treatment sensitivity.
   (c) Examine the clinical utility of information gained from SITB assessment.

* See sources cited in text for detailed guidelines for conducting an SITB risk assessment.

CONCLUSIONS AND FUTURE DIRECTIONS

We have provided an overview of the evidence and recommendations for the relatively large number of instruments available for assessing SITB and related characteristics. We have also reviewed recent work suggesting that some forms of measurement of SITB outcomes result in misclassification, within both research and clinical settings (Brown et al., 2015; Hom et al., 2015; Millner et al., 2015; Plöderl et al., 2011). Therefore, we emphasize the importance of having interviewers well trained in the classification of SITB when using interviews and of using self-report measures with increased clarity and coverage. We also reiterate the importance of carefully considering the purpose of assessment, target of assessment, and quality of evidence when selecting among the measures described in this chapter.
Several instruments were omitted from this chapter due to insufficient replicated psychometric support. Although we do not review the evidence supporting these instruments, several are noteworthy as promising measures that require additional testing. First, both the C-SSRS and the S-STS have versions for administration to children, although the psychometric properties of these versions have not been investigated. The Alexian Brothers Assessment of Self-Injury (ABASI; Washburn, Juzwin, Styer, & Aldridge, 2010) and the Non-Suicidal Self-Injury Disorder Scale (NSSID; Victor, Davis, & Klonsky, 2017) are two novel measures intended to assess NSSI disorder, and both have initial psychometric support. The Inventory of Motivations for Suicide Attempts

(IMSA; May & Klonsky, 2013)  is intended to assess a wider array of reasons for attempting suicide than does the RSAQ. Finally, the Non-​Suicidal Self-​Injury-​Assessment Tool (NSSI-​AT; Whitlock, Exner-​Cortens, & Purington, 2014) is a 12-​module assessment to measure many characteristics of NSSI and intended to be administered on the Internet. There are several needed directions for future work in this area. First, many of the constructs of interest here are non-​arbitrary metrics (e.g., not on a scale from 1 to 5; Blanton & Jaccard, 2006; Embretson, 2006) in the form of the actual presence of self-​harm thoughts or behaviors. This is important because reliability and validity are predicated on assessing a single construct (Cronbach & Meehl, 1955), whereas SITB, such as suicidal ideation and suicide attempts, are independent constructs generally measured together on a single instrument. Therefore, one future direction is for studies to focus evaluations of reliability and validity on measures of each SITB construct separately. Second, and related, there are threats to reliability and validity that can be addressed by research that examines each SITB construct separately. For example, one threat to reliable and valid measurement is that participants may not understand the SITB term in question (e.g., suicide plan or suicide attempt; Millner et al., 2015). Therefore, future research could continue to focus on increasing the clarity of assessment questions and testing how to further reduce inaccurate responses among participants. A  second threat to reliability and validity is that some participants may intentionally withhold reporting prior suicidal behaviors because of stigma or embarrassment (Conner, Langley, Tomaszewski, & Conwell, 2003; Kim, Thomas, Wilk, Castro, & Hoge, 2010). Thus, future research could focus on testing certain prompts that cause participants to be more comfortable and forthcoming answering questions about SITB. Finally, aside from issues of reliability and validity, there are several future directions that would advance the understanding of SITB. First, currently, we lack a basic description of many important SITB processes. For example, little is known about (a) the degree to which SITB such as NSSI and suicidal ideation fluctuate throughout a day, week, or month (Armey et al., 2011; Kleiman et al., 2017; Nock, Prinstein, et al., 2009); (b) how suicidal thoughts or NSSI or other problematic behaviors (e.g., alcohol use) change in the hours or days prior to an attempt (Bagge, Glenn, & Lee, 2013; Bagge, Lee, et  al., 2013; Bagge, Littlefield, Conner, Schumacher, & Lee, 2014); and, related, (c) the suicide planning steps people take as they

move from thinking about suicide to actually engaging in a suicide attempt and how far in advance of the attempt these steps are typically carried out (Bagge, Littlefield, & Lee, 2013; Millner, Lee, & Nock, 2017). Initial work in these areas using EMA, which allows for real-​time assessment via smartphone apps, and recent retrospective report (i.e., interviewing people within 1 or 2 weeks of a suicide attempt) has started to describe these processes, but more work is required to gain a basic description of these processes in order to advance the understanding of when and why people kill themselves. Second, there are several exciting but unverified approaches to improve the prediction of SITB. For example, wearable technology such as smart wristbands can collect passive data, such as walking pace, amount of movement throughout the day, and heartbeat (Onnela & Rauch, 2016), that may contribute to improved prediction of suicidal behaviors. In addition, some studies have found that computerized reaction time behavioral tasks can improve the prospective prediction of suicidal outcomes (Nock & Banaji, 2007; Nock, Park, et  al., 2010; Randall, Rowe, Dong, Nock, & Colman, 2013), but more research is required to confirm these findings. Future research may reveal methods to augment retrospective self-​report assessments by integrating these various data sources (EMA, passive monitoring, and computerized tasks) into predictive models that greatly improve efforts to prevent SITB. For now, this chapter provides current evidence on the most empirically supported instruments to be used by researchers and practitioners working in psychiatric settings.

References Addis, M., & Linehan, M. M. (1989). Predicting suicidal behavior:  Psychometric properties of the Suicidal Behaviors Questionnaire. Poster presented at the annual meeting of the Association for the Advancement of Behavior Therapy, Washington, DC. Allan, W. D., Kashani, J. H., Dahlmeier, J., Taghizadeh, P., & Reid, J. C. (1997). Psychometric properties and clinical utility of the Scale for Suicide Ideation with inpatient children. Journal of Abnormal Child Psychology, 25, 465–​473. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Andrews, T., Martin, G., Hasking, P., & Page, A. (2013). Predictors of continuation and cessation of nonsuicidal self-​injury. Journal of Adolescent Health, 53, 40–​46. Armey, M. F., Crowther, J. H., & Miller, I. W. (2011). Changes in ecological momentary assessment reported


affect associated with episodes of nonsuicidal self-​injury. Behavior Therapy, 42, 579–​588. Asarnow, J., McArthur, D., Hughes, J., Barbery, V., & Berk, M. (2012). Suicide attempt risk in youths: Utility of the Harkavy–​Asnis Suicide Scale for monitoring risk levels. Suicide and Life-​Threatening Behavior, 42, 684–​698. Bagge, C. L., Glenn, C. R., & Lee, H.-​J. (2013). Quantifying the impact of recent negative life events on suicide attempts. Journal of Abnormal Psychology, 122, 359–​368. Bagge, C. L., Lamis, D. A., Nadorff, M., & Osman, A. (2014). Relations between hopelessness, depressive symptoms and suicidality: Mediation by reasons for living. Journal of Clinical Psychology, 70, 18–​31. Bagge, C. L., Lee, H.-​J., Schumacher, J. A., Gratz, K. L., Krull, J. L., & Holloman, G. (2013). Alcohol as an acute risk factor for recent suicide attempts: A case-​crossover analysis. Journal of Studies on Alcohol and Drugs, 74, 552–​558. Bagge, C. L., Littlefield, A. K., Conner, K. R., Schumacher, J. A., & Lee, H.-​J. (2014). Near-​term predictors of the intensity of suicidal ideation:  An examination of the 24h prior to a recent suicide attempt. Journal of Affective Disorders, 165, 53–​58. Bagge, C. L., Littlefield, A. K., & Lee, H.-​J. (2013). Correlates of proximal premeditation among recently hospitalized suicide attempters. Journal of Affective Disorders, 150, 559–​564. Bakhiyi, C. L., Calati, R., Guillaume, S., & Courtet, P. (2016). Do reasons for living protect against suicidal thoughts and behaviors? A systematic review of the literature. Journal of Psychiatric Research, 77, 92–​108. Barrocas, A. L., Hankin, B. L., Young, J. F., & Abela, J. R. Z. (2012). Rates of nonsuicidal self-​injury in youth:  Age, sex, and behavioral methods in a community sample. Pediatrics, 130, 39–​45. Bateman, A., & Fonagy, P. (2009). Randomized controlled trial of outpatient mentalization-​based treatment versus structured clinical management for borderline personality disorder. American Journal of Psychiatry, 166, 1355–​1364. Batterham, P. J., Ftanou, M., Pirkis, J., Brewer, J. L., Mackinnon, A. J., Beautrais, A.,  .  .  .  Christensen, H. (2015). A systematic review and evaluation of measures for suicidal ideation and behaviors in population-​based research. Psychological Assessment, 27, 501–​512. Beck, A. T., Brown, G. K., & Steer, R. A. (1997). Psychometric characteristics of the Scale for Suicide Ideation with psychiatric outpatients. Behaviour Research and Therapy, 35, 1039–​1046. Beck, A. T., Brown, G. K., Steer, R. A., Dahlagaard, K. K., & Grisham, J. R. (1999). Suicide ideation at its worst point:  A predictor of eventual suicide in psychiatric outpatients. Suicide and Life-​ Threatening Behavior, 29, 1–​9.


Beck, A. T., Kovacs, M., & Weissman, A. (1979). Assessment of suicidal intention:  The Scale for Suicide Ideation. Journal of Consulting and Clinical Psychology, 47, 343. Beck, A. T., & Steer, R. A. (1991). Manual for the Beck Scale for Suicide Ideation. San Antonio, TX:  Psychological Corporation. Beck, A. T., Steer, R. A., & Ranieri, W. F. (1988). Scale for Suicide Ideation: Psychometric properties of a self-​report version. Journal of Clinical Psychology, 44, 499–​505. Bentley, K. H., Nock, M. K., & Barlow, D. H. (2014). The four-​function model of nonsuicidal self-​injury key directions for future research. Clinical Psychological Science, 2, 638–​656. Berman, A. L., & Silverman, M. M. (2014). Suicide risk assessment and risk formulation Part II: Suicide risk formulation and the determination of levels of risk. Suicide and Life-​Threatening Behavior, 44, 432–​443. Bland, S., & Murray-​ Gregory, A. (2006, September 28). Instructions for use of Suicide Attempt Self Injury Interview. Retrieved from http://​depts.washington.edu/​ brtc/​files/​SASII%20Instructions.pdf Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61, 27–​41. Brausch, A. M., & Gutierrez, P. M. (2010). Differences in non-​suicidal self-​injury and suicide attempts in adolescents. Journal of Youth and Adolescence, 39, 233–​242. Brenner, L. A., Breshears, R. E., Betthauser, L. M., Bellon, K. K., Holman, E., Harwood, J. E. F., . . . Nagamoto, H. T. (2011). Implementation of a suicide nomenclature within two VA healthcare settings. Journal of Clinical Psychology in Medical Settings, 18, 116–​128. Brent, D. A., Greenhill, L. L., Compton, S., Emslie, G., Wells, K., Walkup, J. T., . . . Turner, J. B. (2009). The Treatment of Adolescent Suicide Attempters study (TASA): Predictors of suicidal events in an open treatment trial. Journal of the American Academy of Child & Adolescent Psychiatry, 48, 987–​996. Brown, G. K., Currier, G. W., Jager-​Hyman, S., & Stanley, B. (2015). Detection and classification of suicidal behavior and nonsuicidal self-​injury behavior in emergency departments. Journal of Clinical Psychiatry, 76, 1397–​1403. Brown, G. K., Ten Have, T., Henriques, G. R., Xie, S. X., Hollander, J. E., & Beck, A. T. (2005). Cognitive therapy for the prevention of suicide attempts:  A randomized controlled trial. JAMA, 294, 563–​570. Brown, M. Z., Comtois, K. A., & Linehan, M. M. (2002). Reasons for suicide attempts and nonsuicidal self-​injury in women with borderline personality disorder. Journal of Abnormal Psychology, 111, 198–​202. Bryan, C., & Bryan, A. (2014). Nonsuicidal self-​injury among a sample of United States military personnel and veterans enrolled in college classes. Journal of Clinical Psychology, 70, 874–​885.

208

208

Mood Disorders and Self-Injury

Bryan, C. J., David Rudd, M., Wertenberger, E., Etienne, N., Ray-​Sannerud, B. N., Morrow, C. E.,  .  .  .  Young-​ McCaughon, S. (2014). Improving the detection and prediction of suicidal behavior among military personnel by measuring suicidal beliefs: An evaluation of the Suicide Cognitions Scale. Journal of Affective Disorders, 159, 15–​22. Bryant, S. L., & Range, L. M. (1997). Type and severity of child abuse and college students’ lifetime suicidality. Child Abuse & Neglect, 21, 1169–​1176. Clum, G. A., & Yang, B. (1995). Additional support for the reliability and validity of the Modified Scale for Suicide Ideation. Psychological Assessment, 7, 122–​125. Cole, D. A. (1988). Hopelessness, social desirability, depression, and parasuicide in two college student samples. Journal of Consulting and Clinical Psychology, 56, 131–​136. Conner, K. R., Langley, J., Tomaszewski, K. J., & Conwell, Y. (2003). Injury hospitalization and risks for subsequent self-​injury and suicide:  A national study from New Zealand. American Journal of Public Health, 93, 1128–​1131. Cotton, C. R., Peters, D. K., & Range, L. M. (1995). Psychometric properties of the Suicidal Behaviors Questionnaire. Death Studies, 19, 391–​397. Cotton, C. R., & Range, L. M. (1993). Suicidality, hopelessness, and attitudes toward life and death in children. Death Studies, 17, 185–​191. Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–​302. de Beurs, D. P., Fokkema, M., de Groot, M. H., de Keijser, J., & Kerkhof, A. J. F. M. (2015). Longitudinal measurement invariance of the Beck Scale for Suicide Ideation. Psychiatry Research, 225, 368–​373. Dexter-​Mazza, E. T., & Freeman, K. A. (2003). Graduate training and the treatment of suicidal clients:  The students’ perspective. Suicide and Life-​ Threatening Behavior, 33, 211–​218. Edelstein, B. A., Heisel, M. J., McKee, D. R., Martin, R. R., Koven, L. P., Duberstein, P. R., & Britton, P. C. (2009). Development and psychometric evaluation of the Reasons for Living–​Older Adults Scale:  A suicide risk assessment inventory. The Gerontologist, 49, 736–​745. Ellis, J., & Lamis, D. (2007). Adaptive characteristics and suicidal behavior:  A gender comparison of young adults. Death Studies, 31, 845–​854. Embretson, S. E. (2006). The continued search for nonarbitrary metrics in psychology. American Psychologist, 61, 50–​55. Fischer, G., Ameis, N., Parzer, P., Plener, P. L., Groschwitz, R., Vonderlin, E., . . . Kaess, M. (2014). The German version of the Self-​Injurious Thoughts and Behaviors Interview (SITBI-​ G):  A tool to assess non-​ suicidal self-​ injury and suicidal behavior disorder. BMC Psychiatry, 14, 1.

Fliege, H., Kocalevent, R.-​D., Walter, O. B., Beck, S., Gratz, K. L., Gutierrez, P. M., & Klapp, B. F. (2006). Three assessment tools for deliberate self-​harm and suicide behavior:  Evaluation and psychopathological correlates. Journal of Psychosomatic Research, 61, 113–​121. Fowler, J. C. (2012). Suicide risk assessment in clinical practice:  Pragmatic guidelines for imperfect assessments. Psychotherapy, 49, 81–​90. Franklin, J., Fox, K. R., Ribeiro, J. D., Kleiman, E., Bentley, K., Chang, B., & Nock, M. K. (2017). Risk factors for suicidal thoughts and behaviors:  A meta-​ analysis of 50  years of research. Psychological Bulletin, 143, 187–​232. García-​Nieto, R., Blasco-​Fontecilla, H., Paz Yepes, M., & Baca-​García, E. (2013). Translation and validation of the “Self-​Injurious Thoughts and Behaviors Interview” in a Spanish population with suicidal behaviour. Revista De Psiquiatrí́a Y Salud Mental, 6, 101–​108. Glenn, C. R., Bagge, C. L., & Osman, A. (2013). Unique associations between borderline personality disorder features and suicide ideation and attempts in adolescents. Journal of Personality Disorders, 27, 604–​616. Glenn, C. R., Franklin, J. C., & Nock, M. K. (2015). Evidence-​ based psychosocial treatments for self-​ injurious thoughts and behaviors in youth. Journal of Clinical Child & Adolescent Psychology, 44, 1–​29. Glenn, C. R., & Klonsky, E. D. (2011). One-​year test–​retest reliability of the Inventory of Statements about Self-​ Injury (ISAS). Assessment, 18, 375–​378. Gould, M. S., Marrocco, F. A., Kleinman, M., Thomas, J. G., Mostkoff, K., Cote, J., & Davies, M. (2005). Evaluating iatrogenic risk of youth suicide screening programs:  A randomized controlled trial. JAMA, 293, 1635–​1643. Gratz, K. L. (2001). Measurement of deliberate self-​ harm:  Preliminary data on the Deliberate Self-​Harm Inventory. Journal of Psychopathology and Behavioral Assessment, 23, 253–​263. Greist, J. H., Mundt, J. C., Gwaltney, C. J., Jefferson, J. W., & Posner, K. (2014). Predictive value of baseline electronic Columbia-​Suicide Severity Rating Scale (eC-​ SSRS) assessments for identifying risk of prospective reports of suicidal behavior during research participation. Innovations in Clinical Neuroscience, 11, 23–​31. Guertin, T., Lloyd-​Richardson, E., Spirito, A., Donaldson, D., & Boergers, J. (2001). Self-​mutilative behavior in adolescents who attempt suicide by overdose. Journal of the American Academy of Child and Adolescent Psychiatry, 40, 1062–​1069. Gutierrez, P. M., & Osman, A. (2009). Getting the best return on your screening investment:  An analysis of the Suicidal Ideation Questionnaire and Reynolds Adolescent Depression Scale. School Psychology Review, 38, 200–​217.

 209

Self-Injurious Thoughts and Behaviors

Gutierrez, P. M., Osman, A., Barrios, F. X., & Kopper, B. A. (2001). Development and initial validation of the Self-​ Harm Behavior Questionnaire. Journal of Personality Assessment, 77, 475–​490. Gutierrez, P. M., Osman, A., Barrios, F. X., Kopper, B. A., Baker, M. T., & Haraburda, C. M. (2002). Development of the Reasons for Living Inventory for Young Adults. Journal of Clinical Psychology, 58, 339–​357. Gutierrez, P. M., Osman, A., Kopper, B. A., & Barrios, F. X. (2000). Why young people do not kill themselves: The Reasons for Living Inventory for Adolescents. Journal of Clinical Child Psychology, 29, 177–​187. Gutierrez, P. M., Osman, A., Kopper, B. A., Barrios, F. X., & Bagge, C. L. (2000). Suicide risk assessment in a college student population. Journal of Counseling Psychology, 47, 403–​413. Harkavy Friedman, J. M.  H., & Asnis, G. M. (1989). Assessment of suicidal behavior:  A new instrument. Psychiatric Annals, 19, 382–​387. Harris, K. M., & Goh, M. T.-​T. (2017). Is suicide assessment harmful to participants? Findings from a randomized controlled trial. International Journal of Mental Health Nursing, 26, 181–​190. Healy, D. J., Barry, K., Blow, F., Welsh, D., & Milner, K. K. (2006). Routine use of the Beck Scale for Suicide Ideation in a psychiatric emergency department. General Hospital Psychiatry, 28, 323–​329. Heisel, M. J., Neufeld, E., & Flett, G. L. (2016). Reasons for living, meaning in life, and suicide ideation: Investigating the roles of key positive psychological factors in reducing suicide risk in community-​ residing older adults. Aging & Mental Health, 20, 195–​207. Holden, R. R., & Delisle, M. M. (2006). Factor structure of the Reasons for Attempting Suicide Questionnaire (RASQ) with suicide attempters. Journal of Psychopathology and Behavioral Assessment, 28, 1–​8. Holden, R. R., Kerr, P. S., Mendonca, J. D., & Velamoor, V. R. (1998). Are some motives more linked to suicide proneness than others? Journal of Clinical Psychology, 54, 569–​576. Holden, R. R., & Kroner, D. G. (2003). Differentiating suicidal motivations and manifestations in a forensic sample. Canadian Journal of Behavioural Science, 35, 35–​44. Holi, M. M., Pelkonen, M., Karlsson, L., Kiviruusu, O., Ruuttu, T., Heilä, H.,  .  .  .  Marttunen, M. (2005). Psychometric properties and clinical utility of the Scale for Suicidal Ideation (SSI) in adolescents. BMC Psychiatry, 5, 1. Hom, M. A., Joiner, T. E., & Bernert, R. A. (2015). Limitations of a single-​ item assessment of suicide attempt history:  Implications for standardized suicide risk assessment. Psychological Assessment, 28, 1026–​1030. Horon, R., McManus, T., Schmollinger, J., Barr, T., & Jimenez, M. (2013). A study of the use and interpretation

209

of standardized suicide risk assessment: Measures within a psychiatrically hospitalized correctional population. Suicide and Life-​Threatening Behavior, 43, 17–​38. Husky, M., Olié, E., Guillaume, S., Genty, C., Swendsen, J., & Courtet, P. (2014). Feasibility and validity of ecological momentary assessment in the investigation of suicide risk. Psychiatry Research, 220, 564–​570. Huth-​Bocks, A. C., Kerr, D. C.  R., Ivey, A. Z., Kramer, A. C., & King, C. A. (2007). Assessment of psychiatrically hospitalized suicidal adolescents:  Self-​ report instruments as predictors of suicidal thoughts and behavior. Journal of the American Academy of Child & Adolescent Psychiatry, 46, 387–​395. Ivanoff, A., Jang, S. J., Smyth, N. J., & Linehan, M. M. (1994). Fewer reasons for staying alive when you are thinking of killing yourself: The Brief Reasons for Living Inventory. Journal of Psychopathology and Behavioral Assessment, 16, 1–​13. Jacobs, D. G., Baldessarini, R. J., Conwell, Y., Fawcett, J. A., Horton, L., Meltzer, H., . . . Simon, R. (2010). Practice guideline for the assessment and treatment of patients with suicidal behaviors. Washington, DC:  American Psychiatric Association. Jacobson, C. M., & Gould, M. (2007). The epidemiology and phenomenology of non-​suicidal self-​injurious behavior among adolescents:  A critical review of the literature. Archives of Suicide Research, 11, 129–​147. Joiner, T. E. J., Conwell, Y., Fitzpatrick, K. K., Witte, T. K., Schmidt, N. B., Berlim, M. T., . . . Rudd, M. D. (2005). Four studies on how past and current suicidality relate even when “everything but the kitchen sink” is covaried. Journal of Abnormal Psychology, 114, 291–​303. Joiner, T. E.  J., Rudd, M. D., & Rajab, M. H. (1997). The Modified Scale for Suicidal Ideation: Factors of suicidality and their relation to clinical and diagnostic variables. Journal of Abnormal Psychology, 106, 260–​265. Joiner, T. E., Steer, R. A., Brown, G., Beck, A. T., Pettit, J. W., & Rudd, M. D. (2003). Worst-​ point suicidal plans: A dimension of suicidality predictive of past suicide attempts and eventual death by suicide. Behaviour Research and Therapy, 41, 1469–​1480. Kessler, R. C., Berglund, P., Borges, G., Nock, M., & Wang, P. S. (2005). Trends in suicide ideation, plans, gestures, and attempts in the United States, 1990–​1992 to 2001–​2003. JAMA, 293, 2487–​2495. Kim, P. Y., Thomas, J. L., Wilk, J. E., Castro, C. A., & Hoge, C. W. (2010). Stigma, barriers to care, and use of mental health services among active duty and National Guard soldiers after combat. Psychiatric Services, 61, 582–​588. Kleespies, P. M., AhnAllen, C. G., Knight, J. A., Presskreischer, B., Barrs, K. L., Boyd, B. L., & Dennis, J. P. (2011). A study of self-​injurious and suicidal behavior in a Veteran population. Psychological Services, 8, 236–​250.

210

210

Mood Disorders and Self-Injury

Kleespies, P. M., Penk, W. E., & Forsyth, J. P. (1993). The stress of patient suicidal behavior during clinical training:  Incidence, impact, and recovery. Professional Psychology: Research and Practice, 24, 293–​303. Kleiman, E. M., Turner, B. J., Fedor, S., Beale, E. E., Huffman, J. C., & Nock, M. K. (2017). Examination of real-​time fluctuations in suicidal ideation and its risk factors: Results from two ecological momentary assessment studies. Journal of Abnormal Psychology, 126, 726–​738. Kliem, S., Kröger, C., & Kosfelder, J. (2010). Dialectical behavior therapy for borderline personality disorder: A meta-​analysis using mixed-​effects modeling. Journal of Consulting and Clinical Psychology, 78, 936–​951. Klimes-​Dougan, B. (1998). Screening for suicidal ideation in children and adolescents: Methodological considerations. Journal of Adolescence, 21, 435–​444. Klonsky, E. D. (2011). Non-​suicidal self-​injury in United States adults:  Prevalence, sociodemographics, topography and functions. Psychological Medicine, 41, 1981–​1986. Klonsky, E. D., & Glenn, C. R. (2009). Assessing the functions of non-​suicidal self-​injury:  Psychometric properties of the Inventory of Statements About Self-​Injury (ISAS). Journal of Psychopathology and Behavioral Assessment, 31, 215–​219. Klonsky, E. D., May, A. M., & Glenn, C. R. (2013). The relationship between nonsuicidal self-​injury and attempted suicide:  Converging evidence from four samples. Journal of Abnormal Psychology, 122, 231–​237. Kovac, S. H., & Range, L. M. (2002). Does writing about suicidal thoughts and feelings reduce them? Suicide and Life-​Threatening Behavior, 32, 428–​440. Law, M. K., Furr, R. M., Arnold, E. M., Mneimne, M., Jaquett, C., & Fleeson, W. (2015). Does assessing suicidality frequently and repeatedly cause harm? A  randomized control study. Psychological Assessment, 27, 1171–​1181. Linehan, M. M. (1981). Suicide behaviors questionnaire. Unpublished manuscript, University of Washington, Seattle, WA. Linehan, M. M. (1990). Screening for suicidal behaviors: The Suicidal Behaviors Questionnaire. Unpublished Manuscript, University of Washington, Seattle, WA. Linehan, M. M., Camper, P., Chiles, J. A., Strosahl, K., & Shearin, E. (1987). Interpersonal problem solving and parasuicide. Cognitive Therapy and Research, 11, 1–​12. Linehan, M. M., Comtois, K. A., Brown, M. Z., Heard, H. L., & Wagner, A. (2006). Suicide Attempt Self-​Injury Interview (SASII): Development, reliability, and validity of a scale to assess suicide attempts and intentional self-​ injury. Psychological Assessment, 18, 303–​312. Linehan, M. M., Comtois, K. A., Murray, A. M., Brown, M. Z., Gallop, R. J., Heard, H. L.,  .  .  .  Lindenboim,

N. (2006). Two-​year randomized controlled trial and follow-​up of dialectical behavior therapy vs. therapy by experts for suicidal behaviors and borderline personality disorder. Archives of General Psychiatry, 63, 757–​766. Linehan, M. M., Goodstein, J. L., Nielsen, S. L., & Chiles, J. A. (1983). Reasons for staying alive when you are thinking of killing yourself: The Reasons for Living Inventory. Journal of Consulting and Clinical Psychology, 51, 276–​286. Linehan, M. M., Korslund, K. E., Harned, M. S., Gallop, R. J., Lungu, A., Neacsiu, A. D., . . . Murray-​Gregory, A. M. (2015). Dialectical behavior therapy for high suicide risk in individuals with borderline personality disorder:  A randomized clinical trial and component analysis. JAMA Psychiatry, 72, 475–​482. Lloyd, E. E., Kelley, M. L., & Hope, T. (1997). Self-​mutilation in a community sample of adolescents:  Descriptive characteristics and provisional prevalence rates. Paper presented at the annual meeting of the Society for Behavioral Medicine, New Orleans, LA. Lozano, R., Naghavi, M., Foreman, K., Lim, S., Shibuya, K., Aboyans, V., . . . Murray, C. J. (2012). Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010:  A systematic analysis for the Global Burden of Disease Study 2010. Lancet, 380, 2095–​2128. Luxton, D. D., Rudd, M. D., Reger, M. A., & Gahm, G. A. (2011). A psychometric study of the Suicide Ideation Scale. Archives of Suicide Research, 15, 250–​258. Marion, M. S., & Range, L. M. (2003). African American college women’s suicide buffers. Suicide and Life-​ Threatening Behavior, 33, 33–​43. Matarazzo, B. B., Clemans, T. A., Silverman, M. M., & Brenner, L. A. (2013). The Self-​ Directed Violence Classification System and the Columbia Classification Algorithm for Suicide Assessment: A crosswalk. Suicide and Life-​Threatening Behavior, 43, 235–​249. May, A. M., & Klonsky, E. D. (2013). Assessing motivations for suicide attempts:  Development and psychometric properties of the Inventory of Motivations for Suicide Attempts. Suicide and Life-​Threatening Behavior, 43, 532–​546. McHugh, R. K., & Barlow, D. H. (2010). The dissemination and implementation of evidence-​ based psychological treatments: A review of current efforts. American Psychologist, 65, 73–​84. McLaren, S. (2011). Age, gender, and reasons for living among Australian adults. Suicide and Life-​Threatening Behavior, 41, 650–​660. McMain, S. F., Links, P. S., Gnam, W. H., Guimond, T., Cardish, R. J., Korman, L., & Streiner, D. L. (2009). A randomized trial of dialectical behavior therapy versus general psychiatric management for borderline personality disorder. American Journal of Psychiatry, 166, 1365–​1374.

 21

Self-Injurious Thoughts and Behaviors

Millner, A. J., Lee, M. D., & Nock, M. K. (2015). Single-​ item measurement of suicidal behaviors: Validity and consequences of misclassification. PLoS One, 10, e0141606. Millner, A. J., Lee, M. D., & Nock, M. K. (2017). Describing and measuring the pathway to suicide attempts: A preliminary study. Suicide and Life-​Threatening Behavior, 47(3), 353–​336. Miller, I. W., Norman, W. H., Bishop, S. B., & Dow, M. G. (1986). The Modified Scale for Suicidal Ideation: Reliability and validity. Journal of Consulting and Clinical Psychology, 54, 724–​725. Morrison, L. L., & Downey, D. L. (2000). Racial differences in self-​disclosure of suicidal ideation and reasons for living:  Implications for training. Cultural Diversity and Ethnic Minority Psychology, 6, 374–​386. Muehlenkamp, J. J., Cowles, M. L., & Gutierrez, P. M. (2009). Validity of the Self-​Harm Behavior Questionnaire with diverse adolescents. Journal of Psychopathology and Behavioral Assessment, 32, 236–​245. Muehlenkamp, J. J., & Gutierrez, P. M. (2004). An investigation of differences between self-​injurious behavior and suicide attempts in a sample of adolescents. Suicide and Life-​Threatening Behavior, 34, 12–​23. Muehlenkamp, J. J., Walsh, B. W., & McDade, M. (2010). Preventing non-​suicidal self-​injury in adolescents: The signs of self-​ injury program. Journal of Youth and Adolescence, 39, 306–​314. Mundt, J. C., Greist, J. H., Gelenberg, A. J., Katzelnick, D. J., Jefferson, J. W., & Modell, J. G. (2010). Feasibility and validation of a computer-​ automated Columbia-​ Suicide Severity Rating Scale using interactive voice response technology. Journal of Psychiatric Research, 44, 1224–​1228. Mundt, J. C., Greist, J. H., Jefferson, J. W., Federico, M., Mann, J. J., & Posner, K. (2013). Prediction of suicidal behavior in clinical research by lifetime suicidal ideation and behavior ascertained by the electronic Columbia-​ Suicide Severity Rating Scale. Journal of Clinical Psychiatry, 74, 887–​893. Nock, M. K., & Banaji, M. R. (2007). Prediction of suicide ideation and attempts among adolescents using a brief performance-​based test. Journal of Consulting and Clinical Psychology, 75, 707–​715. Nock, M. K., Borges, G., Bromet, E. J., Alonso, J., Angermeyer, M., Beautrais, A.,  .  .  .  Williams, D. (2008). Cross-​national prevalence and risk factors for suicidal ideation, plans and attempts. British Journal of Psychiatry, 192, 98–​105. Nock, M. K., Holmberg, E. B., Photos, V. I., & Michel, B. D. (2007). Self-​Injurious Thoughts and Behaviors Interview:  Development, reliability, and validity in an adolescent sample. Psychological Assessment, 19, 309–​317.

211

Nock, M. K., Hwang, I., Sampson, N. A., & Kessler, R. C. (2010). Mental disorders, comorbidity and suicidal behavior:  Results from the National Comorbidity Survey Replication. Molecular Psychiatry, 15, 868–​876. Nock, M. K., Hwang, I., Sampson, N., Kessler, R. C., Angermeyer, M., Beautrais, A.,  .  .  .  Williams, D. R. (2009). Cross-​ national analysis of the associations among mental disorders and suicidal behavior: Findings from the WHO World Mental Health Surveys. PLoS Medicine, 6, e1000123. Nock, M. K., & Kazdin, A. E. (2002). Examination of affective, cognitive, and behavioral factors and suicide-​ related outcomes in children and young adolescents. Journal of Clinical Child & Adolescent Psychology, 31, 48–​58. Nock, M. K., & Kessler, R. C. (2006). Prevalence of and risk factors for suicide attempts versus suicide gestures:  Analysis of the National Comorbidity Survey. Journal of Abnormal Psychology, 115, 616–​623. Nock, M. K., Park, J. M., Finn, C. T., Deliberto, T. L., Dour, H. J., & Banaji, M. R. (2010). Measuring the suicidal mind:  Implicit cognition predicts suicidal behavior. Psychological Science, 21, 511–​517. Nock, M. K., & Prinstein, M. J. (2004). A functional approach to the assessment of self-​mutilative behavior. Journal of Consulting and Clinical Psychology, 72, 885–​890. Nock, M. K., Prinstein, M. J., & Sterba, S. K. (2009). Revealing the form and function of self-​ injurious thoughts and behaviors:  A real-​time ecological assessment study among adolescents and young adults. Journal of Abnormal Psychology, 118, 816–​827. Nock, M. K., Wedig, M. M., Janis, I. B., & Deliberto, T. L. (2008). Self-​ injurious thoughts and behaviors. In J. Hunsley & E. J. Mash (Eds.), A guide to assessments that work (pp. 158–​177). New York, NY: Oxford University Press. Ofek, H., Weizman, T., & Apter, A. (1998). The Child Suicide Potential Scale: Inter-​rater reliability and validity in Israel in-​ patient adolescents. Israel Journal of Psychiatry and Related Sciences, 35, 253–​261. Onnela, J.-​P., & Rauch, S. L. (2016). Harnessing smartphone-​ based digital phenotyping to enhance behavioral and mental health. Neuropsychopharmacology, 41, 1691–​1696. Orbach, I., Milstein, I., Har-​Even, D., Apter, A., Tiano, S., & Elizur, A. (1991). A Multi-​Attitude Suicide Tendency Scale for adolescents. Psychological Assessment:  A Journal of Consulting and Clinical Psychology, 3, 398–​404. Osman, A., Bagge, C. L., Gutierrez, P. M., Konick, L. C., Kopper, B. A., & Barrios, F. X. (2001). The Suicidal Behaviors Questionnaire-​Revised (SBQ-​R):  Validation with clinical and nonclinical samples. Assessment, 8, 443–​454.

21

212

Mood Disorders and Self-Injury

Osman, A., Barrios, F. X., Panak, W. F., Osman, J. R., Hoffman, J., & Hammer, R. (1994). Validation of the Multi-​Attitude Suicide Tendency Scale in adolescent samples. Journal of Clinical Psychology, 50, 847–​855. Osman, A., Downs, W. R., Kopper, B. A., Barrios, F. X., Baker, M. T., Osman, J. R., . . . Linehan, M. M. (1998). The Reasons for Living Inventory for Adolescents (RFL-​ A): Development and psychometric properties. Journal of Clinical Psychology, 54, 1063–​1078. Osman, A., Gifford, J., Jones, T., Lickiss, L., Osman, J., & Wenzel, R. (1993). Psychometric evaluation of the Reasons for Living Inventory. Psychological Assessment, 5, 154–​158. Osman, A., Kopper, B. A., Linehan, M. M., Barrios, F. X., Gutierrez, P. M., & Bagge, C. L. (1999). Validation of the Adult Suicidal Ideation Questionnaire and the Reasons for Living Inventory in an adult psychiatric inpatient sample. Psychological Assessment, 11, 115–​123. Pedrelli, P., Blais, M. A., Alpert, J. E., Shelton, R. C., Walker, R. S. W., & Fava, M. (2014). Reliability and validity of the Symptoms of Depression Questionnaire (SDQ). CNS Spectrums, 19, 535–​546. Pettit, J. W., Garza, M. J., Grover, K. E., Schatte, D. J., Morgan, S. T., Harper, A., & Saunders, A. E. (2009). Factor structure and psychometric properties of the Modified Scale for Suicidal Ideation among suicidal youth. Depression and Anxiety, 26, 769–​774. Pfeffer, C. R., Conte, H. R., Plutchik, R., & Jerrett, I. (1979). Suicidal behavior in latency-​age children:  An empirical study. Journal of the American Academy of Child Psychiatry, 18, 679–​692. Pfeffer, C. R.  M. D., Conte, H. R., Plutchik, R., & Jerrett, I. M.  A. (1980). Suicidal behavior in latency-​age children: An outpatient population. Journal of the American Academy of Child Psychiatry, 19, 703–​710. Pfeffer, C. R. M. D., Newcorn, J. M. D., Kaplan, G. M. D., Mizruchi, M. S., & Plutchik, R. (1988). Suicidal behavior in adolescent psychiatric inpatients. Journal of the American Academy of Child Psychiatry, 27, 357–​361. Pfeffer, C. R.  M. D., Solomon, G. M.  D., Plutchik, R., Mizruchi, M. S., & Weiner, A. (1982). Suicidal behavior in latency-​age psychiatric inpatients:  A replication and cross validation. Journal of the American Academy of Child Psychiatry, 21, 564–​569. Pfeffer, C. R. M. D., Zuckerman, S. P. D., Plutchik, R. P. D., & Mizruchi, M. S. P. D. (1984). Suicidal behavior in normal school children: A comparison with child psychiatric inpatients. Journal of the American Academy of Child Psychiatry, 23, 416–​423. PhenX Toolkit Suicide Workgroup. (2014, April). Epidemiologic/​ survey studies. Paper presented at the PhenX Toolkit Suicide Workgroup meeting, Bethesda, MD.

Pinto, A., Whisman, M. A., & McCoy, K. J.  M. (1997). Suicidal ideation in adolescents: Psychometric properties of the Suicidal Ideation Questionnaire in a clinical sample. Psychological Assessment, 9, 63–​66. Plöderl, M., Kralovec, K., Yazdi, K., & Fartacek, R. (2011). A closer look at self-​reported suicide attempts: False positives and false negatives. Suicide and Life-​Threatening Behavior, 41, 1–​5. Posner, K., Brown, G. K., Stanley, B., Brent, D. A., Yershova, K. V., Oquendo, M. A.,  .  .  .  Shen, S. (2011). The Columbia-​Suicide Severity Rating Scale: Initial validity and internal consistency findings from three multisite studies with adolescents and adults. American Journal of Psychiatry, 168, 1266–​1277. Posner, K., Oquendo, M., Gould, M., Stanley, B., & Davies, M. (2007). Columbia Classification Algorithm of Suicide Assessment (C-​ CASA):  Classification of suicidal events in the FDA’s pediatric suicidal risk analysis of antidepressants. American Journal of Psychiatry, 164, 1035–​1043. Preti, A., Sheehan, D. V., Coric, V., Distinto, M., Pitanti, M., Vacca, I.,  .  .  .  Petretto, D. R. (2013). Sheehan Suicidality Tracking Scale (S-​STS): Reliability, convergent and discriminative validity in young Italian adults. Comprehensive Psychiatry, 54, 842–​849. Prinstein, M. J., Nock, M. K., Spirito, A., & Grapentine, W. L. (2001). Multimethod assessment of suicidality in adolescent psychiatric inpatients:  Preliminary results. Journal of the American Academy of Child and Adolescent Psychiatry, 40, 1053–​1061. Randall, J. R., Rowe, B. H., Dong, K. A., Nock, M. K., & Colman, I. (2013). Assessment of self-​harm risk using implicit thoughts. Psychological Assessment, 25, 714–​721. Rathus, J. H., & Miller, A. L. (2002). Dialectical behavior therapy adapted for suicidal adolescents. Suicide and Life-​Threatening Behavior, 32, 146–​157. Reynolds, W. M. (1987). Suicidal Ideation Questionnaire Junior. Odessa, FL:  Psychological Assessment Resources. Reynolds, W. M. (1988). Suicidal Ideation Questionnaire: Professional manual. Odessa, FL: Psychological Assessment Resources. Reynolds, W. M. (1990). Development of a semistructured clinical interview for suicidal behaviors in adolescents. Psychological Assessment, 2, 382–​390. Reynolds, W. M. (1991a). Adult Suicide Ideation Questionnaire: Professional manual. Odessa, FL: Psychological Assessment Resources. Reynolds, W. M. (1991b). Psychometric characteristics of the Adult Suicidal Ideation Questionnaire in college students. Journal of Personality Assessment, 56, 289. Reynolds, W. M., & Mazza, J. J. (1999). Assessment of suicidal ideation in inner-​ city children and young

 213

Self-Injurious Thoughts and Behaviors

adolescents:  Reliability and validity of the Suicidal Ideation Questionnaire-​JR. School Psychology Review, 28, 17. Rogers, J. R., & Hanlon, P. J. (1996). Psychometric analysis of the College Student Reasons for Living Inventory. Measurement & Evaluation in Counseling & Development, 29, 13. Rudd, M. D. (1989). The prevalence of suicidal ideation among college students. Suicide and Life-​Threatening Behavior, 19, 173–​183. Rudd, M. D., Berman, A. L., Joiner, T. E., Nock, M. K., Silverman, M. M., Mandrusiak, M.,  .  .  .  Witte, T. (2006). Warning signs for suicide:  Theory, research, and clinical applications. Suicide and Life-​Threatening Behavior, 36, 255–​262. Rudd, M. D., Goulding, J., & Bryan, C. J. (2011). Student Veterans:  A national survey exploring psychological symptoms and suicide risk. Professional Psychology: Research and Practice, 42, 354–​360. Rudd, M. D., Joiner, T., & Rajad, M. H. (1996). Relationships among suicide ideators, attempters, and multiple attempters in a young-​adult sample. Journal of Abnormal Psychology, 105, 541–​550. Sansone, R. A., Pole, M., Dakroub, H., & Butler, M. (2006). Childhood trauma, borderline personality symptomatology, and psychophysiological and pain disorders in adulthood. Psychosomatics, 47, 158–​162. Sansone, R. A., Songer, D. A., & Miller, K. A. (2005). Childhood abuse, mental healthcare utilization, self-​ harm behavior, and multiple psychiatric diagnoses among inpatients with and without a borderline diagnosis. Comprehensive Psychiatry, 46, 117–​120. Sansone, R. A., Wiederman, M. W., & Sansone, L. A. (1998). The Self-​Harm Inventory (SHI): Development of a scale for identifying self-​destructive behaviors and borderline personality disorder. Journal of Clinical Psychology, 54, 973–​983. Sheehan, D. V., Alphs, L. D., Mao, L., Li, Q., May, R. S., Bruer, E. H., . . . Williamson, D. J. (2014). Comparative validation of the S-​STS, the ISST-​Plus, and the C–​ SSRS for assessing the suicidal thinking and behavior FDA 2012 suicidality categories. Innovations in Clinical Neuroscience, 11, 32–​46. Sheehan, D. V., Giddens, J. M., & Sheehan, I. S. (2014a). Status update on the Sheehan-​Suicidality Tracking Scale (S-​STS) 2014. Innovations in Clinical Neuroscience, 11, 93–​140. Sheehan, D. V., Giddens, J. M., & Sheehan, K. H. (2014b). Current assessment and classification of suicidal phenomena using the FDA 2012 draft guidance document on suicide assessment: A critical review. Innovations in Clinical Neuroscience, 11, 54–​65. Silverman, M. M., & Berman, A. L. (2014). Suicide risk assessment and risk formulation Part I:  A focus on

213

suicide ideation in assessing suicide risk. Suicide and Life-​Threatening Behavior, 44, 420–​431. Silverman, M. M., Berman, A. L., Sanddal, N. D., O’Carroll, P. W., & Joiner, T. E. (2007). Rebuilding the tower of Babel: A revised nomenclature for the study of suicide and suicidal behaviors Part 2: Suicide-​related ideations, communications, and behaviors. Suicide and Life-​ Threatening Behavior, 37, 264–​277. Simon, N. M., Zalta, A. K., Otto, M. W., Ostacher, M. J., Fischmann, D., Chow, C. W., . . . Pollack, M. H. (2007). The association of comorbid anxiety disorders with suicide attempts and suicidal ideation in outpatients with bipolar disorder. Journal of Psychiatric Research, 41, 255–​264. Steer, R. A., Kumar, G., & Beck, A. T. (1993). Self-​reported suicidal ideation in adolescent psychiatric inpatients. Journal of Consulting and Clinical Psychology, 61, 1096–​1099. Swannell, S. V., Martin, G. E., Page, A., Hasking, P., & St. John, N. J. (2014). Prevalence of nonsuicidal self-​injury in nonclinical samples: Systematic review, meta-​analysis and meta-​regression. Suicide and Life-​ Threatening Behavior, 44, 273–​303. Tarrier, N., Taylor, K., & Gooding, P. (2008). Cognitive–​ behavioral interventions to reduce suicide behavior:  A systematic review and meta-​ analysis. Behavior Modification, 32, 77–​108. U.S. Food and Drug Administration, U.S. Department of Health and Human Services, Center for Drug Evaluation and Research. (2012). Guidance for industry:  Suicidality:  Prospective assessment of occurrence in clinical trials—​Draft guidance. Retrieved from https://​ www.fda.gov/​drugs/​guidancecomplianceregulatoryinformation/​guidances/​ucm315156.htm Venta, A., & Sharp, C. (2014). Extending the concurrent validity of the Self-​Injurious Thoughts and Behaviors Interview to inpatient adolescents. Journal of Psychopathology and Behavioral Assessment, 36, 675–​682. Victor, S. E., Davis, T., & Klonsky, E. D. (2017). Descriptive characteristics and initial psychometric properties of the non-​suicidal self-​injury disorder scale. Archives of Suicide Research, 21, 265–​278. Vuorilehto, M. S., Melartin, T. K., & Isometsä, E. T. (2006). Suicidal behaviour among primary-​care patients with depressive disorders. Psychological Medicine, 36, 203–​210. Walker, R. L., Alabi, D., Roberts, J., & Obasi, E. M. (2010). Ethnic group differences in reasons for living and the moderating role of cultural worldview. Cultural Diversity and Ethnic Minority Psychology, 16, 372–​378. Wang, M.-​C., Nyutu, P. N., & Tran, K. K. (2012). Coping, reasons for living, and suicide in Black college students. Journal of Counseling and Development, 90, 459–​466.

214

214

Mood Disorders and Self-Injury

Washburn, J. J., Juzwin, K. R., Styer, D. M., & Aldridge, D. (2010). Measuring the urge to self-​injure:  Preliminary data from a clinical sample. Psychiatry Research, 178, 540–​544. Westefeld, J. S., Badura, A., Kiel, J. T., & Scheel, K. (1996). Development of the College Student Reasons for Living Inventory with African Americans. Journal of College Student Psychotherapy, 10, 61–​65. Westefeld, J. S., Cardin, D., & Deaton, W. L. (1992). Development of the College Student Reasons for Living Inventory. Suicide and Life-​Threatening Behavior, 22, 442–​453. Westefeld, J. S., Scheel, K., & Maples, M. R. (1998). Psychometric analyses of the College Student Reasons for Living Inventory using a clinical population. Measurement and Evaluation in Counseling and Development, 31, 86. Wetzler, S., Asnis, G. M., Hyman, R. B., Virtue, C., Zimmerman, J., & Rathus, J. H. (1996). Characteristics of suicidality among adolescents. Suicide and Life-​ Threatening Behavior, 26, 37–​45.

Whitlock, J. (2010). Self-​injurious behavior in adolescents. PLoS Medicine, 7, e1000240. Whitlock, J., Eckenrode, J., & Silverman, D. (2006). Self-​ injurious behaviors in a college population. Pediatrics, 117, 1939–​1948. Whitlock, J., Exner-​Cortens, D., & Purington, A. (2014). Assessment of nonsuicidal self-​injury:  Development and initial validation of the Non-​Suicidal Self-​Injury–​Assessment Tool (NSSI-​AT). Psychological Assessment, 26, 935–​946. Witte, T. K., Joiner, T. E., Jr., Brown, G. K., Beck, A. T., Beckman, A., Duberstein, P., & Conwell, Y. (2006). Factors of suicide ideation and their relation to clinical and other indicators in older adults. Journal of Affective Disorders, 94, 165–​172. Youngstrom, E. A., Hameed, A., Mitchell, M. A., Van Meter, A. R., Freeman, A. J., Algorta, G. P., . . . Meyer, R. E. (2015). Direct comparison of the psychometric properties of multiple interview and patient-​rated assessments of suicidal ideation and behavior in an adult psychiatric inpatient sample. Journal of Clinical Psychiatry, 76, 1676–​1682.

Part IV

Anxiety and Related Disorders

11

Anxiety Disorders in Children and Adolescents

Simon P. Byrne, Eli R. Lebowitz, Thomas H. Ollendick, and Wendy K. Silverman

In this chapter, we summarize the research evidence supporting psychological assessment measures and strategies for use with children and adolescents with anxiety disorders. The focus is on assessment measures and strategies that are evidence based, clinically relevant, and feasible for practitioners to use. The focus is also on broad measures of anxiety and related processes, including several that have been developed since the writing of the corresponding chapter for the first edition of this book. Our aim is to inform readers about the scientific suitability of the measures and associated strategies, as well as their utility for specific clinical and research purposes. We begin with a brief description of anxiety disorders in children and adolescents. This is followed by a discussion of the measures and strategies, as well as issues involved in using them to accomplish three primary goals: (a) diagnosis, (b) case conceptualization and treatment planning, and (c) treatment monitoring and evaluation. An "Overall Evaluation" concludes each section. The chapter ends with summary comments and recommendations.

Before proceeding, we note that measures aimed directly at assessing the brain, such as electroencephalography (EEG) and functional magnetic resonance imaging (fMRI), also show promise and are opening areas of inquiry and discovery that were not widely known at the time of the first edition of this volume. Readers are referred to Pine (2011) for further discussion. If the strides made thus far continue at the same rapid pace, we are likely to have a great deal more to say about the utility of these "brain measures" for diagnosis, treatment planning, and evaluation. Also, as in our first edition chapter, we do not cover projective measures and strategies. Despite their use by some in clinical practice, their utility lacks adequate empirical evidence (Lilienfeld, Wood, & Garb, 2000).

NATURE OF THE DISORDERS

Anxiety problems are among the most common forms of emotional disturbance in children and adolescents (e.g., Ollendick & March, 2004; Rapee, Schniering, & Hudson, 2009). Although mild anxiety is often transient and short-lived, anxiety disorders can be chronic and interfere substantially with adaptive functioning. Many persist into adulthood, and many adult anxiety disorders appear to have their onset in childhood or adolescence (see Ollendick & Seligman, 2006; Saavedra & Silverman, 2002). As defined most recently in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013), the most common anxiety disorder subtypes relevant to children are specific phobia (SP), separation anxiety disorder (SAD), selective mutism, social phobia (SOP), and generalized anxiety disorder (GAD). The removal of obsessive-compulsive disorder (OCD) from the anxiety disorders and the reclassification of post-traumatic stress disorder (PTSD) and acute stress disorder as trauma- and stressor-related disorders are new features of DSM-5. Selective mutism is now classified as an anxiety disorder. Furthermore, the DSM-5 separates panic disorder (PD) and agoraphobia (AG) into two distinct diagnoses.

Epidemiology

Anxiety disorders are relatively common in youths, with Rapee et al. (2009) reporting that 2.5% to 5% of children and adolescents experience an anxiety disorder at any one time. Merikangas et al. (2010) reported a lifetime prevalence of anxiety disorders in youths as high as 31.9%. Anxiety disorders are associated with impairment in areas that include family relationships (Ezpeleta, Keeler, Alaatin, Costello, & Angold, 2001), peer relationships, and academic performance (Essau, Conradt, & Petermann, 2000, 2002). Anxiety often starts early and can be chronic; anxiety symptoms can begin as early as preschool age and continue into adulthood (e.g., Rapee, 2011; Rapee et al., 2009). Merikangas et al. (2010) reported that adolescents who experienced an anxiety disorder had a median age of onset of 6 years. Sometimes anxiety symptoms spontaneously remit, yet more often symptoms persist unless treated (e.g., Rapee et al., 2009).

Whether impairment is considered when estimating prevalence can influence the rates reported in research studies (i.e., rates are lower if impairment is considered). Costello, Egger, Copeland, Erkanli, and Angold (2011) described such prevalence estimates by diagnosis for young people aged 2-21 years based on a meta-analysis of 55 studies. They found a point prevalence estimate of 10.2% for all anxiety disorders (standard error [SE] = 0.5%), 5.4% for SP (SE = 0.08%), 3.6% for SOP (SE = 0.70%), 2.6% for SAD (SE = 0.5%), and 0.8% for panic disorder (SE = 0.02%).

Anxiety disorders are more prevalent in girls than boys (e.g., Costello, Mustillo, Erkanli, Keeler, & Angold, 2003; Essau et al., 2000). The general pattern across community studies is that girls report higher and more intense normative, subclinical, and clinical levels of fear, worry, and anxiety than do boys. Clinic studies have found more inconsistent patterns of prevalence rates by sex, with rates varying by anxiety disorder subtype (Silverman & Carter, 2006). Such inconsistencies could be related in part to differences in families' perceived need and willingness to seek mental health services for their sons and daughters. Just what these differences are with regard to variables such as race and ethnicity, socioeconomic status, and parents' psychopathology requires further study.

Anxiety is often associated with comorbidity, including depression, other anxiety disorders, and externalizing disorders (Bittner et al., 2007; Essau, 2003). Autism is another common comorbid disorder, and some of the research on this topic is summarized later in the chapter.
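As a rough illustration of the precision of these estimates, the sketch below converts the point estimates and standard errors reported by Costello et al. (2011), as quoted above, into approximate 95% confidence intervals using the standard normal approximation (estimate plus or minus 1.96 times the SE). The calculation is illustrative only and is not part of the original meta-analysis.

# Illustrative only: approximate 95% confidence intervals for the prevalence estimates
# quoted above, assuming a normal approximation (estimate +/- 1.96 * SE); values in percent.
estimates = {
    "All anxiety disorders": (10.2, 0.5),
    "Specific phobia (SP)": (5.4, 0.08),
    "Social phobia (SOP)": (3.6, 0.70),
    "Separation anxiety disorder (SAD)": (2.6, 0.5),
    "Panic disorder": (0.8, 0.02),
}
for label, (prevalence, se) in estimates.items():
    lower, upper = prevalence - 1.96 * se, prevalence + 1.96 * se
    print(f"{label}: {prevalence}% (approx. 95% CI {lower:.2f}%-{upper:.2f}%)")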

Etiological Factors

In this section, we provide an overview of the major etiological factors implicated in the development of anxiety in childhood and adolescence. Although each factor is discussed separately, there are likely multiple pathways to a given anxiety disorder in youth, reflecting complex transactions among multiple factors (Cicchetti & Cohen, 1995). Depending on the configuration of other factors with which the disorder occurs, any given pathway may lead to several different anxiety disorders, to other disorders, or to no disorder at all. Such a position is consistent with a developmental psychopathology perspective on the development of anxiety disorders (see Silverman & Ollendick, 1999; Weems & Silverman, 2006).

Genetic Factors

Anxiety, as a trait, is believed to be influenced by multiple genes combining with multiple environmental factors. Researchers have worked toward identifying candidate genes associated with the development of childhood anxiety. The serotonin transporter polymorphism (5-HTTLPR) is considered a viable candidate because serotonin is implicated in mood. However, 5-HTTLPR has generally shown an inconsistent association with anxiety in both children and adults (Gregory & Eley, 2011). Other potential candidate genes include catechol-O-methyltransferase (COMT), the dopamine D4 receptor (DRD4), and the gamma-aminobutyric acid (GABA) system. For these candidates too, however, results regarding their role in the etiology of anxiety in children and adolescents have been inconsistent (see Gregory & Eley, 2011, for a review).

Temperamental Factors

Negative affectivity (Watson & Clark, 1984) and similar temperamental dimensions, including neuroticism (Eysenck & Eysenck, 1985) and behavioral inhibition (BI) to the unfamiliar (Reznick, Hegeman, Kaufman, Woods, & Jacobs, 1992), have been shown to increase risk for anxiety and to be moderately heritable (Lonigan, Phillips, Wilson, & Allan, 2011). Of these overlapping constructs, BI has received the most attention as a risk factor. BI is associated with heightened risk for anxiety disorders in children, particularly in the subset of children who show stable BI from infancy through middle childhood and into adolescence (Turner, Beidel, & Wolff, 1996). Because studies also show that many children with high BI do not develop anxiety disorders and uninhibited children sometimes do, BI is neither sufficient nor necessary to produce anxiety disorders. Paths to anxiety disorders involving BI or similar anxiety-prone temperamental factors also involve shared and unshared environmental factors (Eley, 2001). A rather consistent finding is that BI more often predicts social anxiety. For example, Hirshfeld-Becker et al. (2007) found that BI assessed at 1.5-6 years was predictive of social anxiety 5 years later. It remains unclear whether BI is a general risk for anxiety, a specific risk for social anxiety (Ollendick & Benoit, 2012), or a risk for psychopathology more broadly construed (Muris & Ollendick, 2005). In a cross-sectional study of the concurrent associations among attachment security, behavioral inhibition, maternal anxiety, and child anxiety in an at-risk sample of infants, Shamir-Essakow, Ungerer, and Rapee (2005) found that insecure attachment was associated with BI, even after controlling for the effect of maternal anxiety.

Exposure to Stressful Events and Uncontrollable Environments

Exposure to severe and chronic life stressors also contributes to the onset of anxiety disorders in childhood (e.g., Allen, Rapee, & Sandberg, 2008). The controllability of environmental events, especially early in childhood, may be particularly important (Weems & Silverman, 2006). Early exposure to controllable environments appears to reduce anxiety, whereas uncontrollable environments may predispose individuals to anxiety. For example, infant rhesus monkeys exposed to chronically uncontrollable environments responded to novel stimuli with greater fear and less exploration (Mineka, Gunnar, & Champoux, 1986), as well as higher cortisol levels (Insel, Scanlan, & Champoux, 1988), compared with monkeys that had control over their environment. Studies with children and adolescents also support the predisposing role of uncontrollability in anxiety (e.g., Weems, Silverman, Rapee, & Pina, 2003) and the protective role of controllable experiences (Weems & Silverman, 2006).

Learning Influences

Rachman (1977) proposed an influential theoretical model of fear acquisition, whereby new fearful learning is acquired via direct conditioning experience with the feared stimulus, through observation of others, and through negative information regarding the feared stimulus. The principles of direct conditioning suggest several mechanisms by which environmental experiences may predispose to, precipitate, or protect against the development of anxiety
disorders (Bouton, Mineka, & Barlow, 2001). Consistent with this view, evidence suggests that a substantial percentage of children and adolescents with fears and phobias have a history of direct or indirect conditioning (Ollendick & King, 1991). However, even severely traumatic experiences are not always sufficient to produce phobic anxiety (e.g., Vernberg, La Greca, Silverman, & Prinstein, 1996), and traumatic conditioning episodes are not necessary causes because phobic anxiety can develop in their absence (Menzies & Clarke, 1995). Traumatic conditioning episodes appear to interact with predisposing factors such as temperament and prior learning history to produce heightened risk for phobic responses in vulnerable individuals. Growing evidence further suggests that direct conditioning experiences may account for only a small percentage of childhood phobias, with observation-based learning and information-processing modes of acquisition being predominant (e.g., Field & Lawson, 2003).

Family Processes

Attachment theory provides a framework for understanding the enduring bonds that human infants form with their caregivers, for the classification of those bonds based on the quality of the attachment, and for conceptualizing the long-term impact of these attachments on human behavioral and emotional patterns (e.g., Bowlby, 1977; Esbjørn, Bender, Reinholdt-Dunne, Munck, & Ollendick, 2012). Warren, Huston, Egeland, and Sroufe (1997) found that children classified as anxious/resistant in their attachment (assessed at 12 months of age) were more likely to have anxiety disorders at 17 years of age than were children classified with other types of attachment, even when controlling for temperament and maternal anxiety. Insecure attachment also has been linked with increased levels of anxiety sensitivity (Weems, Berman, Silverman, & Rodriguez, 2002). The risk associated with insecure attachment status, however, is likely to depend on the co-occurrence of other predisposing factors, such as a BI temperament. For example, Warren and Simmens (2005) followed 1,200 infants with sensitive mothers and found that they showed fewer anxiety and depressive symptoms at 2 to 3 years of age. They also found that children with difficult temperaments whose mothers were sensitive were less likely to have depression and anxiety. Dallaire and Weinraub (2007) examined attachment security at 15 months and anxiety at 4.5 years and found that insecurely attached children who experienced negative life events exhibited more anxiety than did securely attached children.

Parenting behavior, particularly parenting that is viewed as overcontrolling, overinvolved, dependent, or intrusive, has also been linked to the development of childhood anxiety. Overall, research suggests that parents who exhibit such parenting behavior may (a) prevent youth from facing fear-provoking events, a developmentally important task that allows children to develop solutions for facing fear; and/or (b) send a message that particular stimuli are threatening or dangerous, which may reinforce avoidant behavior (see Silverman & Nelles, 1988; Vasey & Ollendick, 2000). Such parenting behavior, however, likely interacts in important ways with characteristics of the child. For example, the presence of anxiety in either the child or the mother in mother-child dyads elicited maternal overcontrol during their interactions (Whaley, Pinto, & Sigman, 1999).

Recently, there has been a surge of interest in the role of family accommodation (FA) in childhood anxiety disorders (e.g., Lebowitz et al., 2013; Norman, Silverman, & Lebowitz, 2015). Family accommodation describes the changes that parents make to their own behavior to help their child avoid or alleviate distress and anxiety. FA can reduce a child's anxiety in the short term, but it is likely to impede the child's development of more independent coping and self-regulation skills, to promote and facilitate ongoing avoidance, and to hinder the child's sense of self-efficacy (e.g., Norman et al., 2015). Examples of accommodation include parents speaking for a socially anxious child, providing excessive reassurance to a child with generalized anxiety, or allowing a child with separation anxiety to sleep in their bed (e.g., Norman et al., 2015). FA was initially studied in families of children with OCD (Calvocoressi et al., 1995; Lebowitz, Panza, Su, & Bloch, 2012); however, there is now ample evidence that FA is highly prevalent across the anxiety disorders, is associated with greater symptom severity, and may predict poor treatment outcomes (Jones, Lebowitz, Marin, & Stark, 2015; Lebowitz, Panza, & Bloch, 2016; Lebowitz, Scharfstein, & Jones, 2014, 2015). Of emergent interest is FA's biological basis vis-à-vis childhood anxiety. Recent research has indicated that higher levels of FA are associated with low levels of salivary oxytocin in anxious youth (Lebowitz, Leckman, Feldman, et al., 2016), an intriguing finding given the role of the oxytocinergic system in both anxiety regulation and the modulation of close interpersonal and attachment behavior (Feldman, 2016; MacDonald & Feifel, 2014). Later in this chapter, we discuss recent developments in measuring FA.

Cognitive Biases and Distortions

Childhood anxiety disorders are associated with a variety of information-processing biases at various stages, including encoding, interpretation, and recall (see Vasey, Dalgleish, & Silverman, 2003). The attentional and interpretational biases present in adults are also present in children (Field & Lester, 2010). Clinically anxious and highly test-anxious children, for example, show an attentional bias in favor of threat-relevant stimuli (Vasey, Daleiden, Williams, & Brown, 1995). Compared with normal controls, clinically anxious and highly test-anxious children also show a bias toward interpreting ambiguous information as threatening (Dadds, Barrett, Rapee, & Ryan, 1996). Whether attention and interpretation biases predispose individuals to anxiety or result from it, once present these biases seem to foster the maintenance and intensification of anxiety (Vasey et al., 2003). By virtue of their tendency to show attentional biases toward threat cues and to interpret ambiguous information as threatening, anxious children and adolescents construct their own anxiogenic experiences. Anxiety sensitivity, the belief that anxiety sensations have negative social, psychological, and/or physical consequences, is another cognitive factor implicated in the etiology of anxiety disorders, especially panic attacks and panic disorder (e.g., Ollendick, 1998; Silverman & Weems, 1998).

This emergent research has clinical and translational implications. For example, attention training away from threat can be used to reduce anxiety symptoms in youths (e.g., Cowart & Ollendick, 2011; Rozenman, Weersing, & Amir, 2011). Researchers have also more recently used a visual search paradigm, which requires participants to make decisions about the presence or absence of a specific target among distractors. Findings have shown that children with high levels of anxiety symptoms detect angry faces more efficiently than either neutral or happy faces (e.g., Perez-Olivas, Stevenson, & Hadwin, 2008; Waters, Henry, Mogg, Bradley, & Pine, 2010; Waters & Lipp, 2008).
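The attentional bias findings described above are typically quantified from reaction times. As a simplified illustration, the sketch below computes a generic threat-detection advantage score from hypothetical visual search reaction times; the values and the index itself are illustrative assumptions and do not reproduce the scoring procedures used in the studies cited.

# Illustrative only: a generic threat-detection advantage index computed from hypothetical
# visual search reaction times (ms). Positive values indicate faster detection of angry-face
# targets than happy-face targets. This is not the scoring used in the studies cited above.
from statistics import mean

rt_angry_target = [642, 655, 630, 648, 661]   # hypothetical correct-trial RTs (ms)
rt_happy_target = [701, 690, 712, 688, 705]   # hypothetical correct-trial RTs (ms)

threat_advantage = mean(rt_happy_target) - mean(rt_angry_target)
print(f"Threat-detection advantage: {threat_advantage:.1f} ms")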

Summary

Anxiety disorders are among the most common mental disorders in childhood and adolescence. Prevalence estimates vary depending on whether the samples are clinic-referred or community-based, whether impairment is considered, and other sample characteristics. The etiology of anxiety disorders is multifactorial, complex, and overdetermined, including both biological and environmental determinants (Lebowitz, Leckman, Silverman, & Feldman, 2016). There is likely more than one pathway to any one disorder. Likewise, the phenotype and presentation of anxiety disorders are multifaceted, including a wide variety of neurological, physiological, cognitive, emotional, and behavioral manifestations. The measures and strategies used to assess anxiety disorders depend on the specific assessment purpose. We turn now to a main assessment purpose: diagnosis.

ASSESSMENT FOR DIAGNOSIS

In this section, we consider assessment measures and strategies most useful for deriving anxiety disorder diagnoses in children and adolescents, namely diagnostic interview schedules. Emphasis is placed on the Anxiety Disorders Interview Schedule for Children:  Child and Parent Versions, which has the most research support for deriving reliable and valid diagnoses. The utility of rating scales for the purpose of diagnosis is also covered. Next, we discuss “best practices” with respect to conceptual and practical issues in diagnosis, including differential diagnosis. We conclude this section with an overall evaluation of available assessment instruments.

Semi-Structured and Structured Diagnostic Interview Schedules

The use of semi-structured and structured interview schedules represents best practice for the purpose of deriving an anxiety disorder diagnosis in children and adolescents. A number of diagnostic interview schedules have been developed to cover the different types of anxiety disorders specified in the DSM-5. The most widely used interview schedules for diagnosing clinical disorders of childhood and adolescence, including the anxiety disorders, are presented in Table 11.1. Compared with unstructured clinical interviews, semi-structured and structured interviews are more standardized in terms of the questions that are asked of informants. The increased standardization reduces error variance attributed to interviewers, as well as variance in the usage of diagnostic criteria (Silverman, 1994).

TABLE 11.1  Ratings of Instruments Used for Diagnosis

Instrument     Norms  Internal Consistency  Inter-Rater Reliability  Test-Retest Reliability  Content Validity  Construct Validity  Validity Generalization  Clinical Utility  Highly Recommended

Diagnostic Interview Schedules
ADIS C/P-IV    NA     NA                    E                        E                        E                 E                   E                        E                 ✓
DISC-IV        NA     NA                    A                        G                        G                 G                   G                        G
DICA           NA     NA                    A                        G                        G                 G                   G                        G
K-SADS         NA     NA                    G                        A                        A                 A                   G                        A

Child Self-Rating Scales
RCMAS          E      G                     NA                       G                        G                 E                   G                        G
STAIC          G      G                     NA                       G                        G                 G                   G                        A
FSSC-R         A      E                     NA                       G                        E                 G                   G                        G
MASC           G      G                     NA                       G                        G                 E                   E                        G
SCARED         G      E                     NA                       G                        G                 G                   G                        G
SCAS           A      G                     NA                       A                        E                 E                   G                        G
SPAIC          A      G                     NA                       E                        E                 G                   G                        G
SASC-R         A      G                     NA                       G                        G                 G                   G                        G
CASI           A      G                     NA                       G                        G                 G                   G                        G

Note: ADIS C/P-IV = Anxiety Disorders Interview Schedule; DISC-IV = Diagnostic Interview Schedule for Children, Version IV; DICA = Diagnostic Interview Schedule for Children and Adolescents; K-SADS = Schedule for Affective Disorders and Schizophrenia for School-Age Children; RCMAS = Revised Children's Manifest Anxiety Scale; STAIC = State-Trait Anxiety Inventory for Children; FSSC-R = Fear Survey Schedule for Children-Revised; MASC = Multidimensional Anxiety Scale for Children; SCARED = Screen for Child Anxiety-Related Emotional Disorders; SCAS = Spence Children's Anxiety Scale; SPAIC = Social Phobia and Anxiety Inventory for Children; SASC-R = Social Anxiety Scale for Children-Revised; CASI = Children's Anxiety Sensitivity Index; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

Anxiety Disorders Interview Schedule for Children: Child and Parent Versions

The Anxiety Disorders Interview Schedule for Children: Child and Parent Versions (ADIS-C/P) is the most widely used semi-structured interview schedule in the field, including in randomized clinical trials. A downward extension of the adult interview (Brown, DiNardo,
& Barlow, 1994), the ADIS-​C/​P was constructed initially to allow for DSM-​III (APA, 1980) and DSM-​III-​R (APA, 1987) diagnoses (Silverman, 1991), and it was revised for DSM-​IV (APA, 1994; ADIS for DSM-​IV: C/​P; Silverman & Albano, 1996)  and for DSM-5 (Albano & Silverman, 2017). The DSM-​5 version contains additional modules to reflect some of the other changes made to the system not mentioned previously (e.g., inclusion of hoarding as a psychiatric disorder). Additional questions are included that allow interviewers to obtain information about the history of the problem as well as factors that may maintain the anxiety. In addition to containing modules of almost all the disorders included in the DSM-​5, with heaviest coverage of the anxiety and related disorders, the interview contains clinician severity rating scales that assess for degree of impairment or interference in child functioning associated with the specific anxiety disorder endorsed by the child and parent, respectively. Based on the information obtained from the child and parent versions of the interview, interviewers assign the degree of distress and interference associated with each disorder (0 = “none” to 8 = “very severely disturbing/​impairing”) overall with respect to peers, schoolwork, family life, and personal distress. Each module also contains questions that allow interviewers to assign 0 to 8 ratings regarding fear and avoidance of diverse situations relevant to a specific disorder (e.g., SOP and SP). Similar to the adult interview, clinician severity ratings of 4 (“definitely disturbing/​impairing”) or higher are viewed as “clinical” diagnoses, and those less than 4 are viewed as “subclinical” or subthreshold. The clinical severity ratings are further discussed later in the chapter.
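For readers who want to see the clinician severity rating (CSR) convention in concrete terms, the following minimal Python sketch applies the rule described above (ratings of 4 or higher treated as clinical, lower ratings as subclinical). The module names and ratings are hypothetical and are not drawn from the interview itself.

```python
# Minimal sketch (hypothetical data): the ADIS-C/P convention that a clinician
# severity rating (CSR) of 4 or higher is treated as a clinical diagnosis and a
# rating below 4 as subclinical/subthreshold.
CLINICAL_CUTOFF = 4  # "definitely disturbing/impairing"

def classify_csr(csr_by_module):
    """Label each interview module as clinical or subclinical from its 0-8 CSR."""
    return {
        module: ("clinical" if rating >= CLINICAL_CUTOFF else "subclinical")
        for module, rating in csr_by_module.items()
    }

# Hypothetical composite ratings derived from the child and parent interviews.
ratings = {
    "Separation Anxiety Disorder": 5,
    "Social Phobia": 3,
    "Generalized Anxiety Disorder": 6,
}
print(classify_csr(ratings))
```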

Research has confirmed empirically the validity of diagnoses formulated using the ADIS-​IV:  C/​P by showing that scores on child and parent rating scales converge in expected ways with diagnoses (e.g., Weems, Silverman, Saavedra, Pina, & Lumpkin, 1999; Wood, Piacentini, Bergman, McCracken, & Barrios, 2002). Wood et  al. (2002), for example, evaluated concurrent validity of ADIS-​IV: C/​P diagnoses of SOP, SAD, GAD, and PD in children and adolescents referred to an outpatient anxiety disorders clinic. Specifically, high correspondence was found between ADIS-​IV: C/​P diagnoses and empirically derived factor scores corresponding to each of these diagnoses on the Multidimensional Anxiety Scale for Children (MASC; March, Parker, Sullivan, Stallings, & Conners, 1997), with the exception of GAD. A  DSM-​5 version of the ADIS Child and Parent interview schedules has recently been developed (Albano & Silverman, 2017). The psychometric data found for the DSM-​IV version are likely to generalize to the DSM-​5 version given the high overlap and similarity between the two. Selective mutism, previously classified under “Disorders Usually First Diagnosed in Infancy, Childhood, or Adolescence,” now classified as an anxiety disorder, is similarly likely to be assessed in a reliable and valid manner with the DSM-​ 5 interview. This is because of the high concordance between children and parents in their respective interviews that correspond with a diagnosis of selective mutism. For example, parents have little difficulty responding in the affirmative to the selective mutism question, whereas children, although not usually responding verbally, will shake their heads or point to the words “Yes” or “No” printed on cards.

Reliability of Diagnoses

A number of studies conducted in university-based research clinics have confirmed empirically the reliability of diagnoses formulated using the ADIS for DSM-IV: C/P, including inter-rater reliability (e.g., Grills & Ollendick, 2003; Silverman et al., 1988), retest reliability of specific diagnoses (Silverman & Eisen, 1992), and retest reliability of symptom patterns (Silverman & Rabian, 1995). Lyneham, Abbott, and Rapee (2007) also found that when using both child and parent reports on the ADIS, the level of agreement between independent raters for anxiety diagnoses was excellent. In a sample of children and adolescents with specific phobias, Reuterskiöld, Ost, and Ollendick (2008) also found excellent parent–child agreement on specific phobia diagnosis and moderate levels of agreement for comorbid diagnoses.
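The inter-rater agreement findings reviewed above are typically summarized with chance-corrected indices such as Cohen's kappa. The following sketch shows one straightforward way to compute kappa for two raters' categorical diagnoses; the rater data are invented for illustration and do not come from the cited studies.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical diagnoses."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(rater_a) | set(rater_b)) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical primary diagnoses assigned independently by two interviewers.
rater_1 = ["SAD", "GAD", "SOP", "SP", "GAD", "SAD", "SOP", "GAD"]
rater_2 = ["SAD", "GAD", "SOP", "GAD", "GAD", "SAD", "SOP", "SP"]
print(round(cohens_kappa(rater_1, rater_2), 2))  # about 0.65 for these invented data
```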

Additional Diagnostic Interview Schedules

Other diagnostic interview schedules available to assess for anxiety disorders in children and adolescents are the Diagnostic Interview Schedule for Children (DISC-IV; Shaffer, Fisher, Lucas, Dulcan, & Schwab-Stone, 2000), the Diagnostic Interview for Children and Adolescents (DICA; Reich, 2000), and the Schedule for Affective Disorders and Schizophrenia in School-Age Children (K-SADS; Ambrosini, 2000). Similar to the ADIS-IV: C/P, these structured or semi-structured interview schedules have child and parent versions, assess most of the DSM-IV disorders of childhood and adolescence beyond anxiety, and can be used across a wide age range of children. Diagnoses are formulated upon completion of both child and parent versions, and they are determined by rules derived by the interview developers.


Rating Scales

A number of self-rating scales are available to assess anxiety in children and adolescents. The most widely used scales are presented in Table 11.1. Also contained in the table are scales that are not discussed in this narrative section. Most are omnibus measures and were not designed to identify specific anxiety disorders. Historically, the Revised Children's Manifest Anxiety Scale (RCMAS; Reynolds & Richmond, 1985) and the State–Trait Anxiety Inventory for Children (STAIC; Spielberger, 1973) were used to identify the presence of anxiety and to quantify anxiety symptoms in youth (Silverman & Saavedra, 2004). The RCMAS is a 37-item (28 Anxiety and 9 Lie items) rating scale (yes/no) that contains three subscales: physiological, worry/oversensitivity, and concentration. The STAIC is a 20-item rating scale that assesses the chronic (trait) and acute (state) symptoms of anxiety using a 3-point scale (hardly ever, sometimes, and often).

The Fear Survey Schedule for Children-Revised (FSSC-R; Ollendick, 1983) has been most widely used to assess fear. Containing 80 items, which are rated along a 3-point scale (none, some, and a lot), the factor scales consist of the following: fear of failure and criticism, fear of the unknown, fear of danger and death, medical fears, and fear of small animals.

Discriminant Validity of Youth Anxiety Self-Rating Scales

The chapter in the first edition summarized studies that examined the ability of the RCMAS and STAIC scores to discriminate between youth with anxiety disorders and youth with no disorders or youth with other disorders (Lonigan, Carey, & Finch, 1994; Perrin & Last, 1992), as well as the results of a meta-analysis of 43 published studies (Seligman, Ollendick, Langley, & Baldacci, 2004). The overall conclusion, which still remains valid, is that although both measures can discriminate anxiety disorders from no disorders, there is little support for the ability of RCMAS and STAIC scores to discriminate between anxiety disorders and other disorders, specifically oppositional and conduct disorders and depressive disorder.

Recently, the following anxiety scales have become more widely used by researchers: the MASC, the Screen for Child Anxiety-Related Emotional Disorders (SCARED; Birmaher et al., 1997), and the Spence Children's Anxiety Scale (SCAS; Spence, 1998). The latest revision of the MASC, the MASC-2 (March, 2013), is described later. The SCARED is a 38-item rating scale that assesses symptoms of SAD, GAD, SOP, and school phobia using a 3-point scale (not true or hardly ever true, sometimes true, and often true). The SCAS is a 44-item rating scale that assesses symptoms of SAD, GAD, SOP, OCD, PD/AG, and fears of physical injury using a 4-point scale (never, sometimes, often, and always). Scores on the MASC, SCARED, and SCAS have been found to have high internal consistency and to be able to discriminate "anxiety disorders" from "no anxiety disorders," as well as among the anxiety disorder subtypes to some extent (e.g., SAD vs. SOP; Muris, Merckelbach, Ollendick, King, & Bogie, 2002). As with the RCMAS and STAIC, their associations with depression are also positive and significant. Specifically, respective total scores on the SCARED, SCAS, and MASC correlate near or above .70 with the Children's Depression Inventory (Kovacs, 1992), similar to the correlations found with the RCMAS and STAIC. Only the FSSC-R showed clear divergent validity with depression (Muris et al., 2002).

Using receiver operating characteristic (ROC) curves to estimate diagnostic accuracy across the range of scores on specific scales, Dierker et al. (2001) compared the RCMAS and the MASC and also included a ROC analysis of a depression self-rating scale, the Center for Epidemiologic Studies-Depression Scale (CES-D), in a school-based sample survey of ninth-grade students. Students scoring at or above the 80th percentile on any one or more of the three rating scales and a random sample scoring below this threshold participated in ADIS-C interviews within 2 months of the screening sessions. Results indicated that MASC scores were only partially successful in identifying GAD, and only among the girls.

More encouraging are findings by Villabø, Gere, Torgersen, March, and Kendall (2012). Villabø and colleagues found the MASC had moderate to high internal consistency across the subscales in a treatment-seeking sample. MASC scores also were able to successfully discriminate those with and without anxiety disorders, especially SAD and SOP, but less so for GAD. Wei et al. (2014) found the MASC had low agreement between parent and child; however, it had good internal consistency across subscales and informants. They found it could discriminate between youth with and without anxiety disorders. In an inpatient sample, Skarphedinsson, Villabø, and Lauth (2015) found the MASC could detect whether or not the patient had any anxiety disorder moderately well, but it had limited utility in detecting specific anxiety disorders, apart from GAD. Overall, recent analyses of the MASC reveal a generally similar picture as that presented by earlier analyses: It is useful in differentiating anxiety disorders from no disorder but less so in differentiating among anxiety disorders or between anxiety disorders and other disorders, and its ability to do so decreases as comorbidity increases. Further work is needed on its utility as a screen.
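The ROC logic used by Dierker et al. (2001) and the later screening studies can be illustrated with simulated data. The sketch below assumes a hypothetical anxiety screening total score and interview-based diagnoses, and uses scikit-learn to estimate the area under the curve and the sensitivity and specificity at one candidate cut score; none of the numbers correspond to a published sample.

```python
# Sketch of the ROC approach described above, using simulated data only:
# screen total scores are compared against interview-based diagnoses
# (1 = anxiety disorder, 0 = no anxiety disorder).
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
diagnosis = np.repeat([0, 1], [300, 100])              # 25% base rate, hypothetical
scores = np.concatenate([rng.normal(40, 10, 300),      # simulated non-cases
                         rng.normal(55, 10, 100)])     # simulated cases

auc = roc_auc_score(diagnosis, scores)
fpr, tpr, thresholds = roc_curve(diagnosis, scores)

# Sensitivity and specificity at one candidate cut score (arbitrary value of 50).
cutoff = 50
sensitivity = np.mean(scores[diagnosis == 1] >= cutoff)
specificity = np.mean(scores[diagnosis == 0] < cutoff)
print(f"AUC = {auc:.2f}, sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

Because sensitivity, specificity, and especially positive predictive value shift with the base rate of disorder in the setting, the same exercise would need to be repeated with locally appropriate prevalence assumptions before a cut score is adopted.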



Compared with the MASC, less analysis has occurred with the SCARED and SCAS. Brown-Jacobsen, Wallace, and Whiteside (2011) examined parent, child, and clinician agreement across the SCAS rating subscales, as well as their predictive value. Results indicated that parent and child agreement on the SCAS was moderate to high for most symptoms, consistent with clinician ratings, and both child and parent provided unique diagnostic information.

Updated Versions of Youth Self-Rating Scales

Since the original version of this chapter, modifications or revisions of several of the rating scales have been made. These include a short form (SF) of the Fear Survey Schedule for Children (FSSC-R-SF; Muris, Ollendick, Roelofs, & Austin, 2014) and the updated MASC-2 (March, 2013) and RCMAS-2 (Reynolds & Richmond, 2008). The FSSC-R-SF is a shortened 25-item version of the 80-item FSSC-R (Ollendick, 1983). The FSSC-R-SF contains the 5 items from each factor of the FSSC-R (i.e., "fear of failure," "fear of death," "fear of small animals," "medical fears," and "fear of the unknown") that have the highest factor loadings. Fear of failure items, for example, are "being teased" and "failing a test." Muris et al. (2014) reported that the FSSC-R-SF scores had good internal consistency (Cronbach's α = .87 to .91), and both convergent validity and discriminant validity were demonstrated. Although more research is needed, including the gathering of normative data, Muris et al. (2014) highlighted that this short form may have utility as a brief screen for childhood fears and phobias.

The MASC-2 was designed to improve upon the MASC by assessing a broader range of anxiety symptoms using the following six scales: Separation Anxiety/Phobias, GAD Index, Social Anxiety, Obsessions and Compulsions, Physical Symptoms, and Harm Avoidance. The MASC-2 has 50 items, compared to 39 items in the original. MASC-2 norms were established using a sample of 3,400 youths (aged 8–19 years) and 1,600 parent reports. The MASC-2 includes two new scales that measure GAD and OCD symptoms. The MASC-2 also contains a new "Inconsistency Index," in which eight pairs of items assess for identical content and thus can be used to determine the reliability of respondents' ratings. Also new is an "Anxiety Probability Score" for each subscale, which estimates the probability that the respondent has a clinically significant anxiety disorder. The MASC-2 has a self-report version for the child (MASC-2-SR), as well as a parallel version for the parent (MASC-2-P). The MASC-2-SR has shown generally strong psychometric properties (see Fraccaro, Stelniki, & Nordstokke, 2015). For internal consistency, the MASC-2-SR has a coefficient α of .92 for the total score and a median α value of .79 for the scales and subscales. The MASC-2-SR also has shown high test–retest reliability estimates, with corrected correlation values ranging from .80 to .94. It has also shown strong convergent validity with other anxiety measures. For example, the MASC-2-SR is moderately to highly correlated with the Beck Youth Inventory-Anxiety (r = .73; Beck, Beck, & Jolly, 2001).
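Because internal consistency (coefficient alpha) figures so prominently in the evaluations above, a compact illustration of how alpha is computed may be useful. The sketch below uses invented item responses and is not tied to any of the scales discussed.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha; items is a 2-D array with rows = respondents, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 0-3 ratings from six respondents on a five-item subscale.
responses = np.array([
    [2, 3, 2, 3, 2],
    [0, 1, 0, 1, 1],
    [3, 3, 2, 3, 3],
    [1, 1, 1, 2, 1],
    [2, 2, 3, 2, 2],
    [0, 0, 1, 0, 1],
])
print(round(cronbach_alpha(responses), 2))
```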

The Revised Children's Manifest Anxiety Scale, Second Edition (RCMAS-2; Reynolds & Richmond, 2008) has 49 yes/no items, compared with 37 in the initial RCMAS (Reynolds & Richmond, 1978). It is based on a more ethnically diverse norming sample (N = 3,086; 6- to 19-year-olds). The RCMAS-2 subscales are the same as those in the RCMAS (i.e., physiological anxiety, worry, and social anxiety). The previous "lie" scale, now referred to as the "defensiveness" scale, contains 9 items. It measures the extent to which respondents try to present themselves in a positive light. Ang, Lowe, and Yusof (2011) examined the psychometric properties of the RCMAS-2 in a sample of 1,618 Singaporean students. They found evidence of the utility of the RCMAS-2 in this sample, suggesting the measure has cross-cultural value in an Asian sample and that the U.S. norms are still appropriate. The RCMAS-2 also includes a short form, containing the first 10 items of the measure. It can be completed in less than 10 minutes.

Rating Scales for Specific Anxiety Domains

Several other rating scales are available to assess more specific anxiety domains (see Table 11.1). Two relevant to social anxiety and worth highlighting are the Social Phobia and Anxiety Inventory for Children (SPAIC; Beidel, Turner, & Morris, 1999) and the Social Anxiety Scale for Children-Revised Version (SASC-R; La Greca & Stone, 1993). The SPAIC is a 26-item rating scale that assesses children's distress in social situations along three factors (assertiveness/general conversation, traditional social encounters, and public performance) using a 3-point scale (never or rarely, sometimes, and most of the time or always). The SASC-R is a 26-item rating scale that assesses children's experiences of social anxiety along three factors (fear of negative evaluation, social avoidance and distress in new situations, and general social avoidance and distress) using a 5-point scale (not at all, hardly ever, sometimes, most of the time, and all of the time). There is also an adolescent version consisting of 22 items (La Greca & Lopez, 1998).

Conceptual and Practical Issues in Diagnosis


Differential Diagnosis

Because of the overlap that exists among the anxiety disorder subtypes, differential diagnosis can prove challenging, even when a diagnostic interview schedule is used. It is beyond the scope of this chapter to analyze the myriad issues involved in the differential diagnosis of the anxiety disorders. However, to illustrate how challenging differential diagnosis can be, issues involved in the differential diagnosis of GAD, SOP, and PD are presented here.

In GAD, worry is a process in which all individuals with the disorder actively engage (e.g., Silverman, La Greca, & Wasserstein, 1995). However, worry is a pervasive clinical feature of most of the anxiety disorder subtypes (Weems, Silverman, & La Greca, 2000). The differential diagnosis of GAD requires that the individual's worry does not focus solely on areas that pertain to the other anxiety subtypes, such as social evaluative situations (i.e., SOP), specific objects or situations (i.e., SP), and separation from attachment figures (i.e., SAD). The worries also cannot have emerged from exposure to a traumatic event (i.e., PTSD). GAD must be further distinguished from excessive worry about having panic attacks (i.e., PD) as well as worrying in the form of obsessions (i.e., OCD).

The specific differential between GAD and SOP can be especially challenging. In GAD, the worry about social situations and academic tasks usually stems from a fear of failure due to not reaching a self-generated standard. In SOP, the fear or worry stems from a fear of negative evaluation by others relating to social evaluative situations such as academic- or peer-related events. The social avoidance associated with SOP must be further distinguished from the social avoidance relating to having an unexpected panic attack (PD) and not wanting this attack to occur in public places (i.e., agoraphobia). Also important is distinguishing SOP from autism spectrum disorder (ASD): Children with SOP have the capacity for and interest in social relationships; children with ASD have a general lack of interest in social relationships.

Panic is another common clinical feature that is pervasive across the different anxiety disorder subtypes (Ollendick, Mattis, & Birmaher, 2004). The differential diagnosis of PD requires, however, that the panic attacks are not cued by social evaluative situations (i.e., SOP), specific objects or situations (i.e., SP), separation from attachment figures (i.e., SAD), excessive worry (i.e., GAD), traumatic events (i.e., PTSD), and concerns about exposure to objects or situations related to an obsession (i.e., OCD).

Dealing with Comorbidity

Comorbidity, or the presence of multiple disorders, occurs at high rates in children and adolescents with anxiety disorders. It is the rule rather than the exception. Estimated rates of comorbidity run as high as 91% in clinic samples (e.g., Angold & Costello, 1999) and 71% in community samples (e.g., Woodward & Fergusson, 2001). Although some of these reported high rates of comorbidity reflect methodological artifacts, including referral bias, comorbidity cannot be explained simply as artifact. Research further shows that anxious youth who are comorbid with another disorder, a depressive disorder particularly, are more severely impaired than are youth with one disorder only (Franco, Saavedra, & Silverman, 2007; Seligman & Ollendick, 1998). These findings highlight the importance of carefully considering the different types of comorbid patterns that often accompany anxiety disorders. Use of the ADIS-C/P, for example, which covers the full range of DSM disorders, is a way to increase assurance that the diverse comorbid patterns that often co-occur with anxiety disorders have been carefully and thoroughly assessed.

Measuring Anxiety with Comorbid Autism

The comorbidity between autism and anxiety is well established, with up to 40% of youth diagnosed with autism spectrum disorder (ASD) meeting criteria for an anxiety disorder (Davis, White, & Ollendick, 2014; Kaat, Gadow, & Lecavalier, 2013). Yet the phenomenology of anxiety in ASD is not well understood. There is debate about whether anxiety in this population should be viewed as a separate disorder, a symptom of ASD, or separate but not independent of ASD (Lecavalier et al., 2014). There is also overlap with DSM symptoms in individuals with ASD and anxiety. For example, avoidance of social interactions, or awkwardness during these interactions, could reflect either ASD or social phobia (or both). Difficulties due to cognitive and language delays also can make the use of self-reports difficult with some young people with ASD (e.g., Lecavalier et al., 2014).

Researchers have examined the utility of anxiety measures in this comorbid population. Storch et al. (2012) administered the ADIS-C/P to 85 children and adolescents (aged 7 to 17 years) with ASD. Diagnostic agreement between the youths and parents or clinical consensus was poor; however, agreement was good to excellent between parents and clinician. The authors concluded that clinicians should primarily base their diagnostic decisions on parent reports. In a subsequent study, Ung et al. (2014) examined the inter-rater reliability of the ADIS-C/P in 70 youths with high-functioning autism. The researchers compared inter-rater reliability between a live administration of the ADIS-C/P and a taped recording of the live administration. Ung et al. (2014) found good to excellent clinician-to-clinician reliability across different ASD diagnoses. Kaat and Lecavalier (2015) examined internal consistency, test–retest reliability, and inter-rater reliability of the MASC-2 and the Revised Children's Anxiety and Depression Scale (RCADS) in 46 children with ASD. They found internal consistency was adequate, but inter-rater reliability was poor. Divergent and convergent validity were adequate, and parent–child agreement was higher when the child had a higher IQ and lower symptom levels. Lecavalier et al. (2014) conducted a review of 38 published studies and 10 assessment measures to determine the suitability of existing measures for assessing anxiety in young people with ASD. Four measures were viewed as suitable for use in clinical trials: the parent-rated 20-item version of the Child and Adolescent Symptom Inventory (CASI; Hallett et al., 2013), the parent-rated MASC, the Pediatric Anxiety Rating Scale (PARS; Research Units on Pediatric Psychopharmacology [RUPP], 2002), and the ADIS-C/P. That is, scores on these measures were deemed to have good clinical relevance and good to excellent reliability and validity.

Overall Evaluation

Assessment for diagnosis of anxiety in children and adolescents continues to face a number of challenges. Younger children may have difficulties with language and self-report. They may also be susceptible to social desirability and demand characteristics. To determine diagnoses, it is preferable to have multiple informants and multiple methods and to assess across different contexts (e.g., Essau & Barrett, 2001). There is also a trade-off between time commitment and validity. For diagnosis, interview schedules have the most empirical evidence for deriving reliable and valid diagnoses. Among the interview schedules available, the ADIS-C/P has been used in most of the youth anxiety research studies, and it has been subjected to the most evaluation. Additional research is needed to determine the reliability and validity of diagnoses when the interview is used in community settings, as well as in deriving diagnoses of disorders with varying base rates.

Research on the utility of other assessment methods for the purpose of diagnosis and differential diagnosis is limited. Of the currently available self-rating scales, the MASC has the strongest evidence base as a screen for diagnosis. The more recent MASC-2 and RCMAS-2 appear to be well designed and validated across large samples, yet at this stage they have not been as thoroughly researched as the older versions. When measuring anxiety with comorbid autism, we defer to Lecavalier et al. (2014) in suggesting the parent-rated 20-item CASI, MASC, PARS, and ADIS as most suitable.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

This section presents assessment measures and strategies for use in arriving at a fuller case conceptualization that can guide decisions about treatment planning. Case conceptualization is similar to diagnosis, but they are not one and the same. Case conceptualization focuses more on psychological processes associated with the etiology and maintenance of a disorder. A definitive understanding of children's psychological problems is difficult to achieve, and initial conclusions about how to conceptualize the "case" and plan treatment are best viewed as hypotheses that await verification based on additional information (Ollendick & Hersen, 1993; Silverman & Saavedra, 2004). The assessment process ideally continues throughout treatment, which can also serve as an opportunity to obtain additional information based on treatment response. Thus, evidence-based assessments de-emphasize quick, definitive conclusions and focus more on obtaining information that unfolds over time and is directly relevant to treatment. "Relevance for treatment" refers to the clinical utility of information in planning the treatment and, in the final analysis, the evaluation of intervention outcomes (Mash & Terdal, 1988). A related concept is treatment utility, which refers to the degree to which assessment strategies are shown to contribute to beneficial treatment outcomes (Hayes, Nelson, & Jarrett, 1987).

The most widely used measures for case conceptualization and treatment planning are presented in Table 11.2. The table also includes several measures that are not discussed in the narrative. The discussion focuses on measures and strategies that have supportive evidence for developing (a) a clinically meaningful and useful case conceptualization and (b) a clinically sensitive and feasible treatment plan. A discussion of best practices for case conceptualization and treatment planning is also provided. The section concludes with an overall evaluation of instruments.

TABLE 11.2 Ratings of Instruments Used for Case Conceptualization and Treatment Planning
Ratings are listed in the order: Norms; Internal Consistency; Inter-Rater Reliability; Test–Retest Reliability; Content Validity; Construct Validity; Validity Generalization; Clinical Utility.

Diagnostic Interview Schedules
ADIS C/P-IV: NA, NA, E, E, E, E, E, E
DISC-IV: NA, NA, A, G, G, G, G, G
DICA: NA, NA, A, G, G, G, G, G
K-SADS: NA, NA, G, A, A, A, G, A

Child Self-Rating Scales
RCMAS: E, G, NA, G, G, E, G, G
SRAS: A, E, NA, G, G, G, G, E
MASC: G, G, NA, G, G, E, E, E
SCAS: A, G, NA, A, E, E, G, G
SPAIC: A, G, NA, E, E, G, G, G
FASA/FASA-CR: A, E, G, E, E, G, E, E

Behavioral Observations
BAT: NA, NA, G, G, G, G, G, G
SET/PYIT: NA, NA, G, G, G, G, G, G
SM: NA, NA, NA, G, G, G, G, G

Note: ADIS C/P-IV = Anxiety Disorders Interview Schedule; DISC-IV = Diagnostic Interview Schedule for Children, Version IV; DICA = Diagnostic Interview Schedule for Children and Adolescents; K-SADS = Schedule for Affective Disorders and Schizophrenia for School-Age Children; RCMAS = Revised Children's Manifest Anxiety Scale; SRAS = School Refusal Assessment Scale; MASC = Multidimensional Anxiety Scale for Children; SCAS = Spence Children's Anxiety Scale; SPAIC = Social Phobia and Anxiety Inventory for Children; FASA/FASA-CR = Family Accommodation Scale-Anxiety/Family Accommodation Scale-Child Report; BAT = Behavioral Avoidance Task; SET/PYIT = Social Evaluative Task/Parent–Youth Interaction Task; SM = Self-Monitoring; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

Semi-Structured and Structured Diagnostic Interview Schedules

The initial set of decisions that need to be made when working with children and adolescents with anxiety disorders is how to best conceptualize the case, to determine whether an anxiety disorder exists, and to plan treatment accordingly. Differences between the anxiety disorders and other disorders, as well as differences within the anxiety subtypes, constitute the primary reason why the use of structured and semi-structured interview schedules is critical from an evidence-based perspective. Furthermore, cognitive–behavioral treatment (CBT), which involves exposure-based exercises, both in session and out of session, remains the strongest evidence-based treatment for anxiety disorders in children and adolescents (Higa-McMillan, Francis, Rith-Najarian, & Chorpita, 2016; Motoca & Silverman, 2011; Ollendick, King, & Chorpita, 2006). Although several authors have criticized the linkage between diagnosis and treatments in the evidence-based treatment movement (e.g., Goldfried & Wolfe, 1998), this concern is dampened with regard to treating phobic and anxiety disorders because of the strong evidence for exposure-based CBT approaches.

The implications of the above are clear: If one wishes to use the treatment approach that possesses the most research evidence, it is important to first have high confidence that one has reliable and valid information that the child is suffering primarily from clinical levels of anxiety and not another clinical disorder. Second, it is important to have confidence about the specific type(s) of anxiety disorder to ensure that the appropriate exposure tasks can be planned and implemented (e.g., exposure to social evaluative situations for SOP cases and exposure to separation situations for SAD cases).

As discussed previously in the section titled Assessment for Diagnosis, semi-structured and structured diagnostic interviews are useful in the initial stages of deriving diagnoses of anxiety disorders. See Table 11.2 for a listing of the schedules used most in the anxiety field. Specific aspects of anxiety diagnoses yield further clinically suggestive information about treatment targets and treatment plans.



Prescriptive Treatment Strategies

The reader is referred to Table 11.2 for a listing of the main measures and strategies, with some additional ones noted that are not discussed in the narrative. In an early study, Eisen and Silverman (1993) showed CBTs were most effective for children with overanxious disorder (i.e., the DSM-III-R precursor to GAD) when they were matched with specific symptoms. For example, children with primary symptoms of worry, defined by the worry/oversensitivity subscale of the RCMAS, responded more favorably to a cognitive therapy, whereas children with primary symptoms of somatic complaints, defined by the physiological arousal subscale of the RCMAS, responded more favorably to relaxation training aimed at dealing with physiological and somatic complaints.

This early study was replicated by Eisen and Silverman (1998) in that although both treatments were effective, the prescriptive treatments produced greater improvements for the children to meet specific positive end-state functioning criteria. Similar effects of matching were shown by Ollendick, Hagopian, and Huntzinger (1991) with separation anxious children and by Ollendick (1995) with adolescents with PD with agoraphobia. Each of these studies used single case multiple baseline designs to illustrate the controlling effects of the matched interventions. Overall, CBTs were shown to be maximally effective in these studies when the assessment of diagnoses was supplemented with the assessment of symptom profiles.

A prescriptive approach is also illustrated in the treatment of school refusal behavior in children and adolescents. Although not a specific psychiatric diagnosis, school refusal is a common mental health and educational problem that refers to child-motivated refusal to attend school and/or difficulties remaining in school for an entire school day (Kearney, 2003). Children who refuse school frequently meet criteria for one or more of the anxiety disorders in childhood, and they may also meet criteria for one of the disruptive behavior disorders (Heyne, Sauter, Ollendick, van Widenfelt, & Westerberg, 2014).

Kearney and Silverman (1990) proposed a functional model suggesting children refused school for one of four probable reasons: (a) to avoid stimuli that provoke negative affectivity, (b) to escape aversive social and/or evaluative situations, (c) to seek attention from significant others, and (d) to pursue tangible reinforcers outside of school. To address these functions, they developed the School Refusal Assessment Scale (SRAS; Kearney & Silverman, 1993) and its recent revision (SRAS-R; Kearney, 2002). The original SRAS was a 16-item instrument that contained 4 items devoted to each of the four functional conditions. Child and parent versions of the scale were developed. Item means were averaged across functions to derive functional profiles that included the primary and secondary reasons why a particular child refused school. The original SRAS and its recent revision (24 items with parent and child versions) have been found to be reliable across time and between parent raters (Kearney, 2002, 2006; Kearney & Silverman, 1993; Silverman & Ollendick, 2005).
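The SRAS scoring logic just described (averaging item ratings within each of the four functions and ranking the means to identify primary and secondary functions) can be sketched as follows. The item-to-function assignment and the ratings are hypothetical; the published scale and its revision differ in item numbering and response format.

```python
# Minimal sketch of the functional-profile logic described above, with a
# hypothetical item layout: item scores are averaged within each of the four
# functional conditions, and the two highest means are taken as the primary and
# secondary reasons for school refusal.
ITEMS_BY_FUNCTION = {
    "1: avoid negative-affectivity stimuli":   [1, 5, 9, 13],
    "2: escape aversive social/evaluative":    [2, 6, 10, 14],
    "3: attention from significant others":    [3, 7, 11, 15],
    "4: tangible reinforcement out of school": [4, 8, 12, 16],
}

def functional_profile(item_scores):
    """item_scores: dict mapping item number to its rating."""
    means = {
        function: sum(item_scores[i] for i in items) / len(items)
        for function, items in ITEMS_BY_FUNCTION.items()
    }
    ranked = sorted(means, key=means.get, reverse=True)
    return means, ranked[0], ranked[1]   # full profile, primary, secondary

# Hypothetical ratings for one child (e.g., from a parent form).
scores = dict(zip(range(1, 17), [5, 2, 1, 0, 4, 3, 2, 1, 6, 2, 1, 0, 5, 3, 0, 1]))
profile, primary, secondary = functional_profile(scores)
print(primary, "|", secondary)
```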
The scale is useful in the prescriptive treatment of school-refusing children. Prescriptive treatment for negatively reinforced school refusal behavior (Functions 1 and 2) consists of psychoeducation, fear hierarchy development, cognitive therapy, modeling, and behavioral exposures designed to gradually reintroduce the child to school. Prescriptive treatment for positively reinforced school refusal behaviors (Functions 3 and 4) consists of developing daily routines, escorting the youth to school, contingency contracting, and communication skills training. Single-case experimental design treatment studies have shown that the SRAS and the SRAS-R are useful in determining which prescriptive treatment best fits a particular case and which treatments may be less effective (Chorpita, Albano, Heimberg, & Barlow, 1996; Kearney, Pursell, & Alvarez, 2001; Kearney & Silverman, 1993, 1999).

Behavioral Observations

Systematic direct behavioral observations can play a particularly helpful role in case conceptualization and treatment planning (see Table 11.2 for a summary). One useful role of behavioral observations is for identifying and quantifying specific fear and anxiety symptoms and behaviors, such as avoidance. Ost, Svensson, Hellstrom, and Lindwall (2001), for example, observed children engage in behavioral avoidance tasks, which consisted of a series of graduated steps, and the percentage of steps the children accomplished was recorded.


Perhaps because behavioral avoidance tasks have long been known to be affected by instructional set and demand characteristics (e.g., "go as far as you can" vs. "stop whenever you feel too scared"), direct observations have been used more often to assess children's subjective judgments of their own levels of fear/anxiety in fear/anxiety-provoking situations/tasks or observers' subjective judgments of children's levels of fear/anxiety. In some studies, observers' subjective ratings are obtained by providing the observers with global rating scales (e.g., a Likert rating scale from 1 to 5). In other studies, observers are provided with behavioral dimensions to help assist them in making their subjective ratings (Silverman & Ollendick, 2005).

Two other types of behavioral observation tasks are social evaluative tasks and parent–youth interaction tasks. With regard to social evaluative tasks (Beidel, Turner, & Morris, 2000), participants are informed of the evaluative nature of the task and are given standard behavioral assertiveness instructions. For example, Beidel et al. (2000) invited children and adolescents to read aloud a story in front of a small group; participants were told to "Respond as if the scene were really happening." With regard to parent–youth interaction tasks (e.g., Hudson & Rapee, 2002), parents and their children were observed while engaging in problem-solving situations. Specifically, Hudson and Rapee conducted observations of "normal" and anxious children and their siblings while completing a separate set of tangram or puzzle tasks designed to be slightly too difficult to complete during the allocated 5 minutes. Of interest was the degree of parental involvement during the task (e.g., degree of unsolicited help and degree to which the parent physically touched the tangram piece).

From a best-practices perspective, we believe systematic direct observational procedures have clinical utility with regard to case conceptualization and treatment planning. They can yield helpful conceptual information about the nature of family interactions among anxious children or just "how far children can go" when it comes to interacting with a feared object or event. However, they can be time-consuming and difficult to arrange. We are encouraged by a novel and promising behavioral assessment approach that provides an objective index of fear-related avoidance and that is also presented to children as a fun game. Relying on motion-tracking software, the Yale Interactive Kinetic Environment Software (YIKES) is a flexible experimentation platform that facilitates examination of approach and avoidance during an episode of immersive gameplay. In a series of experiments focused on behavioral avoidance of spider images, approach toward spider images was compared to approach toward matched neutral images. Behavioral avoidance of the spider images in both anxious youth and their mothers was associated with subjective ratings of fear of spiders. Behavioral avoidance in the mothers significantly moderated the association between mother and child fear, and anxiety sensitivity moderated the association between fear and avoidance in the anxious children (Lebowitz, Shic, Campbell, Basile, & Silverman, 2015; Lebowitz, Shic, Campbell, MacLeod, & Silverman, 2015).


These findings highlight the ability of novel behavioral measurement methodologies to contribute to assessment and allow for the testing of otherwise difficult to examine hypotheses.

Self-Monitoring

Self-monitoring often has been viewed as a more efficient and easier way to accomplish the same goals as direct observation. Although relatively common in practice among behaviorally oriented clinicians, little has been done in the child and adolescent anxiety research area to evaluate feasibility and psychometric properties. An exception is Beidel, Neal, and Lederer (1991), who devised and evaluated the feasibility (i.e., child compliance), reliability, and validity of a daily diary for assessing the range and frequency of social evaluative anxious events in elementary school children (N = 57; n = 32, test anxious; n = 25, non-test anxious) during a 2-week assessment phase. Structured in nature, the daily diary listed events such as "I had a test" and "The teacher called on me to answer a question," as well as a list of potential responses to the occurrence of the events, including positive (e.g., practiced extra hard and told myself not to be nervous and it would be okay), negative (e.g., cried and got a headache or stomachache), and neutral (e.g., did what I was told) behaviors. The children also rated the degree of distress they experienced using a pictorial 5-point rating scale that depicted increasing degrees of anxious arousal.

With regard to feasibility or compliance, with no incentives offered, the mean number of days the diary was completed for the 2 weeks ranged from 7.9 to 11.5 days, although only 31% to 39% of the children complied for the full 2 weeks (Beidel et al., 1991). Retest reliability was modest, but that is probably because the events listed on the diary showed true fluctuations. Evidence for validity was demonstrated in that the test-anxious children reported significantly more emotional distress and more negative behaviors such as crying or behavioral avoidance.

Thus, as with direct observation procedures, self-monitoring procedures have clinical utility in yielding helpful conceptual information (e.g., the specific situations that elicit anxiety in a child and the child's cognitions when faced with a specific object or event). Furthermore, the prevalent use of digital technology, including smartphones, has led to the development of self-monitoring applications for anxiety (Anxiety and Depression Association of America, 2016).



Although the use of applications for treatment and monitoring seems a promising development, particularly for technologically savvy young people, more research is needed to determine their value (e.g., Radovic et al., 2016).

Conceptual and Practical Issues in Assessing for Case Conceptualization and Treatment Planning

Demonstrating Treatment Utility

The studies summarized in this section regarding prescriptive treatment strategies represent important efforts in demonstrating the treatment utility of assessment. For example, in the studies by Eisen and Silverman (1993, 1998), which showed CBTs were most effective for specific aspects of the treatment (e.g., cognitive therapy and relaxation therapy) when matched with the child's specific symptoms (e.g., worry and physiological arousal), the treatment utility of assessing for these specific symptoms was empirically shown because the assessment produced better treatment outcome. However, the treatment utility of deriving DSM diagnoses using interview schedules still has not been demonstrated in other treatment outcome studies (Nelson-Gray, 2003). For example, how do children who had diagnoses assigned with a structured or semi-structured diagnostic interview schedule fare in anxiety reduction programs compared with children who had diagnoses assigned using an unstructured clinical interview? What about in comparison with children whose anxiety was assessed using rating scales? Answers to these questions can lead to more cost- and time-effective practice, which is of high relevance in efforts to transfer evidence-based practices to community settings.

Considering Impaired But Not Diagnosed Children and Adolescents

Diagnostic interview schedules emphasize DSM anxiety disorders and symptoms and are in line with the treatment targets of CBT. Research shows, however, that a substantial proportion of children and adolescents who present to community mental health clinics do not meet criteria for a DSM disorder but, rather, evidence impaired functioning (Angold, Costello, & Erkanli, 1999). Anxiety is a specific problem area that has been found likely to lead to youth impairment, but it is not necessarily a diagnosis (Angold et al., 1999). Because impairment may not be reported by the parents and/or children, it is important to probe carefully for whether the child is mastering expected developmental tasks (e.g., developing peer relationships). If not, then

anxiety may be deemed as potentially impairing. As noted previously, contained on the ADIS-​C/​P is the clinician rating scale, which allows for an assessment of interference of each anxiety diagnosis along a 0-​to 8-​point scale, where 4 is considered a clinical diagnosis; less than 4 is subthreshold. Retest reliability estimates of the clinician rating scale ratings have been found to be satisfactory (e.g., Silverman & Eisen, 1992). The clinician rating scale can also be adapted to assess for interference of anxiety symptoms, even if DSM diagnostic criteria are not met. For example, for a child who does not meet full criteria for SAD but cannot sleep alone at night without her mother, the child could be asked the following: “You just told me that sometimes you have trouble sleeping alone at night without your mother. How does not being able to go to sleep by yourself mess things up in terms of how you now are doing in school? How about in terms of things with your family? And how much does it affect things with friends? And how much does it make you feel very upset [personal distress]?” The PARS (RUPP Anxiety Study Group, 2002)  is another measure designed to assess the frequency, severity, and associated impairment across SAD, SOP, and GAD symptoms in children and adolescents (aged 6–​17  years). The internal consistency of scores on the PARS has been found to be satisfactory, but its retest reliability needs further study (e.g., retest reliability  =  .55 for the total scale score using an average retest interval of approximately 3 weeks). The PARS’s convergent and divergent validity also needs further examination. For example, although observed correlations have been found to be in the expected directions (i.e., positive correlations with ratings of internalizing symptoms and negative correlations with externalizing symptoms), this was especially true of the correlations between PARS ratings and clinician ratings and also other sources’ ratings, including children’s ratings on the MASC. Because anxiety is clinically significant only when there is an associated level of interference in functioning (APA, 2013), the Child Anxiety Impact Scale (CAIS; Langley, Bergman, McCracken, & Piacentini, 2004; Langley et al., 2014) is a scale that represents the growing understanding of the critical role of impact or interference. The CAIS is a 27-​item rating scale (i.e., 4-​point Likert scale, from 0 [not at all] to 3 [very much]) that measures functional impairment of anxiety symptoms on psychosocial functioning in school, social, and family domains. Children are asked to rate the amount of difficulty they have in completing the activity described by the item due to anxiety. The CAIS has a parallel version for parents. Internal


consistency has been found to be adequate to good for the total scores and subscale scores (Cronbach's α = .70 to .90). Scores on the CAIS also predict other anxiety scores, including on the MASC, PARS, SCARED, and the Child Behavior Checklist (CBCL) internalizing scales (Langley et al., 2014). Best practices suggest the consideration of impairment when assessing children and adolescents who present with anxiety difficulties. Impairment rating scales provide reliable and valid estimates of the extent of the youth's impairment. For those children and adolescents who show significant impairment in their functioning, even if subthreshold in diagnosis, treatment services still could be important to provide.

Considering Family Accommodation

Rather than viewing a child's anxiety primarily as an individual phenomenon impacting the child, an alternative conceptualization is to view childhood anxiety as systemic, emphasizing the importance of family interactions. Researchers have begun to focus on parents' involvement in childhood anxiety disorders through the process of family accommodation (FA). Parents who regularly attempt to alleviate their child's distress by changing their own behavior may inadvertently maintain their child's anxiety (e.g., Lebowitz, Scharfstein, et al., 2014). Assessing FA as part of a case conceptualization provides clinicians with alternative or additional treatment targets, and treatments have emerged that emphasize the reduction of FA through parent-based work (Lebowitz & Omer, 2013; Lebowitz, Omer, Hermes, & Scahill, 2014). This approach may be particularly useful if the child is not willing or is unable to participate directly in treatment.

The Family Accommodation Scale-Anxiety (FASA; Lebowitz et al., 2013) requires parents to rate the frequency with which they engage in FA, and it includes two subscales for specific domains of FA: (a) active participation in symptom-driven behaviors and (b) modifications to parent or family routines and schedules. The FASA also queries parental distress relating to the need to accommodate the child as well as negative child consequences of refusing to accommodate. A child report version of the FASA (FASA-CR; Lebowitz, Scharfstein, et al., 2015) parallels the parent version and also queries child beliefs relating to FA, such as whether the child believes it is helpful and whether he or she believes the parents should cease accommodating. Results from several independent studies indicate that FA is very prevalent across the anxiety disorders (e.g., 97% of parents endorsing FA), is associated with more severe symptoms, and that reduction in FA is associated with treatment response of childhood anxiety disorders (Jones et al., 2015; Kagan, Peterman, Carper, & Kendall, 2016; Lebowitz, Panza, et al., 2016; Lebowitz et al., 2013).


Overall Evaluation

The field has available many promising assessment measures and strategies to augment information obtained from diagnostic interviews to guide decisions about treatment planning, but it has a long way to go in achieving an evidence base to guide these efforts. Although the use of these measures can help in arriving at a clinically meaningful and useful case conceptualization and in implementing a clinically sensitive and feasible treatment plan at the individual level, these measures have not yet been shown to be useful at the group or nomothetic level. Similarly, despite the preliminary evidence for adopting an idiographic, prescriptive approach to treat anxiety disorders and/or school refusal behavior, through the identification of problematic symptoms and/or functionally motivating conditions, evidence is needed that the former is more efficacious than a nomothetic, statistically based approach (Silverman & Berman, 2001). Another important development in the field is the assessment and targeting of family accommodation as a maintaining factor in the development of anxiety. It is important for future research to compare the relative efficacy of an idiographic, prescriptive approach to a standard CBT package.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

This section deals with assessment measures and strategies most useful for tracking the progress of treatment and evaluating the overall effect of treatment on anxiety symptoms, diagnosis, and general functioning. Diagnoses derived from interview schedules have been used for treatment evaluation purposes. In treatment studies, 100% of participants meet diagnostic criteria for an anxiety disorder at pretreatment. An important outcome variable is diagnostic recovery rate at post-​treatment and follow-​ up. Most studies report 60% to 80% of participants as recovered or no longer meeting diagnostic criteria at post-​treatment, with maintenance of recovery at 1-​year follow-​up (Ollendick et al., 2006; Silverman et al., 1998). Studies vary to the extent that the primary diagnosis is reported versus “all” anxiety diagnoses.



TABLE 11.3 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation
Ratings are listed in the order: Norms; Internal Consistency; Inter-Rater Reliability; Test–Retest Reliability; Content Validity; Construct Validity; Validity Generalization; Clinical Utility.

Diagnostic Interview Schedules
ADIS C/P-IV: NA, NA, E, E, E, E, E, E
DISC-IV: NA, NA, A, G, G, G, G, G
DICA: NA, NA, A, G, G, G, G, G
K-SADS: NA, NA, G, A, A, A, G, A

Parent
CBCL-I: E, G, NA, G, E, E, G, G

Clinician
ADIS-C/P: CRS: NA, NA, G, E, G, NA, NA, E

Child
CAIS: G, E, NA, NA, G, G, G, E
RCMAS: E, G, NA, G, G, E, G, G
STAIC: G, G, NA, G, G, G, G, A
FSSC-R: A, E, NA, G, E, G, G, G
MASC: G, G, NA, G, G, E, E, E
SCAS: A, G, NA, A, E, E, G, G
SPAIC: A, G, NA, E, E, G, G, G

Behavioral Observations
BAT/SET/PYIT: NA, NA, G, G, G, G, G, G

Note: ADIS C/P-IV = Anxiety Disorders Interview Schedule; DISC-IV = Diagnostic Interview Schedule for Children, Version IV; DICA = Diagnostic Interview Schedule for Children and Adolescents; K-SADS = Schedule for Affective Disorders and Schizophrenia for School-Age Children; CBCL-I = Child Behavior Checklist-Internalizing Scale; ADIS-C/P: CRS = Anxiety Disorders Interview Schedule-Child and Parent Versions: Clinician Rating Scale; CAIS = Child Anxiety Impact Scale; RCMAS = Revised Children's Manifest Anxiety Scale; STAIC = State–Trait Anxiety Inventory for Children; FSSC-R = Fear Survey Scale for Children-Revised; MASC = Multidimensional Anxiety Scale for Children; SCAS = Spence Children's Anxiety Scale; SPAIC = Social Phobia and Anxiety Inventory for Children; BAT/SET/PYIT = Behavioral Avoidance Task/Social Evaluative Task/Parent–Youth Interaction Task; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

Because interview schedules were discussed previously and because data from self-monitoring procedures are reported in many case studies or single case designs but rarely in randomized trials, emphasis is placed in this section on rating scales and observational methods (Table 11.3). The section also includes a discussion of best practices with respect to conceptual and practical issues and concludes with an overall evaluation.

It is worth noting first that rating scales and observational methods were not often administered over the course of a child or adolescent treatment program, but just at pretreatment, post-treatment, and follow-up; in some studies, they were administered at mid-treatment (Kendall et al., 1997). As interest has grown over the years in identifying mediators of treatment, the importance of administering measures of hypothesized mediators and the primary outcome measures during the course of treatment (e.g., every other session) has grown as well. This allows for a more precise evaluation of the temporal precedence of the mediator(s) over the outcome(s) (Maric, Prins, & Ollendick, 2015). If one is concerned about client burden, one could administer an abridged version of a rating scale. This can be accomplished by determining the three or four items of the scale that correlate most highly with the total score (or total subscale scores) and administering only those items on a weekly or biweekly basis. In addition, because there are fluctuations in rating scale scores irrespective of treatment, usually attenuation, it can also be useful to administer scales at least twice prior to the intervention: first at the initial intake and then immediately prior to treatment.
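One way to construct the kind of abridged scale just described is to rank items by their corrected item-total correlations (each item correlated with the total of the remaining items) and retain the top three or four. The following sketch uses simulated responses; in practice it would be applied to the clinician's or researcher's own item-level data for the scale of interest.

```python
import numpy as np

def top_items(responses, n_items=4):
    """Rank items by corrected item-total correlation and keep the top n_items.

    responses: 2-D array, rows = respondents, columns = scale items.
    """
    responses = np.asarray(responses, dtype=float)
    total = responses.sum(axis=1)
    corrs = []
    for j in range(responses.shape[1]):
        rest = total - responses[:, j]            # total score with the item removed
        corrs.append(np.corrcoef(responses[:, j], rest)[0, 1])
    ranked = np.argsort(corrs)[::-1]
    return ranked[:n_items], np.asarray(corrs)

# Simulated data: 200 respondents, 10 items loading on one latent severity factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=200)
items = latent[:, None] * rng.uniform(0.3, 1.0, 10) + rng.normal(0, 1, (200, 10))
keep, item_total_r = top_items(items, n_items=4)
print("Item indices to retain for the brief form:", keep)
```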

Rating Scales

The use of rating scales represents best practices for the assessment of treatment progress and treatment outcome because they are sensitive to treatment change. Rating scales completed by youth, parents, teachers, or clinicians are easy to administer, have relatively low cost, and have objective scoring procedures (Silverman & Rabian, 1999; Silverman & Serafini, 1998). See Table 11.3 for the most widely used rating scales for this assessment purpose.


competencies of children and adolescents using a 3-​point scale (not true, somewhat or sometimes true, and very true or often true). The CBCL includes broadband subscales (i.e., externalizing and internalizing) and narrowband subscales (i.e., withdrawn, somatic complaints, and anxious/​ depressed). Van Meter et al. (2014) examined the properties of the CBCL and Youth Severity Rating (YSR) internalizing scales in two large samples (N = 1,084 and 651). They found that both measures discriminated between youth with any anxiety disorder or GAD from other diagnoses, leading them to conclude that the CBCL and YSR provide valuable information as to whether a youth is experiencing an anxiety disorder. Evans, Thirlwall, Cooper, and Cresswell (2016) examined whether the CAIS and SCAS could be used to identify recovery from an anxiety disorder in 337 children. They found that both measures, particularly the CAIS parent version, were useful for this purpose, except in cases of specific phobia. The Clinician Rating Scale of the ADIS for DSM-​IV: C/​P (Silverman & Albano, 1996) also has been used in a number of studies. It, too, has been found to be sensitive to treatment effects (Silverman & Ollendick, 2005). Behavioral Observations The use of direct behavioral observation represents another approach for assessing treatment progress and treatment outcome, although it has been far less used for this purpose relative to interviews (i.e., diagnostic recovery rates) and rating scales (i.e., statistically significant declines in dimensional scores; see Table 11.3). In studies by Kendall et  al. (1997) and Beidel et  al. (2000), participants were asked to engage in an evaluative task and were given standard behavioral assertiveness instructions. Both Kendall et  al. (1997) and Beidel et  al. (2000) reported treatment improvements in participants’ performance on these tasks. Using a family observation task, however, Barrett et  al. (1996) did not find significant pre-​to post-​treatment differences between an individual-​versus family-​based CBT. The extent that newer developed behavioral assessment measures, such as the YIKES, are sensitive to treatment change is currently under investigation. Conceptual and Practical Issues in the Assessment of Treatment Progress and Treatment Outcome Using Normative Data There are concerns about using normative values to assess clinical significance. Norming mainly indicates a child’s relative standing; it still does not indicate a child’s absolute
standing on anxiety. When the CBCL (Achenbach, 1991a, 1991b), for example, is used to assess treatment outcome, clinically significant improvement is defined as meeting a minimum criterion of a T score below 70 on the CBCL internalizing scale (adjusted according to age norms; Kendall et al., 1997; Silverman et al., 1999). Thus, cases that shift from above this cut-off value to below it are viewed as showing clinically significant improvement following treatment (see Kazdin, 1999). There is no clear evidence, however, that children who score below 70 have fewer worries or display less avoidant behavior than children who score above 70. That is, this shift on the CBCL from pre- to post-treatment does not indicate whether the treatment had a meaningful impact on the day-to-day functioning of the treated youth (see Kazdin, 1999). Indicators such as meeting role demands, functioning in everyday life, and improvement in the quality of one's life would also be useful to assess.

Reporting Biases

It is reasonable to assume that some anxious children are reluctant to self-disclose their personal anxious reactions on rating scales. The RCMAS Lie Scale and the RCMAS-2 Defensiveness Scale are useful in this regard. The RCMAS Lie Scale, a downward extension of the Lie Scale on the adult version of the Manifest Anxiety Scale, was derived from the social desirability/Lie Scale of the Minnesota Multiphasic Personality Inventory. Containing items such as "I never get angry," "I like everyone I know," and "I am always kind," the Lie Scale has been used as an indicator of social desirability or defensiveness (Dadds, Perrin, & Yule, 1998; Reynolds & Richmond, 1985), reflecting a tendency to present oneself in a favorable light and/or to deny flaws and weaknesses that others are usually willing to admit. Research using the RCMAS Lie Scale in unselected school samples (Dadds et al., 1998) and clinic-referred anxious samples (Pina, Silverman, Saavedra, & Weems, 2001) reveals that younger children score significantly higher on the Lie Scale than older children; no significant sex differences have been found. These findings underscore the need for clinicians and researchers to recognize that younger anxious children are more likely than older age groups to respond in a socially desirable manner when completing anxiety rating scales. The findings also underscore the need to emphasize to anxious youth that there are "no right or wrong answers" during the assessment process. Similar pressure to please and to be viewed in a favorable light may exist with other assessment strategies such as behavioral observations, although the issue has not been
studied. Research with the Defensiveness Scale of the latest revision of the RCMAS is also needed.

Overall Evaluation

Little systematic assessment has been undertaken over the course of child anxiety treatment studies to monitor treatment progress, and such efforts are recommended. Using abbreviated versions of psychometrically sound measures may represent one important way to move this work forward. Diagnostic interviews and rating scales have been the most widely used methods for evaluating treatment outcome. Although rating scales are widely used and show sensitivity to change, more research is needed to determine the clinical relevance or value of scores on these scales, including changes in scores.
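To make the cut-off rule discussed above concrete, the following minimal Python sketch shows how pre- and post-treatment T scores might be flagged for clinically significant improvement. The function name, the example scores, and the fixed cut-off of 70 are illustrative assumptions that mirror the CBCL internalizing convention described earlier; they are not a substitute for published scoring manuals or for reliable-change analyses.

```python
def clinically_significant_improvement(pre_t, post_t, cutoff=70):
    """Illustrative cut-off rule: a case counts as improved only if it moves
    from at/above the clinical cut-off before treatment to below it afterward.
    The default cut-off of 70 is an assumption based on the CBCL convention
    discussed in the text."""
    return pre_t >= cutoff and post_t < cutoff

# Hypothetical pre/post T scores for three treated youth
cases = [(78, 64), (72, 71), (66, 58)]
for pre, post in cases:
    print(pre, post, clinically_significant_improvement(pre, post))
# Only the first case crosses the cut-off. The third case drops by 8 points
# but never exceeded the threshold, so the rule ignores the change entirely.
```

The third hypothetical case illustrates the limitation noted above: a youth can improve by several points without the cut-off rule registering it, and, conversely, a case that barely crosses the threshold tells us nothing about day-to-day functioning.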

CONCLUSIONS AND RECOMMENDATIONS

We first provided an overview of anxiety disorders in children and adolescents and then summarized the research evidence for anxiety assessment measures and strategies. The focus was on measures that are not only evidence based but also feasible and useful for the needs of the practitioner. Based on the information provided in this chapter, we make the following recommendations. For the purpose of diagnosis, structured or semi-structured clinical interviews are recommended because they lead to more reliable anxiety diagnoses than unstructured clinical interviews. The interview schedule used most frequently has been the ADIS-C/P. Although the reliability and validity of anxiety diagnoses have been documented using this schedule, further evaluation of the interview schedule is needed in community clinics and with disorders of varying base rates. To assist in screening for anxiety disorders in advance of later diagnostic workups, it is recommended that factor scale scores of instruments be examined, not just total scores, when trying to discriminate anxiety disorders from other disorders. Also relating to screening, the MASC appears to have the most research evidence, although the evidence is limited. More research needs to be conducted with the MASC-2, including the new parent version. The MASC-2, as well as the RCMAS-2, appears to have valuable additional features, but more research is required to scrutinize and better understand the utility of these measures.
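Because screening recommendations such as these ultimately turn on how well a scale (or factor scale) cut-off separates youth with anxiety disorders from youth with other presentations, a brief sketch of the standard accuracy indices may be helpful. The 2 × 2 counts below are entirely hypothetical and are not taken from any MASC, RCMAS, or ADIS study; the sketch only illustrates how such indices are computed once a cut-off has been applied.

```python
def screening_accuracy(tp, fp, fn, tn):
    """Compute conventional screening indices from a 2 x 2 table of
    screen results (positive/negative) against diagnostic-interview status."""
    sensitivity = tp / (tp + fn)   # anxious youth correctly flagged by the screen
    specificity = tn / (tn + fp)   # non-anxious youth correctly screened negative
    ppv = tp / (tp + fp)           # probability a positive screen is a true case
    npv = tn / (tn + fn)           # probability a negative screen is a true non-case
    return sensitivity, specificity, ppv, npv

# Hypothetical counts for a factor-scale cut-off evaluated against a diagnostic interview
sens, spec, ppv, npv = screening_accuracy(tp=40, fp=30, fn=10, tn=120)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, PPV={ppv:.2f}, NPV={npv:.2f}")
```

With these made-up counts, sensitivity and specificity are both .80, yet the positive predictive value is only about .57 because most youth in the sample do not have an anxiety disorder; this is one reason elevated screening scores are best treated as a prompt for a diagnostic workup rather than as a diagnosis.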

To assess for the purpose of case conceptualization and treatment formulation, a prescriptive treatment approach represents a potentially useful way to proceed and a fruitful avenue for future research. Direct observations and self-monitoring procedures are also clinically useful for case conceptualization and treatment formulation, but questions exist about their feasibility, retest reliability, and validity. For the purpose of assessing treatment outcome, the ADIS interviews, the RCMAS, and the CBCL internalizing scales have been the most widely used measures, and all have been found to be sensitive to change. The FASA and CAIS are both relatively new measures to the field but hold much promise given the growing interest in family processes and impairment, respectively. It is almost always important to consider multiple sources of information, and not to assume there is one unique gold standard, because different perspectives likely reflect biases and varying perceptions of what is in the best interest of the child or adolescent.

ACKNOWLEDGMENTS

This work was supported in part by National Institute of Mental Health grants R34 MH096915, R34 MH097931, and K23 MH103555 and by a NARSAD Young Investigator Award (21470).

References

Achenbach, T. M. (1991a). Manual for the child behavior checklist/4–18 and 1991 profile. Burlington, VT: University of Vermont. Achenbach, T. M. (1991b). Manual for the teacher's report form and 1991 profile. Burlington, VT: University of Vermont. Albano, A. M., & Silverman, W. K. (2017). Anxiety disorders interview schedule for the DSM-5. New York, NY: Oxford University Press. Allen, J. L., Rapee, R. M., & Sandberg, S. (2008). Severe life events and chronic adversities as antecedents to anxiety in children: A matched control study. Journal of Abnormal Child Psychology, 36, 1047–1056. Ambrosini, P. J. (2000). Historical development and present status of the Schedule for Affective Disorders and Schizophrenia for School-Age Children (K-SADS). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 49–58. American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.

American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Ang, R. P., Lowe, P. A., & Yusof, N. (2011). An examination of the RCMAS-2 scores across gender, ethnic background, and age in a large Asian school sample. Psychological Assessment, 23, 899–910. Angold, A., Costello, E. J., & Erkanli, A. (1999). Comorbidity. Journal of Child Psychology and Psychiatry, 40, 57–87. Anxiety and Depression Association of America. (2016). Mental health apps. Retrieved from https://www.adaa.org/finding-help/mobile-apps Beck, J., Beck, A., & Jolly, J. (2001). Beck youth inventories of social and emotional impairment manual. San Antonio, TX: Psychological Corporation. Beidel, D. C., Neal, A. M., & Lederer, A. S. (1991). The feasibility and validity of a daily diary for the assessment of anxiety in children. Behavior Therapy, 22, 505–517. Beidel, D. C., Turner, S. M., & Morris, T. L. (1999). Psychopathology of childhood social phobia. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 643–650. Beidel, D. C., Turner, S. M., & Morris, T. L. (2000). Behavioral treatment of childhood social phobia. Journal of Consulting and Clinical Psychology, 68, 1072–1080. Birmaher, B., Khetarpal, S., Brent, D. A., Cully, M., Balach, L., Kaufman, J., & Neer, S. M. (1997). The Screen for Child Anxiety Related Emotional Disorders (SCARED): Scale construction and psychometric characteristics. Journal of the American Academy of Child & Adolescent Psychiatry, 36, 545–553. Bittner, A., Egger, H. L., Erkanli, A., Costello, E. J., Foley, D. L., & Angold, A. (2007). What do childhood anxiety disorders predict? Journal of Child Psychology and Psychiatry, 48, 1174–1183. Bouton, M. E., Mineka, S., & Barlow, D. H. (2001). A modern learning theory perspective on the etiology of panic disorder. Psychological Review, 108, 4–32. Bowlby, J. (1977). The making and breaking of affectional bonds: I. Aetiology and psychopathology in the light of attachment theory. British Journal of Psychiatry, 130, 201–210. Brown, T. A., DiNardo, P. A., & Barlow, D. H. (1994). Anxiety Disorders Interview Schedule for DSM-IV (ADIS-IV). San Antonio, TX: Psychological Corporation/Graywind. Brown-Jacobsen, A. M., Wallace, D. P., & Whiteside, S. P. (2011). Multimethod, multi-informant agreement, and positive predictive value in the identification of
child anxiety disorders using the SCAS and ADIS-​C. Assessment, 18, 382–​392. Calvocoressi, L., Lewis, B., Harris, M., Trufan, S. J., Goodman, W. K., McDougle, C. J., & Price, L. H. (1995). Family accommodation in obsessive–​ compulsive disorder. American Journal of Psychiatry, 152, 441–​443. Chorpita, B. F., Albano, A. M., Heimberg, R. G., & Barlow, D. H. (1996). A systematic replication of the prescriptive treatment of school refusal behavior in a single subject. Journal of Behavior Therapy and Experimental Psychiatry, 27, 281–​290. Cicchetti, D., & Cohen, D. J. (1995). Perspectives on developmental psychopathology. In D. Cicchetti & D. Cohen (Eds.), Developmental psychopathology: Volume 1. Theory and methods (pp. 3–​20). New York, NY: Wiley. Costello, J., Egger, H., Copeland, W., Erkanli, A., & Angold, A. (2011). The developmental epidemiology of anxiety disorders: Phenomenology, prevalence, and comorbidity. In W. K. Silverman & A. Field (Eds.), Anxiety disorders in children and adolescents (2nd ed., pp. 56–​76). New York, NY: Cambridge University Press. Costello, E. J., Mustillo, S., Erkanli, A., Keeler, G., & Angold, A. (2003). Prevalence and development of psychiatric disorders in children and adolescents. Archives of General Psychiatry, 60, 837–​844. Cowart M. J, & Ollendick, T. H. (2011). Attention training in socially anxious children: a multiple baseline design analysis. Journal of Anxiety Disorders, 25, 972–​977. Dadds, M. R., Barrett, P., Rapee, R., & Ryan, S. (1996). Family process and child anxiety and aggression:  An observational analysis. Journal of Abnormal Child Psychology, 24, 715–​734. Dadds, M. R., Perrin, S., & Yule, W. (1998). Social desirability and self-​reported anxiety in children: An analysis of the RCMAS Lie Scale. Journal of Abnormal Child Psychology, 26, 311–​317. Dallaire, D. H., & Weinraub, M. (2007). Infant–​ mother attachment security and children’s anxiety and aggression at first grade. Journal of Applied Developmental Psychology, 28, 477–​492. Davis, T. E., III, White, S. W., & Ollendick, T. H. (Eds.). (2014). Handbook of autism and anxiety. New York, NY: Springer. Dierker, L. C., Albano, A. M., Clarke, G. N., Heimberg, R. G., Kendall, P. C., Merikangas, K. R., . . . Kupfer, D. J. (2001). Screening for anxiety and depression in early adolescence. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 929–​936. Eisen, A. R., & Silverman, W. K. (1993). Should I relax or change my thoughts? A preliminary study of the treatment of overanxious disorder in children. Journal of Cognitive Psychotherapy, 7, 265–​280. Eisen, A. R., & Silverman, W. K. (1998). Prescriptive treatment for generalized anxiety disorder in children. Behavior Therapy, 29, 105–​121.

Eley, T. C. (2001). Contributions of behavioral genetics research:  Quantifying genetic, shared environmental and nonshared environmental influences. In M. W. Vasey & M. R. Dadds (Eds.), The developmental psychopathology of anxiety (pp. 45–​59). London, UK: Oxford University Press. Esbjørn, B. H., Bender, P. K., Reinholdt-​ Dunne, M. L., Munck, L. A., & Ollendick, T. H. (2012). The development of anxiety disorders: Considering the contributions of attachment and emotion regulation. Clinical Child and Family Psychology Review, 15, 129–​143. Essau, C. A. (2003). Comorbidity of anxiety disorders in adolescents. Depression and Anxiety, 18, 1–​6. Essau, C. A., & Barrett, P. M. (2001). Developmental issues in the assessment of anxiety. In C. A. Essau & F. Peterman (Eds.), Anxiety disorders in children and adolescents:  Epidemiology, risk factors, and treatment (pp. 75–​109). London, UK: Harwood. Essau, C. A., Conradt, J., & Petermann, F. (2000). Frequency, comorbidity and psychosocial impairment of anxiety disorders in German adolescents. Journal of Anxiety Disorders, 14, 263–​279. Essau, C. A., Conradt, J., & Petermann, F. (2002). Course and outcome of anxiety disorders in adolescents. Journal of Anxiety Disorders, 16, 67–​81. Eysenck, H. J., & Eysenck, M. W. (1985). Personality and individual differences. New York, NY: Plenum. Ezpeleta, L., Keeler, G., Alaatin, E., Costello, E. J., & Angold, A. (2001). Epidemiology of psychiatric disability in childhood and adolescence. Journal of Child Psychology and Psychiatry, 42, 901–​914. Feldman, R. (2016). The neurobiology of mammalian parenting and the biosocial context of human caregiving. Hormones and Behavior, 77, 3–​17. Field, A. P., & Lawson, J. (2003). Fear information and the development of fears during childhood:  Effects on implicit fear responses and behavioural avoidance. Behaviour Research and Therapy, 41, 1277–​1293. Field, A. P., & Lester, K. J. (2010). Is there room for “development” in developmental models of information processing biases to threat in children and adolescents?. Clinical Child and Family Psychology Review, 13, 315332. Fraccaro, R. L., Stelniki, A. M., & Nordstokke, D. W. (2015). Test review:  Multidimensional Anxiety Scale for Children (2nd ed.). Canadian Journal of School Psychology, 30, 70–​77. Franco, X., Saavedra, L., & Silverman, W. K. (2007). External validation of comorbid patterns of anxiety disorders in children and adolescents. Journal of Anxiety Disorders, 21, 717–​729. Goldfried, M. R., & Wolfe, B. E. (1998). Toward a more clinically valid approach to therapy research. Journal of Consulting and Clinical Psychology, 66, 143–​150.

Gregory, A. M., & Eley, T. C. (2011). The genetic basis of child and adolescent anxiety. In W. K. Silverman & A. Field (Eds.), Anxiety disorders in children and adolescents (2nd ed., pp. 161–​178). New York, NY: Cambridge University Press. Grills, A. E., & Ollendick, T. H. (2003). Multiple informant agreement and the Anxiety Disorders Interview Schedule for Parents and Children. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 30–​40. Hallett, V., Lecavalier, L., Sukhodolsky, D. G., Cipriano, N., Aman, M. G., McCracken, J. T., . . . Scahill, L. (2013). Exploring the manifestations of anxiety in children with autism spectrum disorders. Journal of Autism and Developmental Disorders, 43, 2341–​2352. Hayes, S. C., Nelson, R. O., & Jarrett, R. B. (1987). The treatment utility of assessment:  A functional approach to evaluating assessment quality. American Psychologist, 42, 963–​974. Heyne, D., Sauter, F. M., Ollendick, T. H., van Widenfelt, B. M., & Westerberg, M. W. (2014). Developmentally sensitive cognitive behavioral therapy for adolescents with school refusal: Rationale and case illustration. Clinical Child and Family Psychology Review, 17, 191–​215. Higa-​McMillan, C. K., Francis, S. E., Rith-​Najarian, L., & Chorpita, B. F (2016). Evidence base update: 50 years of research on treatment for child and adolescent anxiety. Journal of Clinical Child & Adolescent Psychology, 45, 91–​113. Hirshfeld-​Becker D. R., Biederman, J., Henin, A., Faraone, S. V., Davis, S., Harrington, K., & Rosenbaum, J. (2007). Behavioral inhibition in preschool children at risk is a specific predictor of middle childhood social anxiety:  A 5-​year follow-​up. Journal of Developmental and Behavioral Pediatrics, 28, 225–​233. Hudson, J. L., & Rapee, R. M. (2002). Parent–​child interactions in clinically anxious children and their siblings. Journal of Clinical Child and Adolescent Psychology, 31, 548–​555. Insel, T. R., Scanlan, J., & Champoux, M. (1988). Rearing paradigm in a nonhuman primate affects response to b-​ CCE challenge. Psychopharmacology, 96, 81–​86. Jones, J. D., Lebowitz, E. R., Marin, C. E., & Stark, K. D. (2015). Family accommodation mediates the association between anxiety symptoms in mothers and children. Journal of Child and Adolescent Mental Health, 27, 41–​51. Kaat, A. J., Gadow, K. D., & Lecavalier, L. (2013). Psychiatric symptom impairment in children with autism spectrum disorders. Journal of Abnormal Child Psychology, 41, 959–​969. Kaat, A. J. & Lecavalier, L. (2015). Reliability and validity of parent and child-​rated anxiety measures in autism spectrum disorder, Journal of Autism and Developmental Disorders, 45, 3219–​3231.

Kagan, E. R., Peterman, J. S., Carper, M. M., & Kendall, P. C. (2016). Accommodation and treatment of anxious youth. Depression and Anxiety, 33, 840–​847. Kazdin, A. E. (1999). The meanings and measurement of clinical significance. Journal of Consulting and Clinical Psychology, 67, 332–​339. Kearney, C. A. (2002). Identifying the function of school refusal behavior:  A revision of the School Refusal Assessment Scale. Journal of Psychopathology and Behavioral Assessment, 24, 235–​245. Kearney, C. A. (2003). Bridging the gap among professionals who address youths with school absenteeism: Overview and suggestions for consensus. Professional Psychology: Research and Practice, 34, 57–​65. Kearney, C. A., Pursell, C., & Alvarez, K. (2001). Treatment of school refusal behavior in children with mixed functional profiles. Cognitive and Behavioral Practice, 8, 3–​11. Kearney, C. A., & Silverman, W. K. (1990). A preliminary analysis of a functional model of assessment and treatment for school refusal behavior. Behavior Modification, 14, 340–​366. Kearney, C. A., & Silverman, W. K. (1993). Measuring the function of school refusal behavior: The School Refusal Assessment Scale. Journal of Clinical Child Psychology, 22, 85–​96. Kearney, C. A., & Silverman, W. K. (1999). Functionally based prescriptive and nonprescriptive treatment for children and adolescents with school refusal behavior. Behavior Therapy, 30, 673–​695. Kendall, P. C., Flannery-​Schroeder, E., Panichelli-​Mindel, S. M., Southam-​Gerow, M., Henin, A., & Warman, M. (1997). Treatment of anxiety disorders in youth: A second randomized clinical trial. Journal of Consulting and Clinical Psychology, 65, 366–​380. Kovacs, M. (1992). Children’s depression inventory: Manual. North Tonawanda, NY: Multi-​Health Systems. La Greca, A. M., & Lopez, N. (1998). Social anxiety among adolescents:  Linkages with peer relations and friendships. Journal of Abnormal Child Psychology, 26, 83–​94. La Greca, A. M., & Stone, W. L. (1993). Social Anxiety Scale for Children-​Revised:  Factor structure and concurrent validity. Journal of Clinical Child Psychology, 22, 7–​27. Langley, A. K., Bergman, R. L., McCracken, J., & Piacentini, J. C. (2004). Impairment in childhood anxiety disorders:  Preliminary examination of the Child Anxiety Impact Scale-​ Parent version. Journal of Child and Adolescent Psychopharmacology, 14, 105–​114. Langley, A. K., Falk, A., Peris, T., Wiley, J. F., Kendall, P. C., Ginsburg, G., . . . Piacentini, J. (2014). The Child Anxiety Impact Scale (CAIS):  Examining parent-​and child-​reported impairment in child anxiety disorders. Journal of Clinical Child and Adolescent Psychology, 43, 579–​591.

Lebowitz, E. R., Leckman, J. F., Feldman, R., Zagoory-​Sharon, O., McDonald, N., & Silverman, W. K. (2016). Salivary oxytocin in clinically anxious youth:  Associations with separation anxiety and family accommodation. Psychoneuroendocrinology, 65, 35–​43. Lebowitz, E. R., Leckman, J. F., Silverman, W. K., & Feldman, R. (2016). Cross-​generational influences on childhood anxiety disorders: Pathways and mechanisms. Journal of Neural Transmission, 123, 1053–​1067. Lebowitz, E. R., & Omer, H. (2013). Treating childhood and adolescent anxiety:  A guide for caregivers. Hoboken, NJ: Wiley. Lebowitz, E. R., Omer, H., Hermes, H., & Scahill, L. (2014). Parent training for childhood anxiety disorders:  The SPACE program. Cognitive and Behavioral Practice, 21, 456–​469. Lebowitz, E. R., Panza, K. E., & Bloch, M. H. (2016). Family accommodation in obsessive–​compulsive and anxiety disorders:  A five-​year update. Expert Review of Neurotherapeutics, 16, 45–​53. Lebowitz, E. R., Panza, K. E., Su, J., & Bloch, M. H. (2012). Family accommodation in obsessive–​compulsive disorder. Expert Review of Neurotherapeutics, 12, 229–​238. Lebowitz, E. R., Scharfstein, L. A., & Jones, J. (2014). Comparing family accommodation in pediatric obsessive–​compulsive disorder, anxiety disorders, and nonanxious children. Depression and Anxiety, 31, 1018–​1025. Lebowitz, E. R., Scharfstein, L., & Jones, J. (2015). Child-​ report of family accommodation in pediatric anxiety disorders:  Comparison and integration with mother-​ report. Child Psychiatry and Human Development, 46, 501–​511. Lebowitz, E. R., Shic, F., Campbell, D., Basile, K., & Silverman, W. K. (2015). Anxiety sensitivity moderates behavioral avoidance in anxious youth. Behaviour Research and Therapy, 74, 11–​17. Lebowitz, E. R., Shic, F., Campbell, D., MacLeod, J., & Silverman, W. K. (2015). Avoidance moderates the association between mothers’ and children’s fears: Findings from a novel motion-​tracking behavioral assessment. Depression and Anxiety, 32, 91–​98. Lebowitz, E. R., Woolston, J., Bar-​Haim, Y., Calvocoressi, L., Dauser, C., Warnick, E., . . . Leckman, J. F. (2013). Family accommodation in pediatric anxiety disorders. Depression and Anxiety, 30, 47–​54. Lecavalier, L., Wood, J. J., Halladay, A., Jones, N., Aman, M. G., Cook, E. H., . . . Scahill, L. (2014). Measuring anxiety as a treatment endpoint in youth with autism spectrum disorder. Journal of Autism and Developmental Disorders, 44, 1128–​1143. Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27–​66.

Lonigan, C. J., Carey, M. P., & Finch, A. J., Jr. (1994). Anxiety and depression in children and adolescents:  Negative affectivity and the utility of self-​ reports. Journal of Consulting and Clinical Psychology, 62, 1000–​1008. Lonigan, C. J., Phillips, B. M., Wilson, S. B., & Allan, N. P. (2011). Temperament and anxiety in children and adolescents. In W. K. Silverman & A. Field (Eds.), Anxiety disorders in children and adolescents (2nd ed., pp. 198–​ 226). New York, NY: Cambridge University Press. Lyneham, H. J., Abbott, M. J., & Rapee, R. M. (2007). Interrater reliability of the Anxiety Disorders Interview Schedule for DSM-​ IV:  Child and parent version. Journal of the American Academy of Child & Adolescent Psychiatry, 46, 731–​736. MacDonald, K., & Feifel, D. (2014). Oxytocin’s role in anxiety: A critical appraisal. Brain Research, 1580, 22–​56. March, J. S. (2013). Multidimensional Anxiety Scale for Children, 2nd edition. Toronto, Ontario, Canada: Multi-​ Health Systems. March, J. S., Parker, J. D.  A., Sullivan, K., Stallings, P., & Conners, K. (1997). The Multidimensional Anxiety Scale for Children (MASC): Factor structure, reliability, and validity. Journal of the American Academy of Child & Adolescent Psychiatry, 36, 554–​565. Maric, M., Prins, P. J. M., & Ollendick, T. H. (Eds.). (2015). Moderators and mediators of youth treatment outcomes. Oxford, UK: Oxford University Press. Mash, E. J., & Terdal, L. G. (Eds.). (1988). Behavioral assessment of childhood disorders (2nd ed.). New York, NY: Guilford. Menzies, R. G., & Clarke, J. C. (1995). The etiology of phobias:  A nonassociative account. Clinical Psychology Review, 15, 23–​48. Merikangas, K. R., He, J., Burstein, M., Swanson, S. A., Avenevoli, S., Cui, L.,  .  .  .  Swendsen, J. (2010). Lifetime prevalence of mental disorders in U.S.  adolescents:  Results from the National Comorbidity Study-​Adolescent Supplement (NCS-​A). Journal of the American Academy of Child & Adolescent Psychiatry, 49, 980–​989. Mineka, S., Gunnar, M., & Champoux, M. (1986). Control and early socioemotional development:  Infant rhesus monkeys reared in controllable versus uncontrollable environments. Child Development, 57, 1241–​1256. Motoca, L. M., & Silverman, W. K. (2011). Treatment: An update and recommendations. In W. K. Silverman & A. Field (Eds.), Anxiety disorders in children and adolescents (2nd ed., pp. 392–​418). New York, NY: Cambridge University Press. Muris, P., Merckelbach, H., Ollendick, T. H., King, N. J., & Bogie, N. (2002). Three traditional and three new childhood anxiety questionnaires: Their reliability and validity in a normal adolescent sample. Behaviour Research and Therapy, 40, 753–​772.

Muris, P., & Ollendick, T. H. (2005). The role of temperament in the etiology of child psychopathology. Clinical Child and Family Psychology Review, 8, 271–​289. Muris, P., Ollendick, T. H., Roelofs, J., & Austin, K. (2014). The Short Form of the Fear Survey Schedule for Children-​Revised (FSSC-​R-​SF):  An efficient, reliable, and valid scale for measuring fear in children and adolescents. Journal of Anxiety Disorders, 28, 957–​965. Nelson-​Gray, R. O. (2003). Treatment utility of psychological assessment. Psychological Assessment, 15, 521–​531. Norman, K., Silverman, W. K., & Lebowitz, E. R. (2015). Family accommodation of child and adolescent Anxiety: Mechanisms, assessment, and treatment. Journal of Child and Adolescent Psychiatric Nursing, 28, 131–​140. Ollendick, T. H. (1983). Reliability and validity of the Revised Fear Survey Schedule for Children (FSSC-​R). Behaviour Research and Therapy, 21, 395–​399. Ollendick, T. H. (1995). Cognitive behavioral treatment of panic disorder with agoraphobia in adolescents: A multiple baseline design analysis. Behavior Therapy, 26, 517–​531. Ollendick, T. H. (1998). Panic disorder in children and adolescents: New developments, new directions. Journal of Clinical Child Psychology, 27, 234–​245. Ollendick, T. H., & Benoit, K. (2012). A parent–​child interactional model of social anxiety disorder in youth. Clinical Child and Family Psychology Review, 15, 81–​91. Ollendick, T. H., Hagopian, L. P., & Huntzinger, R. M. (1991). Cognitive–​ behavior therapy with nighttime fearful children. Journal of Behavior Therapy and Experimental Psychiatry, 22, 113–​121. Ollendick, T. H., & Hersen, M. (1993). Child and adolescent behavioral assessment. In T. H. Ollendick & M. Hersen (Eds.), Handbook of child and adolescent assessment (pp. 3–​14). New York, NY: Pergamon. Ollendick, T. H., & King, N. J. (1991). Origins of childhood fears: An evaluation of Rachman’s theory of fear acquisition. Behaviour Research and Therapy, 29, 117–​123. Ollendick, T. H., King, N. J., & Chorpita, B. (2006). Empirically supported treatments for children and adolescents. In P. C. Kendall (Ed.), Child and adolescent therapy (3rd ed., pp. 492–​520). New York, NY: Guilford. Ollendick, T. H., & March, J. S. (Eds.). (2004). Phobic and anxiety disorders: A clinician’s guide to effective psychosocial and pharmacological interventions. New  York, NY: Oxford University Press. Ollendick, T. H., Mattis, S. G., & Birmaher, B. (2004). Panic disorder. In T. L. Morris & J. S. March (Eds.), Anxiety disorders in children and adolescents (2nd ed., pp. 189–​ 211). New York, NY: Guilford. Ollendick, T. H., & Seligman, L. D. (2006). Anxiety disorders in children and adolescents. In C. Gillberg, R. Harrington, & H.-​ C. Steinhausen (Eds.), Clinician’s desk book of child and adolescent psychiatry (pp. 144–​ 147). Cambridge, UK: Cambridge University Press.

Ost, L. G., Svensson, L., Hellstrom. K., & Lindwall, R. (2001). One-​session treatment of specific phobias in youths: A randomized clinical trial. Journal of Consulting and Clinical Psychology, 69, 814–​824. Perez-​ Olivas, G., Stevenson, J., & Hadwin, J. A. (2008). Do anxiety-​related attentional biases mediate the link between maternal over involvement and separation anxiety in children? Cognition and Emotion, 22, 509–​521. Perrin, S. & Last, C. (1992). Do childhood anxiety measures measure anxiety?. Journal of Abnormal Child Psychology, 20, 567–​578. Pina, A. A., Silverman, W. K., Saavedra, L. M., & Weems, C. F. (2001). An analysis of the RCMAS Lie scale in a clinic sample of anxious children. Journal of Anxiety Disorders, 15, 443–​457. Pine, D. S. (2011). The brain and behavior in childhood and adolescent anxiety disorders. In W. K. Silverman & A. Field (Eds.), Anxiety disorders in children and adolescents (2nd ed., pp. 179–​197). New York, NY: Cambridge University Press. Rachman, S. (1977). The conditioning theory of fear acquisition:  A critical examination. Behaviour Research and Therapy, 15, 375–​387. Radovic, A., Vona, P. L., Santostefano, A. M., Ciaravino, S., Miller, E., & Stein, B. D. (2016). Smartphone applications for mental health. Cyberpsychology, Behavior, and Social Networking, 19, 465–​470. Rapee, R. M. (2011). Treatments for childhood anxiety disorders: Integrating physiological and psychosocial interventions. Expert Reviews in Neurotherapeutics, 11, 1095–​1097. Rapee, R. M., Schniering, C. A., & Hudson, J. L. (2009). Anxiety disorders during childhood and adolescence:  Origins and treatment. Annual Review of Clinical Psychology, 5, 311–​341. Reich, W. (2000). Diagnostic Interview for Children and Adolescents (DICA). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 59–​66. Reuterskiöld, L., Ost, L. G., & Ollendick, T. (2008). Exploring child and parent factors in the diagnostic agreement on the Anxiety Disorders Interview Schedule. Journal of Psychopathology and Behavioral Assessment, 30, 279–​290. Reynolds, C. R., & Richmond B. O. (1978). What I think and feel:  A revised measure of children’s manifest anxiety. Journal of Abnormal Child Psychology, 6, 271–​280. Reynolds, C. R., & Richmond, B. O. (1985). Revised children’s manifest anxiety scale:  Manual. Los Angeles, CA: Western Psychological Services. Reynolds, C. R., & Richmond, B. O. (2008). Revised Children’s Manifest Anxiety Scale–​ Second Edition (RCMAS-​2). Los Angeles, CA:  Western Psychological Services. Reznick, J. S., Hegeman, I. M., Kaufman, E. R., Woods, S. W., & Jacobs, M. (1992). Retrospective and concurrent
self-​report of behavioral inhibition and their relation to adult mental health. Development and Psychopathology, 4, 301–​321. Rozenman, M., Weersing, V. R., & Amir, N. (2011). A case series of attention modification in clinically-​ anxious youths. Behaviour Research and Therapy, 49, 324–​330. RUPP Anxiety Study Group. (2002). The Pediatric Anxiety Rating Scale (PARS):  Development and psychometric properties. Journal of the American Academy of Child & Adolescent Psychiatry, 41, 1061–​1069. Saavedra, L. M., & Silverman, W. K. (2002). Classification of anxiety disorders in children: What a difference two decades make. International Review of Psychiatry, 14, 87–​100. Seligman, L. D., & Ollendick, T. H. (1998). Comorbidity of anxiety and depression in children and adolescents: An integrative review. Clinical Child and Family Psychology Review, 1, 125–​144. Seligman, L. D., Ollendick, T. H., Langley, A. K., & Baldacci, H. B. (2004). The utility of measures of child and adolescent anxiety:  A meta-​analytic review of the RCMAS, STAIC, and CBCL. Journal of Clinical Child and Adolescent Psychology, 33, 557–​565. Shaffer, D., Fisher, P., Lucas, C., Dulcan, M. K., & Schwab-​ Stone, M. E. (2000). NIMH Diagnostic Interview Schedule for Children Version IV (NIMH DISC-​IV): Description, differences from previous versions, and reliability of some common diagnoses. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 28–​38. Shamir-​Essakow, G., Ungerer, J. A., & Rapee, R. M. (2005). Attachment, behavioral inhibition, and anxiety in preschool children. Journal of Abnormal Child Psychology, 33, 131–​143. Silverman, W. K. (1991). Anxiety disorders interview schedule for children. Albany, NY: Graywind. Silverman, W. K. (1994). Structured diagnostic interviews. In T. H. Ollendick, N. J. King, & W. Yule (Eds.), International handbook of phobic and anxiety disorders in children and adolescents (pp. 293–​315). New  York, NY: Plenum. Silverman, W. K., & Albano, A. M. (1996). Anxiety Disorders Interview Schedule for Children for DSM-​ IV:  Child and Parent versions. San Antonio, TX:  Psychological Corporation/​Graywind. Silverman, W. K., & Berman, S. L. (2001). Psychosocial interventions for anxiety disorders in children:  Status and future directions. In W. K. Silverman & P. D.  A. Treffers (Eds.), Anxiety disorders in children and adolescents: Research, assessment and intervention (pp. 313–​334). Cambridge, UK: Cambridge University Press. Silverman, W. K., & Carter, R. (2006). Anxiety disturbance in girls and women. In J. Worell & C. Goodheart (Eds.), Handbook of girls’ and women’s psychological health (pp. 60–​68). New York, NY: Oxford University Press.

Silverman, W. K., & Eisen, A. R. (1992). Age differences in the reliability of parent and child reports of child anxious symptomatology using a structured interview. Journal of the American Academy of Child & Adolescent Psychiatry, 31, 117–​124. Silverman, W. K., Kurtines, W. M., Ginsburg, G. S., Weems, C. G., Lumpkin, P. W., & Carmichael, D. H. (1999). Treating anxiety disorders in children with group cognitive behavioral therapy: A randomized clinical trial. Journal of Consulting and Clinical Psychology, 67, 995–​1003. Silverman, W. K., La Greca, A. M., & Wasserstein, S. B. (1995). What do children worry about? Worry and its relation to anxiety. Child Development, 66, 671–​686. Silverman, W. K., & Nelles, W. B. (1988). The anxiety disorders interview schedule for children. Journal of the American Academy of Child & Adolescent Psychiatry, 27, 772–​778. Silverman, W. K., & Ollendick, T. H. (Eds.). (1999). Developmental issues in the clinical treatment of children. Needham Heights, MA: Allyn & Bacon. Silverman, W. K., & Ollendick, T. H. (2005). Evidence-​based assessment of anxiety and its disorders in children and adolescents. Journal of Clinical Child and Adolescent Psychology, 34, 380–​411. Silverman, W. K., & Rabian, B. (1995). Test–​retest reliability of the DSM-​III-​R childhood anxiety disorders symptoms using the Anxiety Disorders Interview Schedule for Children. Journal of Anxiety Disorders, 9, 1–​12. Silverman, W. K., & Saavedra, L. M. (2004). Assessment and diagnosis in evidence based practice. In P. M. Barrett & T. H. Ollendick (Eds.), Handbook of interventions that work with children and adolescents: Prevention and treatment (pp. 49–​69). New York, NY: Guilford. Silverman, W. K., & Serafini, L. T. (1998). Internalizing disorders. In M. Hersen & A. S. Bellack (Eds.), Behavioral assessment: A practical handbook (4th ed., pp. 342–​360). Needham Heights, MA: Allyn & Bacon. Silverman, W. K., & Weems, C. F. (1998). Anxiety sensitivity in children. In S. Taylor (Ed.), Anxiety sensitivity:  Theory, research and the treatment of the fear of anxiety (pp. 239–​268). Mahwah, NJ: Erlbaum. Skarphedinsson, G., Villabø, M. A., & Lauth, B. (2015). Screening efficiency of the self-​report version of the Multidimensional Anxiety Scale for Children in a highly comorbid inpatient sample. Nordic Journal of Psychiatry, 69, 613–​620. Spence, S. H. (1998). A measure of anxiety symptoms among children. Behaviour Research and Therapy, 36, 545–​566. Spielberger, C. D. (1973). Manual for the state-​trait anxiety inventory for children. Palo Alto, CA:  Consulting Psychologists Press.

Storch, E. A., Ehrenreich May, J., Wood, J. J., Jones, A. M., De Nadai, A. S., Lewin, A. B., . . . Murphy, T. K. (2012). Multiple informant agreement on the anxiety disorders interview schedule in youth with autism spectrum disorders. Journal of Child and Adolescent Psychopharmacology, 22, 292–​299. Turner, S. M., Beidel, D. C., & Wolff, P. L. (1996). Is behavioral inhibition related to the anxiety disorders? Clinical Psychology Review, 16, 157–​172. Ung, D., Arnold, E. B., De Nadai, A. S., Lewin, A. B., Phares, V., Murphy, T. K., & Storch, E. A. (2014). Inter-​rater reliability of the anxiety disorders interview schedule for DSM-​IV in high-​functioning youth with autism spectrum disorder. Journal of Developmental and Physical Disabilities, 26, 53–​65. Van Meter, A., Youngstrom, E., Youngstrom, J. K., Ollendick, T., Demeter, C., & Findling, R. L. (2014). Clinical decision making about child and adolescent anxiety disorders using the Achenbach system of empirically based assessment. Journal of Clinical Child & Adolescent Psychology, 43, 552–​565. Vasey, M. W., Daleiden, E. L., Williams, L. L., & Brown, L. M. (1995). Biased attention in childhood anxiety disorders: A preliminary study. Journal of Abnormal Child Psychology, 23, 267–​279. Vasey, M. W., Dalgleish, T., & Silverman, W. K. (2003). Research on information-​processing factors in child and adolescent psychopathology:  A critical commentary. Journal of Clinical Child and Adolescent Psychology, 32, 81–​93. Vasey, M. W., & Ollendick, T. H. (2000). Anxiety. In A. J. Sameroff, M. Lewis, & S. M. Miller (Eds.), Handbook of developmental psychopathology (2nd ed., pp. 511–​529). New York, NY: Kluwer/​Plenum. Vernberg, E. M., La Greca, A. M., Silverman, W. K., & Prinstein, M. J. (1996). Prediction of posttraumatic stress symptoms in children after Hurricane Andrew. Journal of Abnormal Psychology, 105, 237–​248. Villabø, M., Gere, M., Torgersen, S., March, J. S., & Kendall, P. C. (2012). Diagnostic efficiency of the child and parent versions of the Multidimensional Anxiety Scale for Children. Journal of Clinical Child and Adolescent Psychology, 41, 75–​85. Warren, S. L., Huston, L., Egeland, B., & Sroufe, L. A. (1997). Child and adolescent anxiety disorders and early attachment. Journal of the American Academy of Child & Adolescent Psychiatry, 36, 637–​644. Waters, A. M., Henry, J., Mogg, K., Bradley, B. P., & Pine, D. S (2010). Attentional bias towards angry faces in childhood anxiety disorders. Journal of Behaviour Therapy and Experimental Psychiatry, 41, 158–​164.

Warren, S. L., & Simmens, S. J. (2005). Predicting toddler anxiety/​depressive symptoms: Effects of caregiver sensitivity on temperamentally vulnerable children. Infant Mental Health Journal, 26, 40–​55. Waters, A. M., & Lipp, O. V. (2008). Visual search for emotional faces in children. Cognition and Emotion, 22, 1306–​1326. Watson, D., & Clark, L. A. (1984). Negative affectivity: The disposition to experience aversive emotional states. Psychological Bulletin, 96, 465–​490. Weems, C. F., Berman, S. L., Silverman, W. K., & Rodriguez, E. (2002). The relation between anxiety sensitivity and attachment style in adolescence and early adulthood. Journal of Psychopathology and Behavioral Assessment, 24, 159–​168. Weems, C. F., & Silverman W. K. (2006). An integrative model of control: Implications for understanding emotion regulation and dysregulation in childhood anxiety. Journal of Affective Disorders, 91, 113–​124. Weems, C., & Silverman, W. K. (2017). Anxiety disorders. In T. P. Beauchaine & S. P. Hinshaw (Eds.), Child and adolescent psychopathology (pp. 531–​559). New  York, NY: Wiley. Weems, C. F., Silverman, W., & La Greca A. M. (2000). What do youth referred for anxiety problems worry about? Worry and its relation to anxiety and anxiety disorders in children and adolescents. Journal of Abnormal Child Psychology, 28, 63–​72.

Weems, C. F., Silverman, W. K., Rapee, R. R., & Pina, A. A. (2003). The role of control in childhood anxiety disorders. Cognitive Therapy and Research, 27, 557–​568. Weems, C. F., Silverman, W. K., Saavedra, L. M., Pina, A. A., & Lumpkin, P. W. (1999). The discrimination of children’s phobias using the Revised Fear Survey Schedule for Children. Journal of Child Psychology and Psychiatry, 40, 941–​952. Wei, C., Hoff, A., Villabø, M. A., Peterman, J., Kendall, P. C., Piacentini, J., . . . March, J. (2014). Assessing anxiety in youth with the Multidimensional Anxiety Scale for Children (MASC). Journal of Clinical Child and Adolescent Psychology, 43, 566–​578. Whaley, S. E., Pinto, A., & Sigman, M. (1999). Characterizing interactions between anxious mothers and their children. Journal of Consulting and Clinical Psychology, 67, 826–​836. Wood, J. J., Piacentini, J. C., Bergman, R. L., McCracken, J., & Barrios, V. (2002). Concurrent validity of the anxiety disorders section of the Anxiety Disorders Interview Schedule for DSM-​IV: Child and Parent Versions. Journal of Clinical Child and Adolescent Psychology, 31, 335–​342. Woodward, L. J., & Fergusson, D. M. (2001). Life course outcomes of young people with anxiety disorders in adolescence. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 1086–​1093.

12

Specific Phobia and Social Anxiety Disorder

Karen Rowa
Randi E. McCabe
Martin M. Antony

Although it is widely accepted that assessment procedures are an important part of understanding and treating anxiety-based problems, relatively little attention has been paid to developing and studying comprehensive, evidence-based assessment protocols for the anxiety disorders. This is in contrast to the extensive attention that has been paid to empirically supported treatment interventions for anxiety disorders during the past two decades (e.g., Olatunji, Cisler, & Deacon, 2010) and to the large number of studies regarding the psychometric properties of particular anxiety measures. The importance of empirically supported assessment procedures for anxiety disorders has been discussed (e.g., Antony & Rowa, 2005), with the hope that research on these procedures and protocols will add to our growing knowledge regarding the reliability and validity of particular instruments. It is useful to summarize what we know about commonly used assessment tools and procedures because this can provide guidance to clinicians and researchers regarding the most appropriate tools for various assessment tasks. This chapter reviews the scientific status and clinical utility of the most commonly practiced assessment procedures for two of the anxiety disorders—​specific phobia and social anxiety disorder.

NATURE OF SPECIFIC PHOBIA AND SOCIAL ANXIETY DISORDER

Diagnostic Considerations

Specific phobia and social anxiety disorder (SAD) share a number of features, including the presence of excessive
fear, anxious apprehension, and avoidance behavior. However, it is the focus of fear that distinguishes between the two anxiety disorders. In specific phobia, the excessive fear is focused on a particular situation or object (e.g., animals or insects, heights, seeing blood or receiving a needle, driving, and enclosed places, among others). In SAD, the excessive fear is focused on one or more social and performance situations in which the individual fears acting in a way that will be embarrassing or lead to negative evaluation by others (e.g., public speaking, parties, being assertive, making small talk, and dating) or revealing unbecoming personal attributes (Moscovitch, 2009). Apart from the focus of the fear, the diagnostic criteria as outlined in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​5; American Psychiatric Association [APA], 2013) have significant similarities for these two disorders. For both conditions, exposure to the feared stimulus typically results in an immediate anxiety response that may escalate into a full-​blown panic attack, and feared situations are avoided or endured with distress. Symptoms are persistent, lasting 6 months or longer. Moreover, the symptoms (i.e., fear, anxious apprehension, and avoidance) cause the individual significant distress or impairment in functioning and are not better explained by another mental disorder. In addition, for SAD, the fear is not due to the physiological effects of a substance or a general medical condition, and if a general medical condition or another mental disorder is present, the fear is unrelated to it (e.g., a fear of shaking in the presence of Parkinson’s disease or a fear of eating in public in the presence of an eating disorder would not be indicative of SAD). When assigning a diagnosis of specific phobia, the category of fear is specified as one of
five types: animal (e.g., dogs, birds, and insects), natural environment (e.g., heights and water), blood–injection–injury (e.g., getting a needle and seeing blood), situational (e.g., driving, enclosed places, and flying), and other (e.g., fear of choking or vomiting). When assigning a diagnosis of SAD, the specifier performance only may be used in cases in which the fear is limited to speaking or performing in public. Evidence suggests that avoidant personality disorder and SAD may be alternative conceptualizations of the same disorder, with avoidant personality disorder representing a more severe and more generalized form of the condition (Ralevski et al., 2005), although some have argued against this assertion (e.g., Eikenaes, Egeland, Hummelen, & Wilberg, 2015). Regardless, the assessment measures reviewed in this chapter are likely insufficient to properly assess avoidant personality disorder. Indeed, there is variability even within a group of people diagnosed with SAD that needs to be considered for assessment. Hoffmann, Heinrichs, and Moscovitch (2004) noted that although individuals meeting symptom criteria for a diagnosis of SAD share certain specific features, in reality, they are a heterogeneous group that may be better characterized along a dimensional continuum of emotional response and behavioral tendencies encompassing fearfulness, anxiousness, shyness, self-consciousness, submissiveness, and anger.

Epidemiology and Descriptive Psychopathology

Specific phobia is one of the most common anxiety disorders, with a lifetime prevalence estimate of 12.5% (e.g., Kessler et al., 2005). Studies in community samples revealed a median lifetime prevalence rate of 6.65% for SAD (Fehm, Pelissolo, Furmark, & Wittchen, 2005), although the replication of the National Comorbidity Survey suggested a lifetime prevalence of 12% (Kessler et al., 2005). Studies of college samples suggest a point prevalence of 11.6% for SAD (Baptista et al., 2012), adding further evidence that SAD is a common disorder. There is no evidence that the prevalence rates of anxiety disorders (including SAD and specific phobia) have changed significantly during the past few decades (Bandelow & Michaelis, 2015), lending support for earlier prevalence estimates. Specific phobias are more common in women than in men (e.g., Curtis, Magee, Eaton, Wittchen, & Kessler, 1998), although there is variability across specific phobia types with respect to gender and prevalence. Similarly, SAD is slightly more common in women than in men (Fehm et al., 2005).

The majority of specific phobias (animal, blood–injection–injury, and natural environment) have an onset in childhood, typically before age 15 years (de Lijster et al., 2017). However, situational-type phobias (e.g., flying, driving, and elevators) have a later age of onset, typically in late adolescence or early adulthood (e.g., Antony, Brown, & Barlow, 1997b; Lipsitz, Barlow, Mannuzza, Hofmann, & Fyer, 2002). Onset of SAD is typically during childhood and adolescence, with a range of 13 to 24 years in clinical studies and 10 to 16.6 years in epidemiological studies (Wittchen & Fehm, 2003). Later onset of SAD (after the age of 25 years) is rare and typically secondary to, or encompassed by, a separate mental disorder, such as depression or an eating disorder (Koyuncu et al., 2015; Wittchen & Fehm, 2003). Specific phobias are often comorbid with other specific phobias and other anxiety disorders (Curtis et al., 1998; Sanderson, DiNardo, Rapee, & Barlow, 1990). However, in the latter case, specific phobia tends to be of lesser severity than the comorbid condition (Sanderson et al., 1990). SAD is associated with a high degree of comorbidity. It is estimated that 50% to 80% of individuals with SAD have at least one other mental disorder, most commonly other anxiety disorders, major depressive disorder, and substance use disorders (Fehm et al., 2005; Wittchen & Fehm, 2003). Anxiety disorders, including SAD and specific phobia, tend to precede the onset of comorbid depression, with some evidence that interpersonal difficulties related to anxiety (e.g., interpersonal sensitivity) may partially explain this association in SAD (Starr, Hammen, Connolly, & Brennan, 2014). Evidence suggests that specific phobia can be associated with high levels of psychosocial impairment in some cases (Essau, Conradt, & Petermann, 2000), and type of phobia may be important. For example, Ollendick, Raishevich, Davis, Sirbu, and Öst (2010) found that, as assessed by parent report, adolescents with natural environment phobias had more social problems than those with animal phobias. SAD is associated with a significant degree of impairment and disability that increases over the individual's lifespan (e.g., Fehm et al., 2005; Wittchen & Fehm, 2003). The disruption in quality of life in SAD is similar to impairments found in other anxiety disorders (e.g., Barrera & Norton, 2009). One study found that 21% of individuals with SAD had clinically severe impairment (defined as being two or more standard deviations below the community norm) in quality of life (Rapaport, Clary, Fayyad, & Endicott, 2005). The presence of SAD (at threshold or subthreshold levels) as a comorbid condition among individuals with panic disorder or
generalized anxiety disorder is related to poor quality of life (Camuri et al., 2014). In addition, there is some evidence that safety behaviors used by individuals with SAD are related to significant impairment in social performance (e.g., Rowa et al., 2015; Stangier, Heidenreich, & Schermelleh-Engel, 2006).

Etiology

Genetic factors appear to play a role in the development of both specific phobia and SAD, though to different degrees. First-degree relatives of individuals with specific phobia or SAD have an increased risk of having the disorder compared to first-degree relatives of never mentally ill controls (Fyer et al., 1990; Steinhausen, Jakobsen, Meyer, Jørgensen, & Lieb, 2016), although family aggregation appears to be stronger for SAD compared to specific phobias (Steinhausen et al., 2016). Twin studies suggest that genetic influences may be different depending on the type of phobic stimulus in SAD and specific phobias. For example, recent research does not support genetic factors in situational specific phobias, whereas genetic factors influence the development of other phobias (Loken, Hettema, Aggen, & Kendler, 2014). Some studies have failed to find a genetic role in the development of phobias at all (Skre et al., 2000). Genetic influences in SAD also seem to vary by age; in one study, younger individuals were significantly more affected by genetic influences than were adults (Scaini, Belotti, & Ogliari, 2014). Further research is needed to more fully understand the genetic contributions both across anxiety disorders and within subtypes of particular disorders.

Rachman (1977) proposed three pathways to fear development: direct conditioning (being hurt or frightened by the phobic object or situation), vicarious acquisition (witnessing a traumatic event or seeing someone behave fearfully in the phobic situation), and informational transmission (through messages received from others). Numerous studies have found support for this model (for a review, see McCabe, Hood, & Antony, 2015). In addition to these learning processes, a fourth nonassociative pathway has been proposed to explain findings that are not accounted for by an associative model (e.g., some fears emerge without any prior associative learning experience). According to Poulton and Menzies (2002), a limited number of fears are innate or biologically determined and are adaptive from an evolutionary perspective. Other factors that may play a role in phobia development include the tendency to experience "disgust" in response to certain stimuli
(i.e., disgust sensitivity), cognitive variables (e.g., information processing biases), and environmental factors (e.g., the context of a traumatic event, stress, and previous and subsequent exposure to a phobic stimulus) (McCabe et al., 2015). In addition to genetics and learning pathways, research has uncovered a number of specific risk factors associated with increased vulnerability for onset of SAD, including familial environment (overprotective or rejecting parenting style, parental modeling, and degree of exposure to social situations) and behavioral–temperamental style (elevated behavioral inhibition as a child) (Wittchen & Fehm, 2003). Rapee and Spence (Rapee & Spence, 2004; Spence & Rapee, 2016) have proposed an etiological model of SAD that attempts to capture the complexity of SAD based on available research evidence. According to their model, individuals have a "set point" level of social anxiety that is somewhat stable and consistent and is directed by broad genetic factors (e.g., general emotionality and sociability). The individual's set point is altered (up or down) largely by environmental factors (e.g., parents, peers, negative life events, culture, interrupted social performance, and poor social skills), which in turn have a reciprocal relationship with levels of social anxiety. These environmental factors operate as powerful influences due to their timing (critical stage of vulnerability), impact (intensity or meaning of the event), or chronicity. Furthermore, the set point can be altered by protective factors (e.g., parenting style), decreasing the risk of developing SAD. For example, a significant number of individuals with SAD report a history of being severely teased or bullied (McCabe, Antony, Summerfeldt, Liss, & Swinson, 2003), and peer exclusion predicts symptoms of social anxiety in young adults (Levinson, Langer, & Rodebaugh, 2013). On the other hand, recent research illustrates the protective effects of social support for individuals who are genetically predisposed to SAD (Reinelt et al., 2014). In summary, this model underscores the myriad interacting factors that can influence the development of SAD.

PURPOSES OF ASSESSMENT

Antony and Rowa (2005) suggested the following 10 common purposes for which assessments are used:

1. To establish a diagnosis
2. To measure the presence, absence, or severity of particular symptoms
3. To measure features that cannot be assessed directly through an interview or self-report scales (e.g., physiological processes and nonconscious processes)
4. To facilitate the selection of target problems for treatment planning
5. To measure a phenomenon that is of interest for research
6. To assess whether a particular treatment is "evidence based"
7. To include or exclude participants from a research study
8. To answer questions of interest for insurance companies (e.g., the presence of malingering)
9. To predict future behavior (e.g., treatment compliance)
10. To evaluate eligibility for employment, benefits, legal status, school placement, and so on

In order to evaluate whether an assessment procedure is evidence based, one must ask the question, For what purpose? A particular assessment protocol or measure may be empirically supported for some purposes but not others. In this chapter, we review assessment procedures for specific phobia and SAD as they are used for three main clinical purposes: (a) diagnosis, (b) case conceptualization and treatment planning, and (c) treatment monitoring and evaluation. For precise psychometric information on the measures we review (e.g., reliability values of scores with various samples), we encourage interested readers to consult the original sources cited in the following sections.

ASSESSMENT FOR DIAGNOSIS

Establishing a diagnosis for people suffering from specific phobia and SAD is important for a number of reasons. Diagnosis facilitates communication about the presenting problem and the accompanying symptoms, and it also allows for the selection of the most appropriate evidence-​ based treatments, many of which have been developed for particular disorders. Diagnostic clarification also helps clinicians distinguish among different conditions and make decisions about whether clinical issues are best conceptualized as separate problems or as different features of the same problem. For example, embarrassment about losing bowel control may lead to avoidance of situations similar to that seen in SAD, but it would often be better accounted for by a diagnosis of panic disorder or agoraphobia. Furthermore, achieving a broad diagnostic

picture can also aid a clinician in understanding the impact that comorbid conditions (e.g., substance use disorders and personality disorders) may have on the course and treatment outcome for specific phobia and SAD. The presence of certain comorbid conditions may influence a client’s readiness for treatment and decisions about the order of treatment interventions (e.g., whether to treat the anxiety disorder or the substance issue first) and the treatment process (e.g., the necessity to develop alternative coping strategies for someone who is using substances to manage symptoms of anxiety). Other chapters in this volume provide more details about the assessment of relevant comorbid conditions such as substance use disorders (see Chapter 17) and depression (see Chapters 6–​8). There is considerable debate about whether the focus on a particular diagnostic category should be replaced by assessment of continuous, transdiagnostic factors that cut across anxiety disorders (e.g., Gros, McCabe, & Antony, 2013). However, there is still much emphasis on diagnosis in clinical practice, and research supports a taxonic or categorical model of SAD compared to a dimensional model (e.g., Weeks, Carleton, Asmundson, McCabe, & Antony, 2010). For these reasons, there seems to be an ongoing, relevant place for diagnosis of SAD and specific phobia, supporting the need to review common diagnostic methods. Diagnoses can be established using unstructured clinical interviews, fully structured interviews, or semi-​ structured interviews (for a review, see Summerfeldt, Kloosterman, & Antony, 2010). The main way in which these approaches differ is in the level of standardization. Unstructured clinical interviews are the least standardized, and they are the most commonly used clinical interview format in routine practice. In unstructured interviews, clinicians ask whatever questions they view as appropriate for assessing the diagnostic features of particular disorders as well as other clinical characteristics of interest. However, research suggests that rates of diagnostic agreement using clinical interviews are often no better than chance (Spitzer & Fleiss, 1974), rendering the reliability and validity of diagnostic findings suspect. In contrast, fully structured interviews (e.g., the World Health Organization Composite International Diagnostic Interview [WHO-​CIDI]; Kessler & Ustun, 2004) are the most standardized format for diagnostic interviews. In these interviews, questions are always asked in the same way, and there is little flexibility to ask follow-​up questions or to ask for clarification. These interviews are designed to be used by trained lay interviewers, and they are primarily used in large epidemiological studies rather than by clinicians or clinical researchers. In addition, questions

TABLE 12.1 Ratings of Instruments Used for Diagnosis

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility |
| ADIS-IV | NA | NA | G | NR | E | G | G | E |
| SCID-IV | NA | NA | E | NR | E | G | G | E |

Note: ADIS-IV = Anxiety Disorders Interview Schedule for DSM-IV; SCID-IV = Structured Clinical Interview for DSM-IV; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

have been raised regarding the validity of anxiety disorder diagnoses as established by fully structured interviews (see Antony, Downie, & Swinson, 1998; Summerfeldt et al., 2010). Semi-​structured interviews include many of the advantages of both structured and unstructured interviews. Standard questions are asked to assess each of the diagnostic criteria necessary for making a diagnosis, but clinicians are permitted to ask follow-​up questions for clarification and to answer questions that respondents may have about particular questions. Semi-​structured interviews are the most common type of diagnostic interview used in clinical research, and they are occasionally used in routine clinical practice as well. Two of the most extensively studied semi-​structured interviews for diagnosing anxiety-​related problems including specific phobia and SAD are the Anxiety Disorders Interview Schedule for DSM-​ IV (ADIS-​ IV; Brown, DiNardo, & Barlow, 1994; DiNardo, Brown, & Barlow, 1994)  and the Structured Clinical Interview for DSM-​ IV/​Axis I Disorders (SCID-​IV; First, Spitzer, Gibbon, & Williams, 1996). With the publication of DSM-​5, updated versions of both these interviews have been published, although few data exist on the psychometric properties of these revised measures. As a result, much of the following discussion focuses on data from the SCID-​IV and ADIS-​ IV. Similar to the SCID-​IV and ADIS-​IV, both the SCID-​ 5 and ADIS-​5 provide systematic questions to establish a current diagnosis of specific phobia or SAD. Questions and initial probes are outlined for interviewers, ensuring that all clients receive the same questions, in the same order, using the same terminology. Subsequent follow-​up questions may deviate from the structured questions. For example, additional questions may be necessary to differentiate a specific phobia of driving from fears of driving associated with panic disorder (e.g., “What is the focus of your fear when driving? Having a panic attack? Being in an accident?”). These interviews provide decision trees to establish diagnoses once pertinent information is

collected. These instruments are described next in more detail, and a summary of the psychometric properties of the SCID-IV and ADIS-IV can be found in Table 12.1.

Anxiety Disorders Interview Schedule for DSM-IV and DSM-5 (ADIS-IV and ADIS-5)

The ADIS-IV (DiNardo et al., 1994) and ADIS-5 (Brown & Barlow, 2014) are clinician-administered semi-structured interviews that provide both diagnostic and dimensional information about a range of psychological problems, including anxiety and related disorders, mood disorders, somatoform disorders, and substance use disorders. Screening questions are provided for other mental disorders. Depending on the version of the ADIS-IV or ADIS-5 used (adult and lifetime versions), current and lifetime diagnoses can be ascertained. Clinicians require extensive training in the administration of this interview, and the interview duration can be lengthy (e.g., several hours). Despite these drawbacks for everyday practice, the ADIS has the benefit of providing clear criteria to help determine the presence or absence of specific phobia and SAD (as well as common comorbid disorders), as well as assessing useful information such as the degree of fear and avoidance in a variety of social settings. Indeed, the ADIS goes well beyond the SCID-IV or SCID-5 in terms of screening for a wide variety of social and performance situations in which a person may experience fear, increasing the chance that difficulties with social situations or a specific phobia will be identified. If initial inquiries reveal the possibility of symptoms of specific phobia or SAD, a number of follow-up questions are asked to assess the intensity of the fear, the frequency and breadth of avoidance, the level of distress and interference caused by symptoms, and other relevant variables. The ADIS-IV has demonstrated good reliability: Inter-rater reliability was strong for specific phobia and SAD, both when diagnosed as the principal or additional diagnosis (Brown, DiNardo, Lehman, &
Campbell, 2001). In fact, agreement between clinicians was good to excellent for most clinical diagnoses. The main source of disagreement for both specific phobia and SAD in this study involved rating the clinical severity of the disorder, with one clinician concluding that the disorder was clinically significant and another concluding that the disorder's severity did not exceed the clinical threshold for distress or impairment. This study suggested that there were very few disagreements between clinicians when deciding between SAD and another disorder. In other words, clinicians appeared to be successful in disentangling symptom presentations to reliably diagnose SAD.

Structured Clinical Interview for DSM-IV (SCID-IV) and DSM-5 (SCID-5)

The SCID-IV (First et al., 1996) and SCID-5 (First, Williams, Karg, & Spitzer, 2015) are also clinician-administered semi-structured interviews that provide diagnostic decisions about a wide range of psychiatric disorders. The SCID-IV is available in a clinician version (SCID-CV), a personality disorders version, and a research version (SCID-I). Four versions of the SCID-5 are available: a clinician version, a research version, a clinical trials version, and a personality disorders version. The clinician versions of this interview were designed for use in clinical settings and have less extensive coverage of disorders; the research versions have a broader focus. Current and lifetime diagnoses are obtained for many disorders. Extensive training is also required to administer the SCID in all forms, and administration can be lengthy, especially for the research version (i.e., 2 or 3 hours for a typical outpatient administration of the SCID-IV and even longer for the SCID-5). Much of the evidence for the psychometric properties of the SCID-IV is derived from research using an earlier version based on DSM-III-R (APA, 1987) criteria, and no research thus far has examined the psychometrics of the SCID-5. The revision of the SCID from DSM-III-R to DSM-IV (APA, 1994) involved minimal changes, but the revision to DSM-5 involved substantial changes. Earlier studies suggested that the SCID demonstrates adequate or better reliability (both inter-rater and test–retest) for most diagnoses in patient samples (Williams et al., 1992) but not in nonpatient samples. Symptom agreement and diagnostic accuracy using the SCID are also good (Ventura, Liberman, Green, Shaner, & Mintz, 1998). The SCID-IV has been
shown to reliably come to a diagnosis of SAD (Crippa et al., 2008). However, for some populations, reliability of diagnoses may be less robust. For example, a study examining DSM-III-R lifetime diagnoses in a substance-abusing population found poorer test–retest reliabilities for lifetime diagnoses than have been found in other studies (Ross, Swinson, Doumani, & Larkin, 1995). Given the comorbidity between SAD and substance use disorders, this finding is relevant to consider. A summary of the criterion-related validity of the SCID suggests that there is a high level of correspondence between SCID-derived diagnoses and other variables such as the clinical features of disorders, the course of conditions, and treatment outcome for certain conditions (Rogers, 1995).

Overall Evaluation

Because of their greater reliability, semi-structured interviews such as the ADIS-IV, ADIS-5, SCID-IV, or SCID-5 are preferable to either unstructured clinical interviews or fully structured clinical interviews, both in routine clinical practice and in clinical research. However, given the lack of research on the newly published ADIS-5 and SCID-5, firm conclusions about the reliability of these new versions cannot be made, and our recommendation of these measures arises from research on their predecessors. Although few studies have directly examined the validity of these measures, there is a vast body of research that indirectly supports their validity. Studies often compare diagnostic results from the ADIS-IV or the SCID-IV to scores on established questionnaire measures for social or specific phobia, finding strong convergence between the presence of the disorder (based on the interview) and the presence of relevant symptoms (based on self-report scales). For example, the widely used Social Interaction Anxiety Scale (Mattick & Clarke, 1998), a self-report measure of symptoms of social anxiety, demonstrated a 97% correct classification rate when compared to diagnoses of SAD made using the SCID-IV or ADIS-IV (Rodebaugh, Heimberg, Woods, Liebowitz, & Schneier, 2006). Therefore, it is likely preferable to use a semi-structured interview such as the ADIS or SCID, rather than an unstructured or fully structured interview, when establishing a diagnosis of specific phobia or SAD. Given the length and breadth of the SCID-5 and ADIS-5, clinicians may want to consider using the clinician version of the SCID-5 or the most pertinent sections of either interview to establish diagnoses while still constraining the length of the interview.
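
To make the reported classification statistics concrete, the sketch below shows how a correct classification rate of the kind cited for the Social Interaction Anxiety Scale is computed from a 2 x 2 cross-tabulation of questionnaire cutoff decisions against interview-based diagnoses. The counts and the cutoff logic are hypothetical illustrations, not the published SIAS analysis.

```python
# Hypothetical sketch: comparing a questionnaire cutoff decision against an
# interview-based (SCID/ADIS) diagnosis to obtain classification statistics.
# The counts below are invented for illustration and do not reproduce any
# published validation study.

def classification_stats(true_pos, false_pos, true_neg, false_neg):
    total = true_pos + false_pos + true_neg + false_neg
    return {
        "sensitivity": true_pos / (true_pos + false_neg),   # diagnosed cases correctly flagged
        "specificity": true_neg / (true_neg + false_pos),   # non-cases correctly not flagged
        "correct_classification_rate": (true_pos + true_neg) / total,
    }

# Hypothetical 2 x 2 table: scale score above cutoff (flagged) vs. interview diagnosis.
print(classification_stats(true_pos=46, false_pos=3, true_neg=48, false_neg=3))
```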

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

An important function of assessment is to gather information for the purpose of case conceptualization and planning treatment. Antony and Rowa (2005) reviewed the most important variables to assess when treating anxiety disorders, including the severity of the fear, degree of avoidance, subtle avoidance and safety behaviors, use of maladaptive coping strategies, anxious cognitions, motivation for treatment, treatment history, suitability for various forms of therapy, and the presence of skills deficits. In this section, we review a number of assessment measures designed to provide information on variables that are important to consider when developing an empirically supported treatment plan for an individual. Because empirically supported psychological treatments for specific phobia and SAD consist primarily of cognitive and behavioral strategies, treatment planning and conceptualization in this chapter will generally refer to preparing for a course of cognitive–behavioral therapy (CBT). See Table 12.2 for summary ratings of the instruments. Also, note that the diagnostic instruments described previously may be useful for gathering information relevant to treatment planning and conceptualization, and therefore such information (e.g., avoided situations and coping strategies) may already be known after completing a diagnostic assessment. The measures described in this section provide additional and complementary information to what is already known from a diagnostic assessment.

Self-Report Measures of Severity and Phenomenology—Specific Phobia

To plan an effective program of treatment for specific phobia, it is useful to understand the clinical presentation of the person's fear. Objectively, how severe is the individual's fear compared to that of others with a similar diagnosis? What situations or objects does the person avoid as a result of the fear? What kinds of anxious thoughts or worries does the person have when confronting a feared situation? Due to the heterogeneity of specific phobia, there are few assessment tools that provide information across the broad range of phobic stimuli. Most measures are aimed at one particular type of specific phobia (e.g., a fear of spiders). An exception is the Fear Survey Schedule (FSS), versions II (Geer, 1965) and III (Wolpe & Lang, 1969). These two versions of this self-report measure are designed to assess a broad range of phobic stimuli and objects: individuals are presented with extensive lists of phobic stimuli and asked to rate the severity of their fear on Likert scales. Another exception, and a helpful screening tool for phobic stimuli, is the Phobic Stimulus Response Scales (Cutshall & Watson, 2004). This tool screens individuals for a range of phobic stimuli, including blood–injection, animal, and physical confinement fears. Preliminary analyses found adequate psychometric properties for this screening tool (Cutshall & Watson, 2004). Once a diagnosis of a particular specific phobia has been made, it is likely to be more useful to use a measure designed to assess the clinical features of that disorder.
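
For broad fear surveys of this kind, ratings are typically summarized by totaling item scores and noting the most severely rated stimuli. The sketch below is a minimal, hypothetical illustration of that kind of summary; the items, the 1 to 5 rating range, and the flagging rule are placeholders rather than the actual FSS items or scoring instructions.

```python
# Hypothetical sketch: summarizing Likert-type ratings from a broad fear survey
# to flag the most feared stimuli for follow-up assessment. The items, the 1-5
# rating range, and the flagging threshold are invented for illustration and do
# not reproduce the FSS or its scoring rules.

ratings = {
    "spiders": 5,
    "snakes": 2,
    "dental procedures": 4,
    "heights": 1,
    "storms": 1,
}

total_score = sum(ratings.values())                        # overall survey total
flagged = [item for item, r in ratings.items() if r >= 4]  # severely rated stimuli

print(f"Total fear-survey score: {total_score}")
print("Stimuli to assess in more detail:", ", ".join(flagged) or "none")
```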

TABLE 12.2 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility |
| FSQ | A | E | NA | A | A | G | G | G |
| SNAQ | A | G | NA | E | NR | G | E | G |
| DAI | A | E | NA | A | A | G | E | G |
| SPAI | E | E | NA | A | E | E | E | E |
| SPIN | E | E | NA | A | G | G | E | E |
| BFNE | E | E | NA | A | G | E | E | E |
| DS | G | G | NA | NR | E | G | A | G |
| ASI-3 | E | E | NA | G | E | E | E | E |
| FMPS | G | G | NA | NR | G | G | E | G |
| SPRS | G | A | G | NR | E | G | E | A |

Note: FSQ = Fear of Spiders Questionnaire; SNAQ = Snake Questionnaire; DAI = Dental Anxiety Inventory; SPAI = Social Phobia and Anxiety Inventory; SPIN = Social Phobia Inventory; BFNE = Brief Fear of Negative Evaluation Scale; DS = Disgust Scale; ASI-3 = Anxiety Sensitivity Index, third edition; FMPS = Frost Multidimensional Perfectionism Scale; SPRS = Social Performance Rating Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

During the past three decades, researchers have developed a number of self-report scales for measuring symptoms related to fears of snakes, spiders, dogs, heights, blood, needles, dentists, enclosed places, storms, and flying. We review several examples in the following paragraphs and in Table 12.2; however, given the broad range of specific phobia types, a comprehensive review of all relevant measures is not possible. For a review of adult measures, see Hood and Antony (2012). For a review of child measures, see Ollendick, Davis, and Muris (2004) or Southam-Gerow and Chorpita (2007). For a principal diagnosis of a specific phobia, animal type, psychometrically sound measures exist for fear of spiders and snakes, two of the most commonly feared animals. For example, the 18-item Fear of Spiders Questionnaire (FSQ; Szymanski & O'Donohue, 1995) provides an objective measure of the severity of a person's fear of spiders, with scores clearly distinguishing between phobic and nonphobic participants (Muris & Merckelbach, 1996). This questionnaire seems best able to predict conscious avoidance behaviors (i.e., behaviors available to introspection and verbalization), whereas it is less able to predict automatic fear responses, such as a physiological startle response (Huijding & de Jong, 2006). Therefore, the FSQ is useful for understanding the severity of a person's fear of spiders and the types of situations a client may avoid due to fears of spiders, although it has less utility for understanding the role of implicit reactions to spiders when planning treatment. If a client reports a fear of snakes, the Snake Questionnaire (SNAQ; Klorman, Hastings, Weerts, Melamed, & Lang, 1974) provides a detailed understanding of a person's particular concerns about snakes and the way this fear may affect his or her life. Individuals rate 30 fearful or nonfearful statements about snakes as true or false. Total scores clearly distinguish patient populations from nonclinical groups and from individuals with spider phobias (Fredrikson, 1983). Psychometric properties are robust in translation (e.g., Czech; Polák, Sedláčková, Nácar, Landová, & Frynta, 2016). Scores are also sensitive to treatment-related changes (Öst, 1978). However, scores on this questionnaire do not correspond to actual behavioral reactions to a caged snake, suggesting that this measure may have good, but not excellent, construct validity (Klieger, 1987). Again, as a component of treatment planning, a questionnaire such as the SNAQ will provide the clinician with an idea of the types of beliefs the individual holds about snakes and the impact of these beliefs on the person's day-to-day functioning. Strongly held beliefs may
become the focus of cognitive restructuring efforts in therapy, they may shape the provision of educational information about snakes, or they may suggest ideas for in vivo exposure exercises. For example, one item on the SNAQ is “The way snakes move is repulsive.” If a person endorses this item, it suggests that movement may form an important part of his or her fear, and the person might benefit from information about why snakes move the way they do and from exposure exercises that incorporate a snake’s movements. Fear of dental and medical procedures is a common type of specific phobia. There are numerous self-​report measures of various dental and medical procedures that can be useful in understanding the severity of a client’s fear, the focus of the fear, and the types of situations avoided due to the fear. Examples of these include the Dental Cognitions Questionnaire (de Jongh, Muris, Schoenmakers, & ter Horst, 1995), the Dental Fears Survey (Kleinknecht, Klepac, & Alexander, 1973), the Index of Dental Anxiety and Fear (Armfield, 2010), the Medical Fear Survey (Kleinknecht, Thorndike, & Walls, 1996), and the Mutilation Questionnaire (Klorman et al., 1974). One particular example of a useful self-​report measure for dental fears is the Dental Anxiety Inventory (DAI; Stouthard, Mellenbergh, & Hoogstraten, 1993). This 36-​item measure provides information about the types of dental-​related fears a client might have and the severity of the fears. It was designed to provide information about when a person experiences anxiety (e.g., in the dental chair and in the waiting room); the situational aspects of being at the dentist’s office that may bother people (e.g., dental treatments and the interaction between patient and dentist); and the emotional, physical, and cognitive reactions the person has to a dental situation. This measure has been shown to have excellent internal consistency and test–​retest reliability estimates (although only over a short interval) and moderate correlations with a dentist’s perception about a person’s anxiety (Stouthard et  al., 1993). An independent test of the DAI’s convergent and discriminant validity suggested that this measure is highly related to other measures of dental fears, mildly related to general fear and neuroticism, and not related to scales hypothesized to be unrelated to dental fears (Stouthard, Hoogstraten, & Mellenbergh, 1995). The questionnaire has been translated into multiple languages, increasing its clinical utility. We could find no treatment studies using the DAI, and therefore its treatment sensitivity is currently unknown. In addition to the measures listed previously, there are other self-​report measures for various phobias that are

not reviewed in detail here, including the Emetophobia Questionnaire (Boschen, Veale, Ellison, & Reddell, 2013); the abbreviated Spider Phobia Questionnaire (Olatunji et al., 2009); the Dog Phobia Questionnaire (Vorstenbosch, Antony, Koerner, & Boivin, 2012); the Storm Fear Questionnaire (Nelson, Vorstenbosch, & Antony, 2014); and a questionnaire called the Circumscribed Fear Measure, which measures anticipated reactions to a specified feared stimulus and can be used for specific phobia (McCraw & Valentiner, 2015).

Self-Report Measures of Severity and Phenomenology—Social Anxiety Disorder

As is the case for specific phobias, there are a number of measures that provide useful information about the severity and features of SAD that can guide treatment planning. Again, there are too many existing measures to provide comprehensive reviews of all of them, so interested readers are referred to Antony, Orsillo, and Roemer (2001) or Fernandez, Piccirillo, and Rodebaugh (2014) for a more detailed review. Features of interest found in these measures include severity of symptoms, fearful cognitions, and avoided situations. A commonly used self-report measure of SAD symptoms is the Social Phobia and Anxiety Inventory (SPAI; Turner, Beidel, & Dancu, 1996; Turner, Beidel, Dancu, & Stanley, 1989). This 45-item scale has two subscales: social anxiety disorder and agoraphobia. Short forms with 18 and 23 items, respectively, appear to have good psychometric properties for use as a screening measure (de Vente, Majdandžić, Voncken, Beidel, & Bögels, 2014; Schry, Roberson-Nay, & White, 2012). A version of the SPAI is available for use with children (Scaini, Battaglia, Beidel, & Ogliari, 2012). Norms for the original SPAI are available for individuals with generalized SAD, generalized SAD comorbid with avoidant personality disorder, individuals with public speaking fears, socially anxious college students, nonanxious college students, adolescents, and community samples. Norms are also available across ethnic groups (Gillis, Haaga, & Ford, 1995), and the SPAI has been translated into multiple languages. The reliability of scores on the SPAI is strong, especially for the social anxiety disorder subscale (Osman et al., 1996). Evidence for the validity of the social anxiety disorder scale of the SPAI is also strong (Orsillo, 2001). It has demonstrated strong correlations with other measures of SAD as well as with behavioral indicators of anxiety (e.g., time spent speaking in a public-speaking task before escaping; Beidel, Borden, Turner, & Jacob, 1989), and

only minimal associations have been found with measures thought to be unrelated to social anxiety. There is a strong relationship between scores on this measure when completed by individuals with social anxiety and informant completion of the measure (Beidel et  al., 1989). The SPAI has shown measurement invariance across men and women as well as in individuals with and without a diagnosis of SAD, supporting its broad use (Bunnell, Joseph, & Beidel, 2013). The use of the SPAI has also been validated in adolescents (Clark et  al., 1994), increasing the breadth with which this measure can be used. Another widely used measure of SAD symptoms is the Social Phobia Inventory (SPIN; Connor et al., 2000). This is a 17-​item scale that assesses the severity of social anxiety, fear of a number of social and performance stimuli, degree of avoidance, and physiological discomfort. Norms are available for adults with SAD, adolescents with SAD, adults with other anxiety disorders, and community samples of adults and adolescents (Antony, Coons, McCabe, Ashbaugh, & Swinson, 2006; Johnson, Inderbitzen-​Nolan, & Anderson, 2006). The measure has generally good to excellent psychometric properties, including internal consistency, test–​retest reliability, and convergent and discriminant validity, both in adults (Antony et al., 2006; Connor et al., 2000) and in adolescents (Johnson et al., 2006), as well as in other cultures (e.g., in a Brazilian sample; Osório, Crippa, & Loureiro, 2010). Studies of the factor structure of the SPIN are equivocal; some studies have found a five-​factor model (Connor et  al., 2000; Osório et  al., 2010), whereas others support a three-​factor model (Campbell-​Sills, Espejo, Ayers, Roy-​Byrne, & Stein, 2015). In Campbell-​Sills et al., the three factors loaded onto a higher order factor assessing the broad construct of social anxiety, providing further support for the validity of this instrument. Research has consistently demonstrated that the core construct in SAD is fear of negative evaluation, and the diagnostic criteria for SAD in DSM-​5 reflects this body of research by including fear of negative evaluation as a core diagnostic feature. A widely used self-​report measure of this construct is the Brief Fear of Negative Evaluation Scale (BFNE; Leary, 1983). There are both 12-​item and 8-​item variants, adapted from the 30-​item Fear of Negative Evaluation Scale (Watson & Friend, 1969), with research suggesting that the 8 original straightforwardly worded items form the strongest version of the BFNE (Carleton, Collimore, McCabe, & Antony, 2011). Scores on the BFNE have demonstrated high internal consistency and test–​ retest reliability (Leary, 1983), with indicators of internal consistency tending to be strongest in clinical

samples (Weeks et al., 2005). The BFNE has shown consistent psychometric properties in men, women, and Asian populations (Harpole et al., 2015; Wei, Zhang, Li, Xue, & Zhang, 2015). The BFNE has also shown good convergent validity with measures of social anxiety (Collins, Westra, Dozois, & Stewart, 2005; Leary, 1983). The BFNE has 4 reverse-scored items that tend to cluster separately in factor analytic studies and have weaker psychometric properties than the other iterations of the scale (Rodebaugh et al., 2004; Weeks et al., 2005). There are mixed findings with regard to discriminant validity; studies have found low correlations with measures of anxiety sensitivity and depression but high correlations with measures of generalized anxiety (Weeks et al., 2005). Scores on the BFNE can discriminate patients with SAD from nonanxious controls (Weeks et al., 2005), from individuals with panic disorder (Collins et al., 2005), and from individuals with a variety of mood and anxiety disorders (Carleton et al., 2011). The BFNE shows adequate sensitivity to change after CBT (Collins et al., 2005; Taylor, Woody, McLean, & Koch, 1997; Weeks et al., 2005). In addition to the previously discussed measures, there are a host of other measures of the severity of SAD symptoms and related constructs that would also be useful to consider incorporating into an assessment protocol, although their psychometric properties are not reviewed here due to space restrictions. Examples include the Social Interaction Anxiety Scale (Mattick & Clarke, 1998), the Social Phobia Scale (Mattick & Clarke, 1998), and the recently developed Social Anxiety Questionnaire for Adults (Caballo, Arias, Salazar, Irurtia, & Hofmann, 2015).

Self-Report Measures of Related Dimensions in Specific Phobia and Social Anxiety Disorder

There are a number of additional dimensions that need to be addressed in a thorough assessment of specific phobia and SAD for the purpose of case conceptualization. Examples include disgust sensitivity, anxiety sensitivity, and perfectionism. This section discusses each of these dimensions.

Disgust Sensitivity

Disgust sensitivity is a trait that has been implicated in the etiology and phenomenology of certain specific phobias, especially animal phobias and blood–injury–injection (BII) phobias. For example, disgust sensitivity is elevated in BII fears and phobias, in relation to both general (e.g.,
rotting food) and phobia-specific (e.g., wounds) indicators of disgust (Sawchuk, Lohr, Tolin, Lee, & Kleinknecht, 2000; Tolin, Lohr, Sawchuk, & Lee, 1997). Disgust sensitivity has also been shown to be elevated in people with spider phobias on both questionnaire measures of disgust (e.g., Bianchi & Carter, 2012; Merckelbach, de Jong, Arntz, & Schouten, 1993) and physiological indicators of disgust (e.g., de Jong, Peters, & Vanderhallen, 2002). Studies suggest that disgust sensitivity significantly contributes to multiple types of fear (McDonald, Hartman, & Vrana, 2008). Given the elevation of disgust sensitivity in many anxiety presentations, this is an important dimension to assess when planning treatment. Two of the more commonly used measures of this dimension are the Disgust Scale (DS; Haidt, McCauley, & Rozin, 1994) and the Disgust Scale-Revised (DS-R; Olatunji et al., 2007). The 32-item DS covers a broad range of disgust-eliciting stimuli, including food, animals, body products, sex, bodily violations (e.g., seeing a man with a fishhook in his eye), death, hygiene, and magical pathways of disgust, making it broadly applicable to many subtypes of specific phobia. The DS-R is a 25-item measure whose items cluster into three factors: core disgust, animal reminder disgust, and contamination disgust. DS-R scores have shown good reliability and validity (van Overveld, de Jong, Peters, & Schouten, 2011), although a recent study found only a modest correlation between scores on the DS-R and reports of state disgust during exposure therapy (Duncko & Veale, 2016). Recently, the Disgust Emotion Scale (Olatunji, Ebesutani, & Reise, 2015) has shown promise for measuring disgust proneness.

Anxiety Sensitivity

Anxiety sensitivity (i.e., one's belief that the physical sensations of fear and anxiety are dangerous) is another relevant construct when assessing specific phobia and SAD. Individuals with situational phobias or SAD may be especially concerned with the physical sensations of fear, focusing on the consequences of anxiety and panic attacks when encountering the phobic stimulus or situation (Antony, Brown, & Barlow, 1997a). Research generally supports this notion, demonstrating that individuals with situational-type phobias score higher on the Anxiety Sensitivity Index (ASI; Peterson & Reiss, 1993) than do individuals with animal phobias and BII phobias (Antony et al., 1997a). Most people with SAD are also fearful of physical signs of anxiety (sweating, blushing, etc.), especially if these symptoms occur in front of others. Indeed, both adults (Taylor, Koch, & McNally,
1992) and children (Alkozei, Cooper, & Creswell, 2014) with SAD show elevations on the ASI in comparison to healthy controls, and the presence of situationally bound panic attacks in SAD is associated with more severe presentations of SAD (Potter et al., 2014). Changes in anxiety sensitivity contribute to post-treatment social anxiety symptoms above and beyond pretreatment symptoms (Nowakowski, Rowa, Antony, & McCabe, 2016). Therefore, it is important to assess an individual's fear of physical sensations. The most commonly used measure for this purpose is the ASI; its most recent version is the ASI-3, an 18-item revision of the original scale (Taylor et al., 2007). The ASI-3 has strong psychometric properties (Wheaton, Deacon, McGrath, Berman, & Abramowitz, 2012). Elevated anxiety sensitivity scores suggest that treatment should include a focus on the meaning of physical sensations (i.e., through cognitive restructuring) and possibly interoceptive exposure practices, in which an individual engages in exercises to purposely bring about feared physical sensations in a safe environment.

Perfectionism

Research supports the idea that levels of maladaptive perfectionism are elevated in SAD. For example, people with SAD appear to believe that other people have high expectations for them (Bieling & Alden, 1997), and they show elevated levels of concerns over mistakes, doubts about their actions, and reports of parental criticism (Antony, Purdon, Huta, & Swinson, 1998). Perfectionism predicts aspects of SAD such as post-event processing, the tendency to ruminate about social events after the fact (Shikatani, Antony, Cassin, & Kuo, 2016). Perfectionism is therefore an important construct to investigate in the conceptualization of an individual with SAD. A widely used measure of perfectionism is the 35-item Frost Multidimensional Perfectionism Scale (FMPS; Frost, Marten, Lahart, & Rosenblate, 1990). This self-report measure assesses a number of dimensions of perfectionism (e.g., concern over mistakes, personal standards, and parental expectations), allowing the clinician to determine which aspects of perfectionism are elevated for a particular individual. Scores on this measure have demonstrated strong psychometric properties, including good internal consistency and strong relations between scores on the measure and behavioral indications of perfectionism (e.g., reactions to mistakes made in a task; Frost et al., 1997). There is some question about the appropriate factor structure

of this measure, with different studies yielding different factor solutions (e.g., Purdon, Antony, & Swinson, 1999; Stober, 1998). Along with the FMPS, there are well over 20 measures that can be used to assess perfectionism. For a review, see Egan, Wade, Shafran, and Antony (2014).

Behavioral Assessment

Behavioral assessment is an especially useful form of assessment for planning cognitive or behavioral treatments. The most common form of behavioral assessment used with anxiety disorders is the behavioral approach test (BAT). A BAT for a specific phobia may involve seeing how close the person can get to his or her feared stimulus (e.g., an animal or high ledge), how long a person can stay in a feared situation before escaping, or the degree of fear a person experiences in the situation. A BAT for SAD may involve asking a person to engage in a feared activity (e.g., giving a speech) and measuring the degree of fear experienced during the activity. These tests provide valuable information about the intensity of a person's fear, the cues that affect that fear (e.g., size of spider and sex of the conversation partner), the physical sensations the person experiences, the person's fearful thoughts, and the use of avoidance or subtle avoidance strategies when in the feared situation (e.g., avoiding eye contact and leaving the situation). Research suggests that responses to behavioral challenges such as a BAT are related to responses on self-report measures of social anxiety symptoms (e.g., Gore, Carter, & Parker, 2002) and to subjective distress ratings for phobic stimuli (Ollendick, Allen, Benoit, & Cowart, 2011), supporting the convergent validity of behavioral measures. Analogue behavioral assessment strategies have demonstrated strong discriminative and convergent validity for the assessment of social functioning (Norton & Hope, 2001).

Assessment of Skills Deficits

Individuals with specific phobia or SAD may have skills deficits that affect treatment. For example, some people with SAD appear to have impairment in social skills (e.g., Beidel, Rao, Scharfstein, Wong, & Alfano, 2010; Fydrich, Chambless, Perry, Buergener, & Beazley, 1998), although other research has found no differences from control participants (e.g., Voncken & Bögels, 2008). Other research suggests that deficits may be accounted for by other constructs, such as use of safety behaviors (Rowa
et al., 2015). Some people with specific phobias of driving may lack adequate driving skills, particularly if they have avoided driving for many years. Although there are no gold-standard measures for assessing skills deficits, these deficits are important to address when planning treatment. For example, poor driving skills may necessitate a course of remedial driving instruction either prior to exposure therapy or concurrent with it; driving skills are likely best evaluated by a professional driving instructor. Social skills deficits in SAD may be readily apparent during the course of initial meetings with an individual (e.g., lack of eye contact may be noticeable during a semi-structured interview). To more formally assess these deficits, the Social Performance Rating Scale (SPRS) can be used. This behavioral assessment tool was originally developed by Trower, Bryant, and Argyle (1978), modified by Turner, Beidel, Dancu, and Keys (1986), and then further modified by Fydrich and colleagues (1998) to provide a measure of social skill level during videotaped role-plays that yields reliable and valid scores. The modified SPRS provides information about the following skill areas: gaze, vocal quality, speech length, discomfort, and conversation flow. Although this measure provides broad and useful ratings of behavioral skill deficits, it may not be easily transferred to a clinical setting. For example, the role-plays require the presence of a confederate. Although this may be easily accomplished in the context of a clinic or hospital-based program, it may be nearly impossible in other clinical settings, such as private practice. Furthermore, the training and time necessary for raters of the role-plays may be difficult to justify in many contexts. Thus, although this measure appears to provide excellent information on skills deficits, it may be more reasonable for most clinicians to use some aspects of it. For example, clinicians could use the behavioral anchors provided for this measure to rate the social skills that emerge in the context of an interview or other assessment protocol, or they could conduct analogue role-plays with their clients using themselves as the confederate.

Assessment of Treatment History, Treatment Concerns, and Suitability for Cognitive–Behavioral Therapy

During the course of treatment planning, it is useful to assess an individual's treatment history, any treatment concerns, and suitability for a therapeutic intervention such as CBT. Previous treatment failures may provide useful information about what not to try when treating a
particular patient. On the other hand, previous treatment failures may have been the result of receiving inappropriate interventions for a problem such as SAD or specific phobia. Research suggests that individuals seen in a specialty anxiety clinic reported having received a number of nonempirically supported treatments (especially psychological treatments) prior to receiving cognitive or behavioral interventions for their anxiety disorder (Rowa, Antony, Brar, Summerfeldt, & Swinson, 2000). In cases in which past treatment attempts were not successful, it may be important to identify reasons for the negative outcome. For example, treatment noncompliance and lack of acceptance of the treatment rationale are predictors of negative outcome following psychological treatment (e.g., Addis & Jacobson, 2000; Woods, Chambless, & Steketee, 2002). Furthermore, the presence of comorbid personality disorders is associated with lower likelihood of individuals seeking treatment services for SAD or specific phobia (Iza et al., 2013). Knowledge of these kinds of issues suggests useful pathways the clinician should consider when planning treatment and potential obstacles that may arise. For example, the clinician may consider investing more time at the beginning stages of treatment to help the client fully understand and, it is hoped, accept the rationale underlying treatment interventions. Furthermore, a history of treatment noncompliance may suggest the importance of contracting about the completion of therapy assignments and session attendance in order for therapy to proceed. One option to better understand treatment history is to use multiple informants, including both the client and previous therapists (with permission from the client). Individuals may have strong fears about treatment, including fears that they will not get better. Measures such as the Treatment Ambivalence Questionnaire (TAQ; Rowa et  al., 2014)  may be helpful in evaluating a host of treatment fears presented by individuals with anxiety disorders. Finally, a clinician may want to consider whether a particular client is a good match for cognitive or behavioral interventions. Even though these techniques are empirically supported for treating specific phobia and SAD, this does not guarantee that a particular individual is well-​suited for a CBT intervention. Clients have to be willing to complete between-​session work, confront feared stimuli, and accept a CBT rationale for their difficulties. Suitability interviews for CBT do exist, and scores on one suitability instrument have shown moderate correlations with both client and therapist ratings of success in cognitive therapy for depression (Safran & Segal,

1990). However, these interviews are detailed and time-consuming, focus more on suitability for cognitive interventions than behavioral interventions, have not been validated for anxiety disorders, and may not be practical for clinical practice. They may be best used when suitability issues appear to be a potential obstacle in treatment planning.

Assessment of Safety Behaviors

A final consideration when conducting assessment for the purpose of treatment planning in specific phobia and SAD is to ensure a thorough assessment of an individual's use of safety behaviors, subtle avoidance, and maladaptive coping strategies (e.g., alcohol and drug use). Elevated drug and alcohol use has been documented in SAD (e.g., Van Ameringen, Mancini, Styan, & Donison, 1991), and it is often conceptualized as a means of coping with otherwise debilitating levels of anxiety. Other examples of safety behaviors and coping strategies in SAD include wearing high-necked shirts to cover blushing, over-rehearsing or memorizing presentations, carrying anti-anxiety medication, and always bringing a "safe other" when attending a social gathering. Safety behaviors or subtle avoidance in specific phobia may include looking away when getting a needle, wearing long sleeves or a hooded shirt to prevent spiders from falling directly on one's skin, holding a railing in a high place, and playing the radio while driving to distract oneself from fear. Although safety behaviors are prevalent and are a crucial aspect of understanding a person's social or specific phobia, the breadth and variety of strategies used by individuals have been difficult to measure using a particular instrument. One instrument that has been developed to measure safety behaviors in social anxiety is the Subtle Avoidance Frequency Examination (SAFE; Cuming et al., 2009). This 32-item measure assesses how frequently an individual uses particular safety behaviors in social situations. It has strong psychometric properties (Cuming et al., 2009) and is able to differentiate adolescents with social anxiety from control participants (Thomas, Daruwala, Goepel, & De Los Reyes, 2012). The SAFE is sensitive to changes in safety behavior use across treatment (e.g., Goldin et al., 2016).

Overall Evaluation

There are a number of topics to cover when using assessment to aid conceptualization and treatment planning for specific phobia and SAD. We have highlighted a series

of variables that we believe are valuable to cover, including symptom severity, relevant cognitions and avoidance behaviors, related constructs, coping strategies, skills deficits, treatment history, and attitudes toward future treatment. When reviewing these topics, it is encouraging that a number of psychometrically sound, clinically useful measures exist to assess these areas (for a summary, see Table 12.2). Therefore, at minimum, this stage of assessment should include a well-studied and validated measure, such as the SPIN for SAD or the FSQ for a specific phobia of spiders, to complement the diagnostic information already provided by a semi-structured interview. Even without a designation of "highly recommended," we still encourage practitioners to use instruments such as these for the purpose of treatment planning. Furthermore, it also seems reasonable to include well-validated measures of related constructs, such as anxiety sensitivity, disgust, and perfectionism, where relevant. The questionnaires highlighted to measure these constructs are all quick and straightforward measures whose value clearly exceeds the time taken to complete and score them. We also argue that the use of idiographic diaries, questions, or monitoring forms to ascertain coping strategies and safety behaviors, although not empirically validated, is an essential aspect of treatment planning. Similarly, behavioral assessment using a BAT is a useful way to discover a great deal of valuable information for treatment planning. Overlooking this information could have serious implications for treatment outcome (e.g., if the use of maladaptive coping strategies is never targeted). The assessment picture becomes more complicated when evaluating the utility of measuring constructs such as social skills and suitability for CBT in a routine assessment for SAD. We argue that the lack of quick and easily administered measures limits the feasibility of systematically assessing these features in routine practice, and we know of no data to suggest that failing to measure these constructs formally leads to compromised treatment outcome. We therefore recommend using measures specifically designed to assess these topics only when they appear especially relevant (e.g., if an individual clearly communicates a bad experience with previous CBT and extreme skepticism about its effectiveness, or for a person who clearly has extreme social skills deficits). Otherwise, the general themes of social skills, suitability for CBT, and treatment fears can be assessed more informally during the course of a diagnostic interview or broader assessment.
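
The idiographic monitoring forms recommended above can be as simple as a structured diary of feared situations, fear ratings, and safety behaviors. The sketch below is a minimal illustration of such a record, assuming a 0 to 100 fear rating and invented field names and entries; it is not a standardized or validated form.

```python
# Illustrative sketch of an idiographic self-monitoring record of the kind
# described above (feared situations, fear ratings, avoidance, and safety
# behaviors). Field names, the 0-100 fear scale, and the example entries are
# assumptions chosen for illustration, not a standardized or validated form.

from dataclasses import dataclass, field

@dataclass
class MonitoringEntry:
    date: str                     # e.g., "2018-03-02"
    situation: str                # feared situation entered or avoided
    peak_fear: int                # 0 (no fear) to 100 (extreme fear)
    avoided: bool                 # did the client avoid or escape the situation?
    safety_behaviors: list[str] = field(default_factory=list)

diary = [
    MonitoringEntry("2018-03-02", "Spoke up in a team meeting", 75, False,
                    ["over-rehearsed comments"]),
    MonitoringEntry("2018-03-04", "Invited to a party", 90, True),
]

# Simple summaries a clinician might review when planning exposure work.
avoidance_rate = sum(entry.avoided for entry in diary) / len(diary)
noted_safety_behaviors = {sb for entry in diary for sb in entry.safety_behaviors}
print(f"Avoided {avoidance_rate:.0%} of recorded situations")
print("Safety behaviors noted:", ", ".join(sorted(noted_safety_behaviors)) or "none")
```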

TABLE 12.3 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Treatment Sensitivity |
| FSQ | A | E | NA | A | A | G | G | G | E |
| SNAQ | A | G | NA | E | NR | G | E | G | E |
| DAI | A | E | NA | A | A | G | E | G | NR |
| SPAI | E | E | NA | A | E | E | E | E | E |
| SPIN | E | E | NA | A | G | G | E | E | E |
| LSAS | G | E | NR | NR | G | G | E | E | E |
| BSPS | A | A | NR | A | A | A | A | E | G |
| BAT | NR | NR | NA | NR | NA | G | E | E | E |
| IIRS | E | E | E | G | G | E | E | E | G |

Note: FSQ = Fear of Spiders Questionnaire; SNAQ = Snake Questionnaire; DAI = Dental Anxiety Inventory; SPAI = Social Phobia and Anxiety Inventory; SPIN = Social Phobia Inventory; LSAS = Liebowitz Social Anxiety Scale; BSPS = Brief Social Phobia Scale; BAT = Behavioral Approach Test; IIRS = Illness Intrusiveness Rating Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

The final use of assessment procedures covered in this chapter is assessment for the purpose of evaluating treatment progress and outcome, both for medication and for CBT. Clearly, a hallmark feature of CBT interventions is the rigorous evaluation of their effectiveness. This is also true when using these techniques with a particular client. It is essential to understand whether treatment strategies were helpful, in what way they were helpful, and on what dimensions they had their impact. In some instances, degree of improvement will have important implications for the course and duration of treatment. In other cases, indicators of improvement will have implications for continued funding of treatment (e.g., by insurance companies or third-party payers). On the most basic level, it is useful for a client to be aware of the degree of improvement made. Without explicitly assessing these variables, important gains can be ignored or missed. Table 12.3 provides a review of measures that are useful for treatment evaluation.

Self-Report Measures of Severity—Specific Phobia and Social Anxiety Disorder

One important way to assess treatment progress and outcome is to compare self-report questionnaire scores obtained during and at the end of treatment with those obtained at pretreatment. Possible assessment tools include the self-report measures described previously (e.g., FSQ, SNAQ, DAI, SPAI, and SPIN). Each of these measures except for the DAI has shown at least adequate

treatment sensitivity, in that scores meaningfully decline after medication or psychological treatment (for a review, see Antony, Orsillo, et al., 2001). In addition to using empirically validated self-report measures for specific phobia and SAD, there is also value in using more idiosyncratic self-report instruments to monitor progress in therapy. For example, a widely used tool for tracking symptom progression is the exposure hierarchy, used in exposure-based treatments of specific phobia and SAD. An exposure hierarchy is a list of feared situations ranked from most difficult at the top to least difficult at the bottom. Each item on the hierarchy is rated for fear, avoidance, or both. Hierarchies are developed either before treatment or near the beginning of treatment, and clients are encouraged to provide updated fear and avoidance ratings on a regular basis (e.g., each session, or at pre-, mid-, and post-treatment). Although no studies have examined the use of hierarchy ratings in social or specific phobia, a study by our group on patients with panic disorder provides support for their utility. We found that fear and avoidance ratings on hierarchies changed significantly across treatment of panic disorder, with effect sizes even greater than those obtained from standard outcome measures (Katerelos, Hawley, Antony, & McCabe, 2008). Valuable information can also be gleaned from monitoring forms and exposure graphs completed by the client. For example, notable shifts in the content of cognitions can be an indicator that the client is benefiting from CBT; hypothesized reductions in peak fear and in the time needed for fear to decrease can inform the therapist of whether exposure exercises are producing the desired effects.
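
A hierarchy of this kind lends itself to a very simple tracking structure: each item keeps its per-session fear and avoidance ratings, and change is summarized from the first to the most recent rating. The sketch below is an illustrative implementation under assumed conventions (0 to 100 ratings, invented items); it is not a published scoring procedure.

```python
# Illustrative sketch: tracking exposure hierarchy items across sessions, with
# fear and avoidance rated 0 (none) to 100 (extreme). The items, ratings, and
# change summary are invented for illustration and are not a standardized
# hierarchy or scoring procedure.

hierarchy = {
    # item -> list of (fear, avoidance) ratings, one pair per session
    "Give a toast at a family dinner": [(95, 100), (85, 90), (70, 60)],
    "Ask a stranger for directions":   [(70, 80), (55, 50), (35, 20)],
    "Make small talk with a coworker": [(50, 40), (40, 20), (20, 10)],
}

def change_from_baseline(ratings):
    """Return (fear change, avoidance change) from the first to the latest rating."""
    (fear_start, avoid_start), (fear_now, avoid_now) = ratings[0], ratings[-1]
    return fear_now - fear_start, avoid_now - avoid_start   # negative = improvement

for item, ratings in hierarchy.items():
    d_fear, d_avoid = change_from_baseline(ratings)
    print(f"{item}: fear {d_fear:+d}, avoidance {d_avoid:+d}")
```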

Interview Measures of Symptom Severity

When evaluating progress and outcome as a result of medication or CBT for SAD, there are two widely used clinician-rated measures of symptom severity. Clinician-rated measures are a useful addition to self-report measures to broaden the breadth and source of data used to evaluate outcome. The first example is the Liebowitz Social Anxiety Scale (LSAS; Liebowitz, 1987). This measure lists 24 situations that are commonly anxiety-producing for people with social anxiety, and the interviewer rates each situation in terms of fear and avoidance. It is relatively brief, taking approximately 20 minutes to complete. The psychometric properties of the LSAS are good to excellent, and it has demonstrated sensitivity to treatment outcome, having been a primary outcome measure in many medication and cognitive–behavioral treatment trials for SAD. The LSAS is also available in a self-report format with good psychometric properties (e.g., Baker, Heinrichs, Kim, & Hofmann, 2002). A second widely used clinician-rated measure of SAD symptoms is the Brief Social Phobia Scale (BSPS; Davidson et al., 1991). This 18-item measure covers symptoms of fear, avoidance, and physiological arousal and can be administered in 5 to 15 minutes.

Behavioral Indicators of Treatment Progress

Behavioral assessment (e.g., a BAT) is also a useful way of monitoring outcome of treatment for specific phobia and SAD. If an individual with a spider phobia is unable to look at a spider during a pretreatment BAT but can hold a spider comfortably during a post-treatment BAT, the client can be assumed to have improved. Hofmann (2000) used four behavioral tasks both before and after a treatment trial of CBT for SAD and measured self-statements made during these tasks. Results suggested that content of self-statements made while anticipating the behavioral tasks changed across successful treatment, with participants endorsing fewer negative self-focused thoughts after treatment. The use of behavioral tests in this example allowed the evaluator to see related changes in cognition across treatment.

Physiological Indications of Treatment Progress

Models of the development and maintenance of SAD place importance on the physiological manifestations of anxiety, suggesting that people with SAD experience elevated physical symptoms of anxiety (e.g., blushing and racing heart) and that the anticipation of and reaction to these physical sensations is an important factor in their experience of anxiety (e.g., Clark & Wells, 1995). Elevated physiological reactivity may also be implicated in specific phobias, with individuals often experiencing cued panic attacks in feared situations. Furthermore, individuals with BII phobias have an elevated risk of fainting when encountering their feared stimuli (Antony & Barlow, 2002), a unique physiological response. Research also suggests that people with situational, as compared to nonsituational, phobias have a higher rate of unexpected panic attacks, and people with BII phobias have a greater focus on physical symptoms than on harm or catastrophe (Lipsitz et al., 2002). Thus, particular subtypes of specific phobias may have unique physiological presentations. If this is the case, it might be useful to measure physiological reactivity to behavioral tasks and exposure stimuli as an indication of progress in therapy.

Although it is clear that people with specific phobia and SAD report greater than normal apprehension about physiological sensations (e.g., Hugdahl & Öst, 1985), research is not clear regarding whether actual physiological differences exist between people with and without these disorders. For example, Edelmann and Baker (2002) found no physiological differences between individuals with generalized SAD, anxious controls, and nonanxious controls on measures of heart rate, skin conductance, and facial and neck temperatures on a series of behavioral, physical, and imagery tasks. Interestingly, participants with SAD and other anxiety disorders provided higher subjective ratings of some sensations than did nonanxious controls even in the absence of physiological differences. This result is consistent across other anxiety conditions, including panic disorder and generalized anxiety disorder (Hoehn-Saric, McLeod, Funderburk, & Kowalski, 2004). Furthermore, changes in physiological response can occur across a course of treatment but may occur separately from changes in fear and avoidance (Aderka, McLean, Huppert, Davidson, & Foa, 2013). On the other hand, individuals with dental fears demonstrated changes in physiology during exposure to scenes of dental treatment (Johnson et al., 2003), and individuals with spider phobia showed a reduction in heart rate during BATs before, during, and after a session of exposure therapy (Antony, McCabe, Leeuw, Sano, & Swinson, 2001). Also, children with SAD may have impaired recovery from a social stressor compared to healthy controls, but they have similar changes in heart rate when the stressor is introduced (Schmitz, Krämer, Tuschen-Caffier, Heinrichs, & Blechert, 2011). Thus, currently, there is some empirical support to conclude that physiology changes across
treatment but not enough to recommend using physiological indicators to measure treatment progress. Given the expense and burden of accurately measuring physiological responses of anxiety, it appears more useful to measure concern over physiological symptoms using validated self-​ report measures such as the ASI-​3, described previously. Assessment of Functional Impairment and Quality of Life Traditionally, treatment outcome research in the area of anxiety disorders has focused on measuring change in symptom severity, paying less attention to whether treatment improves associated distress, functional impairment, and quality of life. There are no assessment tools designed specifically to assess distress, functional impairment, and quality of life in people with anxiety disorders, although a number of more general scales (e.g., Sheehan Disability Scale; Sheehan, 1983) have been used to measure these constructs in this population (e.g., Antony, Roth, Swinson, Huta, & Devins, 1998; Mendlowicz & Stein, 2000; Quilty, van Ameringen, Mancini, Oakman, & Farvolden, 2003). One such scale is the Illness Intrusiveness Ratings Scale (IIRS; Devins et al., 1983). Originally developed for use with medical populations, the IIRS has been adapted for use with mental health populations. This brief self-​report measure asks people to rate the degree to which their illness (i.e., anxiety disorder) interferes with 13 domains of functioning (e.g., work, sex life, relationships, and religious expression). The IIRS has demonstrated strong psychometric properties. Antony et al. found that individuals with anxiety disorders (including SAD) reported higher levels of functional impairment than did people with serious medical conditions, including end-​stage renal disease and multiple sclerosis. Whether this finding reflects the true level of functional impairment in anxiety disorders is unknown because research has not examined the relationship between scores on these scales and more objective indices of impairment (e.g., missed days at work and relationship impairment) in people with anxiety disorders. However, the IIRS appears sensitive to changes across treatment (e.g., Rowa et al., 2007) and is a straightforward way of measuring subjective changes in impairment as a result of treatment efforts. Overall Evaluation Measuring treatment outcome is not only important in the broad sense of validating the use of particular treatment interventions but also useful on an individual basis

257

to quantify changes made by particular clients across particular courses of therapy. Clinically, it is clear that many clients make significant changes across the course of therapy but do not recognize the magnitude or importance of these changes. Similarly, it is easy for clinicians to “forget” the severity of a client’s original fears when they have observed progress on a week-​to-​week basis. Efficacy data from randomized controlled trials may not mirror effectiveness of the treatment in a particular clinic or with a particular client. For these reasons, it is valuable to measure treatment outcome for individual clients as well as in larger, well-​controlled treatment trials. We believe that the measurement of treatment outcome should be multifaceted, ideally including self-​ report, behavioral, and clinician-​rated measures of improvement. In addition, treatment outcome should target not only improvements in symptoms of social anxiety or specific phobia but also improvements in a person’s general functioning and quality of life. At minimum, assessment of treatment outcome should involve examining changes on self-​report measures with demonstrated treatment sensitivity, on idiographic measures of progress (e.g., hierarchies and monitoring forms), on behavioral indicators of progress (e.g., ability to enter and remain in feared situations), and on measures of everyday life functioning.
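As a rough illustration only, the sketch below (again in Python, with made-up measure names and scores) shows how pre- and post-treatment data from several modalities might be combined into simple change scores; it is not a scoring procedure endorsed by the measures or by this chapter.

```python
# Illustrative multimodal outcome summary; measure names and values are hypothetical.

pre = {
    "self_report_total": 98,        # e.g., a treatment-sensitive self-report scale
    "clinician_rating_total": 72,   # e.g., a clinician-rated severity interview
    "bat_steps_completed": 3,       # behavioral approach test steps (higher = better)
    "impairment_total": 55,         # e.g., a functional impairment scale
}
post = {
    "self_report_total": 54,
    "clinician_rating_total": 38,
    "bat_steps_completed": 9,
    "impairment_total": 31,
}

def change(pre_scores, post_scores):
    """Positive values indicate improvement. For symptom and impairment scales,
    improvement is a decrease; for BAT steps, it is an increase, so the sign is flipped."""
    out = {}
    for name in pre_scores:
        diff = pre_scores[name] - post_scores[name]
        out[name] = -diff if name == "bat_steps_completed" else diff
    return out

print(change(pre, post))
```

In practice, each score would come from instruments like those reviewed above, and interpretation would rest on each measure's norms and demonstrated treatment sensitivity.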

CONCLUSIONS AND FUTURE DIRECTIONS

It is clear that assessment plays a crucial role in understanding an individual's presenting problems, making informed decisions about treatment interventions, and evaluating the effectiveness of any such interventions. Within the anxiety disorders, there is a long tradition of ensuring that assessment instruments possess strong psychometric properties and that assessment tools that yield reliable and valid scores are used to measure the efficacy of treatment interventions. In addition, there is value in developing and evaluating evidence-based assessment protocols for the anxiety disorders. We have provided a sample assessment strategy for SAD in Table 12.4. This strategy involves using assessment measures for the purpose of diagnosis, measures to assess clinical features of both the disorder and related constructs (e.g., perfectionism) for the purpose of treatment planning, and measures that are sensitive to change across treatment. Unfortunately, anxiety researchers often use assessment protocols that are less multimodal, relying most often on self-report scales only (Lawyer & Smitherman, 2004). Therefore, despite our strong history of using well-validated tools and interventions, it appears that an increased emphasis on empirically supported, multimodal assessment in specific phobia and SAD is warranted.

Although research on the incremental validity of multimodal protocols over singular measures in anxiety disorders is sparse, it is useful to review what we do know about individual assessment instruments and techniques to make informed judgments about what tools should be considered when assessing SAD or specific phobias. In this vein, we have provided a review of some individual tools and techniques commonly used in the assessment of specific phobia and SAD, with the idea that an empirically supported assessment strategy must have its roots in well-validated, psychometrically sound instruments. Furthermore, we have reviewed these instruments and techniques from the perspective of clinical utility as well, with the understanding that there has to exist a crossroads between empirically supported and clinically feasible assessment strategies. For example, some semi-structured interviews for DSM-5 (e.g., the SCID-5) are substantially longer than previous versions, making them less practical to use in their entirety. One option might be to move toward a more modular approach in which interviewers select the most relevant diagnostic modules to arrive at diagnostic decisions in a more efficient manner. From these reviews, we have suggested some possible avenues for combining assessment techniques in a way that may prove to be useful for assessment of specific phobia and SAD. From suggestions such as these, future research can focus on the optimal combination of assessment strategies for different purposes, how different strategies affect the utility and efficacy of others used concurrently, and how to balance clinical feasibility with maximal efficacy.

TABLE 12.4 Sample Assessment Protocol for Assessing Treatment Outcome in Social Anxiety Disorder

Domain | Assessment Tools | Type of Tool
Diagnostic features | Structured Clinical Interview for DSM-5 (SCID-5; First et al., 2015) | Semi-structured interview
Diagnostic features | Anxiety Disorders Interview Schedule for DSM-5 (ADIS-5; Brown & Barlow, 2014) | Semi-structured interview
Conceptualization: Severity | Social Phobia Inventory (Connor et al., 2000) | Self-report
Conceptualization: Situational cues | Diaries to record situational fear, avoidance, and safety behaviors | Diary
Conceptualization: Avoidance | Behavioral approach test | Behavioral assessment
Conceptualization: Related constructs | Anxiety Sensitivity Index-3 (ASI-3; Taylor et al., 2007) | Self-report
Conceptualization: Related constructs | Disgust Scale (Haidt et al., 1994) | Self-report
Conceptualization: Related constructs | Frost Multidimensional Perfectionism Scale (FMPS; Frost et al., 1990) | Self-report
Treatment outcome: Severity, situational cues, cognitive features, and avoidance behavior | Social Phobia and Anxiety Inventory (SPAI; Turner et al., 1989) | Self-report
Treatment outcome: Severity, situational cues, cognitive features, and avoidance behavior | Diaries to record situational fear, avoidance, and safety behaviors | Diary
Treatment outcome: Severity, situational cues, cognitive features, and avoidance behavior | Liebowitz Social Anxiety Scale (LSAS; Liebowitz, 1987) | Interview and self-report
Treatment outcome: Avoidance | Behavioral approach test | Behavioral assessment
Treatment outcome: Functional impairment | Illness Intrusiveness Ratings Scale (IIRS; Devins, 1994) | Self-report
Treatment outcome | ADIS-5 or SCID-5 | Semi-structured interview

Note. The psychometric properties of the SCID-5 and ADIS-5 have yet to be determined.

References

Addis, M. E., & Jacobson, N. S. (2000). A closer look at the treatment rationale and homework compliance in cognitive–behavioral therapy for depression. Cognitive Therapy and Research, 24, 313–326. Aderka, I. M., McLean, C. P., Huppert, J. D., Davidson, J. R. T., & Foa, E. B. (2013). Fear, avoidance and physiological symptoms during cognitive–behavioral therapy for social anxiety disorder. Behaviour Research and Therapy, 51, 352–358. Alkozei, A., Cooper, P. J., & Creswell, C. (2014). Emotional reasoning and anxiety sensitivity: Associations with social anxiety disorder in childhood. Journal of Affective Disorders, 152, 219–228. American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.


American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Antony, M. M., & Barlow, D. H. (2002). Specific phobia. In D. H. Barlow (Ed.), Anxiety and its disorders: The nature and treatment of anxiety and panic (2nd ed., pp. 380–​ 417). New York, NY: Guilford. Antony, M. M., Brown, T. A., & Barlow, D. H. (1997a). Heterogeneity among specific phobia types in DSM-​IV. Behaviour Research and Therapy, 35, 1089–​1100. Antony, M. M., Brown T. A., & Barlow D. H. (1997b). Response to hyperventilation and 5.5% CO2 inhalation of subjects with types of specific phobia, panic disorder, or no mental disorder. American Journal of Psychiatry, 154, 1089–​1095. Antony, M. M., Coons, M. J., McCabe, R. E., Ashbaugh, A., & Swinson, R. P. (2006). Psychometric properties of the Social Phobia Inventory: Further evaluation. Behaviour Research and Therapy, 44, 1177–​1185. Antony, M. M., Downie, F., & Swinson, R. P. (1998). Diagnostic issues and epidemiology in obsessive compulsive disorder. In R. P. Swinson, M. M. Antony, S. Rachman, & M. A. Richter (Eds.), Obsessive compulsive disorder: Theory, research, and treatment (pp. 3–​32). New York, NY: Guilford. Antony, M. M., McCabe, R. E., Leeuw, I., Sano, N., & Swinson, R. P. (2001). Effect of distraction and coping style on in vivo exposure for specific phobia of spiders. Behaviour Research and Therapy, 39, 1137–​1150. Antony, M. M., Orsillo, S. M., & Roemer, L. (Eds.). (2001). Practitioner’s guide to empirically based measures of anxiety. New York, NY: Springer. Antony, M. M., Purdon, C. L., Huta, V., & Swinson, R. P. (1998). Dimensions of perfectionism among the anxiety disorders. Behaviour Research and Therapy, 36, 1143–​1154. Antony, M. M., Roth, D., Swinson, R. P., Huta, V., & Devins, G. M. (1998). Illness intrusiveness in individuals with panic disorder, obsessive compulsive disorder, or social phobia. Journal of Nervous and Mental Disease, 186, 311–​315. Antony, M. M., & Rowa, K. (2005). Evidence-​based assessment of anxiety disorders in adults. Psychological Assessment, 17, 256–​266. Armfield, J. M. (2010). Development and psychometric evaluation of the Index of Dental Anxiety and Fear (IDAF-​ 4C+). Psychological Assessment, 22, 279–​287. Baker, S. L., Heinrichs, N., Kim, H., & Hofmann, S. G. (2002). The Liebowitz Social Anxiety Scale as a self-​ report instrument: A preliminary psychometric analysis. Behaviour Research and Therapy, 40, 701–​715. Bandelow, B., & Michaelis, S. (2015). Epidemiology of anxiety disorders in the 21st century. Dialogues in Clinical Neuroscience, 17, 327.


Baptista, C. A., Loureiro, S. R., de Lima Osório, F., Zuardi, A. W., Magalhaes, P. V., Kapczinski, F., . . . Crippa, J. A. S. (2012). Social phobia in Brazilian university students:  Prevalence, under-​ recognition and academic impairment in women. Journal of Affective Disorders, 136, 857–​861. Barrera, T. L., & Norton, P. J. (2009). Quality of life impairment in generalized anxiety disorder, social phobia, and panic disorder. Journal of Anxiety Disorders, 23, 1086–​1090. Beidel, D. C., Borden, J. W., Turner, S. M., & Jacob, R. G. (1989). The Social Phobia and Anxiety Inventory:  Concurrent validity with a clinical sample. Behaviour Research and Therapy, 27, 573–​576. Beidel, D. C., Rao, P. A., Scharfstein, L., Wong, N., & Alfano, C. A. (2010). Social skills and social phobia: An investigation of DSM-​IV subtypes. Behaviour Research and Therapy, 48, 992–​1001. Bianchi, K. N., & Carter, M. M. (2012). An experimental analysis of disgust sensitivity and fear of contagion in spider and blood injection injury phobia. Journal of Anxiety Disorders, 26, 753–​761. Bieling, P. J., & Alden, L. E. (1997). The consequences of perfectionism for patients with social phobia. British Journal of Clinical Psychology, 36, 387–​395. Boschen, M. J., Veale, D., Ellison, N., & Reddell, T. (2013). The Emetophobia Questionnaire (EmetQ-​ 13):  Psychometric validation of a measure of specific phobia of vomiting (emetophobia). Journal of Anxiety Disorders, 27, 670–​677. Brown, T. A., & Barlow, D. H. (2014). Anxiety and Related Disorders Interview Schedule for DSM-​ 5 (ADIS-​ 5). New York, NY: Oxford University Press. Brown, T. A., DiNardo, P. A., & Barlow, D. H. (1994). Anxiety Disorders Interview Schedule for DSM-​IV (ADIS-​IV). New York, NY: Oxford University Press. Brown, T. A., DiNardo, P. A., Lehman, C. L., & Campbell, L. A. (2001). Reliability of DSM-​IV anxiety and mood disorders: Implications for the classification of emotional disorders. Journal of Abnormal Psychology, 110, 49–​58. Bunnell, B. E., Joseph, D. L., & Beidel, D. C. (2013). Measurement invariance of the social phobia and anxiety inventory. Journal of Anxiety Disorders, 27, 84–​91. Caballo, V. E., Arias, B., Salazar, I. C., Irurtia, M. J., & Hofmann, S. G. (2015). Psychometric properties of an innovative self-​report measure:  The Social Anxiety Questionnaire for adults. Psychological Assessment, 27, 997–​1012. Campbell-​Sills, L., Espejo, E., Ayers, C. R., Roy-​Byrne, P., & Stein, M. B. (2015). Latent dimensions of social anxiety disorder: A re-​evaluation of the Social Phobia Inventory (SPIN). Journal of Anxiety Disorders, 36, 84–​91. Camuri, G., Oldani, L., Dell’Osso, B., Benatti, B., Lietti, L., Palazzo, C., & Altamura, A. C. (2014). Prevalence


and disability of comorbid social phobia and obsessive–​ compulsive disorder in patients with panic disorder and generalized anxiety disorder. International Journal of Psychiatry in Clinical Practice, 18, 248–​254. Carleton, R. N., Collimore, K. C., McCabe, R. E., & Antony, M. M. (2011). Addressing revisions to the Brief Fear of Negative Evaluation Scale: Measuring fear of negative evaluation across anxiety and mood disorders. Journal of Anxiety Disorders, 25, 822–​828. Clark, D. B., Turner, S. M., Beidel, D. C., Donovan, J. E., Kirisci, L., & Jacob, R. G. (1994). Reliability and validity of the Social Phobia and Anxiety Inventory for adolescents. Psychological Assessment, 6, 135–​140. Clark, D. M., & Wells A. (1995). A cognitive model of social phobia. In R. G. Heimberg, M. R. Liebowitz, D. A. Hope, & F. R. Schneier (Eds.), Social phobia:  Diagnosis, assessment, and treatment (pp. 69–​93). New York, NY: Guilford. Collins, K. A., Westra, H. A., Dozois, D. J. A., & Stewart, S. H. (2005). The validity of the brief version of the Fear of Negative Evaluation Scale. Journal of Anxiety Disorders, 19, 345–​359. Connor, K. M., Davidson, J. R. T., Churchill, E., Sherwood, A., Foa, E., & Weisler, R. H. (2000). Psychometric properties of the Social Phobia Inventory (SPIN): New self-​ rating scale. British Journal of Psychiatry, 176, 379–​386. Crippa, J. A. S., de Lima Osório, F., Del-​Ben, C., Filho, A. S., da Silva Freitas, M. C., & Loureiro, S. R. (2008). Comparability between telephone and face-​ to-​ face Structured Clinical Interview for DSM-​IV in assessing social anxiety disorder. Perspectives in Psychiatric Care, 44, 241–​247. Cuming, S., Rapee, R. M., Kemp, N., Abbott, M. J., Peters, L., & Gaston, J. E. (2009). A self-​report measure of subtle avoidance and safety behaviors relevant to social anxiety:  Development and psychometric properties. Journal of Anxiety Disorders, 23, 879–​883. Curtis, G. C., Magee, W. J., Eaton, W. W., Wittchen, H.-​ U., & Kessler, R. C. (1998). Specific fears and phobias: Epidemiology and classification. British Journal of Psychiatry, 173, 212–​217. Cutshall, C., & Watson, D. (2004). The phobic stimuli response scales:  A new self-​ report measure of fear. Behaviour Research and Therapy, 42, 1193–​1201. Davidson, J. R. T., Potts, N. L. S., Richichi, E. A., Ford, S. M., Krishnan, R. R., Smith, R. D., & Wilson, W. (1991). The Brief Social Phobia Scale. Journal of Clinical Psychiatry, 52(11, Suppl.), 48–​51. de Jong, P. J., Peters, M., & Vanderhallen, I. (2002). Disgust and disgust sensitivity in spider phobia: Facial EMG in response to spider and oral disgust imagery. Journal of Anxiety Disorders, 16, 477–​493. de Jongh, A., Muris, P., Schoenmakers, N., & ter Horst, G. (1995). Negative cognitions of dental phobics: Reliability

and validity of the dental cognitions questionnaire. Behaviour Research and Therapy, 33, 507–​515. de Lijster, J. M., Dierckx, B., Utens, E. M., Verhulst, F. C., Zieldorff, C., Dieleman, G. C., & Legerstee, J. S. (2017). The age of onset of anxiety disorders: A meta-​analysis. Canadian Journal of Psychiatry, 62(4), 237–​246. de Vente, W., Majdandžić, M., Voncken, M. J., Beidel, D. C., & Bögels, S. M. (2014). The SPAI-​18, a brief version of the Social Phobia and Anxiety Inventory: Reliability and validity in clinically referred and non-​referred samples. Journal of Anxiety Disorders, 28, 140–​147. Devins, G. M. (1994). Illness intrusiveness and the psychosocial impact of lifestyle disruptions in chronic life-​ threatening disease. Advances in Renal Replacement Therapy, 1, 251–​263. Devins, G. M., Binik, Y. M., Hutchinson, T. A., Hollomby, D. J., Barré, P. E., & Guttman, R. D. (1983). The emotional impact of end-​stage renal disease: Importance of patients’ perceptions of intrusiveness and control. International Journal of Psychiatry in Medicine, 13, 327–​343. DiNardo, P., Brown, T. A., & Barlow, D. H. (1994). Anxiety Disorders Interview Schedule for DSM-​IV. New  York, NY: Oxford University Press. Duncko, R., & Veale, D. (2016). Changes in disgust and heart rate during exposure for obsessive compulsive disorder: A case series. Journal of Behavior Therapy and Experimental Psychiatry, 51, 92–​99. Edelmann, R. J., & Baker, S. R. (2002). Self-​reported and actual physiological responses in social phobia. British Journal of Clinical Psychology, 41, 1–​14. Egan, S. J., Wade, T. D., Shafran, R., & Antony, M. M. (2014). Cognitive–​behavioral treatment of perfectionism. New York, NY: Guilford. Eikenaes, I., Egeland, J., Hummelen, B., & Wilberg, T. (2015). Avoidant personality disorder versus social phobia: The significance of childhood neglect. PLoS One, 10, e0122846. Essau, C. A., Conradt, J., & Petermann, F. (2000). Frequency, comorbidity, and psychosocial impairment of specific phobia in adolescents. Journal of Clinical Child Psychology, 29, 221–​231. Fehm, L., Pelissolo, A., Furmark, T., & Wittchen, H. (2005). Size and burden of social phobia in Europe. European Neuropsychopharmacology, 15, 453–​462. Fernandez, K. C., Piccirillo, M. L., & Rodebaugh, T. L. (2014). Self-​report assessment: The status of the field and room for improvement. In J. W. Weeks (Ed.), The Wiley-​ Blackwell handbook of social anxiety disorder (pp. 292–​ 319). New York, NY: Wiley. First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1996). Structured Clinical Interview for Axis I  DSM-​ IV Disorders–​Patient Edition (SCID-​I/​P Version 2.0). New  York, NY:  Biometrics Research Department, New York State Psychiatric Institute.


First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015). Structured Clinical Interview for DSM-​5–​ Research Version. Arlington, VA: American Psychiatric Publishing. Fredrikson, M. (1983). Reliability and validity of some specific fear questionnaires. Scandinavian Journal of Psychology, 24, 331–​334. Frost, R. O., Marten, P., Lahart, C., & Rosenblate, R. (1990). The dimensions of perfectionism. Cognitive Therapy and Research, 14, 449–​468. Frost, R. O., Trepanier, K. L., Brown, E. J., Heimberg, R. G., Juster, H. R., Makris G. S., & Leung, A. W. (1997). Self-​monitoring of mistakes among subjects high and low in perfectionistic concern over mistakes. Cognitive Therapy and Research, 21, 209–​222. Fydrich, T., Chambless, D. L., Perry, K. L., Buergener, F., & Beazley, M. B. (1998). Behavioral assessment of social performance:  A rating system for social phobia. Behaviour Research and Therapy, 36, 995–​1010. Fyer, A. J., Mannuzza, S., Gallops, M. S., Martin, L. Y., Aaronson, C., Gorman, J. G., . . . Klein, D. F. (1990). Familial transmission of simple phobias and fears. Archives of General Psychiatry, 47, 252–​256. Geer, J. H. (1965). The development of a scale to measure fear. Behaviour Research and Therapy, 3, 45–​53. Gillis, M., Haaga, D., & Ford, G. (1995). Normative values for the BDI, FQ, PSWQ and SPAI. Psychological Assessment, 7, 450–​455. Goldin, P. R., Morrison, A., Jazaieri, H., Brozovich, F., Heimberg, R., & Gross, J. J. (2016). Group CBT versus MBSR for social anxiety disorder:  A randomized controlled trial. Journal of Consulting and Clinical Psychology, 84, 427–​437. Gore, K. L., Carter, M. M., & Parker, S. (2002). Predicting anxious response to a social challenge:  The predictive utility of the Social Interaction Anxiety Scale and the Social Phobia Scale in a college population. Behaviour Research and Therapy, 40, 689–​700. Gros, D. F., McCabe, R. E., & Antony, M. M. (2013). Using a hybrid model to investigate the comorbidity and symptom overlap between social phobia and the other anxiety disorders and unipolar mood disorders. Psychiatry Research, 210, 188–​192. Haidt, J., McCauley, C., & Rozin, P. (1994). Individual differences in sensitivity to disgust: A scale sampling seven domains of disgust elicitors. Personality and Individual Differences, 16, 701–​713. Harpole, J. K., Levinson, C. A., Woods, C. M., Rodebaugh, T. L., Weeks, J. W., Brown, P. J.,  .  .  .  Liebowitz, M. (2015). Assessing the straightforwardly-​ worded Brief Fear of Negative Evaluation Scale for differential item functioning across gender and ethnicity. Journal of Psychopathology and Behavioral Assessment, 37, 306–​317.


Hoehn-​ Saric, R., McLeod, D. R., Funderburk, F., & Kowalski, P. (2004). Somatic symptoms and physiologic responses in generalized anxiety disorder and panic disorder: An ambulatory monitor study. Archives of General Psychiatry, 61, 913–​921. Hofmann, S. G. (2000). Self-​focused attention before and after treatment of social phobia. Behaviour Research and Therapy, 38, 717–​725. Hofmann, S. G., Heinrichs, N., & Moscovitch, D. A. (2004). The nature and expression of social phobia:  Toward a new classification. Clinical Psychology Review, 24, 769–​797. Hood, H. K., & Antony, M. M. (2012). Evidence-​ based assessment and treatment of specific phobias in adults. In T. E. Davis, T. H. Ollendick, & L. -​G. Öst (Eds.), Intensive one-​session treatment of specific phobias (pp. 19–​42). New York, NY: Springer. Hugdahl, K., & Öst, L.-​G. (1985). Subjectively rated physiological and cognitive symptoms in six different clinical phobias. Personality and Individual Differences, 6, 175–​188. Huijding, J., & de Jong, P. J. (2006). Specific predictive power of spider-​related affective associations for controllable and uncontrollable fear responses toward spiders. Behaviour Research and Therapy, 44, 161–​176. Iza, M., Olfson, M., Vermes, D., Hoffer, M., Wang, S., & Blanco, C. (2013). Probability and predictors of first treatment contact for anxiety disorders in the united states. Journal of Clinical Psychiatry, 74, 1093–​1100. Johnson, B. H., Thayer, J. F., Laberg, J. C., Wormnes, B., Raadal, M., Skaret, E., . . . Berg, E. (2003). Attentional and physiological characteristics of patients with dental anxiety. Journal of Anxiety Disorders, 17, 75–​87. Johnson, H. S., Inderbitzen-​Nolan, H. M., & Anderson, E. R. (2006). The Social Phobia Inventory:  Validity and reliability in an adolescent community sample. Psychological Assessment, 18, 269–​277. Katerelos, M., Hawley, L. L., Antony, M. M., & McCabe, R. E. (2008). The exposure hierarchy as a measure of progress and efficacy in the treatment of social anxiety disorder. Behavior Modification, 32, 504–​518. Kessler, R. C., Berglund, P., Demler, O., Jin, R., Merikangas, K. R., & Walters, E. E. (2005). Lifetime prevalence and age-​of-​onset distributions of DSM-​III-​R disorders in the National Comorbidity Survey Replication. Archives of General Psychiatry, 62, 593–​602. Kessler, R. C., & Ustun, T. B. (2004). The World Mental Health (WMH) survey initiative version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI). International Journal of Methods in Psychiatric Research, 13, 93–​121. Kleinknecht, R. A., Klepac, R. K., & Alexander, L. D. (1973). Origins and characteristics of dental fear. Journal of the American Dental Association, 86, 842–​848.


Kleinknecht, R. A., Thorndike, R. M., & Walls, M. M. (1996). Factorial dimensions and correlates of blood, injury, injection and related medical fears:  Cross validation of the Medical Fear Survey. Behaviour Research and Therapy, 34, 323–​331. Klieger, D. M. (1987). The Snake Anxiety Questionnaire as a measure of ophidophobia. Educational and Psychological Measurement, 47, 449–​459. Klorman, R., Hastings, J. E., Weerts, T. C., Melamed, B. G., & Lang, P. J. (1974). Psychometric description of some specific-​fear questionnaires. Behavior Therapy, 5, 401–​409. Koyuncu, A., Ertekin, E., Deveci, E., Ertekin, B. A., Yüksel, Ç., Çelebi, F.,  .  .  .  Tükel, R. (2015). Age of onset in social anxiety disorder:  Relation to clinical variables and major depression comorbidity. Annals of Clinical Psychiatry, 27, 84–​89. Lawyer, S. R., & Smitherman, T. A. (2004). Trends in anxiety assessment. Journal of Psychopathology and Behavioral Assessment, 26, 101–​106. Leary, M. R. (1983). A brief version of the Fear of Negative Evaluation Scale. Personality and Social Psychology Bulletin, 9, 371–​375. Levinson, C. A., Langer, J. K., & Rodebaugh, T. L. (2013). Reactivity to exclusion prospectively predicts social anxiety symptoms in young adults. Behavior Therapy, 44, 470–​478. Liebowitz, M. R. (1987). Social phobia. Modern Problems in Pharmacopsychiatry, 22, 141–​173. Lipsitz, J. D., Barlow, D. H., Mannuzza, S., Hofmann, S. G., & Fyer, A. J. (2002). Clinical features of four DSM-​IV specific phobia subtypes. Journal of Nervous and Mental Disease, 190, 471–​478. Loken, E. K., Hettema, J. M., Aggen, S. H., & Kendler, K. S. (2014). The structure of genetic and environmental risk factors for fears and phobias. Psychological Medicine, 44, 2375–​2384. Mattick, R. P., & Clarke, J. C. (1998). Development and validation of measures of social phobia scrutiny fear and social interaction anxiety. Behaviour Research and Therapy, 36, 455–​470. McCabe, R. E., Antony, M. M., Summerfeldt, L., Liss, A., & Swinson, R. P. (2003). Preliminary examination of the relationship between anxiety disorders in adults and self-​reported history of teasing or bullying experiences. Cognitive Behaviour Therapy, 32, 187–​193. McCabe, R. E., Hood, H., & Antony, M. M. (2015). Anxiety disorders:  Social anxiety disorder and specific phobia. In A. Tasman, J. Kay, J. A. Lieberman, M. B. First, & M. B. Riba (Eds.), Psychiatry (4th ed., pp. 1019–​1056). Chichester, UK: Wiley-​Blackwell. McCraw, K. S., & Valentiner, D. P. (2015). The Circumscribed Fear Measure:  Development and initial validation of a trans-​stimulus phobia measure. Psychological Assessment, 27, 403.

McDonald, S. D., Hartman, N. S., & Vrana, S. R. (2008). Trait anxiety, disgust sensitivity, and the hierarchic structure of fears. Journal of Anxiety Disorders, 22, 1059–​1074. Mendlowicz, M. V., & Stein, M. B. (2000). Quality of life in individuals with anxiety disorders. American Journal of Psychiatry, 157, 669–​682. Merckelbach, H., de Jong, P. J., Arntz, A., & Schouten, E. (1993). The role of evaluative learning and disgust sensitivity in the etiology and treatment of spider phobia. Advances in Behaviour Research and Therapy, 15, 243–​255. Moscovitch, D. A. (2009). What is the core fear in social phobia? A new model to facilitate individualized case conceptualization and treatment. Cognitive and Behavioral Practice, 16, 123–​134. Muris, P., & Merckelbach, H. (1996). A comparison of two spider phobia questionnaires. Journal of Behavior Therapy and Experimental Psychiatry, 27, 241–​244. Nelson, A. L., Vorstenbosch, V., & Antony, M. M. (2014). Assessing fear of storms and severe weather: Validation of the Storm Fear Questionnaire (SFQ). Journal of Psychopathology and Behavioural Assessment, 36, 105–​114. Norton, P. J., & Hope, D. A. (2001). Analogue observational methods in the assessment of social functioning in adults. Psychological Assessment, 13, 59–​72. Nowakowski, M., Rowa, K., Antony, M. M., & McCabe, R. E. (2016). Changes in anxiety sensitivity following group cognitive–​behavioral therapy for social anxiety disorder and panic disorder. Cognitive Therapy and Research, 40, 468–​478. Olatunji, B. O., Cisler, J. M., & Deacon, B. J. (2010). Efficacy of cognitive behavioral therapy for anxiety disorders: A review of meta-​analytic findings. Psychiatric Clinics of North America, 33, 557–​577. Olatunji, B. O., Ebesutani, C., & Reise, S. P. (2015). A bifactor model of disgust proneness:  Examination of the Disgust Emotion Scale. Assessment, 22, 248–​262. Olatunji, B. O., Williams, N. L., Tolin, D. F., Abramowitz, J. S., Sawchuk, C. N., Lohr, J. M., & Elwood, L. S. (2007). The Disgust Scale: Item analysis, factor structure, and suggestions for refinement. Psychological Assessment, 19, 281–​297. Olatunji, B. O., Woods, C. M., de Jong, P. J., Teachman, B. A., Sawchuk, C. N., & David, B. (2009). Development and initial validation of an abbreviated Spider Phobia Questionnaire using item response theory. Behavior Therapy, 40, 114–​130. Ollendick, T., Allen, B., Benoit, K., & Cowart, M. (2011). The tripartite model of fear in children with specific phobias: Assessing concordance and discordance using the behavioral approach test. Behaviour Research and Therapy, 49, 459–​465.


Ollendick, T., Davis, T., & Muris, P. (2004). Treatment of specific phobia in children and adolescents. In P. Barrett & T. Ollendick (Eds.), Handbook of interventions that work with children and adolescents: Prevention and treatment (pp. 273–​300). Chichester, UK: Wiley. Ollendick, T. H., Raishevich, N., Davis, T. E., III, Sirbu, C., & Öst, L.-​ G. (2010). Specific phobia in youth:  Phenomenology and psychological characteristics. Behavior Therapy, 41, 133–​141. Orsillo, S. M. (2001). Measure for social phobia. In M. M. Antony, S. M. Orsillo, & L. Roemer (Eds.), Practitioner’s guide to empirically based measures of anxiety (pp. 165–​187). New York, NY: Springer. Osman, A., Barrios, F. X., Haupt, D., King, K., Osman, J. R., & Slavens, S. (1996). The Social Phobia and Anxiety Inventory:  Further validation in two nonclinical samples. Journal of Psychopathology and Behavioral Assessment, 18, 35–​47. Osório, F. L., Crippa, J. A.  S., & Loureiro, S. R. (2010). Evaluation of the psychometric properties of the Social Phobia Inventory in university students. Comprehensive Psychiatry, 51, 630–​640. Öst, L.-​G. (1978). Fading vs. systematic desensitization in the treatment of snake and spider phobia. Behaviour Research and Therapy, 16, 379–​389. Peterson, R. A., & Reiss, S. (1993). Anxiety Sensitivity Index Revised test manual. Worthington, OH: IDS. Polák, J., Sedláčková, K., Nácar, D., Landová, E., & Frynta, D. (2016). Fear the serpent:  A psychometric study of snake phobia. Psychiatry Research, 242, 163–​168. Potter, C., Wong, J., Heimberg, R., Blanco, C., Liu, S., Wang, S., & Schneier, F. (2014). Situational panic attacks in social anxiety disorder. Journal of Affective Disorders, 167, 1–​7. Poulton, R., & Menzies, R. G. (2002). Non-​associative fear acquisition:  A review of the evidence from retrospective and longitudinal research. Behaviour Research and Therapy, 40, 127–​149. Purdon, C., Antony, M. M., & Swinson, R. P. (1999). Psychometric properties of the Frost Multidimensional Perfectionism Scale in a clinical anxiety disorders sample. Journal of Clinical Psychology, 55, 1271–​1286. Quilty, L. C., van Ameringen, M., Mancini, C., Oakman, J., & Farvolden, P. (2003). Quality of life and the anxiety disorders. Journal of Anxiety Disorders, 17, 405–​426. Rachman, S. (1977). The conditioning theory of fear-​ acquisition: A critical examination. Behaviour Research and Therapy, 15, 375–​387. Ralevski, E., Sanislow, C. A., Grilo, C. M., Skodol, A. E., Gunderson, J. G., & Shea, M. T.,  .  .  .  McGlashan, T. H. (2005). Avoidant personality disorder and social phobia: Distinct enough to be separate disorders? Acta Psychiatrica Scandinavica, 112, 208–​214.


Rapaport, M. H., Clary, C., Fayyad, R., & Endicott, J. (2005). Quality-​of-​life impairment in depressive and anxiety disorders. American Journal of Psychiatry, 162, 1171–​1178. Rapee, R. M., & Spence, S. H. (2004). The etiology of social phobia:  Empirical evidence and an initial model. Clinical Psychology Review, 24, 737–​767. Reinelt, E., Aldinger, M., Stopsack, M., Schwahn, C., John, U., Baumeister, S. E.,  .  .  .  Barnow, S. (2014). High social support buffers the effects of 5-​HTTLPR genotypes within social anxiety disorder. European Archives of Psychiatry and Clinical Neuroscience, 264, 433–​439. Rodebaugh, T. L., Heimberg, R. G., Woods, C. M., Liebowitz, M. R., & Schneier, F. R. (2006). The factor structure and screening utility of the Social Interaction Anxiety Scale. Psychological Assessment, 18, 231–​237. Rodebaugh, T. L., Woods, C. M., Thissen, D. M., Heimberg, R. G., Chambless, D. L., & Rapee, R. M. (2004). More information from fewer questions: The factor structure and item properties of the original and Brief Fear of Negative Evaluation Scale. Psychological Assessment, 16, 169–​181. Rogers, R. (1995). Handbook of diagnostic and structured interviewing. New York, NY: Guilford. Ross, H. E., Swinson, R., Doumani, S., & Larkin, E. J. (1995). Diagnosing comorbidity in substance abusers: A comparison of the test–​retest reliability of two interviews. American Journal of Drug and Alcohol Abuse, 21, 167–​185. Rowa, K., Antony, M. M., Brar, S., Summerfeldt, L. J., & Swinson, R. P. (2000). Treatment histories of patients with three anxiety disorders. Depression and Anxiety, 12, 92–​98. Rowa, K., Antony, M. M., Summerfeldt, L. J., Purdon, C., Young, L., & Swinson, R. P. (2007). Office-​based vs. home-​based behavioral treatment for obsessive compulsive disorder:  A preliminary study. Behaviour Research and Therapy, 45, 1883–​1892. Rowa, K., Gifford, S., McCabe, R. E., Milosevic, I., Antony, M. M., & Purdon, C. (2014). Treatment fears in anxiety disorders: Development and validation of the Treatment Ambivalence Questionnaire. Journal of Clinical Psychology, 70, 979–​993. Rowa, K., Paulitzki, J. R., Ierullo, M. D., Chiang, B., Antony, M. M., McCabe, R. E., & Moscovitch, D. (2015). A false sense of security: Safety behaviors erode objective speech performance in individuals with social anxiety disorder. Behavior Therapy, 46, 304–​314. Safran, J. D., & Segal, Z. V. (1990). Interpersonal process in cognitive therapy. New York, NY: Basic Books. Sanderson, W. C., DiNardo, P. A., Rapee, R. M., & Barlow, D. H. (1990). Syndrome comorbidity in patients diagnosed with a DSM-​III-​R anxiety disorder. Journal of Abnormal Psychology, 99, 308–​312.


Sawchuk, C. N., Lohr, J. M., Tolin, D. F., Lee, T. C., & Kleinknecht, R. A. (2000). Disgust sensitivity and contamination fears in spider and blood-​injection-​injury phobias. Behaviour Research and Therapy, 38, 753–​762. Scaini, S., Battaglia, M., Beidel, D. C., & Ogliari, A. (2012). A meta-​analysis of the cross-​cultural psychometric properties of the Social Phobia and Anxiety Inventory for Children (SPAI-​C). Journal of Anxiety Disorders, 26, 182–​188. Scaini, S., Belotti, R., & Ogliari, A. (2014). Genetic and environmental contributions to social anxiety across different ages: A meta-​analytic approach to twin data. Journal of Anxiety Disorders, 28, 650–​656. Schmitz, J., Krämer, M., Tuschen-​Caffier, B., Heinrichs, N., & Blechert, J. (2011). Restricted autonomic flexibility in children with social phobia. Journal of Child Psychology and Psychiatry, 52, 1203–​1211. Schry, A. R., Roberson-​Nay, R., & White, S. W. (2012). Measuring social anxiety in college students: A comprehensive evaluation of the psychometric properties of the SPAI-​23. Psychological Assessment, 24, 846. Sheehan, D. V. (1983). The anxiety disease. New  York, NY: Scribner. Shikatani, B., Antony, M. M., Cassin, S. E., & Kuo, J. R. (2016). Examining the role of perfectionism and intolerance of uncertainty in postevent processing in social anxiety disorder. Journal of Psychopathology and Behavioral Assessment, 38, 297–​306. Southam-​Gerow, M. A., & Chorpita, B. F. (2007). Anxiety disorders. In E. J. Mash & R. A. Barkley (Eds.), Assessment of childhood disorders (4th ed., pp. 347–​397). New York, NY: Guilford. Spence, S. H., & Rapee, R. M. (2016). The etiology of social anxiety disorder:  An evidence-​based model. Behaviour Research and Therapy, 86, 50–​56. Spitzer, R. L., & Fleiss J. L. (1974). A re-​analysis of the reliability of psychiatric diagnosis. British Journal of Psychiatry, 125, 341–​347. Stangier, U., Heidenreich, T., & Schermelleh-​ Engel, K. (2006). Safety behaviors and social performance in patients with generalized social phobia. Journal of Cognitive Psychotherapy, 20, 17–​31. Starr, L. R., Hammen, C., Connolly, N. P., & Brennan, P. A. (2014). Does relational dysfunction mediate the association between anxiety disorders and later depression? Testing an interpersonal model of comorbidity. Depression and Anxiety, 31, 77–​86. Steinhausen, H. C., Jakobsen, H., Meyer, A., Jørgensen, P. M., & Lieb, R. (2016). Family aggregation and risk factors in phobic disorders over three-​generations in a nation-​wide study. PLoS One, 11, e0146591. Stouthard, M. E. A., Hoogstraten, J., & Mellenbergh, G. J. (1995). A study on the convergent and discriminant validity of the Dental Anxiety Inventory. Behaviour Research and Therapy, 5, 589–​595.

Stouthard, M. E. A., Mellenbergh, G. J., & Hoogstraten, J. (1993). Assessment of dental anxiety: A facet approach. Anxiety, Stress and Coping, 6, 89–​105. Summerfeldt, L. J., Kloosterman, P. H., & Antony, M. M. (2010). Structured and semi-​structured interviews. In M. M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (2nd ed., pp. 95–​137). New York, NY: Guilford. Szymanski, J., & O’Donohue, W. (1995). The potential role of state-​dependent learning in cognitive therapy with spider phobics. Journal of Rational–​Emotive and Cognitive–​Behavior Therapy, 13, 131–​150. Taylor, S., Koch, W. J., & McNally, R. J. (1992). How does anxiety sensitivity vary across the anxiety disorders? Journal of Anxiety Disorders, 6, 249–​259. Taylor, S., Woody, S., McLean, P. D., & Koch, W. J. (1997). Sensitivity of outcome measures for treatments of generalized social phobia. Assessment, 4, 181–​191. Taylor, S., Zvolensky, M. J., Cox, B. J., Deacon, B., Heimberg, R. G., Ledley, D. R., . . . Cardenas, S. J. (2007). Robust dimensions of anxiety sensitivity:  Development and initial validation of the Anxiety Sensitivity Index-​ 3. Psychological Assessment, 19, 176–​188. Thomas, S. A., Daruwala, S. E., Goepel, K. A., & De Los Reyes, A. (2012). Using the subtle avoidance frequency examination in adolescent social anxiety assessments. Child and Youth Care Forum, 41, 547–​559. Tolin, D. F., Lohr, J. M., Sawchuk, C. N., & Lee, T. C. (1997). Disgust and disgust sensitivity in blood-​injection-​injury and spider phobia. Behaviour Research and Therapy, 35, 949–​953. Trower, P., Bryant, B., & Argyle, M. (1978). Social skills and mental health. Pittsburgh, PA:  University of Pittsburgh Press. Turner, S. M., Beidel, D. C., & Dancu, C. V. (1996). SPAI—​ Social Phobia and Anxiety Inventory: Manual. Toronto, Ontario, Canada: Multi-​Health Systems. Turner, S. M., Beidel, D. C., Dancu, C. V., & Keys, D. J. (1986). Psychopathology of social phobia and comparison with avoidant personality disorder. Journal of Abnormal Psychology, 95, 389–​394. Turner, S. M., Beidel, D. C., Dancu, C. V., & Stanley, M. A. (1989). An empirically derived inventory to measure social fears and anxiety: The Social Phobia and Anxiety Inventory. Psychological Assessment, 1, 35–​40. Van Ameringen, M., Mancini, C., Styan, G., & Donison, D. (1991). Relationship of social phobia with other psychiatric illness. Journal of Affective Disorders, 21, 93–​99. van Overveld, M., de Jong, P. J., Peters, M. L., & Schouten, E. (2011). The Disgust Scale-​R: A valid and reliable index to investigate separate disgust domains? Personality and Individual Differences, 51, 325–​330. Ventura, J., Liberman, R. P., Green, M. F., Shaner, A., & Mintz, J. (1998). Training and quality assurance with


the Structured Clinical Interview for DSM-​IV (SCID-​I/​ P). Psychiatry Research, 79, 163–​173. Voncken, M. J., & Bögels, S. M. (2008). Social performance deficits in social anxiety disorder: Reality during conversation and biased perception during speech. Journal of Anxiety Disorders, 22, 1384–​1392. Vorstenbosch, V., Antony, M. M., Koerner, N., & Boivin, M. K. (2012). Assessing dog fear:  Evaluating the psychometric properties of the Dog Phobia Questionnaire. Journal of Behavior Therapy and Experimental Psychiatry, 43, 780–​786. Watson, D., & Friend, R. (1969). Measurement of social-​ evaluative anxiety. Journal of Consulting and Clinical Psychology, 33, 448–​457. Weeks, J. W., Carleton, R. N., Asmundson, G. J. G., McCabe, R. E., & Antony, M. M. (2010). “Social anxiety disorder carved at its joints”: Evidence for the taxonicity of social anxiety disorder. Journal of Anxiety Disorders, 24, 734–​742. Weeks, J. W., Heimberg, R. G., Fresco, D. M., Hart, T. A., Turk, C. L., Schneier, F. R., & Liebowitz, M. R. (2005). Empirical validation and psychometric evaluation of the Brief Fear of Negative Evaluation Scale in patients with social anxiety disorder. Psychological Assessment, 17, 179–​190.


Wei, J., Zhang, C., Li, Y., Xue, S., & Zhang, J. (2015). Psychometric properties of the Chinese version of the Fear of Negative Evaluation Scale-​Brief (BFNE) and the BFNE-​Straightforward for middle school students. PLoS ONE, 10(3), e0115948. Wheaton, M. G., Deacon, B. J., McGrath, P. B., Berman, N. C., & Abramowitz, J. S. (2012). Dimensions of anxiety sensitivity in the anxiety disorders: Evaluation of the ASI-​3. Journal of Anxiety Disorders, 26, 401–​408. Williams, J. B.  W., Gibbon, M., First, M. B., Spitzer, R. L., Davies, M., Borus, J.,  .  .  .  Wittchen, H. (1992). The Structured Clinical Interview for DSM-​ III-​ R (SCID):  Multisite test–​ retest reliability. Archives of General Psychiatry, 49, 630–​636. Wittchen, H.-​ U., & Fehm, L. (2003). Epidemiology and natural course of social fears and social phobia. Acta Psychiatrica Scandinavica, 108(Suppl. 417), 4–​18. Wolpe, J., & Lang, P. J. (1969). Fear Survey Schedule. San Diego, CA: Educational and Industrial Testing Service. Woods, C. M., Chambless, D. L., & Steketee, G. (2002). Homework compliance and behavior therapy outcome for panic with agoraphobia and obsessive compulsive disorder. Cognitive Behaviour Therapy, 31, 88–​95.

13

Panic Disorder and Agoraphobia

Amy R. Sewart
Michelle G. Craske

In this chapter, we first describe the presenting features and prevailing theories regarding etiology and maintenance factors for both panic disorder and agoraphobia. Next, we describe the diagnostic, treatment conceptualization and planning, and treatment outcome and monitoring assessment methods and strategies specific to these diagnoses.

NATURE OF PANIC DISORDER AND AGORAPHOBIA

Presenting Features

Panic attacks are abrupt, discrete surges of intense fear or discomfort, accompanied by physical and cognitive symptoms, as listed in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association [APA], 2013). Such symptoms include accelerated heart rate, lightheadedness, fear of losing control or dying, and shortness of breath. Episodes of panic peak within minutes and may be elicited by a cue or trigger (e.g., phobic object), or they occur "out of the blue" with no obvious precipitant. Panic attacks occur across a variety of mood and anxiety-related disorders and are predictive of disorder onset, course, and severity (Batelaan et al., 2012; Kessler et al., 2006; Kircanski, Craske, Epstein, & Wittchen, 2009). Thus, if an individual experiences four or more panic attack symptoms within the confinement of any disorder (e.g., an individual with social anxiety has panic symptoms prior to giving a speech), a panic attack specifier is added to the respective diagnosis (APA, 2013).

Panic disorder refers to recurrent, unexpected panic attacks, followed by at least 1 month of persistent concern about their recurrence and their consequences or a significant maladaptive change in behavior consequent to the attacks. Such behavioral changes include avoidance of particular situations that elicit panic-like symptoms and are perceived to elevate the likelihood of a panic attack (e.g., exercise) and the use of safety behaviors (e.g., having a cell phone in case of a panic attack). Frequency and symptom severity of panic attacks in individuals with panic disorder are highly variable (Craske & Barlow, 1988). Persons with panic disorder may experience daily episodes of panic or go months without an unexpected attack.

Highly comorbid with panic disorder, agoraphobia refers to marked fear or avoidance of specific situations from which escape is perceived to be difficult or in which help may be unavailable in the event of panic-like or other incapacitating or embarrassing symptoms (e.g., incontinence and fainting; APA, 2013). Typical avoided agoraphobic situations include open or enclosed spaces, waiting in line, using public transportation, and being outside of the home alone.

The diagnosis of agoraphobia and its relationship to panic attacks have undergone redefinition from the DSM-IV-TR (APA, 2000) to the DSM-5 (APA, 2013). Although highly comorbid, agoraphobia again exists as a diagnosis independent of panic disorder and irrespective of panic attacks in the DSM-5. This revision was driven by findings that agoraphobia does not invariably develop as a secondary, conditional response to contexts in which panic attacks occur and that a considerable number of individuals report clinical levels of agoraphobia without a history of panic attacks or panic-like symptoms (Faravelli, Furukawa, & Truglia, 2009; Wittchen et al., 2008).

From the latest epidemiological study, the National Comorbidity Survey-Replication (NCS-R), lifetime
prevalence estimates in the adult American population are 3.8% for panic disorder with or without agoraphobia and 2.5% for agoraphobia with or without a history of panic disorder (Kessler, Petukhova, Sampson, Zaslavsky, & Wittchen, 2012). Although this study did not report findings for agoraphobia without panic disorder or panic attacks, a 10-​ year prospective longitudinal study conducted by Wittchen and colleagues (2008) found that approximately 1.5% of German individuals in adolescence to mid-​adulthood experienced agoraphobia without co-​occurring distressing spells of anxiety. Current prevalence estimates for agoraphobia without panic attacks are limited given that agoraphobia was previously considered secondary to panic disorder almost exclusively. Rarely do the diagnoses of panic disorder or agoraphobia occur in isolation of other psychiatric conditions. Commonly co-​ occurring disorders include major depressive disorder, specific phobia, social phobia, post-​traumatic stress disorder, and substance use disorders (Brown, Campbell, Lehman, Grisham, & Mancill, 2001; Kessler et al., 2006). A striking 28% of adults in the United States will experience at least one panic attack within their lifetime (Kessler et al., 2006). A substantial proportion of adolescents report panic attacks (e.g., Hayward, Killen, Kraemer, & Taylor, 2000), with the modal age of onset for panic attacks and panic disorder ranging from early adolescence to early adulthood (Kessler et  al., 2005; Wittchen et  al., 2008). In contrast, agoraphobia with and without panic disorder has demonstrated onset as early as childhood (Wittchen et al., 2008). Although 33.6% of adults with panic disorder and 15.1% with agoraphobia initiate contact with a health care provider for treatment within the first year of disorder onset, the median delay of treatment contact after this period is estimated respectively at 10 and 12 years (Wang et al., 2005). Furthermore, individuals with panic disorder and agoraphobia are more likely to seek treatment during the course of their lifetime for psychiatric problems compared with panic disorder, agoraphobia with panic attacks, and panic attack subgroups (Kessler et al., 2006). Although no gender differences have been observed for age of onset of panic disorder or agoraphobia, the hazard ratio for women increases significantly over time for both disorders (Wittchen et al., 2008). Most people with panic disorder report identifiable stressors around the time of their first panic attack commonly related to interpersonal issues or physical well-​being, such as disease or negative drug experiences (Craske, Miller, Rotunda, & Barlow, 1990; Pollard, Pollard, & Corn, 1989). Approximately half of those with panic disorder report having experienced panicky

feelings at some time before their first panic attack, suggesting that onset may be either insidious or acute (Craske et al., 1990). Finally, both panic disorder and agoraphobia tend to be chronic (Bruce et al., 2005), impairing conditions, with severe financial and interpersonal costs. Yonkers, Bruce, Dyck, and Keller (2003) demonstrated that although a large percentage of individuals with panic disorder reach full remission over the course of 8 years (76% women, 69% men), remission rates for panic disorder with agoraphobia are more modest (39% women, 35% men), indicating a higher chronicity associated with agoraphobia. Furthermore, individuals with panic disorder overutilize medical resources compared to the general public and individuals with other psychiatric disorders (e.g., Roy-Byrne et al., 1999). Given that panic attack symptoms are largely somatic in nature, many individuals choose to seek help in medical settings (e.g., general practitioner's office and the emergency room; Katerndahl & Realini, 1995). Panic attacks may be mistaken for a coronary event, prompting individuals to seek costly emergency hospitalization.

Etiological and Maintaining Factors

Several independent lines of research converged in the 1980s on the same basic conceptualization of panic disorder as an acquired fear of bodily sensations, particularly sensations associated with autonomic arousal, which is enhanced in the presence of certain psychological and biological predispositions. The following descriptions draw heavily from a more detailed description presented in Craske and Barlow (2007).

Genetics

The occurrence of panic disorder and agoraphobia clusters within families. According to a large meta-analysis of twin and family studies, heritability of panic disorder with or without agoraphobia is estimated at a moderate .48 and possesses a summary odds ratio predicting association of illness in first-degree relatives at 5.0 (Hettema, Neale, & Kendler, 2001). More modest heritability estimates (95% confidence interval = .30 to .34) have been found for panic symptoms as measured by the Anxiety Sensitivity Index (López-Solà et al., 2014). Such findings give evidence to a strong familial component to panic disorder and agoraphobia. In a study of individuals who met diagnostic criteria for panic disorder with or without agoraphobia, carriers of the
polymorphic 5-HTTLPR short allele variant experienced more severe and frequent panic symptoms (Lonsdorf et al., 2009). Furthermore, a strong association has been found between bi- and triallelic 5-HTTLPR polymorphisms and observer-rated panic disorder symptoms (Lonsdorf et al., 2009). Conversely, a meta-analysis examining 10 previous studies failed to find a significant association between 5-HTTLPR and panic disorder irrespective of agoraphobia (Blaya, Salum, Lima, Leistner-Segal, & Manfro, 2007). In addition, the Val158Met (rs4680G/A) polymorphism of the catechol-O-methyltransferase (COMT) gene has been implicated in panic disorder susceptibility in several independent samples and has demonstrated female-specific effects (Domschke, Deckert, O'Donovan, & Glatt, 2007; Domschke et al., 2008). The 5-HTTLPR and Val158Met polymorphisms have been implicated in other disorders, such as major depressive disorder, and likely play a role in broader affective dysfunction including panicogenesis (e.g., Massat et al., 2005). It is likely that many genetic variants collectively act to produce the panic disorder phenotype, but each gene itself may only account for minor influence. Recent genome-wide association studies (GWAS) have localized specific single nucleotide polymorphisms (SNPs) that may play a role in the pathogenesis of panic disorder (e.g., rs12579350; Otowa et al., 2009). However, most current GWAS findings lack sufficient statistical power given the large sample sizes necessary to detect small effects of certain susceptibility loci.

Neuroticism

Neuroticism, the predisposition toward experiencing negative mood states (e.g., fear and disgust), is strongly associated with anxiety disorders, including panic disorder and agoraphobia (Eysenck, 1967/2009; Watson & Clark, 1984). Neuroticism and its proxy (i.e., emotional reactivity) predict the onset of panic attacks (e.g., Hayward et al., 2000) and panic disorder (Craske, Poulton, Tsao, & Plotkin, 2001). Numerous multivariate genetic analyses of human twin samples consistently attribute approximately 30% to 50% of variance in neuroticism to additive genetic factors (e.g., Eley, 2001). In addition, anxiety and depression appear to be variable expressions of the heritable tendency toward neuroticism (Kendler, Heath, Martin, & Eaves, 1987). Furthermore, research also suggests that neuroticism may be linked with specific genetic polymorphisms also implicated in panic disorder pathogenesis (e.g., 5-HTTLPR; Gonda et al., 2009). Symptoms of panic (i.e., breathlessness and heart pounding) may be
additionally explained by a unique source of genetic variance that is differentiated from the variance relevant to neuroticism (Martin, Jardine, Andrews, & Heath, 1988). Anxiety Sensitivity Anxiety sensitivity is posited to play a critical role in the pathogenesis of panic disorder and agoraphobia. The nonspecific cognitive vulnerability factor of anxiety sensitivity captures the extent to which an individual believes that autonomic arousal-​related sensations result in harmful physical, social, or cognitive consequences (Taylor, 2014; Zinbarg, Barlow, & Brown, 1997). Although such concerns are observed across most anxiety and related disorders, anxiety sensitivity is most elevated in panic disorder (Wheaton, Deacon, McGrath, Berman, & Abramowitz, 2012; Zinbarg & Barlow, 1996). Respiratory abnormalities have been observed in healthy individuals with high anxiety sensitivity, such as fast, shallow breathing and avoidance of carbon dioxide stimulation during laboratory breathing tasks (Blechert, Wilhelm, Meuret, Wilhelm, & Roth, 2013). These abnormal psychophysiological responses may predispose one to developing panic disorder, wherein similar symptoms are exhibited by individuals diagnosed with the disorder (Coryell, Fyer, Pine, Martinez, & Arndt, 2001). Longitudinal studies have demonstrated that anxiety sensitivity is predictive of future panic attacks in adults irrespective of trait anxiety and history of panic (e.g., Ehlers, 1995; Schmidt, Lerew, & Jackson, 1997)  and over 1-​to 4-​year intervals in adolescents (Hayward et al., 2000). In addition, high anxiety sensitivity regarding physical concerns in conjunction with high environmental stress was found to be predictive of panic attacks and agoraphobic avoidance over and above the influence of negative affect (Zvolensky, Kotov, Antipova, & Schmidt, 2005). History of Medical Illness and Abuse Other studies highlight the role of medical illnesses in the development of panic disorder and agoraphobia. For example, experience with personal respiratory disturbance as a youth predicted panic disorder and agoraphobia at the ages of 18 or 21 years (Craske et al., 2001). Furthermore, others report more respiratory disturbance in the history of panic disorder patients compared to other anxiety disordered patients (Verburg, Griez, Meijer, & Pols, 1995) and also in first-​degree relatives of panic disorder patients compared to first-​degree relatives of patients with other anxiety disorders (van Beek, Schruers, & Griez, 2005). Childhood

experiences of sexual and physical abuse may also prime panic disorder (Goodwin, Fergusson, & Horwood, 2005). After controlling for related diagnoses, childhood sexual abuse history was found to be uniquely associated with panic disorder with and without agoraphobia (Cougle, Timpano, Sachs-​Ericsson, Keough, & Riccardi, 2010). Agoraphobia without panic disorder (DSM-​IV-​TR) was not found to be associated with childhood abuse in this sample, which may be a consequence of the low base rate of this condition. Retrospective reporting, however, limits such findings.

Maintenance Factors

Following the first panic attack, individuals with panic disorder develop an acute fear of bodily sensations associated with panic attacks (e.g., racing heart and dizziness; Barlow, 2004). For example, they are more likely to interpret bodily sensations in a catastrophic fashion (Clark, 1988) and allocate more attentional resources to words that represent physical threat (e.g., Maidenberg, Chen, Craske, Bohn, & Bystritsky, 1996) and heartbeat stimuli (Kroeze & van den Hout, 2000). In addition, individuals with panic disorder have been shown to exhibit greater anxiety to panic word pairs relative to neutral word pairs (De Cort et al., 2013). In contrast, previous research failed to demonstrate a difference in reaction times to panic-threat words during an emotional Stroop task between individuals with panic disorder, those with mixed anxiety disorders, and healthy controls (De Cort, Hermans, Spruyt, Griez, & Schruers, 2008). Given these conflicting findings, further research on attentional biases toward panic-related threat in panic disorder is required.

Individuals with panic disorder are more anxious in procedures that elicit bodily sensations similar to the ones experienced during panic attacks, such as cardiovascular, respiratory, audiovestibular exercises and inductions (Jacob, Furman, Clark, & Durrant, 1992; Kaplan et al., 2012; Zarate, Rapee, Craske, & Barlow, 1988) and carbon dioxide inhalations, compared to patients with other anxiety disorders (e.g., Rapee, Brown, Antony, & Barlow, 1992; Vickers, Jafarpour, Mofidi, Rafat, & Woznica, 2012) and healthy controls (e.g., Gorman et al., 1994; Zvolensky & Eifert, 2001). Finally, individuals with panic disorder fear signals that ostensibly reflect heightened arousal and false physiological feedback (Craske & Freed, 1995; Craske et al., 2002).

Fear of bodily sensations has been attributed to interoceptive conditioning, in which early somatic components of anxiety elicit conditioned bursts of anxiety or panic (Bouton, Mineka, & Barlow, 2001). An extensive body of experimental literature attests to the robustness of interoceptive conditioning (e.g., Acheson, Forsyth, & Moses, 2012) and its independence from conscious awareness of triggering cues (e.g., Block, Ghoneim, Fowles, Kumar, & Pathak, 1987). Hence, slight changes in relevant bodily functions that are not consciously recognized may elicit conditioned anxiety or fear and panic due to previous pairings with panic (Bouton et al., 2001). An alternative model offered by Clark (1986) attributes fear of sensations to catastrophic misappraisals (e.g., misinterpretation of sensations as signs of imminent death). Others argue that catastrophic misappraisals become conditioned stimuli that trigger panic (Bouton et al., 2001).

Autonomic arousal generated by fear of sensations is believed to contribute to ongoing panic by intensifying the sensations that are feared, thus creating a reciprocating cycle of fear and sensations. In addition, because bodily sensations that trigger panic attacks are not always immediately obvious, panic attacks appear to be unexpected (Barlow, 2004), resulting in even further anxiety (Craske, Glover, & DeCola, 1995). The unpredictability of panic and perceived inability to escape from bodily sensations similarly increases anxiety (Bouton et al., 2001; Maier, Laudenslager, & Ryan, 1985). In turn, anxiety increases the likelihood of panic by directly increasing the availability of sensations that have become conditioned cues for panic or by increasing attentional vigilance for these bodily cues. Thus, a maintaining cycle of panic and anxiety develops (Barlow, 2004). Indeed, the perceived probability of panicking in specific external contexts was found to be significantly related to agoraphobic avoidance (Craske, Rapee, & Barlow, 1988). Furthermore, perceived threat control was found to moderate the relationship between the belief that symptoms of anxiety are harmful (anxiety sensitivity) and agoraphobic avoidance in individuals with panic disorder (White, Brown, Somers, & Barlow, 2006). Both agoraphobic avoidance and subtle avoidance behaviors (e.g., holding onto supports for fear of fainting) are believed to maintain negative beliefs about feared bodily sensations and related contexts (Clark & Ehlers, 1993; Craske & Barlow, 2014).

Agoraphobia may be acquired via exteroceptive conditioning wherein panic-related sensations become paired with external stimuli (e.g., shopping malls) present during an attack (Mineka & Zinbarg, 2006). Due to this associative process, individuals may begin to avoid situations and environments that are perceived to be predictive of a panic attack. As previously mentioned, agoraphobia may not always manifest as fear of interoceptive

sensations (Pané-Farré et al., 2014; Wittchen et al., 2008). Agoraphobia without panic symptoms may develop through the irrational belief that being in certain environments or situations will increase the likelihood of a negative event (unrelated to panic) occurring, such as embarrassment due to incontinence, disorientation, or injury due to falling (APA, 2013). Agoraphobic avoidance is then reinforced by the non-occurrence of the feared event when those environments are avoided. Research on specific phobias suggests that agoraphobia without panic symptoms may be acquired through a direct traumatic conditioning event (e.g., embarrassment due to incontinence), vicariously (e.g., witnessing someone be shamed for incontinence), or through informational transmission (e.g., hearing that someone was shamed for incontinence) (Mineka & Zinbarg, 2006).

PURPOSES OF ASSESSMENT

The focus of this chapter is on assessment for the purpose of (a) diagnosis, (b) case conceptualization and treatment planning, and (c)  treatment monitoring and evaluation. Emphasis is given to multiple methodologies and domains, including clinician-​ administered interviews, self-​ report questionnaires, behavioral observations, and measures of peripheral physiological functioning. In addition, we include measures of the constructs relevant to the perpetuation of panic disorder and agoraphobia, including anxiety sensitivity, fear, catastrophic misappraisals of bodily sensations, and avoidance of not only agoraphobic situations but also bodily sensations. The methods of cognitive–​behavioral therapy (CBT) uniquely designed, and highly effective, for panic disorder and agoraphobia (Craske & Barlow, 2007) are derived from models emphasizing these constructs. Hence, changes in measures of these constructs are assumed to be critical indices of therapeutic outcomes. These measures are also relevant indices of the efficacy of pharmacological approaches to treatment, which comprise the other effective treatment option for panic disorder and agoraphobia (see Freire, Machado, Arias-​Carrión, & Nardi, 2014).

ASSESSMENT FOR DIAGNOSIS

As a part of the diagnostic process, medical evaluation is generally recommended to rule out several medical conditions for the diagnosis of panic disorder, including thyroid conditions, caffeine or amphetamine intoxication,

drug withdrawal, or pheochromocytoma (a rare adrenal gland tumor). Furthermore, certain medical conditions, such as mitral valve prolapse, asthma, allergies, and hypoglycemia, can exacerbate panic disorder because they produce sensations that overlap with panic attack symptoms (e.g., shortness of breath); however, these are not rule-​ outs, and panic disorder is likely to continue even when they are under medical control. In addition, for those reporting nocturnal panic attacks, a polysomnographic sleep assessment may be recommended to rule out other sleep-​related disorders, such as sleep apnea, night terrors, periodic movements, seizures, stage IV night terrors, nonrestorative sleep, sleep hallucinogenesis, and sleep paralysis, all of which are distinct from nocturnal panic (Craske & Tsao, 2005). Informally generated clinical diagnoses are rarely as reliable as diagnoses obtained from structured diagnostic interviews (e.g., Basco et al., 2000). Given that panic attacks are ubiquitous, differential diagnosis requires carefully structured questioning regarding the degree to which the panic attacks are a source of anxiety or a reason for behavioral changes (as would be characteristic of panic disorder and agoraphobia) or are part of another anxiety disorder. Hence, diagnostic assessment of both panic disorder and agoraphobia benefits from structured interviews. However, fully structured diagnostic interviews provide almost no opportunity for probing and may suffer from limited validity. Thus, preference is given to semi-​structured interviews that involve flexibility in questioning and clinical judgment. The two semi-​structured interviews used most often for the diagnosis of panic disorder and agoraphobia are the Anxiety Disorders Interview Schedule (ADIS [ADIS-​5]; Brown & Barlow, 2014a) and the Structured Clinical Interview for DSM Disorders (SCID [SCID-​ 5]; First, Williams, Karg, & Spitzer, 2016c). Ratings of the available psychometric properties for these two instruments are shown in Table 13.1. Given the recent DSM-​5 updates to each instrument, psychometric properties of previous versions are featured when data are unavailable for the current versions. Anxiety Disorders Interview Schedule Recently updated to reflect new DSM-​5 diagnostic criteria, the semi-​structured ADIS (ADIS-​5; Brown & Barlow, 2014a) is widely used for the assessment of anxiety, trauma, obsessive–​compulsive, mood, and other associated disorders. The ADIS-​5 is advantageous for differentiating among anxiety disorders as well as diagnosing comorbid mood disorders, which may impact treatment

TABLE 13.1 Ratings of Instruments Used for Diagnosis

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility |
|---|---|---|---|---|---|---|---|---|
| ADIS | NA | NA | G | Aᵃ | A | A | E | A |
| SCID | NA | NA | G | Aᵃ | A | A | E | A |
| PDSS | A | G | G | Aᵃ | A | A | E | A |

ᵃ Different raters.

Note: ADIS = Anxiety Disorders Interview Schedule; SCID = Structured Clinical Interview for the DSM; PDSS = Panic Disorder Severity Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

planning. Furthermore, the ADIS provides screening questions for additional conditions, including psychotic disorders, eating disorders, and impulse control disorders, and it assesses chronic and episodic life stress. Some of the interview questions require a “yes” or “no” response, whereas others involve ratings of fear, avoidance, and control on Likert scales. The panic disorder section of the ADIS-​5 assesses full and limited symptom panic attacks. The agoraphobia section includes a list of 25 situations organized by situation type as featured in DSM-​5 (e.g., open spaces) that are each rated in terms of fear and avoidance, as well as related questions such as probes for typical safety signals. Although later versions of the ADIS include a separate section to assess for nocturnal panic attacks, the ADIS-​5 does not assess for such. In addition to determining diagnostic status, the interviewer rates each diagnosed disorder on a 0-​to 8-​point rating to reflect overall levels of clinical severity (symptom intensity, distress, and impairment) associated with the disorder, with 4 representing the cut-​off for clinical severity (Grisham, Brown, & Campbell, 2004). Thus, the ADIS encourages diagnostic categorization as well as a dimensional approach to understanding sets of symptoms and differentiation between clinical and subclinical levels of anxiety. The ADIS is administered by a trained clinician. Training typically involves at least three observations of trained interviewers followed by achievement of acceptable inter-​rater reliability while being observed by a trained interviewer on at least three consecutive occasions (e.g., Brown, Barlow, & DiNardo, 1994). The full ADIS may take several hours to complete, although it can be shortened by excluding nondiagnostic research-​based questions. Given the modular structure of the ADIS, it is possible to limit its use to the panic disorder and agoraphobia sections, thereby reducing the time investment considerably. However, given the ubiquitous nature of panic attacks and the intricacies of differential diagnosis, completion of the entire ADIS interview is advised. Although this will require more time for the

clinician, we believe that an accurate diagnosis of panic disorder and agoraphobia depends on differential diagnosis and that this should not be compromised. Versions of the ADIS exist for DSM-​III, the DSM-​III-​ R, DSM-​IV, and currently the DSM-​5 (ADIS: DiNardo, O’Brien, Barlow, Waddell, & Blanchard, 1983; ADIS-​ R:  DiNardo & Barlow, 1988; ADIS-​ IV:  Brown et  al., 1994; ADIS-​ 5:  Brown & Barlow, 2014a, respectively). Each ADIS version provides the assessment of both current and lifetime psychopathology (e.g., ADIS-​5L; Brown & Barlow, 2014b). Based on past DSM-​IV diagnostic criteria, a child and parent version of the ADIS (ADIS-​C/​P; Silverman & Albano, 2004) is available for the assessment of anxiety and related disorders in children and adolescents (see Chapter 11, this volume). The following discussion includes references to various versions of the adult version of this interview. As of mid-​2016, psychometric data for the ADIS-​5 were not yet available. In their college sample, Brown and Deagle (1993) found good inter-​rater reliability for “panic classification” when the ADIS-​R (DiNardo & Barlow, 1988) was rated by two individuals (κ = .83). In Brown, DiNardo, Lehman, and Campbell’s (2001) study, inter-​rater reliability was good for both panic disorder (κ = .72) and panic disorder with agoraphobia (κ = .77) diagnoses. On the whole, the ADIS is judged to have good inter-​rater reliability. The test–​retest reliability of panic disorder diagnoses was generally good when lifetime diagnoses with and without agoraphobia were combined (ADIS-​IV; κ  =  .75 to .79) and good to very good for diagnoses of panic disorder (ADIS-​R; κ  =  .86) and agoraphobia (ADIS-​R; κ = .90) (Brown, DiNardo, et al., 2001; DiNardo, Moras, Barlow, Rapee, & Brown, 1993). However, reliability values were not consistent across levels of agoraphobia, and most often, less than adequate values were obtained for diagnoses of panic disorder without agoraphobia (ADIS-​ R and ADIS-​IV; κ = .39 to .72; Brown, DiNardo, et al., 2001; DiNardo et al., 1993). Last, Brown, DiNardo, et al.

obtained good ratings of Clinician Severity Rating (CSR) test–​retest reliability (ADIS-​IV; r  =  .83). Given the variability and short test–​retest time intervals (0–​44  days), the ADIS-​IV is judged to have only adequate test–​retest reliability. The ADIS modules were developed to assess the diagnostic criteria as stated in the DSM-​5 (APA, 2013). However, given that the contents of the interview were not reviewed by outside judges (T. A.  Brown, personal communication, August 17, 2016), the ADIS-​5 was rated as demonstrating only adequate content validity. Brown, Chorpita, and Barlow (1998) reported convergent and discriminant validity of the ADIS-​IV by showing that symptom measures of anxiety and depression differentially loaded on different higher order factors (e.g., autonomic arousal), making panic disorder distinguishable from other diagnoses. This led to the assignment of adequate construct validity. The ADIS has been used with various demographic groups and in a variety of settings, including managed care and pediatric primary care (Addis et al., 2004; Bowen, Chavira, Bailey, Stein, & Stein, 2008). Thus, it is rated as having excellent validity generalization. Although the ADIS is frequently used as a treatment outcome measure, it is rated as having demonstrated only adequate clinical utility because there is no evidence that the use of data obtained with this particular interview results in a better treatment outcome than that which would have occurred by using a different instrument.
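The inter-rater figures reported above are Cohen's kappa coefficients, which correct the raw agreement between two diagnosticians for the agreement expected by chance. As a minimal illustration of the statistic only (the ratings below are invented and are not data from any study cited in this chapter):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels
    (e.g., diagnoses) to the same set of cases."""
    n = len(rater_a)
    # Observed agreement: proportion of cases given identical labels.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of each rater's marginal proportions.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum((freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b))
    return (p_obs - p_chance) / (1 - p_chance)

# Two hypothetical clinicians diagnosing the same 10 cases.
rater_1 = ["PD", "PD", "none", "PD/AG", "none", "PD", "PD/AG", "none", "PD", "none"]
rater_2 = ["PD", "none", "none", "PD/AG", "none", "PD", "PD/AG", "none", "PD", "PD"]
print(round(cohens_kappa(rater_1, rater_2), 2))  # 0.69 for these made-up ratings
```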

Structured Clinical Interview for DSM Disorders

The SCID (SCID-5; First et al., 2016c) is administered by a clinician to assess common areas of psychopathology, including anxiety and related disorders. Thus, it facilitates differential diagnoses and assessment of comorbid conditions (e.g., mood disorders). In addition to the clinician version of the SCID (SCID-5-CV; First et al., 2016c), two research-oriented versions exist: the SCID-5-RV (Research Version; First, Williams, Karg, & Spitzer, 2016d) and the SCID-5-CT (Clinical Trials Version; First, Williams, Karg, & Spitzer, 2016b). A child version of the SCID, the Structured Clinical Interview for DSM-IV Childhood Diagnoses (KID-SCID; Hein et al., 1998), is available but has been rarely evaluated in research studies. Versions of the SCID are available in many languages, including English, Mandarin, Spanish, German, Dutch, and Korean (http://www.scid4.org/trans.html; Skodol, Bender, Rush, & Zarin, 2000). The focus of our discussion and ratings in Table 13.1 is on the SCID for adults.

The structure of the SCID includes a general probe question at the beginning of each disorder module, followed by other specific questions as deemed appropriate based on answers to the probe question. Although this interview format is similar to the ADIS, the inclusion of each diagnostic criterion next to relevant questions makes the SCID more transparent. For each question, the interviewer assesses how consistent the information is with the diagnostic criterion of interest and gives a rating of 1 (absent/false), 2 (subthreshold), or 3 (threshold/true). According to Spitzer, Williams, Gibbon, and First (1992), SCID training should include becoming familiar with the related SCID User's Guide (SCID-5-CV User's Guide: First, Williams, Karg, & Spitzer, 2016a), watching videotaped interviews, and achieving acceptable inter-rater and test–retest reliability.

As of October 2017, no psychometric data had been published for the SCID-5. Evidence regarding the inter-rater reliability of panic disorder using previous versions of the SCID is mixed (SCID-IV: First, Spitzer, Williams, & Gibbon, 1995; SCID-I: Spitzer & Williams, 1984). Kappa values range from .65 to 1.0 (e.g., Dammen, Arnesen, Ekeberg, Husebye, & Friis, 1999; Löwe et al., 2003; Zanarini & Frankenburg, 2001; Zanarini et al., 2000). There is some evidence for adequately reliable agoraphobia diagnoses (κ = .69; Zanarini & Frankenburg, 2001). Overall, the SCID is judged to have good inter-rater reliability because among the mixed data, there were several excellent values.

Most test–retest data for SCID diagnoses of panic disorder are slightly less than adequate to adequate. In a large study of patients and nonpatients (N = 592), Williams, Gibbon, and colleagues (1992) obtained test–retest κ values for panic disorder diagnoses, based on interviews conducted 1 to 14 days apart, ranging from .54 to .65. However, studies with smaller sample sizes obtained κ values ranging from .61 to .82, dependent on whether subtypes were examined (Williams, Spitzer, & Gibbon, 1992; Zanarini & Frankenburg, 2001; Zanarini et al., 2000). Data for agoraphobia diagnoses are mixed, with κ values ranging from .43 to 1.0 (Williams, Gibbon, et al., 1992; Williams, Spitzer, et al., 1992; Zanarini & Frankenburg, 2001). On the basis of the range of findings, a somewhat liberal rating of adequate test–retest reliability was assigned.

As with the ADIS, although the SCID is worded to address each of the DSM-5 diagnostic criteria, there is no evidence of its contents being evaluated by outside judges. Hence, it too demonstrates only adequate content validity. Kessler and colleagues (2005) found that anxiety disorder

diagnoses generated from the SCID-​I/​NP and the World Mental Health Survey Initiative Version of the World Health Organization Composite International Diagnostic Interview (WMH-​CIDI; Kessler & Üstün, 2004) “generally were in good concordance” (p. 594). Thus, the overall construct validity of the SCID is deemed adequate. Evidence of the SCID’s excellent validity generalization lies in the fact that it has been used in more than 1,000 studies (First & Gibbon, 2004)  and has been administered to coronary heart patients (Bankier, Januzzi, & Littman, 2004), individuals seeking community outpatient treatment (Zimmerman & Mattia, 2000), and primary care patients (e.g., Rodriguez et al., 2004). The SCID-​5 allows clinicians and assessors to follow closely the DSM-​5 criteria when making diagnoses. It is relatively inexpensive and does not require a scoring program. In addition, it may result in more valid diagnoses than those based on a standard clinical interview (Basco et al., 2000). However, because further research is needed on the usefulness of the SCID-​5 in assessing panic disorder and agoraphobia specifically, clinical utility is judged to be adequate. Panic Disorder Severity Scale Following completion of a diagnostic assessment, a dimensional assessment specifically designed for panic disorder, such as the Panic Disorder Severity Scale (PDSS; Shear et al., 1997), can be helpful. This clinician-​ completed scale rates seven areas using a 0 to 4 severity rating scale:  panic attack frequency, distress, anticipatory anxiety, agoraphobic and interoceptive-​related fears and avoidant behavior, and work and social impairment. Agoraphobic avoidance is assessed via one question and within the context of “fear of panic”; thus, the PDSS should not be used as a singular measure of agoraphobia. Administration of this instrument requires less than 15 minutes (Antony, 2002). Internal consistency is adequate to excellent (Cronbach’s α ranging from .71 to .92; e.g., Houck, Spiegel, Shear, & Rucci, 2002; Monkul et  al., 2004; Yamamoto et  al., 2004), with “adequate” limited to one study examining psychometrics of a Turkish translation (Monkul et al., 2004). Overall, scores on this measure are judged to possess good internal consistency. Different translations of the PDSS have been found to have adequate to excellent inter-​rater reliability (r = .79; intraclass correlation coefficient [ICC]  =  .87 to .99; Monkul et al., 2004; Shear et al., 1997; Yamamoto et al., 2004), resulting in an averaged rating of good inter-​rater

reliability. Its test–​retest data range from less than adequate to good (r = .63 to .71; ICC = .81 to .88) over short periods of time (Houck et al., 2002; Monkul et al., 2004; Shear et al., 1997, 2001), resulting in an overall rating of adequate test–​retest reliability. The PDSS has only adequate content validity because there is no evidence to indicate that independent judges reviewed this measure. Shear et al. (1997, 2001) found evidence for construct validity in that ADIS-​R panic disorder CSRs were strongly related to PDSS total scores (r = .55), patients with panic disorder with agoraphobia scored higher on the PDSS compared to patients with other anxiety or mood diagnoses, and PDSS scores correlated with various anxiety-​related questionnaires. However, overall, the construct validity was judged to be adequate rather than good, due to the lack of independently replicated validity findings. Validity generalization of the PDSS is judged to be good. This measure has been used in diverse sociodemographic samples. Japanese (Yamamoto et  al., 2004)  and Turkish versions of this measure (Monkul et  al., 2004)  exist. In addition, a self-​report version of this instrument (PDSS-​SR) possesses acceptable score reliability and promising validity (Wuyek, Antony, & McCabe, 2011). Although the PDSS and PDSS-​ SR assess the same panic symptoms, correlations between total score and panic attack frequency for these measures were found to only fall into the moderate range (Wuyek et al., 2011). Furthermore, although the PDSS may help clinicians assess different aspects of panic disorder, it has only adequate clinical utility because there is no research to show that the use of its data results in additional benefits beyond those seen when data are used from other instruments. Overall Evaluation Although semi-​ structured diagnostic interviews may be somewhat time-​consuming, the data they yield are helpful in making differential diagnoses, which is particularly important given the ubiquitous nature of panic attacks. If time does not permit to complete a full interview, the panic disorder and agoraphobia modules may be complemented by screener questions from the other anxiety disorder modules and/​or by self-​report questionnaires to gauge whether the use of the term “panic” is related to other disorders. Clinicians should also inquire about medical conditions and stimulant drug use. Last, further research is needed regarding the psychometric properties and comparative clinical utility of the ADIS-​5 and the SCID-​5.
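The PDSS described above reduces to very simple arithmetic: seven clinician ratings, each from 0 to 4. A minimal scoring sketch, assuming the common convention of summing the ratings into a 0-28 total; the domain labels are paraphrased from the areas listed earlier and are not the instrument's exact item wording:

```python
# Hypothetical PDSS-style ratings (each area rated 0-4 by the clinician).
pdss_ratings = {
    "panic_frequency": 2,
    "panic_distress": 3,
    "anticipatory_anxiety": 2,
    "agoraphobic_fear_avoidance": 1,
    "interoceptive_fear_avoidance": 2,
    "work_impairment": 1,
    "social_impairment": 2,
}

assert all(0 <= rating <= 4 for rating in pdss_ratings.values())
total = sum(pdss_ratings.values())  # 0-28 under this summing assumption
print(f"PDSS total: {total} / 28")
```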

TABLE 13.2 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility |
|---|---|---|---|---|---|---|---|---|
| ASI | G | G | NA | A | A | E | E | A |
| ASI-3 | G | G | NA | G | G | E | E | A |
| BSQ | G | G | NA | G | A | G | E | A |
| ACQ | G | G | NA | A | A | G | E | A |
| FQ | G | A | NA | G | A | G | E | A |
| MI | G | E | NA | A | A | A | E | A |
| APPQ | G | G | NA | A | A | A | E | A |

Highly Recommended: ✓ ✓ ✓ ✓

Note: ASI = Anxiety Sensitivity Index; ASI-3 = Anxiety Sensitivity Index-3; BSQ = Body Sensations Questionnaire; ACQ = Agoraphobic Cognitions Questionnaire; FQ = Fear Questionnaire; MI = Mobility Inventory for Agoraphobia; APPQ = Albany Panic and Phobia Questionnaire; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Development of a thorough case conceptualization to guide treatment planning for panic disorder and agoraphobia requires assessment of symptoms (including severity and distress), as well as fear of and beliefs about the symptoms, and avoidance of situations and activities. Whenever possible, a variety of assessment methodologies, including self-​report, in vivo, and physiological measures, is preferred. This section provides clinicians with a number of relevant instruments and methodologies. Self-​Report Instruments Self-​report measures are relatively inexpensive, require only a brief amount of time to complete, are often standardized, and allow for easy comparisons of effect sizes across treatment studies. However, self-​report instruments may result in an overestimation of panic attack frequency (e.g., Margraf, Taylor, Ehlers, Roth, & Agras, 1987) and physiological symptoms (e.g., Calvo & Eysenck, 1998). Nonetheless, self-​report instruments yield useful information and are likely to remain one of the primary methods of assessing panic disorder and agoraphobia. The following is not intended to be a comprehensive review of all available self-​report measures for panic and agoraphobia but, rather, covers the self-​report instruments that are most helpful for assessing thoughts, feelings, and behaviors as they relate to these disorders. The self-​report measures to be discussed are the Anxiety Sensitivity Index (ASI; Reiss, Peterson, Gursky, & McNally, 1986)  and Anxiety Sensitivity Index-​ 3 (ASI-​ 3; Taylor et  al., 2007), Body Sensations Questionnaire (BSQ; Chambless, Caputo,

Bright, & Gallagher, 1984), Agoraphobic Cognitions Questionnaire (ACQ; Chambless et  al., 1984), Fear Questionnaire (FQ; Marks & Mathews, 1979), Mobility Inventory (MI; Chambless et al., 1984), and Albany Panic and Phobia Questionnaire (APPQ; Rapee, Craske, & Barlow, 1994). Ratings for the psychometric properties of each instrument are shown in Table 13.2. Across measures, we were somewhat liberal in our ratings of the property of norms, such that if there were at least two available studies to cite and data from both clinical and nonclinical samples, the norms were rated as good. In addition, none of the reviewed measures received content validity or clinical utility ratings that were better than adequate. This is because there are no published data to indicate that any of the measures in their entirety were evaluated by independent judges and no published data to suggest that using results from these self-​report measures leads to clinical benefits above those gained by using data obtained from other instruments. Anxiety Sensitivity Index The ASI (Reiss et al., 1986) is a 16-​item self-​report measure that assesses beliefs surrounding the consequences of arousal-​related sensations. Zinbarg, Barlow, and Brown (1997) evaluated the factor structure to find an overall general factor representing level of sensitivity to anxiety, as well as three factors that measure physical concerns (e.g., “It scares me when my heart beats rapidly”), mental incapacitation concerns (e.g., “When I  am nervous, I worry that I might be mentally ill”), and social concerns (e.g., “Other people notice when I  feel shaky”). Other studies have found a taxonic latent class structure composed of two dimensions of anxiety sensitivity (Bernstein et al., 2007).
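Many of the reliability values reported for the ASI and the questionnaires that follow are internal consistency coefficients (Cronbach's α), which are computed directly from item-level responses. A small, instrument-agnostic sketch of that calculation; the response matrix is invented for illustration only:

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents' item-score lists:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(item_scores[0])  # number of items

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    item_vars = [variance([resp[i] for resp in item_scores]) for i in range(k)]
    total_var = variance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Invented 0-4 Likert responses: five respondents by four items.
responses = [
    [0, 1, 0, 1],
    [2, 2, 3, 2],
    [1, 1, 1, 2],
    [4, 3, 4, 4],
    [3, 3, 2, 3],
]
print(round(cronbach_alpha(responses), 2))
```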

Norms for the ASI exist for nonclinical and clinically anxious individuals (see Peterson & Reiss, 1992; Rapee et  al., 1992). Scores on this measure have been found to have good internal consistency (Cronbach’s α  =  .84 to .90) and adequate test–​retest reliability over a 2-​week period (r  =  .75; see Shear et  al., 2000). On the basis of the adjusted item-​to-​scale correlations and factor loadings, Blais et al. (2001) reanalyzed ASI data from three earlier studies with items 1, 5, 7, 8, and 13 removed. In comparison to the original ASI, the data produced by the 11-​item version related more specifically to panic disorder than to other psychiatric conditions and were highly correlated with data from the 16-​item version (r > .95). The 11-​item version’s two factors are “fears of somatic sensations of anxiety and fears of loss of mental control” (Blais et al., 2001, p. 273). There is evidence for excellent construct validity for the ASI, including convergent, criterion, construct, predictive, and discriminative validity. For example, as reviewed previously, longitudinal studies indicate that high scores on the ASI predict the onset of panic attacks and worry about panic. In addition, the ASI discriminates panic disorder from other anxiety disorders (e.g., Zinbarg & Barlow, 1996). Furthermore, treatment studies have shown that the partial pressure of carbon dioxide (pCO2) level partially mediated the effect of capnometry-​assisted respiratory training (CART) on fear of bodily sensations as measured by the ASI (Meuret, Rosenfield, Hofmann, Suvak, & Roth, 2009). Thus, construct validity for the ASI was judged to be excellent. In terms of validity generalization, this measure is available in a variety of languages (see Antony, 2002). It has been administered to samples of various ethnic backgrounds (e.g., Native Americans and Alaskan Natives:  Zvolensky, McNeil, Porter, & Stewart, 2001; Russians:  Kotov, Schmidt, Zvolensky, Vinogradov, & Antipova, 2005), although the ASI’s factor structure does not hold true for all populations (e.g., African American college students:  Carter, Miller, Sbrocco, Suchday, & Lewis, 1999). In addition, the ASI has been used with anxious patients in primary care settings (Craske et  al., 2005). Hence, the overall rating for validity generalization was excellent. Anxiety Sensitivity Index-​3 Developed from the ASI, the ASI-​3 (Taylor et al., 2007) is an 18-​item self-​report measure that also assesses anxiety sensitivity. Unlike the ASI, the ASI-​3 was designed to be a multidimensional measure to assess the three anxiety

sensitivity domains. Creators of the measure thoroughly considered content validity during item selection (for methods, see Taylor et al., 2007); thus, the ASI-​3 is rated as possessing good content validity. Norms for the ASI-​ 3 exist for both nonclinical and clinically anxious individuals, including average scores for individuals in partial hospitalization with a variety of psychological disorders (Rifkin, Beard, Hsu, Garner, & Björgvinsson, 2015; Taylor et al., 2007). Scores on this measure have demonstrated good internal consistency, outperforming the ASI social and cognitive concerns subscales in both clinical and nonclinical samples (Kemper, Lutz, Bähr, Rddel, & Hock, 2011; Taylor et  al., 2007). In addition, test–​retest reliability of the ASI-​3 scores has been found to range from acceptable to good (1 month: r = .76 [Ghisi et al., 2016]; 15–​ 30 days: r = .64 [Mantar, Yemez, & Alkin, 2010]). Based on the findings, we rated the overall test–​retest as good. The ASI-​3 is available in a variety of languages, including English, Italian, and Turkish, and has been administered in various populations (Mantar et  al., 2010; Petrocchi, Tenore, Couyoumdjian, & Gragnani, 2014; Taylor et al., 2007). Thus, validity generalization was judged as excellent. Based on aforementioned findings for the ASI, the ASI-​ 3 was also rated as possessing excellent construct validity. In terms of clinical utility, the ASI and ASI-​3 are inexpensive and take only 3 to 5 minutes to complete (Antony, 2002), and they can be used as a measure of treatment outcome (e.g., Craske et al., 2007). The ASI and ASI-​3 are rated as having adequate clinical utility. In summary, the ASI and ASI-​3 measure a construct recognized to be central to the onset and maintenance of panic disorder and agoraphobia, described previously, and are critical to the measurement of responsiveness to treatment. The ASI-​3 is recommended over the ASI given its superior psychometric properties. Body Sensations Questionnaire The BSQ (Chambless et al., 1984) assesses level of fear of somatic sensations (e.g., sweating, nausea, and dizziness) experienced during an anxious state. This measure, which can be completed within 5 to 10 minutes, consists of 18 items with ratings based on a Likert scale. The BSQ is available in a variety of languages, including English, Spanish, Portuguese, French, Greek, German, Swedish, Mandarin, and Dutch (see Antony, 2002). Norms for the BSQ exist for clinical as well as community samples (see Chambless et al., 1984; Chambless

& Gracely, 1989). Scores on this measure have been found to have good to excellent internal consistency (Cronbach’s α ranging from .84 to .95; Carlbring et  al., 2007; Chambless et  al., 1984; Novy, Stanley, Averill, & Daza, 2001), with test–​retest reliability ranges from below adequate to good (ranging from r  =  .67 over a median of 31 days to a corrected r = .89 over a 3-​month period; Arrindell, 1993b; Carlbring et al., 2007; Chambless et al., 1984). Thus, an overall rating of good was assigned for both test–​retest reliability and internal consistency. The items for the BSQ were developed from discussions and sessions with clients and therapists (Chambless et  al., 1984), suggestive of adequate content validity. Evidence for good construct validity derives from correlated scores between the BSQ and the ACQ (see later) and other self-​report measures of anxiety and also from the finding that individuals with panic disorder, other anxiety disorders, and no anxiety disorders score differently on the BSQ (see Arrindell, 1993a; Chambless, Beck, Gracely, & Grisham, 2000). In addition, Smits, Powers, Cho, and Telch (2004) found that BSQ change scores were partial mediators of the effects of group CBT on levels of anxiety, agoraphobia, and frequency of panic attacks. The BSQ has been used with people of different sociodemographic backgrounds and in different settings including primary care (van Boeijen et  al., 2005)  and Internet-​ based interventions (Carlbring et  al., 2006), and thus it was rated as having excellent validity generalization. According to Chambless et al. (1984), the BSQ helps clinicians focus on the particular sensations that are of most concern to clients. Furthermore, BSQ responses may be useful in the development of individualized behavioral approach tests (BATs; discussed later) and targeted interoceptive exposures (e.g., Craske, Rowe, Lewin, & Noriega-​Dimitri, 1997). Thus, the BSQ has adequate clinical utility. Agoraphobic Cognitions Questionnaire The ACQ (Chambless et al., 1984) assesses the frequency of particular thoughts while the respondent is in an anxious state. This measure consists of 15 items with ratings based on a Likert scale, and it can be completed within 5 to 10 minutes (Antony, 2002). The ACQ generates an overall mean score of the first 14 items, a mean “physical concerns” subscale score, and a mean “loss of control” subscale score. The ACQ is available in a variety of languages, including English, Spanish, Portuguese, French, Greek, German, Swedish, Mandarin, and Dutch (see Antony, 2002).
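The ACQ scoring just described is a set of simple means over Likert ratings. A minimal sketch follows; because the chapter does not list which of the first 14 items belong to each subscale, the item split is passed in explicitly and the numbers used in the example are placeholders, not the published scoring key:

```python
def score_acq(responses, physical_items, loss_of_control_items):
    """Score a 15-item ACQ-style response vector (index 0 = item 1):
    overall mean of the first 14 items plus the two subscale means."""
    overall = sum(responses[:14]) / 14

    def subscale_mean(item_numbers):  # item_numbers are 1-based
        return sum(responses[i - 1] for i in item_numbers) / len(item_numbers)

    return {
        "overall": round(overall, 2),
        "physical_concerns": round(subscale_mean(physical_items), 2),
        "loss_of_control": round(subscale_mean(loss_of_control_items), 2),
    }

# Invented Likert responses for the 15 items, with a placeholder item split.
example = [2, 3, 1, 4, 2, 2, 5, 1, 3, 2, 4, 1, 2, 3, 1]
print(score_acq(example,
                physical_items=[1, 2, 3, 4, 5, 6, 7],                # placeholder split
                loss_of_control_items=[8, 9, 10, 11, 12, 13, 14]))   # placeholder split
```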

Norms for this questionnaire exist for clinical as well as community samples (see Chambless et  al., 1984). In addition, scores on this measure have been found to have good internal consistency (Cronbach’s α ranging from .80 to .87) and adequate test–​retest reliability (r = .86 to .92, ranging from a nonspecified time period to a 3-​month period; see Arrindell, 1993b; Carlbring et al., 2007). Items on the ACQ were decided upon based on inputs received from clients and therapists (Chambless et  al., 1984), suggestive of adequate content validity. The construct validity of the ACQ appears to be good given evidence for correlations with the BSQ and other self-​report measures of anxiety, as well as the finding that individuals with panic disorder, other anxiety disorders, and no anxiety disorders score differently on the ACQ (see Arrindell, 1993a; Chambless et al., 2000). Bouvard and colleagues (1998) found evidence of internal consistency and validity for the French version of the ACQ. The ACQ was rated as exhibiting excellent validity generalization due to its use with people of different language backgrounds and different settings (e.g., van Boeijen et al., 2005). The adequate clinical utility of the ACQ lies in its identification of anxious thoughts to be targeted through cognitive restructuring and exposures to feared situations and bodily sensations. Fear Questionnaire (Agoraphobia Subscale) The FQ (Marks & Mathews, 1979) assesses phobic severity and distress, as well as related symptoms of anxiety and depression. For current purposes, the discussion focuses on the agoraphobia subscale. It consists of five situational items for which level of avoidance is rated on a Likert scale. Less than 10 minutes is required to complete the entire 20-​item measure. It is available in a variety of languages, including English, Dutch, French, German, Italian, Catalan, Chinese, and Spanish (Roemer, 2002). Norms for this measure exist for a clinical sample (Cox, Swinson, & Shaw, 1991), as well as for a normative sample (Gillis, Haaga, & Ford, 1995). Although scores on the agoraphobia subscale had less than adequate internal consistency values in one study (Cronbach’s α ranging from .59 to .69; Cox et al., 1991), other studies with clinical and nonclinical samples, including a Spanish/​English bilingual sample, indicate adequate to good internal consistency (Cronbach’s α ranging from .76 to .84; e.g., Cox, Swinson, Parker, Kuch, & Reichman, 1993; Novy et al., 2001). Given the variable data, an overall internal consistency rating of adequate, rather than good, was assigned. Test–​ retest reliability of the FQ agoraphobia subscale

scores has been assessed over different time delays ranging from 1 to 16 weeks (Cronbach’s α = .85 to .89; Marks & Mathews, 1979; Michelson & Mavissakalian, 1983), although with small samples, resulting in an overall assignment of good test–​retest reliability. The items on the FQ were determined through multiple factor analyses (Marks & Mathews, 1979) and appear to represent the constructs of interest, indicative of adequate content validity. The construct validity of the FQ is judged to be good based on correlations with other self-​ report measures of anxiety, as well as the finding that FQ agoraphobia subscale scores are higher in individuals with panic disorder with agoraphobia in comparison to other individuals (see Cox et al., 1991; Oei, Moylan, & Evans, 1991). The FQ has been used with people of different ethnicities, including Spanish-​speaking anxious individuals (Novy et al., 2001) and Chinese college students (Lee & Oei, 1994), as well as with primary care and community mental health patients (Craske et al., 2005; Wade, Treat, & Stuart, 1998). Given the various populations and settings in which the FQ has been studied, it is judged to have excellent validity generalization. The FQ is rated as having adequate clinical utility. The greatest utility of the FQ is that it is a brief index of level of agoraphobic avoidance, to be compared against established norms. Mobility Inventory for Agoraphobia The MI (Chambless, Caputo, Jasin, Gracely, & Williams, 1985) assesses degree of avoidance due to agoraphobia, as well as the frequency and severity of panic attacks. The MI has undergone some changes since its original development (see Antony, 2002). The first part of the MI lists 27 agoraphobic-​ like situations (including one write-​ in response). Using a Likert scale, two avoidance ratings are given to each situation, one when accompanied and one when alone. Separate mean values (for accompanied and alone) are calculated for the first 26 items. The respondent also circles the 5 situations that are most impairing. Three questions, rated on a Likert scale, assess the frequency and severity of panic attacks. Last, the respondent is asked about his or her safety zone, including its size and location. This questionnaire can be completed in less than 20 minutes (Chambless et al., 1985) and is available in English, Spanish, Portuguese, French, Swedish, Dutch, German, and Greek (Antony, 2002). Norms exist for clinical as well as normal samples (see Chambless et al., 1985), and the MI has been completed by an elderly community sample (Hendriks et al., 2010). Because item development was informed by exposure

session observations and information obtained through client interviews, as well as by items on a measure of fear (Chambless et  al., 2011), the MI has adequate content validity. In addition, scores on this measure have been found to have excellent internal consistency (for a review, see Chambless et al., 2011) and adequate test–​retest reliability (r  =  .75 to .90 over a 31-​day period; Chambless et al., 1985). Furthermore, the MI demonstrates adequate test–​retest reliability over longer intervals of time (e.g., 5 years; Chambless et al., 2011). The MI has convergent validity, as evidenced by correlations with other self-​report measures of anxiety (e.g., Ehlers, 1995); however, more research is required on discriminant validity of the MI. Thus, the construct validity of the MI was rated as adequate. The MI is available in different languages and has been administered in various settings (e.g., Kenardy et al., 2003). Thus, it possesses excellent validity generalization. This measure has at least adequate clinical utility due to its usefulness in generating a list of agoraphobic situations to be targeted during in vivo exposures (Chambless et al., 1985). Albany Panic and Phobia Questionnaire The APPQ (Rapee et al., 1994) is a 27-​item questionnaire that measures degree of fear imagined in a given situation (e.g., “exercising vigorously alone”). Each item is rated from 0 (no fear) to 8 (extreme fear). Scores are calculated separately for agoraphobia, social phobia, and interoceptive subscales. Descriptive statistics are available on the APPQ for various anxiety disorders, as well as for a small nonclinical sample (Rapee et al., 1994). In addition, norms are available for clinical populations (Brown, White, & Barlow, 2005). The APPQ items were tested in three pilot studies (Rapee et al., 1994), but the measure was not judged independently by others, and hence it was assigned adequate content validity. The internal consistency of its subscale scores ranges from good to excellent in English and Spanish versions (Cronbach’s α ranges from .85 to .92; Brown, White, & Barlow, 2005; Novy et al., 2001; Rapee et al., 1994). Overall, its internal consistency appears to be good. There is some evidence of scores on this measure displaying adequate test–​retest reliability (r ranging from .68 to .84, mean period of 10.9 weeks; Rapee et al., 1994). Several studies have found evidence for this measure’s construct validity (Brown, White, & Barlow, 2005; Novy et  al., 2001; Rapee et  al., 1994), including correlations with other self-​report measures of anxiety and evidence that the subscale scores differ between groups. Examples

include different interoceptive subscale scores for panic disorder groups varying in levels of avoidance, as well as differences in agoraphobia subscale scores between a panic disorder group with varying levels of avoidance and three comparison groups (social phobia, other anxiety, and control; Rapee et al., 1994). Overall, a rating of adequate construct validity was assigned. There is evidence for the use of the APPQ with various demographic groups and in different languages (e.g., Kim et  al., 2004; Novy et  al., 2001). The APPQ has been administered to a variety of groups across different clinical settings (e.g., Gonzalez, Zvolensky, Grover, & Parent, 2012); thus, validity generalization was judged to be excellent. Notably, this scale assesses fears of activities that produce bodily sensations (e.g., exercising vigorously or drinking coffee), a type of fear that is central to panic disorder and is a target of treatment. The instrument is useful for generating a list of feared activities and is therefore beneficial to treatment planning as well. At this time, the APPQ is judged to have adequate clinical utility. Behavioral Approach Tests Behavioral approach tests (also referred to as “behavioral avoidance tests”) assess the degree of behavioral approach (or avoidance) of specific situations or internal stimuli (i.e., bodily sensations). Although degree of avoidance can be reported upon during diagnostic interviews or with self-​ report questionnaires, the BAT provides another modality of assessment that is not constrained by the biases of retrospective judgment that limit verbal reporting. The BAT can be standardized across patients or individually tailored for a patient. The standardized BAT for agoraphobia usually involves walking or driving a particular route, such as a 1-​mile loop around the clinic setting (for examples, see Williams & Zane, 1989). Standardized interoceptive BATs involve exercises that induce panic-​ like symptoms, such as spinning in a circle, running in place, hyperventilating, and breathing through a straw (Barlow & Craske, 2006). The disadvantage of standardized BATs is that the specific task may not be relevant to everyone with panic disorder and agoraphobia. Individually tailored BATs usually assess approach/​avoidance of three to five individualized situations or physical exercises designed to be moderately to extremely anxiety-​provoking for a given patient. Individually tailored BATs are more informative for clinical practice, although they confound between-​ participant comparisons for research purposes.

With both individualized and standardized BATs, anxiety levels are rated at regular intervals throughout, and actual distance or length of time is measured. Ongoing anxiety typically is measured using the Subjective Units of Distress Scale (SUDS; Wolpe, 1968), a verbal or written rating of anxiety, ranging from 0 (no anxiety) to 100 (extreme anxiety). Some prefer to use a smaller range than the original SUDS rating system (e.g., a 9-​point scale; Mavissakalian & Michelson, 1982). Standardized and individually tailored BATs are each susceptible to demand biases, for distress before treatment and improvement after treatment (Borkovec, Weerts, & Bernstein, 1977). On the other hand, BATs are an important supplement to self-​report of agoraphobic avoidance because patients tend to underestimate what they can actually achieve (Craske et al., 1988). In addition, BATs often reveal information of which the individual is not fully aware but that is important for treatment planning. For example, safety signals and safety behaviors, which alleviate distress in the short term but sustain anxiety in the long term, may not be acknowledged until in the act of attempting to approach a situation or a bodily sensation that had been previously avoided. Typical safety signals include other people and medication bottles. The removal of safety signals and safety behaviors is critical to effective exposure therapy (e.g., Salkovskis, 1991). Physiological Measures Advancements in technology have given rise to practical options for clinicians to assess ongoing physiological responses. Most fitness watches and wearables feature ongoing heart rate monitoring, with more recent models allowing for blood pressure measurement. Ongoing monitoring of heart rate and blood pressure during BATs, for example, may illuminate discrepancies between reports of symptoms and actual physiological arousal (i.e., report of heart rate acceleration in the absence of actual heart rate acceleration), which can serve as a therapeutic demonstration of the role of attention and cognition in symptom production. Similarly, physiological data can disconfirm misappraisals such as “my heart feels like its beating so fast that it will explode.” Another option is to record basal peripheral physiology over protracted periods of time, such as 24-​hour ambulatory recordings as individuals engage in their normal daily routine. However, the clinical value of such data is not clear, especially because the results are inconsistent (e.g., Bystritsky, Craske, Maidenberg, Vapnik, & Shapiro, 1995; Shear et  al., 1992). Nonetheless, the finding that

CBT effects are not limited to self-​reported symptoms but extend to shifts in basal levels of physiology (Craske et al., 2002) is useful information for the clinician. Measures of In Vivo Cognition In contrast to strong endorsements of perceived dangers on self-​report questionnaires in anticipation of feared situations, Williams, Kinney, Harap, and Liebmann (1997) reported a general absence of danger appraisals as patients with panic disorder and other phobias confronted their most feared driving and claustrophobic situations. That is, patients reported very little anticipation of panic or the situation itself while confronting those situations. Instead, their verbal reports pertained mostly to perceived inability to cope. This suggests that self-​report questionnaires tap a different construct than in vivo measures of cognition. Conceivably, endorsements on self-​report questionnaires, when not faced with an agoraphobic situation, represent anticipatory anxiety, whereas reports during in vivo exposures represent fear responding. There is reason to believe that cognitive functions differ between these two states. That is, whereas anxiety is associated with improved attentional selectivity for threat-​relevant stimuli, processing of threatening information may be impeded at the height of intense fear (e.g., Watts, Trezise, & Sharrock, 1986). In addition, Goldsmith (1994) noted that cognitions associated with emotions (e.g., fear) are relatively elementary or automatic, in contrast to the more complex cognitive processing of mood states (e.g., anxiety). Thus, a more comprehensive assessment of experience when faced with agoraphobic situations or feared bodily sensations would entail behavioral observations, anxiety ratings, physiological measurements, and cognitions in the moment. There is no specific instrument to recommend, however, other than instructing individuals to verbalize their thoughts throughout the BAT. Overall Evaluation Several self-​report instruments are helpful in the assessment of subjective state (i.e., ASI or ASI-​3, BSQ, ACQ, and FQ). SUDS ratings and assessment of cognitions and physiology during BATs can generate a more thorough understanding of the individual’s subjective experience and what to target during treatment. A  clear case conceptualization necessary for individualizing treatment for panic disorder and agoraphobia warrants gathering information across all of these domains using multiple methodologies.
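As a concrete illustration of pulling these strands together for a single patient, the sketch below records one BAT as a sequence of SUDS ratings plus the amount of the planned task completed, and then summarizes it. The data structure and field names are hypothetical conveniences, not part of any published protocol:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BATRecord:
    """One behavioral approach test: SUDS ratings (0-100) taken at regular
    intervals, plus how much of the planned task was completed."""
    task: str
    suds_ratings: List[int] = field(default_factory=list)
    steps_completed: int = 0
    steps_planned: int = 1

    def summary(self):
        return {
            "task": self.task,
            "peak_suds": max(self.suds_ratings),
            "mean_suds": sum(self.suds_ratings) / len(self.suds_ratings),
            "proportion_completed": self.steps_completed / self.steps_planned,
        }

# Hypothetical individually tailored BAT: walking a mall route alone.
bat = BATRecord(task="walk mall loop alone",
                suds_ratings=[40, 55, 70, 65, 50],
                steps_completed=4, steps_planned=5)
print(bat.summary())
```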

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

Similar to case conceptualization and treatment planning, assessment for treatment monitoring and treatment outcome should include measures of symptom-​related misappraisals, fear, and avoidance, as well as cognitions related to control and self-​efficacy. Although the evidence to date remains sparse and inconsistent, there is some evidence that measures of catastrophic thinking about panic attacks predict treatment outcome (e.g., Margraf & Schneider, 1991)  as well as follow-​up status (Clark et al., 1994) for patients with panic disorder with agoraphobia. In addition, there is other evidence to suggest that measures of self-​efficacy and perceived control may be more relevant and informative (Fentz et al., 2013; Williams & Falbo, 1996). In multiple independent studies, greater agoraphobic avoidance has also been found to be predictive of worse outcomes for CBT in individuals with panic disorder and agoraphobia (for a review, see Porter & Chambless, 2015). In addition to assessing multiple constructs, it is also important that various methodologies be used in treatment monitoring and treatment outcome assessment. Our discussion includes interview, self-​ report, self-​ monitoring, and behavioral assessment methodologies. Interviews ADIS In addition to the diagnostic value of the ADIS (Brown & Barlow, 2014a), there is ample evidence that CSRs for the diagnoses of panic disorder and agoraphobia are sensitive to change following CBT and acceptance and commitment therapy (ACT) (ADIS-​IV; Arch et al., 2012; Craske et al., 2007). Given that the ADIS is not restricted to only one type of intervention modality, the ADIS is rated as possessing overall excellent treatment sensitivity (Table 13.3). SCID The SCID (First et  al., 2016c) is most often used as a pretreatment diagnostic instrument rather than as an outcome measure. Because we were able to locate only one study in which pre-​to post-​treatment change was measured through the SCID (Carter, Sbrocco, Gore, Marin, & Lewis, 2003), treatment sensitivity was judged to be adequate.



TABLE 13.3 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
ADIS    | NA | NA | G  | A^a | A | A | E | E | A |
SCID    | NA | NA | G  | A^a | A | A | E | A | A |
PDSS    | A  | G  | G  | A^a | A | A | E | E | A |
ACQ-CON | G  | G  | NA | A   | A | E | E | E | A | ✓
ASI     | G  | G  | NA | A   | A | E | E | E | A |
ASI-3   | G  | G  | NA | G   | G | E | E | E | A | ✓
BSQ     | G  | G  | NA | G   | A | G | E | E | A | ✓
ACQ     | G  | G  | NA | A   | A | G | E | E | A | ✓
FQ      | G  | A  | NA | G   | A | G | E | E | A | ✓
MI      | G  | E  | NA | A   | A | A | E | E | A |
APPQ    | G  | G  | NA | A   | A | A | E | E | A |

^a Different raters.

Note: ADIS = Anxiety Disorders Interview Schedule; SCID = Structured Clinical Interview for the DSM; PDSS = Panic Disorder Severity Scale; ACQ-CON = Anxiety Control Questionnaire; ASI = Anxiety Sensitivity Index; ASI-3 = Anxiety Sensitivity Index-3; BSQ = Body Sensations Questionnaire; ACQ = Agoraphobic Cognitions Questionnaire; FQ = Fear Questionnaire; MI = Mobility Inventory for Agoraphobia; APPQ = Albany Panic and Phobia Questionnaire; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

PDSS

The PDSS (Shear et al., 1997) has been found to be sensitive to change following treatment (Kiropoulos et al., 2008). This instrument is judged to have excellent treatment sensitivity due to its ability to detect change in panic disorder and agoraphobia following various types of pharmacological and psychotherapeutic interventions.

Self-Report Instruments

Anxiety Sensitivity Index and Anxiety Sensitivity Index-3

Administration of the ASI (Reiss et al., 1986) before, during, and after treatment maps the degree to which beliefs about physical symptoms of anxiety change over the course of treatment. The ASI is rated as having excellent treatment sensitivity due to the empirical evidence showing change on this instrument following group and individual CBT (e.g., Arch & Ayers, 2013; Craske et al., 2007), mindfulness-based stress reduction (Arch & Ayers, 2013), ACT (Arch et al., 2012), and pharmacological treatments (e.g., Simon et al., 2004). Furthermore, the recently developed ASI-3 has shown excellent treatment sensitivity to brief mindfulness-based (Tanay, Lotan, & Bernstein, 2012) and computerized cognitive anxiety sensitivity interventions (Schmidt, Capron, Raines, & Allan, 2014). Given that the ASI-3 possesses stronger psychometric properties and is multidimensional, this version should be given preference over the ASI.

Anxiety Control Questionnaire

The Anxiety Control Questionnaire (ACQ-CON; Rapee, Craske, Brown, & Barlow, 1996), not to be confused with the Agoraphobic Cognitions Questionnaire, is a 30-item self-report instrument that assesses perceived ability to control external events and internal emotional responses. Brown, White, Forsyth, and Barlow (2005) found evidence of three factors (emotion control, stress control, and threat control), and from their item analysis, they developed a revised version (known as the ACQ-R) composed of only 15 of the original 30 items. ACQ-CON total score norms and ACQ-R subscale norms exist for anxious and nonclinical samples (Brown, White, et al., 2005; Rapee et al., 1996). The internal consistency of the ACQ-CON scores is generally good (ranging from .81 to .89), although some data on the subscales were in the less than adequate to adequate range (.65 to .74; see Ballash, Pemble, Usui, Buckley, & Woodruff-Borden, 2006; Brown, White, et al., 2005; Craske et al., 2007; Lang & McNiel, 2006; Rapee et al., 1996; Zebb & Moore, 1999). Adequate ACQ-CON test–retest reliability data exist based on 1-week to 1-month periods of time (r ranging from .82 to .88; Rapee et al., 1996). Although ACQ-CON data from an anxious sample were used in a factor analysis (Rapee et al., 1996), there is no information to suggest that this sample evaluated the items or the measure's instructions, and thus content validity is only adequate. There is evidence of the ACQ-CON's construct validity (Lang & McNiel, 2006; Rapee et al., 1996), including data to support its incremental validity in predicting interpretation biases associated with ambiguous internal and external phenomena (Zvolensky et al., 2001), prediction of a latent factor of anxious arousal (Brown, White, et al., 2005), and its relatedness to trait anxiety (Kashdan, Barrios, Forsyth, & Steger, 2006). Moreover, the threat control factor has been found to be a moderator of the relationship between anxiety sensitivity and agoraphobia (White et al., 2006) and a mediator of the relationship between some aspects of family functioning and anxiety in a nonclinical sample (Ballash et al., 2006). Thus, the construct validity of the ACQ-CON was judged to be excellent. The ACQ-CON (or ACQ-R) has been used with various samples, including outpatient clinical samples (e.g., White et al., 2006), inpatient clinical samples (Lang & McNiel, 2006), and nonclinical samples (e.g., Kashdan et al., 2006). As a result of its use in more than one setting and with different samples, this measure was given a rating of excellent validity generalization. The ACQ-R has recently been translated to Spanish, but data regarding its clinical utility have yet to be published (Osma, Barrada, García-Palacios, Navarro-Haro, & Aguilar, 2016). ACQ-CON scores (Rapee et al., 1996) have been sensitive to change after CBT for panic disorder and agoraphobia, and they are able to predict the severity of comorbid diagnoses at follow-up assessment of CBT (Craske et al., 2007). In addition, ACQ-R scores have demonstrated sensitivity to change in acceptance-based behavioral therapy for individuals diagnosed with generalized anxiety disorder (Treanor, Erisman, Salters-Pedneault, Roemer, & Orsillo, 2011). As a result of these findings, treatment sensitivity was judged to be excellent.
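The internal consistency values reported for the ACQ-CON (and the reliability entries in Table 13.3 more generally) are coefficient alpha estimates. As a reminder of what those coefficients summarize, here is a minimal sketch of Cronbach's alpha computed from an item-by-respondent matrix; the response data are invented purely for illustration and are not drawn from any of the studies cited above.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array, rows = respondents, columns = questionnaire items."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented responses: 6 respondents x 4 items on a 1-5 scale.
responses = np.array([
    [4, 4, 5, 4],
    [2, 3, 2, 2],
    [5, 4, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
    [4, 5, 4, 4],
])
print(round(cronbach_alpha(responses), 2))
```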

Body Sensations Questionnaire

BSQ scores are sensitive to change following short-term, intensive exposure treatment for agoraphobia (Chambless et al., 1984), group CBT plus exercise treatment for panic disorder with agoraphobia (Cromarty, Robinson, Callcott, & Freeston, 2004), and individual CBT for panic disorder with agoraphobia (Carlbring et al., 2007). Furthermore, the BSQ has demonstrated change sensitivity following ACT in individuals with primary panic disorder and/or agoraphobia previously deemed treatment resistant (Gloster et al., 2015). Thus, the BSQ is judged to have excellent treatment sensitivity.

Agoraphobic Cognitions Questionnaire

ACQ scores are sensitive to change following short-term, intensive exposure treatment for agoraphobia (Chambless et al., 1984), group CBT plus exercise treatment for panic disorder with agoraphobia (Cromarty et al., 2004), and individual CBT and ACT for panic disorder and agoraphobia (Carlbring et al., 2005, 2007; Gloster et al., 2015). Thus, the ACQ is rated as having excellent treatment sensitivity.

Fear Questionnaire

The FQ has been used as a treatment outcome measure in more than 50 studies, and a meta-analysis of 56 treatment groups revealed a mean effect size of d = 1.93 (Ogles, Lambert, Weight, & Payne, 1990). Thus, the FQ is rated as exhibiting excellent treatment sensitivity.

Mobility Inventory

MI scores change following exposure therapy (Chambless et al., 1985; Ehlers, 1995) and individual CBT and ACT for panic disorder and agoraphobia (e.g., Carlbring et al., 2005; Gloster et al., 2015). Thus, the MI is rated as having excellent treatment sensitivity.

Albany Panic and Phobia Questionnaire

Rapee et al. (1994) provided some evidence of this measure's treatment sensitivity following a course of CBT. The APPQ has also demonstrated change following a cognitive–behavioral treatment for individuals with panic disorder and moderate to severe agoraphobia (Sensation-Focused Intensive Treatment; Bitran, Morissette, Spiegel, & Barlow, 2008). Furthermore, the APPQ has demonstrated sensitivity to change following mindfulness-based cognitive therapy delivered as an adjunct to pharmacotherapy for individuals with panic disorder (Kim et al., 2010). As a result, the APPQ is judged to have excellent treatment sensitivity.

Self-Efficacy to Control a Panic Attack Questionnaire

In anxiety-provoking situations, 15% of the reported thoughts from a sample with agoraphobia were about self-efficacy (Williams et al., 1997). Moreover, research suggests that these judgments help predict behaviors that are often the target of treatment (e.g., Kinney & Williams, 1988) and mediate the effects of treatment for panic disorder with agoraphobia upon approach behaviors (Williams, Kinney, & Falbo, 1989) and panic severity (Casey, Newcombe, & Oei, 2005). One self-report measure of self-efficacy that is directly relevant to panic disorder is the Self-Efficacy to Control a Panic Attack Questionnaire (SE-CPAQ; Gauthier, Bouchard, Cote, Laberge, & French, 1993). This measure is well suited to be administered along with the other measures discussed in this chapter. It is a 25-item measure, consisting of the Self-Efficacy–Cognitions (6 items), Self-Efficacy–Mobility (10 items), and Self-Efficacy–Symptoms (9 items) subscales. Each item is assigned a rating indicative of the respondent's confidence in controlling panic attacks in a given situation (i.e., when experiencing a particular thought in a particular location or having a particular physiological sensation). The thoughts, locations, and sensations used in the SE-CPAQ were adapted from the ACQ (Chambless et al., 1984), the MI (Chambless et al., 1985), and the BSQ (Chambless et al., 1984). There is some evidence of its validity (Gauthier et al., 1993) and sensitivity to treatment-related changes (Bouchard et al., 1996). However, there is only limited research on its psychometric properties.
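Given the subscale structure just described (6 cognition, 10 mobility, and 9 symptom items), subscale scoring reduces to averaging the confidence ratings within each item group. The sketch below is a hypothetical illustration of that arithmetic; the assumed 0–100 confidence scale and the variable names are ours, and the published scoring instructions for the SE-CPAQ should be consulted in practice.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical item groupings mirroring the SE-CPAQ subscale sizes described above.
SUBSCALE_SIZES = {"cognitions": 6, "mobility": 10, "symptoms": 9}

def score_se_cpaq(responses: Dict[str, List[float]]) -> Dict[str, float]:
    """Return the mean confidence rating (assumed 0-100 scale) per subscale."""
    scores = {}
    for subscale, expected_n in SUBSCALE_SIZES.items():
        ratings = responses[subscale]
        if len(ratings) != expected_n:
            raise ValueError(f"{subscale}: expected {expected_n} ratings, got {len(ratings)}")
        scores[subscale] = mean(ratings)
    return scores

# Invented example respondent.
example = {
    "cognitions": [40, 50, 30, 60, 55, 45],
    "mobility": [70, 65, 80, 60, 75, 50, 55, 60, 70, 65],
    "symptoms": [30, 35, 40, 25, 45, 50, 30, 35, 40],
}
print(score_se_cpaq(example))
```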

Self-Monitoring

Ongoing self-monitoring is yet another modality of assessment, albeit one that overlaps with other verbal report measures (i.e., diagnostic instruments and self-report questionnaires). To self-monitor panic attacks, patients typically are given portable paper forms, hand-held devices, or mobile applications with which to record every occurrence of panic (i.e., frequency recording) in terms of time of onset and offset, intensity, symptoms, and location, among other things (e.g., Barlow, Craske, Cerny, & Klosko, 1989). Most commonly, self-monitoring continues over consecutive days for 1 or 2 weeks, especially when used to evaluate pre- to post-treatment changes (e.g., Craske et al., 1997). Providing detailed instructions to patients can enhance the consistency and accuracy of data collection. This includes training in the use of rating scales and providing definitions of what constitutes a panic attack (although patients' self-perceptions of panic may be important in their own right; Basoglu, Marks, & Sengun, 1992). Self-monitoring of agoraphobic avoidance entails monitoring excursions from home, recording the time of beginning and end, whether alone or accompanied, maximum anxiety, destination or purpose, escape behavior, distance traveled, and so on (e.g., Murphy, Michelson, Marchione, Marchione, & Testa, 1998). This format of self-monitoring may be cumbersome for the mildly agoraphobic person who is relatively mobile.
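The panic and excursion records described above translate naturally into a structured diary, whether kept on a paper form or in a mobile application. The sketch below shows one hypothetical record layout using the fields named in this section; it is an illustration of the general approach, not a rendering of any specific published monitoring form.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class PanicRecord:
    """One self-monitored panic attack (frequency recording)."""
    onset: datetime
    offset: datetime
    intensity: int            # e.g., 0-10 rating (assumed convention)
    symptoms: List[str]
    location: str

@dataclass
class ExcursionRecord:
    """One self-monitored excursion from home (agoraphobic avoidance)."""
    left_home: datetime
    returned_home: datetime
    accompanied: bool
    maximum_anxiety: int      # e.g., 0-10 rating (assumed convention)
    destination: str
    escape_behavior: bool
    distance_km: float

# A week or two of monitoring is simply a list of such records.
week: List[PanicRecord] = [
    PanicRecord(datetime(2018, 3, 5, 14, 10), datetime(2018, 3, 5, 14, 25),
                intensity=7, symptoms=["palpitations", "dizziness"], location="supermarket"),
]
print(len(week))  # panic frequency for the monitored interval
```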

General anxiety and accompanying moods also can be self-monitored (Barlow et al., 1989). Mobile devices offer a particular advantage in that they may preclude the delay in self-monitoring that likely occurs otherwise. For example, Stegemann, Ebenfeld, Lehr, Berking, and Funk (2013) have developed a promising mobile application (GET.ON PAPP) that functions as a self-monitoring diary of panic attacks and as an exposure guide.

Self-monitoring strategies represent a quantification of experience at the time of its occurrence (whether it be tied to a specific event or to a moment in time), whereas self-reported estimates of frequency, duration, or content represent judgments of experience that are retrospective and generalized in nature. Some investigators have attempted to assess the level of agreement between self-monitored and self-estimated data by having the same individuals provide retrospective estimates of panic attacks and then self-monitor for an interval of time that is equivalent to the interval that was estimated. Using this approach, patients with panic disorder and agoraphobia were found to have endorsed fewer panic symptoms during self-monitoring in comparison to previously collected estimates (Basoglu et al., 1992; Margraf et al., 1987). Similarly, retrospective estimates of the frequency of panic attacks and symptoms collected during diagnostic interviewing have been found to be substantially higher than the frequency obtained with self-monitored data (e.g., De Beurs, Lange, & Van Dyck, 1992). There is evidence that anxiety (Hiebert & Fox, 1981) and panic (de Jong & Bouman, 1995) decrease with self-monitoring. Nevertheless, reactivity effects are generally short-lived and subside when self-monitoring discontinues, perhaps because the self-monitoring device becomes a discriminative stimulus controlling the occurrence of the target behavior (Borkovec et al., 1977). Although there are potential problems associated with the self-monitoring methodology, it is a very effective means of assessing panic disorder and agoraphobia and is sensitive to change over the course of treatment.

Behavioral Approach Tests

Ogles, Lambert, Weight, and Payne (1990) calculated effect sizes pertaining to BATs from their meta-analysis of 21 treatment studies for agoraphobia. The behavioral score (i.e., duration or amount completed) yielded a mean BAT effect size of d = 1.15. The heart rate score during BATs yielded an average effect size of d = 0.44, and the SUDS score yielded a mean effect size of d = 1.36. Other individual studies have similarly reported large treatment effect sizes from individualized BATs (e.g., Steketee, Chambless, & Tran, 2001). In addition, other studies have shown significant reductions in subjective anxiety in response to brief interoceptive exposure interventions (e.g., hyperventilation; Keough & Schmidt, 2012).
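The meta-analytic figures reported above are standardized mean differences (Cohen's d). As a reminder of the underlying arithmetic, the sketch below computes a pre- to post-treatment d for a hypothetical BAT duration score using the pooled standard deviation; the numbers are invented, and a meta-analysis such as Ogles et al. (1990) aggregates and weights such values across studies rather than computing a single raw contrast.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List

def cohens_d(pre: List[float], post: List[float]) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(pre), len(post)
    s1, s2 = stdev(pre), stdev(post)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(post) - mean(pre)) / pooled_sd

# Invented BAT duration scores (minutes completed) before and after treatment.
pre_treatment = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0]
post_treatment = [6.0, 7.5, 5.0, 8.0, 6.5, 7.0]
print(round(cohens_d(pre_treatment, post_treatment), 2))
```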

Overall Evaluation

An accurate diagnosis and case conceptualization is only the first step in conducting a thorough assessment as it relates to the treatment of panic disorder and agoraphobia. The measures reviewed in this section gauge the level of symptomatic improvement and shifts in variables considered critical to therapeutic success, such as cognition, self-efficacy, and perceived control. Because behaviors and thoughts may arise during fear and anxiety that differ from self-reported estimates, observation and recording of ongoing experience capture another aspect of panic disorder and agoraphobia that is important for measuring treatment change.

CONCLUSIONS AND FUTURE DIRECTIONS

From this review, it should be evident that there are a variety of measures and assessment methodologies to help guide clinicians or researchers in their work with individuals with panic disorder and agoraphobia. Throughout this chapter, we have emphasized the importance of measuring subjective, physiological, and behavioral aspects of this disorder, preferably with multiple methodologies of assessment, including diagnostic interviews, standardized self-report questionnaires, self-monitoring, behavioral observations, and physiological data.

Neither diagnostic interview reviewed in this chapter met the criteria for being highly recommended, for several reasons. First, psychometric properties for the recent DSM-5 updated versions of the ADIS (ADIS-5) and SCID (SCID-5) have yet to be published. As a result, the diagnostic reliability of these versions has yet to be determined. Second, attempts to assess the validity of these interviews cannot be separated from attempts at validating the diagnostic system itself (Grisham et al., 2004), and "a gold standard for psychiatric diagnoses remains elusive" (First & Gibbon, 2004, p. 139). Clearly, more research of this nature is needed. Third, because the SCID is rarely used as an index of treatment outcome, data regarding its sensitivity to change as a result of treatment continue to be lacking.

Although semi-structured diagnostic interviews are generally more time-consuming than standard clinical interviews, an assessment that allows for differential diagnosis (e.g., the ADIS or the SCID) is very important for the diagnosis of panic disorder and agoraphobia. Time and money wasted on an inaccurate diagnosis and inappropriate treatment will be significantly greater than the additional time required to conduct a thorough diagnostic assessment using a standardized instrument. Each interview has its own strengths and weaknesses, and selection can be based on purpose. When validity generalization is the priority, for example when diagnoses are being made in atypical settings or with ethnically diverse samples, the SCID is preferred. When the purpose is to obtain detailed information for treatment planning, the ADIS is preferred.

Of the self-report measures reviewed, five are highly recommended for the assessment of panic disorder and agoraphobia because they were assigned mostly good or excellent ratings: the ACQ-CON, ASI-3, BSQ, ACQ, and FQ. Although the remaining measures were not listed as highly recommended, further research and refinement could provide additional information and lead to improvements in their psychometric properties. Self-monitoring and behavioral observational methods, with accompanying measures of physiology and cognition in the moment, cannot easily be reviewed in accordance with the psychometric properties listed for the diagnostic instruments and self-report scales. Nevertheless, given their value in treatment monitoring and treatment outcome, we do judge these methods to be critical to the assessment of panic disorder and agoraphobia.

Directions for future research include, first, more research on those measures that were not rated as highly recommended, especially with respect to adding to or improving their test–retest reliability and evidence of construct validity. Second, although many measures are available in multiple languages, continued research is needed on the usefulness of these measures with different populations. Third, because measures may operate differently in different settings (e.g., outpatient vs. inpatient vs. community centers), it is critical that researchers (especially instrument developers) continue to expand the number and variety of settings in which the measures are evaluated. Fourth, many treatment studies (especially pharmacological studies) use diagnostic interviews only in the pretreatment phase. In order to gather additional data on treatment sensitivity, diagnostic assessments are needed both prior to and following a course of treatment.

It is our hope that this chapter will help clinicians choose measures and assessment methodologies that are scientifically sound and that address the various components of panic disorder and agoraphobia (i.e., cognitions, emotional reactions, physiological symptoms, and behavioral avoidance), and that it will guide them in using assessment data in treatment planning and in monitoring treatment outcomes.

References Acheson, D. T., Forsyth, J. P., & Moses, E. (2012). Interoceptive fear conditioning and panic disorder: The role of conditioned stimulus–​unconditioned stimulus predictability. Behavior Therapy, 43, 174–​189. Addis, M. E., Hatgis, C., Krasnow, A. D., Jacob, K., Bourne, L., & Mansfield, A. (2004). Effectiveness of cognitive–​ behavioral treatment for panic disorder versus treatment as usual in a managed care setting. Journal of Consulting and Clinical Psychology, 72, 625. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Press. Antony, M. M. (2002). Measures for panic disorder and agoraphobia. In M. M. Antony, S. M. Orsillo, & L. Roemer (Eds.), Practitioner’s guide to empirically based measures of anxiety (pp. 95–​125). New York, NY: Springer. Arch, J. J., & Ayers, C. R. (2013). Which treatment worked better for whom? Moderators of group cognitive behavioral therapy versus adapted mindfulness based stress reduction for anxiety disorders. Behaviour Research and Therapy, 51, 434–​442. Arch, J. J., Eifert, G. H., Davies, C., Vilardaga, J. C. P., Rose, R. D., & Craske, M. G. (2012). Randomized clinical trial of cognitive behavioral therapy (CBT) versus acceptance and commitment therapy (ACT) for mixed anxiety disorders. Journal of Consulting and Clinical Psychology, 80, 750. Arrindell, W. A. (1993a). The fear of fear concept: Evidence in favour of multidimensionality. Behaviour Research and Therapy, 31, 507–​518. Arrindell, W. A. (1993b). The fear of fear concept: Stability, retest artefact and predictive power. Behaviour Research and Therapy, 31, 139–​148. Ballash, N. G., Pemble, M. K., Usui, W. M., Buckley, A. F., & Woodruff-​Borden, J. (2006). Family functioning, perceived control, and anxiety:  A mediational model. Journal of Anxiety Disorders, 20, 486–​497. Bankier, B., Januzzi, J. L., & Littman, A. B. (2004). The high prevalence of multiple psychiatric disorders in stable outpatients with coronary heart disease. Psychosomatic Medicine, 66, 645–​650. Barlow, D. H. (2004). Anxiety and its disorders: The nature and treatment of anxiety and panic. New York, NY: Guilford.

Barlow, D. H., & Craske, M. G. (2006). Mastery of your anxiety and panic. New York, NY: Oxford University Press. Barlow, D. H., Craske, M. G., Cerny, J. A., & Klosko, J. S. (1989). Behavioral treatment of panic disorder. Behavior Therapy, 20, 261–​282. Basco, M. R., Bostic, J. Q., Davies, D., Rush, A. J., Witte, B., Hendrickse, W., & Barnett, V. (2000). Methods to improve diagnostic accuracy in a community mental health setting. American Journal of Psychiatry, 157, 1599–​1605. Basoglu, M., Marks, I. M., & Sengun, S. (1992). A prospective study of panic and anxiety in agoraphobia with panic disorder. British Journal of Psychiatry, 160, 57–​64. Batelaan, N. M., Rhebergen, D., de Graaf, R., Spijker, J., Beekman, A. T., & Penninx, B. W. (2012). Panic attacks as a dimension of psychopathology: Evidence for associations with onset and course of mental disorders and level of functioning. Journal of Clinical Psychiatry, 73, 1195–​1202. Bernstein, A., Zvolensky, M. J., Norton, P. J., Schmidt, N. B., Taylor, S., Forsyth, J. P., . . . Stewart, S. H. (2007). Taxometric and factor analytic models of anxiety sensitivity:  Integrating approaches to latent structural research. Psychological Assessment, 19, 74. Bitran, S., Morissette, S. B., Spiegel, D. A., & Barlow, D. H. (2008). A pilot study of sensation-​focused intensive treatment for panic disorder with moderate to severe agoraphobia: Preliminary outcome and benchmarking data. Behavior Modification, 32, 196–​214. Blais, M. A., Otto, M. W., Zucker, B. G., McNally, R. J., Schmidt, N. B., Fava, M., & Pollack, M. H. (2001). The Anxiety Sensitivity Index: Item analysis and suggestions for refinement. Journal of Personality Assessment, 77, 272–​294. Blaya, C., Salum, G. A., Lima, M. S., Leistner-​Segal, S., & Manfro, G. G. (2007). Lack of association between the serotonin transporter promoter polymorphism (5-​ HTTLPR) and panic disorder: A systematic review and meta-​analysis. Behavioral and Brain Functions, 3, 1. Blechert, J., Wilhelm, F. H., Meuret, A. E., Wilhelm, E. M., & Roth, W. T. (2013). Experiential, autonomic, and respiratory correlates of CO2 reactivity in individuals with high and low anxiety sensitivity. Psychiatry Research, 209, 566–​573. Block, R. I., Ghoneim, M., Fowles, D. C., Kumar, V., & Pathak, D. (1987). Effects of a subanesthetic concentration of nitrous oxide on establishment, elicitation, and semantic and phonemic generalization of classically conditioned skin conductance responses. Pharmacology Biochemistry and Behavior, 28, 7–​14. Borkovec, T. D., Weerts, T. C., & Bernstein, D. A. (1977). Assessment of anxiety. In A. R. Ciminero, K. S. Calhoun, & H. E. Adams (Eds.), Handbook of behavioral assessment (pp. 367–​428). New York, NY: Wiley.


Bouchard, S., Gauthier, J., Laberge, B., French, D., Pelletier, M.-​H., & Godbout, C. (1996). Exposure versus cognitive restructuring in the treatment of panic disorder with agoraphobia. Behaviour Research and Therapy, 34, 213–​224. Bouton, M. E., Mineka, S., & Barlow, D. H. (2001). A modern learning theory perspective on the etiology of panic disorder. Psychological Review, 108, 4. Bouvard, M., Cottraux, J., Talbot, F., Mollard, E., Duhem, S., Yao, S.-​N., . . . Cungi, C. (1998). Validation of the French translation of the Agoraphobic Cognitions Questionnaire. Psychotherapy and Psychosomatics, 67, 249–​253. Bowen, R., Chavira, D. A., Bailey, K., Stein, M. T., & Stein, M. B. (2008). Nature of anxiety comorbid with attention deficit hyperactivity disorder in children from a pediatric primary care setting. Psychiatry Research, 157, 201–​209. Brown, T. A., & Barlow, D. H. (2014a). Anxiety and Related Disorders Interview Schedule for DSM-​ 5 (ADIS-​ 5)–​ Adult Version. New York, NY: Oxford University Press. Brown, T. A., & Barlow, D. H. (2014b). Anxiety and Related Disorders Interview Schedule for DSM-​ 5 (ADIS-​ 5)–​ Lifetime Version: Client Interview Schedule. New York, NY: Oxford University Press. Brown, T. A., Barlow, D. H., & DiNardo, P. A. (1994). Anxiety Disorders Interview Schedule for DSM-​IV (ADIS-​ IV): Client Interview Schedule. Boulder, CO: Graywind. Brown, T. A., Campbell, L. A., Lehman, C. L., Grisham, J. R., & Mancill, R. B. (2001). Current and lifetime comorbidity of the DSM-​IV anxiety and mood disorders in a large clinical sample. Journal of Abnormal Psychology, 110, 585. Brown, T. A., Chorpita, B. F., & Barlow, D. H. (1998). Structural relationships among dimensions of the DSM-​IV anxiety and mood disorders and dimensions of negative affect, positive affect, and autonomic arousal. Journal of Abnormal Psychology, 107, 179. Brown, T. A., & Deagle, E. A. (1993). Structured interview assessment of nonclinical panic. Behavior Therapy, 23, 75–​85. Brown, T. A., DiNardo, P. A., Lehman, C. L., & Campbell, L. A. (2001). Reliability of DSM-​IV anxiety and mood disorders: Implications for the classification of emotional disorders. Journal of Abnormal Psychology, 110, 49. Brown, T. A., White, K. S., & Barlow, D. H. (2005). A psychometric reanalysis of the Albany Panic and Phobia Questionnaire. Behaviour Research and Therapy, 43, 337–​355. Brown, T. A., White, K. S., Forsyth, J. P., & Barlow, D. H. (2005). The structure of perceived emotional control: Psychometric properties of a revised Anxiety Control Questionnaire. Behavior Therapy, 35, 75–​99. Bruce, S. E., Yonkers, K. A., Otto, M. W., Eisen, J. L., Weisberg, R. B., Pagano, M., . . . Keller, M. B. (2005).


Influence of psychiatric comorbidity on recovery and recurrence in generalized anxiety disorder, social phobia, and panic disorder:  A 12-​year prospective study. American Journal of Psychiatry, 162, 1179–​1187. Bystritsky, A., Craske, M., Maidenberg, E., Vapnik, T., & Shapiro, D. (1995). Ambulatory monitoring of panic patients during regular activity:  A preliminary report. Biological Psychiatry, 38, 684–​689. Calvo, M., & Eysenck, M. (1998). Cognitive bias to internal sources of information in anxiety. International Journal of Psychology, 33, 287–​299. Carlbring, P., Bohman, S., Brunt, S., Buhrman, M., Westling, B. E., Ekselius, L., & Andersson, G. (2006). Remote treatment of panic disorder: A randomized trial of Internet-​based cognitive behavior therapy supplemented with telephone calls. American Journal of Psychiatry, 163, 2119–​2125. Carlbring, P., Brunt, S., Bohman, S., Austin, D., Richards, J., Öst, L. G., & Andersson, G. (2007). Internet vs. paper and pencil administration of questionnaires commonly used in panic/​ agoraphobia research. Computers in Human Behavior, 23, 1421–​1434. Carlbring, P., Nilsson-​Ihrfelt, E., Waara, J., Kollenstam, C., Buhrman, M., Kaldo, V.,  .  .  .  Andersson, G. (2005). Treatment of panic disorder:  Live therapy vs. self-​help via the Internet. Behaviour Research and Therapy, 43, 1321–​1333. Carter, M. M., Miller, O. J., Sbrocco, T., Suchday, S., & Lewis, E. L. (1999). Factor structure of the anxiety sensitivity index among African American college students. Psychological Assessment, 11, 525. Carter, M. M., Sbrocco, T., Gore, K. L., Marin, N. W., & Lewis, E. L. (2003). Cognitive–​behavioral group therapy versus a wait-​list control in the treatment of African American women with panic disorder. Cognitive Therapy and Research, 27, 505–​518. Casey, L. M., Newcombe, P. A., & Oei, T. P. (2005). Cognitive mediation of panic severity:  The role of catastrophic misinterpretation of bodily sensations and panic self-​ efficacy. Cognitive Therapy and Research, 29, 187–​200. Chambless, D. L., Beck, A. T., Gracely, E. J., & Grisham, J. R. (2000). Relationship of cognitions to fear of somatic symptoms:  A test of the cognitive theory of panic. Depression and Anxiety, 11, 1–​9. Chambless, D. L., Caputo, G. C., Bright, P., & Gallagher, R. (1984). Assessment of fear of fear in agoraphobics: The Body Sensations Questionnaire and the Agoraphobic Cognitions Questionnaire. Journal of Consulting and Clinical Psychology, 52, 1090. Chambless, D. L., Caputo, G. C., Jasin, S. E., Gracely, E. J., & Williams, C. (1985). The Mobility Inventory for Agoraphobia. Behaviour Research and Therapy, 23, 35–​44. Chambless, D. L., & Gracely, E. J. (1989). Fear of fear and the anxiety disorders. Cognitive Therapy and Research, 13, 9–​20.



Chambless, D. L., Sharpless, B. A., Rodriguez, D., McCarthy, K. S., Milrod, B. L., Khalsa, S. R., & Barber, J. P. (2011). Psychometric properties of the Mobility Inventory for Agoraphobia: Convergent, discriminant, and criterion-​ related validity. Behavior Therapy, 42, 689–​699. Clark, D. M. (1986). A cognitive approach to panic. Behaviour Research and Therapy, 24, 461–​470. Clark, D. M. (1988). A cognitive model of panic attacks. In S. Rachman & J. Maser (Eds.), Panic: Psychological perspectives (pp. 71–​90). Hillsdale, NJ: Erlbaum. Clark, D. M., & Ehlers, A. (1993). An overview of the cognitive theory and treatment of panic disorder. Applied and Preventive Psychology, 2, 131–​139. Clark, D. M., Salkovskis, P. M., Hackmann, A., Middleton, H., Anastasiades, P., & Gelder, M. (1994). A comparison of cognitive therapy, applied relaxation and imipramine in the treatment of panic disorder. British Journal of Psychiatry, 164, 759–​769. Coryell, W., Fyer, A., Pine, D., Martinez, J., & Arndt, S. (2001). Aberrant respiratory sensitivity to CO2 as a trait of familial panic disorder. Biological Psychiatry, 49, 582–​587. Cougle, J. R., Timpano, K. R., Sachs-​ Ericsson, N., Keough, M. E., & Riccardi, C. J. (2010). Examining the unique relationships between anxiety disorders and childhood physical and sexual abuse in the National Comorbidity Survey-​Replication. Psychiatry Research, 177, 150–​155. Cox, B. J., Swinson, R. P., Parker, J. D., Kuch, K., & Reichman, J. T. (1993). Confirmatory factor analysis of the Fear Questionnaire in panic disorder with agoraphobia. Psychological Assessment, 5, 235. Cox, B. J., Swinson, R. P., & Shaw, B. F. (1991). Value of the Fear Questionnaire in differentiating agoraphobia and social phobia. British Journal of Psychiatry, 159, 842–​845. Craske, M. G., & Barlow, D. H. (1988). A review of the relationship between panic and avoidance. Clinical Psychology Review, 8, 667–​685. Craske, M. G., & Barlow, D. H. (2007). Mastery of your anxiety and panic: Client workbook. New York, NY: Oxford University Press. Craske, M. G., & Barlow, D. H. (2014). Panic disorder and agoraphobia. In D. H. Barlow (Ed.), Clinical handbook of psychological disorders: A step-​by-​step treatment manual (pp. 1–​61). New York, NY: Guilford. Craske, M. G., Farchione, T. J., Allen, L. B., Barrios, V., Stoyanova, M., & Rose, R. (2007). Cognitive behavioral therapy for panic disorder and comorbidity: More of the same or less of more? Behaviour Research and Therapy, 45, 1095–​1109. Craske, M. G., & Freed, S. (1995). Expectations about arousal and nocturnal panic. Journal of Abnormal Psychology, 104, 567.

Craske, M. G., Glover, D., & DeCola, J. (1995). Predicted versus unpredicted panic attacks: Acute versus general distress. Journal of Abnormal Psychology, 104, 214. Craske, M. G., Golinelli, D., Stein, M. B., Roy-​Byrne, P., Bystritsky, A., & Sherbourne, C. (2005). Does the addition of cognitive behavioral therapy improve panic disorder treatment outcome relative to medication alone in the primary-​care setting? Psychological Medicine, 35, 1645–​1654. Craske, M. G., Lang, A. J., Rowe, M., DeCola, J. P., Simmons, J., Mann, C.,  .  .  .  Bystritsky, A. (2002). Presleep attributions about arousal during sleep:  Nocturnal panic. Journal of Abnormal Psychology, 111, 53. Craske, M. G., Miller, P. P., Rotunda, R., & Barlow, D. H. (1990). A descriptive report of features of initial unexpected panic attacks in minimal and extensive avoiders. Behaviour Research and Therapy, 28, 395–​400. Craske, M. G., Poulton, R., Tsao, J. C., & Plotkin, D. (2001). Paths to panic disorder/​ agoraphobia:  An exploratory analysis from age 3 to 21 in an unselected birth cohort. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 556–​563. Craske, M. G., Rapee, R. M., & Barlow, D. H. (1988). The significance of panic-​expectancy for individual patterns of avoidance. Behavior Therapy, 19, 577–​592. Craske, M. G., Rowe, M., Lewin, M., & Noriega-​Dimitri, R. (1997). Interoceptive exposure versus breathing retraining within cognitive–​ behavioural therapy for panic disorder with agoraphobia. British Journal of Clinical Psychology, 36, 85–​99. Craske, M. G., & Tsao, J. C. (2005). Assessment and treatment of nocturnal panic attacks. Sleep Medicine Reviews, 9, 173–​184. Cromarty, P., Robinson, G., Callcott, P., & Freeston, M. (2004). Cognitive therapy and exercise for panic and agoraphobia in primary care:  Pilot study and service development. Behavioural and Cognitive Psychotherapy, 32, 371–​374. Dammen, T., Arnesen, H., Ekeberg, Ø., Husebye, T., & Friis, S. (1999). Panic disorder in chest pain patients referred for cardiological outpatient investigation. Journal of Internal Medicine, 245, 497–​507. De Beurs, E., Lange, A., & Van Dyck, R. (1992). Self-​ monitoring of panic attacks and retrospective estimates of panic: Discordant findings. Behaviour Research and Therapy, 30, 411–​413. De Cort, K., Hermans, D., Noortman, D., Arends, W., Griez, E. J., & Schruers, K. R. (2013). The weight of cognitions in panic: The link between misinterpretations and panic attacks. PLoS One, 8, e70315. De Cort, K., Hermans, D., Spruyt, A., Griez, E., & Schruers, K. (2008). A specific attentional bias in panic disorder? Depression and Anxiety, 25, 951–​955. de Jong, G. M., & Bouman, T. K. (1995). Panic disorder—​A baseline period:  Predictability of agoraphobic avoidance behavior. Journal of Anxiety Disorders, 9, 185–​199.


DiNardo, P., & Barlow, D. (1988). Anxiety Disorders Interview Schedule for DSM-​ III-​ R (ADIS-​ R). Albany, NY: Graywind. DiNardo, P., Moras, K., Barlow, D., Rapee, R., & Brown, T. (1993). Reliability of DSM-​III-​R anxiety disorder categories: Using the Anxiety Disorders Interview Schedule-​ Revised (ADIS-​R). Archives of General Psychiatry, 50, 251–​256. DiNardo, P., O’Brien, G., Barlow, D., Waddell, M., & Blanchard, E. (1983). Reliability of DSM-​III anxiety disorder categories using a new structured interview. Archives of General Psychiatry, 40, 1070–​1074. Domschke, K., Deckert, J., O’Donovan, M. C., & Glatt, S. J. (2007). Meta-​ analysis of COMT val158met in panic disorder:  Ethnic heterogeneity and gender specificity. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 144, 667–​673. Domschke, K., Ohrmann, P., Braun, M., Suslow, T., Bauer, J., Hohoff, C., . . . Heindel, W. (2008). Influence of the catechol-​O-​ methyltransferase val158met genotype on amygdala and prefrontal cortex emotional processing in panic disorder. Psychiatry Research:  Neuroimaging, 163, 13–​20. Ehlers, A. (1995). A 1-​ year prospective study of panic attacks:  Clinical course and factors associated with maintenance. Journal of Abnormal Psychology, 104, 164. Eley, T. C. (2001). Contributions of behavioral genetics research:  Quantifying genetic, shared environmental and nonshared environmental influences. In M. W. Vasey & M. R. Dadds (Eds.), The developmental psychopathology of anxiety (pp. 45–​59). New York, NY: Oxford University Press. Eysenck, H. J. (2009). The biological basis of personality. New Brunswick, NJ: Transaction Publishers. (Original work published 1967) Faravelli, C., Furukawa, T. A., & Truglia, E. (2009). Panic disorder. In G. Andrews, D. S. Charney, P. J. Sirovatka, & D. A. Regier (Eds.), Stress-​ induced and fear circuitry disorders (pp. 31–​58). Arlington, VA:  American Psychiatric Association. Fentz, H. N., Hoffart, A., Jensen, M. B., Arendt, M., O’Toole, M. S., Rosenberg, N. K., & Hougaard, E. (2013). Mechanisms of change in cognitive behaviour therapy for panic disorder:  The role of panic self-​ efficacy and catastrophic misinterpretations. Behaviour Research and Therapy, 51, 579–​587. First, M. B., & Gibbon, M. (2004). The Structured Clinical Interview for DSM-​IV Axis I  Disorders (SCID-​I) and the Structured Clinical Interview for DSM-​ IV Axis II Disorders (SCID-​ II). New  York, NY:  Biometrics Research, New York State Psychiatric Institute. First, M. B., Spitzer, R. L., Williams, J. B., & Gibbon, M. (1995). Structured clinical interview for DSM-​IV–​Non-​ Patient edition (SCID-​ NP, Version 1.0). New  York,


NY:  Biometrics Research, New  York State Psychiatric Institute. First, M. B., Williams, J. B., Karg, R. S., & Spitzer, R. L. (2016a). User’s guide for the SCID-​ 5-​ CV:  Structured Clinical Interview for DSM-​ 5 Disorders, Clinician Version. Arlington, VA:  American Psychiatric Association Publishing. First, M. B., Williams, J. B., Karg, R. S., & Spitzer, R. L. (2016b). Structured Clinical Interview for DSM-​5 Disorders: SCID-​ 5-​CT (Clinical Trials Version). Arlington, VA: American Psychiatric Association Publishing. First, M. B., Williams, J. B., Karg, R. S., & Spitzer, R. L. (2016c). Structured Clinical Interview for DSM-​ 5 Disorders:  SCID-​5-​CV (Clinician Version). Arlington, VA: American Psychiatric Association Publishing. First, M. B., Williams, J. B., Karg, R. S., & Spitzer, R. L. (2016d). Structured Clinical Interview for DSM-​ 5 Disorders:  SCID-​5-​RV (Research Version). Arlington, VA: American Psychiatric Association Publishing. Freire, R. C., Machado, S., Arias-​Carrión, O., & Nardi, A. E. (2014). Current pharmacological interventions in panic disorder. CNS & Neurological Disorders–​Drug Targets, 13, 1057–​1065. Gauthier, J., Bouchard, S., Cote, G., Laberge, B., & French, D. (1993). Development of two scales measuring self-​ efficacy to control panic attacks. Canadian Psychology, 30, 305. Ghisi, M., Bottesi, G., Altoè, G., Razzetti, E., Melli, G., & Sica, C. (2016). Factor structure and psychometric properties of the Anxiety Sensitivity Index-​3 in an Italian community sample. Frontiers in Psychology, 7, 160. Gillis, M. M., Haaga, D. A., & Ford, G. T. (1995). Normative values for the Beck Anxiety Inventory, Fear Questionnaire, Penn State Worry Questionnaire, and Social Phobia and Anxiety Inventory. Psychological Assessment, 7, 450. Gloster, A. T., Sonntag, R., Hoyer, J., Meyer, A. H., Heinze, S., Ströhle, A.,  .  .  .  Wittchen, H. U. (2015). Treating treatment-​resistant patients with panic disorder and agoraphobia using psychotherapy: A randomized controlled switching trial. Psychotherapy and Psychosomatics, 84, 100–​109. Goldsmith, H. (1994). Parsing the emotional domain from a developmental perspective. In P. Ekman & R. J. Davidson (Eds.), The nature of emotion:  Fundamental questions (pp. 68–​73). Oxford, UK: Oxford University Press. Gonda, X., Fountoulakis, K. N., Juhasz, G., Rihmer, Z., Lazary, J., Laszik, A., . . . Bagdy, G. (2009). Association of the S allele of the 5-​HTTLPR with neuroticism-​ related traits and temperaments in a psychiatrically healthy population. European Archives of Psychiatry and Clinical Neuroscience, 259, 106–​113. Gonzalez, A., Zvolensky, M. J., Grover, K. W., & Parent, J. (2012). The role of anxiety sensitivity and mindful



attention in anxiety and worry about bodily sensations among adults living with HIV/​AIDS. Behavior Therapy, 43, 768–​778. Goodwin, R. D., Fergusson, D. M., & Horwood, L. J. (2005). Childhood abuse and familial violence and the risk of panic attacks and panic disorder in young adulthood. Psychological Medicine, 35, 881–​890. Gorman, J. M., Papp, L. A., Coplan, J. D., Martinez, J. M., Lennon, S., Goetz, R. R., . . . Klein, D. F. (1994). Anxiogenic effects of CO2 and hyperventilation in patients with panic disorder. American Journal of Psychiatry, 151, 547–​553. Grisham, J. R., Brown, T. A., & Campbell, L. A. (2004). The Anxiety Disorders Interview Schedule for DSM-​ IV (ADIS-​IV). In M. J. Hilsenroth, D. L. Segal, & M. Hersen (Eds.), Comprehensive handbook of psychological assessment (Vol. 2, pp. 163–​177). New York, NY: Wiley. Hayward, C., Killen, J. D., Kraemer, H. C., & Taylor, C. B. (2000). Predictors of panic attacks in adolescents. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 207–​214. Hein, D., Matzner, F., First, M., Spitzer, R., Williams, J., & Gibbons, M. (1998). Structured Clinical Interview for DSM-​IV childhood disorders (KID-​SCID). New  York, NY:  Department of Psychiatry, Columbia University Medical School. Hendriks, G. J., Keijsers, G., Kampman, M., Oude Voshaar, R., Verbraak, M., Broekman, T., & Hoogduin, C. (2010). A randomized controlled study of paroxetine and cognitive–​ behavioural therapy for late-​ life panic disorder. Acta Psychiatrica Scandinavica, 122, 11–​19. Hettema, J. M., Neale, M. C., & Kendler, K. S. (2001). A review and meta-​analysis of the genetic epidemiology of anxiety disorders. American Journal of Psychiatry, 158, 1568–​1578. Hiebert, B., & Fox, E. (1981). Reactive effects of self-​ monitoring anxiety. Journal of Counseling Psychology, 28, 187. Houck, P. R., Spiegel, D. A., Shear, M. K., & Rucci, P. (2002). Reliability of the self-​report version of the Panic Disorder Severity Scale. Depression and Anxiety, 15, 183–​185. Jacob, R. G., Furman, J. M., Clark, D. B., & Durrant, J. D. (1992). Vestibular symptoms, panic, and phobia. Annals of Clinical Psychiatry, 4, 163–​174. Kaplan, J. S., Arnkoff, D. B., Glass, C. R., Tinsley, R., Geraci, M., Hernandez, E., . . . Carlson, P. J. (2012). Avoidant coping in panic disorder: A yohimbine biological challenge study. Anxiety, Stress & Coping, 25, 425–​442. Kashdan, T. B., Barrios, V., Forsyth, J. P., & Steger, M. F. (2006). Experiential avoidance as a generalized psychological vulnerability: Comparisons with coping and emotion regulation strategies. Behaviour Research and Therapy, 44, 1301–​1320.

Katerndahl, D. A., & Realini, J. P. (1995). Where do panic attack sufferers seek care? Journal of Family Practice, 40, 237–​244. Kemper, C. J., Lutz, J., Bähr, T., Rddel, H., & Hock, M. (2011). Construct validity of the Anxiety Sensitivity Index-​3 in clinical samples. Assessment, 19, 89–​100. Kenardy, J. A., Dow, M. G., Johnston, D. W., Newman, M. G., Thomson, A., & Taylor, C. B. (2003). A comparison of delivery methods of cognitive–​behavioral therapy for panic disorder:  An international multicenter trial. Journal of Consulting and Clinical Psychology, 71, 1068. Kendler, K. S., Heath, A. C., Martin, N. G., & Eaves, L. J. (1987). Symptoms of anxiety and symptoms of depression:  Same genes, different environments? Archives of General Psychiatry, 44, 451–​457. Keough, M. E., & Schmidt, N. B. (2012). Refinement of a brief anxiety sensitivity reduction intervention. Journal of Consulting and Clinical Psychology, 80, 766–​772. Kessler, R. C., Berglund, P., Demler, O., Jin, R., Merikangas, K. R., & Walters, E. E. (2005). Lifetime prevalence and age-​of-​onset distributions of DSM-​IV disorders in the National Comorbidity Survey Replication. Archives of General Psychiatry, 62, 593–​602. Kessler, R. C., Chiu, W. T., Jin, R., Ruscio, A. M., Shear, K., & Walters, E. E. (2006). The epidemiology of panic attacks, panic disorder, and agoraphobia in the National Comorbidity Survey Replication. Archives of General Psychiatry, 63, 415–​424. Kessler, R. C., Petukhova, M., Sampson, N. A., Zaslavsky, A. M., & Wittchen, H. U. (2012). Twelve-​month and lifetime prevalence and lifetime morbid risk of anxiety and mood disorders in the United States. International Journal of Methods in Psychiatric Research, 21, 169–​184. Kessler, R. C., & Üstün, T. B. (2004). The World Mental Health (WMH) survey initiative version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI). International Journal of Methods in Psychiatric Research, 13, 93–​121. Kim, B., Lee, S.-​H., Kim, Y. W., Choi, T. K., Yook, K., Suh, S. Y.,  .  .  .  Yook, K.-​H. (2010). Effectiveness of a mindfulness-​ based cognitive therapy program as an adjunct to pharmacotherapy in patients with panic disorder. Journal of Anxiety Disorders, 24, 590–​595. Kim, J. H., Yang, J. C., Kim, J. B., Lim, K. Y., Lee, S. Y., & Yu, B. H. (2004). A validation study of Korean Albany Panic and Phobia Questionnaire (APPQ). Journal of Korean Neuropsychiatric Association, 43, 329–​336. Kinney, P. J., & Williams, S. L. (1988). Accuracy of fear inventories and self-​efficacy scales in predicting agoraphobic behavior. Behaviour Research and Therapy, 26, 513–​518. Kircanski, K., Craske, M. G., Epstein, A. M., & Wittchen, H. U. (2009). Subtypes of panic attacks:  A critical review of the empirical literature. Depression and Anxiety, 26, 878–​887.


Kiropoulos, L. A., Klein, B., Austin, D. W., Gilson, K., Pier, C., Mitchell, J., & Ciechomski, L. (2008). Is Internet-​ based CBT for panic disorder and agoraphobia as effective as face-​to-​face CBT? Journal of Anxiety Disorders, 22, 1273–​1284. Kotov, R., Schmidt, N. B., Zvolensky, M. J., Vinogradov, A., & Antipova, A. V. (2005). Adaptation of panic-​related psychopathology measures to Russian. Psychological Assessment, 17, 242. Kroeze, S., & van den Hout, M. A. (2000). Selective attention for cardiac information in panic patients. Behaviour Research and Therapy, 38, 63–​72. Lang, A. J., & McNiel, D. E. (2006). Use of the Anxiety Control Questionnaire in psychiatric inpatients. Depression and Anxiety, 23, 107–​112. Lee, H.-​C. B., & Oei, T. P. (1994). Factor structure, validity, and reliability of the Fear Questionnaire in a Hong Kong Chinese population. Journal of Psychopathology and Behavioral Assessment, 16, 189–​199. Lonsdorf, T. B., Rück, C., Bergström, J., Andersson, G., Öhman, A., Schalling, M., & Lindefors, N. (2009). The symptomatic profile of panic disorder is shaped by the 5-​HTTLPR polymorphism. Progress in Neuro-​ Psychopharmacology and Biological Psychiatry, 33, 1479–​1483. López-​Solà, C., Fontenelle, L. F., Alonso, P., Cuadras, D., Foley, D. L., Pantelis, C., . . . Soriano-​Mas, C. (2014). Prevalence and heritability of obsessive–​ compulsive spectrum and anxiety disorder symptoms:  A survey of the Australian Twin Registry. American Journal of Medical Genetics Part B:  Neuropsychiatric Genetics, 165, 314–​325. Löwe, B., Gräfe, K., Zipfel, S., Spitzer, R. L., Herrmann-​ Lingen, C., Witte, S., & Herzog, W. (2003). Detecting panic disorder in medical and psychosomatic outpatients:  Comparative validation of the Hospital Anxiety and Depression Scale, the Patient Health Questionnaire, a screening question, and physicians’ diagnosis. Journal of Psychosomatic Research, 55, 515–​519. Maidenberg, E., Chen, E., Craske, M., Bohn, P., & Bystritsky, A. (1996). Specificity of attentional bias in panic disorder and social phobia. Journal of Anxiety Disorders, 10, 529–​541. Maier, S. F., Laudenslager, M. L., & Ryan, S. M. (1985). Stressor controllability, immune function, and endogenous opiates. In F. R. Brush & J. B. Overmeir (Eds.), Affect, conditioning, and cognition:  Essays on the determinants of behavior (pp. 183–​ 201). Hillsdale, NJ: Erlbaum. Mantar, A., Yemez, B., & Alkin, T. (2010). The validity and reliability of the Turkish version of the Anxiety Sensitivity Index-​3. Turk Psikiyatri Dergisi, 21, 1. Margraf, J., & Schneider, S. (1991). Outcome and active ingredients of cognitive–​behavioral treatments for panic


disorder. Paper presented at the 25th annual meeting of the Association for the Advancement of Behavior Therapy, New York, NY. Margraf, J., Taylor, C. B., Ehlers, A., Roth, W. T., & Agras, W. S. (1987). Panic attacks in the natural environmet. Journal of Nervous and Mental Disease, 175, 558–​565. Marks, I. M., & Mathews, A. M. (1979). Brief standard self-​ rating for phobic patients. Behaviour Research and Therapy, 17, 263–​267. Martin, N., Jardine, R., Andrews, G., & Heath, A. (1988). Anxiety disorders and neuroticism: Are there genetic factors specific to panic? Acta Psychiatrica Scandinavica, 77, 698–​706. Massat, I., Souery, D., Del-​ Favero, J., Nothen, M., Blackwood, D., Muir, W., . . . Rietschel, M. (2005). Association between COMT (Val158Met) functional polymorphism and early onset in patients with major depressive disorder in a European multicenter genetic association study. Molecular Psychiatry, 10, 598–​6 05. Mavissakalian, M., & Michelson, L. (1982). Patterns of psychophysiological change in the treatment of agoraphobia. Behaviour Research and Therapy, 20, 347–​356. Meuret, A. E., Rosenfield, D., Hofmann, S. G., Suvak, M. K., & Roth, W. T. (2009). Changes in respiration mediate changes in fear of bodily sensations in panic disorder. Journal of Psychiatric Research, 43, 634–​641. Michelson, L., & Mavissakalian, M. (1983). Temporal stability of self-​report measures in agoraphobia research. Behaviour Research and Therapy, 21, 695–​698. Mineka, S., & Zinbarg, R. (2006). A contemporary learning theory perspective on the etiology of anxiety disorders: It’s not what you thought it was. American Psychologist, 61, 10. Monkul, E. S., Tural, Ü., Onur, E., Fidaner, H., Alkın, T., & Shear, M. K. (2004). Panic Disorder Severity Scale:  Reliability and validity of the Turkish version. Depression and Anxiety, 20, 8–​16. Murphy, M. T., Michelson, L. K., Marchione, K., Marchione, N., & Testa, S. (1998). The role of self-​directed in-​vivo exposure in combination with cognitive therapy, relaxation training, or therapist-​assisted exposure in the treatment of panic disorder with agoraphobia. Journal of Anxiety Disorders, 12, 117–​138. Novy, D. M., Stanley, M. A., Averill, P., & Daza, P. (2001). Psychometric comparability of English-​ and Spanish-​ language measures of anxiety and related affective symptoms. Psychological Assessment, 13, 347. Oei, T. P., Moylan, A., & Evans, L. (1991). Validity and clinical utility of the Fear Questionnaire for anxiety-​disorder patients. Psychological Assessment, 3, 391. Ogles, B. M., Lambert, M. J., Weight, D. G., & Payne, I. R. (1990). Agoraphobia outcome measurement: A review and meta-​analysis. Psychological Assessment, 2, 317.



Osma, J., Barrada, J., García-​Palacios, A., Navarro-​Haro, M., & Aguilar, A. (2016). Internal structure and clinical utility of the Anxiety Control Questionnaire-​ Revised (ACQ-​ R) Spanish version. Manuscript submitted for publication. Spanish Journal of Psychology, 19, E63. Otowa, T., Yoshida, E., Sugaya, N., Yasuda, S., Nishimura, Y., Inoue, K.,  .  .  .  Nishida, N. (2009). Genome-​wide association study of panic disorder in the Japanese population. Journal of Human Genetics, 54, 122–​126. Pané-​Farré, C. A., Stender, J. P., Fenske, K., Deckert, J., Reif, A., John, U., . . . Alpers, G. W. (2014). The phenomenology of the first panic attack in clinical and community-​based samples. Journal of Anxiety Disorders, 28, 522–​529. Peterson, R., & Reiss, S. (1992). Anxiety Sensitivity Index revised test manual. Worthington, OH:  International Diagnostic Services. Petrocchi, N., Tenore, K., Couyoumdjian, A., & Gragnani, A. (2014). The Anxiety Sensitivity Index-​3: Factor structure and psychometric properties in Italian clinical and non-​clinical samples. Applied Psychology Bulletin, 269, 53–​64. Pollard, C. A., Pollard, H. J., & Corn, K. J. (1989). Panic onset and major events in the lives of agoraphobics: A test of contiguity. Journal of Abnormal Psychology, 98, 318. Porter, E., & Chambless, D. L. (2015). A systematic review of predictors and moderators of improvement in cognitive–​ behavioral therapy for panic disorder and agoraphobia. Clinical Psychology Review, 42, 179–​192. Rapee, R. M., Brown, T. A., Antony, M. M., & Barlow, D. H. (1992). Response to hyperventilation and inhalation of 5.5% carbon dioxide-​enriched air across the DSM-​ III-​R anxiety disorders. Journal of Abnormal Psychology, 101, 538. Rapee, R. M., Craske, M. G., & Barlow, D. H. (1994). Assessment instrument for panic disorder that includes fear of sensation-​producing activities: The Albany Panic and Phobia Questionnaire. Anxiety, 1, 114–​122. Rapee, R. M., Craske, M. G., Brown, T. A., & Barlow, D. H. (1996). Measurement of perceived control over anxiety-​ related events. Behavior Therapy, 27, 279–​293. Reiss, S., Peterson, R. A., Gursky, D. M., & McNally, R. J. (1986). Anxiety sensitivity, anxiety frequency and the prediction of fearfulness. Behaviour Research and Therapy, 24, 1–​8. Rifkin, L. S., Beard, C., Hsu, K. J., Garner, L., & Björgvinsson, T. (2015). Psychometric properties of the Anxiety Sensitivity Index-​3 in an acute and heterogeneous treatment sample. Journal of Anxiety Disorders, 36, 99–​102. Rodriguez, B. F., Weisberg, R. B., Pagano, M. E., Machan, J. T., Culpepper, L., & Keller, M. B. (2004). Frequency and patterns of psychiatric comorbidity in a sample of primary care patients with anxiety disorders. Comprehensive Psychiatry, 45, 129–​137.

Roemer, L. (2002). Measures for anxiety and related constructs. In A. M. Anthony, S. M. Osillo, & L. Roemer (Eds.), Practitioner’s guide to empirically based measures of anxiety (pp. 49–​83). Dordrecht, the Netherlands: Kluwer. Roy-​Byrne, P. P., Stein, M. B., Russo, J., Mercier, E., Thomas, R., McQuaid, J., . . . Sherbourne, C. D. (1999). Panic disorder in the primary care setting:  Comorbidity, disability, service utilization, and treatment. Journal of Clinical Psychiatry, 60, 492–​499. Salkovskis, P. M. (1991). The importance of behaviour in the maintenance of anxiety and panic: A cognitive account. Behavioural Psychotherapy, 19, 6–​19. Schmidt, N. B., Capron, D. W., Raines, A. M., & Allan, N. P. (2014). Randomized clinical trial evaluating the efficacy of a brief intervention targeting anxiety sensitivity cognitive concerns. Journal of Consulting and Clinical Psychology, 82, 1023. Schmidt, N. B., Lerew, D. R., & Jackson, R. J. (1997). The role of anxiety sensitivity in the pathogenesis of panic:  Prospective evaluation of spontaneous panic attacks during acute stress. Journal of Abnormal Psychology, 106, 355. Shear, M., Feske, U., Brown, C., Clark, D., Mammen, O., & Scotti, J. (2000). Anxiety disorders measures. In Taskforce for the Handbook of Psychiatric Measures (Eds.), Handbook of psychiatric measures (pp. 549–​590). Washington, DC: American Psychiatric Association. Shear, M. K., Brown, T. A., Barlow, D. H., Money, R., Sholomskas, D. E., Woods, S. W.,  .  .  .  Papp, L. A. (1997). Multicenter Collaborative Panic Disorder Severity Scale. American Journal of Psychiatry, 154, 1571–​1575. Shear, M. K., Polan, J. J., Harshfield, G., Pickering, T., Mann, J. J., Frances, A., & James, G. (1992). Ambulatory monitoring of blood pressure and heart rate in panic patients. Journal of Anxiety Disorders, 6, 213–​221. Shear, M. K., Rucci, P., Williams, J., Frank, E., Grochocinski, V., Vander Bilt, J., . . . Wang, T. (2001). Reliability and validity of the Panic Disorder Severity Scale: Replication and extension. Journal of Psychiatric Research, 35, 293–​296. Silverman, W. K., & Albano, A. M. (2004). Anxiety Disorders Interview Schedule (ADIS-​ IV) child/​ parent clinician manual. New York, NY: Oxford University Press. Simon, N. M., Otto, M. W., Smits, J. A., Nicolaou, D. C., Reese, H. E., & Pollack, M. H. (2004). Changes in anxiety sensitivity with pharmacotherapy for panic disorder. Journal of Psychiatric Research, 38, 491–​495. Skodol, A., Bender, D., Rush, A., & Zarin, D. (2000). Diagnostic interviews for adults. In Taskforce for the Handbook of Psychiatric Measures (Eds.), Handbook of psychiatric measures (pp. 45–​ 71). Washington, DC: American Psychiatric Association.


Smits, J. A., Powers, M. B., Cho, Y., & Telch, M. J. (2004). Mechanism of change in cognitive–​ behavioral treatment of panic disorder:  Evidence for the fear of fear mediational hypothesis. Journal of Consulting and Clinical Psychology, 72, 646. Spitzer, R., & Williams, J. (1984). Structural Clinical Interview for DSM-​III (SCID-​I). New  York, NY:  Biometrical Research Department, New  York State Psychiatric Institute. Spitzer, R. L., Williams, J. B., Gibbon, M., & First, M. B. (1992). The Structured Clinical Interview for DSM-​ III-​ R (SCID):  I. History, rationale, and description. Archives of General Psychiatry, 49, 624–​629. Stegemann, S. K., Ebenfeld, L., Lehr, D., Berking, M., & Funk, B. (2013). Development of a mobile application for people with panic disorder as augmentation for an Internet-​based intervention. Paper presented at the Federated Conference on Computer Science and Information Systems, Krakow, Poland. Steketee, G., Chambless, D. L., & Tran, G. Q. (2001). Effects of Axis I and II comorbidity on behavior therapy outcome for obsessive–​compulsive disorder and agoraphobia. Comprehensive Psychiatry, 42, 76–​86. Tanay, G., Lotan, G., & Bernstein, A. (2012). Salutary proximal processes and distal mood and anxiety vulnerability outcomes of mindfulness training:  A pilot preventive intervention. Behavior Therapy, 43, 492–​505. Taylor, S. (2014). Anxiety sensitivity:  Theory, research, and treatment of the fear of anxiety. New York, NY: Routledge. Taylor, S., Zvolensky, M. J., Cox, B. J., Deacon, B., Heimberg, R. G., Ledley, D. R., . . . Stewart, S. H. (2007). Robust dimensions of anxiety sensitivity:  Development and initial validation of the Anxiety Sensitivity Index-​ 3. Psychological Assessment, 19, 176. Treanor, M., Erisman, S. M., Salters-​Pedneault, K., Roemer, L., & Orsillo, S. M. (2011). Acceptance-​based behavioral therapy for GAD:  Effects on outcomes from three theoretical models. Depression and Anxiety, 28, 127–​136. van Beek, N., Schruers, K. R., & Griez, E. J. (2005). Prevalence of respiratory disorders in first-​degree relatives of panic disorder patients. Journal of Affective Disorders, 87, 337–​340. van Boeijen, C. A., van Oppen, P., van Balkom, A. J., Visser, S., Kempe, P. T., Blankenstein, N., & van Dyck, R. (2005). Treatment of anxiety disorders in primary care practice: A randomised controlled trial. British Journal of General Practice, 55, 763–​769. Verburg, K., Griez, E., Meijer, J., & Pols, H. (1995). Respiratory disorders as a possible predisposing factor for panic disorder. Journal of Affective Disorders, 33, 129–​134. Vickers, K., Jafarpour, S., Mofidi, A., Rafat, B., & Woznica, A. (2012). The 35% carbon dioxide test in stress and panic


research: Overview of effects and integration of findings. Clinical Psychology Review, 32, 153–​164. Wade, W. A., Treat, T. A., & Stuart, G. L. (1998). Transporting an empirically supported treatment for panic disorder to a service clinic setting: A benchmarking strategy. Journal of Consulting and Clinical Psychology, 66, 231. Watson, D., & Clark, L. A. (1984). Negative affectivity: The disposition to experience aversive emotional states. Psychological Bulletin, 96, 465. Watts, F. N., Trezise, L., & Sharrock, R. (1986). Processing of phobic stimuli. British Journal of Clinical Psychology, 25, 253–​259. Wheaton, M. G., Deacon, B. J., McGrath, P. B., Berman, N. C., & Abramowitz, J. S. (2012). Dimensions of anxiety sensitivity in the anxiety disorders: Evaluation of the ASI-​3. Journal of Anxiety Disorders, 26, 401–​408. White, K. S., Brown, T. A., Somers, T. J., & Barlow, D. H. (2006). Avoidance behavior in panic disorder:  The moderating influence of perceived control. Behaviour Research and Therapy, 44, 147–​157. Williams, J., Gibbon, M., First, M. B., Spitzer, R. L., Davies, M., Borus, J., . . . Rounsaville, B. (1992). The Structured Clinical Interview for DSM-​ III-​ R (SCID):  Multisite test–​retest reliability. Archives of General Psychiatry, 49, 630–​636. Williams, J., Spitzer, R., & Gibbon, M. (1992). International reliability of a diagnostic intake procedure for panic disorder. American Journal of Psychiatry, 149, 560–​562. Williams, S. L., & Falbo, J. (1996). Cognitive and performance-​ based treatments for panic attacks in people with varying degrees of agoraphobic disability. Behaviour Research and Therapy, 34, 253–​264. Williams, S. L., Kinney, P. J., & Falbo, J. (1989). Generalization of therapeutic changes in agoraphobia:  The role of perceived self-​ efficacy. Journal of Consulting and Clinical Psychology, 57, 436. Williams, S. L., Kinney, P. J., Harap, S. T., & Liebmann, M. (1997). Thoughts of agoraphobic people during scary tasks. Journal of Abnormal Psychology, 106, 511. Williams, S. L., & Zane, G. (1989). Guided mastery and stimulus exposure treatments for severe performance anxiety in agoraphobics. Behaviour Research and Therapy, 27, 237–​245. Wittchen, H.-​U., Nocon, A., Beesdo, K., Pine, D. S., Höfler, M., Lieb, R., & Gloster, A. T. (2008). Agoraphobia and panic. Psychotherapy and Psychosomatics, 77, 147–​157. Wolpe, J. (1968). Psychotherapy by reciprocal inhibition. Conditional Reflex: A Pavlovian Journal of Research & Therapy, 3, 234–​240. Wuyek, L. A., Antony, M. M., & McCabe, R. E. (2011). Psychometric properties of the panic disorder severity scale:  Clinician-​administered and self-​report versions. Clinical Psychology & Psychotherapy, 18, 234–​243.



Yamamoto, I., Nakano, Y., Watanabe, N., Noda, Y., Furukawa, T. A., Kanai, T.,  .  .  .  Kamijima, K. (2004). Cross-​cultural evaluation of the Panic Disorder Severity Scale in Japan. Depression and Anxiety, 20, 17–​22. Yonkers, K. A., Bruce, S. E., Dyck, I. R., & Keller, M. B. (2003). Chronicity, relapse, and illness—​ Course of panic disorder, social phobia, and generalized anxiety disorder: Findings in men and women from 8 years of follow-​up. Depression and Anxiety, 17, 173–​179. Zanarini, M. C., & Frankenburg, F. R. (2001). Attainment and maintenance of reliability of Axis I and II disorders over the course of a longitudinal study. Comprehensive Psychiatry, 42, 369–​374. Zanarini, M. C., Skodol, A. E., Bender, D., Dolan, R., Sanislow, C., Schaefer, E.,  .  .  .  McGlashan, T. H. (2000). The Collaborative Longitudinal Personality Disorders Study: Reliability of Axis I and II diagnoses. Journal of Personality Disorders, 14, 291–​299. Zarate, R., Rapee, R., Craske, M., & Barlow, D. (1988). Response-​ norms for symptom induction procedures. Poster presented at the 22nd annual convention of the Association for the Advancement of Behavior Therapy, New York, NY. Zebb, B. J., & Moore, M. C. (1999). Another look at the psychometric properties of the Anxiety Control

Questionnaire. Behaviour Research and Therapy, 37, 1091–​1103. Zimmerman, M., & Mattia, J. I. (2000). Principal and additional DSM-​ IV disorders for which outpatients seek treatment. Psychiatric Services, 51, 1299–​1304. Zinbarg, R. E., & Barlow, D. H. (1996). Structure of anxiety and the anxiety disorders: A hierarchical model. Journal of Abnormal Psychology, 105, 181. Zinbarg, R. E., Barlow, D. H., & Brown, T. A. (1997). Hierarchical structure and general factor saturation of the Anxiety Sensitivity Index:  Evidence and implications. Psychological Assessment, 9, 277. Zvolensky, M. J., & Eifert, G. H. (2001). A review of psychological factors/​processes affecting anxious responding during voluntary hyperventilation and inhalations of carbon dioxide-​ enriched air. Clinical Psychology Review, 21, 375–​400. Zvolensky, M. J., Kotov, R., Antipova, A. V., & Schmidt, N. B. (2005). Diathesis stress model for panic-​related distress:  A test in a Russian epidemiological sample. Behaviour Research and Therapy, 43, 521–​532. Zvolensky, M. J., McNeil, D. W., Porter, C. A., & Stewart, S. H. (2001). Assessment of anxiety sensitivity in young American Indians and Alaska Natives. Behaviour Research and Therapy, 39, 477–​493.

14

Generalized Anxiety Disorder

Michel J. Dugas
Catherine A. Charette
Nicole J. Gervais

The main goal of this chapter is to present a comprehensive assessment option for clinicians working with clients with generalized anxiety disorder (GAD). The chapter begins with a discussion of the nature of GAD. Specifically, we review the history of its diagnostic criteria and then summarize the data on the disorder's onset and course, etiology, prevalence, sex and age differences, comorbidity, and associated costs. We then provide a detailed description of assessment strategies for identifying GAD, for treatment planning, and for treatment monitoring and outcome. Finally, we discuss the current status of assessment methods for GAD and suggest ways of enhancing our ability to measure the symptoms and associated features of this prevalent and costly anxiety disorder.

NATURE OF GENERALIZED ANXIETY DISORDER

History of the Diagnostic Criteria The diagnostic category of GAD has undergone much change since its debut in the third edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​III; American Psychiatric Association [APA], 1980). In DSM-​ III, GAD was considered a residual disorder characterized by persistent anxiety occurring for at least 1  month and accompanied by symptoms from three out of four categories (i.e., motor tension, autonomic hyperactivity, apprehensive expectation, and vigilance/​scanning). In contrast, in the fifth edition of the DSM (DSM-​5; APA, 2013), GAD is described as a chronic condition (minimum duration of 6  months) involving excessive and uncontrollable worry and anxiety about a number of events or activities and

leading to significant distress or impairment in important areas of functioning. In addition, the diagnosis of GAD requires at least three of six somatic symptoms:  restlessness or feeling keyed up or on edge, being easily fatigued, difficulty concentrating or mind going blank, irritability, muscle tension, and sleep disturbance. This definition reflects attempts to identify features that are specific to GAD, including muscle tension and excessive worry about several events or activities. In general, worry is common among individuals with anxiety disorders; however, for those with anxiety disorders other than GAD, the content of their worry tends to be confined to topics related to their specific disorder. For example, individuals with social anxiety disorder may worry about how others perceive them, and those with obsessive–​compulsive disorder with checking compulsions may worry about whether or not they locked the front door. It is worth noting that the diagnostic criteria for GAD were essentially unchanged from DSM-​IV (APA, 1994) to DSM-​5 (APA, 2013), even though the DSM-​5 Anxiety, Obsessive–​ Compulsive Spectrum, Posttraumatic, and Dissociative Disorders Work Group suggested a number of major modifications (see Andrews et  al., 2010). For example, the working group suggested that the terms “generalized worry disorder,” “major worry disorder,” and “pathological worry disorder” might better capture the main clinical feature of individuals with GAD (i.e., excessive worry). They also proposed that the minimal duration of GAD be reduced to 3 months from 6 months. The reasoning behind this proposal was that it can sometimes be difficult for a person to recall if his or her worry was excessive 6 months ago, especially if that person is a child (Starcevic, Portman, & Beck, 2012). The working group




further suggested that retaining only two associated symptoms in the DSM-​5 (restlessness or feeling keyed up or on edge, and muscle tension) could increase the discriminant validity of GAD. Finally, and perhaps most important, the working group noted that the addition of behavioral symptoms to the DSM definition of GAD could markedly improve the diagnostic reliability of the disorder. As such, four behavioral symptoms were proposed:  avoidance, overpreparation, procrastination, and reassurance seeking. These behavioral symptoms, which can be thought of as safety-​seeking behaviors, have been shown to be associated with GAD (Starcevic et  al., 2012). In addition, given that individuals with GAD appear to engage in these safety-​seeking behaviors in an effort to increase their feelings of certainty (Andrews et al., 2010), the suggested behaviors are consistent with current conceptualizations of GAD as being rooted in intolerance of uncertainty (Bennett-​Levy et  al., 2004; Clark & Beck, 2010; Dugas & Robichaud, 2007). Unfortunately, although the suggestions of the working group held the promise of increasing the diagnostic validity and reliability of GAD, ultimately none were retained for the DSM-​5. Onset and Course Some research suggests that there can be both an early and a late onset of GAD, with the early onset occurring between the ages of 11 years and the early 20s, and the late onset typically occurring in middle adulthood (Blazer, Hughes, & George, 1987; Brown, Barlow, & Liebowitz, 1994). According to these studies, although a significant minority experience a late onset of GAD, an early onset is more common, with approximately two-​thirds of individuals with GAD developing the disorder by their early 20s. Of note, Kessler and colleagues (2005) reported a slightly different pattern of results. Specifically, the authors found evidence for a steady increase in the onset of GAD during the early 20s (which is consistent with the findings reported previously); but rather than finding evidence of a late onset, Kessler and colleagues found moderately lower rates of onset between the ages of 31 and 47 years and dramatically lower rates after the age of 47 years. Therefore, the data presented by Kessler and colleagues support a peak onset age in the early 20s but find no evidence for a later peak. The symptoms of GAD rarely remit completely without treatment (Stein, 2004). In the Harvard/​Brown Anxiety Research Program (HARP), a prospective study examining the course of anxiety disorders, GAD remission rates were examined in a large number of patients.

The results showed that 15% of patients with GAD remitted after 1 year, 25% after 2 years, and 38% after 5 years (Yonkers, Warshaw, Massion, & Keller, 1996). Overall, it appears that GAD is characterized by the fluctuation of symptoms over time in response to life stressors, with episodes of the disorder commonly persisting for more than 10 years (Kessler, Keller, & Wittchen, 2001; Stein, 2004). Thus, the general consensus is that GAD is a chronic condition that is unlikely to remit unless treated. Etiology Biological, environmental, and psychological factors all play a role in the development and maintenance of GAD. Biological factors include genetic predisposition and alterations in neurotransmitter function. Genetic predisposition plays a relatively modest role in the development of GAD, with research suggesting that the disorder has a heritability of 15% to 30% (Hettema, Prescott, & Kendler, 2001; Kendler, Neale, Kessler, Heath, & Eaves, 1992). In addition, genetic predisposition appears to be nonspecific in that individuals who are at higher risk of developing GAD are also more likely to develop other anxiety and mood disorders (Andrews, Stewart, Morris-​Yates, Holt, & Henderson, 1990). It is likely that genetic predisposition interacts with environmental and psychological factors to determine if a given individual will in fact develop GAD and/​or another disorder. Alterations in neurotransmitter function also appear to be involved in GAD. However, the exact mechanisms by which alterations in neurotransmitter function affect the development and maintenance of GAD have yet to be clearly understood (Gorman, 2002). Research into environmental factors suggests that the interactions between young children and their parents (or caregivers) also play a role in the development of GAD. Whereas a number of studies show that children with insecure attachments to their parents are at risk of developing GAD (for a review, see Hudson & Rapee, 2004), other studies show that high levels of enmeshment characterize the childhood experiences of adults with GAD (Lichtenstein & Cassidy, 1991; Peasley, Molina, & Borkovec, 1994). In this context, enmeshed relationships refer to the children attending to the needs of their parents, without necessarily having their own needs met. In other words, the parent–​child relationship is marked by role reversal, with the child “taking care” of the parent. Many psychological factors also appear to play a role in the development and maintenance of GAD (Borkovec, Alcaine, & Behar, 2004; Mennin, Heimberg, Turk, & Fresco, 2002; Wells & Carter, 1999). Our research group


has developed a cognitive model of GAD that has four main features: intolerance of uncertainty, positive beliefs about worry, negative problem orientation, and cognitive avoidance (Dugas, Gagnon, Ladouceur, & Freeston, 1998). According to this model, deep-​ seated negative beliefs about uncertainty (which manifest as intolerance of uncertainty) lead to biases in cognitive processing, contribute to the other model components, and ultimately lead to the development and maintenance of GAD (Dugas & Koerner, 2005). Research suggests that intolerance of uncertainty is not only closely related to GAD but also plays a causal role in GAD (Ladouceur, Gosselin, & Dugas, 2000). Research also shows that although intolerance of uncertainty is the central cognitive process of our model, all model components nonetheless make significant and unique contributions to the prediction of GAD symptoms (Dugas et al., 1998).


Prevalence, Sex Differences, and Age

GAD is highly prevalent in the general population and in clinical settings. In the Canadian Census Survey, 8.7% of a representative community sample reported symptoms consistent with lifetime GAD, and 2.6% met criteria for GAD in the past 12-month period (Pearson, Janz, & Ali, 2013). Similarly, the APA (2013) reported that the 12-month prevalence of GAD in the general population of the United States is 2.9%. In clinical populations, the prevalence rate is higher, with nearly 8% of all patients seeking treatment meeting diagnostic criteria for GAD (Maier et al., 2000). Finally, Wittchen and colleagues (2002) argued that there is evidence that GAD is the most frequent anxiety disorder and the second most frequent of all mental disorders in clinical settings.

In terms of sex differences, GAD is more common among women than men (Blazer, Hughes, George, Swartz, & Boyer, 1991; Hunt, Issakidis, & Andrews, 2002), which is consistent with the pattern found in most other anxiety disorders (Kessler et al., 1994). For example, Wittchen, Zhao, Kessler, and Eaton (1994) found that women were approximately twice as likely as men to have had GAD at some point in their lives, with a reported lifetime prevalence of 6.6% for women and 3.6% for men. According to the APA (2013), the ratio of females to males experiencing GAD is 2:1.

Studies examining GAD in older adults suggest that it is one of the most prevalent disorders in that population, with some authors reporting a steady increase in the rate of GAD that extends beyond the age of 65 years (Beekman et al., 1998; Carter, Wittchen, Pfister, & Kessler, 2001). Dugas and Robichaud (2007) suggest that such findings can be explained by the many health complications older adults typically experience; as such, these complications can obscure the presence of GAD, especially when symptoms of the health problem are similar to those seen in GAD. Given that GAD is a chronic disorder that typically commences in the early 20s and rarely remits on its own, it seems obvious that middle-aged adults would have a higher likelihood of having GAD than would younger adults. However, more research examining the presence of GAD in adults aged 65 years or older is necessary to clarify whether the rates of this disorder continue to increase in this population.

Comorbidity and Cost

Although GAD can present in individuals without other disorders (Wittchen et al., 1994), it is most commonly accompanied by other mental health conditions. Carter and colleagues (2001) reported a 12-month prevalence rate of 93% for other mental health disorders among individuals from the general population meeting DSM-IV criteria for GAD. This included 71% for any mood disorder and 56% for any anxiety disorder. In addition, individuals meeting GAD criteria were more likely to have two or more comorbid conditions rather than just one comorbid condition. Given its high comorbidity rate, many have suggested that GAD is not a distinct disorder but, rather, a condition that promotes the development of anxiety or mood disorders (Akiskal, 1998; Maser, 1998; Roy-Byrne & Katon, 1997). However, this position has essentially been rejected because there is much evidence supporting the notion that GAD is a distinct diagnostic category (e.g., Brown, Chorpita, & Barlow, 1998; Maier et al., 2000). For example, although the comorbidity rate for GAD is quite high, it is in fact comparable to those of other anxiety and mood disorders (Holaway, Rodebaugh, & Heimberg, 2006). In addition, with the exception of depression (Kessler, Walters, & Wittchen, 2004), GAD does not systematically precede or follow the onset of other disorders (although GAD typically precedes depression, it follows depression in a significant minority of cases). Other than anxiety and mood disorders, personality disorders have also been found to occur frequently in individuals with GAD (Grant, Hasin, Stinson, Dawson, Chou, et al., 2005), with avoidant, obsessive–compulsive, and dependent personality disorders being the most common (Dyck et al., 2001).

Compared to non-comorbid GAD (also referred to as pure GAD), comorbid GAD is associated with a greater



likelihood of impairment, help seeking, and medication use (Kessler DuPont, Berglund, & Wittchen, 1999; Wittchen et  al., 1994). In a study conducted by Grant, Hasin, Stinson, Dawson, Ruan, et al. (2005), individuals with co-​occurring GAD and major depressive disorder (MDD) diagnoses reported a lower health-​related quality of life compared to those diagnosed with GAD or MDD alone. However, pure GAD is also associated with significant impairment, which is comparable to that found in major depression (Kessler et al., 1999). GAD is costly not only to the individual but also to society because it often leads to decreases in work productivity and higher utilization of health care services (Wittchen & Hoyer, 2001). Despite the disproportionate use of primary care facilities, many individuals with GAD avoid seeking proper treatment for as long as 25 years (Rapee, 1991). Furthermore, GAD is the anxiety disorder with the lowest diagnostic reliability (Brown, DiNardo, Lehman, & Campbell, 2001), making the disorder difficult to identify in help-​seeking individuals. However, as discussed later, there have been many recent advances in the diagnostic tools available for the assessment of GAD, which have improved the diagnostic reliability of this disorder. Summary GAD is a prevalent, chronic, and disabling disorder that has undergone numerous revisions in diagnostic criteria since its introduction in the DSM. Considering the many factors that can complicate the assessment of GAD, it is imperative that clinicians use a sound assessment strategy for diagnosing this disorder. In addition, it is important to assess for the presence of comorbid conditions (including mental health and physical conditions) because they can influence diagnostic and treatment decisions that relate to GAD. In the following sections, we provide a thorough review of the diagnostic tools with the strongest empirical support. We then present the measures that can be used for case formulation and treatment planning. Finally, we offer suggestions for the assessment of treatment progress and outcome.

PURPOSES OF ASSESSMENT OF GAD

For the most part, the measures currently available for the assessment of GAD consist of semi-​structured interviews and self-​report questionnaires. Given that different research groups have developed self-​report questionnaires that are specific to their conceptualization of GAD, not

all GAD measures are reviewed in this chapter. In the sections on assessment for case conceptualization/​treatment planning and treatment monitoring/​treatment outcome, we focus our presentation of instruments on (a)  constructs common to most models of GAD and (b) specific components of our model of GAD, namely intolerance of uncertainty, positive beliefs about worry, negative problem orientation, and cognitive avoidance (Dugas & Robichaud, 2007). Validated measures of GAD components are available for other models of GAD, and the interested reader can refer to Borkovec et al. (2004), Mennin et al. (2002), or Wells (2009) for more information.

ASSESSMENT FOR THE DIAGNOSIS OF GAD

Before conducting the psychological assessment of a client, the clinician should ensure that a medical examination has been conducted by a physician to rule out conditions that can produce symptoms that resemble those of GAD (e.g., hyperthyroidism, hypoglycemia, and anemia). Furthermore, either the physician or the clinician should obtain information about family history of both medical problems and mental health conditions. Once a complete picture of the client’s physical state is obtained, the clinician should then proceed with the psychological assessment. In the following paragraphs, we discuss both self-​report measures and semi-​structured interviews that assess for the presence of mental health conditions including GAD. The two self-​report measures to be presented are screening tools for GAD that can be used prior to conducting a semi-​ structured interview. We recommend using semi-​structured interviews rather than unstructured clinical interviews because the former are likely to produce diagnoses that are more reliable. Concerns have been raised regarding the precision and rigor of unstructured interviews because they tend to produce lower comorbidity rates relative to semi-​structured interviews (Miller, Dasher, Collins, Griffiths, & Brown, 2001; Zimmerman & Mattia, 1999). Because semi-​ structured interviews encourage clinicians to inquire about a broad range of disorders, they reduce the risk that clinicians will overlook comorbid disorders. This is an especially important consideration for individuals with GAD because many present for assessment without realizing that it is worthwhile to mention their excessive and uncontrollable worry. Furthermore, individuals with GAD may be seeking help for problems that are the result of their worrying, such as painful muscle tension. However, when they are specifically asked about the experience of

TABLE 14.1  Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
WAQ | G | NR | NA | A | G | A | A | A |
GAD-Q | G | E | NA | A | G | A | NR | A |
ADIS-IV | NA | NA | A | NA | A | A | G | E |
SCID-I/P | NA | NA | A | NA | A | A | G | E |

Note: WAQ = Worry and Anxiety Questionnaire; GAD-Q = Generalized Anxiety Disorder Questionnaire; ADIS-IV = Anxiety Disorders Interview Schedule for DSM-IV; SCID-I/P = Structured Clinical Interview for DSM-IV-TR for Axis I disorders, Patient Edition. A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

worry, these individuals readily acknowledge its importance. There are, however, certain limitations to using semi-​structured interviews, the most obvious being the time required to conduct them. In addition, practice is required before the clinician may feel fully comfortable in the use of such interviews. Despite these limitations, semi-​structured interviews are clearly superior to unstructured clinical interviews in terms of obtaining reliable information about a broad array of symptom constellations. A  summary of the psychometric properties of the self-​report measures and the previous versions of the semi-​ structured interviews reviewed in this section is presented in Table 14.1. Self-​Report Measures The following two self-​report measures can be used to screen for GAD prior to administering a semi-​structured interview. Given that the two provide similar information, it is recommended that the clinician choose one. The reader should keep in mind that although the symptoms covered by self-​report measures are similar to those covered by semi-​structured interviews, self-​report can by no means replace a diagnostic interview. Semi-​ structured interviews provide the clinician with more comprehensive, valid, and reliable information for the formal diagnosis of GAD. However, as mentioned previously, self-​report questionnaires can be used to provide initial information on the symptoms of GAD prior to administering an interview. Worry and Anxiety Questionnaire The Worry and Anxiety Questionnaire (WAQ; Dugas et al., 2001) is an 11-​item self-​report questionnaire assessing DSM-​5 (and DSM-​IV) diagnostic criteria for GAD. The WAQ assesses worry themes, the degree of excessiveness and uncontrollability of worry, the length of time that

the person has experienced excessive worry, the severity of GAD physical symptoms, and the degree of interference and distress related to the worry and anxiety. With the exception of the first item, which asks respondents to list their worry themes, items are rated on a 9-​point Likert scale. The WAQ can be scored categorically (Dugas et al., 2001) or continuously (Deschênes & Dugas, 2013). Overall, scores on the WAQ have demonstrated adequate to good psychometric properties. For example, Beaudoin and colleagues (1997) found the test–​retest reliability of WAQ scores at 4 weeks to be adequate (r = .76). There is evidence of good content validity and adequate construct validity (Dugas et al., 2001). In addition, there is good normative data available for the WAQ in nonclinical and clinical samples (Buhr & Dugas, 2002; Dugas et al., 2007). Both Buhr and Dugas (2002) and Dugas and colleagues (2007) report a gender difference, which is not surprising given that women tend to report more worry and anxiety symptoms than do men in the general population (Robichaud, Dugas, & Conway, 2003). Generalized Anxiety Disorder Questionnaire The Generalized Anxiety Disorder Questionnaire (GAD-​Q; Newman et al., 2002) is the most commonly used self-​report measure assessing for the presence of DSM-​ 5 (previously DSM-​ IV) diagnostic criteria for GAD. The GAD-​Q consists of nine items, the majority of which inquire about the presence or absence of specific symptoms of GAD (dichotomous response scale), including whether the respondent experiences excessive and uncontrollable worry as well as any of the six GAD physical symptoms. There are, however, two items on the GAD-​Q that are rated on a 9-​point Likert scale. These items assess the severity of functional impairment and subjective distress that result from the worry and anxiety. In addition, the GAD-​Q contains one item asking respondents to list their most frequent worry topics.
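Before turning to scoring, it can help to hold the two screeners' item structures side by side. The short Python sketch below simply encodes the formats described above: an open-ended worry-theme item plus 9-point ratings for the WAQ, and mostly dichotomous symptom items plus two 9-point severity ratings and an open-ended worry-topic item for the GAD-Q. The field names and item labels are our own illustrative shorthand, not the instruments' wording, and scoring should follow the original validation articles.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ScreenerItem:
    label: str            # shorthand label, not the instrument's actual wording
    response_format: str  # "open-ended", "dichotomous", or "likert-9"

@dataclass
class Screener:
    name: str
    items: List[ScreenerItem]

# WAQ: item 1 asks respondents to list their worry themes; the remaining
# 10 items are rated on a 9-point Likert scale.
waq = Screener("WAQ",
               [ScreenerItem("Worry themes", "open-ended")]
               + [ScreenerItem(f"WAQ item {i}", "likert-9") for i in range(2, 12)])

# GAD-Q: nine items in total -- mostly dichotomous symptom questions, plus
# two 9-point ratings (impairment and distress) and one open-ended item
# listing the respondent's most frequent worry topics.
gad_q = Screener("GAD-Q",
                 [ScreenerItem(f"Symptom item {i}", "dichotomous") for i in range(1, 7)]
                 + [ScreenerItem("Functional impairment", "likert-9"),
                    ScreenerItem("Subjective distress", "likert-9"),
                    ScreenerItem("Most frequent worry topics", "open-ended")])
```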



The GAD-​Q total score can be used to screen for GAD, and Newman and colleagues (2002) have suggested a clinical cut-​off score. For a detailed description of the scoring system for the GAD-​Q, refer to the validation article (Newman et al., 2002). Scores on the GAD-​Q have demonstrated adequate to excellent psychometric properties, including excellent internal consistency (α = .93; Luterek, Turk, Heimberg, Fresco, & Mennin, 2002), adequate test–​retest reliability at 2 weeks (κ = .64), good content validity, and adequate construct validity (Newman et  al., 2002). Furthermore, there is good normative data on the GAD-​Q as provided in Newman and colleagues’ (2002) article.

diagnostic criteria for more than one condition, the disorder that has the highest rating on the CSR is considered to be the primary diagnosis. Although we highly recommend the ADIS-​5 for the assessment of anxiety disorders, the interview is not without limitations. First, for reasons that are unclear to us, the ADIS-​5 does not assess all conditions that frequently co-​occur with anxiety disorders. For example, the interview does not allow for the assessment of eating disorders. Second, many of the sections of the ADIS-​5 are overly-​ detailed and include ancillary questions that provide little diagnostic information. Finally, the ADIS-​5 requires a considerable amount of time to administer, sometimes as much as 2 hours. Due to the recent development of the ADIS-​5, its Semi-​Structured Interviews psychometric properties have yet to be adequately evaluated. However, research suggests that the psychometric Anxiety and Related Disorders Interview properties of the previous version of the ADIS (ADIS-​IV; Schedule for DSM-​5 Brown, DiNardo, & Barlow, 1994)  are good. In a large The Anxiety and Related Disorders Interview Schedule reliability study conducted by Brown and colleagues for DSM-​5 (ADIS-​5; Brown & Barlow, 2014)  is a semi-​ (2001), 362 patients received two independent ADIS-​IV structured diagnostic interview designed to thoroughly interviews, and kappa values were calculated to obtain assess all anxiety disorders. The interview screens for inter-​rater agreement. In this study, the diagnosis of GAD many other conditions, including depressive, obsessive–​ demonstrated adequate inter-​rater agreement. Brown and compulsive, trauma-​ related, somatic symptom, and colleagues also found that much of the diagnostic disagreesubstance-​ related disorders. The ADIS-​ 5 allows clini- ment frequently involved mood disorders, which have cians to dimensionally and functionally assess patient considerable symptom overlap with GAD. Furthermore, symptoms. The assessment is wide in its scope: It provides the ADIS-​IV has demonstrated adequate content and conscreening questions for many conditions and explores struct validity, good validity generalization, and excellent family psychiatric history and life stressors. The ADIS-​ clinical utility. 5 is available in two versions for adults:  (a) the current version, which assesses the presence of pathology at the Structured Clinical Interview for DSM-​5 Disorders time of the interview, and (b) the lifetime version, which assesses the presence of pathology at any point during the The Structured Clinical Interview for DSM-​5 Disorders, respondent’s life. Although the lifetime version of the Clinician Version (SCID-​5-​CV; First, Williams, Karg, & ADIS-​5 can be useful in that it provides clinicians with Spitzer, 2016) is an updated, current version of the SCID information regarding the temporal development of each assessing various psychiatric conditions, including anxiety condition, the ADIS-​5 current version generally suffices disorders, substance-​related disorders, somatic symptom for clinical diagnostic use. disorders, psychotic disorders, adjustment disorders, eatThe ADIS-​5 offers many advantages. For one, each ing disorders, and depressive disorders. The SCID-​5-​CV section begins with a screening question for a particu- assesses patients’ symptoms within the past month and lar disorder, followed by questions pertaining to specific over their lifetime, based on DSM-​5 criteria. 
The intersymptoms related to the disorder, which are rated on a 9-​ view takes 45 to 90 minutes to administer. Although the point Likert scale (0–​8). Another advantage of the ADIS-​ SCID-​5-​CV covers more disorders than does the ADIS-​5, 5 over other semi-​structured interviews is that it contains the measure is limited by its categorical, rather than cona Clinician’s Severity Rating (CSR) scale, which also tinuous, rating scale. In other words, clinicians using the consists of a 9-​point Likert scale (0–​8). The CSR allows SCID-​5-​CV can only determine if symptoms and disorthe clinician to evaluate the severity of each diagnosed ders are present or absent. The interview is also limited by condition. A  score of 4 or above indicates the presence the fact that it does not include items that directly address of a clinically significant disorder. When a patient meets the differential diagnosis of anxiety disorders. In contrast


to the SCID-​5-​CV, the ADIS-​5 provides a continuous rating scale for symptoms and disorders, as well as detailed questions that facilitate the differential diagnosis of anxiety disorders. Given the recent development of the SCID-​ 5-​CV, the psychometric properties of the interview have yet to be closely examined. However, research on the psychometric properties of prior versions of the SCID (Williams et al., 1992; Zanarini et al., 2000) suggests that the latest version will likely be shown to have adequate psychometric properties. For example, the SCID-​I/​P (the previous version of the SCID) has shown evidence of construct validity, validity generalization, and clinical utility (First & Gibbon, 2004). Thus, data from previous studies suggest that the latest version of the SCID may well have considerable clinical usefulness. Overall Evaluation As mentioned previously, two self-​report measures were described in this section, but only one of the two is required to screen for GAD. Therefore, the clinician must decide which screening tool to use. Despite evidence of similar psychometric properties, we suggest that the WAQ may be superior as a screening measure for GAD because it contains ratings for each item (with the exception of the first item, which requires the respondent to list worry topics), whereas the GAD-​Q possesses many (less sensitive) dichotomous items. Although other semi-​structured diagnostic interviews can be used to identify individuals with GAD (including briefer interviews), the ADIS-​5 and SCID-​5-​CV are excellent choices because their previous versions have considerable empirical support. Diagnosing GAD is quite a challenge because many difficulties can be encountered, including overlapping symptoms with other anxiety and mood disorders and also the high likelihood that the condition will be comorbid. Semi-​structured interviews such as the two described previously are extremely valuable because they require the clinician to go beyond the client’s presenting complaints, thus facilitating the identification of comorbid conditions. If the most psychometrically sound semi-​structured interviews are utilized, clinicians should find that many difficulties can be reduced. Although the interviews require a considerable amount of time to administer, they offer benefits that far outweigh their costs. As was recommended previously for the self-​report measures used to screen for the presence of GAD, the clinician will need to choose which of the two semi-​structured interviews to administer.
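Because the CSR conventions described above are easy to misapply when several diagnoses are in play, the minimal Python sketch below illustrates the bookkeeping: ratings of 4 or above on the 0–8 scale count as clinically significant, and the highest-rated of those is treated as the primary diagnosis. The function name and the example ratings are hypothetical; they are not part of the ADIS-5 materials.

```python
from typing import Dict, Optional, Tuple

def summarize_csrs(csrs: Dict[str, int], threshold: int = 4) -> Tuple[Optional[str], Dict[str, int]]:
    """Apply the CSR conventions described above to a set of 0-8 ratings.

    Ratings at or above `threshold` (4 by convention) indicate a clinically
    significant disorder; the highest-rated of these is the primary diagnosis.
    """
    clinical = {dx: csr for dx, csr in csrs.items() if csr >= threshold}
    primary = max(clinical, key=clinical.get) if clinical else None
    return primary, clinical

# Hypothetical example: GAD rated 6, social anxiety disorder 4, specific phobia 2.
primary, comorbid = summarize_csrs(
    {"GAD": 6, "Social anxiety disorder": 4, "Specific phobia": 2}
)
# primary == "GAD"; the phobia (CSR = 2) does not reach clinical significance.
```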


ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

In this section, we discuss assessment tools for the purposes of case conceptualization and treatment planning. We first present measures designed to assess worry severity, somatic anxiety, depression, and quality of life. Because somatic anxiety symptoms, depression, and poor quality of life can negatively impact treatment progression, clinicians should assess these factors prior to the start of treatment to determine whether they should be addressed during therapy. We then present measures of the cognitive processes central to our model of GAD (intolerance of uncertainty, positive beliefs about worry, negative problem orientation, and cognitive avoidance; see Dugas et al., 1998). Assessing these cognitive processes is essential for clinicians intending to use the treatment protocol based on the model and described in detail in Dugas and Robichaud (2007). Even for clinicians intending to use another treatment for GAD, we suggest that it would be useful to assess each of these cognitive processes because they have been implicated in the maintenance of GAD. Finally, although the WAQ was described in the previous section as a screening measure for GAD, it can also be used to assess the severity of GAD symptoms (much like the Beck Depression Inventory-​II for depressive symptoms). Given that the WAQ was presented in the previous section, it is not presented again here. Ratings of the psychometric properties of the instruments discussed in this section are presented in Table 14.2. Measure of Worry Severity Penn State Worry Questionnaire The Penn State Worry Questionnaire (PSWQ; Meyer, Miller, Metzger, & Borkovec, 1990)  is the most commonly used measure of worry, the cardinal feature of GAD. In fact, it is widely recognized as the gold standard for the measurement of excessive worry. Given that there is much normative data available for this questionnaire, it is clearly the best choice for clinicians because clients’ scores can be interpreted with relative ease (for a review, see Startup & Erickson, 2006). Although the available data include cut-​off scores for different populations, we recommend using this questionnaire simply as a measure of the severity of pathological worry. Interestingly, although women report more worry than do men in the general population, among GAD patients, there is no difference between the scores of women and men on the



TABLE 14.2   Instrument

Ratings of Instruments Used for Case Conceptualization and Treatment Planning Norms

Measure of Worry Severity PSWQ E

Internal Consistency

Inter-​Rater Reliability

Test–​Rest Reliability

Content Validity

Construct Validity

Validity Clinical Generalization Utility

Highly Recommended

E

NA

G

G

G

E

A



Measures of Anxiety, Depression, and Quality of Life BAI G E NA BDI-​II G E NA QLQ A G NA

A A A

A A NR

A A NR

G G NR

G G A

✓ ✓

QOLI

G

NA

A

NR

A

NR

NR

Measures of GAD Cognitive Processes IUS G E WW-​II A E NPOQ NR E CAQ A E

NR

NA NA NA NA

A A A A

G A G G

G A G G

G G NR A

G G A G

✓ ✓ ✓ ✓

Note: PSWQ = Penn State Worry Questionnaire; BAI = Beck Anxiety Inventory; BDI-​II = Beck Depression Inventory, Second Edition; QLQ = Quality of Life Questionnaire; QOLI = Quality of Life Inventory; IUS = Intolerance of Uncertainty Scale; WW-​II = Why Worry-​II; NPOQ = Negative Problem Orientation Questionnaire; CAQ = Cognitive Avoidance Questionnaire; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

PSWQ. There also seems to be no major difference in PSWQ scores across many ethnic and cultural groups. However, age does appear to influence level of worry on the PSWQ, as younger individuals have consistently been found to report more worry relative to older adults (Startup & Erickson, 2006). The 16 items from the PSWQ are rated on a 5-​point Likert scale. Examples of the items include “My worries overwhelm me” and “I know I  shouldn’t worry about things but I  just can’t help it.” Five items are reversed scored (e.g., “I find it easy to dismiss worrisome thoughts”). The psychometric properties of the PSWQ scores range from adequate to excellent. The internal consistency is excellent in both clinical and nonclinical samples. The PSWQ scores have also shown good test–​retest reliability over periods of 2 to 10 weeks. In addition, the PSWQ shows evidence of content and construct validity (Molina & Borkovec, 1994; Startup & Erickson, 2006). Finally, given that there is considerable data supporting use of the PSWQ in many different groups and across multiple contexts, it can be concluded that the measure shows excellent validity generalization. Anxiety, Depression, and Quality of Life In addition to assessing the severity of GAD symptoms, clinicians should also acquire information about somatic anxiety, depression, and quality of life. Although somatic anxiety is more characteristic of panic disorder, it is not

uncommon for individuals with GAD to experience “panic-​ like” symptoms. Therefore, it is important to establish whether a particular client with GAD experiences somatic anxiety, as it may need to be addressed during treatment. Furthermore, given that GAD generally has a chronic and unremitting course, it should come as no surprise that many individuals with GAD come to experience symptoms of demoralization and depression. Consequently, it is standard procedure to assess depressive symptoms in a comprehensive assessment protocol for GAD. Finally, individuals with GAD typically experience poor quality of life, which can often be attributed to the distress and interference that result from worry and associated symptoms. As discussed in the section on treatment outcome, clinicians should also assess somatic anxiety, depression, and quality of life at the end of treatment because this can provide additional information on the effects of treatment. Beck Anxiety Inventory Although a high level of somatic anxiety (i.e., autonomic arousal) does not characterize GAD, its assessment is recommended for treatment planning. In fact, our clinical experience suggests that somatic symptoms of anxiety are more common in GAD than is generally acknowledged. The Beck Anxiety Inventory (BAI; Beck, Epstein, Brown, & Steer, 1988) is a 21-​item measure that assesses the severity of somatic anxiety symptoms. BAI items are rated on a


4-​point scale, with respondents indicating the degree to which they have been bothered by each symptom during the past week. Because the BAI was designed to discriminate anxiety from depressive symptoms, the majority of its items describe symptoms of autonomic arousal and panic (e.g., “heart pounding or racing” and “hands trembling”). Overall, scores on the BAI have demonstrated adequate to excellent psychometric properties. The BAI scores have excellent internal consistency (Beck et al., 1988; de Beurs, Wilson, Chambless, Goldstein, & Feske, 1997; Fydrich, Dowdall, & Chambless, 1992)  and adequate test–​retest reliability over a 5-​week period (r = .83; de Beurs et al., 1997). The scores also show evidence of content and construct validity, as well as good validity generalization (Beck et  al., 1988; de Beurs et  al., 1997; Fydrich et  al., 1992; Gillis, Haaga, & Ford, 1995). Finally, because the BAI has been widely used, it has well-​established norms in both clinical and nonclinical samples (Beck & Steer, 1990; Gillis et al., 1995). Beck Depression Inventory, Second Edition The Beck Depression Inventory, Second Edition (BDI-​II; Beck, Steer, & Brown, 1996) is a 21-​item self-​report measure of the severity of depressive symptoms. Each item contains four options referring to the degree to which the symptom is experienced; respondents are asked to indicate which of the options best describes them during the past 2 weeks. For items assessing symptoms involving a change in either direction (e.g., sleep disturbance includes either insomnia or hypersomnia), additional options are included to account for both the increase and the decrease in the symptom. The BDI-​II assesses all DSM-​5 diagnostic criteria for depression. As is the case for the BAI, there exist ample normative data for the BDI-​II (see Beck et al., 1996; Steer & Clark, 1997). The BDI-​II scores have demonstrated excellent internal consistency and adequate test–​retest reliability at 1 week (Beck et al., 1996). In addition, the BDI-​II shows evidence of content and construct validity and also adequate validity generalization (Beck et al., 1996). Quality of Life Questionnaire The Quality of Life Questionnaire (QLQ; Léger, Freeston, Dugas, & Ladouceur, 1998) is a 31-​item measure assessing eight life domains:  health, family, activity, finance, community, work, goals, and security. Each item is rated initially on a 4-​point scale assessing level of satisfaction


and then rated on a second 4-​point scale assessing level of importance of each domain. The psychometric properties of scores on the QLQ include good internal consistency (α  =  .86) and adequate test–​retest reliability at 6 weeks (r  =  .86; Labrecque, Leblanc, Kirouac, Marchand, & Stephenson, 2001). In addition, normative data are available for both clinical and nonclinical samples (Labrecque et al., 2001). Quality of Life Inventory The Quality of Life Inventory (QOLI; Frisch, 1994) is a 17-​ item measure assessing quality of life. Each item refers to a different life domain. Respondents are asked to indicate the level of importance of each domain using a 3-​point Likert scale and to rate their overall satisfaction with each life domain using a 7-​point scale. The total score is calculated by averaging the weighted scores of the life domains deemed to be relevant by the individual. The psychometric properties of the QOLI scores include good internal consistency, adequate test–​ retest reliability over 2 or 3 weeks, and adequate construct validity (Frisch, Cornell, Villanueva, & Retzlaff, 1992). Despite using the QLQ in our clinical research, we equally recommend using the QOLI. Currently, more research is needed to establish the psychometric properties for each measure and so, for this reason, we are unable to recommend one over the other. GAD Cognitive Processes As mentioned previously, this section focuses on the measurement of the four cognitive processes germane to our model of GAD (see Dugas et al., 1998). During the past 20 years, we have developed and validated self-​report questionnaires for each of the model’s four components (intolerance of uncertainty, positive beliefs about worry, negative problem orientation, and cognitive avoidance). Overall, the measures have proven to be clinically useful, not only in terms of identifying treatment mechanisms but also in terms of predicting the maintenance of treatment gains. The four measures described below are the Intolerance of Uncertainty Scale, the Why Worry-​II, the Negative Problem Orientation Questionnaire, and the Cognitive Avoidance Questionnaire. Intolerance of Uncertainty Scale. Intolerance of uncertainty (IU) is a negative dispositional characteristic that results from deep-​seated catastrophic



beliefs about uncertainty (Dugas & Robichaud, 2007). Within our cognitive model of GAD, IU is the central component and is believed to contribute to the development and maintenance of GAD both directly and indirectly (via its impact on cognitive processing and on the other model components). The Intolerance of Uncertainty Scale (IUS; French version:  Freeston, Rhéaume, Letarte, Dugas, & Ladouceur, 1994; English translation: Buhr & Dugas, 2002) is a 27-​item self-​report measure that assesses two beliefs about uncertainty:  (a) Uncertainty has negative personal implications (“When I am uncertain, I can’t go forward”), and (b) uncertainty is unfair and spoils everything (“A small unforeseen event can spoil everything, even with the best of planning”) (Sexton & Dugas, 2009). Items on the IUS are rated on a 5-​point Likert scale, and the measure takes between 5 and 10 minutes to complete. Scores on the English version of the IUS have demonstrated adequate to excellent psychometric properties. The psychometric properties of the English translation are consistent with those of the original French version of the scale (Freeston et  al., 1994). For example, the IUS scores have demonstrated excellent internal consistency (α = .94), adequate test–​retest reliability at 5 weeks (r  =  .74), good content and construct validity, and good validity generalization (Buhr & Dugas, 2002, 2006). Finally, normative data have been presented elsewhere (Buhr & Dugas, 2002; Dugas et al., 2007). Carleton, Norton, and Asmundson (2007) have validated a short form of the IUS. The IUS-​12 is made up of 12 items derived from the original IUS. Like the full-​scale IUS, the IUS-​12 has a two-​factor structure:  (a) prospective IU (similar to Factor 2 from the original IUS) and (b)  inhibitory IU (similar to Factor 1 from the original IUS). The IUS-​12 has a strong correlation with the original scale (r = .94 to .96) (Carleton et al., 2007; Khawaja & Yu, 2010). The IUS-​ 12 scores have demonstrated excellent internal consistency and have shown evidence of convergent and discriminant validity (Carleton et  al., 2007; McEvoy & Mahoney, 2011). Given that the psychometric properties of the briefer IUS-​12 are similar to those of the full-​scale IUS, it has been increasingly used in applied and clinical research. Why Worry-​II Although other models of GAD include both positive and negative beliefs about worry (Wells & Carter, 1999), our treatment protocol (see Dugas & Robichaud, 2007) does not directly address negative beliefs about worry (negative

beliefs are subsumed under the “cognitive avoidance” component). For this reason, measures of negative beliefs about worry are not presented this section. In terms of positive beliefs about worry, previous research has shown that these beliefs distinguish patients with GAD from nonclinical individuals (Dugas et  al., 1998; Ladouceur et  al., 1999). In nonclinical samples, positive beliefs about worry have been found to predict level of worry (Laugesen, Dugas, & Bukowski, 2003; Robichaud et al., 2003). Finally, data from a treatment study show that the re-​evaluation of positive beliefs about worry leads to decreases in both positive beliefs and GAD symptoms (Dugas et  al., 2004). Thus, positive beliefs about worry appear to be important targets in the treatment of GAD. Therefore, we suggest that clinicians assess positive beliefs about worry prior to the start of treatment in order to determine appropriate intervention strategies. The Why Worry-​II (WW-​II; French version: Gosselin et al., 2003; English translation: Hebert, Dugas, Tulloch, & Holowka, 2014)  is a revised version of the original Why Worry questionnaire (WW; Freeston et al., 1994). The original version of the measure was revised to cover five types of positive beliefs about worry that are related to level of worry. The WW-​II is a 25-​item self-​report questionnaire that contains five subscales, with each subscale measuring one type of positive belief about worry. The subscales assess the following beliefs:  (a) Worry facilitates problem solving (e.g., “The fact that I worry helps me plan my actions to solve a problem”); (b) worry helps motivate (e.g., “The fact that I worry motivates me to do the things I  must do”); (c)  worrying protects one from difficult emotions in the event of a negative outcome (e.g., “If I  worry, I  will be less unhappy when a negative event occurs”); (d) the act of worrying itself prevents negative outcomes (e.g., “My worries can, by themselves, reduce the risks of danger”); and (e)  worry is a positive personality trait (e.g., “The fact that I  worry shows that I am a good person”). Items are rated on a 5-​point Likert scale. The WW-​ II scores have demonstrated adequate to excellent psychometric properties, including excellent internal consistency for the total score (α = .93) and good to excellent internal consistency for all subscales (α = .71 to .93). The WW-​II total score also has shown adequate test–​retest reliability at 6 weeks (r  =  .72) and adequate content and construct validity (Gosselin et  al., 2003; Hebert et  al., 2014). Furthermore, normative data are available on the WW-​II for both clinical and nonclinical samples (Dugas et al., 2007; Gosselin et al., 2003; Hebert et al., 2014).


Negative Problem Orientation Questionnaire The Negative Problem Orientation Questionnaire (NPOQ; French version:  Gosselin, Ladouceur, & Pelletier, 2005; English translation: Robichaud & Dugas, 2005a) is a 12-​item measure of a dysfunctional cognitive set that interferes with the ability to solve everyday problems effectively. Specifically, negative problem orientation refers to the tendency to view problems as threats, to doubt one’s own ability to problem solve, and to be pessimistic about the outcome of problem-​solving attempts. Using a 5-​point Likert scale, respondents rate their reactions or thoughts when confronted with a problem. Examples include “I see problems as a threat to my well-​being” and “I often see problems as bigger than they really are.” In nonclinical samples, scores on the English translation of the NPOQ have demonstrated adequate to excellent psychometric properties, which include excellent internal consistency (α  =  .92), adequate test–​retest reliability at 5 weeks (r  =  .80), and good content and construct validity (Robichaud & Dugas, 2005a, 2005b). Note, however, that only normative data from nonclinical samples are currently available (see Gosselin et al., 2005; Robichaud & Dugas, 2005a).

Cognitive Avoidance Questionnaire

The Cognitive Avoidance Questionnaire (CAQ; French version: Gosselin et al., 2002; English translation: Sexton & Dugas, 2008) is a 25-item measure of the tendency to use avoidance strategies when dealing with threatening intrusive thoughts. The process of cognitive avoidance, although not specific to GAD, is a contributing process in excessive and uncontrollable worry (Borkovec, Ray, & Stöber, 1998). The CAQ contains five subscales, each of which assesses a different avoidance strategy: (a) suppressing worrisome thoughts (e.g., “There are things I try not to think about”); (b) substituting neutral or positive thoughts for worries (e.g., “I think about trivial details so as not to think about important subjects that worry me”); (c) using distraction as a way to interrupt worrying (e.g., “I often do things to distract myself from my thoughts”); (d) avoiding actions/situations that can lead to worrisome thinking (e.g., “I avoid actions that remind me of things I do not want to think about”); and (e) transforming mental images into verbal–linguistic thoughts (e.g., “When I have mental images that are upsetting, I say things to myself in my head to replace the images”). Items on the CAQ are rated on a 5-point Likert scale. Scores on this measure have been shown to have adequate to excellent


psychometric properties. For example, the CAQ has demonstrated excellent internal consistency for the total score (α = .95), adequate test–retest reliability over a 5-week period (r = .85), good content and construct validity, and adequate validity generalization (Gosselin et al., 2002; Sexton & Dugas, 2008). Furthermore, adequate normative data for both clinical and nonclinical samples have been described elsewhere (Dugas et al., 2007; Gosselin et al., 2002; Sexton & Dugas, 2008).

Overall Evaluation

Most of the questionnaires described in this section have received at least adequate empirical support for their use in clinical settings. The PSWQ, which has the most support, is a widely accepted measure of worry. The BAI and BDI-II also have strong support for use in clinical settings. However, to our knowledge, no data have yet been published on the clinical utility of either of the quality of life measures described previously. Despite this lack of information, assessing for quality of life, as well as both somatic anxiety and depression, is important during the initial assessment. As mentioned previously, the measures of model-specific components should be used by clinicians intending to use the treatment protocol described in Dugas and Robichaud (2007) because these measures offer the unique opportunity to assess each component of the underlying cognitive model with a questionnaire that was developed explicitly for that purpose. Even if the clinician were to use an alternative treatment for GAD, assessing for such cognitive processes, especially IU, is important because beliefs about uncertainty play a central role in the maintenance of worry/GAD. However, more research is necessary to further establish their usefulness as assessment tools for use in clinical settings.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

In this section, we present an overview of the measures that can be used to monitor treatment progress and assess treatment outcome. As a rule, the measures described in the previous sections (semi-structured interviews and self-report questionnaires) can be readministered at post-treatment and at follow-up to assess treatment outcome and maintenance. Therefore, the treatment outcome portion of this section is brief and focuses on the evidence supporting the sensitivity to change of the previously described measures. This section also presents a simple strategy that clinicians can use to determine clinically meaningful change in their clients. A summary of the psychometric properties of the measures presented in this section is provided in Table 14.3. As was the case for Table 14.1, ratings of the previous (i.e., DSM-IV) versions of the semi-structured diagnostic interviews are presented in Table 14.3.

TABLE 14.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended

Treatment Monitoring Measures
PSWQ-PW | A | E | NA | L | G | G | NR | G | A | ✓
Self-monitoring booklet | G | NA | NA | NA | NR | A | NR | A | A | ✓

Treatment Outcome Measures
ADIS-IV | NA | NA | A | NA | A | A | G | E | E |
SCID-I/P | NA | NA | A | NA | A | A | G | E | E |
PSWQ | E | E | NA | G | G | G | E | G | A | ✓
WAQ | G | NR | NA | A | G | A | A | A | A | ✓
BAI | G | E | NA | A | A | A | G | A | G | ✓
BDI-II | G | E | NA | A | A | A | G | A | G | ✓
QLQ | A | G | NA | A | NR | NR | NR | NR | A |
QOLI | NR | G | NA | A | NR | A | NR | NR | NR |
IUS | G | E | NA | A | G | G | G | G | G | ✓
WW-II | A | E | NA | A | A | A | G | NR | G | ✓
NPOQ | NR | E | NA | A | G | G | NR | NR | A | ✓
CAQ | A | E | NA | A | G | G | A | NR | G | ✓

Note: PSWQ-PW = Penn State Worry Questionnaire-Past Week; ADIS-IV = Anxiety Disorders Interview Schedule for DSM-IV; SCID-I/P = Structured Clinical Interview for DSM-IV-TR for Axis I disorders, Patient Edition; PSWQ = Penn State Worry Questionnaire; WAQ = Worry and Anxiety Questionnaire; BAI = Beck Anxiety Inventory; BDI-II = Beck Depression Inventory, Second Edition; QLQ = Quality of Life Questionnaire; QOLI = Quality of Life Inventory; IUS = Intolerance of Uncertainty Scale; WW-II = Why Worry-II; NPOQ = Negative Problem Orientation Questionnaire; CAQ = Cognitive Avoidance Questionnaire; L = Less than adequate; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

Treatment Monitoring

Given that excessive and uncontrollable worry is the central feature of GAD, the assessment of the severity of worry on a weekly basis is essential to monitor the progress of clients. For this purpose, we recommend that clinicians use an adapted version of the PSWQ, which was developed to allow for the weekly assessment of excessive and uncontrollable worry. Clinicians can simply ask clients to complete this questionnaire prior to the start of every therapy session. The Penn State Worry Questionnaire-Past Week (PSWQ-PW; Stöber & Bittencourt, 1998) is a reformulation of the original PSWQ (Meyer et al., 1990). The instructions for the revised version emphasize worry during the past week (rather than trait worry). In addition, each item was rephrased to past tense, with the exception of one item, which was removed because it specifically assessed trait worry (“I've been a worrier all my life”). Therefore, the PSWQ-PW contains 15 items instead of 16. A further change was made to the response scale, which was changed from a 5-point to a 7-point scale. In general, the PSWQ-PW scores demonstrate excellent internal consistency (α = .91) and a level of test–retest reliability (r = .59) that is appropriate for a measure designed to assess weekly fluctuations in symptoms. The content and construct validity of the PSWQ-PW is good (Stöber & Bittencourt, 1998), which is consistent with that of the original PSWQ (Meyer et al., 1990). Moreover, scores on the revised questionnaire have shown good sensitivity to treatment-related changes in worry (more so than the original PSWQ).

In addition to the weekly assessment of excessive and uncontrollable worry, clinicians should also obtain daily self-ratings of worry, anxiety, depression, and medication use (if applicable). Not only is daily self-monitoring a useful tool for helping clients become more aware of their affective states but also it provides valuable information about client progress. We typically use a self-monitoring booklet that consists of four questions (proportion of the day spent worrying, feeling anxious or tense, feeling sad or depressed, and name and quantity of any medication consumed). There is evidence supporting the validity of this type of self-monitoring, which includes good normative data, adequate construct validity, and adequate sensitivity to treatment (for a summary, refer to Table 14.3). For example, one study using daily self-monitoring booklets found that patients with GAD reported significantly more time spent worrying than did a nonclinical control group (Dupuy, Beaudoin, Rhéaume, Ladouceur, & Dugas, 2001). However, self-monitoring booklets are not without their limitations. In particular, they would benefit from greater standardization because different GAD treatment protocols tend to use different self-monitoring booklets. Despite this limitation, self-monitoring booklets remain a useful means of monitoring treatment progress and assessing treatment outcome (e.g., Campbell & Brown, 2002).
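Because the booklet data are so simple, they are easy to organize for review at the start of each session. The sketch below is one illustrative way to structure a week of entries and summarize them; the field names, the 0–100 percentage scale, and the example values are assumptions made for illustration rather than the published booklet format.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class DailyEntry:
    """One day of the four self-monitoring questions (scales are illustrative)."""
    pct_day_worrying: float      # proportion of the day spent worrying (0-100)
    pct_day_anxious: float       # proportion of the day feeling anxious or tense (0-100)
    pct_day_depressed: float     # proportion of the day feeling sad or depressed (0-100)
    medication: str = ""         # name and quantity of any medication consumed

def weekly_summary(entries):
    """Average the daily ratings so progress can be reviewed at the next session."""
    return {
        "mean_worry": mean(e.pct_day_worrying for e in entries),
        "mean_anxiety": mean(e.pct_day_anxious for e in entries),
        "mean_depression": mean(e.pct_day_depressed for e in entries),
        "days_with_medication": sum(1 for e in entries if e.medication),
    }

week = [
    DailyEntry(60, 55, 30, "lorazepam 0.5 mg"),
    DailyEntry(45, 50, 25),
    DailyEntry(50, 40, 20),
]
print(weekly_summary(week))
```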


Treatment Outcome

As mentioned previously, the assessment of treatment outcome (and treatment maintenance) involves the readministration of measures described in the previous sections. Following treatment, either the ADIS-5 or the SCID-5-CV should be used to assess for the presence of GAD and any comorbid conditions. The same interview used at pretreatment should be used at post-treatment to allow for a direct comparison of diagnostic impressions. The PSWQ and WAQ should also be readministered at post-treatment and follow-up assessments, as well as the measures of somatic anxiety (BAI), depression (BDI-II), and quality of life (QLQ or QOLI). Likewise, clinicians should readminister the measures of the cognitive processes involved in GAD (IUS, WW-II, NPOQ, and CAQ) immediately following treatment and at all follow-up assessments. In fact, having clients complete the IUS at post-treatment may be particularly important. Data show that changes in IUS scores from pre- to post-treatment predict GAD symptoms up to 2 years after treatment completion (Dugas et al., 2003). We recognize that readministering each of these measures at post-treatment is time-consuming and would probably take approximately two sessions to complete. Although this is highly recommended, if it is not feasible for the clinician, then the BAI, WW-II, NPOQ, and CAQ may be removed from the assessment protocol. Generally, all measures described previously have been shown to be sensitive to treatment changes (see, e.g., Borkovec & Costello, 1993; Dugas et al., 2003, 2010; Ladouceur et al., 2000).

The clinician can use the information obtained following treatment to determine if the client has achieved clinically significant change. According to Jacobson and Truax (1991), the clinician can assess the clinical significance of change by determining whether a client's post-treatment score falls within the range of the general population rather than the range of the clinical population (in this case, GAD). Given that normative data are available for most of the measures presented in this chapter, clinicians will be in a position to assess the clinical significance of change on these measures. Jacobson and Truax also argued that clinicians should determine the degree of change on each measure for each client, which can be accomplished by calculating the index of reliable change (for the formulas for calculating the clinical significance of change and the reliable change index, see Jacobson & Truax, 1991). In addition to the methods described by Jacobson and Truax, some GAD studies have defined treatment response as a 20% reduction from pre- to post-treatment on measures of GAD and associated symptoms (see, e.g., Borkovec & Costello, 1993; Dugas et al., 2003; Ladouceur et al., 2000). Other methods also exist for calculating clinically significant change, many of which are more complex than the one described by Jacobson and Truax. However, a study examining the efficacy of different techniques for assessing clinically significant change (Atkins, Bedics, McGlinchey, & Beauchaine, 2005) found that the method described by Jacobson and Truax was comparable to other more complex methods.

Overall Evaluation

There is considerable support for the use of the measures described in this section (PSWQ-PW and self-monitoring booklets) to assess progress during treatment. Furthermore, the measures that can be used to assess for the presence/absence of GAD (and other disorders), the severity of worry/GAD symptoms, somatic anxiety, depression, quality of life, and cognitive processes implicated in GAD also have empirical support for their use following treatment. What is less clear is how much change is sufficient to terminate the therapy. Weighing the advantages and disadvantages for terminating treatment while evaluating the level of clinically significant change will aid the clinician in his or her decision. Relatedly, the assessment of clinically significant change can also be of great use to determine if an adjustment in treatment is needed.
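For clinicians who prefer to compute these indices directly, the following sketch implements the reliable change index and the cutoff that falls midway between the clinical and functional distributions (Jacobson & Truax, 1991), along with the 20% reduction rule mentioned above. The normative means, standard deviations, reliability coefficient, and client scores are hypothetical placeholders and should be replaced with published norms for the specific measure being evaluated.

```python
import math

def reliable_change_index(pre, post, sd_pre, reliability):
    """Reliable Change Index (RCI) following Jacobson and Truax (1991).

    pre, post   : client's pre- and post-treatment scores on the measure
    sd_pre      : standard deviation of the measure in the (clinical) normative sample
    reliability : test-retest (or internal consistency) reliability of the measure
    """
    se_measurement = sd_pre * math.sqrt(1 - reliability)
    s_diff = math.sqrt(2 * se_measurement ** 2)
    return (post - pre) / s_diff

def clinical_cutoff(mean_clinical, sd_clinical, mean_functional, sd_functional):
    """Cutoff midway between the clinical and functional (general population)
    distributions, weighted by their standard deviations."""
    return ((sd_clinical * mean_functional + sd_functional * mean_clinical)
            / (sd_clinical + sd_functional))

# Hypothetical example for a worry questionnaire on which lower scores mean less worry
# (all values are placeholders, not published norms).
pre, post = 68, 45
rci = reliable_change_index(pre, post, sd_pre=9.0, reliability=0.85)
cutoff = clinical_cutoff(mean_clinical=67.0, sd_clinical=9.0,
                         mean_functional=44.0, sd_functional=11.0)

reliable = abs(rci) > 1.96                    # change unlikely to reflect measurement error alone
crossed_cutoff = post < cutoff                # closer to the functional than the clinical distribution
responded_20pct = (pre - post) / pre >= 0.20  # 20% pre- to post-treatment reduction criterion

print(f"RCI = {rci:.2f}, cutoff = {cutoff:.1f}")
print(f"Reliable change: {reliable}; clinically significant: {reliable and crossed_cutoff}")
print(f"20% reduction criterion met: {responded_20pct}")
```

In practice, both conditions (a reliable change and a post-treatment score that crosses the cutoff) would need to be met before concluding that clinically significant change has occurred on a given measure.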


CONCLUSIONS AND FUTURE DIRECTIONS

GAD is a diagnostic category that has undergone much change since its introduction in the DSM-​III; thus, the development of adequate assessment options has been an arduous task. However, due to developments in our understanding of GAD, there has been considerable improvement in our ability to comprehensively assess GAD for the purposes of treatment. The reader should keep in mind, nonetheless, that some measures described in this chapter do not yet have sufficient normative data. As such, more research is needed. There is currently much reliance on self-​report measures for the assessment of GAD, which is problematic because this type of assessment method does not always produce valid and clinically meaningful information. Clearly, more attention needs to be directed at producing more “objective” methods of assessing GAD to complement interview and self-​report methods. For example, psychophysiological measures are used more frequently for other anxiety disorders than for GAD. However, before these measures can be incorporated into an assessment battery for GAD, we need to learn more about the disorder’s psychophysiological features. For example, future research should examine whether heart rate variability can provide useful information for the differential diagnosis of GAD. In addition, clarifying the neural circuitry of GAD is another area in need of more attention. Identifying neuroanatomical structures and neural pathways implicated in GAD (and discovering how these structures and pathways influence the development of specific symptoms of GAD) can aid in the identification of GAD-​like patterns of brain activity in at-​risk or symptomatic individuals. Despite the challenges that are encountered when assessing individuals with GAD, assessment options have come a long way since GAD was first introduced as a diagnostic category. By presenting the assessment strategies described in this chapter, we hope to help clinicians working with this population to better identify GAD, monitor treatment progress, and assess short-​and long-​term treatment outcomes. Although the comprehensive assessment of GAD is a work in progress, it can be said that we now possess a variety of measurement instruments that have considerable clinical utility.

References

Akiskal, H. S. (1998). Toward a definition of generalized anxiety disorder as an anxious temperament type. Acta Psychiatrica Scandinavica, 98, 66–73.

American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Press. Andrews, G., Hobbs, M. J., Borkovec, T. D., Beesdo, K., Craske, M. G., Heimberg, R. G.,  .  .  .  Stanley, M. A. (2010). Generalized worry disorder: A review of DSM-​ IV generalized anxiety disorder and options for DSM-​V. Depression and Anxiety, 27, 134–​147. Andrews, G., Stewart, G. W., Morris-​Yates, A., Holt, P., & Henderson, A. S. (1990). Evidence for a general neurotic syndrome. British Journal of Psychiatry, 157, 6–​12. Atkins, D. C., Bedics, J. D., McGlinchey, J. B., & Beauchaine, T. P. (2005). Assessing clinical significance: Does it matter which method we use? Journal of Consulting and Clinical Psychology, 73, 982–​989. Beaudoin, S., Tremblay, M., Carbonneau, C., Dugas, M. J., Provencher, M., & Ladouceur, R. (1997, October). Validation d’un instrument diagnostique pour le trouble d’anxiété généralisée [Validation of a diagnostic measure for generalized anxiety disorder]. Poster session presented at the annual meeting for the Société Québecoise pour la Recherche en Psychologie, Sherbrooke, Quebec, Canada. Beck, A. T., Epstein, N., Brown, G., & Steer, R. A. (1988). An inventory for measuring clinical anxiety:  Psychometric properties. Journal of Consulting and Clinical Psychology, 56, 893–​897. Beck, A. T., & Steer, R. A. (1990). Manual for the Beck Anxiety Inventory. San Antonio, TX:  Psychological Corporation. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Beck Depression Inventory manual (2nd ed.). San Antonio, TX: Psychological Corporation. Beekman, A. T., Bremmer, M. A., Deeg, D. J., van Balkom, A. J., Smit, J. H., de Beurs, E.,  .  .  .  van Tilburg, W. (1998). Anxiety disorders in later life: A report from the Longitudinal Aging Study Amsterdam. International Journal of Geriatric Psychiatry, 13, 717–​726. Bennett-​Levy, J., Butler, G., Fennell, M., Hackmann, A., Mueller, M., & Westbrook, D. (2004). Oxford guide to behavioural experiments in cognitive therapy. New York, NY: Oxford University Press. Blazer, D. G., Hughes, D., & George, L. K. (1987). Stressful life events and the onset of generalized anxiety disorder syndrome. American Journal of Psychiatry, 144, 1178–​1183. Blazer, D. G., Hughes, D., George, L. K., Swartz, M., & Boyer, R. (1991). Generalized anxiety disorder. In L. N.


Robins & D. A. Reiger (Eds.), Psychiatric disorders in America: The Epidemiologic Catchment Area Study (pp. 180–​203). New York, NY: Free Press. Borkovec, T. D., Alcaine, O., & Behar, E. (2004). Avoidance theory of worry and generalized anxiety disorder. In R. G. Heimberg, C. L. Turk, & D. S. Mennin (Eds.), Generalized anxiety disorder:  Advances in research and practice (pp. 77–​108). New York, NY: Guilford. Borkovec, T. D., & Costello, E. (1993). Efficacy of applied relaxation and cognitive–​ behavioral therapy in the treatment of generalized anxiety disorder. Journal of Consulting and Clinical Psychology, 61, 611–​619. Borkovec, T. D., Ray, W. J., & Stöber, J. (1998). Worry:  A cognitive phenomenon intimately linked to affective, physiological, and interpersonal behavioral processes. Cognitive Therapy and Research, 22, 561–​576. Brown, T. A., & Barlow, D. H. (2014). Anxiety and Related Disorders Interview Schedule for DSM-​5 (ADIS-​5). New York, NY: Oxford University Press. Brown, T. A., Barlow, D. H., & Liebowitz, M. R. (1994). The empirical basis of generalized anxiety disorder. American Journal of Psychiatry, 151, 1272–​1280. Brown, T. A., Chorpita, B. F., & Barlow, D. H. (1998). Structural relationships among dimensions of the DSM-​IV anxiety and mood disorders and dimensions of negative affect, positive affect, and autonomic arousal. Journal of Abnormal Psychology, 107, 179–​192. Brown, T. A., DiNardo, P. A., & Barlow, D. H. (1994). Anxiety Disorders Interview Schedule for DSM-​IV (ADIS-​IV). New York, NY: Oxford University Press. Brown, T. A., DiNardo, P. A., Lehman, C. L., & Campbell, L. A. (2001). Reliability of DSM-​IV anxiety and mood disorders: Implications for the classification of emotional disorders. Journal of Abnormal Psychology, 110, 49–​58. Buhr, K., & Dugas, M. J. (2002). The Intolerance of Uncertainty Scale:  Psychometric properties of the English version. Behaviour Research and Therapy, 40, 931–​945. Buhr, K., & Dugas, M. J. (2006). Investigating the construct validity of intolerance of uncertainty and its unique relationship with worry. Journal of Anxiety Disorders, 20, 222–​236. Campbell, L. A., & Brown, T. A. (2002). Generalized anxiety disorder. In M. M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (pp. 147–​181). New York, NY: Guilford. Carleton, R. N., Norton, M. A., & Asmundson, G. J. G. (2007). Fearing the unknown: A short version of the intolerance of uncertainty scale. Journal of Anxiety Disorders, 21, 105–​117. Carter, R. M., Wittchen, H.-​U., Pfister, H., & Kessler, R. C. (2001). One-​year prevalence of subthreshold and threshold DSM-​ IV generalized anxiety disorder in a nationally representative sample. Depression and Anxiety, 13, 78–​88.


Clark, D. A., & Beck, A. T. (2010). Cognitive therapy of anxiety disorders:  Science and practice. New  York, NY: Guilford. de Beurs, E., Wilson, K. A., Chambless, D. L., Goldstein, A. J., & Feske, U. (1997). Convergent and divergent validity of the Beck Anxiety Inventory for patients with panic disorder and agoraphobia. Depression and Anxiety, 6, 140–​146. Deschênes, S. S., & Dugas, M. J. (2013). Sudden gains in the cognitive–​behavioral treatment of generalized anxiety disorder. Cognitive Therapy and Research, 37, 805–​811. Dugas, M. J., Brillon, P., Savard, P., Turcotte, J., Gaudet, A., Ladouceur, R.,  .  .  .  Gervais, N. J. (2010). A randomized clinical trial of cognitive–​behavioral therapy and applied relaxation for adults with generalized anxiety disorder. Behavior Therapy, 41, 46–​58. Dugas, M. J., Freeston, M. H., Provencher, M. D., Lachance, S., Ladouceur, R., & Gosselin, P. (2001). Le Questionnaire sur l’inquiétude et l’anxiété:  Validation dans des échantillons non cliniques et cliniques. [The Worry and Anxiety Questionnaire:  Validation in clinical and nonclinical samples]. Journal de Thérapie Comportementale et Cognitive, 11, 31–​36. Dugas, M. J., Gagnon, F., Ladouceur, R., & Freeston, M. H. (1998). Generalized anxiety disorder: A preliminary test of a conceptual model. Behaviour Research and Therapy, 36, 215–​226. Dugas, M. J., & Koerner, N. (2005). The cognitive–​behavioral treatment for generalized anxiety disorder: Current status and future directions. Journal of Cognitive Psychotherapy: An International Quarterly, 19, 61–​81. Dugas, M. J., Ladouceur, R., Léger, E., Freeston, M. H., Langlois, F., Provencher, M., & Boisvert, J. M. (2003). Group cognitive–​ behavioral therapy for generalized anxiety disorder:  Treatment outcome and long-​ term follow-​up. Journal of Consulting and Clinical Psychology, 71, 821–​825. Dugas, M. J., & Robichaud, M. (2007). Cognitive–​behavioral treatment for generalized anxiety disorder: From science to practice. New York, NY: Routledge. Dugas, M. J., Savard, P., Gaudet, A., Turcotte, J., Brillon, P., Leblanc, R., .  .  .  et  al. (2004, November). Cognitive–​ behavioral therapy versus applied relaxation for generalized anxiety disorder:  Differential outcomes and processes. In H. Hazlett-​Stevens (Chair), New advances in the treatment of chronic worry and generalized anxiety disorder. Symposium conducted at the annual convention of the Association for Advancement of Behavior Therapy, New Orleans, LA. Dugas, M. J., Savard, P., Gaudet, A., Turcotte, J., Laugesen, N., Robichaud, M.,  .  .  .  Koerner, N. (2007). Can the components of a cognitive model predict the severity of generalized anxiety disorder? Behavior Therapy, 38, 169–​178.



Dupuy, J.-​B., Beaudoin, S., Rhéaume, J., Ladouceur, R., & Dugas, M. J. (2001). Worry: Daily self-​report in clinical and non-​clinical populations. Behaviour Research and Therapy, 39, 1249–​1255. Dyck, I. R., Phillips, K. A., Warshaw, M. G., Dolan, R. T., Shea, T., Stout, R. L., . . . Keller, M. B. (2001). Patterns of personality pathology in patients with generalized anxiety disorder, panic disorder with and without agoraphobia, and social phobia. Journal of Personality Disorders, 15, 60–​71. First, M. B., & Gibbon, M. (2004). The Structured Clinical Interview for DSM-​IV Axis I  Disorders (SCID-​I) and the Structured Clinical Interview for DSM-​IV Axis II Disorders (SCID-​II). In M. J. Hilsenroth & D. L. Segal (Eds.), Comprehensive handbook of psychological assessment:  Vol. 2.  Personality assessment (pp. 134–​ 143). Hoboken, NJ: Wiley. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2016). The Structured Clinical Interview for DSM-​ 5 Disorders, Clinician Version. Washington, DC: American Psychiatric Association. Freeston, M. H., Rhéaume, J., Letarte, H., Dugas, M. J., & Ladouceur, R. (1994). Why do people worry? Personality and Individual Differences, 17, 791–​802. Frisch, M. B. (1994). Quality of Life Inventory: Manual and treatment guide. Minneapolis, MN: National Computer Systems. Frisch, M. B., Cornell, J., Villanueva, M., & Retzlaff, P. J. (1992). Clinical validation of the Quality of Life Inventory: A measure of life satisfaction for use in treatment planning and outcome assessment. Psychological Assessment, 4, 92–​101. Fydrich, T., Dowdall, D., & Chambless, D. L. (1992). Reliability and validity of the Beck Anxiety Inventory. Journal of Anxiety Disorders, 6, 55–​61. Gillis, M. M., Haaga, D. A.  F., & Ford, G. T. (1995). Normative values of the Beck Anxiety Inventory, Fear Questionnaire, Penn State Worry Questionnaire, and Social Phobia and Anxiety Inventory. Psychological Assessment, 7, 450–​455. Gorman, J. M. (2002). Treatment of generalized anxiety disorder. Journal of Clinical Psychiatry, 63(Suppl. 8), 17–​23. Gosselin, P., Ladouceur, R., Langlois, F., Freeston, M. H., Dugas, M. J., & Bertrand, J. (2003). Développement et validation d’un nouvel instrument évaluant les croyances erronées à l’égard des inquiétudes [Development and validation of a new instrument evaluating erroneous beliefs about worry]. European Review of Applied Psychology, 53, 199–​211. Gosselin, P., Ladouceur, R., & Pelletier, O. (2005). Évaluation de l’attitude d’un individu face aux différents problèmes de vie: Le Questionnaire d’Attitude face aux Problèmes (QAP) [Evaluation of an individual’s attitude toward daily life problems: The Negative Problem Orientation

Questionnaire]. Journal de Thérapie Comportementale et Cognitive, 15, 141–​153. Gosselin, P., Langlois, F., Freeston, M. H., Ladouceur, R., Dugas, M. J., & Pelletier, O. (2002). Le Questionnaire d’Évitement Cognitif (QEC):  Développement et validation auprès d’adultes et d’adolescents [The Cognitive Avoidance Questionnaire (CAQ):  Development and validation among adult and adolescent samples]. Journal de Thérapie Comportementale et Cognitive, 12, 24–​37. Grant, B. F., Hasin, D. S., Stinson, F. S., Dawson, D. A., Chou, S. P., Ruan, W. J., & Huang, B. (2005). Co-​ occurrence of 12-​month mood and anxiety disorders and personality disorders in the US:  Results from the National Epidemiologic Survey on Alcohol and Related Conditions. Journal of Psychiatric Research, 39, 1–​9. Grant, B. F., Hasin, D. S., Stinson, F. S., Dawson, D. A., Ruan, W. J., Goldstein, R. B.,  .  .  .  Huang, B. (2005). Prevalence, correlates, co-​morbidity, and comparative disability of DSM-​IV generalized anxiety disorder in the USA: Results from the National Epidemiologic Survey on Alcohol and Related Conditions. Psychological Medicine, 35, 1747–​1759. Hebert, E. A., Dugas, M. J., Tulloch, T. G., & Holowka, D. W. (2014). Positive beliefs about worry:  A psychometric evaluation of the Why Worry-​ II. Personality and Individual Differences, 56, 3–​8. Hettema, J. M., Prescott, C. A., & Kendler, K. S. (2001). A population-​based twin study of generalized anxiety disorder in men and women. Journal of Nervous and Mental Disease, 189, 413–​420. Holaway, R. M., Rodebaugh, T. M., & Heimberg, R. G. (2006). The epidemiology of worry and generalized anxiety disorder. In G. C. L. Davey & A. Wells (Eds.), Worry and its psychological disorders: Theory, assessment and treatment (pp. 3–​20). Chichester, UK: Wiley. Hudson, J. L., & Rapee, R. M. (2004). From anxious temperament to disorder:  An etiological model. In R. G. Heimberg, C. L. Turk, & D. S. Mennin (Eds.), Generalized anxiety disorder:  Advances in research and practice (pp. 51–​74). New York, NY: Guilford. Hunt, C., Issakidis, C., & Andrews, G. (2002). DSM-​ IV generalized anxiety disorder in the Australian National Survey of Mental Health and Well-​Being. Psychological Medicine, 32, 649–​659. Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59, 12–​19. Kendler, K. S., Neale, M. C., Kessler, R. C., Heath, A. C., & Eaves, L. J. (1992). Major depression and generalized anxiety disorder: Same genes, (partly) different environments? Archives of General Psychiatry, 49, 716–​722. Kessler, R. C., Berglund, P., Demler, O., Jin, R., Merikangas, K. R., & Walters, E. E. (2005). Lifetime prevalence and


age-​of-​onset distributions of DSM-​IV disorders in the National Comorbidity Survey replication. Archives of General Psychiatry, 62, 593–​603. Kessler, R. C., DuPont, R. L., Berglund, P., & Wittchen, H.-​ U. (1999). Impairment in pure and comorbid generalized anxiety disorder and major depression at 12 months in two national surveys. American Journal of Psychiatry, 156, 1915–​1923. Kessler, R. C., Keller, M. B., & Wittchen, H.-​U. (2001). The epidemiology of generalized anxiety disorder. Psychiatric Clinics of North America, 24, 19–​39. Kessler, R. C., McGonagle, K. A., Zhao, S., Nelson, C. B., Hugues, M., Eshleman, S., . . . Kendler, K. S. (1994). Lifetime and 12-​month prevalence of DSM-​III-​R psychiatric disorders in the United States. Archives of General Psychiatry, 51, 8–​19. Kessler, R. C., Walters, E. E., & Wittchen, H.-​ U. (2004). Epidemiology. In R. G. Heimberg, C. L. Turk, & D. S. Mennin (Eds.), Generalized anxiety disorder: Advances in research and practice (pp. 29–​50). New York, NY: Guilford. Khawaja, N. G., & Yu, L. N. H. (2010). A comparison of the 27-​item and 12-​item intolerance of uncertainty scales. Clinical Psychologist, 14, 97–​106. Labrecque, J., Leblanc, G., Kirouac, C., Marchand, A., & Stephenson, R. (2001, March). Validation d’un questionnaire mesurant le qualité de vie auprès d’un échantillon étudiant [Validation of a Quality of Life Questionnaire in a student population]. Poster session presented at the 24th annual congress of La Société Québecoise pour la Recherche en Psychologie, Montréal, Quebec, Canada. Ladouceur, R., Dugas, M. J., Freeston, M. H., Léger, E., Gagnon, F., & Thibodeau, N. (2000). Efficacy of a new cognitive–​ behavioral treatment for generalized anxiety disorder:  Evaluation in a controlled clinical trial. Journal of Consulting and Clinical Psychology, 68, 957–​964. Ladouceur, R., Dugas, M. J., Freeston, M. H., Rhéaume, J., Blais, F., Boisvert, J.-​M., . . . Thibodeau, N. (1999). Specificity of generalized anxiety disorder symptoms and processes. Behavior Therapy, 30, 191–​207. Ladouceur, R., Gosselin, P., & Dugas, M. J. (2000). Experimental manipulation of intolerance of uncertainty: A study of a theoretical model of worry. Behaviour Research and Therapy, 38, 933–​941. Laugesen, N., Dugas, M. J., & Bukowski, W. M. (2003). Understanding adolescent worry:  The application of a cognitive model. Journal of Abnormal Child Psychology, 31, 55–​64. Léger, E., Freeston, M. H., Dugas, M. J., & Ladouceur, R. (1998). The Quality of Life Questionnaire. Quebec City, Quebec, Canada:  Behavioural Therapy Laboratory, School of Psychology, Laval University. Lichtenstein, J., & Cassidy, J. (1991, April). The Inventory of Adult Attachment (INVAA): Validation of a new measure.


Paper presented at the biennial meeting of the Society for Research in Child Development, Seattle, WA. Luterek, J. A., Turk, C. L., Heimberg, R. G., Fresco, D. M., & Mennin, D. S. (2002, November). Psychometric properties of the GAD-​Q-​IV among individuals with clinician-​ assessed generalized anxiety disorder:  An update. Poster session presented at the annual meeting of the Association for the Advancement of Behavior Therapy, Reno, NV. Maier, W., Gansicke, M., Freyberger, H. J., Linz, M., Heun, R., & Lecrubier, Y. (2000). Generalized anxiety disorder (ICD-​10) in primary care from a cross-​cultural perspective: A valid diagnostic entity? Acta Psychiatrica Scandinavica, 101, 29–​36. Maser, J. D. (1998). Generalized anxiety disorder and its comorbidities:  Disputes at the boundaries. Acta Psychiatrica Scandinavica, 98, 12–​22. McEvoy, P. M., & Mahoney, A. E. J. (2011). Achieving certainty about the structure of intolerance of uncertainty in a treatment-​seeking sample with anxiety and depression. Journal of Anxiety Disorders, 25, 112–​122. Mennin, D. S., Heimberg, R. G., Turk, C. L., & Fresco, D. M. (2002). Applying an emotion regulation framework to integrative approaches to generalized anxiety disorder. Clinical Psychology: Science and Practice, 9, 85–​90. Meyer, T. J., Miller, M. L., Metzger, R. L., & Borkovec, T. D. (1990). Development and validation of the Penn State Worry Questionnaire. Behaviour Research and Therapy, 28, 487–​495. Miller, P. R., Dasher, R., Collins, R., Griffiths, P., & Brown, F. (2001). Inpatient diagnostic assessments: 1. Accuracy of structured vs. unstructured interviews. Psychiatry Research, 105, 255–​264. Molina, S., & Borkovec, T. D. (1994). The Penn State Worry Questionnaire: Psychometric properties and associated characteristics. In G. C.  L. Davey & F. Tallis (Eds.), Worrying:  Perspectives on theory, assessment, and treatment (pp. 265–​283). Chichester, UK: Wiley. Newman, M. G., Zuellig, A. R., Kachin, K. E., Constantino, M. J., Przeworski, A., Erickson, T., & Cashman-​ McGrath, L. (2002). Preliminary reliability and validity of the Generalized Anxiety Disorder Questionnaire-​ IV: A revised self-​report diagnostic measure of generalized anxiety disorder. Behavior Therapy, 33, 215–​233. Pearson, C., Janz, T., & Ali, J. (2013, September). Mental and substance use disorders in Canada. Health at a Glance, Statistics Canada Catalogue No. 82-​624-​X. Peasley, C. E., Molina, S., & Borkovec, T. D. (1994, November). Empathy in generalized anxiety disorder. Poster session presented at the annual meeting of the Association for the Advancement of Behavior Therapy, San Diego, CA. Rapee, R. M. (1991). Generalized anxiety disorder: A review of clinical features and theoretical concepts. Clinical Psychology Review, 11, 419–​440.



Robichaud, M., & Dugas, M. J. (2005a). Negative problem orientation (Part I):  Psychometric properties of a new measure. Behaviour Research and Therapy, 43, 391–​401. Robichaud, M., & Dugas, M. J. (2005b). Negative problem orientation (Part II): Construct validity and specificity to worry. Behaviour Research and Therapy, 43, 3403–​3412. Robichaud, M., Dugas, M. J., & Conway, M. (2003). Gender differences in worry and associated cognitive–​behavioral variables. Journal of Anxiety Disorders, 17, 501–​516. Roy-​Byrne, P. P., & Katon, W. (1997). Generalized anxiety disorder in primary care:  The precursor/​modifier pathway to increased healthcare utilization. Journal of Clinical Psychiatry, 58, 34–​38. Sexton, K. A., & Dugas, M. J. (2008). The Cognitive Avoidance Questionnaire:  Validation of the English translation. Journal of Anxiety Disorders, 22, 355–​370. Sexton, K. A., & Dugas, M. J. (2009). Defining distinct negative beliefs about uncertainty:  Validating the Factor Structure of the Intolerance of Uncertainty Scale. Psychological Assessment, 21, 176–​186. Starcevic, V., Portman, M. E., & Beck, A. T. (2012). Generalized anxiety disorder:  Between neglect and an epidemic. Journal of Nervous and Mental Disease, 200, 664–​667. Startup, H. M., & Erickson, T. M. (2006). The Penn State Worry Questionnaire (PSWQ). In G. C.  L. Davey & A. Wells (Eds.), Worry and its psychological disorders:  Theory, assessment and treatment (pp. 101–​119). Chichester, UK: Wiley. Steer, R. A., & Clark, D. A. (1997). Psychometric properties of the Beck Depression Inventory-​II with college students. Measurement and Evaluation in Counselling and Development, 30, 128–​136. Stein, M. B. (2004). Public health perspectives on generalized anxiety disorder. Journal of Clinical Psychiatry, 65, 3–​7. Stöber, J., & Bittencourt, J. (1998). Weekly assessment of worry:  An adaptation of the Penn State Worry

Questionnaire for monitoring changes during treatment. Behaviour Research and Therapy, 36, 645–​656. Wells, A. (2009). Metacognitive therapy for anxiety and depression. New York, NY: Guilford. Wells, A., & Carter, K. (1999). Preliminary tests of a cognitive model of generalized anxiety disorder. Behaviour Research and Therapy, 37, 585–​594. Williams, J. B. W., Gibbon, M., First, M. B., Spitzer, R. L., Davies, M., Borus, J., . . . et al. (1992). The Structured Clinical Interview for DSM-​III-​R (SCID): II. Multisite test–​retest reliability. Archives of General Psychiatry, 49, 630–​636. Wittchen, H.-​U., & Hoyer, J. (2001). Generalized anxiety disorder: Nature and course. Journal of Clinical Psychiatry, 62(Suppl. 11), 15–​19. Wittchen, H.-​ U., Kessler, R. C., Beesdo, K., Krause, P., Höfler, M., & Hoyer, J. (2002). Generalized anxiety and depression in primary care:  Prevalence, recognition, and management. Journal of Clinical Psychiatry, 63, 24–​34. Wittchen, H.-​U., Zhao, S., Kessler, R. C., & Eaton, W. W. (1994). DSM-​ III-​ R generalized anxiety disorder in the National Comorbidity Survey. Archives of General Psychiatry, 51, 355–​364. Yonkers, K. A., Warshaw, M. G., Massion, A. O., & Keller, M. B. (1996). Phenomenology and course of generalized anxiety disorder. British Journal of Psychiatry, 168, 308–​313. Zanarini, M. C., Skodol, A. E., Bender, D., Dolan, R., Sanislow, C., Schaefer, E., . . . Gunderson, J. G. (2000). The Collaborative Longitudinal Personality Disorders Study: Reliability of Axis I and II diagnoses. Journal of Personality Disorders, 14, 291–​29. Zimmerman, M., & Mattia, J. I. (1999). Psychiatric diagnosis in clinical practice:  Is comorbidity being missed? Comprehensive Psychiatry, 40, 182–​191.

15

Obsessive–Compulsive Disorder

Shannon M. Blakey
Jonathan S. Abramowitz

This chapter addresses the conceptualization and assessment of obsessive–compulsive disorder (OCD) in order to aid the clinician in the treatment of this condition. After identifying, defining, and describing the nature of OCD, we provide a brief review of empirically based theories and psychological treatments. Next, three sections address assessment for the purposes of (a) establishing a clinical diagnosis of OCD, (b) formulating a treatment plan, and (c) measuring severity and treatment response. Not all of the available measures of OCD are reviewed in this chapter because some older measures have fallen out of favor as our understanding of this disorder has advanced; other measures have poor psychometric properties or confound important variables. The chapter concludes with a discussion of the strengths and limitations of existing assessment options, as well as future directions in the assessment of OCD.

THE NATURE OF OCD

Definition OCD is classified in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​5; American Psychiatric Association [APA], 2013)  as an obsessive–​compulsive and related disorder characterized by obsessions or compulsions. Obsessions are persistent intrusive thoughts, ideas, images, or doubts that are experienced as unacceptable, senseless, or bizarre. The intrusions also evoke subjective distress (e.g., anxiety, fear, and doubt) and are not simply everyday worries about work, relationships, or finances. Common obsessions include ideas of contamination by the Ebola virus, unwanted impulses to harm others, doubts about one’s sexual

preference, the idea one might have emotionally harmed someone, and intrusive sacrilegious images. Although highly individualistic, obsessions typically concern the following general themes: aggression and violence, responsibility for causing harm, contamination, sex, religion, the need for exactness or completeness, and concerns about serious illnesses. Most individuals with OCD evidence multiple types of obsessions. To control their anxiety, individuals with OCD attempt to avoid stimuli that trigger obsessions (e.g., public restrooms in the case of contamination obsessions). If such stimuli cannot be avoided, however, the person performs compulsive rituals—​behavioral or mental acts that are completed according to self-​generated “rules.” The rituals are deliberate yet clearly senseless or excessive in relation to the obsessional fear they are designed to neutralize (e.g., washing one’s hands for 30 minutes after using the restroom). As with obsessions, rituals are highly individualized. Common overt rituals include excessive decontamination (e.g., washing), checking (e.g., locks and the stove), counting, and repeating routine actions (e.g., going through doorways). Examples of covert or mental rituals include excessive prayer and using special “safe” phrases or numbers to neutralize “unsafe” thoughts or stimuli (e.g., thinking the number 2 to “undo” the number 666). Obsessions and compulsions are functionally related in that obsessions (e.g., images of germs) increase subjective distress, whereas rituals (e.g., washing) reduce distress. Individuals with OCD display a range of insight into the senselessness of their symptoms in that some acknowledge the irrationality of their obsessions and compulsions and others are firmly convinced that these symptoms are rational. Often, the degree of insight varies across time and obsessional themes. For example, one person might




recognize her obsessional thoughts of harm as senseless but have poor insight into the irrationality of her contamination obsessions.

Etiological Models and Treatment

Biological Models

Prevailing neurotransmitter theories posit that abnormalities in the serotonin system underlie OCD (Okuda & Simpson, 2015). Results from studies that have directly examined the relationship between serotonin and OCD, however, have been inconsistent. Whereas the preferential response of patients with OCD to serotonergic medication is often championed as supporting the serotonin hypothesis, this argument is of little value because the serotonin hypothesis was initially derived from this treatment outcome result (i.e., it is therefore a circular argument). Moreover, to reason backward with respect to specific neurotransmitter-related etiology from the apparent success of a medication represents a logical error called post hoc ergo propter hoc (“after this, therefore because of this”). Indeed, there might be numerous reasons for the observed efficacy of serotonergic medications. Finally, there is no coherent explanation for how serotonin abnormalities might translate to obsessions and compulsions and, given the efficacy of serotonergic medications for numerous psychiatric conditions, no explanation for why one might develop OCD instead of another disorder. Accordingly, the notion that serotonin functioning mediates OCD symptoms is tenuous.

Predominant neuroanatomical models of OCD propose that symptoms arise from structural and functional abnormalities in orbitofrontal–subcortical circuits within the brain (Lapidus, Stern, Berlin, & Goodman, 2014). These circuits are thought to connect regions of the brain involved in information processing with those involved in the initiation of certain behavioral responses. Although highly interesting, these models are derived from cross-sectional data merely indicating differences in brain structure and function between people with and without OCD. Because of their correlational nature, however, such data cannot reveal whether OCD is a cause or a consequence of the observed brain differences. It is indeed possible (and even likely) that such observations represent the effects of chronic anxiety on normally functioning brain systems.

Psychological Models

Two psychological models are relevant to the effective treatment of OCD: a conditioning approach and a cognitive–behavioral approach. Early conditioning models proposed that obsessional anxiety is acquired when a previously neutral stimulus (e.g., the floor) becomes associated with fear through classical conditioning. This fear is then maintained by avoidance and the performance of rituals, which prevent the natural extinction of the fear. Avoidance and rituals are also negatively reinforced by the reduction in fear they engender; thus, they develop into compulsive-like habits. Contemporary learning models focus on other sources of learning, such as vicarious conditioning and social learning, to account for the development of obsessions (e.g., Mineka & Zinbarg, 2006).

Conditioning models form the basis for the most effective treatment for OCD, which includes the behavioral therapy techniques of exposure and response prevention (ERP; Abramowitz & Jacoby, 2014). Therapeutic exposure aims to extinguish obsessional fear by helping the individual systematically confront situations and stimuli that evoke obsessions (e.g., touching floors and thinking upsetting thoughts) and remain in the feared situation until he or she learns that feared outcomes are less likely or less catastrophic than anticipated (i.e., extinction). Response prevention entails refraining from compulsive rituals, with the aim of weakening the association between rituals and anxiety reduction. Exposure exercises are repeated frequently and in multiple contexts, perhaps (although not necessarily) using a hierarchy-driven (i.e., graduated) approach in which less distressing stimuli are confronted and mastered before more difficult stimuli are faced. The details regarding implementation of ERP are beyond the scope of this chapter but are well described elsewhere (e.g., Abramowitz & Jacoby, 2014). Numerous studies conducted throughout the world indicate that ERP is highly effective, with the average patient receiving a 60% to 70% reduction in symptoms (e.g., Olatunji, Davis, Powers, & Smits, 2013).

Cognitive–behavioral models of OCD (e.g., Salkovskis, 1999) are derived from Beck's (1976) cognitive specificity hypothesis, which proposes that different types of psychopathology arise from disorder-specific dysfunctional beliefs and appraisals. As applied to OCD, such models consider unwanted intrusive thoughts as normal stimuli that occur from time to time in just about everyone but that develop into clinical obsessions when the intrusions are appraised as highly significant and threatening.

To illustrate, consider an unwanted intrusive thought of harming an infant. Whereas most people would regard


this experience as meaningless (“mental noise”), such an intrusion could develop into a clinical obsession if the person mistakenly appraised it as having serious implications (e.g., “Only bad people think these kinds of thoughts”). Such appraisals evoke distress and motivate the person to try to suppress or remove the intrusion (e.g., via rituals). The tendency to misappraise intrusive thoughts as having serious consequences is thought to arise from dysfunctional beliefs concerning responsibility, the importance of thoughts, need for perfectionism, overestimation of threat, and need for certainty. Rituals are conceptualized as efforts to remove obsessional intrusions and to prevent any perceived harmful consequences. Treatment based on the cognitive–​behavioral model incorporates ERP but emphasizes cognitive changes that occur with this treatment. For example, exposure is thought to modify erroneous expectations about the likelihood and severity of feared outcomes. Therapy also includes verbal techniques such as psychoeducation and cognitive restructuring that help the patient to recognize and correct faulty beliefs and appraisals of intrusive thoughts and other feared stimuli (e.g., Clark, 2004; Wilhelm & Steketee, 2006). Epidemiology, Course, and Prognosis The lifetime prevalence of OCD in the general adult population is as high as 2.3% (e.g., Kessler et al., 2005). Symptoms typically develop gradually, often beginning in the teenage years. An exception is the abrupt onset sometimes observed following pregnancy (Speisman, Storch, & Abramowitz, 2011). Left untreated, the disorder typically runs a chronic course, although symptoms may wax and wane in severity over time, and in some cases improve (often dependent on levels of psychosocial stress; e.g., Skoog & Skoog, 1999). Most individuals with OCD suffer for several years before they receive adequate diagnosis and treatment. Factors contributing to the underrecognition of OCD include the failure of patients to disclose symptoms, the failure to assess for obsessions and compulsions during mental status examinations, and difficulties with differential diagnoses. Because OCD represents a seemingly complex set of thinking and behavioral symptoms, its assessment has traditionally been considered highly challenging. This is likely because many clinicians undertake assessment without a theoretical framework to guide the process. The aim of this chapter is to facilitate a theoretically and empirically grounded approach to assessing OCD that is also consistent with the empirically


supported cognitive and behavioral interventions for this condition. Associated Features Most individuals with OCD also suffer from depressive symptoms, which can exacerbate obsessional problems and attenuate response to ERP (e.g., Abramowitz, Franklin, Kozak, Street, & Foa, 2000). Therefore, it is necessary to assess mood state and, in particular, to inquire about the chronological history of mood complaints in order to establish whether such symptoms should be considered as a primary diagnosis (e.g., major depressive disorder) or as secondary to OCD symptoms. Relatives’ emotional and behavioral responses to the patient’s OCD symptoms should also be considered. In some instances, family members who wish not to see their loved one suffer unwittingly contribute to the persistence of OCD symptoms by performing rituals, providing frequent reassurance, and engaging in avoidance to “help the affected relative cope with anxiety.” Thus, family accommodation is an important factor to assess (Boeding et al., 2013). In other families, relatives are highly critical and express hostility toward their loved one with OCD. When relatives meddle or chronically intrude into the patient’s daily activities, it can affect course and treatment response. Relatives can be invited to take part in the assessment process, thus providing an opportunity to assess how they interact with the patient. Relatives can be asked about (a)  the extent to which they participate in the patient’s rituals and avoidance habits, (b) how they respond when repeatedly asked questions for reassurance, (c) what consequences they fear might occur if symptoms are not accommodated, and (d) the extent to which the family’s daily activities are influenced by the patient’s OCD symptoms.

PURPOSES OF ASSESSMENT

Proper assessment of OCD is guided by conceptual models of phenomenology, etiology, and treatment. Because the cognitive–​behavioral model has strong empirical support, this framework is used in this chapter to determine what parameters are necessary to assess. The next sections include a review and discussion of the use of particular instruments and methodologies that clinicians and clinical researchers will find helpful for the purposes of (a) making a diagnosis of OCD, (b) case conceptualization and treatment planning, and (c) evaluating the effects of treatment.


ASSESSMENT FOR DIAGNOSIS

General Description of the Problem

It is useful to begin the diagnostic assessment in an unstructured way by asking the patient to provide a general description of his or her difficulties with obsessions and compulsions. Reviewing a typical day can highlight, for example, the frequency and duration of OCD symptoms, how these symptoms are managed, and the ways in which the person is functionally impaired. Examples of open-ended questions to ask regarding the presence of obsessions, compulsions, and related signs and symptoms include the following:

• What kinds of activities or situations trigger anxiety or fear?
• What kinds of upsetting or scary thoughts have you been experiencing?
• What places or situations have you been avoiding?
• Tell me about any behaviors that you feel compelled to perform over and over.
• What do you think might happen if you could not perform these behaviors?

Information about the onset, historical course of the problem, comorbid conditions, social and developmental history, and personal/family history of mental health treatment should also be obtained. The most common comorbid conditions among individuals with OCD are unipolar mood disorders (see Chapter 7, this volume) and other anxiety disorders (e.g., generalized anxiety disorder; see Chapter 14, this volume).

Yale–Brown Obsessive Compulsive Scale Symptom Checklist

Because OCD is highly heterogeneous in its presentation, a semi-structured approach to assessing the topography of a given patient's symptoms is recommended as an initial step. The Yale–Brown Obsessive Compulsive Scale Symptom Checklist (Y-BOCS-SC; Goodman, Price, Rasmussen, Mazure, Delgado, et al., 1989; Goodman, Price, Rasmussen, Mazure, Fleischmann, et al., 1989; reprinted in the Journal of Clinical Psychiatry, Vol. 60 [1999], Suppl. 18, pp. 67–77) is the best available instrument for such purposes. The first section of the Y-BOCS-SC provides definitions of obsessions and compulsions that are read to the patient. Next, the clinician reviews

a list of more than 50 common obsessions and compulsions and asks whether each symptom is currently present or has occurred in the past. Finally, the most prominent obsessions, compulsions, and OCD-​ related avoidance behaviors are identified from those endorsed by the patient. Although it is comprehensive in scope, there are no psychometric studies of the Y-​BOCS-​SC. Moreover, the checklist merely assesses the form of the patient’s obsessions and rituals without regard for the function of these symptoms. That is, there are no questions relating to how rituals are used to reduce obsessional fears (later, we describe a functional approach to assessing OCD symptoms that has incremental validity over the Y-​BOCS-​SC for the purpose of developing a treatment plan). Another limitation of the Y-​BOCS-​SC is that it contains only one item assessing mental rituals. Thus, the clinician must probe in a less structured way for the presence of these covert symptoms (the assessment of mental rituals is also discussed further in the section on case conceptualization and treatment planning). Furthermore, the checklist contains some items that do not pertain to OCD per se, such as hoarding obsessions (hoarding is defined as a separate disorder in DSM-​5) and hair pulling and self-​injurious compulsions. Finally, because of its emphasis on the overt characteristics of obsessions and compulsions—​such as their repetitiveness and thematic content (e.g., fears of illness and repetitive counting)—​ the Y-​BOCS-​SC offers little help in differentiating OCD symptoms from other clinical phenomena that might also be repetitive or thematically similar. For example, worries might be repetitive and can focus on matters of health and illness, depressive ruminations are repetitive and involve negative thinking, and hair pulling disorder (i.e., trichotillomania) can involve repetitive behaviors. It is therefore necessary to distinguish OCD symptoms from these other entities. Distinguishing OCD Symptoms from Other Phenomena Whereas obsessions and worries can both involve themes of illness and harm, obsessions focus on doubts about unrealistic disastrous consequences (e.g., “What if I had a hit-​ and-​ run automobile accident and didn’t realize it?”). Worries, in contrast, concern real-​ life (everyday) situations such as relationships, health and safety, work or school, and finances. In addition, compared with worries, obsessions are experienced as more unacceptable, and they evoke greater subjective resistance. Obsessions can

be differentiated from depressive ruminations based on content as well as subjective experience. Depressive ruminations typically involve overly negative thoughts about oneself, the world, and the future (e.g., “No one will ever love me”). Moreover, depressive ruminations do not elicit subjective resistance or ritualistic behavior. Whereas obsessions are experienced as distressing, unwanted, and unacceptable, fantasies are experienced as pleasurable and therefore should not be confused with obsessions. For example, the erotic thoughts of individuals with paraphilia lead to sexual arousal (even if the sufferer wishes not to have such thoughts or feels guilty about them). Sexual obsessions in OCD, however, do not lead to sexual arousal (Schwartz & Abramowitz, 2003). Similarly, repetitive “obsessive” thoughts about acquiring psychoactive substances (e.g., drugs and alcohol) are not, in and of themselves, experienced as distressing (although the person might feel guilty about the consequences of his or her alcohol consumption). Finally, whereas obsessions and delusions might both have a bizarre quality, delusions do not evoke anxiety or rituals. Tics (as in Tourette's syndrome) and compulsive rituals differ primarily in that rituals are usually purposeful, meaningful behaviors that are performed in response to obsessional distress and intended to reduce an obsessional fear. Tics, in contrast, are often performed in response to physical urges and sensations (i.e., premonitory urges) and are not triggered by obsessional thinking or performed to reduce fear. Other repetitive behaviors, such as “compulsive” hair pulling, gambling, skin picking, overeating, stealing, and excessive shopping, are problems with impulse control yet are often mistaken for OCD rituals. These impulsive behaviors, however, are not associated with obsessions and do not serve to reduce anxiety or the probability of feared outcomes. In fact, these acts are experienced as pleasurable even if the person wishes he or she did not feel compelled to do these behaviors.

TABLE 15.1  Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
ADIS-IV | NA | NA | E | G | E | E | E | G |
SCID-IV | NA | NA | A | A | E | E | E | A |
Y-BOCS-SC | NA | NA | NA | NA | G | E | NA | A | ✓
BABS | A | G | G | A | G | G | G | G | ✓

Note: ADIS-IV = Anxiety Disorders Interview Schedule for DSM-IV; SCID-IV = Structured Clinical Interview for DSM-IV; Y-BOCS-SC = Yale–Brown Obsessive Compulsive Scale Symptom Checklist; BABS = Brown Assessment of Beliefs Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

Standardized Diagnostic Assessment When initial questioning reveals the apparent presence of obsessions and/​ or compulsions, assessment should include a standardized diagnostic interview to confirm the diagnosis of OCD (as well as other common comorbid anxiety and mood disorders). Two incrementally valid instruments exist for this purpose and are described next. Table 15.1 shows ratings of various psychometric and practical characteristics of these diagnostic interviews. The reader should note that this chapter discusses assessment tools based on the DSM-​IV criteria for OCD (APA, 1994)  given that (a)  the diagnostic criteria have undergone only very minor changes from DSM-​IV to DSM-​5 and (b)  diagnostic instruments consistent with DSM-​5 have not yet been psychometrically evaluated (although updated versions have been published to allow for the diagnosis of OCD according to DSM-​5 criteria [APA, 2016; Brown & Barlow,  2014]). The most important change from DSM-​IV to DSM-​5 regarding OCD is that it has been removed from the anxiety disorders and incorporated as the flagship diagnosis of the obsessive–​compulsive related disorders (OCRDs)—​a completely novel diagnostic class in DSM-​5. Other OCRDs include hair pulling disorder (trichotillomania), skin picking (excoriation) disorder, hoarding disorder, and body dysmorphic disorder.

Anxiety Disorders Interview Schedule for DSM-​IV The Anxiety Disorders Interview Schedule for DSM-​IV (ADIS-​ IV) is a clinician-​ administered, semi-​ structured, diagnostic interview developed to establish the differential diagnosis among the anxiety disorders based on DSM-​IV criteria (DiNardo, Brown, & Barlow, 1994). Compared with other diagnostic interviews, it provides greater detail about anxiety-​ related problems. The ADIS-​ IV begins with demographic questions and items about general functioning and life stress. Sections for assessing each


anxiety, mood, and somatoform disorder appear next. The OCD section begins with a screening question, a positive answer to which triggers more detailed questions about obsessions and compulsions based on DSM-​IV criteria. In a large reliability study (Brown, DiNardo, Lehman, & Campbell, 2001), scores on the ADIS-​IV OCD module evidenced very good inter-​rater reliability, with the main sources of unreliability coming from the occasional assignment of a subclinical OCD diagnosis (as opposed to a different anxiety disorder). Although no studies have directly examined the validity of the scores on the ADIS-​IV OCD section, the many studies showing that OCD samples diagnosed with this instrument have higher scores on measures of OCD severity, compared to non-​OCD samples, provide evidence for its validity. Other advantages of the ADIS-​IV include the fact that it contains a semi-​structured format, which allows the clinician to collect detailed information. It also includes a dimensional rating of symptom severity. One limitation of the ADIS-​IV is that administration of the entire instrument can be time-​consuming, although the OCD module itself is not very long. Structured Clinical Interview for DSM-​IV Axis I Disorders The Structured Clinical Interview for DSM-​ IV Axis I  Disorders (SCID) is a clinician-​ administered, semi-​ structured interview developed for the purpose of diagnosing a range of DSM-​IV Axis I disorders (First, Spitzer, Gibbon, & Williams, 2002). Accordingly, it contains a module to assess the presence of OCD. The SCID begins with an open-​ended assessment of demographic information and various domains of functioning. The OCD section includes probe questions about the presence of obsessions and compulsions. Next to each probe appear the corresponding DSM-​IV diagnostic criteria, which are rated as absent (false), subthreshold, or present (true). Thus, ratings are of diagnostic criteria, not of interviewees’ responses. Research on the reliability of the SCID scores for assessing the presence of OCD has provided mixed results. Whereas some studies report low kappas, others report more acceptable inter-​rater reliability (e.g., Williams et al., 1992). Assessing Insight into the Senselessness of OCD Symptoms The Brown Assessment of Beliefs Scale The Brown Assessment of Beliefs Scale (BABS) is a brief (seven items) interview that provides a continuous

measure of insight into the senselessness of OCD symptoms (Eisen et  al., 1998). Administration begins with the interviewer and patient identifying one or two of the patient’s specific obsessional fears that have been of significant concern during the past week. Next, individual items assess the patient’s (a) conviction in the validity of this fear, (b) perceptions of how others view the validity of the fear, (c) explanation for why others hold a different view, (d) willingness to challenge the fear, (e) attempts to disprove the fear, (f) insight into whether the fear is part of a psychological/​psychiatric problem, and (g)  ideas/​delusions of reference. Only the first six items are summed to produce a total score. Norms for OCD samples have been established in several studies (e.g., Eisen, Phillips, Coles, & Rasmussen, 2004). The BABS appears to yield scores that have good internal consistency, and it discriminates OCD patients with good insight from those with poor insight (Eisen et  al., 1998). Whereas the BABS is sensitive to treatment-​related changes in OCD symptoms, there is mixed evidence regarding whether higher scores are predictive of poorer response to treatment (e.g., Ravie Kishore, Samar, Janardhan Reddy, Chandrasekhar, & Thennarasu, 2004). Practical Considerations People with OCD often have difficulty discussing their obsessions and compulsions. Embarrassment over the theme (e.g., sexual) and senselessness of such symptoms is a primary factor. The interviewer must be sensitive to such concerns and demonstrate appropriate empathy regarding the difficulties inherent in discussing these problems with others. Clearly, the clinician should avoid appearing shocked or disturbed by descriptions of obsessions and compulsions. Semi-​structured instruments such as the Y-​BOCS-​SC and BABS help the interviewer normalize such symptoms. Patients may also have difficulty describing their symptoms if they are unaware that such thoughts and behaviors represent obsessions and compulsions. Thus, including significant others in the interview can help identify such symptoms. Occasionally, features of OCD itself—​such as fear, indecisiveness, rigidity, and the need for reassurance—​ attenuate the assessment process. Patients might be afraid to verbalize their obsessional thoughts for fear that doing so will cause harm to befall themselves or others (e.g., thoughts of loved ones dying). They might also be highly circumstantial in their responses because of fears that if they do not provide “all the details,” they will not


benefit from therapy. Such obstacles require the clinician’s patience but can often be managed with persistent gentle, yet firm, reminders of the importance of accurate reporting, as well as time constraints and the need for short, concise responses. Overall Evaluation OCD is a highly heterogeneous condition in which each individual presents with idiosyncratic and personalized symptom content. Thus, the clinician must be flexible and comprehensive, and he or she must also be able to distinguish bona fide OCD symptoms from symptoms of other disorders with topographically similar presentations—​ especially other OCRDs such as hair pulling and skin picking disorders (Abramowitz & Jacoby, 2015). Family members living with patients may be able to serve as reliable sources of information regarding the validity of this diagnosis. Keeping these points in mind and using careful open-​ended questioning often leads to correctly identifying whether or not an individual has OCD. The ADIS-​IV and SCID are empirically established and widely used semi-​structured interviews for confirming the diagnosis of OCD. Some authors favor the ADIS-​IV for the excellent reliability of its scores and wider scope of information yielded compared with the SCID. Both of these instruments, however, require that interviewers be well trained in their administration, although BA-​or MA-​ level training in psychology is often sufficient to achieve good reliability as long as the interviewers are supervised by experienced doctoral-​level psychologists. How well the individual recognizes his or her obsessions and compulsions as senseless and excessive is best assessed using the BABS, a continuous measure of insight, as opposed to using the categorical DSM-​5 insight specifiers (i.e., “good or fair insight,” “poor insight,” and “absent insight/​delusional beliefs”).
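Because the BABS total described above is simply the sum of the first six of the seven item ratings, the scoring can be sketched in a few lines. The snippet below is only an illustration of that rule; the function name and the example numbers are hypothetical and are not part of the published scale.

```python
from typing import Sequence

def babs_total(item_ratings: Sequence[int]) -> int:
    """Illustrative BABS total score.

    The interview yields seven item ratings, but only the first six
    (conviction through insight) are summed; the seventh item
    (ideas/delusions of reference) is rated but excluded from the total.
    """
    if len(item_ratings) != 7:
        raise ValueError("Expected seven BABS item ratings.")
    return sum(item_ratings[:6])

# Example: the first six ratings (3, 2, 2, 1, 2, 1) sum to 11;
# the seventh rating (here 0) does not enter the total.
print(babs_total([3, 2, 2, 1, 2, 1, 0]))  # 11
```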

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

The cognitive–​behavioral model, from which effective psychological treatment is derived, provides a framework for collecting patient-​specific information and generating an individualized case conceptualization and treatment plan. This framework, referred to as functional assessment (Abramowitz, Deacon, & Whiteside, 2011; Abramowitz & Jacoby, 2014), is important because identifying the particular stimuli to be confronted during exposure


therapy requires detailed knowledge of the patient’s idiosyncratic fear triggers and cognitions. Similarly, assisting patients to resist compulsive urges (i.e., response prevention) requires knowing about all ritualistic maneuvers performed in response to obsessive fear. This section describes the procedures for conducting this type of assessment. Assessing Obsessional Stimuli Guided by information already collected, a thorough inventory of external triggers and intrusive thoughts that evoke the patient’s obsessional fear is obtained. Some of these stimuli will later be chosen for inclusion as exposure therapy tasks. Because of the idiosyncratic nature of obsessional triggers, there are no psychometrically validated instruments for this purpose. Therefore, the assessor must rely on his or her clinical experience and knowledge of the OCD research literature. External Triggers External triggers include specific objects, situations, places, and so on that evoke obsessional fears and urges to ritualize. Examples include toilets, knives, completing paperwork, religious icons, feared numbers (e.g., 13 or 666), and leaving the house. Examples of questions to help the patient describe such triggers include “In what situations do you feel anxious?” “What do you avoid?” and “What triggers your urge to do compulsive rituals?” Intrusive Thoughts Intrusive thoughts include unwanted mental stimuli (e.g., upsetting images) that are experienced as unacceptable, immoral, or repulsive and that evoke obsessional anxiety. Examples include images of germs, impulses to harm loved ones, doubts about one’s sexual preference, and thoughts of loved ones being injured. Examples of questions to elicit this information include “What intrusive thoughts do you have that trigger anxiety?” and “What thoughts do you try to avoid, resist, or dismiss?” Some patients are unwilling to describe their intrusions, fearing that the therapist will not understand that these are unwanted thoughts. To overcome such reluctance, the assessor can educate the patient about the universality of such intrusions and even self-​disclose his or her own senseless intrusions. A list of intrusive thoughts from nonclinical individuals that can be given to patients to demonstrate the universality of such phenomena is published elsewhere (e.g., Abramowitz, 2006a).
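As a minimal illustration of how the external triggers and intrusive thoughts gathered at this stage might be organized for later exposure planning, consider the sketch below. The class and field names (and the 0–100 distress metric) are hypothetical conveniences, not a standardized instrument.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ObsessionalStimulus:
    """One entry in an idiographic inventory of fear triggers."""
    description: str   # e.g., "public restroom door handle" or an image of germs
    kind: str          # "external trigger" or "intrusive thought"
    distress: int      # 0-100 subjective distress rating (illustrative metric)
    notes: str = ""

@dataclass
class StimulusInventory:
    patient_id: str
    stimuli: List[ObsessionalStimulus] = field(default_factory=list)

    def exposure_candidates(self, min_distress: int = 40) -> List[ObsessionalStimulus]:
        """Return stimuli rated distressing enough to consider as exposure tasks."""
        return [s for s in self.stimuli if s.distress >= min_distress]
```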


Assessing Cognitive Features

TABLE 15.2 Domains of Dysfunctional Beliefs Associated with OCD

Belief Domain: Overestimation of threat and inflated responsibility
Description: Beliefs that negative events are especially likely and would be especially awful. Beliefs that one has the special power to cause and/or the duty to prevent negative outcomes.

Belief Domain: Overimportance of, and need to control, intrusive thoughts
Description: Beliefs that the mere presence of a thought indicates that the thought is significant. For example, the belief that the thought has ethical or moral ramifications or that thinking the thought increases the probability of the occurrence of the corresponding behavior or event. Also, beliefs that complete control over one's thoughts is both necessary and possible.

Belief Domain: Perfectionism and intolerance for uncertainty
Description: Beliefs that mistakes and imperfection are intolerable. Beliefs that it is necessary and possible to be completely certain that negative outcomes will not occur.

Feared Consequences

Information should be obtained about the cognitive basis of obsessional fear—that is, the feared consequences associated with obsessional stimuli (e.g., "If I use a public restroom I will get AIDS" and "If my receipt has the number 13, I will have bad luck"). Knowing this information helps the therapist arrange exposure tasks that will disconfirm such exaggerated expectations. Although most patients readily articulate such fears, some do not. When feared disasters cannot be explicitly articulated, the patient might fear that anxiety itself will persist indefinitely (or escalate to "out-of-control" levels) unless a ritual is performed. Other patients might be afraid merely of not knowing "for sure" whether a feared outcome (usually in the more distant future) will occur. The following open-ended questions are appropriate for assessing feared consequences: "What is the worst thing that could happen if you are exposed to (obsessional trigger)?" "What do you think might happen if you didn't complete the ritual?" "What would happen if you didn't do anything to reduce your high levels of anxiety?" and "What if you don't know for certain whether _____ will happen?"

Dysfunctional Beliefs

Cognitive therapy techniques (e.g., Wilhelm & Steketee, 2006), which can be used to supplement exposure therapy, require assessment of the patient's dysfunctional thinking patterns that underlie obsessional fear. An international group of researchers, the Obsessive Compulsive Cognitions Working Group (OCCWG), has developed and tested two instruments that provide a comprehensive assessment of the cognitive landscape of OCD: the Obsessive Beliefs Questionnaire (OBQ) and the Interpretation of Intrusions Inventory (III; OCCWG, 2005). The reader should note that additional measures for assessing specific OCD-related dysfunctional beliefs are available (e.g., the Thought–Action Fusion Scale; Shafran, Thordarson, & Rachman, 1996). However, because the OBQ and III are comprehensive in their coverage of the various domains of dysfunctional beliefs, this chapter focuses on these measures. Information on many of the other measures can be found in Antony, Orsillo, and Roemer (2001).

An initial 87-item version of the OBQ (OCCWG, 2001, 2003) contained six rationally derived and highly correlated subscales. Subsequent research, however, has led to a 44-item version with three empirically derived subscales (OCCWG, 2005), which assess domains of dysfunctional beliefs (termed "obsessive beliefs") thought to increase risk for the development of OCD (e.g., Frost & Steketee, 2002; see Table 15.2). Specifically, obsessive beliefs are considered enduring trait-like cognitive biases that give rise to the misinterpretation of normally occurring intrusive thoughts as highly significant and threatening, leading to obsessional anxiety and compulsive urges (e.g., Taylor, Abramowitz, & McKay, 2007). When completing the measure, respondents rate their agreement with each of the 44 items using a scale from 1 (disagree very much) to 7 (agree very much). A summary of the psychometric viability of the OBQ appears in Table 15.3. The measure has been studied extensively with clinical and nonclinical samples, and its scores demonstrate very good internal consistency and test–retest reliability. Items were carefully designed by the OCCWG and, as such, demonstrate excellent content and construct validity. Prospective research also indicates that, to some extent, scores on the OBQ are predictive of the development of obsessive–compulsive symptoms (Abramowitz, Khandker, Nelson, Deacon, & Rygwall, 2006). The OBQ is quite useful in clinical settings because it identifies patterns of dysfunctional thinking that can be targeted by cognitive therapy techniques (e.g., Wilhelm & Steketee, 2006).

The III is a semi-idiographic measure designed to assess negative appraisals of intrusive thoughts. The respondent first reads a set of instructions that includes examples of cognitive intrusions (e.g., "an impulse to do something shameful or terrible") and then is asked to identify one or

TABLE 15.3 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
OBQ | E | E | NA | G | E | E | G | G |
III | E | E | NA | G | E | E | G | G |

Note: OBQ = Obsessive Beliefs Questionnaire; III = Interpretation of Intrusions Inventory; G = Good; E = Excellent; NA = Not Applicable.

two examples of his or her specific intrusions. The respondent next indicates the extent of agreement with the scale’s 31 items that concern various erroneous appraisals of intrusions (e.g., “I would be a better person if I didn’t have this thought”). Although three theoretically derived subscales were initially proposed, further psychometric analyses indicate that only a single III factor exists (OCCWG, 2005). As with the OBQ, the III has been studied in clinical and nonclinical samples, and its scores show good to excellent reliability (see Table 15.3). The scores also show excellent construct validity and predict, in a prospective fashion, the persistence of obsessional symptoms (Abramowitz, Nelson, Rygwall, & Khandker, 2007). The III is well suited for clinical practice because it is fairly brief and provides valuable information regarding how the patient negatively appraises the presence and meaning of his or her own intrusive thoughts. The clinician can use this information to illustrate how such faulty appraisals lead to obsessional anxiety and also how such interpretations can be modified (e.g., “It’s no wonder you spend so much time trying to fight your unwanted thoughts about accidents. It looks like you’re convinced that just by thinking these thoughts you will cause innocent people to have accidents. I wonder if that’s how our thoughts really work?”). Assessing Responses to Obsessional Distress As discussed previously, avoidance and compulsive rituals performed in response to obsessional stimuli serve to reduce anxiety in the short term, but they paradoxically maintain OCD symptoms by preventing the natural extinction of fear and by interfering with the disconfirmation of fears of disastrous consequences. Accordingly, one must ascertain the specifics of such behaviors so that they can be treatment targets. Passive Avoidance Most individuals with OCD avoid situations and stimuli associated with their obsessions in order to prevent obsessional thoughts, anxiety, or feared disastrous outcomes.

Avoidance might be overt, such as the evasion of certain people (e.g., cancer patients), places (e.g., public washrooms and places of worship), situations (e.g., using pesticides), and certain words (e.g., “murder”). It might also be subtle, such as staying away from the most often touched part of the door handle and refraining from listening to loud music while driving. The assessor should also ascertain the cognitive basis for avoidance (e.g., “If I  listen to music, I  might not realize it if I  hit a pedestrian”). Examples of questions to elicit this information about avoidance include “What situations do you avoid because of obsessional fear and why?” and “What would happen if you couldn’t avoid this situation?” Behavioral Rituals Because the external stimuli and intrusive thoughts that evoke obsessional fear are often ubiquitous (e.g., using the bathroom and intrusive thoughts), they might be difficult to avoid successfully. Patients use rituals, therefore, as “active avoidance” strategies that serve as an escape from obsessional fear, which could not be avoided in the first place. Some rituals could be called “compulsive” in that they are performed repetitively and in accordance with certain self-​prescribed rules (e.g., checking an even number of times, and washing for 40 seconds). Other rituals, however, would not be classified as compulsive because they might be subtle, brief, or performed only once at a time (e.g., holding the steering wheel tightly, and using a shirtsleeve to open a door). Topographically similar rituals can serve very different functions. For example, many patients engage in hand-​ washing rituals to decontaminate themselves. Such washing rituals are typically evoked by thoughts and images of germs or by doubts of whether one has had contact with a feared contaminant. Some individuals with OCD, however, engage in washing rituals in response to feelings of “mental pollution” evoked by unwanted disturbing intrusive thoughts of a sexual or otherwise immoral nature (e.g., Fairbrother, Newth, & Rachman, 2005). A  functional assessment, therefore, is necessary to elucidate how rituals


are linked to obsessions and feared consequences—​for example, checking the stove to prevent fires or using a certain type of soap because it specifically targets certain sorts of germs. Examples of probes to elicit this information include “What do you do when you can’t avoid the word ‘cancer’?” “What do you do to reduce your fears of being responsible for accidents?” “Why does this ritual reduce your discomfort?” and “What could happen if you didn’t engage in this ritual?”

Mental Rituals

The function of mental rituals is the same as that of behavioral rituals (de Silva, Menzies, & Shafran, 2003)—to reduce anxiety and prevent feared outcomes. Mental rituals typically take the form of silently repeating special "safe" words (e.g., "life"), images (e.g., of Jesus Christ), or phrases (e.g., prayers) in a set manner to neutralize or "deal with" unwanted obsessional thoughts. Other common presentations include thought suppression, privately reviewing one's actions repeatedly (e.g., to reassure oneself that one did not do something terrible), and mental counting. Many clinicians fail to assess mental rituals, or they confuse them for obsessions. Although mental rituals and obsessions are both cognitive events, they can be differentiated by careful questioning and by keeping in mind that obsessions are unwanted, intrusive, and anxiety-evoking, whereas mental rituals are deliberate attempts to neutralize obsessional intrusions and, as such, function to reduce anxiety. Examples of questions to elicit information about mental rituals include the following: "Sometimes people with OCD have mental strategies that they use to manage obsessional thoughts. What kinds of mental strategies do you use to dismiss unwanted thoughts?" and "What might happen if you didn't use the strategy?"

Self-Monitoring

Self-monitoring, in which the patient records the occurrence of obsessive–compulsive symptoms in "real time," provides data to complement the functional assessment. Patients can be instructed to log the following parameters of each symptomatic episode (i.e., using a form with corresponding column headers): (a) date and time of the episode, (b) situation or thought that triggered obsessional fear, and (c) rituals and the length of time spent engaged. The task of self-monitoring should be introduced as a vehicle by which the assessor and patient can gain a highly accurate picture of the time spent engaged in, and situations that lead to, rituals. It also helps to identify symptoms that might have gone unreported in the assessment sessions. The patient should be instructed that rather than guessing, he or she should use a watch to determine the exact amount of time spent ritualizing. Moreover, to maximize accuracy, each entry should be recorded immediately after it occurs (as opposed to waiting until the end of the day).

Case Conceptualization

The functional assessment described previously yields the information necessary to construct an individualized conceptualization of the patient’s idiosyncratic OCD symptoms. This formulation serves as a “road map” for cognitive–​ behavioral therapy and is synthesized by listing the obsessional stimuli (external cues and intrusive thoughts), cognitive appraisals of these stimuli (e.g., “I will get sick” and “I will be responsible for”), and the avoidance and ritualistic strategies used to reduce obsessional anxiety. Arrows are then drawn to show the links between stimuli, cognitions, emotions, and behavior as specified by the cognitive–​behavioral model. An example of a patient’s individualized model is shown in Figure 15.1. The model suggests that the modification of faulty beliefs and interpretations is required to reduce obsessional anxiety, and that the cessation of avoidance and ritualistic behavior is necessary for being able to modify the faulty cognitions. As discussed previously, this leads to the use of exposure therapy, cognitive techniques, and response prevention in the treatment of OCD (e.g., Abramowitz 2006a, 2006b; Salkovskis, 1996). Practical Considerations As with the diagnostic assessment, patients may seem hesitant to self-​disclose some of the details of their OCD symptoms. Explaining the purpose and importance of such an in-​depth analysis of obsessions and compulsions might be helpful in this regard. One tactic that often works well in building rapport and camaraderie (and thus, more self-​disclosure) is to describe the functional assessment phase as an exchange of information between two “experts.” The patient, who is an expert on his or her particular OCD symptoms, must help the therapist to understand these symptoms so that an effective treatment plan can be drawn up. Simultaneously, the therapist, an expert on conceptualizing OCD in general, must help the patient learn to think about his or her symptoms from a cognitive–​behavioral perspective so that the patient can get the most out of treatment.


FIGURE 15.1 Example of a cognitive–behavioral case formulation for a patient with OCD. [Figure content: fear cues (driving by pedestrians; driving by schools; thoughts/doubts such as "did I hit a pedestrian without knowing it?") and beliefs/interpretations ("Because I've had this thought, I must be at high risk for hitting a pedestrian"; "I can not take the chance that it has happened"; "I am a dangerous person/driver") feed into obsessional anxiety, which in turn leads to rituals (checking the car for blood/marks; checking the road for injured people; compulsively checking the rear-view mirrors) and avoidance (driving through neighborhoods, parking lots, or school zones; driving after dark).]
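A formulation like the one summarized in Figure 15.1 can also be recorded as a simple linked structure during the functional assessment. The sketch below is only one hypothetical way of capturing the stimuli–appraisal–response chains described above; the class and field names are illustrative, not a published tool.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CaseFormulation:
    """Fear cues and intrusive thoughts are appraised in threatening ways,
    producing obsessional anxiety, which is then managed through rituals
    and avoidance (the chain depicted in Figure 15.1)."""
    fear_cues: List[str] = field(default_factory=list)
    appraisals: List[str] = field(default_factory=list)
    rituals: List[str] = field(default_factory=list)
    avoidance: List[str] = field(default_factory=list)

# Entries mirroring the hit-and-run example in Figure 15.1
formulation = CaseFormulation(
    fear_cues=["driving by pedestrians", "driving by schools",
               "doubt: 'did I hit a pedestrian without knowing it?'"],
    appraisals=["having this thought means I am at high risk of hitting someone",
                "I cannot take the chance that it has happened",
                "I am a dangerous person/driver"],
    rituals=["checking the car for blood/marks",
             "checking the road for injured people",
             "compulsively checking the rear-view mirrors"],
    avoidance=["neighborhoods", "parking lots", "school zones", "driving after dark"],
)
```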

As alluded to previously, patients are sometimes afraid to mention certain symptoms due to dysfunctional beliefs about the consequences of saying certain things. For example, one individual was reluctant to describe his unwanted blasphemous images of Jesus having sexual intercourse with Mary because he feared that discussing these ideas (i.e., thinking about them) would invite divine punishment. In such instances, gentle, but firm, encouragement to openly discuss the obsession in the spirit of reducing old avoidance habits is the recommended course of action. As mentioned previously, to avoid reinforcing the patients’ fears, the interviewer should be sure to react in a calm and understanding manner when even the most unpleasant obsessions are self-​disclosed. Overall Evaluation Because of the idiographic nature of functional assessment, evaluation of the psychometric properties of this approach is difficult. The goals of functional

assessment, however, include (a)  the identification of target behaviors and the processes that maintain these behaviors and (b) the selection of appropriate interventions (Follette, Naugle, & Linnerroth, 2000). Therefore, given consistent evidence that ERP—​which can only be implemented with data derived from a functional assessment—​ is often highly successful in reducing OCD symptoms when assessed in this manner, it can be indirectly concluded that functional assessment is a valid and highly clinically useful tool. Incorporation of the psychometrically sound OBQ and III can add to the functional assessment by providing additional data regarding the cognitive basis of obsessional fears and compulsive urges. Advances in cognitive therapy for OCD (e.g., Wilhelm & Steketee, 2006)  include the development of specific cognitive techniques to target the types of cognitive distortions measured by these instruments. Despite these advances, the patient-​specific nature and vast heterogeneity of OCD symptoms (and associated cognitive distortions) can present challenges for even the most skilled clinicians.


TABLE 15.4 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
Y-BOCS interview | E | G | G | A | G | E | E | E | A |
PI-R | E | E | NA | G | G | E | G | NA | A |
OCI-R | E | G | NA | A | G | G | A | A | A | ✓
VOCI | G | E | NA | A | G | A | A | NA | G |
SCOPI | G | G | NA | G | G | A | A | NA | A |
DOCS | E | E | NA | G | G | E | E | G | A |

Note: Y-BOCS = Yale–Brown Obsessive Compulsive Scale; PI-R = Padua Inventory-Revised Version; OCI-R = Obsessive–Compulsive Inventory-Revised; VOCI = Vancouver Obsessional Compulsive Inventory; SCOPI = Schedule of Compulsions, Obsessions, and Pathological Impulses; DOCS = Dimensional Obsessive Compulsive Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

Continually assessing the nature and severity of OCD and related symptoms throughout the course of treatment assists the therapist in evaluating whether, and in what ways, the patient is responding. This is consistent with the empirical demonstration of treatment effectiveness. It is not sufficient for the clinician simply to conclude that the patient "seems to be less obsessed" or even for the patient (or an informant) to report that he or she "feels better." Instead, progress should be measured systematically by comparing current functioning against a baseline. Thus, periodic assessment using the instruments described in this section should be conducted to objectively clarify in what ways treatment has been helpful and what work remains to be done. A multimethod approach is suggested, involving the use of clinician-administered interview and self-report instruments that tap into various facets of OCD and related symptoms (i.e., depression, general anxiety, and functional disability). Table 15.4 shows ratings of various psychometric and practical characteristics of instruments developed to measure the severity of OCD symptoms. The individual measures are described next. As indicated previously, additional chapters in this volume provide guidance in selecting appropriate tools for evaluating and monitoring the symptoms of comorbid conditions.

Interview Measures

Y-BOCS Severity Scale

The Y-BOCS severity scale (Goodman, Price, Rasmussen, Mazure, Delgado, et al., 1989; Goodman, Price, Rasmussen, Mazure, Fleischmann, et al., 1989) was designed as a semi-structured interview consisting of 10 items that assess five parameters of obsessions (items 1–5) and compulsions (items 6–10) identified using the Y-BOCS-SC. These parameters are (a) time/frequency, (b) related interference in functioning, (c) associated distress, (d) attempts to resist, and (e) degree of control. Each item is rated from 0 (no symptoms) to 4 (extreme), and the 10 items are summed to produce a total score ranging from 0 to 40. In most instances, Y-BOCS scores of 0 to 7 represent subclinical OCD symptoms, those from 8 to 15 represent mild symptoms, scores of 16 to 23 relate to moderate symptoms, scores of 24 to 31 suggest severe symptoms, and scores of 32 to 40 imply extreme symptoms. A strength of the Y-BOCS is that it measures symptom severity independent of the number or types of different obsessions and compulsions. In fact, it is the only measure of OCD that assesses symptoms in this way. A limitation is that it can take 30 minutes or longer to administer, especially if used together with the Y-BOCS-SC. Numerous studies have established clinical and nonclinical norms and psychometric properties of the Y-BOCS. The scale scores have adequate internal consistency, and they have good inter-rater reliability and test–retest reliability over a period of several weeks (e.g., Goodman, Price, Rasmussen, Mazure, Delgado, et al., 1989). The Y-BOCS differentiates people with OCD from nondisordered individuals and those with other anxiety disorders (e.g., Rosenfeld, Dar, Anderson, Kobak, & Greist, 1992). Finally, scores are also sensitive to changes that occur as a result of treatment (for a review, see Taylor, Thordarson, & Sochting, 2002).
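To make the scoring arithmetic just described concrete, the sketch below sums the 10 item ratings and maps the 0–40 total onto the conventional severity ranges. The function name and band labels are illustrative conveniences; nothing here is prescribed by the instrument's authors.

```python
from typing import Sequence, Tuple

SEVERITY_BANDS = [  # (upper bound of total score, label), per the ranges above
    (7, "subclinical"), (15, "mild"), (23, "moderate"), (31, "severe"), (40, "extreme"),
]

def ybocs_total_and_severity(item_scores: Sequence[int]) -> Tuple[int, str]:
    """Sum the 10 Y-BOCS severity items (each rated 0-4) and classify
    the resulting 0-40 total into the conventional severity bands."""
    if len(item_scores) != 10 or any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("Expected 10 item ratings, each between 0 and 4.")
    total = sum(item_scores)
    label = next(name for upper, name in SEVERITY_BANDS if total <= upper)
    return total, label

# Example: a total of 18 falls in the 16-23 ("moderate") range.
print(ybocs_total_and_severity([2, 2, 2, 1, 2, 2, 2, 2, 1, 2]))  # (18, 'moderate')
```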

Self-Report Measures

Some researchers and clinicians use the Y-BOCS severity scale as a self-report measure; however, relatively few studies have evaluated the psychometric properties of the instrument when used in this way. Steketee, Frost,


and Bogart (1996) found that the self-​ report version tends to yield higher scores compared to the interview version. This might occur if respondents confuse other phenomena (e.g., worries, depressive ruminations, and impulsive behaviors) for obsessions and compulsions. An advantage to using the Y-​BOCS as a self-​ report measure, however, is that it can be administered more quickly and, therefore, more regularly during a course of treatment. Many additional self-​report inventories, however, have been developed to assess the main symptoms of OCD. The most promising of these instruments are discussed next. Revised Padua Inventory There are several available versions of the Padua Inventory. Because the most recent revision (the Revised Padua Inventory [PI-​R]; Burns, Keortge, Formea, & Sternberger, 1996)  is also the most widely used, it is described here. The PI-​R is a 39-​item measure that contains five subscales:  (a) contamination and washing, (b)  dressing and grooming compulsions, (c)  checking, (d)  obsessional thoughts of harm, and (e) obsessional impulses to harm. Agreement with each item is rated from 0 (not at all) to 4 (very much); thus, the total score ranges from 0 to 156. The scale requires approximately 10 minutes to complete, and its scores demonstrate at least adequate reliability and validity. It also differentiates between OCD symptoms and worry, although PI-​ R scores are significantly correlated with scores on measures of worry. Although the van Oppen, Hoekstra, and Emmelkamp (1995) revision has been shown to have good sensitivity to treatment, this characteristic has not been formally investigated for the Burns et al. (1996) version. Revised Obsessive Compulsive Inventory There are two versions of the Obsessive Compulsive Inventory (OCI). The original (Foa, Kozak, Salkovskis, Coles, & Amir, 1998) contains 42 items that assess the frequency and distress associated with a wide range of obsessional and compulsive symptoms. Items, each of which is rated on two scales (frequency and distress) from 0 (not at all) to 4 (extremely), are organized into the following seven subscales:  washing, checking, obsessing, hoarding, mental neutralizing, ordering, and doubting. The original OCI, however, has a number of psychometric and practical liabilities (e.g., the doubting and checking subscales appear to measure the same construct; Wu & Watson, 2003).


A revision of the measure (the OCI-R; Foa et al., 2002) has addressed some of the limitations of the original scale. The OCI-R consists of only 18 items (Foa et al., 2002) and six subscales (although one of the subscales pertains to hoarding symptoms, which are no longer considered symptoms of OCD). Each subscale contains 3 items that are rated on a single 5-point scale (0–4) of distress associated with that particular symptom. The OCI-R subscales include washing, checking, ordering, obsessing, hoarding, and neutralizing. A total score may be calculated by summing all 18 items, and subscale scores can be calculated from the 3 items within each subscale. Research suggests that scores on the OCI-R and its subscales have adequate convergent validity, although divergent validity of some of the subscales is suspect (e.g., Abramowitz & Deacon, 2006). The neutralizing subscale has been specifically criticized because the 3 items on this subscale all pertain to counting symptoms (e.g., Abramowitz & Deacon, 2006). Abramowitz, Tolin, and Diefenbach (2005) found that the OCI-R is useful for measuring response to treatment. Moreover, a cut-off score of 21 can differentiate OCD patients from nonpatients (Foa et al., 2002).

Vancouver Obsessional Compulsive Inventory

The Vancouver Obsessional Compulsive Inventory (VOCI; Thordarson et al., 2004) is a 55-item measure that represents an update of the 30-item Maudsley Obsessional Compulsive Inventory (MOCI; Hodgson & Rachman, 1977). Although the MOCI has sound psychometric properties and was once widely used, it has largely fallen out of favor due to two factors. First, it has poor sensitivity to treatment due to its true–false response format and inclusion of items assessing past and permanent events (as opposed to current behaviors). Second, the MOCI mainly measures the severity of washing and checking concerns but not other symptoms of OCD, such as obsessions and mental rituals. Although the VOCI is a lengthier instrument than its predecessor, items assess a broader range of OCD symptoms and are rated on a Likert-type scale from 0 (not at all true of me) to 4 (very much true of me). The VOCI's six empirically derived subscales include contamination, checking, obsessions, hoarding, just right, and indecisiveness. Thordarson et al. examined the factor structure and psychometric properties of the scale and found evidence of internal consistency, test–retest reliability, construct validity, and known groups validity of the subscale scores. The sensitivity to treatment of the VOCI has yet to be examined.
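Returning to the OCI-R scoring conventions noted earlier (18 items rated 0–4, six three-item subscales, and a suggested total-score cutoff of 21), the sketch below illustrates that arithmetic. The item-to-subscale key is supplied by the caller rather than reproduced here, and the function name and use of "at or above" the cutoff are assumptions made for illustration only.

```python
from typing import Dict, Sequence

OCI_R_CUTOFF = 21  # suggested total-score cutoff for distinguishing OCD patients from nonpatients

def score_oci_r(item_scores: Sequence[int],
                subscale_key: Dict[str, Sequence[int]]) -> dict:
    """Illustrative OCI-R scoring: the 18 items (each rated 0-4) are summed
    for the total, and each subscale is the sum of its three items.
    `subscale_key` maps subscale names to 1-indexed item numbers."""
    if len(item_scores) != 18:
        raise ValueError("The OCI-R has 18 items.")
    total = sum(item_scores)
    subscales = {name: sum(item_scores[i - 1] for i in items)
                 for name, items in subscale_key.items()}
    return {"total": total,
            "subscales": subscales,
            "at_or_above_cutoff": total >= OCI_R_CUTOFF}
```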


Schedule of Compulsions, Obsessions, and Pathological Impulses The Schedule of Compulsions, Obsessions, and Pathological Impulses (SCOPI) is a 47-​ item self-​ report scale that is designed to measure the presence of OCD symptoms while also assessing a number of impulse-​control phenomena (e.g., “I sometimes feel a sudden urge to play with fire”; Watson & Wu, 2005). The impulse-​control focus was included on the basis of evidence that there are links between impulse control and OCD symptoms. Respondents rate their degree of agreement with each item from 1 (disagree strongly) to 6 (agree strongly), and the scale contains five factors that correspond with empirically identified OCD symptom dimensions:  obsessive checking (14 items), obsessive cleanliness (12 items), compulsive rituals (8 items), hoarding (5 items), and pathological impulses (8 items). The SCOPI was developed empirically from a large item pool that sampled a broad range of OCD and impulsive symptoms, which was subjected to a series of factor analyses. In the only study evaluating the SCOPI, Watson and Wu (2005) found evidence that scores on the various subscales are internally consistent, are stable over a 2-​month interval, and show adequate convergent and discriminant validity. In particular, the SCOPI converges well with the OCI-​R. Although the measure is fairly easy to administer, the impulse-​control items do not assess whether these impulses (e.g., to steal and to act violently) are unwanted intrusive urges (i.e., obsessions) or actual impulses that the person acts upon (i.e., premonitory urges). Indeed, such impulses might occur among individuals with OCD and those with impulse-​control disorders. Yet they are experienced in very different ways depending on the disorder that affects the individual. This could lead to difficulties when interpreting responses to such items.

Dimensional Obsessive–​Compulsive Scale One limitation of the measures described previously is that they assess obsessions separately from compulsions. Yet this is inconsistent with the most up-​to-​date structural analyses of OCD symptoms indicating broader symptom dimensions composed of certain obsessions and rituals (e.g., McKay et  al., 2004). A  related limitation is that because these instruments emphasize the form of obsessions and rituals, the function of these symptoms is overlooked. To this end, Abramowitz and colleagues (2010) developed the Dimensional Obsessive–​Compulsive Scale (DOCS) to assess the severity of the four most empirically supported OCD symptom dimensions:  contamination,

responsibility for harm and mistakes, symmetry/​ordering, and unacceptable thoughts (e.g., McKay et  al., 2004). The DOCS is unique in that it affords an assessment of OCD symptoms based on function rather than form. The DOCS also assesses multiple empirically based parameters of severity (frequency, avoidance, distress, and functional interference; Deacon & Abramowitz, 2005)  for each of the four symptom dimensions. A 20-​ item self-​ report measure, the DOCS contains four subscales that correspond to the symptom dimensions mentioned previously. To accommodate the heterogeneity of OCD symptoms, and the presence of obsessions and rituals within each symptom dimension, each subscale begins with a description of the symptom dimension along with examples of representative obsessions and rituals. The examples clarify the form and function of each dimension’s fundamental obsessional fears, compulsive rituals, and avoidance behaviors. Within each symptom dimension, five items (rated 0–​4) assess the following parameters of severity (over the past month):  (a) time occupied by obsessions and rituals, (b) avoidance behavior, (c) associated distress, (d) functional interference, and (e) difficulty disregarding the obsessions and refraining from the compulsions. The DOCS subscale scores have excellent reliability in clinical samples (α = .94 to .96), and the measure converges well with other measures of OC symptoms. Practical Considerations In addition to some of the practical issues raised in the previous sections, a few considerations regarding ongoing assessment deserve comment. First, patients sometimes attempt to either minimize their OCD symptoms or make themselves look worse off than they truly are. Self-​report measures provide an easy vehicle for doing so. Such behavior might be motivated by either resistance to beginning treatment or the fear of ending treatment and terminating the therapeutic relationship. If this is suspected, it might be helpful to gain observations from significant others to provide additional data regarding symptom severity and current functioning. A separate issue is that patients may feel tempted to minimize their symptoms in order to please their therapist. Gentle, yet firm, encouragement to complete these forms for the purpose of providing important clinical information often helps. Overall Evaluation There exists a wide array of interview and self-​report measures of OCD. In most instances, scale items have been


carefully written, submitted to appropriate statistical procedures, and examined for psychometric viability using clinical or nonclinical samples. Some of these measures are “global” in that they aim to assess the broad range of OCD symptoms, whereas others focus on individual symptom dimensions such as scrupulosity (e.g., Abramowitz, Huppert, Cohen, Tolin, & Cahill, 2002) and symmetry/​ ordering concerns (e.g., Abramowitz et al., 2010; Coles, Frost, Heimberg, & Rhéaume, 2003). The Y-​BOCS severity scale is unique in that it measures the severity of OCD symptoms independent of symptom theme or the number of symptoms. Due to space limitations, the previous review of self-​report instruments was restricted to global measures of OCD that have the greatest potential (from a practical and scientific standpoint) for use in clinical and research settings. The heterogeneity of obsessions and rituals presents a major challenge to developing a concise global OCD self-​report symptom measure. Authors of such scales must strike a balance between (a) including enough items to comprehensively assess the various sorts of OCD phenomena and (b) constructing a scale that is manageable in length and therefore practical for widespread use. Whereas each of the measures discussed in this section provides an adequate self-​reported assessment of OCD, the DOCS appears to most optimally achieve this balance.
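As a final illustration of the monitoring strategy emphasized in this section—comparing periodic scores against a baseline—the sketch below logs repeated administrations and reports raw change from the intake score. It is a generic illustration only: it is not tied to any particular instrument and does not implement formal reliable-change criteria.

```python
from typing import Dict, List

def change_from_baseline(score_log: List[Dict]) -> List[Dict]:
    """Given chronologically ordered entries such as
    {"session": 0, "measure": "Y-BOCS", "score": 28}, report each later
    administration's raw change from the first (baseline) entry for that
    measure. The baseline entry itself is not included in the report."""
    baselines: Dict[str, float] = {}
    report = []
    for entry in score_log:
        measure, score = entry["measure"], entry["score"]
        if measure not in baselines:
            baselines[measure] = score
            continue
        report.append({"session": entry["session"], "measure": measure,
                       "score": score, "change": score - baselines[measure]})
    return report

log = [{"session": 0, "measure": "Y-BOCS", "score": 28},
       {"session": 6, "measure": "Y-BOCS", "score": 21},
       {"session": 12, "measure": "Y-BOCS", "score": 14}]
print(change_from_baseline(log))
# change of -7 at session 6 and -14 at session 12 relative to baseline
```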

CONCLUSIONS AND FUTURE DIRECTIONS


This chapter provides the reader with a practical guide for the comprehensive clinical evaluation of patients with obsessions, compulsions, and related phenomena. Advances in how we conceptualize OCD have been paralleled by improvements in the methods for assessment and treatment of this disorder. Although there are numerous valid and reliable instruments for assessing the signs and symptoms of OCD,1 we underscore that the proper assessment and treatment of this condition requires more than simply administering empirically supported tools. The assessment should be guided by a cognitive–behavioral theoretical framework in which obsessions and compulsions are conceptualized in terms of their function (i.e., antecedents and consequences) as opposed to their topography (form or frequency). Whereas the clinician might be intrigued by descriptions of the often remarkably senseless and bizarre symptom presentations of OCD (and legitimately so), he or she should avoid the temptation to become sidetracked by form or topography and instead keep in mind the essential features of OCD, which are that (a) obsessional thoughts and images evoke anxiety and distress and (b) avoidance and rituals serve to reduce or neutralize this distress. The successful implementation of empirically supported treatment (i.e., ERP) hinges on an assessment strategy grounded within this model. A multitrait–multimethod approach to assessment will yield the most comprehensive data regarding an individual's symptom presentation and related difficulties. Although we have reviewed both self-report and interview measures of OCD, this chapter has not focused on the measurement of traits or domains related to OCD, such as depression, general anxiety, quality of life, and global functional impairment. Nevertheless, such parameters are important to assess during the initial interview and functional assessment and also when measuring treatment outcome. Several sources detail the assessment of OCD-related phenomena (e.g., Abramowitz [2006a] and other chapters in the current volume on mood and anxiety disorders) and can be consulted for suggestions regarding specific measures to use. Certainly there is room for the development of additional measures of OCD. In particular, the development of standardized functional assessment techniques would be advantageous as long as these could remain flexible enough to accommodate the heterogeneity of the problem. Also, the assessment of children continues to lag behind advances in the assessment of adults. Although the age-downward extensions of several measures discussed here (e.g., OBQ and OCI-R) have been developed, little empirical work has appeared in the literature. Finally, it will be advantageous to empirically examine the psychometric properties of those newer diagnostic instruments developed for assessing OCD in the context of DSM-5.

1. We remind the reader that our discussion of structured diagnostic interviews is based on DSM-IV's definition of OCD, which was changed very little in the transition to DSM-5.

References

Abramowitz, J. S. (2006a). Understanding and treating obsessive–compulsive disorder: A cognitive–behavioral approach. Mahwah, NJ: Erlbaum.
Abramowitz, J. S. (2006b). Obsessive–compulsive disor-

der:  Advances in psychotherapy—​Evidence based practice. Cambridge, MA: Hogrefe & Huber. Abramowitz, J. S., & Deacon, B. J. (2006). Psychometric properties and construct validity of the obsessive–​ compulsive inventory-​revised: Replication and extension

326

Anxiety and Related Disorders

with a clinical sample. Journal of Anxiety Disorders, 20, 1016–​1035. Abramowitz, J. S., Deacon, B. J., Olatunji, B., Wheaton, M. G., Berman, N., Losardo, D., . . . Hale, L. (2010). Assessment of obsessive–​compulsive symptom dimensions: Development and evaluation of the Dimensional Obsessive–​Compulsive Scale. Psychological Assessment, 22, 180–​198. Abramowitz, J. S., Deacon, B. J., & Whiteside, S. P. (2011). Exposure therapy for anxiety:  Principles and practice. New York, NY: Guilford. Abramowitz, J. S., Franklin, M. E., Kozak, M. J., Street, G. P., & Foa, E. B. (2000). The effects of pretreatment depression on cognitive–​behavioral treatment outcome in OCD clinic patients. Behavior Therapy, 31, 517–​528. Abramowitz, J. S., Huppert, J. D., Cohen, A. B., Tolin, D. F., & Cahill, S. P. (2002). Religious obsessions and compulsions in a non-​clinical sample: The Penn Inventory of Scrupulosity (PIOS). Behaviour Research and Therapy, 40, 825–​838. Abramowitz, J. S., & Jacoby, R. J. (2014). Obsessive–​ compulsive disorder in adults. Boston, MA: Hogrefe. Abramowitz, J. S., & Jacoby, R. J. (2015). Obsessive–​ compulsive related disorders:  A critical review of the new diagnostic class. Annual Review of Clinical Psychology, 11, 165–​186. Abramowitz, J. S., Khandker, M., Nelson, C. A., Deacon, B. J., & Rygwall, R. (2006). The role of cognitive factors in the pathogenesis of obsessions and compulsions:  A prospective study. Behaviour Research and Therapy, 44, 1361–​1374. Abramowitz, J. S., Nelson, C. A., Rygwall, R., & Khandker, M. (2007). The cognitive mediation of obsessive–​ compulsive symptoms: A longitudinal study. Journal of Anxiety Disorders, 21, 91–​104. Abramowitz, J. S., Tolin, D. F., & Diefenbach, G. J. (2005). Measuring change in OCD:  Sensitivity of the Obsessive Compulsive Inventory-​ Revised. Journal of Psychopathology and Behavioral Assessment, 27, 317–​324. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Press. American Psychiatric Association. (2015). Structured Clinical Interview for DSM-​ 5 Disorders:  Clinician Version. Arlington, VA: American Psychiatric Publishing. Antony, M., Orsillo, S., & Roemer L. (Eds.). (2001). Practitioner’s guide to empirically based measures of anxiety. New York, NY: Kluwer /​Plenum. Beck, A. T. (1976). Cognitive therapy of the emotional disorders. New York, NY: International Universities Press.

Boeding, S. E., Paprocki, C. M., Baucom, D. H., Abramowitz, J. S., Wheaton, M. G., Fabricant, L. E., & Fischer, M. S. (2013). Let me check that for you: Symptom accommodation in romantic partners of adults with obsessive–​ compulsive disorder. Behaviour Research and Therapy, 51, 316–​322. Brown, T. A., & Barlow, D. H. (2014). Anxiety and Related Disorders Interview Schedule for DSM-​5–​Adult Version. New York, NY: Oxford University Press. Brown, T. A., DiNardo, P., Lehman, C., & Campbell, L. (2001). Reliability of DSM-​IV anxiety and mood disorders: Implications for the classification of emotional disorders. Journal of Abnormal Psychology, 110, 49–​58. Burns, G. L., Keortge, S., Formea, G., & Sternberger, L. (1996). Revision of the Padua Inventory of Obsessive–​ Compulsive Disorder:  Distinctions between worry, obsessions, and compulsions. Behaviour Research and Therapy, 34, 163–​173. Clark, D. A. (2004). Cognitive–​behavioral treatment of OCD. New York, NY: Guilford. Coles, M., Frost, R., Heimberg, R., & Rhéaume, J. (2003). “Not just right experiences”:  Perfectionism, obsessive–​ compulsive features and general psycho-​ pathology. Behaviour Research and Therapy, 41, 681–​700. De Silva, P., Menzies, R. G., & Shafran, R. (2003). Spontaneous decay of compulsive urges:  The case of covert compulsions. Behaviour Research and Therapy, 41, 129–​137. Deacon, B. J., & Abramowitz, J. S. (2005). The Yale–​Brown Obsessive Compulsive Scale: Factor analysis, construct validity, and suggestions for refinement. Journal of Anxiety Disorders, 19, 573–​585. DiNardo, P., Brown, T., & Barlow, D. (1994). Anxiety Disorders Interview Schedule for DSM-​ IV:  Lifetime Version (ADIS-​IV-​LV). San Antonio, TX: Psychological Corporation. Eisen, J. L., Phillips, K. A., Baer, L., Beer, D. A., Atala, K. D., & Rasmussen, S. A. (1998). The Brown Assessment of Beliefs Scale: Reliability and validity. American Journal of Psychiatry, 155, 102–​108. Eisen, J. L., Phillips, K. A., Coles, M. E., & Rasmussen, S. A. (2004). Insight in obsessive compulsive disorder and body dysmorphic disorder. Comprehensive Psychiatry, 45, 10–​15. Fairbrother, N., Newth, S., & Rachman, S. (2005). Mental pollution: Feelings of dirtiness without physical contact. Behaviour Research and Therapy, 43, 121–​130. First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. (2002). Structured Clinical Interview for the DSM-​IV Axis 1 Disorders. New  York, NY:  Biometrics Research Department, New York State Psychiatric Institute. Foa, E. B., Huppert, J. D., Leiberg, S., Langner, R., Kichic, R., Hajcak, G., & Salkovskis, P. M. (2002). The Obsessive–​ Compulsive Inventory:  Development and
validation of a short version. Psychological Assessment, 14, 485–​496. Foa, E. B., Kozak, M. J., Salkovskis, P. M., Coles, M. E., & Amir, N. (1998). The validation of a new obsessive–​ compulsive disorder scale: The Obsessive–​Compulsive Inventory. Psychological Assessment, 10, 206–​214. Follette, W., Naugle, A., & Linnerroth, P. (2000). Functional alternatives to traditional assessment and diagnosis. In M. J. Dougher (Ed.), Clinical behavior analysis (pp. 99–​ 125). Reno, NV: Context Press. Frost, R. O., & Steketee, S. (2002). Cognitive approaches to obsessions and compulsions: Theory, assessment, and treatment. Oxford, UK: Elsevier. Goodman, W. K., Price, L. H., Rasmussen, S. A., Mazure, C., Delgado, P., Heninger, G. R., & Charney, D. S. (1989). The Yale–​Brown Obsessive Compulsive Scale: Validity. Archives of General Psychiatry, 46, 1012–​1016. Goodman, W. K., Price, L. H., Rasmussen, S. A., Mazure, C., Fleischmann, R. L., Hill, C. L.,  .  .  .  Charney, D. S. (1989). The Yale–​Brown Obsessive Compulsive Scale:  Development, use, and reliability. Archives of General Psychiatry, 46, 1006–​1011. Hodgson, R., & Rachman, S. (1977). Obsessive compulsive complaints. Behaviour Research and Therapy, 15, 389–​395. Kessler, R., Berglund, P., Demler, O., Jin, R., Merikangas, K. R., & Walters, E. E. (2005). Lifetime prevalence and age-​of-​onset distributions of DSM-​IV disorders in the National Comorbidity Survey Replication. Archives of General Psychiatry, 62, 593–​602. Lapidus, K. A. B., Stern, E. R., Berlin, H. A., & Goodman, W. K. (2014). Functional neuroimaging models for obsessive–​ compulsive disorder and obsessive–​ compulsive spectrum disorders. In E. A. Storch & D. McKay (Eds.), Obsessive–​compulsive disorder and its spectrum:  A lifespan approach (pp. 363–​396). Washington, DC: American Psychological Association. McKay, D., Abramowitz, J. S., Calamari, J. E., Kyrios, M., Radomsky, A., Sookman, D.,  .  .  .  Wilhelm, S. (2004). A critical evaluation of obsessive–​ compulsive disorder subtypes:  Symptoms versus mechanisms. Clinical Psychology Review, 24, 283–​313. Mineka, S., & Zinbarg, R. (2006). A contemporary learning theory perspective on the etiology of anxiety disorders. American Psychologist, 61, 10–​26. Obsessive Compulsive Cognitions Working Group. (2001). Development and initial validation of the Obsessive Beliefs Questionnaire and the Interpretations of Intrusions Inventory. Behaviour Research & Therapy, 39, 987–​1006. Obsessive Compulsive Cognitions Working Group. (2003). Psychometric validation of the Obsessive Beliefs Questionnaire and the Interpretation of Intrusions Inventory: Part I. Behaviour Research & Therapy, 41, 863–​878.


Obsessive Compulsive Cognitions Working Group. (2005). Psychometric Validation of the Obsessive Belief Questionnaire and Interpretation of Intrusions Inventory: Part  2. Factor analyses and testing of a brief version. Behaviour Research & Therapy, 43, 1527–​1542. Okuda, M., & Simpson, H. B. (2015). Obsessive–​compulsive disorder. In K. A. Phillips & D. J. Stein (Eds.), Handbook on obsessive–​compulsive and related disorders (pp. 25–​ 56). Washington, DC: American Psychiatric Publishing. Olatunji, B. O., Davis, M. L., Powers, M. B., & Smits, J. J. (2013). Cognitive–​ behavioral therapy for obsessive–​ compulsive disorder: A meta-​analysis of treatment outcome and moderators. Journal of Psychiatric Research, 47, 33–​41. Ravie Kishore, V., Samar, R., Janardhan Reddy, Y., Chandrasekhar, C., & Thennarasu, K. (2004). Clinical characteristics and treatment response in poor and good insight obsessive–​compulsive disorder. European Psychiatry, 19, 202–​208. Rosenfeld, R., Dar, R., Anderson, D., Kobak, K., & Greist, J. (1992). A computer-​administered version of the Yale–​ Brown Obsessive Compulsive Scale. Psychological Assessment, 4, 329–​332. Salkovskis, P. M. (1996). Cognitive–​behavioral approaches to the understanding of obsessional problems. In R. Rapee (Ed.), Current controversies in the anxiety disorders (pp. 103–​133). New York, NY: Guilford. Salkovskis, P. M. (1999). Understanding and treating obsessive–​compulsive disorder. Behaviour Research and Therapy, 37, S29–​S52. Schwartz, S. A., & Abramowitz, J. S. (2003). Are non-​ paraphilic sexual addictions a variant of obsessive–​ compulsive disorder? A  pilot study. Cognitive and Behavioral Practice, 10, 373–​378. Shafran, R., Thordarson, D. S., & Rachman, S. (1996). Thought-​action fusion in obsessive compulsive disorder. Journal of Anxiety Disorders, 10, 379–​391. Skoog, G., & Skoog, I. (1999). A 40-​year follow-​up of patients with obsessive–​compulsive disorder. Archives of General Psychiatry, 56, 121–​127. Speisman, B. B., Storch, E. A., & Abramowitz, J. S. (2011). Postpartum obsessive–​compulsive disorder. Journal of Obstetric, Gynecologic, & Neonatal Nursing:  Clinical Scholarship for the Care of Women, Childbearing Families, & Newborns, 40, 680–​690. Steketee, G., Frost, R., & Bogart, K. (1996). The Yale–​ Brown Obsessive Compulsive Scale:  Interview versus self-​report. Behaviour Research and Therapy, 34, 675–​684. Taylor, S., Abramowitz, J. S., & McKay, D. (2007). Cognitive–​ behavioral models of obsessive–​compulsive disorder. In M. M. Antony, C. Purdon, & L. Summerfeldt (Eds.), Psychological treatment of obsessive–​ compulsive disorder: Fundamentals and beyond (pp. 9–​29). Washington, DC: American Psychological Association Press.


Taylor, S., Thordarson, D., & Sochting, I. (2002). Obsessive–​ compulsive disorder. In M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (pp. 182–​214). New York, NY: Guilford. Thordarson, D., Radomsky, A., Rachman, S., Shafran, R., Sawchuck, C., & Hakstain, A. (2004). The Vancouver Obsessional Compulsive Inventory. Behaviour Research and Therapy, 42, 1289–​1314. van Oppen, P., Hoekstra, J., & Emmelkamp, P. (1995). The structure of obsessive–​compulsive symptoms. Behaviour Research and Therapy, 33, 15–​23. Watson, D., & Wu, K. (2005). Development and validation of the Schedule of Compulsions, Obsessions,

and Pathological Impulses (SCOPI). Assessment, 12, 50–​65. Wilhelm, S., & Steketee, G. (2006). Cognitive therapy for obsessive–​compulsive disorder. Oakland, CA: New Harbinger. Williams, J., Gibbon, M., First, M., Spitzer, R., Davies, M., Borus, J., et  al. (1992) The Structured Clinical Interview for DSM-​III-​R (SCID) II:  Multisite test–​ retest reliability. Archives of General Psychiatry, 49, 630–​636. Wu, K., & Watson, D. (2003). Further investigation of the Obsessive Compulsive Inventory: Psychometric analysis in two nonclinical samples. Journal of Anxiety Disorders, 17, 305–​319.

16

Post-Traumatic Stress Disorder in Adults

Samantha J. Moshier
Kelly S. Parker-Guilbert
Brian P. Marx
Terence M. Keane

Post-traumatic stress disorder (PTSD) was first introduced as a diagnosis in the third edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-III; American Psychiatric Association [APA], 1980) and was conceptualized as a relatively rare response to extraordinary and severe stressors, such as war, violent acts, vehicular or industrial accidents, sexual assault, and other disasters or events that are outside the range of usual human experience. Today, traumatic events and PTSD are viewed as worldwide phenomena that are prevalent and cross all subgroups of the population. Epidemiological studies have documented the prevalence of PTSD, providing information on rates of exposure to trauma, the distribution of PTSD within different segments of the population (adults and children, males and females, etc.), and those factors that affect the onset and course of PTSD. Recent events, including the mass shooting in Orlando, Florida, in 2016, the terrorist attacks on September 11, 2001, and the 2010 Haiti earthquake, underscore the importance of establishing best practices for responding to disasters and mass violence. It is essential for clinicians to use gold-standard methods to diagnose PTSD and related psychiatric conditions, monitor progress throughout treatment, and measure treatment outcomes. A multitude of PTSD measures are available, and a clinician seeking a tool to assess PTSD may find this array of measures overwhelming. Therefore, the purpose of this chapter is to discuss available methods for assessing PTSD and to make recommendations regarding their suitability in a clinical context with a variety of populations. Our hope is that clinicians will adopt state-of-the-science methods for diagnosing and monitoring PTSD and trauma symptoms in their work with traumatized adults. To provide a context for this discussion, we begin with a brief review of diagnostic considerations related to PTSD, the epidemiology of trauma, comorbid conditions, etiology, and prognosis.

NATURE OF PTSD

Diagnostic Considerations and Associated Features

The diagnostic criteria for PTSD were revised substantially for the fifth edition of the DSM (DSM-5; APA, 2013). Among the more significant changes is the removal of PTSD from the anxiety disorders category and its recategorization in a newly developed diagnostic class for trauma- and stressor-related disorders. Other changes to the diagnostic criteria include a modification to the definition of a potentially traumatic event (Criterion A); a shift from three symptom clusters to four; and the addition of three symptoms to the previously included 17 PTSD symptoms to reflect "reckless or self-destructive behavior," "distorted blame of self or others," and "persistent negative emotional states (e.g., fear, anger, guilt, shame, sadness)." Furthermore, the wording of some other symptoms was substantially modified to better define them.
The first PTSD symptom cluster (Criterion B) in DSM-5 is characterized by re-experiencing, or reliving, some or all of the traumatic event through recurring
unwanted memories, vivid and intrusive nightmares, flashbacks, and physiological reactions and psychological distress when confronted with reminders of the trauma. The second symptom cluster (Criterion C) involves avoidance of stimuli (including people, places, cognitions, etc.) that are associated with and remind the individual of the traumatic event. Consequently, the lives of those with PTSD can become increasingly constricted as they withdraw from relationships, routine activities, or contexts that serve as reminders. The third symptom cluster (Criterion D) includes symptoms that are characterized by negative alterations in cognitions and mood. Examples of such symptoms are an inability to remember important aspects of the traumatic event; persistent and exaggerated negative beliefs about oneself, others, or the world; distorted cognitions about the cause or consequences of a traumatic event; a persistent negative emotional state; loss of interest or participation in significant activities; a sense of detachment or estrangement from others; or an inability to feel positive emotions such as love or happiness. The fourth symptom cluster (Criterion E) is characterized by marked alterations in arousal and reactivity and includes irritable or angry behavior, reckless or self-​destructive behavior, impaired concentration and memory, difficulty sleeping, an exaggerated startle response, and hypervigilance. Symptoms of PTSD must be present for more than 1 month and cause clinically significant distress or impairment in functioning. In cases in which the full diagnostic criteria for PTSD are not met until at least 6 months following the traumatic event, a diagnosis of PTSD with delayed expression is conferred. DSM-​5 criteria for PTSD also include a diagnostic specifier indicating the presence or absence of dissociative symptoms (defined as persistent or recurrent symptoms of depersonalization or derealization). In order to be diagnosed with PTSD, an individual must experience an event that meets the definition of a Criterion A stressor; endorse at least one symptom from Criterion B, at least one symptom from Criterion C, at least two symptoms from Criterion D, and at least two symptoms from Criterion E; experience the symptoms for 1  month or longer; and experience either clinically significant distress or functional impairment resulting from the symptoms. In contrast to the DSM-​5 criteria, a substantially narrowed definition of PTSD has been proposed for the 11th edition of the International Classification of Diseases (ICD-​11). The PTSD Working Group for ICD-​11 has selected diagnostic criteria that are assumed to be specific to PTSD in an effort to improve diagnostic accuracy and reduce comorbidity (Maercker et  al., 2013). The newly

proposed criteria include six symptoms and require the presence of at least one re-​experiencing symptom (distressing dreams or dissociative reactions), at least one symptom of avoidance (of internal or external reminders of the event), and at least one symptom of hyperarousal (exaggerated startle response or hypervigilance). Preliminary research indicates that this narrower definition of PTSD does not reduce the rates of common comorbidities and may lead to a substantial reduction in the prevalence of PTSD compared with both ICD-​10 and DSM-​5, suggesting that the criteria may fail to capture some individuals with clinically significant PTSD symptoms (Wisco et al., 2016). The physical, emotional, societal, and interpersonal costs of PTSD are substantial, making PTSD a major public health issue. Individuals diagnosed with PTSD are at an increased risk of developing chronic medical conditions (Edmondson, Kronish, Shaffer, Falzon, & Burg, 2013; Wolf et al., 2016) and are more likely to be physically inactive, smoke, and be nonadherent with medications (Zen, Whooley, Zhao, & Cohen, 2012). Compared to individuals in the general population, people with PTSD are less likely to be married and are more likely to divorce (Breslau et al., 2011), and they have greater levels of discord and physical aggression perpetration in their intimate relationships (Taft, Watkins, Stafford, Street, & Monson, 2011). PTSD is also associated with unemployment and disability (Desai, Spencer, Gray, & Pilver, 2010; Kimerling et al., 2009), homelessness, and money mismanagement (Elbogen, Sullivan, Wolfe, Wagner, & Beckham, 2013). Epidemiological Evidence When PTSD was initially conceptualized, both exposure to traumatic events and the disorder were considered relatively rare. However, researchers have reported that exposure to a traumatic event is not uncommon. A recent study of a nationally representative sample found that when using the DSM-​5 Criterion A  definition, 68.6% of adults reported exposure to at least one potentially traumatic event (Goldstein et  al., 2016). Findings from prior studies suggest that a trauma can activate PTSD in individuals who are psychologically vulnerable but that, fortunately, the majority of people who survive a traumatic event will not develop PTSD or any other form of psychopathology (e.g., depression, anxiety, and substance abuse disorder). Nonetheless, the likelihood of developing PTSD increases with repeated exposure to potentially traumatic events and with exposure to traumatic events


characterized by assaultive violence (Goldstein et  al., 2016; Kessler et al., 2014), regardless of personal resources or emotional stability. The lifetime prevalence of PTSD in the general population has been estimated to be between 6.8% and 9.5% in epidemiological surveys based on DSM-​IV (APA, 1994)  and DSM-​ IV-​ TR (APA, 2000; Breslau, Davis, Andreski, & Peterson, 1991; Kessler, Berglund, Demler, Jin, & Walters, 2005). Women consistently demonstrate higher lifetime prevalence of PTSD compared to men; for instance, an analysis of 70,000 adults across 15 countries revealed the lifetime odds of PTSD to be 2.6 times higher in women than in men (Seedat et al., 2009). With the changes made to the diagnostic criteria in DSM-​ 5, there has been concern regarding the impact these changes may have on prevalence (Hoge, 2015). Studies thus far do show some discordance between the two definitions (Hoge, Riviere, Wilk, Herrell, & Weathers, 2014); however, the majority of individuals who meet criteria for PTSD do so when using both DSM-​IV and DSM-​5 definitions (e.g., Kilpatrick et al., 2013). Furthermore, most studies have found very similar prevalence between the two sets of diagnostic criteria (Elhai & Palmieri, 2011; Hoge et al., 2014; Kilpatrick et al., 2013). In the first large-​scale nationally representative survey to assess DSM-​5-​defined PTSD, the National Epidemiologic Survey on Alcohol and Related Conditions-​III found a lifetime prevalence of 6.1% (Goldstein et al., 2016), which is only slightly lower than the 6.8% found using DSM-​IV-​TR in the National Comorbidity Replication Survey (Kessler et al., 2005). Head-​to-​head comparisons of the two criteria sets indicate that the slightly lower rates in prevalence found with DSM-​5 are due to the more restrictive definition of a traumatic event as well as the requirement of at least one avoidance symptom (Kilpatrick et al., 2013). Certain subgroups within the population, including combatants, are at increased risk for exposure to trauma. Researchers have examined the impact of war on the psychological functioning of soldiers because deployment, combat, physical injury, and readjustment to civilian life can be intensely stressful. The National Vietnam Veterans Readjustment Study (NVVRS; Kulka et al., 1990) was the first systematic study of combat-​related PTSD, and its findings were striking. The authors reported that 64% of Vietnam veterans were exposed to one or more traumatic events in their lives. More than 15% of males and 9% of females serving in Vietnam met the criteria for current PTSD. More important, these rates were 5 to 10 times higher than found for Vietnam-​era veteran and civilian comparison subjects, highlighting the psychological toll


of war. Skeptics, however, have argued that these results are inflated. In response, Dohrenwend et al. (2006) reanalyzed a sample of the NVVRS data and found little evidence of malingering; according to their results, 9.1% of the veterans met criteria for current PTSD and 18.7% met criteria for lifetime PTSD when using the most stringent criteria for confirming traumatic war experiences. The United States’ involvement in the wars in Iraq and Afghanistan has placed PTSD in the national spotlight as reports of soldiers returning from deployment reveal significant mental health problems, including PTSD. Studies vary widely in reported prevalence rates. A recent meta-​analysis involving more than 4.9 million Operation Iraqi Freedom (OIF) and Operation Enduring Freedom (OEF) veterans found a prevalence of 23% (Fulton et al., 2015). This estimate was based largely on studies of OIF/​ OEF veterans enrolled in the Veterans Affairs health care system; however, prevalence appears to be lower in population-​based, non-​treatment-​seeking samples, which include a high proportion of military support personnel. A meta-​analysis by Kok, Herrell, Thomas, and Hoge (2012) of population-​based studies found a lifetime prevalence of 5.5% among the larger population of OIF/​OEF veterans and 13.2% in combat infantry personnel. When studied as a whole, U.S. veteran samples have been found to have a slightly higher prevalence of PTSD relative to community samples. For instance, data from the National Health and Resilience in Veterans Study (NHRVS), a nationally representative survey of U.S. veterans, showed lifetime and current prevalence of DSM-​ 5-​based probable PTSD of 8.0% and 4.8%, respectively (Wisco et al., 2014). Civilians exposed to war or mass violence are also at risk for developing PTSD. The United Nations reported that in 2015, 65.3 million people were forcibly displaced due to violence, persecution, and human rights violations (United Nations High Commissioner for Refugees, 2016). A meta-​analysis of studies involving conflict-​affected persons throughout the world found an average prevalence of PTSD of 31% (Steel et al., 2009). Results also showed that experiencing torture is strongly related to being diagnosed with PTSD. The cumulative nature of potentially traumatic events was also associated with greater prevalence of PTSD, but it was even more strongly related to major depressive disorder (MDD). Notably, prevalence of PTSD was significantly lower among those who had permanently resettled compared with those living in refugee camps or other temporary settings (Steel et al., 2009). Researchers have also examined the psychological impact of the terrorist attacks that occurred on September



11, 2001. A  recent review of the literature found a high prevalence of PTSD in the immediate aftermath of the attacks, particularly among those living in the New York City area (ranging from 11% to 20%; Neria, DiGrande, & Adams, 2011). Studies show that the prevalence of PTSD declined with time in the majority of longitudinal studies available (Neria et al., 2011) but actually increased over the course of two studies of highly exposed populations such as rescue and recovery workers or World Trade Center evacuees (Berninger et al., 2010; Brackbill et al., 2009). Increases in PTSD prevalence can also be found in the wake of natural disasters. For example, in 2005, Hurricane Katrina struck the Gulf Coast, leaving hundreds of thousands of individuals homeless and displaced. The psychological impact of this disaster was also great, with prevalence of serious psychological problems essentially doubling from the predisaster levels (Kessler, Galea, Jones, & Parker, 2006). Similarly, prevalence of PTSD surpassed 20% among samples of survivors of the 2004 tsunami affecting Sri Lanka, Thailand, India, and Indonesia (Hollifield et  al., 2008)  and the 2008 earthquake in the Sichuan province of China (Kun et al., 2009). Comorbidity Exposure to a traumatic event is a risk factor not only for PTSD but also for a number of other mental disorders and conditions. Goldstein and colleagues (2016) found that when adjusting for sociodemographic factors, a DSM-​5 diagnosis of PTSD was associated with increased likelihood for every mood, anxiety, personality, and substance use disorder assessed (however, notably, PTSD was not associated with alcohol use disorder in this study). Similarly, in a nationally representative epidemiological survey of U.S. adults using DSM-​IV-​TR criteria, individuals with a lifetime diagnosis of PTSD had elevated lifetime prevalences of mood disorders (62%), anxiety disorders (59%), and substance or alcohol use disorders (47%; Pietrzak, Goldstein, Southwick, & Grant, 2011)  compared with individuals in the general population. This pattern of comorbidity has also been found in veteran samples; for instance, Wisco and colleagues (2014) found that lifetime probable PTSD diagnosed using DSM-​5 criteria was associated with increased risk of every psychiatric disorder assessed (and was most strongly associated with mood and anxiety disorders). Personality disorders are also highly comorbid with PTSD; for example, in one DSM-​ IV-​based epidemiological study of U.S.  adults, 24% of those individuals with PTSD also met diagnostic criteria for borderline personality disorder (Pagura et al., 2010).
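Associations of this kind, including the suicide-attempt odds ratios reported in the next paragraph, are conventionally summarized as odds ratios computed from a 2 × 2 table. The short sketch below uses invented counts purely for illustration; it does not reproduce data from any study cited in this chapter.

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    # Odds of the outcome among the exposed divided by the odds among the unexposed.
    return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)

# Hypothetical counts for lifetime suicide attempts with and without PTSD,
# sized to yield an odds ratio in the neighborhood of the values reported below.
with_ptsd = {"attempt": 50, "no_attempt": 450}
without_ptsd = {"attempt": 100, "no_attempt": 4900}
print(round(odds_ratio(with_ptsd["attempt"], with_ptsd["no_attempt"],
                       without_ptsd["attempt"], without_ptsd["no_attempt"]), 1))  # 5.4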

Studies also commonly demonstrate a significant and robust association between PTSD and suicidal behaviors. A nationally representative study of U.S. adults found that individuals with a lifetime diagnosis had an increased likelihood of a past suicide attempt (Odds Ratio [OR] = 5.1) compared to those without PTSD (Pietrzak et al., 2011). Similarly, the NHRVS survey of U.S. veterans found elevated risk of past suicide attempts (OR = 11.8) and current suicidal ideation (OR = 9.7) in those with probable lifetime PTSD (Wisco et al., 2014). In addition, the presence of comorbid conditions amplifies suicide risk in individuals with PTSD (Gradus et al., 2010; Jakupcak et al., 2009; Oquendo et  al., 2003). PTSD and its associated psychiatric comorbidities (e.g., MDD) are characterized by or found to be associated with key aspects of the established psychological theories of suicidal behavior (O’Connor & Nock, 2014), including difficulties problem solving and coping (Guerreiro et al., 2013), memory and attentional biases (Cha, Najmi, Park, Finn, & Nock, 2010), and social isolation (Haw & Hawton, 2011). Empirical studies also suggest that PTSD may be one of a group of disorders that provides the necessary activation and energy to move from a state of contemplating suicide to one of acting on suicidal thoughts, putting together a suicide plan, and making preparations to die by suicide (Nock et al., 2009). Traumatic brain injury (TBI) is also highly comorbid with PTSD in veterans returning from recent conflicts in Iraq and Afghanistan. TBI and PTSD are considered to be the “signature injuries” of these conflicts, and more than 340,000 veterans have been diagnosed with some form of TBI, most often mild TBI (mTBI), since 2000 (Defense and Veteran Brain Injury Center, 2016). Studies show that veterans experiencing mTBI are more likely to suffer from PTSD (Hoge et al., 2008; Schneiderman, Braver, & Kang, 2008), and one recent prospective study found that TBI during the most recent deployment was the strongest predictor of the development of PTSD, even when accounting for combat intensity, pre-​deployment symptoms, and previous TBI (Yurgil et al., 2014). Furthermore, the association between the two disorders remains even when accounting for overlapping symptoms (e.g., insomnia, anger, and difficulty concentrating; Schneiderman et al., 2008). Veterans comorbid for mTBI and PTSD also often experience more severe PTSD symptoms (Spira, Lathan, Bleiberg, & Tsao, 2014), and one prospective study of U.S. Army soldiers found that PTSD, but not TBI, was significantly associated with decrements in neuropsychological performance (Vasterling et al., 2012). Due to the overlapping symptom profiles between these conditions, they require careful assessment. Even with thorough


clinical interviewing and the use of standardized diagnostic measures for assessment of PTSD, it may be difficult to arrive at a clear conclusion regarding the etiological source of post-​ concussive symptoms (Hoge, Goldberg, & Castro, 2009). For a comprehensive discussion of the interaction of PTSD and mTBI, see Vasterling, Bryant, and Keane (2012). The presence of one or multiple comorbid conditions with PTSD can complicate the assessment process. The overlap among symptoms of PTSD with conditions such as MDD, mTBI, substance use disorders, and anxiety disorders requires thorough assessment in order to accurately attribute an individual’s symptoms to a specific disorder. A comprehensive discussion of assessment measures of comorbid conditions is beyond the scope of this chapter, although we note the importance of screening for the most common co-​occurring disorders, at a minimum, during the diagnostic assessment process, because treatment of PTSD often is more effective when the comorbid conditions are also addressed (e.g., Shalev, Friedman, Foa, & Keane, 2000). Comprehensive diagnostic assessment measures, such as a structured or semi-​ structured diagnostic interview, facilitate identification and diagnosis of comorbid conditions that can be incorporated into the therapeutic plan. We note that rigorous assessment and ongoing monitoring of substance use disorders is extremely important in planning for trauma-​ focused treatment; if an individual is using substances to self-​ medicate for PTSD symptoms, that individual might be at greater risk for increased substance misuse in response to distress related to therapy content or process. See the chapters on substance abuse (Chapter  17) and depression (Chapter  7) in this volume for a more thorough discussion on how to efficiently assess these conditions when they occur. In addition, the strong salience of PTSD symptoms, such as nightmares or flashbacks, may also lead clinicians to overlook the presence of additional disorders that may be important for case conceptualization and targeting in treatment. Furthermore, the lack of comprehensive assessment of comorbidities may also lead to misspecification of treatment targets. Although current guidelines for PTSD generally recommend treating PTSD and psychiatric comorbidities concurrently, first-​ line PTSD treatments may need to be delayed or referral to specialty care may be necessary for individuals with high levels of suicidality or severe co-​occurring disorders (e.g., severe substance use disorders and psychotic disorders; U.S. Department of Veterans Affairs/​U.S. Department of Defense [VA/​DOD] Management of Post-​Traumatic Stress Working Group,


2010). Thus, a sound understanding of these co-​occurring psychiatric conditions and their manifestation in the presence of PTSD is essential when working with members of this patient population. Etiology PTSD emerges from a complex chain of events that begins with psychological and biological predispositions and follows a precipitating traumatic event that leaves an individual with intense and distressing emotions (Keane & Barlow, 2002; Keane, Marshall, & Taft, 2006). Early theories of the development and maintenance of PTSD focused on the application of classical conditioning principles (Keane, Fairbank, Caddell, Zimering, & Bender, 1985). In this model, the individual develops a learned emotional response in the wake of a traumatic event. This learned response is activated during exposure to situations that symbolize or resemble the traumatic event, including cognitions, feelings, and memories of the actual traumatic event. More recent theories of PTSD have incorporated individuals’ cognitive interpretations of the trauma, as well as preexisting beliefs about one’s own competence and safety (Foa & Rothbaum, 1998). For example, Ehlers and Clark (2000) discussed the role of negative appraisals related to the trauma and posited that PTSD results from the lack of integration of trauma memories with autobiographical information. Brewin and colleagues (Brewin, Dalgleish, & Joseph, 1996; Brewin, Gregory, Lipton, & Burgess, 2010) added to this by incorporating neurocognitive aspects of memory into Ehlers and Clark’s theory. They proposed two types of memory: situationally accessible memories (SAM), which are not verbally accessible and cannot be consciously remembered, and verbally accessible memories (VAM), which resemble declarative memories and can be consciously remembered. Brewin and colleagues have suggested that traumatic memories are largely stored in SAM and are therefore difficult to integrate. In addition to the difficulty integrating the trauma memory with one’s autobiographical memory outlined by Ehlers and Clark and also Brewin and colleagues (1996, 2010), other factors can contribute to the development of PTSD. For example, psychological and biological vulnerabilities (e.g., family history of psychopathology, previous trauma history, and genetic factors), poor coping skills, and/​or inadequate social supports can all contribute to the development of the disorder. For a more comprehensive review of the theories of PTSD, see Green, Marx, and Keane (2017) or Bovin, Wells, Rasmusson, Hayes, and Resick (2014).



Treatment and Prognosis Treatment programs to address symptoms of PTSD were developed in response to the influx of returning Vietnam veterans and rape survivors with psychological difficulties. Early strategies were based on the conceptualization of PTSD as a disorder arising from classical conditioning, and therefore targeted avoidance and involved direct therapeutic exposure to the traumatic memory (Fairbank & Keane, 1982; Keane & Kaloupek, 1982; Keane et al., 1985, 1989). As treatment strategies have evolved with time, exposure to the trauma memory remains a common element across several psychotherapies for PTSD. Currently, the most widely studied and recommended treatments for PTSD are two cognitive–​ behavioral approaches:  cognitive processing therapy (CPT) and prolonged exposure therapy (PE). These treatments are recommended as first-​line therapies in practice guidelines issued by the Institute of Medicine (2008), the International Society for Traumatic Stress Studies (Foa, Keane, Friedman, & Cohen, 2008), and the VA/​DOD (2010). Research suggests that CPT and PE yield large effects relative to control conditions (Chard, 2005; Forbes et  al., 2012; Powers, Halpern, Ferenschak, Gillihan, & Foa, 2010). Furthermore, these treatments have been shown to be efficacious across a range of populations with comorbid conditions including mTBI, MDD, and substance use disorders (Kaysen et al., 2014; van Minnen, Zoellner, Harned, & Mills, 2015). Yet despite their promise, a substantial proportion of individuals who complete PE and CPT continue to experience clinically significant symptoms, particularly veterans and active duty service members (Steenkamp, Litz, Hoge, & Marmar, 2015). Furthermore, high rates of attrition prevent many from receiving a full course of treatment for PTSD across therapeutic modalities (Imel, Laska, Jakupcak, & Simpson, 2013). These limitations speak to the importance of developing and evaluating alternative treatment options to CPT and PE, such as Seeking Safety (Najavits, 2002), which addresses symptoms of PTSD and substance use simultaneously, or Written Exposure Therapy (Sloan, Marx, Bovin, Feinstein, & Gallagher, 2012), which uses an abbreviated written disclosure protocol. In terms of prognosis, the clinical course of PTSD is highly variable, but for many, the disorder is a chronic condition. In a large meta-​analysis of more than 80,000 patients with PTSD, spontaneous remission occurred within 40 months of the index trauma in less than half of cases (Morina, Whicherts, Lobbrecht, & Priebe, 2014). Similarly, a review of longitudinal studies of PTSD found

that approximately 40% of individuals experienced a chronic course (Santiago et al., 2013). Research examining the trajectories of PTSD symptoms over time suggests that a number of risk factors, patient characteristics, and treatment-​related variables may contribute to the variation in course and chronicity (Dickstein, Suvak, Litz, & Adler, 2010; Orcutt, Bonanno, Hannan, & Miron, 2014). Factors that may contribute to the chronic course of PTSD include low rates of treatment seeking and a delay in obtaining treatment; in one recent epidemiological survey, 60% of participants with a lifetime diagnosis had received treatment, and the average length of time between the traumatic event and treatment was 4.5 years (Goldstein et al., 2016). ASSESSMENT OF PTSD

Since the first mention of PTSD in the DSM, there has been excellent progress in developing sound measures to assess trauma symptoms and PTSD in adults (Keane & Barlow, 2002; Keane, Weathers, & Foa, 2000; Weathers, Keane, & Davidson, 2001). Given the limitations of any single assessment measure, a comprehensive assessment of PTSD should use a multimethod approach (Bovin, Marx, & Schnurr, 2015; Weathers, Keane, & Foa, 2009), which may include use of a structured or semi-​structured diagnostic interview to assess PTSD and other psychiatric comorbidities, self-​ report measures of symptom severity and psychosocial functioning, examination of medical records and collateral source information, and assessment of psychophysiological reactivity. Practice guidelines issued by both the VA/​DOD (2010) and the International Society for Traumatic Stress Studies (Foa et al., 2008) present recommendations for assessment of PTSD in detail, with both encouraging the use of multimethod assessment that includes psychometrically sound structured or semi-​structured interviews and self-​report measures. Multimethod approaches to PTSD assessment may not be feasible for many clinicians. Therefore, when selecting PTSD assessment instruments, it is especially important that clinicians consider the specific assessment questions and goals. The objective of many clinicians is to diagnose a patient by conducting an evaluation that includes a differential diagnosis, a functional assessment, and the collection of other related data that can be helpful in case conceptualization, as well as treatment planning. Other practitioners may be involved in forensic assessments or compensation evaluations in which diagnostic


accuracy is paramount. Researchers involved in epidemiological or prevalence studies may be interested in the extent to which PTSD is diagnosed among study participants, the risk factors associated with the condition, and the occurrence of comorbid psychiatric conditions. These different assessment contexts require different assessment approaches. In this chapter, we provide an overview of some of the most commonly used diagnostic interviews and self-​ report measures for the assessment of PTSD, and we review their utility for diagnostic purposes, case conceptualization, treatment planning, and treatment monitoring and outcome. Structured and semi-​structured diagnostic interviews are considered to be the gold standard for diagnosing PTSD and should be used whenever possible, particularly when assessing diagnostic status. The structured nature of the interview ensures a higher degree of clinical accuracy and reliability, whereas flexibility allows for the use of clarifying and follow-​up questions, which will lessen misinterpretation of questions by respondents or minimization or exaggeration in reporting. Self-​report measures provide information on the presence or absence of PTSD and the severity of PTSD symptoms. Several measures provide specific cut-​offs that are indicative of a diagnosis of PTSD, whereas the majority incorporate continuous indicators of symptom severity. In general, self-​ report measures are more time-​and cost-​efficient than diagnostic interviews and are of particular utility in clinical settings in which a structured interview is not feasible or practical. However, the validity of the data captured by these measures depends on the extent to which the patient understands and answers the questions accurately (Bovin, Marx, & Schnurr, 2015). Assessment instruments reflecting the updated PTSD diagnostic criteria are in varying stages of development and psychometric testing. The evidence base for DSM-​5 commensurate measures is still relatively limited compared with the wealth of data available for many DSM-​IV-​based assessment tools. This may pose a challenge for clinicians seeking to select measures that are grounded in current definitions of PTSD and also have a strong evidence base. Throughout the remainder of this chapter, we describe the state of the research on DSM-​5-​based assessment tools for PTSD; however, we expect that psychometric support for many of the DSM-​5 measures presented here will expand rapidly in the next few years. As scientific advances have strengthened the biological understanding of PTSD, an increasing focus has been placed on using this knowledge to inform assessment.
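When the assessment question is simply whether a patient meets diagnostic criteria, the DSM-5 scoring rule summarized earlier in the chapter reduces to a count-based check. The sketch below is ours and is not part of any published instrument; it only encodes the symptom-count logic (Criterion A event; at least one Criterion B and one Criterion C symptom; at least two Criterion D and two Criterion E symptoms; duration of a month or more; and clinically significant distress or impairment).

def meets_dsm5_symptom_counts(criterion_a_event, b_count, c_count, d_count, e_count,
                              duration_months, distress_or_impairment):
    # Returns True when the DSM-5 symptom-count thresholds described in this chapter are met.
    return (criterion_a_event
            and b_count >= 1
            and c_count >= 1
            and d_count >= 2
            and e_count >= 2
            and duration_months >= 1
            and distress_or_impairment)

# Example: one intrusion and one avoidance symptom, two negative-mood symptoms,
# two arousal symptoms, present for three months with functional impairment.
print(meets_dsm5_symptom_counts(True, 1, 1, 2, 2, 3, True))  # True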


Before discussing more traditional methods used to assess and diagnose PTSD, we briefly consider the role of biologically based assessment techniques. Biologically Based Assessment of PTSD During the past 20 years, research on biologically based measures of PTSD has established a foundation for a psychobiological description of PTSD. A  substantial literature on the psychophysiology of PTSD has developed, utilizing measures of heart rate, skin conductance, event-​ related potentials, electromyography reactivity, and other biological indicators (for a review, see Pitman et al., 2012). Results of a meta-​analysis of more than 1,000 adults with PTSD suggest the disorder is characterized by a heightened resting physiological state, strong physiological and emotional reactivity to general and idiographic trauma cues, and exaggerated startle response (Pole, 2007). Although psychophysiological assessment can provide unique information, widespread use of this approach in a clinical environment is not anticipated because it is expensive and requires equipment and specialized training. In the majority of cases, more time-​and cost-​efficient methods of assessment, such as diagnostic interviews or self-​report measures, are more than adequate. Research also indicates that individuals with PTSD report their physiological arousal response to trauma-​ related cues with relative accuracy, as evidenced by a significant association between subjective distress ratings and physiological measures of skin conductance and heart rate (Marx et  al., 2012). Because psychophysiological methods are not accessible to the majority of clinicians, we do not discuss these methods in detail in this chapter. We refer the interested reader to Orr, Metzger, Miller, and Kaloupek (2004) for an excellent review. Efforts are also underway to understand the neural correlates of PTSD. Researchers have found PTSD to be associated with alterations in both structure and function in areas of the brain associated with emotional reactivity, fear conditioning, emotion regulation, and episodic memory (for a detailed review, see Pitman et al., 2012). PTSD is also characterized by changes in a range of other physiological functions, such as hypothalamic–​pituitary axis functioning and pro-​inflammatory immune response (also reviewed by Pitman et al., 2012). As understanding of the neurobiology of PTSD increases, efforts are being made to identify biomarkers that might aid in more objective assessment of PTSD. Promising results include the recent use of resting-​state functional neuroimaging to distinguish



TABLE 16.1 Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
CAPS-IV | E | E | E | E | E | E | E | E | ✓
CAPS-5 | A | G | E | A | E | E | A | A | ✓a
SCID-IV | E | E | A | A | G | G | E | G | ✓
SCID-5 | NR | NR | G | NR | G | NR | NR | A | ✓a
PSS-I-IV | G | G | E | G | G | G | G | G |
PSSI-5 | G | G | G | A | G | G | G | A |
ADIS-IV | G | G | G | A | G | G | G | G |
ADIS-5 | NR | NR | NR | NR | G | NR | NR | A |
CIDI | E | NR | E | A | G | G | E | NA |
IES-R | G | G | NA | A | G | G | G | G |
Mississippi | E | E | NA | G | E | E | G | G |
PDS-IV | E | E | NA | G | G | E | E | G | ✓b
PDS-5 | G | E | NA | A | G | G | G | A | ✓a
PCL-IV | G | E | NA | G | G | E | E | G | ✓b
PCL-5 | A | E | NA | A | G | G | A | A | ✓a

a Limited data available for these measures; recommendations have been made tentatively based on available data and the strong psychometric support for previous versions of these measures.
b Self-report measures used as diagnostic instruments should include explicit assessment of the Criterion A event in addition to assessment of current DSM-5 symptoms.

Note: CAPS-IV = Clinician-Administered PTSD Scale for DSM-IV; CAPS-5 = Clinician-Administered PTSD Scale for DSM-5; SCID-IV = Structured Clinical Interview for DSM-IV; SCID-5 = Structured Clinical Interview for DSM-5; PSS-I-IV = PTSD Symptom Scale Interview for DSM-IV; PSSI-5 = PTSD Symptom Scale Interview for DSM-5; ADIS-IV = Anxiety Disorders Interview Schedule for DSM-IV; ADIS-5 = Anxiety Disorders Interview Schedule for DSM-5; CIDI = Composite International Diagnostic Interview; IES-R = Impact of Event Scale-Revised; Mississippi = Mississippi Scale for Combat-Related PTSD; PDS-IV = Posttraumatic Diagnostic Scale for DSM-IV; PDS-5 = Posttraumatic Diagnostic Scale for DSM-5; PCL-IV = PTSD Checklist for DSM-IV; PCL-5 = PTSD Checklist for DSM-5; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

between PTSD and trauma-​exposed control cases with more than 90% specificity and sensitivity (Christova, James, Engdahl, Lewis, & Georgopoulos, 2015). Yet given the complex and heterogeneous nature of the disorder, it may be unlikely that a single biomarker of PTSD exists, and current methods do not demonstrate the necessary levels of accuracy to function as stand-​alone diagnostic tests. For a comprehensive review of these issues, see Michopoulos, Norrholm, and Jovanovic (2015).
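The gap between strong classification statistics and stand-alone diagnostic use is largely a matter of base rates. A brief illustration, assuming 90% sensitivity and specificity (comparable to the neuroimaging result cited above) applied at a 6% population prevalence; the numbers are ours and are only meant to show the arithmetic.

def predictive_values(sensitivity, specificity, prevalence):
    # Positive and negative predictive values for a test applied at a given base rate.
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

ppv, npv = predictive_values(0.90, 0.90, 0.06)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV ≈ 0.36, NPV ≈ 0.99

Under these assumptions, roughly two of every three positive results would be false positives, which is consistent with the caution above that such markers supplement rather than replace a diagnostic interview.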

ASSESSMENT FOR DIAGNOSIS

It is fundamental to case conceptualization and treatment planning for clinicians to determine the appropriate psychological diagnosis or diagnoses for their patients. Paramount to a diagnosis of PTSD is the clear identification of a Criterion A event, to which subsequent symptoms are linked. Therefore, when selecting diagnostic measures, clinicians should consider whether or not the measure assesses the presence of a traumatic event, in addition to ensuring that the measure is psychometrically sound. We review methods of assessing PTSD for diagnostic purposes, commenting on their comprehensiveness, their utility within a clinical context, and their psychometric properties in the subsequent text. We do not provide a comprehensive review of all available instruments; rather, we highlight a select few based on the relative frequency with which they are used in the field. Table 16.1 contains ratings of those instruments currently available for making diagnoses of PTSD.

Structured Diagnostic Interviews

Clinician-Administered PTSD Scale

Developed by the National Center for PTSD (Blake et al., 1990; Weathers et al., 2013), the Clinician-​Administered PTSD Scale (CAPS) is one of the most widely used structured interviews for diagnosing and measuring the severity of PTSD (Weathers et al., 2001). The CAPS was initially developed for DSM-​ IV, and an updated version (i.e., CAPS-​5) was developed for DSM-​5 in 2013. The CAPS for DSM-​IV assesses all 17 DSM-​IV diagnostic criteria for PTSD, as well as the associated symptoms of guilt and dissociation. The CAPS-​5 assesses all 20 DSM-​5 diagnostic
criteria, as well as the associated dissociative symptoms of derealization and depersonalization. The CAPS for DSM-​ IV contains separate ratings for the frequency and intensity of each symptom. On the CAPS-​5, frequency and intensity are still assessed, although they are combined for one overall severity score for each item. Both older and newer versions of the CAPS also promote uniform administration and scoring through carefully phrased prompt questions and explicit rating scale anchors with clear behavioral referents. There is also flexibility built into the administration of the CAPS. Interviewers can administer all DSM criteria and/​or the associated symptoms. Administration time is approximately 30 minutes to 1 hour, depending on those sections the interviewer chooses to use, as well as the extent to which symptoms are reported by the respondent. The CAPS-​5 has past week, past month, and worst month (i.e., lifetime) versions. The CAPS for DSM-​IV has excellent psychometric properties (for a review, see Weathers et  al., 2001)  and has been used successfully to assess PTSD with a wide variety of trauma-​exposed samples (e.g., combat veterans, Cambodian and Bosnian refugees, and victims of rape, crime, motor vehicle accidents, incest, the Holocaust, torture, and cancer). It has served as the primary diagnostic or outcome measure in hundreds of empirical studies on PTSD, and it has been translated into at least 12 languages (Hinton et al., 2006; Weathers et al., 2001). Thus, the existing data strongly support its use across clinical and research settings. The CAPS-​5 has also demonstrated strong psychometric properties in an initial study. Weathers and colleagues (2017) examined the psychometric properties of the CAPS-​5 in two samples of military veterans and found that diagnosis of PTSD on the CAPS-​5 demonstrated excellent inter-​ rater reliability, test–​ retest reliability, as well as strong agreement with PTSD diagnosis using the CAPS for DSM-​IV. Furthermore, CAPS-​5 severity scores demonstrated strong inter-​ rater reliability, internal consistency, and test–​retest reliability. In addition, the CAPS-​5 showed good convergent validity with the CAPS-​IV and the PTSD Checklist for DSM-​5 (PCL-​ 5; Weathers et  al., 2013), as well as discriminant validity with measures of functional impairment (Inventory of Psychosocial Functioning:  Marx et  al., 2009; World Health Organization Disability Assessment Schedule 2.0:  Ustün, Kostanjsek, Chatterji, & Rehm, 2010), psychopathy (Psychopathic Personality Inventory-​ Short Version:  Lilienfeld & Andrews, 1996), and depression (Patient Health Questionnaire:  Spitzer et  al., 1999). Although these data support the use of the CAPS-​5 as a
psychometrically sound measure of PTSD symptomatology, more work with other trauma-​exposed samples is needed to further validate it. Structured Clinical Interview for DSM-​5 The Structured Clinical Interview for DSM-​5 (SCID-​5; First, Williams, Karg, & Spitzer, 2015)  assesses a broad range of current and lifetime psychiatric conditions. It is divided into separate modules corresponding to DSM-​5 diagnostic criteria, with each module providing the interviewer with prompts and follow-​up inquiries intended to be read verbatim to respondents. The SCID-​5 can be administered by clinicians and highly trained interviewers. Although the administration of the full SCID-​5 can be time-​consuming, the modular structure allows clinicians to tailor their assessment appropriately. Within the context of a trauma clinic, it is recommended that the anxiety disorders, affective disorders, and substance use disorder modules be administered to rule out any comorbid diagnoses. Administration of the psychotic symptom screen will also help rule out psychiatric conditions that require a different set of treatment interventions (Keane & Barlow, 2002). Previous versions of SCID-​PTSD module (e.g., First, Spitzer, Williams, & Gibbon, 2000), based on earlier versions of the DSM, are considered psychometrically sound. Keane et al. (1998) reported that the SCID-​PTSD module had adequate reliability, and McFall, Smith, Roszell, Tarver, and Malas (1990) reported evidence of convergent validity, finding significant correlations between the SCID-​PTSD and other measures of PTSD, including the Mississippi Scale (Keane et al., 1988) and the Minnesota Multiphasic Personality Inventory (MMPI)-​PTSD Scale (Keane, Malloy, & Fairbank, 1984). Earlier versions of the SCID-​PTSD module also show strong convergent validity with the CAPS; for instance, the number of positive symptoms assessed by the SCID-​IV PTSD module correlated with CAPS for DSM-​IV scores at r = .89 in one study (Weathers et al., 2001). The SCID-​PTSD module also had good diagnostic utility (Kulka et al., 1988). Due to its recent publication, little has been published on the psychometric properties of the SCID-​ 5 PTSD module. Preliminary results indicate a high degree of inter-​rater reliability (κ = .82; Wolf et al., 2016). Although the SCID is a good diagnostic tool, it has some important limitations. For example, the SCID does not provide specific questions to the assessor to ask the respondent about symptom frequency or intensity. In addition, the SCID symptom ratings of “absent,” “present,” and
“subthreshold” provide relatively limited information about symptom severity in comparison to interviews such as the CAPS-​5 or the PTSD Symptom Scale Interview.
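The contrast between categorical symptom ratings and graded severity scores can be made concrete. Interview severity scores of this kind are built from small ordinal item ratings (0–4 per symptom on the CAPS-5) that are summed into a total score; the function below is our own illustrative sketch of that arithmetic, not part of any instrument's scoring materials.

def total_severity(item_ratings):
    # Sum of per-item severity ratings; with 20 items rated 0-4, totals range from 0 to 80.
    if len(item_ratings) != 20 or any(r < 0 or r > 4 for r in item_ratings):
        raise ValueError("expected 20 item ratings between 0 and 4")
    return sum(item_ratings)

print(total_severity([1, 0, 2, 1, 1, 0, 2, 3, 1, 1, 0, 2, 1, 1, 2, 0, 1, 2, 1, 1]))  # 23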

PTSD Symptom Scale Interview

Developed by Foa, Riggs, Dancu, and Rothbaum (1993), the PTSD Symptom Scale Interview (PSS-I) is a structured interview designed to assess symptoms of PTSD. Foa and colleagues (2016) then developed an updated version corresponding to DSM-5 (i.e., PSSI-5). Using a Likert scale, interviewers rate the severity of 17 symptoms corresponding to the DSM-III-R (APA, 1987) criteria for PTSD for the PSS-I and the 20 symptoms corresponding to the DSM-5 criteria for PTSD for the PSSI-5. One limitation of the PSS-I is that it measures symptoms during the past 2 weeks rather than 1 month, which the DSM criteria specify as necessary for a diagnosis of PTSD (Cusack, Falsetti, & de Arellano, 2002); however, the PSSI-5 assesses the past month. The PSS-I is brief (administration time is approximately 20 minutes) and may be administered by lay interviewers who are trained to work with trauma patients. The PSS-I was originally tested in a sample of women with a history of rape and nonsexual assault, and the resulting scores were found to have good internal consistency, test–retest reliability during a 1-month period, and inter-rater agreement for a PTSD diagnosis (Foa et al., 1993). The PSS-I scores are significantly correlated with other measures of traumatic stress, such as the Impact of Events Intrusion score (Horowitz, Wilner, & Alvarez, 1979) and the Rape Aftermath Symptom Test total score (Kilpatrick, 1988). In addition, the scores have demonstrated good diagnostic utility compared to a SCID-PTSD diagnosis. The PSSI-5 was tested in samples of urban community residents, undergraduates, and veterans, and its scores were also found to have good internal consistency, test–retest reliability, and excellent inter-rater reliability (Foa et al., 2016). PSSI-5 scores correlated significantly with the CAPS-5, the Posttraumatic Diagnostic Scale for DSM-5 (Foa et al., 2016), and the PTSD Checklist–Specific Version (Weathers, Litz, Herman, Huska, & Keane, 1993), and they demonstrated discriminant validity with the Beck Depression Inventory-II (Beck, Brown, & Steer, 1996) and the Trait subscale of the State–Trait Anxiety Inventory (Spielberger, Gorsuch, Lushene, Vagg, & Jacobs, 1983).

Anxiety Disorders Interview Schedule for DSM-5

Developed by DiNardo, O'Brien, Barlow, Waddell, and Blanchard (1983), the Anxiety Disorders Interview Schedule-Revised (ADIS) was designed to permit differential diagnoses among the DSM-III anxiety disorder categories. The interview was revised to correspond to DSM-IV criteria (ADIS-IV; DiNardo, Brown, & Barlow, 1994) and, recently, to DSM-5 criteria (ADIS-5; Brown & Barlow, 2014). The ADIS also includes an assessment of affective disorders, substance use disorders, and selected somatoform disorders; a diagnostic timeline; and a dimensional assessment of the key and associated features of the disorders. The provision of a dimensional as well as a categorical assessment allows the clinician to describe subthreshold manifestations of each disorder, thus allowing for better case conceptualization. In addition to being updated to DSM-5 criteria, the PTSD module of the ADIS-5 allows for the nature of the traumatic event to be assessed and coded more extensively than in previous versions.
Results from psychometric studies of the ADIS-PTSD module are mixed. Originally tested in a small sample of Vietnam combat veterans, the ADIS-PTSD module yielded strong agreement with interview-determined diagnoses (Blanchard, Gerardi, Kolb, & Barlow, 1986). However, DiNardo, Moras, Barlow, Rapee, and Brown (1993) tested the reliability of the ADIS in a community sample recruited from an anxiety disorders clinic and found only adequate agreement between two independent raters when PTSD was the principal diagnosis or an additional diagnosis. In a test of the ADIS-IV, the inter-rater reliability across two interviews given 10 days apart was also fair for current diagnoses (Brown, DiNardo, Lehman, & Campbell, 2001) but slightly improved for lifetime diagnoses. Psychometric analyses of the ADIS-5 (including the PTSD module) are underway, but results are not yet available. Provision of additional reliability and validity data on the ADIS-5 is needed to ensure its continued use in clinical settings.

Composite International Diagnostic Interview

The World Health Organization (WHO) Composite International Diagnostic Interview (CIDI, version 3.0; Kessler & Üstün, 2004) was developed for epidemiological purposes. Specifically, the CIDI has been used to assess a variety of mental health disorders in the WHO World Mental Health Survey Initiative (Kessler et al., 2007). When originally developed, it was a semi-structured interview that mapped on to DSM-IV PTSD criteria. It assesses whether diagnostic criteria are satisfied (yes/no), and it has been translated into many languages. The CIDI has demonstrated excellent inter-rater reliability and good


test–​retest reliability (Andrews & Peters, 1998), although there has been moderate support for its use as a PTSD diagnostic instrument (Breslau, Kessler, & Peterson, 1998; Haro et al., 2006). Kimerling and colleagues (2014) compared the CIDI with the CAPS on both past year and lifetime PTSD diagnoses in a sample of female veterans and found that the CIDI has good diagnostic utility. However, the high specificity and low sensitivity of the measure indicated that the CIDI tends to be very conservative when identifying lifetime PTSD. Self-​Report Measures Impact of Event Scale-​Revised Developed by Horowitz et al. (1979), the Impact of Event Scale (IES) was one of the first self-​report measures developed to assess symptoms of PTSD. The initial 15-​item questionnaire, which focused only on intrusion and avoidance symptoms, was derived from a model of traumatic stress developed by Horowitz (1976). A  revised 22-​item version was developed to include all the DSM-​IV criteria (IES-​R; Weiss & Marmar, 1997). Respondents complete the measure by rating on a Likert scale “how distressed or bothered” they were by each symptom during the past week. The IES has been translated into several languages, has been used with many different trauma populations, and takes 5 to 10 minute to complete. Support for the internal consistency and convergent validity of the IES-​R scores is strong across diverse samples, including emergency response personnel, earthquake and motor vehicle accident survivors, and Vietnam combat veterans (Beck et  al., 2008; Creamer, Bell, & Failla, 2003; Weiss & Marmar, 1997). IES-​R scores have correlated significantly with other well-​established measures, such as the CAPS and the PTSD Symptom Scale Self-​Report (Beck et al., 2008, Rash, Coffey, Baschnagel, Drobes, & Saladin, 2008). Notably, test–​retest reliability data are available for only two samples (Adkins, Weathers, McDevitt-​Murphy, & Daniels, 2008; Weiss & Marmar, 1997). Results from these studies show discrepant reliability coefficients. Although the IES-​R was not originally intended for use as a diagnostic tool, a number of studies employed it in this capacity. Results from these studies suggested good sensitivity and specificity, but data suggest a potential for overdiagnosis of PTSD using common cut-​off scores. For instance, Morina, Ehring, and Priebe (2013) found that although a cut-​off score of 34 led to identification of 89% of PTSD cases in the sample, 45% of individuals who

scored above this cut-​point did not actually meet diagnostic criteria for PTSD according to a semi-​structured interview. Although the items on the scale parallel the DSM-​IV symptom criteria, it does not fully map on to the DSM-​IV PTSD criteria because, like other PTSD self-​report measures, it does not assess traumatic stress exposure, symptom duration, and clinical significance (i.e., symptom-​related distress and/​or impairment). Undoubtedly, these diagnostic criteria omissions are, at least in part, responsible for the higher than expected PTSD prevalence when using this scale. As such, this and other similar self-​report PTSD symptom scales should not be used to determine a diagnosis of PTSD. The IES-​R has not been updated to reflect the DSM-​5 PTSD criteria. Mississippi Scale for Combat-​Related PTSD Developed by Keane et al. (1988), the 35-​item Mississippi Scale is widely used to assess combat-​related PTSD symptoms. The scale items were selected from an initial pool of 200 items generated by experts to match the DSM-​III criteria for the disorder. Respondents are asked to rate, on a Likert scale, the severity of symptoms over the time period occurring “since the event.” The Mississippi Scale yields a continuous score of symptom severity as well as diagnostic information. It is available in several languages and takes 10 to 15 minutes to administer. Although 4 additional items were later added to the scale to reflect additional DSM-​III-​R symptoms, the original 35-​item version is frequently used because the two scales have performed comparably (Lauterbach, Vrana, King, & King, 1997). The Mississippi Scale has excellent psychometric properties. In Vietnam-​ era veterans seeking treatment, Keane et al. (1988) reported high internal consistency and test–​retest reliability during a 1-​week time interval. In a subsequent validation study, the authors found an overall hit rate of 90% when the scale was used to differentiate between a PTSD group and two non-​PTSD comparison groups. McFall, Smith, Mackay, and Tarver (1990) replicated these findings and further demonstrated that PTSD patients with and without substance use disorders did not differ on the Mississippi Scale. Given the high comorbidity between PTSD and substance use disorders, the authors believed it was important to demonstrate that the test assesses PTSD symptoms rather than effects associated with alcohol and drug use. McFall, Smith, Mackay, et  al. also obtained information on convergent validity, finding significant correlations between the Mississippi Scale and other measures of PTSD, including the total number of SCID-​PTSD symptoms, total IES score, and

degree of traumatic combat exposure on the Vietnam Era Stress Inventory (Wilson & Krauss, 1984). These findings suggest that the Mississippi Scale is a valuable self-report tool in settings in which assessment of combat-related PTSD is needed. Relatively recently, Orazem, Charney, and Keane (2006) examined the psychometric properties of the Mississippi Scale in more than 1,200 cases of Vietnam War veterans participating in a multisite study of the psychophysiology of PTSD (Keane et al., 1998). Results indicated that scores on the Mississippi Scale possessed excellent internal consistency and were highly correlated with the Keane PTSD Scale of the MMPI-2. Using the SCID-PTSD module as the diagnostic gold standard, the Mississippi Scale possessed excellent diagnostic utility, suggesting strong support for the use of this test when assessing combat-related PTSD. Notably, several variations of the Mississippi Scale are available, including a brief 10-item version (Hyer, Davis, Boudewyns, & Woods, 1991), a modified scale for civilians (the Revised Civilian Mississippi Scale; Lauterbach et al., 1997), and an informant-report version for partners (Taft, King, King, Leskin, & Riggs, 1999).

Posttraumatic Diagnostic Scale

Developed by Foa et al. (1997), the Posttraumatic Diagnostic Scale (PDS) is a 49-item scale designed to measure DSM-IV PTSD criteria and symptom severity. Foa and colleagues (2016) then developed an updated version of the PDS to correspond to DSM-5 PTSD criteria (i.e., the PDS-5). The PDS reviews trauma exposure and identifies the most distressing trauma. It also assesses all DSM-5 criteria for PTSD, distress and interference related to PTSD symptoms, and onset and duration of symptoms. This measure has been used with numerous samples, including combat veterans, accident victims, and sexual and nonsexual assault survivors, and it has been validated in other languages (e.g., German: Griesel, Wessa, & Flor, 2006). The PDS can be completed in 10 to 15 minutes. The psychometric properties of the PDS for DSM-IV were evaluated among 264 volunteers recruited from several PTSD treatment centers as well as from non-treatment-seeking populations at high risk for trauma (Foa et al., 1997). Investigators reported high internal consistency for the PTSD total score and subscales and adequate test–retest reliability coefficients for the total PDS score and for the symptom cluster scores. With regard to validity, the PDS total score correlated highly

with other scales that measure PTSD symptoms, such as the IES. In addition, the measure yielded high levels of diagnostic agreement with a SCID diagnosis. Griffin, Uhlmansiek, Resick, and Mechanic (2004) compared the PDS for DSM-​IV with the CAPS for DSM-​IV in a sample of female survivors of domestic violence. They found strong correlations between the two measures, although the PDS tended to overdiagnose PTSD. The psychometric properties of the PDS-​5 were evaluated in a sample of urban community residents, undergraduates, and veterans (Foa et  al., 2016). The PDS-​5 scores demonstrated excellent internal consistency and test–​retest reliability. They also demonstrated convergent validity with the PSSI-​5 (Foa et  al., 2016; r  =  .85), the PTSD Checklist–​ Specific Version (PCL-​ S; Weathers et  al., 1993; r  =  .90), and demonstrated discriminant validity with the Beck Depression Inventory-​II (BDI-​II; Beck et  al., 1996; r  =  .77) and the State–​Trait Anxiety Inventory–​Trait Scale (STAI-​T; Spielberger et  al., 1983; r = .64). Convergent and discriminant validity were compared using the method created by Steiger (1980) and Hoerger (2013), which demonstrated that the associations between the PDS-​5 and the BDI-​II and STAI-​T were significantly lower than the associations between the PDS-​5 and the PSSI-​5 and the PCL-​S (all ZH > 3.05, ps < .01). Diagnostic agreement between the PDS-​5 and the PSSI-​ 5 was 78% (sensitivity  =  .84, specificity  =  .73), where scores were significantly higher on the PDS-​5 than on the PSSI-​5. These findings suggest that the PDS-​5 can be used to assess PTSD symptom severity as well as serve as a screening instrument for probable PTSD but that, similar to the IES-​R, clinicians should not use it as the sole means of determining PTSD diagnostic status. PTSD Checklist Developed by researchers at the National Center for PTSD (Weathers et al., 1993), the PTSD Checklist (PCL) is a self-​report measure of DSM PTSD symptoms. The original scale was based on the 17 symptoms included in DSM-​III-​R criteria for PTSD, was subsequently updated to reflect the DSM-​IV diagnostic criteria, and was most recently updated to reflect the 20 DSM-​5 symptom criteria (i.e., PCL-​5; Weathers et al., 2013). Different scoring procedures may be used to yield either a continuous measure of symptom severity or a dichotomous indicator of diagnostic status. Dichotomous scoring methods include an overall cut-​off score, a symptom cluster scoring approach, or a combination of the two. Respondents are

asked to rate, on a Likert scale, “how much each problem has bothered them” during the past month. The time frame can be adjusted as needed to suit the goals of the assessment. The PCL for DSM-​IV has three versions:  a civilian version (PCL-​C), a military version (PCL-​M), and a specific version (PCL-​S). On the PCL-​C, respondents are asked to report on symptoms related to any traumatic stressor, whereas on the PCL-​M, respondents are asked to report on symptoms related to military stressor exposures only. On the PCL-​S, symptoms are tied to one specific traumatic event that the respondent indicates in writing. Although the PCL-​5 has one version, there are three separate formats of the measure:  one without the Criterion A identification, one with a Criterion A identification, and one with the Life Events Checklist (LEC-​5) and extended Criterion A  identification. Prior and current versions of the PCL have been used extensively in both research and clinical settings; all versions take 5 to 10 minutes to complete. The PCL was originally validated with a sample of Vietnam and Persian Gulf War veterans and found to have strong psychometric properties (Weathers et  al., 1993). Many studies provide evidence for the reliability and validity of scores on the PCL for DSM-​IV in both veteran and nonveteran samples (e.g., primary care patients and severely mentally ill adults), although the optimal cut-​off score varies across samples (Cook, Elhai, & Arean, 2005; Dobie et  al., 2002; Grubaugh, Elhai, Cusack, Wells, & Frueh, 2006; Keen, Kutter, Niles, & Krinsley, 2004; Ruggiero, Del Ben, Scotti, & Rabalais, 2003; Walker, Newman, Dobie, Ciechanowski, & Katon, 2002). The many possible reasons for these discrepancies (e.g., gender, recency of trauma, severity of trauma, PTSD prevalence in the sample, and treatment-​seeking status; Manne, DuHamel, Gallelli, Sorgen, & Redd, 1998)  warrant further investigation (for a comprehensive review of the PCL for DSM-​IV, see McDonald & Calhoun, 2010). In addition, there is evidence that different scoring options for the PCL (e.g., an absolute cut-​ off score vs. symptom cluster scoring vs. a combination of the two) yield differences in sensitivity, specificity, and diagnostic efficiency. The selection of a scoring routine should therefore depend on the goal of the assessment (Keen et al., 2004). However, it is important to note that similar to the PDS and IES-​R, the PCL is not meant to establish diagnostic status. Although the PCL-​5 is relatively new, the psychometric properties of the measure already have been examined in several studies. Bovin and colleagues (2016)

tested the PCL-5 in a sample of veterans and found its scores to have excellent internal consistency, good test–retest reliability, as well as convergent and discriminant validity. Armour and colleagues (2015) found excellent internal consistency in samples of both veterans and students. Furthermore, Keane and colleagues (2015) found that the PCL-5 scores correlated significantly with the PCL for DSM-IV and also that PCL-5 scores demonstrated excellent internal consistency, as well as convergent validity with the CAPS-5, in two studies of returning veterans and veterans from all eras. In addition, Hoge and colleagues (2014) compared the PCL-S with the PCL-5 and found them to be equivalent. However, they did find that 67 (30%) of those who met DSM-IV-TR criteria for PTSD did not meet criteria for DSM-5 PTSD and that an additional 59 participants met only DSM-5 criteria, indicating that there are prevalence differences between the two instruments. To date, only one study has examined the performance of the PCL-5 in a non-Western sample. Liu and colleagues (2014) examined the measure in a sample of Chinese earthquake survivors, and the internal consistency of the PCL-5 scores was found to be excellent. Although more data are needed to further establish the reliability and validity of the PCL-5, the data thus far support its use in assessing PTSD symptom severity.

Overall Evaluation

Efforts to diagnose and assess patients for the presence of PTSD symptoms should include a range of assessment methods in addition to reviewing medical records, accessing collateral sources, and taking a thorough history. Previously, we reviewed the use of semi-structured or structured diagnostic interviews and self-report measures as primary methods for assessing PTSD in a clinical context. In making choices about measures, it is important to consider utility within a clinical context (e.g., Are the measures time- and cost-effective?), as well as psychometric properties. Using these guidelines, the gold standard in PTSD assessment is the CAPS, given that it is a sound measure with excellent psychometric properties. Although only a single CAPS-5 validation study has been published, it remains the most comprehensive measure of PTSD and is thus still recommended as the gold standard for PTSD assessment. As an adjunct, or in cases in which administering a structured interview is not feasible or practical, we recommend the use of self-report measures that explicitly

TABLE 16.2
Ratings of Instruments Used for Case Conceptualization and Treatment Planning (a)

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
BTQ | A | NA | NA | A | A | NR | G | A |
LEC | G | NA | NA | G | G | NR | G | A |
LSC-R | A | NA | NA | A | A | NR | G | A |
SLESQ | A | NA | NA | A | A | NR | G | A |
TEQ | A | NA | NA | A | A | NR | G | A |
TLEQ | E | NA | NA | E | E | NR | E | A |
TSS | G | NA | NA | A | A | NR | G | A |

(a) Due to limited availability of psychometric data, DSM-5-related measures of trauma exposure are not included here. See text for description of these instruments.

Note: BTQ = Brief Trauma Questionnaire; LEC = Life Events Checklist; LSC-R = Life Stressor Checklist-Revised; SLESQ = Stressful Life Events Screening Questionnaire; TEQ = Traumatic Events Questionnaire; TLEQ = Traumatic Life Events Questionnaire; TSS = Traumatic Stress Schedule; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

assess the Criterion A  event or that are administered with the instruction to anchor all symptom endorsement to the index Criterion A event. Many of the self-​report measures described previously can be used interchangeably; however, we recommend that clinicians consider the available psychometric data for the instrument for the population on which it is to be used. In doing this, clinicians are maximizing the accuracy and efficiency of the selected measure.
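To make the scoring options described for the PCL concrete, the minimal Python sketch below shows how a set of 20 PCL-5 item ratings might be summed into a continuous severity score and also converted into the two kinds of dichotomous indicators mentioned earlier (an overall cut-off and a symptom-cluster rule). The item-to-cluster mapping, the endorsement threshold, and the cut-off value used here are assumptions chosen for illustration rather than values taken from the PCL-5 scoring materials or from any study cited above, and, consistent with the cautions in this chapter, neither indicator should be treated as a diagnosis.

```python
# Illustrative sketch only: the cluster layout, the "endorsed" rule, and the
# cut-off below are assumptions for demonstration; consult the PCL-5 scoring
# guidance and population-specific validation data before using any rule.

from typing import Dict, List

# Assumed DSM-5 cluster layout for the 20 PCL-5 items (1-indexed):
# B (intrusion) = items 1-5, C (avoidance) = 6-7,
# D (cognitions/mood) = 8-14, E (arousal/reactivity) = 15-20.
CLUSTERS: Dict[str, List[int]] = {
    "B": list(range(1, 6)),
    "C": [6, 7],
    "D": list(range(8, 15)),
    "E": list(range(15, 21)),
}

ENDORSED_AT = 2        # assumed rule: a rating of 2 or higher counts as endorsed
EXAMPLE_CUTOFF = 33    # placeholder cut-off; optimal values vary across samples


def score_pcl5(responses: Dict[int, int]) -> Dict[str, object]:
    """Return total severity, per-cluster endorsement counts, and two provisional flags."""
    if set(responses) != set(range(1, 21)):
        raise ValueError("Expected ratings (0-4) for all 20 items.")
    total = sum(responses.values())
    cluster_counts = {
        name: sum(responses[i] >= ENDORSED_AT for i in items)
        for name, items in CLUSTERS.items()
    }
    # Assumed symptom-cluster rule: at least 1 B, 1 C, 2 D, and 2 E symptoms endorsed.
    cluster_positive = (
        cluster_counts["B"] >= 1
        and cluster_counts["C"] >= 1
        and cluster_counts["D"] >= 2
        and cluster_counts["E"] >= 2
    )
    return {
        "total_severity": total,
        "cluster_counts": cluster_counts,
        "cutoff_positive": total >= EXAMPLE_CUTOFF,
        "cluster_positive": cluster_positive,
    }


if __name__ == "__main__":
    ratings = {i: 2 for i in range(1, 21)}  # hypothetical respondent
    print(score_pcl5(ratings))
```

Keeping the scoring rule in one routine like this makes it easy to rerun the same computation at each monitoring interval and to swap in whichever cut-off has been validated for the population being assessed; a positive result on either rule indicates only probable PTSD and, as noted above, should never substitute for a diagnostic interview.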

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

On completion of a comprehensive diagnostic assessment of PTSD that includes identification of the Criterion A event and related symptoms, the clinician will have a considerable amount of information regarding the index trauma and the severity of its psychological sequelae. This obviously is a necessary first step to case conceptualization and treatment planning; however, selection of a particular treatment and specific initial targets of treatment require additional information. The augmentation of diagnostic measures with more idiographic assessment of trauma-​related symptoms, as well as the assessment of comorbid conditions (e.g., Keane, Solomon, Maser, & Gerrity, 1995), provides an excellent foundation for treatment planning. We confine our discussion here primarily to case conceptualization and treatment planning specific to PTSD symptoms. Assessment for case conceptualization and treatment planning incorporates the influence of contextual factors that may or may not be revealed during

the diagnostic procedure. In this section, we focus on how to address these contextual factors, and we refer the reader to Table 16.2 for a review of instruments available for this purpose. Type of Trauma There is a considerable range of traumatic events that occur, and there are multiple ways in which to categorize these events. For example, involvement in a combat situation, a sexual assault, or a motor vehicle accident all qualify as potential Criterion A events, given the exposure to actual or threatened death, serious injury, or sexual violation. However, these traumatic events may vary on dimensions that are of particular salience in case conceptualization and treatment planning. For example, emotions and beliefs secondary to the trauma might differ in ways related to the trauma type. Guilt may be prominent for a combat veteran, dissociation might be more likely for a sexual assault victim, and conditioned fear of driving might be the most serious problem for an individual involved in a motor vehicle accident. Previously, we discussed diagnostic measures that assess associated dissociative features, such as derealization and depersonalization, in addition to the core 20 DSM-​5 symptoms. These elements of the diagnostic assessment are particularly valuable in the initial identification of idiographic factors that aid in case conceptualization. In addition, although many of the available diagnostic measures provide only assessment of the presence or absence of the core symptoms, continuous measures of symptom frequency and severity also provide valuable information for case

conceptualization and treatment planning in the areas of greatest distress and impairment. Single Versus Multiple Trauma Although the determination of a PTSD diagnosis includes an assessment of an index trauma (Criterion A) to which all other symptoms are presumed to be secondary, a striking percentage of individuals with PTSD have experienced multiple traumas during their lifetimes (e.g., Kessler et  al., 2005). Indeed, past trauma has been shown to be a risk factor for the development of PTSD in response to a subsequent traumatic event (e.g., Breslau, Chilcoat, Kessler, & Davis, 1999). Generally, the index trauma (i.e., the identified Criterion A  event) is the event that prompted the patient to seek treatment, and sequelae of that event often are the primary targets of treatment. It follows that the identification of additional events that may have contributed to maladaptive functioning in response to the Criterion A  event can aid in treatment planning. Numerous checklists are available to assess exposure to various traumatic events. These measures can be used as part of an initial PTSD screen, but they also can be used to identify additional traumatic experiences following a comprehensive diagnostic assessment of the index trauma. Five brief self-​report measures assess exposure to a variety of different potential DSM-​5 Criterion A events. The Life Events Checklist (LEC; Gray, Litz, Hsu, & Lombardo, 2004)  was developed using DSM-​IV criteria, is typically used in tandem with the CAPS, and also has been used as an initial screen for potentially traumatic events. The LEC lists 17 types of traumatic events that the respondent may have experienced, as well as levels of exposure (i.e., happened to me, witnessed event, or learned about event). The LEC scores demonstrated adequate test–​retest reliability over 1 week for endorsement of direct exposure to five of the listed events in a nonclinical sample, although reliability was lower for the remaining items, perhaps due to low base rates of those events (Gray et al., 2004). The LEC was updated for DSM-​5 (i.e., LEC-​5; Weathers et al., 2013), although the revisions are quite minor (i.e., Item 15, “Sudden, accidental death of someone close to you,” was changed to “Sudden accidental death,” and the response category “Part of my job” was added to each item). Psychometrics are not yet available for the LEC-​5. The Traumatic Stress Schedule (TSS; Norris, 1990)  is a 10-​item measure with scores that have demonstrated good reliability (r  =  .88) with multicultural samples. The Traumatic Events Questionnaire (TEQ; Vrana & Lauterbach, 1994) similarly assesses the experience of 11

potential Criterion A events; scores on this measure also demonstrate excellent reliability. The Stressful Life Events Screening Questionnaire (SLESQ; Goodman, Corcoran, Turner, Yuan, & Green, 1998)  scores demonstrate good reliability in the assessment of 13 potentially traumatic events, and the questionnaire also incorporates age at the time of trauma. The last Criterion A checklist is the Brief Trauma Questionnaire (BTQ; Schnurr, Vielhauer, Weathers, & Findler, 1999), which assesses the experience of 10 potentially traumatic events. This measure is explicit in its requirement that individuals respond to each item as Criterion A; respondents are asked if they thought their lives were in danger or if they thought they were injured or could be injured during the event. In addition to the five measures designed to identify a range of potential Criterion A events, two checklist-​type measures are slightly more in-​ depth. These measures include a greater number of items that may be helpful in case conceptualization and treatment planning. The Traumatic Life Events Questionnaire (TLEQ; Kubany, Haynes, et al., 2000) assesses exposure to 23 events; this measure expands on the lists employed by the aforementioned Criterion A measures by breaking down a generally traumatic experience (e.g., unwanted sexual experience) into more specific items that include contextual factors (e.g., childhood sexual touching and adolescent sexual touching). Test–​retest reliability was shown to be good. Note that some items on the checklist may not qualify as Criterion A  events (e.g., sexual harassment). In addition, the Life Stressor Checklist-​Revised (LSC-​R; Wolfe, Kimerling, Brown, Chrestman, & Levin, 1996)  assesses exposure to 30 potential Criterion A events. The checklist also includes follow-​up questions, including age at trauma and degree of event-​related distress during the past year. Item reliability ranges from good to excellent in a large sample of women (McHugo et al., 2005). Also of note is the inclusion of stressful events that may be of particular relevance for women, such as abortion and miscarriage. In general, checklists that identify exposure to various potential traumatic events allow the clinician a more comprehensive picture of client experiences for case conceptualization and treatment planning. In addition to broad Criterion A screening instruments, it may be helpful to consider measures of exposure to combat and other military-​related stressors when working with veteran and active duty samples. A comprehensive review of combat experience assessment is beyond the scope of this chapter; however, the following measures may be useful when determining the extent of an individual’s military experiences. The 7-​item Combat Exposure Scale (Keane

et al., 1989) was validated on Vietnam veterans, and scores have demonstrated good internal consistency and test–​retest reliability. The Deployment Risk and Resilience Inventory (DRRI; King, King, Vogt, Knight, & Sampler, 2006)  and the Deployment Risk and Resilience Inventory-​2 (DRRI-​2; Vogt et  al., 2013)  are instruments used to assess a variety of deployment-​related risk and resilience factors among veterans. These 30-​item measures of warzone experiences were designed to assess the broader context of deployment (as well as pre-​deployment and post-​deployment) and factors that may impact an individual’s mental health. The DRRI was developed using samples of Gulf War veterans, and the DRRI-​2 was validated on Iraq and Afghanistan veterans. The measures include multiple subscales (e.g., Sexual Harassment, Postdeployment Stressors, and Family Stressors) and are widely used. Although many/​all subscales may be appropriate to use, the particular subscales of Combat Experiences and Aftermath of Battle may be particularly relevant. Recently, the Critical Warzone Experiences Scale (Kimbrel et al., 2014), a 7-​item measure assessing combat experiences, was developed as a short version of the 41-​item Marine Corps Mental Health Advisory Team’s Combat Experiences Scale. Scores were validated across independent samples of Iraq and Afghanistan veterans, and they demonstrated good internal consistency, test–​ retest reliability, and convergent validity.

Assessment of Functioning

Although many individuals who experience trauma-related symptoms present themselves for treatment relatively soon after the traumatic event, many others experience symptoms for years before seeking treatment. In its chronic form, PTSD often pervades an individual's life and has a deleterious impact in multiple domains, including occupational functioning, social functioning, and intimate relationships. Assessment of maladaptive functioning in the relevant areas begins with a thorough patient history. In addition, the impact of the traumatic event on particular areas of functioning can be assessed using a number of measures available to monitor a wider range of functional difficulty. For example, some particularly useful measures of functioning across multiple domains, including physical and psychiatric functioning, are the SF-36 Health Survey (Ware & Kosinski, 2001), the BASIS-32 (Eisen, Wilcox, Leff, Schaefer, & Culhane, 1999), the Inventory of Psychosocial Functioning (Marx et al., 2009), the World Health Organization Disability Assessment Schedule (WHODAS 2.0; Üstün et al., 2010), the Sheehan Disability Scale (Sheehan, 1983), and the Posttraumatic Stress Related Functioning Inventory (McCaslin et al., 2016). Instruments are also available to measure dimensions relevant to marital difficulty (e.g., Conflict Tactics Scale-Revised: Straus, Hamby, Boney-McCoy, & Sugarman, 1996) and quality of life (e.g., Quality of Life Inventory: Frisch, Cornell, Villañueva, & Retzlaff, 1992). A comprehensive listing and discussion of these measures is beyond the scope of this chapter, although we note the importance of such adjunct assessment in case conceptualization and treatment planning, and we suggest that, at a minimum, the patient history incorporate multiple functional domains.

Developmental Factors Related to Age at Trauma

Patient age at which the trauma occurred does not appear to predict treatment outcome (e.g., Foa, Keane, Friedman, & Cohen, 2008). Despite the absence of evidence for treatment effects specific to age, age variables are important for case conceptualization and treatment planning. A 30-year-old adult seeking treatment related to childhood sexual trauma that occurred at age 10 years is likely to present very differently from a 40-year-old who seeks treatment for a sexual assault that occurred at age 20 years. In both cases, 20 years have passed since the trauma; however, the 10-year-old victim presumably coped using strategies that were developmentally appropriate for a child, whereas the 20-year-old victim presumably coped using strategies that were developmentally appropriate for a young adult. Such developmental factors have significant potential impact on the manner in which each individual initially processed the difficulty related to the event and how he or she continues to experience it. Patient age (or approximate age) at the time of the index trauma should be included in the patient history and, as noted, specifically is requested by several of the Criterion A assessment measures. In addition, an idiographic approach to the identification of beliefs resulting from the traumatic event(s) can incorporate developmental factors and can be valuable in planning targets for treatment. One such idiographic approach is employed within the Cognitive Processing Therapy manualized treatment for PTSD (Resick, Monson, & Chard, 2014; Resick & Schnicke, 1993). Patients are instructed to write an "impact statement," which is their own account of how their beliefs about themselves, others, and the world have changed due to the traumatic event. This procedure often provides detailed information about potentially maladaptive beliefs related to safety, trust, intimacy, control, and so forth, and it provides the initial foundation for cognitive

restructuring, should such techniques be utilized in the treatment. We note that this is completely idiographic because the format is open-​ended and qualitative; we suggest that such an addition to a comprehensive quantitative assessment offers important adjunct data that are useful in treatment planning.

Cultural Considerations

The generalizability of methods used to assess PTSD is a function of several features of the assessment setting and patient characteristics. Culture, language, race, age, and gender are factors that might influence the use and the interpretation of psychological instruments, whether they are structured diagnostic interviews or self-report measures. Attention to these variables is essential to discerning the presence or absence of PTSD.

When selecting a measure, we recommend that clinicians consider the samples on which an assessment instrument for PTSD was validated. The need to develop instruments that are culturally sensitive has been of great interest for many years as a result of documentation of ethnocultural-specific responses to traumatic events. For example, researchers have provided evidence of differences in the reported risk for and severity of PTSD symptoms in Caucasians and ethnic minorities following a traumatic event (e.g., Alcántara, Casement, & Lewis-Fernández, 2013; Roberts, Gilman, Breslau, Breslau, & Koenen, 2011). Furthermore, there is substantial variation in the prevalence of PTSD throughout the world. Even across similarly low-income and developing nations recovering from conflict (which have high rates of trauma exposure and vulnerability factors), rates of PTSD vary widely (de Jong et al., 2001), suggesting the importance of considering unique contextual factors in the development and use of PTSD assessment instruments.

To date, evidence-based psychological assessment of PTSD has evolved primarily within the context of Western, developed, and industrialized countries. Thus, PTSD assessment may be limited by a lack of culturally sensitive measures and by the tremendous diversity among cultural groups of interest (Marques, Robinaugh, LeBlanc, & Hinton, 2011). However, there has been progress in developing culturally sensitive measures. A good example of a measure that possesses culturally relevant features is the Harvard Trauma Questionnaire (HTQ; Mollica et al., 1992), which is widely used in refugee and internally displaced samples. The HTQ assesses a range of potentially traumatic events and trauma-related symptoms. The assessment of trauma includes events to which refugees from war-torn countries may have been exposed, including exposure to torture, brainwashing, and deprivation of food or water. Originally developed in English, the HTQ has been translated and validated in Vietnamese, Laotian, Cambodian, Japanese, Bosnian, and Croatian. The HTQ possesses linguistic equivalence across the many cultures and languages with which it has been used thus far. In addition, each version includes assessment of trauma-related symptoms based on DSM-IV criteria for PTSD as well as additional items specific to the refugee experience or to a particular culture. Mollica et al. have reported good reliability for the HTQ scores.

In addition, the CAPS has been studied among culturally different groups with excellent success. As one example, Charney and Keane (2007) examined the psychometric properties of the CAPS after it was adapted for use among Bosnian refugees. They applied contemporary methods for translation, back translation, and then qualitative approaches for reconciling any differences in meaning that might have arisen as a function of this process. The researchers found that the CAPS-Bosnian translation was comparable in its psychometric properties to earlier versions of the instrument. This indicates that the CAPS, when properly adapted, can be successfully used to measure PTSD symptoms in culturally diverse populations and that PTSD secondary to war in civilians appears to be comparable in nature to other forms of PTSD.

Overall Evaluation

In conjunction with a comprehensive diagnostic strategy, assessment for case conceptualization and treatment planning broadens the scope of relevant data available to the clinician. Measures that take into account contextual factors that are relevant to PTSD, such as exposure to multiple traumatic events or the presence of comorbid conditions, provide detailed information that can be helpful in deciding on the therapeutic approach and the specific targets of treatment. A variety of psychometrically sound measures are available for these purposes, and we have focused on those with the greatest clinical relevance. Table 16.2 presents our general recommendations for tools that should be considered for use.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

Monitoring the outcome of psychological treatment is essential to help providers demonstrate the effectiveness

of their treatments to patients and service payers. In an early example of such monitoring, Keane and Kaloupek (1982) presented the first empirical evidence that cognitive–​behavioral treatments for PTSD had promise. Within a single-​subject design, they assessed clients’ subjective units of distress (SUD) ratings from 0 to 10 within treatment sessions to monitor changes in the response to traumatic memories in a prolonged exposure treatment paradigm. In addition, they utilized the Spielberger State Anxiety Inventory (STAI; Spielberger et al., 1983) to monitor between-​session levels of anxiety and distress throughout the course of the 19 treatment sessions. Currently, the use of sound psychometric instruments has become an important part of monitoring outcomes of PTSD treatment, regardless of whether the intervention is psychopharmacological, psychological, or both (e.g., Keane & Kaloupek, 2002). In addition to the provision of a measurement of change within and between sessions for a given individual, such measurements ideally present normative information against which the individual’s presentation and progress can be compared either with the general population or with target populations of interest (cf. Kraemer, 1992). It follows that clinicians would do well to utilize tests or questionnaires with sound psychometric properties when deciding how best to monitor the outcomes of their interventions (Keane & Kaloupek, 1997). Table 16.3 provides an analysis of our perspective on a variety of treatment monitoring and outcome measures for use in PTSD work. When monitoring outcomes, clinicians are also encouraged to consider outcomes at several levels, including the

symptom level, the individual level, the system level, and the social and contextual level. All are important and can provide valuable information for both clinician and client (Keane & Kaloupek, 1997, 2002). Numerous measures are available to measure psychopathology, and clinicians are encouraged to search for the measures that are most appropriate for their circumstances and settings. Use of these measures at regular intervals (daily, weekly, monthly, quarterly, etc.) during the course of treatment will provide knowledge of the client’s status and communicate to the clinician the extent to which the patient is demonstrating change in the desired directions. At the symptom level, regular assessment of PTSD symptom frequency and severity can provide the clinician with useful information regarding within-​and between-​ session change over the course of treatment. For example, it may be that some clients experience clinically significant symptom reduction in some symptoms early in treatment, whereas other symptoms persist; frequent assessment can help the clinician target particular problem areas while continually monitoring less problematic symptoms that may not be addressed specifically in session due to time constraints. In general, regular administration of symptom checklists can provide ongoing feedback to the clinician and the client. The brief PTSD symptom checklists discussed previously in this chapter can be quite useful for this purpose. In particular, the PCL-​5 can be completed quickly, is anchored to the index trauma, and assesses symptoms during a specific time frame relevant to treatment monitoring. Also at the symptom level, comorbid conditions such as major depression similarly

TABLE 16.3
Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
CAPS-IV | E | E | E | E | E | E | E | E | E | ✓
CAPS-5 | A | G | E | A | E | E | A | NR | A | ✓(a)
IES-R | G | G | NA | A | G | G | G | G | G |
PCL-IV | G | E | NA | G | G | E | E | E | G | ✓
PCL-5 | A | E | NA | A | G | G | A | NR | A | ✓(a)
PDS-IV | E | E | NA | G | G | E | E | E | G | ✓
PDS-5 | G | E | NA | A | G | G | G | NR | A | ✓(a)
SCID-IV | E | E | A | A | G | G | E | G | G |
SCID-5 | NR | NR | G | NR | G | NR | NR | NR | A |

(a) Limited data available for these measures; recommendations have been made tentatively based on available data and the strong psychometric support for previous versions of these measures.

Note: CAPS-IV = Clinician-Administered PTSD Scale for DSM-IV; CAPS-5 = Clinician-Administered PTSD Scale for DSM-5; IES-R = Impact of Event Scale-Revised; PCL-IV = PTSD Checklist for DSM-IV; PCL-5 = PTSD Checklist for DSM-5; PDS-IV = Posttraumatic Diagnostic Scale for DSM-IV; PDS-5 = Posttraumatic Diagnostic Scale for DSM-5; SCID-IV = Structured Clinical Interview for DSM-IV; SCID-5 = Structured Clinical Interview for DSM-5; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

can be assessed easily using brief symptom measures (e.g., BDI-​II; Beck et  al., 1996). There are also a number of measures available for the purpose of monitoring a wider range of outcomes on the systems, social, and contextual levels of clients’ lives. Selection of the most appropriate measures of outcome is fundamentally a clinical decision that should be determined by the provider in consultation with the client. In the context of PTSD, we recommend the use of the WHODAS in an effort to arrive at a systematic understanding of the impact of any single disorder or the presence of concurrent disorders. Assessment of outcomes at termination of treatment should bring the clinician and the client full circle, by again assessing diagnostic and functional status to examine change from pre-​to post-​treatment and identifying remaining problem areas. Ideally, the clinician and the client will repeat the initial diagnostic interview (e.g., CAPS and PSS-​I) to determine change in symptom frequency and severity, as well as collateral change in other areas of functioning. Due to clinicians’ time constraints, it may not be feasible to repeat an entire structured interview; in such cases, the self-​report symptom measures outlined in section titled “Assessment for Diagnosis” may provide an adequate substitute. Screening for PTSD A number of the self-​report measures described throughout this chapter also have empirical support for use as screening tools for PTSD. For instance, the PCL-​5 and the PDS map directly onto the DSM-​5 criteria, and both scales have validated cut scores or algorithms that are suggestive of a PTSD diagnosis. However, briefer screening instruments are also available and may be useful for efficient screening of PTSD in settings in which more comprehensive assessment is not feasible (e.g., primary care). Some of the more commonly used brief screening measures include the four-​item Primary Care-​PTSD screen (PC-​ PTSD; Prins et  al., 2004), the seven-​ item Short Screening Scale for PTSD (Breslau, Peterson, Kessler, & Schultz, 1999), and the four-​item SPAN (Davidson, 2002). Scores on these three screening tools have demonstrated good sensitivity and specificity in the detection of PTSD and have often been used and evaluated for use in primary care clinics. However, the literature suggests that of these three, the PC-​PTSD has optimal psychometric properties for the detection of PTSD (for a review, see Spoont et al., 2015). The PC-​PTSD is a four-​item screen consisting of yes/​no questions that assess the presence of re-​ experiencing, avoidance, numbing/​ detachment, and

arousal symptoms within the past month (Prins et  al., 2004). A score of 3 or more “yes” answers is indicative of a positive screen for PTSD. The PC-​PTSD scores have been shown to have very good sensitivity and specificity in a range of settings, including VA primary care clinics (Prins et  al., 2004; Ouimette, Wade, Prins, & Schohn, 2008), post-​deployment health assessments (Bliese et al., 2008), civilian primary care settings (Freedy et al., 2010), and civilian substance use treatment programs (Van Dam, Ehring, Vedel, & Emmelkamp, 2010). An updated version (the PC-​PTSD-​5) based on DSM-​5 criteria has been developed and includes a new item assessing trauma-​ distorted blame and guilt as well as a revised stem question that assesses for a Criterion A traumatic stressor (Prins et al., 2016). Preliminary validation of the PC-​PTSD-​5 in a veteran primary care sample has yielded strong support for the revised version of the scale; however, further study is needed to confirm its diagnostic utility across a range of patient populations and settings and compare it with diagnostic outcomes from the CAPS-​5. Overall Evaluation The most thorough assessment for PTSD treatment monitoring incorporates regular measurement of PTSD symptoms, symptoms of comorbid conditions, and indices of functional domains such as marital relationships. In the clinical setting, this task is best accomplished using brief instruments that are relatively easy to administer and score. Treatment outcome measurement should, at a minimum, include brief assessment of symptoms, although the readministration of diagnostic measures (e.g., structured interview) allows the most comprehensive assessment of change following treatment. We suggest that each of the measures recommended in Table 16.3 has psychometric properties appropriate for these purposes.
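Because the brief screens reviewed above are evaluated largely in terms of sensitivity and specificity, it can help to see how those two figures interact with prevalence to determine how many screen-positive patients actually have PTSD, which is the mechanism behind the overdiagnosis concerns raised earlier for cut-off scoring. The short Python sketch below works through that arithmetic; the sensitivity, specificity, and prevalence values are hypothetical and are not drawn from any study cited in this chapter.

```python
# Illustrative arithmetic only: all input values below are hypothetical.

def screen_outcomes(sensitivity: float, specificity: float,
                    prevalence: float, n: int = 1000) -> dict:
    """Expected confusion-matrix counts and predictive values for a screening cut-off."""
    cases = prevalence * n
    noncases = n - cases
    tp = sensitivity * cases          # true positives among those with PTSD
    fn = cases - tp                   # missed cases
    tn = specificity * noncases       # true negatives among those without PTSD
    fp = noncases - tn                # false positives
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn, "PPV": ppv, "NPV": npv}


if __name__ == "__main__":
    # A highly sensitive screen still yields many false positives when prevalence
    # is low, which is why a positive screen should prompt further assessment
    # rather than a diagnosis.
    for prev in (0.05, 0.20, 0.40):
        out = screen_outcomes(sensitivity=0.90, specificity=0.80, prevalence=prev)
        print(f"prevalence={prev:.2f}  PPV={out['PPV']:.2f}  NPV={out['NPV']:.2f}")
```

Run across a few prevalence values, the same screen yields very different positive predictive values, which is one reason, as noted earlier in the chapter, that optimal cut-off scores and the interpretation of a positive screen vary across settings such as VA primary care, civilian primary care, and specialty clinics.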

CONCLUSIONS AND FUTURE DIRECTIONS

We have discussed the assessment of PTSD for the purposes of diagnosis, case conceptualization, treatment planning, and treatment monitoring and outcome, with emphasis on the most psychometrically sound measures available. We also have attempted to consider clinical feasibility in the use of these measures. We are confident that the available assessment options can be easily incorporated into clinical practice. With respect to assessment for diagnosis, we emphasize the importance of the clear identification of a Criterion

A event, to which subsequent symptom endorsements are linked. We also recommend structured or semi-​structured diagnostic interviews when possible, particularly those that assess frequency and intensity of symptoms. For the purposes of case conceptualization and treatment planning, a broader assessment of contextual factors, including psychiatric comorbidity and exposure to other potentially traumatic events, provides valuable adjunct information. We further recommend regular brief assessments of symptoms for the purposes of treatment monitoring and a repeated diagnostic assessment to determine diagnostic status and functional change at treatment or protocol termination. Clearly, the current review is not intended to be comprehensive in its evaluation of all instruments available for the assessment of PTSD. The intent of the review has been to provide a heuristic structure that clinicians might employ when selecting a particular instrument for their clinical purposes. By carefully examining the psychometric properties of a measure, the clinician can make an informed decision about the appropriateness of a particular instrument for the task at hand (e.g., diagnosis, case conceptualization, treatment planning, and monitoring outcomes). In addition to the psychometric properties of a measure, instruments that are developed and evaluated on multiple trauma populations and culturally diverse populations are highly desirable. Future efforts are needed to establish the reliability and validity of scores on new instruments, such as those developed using DSM-​ 5 criteria, on a wider range of populations for clinicians’ use with diverse patients. The quality of our measures will ultimately determine our understanding of PTSD and will yield improved treatment of patients suffering from this disorder.

References Adkins, J. W., Weathers, F. W., McDevitt-​Murphy, M., & Daniels, J. B. (2008). Psychometric properties of seven self-​report measures of posttraumatic stress disorder in college students with mixed civilian trauma exposure. Journal of Anxiety Disorders, 22, 1393–​1402. Alcántara, C., Casement, M. D., & Lewis-​Fernández, R. (2013). Conditional risk for PTSD among Latinos:  A systematic review of racial/​ethnic differences and sociocultural explanations. Clinical Psychology Review, 33, 107–​119. American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.

American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Andrews, G., & Peters, L. (1998). The psychometric properties of the Composite International Diagnostic Interview. Social Psychiatry and Psychiatric Epidemiology, 33, 80–​88. Armour, C., Tsai, J., Durham, T. A., Charak, R., Biehn, T. L., Elhai, J. D., & Pietrzak, R. H. (2015). Dimensional structure of DSM-​ 5 posttraumatic stress symptoms: Support for a hybrid anhedonia and externalizing behaviors model. Journal of Psychiatric Research, 61, 106–​113. Beck, A. T., Brown, G., & Steer, R. A. (1996). Beck Depression Inventory II manual. San Antonio, TX:  Psychological Corporation. Beck, J. G., Grant, D. M., Read, J. P., Clapp, J. D., Coffey, S. F., Miller, L. M., & Palyo, S. A. (2008). The Impact of Event Scale-​Revised:  Psychometric properties in a sample of motor vehicle accident survivors. Journal of Anxiety Disorders, 22, 187–​198. Berninger, A., Webber, M. P., Niles, J. K., Gustave, J., Lee, R., Cohen, H. W.,  .  .  .  Prezant, D. J. (2010). Longitudinal study of probable post-​traumatic stress disorder in firefighters exposed to the World Trade Center disaster. American Journal of Industrial Medicine, 53, 1177–​1185. Blake, D. D., Weathers, F. W., Nagy, L. M., Kaloupek, D. G., Charney, D. S., & Keane, T. M. (1990). The Clinician Administered PTSD Scale-​ IV. Boston, MA:  National Center for PTSD, Behavioral Sciences Division. Blanchard, E. B., Gerardi, R. J., Kolb, L. C., & Barlow, D. H. (1986). The utility of the Anxiety Disorders Interview Schedule (ADIS) in the diagnosis of post-​ traumatic stress disorder (PTSD) in Vietnam veterans. Behaviour Research & Therapy, 24, 577–​580. Bliese, P. D., Wright, K. M., Adler, A. B., Cabrera, O., Castro, C. A., & Hoge, C. W. (2008). Validating the Primary Care Posttraumatic Stress Disorder Screen and the Posttraumatic Stress Disorder Checklist with soldiers returning from combat. Journal of Consulting and Clinical Psychology, 76, 272. Bovin, M. J., Marx, B. P., & Schnurr, P. P. (2015). Evolving DSM diagnostic criteria for PTSD: Relevance for assessment and treatment. Current Treatment Options in Psychiatry, 2, 86–​98.

Bovin, M. J., Marx, B. P., Weathers, F. W., Gallagher, M. W., Rodriguez, P., Schnurr, P. P., & Keane, T. M. (2016). Psychometric properties of the PTSD Checklist for Diagnostic and Statistical Manual of Mental Disorders–​ Fifth Edition (PCL-​ 5) in veterans. Psychological Assessment, 28(11), 1379–​1391. Bovin, M. J., Wells, S. Y., Rasmusson, A. M., Hayes, J. P., & Resick, P. A. (2014). Posttraumatic stress disorder. In P. Emmelkamp & T. Ehring (Eds.), The Wiley handbook of anxiety disorders (pp. 457–​498). West Sussex, UK: Wiley. Brackbill, R. M., Hadler, J. L., DiGrande, L., Ekenga, C. C., Farfel, M. R., Friedman, S., . . . Thorpe, L. E. (2009). Asthma and posttraumatic stress symptoms 5 to 6 years following exposure to the World Trade Center terrorist attack. JAMA, 302, 502–​516. Breslau, N., Chilcoat, H. D., Kessler, R. C., & Davis, G. C. (1999). Prior trauma and the issue of sensitization in posttraumatic stress disorder:  Results from the Detroit Area Survey of Trauma. American Journal of Psychiatry, 156, 902–​907. Breslau, N., Davis, G. C., Andreski, P., & Peterson, E. (1991). Traumatic events and posttraumatic stress disorder in an urban population of young adults. Archives of General Psychiatry, 48, 216–​222. Breslau, N., Kessler, R., & Peterson, E. L. (1998). Post-​ traumatic stress disorder assessment with a structured interview: Reliability and concordance with a standardized clinical interview. International Journal of Methods in Psychiatric Research, 7, 121–​127. Breslau, J., Miller, E., Jin, R., Sampson, N. A., Alonso, J., Andrade, L. H., . . . Fukao, A. (2011). A multinational study of mental disorders, marriage, and divorce. Acta Psychiatrica Scandinavica, 124, 474–​486. Breslau, N., Peterson, E. L., Kessler, R. C., & Schultz, L. R. (1999). Short screening scale for DSM-​ IV post-​ traumatic stress disorder. American Journal of Psychiatry, 156, 908–​911. Brewin, C. R., Dalgleish, T., & Joseph, S. (1996). A dual representation theory of posttraumatic stress disorder. Psychological Review, 103, 670–​686. Brewin, C. R., Gregory, J. D., Lipton, M., & Burgess, N. (2010). Intrusive images in psychological disorders:  Characteristics, neural mechanisms, and treatment implications. Psychological Review, 117(1), 210–​232. Brown, T. A., & Barlow, D. H. (2014). Anxiety and Related Disorders Interview Schedule for DSM-​ 5 (ADIS-​ 5)–​ Adult version. New York, NY: Oxford University Press. Brown, T. A., DiNardo, P. A., Lehman, C. L., & Campbell, L. A. (2001). Reliability of DSM-​ IV anxiety and mood disorders:  Implications for the classification of emotional disorders. Journal of Abnormal Psychology, 110, 49–​58.

Cha, C. B., Najmi, S., Park, J. M., Finn, C. T., & Nock, M. K. (2010). Attentional bias toward suicide-​related stimuli predicts suicidal behavior. Journal of Abnormal Psychology, 119, 616–​622. Chard, K. M. (2005). An evaluation of cognitive processing therapy for the treatment of posttraumatic stress disorder related to childhood sexual abuse. Journal of Consulting and Clinical Psychology, 73, 965–​971. Charney, M. E., & Keane, T. M. (2007). Psychometric analysis of the Clinician Administered PTSD Scale (CAPS)–​ Bosnian translation. Cultural and Ethnic Minority Psychology, 13, 161–​168. Christova, P., James, L. M., Engdahl, B. E., Lewis, S. M., & Georgopoulos, A. P. (2015). Diagnosis of posttraumatic stress disorder (PTSD) based on correlations of prewhitened fMRI data: Outcomes and areas involved. Experimental Brain Research, 233, 2695–​2705. Cook, J. M., Elhai, J. D., & Arean, P. A. (2005). Psychometric properties of the PTSD Checklist with older primary care patients. Journal of Traumatic Stress, 18, 371–​376. Creamer, M., Bell, R., & Failla, S. (2003). Psychometric properties of the Impact of Events Scale-​ Revised. Behaviour Research and Therapy, 41, 1489–​1496. Cusack, K., Falsetti, S., & de Arellano, M. (2002). Gender considerations in the psychometric assessment of PTSD. In R. Kimerling, P. Ouimette, & J. Wolfe (Eds.), Gender and PTSD (pp. 150–​176). New York, NY: Guilford. Davidson, J. (2002). SPAN Addendum to DTS Manual. New York, NY: Multi-​Health Systems. De Jong, J. T., Komproe, I. H., Van Ommeren, M., El Masri, M., Araya, M., Khaled, N., & Somasundaram, D. (2001). Lifetime events and posttraumatic stress disorder in 4 postconflict settings. JAMA, 286, 555–​562. Defense and Veteran Brain Injury Center. (2016). DOD traumatic brain injury worldwide numbers since 2000. Retrieved from http://​dvbic.dcoe.mil/​files/​tbi-​numbers/​ DoD-​TBI-​Worldwide-​Totals_​2000-​2016_​Q1_​May-​16-​ 2016_​v1.0_​2016-​06-​24.pdf Desai, R., Spencer, H., Gray, S., & Pilver, C. (2010). The long journey home XVIII:  Treatment of posttraumatic stress disorder in the Department of Veterans Affairs. West Haven, CT: Northeast Program Evaluation Center. Dickstein, B. D., Suvak, M., Litz, B. T., & Adler, A. B. (2010). Heterogeneity in the course of posttraumatic stress disorder: Trajectories of symptomatology. Journal of Traumatic Stress, 23, 331–​339. DiNardo, P. A., Brown, T. A., & Barlow, D. H. (1994). Anxiety Disorders Interview Schedule for DSM-​ IV:  Lifetime Version (ADIS-​IV-​L). San Antonio, TX:  Psychological Corporation. DiNardo, P. A., Moras, K., Barlow, D. H., Rapee, R. M., & Brown, T. A. (1993). Reliability of DSM-​III-​R anxiety disorder categories:  Using the Anxiety Disorders

Interview Schedule-​Revised (ADIS-​R). Archives of General Psychiatry, 50, 251–​256. DiNardo, P. A., O’Brien, G. T., Barlow, D. H., Waddell, M. T., & Blanchard, E. B. (1983). Reliability of DSM-​III anxiety disorder categories using a new structured interview. Archives of General Psychiatry, 40, 1070–​1074. Dobie, D. J., Kivlahan, D. R., Maynard, C., Bush, K. R., McFall, M., Epler, A. J., & Bradley, K. A. (2002). Screening for post-​traumatic stress disorder in female Veteran’s Affairs patients:  Validation of the PTSD Checklist. General Hospital Psychiatry, 24, 367–​374. Dohrenwend, B. P., Turner, J. B., Turse, N. A., Adams, B. G., Koenen, K. C., & Marshall, R. (2006). The psychological risks of Vietnam for U.S. veterans: A revisit with new data and methods. Science, 313, 979–​982. Edmondson, D., Kronish, I. M., Shaffer, J. A., Falzon, L., & Burg, M. M. (2013). Posttraumatic stress disorder and risk for coronary heart disease: A meta-​analytic review. American Heart Journal, 166, 806–​814. Ehlers, A., & Clark, D. M. (2000). A cognitive model of posttraumatic stress disorder. Behaviour Research and Therapy, 38, 319–​345. Eisen, S. V., Wilcox, M., Leff, H. S., Schaefer, E., & Culhane, M. A. (1999) Assessing behavioral health outcomes in outpatient programs: Reliability and validity of the BASIS-​32. Journal of Behavioral Health Services and Research, 26, 5–​17. Elbogen, E. B., Sullivan, C. P., Wolfe, J., Wagner, H. R., & Beckham, J. C. (2013). Homelessness and money mismanagement in Iraq and Afghanistan veterans. American Journal of Public Health, 103, S248–​S254. Elhai, J. D., & Palmieri, P. A. (2011). The factor structure of posttraumatic stress disorder: A literature update, critique of methodology, and agenda for future research. Journal of Anxiety Disorders, 25, 849–​854. Fairbank, J. A., & Keane, T. M. (1982). Flooding for combat-​ related stress disorders:  Assessment of anxiety reduction across traumatic memories. Behavior Therapy, 13, 499–​510. First, M., Spitzer, R., Williams, J., & Gibbon, M. (2000). Structured Clinical Interview for DSM-​ IV AXIS I Disorders (SCID-​I). In A. John Rush (Ed.), Handbook of psychiatric measures (pp. 49–​ 53). Washington, DC: American Psychiatric Association. First, M. B., Williams, J. W., Karg, R. S., & Spitzer, R. L. (2015). Structured Clinical Interview for DSM-​ 5–​ Research Version (SCID-​5 for DSM-​5, Research Version; SCID-​5-​RV). Arlington, VA:  American Psychiatric Association. Foa, E. B., Cashman, L., Jaycox, L., & Perry, K. (1997). The validation of a self-​ report measure of posttraumatic stress disorder:  The Posttraumatic Diagnostic Scale. Psychological Assessment, 9, 445–​451. Foa, E. B., Keane, T. M., Friedman, M. J., & Cohen, J. (2008). Effective treatments for PTSD: Practice guidelines from

the International Society for Traumatic Stress Studies. New York, NY: Guilford. Foa, E. B., McLean, C. P., Zang, Y., Zhong, J., Powers, M. B., Kauffman, B. Y., . . . Knowles, K. (2016). Psychometric properties of the Posttraumatic Diagnostic Scale for DSM-​5 (PDS-​5). Psychological Assessment, 28(10), 1166–​1171. Foa, E. B., McLean, C. P., Zang, Y., Zhong, J., Rauch, S., Porter, K.,  .  .  .  Kauffman, B. Y. (2016). Psychometric properties of the Posttraumatic Stress Disorder Symptom Scale Interview for DSM-​ 5 (PSSI-​ 5). Psychological Assessment, 28(10), 1159–​1165. Foa, E. B., Riggs, D. S., Dancu, C. V., & Rothbaum, B. O. (1993). Reliability and validity of a brief instrument for assessing post-​traumatic stress disorder. Journal of Traumatic Stress, 6, 459–​474. Foa, E. B., & Rothbaum, B. O. (1998). Treating the trauma of rape: Cognitive behavioral therapy for PTSD. New York, NY: Guilford. Forbes, D., Lloyd, D., Nixon, R. D.  V., Elliott, P., Varker, T., Perry, D., . . . Creamer, M. (2012). A multisite randomized controlled effectiveness trial of cognitive processing therapy for military-​related posttraumatic stress disorder. Journal of Anxiety Disorders, 26, 442–​452. Freedy, J. R., Steenkamp, M. M., Magruder, K. M., Yeager, D. E., Zoller, J. S., Hueston, W. J., & Carek, P. J. (2010). Post-​traumatic stress disorder screening test performance in civilian primary care. Family Practice, 27, 615–​624. Frisch, M. B., Cornell, J., Villañueva, M., & Retzlaff, P. J. (1992). Clinical validation of the Quality of Life Inventory: A measure of life satisfaction for use in treatment planning and outcome assessment. Psychological Assessment, 4, 92–​101. Fulton, J. J., Calhoun, P. S., Wagner, H. R., Schry, A. R., Hair, L. P., Feeling, N., . . . Beckham, J. C. (2015). The prevalence of posttraumatic stress disorder in Operation Enduring Freedom/​ Operation Iraqi Freedom (OEF/​ OIF) veterans:  A meta-​ analysis. Journal of Anxiety Disorders, 31, 98–​107. Goldstein, R. B., Smith, S. M., Chou, S. P., Saha, T. D., Jung, J., Zhang, H., . . . Grant, B. F. (2016). The epidemiology of DSM-​5 posttraumatic stress disorder in the United States: Results from the National Epidemiologic Survey on Alcohol and Related Conditions-​III. Social Psychiatry and Psychiatric Epidemiology, 51, 1137–​1148. Goodman, L., Corcoran, C., Turner, K., Yuan, N., & Green, B. (1998). Assessing traumatic event exposure: General issues and preliminary findings for the Stressful Life Events Screening Questionnaire. Journal of Traumatic Stress, 11, 521–​542. Gradus, J. L., Qin, P., Lincoln, A. K., Miller, M., Lawler, E., Sørensen, H. T., & Lash, T. L. (2010). Posttraumatic stress disorder and completed suicide. American Journal of Epidemiology, 171, 721–​727.


Gray, M. J., Litz, B. T., Hsu, J. L., & Lombardo, T. W. (2004). Psychometric properties of the Life Events Checklist. Assessment, 11, 330–​341. Green, J. D., Marx, B. P., & Keane, T. M. (2017). Empirically supported conceptualizations and treatments of posttraumatic stress disorder. In D. McKay, J. Abramowitz, & E. Storch (Eds.), Treatments for psychological syndromes and problems (pp. 115–​135). Hoboken, NJ: John Wiley & Sons, Ltd. Griesel, D., Wessa, M., & Flor, H. (2006). Psychometric qualities of the German version of the Posttraumatic Diagnostic Scale (PTDS). Psychological Assessment, 18, 262–​268. Griffin, M. G., Uhlmansiek, M. H., Resick, P. A., & Mechanic, M. B. (2004). Comparison of the Posttraumatic Stress Disorder Scale versus the Clinician-​administered Post-​ traumatic Stress Disorder Scale in domestic violence survivors. Journal of Traumatic Stress, 17, 497–​503. Grubaugh, A. L., Elhai, J. D., Cusack, K. J., Wells, C., & Frueh, B. C. (2006). Screening for PTSD in public-​ sector mental health settings:  The diagnostic utility of the PTSD Checklist. Depression and Anxiety, 24, 124–​129. Guerreiro, D. F., Cruz, D., Frasquilho, D., Santos, J. C., Figueira, M. L., & Sampaio, D. (2013). Association between deliberate self-​ harm and coping in adolescents:  A critical review of the last 10  years’ literature. Archives of Suicide Research, 17, 91–​105. Haro, J. M., Arbabzadeh-​Bouchez, S., Brugha, T. S., De Girolamo, G., Guyer, M. E., Jin, R., . . . Kessler, R. C. (2006). Concordance of the Composite International Diagnostic Interview version 3.0 (CIDI 3.0) with standardised clinical assessments in the WHO World Mental Health Surveys. International Journal of Methods in Psychiatric Research, 14, 167–​180. Haw, C., & Hawton, K. (2011). Living alone and deliberate self-​harm:  A case–​control study of characteristics and risk factors. Social Psychiatry and Psychiatric Epidemiology, 46, 1115–​1125. Hinton, D. E., Chhean, D., Pich, V., Pollack, M. H., Orr, S. P., & Pitman, R. K. (2006). Assessment of posttraumatic stress disorder in Cambodian refugees using the Clinician-​ Administered PTSD Scale:  Psychometric properties and symptom severity. Journal of Traumatic Stress, 19, 405–​409. Hoerger, M. (2013). ZH:  An updated version of Steiger’s Z and Web-​based calculator for testing the statistical significance of the difference between dependent correlations. Retrieved from http://​www.psychmike.com/​dependent_​ correlations.php Hoge, C. W. (2015). Measuring the long-​term impact of war-​zone military service across generations and changing posttraumatic stress disorder definitions. JAMA Psychiatry, 72, 861–​862.


Hoge, C. W., Goldberg, H. M., & Castro, C. A. (2009). Care of war veterans with mild traumatic brain injury. New England Journal of Medicine, 360, 1588–​1591. Hoge, C. W., McGurk, D., Thomas, J. L., Cox, A. L., Engel, C. C., & Castro, C. A. (2008). Mild traumatic brain injury in US soldiers returning from Iraq. New England Journal of Medicine, 358, 453–​463. Hoge, C. W., Riviere, L. A., Wilk, J. E., Herrell, R. K., & Weathers, F. W. (2014). The prevalence of posttraumatic stress disorder (PTSD) in US combat soldiers: A head-​to-​head comparison of MDS-​5 versus DSM-​IV-​ TR symptom criteria with the PTSD Checklist. Lancet Psychiatry, 1, 269–​277. Hollifield, M., Hewage, C., Gunawardena, C. N., Kodituwakku, P., Bopagoda, K., & Weerarathnege, K. (2008). Symptoms and coping in Sri Lanka 20–​ 21  months after the 2004 tsunami. British Journal of Psychiatry, 192, 39–​44. Horowitz, M. J. (1976). Stress response syndromes. Northvale, NJ: Aronson. Horowitz, M. J., Wilner, N., & Alvarez, W. (1979). Impact of Event Scale:  A measure of subjective stress. Psychosomatic Medicine, 41, 209–​218. Hyer, L., Davis, H., Boudewyns, P., & Woods, M. G. (1991). A short form of the Mississippi Scale for Combat-​Related PTSD. Journal of Clinical Psychology, 47, 510–​518. Imel, Z. E., Laska, K., Jakupcak, M., & Simpson, T. L. (2013). Meta-​analysis of dropout in treatments for posttraumatic stress disorder. Journal of Consulting and Clinical Psychology, 81, 394–​404. Institute of Medicine. (2008). Treatment of PTSD: Assessment of the evidence. Washington, DC:  National Academies Press. Jakupcak, M., Cook, J., Imel, Z., Fontana, A., Rosenheck, R., & McFall, M. (2009). Posttraumatic stress disorder as a risk factor for suicidal ideation in Iraq and Afghanistan war veterans. Journal of Traumatic Stress, 22, 303–​306. Kaysen, D., Schumm, J., Pedersen, E. R., Seim, R. W., Bedard-​Gilligan, M., & Chard, K. (2014). Cognitive processing therapy for veterans with comorbid PTSD and alcohol use disorders. Addictive Behaviors, 39, 420–​427. Keane, T. M., & Barlow, D. H. (2002). Posttraumatic stress disorder. In D. H. Barlow (Ed.), Anxiety and its disorders: The nature and treatment of anxiety and panic (2nd ed., pp. 418–​453). New York, NY: Guilford. Keane, T. M., Caddell, J. M., & Taylor, K. L. (1988). Mississippi Scale for combat-​related posttraumatic stress disorder: Three studies in reliability and validity. Journal of Consulting and Clinical Psychology, 56, 85–​90. Keane, T. M., Fairbank, J. A., Caddell, J. M., Zimering, R. T., & Bender, M. E. (1985). A behavioral approach to assessing and treating post-​traumatic stress disorder in


Vietnam veterans. In C. R. Figley (Ed.), Trauma and its wake (pp. 257–​294). New York, NY: Brunner/​Mazel. Keane, T. M., Fairbank, J. A., Caddell, J. M., Zimering, R. T., Taylor, K. L., & Mora, C. A. (1989). Clinical evaluation of a measure to assess combat exposure. Psychological Assessment, 1, 53–​55. Keane, T. M., & Kaloupek, D. G. (1982). Imaginal flooding in the treatment of posttraumatic stress disorder. Journal of Consulting and Clinical Psychology, 50, 138–​140. Keane, T. M., & Kaloupek, D. G. (1997). Comorbid psychiatric disorders in post-​traumatic stress disorder: Implications for research. In R. Yehuda & A. McFarlane (Eds.), Psychobiology of posttraumatic stress disorder. New York, NY: Annals of New York Academy of Science. Keane, T. M., & Kaloupek, D. G. (2002). Posttraumatic stress disorder:  Diagnosis, assessment, and monitoring outcomes. In R. Yehuda (Ed.), Clinical assessment and treatment of PTSD. Washington, DC:  American Psychiatric Publishing. Keane, T. M., Kolb, L. C., Kaloupek, D. G., Orr, S. P., Blanchard, E. B., Thomas, R. G.,  .  .  .  Lavori, P. W. (1998). Utility of psychophysiology measurement in the diagnosis of posttraumatic stress disorder: Results from a Department of Veterans Affairs cooperative study. Journal of Consulting and Clinical Psychology, 66, 914–​923. Keane, T. M., Malloy, P. F., & Fairbank, J. A. (1984). Empirical development of an MMPI subscale for the assessment of combat-​related posttraumatic stress disorder. Journal of Consulting and Clinical Psychology, 52, 888–​891. Keane, T. M., Marshall, A., & Taft, C. (2006). Posttraumatic stress disorder:  Etiology, epidemiology, and treatment outcome. Annual Review of Clinical Psychology, 2, 161–​197. Keane, T. M., Solomon, S., Maser, J., & Gerrity, E. (1995, November). Assessment of PTSD. Paper presented at the National Institute of Mental Health–​National Center for PTSD Consensus Conference on Assessment of PTSD, Boston, MA. Keane, T. M., Weathers, F. W., & Foa, E. B. (2000). Diagnosis and assessment. In E. B. Foa, T. M. Keane, & M. J. Friedman (Eds.), Effective treatments for PTSD (pp. 18–​36). New York, NY: Guilford. Keen, S. M., Kutter, C. J., Niles, B. L., & Krinsley, K. E. (2004, November). Psychometric properties of the PTSD Checklist. Poster presented at the annual meeting of the International Society for Traumatic Stress Studies, New Orleans, LA. Kessler, R. C., Berglund, P., Demler, O., Jin, R., & Walters, E. (2005). Lifetime prevalence and age of onset distributions of DSM-​IV disorders in the National Comorbidity Survey-​Replication. Archives of General Psychiatry, 62, 593–​602.

Kessler, R. C., Galea, S., Jones, R. T., & Parker, H. A. (2006). Mental illness and suicidality after Hurricane Katrina. Bulletin of the World Health Organization, 84, 930–​939. Kessler, R. C., Matthias, A., Anthony, J. C., de Graaf, R., Demyttenaere, K., Gasquet, I., . . . Ustun, T. B. (2007). Lifetime prevalence and age-​of-​onset distributions of mental disorders in the World Health Organization’s World Mental Health Survey Initiative. World Psychiatry, 6(3), 168–​176. Kessler, R. C., Rose, S., Koenen, K. C., Karam, E. G., Stang, P. E., Stein, D. J., . . . Carmen Viana, M. (2014). How well can post-​ traumatic stress disorder be predicted from pre-​trauma risk factors? An exploratory study in the WHO World Mental Health Surveys. World Psychiatry, 13, 265–​274. Kessler, R. C., & Üstün, T. B. (2004). The World Mental Health (WMH) Survey Initiative version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI). International Journal of Methods in Psychiatric Research, 13, 93–​121. Kilpatrick, D. G. (1988). Rape aftermath symptom test. In M. Hersen & A. S. Bellack (Eds.), Dictionary of behavioral assessment techniques (pp. 658–​ 669). Oxford, UK: Pergamon. Kilpatrick, D. G., Resnick, H. S., Milanak, M. E., Miller, M. W., Keyes, K. M., & Friedman, M. J. (2013). National estimates of exposure to traumatic events and PTSD prevalence using DSM-​IV and DSM-​5 criteria. Journal of Traumatic Stress, 26, 537–​547. Kimbrel, N. A., Evans, L. D., Patel, A. B., Wilson, L. C., Meyer, E. C., Gulliver, S. B., & Morissette, S. B. (2014). The Critical Warzone Experiences (CWE) scale: Initial psychometric properties and association with PTSD, anxiety, and depression. Psychiatry Research, 220, 1118–​1124. Kimerling, R., Alvarez, J., Pavao, J., Mack, K. P., Smith, M. W., & Baumrind, N. (2009). Unemployment among women examining the relationship of physical and psychological intimate partner violence and posttraumatic stress disorder. Journal of Interpersonal Violence, 24, 450–​463. Kimerling, R., Serpi, T., Weathers, F., Kilbourne, A. M., Kang, H., Collins, J. F.,  .  .  .  Magruder, K. (2014). Diagnostic accuracy of the Composite International Diagnostic Interview (CIDI 3.0) PTSD module among female Vietnam-​ era veterans. Journal of Traumatic Stress, 27, 160–​167. King, L. A., King, D. W., Vogt, D. S., Knight, J., & Sampler, R. E. (2006). Deployment Risk and Resilience Inventory: A collection of measures for studying deployment-​related experiences of military personnel and veterans. Military Psychology, 18, 89–​120. Kok, B. C., Herrell, R. K., Thomas, J. L., & Hoge, C. W. (2012). Posttraumatic stress disorder associated with


combat service in Iraq or Afghanistan:  Reconciling prevalence differences between studies. Journal of Nervous and Mental Disease, 200, 444–​450. Kraemer, H. C. (1992). Evaluating medical tests:  Objective and quantitative guidelines. Newbury Park, CA: Sage. Kubany, E. S., Haynes, S. N., Leisen, M. B., Owens, J. A., Kaplan, A. S., Watson, S. B., & Burns, K. (2000). Development and preliminary validation of a brief broad-​ spectrum measure of trauma exposure:  The Traumatic Life Events Questionnaire. Psychological Assessment, 12, 210–​224. Kulka, R. A., Schlenger, W. E., Fairbank, J. A., Hough, R. L., Jordan, B. K., Marmar, C. R., & Weiss, D.S. (1988). National Vietnam Veterans Readjustment Study (NVVRS): Design, current status, and initial PTSD prevalence estimates. Research Triangle Park, NC: Research Triangle Park Institute. Kulka, R. A., Schlenger, W. E., Fairbank, J. A., Jordan, B. K., Hough, R. L., Marmar, C. R., & Weiss, D. S. (1990). Trauma and the Vietnam War generation: Report of findings from the National Vietnam Veterans Readjustment Study. New York, NY: Brunner/​Mazel. Kun, P., Chen, X., Han, S., Gong, X., Chen, M., Zhang, W., & Yao, L. (2009). Prevalence of post-​traumatic stress disorder in Sichuan Province, China, after the 2008 Wenchuan earthquake. Public Health, 123, 703–​707. Lauterbach, D., Vrana, S. R., King, D. W., & King, L. A. (1997). Psychometric properties of the civilian version of the Mississippi PTSD Scale. Journal of Traumatic Stress, 10, 499–​513. Lilienfeld, S. O., & Andrews, B. P. (1996). Development and preliminary validation of a self-​report measure of psychopathic personality traits in noncriminal population. Journal of Personality Assessment, 66, 488–​524. Liu, P., Wang, L., Cao, C., Wang, R., Zhang, J., Zhang, B., . . . Elhai, J. D. (2014). The underlying dimensions of DSM-​5 posttraumatic stress disorder symptoms in an epidemiological sample of Chinese earthquake survivors. Journal of Anxiety Disorders, 28, 345–​351. Maercker, A., Brewin, C. R., Bryant, R. A., Cloitre, M., Reed, G. M., van Ommeren, M.,  .  .  .  Rousseau, C. (2013). Proposals for mental disorders specifically associated with stress in the International Classification of Diseases-​11. Lancet, 381, 1683–​1685. Manne, S. L., DuHamel, K., Gallelli, K., Sorgen, K., & Redd, W. H. (1998). Posttraumatic stress disorder among mothers of pediatric cancer survivors: Diagnosis, comorbidity, and utility of the PTSD Checklist as a screening instrument. Journal of Pediatric Psychology, 23, 357–​366. Marques, L., Robinaugh, D. J., LeBlanc, N. J., & Hinton, D. (2011). Cross-​cultural variations in the prevalence and presentation of anxiety disorders. Expert Review of Neurotherapeutics, 11, 313–​322.


Marx, B. P., Bovin, M. J., Suvak, M. K., Monson, C. M., Sloan, D. M., Fredman, S. J., . . . Keane, T. M. (2012). Concordance between physiological arousal and subjective distress among Vietnam combat veterans undergoing challenge testing for PTSD. Journal of Traumatic Stress, 25, 416–​425. Marx, B. P., Schnurr, P. P., Rodriguez, P., Holowka, D. W., Lunney, C., Weathers, F.,  .  .  .  Keane, T. M. (2009, November). Development and validation of a scale to assess functional impairment among active duty service members and veterans. Paper presented at the 25th annual meeting of the International Society for Traumatic Stress Studies, Atlanta, GA. McCaslin, S. E., Maguen, S., Metzler, T., Bosch, J., Neylan, T. C., & Marmar, C. R. (2016). Assessing posttraumatic stress related impairment and well-​being:  The Posttraumatic Stress Related Functioning Inventory (PRFI). Journal of Psychiatric Research, 72, 104–​111. McDonald, S. D., & Calhoun, P. S. (2010). The diagnostic accuracy of the PTSD Checklist: A critical review. Clinical Psychology Review, 30, 976–​987. McFall, M. E., Smith, D. E., Mackay, P. W., & Tarver, D. J. (1990). Reliability and validity of Mississippi Scale for Combat-​Related Posttraumatic Stress Disorder. Journal of Consulting and Clinical Psychology, 2, 114–​121. McFall, M. E., Smith, D., Roszell, D. K., Tarver, D. J., & Malas, K. L. (1990). Convergent validity of measures of PTSD in Vietnam combat veterans. American Journal of Psychiatry, 147, 645–​648. McHugo, G., Caspi, Y., Kammerer, N., Mazelis, R., Jackson, E., Russell, L.,  .  .  .  Kimerling, R. (2005). The assessment of trauma history in women with co-​occurring substance abuse and mental disorders and a history of interpersonal violence. Journal of Behavioral Health Sciences and Research, 32, 113–​127. Michopoulos, V., Norrholm, S. D., & Jovanovic, T. (2015). Diagnostic biomarkers for posttraumatic stress disorder:  Promising horizons from translational neuroscience research. Biological Psychiatry, 78, 344–​353. Mollica, R. F., Caspi-​Yavin, Y., Bollini, P., Truong, T., Tor, S., & Lavelle, J. (1992). The Harvard Trauma Questionnaire:  Validating a cross-​cultural instrument for measuring torture, trauma, and post-​traumatic stress disorder in Indochinese refugees. Journal of Nervous and Mental Disease, 180, 111–​116. Morina, N., Ehring, T., & Priebe, S. (2013). Diagnostic utility of the Impact of Event Scale-​Revised in two samples of survivors of war. PLoS One, 8, e83916. Morina, N., Wicherts, J. M., Lobbrecht, J., & Priebe, S. (2014). Remission from post-​traumatic stress disorder in adults: A systematic review and meta-​analysis of long term outcome studies. Clinical Psychology Review, 34, 249–​255. Najavits, L. (2002). Seeking safety:  A treatment manual for PTSD and substance abuse. New York, NY: Guilford.


Neria, Y., DiGrande, L., & Adams, B. G. (2011). Posttraumatic stress disorder following the September 11, 2001, terrorist attacks: A review of the literature among highly exposed populations. The American Psychologist, 66, 429–​446. Nock, M. K., Hwang, I., Sampson, N., Kessler, R. C., Angermeyer, M., Beautrais, A.,  .  .  .  Williams, D. R. (2009). Cross-​ national analysis of the associations among mental disorders and suicidal behavior: Findings from the WHO World Mental Health Surveys. PLoS Medicine, 6, e1000123. Norris, F. (1990). Screening for traumatic stress: A scale for use in the general population. Journal of Applied Social Psychology, 20, 1704–​1718. O’Connor, R. C., & Nock, M. K. (2014). The psychology of suicidal behaviour. Lancet Psychiatry, 1, 73–​85. Oquendo, M. A., Friend, J. M., Halberstam, B., Brodsky, B. S., Burke, A. K., Grunebaum, M. F.,  .  .  .  Mann, J. J. (2003). Association of comorbid posttraumatic stress disorder and major depression with greater risk for suicidal behavior. American Journal of Psychiatry, 160, 580–​582. Orazem, R. J., Charney, M. E., & Keane, T. M. (2006, March). Mississippi Scale for Combat-​Related PTSD: Analysis of reliability and validity. Poster session presented at the annual meeting of the Anxiety Disorders Association of America, Miami, FL. Orcutt, H. K., Bonanno, G. A., Hannan, S. M., & Miron, L. R. (2014). Prospective trajectories of posttraumatic stress in college women following a campus mass shooting. Journal of Traumatic Stress, 27, 249–​256. Orr, S. P., Metzger, L. J., Miller, M. W., & Kaloupek, D. G. (2004). Psychophysiological assessment of PTSD. In J. P. Wilson & T. M. Keane (Eds.), Assessing psychological trauma and PTSD (2nd ed., pp. 289–​343). New York, NY: Guilford. Ouimette, P., Wade, M., Prins, A., & Schohn, M. (2008). Identifying PTSD in primary care:  Comparison of the Primary Care-​PTSD screen (PC-​PTSD) and the General Health Questionnaire-​12 (GHQ). Journal of Anxiety Disorders, 22, 337–​343. Pagura, J., Stein, M. B., Bolton, J. M., Cox, B. J., Grant, B., & Sareen, J. (2010). Comorbidity of borderline personality disorder and posttraumatic stress disorder in the US population. Journal of Psychiatric Research, 44, 1190–​1198. Pietrzak, R. H., Goldstein, R. B., Southwick, S. M., & Grant, B. F. (2011). Prevalence and Axis I comorbidity of full and partial posttraumatic stress disorder in the United States: Results from Wave 2 of the National Epidemiologic Survey on Alcohol and Related Conditions. Journal of Anxiety Disorders, 25, 456–​465. Pitman, R. K., Rasmusson, A. M., Koenen, K. C., Shin, L. M., Orr, S. P., Gilbertson, M. W., . . . Liberzon, I. (2012). Biological studies of post-​traumatic stress disorder. Nature Reviews Neuroscience, 13, 769–​787.

Pole, N. (2007). The psychophysiology of posttraumatic stress disorder:  A meta-​analysis. Psychological Bulletin, 133, 725–​746. Powers, M. B., Halpern, J. M., Ferenschak, M. P., Gillihan, S. J., & Foa, E. B. (2010). A meta-​analytic review of prolonged exposure for posttraumatic stress disorder. Clinical Psychology Review, 30, 635–​641. Prins, A., Bovin, M. J., Smolenski, D. J., Marx, B. P., Kimerling, R., Jenkins-​Guarnieri, M. A.,  .  .  .  Tiet, Q. Q. (2016). The Primary Care PTSD Screen for DSM-​ 5 (PC-​PTSD-​5): Development and evaluation within a veteran primary care sample. Journal of General Internal Medicine, 31, 1206–​1211. Prins, A., Ouimette, P., Kimerling, R., Cameron, R. P., Hugelshofer, D. S., Shaw-​ Hegwer, J.,  .  .  .  Sheikh, J. I. (2004). The Primary Care PTSD Screen (PC-​ PTSD): Corrigendum. Primary Care Psychiatry, 9, 151. Rash, C. J., Coffey, S. F., Baschnagel, J. S., Drobes, D. J., & Saladin, M. E. (2008). Psychometric properties of the IES-​R in traumatized substance dependent individuals with and without PTSD. Addictive Behaviors, 33, 1039–​1047. Resick, P. A., Monson, C. M., & Chard, K. M. (2014). Cognitive processing therapy:  Veteran/​ military version:  Therapist’s manual. Washington, DC:  US Department of Veterans Affairs. Resick, P. A., & Schnicke, M. K. (1993). Cognitive processing therapy for rape victims: A treatment manual. Newbury Park, CA: Sage. Roberts, A. L., Gilman, S. E., Breslau, J., Breslau, N., & Koenen, K. C. (2011). Race/​ethnic differences in exposure to traumatic events, development of post-​traumatic stress disorder, and treatment-​seeking for post-​traumatic stress disorder in the United States. Psychological Medicine, 41, 71–​83. Ruggiero, K. J., Del Ben, K., Scotti, J. R., & Rabalais, A. E. (2003). Psychometric properties of the PTSD Checklist-​ Civilian Version. Journal of Traumatic Stress, 16, 495–​502. Santiago, P. N., Ursano, R. J., Gray, C. L., Pynoos, R. S., Spiegel, D., Lewis-​Fernandez, R., . . . Fullerton, C. S. (2013). A systematic review of PTSD prevalence and trajectories in DSM-​5 defined trauma exposed populations: Intentional and non-​intentional traumatic events. PLoS One, 8, e59236. Schneiderman, A. I., Braver, E. R., & Kang, H. K. (2008). Understanding sequelae of injury mechanisms and mild traumatic brain injury incurred during the conflicts in Iraq and Afghanistan: Persistent postconcussive symptoms and posttraumatic stress disorder. American Journal of Epidemiology, 167, 1446–​1452. Schnurr, P., Vielhauer, M., Weathers, F., & Findler, M. (1999). The Brief Trauma Questionnaire. White River Junction, VT: National Center for PTSD.


Seedat, S., Scott, K. M., Angermeyer, M. C., Berglund, P., Bromet, E. J., Brugha, T. S., . . . Karam, E. G. (2009). Cross-​national associations between gender and mental disorders in the World Health Organization World Mental Health Surveys. Archives of General Psychiatry, 66, 785–​795. Shalev, A. Y., Friedman, M. J., Foa, E. B., & Keane, T. M. (2000). Integration and summary. In E. B. Foa, T. M. Keane, & M. J. Friedman (Eds.), Effective treatments for PTSD: Practice guidelines from the International Society for Traumatic Stress Studies (pp. 617–​642). New  York, NY: Guilford. Sheehan, D. V. (1983). The anxiety disease. New  York, NY: Scribner. Sloan, D. M., Marx, B. P., Bovin, M. J., Feinstein, B. A., & Gallagher, M. W. (2012). Written exposure as an intervention for PTSD:  A randomized clinical trial with motor vehicle accident survivors. Behaviour Research & Therapy, 50, 627–​635. Spielberger, C. S., Gorsuch, R. L., Lushene, R., Vagg, P. R., & Jacobs, G. A. (1983). Manual for the State–​Trait Anxiety Inventory (Form Y). Palo Alto, CA: Mind Garden. Spira, J. L., Lathan, C. E., Bleiberg, J., & Tsao, J. W. (2014). The impact of multiple concussions on emotional distress, post-​ concussive symptoms, and neurocognitive functioning in active duty United States Marines independent of combat exposure or emotional distress. Journal of Neurotrauma, 31, 1823–​1834. Spitzer, R. L., Kroenke, K., Williams, J. B., & Patient Health Questionnaire Primary Care Study Group. (1999). Validation and utility of a self-​report version of PRIME-​MD: The PHQ primary care study. JAMA, 282, 1737–​1744. Spoont, M. R., Williams, J. W., Kehle-​Forbes, S., Nieuwsma, J. A., Mann-​Wrobel, M. C., & Gross, R. (2015). Does this patient have posttraumatic stress disorder? Rational clinical examination systematic review. JAMA, 314, 501–​510. Steel, Z., Chey, T., Silove, D., Marnane, C., Bryant, R. A., & Van Ommeren, M. (2009). Association of torture and other potentially traumatic events with mental health outcomes among populations exposed to mass conflict and displacement:  A systematic review and meta-​ analysis. JAMA, 302, 537–​549. Steenkamp, M. M., Litz, B. T., Hoge, C. W., & Marmar, C. R. (2015). Psychotherapy for military-​related PTSD: A review of randomized clinical trials. JAMA, 314, 489–​500. Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87, 245–​251. Straus, M. A., Hamby, S. L., Boney-​McCoy, S., & Sugarman, D. B. (1996). The revised Conflict Tactics Scales (CTS2):  Development and preliminary psychometric data. Journal of Family Issues, 17, 283–​316.


Taft, C. T., King, L. A., King, D. W., Leskin, G. A., & Riggs, D. S. (1999). Partners’ ratings of combat veterans’ PTSD symptomatology. Journal of Traumatic Stress, 12, 327–​334. Taft, C. T., Watkins, L. E., Stafford, J., Street, A. E., & Monson, C. M. (2011). Posttraumatic stress disorder and intimate relationship problems: A meta-​analysis. Journal of Consulting and Clinical Psychology, 79, 22–​33. United Nations High Commissioner for Refugees. (2016). Global trends 2015. Geneva, Switzerland: Author. Üstün, T. B., Kostanjsek, N., Chatterji, S., & Rehm, J. (2010). Measuring health and disability:  Manual for WHO Disability Assessment Schedule (WHODAS 2.0). Geneva, Switzerland: World Health Organization. VA/​ DOD Management of Post-​ Traumatic Stress Working Group. (2010). VA/​ DOD clinical practice guideline for management of post-​ traumatic stress, version 2.0. Washington, DC:  US Department of Veterans Affairs, Office of Quality and Performance. Van Dam, D., Ehring, T., Vedel, E., & Emmelkamp, P. M. (2010). Validation of the Primary Care Posttraumatic Stress Disorder screening questionnaire (PC-​ PTSD) in civilian substance use disorder patients. Journal of Substance Abuse Treatment, 39, 105–​113. van Minnen, A., Zoellner, L. A., Harned, M. S., & Mills, K. (2015). Changes in comorbid conditions after prolonged exposure for PTSD: A literature review. Current Psychiatry Reports, 17(3), 1–​16. Vasterling, J. J., Bryant, R. A., & Keane, T. M. (Eds.). (2012). PTSD and mild traumatic brain injury. New  York, NY: Guilford. Vasterling, J. J., Brailey, K., Proctor, S. P., Kane, R., Heeren, T., & Franz, M. (2012). Effects of mild TBI, PTSD, and depression on neuropsychological performance and functional health in Iraq-​deployed U.S. Army soldiers. British Journal of Psychiatry, 201, 186–​192. Vogt, D., Smith, B. N., King, L. A., King, D. W., Knight, J., & Vasterling, J. J. (2013). Deployment Risk and Resilience Inventory-​2 (DRRI-​2):  An updated tool for assessing psychosocial risk and resilience factors among service members and veterans. Journal of Traumatic Stress, 26, 710–​717. Vrana, S., & Lauterbach, D. (1994). Prevalence of traumatic events and posttraumatic psychological symptoms in a nonclinical sample of college students. Journal of Traumatic Stress, 7, 289–​302. Walker, E. A., Newman, E., Dobie, D. J., Ciechanowski, P., & Katon, W. (2002). Validation of the PTSD Checklist in an HMO sample of women. General Hospital Psychiatry, 24, 375–​380. Ware, J. E., & Kosinski, M. (2001). SF-​36 Physical and mental health summary scales: A manual for users of version 1 (2nd ed.). Lincoln, RI: QualityMetric. Weathers, F. W., Blake, D. D., Schnurr, P. P., Kaloupek, D. G., Marx, B. P., & Keane, T. M. (2013). The Life Events


Checklist for DSM-​5 (LEC-​5). Instrument available from the National Center for PTSD at http://​www.ptsd.va.gov Weathers, F. W., Bovin, M. J., Lee, D. J., Sloan, D. M., Schnurr, P. P., Kaloupek, D. G., . . . Marx, B. P. (2017). The Clinician Administered PTSD Scale for DSM-​5 (CAPS-​5): Development and initial psychometric evaluation in military veterans. Psychological Assessment [Epub ahead of print]. Weathers, F. W., Keane, T. M., & Davidson, J. R. T. (2001). The Clinician Administered PTSD scale (CAPS):  A review of the first ten years of research. Depression and Anxiety, 13, 132–​156. Weathers, F. W., Keane, T. M., & Foa, E. B. (2009). Assessment and diagnosis of adults. In E. B. Foa, T. M. Keane, M. J. Friedman, & J. A. Cohen (Eds.), Effective treatments for PTSD:  Practice guidelines from the International Society for Traumatic Stress Studies (2nd ed., pp. 23–​61). New York, NY: Guilford. Weathers, F. W., Litz, B. T., Herman, D. S., Huska, J. A., & Keane, T. M. (1993, October). The PTSD Checklist (PCL):  Reliability, validity, and diagnostic utility. Poster presented at the 9th annual meeting of the International Society for Traumatic Stress Studies, San Antonio, TX. Weiss, D., & Marmar, C. (1997). The Impact of Event Scale-​ Revised. In J. P. Wilson & T. M. Keane (Eds.), Assessing psychological trauma and PTSD (pp. 399–​ 411). New York, NY: Guilford. Wilson, J. P., & Krauss, G. E. (1984, September). The Vietnam Era Stress Inventory:  A scale to measure war stress and post-​traumatic stress disorder among Vietnam veterans.

Paper presented at the 3rd National Conference on Posttraumatic Stress Disorder, Baltimore, MD. Wisco, B. E., Marx, B. P., Wolf, E. J., Miller, M. W., Southwick, S. M., & Pietrzak, R. H. (2014). Posttraumatic stress disorder in the US veteran population:  Results from the National Health and Resilience in Veterans Study. Journal of Clinical Psychiatry, 75, 1338–​1346. Wisco, B. E., Miller, M. W., Wolf, E. J., Kilpatrick, D., Resnick, H. S., Badour, C. L.,  .  .  .  Friedman, M. J. (2016). The impact of proposed changes to ICD-​11 on estimates of PTSD prevalence and comorbidity. Psychiatry Research, 240, 226–​233. Wolf, E. J., Bovin, M. J., Green, J. D., Mitchell, K. S., Stoop, T. B., Barretto, K. M., . . . Rosen, R. C. (2016). Longitudinal associations between post-​traumatic stress disorder and metabolic syndrome severity. Psychological Medicine, 46, 2215–​2226. Wolfe, J., Kimerling, R., Brown, P. J., Chrestman, K. R., & Levin, K. (1996). Psychometric review of the Life Stressor Checklist-​ Revised. In B. H. Stamm (Ed.), Measurement of stress, trauma, and adaptation (pp. 198–​201). Lutherville, MD: Sidran Press. Yurgil, K. A., Barkauskas, D. A., Vasterling, J. J., Nievergelt, C. M., Larson, G. E., Schork, N. J., . . . Baker, D. G. (2014). Association between traumatic brain injury and risk of posttraumatic stress disorder in active-​duty Marines. JAMA Psychiatry, 71, 149–​157. Zen, A. L., Whooley, M. A., Zhao, S., & Cohen, B. E. (2012). Post-​ traumatic stress disorder is associated with poor health behaviors:  Findings from the Heart and Soul study. Health Psychology, 31, 194–​201.

Part V

Substance-​Related and Gambling Disorders

17

Substance Use Disorders

Damaris J. Rohsenow

Clinicians working with substance use disorders (SUDs) need good tools to help them evaluate patient needs, plan treatment strategies tailored to these individual needs, and monitor progress in treatment. This chapter provides an overview of the most widely used, psychometrically sound instruments that are potentially useful for clinicians working with clients with SUDs. Instruments likely to be used only by researchers, as well as commonly used but psychometrically weak instruments, are not included. Accordingly, this chapter is not intended to provide an exhaustive list of available instruments, and someone's preferred instrument may well be omitted. Nevertheless, most of the best instruments that are likely to be clinically useful are reviewed in the chapter. The focus is on alcohol and illicit drugs, not tobacco or other licit substances. However, because assessments for alcohol use disorders are covered in Chapter 18, measures specific to alcohol are not discussed here. Additional instruments used in research are described by Donovan and Marlatt (2005).

THE NATURE OF SUBSTANCE ABUSE AND DEPENDENCE

Whether called addiction, abuse, or dependence, patients with SUDs typically show a combination of physical indicators (usually an abstinence syndrome), a variety of serious or ongoing negative consequences of drug use that affect significant areas of their lives (including financial, employment, health, family, social relationships, and psychological function), and an apparent compulsion to seek and use drugs despite ongoing negative consequences. Many of the behaviors involved in obtaining drugs also lead to victimization of others, whether through crime (usually committed to obtain funds to buy drugs) or physical harm (e.g., gunshot wounds). As such,

addiction has individual effects (physical, psychological, and family), community effects (social, employment, and financial burdens), and societal effects (crime, legal system, politics, and societal costs). However, for the practitioner, the primary focus is the individual with substance misuse (using the term in a broad sense), along with the consequences to that person and the effects of his or her drug use on his or her own network. The diagnostic criteria for SUDs are described later, but it should be noted that the terms "abuse," "misuse," and "addiction" are often used in the literature in a looser manner to refer to a person who continues ongoing use despite serious problems, regardless of whether formal diagnostic criteria for an SUD are met.

Comorbidity

Because this text is oriented toward the clinical assessment of SUDs, information on comorbidity derived from clinical samples will be emphasized as more relevant than community samples, to the extent that they differ. Abuse of one substance is often comorbid with abuse of a second substance. Approximately one-third of admissions to substance treatment programs are for both alcohol and illicit drug use (Substance Abuse and Mental Health Services Administration [SAMHSA], 2014). The most common additional substances of abuse for patients with opiate use disorders are marijuana, alcohol, and/or cocaine; for injecting drug abusers they are alcohol, benzodiazepines, cannabis, and/or amphetamines; and for patients with cocaine use disorders, they are marijuana or alcohol (SAMHSA, 2003). Patients with more than one drug of abuse are less likely to achieve remission and have more relapse after intensive treatment compared to patients abusing a single drug (Ritsher, Moos, & Finney, 2002; Walton, Blow, & Booth, 2000).


An estimated 37% of adults older than age 18 years with SUDs also have any mental illness, and 11% have a serious mental illness (SAMHSA, 2015). Comorbid disorders are most commonly affective disorders and anxiety disorders (Acosta, Haller, & Schnoll, 2005). Among people with cocaine use disorders, comorbidity rates for depressive disorders range from 11% to 55% (with depression usually preceding the SUD by approximately 7 years) and for bipolar disorder are approximately 42%; panic disorder is a common result of cocaine abuse; and the prevalence of post-traumatic stress disorder among those with SUDs is 10 times higher than among those who do not have an SUD (Acosta et al., 2005). Psychiatric comorbidities (other than personality disorders) for individuals with opioid use disorders are most commonly bipolar or anxiety disorders (Dilts & Dilts, 2005). The prevalence of current mood and/or anxiety disorder among heroin injectors with multiple substances of abuse is approximately 55%, with 25% having both a mood and an anxiety disorder (Darke & Ross, 1997). An excellent clinical guide to treatment issues involved with psychiatric comorbidity in those with SUDs is provided by Busch, Weiss, and Najavits (2005). Personality disorders occur in approximately 27% of people with past-year alcohol dependence and 54% of people with past-year drug dependence (corrected re-analysis of National Epidemiological Survey on Alcohol and Related Conditions [NESARC] data of 2001–2005 presented in Trull et al., 2016). The most common comorbid personality disorders for drug dependence are antisocial (40.2%), borderline (27.88%), avoidant (14.2%), schizoid or schizotypal (14.2%), obsessive–compulsive (10.6%), histrionic (10.3%), and paranoid (7.8%) (Trull et al., 2016). The rates are lower for alcohol dependence, ranging from the most prevalent, antisocial personality (18.8%), to the least prevalent, histrionic personality (1.8%).

Prevalence, Gender, Race, Ethnicity, and Geography

The prevalence of current substance dependence or abuse in the United States in 2013 (SAMHSA, 2014) was approximately 8.2% of people aged 12 years or older. Of these, approximately 12% had both alcohol and illicit drug use disorders, 20% had an illicit drug use disorder but not alcohol use disorder, and 68% had alcohol use disorder without a disorder of illicit drugs. Although from age 12 to 17 years the same number of males and females have SUDs (5.3% vs. 5.2%, respectively), from age 18 years on almost twice as many men as women

have SUDs in the past year (11.4% vs. 5.8%, respectively; SAMHSA, 2014). The geographic distribution of illicit drug use in the United States is fairly even, with somewhat higher rates in the West (11.8%) and Northeast (9.2%) than in the Midwest (8.7%) or South (8.3%) (SAMHSA, 2014). Only approximately 1.6% of people aged 12 years or older with lifetime SUD ever received any substance use treatment, and only 1.0% received it in substance abuse treatment facilities in 2014 (SAMHSA, 2015). The National Survey on Drug Use and Health (2013 survey; SAMHSA, 2014) showed the highest rates of SUDs for American Indians or Alaskan Natives (14.9%), then Native Hawaiians or Pacific Islanders (11.3%), multiracial people (10.9%), Hispanics (8.6%), non-Hispanic Whites (8.4%), and non-Hispanic Blacks (7.4%), with the lowest rates for Asians (4.6%). Gender differences in alcohol use disorders within each race/ethnicity were reported in the NESARC survey of 2001 and 2002 (National Institute on Alcohol Abuse and Alcoholism, 2006). In this survey, the highest rates were for American Indians (17.3% of males and 16.8% of females), then non-Hispanic Whites (17.3% of males and 8.1% of females), non-Hispanic Blacks (17.2% of males and 8.3% of females), and Hispanics (17.3% of males and 7.2% of females), with the lowest rates for Asians (11.0% of males and 6.8% of females). However, Whites have the highest rates of admission to publicly funded substance treatment facilities (2008 survey by SAMHSA; National Institute on Drug Abuse, 2011): Admissions were 59.8% White, 20.9% Black, 13.7% Hispanic, 2.3% American Indian or Alaskan Native, and 1.0% Asian or Pacific Islander.

The Addiction Career

There is little agreement on the etiology of SUDs, a topic difficult to study given that substances of abuse are not all similar in mechanism, effects, or likely determinants (Anthenelli & Schuckit, 1992). It is difficult to study the etiology of abuse or dependence for each drug completely separately, given that people may use various substances at different times or at the same time. Because different drugs of abuse involve different mechanisms, it has been difficult to investigate possible genetic factors specific to illicit drug abuse as opposed to alcohol abuse, so such research has focused on genetic factors in the neurotransmitters believed to confer greater susceptibility to drug dependence (e.g., mu or kappa opioid receptors or dopamine transmission) along with genes influencing externalizing


psychopathology (Dick & Agrawal, 2008). Studies of sociocultural factors do little to explain who specifically will develop drug dependence given that such a small number of people affected by these influences develop drug dependence (Johnson & Muffler, 1992). There is no one psychological or sociopsychological theory that is generally accepted as explanatory (e.g., Schulenberg, Maggs, Steinman, & Zucker, 2001). However, there may be a general genetically influenced liability of negative emotionality that is expressed as personality characteristics and behavioral tendencies (inadequate emotional regulations and maladaptive responses to stress) common to abuse of various substances as well as to other comorbid disorders (Tully & Iacono, 2016). Adolescent substance abuse is highest for those who have high novelty seeking combined with low harm avoidance and low reward dependence personality traits (Wills, Vaccaro, & McNamara, 1994). These personality trait measures were correlated with other measures of behavioral undercontrol such as risk-​taking, impulsivity, anger, independence, tolerance for deviance, and sensation seeking (Wills et  al., 1994). A  childhood pattern of behavioral undercontrol often leads to early onset of cigarette use, which in turn increases the probability of the onset of drug use (e.g., Brown, Gleghorn, Schuckit, Myers, & Mott, 1996; Farrell, Danish, & Howard, 1992; U.S. Department of Health and Human Services, 1989). The etiology of this pattern of behavioral undercontrol itself is unknown. However, this may not be the only pathway to substance dependence. For a review of the concept and evidence for and against behavioral undercontrol and negative emotionality as mechanisms, see Smith and Anderson (2001) and Tully and Iacono (2016). A large study of the natural history of 581 people with narcotic addictions tracked the course of events during the 30-​year period from 1956 to 1986 (Anglin et  al., 1988). There were several notable findings. First, 5 years after starting narcotics use (approximately age 17 years), most were daily users, with few remaining as occasional users. Second, daily use peaked at approximately age 30 years, decreased slightly as people entered into methadone maintenance, and then remained stable. Third, incarceration rates were highest between ages 20 and 30 years (approximately 60% of group) and then dropped off to 11% for the last decade. Fourth, deaths started occurring within 10  years, with a mortality rate of 27% of the group at the end of 30  years (comparable to the 10% to 50% mortality rate within 8 years reported across studies by Finney, Moos, & Timko, 2013). Fifth, for the last 10 to 15  years of the study period, approximately


22% of the people were abstinent. Although these data are rather old, to the extent that they reflect common developmental factors, the progression may hold up over time. Others have summarized the course of opiate addiction more simply: First use is usually in the teens or 20s, most active opiate users are 20 to 50 years old, and the addiction abates slowly and spontaneously in middle age, with 9 years being the estimated average duration of active addiction (Dilts & Dilts, 2005; Jaffe, 1989). Anglin et al. concluded that substance abuse treatment is needed much earlier in the addiction career because treatment interrupts the typical progression of addiction. The National Drug Abuse Treatment Outcome Study showed that, overall, treatment does work, with the greatest reductions in drug use occurring with treatments that last 3 or more months for 1-year results and 6 or more months for 5-year recovery (Hubbard, Craddock, & Anderson, 2003). Across 15 studies with 8 or more years of follow-up, the annualized "remission" rates averaged 4.0% (Finney et al., 2013).

PURPOSES OF ASSESSMENT

Three specific assessment purposes of most relevance for clinical use are emphasized in this chapter: (a) diagnosis, (b)  case conceptualization and treatment planning, and (c)  treatment monitoring and outcome evaluation. The emphasis in the case conceptualization and treatment planning section is on problem severity. This section also includes assessment of expectancies, high-​risk situations, self-​efficacy in handling risk, and coping skills because these can be useful in planning motivational interventions, relapse prevention, and coping skills training specific to individual needs. Measures of overall functioning/​ impairment or functioning in interpersonal, family, psychiatric, medical, and employment domains can be useful in evaluating need for family or couples therapy, employment assistance, legal or medical services, social services, and so on. The focus in this chapter is on the assessment of SUDs regardless of substance, with less focus on measures that were developed for use with only one substance. The assessment of other behaviors that are sometimes seen as addictive, such as gambling or sexual offending, is not discussed here (the assessment of gambling is covered in Chapter  19). An excellent text that covers an array of substance-​specific assessment measures, addictive behaviors not involving chemical substance, and measures designed for use in research studies is the one by Donovan and Marlatt (2005).


ASSESSMENT FOR DIAGNOSIS

The diagnostic criteria of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association [APA], 2013) have replaced those of DSM-IV-TR (APA, 2000) as the standard for clinical practice in the United States. Measures focused on World Health Organization (WHO) criteria are not addressed because they are less likely to be relevant to practice in the United States. DSM-5 revisions replaced the abuse and dependence categories with a single continuum, but people can be categorized based on number of criteria met into mild, moderate, and severe levels of SUD. Substance-related legal problems were eliminated due to low occurrence, cultural variability, and poor fit with the other diagnostic information (Goldstein et al., 2015; Schuckit, 2012), and craving or urge to use was added (to increase consistency with WHO criteria), but otherwise the list of criteria is the same. The SUD can be specified as "in a controlled environment," "in early remission," "in sustained remission," and, for certain substances, "on maintenance therapy" (craving is the only criterion that can occur during remission). SUDs are a maladaptive pattern of substance use leading to clinically significant impairment or distress as indicated by two or more of the following occurring within the same 12-month period: tolerance (increased amount needed for same effect or markedly less effect with same amount of use); withdrawal (either the characteristic withdrawal syndrome or using the substance or a closely related substance to prevent/relieve withdrawal); amount or duration of substance use that is greater than intended; repeated unsuccessful attempts to cut down or control substance use; much time spent in activities needed to obtain, use, or recover from the substance; important activities stopped or reduced due to substance use; substance use continues despite knowledge of a persistent or recurrent physical or psychological problem caused or exacerbated by the substance; recurrent substance use resulting in failure to fulfill major obligations at work, home, or school; recurrent substance use in situations in which it is physically hazardous; continued substance use despite the use causing or exacerbating social or interpersonal problems; and craving or strong desire or urge to use a specific substance. (For full, exact wordings of these criteria and any substance-specific differences, see DSM-5 or del Boca, Darkes, and McRee [2016].) Two or three symptoms indicate a mild SUD, four or five symptoms indicate a moderate SUD, and six or more symptoms indicate a severe SUD.
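As a minimal illustration (not drawn from the chapter or from any published scoring tool), the symptom-count rule above could be expressed in code as follows; the function name and return labels are assumptions made for the example:

```python
def dsm5_sud_severity(symptom_count: int) -> str:
    """Map a DSM-5 SUD symptom count to a severity label.

    Thresholds follow the rule summarized in the text: 2-3 symptoms = mild,
    4-5 = moderate, 6 or more = severe; 0-1 symptoms do not meet criteria.
    The function name and labels are illustrative, not part of any instrument.
    """
    if symptom_count >= 6:
        return "severe"
    if symptom_count >= 4:
        return "moderate"
    if symptom_count >= 2:
        return "mild"
    return "criteria not met"


# Example: a client endorsing 5 of the 11 criteria within the same
# 12-month period would be classified as having a moderate SUD.
print(dsm5_sud_severity(5))  # -> moderate
```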

Screening Measures

Screening measures are typically used in general medical settings or employee assistance programs to identify or rule out probable SUD without providing a diagnosis. These are brief measures that can be quickly administered to identify people who may be in need of further evaluation or assistance. Cut-off points for screening measures can be set to err on the side of false positives or false negatives, depending on the purpose of the assessment. However, any positives should be followed up with further evaluation rather than being considered indicative of an SUD per se. The best known screening measures are rated in Table 17.1 and described in the following paragraphs.
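To make the trade-off behind cut-off selection concrete, the sketch below uses invented scores and outcomes (no real DAST or DUSI-R data or cut-offs) to show how sensitivity and specificity shift as the cut-off moves:

```python
# Invented screener totals paired with whether a fuller evaluation later
# confirmed an SUD (True) or not (False). Purely illustrative data.
cases = [(1, False), (2, False), (3, True), (4, False),
         (5, True), (6, True), (8, True), (9, True)]

def screen_performance(cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff count as positive."""
    tp = sum(1 for score, sud in cases if sud and score >= cutoff)
    fn = sum(1 for score, sud in cases if sud and score < cutoff)
    tn = sum(1 for score, sud in cases if not sud and score < cutoff)
    fp = sum(1 for score, sud in cases if not sud and score >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# A lower cut-off catches more true cases (higher sensitivity) at the cost of
# more false positives; a higher cut-off reverses the trade-off.
for cutoff in (3, 6):
    sensitivity, specificity = screen_performance(cutoff)
    print(f"cut-off >= {cutoff}: sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
```

Run on these invented data, a cut-off of 3 yields perfect sensitivity with one false positive, whereas a cut-off of 6 yields perfect specificity but misses two true cases.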

TABLE 17.1 Ratings of Instruments Used for Screening or Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility

Screening for Substance Use Disorders
DAST | NA | E | NA | NR | A | A | E | A
DUSI-R | NA | G | NA | NR | A | A | E | A

Diagnostic Instruments
SCID-5 | NA | NA | G | G | G | A | E | L
MINI | NA | NA | G | E | G | G | E | E
CIDI | NA | NA | G | G | G | A | E | L
SDSS | NA | G | NA | G | G | G^a | G | A
GAIN-I | A | G | NA | G | G | G | E | A
MWC | NA | NR | NA | G | G | G | G | E

^a Good except for ICD-10 harmful use and cocaine dependence diagnoses, which were less than adequate.

Note: DAST = Drug Abuse Screening Test; DUSI-R = Drug Use Screening Inventory-Revised; SCID-5 = Structured Clinical Interview for DSM-5 Axis I Disorders-Patient Version; MINI = Mini-International Neuropsychiatric Interview 6.0; CIDI = Composite International Diagnostic Interview; SDSS = Substance Dependence Severity Scale, section on diagnoses; GAIN-I = Global Appraisal of Individual Needs-Initial Interview, substance use scales; MWC = Marijuana Withdrawal Checklist; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


It was not necessary for test developers to update these measures to address DSM-5 criteria because they were never intended to assess those exact criteria. In addition, because reasonably similar prevalences of moderate to severe DSM-5 SUD and DSM-IV (APA, 1994) substance dependence occur (Goldstein et al., 2015), the screening measures are likely to work as well with the current DSM system. For a further discussion of alcohol-specific and adolescent-specific measures, see the review of assessments by del Boca et al. (2016).

Drug Abuse Screening Test

The Drug Abuse Screening Test (DAST; Skinner, 1982), a 28-item self-report test, and the 10-item short form (DAST-10) provide an indicator of who might have an SUD and need further evaluation, with evidence of excellent internal reliability and good validity (Gavin, Ross, & Skinner, 1989). Data on test–retest reliability are unavailable. Both ask about drug abuse in the past year, rather than over the lifetime, and an adolescent version is available. The DAST is composed of five factors (early psychosocial complications with problem recognition, late-onset serious social consequences, treatment/help-seeking, illegal activities, and inability to control drug use), but because psychometric properties of the separate factors were not investigated (Staley & El-Guebaly, 1990), only the total score should be used. Both the DAST and the DAST-10 focus on negative consequences of use rather than quantity or frequency of use.

Drug Use Screening Inventory

The Drug Use Screening Inventory (DUSI; Tarter, 1990), a longer self-report measure (140 yes/no items) with both adult and teen versions, assesses problem severity in the following 10 domains: substance use preferences and consequences, behavioral maladjustment, health, psychiatric disorder (depression, anxiety, antisocial, and psychotic), school adjustment, work adjustment, social competence, peer relationships (e.g., antisocial or substance involvement), family dysfunction/conflict, and recreation. It takes approximately 20 minutes to complete via paper or computer and is easy to score manually. Efforts were made to ensure items are free of cultural bias and at a fifth-grade reading level. The revised version (DUSI-R) has a validity check and Lie scale, has been found to have adequate or better psychometric qualities, and includes cut-off scores to indicate a probable diagnosis (Tarter & Kirisci, 1997). This measure is more highly recommended than the DAST and DAST-10, despite


the extra administration time, because it provides more information.

Diagnostic Instruments

In clinical settings, diagnosis is often determined without a formal structured set of specific questions. When a formal system is needed to ensure accuracy of diagnosis (e.g., for research or clinical statistics), the instruments rated in Table 17.1 and described next are the best validated structured systems available. Because it can take up to 10 years to develop and test the reliability and validity of a structured interview, it may take a while before there is psychometric evidence for instruments that have been fully updated to DSM-5 criteria. The Structured Clinical Interview for the DSM (SCID-5; First, Williams, Karg, & Spitzer, 2015) is validated directly against both the DSM-5 and the 10th edition of the International Classification of Diseases (ICD-10; WHO, 1992; https://www.cdc.gov/nchs/icd/icd10cm.htm) diagnostic criteria. This interview is considered the most valid method for determining DSM-5 psychiatric diagnoses and is the most widely used structured diagnostic interview. The SCID-5 should be administered by clinicians, not other trained interviewers, with the version for clinicians called the SCID-5-CV. This structured interview is very lengthy (1 hour or more for the full interview) and requires formal training in administration and scoring; as such, it may not be cost-effective for treatment agencies because the resulting data provide diagnostic information but no other information necessary for treatment planning. A much briefer alternative is the Mini-International Neuropsychiatric Interview (M.I.N.I. 6.0; https://www.psychcongress.com/saundras-corner/scales-screeners/structured-diagnostic-interview-instruments/mini-international-neuropsychiatric-interview-60-mini-60; Sheehan et al., 1998), which evaluates all current diagnoses in approximately 15 to 17 minutes, so determining alcohol or other substance use disorders takes only a fraction of that time. Although it was designed for researchers, a computerized version makes it easy for any clinician to use; it is used by health and mental health professionals in more than 100 countries. Because each section starts with one or more screening questions (questions about drinking a certain amount or using street drugs for the alcohol and substance use sections, respectively), the whole interview does not need to be administered to people who do not meet some minimal criteria for a section. Patients had positive opinions about the interview and the interview format (Pinninti, Madison, Musser, & Pissmiller, 2003).
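The gating logic described above can be illustrated with a rough sketch; the module names, question wordings, and follow-up items below are hypothetical and are not taken from the M.I.N.I.:

```python
# Hypothetical interview structure: each module has a single screening item and
# follow-up items that are asked only if the screen is endorsed. The section
# names and question wordings are invented for illustration.
modules = {
    "alcohol use": {
        "screen": "In the past 12 months, did you drink more than you intended?",
        "follow_ups": ["tolerance", "withdrawal", "unsuccessful attempts to cut down"],
    },
    "substance use": {
        "screen": "In the past 12 months, did you use any street drugs?",
        "follow_ups": ["use in hazardous situations", "craving", "role failure"],
    },
}

def administer(responses):
    """Return follow-up items to ask, skipping modules whose screen was denied."""
    to_ask = []
    for name, module in modules.items():
        if responses.get(module["screen"], False):  # screening question endorsed?
            to_ask.extend((name, item) for item in module["follow_ups"])
        # otherwise the entire module is skipped, shortening administration time
    return to_ask

# Example: only the drug screen is endorsed, so only that module is administered.
print(administer({"In the past 12 months, did you use any street drugs?": True}))
```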


The M.I.N.I. was validated against the SCID for DSM-IV and the Composite International Diagnostic Interview (CIDI; Robins et al., 1988) for ICD-10, as well as against expert opinion (Sheehan et al., 1998), and it produces separate diagnoses for current (past 12 months) alcohol abuse, alcohol dependence, substance abuse, and substance dependence. So far, the mania/hypomania sections have been updated and validated for DSM-5 (Hergueta & Weiller, 2013), and a version for 17 DSM-5 diagnoses has been developed but was not available at the time this chapter was written (http://harmresearch.org/index.php/product/mini-international-neuropsychiatric-interview-mini-7-0-2/). The Diagnostic Interview Schedule (DIS; Robins, Helzer, Cottler, & Golding, 1989; Robins et al., 2000) was designed to provide reliable and valid SUD diagnoses based on DSM-III (APA, 1980) and DSM-IV criteria using a format involving fewer clinical judgments so that it could be administered by a trained technician. It was replaced by the CIDI when WHO expanded and updated the DIS to meet international criteria (ICD-10). (A version for DSM-5 is due in 2018; https://www.hcp.med.harvard.edu/wmhcidi/trc_americas/.) However, the CIDI is designed for research, not clinical practice; takes 2 hours to administer; requires extensive training; and is available only as a computerized version. Therefore, these instruments are usually not clinically useful. For assessing withdrawal aspects specific to cannabis abuse, it is preferable to use a measure based on the empirical work that established the cannabis withdrawal syndrome. Budney, Moore, Vandrey, and Hughes (2003) demonstrated a unique pattern of withdrawal symptoms, including aggression, anger, anxiety, decreased appetite, decreased body weight, irritability, restlessness, shakiness, sleep problems, and stomach pain. The Marijuana Withdrawal Checklist (MWC; Budney, Hughes, Moore, & Novy, 2001; Budney et al., 2003), a self-report measure designed to assess this pattern, includes the 15 items most frequently endorsed. Studies with the 15-item version found evidence of good test–retest reliability and validity, as well as the expected gradual change over days along the predicted time course of withdrawal (Budney et al., 2003). Two instruments for assessing opiate withdrawal with some supporting reliability and validity evidence are the Subjective Opiate Withdrawal Scale (16 items, self-administered) and the Objective Opiate Withdrawal Scale (13 items, interviewer-administered), both developed by Handelsman et al. (1987). For cocaine dependence, withdrawal is an infrequently endorsed symptom.

These measures are not rated in Table 17.1 because there is only limited evidence concerning their reliability and validity. The Substance Dependence Severity Scale (SDSS; Miele et al., 2000a) is a semi-​structured clinical interview that takes approximately 30 to 45 minutes and requires extensive training. Part of it results in current diagnoses for DSM-​IV and ICD-​10 (WHO, 1997)  substance abuse/​dependence/​harmful use disorders by operationalizing every criterion used in diagnosis. Each diagnostic criterion is rated for both severity (usual and worst) and frequency (number of days and number of days at the worst). Scores on the SDSS scales have demonstrated good to excellent test–​retest reliability (except for cannabis, for which it was fair to poor), internal consistency, and validity for the DSM-​IV items (Miele et al., 2000a, 2000b). Percent agreement with DSM-​IV diagnoses was 83% to 92% for alcohol, cocaine, heroin, sedatives, and cannabis (the only diagnoses investigated). The test–​ retest reliability and internal consistencies for the ICD-​ 10 dependence scales of alcohol, heroin, and cocaine were excellent, but the ICD-​ 10 harmful use scales mostly had unacceptably poor test–​retest and/​or internal consistency reliabilities. Percent agreement with ICD-​ 10 dependence diagnoses was good to excellent for alcohol, heroin, and cannabis but only fair for cocaine, and for harmful use diagnoses were only fair (unacceptable) for heroin, cocaine, and cannabis. Therefore, as long as ICD-​10 harmful use or cocaine dependence diagnostic information is not needed, this instrument will produce valid DSM-​IV diagnoses for alcohol, cocaine, heroin, sedatives, and cannabis use disorders. It is not clear whether there are plans for an update to DSM-​5. However, the SDSS severity scores, indicating degree of severity similar to the DSM-​5 degree of severity, has been validated against clinical severity ratings for alcohol, cocaine, and heroin (Miele et al., 2000b). The Global Appraisal of Individual Needs (GAIN-​ I; Dennis, 1999; Dennis, Scott, & Funk, 2003; Dennis, Titus, White, Unsicker, & Hodgkins, 2003)  is a semi-​ structured interview designed to obtain comprehensive information about the functioning of adult or adolescent patients (see further description later). The latest version, GAIN 5.7, has been updated to DSM-​5 criteria. It takes 1½ to 2½ hours to administer, and it requires considerable training. There is a Web-​based Assessment Building System format (GAIN-​ABS) that also provides DSM-​5 diagnoses. The Initial Interview version includes a diagnostic section, and the diagnoses of SUDs as well as other disorders have evidence of good test–​ retest reliability

Substance Use Disorders

estimates and concordance with independently obtained diagnoses (Dennis, 1999; Shane, Jasiukaitis, & Green, 2003). A  2-​to 5-​minute Short Screener (GAIN-​SS) is available for rapidly identifying those who are likely to have an SUD. Overall Evaluation The previously discussed screening and diagnostic measures have all demonstrated scientific adequacy as screening or diagnostic measures. Screening measures such as the DAST are not relevant for SUD treatment programs, but they are useful in other settings to identify people probably in need of further assessment or treatment. The GAIN-​SS (screening version) is useful for identifying people who need full diagnostic assessment for certain diagnoses (SUD or other), but psychometric information about this screener was not available. The DUSI, although intended to screen for SUDs, is also useful for screening for a number of areas of life function in a way comparable to the Addiction Severity Index (discussed later) but with easier administration and scoring; for this reason, it is highly recommended. The diagnostic measures are relevant only if accurate formal diagnoses are needed. Because many SUD treatment programs treat anyone who presents with substance-​ related problems or concerns, having access to accurate diagnoses is unlikely to affect treatment admission or planning, but diagnoses can affect reimbursement. The M.I.N.I., when updated to DSM-​5, is recommended as a method that is fast, accurate, and shows patient acceptance. Otherwise, the diagnostic section of the GAIN 5.7 is most highly recommended as the next least time-​intensive way to obtain the most diagnoses with good psychometric support. The others were not recommended due to the lengthy time and training needed (SCID-​5, CIDI, and SDSS) or the cumbersome amount of information produced (SDSS).

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Rationale for Instrument Selection A number of assessment instruments are commonly used to provide clinicians with guidance for case conceptualization and treatment planning. Some measures include severity of drug use and problems specific to the drugs

365

per se; others address the severity of problems in related aspects of life functioning (e.g., employment, legal, family), whether or not drugs are perceived as the cause of the problems, thus allowing for the determination of areas of life functioning in need of improvement and for additional specialized services such as social services, employment assistance, or marital or family therapy. Assessment of the patient’s anticipated positive and negative consequences of drug use is sometimes used in developing motivational interviewing treatment plans by investigating sources of and barriers to motivation. Relapse prevention training involves assessing high risk situations for relapse so as to prepare patients to cope with their own “Achilles heel” situations. Assessment of coping skills can provide information about skills and resources that can already be drawn on, maladaptive skills that need to be replaced, and skills and resources that are lacking. Skills training for substance abusers has focused either on making general lifestyle changes consistent with sobriety or on developing skills for coping with immediate urges to use in the presence of situations that pose a high risk for relapse (Monti, Kadden, Rohsenow, Cooney, & Abrams, 2002). In most cases, both types of skills need to be assessed. Some potential assessment domains are not addressed in the chapter, based on either clinical or scientific reasons. First, measures of craving are not included because it is not clear that degree of craving per se can be useful in treatment planning, as opposed to identifying situations or events that trigger craving, which can be quite important. Second, although numerous studies have shown that having social networks that include substance users (particularly one’s partner) poses a serious risk for continued drug use (see review by Westphal, Wasserman, Masson, & Sorenson, 2005), this risk is easy to assess without any formal assessment tool. Although the Important People Drug and Alcohol interview predicted outcome for patients with cocaine use disorders, it was just the number of people in the daily network that predicted less drinking, drug use, and problem severity over 6  months, not the measures of the supportiveness or drinking status of the network (Zywiak et  al., 2009), and that is easy to assess without using a measure. Therefore, this section focuses on tools that have adequate psychometric information and that could be useful in treatment planning. Detailed ratings of their psychometric properties can be found in Table 17.2. Increasing Honest Reporting Structured interviews with individuals with SUDs about their drinking or substance use have been found to

366

Substance-Related and Gambling Disorders

TABLE 17.2   Instrument

Ratings of Instruments Used for Case Conceptualization and Treatment Planning Norms

Internal Consistency

Inter-​Rater Reliability

Test–​Retest Reliabilitya

Content Validity

Construct Validity

Validity Generalization

Clinical Utility

Highly Recommended

Severity of Drug Use and Psychosocial Functioning SDS

NA

E

NA

E

E

E

NR

A

SDSS

NA

E

NA

E

E

E

E

A

ASI-​6

NA

G

G

G

A

A

G

A

GAIN-​I

NA

G

NA

NA

G

G

A

A

✓ ✓ ✓

Negative Consequences of Ongoing Use IDUC SIP-​AD MPS

NA NA NA

E E E

NA NA NA

G G NR

G G NR

A A A

A A NR

A A A

CNCC-​87

NA

G

NA

NR

G

G

A

A

Expected Acute Effects of Use SUBQ

NA

E

NA

NR

G

G

NR

A

CEQ

NA

G

NA

NR

G

G

A

A

Assessment for Relapse Prevention Treatment Planning IDTS DTCQ AASE

A NA NA

G G E

NA NA NA

NR NA NR

G A A

G A G

G G NR

A A A

CRACS-​SE

NA

E

NA

NR

A

G

G

A

POC-​10 items

NA

A

NA

NA

A

A

NR

A

USS/​GSC

NA

E

NA

NR

G

G

A

A

✓ ✓



Test–​retest reliability is generally not applicable because clients in treatment are unstable in these areas and are expected to have variability over short periods of time. a

Note:  SDS  =  Severity of Dependence Scale; SDSS  =  Substance Dependence Severity Scale; ASI-​6  =  Addiction Severity Index-​Version 6; GAIN-​ I = Global Appraisal of Individual Needs-​Initial Interview, substance use scales; IDUC = Inventory of Drug Use Consequences, four scales (excluding intrapersonal); SIP-​AD  =  Short Inventory of Problems–​Alcohol and Drugs; MPS  =  Marijuana Problems Scale; CNCC-​87  =  Cocaine Negative Consequences Checklist; SUBQ = Substance Use Beliefs Questionnaire; CEQ = Cocaine Effects Questionnaire; IDTS = Inventory of Drug Taking Situations; DTCQ = Drug-​Taking Confidence Questionnaire; AASE = Alcohol Abstinence Self-​Efficacy; CRACS-​SE = Self-​Efficacy ratings from the Cocaine Related Assessment of Coping Skills: POC-​10 = 10 items extracted from the Processes of Change Questionnaire for a study with opiate-​using patients; USS/​GSC = Urge-​Specific Strategies Questionnaire and General Change Strategies Questionnaire; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

provide sensitive and reliable information when there is (a)  an interviewer and clinical set that encourages honest reporting (i.e., no unpleasant consequences, including interviewer disapproval), (b) assurances of confidentiality, (c)  breath alcohol testing at the interview to ensure the person is alcohol-​free during the interview, and (d) interviewee awareness that his or her reports will be corroborated by urine screens and/​or reports of family members or close friends (Ehrman & Robbins, 1994; Sobell et  al., 1996; Sobell & Sobell, 1986). Patients are likely to become dishonest in their reporting when expecting scolding, lectures, disappointing the therapist, changes in treatment, or reporting to others who may impose consequences as a result of disclosing use. Thus, the interviewer set is particularly important both with interviews and with self-​report measures: Knowing that there will be no negative consequences or disapproval for reporting substance use removes the primary disincentive to honesty.

Severity of Substance Use and Psychosocial Functioning Number of DSM-​5 Symptoms With DSM-​5 indicating severity on a continuum, the number of criteria met is designed to be used as a measure of SUD severity. The Severity of Dependence Scale If the clinician is not assessing in order to obtain a formal diagnosis, the simplest assessment alternative is the Severity of Dependence Scale (SDS; Ferri, Marsden, de Araujo, Laranjeira, & Gossop, 2000; Gossop, Best, Marsden, & Strang, 1997; Gossop et  al., 1995), which uses just five face-​valid items (about worry/​anxiety, feeling out of control, and desire to stop or difficulty with stopping) to assess a pattern of dependence severity across the

Substance Use Disorders

patient’s preferred substance. It has demonstrated excellent psychometric properties, and although there is no information about any racial diversity in any of the validation samples, it has been validated in three countries (England, Australia, and Brazil). Substance Dependence Severity Scale The SDSS (Miele et al., 2000a), a semi-​structured clinical interview, assesses the severity of every symptom of both DSM-​ IV and ICD-​ 10 (WHO, 1997)  substance abuse/​dependence/​harmful use disorders among people aged 16  years or older. Substance-​ specific questions assess frequency, recency, and amount of use in the past 30  days only, as well as asking usual and worst severity of each diagnostic criterion and also number of days of any use and number of days at the worst severity for each. These questions cover a wide range of abused substances, including alcohol, cocaine, heroin, stimulants, licit opiates, sedatives, methadone, cannabis, hallucinogens, and two “other” categories covering drugs such as inhalants. However, the cannabis items omit withdrawal, which was only found to be a valid symptom after this measure was developed. The SDSS takes specialized training and can require as much as 45 minutes to administer. The SDSS scale scores have been found to demonstrate good to excellent test–​ retest reliability, internal consistency, and validity for the use (quantity/​frequency) items, DSM-​IV severity items (except for cannabis), and ICD-​10 dependence but not harmful use scales (Miele et  al., 2000a, 2000b). The best validity was shown for the alcohol, heroin, and cocaine severity scales. Patients reporting more days that symptoms were present returned to drug use more quickly, suggesting that this frequency scale predicts need for more intensive care (Miele et al., 2001). On the other hand, greater usual severity of dependence symptoms predicted slower return to drug use (Miele et  al., 2001), consistent with more serious problems or concern about consequences of drug use making people more motivated for change. Therefore, this instrument has generally excellent psychometric properties (except for cannabis scales for either DSM-​IV or ICD-​10 harmful use) and can be a useful way to assess recent use and severity of specific DSM-​IV SUD symptoms.

367

interview has become the most widely used instrument for assessing both SUD severity and severity of other life problems in SUD treatment settings. It was recently updated to the sixth edition so as to correct some psychometric problems with the widely used fifth edition, including adding a 6-​month time frame to the lifetime and 30-​day questions, thus improving the structure and clarity of the questions, and reducing the time burden of the many additional questions by adding screening questions with skip-​outs. The ASI provides severity scores that have been found to be reliable and valid for recent (past 30  days) drug use (not specific to any one drug), alcohol use, and problems in five life areas: medical, employment/​finances, legal, psychiatric, and social/​family functioning (three scales for this aspect: adult relationships (problems and support), use of free time, and problems and needs regarding minor children). The drug and alcohol use sections ask about past 30 days, past 6 months, and lifetime frequency of use of each of a number of drugs and also of a number of consequences of drug or alcohol use. Each section of the interview previously included an overall clinical rating of severity, but because these ratings and the previous composite scores were not acceptably reliable, they were replaced by newly developed summary indices that were developed empirically (Cacciola et al., 2011; Denis, Cacciola, & Alterman, 2013). However, whether or not a summary score is desired, the specific information derived from the interview provides the clinician with a wealth of useful information. The ASI requires specialized training offered by the authors in Philadelphia, requires computerized scoring of the summary indices, and requires approximately 45 to 90 minutes to administer. A  computer-​administered version eases some of the burden. The new version shows acceptable support for each of the indices in terms of separation and stability, with strong evidence of reliability and validity for the 30-​day indices (Cacciola et al., 2011). (Psychometrics for the 6-​month and lifetime indices were not reported.) Like the original version, this edition was developed using patients in a variety of community settings in an urban area, with the primary substance of abuse being cocaine, heroin, or alcohol, but it was limited to mostly unemployed patients. Generalizability was assessed between genders and between Whites and African Americans and was found to be acceptable. It has been validated in Spanish (Díaz Mesa et al., 2010) and in Portuguese in Brazil (Kessler et al., 2012).

Addiction Severity Index The Addiction Severity Index (ASI; Cacciola, Alterman, Habing, & McLellan, 2011; McLellan, Luborsky, Woody, & O’Brien, 1980; McLellan et  al., 1992). This structured

Global Appraisal of Individual Needs The GAIN’s (Dennis, 1999; Dennis, Scott, et  al., 2003; Dennis, Titus, et  al., 2003; http://​www.chestnut.org/​li/​

368

Substance-Related and Gambling Disorders

gain) semi-​structured interview has sections on family/​ living arrangement, substance use, physical health, risk behaviors, mental health, environment, legal, and vocation. As such, it can provide comprehensive background information on patients similar to that obtained by the ASI. It can be used for American Society of Addiction Medicine-​based level of care placement, Joint Committee on Accreditation of Hospital Organization-​ based treatment planning, and Drug Outcome Monitoring Study-​ based outcome monitoring. The GAIN can be administered by paper or computer and takes 60 to 120 minutes for the initial evaluation. The substance use section, in addition to providing diagnostic information (as previously described), asks for self-​reported frequency of use in the past month for categories of drugs or any substance, recency of use of each of these categories, peak quantity of use of each category, frequency (days) of use of each, number of days with problems from substance use, number of past-​month SUD diagnostic symptoms, and a current withdrawal scale, all with excellent reliability and validity (Dennis, Titus, et al., 2003). In a comparison of biometric data (hair and urine) and three self-​report measures (recency, quantity, and frequency) of use of marijuana, cocaine, opioids, and other substance, the GAIN’s Substance Frequency Scale performed as well or better than other measures or methods of combining measures (Lennox, Dennis, Scott, & Funk, 2006). Other scales in the GAIN, all with evidence of at least adequate reliability and validity, include number of days of past treatment, environmental risks for relapse, illegal activities, emotional problems, and employment activities. Negative Consequences of Use Although the assessment of negative consequences of substance abuse overlaps with material addressed in the preceding section, the measures described previously either focused on severity of diagnostic symptoms alone or on life functioning (whether or not problems in life functioning could be directly attributed to substance use). Assessment of a range of consequences perceived by patients to be specifically due to substance use can be useful for treatment planning in two ways. First, it provides an overview of areas of functioning that should improve as a result of abstinence and treatment. Second, the information can be used to increase the patient’s awareness of areas of life that could be improved via abstinence. The Inventory of Drug Use Consequences (IDUC; Tonigan & Miller, 2002) is a 50-​item self-​report measure

of the consequences of drug or alcohol use (not differentiated from each other). There are separate versions for lifetime and the past 3  months of use, and each of these has a version worded in the third person that can be completed by a family member or friend. The IDTC was developed to provide clinicians with a relatively brief (approximately 10–​15 minutes) and easy tool that is in the public domain. Scores on four of the five scales have demonstrated excellent internal consistency reliability (physical problems, social relationships, interpersonal problems, and impulse control), and a confirmatory factor analysis showed that these same four scales adequately represent a larger domain of negative consequences and correlate with other measures of negative consequences (Tonigan & Miller, 2002). Further work produced the 15-​item Short Inventory of Problems–​Alcohol and Drugs (SIP-​AD; Blanchard, Morgenstern, Morgan, Labouvie, & Bux, 2003). The items all load on one scale (indicating the degree of adverse consequences) that has been found to yield excellent reliability estimates and that significantly correlates with other measures of alcohol and drug severity, dependence symptoms, substance use frequency, and psychiatric severity. Although both versions have demonstrated good to excellent reliability and at least adequate validity (see Table 17.2), the long version (excluding the intrapersonal section) would be more useful in treatment planning because it provides reliable indices of problems in four different life areas that can be targeted for coping skills training or motivational approaches. The Marijuana Problems Scale (MPS; Stephens, Roffman, & Curtin, 2000) assesses 19 recent and lifetime problems that patients with SUDs attribute to marijuana use, each rated as no problem, minor problem, and serious problem, and that are summed to provide an index of problem severity. This self-​report measure was derived in part by rewording many DAST items for marijuana, deleting the treatment items, and adding some other consequences (Stephens, Wertz, & Roffman, 1993). (In one publication, it was called the Marijuana Consequences Questionnaire [Budney, Higgins, Radonovich, & Novy, 2000], which can result in confusion with the other measure by that name.) Domains include psychological, social, legal, and occupational consequences (examples include memory problems, family problems, and procrastination). A  26-​item version is a checklist, but the 19-​item version asks patients to rate each item as a mild or major problem versus no problem. There is limited psychometric information available on either version of

Substance Use Disorders

this measure. For the 19-​item version, one study reported very high internal consistency reliability (Stephens et al., 2000) and showed change in problems during a 4-​month period among marijuana-​ dependent patients in active treatment versus delayed-​treatment condition that paralleled changes reported for frequency of marijuana use and number of dependence symptoms (Stephens et  al., 2000). However, no other forms of validity analyses have been conducted. Although the 26-​ item checklist has been used in more studies, there is virtually no supporting psychometric information for this version, with one report of high internal consistency reliability at follow-​up (Stephens et al., 1993) but no reported reliability pretreatment, no concurrent correlations reported to support its validity, and no differences between pretreatment abstainers and users of marijuana in scores (Moore & Budney, 2002). Therefore, the 19-​item MPS is a brief and valid measure of degree of initial problems, but further psychometric information is needed and psychometric properties of the 26-​item checklist version are unknown. A separate 50-​ item measure called the Marijuana Consequences Questionnaire (Simons, Dvorak, Merrill, & Read, 2012)  was developed and validated only on college students and so is not recommended for clinical use. The Cocaine Negative Consequences Checklist (CNCC; Michalec et al., 1996) assesses long-​term negative life events that cocaine abusers perceive to result from their own cocaine use. The items all fall on a single scale that has demonstrated evidence of high reliability, but they can also be scored for four reliable content area scales:  physical health, emotional/​psychological, social/​ relationship, and legal problems. The scales correlated significantly with other measures of use and severity in two samples, and they were found to predict which cocaine users would seek help (Varney et al., 1995). An expanded second edition, with 75 items (CNCC-​75) that added financial and vocational items (Rohsenow, Monti, et al., 2004), has been reported to yield equally high reliability estimates and predicts cocaine use outcomes after treatment. Expected Effects of Use In addition to assessing past consequences, often due to longer term use, the assessment of positive and negative effects expected fairly immediately from an episode of substance use can be used as feedback in motivational interviewing (Miller & Rollnick, 1991, 2002) or in functional analysis-​ based coping skills training (Rohsenow,

369

Monti, et al., 2004). These measures are inherently specific to specific substances. Measures developed on and for college students are not covered because they are not known to be relevant to clinical populations and often involve a high reading level and large number of items. Expectancies Across Four Substances A brief Substance Use Beliefs Questionnaire (SUBQ) was designed to assess expected effects of alcohol, nicotine, opiates, and stimulants among users seeking treatment or willing to seek treatment (Kouimtsidis, Stahl, West, & Drummond, 2014). The two resulting factors are positive versus negative expectancies, with good criterion and predictive validity. The 98-​item original version was reduced to 28 items, with evidence of excellent internal reliability estimates and substantial correlations with the long version. The negative expectancies scales predicted change in dependence level 3 months after treatment. No other measures of opiate or stimulant expectancies were found that had evidence of at least adequate reliability and validity. Most other measures of alcohol expectancies were developed on and for university students, most of whom did not have alcohol diagnoses. Expectancies for Cocaine The Cocaine Effects Questionnaire for Patient Populations (CEQ; Rohsenow, Sirota, Martin, & Monti, 2004)  is a 33-​item self-​report instrument with seventh-​ grade reading level that assesses seven factors of fairly immediate positive and negative effects that patients said they expected from cocaine use. Reliability and validity estimates have been found to be good, with several subscales correlated with amount of cocaine use and with urge to use cocaine. This information was used in coping skills treatment planning by helping patients identify alternative nondrug ways to obtain desired positive effects and to remind patients of negative experiences they wish to avoid (Rohsenow, Monti, Martin, Michalec, & Abrams, 2000), and it was used in motivational interviewing as a way to augment discussion of advantages and disadvantages of cocaine use (Rohsenow, Monti, et  al., 2004). Other cocaine expectancy measures and a parallel measure for marijuana expectancies have been developed on college populations, most of whom did not use cocaine/​marijuana, much less meet criteria for SUDs, so these measures are not considered useful for patient populations.

370

Substance-Related and Gambling Disorders

Assessment for Relapse Prevention

2002; Rohsenow et al., 2001), a structured interview developed to identify highly personal relapse risk situations for According to social learning models of relapse prevenuse in cue exposure therapy, is easily adapted for use with tion (e.g., Monti et al., 2002), some of the most imporany drug of abuse, as was done in identifying personal tant areas to assess for treatment intervention include high-​risk situations of cocaine-​dependent patients as the (a)  situations (interpersonal, emotional/​ cognitive, and basis of functional analysis-​based cocaine-​specific coping environmental) that increase risk of relapse, (b)  self-​ skills training (Monti, Rohsenow, Michalec, Martin, & efficacy about staying abstinent (both in general and in Abrams, 1997; Rohsenow et al., 2000). However, there is specific high-​risk situations), and (c) types of coping skills insufficient psychometric information to allow this instruavailable to use and/​or actually used when in high-​risk ment to be rated in the table. situations or in general to prevent relapse. If initiation of abstinence in treatment seekers who are not abstinent is the goal, these same domains are important to target. Assessing Self-​Efficacy The use of other substances is another source of relapse Self-​efficacy for ability to resist using in high-​risk situations risk, but methods of monitoring these are covered in can be useful at any stage of treatment for identifying situother sections of this chapter. ations in which a patient expects to have the most trouble. These can be assessed with several measures. First, the Drug-​Taking Confidence Questionnaire (DTCQ; Sklar, Assessing High-​Risk Situations Annis, & Turner, 1997)  is a 50-​item measure that uses The Inventory of Drug Taking Situations (IDTS; Annis & the same list of situations as in the IDTS to assess self-​ Martin, 1985; Turner, Annis, & Sklar, 1997) assesses high-​ efficacy. It requires respondents to rate how confident risk situations for relapse based on common domains of they are that they would be able to resist the urge to use relapse risk situations. The categories were derived from drugs in that situation. Thus, the IDTS is behavioral but analyses of alcohol-​dependent patients’ relapse risk situa- past-​oriented, whereas the DTCQ is more subjective but tions and therefore omit some triggers relevant to people future-​ oriented. This measure also was developed on with drug dependence (e.g., the presence of money or people with a range of types of SUDs. The confirmatory ATM cards [Rohsenow et  al., 2000; Rohsenow, Monti, factor analysis supported essentially the same three high-​ et al., 2004]), but the measure was normed on 364 drug-​ order factors as the IDTS: positive situations, negative sitdependent patients with primary cocaine (n = 159), can- uations, and temptation situations. An 8-​item short form nabis (n = 98), or alcohol use disorders (n = 76). Factor also has been found to have generally good psychometric structure and reliability estimates have been shown to properties (Sklar & Turner, 1999). be good, but there is no simple way to validate items on Second, the Alcohol Abstinence Self-​Efficacy Scale actual risk situations. 
The 50 self-​report items fall into fac- (AASE; DiClemente, Carbonari, Rosario, Montgomery, tors of unpleasant emotions, pleasant emotions, physical & Hughes, 1994) has patients rate 20 situations on 5-​point discomfort, testing personal control, urges/​temptations to scales for how confident they are that they would not drink use, conflict with others, social pressure to use, and pleas- in each situation and again for how tempted they are to ant times with others. These factors can be grouped into drink. The categories of high-​risk situations included are three second-​order factors (with good psychometric model (a)  negative affect, (b)  social interactions and positive fit): negative situations, positive situations, and urges and states, (c) physical and other concerns, and (d) withdrawal testing personal control. Although the reliability (internal and urges. The total score had high internal reliability and consistency) estimate was poor for the physical discom- good validity. A brief 12-​item version (McKiernan et al., fort scale, all other scales have demonstrated acceptable 2011)  has two factors (temptation and confidence) with to good reliability. For each situation described, patients high internal consistency and concurrent validity. This report how often they have used drugs in that situation in would be less useful for treatment planning because only the past. The information can be used to design person- 6 situations are involved. It also has been adapted for use alized relapse prevention training by emphasizing skills with drug abusers as the Drug Abstinence Self-​Efficacy needed for handling the situations a person has actually Scale (DASE; Hiller, Broome, Knight, & Simpson, 2000), most often associated with drug use. resulting in the same four subscales. However, because For identifying highly idiosyncratic relapse risk situa- information on reliability and validity was not found, this tions, the Drinking Triggers Inventory (DTI; Monti et al., measure was not rated in the table.

Substance Use Disorders

Third, in the Cocaine Related Assessment of Coping Skills (Rohsenow, Monti, et al., 2004), cocaine-​dependent patients rated how confident they would be to refrain from substance use in each of 11 high-​risk situations. The score demonstrated high internal consistency and concurrent validity, and it predicted quantity and frequency of drug use 3 months after treatment (Dolan, Martin, & Rohsenow, 2008). Fourth, a simple 4-​point rating of confidence that the person would not use drugs again during a specific period of time predicts treatment outcome for opiate addicts (Gossop, Green, Phillips, & Bradley, 1990). However, there is insufficient information on this measure to rate it in the table, and the broader situation-​specific measures are preferable because they can be used to individualize relapse prevention and/​or coping skills training by focusing on the types of situations in which the patient would be most tempted to use or least confident about abstaining from use.

371

designed to maintain abstinence (the General Change Strategies Questionnaire [GSC]). The measures were developed, found to each consist of a single factor with excellent internal consistency, and validated first with alcohol-​dependent patients in treatment, with the summary scores for each differentiating between coping skills treatment versus control treatment and correlating with treatment outcome 3 to 6  months later (Monti et  al., 2001). In analysis of the value of individual strategies, 13 of the urge-​specific strategies and 18 general lifestyle change strategies correlated with successful treatment outcome 3 to 6  months after treatment, whereas other common strategies did not (Dolan, Rohsenow, Martin, & Monti, 2013), thus indicating the most important coping skills to focus treatment on. The measure was then adapted for use with cocaine-​dependent patients in treatment with 21 strategies in each measure, with each forming scales that demonstrated substantial reliability and validity estimates (Rohsenow et  al., 2005). These were used to determine the specific skills that were correlated with less cocaine use at 3 and 6  months post-​ Assessing Coping Skills treatment, with results indicating that 13 of the USS Only a few studies investigating coping to predict out- strategies and 12 of the GSC strategies were effective in come for opiate abusers used measures with substantial this regard (Rohsenow et al., 2005). Thus, the measures evidence of reliability and validity. In one such study, were found to be heuristic across two types of substance 10 items were selected from the psychometrically sound use disorders. The open-​ended section can be used to Processes of Change Questionnaire (POC; Prochaska, elicit patients’ free recall of all the strategies they plan Velicer, DiClemente, & Fava, 1988). Among opiate-​ to use, and the frequency ratings are used to assess how dependent individuals, abstinence was related to an often they say they have used each strategy. By identifyincrease in the 10 processes of change assessed (POC-​ ing the skills the patients already know or use, gaps in 10; Gossop, Stewart, Browne, & Marsden, 2002). These knowledge or use of effective skills can be targeted for items were categorized into Avoidance (“remove things treatment. from my home that remind me of drugs,” “stay away from people who remind me of drugs,” and “stay with people Overall Evaluation who remind me not to use”), Cognitive (“I tell myself I  can choose not to use drugs,” “I can keep from using There are a variety of clinically useful instruments that if I try hard enough,” “I am able to avoid using if I want can be used in treatment planning. There is a choice of to,” and “I must not use to be content with myself”), and scientifically sound measures that provide an evaluation Distraction (“physical activity,” “do something to help me of the patient’s ability to function across major life areas. relax,” and “think about something else when tempted to Whether or not problems in some of these areas result use”). Scores for these three categories had adequate to from drug use, these areas may need to be addressed in good internal consistency reliability in this study, and all treatment so as to maximize the individual’s structural three types of coping were significantly greater in abstain- and functional support for abstinence, motivation to ers, suggesting that only these 10 items are needed for use stay clean and sober, and quality of life. 
Drug-​specific consequences an individual experienced can be parwith opiate-​dependent patients. Because existing measures tapped only a limited ticularly useful in sustaining or increasing the person’s number of the specific skills taught in many treatment motivation to become or stay abstinent from drugs by programs, we developed measures of coping skills to be highlighting what he or she has to gain from abstinence, used in high-​risk situations (the Urge-​Specific Strategies whereas expected acute effects can be used in functional Questionnaire [USS]) and of lifestyle change skills analyses to focus on alternative ways to achieve many of

372

Substance-Related and Gambling Disorders

the desired outcomes (e.g., negative affect reduction or social facilitation). The measures of situations in which the patient would be more tempted to use or have less confidence about staying abstinent can be used to target relapse-​prevention treatment toward helping the patient learn to better avoid or cope with unavoidable high-​risk situations without using. Measures of both urge-​specific and general lifestyle coping have been developed to assess a range of coping skills that have been shown to be related to reduced alcohol or cocaine use after treatment and can be used to identify gaps in individual patients’ needed skills. Good measures of social support for abstinence may not be needed because such support is easy to evaluate informally in a way that predicts outcome (e.g., better treatment outcome for cocaine-​dependent patients was predicted by number of people in one’s network regardless of their support for treatment and by replacing substance-​ involved with substance-​free daily contacts in one’s network [Zywiak et al., 2009]). The measures selected for inclusion in this section are all ones that could be good clinical tools, although some require considerably more training and time than others, and time is often of short supply in many treatment contexts. Some of the measures in Table 17.2 with good psychometric properties are not highly recommended simply due to the amount of time and training required for administration and the complexity of the scoring (i.e., SDSS and ASI-​6). Other measures were not highly recommended because they were specific to only one substance (e.g., CNCC-​87 and CEQ). The ones rated as highly recommended are the ones with good psychometric qualities

TABLE 17.3  

and seeming utility for treatment planning that are also relatively easy to administer.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

There are several assessment measures and strategies that can be used to track the effects of treatment on substance use and problem severity. In addition to the IDUC, SIP-​ AD, and MPS described previously, the main options are indices of symptom severity and toxicology analyses. Details of the psychometric properties of these measures are presented in Table 17.3. Assessing Areas of Life Function A briefer (118-​ item) form of the ASI-​ 6 (ASI-​ 30  day; described previously) that includes only the questions with a 30-​day time frame is commonly used for tracking progress. However, there are few psychometric data on the value of using the ASI to predict outcome or track progress. Although this measure can be used to track changes in functioning on a monthly basis, it is unclear the extent to which changes in ASI-​6 scores correlate with changes in drug use during the same time period. Changes in life functioning could be somewhat independent from changes in substance use, depending on the extent to which these are direct targets of treatment. However, increase in future crime at 2 years was predicted by change in alcohol use from 0 to 6 months and not by

Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument

Norms

Internal Inter-​Rater Consistency Reliability

Test–​Retest Reliability

Content Construct Validity Treatment Validity Validity Generalization Sensitivity

Clinical Utility

ASI-​6 30-​day

NA

G

G

G

A

A

G

NR

NR

TLFB

NA

NA

G

G

NA

E

E

E

A

IDUC SIP-​AD

NA NA

E E

NA NA

G G

G G

A A

A A

A A

A A

MPS

NA

NR

NA

NR

NR

A

NR

A

NR

GAIN 90 Day M Urine screens Urinalysis

NA NR G

G NA NA

NA NA NA

NA NA NA

G NA NA

G A G

A E E

A L E

A Aa Ea

Highly Recommended



✓ ✓

Utility is excellent during the time a program requires 3 to 7 days/​week of attendance, but with high cost.

a

Note: ASI = Addiction Severity Index 30-​day form; TLFB = Timeline Followback Interview; IDUC = Inventory of Drug Use Consequences, four scales; SIP-​AD  =  Short Inventory of Problems–​Alcohol and Drugs; MPS  =  Marijuana Problems Scale; GAIN 90 Day M  =  Global Appraisal of Individual Needs–​90-​Day Monitoring version; urine screens  =  drug screening with on-​site test kits; urinalyses  =  urine drug toxicology analyses using standard commercial laboratory methods such as EMIT or gas chromatography; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

Substance Use Disorders

the legal, drug, or other ASI Version 5 scores (Alterman et al., 1998), contrary to what one would expect. The GAIN Monitoring 90-​ Day version (Dennis, Scott, Godley, & Funk, 1999; (http://​www.chestnut. org/​li/​gain) is designed to evaluate change over time in living arrangements, substance use (frequency, situational antecedents, withdrawal, and problematic consequences), treatment (use, satisfaction, and medications), physical health, risk behaviors, emotional health, legal system events, vocation, and finances. The full measure takes 60 minutes and core questions take 25 minutes, with a 10-​minute Quick Monitoring version available. The measure has excellent statistics on change over time in the most relevant areas across a variety of types of substance treatment settings. Assessing Drug and Alcohol Use Frequency Although Timeline Followback (TLFB; Ehrman & Robbins, 1994; Sobell & Sobell, 1980), a method of asking about daily drug or alcohol use, is used primarily in research, when retrospective self-​report of days of use is desired, this method has been found to be the least subject to memory problems. The TLFB is a calendar-​assisted structured interview that provides a way to cue memory so that recall is more accurate. For the period of time of interest, the person is asked to fill in all days with special events such as holidays, birthdays, and days in jail or hospital. The person is then asked about alcohol/​drug use on those days and the days immediately before and after those days, with other days gradually filled in from there. Although social drinkers cannot easily do this, people with alcohol or drug use disorders are better at remembering this information. The TLFB has been found to yield good to excellent reliability and validity estimates (Ehrman & Robbins, 1994; Sobell et  al., 1996)  when the previous caveats about self-​report measures of substance use (see the section titled Increasing Honest Reporting) are taken into account. This method has been found to be sensitive to SUD treatment effects across a great many studies (e.g., McKay et al., 1997; Rohsenow, Monti, et al., 2004).

373

control). These scales were sensitive to changes in drug use behavior over 3  months so that a 40% decrease in drug use was paralleled by a 33% decrease in drug-​related consequences (Tonigan & Miller, 2002). The short form reviewed previously, the SIP-​AD (Blanchard et al., 2003), is sensitive to treatment change, decreasing from pre-​to post-​treatment, and with post-​treatment SIP-​AD scores correlating as expected with post-​treatment number of substance use days (Blanchard et  al., 2003). Both measures are rated in Table 17.3. Because of the demonstrated sensitivity of these measures to change combined with ease of administration, they are highly recommended. The MPS (Stephens et al., 2000), in the 19-​item 90-​day version, may be used to track change in marijuana-​related problems. The MPS was sensitive to change in problems during a 4-​month period among marijuana-​dependent patients in active treatment versus delayed-​ treatment condition that paralleled changes reported for frequency of marijuana use and number of dependence symptoms (Stephens et  al., 2000). The 26-​item checklist version showed no effects of treatment in one study (Budney et al., 2000) but showed a significant decrease from before to after treatment independent of type of treatment in two other studies (Budney, Moore, Rocha & Higgins, 2006; Stephens, Roffman, & Simpson, 1994). Although change over time paralleled change in frequency of use, no attempt was made to validate the measure in terms of change in other measures of problems from cannabis use. Therefore, the 19-​item measure may provide a basis for seeing reduction over time in problems as a function of treatment, but replication in a second study and information on correlations of change in this measure to change in other indicators of problems are needed before the actual value of this measure is known. The limited available psychometric information prevents a high recommendation from being made for this measure. Urine and Hair Toxicology Analyses

Urine toxicology drug analyses for substances of abuse other than alcohol are the gold standard for monitoring patients, but they require that patients still be enrolled in a program that provides them with some reason to Assessing Consequences of Drug or Alcohol Use come in for such testing 3 to 7  days per week. Urine The IDUC (Tonigan & Miller, 2002), a 50-​item self-​ screens and toxicology analyses test for the presence of report measure of the consequences of drug or alcohol the drugs themselves and/​or of the metabolites of the use, has a version asking about the past 3 months that can drugs (metabolites permit longer detection). The drugs be used for tracking progress using the four scales with most commonly screened for include benzodiazepines, evidence of substantial reliability (physical problems, cocaine, opiates, amphetamines, phencyclidine, and social relationships, interpersonal problems, and impulse cannabinoids. Commercial laboratories usually provide

374

Substance-Related and Gambling Disorders

a standard panel of substances to be analyzed and the option of testing for other drugs upon request. The assay methodologies used in most laboratory testing methods (e.g., enzyme-​ multiplied immunoassay technique [EMIT] or gas chromatography–​mass spectrometry [GC-​ MS]) yield data that are highly reliable and valid. On-​ site screening tests (strips or cups with detection strips built in) are far less expensive and agree 97% of the time with GC-​MS results. They do, however, have increased false positives because they are designed to be highly sensitive, so positive readings generally need to be confirmed with a laboratory test. A comparison of laboratory-​ analyzed urine toxicology data and self-​reports of days of use 12 months after treatment entry for 337 patients with SUDs found that neither urine tests nor self-​reports were without their problems as a method of detection (Lennox et al., 2006). Higher validity was seen, in general, for self-​reported recency of use of cocaine, opioid, and marijuana use (Lennox et al., 2006), indicating that it is of value also simply to ask patients how recently they used drugs when monitoring their use (when using the guidelines described under the section titled Increasing Honest Reporting). There are problems that can be encountered with urine drug testing. One such problem pertains to the window of detection. For example, although methadone programs routinely require daily testing, most drugs of abuse can be detected with certainty over a 2-​or 3-​day window even with qualitative methods of detection (just a positive or negative answer, as opposed to quantitative methods that give the amount detected). However, because most drugs can stay in the tissues for approximately 7 days after abstinence begins, and marijuana can be detectable (50 ng/​L) for 2 weeks after heavy use (Hawks & Chiang, 1986), readings may be positive for some time after abstinence begins. Therefore, programs often allow an initial washout period for the urine to become clean before imposing any consequences or before contingency management programs start voucher reinforcement based on abstinent readings (e.g., Budney et al., 2006). A second problem is the potential for false-​positive test results. The methodologies involved in most laboratory tests greatly decrease the chance of false positives, yet a person can still have reason to claim that a test showed a false positive for opiates if, for example, he or she had eaten a large amount of poppy seeds. When not used for legal purposes, it may be enough to require that patients avoid all non-​illicit sources of positive readings. A third problem is related to the introduction of contaminants by patients. Patients who expect unpleasant consequences from positive readings may go

to great lengths to “beat” the test. This can include bringing a hidden sample of urine from a clean person, adding contaminants (e.g., soap, vinegar, lemon juice, salt, and bleach) to invalidate the test, or drinking large quantities of water before giving a sample to make the sample too dilute for a valid test. Other evasion methods have been developed, including an artificial penis or hidden plastic tubing and an IV bag with heating strips. Some of these can be overcome by requiring carefully monitored testing and requiring some hours at the site without drinking before obtaining the sample. Testing hair for the presence of drugs of abuse has raised some interest because hair will contain residue of drugs over the length of the hair, thus providing a detection window of months or years, depending on the length of the hair. Drugs enter the hair at the follicle level via blood, sebum (from glands in the scalp), and sweat (Huestis, 2001). However, several problems limit the adoption of this method more widely to date, including two serious ones: hair color bias and environmental factors. First, drugs are more strongly detectable in darker hair than in lighter hair (Joseph, Tsai, Tsao, Su, & Cone, 1997), leading to more false negatives among blond or white-​haired people than among people with brown or black hair. In addition, because the higher lipid content in curlier hair (Cruz et  al., 2013)  may affect absorption of lipid-​soluble drugs, there is a serious concern that this would lead to racial or ethnic differences in accuracy of detection. Second, drugs also can be absorbed into the hair via environmental exposure, especially smoke, and repeated shampoo treatments or solvent washes do not completely remove environmental cocaine from the hair (e.g., Wang, Darwin, & Cone, 1994). Therefore, someone can test positive despite remaining abstinent. A third problem is that there are few places where hair testing for drugs is available. A fourth is that hair testing is less sensitive to detecting marijuana than is urine toxicology analysis, and there is great individual variability in the sweat that affects hair testing (Baron, Baron, & Baron, 2005). Therefore, hair analysis has more pitfalls than advantages at the present time. Given that urine detection is highly reliable and fully adequate for within-​treatment monitoring, it remains the preferred biological method. Overall Evaluation For monitoring of progress in terms of drug use, urine drug analyses at least three times per week remain the gold standard. Although urine drug screens are poor at detecting alcohol use, due to the rapid metabolism

Substance Use Disorders

of alcohol, these are excellent for monitoring all other drugs of abuse when the precautions described previously are handled. For monitoring of monthly change in problems resulting from drug or alcohol use or function in terms of family, legal, and psychological problems, the GAIN-​90 Day M is developed for this purpose and is scientifically sound. Both methods can identify when the person might be using substances and, therefore, be in need of some additional booster counseling. The IDUC is effective in showing change in problems that result from drug use over time and so is recommended for this purpose.

CONCLUSIONS AND FUTURE DIRECTIONS

The clinician treating patients with SUDs has a number of tools available for screening, diagnosis, assessment of problem severity, assessment of risk factors for relapse to address in treatment, and monitoring of treatment progress and outcomes. Some of the tools (particularly the diagnostic and some problem severity tools) are time-consuming and require extensive training; these may be more difficult to adopt in clinic practices that are short on time. Other tools are quicker and/or easier to administer and score for rapid use in treatment planning or monitoring of progress. It is hoped that future work will focus more on developing instruments that clinicians can use easily and with minimal time to provide useful guidance for treatment planning and evaluation.

Future development work with assessment instruments needs to include more of a focus on determining validity and treatment sensitivity, in particular. For example, although the ASI is widely used for monitoring treatment progress, there is very little information to demonstrate that it validly tracks changes in other indicators of progress. Given such a widely used face-valid instrument, such information would be valuable. Future work should also focus on making instruments that do not require extensive training, long administration time, and complex scoring procedures. Not only do these factors drive up costs and often prevent use due to busy schedules, but when extensive training is needed, it is too easy for assessors' abilities to drift over time unless regular retraining or testing of their abilities is conducted.

In the future, a number of instruments will probably be available not only via computer but also via Web-based applications, allowing interactive responses with patients, computerized scoring, and access to expert help at the touch of a mouse.

References

Acosta, M. C., Haller, D. L., & Schnoll, S. H. (2005). Cocaine and stimulants. In R. J. Frances, S. I. Miller, & A. H. Mack (Eds.), Clinical textbook of addictive disorders (3rd ed., pp. 184–218). New York, NY: Guilford.
Alterman, A. I., McDermott, P. A., Cook, T. G., Metzger, D., Rutherford, M. J., Cacciola, J. S., & Brown, L. S. (1998). New scales to assess change in the Addiction Severity Index for the opioid, cocaine, and alcohol dependent. Psychology of Addictive Behaviors, 12, 233–246.
American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Press.
Anglin, M. D., Hser, Y. I., Booth, M., Speckart, G. R., McCarthy, W. J., Ryan, T., & Powers, K. (1988). The natural history of narcotic addiction: A 25-year follow-up (NIDA Grant R01 DA 03425). Los Angeles, CA: University of California Los Angeles, Drug Abuse Research Center.
Annis, H. M., & Martin, G. (1985). Inventory of Drug-Taking Situations. Toronto, Ontario, Canada: Addiction Research Foundation.
Anthenelli, R. M., & Schuckit, M. A. (1992). Genetics. In J. H. Lowinson, P. Ruiz, & R. B. Millman (Eds.), Substance abuse: A comprehensive textbook (2nd ed., pp. 56–69). Baltimore, MD: Williams & Wilkins.
Baron, D. A., Baron, D. A., & Baron, S. H. (2005). Laboratory testing for substances of abuse. In R. J. Frances, S. I. Miller, & A. H. Mack (Eds.), Clinical textbook of addictive disorders (3rd ed., pp. 63–71). New York, NY: Guilford.
Blanchard, K. A., Morgenstern, J., Morgan, T. J., Labouvie, E. W., & Bux, D. A. (2003). Assessing consequences of substance use: Psychometric properties of the Inventory of Drug Use Consequences. Psychology of Addictive Behaviors, 17, 328–331.
Brown, S., Gleghorn, A., Schuckit, M., Myers, M., & Mott, M. (1996). Conduct disorder among adolescent alcohol and drug abusers. Journal of Studies on Alcohol, 57, 314–324.
Budney, A. J., Higgins, S. T., Radonovich, K. J., & Novy, P. L. (2000). Adding voucher-based incentives to coping skills and motivational enhancement improves outcomes during treatment for marijuana dependence. Journal of Consulting and Clinical Psychology, 68, 1051–1061.


Budney, A. J., Hughes, J. R., Moore, B. A., & Novy, P. L. (2001). Marijuana abstinence effects in marijuana smokers maintained in their home environment. Archives of General Psychiatry, 58, 917–​924. Budney, A. J., Moore, B. A., Rocha, H. L., & Higgins, S. T. (2006). Clinical trial of abstinence-​based vouchers and cognitive–​ behavioral therapy for cannabis dependence. Journal of Consulting and Clinical Psychology, 74, 307–​316. Budney, A. J., Moore, B. A., Vandrey, R. G., & Hughes, J. R. (2003). The time course and significance of cannabis withdrawal. Journal of Abnormal Psychology, 112, 393–​402. Busch, A. B., Weiss, R. D., & Najavits, L. M. (2005). Co-​ occurring substance use disorders and other psychiatric disorders. In R. J. Frances, S. I. Miller, & A. H. Mack (Eds.), Clinical textbook of addictive disorders (3rd ed., pp. 271–​302). New York, NY: Guilford. Cacciola, J. S., Alterman, A. I., Habing, B., & McLellan, A. T. (2011). Recent status scores for Version 6 of the Addiction Severity Index (ASI-​6). Addiction, 106, 1588–​1602. Cruz, C. F., Fernandes, M. M., Gomes, A. C., Coderch., L., Marti, M., Méndez, S.,  .  .  .  Cavaco-​Paulo, A. (2013). Keratins and lipids in ethnic hair. International Journal of Cosmetic Science, 35, 244–​249. Darke, S., & Ross, J. (1997). Polydrug dependence and psychiatric comorbidity among heroin injectors. Drug and Alcohol Dependence, 48, 135–​141. Del Boca, F. K., Darkes, J., & McRee, B. (2016). Self-​report assessments of psychoactive substance use and dependence. In K. J. Sher (Ed.), Oxford handbook of substance use and substance use disorders (Vol. 2, pp. 430–​465). New York, NY: Oxford University Press. Denis, C. M., Cacciola, J. S., & Alterman, A. I. (2013). Addiction Severity Index (ASI) summary scores: Comparison of the Recent Status Scores of the ASI-​6 and the Composite Scores of the ASI-​5. Journal of Substance Abuse Treatment, 45, 444–​450. Dennis, M. L. (1999). Global Appraisal of Individual Needs (GAIN): Administration guide for the GAIN and related measures (Version 1299). Bloomington, IL:  Chestnut Health Systems. Dennis, M. L., Scott, C. K., & Funk, R. (2003). An experimental evaluation of recovery management checkups (RMC) for people with chronic substance use disorders. Evaluation and Program Planning, 26, 339–​352. Dennis, M. L., Scott, C. K., Godley, M. D., & Funk, R. (1999). Comparison of adolescents and adults by ASAM profile using GAIN data from the Drug Outcome Monitoring Study (DOMS):  Preliminary data tables. Bloomington, IL: Chestnut Health Systems. Dennis, M. L., Titus, J. C., White, M., Unsicker, J., & Hodgkins, D. (2003). Global Appraisal of Individual

Needs (GAIN):  Administration Guide for the GAIN and Related Measures (Version 1299). Bloomington, IL:  Chestnut Health Systems. Retrieved January 2005 from http://​www.chestnut.org/​li/​gain DiClemente, C. C., Carbonari, J. P., Rosario, P. G., Montgomery, M. A., & Hughes, S. O. (1994). The Alcohol Abstinence Self-​Efficacy Scale. Journal of Studies on Alcohol, 55, 141–​148. Díaz Mesa, E. M., García-​Portilla, P., Sáiz, P. A., Bobes Bascarán, T., Casares, M. J., Fonseca, E.,  .  .  .  Bobes, J. (2010). Rendimiento psicométrico de la sexta versión del Addiction Severity Index en español (ASI-​6). Psicothèma, 22, 513–​519. Dick, D. M., & Agrawal, A. (2008). The genetics of alcohol and other drug dependence. Alcohol Health and Research, 31, 111–​118. Dilts, S. L., Jr., & Dilts, S. L. (2005). Opioids. In R. J. Frances, S. I. Miller, & A. H. Mack (Eds.), Clinical textbook of addictive disorders (3rd ed., pp. 138–​156). New  York, NY: Guilford. Dolan, S. L., Martin, R. A., & Rohsenow, D. J. (2008). Self-​ efficacy for cocaine abstinence: Pretreatment correlates and relationship to outcomes. Addictive Behaviors, 33, 675–​688. Dolan, S. L., Rohsenow, D. J., Martin, R. A., & Monti, P. M. (2013). Urge-​ specific and lifestyle coping strategies of alcoholics: Relationships of specific strategies to treatment outcome. Drug and Alcohol Dependence, 128, 8–​14. Donovan, D. M., & Marlatt, G. A. (Eds.). (2005). Assessment of addictive behaviors (2nd ed.). New York, NY: Guilford. Ehrman, R. N., & Robbins, S. J. (1994). Reliability and validity of 6-​month timeline reports of cocaine and heroin use in a methadone population. Journal of Consulting and Clinical Psychology, 62, 843–​850. Farrell, A., Danish, S., & Howard, C. (1992). Relationship between drug use and other problem behaviors in urban adolescents. Journal of Consulting and Clinical Psychology, 60, 705–​712. Ferri, C. P., Marsden, J., de Araujo, M., Laranjeira, R. R., & Gossop, M. (2000). Validity and reliability of the Severity of Dependence Scale (SDS) in a Brazilian sample of drug users. Drug and Alcohol Review, 19, 451–​455. Finney, J. W., Moos, R. H., & Timko, C. (2013). The course of treated and untreated substance use disorders. In B. S. McCrady & E. E. Epstein (Eds.), Addictions: A comprehensive guidebook (2nd ed., pp. 108–​131). New York, NY: Oxford University Press. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015). Structured Clinical Interview for the DSM-​5 Disorders, Clinician Version (SCID-​5-​CV). Arlington, VA: American Psychiatric Association. Gavin, D. R., Ross, H. E., & Skinner, H. A. (1989). Diagnostic validity of the Drug Abuse Screening Test


in the assessment of DSM-​III drug disorders. British Journal of Addiction, 84, 301–​307. Goldstein, R. B., Chou, S. P., Smith, S. M., Jung, J., Zhang, H., Tulshi, D. S.,  .  .  .  Grant, B. F. (2015). Nosologic comparisons of DSM-​IV and SDM-​5 alcohol and drug use disorders: Results from the National Epidemiologic Survey on Alcohol and Related Conditions. Journal of Studies on Alcohol and Drugs, 76, 378–​388. Gossop, M., Best, D., Marsden, J., & Strang, J. (1997). Letter to the Editor:  Test–​retest reliability of the Severity of Dependence Scale. Addiction, 92, 353–​354. Gossop, M., Darke, S., Griffiths, P., Hando, J., Powis, B., Hall, W., & Strang, J. (1995). The Severity of Dependence Scale (SDS):  Psychometric properties of the SDS in English and Australian samples of heroin, cocaine and amphetamine users. Addiction, 90, 607–​614. Gossop, M., Green, L., Phillips, G., & Bradley, B. (1990). Factors predicting outcome among opiate addicts after treatment. British Journal of Clinical Psychology, 29, 209–​216. Gossop, M., Stewart, D., Browne, N., & Marsden, J. (2002). Factors associated with abstinence, lapse, or relapse to heroin use after residential treatment: Protective effect of coping responses. Addiction, 97, 1259–​1267. Handelsman, L., Cochrane, K. J., Aronson, M. J., Ness, R., Rubinstein, K. J., & Kanof, P. D. (1987). Two new rating scales for opiate withdrawal. American Journal of Drug and Alcohol Abuse, 13, 293–​308. Hawks, R. L., & Chiang, C. N. (1986). Examples of specific drugs. In R. L. Hawks & C. N. Chiang (Eds.), Urine testing for drugs of abuse (NIDA Research Monograph No. 73, pp. 84–​112). Washington, DC:  U.S. Government Printing Office. Hergueta, T., & Weiller, E. (2013). Evaluating depressive symptoms in hypomanic and manic episodes using a structured diagnostic tool:  Validation of a new Mini International Neuropsychiatric Interview (M.I.N.I.) module for the DSM-​5  “With Mixed Features” specifier. International Journal of Bipolar Disorder, 1, 21. Hiller, M. L., Broome, K. M., Knight, K., & Simpson, D. D. (2000). Measuring self-​efficacy among drug-​involved probationers. Psychological Reports, 86, 529–​538. Hubbard, R. L., Craddock, S. G., & Anderson, J. (2003). Overview of 5-​ year follow-​ up outcomes in the drug abuse treatment outcome studies. Journal of Substance Abuse Treatment, 25, 125–​134. Huestis, M. A. (2001, February 28). Monitoring drug exposure with alternative matrices. Presentation to the NIDA-​E Treatment Research Review Committee, Bethesda, MD. Jaffe, J. H. (1989). Psychoactive substance abuse disorders. In H. I. Kaplan & B. J. Sadock (Eds.), Comprehensive textbook of psychiatry (5th ed., pp. 642–​698). Baltimore, MD: Williams & Wilkins.


Johnson, B. D., & Muffler, J. (1992). Sociocultural aspects of drug use and abuse in the 1990s. In J. H. Lowinson, P. Ruiz, & R. B. Millman (Eds.), Substance abuse:  A comprehensive textbook (2nd ed., pp. 56–​69). Baltimore, MD: Williams & Wilkins. Joseph, R. E., Jr., Tsai, W. J., Tsao, L. I., Su, T. P., & Cone, E. J. (1997). In vitro characterization of cocaine binding site in human hair. Journal of Pharmacology and Experimental Therapeutics, 282, 1228–​1241. Kessler, F., Cacciola, J., Alterman, A., Faller, S., Souza-​ Formigoni, M. L., Cruz, M. S.,  .  .  .  Pechansky, F. (2012). Psychometric properties of the sixth version of the Addiction Severity Index (ASI-​6) in Brazil. Revista Brasileira de Psiquiatria, 34(1), 24–​33. https://​dx.doi. org/​10.1590/​S1516-​44462012000100006 Kouimtsidis, C., Stahl, D., West, R., & Drummond, C. (2014). Can outcome expectancies be measured across substances? Development and validation of a questionnaire for populations in treatment. Drugs and Alcohol Today, 14, 172–​186. Lennox, R., Dennis, M. L., Scott, C. S., & Funk, R. (2006). Combining psychometric and biometric measures of substance use. Drug and Alcohol Dependence, 83, 95–​103. McKay, J. R., Alterman, A. I., Caciola, J. S., Rutherford, M. J., O’Brien, C. P., & Koppenhaver, J. (1997). Group counseling versus individualized relapse prevention aftercare following intensive outpatient treatment for cocaine dependence:  Initial results. Journal of Consulting and Clinical Psychology, 65, 778–​788. McKiernan, P., Cloud, R., Patterson, D. A., Wolf (Adelv Unegv Waya), S., Golder, S., & Besel, K. (2011). Development of a brief abstinence self-​efficacy measure. Journal of Social Work Practice in the Addictions, 11, 245–​253. McLellan, A. T., Kushner, H., Metzger, D., Peters, R., Smith, I., Grissom, G.,  .  .  .  Argeriou, M. (1992). The fifth edition of the Addiction Severity Index. Journal of Substance Abuse Treatment, 9, 199–​213. McLellan, A. T., Luborsky, L., Woody, G. E., & O’Brien, C. P. (1980). An improved diagnostic instrument for substance abuse patients:  The Addiction Severity Index. Journal of Nervous and Mental Disease, 168, 26–​33. Michalec, E. M., Rohsenow, D. J., Monti, P. M., Varney, S. M., Martin, R. A., Dey, A. N.,  .  .  .  Sirota, A. D. (1996). A cocaine negative consequences checklist: Development and validation. Journal of Substance Abuse, 8, 181–​193. Miele, G. M., Carpenter, K. M., Cockerham, M. S., Trautman, K. D., Blaine, J., & Hasin, D. S. (2000a). Substance Dependence Severity Scale (SDSS):  Reliability and validity of a clinician-​administered interview for DSM-​IV substance use disorder. Drug and Alcohol Dependence, 59, 63–​75.


Miele, G. M., Carpenter, K. M., Cockerham, M. S., Trautman, K. D., Blaine, J., & Hasin, D. S. (2000b). Concurrent and predictive validity of the Substance Dependence Severity Scale (SDSS). Drug and Alcohol Dependence, 59, 77–​88. Miele, G. M., Carpenter, K. M., Cockerham, M. S., Trautman, K. D., Blaine, J., & Hasin, D. S. (2001). Substance Dependence Severity Scale reliability and validity for ICD-​10 substance use disorders. Addictive Behaviors, 26, 601–​612. Miller, W. R., & Rollnick, S. (1991). Motivational interviewing: Preparing people to change addictive behavior. New York, NY: Guilford. Miller, W. R., & Rollnick, S. (2002). Motivational interviewing:  Preparing people for change (2nd ed.). New  York, NY: Guilford. Monti, P. M., Kadden, R., Rohsenow, D. J., Cooney, N., & Abrams, D. B. (2002). Treating alcohol dependence:  A coping skills training guide (2nd ed.). New  York, NY: Guilford. Monti, P. M., Rohsenow, D. J., Michalec, E., Martin, R. A., & Abrams, D. B. (1997). Brief coping skills treatment for cocaine abuse:  Substance use outcomes at 3 months. Addiction, 92, 1717–​1728. Monti, P. M., Rohsenow, D. J., Swift, R. M., Gulliver, S. B., Colby, S. M., Mueller, T. I., . . . Asher, M. K. (2001). Naltrexone and cue exposure with coping and communication skills training for alcoholics:  Treatment process and one-​year outcomes. Alcoholism:  Clinical and Experimental Research, 25, 1634–​1647. Moore, B. A., & Budney. A. J. (2002). Abstinence at intake for marijuana dependence treatment predicts response. Drug and Alcohol Dependence, 67, 249–​257. National Institute on Alcohol Abuse and Alcoholism. (2006). Alcohol use and alcohol use disorders in the United States:  Main findings from the 2001–​ 2002 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) [Data tables]. In US alcohol epidemiologic data reference manual. Bethesda, MD:  National Institutes of Health. https://​ pubs.niaaa.nih.gov/ ​ p ublications/ ​ N ESARC_ ​ D RM/​ NESARCDRM.pdf Pinninti, N. R., Madison, H., Musser, E., & Pissmiller, D. (2003). MINI International Neuropsychiatric Schedule:  Clinical utility and patient acceptance. European Psychiatry, 18, 361–​364. Prochaska, J. O., Velicer, W. F., DiClemente, C. C., & Fava, J. S. (1988). Measuring process of change: Applications to the cessation of smoking. Journal of Consulting and Clinical Psychology, 56, 520–​528. Ritsher, J. B., Moos, R. H., & Finney, J. W. (2002). Relationship of treatment orientation and continuing care to remission among substance abuse patients. Psychiatric Services, 53, 595–​601.

Robins, L. N., Cottler, L., Bucholz, K. K., Compton, W. M., North, C. S., & Rourke, K. M. (2000). Diagnostic Interview Schedule for the DSM-​IV (DIS-​IV). St. Louis, MO: Washington University School of Medicine. Robins, L. N., Helzer, J. E., Cottler, L., & Golding, E. (1989). National Institute of Mental Health Diagnostic Interview Schedule (3rd ed.). St. Louis, MO: Washington University Press. Robins, L. N., Wing, J., Wittchen, H. U., Helzer, J. E., Babor, F. F., Burke, J., . . . Towle, L. H. (1988). The Composite International Diagnostic Interview:  An epidemiologic instrument suitable for use in conjunction with different diagnostic systems and in different cultures. Archives of General Psychiatry, 45, 1069–​1077. Rohsenow, D. J., Martin, R. A., & Monti, P. M. (2005). Urge-​ specific and lifestyle coping strategies of cocaine abusers:  Relationships to treatment outcomes. Drug and Alcohol Dependence, 78, 211–​219. Rohsenow, D. J., Monti, P. M., Martin, R. A., Colby, S. M., Myers, M. G., Gulliver, S. B., . . . Abrams, D. B. (2004). Motivational enhancement and coping skills training for cocaine abusers: Effects on substance use outcomes. Addiction, 99, 862–​874. Rohsenow, D. J., Monti, P. M., Martin, R. A., Michalec, E., & Abrams, D. B. (2000). Brief coping skills treatment for cocaine abuse: 12-​month substance use outcomes. Journal of Consulting and Clinical Psychology, 68, 515–​520. Rohsenow, D. J., Monti, P. M., Rubonis, A. V., Gulliver, S. B., Colby, S. M., Binkoff, J. A., & Abrams, D. B. (2001). Cue exposure with coping skills training and communication skills training for alcohol dependence: Six and twelve month outcomes. Addiction, 96, 1161–​1174. Rohsenow, D. J., Sirota, A. D., Martin, R. A., & Monti, P. M. (2004). The Cocaine Effects Questionnaire for patient populations:  Development and psychometric properties. Addictive Behaviors, 29, 537–​553. Schuckit, M. A. (2012). Editorial in reply to the comments of Griffith Edwards. Journal of Studies on Alcohol and Drugs, 73, 521–​522. Schulenberg, J., Maggs, J. L., Steinman, K. J., & Zucker, R. A. (2001). Development matters:  Taking the long view on substance abuse etiology and intervention during adolescence. In P. M. Monti, S. M. Colby, & T. A. O’Leary (Eds.), Adolescent, alcohol and substance use (pp. 109–​141). New York, NY: Guilford. Shane, P., Jasiukaitis, P., & Green, R. S. (2003). Treatment outcomes among adolescents with substance abuse problems: The relationship between comorbidities and post-​treatment substance involvement. Evaluation and Program Planning, 26, 393–​402 Sheehan, D., Lecrubier, Y., Sheehan, K. H., Amorim, P., Janavs, J., Weiller, E., . . . Dunbar, G. C. (1998). The Mini-​ International Neuropsychiatric Interview (M.I.N.I.): The development and validation of a structured diagnostic


psychiatric interview for DSM-​IV and ICD-​10. Journal of Clinical Psychiatry, 59(Suppl. 20), 22–​33. Simons, J. S., Dvorak, R. D., Merrill, J. E., & Read, J. P. (2012). Dimensions and severity of marijuana consequences: Development and validation of the Marijuana Consequences Questionnaire (MACQ). Addictive Behaviors, 37, 613–​621. Skinner, H. A. (1982). The Drug Abuse Screening Test. Addictive Behaviors, 7, 363–​371. Sklar, S. M., Annis, H. M., & Turner, N. E. (1997). Development and validation of the Drug-​ Taking Confidence Questionnaire:  A measure of coping self-​ efficacy. Addictive Behaviors, 22, 655–​670. Sklar, S. M., & Turner, N. E. (1999). A brief measure for the assessment of coping self-​efficacy among alcohol and other drug users. Addiction, 94, 723–​729. Smith, G. T., & Anderson, K. G. (2001). Personality and learning factors combine to create risk for adolescent problem drinking. In P. M. Monti, S. M. Colby, & T. A. O’Leary (Eds.), Adolescents, alcohol and substance use (pp. 109–​141). New York, NY: Guilford. Sobell, L. C., Buchan, G., Cleland, P., Sobell, M. B., Fedoroff, I., & Leo, G. I. (1996, November). The reliability of the timeline followback (TLFB) method as applied to drug, cigarette and cannabis use. Paper presented at the 30th Meeting of the Association for Advancement of Behavior Therapy, New York, NY. Sobell, L. C., & Sobell, M. B. (1980). Convergent validity: An approach to increasing confidence in treatment outcome conclusions with alcohol and drug abusers. In L. C. Sobell, M. B. Sobell, & E. Ward (Eds.), Evaluating alcohol and drug abuse treatment effectiveness:  Recent advances (pp. 177–​183). Elmsford, NY: Pergamon. Sobell, L. C., & Sobell, M. B. (1986). Can we do without alcohol abusers’ self-​reports? The Behavior Therapist, 7, 141–​146. Staley, D., & El-​Guebaly, N. (1990). Psychometric properties of the Drug Abuse Screening Test in a psychiatric patient population. Addictive Behaviors, 15, 257–​264. Stephens, R. S., Roffman, R. A., & Curtin, L. (2000). Comparison of extended versus brief treatments for marijuana use. Journal of Consulting and Clinical Psychology, 68, 898–​908. Stephens, R. S., Roffman, R. A., & Simpson, E. E. (1994). Treating adult marijuana dependence:  A test of the relapse prevention model. Journal of Consulting and Clinical Psychology, 62, 92–​99. Stephens, R. S., Wertz, J. S., & Roffman, R. A. (1993). Predictors of marijuana treatment outcomes: The role of self-​efficacy. Journal of Substance Abuse, 5, 341–​353. Substance Abuse and Mental Health Services Administration, Office of Applied Studies. (2003). Treatment Episode Data Set (TEDS):  1992–​2001 [National Admissions to Substance Abuse Treatment Services, DASIS Series: S-​20,


DHHS Publication No. (SMA) 03-​ 3778]. Rockville, MD: U.S. Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. (2014). Results from the 2013 National Survey on Drug Use and Health:  Summary of National Findings [NSDUH Series H-​48, HHS Publication No. (SMA) 14-​4863]. Rockville, MD: Author. Substance Abuse and Mental Health Services Administration. (2015). Receipt of services for behavioral health problems: Results from the 2014 National Survey on Drug Use and Health [NSDUH Data Review: September 2015]. Retrieved from https://www.samhsa.gov/data/sites/ default/files/NSDUH-DR-FRR3-2014/NSDUH-DRFRR3-2014/NSDUH-DR-FRR3-2014.htm Tarter, R. (1990). Evaluation and treatment of adolescent substance abuse:  A decision tree method. American Journal of Drug and Alcohol Abuse, 16, 1–​46. Tarter, R., & Kirisci, L. (1997). The Drug Use Screening Inventory for adults:  Psychometric structure and discriminative sensitivity. American Journal of Drug and Alcohol Abuse, 23, 207–​219. Tonigan, J. S., & Miller, W. S. (2002). The Inventory of Drug Use Consequences (InDUC):  Test–​retest stability and sensitivity to detect change. Psychology of Addictive Behaviors, 16, 165–​168. Trull, T. I., Solhan, M. B., Brown, W. C., Tomko, R. L., Schaefer, L., McLaughlin, K. D., & Jahng, S. (2016). Substance use disorders and personality disorders. In K. J. Sher (Ed.), Oxford handbook of substance use and substance use disorders (Vol. 2, pp. 116–​148). New York, NY: Oxford University Press. Tully, E. C., & Iacono, W. G. (2016). An integrative common liabilities model for the comorbidity of substance use disorders with externalizing and internalizing disorders. In K. J. Sher (Ed.), Oxford handbook of substance use and substance use disorders (Vol. 2, pp. 187–​212). New York, NY: Oxford University Press. Turner, N. E., Annis, H. M., & Sklar, S. M. (1997). Measurement of antecedents to drug and alcohol use: Psychometric properties of the Inventory of Drug-​ Taking Situations (IDTS). Behaviour Research and Therapy, 35, 465–​483. U.S. Department of Health and Human Services. (1989). Reducing the health consequences of smoking: 25 years of progress. A  report of the Surgeon General [DHHS Publication No. (CDC) 89–​ 8411]. Rockville, MD: Author. Varney, S. M., Rohsenow, D. J., Dey, A. N., Myers, M. G., Zwick, W. R., & Monti, P. M. (1995). Factors associated with help seeking and perceived dependence among cocaine users. American Journal of Drug and Alcohol Abuse, 21, 81–​91. Walton, M. A., Blow, F. C., & Booth, B. M. (2000). A comparison of substance abuse patients’ and counselors’


perceptions of relapse risk: Relationship to actual relapse. Journal of Substance Abuse Treatment, 19, 161–169. Wang, W. L., Darwin, W. D., & Cone, E. J. (1994). Simultaneous assay of cocaine, heroin and metabolites in hair, plasma, saliva and urine by gas chromatography–mass spectrometry. Journal of Chromatography B: Biomedical Applications, 660, 279–290. Westphal, J., Wasserman, D. A., Masson, C. L., & Sorenson, J. L. (2005). Assessment of opioid use. In D. M. Donovan & G. A. Marlatt (Eds.), Assessment of addictive behaviors (2nd ed., pp. 215–247). New York, NY: Guilford. Wills, T. A., Vaccaro, D., & McNamara, G. (1994). Novelty seeking, risk taking, and related constructs

as predictors of adolescent substance use: An application of Cloninger’s theory. Journal of Substance Abuse, 6, 1–​20. World Health Organization. (1992). International classification of diseases (10th ed.). Geneva, Switzerland: Author. World Health Organization. (1997). Composite International Diagnostic Interview, Core Version, 2.1, 12 Month Version. Geneva, Switzerland: Author. Zywiak, W. H., Neighbors, C. J., Martin, R. A., Johnson, J. E., Eaton, C. A., & Rohsenow, D. J. (2009). The Important People Drug and Alcohol Interview:  Psychometric properties, predictive validity, and implications for treatment. Journal of Substance Abuse Treatment, 36, 321–​330.

18

Alcohol Use Disorder

Angela M. Haeny, Cassandra L. Boness, Yoanna E. McDowell, and Kenneth J. Sher

In this chapter, we consider various measures for assessing alcohol use disorder (AUD) for the purposes of diagnosis, case conceptualization and treatment planning, and treatment monitoring and evaluation. It is important to preface this chapter with information about the changes made to the AUD criteria in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association [APA], 2013). Unlike the DSM-IV-TR (APA, 2000), which defined two diagnoses under the rubric of AUDs (i.e., alcohol abuse and alcohol dependence), the DSM-5 defines a single AUD diagnosis. Key reasons for this change are that there was scant psychometric evidence that abuse and dependence were distinct constructs and that some criteria for abuse appeared to be more severe than some criteria for dependence, which contradicted the assumption that abuse was a less severe form of AUD (Hasin et al., 2013). In addition, one could receive a diagnosis of abuse based on a single symptom, resulting in a high prevalence of false-positive diagnoses (i.e., "diagnostic impostors"; Martin, Chung, & Langenbucher, 2008). The AUD criteria were largely unchanged from DSM-IV-TR to DSM-5, retaining 10 of the 11 criteria previously used to diagnose abuse or dependence; the legal problems criterion was removed and craving was added (for more details, see Hasin et al., 2013).

To meet DSM-5 criteria for an AUD diagnosis, an individual must endorse at least 2 of the 11 criteria within the same 12-month period, and the symptoms must cause impairment and/or distress. The DSM-5 also includes a severity continuum such that individuals who endorse 2 or 3 criteria meet the requirement for mild AUD, individuals who endorse 4 or 5 criteria meet the requirement for moderate AUD, and individuals who endorse 6 or more criteria meet the requirement for severe AUD. Although these changes influence measures used to diagnose and conceptualize AUD as a syndrome, they have had little influence on measures used for treatment planning, treatment monitoring, or treatment evaluation.
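For readers who want the severity bands in operational form, the following is a minimal sketch based only on the criterion counts described above; the function name and its simple count-based input are illustrative and not part of any published scoring tool.

```python
def classify_aud_severity(num_criteria_endorsed: int) -> str:
    """Map a DSM-5 AUD criterion count (0-11) to the severity bands
    described above: 2-3 mild, 4-5 moderate, 6 or more severe.

    Assumes the criteria were endorsed within the same 12-month period
    and that clinically significant impairment or distress is present;
    those judgments are clinical and are not modeled here.
    """
    if not 0 <= num_criteria_endorsed <= 11:
        raise ValueError("DSM-5 AUD has 11 criteria; the count must be 0-11.")
    if num_criteria_endorsed < 2:
        return "does not meet the criterion-count threshold for AUD"
    if num_criteria_endorsed <= 3:
        return "mild AUD"
    if num_criteria_endorsed <= 5:
        return "moderate AUD"
    return "severe AUD"


# Example: an individual endorsing 4 criteria within the same 12-month period
print(classify_aud_severity(4))  # moderate AUD
```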

GENERAL DIAGNOSTIC CONSIDERATIONS

Prevalence/Incidence

The National Epidemiologic Survey on Alcohol and Related Conditions (NESARC-III; Grant et al., 2014) is the most recent U.S. nationally representative survey that provides information on the prevalence and incidence of AUD in the past year (i.e., 12-month prevalence) and over the course of a person's lifetime (i.e., lifetime prevalence). According to published NESARC-III data, the 12-month and lifetime prevalence rates of AUD are 14% and 29%, respectively (Grant et al., 2015). Consistent with prior national epidemiologic surveys, Grant et al. (2015) found that individuals with the highest 12-month and lifetime rates of AUD tended to be male (18% and 36%, respectively), Native American (19% and 43%, respectively), White (14% and 33%, respectively), aged 18 to 29 years (27% and 37%, respectively), and never married (25% and 36%, respectively). In terms of socioeconomic status, individuals with a high school degree and some college or higher had the highest rates of 12-month and lifetime AUD compared to those with less than a high school degree. Those with an income of less than $20,000 had the highest rate of 12-month AUD (16%), and those with an income greater than $70,000 had the highest rate of lifetime AUD (30%).

Generally, a positive relationship has been found between severity of AUD and disability; even endorsing a single AUD criterion is associated with greater disability than endorsing none. Notably, only 8% and 20% of those with a 12-month and lifetime AUD, respectively, sought treatment or help for their alcohol problems, and the mean age of the first treatment episode was 29 years. The most commonly used treatment resources included 12-step groups, health care practitioners, outpatient facilities, and rehabilitation programs.

Course/Prognosis

The general course of AUD begins with increasing alcohol involvement during adolescence, peak involvement during late adolescence and early adulthood, and a gradual decrease during adulthood (Grant et al., 2004; Schuckit, 2009; Sher, Grekin, & Williams, 2005). Although this is recognized as the general course of AUD, the course is extremely heterogeneous across individuals. Four prototypical courses of alcohol involvement have been identified: a non-user/stable low-user course, a chronic or high-user course, a "developmentally limited" course that declines over the lifespan (see Zucker, 1987), and a later-onset course that gradually increases over the lifespan (Sher, Jackson, & Steinley, 2011).

Individuals who begin drinking alcohol at an earlier age are at an increased risk for developing AUD and alcohol-related problems in adulthood (Nelson, Van Ryzin, & Dishion, 2015). For example, those who begin drinking prior to age 15 years are 1.4 times more likely to develop AUD compared to those who begin drinking later (Dawson, Goldstein, Chou, Ruan, & Grant, 2008). Although the majority of those with early onset AUD "mature out" over time, in large part due to the assumption of adult roles that are incompatible with drinking (e.g., increased work responsibilities and marriage) and general psychosocial maturity, others persist in risky drinking (Chassin, Sher, Hussong, & Curran, 2013; Lee & Sher, in press).

The relative risk (RR) of mortality among those with AUD is nearly one and a half times higher for women than for men (RR = 4.64 vs. RR = 2.98, respectively). Furthermore, those in younger age groups and those in treatment have a substantially higher risk of mortality than do those with AUD who are not in these groups (Roerecke & Rehm, 2013).

Comorbidity

AUD is highly comorbid with other psychiatric disorders. For example, individuals with lifetime AUD are at greater odds of also having co-occurring lifetime major depressive disorder (odds ratio [OR] = 2.0), bipolar I disorder (OR = 5.0), post-traumatic stress disorder (OR = 3.0), generalized anxiety disorder (OR = 2.5), and borderline personality disorder (OR = 4.1) after controlling for sociodemographic characteristics (Grant et al., 2015). After adjusting for other co-occurring disorders, relative odds for most major disorders still remain quite high: major depressive disorder (OR = 1.3), bipolar I disorder (OR = 2.0), post-traumatic stress disorder (OR = 1.3), generalized anxiety disorder (OR = 1.2), and borderline personality disorder (OR = 2.0). Furthermore, AUD is highly comorbid with other substance use disorders (SUDs). Odds ratios are estimated to be approximately 7.8 between lifetime AUD and any other drug use disorder and 4.6 between AUD and nicotine use disorder after controlling for sociodemographic characteristics (Grant et al., 2015). After adjusting for co-occurring disorders, these odds are 4.1 and 3.2, respectively. Consequently, individuals presenting with AUD should be evaluated for a range of frequently comorbid conditions, and AUD should always be considered when assessing individuals with emotional, psychotic, and personality disorders, as well as with other forms of SUDs.

Etiology

There exists a rich history of etiologic models of AUD, which implicate various genetic, biological, psychosocial, and environmental influences (e.g., Chassin, Colder, Hussong, & Sher, 2016; Sher, Martinez, & Littlefield, 2011). Having a family history of alcohol problems is one of the strongest risk factors for developing problems with alcohol. Data from NESARC (Grant, Moore, Shepard, & Kaplan, 2003; Grant et al., 2004) indicated that 22% of adults in the United States have at least one parent with alcohol problems, and this risk nearly doubles with two (OR = 4.44; 95% confidence interval [CI], 3.93–5.02) versus one parent (OR = 2.51; 95% CI, 2.38–2.66) (Yoon, Westermeyer, Kuskowski, & Nesheim, 2013). A meta-analysis of twin and adoption studies indicated that the heritability (i.e., the proportion of variance explained by genetic factors) of AUD was 49% (95% CI, 43%–53%), the
proportion of variation explained by shared environment was 10% (95% CI, 3%–​16%), and the proportion of variance due to unique effects was 39% (95% CI, 38%–​42%) (Verhulst, Neale, & Kendler, 2015). Although various genetic variants have been implicated in the biological risk for AUD (e.g., ALDH2, OPRM1, and GABRA2) (Jones, Comer, & Kranzler, 2015; Stallings, Gizer, & Young-​ Wolff, 2016), the field is still in the early stages of characterizing how genetic variation impacts susceptibility to AUD. Some research has suggested that biological risk is mediated by factors such as alcohol metabolism, subjective response to alcohol, and a general tendency toward externalizing behavior. In terms of subjective response to alcohol, some evidence suggests those with a family history of AUD, compared to those with no family history of AUD, are more sensitive to the rewarding effects of alcohol (e.g., Morzorati, Ramchandani, Flury, & O’Connor, 2002; Schuckit, 1994; Söderpalm Gordh & Söderpalm, 2011). Furthermore, having a general tendency toward externalizing behavior makes one more likely to engage in a range of deviant behaviors such as excessive alcohol use (Chassin et al., 2013). In fact, research demonstrates that externalizing disorders are robust predictors of AUD onset from ages 13 to 30 years (Farmer et al., 2016), and much of the genetic influence on AUD is shared with other externalizing disorders (Kendler, Prescott, Myers, & Neale, 2003). Many etiologic models, including positive and negative affect regulation models, propose AUD develops through processes related to both positive and negative reinforcement. These include expectancy models that posit that drinking is related to the expectancy that alcohol use will enhance positive emotions or relieve negative emotions (e.g., Maisto, Carey, & Bradizza, 1999). Similarly, motivational models of alcohol consumption highlight the importance of consuming alcohol for both positive reinforcement (“enhancement motives”) and negative reinforcement (“coping motives” and “self-​ medication”), with the latter being more directly related to alcohol problems (and presumably syndromal AUD) (Cooper, Kuntsche, Levitt, Barber, & Wolf, 2016). As such, drinking expectancies and motives can represent a critical domain for assessment in those with AUD. For instance, some individuals may drink to increase positive affect, whereas others may drink to escape or avoid painful emotions. In fact, individuals are versatile in their motivations and expectancies, as most heavily alcohol-​ involved individuals report drinking for both positive and negative reasons. Both positive and negative affect

dysregulation are related to early onset, risky alcohol use and AUD (Cheetham, Allen, Yucel, & Lubman, 2010). Environmental factors such as experiences within the family and peer relations have also been implicated in etiologic models for AUD. Familial factors that increase risk for offspring AUD include parenting practices, family structure, prenatal exposure to alcohol and drugs, poor parent–​ child relationships (characterized by disharmony, low cohesion, and disorganization), family environment effects, exposure to socialization messages about substances, and increased opportunities to use. Research has demonstrated, for example, that decreased parental monitoring, low responsiveness (e.g., neglect), excessive use of harsh discipline (e.g., abuse), and deficits in parental warmth and control predict adolescent substance use (Chassin, Colder, et  al., 2016; Chassin, Haller, et al., 2016). AUD must also be considered within the context of interpersonal relationships outside the family of origin. Individuals with AUD experience problems in their relationships with others, particularly their relationships with intimate partners. Marriage is generally related to a reduced risk for AUD. Specifically, married individuals tend to drink less and have fewer AUD symptoms compared to their unmarried peers (Rodriguez, Neighbors, & Knee, 2013). However, some married couples still experience AUD. Spouses of individuals with AUD have lower marital satisfaction and higher rates of depression, anxiety, and psychological distress, in addition to more frequent reports of physical and emotional abuse, compared to spouses of individuals without AUD. In fact, alcohol problems predict subsequent marital distress. Interdependently, marital distress also predicts increased alcohol use and alcohol-​related problems, demonstrating the complex nature of the reciprocal relationship between marriage and alcohol use (Rodriguez et al., 2013). It is also important to understand the social context in which the drinking occurs (e.g., while out with friends or alone) as well as the drinking patterns of the individual’s peers. For example, there is considerable evidence demonstrating that affiliation with deviant peers in adolescence is associated with substance use and related problems (Chassin, Colder, et  al., 2016). There also exists a well-​ documented relationship between treatment outcomes and both level of social support and social network characteristics (including size, composition, and density; Mavandadi, Helstrom, Sayers, & Oslin, 2015). Therefore, given the importance of contextual factors such as peer affiliations, the marital relationship, and social support,

the consideration of interpersonal and social consequences of drinking and abstinence can provide valuable clinical information.

PURPOSES OF ASSESSMENT

The purpose of this chapter is to review measures relevant to assessing individuals with AUD. Specifically, this chapter provides an overview of assessments intended for AUD (a) diagnosis, (b) case conceptualization and treatment planning, and (c) treatment monitoring and evaluation. Although this chapter primarily focuses on clinical assessment, measures of alcohol-related constructs are also important in research and forensic settings. Diagnostic and outcome assessments may be especially useful in treatment development and implementation research. In addition, medical settings, such as hospitals, urgent care facilities, and emergency rooms, use alcohol screening to evaluate patients and determine suitable care or to provide treatment referrals for alcohol-related conditions such as alcohol withdrawal syndrome. Likewise, alcohol-related assessments are useful in legal settings for identifying intoxicated drivers and conducting psychological evaluations (e.g., custody hearings). Thus, assessments included in this chapter may be applicable in a variety of both clinical and nonclinical settings.

A systematic approach was used to identify relevant assessments within each of the three purposes mentioned previously. First, we searched major books (Binge Drinking and Alcohol Misuse: Among College Students and Young Adults, Winograd & Sher, 2015; Center for Substance Abuse Treatment, 1998; National Institute on Alcohol Abuse and Alcoholism [NIAAA], 2003; ) and book chapter reviews (e.g., Del Boca, Darkes, & McRee, 2016; Martens, Arterberry, Cadigan, & Smith, 2012; Martino, Poling, & Rounsaville, 2008) of alcohol assessment measures for clinical practice and research. In addition, all alcohol-related journals were reviewed from 2013 to 2016 for alcohol assessments. The 27 alcohol-related journals reviewed included Alcohol; Alcohol and Alcoholism; Alcohol Research; ISRN Addiction; Journal of Addiction; Journal of Alcoholism and Drug Dependence; Drug and Alcohol Review; Journal of Studies on Alcohol and Drugs (and supplemental journal); Addiction and Health; Addiction Biology; Addiction Science & Clinical Practice; Canadian Journal of Addiction; International Journal of Mental Health and Addiction; Journal of Substance Abuse Treatment; Substance Abuse and Rehabilitation; Substance Abuse: Research and Treatment; Substance Abuse, Treatment, Prevention and Policy; Addictive Behaviors; Alcoholism & Drug

Abuse Weekly; American Journal on Addictions; International Journal of High Risk Behaviors & Addiction; Journal of Addictions & Offender Counseling; Journal of Addictive Diseases; Psychology of Addictive Behavior; Alcoholism, Clinical and Experimental Research; Substance Abuse; and Alcohol Research:  Current Reviews. Furthermore, major websites providing a catalog of assessment measures were reviewed, specifically those of The Center on Alcoholism, Substance Abuse, and Addiction (CASAA) at the University of New Mexico (https://​casaa.unm.edu/​Instruments); the Alcohol and Drug Abuse Institute (ADAI) at the University of Washington (http://​lib.adai.uw.edu/​instruments); and the PhenX Toolkit sponsored by the National Institutes of Health (NIH; https://​www.phenxtoolkit.org). Google Scholar and Scopus search engines were used to find relevant psychometric articles to rate each instrument. Articles that provide the psychometric properties of the instruments included in this chapter were identified using the following search terms: “psychometrics,” “reliability,” “validity,” and “internal consistency.” A  combination of data from the books, chapters, psychometric articles, and manuals was used to rate each instrument recommended in this chapter.

ASSESSMENT FOR CASE IDENTIFICATION AND DIAGNOSIS

Case Identification

Although nearly 14% of the U.S. population meets criteria for a past-year diagnosis of AUD (Grant et al., 2015), many individuals with alcohol use problems will go undetected (NIAAA, 2003). Given that continued risky drinking is associated with further alcohol-related negative consequences, screening is a prevention priority. In fact, the National Commission on Prevention Priorities includes alcohol misuse screening among the top prevention services (Maciosek et al., 2006). Research demonstrates that regular AUD screening is cost-effective from a health system perspective as well as a societal perspective (Solberg, Maciosek, & Edwards, 2008). Interestingly, research has also indicated that patients support being screened for at-risk drinking by their physicians, whether in the form of biomarker laboratory tests or self-report measures (Miller, Thomas, & Mallin, 2006). The purpose of screening is to identify individuals with alcohol-related problems or consequences as well as those who are at risk for experiencing such problems.

An important goal of utilizing alcohol screeners is the early detection of individuals with alcohol-related problems, with the intention of initiating further assessment and treatment when indicated. Screening tests are evaluated along a range of dimensions, including sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Sensitivity, also known as the true positive rate, quantifies the extent to which test scores correctly identify people with the problem of interest. Specificity, or true negative rate, quantifies the extent to which test scores can identify people without the problem of interest. PPV is the proportion of individuals who screen positive who actually have the disorder, whereas NPV is the proportion of individuals who screen negative who do not have the disorder. Cut-off values for screening tests are chosen to balance sensitivity, specificity, PPV, and NPV.
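To make these four indices concrete, the following is a minimal sketch that computes them from the four cells of a screening-versus-diagnosis cross-tabulation; the counts in the example are invented for illustration and do not come from any study cited in this chapter.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, PPV, and NPV from the cells of a
    2 x 2 table crossing screen result (positive/negative) with true
    diagnostic status (disorder present/absent).

    tp: screen positive, disorder present; fp: screen positive, disorder absent
    fn: screen negative, disorder present; tn: screen negative, disorder absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # proportion of positive screens that are true cases
        "npv": tn / (tn + fn),          # proportion of negative screens that are true non-cases
    }


# Illustrative counts only: 80 true positives, 30 false positives,
# 20 false negatives, 870 true negatives
print(screening_metrics(tp=80, fp=30, fn=20, tn=870))
```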

An important characteristic when considering screening measures is ease of administration. Providers often have limited financial and time resources, which means screening tests for AUDs and related problems should be brief, easily administered with minimal time needed for training and scoring, and straightforwardly incorporated with other assessments or procedures. Our recommendations for screening measures favor these characteristics.

Finally, it is important to consider the populations for which various instruments have been normed. Each of the measures described in this section has been validated across a range of populations (e.g., college students and adolescents) and settings (e.g., hospitals and mental health centers). Where this is not the case, or where an instrument is especially strongly supported in a particular population (e.g., the Fast Alcohol Screening Test for use in hospital settings), this is noted in the text.

Table 18.1 summarizes the ratings of various screening and diagnostic measures. Specific values for reliability and validity estimates are not reported in the text; interested readers should consult the citations reported for each instrument in order to augment the general information provided in Table 18.1.

Three major approaches to screening and case identification exist: laboratory tests, screening interviews, and self-report questionnaires. Laboratory tests are used to assess biomarkers, which reflect physiological reactions to heavy drinking. Biomarkers are useful because they do not rely on self-report and, as a result, are not vulnerable to subjective recall and various self-report biases. Phosphatidylethanol (PEth; e.g., Walther et al., 2015) is a specific metabolite of ethanol and, unlike other biomarkers, is not influenced by liver diseases (Litten, Bradley, & Moss, 2010). Numerous studies have demonstrated that PEth has a low rate of false positives and has the highest sensitivity compared to carbohydrate-deficient transferrin (CDT), mean corpuscular volume (MCV), and γ-glutamyl transferase (GGT) (e.g., Bajunirwe, Haberer, Ii, & Hunt, 2014). GGT (e.g., Reynaud et al., 2000) is a glycoenzyme and is included in most standard blood panels. GGT is elevated when an individual has engaged in heavy alcohol use over the course of a few weeks, which makes it appropriate for the detection of severe AUDs.

Although the evidence on the sensitivity of CDT is mixed, for some purposes it has adequate psychometric properties as an AUD screening laboratory test. CDT (e.g., Walther et al., 2015) is often very sensitive to changes in drinking and is therefore useful for both screening and relapse identification. CDT has been widely studied across a variety of samples and settings; however, the fact that it is not routinely included in blood panels makes it somewhat more expensive and time-consuming as a screener than GGT. Although GGT is useful as a screening biomarker, studies suggest CDT is superior to GGT in terms of specificity. Some work has demonstrated the utility of combining CDT and GGT in the screening of AUD because this results in increased sensitivity and specificity relative to either marker on its own (e.g., Bertholet, Winter, Cheng, Samet, & Saitz, 2014).

MCV (e.g., Mundle, Munkes, Ackermann, & Mann, 2000) is a measure of the size of red blood cells and is affected by many other conditions, such as liver disease and anemia, which reduces its specificity as a screening tool. MCV responds slowly to abstinence and may remain at high levels as long as 3 months after the individual has stopped using alcohol. There is also evidence that MCV performs differently in males and females, so different cut-offs are recommended. Furthermore, MCV has been shown to increase linearly with age.

Overall, if a biomarker is desired, PEth appears to be the laboratory test with the best psychometric properties for the screening of hazardous/harmful drinking persisting for at least a few weeks. If the aim is to screen for hazardous/harmful drinking that has lasted less than a few weeks, CDT is recommended (Snell, Bhave, Takacs, & Tabakoff, 2016).

Four screening interviews are psychometrically sound for screening across various populations and settings: the CAGE (Ewing, 1984), the Fast Alcohol Screening Test (FAST; Hodgson, Alwyn, John, Thom, & Smith, 2002), the Alcohol, Smoking and Substance Involvement Screening Test (ASSIST; WHO ASSIST Working Group, 2002), and the Global Appraisal of Individual Needs Short Screener (GAIN-GSS; Dennis, Feeney, Stevens, & Bedoya, 2006).


TABLE 18.1 Ratings of Instruments Used for Case Identification and Diagnosis

| Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended |
|---|---|---|---|---|---|---|---|---|---|
| Screening and Case Identification: Laboratory Tests | | | | | | | | | |
| PEth | E | NA | NA | NR | NA | E | G | A | |
| GGT | E | NA | NA | NR | NA | A | E | A | |
| MCV | E | NA | NA | NR | NA | A | A | A | |
| CDT | E | NA | NA | A | NA | G | A | A | |
| Screening Interviews and Self-Report Questionnaires | | | | | | | | | |
| AUDIT^a | E | G | NA | G | A | G | E | A | ✓ |
| ASSIST | G | G | E | A | G | G | E | G | ✓ |
| GAIN-GSS^a | G | G | NR | NR | A | G | G | G | ✓ |
| CAGE | G | G | G | A | A | G | G | A | ✓ |
| MAST^a | A | G | NA | G | A | G | A | A | ✓ |
| FAST | G | A | NR | A | NR | A | A | A | |
| RAPS-4 | G | NR | NA | NR | A | A | G | A | |
| DUSI-R | G | A | NA | NR | A | G | E | A | |
| SAAST-R | A | E | NA | NR | NR | G | A | G | |
| PAWSS | A | NR | E | NR | A | NR | NR | A | |
| PDSQ | A | G | NA | A | NR | G | A | A | |
| SSI-AOD (SSI-SA)^a | A | G | NA | A | G | A | E | E | |
| ADS | A | G | NA | A | A | G | G | A | |
| Diagnostic Assessments | | | | | | | | | |
| AUDADIS^a | G | NR | NR | G | A | A | G | A | ✓ |
| SCID | A | NR | G | A | A | A | A | A | ✓ |
| PRISM | A | NR | G | A | A | NR | A | A | |
| SDSS | A | G | NR | A | A | NR | A | A | |
| SSAGA^a | A | NR | G | G | A | A | G | A | |

^a Measure available at https://www.phenxtoolkit.org.

Note: PEth = phosphatidylethanol; GGT = γ-glutamyl transpeptidase; MCV = mean corpuscular volume; CDT = carbohydrate-deficient transferrin; AUDIT = Alcohol Use Disorders Identification Test; ASSIST = Alcohol, Smoking, and Substance Involvement Screening Test; GAIN-GSS = Global Appraisal of Individual Needs–Gain Short Screener; CAGE = not an acronym (the letters cue the questions that compose the instrument); MAST = Michigan Alcoholism Screening Test; FAST = Fast Alcohol Screening Test; RAPS-4 = Rapid Alcohol Problems Screen-4; DUSI-R = Drug Use Screening Inventory-Revised; SAAST-R = Self-Administered Alcohol Screening Test-Revised; PAWSS = Prediction of Alcohol Withdrawal Severity Scale; PDSQ = Psychiatric Diagnostic Screening Questionnaire; SSI-AOD (SSI-SA) = Simple Screening Instrument for Alcohol and Other Drugs (also called the Simple Screening Instrument for Substance Abuse); ADS = Alcohol Dependence Scale; AUDADIS = Alcohol Use Disorder and Associated Disabilities Interview Schedule; SCID = Structured Clinical Interview for DSM; PRISM = Psychiatric Research Interview for Substance and Mental Disorders; SDSS = Substance Dependence Severity Scale; SSAGA = Semi-Structured Assessment for the Genetics of Alcoholism; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

The CAGE takes less than 1 minute to administer and requires minimal training. Each letter in CAGE represents a single question, making the instrument easy to administer and replicate. It asks whether the respondent has ever felt the need to cut down on drinking, ever felt annoyed by others criticizing their drinking, ever felt bad or guilty about their drinking, and ever had a drink first thing in the morning (i.e., an eye-opener). The original instrument uses a cut-off score of 2, but a cut-off score of 1 has been recognized as clinically significant and indicative of the need for further assessment (Bradley, Bush, McDonell, Malone, & Fihn, 1998).
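As a concrete illustration, here is a minimal sketch of CAGE tallying under the conventions just described (one point per endorsed item, flagged at the traditional cut-off of 2 or the more sensitive cut-off of 1); the function and parameter names are illustrative and not part of any published implementation.

```python
def score_cage(cut_down: bool, annoyed: bool, guilty: bool, eye_opener: bool,
               cutoff: int = 2) -> dict:
    """Tally the four CAGE items (1 point each) and compare the total to a
    chosen cut-off (2 by convention; 1 for a more sensitive screen).

    A positive screen signals the need for further assessment, not a diagnosis.
    """
    total = sum([cut_down, annoyed, guilty, eye_opener])
    return {"total": total, "positive_screen": total >= cutoff}


# Example: two endorsed items meets the conventional cut-off of 2
print(score_cage(cut_down=True, annoyed=False, guilty=True, eye_opener=False))
```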

The FAST is also a brief, 4-item instrument (as described later, it can also be administered as a self-report measure). The first item is a screener assessing how often the individual has one drink or more on a single occasion. If the respondent does not answer "never," he or she is then asked about blackouts or memory loss, failure to do what was expected because of drinking, and concern from others about drinking behavior or advice from others to cut down. This procedure, in both interview and questionnaire formats, is effective for mass screening in epidemiologic work or in hospital settings, for example.

The ASSIST is more extensive than the other interviews because it assesses both alcohol and other drug use. The skip logic employed by the instrument aims to keep administration time relatively brief, with estimates ranging from 5 to 15 minutes. The ASSIST requires more training in administration and scoring than the other interviews.

The GAIN-GSS is one of the few available screeners addressing both mental health and drug and alcohol problems. It includes 20 items and takes 3 to 5 minutes to administer. The GAIN-GSS is highly recommended by the NIH Data Harmonization Project (i.e., https://www.PhenX.org). The CAGE, ASSIST, and GAIN-GSS have been normed in both adolescent and adult populations, whereas the FAST has been normed mainly among adults.

Among self-report questionnaire measures, the Alcohol Use Disorders Identification Test (AUDIT; Babor, Higgins-Biddle, Saunders, & Monteiro, 2001) and the Michigan Alcoholism Screening Test (MAST; Selzer, 1971) are the most highly recommended. Developed by the World Health Organization (WHO), the AUDIT is a brief, 10-item instrument with excellent psychometric properties across genders, ages, and cultures, as well as a range of settings. There are various short versions of the AUDIT, including the AUDIT-3 and AUDIT-4 (Gual, 2002), the 5-item AUDIT-Primary Care (AUDIT-PC; Piccinelli et al., 1997), and the 3-item AUDIT-Consumption (AUDIT-C; Bush et al., 1998), which comprises the consumption items. These short versions have promising psychometric properties, but more research is needed before they can be recommended for widespread use. The MAST is a 25-item instrument that has performed well in a variety of settings (e.g., inpatient and outpatient) across a wide range of populations. The MAST has various short forms, including the 10-item Brief MAST (BMAST; Pokorny, Miller, & Kaplan, 1972) and the 13-item Short MAST (SMAST; Selzer, Vinokur, & van Rooijen, 1975).

Notably, some of the measures in Table 18.1 can be administered as either interviews or questionnaires. These include the MAST (Selzer, 1971), FAST (Hodgson et al., 2002), AUDIT (Babor et al., 2001), Simple Screening Instrument for Alcohol and Other Drugs (SSI-AOD; Winters & Zenilman, 1994), GAIN-GSS (Dennis et al., 2006), Alcohol Dependence Scale (ADS; Horn, 1984; Skinner & Horn, 1984), and CAGE (Ewing, 1984). All screening measures listed in Table 18.1 have adequate psychometric properties, but compared to those described in the text, the other measures require more time to administer (e.g., the Drug Use Screening Inventory-Revised [DUSI-R]; Tarter & Kirisci, 2001) or have less research supporting their generalizability across settings and populations.

Diagnosis

DSM-5 defines AUD as a problematic pattern of alcohol use leading to clinically significant impairment or distress. To receive a "current" diagnosis of AUD, an individual must have experienced at least two of the following symptoms within the past 12 months: drinking more alcohol, or over a longer time period, than initially intended; a recurrent desire to cut down on alcohol use or failed attempts to control use; spending a significant amount of time finding, consuming, or recuperating from the effects of alcohol; craving (i.e., a powerful desire or urge to consume alcohol); failure to meet important responsibilities at work, school, or home due to alcohol use; continued alcohol use despite social or relational conflicts; important activities given up or reduced due to alcohol use; repeated use in situations in which there is potential for physical harm to self or others; continued use despite awareness of a physical or psychological ailment caused or made worse by alcohol; tolerance (i.e., a need for larger amounts of alcohol to attain the anticipated effect); and withdrawal.

To receive a lifetime diagnosis of AUD, an individual must endorse two of the aforementioned criteria within the same 12-month period. Lifetime diagnosis of AUD serves many important purposes, such as estimating the prevalence of AUD. However, clinicians and researchers conducting lifetime assessments of AUD should be aware of the limitations of these assessments, such as underreporting of problems at a single assessment and the limited validity of late-onset cases (Haeny, Littlefield, & Sher, 2014a, 2014b, 2016).

Accurate diagnosis is fundamental to AUD treatment and research. Formal diagnosis has many benefits: it provides a shared nomenclature for clinicians to discuss treatment planning and outcome, serves as a basis for organizing and retrieving information, provides a basis for predictions, and serves a sociopolitical function (e.g., allowing reimbursement from insurance companies) (Blashfield, Keeley, Flanagan, & Miles, 2014). Although many clinicians and researchers gather information to inform diagnosis during initial sessions, formal structured (and semi-structured) assessment tools can improve the validity and reliability of diagnoses. Furthermore, the use of structured interviews as a basis for diagnosing AUD allows clinicians and researchers to collect relevant information within an acceptable time frame. Although diagnostic interviews tend to require more time and training than screening instruments, research demonstrates that structured diagnostic interviews are accepted by patients in a variety of settings (Suppiger et al., 2009).


Table 18.1 summarizes the psychometric properties of instruments used for the diagnosis of DSM-​5 AUD. Specific values for reliability and validity estimates are not reported in the text. Interested readers should consult the citations reported for each instrument in order to augment the general information provided in the table. Only instruments that have been adapted for DSM-​5 AUD (i.e., they must include the recently added craving criterion) are included in this review. However, because the DSM-​5 is relatively new, few studies examining the psychometric properties of these newer instruments exist. Therefore, psychometric information from the corresponding DSM-​ IV-​TR version was examined when necessary because it is reasonable to suspect a certain level of concordance between the two versions given the minor changes to the AUD criteria and the fact that some measures assessing DSM-​IV-​TR included craving even though that symptom was not included in DSM-​IV-​TR. The most highly recommended diagnostic interviews are the Structured Clinical Interview for DSM-​5 (SCID-​ 5; First, Williams, Karg, & Spitzer, 2016) and the Alcohol Use Disorder and Associated Disabilities Interview Schedule-​5 (AUDADIS-​5; Grant et al., 2011). The SCID-​ 5 is a semi-​structured interview with available clinical and research versions. The research version is slightly more comprehensive than the clinical version. The entire SCID-​5 takes between 60 and 90 minutes to administer, but the section assessing AUD takes 5 to 10 minutes. Previous versions of the SCID have been adapted for multiple languages (e.g., Spanish, Chinese, French, and German) and validated across a range of populations (e.g., inpatient and outpatient). The entire AUDADIS-​5 takes approximately 60 minutes to administer, shows adequate psychometric properties, and has been validated across a range of clinical and general population samples. The section specifically assessing AUD should typically take less than 10 minutes. Despite these desirable properties, the AUDADIS may overestimate the prevalence of withdrawal by failing to adequately disambiguate withdrawal from hangover symptomatology (Boness, Lane, & Sher, 2016). As a result, researchers and clinicians should take care to address this shortcoming by further assessing withdrawal symptoms if that is a key concern. The remaining diagnostic assessment instruments in Table 18.1 have versions for assessing DSM-​5 AUD but have less psychometric research available. However, those instruments that do have psychometric information available appear to be adequate for both clinical and research use. Of note is the Semi-​Structured Assessment for the Genetics of Alcoholism (SSAGA; Bucholz et al., 1994), which has not

been updated for DSM-5 but includes an item on craving, thus making it possible to derive a DSM-5 AUD diagnosis. In addition, it is important to distinguish between diagnostic instruments that must be administered by clinician interviewers and those that can be administered by lay interviewers, as these instruments require different levels of training. Clinician interviewers are trained mental health professionals (e.g., psychologists or psychiatrists) familiar with diagnostic classification and diagnostic criteria, whereas lay interviewers are nonclinicians. Although clinicians must go through training for administration of these instruments, lay interviewers are often trained much more extensively via a combination of directed self-study, intensive classroom training, and supervised practice administrations. The level of training required for administration varies by instrument. The SCID, the Psychiatric Research Interview for Substance and Mental Disorders (PRISM; Hasin et al., 1996), and the Substance Dependence Severity Scale (SDSS; Miele et al., 2000) must each be administered by clinicians, whereas the AUDADIS and SSAGA are designed to be administered by trained lay interviewers.

Overall Evaluation

As a whole, there exists a wide range of instruments for AUD screening and diagnosis. The most highly recommended laboratory test is PEth. For the purposes of screening, the most highly recommended interviews and self-report questionnaires include the AUDIT, ASSIST, GAIN-GSS, CAGE, and MAST. The SCID-5 and AUDADIS are the recommended measures for syndromal diagnosis of DSM-5 AUD. Overall, screening laboratory tests are useful for identifying those who are at risk for alcohol-related problems and therefore in need of further diagnostic assessment. Given that screening instruments may be affected by recall bias or social desirability, the possibility of false negatives should be considered when using them. Although AUD diagnostic instruments are more thorough and time-consuming than screening instruments, they can provide detailed clinical information useful for treatment planning and coordination.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

For individuals with AUD, case conceptualization and treatment planning requires the consideration of many different issues and client characteristics. This includes


aspects such as the need for detoxification, screening for medical/​health problems, comorbid conditions, level of care determination, client treatment preference, consumption patterns, consequences of drinking, family history of alcoholism, readiness to change, drinking goals, treatment history, craving, possible high-​ risk/​ relapse situations, alcohol outcome expectancies, drinking self-​ efficacy, and social network characteristics. Table 18.2 provides a summary of measures that can be used to assess this range of constructs. Need for Detoxification When conceptualizing a case and planning treatment, it is important to assess for symptoms of alcohol withdrawal syndrome (AWS) in individuals who recently quit or reduced their alcohol use. This is because those with moderate to severe AWS may require supervised detoxification in either an inpatient or an outpatient setting. AWS involves experiencing two or more of the following symptoms causing significant distress or impairment within a few hours to several days after a reduction in heavy or prolonged alcohol use: (a) autonomic hyperactivity; (b) increased hand tremor; (c) insomnia; (d) nausea; (e) transient visual, tactile, or auditory hallucinations; (f) anxiety; and/​or (g) seizures (APA, 2013). Despite that fact that several measures have been developed to assess AWS, the most widely used measure continues to be the revised Clinical Institute Withdrawal Assessment for Alcohol (CIWA-​ Ar; Sullivan, Sykora, Schneiderman, Naranjo, & Sellers, 1989). This is a 10-​item clinician-​ report questionnaire that can be completed in less than 2 minutes. However, the CIWA-​Ar is not without limitations. Investigations of the psychometrics of this measure indicate that is has been found to have poor internal consistency in some studies (Holzman & Rastegar, 2016; Pittman et al., 2007) and may underestimate the severity of AWS in Native Americans, which limits its generalizability (Rappaport et  al., 2013). Newly developed measures are often compared to the CIWA because it is regarded as the “gold standard” measure for assessing AWS. A briefer measure than the CIWA that has acceptable psychometric properties is the Short Alcohol Withdrawal Scale (SAWS; Gossop, Keaney, Stewart, Marshall, & Strang, 2002), which is a 10-​item self-​report questionnaire. Medical/​Health Screening Excessive alcohol consumption and AUD are associated with a range of health conditions. The presence of


medical or health conditions can complicate the clinical picture; thus, it is important to take this into account during treatment conceptualization and planning. A review of the chronic diseases and conditions related to alcohol use (Shield, Parry, & Rehm, 2013)  indicated that alcohol consumption is often the primary or the sole cause of 25 chronic diseases (e.g., liver cirrhosis, gastritis, and pancreatitis) listed in the 10th edition of the International Classification of Disease (ICD-​10; WHO, 2004). In addition, alcohol use increases risk for certain cancers, tumors, neuropsychiatric conditions, and many cardiovascular and digestive diseases, and it can have both beneficial and detrimental effects on diabetes, stroke, and heart disease (Shield et al., 2013). Multiple measures are available for assessing general health status. The highly recommended measures are those with evidence of strong psychometric characteristics and include the Addiction Severity Index (ASI-​5; Denis, Cacciola, & Alterman, 2013)  and the 12-​and 36-​Item Short Form Health Surveys (SF-​12 [Ware, Kosinski, & Keller, 1996] and SF-​36 [Medical Outcomes Trust, 1991; Ware & Sherbourne,  1992], respectively). These measures do not assess specific physical diseases but, rather, overall current health and whether health problems are interfering in important areas of life. Comorbid Psychopathology As mentioned previously, AUD is highly comorbid with other mental disorders, including mood disorders, anxiety disorders, personality disorders, and other drug use disorders (Grant et al., 2015). Thus, it is essential that case conceptualization and treatment planning involve the assessment of co-​occurring psychiatric disorders. Several measures have been widely used to assess mental disorders co-​occurring with AUD; however, there is currently a limited number of published measures for diagnosing AUD and other mental disorders that have been updated for the DSM-​5. Given that the most relevant instruments for assessing AUD and co-​occurring mental disorders are the semi-​ structured interviews listed in Table 18.1, to avoid redundancy, these measures are not included in Table 18.2 but are discussed in the text. The most widely used measures for assessing AUD and commonly co-​occurring mental disorders include the SCID, the Mini-​International Neuropsychiatric Interview (M.I.N.I.; Sheehan, 2014; Sheehan et  al., 1994), the SSAGA, and the AUDADIS. The SCID has been updated to reflect DSM-​ 5 diagnoses (First, Williams, Karg, & Spitzer, 2014 [research version]; First, Williams, Karg, &


TABLE 18.2 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Each instrument is rated in the table on nine dimensions: Norms; Internal Consistency; Inter-Rater Reliability; Test–Retest Reliability; Content Validity; Construct Validity; Validity Generalization; Clinical Utility; and whether it is Highly Recommended (ratings: A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported). The instruments rated, grouped by assessment purpose, are:

Need for Detoxification: CIWA-Ar(b), SAWS
Medical/Health Screening: ASI(a), SF-36(a), SF-12, LISRES
Level of Care Determination: RAATE, ASAM PPC
Drinking Patterns: TLFB, Form 90, DSML, LDH, Q-F Measures, AUI
Consequences of Drinking: RAPI, B-YAACQ, MAST, AUI, SIP, DrInC, YAACQ
Readiness to Change: URICA, SOCRATES, RCQ
Treatment History: Form 90, TLFB
Craving: Penn Alcohol Craving Scale (PACS), OCDS, AUQ(a), TRI, Preoccupation with Alcohol Scale (PACS), ACQ-R, ACQ-Now, JACQ
High-Risk Drinking Situations/Relapse Situations: DMQ-R(a), IDS, IDS-42, DCS, DPQ
Alcohol Outcome Expectancies: CEOA, B-CEOA
Drinking Self-Efficacy: DRSEQ, DRSEQ-R, SCQ, AASE
Social Network: IPA

(a) Measure recommended by the PhenX Toolkit (https://www.phenxtoolkit.org).
(b) The internal consistency and validity of the CIWA-Ar have been inconsistent; however, this measure was included because it is considered the gold standard and is currently the best measure available despite these limitations.

Note: CIWA-Ar = Clinical Institute Withdrawal Assessment for Alcohol Scale; SAWS = Short Alcohol Withdrawal Scale; ASI = Addiction Severity Index; SF-36 = 36-Item Short Form Health Survey; SF-12 = 12-Item Short Form Health Survey; LISRES = Life Stressors and Social Resources Inventory; RAATE = Recovery Attitude and Treatment Evaluator; ASAM PPC = American Society of Addiction Medicine Patient Placement Criteria; TLFB = Timeline Followback; DSML = Drinking Self-Monitoring Log; LDH = Lifetime Drinking History; Q-F Measures = Quantity–Frequency Measures; AUI = Alcohol Use Inventory; RAPI = Rutgers Alcohol Problem Index; B-YAACQ = Brief Young Adult Alcohol Consequences Questionnaire; MAST = Michigan Alcoholism Screening Test; SIP = Short Index of Problems; DrInC = Drinker Inventory of Consequences; YAACQ = Young Adult Alcohol Consequences Questionnaire; URICA = University of Rhode Island Change Assessment; SOCRATES = Stages of Change Readiness and Treatment Eagerness Scale; RCQ = Readiness to Change Questionnaire; PACS = Penn Alcohol Craving Scale or Preoccupation with Alcohol Scale; OCDS = Obsessive Compulsive Drinking Scale; AUQ = Alcohol Urge Questionnaire; TRI = Temptation and Restraint Inventory; ACQ-R = Alcohol Craving Questionnaire-Revised; ACQ-Now = Alcohol Craving Questionnaire—Now (assessing craving in the present moment); JACQ = Jellinek Alcohol Craving Questionnaire; DMQ-R = Drinking Motives Questionnaire-Revised; IDS = Inventory of Drinking Situations; IDS-42 = Inventory of Drinking Situations (42-item version); DCS = Drinking Context Scale; DPQ = Drinking Patterns Questionnaire; CEOA = Comprehensive Effects of Alcohol; B-CEOA = Brief Comprehensive Effects of Alcohol; DRSEQ = Drinking Refusal Self-Efficacy Questionnaire; DRSEQ-R = Drinking Refusal Self-Efficacy Questionnaire-Revised; SCQ = Situational Confidence Questionnaire; AASE = Alcohol Abstinence Self-Efficacy; IPA = Important People and Activities; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

Spitzer, 2016 [clinician version]). The SCID was developed to assess the most frequently diagnosed disorders in adults, including psychotic disorders, mood disorders, drug and alcohol use disorders, anxiety disorders, somatoform disorders, eating disorders, adjustment disorder, and personality disorders. There are patient and nonpatient versions of the SCID for those conducting assessments in clinical or nonclinical settings. The SCID can take, on average, 60 to 90 minutes to administer. A briefer version of the SCID is the M.I.N.I., which has also been updated to reflect DSM-​5 criteria and assesses mood disorders, anxiety disorders, drug and alcohol use disorders, psychotic disorders, and antisocial personality disorder. The M.I.N.I.  can be administered in roughly 10 minutes, which increases its clinical utility. The SSAGA was specifically developed to distinguish between symptoms of AUD, depression, and antisocial behaviors to ensure accuracy of diagnosis for genetics research. The disorders assessed in the SSAGA include drug and alcohol use disorders, major depression, dysthymia, mania, somatization, antisocial personality disorder, anorexia, bulimia, panic, agoraphobia, social phobia, and obsessive–​ compulsive disorders. The SSAGA was developed specifically to disentangle disorders that commonly co-​occur with AUD, which may increase its clinical utility. However, the administration time is much longer than other measures (approximately 120 minutes), which may limit the clinical

utility of the SSAGA. The AUDADIS was developed to assess AUD and other related conditions in both clinical samples and the general population. Diagnoses assessed by the AUDADIS include drug and alcohol use disorders, mood disorders, anxiety disorders, eating disorders, and personality disorders. The AUDADIS takes approximately 60 minutes to administer. Level of Care Determination Level of care needed is an important factor to consider when treatment planning. It is widely known that there is no single treatment model that effectively treats all individuals with alcohol problems (e.g., Institute of Medicine, 1990; National Institute on Drug Abuse, 2009). Despite this knowledge, many programs emphasize one main treatment model whether it is abstinence only, 12-​step based, harm reduction, behavior therapy, or therapeutic communities (Mee-​Lee & Gastfriend, 2014). Often, the image of drug and alcohol treatment is a 28-​day program in a residential setting. Given the heterogeneity of problems associated with alcoholism, it is understandable that a variety of treatment options are needed. Some treatment programs may have a single standard of care regardless of the case presentation; however, data suggest that matching clients to services based on their identified problem needs improves treatment outcomes (e.g., Camilleri,


Cacciola, & Jenson, 2012; Gastfriend & McLellan, 1997; McLellan et al., 1997). There are measures designed to determine the level of care needed when planning treatment. These methods were developed with the goal of providing a rationale for the treatment approach recommended. The Patient Placement Criteria (PPC; Hoffman, Halikas, Mee-​Lee, & Weedman, 1993) published by the American Society of Addiction Medicine (ASAM) is one method available to assess patients for (a) acute intoxication or withdrawal potential, (b)  biomedical conditions or complications, (c) treatment acceptance or resistance, (d)  relapse potential, and (e)  recovery environment. These criteria are then used to assign clients to one of four levels of care:  I, outpatient; II, intensive outpatient; III, medically monitored inpatient; and IV, medically managed inpatient (Camilleri et al., 2012). There is extensive evidence suggesting that using the PPC to match patients to levels of care is associated with lower rates of morbidity, improved functioning, and lower rates of service utilization compared to mismatching patients to lower levels of care (Gastfriend & Mee-​Lee, 2004). Treatment Preference There exists an abundance of research on shared decision-​ making (e.g., Crawford et  al., 2003; Härter et  al., 2011; Härter, van der Weijden, & Elwyn, 2011; Joosten, De Jong, De Weert-​van Oene, Sensky, & Van Der Staak, 2009; Neuner et al., 2007), in which clients participate in selecting the treatment (e.g., motivational enhancement, cognitive behavioral, and 12-​step facilitation) they prefer. Evidence suggests that matching clients to their preferred treatment can result in higher treatment adherence, improved symptom related-​outcomes, and higher treatment retention rates (Graff et al., 2009; Swift & Callahan, 2009). For example, clients matched to their preferred treatment tended to drink less than their unmatched counterparts (Adamson, Sellman, & Dore, 2005). Notably, other researchers found no differences in number of drinking days, days intoxicated, and a reduction in drinking among clients matched to their preferred treatment (McKay, Alterman, McLellan, Snider, & O’Brien, 1995). Friedrichs, Spies, Härter, and Buchholz (2016) suggested that shared decision-​ making interventions might be a useful approach for assessing and utilizing treatment preferences in treatment planning given preliminary data on their effectiveness (Brener, Resnick, Ellard, Treloar, & Bryant, 2009; Joosten, De Weert-​van Oene, Sensky, Van Der Staak, & De Jong, 2010; Joosten et  al., 2009); however, the data have been mixed, and additional research is needed to further investigate the effectiveness

of these approaches in improving drug and alcohol treatment outcomes. Given there are currently limited data on evidence-​based measures for identifying treatment preference, there are no measures to assess treatment preference listed in Table 18.2. Consumption Patterns Assessing consumption patterns is important for understanding the extent of a client’s use prior to treatment. Identifying how much and when a client typically drinks can be valuable information for developing treatment targets. In addition, quantitative variables such as percentage of days drinking, percentage of abstinent days, mean drinks per drinking day, typical blood alcohol content, peak blood alcohol content, and percentage of low, moderate, and heavy drinking days can be used as important feedback to clients regarding their drinking patterns and associated consequences. Several measures are available for assessing consumption patterns, including the Timeline Followback (TLFB; Agrawal, Sobell, & Sobell, 2008; Sobell & Sobell, 1995; Sobell, Sobell, Bogardis, Leo, & Skinner, 1993) and the Form 90 (Miller, 1996; Tonigan, Miller, & Brown, 1997), which are the mostly highly recommended interviews. Although both methods utilize calendars as memory aids to allow for estimating daily drinking patterns and the information needed to estimate the aforementioned quantitative variables, there are several important distinctions. The TLFB assesses the number of standard drinks consumed each day during the assessment period, which can range from 30 days to 1 year. The Form 90 assesses common drinking patterns while isolating drinking episodes, which is used to estimate the amount of drinking each day during the period of assessment. In addition to assessing drinking patterns, the Form 90 assesses general functioning with regard to work, school, religious practice, medical concerns, legal issues, and psychiatric care. Despite the many benefits of the TLFB and the Form 90, both are time-​consuming and require extensive training to achieve an adequate level of reliability. Specifically, the TLFB can take 10 to 15 minutes to assess the past 90 days and up to 30 minutes to assess the past 12 months, and the Form 90 takes on average 45 minutes to administer. The use of these measures for the purposes of case conceptualization and treatment planning is important for understanding contextual factors that contribute to drinking episodes, which aid in identifying targets for treatment. There are also several self-​report questionnaires that are highly recommended, including drinking self-​monitoring


logs, lifetime drinking measures, and quantity–​frequency measures. Drinking self-​monitoring logs (e.g., Sobell & Sobell, 1995; Vuchinich, Tucker, & Harllee, 1988) provide concurrent data on alcohol consumption that can be used to identify high-​risk situations, monitor urges, and track treatment progress. Lifetime drinking measures include the Lifetime Drinking History (LDH; Skinner & Sheu 1982), the Concordia Lifetime Drinking Questionnaire (CLDQ; Chaikelson, Arbuckle, Lapidus, & Gold, 1994), and the Cognitive Lifetime Drinking History (CLDH; Russell et  al., 1997, 1998). The LDH is the most widely used and only takes 20 to 30 minutes to complete, the CLDQ uses visual aids and takes 20 minutes to complete, and the CLDH is a computer-​administered interview that also includes cognitive techniques similar to those used in the TLFB. There are several quantity–​frequency measures, including the Volume–​Variability Index (VV Index; Cahalan & Cisin, 1968), the Khavari Alcohol Test (Khavari & Farber, 1978), the Graduated–​ Frequency Measure (Clark & Midanik, 1982; Midanik, 1994), the NIAAA Quantity Frequency (Armor, Polich, & Stambul, 1978), and the CLDH (Russell et  al., 1997). These measures provide a quick and easy assessment of total consumption and number of drinking days. Notably, there are many limitations of quantity–​ frequency measures, including that they are subject to underreporting alcoholic beverages consumed; often fail to assess the type of drink consumed in a day (and when type of drink consumed is assessed, it increases the administration time); and often do not have the ability to detect fluctuations in drinking, including days of sporadic heavy drinking (Sobell & Sobell, 2004). There are advantages and disadvantages to using either interview formats or self-​report questionnaires (including computerized methods) for assessing alcohol consumption. Clinicians and researchers must weigh the importance of the accuracy of information provided, the type of information desired, and the time of administration when selecting a measure of drinking patterns. The TLFB and Form 90 are most beneficial in settings in which the accuracy and type of information are most important. When time is of the essence, as in most clinical situations, lifetime drinking measures and quantity–​frequency measures may be most useful. Drinking self-​monitoring logs should be used throughout treatment to monitor urges and treatment progress. Consequences of Drinking Another important factor to consider during the case conceptualization and treatment-​ planning phase is


consequences of drinking. Because individuals who drink the same amount of alcohol may have vast differences in the consequences they experience as a result of their drinking, it is important to assess consequences of drinking in addition to alcohol consumption patterns. Although the diagnostic and screening instruments noted previously assess various consequences of drinking, they are typically limited in scope of consequences surveyed. In addition, if there exists a discrepancy in the client’s perception of negative consequences as a result of drinking compared to the actual consequences, using a standardized measure of alcohol consequences can highlight this discrepancy. Identifying consequences of drinking may also help increase motivation for change. One measure used to assess consequences of drinking is the Drinker Inventory of Consequences (DrInC; Miller, Tonigan, & Longabaugh, 1995), which is a 50-​ item measure that comprehensively assesses lifetime and recent adverse events as a result of drinking in five domains:  physical, intrapersonal, social responsibility, interpersonal, and impulse control. The Short Inventory of Problems (SIP; Miller et  al., 1995)  is a condensed version of the DrInC. This measure contains only 15 items and can be administered in 5 minutes or less. The Alcohol Use Inventory (AUI; Horn, Wanberg, & Foster, 1974; Wanberg, Horn, & Foster, 1977)  is a 228-​ item measure assessing many dimensions of an individual’s drinking, including alcohol consequences (e.g., loss of control, hangover, and role maladaptation). Briefer measures than the AUI include the MAST and the Rutgers Alcohol Problem Index (RAPI; White & Labouvie, 1989). The RAPI is a 23-​item self-​ report questionnaire assessing the frequency of negative alcohol-​related consequences, whereas the MAST consists of 25 items assessing alcohol-​ related problems. The AUI, RAPI, and MAST all have evidence of strong psychometric properties. The Young Adult Alcohol Consequences Questionnaire (YAACQ; Read, Kahler, Strong, & Colder, 2006) is a 48-​item measure that assesses eight domains of alcohol problems: social-​ interpersonal, impaired control, diminished self-​ perception, poor self-​care, risky behavior, academic/​ occupational, physiological dependence, and blackout drinking. In addition to the broader range of alcohol problems assessed, an advantage of the YAACQ is that the subscales provide a way of aggregating alcohol consequences that may be used in motivational-​ enhancement, skill-​based, and personalized-​feedback


interventions. A briefer version of the YAACQ, the B-YAACQ (Kahler, Strong, & Read, 2005), consists of 24 items and may be preferred over the YAACQ given its shorter administration time and adequate psychometric properties.

Family History of Alcoholism

Research indicates that assessing family members directly is the optimal method for obtaining a family history of alcohol problems (Andreasen, Endicott, Spitzer, & Winokur, 1977; Andreasen, Rice, Endicott, Reich, & Coryell, 1986), although this method is not without limitations. However, assessing family members directly is not always realistic or feasible. Family history methods may be a more realistic approach and have been shown to have good specificity but poor sensitivity (Andreasen et al., 1977). Although family history methods tend to have lower reliability and sensitivity (Andreasen et al., 1986) than direct assessments of relatives, optimal family history methods are those that are structured, such as the Family History Research Diagnostic Criteria (FHRDC; Andreasen et al., 1977; Endicott, 1978). Unfortunately, this specific method is outdated because it does not include the most recent iteration of the AUD diagnostic criteria. In addition, there is evidence that simply asking a single question about whether anyone in one's family had problems with alcohol is an equally effective method (Crews & Sher, 1992; Cuijpers & Smit, 2001; Slutske et al., 1996). Thus, risk for alcohol problems can simply be assessed by asking this single question, whereas more detailed information about family history is best gathered using the Family Tree Questionnaire (FTQ; Mann, Sobell, Sobell, & Sobell, 1985). The FTQ is a brief and easy way to identify first- and second-degree biological relatives who range from lifelong abstainers to definite problematic drinkers. Although the Children of Alcoholics Screening Test (CAST; Jones, 1983) has been widely used, there is questionable evidence of its reliability and validity; thus, it is not a recommended measure for assessing family history of alcoholism (Schinke, 1989). A self-report technique that has initial evidence of strong psychometric properties for assessing family history of alcoholism is the Short Michigan Alcoholism Screening Test (SMAST; Selzer, Vinokur, & van Rooijen, 1975) adapted to assess mother's alcoholism (M-SMAST) and father's alcoholism (F-SMAST) (Crews & Sher, 1992). However, additional research by independent investigators is needed to test the psychometric properties of the M-SMAST and F-SMAST.

Readiness for Change It has been suggested that assessing motivation to change drinking behavior is important for tailoring treatment goals to match the individual’s motivation level (Bergly, Stallvik, Nordahl, & Hagen, 2014; Norcross, Krebs, & Prochaska, 2011; Prochaska, DiClemente, & Norcross, 1992). Prochaska and DiClemente (1983) theorized that change occurs in five stages, a process that they describe using the transtheoretical model of change. Assignment to these stages is achieved using various algorithms (e.g., Prochaska, 1994; Prochaska et al., 1994). Throughout the years, the stages of change have been modified, but they can be described in general as follows: (a) Individuals with evidence of a problem but no intention to quit may be classified into the precontemplation stage, (b) individuals intending to quit within the next 6 months are assigned to the contemplation stage, (c) individuals who may have taken steps toward change and are interested in changing within 1 month may be assigned to the preparation stage, (d) individuals in the process of making a change are in the action stage, and (e) individuals who have made a change may be assigned to the maintenance stage until they have maintained their changes for 6 months or longer (Carey, Purnine, Maisto, & Carey, 1999). There are many measures that purport to assess readiness to change problematic alcohol use behavior. The measures with evidence of adequate or better psychometric properties include the University of Rhode Island Change Assessment (URICA; McConnaughy, Prochaska, & Velicer, 1983), the Stage of Change Readiness and Treatment Eagerness Scale (SOCRATES; Miller & Tonigan, 1996), and the Readiness to Change Questionnaire (RCQ; Rollnick, Heather, Gold, & Hall, 1992). There have been mixed findings on the validity of these measures despite their wide use (e.g., lack of predictive validity for the URICA [Bergly et  al.,  2014], limited predictive validity for the RCQ [Carey et al., 1999], and limited convergent validity between the URICA and the SOCRATES [Napper et  al.,  2008]). It has also been proposed that examining commitment to change in addition to examining motivation to change is important. Specifically, some argue that expressing commitment may represent a stronger desire for change and may be less susceptible to factors that promote ambivalence about change (Kaminer, McCauley Ohannessian, McKay, & Burke, 2016; Kelly & Greene, 2013). Currently, there exists a single-​item commitment to abstinence measure (Havassy, Hall, & Wasserman, 1991)  that has been validated by others (Mensinger, Lynch, TenHave, & McKay, 2007; Morgenstern, Frey,


McCrady, Labouvie, & Neighbors, 1996)  and a 5-​item commitment to sobriety scale (Kelly & Greene, 2013). The Adolescent Substance Abuse Goal Commitment (ASAGC) questionnaire is a recently developed 16-​item measure used to assess commitment to treatment goals in adolescents and has promising psychometric properties. Unlike the other commitment measures, the ASAGC offers the option to rate commitment to abstinence versus harm reduction. Future research should investigate the psychometric properties of the ASAGC as well as measures assessing both commitment and readiness to change more broadly. Drinking Goals As discussed previously, an important part of treatment planning is assessing the client’s drinking goals. Although there are many abstinence-​only programs, these programs may not be effective for all clients. Individuals who participate in an abstinence-​only treatment program when their drinking goal is not to abstain from alcohol use will miss out on opportunities to learn how to moderate use by drinking in a way that minimizes problems caused by alcohol use. Despite the importance of taking into account drinking goals when treatment planning, no psychometrically validated measures are currently available. It is recommended that drinking goals be assessed informally and considered when treatment planning. Future research is needed to fill the gap in the area of assessing drinking goals. Treatment History Given the chronic nature of AUD, relapse is not an uncommon phenomenon among those addicted to alcohol. Thus, another pertinent part of case conceptualization and treatment planning is to obtain the client’s history of drug and alcohol treatment. Learning about the client’s treatment history can provide a sense of which methods were helpful and which were not helpful. Obtaining individual treatment history may happen informally during the intake or the initial session with the client. When asking about treatment history, it may be helpful to discuss the client’s motivation for change at the time of the previous treatment episodes. If the client reports a different level of motivation to change, this may lend the client to being open to repeating similar treatment approaches from their client’s past because they may be more receptive to the treatment material compared to their prior treatment attempt.


The TLFB and the Form 90 are psychometrically sound instruments that provide mechanisms for obtaining a history of drug and alcohol treatment. However, these are time-​limited measures. CASAA provides the Lifetime Treatment History Interview (Center on Alcoholism, Substance Abuse, and Addictions Research Division, 1994)  that parallels the Form 90. This measure assesses lifetime and the date of the most recent medical hospitalization, detoxification, residential treatment, incarceration, outpatient treatment, and participation in Alcoholics Anonymous or 12-​step meetings. Although this measure provides a briefer method of assessing lifetime treatment, there is little evidence of its psychometric properties. It is recommended that treatment history be obtained, along with questions regarding which aspects of prior treatments were helpful and not helpful. Craving Craving is perhaps the most subjective DSM-​5 AUD diagnostic criterion. It plays an important role in relapse, and various medications are viewed as targeting this symptom directly. Thus, craving is an important factor to consider in treatment planning. However, when Sayette and colleagues (2000) reviewed the literature on psychometrically sound measures assessing craving, they concluded that there was a need for a clear theoretical framework of craving to drive measurement development and adoption. Thirteen years later, Kavanagh and colleagues (2013) indicated that this still remains true. The ICD-​10 has described craving as “a strong desire or sense of compulsion to take the substance” (WHO, 1993, p. 70), and the DSM criteria for AUD define it as “craving, or a strong desire or urge to use alcohol” (APA, 2013, p. 491), which both represent the severe end of the craving spectrum. Other definitions of craving have included cognitions such as expectancies, intentions, or perceived behavioral control (Kavanagh et al., 2013). When assessing craving, it is important to consider the time frame. Tonic craving can be defined as retrospective reports of craving during a specific period of time, whereas phasic craving refers to experiences of craving in the moment (Ray, Courtney, Bacio, & MacKillop, 2013). Tonic craving measures can be useful for understanding a client’s pattern and triggers of craving. The most psychometrically sound tonic craving measures include the Obsessive Compulsive Drinking Scale (OCDS; Anton, Moak, & Latham, 1995), the Temptation and Restraint Inventory (TRI; Collins & Lapp, 1992), the Penn Alcohol Craving Scale (PACS; Flannery, Volpicelli, & Pettinati,


1999), the Preoccupation with Alcohol Scale (PACS; Leonard, Harwood, & Blane, 1988), the Jellinek Alcohol Craving Questionnaire (JACQ; Ooteman, 2006), the Alcohol Craving Questionnaire-Revised (ACQ-R; Raabe et al., 2005), and the Alcohol Craving Questionnaire—Now (ACQ-Now; Singleton, Tiffany, & Henningfield, 1995). Phasic craving measures are useful for tracking cravings throughout treatment. The only psychometrically sound phasic craving measure is the Alcohol Urge Questionnaire (AUQ; Bohn, Krahn, & Staehler, 1995).

High-Risk Drinking Situations/Relapse Situations

In developing the treatment plan with the client, it is important to identify situations that put the client at risk for relapse (or, for treatments that are not abstinence based, at risk for excessive consumption). Encountering cues (e.g., people, places, events, or feelings) associated with drinking may evoke a variety of cognitive, behavioral, and affective responses that increase risk of relapse for abstainers or of returning to excessive alcohol consumption among those desiring to maintain a moderate level of drinking. Ascertaining antecedents to drinking behavior is helpful for identifying potential strategies to discuss in treatment. For example, if anger is a common antecedent to drinking behavior, then the therapist may consider including techniques for identifying and managing anger as one treatment strategy for reducing the risk of relapse. The most widely used and empirically validated self-report measure for assessing high-risk drinking conditions is the Inventory of Drinking Situations (IDS; Annis, Graham, & Davis, 1987). The IDS is a 100-item measure assessing frequency of past-year drinking in the following eight areas: unpleasant emotions, physical discomfort, pleasant emotions, testing personal control, urges or temptations to drink, conflict with others, social pressure to drink, and pleasant times with others. A briefer version, the IDS-42 (Isenhart, 1991), is also available and has strong psychometric properties. The Drinking Motives Questionnaire-Revised (DMQ-R; Cooper, 1994; Cooper, Russell, Skinner, & Windle, 1992) provides important information for conceptualizing the client's motivations for drinking and identifying treatment targets. The DMQ-R assesses four drinking motives: enhancement (e.g., to enhance positive mood), social (e.g., to augment social situations), coping (e.g., to relieve negative emotions), and conformity (to external social pressures). If a client's motivation for drinking is primarily to cope with negative emotions, for example, the treatment plan may include identifying alternative ways to effectively cope with negative emotions outside of drinking. In the event that a client experiences a relapse during or after treatment, the Reasons for Drinking Questionnaire (RFDQ; Westerberg, Miller, & Heather, 1996) could be used as a learning opportunity to identify additional relapse factors to be considered and addressed in treatment. Results of initial research suggest adequate psychometric properties of the RFDQ, but further evidence for the measure is needed.

Alcohol Outcome Expectancies

Expectancy theory describes alcohol outcome expectancies as our beliefs about the effects of consuming alcohol (Brown, Goldman, Inn, & Anderson, 1980), which influence drinking behavior. Elucidating alcohol outcome expectancies is important for case conceptualization and treatment planning, especially when using expectancy challenge interventions (Darkes & Goldman, 1993; Scott-Sheldon, Terry, Carey, Garey, & Carey, 2012). Among the alcohol outcome expectancy measures, the most widely used is the Alcohol Expectancy Questionnaire (AEQ; Brown et al., 1980), which assesses six domains of positive expectancies. A major limitation of this measure is that it neglects to assess negative alcohol outcome expectancies and the subjective valuation of the effects of alcohol. The Comprehensive Effects of Alcohol Questionnaire (CEOA; Fromme, Stroot, & Kaplan, 1993) addresses both of these concerns. Specifically, the CEOA consists of 38 items assessing four positive expectancies (sociability, tension reduction, enhanced sexuality, and liquid courage) as well as three negative expectancies (cognitive and behavioral impairment, risk and aggression, and negative self-perception). The CEOA also includes perceived desirability of each of the expected effects. The administration time for the CEOA can be as long as 10 minutes. A briefer version, the B-CEOA (Ham, Stewart, Norton, & Hope, 2005), consists of 15 items that tap into the same domains and is psychometrically sound.

Drinking Self-Efficacy

In addition to assessing readiness and importance of making a change, it is important to assess the client's perceived ability to make the change. Some clients may feel ready and willing to change but do not believe they have the skills needed to take the first step. Identifying the client's self-efficacy to make a change is an important part of treatment planning. Measures used to assess self-efficacy include the Situational Confidence Questionnaire (SCQ), the Alcohol Abstinence Self-Efficacy Scale (AASE; DiClemente, Carbonari, Montgomery, & Hughes, 1994), and the Drinking Refusal Self-Efficacy Questionnaire (DRSEQ; Young, Oei, & Crook, 1991). Although the SCQ has been widely used, it is limited because it assesses ability to resist heavy drinking without clearly defining heavy drinking (Oei, Hasking, & Young, 2005). The DRSEQ addresses this limitation by assessing the ability to resist drinking, which provides an opportunity for high-risk situations to be revealed for both social drinkers and problematic drinkers. A briefer version of the DRSEQ is available. The Alcohol Reduction Strategies–Current Confidence (ARS-CC; Bonar et al., 2011) is a promising newer measure of self-efficacy. The ARS-CC consists of 31 items assessing perceived ability to utilize each of the drinking-reduction self-control skills, and preliminary data support its psychometric properties.

Social Network

Social network drinking is another important factor to assess as part of the case conceptualization and treatment planning processes. Assessing a client's social network drinking patterns can identify people who may provide support for the client throughout treatment and others who may serve as stressors to be discussed in treatment. Often, alcohol interventions will include modifications to the social network in order to increase support for changes in problematic drinking. The Important People and Activities (IPA) interview (Longabaugh & Zywiak, 1999) is an instrument that has been tested extensively in a variety of multisite randomized alcohol-related clinical trials (Project MATCH: Allen et al., 1997; Kadden, Carbonari, Litt, Tonigan, & Zweben, 1998; COMBINE: COMBINE Study Research Group, 2003). This 19-item interview involves asking clients to name important people in their social network and to specify the drinking behaviors of each person identified, and it can take roughly 20 minutes to administer. Similarly, the Important People instrument (IP; Clifford & Longabaugh, 1991) is another commonly used measure of social network drinking. The IP was originally developed as an interview but is also available as a questionnaire in both paper-and-pencil and computer-based formats. Notably, the computerized format requires less training to administer and includes a scoring algorithm that allows for immediate feedback after completion (Hallgren, Ladd, & Greenfield, 2013). The IP requires respondents to identify up to 10 important people in their social network with whom they have had recent contact. Despite extensive evidence of its reliability and validity (e.g., Hallgren et al., 2013; Longabaugh,

Wirtz, Zweben, & Stout, 1998), the IP may contain up to 80 items (8 items per person in the social network) and takes, on average, 12 minutes to administer. There is also preliminary psychometric support for the five-​person version of the Important People measure (IP-​5; Hallgren & Barnett, 2016)  derived from the original IP (Clifford & Longabaugh, 1991). A major advantage of the IP-​5 is its reduced administration time, which may be especially important in clinical or research settings in which time is of the essence. Overall Evaluation There are several psychometrically sound measures that can be used in the process of case conceptualization and treatment planning. The CIWA-​ Ar remains the most psychometrically sound measure of alcohol withdrawal despite its limitations in internal consistency and generalized validity. The ASI, SF-​36, and SF-​12 are the best measures for assessing medical health concerns. The RAATE can be used to determine the most appropriate level of care for the client. Several measures are available to identify drinking patterns, such as the TLFB and the Form 90. The RAPI and the B-​YAACQ are good measures to assess alcohol consequences. Family history of alcoholism may be best assessed using a single question followed up by the FTQ for more detailed information. When assessing readiness to change, the URICA is highly recommended. Treatment history may be assessed using the Form 90. Several psychometrically sound measures are available for assessing craving, including the AUQ, which assesses craving at the time of the assessment. High-​risk drinking or relapse situations may be identified using one of the IDS measures. Alcohol outcome expectancies can be assessed using one of the CEOA measures. Either of the DRSEQ measures may be used to assess drinking self-​ efficacy. Finally, use of the IPA is the best way to identify the drinking patterns of important people in a client’s social network.
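The quantitative consumption variables described under Consumption Patterns (percentage of drinking days, percentage of abstinent days, mean drinks per drinking day, and percentage of heavy drinking days) are straightforward to derive once TLFB-style daily data have been collected. The sketch below is a minimal illustration under stated assumptions: the input is a list of standard-drink counts, one per assessed day, and the heavy-drinking threshold of four or more drinks per day is a placeholder that would normally be set by sex-specific or program-specific conventions.

```python
def summarize_drinking_log(daily_drinks, heavy_threshold=4):
    """Summarize a TLFB-style log given one standard-drink count per assessed day."""
    if not daily_drinks:
        raise ValueError("the drinking log is empty")
    days = len(daily_drinks)
    drinking_days = [d for d in daily_drinks if d > 0]
    return {
        "percent_drinking_days": 100.0 * len(drinking_days) / days,
        "percent_abstinent_days": 100.0 * (days - len(drinking_days)) / days,
        "mean_drinks_per_drinking_day": (
            sum(drinking_days) / len(drinking_days) if drinking_days else 0.0
        ),
        "percent_heavy_days": 100.0 * sum(d >= heavy_threshold for d in daily_drinks) / days,
    }


# Example: a 7-day log with two abstinent days and one heavy-drinking day
print(summarize_drinking_log([0, 2, 3, 0, 5, 1, 2]))
```

Summaries of this kind are the sort of client feedback on drinking patterns that the personalized-feedback and motivational approaches discussed above rely on.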

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

In order to monitor progress and treatment outcome, clinicians often assess alcohol use, alcohol-​related consequences, motivation to discontinue alcohol use, treatment adherence, and general functioning. This section outlines alcohol instruments assessing consumption patterns, drinking consequences, motivation to change, quality of


life, Alcoholics Anonymous (AA)/​12-​step affiliation, and coping skills. Generally, the goal of alcohol treatment is to reduce alcohol consumption and its related negative consequences. Thus, measures of consumption and drinking consequences provide clinicians and researchers with objective measures of progress and post-​treatment changes related to harmful alcohol use. In addition, the individual’s commitment and motivation for change are also indicators of progress. Understanding commitment and motivation levels can be especially helpful during treatment monitoring to guide treatment approaches. Moreover, measures that assess overall quality of life are reviewed. These measures are useful in treatment monitoring and treatment evaluation because they are indicators of improved functioning. Last, we discuss AA involvement/​ 12-​ step affiliation and coping skill assessments. AA involvement and acquisition of coping skills tend to be positively associated with treatment effectiveness. Therefore, assessments of AA/​12-​step commitment and acquisition of coping skills can be critical measures of treatment gains. This section concludes with an overall evaluation of the measures recommended for treatment monitoring and evaluation. Instruments that are “highly recommended” have strong psychometric properties across all domains. A full list of measures and their psychometric characteristics are presented in Table 18.3. Consumption Patterns Consumption pattern is a widely used indicator of treatment progress and outcome. Monitoring rate of consumption during and following treatment can be an objective method to assess an individual’s risk, change in drinking, and effectiveness of the treatment method. Thus, it is important to assess drinking patterns in both clinical and research settings when monitoring treatment and outcome. As noted previously, the TLFB and Form 90 are highly recommended consumption measures due to their “good” to “excellent” psychometric properties across clinical and nonclinical samples. The TLFB and Form 90 are good measures of treatment outcome and are commonly used to gather baseline and follow-​up data. One advantage of the Form 90 compared to the TLFB is its inclusion of collateral reports, specifically the Form 90-​Collateral (Form 90-​AC; Miller & Del Boca, 1994). For treatment outcome evaluation, collateral reports can provide information on relevant alcohol-​related consequences that are not captured in the TLFB and can corroborate self-​report data. Examples of consequences include hospitalization, treatment utilization, and incarceration following

treatment. The level of detail in the Form 90-​AC can limit its usefulness in some settings; therefore, the Form 90-​ACS, a shortened form focusing on drinking behaviors, can be used for more general applications. As described previously, the utility of both TLFB and Form 90 instruments is limited by the need for training and the length of administration. For settings necessitating brief measures of alcohol consumption that do not require training, screening measures such as the AUDIT or Daily Drinking Log may be appropriate. Furthermore, the Brief Addiction Monitor (BAM; Center for Excellence in Substance Abuse Treatment and Education, 2010), a brief monitoring and outcome measure, may also be an alternative. Despite its limited number of psychometric investigations, the BAM appears to be a promising self-​ report measure with strong content validity and good construct validity and reliability (e.g., Cacciola et  al., 2013; Nelson, Young, & Chapman, 2014). Similarly, the Patient-​ Reported Outcomes Measurement Information System alcohol item bank (PROMIS; Pilkonis et  al., 2013)  assesses alcohol use, consequences, and expectancies. The PROMIS alcohol use items assess past 30-​day typical quantity, maximum number of drinks, number of days intoxicated, and subjective ratings of drinking behavior (e.g., “I spent too much time drinking” or “I drank too much”). These measures of alcohol use can be used to determine an individual’s change in drinking patterns and beliefs about his or her drinking. Moreover, the PROMIS assesses negative and positive alcohol-​related consequences and expectancies. These items describe an individual’s maladaptive drinking patterns (i.e., consequences) and further elucidate beliefs, or expectancies, about drinking. We did not rate the PROMIS alcohol item bank as highly recommended because of the limited number of independent psychometric investigations of the measure. Initial development and early psychometric investigations indicate the PROMIS scores have excellent psychometric properties, specifically excellent normed data, internal consistency, and content validity (Pilkonis et al., 2016). Future investigations are needed to further validate the PROMIS alcohol use items. Biological Measures In addition to screening, biological measures of alcohol consumption can be used in conjunction with self-​ report measures for treatment monitoring and outcome evaluation. Due to the varying levels of specialized training and testing needed for biological measures, clinical

TABLE 18.3 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Each instrument is rated in the table on ten dimensions: Norms; Internal Consistency; Inter-Rater Reliability; Test–Retest Reliability; Content Validity; Construct Validity; Validity Generalization; Treatment Sensitivity; Clinical Utility; and whether it is Highly Recommended (ratings: A = Acceptable; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported). The instruments rated, grouped by assessment purpose, are:

Consumption Patterns: TLFB, PROMIS, BAM, MAP, Form 90, 5-HTOL, GGT, CDT, PEth, EtG, EtS, AST, ALT, FAEE, MCV, transdermal monitoring via sweat/SCRAM/Giner TAS
Drinking Consequences: DrInC, SIP, RAPI, YAACQ, B-YAACQ, ATOM
Readiness to Change: SOCRATES, URICA, Change Questionnaire
AA Involvement/12-Step Affiliation: AAS, AAI, B-PRI, GAATOR
Quality of Life: LSS, SF-36, SF-12, WHOQOL-BREF
Coping Skills: Alcohol-Specific Role-Play Test, Impaired Control Scale, PBSS

Note: TLFB = Timeline Followback; PROMIS = Patient-Reported Outcomes Measurement Information System Alcohol Use Bank; BAM = Brief Addiction Monitor; MAP = Maudsley Addiction Profile; 5-HTOL = 5-hydroxytryptophol; GGT = serum γ-glutamyl transferase; CDT = carbohydrate-deficient transferrin; PEth = phosphatidyl ethanol; EtG = ethyl glucuronide; EtS = ethyl sulfate; AST = aspartate aminotransferase; ALT = alanine aminotransferase; FAEE = fatty acid ethyl esters; MCV = mean corpuscular volume; SCRAM = secure continuous remote alcohol monitor; TAS = transdermal alcohol sensor; DrInC = Drinker Inventory of Consequences; SIP = Short Index of Problems; LDQ = Leeds Dependence Questionnaire; RAPI = Rutgers Alcohol Problem Index; YAACQ = Young Adult Alcohol Consequences Questionnaire; B-YAACQ = Brief Young Adult Alcohol Consequences Questionnaire; ATOM = Alcohol Treatment Outcome Measure; SOCRATES = Stages of Change Readiness and Treatment Eagerness Scale; URICA = University of Rhode Island Change Assessment; AA = Alcoholics Anonymous; AAS = Alcoholics Anonymous Affiliation Scale; AAI = Alcoholics Anonymous Involvement Scale; B-PRI = Brown–Peterson Recovery Progress Inventory; GAATOR = General Alcoholics Anonymous Tools of Recovery Scale; LSS = Life Situation Survey; SF-36 = Medical Outcome Study Health-Related Survey Short Form (36 items); SF-12 = Medical Outcome Study Health-Related Survey Short Form (12 items); WHOQOL-BREF = World Health Organization Quality of Life Survey–BREF; PBSS = Protective Behavioral Strategies Survey; A = Acceptable; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


Due to the varying levels of specialized training and testing needed for biological measures, their clinical utility is often limited. However, the high precision, sensitivity, and specificity of biological markers, together with their limited reporting bias, increase their utility as objective measures for treatment monitoring and evaluation. The two most commonly studied biological measures are GGT and CDT levels. As mentioned previously, GGT has been shown to have moderate levels of specificity and sensitivity to ethanol and is most accurate for chronic, heavy consumption, whereas CDT has been shown to have high specificity and moderate sensitivity for heavy consumption (Djukic, 2012; Snell et al., 2016). Despite their treatment sensitivity, GGT and CDT are limited to heavy alcohol consumers and remain elevated for long periods (up to 8 weeks for GGT and up to 3 weeks for CDT). Researchers have suggested assessing GGT and CDT levels in combination to increase accuracy, specificity, and sensitivity for heavy drinking populations (Djukic, 2012; Snell et al., 2016). Tests for biological markers such as 5-hydroxytryptophol (5-HTOL) and ethyl glucuronide (EtG)/ethyl sulfate (EtS) can be used to detect acute alcohol intoxication over windows ranging from approximately one day (5-HTOL) to several days (EtG/EtS). These biological measures have been shown to have high specificity and sensitivity in identifying ethanol levels following recent alcohol consumption; however, they are limited by their susceptibility to individual differences (e.g., sex and body size). In addition, biomarkers such as PEth, fatty acid ethyl esters (FAEE), MCV, alanine aminotransferase (ALT), aspartate aminotransferase (AST), and transdermal alcohol monitors are potentially useful in treatment monitoring. These measures were not highly recommended due to their limited utility in treatment monitoring. For example, FAEE is best used to distinguish heavy drinkers from light drinkers; it can be detected for up to 24 hours after drinking and remains detectable in hair for several months (Djukic, 2012). Heavy drinkers who cut down may therefore continue to show elevated FAEE levels for the first few weeks after reducing their drinking. Thus, this biomarker may be more useful for determining pre- and post-treatment drinking status than for monitoring alcohol use throughout treatment. Similarly, ALT and AST are better indicators of liver damage than of alcohol consumption per se (Djukic, 2012), limiting their use in monitoring alcohol consumption during treatment. However, they may be useful in assessing alcohol relapse following treatment. Although not highly recommended, these biomarkers may play an important role in assessing some aspects of treatment monitoring and evaluation.
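As a hedged illustration of the combination logic described above (not an endorsement of particular cutoffs, which are laboratory-, assay-, and sex-dependent), requiring either marker to exceed its reference limit casts a wider net and raises sensitivity, whereas requiring both raises specificity:

```python
# Illustrative sketch of combining two biomarkers for treatment monitoring.
# The cutoffs below are hypothetical placeholders; real decision limits depend
# on the laboratory, assay, sex, and clinical context.

GGT_CUTOFF_U_PER_L = 50.0      # hypothetical upper reference limit
CDT_CUTOFF_PERCENT = 1.7       # hypothetical %CDT decision limit

def heavy_drinking_flags(ggt, cdt_percent):
    ggt_positive = ggt > GGT_CUTOFF_U_PER_L
    cdt_positive = cdt_percent > CDT_CUTOFF_PERCENT
    return {
        # "Either positive" favors sensitivity (fewer missed heavy drinkers)
        "either_positive": ggt_positive or cdt_positive,
        # "Both positive" favors specificity (fewer false alarms)
        "both_positive": ggt_positive and cdt_positive,
    }

print(heavy_drinking_flags(ggt=72.0, cdt_percent=1.4))
```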

Consequences of Drinking

As noted previously, drinking consequences can be used as biopsychosocial measures of impairment from alcohol use. Researchers and clinicians can administer measures of drinking consequences before and after treatment as indicators of alcohol-related dysfunction to monitor treatment and evaluate outcome. The DrInC assesses lifetime and past 3-month alcohol consequences. The DrInC was originally normed for adults, which may limit its applicability to adolescent or young adult populations. The RAPI (White & Labouvie, 1989) and the YAACQ (Read et al., 2006) are two alcohol consequences measures with "acceptable" to "good" treatment sensitivity and validity generalization, and they are designed specifically for adolescents and college students. In addition, the Alcohol Treatment Outcome Measure (ATOM; also known as the Australian Alcohol Treatment Outcome Measure [AATOM]; Simpson, Lawrinson, Copeland, & Gates, 2007) is a brief measure designed to assess alcohol treatment outcome, with potential for good clinical utility. The ATOM has different versions for research (ATOM-R) and clinical practice (ATOM-C; Simpson et al., 2007). Reliability and validity investigations of the ATOM-C suggest strong internal consistency and content validity, with more recent work illustrating satisfactory treatment sensitivity and construct validity (Simpson, Lawrinson, Copeland, & Gates, 2009).

Readiness for Change

As with case conceptualization and treatment planning, readiness for change is also an important part of treatment monitoring and evaluation. Knowing which stage of change an individual is in helps in monitoring progress throughout treatment and can be used to prepare for potential relapse. The URICA and the SOCRATES are highly recommended measures that can be used for treatment conceptualization as well as treatment monitoring and outcome evaluation. The URICA is normed for clinical and nonclinical samples and exhibits high generalizability to a broad range of groups. Moreover, pretreatment measures of readiness to change using the URICA and the SOCRATES have been related to treatment outcome (Edens & Willoughby, 2000; Isenhart, 1997), making these instruments potentially useful for predicting and evaluating treatment outcome.

Quality of Life

Assessing quality of life is also a useful method of monitoring treatment progress and outcomes. Evidence suggests that maladaptive drinking patterns are negatively associated with quality of life (Donovan, Mattson, Cisler, Longabaugh, & Zweben, 2005). Thus, monitoring an individual's functioning via quality of life (QoL) instruments can provide insight into the effectiveness of alcohol treatment and its impact on an individual's health. The SF-12 and SF-36 are two highly recommended measures for assessing QoL; they are commonly used in alcohol research, due in part to the availability of norms specific to psychological treatment. Although these instruments do not assess all domains of life functioning, their relevance to alcohol treatment, good psychometric properties, and treatment sensitivity make them highly recommended for assessing QoL outcomes.
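Returning to the readiness-to-change measures discussed above, the following minimal sketch (our illustration; the composite reflects one commonly reported URICA scoring approach and should be verified against the scoring manual in use, and the item counts shown are hypothetical) shows how subscale ratings are often combined into a single readiness score:

```python
# Illustrative URICA-style readiness composite: subscale means for
# Contemplation, Action, and Maintenance summed, minus Precontemplation.
# Verify item assignments and scoring rules against the instrument's manual.

def subscale_mean(item_scores):
    return sum(item_scores) / len(item_scores)

def urica_readiness(precontemplation, contemplation, action, maintenance):
    """Each argument is a list of 1-5 item ratings for that subscale."""
    return (subscale_mean(contemplation)
            + subscale_mean(action)
            + subscale_mean(maintenance)
            - subscale_mean(precontemplation))

# Example with hypothetical item ratings
print(urica_readiness(
    precontemplation=[1, 2, 1, 2],
    contemplation=[4, 5, 4, 4],
    action=[4, 4, 3, 4],
    maintenance=[3, 4, 3, 3],
))
```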

AA Involvement/12-Step Affiliation

In AA-focused treatments, adherence to the values and beliefs of the AA/12-step model is viewed as a marker of progress among those desiring to remain abstinent from alcohol. Thus, for settings using AA-focused treatments, regular assessments of AA involvement/12-step affiliation can be used to monitor treatment progress and evaluate outcomes. As noted in Table 18.3, several AA/12-step measures have satisfactory psychometric properties. These instruments were not highly recommended due to limited evidence of their validity (e.g., construct validity). However, readers should note that the listed instruments have been described elsewhere as valid measures of AA involvement/12-step affiliation (Allen, 2000).

Coping Skills

Coping skills are a major component of cognitive–behavioral therapy (CBT) AUD treatments. Examples of coping skills include monitoring mood, managing craving, or managing difficult situations involving alcohol (e.g., saying "no" to an offer to drink). CBT-focused alcohol treatments provide individuals with coping skills to help maintain treatment gains and prevent future relapse. Understanding an individual's coping skills can be helpful in identifying progress, areas for improvement, and treatment effectiveness. Moreover, assessing coping skills can be useful in non-CBT-focused treatments. Measures of an individual's competency in managing cravings or high-risk situations can be used to determine treatment progress or likelihood of relapse. The coping skill measures reviewed in this chapter could not be highly recommended because of limited evidence of their validity. However, the Protective Behavioral Strategies Scale (PBSS; Martens et al., 2005) has promising psychometric properties. The PBSS is a 25-item questionnaire assessing the use of protective behavioral strategies related to alcohol consumption and alcohol-related problems. This measure was initially developed for college student populations, and it has strong content validity and normative data. The PBSS is negatively correlated with alcohol-related consequences (e.g., Pearson, Kite, & Henson, 2012). Reduction in negative alcohol-related consequences can be an indicator of treatment effectiveness. Despite limited data on the validity of measures assessing coping skills, these measures may be useful in clinical and research settings in which coping skills are a component of AUD treatment.

Overall Evaluation

Many of the measures mentioned in this chapter may be applicable to treatment monitoring and evaluation. Instruments administered at the beginning of treatment (e.g., screening or diagnostic measures) can be readministered during and following treatment to assess change. However, the measures included in this section are specifically designed to be primary and secondary measures for treatment monitoring and evaluation, or they are most relevant for assessing treatment effectiveness. The TLFB and Form 90 are commonly used measures of consumption that are easy to administer. Their ease of use and high treatment sensitivity make these instruments useful for treatment monitoring and evaluation. In addition, biological measures such as GGT and CDT can be used in conjunction with other consumption assessments to improve detection of ethanol during and following treatment. Not only should consumption patterns be monitored throughout treatment, but a reduction in alcohol-related problems is also an important indicator of change. The DrInC has strong psychometric properties and sensitivity to changes in the number of adverse consequences during and following treatment. Moreover, understanding an individual's motivation for change can help determine appropriate treatment goals. Measures such as the SOCRATES and the URICA can aid in determining an individual's motivation to better monitor potential gains or losses during or following treatment. In addition, QoL indexes the overall functioning of an individual. The inverse relationship between QoL and maladaptive alcohol consumption makes QoL a useful metric of progress for treatment monitoring and evaluation. Last, AA involvement/12-step affiliation and coping skills can be useful measures of an individual's adherence to common alcohol treatments. Treatments such as AA and CBT focus on adherence to the treatment model and acquisition of coping skills. Relevant AA/12-step or CBT-centered treatments may benefit from monitoring AA/12-step commitment and coping skills as proxies for improvement during and following treatment. Readers should be aware that the measures provided in this section are not exhaustive. In addition, measures not identified as "highly recommended" may still have good clinical utility. Other measures that are relatively new have promising psychometric properties, such as the PROMIS and the BAM for assessing consumption and the ATOM-C for assessing consequences of drinking.

CONCLUSIONS AND FUTURE DIRECTIONS

Assessment of AUD and related variables continues to evolve in multiple ways that reflect refinement of basic constructs on the basis of both theory and the ever-increasing evidence base. Researchers and clinicians have a range of tools available to them to assess risk factors for problematic alcohol involvement, to evaluate the nature and extent of drinking patterns and associated problems (including AUD), to characterize a patient's profile on a range of key variables that help guide the clinician with respect to gauging motivation for change, to identify appropriate treatment targets, and to monitor treatment progress and outcomes. In each of these areas, high-quality measures have been developed with an eye toward both psychometric adequacy and patient and provider acceptability.

Although not highlighted in the current chapter because of our emphasis on existing measures that have been established as "tried and true" in the field, a number of evolving measurement approaches have proven useful in basic research and are finding their way into clinical practice. For example, basic research on the metabolic pathways and physiological consequences of alcohol intake on the individual is providing a scientific basis for novel biomarkers that are likely to be better at characterizing consumption compared to existing biomarkers (Snell et al., 2016). Notably, "wearable" ethanol sensors have been available now for a number of years (Swift, Martin, Swette, Laconti, & Kackley, 1992), and the strengths and limitations of existing devices are being increasingly recognized (Leffingwell et al., 2013). The NIAAA has made the development of an accurate wearable alcohol biosensing device a priority for research by funding a competition to develop an advanced device with excellent psychometric properties to record "accurate data in order to study alcohol-related health outcomes, disease progression, treatment efficacy, and recovery. A wearable alcohol monitoring device could have consumer appeal as well; much like counting one's steps, this information could help individuals make better health choices" (https://www.niaaa.nih.gov/challenge-prize). Although it is not yet clear when such devices will be "ready for prime time" for clinicians and the general public, achievement of this goal could revolutionize the assessment of consumption, especially when coupled with information about "when and where" such consumption takes place. Similarly, electronic diaries in various forms have been used in alcohol (and other types of substance use) research for many years, but they have not yet achieved widespread acceptance in clinical practice. Real-time measurement of consumption, drinking situations, motivation, affective states, and drinking consequences figures prominently in basic research (e.g., Shiffman, 2016) and is used in some clinical trials, but no single standard instrumentation has been widely adopted. It is possible that part of the issue is that most researchers develop their own programs or "apps" for individual studies without making them generally available to the public. However, it seems likely that as more health-related apps are disseminated through commercial vendors, a standard will emerge that will help clinicians (and users) use common approaches to assess alcohol use and related constructs in real time and provide useful summaries of these data.

Although we believe that the next version of this handbook is more likely than not to discuss real-time assessment of drinking and related intra-individual and contextual variables, the armamentarium available to the clinician is vast, and even passing familiarity with the range of constructs assessable now can only help clinicians choose what constructs are likely to be most useful in their work and how best to assess them. To this end, we encourage readers not only to become familiar with the constructs and their measures described in this chapter but also to use these as a basis for monitoring further developments in the field.
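Purely as a sketch of the kind of real-time record discussed above (the schema, field names, and structure are hypothetical and not drawn from any particular app or study), a minimal electronic diary entry and a simple weekly roll-up might look like this:

```python
# Hypothetical ecological momentary assessment (EMA) drinking records and a
# simple weekly summary. Field names and structure are illustrative only.
from collections import defaultdict
from datetime import date

records = [
    # (date, standard_drinks, craving 0-10, context)
    (date(2018, 3, 5), 0, 2, "home"),
    (date(2018, 3, 6), 4, 6, "bar with friends"),
    (date(2018, 3, 7), 0, 3, "home"),
    (date(2018, 3, 10), 6, 8, "party"),
]

def weekly_summary(records):
    weeks = defaultdict(lambda: {"drinks": 0, "drinking_days": 0, "peak_craving": 0})
    for day, drinks, craving, _context in records:
        week = day.isocalendar()[1]          # ISO week number
        weeks[week]["drinks"] += drinks
        weeks[week]["drinking_days"] += 1 if drinks > 0 else 0
        weeks[week]["peak_craving"] = max(weeks[week]["peak_craving"], craving)
    return dict(weeks)

print(weekly_summary(records))
```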
References

Adamson, S. J., Sellman, J. D., & Dore, G. M. (2005). Therapy preference and treatment outcome in clients with mild to moderate alcohol dependence. Drug and Alcohol Review, 24, 209–216.

Agrawal, S., Sobell, M. B., & Sobell, L. C. (2008). The Timeline Followback: A scientifically and clinically


useful tool for assessing substance use. In R. F. Belli, F. P. Stafford, & D. F. Alwin (Eds.), Calendar and time diary methods in life course research (pp. 57–​68). Beverly Hills, CA: Sage. Allen, J. P. (2000). Measuring treatment process variables in Alcoholics Anonymous. Journal of Substance Abuse Treatment, 18, 227–​230. Allen, J. P., Mattson, M. E., Miller, W. R., Tonigan, J. S., Connors, G. J., Rychtarik, R. G., .  .  . Cooney, N. L. (1997). Matching alcoholism treatments to client heterogeneity. Journal of Studies on Alcohol, 58, 7–​29. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Andreasen, N. C., Endicott, J., Spitzer, R. L., & Winokur, G. (1977). The family history method using diagnostic criteria: Reliability and validity. Archives of General Psychiatry, 34, 1229–​1235. Andreasen, N. C., Rice, J., Endicott, J., Reich, T., & Coryell, W. (1986). The family history approach to diagnosis: How useful is it? Archives of General Psychiatry, 43, 421–​429. Annis, H. M. (1987). Situational Confidence Questionnaire (SCQ-​ 39): User’s guide. Toronto, Ontario, Canada: Addiction Research Foundation of Ontario. Annis, H. M., Graham, J. M., & Davis, C. S. (1987). Inventory of Drinking Situations (IDS): User’s guide. Toronto, Ontario, Canada: Addiction Research Foundation of Ontario. Anton, R. F., Moak, D. H., & Latham, P. (1995). The Obsessive Compulsive Drinking Scale:  A self-​ rated instrument for the quantification of thoughts about alcohol and drinking behavior. Alcoholism: Clinical and Experimental Research, 19, 92–​99. Armor, D. J., Polich, J. M., & Stambul, H. B. (1978). Alcoholism and treatment (Vol. 232). New York, NY: Wiley. Babor, T. F., Biddle-​ Higgins, J. C., Saunders, J. B., & Monteiro, M. G. (2001). AUDIT:  The Alcohol Use Disorders Identification Test:  Guidelines for use in primary health care. Geneva, Switzerland:  World Health Organization. Bajunirwe, F., Haberer, J. E., Ii, Y. B., & Hunt, P. (2014). Comparison of self-​ reported alcohol consumption to phosphatidylethanol measurement among HIV-​ infected patients initiating antiretroviral treatment in southwestern Uganda. PLoS One, 30, 1–​12. Bergly, T. H., Stallvik, M., Nordahl, H. M., & Hagen, R. (2014). The predictive validity of the URICA in a sample of patients in substance use treatment. Addictive Disorders & Their Treatment, 13, 170–​178. Bertholet, N., Winter, M. R., Cheng, D. M., Samet, J. H., & Saitz, R. (2014). How accurate are blood (or


breath) tests for identifying self-​reported heavy drinking among people with alcohol dependence? Alcohol and Alcoholism, 49, 423–​429. Blashfield, R. K., Keeley, J. W., Flanagan, E. H., & Miles, S. R. (2014). The cycle of classification: DSM-​I through DSM-​ 5. Annual Review of Clinical Psychology, 10, 25–​51. Bohn, M. J., Krahn, D. D., & Staehler, B. A. (1995). Development and initial validation of a measure of drinking urges in abstinent alcoholics. Alcoholism:  Clinical and Experimental Research, 19, 600–​606. Bonar, E. E., Rosenberg, H., Hoffmann, E., Kraus, S. W., Kryszak, E., Young, K. M., . . . Bannon, E. E. (2011). Measuring university students’ self-​ efficacy to use drinking self-​control strategies. Psychology of Addictive Behaviors, 25, 155–​161. Boness, C. L., Lane, S. P., & Sher, K. J. (2016). Assessment of withdrawal and hangover is confounded in the Alcohol Use Disorder and Associated Disabilities Interview Schedule:  Withdrawal prevalence is likely inflated. Alcoholism: Clinical and Experimental Research, 5, 1–​9. Bradley, K. A., Bush, K. R., McDonell, M. B., Malone, T., & Fihn, S. D. (1998). Screening for problem drinking. Journal of General Internal Medicine, 13, 379–​389. Brener, L., Resnick, I., Ellard, J., Treloar, C., & Bryant, J. (2009). Exploring the role of consumer participation in drug treatment. Drug and Alcohol Dependence, 105, 172–​175. Brown, S. A., Goldman, M. S., Inn, A., & Anderson, L. R. (1980). Expectations of reinforcement from alcohol: Their domain and relation to drinking patterns. Journal of Consulting and Clinical Psychology, 48, 419–​426. Bucholz, K. K., Cadoret, R., Cloninger, C. R., Dinwiddie, S. H., Hesselbrock, V. M., Nurnberger, J. I., Jr., .  .  . Schuckit, M. A. (1994). A new, semi-​structured psychiatric interview for use in genetic linkage studies:  A report on the reliability of the SSAGA. Journal of Studies on Alcohol, 55, 149–​158. Cacciola, J. S., Alterman, A. I., DePhilippis, D., Drapkin, M. L., Valadez, C., Fala, N. C., .  .  . McKay, J. R. (2013). Development and initial evaluation of the Brief Addiction Monitor (BAM). Journal of Substance Abuse Treatment, 44, 256–​263. Cahalan, D., & Cisin, I. H. (1968). American drinking practices—​Summary of findings from a national probability sample: 1. Extent of drinking by population subgroups. Quarterly Journal of Studies on Alcohol, 29, 130–​151. Camilleri, A. C., Cacciola, J. S., & Jenson, M. R. (2012). Comparison of two ASI-​ based standardized patient placement approaches. Journal of Addictive Diseases, 31, 118–​129. Carey, K. B., Purnine, D. M., Maisto, S. A., & Carey, M. P. (1999). Assessing readiness to change substance abuse: A critical review of instruments. Clinical Psychology: Science and Practice, 6, 245–​266.


Center for Excellence in Substance Abuse Treatment and Education. (2010). Brief Addiction Monitor: Manual of operations. Philadelphia, PA: Author. Center for Substance Abuse Treatment. (1998). Screening and assessing adolescents for substance use disorders [Treatment Improvement Protocol (TIP) Series, No. 31; HHS Publication No. (SMA) 12-​4079]. Rockville, MD:  Substance Abuse and Mental Health Services Administration. Center on Alcoholism, Substance Abuse, and Addictions Research Division. (1994). Lifetime Treatment History Interview. Retrieved from https://​casaa.unm.edu/​inst/​ Lifetime%20Treatment%20History%20Interview.pdf Chaikelson, J. S., Arbuckle, T. Y., Lapidus, S., & Gold, D. P. (1994). Measurement of lifetime alcohol consumption. Journal of Studies on Alcohol, 55, 133–​140. Chassin, L., Colder, C. R., Hussong, A., & Sher, K. J. (2016). Substance use and substance use disorders. In D. Cicchetti (Ed.), Developmental psychopathology:  Volume 3:  Maladaptation and psychopathology (3rd ed. pp. 833–​897). Hoboken, NJ: Wiley. Chassin, L., Haller, M., Lee, M., Handley, E., Bourtress, K., & Beltran, I. (2016). Familial factors influencing offspring substance use and dependence. In K. J. Sher (Ed.), The Oxford handbook of substance use and substance use disorders: Volume 2 (pp. 393–​429). New York, NY: Oxford University Press. Chassin, L., Sher, K., Hussong, A., & Curran, P. (2013). The developmental psychopathology of alcohol use and alcohol disorders:  Research achievements and future directions. Development Psychopathology, 25, 1567–​1584. Cheetham, A., Allen, N. B., Yucel, M., & Lubman, D. I. (2010). The role of affective dysregulation in drug addiction. Clinical Psychology Review, 30, 621–​634. Clark, W. B., & Midanik, L. (1982). Alcohol use and alcohol problems among US adults:  Results of the 1979 national survey. In Alcohol consumption and related problems (Alcohol and Health Monographs No. 1, pp. 3–​52). Rockville, MD:  National Institute on Alcohol Abuse and Alcoholism. Clifford, P. R., & Longabaugh, R. (1991). Manual for the administration of the Important People and Activities Instrument. Adapted for use by Project MATCH for NIAAA, 5, R01AA06698-​05. Collins, R. L., & Lapp, W. M. (1992). The Temptation and Restraint Inventory for measuring drinking restraint. British Journal of Addiction, 87, 625–​633. COMBINE Study Research Group. (2003). Testing combined pharmacotherapies and behavioral interventions in alcohol dependence: Rationale and methods. Alcoholism: Clinical and Experimental Research, 27, 1107–​1122. Cooper, M. L. (1994). Motivations for alcohol use among adolescents:  Development and validation of a four-​ factor model. Psychological Assessment, 6, 117–​128.

Cooper, M. L., Kuntsche, E., Levitt, A., Barber, L. L., & Wolf, S. (2016). A motivational perspective on substance use: Review of theory and research. In K. J. Sher (Ed.), The Oxford handbook of substance use disorders (pp. 375–​421). New York, NY: Oxford University Press. Crawford, M. J., Aldridge, T., Bhui, K., Rutter, D., Manley, C., Weaver, T., .  .  . Fulop, N. (2003). User involvement in the planning and delivery of mental health services:  A cross-​ sectional survey of service users and providers. Acta Psychiatrica Scandinavica, 107, 410–​414. Crews, T. M., & Sher, K. J. (1992). Using adapted short MASTs for assessing parental alcoholism:  Reliability and validity. Alcoholism:  Clinical and Experimental Research, 16, 576–​584. Cuijpers, P., & Smit, F. (2001). Assessing parental alcoholism:  A comparison of the Family History Research Diagnostic Criteria versus a single-​ question method. Addictive Behaviors, 26, 741–​748. Darkes, J., & Goldman, M. S. (1993). Expectancy challenge and drinking reduction:  Experimental evidence for a mediational process. Journal of Consulting and Clinical Psychology, 61, 344–​353. Dawson, D. A., Goldstein, R. B., Chou, S. P., Ruan, W. J., & Grant, B. F. (2008). Age at first drink and the first incidence of adult-​onset DSM-​IV alcohol use disorders. Alcoholism:  Clinical and Experimental Research, 32, 2149–​2160. Del Boca, F. K., Darkes, J., & McRee, B. (2016). Self-​report assessments of psychoactive substance use and dependence. In K. J. Sher (Ed.), The Oxford handbook of substance use and substance use disorders:  Volume 2 (pp. 430–​465). New York, NY: Oxford University Press. Denis, C. M., Cacciola, J. S., & Alterman, A. I. (2013). Addiction Severity Index (ASI) summary scores:  Comparison of the recent status scores of the ASI-​6 and the composite scores of the ASI-​5. Journal of Substance Abuse Treatment, 45, 444–​450. Dennis, M. L., Feeney, T., Stevens, L. H., & Bedoya, L. (2006). Global Appraisal of Individual Needs–​ Short Screener (GAIN-​ SS):  Administration and scoring manual for the GAIN-​SS Version 2.0. 1. Bloomington, IL: Chestnut Health Systems. DiClemente, C. C., Carbonari, J. P., Montgomery, R. P., & Hughes, S. O. (1994). The alcohol abstinence self-​efficacy scale. Journal of Studies on Alcohol, 55, 141–​148. Djukic, M. (2012). Diagnostic characteristics and application of alcohol biomarkers. Clinical Laboratory, 59, 233–​245. Donovan, D., Mattson, M. E., Cisler, R. A., Longabaugh, R., & Zweben, A. (2005). Quality of life as an outcome measure in alcoholism treatment research. Journal of Studies on Alcohol Supplement, 15, 119–​139.


Edens, J. F., & Willoughby, F. W. (2000). Motivational patterns of alcohol dependent patients:  a replication. Psychology of Addictive Behaviors, 14(4), 397. Endicott, J. (1978). Family History:  Research Diagnostic Criteria:  (FH-​RDC). New  York, NY:  New  York State Psychiatric Institute. Ewing, J. A. (1984). Detecting alcoholism: The CAGE questionnaire. JAMA, 252, 1905–​1907. Farmer, R. F., Gau, J. M., Seeley, J. R., Kosty, D. B., Sher, K. J., & Lewinsohn, P. M. (2016). Internalizing and externalizing disorders as predictors of alcohol use disorder onset during three developmental periods. Drug and Alcohol Dependence, 164, 38–​46. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2014). Structured Clinical Interview for DSM-​ 5 Disorders–​Research Version (SCID-​5-​RV). Arlington, VA: American Psychiatric Association Publishing. First, M. B., Williams, J. B., Karg, R. S., & Spitzer, R. L. (2016). Structured Clinical Interview for DSM-​ 5 Disorders:  SCID-​5-​CV Clinician Version. Arlington, VA: American Psychiatric Association Publishing. Flannery, B. A., Volpicelli, J. R., & Pettinati, H. M. (1999). Psychometric properties of the Penn Alcohol Craving Scale. Alcoholism: Clinical and Experimental Research, 23, 1289–​1295. Friedrichs, A., Spies, M., Härter, M., & Buchholz, A. (2016). Patient preferences and shared decision making in the treatment of substance use disorders:  A systematic review of the literature. PLoS One, 11, e0145817. Fromme, K., Stroot, E. A., & Kaplan, D. (1993). Comprehensive effects of alcohol:  Development and psychometric assessment of a new expectancy questionnaire. Psychological Assessment, 5, 19–​26. Gastfriend, D. R., & McLellan, A. T. (1997). Treatment matching:  Theoretic basis and practical implications. Medical Clinics of North America, 81, 945–​966. Gastfriend, D. R., & Mee-​Lee, D. (2004). The ASAM patient placement criteria:  Context, concepts and continuing development. Journal of Addictive Diseases, 22, 1–​8. Gossop, M., Keaney, F., Stewart, D., Marshall, E. J., & Strang, J. (2002). A Short Alcohol Withdrawal Scale (SAWS):  Development and psychometric properties. Addiction Biology, 7, 37–​43. Graff, F. S., Morgan, T. J., Epstein, E. E., McCrady, B. S., Cook, S. M., Jensen, N. K., & Kelly, S. (2009). Engagement and retention in outpatient alcoholism treatment for women. American Journal on Addictions, 18, 277–​288. Grant, B. F., Amsbary, M., Chu, A., Sigman, R., Kali, J., Sugawana, Y., .  .  . Chou, P. S. (2014). Source and accuracy statement:  National Epidemiologic Survey on Alcohol and Related Conditions-​III (NESARC-​III). Rockville, MD:  National Institute on Alcohol Abuse and Alcoholism.


Grant, B. F., Goldstein, R. B., Chou, S. P., Saha, T. D., Ruan, W. J., Huang, B., . . . Aivadyan, C. (2011). The Alcohol Use Disorder and Associated Disabilities Interview Schedule–​ Diagnostic and Statistical Manual of Mental Disorders, Version (AUDADIS-​5). Rockville, MD:  National Institute on Alcohol Abuse and Alcoholism. Grant, B. F., Goldstein, R. B., Saha, T. D., Chou, S. P., Jung, J., Zhang, H., .  .  . Hasin, D. S. (2015). Epidemiology of DSM-​ 5 alcohol use disorder:  Results from the National Epidemiologic Survey on Alcohol and Related Conditions III. JAMA Psychiatry, 72, 757–​766. Grant, B. F., Moore, T. C., Shepard, J., & Kaplan, K. (2003). Source and accuracy statement:  Wave 1 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). Bethesda, MD:  National Institute on Alcohol Abuse and Alcoholism. Grant, B. F., Stinson, F. S., Dawson, D. A., Chou, S. P., Dufour, M. C., Compton, W., . . . Kaplan, K. (2004). Prevalence and co-​ occurrence of substance use disorders and independent mood and anxiety disorders: Results from the National Epidemiologic Survey on Alcohol and Related Conditions. Archives of General Psychiatry, 61, 807–​816. Gual, A. (2002). AUDIT-​3 and AUDIT-​4: Effectiveness of two short forms of the Alcohol Use Disorders Identification Test. Alcohol and Alcoholism, 37, 591–​596. Haeny, A. M., Littlefield, A. K., & Sher, K. J. (2014a). Repeated diagnoses of lifetime alcohol use disorders in a prospective study: Insights into the extent and nature of the reliability and validity problem. Alcoholism: Clinical and Experimental Research, 38, 489–​500. Haeny, A. M., Littlefield, A. K., & Sher, K. J. (2014b). False negatives in the assessment of lifetime alcohol use disorders: A serious but unappreciated problem. Journal of Studies on Alcohol and Drugs, 75, 530–​535. Haeny, A. M., Littlefield, A. K., & Sher, K. J. (2016). Limitations of lifetime alcohol use disorder assessments: A criterion-​validation study. Addictive Behaviors, 59, 95–​99. Hallgren, K. A., & Barnett, N. P. (2016). Briefer assessment of social network drinking: A test of the Important People Instrument-​5 (IP-​5). Psychology of Addictive Behaviors, 30, 955–​964. Hallgren, K. A., Ladd, B. O., & Greenfield, B. L. (2013). Psychometric properties of the Important People Instrument with college student drinkers. Psychology of Addictive Behaviors, 27, 819–​825. Ham, L. S., Stewart, S. H., Norton, P. J., & Hope, D. A. (2005). Psychometric assessment of the Comprehensive Effects of Alcohol Questionnaire: Comparing a brief version to the original full scale. Journal of Psychopathology and Behavioral Assessment, 27, 141–​158. Härter, M., Müller, H., Dirmaier, J., Donner-​Banzhoff, N., Bieber, C., & Eich, W. (2011). Patient participation and


shared decision making in Germany—​History, agents and current transfer to practice. Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen, 105, 263–​270. Härter, M., van der Weijden, T., & Elwyn, G. (2011). Policy and practice developments in the implementation of shared decision making:  An international perspective. Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen, 105, 229–​233. Hasin, D., Trautman, K., Miele, G., Samet, S., Smith, M., & Endicott, J. (1996). Psychiatric Research Interview for Substance and Mental Disorders (PRISM): Reliability for substance abusers. American Journal of Psychiatry, 153, 1195–​1201. Hasin, D. S., O’Brien, C. P., Auriacombe, M., Borges, G., Bucholz, K., Budney, A., . . . Schuckit, M. (2013). DSM-​5 criteria for substance use disorders: Recommendations and rationale. American Journal of Psychiatry, 170, 834–​851. Havassy, B. E., Hall, S. M., & Wasserman, D. A. (1991). Social support and relapse:  Commonalities among alcoholics, opiate users, and cigarette smokers. Addictive Behaviors, 16, 235–​246. Hodgson, R., Alwyn, T., John, B., Thom, B., & Smith, A. (2002). The Fast Alcohol Screening Test. Alcohol and Alcoholism, 37, 61–​66. Hoffman, N. G., Halikas, J. A., Mee-​Lee, D., & Weedman, R. D. (1993). Patient placement criteria for the treatment of psychoactive substance use disorders. Washington, DC: American Society of Addiction Medicine. Holzman, S. B., & Rastegar, D. A. (2016). AST: A simplified 3-​item tool for managing alcohol withdrawal. Journal of Addiction Medicine, 10, 190–​195. Horn, J. L. (1984). Alcohol Dependence Scale (ADS) user’s guide. Toronto, Ontario, Canada:  Addiction Research Foundation. Horn, J. L., Wanberg, K. W., & Foster, F. M. (1974). The Alcohol Use Inventory. Denver, CO: Center for Alcohol Abuse Research and Evaluation. Institute of Medicine. (1990). Broadening the base of treatment for alcohol problems. Washington, DC:  National Academies Press. Isenhart, C. E. (1991). Factor structure of the Inventory of Drinking Situations. Journal of Substance Abuse, 3, 59–​71. Isenhart, C. E. (1997). Pretreatment readiness for change in male alcohol dependent subjects: Predictors of one-​year follow-​up status. Journal of Studies on Alcohol, 58(4), 351–​357. Jones, J. W. (1983). The Children of Alcoholics Screening Test: Test manual. Chicago, IL: Camelot Unlimited. Jones, J. D., Comer, S. D., & Kranzler, H. R. (2015). The pharmacogenetics of alcohol use disorder. Alcoholism:  Clinical and Experimental Research, 39, 391–​402.

Joosten, E. A. G., De Jong, C. A. J., De Weert-​van Oene, G. H., Sensky, T., & Van der Staak, C. P. F. (2009). Shared decision-​making reduces drug use and psychiatric severity in substance-​dependent patients. Psychotherapy and Psychosomatics, 78, 245–​253. Joosten, E. A.  G., De Weert-​van Oene, G. H., Sensky, T., Van Der Staak, C. P.  F., & De Jong, C. A.  J. (2010). Treatment goals in addiction healthcare: The perspectives of patients and clinicians. International Journal of Social Psychiatry, 57, 263–​276. Kadden, R., Carbonari, J., Litt, M., Tonigan, S., & Zweben, A. (1998). Matching alcoholism treatments to client heterogeneity:  Project MATCH three-​year drinking outcomes. Alcoholism:  Clinical and Experimental Research, 22, 1300–​1311. Kahler, C. W., Strong, D. R., & Read, J. P. (2005). Toward efficient and comprehensive measurement of the alcohol problems continuum in college students: The Brief Young Adult Alcohol Consequences Questionnaire. Alcoholism:  Clinical and Experimental Research, 29, 1180–​1189. Kaminer, Y., McCauley Ohannessian, C. M., McKay, J. R., & Burke, R. H. (2016). The Adolescent Substance Abuse Goal Commitment (ASAGC) Questionnaire: An examination of clinical utility and psychometric properties. Journal of Substance Abuse Treatment, 61, 42–​46. Kavanagh, D. J., Statham, D. J., Feeney, G. F., Young, R. M., May, J., Andrade, J., & Connor, J. P. (2013). Measurement of alcohol craving. Addictive Behaviors, 38, 1572–​1584. Kelly, J. F., & Greene, M. C. (2013). The Twelve Promises of Alcoholics Anonymous:  Psychometric measure validation and mediational testing as a 12-​step specific mechanism of behavior change. Drug and Alcohol Dependence, 133, 633–​640. Kendler, K. S., Prescott, C. A., Myers, J., & Neale, M. C. (2003). The structure of genetic and environmental risk factors for common psychiatric and substance use disorders in men and women. Archives of General Psychiatry, 60, 929–​937. Khavari, K. A., & Farber, P. D. (1978). A profile instrument for the quantification and assessment of alcohol consumption: The Khavari Alcohol Test. Journal of Studies on Alcohol, 39, 1525–​1539. Lee, M. R., & Sher, K. J. (in press). “Maturing out” of binge and problem drinking. Alcohol Research:  Current Reviews. Leffingwell, T. R., Cooney, N. J., Murphy, J. G., Luczak, S., Rosen, G., Dougherty, D. M., & Barnett, N. P. (2013). Continuous objective monitoring of alcohol use: Twenty-​ first century measurement using transdermal sensors. Alcoholism: Clinical and Experimental Research, 37, 16–​22. Leonard, K. E., Harwood, M. K., & Blane, H. T. (1988). The preoccupation with alcohol scale:  Development


and validation. Alcoholism:  Clinical and Experimental Research, 12, 394–​399. Litten, R. Z., Bradley, A. M., & Moss, H. B. (2010). Alcohol biomarkers in applied settings:  Recent advances and future research opportunities. Alcoholism: Clinical and Experimental Research, 34, 955–​967. Longabaugh, R., Wirtz, P. W., Zweben, A., & Stout, R. L. (1998). Network support for drinking, Alcoholics Anonymous and long-​term matching effects. Addiction, 93, 1313–​1333. Longabaugh, R., & Zywiak, W. (1999). Manual for the administration of the Important People Instrument adapted for use by Project COMBINE. Providence, RI: Center for Alcohol and Addiction Studies, Brown University. Maciosek, M. V., Coffield, A. B., Edwards, N. M., Flottemesch, T. J., Goodman, M. J., & Solberg, L. I. (2006). Priorities among effective clinical preventive services:  Results of a systematic review and analysis. American Journal of Preventive Medicine, 31, 52–​61. Maisto, S. A., Carey, K. B., & Bradizza, C. M. (1999). Social learning theory. In H. T. Blane (Ed.), Psychological theories of drinking and alcoholism:  2. (pp. 106–​163). New York, NY: Guilford. Mann, R. E., Sobell, L. C., Sobell, M. B., & Sobell, D. P. (1985). Reliability of a family tree questionnaire for assessing family history of alcohol problems. Drug and Alcohol Dependence, 15, 61–​67. Martens, M. P., Arterberry, B. J., Cadigan, J. M., & Smith, A. E. (2012). Review of clinical assessment tools. In C. J. Correia, J. G. Murphy, & N. P. Barnett (Eds.), College student alcohol abuse:  A guide to assessment, intervention, and prevention (pp. 115–​145). Hoboken, NJ: Wiley. Martens, M. P., Ferrier, A. G., Sheehy, M. J., Corbett, K., Anderson, D. A., & Simmons, A. (2005). Development of the Protective Behavioral Strategies Survey. Journal of Studies on Alcohol, 66, 698–​705. Martin, C. S., Chung, T., & Langenbucher, J. W. (2008). How should we revise diagnostic criteria for substance use disorders in the DSM-​ V? Journal of Abnormal Psychology, 117, 561–​575. Martino, S., Poling, J., & Rounsaville, B. J. (2008). Substance use disorders measures. In A. J. Rush, Jr., M. B. First, & D. Blacker (Eds.), Handbook of psychiatric measures (pp. 437–​ 468). Arlington, VA:  American Psychiatric Publishing. Mavandadi, S., Helstrom, A. M. Y., Sayers, S., & Oslin, D. (2015). The moderating role of perceived social support on alcohol treatment outcomes. Journal of Studies on Alcohol and Drugs, 76, 818–​823. McConnaughy, E. A., Prochaska, J. O., & Velicer, W. F. (1983). Stages of change in psychotherapy: Measurement and sample profiles. Psychotherapy:  Theory, Research, and Practice, 20, 368–​375.


McKay, J. R., Alterman, A. I., McLellan, A. T., Snider, E. C., & O’Brien, C. P. (1995). Effect of random versus nonrandom assignment in a comparison of inpatient and day hospital rehabilitation for male alcoholics. Journal of Consulting and Clinical Psychology, 63, 70–​78. McLellan, A. T., Grissom, G. R., Zanis, D., Randall, M., Brill, P., & O’Brien, C. P. (1997). Problem-​service “matching” in addiction treatment:  A prospective study in 4 programs. Archives of General Psychiatry, 54, 730–​735. Medical Outcomes Trust. (1991). Medical Outcomes Trust:  Improving Medical Outcomes from the Patient’s Point of View. Boston, MA: Medical Outcomes Trust. Mee-​Lee, D., & Gastfriend, D. R. (2014). Patient placement criteria. In M. Galanter, H. D. Kleber, & K. T. Brady (Eds.), The American Psychiatric Publishing Textbook of Substance Abuse Treatment (pp. 111–​128). Arlington, VA: American Psychiatric Publishing. Mensinger, J. L., Lynch, K. G., TenHave, T. R., & McKay, J. R. (2007). Mediators of telephone-​based continuing care for alcohol and cocaine dependence. Journal of Consulting and Clinical Psychology, 75, 775–​784. Midanik, L. T. (1994). Comparing usual quantity/​frequency and graduated frequency scales to assess yearly alcohol consumption:  Results from the 1990 US National Alcohol Survey. Addiction, 89, 407–​412. Miele, G. M., Carpenter, K. M., Cockerham, M. S., Trautman, K. D., Blaine, J., & Hasin, D. S. (2000). Substance Dependence Severity Scale (SDSS):  Reliability and validity of a clinician-​ administered interview for DSM-​ IV substance use disorders. Drug and Alcohol Dependence, 59, 63–​75. Miller, W. R. (1996). Manual for Form 90:  A structured assessment interview for drinking and related behaviors (Project MATCH Monograph Series Vol. 5, NIH Publication No. 96-​ 4004). Bethesda, MD:  National Institute on Alcohol Abuse and Alcoholism. Miller, W. R., & Del Boca, F. K. (1994). Measurement of drinking behavior using the Form 90 family of instruments. Journal of Studies on Alcohol Supplement, 12, 112–​118. Miller, P. M., Thomas, S. E., & Mallin, R. (2006). Patient attitudes towards self-​ report and biomarker alcohol screening by primary care physicians. Alcohol and Alcoholism, 41, 306–​310. Miller, W. R., & Tonigan, J. S. (1996). Assessing drinkers’ motivation for change: The Stages of Change Readiness and Treatment Eagerness Scale (SOCRATES). Psychology of Addictive Behaviors, 10, 81–​89. Miller, W. R., Tonigan, J. S., & Longabaugh, R. (1995). The Drinker Inventory of Consequences (DrInC): An instrument for assessing adverse consequences of alcohol abuse (Project MATCH Monograph Series Vol. 4, DHHS Publication No. 95-​ 3911). Rockville, MD:  National Institute on Alcohol Abuse and Alcoholism.


Morgenstern, J., Frey, R. M., McCrady, B. S., Labouvie, E., & Neighbors, C. J. (1996). Examining mediators of change in traditional chemical dependency treatment. Journal of Studies on Alcohol, 57, 53–​64. Morzorati, S. L., Ramchandani, V. A., Flury, L., Li, T. K., & O’Connor, S. (2002). Self-​ reported subjective perception of intoxication reflects family history of alcoholism when breath alcohol levels are constant. Alcoholism:  Clinical and Experimental Research, 26, 1299–​1306. Mundle, G., Munkes, J., Ackermann, K., & Mann, K. (2000). Sex differences of carbohydrate-​deficient transferrin, γ-​ glutamyltransferase, and mean corpuscular volume in alcohol-​dependent patients. Alcoholism:  Clinical and Experimental Research, 24, 1400–​1405. Napper, L. E., Wood, M. M., Jaffe, A., Fisher, D. G., Reynolds, G. L., & Klahn, J. A. (2008). Convergent and discriminant validity of three measures of stage of change. Psychology of Addictive Behaviors, 22, 362–​371. National Institute on Alcohol Abuse and Alcoholism. (2003). Assessing alcohol problems:  A guide for clinicians and researchers (2nd ed.). Bethesda, MD: Author. National Institute on Drug Abuse. (2009). Principles of drug addiction treatment: A research based guide. Washington, DC: National Institutes of Health. Nelson, K. G., Young, K., & Chapman, H. (2014). Examining the performance of the Brief Addiction Monitor. Journal of Substance Abuse Treatment, 46, 472–​481. Nelson, S. E., Van Ryzin, M. J., & Dishion, T. J. (2015). Alcohol, marijuana, and tobacco use trajectories from age 12 to 24  years:  Demographic correlates and young adult substance use problems. Development and Psychopathology, 27, 253–​277. Neuner, B., Dizner-​Golab, A., Gentilello, L. M., Habrat, B., Mayzner-​Zawadzka, E., Gorecki, A., .  .  . Spies, C. D. (2007). Trauma patients’ desire for autonomy in medical decision making is impaired by smoking and hazardous alcohol consumption—​A bi-​national study. Journal of International Medical Research, 35, 609–​614. Norcross, J. C., Krebs, P. M., & Prochaska, J. O. (2011). Stages of change. Journal of Clinical Psychology, 67, 143–​154. Oei, T. P., Hasking, P. A., & Young, R. M. (2005). Drinking Refusal Self-​Efficacy Questionnaire-​Revised (DRSEQ-​ R):  A new factor structure with confirmatory factor analysis. Drug and Alcohol Dependence, 78, 297–​307. Ooteman, W., Koeter, M. W., Vserheul, R., Schippers, G. M., & Brink, W. (2006). Measuring craving: An attempt to connect subjective craving with cue reactivity. Alcoholism: Clinical and Experimental Research, 30(1), 57–69. Pearson, M. R., Kite, B. A., & Henson, J. M. (2012). The assessment of protective behavioral strategies:  Comparing prediction and factor structures across measures. Psychology of Addictive Behaviors, 26, 573–​584.

Piccinelli, M., Tessari, E., Bortolomasi, M., Piasere, O., Semenzin, M., Garzotto, N., & Tansella, M. (1997). Efficacy of the Alcohol Use Disorders Identification Test as a screening tool for hazardous alcohol intake and related disorders in primary care: A validity study. BMJ, 314, 420–​424. Pilkonis, P. A., Yu, L., Colditz, J., Dodds, N., Johnston, K. L., Maihoefer, C., . . . McCarty, D. (2013). Item banks for alcohol use from the Patient-​Reported Outcomes Measurement Information System (PROMIS):  Use, consequences, and expectancies. Drug and Alcohol Dependence, 130, 167–​177. Pilkonis, P. A., Yu, L., Dodds, N. E., Johnston, K. L., Lawrence, S. M., & Daley, D. C. (2016). Validation of the alcohol use item banks from the Patient-​Reported Outcomes Measurement Information System (PROMIS). Drug and Alcohol Dependence, 161, 316–​322. Pittman, B., Gueorguieva, R., Krupitsky, E., Rudenko, A. A., Flannery, B. A., & Krystal, J. H. (2007). Multidimensionality of the Alcohol Withdrawal Symptom Checklist:  A factor analysis of the Alcohol Withdrawal Symptom Checklist and CIWA-​ Ar. Alcoholism:  Clinical and Experimental Research, 31, 612–​618. Pokorny, A. D., Miller, B. A., & Kaplan, H. B. (1972). The Brief MAST:  A shortened version of the Michigan Alcoholism Screening Test. American Journal of Psychiatry, 129, 342–​345. Prochaska, J. O. (1994). Strong and weak principles for progressing from precontemplation to action on the basis of twelve problem behaviors. Health Psychology, 13, 47–​51. Prochaska, J. O., & DiClemente, C. (1983). Stages and processes of self-​change of smoking: Toward an integrative model of change. Journal of Consulting and Clinical Psychology, 51, 390–​395. Prochaska, J. O., DiClemente, C. C., & Norcross, J. C. (1992). In search of how people change:  Applications to addictive behaviors. American Psychologist, 47, 1102–​1114. Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., . . . Rossi, S. R. (1994). Stages of change and decisional balance for 12 problem behaviors. Health Psychology, 13, 39–​46. Raabe, A., Nakaji, P., Beck, J., Kim, L. J., Hsu, F. P., Kamerman, J. D., . . . Spetzler, R. F. (2005). Prospective evaluation of surgical microscope-​integrated intraoperative near-​infrared indocyanine green videoangiography during aneurysm surgery. Journal of Neurosurgery, 103, 982–​989. Rappaport, D., Chuu, A., Hullett, C., Nematollahi, S., Teeple, M., Bhuyan, N., . . . Sanders, A. (2013). Assessment of alcohol withdrawal in Native American patients utilizing the Clinical Institute Withdrawal Assessment of


Alcohol Revised scale. Journal of Addiction Medicine, 7, 196–199. Ray, L. A., Courtney, K. E., Bacio, G., & MacKillop, J. (2013). The assessment of craving in addiction research. In J. MacKillop & H. de Wit (Eds.), The Wiley-Blackwell handbook of addiction psychopharmacology (pp. 345–380). West Sussex, UK: Wiley. Read, J. P., Kahler, C. W., Strong, D. R., & Colder, C. R. (2006). Development and preliminary validation of the Young Adult Alcohol Consequences Questionnaire. Journal of Studies on Alcohol, 67, 169–177. Reynaud, M., Schellenberg, F., Loisequx-Meunier, M. N., Schwan, R., Maradeix, B., Planche, F., & Gillet, C. (2000). Objective diagnosis of alcohol abuse: Compared values of carbohydrate-deficient transferrin (CDT), γ-glutamyl transferase (GGT), and mean corpuscular volume (MCV). Alcoholism: Clinical and Experimental Research, 24, 1414–1419. Rodriguez, L. M., Neighbors, C., & Knee, C. R. (2013). Problematic alcohol use and marital distress: An interdependence theory perspective. Addiction Research & Theory, 22(4), 294–312. Roerecke, M., & Rehm, J. (2013). Alcohol use disorders and mortality: A systematic review and meta-analysis. Addiction, 108, 1562–1578. Rollnick, S., Heather, N., Gold, R., & Hall, W. (1992). Development of a short "readiness to change" questionnaire for use in brief, opportunistic interventions among excessive drinkers. British Journal of Addiction, 87, 743–754. Russell, M., Marshall, J. R., Trevisan, M., Freudenheim, J. L., Chan, A. W., Markovic, N., . . . Priore, R. L. (1997). Test–retest reliability of the Cognitive Lifetime Drinking History. American Journal of Epidemiology, 146, 975–981. Russell, M., Peirce, R. S., Vana, J. E., Nochajski, T. H., Carosella, A., Muti, P., . . . Trevisan, M. (1998). Relations among alcohol consumption measures derived from the Cognitive Lifetime Drinking History. Drug and Alcohol Review, 17, 377–387. Sayette, M. A., Shiffman, S., Tiffany, S. T., Niaura, R. S., Martin, C. S., & Schadel, W. G. (2000). The measurement of drug craving. Addiction, 95, 189–210. Schinke, S. P. (1989). Review of the Children of Alcoholics Screening Test. In The tenth mental measurements yearbook (pp. 158–159). Lincoln, NE: Buros Institute of Mental Measurements.


Schuckit, M. A. (1994). Low level of response to alcohol as a predictor of future alcoholism. American Journal of Psychiatry, 151, 184–​189. Schuckit, M. A. (2009). Alcohol-​use disorders. Lancet, 373, 492–​501. Scott-​Sheldon, L. A., Terry, D. L., Carey, K. B., Garey, L., & Carey, M. P. (2012). Efficacy of expectancy challenge interventions to reduce college student drinking:  A meta-​analytic review. Psychology of Addictive Behaviors, 26, 393–​405. Selzer, M. L. (1971). The Michigan Alcoholism Screening Test (MAST):  The quest for a new diagnostic instrument. American Journal of Psychiatry, 127, 1653–​1658. Selzer, M. L., Vinokur, A., & van Rooijen, L. (1975). A self-​ administered Short Michigan Alcoholism Screening Test (SMAST). Journal of Studies on Alcohol, 36, 117–​126. Sheehan, D. V. (2014). Mini-​International Neuropsychiatric Interview (MINI) English Version 7.0.0 for DSM-​ 5. Tampa, FL:  University of South Florida, Institute for Research in Psychiatry. Sheehan, D. V., Lecrubier, Y., Janavs, J., Knapp, E., Weiller, E., Sheehan, M. F., . . . Sheehan, K. H. (1994). Mini International Neuropsychiatric Interview (MINI) release 4.4. Tampa, FL: University of South Florida, Institute of Research in Psychiatry. Sher, K. J., Grekin, E. R., & Williams, N. A. (2005). The development of alcohol use disorders. Annual Review of Clinical Psychology, 1, 493–​523. Sher, K. J., Jackson, K. M., & Steinley, D. (2011). Alcohol use trajectories and the ubiquitous cat’s cradle:  Cause for concern? Journal of Abnormal Psychology, 120, 322–​335. Sher, K. J., Martinez, J. A., & Littlefield, A. K. (2011). Alcohol use and alcohol use disorders. In:  D. Barlow (Ed.), Oxford handbook of clinical psychology (pp. 405–​ 445). New York, NY: Oxford University Press. Shield, K. D., Parry, C., & Rehm, J. (2013). Chronic diseases and conditions related to alcohol use. Alcohol, 85, 155–​173. Shiffman, S. (2016). Ecological momentary assessment. In K. J. Sher (Ed.), Handbook of Substance Use and Substance Use Disorders (Vol. 2, 466–​ 509). New  York:  Oxford University Press. Simpson, M., Lawrinson, P., Copeland, J., & Gates, P. (2007). The Australian Alcohol Treatment Outcome Measure (AATOM-​C):  Psychometric properties. South Wales, UK: NDARC, University of New South Wales. Simpson, M., Lawrinson, P., Copeland, J., & Gates, P. (2009). The Alcohol Treatment Outcome Measure (ATOM): A new clinical tool for standardising outcome measurement for alcohol treatment. Addictive Behaviors, 34, 121–​124. Singleton, E. G., Tiffany, S. T., & Henningfield, J. E. (1995). Development and validation of a new questionnaire to



19

Gambling Disorders

David C. Hodgins
Jennifer L. Swan
Randy Stinchfield

Gambling is defined as wagering money or something else of value on an outcome that is partially or primarily determined by chance. This broad definition comprises a wide range of activities, including the purchase of raffle tickets for a local charity, playing the animal lottery in São Paulo, betting on the outcome of a weekly golf game in Los Angeles, dog track betting in Miami, playing casino games at the Grand Casino in Ashgabat, Turkmenistan, or Pachinko in a parlor in Tokyo. People can become overinvolved in any of these activities, although certain types of gambling appear to be more likely to lead to problems. Types of gambling such as slot machines and other electronic formats that provide relatively quick feedback are considered most risky for the development of problematic gambling. These formats are typically relatively inexpensive, easy to learn and play, and often widely available both inside and outside casinos, which also contributes to the risk associated with them. Although the financial cost of limited social play is small, uncontrolled involvement leads to overwhelmingly large expenditures.

Although gambling problems have been recognized for centuries, and have been described in the Diagnostic and Statistical Manual of Mental Disorders since 1980, their prevalence and visibility have increased significantly since gambling has become broadly available during the past three decades (Hodgins, Stea, & Grant, 2011). Currently, online gambling is mushrooming in popularity, which may lead to even further growth in the prevalence of gambling problems. Clinicians from both the mental health and addiction communities have begun to respond to the need for treatment for gambling disorders. This chapter briefly describes the nature of gambling disorders and then

reviews the various assessment instruments that are available to help clinicians with diagnosis, case conceptualization and treatment planning, and treatment monitoring and evaluation. The psychometric research for each type of assessment instrument is summarized, and instruments are rated in terms of their clinical utility.

THE NATURE OF GAMBLING DISORDERS

The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​ 5; American Psychiatric Association [APA], 2013)  provides diagnostic criteria for gambling disorder, a disorder characterized by impaired control over gambling activities. Most general population prevalence surveys, in contrast, describe two levels of problems—​disordered gambling, which roughly corresponds to the DSM-​5 category, and problem gambling, which is a significant but less severe type of problem. National and state prevalence surveys have been conducted worldwide, mostly using random digit telephone dialing methodologies. Combined rates of problem and disordered gambling range from 0.2% to 5.3% of adults, depending on methodological differences and on local availability and accessibility of gambling opportunities (Hodgins et al., 2011). Although gambling disorders can affect anyone, younger people, males, and individuals with lower socioeconomic status have higher rates (Petry, 2005). Gambling disorders are associated with significant distress and social and family impairment. Huge financial debts contribute to high levels of stress and pressure to be less than honest with family members, friends, colleagues, and even with


themselves. Nongambling leisure activities are curtailed, and increasing time and energy go into gambling or obtaining the money for gambling. Sometimes checks are knowingly cashed without sufficient money in the bank to cover them, and not infrequently funds are embezzled from employers. Rates of suicidal ideation, attempts, and completed attempts are high among individuals with gambling disorders (Hodgins, Mansley, & Thygesen, 2006). Other mental health diagnoses are highly comorbid with gambling disorders, especially substance use, mood, and anxiety disorders (Crockford & el-​Guebaly, 1998). For example, a community survey of more than 43,000 Americans revealed that almost three-​fourths of pathological gamblers had a lifetime alcohol use disorder, 38% had a lifetime drug use disorder, 50% had a mood disorder, and 41% had an anxiety disorder (Petry, Stinson, & Grant, 2005). Our understanding of the temporal onset and patterning of pathological gambling and other mental health disorders is limited, but the relationship appears to vary by disorder. Substance abuse tends to precede pathological gambling (e.g., Cunningham-​Williams, Cottler, Compton, Spitznagel, & Ben-​Abdallah, 2000). On the other hand, the onset of major depression was found to be equally likely to precede or to follow the development of pathological gambling in one study (Hodgins, Peden, & Cassidy, 2005) and more often followed the onset of pathological gambling in others (Taber, McCormick, Russo, Adkins, & Ramirez, 1987). A variety of psychological treatment approaches have been offered, including mutual support groups such as


Gamblers Anonymous, psychodynamic therapies, behavioral and cognitive–​ behavioral treatments, and brief motivational treatments. Cognitive–​behavioral and brief motivational treatments have the most empirical support to date (Yakovenko & Hodgins, 2016). Natural or non-​ treatment-​assisted recovery rates are also sizeable. Surveys that report past-​year prevalence as well as lifetime prevalence consistently indicate recovery rates of approximately 40%, with the vast majority of these recovered individuals reporting never having accessed treatment (Hodgins, Wynne, & Makarchuk, 1999; Slutske, 2006).

ASSESSMENT FOR DIAGNOSIS

There has been a proliferation of disordered gambling assessment instruments during the past decade, and the majority of them fall into the area of interview or self-​ report diagnostic instruments (Stinchfield, 2014). The preponderance of measures have been developed for use in prevalence surveys, and their design reflects the need to balance maximal reliability and validity with the brevity that is required in such research. Some of these diagnostic instruments have only had psychometric properties assessed in community samples. However, as discussed later and shown in Table 19.1, a number have also been validated in clinical populations and are becoming widely used by clinicians. The majority of the available diagnostic instruments are based on the DSM-​ IV (APA, 1994)  criteria for

pathological gambling or criteria from previous versions of the DSM. A small number have been updated to assess the DSM-5 diagnosis of gambling disorder. The DSM-IV and DSM-5 criteria are nearly identical, except for the following: (a) The DSM-IV illegal activity criterion was omitted from the DSM-5; (b) the number of required criteria for diagnosis of gambling disorder was reduced by one; (c) DSM-5 now specifies that symptoms occur within a 12-month time period; and (d) some minor wording revisions were made to three of the criteria, such as inserting the word "often" in the preoccupation criterion (Stinchfield et al., 2016). As a result, instruments can be easily modified; however, any modified measures would require psychometric evaluation.

The current criteria include items such as tolerance (escalating gambling activities over time), withdrawal-like symptoms (restlessness and irritability), attempts to control one's gambling, impaired control ("chasing losses"), and continuing to gamble despite negative consequences. Generally, the criteria are behavioral and objective in nature. An individual receives a diagnosis if four or more of the nine criteria are met (Criterion A) and the gambling behavior is not better accounted for by a manic episode (Criterion B). The problem gambling category that is often reported in prevalence surveys, but not included in the DSM, is typically conceptualized as subthreshold pathological gambling. Many of the diagnostic instruments reviewed here and summarized in Table 19.1 provide a lower cut-off for determining problem gambling, and some instruments provide one or two additional "at risk" categories that reflect even lower levels of problem severity.

The medically based conceptualization of disordered gambling in the DSM has been criticized as ignoring the role of the social and environmental context of gambling disorders. In response, broader "harm-based" models of gambling problems have been proposed in which problems are defined as gambling that creates negative consequences for the gambler, others in the social network, or the community (Ferris & Wynne, 2001; Ferris, Wynne, & Single, 1998). The Problem Gambling Severity Index of the Canadian Problem Gambling Index (Ferris & Wynne, 2001), reviewed later, was developed from this alternative conceptualization and has been popular in Canada as well as in numerous other countries (e.g., Australia and New Zealand).

TABLE 19.1 Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended (a)
SOGS | E | G | G | A | A | E | E | G |
SOGS-R | E | G | NR | A | A | E | A | G |
NODS | NR | G | NR | A | A | G | A | G |
GAMTOMS–DSM | A | A | NR | G | A | G | A | A |
DIGS-DSM | NR | E | NR | NR | A | A | NR | G |
SCI-PG | NR | NR | E | A | A | G | NR | G |
GBI | A | E | NR | NR | A | G | NR | A |
CPGI–PGSI | A | G | NA | A | L | A | A | A |
PPGM | L | G | NA | A | G | A | A | L |

(a) See the Overall Evaluation discussion at the end of this section for the reasons that no measure is currently highly recommended for this assessment purpose.

Note: SOGS = South Oaks Gambling Screen; SOGS-R = SOGS past-year version; NODS = National Opinion Research Center (NORC) DSM-IV Screen for Gambling Problems; GAMTOMS = Gambling Treatment Outcome Monitoring System; DIGS = Diagnostic Interview for Gambling Schedule; SCI-PG = Structured Clinical Interview for Pathological Gambling; GBI = Gambling Behavior Inventory; CPGI–PGSI = Canadian Problem Gambling Index–Problem Gambling Severity Index; PPGM = Problem and Pathological Gambling Measure; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

South Oaks Gambling Screen

The most well-known instrument is the South Oaks Gambling Screen (SOGS; Lesieur & Blume, 1987). The

SOGS, developed in the 1980s to screen clinical populations, was based on the DSM-​III (APA, 1980) and DSM-​ III-​R (APA, 1987)  criteria. It subsequently became the most widely used instrument in general population prevalence surveys (Shaffer, Hall, & Vander Bilt, 1999)  and has been translated into French, Spanish, Italian, Swedish, Lao, Chinese, Vietnamese, Portuguese, and Cambodian. The original SOGS consisted of 20 true–​ false self-​completion items that reflect lifetime gambling involvement, although parallel past-​ year and 3-​ month versions were subsequently developed (SOGS-​R; Lesieur & Blume, 1993; SOGS-​ 3; Wulfert et  al., 2005). The SOGS-​3 is useful in evaluating outcome, and is discussed in the treatment monitoring section. The SOGS can be administered either in self-​report format or via face-​to-​face or telephone interview. Although the original scale was designed to identify pathological gambling, a lower cut-​ off score for problem gambling has been established, and the SOGS total score is also used as an indicator of gambling problem severity. The content of the SOGS includes items that inquire about hiding evidence of gambling, spending more time or money gambling than intended, arguing with family members about gambling, and borrowing money from a variety of sources to gamble or to pay gambling debts. Each of these sources of money is scored as a separate item, which weights this criterion very heavily. Because the SOGS items were developed from DSM-​III criteria, there is some concern regarding its content validity for DSM-​ 5 assessments because a number of criteria have been changed significantly in the DSM revisions. Nonetheless, an investigation of the psychometric properties of the past-​year self-​report version of the SOGS in three large clinical samples found good internal reliability and concurrent validity compared with DSM-​IV assessments (Stinchfield, 2002). Classification accuracy overall was good (.96), with better sensitivity (.99) than specificity (.75). Regarding specificity, the SOGS appears to be a liberal measure of DSM-​ IV pathological gambling. In general population samples, the SOGS identifies a greater number of pathological gamblers than do DSM-​ IV-​oriented measures (Cox, Yu, Afifi, & Ladouceur, 2005; Stinchfield, 2002). Fewer comparative data are available for clinical samples, although the same concern about false positives exists (Grant, Steinberg, Kim, Rounsaville, & Potenza, 2004; Hodgins, 2004). With a reduction in the number of symptoms required for diagnosis of gambling disorder in the DSM-​5 (thereby raising the population prevalence of disordered gambling), problems with false positives on the SOGS could be expected to decrease.


However, limitations with the classification accuracy of the SOGS remain unchanged when evaluated with DSM-5 criteria (Goodie et al., 2013). Test–retest reliability was acceptable with the original interview version (Lesieur & Blume, 1987) and the past-year self-report version in a clinical sample (Stinchfield, Winters, Botzet, Jerstad, & Breyer, 2007). The self-report SOGS often acts as the comparison standard in the assessment of other measures, so evidence of concurrent validity of both past-year and lifetime versions across a variety of clinical and nonclinical samples is available and is generally positive (see Table 19.1; Grant et al., 2003; Hodgins, 2004; Lesieur & Blume, 1987; Stinchfield, Govoni, & Frisch, 2005; Wulfert et al., 2005). Ladouceur and colleagues (2000) investigated validity at the item level and reported that most respondents in a community sample misinterpreted one or more items. Because all the true–false items are keyed in the true direction (true reflecting a problem), community respondents were more likely to overreport than underreport symptoms—clarification of item meaning reduced the number of individuals classified as pathological gamblers. Similar research has not been conducted with clinical samples, but clearly interpretation at the item level is likely to be unreliable for any scale.

National Opinion Research Center DSM-IV Screen for Gambling Problems

The National Opinion Research Center DSM-IV Screen for Gambling Problems (NODS) was originally developed for a U.S. national gambling telephone survey as a past-year and lifetime diagnostic measure based on DSM-IV diagnostic criteria (Gerstein et al., 1999). As well as being designed for use in an interview format, it is also used as a self-report instrument, although no psychometric information is available for the self-report version. Seventeen true–false items measure the 10 DSM-IV diagnostic criteria (and therefore the 9 DSM-5 criteria), and the past-year items are asked only if the lifetime item is answered with a positive response. The NODS total score is used to identify pathological gambling, and lower cut-offs indicate problem and low-risk gamblers. A number of the DSM criteria are operationalized with the use of time periods (e.g., past 2 weeks) and frequency parameters (e.g., three or more times) in order to increase the item reliability. Because these changes represent a tightening of these criteria relative to their description in the DSM-IV and DSM-5, the NODS may underidentify pathological gamblers. Consistent with this concern, in the


U.S. national sample, the estimated prevalence was lower than that found in other surveys (Gerstein et  al., 1999). However, without a gold standard for comparison, it is unclear that this lower estimate is less valid. When using clinician rating (based on DSM-​IV criteria) as the threshold for comparison in a sample of gamblers, the NODS identified only 68.5% of problem gamblers identified by clinicians, but it provided a reasonably accurate overall prevalence rate (Williams & Volberg, 2014). However, clinician ratings do not necessarily reflect a gold standard, and there have not been similar published comparisons of the NODS with other DSM-​IV or DSM-​5 measures in clinical populations. The NODS has less supporting psychometric research than the SOGS. In terms of additional indicators of validity, during the scale development phase the NODS was administered to a small sample of individuals in outpatient problem gambling treatment programs. Of the 40 individuals, 38 scored 5 or more on the lifetime NODS, and 2 obtained scores of 4. Retest reliability over 2 to 4 weeks in an overlapping sample of 44 gamblers in treatment was high (r = .99 and r = .98 for lifetime and past year, respectively). The authors did not report internal consistency coefficients, although alpha coefficients in clinical samples were reported to be adequate in the past-​year version administered via telephone (Hodgins, 2004) and good in the past-​year and lifetime versions administered face-​to-​face (Wickwire, Burke, Brown, Parker, & May, 2008; Wulfert et al., 2005). The validity of the lifetime and past-​year total scores was also assessed in these clinical samples. Using a variety of discriminant and convergent measures, good validity results were generally obtained (Hodgins, 2004; Wickwire et  al., 2008; Wulfert et  al., 2005). Hodgins also reported the validity of the categorical cut-​ points compared with the SOGS pathological and problem categories. Agreement was poor, with most NODS problem gamblers categorized as pathological on the SOGS (i.e., more severe). Because it is unclear which categorization is more valid in the absence of a gold standard indicator, clinicians should be cautious about relying too much on cut-​off scores to indicate the presence or absence of a diagnosable condition. Following the development of the NODS, a subset of three items were found to identify nearly all pathological gamblers and more than 90% of problem gamblers. These three items—​ evaluating loss of control, lying, and preoccupation—​comprise the NODS-​CLiP (Toce-​ Gerstein, Gerstein, & Volberg, 2009). This brief screen has demonstrated excellent sensitivity (.96) and adequate


specificity (.90) in the general population, but it did not perform as well in a clinical sample. Although it captured nearly all pathological and problem gamblers, the NODS-CLiP also captured a high proportion of low-risk and at-risk gamblers (Volberg, Munck, & Petry, 2011). Volberg and colleagues identified a different subset of four items evaluating preoccupation, escape, risked relationships, and chasing (NODS-PERC). This alternative set of items demonstrated better psychometric properties in a clinical sample than the NODS-CLiP (Volberg et al., 2011). The authors recommend use of the NODS-PERC over the CLiP in settings with a higher base prevalence rate of disordered gambling (e.g., substance abuse treatment programs).

In summary, the NODS appears to identify fewer individuals as pathological gamblers in both general population and treatment samples. It provides a DSM-IV diagnosis plus a subclinical problem gambling category. To date, positive, but limited, psychometric research is available for the interview version. Subsets of NODS items also appear to form promising brief screeners for gambling problems in the general population (i.e., NODS-CLiP) and in clinical populations (i.e., NODS-PERC).

GAMTOMS–DSM-IV Measure

The Gambling Treatment Outcome Monitoring System (GAMTOMS; Stinchfield, Winters, et al., 2007) is a multidimensional questionnaire or interview assessment tool designed for outcome assessment. It is described in detail in the sections on other assessment purposes. However, it also contains a 10-item true–false DSM-IV measure relevant for diagnostic purposes. Both the questionnaire and interview versions of the GAMTOMS have been subjected to a number of psychometric evaluations in clinical samples (Stinchfield, Govoni, & Frisch, 2007). The DSM-IV total score showed good internal reliability in one treatment sample but less than adequate reliability in two other samples. Retest reliability over 1 week was good in the three samples but slightly lower than the SOGS retest estimate in the same samples. The total scores showed good convergent and discriminant validity with a variety of criteria, including the SOGS. The categorical diagnosis of pathological gambling showed good sensitivity (.96) and specificity (.95) identifying clinical from nonclinical individuals and good sensitivity (.97) and specificity (1.0) using SOGS classification as the criterion (Stinchfield, 2003; Stinchfield et al., 2005).
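The sensitivity and specificity figures cited throughout this section come from simple two-by-two comparisons of a screen's classification against a reference standard. For readers who want to compute these indices from their own data, the following minimal sketch (in Python) illustrates the calculations; the counts are invented for illustration and are not drawn from any of the studies cited above.

```python
def classification_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, and overall hit rate from a 2x2 table of
    screen result (positive/negative) versus a reference standard."""
    sensitivity = tp / (tp + fn)                 # true positives / all reference-positive cases
    specificity = tn / (tn + fp)                 # true negatives / all reference-negative cases
    hit_rate = (tp + tn) / (tp + fp + fn + tn)   # proportion of all cases correctly classified
    return sensitivity, specificity, hit_rate

# Hypothetical counts: 97 of 100 reference-positive cases screen positive,
# and 95 of 100 reference-negative cases screen negative.
sens, spec, acc = classification_accuracy(tp=97, fp=5, fn=3, tn=95)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, hit rate={acc:.2f}")
```

Positive and negative predictive power, which depend on the base rate of disordered gambling in the sample being screened, can be computed from the same four counts.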

Other DSM-IV Measures

A number of additional DSM-IV-based measures have been developed but, to date, have had limited psychometric evaluation. For example, a brief gambling module of the Diagnostic Interview Schedule (DIS; Robins, Cottler, Bucholz, & Compton, 1996) has been used in a number of investigations (e.g., Cunningham-Williams, Cottler, Compton, & Spitznagel, 1998; Welte, Barnes, Wieczorek, Tidwell, & Parker, 2001), although no psychometric data have been reported. A revised and more extensive Composite International Diagnostic Interview includes assessment of DSM-IV pathological gambling and has demonstrated good psychometric properties in a U.S. household population (Kessler et al., 2008).

Two other diagnostic assessment measures are appealing because they allow the clinician to probe responses to determine whether each diagnostic criterion is passed. The Diagnostic Interview for Gambling Schedule–DSM-IV Diagnosis (DIGS-DSM-IV; Winters, Specker, & Stinchfield, 2002) is a structured clinical interview for assessment and treatment planning that contains a 20-item assessment of the DSM-IV criteria for the past-year and lifetime time frames. Psychometric data were assessed in only one treatment sample but were positive (Winters et al., 2002). Grant and colleagues (2004) describe a similar measure, the Structured Clinical Interview for Pathological Gambling (SCI-PG) that is modeled after the Structured Clinical Interview for the DSM-IV (SCID; Spitzer, Williams, Gibbon, & First, 1990), which is widely used for assessment of DSM disorders but does not include a pathological gambling module. A DSM-5 updated version of the instrument (SCID-5; First, Williams, Karg, & Spitzer, 2015a) includes an optional module to assess current (past-year) gambling disorder in the research version of the instrument (SCID-5-RV); however, this module is unavailable for the clinician version (SCID-5-CV; First, Williams, Karg, & Spitzer, 2015b). In a SCID assessment, trained clinicians use a series of probe questions to determine whether each of the 10 criteria has been met over the lifetime and currently. If the gambling module of the SCI-PG (or the SCID-5-RV) is used in conjunction with the full SCID, then the clinician can assess the DSM exclusion criterion for pathological gambling: The gambling behaviors are not better accounted for by a manic episode (Criterion B). In a small clinical sample, inter-rater reliability and retest reliability over a 1-week period were excellent and sensitivity was .88 and specificity was 1.00 assessed against clinical ratings.
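Because interviews such as the DIGS and the SCI-PG work by establishing whether each diagnostic criterion is met, the final diagnostic decision reduces to a simple count against the DSM-5 rule described earlier in this chapter (four or more of nine criteria within 12 months, not better accounted for by a manic episode). The sketch below illustrates that decision rule only; the criterion labels are paraphrased for illustration and are not the verbatim DSM-5 wording.

```python
# Paraphrased criterion labels, for illustration only (not verbatim DSM-5 text).
CRITERIA = [
    "tolerance", "withdrawal-like restlessness/irritability",
    "repeated unsuccessful efforts to cut back", "preoccupation",
    "gambling when distressed", "chasing losses", "lying about gambling",
    "jeopardized relationship, job, or opportunity", "relying on others for money",
]

def dsm5_gambling_disorder(criteria_met, manic_episode):
    """Criterion A: four or more of the nine criteria endorsed in the past 12 months.
    Criterion B: the gambling is not better accounted for by a manic episode."""
    count = len(set(criteria_met) & set(CRITERIA))  # ignore labels outside the criterion set
    return count >= 4 and not manic_episode

# Example: four criteria endorsed, no manic episode -> meets the diagnostic threshold.
print(dsm5_gambling_disorder(
    {"tolerance", "chasing losses", "lying about gambling", "preoccupation"},
    manic_episode=False))
```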


A final DSM-based alternative is the Gambling Behavior Inventory (GBI; Stinchfield, 2003; Stinchfield et al., 2005), which is a 76-item structured interview that includes a 10-item past-year DSM scale. The DSM scale has shown excellent internal reliability in two treatment samples as well as convergent and discriminant validity with a variety of measures (Stinchfield, 2003; Stinchfield et al., 2005; Stinchfield, Winters, et al., 2007). The categorical diagnosis of pathological gambling showed good sensitivity (.91), but lower specificity (.83), in identifying clinical from nonclinical individuals (Stinchfield et al., 2005). Sensitivity and specificity improved using a cut-off of four versus five criteria. The GBI and the GAMTOMS have been evaluated with current DSM-5 criteria by removing the illegal acts criterion and lowering the cut-score, and they demonstrated satisfactory reliability, validity, and classification accuracy (Stinchfield et al., 2015).

A number of brief screens, including the NODS-CLiP and NODS-PERC discussed previously, have been developed to quickly assess disordered gambling. An additional brief screen, the Brief Biosocial Gambling Screen (BBGS; Gebauer, LaBrie, & Shaffer, 2010), consists of three yes/no items. The screen assesses for disordered gambling over a 12-month time frame and has been shown to have high sensitivity (.96) and high specificity (.99) using a cut-off of 1 (Gebauer et al., 2010). The BBGS appears to demonstrate satisfactory classification accuracy, with reported hit rates, sensitivity, and specificity values of .88, .91, and .87, respectively (Himelhoch et al., 2015). Furthermore, although the BBGS was developed using DSM-IV diagnostic criteria, the psychometric properties appear to remain strong with current DSM-5 criteria (Brett et al., 2014).

Canadian Problem Gambling Index–Problem Gambling Severity Index

The Canadian Problem Gambling Index (CPGI; Ferris & Wynne, 2001) is an interview tool assessing gambling involvement and social context designed for prevalence surveys. It has been used in surveys in most Canadian provinces and nationally, which provides a large normative database (Cox et al., 2005). The CPGI contains a nine-item Problem Gambling Severity Index (PGSI) that has a past-year time frame. The PGSI total score indicates non-problem, low-risk, moderate-risk, and problem gambling. The total score has demonstrated good internal reliability and adequate test–retest reliability over a 4-week period in the general population. It also shows good convergent validity with the SOGS, DSM-IV, and clinical


ratings in a treatment sample. Classification accuracy of the problem gambling category showed adequate sensitivity (.83) and excellent specificity (1.0) using DSM-IV classification as the criterion (Ferris & Wynne, 2001). However, several weaknesses of the classification categories of the PGSI have been noted. Given that researchers often merge moderate-risk and problem gambling categories to increase statistical power due to low prevalence (Afifi, Cox, Martens, Sareen, & Enns, 2010; Crockford et al., 2008), more attention needs to be paid to the validity of the instrument's cut-off scores. The scale developers proposed a cut-off score of 3 to identify moderate-risk gamblers. When using a cut-off score of 3, the PGSI has shown poor correspondence with clinical ratings, producing a problem gambling rate 1.85 times higher than clinical ratings (Williams & Volberg, 2014). Using the proposed cut-off score of 8 for problem gambling, the PGSI has demonstrated excellent specificity (.99) but only identified 49% of problem gamblers identified using clinical ratings (Williams & Volberg, 2014). Some research teams have proposed that using a cut-off score of 5 on the PGSI provides a more distinctive classification between low- and moderate-risk gamblers (Currie, Hodgins, & Casey, 2013) and provides significantly higher specificity, positive predictive power, and diagnostic efficiency compared to a cut-off score of 3 (Williams & Volberg, 2014).

Problem and Pathological Gambling Measure

The Problem and Pathological Gambling Measure (PPGM; Williams & Volberg, 2010) is a relatively new instrument developed for use in population prevalence surveys. The development of the PPGM aimed to address identified weaknesses in previous instruments, including limited assessment of gambling-related harms and inability to capture problem gamblers in denial or who lack insight. The PPGM assesses a past-year time frame and consists of 14 items divided into three sections: Problems, Impaired Control, and Other Issues. Respondents are classified as a nongambler, recreational gambler, at-risk gambler, problem gambler, or pathological gambler based on section scores, frequency of gambling, and reported gambling loss. The PPGM scores have demonstrated good internal consistency and adequate 1-month test–retest reliability, and they have shown higher agreement with clinical ratings compared to other instruments, such as the SOGS, PGSI, and NODS. The PPGM has demonstrated good convergent validity with the SOGS, PGSI, and NODS and clinical ratings in the


general population (Williams & Volberg, 2010). The scale has demonstrated good psychometric properties in two large samples (Williams & Volberg, 2014); however, independent replication of the psychometric properties of the PPGM is needed for it to be highly recommended.
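The practical stakes of the cut-off debates reviewed above, such as a PGSI moderate-risk threshold of 3 versus 5, can be made concrete with a small sketch. The category bands below follow the scoring convention of Ferris and Wynne (2001); the alternative moderate-risk threshold is passed in as a parameter, and the example scores are arbitrary.

```python
def pgsi_category(total, moderate_cutoff=3):
    """Classify a past-year PGSI total score (0-27).
    Bands follow Ferris and Wynne (2001): 0 = non-problem, 1-2 = low risk,
    moderate risk from `moderate_cutoff` up to 7, and 8 or more = problem gambling.
    Pass moderate_cutoff=5 to apply the alternative threshold discussed above."""
    if total >= 8:
        return "problem gambling"
    if total >= moderate_cutoff:
        return "moderate risk"
    if total >= 1:
        return "low risk"
    return "non-problem"

for score in (0, 2, 4, 6, 9):
    print(score, pgsi_category(score), pgsi_category(score, moderate_cutoff=5))
```

Running the loop shows that only scores of 3 and 4 change category under the alternative threshold, which is exactly the group of respondents at issue in the debate over the moderate-risk cut-off.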

Overall Evaluation

Because gambling disorder is a relatively new area of investigation, there is a lack of consensus about the gold standard diagnostic instruments. The SOGS was almost unanimously used until it was eclipsed by the desire for a DSM-IV-based instrument. Subsequently, a number of DSM-IV alternatives have been developed, but none has the extensive psychometric database of the SOGS and none has become universally used in either research or clinical contexts. These instruments can easily be modified to provide a DSM-5 diagnosis by eliminating one of the diagnostic criteria and making minor revisions to wording, but few psychometric evaluations have been reported of DSM-5 revised instruments. Although the SOGS and DSM-IV measures generally appear to assess the same construct, it also appears that the SOGS pathological and problem gambling categories represent a lower threshold for the disorders compared to the DSM-IV measures. The SOGS also lacks content validity with respect to the DSM-IV criteria. All of the proposed DSM-IV measures have positive preliminary psychometric support and, not surprisingly, the items on the various scales are quite similar. In fact, even the CPGI–PGSI, which was not derived from a DSM conceptualization, has eight of nine items that overlap with either the SOGS or DSM-IV items.

The measures vary in other ways. The NODS and GAMTOMS DSM are the only self-completion options, although all of the psychometric evaluation of the NODS has been on the interview format. The interview options include the GAMTOMS, NODS, GBI, DIGS, and SCI-PG. The NODS and GBI can be administered via telephone as well as face to face. The GAMTOMS, NODS, and GBI can be administered by lay persons, and the DIGS and SCI-PG require clinical training and experience. These latter two measures are, arguably, true diagnostic measures because interviewers probe to ensure that each criterion is reached, whereas the others can be better viewed as screening measures. Nonetheless, further psychometric evaluation is required before any of these instruments can be highly recommended for routine use (see Table 19.1). In the meantime, the information available for each instrument is generally supportive, and clinicians are advised to use the instrument that best fits their purpose from a practical perspective.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Table 19.2 outlines important domains in case conceptualization and treatment planning for gambling disorders. Together, the domains provide a comprehensive description of the severity and consequences of the problem. These factors point to potential treatment targets. It is also recommended that some type of functional analysis of the precipitants of gambling be performed. This type of information is particularly relevant for cognitive–behavioral therapy but can also inform the therapeutic direction in other treatment models. Assessment of comorbidity serves a similar purpose and also may provide information about etiology (as does family history). A clear understanding of the client's treatment goal, previous treatment experience, and motivation is also essential.

Limited instrumentation exists for many of these areas and, in fact, research regarding the specifics of the construct is also limited in some instances. A good example is the first domain in Table 19.2, Severity/Impaired Control. An issue exists that parallels a long-standing debate in the alcohol field: the advisability of gambling moderation goals versus complete abstinence from gambling (Ladouceur, 2005). The issue in gambling is complicated by the possibility that gambling abstinence can be narrowly defined as quitting the types of gambling that have caused problems for the individual or broadly defined as quitting all types of gambling even if they have never caused problems (Stea, Hodgins, & Fung, 2015). In the alcohol field, the most robust clinical indicator of the likelihood of the success of moderation versus abstinence from alcohol is the degree of alcohol dependence (Rosenberg, 1993). Efforts are underway to delineate a similar construct in the gambling field, impairment of control over gambling, although efforts to develop a reliable measurement tool have yielded mixed results (Dickerson & O'Connor, 2006). Kyngdon (2004) described a 12-item unifactorial scale that in preliminary studies correlated highly with measures of severity, such as the SOGS, which suggests that until measurement of impaired control further develops, severity of problem can be used as a proxy. Problem severity has been shown to be related to natural versus treatment-assisted recovery (Hodgins & el-Guebaly, 2000) and response to brief interventions (Stea et al., 2015), and it is used by clinicians


to help determine the optimal treatment goal (Robson, Edwards, Smith, & Colman, 2002). Table 19.2 provides a number of suggestions for standardized tools to assess severity of problem, and these tools are described in detail in the preceding diagnostic section. As outlined previously, psychometric research has focused on the validity of these scales as indicators of pathological gambling, and little work has assessed the validity of these scales as indicators of lower degree of problem severity. DSM-​based measures are designed to have items that measure severe pathology. For example, an examination of the SOGS with a Rasch model of measurement (Strong, Breen, Lesieur, & Lejuez, 2003) found that SOGS items could be ordered in terms of their level of gambling problem severity, similar to a Guttman scale, but that the scale is composed of mostly items reflecting severe gambling problems and that more low-​and moderate-​ severity items would be necessary to obtain an optimal measure of the entire continuum of problem severity. In contrast to DSM-​based scales, the CPGI–​PGSI was specifically designed to assess the full range of severity, although the low-​and moderate-​risk interpretation categories also have not been validated for the CPGI–​PGSI. There are several omnibus instruments that cover a number of the remaining relevant assessment domains outlined in Table 19.2. The first is an adapted version of the Addiction Severity Index (ASI; McLellan, Kushner, Metzger, & Peters, 1992). The ASI is among the most widely used and validated tools for assessing and monitoring patients with substance abuse problems. It provides assessment of the severity and need for treatment in the medical, employment, family–​ social, psychiatric, legal and substance abuse domains, which are all relevant for individuals with gambling disorders. The ASI was developed as an interview, although computerized and self-​ completion versions are also available. The ASI–​Gambling Severity Index (Lesieur & Blume, 1991)  is a supplemental module that uses five items to assess gambling severity and need for treatment. It was initially validated in the interview format with inpatients in a substance abuse and gambling program (Lesieur & Blume, 1991) and later with a large sample drawn from four different populations—​ pathological gamblers in outpatient treatment, pathological gamblers participating in a treatment study, community problem gamblers, and substance abusers (Petry, 2003). In the first study, internal reliability was adequate, and some evidence of convergent validity was presented. The second study was more comprehensive, revealing strong internal reliability and good test–​retest reliability over a 1-​month period


TABLE 19.2 Important Domains in Case Conceptualization and Treatment Planning for Gambling Disorders

General Dimension | Specific Construct | Standardized Tools
Severity/Impaired Control | Impaired control | SOGS, NODS, CPGI–PGSI
Gambling Quantity | Lifetime history; Recent (past month) | Timeline Followback method
Consequences | Health (e.g., gastrointestinal, insomnia); Family; Social relationships; Employment; Financial; Emotional (self-esteem); Legal | ASI–GSI, GAMTOMS, DIGS
Association/Circumstances of Gambling | Functional analysis | IGS, TGS, GMQ, GFA
Comorbid Psychiatric Disorders | DSM-5 Disorders | SCID-5, DIS, CIDI
Other Drug Use | Prescription and illicit drugs; Nicotine, caffeine | AUDIT, DAST
Family History | Biological and family exposure to gambling | GAMTOMS
Treatment History | Programs started and completed; Twelve-step involvement; Periods of abstinence or nonproblematic gambling |
Treatment Goal | Goal (abstinence or moderation); Self-efficacy | GASS, SCQG
Motivation | Readiness to change; Reasons to change | GAMTOMS
Family and social support | |

Note: SOGS = South Oaks Gambling Screen; NODS = National Opinion Research Center DSM-IV Screen for Gambling Problems; CPGI–PGSI = Canadian Problem Gambling Index–Problem Gambling Severity Index; ASI–GSI = Addiction Severity Index–Gambling Severity Index; GAMTOMS = Gambling Treatment Outcome Monitoring System; DIGS = Diagnostic Interview for Gambling Schedule; IGS = Inventory of Gambling Situations; TGS = Temptation to Gamble Scale; GMQ = Gambling Motives Questionnaire; GFA = Gambling Functional Assessment; SCID-5 = Structured Clinical Interview for the DSM-5; DIS = Diagnostic Interview Schedule; CIDI = Composite International Diagnostic Interview; AUDIT = Alcohol Use Disorders Identification Test; DAST = Drug Abuse Screening Test; GASS = Gambling Abstinence Self-efficacy Scale; SCQG = Situational Confidence Questionnaire for Gambling.

as well as convergent and discriminant validity across a range of external variables, including collateral and clinical ratings. The ASI, together with the ASI gambling module, can provide a profile of the treatment needs of


an individual, although the composite severity scores for each of the domains are difficult to compute by hand. As indicated later, each index is responsive to change, which makes it a useful tool for monitoring outcome, but its value for treatment planning is limited by lack of interpretation guidelines and norms (see Table 19.3). A second omnibus instrument is the GAMTOMS (Stinchfield, Winters, et al., 2007), which is a self-​report or interview instrument that takes approximately 30 to 45 minutes to complete. As shown in Table  19.3, the GAMTOMS receives generally good psychometric ratings, although information is unavailable in three areas. The latest version of the GAMTOMS includes, in addition to the DSM-​IV measure described previously, scales assessing gambling frequency, mental health, financial problems, legal problems, and stage of change. The GAMTOMS also incorporates the SOGS scale. Content validity for assessing outcome was confirmed by an expert panel of gambling treatment professionals. Gambling quantity is measured by items enquiring about the frequency of gambling for 14 specific types of gambling. Scores on these items in both the interview and self-​ administered versions generally show good test–​ retest reliability over a 1-​week period as well as convergent validity with a time line interview of gambling behavior described later (Stinchfield, Winters, et  al., 2007).


Mental health is measured with the ASI Psychiatric composite severity score described previously. The ASI psychiatric score had inadequate internal reliability but good retest reliability over 1 week (intraclass correlation coefficient [ICC]  =  .83) and good convergent validity with the BASIS-​32 (Eisen, Dill, & Grob, 1994), a self-​ report instrument validated with psychiatric outpatients. Scores on the 23-​item financial consequences scale and the 7-​item legal consequences scale had good internal reliability and retest reliability as well as convergent and discriminant validity with other GAMTOMS scales and with federal bankruptcy and court records and collateral reports (Stinchfield, Winters, et  al., 2007). Finally, the GAMTOMS includes a single item assessing stage or readiness to change according to the Prochaska and DiClemente model (Prochaska, DiClemente, & Norcross, 1992). The item showed poor retest reliability over a 1-​week period, although it was sensitive to change and showed good convergent validity with gambling items (Stinchfield, Winters, et al., 2007). In summary, the GAMTOMS covers a number of important content domains for treatment planning and for monitoring treatment outcome. Psychometric evidence for both the self-​report and interview versions is accumulating and is generally positive. To date, as with the ASI, interpretation norms for the various scales have

TABLE 19.3 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
SOGS | E | E | G | E | A | E | E | G | ✓
NODS | NR | A | NR | A | A | G | A | G |
CPGI–PGSI | A | G | NR | A | L | A | A | A |
GAMTOMS | NR | G | NR | G | G | G | NR | G | ✓
ASI–GSI | NR | A | NR | A | A | G | G | G | ✓
DIGS | NR | A | NR | NR | A | A | NR | G |
TLFB | NR | NA | A | A | A | A | E | G |
IGS | A | E | NA | NR | G | G | A | A | ✓
TGS | NR | E | NA | A | G | A | NR | A |
GFA-R | NR | E | NA | A | A | G | A | A |
GMQ | NR | G | NA | NR | A | G | G | L |
GMQ-F | NR | A | NA | NR | G | G | A | L |
GASS | A | E | NA | A | G | G | NR | A | ✓
SCQG | NR | E | NA | A | G | A | NR | A |

Note: SOGS = South Oaks Gambling Screen; NODS = National Opinion Research Center DSM-IV Screen for Gambling Problems; CPGI–PGSI = Canadian Problem Gambling Index–Problem Gambling Severity Index; GAMTOMS = Gambling Treatment Outcome Monitoring System; ASI–GSI = Addiction Severity Index–Gambling Severity Index; DIGS = Diagnostic Interview for Gambling Schedule; TLFB = Timeline Followback; IGS = Inventory of Gambling Situations; TGS = Temptations to Gamble Scale; GFA-R = Gambling Functional Assessment-Revised; GMQ = Gambling Motives Questionnaire; GMQ-F = Gambling Motives Questionnaire–Financial; GASS = Gambling Abstinence Self-efficacy Scale; SCQG = Situational Confidence Questionnaire for Gambling; L = Less than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


not been published, which limits its value for clinicians assessing individual clients. The DIGS (Winters et  al., 2002)  is a third omnibus instrument, previously described, designed to assess numerous dimensions relevant to case conceptualization and treatment planning. The DIGS assesses demographics, gambling involvement and history, legal problems, other impulse disorders, medical status, and family and social functioning, and it also includes a mental health screen. These domains represent the majority of the relevant assessment areas but, as mentioned previously, the DIGS has had limited psychometric evaluation, although the available data are positive. These three omnibus instruments collect basic gambling frequency information. More detailed descriptions of gambling frequency, expenditures, time spent gambling, and monthly patterns can be assessed using the Timeline Followback (TLFB) methodology, adapted from the alcohol field. The TLFB has been shown to provide reliable and valid gambling reports, at least in the research context (Hodgins & Makarchuk, 2003; Weinstock, Whelan, & Meyers, 2004). The method involves providing the individuals with a calendar, reviewing with them personal and public events to cue memories, and having them reconstruct their daily gambling over a period of 1 to 6 months. Frequency and expenditure information can be summarized into reliable indices for weekly or monthly time periods, but clinically rich information about patterns of gambling can also emerge. Table 19.2 lists the assessment of the associations and circumstances associated with gambling behavior as a third important domain. Prospective research examining the process of relapse in pathological gamblers seeking abstinence (Hodgins & el-​Guebaly, 2004)  has revealed that individuals are most likely to be alone and thinking about finances, but that a positive mood state is as likely as a negative mood state to precede the initiation of gambling. A relapse associated with social pressure to participate or a desire to fit in socially typically led to a relatively minor relapse, whereas gambling associated with a false optimism about winning or a feeling of financial pressure was more serious. Women were more likely to relapse in response to feelings of depression, whereas men described gambling in response to being bored or having unstructured time or in response to the need to make money (Hodgins & el-​Guebaly, 2004). A detailed assessment of these potential high-​risk situations at the individual level is important for treatment planning and can be accomplished by conducting an informal functional analysis of recent heavy gambling situations.
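As a concrete illustration of how TLFB calendar data can be reduced to the weekly or monthly indices described here, the following sketch aggregates daily records into days gambled and net expenditure per month. The records and the field layout are hypothetical and are shown only to make the summarization step explicit; any real implementation would draw on the clinician's own calendar data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical TLFB records: (calendar date, net amount lost that day in dollars).
records = [
    (date(2017, 3, 3), 120), (date(2017, 3, 10), 45), (date(2017, 3, 24), 300),
    (date(2017, 4, 1), 80), (date(2017, 4, 15), 0),   # a gambling day with no net loss
]

# Aggregate per calendar month: count of gambling days and total net expenditure.
summary = defaultdict(lambda: {"days_gambled": 0, "net_expenditure": 0})
for day, net_loss in records:
    month = (day.year, day.month)
    summary[month]["days_gambled"] += 1
    summary[month]["net_expenditure"] += net_loss

for (year, month), stats in sorted(summary.items()):
    print(f"{year}-{month:02d}: {stats['days_gambled']} days gambled, "
          f"${stats['net_expenditure']} net expenditure")
```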


The Inventory of Gambling Situations (IGS) has been developed to assess these factors (Turner, Littman-Sharp, Toneatto, Liu, & Ferentzy, 2013). The self-report scale contains 63 items that comprise 10 subscales: Negative Emotions, Conflict with Others, Urges and Temptations, Testing Personal Control, Pleasant Emotions, Social Pressure, Need for Excitement, Worried About Debts, Winning and Chasing, and Confidence in Skill. The IGS scale scores showed excellent internal reliability and the factor structure was confirmed in two clinical samples. Preliminary evidence of discriminant and convergent validity was also reported. The subscales all correlated highly with the SOGS and DSM-IV criteria, and the pattern of correlations with a group of external measures such as depression, impulsivity, and cognitive errors conformed to expectation. The scale has good potential for clinical use, although it is lengthy and a computer scoring program is recommended because it is difficult to score by hand. A 10-item short form of the IGS (IGS-10; Smith, Stewart, O'Connor, Collins, & Katz, 2011) demonstrated good convergent validity and internal consistency in a sample of undergraduate gamblers, although additional psychometric evaluation is required.

An alternative option is the Temptations to Gamble Scale (TGS; Holub, Hodgins, & Peden, 2005), which has 21 items that comprise four subscales: Negative Affect, Positive Mood/Impulsivity, Seeking Wins, and Social Factors. The TGS has good content validity and demonstrated strong internal and test–retest reliability over a 3-week period in a sample of pathological gamblers.

To examine gambling motives more broadly, the Gambling Motives Questionnaire (GMQ; Stewart & Zack, 2008) draws from the alcohol literature and measures the frequency of gambling for a variety of reasons. The 15-item GMQ was adapted from the Drinking Motives Questionnaire (DMQ; Cooper, Russell, Skinner, & Windle, 1992) and consists of three subscales: Coping, Enhancement, and Social motives. The GMQ scores demonstrated good internal consistency and concurrent validity (Lambe, Mackinnon, & Stewart, 2015; Stewart & Zack, 2008). Adapted from the drinking literature, the GMQ is limited in its scope of motives specific to gamblers, particularly around financial and charitable motives (Dechant & Ellery, 2011; Hodgins, 2008). An extended version of the GMQ that includes financial motives (GMQ-F; Dechant, 2014) consists of 16 items across four subscales: Coping, Enhancement, Social, and Financial motives. The GMQ-F scores have demonstrated fair to good internal consistency, good criterion validity, and factor structure in nonclinical samples (Dechant, 2014; Schellenberg, McGrath, & Dechant, 2016).
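Scoring multi-subscale questionnaires such as the IGS or TGS amounts to mapping item responses onto their subscales and averaging, which, as noted above, is tedious to do by hand for a 63-item instrument. The sketch below shows only the general pattern; the item-to-subscale mapping and the response values are entirely hypothetical and are not the published scoring key for any of these measures.

```python
# Hypothetical mapping of item numbers to subscales (NOT a published scoring key).
SUBSCALES = {
    "Negative Emotions": [1, 4, 7],
    "Urges and Temptations": [2, 5, 8],
    "Social Pressure": [3, 6, 9],
}

def score_subscales(responses):
    """Return the mean item response for each subscale, given a dict of
    {item number: response}. Higher means indicate higher-risk situations."""
    return {
        name: sum(responses[i] for i in items) / len(items)
        for name, items in SUBSCALES.items()
    }

# Example responses on a 1-4 frequency scale for items 1-9.
responses = {1: 4, 2: 2, 3: 1, 4: 3, 5: 2, 6: 1, 7: 4, 8: 3, 9: 2}
print(score_subscales(responses))
```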


The Gambling Functional Assessment (GFA; Dixon & Johnson, 2007) is another self-​report instrument designed to identify potential mechanisms maintaining an individual’s gambling behavior. The original GFA includes 20 items that can be scored into four subscales: Sensory, Attention, Escape, and Tangible. Although the proposed structure of the GFA was theoretically strong, subsequent factor analysis yielded a two-​factor solution instead of the proposed four factors (Miller, Meier, Muehlenkamp, & Weatherly, 2009). A revised version of the GFA (GFA-​R; Weatherly, Miller, & Terrell, 2011)  consists of 16 items comprising two subscales:  Positive Reinforcement and Escape. The GFA-​ R scores have demonstrated good to excellent internal consistency for the overall score and both subscales (Weatherly, Miller, Montes, & Rost, 2012; Weatherly & Terrell, 2014), good construct validity (Weatherly, Dymond, Samuels, Austin, & Terrell, 2014), and good 4-​week test–​retest reliability (Weatherly et  al., 2012). Test–​retest findings over a 12-​week period were more mixed; overall GFA-​ R scores and Positive Reinforcement subscale scores were much more reliable than Escape subscale scores. Although intended for use with clinical populations, much of the available psychometric data derive from nonclinical, university samples. To date, one study has examined the GFA-​R in a sample of probable problem and disordered gamblers (Weatherly & Terrell, 2014). The GFA-​R demonstrated good to excellent internal consistency and good construct validity in this sample. However, the authors propose scoring only 15 of the 16 scale items when using the GFA-​R with probable problem or disordered gamblers because this scale structure appeared to be a better fit (Weatherly & Terrell, 2014). Replication in independent clinical samples is required to determine if this modified structure holds. Table 19.2 also lists routinely assessing comorbid psychiatric disorders and substance use and abuse. A number of well-​validated structured assessment instruments are available for psychiatric disorders (e.g., SCID-​5; First et al., 2015b). The Alcohol Use Disorders Inventory (AUDIT; Babor, de la Fuente, Saunders, & Grant, 1992) provides a brief, 10-​item, self-​report assessment of alcohol problems. The AUDIT is most easily administered in a self-​ report version, but it can also be administered orally or via computer. The AUDIT covers three domains—​alcohol consumption, alcohol dependence, and alcohol-​related problems—​and was designed to be appropriate for use in a number of cultures and languages. The psychometric properties of this scale, including the validation of cut-​ points for identifying high-​ risk and abusive drinking, have been assessed in a broad range of populations (e.g.,

primary care, students, and emergency room patients; for a review, see Reinert & Allen, 2002), although not specifically with pathological gamblers. Less well validated but also widely used is a similar self-completion measure for other drug use, the Drug Abuse Screening Test (DAST; Skinner, 1982). There are 28-, 20-, and 10-item versions of the DAST with interpretation guidelines, although the majority of the psychometric data were derived from the longest version (Cocco & Carey, 1998). Studies with gambling samples have not been reported.

Treatment history and experience, treatment goals, and motivation are also important assessment domains that are identified in Table 19.2. Because standardized tools to assess these domains are not available, they are typically assessed through clinical interview. It is recommended that treatment goals be assessed in clear behavioral terms in which the person identifies a goal of abstinence or moderation for each type of gambling and that moderation goals be specified in terms of frequency and expenditure limits (Hodgins & Makarchuk, 2002). The setting of specific goals also facilitates the task of monitoring treatment progress. Two self-completion measures are available to assess self-efficacy: the Gambling Abstinence Self-efficacy Scale (GASS; Hodgins, Peden, & Makarchuk, 2004) and the Situational Confidence Questionnaire for Gambling (SCQG; May, Whelan, Steenbergh, & Meyers, 2003). The GASS has 21 items that parallel the temptation items of the TGS (described previously) and that are scored into the same four subscales. Scores from a sample of pathological gamblers revealed strong internal and test–retest reliability over a 3-week period (ICC = .86) and also showed evidence of predictive validity over 12 months. Higher GASS scores predicted less gambling, which is consistent with self-efficacy theory. The SCQG has 16 items, similar to the GASS items, and yields a single score. Psychometric properties of the SCQG have not been assessed in clinical samples, although internal reliability in a community sample of gamblers (α = .96) and test–retest reliability over 2 weeks with a college sample (r = .86) were good, yielding adequate ratings in Table 19.3.

Overall Evaluation

Substantial progress has been made in the development of gambling treatment planning assessment tools over a short period of time, although many gaps remain, as shown in Tables 19.2 and 19.3. Table 19.3 identifies recommended instruments that have mostly good or excellent psychometric support for these purposes, albeit


based on limited research. These include the SOGS, GAMTOMS, ASI–​GSI, IGS, and GASS. The omnibus instruments, the ASI–​ GSI and GAMTOMS, provide much potentially useful clinical information for individual clients, although the initial phases of measurement development have focused on their utility in outcome monitoring where scores are aggregated over groups of individuals. Interpretation guidelines and norms for individual scores are necessary for these scales to be optimally useful to clinicians. The gambling field has benefited from a long history of measurement in alcohol, other drug, and mental health disorders, although we need to exert caution when adopting tools from these areas, such as the AUDIT and DAST. It is also important that we establish psychometric properties and collect norms from gambling samples.

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

There is considerable variability in the focus of outcome measurement in the small, but growing, body of disordered gambling treatment efficacy trials, which makes comparison of trials challenging. In response, an expert panel of outcome researchers has provided a set of recommendations on outcome measurement (Walker et al., 2006). The panel identified three important elements in determining the effectiveness of treatment interventions: reduction in the frequency or intensity of gambling behavior, reduction in gambling-related consequences, and evidence that the reduction in gambling behavior results from the hypothesized therapeutic mechanism. This framework for reporting outcomes in gambling treatment research (known as the "Banff framework") is also instructive for clinicians because it clearly identifies two readily measured domains: gambling behavior and gambling-related consequences. The third element, measurement of process variables, will vary in focus depending on the type of intervention.

Measurement of Gambling Behavior

The Banff framework noted that the wide individual variation in types, frequency, and intensity of gambling means that any single measurement of gambling involvement is unable to capture all relevant aspects. At minimum, two specific indicators of gambling behavior are recommended for evaluation: financial losses and gambling frequency. Financial losses should be reported as net

423

expenditure (i.e., the amount of money that the individual brought to or accessed during the gambling session minus the amount left at the end of the session). Asking how much an individual “spent gambling” leads to inconsistent responses depending on the pattern of wins and losses during the gambling session, which is typically quite lengthy. Disordered slot machine gamblers, for example, report gambling sessions that are typically 5 to 8 hours in length. Net expenditure, in contrast, ignores any wins that are subsequently lost during the session. It is further recommended in the Banff framework that money lost not be normed against total personal or family income or expendable income. It is true that the same monetary loss will have different consequences for individuals of different financial means, but it is also true that individuals do not easily provide reliable reports of their financial means (Walker et  al., 2006). The attempt to normalize loss reports with financial means is apt to lead to an overall less reliable expenditure index. Because the focus in outcome monitoring is individual change over the course of time or treatment, the expenditure information does not require this adjustment in order to monitor change. Per session expenditures need to be averaged over a monthly or longer time period to reduce the variability in gambling that results from variability in access to money and gambling opportunities. Gambling behavior often varies according to employment pay schedules, for example, which can be weekly, biweekly, or monthly. The optimal time frame for summarizing expenditures has not yet been identified, although a 3-​month period is often reported in efficacy studies (e.g., Petry et al., 2006). Future research will help establish the benefits of this time frame versus a shorter (e.g., 1 month) or longer period (e.g., 6 months). Finally, the framework recommended that the expenditure measure include only forms of gambling that are causing the individual problems in order to minimize error variance. Monitoring involvement in nonproblematic types of gambling is also advisable, but it should be reported as a separate factor. The second critical indicator of gambling behavior is gambling frequency. Frequency can be measured in a variety of metrics, such as hours, number of sessions, time spent thinking about gambling, and so forth, although days of gambling appears to be the easiest for individuals to recall reliably (Hodgins & Makarchuk, 2003). As with expenditures, days are typically averaged over a time period of 1 to 3  months. The TLFB interview is one procedure for eliciting reliable expenditure and frequent reports, and it is rated as highly recommended in Table 19.4. The use of other methodologies, such as daily
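
To make these two behavioral indicators concrete, the following minimal sketch computes per-session net expenditure and days gambled over a 1-month window. The session records, field names, and simple aggregation are invented for illustration and are not taken from the chapter or from any published measure.

```python
# Illustrative only: summarizing net expenditure and gambling frequency
# from hypothetical self-reported session records for one month.
from statistics import mean

sessions = [  # hypothetical data
    {"date": "2024-03-02", "money_brought_or_accessed": 300, "money_left": 40},
    {"date": "2024-03-09", "money_brought_or_accessed": 200, "money_left": 0},
    {"date": "2024-03-23", "money_brought_or_accessed": 150, "money_left": 210},  # a winning session
]

# Net expenditure = money brought to or accessed during the session
# minus money left at the end; wins later lost within the session are ignored.
net = [s["money_brought_or_accessed"] - s["money_left"] for s in sessions]

monthly_net_expenditure = sum(net)                  # total net loss for the month
mean_expenditure_per_session = mean(net)            # average per-session net loss
days_gambled = len({s["date"] for s in sessions})   # frequency indicator (days of gambling)

print(monthly_net_expenditure, round(mean_expenditure_per_session, 2), days_gambled)
```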

TABLE 19.4 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
GAMTOMS-D | NR | G | NR | NR | G | G | NR | A | G | ✓
GAMTOMS-F | NR | G | NR | NR | G | G | NR | A | G | ✓
TLFB | NA | NA | G | A | G | G | E | E | G | ✓
ASI–GSI | NR | A | NR | G | A | A | G | G | A |
GASS | NR | G | NA | A | E | G | A | A | G | ✓
SOGS-3 | NR | G | NR | NR | A | A | NR | G | G | ✓
NODS-3 | NR | G | NR | NR | A | A | NR | A | G |
GBQ | NR | G | NR | A | NR | A | NR | NR | G |
GCI | NR | E | NR | A | A | G | NR | NR | G |
PG-YBOCS | NR | NR | G | NR | L | A | NR | G | G |

Note: GAMTOMS = Gambling Treatment Outcome Monitoring System, D = discharge questionnaire, F = follow-up questionnaire; TLFB = Timeline Followback; ASI–GSI = Addiction Severity Index–Gambling Severity Index; GASS = Gambling Abstinence Self-efficacy Scale; SOGS-3 = 3-Month Version South Oaks Gambling Screen; NODS-3 = 3-Month Version–National Opinion Research Center DSM-IV Screen for Gambling Problems; GBQ = Gamblers' Beliefs Questionnaire; GCI = Gambling Cognitions Inventory; PG-YBOCS = Yale–Brown Obsessive–Compulsive Scale Pathological Gambling Modification; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

Measurement of Gambling-Related Consequences

Table 19.2 outlined specific gambling-related consequences that are relevant for outcome monitoring as well as treatment planning. We have already reviewed two omnibus instruments, the ASI and the GAMTOMS, which cover some of these consequences. Table 19.4 provides ratings of psychometric research for purposes of outcome monitoring. The ASI provides composite scores in each of the eight assessment areas that are responsive to change and are often used in substance abuse efficacy research (McLellan et al., 1992). The composite scales, including the ASI–GSI, assess frequency of behavior, related problems, and perceived need for treatment in a 30-day window using a 0-to-1 range. Ideal outcome would involve a score of zero on the scale indicating no problems (McLellan et al., 1992), although more typically statistically significant pre- and post-treatment differences are used to demonstrate improvement. Because the scores are not pure measures of behavior, related problems, or a therapeutic mechanism, interpretation of specific scores is problematic. A score of 0.5, for example, could indicate a number of different problems. There are no interpretation guidelines for specific non-zero values, which limits the usefulness of these scores for clinicians.

The GAMTOMS includes a treatment discharge and a treatment follow-up questionnaire or interview to complement the intake assessment. The questionnaire version has been used to evaluate outcome in Minnesota state treatment programs, and its content validity for this purpose has been assessed positively by an expert panel. Psychometric evaluation of these scales is promising, albeit limited to date. The discharge questionnaire (88 items, 30 minutes) provides outcome indices in six areas: gambling frequency, stage of change, efforts at recovery, psychiatric symptoms, treatment component helpfulness, and client satisfaction. In support of construct validity, principal component analyses of the latter four of these scales, which are designed to be summed total scores, confirmed that they are unifactorial in a treatment sample that completed the self-report version and one that completed the interview version (Stinchfield, Winters, et al., 2007). Internal reliability in these same samples varied from unacceptable to excellent, but overall it is rated good (see Table 19.4). The follow-up assessment is designed to be administered after 6 to 12 months (95 items, 30–45 minutes) and provides a broader range of indicators: gambling frequency; gambling debt; stage of change; alcohol, tobacco, and other drug use frequency; post-treatment service utilization; gambling-related illegal activities; occupational problems; problem gambling severity (DSM and SOGS); financial problems; psychiatric symptoms; and general treatment outcome. Principal component analyses of five of these scales, which are designed to be summed total scores, confirmed that they are unifactorial in a treatment sample completing the interview version. Overall, the internal reliability for scores on these scales was good (Stinchfield, Winters, et al., 2007).
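
The interpretive ambiguity of composite scores noted above can be shown schematically. The toy function below is not the actual ASI–GSI scoring algorithm (whose item content and weights are not reproduced here); equal weighting of three rescaled inputs is assumed purely to illustrate how different clinical profiles can produce the same 0-to-1 value.

```python
# Schematic illustration (not the real ASI/ASI-GSI formula) of why a single
# composite score is hard to interpret: different profiles, same score.
def toy_composite(frequency: float, related_problems: float, perceived_need: float) -> float:
    """Inputs are assumed to already be rescaled to the 0-1 range;
    equal weights are an assumption for illustration only."""
    return round((frequency + related_problems + perceived_need) / 3, 2)

print(toy_composite(0.9, 0.5, 0.1))   # frequent gambling, little perceived need for treatment -> 0.5
print(toy_composite(0.1, 0.5, 0.9))   # infrequent gambling, high perceived need for treatment -> 0.5
```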


As with the ASI, norms are not provided to facilitate interpretation of these scores, although scores of zero indicate optimal functioning. The GAMTOMS incorporates both the SOGS and the DSM-IV measure as outcome indicators. Continuously scored data from these diagnostic scales are often reported in efficacy trials and could serve as benchmarks against which to compare the progress of individual patients. The Banff framework cautions against the use of these severity measures as primary outcome measures because of the ambiguity in meaning of low, but non-zero, scores. Nonetheless, these measures can act as useful secondary indicators of outcome. Most of these measures were developed to assess lifetime and past-year functioning (Hodgins, 2004) and, therefore, cannot be used for treatment monitoring with follow-up time periods shorter than 1 year. However, Wulfert et al. (2005) examined the reliability and validity of 3-month versions of the SOGS and NODS and concluded that they are potentially useful for outcome evaluation. Scores on the 3-month versions showed good internal reliability and convergent validity with gambling frequency and expenditure in a treatment sample (Wulfert et al., 2005), as well as sensitivity to change in an efficacy study (Wulfert, Blanchard, Freidenberg, & Martell, 2006). The SOGS-3 has been shown to be sensitive to change in other treatment studies, so it currently is recommended over the NODS-3 (see Table 19.4).

Measurement of Therapeutic Mechanisms

Relative to measurement of outcome, the measurement of therapeutic variables, the third element recommended in the Banff framework, is underdeveloped. Our ability to measure accurately what occurs during treatment in terms of the client, the therapist, and the therapeutic approach is limited, but it is crucial for improving treatment outcomes. Based on the growing empirical support of cognitive–behavioral treatments (Cowlishaw et al., 2012) and the interest in pharmacological efficacy, measurement in four process domains is reviewed in this chapter. Reduction of gambling in cognitive–behavioral therapy is hypothesized to be mediated by reductions in cognitive errors, increases in coping skills, and increases in self-efficacy. Reduction of gambling related to pharmacological agents (e.g., naltrexone) is thought to be related to reductions in urges to gamble.

Measurement of Cognitive Distortions, Coping Skills, and Self-Efficacy

Cognitive–behavioral therapy targets, in part, changes in cognitive distortions, coping skills, and self-efficacy.


Assessment of self-efficacy was addressed previously in relation to general treatment planning; in addition, the GASS has been shown to be sensitive to change and to mediate improvement in gambling (Peden, 2004; see Table 19.4 for relevant ratings for this purpose). To date, we are not aware of any established measures of coping skills that have been validated for gambling, although a number of similar behavioral role-play and self-completion measures are available in the alcohol field (Finney, 1995). Content validity may be an issue if these measures are adapted to gambling, given that they assess methods for coping with typical drinking situations. The coping skills targeted in cognitive–behavioral therapy for gambling disorders overlap with, but are not identical to, those targeted in alcohol use disorders.

Assessment of cognitive distortions is relevant for both treatment planning for this type of therapy and outcome monitoring, but it is a challenge because these distortions are thought to operate outside of conscious awareness (Toneatto, 1999). Theoretically, a number of assessment options exist. It is possible to observe gambling behavior to assess underlying cognitions. For example, throwing dice vigorously when a high number is desired and lightly when a lower number is desired is indicative of an illusion of control over the outcome. However, this assessment depends on an inference concerning the cognition underlying the behavior, which may limit the reliability and validity of this technique. The think-aloud method (Ericsson & Simon, 1980) provides a more direct assessment of cognitions. It requires that, after brief training, gamblers verbalize their thoughts while they are engaged in a gambling activity. These verbalizations are typically recorded, transcribed, and then examined for the presence of irrational statements. This method has been used effectively in research paradigms, and trained raters can provide reliable categorizations of cognitive distortions (Ladouceur, Gaboury, Bujold, Lachance, & Tremblay, 1991). However, reactivity is an issue that compromises validity: Once voiced, a statement may sound dubious or surprising to the participant and therefore influence his or her subsequent thoughts and actions (Stewart & Jefferson, 2007). The verbalization requirement has also been criticized as "unnatural," reflecting not cognitions but rather self-descriptions of behavior (Delfabbro & Winefield, 1999). More research on the validity and practicality of the paradigm is necessary prior to clinical use. Finally, cognitions can be assessed directly with self-report scales, which is a practical method but one that also requires individuals to report on a process that is assumed to be unconscious.

Steenbergh, Meyers, May, and Whelan (2001) developed a 21-item self-report scale measuring two factors: luck/perseverance and illusion of control. Scores on the Gamblers' Beliefs Questionnaire showed evidence of factorial validity, good internal and test–retest reliability, and some evidence of discriminative and convergent validity within student and community samples. The scale has not been validated in clinical samples and has not been shown to be sensitive to change. Holub (Holub, 2003; Holub, Hodgins, & Rose, 2007) described the Gambling Cognitions Inventory, a 40-item self-report scale that measures four categories of cognitive distortions: probability errors, magical thinking/luck, information processing biases, and illusion of control. The scale consists of two subscales: the Skill/Attitude subscale and the Luck and Chance subscale (McInnes, Hodgins, & Holub, 2014). The scores have shown excellent internal consistency in student and pathological gambling samples, as well as evidence of convergent and discriminant validity. The total score was not, however, related to the number of cognitive errors during a think-aloud task, which supports the need for more research on the validity of different assessment approaches. This scale also has not been shown to be sensitive to change related to improvement in cognitive–behavioral treatment.

Measurement of Urges

Pharmacological trials often target urges to gamble (Hollander, Begaz, & DeCaria, 1998), and these studies often include measures of overall outcome that mix urge items with behavior items (e.g., Gambling Symptom Assessment Scale; Kim et al., 2001). The Pathological Gambling Modification of the Yale–Brown Obsessive Compulsive Scale (PG-YBOCS; Hollander, DeCaria, et al., 1998), however, is a widely used scale that provides separate behavior and urge scores, as well as a total score. The PG-YBOCS interview includes five urge and five behavior items that are clinician-rated. To date, psychometric study has been very limited, but scores on the two subscales show good inter-rater reliability and the total score shows convergent validity with the SOGS and another clinical rating scale in a small clinical sample. The PG-YBOCS is also sensitive to change, as shown in a number of efficacy trials (Grant et al., 2003).

Overall Evaluation

The outcome monitoring area is more advanced than the treatment planning area because of the strong interest in the field of developing evidence-based treatments. The Banff framework is designed to encourage increased consistency among studies by recommending basic measures of gambling behavior, related problems, and therapeutic mechanisms. These same dimensions are important for clinicians. Measurement of gambling behavior (frequency and expenditure) can be done easily, reliably, and validly using the timeline interview method. Alternative methods, such as diaries and retrospective quantity–frequency reports, may also be feasible, although they have not been assessed. Measurement of gambling-related problems is less advanced, although the ASI and GAMTOMS are promising omnibus measures. On the basis of the available psychometric research, the GAMTOMS is recommended for use in Table 19.4. Omnibus measures have appeal to clinicians because they are comprehensive and do not require compiling a battery of individual measures to cover the important domains to be assessed. Two additional instruments are highly recommended in Table 19.4. The SOGS-3 provides a brief measure of severity of problems using a 3-month window. The GASS is the only instrument that measures a therapeutic mechanism that currently meets the criteria for recommendation.

CONCLUSIONS AND FUTURE DIRECTIONS

It is exciting to work in a nascent and expanding clinical area in which policy makers and treatment providers are thirsty for new information and novel ideas about organizing and delivering effective treatment. The clear advances made in the assessment and treatment of gambling problems over the past few years reflect this attention. The DSM-IV conceptualization underpins much of the clinical research that is conducted. The criteria were developed based on expert opinion and have not been subjected to extensive psychometric study to evaluate their validity. For example, the cut-off of four or more criteria in the DSM-III-R was raised to five for the DSM-IV based on expert opinion, not empirical data. However, the elimination of the illegal acts criterion and the reduction of the cut-off to four for DSM-5 were based on empirical analyses of existing data that showed improved diagnostic accuracy. The DSM criteria are indicators of extreme pathology, and items that are ideal for a diagnostic classification measure may be different from those that are ideal for a continuous severity measure, so it may be helpful to develop content- and construct-valid measures of severity that are independent from the DSM
criteria. Item selection would be based on the ability to discriminate different levels of severity versus using a DSM diagnosis or DSM-​based continuous measure as the gold standard. One of the issues that has not been addressed in this area is the heavy reliance on information obtained from the individual with the gambling problem. All of the measures identified as promising are self-​report or interview measures. There is evidence that pathological gamblers provide accurate self-​reports in the research context, in which confidentiality is emphasized and the information provided does not have personal consequences for the individual (Hodgins & Makarchuk, 2003). However, little is known about accuracy in clinical settings, in which the implications of honesty are more variable. Lying to “family members, therapist or others to conceal the extent of involvement with gambling” is one of the DSM-​5 criteria (APA, 2013) and should be expected to be the norm among individuals in treatment. Certainly such individuals are likely to withhold sensitive information until trust is established with the treatment provider (Stinchfield, Govoni, et al., 2007). Multimethod assessment, which is advisable with all clinical assessment, seems even more important with gambling disorders, although this has received little attention in the assessment literature. In the research context, family members and friends are sometimes used as collateral reporters (e.g., Hodgins, Currie, & el-​Guebaly, 2001), but family members, even if more honest, typically have less complete information than the identified gambler about his or her behavior (Hodgins & Makarchuk, 2003). Other sources of collateral information, such as bank or court records, are impractical for routine clinical use. Assessment techniques such as the “think-​aloud” procedure for cognitive distortions have potential, but they are not yet sufficiently developed for clinical use. We have noted that measurement in the field is advancing rapidly. It is imperative that we take the steps to “do it right” and set a solid empirical measurement foundation on which to conduct meaningful research and provide effective intervention. Although rapid progress has been made and a number of assessment tools are promising in terms of their diagnostic, treatment planning, and monitoring ability, very few were rated as highly recommended. In many instances, the only psychometric information that is available for other promising instruments is based on the development sample—​that is, the sample used to derive the instrument. Accuracy in measurement in this field will require not only the development of new instruments but also further psychometric research on existing instruments.


References

Afifi, T. O., Cox, B. J., Martens, P. J., Sareen, J., & Enns, M. W. (2010). The relation between types and frequency of gambling activities and problem gambling among women in Canada. Canadian Journal of Psychiatry, 55, 21–28. American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author. American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Babor, T., de la Fuente, J. R., Saunders, J., & Grant, M. (1992). The Alcohol Use Disorders Identification Test: Guidelines for use in primary health care. Geneva, Switzerland: World Health Organization. Brett, E. I., Weinstock, J., Burton, S., Wenzel, K. R., Weber, S., & Moran, S. (2014). Do the DSM-5 diagnostic revisions affect the psychometric properties of the Brief Biosocial Gambling Screen? International Gambling Studies, 14, 447–456. Cocco, K. M., & Carey, K. B. (1998). Psychometric properties of the Drug Abuse Screening Test in psychiatric outpatients. Psychological Assessment, 10, 408–414. Cox, B. J., Yu, N., Afifi, T. O., & Ladouceur, R. (2005). A national survey of gambling problems in Canada. Canadian Journal of Psychiatry, 50, 213–217. Crockford, D. N., & el-Guebaly, N. (1998). Psychiatric comorbidity in pathological gambling: A critical review. Canadian Journal of Psychiatry, 43, 43–50. Crockford, D. N., Quickfall, J., Currie, S., Furtado, S., Suchowersky, O., & el-Guebaly, N. (2008). Prevalence of problem and pathological gambling in Parkinson's disease. Journal of Gambling Studies, 24, 411–422. Cunningham-Williams, R. M., Cottler, L. B., Compton, W. M., & Spitznagel, E. L. (1998). Taking chances: Problem gamblers and mental health disorders—Results from the St. Louis Epidemiologic Catchment Area Study. American Journal of Public Health, 88, 1093–1096. Cunningham-Williams, R. M., Cottler, L. B., Compton, W. M., Spitznagel, E. L., & Ben-Abdallah, A. (2000). Problem gambling and comorbid psychiatric disorders among drug users recruited from drug treatment and community settings. Journal of Gambling Studies, 16, 347–376. Currie, S. R., Hodgins, D. C., & Casey, D. M. (2013). Validity of the Problem Gambling Severity Index interpretive categories. Journal of Gambling Studies, 29, 311–327.


Cooper, M. L., Russell, M., Skinner, J. B., & Windle, M. (1992). Development and validation of a three-​ dimensional measure of drinking motives. Psychological Assessment, 4, 123. Cowlishaw, S., Merkouris, S., Dowling, N., Anderson, C., Jackson, A., & Thomas, S. (2012). Psychological therapies for pathological and problem gambling. Cochrane Database of Systematic Reviews, 11, CD008937. Dechant, K. (2014). Show me the money: Incorporating financial motives into the Gambling Motives Questionnaire. Journal of Gambling Studies, 30, 949–​965. Dechant, K., & Ellery, M. (2011). The effect of including a monetary motive item on the Gambling Motives Questionnaire in a sample of moderate gamblers. Journal of Gambling Studies, 27, 331–​344. Delfabbro, P. H., & Winefield, A. H. (1999). Poker-​machine gambling: An analysis of within session characteristics. British Journal of Psychology, 90, 425–​439. Dickerson, M., & O’Connor, J. (2006). Gambling as an addictive behaviour. Impaired control, harm minimisation, treatment and prevention. Cambridge, UK: Cambridge University Press. Dixon, M. R., & Johnson, T. E. (2007). The Gambling Functional Assessment (GFA): An assessment device for identification of the maintaining variables of pathological gambling. Analysis of Gambling Behavior, 1, 44–​49. Eisen, S. V., Dill, D. L., & Grob, M. C. (1994). Reliability and validity of a brief patient-​report instrument for psychiatric outcome evaluation. Hospital and Community Psychiatry, 45, 242–​247. Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87, 215–​251. Ferris, J., & Wynne, H. (2001). The Canadian Problem Gambling Index:  Final report. Ottawa, Ontario, Canada: Canadian Centre on Substance Abuse. Ferris, J., Wynne, H., & Single, E. (1998). Measuring problem gambling in Canada:  Interim report to the Inter-​ Provincial Task Force on Problem Gambling. Toronto, Ontario, Canada: Canadian Inter-​Provincial Task Force on Problem Gambling. Finney, J. W. (1995). Assessing treatment and treatment processes. In J. P. Allen & M. Columbus (Eds.), Assessing alcohol problems. A guide for clinicians and researchers (pp. 123–​142). Washington, DC:  U.S. Department of Health and Human Services. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015a). Structured Clinical Interview for DSM-​5 Disorders–​ Research Version (SCID-​ RV). Arlington, VA:  American Psychiatric Publishing. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015b). Structured Clinical Interview for DSM-​5 Disorders–​Clinician Version (SCID-​CV). Arlington, VA: American Psychiatric Publishing.

Gebauer, L., LaBrie, R., & Shaffer, H. J. (2010). Optimizing DSM-​ IV-​ TR classification accuracy:  A brief biosocial screen for detecting current gambling disorders among gamblers in the general household population. Canadian Journal of Psychiatry, 55, 82–​90. Gerstein, D., Murphy, S., Toce, M., Hoffmann, J., Palmer, A., Johnson, R., . . . Hill, M. (1999). Gambling impact and behaviour study: Report of the National Gambling Impact Study Commission. Retrieved from http://​www2. norc.org/​new/​gamb-​fin.htm Goodie, A. S., MacKillop, J., Miller, J. D., Fortune, E. E., Maples, J., Lance, C. E., & Campbell, W. K. (2013). Evaluating the South Oaks Gambling Screen with DSM-​IV and DSM-​5 criteria results from a diverse community sample of gamblers. Assessment, 20, 523–​531. Grant, J. E., Kim, S. W., Potenza, M. N., Blanco, C., Ibanez, A., Stevens, L., . . . & Zaninelli, R. (2003). Paroxetine treatment of pathological gambling:  A multi-​ centre randomized controlled trial. International Clinical Psychopharmacology, 18, 243–​249. Grant, J. E., Steinberg, M. A., Kim, S. W., Rounsaville, B. J., & Potenza, M. A. (2004). Preliminary validity and reliability testing of a structured clinical interview for pathological gambling. Psychiatry Research, 128, 79–​88. Himelhoch, S. S., Miles-​ McLean, H., Medoff, D. R., Kreyenbuhl, J., Rugle, L., Bailey-​ Kloch, M., . . . Brownley, J. (2015). Evaluation of brief screens for gambling disorder in the substance use treatment setting. American Journal on Addictions, 24, 460–​466. Hodgins, D. C. (2004). Using the NORC DSM Screen for Gambling Problems (NODS) as an outcome measure for pathological gambling:  Psychometric evaluation. Addictive Behaviors, 29, 1685–​1690. Hodgins, D. C. (2008). What we see depends mainly on what we look for (John Lubbock, British anthropologist, 1834–​ 1913) [Commentary]. Addiction, 103, 1118–​1119. Hodgins, D. C., Currie, S. R., & el-​Guebaly, N. (2001). Motivational enhancement and self-​help treatments for problem gambling. Journal of Consulting and Clinical Psychology, 69, 50–​57. Hodgins, D. C., & el-​ Guebaly, N. (2000). Natural and treatment-​assisted recovery from gambling problems: A comparison of resolved and active gamblers. Addiction, 95, 777–​789. Hodgins, D. C., & el-​Guebaly, N. (2004). Retrospective and prospective reports of precipitants to relapse in pathological gambling. Journal of Consulting and Clinical Psychology, 72, 72–​80. Hodgins, D. C., & Makarchuk, K. (2002). Becoming a winner:  Defeating problem gambling. Edmonton, Alberta, Canada: Alberta Alcohol and Drug Abuse Commission.


Hodgins, D. C., & Makarchuk, K. (2003). Trusting problem gamblers:  Reliability and validity of self-​reported gambling behavior. Psychology of Addictive Behaviors, 17, 244–​248. Hodgins, D. C., Mansley, C., & Thygesen, K. (2006). Risk factors for suicide ideation and attempts among pathological gamblers. American Journal of Addiction, 15, 303–​310. Hodgins, D. C., Peden, N., & Cassidy, E. (2005). The association between comorbidity and outcome in pathological gambling: A prospective follow-​up of recent quitters. Journal of Gambling Studies, 21, 255–​271. Hodgins, D. C., Peden, N., & Makarchuk, K. (2004). Self-​efficacy in pathological gambling treatment outcome:  Development of a Gambling Abstinence Self-​ efficacy Scale (GASS). International Gambling Studies, 4, 99–​108. Hodgins, D. C., Stea, J. N., & Grant, J. E. (2011). Gambling disorders. Lancet, 378, 1874–​1884. Hodgins, D. C., Wynne, H., & Makarchuk, K. (1999). Pathways to recovery from gambling problems: Follow-​ up from a general population survey. Journal of Gambling Studies, 15, 93–​104. Hollander, E., Begaz, T., & DeCaria, C. M. (1998). Pharmacologic approaches in the treatment of pathological gambling. CNS Spectrums, 3, 72–​82. Hollander, E., DeCaria, C. M., Mari, E., Wong, C. M., Mosovich, S., Grossman, R., & Begaz, T. (1998). Short-​ term single-​blind fluvoxamine treatment of pathological gambling. American Journal of Psychiatry, 155, 1781–​1783. Holub, A. (2003). Construction of the Gambling Cognitions Inventory. Unpublished master’s thesis, University of Calgary, Calgary, Alberta, Canada. Holub, A., Hodgins, D. C., & Peden, N. E. (2005). Development of the Temptations for Gambling Questionnaire: A measure of temptation in recently quit gamblers. Addiction Research and Theory, 13, 179–​191. Holub, A., Hodgins, D. C., & Rose, K. (2007). Validation of the Gambling Cognitions Inventory on a pathological gambling sample. Final report for the Alberta Gaming Research Institute. Calgary, Alberta, Canada: University of Calgary. Kessler, R. C., Hwang, I., LaBrie, R., Petukhova, M., Sampson, N. A., Winters, K. C., & Shaffer, H. J. (2008). DSM-​IV pathological gambling in the National Comorbidity Survey Replication. Psychological Medicine, 38, 1351–​1360. Kim S. W., Grant J. E., Adson D., & Shin Y. C. (2001). Double-​blind naltrexone and placebo comparison study in the treatment of pathological gambling. Biological Psychiatry, 49, 914–​921. Kyngdon, A. (2004). Comparing factor analysis and the Rasch model for ordered response categories: An investigation


of the scale of gambling choices. Journal of Applied Measurement, 5, 398–​418. Ladouceur, R. (2005). Controlled gambling for pathological gamblers. Journal of Gambling Studies, 21, 49–​57. Ladouceur, R., Bouchard, C., Rheaume, N., Jacques, C., Ferland, F., Leblond, J., & Walker, M. (2000). Is the SOGS an accurate measure of pathological gambling among children, adolescents and adults? Journal of Gambling Studies, 16, 1–​24. Ladouceur, R., Gaboury, A., Bujold, A., Lachance, N., & Tremblay, S. (1991). Ecological validity of laboratory studies of videopoker gaming. Journal of Gambling Studies, 7, 109–​116. Lambe, L., Mackinnon, S. P., & Stewart, S. H. (2015). Validation of the Gambling Motives Questionnaire in emerging adults. Journal of Gambling Studies, 31, 867–​885. Lesieur, H. R., & Blume, S. B. (1987). The South Oaks Gambling Screen (SOGS):  A new instrument for the identification of pathological gamblers. American Journal of Psychiatry, 144, 1184–​1188. Lesieur, H. R., & Blume, S. B. (1991). Evaluation of patients treated for pathological gambling in a combined alcohol, substance abuse and pathological gambling treatment unit using the addiction severity index. British Journal of Addiction, 86, 1017–​1028. Lesieur, H. R., & Blume, S. B. (1993). Revising the South Oaks Gambling Screen in different settings. Journal of Gambling Studies, 9, 213–​223. May, R. K., Whelan, J. P., Steenbergh, T. A., & Meyers, A. W. (2003). The Gambling Self-​Efficacy Questionnaire: An initial psychometric evaluation. Journal of Gambling Studies, 19, 339–​357. McInnes, A., Hodgins, D. C., & Holub, A. (2014). The Gambling Cognitions Inventory:  Scale development and psychometric validation with problem and pathological gamblers. International Gambling Studies, 14, 410–​431. McLellan, A. T., Kushner, H., Metzger, D., & Peters, R. (1992). The fifth edition of the Addiction Severity Index. Journal of Substance Abuse Treatment, 9, 199–​213. Miller, J. C., Meier, E., Muehlenkamp, J., & Weatherly, J. N. (2009). Testing the construct validity of Dixon and Johnson’s (2007) Gambling Functional Assessment. Behavior Modification, 33, 156–​174. Peden, N. (2004). Construct validity of self-​efficacy in problem gambling. Calgary, Alberta, Canada:  University of Calgary. Petry, N. M. (2003). Validity of a gambling scale for the Addiction Severity Index. Journal of Nervous and Mental Disease, 191, 1–​9. Petry, N. M. (2005). Pathological gambling: Etiology, comorbidity, and treatment. Washington, DC:  American Psychological Association.


Petry, N. M., Ammerman, Y., Bohl, J., Doersch, A., Gay, H., Kadden, R.,  .  .  .  Steinberg, K. (2006). Cognitive–​ behavioral therapy for pathological gamblers. Journal of Consulting and Clinical Psychology, 74, 555–​567. Petry, N. M., Stinson, F. S., & Grant, B. F. (2005). Comorbidity of DSM-​IV pathological gambling and other psychiatric disorders:  Results from the National Epidemiological Survey on Alcohol and Related Conditions. Journal of Clinical Psychiatry, 66, 564–​574. Prochaska, J. O., DiClemente, C. C., & Norcross, J. C. (1992). In search of how people change:  Applications to addictive behaviors. American Psychologist, 47, 1102–​1114. Reinert, D. F., & Allen, J. P. (2002). The Alcohol Use Disorders Identification Test (AUDIT):  A review of recent research. Alcoholism: Clinical and Experimental Research, 26, 272–​279. Robins, L., Cottler, L. B., Bucholz, K., & Compton, W. M. (1996). Diagnostic Interview Schedule, Fourth Version (DISIV). Saint Louis, MO:  Washington University Press. Robson, E., Edwards, J., Smith, G., & Colman, I. (2002). Gambling decisions: An early intervention program for problem gamblers. Journal of Gambling Studies, 18, 235–​255. Rosenberg, H. (1993). Prediction of controlled drinking by alcoholics and problem drinkers. Psychological Bulletin, 113, 129–​139. Schellenberg, B. J., McGrath, D. S., & Dechant, K. (2016). The Gambling Motives Questionnaire Financial: Factor structure, measurement invariance, and relationships with gambling behaviour. International Gambling Studies, 16, 1–​16. Shaffer, H. J., Hall, M. N., & Vander Bilt, J. (1999). Estimating the prevalence of disordered gambling behavior in the United States and Canada:  A research synthesis. American Journal of Public Health, 89, 1369–​1376. Skinner, H. (1982). The Drug Abuse Screening Test. Addictive Behaviors, 7, 363–​371. Slutske, W. S. (2006). Natural recovery and treatment-​seeking in pathological gambling: Results of two U.S. national surveys. American Journal of Psychiatry, 163, 297–​302. Smith, C., Stewart, S. H., O’Connor, R. M., Collins, P., & Katz, J. (2011). Development and psychometric evaluation of a 10-​item short form inventory of gambling situations. Journal of Gambling Studies, 27, 115–​128. Spitzer, R. L., Williams, J. B., Gibbon, M., & First, M. B. (1990). Structured clinical interview for DSM-​ III-​ R. Washington, DC: American Psychiatric Publishing. Stea, J. N., Hodgins, D. C., & Fung, T. (2015). Abstinence versus moderation goals in brief motivational treatment for pathological gambling. Journal of Gambling Studies, 31, 1029–​1045. Steenbergh, T. A., Meyers, A. W., May, R. K., & Whelan, J. P. (2001). Development and validation of the Gamblers’

Beliefs Questionnaire. Psychology of Addictive Behaviors, 16, 143–​149. Stewart, S. H., & Jefferson, S. (2007). Experimental methodologies in gambling studies. In G. Smith, D. C. Hodgins, & R. J. Williams (Eds.), Research and measurement issues in gambling research (pp. 88–​ 111). New York, NY: Elsevier. Stewart, S. H., & Zack, M. (2008). Development and psychometric evaluation of a three-​dimensional Gambling Motives Questionnaire. Addiction, 103, 1110–​1117. Stinchfield, R. (2002). Reliability, validity, and classification accuracy of the South Oaks Gambling Screen (SOGS). Addictive Behaviors, 27, 1–​19. Stinchfield, R. (2003). Reliability, validity, and classification accuracy of a measure of DSM-​IV diagnostic criteria for pathological gambling. American Journal of Psychiatry, 160, 180–​182. Stinchfield, R. (2014). A review of problem gambling assessment instruments and brief screens. In D. Richards, A. Blaszczynski, & L. Nower (Eds.), Wiley-​Blackwell handbook of disordered gambling (pp. 165–​ 203). Oxford, UK: Wiley-​Blackwell. Stinchfield, R., Govoni, R., & Frisch, G. R. (2005). DSM-​IV diagnostic criteria for pathological gambling: Reliability, validity, and classification accuracy. American Journal of Addictions, 14, 73–​82. Stinchfield, R., Govoni, R., & Frisch, G. R. (2007). A review of screening and assessment instruments for problem and pathological gambling. In G. Smith, D. C. Hodgins, & R. J. Williams (Eds.), Research and measurement issues in gambling research (pp. 180–​ 217). New York, NY: Elsevier. Stinchfield, R., McCready, J., Turner, N. E., Jimenez-​ Murcia, S., Petry, N. M., Grant, J., . . . Winters, K. C. (2016). Reliability, validity, and classification accuracy of the DSM-​5 diagnostic criteria for gambling disorder and comparison to DSM-​IV. Journal of Gambling Studies, 32(3), 905–​922. Stinchfield, R., Winters, K. C., Botzet, A., Jerstad, S., & Breyer, J. (2007). Development and psychometric evaluation of the Gambling Treatment Outcome Monitoring System (GAMTOMS). Psychology of Addictive Behaviors, 21, 174–​184. Strong, D. R., Breen, R. B., Lesieur, H. R., & Lejuez, C. W. (2003). Using the Rasch model to evaluate the South Oaks Gambling Screen for use with nonpathological gamblers. Addictive Behaviors, 28, 1465–​1472. Taber, J. I., McCormick, R. A., Russo, A. M., Adkins, B. J., & Ramirez, L. F. (1987). Follow-​up of pathological gamblers after treatment. American Journal of Psychiatry, 144, 757–​761. Toce-​Gerstein, M., Gerstein, D. R., & Volberg, R. A. (2009). The NODS–​ CLiP:  A rapid screen for adult pathological and problem gambling. Journal of Gambling Studies, 25, 541–​555.


Toneatto, T. (1999). Cognitive psychopathology of problem gambling. Substance Use and Misuse, 34, 1593–​1604. Turner, N. E., Littman-​Sharp, N., Toneatto, T., Liu, E., & Ferentzy, P. (2013). Centre for Addiction and Mental Health Inventory of Gambling Situations:  Evaluation of the factor structure, reliability, and external correlations. International Journal of Mental Health and Addiction, 11, 526–​545. Volberg, R. A., Munck, I. M., & Petry, N. M. (2011). A quick and simple screening method for pathological and problem gamblers in addiction programs and practices. American Journal on Addictions, 20, 220–​227. Walker, M., Toneatto, T., Potenza, M. N., Petry, N., Ladouceur, R., Hodgins, D. C.,  .  .  .  Blaszczynski, A. (2006). A framework for reporting outcomes in problem gambling treatment research:  The Banff, Alberta Consensus. Addiction, 101, 504–​511. Weatherly, J. N., Dymond, S., Samuels, L., Austin, J. L., & Terrell, H. K. (2014). Validating the Gambling Functional Assessment-​Revised in a United Kingdom sample. Journal of Gambling Studies, 30, 335–​347. Weatherly, J. N., Miller, J. C., Montes, K. S., & Rost, C. (2012). Assessing the reliability of the Gambling Functional Assessment:  Revised. Journal of Gambling Studies, 28, 217–​223. Weatherly, J. N., Miller, J. C., & Terrell, H. K. (2011). Testing the construct validity of the Gambling Functional Assessment-​Revised. Behavior Modification, 35, 553–​569. Weatherly, J. N., & Terrell, H. K. (2014). Validating the Gambling Functional Assessment-​Revised in a sample of probable problem/​disordered gamblers. Analysis of Gambling Behavior, 8, 39–​52. Weinstock, J., Whelan, J. P., & Meyers, A. W. (2004). Behavioral assessment of gambling:  An application of the Timeline Followback method. Psychological Assessment, 16, 72–​80.


Welte, J., Barnes, G., Wieczorek, W., Tidwell, M., & Parker, J. (2001). Alcohol and gambling pathology among U.S. adults: Prevalence, demographic patterns and comorbidity. Journal of Studies on Alcohol, 62, 706–​712. Wickwire, E. M., Burke, R. S., Brown, S. A., Parker, J. D., & May, R. K. (2008). Psychometric evaluation of the National Opinion Research Center DSM-​IV Screen for Gambling Problems (NODS). American Journal on Addictions, 17, 392–​395. Williams, R. J., & Volberg, R. A. (2010). Best practices in the population assessment of problem gambling. Report prepared for the Ontario Problem Gambling Research Centre, Guelph, Ontario, Canada. Williams, R. J., & Volberg, R. A. (2014). The classification accuracy of four problem gambling assessment instruments in population research. International Gambling Studies, 14, 15–​28. Winters, K. C., Specker, S., & Stinchfield, R. (2002). Measuring pathological gambling with the Diagnostic Interview for Gambling Severity (DIGS). In J. J. Marotta, J. A. Cornelius, & W. R. Eadington (Eds.), The downside: Problem and pathological gambling (pp. 143–​148). Reno, NV: University of Nevada, Reno. Wulfert, E., Blanchard, E. B., Freidenberg, B., & Martell, R. (2006). Retaining pathological gamblers in cognitive–​ behavioral therapy through motivational enhancement. Behavior Modification, 30, 315–​340. Wulfert, E., Hartley, J., Lee, M., Wang, N., Franco, C., & Sodano, R. (2005). Gambling screens:  Does shortening the time frame affect their psychometric properties? Journal of Gambling Studies, 21, 521–​536. Yakovenko, I., & Hodgins, D. C. (2016). Latest developments in treatment for disordered gambling: Review and critical evaluation of outcome studies. Current Addiction Reports, 3, 299–​306.

Part VI

Schizophrenia and Personality Disorders


20

Schizophrenia

Shirley M. Glynn
Kim T. Mueser

Schizophrenia is a major mental illness characterized by psychosis, apathy and social withdrawal, and cognitive impairment, which often results in impaired functioning in the areas of work, school, parenting, self-care, independent living, interpersonal relationships, and leisure time. Among psychiatric disorders, schizophrenia is the most disabling, and its treatment requires a disproportionate share of mental health services. For example, people with schizophrenia and other nonaffective psychotic disorders accounted for approximately 33.7% of all U.S. Medicare-paid psychiatric hospitalizations in 2012 and 2013 (Winterstein et al., 2016). The combined economic and social costs of schizophrenia place it among the world's top 10 causes of disability-adjusted life-years (Murray & Lopez, 1996), accounting for an estimated 2.3% of all burdens in developed countries and 0.8% in developing economies (Institute of Medicine, 2001). Because of the sometimes pervasive impact of schizophrenia across the full range of life domains, assessment is necessarily broad, ranging from basic psychopathology to cognitive functioning to social and community functioning. In this chapter, we describe standardized assessment instruments for diagnosis, treatment planning, and monitoring outcomes of persons with schizophrenia spectrum disorders, including schizoaffective disorder and schizophreniform disorder. We begin with a brief description of schizophrenia, including diagnosis, clinical presentation and associated features, epidemiology, and etiology. This is followed by discussion of the purposes of assessment and then consideration of specific instruments for assessing diagnosis and specific domains of functioning commonly impaired in schizophrenia.

THE NATURE OF SCHIZOPHRENIA

Modern conceptualizations of schizophrenia are based on the work of Kraepelin (1919/​1971), who focused on the long-​ term deteriorating course of the illness, and Bleuler (1911/​ 1950), who defined the core symptoms of the disorder as difficulties thinking clearly (loosening of associations), incongruous or flattened affect, loss of goal-​directed behavior or ambivalence due to conflicting impulses, and retreat into an inner world (autism). Although the prognosis in these early conceptualizations of schizophrenia was often posited to be bleak, more recent research highlights the potential for remission in schizophrenia (Ciompi & Muller, 1976; Harding, Brooks, Ashikaga, Strauss, & Breier, 1987), as well as the benefits of early treatment (Kane et al., 2016) and the possibility of having occupational success even when living with the disorder (Cohen et al., 2017). The availability of newer, more effective treatments makes attention to assessments that yield accurate diagnoses and lead to well-​developed treatment plans particularly timely. The two major diagnostic systems for schizophrenia in common use are the 10th revision of the International Classification of Diseases (ICD-​ 10; World Health Organization, 1992) and the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​5; American Psychiatric Association [APA], 2013). Both systems objectively define symptoms and characteristic impairments of schizophrenia in a similar fashion, and both have improved the reliability of diagnostic assessments compared to more subjectively based approaches. The major differences between the systems are the DSM-​ 5 requirements of social or occupational dysfunction (not

included in ICD-10); the 6-month duration of illness (vs. 1 month for ICD-10), resulting in a somewhat narrower definition of the disorder in DSM-5; and the retention of subtypes of the disorder in ICD-10 but not DSM-5. The stability of diagnosis over time is moderate, with most variability immediately following onset of the disorder; 21% to 30% of people treated for a first episode have no symptom relapses during the next 5 years (Häfner & an der Heiden, 2003).

Symptoms and Associated Impairments

Schizophrenia is characterized by three broad types of symptoms: psychotic symptoms, negative symptoms, and cognitive impairment (Liddle, 1987; Tandon, Nasrallah, & Keshavan, 2009). Psychotic (or positive) symptoms involve the loss of contact with reality, including false beliefs (delusions), perceptual experiences not shared by others (hallucinations), or bizarre behaviors. A variety of different types of hallucinations occur in schizophrenia, including auditory, visual, olfactory, gustatory, or tactile hallucinations, with auditory hallucinations most common. Common delusions in schizophrenia include persecutory delusions, delusions of control (e.g., the belief that others can interfere with one's thoughts or behaviors), grandiose delusions (e.g., the belief that one is Jesus Christ), and somatic delusions (e.g., the belief that one's brain is rotting away). The presence and severity of psychotic symptoms tend to be episodic over time but are persistent in a subgroup of persons following onset of the disorder (Friis et al., 2016).

Negative symptoms are characterized by the relative absence or paucity of cognitive, emotional, and behavioral processes. Common negative symptoms include blunted affect (e.g., immobile facial expression and monotonous voice tone), anhedonia (lack of pleasure), avolition or apathy (diminished ability to initiate and follow through on plans), and alogia (reduced quantity or content of speech). Although negative symptoms may have a variable trajectory in the early course of schizophrenia (Gee et al., 2016), over the long term they tend to be more persistent than psychotic symptoms (Fenton & McGlashan, 1991) and are strongly associated with poor psychosocial functioning (Rabinowitz et al., 2012). Because it is less readily apparent to others that negative symptoms are manifestations of a psychiatric illness, people with high levels of negative symptoms are often perceived by relatives and others to be lazy and willfully unengaged in bettering their lives (Weisman, Nuechterlein, Goldstein, & Snyder, 1998).

The defining symptoms of schizophrenia are frequently accompanied by negative emotions, including depression (Bosanac & Castle, 2012), anxiety (Achim et  al., 2011), and anger or hostility (Witt, Hawton, & Fazel, 2014). The lifetime risk of completed suicide in schizophrenia is estimated to be approximately 5% (Inskip, Harris, & Barraclough, 1998). There is also a modest increase in violence in schizophrenia relative to the general population, with different phenotypes corresponding to whether the aggression appears before the onset of the disorder, coinciding with the onset of symptoms, or following many years of symptoms (Hodgins, Piatosa, & Schiffer, 2014). Cognitive impairment is a common feature associated with schizophrenia that encompasses problems in attention and concentration, psychomotor speed, learning and memory, and executive functions such as abstract thinking, planning, and problem solving (Harvey, 2013). Lower levels of premorbid intelligence compared to those of other family members increase the risk of developing schizophrenia (Kendler, Ohlsson, Mezuk, Sundquist, & Sundquist, 2016), and a decline in cognitive abilities frequently precedes the onset of schizophrenia by several years (MacCabe et al., 2013; Meier et al., 2014). Despite this decline, some clients’ cognitive functioning is in the normal range. Similar to negative symptoms, cognitive impairment tends to be relatively stable over time (Dickerson et  al., 2014)  and is strongly associated with functional impairment, including community living and work (Green, Llerena, & Kern, 2015; McGurk & Mueser, 2004). Impaired role functioning or significant change in self-​care are also included as diagnostic criteria for schizophrenia. Problems in these areas include reduced ability to work, attend school, parent, have close relationships, attend to one’s grooming and hygiene, and enjoy one’s leisure time, with difficulties often emerging several years before frank psychotic symptoms (Häfner & an der Heiden, 2008). Impairment in functioning is sometimes pronounced, resulting in the need for disability entitlements (when available) and extensive assistance with meeting daily living needs such as housing, medical care, food, and clothing. Improving functioning remains the most important challenge for the management of schizophrenia. Impairment in functioning tends to be relatively stable over time in schizophrenia, with some improvements over the long term, including some partial or complete symptom remissions (Harding & Keller, 1998).


In addition to symptoms and impaired role functioning, schizophrenia affects many other areas of living. People are at increased risk for alcohol and drug problems (Thoma & Daum, 2013), infectious diseases such as hepatitis C (Rosenberg et al., 2001), violent victimization (Khalifeh et al., 2015; Roy, Crocker, Nicholls, Latimer, & Reyes Ayllon, 2014) and post-traumatic stress disorder (PTSD; Grubaugh, Zinzow, Paul, Egede, & Frueh, 2011; Mueser, Rosenberg, Goodman, & Trumbetta, 2002), housing instability and homelessness (Aubry et al., 2016), and tobacco use and related illnesses (Correll et al., 2014). The net result of exposure to these risks is a sharply increased rate of premature mortality (Gale et al., 2012).

Epidemiology

The annual incidence of schizophrenia is 0.2 to 0.4 per 1,000, with lifetime prevalence (risk) of approximately 1% (Jablensky, 1997). The incidence of schizophrenia is the same across genders, although women tend to have a later age of onset compared to men (Murray & Van Os, 1998) and also a more benign course of illness, including fewer hospitalizations and better social functioning (Angermeyer, Kuhn, & Goldstein, 1990). The later age of onset in women is associated with higher attainment of pre-illness social role functioning, which confers a better outcome (Häfner, 2000). The incidence of schizophrenia is approximately 15.2 per 100,000 people per year (McGrath et al., 2004), with variations reported across different locations that are reduced when standardized diagnostic criteria are used (Jablensky, 1997). The lifetime prevalence of schizophrenia has been estimated to be 0.7%, with again some variation across studies (Saha, Chant, Welham, & McGrath, 2005). Overall, research indicates that the often cited 1% prevalence of schizophrenia is approximately accurate, but it may be somewhat of an overestimate of the true prevalence of the disorder (Perkins & Lieberman, 2012).

Etiology

A variety of biological and socio-environmental risk factors interact dynamically to contribute to the development of schizophrenia (Uher, 2014; van Os & Kapur, 2009). Family history of schizophrenia is a significant risk factor, indicating a role for genetic vulnerability. Although in the general population the risk of someone
developing schizophrenia is approximately 1%, this risk is increased to 10% for people with a first-​degree relative who has the disorder and to 50% for people with an identical twin (McGuffin, Owen, & Farmer, 1996). However, exposure to psychological trauma in childhood and other adversities increases the risk of developing schizophrenia (Okkels, Trabjerg, Arendt, & Pedersen, 2017; Varese et al., 2012), and shared trauma within families, as well as between identical versus fraternal twins, may explain some of the family associations in the development of the disorder that have traditionally been ascribed to genetic factors (Fosse, Joseph, & Riochardson, 2015). The risk of schizophrenia is also increased by a variety of perinatal complications. Maternal infection (Brown et al., 2004), starvation (Hoek, Brown, & Susser, 1998), and exposure to stressful events (Khashan et al., 2008) are all associated with increased risk of schizophrenia. Obstetric complications such as anoxia or forceps delivery also significantly increase the chances of developing schizophrenia (Cannon, Jones, & Murray, 2002). Socio-​environmental factors associated with increased risk of schizophrenia include being born and raised in an urban setting (Saha et al., 2005), immigration from another country (Cantor-​ Graae, Zolkowska, & McNeil, 2005), poverty and lower social class (Eaton, 1994), and ethnic or cultural minority status (Boydell et  al., 2001). Cannabis use, especially before age 15  years, has been linked to the subsequent development of schizophrenia, and onset at an earlier age, in several national birth cohort studies (Miller et al., 2009; Radhakrishnan, Wilkinson, & D’Souza, 2014). The combined effects of biological and socio-​ environmental risk factors for schizophrenia have led to the general theory that schizophrenia is a neurodevelopmental disorder (Allin & Murray, 2002; Weinberger & Marenco, 2003). Altered brain development early in life is hypothesized to interact with subsequent environmental stress to result in the disorder emerging in late adolescence or early adulthood. Neurodevelopmental theories of the etiology of schizophrenia are compatible with the stress–​vulnerability model, which proposes that the course of the disorder is influenced by a similar dynamic interaction between biological and environmental factors (Nuechterlein & Dawson, 1984; Zubin & Spring, 1977).

PURPOSES OF ASSESSMENT

Assessment in schizophrenia serves a number of distinct purposes. First, because the diagnosis of a schizophrenia

spectrum disorder has important treatment implications, especially with regard to pharmacological management, a careful assessment is necessary to ensure accurate identification of the disorder. Aside from undetected substance abuse or medical conditions that can lead to common symptoms of schizophrenia, there is a great overlap with the symptoms of bipolar disorder and major depression. The primary distinction between schizophrenia and mood disorders is made based on the course and co-​ occurrence of different symptoms (e.g., the absence of psychotic symptoms in people with a mood disorder when depression or mania are absent), which requires accurate historical information and sound clinical judgment. Second, assessment serves a critical purpose in identifying treatment needs and informing treatment planning. Although it was once thought that schizophrenia led to irreversible deterioration (Kraepelin, 1919/​1971), it is now clear that comprehensive interventions, grounded in a wide-​ ranging and thorough assessment, can dramatically improve outcomes. In addition to the complex of symptoms present in schizophrenia, and its impact on role functioning, social relationships, and self-​care, other comorbidities are often present, including psychiatric, substance abuse, and medical disorders. In this chapter, we focus mainly on the assessment of symptoms and functioning for treatment planning, with only brief attention to comorbid substance abuse. Third, assessment is necessary in order to monitor the effects of treatments. Ongoing evaluation of targeted areas for treatment is critical in order to know whether alternative approaches are necessary and when treatment goals have been achieved. Numerous different treatments may impact on specific symptoms and areas of functioning, and thus many alternatives exist if treatment targets have not improved sufficiently. In light of the sometimes pervasive nature of the deficits in schizophrenia, it is not surprising to observe that assessment of many other domains may be critical to the development of an accurate diagnosis and treatment plan in schizophrenia. Although important, they are beyond the scope of this chapter. For example, we do not cover the assessment of cognitive impairment (Sharma & Harvey, 2000), but at a minimum recommend employing a brief cognitive screen to evaluate cognitive functioning (Gold, Queern, Iannone, & Buchanan, 1999; Keefe et al., 2004), followed up by a more comprehensive neuropsychological assessment if prominent deficits are identified. We also do not describe the assessment of medical disorders, but considering the high rates of medical comorbidity in schizophrenia (Janssen, McGinty, Azrin, Juliano-​Bult, & Daumit,

2015), we recommend arranging for all clients to have a physical examination. Similarly, we do not address the assessment of health risk behaviors, such as smoking (Evins et  al., 2014)  and unprotected sex (Carey, Carey, Maisto, Gordon, & Vanable, 2001), but due to the high rate of nicotine addiction and infectious diseases in this population, we recommend routine assessment of these and other health-​ related behaviors (e.g., diet) in all clients using standard approaches developed for the general population.

ASSESSMENT FOR DIAGNOSIS

Diagnostic assessment for schizophrenia spectrum disorders involves obtaining a broad range of information that includes subjective states (e.g., hallucinations and delusions); behavioral observation (e.g., blunted affect and bizarre behavior); and reports about functioning in areas such as social relationships, work or school, and self-​ care. Because the diagnosis of schizophrenia in DSM-​5 requires ascertaining whether the duration of impaired functioning has been 6  months or longer, historical information must also be obtained. Although much of the information required to establish a diagnosis can be obtained by directly interviewing the client, the lack of insight characteristic of the illness (Amador & Gorman, 1998)  often necessitates obtaining supplementary information. Such information can usually be obtained from relatives, other treatment providers, and medical records, and it is most useful for determining the presence of psychotic symptoms or problems in functioning. Historically, the diagnosis of schizophrenia was unreliable (Matarazzo, 1983)  before objective criteria were established by DSM-​III (APA, 1980), and the disorder was frequently overdiagnosed (Kuriansky, Deming, & Gurland, 1974). With the clearer specification of diagnostic criteria for schizophrenia in the DSM series, more reliable diagnostic assessment became possible. However, even with these objective criteria, the reliability of diagnoses is greatest when it is established using a structured clinical interview to probe for symptoms in a systematic fashion. The ratings of the psychometric properties of these interviews are presented in Table 20.1. The most widely used standardized instrument for diagnostic interviewing is the Structured Clinical Interview for DSM-​ 5 (SCID-​ 5; First, Williams, Karg, & Spitzer, 2015a, 2015b). The SCID has demonstrated excellent reliability and validity for the diagnosis of schizophrenia, although considerable training and clinical

TABLE 20.1 Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
DIS | E | NA | E | E | E | G | G | A |
M.I.N.I. | E | E | E | E | E | G | G | E |
SCID | E | NA | E | E | E | E | E | A |
Comorbidities
CSDS | E | E | E | E | E | E | E | E | ✓
CAPS-S | G | E | E | E | E | E | E | E | ✓

Note: DIS = Diagnostic Interview Schedule; M.I.N.I. = Mini-International Neuropsychiatric Interview; SCID = Structured Clinical Interview for DSM-IV; CSDS = Calgary Depression Scale for Schizophrenia; CAPS-S = Clinician Administered Rating Scale for Post-Traumatic Stress Disorder–Schizophrenia; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

interviewing experience are required to administer it, and it is time-consuming to conduct, often requiring 1 to 2 hours or more to complete. The SCID is very comprehensive, permits a variety of psychiatric diagnoses to be made from the same interview, and is most often used as a research instrument. Shortened versions of earlier editions of the SCID have been developed, such as the PRIME MD (Spitzer et al., 1994), but their reliability and validity for assessing relatively low-frequency disorders such as schizophrenia remain uncertain. An alternative to the SCID, the Mini-International Neuropsychiatric Interview (MINI; Sheehan et al., 1998), is much briefer, acceptable to patients, and can easily be used in most clinical office settings, although its sensitivity and specificity with psychosis are not as strong as those of the SCID (Amorim, Lecrubier, Weiller, Hergueta, & Sheehan, 1998). Two broad diagnostic measures have been developed that can be administered by lay persons with training. The Diagnostic Interview Schedule (DIS) (Robins, 1995) was designed primarily for use in large-scale epidemiological research studies, and it lacks the sensitivity and specificity necessary for use in clinical settings. The DIS requires less training than the SCID, and it can usually be administered in less than 1 hour, but it has also demonstrated lower reliability and validity (Malgady, Lloyd, & Tryon, 1992). A more recent alternative to the DIS is the World Health Organization (WHO) World Mental Health (WMH) Composite International Diagnostic Interview (CIDI) (Kessler & Ustün, 2004), which is a refinement and expansion of the original CIDI (Robins et al., 1988) that was based on the DIS. The original CIDI was designed to align with ICD diagnoses to permit international research but did not include a section on psychosis. The WHO WMH-CIDI does include a section on psychosis, but validity studies often omit that diagnosis

because of its low frequency (Haro et  al., 2006), so the sensitivity and specificity of the measure for schizophrenia are difficult to determine. An intensive training program is offered by the developers. It is well established that cultural factors can influence the interpretation of, reaction to, and expression of common symptoms in schizophrenia (Jacob, Johnson, Prince, Bhugra, & David, 2007; Jenkins & Barrett, 2004). Efforts have focused on establishing a framework for organizing cultural information pertaining to establishing a psychiatric diagnosis, such as the Outline for Cultural Formulation in DSM-​IV (APA, 1994). More recently, work has focused on establishing standardized guidelines for obtaining and understanding culturally relevant information obtained in the context of diagnostic assessment, including the Cultural Formulation Interview (CFI) presented in DSM-​5 (Lewis-​Fernández et al., 2014). Although progress in this area is promising, the available data on the CFI indicate that it is too preliminary to incorporate in the current review (Aggarwal et  al., 2014; Aggarwal, Nicasio, DeSilva, Boiler, & Lewis-​Fernández, 2013). Assessment of Co-​occurring Disorders Comorbidities are common in schizophrenia, with rates of co-​occurring substance abuse, depression, and PTSD being most prevalent (Buckley, Miller, Lehrer, & Castle, 2009). Typically, the identification of a co-​occurring disorder in schizophrenia relies on the clinical acumen of the assessor, often guided by the specificity of instruments such as the SCID, which permits the diagnosis of several independent disorders based on multiple modules conducted in a single interview. However, discerning whether a symptom reflects schizophrenia or another disorder can be

complicated and requires clinical judgment. When does social anxiety become paranoia? Is hearing the voice of a deceased loved one a reflection of grief or a hallucination? The growing recognition of the prevalence of comorbidities has prompted the development of scales designed to diagnose co-​occurring disorders in the presence of schizophrenia. These scales are particularly useful in clarifying the diagnostic picture when symptoms of two or more disorders may present similarly. Two of the most commonly used scales are the Calgary Scale for Depression in Schizophrenia (CSDS) (Addington, Addington, & Maticka-​Tyndale, 1993) and the Clinicians Administered PTSD Scale–​Schizophrenia (CAPS-​S) (Gearon, Bellack, & Tenhula, 2004). The CSDS is designed to permit assessment of depressive symptoms independent from positive, negative, and extrapyramidal symptoms in people with schizophrenia, using a nine-​item semi-​structured interview. Items are scored on a 0 to 3 scale, with a total score greater than 6 having 82% specificity and 85% sensitivity for predicting the presence of a major depressive episode (Addington, Addington, & Maticka-​Tyndale, 1994). Trauma rates are high in individuals with serious psychiatric illness (Goodman, Rosenberg, Mueser, & Drake, 1997; Monahan, Vesselinov, Robbins, & Appelbaum, 2017), raising the likelihood that many of these individuals are experiencing symptoms of PTSD (Grubaugh et al., 2011). To permit more accurate diagnosis of PTSD in persons with schizophrenia, Gearon et al. revised the wording of items to the original CAPS (Blake et al., 1995) by reducing the reading level, adding behavioral definitions and anchors, and providing probe examples more relevant to the lives of individuals diagnosed with a more serious psychiatric illness. The CAPS-​S has strong psychometric properties, and it can be a useful tool in determining whether individuals presenting with psychotic symptoms also have concurrent PTSD. Overall Evaluation Schizophrenia is a complex illness to diagnose that overlaps with many other major mental illnesses, especially major mood disorders. Although in the clinic and hospital setting most schizophrenia diagnoses are made based on a clinical interview, research instruments such as the SCID and MINI are reliable and well-​validated tools to improve diagnostic accuracy. Comorbidities in schizophrenia are common; more recent work has involved developing instruments that can be used to diagnose common comorbidities in schizophrenia.
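To make the CSDS scoring rule described above concrete, here is a minimal sketch in Python. Only the nine-item length, the 0-to-3 item range, and the greater-than-6 cutoff come from the scale description; the item labels and the example ratings are hypothetical placeholders, not the published item wording.

```python
# Minimal sketch of CSDS total scoring and the >6 screening cutoff described above.
# Item labels are hypothetical placeholders; only the nine-item length, the 0-3 item
# range, and the >6 cutoff come from the scale description in the text.

from typing import Dict

CSDS_ITEMS = [f"item_{i}" for i in range(1, 10)]  # nine semi-structured interview items

def csds_total(ratings: Dict[str, int]) -> int:
    """Sum the nine CSDS item ratings (each scored 0-3)."""
    missing = [item for item in CSDS_ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"Missing CSDS ratings: {missing}")
    for item in CSDS_ITEMS:
        if not 0 <= ratings[item] <= 3:
            raise ValueError(f"{item} must be rated 0-3, got {ratings[item]}")
    return sum(ratings[item] for item in CSDS_ITEMS)

def csds_screens_positive(ratings: Dict[str, int]) -> bool:
    """Apply the total-score-greater-than-6 cutoff reported for detecting a major depressive episode."""
    return csds_total(ratings) > 6

if __name__ == "__main__":
    example = {f"item_{i}": score for i, score in enumerate([2, 1, 1, 0, 1, 2, 0, 1, 0], start=1)}
    print(csds_total(example), csds_screens_positive(example))  # 8, True
```

In practice the ratings would come from the semi-structured interview itself; the sketch only shows how the reported cutoff would be applied once the nine item scores are in hand.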

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

In this section, we describe the assessment of symptoms, medication adherence, community functioning, subjective appraisal, family attitudes, and comorbid substance abuse for the purposes of case conceptualization and treatment planning. The ratings of the psychometric properties of these tools are presented in Table 20.2. As we describe subsequently, many of these same assessment tools can also be used to monitor symptoms and evaluate treatment outcomes.

Symptoms

Two semi-structured interview instruments with well-established evidence of reliability and validity are widely used for the assessment of symptoms of schizophrenia: the Brief Psychiatric Rating Scale (BPRS; Lukoff, Nuechterlein, & Ventura, 1986; Overall & Gorham, 1962) and the Positive and Negative Syndrome Scale (PANSS; Kay, Opler, & Fiszbein, 1987). The BPRS and PANSS cover a broad range of symptoms commonly present in schizophrenia. The instruments include specific interview probes, clearly elucidated descriptions of target symptoms, and behaviorally anchored 5- to 7-point rating scales for scoring the presence and severity of symptoms. The PANSS includes 30 items, of which the first 18 were drawn from the original version of the BPRS (Overall & Gorham, 1962). Following the development of the PANSS, an additional 6 items were developed for the BPRS, referred to as the Expanded BPRS (Lukoff et al., 1986). Each of these measures requires 25 to 40 minutes to complete. The BPRS was designed as a general psychiatric rating scale for the broad range of symptoms present in severe mental illnesses, whereas the PANSS was developed to specifically tap the symptoms of schizophrenia. Factor analyses of the BPRS have most frequently identified either four or five symptom dimensions (Long & Brekke, 1999; Mueser, Curran, & McHugo, 1997; Shafer, 2005), corresponding to thought disorder, anergia (negative symptoms), anxiety–depression, disorganization, and activation. As expected, because of the overlap in symptoms between the BPRS and the PANSS, very similar factor structures have been identified for the PANSS (Mueser et al., 1997; van der Gaag et al., 2006; Velligan et al., 2005; Wallwork, Fortgang, Hashimoto, Weinberger, & Dickinson, 2012).
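As a rough illustration of how anchored item ratings on instruments such as the BPRS or PANSS can be aggregated into the symptom dimensions just described, the sketch below averages item ratings within dimensions. The item-to-dimension mapping and the 1-to-7 rating range shown here are illustrative assumptions, not the published factor solutions or scoring manuals.

```python
# Illustrative sketch of turning BPRS-style item ratings into symptom-dimension scores.
# The item-to-dimension mapping below is a hypothetical placeholder, not the published
# factor solution; only the idea of anchored item ratings grouped into dimensions such
# as thought disorder, anergia, anxiety-depression, and activation is taken from the text.

from statistics import mean
from typing import Dict, List

HYPOTHETICAL_DIMENSIONS: Dict[str, List[str]] = {
    "thought_disorder": ["unusual_thought_content", "hallucinations", "conceptual_disorganization"],
    "anergia": ["blunted_affect", "emotional_withdrawal", "motor_retardation"],
    "anxiety_depression": ["anxiety", "depressive_mood", "guilt_feelings"],
    "activation": ["excitement", "tension"],
}

def dimension_scores(item_ratings: Dict[str, int]) -> Dict[str, float]:
    """Average the anchored 1-7 item ratings within each (hypothetical) dimension."""
    scores = {}
    for dimension, items in HYPOTHETICAL_DIMENSIONS.items():
        ratings = [item_ratings[item] for item in items if item in item_ratings]
        if any(not 1 <= r <= 7 for r in ratings):
            raise ValueError("BPRS-style items are assumed here to be rated on anchored 1-7 scales")
        scores[dimension] = mean(ratings) if ratings else float("nan")
    return scores

if __name__ == "__main__":
    baseline = {"unusual_thought_content": 5, "hallucinations": 4, "conceptual_disorganization": 3,
                "blunted_affect": 4, "emotional_withdrawal": 4, "motor_retardation": 2,
                "anxiety": 3, "depressive_mood": 2, "guilt_feelings": 1,
                "excitement": 2, "tension": 3}
    print(dimension_scores(baseline))
```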

TABLE 20.2 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility

Symptoms
BPRS | A | G | E | A | A | G | E | A
PANSS | A | G | E | A | A | E | E | A
SANS | A | G | E | A | A | E | E | A
CAINS | E | E | E | E | E | E | E | G
BNSS | E | E | E | E | E | E | E | E
BASIS-R | G | G | NA | A | G | G | G | G
CSI | G | G | NA | A | G | G | G | G

Medication Adherence
ROMI | A | A | G | A | A | A | A | A
DAI | A | A | NA | A | G | G | G | G

Community Functioning
CASIG | A | A | E | A | E | A | G | A
CAN | A | G | E | G | E | G | E | G
ILSS | A | G | G | A | G | A | A | A
MCAS | A | A | G | A | A | G | G | A
QLS | A | E | E | E | A | G | E | A
SAFE | A | G | G | A | A | A | E | A
SAS-II | A | G | E | A | A | G | E | A
SBS | A | E | E | E | A | A | A | A
SFS | A | G | E | A | G | G | E | A
SF-36 | E | G | NA | A | A | G | E | A
SLOF | A | G | G | G | E | E | E | G
MIRECC-GAF | A | E | E | NR | E | E | E | E

Subjective Appraisal
MHRM | A | G | NA | A | A | A | A | A
QOLI | E | G | NA | A | G | E | E | A
RAS | A | G | NA | A | G | A | E | A
TL-30S | A | G | NA | A | A | A | E | A
IMR (Client) | E | G | NA | E | E | E | G | E
SSMI | G | G | NA | E | E | E | E | G
ISMI | G | G | NA | E | E | E | E | G
SS | G | E | NA | E | G | E | G | E
PSYRATS | E | E | E | E | E | E | E | E

Family Attitudes
PRS | A | G | NA | A | A | A | A | A
BAS | A | G | NA | NR | G | G | A | G

Substance Abuse
ASI | E | G | E | A | E | E | E | A
AUDIT | E | G | NA | A | G | G | E | A
DAST | E | G | NA | A | A | A | E | A
MAST | E | G | NA | A | A | A | E | A
SATS | A | NA | E | A | A | G | G | A
TLFB | E | NA | E | A | A | G | E | A

Note: BPRS = Brief Psychiatric Rating Scale; PANSS = Positive and Negative Syndrome Scale; SANS = Scale for Assessment of Negative Symptoms; CAINS = Clinical Assessment Interview for Negative Symptoms; BNSS = Brief Negative Symptoms Scale; BASIS-R = Revised Behavior and Symptom Identification Scale; CSI = Colorado Symptom Index; ROMI = Rating of Medication Influences; DAI = Drug Attitudes Inventory; CASIG = Client's Assessment of Strengths, Interests and Goals (both client and informant versions); CAN = Camberwell Assessment of Need (both clinician and client versions); ILSS = Independent Living Skills Survey (both client and informant versions); MCAS = Multnomah Community Ability Scale; QLS = Quality of Life Scale; SAFE = Social Adaptive Functions Scale; SAS-II = Social Adjustment Scale-II; SBS = Social Behavior Scale; SFS = Social Functioning Scale; SF-36 = Short Form-36 Health Survey; SLOF = Specific Level of Functioning; MIRECC-GAF = Mental Illness Research Education and Clinical Center Global Assessment of Functioning; MHRM = Mental Health Recovery Measure; QOLI = Quality of Life Interview; RAS = Recovery Assessment Scale; TL-30S = Quality of Life Interview Self-Administered Short Form; IMR = Illness Management and Recovery; SSMI = Self Stigma of Mental Illness; ISMI = Internalized Stigma of Mental Illness; SS = Stigma Scale; PSYRATS = Psychotic Symptom Rating Scale; PRS = Patient Rejection Scale; BAS = Burden Assessment Scale; ASI = Addiction Severity Index; AUDIT = Alcohol Use Identification Test; DAST = Drug Abuse Screening Test; MAST = Michigan Alcoholism Screening Test; SATS = Substance Abuse Treatment Scale; TLFB = Timeline Followback Calendar; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

As the presence of negative symptoms in schizophrenia has become increasingly prominent in conceptualizations of the disorder (Tandon et al., 2009), there has been a concomitant emphasis on enhancing accurate measure of this symptom constellation. The primary scales in use for negative symptoms are assessor administered in a semi-​structured interview, although self-​report measures are emerging (Dollfus, Mach, & Morello, 2015). The Scale for the Assessment of Negative Symptoms (SANS; Andreasen, 1984; Mueser, Sayers, Schooler, Mance, & Haas, 1994; Vadhan, Serper, Harvey, Chou, & Cancro, 2001)  was designed to capture core negative symptoms using both individual and global items. The SANS covers five domains—​affective flattening or blunting, alogia, avolition/​apathy, anhedonia/​asociality, and inattention—​and within each domain separate symptoms are rated from 0 (absent) to 5 (severe). One factor analysis indicated a three-​factor solution composed of blunted affect, apathy–​ anhedonia, and alogia–​ inattention (Sayers, Curran, & Mueser, 1996). Two more recent negative symptoms measures are the Clinical Assessment Interview for Negative Symptoms (CAINS; Kring, Gur, Blanchard, Horan, & Reise, 2013)  and the Brief Negative Symptoms Scale (BNSS; Kirkpatrick et  al., 2011). The CAINS was designed to address limitations of existing negative symptom instruments and to evaluate the need for the five consensus negative symptom subdomains (represented, for example, in the SANS, described previously). Intensive scale refinement with iterative trials across multiple samples yielded a shorter, more focused negative symptom assessment instrument. The final 13-​item CAINS contains a motivation/​pleasure factor defined by 9 items and a 4-​item expression factor. The final CAINS has good psychometric properties, including evidence of strong convergent validity reflected in association with clinician-​rated real-​world functioning and patient-​rated quality of life (Kring et al., 2013). The CAINS requires approximately 25 minutes for administration and can be administered reliably in nonacademic clinical settings by bachelor-​and master-​ educated raters. There is an extensive training manual as well as training videos available on the Internet (Carpenter, Blanchard, & Kirkpatrick, 2016). The BNSS is a semi-​structured interview that includes the five domains assessed in the SANS as well as an additional item, lack of normal distress. The BNSS consists of 13 items and has demonstrated excellent psychometric properties (Kirkpatrick et al., 2011; Strauss et al., 2012), with administration of the scale typically requiring less than 15 minutes. Items in the anhedonia, avolition, and

asociality subscales distinguish objective behavior from subjective experience and consummatory from appetitive anhedonia. The BNSS has a similar factor structure to the CAINS, with one factor consisting of pleasure and motivation items, and the other including blunted affect and alogia items; the lack of normal distress item loading on the two BNSS factors is inconsistent. With training, the BNSS can be administered by bachelor degree raters if they have experience with people with schizophrenia (Carpenter et al., 2016). Training videos are available. Conducting the symptom assessments mentioned in the preceding paragraphs is labor-​intensive because they require the interviewer be trained to adequate levels of reliability, and the scales themselves can often require more than 30 minutes to administer. Health services researchers have addressed these obstacles by designing broad symptom self-​ report measures intended for use with general psychiatric populations, such as the Revised Behavior and Symptom Identification Scale (BASIS-​R; Eisen, Normand, Belanger, Spiro, & Esch, 2004) and the Modified Colorado Symptom Index (Conrad et al., 2001), even evaluating whether computer administration of such self-​report measures can yield useful results (Chinman, Young, Schell, Hassell, & Mintz, 2004). Initial data are promising, insofar as subscales on the BASIS-​R have been found to discriminate psychotic from nonpsychotic groups in outpatient and inpatient samples, and some of the subscales indicate sensitivity to change (Jerrell, 2005; Niv, Cohen, Mintz, Ventura, & Young, 2007). Chinman et al. (2014) reported that results on BASIS-​R computer-​ administered assessments and interviews were highly correlated. Three caveats are warranted in considering the use of self-​report and/​or computer administration of symptom measures such as the BASIS-​R in persons with schizophrenia, however. First, these measures have not been widely used in typical clinical settings, so their day-​to-​day use may require some revision of procedures or interpretation of data, especially with regard to comparison samples of individuals diagnosed with schizophrenia. Second, even with self or computer administration, professional efforts are required to orient clients to the assessment, answer questions, and ensure that individuals comprehend the task sufficiently to respond accurately. Third, many people with schizophrenia lead economically disadvantaged lives and may have limited prior experience with computers. Thus, orienting them to the computer assessment of symptoms in the office may entail teaching them how to work the computer (how to use a mouse, what to do if the monitor screen freezes, etc.) and being available while the

assessment is being conducted. The professional time and effort required to ensure the respondent can complete the task successfully should not be underestimated; of course, personnel assisting with these tasks do not require graduate training.

Medication Adherence

For many individuals with schizophrenia, regular medication taking reduces symptoms; thus, monitoring adherence with prescribed medication regimens is a critical aspect of treatment. However, research indicates that both self-reports and collateral reports of medication adherence are not especially accurate (Pratt, Mueser, Driscoll, Wolfe, & Bartels, 2006). Clinically, reports of nonadherence are typically assumed to be more accurate than reports of adherence. Although nonadherence to antipsychotic medication is often identified as a problem (Dolder et al., 2004), it is important to note that reviews of prescription studies with nonpsychiatric populations report noncompliance rates of at least 30%. Not taking medications as prescribed is a common problem, regardless of the medical condition (Blackwell, 1976). Clinicians working with persons with schizophrenia should assess at least two dimensions of adherence: (a) the actual level of medication taking and (b) the reasons for any nonadherence, as these may lend themselves to different interventions. Unfortunately, assessing medication-taking behavior accurately is notoriously difficult across medical populations (Osterberg & Blaschke, 2005). Although pill counts and electronic pill bottles with cap sensors have been used in scientific investigations as the "gold standard" for assessing adherence, the norm in non-research settings is still to rely on client self-report based on questions asked frequently and nonjudgmentally, beginning with statements such as "I know it must be difficult to take all your medications regularly. How often do you miss taking them?" (Osterberg & Blaschke, 2005). To supplement reports of medication adherence, establishing the reasons for nonadherence and general attitudes toward psychiatric medication can be very useful. The Rating of Medication Influences (ROMI) scale (Weiden et al., 1994) includes 20 self-report items assessing reasons for nonadherence that have been prospectively linked to nonadherence (Yamada et al., 2006). The ROMI is divided into two subscales that separate reasons for adherence (Reasons for Compliance) from reasons for nonadherence (Reasons for Noncompliance), and it assesses a broad range of factors influencing a client's personal decisions about adherence. Factor analyses revealed a three-factor

structure within the Reasons for Compliance scale and a five-​factor structure for the Reasons for Noncompliance scale. These two subscales have been found to correlate moderately with the Drug Attitudes Inventory (DAI) (.56 for Reasons for Compliance and  –​.47 for Reasons for Noncompliance). The DAI is available in 30-​item (DAI-​ 30; Hogan, Awad, & Eastwood, 1983) and 10-​item (DAI-​ 10; Stjernswärd, Persson, Nielsen, Tuninger, & Levander, 2013)  self-​report formats, which are highly correlated; both versions include items reflecting positive and negative attitudes toward psychiatric medications. Identifying specific reasons for medication nonadherence is critical because unique responses will call for different interventions: Someone who reports that he or she has difficulty physically obtaining the medication needs a different kind of assistance than someone who describes feeling embarrassed about taking the medication. Community Functioning Impaired adjustments in the areas of social, role, and self-​ care functioning are the hallmarks of schizophrenia, and an accurate assessment is critical to treatment planning. Unfortunately, assessment of each of these functioning domains presents challenges to the clinician in terms of both precise definitions of adequate “adjustment” and the use of clients as informants of their own functioning. For example, the domain of social functioning is complex, and it may include a broad range of only weakly intercorrelated dimensions, such as number of regular social contacts, number of “friends,” satisfaction with friendships, reciprocity of friendships, initiation of social contacts, romantic involvement, degree of contact and satisfaction with family relationships, social skill competence, and engagement in leisure and recreational activities. Aside from the sheer number of potentially important dimensions of social functioning, clients are not always accurate in their perceptions of how well they function compared to others, highlighting the value of obtaining collateral reports from people who know the client, such as relatives or (for clients with frequent contact with professional or paraprofessional staff) mental health workers (Bowie, Reichenberg, Patterson, Heaton, & Harvey, 2006). Furthermore, the definition of “adequate” social adjustment is elusive, even in nonpsychiatric populations. In a society that values independent functioning, are adult offspring who continue to live with their parents less socially adjusted? What about persons who are divorced or never married—​are their community functioning levels necessarily less?

There is no consensus instrument used to assess community functioning in schizophrenia. Most published measures of community functioning used with persons with schizophrenia are dimensional, with one to several items assessing different domains of functioning (e.g., social support and independent living). The scales vary in their length (from as few as 12 to as many as 70 items with multiple prompts for each one), level of training required for the assessment administrator, relative emphasis on global life domains (e.g., social support) versus specific instrumental skills (e.g., ability to do laundry or ride the bus), whether original development of the scale was directed more for researchers (e.g., the Social Adjustment Scale-​II) or practitioners (e.g., the Client Assessment of Strengths, Interests, and Goals), and whether the scales emphasize objective or subjective aspects of functioning. There is no one scale that will meet every need. Clinicians will do best to review the scales discussed here and determine which assess the domains of most interest for a particular client. Scales of wide use in the assessment of community functioning in schizophrenia include the Camberwell Assessment of Need (CAN); the Social Adjustment Scale-​ II (SAS-​ II); the Quality of Life Scale (QLS); the Social Functioning Scale (SFS); the Independent Living Scale Survey (ILSS); the Client Assessment of Strengths, Interests, and Goals (CASIG); the Short Form-​ 36 (SF-​36); the Multnomah Community Ability Scale (MCAS); the Social Behavior Schedule (SBS); the Social Adaptive Function Scale (SAFE); the Specific Level of Functioning (SLOF) Assessment Scale; and the Mental Illness Research, Education, and Clinical Center Global Assessment of Functioning (MIRECC-​GAF). The CAN (Slade, Loftus, Phelan, Thornicroft, & Wykes, 1999)  evaluates functioning across 22 areas of need, including substance use; symptoms such as psychotic symptoms or psychological distress; and areas of psychosocial functioning such as living situation, food, money management, social relationships, intimate relationships, child care, safety to self and others, and ability to use transportation. A number of versions of the CAN have been developed, including a research and clinical version (Phelan et  al., 1995), staff-​administered and client self-​rated versions (Reininghaus et  al., 2013), and a short version (Andresen, Caputi, & Oades, 2000), and the instrument has been translated into many languages. In addition, adapted versions of the CAN have been developed for special mental health populations, including older clients, persons using forensic mental health services, clients with intellectual disability, and mothers or

pregnant persons. Generally, the CAN has been shown to have high test–​ test and inter-​ rater (between clinicians) reliability and good validity (McCrone et al., 2005; Phelan et  al., 1995; Reininghaus et  al., 2013), and it is increasingly being used across a range of clinical contexts (Medeiros-​Ferreira et al., 2016; Slade, 2012). The SAS-​II is a semi-​structured client interview, which is a schizophrenia-​specific modification of an instrument widely used in depressive samples (Weissman & Bothwell, 1976). The validity of the SAS-​II for use with outpatients with schizophrenia has been previously demonstrated (Jaeger, Berns, & Czobor, 2003; Schooler, Hogarty, & Weissman, 1979). Measures of global adjustment in work/​ student role, household functioning, extended kin role, social and leisure activities, intimate relationships, well-​ being, and overall adjustment are rated on a 1 to 7 scale, based on specific responses to a series of questions in each domain. The SAS-​II was primarily designed as a research interview, and thus substantial training is required to administer it with high reliability. The QLS (Heinrichs, Hanlon, & Carpenter, 1984) contains 21 items and is designed to assess the deficit syndrome concept in individuals with schizophrenia. It measures four domains—​ interpersonal functioning, instrumental role functioning, intrapsychic factors (e.g., motivation and curiosity), and possession of common objects/​participation in common activities—​and also yields a total score. It has been found to be sensitive to change from participating in psychosocial interventions (Glynn et al., 2002). A confirmatory factor analysis of the QLS was recently published that mainly replicated the first three factors (interpersonal functioning, instrumental role functioning, and intrapsychic foundations) with 16 of the 21 items (Mueser et al., 2017). The authors proposed renaming the intrapsychic factor as “motivation” because the motivation item loaded highest on this factor. Two abbreviated versions of the QLS have been developed that include 5 (Ritsner, Kurs, Ratner, & Gibel, 2005)  and 7 (Bilker et  al., 2003)  items, which have been found to be strongly correlated with the total QLS score. Similar to the SAS-​II, the QLS was designed as a research interview and is not practical for use in most routine clinical settings. The SFS (Birchwood, Smith, Cochrane, Wetton, & Copestake, 1990)  is a 20-​minute interview assessing the following domains of functioning: social engagement and withdrawal, interpersonal communication, independence performance, socially appropriate behaviors, independence competence, and occupation. Scales are normed in each of the categories, and the breadth of topics contains most of the items relevant to psychiatric populations.
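The observation above that abbreviated QLS versions correlate strongly with the full-scale total can be illustrated with a calculation of the sort sketched below. The 7-item subset, the 0-to-6 item range, and the simulated data are all hypothetical; the sketch only shows the computation, not the published results.

```python
# A small sketch of the kind of check referred to above: how strongly an abbreviated
# version of a scale (here, a hypothetical 7-item subset of a 21-item instrument such
# as the QLS) correlates with the full-scale total across clients. The item subset,
# item range, and data are illustrative only.

import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation, written out to avoid any library dependence."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

if __name__ == "__main__":
    random.seed(0)
    short_form_items = [0, 3, 5, 8, 11, 15, 19]           # hypothetical 7-item subset
    clients = [[random.randint(0, 6) for _ in range(21)]  # hypothetical 0-6 item ratings
               for _ in range(40)]
    full_totals = [sum(items) for items in clients]
    short_totals = [sum(items[i] for i in short_form_items) for items in clients]
    print(f"r between short-form and full-scale totals: {pearson_r(full_totals, short_totals):.2f}")
```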

The ILSS (Cyr, Toupin, Lesage, & Valiquette, 1994; Wallace, Liberman, Tauber, & Wallace, 2000) is an interview including 70 items asked of the client assessing a range of instrumental skills required for independent living:  appearance and clothing, personal hygiene, care of personal possessions, food preparation and storage, health maintenance, money management, transportation, leisure and community, and job seeking and job maintenance. Both client (self)-​rated and staff-​rated versions of the scale exist. It is particularly targeted at identifying specific skills required for community functioning (e.g., doing laundry and managing money). The CASIG (Lecomte, Wallace, Caron, Perreault, & Lecomte, 2004; Wallace, Lecomte, Wilde, & Liberman, 2001) is also available in both client and informant interview formats. Nine areas of social and independent living skills (health management, money management, food preparation, vocational, transportation, friends, leisure, personal hygiene, and care of personal possessions) are assessed from four to nine dichotomous items. Items are designed to assess performance rather than ability or motivation. The informant and client versions are moderately highly correlated. The RAND Short Form-​ 36 Health Survey (SF-​ 36; https:// ​ w ww.rand.org/ ​ h ealth/ ​ s urveys_ ​ t ools/ ​ m os/ ​ m os_​ core_​36item.html) is a modification of the SF-​36 (Ware, Kosinski, & Keller, 1994) and is designed to assess functioning in a broad range of medical and psychiatric populations. It can be administered by interviews in person or over the phone. The RAND 36-​Item Health Survey taps eight health concepts: physical functioning, bodily pain, role limitations due to physical health problem, roles limitations due to personal or emotional problems, emotional well-​being, social functioning, energy/​fatigue, and general health perceptions. It also includes a single item that provides an indication of perceived change in health. Scores can be summed for both a total and within specific domains. Note that the scale does not specifically address instrumental skills that might be related to capacity to function independently in a psychiatric population (e.g., skill in riding the bus). The MCAS (Barker, Barron, & McFarlane, 1994; Corbière et  al., 2002; Hendryx, Dyck, McBride, & Whitbeck, 2001)  is an informant-​based scale designed to be completed by a staff member who is familiar with the client’s functioning in the community. The scale includes 17 items rated on 5-​point Likert scales, covering the domains of interference with functioning, adjustment to living, social competence, and community integration. A modification of this scale has been developed with

interview probes (Dickerson, Origoni, Pater, Friedman, & Kordonski, 2003). The SBS (Lima, Goncalves, Pereira, & Lovisi, 2006; Wykes & Sturt, 1986) is an informant-​rated instrument designed for the inpatient setting to be completed by staff members. The SBS contains 30 items, most rated on 5-​point or 6-​point Likert scales, pertaining to dimensions of adjustment in an intensive treatment setting, such as communication skills, symptomatic behavior, and self-​harming behavior. The SBS can also be used with outpatients, as long as an informant can be identified who is knowledgeable about the person’s day-​to-​day functioning. The SAFE (Harvey et al., 1997) is an informant-​rated instrument that is completed by staff members who are familiar with the client’s daily functioning. The scale includes 17 items rated on 5-​point Likert scales, with subscales corresponding to instrumental and self-​ care, impulse control, and social functions. The SAFE was originally developed for older persons with severe mental illness, although most of the items are applicable to younger clients. The SLOF (Schneider & Struening, 1983)  is a 43-​ item measure on which assessors rate the following domains: physical functioning, personal care skills, interpersonal relationships, social acceptability (i.e., socially appropriate or inappropriate behavior), engagement in activities, and work skills. Items are rated on 1-​to 5-​point Likert scales, with higher scores reflecting better community functioning. SLOF scores have been found to be significantly related to performance of “real-​world” self-​ maintenance activities (Harvey et al., 2011). In contrast to the previously mentioned community functioning scales, MIRECC-​ GAF (Niv, Cohen, Sullivan, & Young, 2007) ratings are not typically based on information obtained in a single interview but, rather, on all observations and information available to the treatment team during the specified assessment period. An extension of the original Global Assessment Scale introduced with DSM III (APA, 1980), the MIRECC-​ GAF measures occupational functioning, social functioning, and symptom severity on three 1-​to 100-​point subscales. Similar to the standard clinician-​administered GAF, lower scores on the modified version indicate more impairment in that domain, and higher scores indicate better occupational and social functioning and fewer symptoms. All MIRECC-​GAF subscales are divided into 10 equal intervals and include criteria for scoring within each interval. The scale has demonstrated good convergent validity.
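A minimal sketch of how MIRECC-GAF ratings might be recorded follows, assuming only what is stated above: three subscales (occupational functioning, social functioning, and symptom severity) rated from 1 to 100 and anchored in 10-point intervals. The class and helper names are hypothetical, and the anchor descriptions for each interval are not reproduced here.

```python
# A minimal sketch of recording MIRECC-GAF ratings as described above: three 1-100
# subscales (occupational functioning, social functioning, symptom severity), each
# anchored in 10-point intervals. The interval helper simply reports which 10-point
# band a score falls in; the anchor text for each band is not reproduced.

from dataclasses import dataclass

def interval(score: int) -> range:
    """Return the 10-point band (e.g., 41-50) that a 1-100 rating falls in."""
    if not 1 <= score <= 100:
        raise ValueError("MIRECC-GAF subscale ratings range from 1 to 100")
    low = ((score - 1) // 10) * 10 + 1
    return range(low, low + 10)

@dataclass
class MireccGafRating:
    occupational: int
    social: int
    symptoms: int

    def bands(self) -> dict:
        return {name: f"{interval(value).start}-{interval(value).stop - 1}"
                for name, value in vars(self).items()}

if __name__ == "__main__":
    rating = MireccGafRating(occupational=45, social=52, symptoms=38)
    print(rating.bands())  # {'occupational': '41-50', 'social': '51-60', 'symptoms': '31-40'}
```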

Subjective Appraisal

In addition to obtaining expert assessments, there is increasing interest in the field in measuring the client's own attitude toward his or her illness, as well as the individual's subjective appraisal of his or her circumstances (living situation, safety, budget, etc.) and his or her symptoms. In many ways, this focus reflects the growing influence of the recovery movement in mental health. Recovery from a serious and persisting psychiatric illness has been defined by the President's New Freedom Commission on Mental Health (2003) as

the process by which people are able to live, work, learn, and participate fully in their communities. For some individuals, recovery is the ability to live a fulfilling and productive life despite a disability. For others, recovery implies the reduction or complete remission of symptoms. . . . Science has shown that having hope plays an integral role in an individual's recovery. (p. 7)

With regard to assessment, this recovery focus highlights two necessary domains of measurement—recovery attitudes and satisfaction with life circumstances. In a factor analysis of clients' responses to a series of items reflecting recovery orientations, Resnick, Rosenheck, and Lehman (2004) identified four domains that can be viewed as aspects of this process: life satisfaction, hope and optimism, knowledge of mental illness, and empowerment. Assessing recovery attitudes is a new area of investigation, but there are now several tools to identify factors related to recovery in schizophrenia (Cavelti, Kvrgic, Beck, Kossowsky, & Vauth, 2012). Generally, these tools can be divided into two categories—ones assessing positive aspects of the recovery process and ones assessing negative self-assessments related to a diagnosis of schizophrenia or another significant mental illness. With regard to capturing a positive recovery orientation, one of the widely used measures is the Mental Health Recovery Measure (MHRM; Young & Bullock, 2005). The MHRM is a behaviorally anchored self-report measure designed for use with persons who have serious and persistent mental illnesses, such as recurrent major depression, bipolar disorder, or schizophrenia. The item content of the MHRM and its subscales is based on a specific conceptual model of mental health recovery that is grounded in the experiences of persons with psychiatric disabilities (Young & Ensing, 1999). The 30-item version of the MHRM contains the following subscales: overcoming; self-empowerment; learning and self-redefinition;

basic functioning; overall well-​ being; new potentials; advocacy/​enrichment; spirituality in the recovery process; and higher order activities, including advocacy, coping with stigma, and financial quality of life. Items are rated on 5-​point Likert scales. Relatively recently, a 10-​item version of the measure (MHRM-​10) was developed that had a single factor and was found to have high internal reliability (Armstrong, Cohen, Hellemann, Reist, & Young, 2014). Another widely used measure of recovery is the Recovery Assessment Scale (RAS; Giffort, Schmook, Woody, Vollendorf, & Gervain, 1995; Ralph, Kidder, & Phillips, 2000). The RAS includes 41 items, rated on 5-​ point Likert scales, pertaining to different dimensions of recovery. A factor analysis indicated that the RAS taps the following factors:  hope, meaningful life, quality of life, symptoms, and empowerment (Corrigan, Salzer, Ralph, Sangster, & Keck, 2004). There is limited evidence suggesting some sensitivity of the RAS to treatment-​related change (Corrigan, 2006). The Illness Management and Recovery (IMR) scale (Mueser & Gingerich, 2005; Mueser et al., 2005) is a 15-​item scale with self-​report and clinician versions that was originally designed to capture outcomes from the Illness Management and Recovery Module (Mueser & Gingerich, 2005). However, the scale is now used widely to assess recovery-​oriented actions and successes (e.g., achieving personal goals, sustaining community tenure, and managing substance use problems well) (Sklar, Sarkin, Gilmer, & Groessl, 2012). The psychometric qualities have been found to be moderate to strong (Färdig, Lewander, Fredriksson, & Melin, 2011; Hasson-​Ohayon, Roe, & Kravetz, 2008; McGuire, Kean, Bonfils, Presnell, & Salyers, 2014; Salyers, Godfrey, Mueser, & Labriola, 2007). A  variety of other recovery-​ oriented measures have been developed (for a review, see Sklar, Groessl, O’Connell, Davidson, & Aarons, 2013), but they are not covered here because their psychometric properties are still being evaluated. As mentioned previously, the individual can also develop negative self-​assessments related to a diagnosis of schizophrenia or another significant mental health disorder. There has been much recent interest in developing measures to assess these negative self-​appraisals, which have been labeled self-​stigma or internalized stigma. Self-​stigma is a particular concern because it has been linked to poorer psychosocial treatment adherence (Fung, Tsang, & Corrigan, 2008)  and higher rates of depression (Ritsher, Otilingam, & Grajales, 2003) in individuals diagnosed with schizophrenia. The Self-​Stigma of Mental Illness Scale (SSMI; Corrigan, Watson, & Barr,

2006) contains 40 items, with 10 items representing each of the four constructs in the self-​stigma model of Watson, Corrigan, Larson, and Sells (2007): stereotype awareness, stereotype agreement, stereotype self-​ concurrence, and self-​esteem decrement. Order of items within each subscale is randomized to diminish order effects. Clients are asked to respond to each item using a 9-​point agreement scale from “strongly disagree” to “strongly agree.” A short form is also available. The Internalized Stigma of Mental Illness (ISMI) scale (Ritsher et  al., 2003)  contains 29 items that assess individuals’ subjective experiences of internalized stigma. In addition to producing an overall score, the ISMI contains five subscales: alienation, stereotype endorsement, discrimination experience, social withdrawal, and stigma resistance. Both total and subscale scores are calculated as a mean, with possible total and subscale scores ranging from 1 to 4 and higher scores indicating greater self-​stigma. The Stigma Scale (SS; King et al., 2007) contains 28 items, each self-​rated on a 5-​point Likert scale from “strongly agree” to “strongly disagree” with good test–​retest (over 2 weeks) reliability. The SS has three factors corresponding to discrimination, disclosure, and potential positive aspects of mental illness, and it has been shown to be negatively correlated with global self-​esteem. A complementary aspect of subjective appraisal is the client’s own evaluation of his or her satisfaction with the circumstances of his or her life in areas such as living situation and family relations. The original widely used instrument for this type of assessment was the Lehman Quality of Life Interview (Lehman, Kernan, & Postrado, 1995), a 183-​item instrument requiring 45 minutes to administer that asks participants to rate their satisfaction with various facets of their life on a scale from 1 to 7. The Quality of Life Interview Self-​Administered Short Form (TL-​30S) is a validated briefer (30-​item) 15-​minute version that is based on correlation coefficients between the brief and full version scales (Lehman, 2006). The brief version provides measures of satisfaction with living situation, social relations/​ network, finances, and employment, and it includes both objective and subjective items; the subjective appraisal items are of most interest here. In addition to judgments about the impact of a psychiatric illness on one’s sense of self (either positive or negative) and one’s life satisfaction, a more person-​ centered approach to assessment in schizophrenia has also highlighted the importance of understanding the client’s level of distress ensuing from symptoms. Those with

lived experience of the symptoms of schizophrenia do not necessarily find them distressing (Baumeister, Sedgwick, Howes, & Peters, 2017). For example, some individuals experience internal voices that they judge to be benign or even helpful, and they may reject the idea that these experiences reflect the presence of a disorder. Thus, it can be helpful for clinicians to distinguish between the presence of a symptom and the distress the experience causes the client. The Psychotic Symptom Rating Scale (PSYRATS; Haddock, McCarron, Tarrier, & Faragher, 1999)  is a semi-​structured 17-​item interview that assesses multiple subjective dimensions of hallucinations and delusions. In contrast to other psychotic symptom measures, details are elicited by the respondent on several unique subjective aspects of delusions and hallucinations (e.g., perceived intensity, controllability, preoccupation, and distress) on a 0 to 4 scale, with higher scores indicating more difficulty. It has been found to have excellent psychometric properties. Family Attitudes Work conducted in England in the 1950s through the 1970s (Brown, Birley, & Wing, 1972; Brown, Monck, Carstairs, & Wing, 1962)  demonstrated that family attitudes reflective of high levels of distress measured at the time of a loved one’s psychotic relapse tended to predict greater rates of subsequent relapse, especially if the relative and client had more than 35 hours of contact per week. This high level of family distress has been labeled “high expressed emotion” (EE), and the relationship between high EE and subsequent relapse is among the most potent predictors of outcome in schizophrenia (Butzlaff & Hooley, 1998). EE is reflected in critical comments or tone or reported extreme self-​sacrificing behavior during a semi-​structured interview (the Camberwell Family Interview) at the time of the initial relapse (Leff & Vaughn, 1985), and it is likely evidenced in actual interactions with the client (Mueser et al., 1993; Strachan, Leff, Goldstein, Doane, & Burtt, 1986). The measurement of EE requires an extensive research assessment and scoring procedure, which is outside the time capacities of most clinicians. However, clinicians can be alert to signs of extreme distress, criticism, and self-​sacrificing behavior on the part of the relative at the time of a relapse and can consider a referral for an evidenced-​based family intervention if these are observed. These might be reflected, for example, in frequent calls to the clinic for assistance, repeated complaints about the client, or tearfulness in a relative. Hooley and Parker

(2006) suggested that one feasible method for assessing EE is to ask clients how critical their relative is of them. In a sample of clients with depression, Hooley and Teasdale (1989) simply asked clients to rate how critical they thought their spouse was of them using a 10-​point Likert-​type scale. Clients’ perceptions of their partner’s criticism level (assessed during the index hospitalization) was highly predictive (r = –​.64) of client relapse over the course of a 9-​month follow-​up. Although this result has not been replicated in schizophrenia, the method warrants more consideration and may have special utility for busy clinicians. An alternative measure to ratings of perceived criticism is the Patient Rejection Scale (PRS; Kreisman et al., 1988; Kreisman, Simmens, & Joy, 1979). This 24-​item scale consists of both positively and negatively worded items reflecting feelings of love and acceptance, criticism, disappointment, and rejection; it can be considered an analogue of the critical comments and hostility factors comprising the concept of EE. Presumably, families high in rejecting attitudes would benefit from participation in targeted interventions such as education or stress management. However, clinicians using the PRS should be aware that some of the items may be distressing for relatives to rate (e.g., “I wish (the patient) had never been born”) and that a short debriefing with relatives after they complete the scale may be in order. The impact of caregiving on the families of individuals with serious psychiatric illnesses is of concern. There is no consensus measure of family burden, and many of the measures used with families of individuals diagnosed with schizophrenia were developed for use in other disorders (e.g., the Zarit Burden Scale; Zarit, Reever, & Bach-​Peterson, 1980)  or are interview based and quite intensive to administer, such as the Family Experiences Interview Schedule (FEIS; Tessler & Gamache, 1996). One measure that appears to have good potential to capture burden in the families of the seriously mentally ill is the Burden Assessment Scale (BAS; Reinhard, Gubman, Horwitz, & Minsky, 1994). The BAS is a 19-​item self-​ report measure that has subjective and objective burden dimensions and has demonstrated good psychometric properties. The BAS may have particular value because it does not require interviewer training and is designed to focus on the experience of burden, and it is not confounded with issues of coping or skill in illness management. There is some evidence supporting the validity of the BAS across different cultures (Chakrabortya, Bhatia, Anderson, Nimgaonkar, & Deshpande, 2013; Talwar & Matheiken, 2010).

Comorbid Substance Abuse The assessment of co-​occurring substance use disorders may have important treatment planning implications for clients with schizophrenia. Approximately 50% of persons with schizophrenia develop a substance use disorder at some point in their illness, and most estimates of the point prevalence of substance use disorders range between 25% and 35% (Mueser, Bennett, & Kushner, 1995; Regier et  al., 1990; Thoma & Daum, 2013). The treatment of co-​occurring substance abuse is important because of its deleterious effects on the course and outcome of schizophrenia (Drake & Brunette, 1998) and the emergence of effective treatment models that integrated services for the two disorders (Drake, O’Neal, & Wallach, 2008; Mueser, Noordsy, Drake, & Fox, 2003). A number of brief screening instruments may be used to detect substance abuse problems in schizophrenia (Carey, 2002). Although some research suggests that instruments developed for measuring substance abuse in the general population may be insensitive to it in people with schizophrenia (Corse, Hirschinger, & Zanis, 1995; Wolford et al., 1999), several measures have been demonstrated to have acceptable reliability and validity, including the Alcohol Use Identification Test (Dawe, Seinen, & Kavanagh, 2000; Maisto, Carey, Carey, Gordon, & Gleason, 2000; Saunders, Aasland, Babor, De La Fuente, & Grant, 1993; Seinen, Dawe, Kavanagh, & Bahr, 2000), the Michigan Alcoholism Screening Test (McHugo, Paskus, & Drake, 1993; Searles, Alterman, & Purtill, 1990; Selzer, 1971; Wolford et al., 1999), and the Drug Abuse Screening Test (Maisto et al., 2000; Skinner, 1982; Wolford et al., 1999). In addition, the Dartmouth Assessment of Lifestyle Instrument was developed specifically to detect alcohol, cannabis, and cocaine use disorders in persons with severe mental illness, and scores have shown good reliability and validity in this population (Batalla et al., 2013; Ford, 2003; Rosenberg et al., 1998). Diagnoses of substance use disorders in clients with schizophrenia can also be reliably measured with the SCID-​5 (First et al., 2015a, 2015b). Although there is a tendency for clients with schizophrenia to have low subscale scores on the Addiction Severity Index (ASI), which was developed for the general population (McLellan et al., 1992), there is evidence that the ASI can nevertheless provide valid and reliable measures of the consequences of substance use (Corse et  al., 1995). Limited work has been conducted indicating that measures of expectancies and reasons for substance use developed for the general population may be valid in persons with schizophrenia

(Carey & Carey, 1995; Laudet, Magura, Vogel, & Knight, 2004; Mueser, Nishith, Tracy, DeGirolamo, & Molinaro, 1995), although currently the research is too preliminary to make firm recommendations. Mueser and colleagues (2003) provide detailed standardized assessment tools of substance use in persons with severe mental illness for treatment planning purposes, although rigorous psychometric evaluation remains to be conducted. The Substance Abuse Treatment Scale (SATS; McHugo, Drake, Burton, & Ackerson, 1995; Mueser, Drake, et  al., 1995; Mueser et  al., 2003)  is an 8-​point behaviorally anchored scale designed to measure motivation for substance abuse treatment in persons with severe mental illness. The SATS is based on the stages of treatment model (Mueser et al., 2003; Osher & Kofoed, 1989), which was adapted from the transtheoretical stages of change model (Prochaska & DiClemente, 1984). The stages of treatment include engagement (establishing a therapeutic relationship with the client), persuasion (motivating the person to work on substance abuse problems), active treatment (helping the person reduce substance use and/​or attain abstinence), and relapse prevention (helping the client prevent substance abuse relapses). Because the client’s stage of treatment has implications for treatment planning (e.g., in the engagement stage, the clinician focuses on establishing rapport and meeting with the client regularly, whereas in the active treatment stage the focus is on changing substance use behavior), the SATS is clinically useful. The Timeline Followback (TLFB) Calendar (Sobell & Sobell, 1992) is an instrument for quantifying substance use during the past 6 months and obtaining information about patterns of use that can be useful in treatment planning. The primary dependent variable studied has been the number of days of drinking to intoxication and the number of days of drug use. The TLFB has been adapted for use with persons with severe mental illness (Mueser, Drake, et  al., 1995; Mueser et  al., 2003), with research supporting its reliability and validity (Carey, Carey, Maisto, & Henson, 2004), although there is evidence of underreporting compared to laboratory tests (Bahorik, Newhill, Queen, & Eack, 2014). Overall Evaluation A wide range of validated instruments have been developed for case conceptualization and treatment planning regarding the domains of symptoms, community functioning, and comorbid substance abuse in schizophrenia. The most strongly validated measures for each of these areas

involve either semi-​structured interviews or informant-​ based ratings, which is consistent with the poor insight many clients with schizophrenia have into their illness (Amador & Gorman, 1998). There are fewer choices for treatment planning-​related assessment of medication adherence, subjective appraisal, and family attitudes, but there is at least one psychometrically sound instrument for each domain that is practical for use in clinical settings.
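To illustrate the two TLFB summary variables highlighted above (days of drinking to intoxication and days of drug use), here is a small sketch that tallies them from calendar-style daily records. The record format, field names, and 180-day window are simplifying assumptions for illustration, not the published TLFB administration procedure.

```python
# A sketch of summarizing TLFB-style calendar data into the two variables highlighted
# above: days of drinking to intoxication and days of drug use over the review window.
# The daily-record format and the 180-day approximation of "the past 6 months" are
# simplifications for illustration.

from datetime import date, timedelta
from typing import Dict

# Each calendar day maps to a simple record; the field names here are illustrative.
DailyRecord = Dict[str, bool]  # e.g., {"drank_to_intoxication": True, "used_drugs": False}

def tlfb_summary(calendar: Dict[date, DailyRecord], end: date, days: int = 180) -> Dict[str, int]:
    """Count days of drinking to intoxication and of drug use over the past `days` days."""
    window_start = end - timedelta(days=days - 1)
    intoxication_days = 0
    drug_days = 0
    for day, record in calendar.items():
        if window_start <= day <= end:
            intoxication_days += bool(record.get("drank_to_intoxication"))
            drug_days += bool(record.get("used_drugs"))
    return {"days_drinking_to_intoxication": intoxication_days, "days_of_drug_use": drug_days}

if __name__ == "__main__":
    today = date(2018, 6, 30)
    calendar = {
        date(2018, 6, 1): {"drank_to_intoxication": True, "used_drugs": False},
        date(2018, 6, 15): {"drank_to_intoxication": True, "used_drugs": True},
        date(2017, 11, 1): {"drank_to_intoxication": True, "used_drugs": False},  # outside window
    }
    print(tlfb_summary(calendar, end=today))  # 2 intoxication days, 1 drug-use day in the window
```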

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

Symptoms

The BPRS, PANSS, and SANS administered by interview have demonstrated sensitivity to change following treatment and are suitable for monitoring the effects of interventions on symptoms. Many research studies have utilized the BPRS or PANSS as frequently as every 2 weeks during times of psychotic exacerbation to determine when symptoms return to baseline. The CAINS and BNSS are newer measures, so their sensitivity to change is less certain. Self-reported symptoms on scales such as the BASIS-R and the Colorado Symptom Index tend to measure global distress rather than specific dimensions of symptoms as interview measures do; therefore, their clinical utility for monitoring the effects of treatment on symptoms is not established. The ratings of the psychometric properties of these tools are presented in Table 20.3. The Clinical Global Impression Scale (Guy, 1976; Haro et al., 2003) has been widely used to assess symptom change, especially in pharmaceutical studies. The scale has three items, the first two being rated on 7-point Likert scales and of most relevance here. These items assess severity of illness (from "normal" to "extremely ill") and global improvement from baseline ("very much improved" to "very much worse"). A third efficacy item, typically referring to the hypothesized effect of a pharmaceutical agent, is rated on a 4-point scale. Although the three ratings are brief, they are typically made after extended clinical assessments and/or contact with clients and require that assessors have known the client since the baseline period.

Medication Adherence

Although a range of instruments have been developed to measure medication adherence, none have a consistent


TABLE 20.3 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Ratings for each instrument are listed in the order: Norms; Internal Consistency; Inter-Rater Reliability; Test–Retest Reliability; Content Validity; Construct Validity; Validity Generalization; Treatment Sensitivity; Clinical Utility.

Symptoms
BPRS: A; G; E; A; A; G; E; E; A
CGI: A; NA; E; A; A; A; E; E; A
PANSS: A; G; E; A; A; E; E; E; A
SANS: A; G; E; A; A; E; E; A; A

Community Functioning
CASIG: A; A; E; A; E; A; G; G; A
ILSS: A; G; G; A; G; A; A; G; A
MCAS: A; A; G; A; A; G; G; G; A
QLS: A; G; E; A; A; G; E; E; A
Highly Recommended: ✓ ✓ ✓ ✓ ✓ ✓ ✓

SAFE: A; G; G; A; A; A; E; G; A
SAS-II: A; G; E; A; A; G; E; E; A
SBS: A; A; G; A; A; A; A; G; A
SFS: A; G; E; A; G; G; E; G; A
SF-36: E; G; NA; A; A; G; E; A; A
MIRECC-GAF: A; E; E; NR; E; E; E; E; E
Highly Recommended: ✓ ✓

Subjective Appraisal
MHRM: A; G; NA; A; A; A; A; A; A
QOLI: E; G; NA; A; G; E; E; A; A
RAS: A; G; NA; A; G; A; E; A; A
TL-30S: A; G; NA; A; A; A; G; A; A

PSYRATS: E; E; E; E; E; E; E; NR; E
IMR (Client): E; G; NA; E; E; E; G; NR; E
Highly Recommended: ✓ ✓

Family Attitudes
PRS: A; G; NA; A; A; A; A; A; A
BAS: A; G; NA; NR; G; G; A; G; G

Substance Abuse
ASI: E; G; E; A; E; E; E; G; A
AUS: A; NA; E; A; A; A; E; E; A
DUS: A; NA; E; A; A; A; E; E; A
SATS: A; NA; E; A; A; A; E; E; A
TLFB: E; NA; E; A; A; G; E; E; A
Highly Recommended: ✓ ✓ ✓ ✓

Note: BPRS = Brief Psychiatric Rating Scale; CGI = Clinical Global Impression Scale; PANSS = Positive and Negative Syndrome Scale; SANS = Scale for the Assessment of Negative Symptoms; CASIG = Client's Assessment of Strengths, Interests and Goals (both client and informant versions); ILSS = Independent Living Skills Survey (both client and informant versions); MCAS = Multnomah Community Ability Scale; QLS = Quality of Life Scale; SAFE = Social-Adaptive Functioning Evaluation; SAS-II = Social Adjustment Scale-II; SBS = Social Behaviour Schedule; SFS = Social Functioning Scale; SF-36 = Short Form-36 Health Survey; MIRECC-GAF = Mental Illness Research Education and Clinical Center Global Assessment of Functioning; MHRM = Mental Health Recovery Measure; QOLI = Quality of Life Interview; RAS = Recovery Assessment Scale; TL-30S = Quality of Life Interview Self-Administered Short Form; PSYRATS = Psychotic Symptom Rating Scales; IMR = Illness Management and Recovery; PRS = Patient Rejection Scale; BAS = Burden Assessment Scale; ASI = Addiction Severity Index; AUS = Alcohol Use Scale; DUS = Drug Use Scale; SATS = Substance Abuse Treatment Scale; TLFB = Timeline Followback Calendar; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

Medication Adherence

Although a range of instruments have been developed to measure medication adherence, none have a consistent track record for demonstrating sensitivity to change. Most interventions that have been evaluated in research trials for improving medication adherence employ either pill counts or electronic pill bottles with cap sensors (Zygmunt, Olfson, Boyer, & Mechanic, 2002). More recently, a form of aripiprazole tablet with an embedded sensor that permits external monitoring of pill ingestion has been approved by the FDA, but it is not clear how it will be used in typical clinical practice or whether consumers will agree to its use. However, because there are significant problems with the validity of self-reported medication adherence, these kinds of automatic sensors may have some appeal.
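As a rough illustration of the arithmetic behind a pill-count check, the sketch below computes an adherence percentage from the number of pills dispensed, the number returned, and the elapsed time. It is a generic calculation assuming a single medication with a fixed daily dose, not a protocol drawn from the studies cited above.

```python
def pill_count_adherence(dispensed, returned, doses_per_day, days_elapsed):
    """Rough adherence estimate from a pill count, as a percentage.

    Uses the generic (pills taken) / (pills expected) ratio; assumes one
    medication taken at a fixed daily dose over the interval.
    """
    expected = doses_per_day * days_elapsed
    if expected <= 0:
        raise ValueError("doses_per_day and days_elapsed must be positive")
    taken = dispensed - returned
    return 100.0 * taken / expected

# e.g., 60 pills dispensed, 12 returned at a 30-day visit, 2 doses per day
print(round(pill_count_adherence(60, 12, 2, 30)))  # 80
```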

Community Functioning

The GAF, based on the widely used Global Assessment Scale (Endicott, Spitzer, Fleiss, & Cohen, 1976), is a single rating scale for evaluating a person's psychological, social, and occupational functioning on a hypothetical continuum from mental illness to mental health, ranging from 1 (most ill) to 100 (most healthy). The scale provides defining characteristics, including both symptoms and functioning, for each 10-point interval between 1 and 100. Scores have been reported to be reliable and correlated with symptom measures, especially with repeated assessments, and they have been found to have high inter-rater reliability (Pedersen, Hagtvet, & Karterud, 2007; Söderberg, Tungström, & Armelius, 2005; Startup, Jackson, & Pearce, 2002). As discussed previously, the MIRECC-GAF extends the GAF rating framework to include occupational and social functioning as well as symptom severity, so it can also be a useful tool for assessing changes in adaptive functioning.

The client-based interview instruments of community functioning reviewed in the previous section on assessment for treatment planning (including the SAS-II, QLS, ILSS, SFS, CASIG, and SLOF) are sensitive to change and suitable for the purpose of monitoring treatment effects. Similarly, the informant-based instruments (including the CASIG, MCAS, SBS, and SAFE) have demonstrated sensitivity to change and are appropriate for treatment monitoring. However, these measures would rarely be used more frequently than quarterly, because changes in social and community functioning typically lag behind symptom changes and require relatively long periods of time to occur (e.g., to find a job, rent an apartment, or develop a friendship).
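Because the GAF attaches descriptive anchors to each 10-point interval, a rating can be mapped to its interval with simple arithmetic. The brief sketch below does only that interval arithmetic; the anchor descriptions themselves belong to the published scale and are not reproduced here.

```python
def gaf_interval(score):
    """Return the 10-point GAF interval a rating falls in (e.g., 45 -> '41-50')."""
    if not 1 <= score <= 100:
        raise ValueError("GAF scores range from 1 to 100")
    low = ((score - 1) // 10) * 10 + 1  # lower bound of the decile band
    return f"{low}-{low + 9}"

print(gaf_interval(45))   # '41-50'
print(gaf_interval(100))  # '91-100'
```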


Subjective Appraisal

Most of the self-appraisal measures discussed in the treatment planning section (the MHRM, RAS, QOLI, and TL-30S) have been proposed for ongoing monitoring and assessment of treatment outcomes, although data on their use in this way are limited. Investigations of interventions to reduce self-stigma in persons with schizophrenia are just beginning (Lucksted et al., 2011; Russinova et al., 2014), so the capacity for this variable to change over time is unknown. Reducing distress resulting from psychotic symptoms is a treatment goal in many studies of cognitive–behavioral therapy for psychosis, and the PSYRATS has been shown to be sensitive to change in some of these trials (Mehl, Werner, & Lincoln, 2015). Because subjective appraisal and quality of life tend to be stable over relatively long periods of time, and their sensitivity to change is often uncertain, most of these measures would best be administered no more frequently than once every 6 months.

Family Attitudes

The Camberwell Family Interview (CFI) was developed primarily as a measure of negative family affect toward clients who have recently experienced a symptom relapse, and it has been evaluated as a predictor of subsequent relapse and rehospitalization (Butzlaff & Hooley, 1998). Research evaluating changes in negative family affect measured with the CFI indicates modest sensitivity to treatment-related change (Hogarty et al., 1991). However, the extensive time required to administer the CFI makes it impractical for monitoring the effects of family intervention in clinical settings. Client-report measures of perceived criticism from relatives have not been reported in schizophrenia to date. The Patient Rejection Scale has been found to be predictive of relapse in schizophrenia (Kreisman et al., 1988) and is sensitive to the effects of participation in family interventions (Mueser et al., 2001). As a measure of family burden, the BAS has several advantages, including ease of administration and interpretation.

Substance Abuse

The Alcohol Use Scale (AUS) and Drug Use Scale (DUS) are 5-point rating scales completed by clinicians to rate substance use problems over the past 6 months, based on all available information (Drake et al., 1990; Mueser, Drake, et al., 1995; Mueser et al., 2003). Both scales were developed to reflect DSM-IV criteria pertaining to substance abuse and dependence, and both have specific ratings corresponding to 1 = no substance use, 2 = use but not abuse, 3 = abuse, 4 = dependence, and 5 = dependence and substance use-related institutionalization (e.g., hospitalizations and incarcerations). The AUS and DUS have high sensitivity to change and are appropriate for monitoring treatment outcomes, although the scales have not been revised to reflect changes in DSM-5 criteria for substance use disorders. Similarly, the SATS and TLFB have demonstrated sensitivity to treatment-related change. The Addiction Severity Index (ASI) is also sensitive to change following treatment, although clients with moderate substance abuse severity tend to have floor effects (Corse et al., 1995).
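For clinicians who track serial AUS/DUS ratings electronically, the following minimal sketch encodes the anchor labels described above and summarizes a series of ratings by its most severe value. The dictionary form and the "worst rating" summary rule are illustrative assumptions, not scoring procedures specified by the scales' developers.

```python
# Anchor labels for the clinician-rated AUS/DUS, as described in the text above.
AUS_DUS_ANCHORS = {
    1: "no substance use",
    2: "use but not abuse",
    3: "abuse",
    4: "dependence",
    5: "dependence and substance use-related institutionalization",
}

def worst_rating_over_period(ratings):
    """Summarize serial AUS or DUS ratings by the most severe value observed
    (an illustrative convention for monitoring, not an official scoring rule)."""
    rating = max(ratings)
    return rating, AUS_DUS_ANCHORS[rating]

print(worst_rating_over_period([2, 3, 2]))  # (3, 'abuse')
```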


Overall Evaluation

Similar to assessment for the purposes of treatment planning, a wide range of psychometrically sound instruments are available for monitoring and evaluating the effects of treatment on symptoms, community functioning, and comorbid substance abuse in schizophrenia. In contrast, there are more limited, but nevertheless clinically suitable, choices for measuring family attitudes. Aside from the PSYRATS, measures of subjective appraisal and quality of life appear to be less sensitive to treatment-related change, although it is not clear whether this reflects limitations of the measures or the high stability of these appraisals over time. Currently, there are no scientifically validated (and hence recommended) measures of medication adherence that can be used for the purposes of routine treatment monitoring, and clinicians are advised to combine client self-report with observational measures such as pill counts or electronic pill bottles with cap sensors.
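The re-administration intervals suggested across this section can be collected into a simple scheduling aid. The sketch below is an illustrative convenience based on the frequencies mentioned above (symptom ratings as often as every 2 weeks during exacerbations, functioning measures no more than quarterly, subjective appraisal no more than every 6 months); it is not a published monitoring protocol.

```python
from datetime import date, timedelta

# Minimum re-administration intervals (in days) drawn from the guidance in this section.
MIN_INTERVAL_DAYS = {
    "symptoms (BPRS/PANSS)": 14,        # as often as every 2 weeks during exacerbation
    "community functioning": 90,        # rarely more often than quarterly
    "subjective appraisal / QOL": 180,  # no more often than every 6 months
}

def next_due(last_administered):
    """Given the date each domain was last assessed, return the earliest date
    each could reasonably be re-administered."""
    return {
        domain: last + timedelta(days=MIN_INTERVAL_DAYS[domain])
        for domain, last in last_administered.items()
    }

print(next_due({
    "symptoms (BPRS/PANSS)": date(2018, 1, 2),
    "community functioning": date(2018, 1, 2),
}))
```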

CONCLUSIONS AND FUTURE DIRECTIONS

Because schizophrenia can affect so many different areas of life functioning, assessment is necessarily complex and spans a broad range of different domains. Furthermore, because impaired insight into the illness is a common feature of schizophrenia, the most sensitive measures of functioning usually require either standardized interviews or informant ratings. With these considerations in mind, well-validated measures have been developed for diagnosing schizophrenia and for both treatment planning and monitoring treatment effects in the domains of symptoms, community functioning, family attitudes, and substance abuse. Although some useful tools are available, more work is needed to develop and evaluate instruments of subjective appraisal and medication adherence that are sensitive to the effects of treatment.

In addition to the importance of developing measures for some domains that are more sensitive to change, there is a strong need for measures that can be implemented in routine clinical settings by competent clinicians without requiring extensive training. With the exception of self-report measures of subjective appraisal, the strongest measures for assessment in schizophrenia have been developed in the context of research studies and validated with trained clinicians. Only limited evidence supports the utility of these instruments in the routine practice of treating clients with schizophrenia, and time constraints often prevent a thorough assessment that would lead to comprehensive treatment.

A related problem is the relative paucity of measures that provide a comprehensive, integrated assessment across the broad range of domains of functioning that are often impaired, or for which there are often needs, in schizophrenia. Most of the measures described previously assess only one or two of the broad range of domains affected in schizophrenia, necessitating the use of multiple assessment tools to develop a comprehensive picture of the client and his or her needs. The development of more fully integrated measures that cover a broader range of functioning in schizophrenia could improve the comprehensiveness of assessment and the effectiveness of treatment planning; the CAN scales and the IMR scales are exemplars of such measures.

One field that holds promise for increasing the efficiency, and perhaps even the effectiveness, of treatment planning and monitoring is the use of client-facing technology (Treisman et al., 2016). Although there were initial concerns about limited access to smartphones and computers in this population, recent surveys indicate that many individuals diagnosed with schizophrenia do have access to smartphones (Miller, Stewart, Schrimsher, Peeples, & Buckley, 2015; Record et al., 2016), whereas other studies have circumvented this problem by providing low-cost computers, when needed, to facilitate participation in online educational and support interventions (Rotondi et al., 2005). With regard to assessment, there is particular interest in the field in determining whether careful real-time monitoring of medication adherence and early warning signs might reduce rehospitalizations (Granholm, Ben-Zeev, Link, Bradshaw, & Holden, 2012; Španiel, Vohlídka, Hrdlička, et al., 2008; Španiel, Vohlídka, Kožený, et al., 2008). The early data from these studies are mixed, and there are many issues to resolve in moving from concept to effective clinical intervention, but it is likely that technological tools will improve the capacity to monitor and intervene more effectively with individuals diagnosed with schizophrenia in the coming years.

Finally, there is a need to develop assessment and treatment planning methods that strive to reconcile and integrate the perspectives of treatment providers and clients with schizophrenia. Shared decision-making between clients and providers has been growing in mental health services (Fenton, 2003; Hamann, Leucht, & Kissling, 2003) and is an important value espoused by the President's New Freedom Commission on Mental Health (2003). Models of shared decision-making have been proposed for prescribing medication (Deegan & Drake, 2006), and further work is needed to develop such approaches so that they span the full range of functioning in schizophrenia. Shared decision-making approaches have the potential to both integrate different perspectives on functioning and set informed treatment priorities based on client preferences. Such approaches are critical considering the ever-growing array of effective medications and rehabilitation approaches for schizophrenia.


References Achim, A. M., Maziade, M., Raymond, E., Oliver, D., Merette, C., & Roy, M. A. (2011). How prevalent are anxiety disorders in schizophrenia? A meta-​analysis and critical review. Schizophrenia Bulletin, 37, 811–​821. Addington, D., Addington, J., & Maticka-​ Tyndale, E. (1993). Assessing depression in schizophrenia:  The Calgary Depression Scale. British Journal of Psychiatry, 163(Suppl. 22), 39–​44. Addington, D., Addington, J., & Maticka-​Tyndale, E. (1994). Specificity of the Calgary Depression Scale for schizophrenics. Schizophrenia Research, 11, 239–​244. Aggarwal, N. K., Glass, A., Tirado, A., Boiler, M., Nicasio, A., Alegría, M.,  .  .  .  Lewis-​Fernández, R. (2014). The development of the DSM-​ 5 Cultural Formulation Interview–​Fidelity Instrument (CFI-​FI):  A pilot study. Journal of Health Care for the Poor and Underserved, 25, 1397–​1417. Aggarwal, N. K., Nicasio, A. V., DeSilva, R., Boiler, M., & Lewis-​ Fernández, R. (2013). Barriers to implementing the DSM-​ 5 Cultural Formulation Interview:  A qualitative study. Culture, Medicine, and Psychiatry, 37, 505–​533. Allin, M., & Murray, R. (2002). Schizophrenia: A neurodevelopmental or neurodegenerative disorder? Current Opinion in Psychiatry, 15, 9–​15. Amador, X. F., & Gorman, J. M. (1998). Psychopathologic domains and insight in schizophrenia. Psychiatric Clinics of North America, 21, 27–​42. American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Amorim, P., Lecrubier, Y., Weiller, E., Hergueta, T., & Sheehan, D. V. (1998). The validity of the Mini International Interview (MINI) according to the SCID-​P and its reliability. European Psychiatry, 13, 26–​34. Andreasen, N. C. (1984). Modified Scale for the Assessment of Negative Symptoms. Bethesda, MD:  U.S. Department of Health and Human Services. Andresen, R., Caputi, P., & Oades, L. G. (2000). Interrater reliability of the Camberwell Assessment of Need Short Appraisal Schedule. Australian and New Zealand Journal of Psychiatry, 34, 856–​861. Angermeyer, M. C., Kuhn, L., & Goldstein, J. M. (1990). Gender and the course of schizophrenia: Differences in treated outcome. Schizophrenia Bulletin, 16, 293–​307.


Armstrong, N. P., Cohen, A. N., Hellemann, G., Reist, C., & Young, A. S. (2014). Validating a brief version of the Mental Health Recovery Measure for individuals with schizophrenia. Psychiatric Services, 65, 1154–​1159. Aubry, T., Goering, P., Veldhuizen, S., Adair, C. E., Bourque, J., Distasio, J., . . . Tsemberis, S. (2016). A multiple-​city RCT of Housing First with assertive community treatment for homeless Canadians with serious mental illness. Psychiatric Services, 67, 275–​281. Bahorik, A. L., Newhill, C. E., Queen, C. C., & Eack, S. M. (2014). Underreporting of drug use among individuals with schizophrenia: Prevalence and predictors. Psychological Medicine, 44, 61–​69. Barker, S., Barron, N., & McFarlane, B. (1994). Multnomah Community Ability Scale:  User’s manual. Portland, OR: Western Mental Health Research Center, Oregon Health Sciences University. Batalla, A., Garcia-​Rizo, C., Castellví, P., Fernandez-​Egea, E., Yücel, M., Parellada, E., . . . Bernardo, M. (2013). Screening for substance use disorders in first-​episode psychosis:  Implications for readmission. Schizophrenia Research, 146, 125–​131. Baumeister, D., Sedgwick, O., Howes, O., & Peters, E. (2017). Auditory verbal hallucinations and continuum models of psychosis: A systematic review of the healthy voice-​hearer literature. Clinical Psychology Review, 51, 125–​141. Bilker, W. B., Brensinger, C. M., Kurtz, M. M., Kohler, C. G., Gur, R. C., Siegel, S. J., & Gur, R. E. (2003). Development of an abbreviated schizophrenia Quality of Life Scale using a new method. Neuropsychopharmacology, 28, 773–​777. Birchwood, M., Smith, J., Cochrane, R., Wetton, S., & Copestake, S. (1990). The Social Functioning Scale: The development and validation of a new scale of social adjustment for use in family intervention programmes with schizophrenic patients. British Journal of Psychiatry, 157, 853–​859. Blackwell, B. (1976). Treatment adherence. British Journal of Psychiatry, 129, 513–​531. Blake, D. D., Weathers, F. W., Nagy, L. M., Kaloupek, D. G., Charney, D. S., & Keane, T. M. (1995). Clinician administered PTSD scale for DSM-​ IV. Boston, MA: National Center for Posttraumatic Stress Disorder. Bleuler, E. (1950). Dementia praecox or the group of schizophrenias. New  York, NY:  International Universities Press. (Original work published 1911) Bosanac, P., & Castle, D. (2012). Schizophrenia and depression. Medical Journal of Australia, 4, S36–​S39. Bowie, C. R., Reichenberg, A., Patterson, T. L., Heaton, R. K., & Harvey, P. D. (2006). Determinants of real-​ world functional performance in schizophrenia


subjects: Correlations with cognition, functional capacity, and symptoms. American Journal of Psychiatry, 163, 418–​425. Boydell, J., van Os, J., McKenzie, K., Allardyce, J., Goel, R., McCreadie, R., & Murray, R. M. (2001). Incidence of schizophrenia in ethnic minorities in London:  Ecological study into interactions with environment. British Medical Journal, 323, 1336–​1338. Brown, A. S., Begg, M. D., Gravenstein, S., Schaefer, C. A., Wyatt, R. J., Bresnahan, M.,  .  .  .  Susser, E. S. (2004). Serologic evidence of prenatal influenza in the etiology of schizophrenia. Archives of General Psychiatry, 61, 774–​780. Brown, G. W., Birley, J. L. T., & Wing, J. K. (1972). Influence of family life on the course of schizophrenic disorders: A replication. British Journal of Psychiatry, 121, 241–​258. Brown, G. W., Monck, E. M., Carstairs, G. M., & Wing, J. K. (1962). Influence of family life on the course of schizophrenic illness. British Journal of Preventive and Social Medicine, 16, 55–​68. Buckley, P. F., Miller, B. J., Lehrer, D. S., & Castle, D. J. (2009). Psychiatric comorbidities and schizophrenia. Schizophrenia Bulletin, 35, 383–​402. Butzlaff, R. L., & Hooley, J. M. (1998). Expressed emotion and psychiatric relapse. Archives of General Psychiatry, 55, 547–​552. Cannon, T. D., Jones, P. B., & Murray, R. M. (2002). Obstetric complications and schizophrenia: Historical and meta-​ analytic review. American Journal of Psychiatry, 159, 1080–​1092. Cantor-​Graae, E., Zolkowska, K., & McNeil, T. F. (2005). Increased risk of psychotic disorder among immigrants in Malmö:  A 3-​year first-​contact study. Psychological Medicine, 35, 1155–​1163. Carey, K. B. (2002). Clinically useful assessments: Substance use and comorbid psychiatric disorders. Behaviour Research and Therapy, 40, 1345–​1361. Carey, K. B., & Carey, M. P. (1995). Reasons for drinking among psychiatric outpatients:  Relationship to drinking patterns. Psychology of Addictive Behaviors, 9, 251–​257. Carey, K. B., Carey, M. P., Maisto, S. A., & Henson, J. M. (2004). Temporal stability of the Timeline Followback Interview for alcohol and drug use with psychiatric outpatients. Journal of Studies on Alcohol, 65, 774–​781. Carey, M. P., Carey, K. B., Maisto, S. A., Gordon, C. M., & Vanable, P. A. (2001). Prevalence and correlates of sexual activity and HIV-​related risk behavior among psychiatric outpatients. Journal of Consulting and Clinical Psychology, 69, 846–​850. Carpenter, W. T., Blanchard, J. J., & Kirkpatrick, B. (2016). New standards for negative symptom assessment. Schizophrenia Bulletin, 42, 1–​3.

Cavelti, M., Kvrgic, S., E.-​M. Beck, E.-​M., Kossowsky, J., & Vauth, R. (2012). Assessing recovery from schizophrenia as an individual process:  A review of self-​ report instruments. European Psychiatry, 27, 19–​32. Chakrabortya, S., Bhatia, T., Anderson, C., Nimgaonkar, V. L., & Deshpande, S. N. (2013). Caregiver’s burden, coping and psycho-​education in Indian households with single-​and multiple-​affected members with schizophrenia. International Journal of Mental Health Promotion, 15, 288–​298. Chinman, M., Young, A. S., Schell, T., Hassell, J., & Mintz, J. (2004). Computer-​ assisted self-​ assessment in persons with severe mental illness. Journal of Clinical Psychiatry, 65, 1343–​1351. Ciompi, L., & Muller, C. (1976). Lebensweg und Alter der Schizophrenen, eine katamnestische Langzeitstudie bis ins Senium. Berlin, Germany: Springer. Cohen, A. N., Hamilton, A. B., Saks, E. R., Glover, D. L., Glynn, S. M., Brekke, J. S., & Marder, S. R. (2017). How occupationally high-​achieving individuals with a diagnosis of schizophrenia manage their symptoms. Psychiatric Services, 68, 324–​329. Conrad, K. J., Yagelka, J. R., Matters, M. D., Rich, A. R., Williams, V., & Buchana, M. (2001). Reliability and validity of a modified Colorado Symptom Index in a national homeless sample. Mental Health Services Research, 3, 141–​153. Corbière, M., Crocker, A. G., Lesage, A. D., Latimer, E., Ricard, N., & Mercier, C. (2002). Factor structure of the Multnomah Community Ability Scale. Journal of Nervous and Mental Disease, 190, 399–​406. Correll, C. U., Robinson, D. G., Schooler, N. R., Brunette, M. F., Mueser, K. T., Rosenheck, R. A., . . . Kane, J. M. (2014). Cardiometabolic risk in patients with first-​ episode schizophrenia spectrum disorders:  Baseline results from the RAISE-​ETP study. JAMA Psychiatry, 71, 1350–​1363. Corrigan, P. W. (2006). Impact of consumer-​ operated services on empowerment and recovery of people with psychiatric disabilities. Psychiatric Services, 57, 1493–​1496. Corrigan, P. W., Salzer, M., Ralph, R., Sangster, Y., & Keck, L. (2004). Examining the factor structure of the Recovery Assessment Scale. Schizophrenia Bulletin, 30, 1035–​1041. Corrigan, P. W., Watson, A. C., & Barr, L. (2006). The self-​ stigma of mental illness:  Implications for self-​ esteem and self-​efficacy. Journal of Social and Clinical Psychology, 25, 875–​884. Corse, S. J., Hirschinger, N. B., & Zanis, D. (1995). The use of the Addiction Severity Index with people with severe mental illness. Psychiatric Rehabilitation Journal, 19, 9–​18. Cyr, M., Toupin, J., Lesage, A. D., & Valiquette, C. A. (1994). Assessment of independent living skills for psychotic


patients: Further validity and reliability. Journal Nervous and Mental Disease, 182, 91–​97. Dawe, S., Seinen, A., & Kavanagh, D. J. (2000). An examination of the utility of the AUDIT in people with schizophrenia. Journal of Studies on Alcohol, 61, 744–​750. Deegan, P. E., & Drake, R. E. (2006). Shared decision making and medication management in the recovery process. Psychiatric Services, 57, 1636–​1639. Dickerson, F. B., Origoni, A. E., Pater, A., Friedman, B. K., & Kordonski, W. M. (2003). An expanded version of the Multnomah Community Ability Scale: Anchors and interview probes for the assessment of adults with serious mental illness. Community Mental Health Journal, 39, 131–​137. Dickerson, F. B., Schroeder, J., Stallings, C., Origoni, A., Ktasafanas, E., Schwienfurth, L. A. B., . . . Yolken, R. (2014). A longitudinal study of cognitive functioning in schizophrenia:  Clinical and biological predictors. Schizophrenia Research, 156, 248–​253. Dolder, C. R., Lacro, J. P., Warren, K. A., Golshan, S., Perkins, D. O., & Jeste, D. V. (2004). Brief evaluation of medication influences and beliefs:  Development and testing of a brief scale for medication adherence. Journal of Clinical Psychopharmacology, 24, 404–​409. Dollfus, S., Mach, C., & Morello, R. (2015). Self-​evaluation of Negative Symptoms: A novel tool to assess negative symptoms. Schizophrenia Bulletin, 42, 571–​578. Drake, R. E., & Brunette, M. F. (1998). Complications of severe mental illness related to alcohol and other drug use disorders. Recent Developments in Alcoholism, 14, 285–​299. Drake, R. E., O’Neal, E., & Wallach, M. A. (2008). A systematic review of psychosocial interventions for people with co-​occurring severe mental and substance use disorders. Journal of Substance Abuse Treatment, 34, 123–​138. Drake, R. E., Osher, F. C., Noordsy, D. L., Hurlbut, S. C., Teague, G. B., & Beaudett, M. S. (1990). Diagnosis of alcohol use disorders in schizophrenia. Schizophrenia Bulletin, 16, 57–​67. Eaton, W. W., Jr. (1994). Residence, social class, and schizophrenia. Journal of Health and Social Behavior, 15, 289–​299. Eisen, S., Normand, S. L., Belanger, A. J., Spiro, A. R., & Esch, D. (2004). The Revised Behavior and Symptom Identification Scale (BASIS-​R): Reliability and validity. Medical Care, 42, 1230–​1241. Endicott, J., Spitzer, R. L., Fleiss, J. L., & Cohen, J. (1976). The Global Assessment Scale: A procedure for measuring overall severity of psychiatric disturbance. Archives of General Psychiatry, 33, 766–​771. Evins, A. E., Cather, C., Pratt, S. A., Pachas, G. N., Hoeppner, S. S., Goff, D. C.,  .  .  .  Schoenfeld, D. A. (2014). Maintenance treatment with varenicline for smoking cessation in patients with schizophrenia and bipolar disorder:  A randomized clinical trial. JAMA, 311, 145–​154.


Färdig, R., Lewander, T., Fredriksson, A., & Melin, L. (2011). Evaluation of the Illness Management and Recovery Scale in schizophrenia and schizoaffective disorder. Schizophrenia Research, 132, 157–​164. Fenton, W. S. (2003). Shared decision making: A model for the physician–​patient relationship in the 21st century? Acta Psychiatrica Scandinavica, 107, 401–​402. Fenton, W. S., & McGlashan, T. H. (1991). Natural history of schizophrenia subtypes:  II. Positive and negative symptoms and long term course. Archives of General Psychiatry, 48, 978–​986. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015a). Structured Clinical Interview for DSM-​ 5 Disorders–​ Clinician Version (SCID-​ 5). Arlington, VA: American Psychiatric Publishing. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015b). Structured Clinical Interview for DSM-​5 Disorders–​Research Version (SCID-​5-​RV). Arlington, VA: American Psychiatric Publishing. Ford, P. (2003). An evaluation of the Dartmouth Assessment of Lifestyle Inventory and the Leeds Dependence Questionnaire for use among detained psychiatric inpatients. Addiction, 98, 111–​118. Fosse, R., Joseph, J., & Riochardson, K. (2015). A critical assessment of the equal-​environment assumption of the twin method for schizophrenia. Frontiers in Psychiatry, 6, 62. Friis, S., Melle, I., Johannessen, J. O., Røssberg, J. I., Barder, H. E., Evensen, J. H.,  .  .  .  McGlashan, T. H. (2016). Early predictors of ten-​year course in first-​episode psychosis. Psychiatric Services, 67, 438–​443. Fung, K. M. T., Tsang, H. W. H., & Corrigan, P. W. (2008). Self-​stigma of people with schizophrenia as predictor of their adherence to psychosocial treatment. Psychiatric Rehabilitation Journal, 32, 95–​104. Gale, C. R., Batty, G. D., Osborn, D. P.  J., Tynelius, P., Whitley, E., & Rasmussen, F. (2012). Association of mental disorders in early adulthood and later psychiatric hospital admissions and mortality in a cohort study of more than 1 million men. Archives of General Psychiatry, 69, 823–​831. Gearon, J. S., Bellack, A. S., & Tenhula, W. N. (2004). Preliminary reliability and validity of the Clinician-​ Administered PTSD Scale for schizophrenia. Journal of Consulting and Clinical Psychology, 72, 121–​125. Gee, B., Hodgekins, J., Fowler, D., Marshall, M., Everard, L., Lester, H., . . . Birchwood, M. (2016). The course of negative symptom in first episode psychosis and the relationship with social recovery. Schizophrenia Research, 174, 165–​171. Giffort, D., Schmook, A., Woody, C., Vollendorf, C., & Gervain, M. (1995). Construction of a scale to measure consumer recovery. Springfield, IL:  Illinois Office of Mental Health.


Glynn, S. M., Marder, S. R., Liberman, R. P., Blair, K., Wirshing, W. C., Wirshing, D. A., . . . Mintz, J. (2002). Supplementing clinic-​based skills training with manual-​ based community support sessions:  Effects on social adjustment of patients with schizophrenia. American Journal of Psychiatry, 159, 829–​837. Gold, J. M., Queern, C., Iannone, V. N., & Buchanan, R. W. (1999). Repeatable Battery for the Assessment of Neuropsychological Status as a screening test in schizophrenia:  II. Convergent/​ discriminant validity and diagnostic group comparisons. American Journal of Psychiatry, 156, 1944–​1950. Goodman, L. A., Rosenberg, S. D., Mueser, K. T., & Drake, R. E. (1997). Physical and sexual assault history in women with serious mental illness:  Prevalence, correlates, treatment, and future research directions. Schizophrenia Bulletin, 23, 685–​696. Granholm, E., Ben-​Zeev, D., Link, P. C., Bradshaw, K. R., & Holden, J. L. (2012). Mobile Assessment and Treatment for Schizophrenia (MATS): A pilot trial of an interactive text-​messaging intervention for medication adherence, socialization, and auditory hallucinations. Schizophrenia Bulletin, 38, 414–​425. Green, M. F., Llerena, K., & Kern, R. S. (2015). The “right stuff” revisited:  What have we learned about the determinants of daily functioning in schizophrenia? Schizophrenia Bulletin, 41, 781–​785. Grubaugh, A. L., Zinzow, H. M., Paul, L., Egede, L. E., & Frueh, B. C. (2011). Trauma exposure and posttraumatic stress disorder in adults with severe mental illness: A critical review. Clinical Psychology Review, 31, 883–​899. Guy, W. (1976). ECDEU assessment manual for psychopharmacology, revised (DHEW Publication No. ADM 76-​338). Rockville, MD:  U.S. Department of Health, Education, and Welfare. Haddock, G., McCarron, J., Tarrier, N., & Faragher, E. B. (1999). Scales to measure dimensions of hallucinations and delusions: The Psychotic Rating Scales (PSYRATS). Psychological Medicine, 29, 879–​889. Häfner, H. (2000). Onset and early course as determinants of the further course of schizophrenia. Acta Psychiatrica Scandinavica, 102(Suppl. 407), 44–​48. Häfner, H., & an der Heiden, W. (2003). Course and outcome of schizophrenia. In S. R. Hirsch & D. R. Weinberger (Eds.), Schizophrenia (2nd ed., pp. 101–​141). Oxford, UK: Blackwell. Häfner, H., & an der Heiden, W. (2008). Course and outcome. In K. T. Mueser & D. V. Jeste (Eds.), Clinical handbook of schizophrenia (pp. 100–​113). New  York, NY: Guilford. Hamann, J., Leucht, S., & Kissling, W. (2003). Shared decision making in psychiatry. Acta Psychiatrica Scandinavica, 107, 403–​409.

Harding, C. M., Brooks, G. W., Ashikaga, T., Strauss, J. S., & Breier, A. (1987). The Vermont longitudinal study of persons with severe mental illness:  II. Long-​term outcome of subjects who retrospectively met DSM-​III criteria for schizophrenia. American Journal of Psychiatry, 144, 727–​735. Harding, C. M., & Keller, A. B. (1998). Long-​term outcome of social functioning. In K. T. Mueser & N. Tarrier (Eds.), Handbook of social functioning in schizophrenia (pp. 134–​148). Boston, MA: Allyn & Bacon. Haro, J. M., Arbabzadeh-​ Bouchez, S., Brugha, T. S., de Girolamo, G., Guyer, M. E., Jin, R., . . . Kessler, R. C. (2006). Concordance of the Composite International Diagnostic Interview Version 3.0 (CIDI 3.0) with standardized clinical assessments in the WHO World Mental Health surveys. International Journal of Methods in Psychiatric Research, 15, 167–​180. Haro, J. M., Kamath, S. A., Ochoa, S., Novick, D., Rele, K., Fargas, A.,  .  .  .  SOHO Study Group. (2003). The Clinical Global Impression–​ Schizophrenia scale:  A simple instrument to measure the diversity of symptoms present in schizophrenia. Acta Psychiatrica Scandinavia Supplement, 416, 16–​23. Harvey, P. D. (Ed.). (2013). Cognitive impairment in schizophrenia. Cambridge, UK: Cambridge University Press. Harvey, P. D., Davidson, M., Mueser, K. T., Parrella, M., White, L., & Powchik, P. (1997). Social-​ Adaptive Functioning Evaluation (SAFE): A rating scale for geriatric psychiatric patients. Schizophrenia Bulletin, 23, 131–​145. Harvey, P. D., Raykov, T., Twamley, E. W., Vella, L., Heaton, R. K., & Patterson, T. L. (2011). Validating the measurement of real-​world functional outcome: Phase I results of the VALERO study. American Journal of Psychiatry, 268, 1195–​1201. Hasson-​Ohayon, I., Roe, D., & Kravetz, S. (2008). The psychometric properties of the Illness Management and Recovery scale: Client and clinician versions. Psychiatry Research, 160, 228–​235. Heinrichs, D. W., Hanlon, T. E., & Carpenter, W. T.  J. (1984). The Quality of Life Scale: An instrument for rating the schizophrenia deficit syndrome. Schizophrenia Bulletin, 10, 388–​396. Hendryx, M., Dyck, D. G., McBride, D., & Whitbeck, J. (2001). A test of the reliability and validity of the Multnomah Community Ability Scale. Community Mental Health Journal, 37, 157–​168. Hodgins, S., Piatosa, M. J., & Schiffer, B. (2014). Violence among people with schizophrenia: Phenotypes and neurobiology. Current Topics in Behavioral Neuroscience, 17, 329–​368. Hoek, H. W., Brown, A. S., & Susser, E. (1998). The Dutch famine and schizophrenia spectrum disorders. Social Psychiatry and Psychiatric Epidemiology, 33, 373–​379.


Hogan, T. P., Awad, A. G., & Eastwood, R. (1983). A self-​ report scale predictive of drug compliance in schizophrenics:  Reliability and discriminative validity. Psychological Medicine, 13, 177–​183. Hogarty, G. E., Anderson, C. M., Reiss, D. J., Kornblith, S. J., Greenwald, D. P., Ulrich, R. F., & Carter, M. (1991). Family psychoeducation, social skills training, and maintenance chemotherapy in the aftercare treatment of schizophrenia:  II. Two-​year effects of a controlled study on relapse and adjustment. Archives of General Psychiatry, 48, 340–​347. Hooley, J. M., & Parker, H. A. (2006). Measuring expressed emotion:  An evaluation of the shortcuts. Journal of Family Psychology, 20, 386–​396. Hooley, J. M., & Teasdale, J. D. (1989). Predictors of relapse in unipolar depression:  Expressed emotion, marital quality, and perceived criticism. Journal of Abnormal Psychology, 98, 229–​235. Inskip, H. M., Harris, E. C., & Barraclough, C. (1998). Lifetime risk of suicide for alcoholism, affective disorder and schizophrenia. British Journal of Psychiatry, 172, 35–​37. Institute of Medicine. (2001). Neurological, psychiatric, and developmental disorders: Meeting the challenges in the developing world. Washington, DC: National Academies Press. Jablensky, A. (1997). The 100-​year epidemiology of schizophrenia. Schizophrenia Research, 28, 111–​125. Jacob, K. S., Johnson, S., Prince, M. J., Bhugra, D., & David, A. S. (2007). Assessing insight in schizophrenia:  East meets West. British Journal of Psychiatry, 190, 243–​247. Jaeger, J., Berns, S. M., & Czobor, P. (2003). The Multidimensional Scale of Independent Functioning: A new instrument for measuring functional disability in psychiatric populations. Schizophrenia Bulletin, 29, 153–​168. Janssen, E. M., McGinty, E. E., Azrin, S. T., Juliano-​ Bult, D., & Daumit, G. L. (2015). Review of the evidence: Prevalence of medical conditions in the United States population with serious mental illness. General Hospital Psychiatry, 199, 199–​222. Jenkins, J. H., & Barrett, R. J. (Eds.). (2004). Schizophrenia, culture, and subjectivity:  The edge of experience. Cambridge, UK: Cambridge University Press. Jerrell, J. M. (2005). Behavior and symptom identification scale 32:  Sensitivity to change over time. Journal of Behavioral Health Services & Research, 20, 341–​346. Kane, J. M., Robinson, D. G., Schooler, N. R., Mueser, K. T., Penn, D. L., Rosenheck, R. A.,  .  .  .  Heinssen, R. K. (2016). Comprehensive versus usual community care for first-​episode psychosis:  2-​year outcomes from the NIMH RAISE early treatment program. American Journal of Psychiatry, 173, 362–​372. Kay, S. R., Opler, L. A., & Fiszbein, A. (1987). The Positive and Negative Syndrome Scale (PANSS) for schizophrenia. Schizophrenia Bulletin, 13, 261–​276.


Keefe, R. S.  E., Goldberg, T. E., Harvey, P. D., Gold, J. M., Poe, M. P., & Coughenour, L. (2004). The Brief Assessment of Cognition in Schizophrenia: Reliability, sensitivity, and comparison with a standard neurocognitive battery. Schizophrenia Research, 68, 283–​297. Kendler, K. S., Ohlsson, H., Mezuk, B., Sundquist, J. O., & Sundquist, K. (2016). Observed cognitive performance and deviation from familial cognitive aptitude at age 16 years and ages 18 to 20 years and risk for schizophrenia and bipolar illness in a Swedish national sample. JAMA Psychiatry, 73, 465–​471. Kessler, R. C., & Ustün, T. B. (2004). The World Mental Health (WMH) Survey Initiative version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI). International Journal of Methods in Psychiatric Research, 13, 93–​121. Khalifeh, H., Johnson, S., Howard, L. M., Borschmann, R., Osborn, D., Dean, K.,  .  .  .  Moran, P. (2015). Violent and non-​violent crime against adults with severe mental illness. British Journal of Psychiatry, 206, 275–​282. Khashan, A. S., Abel, K. M., McNamee, R., Pedersen, M. G., Webb, R. T., Baker, P. N., . . . Mortensen, P. B. (2008). Higher risk of offspring schizophrenia following antenatal maternal exposure to severe adverse life events. Archives of General Psychiatry, 65, 146–​152. King, M., Dinos, S., Shaw, J., Watson, R., Stevens, S., Passetti, F.,  .  .  .  Serfaty, M. (2007). The Stigma Scale:  Development of a standardised measure of the stigma of mental illness. British Journal of Psychiatry, 190, 248–​254. Kirkpatrick, B., Strauss, G. P., Nguyen, L., Fischer, B. A., Daniel, D. G., Cienfuegos, A., & Marder, S. M. (2011). The Brief Negative Symptom Scale:  Psychometric properties. Schizophrenia Bulletin, 37, 300–​305. Kraepelin, E. (1971). Dementia praecox and paraphrenia (R. M. Barclay, Trans.). New York, NY: Krieger. (Original work published 1919) Kreisman, D., Blumenthal, R., Borenstein, M., Woerner, M., Kane, J., Rifkin, A., & Reardon, G. (1988). Family attitudes and patient social adjustment in a longitudinal study of outpatient schizophrenics receiving low-​dose neuroleptics:  The family’s view. Psychiatry, 51, 3–​13. Kreisman, D. E., Simmens, S. J., & Joy, V. D. (1979). Rejecting the patient: #preliminary validation of a self-​ report scale. Schizophrenia Bulletin, 5, 220–​222. Kring, A. M., Gur, R. E., Blanchard, J. J., Horan, W. P., & Reise, S. P. (2013). The Clinical Assessment Interview for Negative Symptoms (CAINS):  Final development and validation. American Journal of Psychiatry, 170, 165–​172. Kuriansky, J. B., Deming, W. E., & Gurland, B. J. (1974). On trends in the diagnosis of schizophrenia. American Journal of Psychiatry, 131, 402–​408.


Laudet, A. B., Magura, S., Vogel, H. S., & Knight, E. L. (2004). Perceived reasons for substance misuse among persons with a psychiatric disorder. American Journal of Orthopsychiatry, 74, 365–​375. Lecomte, T., Wallace, C. J., Caron, J., Perreault, M., & Lecomte, J. (2004). Further validation of the Client Assessment of Strengths Interests and Goals. Schizophrenia Research, 66, 59–​70. Leff, J., & Vaughn, C. (Eds.). (1985). Expressed emotion in families. New York, NY: Guilford. Lehman, A., Kernan, E., & Postrado, L. (1995). Toolkit for evaluating quality of life for persons with severe mental illness. Baltimore, MD:  Evaluation Center at Human Services Research Institute. Lehman, A. F. (2006). Quality of Life Interview:  Self-​ Administered Short Form (TL-​30S Version). Baltimore, MD:  Center for Mental Health Services Research, Department of Psychiatry, University of Maryland. Lewis-​Fernández, R., Aggarwal, N. K., Bäärnhielm, S., Rohlof, H., Kirmayer, L. J., Weiss, M. G.,  .  .  .  Lu, F. (2014). Culture and psychiatric evaluation:  Operationalizing cultural formulation for DSM-​ 5. Psychiatry, 77, 130–​154. Liddle, P. F. (1987). The symptoms of chronic schizophrenia:  A re-​examination of the positive–​negative dichotomy. British Journal of Psychiatry, 151, 145–​151. Lima, L. A., Goncalves, S., Pereira, B. B., & Lovisi, G. M. (2006). The measurement of social disablement and assessment of psychometric properties of the Social Behaviour Schedule (SBS-​BR) in 881 Brazilian long-​ stay psychiatric patients. International Journal of Social Psychiatry, 52, 101–​109. Long, J. D., & Brekke, J. S. (1999). Longitudinal factor structure of the Brief Psychiatric Rating Scale in schizophrenia. Psychological Assessment, 11, 498–​506. Lucksted, A., Drapalski, A., Calmes, C., Forbes, C., DeForge, B., & Boyd, J. (2011). Ending self-​stigma:  Pilot evaluation of a new intervention to reduce internalized stigma among people with mental illnesses. Psychiatric Rehabilitation Journal, 35, 51–​54. Lukoff, D., Nuechterlein, K. H., & Ventura, J. (1986). Manual for the Expanded Brief Psychiatric Rating Scale (BPRS). Schizophrenia Bulletin, 12, 594–​602. MacCabe, J. H., Wicks, S., Löfving, S., David, A. S., Berndtsson, Å., Gustafsson, J.-​E.,  .  .  .  Dalman, C. (2013). Decline in cognitive performance between ages 13 and 18 years and the risk for psychosis in adulthood:  A Swedish longitudinal cohort study in males. JAMA Psychiatry, 70, 261–​270. Maisto, S. A., Carey, M. P., Carey, K. B., Gordon, C. M., & Gleason, J. R. (2000). Use of the AUDIT and the DAST-​10 to identify alcohol and drug use disorders among adults with a severe and persistent mental illness. Psychological Assessment, 12, 186–​192.

Malgady, R., Lloyd, H., & Tryon, W. (1992). Issues of validity in the Diagnostic Interview Schedule. Journal of Psychiatric Research, 26, 59–​67. Matarazzo, J. D. (1983). The reliability of psychiatric and psychological diagnosis. Clinical Psychology Review, 3, 103–​145. McCrone, P., Leese, M., Thornicroft, G., Schene, A. H., Knudsen, H. C., Vázquez-​Barquero, J. L., . . . Griffiths, G. (2005). Reliability of the Camberwell Assessment of Need–​European Version. Epsilon Study 6.  European psychiatric services: Inputs linked to outcome domains and needs. British Journal of Psychiatry, 39(Suppl.), S34–​S40. McGrath, J., Saha, S., Welham, J., Saadi, E., MacCauley, C., & Chant, D. (2004). A systematic review of the incidence of schizophrenia:  The distribution of rates and influence of sex, urbanicity, migrant status and methodology. BMC Medicine, 2, 13. McGuffin, P., Owen, M. J., & Farmer, A. E. (1996). Genetic basis of schizophrenia. Lancet, 346, 678–​682. McGuire, A. B., Kean, J., Bonfils, K., Presnell, J., & Salyers, M. P. (2014). Rasch Analysis of the Illness Management and Recovery Scale–​ Clinician Version. Journal of Evaluation in Clinical Practice, 20, 383–​389. McGurk, S. R., & Mueser, K. T. (2004). Cognitive functioning, symptoms, and work in supported employment: A review and heuristic model. Schizophrenia Research, 70, 147–​174. McHugo, G. J., Drake, R. E., Burton, H. L., & Ackerson, T. H. (1995). A scale for assessing the stage of substance abuse treatment in persons with severe mental illness. Journal of Nervous and Mental Disease, 183, 762–​767. McHugo, G. J., Paskus, T. S., & Drake, R. E. (1993). Detection of alcoholism in schizophrenia using the MAST. Alcoholism:  Clinical and Experimental Research, 17, 187–​191. McLellan, A. T., Kushner, H., Metzger, D., Peters, R., Smith, I., Grissom, G.,  .  .  .  Angerious, M. (1992). The fifth edition of the Addiction Severity Index:  Historical critique and normative data. Journal of Substance Abuse Treatment, 9, 199–​213. Medeiros-​Ferreira, L., Navarro-​Pastor, J. B., Zúñiga-​Lagares, A., Romaní, R., Muray, E., & Obiols, J. E. (2016). Perceived needs and health-​related quality of life in people with schizophrenia and metabolic syndrome: A “real-​world” study. BMC Psychiatry, 16, 414. Mehl, S., Werner, D., & Lincoln, T. M. (2015). Does cognitive behavior therapy for psychosis (CBTp) show a sustainable effect on delusions? A meta-​analysis. Frontiers in Psychology, 6, 1450. Meier, M. H., Caspi, A., Reichenberg, A., Keefe, R. S., Fisher, H. L., Harrington, H., . . . Moffitt, T. E. (2014). Neuropsychological decline in schizophrenia from the premorbid to the postonset period:  Evidence from a


population-​representative longitudinal study. American Journal of Psychiatry, 171, 91–​101. Miller, B. J., Stewart, A., Schrimsher, J., Peeples, D., & Buckley, P. F. (2015). How connected are people with schizophrenia? Cell phone, computer, email, and social media use. Psychiatry Research, 225, 458–​463. Miller, R., Ream, G., McCormack, J., Gunduz-​Bruce, H., Sevy, S., & Robinson, D. (2009). A prospective study of cannabis use as a risk factor for non-​ adherence and treatment dropout in first-​episode schizophrenia. Schizophrenia Research, 113, 138–​144. Monahan, J., Vesselinov, R., Robbins, P. C., & Appelbaum, P. S. (2017). Violence to others, violent self-​victimization, and violent victimization by others among persons with a mental illness. Psychiatric Services, 68(5), 516–​519. Mueser, K. T., Bellack, A. S., Wade, J. H., Sayers, S. L., Tierney, A., & Haas, G. (1993). Expressed emotion, social skill, and response to negative affect in schizophrenia. Journal of Abnormal Psychology, 102, 339–​351. Mueser, K. T., Bennett, M., & Kushner, M. G. (1995). Epidemiology of substance abuse among persons with chronic mental disorders. In A. F. Lehman & L. Dixon (Eds.), Double jeopardy: Chronic mental illness and substance abuse (pp. 9–​25). New York, NY: Harwood. Mueser, K. T., Curran, P. J., & McHugo, G. J. (1997). Factor structure of the Brief Psychiatric Rating Scale in schizophrenia. Psychological Assessment, 9, 196–​204. Mueser, K. T., Drake, R. E., Clark, R. E., McHugo, G. J., Mercer-​McFadden, C., & Ackerson, T. (1995). Toolkit for evaluating substance abuse in persons with severe mental illness. Cambridge, MA:  Evaluation Center at Human Services Research Institute. Mueser, K. T., & Gingerich, S. (2005). Illness Management and Recovery (IMR) scales. In T. Campbell-​ Orde, J. Chamberlin, J. Carpenter, & H. S. Leff (Eds.), Measuring the promise: A compendium of recovery measures (Vol. 2, pp. 124–​132). Cambridge, MA: Evaluation Center at Human Services Research Institute. Mueser, K. T., Gingerich, S., Salyers, M. P., Mcguire, A. B., Reyes, R. U., & Cunningham, H. (2005). Illness Management and Recovery (IMR) scales. In T. Campbell-​ Orde, J. Chamberlin, J. Carpenter, & H. S. Leff (Eds.), Measuring the promise:  A compendium of recovery measures (Vol. 2, pp. 32–​35). Cambridge, MA:  Evaluation Center at Human Services Research Institute. Mueser, K. T., Kim, M., Addington, J., McGurk, S. R., Pratt, S. I., & Addington, D. (2017). Confirmatory factor analysis of the Quality of Life Scale and new proposed factor structure for the Quality of Life Scale-​Revised. Schizophrenia Research, 181, 117–​123. Mueser, K. T., Nishith, P., Tracy, J. I., DeGirolamo, J., & Molinaro, M. (1995). Expectations and motives for substance use in schizophrenia. Schizophrenia Bulletin, 21, 367–​378.


Mueser, K. T., Noordsy, D. L., Drake, R. E., & Fox, M. L. (2003). Integrated treatment for dual disorders: A guide to effective practice. New York, NY: Guilford. Mueser, K. T., Rosenberg, S. D., Goodman, L. A., & Trumbetta, S. L. (2002). Trauma, PTSD, and the course of schizophrenia: An interactive model. Schizophrenia Research, 53, 123–​143. Mueser, K. T., Sayers, S. L., Schooler, N. R., Mance, R. M., & Haas, G. L. (1994). A multisite investigation of the reliability of the Scale for the Assessment of Negative Symptoms. American Journal of Psychiatry, 151, 1453–​1462. Mueser, K. T., Sengupta, A., Schooler, N. R., Bellack, A. S., Xie, H., Glick, I. D., & Keith, S. J. (2001). Family treatment and medication dosage reduction in schizophrenia:  Effects on patient social functioning, family attitudes, and burden. Journal of Consulting and Clinical Psychology, 69, 3–​12. Murray, C. J. L., & Lopez, A. D. (Eds.). (1996). The global burden of disease: A comprehensive assessment of mortality and disability from diseases, injuries, and risk factors in 1990 and projected to 2020. Cambridge, MA: Harvard University Press. Murray, R. M., & Van Os, J. (1998). Predictors of outcome in schizophrenia. Journal of Clinical Psychopharmacology, 18, 2S-​4S. Niv, N., Cohen, A. N., Mintz, J., Ventura, J., & Young, A. S. (2007). The validity of using patient self-​report to assess psychotic symptoms in schizophrenia. Schizophrenia Research, 90, 245–​250. Niv, N., Cohen, A. N., Sullivan, G., & Young, A. S. (2007). The MIRECC version of the Global Assessment of Functioning Scale:  Reliability and validity. Psychiatric Services, 58, 529–​535. Nuechterlein, K. H., & Dawson, M. E. (1984). A heuristic vulnerability/​ stress model of schizophrenic episodes. Schizophrenia Bulletin, 10, 300–​312. Okkels, N., Trabjerg, B., Arendt, M., & Pedersen, C. B. (2017). Traumatic stress disorders and risk of subsequent schizophrenia spectrum disorder or bipolar disorder: A nationwide cohort study. Schizophrenia Bulletin, 43, 180–​186. Osher, F. C., & Kofoed, L. L. (1989). Treatment of patients with psychiatric and psychoactive substance use disorders. Hospital and Community Psychiatry, 40, 1025–​1030. Osterberg, L., & Blaschke, T. (2005). Adherence to medication. New England Journal of Medicine, 353, 487–​497. Overall, J. E., & Gorham, D. R. (1962). The Brief Psychiatric Rating Scale. Psychological Reports, 10, 799–​812. Pedersen, G., Hagtvet, K. A., & Karterud, S. (2007). Generalizability studies of the Global Assessment of Functioning–​Split version. Comprehensive Psychiatry, 48, 88–​94.


21

Personality Disorders

Stephanie L. Rojas
Thomas A. Widiger

This chapter is concerned with an evidence-based assessment for personality disorder. It is organized into three sections: instruments for the diagnosis of personality disorders, for case conceptualization and treatment planning, and for treatment monitoring and outcome. For the purposes of this chapter, the focus is on personality disorders; other chapters in this volume provide useful resources for assessing other aspects of the client with a personality disorder. The chapter begins with a brief discussion of the nature of personality disorder.

NATURE OF THE DISORDER

Personality is one’s characteristic manner of thinking, feeling, behaving, and relating to others. Personality traits are typically perceived to be integral to each person’s sense of self because they involve what persons value, what they do, and what they are like most every day throughout much of their lives. According to the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​ 5; American Psychiatric Association [APA], 2013a), it is “when personality traits are inflexible and maladaptive and cause significant functional impairment or subjective distress [that] they constitute Personality Disorders” (p. 647). The current edition of the DSM, DSM-​ 5 (APA, 2013a), retained in Section II (the section for the official diagnoses) everything from DSM-​IV (APA, 1994) regarding personality disorders. Included in Section III of DSM-​ 5, however, for “emerging measures and models” (APA, 2013a, p. 729), is a proposed hybrid model within which personality disorders are said to involve a combination of deficits in the sense of self and interpersonal relatedness (Criterion A) along with maladaptive personality traits

(Criterion B). Although this conundrum is not addressed in the context of evidence-​based assessment of personality disorders, this chapter does address assessment measures that can be used for both Section II and Section III for personality disorder diagnoses. The 10 personality disorders retained in DSM-​ 5 Section II are the paranoid, schizoid, and schizotypal (placed within an odd–​eccentric cluster); the histrionic, antisocial, borderline, and narcissistic (placed within a dramatic–​emotional cluster); and the avoidant, dependent, and obsessive–​ compulsive (placed within an anxious–​ avoidant cluster). By definition, personality disorders must be evident since adolescence or young adulthood and have been relatively chronic and stable throughout adult life. As such, they often predate the occurrence of other mental disorders, such as a mood, anxiety, or substance use disorder. It is estimated that approximately 15% of adults in the United States meet diagnostic criteria for at least one personality disorder (APA, 2013a). Although the comorbid presence of a personality disorder is likely to have an important impact on the course and treatment of other forms of psychopathology (Links & Eynan, 2013), the prevalence of personality disorder is generally underestimated in clinical practice, due in part to the failure to provide systematic or comprehensive assessments of personality disorder symptomatology (Miller, Few, & Widiger, 2012). One change for the personality disorders in DSM-​5 was the loss of the multiaxial system of DSM-​IV-​TR (APA, 2000), wherein personality disorders had been placed on a separate diagnostic axis. The reason for the multiaxial system was the fundamental differences between the personality disorders and other forms of psychopathology. Personality disorders will typically predate the occurrence of other mental disorders of adulthood and may in fact


have contributed to their etiology as well as their future course and treatment. In summary, it is possible that the loss of the multiaxial system will further diminish the assessment of personality disorders in clinical practice. Personality disorders are highly comorbid with one another (Clark, 2007; Trull, Scheiderer, & Tomko, 2012). Patients who meet the DSM-5 diagnostic criteria for one personality disorder are likely to meet the diagnostic criteria for another. DSM-5 noted that "prevalence estimates for the different clusters suggest 5.7% for Cluster A [odd–eccentric], 1.5% for Cluster B [dramatic–emotional], 6% for Cluster C [anxious–avoidant], and 9.1% for any personality disorder, indicating frequent co-occurrence of disorders from different clusters" (APA, 2013a, p. 646). DSM-5 instructs clinicians that all diagnoses should be recorded because it can be important to consider, for example, the presence of antisocial traits in someone with a borderline personality disorder or the presence of paranoid traits in someone with a dependent personality disorder. However, the extent of diagnostic co-occurrence is at times so extensive that many researchers prefer a more dimensional or profile description of personality (Clark, 2007; Skodol, 2012; Trull et al., 2012; Widiger & Trull, 2007). A primary purpose of a diagnosis is to suggest a specific etiology and pathology for which a particular treatment would ameliorate the condition (First & Tasman, 2006). However, many of the disorders in DSM-5, including the personality disorders, may not in fact have single etiologies or even specific pathologies (Kupfer, First, & Regier, 2002). Research has suggested the DSM-5 Section II personality disorders are typically constellations of maladaptive personality traits resulting from multiple genetic dispositions that are interacting with a variety of negative environmental experiences (Paris, 2012; Widiger & Trull, 2007).
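As a rough check on these figures (using only the cluster estimates quoted above), note that if the three cluster prevalence rates identified mutually exclusive groups of people, their sum could not exceed the rate for any personality disorder; instead,

\[ 5.7\% + 1.5\% + 6.0\% = 13.2\% > 9.1\%, \]

so a substantial proportion of the individuals counted must meet criteria for disorders from more than one cluster.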


ASSESSMENT FOR DIAGNOSIS

Personality disorders can be among the most difficult to assess. Personality includes one's characteristic sense of self, typically involving distortions in self-image (Millon, 2011). Dependent persons can be excessively self-effacing and even self-denigrating, narcissistic persons can be grandiose and arrogant, and paranoid persons can be highly suspicious and mistrustful. As a result, simply seeking self-reported information from persons who are characterized, in part, by distortions in self-image can complicate a valid assessment (Miller et al., 2012; Widiger & Boyd, 2009).

Most DSM-5 mental disorders can be diagnosed simply through an assessment of current functioning. One need not inquire as to the person's functioning 15 years ago to assess whether or not the person currently has a major depressive disorder. However, an assessment of current functioning can be highly misleading when diagnosing a personality disorder, particularly if the person is currently suffering from, or is in treatment for, a mood, anxiety, or other comorbid mental disorder. One needs to distinguish the effect of these other mental disorders on the patient's current functioning from the characteristic manner of thinking, feeling, and relating to others that predated their onset.

The most commonly used and preferred method for the diagnosis of a personality disorder in general clinical practice is an unstructured clinical interview (Westen, 1997). However, studies have consistently indicated that assessments based on unstructured clinical interviews do not consider all of the necessary or important diagnostic criteria (Garb, 2005). Personality disorder assessments based on unstructured clinical interviews are often unreliable (Miller et al., 2012). Clinicians may base their diagnosis on a subjective impression or focus on just one or two diagnostic criteria that they consider to be sufficient (Samuel & Bucher, 2017). The diagnosis of a particular personality disorder may even be governed by the particular theoretical interests of the clinician as well as gender and cultural biases (Garb, 2005; Oltmanns & Powers, 2012).

The preferred method for diagnosing personality disorders in research is the semi-structured interview (Segal & Coolidge, 2007; Skodol, 2014; Widiger & Boyd, 2009; Zimmerman, 2003). Semi-structured interviews have several advantages over unstructured interviews (McDermut & Zimmerman, 2008; Miller et al., 2012). Semi-structured interviews ensure and document that a systematic and comprehensive assessment of each personality disorder diagnostic criterion has been made. This documentation can be particularly helpful in situations in which the credibility or validity of the assessment might be questioned, such as forensic or disability evaluations. Semi-structured interviews provide specific, carefully selected questions for the assessment of each diagnostic criterion, the application of which increases the likelihood that assessments will be consistent across interviewers. Therefore, semi-structured interviews provide more reliable and valid results across interviewers and time (Miller et al., 2012; Segal & Coolidge, 2007; Widiger & Boyd, 2009; Wood, Garb, Lilienfeld, & Nezworski, 2002). In addition, the manuals that often accompany a


semi-​structured interview frequently provide a considerable amount of helpful information for understanding the rationale of each diagnostic criterion, for interpreting vague or inconsistent symptoms, and for resolving diagnostic ambiguities (e.g., Loranger, 1999; Widiger, Mangine, Corbitt, Ellis, & Thomas, 1995). Concerns regarding problems with semi-​ structured interviews (e.g., time-​ consuming and less flexibility) should not dissuade clinicians from their use (Segal & Coolidge, 2007; Widiger & Samuel, 2005; Zimmerman, 2003). For example, the diagnosis of intellectual disability typically requires a time-​consuming and structured assessment battery that includes the assessment of both intellectual and adaptive functioning. Yet few clinicians object to these requirements or would risk making such a diagnosis on the basis of an unstructured interview (Widiger & Clark, 2000). It is not unreasonable to expect clinicians to utilize similarly rigorous assessment methodologies for the assessment of personality disorders, especially when these disorders and traits are related to significant functional impairment (e.g., Skodol et  al., 2002)  and have important implications for treatment utilization (e.g., Miller, Pilkonis, & Mulvey, 2006)  and outcomes (e.g., Skodol, 2008). However, it is also unrealistic to expect clinicians to have the time to assess all the diagnostic criteria for the personality disorders (Mullins-​ Sweatt, Lengel, & DeShong, 2016), typically requiring 2 hours (Widiger & Boyd, 2009). Therefore, it is recommended that one first administer a self-​report inventory to identify the most likely personality disorders to be present, followed by a semi-​structured interview to document the presence of their respective diagnostic criteria (Widiger & Samuel, 2005). A variety of self-​report inventories and interviews that would be useful to clinicians for assessing abnormal personality functioning have been developed. A  complete summary of all these potential instruments is beyond the scope of this chapter, but several extensive reviews exist (e.g., Clark & Harrison, 2001; Furnham, Milner, Akhtar, & De Fruyt, 2014; McDermut & Zimmerman, 2005; Miller et  al., 2012; Rogers, 2001; Segal & Coolidge, 2007; Widiger & Boyd, 2009). There are five semi-​ structured interviews designed to assess the 10 DSM-​5 Section II personality disorders: (a) Diagnostic Interview for DSM-​IV Personality Disorders (DIPD-​IV; Zanarini, Frankenburg, Chauncey, & Gunderson, 1987; Zanarini, Frankenburg, Sickel, & Young, 1996), (b)  International Personality Disorder Examination (IPDE; Loranger, 1999), (c)  Personality Disorder Interview-​ IV (PDI-​ IV; Widiger et  al., 1995), (d)  Structured Clinical Interview

for DSM-​ 5 Personality Disorders (SCID-​ 5-​ PD; First, Williams, Benjamin, & Spitzer, 2016), and (e) Structured Interview for DSM-​IV Personality Disorders (SIDP-​IV; Pfohl, Blum, & Zimmerman, 1997). In addition, the Shedler–​Westen Assessment Procedure-​200 (SWAP-​200) is a clinician rating form of 200 items, drawn from the psychoanalytic and personality disorder literature (Shedler, 2015). SWAP-​200 items are not ranked on the basis of an administration of a series of questions; instead, the SWAP-​200 “relies on clinicians to do what clinicians do well:  observe and describe individual patients or clients they know” (Shedler, 2015, p. 228). There are eight traditional self-​report inventories for the assessment of the DSM-​5 Section II (i.e., DSM-​IV) personality disorders:  (a) Coolidge Axis II Inventory (CATI; Coolidge 1992); (b)  Minnesota Multiphasic Personality Inventory-​ 2 (MMPI-​ 2) personality disorder scales developed originally by Morey, Waugh, and Blashfield (1985) but revised for the MMPI-​ 2 by Colligan, Morey, and Offord (1994); (c) Millon Clinical Multiaxial Inventory-​IV (Millon, Grossman, & Millon, 2015); (d) OMNI Personality Inventory (OMNI; Loranger, 2001); (e)  Personality Diagnostic Questionnaire-​4 (PDQ-​4; Bagby & Farvolden, 2004); (f) Personality Assessment Inventory (PAI; Morey & Boggs, 2004); (g) Schedule for Nonadaptive and Adaptive Personality–​2nd Edition (SNAP-​2; Clark, Simms, Wu, & Casillas, 2014); and (h)  Wisconsin Personality Disorders Inventory-​IV (WISPI-​IV; Klein et  al., 1993). These eight inventories contain items that assess for the respective diagnostic criteria of each personality disorder. There are also two self-​report inventories that assess for Section III maladaptive personality traits that have also been keyed for the DSM-​5 Section II personality disorders: (a) Five Factor Model Personality Disorder scales (FFMPD; Widiger, Lynam, Miller, & Oltmanns, 2012)  and (b)  Personality Inventory for DSM-​5 (PID-​5; Krueger, Derringer, Markon, Watson, & Skodol, 2012). Table 21.1 provides a comparative listing of these instruments, using the rating system of this text (see Chapter  1). The first five instruments in the table (i.e., DIPD, IPDE, PDI-​IV, SCID-​5-​PD, and SIDP-​IV) are the five semi-​structured interviews, presented in alphabetical order and followed by the SWAP-​200 clinician rating form. The next eight instruments are the traditional self-​report inventories (i.e., CATI, MCMI-​IV, MMPI-​2, OMNI, PAI, PDQ-​4, SNAP-​2, and WISPI-​IV), followed by the two trait-​based self-​report inventories (i.e., FFMPD and PID-​5), again presented in alphabetical order. Rather than provide summary details on each of these measures in turn, the following sections focus on the psychometric

TABLE 21.1 Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended

Semi-Structured Interviews
DIPD | NR | G | E | A | G | G | G | A | ✓
IPDE | A | G | E | A | G | E | G | A | ✓
PDI-IV | NR | G | E | A | G | A | G | A | ✓
SCID-5-PD | NR | G | E | A | G | E | G | A | ✓
SIDP-IV | NR | G | E | A | G | E | G | A | ✓

Clinician Ratings
SWAP-200 | NR | G | A | NR | A | A | NR | A |

Traditional Self-Report Inventories
CATI | A | G | NA | A | A | A | NR | G |
MCMI-IV | E | G | NA | A | A | A | A | A |
MMPI-2 | E | G | NA | A | A | A | A | G |
OMNI | G | G | NA | A | A | A | NR | G |
PAI | E | G | NA | A | A | A | NR | A |
PDQ-4 | NR | A | NA | A | G | A | G | G |
SNAP-2 | A | G | NA | A | A | A | NR | G |
WISPI-IV | A | G | NA | A | A | A | NR | G |

Trait-Based Self-Report Inventories
FFMPD | NR | E | NA | NR | E | E | NR | E | ✓
PID-5 | NR | E | NA | A | G | E | NR | E | ✓

Note: DIPD = Diagnostic Interview for Personality Disorders; IPDE = International Personality Disorders Examination; PDI-IV = Personality Disorder Interview-IV; SCID-5-PD = Structured Clinical Interview for DSM-5 Personality Disorders; SIDP-IV = Structured Interview for DSM-IV Personality Disorders; SWAP-200 = Shedler–Westen Assessment Procedure; CATI = Coolidge Axis II Inventory; MCMI-IV = Millon Clinical Multiaxial Inventory-IV; MMPI-2 = Minnesota Multiphasic Personality Inventory-2; OMNI = OMNI Personality Inventory; PAI = Personality Assessment Inventory; PDQ-4 = Personality Diagnostic Questionnaire-4; SNAP-2 = Schedule for Nonadaptive and Adaptive Personality-2; WISPI-IV = Wisconsin Personality Disorders Inventory; FFMPD = Five Factor Model Personality Disorder scales; PID-5 = Personality Inventory for DSM-5; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

properties rated in the table and provide comparisons among instruments for each property.

Norms

Four of the five semi-structured interviews and the SWAP-200 were rated as not reported for normative data because normative data have not been provided within their test manuals (Kaye & Shea, 2000; Rogers, 2001; Widiger & Samuel, 2005). Normative data have not been obtained, in part, because of the substantial cost of conducting an epidemiological study with a semi-structured interview administered by professional clinicians. There are published studies in which mean values and prevalence rates have been provided, and one can compare one's findings with these published values (e.g., for mean SWAP-200 scores obtained in a clinical data set, see Westen & Shedler, 2003). However, these values can vary considerably across clinical settings, and one cannot consider these findings to actually represent normative data (i.e., a representative sample obtained from a designated

population; Clark, 2007). The one potential exception might be the IPDE, a version of which was administered in 14 mental health centers located in 11 different countries of North America, Europe, Africa, and Asia (Loranger, 1999). However, the individual results across countries for each disorder were never published. With regard to the eight traditional self-​report inventories, the test manuals for the MCMI-​IV (Millon et al., 2015), OMNI (Loranger, 1999), PAI (Morey, 1991), CATI (Coolidge & Merwin, 1992), WISPI-​ IV (Klein et  al., 1993), and SNAP-​2 (Clark et  al., 2014)  provide information concerning normative data. Colligan et  al. (1994) provide substantial information concerning the normative data for the MMPI-​ 2 personality disorder scales, although for unclear reasons, test manuals for the MMPI-​2 refer only in passing to the Morey et al. (1985) personality disorder scales (e.g., Derksen, 2006). A rating of not reported for normative data was provided for the PDQ-​4 because the PDQ-​4 has been treated in a manner comparable to the semi-​structured interviews (i.e., little attention given to providing normative information;


Bagby & Farvolden, 2004). Normative data have not yet been provided for the FFMPD and PID-​5.
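One way to see why the absence of normative data matters is the linear T-score transformation that many omnibus inventories use to scale raw scores against a normative sample (a general psychometric convention, not a claim about the scoring of any particular instrument reviewed here):

\[ T = 50 + 10 \times \frac{X - M_{\text{norm}}}{SD_{\text{norm}}} \]

With a hypothetical normative mean of 18 and standard deviation of 4, a raw score of 24 corresponds to T = 65, about 1.5 standard deviations above the normative mean; without a credible normative sample, no such benchmark for judging elevation is available.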

Reliability

Reliability can concern internal consistency, inter-rater agreement, and test–retest agreement (Kurtz, McCrae, Terracciano, & Yamagata, 2011). It should also be emphasized that the internal consistency of scores on an instrument is a reflection of the construct being assessed. The DSM-5 Section II personality disorders are constellations of maladaptive personality traits, and these syndromal assortments of traits can complicate obtaining accurate estimates of internal consistency when the traits do not themselves correlate highly with one another. For example, antisocial personality disorder includes traits of antagonism and disinhibition, whereas schizoid is confined largely to traits of introversion. As a result, scores on scales to assess DSM-5 Section II antisocial will often have poorer internal consistency compared to scales to assess schizoid. The FFMPD and PID-5 were rated as excellent for their assessment of the maladaptive personality trait scales; however, internal consistency estimates will at times decrease when these scales are combined for the respective DSM-5 Section II syndromes.

All five of the semi-structured interviews were rated as excellent with respect to inter-rater reliability (inter-rater reliability is not relevant to the self-report inventories). The major strength of these instruments is their provision of explicit and systematic assessments of each personality disorder diagnostic criterion, contributing to their obtainment of good to excellent inter-rater reliability in most studies (Furnham et al., 2014; Miller et al., 2012; Segal & Coolidge, 2007; Widiger & Boyd, 2009). Nevertheless, it is also worth noting that the reliability data that are reported in most studies have been confined to the agreement in the coding of respondents' answers to interview questions. This might not be the more important or fundamental concern with respect to the reliability of a personality disorder assessment (Clark & Harrison, 2001). Of greater importance would be studies addressing whether semi-structured interviews are being administered reliably across different research or clinical sites (Segal & Coolidge, 2007). For example, are some interviewers providing substantially more follow-up queries than other interviewers? Do patients respond to the same open-ended questions in a consistent manner at different times?

The reason that such research has not been conducted is largely the cost. Skodol, Oldham, Rosnick, Kellman, and Hyler (1991) conducted an impressive study in which they administered both the IPDE and SCID-II to the same 100 inpatients of a personality disorders treatment unit. Both interviews were administered blind to one another on the same day (one in the morning and the other in the afternoon). This study has never been replicated or extended to comparisons of other semi-structured interviews, nor has anyone ever compared the results of the same semi-structured interview administered at different times to the same patients. Nevertheless, there remains the importance of obtaining inter-rater reliability studies with independently administered interviews. Indeed, one of the results of the DSM-5 field trials was the poor inter-rater reliability obtained for antisocial personality disorder and, at one site, for borderline (Regier et al., 2013). This poor reliability was due in part to the absence of a semi-structured interview but likely also reflects the fact that different persons interviewed the patients at different times.

The test–retest reliability of scores for a variety of measures has been rated as "adequate" because the necessary research to understand the findings has not yet been conducted. It is possible that some instruments should be rated as inadequate, but additional research would be necessary before this judgment is provided. Consider, for example, a study by Piersma (1987, 1989) that illustrates well a point that may apply to other measures. He reported substantial changes in MCMI assessments across brief inpatient hospitalizations. Test–retest kappa was only .11 for the borderline diagnosis, .09 for compulsive, .01 for passive–aggressive, and .27 for schizotypal. One could conclude that clinical treatment resulted in significant changes to personality functioning, as personality disorders are responsive to treatment (Leichsenring & Leibing, 2003; Perry & Bond, 2000). However, inconsistent with this explanation is the fact that the treatment was quite brief and was focused on mood, anxiety, and other forms of psychopathology. Perhaps most problematic to the hypothesis of a valid change in personality was the additional finding of significant increases in the histrionic and narcissistic personality disorder scales (Piersma, 1989). If the inpatient hospitalization did, in fact, contribute to a remission of borderline and compulsive symptoms, it should perhaps take responsibility as well for contributing to the creation of histrionic and narcissistic personality disorders. Piersma (1989) concluded, instead, that the self-report inventory assessment "is not able to measure long-term personality characteristics ('trait' characteristics) independent of symptomatology ('state' characteristics)" (p. 91).

It is perhaps unfair to single out the MCMI-IV with respect to this problem, as it is possible, if not likely, that


comparable results would occur for the other self-​report inventories and even the semi-​structured interviews. A significant problem for all of the self-​report inventories is the absence of directions, within the instructions to the respondents, to describe one’s characteristic manner of functioning before the occurrence of any current diagnostic disorder (previously identified as Axis I disorders). The instructions for the MCMI-​IV even refer explicitly to describing one’s current problems. As a result, many respondents are probably answering personality disorder items with respect to their current mood, anxiety, or other psychopathology. Semi-​structured interviews have the potential of being relatively less susceptible to confusing a personality disorder with other psychopathology compared to self-​report inventories (Segal & Coolidge, 2007; Widiger & Boyd, 2009), but they are not immune. An interviewer can easily fail to appreciate the extent to which patients’ self-​ descriptions are being distorted by mood, anxiety, distress, or other situational factors. In fact, results equivalent to those reported by Piersma (1987, 1989)  were obtained in a study that was purportedly documenting the resilience of semi-​structured interviews to mood state distortions. Loranger et al. (1991) compared IPDE assessments obtained at the beginning of an inpatient admission to those obtained 1 week to 6  months later and reported “a significant reduction in the mean number of criteria met on all of the personality disorders except schizoid and antisocial” (p. 726). It is unlikely that 1 week to 6 months of treatment that was focused largely on mood, anxiety, and other forms of psychopathology resulted in the extent of changes to personality that were obtained. In fact, comparable to the findings of Piersma (1989), twice as many patients (eight) were diagnosed with a histrionic personality disorder at discharge than were diagnosed with this personality disorder at admission. Further complicating an understanding of instability in personality disorder assessments is the suggestion that the changes on these measures may reflect actual fluctuations in personality. Comparable changes have occurred with respect to the assessment of such traits as neuroticism within treatment-​seeking individuals. To the extent that neuroticism is a disposition to experience and express negative affect, increases (and decreases) in the expression of these moods could be understood as fluctuating expressions of (and changes to) the personality trait of neuroticism. Costa, Bagby, Herbst, and McCrae (2005) argued that “rather than regard these depression-​caused changes in assessed personality trait levels as a distortion, we interpret them as accurate reflections of the current


condition of the individual" (p. 45). One's level of neuroticism will not simply remain flat and stable no matter what is happening within one's life. Fluctuations in levels of agreeableness and extraversion, and other domains of personality, will also occur in response to situational changes. In summary, further research is needed to understand instability in personality trait and personality disorder scores.

Content Validity

A rating of good was provided for all five of the semi-structured interviews with respect to content validity. A strength of all five semi-structured interviews is their explicit effort to obtain a systematic and comprehensive assessment of the DSM-5 Section II personality disorder criterion sets. A rating of excellent was not provided because none of the authors of the measures obtained quantitative ratings for the extent to which the interview questions adequately covered the content. However, the face validity of the questions does appear to be excellent.

Age of Onset

An important limitation of the semi-structured interviews is the extent to which each adheres to the requirement that the personality disorder symptomatology has an age of onset in late adolescence or young adulthood. All of the interviews focus their initial, if not their entire, assessment on the previous 2 to 5 years. The SCID-5-PD (First et al., 2016) requires that each diagnostic criterion be evident over a 5-year period, whereas the DIPD (Zanarini et al., 1987) focuses its assessment on the previous 2 years (Widiger, 2005). The PDI-IV (Widiger et al., 1995), in contrast, encourages the interviewer to document that each diagnostic criterion considered to be present has been evident since young adulthood but does not provide an explicit set of questions to do so. The IPDE (Loranger, 1999) is the most explicit in its requirements, but it is also more liberal, as it requires that only one diagnostic criterion for a respective personality disorder be present since the age of 25 years; all of the others can be evident only within the past few years. The assumption with the DIPD, for example, is that if the behavior has been evident during the previous 2 years, then it is likely to have been present before the onset of other psychopathology and evident since young adulthood. However, this can often be a false and highly problematic assumption. For example, the DIPD was used in the widely published Collaborative Longitudinal


Personality Disorders Study (CLPS; Gunderson et al., 2000). CLPS reported many cases of sudden, dramatic remissions soon after the study began. For example, 23 of 160 persons (14%) diagnosed with borderline personality disorder at the study's baseline assessment met criteria for two or fewer of the nine diagnostic criteria just 6 months later (Gunderson et al., 2003). Gunderson et al. (2003) concluded that only 1 of these 18 persons had been inaccurately diagnosed at baseline; the rest were considered to be valid instances of sudden and dramatic remission. However, it is difficult to imagine so many persons who met the diagnostic criteria for borderline personality disorder since late childhood and who continued to manifest these symptoms throughout their adult life experienced, apparently for the first time, dramatic changes in personality functioning soon after the onset of the study. For example, the diagnoses included one person whose original symptoms were determined to be secondary to the use of a stimulant for weight reduction. For other cases, "the changes involved gaining relief from severely stressful situations they were in at or before the baseline assessment" (p. 115), including the resolution of a traumatic divorce or custody battle. To the extent that these cases of remission represent invalid baseline assessments, the test–retest reliability of the interview assessments should perhaps be rated as less than adequate.

DSM-5 Section II Criterion Sets

Whereas all five of the semi-structured interviews are coordinated explicitly with the respective DSM-5 Section II diagnostic criterion sets, this is not the case for the SWAP-200 or for most of the self-report inventories. These instruments vary considerably in the extent to which they are coordinated with the current DSM-5 Section II. The CATI (Coolidge & Merwin, 1992) and the PAI (Morey & Boggs, 2004) were constructed in reference to the DSM-III-R criterion sets (APA, 1987) and have not since been revised. The Morey et al. (1985) MMPI-2 items were selected on the basis of the DSM-III criterion sets (APA, 1980), and Somwaru and Ben-Porath (1995) developed MMPI-2 personality disorder scales that are coordinated with DSM-IV-TR (Hicklin & Widiger, 2000). It is interesting to note that both Somwaru and Ben-Porath and Morey et al. used quantitative ratings by multiple judges for the selection of items from the same pool (i.e., can be described as "excellent"), yet the two efforts yielded a different item pool selection (Hicklin & Widiger, 2000). Deviating from the DSM, however, might not be a disadvantage in all cases. For example, many of the SWAP-200

items concern symptoms, features, or traits that are outside of the respective DSM-5 criterion sets, which might in fact be considered a strength (Shedler, 2015) if it then provides a more valid personality disorder assessment. Content validity for the PID-5 was rated as good because multiple judges were involved in the assignment of the trait scales to respective personality disorder constructs (Krueger et al., 2012), although the precise nature of this process has not been explicitly described. Content validity for the FFMPD scales was rated as excellent because multiple judges and quantitative ratings were obtained for these trait scales. Scales were selected in part on the basis of surveys of researchers (Lynam & Widiger, 2001) and surveys of clinicians (Samuel & Widiger, 2004).

Construct Validity

The "investigation of a test's construct validity is not essentially different from the general scientific procedures for developing and confirming theories" (Cronbach & Meehl, 1955, p. 300). Construct validity subsumes other forms of validity and concerns the extent to which the scientific findings for the measure produce the expected theoretical findings for the respective construct (Strauss & Smith, 2009). Ratings of construct validity, for this text, are based on the extent to which there is replicated evidence of predictive validity, concurrent validity, and convergent and discriminant validity, as well as a measure's ability to provide incremental validity with respect to other clinical data. The IPDE, SCID-5-PD, and SIDP-IV were provided with excellent ratings for construct validity, in part because these three instruments have been used most extensively in personality disorder research. Much of what is published concerning the etiology, pathology, course, and treatment of personality disorders has been based on studies using one of these three instruments. The DIPD similarly was the instrument used in the heavily published CLPS project (Gunderson et al., 2000). The PDI-IV has been used in a number of studies but not nearly as frequently as the other four semi-structured interviews. Most of the instruments have demonstrated evidence of problematic discriminant validity, but this likely reflects the absence of adequate discriminant validity with regard to the personality disorder constructs themselves (Clark, 2007; Lynam & Widiger, 2001; Miller et al., 2012; Trull & Durrett, 2005). A valid assessment of an individual personality disorder should obtain weak discriminant validity with respect to other near-neighbor personality disorder constructs. For example, to the extent that borderline personality disorder does in fact overlap substantially with


dependent personality disorder (e.g., both involve fears of separation and abandonment), then valid scales that assess borderline personality disorder should correlate with scales that assess dependent personality disorder. In fact, the scales of some personality disorder self-​report inventories (e.g., the MCMI-​IV and MMPI-​2) include substantial item overlap in order to compel the obtainment of a particular degree and direction of co-​occurrence that would be consistent with theoretical expectations. The PID-​ 5, similarly, uses the same maladaptive trait scales for different personality disorders (APA, 2013a). The FFMPD has scales specific to each personality disorder. For example, there are different anxiousness scales for the schizotypal (i.e., Social Anxiousness), borderline (i.e., Anxious Uncertainty), dependent (i.e., Relationship Anxiety), avoidant (i.e., Evaluation Apprehension), and obsessive–​compulsive (i.e., Excessive Worry) personality disorders. Each scale was constructed to assess how anxiousness is expressed differently for each personality disorder (Widiger et  al., 2012). However, no study has yet attempted to demonstrate that these scales do in fact have adequate discriminant validity. SWAP-​ 200 assessments have consistently obtained better discriminant validity compared to personality disorder semi-​structured interviews (Shedler & Westen, 2004), but this could reflect the fact that clinicians administering the SWAP-​200 are artifactually required to provide a distribution of ratings that diminishes substantially the likelihood of obtaining diagnostic co-​ occurrence (Block, 2008; Wood, Garb, Nezworski, & Koren, 2007). For example, Westen and Shedler (1999) required clinicians to identify half of the personality disorder symptoms as being absent and only eight SWAP-​ 200 items could be given the highest rankings, no matter the actual opinions of the clinicians or the symptoms that were in fact present (similar constraints were placed on the other ratings). Only a few studies have examined the convergent validity among the personality disorder semi-​structured interviews (O’Boyle & Self, 1990; Pilkonis et  al., 1995; Skodol et al., 1991). Of these studies, only two involved the administration of interview schedules to the same patients (O’Boyle & Self, 1990; Skodol et  al., 1991), and all three were confined to just two of the five semi-​ structured interviews. The most comprehensive study was conducted by Skodol et  al., summarized previously with respect to inter-​rater reliability. Skodol et al. reported weak convergent validity for the categorical diagnoses (e.g., κ = .14 for schizoid) but good convergent validity for dimensional ratings (e.g., κ = .58 for schizoid).
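For readers less familiar with the statistic, the kappa values cited in this and the preceding sections index agreement corrected for chance; the formula below is simply the standard definition of Cohen's kappa, with illustrative numbers rather than values from any of the studies discussed:

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]

where p_o is the observed proportion of agreement and p_e the proportion expected by chance. Because chance agreement is high for low-base-rate diagnoses, kappa can be modest even when raw agreement looks strong: if two interviewers each diagnose a disorder in 10% of cases, p_e = (.10)(.10) + (.90)(.90) = .82, and an observed agreement of .85 yields kappa = (.85 - .82)/(1 - .82), or approximately .17.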


Saylor and Widiger (2008) converted some of the five DSM-​5 Section II semi-​structured interviews personality disorder assessments into self-​report inventories in order to examine the convergence of the instruments in regard to the content of the interviews (i.e., variation in what questions are asked) rather than with respect to their administration or scoring (e.g., variation in follow-​up questions). The inventories demonstrated substantial convergent validity in regard to total scores but some divergence regarding individual diagnostic criteria. For example, for antisocial impulsivity or failure to plan ahead, the DIPD demonstrated significantly lower convergence with the other four interviews and did not obtain significant convergence with the SNAP and PDQ-​4. This lack of convergence may have been due to content specific to the DIPD (Widiger & Lowe, 2010). The DIPD items contain queries such as “Since the age of 15, have you changed jobs,” “Since the age of 15, have you moved,” and “Since the age of 15, have you gone from close relationship to close relationship” (Zanarini et al., 1996). These queries are related to the presence of changes in job, residence, or relationship, respectively. These questions do not examine if the changes are excessive in frequency or even dysfunctional, or whether there is a failure to plan ahead, as included within the respective DSM-​5 Section II diagnostic criterion. Several studies have been published on the convergence of personality disorder semi-​ structured interviews with self-​report inventories, as well as the convergent validity among self-​ report inventories. Miller et  al. (2012) tabulated the findings from 25 of these studies. In comparison to correlations between self-​report inventories, the correlations between self-​ report and semi-​ structured interviews were substantially lower. Miller et al. reported that the decrease in convergent validity of semi-​structured interviews with the self-​report inventories indicates that the method of assessing personality disorders can have a significant effect on the resultant diagnoses. Additional research is required regarding the relative validity of semi-​structured and self-​ report assessment (Widiger & Boyd, 2009). Rojas and Widiger (2017) examined the PID-​5 coverage of DSM criteria as assessed by the CATI and PDQ-​4+. The authors reported good coverage of the DSM-​IV-​TR diagnostic criteria by the PID-​5 for the antisocial, borderline, avoidant, dependent, and narcissistic personality disorders. However, coverage could be improved for some criteria of obsessive–​compulsive personality disorder.


Validity Generalization

Validity generalization concerns whether the instrument has been shown to be equally valid across different populations. Considered in this chapter is generalization across age, gender, and culture/ethnicity.

Age

The DSM-5 notes that personality disorder traits are often recognizable by adolescence or early adulthood. However, with the exception of conduct disorder as an antecedent of adult antisocial personality disorder, very little is known about the childhood antecedents of the DSM-5 Section II personality disorders (De Fruyt & De Clercq, 2014). The DSM-5 cautions clinicians that personality disorder features in childhood will often resolve as the individual enters adulthood. In addition, the DSM-5 Section II criterion sets were written for adults, and it is not at all clear whether they translate well to children. For example, many children will act in a dependent fashion that will have little to do with a personality disorder, and some adolescents will display borderline personality disorder symptomatology that should perhaps be understood as part of a normative identity crisis rather than a personality disorder. In line with this research, many of the instruments reviewed are indicated for use with individuals aged 18 years or older. However, Westen, Shedler, Durrett, Glass, and Martens (2003) diagnosed adolescents with personality disorders using the SWAP-200, and a version of the PID-5 is available for children aged 11 to 17 years (APA, 2013b). The same point can be made for the assessment of personality disorders among older adults. A significant amount of research has been conducted on personality disorders among older adults (Oltmanns & Balsis, 2011), but this research has also been shown to be rather problematic. For example, estimates of the prevalence of personality disorders among older adults are generally higher than is obtained among middle-aged adults, which is fundamentally inconsistent with the DSM-5 diagnostic system. It is quite possible that maladaptive personality can develop as one ages (Widiger & Seidlitz, 2002), but DSM-5 does not currently recognize the occurrence of an adult onset for a personality disorder; therefore, the prevalence rate should decrease as the population ages (unless those with personality disorders have a much lower rate of mortality). The difficulty perhaps lies, again, with the failure (discussed previously) of the existing instruments to adequately address age of onset and temporal stability

throughout adult life (Oltmanns & Balsis, 2011). In addition, some of the diagnostic criteria may again have a different meaning within an older adult population. For example, the dependent personality disorder diagnostic criteria of being unrealistically preoccupied with fears of being left to care for oneself, or feeling uncomfortable or helpless when alone because of exaggerated fears of being unable to care for oneself, were written to assess dependency in middle-aged persons who are otherwise fully capable of caring for themselves. They would clearly have a much different meaning for a person who is experiencing a decline in physical ability due to aging. Therefore, caution should be noted when attempting to diagnose a personality disorder in an older adult.

Gender

The topic of gender in relation to personality disorders has often been examined. Many of the personality disorders have a differential prevalence rate across the sexes, and some appear to involve maladaptive variants of gender-related personality traits (APA, 2013a). The suggestion that these differential sex prevalence rates reflect gender biases has been among the more difficult and heated diagnostic issues (Widiger, 2007). Concerns regarding gender bias have been examined regarding the conceptualization of personality disorders, diagnostic criteria wording and application, thresholds for diagnosis, clinical presentation, research sampling, self-awareness and openness of patients, and the items included in self-report measures (Morey, Alexander, & Boggs, 2005; Oltmanns & Powers, 2012). However, research has not demonstrated a significant bias within the DSM-5 diagnostic criteria (Boggs et al., 2005; Jane, Oltmanns, South, & Turkheimer, 2007). Research has indicated that the gender differences of the personality disorders appear to be consistent with normative differences in general personality structure between genders (Lynam & Widiger, 2007). Note, however, that research has indicated gender biases in clinical judgments and self-report inventories (Miller et al., 2012; Widiger & Boyd, 2009). When systematic assessments of diagnostic criteria sets are provided, as occurs with the administration of a semi-structured interview, there appears to be a considerable decrease in gender-biased assessments (Miller et al., 2012). In regard to gender biases in self-report inventories, the MMPI-2 and the MCMI-IV personality disorder inventories include gender-related items that are keyed in the direction of adaptive rather than maladaptive functioning.


An item need not assess for dysfunction to contribute to a valid assessment of personality disorders. For example, items assessing for gregariousness can identify histrionic persons, items assessing for confidence can identify narcissistic persons, and items assessing conscientiousness can identify obsessive–compulsive persons (Millon et al., 2015). Items keyed in the direction of adaptive, rather than maladaptive, functioning can be helpful in countering the tendency of some respondents to deny or minimize personality disorder symptomatology. However, these items will not be useful in differentiating abnormal from normal personality functioning, and they are likely to contribute to an overdiagnosis of personality disorders in normal or minimally dysfunctional populations, such as those encountered in student counseling centers, child custody disputes, or personnel selection (Boyle & Le Dean, 2000). When these items are related to the sex or gender of respondents, as many are in the case of the histrionic, dependent, narcissistic, and obsessive–compulsive personality disorder scales of the MCMI-III (Millon, Davis, Millon, & Grossman, 2009) and the MMPI-2 (Colligan et al., 1994), they may contribute to gender-biased assessments (Oltmanns & Powers, 2012). The PDQ-4 was provided a good rating because all of its items are keyed in a maladaptive direction and therefore do not demonstrate the gender bias evident within the MMPI-2 and the MCMI-III (Lindsay, Sankis, & Widiger, 2000).

Culture and Ethnicity

There is considerable literature on the impact of gender on the assessment of personality disorders, but there is limited research on the impact of ethnicity or culture, despite the social and theoretical significance of this area of research (Ryder, Sunohara, & Kirmayer, 2015). Ryder et al. (2015) reported that research examining culture and personality disorder contains a variety of long-standing methodological and conceptual issues. The authors noted that the databases examining culture and personality disorders are sparse and the research often does not focus on true "culture." Studies often indicate "broad differences using 'Western' constructs and rarely test their explanations" (p. 40). Items within self-report inventories are generally written from the perspective of a member of the dominant ethnic/cultural group, and such items may not have the same meaning or implications when provided to members of a minority ethnic group (Okazaki & Sue, 2016). Hindering the effort of psychologists to identify (a) the cultural contexts in which assessment techniques should be interpreted differently or (b) the adjustments in test interpretation that should be made across different ethnic groups is the absence of sufficient research on the mechanisms for cultural or ethnic group differences. Much of the existing research has been confined to the reporting of group differences, without an assessment of the purported mechanism by which the differences could be explained or understood (Okazaki & Sue, 2016).

As an example of this line of research, studies have reported that African Americans obtain significantly higher scores than European Americans on measures of the Cluster A personality disorders. Gibbs et al. (2013) demonstrated that in a nationally representative community sample, African Americans were less likely than European Americans to be diagnosed with avoidant or dependent personality disorder but were more likely to be diagnosed with paranoid or schizoid personality disorder. The authors indicated that the higher rates of paranoid or schizoid personality disorders could reflect an accurate increase in symptom prevalence resulting from an adverse social environment (e.g., discrimination), as well as a possible inaccurate increase in symptom prevalence due to interviewers pathologizing healthy coping strategies (e.g., understandable mistrust, skepticism, and suspicion of outsiders) or failing to appreciate the meaning of (for instance) suspiciousness in persons who have a history of being discriminated against or victimized. In addition, Manseau and Case (2014) demonstrated that both Hispanics and non-Hispanic Blacks were treated less frequently in an outpatient mental health setting for personality disorders. However, this disparity was not necessarily reflective of lower incidence rates for personality disorder diagnoses for Hispanics and non-Hispanic Blacks. Manseau and Case indicated that the treatment rates in this sample could be due to a variety of factors, such as language barriers, immigration status, patient treatment preferences, poverty, insurance status, and hospital location. Wu et al. (2013) examined Asian Americans, Native Hawaiians/Pacific Islanders, and "mixed-race" patients and found that mixed-race patients were more likely to have a personality disorder diagnosis.

Ryder et al. (2015) argued that a dimensional model, similar to DSM-5 Section III, could aid in understanding the links between culture and personality disorder. If the traits demonstrate cross-cultural replicability, a dimensional model would aid in identifying problematic patterns of personality traits. The traits assessed by the PID-5 and FFMPD scales are related conceptually and empirically to the five-factor model of general personality structure (Krueger & Markon, 2014; Widiger et al.,
2012), which has substantial empirical support for its cross-​cultural application (Allik, 2005). In addition, these maladaptive traits should be examined in context, related to local norms, and examined for consequences, which requires acknowledgment of the individual’s experience beyond dimensional or categorical classifications.

Clinical Utility

Clinical utility concerns ease of usage, communication, and treatment formulation (Mullins-Sweatt et al., 2016). Here, emphasis is given to ease of usage and communication; treatment planning is discussed in the next section. Administration of a semi-structured interview will be important in clinical situations in which the credibility or validity of the assessment might be questioned, such as a forensic or a disability evaluation. The administration of a semi-structured interview will document that the assessment was reasonably comprehensive, replicable, and objective (Miller et al., 2012). However, semi-structured interviews require, on average, 2 hours to be administered, which is not realistic (or useful) in clinical practice. Note that this is a reflection of the constructs being assessed, not the instrument providing the assessment. Therefore, the routine administration of a semi-structured interview may be impractical for general clinical practice.

As noted previously, the amount of time required for the administration of a semi-structured interview can also be reduced substantially by first administering and scoring a self-report inventory (Miller et al., 2012; Widiger & Boyd, 2009). The administration of the interview could then be confined to the personality disorder scales that were significantly elevated on the self-report inventory. In fact, the SCID-II (First & Gibbon, 2004) and the IPDE (Loranger, 1999) include screening measures precisely for this purpose. However, if a self-report inventory is to be administered, it is preferable to administer one that was constructed to provide a comprehensive and valid assessment (e.g., PDQ-4), and for which there is empirical support for its validity to assess the personality disorders, rather than simply a measure developed for the purpose of brief screening.

The CATI, MMPI-2, OMNI, PDQ-4, SNAP-2, and WISPI-IV traditional self-report inventories all received a rating of good with respect to clinical utility. Self-report inventories can be very useful in alerting a clinician to maladaptive personality functioning that might otherwise have been missed due to false expectations or assumptions, such as failing to notice antisocial personality traits in female patients (Miller et al., 2012). As self-report inventories, they also require little time on the part of the clinician to administer, although they do vary in the amount of time it can take to score them. The CATI, MMPI-2, PDQ-4, and SNAP-2 must be scored by hand (computer scoring systems for the MMPI-2 do not include the Morey et al. [1985] scales). One can purchase a computer scoring system for the OMNI and WISPI-IV at an added expense. The PDQ-4 is the briefest of these self-report inventories, consisting of only 99 items, and it is perhaps the most frequently used self-report inventory in clinical research because it is much shorter than the alternative measures. In contrast, the MCMI-IV is exceedingly difficult to score by hand; a computer scoring system is available, but it is relatively expensive.

The SWAP-200 requires considerably less time to complete compared to a semi-structured interview, as its items are rated on the basis of whatever information is available to the clinician. No questions are required to be administered to rank the items. It was for this reason that the SWAP-200 was provided a rating of adequate for clinical utility. However, the SWAP-200 includes more than twice as many items (i.e., 200) as the entire set of DSM-5 Section II personality disorder diagnostic criteria (i.e., 96). If clinicians routinely fail to consider systematically the diagnostic criteria currently included within DSM-5 (Garb, 2005), it might not be realistic to expect them to assess systematically or carefully a patient with a set of items that is twice as long.

As suggested previously, the utility of an assessment measure is limited by the utility of the construct being assessed. Verheul (2005) systematically reviewed various components of clinical utility for the personality disorder diagnostic categories and suggested that the heterogeneity of diagnostic membership, the lack of precision in description, the excessive diagnostic co-occurrence, the reliance on the "not otherwise specified" wastebasket diagnosis, and the unstable and arbitrary diagnostic boundaries are sources of considerable frustration for clinicians. Verheul stated, "Overall, the categorical system has the least evidence for clinical utility, especially with respect to coverage, reliability, subtlety, and clinical decision-making" (p. 295). The DSM-5 Section III dimensional trait model assessed by the PID-5 and the FFMPD scales provides more individualized and precise personality profiles and an increased homogeneity of trait constructs that improve considerably the clinical utility of personality disorder assessments (Mullins-Sweatt & Lengel, 2012). Indeed, studies have directly compared the clinical utility, as assessed by clinicians, of the DSM-5 Section II diagnostic categories and the FFM dimensional trait model. Across a series of studies, clinicians have considered the FFM trait model to be preferable to the diagnostic categories for communication with patients and for ease of usage (Mullins-Sweatt & Lengel, 2012). Similar results have been obtained for the DSM-5 Section III dimensional trait model (Morey, Skodol, & Oldham, 2014). Therefore, the FFMPD and PID-5 trait-based self-report inventories received scores of excellent.

Overall Evaluation

The strongest statement that can be made in a review of instruments for the diagnosis of personality disorder is that there are clearly quite a number of alternative measures readily available. Regrettably, no single measure stands out as being clearly preferable to all others. Semi-structured interviews are strongly preferred over self-report inventories in research due to their relatively greater resilience to distortions secondary to comorbid disorders (Widiger & Boyd, 2009). However, there appears to be no clear advantage of one semi-structured interview relative to another. The IPDE has more international application, but its clinical value is limited by the fact that it requires considerably more time to administer. The SCID-II and DIPD are relatively more straightforward to administer in comparison to the SIDP-IV and the PDI-IV, but the latter could be said to be more sophisticated in their assessment. Researchers are recommended to obtain copies of at least three of the existing semi-structured interviews and base their selection, in part, on which instrument appears to be best suited for their particular research needs and interests.

Clinicians are generally recommended to administer a self-report inventory first as a screening measure, identifying which one to four personality disorders should be emphasized during a subsequent follow-up interview and which can be safely ignored (a simple illustration of this triage logic is sketched below). Brief screening measures can be used for this purpose, but there might be little advantage to using a screening instrument in preference to an inventory that was constructed to provide a comprehensive and valid assessment. Most of the self-report inventories listed in Table 21.1 can be used for this purpose. The PAI is limited by the absence of scales for all of the personality disorders; quite a number of problems occur for the MCMI-IV with respect to test–retest reliability, gender bias, problematic cut-off points, and cost; the OMNI is not as widely used as the other traditional self-report measures; the PDQ-4 is perhaps the weakest measure with respect to validity (in large part due to its brevity); and validity generalization is unavailable for the others. The FFMPD and PID-5 (trait-based self-report inventories) can be said to be highly recommended due to their excellent ratings of internal consistency, construct validity, clinical utility, and positive content validity scores. These measures are outlined further in the following section.
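To make the screening recommendation concrete, the following sketch (in Python) selects the most elevated personality disorder scales from a self-report screening profile for follow-up interviewing. The T-score cutoff of 65, the limit of four follow-up modules, and the illustrative profile are assumptions made for the purpose of the example, not values prescribed by any of the inventories reviewed in this chapter.

```python
# Minimal sketch of the screen-then-interview triage described above.
# The cutoff (T >= 65) and the scale names are hypothetical placeholders,
# not thresholds taken from any particular inventory's manual.

def select_interview_modules(scale_t_scores, cutoff=65, max_modules=4):
    """Return the one to four most elevated personality disorder scales
    to be emphasized during the follow-up semi-structured interview."""
    elevated = {pd: t for pd, t in scale_t_scores.items() if t >= cutoff}
    ranked = sorted(elevated, key=elevated.get, reverse=True)
    return ranked[:max_modules]

# Example: a self-report screening profile (hypothetical T scores)
profile = {
    "paranoid": 58, "schizoid": 52, "schizotypal": 61, "antisocial": 66,
    "borderline": 72, "histrionic": 55, "narcissistic": 63,
    "avoidant": 68, "dependent": 59, "obsessive-compulsive": 62,
}

print(select_interview_modules(profile))  # ['borderline', 'avoidant', 'antisocial']
```

In practice, the elevation threshold and the number of modules retained would be set by the chosen inventory's manual and by the time available for the follow-up interview.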

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

There are numerous texts with suggestions for the treatment of personality disorders (e.g., Clarkin, Fonagy, & Gabbard, 2010; First & Tasman, 2006; Oldham, Skodol, & Bender, 2005; Paris, 2015; Perry, 2014; Widiger, 2012). Case conceptualization and treatment planning with these texts are guided by the presence of personality disorders diagnosed with one or more of the instruments discussed previously. These texts are based largely on clinical experiences and theoretical speculations. There are few empirically validated manuals for the treatment of personality disorders. The American Psychiatric Association has published empirically based guidelines for the treatment of individual mental disorders. Guidelines, however, have been published only for borderline personality disorder (APA, 2001), due in large part to the fact that there is currently insufficient research to develop empirically based guidelines for the treatment of dependent, avoidant, obsessive–compulsive, and other personality disorders.

This section considers assessment measures that could be used to augment diagnostic information to yield a psychological case conceptualization that can guide decisions on treatment planning beyond that which is provided simply by a personality disorder diagnosis. Clinicians do not treat all at once an entire DSM-5 personality disorder syndrome, such as borderline personality disorder. Clinicians treat individual components of each syndrome (Paris, 2006), such as the dysregulated anger, fragility, anxious uncertainty, affective dysregulation, oppositionality, and/or manipulativeness of persons diagnosed with borderline personality disorder. Existing measures of the DSM-5 Section II personality disorders do not provide scales for the assessment of these components (with the exception of the PAI for the borderline and antisocial personality disorders). However, scales for their assessment are available in the Dimensional Assessment of Personality Pathology–Basic Questionnaire (DAPP-BQ; Livesley & Jackson, 2009) and the SNAP-2 (Clark et al., 2014), as well as in more recently developed measures of maladaptive personality traits, including the FFMPD scales (Widiger et al., 2012), the Computerized Adaptive Test–Personality Disorder (CAT-PD; Simms et al., 2011), and the PID-5 (Krueger et al., 2012).

The DAPP-BQ (Livesley & Jackson, 2009) includes 18 trait scales (e.g., anxiousness, self-harm, intimacy problems, social avoidance, passive opposition, and interpersonal disesteem) subsumed within four higher order domains of emotional dysregulation, dissocial, inhibitedness, and compulsivity that align well with the neuroticism, antagonism, introversion, and conscientiousness domains of the FFM, respectively (Clark & Livesley, 2002). The SNAP-2 (Clark et al., 2014) includes 12 trait scales (e.g., self-harm, entitlement, eccentric perceptions, workaholism, detachment, and manipulation) that are grouped into the three higher order domains of negative affectivity, positive affectivity, and constraint that align well with the neuroticism, extraversion, and conscientiousness domains of the FFM, respectively (Watson, Clark, & Harkness, 1994). However, factor analyses of the 12 SNAP scales do not appear to yield a three-factor structure. Joint factor analyses of the DAPP-BQ and the SNAP yield the four-factor structure (Clark, Livesley, Schroeder, & Irish, 1996).

The PID-5 provides the official assessment of the dimensional trait model included within Section III of the DSM-5 (APA, 2013a). This dimensional trait model was first developed through nominations of 37 maladaptive traits from DSM-5 work group members regarding the respective personality disorders included within DSM-IV-TR (APA, 2000; Krueger et al., 2012). The number of scales was eventually reduced from 37 to 25 on the basis of factor analyses of the respective scales within each domain (Krueger et al., 2012), including such scales as Anxiousness, Attention-Seeking, Hostility, and Suspiciousness. The 25 PID-5 scales are organized into five domains of negative affectivity, detachment, antagonism, disinhibition, and psychoticism that are explicitly aligned with the FFM (APA, 2013a, p. 773). Table 21.2 provides a summary of the psychometric properties of the PID-5. A variety of reviews are available that have examined the validity of the PID-5 as well as the DSM-5 Section III trait model (Al-Dajani, Gralnick, & Bagby, 2016; Furnham et al., 2014; Hopwood & Sellbom, 2013; Krueger & Markon, 2014; Morey, Benson, Busch, & Skodol, 2015). These reviews have indicated good internal consistency, adequate test–retest reliability, good content validity, and excellent construct validity. Research has indicated good to excellent coverage of the variance within each respective personality disorder (Rojas & Widiger, 2017). Bach, Markon, Simonsen, and Krueger (2015) provided a theoretical rationale for the use of the PID-5 in clinical practice. Bach et al. demonstrated how DSM-5 Section III can aid in case conceptualization as well as treatment planning through six case study examples. Morey, Skodol, and Oldham (2014) directly compared clinicians' impressions of the DSM-5 Section III trait model with the DSM-5 Section II personality disorder syndromes for use in treatment planning. The clinicians consistently preferred the trait model. Al-Dajani et al., however, suggested that future research should provide clinicians with a standardized scoring method, methods of examining profile accuracy, and a recognized normative sample to effectively interpret scores.

TABLE 21.2  Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
CAT-PD | E | A | NA | NR | E | E | A | G |
DAPP-BQ | G | E | NA | G | E | E | E | E |
FFMPD | NR | E | NA | NR | E | E | NR | E | ✓
PID-5 | NR | G | NA | A | G | E | NR | E | ✓
SNAP-2 | E | E | NA | G | G | G | G | E |

Note: CAT-PD = Computerized Adaptive Test of Personality Disorder; DAPP-BQ = Dimensional Assessment of Personality Pathology–Basic Questionnaire; FFMPD = Five Factor Model Personality Disorder scales; PID-5 = Personality Inventory for DSM-5; SNAP-2 = Schedule for Nonadaptive and Adaptive Personality-2; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

The CAT-PD (Simms et al., 2011) contains 33 trait scales organized within five domains of negative emotionality, detachment, antagonism, disconstraint, and psychoticism that were aligned with the five domains proposed for DSM-5 by Widiger and Simonsen (2005) and, as indicated by Wright and Simms (2014), with the FFM. The scales of the CAT-PD are very similar to those of the PID-5. In fact, all but three of the PID-5 scales are included in the CAT-PD. (An illustrative sketch of the adaptive item selection that gives the CAT-PD its name appears after the description of the FFMPD below.) The CAT-PD has more
coverage, in that it includes 33 scales relative to the 25 of the PID-​5, although the CAT-​PD does not appear to have scales comparable to the PID-​5 Attention-​Seeking, Perseveration, or Distractibility scales. The PID-​5, in turn, does not appear to have scales comparable to the CAT-​ PD Cognitive Problems, Domineering, Exhibitionism, Fantasy Proneness, Health Anxiety, Rudeness, Self-​Harm, Norm-​Violation, or Workaholism scales. Psychometric ratings for the CAT-​PD are provided in Table 21.2. Existing research suggests adequate internal consistency, excellent coverage of domains included, adequate convergent and discriminant validity compared to other trait-​based measures, and adequate generalization across different age and gender groups (Crego & Widiger, 2016; Simms et  al., 2011; Williams & Simms, 2016; Wright & Simms, 2014). Temporal stability of the domains has not yet been reported. A rating of good was provided for clinical utility due to the coverage of the CAT-​PD of largely the same traits as covered by the PID-​ 5, albeit no explicit study regarding its clinical utility has yet been performed. The FFMPD consists of eight self-​report measures constructed to assess the DSM-​ 5 Section II personality disorders from the perspective of the FFM, yielding a total of 99 scales organized conceptually and empirically within the five domains of the FFM (Lynam, 2012; Widiger et al., 2012). Researchers and clinicians can use a subset of the scales to assess for a particular personality disorder from the perspective of the FFM (e.g., borderline; Mullins-​Sweatt et al., 2012) or select scales from a domain of the FFM (e.g., agreeableness vs. antagonism) to assess for its maladaptive variants (e.g., Gullibility and Subservience from agreeableness and Callousness and Manipulativeness from antagonism). Each of the FFMPD instruments was constructed by first identifying which facets of the FFM (as provided within the NEO Personality Inventory-​Revised [NEO PI-​ R]; Costa & McCrae, 1992)  are most relevant for each respective personality disorder on the basis of researchers’ descriptions of each respective personality disorder in terms of the FFM (i.e., Lynam & Widiger, 2001), clinicians’ descriptions of each personality disorder (i.e., Samuel & Widiger, 2004), and FFM personality disorder research (e.g., Samuel & Widiger, 2008). Scales were then constructed to assess the maladaptive variants of each facet that were specific to each personality disorder (e.g., Perfectionism, Workaholism, Punctiliousness, and Doggedness as maladaptive variants of conscientiousness for the Five-​ Factor Obsessive–​ Compulsive Inventory; Samuel, Riddell, Lynam, Miller, & Widiger, 2012).
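As noted above, the distinguishing feature of the CAT-PD is its computerized adaptive administration. For readers unfamiliar with adaptive testing, the following sketch illustrates the general logic (re-estimate the trait level after each response, then administer the not-yet-used item that is most informative at the current estimate) using a two-parameter logistic model and a small simulated item bank. The item parameters, bank size, and fixed-length stopping rule are hypothetical and are not drawn from the CAT-PD itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2PL item bank: discrimination (a) and severity/location (b)
a = rng.uniform(0.8, 2.0, size=30)
b = rng.normal(0.0, 1.0, size=30)

def p_endorse(theta, a, b):
    """Two-parameter logistic endorsement probability."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of each item at trait level theta."""
    p = p_endorse(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def eap_estimate(responses, administered, grid, prior):
    """Expected a posteriori trait estimate over a discrete grid."""
    post = prior.copy()
    for item, resp in zip(administered, responses):
        p = p_endorse(grid, a[item], b[item])
        post *= p if resp else (1.0 - p)
    post /= post.sum()
    return float((grid * post).sum())

grid = np.linspace(-4, 4, 161)
prior = np.exp(-0.5 * grid ** 2)   # standard normal prior (unnormalized)
prior /= prior.sum()

true_theta = 1.2                   # simulated respondent's trait level
administered, responses = [], []
theta_hat = 0.0
for _ in range(10):                # administer 10 items adaptively
    info = item_information(theta_hat, a, b)
    info[administered] = -np.inf   # never repeat an item
    item = int(np.argmax(info))    # maximum-information selection
    resp = rng.random() < p_endorse(true_theta, a[item], b[item])
    administered.append(item)
    responses.append(bool(resp))
    theta_hat = eap_estimate(responses, administered, grid, prior)

print(round(theta_hat, 2))
```

The practical appeal of this approach for trait-based personality disorder assessment is that a respondent can typically be measured with acceptable precision using only a fraction of the full item bank.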

Ratings of the psychometric properties of the FFMPD are presented in Table 21.2. The FFMPD scales were initially validated by demonstrating convergence with both their respective parent FFM facet scale and alternative measures of the respective personality disorder (e.g., Mullins-Sweatt et al., 2012). Finally, each of the measures was shown to have incremental validity over alternative measures of these personality disorders (e.g., Samuel et al., 2012). Additional validation studies have since been published (Bagby & Widiger, in press). These studies have indicated excellent internal consistency, content validity, and construct validity as measures of both the FFM and the respective personality disorder. Crego and Widiger (2016) demonstrated convergent and discriminant validity for 36 of the FFMPD scales with the PID-5 and CAT-PD. No studies have yet assessed for test–retest reliability. A number of studies have also indicated that clinicians consider the constructs assessed by these measures to have excellent clinical utility relative to the DSM-5 Section II personality disorder syndromes with respect to treatment planning (Mullins-Sweatt & Lengel, 2012).

A strength of the FFMPD and PID-5 relative to many other DSM-5 Section II self-report inventories is that, through use of the subscales, clinicians or researchers are able to dismantle the heterogeneous syndromes into more distinctive component parts. For example, as noted previously, research has indicated that when treating a personality disorder, clinicians do not address the entire personality structure with each intervention (Paris, 2006). Clinicians focus instead on underlying components, such as the dysregulated anger, fragility, or oppositional behavior of an individual diagnosed with borderline personality disorder. These components are assessed independently and specifically by the scales of the Five-Factor Borderline Inventory (FFBI; Mullins-Sweatt et al., 2012), providing considerably greater utility in clinical practice than that provided by the more global measures of borderline personality disorder (Mullins-Sweatt & Lengel, 2012).

Overall Evaluation

Recommendations for instruments that would augment treatment planning and case conceptualization are hindered by the scarcity of controlled clinical trials of manually guided treatment programs for personality disorders. The general recommendation is for the use of measures of maladaptive personality structure, coordinated with the DSM-5 Section II personality disorder syndromes, which could thereby provide scales for the assessment of the more precise personality disorder traits
that are a focus of treatment. An advantage of the CAT-​ PD, DAPP-​ BQ, FFMPD, PID-​ 5, and SNAP is their assessment of the more precise components. The CAT-​ PD, FFMPD, and PID-​5 also contain conceptual and empirical coordination with all five domains of the FFM, thereby allowing what is known about the course, etiology, and outcomes of FFM traits to be applied to the clinical assessment. The FFMPD and PID-​5 are highly recommended because they also include algorithms to assess a respective personality disorder, with a considerable body of research supporting these algorithms (Krueger & Markon, 2014; Widiger et  al., 2012). These scoring algorithms allow researchers and clinicians to relate their findings for a respective personality disorder to the FFM.
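As an illustration of what such a scoring algorithm can look like, the sketch below computes a simple unit-weighted borderline composite from PID-5 facet means (on the instrument's usual 0-to-3 metric). The facet list follows the DSM-5 Section III Criterion B trait assignment for borderline personality disorder; the published FFMPD and PID-5 algorithms may weight or combine traits differently, so this is offered only as an illustration of the idea rather than as any instrument's official scoring procedure.

```python
# Illustrative only: a unit-weighted composite for a borderline pattern built
# from PID-5 facet mean scores (each on the 0-3 metric). The trait set mirrors
# the DSM-5 Section III Criterion B listing for borderline personality disorder;
# actual published algorithms may differ.

BORDERLINE_FACETS = [
    "emotional_lability", "anxiousness", "separation_insecurity",
    "depressivity", "impulsivity", "risk_taking", "hostility",
]

def borderline_composite(facet_means):
    """Average the relevant facet means into a single component-based index."""
    return sum(facet_means[f] for f in BORDERLINE_FACETS) / len(BORDERLINE_FACETS)

scores = {
    "emotional_lability": 2.4, "anxiousness": 2.1, "separation_insecurity": 1.8,
    "depressivity": 1.5, "impulsivity": 2.0, "risk_taking": 1.1, "hostility": 1.7,
}
print(round(borderline_composite(scores), 2))  # 1.8
```

Component-level scores of this kind are what allow a finding for, say, borderline personality disorder to be related back to the corresponding FFM domains and facets.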

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

This section presents assessment measures and strategies that can be used to (a) track the progress of treatment and (b) evaluate the overall effect of treatment on symptoms, diagnosis, and general functioning. The semi-structured interviews and traditional self-report inventories considered within the first section could, again, provide a natural choice as treatment outcome measures. However, a significant disadvantage of most of the personality disorder semi-structured interviews and traditional self-report inventories is that they were constructed to assess long-term functioning, including functioning before the onset of treatment, and may not accurately reflect current changes in functioning. For this reason, these instruments are not included in Table 21.3.

TABLE 21.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
CAT-PD | E | A | NA | NR | E | E | A | G | G | ✓
DAPP-BQ | G | E | NA | G | E | E | E | E | E | ✓
FFMPD | NR | E | NA | NR | E | E | NR | E | E | ✓
PID-5 | NR | G | NA | A | G | E | NR | E | E | ✓
SNAP-2 | E | E | NA | G | G | G | G | E | E | ✓

Note: CAT-PD = Computerized Adaptive Test of Personality Disorder; DAPP-BQ = Dimensional Assessment of Personality Pathology–Basic Questionnaire; FFMPD = Five Factor Model Personality Disorder scales; PID-5 = Personality Inventory for DSM-5; SNAP-2 = Schedule for Nonadaptive and Adaptive Personality-2; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

The semi-structured interviews and traditional self-report measures could, hypothetically, be modified to assess only current or recent functioning, specifying, for instance, that the persons should describe their characteristic manner of thinking, feeling, and relating to others only with respect to the previous week or month. However, a limitation of this proposal is that it is not really clear what period of time should be specified to accurately document that a maladaptive personality trait is now within remission or no longer present. Personality traits vary in the frequency with which they are evident within any particular period of time. Borderline self-destructiveness must be evident for at least 5 years to indicate its presence on the SCID-5-PD, but it is unclear how long it should not be present to indicate its absence. If persons must display self-destructiveness over a 5-year period to indicate the presence of borderline suicidality, perhaps they should also evidence the absence of self-destructiveness over a 5-year period to indicate the successful treatment of this borderline suicidality.

An additional limitation of most of the existing personality disorder semi-structured interviews and traditional self-report inventories for the purpose of treatment monitoring and outcome assessment is that they are not well differentiated with respect to the facets or components of each personality disorder. What is evident from the limited amount of research on the treatment of personality disorders is that this treatment rarely involves a comprehensive or complete cure of the personality disorder (Leichsenring & Leibing, 2003; Perry & Bond, 2000). What appears to occur is the resolution of some traits but the maintenance or continuation of other traits. This suggests better utility for measures that assess for the underlying components of each personality syndrome. Research on the effectiveness of treatments often focuses on measurable, behaviorally based outcomes (self-harm, suicidal thoughts, etc.) rather than other aspects of personality disorder symptomatology (O'Connell & Dowling, 2014). Effective change occurs with respect to the components rather than the entire global construct. For example, one of the empirically supported treatments for borderline
personality disorder (APA, 2001) is dialectical behavior therapy (DBT; Linehan, 2015). Research has demonstrated that DBT is an effective treatment for many of the components of this personality disorder. DBT has been particularly effective with respect to decreasing parasuicidal behavior, anger and hostility, hopelessness, anxiety, and distress symptoms (Linehan, 2015).

In Section III of the DSM-5, there are Criterion A and Criterion B for the personality disorders. For the self and interpersonal relatedness impairments of Criterion A, the Level of Personality Functioning Scale (LPFS; APA, 2013a, p. 775; Bender, Morey, & Skodol, 2011) has been used in a few studies (e.g., Few et al., 2013; Keeley, Flanagan, & McCluskey, 2014; Morey, Bender, & Skodol, 2013; Zimmerman et al., 2015). The LPFS provides a broad assessment of disturbances in self and interpersonal functioning through a clinician rating form. However, note that the LPFS does not provide an assessment of the Criterion A disturbances that are specific to each personality disorder. For example, moderate impairment in identity is suggested within the LPFS by "depends excessively on others for identity definition, with compromised boundary delineation," "vulnerable self-esteem controlled by exaggerated concern about external evaluation, with a wish for approval," and "threats to self-esteem may engender strong emotions such as rage or shame" (APA, 2013a, p. 776). The LPFS severe impairments are suggestive of the impairments described in Section III for the borderline, narcissistic, and schizotypal personality disorders, but they are not particularly indicative of impairments for the obsessive–compulsive personality disorder. There are also two self-report inventories that have been used as proxy measures for Criterion A: the General Assessment of Personality Disorders (GAPD; Berghuis, Kamphuis, Verheul, Larstone, & Livesley, 2013) and the Severity Indices for Personality Problems (SIPP-118; Verheul et al., 2008). The GAPD includes 19 scales, 15 of which concern self-pathology and 4 of which concern interpersonal deficits. The SIPP-118 has 16 scales, organized in the initial validation study (Verheul et al., 2008) into five domains of self-control, identity integration, relational capacities, responsibility, and social concordance. The GAPD and SIPP-118 have both been used to assess for Criterion A (e.g., Bastiaansen, De Fruyt, Rossi, Schotte, & Hofmans, 2013; Berghuis, Kamphuis, & Verheul, 2014; Berghuis et al., 2013). However, neither of these measures provides an explicit assessment of the self and interpersonal dysfunctions specified for the six personality disorders in Section III.

Regarding the maladaptive traits of Criterion B of Section III, as previously noted, the CAT-PD, FFMPD, and PID-5 provide precise assessment of a variety of the components that constitute the personality disorders (discussed previously). A strength of the CAT-PD, FFMPD, and PID-5 measures is their coordination with the five domains of the FFM, thereby allowing what is known about the course, etiology, and outcomes of FFM traits to be applied to the clinical assessment. The DAPP-BQ (Livesley & Jackson, 2009) and the SNAP-2 (Clark et al., 2014) are also clinically useful measures of specific components of personality disorder. As previously indicated, the DAPP-BQ and the SNAP-2 were constructed in a similar manner to provide assessments of the fundamental dimensions of maladaptive personality functioning that cut across and define the existing diagnostic categories.

Overall Evaluation

The primary goal for the treatment of a personality disorder would naturally be the remission of the personality disorder. As such, the appropriate treatment outcome measure might then be a diagnostic measure. However, many existing instruments are limited in this regard because they have not yet been modified to assess change in long-standing personality traits. An additional limitation is that treatment of personality disorders does not appear to address the global personality structure, focusing instead on more specific personality traits and components of the personality disorders. In this regard, the CAT-PD, DAPP-BQ, FFMPD, PID-5, and the SNAP-2 are likely to be better suited as treatment outcome measures and are highly recommended. Clinicians should consider these instruments and select the one that appears best suited to their particular clinical population.
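Although treatment sensitivity is rated above, the chapter does not prescribe a particular change statistic for individual patients. One commonly used option is the Jacobson–Truax reliable change index, which scales an individual's pre–post difference on a component trait scale by the standard error of the difference implied by the scale's retest reliability. The normative standard deviation, reliability, and scores in the sketch below are hypothetical values chosen only to illustrate the computation.

```python
from math import sqrt

def reliable_change_index(pre, post, sd_norm, retest_reliability):
    """Jacobson-Truax reliable change index: the observed change divided by
    the standard error of the difference implied by the scale's reliability."""
    se_measure = sd_norm * sqrt(1.0 - retest_reliability)
    se_diff = sqrt(2.0) * se_measure
    return (post - pre) / se_diff

# Hypothetical example: a component trait scale with a normative SD of 10 and
# retest reliability of .80, administered before and after treatment.
rci = reliable_change_index(pre=68.0, post=58.0, sd_norm=10.0, retest_reliability=0.80)
print(round(rci, 2))  # -1.58; does not exceed the conventional |1.96| criterion
```

An absolute value exceeding 1.96 is conventionally taken to indicate change beyond what measurement error alone would be expected to produce, which is one way to operationalize treatment sensitivity at the level of the individual patient.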

CONCLUSIONS AND FUTURE DIRECTIONS

A considerable amount of attention and research has been devoted to the assessment and diagnosis of the DSM-5 Section II personality disorders. Although this chapter identifies 18 distinct instruments that provide assessments of these personality disorders, a variety of additional assessments exist. The variety of measures available is a testament both to the complexity of, and interest in, personality disorder assessment and, regrettably, to the potential limitations of each of the existing instruments. If one instrument were clearly preferable to another, there would be no need for, or interest in, so many alternative measures. It is perhaps time to devote research attention to more direct comparisons of the reliability and validity of
the alternative measures in order to begin to separate the wheat from the chaff. However, in the absence of a gold standard for what constitutes an unambiguously valid criterion, comparative research can be difficult to conduct.

It will also be important for future studies to devote more attention to the construction of measures that augment diagnostic information and guide decisions on treatment planning and treatment outcome assessment beyond that which is provided simply by a personality disorder diagnosis. Progress in such research is hindered by the virtual absence of studies devoted to the development and validation of empirically supported treatments for specific personality disorders. Research on the treatment of personality disorders has focused largely on borderline personality disorder, whereas no randomized controlled or open trial studies have been conducted on the paranoid, schizoid, schizotypal, dependent, narcissistic, or histrionic personality disorders (Leahy & McGinn, 2012). Hand in hand with the development of such treatment research, it will be necessary to (a) tackle the thorny issue of what constitutes successful treatment of personality disorders and (b) develop and implement, based on this formulation, measures that are designed to be sensitive to treatment effects in a clinical setting.

The successful treatment of a personality disorder will not be the construction of an ideal personality structure. One is unlikely to change a "Theodore Bundy" into a "Mother Teresa." On the other hand, given the substantial public health care costs that can be associated with some of the more dysfunctional personality disorders (e.g., the costs to victims and to law enforcement agencies of persons with an antisocial personality disorder, and the costs of the many brief hospitalizations of persons with borderline personality disorder), even moderate improvements in personality functioning can have substantial personal, social, and public health care benefits (Linehan, 2015). Measures more specifically suited to capturing these important benefits of personality disorder treatment need further implementation in clinical settings.

References

Al-Dajani, N., Gralnick, T. M., & Bagby, R. M. (2016). A psychometric review of the Personality Inventory for DSM-5 (PID-5): Current status and future directions. Journal of Personality Assessment, 98(1), 62–81.
Allik, J. (2005). Personality dimensions across cultures. Journal of Personality Disorders, 19, 212–232.
American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.

American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev. ed.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2001). Practice guidelines for the treatment of patients with borderline personality disorder. Washington, DC: Author. American Psychiatric Association. (2013a). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. American Psychiatric Association. (2013b, June 15). Online assessment measures: The Personality Inventory for DSM-​ 5 (PID-​5)–​Child Age 11–​17. Retrieved from https://​ www.psychiatry.org/​psychiatrists/​practice/​dsm/​dsm-​5/​ online-​assessment-​measures Bach, B., Markon, K., Simonsen, E., & Krueger, R. F. (2015). Clinical utility of the DSM-​5 alternative model of personality disorders:  Six cases from practice. Journal of Psychiatric Practice, 21, 3–​25. Bagby, R. M., & Farvolden, P. (2004). The Personality Diagnostic Questionnaire-​ 4 (PDQ-​ 4). In M. J. Hilsenroth, D. L. Segal, & M. Hersen (Eds.), Comprehensive handbook of psychological assessment:  Vol. 2.  Personality assessment (pp. 122–​ 133). New York, NY: Wiley. Bagby, R. M., & Widiger, T. A. (in press). Five factor model personality disorder scales. Psychological Assessment. Bastiaansen, L., De Fruyt, F., Rossi, G., Schotte, C., & Hofmans, J. (2013). Personality disorder dysfunction versus traits:  Structural and conceptual issues. Personality Disorders: Theory, Research, and Treatment, 4, 293–​303. Bender, D. S., Morey, L. C., & Skodol, A. E. (2011). Toward a model for assessing level of personality functioning in DSM-​5, Part I: A review of theory and methods. Journal of Personality Assessment, 93, 332–​346. Berghuis, H., Kamphuis, J. H., & Verheul, R. (2014). Specific personality traits and general personality dysfunction as predictors of the presence and severity of personality disorders in a clinical sample. Journal of Personality Assessment, 96, 410–​416. Berghuis, H., Kamphuis, J. H., Verheul, R., Larstone, R., & Livesley, J. (2013). The General Assessment of Personality Disorder (GAPD) as an instrument for assessing the core features of personality disorders. Clinical Psychology & Psychotherapy, 20, 544–​557. Block, J. (2008). Subjective impressions in clinical psychology. In The Q-​ sort in character appraisal:  Encoding subjective impressions of persons quantitatively (pp.

93–​ 104). Washington, DC:  American Psychological Association. Boggs, C. D., Morey, L. C., Skodol, A. E., Shea, M. T., Sanislow, C. A., Grilo, C. M.,  .  .  .  Gunderson, J. G. (2005). Differential impairment as an indicator of sex bias in DSM-​IV criteria for four personality disorders. Psychological Assessment, 17, 492–​496. Boyle, G. J., & Le Dean, L. (2000). Discriminant validity of the Illness Behavior Questionnaire and Millon Clinical Multiaxial Inventory-​III in a heterogeneous sample of psychiatric outpatients. Journal of Clinical Psychology, 56, 779–​791. Clark, L. A. (2007). Assessment and diagnosis of personality disorder: Perennial issues and an emerging reconceptualization. Annual Review of Psychology, 58, 227–​257. Clark, L. A., & Harrison, J. A. (2001). Assessment instruments. In W. J. Livesley (Ed.), Handbook of personality disorders: Theory, research, and treatment (pp. 277–​306). New York, NY: Guilford. Clark, L. A., & Livesley, W. J. (2002). Two approaches to identifying the dimensions of personality disorder:  Convergence on the five-​factor model. In P. T. Costa & T. A. Widiger (Eds.), Personality disorders and the five-​factor model of personality (2nd ed., pp. 161–​176). Washington, DC:  American Psychological Association. Clark, L. A., Livesley, W. J., Schroeder, M. L., & Irish, S. L. (1996). Convergence of two systems for assessing personality disorder. Psychological Assessment, 8, 294–​303. Clark, L. A., Simms, L. J., Wu, K. D., & Casillas, A. (2014). Schedule for Nonadaptive and Adaptive Personality–​2nd Edition (SNAP-​ 2):  Manual for administration, scoring, and interpretation. Notre Dame, IN: University of Notre Dame. Clarkin, J. F., Fonagy, P., & Gabbard, G. O. (2010). Psychodynamic psychotherapy for personality disorders:  A clinical handbook. Arlington, VA:  American Psychiatric Publishing. Colligan, R. C., Morey, L. C., & Offord, K. P. (1994). MMPI/​ MMPI-​ 2 personality disorder scales:  Contemporary norms for adults and adolescents. Journal of Clinical Psychology, 50, 168–​200. Coolidge, F. L. (1992). The Coolidge Axis-​ II Inventory:  Manual. Colorado Springs, CO:  University of Colorado–​Colorado Springs. Coolidge, F. L., & Merwin, M. M. (1992). Reliability and validity of the Coolidge Axis II Inventory: A new inventory for the assessment of personality disorders. Journal of Personality Assessment, 59, 223–​238. Costa, P. J., Bagby, R. M., Herbst, J. H., & McCrae, R. R. (2005). Personality self-​reports are concurrently reliable and valid during acute depressive episodes. Journal of Affective Disorders, 89, 45–​55.

Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO PI-​R) and NEO Five-​Factor Inventory (NEO-​ FFI):  Professional manual. Odessa, FL: Psychological Assessment Resources. Crego, C., & Widiger, T. A. (2016). Convergent and discriminant validity of alternative measures of maladaptive personality traits. Psychological Assessment, 28, 1561–​1575. Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–​302. De Fruyt, F., & De Clercq, B. (2014). Antecedents of personality disorder in childhood and adolescence: Toward an integrative developmental model. Annual Review of Clinical Psychology, 10, 449–​476. Derksen, J. J. (2006). The contribution of the MMPI-​ 2 to the diagnosis of personality disorder. In J. N. Butcher (Ed.), MMPI-​2:  A practitioner’s guide (pp. 99–​120). Washington, DC:  American Psychological Association. Few, L. R., Miller, J. D., Rothbaum, A. O., Meller, S., Maples, S., Terry, J.,  .  .  .  Mackillop, J. (2013). Examination of the Section III DSM-​5 diagnostic system to personality disorders in an outpatient clinical sample. Journal of Abnormal Psychology, 122, 1057–​1069. First, M. B., & Gibbon, M. (2004). The Structured Clinical Interview for DSM-​IV Axis I  Disorders (SCID-​I) and the Structured Clinical Interview for DSM-​IV Axis II Disorders (SCID-​II). In M. J. Hilsenroth, D. L. Segal, & M. Hersen (Eds.), Comprehensive handbook of psychological assessment: Vol. 2. Personality assessment (pp. 134–​143). New York, NY: Wiley. First, M. B., & Tasman, A. (2006). Clinical guide to the diagnosis and treatment of mental disorders. Hoboken, NJ: Wiley. First, M. B., Williams, J. B., Benjamin, L. S., & Spitzer, R. L. (2016). Structured Clinical Interview for DSM-​ 5 Personality Disorders (SCID-​ 5-​ PD). Washington, DC: American Psychiatric Press. Furnham, A., Milner, R., Akhtar, R., & De Fruyt, F. (2014). A review of the measures designed to assess DSM-​5 personality disorders. Psychology, 5(14), 1646–​1686. Garb, H. (2005). Clinical judgment and decision making. Annual Review of Clinical Psychology, 1, 67–​89. Gibbs, T. A., Okuda, M., Oquendo, M. A., Lawson, W. B., Wang, S., Thomas, Y. F., & Blanco, C. (2013). Mental health of African Americans and Caribbean Blacks in the United States: Results from the national epidemiological survey on alcohol and related conditions. American Journal of Public Health, 103, 330–​338. Gunderson, J. G., Bender, D., Sanislow, C., Yen, S., Rettew, J. B., Dolan-​Sewell, R.,  .  .  .  Skodol, A. E. (2003). Plausibility and possible determinants of sudden “remissions” in borderline patients. Psychiatry, 66, 111–​119.

Gunderson, J. G., Shea, T., Skodol, A. E., McGlashan, T. H., Morey, L. C., Stout, R. L., . . . Keller, M. B. (2000). The Collaborative Longitudinal Personality Disorders Study: Development, aims, design, and sample characteristics. Journal of Personality Disorders, 14, 300–​315. Hicklin, J., & Widiger, T. A. (2000). Convergent validity of alternative MMPI-​2 personality disorder measures. Journal of Personality Assessment, 75, 502–​518. Hopwood, C. J., & Sellbom, M. (2013). Implications of DSM-​ 5 personality traits for forensic psychology. Psychological Injury and Law, 6, 314–​323. Jane, J. S., Oltmanns, T. F., South, S. C., & Turkheimer, E. (2007). Gender bias in diagnostic criteria for personality disorders:  An item response theory analysis. Journal of Abnormal Psychology, 116, 166–​175. Kaye, A. L., & Shea, M. T. (2000). Personality disorders, personality traits, and defense mechanisms measures. In H. A. Pincus, A. J. Rush, M. B. First, & L. E. McQueen (Eds.), Handbook of psychiatric measures (pp. 713–​750). Washington, DC: American Psychiatric Association. Keeley, J., Flanagan, E. H., & McCluskey, D. L. (2014). Functional impairment and the DSM-​5 dimensional system for personality disorder. Journal of Personality Disorder, 28, 657–​674. Klein, M. H., Benjamin, L. S., Rosenfeld, R., Treece, C., Husted, J., & Greist, J. H. (1993). The Wisconsin Personality Disorders Inventory:  I. Development, reliability, and validity. Journal of Personality Disorders, 7, 285–​303. Krueger, R. F., Derringer, J., Markon, K. F., Watson, D., & Skodol, A. E. (2012). Initial construction of a maladaptive personality trait model and inventory for DSM-​5. Psychological Medicine, 42, 1879–​1890. Krueger, R. F., & Markon, K. E. (2014). The role of the DSM-​ 5 personality trait model in moving toward a quantitative and empirically based approach to classifying personality and psychopathology. Annual Review of Clinical Psychology, 10, 477–​501. Kupfer, D. J., First, M. B., & Regier, D. A. (2002). A research agenda for DSM-​V. Arlington, VA:  American Psychiatric Association. Kurtz, J., McCrae, R. R., Terracciano, A., & Yamagata, S. (2011). Internal consistency, retest reliability, and their implications for personality scale validity. Personality and Social Psychology Review, 15, 28–​50. Leahy, R. L., & McGinn, L. K. (2012). Cognitive therapy for personality disorders. In T. A. Widiger & T. A. Widiger (Eds.), The Oxford handbook of personality disorders (pp. 727–​750). New  York, NY:  Oxford University Press. Leichsenring, F., & Leibing, E. (2003). The effectiveness of psychodynamic therapy and cognitive behavior therapy in the treatment of personality disorders: A meta-​analysis. American Journal of Psychiatry, 160, 1223–​1232.

Lindsay, K. A., Sankis, L. M., & Widiger, T. A. (2000). Gender bias in self-​report personality disorder inventories. Journal of Personality Disorders, 14, 218–​232. Linehan, M. M. (2015). DBT® Skills Training Manual, 2nd ed. New York, NY: Guilford Press. Links, P. S., & Eynan, R. (2013). The relationship between personality disorders and Axis I  psychopathology:  Deconstructing comorbidity. Annual Review of Clinical Psychology, 9, 529–​554. Livesley, W. J., & Jackson, D. (2009). Manual for the Dimensional Assessment of Personality Pathology–​Basic Questionnaire. Port Huron, MI: Sigma Press. Loranger, A. W. (1999). International Personality Disorder Examination (IPDE). Odessa, FL:  Psychological Assessment Resources. Loranger, A. W. (2001). OMNI Personality Inventories: Professional manual. Odessa, FL:  Psychological Assessment Resources. Loranger, A. W., Lenzenweger, M. F., Gartner, A. F., Susman, V. L., Herzig, J., Zammit, G. K., . . . Young, R. C. (1991). Trait–​state artifacts and the diagnosis of personality disorders. Archives of General Psychiatry, 48, 720–​728. Lynam, D. R. (2012). Assessment of maladaptive variants of five-​factor model traits. Journal of Personality, 80, 1593–​1613. Lynam, D. R., & Widiger, T. A. (2001). Using the five factor model to represent the DSM-​ IV personality disorders:  An expert consensus approach. Journal of Abnormal Psychology, 110, 401–​412. Lynam, D. R., & Widiger, T. A. (2007). Using a general model of personality to understand sex differences in the personality disorders. Journal of Personality Disorders, 21, 583–​602. Manseau, M., & Case, B. G. (2014). Racial–​ethnic disparities in outpatient mental health visits to U.S. physicians, 1993–​2008. Psychiatric Services, 65, 59–​67. McDermut, W., & Zimmerman, M. (2005). Assessment instruments and standardized evaluation. In J. Oldham, A. Skodol, & D. Bender (Eds.), Textbook of personality disorders (pp. 89–​101). Washington, DC: American Psychiatric Press. McDermut, W., & Zimmerman, M. (2008). Personality disorders, personality traits, and defense mechanisms measures. In A. J. Rush, M. B. First, & D. Blacker (Eds.), Handbook of psychiatric measures (2nd ed., pp. 687–​ 729). Arlington, VA: American Psychiatric Publishing. Miller, J. D., Few, L. R., & Widiger, T. A. (2012). Assessment of personality disorders and related traits:  Bridging DSM-​IV-​TR and DSM-​5. In T. A. Widiger (Ed.), The Oxford handbook of personality disorders (pp. 108–​140). New York: Oxford University Press. Miller, J. D., Pilkonis, P. A., & Mulvey, E. P. (2006). Treatment utilization and satisfaction:  Examining

the contributions of Axis II psychopathology and the five-​factor model of personality. Journal of Personality Disorders, 20, 369–​387. Millon, T. (2011). Disorders of personality. Introducing a DSM/​ICD spectrum from normal to abnormal (3rd ed.). Hoboken, NJ: Wiley. Millon, T., Davis, R., Millon, C., & Grossman, S. (2009). MCMI-​III manual (3rd ed.). Minneapolis, MN: National Computer Systems. Millon, T., Grossman, S., & Millon, C. (2015). MCMI-​ IV manual. Minneapolis, MN:  National Computer Systems. Morey, L. C. (1991). The Personality Assessment Inventory professional manual. Odessa, FL:  Psychological Assessment Resources. Morey, L. C., Alexander, G. M., & Boggs, C. (2005). Gender and personality disorder. In J. Oldham, A. Skodol, & D. Bender (Eds.), Textbook of personality disorders (pp. 541–​554). Washington, DC:  American Psychiatric Press. Morey, L. C., Bender, D. S., & Skodol, A. E. (2013). Validating the proposed Diagnostic and Statistical Manual of Mental Disorders, 5th edition, severity indicator for personality disorder. Journal of Nervous and Mental Disease, 201, 729–​735. Morey, L. C., Benson, K. T., Busch, A. J., & Skodol, A. E. (2015). Personality disorders in DSM-​ 5:  Emerging research on the alternative model. Current Psychiatry Reports, 17, 558. Morey, L. C., & Boggs, C. (2004). The Personality Assessment Inventory (PAI). In M. J. Hilsenroth, D. L. Segal, & M. Hersen (Eds.), Comprehensive handbook of psychological assessment:  Vol. 2.  Personality assessment (pp. 15–​ 29). Hoboken, NJ: Wiley. Morey, L. C., Skodol, A. E., & Oldham, J. M. (2014). Clinician judgments of clinical utility:  A comparison of DSM-​IV-​TR personality disorders and the alternative model for DSM-​5 personality disorders. Journal of Abnormal Psychology, 123, 398–​405. Morey, L. C., Waugh, M. H., & Blashfield, R. K. (1985). MMPI scales for DSM-​III personality disorders:  Their derivation and correlates. Journal of Personality Assessment, 49, 245–​251. Mullins-​Sweatt, S. N., Edmundson, M., Sauer-​Zavala, S., Lynam, D. R., Miller, J. D., & Widiger, T. A. (2012). Five-​ factor measure of borderline personality traits. Journal of Personality Assessment, 94, 475–​487. Mullins-​Sweatt, S. N., & Lengel, G. J. (2012). Clinical utility of the five-​factor model of personality disorder. Journal of Personality, 80, 1615–​1639. Mullins-​Sweatt, S. N., Lengel, G. J., & DeShong, H. L. (2016). The importance of considering clinical utility in the construction of a diagnostic manual. Annual Review of Clinical Psychology, 12, 133–​155.

O’Boyle, M., & Self, D. (1990). A comparison of two interviews for DSM-​III-​R personality disorders. Psychiatry Research, 32, 85–​92. O’Connell, B., & Dowling, M. (2014). Dialectical behaviour therapy (DBT) in the treatment of borderline personality disorder. Journal of Psychiatric & Mental Health Nursing, 21, 518–​525. Okazaki, S., & Sue, S. (2016). Methodological issues in assessment research with ethnic minorities. In A. E. Kazdin (Ed.), Methodological issues and strategies in clinical research (4th ed., pp. 235–​247). Washington, DC: American Psychological Association. Oldham, J. M., Skodol, A. E., & Bender, D. S. (Eds.). (2005). Textbook of personality disorders. Washington, DC: American Psychiatric Publishing. Oltmanns, T. F., & Balsis, S. (2011). Personality disorders in later life:  Questions about the measurement, course, and impact of disorders. Annual Review of Clinical Psychology, 7, 321–​349. Oltmanns, T. F., & Powers, A. D. (2012). Gender and personality disorders. In T. A. Widiger (Ed.), The Oxford handbook of personality disorders (pp. 206–​218). New York, NY: Oxford University Press. Paris, J. (2006, May). Personality disorders: Psychiatry’s stepchildren come of age. Invited lecture presented at the 159th Annual Meeting of the American Psychiatric Association, Toronto, Ontario, Canada. Paris, J. (2012). Pathology of personality disorder:  An integrative conceptualization. In T. A. Widiger (Ed.), The Oxford handbook of personality disorders (pp. 399–​ 406). New  York, NY:  Oxford University Press. Paris, J. (2015). Psychotherapies. In J. Paris (Ed.), A concise guide to personality disorders (pp. 119–​135). Washington, DC: American Psychological Association. Perry, J. C. (2014). Cluster C personality disorders: Avoidant, dependent, and obsessive–​ compulsive. In G. O. Gabbard (Ed.), Gabbard’s treatments of psychiatric disorders (5th ed., pp. 1087–​1116). Arlington, VA: American Psychiatric Publishing. Perry, J. C., & Bond, M. (2000). Empirical studies of psychotherapy for personality disorders. In J. G. Gunderson & G. O. Gabbard (Eds.), Psychotherapy for personality disorders (pp. 1–​31). Washington, DC:  American Psychiatric Press. Pfohl, B., Blum, N., & Zimmerman, M. (1997). Structured interview for DSM-​ IV personality. Washington, DC: American Psychiatric Press. Piersma, H. L. (1987). The MCMI as a measure of DSM-​III Axis II diagnoses: An empirical comparison. Journal of Clinical Psychology, 43, 478–​483. Piersma, H. L. (1989). The MCMI-​II as a treatment outcome measure for psychiatric inpatients. Journal of Clinical Psychology, 45, 87–​93.

484

Schizophrenia and Personality Disorders

Pilkonis, P. A., Heape, C. L., Proietti, J. M., Clark, S. W., McDavid, J. D., & Pitts, T. E. (1995). The reliability and validity of two structured diagnostic interviews for personality disorders. Archives of General Psychiatry, 52, 1025–​1033. Regier, D. A., Narrow, W. E., Clarke, D. E., Kraemer, H. C., Kuramoto, S. J., Kuhl, E. A., & Kupfer, D. J. (2013). DSM-​5 field trials in the United States and Canada: Part II. Test–​retest reliability of selected categorical diagnoses. American Journal of Psychiatry, 170, 59–​70. Rogers, R. (2001). Diagnostic and structured interviewing: A handbook for psychologists. Odessa, FL:  Psychological Assessment Resources. Rojas, S. L. & Widiger, T. A. (2017). Coverage of the DSM-​ IV-​TR/​DSM-​5 Section II personality disorders with the DSM-​5 dimensional trait model. Journal of Personality Disorders, 31, 462–​482. Ryder, A. G., Sunohara, M., & Kirmayer, L. J. (2015). Culture and personality disorder: From a fragmented literature to a contextually grounded alternative. Current Opinion in Psychiatry, 28, 40–​45. Samuel, D. B., & Bucher, M. A. (2017). Assessing the assessors:  The feasibility and validity of clinicians as an assessment source for personality disorder research. Personality Disorders: Theory, Research, and Treatment, 8, 104–​112. Samuel, D. B., Riddell, A. B., Lynam, D. R., Miller, J. D., & Widiger, T. A. (2012). A five factor measure of obsessive–​ compulsive personality traits. Journal of Personality Assessment, 94, 456–​465. Samuel, D. B., & Widiger, T. A. (2004). Clinicians’ personality descriptions of prototypic personality disorders. Journal of Personality Disorders, 18, 286–​308. Samuel, D. B., & Widiger, T. A. (2008). A meta-​analytic review of the relationships between the five-​ factor model and DSM-​IV-​TR personality disorders:  A facet level analysis. Clinical Psychology Review, 28, 1326–​1342. Saylor, K. I., & Widiger, T. A. (2008). A self-​report comparison of five semi-​ structured interviews. In I. V. Halvorsen & S. N. Olsen (Eds.), New research on personality disorders (pp. 103–​ 119). Hauppauge, NY: Nova Science. Segal, D. L., & Coolidge, F. L. (2007). Structured and semistructured interviews for differential diagnosis: Issues and applications. In M. Hersen, S. M. Turner, & D. C. Beidel (Eds.), Adult psychopathology and diagnosis (5th ed., pp. 78–​100). Hoboken, NJ: Wiley. Shedler, J. (2015). Integrating clinical and empirical perspectives on personality:  The Shedler–​Westen Assessment Procedure (SWAP). In S. K. Huprich (Ed.), Personality disorders:  Toward theoretical and empirical integration in diagnosis and assessment (pp. 225–​252). Washington, DC: American Psychological Association.

Shedler, J., & Westen, D. (2004). Refining personality disorder diagnosis:  Integrating science and practice. American Journal of Psychiatry, 161, 1350–​1365. Simms, L. J., Goldberg, L. R., Roberts, J. E., Watson, D., Welte, J., & Rotterman, J. H. (2011). Computerized adaptive assessment of personality disorder: Introducing the CAT-​PD project. Journal of Personality Assessment, 93, 380–​389. Skodol, A. E. (2008). Longitudinal course and outcome of personality disorders. Psychiatric Clinics of North America, 31, 495–​503. Skodol, A. E. (2012). Diagnosis and DSM-​5: Work in progress. In T. A. Widiger (Ed.), The Oxford handbook of personality disorders (pp. 35–​57). New York, NY: Oxford University Press. Skodol, A. E. (2014). Manifestations, assessment, and differential diagnosis. In J. M. Oldham, A. E. Skodol, & D. S. Bender (Eds.), The American Psychiatric Publishing textbook of personality disorders (2nd ed., pp. 131–​164). Arlington, VA:  American Psychiatric Publishing. Skodol, A. E., Gunderson, J. G., McGlashan, T. H., Dyck, I. R., Stout, R. L., Bender, D. S.,  .  .  .  Oldham, J. M. (2002). Functional impairment in patients with schizotypal, borderline, avoidant, or obsessive–​ compulsive personality disorder. American Journal of Psychiatry, 159, 276–​283. Skodol, A. E., Oldham, J. M., Rosnick, L., Kellman, H. D., & Hyler, S. E. (1991). Diagnosis of DSM-​III-​R personality disorders:  A comparison of two structured interviews. International Journal of Methods in Psychiatric Research, 1, 13–​26. Somwaru, D. P., & Ben-​ Porath, Y. S. (1995, March). Development and reliability of MMPI-​2 based personality disorder scales. Paper presented at the 30th Annual Workshop and Symposium on Recent Developments in Use of the MMPI-​2 and MMPI-​A, St. Petersburg Beach, FL. Strauss, M. E., & Smith, G. T. (2009). Construct validity:  Advances in theory and methodology. Annual Review of Clinical Psychology, 5, 1–​25. Trull, T. J., & Durrett, C. A. (2005). Categorical and dimensional models of personality disorder. Annual Review of Clinical Psychology, 1, 355–​380. Trull, T. J., Scheiderer, E. M., & Tomko, R. L. (2012). Axis II comorbidity. In T. A. Widiger (Ed.), The Oxford handbook of personality disorders (pp. 219–​236). New York, NY: Oxford University Press. Verheul, R. (2005). Clinical utility for dimensional models of personality pathology. Journal of Personality Disorders, 19, 283–​302. Verheul, R., Andrea, H., Berghout, C., Dolan, C. C., Busschbach, J. J. V., Van der Kroft, P. J. A., . . . Fonagy, P. (2008). Severity Indices of Personality Problems

Personality Disorders

(SIPP-​118):  Development, factor structure, reliability and validity. Psychological Assessment, 20, 23–​34. Watson, D., Clark, L. A., & Harkness, A. R. (1994). Structures of personality and their relevance to psychopathology. Journal of Abnormal Psychology, 103, 18–​31. Westen, D. (1997). Divergences between clinical and research methods for assessing personality disorders: Implications for research and the evolution of Axis II. American Journal of Psychiatry, 154, 895–​903. Westen, D., & Shedler, J. (2003). SWAP-​200 instructions for use. Boston, MA: Department of Psychology and Center for Anxiety and Related Disorders. Westen, D., Shedler, J., Durrett, C., Glass, S., & Martens, A. (2003). Personality diagnoses in adolescence: DSM-​IV Axis II diagnosis and an empirically derived alternative. American Journal of Psychiatry, 160, 952–​966. Widiger, T. A. (2005). CIC, CLPS, and MSAD. Journal of Personality Disorders, 19, 586–​593. Widiger, T. A. (2007). DSM’s approach to gender:  History and controversies. In W. E. Narrow, M. B. First, P. J. Sirovatka, & D. A. Regier (Eds.), Age and gender considerations in psychiatric diagnosis:  A research agenda for DSM-​V (pp. 19–​29). Washington, DC:  American Psychiatric Association. Widiger, T. A. (Ed.). (2012). The Oxford handbook of personality disorders. New York, NY: Oxford University Press. Widiger, T. A., & Boyd, S. E. (2009). Personality disorders assessment instruments. In J. N. Butcher (Ed.), Oxford handbook of personality assessment (pp. 336–​ 363). New York, NY: Oxford University Press. Widiger, T. A., & Clark, L. A. (2000). Toward DSM-​V and the classification of psychopathology. Psychological Bulletin, 126, 946–​963. Widiger, T. A., & Lowe, J. R. (2010). Personality disorders. In M. M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (2nd ed., pp. 571–​605). New York, NY: Guilford. Widiger, T. A., Lynam, D. R., Miller, J. D., & Oltmanns, T. F. (2012). Measures to assess maladaptive variants of the five-​factor model. Journal of Personality Assessment, 94, 450–​455. Widiger, T. A., Mangine, S., Corbitt, E. M., Ellis, C. G., & Thomas, G. V. (1995). Personality Disorder Interview-​ IV:  A semistructured interview for the assessment of personality disorders—​ Professional manual. Odessa, FL: Psychological Assessment Resources. Widiger, T. A., & Samuel, D. B. (2005). Evidence based assessment of personality disorders. Psychological Assessment, 17, 278–​287.

485

Widiger, T. A., & Seidlitz, L. (2002). Personality, psychopathology, and aging. Journal of Research in Personality, 36, 335–​362. Widiger, T. A., & Simonsen, E. (2005). Alternative dimensional models of personality disorder:  Finding a common ground. Journal of Personality Disorders, 19, 110–​130. Widiger, T. A., & Trull, T. J. (2007). Plate tectonics in the classification of personality disorder: Shifting to a dimensional model. The American Psychologist, 62, 71–​83. Williams, T. F., & Simms, L. J. (2016). Personality disorder models and their coverage of interpersonal problems. Personality Disorders: Theory, Research, and Treatment, 7, 15–​27. Wood, J. M., Garb, H. N., Lilienfeld, S. O., & Nezworski, M. T. (2002). Clinical assessment. Annual Review of Psychology, 53, 519–​543. Wood, J. M., Garb, H. N., Nezworski, M. T., & Koren, D. (2007). The Shedler–​ Westen Assessment Procedure-​ 200 as a basis for modifying DSM personality disorder categories. Journal of Abnormal Psychology, 116, 823–​836. Wright, A. C., & Simms, L. J. (2014). On the structure of personality disorder traits: Conjoint analyses of the CAT-​ PD, PID-​5, and NEO-​PI-​3 trait models. Personality Disorders: Theory, Research, and Treatment, 5, 43–​54. Wu, L., Blazer, D. G., Gersing, K. R., Burchett, B., Swartz, M. S., & Mannelli, P. (2013). Comorbid substance use disorders with other Axis I and II mental disorders among treatment-​ seeking Asian Americans, Native Hawaiians/​ Pacific Islanders, and mixed-​ race people. Journal of Psychiatric Research, 47, 1940–​1948. Zanarini, M. C., Frankenburg, F. R., Chauncey, D. L., & Gunderson, J. G. (1987). The diagnostic interview for personality disorders: Interrater and test–​retest reliability. Comprehensive Psychiatry, 28, 467–​480. Zanarini, M. C., Frankenburg, F. R., Sickel, A. E., & Young, L. (1996). Diagnostic interview for DSM-​IV personality disorders. Unpublished measure, McLean Hospital, Boston, MA. Zimmerman, J., Böhnke, J. R., Eschstruth, A., Mathews, A., Wenzel, K., & Leising, D. (2015). The latent structure of personality functioning: Investigating Criterion A from the alternative model for personality disorders in DSM-​5. Journal of Abnormal Psychology, 124, 532–​548. Zimmerman, M. (2003). What should the standard of care for psychiatric diagnostic evaluations be? Journal of Nervous and Mental Disease, 191, 281–​286.

Part VII

Couple Distress and Sexual Disorders

22

Couple Distress

Douglas K. Snyder
Richard E. Heyman
Stephen N. Haynes
Christina Balderrama-Durbin

Assessment of couple distress shares basic principles of assessing individuals—namely that (a) the content of assessment methods be empirically linked to target problems, treatment goals, and constructs hypothesized to be functionally related; (b) measures and methods be reliable, valid, and cost-effective; and (c) findings be linked within a theoretical or conceptual framework of the presumed causes of difficulties, as well as to clinical intervention or prevention. However, couple assessment differs from individual assessment in that couple assessment strategies (a) focus specifically on relationship processes and the interactions between individuals; (b) provide an opportunity for direct observation of target complaints involving communication and other interpersonal exchanges; and (c) must be sensitive to potential challenges unique to establishing a collaborative alliance when assessing highly distressed or antagonistic partners, particularly in a conjoint context. Similar to the assessment process itself, our discussion of strategies for assessing couple distress is necessarily selective—emphasizing dimensions empirically related to couple distress, identifying alternative methods and strategies for obtaining relevant assessment data, and highlighting specific techniques within each method. We begin this chapter by defining couple distress and noting its prevalence and comorbidity with emotional, behavioral, and physical health problems of individuals in both clinical and community populations. Both brief screening measures and clinical methods are presented for diagnosing couple distress in clinical as well as research applications. The bulk of the chapter is devoted

to conceptualizing and assessing couple distress for the purpose of planning and evaluating treatment. Toward this end, we review empirical findings regarding behavioral, cognitive, and affective components of couple distress and specific techniques derived from clinical interview, behavioral observation, and self-​report methods. In most cases, these same assessment methods and instruments are relevant to evaluating treatment progress and outcomes. We conclude with general recommendations for assessing couple distress and directions for future research.

CONCEPTUALIZING COUPLE RELATIONSHIP DISTRESS

Defining Couple Distress

The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013) contains criteria for relationship distress with spouse or intimate partner to be used when (a) the major clinical focus is the subjective experience of problematic quality in the relationship or (b) the problematic quality is affecting the course, prognosis, or treatment of a mental or other medical disorder. Potentially impaired couple functioning criteria include behavioral (e.g., conflict resolution difficulty, withdrawal, and aggression), cognitive (e.g., chronic negative attributions or dismissal), or affective (e.g., chronic sadness, apathy, or anger) domains. The proposed criteria for partner


relational problem in the forthcoming 11th edition of the International Classification of Disease (ICD-​11; World Health Organization, 2016)  are similar but more explicated (see Heyman, Slep, & Foran, 2015); for example, whereas the DSM-​5’s criteria address adverse impacts in behavioral, cognitive, and affective domains (with examples), ICD-​11’s criteria include adverse impacts on behavior, cognition, emotion, physical health, interpersonal interaction, and major life role activities. The DSM-​5 (as with prior editions) consigns relational problems to the appendix on Other Conditions and Problems That May Be a Focus of Clinical Attention or That May Otherwise Affect the Diagnosis, Course, Prognosis, or Treatment of a Patient’s Mental Disorder. As discussed in depth by Heyman and Slep (in press), (a)  the DSM excludes problems beyond the individual for guild, rather than for definitional, reasons; (b) decisions regarding the boundary between normality and pathology are made regularly—​ for both individuals and their behaviors in key contexts such as relationships; (c) the DSM’s “harm criterion” is an important innovation that can be used to delineate the boundary for clinically significant individual and relational problems; and (d) diagnostic criteria for relational problems, although not perfect, would still be useful in operationalizing problems that, as with individual problems, cause pain, injury, an important loss of freedom, or death. Nonetheless, diagnostic systems still do not overtly recognize subthreshold deficiencies that couples often present as a focus of concern, including those that detract from optimal individual or relationship well-​being. These include deficits in feelings of security and closeness, shared values, trust, joy, love, physical intimacy, and similar positive emotions that individuals typically value in their intimate relationships. Not all such deficits necessarily culminate in “clinically significant” impaired functioning or emotional and behavioral symptoms as traditionally conceived; yet, frequently, these deficits are experienced as insidious and may culminate in partners’ disillusion or their dissolution of the relationship. The most positive features of the DSM’s conceptualization of partner relational problems are its emphasis on the interactions between partners and its recognition that relational problems are frequently associated with individual symptoms in one or both partners. Prevalence and Comorbid Conditions Clinical interventions targeting couple distress continue to gain in stature as vital components of mental health

services. Three factors contribute to this growing recognition: (a) the prevalence of couple distress in both community and clinic samples; (b) the impact of couple distress on both the emotional and the physical well-​being of adult partners and their offspring; and (c) increased evidence of the effectiveness of couple therapy, not only in treating couple distress and related relationship problems but also as a primary or adjunct treatment for a variety of individual emotional, behavioral, or physical health disorders (Fischer, Baucom, & Cohen, 2016; Lebow, Chambers, Christensen, & Johnson, 2012; Roddy, Nowlan, Doss, & Christensen, 2016; Snyder, Castellani, & Whisman, 2006). Couple distress is a prevalent finding in both community epidemiological studies and research involving clinical samples. In the United States, the most salient indicator of couple distress remains a divorce rate of 40% to 50% among married couples (Kreider & Ellis, 2011), with about half of these occurring within the first 7 years of marriage. Independent of divorce, many, if not most, marriages experience periods of significant turmoil that place partners at risk for dissatisfaction, dissolution, or symptom development (e.g., depression or anxiety); roughly one-​third of married persons report being in a distressed relationship (Whisman, Beach, & Snyder, 2008). Data on the effects of stigma, prejudice, and multiple social stressors experienced by lesbian, gay, and bisexual populations suggest that same-​sex couples may experience additional challenges (Meyer, 2003). In a previous national survey, the most frequently cited causes of acute emotional distress were couple relationship problems, including divorce, separation, and other relationship strains (Swindle, Heller, Pescosolido, & Kikuzawa, 2000). Couple distress covaries with overall life dissatisfaction even more strongly than does distress in other domains, such as health, work, or children (Fleeson, 2004). Other studies have indicated that persons in distressed couple relationships are overrepresented among individuals seeking mental health services, regardless of whether or not they report couple distress as their primary complaint (Lin, Goering, Offord, Campbell, & Boyle, 1996). In a study of 800 employee assistance program (EAP) clients, 65% rated family problems as “considerable” or “extreme” (Shumway, Wampler, Dersch, & Arredondo, 2004). Findings from various national surveys have indicated that compared to happily married persons, maritally distressed partners are significantly more likely to have a mood disorder, anxiety disorder, or substance use disorder (McShall & Johnson, 2015; Whisman, 1999, 2007).


Additional findings from an epidemiological survey in Ontario, Canada, showed that even when controlling for distress in relationships with relatives and close friends, couple distress was significantly correlated with major depression, generalized anxiety disorder, social and simple phobia, panic disorder, and alcohol dependence or abuse (Whisman, Sheldon, & Goering, 2000). Moreover, couple distress—particularly negative communication—has direct adverse effects on cardiovascular, endocrine, immune, neurosensory, and other physiological systems that, in turn, contribute to physical health problems (Robles, Slatcher, Trombello, & McGinn, 2014). Nor are the effects of couple distress confined to the adult partners. Couple distress has been related to a wide range of deleterious effects on children, including depression, anxiety, withdrawal, poor social competence, health problems, poor academic performance, and a variety of other concerns (Bernet, Wamboldt, & Narrow, 2016; Cummings & Davies, 2010; Hetherington, Bridges, & Insabella, 1998; Vaez, Indran, Abdollahi, Jurahi, & Mansor, 2015). In brief, couple distress has a markedly high prevalence; has a strong linkage to emotional, behavioral, and health problems in the adult partners and their offspring; and is among the most frequent primary or secondary


concerns reported by individuals seeking assistance from mental health professionals.

Etiological Considerations and Implications for Assessment

As noted previously, both the aforementioned comorbidity findings and clinical observations suggest that couple distress likely results from, as well as contributes to, emotional and behavioral problems in one or both partners as well as their children. However, as a relational (vs. individual) disorder, understanding a given couple's distress requires extending beyond individual considerations to pursue a broader assessment of the relational and socioecological context in which couple distress emerges. Snyder, Cavell, Heffer, and Mangrum (1995) proposed a multitrait, multilevel assessment model for assessing couple and family distress comprising five overlapping construct domains (cognitive, affective, behavioral, interpersonal, and structural/developmental) operating at five system levels (individuals, dyads, the nuclear family, the extended family, and community/cultural systems). Table 22.1 (from Abbott & Snyder, 2010) provides a modest sampling of specific constructs relevant to each domain at each system level.

TABLE 22.1  Sample Assessment Constructs Across Domains and Levels of Individual, Couple, and Family Functioning

Cognitive
Individual: Intelligence; memory functions; thought content; thought quality; analytic skills; cognitive distortions; capacity for self-reflection and insight.
Dyad (Couple, Parent–Child): Cognitions regarding self and other in relationship; expectancies, attributions, schemas; attentional biases, and goals in the relationship.
Nuclear Family System: Shared or co-constructed meanings within the system; family ideology or paradigm; thought sequences between members contributing to family functioning.
Extended System (Family of Origin, Friends): Intergenerational patterns of thinking and believing; co-constructed meaning shared by therapist and family or other significant friends or family.
Culture/Community: Prevailing societal and cultural beliefs and attitudes; ways of thinking associated with particular religious or ethnic groups that are germane to the family or individual.

Affective/emotional
Individual: Mood; affective range, intensity, and valence; emotional lability and reactivity.
Dyad (Couple, Parent–Child): Predominant emotional themes or patterns in the relationship; cohesion; range of emotional expression; commitment and satisfaction in the relationship; emotional content during conflict; acceptance and forgiveness.
Nuclear Family System: Family emotional themes of fear, shame, guilt, or rejection; system properties of cohesion or emotional disaffection; emotional atmosphere in the home—including humor, joy, love, and affection as well as conflict and hostility.
Extended System (Family of Origin, Friends): Emotional themes and patterns in the extended system; intergenerational emotional legacies; patterns of fusion or differentiation across generations.
Culture/Community: Prevailing emotional sentiment in the community, culture, and society; cultural norms and mores regarding the expression of emotion.

Behavioral
Individual: Capacity for self-control; impulsivity; aggressiveness; capacity to defer gratification; substance abuse; overall health, energy, and drive.
Dyad (Couple, Parent–Child): Recursive behavioral sequences displayed in the relationship; behavioral repertoire; reinforcement contingencies; strategies used to control other's behavior.
Nuclear Family System: Repetitive behavioral patterns or sequences used to influence family structure and power; shared recreation and other pleasant activities.
Extended System (Family of Origin, Friends): Behavioral patterns displayed by the extended system (significant friends, family of origin, therapist) used to influence the structure and behaviors of the extended system.
Culture/Community: Cultural norms and mores of behavior; behaviors which are prescribed or proscribed by the larger society.

Interpersonal/communication
Individual: Characteristic ways of communicating and interacting across relationships or personality (e.g., shy, gregarious, narcissistic, dependent, controlling, avoidant).
Dyad (Couple, Parent–Child): Quality and frequency of the dyad's communication; speaking and listening skills; how couples share information, express feelings, and resolve conflict.
Nuclear Family System: Information flow in the family system; paradoxical messages; family system boundaries, hierarchy, and organization; how the family system uses information regarding its own functioning; family decision-making strategies.
Extended System (Family of Origin, Friends): Degree to which information is shared with and received from significant others outside the nuclear family system or dyad; the permeability of boundaries and the degree to which the family or couple is receptive to outside influences.
Culture/Community: Information that is communicated to the family or individual by the community or culture in which they live; how the family or individual communicates their needs and mobilizes resources.

Structural/developmental
Individual: All aspects of physiological and psychosocial development; personal history that influences current functioning—including psychosocial stressors; intrapersonal consistency of cognitions, affect, and behavior.
Dyad (Couple, Parent–Child): History of the relationship and how it has evolved over time; congruence of partners' cognitions, affect, and behavior.
Nuclear Family System: Changes in the family system over time; current stage in the family life cycle; stressors related to child-rearing; congruence in needs, beliefs, and behaviors across family members.
Extended System (Family of Origin, Friends): Developmental changes across generations; significant historical events influencing current system functioning (e.g., death, illness, divorce, abuse); congruence of beliefs and values across extended social support systems.
Culture/Community: The cultural and political history of the society in which the family or individual lives; current political and economic changes; congruence of the individual's or couple's values with those of the larger community.

Source: From B. V. Abbott and D. K. Snyder (2010). Couple distress. In M. M. Antony & D. H. Barlow (Eds.), Handbook of Assessment and Treatment Planning for Psychological Disorders (2nd ed., pp. 439–476). New York, NY: Guilford Press. Copyright 2010 by Guilford Press. Reprinted with permission of The Guilford Press.

The relevance of any specific facet of this model to relationship distress for either partner varies dramatically across couples; hence, although providing guidance regarding initial areas of inquiry from a nomothetic perspective, the relation of any specific facet to relationship distress for a given individual or couple needs to be determined from a functional analytic approach and applied idiographically (Haynes, Mumma, & Pinson, 2009; Haynes, O’Brien, & Kaholokula, 2011). Moreover,

interactive effects occur within domains across levels, within levels across domains, and across levels and domains. For example, individual differences in emotion regulation could significantly impact how partners interact when disclosing personal information or attempting to resolve conflict. Later in this chapter, we highlight more salient components of this assessment model operating primarily at the dyadic level as they relate to case conceptualization and treatment planning.
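For readers who track assessment findings electronically, the domain-by-level grid in Table 22.1 can double as a simple bookkeeping scheme. The sketch below is only an illustration of that idea, not part of the Snyder, Cavell, Heffer, and Mangrum (1995) model itself: the domain and level labels are taken from Table 22.1, whereas the data structure, function name, and example entries are hypothetical.

```python
from collections import defaultdict

# Construct domains and system levels from the multitrait, multilevel model (Table 22.1).
DOMAINS = ["cognitive", "affective", "behavioral", "interpersonal", "structural/developmental"]
LEVELS = ["individual", "dyad", "nuclear family", "extended system", "culture/community"]

# findings[(domain, level)] -> list of idiographic observations for one couple.
findings = defaultdict(list)

def record(domain: str, level: str, observation: str) -> None:
    """File an assessment observation under its construct domain and system level."""
    if domain not in DOMAINS or level not in LEVELS:
        raise ValueError(f"Unknown domain or level: {domain}, {level}")
    findings[(domain, level)].append(observation)

# Hypothetical entries for a single case (illustrative only).
record("behavioral", "dyad", "Rapid escalation during money discussions; demand-withdraw pattern")
record("cognitive", "dyad", "Attributes partner's lateness to stable negative traits")
record("structural/developmental", "extended system", "Recent illness in family of origin straining support")

# Summarize which cells of the grid have been assessed so far.
for domain in DOMAINS:
    covered = [lvl for lvl in LEVELS if findings[(domain, lvl)]]
    print(f"{domain:26s} -> {covered or 'no data yet'}")
```

Used idiographically, a grid like this simply makes visible which cells of the model have been probed for a given couple and which remain unexamined.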

ASSESSMENT FOR DIAGNOSIS

A diagnosis of couple distress is based, in part, on the subjective evaluation of dissatisfaction by one or both partners with the overall quality of their relationship. By comparison, relationship dysfunction may be determined by external evaluations of partners’ objective interactions. Although subjective and external evaluations frequently converge, partners may report being satisfied with a relationship that—​by outsiders’ evaluations—​would be rated as dysfunctional due to observed deficits in conflict resolution, emotional expressiveness, management of relationship tasks involving finances or children, interactions with extended family, and so forth; similarly, partners may report dissatisfaction with a relationship that to outsiders appears characterized by effective patterns of interacting in these and other domains. Discrepancies between partners’ subjective reports and outside observers’ evaluations may result, in part, from differences in raters’ personal values, developmental stage, gender, ethnicity, or cultural perspective (Haynes, Kaholokula, & Tanaka-​Matsumi, in press) or from a lack of opportunity to observe relatively infrequent behaviors (e.g., incidents of physical or emotional abuse). To complicate matters, partners themselves may diverge in their appraisals—​either because of actual differences in subjective experiences or because of differences in ability or willingness to convey these experiences. Some clients may be more forthcoming when responding to a questionnaire (Whisman & Snyder, 2007)  or when interviewed individually rather than conjointly with their partner. Assessment measures and methods intended to identify couple distress should be both sensitive (i.e., able to detect its presence at some operationalized threshold) and


specific (i.e., able to distinguish couple distress from other related or comorbid conditions). For screening purposes, a brief structured interview may be used to assess overall relationship distress and partner violence. Heyman, Feldbau-​Kohn, Ehrensaft, Langhinrichsen-​Rohling, and O’Leary (2001) developed a structured diagnostic interview for couple distress (Structured Diagnostic Interview for Marital Distress and Partner Aggression [SDI-​MD-​ PA]), and Heyman, Slep, Snarr, and Foran (2013) developed a set of structured diagnostic interviews for partner physical, emotional, and sexual abuse, all patterned after the Structured Clinical Interview for the DSM (First, Gibbon, Spitzer, & Williams, 1997). An initial evaluation of the relationship distress structured interview demonstrated high inter-​rater reliability; moreover, partners’ responses to items presented in this interview showed a high correspondence with the same items given in the form of a questionnaire (Table 22.2). The emphasis on partners’ subjective evaluations of couple distress has led to development of numerous self-​ report measures of relationship satisfaction and global affect. There is considerable convergence across measures purporting to assess such constructs as marital “quality,” “satisfaction,” “adjustment,” “happiness,” “cohesion,” “consensus,” “intimacy,” and the like, with correlations between measures often approaching the upper bounds of their reliability. Differentiation among such constructs at a theoretical level often fails to achieve the same operational distinction at the item-​content level (for an excellent discussion of this issue, see Fincham & Bradbury, 1987). Hence, selection among such measures should be guided by careful examination of item content (i.e., content validity) and empirical findings regarding both convergent and discriminant validity.

TABLE 22.2  Ratings of Instruments Used for Screening and Diagnosis

Ratings are listed in the order: Norms, Internal Consistency, Inter-Rater Reliability, Test–Retest Reliability, Content Validity, Construct Validity, Validity Generalization, Clinical Utility.

SDI-MD-PA: NR, NA, E, NR, G, G, G, G
DAS-7: NR, A, NA, NR, G, G, NR, G
MSI-B: E, E, NA, E, E, G, G, G
CSI-4: A, E, NA, NR, E, G, NR, G
RMICS: E, E, A, NR, A, G, G, A
RCISS: E, NR, A, NR, A, G, G, A

Note: SDI-MD-PA = Structured Diagnostic Interview for Marital Distress and Partner Aggression; DAS-7 = Dyadic Adjustment Scale–7-item version; MSI-B = Marital Satisfaction Inventory-Brief; CSI-4 = Couple Satisfaction Index Scale–4-item version; RMICS = Rapid Marital Interaction Coding System; RCISS = Rapid Couples Interaction Scoring System; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.
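Because the practical value of any screening instrument also depends on the base rate of couple distress in the setting where it is used, it can help to translate sensitivity and specificity into predictive values. The sketch below illustrates only that arithmetic; it does not report properties of any instrument in Table 22.2. The sensitivity and specificity figures are hypothetical placeholders, the base rate of roughly one-third reflects the community prevalence cited earlier in this chapter, and 0.70 is an arbitrary stand-in for a treatment-seeking sample.

```python
def predictive_values(sensitivity: float, specificity: float, base_rate: float):
    """Convert screening accuracy into positive and negative predictive values."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    ppv = true_pos / (true_pos + false_pos)   # probability distress is present given a positive screen
    npv = true_neg / (true_neg + false_neg)   # probability distress is absent given a negative screen
    return ppv, npv

# Hypothetical screening accuracy applied at two base rates: roughly one-third of
# married persons in community samples report distress, whereas an arbitrary 0.70
# stands in here for a clinic-referred sample.
for base_rate in (0.33, 0.70):
    ppv, npv = predictive_values(sensitivity=0.90, specificity=0.80, base_rate=base_rate)
    print(f"base rate {base_rate:.2f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```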


Relatively short measures of overall relationship satisfaction may be useful as diagnostic and screening strategies for couple distress. The most frequently used global measure of relationship satisfaction in couple research is the Dyadic Adjustment Scale (DAS; Spanier, 1976), a 32-​item instrument purporting to differentiate among four related subscales reflecting cohesion, satisfaction, consensus, and affectional expression. For abbreviated screening measures of couple distress, several alternatives are available, including a brief (7-​ item) version of the DAS (Hunsley, Best, Lefebvre, & Vito, 2001; Sharpley & Rogers, 1984). More recent global measures of relationship sentiment include a 10-​ item screening scale (Marital Satisfaction Inventory-​Brief form [MSI-​B]; Whisman, Snyder, & Beach, 2009)  derived from the Marital Satisfaction Inventory-​Revised (MSI-​R; Snyder, 1997) and a set of three Couple Satisfaction Index (CSI) scales constructed using item response theory comprising 32, 16, and 4 items each (Funk & Rogge, 2007). Despite its widespread use, a review of psychometric properties reveals important limitations to the DAS. Factor analyses have failed to replicate its four subscales (Crane, Busby, & Larson, 1991), and the reliability of the affectional expression subscale is weak. There is no evidence that the full-​length DAS and similar longer global scales offer incremental validity above the briefer and more recent MSI-​B and CSI scales that offer higher precision of measurement and greater sensitivity for detecting differences in relationship satisfaction (Balderrama-​Durbin, Snyder, & Balsis, 2015). Because partners frequently present for treatment together, clinicians have the rare opportunity to observe the reciprocal social determinants of problem behaviors without venturing outside the therapy office. Structured observations constitute a useful assessment method because they minimize inferences needed to assess behavior, can facilitate formal or informal functional analysis, can provide an additional method of assessment in a multimethod strategy (e.g., integrated with interview and questionnaires), and can facilitate the observation of otherwise difficult to observe behaviors (Haynes et al., 2011; Heyman & Slep, 2004). We discuss analog behavioral observation of couple interactions and describe specific observational coding systems at greater length in the following section on case conceptualization and treatment planning. However, for purposes of initial screening and diagnosis, we advocate two approaches to assessing partners’ descriptions of relationship problems, expression of positive and negative feelings, and efforts to resolve conflicts and reach decisions—​the Rapid Marital Interaction

Coding System (RMICS; Heyman, 2004) and the Rapid Couples Interaction Scoring System (RCISS; Krokoff, Gottman, & Hass, 1989). Even when not formally coding couples' interactions, clinicians' familiarity with the behavioral indicators for specific communication patterns previously demonstrated to covary with relationship accord or distress should facilitate empirically informed screening of partners' verbal and nonverbal exchanges. When a couple presents for therapy with primary complaints of dissatisfaction in the relationship, screening for the mere presence of couple distress is unnecessary. However, there are numerous other situations in which the practitioner may need to screen for relationship distress as a contributing or exacerbating factor in patients' presenting complaints, including mental health professionals treating individual emotional or behavioral difficulties; physicians evaluating the interpersonal context of such somatic complaints as fatigue, chronic headaches, sleep disturbance, alcohol misuse, or difficulties at work; or emergency room personnel confronting persons with severe relationship distress culminating in physical violence and injuries. We advocate a sequential strategy of progressively more detailed assessment when indicators of relationship distress emerge (cf. Abbott & Snyder, 2010, pp. 468–469):

1. Clinical inquiry as to whether relationship problems contribute to individual difficulties such as feeling depressed or anxious, having difficulty sleeping, abusing alcohol or other substances, or feeling less able to deal with such stresses as work, children and family, or health concerns.
2. Alternatively, use of an initial brief screening measure (e.g., the Couple Satisfaction Index Scale–4-item version [CSI-4], Dyadic Adjustment Scale–7-item version [DAS-7], or MSI-B) having evidence of both internal consistency and construct validity.
3. For individuals reporting moderate to high levels of global relationship distress, following up with more detailed assessment strategies such as semi-structured interviews, analogue behavioral observation, and multidimensional relationship satisfaction questionnaires to differentiate among levels and sources of distress (see the sketch below).

Overall Evaluation

When screening for either clinical or research purposes, we advocate assessment strategies favoring sensitivity over specificity to minimize the likelihood of overlooking potential factors contributing to individual or relationship distress. This implies the initial use of broad screening items in clinical inquiry or brief self-report measures with strong psychometric support—along with direct observation of partner interactions whenever possible—and subsequent use of more extensive narrowband or multidimensional measures described in the following section on treatment planning to pinpoint specific sources of concern. Initial assessment findings indicating overall relationship distress need to be followed by idiographic functional analytic assessment strategies to delineate the manner in which individual and relationship concerns affect each other and relate to situational factors (Haynes et al., 2009).
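The three-step sequence above lends itself to a simple triage flow, summarized in the sketch below. This is a schematic rendering under stated assumptions rather than a scoring algorithm endorsed by the chapter or by any test publisher: the function names and numeric cutoff are hypothetical, and an actual threshold, along with the direction of scoring, would be taken from the manual of whichever brief screen (e.g., CSI-4, DAS-7, or MSI-B) is administered.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScreenResult:
    distress_indicated: bool
    next_step: str

# Hypothetical cutoff for illustration only; a real threshold (and whether low or
# high scores signal distress) must come from the chosen instrument's manual.
DISTRESS_CUTOFF = 10

def triage(inquiry_positive: bool, brief_screen_score: Optional[int]) -> ScreenResult:
    """Schematic rendering of the chapter's three-step screening sequence.

    Step 1: broad clinical inquiry about whether relationship problems contribute
            to the presenting individual difficulties.
    Step 2: alternatively, a brief validated screen (e.g., CSI-4, DAS-7, or MSI-B).
    Step 3: if either indicator suggests distress, follow up with semi-structured
            interview, analogue behavioral observation, and multidimensional
            relationship satisfaction questionnaires.
    Combining indicators with a logical OR favors sensitivity over specificity,
    consistent with the screening stance recommended in this chapter.
    """
    screen_positive = (
        brief_screen_score is not None and brief_screen_score <= DISTRESS_CUTOFF
    )
    if inquiry_positive or screen_positive:
        return ScreenResult(True, "detailed assessment: semi-structured interview, "
                                  "behavioral observation, multidimensional questionnaires")
    return ScreenResult(False, "no further couple-specific assessment indicated")

print(triage(inquiry_positive=False, brief_screen_score=8))
```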

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Conceptualizing couple distress for the purpose of planning treatment requires extending beyond global sentiment to assess specific sources and levels of relationship difficulties, their individual and broader socioecological determinants, and their potential responsiveness to various clinical interventions. We begin our consideration of assessing couple relationships for case conceptualization and treatment planning with a discussion of construct domains particularly relevant to couple distress—​ including relationship behaviors, cognitions, and affect—​as well as individual and broader cultural factors. We follow this with a discussion of various assessment strategies and methods for evaluating specific constructs in these domains. Domains to Target When Evaluating Couple Distress Relationship Behaviors Research examining behavioral components of couple distress has emphasized two domains:  (a) the rates and reciprocity of positive and negative behaviors exchanged between partners (see a review by Salazar, 2015)  and (b)  communication behaviors related to both emotional expression and decision-​making. Regarding the former, distressed partners, compared with nondistressed partners, (a)  are more hostile; (b)  start their conversations more hostilely and maintain this hostility during the course of the conversation; (c)  are more likely to reciprocate


and escalate their partners’ hostility; (d) are less likely to edit their behavior during conflict, resulting in longer negative reciprocity loops; (e) emit less positive behavior; (f) suffer more ill health effects from their conflicts; and (g) are more likely to show demand ↔ withdraw patterns (Heyman, 2001). Findings suggest a stronger linkage for negativity, compared to positivity, to overall couple distress. Given the inevitability of disagreements arising in long-​term relationships, numerous studies have focused on specific communication behaviors that exacerbate or impede the resolution of couple conflicts. Most notable among these are difficulties in articulating thoughts and feelings related to specific relationship concerns and deficits in decision-​making strategies for containing, reducing, or eliminating conflict. Gottman (1994) observed that expression of criticism and contempt, along with defensiveness and withdrawal, predicted long-​term distress and risk for relationship dissolution. Christensen and Heavey (1990) found that distressed couples were more likely than nondistressed couples to demonstrate a demand ↔ withdraw pattern in which one person attempts to engage the partner in relationship exchange and that partner withdraws, with respective approach and retreat behaviors progressively intensifying. Given findings regarding the prominence of negativity, conflict, and ineffective decision-​ making strategies as correlates of relationship distress, couple assessment must address specific questions regarding relationship behaviors, especially communication behaviors. We list these here, along with sample assessment methods; in subsequent sections specifying interview, observational, and self-​report strategies for assessing couple distress, we describe these and related methods in greater detail: 1. How frequent and intense are the couple’s conflicts? How rapidly do initial disagreements escalate into major arguments? For how long do conflicts persist without resolution? Both interview and self-​report measures may yield useful information regarding rates and intensity of negative exchanges as well as patterns of conflict engagement. Commonly used self-​ report measures specific to communication include the Communication Patterns Questionnaire (CPQ; Christensen, 1987; Crenshaw, Christensen, Baucom, Epstein, & Baucom, 2016; see Table 22.3). Couples’ conflict-​resolution patterns may be observed directly by instructing partners to discuss problems of their own choosing representative of both moderate and high disagreement and then


TABLE 22.3  Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Ratings are listed in the order: Norms, Internal Consistency, Inter-Rater Reliability, Test–Retest Reliability, Content Validity, Construct Validity, Validity Generalization, Clinical Utility.

Interviews
RQI: A, A, G, NR, A, A, G, A

Self-Report Measures: Specific Relationship Behaviors
FAPBI: A, G, NA, NR, E, G, A, A
CPQ: NR, G, NA, A, NR, G, G, G
CTS2: NR, G, NA, A, A, A, A, G
DCI: E, G, NA, G, E, G, G, G

Self-Report Measures: Relationship Cognitions
RAM: NR, G, NA, G, A, G, NR, G

Self-Report Measures: Multidimensional Inventories
MSI-R: E, G, NA, G, E, E, G, E
ENRICH: G, G, NA, A, G, A, A, A

Observational Measures: Affect
BARS: A, NR, A, NR, A, A, NR, A
SPAFF: G, NR, A, G, A, G, G, A

Observational Measures: Communication (Demand/Withdraw)
CRS: G, A, G, NR, A, G, G, A

Observational Measures: Communication (Affect)
CRAC: G, G, G, A, A, G, G, A
IDCS: NR, NR, A, NR, A, G, A, A

Observational Measures: Communication (Problem Solving)
KPI: G, NR, G, NR, A, G, E, A
COMFI: NR, NR, A, NR, A, A, A, A
CST: G, NR, G, NR, A, G, E, A
DISC: NR, NR, A, NR, A, NR, A, A
LIFE: G, NR, G, NR, A, G, A, A
VTCS: NR, NR, A, NR, A, G, A, A

Observational Measures: Communication (Power/Affect)
SCID: NR, NR, A, NR, A, A, G, A

Observational Measures: Support/Intimacy
SSICS: NR, NR, G, NR, A, G, A, A
CIBRS: U, E, G, NR, G, G, A, A

Note: RQI = Relationship Quality Interview; FAPBI = Frequency and Acceptability of Partner Behavior Inventory; CPQ = Communication Patterns Questionnaire; CTS2 = Conflict Tactics Scale-Revised; DCI = Dyadic Coping Inventory; RAM = Relationship Attribution Measure; MSI-R = Marital Satisfaction Inventory-Revised; ENRICH = Evaluating and Nurturing Relationship Issues, Communication, Happiness; BARS = Behavioral Affective Rating System; SPAFF = Specific Affect Coding System; CRS = Conflict Rating System; CRAC = Clinical Rating of Adult Communication Scale; IDCS = Interactional Dimensions Coding System; KPI = Kategoriensystem für Partnerschaftliche Interaktion; COMFI = Codebook of Marital and Family Interaction; CST = Communication Skills Test; DISC = Dyadic Interaction Scoring Code; LIFE = Living in Family Environments Coding System; VTCS = Verbal Tactics Coding Scheme; SCID = System for Coding Interactions in Dyads; SSICS = Social Support Interaction Coding System; CIBRS = Couples' Intimate Behavior Rating System; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

either formally or informally coding these interactions using one of the behavioral coding systems described later in this chapter. 2. What are common sources of relationship conflict? For example, interactions regarding finances, children, sexual intimacy, use of leisure time, or household tasks; involvement with others, including extended family, friends, or coworkers; and

differences in preferences or core values? In addition to the clinical interview, numerous self-​report measures sample sources of distress across a variety of relationship domains. Among those having evidence of both reliability and construct validity are the Frequency and Acceptability of Partner Behavior Inventory (FAPBI; Doss & Christensen, 2006)  and the MSI-​ R (Snyder, 1997), both of


which are described in greater detail later, along with other self-​report measures. 3. What resources and deficits do partners demonstrate in problem-​ identification and conflict-​ resolution strategies? Do they engage couple issues at adaptive levels (i.e., neither avoiding nor dwelling on relationship concerns)? Do partners balance their expression of feelings with decision-​making strategies? Are problem-​resolution efforts hindered by inflexibility or imbalances in power? Do partners offer each other support when confronting stressors from within or outside their relationship (e.g., chronic medical or psychological illness of one partner)? As noted by others (e.g., Bradbury, Rogge, & Lawrence, 2001; Cutrona, 1996), most of the interactional tasks developed for use in couple research have emphasized problem solving and conflict resolution to the exclusion of tasks designed to elicit more positive relationship behaviors, such as emotional or strategic support. Hence, when designing interaction tasks for couples, both clinicians and researchers should include tasks specifically designed to sample potential positive as well as negative exchanges. For example, couples might be asked to discuss a time when one partner’s feelings were hurt by someone outside the relationship (e.g., a friend or coworker) in order to assess behaviors expressing understanding and caring, although few templates with these foci have been developed and psychometrically evaluated (for an exception, see Mitchell et al., 2008). Relationship Cognitions Social learning models of couple distress have expanded to emphasize the role of cognitive processes in mediating the impact of specific behaviors on relationship functioning (Baucom, Epstein, Kirby, & LaTaillade, 2015). Research in this domain has focused on such factors as selective attention; attributions for positive and negative relationship events; and specific relationship assumptions, standards, and expectancies. For example, findings indicate that distressed couples often exhibit a bias toward selectively attending to negative partner behaviors and relationship events and ignoring or minimizing positive events (Sillars, Roberts, Leonard, & Dun, 2000). Compared to nondistressed couples, distressed partners also tend to blame each other for problems and to attribute each other’s negative behaviors to broad and stable traits (Bradbury & Fincham, 1990). Initial negative attributions predict relationship deterioration during the first


4 years of marriage (Lavner, Bradbury, & Karney, 2012). Distressed couples are also more likely to have unrealistic standards and assumptions about how relationships should work and lower expectancies regarding a partner’s willingness or ability to change his or her behavior in some desired manner (Epstein & Baucom, 2002). Based on these findings, assessment of relationship cognitions should emphasize the following questions: 1. Do partners demonstrate an ability to accurately observe and report both positive and negative relationship events? For example, partners’ descriptions and interpretations of couple interactions observed directly in therapy can be compared to the clinician’s own assessment of these same exchanges. Partners’ response-​sets when completing self-​report relationship measures can also be assessed; for example, the Conventionalization (CNV) scale on the MSI-​R (Snyder, 1997) assesses the tendency to distort relationship appraisals in an overly positive direction. 2. What interpretation or meaning do partners impart to relationship events? Clinical interviews are particularly useful for eliciting partners’ subjective interpretations of their own and each other’s behaviors; such interpretations and attributions also frequently are expressed during conflict-​ resolution or other interactional tasks. To what extent are partners’ negative relationship behaviors attributed to stable, negative aspects of the partner versus external or transient events? Self-​report measures assessing relationship attributions include the Relationship Attribution Measure (RAM; Fincham & Bradbury, 1992). 3. What beliefs and expectancies do partners hold regarding both their own and the other person’s ability and willingness to change in a manner anticipated to be helpful to their relationship? What standards do they hold for relationships generally? Relationship Affect Similar to findings regarding behavior exchange, research indicates that distressed couples are distinguished from nondistressed couples by higher overall rates, duration, and reciprocity of negative relationship affect and, to a lesser extent, by lower rates of positive relationship affect. Nondistressed couples show less reciprocity of positive affect, reflecting partners’ willingness or ability to express positive sentiment spontaneously independent


of their partner’s affect (Gottman, 1999). By contrast, partners’ influence on each other’s negative affect has been reported for both proximal and distal outcomes. For example, Pasch, Bradbury, and Davila (1997) found that partners’ negative mood prior to discussion of a personal issue predicted lower levels of emotional support they provided to the other during their exchange. From a longitudinal perspective, couples who divorce are distinguished from those who remain married by partners’ initial levels of negative affect and by a stronger linkage of initial negativity to the other person’s negative affect over time (Cook et al., 1995). Gottman (1999) determined that the single best predictor of couples’ eventual divorce was the amount of contempt partners expressed in videotaped interactions. Hence, assessment of couple distress should evaluate the following:

1. To what extent do partners express and reciprocate negative and positive feelings about their relationship and toward each other? Partners' reciprocity of affect is best evaluated using either structured or unstructured interactions and coded (either formally or informally) using one of the behavioral observation systems described later in this section. Although much of the couple literature emphasizes negative emotions, positive emotions such as smiling, laughter, expressions of appreciation or respect, comfort or soothing, mutual support or coping, and similar expressions are equally important to assess through observation or clinical inquiry.
2. What ability does each partner have to express his or her feelings in a modulated manner? Problems with emotion self-regulation may be observed either in overcontrol of emotions (e.g., an inability to access, label, or express either positive or negative feelings) or in undercontrol of emotions (e.g., the rapid escalation of anger into intense negativity approaching rage, progression of tearfulness into sobbing, or deterioration in quality of thought secondary to emotional overload). Unregulated negativity culminating in either verbal or physical aggression can be assessed through self- or partner report using the revised version of the Conflict Tactics Scale (CTS2; Straus, Hamby, Boney-McCoy, & Sugarman, 1996).
3. To what extent does partners' negative affect generalize across occasions? Generalization of negative affect can be observed in partners' inability to shift from negative to either neutral or positive affect during the interview or in interactional tasks, or in reports of distress across most or all domains of relationship functioning assessed using self-report. In research applications, ratings of affect by partners observing their videotaped interactions may provide an additional means of assessing sentiment override. For example, in a study of the effects of relationship sentiment override on couples' perceptions, partners used an affect-rating dial to indicate how positively or negatively they felt during a previously videotaped interaction and how they thought their partner felt during the interaction (Hawkins, Carrère, & Gottman, 2002).

Comorbid Individual Distress

As noted previously when discussing comorbid conditions, there is growing evidence that relationship difficulties covary with, contribute to, and result from individual emotional and behavioral disorders (Lebow et al., 2012; Snyder & Whisman, 2003). Both clinician reports and treatment outcome studies suggest that individual difficulties render couple therapy more difficult or less effective (Allgood & Crane, 1991; Christensen et al., 2004; Dalgleish et al., 2015; Knobloch-Fedders, Pinsof, & Haase, 2015; Northey, 2002; Rowe, Doss, Hsueh, Libet, & Mitchell, 2011; Sher, Baucom, & Larus, 1990; Snyder, Mangrum, & Wills, 1993; Whisman, Dixon, & Johnson, 1997). Hence, when evaluating couple distress, additional attention should be given to disorders of individual emotional or behavioral functioning to address the extent to which either partner exhibits individual emotional or behavioral difficulties potentially contributing to, exacerbating, or resulting in part from couple distress. Given the association of couple distress with affective disorders and alcohol use, initial interviews of couples should include questions regarding suicidality and alcohol or other substance use, as well as brief screening for previous treatment of emotional or behavioral disorders. When clinical interview suggests potential interaction of relationship and individual dysfunction, focused and brief measures such as those for depression, anxiety, alcohol misuse, or other clinical disorders should be considered—for example, the Beck Depression Inventory-II (BDI-II; Beck, Steer, & Brown, 1996), the Alcohol Use Disorders Identification Test (AUDIT; Babor, Higgins-Biddle, Saunders, & Monteiro, 2001), the Beck Anxiety Inventory (BAI; Beck, Epstein, Brown, & Steer, 1988), the Generalized Anxiety Disorder scale (GAD-7; Spitzer, Kroenke, Williams, & Lowe, 2006), or the Symptom Checklist-90-Revised (SCL-90-R; Derogatis
& Savitz, 1999). It is equally important to assess couples’ strengths and resources across intrapersonal, relationship, and broader social system levels. These include partners’ ability to limit the impact of individual or couple dysfunction despite overwhelming stressors, or containing the generalization of distress to other family members. Finally, establishing the direction and strength of causal relations among individual and relationship disorders, as well as their linkage to situational stressors or buffers, is crucial for determining both the content and the sequencing of clinical interventions. This includes the linkage of adult relationship conflict to child behavior problems. In many cases, such functional relations are reciprocal—​ supporting interventions at either end of the causal chain. Cultural Differences in Couple Distress Consistent with our conceptual framework, cultural differences in the development, subjective experience, overt expression, and treatment of couple distress are critical to evaluate. By this we refer not only to cross-​national differences in couples’ relationships but also to cross-​cultural differences within nationality and consideration of nontraditional relationships including gay and lesbian couples. There can be important differences among couples as a function of their race/​ethnicity, culture, religious orientation, economic level, and age. These dimensions can affect the importance of the couple relationship to a partner’s quality of life, their expectancies regarding marital and parenting roles, typical patterns of verbal and nonverbal communication and decision-​making within the family, the behaviors that are considered distressing, sources of relationship conflict, the type of external stressors faced by a family, and the ways that partners respond to couple distress and divorce (e.g., Diener, Gohm, Suh, & Oishi, 2000; Gohm, Oishi, Darlington, & Diener, 1998; Jones & Chao, 1997; Kline et al., 2012; Lam et al., 2015). For example, Haynes and colleagues (1992) found that parenting, extended family, and sex were less strongly related to marital satisfaction whereas health of the spouse and other forms of affection were more important factors in marital satisfaction in older (i.e., older than age 55  years) compared to younger couples. Similarly, Bhugra and De Silva (2000) suggested that relationships with extended family members might be more important in some cultures. Also, when partners are from different cultures, cultural differences and conflicts can be a source of relationship dissatisfaction (e.g., Baltas & Steptoe, 2000). An important implication of such findings is that measures shown to be valid for one population may be less so for another.


Culturally sensitive assessment in couple/​ family, forensic, intellectual, school, and psychiatric contexts has been the topic of hundreds of articles and book chapters, which have highlighted cross-​cultural similarities and differences in behavior, sources of distress, values, and beliefs (e.g., Lim, 2015). Implications for couple assessment from cross-​cultural research are consistent with the idiographic approach to assessment emphasized previously in this chapter, particularly sensitivity to and respect for individual differences in the factors that affect a couple’s relationship satisfaction and treatment goals. Because norms for measures can differ across cultures, case formulation, treatment planning, and treatment outcome monitoring may benefit from greater attention to elements within scales than to scale scores when a self-​report instrument is used to assess couples who differ in potentially important ways from the original development sample. Assessment Strategies and Methods for Evaluating Couple Distress Assessment strategies for evaluating relationships vary across the clinical interview, observational methods, and self-​and other-​report measures. In the sections that follow, we discuss empirically supported techniques within each of these assessment strategies. Although specific techniques within any method could target diverse facets of individual, dyadic, or broader system functioning, we emphasize those more commonly used when assessing couple distress. The Clinical Interview The pretreatment clinical interview is the first step in assessing couples. It can aid in identifying a couple’s behavior problems and strengths, help specify a couple’s treatment goals, and be used to acquire data that are useful for treatment outcome evaluation. The assessment interview can also serve to strengthen the couple–​clinician relationship, identify barriers to treatment, and increase the chance that the couple will participate in subsequent assessment and treatment tasks. Furthermore, it is the primary means of gaining a couple’s informed consent about the assessment–​treatment process. Data from initial assessment interviews also guide the clinician’s decisions about which additional assessment strategies may be most useful; for example, Gordis, Margolin, and John (2001) used an interview to select topics for discussion during an analogue behavioral observation of couple communication patterns. Perhaps most important, the assessment interview can provide a rich source of hypotheses about


factors that may contribute to the couple’s distress. These hypotheses contribute to the case formulation, which in turn affects decisions about the best treatment strategy for a particular couple. The interview can also be used to gather information on multiple levels, in multiple domains, and across multiple response modes in couple assessment. It can provide information on the specific behavioral interactions of the couple, including behavioral exchanges and violence; problem-​solving skills, sources of disagreement, areas of satisfaction and dissatisfaction, and each partner’s thoughts, beliefs, and attitudes; and their feelings and emotions regarding the partner and the relationship. The couple assessment interview can also provide information on cultural and family system factors and other events that might affect the couple’s functioning, treatment goals, and response to treatment. These factors might include interactions with extended family members, other relationship problems within the nuclear family (e.g., between parents and children), economic stressors, and health challenges. The initial assessment interview can also provide information on potentially important causal variables for couple distress at an individual level, such as a partner’s substance use, mood disorder, or problematic personality traits. Moreover, the clinical interview can be especially useful in identifying functional relations that may account for relationship difficulties. The functional relations of greatest interest in couple assessment are those that are relevant to problem behaviors, feelings, and relationship enhancement. Identifying functional relations allows the assessor to hypothesize about “why” a partner is unhappy or what behavioral sequences lead to angry exchanges. Clinicians are interested, for example, in finding out what triggers a couple’s arguments and what communication patterns lead to their escalation. What does one partner do, or not do, that leads the other partner to feel unappreciated or angry? In the previous section on screening and diagnosis, we discussed a brief structured interview for identifying overall relationship distress and partner aggression. Various formats for organizing and conducting more extensive assessment interviews with couples have been proposed (cf., Abbott & Snyder, 2010; Epstein & Baucom, 2002; Gottman, 1999; Karpel, 1994; L’Abate, 1994). For example, Karpel suggested a four-​part evaluation that includes an initial meeting with the couple together, followed by separate sessions with each partner individually and then an additional conjoint meeting with the couple. Abbott and Snyder recommended an extended initial assessment interview lasting approximately 2 hours in which the following goals are stated at the outset: (a) first getting

to know each partner as an individual separate from the marriage; (b) understanding the structure and organization of the marriage; (c) learning about current relationship difficulties, their development, and previous efforts to address these; and (d) reaching an informed decision together about whether to proceed with couple therapy and, if so, discussing respective expectations.
Despite many strengths of the assessment interview, a major drawback is that few of the comprehensive formats have undergone rigorous psychometric evaluation. All have face validity, but most have little empirical evidence regarding their temporal reliability, internal consistency, inter-rater agreement, content validity, convergent validity, sources of error, and generalizability across sources of individual differences such as ethnicity and age. One recent exception is the Relationship Quality Interview (RQI; Lawrence et al., 2011), a semi-structured interview designed to obtain objective ratings in various domains of couple functioning, including (a) quality of emotional intimacy, (b) quality of the couple's sexual relationship, (c) quality of support transactions, (d) the couple's ability to share power, and (e) conflict/problem-solving interactions. The RQI has demonstrated good reliability and validity in samples of both married and dating couples.
The clinical literature reflects considerable divergence on the issue of whether initial assessment of couple distress should be conducted with partners conjointly or should also include individual interviews with partners separately. Arguments for the latter include considerations of both veridicality and safety, particularly when assessing such sensitive issues as intimate partner violence (IPV), substance abuse, or sexual interactions (Haynes, Jensen, Wise, & Sherman, 1981; Whisman & Snyder, 2007). Research indicates that couples experiencing IPV often do not spontaneously disclose it in early interviews due to embarrassment, minimization, or fear of retribution (Ehrensaft & Vivian, 1996), but they do disclose when asked directly, sometimes disclosing in the interview when they did not on an IPV questionnaire (e.g., O'Leary, Vivian, & Malone, 1992). Moreover, the risk of retaliatory aggression against a partner who discloses the other's violence in a conjoint interview argues for the importance of conducting inquiries concerning IPV in individual interviews. Arguments against individual interviews when assessing couple distress emphasize potential difficulties in conjoint therapy if one partner has disclosed information to the therapist about which the other partner remains uninformed. Of particular concern are disclosures regarding IPV (Aldarondo & Straus, 1994; Rathus & Feindler, 2004) and


sexual infidelity (Snyder & Doss, 2005; Whisman & Wagers, 2005). Hence, if separate interviews are conducted with partners as a prelude to conjoint couple therapy, the interviewing clinician needs to be explicit with both partners ahead of time regarding conditions under which information disclosed by one partner will be shared with the other and also any criteria for selecting among individual, conjoint, or alternative treatment modalities.

Observational Methods

As noted previously, couple assessment offers the unique opportunity to observe partners' communication and other interpersonal exchanges directly. Like interviews and self-report methods, analogue behavioral observation (ABO) describes a method of data collection; specifically, it involves a situation designed, manipulated, or constrained by a clinician that elicits both verbal and nonverbal behaviors of interest, such as motor actions, verbalized attributions, affect, and observable facial and other behavioral reactions (Heyman & Slep, 2004). We previously identified both the RMICS and the RCISS as rapid observational methods particularly useful for initial screening and diagnosis of couple distress. Detailed descriptions and psychometric reviews of additional couple coding systems have been published previously (Heyman, 2001; Kerig & Baucom, 2004). Although these systems vary widely, in general they reflect six major a priori classes of targeted behaviors (a simplified illustration of tallying coded behaviors by class follows the list):

1. Affect (e.g., humor, affection, anger, criticism, contempt, sadness, and anxiety): Examples include the Behavioral Affective Rating System (BARS; Johnson, 2002) and the Specific Affect Coding System (SPAFF; Gottman, McCoy, Coan, & Collier, 1996; Shapiro & Gottman, 2004).
2. Behavioral engagement (e.g., demands, pressures for change, withdrawal, and avoidance): An example is the Conflict Rating System (CRS; Heavey, Christensen, & Malamuth, 1995).
3. General communication skills (e.g., involvement, verbal and nonverbal negativity and positivity, and information and problem description): Examples include the Clinician Rating of Adult Communication (CRAC; Basco, Birchler, Kalal, Talbott, & Slater, 1991), the Interactional Dimensions Coding System (IDCS; Kline et al., 2004), and the Kategoriensystem für Partnerschaftliche Interaktion (KPI; Hahlweg, 2004).
4. Problem solving (e.g., self-disclosure, validation, facilitation, and interruption): Examples include the Codebook of Marital and Family Interaction (COMFI; Notarius, Pellegrini, & Martin, 1991), the Communication Skills Test (CST; Floyd, 2004), the Dyadic Interaction Scoring Code (DISC; Filsinger, 1983), the Living in Family Environments (LIFE) coding system (Hops, Davis, & Longoria, 1995), and the Verbal Tactics Coding Scheme (VTCS; Sillars, 1982).
5. Power (e.g., verbal aggression, coercion, and attempts to control): An example is the System for Coding Interactions in Dyads (SCID; Malik & Lindahl, 2004).
6. Support/intimacy (e.g., emotional and tangible support, and attentiveness): Examples are the Social Support Interaction Coding System (SSICS; Pasch, Harris, Sullivan, & Bradbury, 2004) and the Couples' Intimate Behavior Rating System (CIBRS; Mitchell et al., 2008).
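Although these formal systems require trained coders, the data they yield reduce to a simple structure: events assigned to behavior classes, usually with the partner who displayed the behavior and, in many systems, a time stamp. The sketch below is a hypothetical, simplified illustration of that structure rather than any published system's specification; the class labels loosely mirror the six categories just listed, and the event records are invented for the example.

```python
# Hypothetical sketch of tallying coded couple-interaction events by class.
# The event format and class labels are illustrative, not a published
# coding system's specification.
from collections import Counter

# Each event: (seconds_into_task, partner, behavior_class)
events = [
    (12, "A", "affect_negative"),
    (15, "B", "behavioral_engagement"),  # e.g., a demand or pressure for change
    (47, "B", "problem_solving"),
    (63, "A", "support_intimacy"),
    (90, "A", "affect_negative"),
]

task_minutes = 10  # length of the observed discussion task

def rates_per_minute(coded_events, minutes):
    """Return events per minute for each partner-by-class combination."""
    counts = Counter((partner, behavior) for _, partner, behavior in coded_events)
    return {key: count / minutes for key, count in counts.items()}

for (partner, behavior), rate in sorted(rates_per_minute(events, task_minutes).items()):
    print(f"Partner {partner} - {behavior}: {rate:.2f} per minute")
```

Because low base-rate classes require longer observation before such rates stabilize, the length of the sampled interaction matters; this point is taken up in the discussion of psychometric characteristics below.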


TABLE 22.4  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
DAS        | A     | G                    | NA                      | G                       | A                | A                  | G                       | G                     | A                |
CSI-16     | A     | E                    | NA                      | NR                      | E                | G                  | NR                      | E                     | G                | ✓
MSI-B      | E     | E                    | NA                      | E                       | E                | G                  | G                       | E                     | G                | ✓
MSI-R      | E     | G                    | NA                      | G                       | E                | E                  | G                       | G                     | E                | ✓
RMICS      | E     | E                    | A                       | NR                      | A                | G                  | G                       | A                     | A                | ✓
GAS        | NA    | A                    | NA                      | NR                      | G                | G                  | NR                      | A                     | A                |

Note: DAS = Dyadic Adjustment Scale; CSI-16 = Couple Satisfaction Index Scale–16-item version; MSI-B = Marital Satisfaction Inventory-Brief; MSI-R = Marital Satisfaction Inventory-Revised; RMICS = Rapid Marital Interaction Coding System; GAS = Goal Attainment Scaling; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.

Psychometric characteristics for the 16 couple coding systems summarized in Tables 22.2 through 22.4 indicate


considerable variability in the extent to which information regarding reliability, validity, and treatment sensitivity for each system has been accrued. For example, only 4 of 16 coding systems report data concerning internal consistency, although this likely reflects systems’ emphasis on specific behaviors rather than broader constructs. When superordinate classes of behavior (e.g., positive or negative) are of interest, internal consistency should be evaluated by using either Cronbach’s alpha or indices derived from factor analysis (Heyman, Eddy, Weiss, & Vivian, 1995). (See also Haynes, 2001, for an overview of critical dimensions of psychometric evaluation of ABO methods.) Stable estimates of behavioral frequencies may require extended observation depending on the base rate of their occurrence—​for example, as few as 2 minutes for frequent behaviors but 30 minutes or longer for infrequent behaviors (Heyman et  al., 2001). Inter-​rater reliability for nearly all coding systems reviewed here was adequate or better following coder training, although the more comprehensive or complicated the system, the more difficult it is to obtain high inter-​rater reliability. Few studies have been conducted on the temporal stability of observed couple behaviors across tasks or settings. However, the limited evidence suggests that couples’ interactions do not vary significantly based on topic difficulty (Sanford, 2003) but are influenced by setting (e.g., home vs. clinic or research laboratory) and length of marriage (with longer married couples exhibiting more enduring patterns; Gottman & Levenson, 1999; Lord, 1999; Wieder & Weiss, 1980). Although varying in their emphasis, each of the couple coding systems reviewed here clearly assesses constructs related to communication and other domains of partner interaction relevant to relationship functioning and couple distress. Many of the coding systems can trace their origins—​directly or indirectly—​to a single source:  the Family Interaction Coding System (Patterson, Ray, Shaw, & Cobb, 1969; Reid, 1978), which was developed from naturalistic observations of family members’ behaviors in the home. Nearly all coding systems have accrued evidence of discriminative validity and relatedness to independent measures of similar constructs, and only the most recently developed systems have yet to accrue evidence of validity generalization. Pre-​and post-​treatment data for couple behavioral coding systems are limited, in part because of fewer funded clinical trials of couple therapy during the past two decades. However, the Marital Interaction Coding System (MICS) and Couples Interaction Scoring System (CISS) have evidence of treatment sensitivity; it is reasonable to infer that their quicker versions (RMICS

and RCISS) and coding systems that measure similar constructs (i.e., most of the communication-​oriented systems) would demonstrate similar levels of treatment sensitivity. Concerns have been raised about the clinical utility of ABO (e.g., Mash & Foster, 2001) because nearly all coding systems require extensive observer training to reach adequate levels of interobserver agreement. Even after observers are certified as being able to code data reliably, a great deal of energy is required to maintain reliability (e.g., weekly meetings with regular feedback on agreement). Thus, even if clinicians expended a great deal of time learning a system to the point of mastery (i.e., meeting the reliability criterion), their reliability would naturally decay without ongoing efforts to maintain agreement. Such a requirement is likely not reasonable for most clinicians. However, even if not striving to code behavioral observations in the manner required for scientific study of couple interactions, the empirically informed use of behavioral observations should be standard in clinicians’ assessment of couple distress. That is, collecting communication samples is an important part of couple clinical assessment because “communication is the common pathway to relationship dysfunction because it is the common pathway for getting what you want in relationships. Nearly all relationship-​relevant conflicts, emotions, and neuroses are played out via observable communication—​ either verbally or nonverbally” (Heyman, 2001, p. 6). If questionnaire or interview assessments suggest that an interactive task may place one or both partners in danger (e.g., if there is a history of serious physical or emotional IPV, indications of severe power or control dynamics, or threats conveyed to the assessor), ABO would be contraindicated. However, if it seems reasonable that it is safe to proceed, then the clinician should hypothesize which classes of behaviors seem most highly connected to the target problems. Wherever possible, ABOs should be video-​recorded so that the sample can be reviewed later with an eye toward a class of behaviors other than what was the assessor’s primary focus during the in vivo ABO. Furthermore, unless the clinician can rule out a plausible connection between conflict communication and the couple’s problems, we recommend that a conflict communication ABO be collected. Based on findings from observational research with couples, Heyman (2001) suggested that clinicians use behavioral observations in assessing couple distress to address the following: 1. How does the conversation start? Does the level of anger escalate? What happens when it does? Does the couple enter repetitive negative loops?


2. Do partners indicate afterward that what occurred during the conversations is typical? Is their behavior stable across two or more discussions? 3. Do partners’ behaviors differ when it is one partner’s topic versus the other’s? Do they label the other person or the communication process as the problem? 4. What other communication behaviors—​ either positive (e.g., support and empathic reflection) or negative (e.g., criticism, sneers, and turning away)—​ appear functionally related to partners’ ability to discuss relationship issues effectively? Self-​and Other-​Report Methods The rationale underlying self-​report methods in couple assessment is that such methods (a)  are convenient and relatively easy to administer; (b) are capable of generating a wealth of information across a broad range of domains and levels of functioning germane to clinical assessment or research objectives, including those listed in Table 22.1; (c) lend themselves to collection of data from large normative samples that can serve as a reference for interpreting data from individual respondents; (d) allow disclosure about events and subjective experiences that respondents may be reluctant to discuss with an interviewer or in the presence of their partner; and (e) can provide important data concerning internal phenomena opaque to observational approaches, including thoughts and feelings, values and attitudes, expectations and attributions, and satisfaction and commitment. However, the limitations of traditional self-​report measures also bear noting. Specifically, data from self-​report instruments can (a) reflect bias in self-​and other-​reporting in either a favorable or an unfavorable direction, (b)  be affected by differences in stimulus interpretation and errors in recollection of specific events, (c) inadvertently influence respondents’ non-​test behavior in unintended ways (e.g., by sensitizing respondents and increasing their reactivity to specific issues), and (d)  typically provide few fine-​grained details concerning moment-​to-​moment interactions compared with ABOs. We describe here, and summarize in Table 22.3, a small subset of self-​report instruments selected on the basis of their potential clinical utility and at least moderate evidence of reliability and validity. In some domains (e.g., relationship cognitions and affect), well-​ validated measures are few. Additional measures identified in previous reviews (Epstein & Baucom, 2002; Sayers & Sarwer, 1998; Snyder, Heyman, Haynes,


Carlson, & Balderrama-​Durbin, in press) or in comprehensive bibliographies of self-​report couple and family measures (e.g., Corcoran & Fischer, 2000; Davis, Yarber, Bauserman, Schreer, & Davis, 1998; Hamilton & Carr, 2016; L’Abate & Bagarozzi, 1993; Touliatos, Perlmutter, Straus, & Holden, 2001)  may be considered as additional clinical resources; however, the data they generate should generally be regarded as similar to data generated from other self-​reports derived from interview—​namely as subject to various potential biases of observation, recollection, interpretation, and motivations to present oneself or one’s partner in a favorable or unfavorable light. A variety of self-​report measures have been developed to assess couples’ behavioral exchanges including communication, verbal and physical aggression, and physical intimacy. The FAPBI (Doss & Christensen, 2006) assesses 20 positive and negative behaviors in four domains (affection, closeness, demands, and relationship violations) and possesses excellent psychometric characteristics. As a clinical tool, the FAPBI has the potential to delineate relative strengths and weaknesses in the relationship—​ transforming diffuse negative complaints into specific requests for positive change. Among self-​ report measures specifically targeting partners’ communication, two that have demonstrated good reliability and validity are the CPQ (Christensen, 1987)  and the Marital Communication Inventory (Bienvenu, 1970). The CPQ was designed to measure the temporal sequence of couples’ interactions by soliciting partners’ perceptions of their communication patterns before, during, and following conflict. Scores on the CPQ can be used to assess characteristics of the demand ↔ withdraw pattern frequently observed among distressed couples. A more recent measure assessing both communication and coping, the Dyadic Coping Inventory (DCI; Bodenmann, 2008; Randall, Hilpert, Jimenez-​ Arista, Walsh, & Bodenmann, 2016), contains 37 items assessing (a) one’s own coping, (b) one’s perception of one’s partner’s stress communication, (c) supportive dyadic coping, and (d)  negative dyadic coping, in close relationships when one or both partners are stressed. Assessing relationship aggression by self-​report measures assumes particular importance because of some individuals’ reluctance to disclose the nature or extent of such aggression during an initial conjoint interview. By far the most widely used measure of couples’ aggression is the CTS2 (Straus et al., 1996), assessing various modes of conflict resolution (reasoning, verbal aggression, and physical aggression), as well as levels of sexual


coercion and physical injury. A measure of psychological aggression demonstrating strong psychometric properties and gaining increasing support is the Multidimensional Measure of Emotional Abuse (MMEA; Murphy & Hoover, 1999). An additional measure of relationship aggression, the Aggression (AGG) scale of the MSI-​R (Snyder, 1997), comprises 10 items reflecting psychological and physical aggression experienced from one’s partner. Advantages of the AGG scale as a screening measure include its relative brevity and its inclusion in a multidimensional measure of couples’ relationships (the MSI-​R) described later. Previously, we noted the importance of evaluating partners’ attributions for relationship events. The RAM (Fincham & Bradbury, 1992)  presents hypothetical situations and asks respondents to generate responsibility attributions indicating the extent to which the partner intentionally behaved negatively, was selfishly motivated, and was blameworthy for the event. Both causal and responsibility attributions assessed by the RAM have evidence of good internal consistency and test–​retest reliability, as well as convergence with partners’ self-​reported overall relationship satisfaction and observed affect. For purposes of case conceptualization and treatment planning, well-​constructed multidimensional measures of couple functioning are useful for discriminating among various sources of relationship strength, conflict, satisfaction, and goals. Widely used in both clinical and research settings is the MSI-​R (Snyder, 1997), a 150-​item inventory designed to identify both the nature and the intensity of relationship distress in distinct areas of interaction. The MSI-​R includes two validity scales, one global scale, and 10 specific scales assessing relationship satisfaction in such areas as affective and problem-​ solving communication, aggression, leisure time together, finances, the sexual relationship, role orientation, family of origin, and interactions regarding children. More than 30 years of research has supported the reliability and construct validity of the MSI-​R scale scores (Snyder et  al., 2004). The instrument boasts a large representative national sample, evidence of good internal consistency and test–​retest reliability, and evidence of excellent sensitivity to treatment change. The Global Distress Subscale (GDS) of the MSI-​R has been shown to predict couples’ likelihood of divorce 4  years following therapy (Snyder, 1997). A  validation study using a national sample of 60 marital therapists supported the overall accuracy and clinical utility of the

computerized interpretive report for this instrument (Hoover & Snyder, 1991). Recent studies suggest the potential utility of Spanish, German, Italian, French, Chinese, Korean, and Arabic adaptations of the MSI-​R for cross-​cultural application with both clinic and community couples (Antonelli, Dettore, Lasagni, Snyder, & Balderrama-​Durbin, 2014; Balderrama-​Durbin, Snyder, & Semmar, 2011; Brodard et al., 2015; Gasbarrini et al., 2015; Kwon & Choi, 1999; Lou, Lin, Chen, Balderrama-​ Durbin, & Snyder, 2016; Reig-​Ferrer, Cepeda-​Benito, & Snyder, 2004), as well as use of the original English version with nontraditional (e.g., gay and lesbian) couples (Means-​Christensen, Snyder, & Negy, 2003). Additional multidimensional measures obtaining fairly widespread use are the PREPARE and ENRICH inventories (Fowers & Olson, 1989, 1992; Olson & Olson, 1999), developed for use with premarital and married couples, respectively. Both of these measures include 165 items in 20 domains reflecting personality (e.g., assertiveness and self-​ confidence), intrapersonal issues (e.g., marriage expectations and spiritual beliefs), interpersonal issues (e.g., communication and closeness), and external issues (e.g., family and friends). A  computerized interpretive report identifies areas of “strength” and “potential growth” and directs respondents to specific items reflecting potential concerns. The ENRICH inventory has a good normative sample and has ample evidence supporting both the reliability and the validity of scores on its subscales. Overall Evaluation Couples presenting for therapy vary widely in both the content and the underlying causes of their individual and relationship problems, as well as their treatment goals. Conceptualizing partners’ distress and planning effective treatment require careful assessment of behavioral, cognitive, and affective components of relationship functioning conducted across multiple modalities and using multiple methods, including interview, ABO, and self-​report measures. Effective intervention depends on assimilating assessment findings within an overarching theoretical framework linking individual and relationship difficulties to presumed etiologies as well as to clinical intervention. Toward this end, assessment of couple distress requires going beyond nomothetic conclusions derived from standardized measures of relationship functioning to integrate idiographic findings from multiple sources and methods in a functional analytic approach (Haynes et al., 2009).

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

In principle, assessment strategies relevant to case conceptualization and treatment planning are also germane to monitoring treatment progress and evaluating outcome. It would be difficult to imagine adequate assessment of partners' changes in individual and relationship functioning not including clinical inquiry about alterations in behavioral, cognitive, and affective domains outside of treatment sessions; repeated ABOs to track the acquisition and use of targeted communication skills; and integration of self-report measures profiling changes across diverse domains and providing information in sensitive areas. Several caveats moderate this general conclusion. First, the use of repeated assessments to evaluate changes attributable to treatment requires that measures demonstrate temporal reliability in the absence of clinical intervention. Although obvious as a precondition for interpreting change, information regarding the temporal reliability of couple-based assessment techniques is remarkably sparse. Second, treatment effects are best assessed by using measures both relevant and specific to aspects of individual and relationship functioning targeted by clinical interventions. Finally, treatment monitoring across sessions imposes pragmatic constraints on measures' length, thus suggesting enhanced utility for measures that have evidence of score reliability and validity and are distinguished by their brevity (e.g., the MSI-B or CSI-16 as a measure of global affect or the FAPBI to assess more specific dyadic behaviors). Table 22.4 provides ratings on several relevant instruments.

Changes in individualized treatment goals can be quantified using goal attainment scaling (GAS; Kiresuk, Smith, & Cardillo, 1994) as described previously for use in couple therapy by Whisman and Snyder (1997). When adopting the GAS method, the issues that will be the focus of treatment are first identified, and then each problem is translated into one or more goals. The expected level of outcome is then specified for each goal, along with the "somewhat more" and "much more" than expected levels of outcome, as well as the "somewhat less" and "much less" than expected levels. Each level of outcome is assigned a value on a 5-point measurement scale ranging from –2 for much less than expected level of outcome to +2 for much more than expected level of outcome. Levels of outcome can then be rated during or following treatment, and the ratings across goals can be averaged to provide a summary score for evaluating the degree to which treatment helped the couple attain their own individualized goals.
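As a concrete illustration of the scoring arithmetic just described, the sketch below averages attainment ratings on the –2 to +2 scale across a couple's goals. It is a minimal example rather than the full GAS procedure, which also governs how goals and expected outcome levels are specified; the goal labels shown are hypothetical.

```python
# Minimal sketch of summarizing goal attainment scaling (GAS) ratings.
# Ratings use the 5-point scale described in the text:
#   -2 = much less than expected ... 0 = expected ... +2 = much more than expected.
# Goal labels are hypothetical examples, not drawn from the chapter.

def gas_summary(ratings: dict) -> float:
    """Average the -2..+2 attainment ratings across a couple's treatment goals."""
    if not ratings:
        raise ValueError("At least one rated goal is required.")
    for goal, rating in ratings.items():
        if rating not in (-2, -1, 0, 1, 2):
            raise ValueError(f"Rating for '{goal}' must be an integer from -2 to +2.")
    return sum(ratings.values()) / len(ratings)

end_of_treatment = {
    "reduce escalation during conflict discussions": 1,   # somewhat more than expected
    "increase shared leisure time": 0,                    # expected level of outcome
    "resume collaborative financial planning": -1,        # somewhat less than expected
}

print(f"GAS summary score: {gas_summary(end_of_treatment):+.2f}")
```

Averaging in this way weights all goals equally; formal GAS applications sometimes weight goals or convert the ratings to a standardized summary score, but the description above corresponds to the simple average computed here.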


Overall Evaluation

Gains or deterioration in individual and relationship functioning should be evaluated using techniques sensitive and specific to treatment effects across assessment modalities incorporating interview, behavioral observation, and self-report methods. Conclusions drawn from nomothetic approaches (e.g., the DAS or MSI-R) should be complemented by idiographic methods, ideally incorporating observational assessment as well as GAS or similar procedures.

CONCLUSIONS AND FUTURE DIRECTIONS

Recommendations for Assessing Couple Distress Assessment strategies and specific methods for assessing couple distress will necessarily be tailored to partners’ unique constellation of presenting difficulties, as well as specific resources of both the couple and the clinician. However, regardless of the specific context, the following recommendations for assessing couple distress will apply: 1. Given empirical findings linking couple distress to individual disorders and their respective impact in moderating treatment outcome, assessment of couple functioning should be standard practice when treating individuals. Screening for couple distress when assessing individuals may involve a brief interview format shown to relate to relevant indicators of couple interactions (e.g., the SDI-​MD-​PA or the RQI) or a brief self-​report measure that has exhibited prior evidence of discriminative validity (e.g., the MSI-​B or the CSI-​4). Similarly, when treating couples, partners should be screened for individual emotional or behavioral difficulties that may contribute to, exacerbate, or partially result from couple distress. 2. Assessment foci should progress from broad to narrow—​first identifying relationship concerns at the broader construct level and then examining more specific facets of couple distress and its correlates using a finer-​grained analysis. The specific assessment methods described in this review vary considerably in their overall breadth or focus within any specific construct domain and, hence, will vary both in their applicability across couples and in


their placement in a sequential exploratory assessment process. 3. Within clinical settings, certain domains should always be assessed with every couple either because of their robust linkage to relationship difficulties (e.g., communication processes that involve emotional expressiveness, problem discussions, positive exchanges, and decision-​making) or because the specific behaviors, if present, have particularly adverse impact on couple functioning (e.g., physical aggression or substance abuse). 4. Couple assessment should integrate findings across multiple assessment methods and domains. Self-​ and other-​report measures can complement findings from interview or behavioral observation and generate data across diverse domains both centrally or conceptually related to the couple’s difficulties and treatment goals, or across those domains potentially more challenging to assess because of their sensitive nature or their not being amenable to direct observation. However, caution should be exercised when adopting self-​or other-​report measures in the assessment of couple distress. Despite their proliferation, most measures of couple functioning described in the literature have not undergone careful psychometric evaluation. Among those instruments for which some evidence concerning reliability and validity has been garnered, evidence often exists only for overall scores and not at the level of subscales or smaller units of analysis at which interpretations may be made. 5. At the same time, assessment of couple distress should be parsimonious. This objective can be facilitated by choosing evaluation strategies and modalities that complement each other and by following a sequential approach that uses increasingly narrowband measures to target problem areas that have been identified by other assessment techniques. 6. Psychometric characteristics of any assessment measure—​whether from interview, ABO, or self-​ report method—​are conditional upon the specific population and purpose for which that assessment method was developed. Given that nearly all measures of couple distress were developed and tested on White, middle-​ class, heterosexual married couples, their relevance to and utility for assessing ethnic minority couples, gay and lesbian couples, older couples, and low-​income couples is unknown. This caveat extends to content-​as well as criterion-​related validity. Hence, any assessment

measure demonstrating evidence of validity with some couples may be less valid, in part or in whole, for any given couple, thus further underscoring the importance of drawing upon multiple indicators across multiple methods for assessing any specific construct. 7. Given the dynamic and conditional nature of couple distress and its causes, assessment should be ongoing throughout the therapy process. Recommendations for Further Research Future directions for assessment research germane to the field generally also apply to research in assessing couple distress specifically, including the need for greater attention to (a)  psychometric characteristics of measures; (b) factors moderating reliability and validity across populations differing in sociocultural characteristics as well as in clinical functioning; (c)  the assessment process, including initial articulation of assessment goals, selection of assessment method and instruments, and methods of interpreting data and providing feedback; and (d) the functional utility of assessment findings in enhancing treatment effectiveness (Hayes, Nelson, & Jarrett, 1987). In considering the implications of these directives for the assessment of couple distress, considerably more research is needed before a comprehensive, empirically based couple assessment protocol can be advocated. For example, despite the ubiquitous use of couple assessment interviews, scant research has been conducted to assess their psychometric features. Observational methods, although a rich resource for generating and testing clinical hypotheses, are less frequently used in clinical settings and present significant challenges to their reliable and valid application in everyday practice. Questionnaires—​ despite their ease of administration and potential utility in generating a wealth of data—​frequently suffer from inadequate empirical development and, at best, comprise only part of a multimethod assessment strategy. We recommend, as a research roadmap, that clinical researchers consider adapting the Institute of Medicine stages of intervention research cycle (Mrazek & Haggerty, 1994). Stage 1 involves identifying the disorder and measuring its prevalence. Despite being so basic a need, there currently exists no gold standard for discriminating distressed from nondistressed couples; the questionnaires most frequently used for such classifications are of limited sensitivity and specificity (Heyman et al., 2001). Stage 2 involves delineating specific risk and protective factors. As noted previously, some replicated factors have been identified, although


this research could be sharpened by defining groups more carefully (via Stage 1). Stage 3 (efficacy trials) would involve tightly controlled trials of the efficacy of a multimethod assessment in clinical practice. Stage 4 (effectiveness trials) would involve controlled trials of the outcome of this assessment in more real-​world clinical environments. Only then would testing broad-​scale dissemination (Stage 5) of empirically based couple assessment be appropriate. This research roadmap reflects an ambitious agenda unlikely to be met by any single investigator or group of investigators. However, progress toward evidence-​based assessment of couple distress will be enhanced by research on specific components targeting more notable gaps in the empirical literature along the lines recommended here: 1. Greater attention should be given to expanding the empirical support for promising assessment instruments already detailed in the literature than to the initial (and frequently truncated) development of new measures. Proposals for new measures should be accompanied by compelling evidence for their incremental utility and validity and a commitment to programmatic research to examine their generalizability across diverse populations and assessment contexts. 2. Research needs to delineate optimal structured and semi-​ structured interview formats for assessing couples. Such research should address (a)  issues of content validity across populations and settings, (b)  organizational strategies for screening across diverse system levels and construct domains relevant to couple functioning (similar to branching strategies for the Structured Clinical Interview for the DSM [First et al., 1997] and related structured interviews for individual disorders), (c) relative strengths and limitations to assessing partners separately versus conjointly, (d)  factors promoting the disclosure and accuracy of verbal reports, (e) relation of interview findings to complementary assessment methods (as in generating relevant tasks for ABO), and (f) the interview’s special role in deriving functional analytic case conceptualization. 3. Although laboratory-​based behavioral observation of couple interaction has considerably advanced our understanding of couple distress, generalization of these techniques to more common clinical settings has lagged behind. Hence, researchers should develop more clinically useful methods of observation and macro-​ level coding systems for quantifying observational data that promote their routine adoption in clinical contexts while preserving their psychometric fidelity.


4. Research needs to attend to the influences of culture at several levels. First, there has been little attention to developing measures directly assessing domains specific to relationship functioning at the community or cultural level (e.g., cultural standards or norms regarding emotional expressiveness, balance of decision-​making influence, or boundaries governing the interaction of partners with extended family or others in the community). Hence, assessment of such constructs currently depends almost exclusively on the clinical interview, with no clear guidelines regarding either the content or the format of questions. Second, considerably more research needs to examine the moderating effects of sociocultural factors on measures of couple functioning, including the impact of such factors as ethnicity, age, socioeconomic status, or sexual orientation. Third, work needs to proceed on adapting established measures to alternative languages. In the United States, the failure to adapt existing instruments to Spanish or to examine the psychometric characteristics of extant adaptations is particularly striking given that (a) Hispanics are among the largest and fastest-​growing ethnic minority group and (b) among U.S. Hispanic adults aged 18 to 64 years, 28% have either limited or no ability to speak English (Snyder et al., 2004). Adapting existing measures to alternative contexts (i.e., differing from the original development sample in language, culture, or specific aspects of the relationship such as sexual orientation) should proceed only when theoretical or clinical formulations suggest that the construct being measured does not differ substantially across the new application. Detailed discussions of both conceptual and methodological issues relevant to adapting tests to alternative languages or culture exist elsewhere (e.g., Butcher, 1996; Geisinger, 1994; Haynes et  al., 2016; Van Widenfelt, Treffers, de Beurs, Siebelink, & Koudijs, 2005). Because clinicians and researchers may fail to recognize the inherent cultural biases of their conceptualization of couple processes, the appropriateness of using or adapting tests cross-​culturally should be evaluated following careful empirical scrutiny examining each of the following: • Linguistic equivalence including grammatical, lexical, and idiomatic considerations • Psychological equivalence of items across the source and target cultures


• Functional equivalence indicating the congruence of external correlates in concurrent and predictive criterion-​related validation studies of the measure across applications • Scalar equivalence ensuring not only that the slope of regression lines delineating test–​ criterion relations be parallel (indicating functional equivalence) but also that they have comparable metrics and origins (zero points) in both cultures Finally, research needs to examine the process, as well as the content, of couple assessment. For example, little is known regarding the impact of decisions about the timing or sequence of specific assessment methods, the role of the couple in determining assessment objectives, or the provision of clinical feedback on either the content of assessment findings or their subsequent effect on clinical interventions. Recent studies suggest that systematic monitoring and feedback in couple therapy may enhance treatment outcomes (Halford et  al., 2012; Pepping, Halford, & Doss, 2015). Similarly, additional studies are needed to examine the psychometric equivalence of Internet administration of paper-​and-​pencil questionnaires used in couple research (Brock, Barry, Lawrence, Dey, & Rolffs, 2012). Although assessment of couples has shown dramatic gains in both its conceptual and empirical underpinnings during the past 35 years, much more remains to be discovered. Both clinicians and researchers need to avail themselves of recent advances in assessing couple distress and collaborate in promoting further development of empirically based assessment methods. ACKNOWLEDGMENTS

Portions of this chapter were adapted from Snyder, Heyman, and Haynes (2005). Richard Heyman's work on this chapter was supported by the National Institute of Dental and Craniofacial Research grant UH2DE025980. The authors express their appreciation to Brian Abbott, Danielle Mitnick, and Dawn Yoshioka for their contributions to Tables 22.1 through 22.4.

References

Abbott, B. V., & Snyder, D. K. (2010). Couple distress. In M. M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (2nd ed., pp. 439–476). New York, NY: Guilford.

Aldarondo, E., & Straus, M. (1994). Screening for physical violence in couple therapy:  Methodological, practical, and ethical considerations. Family Process, 33, 425–​439. Allgood, S. M., & Crane, D. R. (1991). Predicting marital therapy dropouts. Journal of Marital and Family Therapy, 17, 73–​79. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorder (5th ed.). Arlington, VA: American Psychiatric Publishing. Antonelli, P., Dettore, D., Lasagni, I., Snyder, D. K., & Balderrama-​Durbin, C. (2014). Gay and lesbian couples in Italy:  Comparisons with heterosexual couples. Family Process, 53, 702–​716. Babor, T. F., Higgins-​ Biddle, J. C., Saunders, J. B., & Monteiro, M. G. (2001). AUDIT:  The Alcohol Use Disorders Identification Test:  Guidelines for use in primary care (2nd ed.). Geneva, Switzerland:  World Health Organization. Balderrama-​Durbin, C., Snyder, D. K., & Balsis, S. (2015). Tailoring assessment of relationship distress using the Marital Satisfaction Inventory-​Brief form. Couple and Family Psychology: Research and Practice, 4, 127–​135. Balderrama-​Durbin, C., Snyder, D. K., & Semmar, Y. (2011). Assessing Arabic couples: An evidence-​based approach. Family Science, 2, 24–​33. Baltas, Z., & Steptoe, A. (2000). Migration, culture conflict and psychological well-​ being among Turkish–​ British married couples. Ethnicity & Health, 5, 173–​180. Basco, M. R., Birchler, G. R., Kalal, B., Talbott, R., & Slater, A. (1991). The Clinician Rating of Adult Communication (CRAC): A clinician’s guide to the assessment of interpersonal communication skill. Journal of Clinical Psychology, 47, 368–​380. Baucom, D. H., Epstein, N., Kirby, J. S., & LaTaillade, J. J. (2015). Cognitive–​behavioral couple therapy. In A. S. Gurman, J. L. Lebow, & D. K. Snyder (Eds.), Clinical handbook of couple therapy (5th ed., pp. 23–​ 60). New York, NY: Guilford. Beck, A. T., Epstein, N., Brown, G., & Steer, R. A. (1988). An inventory for measuring clinical anxiety:  Psychometric properties. Journal of Consulting and Clinical Psychology, 56, 893–​897. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for the Beck Depression Inventory-​ II. San Antonio, TX: Psychological Corporation. Bernet, W., Wamboldt, M. Z., & Narrow, W. E. (2016). Child affected by parental relationship distress. Journal of the American Academy of Child and Adolescent Psychiatry, 55, 571–​579. Bhugra, D., & De Silva, P. (2000). Couple therapy across cultures. Sexual & Relationship Therapy, 15, 183–​192. Bienvenu, M. J. (1970). Measurement of marital communication. The Family Coordinator, 19, 26–​31.


Bodenmann, G. (2008). Dyadisches Coping Inventar: Testmanual [Dyadic Coping Inventory:  Test manual]. Bern, Switzerland: Huber. Bradbury, T. N., & Fincham, F. D. (1990). Attributions in marriage:  Review and critique. Psychological Bulletin, 107, 3–​33. Bradbury, T. N., Rogge, R., & Lawrence, E. (2001). Reconsidering the role of conflict in marriage. In A. Booth, A. C. Crouter, & M. Clements (Eds.), Couples in conflict (pp. 59–​81). Mahwah, NJ: Erlbaum. Brock, R. L., Barry, R. A., Lawrence, E., Dey, J., & Rolffs, J. (2012). Internet administration of paper-​and-​pencil questionnaires used in couple research:  Assessing psychometric equivalence. Assessment, 19, 226–​242. Brodard, F., Charvoz, L., Antonietti, J. P., Rossier, J., Bodenmann, G., & Snyder, D. K. (2015). Validation of the French version of the Marital Satisfaction Inventory [in French]. Canadian Journal of Behavioural Science, 47, 113–​122. Butcher, J. N. (1996). Translation and adaptation of the MMPI-​2 for international use. In J. N. Butcher (Ed.), International adaptations of the MMPI-​ 2:  A handbook of research and clinical applications (pp. 26–​43). Minneapolis, MN: University of Minnesota Press. Christensen, A. (1987). Detection of conflict patterns in couples. In K. Hahlweg & M. J. Goldstein (Eds.), Understanding major mental disorder: The contribution of family interaction research (pp. 250–​265). New York, NY: Family Process Press. Christensen, A., Atkins, D. C., Berns, S., Wheeler, J., Baucom, D. H., & Simpson, L. E. (2004). Traditional versus integrative behavioral couple therapy for significantly and chronically distressed married couples. Journal of Consulting and Clinical Psychology, 72, 176–​191. Christensen, A., & Heavey, C. L. (1990). Gender and social structure in the demand/​withdraw pattern of marital conflict. Journal of Personality and Social Psychology, 59, 73–​81. Cook, J., Tyson, R., White, J., Rushe, R., Gottman, J. M., & Murray, J. (1995). The mathematics of marital conflict:  Qualitative dynamic mathematical modeling of marital interaction. Journal of Family Psychology, 9, 110–​130. Corcoran, K., & Fischer, J. (2000). Measures for clinical practice:  A sourcebook:  Vol. 1.  Couples, families, and children. New York, NY: Free Press. Crane, D. R., Busby, D. M., & Larson, J. H. (1991). A factor analysis of the Dyadic Adjustment Scale with distressed and nondistressed couples. American Journal of Family Therapy, 19, 60–​66. Crenshaw, A. O, Christensen, A., Baucom, D. H., Epstein, N. B., & Baucom, B. R. (2017). Revised scoring and improved reliability for the Communication Patterns Questionnaire. Psychological Assessment, 29, 913–​925.


Cummings, E. M., & Davies, P. T. (2010). Marital conflict and children:  An emotional security perspective. New York, NY: Guilford. Cutrona, C. (1996). Social support in couples:  Marriage as a resource in times of stress. Thousand Oaks, CA: Sage. Dalgleish, T. L., Johnson, S. M., Burgess Moser, M., Lafontaine, M. F., Wiebe, S. A., & Tasca, G. A. (2015). Predicting change in marital satisfaction throughout emotionally focused couple therapy. Journal of Marital and Family Therapy, 41, 276–​291. Davis, C. M., Yarber, W. L., Bauserman, R., Schreer, G., & Davis, S. L. (1998). Handbook of sexuality-​related measures. Thousand Oaks, CA: Sage. Derogatis, L. R., & Savitz, K. L. (1999). The SCL-​90-​R, Brief Symptom Inventory, and matching clinical rating scales. In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 679–​724). Mahwah, NJ: Erlbaum. Diener, E., Gohm, C. L., Suh, E., & Oishi, S. (2000). Similarity of the relations between marital status and subjective well-​being across cultures. Journal of Cross-​ Cultural Psychology, 31, 419–​436. Doss, B. D., & Christensen, A. (2006). Acceptance in romantic relationships:  The frequency and acceptability of partner behavior inventory. Psychological Assessment, 18, 289–​302. Ehrensaft, M., & Vivian, D. (1996). Spouses’ reasons for not reporting existing physical aggression as a marital problem. Journal of Family Psychology, 10, 443–​453. Epstein, N. B., & Baucom, D. H. (2002). Enhanced cognitive–​ behavioral therapy for couples:  A contextual approach. Washington, DC: American Psychological Association. Filsinger, E. E. (1983). A machine-​aided marital observation technique: The Dyadic Interaction Scoring Code. Journal of Marriage and the Family, 45, 623–​632. Fincham, F. D., & Bradbury, T. N. (1987). The assessment of marital quality: A reevaluation. Journal of Marriage and the Family, 49, 797–​809. Fincham, F. D., & Bradbury, T. N. (1992). Assessing attributions in marriage: The relationship attribution measure. Journal of Personality and Social Psychology, 62, 457–​468. First, M. B., Gibbon, M., Spitzer, R. L., & Williams, J. B.  W. (1997). Structured clinical interview for DSM-​ IV Axis I  disorders–​ Clinician Version. Washington, DC: American Psychiatric Association. Fischer, M. S., Baucom, D. H., & Cohen, M. J. (2016). Cognitive–​behavioral couple therapies:  Review of the evidence for the treatment of relationship distress, psychopathology, and chronic health conditions. Family Process, 55, 423–​442. Fleeson, W. (2004). The quality of American life at the end of the century. In O. G. Brim, C. D. Ryff, & R. C. Kessler (Eds.), How healthy are we: A national study of well-​being


at midlife (pp. 252–​ 272). Chicago, IL:  University of Chicago Press. Floyd, F. J. (2004). Communication Skills Test (CST):  Observational system for couples’ problem-​ solving skills. In P. K. Kerig & D. H. Baucom (Eds.), Couple observational coding systems (pp. 143–​ 158). Mahwah, NJ: Erlbaum. Fowers, B., & Olson, D. (1989). ENRICH marital inventory:  A discriminant validity study. Journal of Marital and Family Therapy, 15, 65–​79. Fowers, B., & Olson, D. (1992). Four types of pre-​marital couples:  An empirical typology based on PREPARE. Journal of Family Psychology, 6, 10–​12. Funk, J., & Rogge, R. (2007). Testing the ruler with item response theory:  Increasing precision of measurement for relationship satisfaction with the Couples Satisfaction Index. Journal of Family Psychology, 21, 572–​583. Gasbarrini, M. F., Snyder, D. K., Iafrate, R., Bertoni, A., Donato, S., & Margola, D. (2015). Investigating the relation between stress and marital satisfaction:  The moderating effects of dyadic coping and communication. Family Science, 6, 143–​149. Geisinger, K. F. (1994). Cross-​ cultural normative assessment:  Translation and adaptation issues influencing the normative interpretation of assessment instruments. Psychological Assessment, 6, 304–​312. Gohm, C. L., Oishi, S., Darlington, J., & Diener, E. (1998). Culture, parental conflict, parental marital status, and the subjective well-​being of young adults. Journal of Marriage and the Family, 60, 319–​334. Gordis, E. B., Margolin, G., & John, R. S. (2001). Parents’ hostility in dyadic marital and triadic family settings and children’s behavior problems. Journal of Consulting and Clinical Psychology, 69, 727–​734. Gottman, J. M. (1994). What predicts divorce? The relationship between marital processes and marital outcomes. Hillsdale, NJ: Erlbaum. Gottman, J. M. (1999). The marriage clinic: A scientifically-​ based marital therapy. New York, NY: Norton. Gottman, J. M., & Levenson, R. W. (1999). How stable is marital interaction over time? Family Process, 38, 159–​165. Gottman, J. M., McCoy, K., Coan, J., & Collier, H. (1996). The Specific Affect Coding System (SPAFF). In J. M. Gottman (Ed.), What predicts divorce? The measures (pp. 1–​169). Hillsdale, NJ: Erlbaum. Hahlweg, K. (2004). Kategoriensystem für Partnerschaftliche Interaktion (KPI) Interactional Coding System (ICS). In P. K. Kerig & D. H. Baucom (Eds.), Couple observational coding systems (pp. 127–​ 142). Mahwah, NJ: Erlbaum. Halford, W. K., Hayes, S., Christensen, A., Lambert, M., Baucom, D. H., & Atkins, D. C. (2012). Toward making

progress feedback an effective common factor in couple therapy. Behavior Therapy, 43, 49–​60. Hamilton, E., & Carr, A. (2016). Systematic review of self-​ report family assessment measures. Family Process, 55, 16–​30. Hawkins, M. W., Carrère, S., & Gottman, J. M. (2002). Marital sentiment override:  Does it influence couples’ perceptions? Journal of Marriage and Family, 64, 193–​201. Hayes, S. C., Nelson, R. O., & Jarrett, R. B. (1987). The treatment utility of assessment:  A functional approach to evaluating assessment quality. American Psychologist, 42, 963–​974. Haynes, S. N. (2001). Clinical applications of analogue behavioral observation:  Dimensions of psychometric evaluation. Psychological Assessment, 13, 73–​85. Haynes, S. N., Floyd, F. J., Lemsky, C., Rogers, E., Winemiller, D., Heilman, N., . . . Cardone, L. (1992). The Marital Satisfaction Questionnaire for older persons. Psychological Assessment, 4, 473–​482. Haynes, S. N., Jensen, B., Wise, E., & Sherman, D. (1981). The marital intake interview: A multi-​method criterion validity evaluation. Journal of Consulting and Clinical Psychology, 49, 379–​387. Haynes, S. N., Kaholokula, J. K., & Tanaka-​ Matsumi, J. (in press). Psychometric foundations of psychological assessment with diverse cultures:  What are the concepts, methods, and evidence? In C. Frisby & W. O’Donohue (Eds.), Cultural competence in applied psychology:  Theory, science, practice, and evaluation. New York, NY: Springer. Haynes, S. N., Mumma, G. H., & Pinson, C. (2009). Idiographic assessment: Conceptual and psychometric foundations of individualized behavioral assessment. Clinical Psychology Review, 29, 179–​191. Haynes, S. N., O’Brien, W. H., & Kaholokula, K. (2011). Behavioral assessment and case formulation. Hoboken, NJ: Wiley. Heavey, C. L., Christensen, A., & Malamuth, N. M. (1995). The longitudinal impact of demand and withdrawal during marital conflict. Journal of Consulting and Clinical Psychology, 63, 797–​801. Hetherington, E. M., Bridges, M., & Insabella, G. M. (1998). What matters? What does not? Five perspectives on the association between marital transitions and children’s adjustment. American Psychologist, 53, 167–​184. Heyman, R. E. (2001). Observation of couple conflicts: Clinical assessment applications, stubborn truths, and shaky foundations. Psychological Assessment, 13, 5–​35. Heyman, R. E. (2004). Rapid Marital Interaction Coding System (RMICS). In P. K. Kerig & D. H. Baucom (Eds.), Couple observational coding systems (pp. 67–​94). Mahwah, NJ: Erlbaum.


Heyman, R. E., Eddy, J. M., Weiss, R. L., & Vivian, D. (1995). Factor analysis of the Marital Interaction Coding System (MICS). Journal of Family Psychology, 9, 209–​215. Heyman, R. E., Feldbau-​ Kohn, S. R., Ehrensaft, M. K., Langhinrichsen-​Rohling, J., & O’Leary, K. D. (2001). Can questionnaire reports correctly classify relationship distress and partner physical abuse? Journal of Family Psychology, 15, 334–​346. Heyman, R. E., & Slep, A. M.  S. (2004). Analogue behavioral observation. In E. M. Heiby & S. N. Haynes (Eds.), Comprehensive handbook of psychological assessment:  Vol. 3.  Behavioral assessment (pp. 162–​ 180). New York: Wiley. Heyman, R. E., & Slep, A. M.  S. (in press). Relational diagnoses and beyond. In B. Friese (Ed.), APA handbook of contemporary family psychology. Washington, DC: American Psychological Association Press. Heyman, R. E., Slep, A. M.  S., & Foran, H. M. (2015). Enhanced definitions of intimate partner violence for DSM-​5 and ICD-​11 may promote improved screening and treatment. Family Process, 54, 64–​81. Heyman, R. E., Slep, A. M.  S., Snarr, J. D., & Foran, H. M. (2013). Practical tools for assessing partner maltreatment in clinical practice and public health settings. In H. F. Foran, S. R.  H. Beach, A. M.  S. Slep, R. E. Heyman, & M. Z. Wamboldt (Eds.), Family problems and family violence: Reliable assessment and the ICD-​11 (pp. 43–​70). New York, NY: Springer. Hoover, D. W., & Snyder, D. K. (1991). Validity of the computerized interpretive report for the Marital Satisfaction Inventory:  A customer satisfaction study. Psychological Assessment, 3, 213–​217. Hops, H., Davis, B., & Longoria, N. (1995). Methodo-​logical issues in direct observation: Illustrations with the Living in Family Environments (LIFE) coding system. Journal of Clinical Child Psychology, 24, 193–​203. Hunsley, J., Best, M., Lefebvre, M., & Vito, D. (2001). The seven-​item short form of the Dyadic Adjustment Scale: Further evidence for construct validity. American Journal of Family Therapy, 29, 325–​335. Johnson, M. D. (2002). The observation of specific affect in marital interactions: Psychometric properties of a coding system and a rating system. Psychological Assessment, 14, 423–​438. Jones, A. C., & Chao, C. M. (1997). Racial, ethnic and cultural issues in couples therapy. In W. K. Halford & H. J. Markman (Eds.), Clinical handbook of marriage and couples interventions (pp. 157–​ 176). New  York, NY: Wiley. Karpel, M. A. (1994). Evaluating couples:  A handbook for practitioners. New York, NY: Norton. Kerig, P. K., & Baucom, D. H. (Eds.). (2004). Couple observational coding systems. Mahwah, NJ: Erlbaum.


Kiresuk, T. J., Smith, A., & Cardillo, J. E. (Eds.). (1994). Goal attainment scaling: Applications, theory, and measurement. Hillsdale, NJ: Erlbaum. Kline, G. H., Julien, D., Baucom, B., Hartman, S., Gilbert, K, Gonzalez, T., & Markman, H. J. (2004). The Interactional Dimensions Coding System (IDCS):  A global system for couple interactions. In P. K. Kerig & D. H. Baucom (Eds.), Couple observational coding systems (pp. 113–​126). Mahwah, NJ: Erlbaum. Kline, S. L., Zhang, S., Manohar, U., Ryu, S., Suzuki, T., & Mustafa, H. (2012). The role of communication and cultural concepts in expectations about marriage:  Comparisons between young adults from six countries. International Journal of Intercultural Relations, 36, 319–​330. Knobloch-​ Fedders, L. M., Pinsof, W. M., & Haase, C. M. (2015). Treatment response in couple therapy:  Relationship adjustment and individual functioning change processes. Journal of Family Psychology, 29, 657–​666. Kreider, R. M., & Ellis, R. (2011). Number, timing, and duration of marriages and divorces: 2009 (Current Population Reports No. P70-​125). Washington, DC:  U.S. Census Bureau. Krokoff, L. J., Gottman, J. M., & Hass, S. D. (1989). Validation of a global rapid couples interaction scoring system. Behavioral Assessment, 11, 65–​79. Kwon, J. H., & Choi, K. M. (1999). A validation study of the Korean Marital Satisfaction Inventory. Korean Journal of Clinical Psychology, 18, 123–​139. L’Abate, L. (1994). Family evaluation:  A psychological approach. Thousand Oaks, CA: Sage. L’Abate, L., & Bagarozzi, D. A. (1993). Sourcebook of marriage and family evaluation. New York, NY: Brunner/​Mazel. Lam, B. C. P., Cross, S. E., Wu, T. F., Yeh, K. H., Wang, Y. C., & Su, J. C. (2015). What do you want in a marriage? Examining marriage ideals in Taiwan and the United States. Personality and Social Psychology Bulletin, 42, 703–​722. Lavner, J. A., Bradbury, T. N., & Karney, B. R. (2012). Incremental change or initial differences? Testing two models of marital deterioration. Journal of Family Psychology, 26, 606–​616. Lawrence, E., Barry, R. A., Brock, R. L., Bunde, M., Langer, A., Ro, E., . . . Dzankovic, S. (2011). The Relationship Quality Interview:  Evidence of reliability, convergent and divergent validity, and incremental utility. Psychological Assessment, 23, 44–​63. Lebow, J. L., Chambers, A. L., Christensen, A., & Johnson, S. M. (2012). Research on the treatment of couple distress. Journal of Marital and Family Therapy, 38, 145–​168. Lim, R. F. (Ed.). (2015). Clinical manual of cultural psychiatry (2nd ed.). Arlington, VA: American Psychiatric Publishing.


Lin, E., Goering, P., Offord, D. R., Campbell, D., & Boyle, M. H. (1996). The use of mental health services in Ontario: Epidemiologic findings. Canadian Journal of Psychiatry, 41, 572–​577. Lord, C. C. (1999). Stability and change in interactional behavior in early marriage. Unpublished doctoral dissertation, State University of New York, Stony Brook, NY. Lou, Y., Lin, C., Chen, C., Balderrama-​ Durbin, C., & Snyder, D. K. (2016). Assessing intimate relationships of Chinese couples using the Marital Satisfaction Inventory-​Revised. Assessment, 23, 267–​278. Malik, N. M., & Lindahl, K. M. (2004). System for Coding Interactions in Dyads. In P. K. Kerig & D. H. Baucom (Eds.), Couple observational coding systems (pp. 173–​ 190). Mahwah, NJ: Erlbaum. Mash, E. J., & Foster, S. L. (2001). Exporting analogue behavioral observation from research to clinical practice: Useful or cost-​defective? Psychological Assessment, 13, 86–​98. McShall, J. R., & Johnson, M. D. (2015). The association between relationship distress and psychopathology is consistent across racial and ethnic groups. Journal of Abnormal Psychology, 124, 226–​231. Means-​Christensen, A. J., Snyder, D. K., & Negy, C. (2003). Assessing nontraditional couples: Validity of the Marital Satisfaction Inventory-​Revised (MSI-​R) with gay, lesbian, and cohabiting heterosexual couples. Journal of Marital and Family Therapy, 29, 69–​83. Meyer, I. (2003). Prejudice, social stress, and mental health in lesbian, gay, and bisexual populations:  Conceptual issues and research evidence. Psychological Bulletin, 129, 674–​697. Mitchell, A. E., Castellani, A. M., Sheffield, R. L., Joseph, J. I., Doss, B. D., & Snyder, D. K. (2008). Predictors of intimacy in couples’ discussions of relationship injuries:  An observational study. Journal of Family Psychology, 22, 21–​29. Mrazek, P. J., & Haggerty, R. J. (Eds.). (1994). Reducing risks for mental disorders: Frontiers for preventive intervention research. Washington, DC: National Academy Press. Murphy, C. M., & Hoover, S. A. (1999). Measuring emotional abuse in dating relationships as a multifactorial construct. Violence and Victims, 14, 39–​53. Northey, W. F., Jr. (2002). Characteristics and clinical practices of marriage and family therapists: A national survey. Journal of Marital and Family Therapy, 28, 487–​494. Notarius, C. I., Pellegrini, D., & Martin, L. (1991). Codebook of Marital and Family Interaction (COMFI). Unpublished manuscript, Catholic University of America, Washington, DC. O’Leary, K. D., Vivian, D., & Malone, J. (1992). Assessment of physical aggression in marriage: The need for a multimodal method. Behavioral Assessment, 14, 5–​14.

Olson, D. H., & Olson, A. K. (1999). PREPARE/​ENRICH program: Version 2000. In R. Berger & M. T. Hannah (Eds.), Preventive approaches in couples therapy (pp. 196–​216). Philadelphia, PA: Brunner/​Mazel. Pasch, L. A., Bradbury, T. N., & Davila, J. (1997). Gender, negative affectivity, and observed social support behavior in marital interaction. Personal Relationships, 4, 361–​378. Pasch, L. A., Harris, K. W., Sullivan, K. T., & Bradbury, T. N. (2004). The Social Support Interaction Coding System. In P. K. Kerig & D. H. Baucom (Eds.), Couple observational coding systems (pp. 319–​334). Mahwah, NJ: Erlbaum. Patterson, G. R., Ray, R. S., Shaw, D. A., & Cobb, J. A. (1969). Manual for coding of family interactions. New  York, NY: Microfiche Publications Pepping, C. A., Halford, W. K., & Doss, B.D. (2015). Can we predict failure in couple therapy early enough to enhance outcome? Behaviour Research and Therapy, 65, 60–​65. Randall, A. K., Hilpert, P., Jimenez-​Arista, L. E., Walsh, K. J., & Bodenmann, G. (2016). Dyadic coping in the U.S.:  Psychometric properties and validity for use of the English version of the Dyadic Coping Inventory. Current Psychology, 35, 570–​582. Rathus, J. H., & Feindler, E. L. (2004). Assessment of partner violence:  A handbook for researchers and practitioners. Washington, DC: American Psychological Association. Reid, J. B. (Ed.). (1978). A social learning approach, Vol. 2: Observation in home settings. Eugene, OR: Castalia. Reig-​Ferrer, A., Cepeda-​Benito, A., & Snyder, D. K. (2004). Utility of the Spanish translation of the Marital Satisfaction Inventory-​ Revised in Spain. Assessment, 11, 17–​26. Robles, T. F., Slatcher, R. B., Trombello, J. M., & McGinn, M. M. (2014). Marital quality and health:  A meta-​ analytic review. Psychological Bulletin, 140, 140–​187. Roddy, M. K., Nowlan, K. M., Doss, B. D., & Christensen, A. (2016). Integrative behavioral couple therapy:  Theoretical background, empirical research, and dissemination. Family Process, 55, 408–​422. Rowe, L. S., Doss, B. D., Hsueh, A. C., Libet, J., & Mitchell, A. E. (2011). Coexisting difficulties and couple therapy outcomes:  Psychopathology and intimate partner violence. Journal of Family Psychology, 25, 455–​458. Salazar, L. R. (2015). The negative reciprocity process in marital relationships: A literature review. Aggression and Violent Behavior, 24, 113–​119. Sanford, K. (2003). Problem-​solving conversations in marriage:  Does it matter what topics couples discuss? Personal Relationships, 10, 97–​112. Sayers, S. L., & Sarwer, D. B. (1998). Assessment of marital dysfunction. In A. S. Bellack & M. Hersen (Eds.),


Behavioral assessment:  A practical handbook (4th ed., pp. 293–​314). Boston, MA: Allyn & Bacon. Shapiro, A. F., & Gottman, J. M. (2004). The Specific Affect Coding System (SPAFF). In P. K. Kerig & D. H. Baucom (Eds.), Couple observational coding systems (pp. 191–​208). Mahwah, NJ: Erlbaum. Sharpley, C. F., & Rogers, H. J. (1984). Preliminary validation of the Abbreviated Spanier Dyadic Adjustment Scale:  Some psychometric data regarding a screening test of marital adjustment. Educational and Psychological Measurement, 44, 1045–​1049. Sher, T. G., Baucom, D. H., & Larus, J. M. (1990). Communication patterns and response to treatment among depressed and nondepressed maritally distressed couples. Journal of Family Psychology, 4, 63–​79. Shumway, S. T., Wampler, R. S., Dersch, C., & Arredondo, R. (2004). A place for marriage and family services in employee assistance programs (EAPs): A survey of EAP client problems and needs. Journal of Marital and Family Therapy, 30, 71–​79. Sillars, A., Roberts, L. J., Leonard, K. E., & Dun, T. (2000). Cognition during marital conflict:  The relationship of thought and talk. Journal of Social and Personal Relationships, 17, 479–​502. Sillars, A. L. (1982). Verbal Tactics Coding Scheme: Coding manual. Unpublished manuscript, Ohio State University, Columbus, OH. Snyder, D. K. (1997). Manual for the Marital Satisfaction Inventory-​Revised. Los Angeles, CA:  Western Psychological Services. Snyder, D. K., Castellani, A. M., & Whisman, M. A. (2006). Current status and future directions in couple therapy. Annual Review of Clinical Psychology, 57, 317–​344. Snyder, D. K., Cavell, T. A., Heffer, R. W., & Mangrum, L. F. (1995). Marital and family assessment:  A multifaceted, multilevel approach. In R. H. Mikesell, D. D. Lusterman, & S. H. McDaniel (Eds.), Integrating family therapy: Handbook of family psychology and systems theory (pp. 163–​182). Washington, DC: American Psychological Association. Snyder, D. K., Cepeda-​Benito, A., Abbott, B. V., Gleaves, D. H., Negy, C., Hahlweg, K.,  .  .  .  Maruish, M. (2004). Cross-​cultural applications of the Marital Satisfaction Inventory-​Revised (MSI-​R). In M. E. Maruish (Ed.), Use of psychological testing for treatment planning and outcomes assessment (3rd ed., pp. 603–​623). Mahwah, NJ: Erlbaum. Snyder, D. K., & Doss, B. D. (2005). Treating infidelity: Clinical and ethical directions. Journal of Clinical Psychology, 61, 1453–​1465. Snyder, D. K., Heyman, R. E., & Haynes, S. N. (2005). Evidence-​based approaches to assessing couple distress. Psychological Assessment, 17, 288–​307.


Snyder, D. K., Heyman, R. E., Haynes, S. N., Carlson, C., & Balderrama-​Durbin, C. (in press). Couple and family assessment. In B. Fiese (Ed.), APA handbook of contemporary family psychology. Washington, DC:  American Psychological Association. Snyder, D. K., Mangrum, L. F., & Wills, R. M. (1993). Predicting couples’ response to marital therapy: A comparison of short-​and long-​term predictors. Journal of Consulting and Clinical Psychology, 61, 61–​69. Snyder, D. K., & Whisman, M. A. (Eds.). (2003). Treating difficult couples: Helping clients with coexisting mental and relationship disorders. New York, NY: Guilford. Spanier, G. B. (1976). Measuring dyadic adjustment:  New scales for assessing the quality of marriage and similar dyads. Journal of Marriage and the Family, 38, 15–​28. Spitzer, R. L., Kroenke, K., Williams, J. B.  W, & Lowe, B. (2006). A brief measure for assessing generalized anxiety disorder. Achieves of Internal Medicine, 166, 1092–​1097. Straus, M. A., Hamby, S. L., Boney-​McCoy, S., & Sugarman, D. B. (1996). The revised Conflict Tactics Scales (CTS2):  Development and preliminary psychometric data. Journal of Family Issues, 17, 283–​316. Swindle, R., Heller, K., Pescosolido, B., & Kikuzawa, S. (2000). Responses to nervous breakdowns in America over a 40-​year period:  Mental health policy implications. American Psychologist, 55, 740–​749. Touliatos, J., Perlmutter, B. F., Straus, M. A., & Holden, G. W. (Eds.). (2001). Handbook of family measurement techniques (Vols. 1–​3). Thousand Oaks, CA: Sage. Vaez, E., Indran, R., Abdollahi, A., Juhari, R., & Mansor, M. (2015). How marital relations affect child behavior: Review of recent research. Vulnerable Children and Youth Studies, 10, 321–​336. van Widenfelt, B. M., Treffers, A., de Beurs, E., Siebelink, B. M., & Koudijs, E. (2005). Translation and cross-​cultural adaptation of assessment instruments used in psychological research with children and families. Clinical Child and Family Psychology Review, 8, 135–​147. Whisman, M. A. (1999). Marital dissatisfaction and psychiatric disorders: Results from the National Comorbidity Survey. Journal of Abnormal Psychology, 108, 701–​706. Whisman, M. A. (2007). Marital distress and DSM-​IV psychiatric disorders in a population-​based national survey. Journal of Abnormal Psychology, 116, 638–​643. Whisman, M. A., Beach, S. R. H., & Snyder, D. K. (2008). Is marital discord taxonic and can taxonic status be assessed reliably? Results from a national, representative sample of married couples. Journal of Consulting and Clinical Psychology, 76, 745–​755. Whisman, M. A., Dixon, A. E., & Johnson, B. (1997). Therapists’ perspectives of couple problems and treatment issues in couple therapy. Journal of Family Psychology, 11, 361–​366.


Whisman, M. A., Sheldon, C. T., & Goering, P. (2000). Psychiatric disorders and dissatisfaction with social relationships:  Does type of relationship matter? Journal of Abnormal Psychology, 109, 803–​808. Whisman, M. A., & Snyder, D. K. (1997). Evaluating and improving the efficacy of conjoint couple therapy. In W. K. Halford & H. J. Markman (Eds.), Clinical handbook of marriage and couples interventions (pp. 679–​693). New York, NY: Wiley. Whisman, M. A., & Snyder, D. K. (2007). Sexual infidelity in a national survey of American women: Differences in prevalence and correlates as a function of method

of assessment. Journal of Family Psychology, 21, 147–​154. Whisman, M. A., Snyder, D. K., & Beach, S. R. H. (2009). Screening for marital and relationship discord. Journal of Family Psychology, 23, 247–​254. Whisman, M. A., & Wagers, T. P. (2005). Assessing relationship betrayals. Journal of Clinical Psychology, 61, 1383–​1391. Wieder, G. B., & Weiss, R. L. (1980). Generalizability theory and the coding of marital interactions. Journal of Consulting and Clinical Psychology, 48, 469–​477. World Health Organization. (2016). ICD-​ 11 Beta Draft. Retrieved from https://​icd.who.int/​dev11/​l-​m/​en

23

Sexual Dysfunction

Natalie O. Rosen
Maria Glowacka
Marta Meana
Yitzchak M. Binik

The question of assessment in sexuality has always been a complex one. Arguably more than with other phenomena covered in the Diagnostic and Statistical Manual of Mental Disorders (DSM), the classification of sexuality has been complicated by changing notions of normality, the subjective nature of the sexual experience, gender differences, and significant social, economic, and political investment from parties with opposing ideologies. The past two decades have seen a series of challenges to extant definitions of sexual dysfunction in general and, specifically, to the legitimacy of certain dysfunctions. The DSM-IV-TR (American Psychiatric Association [APA], 2000) classification was critiqued for (a) medicalizing sexuality by discounting the diversity of sexual expression in favor of categorical distinctions between health and disorder (Tiefer, 2002), (b) using an androcentric conceptualization of the sexual response that inadequately accounts for female sexuality (Basson et al., 2004), (c) ignoring questions of sexual and relationship satisfaction (Byers, 1999), and (d) decontextualizing the sexual experience (Laumann & Mahay, 2002). The validity of specific dysfunctions has also been questioned (Basson, 2002; Binik, 2005; Reissing, Binik, Khalife, Cohen, & Amsel, 2004), with theoretical and empirical challenges resulting in the removal of sexual aversion disorder, the combining of female hypoactive sexual desire disorder and female sexual arousal disorder into a new diagnosis of female sexual interest/arousal disorder (FSIAD), and the combining of dyspareunia and vaginismus into genito-pelvic pain/penetration disorder (GPPPD) in the DSM-5 (APA, 2013). Male dyspareunia was also excluded from the DSM-5 due to insufficient data.

Our aim in this chapter is not to determine what rises to the level of a disorder and what does not. Rather, we aim to describe and discuss different ways of measuring subjective and physiological sexual phenomena related to global sexual function as well as to the seven sexual dysfunctions defined in the DSM-​ 5:  delayed ejaculation, erectile disorder, female orgasmic disorder, female sexual interest/​arousal disorder, genito-​pelvic pain/​penetration disorder, male hypoactive sexual desire disorder, and premature (early) ejaculation. After a brief description of the nature of these sexual problems, we describe global sexual function measures suitable for the purposes of diagnosis, case conceptualization and treatment planning, and treatment monitoring and outcome. A description of assessments specific to each of the aforementioned sexual dysfunctions follows, concluding with a discussion of future directions.

THE NATURE OF SEXUAL DYSFUNCTION

One of the reasons clinicians and researchers debate the very notion of sexual dysfunction is the ubiquity of sexual complaints in our society. Despite wide variation in prevalence rates for all sexual dysfunctions depending on the population and methodology in question (Simons & Carey, 2001), the numbers remain staggering. With general prevalence figures for sexual problems reported to be as high as 40% in women and 28% in men (Hendrickx, Gijs, & Enzlin, 2014; Laumann, Paik, & Rosen, 1999; Shifren, Monz, Russo, Segreti, & Johannes, 2008), sexual difficulties seem close to normative. Once relegated strictly to sex therapists and sexologists, the assessment of sexual function is increasingly considered an integral part of an overall health assessment (Parish, 2006).

However, it is important to distinguish a fleeting sexual complaint from a more pervasive problem. Most people will experience difficulty with sex at some point in their lives. The DSM-5 restricts diagnosis to cases characterized by a persistence of the problem (at least 6 months) and significant associated distress for the individual or couple. Indeed, prevalence estimates drop to 12% to 20% for women and 11% for men when considering both persistence and associated distress (Christensen et al., 2011; Hendrickx et al., 2014; Shifren et al., 2008). Furthermore, another population-based study (Prevalence of Female Sexual Problems Associated with Distress and Determinants of Treatment Seeking [PRESIDE]) reported that the vast majority of sexual problems with a 1-month duration (>72%) did not persist to 6 months (Shifren et al., 2008). It is notable that the prevalence of sexual problems can vary considerably across cultures, further highlighting the importance of contextual factors (Laumann et al., 2005).

The DSM-5 further classifies sexual dysfunctions as generalized or situational (with the exception of GPPPD) and lifelong or acquired, and it specifies the current severity as mild, moderate, or severe. Exclusion criteria include problems that are better explained by a nonsexual mental disorder; medical conditions and/or use of substances; or severe relationship distress, partner violence, or other stressors. To better address the degree of medical and nonmedical correlates, several associated features are now listed for consideration in the diagnosis, including partner factors, relationship factors, individual vulnerabilities, cultural or religious factors, and medical factors.

The exact determination of the DSM-5 inclusion/exclusion criterion relating specifically to etiology is particularly complicated in any individual case. It is often difficult to determine whether the sexual problem emanates from psychological disturbances alone or whether there is organic involvement. Considering that the sexual response necessarily involves both peripheral and central nervous system activity, and that it is usually experienced in an intrapersonal, interpersonal, and cultural context, one could argue that every sexual problem either originates from or is perpetuated by both psychological and physiological factors.

The overall organization of the sexual dysfunctions in the DSM-5 is alphabetical and includes seven dysfunctions relating to sexual desire, arousal, orgasm, and pain. There are no dysfunctions listed that relate to the period immediately following sexual activity, although this may change in future editions given growing support for the existence of persistent sexual arousal syndrome in women (Facelle, Sadeghi-Nejad, & Goldmeier, 2013). The comorbidity of sexual dysfunctions other than the presenting one is very common (Hendrickx et al., 2014). A problem at any stage of sexual response is likely to engender difficulties at other stages. A brief description of the known features of each of the sexual dysfunctions listed in the DSM-5 follows.

Delayed Ejaculation

Previously referred to as male orgasmic disorder, delayed ejaculation (DE) presents as delayed, infrequent, or absent ejaculation in 75% to 100% of partnered sexual activity occasions. Population prevalence estimates range from less than 1% to 2% (Christensen et al., 2011; Hendrickx, Gijs, & Enzlin, 2013). The most common physiological etiologies are select disease processes associated with aging, such as heart disease and benign prostatic hyperplasia/lower urinary tract symptoms, although pelvic surgeries, diabetes, neurological disturbances, antidepressants, and alpha blockers have also been linked to DE. Theorized psychosocial etiologic pathways include fear, performance anxiety, hostility, guilt, low desire for the partner, lack of confidence, and inadequate stimulation (Rowland et al., 2010). Idiosyncratic and vigorous masturbatory styles may also negatively impact ejaculation.

Erectile Disorder

Erectile disorder (ED) is diagnosed when at least one of the following criteria is met during 75% to 100% of sexual activity encounters: (1) difficulty obtaining an erection, (2) difficulty maintaining an erection, or (3) a decrease in erectile rigidity. Almost 15% of men in the British National Survey of Sexual Attitudes and Lifestyles (NATSAL) study reported difficulty getting or maintaining an erection (Mitchell et al., 2013). In two other large studies, approximately 5% of men reported distressing ED (Christensen et al., 2011; Hendrickx et al., 2013). The prevalence and severity of ED increase with age; however, older men are typically less distressed in comparison to younger or middle-aged men (Rosen, Miner, & Wincze, 2014). Vascular and neurological diseases or damage are associated with ED, as are lifestyle behaviors (e.g., smoking, alcohol abuse, and inactivity) that affect the vascularization and innervation necessary for erection and/or the stamina to sustain the physical exertion of penetration (Rosen, Miner, et al., 2014). Some antidepressants, antihypertensives, and drugs that block the conversion of testosterone into dihydrotestosterone (DHT), commonly used to treat male pattern hair loss and benign prostatic hyperplasia (Shamloul & Ghanem, 2013), have also been implicated. Psychosocially, performance demands, arousal underestimation, negative affect during sex, self-critical attributions, depressive symptoms, and relationship problems have all been linked to ED (Rosen, Miner, et al., 2014).

Female Orgasmic Disorder

A diagnosis of female orgasmic disorder (FOD) requires a delay in, infrequency of, or absence of orgasm or a reduced intensity of orgasmic sensations during 75% to 100% of sexual activity encounters. Because of the wide variation in the type or intensity of stimulation that triggers orgasm, clinicians are left to judge whether the woman's orgasmic capacity is less than expected for her age, sexual experience, and stimulation received. A review of several studies reported that the prevalence of FOD is approximately 3% to 34%, varying widely across studies and cultures (Graham, 2010; Laumann et al., 2005). Approximately half of women with orgasm difficulties do not report associated distress (Shifren et al., 2008). A diagnosis of FOD should not be made if lack of orgasm is solely dependent on inadequate sexual stimulation; however, epidemiology studies rarely take into account the source of stimulation (Graham, 2014). Neurophysiological and vascular disruptions, thyroid problems, pelvic nerve damage, and spinal cord injury, as well as side effects from serotonin reuptake inhibitors have been implicated in the development of FOD. Psychosocial etiologic factors are more common than physiological factors and include fear of losing control, relationship quality, and socioeconomic status and educational level (Graham, 2014).

Female Sexual Interest/Arousal Disorder

FSIAD is defined as absent or reduced sexual interest/arousal based on meeting three or more of the following criteria: lack of or reduced (a) interest in sexual activity, (b) sexual thoughts/fantasies, (c) initiations of sexual activity or being unreceptive to partner initiations, (d) excitement or pleasure during 75% to 100% of sexual activity events, (e) interest/arousal in the context of any sexual cues, or (f) genital or nongenital sensations during 75% to 100% of sexual activity events. As with male hypoactive sexual desire disorder (MHSDD), a desire discrepancy between partners is not alone sufficient for a diagnosis. There are no population-based studies that have examined the prevalence of this new disorder. The NATSAL study reported that 40.6% of women lacked interest in sex. However, when the criterion of 6-month duration was included, the prevalence declined to 10.2% (Mercer et al., 2003). The prevalence rate of arousal difficulties is estimated to be between 10.9% and 31.2% (Brotto, Bitzer, Laan, Leiblum, & Luria, 2010). Cultural variations in the prevalence of arousal and desire difficulties have also been noted (Laumann et al., 2005).

Sexual desire and arousal are influenced by a combination of biological, psychological, and contextual factors (Laan & Both, 2008; Toates, 2009). Biologically, difficulties with desire and arousal have been linked to endocrine factors, medical illnesses, and medical treatments that impact hormones or the menstrual cycle (Brotto et al., 2010). There is mixed evidence related to the role of androgen levels in problems with sexual desire in women (Davis, Davison, Donath, & Bell, 2005; Davis, Worsley, Miller, Parish, & Santoro, 2016; Santoro et al., 2005). Similar to MHSDD, negative mood states, traumatic experiences, body image concerns, relationship factors, and pressure from various cultural norms have been linked to arousal and desire problems (Brotto et al., 2010).

Genito-Pelvic Pain/Penetration Disorder

GPPPD is defined as recurrent or persistent difficulty with at least one of the following: (a) vaginal penetration during intercourse, (b) vulvovaginal or pelvic pain during penetration or attempts, (c) marked fear or anxiety about vulvovaginal or pelvic pain, or (d) tensing of pelvic floor muscles during penetration attempts. Studies indicate that 14% to 34% of younger women and 6.5% to 45% of older women suffer from pain during sexual intercourse (van Lankveld et al., 2010). The Global Study of Sexual Attitudes and Behaviours, which spanned 29 countries, reported that 2% to 8.6% of women experienced frequent pain during sex; however, this study did not take into account distress (Laumann et al., 2005). Although there are no population-based studies of GPPPD specifically, previous studies that accounted for clinical distress indicated the prevalence of dyspareunia to be 3% and vaginismus to be 0.4% (Christensen et al., 2011; Hendrickx et al., 2014). Male genital pain has been removed from the DSM-5 due to insufficient research, despite growing evidence of men experiencing pain during erection, ejaculation, and receptive anal intercourse (Bergeron, Rosen, & Pukall, 2014). The prevalence of male dyspareunia remains unclear, but it is estimated to range from 1% to 15% (Christensen et al., 2011; Clemens, Meenan, O'Keeffe, Rosetti, Gao, & Calhoun, 2005).

Biologically, GPPPD can arise from congenital malformations of the genital tract, acute and chronic diseases, nonspecific inflammatory or nerve dysfunction processes, such as vestibulodynia, postmenopausal decreases in estrogen, and iatrogenic damage from genital surgeries/procedures (Bergeron, Corsini-Munt, Aerts, Rancourt, & Rosen, 2015). Psychological factors associated with GPPPD include pain catastrophizing, fear of and hypervigilance to pain, lower self-efficacy, anxiety, and depression. In addition, women who suffer from sexual, physical, or psychological abuse have an increased likelihood of developing genito-pelvic pain (Harlow & Stewart, 2005). Recent research has highlighted a number of interpersonal factors associated with greater pain and poorer adjustment, such as partner response to the pain and couple communication (Rosen, Bergeron, et al., 2014).

Male Hypoactive Sexual Desire Disorder

MHSDD is defined as persistent or recurrent absence or deficiency of sexual thoughts/fantasies and desire for sexual activity. The prevalence rates vary substantially across studies but are estimated to be between 15% and 25% (Brotto, 2010; Lewis et al., 2010). Unfortunately, distress is rarely assessed in studies of male desire. In a large study that did account for clinically significant distress, only 1.6% of men met the DSM-IV-TR diagnostic criteria for MHSDD (Hendrickx et al., 2013). It is worth noting that men may be reluctant to report low desire for a variety of reasons, including adherence to cultural norms (Meana & Steiner, 2014). It is imperative to tease apart true MHSDD from desire that fails to rise to a partner's wishes or to a societal, oppressive ideal.

Barring age, medical conditions, pain syndromes, or medication side effects, the most often cited biological factor implicated in MHSDD has been hormones. Administration of exogenous testosterone has shown effects on the desire of hypogonadal men with erectile dysfunction; however, increased testosterone positively impacts energy and mood, which may improve desire (Khera et al., 2011). Furthermore, it is unlikely that testosterone replacement would improve desire in eugonadal men (Meana & Steiner, 2014). Thus, testosterone may not be as important to the etiology of MHSDD as was previously thought. Psychosocially, many negative emotional states and life experiences have been linked to low desire in men, including stress, depression, anxiety, cognitive set, self-esteem, trauma, cultural norms, and relational and financial difficulties (for a review, see Meana & Steiner, 2014).

Premature (Early) Ejaculation

Premature (early) ejaculation (PE) is defined as persistent or recurrent ejaculation within 1 minute following vaginal penetration and before the person wishes it on 75% to 100% of partnered sexual activity occasions. Men who engage in nonvaginal sexual activity may meet the diagnosis for PE, but the specific duration criteria are unknown. In this case, the onus is on the clinician to judge whether the conditions described are adequate for most men to delay ejaculation until desired. Two large studies of Danish and American men reported a prevalence of 7% or 8% for distressing PE based on DSM-IV-TR criteria (Christensen et al., 2011; Patrick et al., 2005). The addition of the 1-minute criterion to the DSM-5 diagnosis is likely to significantly impact prevalence estimates (Althof, 2014). In fact, a multinational consultation team composed of more than 200 experts estimated that the prevalence of PE, using the more stringent duration criteria, is only 1% to 3% (Rowland et al., 2010).

In addition to innate physiological predispositions to ejaculate quickly, genitourinary, cardiovascular, and neurologic diseases; prostatitis/chronic pelvic pain syndrome; and erectile dysfunction have also been implicated. Psychosocial factors hypothesized to contribute to PE include negative mood states, performance anxiety, unrealistic expectancies, sexual misinformation, poor sexual skills and sensory awareness, maladaptive arousal patterns, and relational problems (Althof et al., 2014; Perelman, 2006).
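Because several of the DSM-5 definitions summarized above reduce to explicit counting rules (e.g., three or more of the six FSIAD criteria, a duration of at least 6 months, and clinically significant distress), a brief sketch can make that logic concrete. The class and field names below are hypothetical illustrations rather than a validated diagnostic algorithm; they simply encode the thresholds described in the text, and clinical judgment and the exclusion criteria noted earlier still apply.

```python
# Illustrative only: a simplified tally of the FSIAD-style criteria described above.
# Field names and the boolean simplification are assumptions, not a clinical tool.
from dataclasses import dataclass


@dataclass
class FSIADScreen:
    reduced_interest: bool             # (a) interest in sexual activity
    reduced_thoughts_fantasies: bool   # (b) sexual thoughts/fantasies
    reduced_initiation: bool           # (c) initiation of, or receptivity to, sexual activity
    reduced_excitement_pleasure: bool  # (d) excitement/pleasure in 75%-100% of encounters
    reduced_response_to_cues: bool     # (e) interest/arousal in response to any sexual cues
    reduced_genital_sensations: bool   # (f) genital/nongenital sensations in 75%-100% of encounters
    duration_months: float             # persistence of the problem
    clinically_significant_distress: bool

    def meets_screening_threshold(self) -> bool:
        criteria = [
            self.reduced_interest,
            self.reduced_thoughts_fantasies,
            self.reduced_initiation,
            self.reduced_excitement_pleasure,
            self.reduced_response_to_cues,
            self.reduced_genital_sensations,
        ]
        # Three or more of the six criteria, at least 6 months' duration,
        # and clinically significant distress (see text).
        return (
            sum(criteria) >= 3
            and self.duration_months >= 6
            and self.clinically_significant_distress
        )
```

A comparable tally could be written for the other definitions (e.g., the 1-minute and 75% to 100% thresholds for PE), but in every case the count is only a screening aid; the situational, lifelong/acquired, and severity specifiers and the exclusion criteria remain matters of clinical judgment.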

PURPOSES OF ASSESSMENT

The latest edition of the Handbook of Sexuality-Related Measures (Fisher, Davis, Yarber, & Davis, 2011) contains 218 self-administered questionnaires that relate to sexuality. The comprehensiveness of this reference text is deceiving, however, because it creates the impression that the field of human sexuality is rich in assessment tools. In terms of sexual function and its clinical assessment, this is not always the case. Only a small subset of the measures in the Handbook focuses on sexual function and possesses adequate psychometric properties. The assessment of sexual function using extensively validated instruments has grown in the past decade as a consequence of the emerging need to assess outcomes in pharmaceutical clinical trials (Daker-White, 2002).

This chapter is limited to the description and evaluation of measures that (a) aim to assess sexual function in clinically useful ways and (b) have adequate or better psychometric properties. The list of measures covered could arguably have been longer because the multifactorial conceptualization of sexual problems could conceivably include assessments of myriad aspects of an individual's life. Our choice was guided by objective indices of reliability and validity, and by our subjective assessment of a measure's promise of clinical utility. We first present multidimensional measures of global sexual function or related constructs (satisfaction, distress, and relationship adjustment) adequate for diagnosis, case conceptualization, and treatment monitoring. This is followed by a discussion of assessment tools specific to each of the sexual dysfunctions. Some of the measures selected are applicable to men, women, and/or couples, whereas others are gender-specific. Critical evaluations of the psychometric properties of all measures (global and dysfunction-specific) by assessment purpose (diagnosis, case conceptualization, and treatment monitoring) are provided in Tables 23.1 through 23.3 and listed in the order in which they appear in the text.

TABLE 23.1 Ratings of Instruments Used for Diagnosis

Ratings are listed for each instrument in the following order: Norms, Internal Consistency, Inter-Rater Reliability, Test-Retest Reliability, Content Validity, Construct Validity, Validity Generalization, Clinical Utility.

Global Sexual Function

For Use with Men, Women, and Couples
GRISS: G, G, NA, A, A, G, G, A

For Use with Women Only
BISF-W: A, A, NA, A, A, A, A, A
FSFI: G, E, NA, A, G, G, G, A
MFSQ: G, A, NA, A, G, G, G, A
SFQ: G, G, NA, A, E, G, G, A
SDM: A, NA, L, NR, A, A, NR, A

For Use with Men Only
BMSFI: A, G, NA, A, G, A, A, A
IIEF: G, G, NA, A, G, G, G, A
MSHQ: G, G, NA, A, G, A, NR, A

Dysfunction-Specific

SIDI-F: A, E, NA, A, G, A, A, A
IIEF-5: G, G, NA, A, G, G, A, A
MSHQ-EjD: A, G, NA, A, A, A, G, A
IPE: G, A, NA, A, G, A, G, A
PEDT: G, A, NA, A, G, A, G, A

Note: GRISS = Golombok-Rust Inventory of Sexual Satisfaction; BISF-W = Brief Index of Sexual Functioning for Women; FSFI = Female Sexual Function Index; MFSQ = McCoy Female Sexuality Questionnaire; SFQ = Sexual Function Questionnaire; SDM = Structured Diagnostic Method; BMSFI = Brief Male Sexual Function Inventory; IIEF/IIEF-5 = International Index of Erectile Function; MSHQ/MSHQ-EjD = Male Sexual Health Questionnaire/-Ejaculation Short Form; SIDI-F = Sexual Interest and Desire Inventory; IPE = Index of Premature Ejaculation; PEDT = Premature Ejaculation Diagnostic Tool; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.
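Because the chapter restricts itself to measures with adequate or better psychometric properties, it can be convenient to treat the ratings in Table 23.1 as data and screen instruments against a minimum standard. The ordinal mapping of the rating letters and the filtering rule in the sketch below are assumptions made for illustration, not part of the chapter's rating system; the dictionary simply reproduces three rows from the table.

```python
# Illustrative only: a few rows of Table 23.1 encoded so they can be screened programmatically.
# The ordinal mapping of rating letters and the "adequate or better" filter are assumptions.
RATING_ORDER = {"NR": 0, "NA": 0, "L": 1, "A": 2, "G": 3, "E": 4}

# Ratings follow the column order listed above Table 23.1
# (Norms through Clinical Utility).
TABLE_23_1_ROWS = {
    "FSFI": ["G", "E", "NA", "A", "G", "G", "G", "A"],
    "IIEF": ["G", "G", "NA", "A", "G", "G", "G", "A"],
    "SDM":  ["A", "NA", "L", "NR", "A", "A", "NR", "A"],
}


def meets_minimum(ratings, minimum="A"):
    """True if every applicable rating is at or above the minimum (NA/NR are skipped)."""
    floor = RATING_ORDER[minimum]
    return all(RATING_ORDER[r] >= floor for r in ratings if r not in ("NA", "NR"))


if __name__ == "__main__":
    for instrument, ratings in TABLE_23_1_ROWS.items():
        print(instrument, "adequate or better on all rated categories:",
              meets_minimum(ratings))
```

This is only a convenience for sorting through the table; the letter ratings themselves, and the judgments behind them, come from the chapter's evaluation criteria.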


GLOBAL ASSESSMENT OF SEXUAL FUNCTION

Concerns about the growing medicalization of the field have engendered appeals for integrative conceptualizations of sexual dysfunctions that encompass individual, family of origin, relational, social, and cultural factors (Binik & Hall, 2014). This multifactorial approach, however, represents a daunting challenge to assessment and treatment because it requires the simultaneous consideration of multiple factors. It also calls for assessment of other mental disorders or medical conditions that may impact on sexual function and for assessment of the comorbidity of other sexual dysfunctions in the client and his or her partner. The assessment of global sexual function generally involves a clinical interview and/or self-administered questionnaires, depending on the context of the evaluation. General practitioners who want to screen for sexual dysfunction in the context of a busy medical practice may depend primarily on brief screening questionnaires. Sex therapists and other mental health professionals more directly involved in the treatment of sexual dysfunction will almost invariably start with an extended clinical interview, possibly followed by questionnaires.

Assessment for Diagnosis

There is no diagnostic category of global sexual dysfunction because there is no such diagnosis. The "diagnostic" assessment of global sexual function is thus conducted for one of two reasons: to get a general sense of the person's sexual adjustment multidimensionally defined as function, satisfaction, distress, and relationship quality, or as a screen for the existence of a specific dysfunction that will then be investigated further. Although the DSM-5 has enhanced specificity of the diagnostic criteria for sexual disorders, several still depend heavily on clinician judgment, rendering the clinical interview an essential diagnostic tool. Self-report measures of global sexual function and specific dysfunctions are generally considered diagnostic adjuncts.

Clinical Interview

The clinical interview remains the mainstay of sexual dysfunction diagnostic assessment. Clinician judgment is central to the determination of whether a client meets DSM-5 criteria for sexual dysfunction. However, there is no widely used, standardized interview that has been psychometrically validated, as is the case for other mental disorders. The Structured Clinical Interview for DSM-5 Disorders (SCID-5) does not cover the sexual dysfunctions (First, Williams, Karg, & Spitzer, 2015). Several authors have proposed clinical interview outlines and recommendations about coverage of topics and process (e.g., Maurice, 1999; McConaghy, 2003; Wincze & Weisberg, 2015) and also for the specific dysfunctions (Binik & Hall, 2014; Levine, Risen, & Althof, 2016).

Briefly, the clinical interview typically starts with the individual describing the nature of the problem and the reasons for seeking treatment at the time. Following an open-ended characterization of the difficulty, the clinician might start asking more operationally specific questions about the extent of the problem and the conditions under which it occurs. This is ideally followed by questions covering the myriad biological, psychological, and social problems that might be implicated, paying attention to the four "P's": predisposing, precipitating, perpetuating, and protective factors.

From a broadly biological perspective, it is important to assess and take into account age, general health status (e.g., body mass index, energy levels, and sense of physical well-being), lifestyle factors (e.g., diet, cigarette smoking, alcohol use, and exercise), life transitions (e.g., menopause and childbirth), hormone levels, chronic pain syndromes (e.g., vulvodynia and interstitial cystitis), vascular diseases (e.g., hypertension, atherosclerosis, and impaired cardiac function), conditions that affect nervous system function (e.g., diabetes and neuropathy), and pelvic or perineum trauma. It is also important to assess for the potentially iatrogenic influence of surgeries that may interfere with the musculature and innervation of the genital area, as well as its cosmetic appearance. Antidepressants, antipsychotics, and antihypertensives can also have a deleterious effect on desire, arousal, and orgasm, and should be inquired about. Often, assessment of many of these factors will require referral to the appropriate medical or other health professional (e.g., a physiotherapist).

In terms of individual psychological factors, depression and anxiety are often comorbid with sexual dysfunction. Treatment for sexual difficulties that does not simultaneously target mood disturbances and anxiety (if present) is unlikely to meet with much success. Substance abuse disorders can also have a major impact on sexual functioning, as can certain maladaptive cognitive sets and negative emotional reactions that interfere with sexual function, although they may not rise to the level of a disorder. These may arise from past trauma, negative experiences, or learned sexual scripts. Often, individuals simply lack knowledge of physiology or of sexual techniques.

From a relational/social perspective, family of origin attitudes regarding sexuality can be instated early on and create the conditions for the development of sexual dysfunction. The importance of assessing the quality of the individual's current relationship cannot be stressed enough. Although sexual difficulties can occur in the happiest of relationships, couple disharmony can be a cause and/or consequence of sexual problems and needs to be addressed. Relational issues important to assess include anger, distrust, discrepancies in drive and preferences, communication, and physical attraction. The way in which a relationship partner responds to the sexual difficulty both inside and outside of a sexual context can also have implications for the individuals' and couples' sexual functioning. It is usually recommended that both partners be interviewed together and/or separately to gather as much information as possible. The comorbidity of partner sexual dysfunction is common and crucial to assess. Finally, ethnocultural and religious attitudes and beliefs are important as they can be implicated in the development and maintenance of sexual difficulties. Also, these beliefs need to be respected in order to successfully treat the individual or the couple.

In summary, the presence of any one or combination of the aforementioned factors does not necessarily result in dysfunction. Failing to assess for them, however, may interfere with otherwise reasonable treatment efforts. Although the unstructured clinical interview undeniably provides maximum flexibility to explore the specifics of an individual's sexual problem and profile, the addition of a shorter, structured interview and/or self-administered questionnaires may enhance the accuracy and utility of the overall assessment.

Self-Report Measures of Global Sexual Function

Table 23.1 provides a listing of self-report measures of global sexual function helpful in diagnostic assessment. The first two of these measures are designed to be applicable to men, women, and couples, whereas the rest are gender-specific. It is worth noting that many of these measures are quite heteronormative, and adaptations and validations with diverse sexual identities are required. A description of these measures follows.

The Golombok-Rust Inventory of Sexual Satisfaction (GRISS; Rust & Golombok, 1985, 1986, 1998) is a 56-item self-report measure of sexual function and of relationship quality in heterosexual relationships. Female-specific dimensions (28 items) pertain to orgasmic difficulties, vaginismus, nonsensuality, avoidance, and dissatisfaction. Male-specific dimensions (28 items) pertain to erectile dysfunction, PE, nonsensuality, avoidance, and dissatisfaction. The two common dimensions pertain to infrequency and noncommunication. Items are responded to on 5-point adjectival scales. Scores on the 12 dimensions are transformed into standardized scores and can be plotted to provide a profile. The GRISS also provides a global score indicative of overall relationship quality and the couple's sexual function that can be useful in case conceptualization and treatment planning. Although there is some support for its use as a diagnostic tool, the GRISS was designed primarily as an evaluation tool for sex and couple therapy and for cross-treatment efficacy comparisons. Its clinical utility lies in its ease of administration (approximately 10 minutes to complete) and its simultaneous assessment of both sexual function and relationship quality.


The Brief Index of Sexual Functioning for Women (BISF-W; Rosen, Taylor, & Leiblum, 1998; Taylor, Rosen, & Leiblum, 1994) is a 22-item scale developed to measure global sexual function for the purposes of large-scale clinical trials. A scoring algorithm provides an overall score for sexual function as well as scores on seven dimensions: thoughts/desire, arousal, frequency of sexual activity, receptivity/initiation, pleasure/orgasm, relationship satisfaction, and problems affecting sexual function. Items are responded to in a variety of formats, it takes 15 to 20 minutes to administer, and some dimensions and the overall score have been shown to be sensitive to treatment (Rosen et al., 2006; Shifren et al., 2000).

The Female Sexual Function Index (FSFI; Rosen et al., 2000) is a brief, 19-item self-report measure of female sexual function yielding a total score as well as scores on six domains: desire, arousal, lubrication, orgasm, satisfaction, and pain. Items are responded to on 5- or 6-point adjectival scales and in reference to the past 4 weeks. The FSFI takes approximately 15 minutes to complete. Cross-validation of this instrument has supported its use as a screening tool or diagnostic aid, but not as the sole basis of diagnosis (Meston, 2003; Wiegel, Meston, & Rosen, 2005). Because it does not address questions of onset, duration, etiological or maintaining factors, or situational specifics, it is not as useful in the conceptualization of cases and treatment planning as in screening and measurement of treatment outcome. Data indicate that it can detect treatment-related changes (Derogatis, 2008). Recent recommendations have suggested modifications to the FSFI when it is administered to women who are sexually inactive (Yule, Davison, & Brotto, 2011), and a 6-item version has been validated for use as a rapid screener for female sexual dysfunction (Isidori et al., 2010).

The McCoy Female Sexuality Questionnaire (MFSQ; McCoy & Matyas, 1998) is a 19-item measure that assesses a woman's general level of sexual interest and response in the preceding 4 weeks. It was designed to serve as a diagnostic aid and to measure changes in sexual functioning over time. The first 11 questions relate to general sexual enjoyment, arousal, interest, satisfaction with partner, and feelings of attractiveness; the remaining 8 questions cover intercourse frequency and enjoyment, orgasm frequency and pleasure, lubrication, pain with intercourse, and the impact of the partner's erectile difficulties. Most items are answered on a 7-point adjectival scale. Time to administer is approximately 10 minutes. The MFSQ has primarily been used with menopausal women, but there is support for its use as a valid measure of dysfunction in women aged 18 to 65 years (Rellini et al., 2005).
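Because the FSFI is described above as a screening tool rather than a stand-alone basis for diagnosis, a brief sketch of cutoff-based screening may be helpful. The function below assumes that the six domain scores have already been computed with the published FSFI scoring key; the total-score cutoff of 26.55 is the value commonly cited from the cross-validation work of Wiegel, Meston, and Rosen (2005), and the function and variable names are hypothetical.

```python
# Illustrative only: flagging a case for clinical follow-up from FSFI domain scores.
# Assumes domain scores were computed with the published FSFI scoring key;
# 26.55 is the commonly cited full-scale screening cutoff (Wiegel et al., 2005).
FSFI_DOMAINS = ("desire", "arousal", "lubrication", "orgasm", "satisfaction", "pain")
FSFI_CUTOFF = 26.55


def fsfi_screen(domain_scores: dict) -> dict:
    """Return the FSFI total and whether it falls at or below the screening cutoff."""
    missing = [d for d in FSFI_DOMAINS if d not in domain_scores]
    if missing:
        raise ValueError(f"Missing FSFI domain scores: {missing}")
    total = sum(domain_scores[d] for d in FSFI_DOMAINS)
    return {"total": round(total, 2), "flag_for_followup": total <= FSFI_CUTOFF}


# Example with a hypothetical respondent's domain scores.
print(fsfi_screen({"desire": 3.6, "arousal": 3.9, "lubrication": 4.5,
                   "orgasm": 4.0, "satisfaction": 4.4, "pain": 4.8}))
```

As the text emphasizes, a flag of this kind identifies women for further assessment; it does not establish onset, duration, etiology, or the situational specifics needed for diagnosis or treatment planning.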


The Sexual Function Questionnaire (SFQ; Quirk et al., 2002; Quirk, Haughie, & Symonds, 2005) is a 34-item self-report instrument developed to assess female sexual function and sexual satisfaction in sexual pharmacology clinical trials. The eight specific dimensions targeted are desire, arousal-sensation, arousal-lubrication, subjective arousal, enjoyment, orgasm, pain, and partner relationship. The SFQ specifically distinguishes between subjective and genital aspects of arousal. It takes 15 to 20 minutes to complete, with items answered in reference to the preceding 4 weeks on 5-point adjectival scales. The 4-week reference period makes the measure suitable for the tracking of treatment progress, although no data supporting its use for treatment outcome have yet been made available.

The Structured Diagnostic Method (SDM; Utian et al., 2005) was designed to help health care providers who are not sexuality experts determine a diagnosis of female sexual dysfunction in postmenopausal women. The SDM consists of four self-report measures, followed by a clinical interview. The four questionnaires are administered in the following order: Life Satisfaction Checklist (Fugl-Meyer, Lodnert, Bränholm, & Fugl-Meyer, 1997), the first seven of nine questions in the sexual component of the Medical History Questionnaire (Pfeiffer & Davis, 1972), the Female Sexual Distress Scale (FSDS; Derogatis, Rosen, Leiblum, Burnett, & Heiman, 2002), and the SFQ (Quirk et al., 2002). The combination covers overall life satisfaction (including sexual), decline in sexual function as well as its onset, sexually related distress, and sexual function. The measures are followed by a structured interview based on a guide to diagnostic assignment outlined by Utian and colleagues. The administration of the SDM is lengthy and not suitable for primary care clinic use, but it can be clinically useful in both clinical trials and sex therapy practice. The authors have not provided an algorithm or guidelines to combine results from the measures and interview to arrive at a diagnosis.

The Brief Male Sexual Function Inventory (BMSFI; O'Leary et al., 1995) is an 11-item measure of male sexual function covering sexual drive, erection, and ejaculation; subjective problem assessment of drive, erection, and ejaculation; and overall satisfaction. Responses are given on 5-point adjectival scales in reference to the last 30 days, with higher scores indicating better function. The more recent validation of this measure suggests that it is most efficacious as a unidimensional tool for general screening purposes (Mykletun, Dahl, O'Leary, & Fossa, 2005). The measure was intended to be suitable for men in same-sex or other-sex relationships.

The International Index of Erectile Function (IIEF; Rosen et al., 1997) is a brief self-administered measure of erectile function designed to detect treatment-related changes in patients with erectile dysfunction, although it is also a useful diagnostic adjunct. The 15 items address five domains of sexual function: erectile function, orgasmic function, sexual desire, intercourse satisfaction, and overall satisfaction. Response options consist of 5- or 6-point adjectival scales, and the time reference is the prior 4 weeks. It takes less than 15 minutes to complete and is easy to administer in most settings. Recent recommendations have suggested modifications to the IIEF when it is administered to men who are sexually inactive (Yule et al., 2011) and men who have sex with men (Coyne et al., 2010). The IIEF has been validated in many languages.

The Male Sexual Health Questionnaire (MSHQ; Rosen et al., 2004) is a 25-item self-administered measure designed specifically to assess sexual function and satisfaction in aging men with urogenital concerns often associated with heart disease, prostate cancer, and benign prostatic hyperplasia/lower urinary tract symptoms. Disorders of ejaculation are common in men with these age-related physical problems, yet erectile function measures such as the IIEF do not focus specifically on problems such as delayed or retrograde ejaculation and diminished sensation, force, or pleasure. The MSHQ thus addresses three domains of sexual function: erection, ejaculation, and satisfaction with the sexual relationship.

Assessment for Case Conceptualization and Treatment Planning

Again, the richest tool for case conceptualization and treatment planning is the clinical interview, with its capacity to investigate multiple areas of functioning both in the client and in the partner. One important area to assess in the formulation of a treatment plan is the existence of other mental disorders. Other chapters in this text elaborate on the assessment of these disorders, and thus they will not be covered here. The other area crucial to case conceptualization and treatment planning is the assessment of the nonsexual aspects of the client's primary relationship (see also Chapter 22). Table 23.2 provides a listing of self-report measures suitable as adjuncts in case conceptualization and treatment planning.

Ideally, the assessment of sexual function should include the client's partner if he or she has one and if the partner is willing to participate. There are multiple functions to partner assessment, including a general assessment of relationship adjustment, the partner's perception of and responses to the sexual difficulty, and the presence of partner sexual dysfunction. This couple assessment can be enhanced with self-administered measures of relationship adjustment.

The International Index of Erectile Function (IIEF; Rosen et  al., 1997)  is a brief self-​administered measure of erectile function designed to detect treatment-​related changes in patients with erectile dysfunction, although it is also a useful diagnostic adjunct. The 15 items address five domains of sexual function: erectile function, orgasmic function, sexual desire, intercourse, and overall satisfaction. Response options consist of 5-​or 6-​point adjectival scales, and the time reference is the prior 4 weeks. It takes less than 15 minutes to complete and is easy to administer in most settings. Recent recommendations have suggested modifications to the IIEF when it is administered to men who are sexually inactive (Yule et al., 2011) and men who have sex with men (Coyne et  al, 2010). The IIEF has been validated in many languages. The Male Sexual Health Questionnaire (MSHQ; Rosen et  al., 2004)  is a 25-​item self-​administered measure designed specifically to assess sexual function and satisfaction in aging men with urogenital concerns often associated with heart disease, prostate cancer, and benign prostatic hyperplasia/​ lower urinary tract symptoms. Disorders of ejaculation are common in men with these age-​related physical problems, yet erectile function measures such as the IIEF do not focus specifically on problems such as delayed or retrograde ejaculation and diminished sensation, force, or pleasure. The MSHQ thus addresses three domains of sexual function: erection, ejaculation, and satisfaction with the sexual relationship. Assessment for Case Conceptualization and Treatment Planning Again, the richest tool for case conceptualization and treatment planning is the clinical interview, with its capacity to investigate multiple areas of functioning both in the client and in the partner. One important area to assess in the formulation of a treatment plan is the existence of other mental disorders. Other chapters in this text elaborate on the assessment of these and thus will not be covered here. The other area crucial to case conceptualization and treatment planning is the assessment of the nonsexual aspects of the client’s primary relationship (see also Chapter 22). Table 23.2 provides a listing of self-​ report measures suitable as adjuncts in case conceptualization and treatment planning. Ideally, the assessment of sexual function should include the client’s partner if he or she has one and if the partner is willing to participate. There are multiple functions to partner assessment, including a general assessment of relationship adjustment, the partner’s perception

Sexual Dysfunction TABLE 23.2  

523

Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument

Norms

Internal Inter-​Rater Consistency Reliability

Test–​Retest Reliability

Content Validity

Construct Validity

Validity Generalization

Clinical Utility

Highly Recommended

✓ ✓

Global Sexual Function For Use with Men, Women, and Couples GRISS G G DAS G G CSI A E DSFI G A ISS G E GMSEX G E NSSS A E

NA NA NA NA NA NA NA

A A NR A A G A

G G A A A A G

G G A G A G A

G G G G A G A

A A A A A A A

For Use with Women Only SSS-​W G FSDS Dysfunction-​Specific SDI PFSF SIDI-​F

G

NA

A

G

A

A

A

G

G

NA

A

G

G

A

A

A

G

NA

A

G

A

A

A

G G

G E

NA NA

A A

G G

A A

G A

A A





Note: GRISS = Golombok–​Rust Inventory of Sexual Satisfaction; DAS = Dyadic Adjustment Scale; CSI = Couple Satisfaction Index; DSFI = Derogatis Sexual Functioning Inventory; ISS = Index of Sexual Satisfaction; GMSEX = Global Measure of Sexual Satisfaction; NSSS = New Sexual Satisfaction Scale; SSS-​W = Sexual Satisfaction Scale for Women; FSDS = Female Sexual Distress Scale; SDI = Sexual Desire Inventory; PFSF = Profile of Female Sexual Function; SIDI-​F = Sexual Interest and Desire Inventory; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

of and responses to the sexual difficulty, and the presence of partner sexual dysfunction. This couple assessment can be enhanced with self-​administered measures of relationship adjustment. The Dyadic Adjustment Scale (DAS; Spanier, 1976) is the most widely used instrument for the measurement of relationship quality. It consists of 32 items in a variety of response formats that are summed to create a total score ranging from 0 to 151, with a score of 26.5 or lower indicating clinical distress. There are also four subscales, which can be used independently because scores on these have also shown good reliability and validity:  Dyadic Consensus (13 items), Dyadic Satisfaction (10 items), Dyadic Cohesion (5 items), and Affective Expression (4 items). Total DAS scores have been shown to discriminate between distressed and nondistressed couples and to identify at-​risk marriages. The measure has also been used with gay and lesbian couples (Kurdek, 1992). It is easy to administer (10–​15 minutes) and provides information about the relationship context within which the sexual dysfunction exists. A  well-​ validated short-​ form version (Revised-​ DAS; Busby, Christensen, Crane, & Larson, 1995) consisting of 14 items is also available. A limitation to the DAS and the Revised-​DAS is that they are only valid for couples who are married or living together. The Couple Satisfaction Index (CSI; Funk &

Rogge, 2007) was developed using item response theory and was found to have greater precision of measurement and enhanced power for detecting differences in relationship satisfaction compared to the DAS. The CSI can be used in dating, common-​law, or married couples. It consists of 32 items in a variety of response formats that are summed for a total score ranging from 0 to 161, with a score of 104.5 or lower indicating clinical distress. Two shorter versions, the CSI-​ 16 and the CSI-​ 4, are also available. Because of the multidimensionality of most measures of global sexual function, many are appropriate for use in case conceptualization and treatment planning. Of the measures already covered in the preceding diagnosis section, the GRISS can be useful because it covers corollary cognitions and behaviors, as well as satisfaction and relationship quality. Measures of sexual satisfaction and sexual distress can be invaluable in case conceptualization and the holistic measurement of treatment outcome, regardless of the presenting sexual dysfunction. Other measures include those discussed next. The Derogatis Sexual Functioning Inventory (DSFI; Derogatis, 1998; Derogatis & Melisaratos, 1979)  is a multidimensional measure that assesses constructs associated with sexual functioning and general well-​being. It consists of 254 items arranged into 10 subscales. The


response format is a mixture of yes/​no answers and multipoint adjectival scales. The 10 dimensions addressed by the scales are information, experiences, drive, attitudes, psychological symptoms, affect, gender role definition, fantasy, body image, and sexual satisfaction. Each scale provides a separate score, and the linear combination of the 10 scales yields the Sexual Functioning Index. A second global score, the Global Sexual Satisfaction Score, assesses the individual’s subjective perception of his or her sexual function. The psychometric soundness of the measure varies by subscale; thus, it is important to review relevant research prior to interpreting the results of any given subscale. The Index of Sexual Satisfaction (ISS; Hudson, 1998; Hudson, Harrison, & Crossup, 1981)  is a 25-​item self-​ report measure of dissatisfaction in the sexual aspects of a couple’s relationship from the perspective of the respondent. In the original measure, items were responded to on 5-​point adjectival scales describing relative frequency. The newer version has 7-​point scales and minor item revisions. The measure has been validated in various populations (Santos-​Iglesias et  al., 2009; Vieira, Pechorro, & Diniz, 2008). The Sexual Satisfaction Scale for Women (SSS-​W; Meston & Trapnell, 2005) has 30 items that are responded to on 5-​point scales anchored at “strongly agree” and “strongly disagree” in reference to the respondent’s current situation. The detailed breakdown of satisfaction into separate components (communication, compatibility, contentment, relational concern, and personal concern) may be particularly helpful in clarifying the sometimes confusing relationship between satisfaction/​distress and sexual difficulties in women. The Global Measure of Sexual Satisfaction (GMSEX; Lawrance & Byers, 1995) is a brief five-​item measure of an individual’s overall positive and negative evaluation of the sexual relationship. It consists of five word pairs that are descriptive of the respondent’s sex life and rated on 7-​ point bipolar scales. It is a component of the Interpersonal Exchange Model of Sexual Satisfaction questionnaire but can be used independently. It can be completed in less than 5 minutes and can be used with all genders and sexual orientations. The New Sexual Satisfaction Scale (NSSS; Štulhofer, Buško, & Brouillard, 2010) has 20 items that are responded to on 5-​point scales and across five conceptual dimensions (sexual sensations, sexual awareness and focus, sexual exchange, emotional closeness, and sexual activity). Two subscales are an ego-​centered subscale, which measures satisfaction with personal experiences and sensations, and

the partner- and sexual activity-centered subscale, which measures satisfaction with one's partner and sexual activity. A short form (NSSS-S) consists of 12 items and has similar reliability and validity as the long form (Štulhofer, Buško, & Brouillard, 2011). The measure is not limited to a particular sexual orientation, relationship status, gender, or culture but may be particularly useful for women who include their partner's sexual satisfaction in the assessment of their own sexual satisfaction (Mark, Herbenick, Fortenberry, Sanders, & Reece, 2014; McClelland, 2011).

The Quality of Sex Inventory (QSI; Shaw & Rogge, 2016) is a promising new measure that assesses sexual satisfaction and sexual dissatisfaction as distinct components of sexual quality. It was developed using item response theory and has demonstrated increased precision and power compared to other measures of sexual satisfaction while retaining strong convergent and construct validity characteristics. It consists of two 12-item subscales that are responded to on 5-point scales. The short form consists of two 6-item subscales.

The FSDS (Derogatis et al., 2002) is designed to measure sexually related distress in women. It consists of 12 items that describe distressing feelings or problems related to one's sexuality or sexual relationships. The items are responded to on 5-point adjectival scales anchored at "never" and "always" in reference to the past 30 days. A revised version (FSDS-R) added a 13th item to assess distress related to low sexual desire (Derogatis, Clayton, Lewis-D'Agostino, Wunderlich, & Fu, 2008). Cut-off scores for clinical distress have been published for both versions (Derogatis et al., 2002, 2008). The measure takes 3 to 5 minutes to complete. Although developed and validated with women only, the items of the FSDS are gender neutral. It has been administered to men and is currently under validation (Santos-Iglesias, Danko, Robinson, & Walker, 2016). Currently, men's results must be interpreted with caution because more research is required to confirm the validity of the FSDS in this population. The ascertainment of distress over sexual difficulties can be integral to case conceptualization and treatment planning, and the FSDS has been shown to be sensitive to treatment changes.

Assessment for Treatment Monitoring and Treatment Outcome

Treatment monitoring and outcome is the one assessment purpose for which the clinical interview is not optimal. This assessment purpose requires the quantification that only standardized measurement can provide. Fortunately, the recent explosion in clinical trials for pharmacotherapeutic agents targeting sexual dysfunction has resulted in the development of a number of measures designed specifically for the assessment of treatment monitoring and outcome. Table 23.3 provides a listing of measures suitable to treatment monitoring and the assessment of treatment outcome. In terms of measures applicable to men, women, and couples, there are data to support that the GRISS and the ISS can detect changes attributable to treatment effects.

The Changes in Sexual Functioning Questionnaire (CSFQ and CSFQ-14: Clayton, McGarvey, & Clavet, 1997; Clayton, McGarvey, Clavet, & Piazza, 1997; Keller, McGarvey, & Clayton, 2006) can be clinician administered as a structured interview (CSFQ-I) or self-administered as a gender-specific questionnaire (CSFQ-F or CSFQ-M). It measures five dimensions of sexual functioning (frequency of sexual activity, sexual desire, pleasure, arousal, and orgasmic capacity), as well as comorbid conditions, current medications, alcohol and substance use, and relationship status. The first 21 items apply to both men and women and are followed by 36 male-specific and 35 female-specific items, answered primarily on 5-point Likert-type scales. Scores on the CSFQ have been found to be more valid and reliable in female than in male samples, and most of the available psychometric data derive from the self-administered version. An abbreviated short-form version also exists; the CSFQ-14 also has gender-specific versions and is self-administered. It yields scores for three scales corresponding to desire, arousal, and orgasm, as well as for the five scales in the original long form. The CSFQ-14 scores appear to improve on the reliability and validity of the long form, especially with regard to men. The addition of a short form enhances its clinical utility because it can be administered quickly in busy practices and is amenable to immediate clinician feedback. Although designed with psychiatric patients in mind, the CSFQ has also been tested in nonclinical populations and has been found suitable for general use.

In terms of measures specific to female sexual dysfunction, the BISF-W, FSDS, and FSFI have all been found to be sensitive to treatment effects (Derogatis, 2008; Safarinejad, Hosseini, Asgari, Dadkhah, & Taghva, 2010). Thus, these measures can be used for treatment monitoring and outcome. In terms of measures specific to male sexual dysfunction, the IIEF has demonstrated treatment sensitivity (Derogatis, 2008).

TABLE 23.3 Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility

Global Sexual Function
For Use with Men, Women, and Couples
GRISS | G | G | NA | A | G | G | G | G | A
ISS | G | E | NA | A | A | A | A | A | A
CSFQ/CSFQ-14 | G | A | NA | A | A | G | G | A | A
For Use with Women Only
BISF-W | A | A | NA | A | A | A | A | A | A
FSFI | G | E | NA | A | G | G | G | G | A
MFSQ | G | A | NA | A | G | G | G | A | A
FSDS | G | G | NA | A | G | G | A | G | A
For Use with Men Only
IIEF | G | G | NA | A | G | G | G | G | A
Dysfunction-Specific
SDI | A | G | NA | A | G | A | A | A | A
IIEF-5 | G | G | NA | A | G | G | A | G | A
EHS | A | NA | NA | A | NR | G | G | G | A
QEQ | G | G | NA | A | G | A | A | G | A
IPE | G | A | NA | A | G | A | A | A | A
PEP | G | NR | NA | A | G | A | A | A | A

Note: GRISS = Golombok–Rust Inventory of Sexual Satisfaction; ISS = Index of Sexual Satisfaction; CSFQ/CSFQ-14 = Changes in Sexual Function Questionnaire; BISF-W = Brief Index of Sexual Functioning for Women; FSFI = Female Sexual Function Index; MFSQ = McCoy Female Sexuality Questionnaire; FSDS = Female Sexual Distress Scale; IIEF/IIEF-5 = International Index of Erectile Function; SDI = Sexual Desire Inventory; EHS = Erection Hardness Score; QEQ = Quality of Erection Questionnaire; IPE = Index of Premature Ejaculation; PEP = Premature Ejaculation Profile; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.
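To illustrate the kind of quantification this section calls for, the sketch below shows one way repeated administrations of a brief self-report measure could be scored and tracked over the course of treatment. It is a minimal illustration only: the item count, response range, and distress cutoff are hypothetical placeholders, not the scoring rules of any instrument listed in Table 23.3.

```python
# Illustrative only: scoring a brief self-report measure across repeated
# administrations for treatment monitoring. The item count (12), response
# range (1-5), and distress cutoff (30) are hypothetical placeholders,
# not the published scoring rules of any instrument in this chapter.

def total_score(item_responses, n_items=12, min_val=1, max_val=5):
    """Sum item responses after basic range checks."""
    if len(item_responses) != n_items:
        raise ValueError(f"Expected {n_items} items, got {len(item_responses)}")
    if any(r < min_val or r > max_val for r in item_responses):
        raise ValueError("Item response outside the allowed scale range")
    return sum(item_responses)

def monitor(sessions, distress_cutoff=30):
    """Return per-session totals, distress flags, and change from baseline."""
    totals = [total_score(items) for items in sessions]
    baseline = totals[0]
    return [
        {"session": i + 1,
         "total": t,
         "above_cutoff": t >= distress_cutoff,
         "change_from_baseline": t - baseline}
        for i, t in enumerate(totals)
    ]

if __name__ == "__main__":
    # Three administrations of the same (hypothetical) 12-item measure
    sessions = [
        [4, 3, 4, 5, 3, 4, 4, 3, 5, 4, 3, 4],   # intake
        [3, 3, 3, 4, 2, 3, 3, 3, 4, 3, 2, 3],   # mid-treatment
        [2, 2, 2, 3, 1, 2, 2, 2, 3, 2, 1, 2],   # post-treatment
    ]
    for row in monitor(sessions):
        print(row)
```

In practice, the scoring key, any reverse-scored items, and the clinical cutoff should be taken from the chosen instrument's manual or validation studies (e.g., the CSI cutoff noted earlier in this chapter), rather than from a generic sketch such as this one.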

DYSFUNCTION-SPECIFIC ASSESSMENT

The assessment of any one sexual dysfunction is largely dependent on the clinical interview. However, the administration of one or more of the aforementioned self-administered measures of global sexual function that contain a domain pertinent to the dysfunction in question can be a useful adjunct. Selecting a subset of items from an instrument can be useful for standardizing the manner in which particular symptoms are assessed, but it warrants caution because the psychometric adequacy (of the select items) would be unknown. Dysfunction-specific measures are described in this section, and they are included in Tables 23.1 to 23.3 as appropriate.

A growing number of dysfunction-specific measures have been developed during the past decade with the advancement of clinical trials for new medications. When a client presents with symptoms of a specific dysfunction, assessment can combine a clinical interview with self-report measures and may involve physiological assessment strategies as appropriate. Although only a few psychophysiological measures have been validated for the assessment of sexual dysfunction, they are discussed briefly to introduce the reader to possible additions to the multidisciplinary assessment tool kit.

Delayed Ejaculation

Most global sexual function measures inquire about the occurrence of orgasm and satisfaction with ejaculatory latency and sensation, but instruments designed specifically for male sexual dysfunction tend to more adequately investigate the range of problems that fall under DE. The IIEF and the BSFI contain one question addressing the occurrence of and difficulty with ejaculation. The IIEF adds one more item on the pleasurable sensation of orgasm, and the BSFI-M asks directly about satisfaction with the amount of ejaculate emitted. The best coverage of orgasmic problems in men, however, is provided by the MSHQ and the MSHQ-Ejaculation (MSHQ-EjD; Rosen et al., 2007). The MSHQ has seven questions devoted to ejaculation, its occurrence, delay, volume, force, pain or discomfort, and pleasure, as well as the occurrence of retrograde ejaculation. Although the MSHQ was designed for aging men, it can be useful for patients of any age who report orgasm problems. The MSHQ-EjD is a four-item measure consisting of three ejaculatory function items and one ejaculatory bother item. This briefer version of the MSHQ can be used for assessing the diagnosis and treatment outcomes of DE within an everyday clinical setting, and it can be used with heterosexual, bisexual, and gay men.

Retrograde ejaculation and emission phase disorders will likely have a physiological cause; thus, a careful medical history and referral to a physician are important to assess for potential disease or other biological processes (for a list of these, see Segraves & Segraves, 1993). Whether or not biological factors are implicated, a psychosexual history is necessary to assess psychological and relational factors contributing to the problem or consequential to it because this history can be helpful for the purpose of case conceptualization and treatment planning.

Erectile Disorder

The comprehensive assessment of ED requires a thorough clinical interview that includes both medical and psychosexual history, physical examination, and laboratory testing. More specialized diagnostic tests may be indicated in some cases, and these may include Doppler ultrasound and nocturnal penile tumescence tests (NPT). Self-report measures can be helpful in the diagnosis of the problem, although they are rarely sufficient. General sexual function measures that inquire about ED are the CSFQ and the GRISS. Male-specific measures that explore the existence and diagnosis of ED in more detail are the BSFI-M and the IIEF-5. Measures specific to ED that track treatment outcomes are the IIEF, the Erection Hardness Score (EHS), and the Quality of Erection Questionnaire (QEQ). In addition, a recently validated measure exists for assessing ED in men who do not engage in intercourse (Yuan et al., 2014); however, this measure requires further study. Finally, using the Female Assessment of Male Erectile Dysfunction Detection Scale, female partners appear to be able to accurately identify ED, supporting the integration of a couples-based approach to the diagnosis and treatment of ED (Rubio-Aurioles et al., 2009).

The IIEF-5 (Rosen, Cappelleri, Smith, Lipsky, & Pena, 1999) consists of five items from the IIEF that specifically measure erectile function and intercourse satisfaction. This measure is also sometimes referred to as the Sexual Health Inventory for Men. It is easy to administer in the context of busy general practices, although it does not provide information about other aspects of the person's sexual function. It was designed to tag erectile difficulties and track treatment-related changes. The response options are on 5-point adjectival scales, and the reference period is


6 months. The IIEF-​5 can be modified for administration to men who are sexually inactive (Yule et al., 2011). The EHS (Goldstein et al., 1998) is a one-​item measure that asks the patient to rate the hardness of his erection. This measure is useful in monitoring treatment outcome over regular time points (i.e., across sexual encounters), both in office and at home. The response options range from 0 (Penis does not enlarge) to 4 (Penis is completely hard and fully rigid). The QEQ (Porst et  al., 2007)  is a six-​item measure that assesses satisfaction with the quality of erections, specifically in men who are concerned by their erectile function. It can be used to monitor treatment-​related changes in satisfaction with erection quality. The response options are on 5-​point adjectival scales, and the reference period is the previous 4 weeks. Specialized techniques to assess for ED include NPT, penile strain gauges, the RigiScan Monitor, and the Doppler ultrasound. The most commonly used psychophysiologic procedure in the diagnosis of ED is NPT, based on the assumption that the erections during the rapid eye movement phase of the sleep cycle rule out substantial organic etiology. Usually measured in sleep labs with penile strain gauges that measure circumferential changes, NPT has demonstrated both validity and clinical utility (Ghanem & Shamloul, 2008). The RigiScan Monitor, a small computerized device, improves on NPT by addressing the issue of rigidity, in addition to tumescence and duration of erectile episodes (Meuleman, Hatzichristou, Rosen, & Sadovsky, 2010). Thermal imaging technology rapidly produces thermal images indicating the average temperature of less than 1 millimeter of skin with a precision of 0.07°C (Kukkonen, Binik, Amsel, & Carrier, 2007). Thermal imaging is significantly associated with self-​reported arousal, has shown evidence of good test–​retest reliability, and has been used to distinguish between men with and men without ED (Kukkonen, Binik, Amsel, & Carrier, 2010; Sarin, Amsel, & Binik, 2014). Finally, intracavernosal injection testing and penile duplex ultrasonography have been found clinically useful in the detection of arterial inflow abnormalities and venoocclusions (Shamloul & Ghanem, 2013). Once ED has been adequately diagnosed, case conceptualization can be greatly enhanced by a sexual, medical, and psychosocial history to assess for general sexual functioning; medical, pharmacologic, surgical, and lifestyle risk factors; as well as relationship and general psychological well-​being. The physical examination should focus on genitourinary, neurologic, and cardiovascular


systems, with laboratory tests focused on endocrine dysfunction (Hatzimouratidis et al., 2010).

Female Orgasmic Disorder

Within a clinical interview, women with lifelong orgasmic difficulty will typically report either never having had an orgasm or difficulty attaining one. Alternatively, they may complain of having lost orgasmic capacity over time or a lack of pleasure or intensity during orgasm, or even not knowing whether or not they have had an orgasm. Almost all of the self-administered measures of general and female-specific sexual function covered in this chapter inquire directly about orgasm and can be helpful in indicating a potential problem. Although most of the questions embedded in these global sexual function questionnaires are not sufficient to establish a nuanced clinical picture of the many variations possible in female orgasmic difficulty, question 15 on the Female Sexual Distress Scale/Desire Arousal Orgasm correlates well with clinician diagnosis and has been suggested as an appropriate tool for evaluating treatment benefit in FOD (Dickstein, Goldstein, Tkachenko, & Kreppner, 2013). The clinical interview remains the best diagnostic tool for the assessment of orgasmic difficulties in women.

Mah and Binik's (2002) Orgasm Rating Scale (ORS) is an interesting addition to the assessment of orgasm for both men and women. It is not designed to assess anorgasmia per se but, rather, the cognitive–affective and sensory components of orgasm. This measure may be useful in identifying determinants of orgasmic pleasure as part of a treatment program for women or men who are not completely anorgasmic.

In terms of psychophysiological instruments, the Genito-Sensory Analyzer (GSA) is a quantitative sensory testing tool that measures the vibratory and thermal sensations of the vagina and clitoris. The GSA has shown promise for assessing and diagnosing FOD (Helpman, Greenstein, Hartoov, & Abramov, 2009), but it has similar constraints as those mentioned for the psychophysiological instruments used to assess FSIAD.

Female Sexual Interest/Arousal Disorder

A clinical interview for FSIAD should include questions about the frequency and intensity of sexual interest, sexual thoughts, past and current responses to sexual stimuli, relationship factors, and physical sensations related to sexual activity. The following measures for assessing FSIAD were developed for the diagnosis of hypoactive sexual desire disorder (HSDD) and female sexual arousal


disorder (FSAD), which were replaced with FSIAD in the DSM-​5. It will take time for clinically relevant measures for FSIAD to be developed and validated; in the meantime, we must rely on previous tools. In terms of self-​administered measures, the CSFQ, BISF-​ W, and MFSQ all inquire about sexual interest and arousal in general terms, but the inquiry is limited to one or a few questions. The CSFQ and BISF-​W also contain questions about comorbid conditions that might impact desire and arousal, such as relationship status and use of medications and other substances. The FSFI has two questions about sexual desire, four questions about general sexual arousal, and four about lubrication. The FSFI appears to be valid for use in women with FSIAD (Opperman, Benson, & Milhausen, 2013). The SFQ has eight questions devoted to arousal and six questions to assess desire. An adapted version of the SFQ (SFQ-​28; Symonds et al., 2012) has been validated in women with HSDD and FSAD, suggesting that it may be appropriate for those with FSIAD. The Sexual Desire Inventory (SDI) and the Decreased Sexual Desire Screener (DSDS; Clayton et al., 2009) have been validated in women with HSDD and could be used to assess problems with desire; however, arousal is not assessed in these measures. The Profile of Female Sexual Function (PFSF; Derogatis et  al., 2004; McHorney et  al. 2004)  is a 37-​ item self-​report instrument that was designed to assess symptoms of HSDD. It covers seven domains:  desire, arousal, orgasm, pleasure, sexual concerns, responsiveness, and self-​image. A brief version—​the Brief-​Profile of Female Sexual Function (B-​PFSF)—​has been validated in postmenopausal women and includes 5 items from the PFSF and 2 items from the Personal Distress Scale (Rust et al., 2007). The Sexual Interest and Desire Inventory (SIDI-​F; Clayton et  al., 2006)  is a clinician-​administered instrument designed to quantify the severity of symptoms in premenopausal women diagnosed with HSDD and to track symptom changes in response to treatment. The 13 items cover relationship–​sexual, receptivity, initiation, desire–​ frequency, affection, desire–​ satisfaction, desire–​ distress, thoughts–​ positive, erotica, arousal–​ frequency, arousal–​ ease, arousal–​continuation, and orgasm. The SIDI-​F was initially validated in samples of women with HSDD or FSAD (Clayton et al., 2010). Compared to the measures described previously, this scale is briefer and shows higher specificity in assessing the severity and frequency of desire and arousal symptoms. Past studies have excluded women with comorbid HSDD and FSAD; thus, further research

is required to establish the validity and reliability of scores on this measure in women diagnosed with FSIAD. A physical examination that includes examining the pelvic floor muscle and vagina for possible atrophy, infections, or pain could be included in the assessment (Brotto & Luria, 2014). Unlike the clinical assessment of male erectile dysfunction, the assessment of female sexual interest and arousal has historically relied almost exclusively on self-​report. The experience of sexual desire may emerge subsequent to sexual arousal initiated by a sexually meaningful stimulus rather than always preceding arousal (Laan & Both, 2008). This discovery and others have stimulated research focused on objective genital arousal assessment instruments. Attempts to measure lubrication, clitoral engorgement, and uterine contractions have met with little success for a variety of reasons (see Meston, 2000; Prause & Janssen, 2006). Vaginal blood flow has been most amenable to measurement, and the most frequently used instrument is the vaginal photoplethysmograph (VPP), a tampon-​like, light-​emitting device that measures vasocongestion via the amount of light reflected back from the vaginal walls. However, VPP results do not necessarily represent vaginal wall engorgement, and associations with self-​reported arousal have been weak (Chivers, Seto, Lalumière, Laan, & Grimbos, 2010; Prause & Janssen, 2006). The labial thermistor clip is a surface temperature probe fastened to the labia minora (Janssen, 2001; Payne & Binik, 2006). The thermistor clip is associated with self-​ reported arousal, and there is evidence for discriminant validity (Kukkonen, 2015). Thermal imaging (described previously in the section on assessments specific to ED) has been successfully used to measure arousal in community samples and in women reporting pain during intercourse (Cherner & Reissing, 2013; Kukkonen et al., 2010). Magnetic resonance imaging is now also being applied to the measurement of genital vasocongestion, as well as brain activation during sexual arousal (Maravilla, 2006). A recent review reported that laser Doppler imaging, which measures superficial blood flow in the genital area using an infrared laser beam, provides the most valid and reliable psychophysiological data on female sexual arousal (Kukkonen, 2015). It is the only tool that measures direct blood flow, and it appears to distinguish between women with and those without sexual dysfunction (Boyer, Pukall, & Chamberlain, 2013). Furthermore, there is support for its discriminant validity and test–​retest reliability (Waxman & Pukall, 2009). The clinical utility of all these instruments is constrained by the necessity of


sexual arousal induction, equipment, trained technicians, and, sometimes, interpretive problems.
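Studies cited above summarize agreement between self-reported and genital arousal with correlational indices. As a rough, hypothetical illustration of that idea (not the analytic procedure of any study discussed here), a within-session association between a continuous self-report channel and a time-aligned physiological channel might be computed as follows; all values are invented for demonstration.

```python
# Illustrative only: one simple way agreement between continuous self-reported
# arousal and a time-aligned physiological channel (e.g., a temperature signal)
# is often summarized -- a within-participant Pearson correlation.
# The numbers below are fabricated for demonstration, not study data.

import numpy as np

def concurrent_agreement(self_report, physiological):
    """Pearson correlation between two equally sampled arousal series."""
    self_report = np.asarray(self_report, dtype=float)
    physiological = np.asarray(physiological, dtype=float)
    if self_report.shape != physiological.shape:
        raise ValueError("Series must be sampled on the same time grid")
    return float(np.corrcoef(self_report, physiological)[0, 1])

if __name__ == "__main__":
    # Hypothetical 10-sample recording during an arousal induction procedure
    subjective = [1, 1, 2, 3, 4, 5, 6, 6, 5, 4]          # dial/lever ratings
    genital_temp = [33.1, 33.2, 33.4, 33.7, 34.0, 34.3,  # degrees Celsius
                    34.6, 34.7, 34.5, 34.2]
    r = concurrent_agreement(subjective, genital_temp)
    print(f"Within-session subjective-physiological correlation: r = {r:.2f}")
```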


Genito-Pelvic Pain/Penetration Disorder

An understanding of GPPPD requires the assessment of sexual function and of pain. Self-administered sexual function measures, such as the CSFQ, GRISS, MFSQ, and BISF-W, contain one question to assess the existence and frequency of pain with intercourse. The SFQ and the FSFI have questions related to frequency and intensity of the pain, and the SFQ includes a question regarding worrying about pain. Scores on both the FSFI and the SFQ have been found to have good discriminant validity in the assessment of chronic vulvar pain (Legocki, Aikens, Sen, Haefner, & Reed, 2013). A measure of sexual distress, such as the FSDS (Derogatis et al., 2002), should also be included in the assessment of GPPPD.

General pain measures found to be useful in the conceptualization and treatment planning of GPPPD are the McGill Pain Questionnaire (MPQ; Melzack, 1975), the Pain Catastrophizing Scale (PCS; Sullivan, Bishop, & Pivik, 1995), as well as visual analogue scales and pain diaries (Payne, Bergeron, Khalife, & Binik, 2006). In addition to a large number of studies attesting to the reliability and validity of the MPQ scores for a wide range of pain experiences, it has been shown to distinguish between different subtypes of GPPPD (Meana, Binik, Khalife, & Cohen, 1997). The PCS, another widely validated general pain measure, is useful for determining the amount of pain-related distress and in formulating cognitive treatment strategies. Pain-related distress is particularly germane to women who catastrophize about their intercourse pain and who experience pelvic floor muscle dysfunction during intercourse (Pukall, Binik, Khalife, Amsel, & Abbott, 2002; Reissing et al., 2004). Consistent with a biopsychosocial model of genito-pelvic pain, partner responses to the pain significantly impact women's pain experience (Rosen, Bergeron, Sadikaj, & Delisle, 2015) and should be considered in the assessment of GPPPD.

The Vulvar Pain Assessment Questionnaire Inventory (VPAQ; Dargie, Holden, & Pukall, 2016) is a novel 63-item assessment tool developed to measure biopsychosocial aspects of vulvar pain. The VPAQ contains descriptive questions about pain characteristics and associated symptoms, as well as the following subscales: pain severity, emotional response, cognitive response, life interference, sexual functioning interference, and self-stimulation/penetration interference (completion time of approximately 20–25 minutes). There are three supplemental scales (additional pain descriptors, coping styles, and romantic partner factors) that take 10 minutes each to complete. There is also a screener version composed of 38 items containing all the descriptive questions from the full version as well as items selected from each of the subscales. The authors of the VPAQ found support for construct, convergent, and discriminant validity, as well as the internal consistency of scores on the measure. Although further research is required, the VPAQ is the first measure specific to the assessment of genital pain and shows promise as a useful tool for the evaluation of GPPPD.

The clinical interview for GPPPD should contain questions on the history, onset, location, quality, duration, and intensity of the pain because these pain characteristics have been found to have discriminant validity in the differentiation of pain subtypes (Meana et al., 1997). Other details that should be queried are the experience of pain in nonsexual contexts, cognitive distortions, factors that might reduce or exacerbate pain symptoms, and any previous treatment attempts and associated outcomes (Bergeron et al., 2014). The impact of the pain on sexual activity, relationships, and psychological functioning is also important to cover. As previously mentioned, the diagnosis of GPPPD does not apply to men. Sexual functioning measures for men do not include questions about pain, which has probably contributed to the dearth of research on this problem. The pain measures described previously and the clinical interview information are likely relevant for the assessment of pain in men as well, and they would involve adaptation to the male context of pain during penetrative activities or ejaculation.

A physical examination that aims to replicate the pain experienced with attempted penetration is a necessary component of assessment. The physical examination should include a cotton-swab palpation of the vulva and a pelvic examination, during which the woman is asked to rate the intensity of the pain. An instrument called the vulvalgesiometer was developed to standardize palpation pressure and discriminates between women with and those without GPPPD (Pukall, Young, Roberts, Sutton, & Smith, 2007; Tu, Fitzgerald, Todd, Todd, & Harden, 2007). The palpation serves to both locate the pain precisely and establish the sensitivity of the hyperalgesic area, if one is identified. Assessment of vulvar or pelvic diseases is another important goal of medical referral. For example, the assessment of pelvic floor tonicity has gained wider acceptance because it has been shown to discriminate between women with and those without GPPPD
(Reissing, Brown, Lord, Binik, & Khalife, 2005). Recently, transperineal four-dimensional ultrasound (consisting of a probe applied to the surface of the perineum) has been used as a pain-free measure of pelvic floor tonicity in women with a specific type of GPPPD (Morin, Bergeron, Khalifé, Mayrand, & Binik, 2014).

Male Hypoactive Sexual Desire Disorder

MHSDD is perhaps the most difficult sexual dysfunction to diagnose in men because it is not anchored in the absence of an expected discrete event (e.g., erection and orgasm). Diagnostic assessment is usually based on the presenting complaint of distress about desire level, taking into account natural discrepancies between members of a couple. In addition to the clinical interview, an operationalization of the severity of the problem can be facilitated by self-administered measures. Global sexual function measures that have domains specific to desire are the CSFQ, DSFI, GRISS, BSFI-M, and IIEF. The advantage of these multidimensional measures of desire is that they may also be helpful for the purpose of case conceptualization because they provide information on the existence of comorbid sexual dysfunctions, can also be administered to the partner, and, in some cases, provide information about relationship quality and satisfaction. However, there is only one desire-specific self-administered measure for men with acceptable psychometric properties and clinical utility: the SDI. The SDI can also be used in combination with the FSDS to evaluate distress related to low desire, as the wording in the FSDS is gender-neutral. However, the FSDS still requires validation in men.

The SDI (Spector, Carey, & Steinberg, 1996) is a 14-item self-report measure of dyadic and solitary desire for use with men and women. Its focus is primarily on cognitive rather than behavioral dimensions of desire. Each item is responded to according to the intensity of feeling or frequency of occurrence on 7- or 8-point adjectival scales and yields scores for dyadic desire and solitary desire, as well as a total score. Because of its cognitive emphasis, it can be particularly useful in cognitive–behavioral case conceptualizations.

In the absence of psychological, relational, situational, or disease-related factors that could account for a decline in desire, clinicians are increasingly turning to the assessment of sex hormone levels as aids in the case conceptualization and treatment planning of MHSDD. Links have been found between sexual desire and various hormones in men, including testosterone; however, it is important to note that no single hormone level has been found to be predictive of low desire (Rubio-Aurioles & Bivalacqua, 2013).

Premature (Early) Ejaculation

The assessment of PE has been complicated by variations in what is considered a normal ejaculatory latency by expert opinions and by the patient himself. In clinical trials, intravaginal ejaculation latency time (IELT) is usually assessed by means of a stopwatch; however, this is not a viable assessment technique in clinical practice. Because PE depends not only on objective measurement but also on patient distress, most clinicians do not use IELT cut-off points to assess PE. Assessment usually relies more on clinical impression and patient distress gathered from the clinical interview (Perelman, 2006). The GRISS contains a subscale for PE, and it can be used for diagnosis of PE. There are also two recently developed self-administered measures designed for diagnosing PE that include patient distress and can also be used for assessing treatment outcomes: the Index of Premature Ejaculation (IPE; Althof et al., 2006) and the Premature Ejaculation Diagnostic Tool (PEDT; Symonds et al., 2007). The Premature Ejaculation Profile (PEP; Patrick et al., 2005) was designed specifically for monitoring treatment outcomes in men with PE. Note that the IPE, PEDT, and PEP were created based on DSM-IV-TR criteria for PE, which did not include specific criteria on the frequency and duration of PE symptoms, nor on ejaculatory latency of less than 1 minute after penetration.

The IPE (Althof et al., 2006) consists of 10 items that assess subjective aspects of the overall experience of PE from the patient perspective. The tool was designed both for diagnosis and as a more encompassing alternative to single-item patient-reported treatment outcomes. The response items are on 5-point adjectival scales, and the reference period is the past 4 weeks.

The PEDT (Symonds et al., 2007) is a five-item measure that captures the main elements of the DSM-IV-TR criteria for premature ejaculation. This measure was designed to be a validated, brief tool to standardize the diagnosis of the absence or presence of PE in clinical trials. The response items are on 4-point scales and ask about the patient's general experience with intercourse.

The PEP (Patrick et al., 2005) consists of four items and can be used in three different ways: for examining the PE domains separately, as an overall index score, and as a profile score (Patrick et al., 2009). It was designed specifically for monitoring treatment outcomes in men with PE.


The response items are on 5-​point scales, and the reference period varies depending on each item. The clinical interview should assess whether the PE is likely to be attributable to psychological traits, distress, psychosexual skills deficits, relationship problems, and/​or physical illness or injury (Althof, 2014). Metz and Pryor (2000) provided a useful decision tree for the aforementioned classifications and potential etiologic pathways. Perelman (2006) stressed the importance of assessing whether the patient is able to detect premonitory sensations (bodily changes reflecting arousal/​impending ejaculation) because this is necessary in order to choose to ejaculate or to delay ejaculation.
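Although stopwatch-timed IELT is described above as a clinical-trial convention rather than a routine clinical tool, a brief worked example may clarify how such timings are commonly summarized. The use of a geometric mean (often preferred in the PE trial literature because latency distributions are skewed) and all numbers below are illustrative assumptions, not recommendations drawn from this chapter.

```python
# Illustrative only: summarizing stopwatch-timed intravaginal ejaculation
# latency times (IELT, in minutes) across several encounters. Geometric means
# are often reported in the PE trial literature because latencies are skewed;
# the values below are invented for demonstration.

import math

def summarize_ielt(latencies_min):
    """Return arithmetic and geometric mean IELT for a list of timings."""
    if not latencies_min or any(t <= 0 for t in latencies_min):
        raise ValueError("IELT values must be positive and non-empty")
    arithmetic = sum(latencies_min) / len(latencies_min)
    geometric = math.exp(sum(math.log(t) for t in latencies_min) / len(latencies_min))
    return arithmetic, geometric

if __name__ == "__main__":
    baseline = [0.5, 0.8, 0.6, 1.0]      # minutes, before treatment (hypothetical)
    follow_up = [1.5, 2.0, 1.2, 2.5]     # minutes, after treatment (hypothetical)
    for label, timings in [("baseline", baseline), ("follow-up", follow_up)]:
        am, gm = summarize_ielt(timings)
        print(f"{label}: arithmetic mean = {am:.2f} min, geometric mean = {gm:.2f} min")
    print("Fold change (geometric means):",
          round(summarize_ielt(follow_up)[1] / summarize_ielt(baseline)[1], 2))
```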

CONCLUSIONS AND FUTURE DIRECTIONS

The multidimensionality of sexual function and its problems poses a formidable challenge to both research and clinical practice. With lengthy laundry lists of potential etiologies for all the sexual dysfunctions, the isolation of any one predominating factor or even of a reasonably articulated system of interdependent factors is exceedingly difficult. It is against this backdrop of complexity that clinicians are left to diagnose, conceptualize, and treat. No single measure of sexual function can provide sufficient information regarding the affective, cognitive, behavioral, relational, and social contexts within which the sexual difficulties have arisen or are perpetuated. Only the clinical interview has the flexibility to encompass an individual client’s specific circumstances, yet it is compromised by potential reliability and validity deficiencies and by the fact that instrumental details affecting the sexual difficulties tend to emerge long after the initial intake. For this reason, assessment needs to be an integrated component of treatment at all stages, to track efficacy and to revise strategies as information and conditions change. Despite their limitations, self-​administered measures and psychophysiological tests can be useful in diagnosis, case conceptualization, and the monitoring of treatment progress. The clinical interview does not lend itself well to repeat administrations or to the operationalization of changes in sexual function. Indeed, the U.S. Food and Drug Administration requires the use of psychometrically valid tools in clinical trials of new drug treatments. Regardless of complexities in the etiology and maintenance of sexual difficulties, simple measures of drive, frequency, pleasure, or pain can indicate improvement, stasis, or deterioration. Elaboration on the meaning of the changes can follow, but their quantification is essential


to the client’s and the clinician’s evaluation of progress. Self-​administered measures are also integral to screening of sexual function in health care settings. After decades of urging the medical profession to attend to sexual health as a primary component of an overall health assessment, sex researchers have made great strides toward providing them with the tools to do so accurately. Certainly there is more work to be done, and many sexual function measures require additional psychometric validation. There is a paucity of independent validation and data supporting long-​term test–​retest reliability, validity generalization, treatment sensitivity, and clinical benefit. Achieving high psychometric standards is an important research goal that will increase our confidence in the continued use of these measures and encourage other disciplines to engage in the assessment of sexual function. The concerning move toward medicalization has had the unexpected benefit of promoting the development of clinically useful measures for use in clinical trials. We must remain vigilant that the originating drive for the development of these measures does not result in reductionist assessment tools that miss the forest for the trees or that neglect to address the specific concerns of minority populations. Most sexual function measures are penile–​ vaginal intercourse centered and validated with predominantly Caucasian, heterosexual, abled populations. There have been recent developments on how to administer some tools to individuals who are currently sexually inactive, as well as validation for use of tools across sexual orientations (Coyne et  al., 2010; Štulhofer et  al., 2010; Yule et  al., 2011). There is little research on culturally informed assessment and treatment for sexual difficulties over and above concerns about high-​risk behaviors (Lewis, 2004). Cultural norms are important to prevent sexual function measures from pathologizing groups that fall outside of mainstream expectations. The cross-​national validation of some sexual function measures designed for clinical trials and Laumann et al.’s (2006) work on the sexual well-​being of older adults in 29 countries are good examples of this culturally informed direction. The sexual health of individuals with disabilities or chronic illness has also been neglected. The norming of existing measures, as well as the development and validation of measures specific to ethnocultural groups, sexual minorities, and individuals with disabilities, is long overdue. Finally, note that the much needed corrective trend toward the investigation of female sexual dysfunction may now need to be matched by one that revisits the complexity of male sexual function. There are now many


more measures for the assessment of female than of male sexual function. The "age of Viagra" may have reduced male sexual function to a medically produced erection. Although the male sexual response may be more predictable than the female one, we risk simplifying and doing a disservice to male sexual function.

In conclusion, sexual health as defined by the World Health Organization is a state of physical, emotional, mental, and social well-being related to sexuality, which is respectful and free of coercion and discrimination (Edwards & Coleman, 2004). Clearly, this encompasses much more than the absence of dysfunction, but it does include dysfunction. Our endeavors to develop effective assessment strategies are instrumental in the promotion of sexual health. We cannot address problems without the proper tools to identify them. Ensuring that these strategies are both accurate and inclusive is essential.

ACKNOWLEDGMENTS

The authors thank Nicole Snowball and Kayla Mooney for their assistance in preparing this chapter.

References

Althof, S. E. (2014). Treatment of premature ejaculation: Psychotherapy, pharmacotherapy, and combined therapy. In Y. M. Binik & K. S. K. Hall (Eds.), Principles and practice of sex therapy (5th ed., pp. 112–137). New York, NY: Guilford. Althof, S. E., McMahon, C. G., Waldinger, M. D., Can Serefoglu, E., Shindel, A. W., Adaikan, P. G., . . . Otavio Torres, L. (2014). An update of the International Society of Sexual Medicine's guidelines for the diagnosis and treatment of premature ejaculation (PE). Journal of Sexual Medicine, 11, 1391–1422. Althof, S., Rosen, R., Symonds, T., Mundayat, R., May, K., & Abraham, L. (2006). Development and validation of a new questionnaire to assess sexual satisfaction, control, and distress associated with premature ejaculation. Journal of Sexual Medicine, 3, 465–475. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Press. Basson, R. (2002). Women's sexual desire: Disordered or misunderstood? Journal of Sex and Marital Therapy, 28(Suppl. 1), 17–28.

Basson, R., Leiblum, S., Brotto, L., Derogatis, L., Fourcroy, J., Fugl-​Meyer, K.,  .  .  .  Weijmar Schultz, W. (2004). Revised definitions of women’s sexual dysfunction. Journal of Sexual Medicine, 1, 40–​48. Bergeron, S., Corsini-​Munt, S., Aerts, L., Rancourt, K., & Rosen, N. O. (2015). Female sexual pain disorders:  A review of the literature on etiology and treatment. Current Sexual Health Reports, 7, 159–​169. Bergeron, S., Rosen, N. O., & Pukall, C. P. (2014). Genital pain in women and men: It can hurt more than your sex life. In Y. M. Binik & K. S. K. Hall (Eds.), Principles and practice of sex therapy (5th ed., pp. 159–​176). New York, NY: Guilford. Binik, Y. M. (2005). Should dyspareunia be retained as a sexual dysfunction in DSM-​V? A painful classification decision. Archives of Sexual Behavior, 34, 11–​21. Binik, Y. M., & Hall, K. S. (2014). Principles and practice of sex therapy (5th ed.). New York, NY: Guilford. Boyer, S. C., Pukall, C. F., & Chamberlain, S. M. (2013). Sexual arousal in women with provoked vestibulodynia:  The application of laser Doppler imaging to sexual pain. Journal of Sexual Medicine, 10, 1052–​1064. Brotto, L. A. (2010). The DSM diagnostic criteria for hypoactive sexual desire disorder in men. Journal of Sexual Medicine, 7, 2015–​2030. Brotto, L. A., Bitzer, J., Laan, E., Leiblum, S., & Luria, M. (2010). Women’s sexual desire and arousal disorders. Journal of Sexual Medicine, 7, 586–​614. Brotto, L., & Luria, M. (2014). Sexual interest/​arousal disorder in women. In Y. M. Binik & K. S.  K. Hall (Eds.), Principles and practice of sex therapy (5th ed., pp. 17–​41). New York, NY: Guilford. Busby, D. M., Christensen, C., Crane, D. R., & Larson, J. H. (1995). A revision of the Dyadic Adjustment Scale for use with distressed and nondistressed couples:  Construct hierarchy and multidimensional scales. Journal of Marital and Family Therapy, 21, 289–​308. Byers, E. S. (1999). The Interpersonal Exchange Model of Sexual Satisfaction:  Implications for sex therapy with couples. Canadian Journal of Counselling, 33, 95–​111. Cherner, R. A., & Reissing, E. D. (2013). A psychophysiological investigation of sexual arousal in women with lifelong vaginismus. Journal of Sexual Medicine, 10, 1291–​1303. Chivers, M. L., Seto, M. C., Lalumière, M. L., Laan, E., & Grimbos, T. (2010). Agreement of self-​reported and genital measures of sexual arousal in men and women: A meta-​analysis. Archives of Sexual Behavior, 39, 5–​56. Christensen, B. S., Grønbæk, M., Osler, M., Pedersen, B. V., Graugaard, C., & Frisch, M. (2011). Sexual dysfunctions and difficulties in Denmark: Prevalence and associated sociodemographic factors. Archives of Sexual Behavior, 40, 121–​132.


Clayton, A. H., Goldfischer, E. R., Goldstein, I., Derogatis, L., Lewis-​D’Agostino, D. J., & Pyke, R. (2009). Validation of the Decreased Sexual Desire Screener (DSDS):  A brief diagnostic instrument for generalized acquired female hypoactive sexual desire disorder (HSDD). Journal of Sexual Medicine, 6, 730–​738. Clayton, A. H., Goldmeier, D., Nappi, R. E., Wunderlich, G., Lewis-​D’Agostino, D. J., & Pyke, R. (2010). Validation of the Sexual Interest and Desire Inventory–​ Female in hypoactive sexual desire disorder. Journal of Sexual Medicine, 7, 3918–​3928. Clayton, A. H., McGarvey, E. L., & Clavet, G. J. (1997). The Changes in Sexual Functioning Questionnaire (CSFQ):  Development, reliability, and validity. Psychopharmacology Bulletin, 33, 731–​745. Clayton, A. H., McGarvey, E. L., Clavet, G. J., & Piazza, L. (1997). Comparison of sexual functioning in clinical and nonclinical populations using the Changes in Sexual Functioning Questionnaire (CSFQ). Psychopharmacology Bulletin, 33, 747–​753. Clayton, A. H., Seagraves, R. T., Leiblum, S., Basson, R., Pyke, R., Cotton, D.,  .  .  .  Wunderlich, G. R. (2006). Reliability and validity of the Sexual Interest and Desire Inventory–​Female (SIDI-​F), a scale designed to measure severity of female hypoactive sexual desire disorder. Journal of Sex and Marital Therapy, 12, 115–​135. Clemens, J. Q., Meenan, R. T., O’Keefe Rosetti, M. C., Gao, S. Y., & Calhoun, E. A. (2005). Incidence and clinical characteristics of National Institutes of Health Type III prostatitis in the community. Journal of Urology, 174, 2319–​2322. Coyne, K., Mandalia, S., McCullough, S., Catalan, J., Noestlinger, C., Colebunders, R., & Asboe, D. (2010). The International Index of Erectile Function:  Development of an adapted tool for use in HIV-​positive men who have sex with men. Journal of Sexual Medicine, 7, 769–​774. Daker-​White, G. (2002). Reliable and valid self-​report outcome measures in sexual (dys) function:  A systematic review. Archives of Sexual Behavior, 31, 197–​209. Dargie, E., Holden, R. R., & Pukall, C. F. (2016). The Vulvar Pain Assessment Questionnaire Inventory. Pain, 157(12), 2672–​2686. Davis, S. R., Davison, S. L., Donath, S., & Bell, R. J. (2005). Circulating androgen levels and self-​ reported sexual function in women. JAMA, 294, 91–​96. Davis, S. R., Worsley, R., Miller, K. K., Parish, S. J., & Santoro, N. (2016). Androgens and female sexual function and dysfunction: Findings from the Fourth International Consultation of Sexual Medicine. Journal of Sexual Medicine, 13, 168–​178. Derogatis, L. R. (1998). The Derogatis Interview for Sexual Functioning. In C. M. Davis, W. L. Yarber, R. Bauserman, G. Schreer, & S. L. Davis (Eds.), Handbook


of sexuality-​related measures (pp. 268–​271). Thousand Oaks, CA: Sage. Derogatis, L. R. (2008). Clinical and research evaluations of sexual dysfunctions. Advances in Psychosomatic Medicine, 29, 7–​22. Derogatis, L. R., Clayton, A., Lewis-​ D’Agostino, D., Wunderlich, G., & Fu, Y. (2008). Validation of the Female Sexual Distress Scale-​Revised for assessing distress in women with hypoactive sexual desire disorder. Journal of Sexual Medicine, 5, 357–​364. Derogatis, L. R., & Melisaratos, N. (1979). The DSFI: A multidimensional measure of sexual functioning. Journal of Sex and Marital Therapy, 5, 244–​248. Derogatis, L. R., Rosen, R., Leiblum, S., Burnett, A., & Heiman, J. (2002). The Female Sexual Distress Scale (FSDS):  Initial validation of a standardized scale for assessment of sexually related personal distress in women. Journal of Sex and Marital Therapy, 28, 317–​330. Derogatis, L. R., Rust, J., Golombok, S., Bouchard, C., Nachtigall, L., Rodenberg, C.,  .  .  .  McHorney, C. A. (2004). Validation of the Profile of Female Sexual Function (PFSF) in surgically and naturally menopausal women. Journal of Sex and Marital Therapy, 30, 25–​36. Dickstein, J. B., Goldstein, S. W., Tkachenko, N., & Kreppner, W. (2013). Correlation of question 15 of the FSDS-​ DAO with clinician evaluation of female orgasmic disorder. Journal of Sexual Medicine, 10, 2251–​2254. Edwards, W. M., & Coleman, E. (2004). Defining sexual health:  A descriptive overview. Archives of Sexual Behavior, 33, 189–​195. Facelle, T. M., Sadeghi-​Nejad, H., & Goldmeier, D. (2013). Persistent genital arousal disorder: Characterization, etiology, and management. Journal of Sexual Medicine, 10, 439–​450. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015). Structured clinical interview for DSM-​ 5 disorders, clinician version (SCID-​ 5-​ CV). Arlington, VA: American Psychiatric Association. Fisher, T. D., Davis, C. M., Yarber, W. L., & Davis, S. L. (2011). Handbook of sexuality-​ related measures. New York, NY: Routledge. Fugl-​Meyer, A. R., Lodnert, G., Bränholm, I. B., & Fugl-​ Meyer, K. S. (1997). On life satisfaction in male erectile dysfunction. International Journal of Impotence Research, 9, 141–​148. Funk, J. L., & Rogge, R. D. (2007). Testing the ruler with item response theory:  Increasing precision of measurement for relationship satisfaction with the Couples Satisfaction Index. Journal of Family Psychology, 21, 572–​583. Ghanem, H., & Shamloul, R. (2008). An evidence-​based perspective to commonly performed erectile dysfunction investigations. Journal of Sexual Medicine, 5, 1582–​1589.


Goldstein, I., Lue, T. F., Padma-​Nathan, H., Rosen, R. C., Steers, W. D., & Wicker, P. A. (1998). Oral sildenafil in the treatment of erectile dysfunction. New England Journal of Medicine, 338, 1397–​1404. Graham, C. A. (2010). The DSM diagnostic criteria for female orgasmic disorder. Archive of Sexual Behavior, 39, 256–​270. Graham, C. A. (2014). Orgasm disorders in women. In Y. M. Binik & K. S.  K. Hall (Eds.), Principles and practice of sex therapy (5th ed., pp. 89–​111). New  York, NY: Guilford. Harlow, B. L., & Stewart, E. G. (2005). Adult-​onset vulvodynia in relation to childhood violence victimization. American Journal of Epidemiology, 161, 871–​880. Hatzimouratidis, K., Amar, E., Eardley, I., Giuliano, F., Hatzichristou, D., Montorsi, F., . . . Wespes, E. (2010). Guidelines on male sexual dysfunction:  Erectile dysfunction and premature ejaculation. European Urology, 57, 804–​814. Helpman, L., Greenstein, A., Hartoov, J., & Abramov, L. (2009). Genito-​sensory analysis in women with arousal and orgasmic dysfunction. Journal of Sexual Medicine, 6, 1039–​1044. Hendrickx, L., Gijs, L., & Enzlin, P. (2013). Distress, sexual dysfunctions, and DSM:  Dialogue at cross purposes? Journal of Sexual Medicine, 10, 630–​641. Hendrickx, L., Gijs, L., & Enzlin, P. (2014). Prevalence rates of sexual difficulties and associated distress in heterosexual men and women: Results from an Internet survey in Flanders. Journal of Sex Research, 51, 1–​12. Hudson, W. W. (1998). Index of Sexual Satisfaction. In C. M. Davis, W. L. Yarber, R. Bauserman, G. Schreer, & S. L. Davis (Eds.), Handbook of sexuality-​related measures (pp. 512–​513). Thousand Oaks, CA: Sage. Hudson, W. W., Harrison, D. F., & Crossup, P. C. (1981). A short form scale to measure sexual discord in dyadic relationships. Journal of Sex Research, 17, 157–​174. Isidori, A., Pozza, C., Esposito, K., Giugliano, D., Morano, S., Vignozzi, L., . . . Jannini, E. (2010). Development and validation of a 6-​item version of the Female Sexual Function Index (FSFI) as a diagnostic tool for female sexual dysfunction. Journal of Sexual Medicine, 7, 1139–​1146. Janssen, E. (2001). Psychophysiological assessment of sexual arousal. In M. W. Wiederman & B. E. Whitley (Eds.), Handbook for conducting research on human sexuality (pp. 139–​171). Mahwah, NJ: Erlbaum. Keller, A., McGarvey, E. L., & Clayton, A. H. (2006). Reliability and construct validity of the Changes in Sexual Functioning Questionnaire Short-​Form (CSFQ-​ 14). Journal of Sex and Marital Therapy, 32, 43–​52. Khera, M., Bhattacharya, R. K., Blick, G., Kushner, H., Nguyen, D., & Miner, M. M. (2011). Improved sexual function with testosterone replacement therapy in

hypogonadal men:  Real-​ world data from the Testim Registry in the United States (TRiUS). Journal of Sexual Medicine, 8, 3204–​3213. Kukkonen, T. M. (2015). Devices and methods to measure female sexual arousal. Sexual Medicine Reviews, 3, 225–​244. Kukkonen, T. M., Binik, Y. M., Amsel, R., & Carrier, S. (2007). Thermography as a physiological measure of sexual arousal in both men and women. Journal of Sexual Medicine, 4, 93–​105. Kukkonen, T. M., Binik, Y. M., Amsel, R., & Carrier, S. (2010). An evaluation of the validity of thermography as a physiological measure of sexual arousal in a non-​ university adult sample. Archives of Sexual Behavior, 39, 861–​873. Kurdek, L. A. (1992). Dimensionality of the Dyadic Adjustment Scale: Evidence from heterosexual and homosexual couples. Journal of Family Psychology, 6, 22–​35. Laan, E., & Both, S. (2008). What makes women experience desire? Feminism & Psychology, 18, 505–​514. Laumann, E. O., & Mahay, J. (2002). The social organization of women’s sexuality. In M. Wingood & R. J. DiClemente (Eds.), Handbook of women’s sexual and reproductive health (pp. 43–​70). New York, NY: Kluwer/ ​Plenum. Laumann, E. O., Nicolosi, A., Glasser, D. B., Paik, A., Gingell, C., Moreira, E., & Wang, A. (2005). Sexual problems among women and men aged 40–​ 80 y:  Prevalence and correlates identified in the Global Study of Sexual Attitudes and Behaviors. International Journal of Impotence Research, 17, 39–​57. Laumann, E. O., Paik, A., Glasser, D. B., Kang, J.-​H., Wang, T., Levinson, B.,  .  .  .  Gingell, C. (2006). A cross-​ national study of subjective sexual well-​being among older women and men: Findings from the Global Study of Sexual Attitudes and Behaviors. Archives of Sexual Behavior, 35, 145–​161. Laumann, E. O., Paik, A., & Rosen, R. C. (1999). Sexual dysfunction in the US: Prevalence and predictors. Journal of the American Medical Association, 281, 537–​544. Lawrance, K. A., & Byers, E. S. (1995). Sexual satisfaction in long-​term heterosexual relationships:  The interpersonal exchange model of sexual satisfaction. Personal Relationships, 2, 267–​285. Legocki, L. J., Aikens, J. E., Sen, A., Haefner, H. K., & Reed, B. D. (2013). Interpretation of the Sexual Functioning Questionnaire in the presence of vulvar pain. Journal of Lower Genital Tract Disease, 17, 273–​279. Levine, S. B., Risen, C. B., & Althof, S. E. (2016). Handbook of clinical sexuality for mental health professionals (3rd ed.). New York, NY: Routledge. Lewis, L. J. (2004). Examining sexual health discourses in a racial/​ethnic context. Archives of Sexual Behavior, 33, 223–​234.


Lewis, R. W., Fugl-​Meyer, K. S., Corona, G., Hayes, R. D., Laumann, E. O., Moreira, E.D.,  .  .  .  Segraves, T. (2010). Definitions/​epidemiology/​risk factors for sexual dysfunction. Journal of Sexual Medicine, 7, 1598–​1607. Mah, K., & Binik, Y. M. (2002). Do all orgasm feel alike? Evaluating a two-​ dimensional model of the orgasm. Journal of Sex Research, 39, 104–​113. Maravilla, K. R. (2006). Blood flow:  Magnetic resonance imaging and brain imaging for evaluating sexual arousal in women. In I. Goldstein, C. M. Meston, S. R. Davis, & A. M. Traish (Eds.), Women’s sexual function and dysfunction: Study, diagnosis and treatment (pp. 368–​382). Abingdon, UK: Taylor & Francis. Mark, K. P., Herbenick, D., Fortenberry, J. D., Sanders, S., & Reece, M. (2014). A psychometric comparison of three scales and a single-​item measure to assess sexual satisfaction. Journal of Sex Research, 51, 159–​169. Maurice, W. L. (1999). Sexual medicine in primary care. St. Louis, MO: Mosby. McConaghy, N. (2003). Sexual dysfunctions and deviations. In M. Hersen & S. M. Turner (Eds.), Diagnostic interviewing (3rd ed., pp. 315–​341). New York, NY: Kluwer. McCoy, N. L., & Matyas, J. R. (1998). McCoy Female Sexuality Questionnaire. In C. M. Davis, W. L. Yarber, R. Bauserman, G. Schreer, & S. L. Davis (Eds.), Handbook of sexuality related measures (pp. 249–​251). Thousand Oaks, CA: Sage. McHorney, C. A., Rust, J., Golombok, S., Davis, S., Bouchard, C., Brown, C.,  .  .  .  Derogatis, L. (2004). Profile of Female Sexual Function:  A patient-​based, international, psychometric instrument for the assessment of hypoactive sexual desire disorder in oopherectomized women. Menopause, 11, 474–​483. McClelland, S. I. (2011). Who is the “self” in self reports of sexual satisfaction? Research and policy implications. Sexuality Research and Social Policy, 8, 304–​320. Meana, M., Binik, Y. M., Khalife, S., & Cohen, D. (1997). Dyspareunia:  Sexual dysfunction or pain syndrome? Journal of Nervous and Mental Disease, 185, 561–​569. Meana, M., & Steiner, E. T. (2014). Hidden disorder/​hidden desire:  Presentations of low sexual desire in men. In Y. M. Binik & K. S.  K. Hall (Eds.), Principles and practice of sex therapy (5th ed., pp. 42–​60). New York, NY: Guilford. Melzack, R. (1975). The McGill Pain Questionnaire: Major properties and scoring methods. Pain, 1, 277–​299. Mercer, C. H., Fenton, K. A., Johnson, A. M., Wellings, K., MacDowall, W., McManus, S.,  .  .  .  Erens, B. (2003). Sexual function problems and help seeking behaviour in Britain:  National Probability Sample Survey. BMJ, 327, 426–​427. Meston, C. M. (2000). The psychophysiological assessment of female sexual function. Journal of Sex Education and Therapy, 25, 6–​16.


Meston, C. M. (2003). Validation of the Female Sexual Function Index (FSFI) in women with female orgasmic disorder and in women with hypoactive sexual desire disorder. Journal of Sex and Marital Therapy, 29, 39–​46. Meston, C. M., & Trapnell, P. (2005). Development and validation of a five-​factor sexual satisfaction and distress scale for women:  The Sexual Satisfaction Scale for Women. Journal of Sexual Medicine, 2, 66–​81. Metz, M. E., & Pryor, J. L. (2000). Premature ejaculation: A psychophysiological approach for assessment and management. Journal of Sex and Marital Therapy, 26, 293–​320. Meuleman, E. J.  H., Hatzichristou, D., Rosen, R. C., & Sadovsky, R. (2010). Diagnostic tests for male erectile dysfunction revisited. Journal of Sexual Medicine, 7, 2375–​2381. Mitchell, K. R., Mercer, C. H., Ploubidis, G. B., Jones, K. G., Datta, J., Field, N., . . . Wellings, K. (2013). Sexual function in Britain: Findings from the Third National Survey of Sexual Attitudes and Lifestyles (Natsal-​ 3). Lancet, 382, 1817–​1829. Morin, M., Bergeron, S., Khalifé, S., Mayrand, M. H., & Binik, Y. M. (2014). Morphometry of the pelvic floor muscles in women with and without provoked vestibulodynia using 4D ultrasound. Journal of Sexual Medicine, 11, 776–​785. Mykletun, A., Dahl, A. A., O’Leary, M. P., & Fossa, S. D. (2005). Assessment of male sexual function by the Brief Sexual Function Inventory. British Journal of Urology International, 97, 316–​323. O’Leary, M. P., Fowler, F. J., Lenderking, W. R., Barber, B., Sagnier, P. P., Guess, H. A., & Barry, M. J. (1995). A brief male sexual function inventory for urology. Urology, 46, 697–​706. Opperman, E. A., Benson, L. E., & Milhausen, R. R. (2013). Confirmatory factor analysis of the female sexual function index. Journal of Sex Research, 50, 29–​36. Parish, S. J. (2006). Role of the primary care and internal medicine clinician. In I. Goldstein, C. M. Meston, S. R. Davis, & A. M. Traish (Eds.), Women’s sexual function and dysfunction:  Study, diagnosis and treatment (pp. 689–​695). Abingdon, UK: Taylor & Francis. Patrick, D. L., Althof, S. E., Pryor, J. L., Rosen, R., Rowland, D. L., Ho, K. F.,  .  .  .  Jamieson, C. (2005). Premature ejaculation:  An observational study of men and their partners. Journal of Sexual Medicine, 2, 358–​367. Patrick, D. L., Giuliano, F., Fai Ho, K., Gagnon, D. D., McNulty, P., & Rothman, M. (2009). The Premature Ejaculation Profile: Validation of self-​reported outcome measures for research and practice. BJU International, 103, 358–​364. Payne, K. A., Bergeron, S., Khalife, S., & Binik, Y. M. (2006). Assessment, treatment strategies and outcome results:  Perspective of pain specialists. In I. Goldstein,


C. M. Meston, S. R. Davis, & A. M. Traish (Eds.), Women’s sexual function and dysfunction: Study, diagnosis and treatment (pp. 471–​479). Abingdon, UK: Taylor & Francis. Payne, K. A., & Binik, Y. M. (2006, May 5). Reviving the labial thermistor clip [Letter to the Editor]. Archives of Sexual Behavior, 35(2), 111–​113. Perelman, M. A. (2006). A new combination treatment for premature ejaculation:  A sex therapist’s perspective. Journal of Sexual Medicine, 3, 1004–​1012. Pfeiffer, E., & Davis, G. C. (1972). Determinants of sexual behavior in middle and old age. Journal of the American Geriatric Society, 20, 151–​158. Porst, H., Gilbert, C., Collins, S., Huang, X., Symonds, T., Stecher, V., & Hvidsten, K. (2007). Development and validation of the quality of erection questionnaire. Journal of Sexual Medicine, 4, 372–​381. Prause, N., & Janssen, E. (2006). Blood flow: Vaginal photoplethysmography. In I. Goldstein, C. M. Meston, S. R. Davis, & A. M. Traish (Eds.), Women’s sexual function and dysfunction:  Study, diagnosis and treatment (pp. 359–​365). Abingdon, UK: Taylor & Francis. Pukall, C. F., Binik, Y. M., Khalife, S., Amsel, R., & Abbott, F. V. (2002). Vestibular tactile and pain thresholds in women with vulvar vestibulitis syndrome. Pain, 96, 163–​175. Pukall, C. F., Young, R. A., Roberts, M. J., Sutton, K. S., & Smith, K. B. (2007). The vulvalgesiometer as a device to measure genital pressure-​pain threshold. Physiological Measurement, 28, 1543–​1550. Quirk, F. H., Haughie, S., & Symonds, T. (2005). The use of the Sexual Function Questionnaire as a screening tool for women with sexual dysfunction. Journal of Sexual Medicine, 2, 469–​477. Quirk, F. H., Heiman, J. R., Rosen, R. C., Laan, E., Smith, M. D., & Boolell, M. (2002). Development of a sexual function questionnaire for clinical trials of female sexual dysfunction. Journal of Women’s Health and Gender-​ Based Medicine, 11, 277–​289. Reissing, E. D., Binik, Y. M., Khalife, S., Cohen, D., & Amsel, R. (2004). Vaginal spasm, pain, and behavior: An empirical investigation of the diagnosis of vaginismus. Archives of Sexual Behavior, 33, 5–​17. Reissing, E. D., Brown, C., Lord, M. J., Binik, Y. M., & Khalife, S. (2005). Pelvic floor muscle functioning in women with vulvar vestibulitis syndrome. Journal of Psychosomatic Obstetrics and Gynecology, 26, 107–​113. Rellini, A. H., Nappi, R. E., Vaccaro, P., Ferdeghini, F., Abbiati, I., & Meston, C. M. (2005). Validation of the McCoy Female Sexuality Questionnaire in an Italian sample. Archives of Sexual Behavior, 34, 641–​647. Rosen, N. O., Bergeron, S., Sadikaj, G., & Delisle, I. (2015). Daily associations among male partner responses, pain during intercourse, and anxiety in women with

vulvodynia and their partners. Journal of Pain, 16, 1312–​1320. Rosen, N. O., Bergeron, S., Sadikaj, G., Glowacka, M., Baxter, M. L., & Delisle, I, (2014). Relationship satisfaction moderates the associations between male partner responses and depression in women with vulvodynia: A dyadic daily experience study. Pain, 155, 1374–​1383. Rosen, R., Brown, C., Heiman, J., Leiblum, S., Meston, C., Shabsigh, R.,  .  .  .  D’Agostino, R. (2000). The Female Sexual Function Index (FSFI): A multidimensional self-​ report instrument for the assessment of female sexual function. Journal of Sex and Marital Therapy, 26, 191–​208. Rosen, R. C., Cappelleri, J. C., Smith, M. D., Lipsky, J., & Pena, B. M. (1999). Development and evaluation of an abridged, 5-​item version of the International Index of Erectile Function (IIEF-​5) as a diagnostic tool for erectile dysfunction. International Journal of Impotence Research, 11, 319–​326. Rosen, R. C., Catania, J. A., Althof, S. E., Pollack, L. M., O’Leary, M., Seftel, A. D., & Coon, D. W. (2007). Development and validation of four-​ item version of Male Sexual Health Questionnaire to assess ejaculatory dysfunction. Urology, 69, 805–​809. Rosen, R. C., Catania, J., Pollack, L., Althof, S., O’Leary, M., & Seftel, A. D. (2004). Male Sexual Health Questionnaire (MSHQ):  Scale development and psychometric validation. Urology, 64, 777–​782. Rosen, R. C., Janssen, E., Wiegel, M., Bancroft, J., Althof, S., Wincze, J.,  .  .  .  Barlow, D. (2006). Psychological and interpersonal correlates in men with erectile dysfunction and their partners: A pilot study of treatment outcome with sildenafil. Journal of Sex and Marital Therapy, 32, 215–​234. Rosen, R. C., Miner, M., & Wincze, J. (2014). Erectile dysfunction:  Integration of medical and psychological approaches. In Y. M. Binik & K. S. K. Hall (Eds.), Principles and practices of sex therapy (5th ed., pp. 61–​ 87). New York, NY: Guilford. Rosen, R. C., Riley, A., Wagner, G., Osterloh, I. H., Kirkpatrick, J., & Mishra, A. (1997). The International Index of Erectile Function (IIEF):  A multidimensional scale for assessment of erectile dysfunction. Urology, 49, 822–​830. Rosen, R. C., Taylor, J. E., & Leiblum, S. (1998). Brief Index of Sexual Functioning for Women. In C. M. Davis, W. L. Yarber, R. Bauserman, G. Schreer, & S. L. Davis (Eds.), Handbook of sexuality-​ related measures (pp. 251–​255). Thousand Oaks, CA: Sage. Rowland, D., McMahon, C. G., Abdo, C., Chen, J., Jannini, E., Waldinger, M. D. & Young Ahn, T. (2010). Disorders of orgasm and ejaculation in men. Journal of Sexual Medicine, 7, 1668–​1686. Rubio-​ Aurioles, E., & Bivalacqua, T. J. (2013). Standard operational procedures for low sexual desire in men. Journal of Sexual Medicine, 10, 94–​107.


Rubio-​Aurioles, E., Kim, E. D., Rosen, R. C., Porst, H., Burns, P., Zeigler, H., & Wong, D. G. (2009). Impact on erectile function and sexual quality of life of couples: A double-​blind, randomized, placebo-​controlled trial of tadalafil taken once daily. Journal of Sexual Medicine, 6, 1314–​1323. Rust, J., Derogatis, L., Rodenberg, C., Koochaki, P., Schmitt, S., & Golombok, S. (2007). Development and validation of a new screening tool for hypoactive sexual desire disorder: The Brief Profile of Female Sexual Function (B-​PFSF). Gynecological Endocrinology, 23, 638–​644. Rust, J., & Golombok, S. (1985). The Golombok–​ Rust Inventory of Sexual Satisfaction (GRISS). British Journal of Clinical Psychology, 24, 63–​64. Rust, J., & Golombok, S. (1986). The GRISS: A psychometric instrument for the assessment of sexual dysfunction. Archives of Sexual Behavior, 15, 157–​165. Rust, J., & Golombok, S. (1998). The GRISS: A psychometric scale and profile of sexual dysfunction. In C. M. Davis, W. L. Yarber, R. Bauserman, G. Schreer, & S. L. Davis (Eds.), Handbook of sexuality-​related measures (pp. 192–​194). Thousand Oaks, CA: Sage. Safarinejad, M. R., Hosseini, S. Y., Asgari, M. A., Dadkhah, F., & Taghva, A. (2010). A randomized, double-​blind, placebo-​controlled study of the efficacy and safety of bupropion for treating hypoactive sexual desire disorder in ovulating women. BJU International, 106, 832–​839. Santoro, N., Torrens, J., Crawford, S., Allsworth, J. E., Finkelstein, J. S., Gold, E. B.,  .  .  .  Weiss, G. (2005). Correlates of circulating androgens in mid-​ life women: The study of women’s health across the nation. Journal of Clinical Endocrinology & Metabolism, 90, 4836–​4845. Santos-​Iglesias, P., Danko, A., Robinson, J., & Walker, L. (2016, October). Assessing men’s sexual distress: The psychometric validation of the Female Sexual Distress Scale in men. Poster presented at the Canadian Sex Research Forum, Quebec, Canada. Santos-​Iglesias, P., Sierra, J. C., García, M., Martínez, A., Sánchez, A., & Tapia, M. (2009). Índice de satisfacción sexual (ISS):  Un studio sobre su fiabilidad y validez. International Journal of Psychology and Psychological Therapy, 9, 259–​273. Sarin, S., Amsel, R., & Binik, Y. M. (2014). How hot is he? A psychophysiological and psychosocial examination of the arousal patterns of sexually functional and dysfunctional men. Journal of Sexual Medicine, 11, 1725–​1740. Segraves, K., & Segraves, R. T. (1993). Medical aspects of orgasm disorders. In W. O’Donohue & J. H. Geer (Eds.), Handbook of sexual dysfunctions: Assessment and treatment (pp. 225–​252). Boston, MA: Allyn & Bacon. Shamloul, R., & Ghanem, H. (2013). Erectile dysfunction. Lancet, 381, 153–​165.


Shaw, A. M., & Rogge, R. D. (2016). Evaluating and refining the construct of sexual quality with item response theory:  Development of the Quality of Sex Inventory. Archives of Sexual Behavior, 45, 249–​270. Shifren, J. L., Braunstein, G. D., Simon, J. A., Casson, P. R., Buster, J. E., Redmond, G. P., . . . Mazer, N. A. (2000). Transdermal testeosterone treatment in women with impaired sexual function after oopherectomy. New England Journal of Medicine, 343, 682–​688. Shifren, J. L., Monz, B. U., Russo, P. A., Segreti, A., & Johannes, C. B. (2008). Sexual problems and distress in United States women. Obstetrics & Gynecology, 112, 970–​978. Simons, J. S., & Carey, M. P. (2001). Prevalence of the sexual dysfunctions: Results from a decade of research. Archives of Sexual Behavior, 22, 51–​58. Spanier, G. B. (1976). Measuring dyadic adjustment:  New scales for assessing the quality of marriage and similar dyads. Journal of Marriage and Family, 38, 15–​28. Spector, I. P., Carey, M. P., & Steinberg, L. (1996). The Sexual Desire Inventory:  Development, factor structure, and evidence of reliability. Journal of Sex and Marital Therapy, 22, 175–​190. Štulhofer, A., Buško, V., & Brouillard, P. (2010). Development and bicultural validation of the New Sexual Satisfaction Scale. Journal of Sex Research, 47, 257–​268. Štulhofer, A., Buško, V., & Brouillard, P. (2011). The new sexual satisfaction scale and its short form. In C. M. Davis, W. L. Yarber, R. Bauserman, G. Schreer, & S. L. Davis (Eds.), Handbook of sexuality-​related measures (pp. 530–​532). Thousand Oaks, CA: Sage. Sullivan, M. J. L., Bishop, S. R., & Pivik, J. (1995). The Pain Catastrophizing Scale:  Development and validation. Psychological Assessment, 7, 524–​532. Symonds, T., Abraham, L., Bushmakin, A. G., Williams, K., Martin, M., & Cappelleri, J. C. (2012). Sexual Function Questionnaire: Further refinement and validation. Journal of Sexual Medicine, 9, 2609–​2616. Symonds, T., Perelman, M. A., Althof, S., Giuliano, F., Martin, M., May, K.,  .  .  .  Morris, M. (2007). Development and validation of a premature ejaculation diagnostic tool. European Urology, 52, 565–​573. Taylor, J. F., Rosen, R. C., & Leiblum, S. R. (1994). Self-​report assessment of female sexual function:  Psychometric evaluation of the Brief Index of Sexual Functioning for Women. Archives of Sexual Behavior, 23, 627–​643. Tiefer, L. (2002). Beyond the medical model of women’s sexual problems: A campaign to resist the promotion of “female sexual dysfunction.” Sexual and Relationship Therapy, 17, 127–​135. Toates, F. (2009). An integrative theoretical framework for understanding sexual motivation, arousal, and behavior. Journal of Sex Research, 46, 168–​193. Tu, F., Fitzgerald, C. M., Todd, K., Todd, F., & Harden, N. (2007). Comparative measurement of pelvic floor


pain sensitivity in chronic pelvic pain. Obstetrics & Gynecology, 110, 1244–​1248. Utian, W. H., McLean, D. B., Symonds, T., Symons, J., Somayaji, V., & Sisson, M. (2005). A methodology study to validate a structured diagnostic method used to diagnose female sexual dysfunction and its subtypes in postmenopausal women. Journal of Sex and Marital Therapy, 31, 271–​283. van Lankveld, J. D.  M., Granot, M., Weijmar Schultz, W. C.  M., Binik, Y. M., Wesselmann, U., Pukall, C. F., . . . Achtrari, C. (2010). Women’s sexual pain disorders. Journal of Sexual Medicine, 7, 615–​631. Vieira, R. X., Pechorro, P., & Diniz, A. (2008). Validation of Index of Sexual Satisfaction (ISS) for use with Portuguese women. Sexologies, 17, s115. Waxman, S. E., & Pukall, C. F. (2009). Laser Doppler imaging of genital blood flow: A direct measure of female sexual arousal. Journal of Sexual Medicine, 6, 2278–​2285.

Wiegel, M., Meston, C., & Rosen, R. (2005). The Female Sexual Function Index (FSFI):  Cross-​ validation and development of clinical cutoff scores. Journal of Sex and Marital Therapy, 31, 1–​20. Wincze, J. P., & Weisberg, R. B. (2015). Sexual dysfunction:  A guide for assessment and treatment. New  York, NY: Guilford. Yuan, Y., Zhang, Z., Gao, B., Peng, J., Cui, W., Song, W., . . . Guo, Y. (2014). The Self-​Estimation Index of Erectile Function–​No Sexual Intercourse (SIEF-​NS): A multidimensional scale to assess erectile dysfunction in the absence of sexual intercourse. Journal of Sexual Medicine, 11, 1201–​1207. Yule, M., Davison, J., & Brotto, L. (2011). The International Index of Erectile Function: A methodological critique and suggestions for improvement. Journal of Sex & Marital Therapy, 37, 255–​269.

Part VIII

Health-​Related Problems

24

Eating Disorders

Robyn Sysko
Sara Alavi

This chapter presents the most commonly used and well-validated eating disorder assessments for the purposes of diagnosis, case conceptualization and treatment planning, and treatment monitoring and treatment outcome. General information is also presented as an introduction to the topic of assessment, including the criteria for an eating disorder diagnosis, the prevalence and incidence of eating disorders, common comorbidities, treatment outcomes, and the etiology of the disorders. Although structured or semi-structured interviews and self-report questionnaires are most often used in research studies, where they have demonstrated their value for studying the nature of eating disorders, these assessments also have promise as clinical tools. As such, this chapter provides information for practitioners interested in using these measures in clinical practice.

NATURE OF EATING DISORDERS

The fifth edition of the Diagnostic and Statistical Manual for Mental Disorders (DSM-​ 5; American Psychiatric Association [APA], 2013)  introduced several changes to the chapter on feeding and eating disorders. An important decision in DSM-​5 was to combine conditions previously listed in both the Eating Disorders and the Feeding and Eating Disorders of Infancy or Early Childhood sections into a single section, thereby including pica, rumination disorder, and a new category of avoidant/​restrictive food intake disorder (ARFID). Given the focus of this chapter on eating disorders, notable alterations to criteria for anorexia nervosa (AN), bulimia nervosa (BN), binge eating disorder (BED), and revisions to residual category for eating disorders are briefly described here.

Eating Disorder Categories

Anorexia Nervosa

The DSM-5 criteria for AN differ from earlier categorizations in several ways, although the hallmark of AN continues to be the presence of a significantly low body weight. The description of this symptom was altered to eliminate the term "refusal" from DSM-IV (APA, 1994), which avoids a perception that individuals with AN are making a conscious choice, and also focuses on the importance of altered energy intake through reduced food intake and/or increased physical activity (APA, 2013). An example of percent ideal body weight is not provided, which requires clinicians to determine whether an individual's weight is low given age, sex, and developmental trajectory. Individuals with AN are also expected to demonstrate a fear of gaining weight; DSM-5 accommodates individuals who deny this criterion, but their overt behavior (e.g., avoidance of high-calorie foods and reluctance to consume a range of foods) must be consistent with fear. A disturbance in shape or weight was retained in the diagnosis, but a "persistent lack of recognition" of the seriousness of low weight was offered as a clarification of the phenomenon. The greatest change from the DSM-IV criteria for AN was eliminating the requirement for amenorrhea. Individuals who do not regularly engage in binge eating or purging behaviors (i.e., self-induced vomiting and laxative or diuretic abuse) continue to be classified as having AN-restricting type (AN-R), and those reporting binge eating or purging are diagnosed with AN-binge-eating/purging (AN-B/P) type. Some studies have noted an increase in rates of individuals diagnosed with AN, which likely results from cases previously placed in the residual not otherwise specified category in DSM-IV moving to a


formal category under DSM-5, including one retrospective study that noted an increase of 14% in diagnoses of AN in a clinical sample after applying DSM-5 criteria (Gualandi, Simoni, Manzato, & Scanelli, 2016).

Bulimia Nervosa

Changes to the criteria for BN in DSM-5 were modest. Diagnostic criteria require that individuals report recurrent episodes of binge eating and inappropriate compensatory behavior (e.g., self-induced vomiting, fasting, and excessive exercise). On the basis of a literature review (Wilson & Sysko, 2009), the required frequency of these episodes was lowered from at least twice weekly in DSM-IV to once weekly over a 3-month period in DSM-5, and people meeting diagnostic criteria are required to experience an undue influence of shape and weight on their self-evaluation. An episode of binge eating is characterized by consuming a large amount of food and the experience of a loss of control over eating. A reapplication of DSM-5 criteria to a clinical sample identified a 2.4% increase in the rate of BN diagnoses (Gualandi et al., 2016). The classification of individuals with BN with either the purging or the nonpurging type was considered to be of limited utility and frequently not employed (Peat, Mitchell, Hoek, & Wonderlich, 2009); therefore, this subtyping scheme was eliminated.

Binge Eating Disorder

A major shift in DSM-5 was the formal recognition of BED as a diagnosis. Individuals with BED experience recurrent episodes of binge eating, at least once weekly over a 3-month period, parallel with the criterion for BN, in the absence of compensatory behaviors. To be diagnosed with BED, individuals must experience distress over their binge eating episodes; have binge eating episodes characterized by eating large amounts of food during a short time and a sense of loss of control during the episode; and report at least three of the following: eating until feeling uncomfortably full; eating large amounts of food when not physically hungry; eating much more rapidly than normal; eating alone because of embarrassment; and feeling disgusted, depressed, or guilty after overeating. Obesity, or the presence of excess body weight, is listed as a general medical condition and not an eating disorder within the DSM-5 system.

Residual Categories

Two new residual categories in DSM-5 divide the group with clinically significant eating pathology formerly

identified with DSM-IV eating disorder not otherwise specified (EDNOS), namely other specified feeding and eating disorder (OSFED) and unspecified feeding and eating disorder (UFED). The OSFED category includes five example clinical presentations: atypical anorexia nervosa, subthreshold bulimia nervosa, subthreshold binge eating disorder, purging disorder, and night eating syndrome. All other individuals are classified within the UFED category. Several studies examined changes in the prevalence of residual diagnoses after applying DSM-5 criteria, with the general observation that rates of residual diagnoses are substantially reduced (Caudle, Pang, Mancuso, Castle, & Newton, 2015; Keel, Brown, Holm-Denoma, & Bodell, 2011; Machado, Gonçalves, & Hoek, 2013; Mustelin et al., 2016; Ornstein et al., 2013).

Prevalence

In comparison to other psychiatric diagnoses, such as major depression or substance abuse, eating disorders are relatively rare in the population. The prevalence of eating disorders in the general population appears to be increasing, although this may be a result of the recently broadened diagnostic criteria (Lindvall Dahlgren & Wisting, 2016; Qian et al., 2013). A meta-analysis including data from 15 global epidemiological studies of the general population reported the lifetime prevalence of BED as the highest (2.2%), followed by BN (0.81%) and AN (0.21%), using the DSM-IV scheme (Qian et al., 2013). Eating disorders are more common among adolescents; the highest incidence rate of AN is in females aged 15 to 19 years, at 109.2 per 100,000 per year, a group comprising approximately 40% of all cases of AN (Smink, van Hoeken, & Hoek, 2012). In adolescents, the point prevalence of BED is the highest (3.7% females, 0.5% males), followed by AN (1.2% females, 0.1% males) and BN (0.6% females, 0.1% males) (Smink, van Hoeken, Oldehinkel, & Hoek, 2014). Estimates of the prevalence of BED among the general population are typically higher than those of AN and BN (Cossrow et al., 2016; Kessler et al., 2013; Smink et al., 2012). Based on DSM-5 criteria, the 12-month BED prevalence estimate is 1.6% (2.0% and 1.2% in women and men, respectively). As expected with the broader diagnostic criteria, this estimate is higher than the estimate based on DSM-IV-TR (1.2%; APA, 2000). Similarly, the lifetime prevalence of BED as determined by DSM-5 criteria is 2.0% (2.6% and 1.5% in women and men, respectively) compared to the DSM-IV-based estimate of 2.07% (Cossrow et al., 2016). The gap between women and men in prevalence rates for BED is


significantly smaller than that of AN and BN, with several studies showing that BED is roughly as common in men as it is in women (Lewinsohn, Seeley, Moerk, & Striegel-Moore, 2002; Mond & Hay, 2007; Striegel-Moore et al., 2009). However, among samples of obese individuals, prevalence estimates can be up to 8% (Striegel-Moore & Franko, 2003).

Culture and Sex Differences

Although eating disorders are often considered to be culturally bound syndromes, AN has been documented in every region of the world (Keel & Klump, 2003). In addition, the prevalence of AN is similar in Western and non-Western countries; therefore, AN does not appear to occur solely, or even more frequently, in Western countries (Keel & Klump, 2003). Although BN has been observed outside of Western countries, Western cultural influences appear to play a more significant role in the development of this disorder, and an increase in the incidence of BN was also observed during the latter half of the twentieth century (Keel & Klump, 2003).

Although epidemiological studies consistently estimate eating disorders in males to occur at a lower frequency than in females, more recent data have found smaller discrepancies in the rates between genders, including a 3:1 ratio of women to men with either AN or BN (Hudson, Hiripi, Pope, & Kessler, 2007). Several important gender differences should be taken into consideration when interpreting these findings, such as whether this increase in estimates among males with AN or BN reflects (a) an actual rise in the number of cases, (b) less gender bias in the diagnostic criteria, or (c) a greater awareness of eating disorders in males (Hildebrandt & Craigen, 2015). Body image disturbances in males typically present in one of two dimensions: muscularity and body fat. Men and boys who are preoccupied with achieving thinness, with limited muscularity concerns, more easily map onto the existing criteria for classic eating disorders related to the drive for thinness. On the other hand, men and boys who have an extreme desire for muscularity are more likely to have symptoms of muscle dysmorphia, a subtype of body dysmorphic disorder in males that has been termed "reverse anorexia." The defining feature of muscle dysmorphia is a drive for leanness and muscularity and not a desire for thinness (Hildebrandt & Craigen, 2015; Hildebrandt, Schlundt, Langenbucher, & Chung, 2006; Pope, Gruber, Choi, Olivardia, & Phillips, 1997). Differences in core motivation for the achievement of a physical ideal not based on thinness have led


to suggestions that the symptoms of body image disturbance in males could manifest very differently from those observed in females (Hildebrandt et al., 2011). Along with a shared body image disturbance, behavioral symptoms in males with AN and muscle dysmorphia include both diet and exercise disturbances (Murray et al., 2012), but these conditions differ in that in muscle dysmorphia the primary disturbance relates to body image, whereas for AN and other eating disorders the primary disturbance is eating pathology (Hildebrandt & Craigen, 2015). Males with an excessive drive for muscularity often engage in the use, and sometimes abuse, of illicit substances called appearance and performance enhancing drugs (APEDs), such as anabolic steroids, to further control their physical appearance (Pope, Kanayama, & Hudson, 2012). Understanding the distinct motivations for weight and shape control in men and boys that drive their impairment is necessary before we can adequately assess eating disorder symptomatology in this half of the population.

Etiology, Comorbidities, Prognosis, and Treatment

The etiology of eating disorders is complex; however, some biological, environmental, and psychosocial factors may increase an individual's risk for developing these disorders. The interaction of biological (e.g., hormonal) and psychological changes in adolescence likely influences the development of these disorders, because the majority of individuals experience the onset of an eating disorder near puberty and a greater proportion of young women are affected by eating disorders. In addition, social influences, such as peers, can affect beliefs about shape and weight or dieting (Jones & Crawford, 2006), and cultural influences, including the influence of mass media, can produce increases in body dissatisfaction and eating disturbances (Becker, Burwell, Gilman, Herzog, & Hamburg, 2002). Genetic factors may also predispose individuals to the development of eating disorders; however, no specific genes have been consistently identified in patients with eating disorders. Thus, biological, social, cultural, genetic, and other variables likely influence the etiology of eating disorders, but it is not known whether different factors are responsible for the development of these disorders and the maintenance of symptoms, or whether an interaction of these factors is a better explanation.

Comorbid psychiatric diagnoses are common among treatment-seeking patients with eating disorders. Prevalence rates of a lifetime anxiety disorder range from approximately 33% to 72% of patients with AN-R, 55%


of patients with AN-​B/​P, 41% to 75% of patients with BN (Godart, Flament, Perdereau, & Jeammet, 2002), and 29% of patients with BED (Wilfley, Friedman, et  al., 2000). Rates of lifetime major depressive disorder range from 9.5% to 64.7% of patients with AN-​R, 50% to 71.3% of patients with AN-​B/​P, 20% to 80% of patients with BN (Godart et al., 2007), and 58% of patients with BED (Wilfley, Friedman, et al., 2000). Some data suggest that comorbid depressive symptoms improve with successful treatment, as statistically significant improvements in mood symptoms have been observed among inpatients with AN receiving nutritional rehabilitation and psychotherapy after weight restoration (Meehan, Loeb, Roberto, & Attia, 2006) and patients with BN after treatment with cognitive–​behavioral therapy (CBT; Wilson & Fairburn, 2002). The prognosis for individuals with eating disorders varies across diagnostic categories. Anorexia nervosa has the highest mortality rate of all psychiatric disorders, and few treatments, psychological or pharmacological, have been found to be particularly effective for patients with AN (Zipfel, Giel, Bulik, Hay, & Schmidt, 2015). One study reported the expected long-​term course and outcome of patients with AN as follows: 27.5% experienced a good outcome, 25.3% had an intermediate outcome, 39.6% had a poor outcome, and 7.7% had died (Fichter, Quadflieg, & Hedlund, 2006). Individuals with AN who are younger or receive treatment after a short duration of illness may experience better treatment outcomes compared to adults with a longer course of illness (Forsberg & Lock, 2015; Herpertz-​Dahlmann et al., 2001). Two forms of treatment, CBT and antidepressant medication, have been found to be helpful for the treatment of BN. Patients treated with CBT typically experience a reduction in binge eating and purging of 80% or more, and approximately 30% of patients are abstinent from binge eating and purging at the end of treatment (National Institute for Clinical Excellence, 2004). Antidepressant medications are consistently superior to placebo in pharmacological treatment studies for BN, and median reductions of up to 70% have been observed for symptoms of binge eating and vomiting (Agras, 1997; Bacaltchuk & Hay, 2003; Shapiro et al., 2007). The selective serotonin reuptake inhibitor fluoxetine is the only drug approved by the U.S. Food and Drug Administration for the treatment of bulimia nervosa, with the most effective dose identified as 60 mg/​day (Fluoxetine Bulimia Nervosa Collaborative Study Group, 1992). Despite the availability of effective treatments, the symptoms of BN can be chronic for some individuals, with recovery rates (not fulfilling diagnostic

criteria of any eating disorder) of approximately 50% for both DSM-​5 and DSM-​IV BN documented after 6 years of follow-​up (Castellini et  al., 2011)  and approximately 30% of patients with BN experiencing recurrent episodes of binge eating and purging more than a decade after presentation for their disorder (Keel, Mitchell, Miller, Davis, & Crow, 1999). Research suggests more encouraging outcomes among patients with BED because reductions in binge eating are observed in response to a variety of treatments (CBT, interpersonal psychotherapy, and behavioral weight loss; Wilson, Wilfley, Agras, & Bryson, 2010), which are maintained over at least 1  year (Ricca et  al., 2001; Wilson et al., 2010). CBT is currently considered to be the treatment of choice for BED. Although psychological treatments for BED have been shown to successfully reduce binge eating and associated psychological symptoms, no significant degree of weight loss is observed among these patients either in the short term or the long term (Wilson et al., 2010; Wonderlich, de Zwaan, Mitchell, Peterson, & Crow, 2003). For individuals with BED, a majority of whom are overweight or obese, this failure to achieve weight loss can be associated with significant morbidity and mortality (National Task Force on the Prevention and Treatment of Obesity, 2000).

PURPOSES OF ASSESSMENT

Regardless of the particular purpose for a clinical evaluation, assessments for eating disorders must consider the wide range of symptoms experienced by patients with AN, BN, BED, and residual forms of eating disorders. These symptoms can include restraint over eating; binge eating and purging; concerns about shape and weight; and obsessions and compulsions about food, eating, shape, and weight. In the following paragraphs, a number of the challenges involved in accurately and fully assessing an individual with an eating disorder are described. The assessment of binge eating, which is a core eating disturbance experienced by individuals with AN-​B/​P, BN, and BED, is perhaps the most difficult construct to measure accurately. To receive a diagnosis of BN or BED that is consistent with DSM-​5, an individual must describe binge episodes in which he or she consumes an objectively large amount of food and experiences a sense of loss of control over eating (objective bulimic episode [OBE]; see the description of the Eating Disorder Examination presented later). This definition of binge eating was partially derived from eating behavior experiments conducted in


laboratory settings, in which patients with BN were asked to binge eat and were provided with a large multi-item meal. Patients demonstrated a significant disturbance in the total amount of calories consumed during the binge episode (mean of between 3,352 and 4,477 kcal), as opposed to a specific type of food or a specific macronutrient group (Kissileff, Walsh, Kral, & Cassidy, 1986; Walsh, Kissileff, Cassidy, & Dantzic, 1989). Similarly, other studies have observed disturbances in total consumption during a binge episode for individuals with BED, although BED patients generally consume fewer calories than do patients with BN (Walsh & Boudreau, 2003). Thus, although laboratory data help provide some objective measure of binge eating, there are no explicit criteria for the amount of food needed to constitute a binge episode in DSM-5 (e.g., consumption of 1,500 calories per sitting). As a result, many of the assessments described in this chapter employ standards stemming from judgments of experts in the field (e.g., Eating Disorder Examination), the judgment of the interviewer (e.g., Structured Clinical Interview for DSM-5), or the self-report of the patient (e.g., Eating Disorder Diagnostic Scale). The way in which an instrument assesses binge eating is described throughout the chapter so that the reader can weigh the advantages and disadvantages of each method for determining the presence or absence of binge episodes.

In addition, caution should be exercised when selecting measures to use for eating disorders. Many measures of eating disorder symptoms or body image do not assess fundamental behavioral disturbances, such as binge eating, or address issues relevant to men (Thompson, Roehrig, Cafri, & Heinberg, 2005). Men may report that commonly used measures ask questions that are not applicable to their experience of shape or weight disturbance (e.g., "Have you felt excessively large and rounded?"), and readers are directed to measures developed specifically for the purpose of assessing men with eating disorders, including the Male Body Attitudes Scale (Tylka, Bergeron, & Schwartz, 2005), the Muscle Dysmorphic Disorder Inventory (Hildebrandt, Langenbucher, & Schlundt, 2004), and the Muscle Dysmorphia Inventory (Rhea, Lantz, & Cornelius, 2004). Because this chapter focuses on the routine clinical assessment of eating disorders, and the prevalence of eating disorders among men is low, these measures are not covered here in detail. Second, professionals utilizing the measures described in this chapter should be aware that not all assessments are validated for children or adolescents (Thompson et al., 2005), and care should be taken to utilize assessments


that have been developed specifically for children or adolescents for younger individuals. A  comprehensive review of the diagnosis of feeding and eating disorders in children and adolescents is available (Schvey, Eddy, & Tanofsky-​Kraff, 2015). The assessments described in the following three sections focus specifically on eating disorder symptoms. Other features have been shown to be either risk factors for the development of eating disorders or symptoms otherwise associated with eating disorders, including perfectionism, body dissatisfaction, exercise, impulse regulation, and thin-​ideal internalization. Instruments are available that measure these constructs and can be employed along with other measures (e.g., Body Esteem Scale [Franzoi & Shields,  1984], Eating Disorder Inventory-​2 [Garner,  1991], Eating Pathology Symptoms Inventory [Forbush et  al.,  2013], Ideal-​ Body Stereotype Scale-​ Revised [Stice & Bearman,  2001], and Satisfaction and Dissatisfaction with Body Parts Scale [Berscheid, Walster, & Bohmstedt, 1973]). Clinicians may also be interested in measuring the general psychological or psychosocial functioning of eating disorder patients, interpersonal functioning, or the patient’s family context. A  number of treatment studies (Agras, Crow, et al., 2000; Agras, Walsh, Fairburn, Wilson, & Kraemer, 2000; Halmi et  al., 2005; Wilfley et  al., 2002) assessed general psychological functioning pre-​and post-​treatment using the Social Adjustment Scale (SAS; Weissman & Bothwell, 1976), others (Agras, Walsh, et al., 2000; Walsh, Fairburn, Mickley, Sysko, & Parides, 2004; Wilfley et  al., 2002)  employed the Symptom Checklist (53 or 90 items; Derogatis, Lipman, & Covi, 1973), and yet others (Walsh et  al., 2006)  used the Quality of Life Enjoyment and Satisfaction Questionnaire (Endicott, Nee, Harrison, & Blumenthal, 1993). Interpersonal functioning has been frequently measured using the Inventory of Interpersonal Problems (IIP; Horowitz, Rosenberg, Baer, Ureno, & Villasenor, 1988) in treatment studies for BN (Agras, Crow, et al., 2000; Agras, Walsh, et al., 2000; Carter et al., 2003) or BED (Devlin et al., 2005; Wilfley et  al., 2002). Studies of the Maudsley form of family therapy (Lock, 2015; Lock, Agras, Dare, & Le Grange, 2002), a useful treatment for adolescents with AN, have measured family functioning in a number of ways, including the Standardized Clinical Family Interview (Kinston & Loader, 1984, as cited in Le Grange, Eisler, Dare, & Russell, 1992), video recordings of interviews to rate expressed emotion (Vaughn & Leff, 1976, as cited in Le Grange et  al., 1992), and the Family Adaptability and Cohesion Evaluation Scales (Olson, Sprenkle, & Russell,


1979; Olson, Portner, & Lavee, 1985, as cited in Le Grange et al., 1992), the Family Environment Scale (Moos, 1974; Moos & Moos, 1994, as cited in Lock, Agras, Bryson, & Kraemer, 2005), the Parent Adolescent Relationship Questionnaire (Robin, Koepke, & Moye, 1990, as cited in Robin et al., 1999), and the Family Assessment Device (Epstein, Baldwin, & Bishop, 1983, as cited in Gowers et al., 2007, and Le Grange et al., 2016). Although data on the aforementioned related constructs can provide useful information about patients with eating disorders, the most salient aspect of diagnosis, case conceptualization and treatment planning, and treatment outcome is knowledge of a patient’s eating disorder symptoms. Only the assessment of specific eating disorder symptoms can generate DSM-​5 diagnoses, and the diagnosis assigned to a given patient subsequently helps determine the most efficacious treatments for that patient. The research evaluating eating disorder treatments to date has stratified patients on the basis of their diagnosis; therefore, these studies allow clinicians to use empirically supported treatments in routine practice when eating disorder symptoms are measured.
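As a concrete illustration of the two judgments discussed in this section, whether an episode involves an objectively large amount of food together with a sense of loss of control (an OBE), and whether episodes occur, on average, at least once weekly over roughly 3 months (the DSM-5 frequency threshold for BN and BED noted earlier), the sketch below shows the bare logic. It is our own minimal example, not part of any instrument described in this chapter: the EatingEpisode fields, the function names, and the 13-week window are assumptions, and deciding what counts as "objectively large" remains exactly the problem described above (expert standards, interviewer judgment, or patient self-report).

```python
from dataclasses import dataclass

@dataclass
class EatingEpisode:
    objectively_large: bool  # judged against an external standard, by the interviewer, or by self-report
    loss_of_control: bool    # the reported sense of loss of control over eating

def is_objective_bulimic_episode(episode: EatingEpisode) -> bool:
    # An OBE requires BOTH an objectively large amount of food and loss of control.
    return episode.objectively_large and episode.loss_of_control

def meets_weekly_frequency(weekly_obe_counts) -> bool:
    # On average at least one episode per week across a roughly 13-week (3-month) window.
    if len(weekly_obe_counts) < 13:
        return False
    return sum(weekly_obe_counts) / len(weekly_obe_counts) >= 1.0

episodes = [EatingEpisode(True, True), EatingEpisode(False, True), EatingEpisode(True, False)]
print([is_objective_bulimic_episode(e) for e in episodes])  # [True, False, False]

weekly_counts = [1, 2, 1, 1, 3, 1, 0, 2, 1, 1, 1, 2, 1]
print(meets_weekly_frequency(weekly_counts))  # True (average of about 1.3 episodes per week)
```

In practice, these judgments come from the interviews and questionnaires reviewed in the next section rather than from simple episode counts alone.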

ASSESSMENT FOR DIAGNOSIS

In this section, we focus on assessment tools used to formulate eating disorder diagnoses, including AN, BN, BED, or OSFED/UFED. One important measurement issue relevant to the diagnosis of eating disorders, regardless of assessment method, is the measurement of body weight. Weight is crucial in differentiating between diagnoses such as AN-B/P versus BN because similar bulimic symptoms are present in both disorders. To assign the diagnosis of AN-B/P, the individual must be at a significantly low weight, which requires obtaining the patient's weight and, subsequently, using either tables of ideal body weight (e.g., Metropolitan Life Insurance, 1959) or a calculation of body mass index (BMI; weight in kg/height in m²), or BMI percentile in the case of younger patients (https://nccd.cdc.gov/dnpabmi/calculator.aspx). Thus, in conjunction with any data from interviews or self-report questionnaires, a measurement of weight must be obtained to assign an accurate eating disorder diagnosis.
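The BMI arithmetic referred to above (weight in kilograms divided by the square of height in meters) is simple enough to show in a few lines. This is a minimal sketch of our own; the function name and example values are illustrative, and, as noted above, deciding whether a given value reflects a "significantly low" weight still requires judgment about age, sex, and developmental trajectory, with BMI percentiles rather than raw BMI used for younger patients.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Compute BMI as weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / (height_m ** 2)

# Hypothetical example: an adult weighing 47 kg at a height of 1.65 m.
bmi = body_mass_index(47.0, 1.65)
print(f"BMI = {bmi:.1f} kg/m^2")  # prints "BMI = 17.3 kg/m^2"; interpretation requires clinical judgment
```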
Both semi-structured interviews and self-report questionnaires are available for use in formulating a diagnosis. Perhaps the most commonly used assessment instrument is the Eating Disorder Examination (EDE, current version 17.0D; Fairburn, Cooper, & O'Connor, 2014), a semi-structured interview that is considered to be the "gold standard" of measurement for eating disorders (Wilson, 1993). The EDE provides a comprehensive description of the psychopathology associated with AN, BN, and BED; allows for DSM-5 eating disorder diagnoses to be assigned; and is publicly available (http://www.credo-oxford.com/pdfs/EDE_17.0D.pdf). The format of the EDE is assessor based, such that consistent scoring of the EDE items is achieved by a synthesis of information provided by the interviewee and the assessors' understanding of the terms and constructs as defined by the assessment (Fairburn & Cooper, 1993; Fairburn et al., 2014; Wilson, 1993). To become proficient in administering the interview, it is necessary to complete comprehensive training. Training includes mastery of the interview format, co-rating interviews, and receiving supervision from an individual previously trained in administering the EDE. Although methods for completing EDE training vary and there is no standardized protocol, training potential assessors on randomized controlled trials (e.g., Wilson et al., 2010) and at large academic centers (e.g., Columbia University Medical Center) required approximately 20 to 30 hours of (a) watching training videos (~8 hours), (b) carefully reading the EDE interview and chapter describing administration (~2 hours), (c) co-rating existing taped EDE interviews and reviewing with a supervisor (~3–6 hours), (d) observing approximately three EDEs (~3 hours), (e) completing one to three EDEs with a trained observer (~1–3 hours), (f) conducting an independent EDE and receiving feedback from a trained interviewer (~1–3 hours), and (g) any necessary remediation (~1–5 hours). Trained EDE interviewers make determinations about the severity of eating disorder symptoms and rate the amounts of food that qualify for different types of overeating (see the later discussion of four types of overeating), which is particularly important for diagnosing BN and BED.

Because the EDE assesses eating disorder symptoms over a significant period of time (3 or 6 months), the Timeline Followback (TLFB) method is used. For the EDE TLFB, the interviewer presents the patient with a calendar showing the 3- or 6-month period covered by the EDE and, with the patient, identifies events in each month that might have disrupted the patient's normal eating routine and other notable events (e.g., vacations, birthdays, and parties). These events are written on the calendar so that the patient can refer back to them throughout the interview. The TLFB procedure was originally developed to retrospectively measure alcohol consumption (Maisto, Sobell, Cooper, & Sobell, 1982),


and it helps orient patients to the time period being assessed and provides contextual information during the interview. The EDE has four subscales (Restraint, Eating Concern, Shape Concern, and Weight Concern) and a global score, and it includes items with either frequency or severity ratings. Severity items on the EDE are rated on a scale from 0 to 6, where a 1 is assigned if the feature is “barely present,” a 5 is assigned when the symptom does not qualify for the most severe rating (6), and a 3 is used as the midpoint between 0 and 6 (Fairburn & Cooper, 1993). Four different types of overeating are assessed by the EDE, including (a)  objective bulimic episodes, or the consumption of an objectively large amount of food while experiencing a sense of loss of control; (b) subjective bulimic episodes, or experiencing loss of control while consuming smaller amounts of food that are viewed by the individual as excessive; (c) objective overeating, or eating an objectively large amount of food without loss of control; and (d)  subjective overeating, or eating a small amount of food without a sense of loss of control, which the individual believes is excessive (Fairburn & Cooper, 1993). The designation of an amount of food that constitutes an “objectively large” amount of food during a binge episode is determined by the EDE interviewer; however, an appendix to the EDE was developed by experts in the field to standardize amounts constituting OBEs. For example, the consumption of two full meals (each with two or more courses), or three main courses (e.g., three Big Macs), or more than 1 pint of ice cream, or five donuts would all be considered large when rating OBEs on the EDE. A version of the EDE suitable for the assessment of children and adolescents (child EDE; ChEDE) has been developed by Bryant-​Waugh, Cooper, Taylor, and Lask (1996), and a few studies have evaluated this form of the



measure (e.g., Decaluwe & Braet, 2004; Glasofer et  al., 2007; Hilbert et  al., 2013; Tanofsky-​Kraff et  al., 2003; Watkins, Frampton, Lask, & Bryant-​Waugh, 2005). To make the assessment more appropriate for younger children, the ChEDE uses modified language and a sort task to evaluate the importance of shape and weight (Bryant-​ Waugh et al., 1996). Training in the use of the ChEDE to diagnose eating disorders among younger patients (aged 7–​ 14  years) has been described (Tanofsky-​ Kraff et  al., 2007). After BED was added to DSM-​5, recent versions of the ChEDE now include BED as a diagnostic category (see Schvey et al., 2015). Whereas the EDE is used in numerous treatment studies to diagnose patients with eating disorders, only a few studies have examined the validity of diagnoses generated by EDE. The EDE does successfully distinguish between patients with BN and individuals without BN who are preoccupied with shape and weight (Wilson & Smith, 1989). Comparisons of the diagnoses generated by the EDE and the self-​report version of the EDE (EDE-​Q) are described in the following paragraphs. Summary psychometric data for the use of the EDE for diagnostic purposes are provided in Table 24.1; comparable data on the EDE for other assessment purposes are reported in subsequent tables. Berg, Peterson, Frazier, and Crow (2012) offer an excellent review of data pertinent to the EDE, and in addition to articles referenced previously, data for ratings made in Table 24.1 were obtained from the following studies: Grilo, Masheb, Lozano-​Blanco, and Barry (2004); Jennings and Phillips (2017); Rizvi, Peterson, Crow, and Agras (2000); Rosen, Vara, Wendt, and Leitenberg (1990); and Wilfley, Schwartz, Spurrell, and Fairburn (2000). Despite the common use of the instrument in research settings, there are many obstacles for administering the EDE in routine clinical practice. Clinicians may not have completed the extensive training required to administer

TABLE 24.1 Ratings of Instruments Used for Diagnosis

EDE, versions 12–17: Norms = G; Internal Consistency = A; Inter-Rater Reliability = E; Test–Retest Reliability = A; Content Validity = E; Construct Validity = A; Validity Generalization = E; Clinical Utility = A; Highly Recommended = ✓
SCID-IV: Norms = A; Internal Consistency = NA; Inter-Rater Reliability = G; Test–Retest Reliability = L; Content Validity = E; Construct Validity = A; Validity Generalization = E; Clinical Utility = A; Highly Recommended = ✓
EDA-5: Norms = NA; Internal Consistency = NR; Inter-Rater Reliability = NR; Test–Retest Reliability = A; Content Validity = G; Construct Validity = A; Validity Generalization = A; Clinical Utility = NR
EDE-Q (version 4.0 or 6.0): Norms = E; Internal Consistency = G; Inter-Rater Reliability = NA; Test–Retest Reliability = L; Content Validity = E; Construct Validity = NR; Validity Generalization = E; Clinical Utility = A
EDDS for DSM-IV: Norms = G; Internal Consistency = G; Inter-Rater Reliability = NA; Test–Retest Reliability = A; Content Validity = G; Construct Validity = A; Validity Generalization = A; Clinical Utility = A



Note: EDE = Eating Disorder Examination; SCID-​IV = Structured Clinical Interview for DSM-​IV; EDA-​5 = Eating Disorder Assessment for DSM-​5; EDE-​Q = Eating Disorder Examination Questionnaire; EDDS = Eating Disorder Diagnostic Scale; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.
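The reliability figures in Table 24.1 and in the discussion that follows (for example, the κ of .77 reported for inter-rater agreement on SCID-IV eating disorder diagnoses, and the κ values of .74 and .87 reported for the EDA-5) are kappa coefficients, which adjust raw diagnostic agreement for the agreement expected by chance. As a rough illustration only, using invented data rather than data from any of the cited studies, kappa can be computed from two sets of diagnoses as follows.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    chance = sum(freq_a[c] * freq_b[c] for c in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - chance) / (1 - chance)

# Invented example: two interviewers assign DSM-5 diagnoses to the same ten patients.
rater_1 = ["AN", "BN", "BED", "BED", "OSFED", "AN", "BN", "BED", "UFED", "AN"]
rater_2 = ["AN", "BN", "BED", "BN",  "OSFED", "AN", "BN", "BED", "UFED", "AN"]
print(round(cohens_kappa(rater_1, rater_2), 2))  # 0.87 for these made-up ratings
```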


the EDE, and the amount of time needed to administer the EDE (~1 or 2 hours) is significant, which makes the EDE less practical for private practice settings. However, the EDE can be clinically useful for developing a detailed understanding of a range of eating disorder symptoms (Wilson, 1993), and it is also helpful for case conceptualization and monitoring of treatment outcome—​issues that are discussed later. The Structured Clinical Interview for DSM-​ 5 (SCID-​5; First, Williams, Karg, & Spitzer, 2015)  provides an updated means for assessing DSM-​5 diagnostic criteria for AN, BN, BED, and OSFED. For this version of the interview, the SCID-​5 is available exclusively for purchase through American Psychiatric Publishing, and psychometric data are not currently available for eating disorders. The SCID-​IV (SCID; First, Spitzer, Gibbon, & Williams, 2002) was used in numerous studies of eating disorders, in some cases to diagnose comorbid Axis I psychopathology (e.g., Agras, Crow, et al., 2000) and in others to provide eating disorder diagnoses (e.g., Engel et  al., 2005; Grilo & Masheb, 2005). When using the SCID, clinicians must determine what constitutes a large amount of food to classify binge eating episodes. Although the SCID-​ IV was employed frequently in research on eating disorders, there are limited data specifically examining the psychometric properties of the version of the instrument for DSM-​IV diagnoses. Previous versions of the SCID (SCID for DSM-​III-​R; APA, 1987)  found acceptable kappa coefficients and test–​retest reliabilities for the eating disorder modules (Segal, Hersen, & van Hasselt, 1994). The data for the inter-​rater reliability of DSM-​IV eating disorder diagnoses are consistent with those of previous versions, with a good κ value (.77); however, the test–​retest reliability estimates for DSM-​IV eating disorder diagnoses (correlation = .64) are not consistent with a rating of acceptable (minimum correlation over several days or weeks = .70; Zanarini et  al., 2000). The prior version of the SCID-​ IV does have at least acceptable psychometric data for norms, and construct validity for eating disorder diagnoses, and is thus included in Table 24.1. The SCID, like the EDE, requires significant training for interviewers (~20–​30 hours), and it can be time-​ consuming to complete the entire instrument. However, the SCID only assesses the diagnostic criteria for eating disorders and not associated eating disorder pathology. As such, administering only the SCID eating disorder modules to generate diagnoses is not particularly time-​ consuming, but given the limited availability and notable burden of training in the SCID, and cost of the

instrument, it is unlikely that the measure is suitable for routine community-​based practice. The Eating Disorder Assessment (EDA-​5; Sysko et al., 2015)  is an electronic assessment (available for free at http://​www.eda5.org) developed to focus specifically on the comprehensive assessment of all DSM-​5 feeding and eating disorders. The EDA-​5 diverges in several important ways from previously mentioned interview-​ based measures. In contrast to the EDE, which requires extensive training and an extended amount of time to administer, the EDA-​5 can be administered with limited training and in a brief time period to reduce participant burden. The SCID-​5 (First et al., 2015) assesses the presence of an eating disorder, but it does not evaluate pica or rumination disorder, and the module on ARFID is optional; it also fails to precisely determine the individual’s BMI or frequencies of a range of behavioral disturbances, such as objective and subjective binge eating episodes (Glasofer, Sysko, & Walsh, 2015). Two studies examined the utility of the EDA-​5 in treatment-​seeking adults across multiple sites (for details, see Sysko et  al., 2015). The first study compared the diagnostic validity of the EDA-​5 to the EDE and measured test–​retest reliability of diagnoses from the new measure. The EDA-​5 and the EDE showed high rates of agreement (κ = .74 across diagnoses, n = 64), with a range of κ = .65 for OSFED/​UFED to κ = .90 for BED. For a random subgroup of participants, a new interviewer readministered the EDA-​5 at 7 to 14  days following the first assessment. Across diagnoses, the test–​retest κ coefficient was .87, and diagnostic agreement was achieved in 19 of 21 cases (90.5%). Because feedback from interviewers highlighted the complexity of the interview’s skip rules, an electronic application (“app”) version of the EDA-​5 was created. A second study compared the EDA-​5 app to clinician interview and found a high rate of agreement between diagnosis by EDA-​5 and clinician interview (κ = .83 across diagnoses, n = 71). Across individual diagnostic categories, κ ranged from .56 for OSFED/​UFED to .94 for BED. The EDA-​5 required significantly less time to complete than the EDE, and the app version of the EDA-​5 significantly shortened the length of time needed to administer the interview, from an average of 19.3 ± 5.6 minutes (range, 5–​34 minutes) to 14.0 ± 6.2 minutes (range, 5–​30 minutes). Given the encouraging results of these preliminary investigations, the EDA-​5 is included in Table 24.1; however, further validation and replication studies are warranted. Several self-​ report questionnaires are also available to provide clinicians with a means for assigning eating disorder diagnoses; however, limited information is


available about updated DSM-​5 versions of these assessments and the psychometrics relevant to diagnosis with updated instruments. The most commonly used measures were developed in an effort to generate eating disorder diagnoses while circumventing the need for costly or time-​ consuming interviews (Stice, Telch, & Rizvi, 2000). Although self-​report measures are brief and do not require specific clinician training, there are also some issues that clinicians should consider before utilizing these assessments, including the need to obtain scoring algorithms and the costs of the questionnaires (Peterson & Mitchell, 2005). The Eating Disorder Examination Questionnaire (EDE-​ Q, current version 6.0; http://​www.credo-​oxford. com/​pdfs/​EDE-​Q_​6.0.pdf; Fairburn & Beglin, 2008)  is a 38-​item self-​report version of the EDE designed to be completed in 15 minutes. Similar to the EDE, the EDE-​ Q includes four subscales (Restraint, Eating Concern, Shape Concern, and Weight Concern) and uses a combination of frequency items (e.g., objective bulimic episodes and vomiting) and severity items rated on a scale of 0 to 6 to assess the 28-​day period before the completion of the questionnaire (Fairburn & Beglin, 2008). A child version of the EDE-​Q has also been developed (ChEDE-​ Q; Decaluwe, 1999). The EDE-​Q is listed in Table 24.1, although the data on test–​retest reliability are generally less than acceptable, with some variability (e.g., Rose, Vaewsorn, Rosselli-​Navarra, & Wilson, 2013). However, research on this questionnaire has found it to demonstrate acceptable treatment sensitivity and clinical utility; good internal consistency; and excellent norms, content validity, and validity generalization. Several studies compared the EDE and EDE-​Q for diagnosis, including for DSM-​IV AN in comparison to clinical interview, which was considered to be the standard for diagnosis (Wolk, Loeb, & Walsh, 2005). By clinical interview, 100% of patients were diagnosed with AN and 66.7% of these patients were diagnosed with AN-​B/​P, with corresponding percentages for the EDE and EDE-​Q of 71.7% and 86.7% of patients diagnosed with AN and 79% and 71% of the subsample diagnosed with AN-​B/​P, respectively (Wolk et al., 2005). Because all of the patients in the study met criteria for low weight and amenorrhea, the authors indicated that the discrepancies between the diagnosis of AN with the EDE and that with the EDE-​ Q were related to the severity items, and specifically Criterion B of the AN diagnostic criteria, or the fear of gaining weight or becoming fat (Wolk et al., 2005). Thus, to better evaluate this criterion, clinicians interested in using either the EDE or the EDE-​Q to diagnose AN


should consider gathering additional data from patients about the fear of gaining weight or becoming fat. The EDE-​Q should not be used as the only method of diagnosing BN. In a study of women seeking treatment for substance abuse, the EDE-​Q underassessed the rate of BN when strict DSM-​IV criteria were applied, but it overdiagnosed individuals as having BN when the criteria were slightly relaxed (Black & Wilson, 1996). More recently, several studies (Berg, Peterson, et  al., 2012; Mancuso et  al., 2015)  utilized the EDE/​ EDE-​ Q or EDE-​ Q to compare DSM-​IV to DSM-​5 criteria, and they suggested that more individuals met diagnostic criteria for a full-​ threshold eating disorder under DSM-​5 criteria, resulting in a reduction of the relative prevalence of residual eating disorder diagnoses. However, the EDE-​Q does not evaluate several of the diagnostic criteria for DSM-​5 (e.g., lack of recognition of the seriousness of low body weight; Mancuso et al., 2015), and there is moderate diagnostic concordance between the EDE and EDE-​Q (Berg, Stiles-​ Shields, et al., 2012), which limits the ability to use the measure for the purpose of diagnosis. The Eating Disorder Diagnostic Scale (EDDS; Stice et  al., 2000)  is a 22-​item self-​report scale that can generate possible diagnoses for AN, BN, and BED and an overall composite score for eating disorder symptoms. The EDDS was developed for the purposes of diagnosing eating disorders in etiological research, for use in research that requires frequent measurements, or for identification of individuals with eating disorders in clinical practice (e.g., primary care; Stice et al., 2000). It includes questions rated on a Likert scale, dichotomous response questions, questions about symptom frequency, and open-​ended questions. In addition, to improve on some of the difficulties inherent in assessing binge eating by self-​report, the EDDS does not include the word “binge”; instead, binge eating is described solely in behavioral terms (Peterson & Mitchell, 2005). Two studies have examined the reliability and validity of the EDDS scores for DSM-​IV diagnoses (Stice et  al., 2000; Stice, Fisher, & Martinez, 2004), other studies have used the EDDS to measure treatment sensitivity (Stice, Orjada, & Tristan, 2006; Stice & Ragan, 2002), and information about psychometric properties of the scale is provided in Table 24.1. The EDDS is psychometrically sound for DSM-​IV and appropriate for clinical practice. The measure is brief and can be completed quickly; therefore, the EDDS can also be helpful in evaluating patients with other psychiatric disorders where eating disorders are likely to co-​occur (e.g., major depression, anxiety disorders, and substance use disorders). A revised version of the original EDDS was


developed to reflect the diagnostic changes for eating disorders in DSM-5, but it has not yet been validated.
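The diagnostic agreement and test–retest figures cited for these interviews (e.g., the κ values reported for the EDA-5 and the SCID) are Cohen's kappa coefficients, which correct raw percent agreement for the agreement expected by chance. The brief Python sketch below, using entirely hypothetical diagnoses for 20 patients, illustrates how such a coefficient is computed; it is offered only as an illustration of the statistic, not as part of any instrument's scoring procedure.

from collections import Counter

# Hypothetical diagnoses assigned by two interviewers to the same 20 patients.
rater_1 = ["AN", "AN", "BN", "BN", "BN", "BED", "BED", "BED", "AN", "BN",
           "BED", "AN", "BN", "BED", "AN", "BN", "BED", "AN", "BN", "BED"]
rater_2 = ["AN", "AN", "BN", "BN", "BED", "BED", "BED", "BED", "AN", "BN",
           "BED", "BN", "BN", "BED", "AN", "BN", "BED", "AN", "AN", "BED"]

n = len(rater_1)

# Observed proportion of cases on which the two raters agree.
p_observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n

# Chance agreement: product of each rater's marginal proportions, summed over categories.
counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
p_chance = sum((counts_1[c] / n) * (counts_2[c] / n) for c in set(rater_1) | set(rater_2))

kappa = (p_observed - p_chance) / (1 - p_chance)
print(f"Observed agreement = {p_observed:.2f}, chance agreement = {p_chance:.2f}, kappa = {kappa:.2f}")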

Overall Evaluation

Additional information about the assignment of DSM-5 diagnoses is needed before an instrument can be recommended. One assessment, the EDA-5, has some psychometric data, and two other tools, the EDE and EDDS, have consistently strong supporting psychometric data for use in the diagnosis of DSM-IV eating disorders. However, clinicians should consider confirming diagnoses generated by the EDDS to ensure that the patient has binge eating episodes that satisfy DSM criteria. Other measures, such as the SCID and EDE-Q, are widely used in eating disorders research, but test–retest reliability estimates are not adequate. Because of their ease of use, self-report measures hold some promise for being used in clinical settings, but current psychometric evidence is limited. The scarcity of instruments with adequate test–retest reliability and possible reasons for difficulties in assessing eating disorder symptoms over time are discussed in the conclusions/future directions section.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Assessment instruments can also provide clinically meaningful information for clinicians to guide case conceptualization and treatment planning. Some of the aforementioned instruments (e.g., EDE and EDE-Q) measure a broad spectrum of symptoms, which can allow the clinician to determine the severity of a patient's eating disorder. A significant amount of treatment planning is dependent on the eating disorder and type of treatment to be delivered; therefore, careful consideration must be given to the choice of assessments for this purpose. In addition, as described previously, many patients with eating disorders experience comorbid disorders. As such, screening assessments for depression, anxiety, and substance use should also be considered for eating-disordered patients. The data from these measures and the presence or absence of co-occurring psychiatric symptoms should then be used in the development of a case formulation. Readers are encouraged to refer to the chapters on assessments for depression, anxiety, and substance abuse in this volume to determine the most clinically relevant and psychometrically sound measures to evaluate comorbid symptoms.

The measurement of body weight is an essential element of case conceptualization and treatment planning because, as described previously, body weight differentiates patients with AN-B/P from individuals with BN, and it informs clinicians about the type of treatment that will be most effective. For example, for individuals with BN, fluoxetine at 60 mg is an effective treatment, and it produces significant reductions in binge eating and purging behaviors (Fluoxetine Bulimia Nervosa Collaborative Study Group, 1992; Goldstein, Wilson, Thompson, Potvin, & Rampey, 1995). Conversely, no significant benefits have been observed for individuals with AN-B/P receiving fluoxetine at 60 mg in comparison to placebo for the acute treatment of AN (Attia, Haiman, Walsh, & Flater, 1998) or for preventing relapse (Walsh et al., 2006).

Patients with AN, BN, or BED should be referred for a medical evaluation before the start of treatment and at regular intervals throughout the course of treatment. The medical assessments described here are based on the recommendations of experts in the field (e.g., Crow & Swigart, 2005; National Task Force on the Prevention and Treatment of Obesity, 2000). Patients at a low weight (e.g., AN-R and AN-B/P) should receive a complete blood count, an electrolyte battery, an electrocardiogram, liver function tests, and a dual-energy X-ray absorptiometry (DEXA; Crow & Swigart, 2005) to evaluate risk for complications associated with low body weight (e.g., low heart rate, hypotension, and hyponatremia; Commission on Adolescent Eating Disorders, 2017). Inpatient treatment may be necessary for individuals with AN at a low weight in order to restore body weight and allow for the close monitoring of medical complications that may emerge during the refeeding process. For patients with binge eating and purging behaviors (e.g., AN-B/P and BN), an electrolyte battery and a dental evaluation should be completed because individuals who purge are at risk for electrolyte disturbances, including potassium depletion (Crow & Swigart, 2005). The majority of individuals presenting with BED are at a weight classified as overweight (BMI > 25 kg/m2) or obese (BMI > 30 kg/m2). As such, patients with BED should be assessed for the serious medical sequelae (e.g., type 2 diabetes) associated with higher body weights as outlined by the National Task Force on the Prevention and Treatment of Obesity (2000).

In this chapter, assessments for treatment planning and case conceptualization for individuals with eating disorders are discussed in the context of empirically supported treatments, specifically CBT. Continuing


assessment and evaluation of progress throughout treatment are essential components of CBT because adjustments can be made by the clinician based on the data provided by the measures. Research studies of CBT for BN (Fairburn, Marcus, & Wilson, 1993; Fairburn, 2008) have used assessments not only to guide treatment but also to better understand mechanisms of change during treatment. Thus, the remainder of this section presents measures that can be employed during the delivery of CBT for eating disorders. As described previously, the EDE and EDE-​Q measure a wide range of eating disorder symptoms. These instruments are particularly helpful in case conceptualization for CBT because they assess dietary restraint, bulimic behaviors (binge eating and purging), and shape and weight concerns, all of which are targets of CBT (see Table 24.2 for summary psychometric ratings). The utility of the EDE and EDE-​Q is particularly relevant in the delivery of CBT for BN, where patients need to eliminate binge eating and purging behaviors, establish a pattern of regular eating, identify alternative activities, and learn problem-​solving strategies. In addition, dietary restraint is addressed through the development of regular eating and exposure to forbidden foods, and shape and weight concerns are targeted through cognitive restructuring and behavioral experiments. Research has demonstrated that the reduction in dietary restraint as early as the fourth week of CBT for BN mediates post-​ treatment reductions in binge eating and vomiting (Wilson, Fairburn, Agras, Walsh, & Kraemer, 2002), and change in purging behavior after 4 weeks of CBT predicts symptom levels at 8-​month follow-​up (Fairburn, Agras, Walsh, Wilson, & Stice, 2004). Thus, clinicians could use the EDE-​Q during the first month of CBT for BN to monitor levels of dietary restraint and frequency of purging, which would provide important information about whether improvements should be expected with continued CBT. The clinician would then have objective data informing the


decision to continue delivering CBT for BN or to begin using another treatment strategy (e.g., switching to interpersonal psychotherapy or beginning antidepressant medication). Another important means of assessment throughout CBT is self-​monitoring, which is an integral part of CBT for eating disorders. Patients begin self-​monitoring after the first session of CBT; as such, the monitoring records can help with case conceptualization or treatment planning because they provide information about the patient’s baseline eating disorder symptoms. The monitoring typically involves recording circumstances associated with binge eating and purging, such as antecedent and consequent events, and general descriptions of food intake. Self-​monitoring can also focus on other behaviors typical of patients with eating disorders, including body checking or avoidance (Fairburn, Cooper, & Shafran, 2003). Wilson and Vitousek (1999) reviewed research on self-​monitoring in the treatment of eating disorders and identified a number of important advantages to this form of measurement. Because these records are completed closer to the time when the behaviors occur, the likelihood that the records are affected by problems of retrospective recall is reduced (Wilson & Vitousek, 1999). Thus, assessing eating disorder symptoms immediately after they occur may increase the accuracy of self-​ reported binge eating or restricting behaviors on self-​monitoring records in comparison to other forms of assessment (e.g., EDE). The possibility that symptoms can be measured more accurately without a time delay has been explored using ecologic momentary assessment (EMA; Engel et al., 2016, Farchaus & Corte, 2003; Smyth et al., 2001). In general, EMA involves recording events multiple times during a day on monitoring records, a hand-​held computer, or smartphone. Patients can be instructed to record at specific times of day (e.g., just after waking), when signaled by a pager, alarm on a watch, or hand-​held computer, or

TABLE 24.2  Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
EDE, versions 12–16 | G | A | E | A | E | A | E | A |
EDE-Q, versions 4 and 6 | E | G | NA | L | E | NR | E | A |
BSQ | A | E | NA | A | G | A | A | A |
BCQ | A | E | NA | A | G | A | A | A |

Note: EDE = Eating Disorder Examination; EDE-Q = Eating Disorder Examination Questionnaire; BSQ = Body Shape Questionnaire; BCQ = Body Checking Questionnaire; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


when a specific event occurs (e.g., binge eating; Farchaus & Corte, 2003). In the assessment of eating disorders, EMA has been used to measure mood, stressors, eating behavior, dietary restraint, binge eating, antecedents of binge eating, exercise, inappropriate compensatory behaviors, and other related variables among individuals with eating disorders (Engel et al., 2016; Farchaus & Corte, 2003; Le Grange, Gorin, Catley, & Stone, 2001; Smyth et al., 2001). Ecological momentary assessment has also been examined as an alternative to self-monitoring in CBT for binge eating disorder, where patients record mood, events, thoughts, and eating behaviors; however, the use of EMA did not provide any additional benefit to standard CBT (Le Grange, Gorin, Dymek, & Stone, 2002). Thus, like self-monitoring, EMA may be useful in treatment planning and conceptualization, but additional data are needed. Self-monitoring also provides information about the temporal pattern of eating behaviors, such as the observation that binge eating is more likely to occur during the afternoon or evening and that episodes of binge eating are often preceded by negative mood states, which can be used to inform the therapeutic process (Wilson & Vitousek, 1999). For example, if the monitoring records indicate that a patient is experiencing difficulties with binge eating in the afternoons, after avoiding eating during the morning, the CBT therapist will help the patient consume additional meals or snacks earlier in the day and schedule activities that are inconsistent with binge eating during the afternoon. Self-monitoring may also serve a crucial role in the process of early response observed among patients with BN treated with CBT (Wilson et al., 1999), as some of the improvements observed in the first few weeks of CBT for BN (e.g., Fairburn et al., 2004; Wilson et al., 2002) may be attributable to the awareness of eating behavior and patterns of eating through self-monitoring. Although self-monitoring records are very useful clinically, this form of measurement is not included in Table 24.2 because studies have not established norms for self-monitoring, it is not possible to measure internal consistency, content validity is not applicable, and inter-rater reliability is not helpful in clinical practice. In addition, test–retest reliability of self-monitoring records is difficult to establish because patterns of eating behavior are constantly changing among individuals with eating disorders (Hildebrandt & Latner, 2006). Nonetheless, self-monitoring is an integral and useful part of CBT for eating disorders. However, even when patients are instructed in the appropriate methods for completing self-monitoring records, there are indications that self-monitoring may not provide entirely accurate information

about the amount of food consumed. Patients have been shown to report similar meal patterns on 24-​hour self-​ report interviews and the EDE (Bartholome, Raymond, Lee, Peterson, & Warren, 2006); however, patients may exaggerate the size of binge episodes when self-​ monitoring (Hadigan, Walsh, Devlin, LaChaussee, & Kissileff, 1992). Therefore, when evaluating monitoring records, clinicians should attend to the overall pattern of meals, snacks, and binge episodes rather than the total amount of food eaten. The diagnostic criteria for both AN and BN include specific disturbances in body image, which can also be observed among individuals with BED or EDNOS. The Body Shape Questionnaire (BSQ; Cooper, Taylor, Cooper, & Fairburn, 1987) is a 34-​item self-​report questionnaire that provides an overall measure of concerns about shape, weight, and body image. The BSQ allows clinicians to assess the need for interventions addressing distortions in the perception of shape or weight among individuals with AN, BN, EDNOS, or BED. Although there are many existing assessments measuring body image concerns (for more information, see Thompson et  al., 2005), the BSQ is highly recommended because it is both psychometrically sound (see Table 24.2) and straightforward for patients to complete. In addition to articles referenced in the text, data for ratings in Table 24.2 for the BSQ were obtained from Evans and Dolan (1993) and Rosen, Jones, Ramirez, and Waxman (1996). The BSQ also includes two questions that specifically assess body checking and avoidance behaviors. The behaviors measured by the BSQ are avoiding wearing clothes that make the person particularly aware of body shape and pinching areas of the body to determine how much fat there is. Williamson, Muller, Reas, and Thaw (1999) suggested that preoccupation with shape and weight is increased by the selective attention focused on a disliked part of the body that occurs in body checking and avoidance. A study by Shafran, Fairburn, Robinson, and Lask (2004) supported this hypothesis by demonstrating increases in preoccupation with shape and weight after body checking. Thus, checking and avoidance can be important targets for treatment because these behaviors reinforce negative beliefs about shape and weight and may also maintain eating disorder symptoms. The most recent version of CBT for eating disorders (Fairburn et al., 2003) asks patients to monitor these behaviors during treatment, and the clinician intervenes to reduce body checking and avoidance (Fairburn, 2006). The Body Checking Questionnaire (BCQ; Reas, Whisenhunt, Netemeyer, & Williamson, 2002) is a 23-​item measure designed to assess body checking behaviors. The importance of body checking in the


maintenance of symptoms of AN (Fairburn, Shafran, & Cooper, 1999)  and the psychometric soundness of the measure justify its inclusion in this chapter. A version of the BCQ specific to males has also been developed and initially validated since the original version of this chapter (see Hildebrandt, Walker, Alfano, Delinsky, & Bannon, 2010). Information about the BCQ is provided in Table 24.2, including data from Calugi, Dalle Grave, Ghisi, and Sanavio (2006) and Reas, White, and Grilo (2006). A recently developed self-​ report questionnaire, the 45-​ item Eating Pathology Symptoms Inventory (EPSI; Forbush et al., 2013), also assesses eating disorder dimensions relevant to treatment outcome. A total of eight subscales can be measured with the EPSI, including Body Dissatisfaction (dissatisfaction with body weight and/​or shape), Binge Eating (eating large amounts of food and associated cognitive symptoms), Cognitive Restraint (cognitive attempts to limit or avoid eating, whether or not successful), Purging (self-​induced vomiting, laxative use, diuretic use, and diet pill use), Muscle Building (desire for increased muscularity and muscle-​building supplement use), Restricting (efforts to avoid or reduce food consumption), Excessive Exercise (intense and/​or compulsive physical exercise), and Negative Attitudes Toward Obesity (negative attitudes toward overweight or obese individuals). Given that the EPSI is not yet widely used, it is not included in the table; however, the psychometric properties of scores on the measure based on initial testing are promising (see also Forbush & Berg, 2015). Specifically, scores on the EPSI have demonstrated good to excellent internal consistency across samples of men, women, obese participants, and psychiatric patients with and without eating disorders (Forbush, Wildes, & Hunt, 2014; Forbush et al., 2013). For test–​retest reliability estimates, scores on most of the EPSI subscales exceeded .70, except for the Cognitive Restraint scale (.61). Discriminant validity for the EPSI scores was found between individuals with eating disorders and (a)  general psychiatric outpatients (Forbush et al., 2013) and (b) college students (Forbush et al., 2013, 2014). In college samples, convergent validity was also observed, with the EPSI subscale scores more strongly related to scores on measures of similar than dissimilar areas. Thus, extant data suggest initial support for the validity of the EPSI and the potential for use in case conceptualization and treatment planning. CBT for eating disorders involves the use of assessments to determine whether the areas of interpersonal functioning, perfectionism, core low self-​ esteem, or mood intolerance should be addressed during treatment. Fairburn et  al. (2003) proposed that for some patients, these areas are barriers to change because they serve as


additional maintaining processes that interact with the eating disorder maintaining mechanisms usually targeted by CBT for BN (Fairburn et al., 1993; e.g., overevaluation of shape and weight, dietary restraint, and binge eating and compensatory behavior). A number of measures can be used by clinicians to gather data about these areas to inform treatment in the expanded form of CBT (Fairburn, 2008). Examples of relevant instruments include the Beck Depression Inventory-II (BDI-II; Beck, Steer, & Brown, 1996), the Rosenberg Self-Esteem Scale (RSE; Rosenberg, 1979), the Inventory of Interpersonal Problems (IIP; Horowitz et al., 1988), and the Dysfunctional Attitude Scale (DAS; Weissman & Beck, 1978).

Overall Evaluation

Similar to the assessments used to diagnose eating disorders, there are a handful of measures specific to eating disorders that have sufficient empirical support to allow them to serve as clinical tools for treatment conceptualization and planning. These measures are especially appropriate for treatment planning in CBT because the CBT model includes strategies designed to affect the areas assessed by these measures (e.g., dietary restraint and overvaluation of shape and weight). However, these instruments are also likely to be of use for other treatment approaches that have the goal of achieving reductions in eating disorder symptoms (e.g., Maudsley family therapy and psychopharmacological treatment).
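In data terms, the self-monitoring and EMA protocols described in this section are time-stamped event logs that the clinician reviews for temporal patterns (e.g., afternoon binge episodes preceded by morning restriction or negative mood). A minimal Python sketch of such a log is shown below; the field names and the mood cutoff are illustrative assumptions and do not correspond to any published monitoring form.

from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class MonitoringEntry:
    # One row of a hypothetical self-monitoring or EMA log.
    timestamp: datetime
    food_and_drink: str                   # free-text description of intake
    context: str                          # location, company, antecedent events
    binge_episode: bool                   # patient-identified loss-of-control eating
    compensatory_behavior: Optional[str]  # e.g., "vomiting", "laxatives", or None
    mood_rating: int                      # e.g., 0 (very negative) to 10 (very positive)

def episodes_preceded_by_low_mood(log: List[MonitoringEntry], cutoff: int = 3) -> int:
    # Counts binge episodes whose immediately preceding entry recorded low mood,
    # the kind of temporal pattern a clinician might review when planning treatment.
    return sum(
        1
        for prev, curr in zip(log, log[1:])
        if curr.binge_episode and prev.mood_rating <= cutoff
    )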

ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

Very few measures have been designed and used for tracking and evaluating the impact of treatments for eating disorders. The EDE is the most commonly used assessment for measuring treatment outcome for patients with eating disorders, including studies of AN (e.g., Pike, Walsh, Vitousek, Wilson, & Bauer, 2003; Walsh et al., 2006), BN (e.g., Agras, Crow, et al., 2000; Agras, Walsh, et al., 2000; Walsh et al., 2004), and BED (e.g., Devlin et al., 2005; Wilfley et al., 2002; Wilson et al., 2010). However, as described previously, the EDE is time-consuming and requires extensive training of interviewers, which makes the instrument less practical for multiple assessments of outcome. As such, a number of studies have investigated whether the EDE-Q can be substituted for the EDE in measuring treatment outcome for patients with eating disorders, many of which are summarized in the review by Berg, Peterson, et al. (2012).
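Agreement between the interview and the questionnaire in the studies that follow is typically summarized by correlating the frequency counts or subscale scores obtained from the two measures in the same patients. The short sketch below, with invented 28-day binge episode counts, shows the form of that calculation.

# Hypothetical 28-day objective binge episode counts for 8 patients,
# once from an interview and once from a questionnaire.
interview_counts = [4, 12, 0, 7, 9, 2, 15, 5]
questionnaire_counts = [6, 14, 1, 7, 12, 2, 18, 4]

def pearson_r(x, y):
    # Pearson correlation between two equally long lists of scores.
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

print(f"Interview-questionnaire correlation: r = {pearson_r(interview_counts, questionnaire_counts):.2f}")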


The first comparison of the EDE and the EDE-​Q (Fairburn & Beglin, 1994)  examined the agreement between the measures among a sample of women from the community (n  =  243) and a sample of women with eating disorders (n  =  23 patients with BN, n  =  13 patients with AN). Across the two instruments, OBEs, self-​induced vomiting, and laxative misuse were highly correlated, with the data on self-​induced vomiting being the most highly correlated between EDE and EDE-​Q for both samples. When comparing the scores obtained for the EDE and EDE-​Q subscales, the agreement was greatest for the restraint and weight concern subscales. A number of studies have since compared the EDE and the EDE-​Q in women seeking treatment for substance abuse (Black & Wilson, 1996), obese patients with BED (Wilfley, Schwartz, Spurrell, & Fairburn, 1997), obese bariatric surgery candidates (Kalarchian, Wilson, Brolin, & Bradley, 2000), patients with BED (Grilo, Masheb, & Wilson, 2001a, 2001b), women with AN (Wolk et al., 2005), and women with BN (Carter, Aime, & Mills, 2001; Sysko, Walsh, & Fairburn, 2005). A  general pattern of results has emerged among these studies, in which the behavioral features (e.g., self-​induced vomiting) and clearly defined concepts (e.g., dietary restraint) are most highly correlated between EDE and EDE-​Q (Black & Wilson, 1996; Wilfley et  al., 1997; Wolk et  al., 2005). Greater discrepancies have been observed between the EDE and the EDE-​Q for complex concepts such as binge eating (Black & Wilson, 1996; Carter et al., 2001; Grilo et al., 2001a; Wilfley et al., 1997), and significantly higher levels of pathology have been observed on the EDE-​Q subscales in comparison to the EDE (Kalarchian et  al., 2000; Wilfley et  al., 1997). High levels of convergence between the EDE and the EDE-​Q for the assessment of binge eating can be produced with the addition of a brief (one page) instruction sheet to the EDE-​Q providing detail definitions and examples of binge eating (Goldfein,


Devlin, & Kamenetz, 2005). The results of two studies examining the ChEDE and the ChEDE-​Q among obese children and adolescents with AN found a similar pattern of results, with higher levels of eating disorder pathology observed on the questionnaire measure (Decaluwe & Braet, 2004; Passi, Bryson, & Lock, 2003). When the EDE and the EDE-​Q were compared for the measurement of change in a study of patients with BN, the change in compensatory behaviors over the course of the study was highly correlated, but the change in binge eating (OBE and subjective bulimic episode [SBE]) and attitudinal features (e.g., importance of shape and weight) were more discrepant (Sysko, Walsh, & Fairburn, 2005). The authors concluded that although both instruments assess change, it is not possible to evaluate which measure provides greater validity in assessing eating disorder pathology. Thus, Sysko et al. recommended that clinicians and researchers should consistently use one measure (EDE or EDE-​Q) rather than switching back and forth between measures or viewing the measures as interchangeable. Because patients with AN often experience obsessive thoughts and compulsions related to eating disorder symptoms, a number of studies evaluating treatments for AN have evaluated change using the Yale–​Brown–​Cornell Eating Disorder Scale (YBC-​ EDS; Attia et  al., 1998; Kaye et  al., 2001; Mazure, Halmi, Sunday, Romano, & Einhorn, 1994; Walsh et  al., 2006). The YBC-​EDS includes a 65-​item symptom checklist assessing 18 categories (e.g., food/​ eating/​ weight and shape/​ clothing/​ hoarding/​exercise preoccupations, and eating/​food/​binge eating/​purging/​somatic rituals). In addition, 19 questions measuring specific symptoms are asked, and a total score is calculated by summing 8 items assessing preoccupations and rituals. Summary psychometric information about the YBC-​EDS is provided in Table 24.3. Any of the three measures described previously in the context of measuring overall treatment outcome can also

TABLE 24.3  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Treatment Sensitivity | Clinical Utility | Highly Recommended
EDE, versions 12–16 | G | A | E | A | E | A | E | A | A |
EDE-Q, versions 4 and 6 | E | G | NA | L | E | NR | E | A | A |
YBC-EDS | A | G | E | NR | G | A | G | G | A |

Note: EDE = Eating Disorder Examination; EDE-Q = Eating Disorder Examination Questionnaire; YBC-EDS = Yale–Brown–Cornell Eating Disorder Scale; L = Less Than Adequate; A = Adequate; G = Good; E = Excellent; NR = Not Reported; NA = Not Applicable.


be used to evaluate progress during treatment. Because the EDE and the EDE-Q both assess eating-disordered behaviors over a 28-day period, these measures are ideal for evaluating change on a monthly basis. Clinicians interested in changes in symptoms on a weekly basis can use self-monitoring records to determine progress in treatment.

Overall Evaluation

The overall evaluation of measures assessing treatment monitoring and treatment outcome is consistent with the conclusions of the two previous sections. Only the EDE and YBC-EDS can be included in Table 24.3 as assessments that work, and although these instruments are widely used in research, they may be less practical for use by clinicians. Both are semi-structured interviews that require extensive training, and the amount of time needed to administer the EDE or the YBC-EDS can be considerable. Thus, the development of more efficient assessment tools is an important goal for furthering the assessment of treatment monitoring and outcome evaluation.
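Because both the EDE and the EDE-Q cover the preceding 28 days, change over a course of treatment is often summarized as the raw and percentage reduction in episode frequencies between administrations; the same arithmetic applies to weekly tallies from self-monitoring records. The sketch below uses invented counts purely for illustration.

# Hypothetical objective binge episode counts from successive 28-day EDE-Q administrations.
baseline_obes = 16
week_4_obes = 6

absolute_change = baseline_obes - week_4_obes
percent_reduction = 100 * absolute_change / baseline_obes

print(f"OBEs fell from {baseline_obes} to {week_4_obes} "
      f"({absolute_change} fewer episodes, a {percent_reduction:.0f}% reduction).")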

CONCLUSIONS AND FUTURE DIRECTIONS

The assessments described in this chapter are among the most widely used in the field of eating disorders. However, only a small number of measures can be classified as having extensive supporting psychometric evidence. Commonly used research strategies, such as the use of laboratory meal situations, can provide an objective measure of eating behavior but are simply not feasible in clinical practice (Wilson, 1993). Although some variability in symptoms over time is to be expected, ratings for test–retest reliability are suboptimal for the instruments described in this chapter. Mond, Hay, Rodgers, Owen, and Beaumont (2004) conducted the longest evaluation of the stability of eating disorder assessment over time, administering the EDE-Q twice, a mean of 303.2 days apart. The authors found that "although the cognitive and personality dimensions of eating disorder psychopathology are relatively stable, eating-disordered behaviors such as binge eating and use of exercise as a means of weight control are liable to fluctuate considerably in intensity and severity over time" (p. 200). Symptom reactivity in eating-disordered behaviors has also been observed with other forms of assessment, with a study by Hildebrandt and Latner (2006) demonstrating a significant decrease in OBEs and a concurrent increase in SBEs among women


with BN and BED after 7 days of self-monitoring and no additional treatment intervention. These studies suggest that the core eating-disordered behavior of binge eating may oscillate and be significantly reactive to nonspecific interventions, rendering the evaluation of binge eating over a long period of time quite difficult. One measurement issue that affects the accuracy and reliability of scores on most eating disorder instruments is the frequent reliance on a single questionnaire or interview item to assess a particular behavior or symptom. For example, most measures have only one question for quantifying the number of binge eating episodes during a specified period. This can provide important and meaningful information for the purpose of diagnosis (e.g., whether a patient meets criteria for BN), treatment conceptualization (e.g., are binge eating episodes decreasing over time), and treatment outcome (e.g., has a clinically meaningful change been observed). However, the reliance on a single item increases measurement error and decreases the overall statistical power to detect changes across time or between groups (Viswanathan, 2005). Given the poor test–retest reliability observed for scores on some measures, it may be useful to determine whether reliability could be improved by using multiple indicators for each behavior or symptom. Beyond this, future research should focus on examining DSM-5 diagnoses, designing new measures, and refining existing measures in order to provide clinicians with scientifically sound assessment tools.
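One way to estimate how much multiple indicators might help is the Spearman–Brown prophecy formula, which projects the reliability of a measure lengthened by a factor k from the reliability of the original measure, under the assumption of parallel items. The sketch below applies the formula to a hypothetical single-item reliability; the numbers are illustrative only.

def spearman_brown(reliability: float, k: float) -> float:
    # Projected reliability when a measure is lengthened by a factor of k
    # (k = 3 means replacing one item with three parallel items).
    return (k * reliability) / (1 + (k - 1) * reliability)

single_item_reliability = 0.55  # hypothetical test-retest reliability of one binge-frequency item
for k in (2, 3, 4):
    print(f"{k} parallel items -> projected reliability {spearman_brown(single_item_reliability, k):.2f}")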

ACKNOWLEDGMENT

We thank G. Terence Wilson, PhD, who served as a consultant on the version of this chapter from the first edition and provided guidance about the initial content of the chapter.

References

Agras, W. S. (1997). Pharmacotherapy of bulimia nervosa and binge eating disorder: Longer-term outcomes. Psychopharmacology Bulletin, 33, 433–436. Agras, W. S., Crow, S. J., Halmi, K. A., Mitchell, J. E., Wilson, G. T., & Kraemer, H. C. (2000). Outcome predictors for the cognitive behavior treatment of bulimia nervosa: Data from a multisite study. American Journal of Psychiatry, 157, 1302–1308. Agras, W. S., Walsh, B. T., Fairburn, C. G., Wilson, G. T., & Kraemer, H. C. (2000). A multicenter comparison


of cognitive–​behavioral therapy and interpersonal psychotherapy for bulimia nervosa. Archives of General Psychiatry, 57, 459–​466. American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev). Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Attia, E., Haiman, C., Walsh, B. T., & Flater, S. (1998). Does fluoxetine augment the inpatient treatment of anorexia nervosa? American Journal of Psychiatry, 155, 548–​551. Bacaltchuk, J., & Hay, P. P. (2003). Antidepressants versus placebo for people with bulimia nervosa. Cochrane Database Systematic Reviews, 2003(4), CD003391. Bartholome, L. T., Raymond, N. C., Lee, S. S., Peterson, C. B., & Warren, C. S. (2006). Detailed analysis of binges in obese women with binge eating disorder: Comparisons using multiple methods of data collection. International Journal of Eating Disorders, 39, 685–​693. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for the Beck Depression Inventory (2nd ed.). San Antonio, TX: Psychological Corporation. Becker, A., Burwell, R., Gilman, S., Herzog, D., & Hamburg, P. (2002). Eating behaviors and attitudes following prolonged exposure to television among ethnic Fijian adolescent girls. British Journal of Psychiatry, 180, 509–​514. Berg, K. C., Peterson, C. B., Frazier, P., & Crow, S. J. (2012). Psychometric evaluation of the Eating Disorder Examination and Eating Disorder Examination Questionnaire:  A systematic review of the literature. International Journal of Eating Disorders, 45, 428–​438. Berg, K. C., Stiles-​Shields, E. C., Swanson, S. A., Peterson, C. B., Lebow, J., & Le Grange, D. (2012). Diagnostic concordance of the interview and questionnaire versions of the Eating Disorder Examination. International Journal of Eating Disorders, 45, 850–​855. Berscheid, E., Walster, E., & Bohmstedt, G. (1973). The Happy American Body:  A survey report. Psychology Today, 7, 119–​131. Black, C. M. D., & Wilson, G. T. (1996). Assessment of eating disorders: Interview versus questionnaire. International Journal of Eating Disorders, 20, 43–​50. Bryant-​Waugh, R. J., Cooper, P. J., Taylor, C. L., & Lask, B. D. (1996). The use of the eating disorder examination with children:  A pilot study. International Journal of Eating Disorders, 19, 391–​397.

Calugi, S., Dalle Grave, R., Ghisi, M., & Sanavio, E. (2006).Validation of the Body Checking Questionnaire (BCQ) in an eating disorders population. Behavioural and Cognitive Psychotherapy, 34, 233–​242. Carter, J. C., Aime, A. A., & Mills, J. S. (2001). Assessment of bulimia nervosa:  A comparison of interview of self-​ report questionnaire methods. International Journal of Eating Disorders, 30, 187–​192. Carter, J. C., Olmsted, M. P., Kaplan, A. S., McCabe, R. E., Mills, J. S., & Aime, A. (2003). Self-​help for bulimia nervosa:  A randomized controlled trial. American Journal of Psychiatry, 160, 973–​978. Castellini, G., Lo Sauro, C., Mannucci, E., Ravaldi, C., Rotella, C. M., Faravelli, C. & Ricca, V. (2011). Diagnostic crossover and outcome predictors in eating disorders according to DSM-​IV and DSM-​V proposed criteria: A 6-​year follow-​up study. Psychosomatic Medicine, 73, 270–​279. Caudle, H., Pang, C., Mancuso, S., Castle, D., & Newton, R. (2015). A retrospective study of the impact of DSM-​5 on the diagnosis of eating disorders in Victoria, Australia. Journal of Eating Disorders, 3, 1–​5. Commission on Adolescent Eating Disorders. (2017). Eating disorders. In D. L. Evans, E. B. Foa, R. E. Gur, H. Hendin, C. P. O’Brien, D. Romer, . . . T. Walsh (Eds.), Treating and preventing adolescent mental health disorders: What we know and what we don’t know (2nd ed.). New York, NY: Oxford University Press. Cooper, P. J., Taylor, M. J., Cooper, Z., & Fairburn, C. G. (1987). The development and validation of the Body Shape Questionnaire. International Journal of Eating Disorders, 6, 485–​494. Cossrow, N., Pawaskar, M., Witt, E. A., Ming, E. E., Victor, T. W., Herman, B. K., . . . Erder, M. H. (2016). Estimating the prevalence of binge eating disorder in a community sample from the United States: Comparing DSM-​IV-​TR and DSM-​ 5 criteria. Journal of Clinical Psychiatry, 77, e968–​e974. Caudle, H., Pang, C., Mancuso, S., Castle, D., & Newton, R. (2015). A retrospective study of the impact of DSM-​ 5 on the diagnosis of eating disorders in Victoria, Australia. Journal of Eating Disorders, 3, 35. http://​doi. org/​10.1186/​s40337-​015-​0072-​0 Crow, S., & Swigart, S. (2005). Medical assessment. In J. E. Mitchell & C. B. Peterson (Eds.), Assessment of eating disorders (pp. 120–​128). New York, NY: Guilford. Decaluwe, V. (1999). Child Eating Disorder Examination–​ Questionnaire. Dutch translation and adaptation of the Eating Disorder Examination–​ Questionnaire, authored by C. G. Fairburn & S. J. Beglin. Unpublished manuscript. Decaluwe, V., & Braet, C. (2004). Assessment of eating disorder psychopathology in obese children and adolescents:  Interview versus self-​ report questionnaire. Behaviour Research and Therapy, 42, 799–​811.


Derogatis, L. R., Lipman, R. S., & Covi, L. (1973). SCL-​ 90:  An outpatient psychiatric rating scale-​preliminary report. Psychopharmacology Bulletin, 9, 13–​28. Devlin, M. J., Goldfein, J. A., Petkova, E., Jiang, H., Raizman, P. S., Wolk, S.,  .  .  .  Walsh, B. T. (2005). Cognitive behavioral therapy and fluoxetine as adjuncts to group behavioral therapy for binge eating disorder. Obesity Research, 13, 1077–​1088. Endicott, J., Nee, J., Harrison, W., & Blumenthal, R. (1993). Quality of Life Enjoyment and Satisfaction Questionnaire:  A new measure. Psychopharmacology Bulletin, 29, 321–​326. Engel, S. G., Corneliussen, S. J., Wonderlich, S. A., Crosby, R. D., Le Grange, D., Crow, S., . . . Steiger, H. (2005). Impulsivity and compulsivity in bulimia nervosa. International Journal of Eating Disorders, 38, 244–​251. Engel, S. G., Crosby, R. D., Thomas, G., Bond, D., Lavender, J. M., Mason, T.,  .  .  .  Wonderlich, S. A. (2016). Ecological momentary assessment in eating disorder and obesity research: A review of the recent literature. Current Psychiatry Reports, 18, 1–​9. Epstein, N. B., Baldwin, L. M., & Bishop, D. S. (1983). The McMaster Family Assessment Device. Journal of Marital and Family Therapy, 9(2), 171–​180. Evans, C., & Dolan, B. (1993). Body Shape Questionnaire:  Derivation of shortened “alternate forms.” International Journal of Eating Disorders, 13, 315–​321. Fairburn, C. G. (2006). Body checking, body avoidance and “feeling fat.” Workshop presented at the 17th International Conference on Eating Disorders, Barcelona, Spain. Fairburn, C. G. (2008). Cognitive behavior therapy and eating disorders. New York, NY: Guilford. Fairburn, C. G., Agras, W. S., Walsh, B. T., Wilson, G. T., & Stice, E. (2004). Early change in treatment predicts outcome in bulimia nervosa. American Journal of Psychiatry, 161, 2322–​2324. Fairburn, C. G., & Beglin, S. J. (1994). Assessment of eating disorders:  Interview or self-​ report questionnaire? International Journal of Eating Disorders, 16, 363–​370. Fairburn, C. G., & Beglin, S. J. (2008). Eating Questionnaire. http://​www.credo-​oxford.com/​pdfs/​EDE-​Q_​6.0.pdf Fairburn, C. G., & Cooper, Z. (1993). The Eating Disorder Examination. In C. G. Fairburn & G. T. Wilson (Eds.), Binge eating:  Nature, assessment, and treatment (12th ed., pp. 317–​360). New York, NY: Guilford. Fairburn C. G., Cooper, Z., & O’Connor, M. (2014). Eating Disorder Examination. Retrieved from http://​www. credo-​oxford.com/​pdfs/​EDE_​17.0D.pdf Fairburn, C. G., Cooper, Z., & Shafran, R. (2003). Cognitive behavior therapy for eating disorders:  A “transdiagnostic” theory and treatment. Behaviour Research and Therapy, 41, 509–​528.


Fairburn, C. G., Marcus, M. D., & Wilson, G. T. (1993). Cognitive–​ behavioral therapy for binge eating and bulimia nervosa:  A comprehensive treatment manual. In C. G. Fairburn & G. T. Wilson (Eds.), Binge eating:  Nature, assessment, and treatment (pp. 361–​404). New York, NY: Guilford. Fairburn, C. G., Shafran, R., & Cooper, Z. (1999). A cognitive behavioural theory of anorexia nervosa. Behaviour Research and Therapy, 37, 1–​13. Farchaus, S. K., & Corte, C. M. (2003). Ecologic momentary assessment of eating-​disordered behaviors. International Journal of Eating Disorders, 34, 349–​360. Fichter, M. M., Quadflieg, N., & Hedlund, S. (2006). Twelve-​year course and outcome predictors of anorexia nervosa. International Journal of Eating Disorders, 39, 87–​100. First, M. B., Spitzer, R. L, Gibbon, M., & Williams, J. B. W. (2002). Structured Clinical Interview for DSM-​IV Axis I Disorders, Research Version, Patient Edition. (SCID-​I/​ P). New York, NY: Biometrics Research, New York State Psychiatric Institute. First, M. B., Williams, J. B.  W., Karg, R. S., & Spitzer, R. L. (2015). Structured clinical interview for DSM-​ 5 disorders, clinician version (SCID-​ 5-​ CV). Arlington, VA: American Psychiatric Association. Fluoxetine Bulimia Nervosa Collaborative Study Group. (1992). Fluoxetine in the treatment of bulimia nervosa:  A multicenter, placebo-​controlled, double-​blind trial. Archives of General Psychiatry, 49, 139–​147. Forbush, K. T., & Berg, K. C. (2015). Self-​report assessments of eating pathology. In B. T. Walsh, E. Attia, D. R. Glasofer, & R. Sysko (Eds.), Handbook of assessment and treatment of eating disorders (pp. 157–​174). Arlington, VA: American Psychiatric Association Publishing. Forbush, K. T., Wildes, J. E., & Hunt, T. K. (2014). Gender norms, psychometric properties, and validity for the Eating Pathology Symptoms Inventory. International Journal of Eating Disorders, 47, 85–​91. Forbush, K. T., Wildes, J. E., Pollack, L. O., Dunbar, D., Luo, J., Patterson, K., . . . Bright, A. (2013). Development and validation of the Eating Pathology Symptoms Inventory (EPSI). Psychological Assessment, 25, 859–​878. Forsberg, S., & Lock, J. (2015). Family-​ based treatment of child and adolescent eating disorders. Child and Adolescent Psychiatric Clinics of North America, 24, 617–​629. Franzoi, S. L., & Shields, S. A. (1984). The Body Esteem Scale:  Multidimensional structure and sex differences in a college population. Journal of Personality Assessment, 48, 173–​178. Garner, D. M. (1991). Eating Disorder Inventory 2: Professional manual. Odessa, FL: Psychological Assessment Resources. Glasofer, D. R., Sysko, R., & Walsh, B. T. (2015). Use of the Eating Disorder Assessment for DSM-​5. In B. T. Walsh,


E. Attia, D. R. Glasofer, & R. Sysko (Eds.), Handbook of assessment and treatment of eating disorders (pp. 175–​ 206). Arlington, VA:  American Psychiatric Association Publishing. Glasofer, D. R., Tanofsky-​Kraff, M., Eddy, K. T., Yanovski, S. Z., Theim, K. R., Mirch, M. C., . . . Yanovski, J. A. (2007). Binge eating in overweight treatment-​seeking adolescents. Journal of Pediatric Psychology, 32, 95–​105. Godart, N. T., Flament, M. F., Perdereau, F., & Jeammet, P. (2002). Comorbidity between eating disorders and anxiety disorders: A review. International Journal of Eating Disorders, 32, 253–​270. Godart, N. T., Perdereau, F., Rein, Z., Berthoz, S., Wallier, J., Jeammet, P., & Flament, M. F. (2007). Comorbidity studies of eating disorders and mood disorders: Critical review of the literature. Journal of Affective Disorders, 97, 37–​49. Goldfein, J. A., Devlin, M. J., & Kamenetz, C. (2005). Eating Disorder Examination–​Questionnaire with and without instruction to assess binge eating in patients with binge eating disorder. International Journal of Eating Disorders, 37, 107–​111. Goldstein, D. J., Wilson, M. G., Thompson, V. L., Potvin, J. H., Rampey, A. H. & Fluoxetine Bulimia Nervosa Research Group. (1995). Long-​term fluoxetine treatment of bulimia nervosa. British Journal of Psychiatry, 166, 660–​666. Gowers, S. G., Clark, A., Roberts, C., Griffiths, A., Edwards, V., Bryan, C., . . . Barrett, B. (2007). Clinical effectiveness of treatments for anorexia nervosa in adolescents. British Journal of Psychiatry, 191, 427–​435. Grilo, C. M., & Masheb, R. M. (2005). A randomized controlled comparison of guided self-​help cognitive behavioral therapy and behavioral weight loss for binge eating disorder. Behavior Research and Therapy, 43, 1509–​1525. Grilo, C. M., Masheb, R. M., Lozano-​ Blanco, C., & Barry, D. T. (2004). Reliability of the Eating Disorder Examination in patients with binge eating disorder. International Journal of Eating Disorders, 35, 80–​85. Grilo, C. M., Masheb, R. M., & Wilson, G. T. (2001a). Different methods for assessing the features of eating disorders in patients with binge eating disorder: A replication. Obesity Research, 9, 418–​422. Grilo, C. M., Masheb, R. M., & Wilson, G. T. (2001b). A comparison of different methods for assessing the features of eating disorders in patients with binge eating disorder. Journal of Consulting and Clinical Psychology, 69, 317–​322. Gualandi, M., Simoni, M., Manzato, E., & Scanelli, G. (2016). Reassessment of patients with eating disorders after moving from DSM-​IV towards DSM-​5:  A retrospective study in a clinical sample. Eating and Weight Disorders—​Studies on Anorexia, Bulimia and Obesity, 21, 617–​624.

Hadigan, C. M., Walsh, B. T., Devlin, M. J., LaChaussee, J. L., & Kissileff, H. R. (1992). Behavioral assessment of satiety in bulimia nervosa. Appetite, 18, 233–​241. Halmi, K. A., Agras, W. S., Crow, S., Mitchell, J., Wilson, G. T., Bryson, S. W., & Kraemer, H. C. (2005). Predictors of treatment acceptance and completion in anorexia nervosa: Implications for future study designs. Archives of General Psychiatry, 62, 776–​781. Herpertz-​Dahlmann, B., Muller, B., Herpertz, S., Heussen, N., Hebebrand, J., & Remschmidt, H. (2001). Prospective 10-​year follow-​up in adolescent anorexia nervosa: Course, outcome, psychiatric comorbidity, and psychosocial adaptation. Journal of Child Psychology and Psychiatry and Allied Disciplines, 42, 603–​612. Hilbert, A., Buerger, A., Hartmann, A. S., Spenner, K., Czaja, J., & Warschburger, P. (2013). Psychometric evaluation of the Eating Disorder Examination adapted for children. European Eating Disorders Review, 21, 330–​339. Hildebrandt, T., & Craigen, K. (2015). Eating-​related pathology in men and boys. In B. T. Walsh, E. Attia, D. R. Glasofer, & R. Sysko (Eds.), Handbook of assessment and treatment of eating disorders (pp. 105–​118). Arlington, VA: American Psychiatric Association Publishing. Hildebrandt, T., Lai, J. K., Langenbucher, J. W., Schneider, M., Yehuda, R., & Pfaff, D. W. (2011). The diagnostic dilemma of pathological appearance and performance enhancing drug use. Drug and Alcohol Dependence, 114, 1–​11. Hildebrandt, T., Langenbucher, J., & Schlundt, D. G. (2004). Muscularity concerns among men:  Development of attitudinal and perceptual measures. Body Image, 1, 169–​181. Hildebrandt, T., & Latner, J. (2006). Effect of self-​monitoring on binge eating:  Treatment response or binge drift? European Eating Disorders Review, 14, 17–​22. Hildebrandt, T., Schlundt, D., Langenbucher, J., & Chung, T. (2006). Presence of muscle dysmorphia symptomology among male weightlifters. Comprehensive Psychiatry, 47, 127–​135. Hildebrandt, T., Walker, D. C., Alfano, L., Delinsky, S., & Bannon, K. (2010). Development and validation of a male specific body checking questionnaire. International Journal of Eating Disorders, 43, 77–​87. Horowitz, L., Rosenberg, S. E., Baer, B. A., Ureno, G., & Villasenor, V. S. (1988). Inventory of Interpersonal Problems: Psychometric properties and clinical applications. Journal of Consulting and Clinical Psychology, 56, 885–​892. Hudson, J. I., Hiripi, E., Pope, H. G., & Kessler, R. C. (2007). The prevalence and correlates of eating disorders in the National Comorbidity Survey Replication. Biological Psychiatry, 61, 348–​358. Jennings, K. M., & Phillips, K. E. (2017). Eating Disorder Examination–​ Questionnaire (EDE-​ Q):  Norms for a


clinical sample of males. Archives of Psychiatric Nursing, 31, 73–​76. Jones, D. C., & Crawford, J. K. (2006). The peer appearance culture during adolescence: Gender and body mass variations. Journal of Youth and Adolescence, 35, 243–​255. Kalarchian, M. A., Wilson, G. T., Brolin, R. E., & Bradley, L. (2000). Assessment of eating disorders in bariatric surgery candidates:  Self-​ report questionnaire versus interview. International Journal of Eating Disorders, 28, 465–​469. Kaye, W. H., Nagata, T., Weltzin, T. E., Hsu, L. K. G., Sokol, M. S., McConaha, C., . . . Deep, D. (2001). Double-​ blind placebo-​controlled administration of fluoxetine in restricting-​and restricting-​purging type anorexia nervosa. Biological Psychiatry, 49, 644–​652. Keel, P. K., Brown, T. A., Holm-​Denoma, J., & Bodell, L. P. (2011). Comparison of DSM-​IV versus proposed DSM-​ 5 diagnostic criteria for eating disorders:  Reduction of eating disorder not otherwise specified and validity. International Journal of Eating Disorders, 44, 553–​560. Keel, P. K., & Klump, K. L. (2003). Are eating disorders culture-​bound syndromes? Implications for conceptualizing their etiology. Psychology Bulletin, 129, 747–​769. Keel, P. K., Mitchell, J. E., Miller, K. B., Davis, T. L., & Crow, S. J. (1999). Long-​term outcome of bulimia nervosa. Archives of General Psychiatry, 56, 63–​69. Kessler, R. C., Berglund, P. A., Chiu, W. T., Deitz, A. C., Hudson, J. I., Shahly, V., . . . Bruffaerts, R. (2013). The prevalence and correlates of binge eating disorder in the World Health Organization World Mental Health Surveys. Biological Psychiatry, 73, 904–​914. Kinston, W., & Loader, P. (1984). Eliciting whole-​ family interaction with a standardized clinical interview. Journal of Family Therapy, 6, 347–​363. Kissileff, H. R., Walsh, B. T., Kral, J. G., & Cassidy, S. M. (1986). Laboratory studies of eating behavior in women with bulimia. Physiology and Behavior, 38, 563–​570. Le Grange, D., Eisler, I., Dare, C., & Russell, G. F.  M. (1992). Evaluation of family treatments in adolescent anorexia nervosa: A pilot study. International Journal of Eating Disorders, 12, 347–​357. Le Grange, D., Gorin, A., Catley, D., & Stone, A. (2001). Does momentary assessment detect binge eating in overweight women that is denied at interview? European Eating Disorders Review, 9, 1–​16. Le Grange, D., Gorin, A., Dymek, M., & Stone, A. (2002). Does ecological momentary assessment improve cognitive behavioral therapy for binge eating disorder: A pilot study. European Eating Disorders Review, 10, 316–​328. Le Grange, D., Hughes, E. K., Court, A., Yeo, M., Crosby, R. D., & Sawyer, S. M. (2016). Randomized clinical trial of parent-​focused treatment and family-​based treatment for adolescent anorexia nervosa. Journal of the American Academy of Child and Adolescent Psychiatry, 55, 683–​692.

559

Lewinsohn, P. M., Seeley, J. R., Moerk, K. C., & Striegel-​ Moore, R. H. (2002). Gender differences in eating disorder symptoms in young adults. International Journal of Eating Disorders, 32, 426–​440. Lindvall Dahlgren, C., & Wisting, L. (2016). Transitioning from DSM-​IV to DSM-​5: A systematic review of eating disorder prevalence assessment. International Journal of Eating Disorders, 49, 975–​997. Lock, J. (2015). An update on evidence-​ based psychosocial treatments for eating disorders in children and adolescents. Journal of Clinical Child & Adolescent Psychology, 44, 707–​721. Lock, J., Agras, W. S., Bryson, S., & Kraemer, H. C. (2005). A comparison of short-​and long-​term family therapy for adolescent anorexia nervosa. Journal of the American Academy of Child & Adolescent Psychiatry, 44, 632–​639. Lock, J. E., Agras, W. S., Dare, C., & Le Grange, D. (2002). Treatment manual for anorexia nervosa: A family-​based approach. New York, NY: Guilford. Machado, P. P., Gonçalves, S., & Hoek, H. W. (2013). DSM-​5 reduces the proportion of EDNOS cases:  Evidence from community samples. International Journal of Eating Disorders, 46, 60–​65. Maisto, S. A., Sobell, L. C., Cooper, A. M., & Sobell, M. B. (1982). Comparison of two techniques to obtain retrospective reports of drinking behavior from alcohol abusers. Addictive Behaviors, 7, 33–​38. Mancuso, S. G., Newton, J. R., Bosanac, P., Rossell, S. L., Nesci, J. B., & Castle, D. J. (2015). Classification of eating disorders:  Comparison of relative prevalence rates using DSM-​IV and DSM-​5 criteria. British Journal of Psychiatry, 206, 519–​520. Mazure, C. M., Halmi, K. A., Sunday, S. R., Romano, S. J., & Einhorn, A. M. J. (1994). The Yale–​Brown–​Cornell Eating Disorder Scale: Development, use, reliability and validity. Journal of Psychiatric Research, 28, 425–​445. Meehan, K. G., Loeb, K. L., Roberto, C. A., & Attia, E. (2006). Mood change during weight restoration in patients with anorexia nervosa. International Journal of Eating Disorders, 39, 587–​589. Metropolitan Life Insurance. (1959). New weight standards for men and women. Statistical Bulletin, 40, 1–​4. Moos, R. (1974). The Family Environment Scale:  Form R. Palo Alto, CA: Consulting Psychologists Press. Moos, R., & Moos, B. (1994). Family Environment Scale manual. Palo Alto, CA: Consulting Psychologists Press. Mond, J. M., & Hay, P. J. (2007). Functional impairment associated with bulimic behaviors in a community sample of men and women. International Journal of Eating Disorders, 40, 391–​398. Mond, J. M., Hay, P. J., Rodgers, B., Owen, C., & Beaumont, P. J. V. (2004). Temporal stability of the Eating Disorder Examination Questionnaire. International Journal of Eating Disorders, 36, 195–​203.

560

Health-Related Problems

Murray, S. B., Rieger, E., Hildebrandt, T., Karlov, L., Russell, J., Boon, E.,  .  .  .  Touyz, S. W. (2012). A comparison of eating, exercise, shape, and weight related symptomatology in males with muscle dysmorphia and anorexia nervosa. Body Image, 9, 193–​200. Mustelin, L., Silén, Y., Raevuori, A., Hoek, H. W., Kaprio, J., & Keski-​Rahkonen, A. (2016). The DSM-​5 diagnostic criteria for anorexia nervosa may change its population prevalence and prognostic value. Journal of Psychiatric Research, 77, 85–​91. National Institute for Clinical Excellence (2004). Eating Disorders. Core Interventions in the Treatment and Management of Eating Disorders in Primary and Secondary Care. London: National Institute for Clinical Excellence. National Task Force on the Prevention and Treatment of Obesity. (2000). Dieting and the development of eating disorders in overweight and obese adults. Archives of Internal Medicine, 160, 2581–​2589. Olson, D. H., Portner, L., & Lavee, Y. (1985). FACES III. Minneapolis, MN: Family Social Science, University of Minnesota. Olson, D. H., Sprenkle, D. H., & Russell, C. S. (1979). Circumplex model of marital and family systems:  I. Cohesion and adaptability dimensions, family types and clinical applications. Family Process, 18, 3–​28. Ornstein, R. M., Rosen, D. S., Mammel, K. A., Callahan, S. T., Forman, S., Jay, M. S., . . . Walsh, B. T. (2013). Distribution of eating disorders in children and adolescents using the proposed DSM-​5 criteria for feeding and eating disorders. Journal of Adolescent Health, 53, 303–​305. Passi, V. A., Bryson, S. W., & Lock, J. (2003). Assessment of eating disorders in adolescents with anorexia nervosa: Self-​report questionnaire versus interview. International Journal of Eating Disorders, 33, 45–​54. Peat, C., Mitchell, J. E., Hoek, H. W., & Wonderlich, S. A. (2009). Validity and utility of subtyping anorexia nervosa. International Journal of Eating Disorders, 42, 590–​594. Peterson, C. B., & Mitchell, J. E. (2005). Self-​report measures. In J. E. Mitchell & C. B. Peterson (Eds.), Assessment of eating disorders (pp. 98–​119). New York, NY: Guilford. Pike, K. M., Walsh, B. T., Vitousek, K. B., Wilson, G. T., & Bauer, J. (2003). Cognitive behavioral therapy in the post-​hospital treatment of anorexia nervosa. American Journal of Psychiatry, 160, 2046–​2049. Pope, H. G., Gruber, A. J., Choi, P., Olivardia, R., & Phillips, K. A. (1997). Muscle dysmorphia. An underrecognized form of body dysmorphic disorder. Psychosomatics, 38, 548–​557. Pope, H. G., Kanayama, G., & Hudson, J. I. (2012). Risk factors for illicit anabolic–​androgenic steroid use in male weightlifters:  A cross-​sectional cohort study. Biological Psychiatry, 71, 254–​261.

Qian, J., Hu, Q., Wan, Y., Li, T., Wu, M., Ren, Z., & Yu, D. (2013). Prevalence of eating disorders in the general population:  A systematic review. Shanghai Archives of Psychiatry, 25, 212. Reas, D. L., Whisenhunt, B. L., Netemeyer, R., & Williamson, D. A. (2002). Development of the Body Checking Questionnaire: A self-​report measure of body checking behaviors. International Journal of Eating Disorders, 31, 324–​333. Reas, D. L., White, M. A., & Grilo, C. M. (2006). Body Checking Questionnaire:  Psychometric properties and clinical correlates in obese men and women with binge eating disorder. International Journal of Eating Disorders, 39, 326–​331. Rhea, D. J., Lantz, C. D., & Cornelius, A. E. (2004). Development of the Muscle Dysmorphia Inventory (MDI). Journal of Sports Medicine and Physical Fitness, 44, 428–​435. Ricca, V., Mannucci, E., Mezzani, B., Di Bernardo, M., Zucchi, T., Paionni, A.,  .  .  .  Faravelli, C. (2001). Psychopathological and clinical features of outpatients with an eating disorder not otherwise specified. Eating and Weight Disorders, 6, 157–​165. Rizvi, S. L., Peterson, C. B., Crow, S. J., & Agras, W. S. (2000). Test–​ retest reliability of the Eating Disorder Examination. International Journal of Eating Disorders, 28, 311–​316. Robin, A. L., Koepke, T., & Moye, A. (1990). Multidimensional assessment of parent–​adolescent relations. Psychological Assessment, 2, 451–​459. Robin, A. L., Siegel, P. T., Moye, A. W., Gilroy, M., Baker-​ Dennis, A., & Sikand, A. (1999). A controlled comparison of family versus individual therapy for adolescents with anorexia nervosa. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 1482–​1489. Rose, J. S., Vaewsorn, A., Rosselli-​Navarra, F., & Wilson, G. T. (2013). Test–​retest reliability of the Eating Disorder Examination–​ Questionnaire (EDE-​ Q) in a college sample. Journal of Eating Disorders, 1:42. Rosen, J. C., Jones, A., Ramirez, E., & Waxman, S. (1996). Body Shape Questionnaire: Studies of validity and reliability. International Journal of Eating Disorders, 20, 315–​319. Rosen, J. C., Vara, L., Wendt, S., & Leitenberg, H. (1990). Validity studies of the Eating Disorder Examination. International Journal of Eating Disorders, 9, 519–​528. Rosenberg, M. (1979). Conceiving the self. New  York, NY: Basic Books. Schvey, N. A., Eddy, K. T., & Tanofsky-​Kraff, M. (2015). Diagnosis of feeding and eating disorders in children and adolescents. In B. T. Walsh, E. Attia, D. R. Glasofer, & R. Sysko (Eds.), Handbook of assessment and treatment of eating disorders (pp. 207–​ 230). Arlington, VA: American Psychiatric Association Publishing.

Eating Disorders

Segal, D. L., Hersen, M., & van Hasselt, V. B. (1994). Reliability of the structured clinical interview for DSM-​ III-​R:  An evaluative review. Comprehensive Psychiatry, 35, 316–​327. Shafran, R., Fairburn, C. G., Robinson, P., & Lask, B. (2004). Body checking and its avoidance in eating disorders. International Journal of Eating Disorders, 35, 93–​101. Shapiro, J. R., Berkman, N. D., Brownley, K. A., Sedway, J. A., Lohr, K. N. & Bulik, C. M. (2007). Bulimia nervosa treatment: A systematic review of randomized controlled trials. International Journal of Eating Disorders, 40, 321–​336. Smink, F. R., van Hoeken, D., & Hoek, H. W. (2012). Epidemiology of eating disorders:  Incidence, prevalence and mortality rates. Current Psychiatry Reports, 14, 406–​414. Smink, F. R., van Hoeken, D., Oldehinkel, A. J., & Hoek, H. W. (2014). Prevalence and severity of DSM-​5 eating disorders in a community cohort of adolescents. International Journal of Eating Disorders, 47, 610–​619. Smyth, J., Wonderlich, S., Crosby, R., Miltenberger, R., Mitchell, J., & Rorty, M. (2001). The use of ecological momentary assessment approaches in eating disorder research. International Journal of Eating Disorders, 30, 83–​95. Stice, E., & Bearman, S. K. (2001). Body image and eating disturbances prospectively predict increases in depressive symptoms in adolescent girls: A growth curve analysis. Developmental Psychology, 37, 597–​607. Stice, E., Fisher, M., & Martinez, E. (2004). Eating Disorder Diagnostic Scale: Additional evidence of reliability and validity. Psychological Assessment, 16, 60–​71. Stice, E., Orjada, K., & Tristan, J. (2006). Trial of a psychoeducational eating disturbance intervention for college women:  A replication and extension. International Journal of Eating Disorders, 39, 233–​239. Stice, E., & Ragan, J. (2002). A preliminary controlled evaluation of an eating disturbance psychoeducational intervention for college students. International Journal of Eating Disorders, 31, 159–​171. Stice, E., Telch, C. F., & Rizvi, S. L. (2000). Development and validation of the Eating Disorder Diagnostic Scale: A brief self-​report measure of anorexia, bulimia, and binge-​eating disorder. Psychological Assessment, 12, 123–​131. Striegel-​Moore, R. H., & Franko, D. L. (2003). Epidemiology of binge eating disorder. International Journal of Eating Disorders, 34, S19–​S29. Sysko, R., Glasofer, D. R., Hildebrandt, T., Klimek, P., Mitchell, J. E., Berg, K. C.,  .  .  .  Walsh, B. T. (2015). The Eating Disorder Assessment for DSM-​ 5 (EDA-​ 5):  Development and validation of a structured interview for feeding and eating disorders. International Journal of Eating Disorders, 48, 452–​463.

561

Sysko, R., Walsh, B. T., & Fairburn, C. G. (2005). Eating Disorder Examination–​Questionnaire as a measure of change in patients with bulimia nervosa. International Journal of Eating Disorders, 37, 100–​106. Tanofsky-​Kraff, M., Goossens, L., Eddy, K. T., Ringham, R., Goldschmidt, A., Yanovski, S. Z.,  .  .  .  Yanovski, J. A. (2007). A multisite investigation of binge eating behaviors in children and adolescents. Journal of Consulting and Clinical Psychology, 75, 901. Tanofsky-​ Kraff, M., Morgan, C. M., Yanovski, S. Z., Marmarosh, C., Wilfley, D. E., & Yanovski, J. A. (2003). Comparison of assessments of children’s eating-​ disordered behaviors by interview and questionnaire. International Journal of Eating Disorders, 33, 213–​224. Thompson, J. K., Roehrig, M., Cafri, G., & Heinberg, L. J. (2005). Assessment of body image disturbance. In J. E. Mitchell & C. B. Peterson (Eds.), Assessment of eating disorders (pp. 175–​202). New York, NY: Guilford. Tylka, T. L., Bergeron, D., & Schwartz, J. P. (2005). Development and psychometric evaluation of the Male Body Attitudes Scale (MBAS). Body Image, 2, 161–​175. Vaughn, G. E., & Leff, J. (1976). The influence of family and social factors on the course of psychiatric illness: A comparison of schizophrenic and depressed neurotic patients. British Journal of Psychiatry, 129, 125–​137. Viswanathan, M. (2005). Measurement error and research design:  A practical approach to the intangibles of research design. Thousand Oaks, CA: Sage. Walsh, B. T., & Boudreau, G. (2003). Laboratory studies of binge eating disorder. International Journal of Eating Disorders, 34, S30–​S38. Walsh, B. T., Fairburn, C. G., Mickley, D., Sysko, R., & Parides, M. K. (2004). Treatment of bulimia nervosa in a primary care setting. American Journal of Psychiatry, 161, 556–​561. Walsh, B. T., Kaplan, A. S., Attia, E., Olmsted, M., Parides, M., Carter, J. C.,  .  .  .  Rockert, W. (2006). Fluoxetine after weight restoration in anorexia nervosa: A randomized controlled trial. JAMA, 295, 2605–​2612. Walsh, B. T., Kissileff, H. R., Cassidy, S. M., & Dantzic, S. (1989). Eating behavior of women with bulimia. Archives of General Psychiatry, 46, 54–​58. Watkins, B., Frampton, I., Lask, B., & Bryant-​Waugh, R. (2005). Reliability and validity of the child version of the Eating Disorder Examination:  A preliminary investigation. International Journal of Eating Disorders, 38, 183–​187. Weissman, A. N., & Beck, A. T. (1978). Development and validation of the Dysfunctional Attitude Scale: A preliminary investigation. Paper presented at the 86th annual convention of the American Psychological Association, Toronto, Ontario, Canada, August–​September. Weissman, M. M., & Bothwell, S. (1976). Assessment of social adjustment by patient self-​ report. Archives of General Psychiatry, 33, 1111–​1115.

562

Health-Related Problems

Wilfley, D. E., Friedman, M. A., Dounchis, J. Z., Stein, R. I., Welch, R. R., & Ball, S. A. (2000). Comorbid psychopathology in binge eating disorder: Relation to eating disorder severity at baseline and following treatment. Journal of Consulting and Clinical Psychology, 68, 641–​649. Wilfley, D. E., Schwartz, M. B., Spurrell, E. B., & Fairburn, C. G. (2000). Using the Eating Disorder Examination to identify the specific psychopathology of binge eating disorder. International Journal of Eating Disorders, 27, 259–​269. Wilfley, D. E., Schwartz, M. B., Spurrell, E. B., & Fairburn, C. G. (1997). Assessing the specific psychopathology of binge eating disorder patients: Interview or self-​report? Behaviour Research and Therapy, 35, 1151–​1159. Wilfley, D. E., Welch, R. R., Stein, R. I., Spurrell, E. B., Cohen, L. R., Saelens, B. E., . . . Matt, G. E. (2002). A randomized comparison of group cognitive–​behavioral therapy and group interpersonal psychotherapy for the treatment of overweight individuals with binge-​eating disorder. Archives of General Psychiatry, 59, 713–​721. Williamson, D. A., Muller, S. L., Reas, D. L., & Thaw, J. M. (1999). Cognitive bias in eating disorders: Implications for theory and treatment. Behavior Modification, 23, 556–​577. Wilson, G. T. (1993). Assessment of binge eating. In C. G. Fairburn & G. T. Wilson (Eds.), Binge eating: Nature, assessment, and treatment (pp. 227–​249). New  York, NY: Guilford. Wilson, G. T., & Fairburn, C. G. (2002). Treatments for eating disorders. In P. E. Nathan & J. M. Gorman (Eds.), A guide to treatments that work (2nd ed., pp. 559–​592). New York, NY: Oxford University Press. Wilson, G. T., Fairburn, C. G., Agras, W. S., Walsh, B. T., & Kraemer, H. (2002). Cognitive–​behavioral therapy for bulimia nervosa:  Time course and mechanisms of change. Journal of Consulting and Clinical Psychology, 70, 267–​274.

Wilson, G. T., Loeb, K. L., Walsh, B. T., Labouvie, E., Petkova, E., Liu, X., & Waternaux, C. (1999). Psychological versus pharmacological treatments for bulimia nervosa:  Predictors and processes of change. Journal of Consulting and Clinical Psychology, 67, 451–​459. Wilson, G. T., & Smith, D. (1989). Assessment of bulimia nervosa:  An evaluation of the Eating Disorder Examination. International Journal of Eating Disorders, 8, 173–​179. Wilson, G. T., & Sysko, R. (2009). Frequency of binge eating episodes in bulimia nervosa and binge eating disorder: Diagnostic considerations. International Journal of Eating Disorders, 42, 603–​610. Wilson, G. T., & Vitousek, K. M. (1999). Self-​monitoring in the assessment of eating disorders. Psychological Assessment, 11, 480–​489. Wilson, G. T., Wilfley, D. E., Agras, W. S., & Bryson, S. W. (2010). Psychological treatments of binge eating disorder. Archives of General Psychiatry, 67, 94–​101. Wolk, S. L., Loeb, K. L., & Walsh, B. T. (2005). Assessment of patients with anorexia nervosa:  Interview versus self-​report. International Journal of Eating Disorders, 37, 92–​99. Wonderlich, S. A., de Zwaan, M., Mitchell, J. E., Peterson, C., & Crow, S. (2003). Psychological and dietary treatments of binge eating disorder:  Conceptual implications. International Journal of Eating Disorders, 34, S58–​S73. Zanarini, M. C., Bender, D., Sanislow, C., Morey, L. C., Shea, M. T., & Gunderson, J. G. (2000). The Collaborative Longitudinal Personality Disorders Study: Reliability of Axis I and II diagnoses. Journal of Personality Disorders, 14, 291–​299. Zipfel, S., Giel, K. E., Bulik, C. M., Hay, P., & Schmidt, U. (2015). Anorexia nervosa: Aetiology, assessment, and treatment. Lancet Psychiatry, 2, 1099–​1111.

25

Insomnia Disorder

Charles M. Morin, Simon Beaulieu-Bonneau, Kristin Maich, and Colleen E. Carney

Sleep complaints are extremely common in clinical practice. They may present as a clinical feature or symptom of another co-occurring disorder, or they may represent a sleep–wake disorder. There are multiple types of sleep–wake disorders, which may involve, for example, trouble sleeping at night (insomnia), problems with breathing during sleep (sleep apnea), or abnormal events during sleep (nightmares). Insomnia is by far the most prevalent of all sleep disorders and the one most likely to be encountered in clinical practice by psychologists and other mental health practitioners. Although some of the basic assessment procedures and methodologies are similar across sleep–wake disorders, this chapter focuses on the assessment of insomnia (Arnedt, Conroy, Posner, & Aloia, 2006; Morgenthaler et al., 2007; Sateia, Doghramji, Hauri, & Morin, 2000; Schutte-Rodin, Broch, Buysse, Dorsey, & Sateia, 2008). After describing the main clinical features and diagnostic criteria of insomnia disorder, along with a summary of its epidemiology and public health significance, we review assessment strategies and measures of insomnia-related complaints in the context of making a diagnosis and developing a case conceptualization for treatment planning, as well as for monitoring treatment and assessing outcome.

NATURE AND SIGNIFICANCE OF INSOMNIA DISORDER

Clinical Features and Diagnostic Criteria

Insomnia is characterized by both nocturnal and diurnal symptoms. The predominant feature is dissatisfaction
with sleep quality or duration, with complaints of difficulties initiating or maintaining sleep. The three classic nocturnal insomnia symptoms involve problems initiating sleep at bedtime, trouble staying asleep with middle of the night awakenings and difficulty going back to sleep, or waking up too early in the morning with an inability to return to sleep (American Academy of Sleep Medicine, 2014; American Psychiatric Association, 2013). Insomnia complaints are typically accompanied by significant distress or impairments of daytime functioning that involve daytime fatigue, cognitive impairments (e.g., attention and memory), and mood disturbances (e.g., irritability and dysphoria); these are often the primary concerns prompting clients to seek insomnia treatment. To make the diagnosis of insomnia disorder, these difficulties must be present 3 nights or more per week and last for more than 3 months. Sleep difficulties that are less frequent or of shorter durations may still require clinical attention before they reach disorder status. The conceptualization of insomnia has evolved during the past two decades from being viewed predominantly as a symptom of another psychiatric disorder to being recognized as a disorder on its own. Indeed, important changes were made to the diagnostic criteria of insomnia in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-​ 5; American Psychiatric Association, 2013)  and the third edition of the International Classification of Sleep Disorders (ICSD; American Academy of Sleep Medicine, 2014). For instance, the most important change introduced in both classifications is that there is no longer a distinction

made between primary insomnia and insomnia secondary to another psychiatric or medical disorder. Such comorbidities, when present, only need to be listed, but there is no requirement to make a causal attribution to determine whether insomnia is primary or secondary. This departure from previous nosologies was predicated on the recognition that when insomnia is comorbid with another disorder (e.g., major depression), it is often difficult to determine which condition is the cause and which is the consequence, and also on evidence that the direction of this relationship may change over time (Reynolds & Redline, 2010). It was also based on increasing evidence that treatment outcome is more favorable when treating both insomnia and the comorbid condition (e.g., depression, anxiety, and pain) concurrently than when treating either condition alone. Epidemiology and Public Health Significance Population-​based estimates indicate that between 6% and 12% of adults meet criteria for an insomnia disorder during the course of a year, and an additional 15% to 20% of adults report subsyndromal insomnia (Morin, LeBlanc, et al., 2011; Ohayon, 2002; Roth et al., 2006). Insomnia is more prevalent among women, middle-​aged and older adults, shift workers, and individuals with medical or psychiatric disorders. Difficulties initiating sleep are more common among young adults, and problems maintaining sleep are more frequent among middle-​aged and older adults. The incidence of insomnia is higher among first-​ degree family members (daughter and mother) than in the general population (Dauvilliers et  al., 2005), but it remains unclear whether this link is inherited through a genetic predisposition, learned by observations of parental models, or a by-​product of another psychopathology. The onset of insomnia can occur at any time in life, but the first episode is most common in young adulthood. It is often precipitated by stressful life events, such as a separation, occupational or family stress, and interpersonal conflicts (Bastien, Vallières, & Morin, 2004; Ellis, Gehrman, Espie, Riemann, & Perlis, 2012). In some cases, insomnia begins in childhood, in the absence of psychological or medical problems, and persists throughout adulthood. Insomnia is a common problem among women during menopause and often persists even after other symptoms (e.g., hot flashes) have subsided either naturally or with hormonal replacement therapy. Insomnia may also have a late-​life onset, which needs to be distinguished from normal (age-​related) changes in sleep; such late-​life onset is often associated with other health-​related problems.

Potential risk factors for insomnia include demographic factors (e.g., female gender and advancing age), psychological factors (e.g., a worry-​prone cognitive style), hyperarousal, and a personal or familial history of insomnia (Jarrin, Chen, Ivers, & Morin, 2014; LeBlanc et al., 2009). For most individuals, insomnia is transient in nature, lasting a few days and resolving itself once the initial precipitating event has subsided. For others, perhaps those more vulnerable to sleep disturbances due to risk factors just mentioned, insomnia may persist long after the initial triggering event has disappeared; other factors, such as spending excessive amounts of time in bed or repeated napping during the day, would then perpetuate sleep disturbances (Spielman & Glovinsky, 1991). It is particularly important to identify these perpetuating factors when planning treatment. The course of insomnia may also be intermittent, with repeated brief episodes of sleep difficulties following a close association with the occurrence of stressful events. Longitudinal studies have shown that chronicity rates may range from 45% to 75% for follow-​ups of 1 to 7 years (Buysse et al., 2008; Morin, Bélanger, et al., 2009; Morphy, Dunn, Lewis, Boardman, & Croft, 2007). Even in chronic insomnia, there is often significant night-​to-​night variability in sleep patterns, with an occasional restful night’s sleep intertwined with several nights of poor sleep (Vallières, Ivers, Bastien, Beaulieu-​ Bonneau, & Morin, 2005). The prognosis for insomnia varies across individuals and is probably mediated by a combination of biologically related predisposing factors and psychological and behavioral perpetuating factors. It may also be complicated by the presence of comorbid psychiatric or medical disorders. Persistent insomnia is not a benign problem and often produces adverse effects on an individual’s life, on his or her family, and on society at large. For example, persistent insomnia is associated with reduced quality of life, decreased work productivity, increased absenteeism, and higher rates of health care utilization (Daley et  al., 2009; Simon & VonKorff, 1997; Sivertsen et  al., 2006). Increasing evidence suggests associations between chronic insomnia and long-​term negative health outcomes such as increased risk of depression, disability, hypertension, and even mortality (Baglioni et al., 2011; Fernandez-​Mendoza et  al., 2012; Laugsand, Vatten, Platou, & Janszky, 2011; Suka, Yoshida, & Sugimori, 2003; Vgontzas et al., 2010). Insomnia is often comorbid with other psychiatric and medical conditions, most frequently depression, anxiety, and pain (Baglioni et al., 2011; Taylor, Lichstein, Durrence, Reidel, & Bush, 2005; Taylor et  al., 2007). This high comorbidity may add to the complexity and

challenges of making an accurate diagnosis. Nonetheless, it is essential to consider these comorbid conditions when assessing insomnia, particularly for planning treatment.
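The frequency and duration thresholds described earlier in this section (difficulties on 3 or more nights per week, persisting for more than 3 months) amount to a simple screening rule. The following minimal Python sketch is illustrative only and uses variable names of our own; it covers only these two thresholds and is no substitute for the full DSM-5/ICSD-3 criteria, which also require associated distress or impairment of daytime functioning.

    def meets_frequency_duration_thresholds(nights_per_week, duration_months):
        # Frequency criterion: sleep difficulties on 3 or more nights per week.
        # Duration criterion: difficulties persisting for more than 3 months.
        return nights_per_week >= 3 and duration_months > 3

    # Example: difficulties 4 nights per week for the past 5 months meet both thresholds.
    print(meets_frequency_duration_thresholds(4, 5))  # True

As noted above, sleep difficulties that fall below these thresholds may still warrant clinical attention.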


PURPOSES OF ASSESSMENT

The 24-hour nature of insomnia (i.e., nocturnal and daytime symptoms), combined with some discrepancies between the subjective and objective measurements of sleep/wakefulness, makes this condition particularly challenging for clinical assessment. In addition, sometimes insomnia is the presenting complaint, but there may be another sleep disorder unknown to the client. For these reasons, the assessment of insomnia should be multidimensional and involve a multitrait, multimethod assessment paradigm that takes into consideration nighttime (sleep) and daytime dimensions (fatigue, mood, and cognition), along with subjective, behavioral, and physiological measures. In the following sections, we highlight the main strategies to consider for assessing insomnia in the context of diagnosis, treatment planning, and monitoring treatment progress/outcome (Schutte-Rodin et al., 2008).

ASSESSMENT FOR DIAGNOSIS

The diagnosis of insomnia is derived primarily from a detailed clinical evaluation of the client's subjective complaint (Arnedt et al., 2006; Sateia et al., 2000; Wyatt, Cvengros, & Ong, 2012). The sleep history should cover the type of complaint (initial, middle, or late insomnia), its duration (acute vs. chronic), and course (recurrent or persistent); typical sleep schedule (bedtime and arising time); functional analysis of precipitating, perpetuating, and alleviating factors; perceived consequences and functional impairments; and the presence of medical, psychiatric, or environmental contributing factors. A complete history of alcohol and drug use and prescribed and over-the-counter medications is also essential, as is a history of previous treatments and outcome (Buysse, Ancoli-Israel, Edinger, Lichstein, & Morin, 2006; Morin & Espie, 2003; Schutte-Rodin et al., 2008). Several assessment options are available to assist the clinician in conducting the initial evaluation of insomnia; they are presented in Table 25.1.

Clinical Interviews The Insomnia Diagnostic Interview (IDI), also called the Insomnia Interview Schedule (Morin, 1993), was developed to assist clinicians in conducting a semi-​structured interview. Topics covered by this interview include typical sleep–​wake schedules; the nature, frequency, and severity of insomnia symptoms; daytime consequences of insomnia; the history of the sleep problem; overview of predisposing, precipitating, and perpetuating factors; sleep-​related behaviors (e.g., napping and strategies to manage insomnia symptoms or consequences); environmental factors and life habits (e.g., work schedule, bedroom organization, use of caffeine, and exercise); current and past use of medication and other sleep aid; medical history; and screening questions for other sleep disorders. The IDI has been used extensively in clinical research studies to assist in the initial diagnosis of insomnia and to facilitate treatment planning. It has also been adapted for use with several clinical populations presenting comorbid conditions such as cancer (Savard, Simard, Ivers, & Morin, 2005)  and brain injury (Ouellet, Beaulieu-​Bonneau, & Morin, 2012). Despite its clinical usefulness to gather systematic information, there are no psychometric data on either reliability or validity, and for this reason the IDI is not included in Table 25.1. In addition, more recent structured interviews have been developed to specifically address the diagnosis of insomnia. The Duke Structured Interview for Sleep Disorders (DSISD; Edinger, Wyatt, et al., 2009) was first developed to assess for the presence of sleep disorders according to criteria from the DSM-​IV-​TR (American Psychiatric Association, 2000) and the ICSD-​2 (American Academy of Sleep Medicine, 2005). It is a comprehensive interview that includes a screening questionnaire to streamline the

TABLE 25.1 Ratings of Instruments Used for Diagnosis

Instrument       Norms  Internal Consistency  Inter-Rater Reliability  Test–Retest Reliability  Content Validity  Construct Validity  Validity Generalization  Clinical Utility  Highly Recommended
DSISD            NR     NA                    G                        NR                       E                 E                   A                        E
Polysomnography  NR     NA                    E                        G                        A                 A                   E                        A

Note: DSISD = Duke Structured Interview for Sleep Disorders. A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.


interview and shorten the administration time, verbatim questions for each diagnosis, a scoring table for yes/​no responses to remind interviewers of the criteria and aid in decision-​ making, and a ranking table of diagnoses with ICSD-​2 and DSM-​IV-​TR codes. Importantly, this interview carefully leads the clinician through insomnia diagnostic possibilities but also through other sleep disorders that may be comorbid or may actually explain the insomnia complaint. A  typical DSISD interview takes approximately 1 hour to administer. It may be especially helpful for those with less experience in diagnosing sleep disorders. Because the DSISD uses diagnostic criteria, it has excellent face validity and content/​ construct validity. There is evidence for reliability as well, with moderate to good inter-​rater agreement for the DSM categories of primary insomnia (r  =  .46), breathing-​related sleep disorder (r = .75), circadian rhythm disorder (r = .44), dyssomnia not otherwise specified (r  =  .42), and insomnia related to mental (r = .57) and medical disorder (r = .44) (Carney et  al., 2008). However, some diagnoses in the interview were associated with poor inter-​rater agreement, mainly from the ICSD-​2 (e.g., paradoxical insomnia). Subsequent field studies suggested that some insomnia subtypes were not particularly valid or reliable (Edinger et  al., 2011), and they were subsequently dropped from DSM-​ 5. Issues of reliability and validity in structured interviews are difficult because the measure can only be as reliable and valid as the diagnostic categories on which it is based. The DSISD has been used in a number of clinical trials to establish the presence of an insomnia diagnosis and to rule out other sleep disorders (Edinger, Olsen, et al., 2009; Harvey et al., 2014; Talbot et al., 2014). An updated version for DSM-​5 and ICSD-​3 is available by contacting the first author of the interview. Polysomnography Polysomnography (PSG) involves the monitoring of simultaneous physiologic channels for the purposes of characterizing the onset of sleep and its stages. PSG channels include electroencephalography (EEG), electrooculography, and electromyography indices that measure brain activity, eye movements, and muscle tone. Several additional channels for monitoring breathing and leg movements are typically used to diagnose other sleep disorders, such as sleep apnea and periodic limb movements during sleep. Most often, PSG monitoring takes place in a sleep laboratory, but it can also be conducted on an ambulatory basis in the client’s home. There are standardized

rules for scoring the array of PSG indices, including sleep stages, the onset of sleep, and breathing-​and movement-​ related events (Berry et  al., 2012). Although clinicians often assume that overnight sleep studies involving PSG are necessary for insomnia assessment, PSG is not recommended as part of routine clinical practice for insomnia (Schutte-​Rodin et al., 2008). The reason why PSG is not commonly used in insomnia assessment is that the validity for insomnia may be dubious (Littner et al., 2003). For example, in chronic insomnia, arousal may become conditioned/​associated with the bed, so having someone sleep in a different environment (e.g., the sleep lab) may undo the arousal–​bed association and obfuscate the insomnia complaint temporarily. The characteristics of the lab may have the reverse effect—​that is, the noise and discomfort could worsen insomnia symptoms; in either case, PSG can interfere with the assessment of the problem it is intended to measure. There may be reliability issues associated with PSG because it typically assesses sleep in one night or perhaps two, but sleep is thought to be variable across nights, so this limited sampling may not reflect the sleep of the client overall. In addition, EEG activity scored according to consensus rules may not correlate with the subjective experience/​complaint of the insomnia sufferer (Kaplan et  al., 2016; Krystal, Edinger, Wohlgemuth, & Marsh, 2002). Given that insomnia is a subjective disorder, the lack of correspondence between PSG and subjective experience is a significant shortcoming of PSG. In addition, given that there are not hard quantitative rules for defining insomnia, PSG is not an essential component of clinical insomnia assessment except when other sleep disorders may be suspected. Overall Evaluation In summary, a detailed and systematic clinical evaluation of insomnia and its related symptoms is the most important assessment strategy for making an initial diagnosis of insomnia. This clinical evaluation can and should be complemented with other assessment strategies to be described in the next sections.

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Conceptual Models of Insomnia

There are several conceptual models of insomnia that all view hyperarousal as a core feature of this sleep disorder.
Some models emphasize behavioral factors, whereas others stress cognitive factors, and still others focus on cortical arousal (Espie, 2002; Harvey, 2002; Morin, 1993; Perlis, Ellis, Kloss, & Riemann, 2016). The 3P model (Spielman & Glovinsky, 1991)  is an integrative model that outlines three key contributing factors to insomnia, namely predisposing, precipitating, and perpetuating factors. Predisposing factors enhance the vulnerability to develop insomnia; these include increasing age, female gender, an anxiety-​prone personality style, and physiological hyperarousal. The mere presence of these risk factors is usually not enough to trigger insomnia; there needs to be some precipitating event, typically a stressful life event (e.g., an accident, a separation, death of a loved one, and occupational stress), to bring about acute sleep disturbances. Whereas otherwise good sleepers will resume normal sleep after these precipitating events have disappeared, individuals who are more prone or vulnerable to suffer from insomnia will often continue to experience sleep disturbances even after the initial triggering event is no longer present. When insomnia becomes a chronic problem, there are several psychological and behavioral factors that contribute to the perpetuation of sleep difficulties over time. Among these are classic psychological/​cognitive factors such as sleep-​specific performance anxiety, the fear of not sleeping, apprehension about the potential consequences of insomnia, and maladaptive sleep habits (e.g., spending excessive amounts of time in bed, maintaining irregular sleep schedules, and using stimulants to stay awake). It is here that the case formulation becomes extremely helpful in identifying the factors contributing

to the perpetuation of insomnia and the factors that must be targeted in treatment. Case Formulation Case formulation is an iterative, client-​centered approach to assessment and treatment (Persons, 2012). Case formulation is essential to a good treatment plan because it considers who to treat and when to treat them (e.g., diagnostic considerations), the treatment targets (e.g., perpetuating factors for the insomnia), and the behavioral strategies that will be used to address these targets (Manber & Carney, 2015). Manber and Carney recommended collecting the following information on clients to organize the case formulation: (a) factors associated with poor homeostatic sleep drive, (b) factors associated with poor circadian functioning, (c) factors associated with hyperarousal, (d) unhealthy sleep behaviors, (e) medications that may impact sleep and alertness, (f) comorbidities that may worsen sleep, and (g) environmental or other factors that may affect the sleep complaint or impact the delivery of cognitive–​ behavioral therapy for insomnia (CBT-​I). Tables 25.2 and 25.3 provide examples of instruments that assess for each of these domains. As illustrated in Table 25.3, central to this approach is a thorough clinical interview, the use of sleep diary monitoring, the testing of hypotheses derived from the formulation, and the evaluation of treatment outcomes. The clinician hypothesizes about the most important targets for treatment (i.e., those with a high negative impact on sleep regulation) and shares this hypothesis with the client, as well as a suggested plan for addressing the problem. Clinician and client then collaborate on the

TABLE 25.2 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument             Norms  Internal Consistency  Inter-Rater Reliability  Test–Retest Reliability  Content Validity  Construct Validity  Validity Generalization  Clinical Utility  Highly Recommended
Consensus Sleep Diary  NR     NA                    NA                       NA                       E                 E                   NR                       E
DBAS-16                A      A                     NA                       A                        A                 A                   NR                       G
PSAS                   NR     G                     NA                       G                        A                 G                   A                        A
SBSRS                  NR     G                     NA                       G                        A                 A                   NR                       A
SAMI                   A      G                     NA                       A                        A                 NR                  NR                       A
SPS                    A      G                     NA                       NR                       A                 NR                  NR                       A
APS                    A      G                     NA                       NR                       A                 A                   NR                       A
FIRST                  A      G                     NA                       G                        A                 NR                  A                        A
GSES                   A      A                     NA                       NR                       A                 NR                  NR                       A

Notes: Psychometric ratings are based on the original English version of the instrument only. DBAS = Dysfunctional Beliefs and Attitudes About Sleep Scale; PSAS = Pre-Sleep Arousal Scale; SBSRS = Sleep-Behavior Self-Rating Scale; SAMI = Sleep Associated Monitoring Index; SPS = Sleep Preoccupation Scale; APS = Arousal Predisposition Scale; FIRST = Ford Insomnia Response to Stress Test; GSES = Glasgow Sleep Effort Scale; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.


TABLE 25.3 Case Formulation Assessment Domains, Sources, and Treatment Implications

1. What factors may be negatively impacting the homeostatic drive for deep sleep?
How to assess: Examine diary for napping; add all the time spent in bed in the 24-hour period to determine if it far exceeds the average total sleep time. Clinical interview: query activity levels, time into bed and out of bed, nap attempts, dozing.
Cognitive behavioral strategies to consider: Sleep restriction therapy (i.e., decrease time spent in bed to match current average total sleep time). Stimulus control (i.e., no napping, set arise time 7 days per week, do not go to bed until sleepy). Sleep hygiene (i.e., exercise).

2. What factors may be negatively impacting the biological clock?
How to assess: Examine diary for a variability of 1 hour or more between the earliest and latest bedtime, wake time, and arising time. Clinical interview: query whether there is an early (advanced sleep phase or early bird tendency) or late (delayed sleep phase or night owl tendency) chronotype, and whether the client is able to schedule sleep opportunities that match this tendency. Actigraphs may be helpful for assessing rare, irregular rest/activity patterns in the 24-hour period.
Cognitive behavioral strategies to consider: Sleep restriction therapy (i.e., consider chronotype when setting the sleep schedule). Stimulus control (i.e., set arise time 7 days per week). Sleep hygiene (i.e., limit exposure to blue light in the evening and increase blue light exposure during the day for input to the clock).

3. What factors may be associated with increased arousal?
How to assess: Clinical interview: query whether there is increased alertness upon getting into bed (e.g., conditioned arousal). Sleep diary evidence of increased sleep effort (e.g., increased time in bed to increase the likelihood of sleep, use of sleep aids). Is the mean DBAS-16 score 4 or above (i.e., are there beliefs about sleep that are target-worthy)? Is there an elevation on the Glasgow Sleep Effort Scale, suggestive of a maladaptive belief that one has to exert effort to sleep? Is there a tendency toward monitoring for sleep-related threats on the Sleep Associated Monitoring Index or the Sleep Preoccupation Scale? Are there increased scores suggestive of hyperarousal on the Arousal Predisposition Scale and/or the Pre-Sleep Arousal Scale?
Cognitive behavioral strategies to consider: Stimulus control (i.e., go to bed only when sleepy, get out of bed when unable to sleep, set arise time 7 days per week, no naps, refrain from wakeful activities in bed). Cognitive therapy (i.e., challenge the belief that sleep effort is helpful for sleep and the fear of not sleeping, explore whether particular beliefs on the DBAS-16 are sleep-interfering, test whether sleep monitoring or sleep-preoccupation behaviors are helpful or sleep-interfering). Relaxation therapy to decrease hyperarousal. Counter-arousal techniques (e.g., Pennebaker writing intervention [Harvey & Farrell, 2003] or structured problem solving [Carney & Waters, 2006]) to address hyperarousal. MBTI (Ong et al., 2014) to address hyperarousal.

4. What unhealthy sleep behaviors may be affecting sleep and/or alertness?
How to assess: Sleep diary to track excessive consumption, or consumption late in the day, of caffeine, alcohol, drugs, and cigarettes. Insomnia Diagnostic Interview and/or clinical interview to assess for behaviors such as nocturnal eating, late vigorous exercise, etc.
Cognitive behavioral strategies to consider: Sleep hygiene (i.e., decrease use of caffeine, alcohol, drugs, cigarettes; limit exposure to blue light in the evening; exercise, but not close to bedtime; refrain from nocturnal eating).

5. What medications could be affecting the client's sleep and/or alertness?
How to assess: Sleep diary to track sleep medication use. Interview to assess for contingent (i.e., prn) use of sleep or anxiety medications, safety behaviors, and daytime effects of the medication. Assess for medications other than sleep or anxiety pills that can interfere with sleep (e.g., antidepressant medications).
Cognitive behavioral strategies to consider: Signed release to review medical history and to collaborate with the prescriber to eliminate contingent use of sleep medication and to discuss options for an optimal, efficacious sleep medication. Psychoeducation about the effects of sleep medication. Cognitive therapy to test whether sleep mechanisms are still functioning in the client. Proceed with CBT-I and then, if consistent with client goals, implement a taper. Consider whether adaptations to CBT-I are necessary for safety (e.g., restricting sleep to a lesser degree in those on sedating medications or asking them to refrain from getting out of bed in the middle of the night if there is a concern for falls).

6. What comorbidities impact the client's sleep and/or alertness, and how?
How to assess: Structured interview such as the Duke Structured Interview for Sleep Disorders or the Insomnia Diagnostic Interview to assess for signs of other sleep disorders. Follow up with a referral for polysomnography if other sleep disorders are suspected (e.g., sleep apnea). Clinical interview to assess for medical and psychiatric conditions and their treatment strategies (e.g., sleep apnea diagnosis but difficult adjustment to CPAP treatment, or chronic pain with ambivalence toward using pain management strategies). Self-report measures to track comorbid symptoms such as depression (BDI-II, PHQ-9), anxiety (STAI and BAI), and fatigue (FSS or MFI).
Cognitive behavioral strategies to consider: Consider whether adaptations to CBT-I are necessary (e.g., adding a fatigue module with chronic illnesses such as cancer, or restricting sleep to a lesser degree with panic disorder or disorders associated with excessive daytime sleepiness). Refer for treatment of the other disorder and work with the client on adherence to that treatment.

7. What other factors are there to consider?
How to assess: Insomnia Diagnostic Interview and/or clinical interview to assess other factors (e.g., current sleep environment, mental/cognitive status, and readiness for change).
Cognitive behavioral strategies to consider: In cases of low readiness for behavior change or a medical/psychiatric crisis, consider motivational interviewing to explore whether it is the right time to initiate CBT-I. Discuss benefits and drawbacks of other approaches, including relaxation as a monotherapy, medication, and MBTI. Troubleshoot adaptations for the current environment or refer for social supports to improve the housing situation. Adaptation/simplification of materials (e.g., sleep diaries and handouts) for the cognitive/reading level of the client.

Note: BAI = Beck Anxiety Inventory; BDI-​II = Beck Depression Inventory-​II; CBT-​I = Cognitive–​Behavioral Therapy for Insomnia; CPAP = Continuous Positive Airway Pressure; DBAS-​16 = Dysfunctional Beliefs and Attitudes About Sleep Scale; FSS = Fatigue Severity Scale; MBTI = Mindfulness-​Based Therapy for Insomnia; MFI = Multidimensional Fatigue Inventory; PHQ-​9 = Patient Health Questionnaire; STAI = State–​Trait Anxiety Inventory.

course and agree to track progress so that they can evaluate if the plan needs to be altered.

Self-Report Measures

Having the client complete a sleep diary each morning is an essential component of insomnia assessment, both for diagnosis and for treatment planning/tracking. Although there are different models of sleep diaries available in the literature, the Consensus Sleep Diary (CSD; Carney et al., 2012) was developed in consultation with a group of 25 leading experts to provide a standardized diary. The CSD is a prospective tool completed upon awakening that queries the subjective experience of the previous night. Core items include the time the client got into bed, the estimated amount of time it took to fall asleep, the number of awakenings and the total estimated length of awakenings during the night, the last time that the client woke up for the day, the time at which the client got out of bed, and a subjective sleep quality rating on a 5-point Likert scale.
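To illustrate how these core diary items translate into the summary indices used later in this chapter (sleep-onset latency, wake after sleep onset, total sleep time, and sleep efficiency), a minimal Python sketch follows. The single-night record, the field names, and the convention of expressing all quantities in minutes are simplifications of our own for illustration; they are not part of the CSD itself.

    from dataclasses import dataclass

    @dataclass
    class DiaryNight:
        # All values in minutes, derived from the diary's clock-time entries.
        time_in_bed: float                # from getting into bed to getting out of bed
        sleep_onset_latency: float        # estimated time taken to fall asleep
        wake_after_sleep_onset: float     # total estimated length of nighttime awakenings
        early_morning_wakefulness: float  # time awake between the final awakening and rising

    def total_sleep_time(night: DiaryNight) -> float:
        return (night.time_in_bed - night.sleep_onset_latency
                - night.wake_after_sleep_onset - night.early_morning_wakefulness)

    def sleep_efficiency(night: DiaryNight) -> float:
        # Percentage of time in bed spent asleep.
        return 100.0 * total_sleep_time(night) / night.time_in_bed

    night = DiaryNight(time_in_bed=510, sleep_onset_latency=45,
                       wake_after_sleep_onset=60, early_morning_wakefulness=15)
    print(total_sleep_time(night))            # 390 minutes
    print(round(sleep_efficiency(night), 1))  # 76.5

In practice, it is the averages of such indices over at least 2 weeks of entries, rather than any single night, that inform case formulation and treatment decisions.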

Given the subjective nature of insomnia disorder, the sleep diary represents an essential tool to obtain the client’s perception of the problem. Establishing reliability and validity data for this measure is difficult. For example, we would not anticipate that there would be high test–​ retest reliability because sleep is quite variable across nights, particularly among individuals with insomnia. Sleep is a construct defined by what measure is selected, so in PSG, sleep is defined as electrical activity, whereas in actigraphy it is defined by movement patterns; thus, we would not expect high correlation with other measures of slightly differing constructs. Despite the subjectivity of the diary, the stability and clinical value of this measure are optimal when the diary is completed soon after wakening and for at least a 2-​week period (Wohlgemuth, Edinger, Fins, & Sullivan, 1999). Good evidence for validity has been obtained when CSD indices were compared against objective (i.e., actigraph) indices (Maich, Lachowski, & Carney, 2016). There is evidence for diagnostic validity for establishing an insomnia diagnosis (Natale et  al.,

2015); for example, there is strong specificity evidence for CSD indices of sleep-​onset latency, wakefulness after sleep onset, number of awakenings, and sleep efficiency (Maich et al., 2016). There is only moderate sensitivity for the same indices (Maich et al., 2016), but this may relate to the difficulty in deriving suitable quantitative criteria for insomnia (Lineberger, Carney, Edinger, & Means, 2006) rather than to the properties of the CSD. Although some clinicians worry about whether clients can complete diary measures, this diary was created with client-​directed input via focus groups (Carney et  al., 2012), and there is a high reported rate of completion of the diary as well (Maich et al., 2016). Readability analyses suggest the core diary is written at a third-​grade reading level (Carney et al., 2012). The Dysfunctional Beliefs and Attitudes About Sleep Scale (DBAS) is a self-​report questionnaire that assesses unhelpful cognitions related to sleep, insomnia, and daytime consequences of insomnia. The initial version contains 30 items that are categorized into five themes:  (a) misconceptions about the causes of insomnia (e.g., “I believe insomnia is essentially the result of a chemical imbalance”), (b)  misattribution or amplification of the consequences of insomnia (e.g., “Without an adequate night’s sleep, I can hardly function the next day”), (c) unrealistic expectations about sleep (e.g., “I need 8 hours of sleep to feel refreshed and function well during the day”), (d) misconceptions about sleep-​promoting practices (e.g., “When I  don’t get proper amount of sleep on a given night, I need to catch up on the next day by napping or on the next night by sleeping longer”), and (e) lack of control or unpredictability of sleep (e.g., “I am worried that I may lose control over my abilities to sleep”). Respondents rate the extent to which they endorse each item on a Likert-​ type scale (0 = strongly disagree to 10 = strongly agree). A total score is derived by averaging item scores (the score of item 23 is reversed), with higher scores suggesting a higher level of dysfunctional sleep-​related cognitions. In place of using formal scoring of this instrument, clinicians may use it in practice simply to identify unhelpful sleep beliefs that should be targeted during the course of therapy. Although the original DBAS contains 30 items (Morin, 1993), an abbreviated 16-​item version has also been validated (Morin, Vallieres, & Ivers, 2007), and a 24-​ item adaptation is available for use with children (Gregory et al., 2009). Only the DBAS-​16 is included in Table 25.2 because of its stronger psychometric properties and shorter format. The factor structure is consistent with the longer version: (a) perceived consequences of insomnia, (b)  worry/​ helplessness about insomnia,

(c) sleep expectations, and (d) medication (Morin et al., 2007). The DBAS has been shown to be sensitive to treatment change with CBT-​I (Edinger, Wohlgemuth, Radtke, Marsh, & Quillian, 2001; Eidelman et al., 2016). The Pre-​ Sleep Arousal Scale (PSAS) (Nicassio, Mendlowitz, Fussell, & Petras, 1985)  is a 16-​item self-​ report measure of the state of arousal just before sleep. There are two summed PSAS subscales measuring somatic and cognitive states of pre-​sleep arousal. The original validation article by Nicassio and colleagues (1985) reported good internal consistency and test–​retest reliability for PSAS scores. There is also good evidence for convergent validity because scores on the somatic and cognitive subscales correlate moderately with anxiety, self-​ identification as a poor sleeper, sleep-​onset latency, total sleep time, and awakenings from sleep. The PSAS scores distinguish between good sleepers and those with insomnia, particularly on the basis of the cognitive subscale score (Nicassio et al., 1985). This scale was also validated with a community sample, and there was good evidence for a two-​factor model with a shortened version of the scale (Jansson-​Frojmark & Norell-​Clarke, 2012). Further research is needed to determine if the scales need to be modified based on these results. The Sleep Behavior Self-​ Rating Scale (SBSRS; Kazarian, Howe, & Csapo, 1979)  measures the frequency of sleep-​ incompatible behaviors in the bedroom. It includes 20 items that are rated on a 5-​point scale (i.e., ranging from 1  =  never to 5  =  very often). Eighteen items are duplicates, with one set of 9 items referring to activities engaged in during the day and the other 9 items pertaining to the same behaviors engaged in around sleeping time. The total score is the sum of the 20 items, with higher scores suggesting greater degree of sleep-​incompatible behaviors. Its internal consistency was found to be adequate in the original study (α = .72 to .76), and the test–​retest reliability was high (r = .88 for two administrations separated by 2 or 3 weeks). In addition, the SBSRS scores differentiated between poor and good sleepers (as defined on the basis of sleep-​onset latency) and showed adequate discriminant validity with anxiety and depression measures. Although the SBSRS can be a useful tool to identify target behaviors to change with stimulus control procedures during insomnia treatment, psychometric data are limited to just one study. Moreover, because the scale was developed in 1979, additional sleep-​ incompatible behaviors, more relevant to the advent of modern technology, should be added to the scale (e.g., using a smartphone or tablet in bed and working on a computer in the bedroom).
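As a concrete illustration of the DBAS scoring described above, the sketch below averages the 0 to 10 item ratings; for the original 30-item version, item 23 is reverse-scored first (assumed here to be 10 minus the rating, given the 0 to 10 scale). The function name and list representation are our own and are not part of the published instrument.

    def dbas_total(ratings, reverse_items=()):
        # ratings: item scores on the 0-10 agreement scale, in item order.
        # reverse_items: 1-based positions of reverse-scored items (e.g., item 23
        # in the 30-item DBAS); reversal assumed here as 10 minus the rating.
        scored = [10 - r if i + 1 in reverse_items else r
                  for i, r in enumerate(ratings)]
        return sum(scored) / len(scored)

    # DBAS-16: the total is the mean of the 16 ratings; Table 25.3 uses a mean
    # of 4 or above as a rough marker of beliefs worth targeting in treatment.
    dbas16 = [6, 7, 3, 8, 5, 4, 6, 7, 2, 5, 6, 8, 4, 3, 7, 6]
    print(round(dbas_total(dbas16), 2))  # 5.44

    # Original 30-item version (ratings list not shown):
    # dbas30_total = dbas_total(dbas30_ratings, reverse_items=(23,))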


The Sleep Associated Monitoring Index (SAMI; Semler & Harvey, 2004) is a 30-​item questionnaire assessing nighttime and daytime monitoring for sleep-​related threat, a key component of Harvey’s cognitive model of insomnia (Harvey, 2002). A  factor analysis yielded eight factors:  (a) pre-​ sleep monitoring for body sensations consistent with falling asleep, (b)  pre-​ sleep monitoring for body sensations inconsistent with falling asleep, (c)  pre-​sleep monitoring the environment, (d)  pre-​sleep monitoring the clock, (e) calculation of time, (f) waking monitoring for body sensations, (g)  daytime monitoring for body sensations, and (h) daytime monitoring of functioning. Each item is rated on a 5-​point scale (1 = not at all to 5 = all the time). Eight subscale scores and a total score can be derived by adding up scores of individual items. Psychometric properties reported in Table 25.2 are derived from the initial validation study (Semler & Harvey, 2004). The Sleep Preoccupation Scale (SPS; Ellis, Mitchell, & Hogh, 2007)  was developed as a measure of the daytime cognitive processes related to sleep (e.g., “My memory appears to be worse after a bad night’s sleep” and “I am more irritable after a bad night’s sleep”). It is composed of 22 items, each answered on a 6-​point rating scale (0 = never to 6 = all the time) representing the frequency of cognitions related to daytime consequences of sleep patterns. The SPS has two subscales, one for cognitive/​behavioral consequences and the other for affective consequences. Based on the initial validation study, poor sleepers reported significantly greater levels of cognitive/​behavioral and affective preoccupations compared to good sleepers. Psychometric properties of the SPS have not been studied independently from the initial development study. The Arousal Predisposition Scale (APS; Coren, 1988)  is a 12-​ item questionnaire assessing cognitive arousability (e.g., “I get excited easily” and “I can be emotionally moved by what other people consider simple things”). It is distinct from the PSAS because it focuses on cognitive arousability as a long-​ term, stable trait rather than as a pre-​sleep state. The respondent rates on a 5-​point scale (1 = never to 5 = always) how each item describes his or her typical behaviors. The summation of the 12 items yields the APS total score. The psychometric qualities of the APS have been documented in several studies (Coren, 1988, 1990; Coren & Mah, 1993; Hicks, Conti, & Nellis, 1992; Saliba, Henderson, Deane, & Mahar, 1998). The Ford Insomnia Response to Stress Test (FIRST; Drake, Richardson, Roehrs, Scofield, & Roth, 2004)  is

a self-report measure of an individual's vulnerability to stress-related sleep disturbance and hyperarousal. The questionnaire includes nine items representing common stressful situations, and the respondent rates on a 4-point scale (from 1 = not likely to 4 = very likely) the likelihood of experiencing sleep difficulties in response to these situations (e.g., before an important meeting the next day or after an argument). A total score ranging from 9 to 36 is obtained by adding up the nine item scores, with higher scores suggesting greater sleep reactivity. The psychometric properties of the FIRST have been documented in the original validation article (Drake et al., 2004) and in at least one further study (Jarrin, Chen, Ivers, Drake, & Morin, 2016). It has also been shown that the vulnerability to stress-related sleep disturbances, as assessed by the FIRST, has a strong familial aggregation (Drake, Scofield, & Roth, 2008).

The Glasgow Sleep Effort Scale (GSES; Broomfield & Espie, 2005) assesses cognitive and behavioral components of sleep effort and control, both at sleep onset and after nighttime awakenings. Because sleep is an involuntary process, excessive effort to control sleep initiation has been proposed as a potential cognitive factor contributing to the maintenance and exacerbation of insomnia. The GSES contains seven items (e.g., "I feel I should be able to control my sleep" and "I put too much effort into sleep when it should come naturally") rated on a 3-point scale (0 = not at all, 1 = to some extent, and 2 = very much), with the period of reference being the past week (the GSES is therefore a measure of state rather than trait). The total score is the summation of the seven items, with a score of 3 or more suggesting high sleep effort (Broomfield & Espie, 2005); a brief scoring sketch for the FIRST and the GSES appears at the end of this section. A pilot version of the GSES was first used in two studies (Broomfield & Espie, 2003, 2005), and the scale was then formally validated (Broomfield & Espie, 2005) and further used with both insomnia and good sleeper samples (Hertenstein et al., 2015).

Overall Evaluation

Of all the assessment strategies described in this section, the case conceptualization and the sleep diary represent essential assessment strategies for the initial evaluation and diagnosis of insomnia. The remaining measures are helpful for gaining a better understanding of factors that predispose to, or potentially help perpetuate, insomnia. They can be used for specific client needs or when other assessments suggest that more information in one or another area might be helpful in case formulation.
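Because the FIRST and the GSES use the summative scoring rules described above, they are mechanically simple to compute. The sketch below is a minimal illustration under those stated rules; the function names and input format are ours and not part of either published scale.

```python
def first_total(responses):
    """Ford Insomnia Response to Stress Test (FIRST) total score.

    `responses` is a list of 9 item ratings coded 1 (not likely) to
    4 (very likely). Totals range from 9 to 36; higher scores suggest
    greater sleep reactivity to stress.
    """
    assert len(responses) == 9 and all(1 <= r <= 4 for r in responses)
    return sum(responses)


def gses_total(responses):
    """Glasgow Sleep Effort Scale (GSES) total score for the past week.

    `responses` is a list of 7 item ratings coded 0 (not at all),
    1 (to some extent), or 2 (very much). A total of 3 or more is
    taken to suggest high sleep effort (Broomfield & Espie, 2005).
    """
    assert len(responses) == 7 and all(0 <= r <= 2 for r in responses)
    total = sum(responses)
    return total, total >= 3  # (score, high-sleep-effort flag)


print(first_total([3, 2, 4, 3, 2, 3, 4, 2, 3]))  # 26
print(gses_total([1, 0, 2, 1, 0, 0, 1]))         # (5, True)
```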


ASSESSMENT FOR TREATMENT MONITORING AND TREATMENT OUTCOME

A comprehensive assessment of treatment progress and outcome following treatment should include assessment of nocturnal sleep–wake parameters and insomnia symptoms, as well as assessment of several dimensions of daytime functioning, including fatigue, mood, and psychological symptoms. Ideally, some measures of daytime performance and cognitive functioning should also be conducted, but there is currently a lack of adequate measures for this dimension. Ratings of instruments reviewed in this section are presented in Table 25.4.

TABLE 25.4  Ratings of Instruments Used for Treatment Monitoring and Treatment Outcome Evaluation

Instrument             Norms  Internal     Inter-Rater  Test–Retest  Content   Construct  Validity        Treatment    Clinical  Highly
                              Consistency  Reliability  Reliability  Validity  Validity   Generalization  Sensitivity  Utility   Recommended
Consensus Sleep Diary  NR     NA           NA           NA           E         E          NR              E            E         ✓
ISI                    A      E            NA           NR           A         G          E               E            A         ✓
PSQI                   A      G            NA           A            G         A          G               G            A
Actigraphy             NR     NA           NA           G            A         NR         NR              G            G
FSS                    A      G            NA           G            G         G          G               G            G         ✓
MFI                    A      G            NA           G            G         G          G               G            G         ✓
BAI                    NR     E            NA           NR           A         A          A               A            A
STAI                   NR     G            NA           NR           G         A          G               G            G
BDI-II                 NR     G            NA           NR           A         A          A               A            A
PHQ-9                  NR     G            NA           A            G         A          G               G            E

Notes: Psychometric ratings are based on the original English version of the instruments only; for the BAI, BDI-II, FSS, MFI, PHQ-9, and STAI, ratings are based on studies of insomnia samples only. ISI = Insomnia Severity Index; PSQI = Pittsburgh Sleep Quality Index; FSS = Fatigue Severity Scale; MFI = Multidimensional Fatigue Inventory; BAI = Beck Anxiety Inventory; STAI = State–Trait Anxiety Inventory; BDI-II = Beck Depression Inventory-II; PHQ-9 = Patient Health Questionnaire; A = Adequate; G = Good; E = Excellent; NA = Not Applicable; NR = Not Reported.

Assessment of Sleep/Insomnia

The Consensus Sleep Diary, described previously, is essential to treatment monitoring. The CSD is used to derive daily averages of several sleep–wake parameters, including time to fall asleep, time awake after sleep onset, total sleep time, and sleep efficiency. These indices are then used to inform treatment decisions and to track whether hypotheses about how to correct the sleep problem are correct. For example, the CSD is central to developing the sleep scheduling component of sleep restriction therapy in CBT-I. The calculated mean total sleep time is used to derive the time-in-bed prescription, and then in subsequent weeks the CSD is used to assess adherence (i.e., Does the mean time in bed match the time-in-bed prescription?) as well as outcome (i.e., Does mean sleep efficiency normalize?). The CSD is also used to determine when the second phase of sleep restriction therapy (i.e., sleep extension) occurs (i.e., Is the mean sleep efficiency greater than 90%?). In addition, the CSD is used in CBT-I to test beliefs the client has about his or her sleep system. For example, sleep tracked on the CSD may reveal that all-or-none statements about "not sleeping" were inaccurate, or assumptions that naps are not actually sleep disruptive can be evaluated by comparing sleep on days with and without naps. The CSD is central to hypothesis testing in treatment, and it has good treatment sensitivity in detecting improvement after CBT-I (Lichstein et al., 2013).
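As a concrete illustration of the sleep restriction arithmetic just described, the sketch below derives weekly averages from diary entries and applies the adherence and sleep-extension checks. It is a simplified, hypothetical implementation (the data format, field names, and the 30-minute adherence tolerance are our assumptions), not a clinical algorithm.

```python
def weekly_summary(diary, prescribed_tib_min):
    """Summarize one week of Consensus Sleep Diary entries.

    `diary` is a list of nightly records, each a dict with minutes of
    time in bed ("tib") and total sleep time ("tst"); the field names
    and structure are illustrative assumptions, not part of the CSD.
    """
    mean_tib = sum(night["tib"] for night in diary) / len(diary)
    mean_tst = sum(night["tst"] for night in diary) / len(diary)
    sleep_efficiency = 100 * mean_tst / mean_tib

    return {
        "mean_time_in_bed_min": round(mean_tib),
        "mean_total_sleep_time_min": round(mean_tst),
        "sleep_efficiency_pct": round(sleep_efficiency, 1),
        # Adherence: does mean time in bed match the prescription?
        # A 30-minute tolerance is assumed here purely for illustration.
        "adherent": abs(mean_tib - prescribed_tib_min) <= 30,
        # Outcome check used to trigger sleep extension in sleep
        # restriction therapy: is mean sleep efficiency above 90%?
        "extend_time_in_bed": sleep_efficiency > 90,
    }


week = [{"tib": 390, "tst": 360}, {"tib": 400, "tst": 350},
        {"tib": 380, "tst": 355}, {"tib": 395, "tst": 370},
        {"tib": 405, "tst": 365}, {"tib": 385, "tst": 340},
        {"tib": 390, "tst": 360}]
print(weekly_summary(week, prescribed_tib_min=390))
# {'mean_time_in_bed_min': 392, 'mean_total_sleep_time_min': 357,
#  'sleep_efficiency_pct': 91.1, 'adherent': True, 'extend_time_in_bed': True}
```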

The Insomnia Severity Index (ISI; Bastien, Vallières, & Morin, 2001; Morin, 1993) is a seven-item measure of perceived insomnia severity assessing initial, middle, and late insomnia; satisfaction with sleep; sleep-related preoccupation; and the impact and noticeability of sleep difficulties. Each item is rated on a 5-point scale, and the summation of the items yields a total score ranging from 0 to 28. The following interpretation guidelines are recommended for the total score: 0 to 7 = absence of insomnia, 8 to 14 = subthreshold insomnia symptoms, 15 to 21 = moderate insomnia, and 22 to 28 = severe insomnia. Cut-off scores of 10 in community samples and 14 in primary care clinics have been recommended to detect insomnia, and a change score of 8 points has been suggested to define a positive treatment response (Gagnon, Belanger, Ivers, & Morin, 2013; Morin, Belleville, Belanger, & Ivers, 2011). Scores on the ISI have been shown in several studies to
be sensitive to therapeutic changes (Harvey et al., 2014; Morin et al., 2004; Morin, Beaulieu-​Bonneau, LeBlanc, & Savard, 2005; Morin, Vallières, et  al., 2009). In addition to a patient-​completed version, there are two parallel versions of the ISI: one completed by a significant other (usually a spouse) and one completed by a clinician. These versions can be useful to assess changes with treatment. An extended version of the ISI includes questions pertaining to the perceived impact of sleep difficulties on six specific domains of daytime functioning:  mood, fatigue, concentration/​memory, quality of life, interpersonal relationships, and social or leisure activities. Studies have demonstrated the validity and clinical utility of the ISI in various populations, including cancer clients (Savard, Savard, Simard, & Ivers, 2005)  and individuals with sickle cell disease (Moscou-​Jackson, Allen, Smith, & Haywood, 2016), and as a web-​based measure (Thorndike et al., 2011). The Pittsburgh Sleep Quality Index (PSQI; Buysse, Reynolds, Monk, Berman, & Kupfer, 1989) is a 19-​item measure that queries several aspects of subjective sleep quality, efficiency, duration, as well as symptoms of several other sleep disorders (e.g., nightmares) and sources of sleep disruptions (e.g., bedpartner’s snoring). There are several subscales of the PSQI, but the most widely used index is an overall global sleep quality score that has a cut-​ off score of 5. It is a widely used tool in sleep research, but the construct of sleep quality remains poorly defined. The PSQI is a generic measure of sleep rather than an insomnia measure per se; indeed, there are a number of content items related to different sleep problems subsumed under the generic “poor sleep quality” in the PSQI, including restless legs syndrome, nightmares, and insomnia. Despite the inclusion of content items related to a variety of sleep disorders, the PSQI is not effective at detecting other sleep disorders (Nishiyama et al., 2014). There are studies suggesting that the overall PSQI sleep quality score is measuring something different (e.g., anxiety/​distress) from sleep diaries and/​or actigraphy in those with comorbid psychiatric disorders (Dorheim, Bondevik, Eberhard-​ Gran, & Bjorvatn, 2009; Hartmann, Carney, Lachowski, & Edinger, 2015). Some authors have argued that the overall PSQI score may be considered an index of distress about sleep quality more than anything else, particularly in those with comorbid diagnoses, and should be used with caution (Crawford & Ong, 2015). The PSQI scores have good test–​retest reliability (Backhaus, Junghanns, Broocks, Riemann, & Hohagen, 2002), although this decreases when using a time frame of a few weeks for retrospective reporting. The standard time frame for

completing the PSQI is the past month; using a 1-​week reference period increases the correlation between the diaries and the PSQI (Wohlgemuth et al., 1999). Wrist actigraphy uses a small portable device that records movement, and light with some devices, for an extended period of time (days to weeks) to estimate sleep, circadian rhythm, and motor activity. The actigraph is usually worn as a watch on the nondominant wrist. Movement data captured by the accelerometer of the actigraph are transformed with mathematical algorithms into estimates of sleep parameters (e.g., total sleep time, sleep latency, and wake after sleep onset). Actigraphy does not measure sleep directly as does polysomnography, nor the subjective experience of sleep as do sleep diaries, but rather assesses sleep patterns (sleep and wake periods) through movement data. Actigraphy generally has high specificity in detecting sleep but low specificity in detecting wake, often underestimating sleep onset latency and wake after sleep onset. As such, it is fairly reliable in estimating global sleep–​wake parameters, such as total sleep time and sleep efficiency, but much less accurate in estimating more discrete sleep–​wake parameters, such as sleep onset latency and time awake after sleep onset. Nonetheless, actigraphy is especially useful for evaluating circadian rhythms (Morgenthaler et  al., 2007)—​that is, typical sleep–​wake schedule (bedtime, arising time, and nap time)—​and for examining night-​to-​night variability. It has also been used in clinical studies for documenting treatment adherence and, occasionally, for documenting treatment outcome in clinical trials of CBT-​I. Although it is a useful tool in research studies on insomnia and circadian rhythm sleep disorders, its use in clinical settings is fairly limited due to its cost. In recent years, sleep-​tracking devices using accelerometry have become increasingly available, affordable, and popular. However, the vast majority of these devices have been developed commercially without supporting evidence for reliability and validity (Lee & Finkelstein, 2015). Although a potentially useful complement to self-​ report and PSG measures, actigraphy devices and algorithms are not all equivalent, and there may be significant variability in the reliability and validity of sleep–​wake data derived from different devices. Fatigue Assessment In addition to assessing sleep, it is important to assess for daytime problems associated with insomnia. Daytime problems, the most common of which is fatigue, are often the primary complaint of people seeking treatment for insomnia. That is, they are concerned about waking

up at night because of the feared negative consequences on daytime functioning (e.g., fatigue). Although several scales are available for fatigue measurement, the Fatigue Severity Scale and the Multidimensional Fatigue Inventory are most often used in the insomnia literature. The Fatigue Severity Scale (FSS; Krupp, LaRocca, Muir-​Nash, & Steinberg, 1989)  is a nine-​item questionnaire assessing the subjective severity of fatigue symptoms on a 7-​point Likert scale over the past week. It is brief (estimated 2 or 3 minutes to complete) and requires minimal training to administer, score, and interpret (Hewlett, Dures, & Almeida, 2011). The scale has strong face validity because items explicitly ask the respondent to rate the severity of his or her fatigue symptoms. The scale is scored by calculating the mean score of all nine items. A cut-​off score greater than 3 is indicative of significant fatigue (Herlofson & Larsen, 2002; Hossain, Reinish, Kayumov, Bhuiya, & Shapiro, 2003; Krupp et al., 1989; Lichstein, Means, Noe, & Aguillard, 1997; Schwartz, Jandorf, & Krupp, 1993). The main criticism of the FSS is that it is a unidimensional scale, whereas fatigue is regarded as having multiple components. Indeed, the FSS focuses primarily on physical fatigue, as indicated by its definition of fatigue:  a “sense of tiredness, lack of energy, or total body give-​out.” The FSS psychometrics are strong and reflect the participant’s perception of the impact of his or her fatigue symptom experience over the past week (Herlofson & Larsen, 2002; Hossain et al., 2003; Krupp et  al., 1989; Schwartz et  al., 1993). Perhaps one of the reasons why this scale performs well without containing the multiple domains is that there is not one domain of fatigue that consistently characterizes all insomnia clients. There is evidence of clinical utility for treatment tracking, although there is not always an improvement in fatigue even when there is sleep improvement after CBT-​I. Although this may be a shortcoming of the scale, it is also possible that CBT-​I alleviates sleep problems but does not adequately address fatigue. The Multidimensional Fatigue Inventory (MFI; Smets, Garssen, Bonke, & De Haes, 1995) assesses several dimensions of fatigue: general, physical, mental, reduced motivation, and reduced activity. The MFI is a 20-​item instrument with a score on each dimension ranging from 5 to 20 points, indicating no fatigue to extreme fatigue. The MFI is a well-​validated questionnaire that has been used for several diseases and disorders, including cancer (Meek et al., 2000) and chronic fatigue syndrome (Weatherley-​ Jones et  al., 2004). Evaluating the treatment sensitivity of the inventory is a challenge. There is a lack of correspondence between fatigue reporting and objective sleep

disturbance, and there appears to be a cognitive factor that accounts for the reporting of fatigue in insomnia (Riedel & Lichstein, 2000). This observation is consistent with neuropsychological models of central fatigue, which posit a prominent role of appraisal of the personal resources needed for a task (Chaudhuri & Behan, 2004). Thus, it is possible that treatments that do not target this appraisal process will not result in a meaningful decrease in fatigue symptoms. If this is true, it is unreasonable to expect an assessment tool to detect a difference that is unlikely to occur in treatment. One alternative to the MFI and FSS is the Flinders Fatigue Scale (Gradisar et al., 2007), a scale proposed to detect treatment differences in fatigue. However, scores on the scale were shown to relate to changes on the PSQI, a scale confounded by anxiety and distress (Hartmann et  al., 2015). Thus, it is unclear if the Flinders Fatigue Scale is detecting changes in distress rather than changes in fatigue. Assessment of Psychological/​Mood Symptoms The Beck Anxiety Inventory (BAI; Beck, Epstein, Brown, & Steer, 1988) is a 21-​item self-​report measure (i.e., over the past week) of anxiety symptom severity and is a recommended instrument for use in insomnia (Buysse et al., 2006). The BAI is widely used in insomnia research (Harvey & Greenall, 2003; Morin, Belleville, et al., 2011), and it has sound psychometrics in anxiety-​disordered clients; however, the BAI has been criticized for its content validity because of its high number of autonomic symptoms. The somatic items of the BAI have poor specificity because they overestimate anxiety severity and may miscategorize those without clinical levels of anxiety if they have medical conditions (Wetherell & Gatz, 2005)  or sleep-​ disordered breathing (Sanford, Bush, Stone, Lichstein, & Aguillard, 2008). In a large-​scale psychometric evaluation of those with an insomnia disorder diagnosis, Carney, Moss, Harris, Edinger, and Krystal (2011) cautioned against the use of the BAI cut-​offs with insomnia clients because the cut-​off scores were associated with suboptimal sensitivity and specificity, and several items failed to discriminate those with an anxiety disorder from those without. The overall score may be useful as an index of anxiety symptom severity because the BAI total score differentiates those with insomnia with and without an anxiety disorder diagnosis. The reliability of the score in an insomnia sample has been reported to be similarly high to that in anxiety disorder investigations (Cronbach’s α = .89; Carney et al., 2011). Despite some concern over

the BAI cut-​offs for those with insomnia, it continues to be a widely used tracking tool for anxiety symptoms. The Spielberger State–​Trait Anxiety Inventory (STAI; Spielberger, 1983)  is a widely used 20-​item self-​report, retrospective measure designed to assess general levels of anxiety. The initial iteration of the test (Form X, published in 1970) was popular in clinical research. Scores on the revised Form Y (Spielberger, 1983) are significantly correlated with the original measure scores, but this form has improved psychometric properties (Oei, Evans, & Crook, 1990). The STAI has been criticized for not being able to adequately differentiate between anxiety and depression (Bieling, Antony, & Swinson, 1998; Gros, Antony, Simms, & McCabe, 2007). The STAI has been used across many insomnia studies, but we are not aware of any psychometric evaluations of the properties of the STAI in those with insomnia or other sleep disorders. The Beck Depression Inventory, Second Edition (BDI-​II; Beck, Steer, & Brown, 1996) is one of the most commonly used measures of dysphoric symptoms. It has strong psychometric properties in samples of depressed individuals but variable support in medical populations, perhaps due to the lack of depression-​specific (i.e., discriminating) items. An investigation of the BDI-​II in insomnia sufferers (Carney, Ulmer, Edinger, Krystal, & Knauss, 2009) found good internal consistency (α = .82) in those with clinical levels of insomnia without depression and excellent internal consistency in those with diagnoses of insomnia and major depressive disorder (α = .90). There is some concern with the use of the mild BDI-​II cut-​off (BDI-​II ≥ 14)  because the BDI-​II can overclassify those with insomnia as having mild depression (Carney et  al., 2009), probably because several of its items overlap with the research diagnostic criteria (Edinger et  al., 2004)  for insomnia (e.g., insomnia, fatigue, and concentration problems). Although there may be concern for use of this scale with the mild depression cut-​off, the cut-​off of BDI-​II ≥ 17 has good accuracy support for correctly identifying depressed clients. The BDI-​II is useful for capturing mood improvements after CBT-​I (Bastien, Morin, Ouellet, Blais, & Bouchard, 2004), although the overlap with insomnia symptoms makes it difficult to determine if this scale is capturing sleep improvement or mood improvement. The Patient Health Questionnaire-​9 (PHQ-​9; Kroenke, Spitzer, & Williams, 2001) is among the most widely used screening measures for depression in primary care facilities (Zhong, Gelaye, Fann, Sanchez, & Williams, 2014). The nine items of this diagnostic measure reflect the DSM-​IV-​defined criteria for depressive disorders. Each item is rated on a Likert-​type scale (0–​3); thus, the authors

suggest that the PHQ-9 may be used both to establish a depression diagnosis and to provide information about symptom severity (Kroenke et al., 2001). Administration and scoring are brief (~3 minutes), and little training is required for interpretation by those familiar with the criteria for diagnosing depressive disorders. Although the PHQ-9 has been used in a number of insomnia studies, there appear to be no studies evaluating the psychometric properties of the PHQ-9 within insomnia or other sleep-disordered populations.

Overall Evaluation

Essential assessment strategies for monitoring treatment progress and outcome should include the daily sleep diary (CSD) and the ISI, as well as measures of daytime symptoms such as fatigue, anxiety, and depressive symptomatology. Although there is some evidence supporting the use of anxiety and depression scales in those with insomnia, more research on the psychometric properties of these scales is needed. Other assessment methods, such as actigraphy, are useful for research purposes and in cases wherein a circadian rhythm disorder may be present, but they may not be necessary for routine clinical use. In addition, many individuals with insomnia complain about cognitive impairments affecting their attention and concentration, but to date, there is no adequate measure to reliably capture such deficits or detect changes in these domains with insomnia treatment.
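To make the ISI scoring and interpretation conventions described earlier in this section concrete, the sketch below applies the published interpretation bands and the suggested treatment-response criterion (a reduction of 8 or more points). The function names and return values are illustrative assumptions; consult the original sources for clinical use.

```python
def isi_total(responses):
    """Insomnia Severity Index total from seven item ratings (0-4 each)."""
    assert len(responses) == 7 and all(0 <= r <= 4 for r in responses)
    return sum(responses)


def isi_category(total):
    """Interpretation bands recommended for the ISI total score."""
    if total <= 7:
        return "absence of insomnia"
    if total <= 14:
        return "subthreshold insomnia symptoms"
    if total <= 21:
        return "moderate insomnia"
    return "severe insomnia"


def isi_treatment_response(baseline_total, post_total):
    """Positive treatment response defined as a change of 8 or more
    points (Morin, Belleville, Belanger, & Ivers, 2011)."""
    return (baseline_total - post_total) >= 8


pre, post = isi_total([3, 3, 2, 3, 2, 3, 3]), isi_total([1, 1, 1, 2, 1, 1, 2])
print(pre, isi_category(pre))              # 19 moderate insomnia
print(post, isi_category(post))            # 9 subthreshold insomnia symptoms
print(isi_treatment_response(pre, post))   # True
```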

CRITICAL ISSUES IN ASSESSING INSOMNIA

Barriers and Challenges

Despite the wide range of measures available, clinicians face a number of challenges when assessing sleep/insomnia complaints. The most important one derives from the fact that the diagnosis of insomnia is based solely on the client's subjective complaint of difficulties initiating and/or maintaining sleep and the resulting daytime impairments. What counts as too long to fall asleep, too much time spent awake at night, too little sleep, or poor sleep quality may vary widely across individuals. DSM-5 indicates a cut-point of 20 to 30 minutes to define sleep-onset and sleep-maintenance insomnia, but this is not part of the formal diagnostic criteria. Furthermore, such criteria are quite arbitrary. Because there are age-related, "normal" changes in sleep patterns, being awake for 30 minutes at night is not necessarily perceived to

be problematic by an older adult, whereas it is typically perceived as bothersome by a 30-​ year-​ old individual. Likewise, because of individual differences in sleep needs, there is no cut-​point to how much sleep is too short amount of sleep. Hence, the lack of quantitative criteria to define insomnia contributes to a significant heterogeneity of clinical profiles when working with individuals with an insomnia disorder. A related problem is that there are often important discrepancies between a person’s perception of being awake or asleep and objective recordings of sleep derived from PSG or actigraphy. Most people tend to overestimate the time they take to fall asleep and to underestimate the time they sleep at night relative to objective measurements, but this discrepancy is more pronounced in some individuals with insomnia. This phenomenon, also called sleep state misperception, can only be identified when objective sleep recording (i.e., PSG) is available, which is rarely the case in clinical practice. Therefore, clinicians should usually take at face value the information reported during the interview and in the client’s sleep diary. Reports of frequent sleepless nights may be an indication of significant sleep state misperception because such phenomenon is rare even among the most severe cases of insomnia. A similar paradox is that individuals with insomnia often report significant impairments of daytime functioning, but objective evaluation of performance, when available, usually reveals fairly mild and selective deficits (e.g., attention) (Fortier-​ Brochu, Beaulieu-​ Bonneau, Ivers, & Morin, 2012). In general, individuals with insomnia tend to perceive their sleep and daytime functioning as more impaired relative to how it can be objectively measured, which may reflect a generalized faulty appraisal of sleep and daytime functioning among individuals with insomnia. Notwithstanding these discrepancies, insomnia complaints must be taken seriously because they carry significant long-​term negative mental and physical health outcomes. The high rate of comorbidity between insomnia and other psychiatric disorders can also pose some assessment challenges, although this is less of a problem now that DSM-​5 has eliminated the need to ascertain whether insomnia is primary in nature or secondary to another disorder. Because of the overlap of several symptoms (e.g., sleep difficulties, decreased energy, and poor concentration) in insomnia, anxiety, and depression, it is sometimes difficult for clinicians to determine whether insomnia is simply a clinical symptom or feature of another disorder (e.g., generalized anxiety disorder or major depression), a disorder of its own, or a co-​occurring condition. The

literature is now fairly clear that insomnia can present in any of these three forms and, temporally, it can precede, accompany, or follow the occurrence of another psychiatric disorder. Most sleep experts are also in agreement that when an insomnia disorder is comorbid with another psychiatric or even medical (pain) condition, treatment should target both conditions without reference to which disorder may have occurred first (Manber et al., 2008). Of course, treatment may proceed sequentially and take into account what may be considered the most critical and urgent condition in need of treatment.

Streamlining Assessment for Clinical Decision-Making

Despite the previously mentioned challenges, there are a number of assessment strategies that can guide clinicians in their evaluation of insomnia in clinical practice. First, the diagnosis of insomnia is derived primarily from a detailed clinical evaluation of the client's subjective complaints. Thus, a detailed and comprehensive sleep-focused interview remains the most important assessment component. Completed in parallel with a case formulation, this assessment should cover the type of complaints, their duration, and their course; perceived consequences and impairments; typical sleep schedules; precipitating and perpetuating factors; and the presence of medical and psychiatric contributing factors, along with a history of prescribed and over-the-counter medications. Second, the use of a sleep diary is essential to document the nature, frequency, and severity of insomnia; to identify behavioral and scheduling factors that may perpetuate insomnia; and to monitor treatment compliance and progress. Third, although there are multiple measures that may complement the assessment of insomnia, if a single instrument is to be used to minimize burden, the ISI provides a quick assessment of perceived insomnia severity and its impact on daytime functioning and is a useful measure to monitor treatment progress and outcome. Additional measures of fatigue, anxiety, and depressive symptomatology can provide useful complementary information, particularly in view of the high co-occurrence of insomnia and psychological symptoms. A more comprehensive psychological evaluation may be necessary for clients with suspected psychiatric disorders. Although PSG is not indicated for the routine evaluation of insomnia, it is essential for diagnosing other sleep disorders (e.g., sleep apnea), and clinicians should refer their clients for such an evaluation whenever another sleep disorder is suspected. PSG should also be considered when a client is unresponsive to treatment.
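The streamlined sequence above amounts to a short checklist. The sketch below encodes it only as an illustration of the decision points described in this section; the inputs and wording are our assumptions, and nothing here substitutes for clinical judgment.

```python
def insomnia_assessment_plan(other_sleep_disorder_suspected=False,
                             unresponsive_to_treatment=False,
                             psychiatric_disorder_suspected=False):
    """Return the core assessment steps described in this section,
    plus optional referrals. Purely illustrative."""
    plan = [
        "Sleep-focused clinical interview with case formulation",
        "Prospective sleep diary (CSD)",
        "Insomnia Severity Index (ISI) to quantify severity and monitor outcome",
        "Brief measures of fatigue, anxiety, and depressive symptoms",
    ]
    if psychiatric_disorder_suspected:
        plan.append("More comprehensive psychological evaluation")
    if other_sleep_disorder_suspected or unresponsive_to_treatment:
        plan.append("Referral for polysomnography (PSG)")
    return plan


for step in insomnia_assessment_plan(other_sleep_disorder_suspected=True):
    print("-", step)
```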


CONCLUSIONS AND FUTURE DIRECTIONS

Insomnia is a prevalent condition brought to the attention of clinicians, either as an independent disorder or, more frequently, as a condition coexisting with another psychiatric disorder. A wide array of assessment strategies and methods is available to assist in the assessment of insomnia disorder. Most of these have been developed and validated in the context of research studies, but they can also facilitate the clinical decision-making process. Some instruments are better suited for diagnosis or treatment planning, whereas others are better indicated for treatment monitoring or for assessing outcomes. Their clinical utility, availability, and cost, as well as their psychometric properties, are critical factors likely to guide the selection and use of some of these tools in clinical practice.

Despite progress made in standardizing insomnia assessment in research, there are still important barriers and challenges to the effective assessment of insomnia in clinical practice. There are several evidence-based therapies for insomnia, but insomnia often remains underrecognized and undertreated in clinical practice. Brief and valid screening tools are needed to identify individuals requiring treatment, and there is a definite need for more practical, user-friendly assessment methods that can be used at the point of care. Additional research is also warranted to develop new assessment strategies to better understand the contribution of different factors (e.g., hyperarousal) to insomnia and to characterize different phenotypes of insomnia. In view of the significant discrepancies between subjective and objective measures of sleep and daytime impairments, additional research is also needed to develop and validate insomnia-specific measures that are sensitive and better suited to document the daytime impairments (e.g., fatigue and cognitive impairments) that often prompt individuals with insomnia to seek treatment. Specific assessment instruments are necessary to document more precisely the impact of insomnia therapies on several aspects of daytime functioning, such as fatigue, quality of life, psychological well-being, and cognitive functioning (attention/concentration). A significant challenge for the future will be to develop an assessment tool to optimize treatment algorithms and ensure that evidence-based therapies are matched to specific insomnia phenotypes. Finally, given that insomnia is a frequent complaint associated with several other psychological conditions, the development of an instrument that could be used transdiagnostically would be very helpful.

ACKNOWLEDGMENTS

Preparation of this chapter was supported by research grants from the National Institute of Mental Health (MH091053) and the Canadian Institutes of Health Research (MT-42504; No. 353509).

References American Academy of Sleep Medicine. (2005). International classification of sleep disorders:  Diagnostic and coding manual (2nd ed.). Westchester, IL: Author. American Academy of Sleep Medicine. (2014). International classification of sleep disorders:  Diagnostic and coding manual (3rd ed.). Westchester, IL: Author. American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders. Washington, DC: Author. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing. Arnedt, J. T., Conroy, D. A., Posner, D. A., & Aloia, M. S. (2006). Evaluation of the insomnia patient. Sleep Medicine Clinics, 1, 319–​332. Backhaus, J., Junghanns, K., Broocks, A., Riemann, D., & Hohagen, F. (2002). Test–​retest reliability and validity of the Pittsburgh Sleep Quality Index in primary insomnia. Journal of Psychosomatic Research, 53, 737–​740. Baglioni, C., Battagliese, G., Feige, B., Spiegelhalder, K., Nissen, C., Voderholzer, U., . . . Riemann, D. (2011). Insomnia as a predictor of depression: A meta-​analytic evaluation of longitudinal epidemiological studies. Journal of Affective Disorders, 135, 10–​19. Bastien, C. H., Morin, C. M., Ouellet, M. C., Blais, F. C., & Bouchard, S. (2004). Cognitive–​ behavioral therapy for insomnia:  Comparison of individual therapy, group therapy, and telephone consultations. Journal of Consulting and Clinical Psychology, 72, 653–​659. Bastien, C. H., Vallières, A., & Morin, C. M. (2001). Validation of the Insomnia Severity Index as an outcome measure for insomnia research. Sleep Medicine, 2, 297–​307. Bastien, C. H., Vallières, A., & Morin, C. M. (2004). Precipitating factors of insomnia. Behavioral Sleep Medicine, 2, 50–​62. Beck, A. T., Epstein, N., Brown, G., & Steer, R. A. (1988). An inventory for measuring clinical anxiety: Psychometric properties. Journal of Consulting and Clinical Psychology, 56, 893–​897. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for the Beck Depression Inventory-​ II. San Antonio, TX: Psychological Corporation. Berry, R. B., Brooks, R., Gamaldo, C. E., Harding, S. M., Marcus, C. L., & Vaughn, B. V. (2012). The

AASM manual for the scoring of sleep and associated events:  Rules, terminology and technical specifications. Darien, IL: American Academy of Sleep Medicine. Bieling, P. J., Antony, M. M., & Swinson, R. P. (1998). The State–​Trait Anxiety Inventory, Trait version:  Structure and content re-​ examined. Behaviour Research and Therapy, 36, 777–​788. Broomfield, N. M., & Espie, C. A. (2003). Initial insomnia and paradoxical intention:  An experimental investigation of putative mechanisms using subjective and actigraphic measurement of sleep. Behavioural and Cognitive Psychotherapy, 31, 313–​324. Broomfield, N. M., & Espie, C. A. (2005). Towards a valid, reliable measure of sleep effort. Journal of Sleep Research, 14, 401–​407. Buysse, D. J., Ancoli-​Israel, S., Edinger, J. D., Lichstein, K. L., & Morin, C. M. (2006). Recommendations for a standard research assessment of insomnia. Sleep, 29, 1155–​1173. Buysse, D. J., Angst, J., Gamma, A., Ajdacic, V., Eich, D., & Rossler, W. (2008). Prevalence, course, and comorbidity of insomnia and depression in young adults. Sleep, 31, 473–​480. Buysse, D. J., Reynolds, C. F., 3rd, Monk, T. H., Berman, S. R., & Kupfer, D. J. (1989). The Pittsburgh Sleep Quality Index:  A new instrument for psychiatric practice and research. Psychiatry Research, 28, 193–​213. Carney, C. E., Buysse, D. J., Ancoli-​Israel, S., Edinger, J. D., Krystal, A. D., Lichstein, K. L., & Morin, C. M. (2012). The Consensus Sleep Diary: Standardizing prospective sleep self-​monitoring. Sleep, 35, 287–​302. Carney, C. E., Edinger, J. D., Olsen, M. K., Stechuchak, K. M., Krystal, A. D., Lichstein, K. L., & Morin, C. M. (2008). Inter-​ rater reliability for insomnia diagnoses derived from the Duke Structured Interview for sleep disorders. Sleep, 31, A250. Carney, C. E., Moss, T. G., Harris, A. L., Edinger, J. D., & Krystal, A. D. (2011). Should we be anxious when assessing anxiety using the Beck Anxiety Inventory in clinical insomnia patients? Journal of Psychiatric Research, 45, 1243–​1249. Carney, C. E., Ulmer, C., Edinger, J. D., Krystal, A. D., & Knauss, F. (2009). Assessing depression symptoms in those with insomnia:  An examination of the Beck Depression Inventory Second Edition (BDI-​II). Journal of Psychiatric Research, 43, 576–​582. Carney, C. E., & Waters, W. F. (2006). Effects of a structured problem-​solving procedure on pre-​sleep cognitive arousal in college students with insomnia. Behavioral Sleep Medicine, 4, 13–​28. Chaudhuri, A., & Behan, P. O. (2004). Fatigue in neurological disorders. Lancet, 363, 978–​988. Coren, S. (1988). Prediction of insomnia from arousability predisposition scores:  Scale development and

cross-​validation. Behaviour Research and Therapy, 26, 415–​420. Coren, S. (1990). The Arousal Predispositon Scale: Normative data. Bulletin of the Psychonomic Society, 28, 551–​552. Coren, S., & Mah, K. B. (1993). Prediction of physiological arousability: A validation of the Arousal Predisposition Scale. Behaviour Research and Therapy, 31, 215–​219. Crawford, M. R., & Ong, J. C. (2015). There are two sides to every question: Exploring the construct of sleep quality. Journal of Clinical Psychiatry, 76, e822–​e823. Daley, M., Morin, C. M., LeBlanc, M., Gregoire, J. P., Savard, J., & Baillargeon, L. (2009). Insomnia and its relationship to health-​care utilization, work absenteeism, productivity and accidents. Sleep Medicine, 10, 427–​438. Dauvilliers, Y., Morin, C. M., Cervena, K., Carlander, B., Touchon, J., Besset, A., & Billiard, M. (2005). Family studies in insomnia. Journal of Psychosomatic Research, 58, 271–​278. Dorheim, S. K., Bondevik, G. T., Eberhard-​Gran, M., & Bjorvatn, B. (2009). Subjective and objective sleep among depressed and non-​depressed postnatal women. Acta Psychiatrica Scandinavica, 119, 128–​136. Drake, C. L., Richardson, G., Roehrs, T., Scofield, H., & Roth, T. (2004). Vulnerability to stress-​related sleep disturbance and hyperarousal. Sleep, 27, 285–​291. Drake, C. L., Scofield, H., & Roth, T. (2008). Vulnerability to insomnia:  The role of familial aggregation. Sleep Medicine, 9, 297–​302. Edinger, J. D., Bonnet, M. H., Bootzin, R. R., Doghramji, K., Dorsey, C. M., Espie, C. A., . . . Stepanski, E. J. (2004). Derivation of research diagnostic criteria for insomnia: Report of an American Academy of Sleep Medicine Work Group. Sleep, 27, 1567–​1596. Edinger, J. D., Olsen, M. K., Stechuchak, K. M., Means, M. K., Lineberger, M. D., Kirby, A., & Carney, C. E. (2009). Cognitive behavioral therapy for patients with primary insomnia or insomnia associated predominantly with mixed psychiatric disorders: A randomized clinical trial. Sleep, 32, 499–​510. Edinger, J. D., Wohlgemuth, W. K., Radtke, R. A., Marsh, G. R., & Quillian, R. E. (2001). Cognitive behavioral therapy for treatment of chronic primary insomnia:  A randomized controlled trial. Journal of the American Medical Association, 285, 1856–​1864. Edinger, J. D., Wyatt, J. K., Olsen, M. K., Stechuchak, K. M., Carney, C. E., & Chiang, A. (2009). Reliability and validity of the Duke Structured Interview for Sleep Disorders for insomnia screening. Paper presented at the 23rd Annual Meeting of the Associated Professional Sleep Societies, LLC, Seattle, WA. Edinger, J. D., Wyatt, J. K., Stepanski, E. J., Olsen, M. K., Stechuchak, K. M., Carney, C. E.,  .  .  .  Krystal, A. D. (2011). Testing the reliability and validity of DSM-​IV-​TR

and ICSD-​2 insomnia diagnoses: Results of a multitrait–​ multimethod analysis. Archives of General Psychiatry, 68, 992–​1002. Eidelman, P., Talbot, L., Ivers, H., Belanger, L., Morin, C. M., & Harvey, A. G. (2016). Change in dysfunctional beliefs about sleep in behavior therapy, cognitive therapy, and cognitive–​ behavioral therapy for insomnia. Behavior Therapy, 47, 102–​115. Ellis, J., Mitchell, K., & Hogh, H. (2007). Sleep preoccupation in poor sleepers: Psychometric properties of the Sleep Preoccupation Scale. Journal of Psychosomatic Research, 63, 579–​585. Ellis, J. G., Gehrman, P., Espie, C. A., Riemann, D., & Perlis, M. L. (2012). Acute insomnia: Current conceptualizations and future directions. Sleep Medicine Reviews, 16(1), 5–​14. Espie, C. A. (2002). Insomnia:  Conceptual issues in the development, persistence, and treatment of sleep disorder in adults. Annual Review of Psychology, 53, 215–​243. Fernandez-​Mendoza, J., Vgontzas, A. N., Liao, D., Shaffer, M. L., Vela-​ Bueno, A., Basta, M., & Bixler, E. O. (2012). Insomnia with objective short sleep duration and incident hypertension:  The Penn State Cohort. Hypertension, 60, 929–​935. Fortier-​ Brochu, E., Beaulieu-​ Bonneau, S., Ivers, H., & Morin, C. M. (2012). Insomnia and daytime cognitive performance: A meta-​analysis. Sleep Medicine Reviews, 16, 83–​94. Gagnon, C., Belanger, L., Ivers, H., & Morin, C. M. (2013). Validation of the Insomnia Severity Index in primary care. Journal of the American Board of Family Medicine, 26, 701–​710. Gradisar, M., Lack, L., Richards, H., Harris, J., Gallasch, J., Boundy, M., & Johnston, A. (2007). The Flinders Fatigue Scale: Preliminary psychometric properties and clinical sensitivity of a new scale for measuring daytime fatigue associated with insomnia. Journal of Clinical Sleep Medicine, 3, 722–​728. Gregory, A. M., Cox, J., Crawford, M. R., Holland, J., Harvey, A. G., & Steps, T. (2009). Dysfunctional beliefs and attitudes about sleep in children. Journal of Sleep Research, 18, 422–​426. Gros, D. F., Antony, M. M., Simms, L. J., & McCabe, R. E. (2007). Psychometric properties of the State–​ Trait Inventory for Cognitive and Somatic Anxiety (STICSA):  Comparison to the State–​ Trait Anxiety Inventory (STAI). Psychological Assessment, 19, 369–​381. Hartmann, J. A., Carney, C. E., Lachowski, A., & Edinger, J. D. (2015). Exploring the construct of subjective sleep quality in patients with insomnia. Journal of Clinical Psychiatry, 76, e768–​e773. Harvey, A. G. (2002). A cognitive model of insomnia. Behaviour Research and Therapy, 40, 869–​893.

Harvey, A. G., Belanger, L., Talbot, L., Eidelman, P., Beaulieu-​Bonneau, S., Fortier-​Brochu, E.,  .  .  .  Morin, C. M. (2014). Comparative efficacy of behavior therapy, cognitive therapy, and cognitive behavior therapy for chronic insomnia:  A randomized controlled trial. Journal of Consult and Clinical Psychology, 82, 670–​683. Harvey, A. G., & Farrell, C. (2003). The efficacy of a Pennebaker-​like writing intervention for poor sleepers. Behavioral Sleep Medicine, 1, 115–​124. Harvey, A. G., & Greenall, E. (2003). Catastrophic worry in primary insomnia. Journal of Behavior Therapy and Experimental Psychiatry, 34, 11–​23. Herlofson, K., & Larsen, J. P. (2002). Measuring fatigue in patients with Parkinson’s disease—​The Fatigue Severity Scale. European Journal of Neurology, 9, 595–​600. Hertenstein, E., Nissen, C., Riemann, D., Feige, B., Baglioni, C., & Spiegelhalder, K. (2015). The exploratory power of sleep effort, dysfunctional beliefs and arousal for insomnia severity and polysomnography-​ determined sleep. Journal of Sleep Research, 24, 399–​406. Hewlett, S., Dures, E., & Almeida, C. (2011). Measures of fatigue:  Bristol Rheumatoid Arthritis Fatigue Multi-​ Dimensional Questionnaire (BRAF MDQ), Bristol Rheumatoid Arthritis Fatigue Numerical Rating Scales (BRAF NRS) for severity, effect, and coping, Chalder Fatigue Questionnaire (CFQ), Checklist Individual Strength (CIS20R and CIS8R), Fatigue Severity Scale (FSS), Functional Assessment Chronic Illness Therapy (Fatigue) (FACIT-​F), Multi-​Dimensional Assessment of Fatigue (MAF), Multi-​Dimensional Fatigue Inventory (MFI), Pediatric Quality of Life (PedsQL), Multi-​ Dimensional Fatigue Scale, Profile of Fatigue (ProF), Short Form 36 Vitality Subscale (SF-​ 36 VT), and Visual Analog Scales (VAS). Arthritis Care & Research, 63(Suppl. 11), S263–​S286. Hicks, R. A., Conti, P. A., & Nellis, T. (1992). Arousability and stress-​related physical symptoms: A validation study of Coren’s Arousal Predisposition Scale. Perceptual and Motor Skills, 74, 659–​662. Hossain, J. L., Reinish, L. W., Kayumov, L., Bhuiya, P., & Shapiro, C. M. (2003). Underlying sleep pathology may cause chronic high fatigue in shift-​workers. Journal of Sleep Research, 12, 223–​230. Jansson-​ Frojmark, M., & Norell-​ Clarke, A. (2012). Psychometric properties of the Pre-​Sleep Arousal Scale in a large community sample. Journal of Psychosomatic Research, 72, 103–​110. Jarrin, D. C., Chen, I. Y., Ivers, H., Drake, C. L., & Morin, C. M. (2016). Temporal stability of the Ford Insomnia Response to Stress Test (FIRST). Journal of Clinical Sleep Medicine, 12, 1373–​1378. Jarrin, D. C., Chen, I. Y., Ivers, H., & Morin, C. M. (2014). The role of vulnerability in stress-​ related insomnia, social support and coping styles on incidence and

persistence of insomnia. Journal of Sleep Research, 23, 681–​688. Kaplan, K. A., Hirshman, J., Hernandez, B., Stefanick, M. L., Hoffman, A. R., Redline, S.,  .  .  .  Osteoporotic Fractures in Men, Study of Osteoporotic Fractures SOF Research Groups. (2016). When a gold standard isn’t so golden: Lack of prediction of subjective sleep quality from sleep polysomnography. Biological Psychology, 123, 37–​46. Kazarian, S. S., Howe, M. G., & Csapo, K. G. (1979). Development of the Sleep Behavior Self-​Rating Scale. Behavior Therapy, 10, 412–​417. Kroenke, K., Spitzer, R. L., & Williams, J. B. (2001). The PHQ-​9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16, 606–​613. Krupp, L. B., LaRocca, N. G., Muir-​Nash, J., & Steinberg, A. D. (1989). The Fatigue Severity Scale:  Application to patients with multiple sclerosis and systemic lupus erythematosus. Archives of Neurology, 46, 1121–​1123. Krystal, A. D., Edinger, J. D., Wohlgemuth, W. K., & Marsh, G. R. (2002). NREM sleep EEG frequency spectral correlates of sleep complaints in primary insomnia subtypes. Sleep, 25, 630–​640. Laugsand, L. E., Vatten, L. J., Platou, C., & Janszky, I. (2011). Insomnia and the risk of acute myocardial infarction: A population study. Circulation, 124, 2073–​2081. LeBlanc, M., Mérette, C., Savard, J., Ivers, H., Baillargeon, L., & Morin, C. M. (2009). Incidence and risk factors of insomnia in a population-​based sample. Sleep, 32, 1027–​1037. Lee, J., & Finkelstein, J. (2015). Consumer sleep tracking devices: A critical review. Studies in Health Technology and Informatics, 210, 458–​460. Lichstein, K. L., Means, M. K., Noe, S. L., & Aguillard, R. N. (1997). Fatigue and sleep disorders. Behaviour Research and Therapy, 35, 733–​740. Lichstein, K. L., Scogin, F., Thomas, S. J., DiNapoli, E. A., Dillon, H. R., & McFadden, A. (2013). Telehealth cognitive behavior therapy for co-​occurring insomnia and depression symptoms in older adults. Journal of Clinical Psychology, 69, 1056–​1065. Lineberger, M. D., Carney, C. E., Edinger, J. D., & Means, M. K. (2006). Defining insomnia: Quantitative criteria for insomnia severity and frequency. Sleep, 29, 479–​485. Littner, M., Hirshkowitz, M., Kramer, M., Kapen, S., Anderson, W. M., Bailey, D., . . . Woodson, B. T. (2003). Practice parameters for using polysomnography to evaluate insomnia: An update. Sleep, 26, 754–​760. Maich, K. H., Lachowski, A. M., & Carney, C. E. (2016). Psychometric properties of the Consensus Sleep Diary in those with insomnia disorder. Behavioral Sleep Medicine, 27, 1–​18. Manber, R., & Carney, C. E. (2015). Treatment plans and interventions for insomnia: A case formulation approach. New York, NY: Guilford.

Manber, R., Edinger, J. D., Gress, J. L., San Pedro-​Salcedo, M. G., Kuo, T. F., & Kalista, T. (2008). Cognitive behavioral therapy for insomnia enhances depression outcome in patients with comorbid major depressive disorder and insomnia. Sleep, 31, 489–​495. Meek, P. M., Nail, L. M., Barsevick, A., Schwartz, A. L., Stephen, S., Whitmer, K.,  .  .  .  Walker, B. L. (2000). Psychometric testing of fatigue instruments for use with cancer patients. Nursing Research, 49, 181–​190. Morgenthaler, T., Alessi, C., Friedman, L., Owens, J., Kapur, V., Boehlecke, B.,  .  .  .  American Academy of Sleep Medicine. (2007). Practice parameters for the use of actigraphy in the assessment of sleep and sleep disorders: An update for 2007. Sleep, 30, 519–​529. Morin, C. M. (1993). Insomnia: Psychological assessment and management. New York, NY: Guilford. Morin, C. M., Bastien, C., Guay, B., Radouco-​Thomas, M., LeBlanc, J., & Vallières, A. (2004). Randomized clinical trial of supervised tapering and cognitive behavior therapy to facilitate benzodiazepine discontinuation in older adults with chronic insomnia. American Journal of Psychiatry, 161, 332–​342. Morin, C. M., Beaulieu-​ Bonneau, S., LeBlanc, M., & Savard, J. (2005). Self-​help treatment for insomnia:  A randomized controlled trial. Sleep, 28, 1319–​1327. Morin, C. M., Bélanger, L., LeBlanc, M., Ivers, H., Savard, J., Espie, C. A., . . . Grégoire, J. P. (2009). The natural history of insomnia: A population-​based 3-​year longitudinal study. Archives of Internal Medicine, 169, 447–​453. Morin, C. M., Belleville, G., Belanger, L., & Ivers, H. (2011). The Insomnia Severity Index:  Psychometric indicators to detect insomnia cases and evaluate treatment response. Sleep, 34, 601–​608. Morin, C. M., & Espie, C. A. (2003). Insomnia:  A clinical guide to assessment and treatment. New  York, NY: Kluwer/​Plenum. Morin, C. M., LeBlanc, M., Belanger, L., Ivers, H., Merette, C., & Savard, J. (2011). Prevalence of insomnia and its treatment in Canada. Canadian Journal of Psychiatry, 56, 540–​548. Morin, C. M., Vallières, A., Guay, B., Ivers, H., Savard, J., Mérette, C.,  .  .  .  Baillargeon, L. (2009). Cognitive behavioral therapy, singly and combined with medication, for persistent insomnia: A randomized controlled trial. Journal of the American Medical Association, 301, 2005–​2015. Morin, C. M., Vallières, A., & Ivers, H. (2007). Dysfunctional Beliefs and Attitudes About Sleep (DBAS): Validation of a brief version (DBAS-​16). Sleep, 30, 1547–​1554. Morphy, H., Dunn, K. M., Lewis, M., Boardman, H. F., & Croft, P. R. (2007). Epidemiology of insomnia: A longitudinal study in a UK population. Sleep, 30, 274–​280. Moscou-​Jackson, G., Allen, J., Smith, M. T., & Haywood, C., Jr. (2016). Psychometric validation of the Insomnia

Severity Index in adults with sickle cell disease. Journal of Health Care for the Poor and Underserved, 27, 209–​218. Natale, V., Leger, D., Bayon, V., Erbacci, A., Tonetti, L., Fabbri, M., & Martoni, M. (2015). The Consensus Sleep Diary: Quantitative criteria for primary insomnia diagnosis. Psychosomatic Medicine, 77, 413–​418. Nicassio, P. M., Mendlowitz, D. R., Fussell, J. J., & Petras, L. (1985). The phenomenology of the pre-​sleep state: The development of the Pre-​Sleep Arousal Scale. Behaviour Research and Therapy, 23, 263–​271. Nishiyama, T., Mizuno, T., Kojima, M., Suzuki, S., Kitajima, T., Ando, K. B.,  .  .  .  Nakayama, M. (2014). Criterion validity of the Pittsburgh Sleep Quality Index and Epworth Sleepiness Scale for the diagnosis of sleep disorders. Sleep Medicine, 15, 422–​429. Oei, T. P., Evans, L., & Crook, G. M. (1990). Utility and validity of the STAI with anxiety disorder patients. British Journal of Clinical Psychology, 29(Pt. 4), 429–​432. Ohayon, M. M. (2002). Epidemiology of insomnia: What we know and what we still need to learn. Sleep Medicine Reviews, 6, 97–​111. Ong, J. C., Manber, R., Segal, Z., Xia, Y., Shapiro, S., & Wyatt, J. K. (2014). A randomized controlled trial of mindfulness meditation for chronic insomnia. Sleep, 37, 1553–​1563. Ouellet, M. C., Beaulieu-​ Bonneau, S., & Morin, C. M. (2012). Sleep–​wake disturbances. In N. Zasler, D. Katz, & R. Zafonte (Eds.), Brain injury medicine:  Principles and practice (2nd ed., pp. 707–​725). Boston, MA: Demos Medical. Perlis, M. L., Ellis, J. G., Kloss, J. D., & Riemann, D. W. (2016). Etiology and pathophysiology of insomnia. In M. H. Kryger, T. Roth, & W. C. Dement (Eds.), Principles and practice of sleep medicine (6th ed., pp. 769–​784). Philadelphia: PA: Saunders. Persons, J. B. (2012). The case formulation approach to cognitive–​behavior therapy. New York, NY: Guilford. Reynolds, C. F., 3rd, & Redline, S. (2010). The DSM-​V sleep–​wake disorders nosology: An update and an invitation to the sleep community. Sleep, 33, 10–​11. Riedel, B. W., & Lichstein, K. L. (2000). Insomnia and daytime functioning. Sleep Medicine Reviews, 4, 277–​298. Roth, T., Jaeger, S., Jin, R., Kalsekar, A., Stang, P. E., & Kessler, R. C. (2006). Sleep problems, comorbid mental disorders, and role functioning in the National Comorbidity Survey Replication. Biological Psychiatry, 60, 1364–​1371. Saliba, A. J., Henderson, R. D., Deane, F. P., & Mahar, D. (1998). The Arousability Predisposition Scale:  Validity and determinants of between-​subject variability. Journal of General Psychology, 125, 263–​269. Sanford, S. D., Bush, A. J., Stone, K. C., Lichstein, K. L., & Aguillard, N. (2008). Psychometric evaluation of the Beck Anxiety Inventory: A sample with sleep-​disordered breathing. Behavioral Sleep Medicine, 6, 193–​205.

Sateia, M. J., Doghramji, K., Hauri, P. J., & Morin, C. M. (2000). Evaluation of chronic insomnia: An American Academy of Sleep Medicine review. Sleep, 23, 243–​308. Savard, J., Simard, S., Ivers, H., & Morin, C. M. (2005). Randomized study on the efficacy of cognitive–​ behavioral therapy for insomnia secondary to breast cancer, Part I: Sleep and psychological effects. Journal of Clinical Oncology, 23, 6083–​6096. Savard, M. H., Savard, J., Simard, S., & Ivers, H. (2005). Empirical validation of the Insomnia Severity Index in cancer patients. Psychooncology, 14, 429–​441. Schutte-​ Rodin, S., Broch, L., Buysse, D., Dorsey, C., & Sateia, M. (2008). Clinical guideline for the evaluation and management of chronic insomnia in adults. Journal of Clinical Sleep Medicine, 4, 487–​504. Schwartz, J. E., Jandorf, L., & Krupp, L. B. (1993). The measurement of fatigue: A new instrument. Journal of Psychosomatic Research, 37, 753–​762. Semler, C. N., & Harvey, A. G. (2004). Monitoring for sleep-​ related threat:  A pilot study of the Sleep Associated Monitoring Index (SAMI). Psychosomatic Medicine, 66, 242–​250. Simon, G. E., & VonKorff, M. (1997). Prevalence, burden, and treatment of insomnia in primary care. American Journal of Psychiatry, 154, 1417–​1423. Sivertsen, B., Overland, S., Neckelmann, D., Glozier, N., Krokstad, S., Pallesen, S., . . . Mykletun, A. (2006). The long-​term effect of insomnia on work disability:  The HUNT-​2 historical cohort study. American Journal of Epidemiology, 163, 1018–​1024. Smets, E. M., Garssen, B., Bonke, B., & De Haes, J. C. (1995). The Multidimensional Fatigue Inventory (MFI) psychometric qualities of an instrument to assess fatigue. Journal of Psychosomatic Research, 39, 315–​325. Spielberger, C. D. (1983). Manual for the State–​ Trait Anxiety Inventory (STAI). Palo Alto, CA:  Consulting Psychologists Press. Spielman, A. J., & Glovinsky, P. B. (1991). The varied nature of insomnia. In P. Hauri (Ed.), Case studies in insomnia (pp. 1–​15). New York, NY: Plenum. Suka, M., Yoshida, K., & Sugimori, H. (2003). Persistent insomnia is a predictor of hypertension in Japanese male workers. Journal of Occupational Health, 45, 344–​350. Talbot, L. S., Maguen, S., Metzler, T. J., Schmitz, M., McCaslin, S. E., Richards, A., . . . Neylan, T. C. (2014). Cognitive behavioral therapy for insomnia in posttraumatic stress disorder: A randomized controlled trial. Sleep, 37, 327–​341. Taylor, D. J., Lichstein, K. L., Durrence, H. H., Reidel, B. W., & Bush, A. J. (2005). Epidemiology of insomnia, depression, and anxiety. Sleep, 28, 1457–​1464. Taylor, D. J., Mallory, L. J., Lichstein, K. L., Durrence, H. H., Riedel, B. W., & Bush, A. J. (2007). Comorbidity of chronic insomnia with medical problems. Sleep, 30, 213–​218.


Thorndike, F. P., Ritterband, L. M., Saylor, D. K., Magee, J. C., Gonder-​Frederick, L. A., & Morin, C. M. (2011). Validation of the Insomnia Severity Index as a web-​ based measure. Behavioral Sleep Medicine, 9, 216–​223. Vallières, A., Ivers, H., Bastien, C. H., Beaulieu-​Bonneau, S., & Morin, C. M. (2005). Variability and predictability in sleep patterns of chronic insomniacs. Journal of Sleep Research, 14, 447–​453. Vgontzas, A. N., Liao, D., Pejovic, S., Calhoun, S., Karataraki, M., Basta, M., . . . Bixler, E. O. (2010). Insomnia with short sleep duration and mortality:  The Penn State cohort. Sleep, 33, 1153–​1164. Weatherley-​Jones, E., Nicholl, J. P., Thomas, K. J., Parry, G. J., McKendrick, M. W., Green, S. T., .  .  . Lynch, S. P. (2004). A randomised, controlled, triple-​blind trial of the efficacy of homeopathic treatment for chronic fatigue syndrome. Journal of Psychosomatic Research, 56, 189–​197.

Wetherell, J. L., & Gatz, M. (2005). The Beck Anxiety Inventory in older adults with generalized anxiety disorder. Journal of Psychopathology and Behavioral Assessment, 27, 17–​24. Wohlgemuth, W. K., Edinger, J. D., Fins, A. I., & Sullivan, R. J., Jr. (1999). How many nights are enough? The short-​ term stability of sleep parameters in elderly insomniacs and normal sleepers. Psychophysiology, 36, 233–​244. Wyatt, J. K., Cvengros, J. A., & Ong, C. J. (2012). Clinical assessment of sleep–​wake complaints. In C. M. Morin & C. A. Espie (Eds.), The Oxford handbook of sleep and sleep disorders (pp. 383–​404). New  York, NY:  Oxford University Press. Zhong, Q., Gelaye, B., Fann, J. R., Sanchez, S. E., & Williams, M. A. (2014). Cross-​cultural validity of the Spanish version of PHQ-​9 among pregnant Peruvian women: A Rasch item response theory analysis. Journal of Affective Disorders, 158, 148–​153.

26

Child and Adolescent Pain

C. Meghan McMurtry
Patrick J. McGrath

The International Association for the Study of Pain defines pain as “an unpleasant sensory and emotional experience associated with actual or potential tissue damage, or described in terms of such damage” (Merskey & Bogduk, 1994, p. 210, emphasis added). The definition goes on to assert that pain is always subjective. This definition is very widely accepted and serves as the starting point for all pain assessment.

Pain is common throughout childhood. There has been an explosion of research in the scientific study of pain and its measurement in children and youth. One of the most important findings has been that pain is much more complex than was thought 60 years ago. The beginning of the modern era of pain research can be marked by Melzack and Wall’s (1965) seminal paper proposing the gate control theory of pain. They posited that pain was modulated by “gates” in the spinal cord, by descending signals from the brain, and by peripheral stimulation. More recently, both peripheral and central sensitization have been described (Taddio & Katz, 2005; Woolf, 2011). In effect, our bodies have a memory for pain so that one experience of pain can trigger more pain during later experiences (Taddio, Katz, Illersich, & Koren, 1997). Rather than “getting used to” pain, we (and children and adolescents) can actually become more sensitive to it (Fradet, McGrath, Kay, Adams, & Luke, 1990; Woolf, 2011). As explored later, there are significant negative sequelae from unmanaged pain. Unfortunately, many children and adolescents continue to suffer from inadequately treated acute and chronic/recurrent pain (Perquin et al., 2000; Stevens et al., 2011).

Pain measurement is the application of a metric to a specific aspect of pain. Assessment is much broader than measurement and includes the selection of what aspects of pain to measure and what measures to use (McGrath & Unruh, 1987). Often, the focus is pain intensity; however,

intensity alone fails to capture the overall experience of pain. The Pediatric Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (PedIMMPACT; McGrath et al., 2008) recommended that in addition to pain intensity, several other domains should be considered, including satisfaction with treatment, symptoms and adverse events, physical recovery, emotional response, role functioning, sleep, and economic factors. One strategy to capture the relevant domains is to use a standard battery of questions (e.g., Eccleston et  al., 2005). Alternatively, one can select specific measures for each aspect of the pain experience that is to be measured. Regardless, the first step in the effective management of pain in children and adolescents is an evidence-​based assessment. In this chapter, we provide recommendations for assessment tools that have demonstrated utility and feasibility in clinical settings or hold promise as clinical assessment tools. Assessment tools in pediatric pain can be thought of as self-​report, behavioral (observational), physiological, or some combination. Because pain is always a subjective experience, self-​report has been referred to as the “gold standard” in pain assessment (Twycross, Voepel-​ Lewis, Vincent, Franck, & von Baeyer, 2015). However, self-​report tools have a number of limitations (e.g., Craig, Lilley, & Gilbert, 1996; Twycross et al., 2015; von Baeyer, 2013). For example, these tools require sophisticated cognitive and communication abilities, and ratings on them are likely influenced by self-​interest (Craig et  al., 1996; von Baeyer, 2006, 2013). A child who knows that medication will be given by needle may underreport pain to avoid the needle (Eland & Anderson, 1977). Ideally, a thorough pain assessment would utilize a combination of behavioral and self-​report tools. Unfortunately, in clinical practice this is often not feasible. Self-​report might best be considered the “primary” but not exclusive source of


information in verbal individuals (von Baeyer, 2013). Many self-​ report tools are quick and cost-​ effective to administer in clinical settings. Thus, they have received the most research attention, and many have undergone rigorous psychometric testing. Therefore, we focus our review on self-​report tools. When relevant, we also discuss parental proxy reports. For a review of observational measures of pediatric pain, the reader is directed to Chorney and McMurtry (2013). Researchers have used physiological methods (e.g., heart rate and vagal tone) for measuring pain in very young children (infants in particular), but these methods are rarely used in clinical care outside of neonatal intensive care units and no single measure capturing pain has been identified (Brummelte, Oberlander, & Craig, 2013). Furthermore, physiologically based assessments (e.g., heart rate) may reflect other biological states (e.g., arousal) rather than pain. This chapter focuses on the assessment of pain in children between the ages of 3 and 18  years without severe cognitive impairment. The assessment of pain in cognitively impaired children is beyond the scope of this chapter, and the reader is referred to Oberlander and Symons (2006) and Belew and colleagues (2013). Because assessment of pain in infants is quite specialized, it is not covered in this chapter (see Lee & Stevens, 2013). Our discussion of the pediatric pain assessment literature is tailored to mental health professionals such as clinical and health psychologists, social workers, and psychiatrists working with children who suffer from pain. The role of mental health practitioners in pediatric pain assessment is multifaceted and varies from setting to setting. However, mental health practitioners are united by a focus that extends beyond assessing simple pain perception. They also examine the effects of pain on the functioning of youth and their families across domains, such as physical, emotional, and social role functioning, because the nature of pain requires assessment beyond simplistic pain intensity.

THE NATURE OF PAIN

The World Health Organization (WHO, 1948) defines health as “a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity.” This definition is consistent with a biopsychosocial model of pain that captures the complex, bidirectional relations among biological (e.g., genetics and biochemicals such as endorphins), psychological (e.g., self-efficacy, anxiety, and depression), and social (e.g., caregiver/parental responses and peer support) factors that influence pain experience and expression.

Measurement and assessment of pain in children and adolescents must be considered within a developmental context. First, there are developmental factors in the occurrence of different pains (King et al., 2011). For example, recurrent abdominal pain is more common in younger than older children, headache is more common in older children, and migraine increases sharply after puberty, especially in females (King et al., 2011; Unruh & Campbell, 1999). Second, development limits children’s understanding of pain. An 18-month-old child is unlikely to understand why he or she should receive a needle for a vaccination, whereas an 11-year-old child can understand. Third, development is a limiting factor in the use of self-report measures. Many children younger than 5 or 6 years of age cannot consistently use self-report measures for pain or show response biases such as endorsing the extremes of a scale (Chambers & Johnston, 2002; von Baeyer, 2013). Conversely, older children may inhibit behavioral responses to pain and thus make observational measures less useful. In this chapter, although we focus on measures that have demonstrated validity across a broad age range (from 3 through 18 years), we have kept developmental factors in mind while formulating our recommendations.

Pain in children and adolescents can arise from medical procedures such as needles or surgery and can also be caused by disease or trauma. Some diseases, such as sickle cell disease or juvenile rheumatoid arthritis, frequently cause pain, but the amount of pain often does not correspond to the severity of the underlying disease. The origin of pain may also be unknown—a large proportion of children who attend pediatric chronic pain clinics suffer from pain of unknown origin. When the cause of pain cannot be ascertained, it is often assumed to be the result of psychological factors. This is an unfortunate and pernicious strategy because it alienates patients who believe they are being blamed for their pain and that they are being told their pain “is all in their head.” This “leap to the head,” as Wall (1989) called it, is not scientifically justified because there is seldom any positive evidence of psychological causation. However, psychological factors are important in the experience of pain regardless of etiology.

Pain may also be categorized in terms of its time course, such as acute, recurrent, or chronic. Acute pain can be divided into short sharp pain that may last a few seconds to a few minutes or longer lasting acute pain that may last from hours to days. Postoperative pain and pain from injuries are the most common longer lasting acute pain. Examples of short sharp pain include pain from everyday accidents such as stubbing a toe or skinning a knee. Clinical short sharp pain is typically from medical procedures, such as needles. Vaccinations by needle are common for all children, and children with chronic illnesses such as cancer or diabetes undergo other types of needle procedures (e.g., insulin injections, venipunctures, bone marrow aspirations, and lumbar punctures). Unmanaged pain and fear during medical procedures are associated with negative short- and long-term consequences, including longer procedure times, use of physical restraint during the procedure, injuries, increased pain and distress during future procedures, and negative memories of the event, and can contribute to the development of significant needle fear (McMurtry et al., 2015; Taddio et al., 1997).

Chronic pain is typically defined as pain that lasts more than 3 months (Merskey & Bogduk, 1994). It may be persistent or recurrent (episodic), in which there are bouts of pain interspersed with either pain-free or low-pain periods. Chronic pain is common in youth. Median prevalence rates differ depending on the type of pain and range from 11% to 38%, with the most common including headaches, abdominal pain, back pain, and musculoskeletal pain (King et al., 2011). Chronic pain is associated with impairments in functioning (school, social, and physical) and increased risk of internalizing symptoms (Dick & Pillai Riddell, 2010; Forgeron et al., 2010; Huguet & Miró, 2008; Varni et al., 1996). Approximately 5% of youth with chronic pain are moderately to severely impaired (Huguet & Miró, 2008). Youth with chronic pain are at increased risk of continuing to have pain as adults (Brna, Dooley, Gordon, & Dewan, 2005; Walker, Dengler-Crish, Rippel, & Bruehl, 2010). The economic costs of chronic pain are enormous: Direct health care costs due to adult chronic pain are estimated at $6 billion per year in Canada (M. E. Lynch, 2011) and more than $260 billion in the United States (Gaskin & Richard, 2011). There is less literature on costs of pediatric chronic pain; however, Groenewald, Essner, Wright, Fesinmeyer, and Palermo (2014) studied a sample of American adolescents presenting for initial evaluation at interdisciplinary pain treatment programs and extrapolated the annual costs of moderate to severe chronic pain to $19.5 billion.

ASSESSMENT FOR DIAGNOSIS

As mentioned previously, most clinical short sharp pain is iatrogenic and of known cause (e.g., from a needle or a surgical procedure). Pain and related fear regarding needle procedures should be managed using pharmacological (e.g., topical anesthetics), physical (e.g., sitting in an upright position), and psychological (e.g., distraction) strategies; based on a series of systematic reviews, a comprehensive clinical practice guideline for the management of vaccination-related pain and fear has been published (Taddio et al., 2015). With respect to the diagnosis of recurrent and chronic pain, there are several parameters of pain that are useful to consider, including pain intensity, localization, quality, frequency, and duration. Table 26.1 provides summary information on measures designed to assess some of these constructs.

TABLE 26.1 Ratings of Instruments Used for Diagnosis

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
Pieces of Hurt Tool | NA | NA | NA | NA | A | G | G | G |
The Oucher | NA | NA | NA | NA | G | G | G | G |
FPS-R | NA | NA | NA | NA | A | G | E | G |
Visual analogue scales | NA | NA | NA | NA | NA | A | G | A |
Numerical rating scale | NA | NA | NA | A | A | G | G | G | ✓
APPT | NA | NA | NA | A | G | G | E | A | ✓

Note: FPS-R = Faces Pain Scale-Revised; APPT = Adolescent Pediatric Pain Tool; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

Pain Intensity

Sensory intensity is often one of the first dimensions of pain assessed by clinicians. It is vital for clinicians to obtain quantitative ratings of intensity in order to understand the extent of children’s pain. However, reliance on pain intensity is an oversimplification of a complex experience and should not on its own determine clinical


care (Schiavenato & Craig, 2010; Voepel-​Lewis, 2011; von Baeyer, 2013). Although pain intensity measures are discussed in this section, these measures are also valuable for treatment planning and monitoring treatment effectiveness. These tools are all single-​item measures of a subjective and ever-​fluctuating state; therefore, conventional indices of reliability are not generally applicable. In general, a pain-​free state is the norm. Norms for painful conditions have little meaning because pain problems vary considerably. The typical intensity, duration, and frequency of pain that produces a given level of functional disability would be of interest but have not been researched. Clinicians and parents make these judgments without good normative data. For example, the judgment that it is not “normal” for a 14-​year-​old girl with headache pain to miss 4 days of school each month is likely statistically true. But, how severe might the pain have to be to make school absence typical and thus normative (even if it is not helpful)? Pieces of Hurt Tool The Pieces of Hurt Tool (sometimes called the Poker Chip Tool; Hester, 1979) is a concrete ordinal rating tool. This tool consists of four plastic poker chips that represent “pieces of hurt.” When the tool is administered, the child is asked, “Did/​does it hurt?” If the child says “no,” a score of zero is recorded. If the child says “yes,” he or she is asked to indicate pain intensity by selecting between one and four poker chips, with one chip representing “a little hurt” and four chips representing “the most hurt.” The number of chips selected is the child’s score. Psychometric data indicate that the Pieces of Hurt Tool score provides a valid self-​report measure of child pain intensity (Stinson, Kavanaugh, Yamada, Gill, & Stevens, 2006). It has adequate content validity (Hester, 1979)  and good construct validity. It has been shown to correlate strongly with other self-​report and observational pain intensity measures (Beyer & Aradine, 1987, 1988; Gharaibeh & Abu-​Saad, 2002; Goodenough et al., 1997; Hester, 1979; Suraseranivongse et al., 2005). Evidence of discriminant validity includes low correlations with two measures of fear (Beyer & Aradine, 1988). The Pieces of Hurt Tool has been studied with children between the ages of 3 and 18 years (Stinson, Kavanagh, et al., 2006) and has been used with hospitalized children (Beyer & Aradine, 1987, 1988)  and children with postoperative pain (Aradine, Beyer, & Tompkins, 1988; Suraseranivongse et  al., 2005), as well as children undergoing venipunctures (Gharaibeh & Abu-​Saad, 2002) and immunizations

(Goodenough et al., 1997; Hester, 1979). The Pieces of Hurt Tool has been translated and validated for use with children in Thailand (Suraseranivongse et al., 2005) and Jordan (Gharaibeh & Abu-​Saad, 2002). Advantages of the Pieces of Hurt Tool include that it is scored on a concrete ordinal rating scale. This type of scale is appealing for use with children because concrete representations (poker chips) enhance children’s ability to understand the concept of levels of hurt/​pain (Stinson, Kavanagh, et al., 2006). Disadvantages include the need to sterilize the chips after each use and its 0 to 4 rating scale differs from the 0 to 10 scale widely used in health settings. We recommend it for use with children between the ages of 3 and 7  years suffering from acute pain, as it requires further testing in preschool-​aged children and children with chronic pain (Stinson, Kavanagh, et al., 2006). The Oucher The Oucher (Beyer, 1984) consists of two separate scales in a poster format:  a 0-​to-​100 numerical scale for older children and a photographic faces scale for younger children that is scored from 0 to 5. The original Oucher photographic scale shows the face of a 4-​year-​old Caucasian boy in increasing levels of discomfort, from “no hurt” to “the biggest hurt you could ever have.” African American and Hispanic versions of the Oucher are available (Villarruel & Denyes, 1991), and a modified Asian version of the Oucher (with a numerical scale that ranges from 0 to 10) has also been developed (Yeh, 2005). Finally, a First Nations version has been developed but little information on the psychometrics is available (Shapiro, 1997). Extensive psychometric research has been carried out on the Oucher (Stinson, Kavanagh, et  al., 2006; Tomlinson, von Baeyer, Stinson, & Sung, 2010). Belter, McIntosh, Finch, and Saylor (1988) assessed the Oucher score’s reliability by asking young children to rate the intensity of pain depicted in various cartoon scenes and found low to moderate levels of test–​ retest reliability. Luffy and Grove (2003) also obtained ratings of test–​retest reliability on the African American version by asking children to rate the pain they experienced from two past medical procedures/​treatments (r  =  .70). The conceptual framework behind the Oucher was clearly defined and informed each step in its creation (Beyer, Denyes, & Villarruel, 1992). Three to 7-​year-​old children show strong agreement with the order of the six original photographs (Beyer & Aradine, 1986). Content validity has also been established for the African American, Hispanic, and


Asian versions (Villarrruel & Denyes, 1991; Yeh, 2005). The construct validity of the original Oucher is supported by strong and positive correlations with a visual analogue scale of pain in a group of hospitalized children (Beyer & Aradine, 1988). Evidence for discriminant validity is provided by low correlations with two measures of children’s fears (Beyer & Aradine, 1988). Similar evidence of convergent and discriminant validity has been found for the African American, Hispanic, and Asian versions (Beyer & Knott, 1998; Yeh, 2005). There is evidence to support the use of the Oucher with hospitalized children (Beyer & Aradine, 1987, 1988)  and children suffering from postoperative pain (Beyer & Knott, 1998; Ramritu, 2000). It has been validated with Caucasian, African American, and Hispanic children between the ages of 3 and 12  years (Beyer & Knott, 1998). Patients seem to prefer the Oucher over a word-​graphic scale (Ramritu, 2000) but the Wong–​Baker FACES scale over the Oucher (Luffy & Grove, 2003). One of the main advantages of the Oucher is that it is culturally sensitive. On the other hand, versions other than the Asian version depict male children, and there is some informal evidence that female children may have difficulty relating to the photographs of male children (Beyer et al., 1992). As of 2009, the Oucher has been downloadable from http://​oucher.org. We recommend the numerical scale of the Oucher for use with children between the ages of 5 and 12  years because it requires further psychometric testing with very young children (i.e., ages 3 and 4 years; Stinson, Kavanagh, et al., 2006; Tomlinson et al., 2010). We also suggest that the photographic scale of the Oucher may be used with children between 3 and 6 years old who are not able to use the numerical scale. Although there are recommended tasks to determine which version a child should use (e.g., counting to 100 and sequencing shapes; Beyer et al., 1992), they are often not practical in clinical settings. Faces Pain Scale-​Revised The early development of the ability to recognize facial expressions of emotion may make it easier for children to use scales with faces (Bieri, Reeve, Champion, Addicoat, & Ziegler, 1990). However, to use a faces scale, children are still required to match their internal feelings of pain to a given face on the scale (Hicks, von Baeyer, Spafford, van Korlaar, & Goodenough, 2001). Based on the Faces Pain Scale developed by Bieri and colleagues (1990), the Faces Pain Scale-​Revised (FPS-​R; Hicks et  al, 2001)  is scored from 0 to 10 and shows a series of six faces ranging from a neutral face showing “no pain” to a face showing “very


much pain.” The child is asked to rate his or her pain by indicating which face shows how much pain (hurt) he or she has. Substantial evidence supports the psychometrics of the FPS-​R (Stinson, Kavanagh, et al., 2006; Tomlinson et al., 2010; von Baeyer, 2013). The faces for the original version of the scale were based on children’s drawings of increasing pain expressions (Bieri et al., 1990). The six faces for the FPS-​R were produced through a magnitude production task with adults (Hicks et al., 2001). The FPS-​R has generally shown strong convergent validity with other self-​report measures of pain intensity (Hicks et al., 2001; Miró & Huguet, 2004; Newman et al., 2005). Discriminant validity of the FPS-​ R has been supported by comparisons with pain affect (Miró & Huguet, 2004) and between vignettes (Stanford, Chambers, & Craig, 2006). Parents’ ratings using the FPS-​ R have been found to correlate significantly to their children’s self-​report scores (e.g., Wood et al., 2004). Acceptable test–​retest reliability in response to hypothetical events was demonstrated by Miró and Huguet (2004). The FPS-​R has been used with numerous different samples, including children between 4 and 19 years old (e.g., Hicks et al., 2001; Newman et al., 2005; Saudan et al., 2008; Taddio, Kaur Soin, Schuh, Koren, & Scolnik, 2005). The FPS-​R has been used with nonclinical samples (Hicks et al., 2001; Miró & Huguet, 2004) and numerous samples undergoing various medical procedures (Hicks et al., 2001; Lister et  al., 2006; Migdal, Chudzynska-​Pomianowska, Vause, Henry, & Lazar, 2005; Miró & Huguet, 2004; Newman et al., 2005; Saudan et al., 2008; Wood et al., 2004). Advantages of the FPS-​R include its strong psychometrics, quickness and ease of administration, and its availability (free for clinical and research use at https://​www. iasp-​pain.org/​FPSR). The FPS-​R has been translated into more than 30 languages. Versions that have been validated include French (Wood et al., 2004), Thai (Newman et al., 2005), and Catalan (Miró & Huguet, 2004). One must be cautious in administering the FPS-​R and similar scales to young children (i.e., ages 4–​6 years) because children of this age have been found to use the extreme ends of the scale (Arts et  al., 1994). Finally, the FPS-​R may not have high acceptability because children, parents, and nurses have indicated preference for more cartoon-​ like scales that have a smiling no pain face (Chambers, Hardial, Craig, Court, & Montgomery, 2005). However, in a comparison with a nonfacial scale of pain intensity, the majority of schoolchildren and children in hospital preferred the FPS-​R (Miró & Huguet, 2004). Overall, we recommend the FPS-​R for clinical use in assessing pain intensity in children between 4 and 12 years of age.
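For readers who record FPS-R responses electronically, the brief sketch below illustrates the conversion from the face a child selects to the 0-to-10 metric. It assumes the even-number scoring (0, 2, 4, 6, 8, 10) commonly applied to the six faces; the function and variable names are ours for illustration and are not part of the published scale.

```python
# Minimal sketch of recording an FPS-R response, assuming the even-number
# scoring (0, 2, 4, 6, 8, 10) for the six faces. The face index and the
# record structure here are illustrative, not part of the published scale.

FPS_R_SCORES = (0, 2, 4, 6, 8, 10)  # left-most ("no pain") to right-most ("very much pain")

def fps_r_score(face_index: int) -> int:
    """Convert the position of the face a child selects (0-5) to the 0-10 metric."""
    if not 0 <= face_index <= 5:
        raise ValueError("The FPS-R has six faces; expected an index from 0 to 5.")
    return FPS_R_SCORES[face_index]

# Example: a child points to the third face from the left.
print(fps_r_score(2))  # 4
```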


The Wong–​Baker FACES Pain Scale (Wong & Baker, 1988)  is another widely used, psychometrically sound, single-​item, self-​report measure of pain intensity (Stinson, Kavanagh, et al., 2006; Tomlinson et al., 2010; von Baeyer, 2013). It contains six cartoon-​like faces that, in contrast to the FPS-​R, range from a smiling “no hurt” face to a face with tears for the “hurts worst” face. Parents, children, and nurses have indicated a preference for the Wong–​Baker FACES Pain Scale over other faces scales (Chambers et al., 2005). However, scales with a smiling no pain face have been shown to confound affect with pain intensity (Chambers & Craig, 1998; Chambers, Giesbrecht, Craig, Bennett, & Huntsman, 1999). Children reporting their pain on faces scales with a smiling no pain face endorse higher pain ratings than on scales with a neutral face anchor (Chambers et al., 1999). Although these differences in ratings are statistically significant, it is not clear whether these differences affect the clinical care of children, and debate continues on this and whether the scale measures fear or not (Chambers et al., 2005; Garra, Singer, Domingo, & Thode, 2013). Visual Analogue Scales Usually, visual analogue scales (VAS) consist of a 10-​cm horizontal line drawn on a piece of paper, with stops (anchors) placed at each end of the line (Wewers & Lowe, 1990). The anchors are labeled from, for example, “no pain” to “the most extreme pain,” and the child is asked to point or make a mark on the line to represent his or her current level of pain intensity. The recordings are typically measured in millimeters, yielding scores that range from 0 to 100. The minimum clinically significant difference on 10-​cm VAS for child pain intensity is 10 mm (Powell, Kelly, & Williams, 2001). There are many versions of the VAS available that differ in the terminology they use for the anchors, presence or absence of divisions along the line, units of measurement, length, and orientation of the scale (Stinson, Kavanagh, et al., 2006). Table 26.1 summarizes the psychometric testing of VAS for child pain intensity (Stinson, Kavanagh, et  al., 2006; von Baeyer, 2013). In a study that examined children’s ratings of past medical procedures/​treatments, Luffy and Grove (2003) found that only 45% of children rated their pain intensity within 10 mm above or below their original rating. However, more research on children’s recalled pain intensity is needed before any conclusions about the test–​ retest reliability of VAS can be drawn. Convergent validity is supported by moderate to strong correlations with other child-​reported pain intensity measures (Beyer & Aradine,

1987, 1988; Migdal et al., 2005). In terms of discriminant validity, VAS have been found to have low correlations with two measures of fear (Beyer & Aradine, 1988). VAS have been used successfully with children in acute and chronic pain (Beales, Keen, & Lennox-​Holt, 1983; Beyer & Aradine, 1987, 1988; Migdal et al., 2005; Powell et al., 2001). Child preference data on VAS compared to other self-​report measures are equivocal (Berntson & Svensson, 2001; Luffy & Grove, 2003). VAS are quick and easy to use. They can be easily and affordably photocopied for use (as long as line length remains constant). Another advantage is that they allow for measurement of pain on an interval scale, which allows for greater sensitivity (Champion, Goodenough, von Baeyer, & Thomas, 1998). One of the main disadvantages of VAS is that clinicians must be careful to ensure that children (especially younger children) understand the instructions for their use. These scales require children to seriate their perceptions from small to large, and this ability does not appear until children are approximately 7 years of age (Shields, Palermo, Powers, Grewe, & Smith, 2003). For this reason, we recommend the use of VAS with children older than the age of 8 years (Stinson, Kavanagh, et al., 2006). Numerical Rating Scales Numerical rating scales (NRS; or verbal numerical scales [VNS] if delivered verbally) are probably the most frequently used scales and are well established for children 8 years old or older (von Baeyer, 2013; von Baeyer et al., 2009). For example, von Baeyer and colleagues (2009) presented analyses from three different data sets supporting the use of NRS in postoperative pain and vaccination pain in terms of concurrent validity with the VAS and the FPS-​R. Although the age range was 7 to 17 years, the authors recommended use in children 8 year old or older (von Baeyer et  al., 2009). Miró and Huguet (2009) also demonstrated concurrent validity and discriminant validity in healthy schoolchildren and children post-​surgery. Bailey, Daoust, Doyon-​ Trottier, Dauphin-​ Pierre, and Gravel (2010) reported data gathered from 8-​to 17-​year-​ old children in the emergency department that supported test–​test reliability and content validity of the VNS; participants significantly preferred the VNS to a VAS and a verbal descriptor rating scale. Ruskin and colleagues (2014) presented evidence of discriminant validity of the VNS for assessing pain intensity in youth with chronic pain. We recommend the NRS (VNS) for use with children 8 years old or older.
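As a worked illustration of the VAS conventions described above (a 10-cm line scored in millimetres, with a 10-mm minimum clinically significant difference; Powell et al., 2001), the following sketch shows one way such scores might be computed. The names and the rescaling for non-standard line lengths are our own assumptions rather than part of any published tool.

```python
# Illustrative sketch of scoring a 10-cm VAS and checking the 10-mm minimum
# clinically significant difference (MCID) reported for child pain intensity.
# Function and variable names are ours, not part of any published measure.

VAS_LINE_MM = 100   # a standard 10-cm line measured in millimetres
MCID_MM = 10        # minimum clinically significant difference (Powell et al., 2001)

def vas_score(mark_mm: float, line_length_mm: float = VAS_LINE_MM) -> float:
    """Convert the child's mark (distance from the 'no pain' anchor, in mm) to a 0-100 score."""
    if not 0 <= mark_mm <= line_length_mm:
        raise ValueError("The mark must fall on the line.")
    # Rescaling is an assumption for cases where copying altered the line length.
    return 100 * mark_mm / line_length_mm

def clinically_meaningful_change(before: float, after: float) -> bool:
    """True if the difference between two VAS scores is at least the 10-mm MCID."""
    return abs(after - before) >= MCID_MM

print(vas_score(37))                          # 37.0
print(clinically_meaningful_change(62, 45))   # True
```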


Pain Localization The location of a child’s pain is an important assessment parameter. In clinical practice, children are often asked to tell or point to where they feel pain. These informal methods have limitations. Children may not have enough anatomical knowledge to accurately express where they hurt and/​or may be hesitant to point to their pain sites (Savedra, Tesler, Holzemer, Wilkie, & Ward, 1989). These methods do not preserve empirical documentation of children’s responses. Body outline tools have been developed to aid in the assessment of pain localization. However, they have not received nearly as much research attention as measures of intensity in the pediatric pain literature. Eland Color Tool The Eland Color Tool (Eland & Anderson, 1977)  is a measure of child pain intensity and localization. To use this tool, the child is asked to choose four crayon colors to represent “no hurt,” “a little hurt,” “more hurt,” and “worst hurt.” The child then selects the color that represents his or her level of pain and is asked to color in a body outline wherever he or she hurts. In an unpublished study, 98% of hospitalized children between the ages of 4 and 10 years could place a mark on this tool that coincided with their pathology, surgical procedure, or another painful event that occurred during hospitalization (Eland & Anderson, 1977). A  modified version of the Eland Color Tool has been used successfully in a sample of children with developmental delays (Benini et al., 2004). The advantage of the Eland Color Tool is that it allows clinicians to assess children’s self-​reported pain, localization, and intensity. In our clinical experience, this tool has proven to be very appealing to young children, who are often eager to use a favorite activity (coloring) to communicate about their pain. Because the Eland Color Tool has not undergone rigorous psychometric testing to date, it must be interpreted with caution. Adolescent Pediatric Pain Tool The Adolescent Pediatric Pain Tool (APPT; Savedra, Holzemer, Tesler, & Wilkie, 1993) is a self-​report multidimensional pain measure that can be used to assess pain intensity, localization, and quality. It is divided into three separate components: a body outline, a word-​graphic rating scale, and a qualitative descriptive word list. The body outline is made up of two line drawings showing the front


and back of the body. The word-​graphic rating scale is a 10-​cm VAS with five pain intensity anchors (“no pain,” “little pain,” “medium pain,” “large pain,” and “worst pain possible”). The word list is composed of 67 words that describe the sensory, affective, evaluative, and temporal dimensions of pain. Both the body outline and the word list are based on a widely used adult pain assessment measure, the McGill Pain Questionnaire (Melzack, 1983). Adolescents indicate the location of their current pain on the body outline, rate their current pain intensity on the word-​graphic rating scale, and highlight words describing their current pain experience. The APPT has undergone rigorous psychometric testing with both healthy and hospitalized children from a wide range of ethnic backgrounds between the ages of 8 and 17  years. The development of the APPT is documented in a series of published studies (Savedra et  al., 1989; Tesler et  al., 1991; Wilkie et  al., 1990)  that provide good evidence for the content validity of each of its three components. Evidence for the convergent validity of the body outline component of the APPT is supplied by a study in which hospitalized children’s markings on the body outline were found to match nurses’ observations and/​or medical records (Savedra et al., 1989). The word-​graphic rating scale has also shown evidence of convergent validity through moderate to strong correlations with other scales (Tesler et al., 1991). Scores on the word list component of the APPT have shown weak to moderate but significant correlations with pain intensity and number of pain sites; these results provide some limited support for its convergent validity (Wilkie et  al., 1990). Children may underselect descriptors consistent with neuropathic pain when asked to choose from the APPT list in comparison to when they are asked to report sensations in an affected body part generally (Ho, Curtis, & Clarke, 2015). Test–​retest reliability has been supported for all three components (Savedra et  al., 1993; Tesler et al., 1991; Wilkie et al., 1990). The originally intended age range was 8 to 17 years; however, the APPT has been used with individuals ranging from to 2 to 68 years of age (Fernandes, De Campos, Batalha, Perdigão, & Jacob, 2014). Younger children may not be able to understand some of the words in the word list and may have difficulty with left/​right reversal of body outline drawings (Savedra et  al., 1989, 1993; Wilkie et  al., 1990). The APPT has been used with youth with sickle cell disease, cancer, HIV, undergoing venipuncture or immunization, and in the postoperative context (Fernandes et al., 2014). Average administration time for the APPT is approximately 3 to 6 minutes (Savedra et al., 1993). The tool is


easily reproducible, provided that the correct scaling of the word-​graphic rating scale is preserved. Scoring the tool requires placing a clear plastic template over the completed body outline to measure the number of separate locations marked by the child; measuring the child’s mark on the word-​graphic rating scale with a ruler; and calculating total and percentage sensory, affective, and evaluative subscale scores for the word list. The multistep scoring procedure may limit the APPT’s feasibility in some busy clinical settings. Although no child preference data appear to be available for the complete APPT, child preference was taken into account during the development of the word-​graphic rating scale (Tesler et al., 1991). The main advantage of the APPT is that it measures three important dimensions of pain (location, intensity, and quality). The body outline component of the APPT is the only pediatric pain tool of its kind with evidence to support its reliability and validity. It provides a measure of pain location in children who may not have enough anatomical knowledge to indicate the location of their pain verbally or who may be hesitant to point to their pain sites (Savedra et  al., 1989). The body outline also provides empirical documentation that can be used to track changes in children’s pain location over time. A  recent systematic review of the APPT concluded that the APPT has support for use with hospitalized children between 8 and 17  years of age, with further work needed for other populations (Fernandes et al., 2014). Pain Quality The quality of pain is important for clinical description and/​or diagnosis because it may give an indication of the type of pain or particular disease causing the pain. For example, burning pain is a part of the diagnosis of neuropathic pain, pounding headache is one part of the criteria for migraine, steady headache is a component of a tension-​type headache diagnosis, and pain in the flank radiating to the groin is indicative of kidney stones. There are no particular measures for children and adolescents that have been mapped to specific disorders. Currently, the APPT word list is the only tool assessing pain quality that has undergone extensive psychometric testing with children, and we recommend it for that purpose. Pain Frequency and Duration Pain frequency and duration are important dimensions of the pain experience to take into account when

formulating diagnoses. Diagnoses for chronic and recurrent pain conditions depend on the time course of the symptoms; as noted previously, chronic pain typically has to last for 3 months or more. In painful conditions seen by other specialists (e.g., gastroenterologists), there are similar requirements; for example, functional abdominal pain requires that the symptoms must occur at least 4 times per month for at least 2  months (Hyams et  al., 2016). Pain diaries are used to augment diagnostic information provided via interview and questionnaires. Pain diaries are discussed in more detail in the Assessment for Case Conceptualization and Treatment Planning section. Overall Evaluation To make accurate diagnoses in children and adolescents suffering from pain, clinicians must, at the minimum, assess the perceptual dimensions of pain, including intensity, localization, quality, duration, and frequency. The Pieces of Hurt Tool, FPS-​R, VAS, numerical rating scales, and the Oucher have all undergone rigorous psychometric testing, and we recommend their use in clinical settings for the assessment of pain intensity in preschool and school-​aged children. We also recommend VAS and the word-​graphic rating scale of the APPT for the measurement of pain intensity in older children and adolescents. For pain localization, we recommend the APPT body outline tool with older children and adolescents. For younger children, the Eland Color Tool may be useful for measuring pain intensity and localization. However, this measure has not undergone rigorous psychometric testing to date. For pain quality, we recommend the APPT word list for older children and adolescents. Unfortunately, we know of no similar measure that has been validated for use with younger children. We recommend pain diaries for the assessment of pain frequency and duration for the purposes of diagnosis, with parental proxy-​report for younger children.
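The multistep APPT scoring described earlier (counting marked body locations, measuring the word-graphic rating, and computing total and percentage sensory, affective, and evaluative word-list scores) can be illustrated with a brief sketch. The word sets below are placeholders only; the actual 67-word list and its scoring rules must be taken from the published tool (Savedra et al., 1993).

```python
# A hedged sketch of tallying APPT word-list endorsements: counts and
# percentages of endorsed descriptors per dimension. The per-dimension word
# sets are placeholders, not the real APPT items.

from typing import Dict, Set

def appt_word_list_scores(endorsed: Set[str],
                          dimensions: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Return the number and percentage of words endorsed within each dimension."""
    scores = {}
    for name, words in dimensions.items():
        hits = endorsed & words
        scores[name] = {
            "count": len(hits),
            "percent": 100 * len(hits) / len(words) if words else 0.0,
        }
    return scores

# Illustrative only: tiny stand-in word sets.
dimensions = {
    "sensory": {"burning", "stabbing", "aching", "throbbing"},
    "affective": {"frightening", "miserable"},
    "evaluative": {"annoying", "unbearable"},
}
print(appt_word_list_scores({"aching", "throbbing", "miserable"}, dimensions))
```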

ASSESSMENT FOR CASE CONCEPTUALIZATION AND TREATMENT PLANNING

Mental health practitioners will typically not be involved in case conceptualization or treatment planning for children undergoing isolated acutely painful procedures or injuries. The exception is a child who develops severe anxiety and fear during certain painful procedures, such as vaccination by needle injection. In this case, the assessment typically focuses on the anxiety and fear associated with the painful procedure rather than on the pain itself. (For a clinical practice guideline on high levels of needle fear across the lifespan, see McMurtry et al. [2016].) Most pediatric pain assessments by mental health practitioners are with children suffering from recurrent or chronic pain.

To obtain a full case conceptualization for a child or adolescent suffering from recurrent or chronic pain, clinicians must assess basic pain perception parameters, such as pain intensity and frequency, over time. It is also important for clinicians to assess the impact of the child’s pain on his or her day-to-day functioning to define targets for intervention. Finally, there are a number of psychological factors related to pain that clinicians should consider when conceptualizing cases and formulating treatment plans. These psychological factors include pain catastrophizing and fear of pain, as well as general anxiety and depression. Table 26.2 provides summary information on measures designed to assess some of these constructs.

TABLE 26.2 Ratings of Instruments Used for Case Conceptualization and Treatment Planning

Instrument | Norms | Internal Consistency | Inter-Rater Reliability | Test–Retest Reliability | Content Validity | Construct Validity | Validity Generalization | Clinical Utility | Highly Recommended
FDI | G | E | NA | A | A | E | G | G |
PedsQL 4.0 | E | A | NA | A | G | E | E | A |
PCS-C | A | A | NA | A | A | A | G | G |

Note: FDI = Functional Disability Inventory; PedsQL 4.0 = Pediatric Quality of Life Inventory Generic Core Scales; PCS-C = Pain Catastrophizing Scale for Children; A = Adequate; G = Good; E = Excellent; NA = Not Applicable.

Pain Frequency and Duration

Information about pain frequency and duration augments basic diagnostic information and allows mental health professionals to form full psychological case conceptualizations and plan treatments for children suffering from recurrent or chronic pain. By working with children and their families to document pain frequency and duration, clinicians gain insight into possible patterns in children’s pain experiences. For example, a child with recurrent abdominal pain may report severe pain twice per week, on Mondays and Wednesdays during a challenging class at school. This might indicate a link between the child’s abdominal pain and academic stress and suggest that psychological strategies targeting the management of academic stress may help decrease this child’s pain. Pain diaries are the best method for investigating potential patterns in children’s pain experiences.

TABLE 26.3 Pain Diary of a Fictional 10-Year-Old Child with Chronic Headaches

Name: Jamie Trisco

Date | Time | Rating (0 = No Pain to 5 = Severe Pain) | What Happened Before the Headache | What Did You Do
Monday | Breakfast | 0 | |
Monday | Lunch | 0 | |
Monday | Dinner | 4 | Late dinner | Took 2 ibuprofen
Monday | Bedtime | 0 | |
Tuesday | Breakfast | 3 | Woke up with headache | Took 2 ibuprofen
Tuesday | Lunch | 4 | | Took 2 ibuprofen
Tuesday | Dinner | 3 | | Slept for 1 hour
Tuesday | Bedtime | 1 | |
Wednesday | Breakfast | 0 | |
Wednesday | Lunch | 5 | Late lunch | Took 2 ibuprofen, slept for 1 hour
Wednesday | Dinner | 2 | | Slept for 1 hour
Wednesday | Bedtime | 1 | |

Note: This diary suggests the possibility that headaches are triggered by delays in eating and are made somewhat better by ibuprofen and by sleep. A lengthier observation period might support these ideas, and specific manipulations could confirm them.

Pain Diaries

Paper-and-pencil pain diaries (for an example, see Table 26.3) combine numerical ratings with a calendar to allow for the assessment of pain over time. Diaries are commonly used for continuing or episodic pain such as headache, abdominal pain, or neuropathic pain. Although diaries are commonly used in clinical contexts, they have not been the focus of


thorough psychometric testing. Research conducted to date suggests that children who are queried retrospectively about their pain tend to overreport pain in comparison with data from prospective pain diaries (Andrasik, Burke, Attanasio, & Rosenblum, 1985; van den Brink, Bandell-​Hoekstra, & Abu-​Saad, 2001). For this reason, pain diaries help clinicians obtain a more accurate picture of children’s pain experiences over time. Pain diaries are most typically time-​based and require the child and/​or parent to make a rating three or four times a day. More frequent reporting by the respondents will increase the precision of the diary but will also increase the burden on the respondents and likely decrease compliance. Event-​based diaries are an alternative to time-​based diaries. In this style of diary, respondents are asked to record the beginning and end of pain episodes. Event-​based diaries have the advantage of being able to determine the duration of pain episodes more accurately but also carry the risk of being unable to distinguish between the absence of pain and the absence of reporting. Whether the parent, child, or both fill out the diary will depend on the developmental level of the child. There is evidence to suggest that there is a positive relation between child report and parental proxy report of the intensity of the child’s pain (Andrasik et al., 1985; Richardson, McGrath, Cunningham, & Humphreys, 1983; Vetter, Bridgewater, Ascherman, Madan-​Swain, & McGwin, 2014); however, parents’ and children’s pain frequency ratings may differ under some circumstances, including younger child age (Vetter et  al., 2014). Furthermore, pain experienced at school is unlikely to be recorded by a parent unless it is of sufficient severity that the child has to leave school. Similarly, mild pain may not result in behavior that is evident to parents even when it is present. We recommend that clinicians routinely use paper-​ and-​pencil pain diaries to inform their case conceptualizations and treatment plans for children suffering from chronic or recurrent pain. Electronic diaries (via programs run on smartphones, tablets, or computers) are becoming more popular as well; a recent review on e-​diaries for headache recommended improvement in their development and testing, including assessing psychometric properties (Stinson et al., 2013). These e-​diaries have several advantages over paper-​and-​pencil diaries. Compliance appears to be improved with e-​diaries in that respondents are more likely to complete electronic diary recordings and to make these recordings at the time when pain occurs (rather than recalling the pain experience later and making the recording retrospectively; Lewandowski, Palermo, Kirchner, & Drotar, 2009; Palermo, Valenzuela, & Stork, 2004). The usability of an electronic chronic pain diary was demonstrated in a small sample of adolescents with

juvenile arthritis (Stinson, Petroz, et  al., 2006). Further research with larger samples of children is needed before we can determine whether the advantages of these tools over paper-​and-​pencil diaries outweigh their cost. Physical, Social, and Role Functioning In accordance with the biopsychosocial model of health and the International Classification of Functioning, Disability and Health (ICF; WHO, 2001), clinicians need to assess how pain impacts children’s physical, social, and role functioning in completing a full case conceptualization and treatment plan. When assessing physical functioning, clinicians should always include a measure of sleep because it is often disrupted in children with chronic or recurrent pain (e.g., Walters & Williamson, 1999). It is beyond the scope of this chapter to discuss pediatric sleep assessment in detail; for recommendations, see Mindell and Owens (2015) and de la Vega and Miró (2013). Appetite and weight are two other aspects of physical functioning that should be assessed. This does not usually require formal assessment measures. In children, role functioning is often synonymous with academic functioning because school is the “job” of children. In children with chronic or recurrent pain, it is also important to assess the extent to which children are taking on a “sick role” in their family and demonstrating lower functioning in their other roles (e.g., social). Functional Disability Inventory The Functional Disability Inventory (FDI) is a 15-​item global measure of children’s physical and psychosocial functioning that has both self-​and parent-​report components (Walker & Greene, 1991). Respondents are asked to indicate the perceived difficulty the child has had performing various activities (e.g., walking to the bathroom) in the previous few days. The response scale has the following options, scored 0 to 4 and summed across items: “no trouble,” “a little trouble,” “some trouble,” “a lot of trouble,” and “impossible.” Total scores range from 0 to 60 for both self-​and parent-​report forms, with higher scores indicating greater disability. Healthy children have been found to score on average between 2 and 3.5 on the FDI (Walker & Greene, 1991). Kashikar-​Zuck et al. (2011) distinguished four levels of disability corresponding to the following scores:  no/​minimal (0–​12), moderate (13–​29), and severe (≥30); they found support for two factors (physically strenuous activities and non-​physically strenuous daily activities).
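As a hedged illustration of the FDI scoring just described (15 items rated 0 to 4 and summed to a 0-to-60 total, interpreted against the reference points reported by Kashikar-Zuck et al., 2011), the following sketch shows the arithmetic. The function names are ours, and the published scoring instructions should be consulted before clinical use.

```python
# Minimal sketch of summing the 15 FDI items (each scored 0-4) and applying
# the 0-12 / 13-29 / >=30 reference points (Kashikar-Zuck et al., 2011).
# Names are ours; this is an illustration, not the official scoring program.

from typing import Sequence

def fdi_total(item_scores: Sequence[int]) -> int:
    """Sum the 15 item ratings (0 = 'no trouble' ... 4 = 'impossible'); range 0-60."""
    if len(item_scores) != 15 or not all(0 <= s <= 4 for s in item_scores):
        raise ValueError("Expected 15 item scores, each between 0 and 4.")
    return sum(item_scores)

def fdi_disability_level(total: int) -> str:
    """Classify a total score using the published reference points."""
    if total <= 12:
        return "no/minimal disability"
    if total <= 29:
        return "moderate disability"
    return "severe disability"

total = fdi_total([1, 0, 2, 3, 1, 0, 0, 2, 1, 1, 0, 2, 3, 1, 0])
print(total, fdi_disability_level(total))  # 17 moderate disability
```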


Three major studies have been conducted to examine the psychometric properties and clinical utility of the FDI (Claar & Walker, 2006; Kashikar-​Zuck et al., 2011; Walker & Greene, 1991), and many other studies have employed the FDI as an assessment or treatment outcome measure. The internal consistency of the FDI has ranged from good to excellent for both child and parent versions (Walker & Greene, 1991). Adequate test–​retest reliability of the FDI has been supported (Claar & Walker, 2006; Walker & Greene, 1991). The FDI was developed based on adult measures of functional disability and pilot tested for usability with children and adolescents (Walker & Greene, 1991). Cross-​informant (parent–​child) correlations on the FDI have ranged from moderate to strong (Reid, McGrath, & Lang, 2005; Walker & Greene, 1991). Concurrent and convergent validity of the FDI have been established by its moderate to strong relationships with other measures of child health and well-​being (Claar & Walker, 2006; Kashikar-​ Zuck, Vaught, Goldschneider, Graham, & Miller, 2002; A.  M. Lynch, Kashikar-​Zuck, Goldschneider, & Jones, 2006; Palermo & Kiska, 2005; Walker & Greene, 1991; Walker, Smith, Garber, & Claar, 2005). Discriminant validity of the FDI has been supported through negative correlations with measures that would not be expected to be closely related to functional disability (Claar, Walker, & Smith, 1999) and through its ability to distinguish between groups of youth expected to have different levels of perceived disability (Walker & Greene, 1991; Walker, Guite, Duke, Barnard, & Greene, 1998). There is also evidence that scores on the FDI have incremental validity over other clinical measures in predicting the severity of sleep–​wake problems (Palermo & Kiska, 2005), number of days a child spent in bed due to illness, and school absences (Walker & Greene, 1991). In addition, significant correlations between baseline scores on the FDI and subsequent measures of illness behavior (e.g., school absence, bed days, pain, and depressive symptoms) support the predictive validity of this instrument (Claar & Walker, 2006; Walker & Greene, 1991). The FDI has very good validity generalization. It has been translated into Arabic (Madi & Clinton, 2014) and German (Offenbächer et  al., 2016). The FDI has been administered to children as young as age 6 years and adults up to age 23 years, with the majority of the studies conducted with children between ages 8 and 17 years. There is evidence that girls may report greater disability compared to boys (Claar & Walker, 2006; Walker & Greene, 1991; but see Kashikar-​Zuck et al., 2011). The FDI has been used to assess disability in many different clinical populations, including children with recurrent abdominal


pain (Campo et al., 2004; Claar & Walker, 2006; Walker & Greene, 1991), chronic back pain (Lynch et al., 2006), burns (Barnum, Synder, Rapoff, Mani, & Thompson, 1998), complex regional pain syndrome (Eccleston, Crombez, Scotford, Clinch, & Connell, 2004), juvenile idiopathic musculoskeletal pain (Eccleston et al., 2004), fibromyalgia (Kashikar-​ Zuck et  al., 2002; Reid et  al., 2005), recurrent headache (Palermo & Kiska, 2005), and sickle cell disease (Peterson & Palermo, 2004). The FDI has also been used with adolescents following oral surgery (Gidron, McGrath, & Goodday, 1995), outpatients with minor health complaints (Walker & Greene, 1991), and healthy controls (Walker & Greene, 1991). An advantage of the FDI is that it is easy and relatively quick to administer. Furthermore, the tool has been used with a number of different populations, shows good psychometric properties, and scores associated with levels of disability have been calculated in chronic pain populations (Kashikar-​Zuck et  al., 2011). Use of the measure may allow both children and parents to express the level of disruption that the child’s health problems are creating for the child’s daily functioning. A  disadvantage is that it has primarily been used with Caucasian children, although some research suggests no differences due to ethnicity (Kashikar-​Zuck et al., 2011). Further work needs to establish the use of the measure with children and adolescents of different ethnic backgrounds. In addition, there have been inconsistent relationships between scores on the FDI and socioeconomic status (Claar & Walker, 2006; Peterson & Palermo, 2004; Walker & Greene, 1991). Despite these limitations, there is sufficient psychometric support to recommend the use of the FDI to measure children’s physical and psychosocial functioning in clinical settings. The Pediatric Quality of Life Inventory Generic Core Scales The Pediatric Quality of Life Inventory (PedsQL) is a 23-​ item modular instrument measuring health-​ related quality of life (HRQOL), which can be defined as “an individual’s subjective perception of his or her functioning and emotional state vis-​à-​vis the effects of disease and treatment” (Connelly & Rapoff, 2006, p. 698). The PedsQL 4.0 Generic Core Scales (PedsQL 4.0; Varni, Seid, & Kurtin, 2001)  were designed to measure child physical, mental, and social health dimensions, as well as role (school) functioning. The measure is made up of parallel child self-​report and parent proxy-​report formats. The parent proxy-​reports measure parents’ perceptions of


their children’s HRQOL. Child self-​reports include ages 5 to 7 years (young child), 8 to 12 years (child), and 13 to 18 years (adolescent). Parent proxy-​reports include ages 2 to 4 years (toddler), 5 to 7 years (young child), 8 to 12 years (child), and 13 to 18 years (adolescent). Respondents indicate the extent to which the child is having problems in each of the four areas of functioning using a Likert scale. A 5-​point scale is used for the child self-​report forms (ages 8–​18 years) and the parent proxy-​report forms. To increase ease of use for young children, a simplified 3-​point scale is used. The young child self-​report form is also anchored to a faces scale ranging from happy to sad. Items are reverse-​ scored and converted into a 0-​to-​100 scale, with higher scores indicating better HRQOL. The measure yields scores on Physical Functioning, Emotional Functioning, Social Functioning, and School Functioning, as well as a Physical Health Summary Score, a Psychosocial Health Summary Score, and a Total Score. A large volume of empirical evidence supports the psychometric properties of the PedsQL 4.0. The PedsQL 4.0 was designed to measure the core health dimensions outlined by WHO (1948). Items for the original instrument were created based on a literature search, interviews with children with cancer and their families, and discussions with clinicians (Varni, Seid, & Rode, 1999). This most recent version is the result of a number of iterations that have occurred since the publication of the original instrument (Varni et  al., 1999). Measures of central tendency and distribution are available for total and scale scores in large samples of youth with chronic and acute health conditions, as well as samples of healthy children (Connelly & Rapoff, 2006; Powers, Patton, Hommel, & Hershey, 2004; Tran et al., 2015; Varni, Burwinkle, Limbers, & Szer, 2007; Varni et al., 2001, 2015; Varni, Seid, Knight, Uzark, & Szer, 2002). Internal consistency has been reported to be at least adequate (Connelly & Rapoff, 2006; Varni et  al., 2001, 2007). Two-​week test–​retest reliability of the Total Scale score has also been reported as adequate (Connelly & Rapoff, 2006). There is good evidence for the construct validity of the PedsQL 4.0 (e.g., scores are lower in samples of children with chronic pain conditions and acute health conditions than in samples of healthy children; Connelly & Rapoff, 2006; Powers et al., 2004; Varni et al., 2001, 2007, 2015). Empirical evidence supports the use of the PedsQL 4.0 with a wide age range (2–​18 years). It is the only generic pediatric quality of life measure to span such a wide age range that has undergone rigorous psychometric testing (Eiser & Morse, 2001). The PedsQL 4.0 is appropriate for use

with both healthy children and children with a wide variety of acute and chronic illnesses (e.g., Varni et al., 2001). This measure has been tested with samples of children from different ethnic backgrounds and their parents in both English and Spanish, and ratings provided in both languages have been found to be equivalent (Varni et al., 2001). Translations in a wide range of other languages are also available (Varni, 2016), including Norwegian, Dutch, German, and Chinese versions that have all been validated by independent groups of authors (Bastiaansen, Koot, Bongers, Varni, & Verhulst, 2004; Chan, Chow, & Lo, 2005; Felder-​Puig et  al., 2004; Reinfjell, Diseth, Veenstra, & Vikan, 2006). Because the PedsQL 4.0 is a generic HRQOL measure, it gives researchers and clinicians the ability to conduct comparisons across acute and chronic health conditions, as well as benchmark against healthy population norms (Varni et  al., 2002). However, the generic nature of this instrument may make it necessary to administer supplementary disease-​specific assessments to address the full range of functioning in some children with pain (Eiser & Morse, 2001). PedsQL disease-​specific modules are currently available for arthritis, asthma, brain tumor, rheumatology, diabetes, cancer, cerebral palsy, and cardiac conditions, among others. One of the main advantages of the PedsQL 4.0 is that it includes complementary child and parent proxy-​ report forms. Although patient self-​report is considered the standard for measuring perceived HRQOL, it is parents’ perception of their children’s HRQOL that influences health care utilization (Varni & Setoguchi, 1992). Correlations between child and parent ratings are in the moderate range, suggesting that it is important to obtain both the child’s and the parent’s perspective (e.g., Powers et  al., 2004; Varni et  al., 2007). Another advantage of the PedsQL is that it provides a multidimensional quality of life assessment with a quick administration time (