Soft Skills in Education: Putting the evidence in perspective [1st ed.] 9783030547868, 9783030547875

This book examines the global movement of putting more emphasis on students’ social and emotional development in educati


English Pages XIV, 239 [249] Year 2020


Table of contents :
Front Matter ....Pages i-xiv
Roots of the Movement; Development and Criticism (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 1-20
Conceptual Challenges (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 21-43
Evidence from Psychological Studies (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 45-65
Evidence From Educational Studies (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 67-98
Opening Black Boxes of the Meta-Analyses: What Do the Underlying Studies Look like? (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 99-140
Measurement of Soft Skills in Education (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 141-189
Meta-Analysis of Educational Interventions Addressing Conscientiousness Facets (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 191-211
Recapitalization and Discussion of the Main Findings and Implications for Educational Practice, Theory and Research (Jaap Scheerens, Greetje van der Werf, Hester de Boer)....Pages 213-236
Back Matter ....Pages 237-239


Jaap Scheerens Greetje van der Werf Hester de Boer

Soft Skills in Education Putting the evidence in perspective


Jaap Scheerens University of Twente Enschede, The Netherlands

Greetje van der Werf GION University of Groningen Groningen, The Netherlands

Hester de Boer GION University of Groningen Groningen, The Netherlands

ISBN 978-3-030-54786-8 ISBN 978-3-030-54787-5 (eBook) https://doi.org/10.1007/978-3-030-54787-5

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

The theme of this book is a critical review of the trend to give social and emotional skills, sometimes called ‘soft skills’, a more prominent place in education. Sometimes we will refer to this trend as a movement, a term that might suggest that we consider the trend ideological. The Concise Oxford Dictionary gives a more neutral definition of the term ‘movement’: “a series of actions and endeavors of a body of persons for a specific object”. From an academic perspective, the trend can be treated as a research and development program. This is the overriding perspective that will be taken, although normative issues are also at stake. Our key interest is clarification of ideas and concepts of social-emotional attributes and interrogation of the evidence on their malleability by educational interventions. To the extent that concepts remain vague and visions are not supported (or even supportable) by empirical evidence, we might still encounter ideology. Our orientation towards the social and emotional skills trend is critical in this sense. However, we aim to contribute to the debate in a constructive manner, which will be made clear in the final chapter, where we discuss the implications of our assessment of the evidence for educational planning, research and theory.

Chapter 1 of the book refers to the movement’s major impulses and ‘roots’, provides a first clarification of the subject matter and discusses criticism and major challenges. Next, Chap. 2 addresses the concept of social-emotional attributes from various disciplinary perspectives, followed by a discussion in Chap. 3 of the Big5 personality taxonomy and the malleability of personality characteristics.

The empirical evidence on the effects of education and educational interventions is presented in Chaps. 4 and 5. Chapter 4 discusses the findings of meta-analyses and research reviews, and Chap. 5 describes the results of individual intervention studies, among other things to provide the reader with information on the educational content of the programs and interventions. Chapter 6 documents the way social and emotional outcomes are measured in program evaluations of social and emotional learning programs and discusses the quality and ‘fit for purpose’ of the instruments in question. Chapter 7 presents new evidence, based on a meta-analysis of educational intervention effects on facets of the Big5 personality trait conscientiousness. The final chapter, Chap. 8, summarizes the main results and discusses conclusions and implications for both educational policy and practice, as well as further research.

Jaap Scheerens, Enschede, The Netherlands
Greetje van der Werf, Groningen, The Netherlands
Hester de Boer, Groningen, The Netherlands

Contents

1 Roots of the Movement; Development and Criticism
  1.1 Introduction
  1.2 Roots
    1.2.1 Social-Emotional Skills and Affective Educational Objectives
  1.3 21st Century Skills
    1.3.1 Emotional Intelligence and the Programming of Social-Emotional Learning at School
  1.4 More Recent Developments
    1.4.1 Social-Emotional Skills Within the Framework of EU Competencies
    1.4.2 Developments by the OECD: From Literacy to Cross-Curricular Competencies to Big Five States and Traits
  1.5 Critical Perspectives
    1.5.1 Social and Emotional Skills and Personal Development as Deliberate Goals in Schooling
    1.5.2 Pop Psychology and Teachers as Lay Psychologists
    1.5.3 Social and Emotional Outcomes as Formal Standards to Which Schools Could Be Held Accountable
    1.5.4 Technology, State and Corporate Interests
    1.5.5 International Standardization
    1.5.6 The Perspective of Critical Theory
    1.5.7 Conceptual Vagueness and Uncertainty About the Evidence Base as Fundamental Critical Issues
  1.6 Summary and Discussion
  References

2 Conceptual Challenges
  2.1 Introduction
  2.2 The Non-cognitive Blob
  2.3 Focus on Social and Emotional Attributes
  2.4 Fundamental Conceptual Questions
    2.4.1 Definitions
    2.4.2 A Note on the Competency Concept
  2.5 Further Categorization and Contextualization of Social and Emotional Skills, Based on Psychological Concepts
    2.5.1 Constructs and Meta-Constructs of Personality and Intelligence
    2.5.2 The Big Five Personality Factors
    2.5.3 Recapitulation and Preliminary Conclusions
    2.5.4 Kyllonen et al.’s (2014) Contribution to the Analysis of Personality, Motivation and College Readiness
    2.5.5 Development of the Big Five Traits Across a Lifetime
    2.5.6 Kyllonen’s Interpretation of the Trait-State-Behavior Continuum
  2.6 Towards a Comprehensive Framework of Skills in the Cognitive, Affective and Conative Domains
  2.7 Summary and Conclusion
  References

3 Evidence from Psychological Studies
  3.1 Introduction
  3.2 The Big Five Personality Concept and the Five Main Traits
  3.3 Facets of Traits
  3.4 Stability of Personality Traits
  3.5 The Genetic Basis of Personality Traits
  3.6 Evidence on How to Change Personality
  3.7 Evidence for Personality Trait Change by Means of Therapeutic and Counselling Interventions
  3.8 Stability and Changeability of Facets of Traits
  3.9 Summary and Conclusions
  References

4 Evidence From Educational Studies
  4.1 Introduction
  4.2 Contributions from Economists
    4.2.1 Causal Modeling of Outcomes
    4.2.2 Long Term Effects of the Perry Preschool Program
    4.2.3 Comments
  4.3 Non-cognitive Outcomes in Meta-Analyses of (Quasi-) Experimental Educational Intervention Studies
    4.3.1 The Meta-Analysis by Durlak et al.
    4.3.2 The Meta-Analysis by Sklad et al.
    4.3.3 The Meta-Analysis by Wigelsworth et al.
    4.3.4 The Meta-Analysis by Korpershoek et al.
    4.3.5 The Meta-Analysis by Taylor et al.
  4.4 In-Between Balance: How Convincing Is the Evidence?
  4.5 Non-cognitive Outcomes in Educational Effectiveness Research
    4.5.1 Introduction
    4.5.2 Nonexperimental Educational Effectiveness Studies and Non-cognitive Outcomes
    4.5.3 Opdenakker and Van Damme (2000)
    4.5.4 Van Swynsberghen, VanLaar, De Fraine, and Van Damme (2017)
    4.5.5 Brunner, Keller, Wenger, Fischbach, and Lüdtke (2018)
    4.5.6 Experimental and Quasi-Experimental Studies
    4.5.7 Hwang and Cappella (2018)
    4.5.8 Polikoff, Le, Tien, Danielson, and Marsh (2018)
    4.5.9 Gandhi, Slama, Park, and Williamson (2018)
    4.5.10 O’Connor, Cappella, McCormick, and McClowry (2014)
    4.5.11 Kraft and Dougherty (2013)
    4.5.12 Gottfredson, Brown Cross, and Connell (2010)
    4.5.13 Snyder, Flay, Vuchinich, Acock, Washburn, Beets, and Li (2010)
  4.6 Summary and Conclusions
  Annex: Characteristics of Comprehensive School Reform Programs (Cited from Borman et al., 2003)
  References

5 Opening Black Boxes of the Meta-Analyses: What Do the Underlying Studies Look like?
  5.1 Introduction
  5.2 Review of Results of Program Evaluations of Skill Enhancement Programs by Kautz et al. (2014)
  5.3 Case Descriptions of Program Evaluations of SEL Programs
    5.3.1 Evaluation of the Tools of the Mind Curriculum (Barnett et al., 2008)
    5.3.2 Evaluations of the Promoting Alternative Thinking Skills (PATHS) Program (Greenberg, Kusché, & Riggs, 2004)
    5.3.3 Good Behavior Game (GBG, Witvliet, van Lier, Cuijpers, & Koot, 2009)
    5.3.4 Zippy’s Friends (Holen, Waaktaar, Lervåg, & Ystgaard, 2012)
  5.4 The UK Resilience Program (Challen, Noden, West, & Machin, 2011)
    5.4.1 The Lessons in Character (LIC) Program (Hanson, Dietsch, & Zheng, 2012)
    5.4.2 Comer’s School Development Program Cook et al. (1999)
    5.4.3 Positive Action (PA) Key-Reference: Snyder, Flay, Vuchinich, Washburn and Beets 2010
  5.5 Summary
  5.6 Discussion: How Does Social-Emotional Learning Take Effect; Interrogating Program Theories
  Annex: Published Instruments in Program Case Descriptions
  References

6 Measurement of Soft Skills in Education
  6.1 Introduction
  6.2 Criteria to Judge the Quality of Measures
  6.3 Descriptions of Instruments Rated by the Educational Endowment Foundation (2018)
    6.3.1 The Short Grit Scale (GRIT-S)
    6.3.2 Multidimensional Measure of Children’s Perception of Control (MMCPC)
    6.3.3 The Self-efficacy Teacher Report Scale (SETRS)
    6.3.4 How I Feel Questionnaire
    6.3.5 Emotion-Regulation Rating Questionnaire for Children and Adults (ERQ-CA)
    6.3.6 Rosenberg Self-Esteem Scale (RSES)
    6.3.7 The Child and Youth Resilience Measure (CYRM-12)
    6.3.8 The Children’s Self-Report Social Skills Scale (CS4)
    6.3.9 The Basic Empathy Scale (BES)
    6.3.10 The Expression and Emotion Scale for Children (EESC)
    6.3.11 Brief Summary
  6.4 Description of Instruments that were Used in the Intervention Studies (see Chap. 5)
    6.4.1 The Social Skills Rating System (SSRS), Gresham and Elliott (1990). Tools of the Mind
    6.4.2 The Emotional Awareness Scale for Children (LEAS-C), Bajgar et al. (2005). PATH
    6.4.3 The Empathy Index for Children and Adolescents (IECA) (Bryant’s Empathy Index), de Wied et al. (2007). PATH
    6.4.4 Kidcope-Child Version (Holen et al., 2012). Zippy’s Friends
    6.4.5 The Strengths and Difficulties Questionnaire (SDQ) (Goodman, 1997). Zippy’s Friends
    6.4.6 The Teacher Child Rating Scale (T-CRS) Hightower et al. (1986) Positive Action
  6.5 Summary of all the Instruments
  6.6 Discussion
  Annex
  References

7 Meta-Analysis of Educational Interventions Addressing Conscientiousness Facets
  7.1 Introduction
  7.2 Method
    7.2.1 Literature Search
    7.2.2 Eligibility Criteria
    7.2.3 Coding
    7.2.4 Meta-Analysis
    7.2.5 Combining the Effects
  7.3 Results
    7.3.1 Descriptives
  7.4 Meta-Analysis Results
    7.4.1 Summary Effect
    7.4.2 Moderator Analysis
    7.4.3 Long Term Effects
  7.5 Some Examples of Effective Interventions
    7.5.1 Two Examples of Interventions with a Cognitive Focus
    7.5.2 Two Interventions Without a Cognitive Focus
  7.6 Conclusions
  References

8 Recapitalization and Discussion of the Main Findings and Implications for Educational Practice, Theory and Research
  8.1 Development of the Soft Skills Movement in Education
  8.2 Critical Questions and Technical Challenges
    8.2.1 Appropriateness
    8.2.2 Conceptual Fuzziness
    8.2.3 Tenability of Technical Assumptions Concerning Measurement and Malleability
  8.3 The Categorization of Concepts
    8.3.1 Social and Emotional Attributes at the Core of the Soft Skills Movement
    8.3.2 The Meaning of Social-Emotional Learning at School
    8.3.3 Hierarchical Ordering of Personality Concepts and Implications for Measurement and Malleability
  8.4 The Malleability of the Big Five Personality Traits and Trait Facets; an Excursion into Psychological Research
  8.5 Overview of the Educational Research Evidence on Malleability
  8.6 Opening “Black Boxes”; The Content of Social-Emotional Learning Programs and Further Reflection on Their Evaluations
  8.7 Measurement of Social and Emotional Learning Outcomes
  8.8 A Meta-Analysis of Educational Intervention Studies Addressing Conscientiousness Aspects
  8.9 Implications for Educational Policy and Practice, and Research
    8.9.1 Levels of Ambition and Evidence
    8.9.2 Practice: Special Programs or an Embedded Approach?
    8.9.3 Implications for Research
  8.10 Making Up the Balance: The Political Economy of the Soft Skills Movement
  References

Epilogue

About the Authors

Jaap Scheerens is Professor Emeritus. He worked at the University of Twente as Professor of Educational Organization and Management. After his retirement, he was guest lecturer at the Universities of Bristol and Rome (Roma Tre). He is currently associated with the research institute Oberon in Utrecht (The Netherlands) and serves as a member of the Scientific Committee of INVALSI, Italy’s institute for educational testing. At the University of Twente, he coordinated a research program on educational effectiveness. During his career, he was scientific director of the national research school for postgraduate training in educational science (ICO) and director of the research institute OCTO of the faculty of education at the University of Twente. He was involved in many international research projects, funded by international organizations. He has published 20 books and about 100 articles in scientific journals, mainly addressing educational effectiveness and evaluation. His previous book with Springer is “Educational Effectiveness and Ineffectiveness. A Critical Review of the Knowledge Base”, published in 2016.

Greetje van der Werf is Professor Emeritus of the University of Groningen, the Netherlands. Before her retirement in June 2019, she was Full Professor of Learning and Instruction at the department GION education/research. During her working life, she coordinated several national large-scale, multilevel longitudinal studies in primary and secondary education, as well as the Dutch part of one of the IEA studies on Civics and Citizenship Education. She published a large number of articles on the effects of schooling on students’ achievement and long-term attainment. Her main research expertise is on the influence of students’ psychological attributes, among which students’ personality traits, achievement motivation, social comparison and peer relations, on students’ achievement and success in their school career. Before her retirement, she served 5 years on the Board of the Faculty of Behavioral and Social Sciences, in which she was responsible for the teaching programs of the departments of psychology, sociology and educational sciences. Currently, she is still Editor-in-Chief of the international journal Educational Research and Evaluation.


Hester de Boer is a post-doc researcher at GION education/research of the University of Groningen, The Netherlands. Her methodological expertise is in meta-analysis. She has a wide interest in educational topics and has published and co-authored several meta-analytical studies in international peer-reviewed journals. The subjects of these meta-analyses include teacher expectation interventions, metacognitive learning strategy instructions, differentiation practices, classroom management strategies and programs, and school belonging. Furthermore, she has published a meta-analysis of the influence of the attributes of the implementation and measurement of educational interventions on the estimated effects. Besides these meta-analyses, she has published and co-authored several articles on teacher expectations, parents’ aspirations and secondary school track recommendations.

Chapter 1

Roots of the Movement; Development and Criticism

1.1 Introduction

In this chapter we trace the roots of the current trend to give social and emotional skills a more prominent role in education. In educational science, broad skills and affective components have a longer history than the call for developing 21st century skills. We start by explaining some of the principles of didactic analysis and the structure of taxonomies of educational objectives, as they add to the conceptual clarification that is an important theme in this book. Next, we discuss developments in the United States with respect to 21st century skills in relation to ‘emotional intelligence’ and ‘social-emotional learning’, and go on to discuss more recent developments by the OECD and the European Union. In the final section of the chapter we discuss criticism from various perspectives and disciplinary backgrounds, and end the chapter with a broad sketch of the current state of affairs.

The terminology and definition of social and emotional skills is a recurring theme throughout this book. It is the central theme in the next chapter, but will be readdressed when assessing the malleability, effectiveness and measurability of these skills in later chapters. In this chapter we will use the term ‘social-emotional skills’ as the most common label, recognizing that it sometimes refers to a relatively narrow set of attributes, but in other cases is applied to the whole, broader set of ‘non-cognitive’ attributes. Varying interpretations associated with roots and recent developments will be dealt with when they occur.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_1


1.2 Roots

1.2.1 Social-Emotional Skills and Affective Educational Objectives

Goals are crucial for every kind of attempt to influence or steer complex social activities, and education is a case in point. According to steering or control theory: “Control is defined as any kind of focused influence by a controlling agency on a controlled system” (de Leeuw, 1974, p. 108). Application of the goal concept to education stands at the cradle of the development of educational science. Thinking in terms of goals and objectives is essential for major fields of research and development, such as curriculum design and achievement test development. According to de Groot (1986, p. 64) “Educational objectives are targeted learning effects attained by pupils at the end of a program”. The logic of didactic analysis, i.e. distinguishing subject matter content elements and the psychological operations with which pupils deal with those content elements, also serves as the basis to distinguish and develop educational goals.

An interesting conceptual contribution is the distinction between ‘material’ or ‘intrinsic’ and ‘formal’ or ‘transcendent’ educational goals (de Corte, Geerligs, Lagerweij, Peters, & Vandenberghe, 1976, p. 22). Material objectives are determined by very detailed specification of educational content, while psychological operations remain limited to mastery of subject matter knowledge and skills. Formal objectives, on the other hand, like the development of thinking skills and attitudes, are less tied to very specific subject matter content, but are expected to apply more generally across subject matter content domains. According to de Corte et al., formal objectives refer to the development of different facets of personality, like perception, thinking, creative expression, working attitudes, and appreciation of values. Formal objectives tend to be more general and less specified than material objectives. However, content and psychological operations are the key elements that are present in both types of objectives. De Corte et al. emphasize that it is more fruitful to apply the distinction as a continuum, and not a dichotomy. “An educational goal becomes more formal or transcendent whenever the content element is broadened” and when its content dimension is further specified, a formal objective evolves towards a material one (Ibid., p. 23).

The logic of didactic analysis has direct relevance for taxonomies of educational objectives as developed by Bloom, Hastings, and Madaus (1971). These taxonomies have a dichotomous structure of content and psychological operations as well. The cognitive taxonomy has both material and formal objectives, whereas in the affective domain objectives tend to be formal. In the formal dimension of both taxonomies increasingly ambitious skills are targeted. In the cognitive domain they evolve to higher order skills like problem solving and, in more recent formulations (Krathwohl, 2002), to meta-cognition. In the affective domain the progression is towards the development of more encompassing value systems. The literacy concept as used within the framework of the OECD PISA study refers to outcomes that are broadly applicable across situations and thus, in the older terminology of didactic analysis, move in the direction of more formal objectives.

This brief excursion underlines that attention to the affective domain has a longer tradition than the current social-emotional skills movement. When Bloom et al. introduced their treatment of the affective domain, they started by saying that education is permeated with affective components like interests, desirable attitudes, appreciation, values and commitment. For educationalists it might be some comfort that we do not owe current developments in the domain of social-emotional learning solely to pedagogues and economists. On the other hand, educationalists have tended to treat the issue with a lot of restraint and reservation, because of fear of indoctrination, breach of privacy of parents and students, and difficulties in objectively assessing non-cognitive development (Bloom et al., 1971).

1.3 21st Century Skills

The call for developing 21st century skills can be traced back to the influential report ‘A nation at risk’ in the United States, which came out in 1983. The report recommended new basics, among them computer literacy, other curriculum matters like foreign languages, and skills and abilities, like enthusiasm for learning and deep learning. New emphases were modernized content and skills. A second important impetus came from private industry and labor market economists. Top skills demanded by U.S. Fortune 500 companies by the year 2000 had shifted from traditional reading, writing and arithmetic to teamwork, problem solving, and interpersonal skills (Wikipedia). Employers emphasized that applied skills like collaboration, teamwork and critical thinking were very important to success at work.1 Labor economists at MIT and Harvard Business School claimed that employers’ demands for people with competencies like complex thinking and communications skills had increased greatly (Levy & Murnane, 2013). Next to 21st century content the skills component was expanded. In 2002 the Partnership for 21st Century Skills referred to the 7C’s: Critical thinking and problem solving; Creativity and innovation; Cross-cultural understanding; Communications, information, and media literacy; Computing and ICT literacy and Career and learning self-reliance. By 2015 the interest in 21st century skills had long become international, which was, among others, evident from a contribution by the World Economic Forum, in which 16 crucial proficiencies for education in the 21st century were stated (Table 1.1).

1 Are They Ready to Work? Employers’ Perspectives on the Basic Knowledge and Applied Skills of New Entrants to the 21st Century U.S. Workforce (PDF). Washington, D.C.: Partnership for 21st Century Skills. 2006.

Table 1.1 World Economic Forum proficiencies for education in the 21st century

Foundation literacies
• Literacy and numeracy
• Scientific literacy
• ICT literacy
• Financial literacy
• Cultural literacy
• Civic literacy

Competencies
• Critical thinking/problem solving
• Communication
• Collaboration

Character qualities
• Creativity
• Initiative
• Persistence/grit
• Adaptability
• Curiosity
• Leadership
• Social and cultural awareness

Source Wikipedia

1.3.1 Emotional Intelligence and the Programming of Social-Emotional Learning at School

Gardner’s (1983) multiple types of intelligence can be seen as one of the forerunners of emotional intelligence, particularly the distinction between intrapersonal and interpersonal abilities. Interpersonal intelligence is about understanding the feelings and intentions of others and intrapersonal intelligence refers to the awareness and discrimination of one’s own feelings (Zeidner, Matthews, & Roberts, 2009). Goleman’s work (1998) was the most significant breakthrough of the Emotional Intelligence (EI) movement. His central thesis is that emotional illiteracy is responsible for many social evils, including mental illness, crime, and educational failure (Ibid., 2009). Goleman includes qualities such as optimism, self-control and moral character as part of intelligence, whereas these are normally seen as personality characteristics, and he was criticized for listing practically every positive quality that was not cognitive under the heading of Emotional Intelligence (Matthews, Zeidner, & Roberts, 2002). While intelligence is seen as innate and stable over a lifetime, Goleman sees Emotional Intelligence as something that can be learned at any time in a person’s life. Conventional intelligence is neutral, while emotional intelligence has a strongly moral outlook, and, according to Zeidner et al. (2009), Goleman associates emotional literacy with education for character, moral development and citizenship (Ibid., p. 10). In these qualifications of Emotional Intelligence fundamental distinctions in psychology are at stake: intelligence versus personality, innate stable characteristics versus learnable (and teachable) characteristics, impartial cognitive aptitude versus morality. In later work Goleman (2001) sought to put the traits that focally define EI on a more systematic basis. This resulted in a basic scheme, represented by Zeidner et al. (2009, p. 11), which is cited in Table 1.2. Goleman’s model distinguishes two dimensions. The first separates personal competencies, such as self-awareness, from those that relate to social competencies, like empathy. The second dimension distinguishes aspects of EI that relate to awareness from those that concern management and regulation of emotion.

Table 1.2 Goleman’s model of emotional competencies, with examples of four types of competencies

Recognition, Self (personal competences): Self-awareness
– Emotional self-awareness
– Accurate self-assessment
– Self-confidence

Recognition, Other (social competences): Social awareness
– Empathy
– Service orientation
– Organizational awareness

Regulation, Self (personal competences): Self-management
– Self-control
– Trustworthiness
– Conscientiousness

Regulation, Other (social competences): Relationship management
– Communication
– Conflict management
– Teamwork and collaboration

Zeidner et al. (2009) describe the rise of the Emotional Intelligence movement as boosted by an “emotion-friendly Zeitgeist” and dislike of the detached, deterministic and socially stratifying implications of intellectualism, genetic determination of intelligence, and ‘technocracy’. In this context Goleman’s position that emotional intelligence differs from IQ in being malleable and trainable, and moreover ‘democratic’, fell on fertile ground (Ibid., p. 15). Emotional intelligence formed the theoretical background for a movement to stimulate social-emotional learning at school in the USA. In some orientations this was done for intrinsic reasons, as in the case of the ‘positive psychology’ movement, which explores the sources of happiness, satisfaction, optimism and well-being, propagating character education and positive thinking in special programs. Other influential work (Elias et al., 1997) saw social-emotional learning also as instrumental to academic outcomes (Zins, Weissberg, Wang, & Walberg, 2004). These authors present a framework of “person-centered key SEL competencies” (see Table 1.3) that is an elaboration of Goleman’s model presented in Table 1.2. Zins et al. (2004) position the movement to pay more attention to social and emotional learning in line with the accountability requirements that came with the No Child Left Behind Act of 2002. SEL is described as an enabling component to help students in attaining academic success. The authors make a plea for programs as “holistic, coordinated approaches that effectively address academic performance mediators, such as motivation, self-management, goal setting … etc.” (Ibid., p. 6). They make an important distinction with respect to two major types of SEL interventions. On the one hand they mention specific SEL curricula, and on the other hand “the infusion of social and emotional skills into the regular academic curricula”. 
The ‘Collaborative for Academic, Social and Emotional Learning’ (CASEL) became an important body to stimulate social-emotional learning. SEL oriented activities that were carried out from this perspective were frequently enacted in school-based prevention programs, in which there was a strong emphasis on reducing maladaptive behavior.

Table 1.3 Person-centered key SEL competencies, according to Zins et al. (2004)

Self-awareness
• Identifying emotions
• Accurate self-perception
• Recognizing strengths, needs and values
• Self-efficacy
• Spirituality

Social awareness
• Perspective taking
• Empathy
• Appreciating diversity
• Respect of others

Responsible decision-making
• Problem identification and situation analysis
• Problem solving
• Evaluation and reflection
• Personal, moral, and ethical responsibility

Self-management
• Impulse control and self-management
• Self-motivation and discipline
• Goal setting and organizational skills

Relationship management
• Communication, social engagement, and building relationships
• Working cooperatively
• Negotiation, refusal and conflict management
• Help seeking and providing

The multitude of skills that are mentioned in the literature on American programs for socio-emotional learning together with the diversity of intervention-types is likely to create highly diversified practice. We will encounter this diversity when discussing the evidence on the effectiveness of SEL programs, further on.

1.4 More Recent Developments

1.4.1 Social-Emotional Skills Within the Framework of EU Competencies

In 2006, the European Parliament and the Council adopted a Recommendation on Key Competences for Lifelong Learning. The motivational background is based on the notion that “European societies and economies have experienced significant changes, digital and technological innovations as well as labor market and demographic changes” and that “Society and economy rely heavily on highly competent people while competence requirements are changing: in addition to good basic skills (literacy, numeracy and basic digital skills) and civic competences; skills such as creativity, critical thinking, initiative taking and problem solving play an increasing role in coping with complexity and change in today’s society” (EU, 2018, p. 3). The European Reference Framework of Key Competences for Lifelong Learning defined the following eight key competences: Literacy competence, Multilingual competence, Mathematical competence and competence in science, technology and engineering, Digital competence, Personal, social and learning to learn competence, Citizenship competence, Entrepreneurship competence, and Cultural awareness and expression competence. For our purposes we focus on the key competence personal, social and learning to learn competence. This is described as follows. “Personal, social and learning to learn competence is the ability to reflect upon oneself, effectively manage time and information, work with others in a constructive way, remain resilient and manage one’s own learning and career. It includes the ability to cope with uncertainty and complexity, learn to learn, support one’s physical and emotional well-being, to maintain physical and mental health, and to be able to lead a health-conscious, future-oriented life, empathize and manage conflict in an inclusive and supportive context” (EU, 2018, p. 2). The EU key competencies are conceptualized as a combination of knowledge, skills and attitudes and the definition of each key competency states the knowledge, skills and attitudes relevant for it. Next to key competencies the EU documents refer to transversal skills. Transversal skills are relatively independent of content areas, and strongly overlap with the personal, social and learning to learn competence. Examples of transversal skills are creativity, critical thinking, problem solving, decision-making, learning to learn, cooperation and communication. Among themselves the eight key competencies are far from mutually exclusive as, for example, citizenship competencies and entrepreneurial competences have several personal and social skill facets attached to them. Next, it may be observed that the clarifications of the personal, social and learning to learn competence in the EU documentation are manifold. In an advisory report to the Commission, Cefai, Bartolo, Cavioni, and Downes (2018) refer to an ordering framework from the Collaborative for Academic, Social and Emotional Learning (CASEL). 
This framework corresponds to the one presented when we discussed the US social and emotional learning and emotional intelligence literature, which we referred to earlier. It identifies four interrelated sets of social and emotional competencies that can be taught in schools and other contexts, namely (1) self-awareness and (2) self-management (both intrapersonal), and (3) social awareness and (4) relationship skills (both interpersonal), next to responsible decision making (cognitive)” (Ibid., p. 131). Cefai et al. use this framework as a source of inspiration to classify 40 attributes under the headings of self-awareness (6 attributes); self-management (8 attributes), social awareness (6 attributes) and social management (10 attributes). The authors express the hope that “This competences framework may serve as a basis for the inclusion of SEE (social emotional education) as one of the EU’s key competences and a core content area of MS’ curricula” (MS stands for Member State).

1.4.2 Developments by the OECD: From Literacy to Cross-Curricular Competencies to Big Five States and Traits

The literacy concept that is applied in the OECD PISA study expresses the ambition of fostering cognitive skills that are not completely tied to national curriculum frameworks and are applicable in a broader domain of real-life situations. In the decade before the first wave of PISA took place, initiatives were developed to move further into the direction of generalizing skills over content areas. This resulted in two thematic reports based on the first edition of PISA: ‘Problem solving for tomorrow’s world: first measures of cross-curricular competencies from PISA 2000’ and ‘Learners for life. Student approaches to learning’ (OECD, 2004; Artelt, Baumert, McElvany, & Peschar, 2003). These reports emphasized higher order cognitive skills (problem solving) and meta-cognition (learning to use learning strategies). A second initiative by the OECD that took place, more or less at the same time, was the project on Defining and selecting key competencies (DESECO). The project yielded two edited books that mostly explored conceptual issues (Rychen & Salganik, 2001, 2003). The subject was a broadened set of indicators to cover three groups of learning outcomes: academic achievement and cognitive skills, personal development (e.g. democratic values and work-related skills) and attitudes (e.g. higher order thinking skills, communication and interpersonal skills). The two volumes illustrate input from various backgrounds: labor market economics, emotional intelligence and continental European pedagogy. The first volume settled on the recognition of three broad categories of content for key competencies: (1) to act autonomously and reflectively, (2) to use tools interactively and (3) to join and function in socially heterogeneous groups (Rychen & Salganik, 2001, p. 12). The second volume further elaborated these broad categories and mentioned more specific skills. 
The two volumes intensively struggled with the competency construct and came up with some clear definitions and remaining questions, to which we shall return in a subsequent section. In 2015 the OECD published the study “Skills for social progress: The power of social and emotional skills”. In this study empirical evidence on the effects of social-emotional learning programs, an international survey on the influence of social and emotional skills on life outcomes, and contributions by the economist James Heckman and colleagues were all incorporated to make a strong plea for the importance of social and emotional skills. A very interesting turn that was taken in this study is the connection that was made to personality psychology, more specifically the Big Five personality factors (Openness, Conscientiousness, Emotional stability, Agreeableness and Extraversion). In subsequent studies these ideas were further analyzed and compared to available evidence (Chernyshenko, Kankaraš, & Drasgow, 2018; Kankaraš & Suarez-Alvarez, 2019; Kautz, Heckman, Diris, Ter Weel, & Borghans, 2014). Facets of Big Five personality traits were chosen as a basis for measuring the direct outcomes of social-emotional learning. The important advantage of connecting to this body of knowledge is the available theoretical basis and the existence of validated measuring instruments. Currently, a new international study (OECD’s Study on Social and Emotional Skills), in which this new orientation will be applied, is about to begin its data collection phase. At the same time this new development touches on new problematic areas and evokes criticism as well. As we will see when discussing the evidence base on social-emotional learning, there may be few studies that have used Big Five instruments as dependent variables in effect studies assessing social-emotional learning. Secondly, it is necessary to come to terms with the question of the stability versus the malleability of personality constructs. Moreover, the marriage between psychology, psychometrics and a managerial perspective based on economic concepts of human and social capital has evoked criticism that will be referred to in a subsequent section of this chapter.

1.5 Critical Perspectives

The ‘soft skills’ movement is contested. Although broad consensus exists that social and emotional development is part and parcel of schooling, the rapid rise and high degree of deliberateness of current developments have raised concerns. The following quote from Greene (2018), cited by Effrem and Robbins (2019, p. 31), illustrates this point:

At its best… SEL (social emotional learning) is essential. It is important. It has always been with us under flowery descriptors like learning how to be fully human in the world or becoming your best self or more mundane labels like learning to get along with others or even just growing up. Teachers, because they are the non-parental adults who spend the most time with children, have always been instrumental in this process. And it has always been bad for the society and the culture as a whole when some folks fail to grow up into healthy, functioning human beings… And education reform, under the guidance of technocrats and data worshippers, has pushed us steadily away from the social and emotional dimensions that are a critical part of the growth and development of every young human… At its worst, we are talking about crafting human beings to order and harvesting both them and their data in the service of those with power. We are talking about pushing them to be the people that someone else thinks they should be. This is not just bad policy, inappropriate pedagogy, or culturally toxic—this is evil.

Behind this categorical comment are lines of critical argumentation that will be sketched more systematically.

1.5.1 Social and Emotional Skills and Personal Development as Deliberate Goals in Schooling

The traditional core function of schooling is the teaching of content and the furthering of cognitive development. Still, religious education and the teaching of moral values have been present as well, to varying degrees, depending on national traditions and the existence of so-called ‘denominational’ schools, with, for example, a Roman Catholic or Protestant signature. Also, philosophical schools of thought, like humanistic psychology and reform pedagogics in European countries, have emphasized personal development. Visser (2016) distinguishes the following conceptions: First, religious and moral education as the oldest form in which personal development, over and above cognitive qualification, has had a place in schooling. Second, the idea of Bildung (German term for formation and development) in its classicistic humanist form has implications for socialization and personal development as well. It is non-religious and more associated with a civic-liberal, and therefore more individualistic, inclination. Bildung implies guided personal formation, but also self-actualization and active involvement with societal development. Third, the romantic idea of personal formation as individual development and growth, in the tradition of Rousseau and reform pedagogy, in which authentic development of the individual child is the central issue. A fourth variation would see an education that is aimed at values of virtue, freedom and authenticity in combination with traditional qualification as the best approach for personal elevation. Critical arguments against the expansion of socio-emotional and personal development at school, by means of special programs and institutionalization, are related to: the view that personal formation is a task for the family, and questions of privacy, when the psychological functioning of children becomes, so to say, part of the curriculum; the fear that social and emotional learning goes to the detriment of cognitive development in the sense of diminished time and opportunity for traditional school subjects; and ethical objections to having the government, through the public schools, delve into this realm at all (Effrem & Robbins, 2019). However, it should be noted that these lines of criticism do not go against the existence of social and emotional development as part of schooling but are targeted at the degree of “deliberateness”, “intrusiveness” and institutionalization as a major priority in formal education.

1.5.2 Pop Psychology and Teachers as Lay Psychologists

A basic critical issue with the propagation of personal development and social-emotional skills in schooling is the overriding fuzziness of what it means and what functions it is supposed to have. Although the prime orientation seems to be to approach the non-cognitive in terms of intended outcomes, other functions are considered as well. First, assessment of social and emotional attributes could have the function of diagnosis, in order to detect problematic behavior, or as a basis to adapt treatments and interventions to the specific needs of students. Next, in many interpretations social-emotional development is seen as instrumental to other intended outcomes, such as enhanced academic performance and longer-term outcomes of schooling. Particularly as the scope of social-emotional skills is being expanded to comprehensive sets of psychological traits like the Big Five taxonomy, the question arises whether teachers are equipped to deal with this new ‘subject matter’. Are they competent to make diagnostic use of psychological tests, interpret social and emotional outcomes assessments, and most importantly develop and implement interventions to assist students in attaining the desired social and emotional outcomes? Effrem and Robbins (2019) are particularly critical of ‘amateur assessment’ of social-emotional skills: “SEL doesn’t assume the presence of licensed counselors or other trained clinicians for its implementation. Rather, as illustrated by a CASEL report on recommended SEL programs, standard procedure is to offer some sort of training to teachers and perhaps designated administrators and have them teach the material and evaluate the results (… this means to assess whether students’ personality or character traits are developing as desired). Because the data from these assessments may be included in the statewide longitudinal data system, to endure forever and perhaps to shape the child’s future path, there is much justifiable concern about the source and subjectivity of SEL standards and the qualifications of the implementing personnel” (p. 26).

1.5.3 Social and Emotional Outcomes as Formal Standards to Which Schools Could Be Held Accountable

In their review of the development of social-emotional learning in the United States, Effrem and Robbins (2019) present examples of high stakes contexts in which such outcomes are included as formal standards. Among others they mention the evaluation of the Head Start Program, the Common Core Standards and the National Assessment of Educational Progress (NAEP). Their comment is that “assessing social-emotional characteristics in NAEP violates NAEP’s governing statute itself, which forbids tests that ‘evaluate or assess personal or family beliefs’” (Ibid., p. 14). Assessing schools on the attainment of social and emotional outcomes could be seen as a logical consequence of giving these outcomes the status of official goals. This would be a strong stimulus for schools to take interventions in this domain seriously, and would take ‘teaching to the test’ beyond alignment with academic outcomes. But the actual implementation is questionable, given the ethical considerations mentioned earlier, the fuzziness of the domain and questions about measurability and malleability on the basis of educational intervention.

1.5.4 Technology, State and Corporate Interests

Measures of social and emotional outcomes, be it as part of formal assessments, large scale program evaluations or school-based forms of assessment, are likely to end up in databases, which may not be protected and may be available to public sector organizations and even to private parties. Effrem and Robbins (2019) describe these phenomena in the United States and note that the US now has legislation, the Foundations for Evidence-Based Policymaking Act (FEPA), which allows using data from multiple federal agencies to analyze the effectiveness of federal programs. Despite privacy concerns the FEPA passed, to allow widespread disclosure of citizen data among various federal agencies. The authors comment that “Under this statute, any data submitted by citizens to any agency for a particular purpose can be re-disclosed to other agencies for other purposes not consented to by the citizen. Sensitive SEL data held in federal education or research databases can now be traded among agencies and researchers, unbeknownst and unconsented to by students or their parents” (Ibid., p. 27). They go on to mention the interest of SEL promoting organizations like CASEL to use a broad spectrum of data on social and emotional development, not limited to stand-alone programs or lessons but including all aspects of the school setting (recess, lunch-room, hallways, extracurricular activities) in the context of justifying program funding. Such a broad spectrum of data may also become available as a result of ‘transcripts’, a new kind of digital portfolio, which, in the US, are being considered as a basis for college admission. Springer (2019) says about these transcripts that “Some may stick to traditional subjects like STEM or history, while others are a bit more inventive: ‘social cultural and historical fluency’. This is also a place where schools can show how their students have learned other skills like critical analysis, social-emotional learning, problem-solving or decision-making”. Other technological applications concerning SEL arise in the context of so-called personalized learning. These may include “embedded assessment which depend on the registration of every key-stroke”. Effrem and Robbins (2019) describe these as “New technologies using educational data mining and ‘affective computing’ (the study and development of systems and devices that can recognize, interpret, process, and simulate aspects of human affect) that are beginning to focus on ‘microlevel’ moment-by-moment data within digital and blended-learning environments to provide feedback to adapt learning tasks to personalized needs” (p. 26). Digital transcripts and game-based assessments of social-emotional learning create records that are to be considered as highly sensitive, because of privacy concerns, commercial exploitation and even risks of hacking. The rather damning comment that Effrem and Robbins add is as follows: “These problems are necessarily present even with respect to accurate SEL data. But the ethical implications are especially troubling when the evaluations are incorrect or misleading” (p. 31).

1.5.5 International Standardization

In a previous section we described the new OECD international study on social-emotional skills as a very significant current development. The new study uses the Big Five personality taxonomy as an important part of its assessment framework (Kankaraš & Suarez-Alvarez, 2019). In addition, contextual information about the home background of students, parental involvement and teaching strategies and school context are included in the study. The direct assessment will be delivered online using a centralized software platform for assessment of children’s SE skills. Notably, the OECD claims it will use log file data obtained during the test as additional indicators of SE skills (Williamson, 2018). Williamson describes this practice as “a form of stealth assessment whereby students are being assessed on criteria they know nothing about, and which rely on micro-analytics of their gestures across interfaces and keyboards”. The terminology that the OECD has chosen, social and emotional skills, reflects the assumption that the personality traits and trait facets adapted from Big Five taxonomies are malleable performances. Treating social and emotional skills as desirable outcomes implies setting normative standards on specific personality profiles. Critics are suspicious of the explicit connection with life and labor market outcomes, as the driving rationale for applying these normative standards (Cefai et al., 2018; Effrem & Robbins, 2019; Williamson, 2019). Williamson (2018) notes that the choice of the Big Five taxonomy marks a therapeutic shift in OECD focus, “with its target being the development of emotionally stable individuals who can cope with intellectual challenge and real-world problems”. Another critical observation by Williamson is that the new OECD study on social-emotional skills “exemplifies how policy-relevant knowledge is produced by networks of influential international organizations, connected discursively and organizationally to think tanks, government departments and outsourced contractors”. He finishes by expressing his concern that “over time the OECD may generate international comparisons, accountability metrics and league tables of education systems based on intimate assessments of students’ personalities”.

1.5.6 The Perspective of Critical Theory

With the term critical theory we refer to an ‘amalgam’ of contributions from pedagogics, sociology and anthropology, united by a critical stance towards educational technology and ‘psycho tech’, which are seen as decontextualizing child development; a focus on class- and identity-based aspects of educational inequality; and a preference for non-positivist qualitative methods and for neo-Marxist interpretation frameworks. The perspective of critical theory provides additional angles to interpret and value the soft skills movement, complementary to the comments described earlier in this section. We provide illustrative references to contributions from pedagogics, sociology and etymology. Kirchgasler (2018, p. 693) discusses Grit (eagerness to achieve) as an exemplary phenomenon in the soft skills movement and characterizes ‘pedagogies of grit’ as generating “classificatory regimes that divide people by the display of particular attitudes and behaviors. As grit travels globally, it decontextualizes social and economic inequalities and explains them as owing to the intrinsic qualities of people”. The soft skills movement itself is linked to the burgeoning field of positive psychology, which proposes the study of ‘positive subjective experiences’, ‘individual traits’, and “the institutions that enable them in order to unlock the hidden human potentiality in all” (Peterson & Seligman, 2004, p. 5; Ibid., 694).

The alternative perspective offered by the author is a ‘local’, i.e. US-based, historical analysis in which grit is linked to “a venerated narrative of the United States’ development as owing to its pioneers’ unique character—their self-assurance, resilience, and most of all, hard work”. Grit is unmasked as a conservative Protestant narrative, an expression of the ideology of positive psychology and evidence of neoliberalism and the privatization of schooling. Kirchgasler’s critical analysis, labeled as “putting history back into grit”, highlights dividing practices in school reforms that make up “gritty and non-gritty” people. The author sees a paradox in teaching grit as a tool to reduce social inequalities, because this new classificatory regime is just another form of “dividing people by their decorum”, which could inadvertently function as an explanation for prevailing social and economic inequalities (ibid., 715). Francis, Mills, and Lupton (2017) take the perspective of ‘social justice’ to address a series of challenges and dilemmas for progressive education. The authors say they are “committed to an essentially humanist—albeit social constructionist—view of public education systems as (a) potentially emancipatory, and (b) under threat in many parts of the world” (Ibid., 415). The main dilemma these authors address appears to be the sometimes difficult alignment between furthering the position of disadvantaged learners on the one hand and facets of progressive education on the other. The paper is organized on the basis of ‘binaries’ that reflect traditional and progressive education on the one hand and approaches to further social justice on the other. Obviously, skills are on the progressive side, and knowledge is on the side of traditional education. Similarly, soft is on the progressive side and hard is on the traditional side.
As far as formal arrangements of educational systems are concerned, they place ‘professional autonomy’ on the progressive side and ‘accountability’ on the traditional side, and local democracy on the progressive side and universal principles on the traditional side. The major dilemma in these contemplations is the risky option to choose progressive arrangements, like soft skills, for low SES students, because they might give rise to new forms of stereotyping and, by softening universal standards, actually diminish opportunities for disadvantaged learners. In a contribution by Holborow (2018) about language skills as human capital, an analysis of classical human capital theory in connection to recent economic crises is used to criticize the assumption that skill development guarantees individual benefits on the labor market. “The recession and restructuring of the labor market, according to labor market experts Brown, Cheung, and Lauder (2015), involves ‘a global auction for jobs’, in which skills bear little relation to earnings. Most graduates are overskilled for the jobs they do and university degrees becoming increasingly devalued. Their conclusion is that human capital theory, which has for decades dominated thinking about the relationship between education and the labor market or orthodox economist policy makers and parents, has reached a crisis of legitimacy” (Ibid., 527, Brown et al., 2015). The author argues that human capital theory functions ideologically as a strategy of displacement to shift responsibility for employment outcomes from the social to the individual and states that “human capital theory, with its focus on the individual, leaves out of its calculations wider structural social inequalities—of gender, race and class—a presence which disproves any simple correlation of
possession of skills with earnings” (Ibid., p. 525). Although soft skills are not specifically addressed, undercutting the assumptions of human capital theory in this way criticizes one of the core motives of the 21st century skills movement, namely that the development of soft skills in education should improve labor market and life outcomes. The author’s recommendation to revisit and update socially embedded models of language could be generalized to soft skills as a way to avoid the counterproductive effects for educational equality noted in Kirchgasler’s analysis of ‘grit’. Urciuoli (2008) defines neo-liberalism as an ideology in which all possible forms of “sociality and being are treated as market exchanges”. In her article she discusses a new meaning of skills, in which socio-cultural practices originally belonging to ‘the self’ are now seen as obeying the laws of the market. The new meaning of skills also diverges from the original meaning, which referred to specific manual or machine operations, and now denotes any practice, form of knowledge, or way of being constituting productive labor. According to the author, most of these modern skills are “soft skills, aspects of self and social interaction (chief among these, communication, teamwork, and leadership) conceptualized as aspects of tasks, transferable techniques, and productive contributions” (Ibid., p. 212). Other qualities of these new skills are that they tend to be ‘vacuous’, disconnected from specific tasks, and denotationally vague, “which in turn is central to their strategic use, linked as they are to their users’ alignment with corporate values” (p. 213). Other facets of the reshaping of aspects of self in the new meaning of skills are their being measurable, malleable and marketable. “The credentialed experts who inculcate skills into workers, managers, and executives get often hefty fees for skills workshops lasting a few days or, sometimes, hours” (p. 213).
The overriding perspective is that the new soft skills re-shape the self into submissiveness to corporate norms, “to inculcate patterns into students with which they will think and act appropriately as workers” (p. 218). Thereby, “the value of soft skills over hard skills lies in the value of a self-monitoring workforce, especially when the need for specific forms of knowledge or practice may be facilitated or displaced by other forms of production” (p. 216). The author describes ‘communication’, ‘team’, and ‘leadership’ as having become the defining soft skills, “heavily commodified as surefire productive techniques”. Her description of leadership skills is particularly telling for the strategic fuzziness of the “über soft skills”: “Unlike communication skills, most of the leadership skills in these lists cannot be described as techniques or discrete practices; they are a disparate mix of practices, techniques, cognitive operations, and attitudes; they are also exhortative, as are team skills (see below). Given this referential indefiniteness, the SDS (strategic, JS) function of leadership skills is especially critical in establishing the use-value of these training products” (p. 222). Urciuoli’s analysis of the soft skills movement strongly underlines manipulative practices, which is in strong contrast to the view of developmental pedagogues, adherents of positive psychology and progressive educators, who see soft skills promotion rather as emancipatory and self-celebrating. Like Williamson, she points to the economic benefits for providers of soft skills training and ‘tool development’.

1.5.7 Conceptual Vagueness and Uncertainty About the Evidence Base as Fundamental Critical Issues

Amidst these far-reaching societal concerns about the soft skills movement, there is one central issue that stands out as fundamental, namely the uncertainty about the core concepts, theoretical rationale and the empirical evidence on which this whole enterprise is built. Effrem and Robbins (2019) note that the scientific research support for social-emotional learning (SEL) is much less persuasive than advertised and mention “the numerous problems in assessing SEL—problems that are acknowledged even by the experts and most dedicated proponents of the movement. It turns out there’s no reliable, objective way to measure a student’s personality, values, and mindsets. These experts cannot even agree on a uniform definition of SEL” (Ibid., p. 6). Williamson (2019) cites Bull and Allen (2018), who speak of ‘considerable conceptual messiness’ across various sites and practices of policy, work, popular culture, schooling, and so on, whilst adding that the various interest groups all face similar difficulties in producing a ‘scientific’ evidence base. He further refers to Osher et al. (2016, p. 663), who conclude that significant gaps in statistical measurement of SEL “limit investigators’ and policymakers’ ability to fully utilize the research findings” (Ibid., p. 3). Next to conceptual vagueness and measurement problems, evidence on the malleability of social and emotional skills and Big Five attributes is a major issue. In some texts (e.g. OECD, 2015) malleability is interpreted as change of personality characteristics across time, more specifically over people’s lifetime. However, the issue that is at stake is proof of the malleability of social and emotional skills and psychological attributes by means of educational interventions.
On this issue relevant empirical studies and meta-analyses are available, but the evidence is not uncontested, and a major theme for this book is to re-assess the evidence.

1.6 Summary and Discussion

In this chapter we have traced the multi-disciplinary intellectual background of the movement to further social-emotional skills in education. Still, taking the perspective of educational practice, students’ social and emotional functioning has always played an important role. Therefore, we should perhaps see the traditional attention to diligence, good behavior and moral standards in schooling as the real roots of the current movement. Greater emphasis on these attributes has sometimes been forced by problems of schooling in disadvantaged areas, as many programs for social and emotional learning were prevention rather than general development programs (as will be documented further in subsequent chapters). A second broad contextual background dimension consists of logical frameworks and rational strategies for educational reform at regional and nation state level, driven by outcome standards and evidence
on ‘what works’. In educational effectiveness research a structured and disciplinary climate has repeatedly been confirmed as a relevant factor and cooperative learning is a moderately successful instruction strategy. However, within these schools of thought, including the development and application of taxonomies of educational objectives, students’ social-emotional attributes have been treated as means rather than ends in themselves. And this latter characteristic (social and emotional skills as ends) seems to be the most distinguishing in ‘what’s new’ in the current movement, although both interpretations, social-emotional skills as means and social-emotional skills as ends, are part of it. We also noted that in the educational tradition of didactic analyses and taxonomies of educational objectives, affective dimensions were approached with considerable reservation, because of concerns about privacy, about intruding on the pedagogical function of the family, and about indoctrination. These considerations, as well as technical concerns with respect to measurement issues, are probably the reason why the educationalist tradition has not been a strong impetus for the current social-emotional skills movement. Although there is some common ground, ‘logical frameworks and rational strategies’ offer rather a critical perspective on the more ambitious aims of the social-emotional skills movement. The developments described in this chapter indicate that the marriage between ‘emotional intelligence’ and perspectives on labor market needs is at the basis of what became known as 21st century skills and programs for social-emotional learning in the United States. Positive psychology and character education added an element of moral education, whereas the competency concept embraced by the European Union further underlined holistic development and alignment with real life outcomes.
Finally, let us take a global look at where we stand at this time (fall 2019) with the social-emotional skills movement and consider the intellectual state of affairs as well as adoption in educational policy. Substantively, the work by the OECD in preparation of its Longitudinal Study of Skill Development in Cities represents the current state of the art. As noted, the turn to apply the Big Five taxonomy of personality traits as the core conceptual framework is to be seen as a major development (Abrahams et al., 2019; Chernyshenko et al., 2018; John & De Fruyt, 2015). In these preparatory studies the authors show how the major dimensions of social-emotional learning, which originate from Goleman’s theory of emotional intelligence (see Table 1.2), are encompassed by the trait facets of the Big Five personality dimensions. As far as practical implementation and global dissemination are concerned, the state of affairs shows important implementation and institutionalization in the United States, widespread attention across EU countries and a prospective global impetus of the current OECD longitudinal Study on Social and Emotional Skills. Kirchgasler (2018, p. 710) refers to broad dissemination through the World Bank’s Skills Towards Employability and Productivity (STEP) skills measurement program. Effrem and Robbins (2019, p. 8) refer to a nationally representative survey of teachers, principals and district leaders in the United States by Transforming Education (Krachman & LaRocca, 2017), which concluded that “U.S. K-12
public schools devote a total of approximately $21–47 billion per year to SEL in terms of: (1) expenditure on SEL-related products and programs and (2) teacher time focused on SEL”. They also conclude that “the investment of teacher time on SEL is particularly striking. We find that teachers spend about 4.3 h per week on SEL, or approximately 8% of their total working time inside and outside of the classroom”. The United States has also included specific social and emotional skills in curriculum frameworks such as Head Start and the “Common Core” and developed important coordinating institutes such as the Collaborative for Academic, Social and Emotional Learning (CASEL) and the National Commission on Social, Emotional, and Academic Development (Effrem & Robbins, 2019). Cefai et al. (2018) provide an overview of how social-emotional education (SEE) is integrated into the curricula of various EU member states (Austria, Finland, Germany, Ireland, Italy, Malta, the Netherlands, Portugal, Spain and Sweden), together with briefer illustrations of policies and practices from the Czech Republic, Denmark, France, Greece, Lithuania, Norway, and the UK. It should be noted that the importance of the SEE components in these countries varies considerably, from points of attention in national curricula to mandatory school subjects, and varying sets of special programs. Although these efforts are not further systematized and quantified, they seem less far-reaching than what was described for the USA, if only because some of the developments are still at the stage of plans only. But social-emotional learning is definitely on the agenda, in some way or another, in EU countries. Despite growth and expansion, the soft skills movement is contested. In the final section of this chapter major lines of criticism were discussed.
These range from technical issues concerning the fuzziness of concepts, measurement issues and the solidity of the evidence base on malleability, to ethical concerns about privacy, shifting patterns of control in education, corporate interests, new forms of class-based stereotyping, Orwellian interpretations of use and misuse of technology, an ‘uncomfortable marriage’ between psychometrics and management, and questions about international standardization in the domain of social and emotional skills (countries compared in league tables of students’ character). In this book our focus will be to further document and assess the current state of affairs, particularly with respect to the more technical issues: conceptual clarification, theoretical interpretation and review of the evidence on effects of educational interventions.

References

Abrahams, L., Pancorbo, G., Primi, R., Santos, D., Kyllonen, P., John, O. P., & De Fruyt, F. (2019). Social-emotional skill assessment in children and adolescents: Advances and challenges in personality, clinical, and educational contexts. Psychological Assessment, 31(4), 460–473. https://doi.org/10.1037/pas0000591.
Artelt, C., Baumert, J., Julius-McElvany, N., & Peschar, J. (2003). Learners for life. Student approaches to learning. Results from PISA 2000. Paris: OECD.
Bloom, B. S., Hastings, J. Th., & Madaus, G. F. (1971). Handbook on formative and summative evaluation of student learning. New York: McGraw-Hill.
Brown, P., Cheung, S. Y., & Lauder, H. (2015). Beyond a human capital approach to education and the labour market: The case for industrial policy. In D. Bailey, K. Cowling, & P. Tomlinson (Eds.), New perspectives on industrial policy for a modern Britain (pp. 206–224). Oxford: Oxford University Press.
Bull, A., & Allen, K. (2018). Introduction: Sociological interrogations of the turn to character. Sociological Research Online, 23(2), 392–398. https://doi.org/10.1177/1360780418769672.
Cefai, C., Bartolo, P. A., Cavioni, V., & Downes, P. (2018). Strengthening social and emotional education as a core curricular area across the EU. A review of the international evidence, NESET II report. Luxembourg: Publications Office of the European Union. https://doi.org/10.2766/664439.
Chernyshenko, O. S., Kankaraš, M., & Drasgow, F. (2018). Social and emotional skills for student success and well-being: Conceptual framework for the OECD study on social and emotional skills (OECD Education Working Papers, No. 173). Paris: OECD Publishing.
de Corte, E., Geerligs, C. T., Lagerweij, N. A. J., Peters, J. J., & Vandenberghe, R. (1976). Beknopte didaxologie [Concise educational theory]. Groningen, Nederland: Wolters-Noordhoff.
Effrem, K., & Robbins, J. (2019). Social-emotional learning: K-12 education as New Age Nanny State (White Paper No. 192). Pioneer Institute for Public Policy Research.
Elias, M. J., Zins, J. E., Weissberg, R. P., Frey, K. S., Greenberg, M. T., Haynes, N. M., Kessler, R., Schwab-Stone, M. E., & Shriver, T. P. (1997). Promoting social and emotional learning: Guidelines for educators. Alexandria, VA: Association for Supervision and Curriculum Development.
European Commission. (2018). Proposal for a council recommendation on key competences for lifelong learning. Commission Staff Working Document. Brussels: European Commission.
Francis, B., Mills, M., & Lupton, R. (2017). Towards social justice in education: Contradictions and dilemmas. Journal of Education Policy, 32(4), 414–431. https://doi.org/10.1080/02680939.2016.1276218.
Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic Books.
Goleman, D. (1998). Working with emotional intelligence. New York, NY: Bantam Books Inc.
Goleman, D. (2001). Emotional intelligence: Perspectives on a theory of performance. In C. Cherniss & D. Goleman (Eds.), The emotionally intelligent workplace. San Francisco: Jossey-Bass.
Greene, P. (2018, January 24). Does social and emotional learning belong in school? Retrieved from https://curmudgucation.blogspot.com/2018/01/does-social-and-emotional-learning.html.
de Groot, A. D. (1986). Begrip van evalueren [Understanding evaluation]. Den Haag: Vuga.
Holborow, M. (2018). Language skills as human capital? Challenging the neoliberal frame. Language and Intercultural Communication, 18(5), 520–532. https://doi.org/10.1080/14708477.2018.1501846.
John, O., & De Fruyt, F. (2015). Framework for the longitudinal study of social and emotional skills in cities. Paris: OECD Publishing.
Kankaraš, M., & Suarez-Alvarez, J. (2019). Assessment framework of the OECD Study on Social and Emotional Skills (OECD Education Working Papers, No. 207). Paris: OECD Publishing. https://doi.org/10.1787/5007adef-en.
Kautz, T., Heckman, J. J., Diris, R., ter Weel, B., & Borghans, L. (2014). Fostering and measuring skills: Improving cognitive and noncognitive skills to promote lifetime success (OECD Education Working Papers, No. 110). Paris: OECD Publishing. https://doi.org/10.1787/5jxsr7vr78f7-en.
Kirchgasler, C. (2018). True grit? Making a scientific object and pedagogical tool. American Educational Research Journal, 55(4), 693–720. https://doi.org/10.3102/0002831217752244.
Krachman, S. B., & LaRocca, B. (2017, September). The scale of our investment in social-emotional learning. Retrieved from https://www.transformingeducation.org/wp-content/uploads/2017/10/InspirePaper-Transforming-Ed-FINAL-2.pdf.
Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: An overview. Theory into Practice, 41(4), 212. https://doi.org/10.1207/s15430421tip4104_2.
de Leeuw, A. C. J. (1974). Systeemleer en organisatiekunde. Leiden, The Netherlands: Stenfert Kroese.
Levy, F., & Murnane, R. (2013). Dancing with robots: Human skills for computerized work [PDF file]. Retrieved from https://www.thirdway.org/report/dancing-with-robots-human-skillsfor-computerized-work.
Matthews, G., Zeidner, M., & Roberts, R. D. (2002). Emotional intelligence: Science and myth. Cambridge, MA: MIT Press.
OECD. (2004). Problem solving for tomorrow’s world: First measures of cross-curricular competencies from PISA 2003. Paris: PISA: OECD Publishing. https://doi.org/10.1787/9789264006430-en.
OECD. (2015). Skills for social progress: The power of social and emotional skills. Paris: OECD Skills Studies: OECD Publishing. https://doi.org/10.1787/9789264226159-en.
Osher, D., Kidron, Y., Brackett, M., Dymnicki, A., Jones, S., & Weissberg, R. P. (2016). Advancing the science and practice of social and emotional learning: Looking back and moving forward. Review of Research in Education, 40(1), 644–681. https://doi.org/10.3102/0091732X16673595.
Peterson, C., & Seligman, M. E. P. (2004). Character strengths and virtues: A handbook and classification. Oxford University Press.
Rychen, D. S., & Salganik, L. H. (Eds.). (2001). Defining and selecting key competencies. Hogrefe & Huber Publishers.
Rychen, D. S., & Salganik, L. H. (Eds.). (2003). Key competencies for a successful life and a well-functioning society. Hogrefe & Huber Publishers.
Springer, K. (2019, July). The mastery transcript consortium has been developing a gradeless transcript for college admissions. This fall it gets its first test. T74 Newsletter. Retrieved from https://www.the74million.org/article/the-mastery-transcript-consortium-has-been-developing-a-gradeless-transcript-for-college-admissions-this-fall-it-gets-its-first-test/.
Urciuoli, B. (2008). Skills and selves in the new workplace. American Ethnologist, 35(2), 211–228. https://doi.org/10.1111/j.1548-1425.2008.00031.x.
Visser, A. (2016). Persoonsvorming als curriculaire uitdaging?! Een conceptueel vooronderzoek [Personality development as a curricular challenge?! A conceptual preliminary investigation]. Enschede: SLO.
Williamson, B. (2018). PISA for personality testing–The OECD and the psychometric science of social-emotional skills. Retrieved from https://codeactsineducation.wordpress.com/2018/01/16/pisa-for-personality-testing/.
Williamson, B. (2019). Psychodata: Disassembling the psychological, economic, and statistical infrastructure of ‘social-emotional learning’. Journal of Education Policy. https://doi.org/10.1080/02680939.2019.1672895.
Zeidner, M., Matthews, G., & Roberts, R. D. (2009). What we know about emotional intelligence. How it affects learning, work, relations and our mental health. Cambridge, MA: Bradford Books.
Zins, J. E., Weissberg, R. P., Wang, M. C., & Walberg, H. J. (Eds.). (2004). Building academic success on social and emotional learning: What does the research say? New York: Teachers College Press.

Chapter 2

Conceptual Challenges

2.1 Introduction

The chapter starts by referring to the multitude of desirable attributes presented as new objectives of education in programs for educational reform as ‘the 21st century skills blob’. Social and emotional skills are just a subset of this unstructured multitude of attributes. In order to bring the conceptual core more into focus, the following reduction strategy is followed:
– Disregard higher cognitive skills and learning strategies labelled as meta-cognition (because they are part of regular schooling and teaching);
– Disregard “modernization skills” like digital literacy, citizenship etc. (because they have an important knowledge component and can either be taught as new school subjects or be integrated in existing subjects).
What then remains are attributes of emotion, volition, behavior and skills that are considered of high value for both educational careers and later societal functioning (as some would say, for better people in a better world; Reimers, 2017). In three introductory paragraphs we provide a gradual structuring of the soft skills ‘subject matter’. After a first global inventory we define social-emotional skills as our prime focus and reiterate fundamental conceptual challenges. These are further structured by subsequently addressing definitions, the competency-concept and ordering frameworks that are based on psychological concepts, among others the Big Five taxonomy of personality traits and facets. Key issues dealt with in this chapter are the proposal to interpret skills as achievement-oriented dispositions, and a perspective on the demarcation of these dispositions in terms of a continuum from general to more specific dispositions, with traits, trait facets and skills as relevant labels.
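The two-step reduction strategy described in this introduction can be read as a simple filter over a catalogue of attributes. The sketch below is only an illustration under stated assumptions: the catalogue, the category labels (`higher_cognitive`, `modernization`, `social_emotional`) and the function name `reduce_to_core` are hypothetical and are not taken from any of the cited frameworks.

```python
# Hypothetical catalogue: attribute name -> category label.
# Both the selection of attributes and the labels are illustrative assumptions.
ATTRIBUTES = {
    "meta-cognition": "higher_cognitive",
    "digital literacy": "modernization",
    "citizenship": "modernization",
    "self-awareness": "social_emotional",
    "perseverance": "social_emotional",
    "empathy": "social_emotional",
}

# The two categories that the reduction strategy sets aside.
EXCLUDED = {"higher_cognitive", "modernization"}


def reduce_to_core(attributes):
    """Return, in alphabetical order, the attributes left after both exclusions."""
    return sorted(
        name for name, category in attributes.items() if category not in EXCLUDED
    )


core = reduce_to_core(ATTRIBUTES)
print(core)
```

In this toy catalogue only the entries labelled as social-emotional remain, which mirrors the narrowing of focus that the chapter goes on to make. How a real attribute would be labelled is exactly the conceptual question the chapter addresses, so the labels here should not be read as a classification proposal.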

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_2

2.2 The Non-cognitive Blob¹

¹ A blob is (among others) defined as an unclear, formless substance.

The growing attention for educational orientations that transcend traditional educational content and emphasize transversal skills, both cognitive and non-cognitive, has led to a high production of policy documents in which a many-faceted whole of skills and desirable attributes is listed. The European Commission in particular has produced a large number of such documents and underlying studies (Cefai, Bartolo, Cavioni, & Downes, 2018, provide an overview). At first sight this may come across as rather chaotic. By way of illustration, reference is made to the preparation for a new curriculum for primary and secondary education in the Netherlands, from 2015 onwards, under the heading of Our education 2032. A committee proposed a design for education in the Netherlands in the future, where 2032 was chosen as the arbitrary time point of reference. Basically, the message of the policy advice was to diminish time for academic, subject-based instruction somewhat and provide more space for general cognitive and non-cognitive attributes. In the background documents of the report, provided by consultants associated with the OECD, traditional subjects like algebra were described as obsolete, and a strong emphasis was placed on the development of social and emotional skills. The collection of attributes mentioned in these documents shows a broad and heterogeneous set of characteristics, of which the only probable common denominator was indeed that they are not cognitive, although in some instances even that is disputable.
In the background studies the following general types of non-cognitive attributes are mentioned: intra-personal and interpersonal competencies, cooperative skills, self-organizing skills, task direction, social skills, ethical values, social and emotional skills, character education (among others involving mindfulness, resilience, curiosity, courage, ethics and leadership), entrepreneurship, ICT literacy, financial literacy, civic literacy, global awareness, managing emotions, self-awareness, emotional intelligence, conscientiousness, adaptability, flexibility, positive disposition to cultural differences, ethical and moral issues associated with digital life, ‘cyber wellness’, digital ethics, revival of the concepts of character and virtue, self-awareness, self-management skills, perseverance and ‘grit’, internal locus of control, intrinsic motivation, active self-regulation and engagement, creativity (seen as hindered by assessment systems, p. 14), self-esteem, ‘emotional literacy’ in Kindergarten, basic goodness (OECD, 2015a, 2015b). This is an overwhelming mass of constructs. The safest way to characterize them all is to refer to them as non-cognitive attributes, although for quite a few of them this is disputable. Literacy is mostly associated with cognitive outcomes, although the term ‘emotional literacy’ has also appeared. Other terms are hybrids, particularly those that come under the umbrella of “competencies” and are supposed to have cognitive and non-cognitive parts. Many of the domains that characterize modernization and 21st century skills refer to knowledge, attitudes and beliefs. This applies to citizenship education, ICT and financial literacy, entrepreneurship and global awareness. Social skills have an element of knowing how, in the sense of knowledge of
behavioral rules, but may also aspire to emotional attributes like showing empathy. The term emotional skills sounds strange, but may refer to capacity to control as well as display emotions, and what about emotional intelligence? Both terms illustrate an important demarcation in the affective domain, namely between learnable (and perhaps also teachable) skills on the one hand and on the other hand personality traits, of which the malleability is disputed. A category that stands apart in the collection of terms is character education and personality development; this area raises fundamental questions with respect to the core mission of schooling, comparable to the reservations expressed by Bloom, Hastings, and Madaus (1971), which were referred to in the previous chapter. At this stage a first attempt to create some order in chaos is the following classification of subdomains of non-cognitive attributes in education. A first distinction is between content areas and psychological dimensions with social, emotional, motivational and moral aspects. On the content side a number of areas of modernization can be distinguished: ICT, global competence, citizenship and entrepreneurship. Sub-categories fitting the areas of modernization are indicated in Table 2.1 and sub-categories of the psychological dimensions are shown in Table 2.2. A similar kind of categorization is used by the OECD (2019) in their report on Curriculum content mapping. It should be noted that this first ordering is incomplete and arbitrary in several ways. One area that is hardly included is ethics and moral attributes; it is partly addressed under emotional attributes with the phrase ‘basic goodness’. Some examples are mentioned in more than one column, and for some it was dubious to place them in one column or another; curiosity, for example, might be included under ‘motivational’, but can also be seen as an element of character. Further clarification of the domain is required, and this can only be done by looking in more detail at further classification of the affective and motivational content of most specimens in Table 2.2.

Table 2.1 Classification of non-cognitive attributes according to areas of modernization and implied skills
ICT | Global competence | Citizenship | Entrepreneurship
ICT literacy; computational thinking/coding and programming ethics | Tolerance; cosmopolitism; commitment to peace; literacy for sustainable development | Democratic values; intercultural skills | Financial literacy; media literacy

Table 2.2 Classification of non-cognitive attributes according to psychological dimensions
Social attributes | Emotional attributes | Motivational attributes | Character/personal development
Being cooperative | Self-awareness; emotional literacy; emotional control | Self-esteem; internal locus of control; self-management skills | Adaptability; flexibility
Creativity | Engagement; intrinsic motivation | Conscientiousness; locus of control
Leadership | Basic goodness | Conscientiousness; perseverance; GRIT | Mindfulness
Emotional intelligence | Resilience | Curiosity; extraversion

2.3 Focus on Social and Emotional Attributes

In this book we focus on social and emotional attributes, which are only part of the attributes that were discussed when we referred to the roots of this movement in Chap. 1. The non-cognitive core on which we will focus has remained fairly consistent over time and has stayed close to Goleman’s (1998) four components of emotional intelligence: self-awareness, social awareness, self-management and relationship management (compare Table 1.2). This is evident from the consistency of this model with Zins et al.’s (2004) model of person-centered key SEL competencies (Table 1.3) and the ordering framework for the EU key competence ‘personal, social and learning to learn’, as proposed by Cefai et al. (2018), which reproduces Goleman’s original model (as far as the distinction of the four overarching categories is concerned). Although the OECD has chosen a different theoretical framework, namely the Big Five personality traits, there is a fair match at the level of more specific trait facets with the 30 attributes that Cefai et al. (2018) recommend to the European Commission.

Despite our intention to focus on this core of social and emotional skills, it will not always be possible to make clear demarcations between cognitive and non-cognitive attributes, and, when discussing programs in regular public education to enhance social emotional learning, it will not always be possible to avoid issues of moral education. Also, when we turn to the more fundamental conceptual questions with respect to the skill concept, distinctions may apply similarly to cognitive and non-cognitive attributes, for example when one considers meta constructs (as in meta-cognition, meta-conation and meta-affection).

2.4 Fundamental Conceptual Questions

Fundamental conceptual questions can be raised about skills. First of all, skills can be interpreted in different ways. As in our own usage so far in this chapter, the term is frequently used as a short label, which we would actually rather put between quotation marks (‘skills’), to express that we are not using the term in a strict sense but mean something more general, close to other terms like abilities, capacities and competencies. So, the first fundamental conceptual issue is to agree on a rationale for defining skills. Part of this issue is whether clear demarcation is possible between these attributes or whether fuzziness should be accepted.

Next, meaningful dimensions to categorize the domain of non-cognitive attributes might be desirable. A distinction that comes to mind rapidly is to subdivide the non-cognitive into affective/emotional on the one hand and conative/motivational on the other. A second candidate for further categorization is the distinction between skills as direct object level manifestations or as reflective meta level capacities. Third and finally, we should address the hierarchical nature of personality characteristics, that is, the distinction between traits and facets of traits. Traits of personality and character are seen as rather stable individual characteristics, whereas facets of traits are considered to be more variable across contexts. This distinction is related to expectations about the malleability, learnability and teachability of psychological attributes.

2.4.1 Definitions

Following Ryle (1976), de Groot and Medendorp (1986) use the term ‘disposition’ to refer to an action or reaction possibility that a system is expected to manifest in particular types of situations. The disposition concept applies to a broad range of systems, including dispositions owned by persons. De Groot and Medendorp subdivide dispositions of persons into three categories: (a) capacities and skills, which refer to achievement potential; (b) habits; and (c) liabilities & bents, which refer to inclinations (Ibid., 119). Of these three, the interpretation of habits is relatively straightforward and non-problematic. Habits can be observed as behavioral regularities, without an ambition to infer more general personal characteristics.

Dispositions referring to achievement potential (capacities and skills) are analyzed as follows by de Groot and Medendorp (Ibid., 120): person P is seen as possessing a set of underlying characteristics Ch, which together make up a potentiality Po, which enables P to manifest reactions R, which are counted as performance at a certain level, in a certain domain D, in a particular period of life Pe, and in situations of a particular type S. When it comes to operationally establishing potentiality (Po), the sequence should be followed in the reverse order: is it possible to construct S in such a way that P’s reaction R can be evaluated in terms of performance in domain D, and in such a way that a conclusion about P’s disposition, that is, about the level of his potentiality Po, can be drawn? A further distinction the authors make is between the attribution of achievement dispositions with and without a development perspective. This distinction refers to short- or long-term assumptions when establishing P’s potentiality in a domain. They illustrate their analyses of performance dispositions on the basis of a more in-depth treatment of intelligence testing. They describe the interpretation of performance measurement as a sequence of generalization steps: from taking test scores at direct face value, to performance on a particular type of test, to the current level of a person’s potentiality Po (in this case intelligence), generalizing over test types, to interpretation in terms of a person’s general intelligence G, generalizing over all types of situations in life.

The third category of dispositions, inclinations, is the domain of personality traits. De Groot and Medendorp characterize this domain as more problematic when it comes to rational application. They note that inclinations are difficult to distinguish from habits, but that it is clear on what the distinction depends: in the case of inclinations there is reference to non-observable, psychic reactions of a wide diversity. They also note that the distinction with capacities and skills (achievement potential) is sometimes vague, showing that character strength can be seen as an achievement, although operationalization and measurement are more problematic. In order to come to grips with the enormous diversity in terms and words indicating personality characteristics, the authors distinguish four major types of inclinations (Ibid., 131):

– The case when the trait in question coincides strongly with concrete behavioral habits, like agility in the psycho-motor domain and simple working habits like perfectionism or nonchalance.
– The case when traits, which refer to an inner tendency, are nevertheless relatively directly manifested in observable behavior, e.g. shyness and aggressiveness in social situations. Although these inclinations are relatively easy to observe, deciding on representative situations in which they would appear is more complex.
– Habitual psychic reaction tendencies, like pride or moodiness; for these, both finding representative situations in which they can be observed and the observation method itself are more complicated. Self-reports are a way out, but these are prone to uncertainties and socially desirable answers.
– Traits as defined in personality theories and taxonomies.
The latter category, traits as defined in personality theories and taxonomies, touches upon an important issue: some traits are considered more fundamental than others. Weinert (2001) refers to this issue in terms of the hierarchical nature of personality concepts. De Groot and Medendorp (Ibid., 132) mention several criteria that are in use when it comes to deciding whether a personality characteristic is fundamental or not: inclinations that are (1) elementary (not further divisible); (2) trans-situational; (3) embedded in a personality theory; (4) seen as causal to more specific facets in the development of the personality; (5) seen as innate predispositions; or (6) learned at a very early age. De Groot and Medendorp conclude that all these assumptions concerning fundamental traits remain dubious, despite all research efforts. Weinert (2001, p. 52) concludes that hierarchical ordering of competencies from universal to specific has often failed in psychology.

For our discussion on social and emotional skills, the excursion into conceptual analysis of basic concepts has some implications. First, social-emotional skills, according to the distinctions made by de Groot and Medendorp, would seem to fall in the category of inclinations with achievement potential. Inclinations are more innate and stable, just like cognitive ability. Habits and skills are learnable by practice, although within certain limits depending on cognitive ability and personality. In the current use of the term social-emotional skills, in the literature on soft skills, the distinction between capacities that reflect achievement potential and other psychological tendencies, here described as inclinations, is largely absent. Exceptions are contributions by Kyllonen, Lipnevich, Burrus, and Roberts (2014) and John and De Fruyt (2015), which will be referred to further on. The latter category (inclinations) is not used, and everything is indicated as a skill. Given the conceptual and methodological problems de Groot and Medendorp note with respect to applications of ‘inclination measures’, they warn against the use of such measures for selection and accreditation purposes. On the other hand, they endorse the use of such measures when establishing individual differences between persons as predictors of performance.

Second, in the conceptual analyses by de Groot and Medendorp and by Weinert, the generality versus situation specificity of dispositions is discussed as a central issue. Traits are more general when they generalize over a wider range of situations. This question is related to fundamental questions about the innate basis of the more general traits and their degree of malleability, i.e. the changeability induced by interventions in areas like therapy and education. This too is an issue that seems to be insufficiently seen as a problem in the current overriding conviction that social and emotional skills are malleable (as expressed, for example, by Schleicher, 2019).

Finally, the analysis by de Groot and Medendorp was made at a time when personality research had not yet made the advancement inherent in the Big Five personality factors. This advancement has improved the situation with respect to the categorization of a limited number of higher order general factors and more specific underlying constructs (traits and trait facets), as well as the availability of validated and standardized measuring instruments.
Also, the issue of general and more specific dispositional tendencies has been given a more up-to-date discussion with the introduction of the competency concept, which is, if anything, an interesting hybrid of achievement potential and other inclinations. This will be briefly addressed in the next section.

2.4.2 A Note on the Competency Concept

It is beyond the scope of this chapter to address theoretical and practical issues surrounding the competence and competencies concepts in any detail. We will only briefly refer to some conceptual issues that are related to the analysis of social emotional skills. Several debatable aspects of social emotional skills are also addressed in the work on competencies (e.g. the question of skills being transversal or domain- and situation-specific), and the literature on competencies is seen as helpful in clarifying the practical application of skills, specifically the way skills taught at school match demands in real life contexts, like work.

One basic definitional question is to distinguish competence from competency (singular) and competencies (plural). Competence means being able to manifest certain performance; e.g. “she has the competence of being a good nurse” (Mulder & Winterton, 2017, p. 14). Competency is an element and characteristic of competence, and as such part of a generic competence. The plural forms competences and competencies appear to be used rather arbitrarily, as the appropriateness would depend on the generality or specificity of the domain that is being referred to. A frequently encountered further specification of competency refers to “a coherent cluster of knowledge, skills and attitudes, which can be utilized in real performance contexts” (Mulder, 2014). Weinert refers to action competence as “all those cognitive, motivational and social prerequisites necessary and/or available for successful learning and action” (Weinert, 2001, p. 51).

It is worthwhile to consider the implications of this holistic interpretation of competencies for curriculum development and empirical assessment. Corporate training and vocational and professional (higher) education are settings that gave an impetus to the development and flourishing of the competence concept, next to the heritage of emotional intelligence (Mulder & Winterton, 2017). Responsiveness of education to demands of the labor market is an age-old concern that is also at the basis of the current modernization efforts, of which competency-based education and propagation of social emotional skills are major trends. In curriculum research, job and workplace task analyses have been used as a basis for selecting educational content (knowledge and skills) functional to the needs in diverse work fields (Nijhof, 1983). Theoretically, empirical studies might be able to capture those “intellectual abilities, content specific knowledge, cognitive skills, domain specific strategies, routines and subroutines, motivational tendencies, volitional control systems, personal value orientations, and social behaviors” (Weinert, 2001, p. 51), all seen as part of a complex system and representing competence in real life situations.
Next, according to the functionalist approach, the challenge for education would then be to design ‘competency based’ curricula and lesson programs mirroring such a complex system of work competences. We have no systematic overview of the successes and failures of competency based educational programs, but anecdotal evidence suggests that the concept is controversial and difficult to implement. As far as empirical assessment of the holistic nature of action competencies is concerned, the few examples that we have come across indicate that composite concepts like ‘grit’ (Credé, 2018) and ‘career competence’ (Kuijpers, 2003) break down into their constituent and more specific sub-skills or attributes, thus refuting holistic aspirations.

Competencies are defined at very different levels of universality, generality and abstraction. In a previous section we came across the EU’s key competencies. These are encompassing formulations of main areas that should be covered in order to meet societal functions of education. As we noted, most of the EU’s key competencies refer to basic subject matter areas, like language and mathematics, and to applied domains like ICT, citizenship, and management (entrepreneurship). Only one is not tied to a traditional discipline or an applied domain, namely the one that refers to the personal, social and learning to learn competence. But apart from having one key competency totally dedicated to social, emotional and learning to learn attributes, the EU key competencies framework conforms to the notion that all of the key competencies are to be seen as complex systems of knowledge, beliefs and action tendencies, constructed from “well-organized domain-specific expertise, basic skills, generalized attitudes, and converging cognitive styles” (Weinert, 2001, p. 53).

Two analytic questions have arisen with respect to the assumption that competencies are complex systems of knowledge, belief and action tendencies: first, the question of the malleability of some of the constituent elements of competencies, and second, the question of how the interaction between these elements is to be seen. On the first issue Weinert (2001, p. 53) notes that there are large individual differences in cognitive abilities, cognitive styles and emotional qualities and that it is doubtful whether they can be modified through learning, whether deficits can be compensated for, and whether people can indeed change in the required direction. He adds: “If this question remains unasked or unanswered, there is a danger that an academic discussion of key competencies will trivialize the already enormous inter-individual differences and could lead to a surge of individual discrimination” (Ibid., p. 53).

For our purposes, the second question regards the supposed interrelationships between cognitive and non-cognitive elements in the competency concept. Referring to the various roots of the 21st century skills and socio-emotional learning movement, it can be inferred that non-cognitive elements are seen as instrumental and supportive with respect to cognitive development and general educational attainment. Motivational attributes like emotional control have been considered as prerequisites for learning, as have social attributes, frequently defined as prevention of non-social behavior. It seems that in the EU’s framework of key competencies social and emotional skills are seen both as instrumental to cognitive development and preparation for societal functioning and as goals in themselves. We shall return to both issues in subsequent sections.

2.5 Further Categorization and Contextualization of Social and Emotional Skills, Based on Psychological Concepts

2.5.1 Constructs and Meta-Constructs of Personality and Intelligence

The basic constructs of the mind that influence human behavior and learning are categorized as cognitive, affective and conative (Hilgard, 1980; Snow & Jackson, 1997). Although the emphasis in this chapter is on the affective and conative constructs, cognitive constructs will be brought into the picture as well, because there are connections with the other two. The cognitive construct has to do with intelligence, the affective deals with emotions and the conative is the construct that covers volition and striving. Concepts in the conative domain are readiness for action, achievement orientation, tenacity and motivational intensity. Table 2.3, based on Kyro, Seikula-Leino, and Myllari (2008), provides an overview of all three constructs. Conation is interpreted as having links with both personality and intelligence.


Table 2.3 Constructs and meta-constructs of personality and intelligence (Kyro et al., 2008)

Personality | Intelligence
Affection | Conation | Cognition
Temperament; Emotion | Motivation; Volition | Procedural knowledge; Declarative knowledge
Traits of temperament; Characteristic moods | Achievement orientation; Action control | General mental ability factors; Special mental ability factors
General personality factors; Special personality factors | Orientations towards self; Orientation towards others | Skills
Values and attitudes | Interests | Beliefs; Domain knowledge
Meta-affection | Meta-conation | Meta-cognition

The distinctions in the cognitive domain between procedural knowledge and declarative knowledge, and between general mental ability factors and special mental ability factors, conform to the revised taxonomy of Bloom. Within the construct of affection, the distinction between traits of temperament and personality factors depends on the degree to which characteristics are seen as more lasting and independent of situational aspects. Traits of temperament are considered as more stable and independent of individual situations than personality factors. Also, temperamental factors are expected to be closer to biological factors. But these are differences in degree, and there is a lot of commonality as well. According to Shiner and DeYoung (2011) “Historically temperament and personality have been studied as distinct sets of individual differences, with temperament consisting of more narrowly defined consistencies that appear earlier in life and with personality consisting of a broader range of consistencies that emerge later in life. However, if we restrict our consideration of personality to traits rather than characteristic adaptations or narratives, then temperament and personality traits have much in common”. The two sets of traits (temperament and personality) are both seen as shaped by heredity and by the environment. Shiner and DeYoung point to the correspondence between Rothbart’s five temperament factors, as one of the more established temperament models, and the Big Five personality factors (Evans & Rothbart, 2007). Emotional intelligence is more like a hybrid of affective and cognitive factors. Likewise, creativity has hybrid characteristics. On the one hand it may reflect esthetic sensitivity as an emotional quality, and on the other hand it may be seen as a part of intelligence, in the form of divergent thinking.
About creativity Sternberg and Lubart (1996) say that “the ability to produce work that is novel (original, unique), useful and generative” is very useful in work settings, but the major problem is how it can be measured objectively, reliably and validly.

A very interesting set of constructs in Table 2.3, from Kyro et al. (2008), are the self-regulatory capacities in the cognitive, affective and conative domains, indicated as meta-affection, meta-conation and meta-cognition. Developments in the cognitive domain and meta-cognition are leading, and the two other meta-concepts are more recently proposed, hypothetical analogues. Kyro et al. (Ibid.) provide some examples of meta-conative and meta-affective expressions in the context of learning entrepreneurial competences. As an example of ‘meta-affection’ they refer to “sensing the group atmosphere in a peer group activity”. “Experiencing the freedom that entrepreneurship brings” is seen as an example of meta-conation. Meta-affection seems to come down to reflecting on one’s own feelings and the emotional climate during group tasks, and meta-conation is about self-reflection on one’s own and others’ achievement orientation and action control. Yet, described in these terms, the reflectivity appears to be cognitive. Literally, meta-affection means feeling about one’s feelings, and meta-conation means having motivational reactions towards expressed motivation (by oneself or others), just as meta-cognition is reflection on one’s cognitive behavior, as in ‘learning to learn’ (cf. Weinert, 2001). An example of meta-affection from the European Commission’s categorization of social emotional skills is “appreciation of one’s positive emotions, such as happiness and excitement” (Cefai et al., 2018). Examples of meta-conation seem to be harder to come by. Still, a distinction between skills and other dispositions at the level of direct expression and at meta-level is quite relevant,2 particularly when the ambition is to see skills as “transversal”.

2.5.2 The Big Five Personality Factors

The Big Five personality factors have received strong empirical corroboration and are considered as the basic structure of personality traits in adults. The Big Five traits include Openness/Intellect, Conscientiousness, Extraversion, Agreeableness and Emotional stability. In Table 2.4 the Big Five domains are rendered from an educational perspective and included in a framework presented by the OECD (2015b), with compound skills as an extra domain. In Chap. 3 the Big Five will be discussed from the original psychological perspective. A striking feature of the OECD’s interpretation of the Big Five, depicted in Table 2.4, is that descriptions sometimes refer to skills (‘able to’), in other cases to behavior, and in still other cases to affect. In the usual Big Five instruments one sees only statements about thoughts, feelings and behavior.

In the conative domain, related to ‘orientations towards self’ and to ‘orientations towards others’, the construct locus of control, which is not part of the Big Five, should be mentioned as well. Rotter (1966) differentiated internal and external locus of control. Internals are those who believe that they are themselves in control of their fate, while externals are those who believe that they are not in direct control and who perceive themselves in a passive role with respect to the external environment. Internals perceive a strong link between their actions and consequences; externals tend to attribute personal outcomes to the external environment. Of the Big Five traits, neuroticism/emotional stability is seen as most closely related to locus of control (Sorensen & Eby, 2006).

2 In control theory a distinction is made between steering at object level and steering at meta-level. Meta-level control is at a higher level of abstraction: controlling control (de Leeuw, 1990).

Table 2.4 Big Five personality dimensions and compound skills. Cited from OECD (2015a, 2015b)

Big Five domains | Skills a | Descriptions
Conscientiousness | Achievement orientation | Setting high standards for oneself and working hard to meet them
Conscientiousness | Responsibility | Able to honor commitments, and be punctual and reliable
Conscientiousness | Self control | Able to avoid distractions, and focus attention on the current task in order to achieve personal goals
Conscientiousness | Persistence | Perseverance in tasks and activities until they get done
Emotional stability | Stress resistance | Effectiveness in modulating anxiety and able to calmly solve problems (relaxed, handles stress well)
Emotional stability | Optimism | Positive and optimistic expectations for self and life in general
Emotional stability | Emotional control | Effective strategies for regulating temper, anger and irritation in the face of frustrations
Agreeableness | Empathy | Kindness and caring for others and their well-being that leads to valuing & investing in close relationships
Agreeableness | Trust | Assuming that others have good intentions and forgiving those who have done wrong
Agreeableness | Cooperation | Living in harmony with others and valuing interconnectedness among all people
Openness to experience | Curiosity | Interest in ideas and love for learning, understanding and intellectual exploration; an inquisitive mindset
Openness to experience | Tolerance | Is open to different points of view, values diversity, is appreciative of foreign people and cultures
Openness to experience | Creativity | Generating novel ways to do or think about things through exploring, learning from failure, insight and vision
Extraversion | Sociability | Able to approach others, both friends and strangers, initiating and maintaining social connections
Extraversion | Assertiveness | Able to confidently voice opinions, needs and feelings, and exert social influence
Extraversion | Energy | Approaching daily life with energy, excitement and spontaneity
Compound skills | Self-efficacy | The strength of individuals’ beliefs in their ability to execute tasks and achieve goals
Compound skills | Critical thinking; independence | The ability to evaluate information and interpret it through independent and unconstrained analysis
Compound skills | Self-reflection, meta-cognition | Awareness of inner processes and subjective experiences, such as thoughts and feelings, and the ability to reflect on and articulate experiences

a The more appropriate label to be used here is “trait facets”, but the OECD uses the term skills.

2.5.3 Recapitulation and Preliminary Conclusions

So, what is to be learned from this expedition in the field of psychological study of individual differences? First, a large part of the non-cognitive attributes in recent proposals for the modernization of education (see the paragraph on the 21st century skills blob) refers to facets of intelligence, personality and motivation. Factors like adaptability, flexibility, emotional intelligence, conscientiousness, internal locus of control and resilience, mentioned as targets by educationalists, are directly included in, or are direct offspring of, the psychological constructs ordered in Tables 2.3 and 2.4. Connecting the traditional interests in diligence and appropriate behavior in schools to this whole framework of psychological constructs has “blown up” the issue enormously, and the mass of concepts seems to have grown rather than diminished from where we started with our ‘21st century blob’. What can be seen as an important step in the right direction is that the psychological literature offers a number of options for clearer categorization and sharper definition.


Secondly, the psychological literature can be interrogated on the tenability and feasibility of the educational aspirations to deal with psychological dimensions and behavioral dispositions in a prescriptive way. Here the following points can be mentioned:

(a) The charting of concepts in the study of individual differences in psychology is helpful in providing an ordered comprehensive overview.
(b) Despite more clarity in separating constructs, strong interrelatedness is a remaining issue.
(c) Reflection on underlying dimensions on which personality traits and intelligence are expected to develop and are more or less hierarchically ordered is relevant for educational applications.
(d) Given the partly innate character of traits, their degree of malleability is a second consideration that is most relevant for educational applications.

Re (a) The set of psychological constructs presented in Tables 2.3 and 2.4 provides a useful framework for ordering affective and conative constructs, next to the more familiar cognitive constructs.

Re (b) Despite the distinction between cognitive, conative and affective, these are still considered as interactive elements in human intelligence and personality. Even though the cognitive is seen as fundamental in learning tasks, the learning task may also have a strongly emotional facet, sometimes because of stressful conditions, and otherwise because of motivational dimensions. As Kyro et al. (2008, p. 2) put it: “a need for achievement (conative domain) construct can also be seen from an affective perspective. At a deeper level the affective construct relates to values and attitudes. To put this simply, what we regard as valuable guides our willingness and interest to learn. Thus, the affective construct is as fundamental to our learning as the conative construct”.
Re (c) In the cognitive domain a hierarchical ordering of reactions from simple to more complex operations and from simple reactions to mental organization is a likely option. This is particularly emphasized in taxonomies of educational objectives as developed by Bloom and others (Bloom et al., 1971). In the taxonomy of affective educational objectives an ordering from simple affective awareness to internalizations in increasingly complex structures is presented. In the ordering of psychological constructs in Table 2.3, however, such hierarchical ordering is less in evidence and does not seem to play a role, although the distinction of meta-cognitive, meta-affective and meta-conative operations represents a step toward higher level operations. Instead, the degree to which constructs are seen as relatively stable person related dispositions, as compared to more specific situation dependent reactions, would be the most relevant ordering dimension. This has direct implications for expectations and ambitions to influence these personality and intelligence characteristics (see point (d) below).

Re (d) As was noted, traits of temperament would be relatively more determined by biological factors and genetic endowment than personality traits, although this was seen as a relative difference. A distinction that is related to the degree to which traits are considered as innate, or shaped by experience, is the generalizability of application and manifestation of the trait across situations. If one is said to be intelligent, this would not be dependent on whether the person was acting in school or outside school. This continuum from general and innate traits towards manifest, situation specific behavior is illustrated in Fig. 2.1. The entries below the column headings are illustrative examples. For the topic at hand, non-cognitive attributes of education, the continuum illustrated in Fig. 2.1 is very important, particularly when affective (and conative) educational goals are intended. The expected success of this endeavor is less problematic at the very right-hand side of the figure, where, for example, behavior modification could be considered. But what could be said about the ambition to make children, for example, more conscientious and less external in their locus of control?

Innate >>> Shaped by experience and learning
Trait of temperament | Personality trait | Behavioral disposition (general) | Behavioral disposition (situation specific) | Manifest behavior
Negative emotionality | Neuroticism | Vulnerability | Fear of failure in examinations | Acting nervously
Effortful control | Conscientiousness | Achievement oriented; internal locus of control | High motivation to succeed in math | Always does homework

Fig. 2.1 From general innate traits to manifest situation specific behavior; examples from the affective and conative constructs

2.5.4 Kyllonen et al.’s (2014) Contribution to the Analysis of Personality, Motivation and College Readiness

In a quite comprehensive paper Kyllonen et al. (2014) address several main issues that have been discussed in this chapter so far. The additional themes in the Kyllonen paper are:

– A more elaborated introduction of the Big Five or Five Factor Model (FFM), which, as we have seen, is also used in the OECD work on socio-emotional skills.
– Association of the Big Five traits with concepts of character strength and applied skills desired in the world of work and school life (the latter subsumed under the label of self-regulated learning).
– Research findings about the development of the Big Five traits across a lifetime.
– Further reflection about state/trait distinctions and situational determinacy.

Elaboration on the Big Five factors

Table 1 of the paper lists 30 facets of the Big Five factors, as specified in the NEO-PI-R assessment (Ibid., p. 6), as well as exemplary items. Table 2 in Kyllonen et al. (2014), adapted and reproduced below, matches the factors with ideals of character strength, self-regulated learning and desirable skills at the workplace (see Table 2.5). For our purposes the column reflecting self-regulated learning is the most interesting. It is related to the discussion about meta skills. We shall return to this issue in a final summary table further on.

Table 2.5 Big Five factors related to areas of application. Adapted from Kyllonen et al. (2014)

| Big Five factor | Character strength | Self-regulated learning | Applied skills (work) |
|---|---|---|---|
| Conscientiousness | Persistence, integrity, prudence | Planning, self-organization, motivation | Ethics, work ethics, lifelong learning |
| Neuroticism | Self-regulation | Self-efficacy, goal orientation | Adapting & coping |
| Extraversion | Humor, leadership, bravery | | Leadership |
| Agreeableness | Hope, kindness, modesty, love, forgiveness | Social independence | Teamwork, collaboration |
| Openness | Curiosity, social intelligence, creativity | Intrinsic/extrinsic motivation | Creativity, innovation, lifelong learning |

2.5.5 Development of the Big Five Traits Across a Lifetime

Kyllonen et al. present tables indicating that most of the factors grow stronger over time, with the steepest increase during the first 20 years. Conscientiousness shows a steadier growth pattern across the lifetime, as does emotional stability. The authors conclude that the traits are changeable and malleable (Ibid., p. 20). The meta-analysis of Durlak, Weissberg, Dymnicki, Taylor, and Schellinger (2011, to be discussed in Chap. 4) is mentioned as the main source of support for the malleability of the traits by means of interventions at school.

2.5.6 Kyllonen’s Interpretation of the Trait-State-Behavior Continuum

First, it is important to distinguish the terms facets and states. In psychology, a facet is a specific and unique aspect of a broader personality trait. Kyllonen et al. say that “A facet is a lower order factor or item cluster in the Five Factor Model hierarchy” (p. 2). Facets are considered less stable than traits. Personality traits are individuals’ characteristic ways of thinking, feeling and behaving that are consistent across situations and long lasting. Unlike traits, which are stable characteristics, states are temporary behaviors or feelings that depend on a person’s situation and motives at a certain moment. Formally, facets are conceptual specifications. States can also be expressed in terms of more specific interpretations of personality characteristics, but here the main distinction is between the innate or situation-dependent nature of the concept. There is also overlap in the interpretation of facets on the one hand and states on the other, as the instability of states is seen as being caused by situational dependence.

Above (Fig. 2.1) we indicated a continuum from mostly innate traits of temperament and personality, via intermediary concepts like traits and behavioral dispositions, to manifest behavior. Along the continuum the characteristics in question gradually become more specific, situationally determined and behavioral. This continuum is considered relevant for the question of malleability of non-cognitive attributes in educational settings. This applies to the ambitions of educational goals to either improve character and personality or aim for more narrowly defined behavior restricted to situational domains. It also matters for fostering realistic expectations of the effects of educational treatments. Analytically one could think of a continuum of increased specification:

Practically, from a perspective of education and training, one could think of a continuum of internalization and learning, in the reverse order:

Kyllonen et al. (2014) give a causal interpretation of the first continuum when they address the influence of personality traits on student outcomes. They refer to the intermediary levels (states, facets) as mediators. As an example, they mention energy regulation and perseverance as facets of conscientiousness, which would facilitate class attendance and educational achievement. Although Kyllonen et al. consider the Big Five factors as malleable, this is a matter of broad and long-term exposure in different spheres of life. When it comes to the malleability of social-emotional functioning in education, it seems more realistic to target ‘intermediary’ facets and behavioral dispositions, as well as direct behavioral modification, as points of leverage. In this context Kyllonen et al. mention “particular facets that have proven to be important in education, such as the achievement striving and dependability facets of conscientiousness and the anxiety facets of emotional stability.” (p. 3) They also mention specific interventions that would impinge on the Big Five factors: “For each of the five factors, specific interventions have proven successful. These include exercises and training in critical thinking (openness), study skills (conscientiousness), test and math anxiety reduction (neuroticism), teamwork and leadership (extroversion and agreeableness), and attitudes.” The authors say that “Interventions along the lines of those described here could be evaluated in conjunction with a comprehensive psychosocial assessment system” (Ibid., p. 3).

2.6 Towards a Comprehensive Framework of Skills in the Cognitive, Affective and Conative Domains

Grosso modo we follow the choices made by others (John & De Fruyt, 2015; Kyllonen et al., 2014; OECD, 2015a, 2015b) to use the Big Five personality factors as point of departure. For an elaborate documentation of the choice for the Big Five as an overarching framework the reader is referred to Abrahams et al. (2019, pp. 460–464). In order to sharpen the delineation of the most relevant social and emotional attributes, we exclude some areas and include some extra dimensions. We exclude those 21st century skills that have a basis in content-based subjects. These are active citizenship and computer literacy, which can best be seen as part of the regular school curriculum, either as new subjects, or as being addressed in traditional subjects, like social science, history, and mathematics. Similarly, entrepreneurial skills might be seen as being part of economics. Still, we recognize that citizenship education is likely to address social skills in specific contexts, and entrepreneurial skills will tend to include leadership skills. Next, although our focus is on the affective and conative constructs, we include a cognitive dimension. Meta-cognition is a frequently occurring area and might be seen as an extension of training higher order cognitive skills (as defined in taxonomies of educational objectives). Moreover, the predominance of reflective and ‘meta’-orientations in the interpretation of skills raises questions about the metacognitive nature of reflections on motivation and strivings (conation) and affection. We use a categorization of cognitive, conative, affective, affective/social, and conative/affective constructs to emphasize the general orientation of each of the Big Five factors. Like Kyllonen et al. (2014) and the OECD, we distinguish a trait and a trait-facet level and, borrowing a term from John and De Fruyt (2015, p. 44, Table 5.1), include a category that is indicated as skill equivalents of trait facets.[3] We subdivide object level and meta-level skills. An example of a skill at object level is student cooperation in learning tasks. Self-regulation of social interaction is seen as a skill at meta-level. We note that the status of constructs like meta-affection and meta-conation is uncertain. It might be argued that reflection on emotions is primarily to be seen as a cognitive operation.

[3] It should be noted that our interpretation of “skill equivalents” as achievement dispositions differs from the way John and De Fruyt use this term.

It should be noted that no attempt has been made to include all of the dozens of skills that are mentioned in recent documents by the European Commission and the OECD, nor to be exhaustive with respect to the large number of Big Five facets distinguished in the psychological literature. Instead, the skills and meta-skills included in the framework are to be seen as merely illustrative; see Table 2.6 below in the Summary and Conclusion section. When looking at the last column in the table, it is very clear that the skill equivalents of the psychological constructs are all formulated in terms of capacities or performance oriented dispositions. In order to execute such skills, people need to possess declarative knowledge about HOW to do something in terms of IF-THEN relationships (e.g. IF you want to stay relaxed during an academic test, THEN say to yourself that you are able to do it). To acquire the ability to stay calm, one needs to apply and exercise this knowledge in different stressful situations, so that it becomes automatized and is stored in long-term memory as procedural, more tacit (difficult to verbalize) knowledge. As such, social and emotional skills are just as cognitive as academic, motoric, musical, professional or other types of skills. To put it more clearly, all skills, including skills in the social-emotional domain, are cognitive and should be addressed and assessed as such in educational settings.

2.7 Summary and Conclusion

We started out by picturing the field as having yielded a rather unstructured mess of ill-defined concepts. In recent years important work has been done to provide more structure. In particular, the move from the broad categories of emotional intelligence to the application of the Big Five personality model by authors like Kyllonen et al. (2014), John and De Fruyt (2015), the OECD (2015b), Chernyshenko, Kankaraš, and Drasgow (2018) and Abrahams et al. (2019) is to be considered a big step forward in providing an ordering framework. Our endorsement of the Big Five as an ordering framework does not mean that we are uncritical of its application for measuring social and emotional outcomes in education (see subsequent chapters). In the preceding sections we have addressed several definitional and classification issues, which have yielded a tentative summary ordering framework, but leave us with quite a few unresolved and critical questions as well. The core of these open questions is unresolved complexity and fuzzy demarcations between key concepts, such as traits and skills. Some answers to this, like referring to holistic approaches, emphasize the seriousness of the problem rather than offering a cure. In summary we encountered the following problematic issues:

– the fuzziness, uncertainty of application, and measurement problems of less achievement oriented dispositions like personality traits, as discussed by de Groot and Medendorp;
– the complexity of the competency concept, particularly in coming to grips with the way cognitive, motivational and attitudinal facets are supposed to interact;
– unsettled demarcation issues between traits, more specific behavioral dispositions, and skills;
– doubts about the usefulness of concepts like meta-affection and meta-conation.

We have tried to chart the field but feel we have succeeded only to a limited extent in bringing order to chaos. But our search does not end here. In the next chapter we will make an excursion into psychological research on the stability and changeability of traits and trait facets, and the effects of clinical interventions. Another compelling issue that has not been addressed is the place social and emotional skills are expected to have in schooling. There is an issue with respect to the desirability of attempting to modify personality in the context of schooling. If education programs were to become a prescriptive branch of positive psychology, ethical questions should be raised about privacy and perhaps even about the threats of indoctrination. But there are also more pragmatic considerations. These issues, too, will be taken up in subsequent chapters.

Table 2.6 A hypothetical final ordering framework of social and emotional skills

| Psychological construct | Personality factor (trait) | Trait facet (object level) | Trait facet (meta-level) | Skill equivalents (object and meta) |
|---|---|---|---|---|
| Cognitive | Intelligence | Higher order cognitive skills: transfer, insight, innovation | Meta-cognition: learning to learn; applying control and learning strategies | Ability to apply knowledge acquired at school to outside school situations |
| Conative | Conscientiousness | Achievement-orientation, perseverance, orderliness, grit | Self-regulation, resilience, self-evaluation | Capacity to set realistic targets at school and in daily life |
| Affective | Neuroticism | Anxiety, fear of failure, test anxiety | Emotional control; appreciation of one’s positive emotions | Capacity to remain calm during stressful situations at school or in daily life |
| Affective/social | Extraversion | Sociability, assertiveness | Affective assessment of one’s own assertiveness in social situations | Capacity to effectively react in case of treatment that is experienced as unjust |
| Affective/social | Agreeableness | Cooperative, trustful | Self-evaluation and monitoring of social situations | Capacity to adequately assess whether being taken advantage of by others |
| Conative/affective | Openness | Imaginativeness, creativity, curiousness | Self-monitoring of creativity and inhibitions | Ability to manifest divergent thinking in discussions with others |

References

Abrahams, L., Pancorbo, G., Primi, R., Santos, D., Kyllonen, P., John, O. P., et al. (2019). Social-emotional skill assessment in children and adolescents: Advances and challenges in personality, clinical, and educational contexts. Psychological Assessment, 31(4), 460–473. https://doi.org/10.1037/pas0000591.
Bloom, B. S., Hastings, J. Th., & Madaus, G. F. (1971). Handbook on formative and summative evaluation of student learning. New York: McGraw-Hill.
Cefai, C., Bartolo, P. A., Cavioni, V., & Downes, P. (2018). Strengthening social and emotional education as a core curricular area across the EU: A review of the international evidence. NESET II report. Luxembourg: Publications Office of the European Union. https://doi.org/10.2766/664439.
Credé, M. (2018). What shall we do about grit? A critical review of what we know and what we don’t know. Educational Researcher, 47(9), 606–611. https://doi.org/10.3102/0013189X18801322.
Chernyshenko, O. S., Kankaraš, M., & Drasgow, F. (2018). Social and emotional skills for student success and well-being: Conceptual framework for the OECD study on social and emotional skills (OECD Education Working Papers, No. 173). OECD Publishing.
Durlak, J. A., Weissberg, R. P., Dymnicki, A. B., Taylor, R. D., & Schellinger, K. B. (2011). The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions. Child Development, 82(1), 405–432. https://doi.org/10.1111/j.1467-8624.2010.01564.x.
Evans, D. E., & Rothbart, M. K. (2007). Developing a model for adult temperament. Journal of Research in Personality, 41(4), 868–888. https://doi.org/10.1016/j.jrp.2006.11.002.
Goleman, D. (1998). Working with emotional intelligence. New York, NY: Bantam Books.
de Groot, A. D., & Medendorp, F. L. (1986). Term, begrip, theorie. Inleiding tot signifische begripsanalyse [Term, concept, theory. Introduction to Signatory Concept Analysis]. Meppel, The Netherlands: Boom.
Hilgard, E. R. (1980). The trilogy of mind: Cognition, affection and conation. Journal of the History of Behavioural Sciences, 16, 107–117.
John, O., & De Fruyt, F. (2015). Framework for the longitudinal study of social and emotional skills in cities. Paris: OECD Publishing.
Kuijpers, M. (2003). Loopbaanontwikkeling. Onderzoek naar ‘competenties’ [Career development. A study into ‘competencies’]. Enschede: Twente University Press (TUP).
Kyllonen, P., Lipnevich, A. A., Burrus, J., & Roberts, R. D. (2014). Personality, motivation, and college readiness: A prospectus for assessment and development. Research report. Educational Testing Service RR-14-06. ETS Research Report Series.
Kyro, P., Seikkula-Leino, J., & Myllari, J. (2008). How the dialogue between cognitive, conative and affective constructs in entrepreneurial and enterprising learning processes is explicated through concept-mapping. In Proceedings of the Third International Conference on Concept Mapping. Tallinn, Estonia and Helsinki, Finland. Retrieved from http://cmc.ihmc.us/cmc2008papers/cmc2008-p337.pdf.
de Leeuw, A. C. J. (1990). Systeemleer en organisatiekunde. Leiden, The Netherlands: Stenfert Kroese.
Mulder, M., & Winterton, J. (2017). Introduction. In M. Mulder (Ed.), Competence-based vocational and professional education. Bridging the worlds of work and education. Cham: Springer.
Mulder, M. (2014). Conceptions of professional competence. In S. Billett, C. Harteis, & H. Gruber (Eds.), International handbook on research into professional and practice-based learning (pp. 107–137). Dordrecht, The Netherlands: Springer.
Nijhof, W. J. (1983). On the design of curricula. Enschede, The Netherlands: University of Twente.
OECD. (2015a). Review study OECD Dutch curriculum: Onderwijs2032. Evidence about knowledge and skills for work and learning. Retrieved from www.onsonderwijs2032.nl/advies.
OECD. (2015b). Skills for social progress: The power of social and emotional skills. Paris: OECD Skills Studies, OECD Publishing. https://doi.org/10.1787/9789264226159-en.
OECD. (2019). Education 2030 curriculum content mapping: An analysis of the Netherlands curriculum proposal. Paris: OECD Publishing.
Reimers, F. M. (2017). Empowering students to improve the world in sixty lessons. Version 1.0. North Charleston, South Carolina: CreateSpace Independent Publishing Platform.
Rotter, J. B. (1966). Generalized expectancies for internal versus external control of reinforcement. Psychological Monographs, 80(1), 1–28. https://doi.org/10.1037/h0092976.
Ryle, G. (1976). The concept of mind. Harmondsworth, Middlesex: Penguin Books.
Schleicher, A. (2019, July 10). The measurement of learning: Future scenarios. Presentation at the convention of INVALSI (Italy) on the presentation of the national report on the results of the 2018 test program. Rome.
Shiner, R. L., & DeYoung, C. G. (2011). The structure of personality traits: A developmental perspective. University of Chicago: Economic Research Center.
Snow, R. E., & Jackson, D. N. (1997). Individual differences in conation: Selected constructs and measures. Los Angeles: University of California, Center for the Study of Evaluation.
Sorensen, T. W. H., & Eby, L. T. (2006). Locus of control at work: A meta-analysis. Journal of Organizational Behavior, 27(8), 1057–1087. https://doi.org/10.1002/job.416.
Sternberg, R. J., & Lubart, T. I. (1996). Investing in creativity. American Psychologist, 51(7), 677–688. https://doi.org/10.1037/0003-066X.51.7.677.
Weinert, F. E. (2001). Concept of competence: A conceptual clarification. In D. S. Rychen & L. H. Salganik (Eds.), Defining and selecting key competencies (pp. 45–65). Seattle, Toronto, Bern, Göttingen: Hogrefe & Huber Publishers.
Zins, J. E., Weissberg, R. P., Wang, M. C., & Walberg, H. J. (Eds.). (2004). Building academic success on social and emotional learning: What does the research say? New York: Teachers College Press.

Chapter 3

Evidence from Psychological Studies

3.1 Introduction

As we have seen in the previous chapter, research on the propagation of social and emotional skills in education has gradually come to include theories and models from developmental and personality psychology. As evident in contributions by the OECD (Chernyshenko, Kankaraš, & Drasgow, 2018), Kyllonen, Lipnevich, Burrus, and Roberts (2014) and Abrahams et al. (2019), the so-called Big Five taxonomy of traits and facets is now frequently proposed as a basis for the measurement and assessment of social and emotional skills in education. In this chapter we make an excursion into the domain of psychological studies on various aspects of the Big Five taxonomy. Our focus is on fundamental issues that were already encountered in the previous chapter. Conceptually, a clear definition and demarcation of more general factors (traits) and more specific facets of these traits appears to be a remaining challenge, on which we expect to learn from psychological studies. As far as empirical evidence is concerned, we are particularly interested in results about the malleability of traits and facets of traits. In exploring this evidence, we will go beyond educational treatments and interventions, and look at studies of therapeutic (clinical and non-clinical) and counseling interventions.

We start with a short description of the Big Five concept and the traits and facets of traits that are included in the different assessment instruments that currently exist, as well as the predictive value of personality traits for future life outcomes. Subsequently, we address the stability of personality traits and facets, their consistency across different contexts, the genetic basis and potential changeability of personality, and evidence on the effectiveness of (sub)clinical and counseling interventions aimed at changing personality. In the final section we will take stock and address implications for the educational applications that are the central theme of this book: the furthering of socio-emotional skills in education by means of specific interventions.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_3


3.2 The Big Five Personality Concept and the Five Main Traits

Personality traits reflect people’s characteristic, rather enduring, consistent and automatic patterns of thoughts, feelings and behaviors that distinguish people and that are afforded in specific environments (Roberts, 2009). Activities such as using speech, walking, etc. that virtually all people do, and that thus do not differ between people, are not considered traits (Diener & Lucas, 2000). Also, the concept of ‘enduring’ in the definition is important. Roberts and Hill (2017) distinguish traits from states in the sense that states are simply a matter of aggregation, time, and pattern: Traits represent many aggregations of states that show continuity over long periods of time and across relevant situations, whereas states represent thoughts, feelings and behaviours captured in the moment and by default, in the situation (p. 1). Another key element in the definition is automaticity; “personality traits are patterns of thoughts, feelings and behaviours that have become so ingrained that they are automatically deployed in new situations, and thus the day-to-day manifestation of traits occurs seamlessly and non-consciously” (Roberts & Hill, 2017, p. 2). Both the concepts of endurance and automaticity are relevant for the discussion on stability and changeability of personality traits in this chapter.

Generally, personality psychologists agree that personality can best be described with the Big Five model of traits, which stems from the psycholexical approach to personality. The assumption of this approach is that all individual differences in behavior that are of social relevance will eventually become encoded into daily language, as people will want to talk about them (Goldberg, 1981).
Until the beginning of the 1930s a huge number of traits had been generated by psychologists to study individual differences, which led to the need to reduce this number and to discover the basic traits that describe most of the differences between people. This was the start of the search of the dictionary for descriptors of personality by Allport and Odbert (1936), followed by statistical methods (factor analysis) to reduce the diversity of words to a small number of dimensions. This approach was followed by other researchers as well, who each developed their own sets of personality descriptors and test items, also written in other languages, which were applied among thousands of people (mainly adults) in many different countries. Based on factor analysis, five factors emerged rather consistently, the union of which was called the Five Factor Model or the Big Five (Goldberg, 1990; McCrae & John, 1992). This model is currently the most widely used working hypothesis of personality trait structure (McCrae & Costa, 1997), which seems to incorporate most phenotypic personality attributes (Goldberg, 1999).

For four of the Big Five bipolar dimensions there is a rather high level of agreement among researchers. These dimensions are Extraversion, Agreeableness, Conscientiousness and Emotional Stability (sometimes also called Neuroticism). There is less agreement about the label of the fifth dimension. It has been referred to as Culture, Intellect, Openness to Experience, or Autonomy, depending on the personality test that is used. Hendriks, Kuyper, Lubbers, and Van der Werf (2011) describe the first four traits, based on the Five-Factor Personality Inventory (FFPI, Hendriks, Hofstee, & De Raad, 2011), as follows: “Extraversion refers to social expressiveness and activity level. Extraverted individuals, scoring high on Extraversion, seek other people’s company and are talkative and active, whereas their opposites – introverted people – scoring low on Extraversion, prefer to be left alone. Agreeable people are mild, peace-loving, and cooperative, whereas their opposites – disagreeable people – are bossy, competitive, and quarrelsome. Conscientiousness refers to how people perform tasks. People high on Conscientiousness are organized, dependable, and precise, whereas their opposites score low on Conscientiousness and are chaotic, careless, and procrastinating. Emotional stability describes a person’s level of emotional reactivity. Emotionally stable individuals are calm, even-tempered, and readily overcome setbacks, whereas their opposites – emotionally unstable individuals – get overwhelmed by emotions easily” (p. 220). Lastly, the fifth factor in the FFPI, Autonomy, refers to an individual’s intellectual approach to life, a trait of which the core meaning is independent thought and decision making (analyzing problems, forming own opinions). The fifth factor in other personality assessment instruments, Openness, refers to the tendency to appreciate new art, ideas, values, feelings and behavior.

Today, although some traits are variously defined, and some researchers have defined fewer (three or four) or sometimes more (six) than five traits, most psychologists have adopted a view of personality in which traits are a core feature and are what standard personality scales measure. Individuals receive a score for each trait on a continuum with a positive (+2) and a negative (−2) pole, with the midpoint of the continuum at zero. This midpoint is based on the average score of a reference population, a sample of people of comparable age and culture to the individuals whose traits are at stake.
So, talking about individual differences in personality is actually evaluating to which degree a particular trait of an individual is below or above the average of the population he/she is part of. Generally, it is found that most individuals score around the midpoint of the scale, and only low percentages of people score on the extremes. This issue is important to take into account in the discussion about attention for personality traits (SEL) in the curriculum, because it raises the question which students should be the target group of SEL interventions, when most of them score in the range between ‘normal’ plus or minus 1 (SD), and what should be the standard, the average score or above? Moreover, the issue is important because it implies that it is not very informative to look at particular traits to describe individual differences in people’s personality. Instead, personality psychologists evaluate one’s personality by including a person’s score on all five traits in a personality profile that describes better his/her ‘typical thoughts, feelings and behavior’ compared to just looking at deviations in particular traits (Asendorpf, 2010). Also, one should take into account that the Big five personality traits and individuals’ personality profile based on these traits do not capture fully people’s characteristic patterns of thought, feelings and behaviors. Although usually being ‘normal’ regarding the broad traits, individuals might differ regarding the more specific lower-level units of personality, which usually are called facets. Another issue is that the personality system includes more dispositions than only the big five traits, like needs and motives, interests, attitudes and beliefs, and self-concept and feelings of well-being. The main difference between the five traits


and these dispositions is that the former are more characteristic of how people ‘are’, while the latter are mostly described as dispositions that guide people’s particular instrumental actions (Asendorpf, 2010).
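The idea of scoring a trait relative to a reference population can be illustrated with a small sketch. It assumes, purely for illustration, that raw scale scores are converted to z-scores and that the trait is approximately normally distributed; the reference scores and the function name `standardize` are hypothetical and do not come from any of the instruments discussed here.

```python
from statistics import NormalDist, mean, pstdev


def standardize(score, reference_scores):
    """Express a raw trait score as a z-score relative to a reference sample."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    return (score - mu) / sigma


# Hypothetical raw scores of a reference sample (illustrative values only).
reference = [2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 3.0, 2.5, 3.5, 3.0]
z = standardize(4.0, reference)

# Share of a normally distributed population scoring within 1 SD of the mean.
share_within_1sd = NormalDist().cdf(1.0) - NormalDist().cdf(-1.0)

print(f"z-score of a raw score of 4.0: {z:.2f}")
print(f"share within +/-1 SD: {share_within_1sd:.3f}")
```

Under the normality assumption roughly two thirds of a population fall within one standard deviation of the mean, which is why the question of who should be targeted by an intervention is not trivial.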

3.3 Facets of Traits

Although personality psychologists generally agree about the value of the Big Five traits as a way of describing individual differences in personality, there is no widely accepted list of facets assumed to underlie these traits. As a consequence, different personality assessment instruments include different numbers of facets, and the content of the items underlying the facets also differs. For example, there are 45 bipolar dimensions in the AB5C model of Hofstee, De Raad and Goldberg (FFPI, 1992), 30 bipolar dimensions in the Five-Factor Model of Costa and McCrae (NEO-PI-R; 1992), 10 facets in the Big Five Aspects (BFA) of DeYoung, Quilty, and Peterson (2013), 15 facets in the BFI-2 (Soto & John, 2017), and 24 facets in the HEXACO-PI-R (Ashton, Lee, & De Vries, 2014). Table 3.1 shows the facets in each of the personality models, which are hierarchical models with the exception of the AB5C model, which is a circumplex model. In the table, the facets that overlap in at least two cells are printed in bold italics. The overlap is very small, making clear that the agreement among personality psychologists about the broad traits is hardly reflected at the lower facet level. Because the facets, just like the broad traits, are based on first- and second-order factors resulting from psychometric analyses, this in turn implies that the specific items in the personality assessment instruments may also differ greatly in content, item format, level of abstraction, contextualisation, and focus (on thinking, feeling or behavior). The disagreement about underlying facets might also be the reason why studies on the predictive value of personality for future life outcomes, such as success in education and the labour market, health, etc., mostly pertain to the broad traits rather than to more specific facets.
However, studies that included facets rather than broad traits as predictors have produced some interesting results. For example, Chamorro-Premuzic and Furnham (2003) found that academic performance was related to dutifulness (a facet of Conscientiousness), anxiety (Neuroticism), and activity (Extraversion). Moutafi, Furnham, and Crump (2006) showed that actions and ideas (Openness) were positively related to fluid intelligence, while order, self-discipline and deliberation (Conscientiousness) were negatively related to this outcome. Graham and Lachman (2014) found that for every trait that was related to cognitive performance, at least one facet was also related to the same domain. On the other hand, in some cases a facet was related to cognitive performance while the corresponding trait was not. These results show that, although there is currently considerable consensus on the predictive value of particular traits for particular life outcomes, the issue is much more complicated, because a more detailed understanding of the contributing lower-order personality attributes is lacking. Moreover, also at the level of traits, there is still a


Table 3.1 Facets in the personality models NEO-PI-R, BFA, BFI-2, HEXACO-PI-R and FFPI

Broad trait: Neuroticism/(negative) emotionality/emotional stability
  NEO-PI-R: Anxiety, Depression, Hostility, Self-consciousness, Impulsiveness, Vulnerability to stress
  BFA: Withdrawal, Volatility
  BFI-2: Anxiety, Depression, Emotional volatility
  HEXACO-PI-R: Anxiety, Fearfulness, Dependence, Sentimentality
  FFPI: Stability, Happiness, Calmness, Moderation, Toughness, Impulse control, Imperturbability, Cool-headedness, Tranquility

Broad trait: Extraversion
  NEO-PI-R: Gregariousness, Assertiveness, Activity, Warmth, Excitement seeking, Positive emotions
  BFA: Assertiveness, Enthusiasm
  BFI-2: Assertiveness, Sociability, Energy level
  HEXACO-PI-R: Sociability, Social self-esteem, Social boldness, Liveliness
  FFPI: Gregariousness, Assertiveness, Sociability, Friendliness, Poise, Provocativeness, Leadership, Self-disclosure, Talkativeness

Broad trait: Openness
  NEO-PI-R: Fantasy, Aesthetics, Feelings, Actions, Ideas, Values
  BFA: Openness/Creativity, Intellect
  BFI-2: Intellectual curiosity, Aesthetic sensitivity, Creative imagination
  HEXACO-PI-R: Aesthetic appreciation, Inquisitiveness, Creativity, Unconventionality
  FFPI: (no facets listed)

Broad trait: Agreeableness
  NEO-PI-R: Trust, Straightforwardness, Compliance, Modesty, Altruism, Tendermindedness
  BFA: Compassion, Politeness
  BFI-2: Compassion, Respectfulness, Trust
  HEXACO-PI-R: Forgiveness, Gentleness, Flexibility, Patience
  FFPI: Understanding, Warmth, Morality, Pleasantness, Empathy, Cooperation, Sympathy, Tenderness, Nurturance

Broad trait: Conscientiousness
  NEO-PI-R: Order, Dutifulness, Self-discipline, Competence, Achievement striving, Deliberation
  BFA: Orderliness, Industriousness
  BFI-2: Order, Dutifulness, Self-discipline
  HEXACO-PI-R: Organization, Diligence, Perfectionism, Prudence
  FFPI: Orderliness, Organization, Conscientiousness, Efficiency, Dutifulness, Purposefulness, Cautiousness, Rationality, Perfectionism

Broad trait: Honesty-Humility (only HEXACO)
  HEXACO-PI-R: Sincerity, Fairness, Greed avoidance, Modesty

Broad trait: Autonomy (FFPI)
  FFPI: Intellect, Ingenuity, Reflection, Competence, Quickness, Introspection, Creativity, Imagination, Depth

Adapted from Anglim and O’Connor (2019), supplemented with the facets of the FFPI, based on Goldberg (1999)


lack of clear evidence about the predictive value of personality. Different studies have found different relations, which partly depend on the specific life outcome measure that was used. For example, the trait conscientiousness was found to be positively related to indicators of academic attainment but negatively related to scores on achievement tests; similar results were found for agreeableness, which positively predicts teamwork but negatively predicts wages. More recent research even shows that for some traits the relations are not linear, in the sense that persons who score around the trait average are the most successful on the outcome indicator. Differences in the relations between certain traits and particular life outcomes also seem to depend on the time gap between the measurements of personality and of the life outcome variables (the bigger the time gap, the weaker the relation) and on whether the life outcomes are, like personality, measured by self-reports or by more independent indicators (e.g. perceived health versus use of medication, perceived success at work versus salary). For an overview, see Pałczyńska and Swist (2018). Given the shortcomings described above, we must conclude that predictions of specific life outcomes do not yet provide a firm knowledge base for determining which traits and/or specific facets of traits should be the objectives of educational interventions in order to prepare students better for their future. Having established this, a second issue is the feasibility of changing specific traits and trait facets. This issue will be addressed by looking at the stability and the heritability of traits and facets of traits in the subsequent sections.

3.4 Stability of Personality Traits

In the literature on the stability of personality traits, the two most common indicators of stability are rank-order stability and mean-level stability. Rank-order stability refers to the relative position of a person in a sample over time, while mean-level stability refers to the degree of change in the mean scores of a sample of persons (Barelds, 2016). Evidence from different meta-analyses (e.g. Roberts & DelVecchio, 2000; Ferguson, 2010) shows a rather high rank-order stability, with rank-order correlations varying between 0.40 and 0.60 over a period of 10 years. Already at age 3 personality is a good predictor of personality at age 26 (Caspi et al., 2003), and in adulthood the stability is even higher, with values of around 0.60. Teenagers have an average rank-order stability of around 0.50, and this stability increases, with a peak for persons between 50 and 60 years old (Roberts & DelVecchio, 2000). Specht, Egloff, and Schmukle (2011) found that the stability of the traits emotional stability, extraversion, agreeableness and openness to experience peaked between ages 40 and 60 and decreased thereafter, but that the stability of conscientiousness continued to increase. A critical issue, however, is the lack of longitudinal research that covers a long period starting in early childhood, rather than limited developmental periods, and that uses the same measurement instrument and the same informants (parents, self-reports) throughout. This implies that it is not yet clear whether observed


changes are due to true changes in personality over time, changes in the measurement instrument or changes in the informants (Herzhoff, Kushner, & Tackett, 2017). As regards mean-level stability, results from cross-sectional studies show that personality traits continue to change in adulthood and that these changes may be quite substantial, in particular for the traits agreeableness, conscientiousness and emotional stability (Srivastava, John, Gosling, & Potter, 2003). Meta-analyses of longitudinal studies report significant mean-level changes in all traits, with the greatest increase for extraversion during young adulthood, and a small increase in conscientiousness during adolescence, after which it increased throughout young adulthood (Roberts, Walton, & Viechtbauer, 2006). Some recent longitudinal studies from different countries (Johnson, Hicks, McGue, & Iacono, 2007; Josefsson et al., 2013; Lüdtke, Roberts, Trautwein, & Nagy, 2011; Vecchione, Alessandri, Barbaranelli, & Gerbino, 2010) support the assumption of the maturity principle (Roberts & Nickel, 2017) that people in general increase in agreeableness, conscientiousness and emotional stability as they grow older. Apart from the studies on rank-order and mean-level stability described above, some additional studies have been conducted on individual differences in personality trait change, which is a third index of stability. Many of these studies have also related individual differences in trait change in adolescence and young adulthood to specific life events, such as relationship factors (Lehnart, Neyer, & Eccles, 2010), stressful life events (Jeronimus, Riese, Sanderman, & Ormel, 2014), and work experiences (Le, Donnellan, & Conger, 2013).
Some studies were also conducted among older people; these found a relation between changes in perceived social support and changes in conscientiousness (Hill, Payne, Roberts, & Stine-Morrow, 2014), and found that being more socially engaged in old age was related to changes in conscientiousness and agreeableness (Lodi-Smith & Roberts, 2012). Based on the findings of these studies, Roberts and Hill (2017) conclude that there are individual differences in the direction and degree of personality trait change and that these differences are linked to experiential factors. At the end of this section, we want to discuss another index of stability which might be relevant for the discussion on the changeability of personality traits: the stability of traits across different contexts, also called contextual trait consistency. One of the first studies to report variability in specific Big Five traits across life contexts was conducted by Sheldon, Ryan, Rawsthorne, and Ilardi (1997). These life contexts were student, employee, child, friend, and romantic partner. Sheldon et al. found the following significant within-trait cross-context differences: extraversion was highest in the friend role, emotional stability lowest in the student role, agreeableness and conscientiousness highest in the worker role, and openness highest in the partner role. Each trait, however, had substantial inter-context correlations, suggesting that personality traits are partly variable and partly consistent across contexts. Later studies, for example by Fleeson (2001), Wood and Roberts (2006), and Heller et al. (2007), supported these findings. Building upon the results of these studies, Robinson (2009) also explored, in a British sample of university students (mean age 27), the relative cross-context stability of the Big Five traits, in order to determine whether some traits would be


more malleable than others. In general, the results of this study confirmed the earlier findings that traits are partly consistent and partly variable across contexts, which is in accordance with Fleeson’s view of traits as distributions of behaviour around a central tendency (Fleeson, 2004). However, the study also showed that the traits differ in the extent to which they are contextually invariant, with conscientiousness the most consistent across contexts (r = 0.55) and extraversion the least consistent (r = 0.31). Based on these findings, Robinson suggests that extraversion is contextually more malleable than the other Big Five traits.
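The difference between rank-order stability and mean-level stability, as defined at the start of this section, can be made concrete with a minimal sketch. The two-wave data below are hypothetical, and the estimators (a Spearman-type rank correlation and a mean change expressed in wave-1 standard deviations) are one simple choice made here for illustration; the studies cited may have used other estimators.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Rank values from 1 (lowest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def rank_order_stability(wave1, wave2):
    """Correlation between persons' rank positions at two measurement waves."""
    return pearson_r(ranks(wave1), ranks(wave2))


def mean_level_change(wave1, wave2):
    """Change in the sample mean, in units of the wave-1 standard deviation."""
    return (mean(wave2) - mean(wave1)) / stdev(wave1)


# Hypothetical trait scores of five persons at two waves: everyone gains one point.
wave1 = [1.0, 2.0, 3.0, 4.0, 5.0]
wave2 = [2.0, 3.0, 4.0, 5.0, 6.0]

print(f"rank-order stability: {rank_order_stability(wave1, wave2):.2f}")
print(f"mean-level change (SD units): {mean_level_change(wave1, wave2):.2f}")
```

In this constructed example every person keeps the same rank while the sample mean shifts, which shows that the two indices answer different questions and that high rank-order stability does not rule out mean-level change.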

3.5 The Genetic Basis of Personality Traits

The previous sections made clear that personality traits are rather stable across time and context, although there are some small differences in the degree of stability and in the age at which they stabilize. This suggests that personality traits are fundamental attributes with a clear genetic basis. Empirical evidence from twin, family and adoption studies shows that, averaged across traits, personality is about 50% genetic (Bouchard & McGue, 1990; Iacono & McGue, 2002). More recent studies (Briley & Tucker-Drob, 2014; Vukasovic & Bratko, 2015) estimate the heritability component at between 30 and 50%. These estimates imply that around 50% of personality is also attributable to the environment, which can be divided into a shared environment and a non-shared environment. Strikingly, the shared environment (family, parenting style) has almost no influence on personality; it is mainly the non-shared environment (own experiences, own friends, etc.) that influences the development of personality (Bouchard & Loehlin, 2001; Krueger & Johnson, 2008). Although there is hardly any discussion among psychologists about the hereditary component of personality traits, there is considerable controversy about the implications. Some state that it is useless to put effort into changing personality traits that are so strongly tied to biology (e.g. Bailey, Duncan, Odgers, & Wu, 2017; Whitehurst, 2016). Others (e.g. Roberts & Jackson, 2008) say that the fact that personality traits have a partly genetic basis does not mean that traits cannot be changed. Moreover, Roberts and Hill (2017) even call it an advantage that shared environments hardly influence the development of personality and that it is mainly the non-shared environment that contributes to it, because this implies that traits are not immune to environmental input.
On the contrary, they state that traits are both consistent and changeable, which may in the end be seen as a desirable combination for changing personality in educational, clinical and occupational settings. This statement might be true for clinical and occupational settings, because these settings are generally rather personalized (non-shared environment), but it is highly questionable for educational settings, in which students are supposed to receive the same curriculum and teaching approaches (shared environment).
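The split of trait variance into a genetic, a shared-environment and a non-shared-environment component can be illustrated with Falconer's classic approximation from twin correlations. This is a swapped-in textbook method used here only as a sketch; the studies cited in this section typically rely on more elaborate model fitting, and the twin correlations below are hypothetical values chosen to mirror the pattern described in the text.

```python
def falconer_decomposition(r_mz, r_dz):
    """Approximate variance components from twin correlations.

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    Returns (heritability, shared environment, non-shared environment).
    """
    h2 = 2 * (r_mz - r_dz)   # genetic component
    c2 = 2 * r_dz - r_mz     # shared environment (family, parenting style)
    e2 = 1 - r_mz            # non-shared environment plus measurement error
    return h2, c2, e2


# Hypothetical twin correlations (illustrative values only).
h2, c2, e2 = falconer_decomposition(r_mz=0.50, r_dz=0.25)
print(f"heritability: {h2:.2f}, shared: {c2:.2f}, non-shared: {e2:.2f}")
```

With these illustrative inputs the shared-environment component is zero and the remaining variance is split between genes and the non-shared environment, the pattern that the argument about educational settings builds on.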


3.6 Evidence on How to Change Personality

Until now, empirical evidence that personality traits can indeed be changed by interventions has been very scarce. In personality psychology, research into techniques for changing personality traits has hardly been conducted. Most research in that field tries to explain differences in mean-level stability or individual differences in trait change, and focuses on natural life experiences, such as military service, the transition from university to adult life, marriage, divorce, etc. The effects of such life events on traits are in general relatively modest and depend on the type of event and the trait (Bleidorn, Hopwood, & Lucas, 2016). Another problem is that it is unclear whether trait changes occur as a reaction to, or an interpretation of, such life events, or whether traits even predict the experience of such events (Allemand & Fluckinger, 2017). There is also a lack of evidence on the effects of interventions intentionally aimed at changing personality among people who are willing and motivated to change and who have no psychological or social problems, apart from some examples that, according to Allemand and Fluckinger (2017), point in the direction “that people are able to successfully attain desired personality changes” (p. 12). This is a rather optimistic view, given the evidence from the studies that are mentioned as “notable examples”. We will discuss these studies in the section below. The third line of research that might shed some light on the possibility of changing personality traits consists of clinical studies among people with mental health problems. Although clinical interventions are mostly not primarily aimed at changing personality, because they usually target specific problems or disorders, there are numerous studies in which personality was measured in order to establish the effects of different therapeutic interventions on clinical outcomes.
In the next section we will discuss the results of these studies.

3.7 Evidence for Personality Trait Change by Means of Therapeutic and Counselling Interventions

Until now there has been hardly any evidence that clinical interventions can directly change personality traits, because most interventions are not specifically aimed at changing traits but only at specific or broad mental health problems. An exception is the recent review by Munro and Coulson (2016), who discuss the results of personality trait change interventions in four categories: psychotherapy (counselling and behaviour therapy), pharmacotherapy, medication-assisted therapy (a combination of counselling and pharmacological methods), and digital interventions. In their review only 9 studies met the quality criteria (randomisation; sufficient number of participants; sufficient length of study). Six of the 9 studies were aimed at outpatients, persons with mild to moderate psychiatric disorders, and only 3 studies included healthy groups. The only consistent finding across the studies was a decrease in the


personality trait neuroticism (reversed emotional stability), with effect sizes varying between 0.38 and 1.27 for the short-term effects. However, these effects were only found in the studies with outpatients. Moreover, no long-term effects were measured. Despite this shortcoming, and despite the small number of studies and the rather small samples included in the review, the authors conclude that personality traits can be changed by interventions and that the evidence in their review would be clinically useful for inducing personality trait change. So far we have described the effects of clinical interventions that were directly targeted at personality trait change. Given the small number of such interventions and the weaknesses of the review described above, we must conclude that the evidence that such interventions can indeed change personality is very shaky. Next, we will describe the evidence from studies in which trait changes were studied as side effects of interventions with a particular focus. In this type of intervention study, researchers assessed the outcomes of interventions as comprehensively as possible by (also) including personality measures in the design. The number of such studies is rather large. As early as 1980, Smith, Glass and Miller conducted a meta-analysis of such studies and found that therapy can change personality traits in addition to its primary outcomes. Some more recent studies, e.g. by DeFruyt, Van Leeuwen, Bagby, Rolland, and Rouillon (2006) and Tang et al. (2009), agree in showing that psychotherapy and counselling interventions, sometimes in combination with medication, can change personality traits.
Next to these clinical interventions, there are some subclinical or other types of interventions (such as mindfulness training, skills training, meditation, or even cognitive training) for people without psychological disorders, which showed positive side effects in terms of trait changes in addition to the intended outcomes. Recently, Roberts et al. (2017) conducted a meta-analysis of all these types of studies. Building further upon the meta-analysis of Smith et al. (1980), they addressed the questions to what degree traits can be changed in a relatively short period of time (12–15 weeks) and whether the trait changes endure over time. Next to clinical studies, non-clinical studies (non-clinical interventions, non-clinical groups, or a combination of both) were also included. In total, 207 studies were included in the meta-analysis, with 20,000 participants in total and a mean age of 36 years (range 19–73). The average duration of the interventions was 24 weeks. Of the 207 studies, 35 were truly experimental (with random assignment to intervention versus waiting list), 19 studies were non-clinical, and 77 included a follow-up measurement. The results showed an average pre-post effect size of 0.37, indicating that personality traits tended to change by around one third of a standard deviation. The intervention effect across the full data set was an increase of 0.43 SD compared to the control groups. As regards the true experiments (clinical as well as non-clinical), the treatment effect was 0.43, with a small difference between the clinical groups (0.45) and the non-clinical groups (0.36). With respect to the different traits, the changes were overall the largest for emotional stability (0.59), followed by extraversion (0.23) and conscientiousness (0.19), and the lowest for agreeableness (0.19) and openness (0.13).
Compared to the controls, the effects of the interventions were 0.69 for emotional stability, 0.38 for extraversion, 0.26 for openness, 0.23


for agreeableness and 0.06 for conscientiousness. Further, the type of intervention hardly made a difference, and the pre-post changes hardly differed from the changes between pre-test and follow-up. Based on these results the authors conclude that interventions can cause change in traits over the short run and that the effects do not fade away. However, the authors do not show any data on the follow-up effects. Another critical issue that could not be resolved in the study is that therapy might just bring people back to the baseline level they had before their psychological problems began. Data on the participants’ baseline level in that period were, however, not available, so it was not possible to check this suggestion. A counterargument that the authors mention is that non-clinical studies show effects as well, which are as large as the effects in the clinical studies. However, while this might be true across all traits together, the argument does not hold for the trait for which the largest effects were found, emotional stability. Looking at the pre-post change values for that trait, the difference in change between the clinical and non-clinical groups was 0.23, while it was only 0.05 across all traits together. This is a remarkable difference that should be taken into account, because it indicates that the effects of interventions for clinical groups might be very different from those for non-clinical groups, and that one should be careful in using such results to promote a focus on trait change by interventions, not only in clinical settings but also in other fields such as economics, politics, health and education (Roberts et al., 2017, p. 16). Finally, one has to keep in mind that the interventions included in the meta-analysis did not have the purpose of changing personality traits, but were aimed at a particular problem.
The effects on trait changes that were found were only a side effect of the interventions, so in the end we still do not know whether purposefully changing personality by means of interventions is possible.
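The two kinds of effect size reported in this section, a pre-post change and a change relative to a control group, can be sketched as follows. The estimators are simple standardized mean differences chosen here for illustration and are not necessarily those used by Roberts et al. (2017); all scores are hypothetical.

```python
from statistics import mean, stdev


def pre_post_effect(pre, post):
    """Standardized pre-post change: mean gain divided by the pre-test SD."""
    return (mean(post) - mean(pre)) / stdev(pre)


def controlled_effect(pre_t, post_t, pre_c, post_c):
    """Pre-post change in the treatment group minus that in the control group."""
    return pre_post_effect(pre_t, post_t) - pre_post_effect(pre_c, post_c)


# Hypothetical trait scores (illustrative values only).
treat_pre = [1.0, 2.0, 3.0, 4.0, 5.0]
treat_post = [2.0, 3.0, 4.0, 5.0, 6.0]   # everyone gains one point
ctrl_pre = [1.0, 2.0, 3.0, 4.0, 5.0]
ctrl_post = [1.0, 2.0, 3.0, 4.0, 5.0]    # no change without intervention

d_pre_post = pre_post_effect(treat_pre, treat_post)
d_controlled = controlled_effect(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(f"pre-post effect: {d_pre_post:.2f}")
print(f"effect relative to control: {d_controlled:.2f}")
```

A pre-post effect alone cannot separate the intervention from change that would have occurred anyway, which is why the number of true experiments with a control group matters for interpreting the meta-analytic results.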

3.8 Stability and Changeability of Facets of Traits

Based on the findings described above, we may conclude that the evidence for the assumption that personality traits are not stable but changeable is not very convincing. Moreover, the scarce evidence that traits might be changed by interventions mainly pertains to clinical subjects who received individual therapy. Hardly anything is known about how to change traits by non-clinical interventions or about which specific traits such interventions should target. This raises serious doubts about whether it makes sense to include the development of such traits as targets of education. A better alternative might be to focus on particular facets of traits rather than on broad traits, because such facets may be less stable over time and situations, and thus more changeable. Unfortunately, hardly any empirical data are available on facet stability over time. This is remarkable, because personality psychologists generally agree that each facet within the same domain captures unique personality information and that this unique information predicts a variety of important behaviours and life outcomes beyond the level of the Big Five domains themselves, such as academic achievement,


alcohol consumption and abuse, delinquent behaviour, life satisfaction, and many other behaviours (Soto & Gosling, 2011). Anglim and O’Connor (2019) even state that, compared to the broad Big Five, narrow traits (trait facets) generally offer enhanced predictive validity. As reported by Soto, John, Gosling, and Potter (2011), until recently only a few cross-sectional studies had been conducted on mean-level stability at the level of facets. The available evidence suggests that within at least some trait domains, different facets show different trends. For example, the meta-analysis of Roberts et al. (2006) distinguished two facets of Extraversion that showed different age trends: social dominance (assertiveness and self-confidence) increased from the college years through early adulthood, while the level of social vitality (gregariousness, positive affect and energy level) remained flat. Terraciano et al. (2006), in a study among middle-aged and older adults, found that within most Big Five domains, different facets among the 30 facets of the NEO-PI-R showed different trends. For example, within the Agreeableness domain, altruism showed a positive age trend, whereas modesty did not (Soto et al., 2011, p. 332). In their own study among about 2.3 million children, adolescents and adults (age 10–65), Soto et al. found that for some domains the facets showed very small age trends, but for others, particularly Extraversion, Conscientiousness and Neuroticism, the age trends were quite substantial. Unfortunately, longitudinal studies at facet level are still lacking, so it is not possible to check whether such trends also occur for rank-order stability at facet level. Also, only very few studies have been conducted on the changeability of trait facets by means of interventions.
The studies that are available focus, in contrast to the clinical intervention studies described above, on coaching interventions in which participants themselves chose how many and which facets they wanted to change. In a study by Martin, Oades, and Caputi (2014), the average targeted facet score was used in the comparison between the intervention and the control group. After a 10-week coaching period the average score in the intervention group was significantly higher than in the control group, and this effect remained significant at the follow-up three months later. Allan, Leeson, and Martin (2014) found that most participants aimed to change facets within the traits neuroticism and conscientiousness. However, the study did not answer the questions whether and which personality facets changed as a result of the intervention, and whether changes in the targeted facets were due to targeting those facets or to general intervention effects. Building further on the study of Martin et al. (2014), Allan et al. (2014) studied the effects of the coaching intervention in more detail. They addressed the question whether the trait domains with the highest number of facets targeted by participants would change as a result of the intervention (10 weeks), and whether the targeted facets themselves would change. The study was conducted among 54 adults (aged 18–64), mainly women, who were matched for gender and age and then randomly assigned to the coaching group or the control group. The NEO-PI-R (Costa & McCrae, 1992) was used to measure the personality traits and underlying facets. The results showed a medium effect of the intervention for the trait neuroticism and small effects for conscientiousness and extraversion. The effects for neuroticism and extraversion


were maintained at week 22 (the follow-up test); the effect for conscientiousness was not. As regards the changes at facet level, the effects were small to medium (effect sizes varying between 0.18 and 0.28) for the neuroticism facets (anxiety, hostility, vulnerability, depression, impulsiveness and self-consciousness), and these effects were maintained at week 22. The effects for all of the extraversion and conscientiousness facets were small, with effect sizes varying between 0.00 and 0.12. Most of these results were not maintained at week 22. Furthermore, as regards the effects of targeting, the facets that were targeted by the participants showed larger changes than the facets that were not targeted. Although this effect was significant, the effect size was small (0.09). Based on the significant findings, Allan et al. (2014) conclude that people who are motivated to change are able to change their personality and that they can do this in a short period of time. However, this conclusion is not warranted, because the effect sizes are simply too small to be convincing. Next to the study discussed above, one other study was found in the literature in which the effects of an intervention on facet-level changes were examined (Kloster, 2016). This study focussed on the facets self-discipline and orderliness of the personality trait conscientiousness, which were measured with the NEO-IPIP (Johnson, 2014). The participants were young, college-aged students who received a group intervention focussing on the changeability of the brain (incremental mindset) and on research showing which conscientiousness facets change over time. The group sessions were supported with hand-outs and mentoring emails. The results showed that the intervention had no effects at all.
Altogether, given the scarcity of studies on the effects of interventions at facet level as well as the lack of longitudinal results at this level, hardly any firm conclusion on the changeability of personality facets is possible. If anything, a recent study by Mõttus et al. (2019) suggests that changeability is most unlikely. This study, a replication and meta-analysis of their earlier study from 2017, used data from more than 6000 individuals in 6 countries and found that at the level below facets, the item level, almost all items of the NEO-PI-R showed significant rank-order stability over an average period of 12 years. Moreover, the majority of items also demonstrated cross-rater agreement and heritability. The authors conclude that such single test items, which they call nuances, extend the personality trait hierarchy below facets and may serve as the basic units of personality traits. This is an interesting conclusion, but at the same time it is bad news for the SEL movement, because it implies that even at the lowest level of personality there is hardly any room for change by means of interventions.

3.9 Summary and Conclusions

In the literature review above we have made clear that the assumptions regarding the Big Five traits and facets as a framework for inclusion of social-emotional skills in school curricula and in international assessments are not as well supported by empirical evidence as suggested by the proponents of these ideas (e.g. OECD, CASEL). Firstly, the broad Big Five traits are more stable and less changeable by interventions than is reported in the publications of these proponents. The evidence for changeability of traits stems from only one meta-analysis of the effects of therapeutic interventions, which were not purposefully aimed at changing personality traits, and in which only 35 of the 207 studies were true experimental studies, most of them clinical. Secondly, at the lower level of personality, i.e. trait facets, hardly any empirical evidence is available about stability and changeability. The scarce evidence based on intervention studies showed that the effects varied from very small to zero. Moreover, even at the level below facets, i.e. item level, which can be considered the basic units of personality, the rank-order stability of item scores over time appears to be so high that changeability of personality at the higher levels (facet and trait level) seems most unlikely. Thirdly, although research suggests that personality is 50% genetic and thus 50% influenced by the environment, the claim that personality can therefore be changed by education doesn’t hold. Genetic research has shown convincingly that it is the non-shared environment that influences personality development and not the shared environment. So, when even non-shared environments, like individual therapeutic interventions among individuals who voluntarily want to change, are not able to produce substantial personality changes, how then could it be expected that educational interventions, in which students are organized in a shared environment and are subjected to an obligatory school program, can?
Developing students’ socio-emotional functioning based on the Big Five framework, if at all possible, requires at least an individual approach, as well as the willingness of students to change ‘the way they are’. In other words, it requires education as therapy. Another, more fundamental issue is what the standard should be for developing and evaluating the students’ level of SEL, given the fact that in the Big Five framework the score range for each trait and trait facet varies between a negative and a positive pole with a midpoint of zero. The traits of individuals are assessed by expressing their scores as deviations from the midpoint, which is based on a reference group of persons of comparable age and culture. This raises the question of what the standard, the ambition level, should be when one wants to promote SEL in education. The answer is that it is not possible to set a single standard, because each student requires a personal approach and goal, dependent on which traits are underdeveloped. But then the question is: who will decide that? The teacher, the students, the parents, the three parties together? These kinds of decisions are common practice in therapeutic settings, but difficult to implement in education. Moreover, evaluating the scores of individuals on separate traits and trait facets is not very informative, because it is precisely the combination of traits (the personality profile) that best describes how people ‘typically’ feel, think and behave. This implies that SEL development in education should take the student’s score profile as the starting point, which will make it even more complicated for teachers to implement SEL, to set the standard for each student, and to assess whether the standard has been met. Altogether, based on our review above, we seriously doubt whether the Big Five framework is an appropriate framework to organize SEL development in education.

We didn’t find sufficient evidence that it is possible to change students’ traits and facets of traits in education, or that the framework is suitable for international assessments. Some counterarguments against our objections are already being put forward by proponents of the SEL movement, for example by Murano et al. (2018, CASEL), who state that the skills within the Big Five factors are supported by decades of empirical evidence showing that these skills are changeable over the life span by well-designed interventions, and that SELS are different from traits in that they are context-dependent, include knowledge and attitudes, and can be behaviourally based. Moreover, they state that it is possible to teach students contextualized skills that are part of broader skills without wanting to change their personality. These statements of CASEL are curious in two respects. First, the claim that SELS are different from traits in that they are context-dependent contradicts the original idea of SEL that it is important to teach more transferable (read: context-independent) skills to students in order to prepare them better for their future life. Second, the use of the term ‘skills’ is itself puzzling. In all the personality literature that we reviewed for this chapter, the skill concept was not mentioned. We didn’t encounter the word in any of the descriptions of personality traits and trait facets, nor did we find items referring to a skill in the personality assessment instruments, with the exception of one or two items measuring the trait Openness to experience/Intellect. The OECD also uses the concept of social-emotional skills in its publications about the international assessment pilot that is currently going on. In these publications, however, personality traits are equated with socio-emotional skills without any further discussion, although every personality psychologist knows that there is, conceptually, a big difference between personality traits and skills.
Personality traits describe individuals’ ‘typical behaviour’, while skills are related to what individuals can perform (see also Hofstee, 2001; Ackerman, 2018). These different conceptualisations also imply different ways of assessment. Personality assessment instruments include only items asking what people generally do, and how they usually feel and think, and compare the scores of individuals with the scores of a reference norm (reflecting how most people act, think and feel). Skills assessment instruments measure individuals’ maximal performance, being purposeful behaviour, under standardized testing conditions and with standardized scoring of the performance, usually ranging from insufficient to very good. What we see here is what Hofstee (2001) calls the ‘skills approach to personality’, which has been tried before by personality psychologists without much success. This skills approach pretends to assess students’ skills to purposefully feel, think and act in a particular manner, for example the skill to be conscientious (trait level), or the skill to be organized or planful (facets of conscientiousness). The problem with the skills approach to personality is that it would only make sense to limit the skills definition to the positive poles of personality, because it would be odd to measure skills that are related to maladaptive feelings, thoughts and behaviors (e.g. the skill to be unorganized, the skill to be unfriendly, etc.). Another problem of the skills approach to personality is that people can have good skills to behave in a maximally appropriate manner (purposefully being very friendly), but at the same time be a very unkind person. The third problem with the skills approach is that skills cannot be measured with self-report measurements, but should be observed by independent observers on the basis of actual performance, in a standardized test situation, with clear standards for whether the performance is appropriate or not. Now, looking at the OECD conceptual framework (Chernyshenko, Kankaraš, & Drasgow, 2018) and the report on the pilot study (Kankaraš, Feron, & Renbarger, 2019), it is clear that the OECD has only partly chosen the ‘skills approach to personality’. They use the concept of ‘social-emotional skills’, and in their framework they have included only the positive poles of the Big Five traits and trait facets, although they name the traits differently than personality psychologists usually do. However, the trait facets that are included in the broader traits sometimes seem to refer to skills, sometimes apparently have nothing to do with skills, and sometimes are unclear in this respect. But the most surprising finding is that in the item list that was published in the report on the pilot study, there is not a single item that asks for a skill. The items are completely similar to the items in regular personality tests, asking for ‘typical behaviour’. The only difference with regular personality testing is that just the positive poles of the Big Five traits were chosen, and the scoring format has been changed to a Likert scale, ranging from 1 to 5, suggesting that the higher the score, the better the skill.
However, because the items actually measure students’ ‘typical behavior’ instead of skills, this way of scoring is completely wrong: it suggests that the higher the score, the more adaptive the ‘typical behavior’ is, while in fact individuals with the highest scores on the positive poles of personality traits usually are not the most adaptive in terms of their typical feelings, thinking and behaviour, and also not the ‘best persons’ in terms of their personality structure. Moreover, once again, high scores on personality traits and facets don’t say anything about the social-emotional skills that students can demonstrate in situations that require the use of such skills.
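
The contrast between the two scoring conventions discussed in this section can be made concrete with a small illustration. The following minimal Python sketch is an illustration only; it is not the scoring procedure of the OECD study or of any published personality inventory. The function names (`likert_mean`, `norm_deviation`), the item responses and the reference-group scores are all hypothetical. The sketch assumes that norm-referenced scoring expresses a raw score as a deviation from the reference-group mean in standard-deviation units, and sets this against a plain ‘higher is better’ reading of the mean of 1 to 5 Likert responses.

```python
from statistics import mean, stdev


def likert_mean(item_scores):
    """'Higher is better' reading: the plain mean of 1-5 Likert responses."""
    if not all(1 <= s <= 5 for s in item_scores):
        raise ValueError("Likert responses must lie between 1 and 5")
    return mean(item_scores)


def norm_deviation(raw_score, reference_scores):
    """Norm-referenced reading: deviation from the reference-group mean,
    expressed in standard-deviation units (a z-score)."""
    ref_mean = mean(reference_scores)
    ref_sd = stdev(reference_scores)  # sample standard deviation
    if ref_sd == 0:
        raise ValueError("reference group has no variance")
    return (raw_score - ref_mean) / ref_sd


# Hypothetical data: scale means of a reference group of comparable age
# and culture, and one student's responses to four items of one facet.
reference_group = [2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 3.0, 3.0]
student_items = [5, 4, 5, 5]

raw = likert_mean(student_items)          # 4.75 on the 1-5 scale
z = norm_deviation(raw, reference_group)  # position relative to the norm

print(f"Likert mean: {raw:.2f}")
print(f"Deviation from reference midpoint (SD units): {z:.2f}")
```

In this hypothetical example the same responses yield a Likert mean near the top of the scale and, relative to the invented reference group, a deviation of close to three standard deviations above the midpoint. Read as a skill score, the value looks excellent; read as a trait score, it marks an extreme position on the trait, which is the interpretive gap described above.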

References

Abrahams, L., Pancorbo, G., Primi, R., Santos, D., Kyllonen, P., John, O. P., et al. (2019). Social-emotional skill assessment in children and adolescents: Advances and challenges in personality, clinical, and educational contexts. Psychological Assessment, 31(4), 460–473. https://doi.org/10.1037/pas0000591.
Ackerman, P. L. (2018). The search for personality-intelligence relations: Methodological and conceptual issues. Journal of Intelligence, 6(2), 2–12. https://doi.org/10.3390/jintelligence6010002.
Allan, J. A., Leeson, P., & Martin, L. S. (2014). Who wants to change their personality and what do they want to change? International Coaching Psychology Review, 9(1), 8–21.
Allemand, M., & Fluckinger, C. (2017). Changing personality traits: Some considerations from psychotherapy process-outcome research for intervention efforts on intentional personality change. Journal of Psychotherapy Integration, 27(4), 476–494. https://doi.org/10.1037/int0000094.

Allport, G. W., & Odbert, H. S. (1936). Trait-names: A psycho-lexical study. Psychological Monographs, 47(1), 1–171. https://doi.org/10.1037/h0093360.
Anglim, J., & O’Connor, P. (2019). Measurement and research using the Big Five, HEXACO, and narrow traits: A primer for researchers and practitioners. Australian Journal of Psychology, 71(1), 16–25. https://doi.org/10.1111/ajpy.12202.
Asendorpf, J. B. (2010). Psychologie van de persoonlijkheid [Psychology of personality]. Bohn, Stafleu, van Loghum.
Ashton, M. C., Lee, K., & De Vries, R. E. (2014). The HEXACO honesty-humility, agreeableness, and emotionality factors: A review of research and theory. Personality and Social Psychology Review, 18(2), 139–152. https://doi.org/10.1177/1088868314523838.
Bailey, D., Duncan, G. J., Odgers, C., & Wu, W. (2017). Persistence and fadeout in the impacts of child and adolescence interventions. Journal of Research in Educational Effectiveness, 10(1), 7–39. https://doi.org/10.1080/19345747.2016.1232459.
Barelds, D. P. H. (2016). Persoonlijkheid: een introductie [Personality: an introduction]. In D. Barelds & P. Dijkstra (Eds.), Inleiding in de persoonlijkheidspsychologie [Introduction into personality psychology] (pp. 11–33). Amsterdam: Boom Publishers.
Bleidorn, W., Hopwood, C. J., & Lucas, R. E. (2016). Life events and personality trait change. Journal of Personality, 86(1), 83–96. https://doi.org/10.1111/jopy.12286.
Bouchard, T. J., & Loehlin, J. C. (2001). Genes, evolution, and personality. Behavior Genetics, 31(3), 243–273. https://doi.org/10.1023/A:1012294324713.
Bouchard, T. J., & McGue, M. (1990). Genetic and rearing environmental influences on adult personality: An analysis of adopted twins reared apart. Journal of Personality, 58(1), 263–292. https://doi.org/10.1111/j.1467-6494.1990.tb00916.x.
Briley, D. A., & Tucker-Drob, E. M. (2014). Genetic and environmental continuity in personality development: A meta-analysis. Psychological Bulletin, 140(5), 1303–1331. https://doi.org/10.1037/a0037091.
Caspi, A., Harrington, H., Milne, B., Amell, J. W., Theodore, R. F., & Moffitt, T. E. (2003). Children’s behavioral styles at age 3 are linked to their adult personality traits at age 26. Journal of Personality, 68(4), 712–722. https://doi.org/10.1111/1467-6494.7104001.
Chamorro-Premuzic, T., & Furnham, A. (2003). Personality traits and academic performance. European Journal of Personality, 17(3), 237–250. https://doi.org/10.1002/per.473.
Chernyshenko, O., Kankaraš, M., & Drasgow, F. (2018). Social and emotional skills for student success and well-being: Conceptual framework for the OECD study on social and emotional skills (OECD Education Working Papers No. 173). https://dx.doi.org/10.1787/db1d8e59-en.
Costa, P. T., & McCrae, R. R. (1992). Revised NEO personality inventory (NEO-PI-R) and NEO Five-Factor inventory (NEO-FFI). Professional manual. Odessa, FL: Psychological Assessment Resources.
DeFruyt, F., Van Leeuwen, K. G., Bagby, R. M., Rolland, J. P., & Rouillon, F. (2006). Assessing and interpreting personality change and continuity in patients for major depression. Psychological Assessment, 18(1), 71–80. https://doi.org/10.1037/1040-3590.18.1.71.
De Young, C. G., Quilty, L. C., & Peterson, J. B. (2013). Between facets and domains: 10 aspects of the Big Five. Journal of Personality and Social Psychology, 91(5), 33–58. https://doi.org/10.1037/0022-3514.93.5.880.
Diener, E., & Lucas, R. (2000). Subjective emotional well-being. In M. Lewis & J. M. Haviland-Jones (Eds.), Handbook of emotions (2nd ed., pp. 325–337). The Guilford Press.
Ferguson, C. J. (2010). A meta-analysis of normal and disordered personality across the life span. Journal of Personality and Social Psychology, 98(4), 659–667. https://doi.org/10.1037/a0018770.
Fleeson, W. (2001). Toward a structure- and process-integrated view of personality: Traits as density distribution of states. Journal of Personality and Social Psychology, 80(6), 1011–1027. https://doi.org/10.1037/0022-3514.80.6.1011.

Fleeson, W. (2004). Moving personality beyond the person-situation debate: The challenge and the opportunity of within-person variability. Current Directions in Psychological Science, 13(2), 83–87. https://doi.org/10.1111/j.0963-7214.2004.00280.x.
Goldberg, L. R. (1981). Language and individual differences: The search for universals in personality lexicons. In L. Wheeler (Ed.), Review of personality and social psychology (Vol. 2, pp. 141–165). Sage.
Goldberg, L. R. (1990). An alternative ‘description’ of personality: The BIG-Five factor structure. Journal of Personality and Social Psychology, 59(6), 1216–1229. https://doi.org/10.1037/0022-3514.59.6.1216.
Goldberg, L. R. (1999). A broad-bandwidth, public-domain, personality inventory measuring the lower-level facets of several Five-factor models. In I. Mervielde, I. J. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality Psychology in Europe (Vol. 7, pp. 7–28). Tilburg University Press.
Graham, E. K., & Lachman, M. E. (2014). Personality traits, facets and cognitive performance: Age differences in their relations. Personality and Individual Differences, 59, 89–95. https://doi.org/10.1016/j.paid.2013.11.011.
Heller, D., Watson, D., Komar, J. A., Min, J., & Perunovic, W. Q. F. (2007). Contextualized personality: Traditional and new assessment procedures. Journal of Personality, 75(6), 1229–1253. https://doi.org/10.1111/j.1467-6494.2007.00474.x.
Hendriks, A. A. J., Hofstee, W. K. B., & De Raad, B. (2011). Handleiding bij de Five-Factor Personality Inventory II (FFPI-II) [Manual for the Five-Factor Personality Inventory II (FFPI-II)]. Bohn, Stafleu, van Loghum.
Hendriks, A. A. J., Kuyper, H., Lubbers, M. J., & Van der Werf, M. P. C. (2011). Personality as a moderator of context effects on academic achievement. Journal of School Psychology, 49(2), 217–248. https://doi.org/10.1016/j.jsp.2010.12.001.
Herzhoff, K., Kushner, S. C., & Tackett, J. L. (2017). Personality development in childhood. In J. Specht (Ed.), Personality development across the lifespan (pp. 9–23). Elsevier Academic Press. https://doi.org/10.1016/B978-0-12-804674-6.00002-8.
Hill, P. L., Payne, B. R., Roberts, B. W., & Stine-Morrow, E. A. L. (2014). Perceived social support predicts increased conscientiousness during older adulthood. Journal of Gerontology: Psychological Sciences, 69(4), 543–547. https://doi.org/10.1093/geronb/gbt024.
Hofstee, W. K. B. (2001). Intelligence and personality: Do they mix? In J. M. Collis & S. Messick (Eds.), Intelligence and personality (pp. 43–60). LEA Publishers.
Hofstee, W. K., de Raad, B., & Goldberg, L. R. (1992). Integration of the Big Five and circumplex approaches to trait structure. Journal of Personality and Social Psychology, 63(1), 146–163. https://doi.org/10.1037/0022-3514.63.1.146.
Iacono, W. G., & McGue, M. (2002). Minnesota twin family study. Twin Research and Human Genetics, 5(5), 482–487. https://doi.org/10.1375/twin.5.5.482.
Jeronimus, B. F., Riese, H., Sanderman, R., & Ormel, J. (2014). Mutual reinforcement between neuroticism and life-experiences: A five-wave, 16-year study to test reciprocal causation. Journal of Personality and Social Psychology, 107(4), 751–764. https://doi.org/10.1037/a0037009.
Johnson, J. A. (2014). Measuring thirty facets of the Five Factor Model with a 120-item public domain inventory: Development of the IPIP-NEO-120. Journal of Research in Personality, 51, 78–89. https://doi.org/10.1016/j.jrp.2014.05.003.
Johnson, W., Hicks, B. M., McGue, M., & Iacono, W. G. (2007). Most of the girls are alright, but some aren’t: Personality trajectory groups from ages 14 to 24 and some associations with outcomes. Journal of Personality and Social Psychology, 93(2), 266–284. https://doi.org/10.1037/0022-3514.93.2.266.
Josefsson, K., Jokela, M., Cloninger, C. R., Hintsanen, M., Salo, J., Hintsa, T., et al. (2013). Maturity and change in personality: Developmental trends of temperament and character in adulthood. Development and Psychopathology, 25(3), 713–727. https://doi.org/10.1017/S0954579413000126.

Kankaraš, M., Feron, E., & Renbarger, R. (2019). Assessing students’ social and emotional skills through triangulation of assessment methods (OECD Education Working Papers No. 208). https://doi.org/10.1787/19939019.
Kloster, K. R. (2016). Facet-level personality development: An intervention for developing student self-discipline and orderliness. Unpublished Master thesis, St. Cloud State University.
Krueger, R. F., & Johnson, W. (2008). Behavioral genetics and personality. In O. P. John, R. W. Robins, & L. A. Pervin (Eds.), Handbook of personality: Theory and research (pp. 287–310). New York/London: The Guilford Press.
Kyllonen, P., Lipnevich, A. A., Burrus, J., & Roberts, R. D. (2014). Personality, motivation, and college readiness: A Prospectus for Assessment and Development. Research report. Educational Testing Service RR-14-06. ETS Research Report Series.
Le, K., Donnellan, M. B., & Conger, R. (2013). Personality at work: Workplace conditions, personality changes, and the corresponsive principle. Journal of Personality, 82(1), 44–56. https://doi.org/10.1111/jopy.12032.
Lehnart, J., Neyer, F. J., & Eccles, J. (2010). Long-term effects of social investment: The case of partnering in young adulthood. Journal of Personality, 78(2), 639–670. https://doi.org/10.1111/j.1467-6494.2010.00629.x.
Lodi-Smith, J., & Roberts, B. W. (2012). Concurrent and prospective relationships between social engagement and personality traits in older adulthood. Psychology and Aging, 27(3), 720–727. https://doi.org/10.1037/a0027044.
Lüdtke, O., Roberts, B. W., Trautwein, U., & Nagy, G. (2011). A random walk down university avenue: Life paths, life events, and personality trait change at the transition to university life. Journal of Personality and Social Psychology, 101(3), 620–637. https://doi.org/10.1037/a0023743.
McCrae, R. R., & John, O. P. (1992). An introduction to the five-factor model and its applications. Journal of Personality, 60(2), 175–215. https://doi.org/10.1111/lj.1467-6494.tb000970.x.
McCrae, R. R., & Costa, P. (1997). Personality trait structure as a human universal. American Psychologist, 52(5), 509–516. https://doi.org/10.1037//0003-066x.52.5.509.
Martin, L. S., Oades, L. G., & Caputi, P. (2014). Intentional personality change coaching: A randomised controlled trial of participant selected personality facet change using the Five-factor model of personality. International Coaching Psychology Review, 9(2), 196–209.
Mõttus, R., Sinick, J., Terracciano, A., Hřebíčková, M., Kandler, C., Ando, J., Mortensen, E. L., Colodro-Conde, L., & Jang, K. L. (2019). Personality characteristics below facets: A replication and meta-analysis of cross-rater agreement, rank-order stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology, 117(4), e35–e50. https://doi.org/10.1037/pspp0000202.supp (Supplemental).
Moutafi, J., Furnham, A., & Crump, J. (2006). What facets of openness and conscientiousness predict fluid intelligence score? Learning & Individual Differences, 16(1), 31–42. https://doi.org/10.1016/j.lindif.2005.06.003.
Munro, G. M., & Coulson, N. S. (2016). Personality change interventions: A systematic review of the evidence. University of Nottingham, School of Medicine. Retrieved from https://www.researchgate.net/publication/303838991.
Murano, D., Way, J., Anguiano-Carrasco, C., Walton, K. E., & Burrus, J. (2018). On the use of the Big Five Model as SEL assessment framework. Center for Social, Emotional, and Academic Learning, ACT, Inc. https://measuringsel.casel.org/use-big-five-model-sel-assessment-framework/.
Pałczyńska, M., & Swist, K. (2018). Personality, cognitive skills and life outcomes: Evidence from the Polish follow-up study to PIAAC. Large-scale Assessments in Education, 6(2), 2–23. https://doi.org/10.1186/s40536-018-0056-z.
Roberts, B. W. (2009). Back to the future: Personality and assessment and personality development. Journal of Research in Personality, 43(2), 137–145. https://doi.org/10.1016/j.jrp.2008.12.015.

Roberts, B. W., & DelVecchio, W. F. (2000). The rank-order consistency of personality traits from childhood to old age: A quantitative review of longitudinal studies. Psychological Bulletin, 126(1), 3–25. https://doi.org/10.1037/0033-2909.126.1.3.
Roberts, B. W., & Hill, P. L. (2017). Questions and answers about the policy relevance of personality traits. Applied Psychology. https://doi.org/10.31234/osf.io/8cf7h.
Roberts, B. W., & Jackson, J. J. (2008). Sociogenomic personality psychology. Journal of Personality, 76(6), 1523–1544. https://doi.org/10.1111/j.1467-6494.2008.00530.x.
Roberts, B. W., Luo, J., Briley, D. A., Chow, P. I., Su, R., & Hill, P. L. (2017). A systematic review of personality trait change through intervention. Psychological Bulletin, 143(2), 117–141. https://doi.org/10.1037/bul0000088.
Roberts, B. W., & Nickel, L. B. (2017). A critical evaluation of the Neo-Socioanalytic Model of personality. In J. Specht (Ed.), Personality development across the lifespan (pp. 157–177). Elsevier Academic Press. https://doi.org/10.1016/B978-0-12-804674-6.00011-9.
Roberts, B. W., Walton, K. E., & Viechtbauer, W. (2006). Patterns of mean-level change in personality traits across the life course: A meta-analysis of longitudinal studies. Psychological Bulletin, 132(1), 1–25. https://doi.org/10.1037/0033-2909.132.1.1.
Robinson, O. C. (2009). On the social malleability of traits. Variability and consistency in Big 5 trait expression across three interpersonal contexts. Journal of Individual Differences, 30(4), 201–208. https://doi.org/10.1027/1614-0001.30.4.201.
Sheldon, K. M., Ryan, R. M., Rawsthorne, L. J., & Ilardi, B. (1997). Trait self and true self: Cross-role variation in the Big-Five personality traits and its relations with psychological authenticity and subjective well-being. Journal of Personality and Social Psychology, 73(6), 1380–1393. https://doi.org/10.1037/0022-3514.73.6.1380.
Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.
Soto, C. J., & Gosling, S. D. (2011). Age differences in personality traits from 10 to 65: Big Five domains and facets in a large cross-sectional sample. Journal of Personality and Social Psychology, 100(2), 330–348. https://doi.org/10.1037/a0021717.
Soto, C. J., & John, O. P. (2017). The next Big Five Inventory (BFI-2): Developing and assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity and predictive power. Journal of Personality and Social Psychology, 113(1), 117. https://doi.org/10.1037/pspp0000096.
Soto, C. J., John, O. P., Gosling, S. D., & Potter, J. (2011). Age differences in personality traits from 10 to 65: Big Five domains and facets in a large cross-sectional sample. Journal of Personality and Social Psychology, 100(2), 330–348. https://doi.org/10.1037/a0021717.
Specht, J., Egloff, B., & Schmukle, S. C. (2011). Stability and change of personality across the life course: The impact of age and major life events on mean-level and rank-order stability of the Big Five. Journal of Personality and Social Psychology, 101(4), 862–882. https://doi.org/10.1037/a0024950.
Srivastava, S., John, O. P., Gosling, S. D., & Potter, J. (2003). Development of personality in early and middle adulthood: Set like plaster or persistent change? Journal of Personality and Social Psychology, 84(5), 1041–1053. https://doi.org/10.1037/0022-3514.84.5.1041.
Tang, T. Z., DeRubeis, R. J., Hollon, S. D., Amsterdam, J., Shelton, R., & Schalet, B. (2009). Personality change during depression treatment: A placebo-controlled trial. Archives of General Psychiatry, 66(12), 1322–1330. https://doi.org/10.1001/archgenpsychiatry.2009.166.
Terracciano, A., Costa, P. T., & McCrae, R. R. (2006). Personality plasticity after age 30. Personality and Social Psychology Bulletin, 32(8), 999–1009. https://doi.org/10.1177/0146167206288599.
Vecchione, M., Alessandri, G., Barbaranelli, C., & Gerbino, M. (2010). Stability and change of ego resiliency from late adolescence to young adulthood: A multiperspective study using the ER89-R Scale. Journal of Personality Assessment, 92(3), 212–221. https://doi.org/10.1080/00223891003670166.
Vukasovic, T., & Bratko, D. (2015). Heritability of personality: A meta-analysis of behavior genetic studies. Psychological Bulletin, 141(4), 769–785. https://doi.org/10.1037/bul000001.

Whitehurst, G. J. (2016). Hard thinking on soft skills. Evidence Speaks Reports, 1(14), 1–10. Brookings Economic Studies.
Wood, D., & Roberts, B. W. (2006). Cross-sectional and longitudinal tests of the Personality and Role Identity Structural Model (PRISM). Journal of Personality, 74(3), 780–810. https://doi.org/10.1111/j.1467-6494.2006.00392.x.

Chapter 4

Evidence From Educational Studies

4.1 Introduction

The central theme of this chapter is evidence on the malleability of social-emotional attributes by means of educational interventions. In the relevant research literature, social and emotional outcomes are considered the desired result of programs and interventions in schools dedicated to stimulating such learning. Learning is more than a direct behavioral response to treatments; it involves more enduring changes in social and emotional dispositions. In the preceding chapters we have seen that constructs used in psychology to measure individual differences on a broad range of personality characteristics are now being proposed as outcomes of social-emotional learning interventions. The nature of some of these constructs appears to clash with the very concept of malleability, as personality characteristics, in particular the broad Big Five traits, are seen as partly innate, stable characteristics. In the previous chapter we saw that the conceptualization and measurement of skills, or skill equivalents of traits, is central to this dilemma. Theoretically, the issue of malleability of personality attributes by means of educational interventions leads to two questions: what is the required level of specificity that makes modification of personality attributes feasible, and which types of educational interventions are sufficiently powerful to bring about the enduring change that we expect when we speak of learning? In this chapter we look at empirical evidence that directly speaks to the issue of malleability. We consider three types of studies: first, we look at studies from economists; secondly, we review meta-analyses based on evaluations of social-emotional learning programs; and thirdly, we refer to research in the tradition of educational effectiveness research.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_4

4.2 Contributions from Economists

Heckman and Kautz (2012) state that achievement tests do not adequately capture a category of skills that are valued in the labor market, in school, and in many other domains. This category of skills is subsumed under the heading of soft skills: personality traits, goals, motivations, and preferences. The larger message of their paper is “that soft skills predict success in life, that they causally produce that success, and that programs that enhance soft skills have an important place in an effective portfolio of public policies” (p. 1). Although their description of soft skills appears considerably broader, the authors focus on personality traits as the conceptual core. Choosing the well-known Big Five personality factors as their theoretical frame of reference, they say that conscientiousness, perseverance, sociability and curiosity matter. The authors touch upon the important issue of the innate and stable, versus malleable and learnable, nature of personality concepts. The term trait suggests permanence and stability, whereas skills are associated with something that can be learned. The authors say that the extent to which these personal attributes can change lies on a spectrum, but do not elaborate on the possible implications for the malleability of traits and skills. It is worthwhile to determine the exact place of social and emotional skills (SES) in a causal framework of schooling. The authors say that SES causally influence test scores, grades and life outcomes, but maintain that the reverse is not the case: cognitive outcomes do not influence personality traits (p. 35).¹ Next, they state that educational interventions can stimulate SES, and provide various examples of what they see as empirical proof of this.
Some of the references which Heckman and Kautz mention in support of their claim are based on evaluation studies of social-emotional learning programs and research syntheses based on such studies (e.g. the meta-analysis by Durlak, Weissberg, Dymnicki, Taylor, & Schellinger, 2011). We will discuss such studies in the next section. Further evidence is based on pre-school compensatory programs, such as the evaluation of the Perry School Project, and on several modelling studies in which social-economic outcomes are related to cognitive and non-cognitive educational outcomes. Other studies that the authors refer to are based on results regarding side-effects of more general educational improvement programs, like the STAR project, and the impact of schooling in general on social-emotional outcomes.

1 Cunha et al. (2010, p. 886) hypothesize that “Stocks of cognitive skills can promote the formation of noncognitive skills and vice versa”. However, their analysis based on the National Longitudinal Survey of Youth indicates that this hypothesis does not hold when measurement error is controlled for; in that case, there is no cross-productivity effect of cognitive skills on non-cognitive skills. OECD (2015, p. 39), by contrast, presents a model of “skills begetting skills” in which cross-over effects from cognitive attainment at phase t to non-cognitive outcomes at phase t + 1 are included.

4.2.1 Causal Modeling of Outcomes

In their study on “Estimating the technology of cognitive and non-cognitive skill formation”, Cunha, Heckman, and Schennach (2010) use measures of parental investment and children’s outcomes from the National Longitudinal Survey of Youth (NLSY) in the United States to “estimate the parameters that govern the substitutability between early and late investments in cognitive and non-cognitive skills”. The NLSY contains panel data on wages, schooling, and employment for a cohort of young persons, aged 14–22 at their first interview in 1979. This cohort has been followed ever since. The NLSY79 contains information on cognitive test scores as well as non-cognitive measures. They found “much less evidence of malleability and substitutability for cognitive skills in later stages of a child’s life cycle, while malleability for non-cognitive skills was about the same in two stages” (ibid., p. 928). Their conclusion is that their “estimates imply that successful adolescent remediation strategies for disadvantaged children should focus on fostering non-cognitive skills. Investments in the early years are important for the formation of adult cognitive skills”. The non-cognitive skills that were measured at various age ranges (3–4 up to 13–14) in this study, and which could be measured with a fair degree of reliability, were all indices of behavioral problems, like being headstrong, anxious, anti-social etc.

In another study which also makes use of data from the NLSY, Heckman, Stixrud, and Urzua (2006) apply various techniques to model outcomes on schooling, wages and employment as a function of cognitive and non-cognitive indicators. The overall conclusion is that “For many dimensions of behavior…. non-cognitive ability is as important, if not more important, than cognitive ability” (p. 477).2 The non-cognitive measures used in this study were the Rotter Locus of Control Scale (Rotter, 1966), which was administered in 1979, and the Rosenberg Self-Esteem Scale (Rosenberg, 1965), which was administered in 1980. The Rotter Scale measures Internal and External Locus of Control. The original version has 29 items; in the study in question the four-item abbreviated version was used. The Rosenberg Scale consists of 10 items, e.g. “I take a positive attitude toward myself” and “I wish I could have more respect for myself”. The Rosenberg scale was shown to measure one factor by Gray-Little, Williams, and Hancock (1979).

Contrary to Heckman et al.’s conclusion, Baumeister et al. (2003) concluded that “The modest correlations between self-esteem and school performance do not indicate that high self-esteem leads to good performance. Instead, high self-esteem is partly the result of good school performance. Efforts to boost the self-esteem of pupils have not been shown to improve academic performance and may sometimes be counterproductive. Job performance in adults is sometimes related to self-esteem, although the correlations vary widely, and the direction of causality has not been established. Occupational success may boost self-esteem rather than the reverse”.

2 More precisely “Although cognitive skills explain much more of the variance of (log) wages, their effects on (log) wages (as measured by skill gradients) are similar to the effects of the non-cognitive traits” (p. 478).

4.2.2 Long Term Effects of the Perry Preschool Program

The Perry Preschool Program, conducted in the 1960s, was an early childhood intervention, which provided preschool education to low-IQ, disadvantaged African-American children living in Michigan. Data were collected at age 3, the entry age, and through annual surveys until age 15, with additional follow-ups conducted at ages 19, 27, and 40 (Heckman, Moon, Pinto, Savelyev, & Yavitz, 2010). Heckman, Pinto, and Savelyev (2013) concentrate on the psychological skills changed by the Perry program and decompose the treatment effects on adult outcomes into components attributable to improvements in these skills. The psychological skills that were measured were based on the Pupil Behavior Inventory (PBI), which was seen as measuring “externalizing behavior” (7 items), and on the Ypsilanti Rating Scale, which addressed “academic motivation” (3 items). Both instruments used teacher ratings of pupils at ages 7–9. The possible answers in the PBI indicate whether a specific kind of behavior was manifested: (1) very frequently, (2) frequently, (3) sometimes, (4) infrequently, and (5) very infrequently. The 7 items of the PBI are based on the following behaviors:

– disrupts classroom procedures,
– swears or uses obscene words,
– steals,
– lying or cheating,
– influences others toward trouble making,
– aggression towards peers and
– teases or provokes students.

The program improved externalizing behaviors (aggressive, antisocial, and rule-breaking behaviors), which, in turn, improved several labor market outcomes and health behaviors and reduced criminal activities. The program also enhanced academic motivation, but the effect was primarily for girls. Results on the personality skills are summarized as follows: “Persistent changes in personality skills play a substantial role in producing the success of the Perry program. The reduction in externalizing behavior, which explains the bulk of the effects of the Perry program on criminal, labor market, and health behavior outcomes, is especially strong” (ibid., p. 2080). ‘Externalizing behavior’ is seen as a skill that is related to the Big Five factors Agreeableness and Conscientiousness, while academic motivation is interpreted as representing Openness to Experience.

4.2.3 Comments

In the study by Cunha et al. (2010) the authors base their recommendation to invest in non-cognitive development at an early age on a rather specific measure of these skills, namely externalizing behavior. The strong conclusions on the effect of personality traits and non-cognitive measures that were drawn by Heckman et al. (2006) do not seem proportional to the modest and not entirely uncontested nature of the non-cognitive measures that were used in these analyses. The same criticism, of stark generalizations from relatively narrow measures of behavioral tendencies to ‘non-cognitive development’ and personality traits, also applies to the conclusions that Heckman et al. (2010) and Heckman et al. (2013) draw about the long-term effects of the Perry Preschool Program.

Murray (2012) provides a critical review of the strong conclusions about personality changes that Heckman and colleagues draw based on the Perry Preschool Program and a similarly small-scale program, the Abecedarian Project. He notes that “in both cases the people who ran the program were also deeply involved in collecting and coding the evaluation data, and they were passionate advocates of early childhood intervention” (p. 1). He then draws attention to a less compromised and larger scale replication study, the Infant Health and Development Program (IHDP). This study had a randomly selected treatment group of 377 and a control group of 608. For each infant the intervention began upon discharge from the neonatal nursery and continued until the child reached 36 months of age. Follow-up results at 24 and 36 months after program completion were highly positive. However, “by the time the participants were age five, most of those results had disappeared. In the follow-up at age eighteen, the results for the treatment and control children showed no effect for any of the indicators, which covered intellectual ability, academic achievement, behavioral problems, and physical health” (ibid., p. 1). While referring to a perspective from Peter Rossi on the long-term effects of intervention programs in education, Murray concludes that “small-scale experimental efforts staffed by highly motivated people show effects. When they are subject to well-designed large-scale replications, those promising signs attenuate and often evaporate altogether” (p. 1).

4.3 Non-cognitive Outcomes in Meta-Analyses of (Quasi-) Experimental Educational Intervention Studies

4.3.1 The Meta-Analysis by Durlak et al.

In the meta-analysis by Durlak, Weissberg, Dymnicki, Taylor, and Schellinger (2011) the effect sizes of 213 interventions in the domain of social-emotional learning (SEL) were computed. Following Elias et al. (1997), these authors define SEL as “the process of acquiring core competencies to recognize and manage emotions, set and achieve positive goals, appreciate the perspectives of others, establish and maintain positive relationships, make responsible decisions, and handle interpersonal situations constructively” (ibid., p. 405). The interventions work through teaching and creating dedicated learning environments in regular schools. A considerable share of the programs in which these interventions occur appear to be prevention programs, with the direct aim to avoid behavioral problems at school and stimulate pro-social behavior. A second category of programs is associated with promotion of social emotional skills at large, without the emphasis on avoiding problematic behavior. The proximal goals of the programs usually form a pluralistic set, like “fostering the development of five interrelated sets of cognitive, affective, and behavioral competencies: self-awareness, self-management, social awareness, relationship skills, and responsible decision making” (Durlak et al., p. 406, cited from the Collaborative for Academic, Social, & Emotional Learning, 2005).

The dependent variables that were addressed in the meta-analysis were six different student outcomes: (a) social and emotional skills, (b) attitudes toward self and others, (c) positive social behaviors, (d) conduct problems, (e) emotional distress, and (f) academic performance. Not all outcomes were assessed in all studies that were sampled. As our prime interest pertains to the first category, social and emotional skills, it is noteworthy that a subset of 68 of the 213 outcomes addressed this category. Social-emotional skills were further defined as follows: “This category includes evaluations of different types of cognitive, affective, and social skills related to such areas as identifying emotions from social cues, goal setting, perspective taking, interpersonal problem solving, conflict resolution, and decision making” (ibid., p. 410).

Inclusion criteria for the meta-analysis were as follows: “Studies eligible for review were (a) written in English; (b) appeared in published or unpublished form by December 31, 2007; (c) emphasized the development of one or more SEL skills; (d) targeted students between the ages of 5 and 18 without any identified adjustment or learning problems; (e) included a control group; and (f) reported sufficient information so that effect sizes (ESs) could be calculated at post and, if follow-up data were collected, at least 6 months following the end of intervention.” (ibid., p. 409). Assessment of social-emotional skills “could be based on the reports from the student, a teacher, a parent, or an independent rater. However, all the outcomes in this category reflected skill acquisition or performance assessed in test situations or structured tasks e.g., interviews, role plays, or questionnaires” (ibid., p. 410).
The quality of outcome measures of the original studies was addressed as a moderator variable in the meta-analysis. “To assess how methodological features might influence outcomes, three variables were coded dichotomously (randomization to conditions, use of a reliable outcome measure, and use of a valid outcome measure; each as yes or no). An outcome measure’s reliability was considered acceptable if kappa or alpha statistics were ≥0.60, reliability calculated by product moment correlations was ≥0.70, and level of percentage agreement by raters was ≥0.80. A measure was considered valid if the authors cited data confirming the measure’s construct, concurrent, or predictive validity. Reliability and validity were coded dichotomously because exact psychometric data were not always available” (ibid., p. 410).

The results of the meta-analysis showed an average effect size in socio-emotional skills of 0.57, based on 68 of the 213 outcomes. In comparison to earlier meta-analyses on socio-emotional skills this was a relatively large effect size (comparable studies had effect sizes of about 0.40). The average effect size for academic performance was 0.27. The authors note that this latter effect size is comparable to results of meta-analyses that had only assessed the effects of cognitive instruction.

Despite the thoroughness of this meta-analysis, both substantively and methodologically, the reporting generalizes on some issues for which one might want more specific information. These issues are related to the complexity of the interventions and the plurality of outcome measures.

First of all, results are not specified for individual skills. The set of social-emotional skills is heterogeneous; we do not know how often a specific skill was measured, how general scores might have been computed over several skills, or even to what extent they are purely socio-emotional rather than cognitive. Secondly, the intervention programs are also very heterogeneous, often complex and holistic, and results are not separated for prevention- versus promotion-orientation of the program. Thirdly, the quality of the SEL measures as dependent variables is a concern. Although the reliability and the validity of the instruments are included as a moderator variable, it would have been more convincing if whether or not the instruments were standardized on a norm population, and whether or not the data depended on self-reports, had also been included as moderator variables. The impression of relative weakness of outcome measures in studies that have assessed SEL outcomes of SEL intervention studies will be checked by means of case studies of program evaluations, for instance that of the PATH study (Greenberg, Kusché, & Riggs, 2004), in a subsequent chapter. Fundamental problems with these measures will be discussed in Chap. 6. Fourth, a related issue is the question whether the outcome assessment is controlled by the program manager or an external party (cf. Cheung & Slavin, 2015). Fifth, a breakdown of effect sizes obtained from randomized trials versus quasi-experiments might also have been relevant to reveal more about the methodological quality of the underlying studies. Sixth and finally, whether or not the outcomes were controlled for pre-test information was not considered. On all these points, more in-depth study of the material, and perhaps even replication, would seem to be required.
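The dichotomous reliability coding quoted above from Durlak et al. (2011, p. 410) can be made concrete with a small sketch. The function name, its arguments, and the handling of missing statistics are illustrative assumptions and are not taken from the original coding procedure; in particular, the sketch assumes that a measure is judged on whichever statistics were reported, and that a measure without any reported statistic is coded as not reliable.

```python
from typing import Optional

# Thresholds as quoted from Durlak et al. (2011, p. 410).
KAPPA_OR_ALPHA_MIN = 0.60
CORRELATION_MIN = 0.70
PERCENT_AGREEMENT_MIN = 0.80


def is_reliable(
    kappa_or_alpha: Optional[float] = None,
    correlation: Optional[float] = None,
    percent_agreement: Optional[float] = None,
) -> bool:
    """Code an outcome measure as reliable (True) or not (False).

    Assumption: only the statistics that were actually reported are
    checked, and every reported statistic must reach its threshold.
    """
    checks = [
        (kappa_or_alpha, KAPPA_OR_ALPHA_MIN),
        (correlation, CORRELATION_MIN),
        (percent_agreement, PERCENT_AGREEMENT_MIN),
    ]
    reported = [(value, minimum) for value, minimum in checks if value is not None]
    if not reported:
        # Assumption: no psychometric data reported means "not reliable".
        return False
    return all(value >= minimum for value, minimum in reported)
```

Under these assumptions a scale with alpha = 0.61 and a scale with alpha = 0.95 receive the same code, which illustrates why more specific information on the outcome measures would be informative.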
A striking observation of the study was that the effect of SEL programs on cognitive achievement (post mean ES 0.27) was comparable to the results of 76 meta-analyses of strictly educational interventions (Hill, Bloom, Black, and Lipsey 2008; cited by Durlak et al., 2011, p. 416). Actually, Hill et al. report effect sizes of 0.23 for elementary school, 0.27 for middle school and 0.28 for high school. A final limitation of this particular meta-analysis was that only post-treatment effects, and no longer term follow-up effects, were considered. The same group of authors concentrated on follow-up effects in the meta-analysis by Taylor, Oberle, Durlak, and Weissberg (2017), which will be discussed further on.

4.3.2 The Meta-Analysis by Sklad et al.

Sklad, Diekstra, De Ritter, Ben, and Gravesteijn (2012) conducted a meta-analytic review of 75 studies that reported the effects of universal school-based social, emotional and/or behavioral programs (abbreviated to SEB-programs). The orientation of these programs was promoting development rather than prevention. “The core competencies taught in SEB programs…, consist of what is often described as emotional intelligence (Goleman, 1998). Emotional intelligence includes competencies that allow students to recognize and manage emotions, solve problems effectively, and establish positive relationships with others (Zins & Elias, 2006, cited in Sklad et al., 2012, p. 893)”.

The study applied the following list of inclusion criteria: “the study reported a program that taught at least one social-emotional skill; the intervention was school based (primary or secondary schools); the intervention had to be universal, i.e. aimed at the general school population and not only at ‘high risk’ or deprivileged children; study reported effect sizes, or sufficient information to calculate these; the study had to be published between 1995 and 2008, and the study used an experimental or quasi-experimental design with control/comparison group(s)” (ibid., p. 895).

The outcome criteria that were analyzed were socio-emotional skills and positive self-image. These were seen as direct outcomes. Next, a number of secondary effect criteria were used: anti-social behavior, pro-social behavior, substance abuse, mental health disorders and academic achievement. The social and emotional skills were referred to in terms of typical examples: “rating of social competence provided by teachers, evaluation of strategy constructiveness in conflict resolution task, assertiveness skills measure, refusal skills efficacy score, problem orientation assessment, or bully-victim scale score” (ibid., p. 897). Outcomes immediately after program completion were analyzed next to follow-up measures (from 7 months after program completion and onwards). Results on socio-emotional skills outcomes were computed from 55 of the 75 programs. Several methodological characteristics of the studies were registered, for example: any form of random assignment occurred in 56% of the studies; 60% of the outcome measures relied on self-reports only and an intervention manual was unavailable for 73% of the programs.
One quarter of the interventions was directed at change of the school climate or culture, which seems to imply that the direction of the program was undetermined, or could not be categorized, for 75% of the programs. The largest share of the programs lasted up to 1 year (65%).

The average effect size assessed immediately after program completion for social-emotional skills was 0.70 (Cohen’s d) and for follow-up measures it was 0.07. Effect sizes for academic achievement were 0.46 and 0.26 for immediate and follow-up outcomes respectively. Analysis of the interventions’ features revealed that programs of short duration (less than 1 year) showed a higher immediate effect on social skills and antisocial behavior than longer programs (p. 904).

The authors observe that “There is a wide variety of different interventions addressing social and emotional skills, each intervention with a unique composition…. each study also uses unique measurement of its effects” (p. 906). They also note that “The effects of general universal SEB (social emotional behavioral, JS) programs on any specific outcome may be smaller than those indicated by this analysis due to the fact that most reported programs have a particular focus and report effects of the program on the intended target outcomes (and sometimes on instrumental outcomes) but rarely on incidental outcomes not directly related to the main goal of the program.” (p. 907).

These two observations underline the same kind of limitations as were noted for the previously discussed meta-analysis by Durlak et al. (2011), namely that the heterogeneity of both interventions and outcome measures makes it hard to make causal inferences for the enhancement of specific social-emotional skills. The studies are program evaluations with overall outcome measures that are a different mix for each study. This only allows comparisons on an abstract entity labeled social-emotional skills. Similarly, on the intervention side there is a lot of vagueness as well. In this meta-analysis this is, among other features, underlined by the lack of an intervention manual in 75% of the cases. Also, in this meta-analytic study little attention is paid to the quality of the outcome measures. Between the lines one can read that the outcome measures were different in each study and constructed specifically for the program evaluation, and that no standardized instruments were used. In this study no quality criteria for the outcome measures were mentioned (neither as inclusion criteria, nor as potential moderators).

4.3.3 The Meta-Analysis by Wigelsworth et al.

The meta-analysis by Wigelsworth et al. (2016) was based on 89 studies that had assessed the effects of Social and Emotional Learning Programs. Following Denham (2005), their report defines social-emotional learning as the promotion of five core competencies: self-awareness; self-management; social awareness; relationship skills; and responsible decision-making. This orientation led to the choice of six SEL outcome variables: social-emotional competence, attitudes towards self, pro-social behavior, conduct problems, emotional distress and emotional competence. Academic attainment was included as a 7th outcome indicator. Interestingly, for each of the SEL outcome indicators the authors mention illustrative established scales and instruments, for example the ‘Student Self Efficacy Scale’. Still, the quality of these instruments in terms of validity, reliability and standardization is not discussed in the report, neither as an inclusion criterion nor as a descriptive study characteristic.

“Studies eligible for the meta-analysis were: (a) written in English; (b) appeared in published or unpublished form between 1 January 1995 and 1 January 2013; (c) detailed an intervention that included the development of one or more core SEL components as defined by Denham (2005); (d) delivered on school premises, during school hours; (e) delivered to students aged 4–18 years; (f) detailed an intervention that was universal (i.e. for all pupils, regardless of need); (g) included a control group; (h) reported sufficient information for effect sizes to be calculated for program effect” (p. 353).

Results indicated the following effect sizes (Hedges’ g) for the SEL outcomes and academic attainment:

– Social-emotional competence 0.53
– Attitudes towards self 0.17
– Pro-social behavior 0.33
– Conduct problems 0.28
– Emotional distress 0.19
– Emotional competence 0.27
– Academic achievement 0.27
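The effect sizes in these meta-analyses are standardized mean differences, reported as Cohen’s d by Sklad et al. and as Hedges’ g by Wigelsworth et al. As a rough sketch of how the two relate, the following computes d from group means and standard deviations and applies the common approximate small-sample correction to obtain g. The function names and the example numbers are hypothetical, and these textbook formulas are not necessarily the exact estimators used in the underlying studies.

```python
import math


def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def hedges_g(d, n_t, n_c):
    """Cohen's d multiplied by the approximate small-sample correction factor."""
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return d * correction


# Hypothetical example: intervention group versus control group on a skills scale.
d = cohens_d(mean_t=105.0, mean_c=100.0, sd_t=10.0, sd_c=10.0, n_t=50, n_c=50)
g = hedges_g(d, n_t=50, n_c=50)
print(f"d = {d:.3f}, g = {g:.3f}")  # prints: d = 0.500, g = 0.496
```

With 50 pupils per group the correction is below one percent, so d and g are nearly interchangeable at such sample sizes; the difference matters mainly for very small studies.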

The authors mention “diversity, with regard to both ‘methodological diversity’ (variability in study design), and ‘clinical heterogeneity’ (differences between participants, interventions and outcomes)” as the most challenging limitation of their study. They also comment that this experienced heterogeneity “is in no small part due to the expansive definition by which SEL programs are identified (Payton et al., 2008). This raises questions about the utility of such broad definitions within the academic arena as it currently precludes more precise investigations of specific issues” (pp. 356/366). The authors mention the desirability of further theoretical work in this area. An outcome that is not further questioned by the authors is the high effect size for the relatively broad variable “social-emotional competence” relative to other outcomes that are more specifically defined, together with the result that for this variable follow-up assessment was higher than first trial assessment. This would call for more in-depth analysis of the concept and the operationalizations that were used in the underlying studies. One wonders why the issue of social desirability is not being mentioned in this literature. Finally, a surprising result, which was also found in Durlak et al.’s meta-analysis, is the relatively high effect size for academic attainment (equal to what is a reasonable summary for programs directly dedicated to enhancing academic performance).

4.3.4 The Meta-Analysis by Korpershoek et al.

The meta-analysis by Korpershoek et al. (2016) examined which classroom management strategies and programs enhanced students’ academic, behavioral, social-emotional, and motivational outcomes in primary education. The analysis included 54 random and nonrandom controlled intervention studies published in the past decade (2003–2013). The main objective of the study was to conduct a meta-analysis of the effects of various classroom management strategies (CMS) and classroom management programs (CMP) aimed at improving students’ behavior and enhancing their academic performance in primary education. Following Evertson and Weinstein (2006), the authors define classroom management in terms of the actions teachers take to create a supportive environment for the academic and social-emotional learning of students. They propose the following classification of classroom management interventions, based on their primary focus:

– Teachers’ behavior-focused interventions: The focus of the intervention is on improving teachers’ classroom management (e.g., keeping order, introducing rules and procedures, disciplinary interventions) and thus on changing the teachers’ behavior.
– Teacher–student relationship–focused interventions: The focus of the intervention is on improving the interaction between teachers and students, that is, on developing caring, supportive relationships.
– Students’ behavior-focused interventions: The focus of the intervention is on improving student behavior, for example, via group contingencies or by improving self-control among all students.
– Students’ social-emotional development-focused interventions: The focus of the intervention is on improving students’ social-emotional development, such as enhancing their feelings of empathy for other children.

Studies were identified on the basis of a systematic search of the peer-reviewed classroom management literature published between 2003 and 2013 and application of inclusion criteria. The screening procedure ultimately resulted in a total of 47 eligible studies. The duration of the intervention was categorized into three groups: less than 13 weeks, between 13 weeks and 1 year, and longer than 1 year. Outcomes were categorized as academic, behavioral, social-emotional and motivational. “Social development, social skills, social competencies, emotional development, emotional skills, emotional competencies, emotion recognition, moral sensitivity, coping, emotion regulation, and empathy were coded as social-emotional outcomes. Academic motivation, school motivation, goal orientations, commitment to school, learning engagement, and enthusiasm were coded as motivational outcomes. All other student outcomes, which were considered relevant, such as self-confidence, self-efficacy, peer acceptation, and time-on-task were coded as ‘other’ outcomes” (p. 11).

The focus of most of the intervention studies was on changing the students’ (students’ behavior and/or students’ social-emotional development) and/or the teachers’ (i.e., their CMS) behavior through long-term interventions; the shortest intervention lasted 6 weeks and the longest 3 years. The effect sizes for classroom interventions, specified for type of outcome, were 0.17 (academic outcomes); 0.24 (behavior); 0.21 (social-emotional); 0.08 (motivation) and 0.26 for other outcomes.
The authors conclude that an interesting outcome was that focusing on the social-emotional development of students had an effect on social-emotional outcomes (e.g., empathy for other children’s feelings). Programs that addressed this component had a slightly higher effect size than programs that did not (p. 19). “Another interesting outcome was that interventions that focused on improving classroom management by teachers had a small effect on students’ academic outcomes. This study draws attention to the interrelatedness of different intervention strategies: most strategies had positive effects on all outcomes” (p. 28). The fact that multiple strategies were coupled with multiple outcomes in this study might be the basis of further delineation of which strategies in regular classroom education positively affect (which measures of) social-emotional learning, and how the improved skills should be interpreted (in terms of situation-specific behavior or more general states).

4.3.5 The Meta-Analysis by Taylor et al.

The meta-analysis by Taylor, Oberle, Durlak, and Weissberg (2017) might be considered as an update of the meta-analysis by Durlak et al. (2011), but with a specific emphasis on follow-up outcomes (collected at least 6 months after the intervention) and more reference to longer term effects of programs in terms of schooling outcomes and life indicators. A sample of 82 studies was selected. The reports had to describe a school-based universal SEL program for kindergarten to 12th-grade students that collected follow-up data from intervention and control groups 6 months or more post-intervention. Definitions of SEL and socio-emotional outcomes are equal to the ones that were used in the earlier meta-analysis. The methods that were used for the meta-analysis are also identical. A very interesting and strong methodological asset, also present in the first meta-analysis by Durlak et al. (2011), was the assessment of what one might refer to as program integrity, by checking the so-called four SAFE program features for SEL interventions:

– Sequenced: The program had a coordinated progression of activities or practices to build competencies;
– Active: Participatory elements such as role plays involved students in active learning of SEL competencies;
– Focused: There was a dedicated time or specific program element that was focused on developing SEL competencies; and
– Explicit: The program identified specific SEL competencies that it was trying to develop within the intervention.

Interestingly, only 9 of the 82 interventions did not meet these criteria.

The results at post-intervention showed the following effect sizes. For the 82 studies, measures of social and emotional assets at post showed significant positive impacts of the intervention, with participants having stronger SEL skills (ES = 0.17) and improved attitudes (ES = 0.17) compared with controls. Program participants also fared significantly better than controls at post on academic performance (ES = 0.22), emotional distress (ES = 0.12), and drug use (ES = 0.12). However, post-intervention mean ESs were not significant for either positive social behaviors (ES = 0.06) or conduct problems (ES = 0.07). The average follow-up effect size for SEL competencies was 0.23; the effect size for academic performance at follow-up was 0.33.
In the narrative review part of the study the authors list literature in which the long term positive impact of SEL on schooling, work and life indicators is underlined. Although the checks on program integrity and implementation rule out non-event evaluation, they do not help in resolving questions about the content and direct targeting of specific SEL competencies. The authors note that “the alternative predictors examined in this meta-analysis do not allow us to draw conclusions about what specific features make SEL interventions more or less effective”. As was noted with respect to the 2011 meta-analysis, considerable uncertainty remains about the quality of the SEL effect measures. Almost three quarters of the studies (i.e., 72.2%) relied on self-report measures to evaluate student outcomes. No specific attention was paid to the use of standardized instruments.

4.4 In-Between Balance: How Convincing Is the Evidence?

In the preceding sections we have taken a closer look at two sources of evidence that have played an important role in documenting supportive evidence about the malleability of social and emotional skills and their importance for educational and

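Table 4.1 Average effect sizes from 5 meta-analyses

| Meta-analysis | SEL, post program | Academic outcomes, post program | SEL, follow-up | Academic outcomes, follow-up |
| --- | --- | --- | --- | --- |
| Durlak | 0.57 | 0.27 | - | - |
| Sklad | 0.70 | 0.46 | 0.07 | 0.26 |
| Wigelsworth | 0.53 | 0.27 | - | - |
| Korpershoek | 0.21 | 0.17 | - | - |
| Taylor | 0.17 | 0.22 | 0.23 | 0.33 |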

life outcomes: the econometric studies by Heckman et al. and the contribution from meta-analyses about the effects of educational interventions on social-emotional skills. Although the OECD studies that reviewed the same sources (Kautz et al., 2014) are not devoid of mentioning limitations and incidental zero or negative program effects, the overriding note in the reviews is positive and encourages educational policy to strongly support the furthering of the development of social and emotional skills in education. Most explicitly activist are the conclusions from the EU review study by Cefai et al. (2018), who express the hope that all European countries will reform their curricula to promote social emotional learning. By far the most relevant evidence consists of the results of the five meta-analyses that were described in more detail: those by Durlak et al. (2011), Sklad et al. (2012), Wigelsworth et al. (2016), Korpershoek et al. (2016) and Taylor et al. (2017). Mean effect sizes are summarized in Table 4.1. Three meta-analyses show medium to large effect sizes for social-emotional skills at the post-intervention level. Effect sizes in the order of 0.20 fall slightly below the level of an “educationally meaningful” effect, for which the benchmark is sometimes put at ES = 0.25 (What Works Clearinghouse). The evidence on follow-up effects is scarce, and effect sizes would be expected to diminish (which is confirmed in the meta-analysis by Sklad et al., but not confirmed by the meta-analysis by Taylor et al.). When we compare the set of constructs that was used to define social and emotional outcomes in the meta-analyses to conceptual developments treated in the preceding chapters, we note that there is a certain emphasis on measures of pro-social behavior.
Next, other categories of social and emotional skills are in line with the five components of social-emotional learning defined by CASEL, which in turn go back to the work on emotional intelligence. The more recent development to define SEL outcomes in terms of the Big Five personality taxonomy is not yet reflected in the studies that formed the basis for the meta-analyses that were reviewed. How convincing are the outcomes? All the meta-analyses that were reviewed appear to be properly conducted and technically sound. Still, there is heterogeneity in outcomes, and the evidence on follow-up effects is not convincing. When discussing the meta-analytic studies in more detail we noted considerable ambiguity and vagueness in

4 Evidence From Educational Studies

the definition of the interventions, often many-facetted programs, as well as heterogeneity, lack of standardization and lack of quality documentation of the SEL outcome measures. We also noted frequent application of self-reports and sparse attention to the methodological vulnerability of these measures, lack of attention to controlling for pretest differences and to addressing implementer and researcher (in)dependence. These issues cause problems of comparability and of interpretation of the cause and effect associations that are central in the studies that make up the raw material of the meta-analyses. How certain can we be that SEL effects are really the results of social-emotional learning interventions and not the side effects of cognitive training aspects that are also present in multi-facetted programs? We will address fundamental questions like this, first of all by broadening the scope of empirical evidence and turning to studies in the field of educational effectiveness, which are partially of a non-experimental nature. Next, in a final discussion we will bring in some other sources of comparative evidence in a critical review of the evidence. And, finally, in the next chapter, we will look in more detail at a sample of individual studies drawn from the primary studies on which the meta-analyses were based.
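The comparison between the post-program SEL effect sizes in Table 4.1 and the ES = 0.25 benchmark mentioned in this section can be sketched in a few lines of Python. This is only an illustration: the effect sizes are those reported in Table 4.1 and the threshold is the What Works Clearinghouse benchmark cited in the text, while the variable and function names are assumptions introduced here.

```python
# Post-program SEL effect sizes as reported in Table 4.1.
SEL_POST_EFFECT_SIZES = {
    "Durlak": 0.57,
    "Sklad": 0.70,
    "Wigelsworth": 0.53,
    "Korpershoek": 0.21,
    "Taylor": 0.17,
}

# Benchmark for an "educationally meaningful" effect cited in the text.
BENCHMARK = 0.25


def split_by_benchmark(effect_sizes, benchmark=BENCHMARK):
    """Return two lists of study names: at or above the benchmark, and below it."""
    at_or_above = [name for name, es in effect_sizes.items() if es >= benchmark]
    below = [name for name, es in effect_sizes.items() if es < benchmark]
    return at_or_above, below


at_or_above, below = split_by_benchmark(SEL_POST_EFFECT_SIZES)
print("At or above benchmark:", at_or_above)
print("Below benchmark:", below)
```

Under this simple split, three of the five meta-analyses lie above the benchmark and two lie slightly below it, which matches the verbal summary given in this section.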

4.5 Non-cognitive Outcomes in Educational Effectiveness Research

4.5.1 Introduction

Educational effectiveness research is a multi-disciplinary research field with the aim to attribute differences in achieved educational outcomes to malleable conditions of educational policy, schooling and teaching. Studies can be categorized as addressing system, school and teaching effectiveness, where the distinction is determined by the aggregation level of the independent variables. System-level effectiveness research is relatively young, and has been strongly enhanced by the development of international comparative assessment studies, such as TIMSS and PISA. Currently educational effectiveness research is organized in three international networks: the International Congress for School Effectiveness and Improvement (ICSEI), the oldest of the three; the Society for Research on Educational Effectiveness (SREE), which was founded in 2005; and the Educational Effectiveness Special Interest Group in the European Association for Research on Learning and Instruction (EARLI), which had its first conference in 2009. ICSEI and EARLI are international with a strong European representation, while SREE is predominantly American. The journal School Effectiveness and School Improvement (since 1990) is associated with ICSEI, and the Journal of Research on Educational Effectiveness (since 2008) is associated with SREE. As far as disciplinary background is concerned, ICSEI has a mixed background including sociology, educational psychology and general educational science, EARLI is traditionally mainly associated with educational psychology, while SREE has a strong representation from

economics of education. It should be noted that the borders between these networks are fuzzy, and that, moreover, there is fluid participation with respect to larger disciplinary fields, such as psychology, sociology, pedagogics, economics, psychometrics and social science research methodology. For our purposes it is relevant to mark two major research paradigms, the first non-experimental and quasi-experimental, and the second trying to meet the “gold standard” of intervention studies designed as randomized trials. The older educational effectiveness research has predominantly been based on non-experimental studies, often cross-sectional, with the application of multi-level modeling to separate influences from malleable and contextual variables. Program evaluations of school improvement programs have mostly been conducted as quasi-experiments. Recently, non-experimental longitudinal studies, sometimes using growth curve analyses, are being used more frequently. Educational effectiveness studies in the “SREE tradition” are mostly randomized trials and quasi-experiments. In non-experimental educational effectiveness studies, two kinds of effect sizes are computed. The first is the intraclass correlation, which expresses the between-school or between-class variation in a particular educational outcome. The second is the degree to which between-unit (school, class) variance can be attributed to specific school organizational or teaching variables, such as, for example, the cooperation of teachers within a school, and the use of formative assessment as part of classroom management. Adjustments for previous achievement or other co-variables are made to arrive at “value-added” effects, as compared to gross, unadjusted school effects.
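As an illustration of the first kind of effect size, the sketch below estimates an intraclass correlation as the share of total variance that lies between units (schools or classes). It is a minimal sketch, assuming balanced groups and a one-way analysis-of-variance estimator; the studies discussed in this chapter use multi-level models, which this example does not reproduce. The function name and the example scores are hypothetical.

```python
from statistics import mean


def intraclass_correlation(groups):
    """One-way ANOVA estimate of the ICC for equally sized groups.

    ICC = between-group variance / (between-group + within-group variance).
    """
    k = len(groups)            # number of units, e.g. schools
    n = len(groups[0])         # number of students per unit
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 equally sized groups of size >= 2")

    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    group_means = [mean(g) for g in groups]

    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))

    # Negative variance estimates are truncated at zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0


# Hypothetical scores for three schools with three students each.
schools = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(round(intraclass_correlation(schools), 3))
```

In this made-up example the schools differ strongly in their mean scores, so most of the variance is between schools; when the school means coincide, the estimate drops to zero.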
Although experimental designs are preferred because of higher internal validity, and a stronger basis for causal inference, the variance partitioning methods in non-experimental educational effectiveness studies may provide a more fine-grained attribution to specific malleable variables. How informative effect sizes from experimental studies are depends on the specificity of treatments and on well-defined program theories. We came across this issue when discussing the results of meta-analyses of experimental evaluations of intervention studies focused on social emotional outcomes in a previous section, when we noted that these programs sometimes lacked specificity and were rather broad, multi-facetted interventions.

4.5.2 Nonexperimental Educational Effectiveness Studies and Non-cognitive Outcomes

Review studies by Reynolds et al. (2014) and Muijs et al. (2014) discuss the state of the art in general educational effectiveness research and teaching effectiveness research, respectively. The common understanding in both reviews is that non-cognitive outcomes have received relatively low emphasis. Next, results of the limited set of studies that did address non-cognitive outcomes indicate that the effect sizes found in these studies are generally lower in comparison to cognitive outcomes.

Reynolds et al. offer the following explanation. “Three main hypotheses have been generated to explain the relatively small effects on non-cognitive outcomes. First, these nontraditional outcomes may be given less emphasis in the curriculum. That is, societies create schools largely to teach specific cognitive skills, such as reading and mathematics. Second, the measurement of these non-cognitive outcomes is less precise than the measurement of achievement. Third, students’ out-of-school time is focused less on academics and more on the other, non-cognitive activities” (p. 206). With this last point they seem to indicate that non-cognitive outcomes are shaped by home and other contextual conditions rather than by the school. Muijs et al. (2014) note that there is a current rise in research studies addressing well-being, self-concept, motivation and engagement, “with a view towards uncovering teacher effects on these broader outcomes” (p. 242). They concur with the review by Reynolds et al. in noting that the effect sizes found in studies of non-cognitive outcomes are mostly modest, and smaller than those found in studies of cognitive outcomes. Next, the authors refer to evidence which shows that factors that tend to have a positive influence on cognitive outcomes also tend to be positively associated with non-cognitive outcomes. They say that “there is no evidence for the sometimes posited contradiction between effectiveness in cognitive and non-cognitive areas”. A final point they make is that “many case studies of non-cognitive outcomes suffer from a lack of consistency in defining the key constructs and in reliably and validly measuring these” (ibid.).
They also point at a more specific problem in measuring non-cognitive outcomes reliably and validly, namely that in attitudinal outcomes like academic self-concept, self-assessments are strongly influenced by the fact that pupils compare themselves to their immediate peers and that this “has the paradoxical effect that a stronger pupil in a high-performing classroom may have lower self-concept than a weaker pupil in a low-performing classroom” (p. 243). In methodological debates about assessment of social-emotional skills, by authors operating in the core of the soft skills movement, this issue is recognized and referred to as “reference bias” (Duckworth and Yeager, 2015). The review by Muijs et al. also addresses the question of which malleable teaching and school organizational variables are expected to impact non-cognitive outcomes, such as academic self-concept, well-being, and engagement. They mention a caring environment with clear boundaries, high expectations, effective behavior management, giving pupils responsibility, and contingent praise as teacher behaviors related to increased academic self-concept. More specific information on the mostly low effect sizes for non-cognitive outcomes in educational effectiveness research is provided in a study by Van Swynsberghen, VanLaar, De Fraine, and Van Damme (2017). They present the following references of effect sizes for school effects: “Konu, Lintonen, and Autio (2002) found a between-school variance of 1% on this outcome (school well-being) in Finland. Van Landeghem, Van Damme, Opdenakker, De Fraine, and Onghena (2002) studied effects of secondary schools on non-cognitive outcomes of students in Flanders and found a variance component of 4.1% for school well-being in the empty model. De Bilde (2013) investigated the effects of primary schools on autonomous and controlled motivation of students. She found a variance at

primary school level on controlled motivation of 2.9% and a primary school variance on autonomous motivation of 9.8%. Van Landeghem et al. (2002) investigated the raw effects of secondary schools in an empty cross-classified model with students in seventh and eighth grade classes in secondary schools. They found a raw effect of the secondary school of 3.6% on the social integration of students at the end of eighth grade. Opdenakker and Van Damme (2000) found a gross secondary school effect of 2.4% on students’ interest in the learning tasks at the end of seventh grade, when also taking the class level into account. Van Landeghem et al. (2002) found a gross effect of 2.1% of the secondary school on the academic self-concept of the students at the end of Grade 8” (ibid., p. 84). A closer look will now be taken at some individual educational effectiveness studies that addressed non-cognitive outcomes.

4.5.3 Opdenakker and Van Damme (2000)

Opdenakker and Van Damme studied effects on school well-being of students in Flemish lower secondary schools. They addressed the following components of school well-being: well-being at the school, social integration in the class, relationship with teachers, interest in learning tasks, motivation towards learning tasks, attitude to homework, attentiveness in the classroom and academic self-concept. The data collection was conducted by means of questionnaires that were based on established and validated questionnaires and had acceptable levels of internal consistency. The results showed that “a lot of variation in pupil achievement in Flanders is due to the school and the class: 43% of the total variance in mathematics achievement and 56% of the total variance in mother tongue at the end of the first grade (1A) in secondary education is due to the school and the class. A quite different result is obtained for the well-being indicators at the end of the first grade in secondary education: only 5–11% of the total variance is between schools and classes” (p. 147). The results showed that some school practices that were positively associated with cognitive achievement also tended to be positively associated with some of the well-being indicators. An orderly classroom climate was one of these effective practices. A focus on social and personality development appeared to have a negative effect on mathematics achievement, but a positive association with motivation to carry out learning tasks. The study showed a positive effect of teaching staff co-operation in relation to teaching methods and pupil counselling on both achievement and several well-being indicators.

4.5.4 Van Swynsberghen, VanLaar, De Fraine, and Van Damme (2017)

This study investigated non-cognitive outcomes with a focus on long-term school effects. The research project started in 2002 in Flanders, the Dutch-speaking region of Belgium, and followed approximately 6000 children from kindergarten (age 5–6) until Grade 7, the first grade of secondary school (age 12–13). When the students were 17 years old, in May 2014, they were asked to participate in a follow-up data collection, consisting of a mathematics test and a student questionnaire (p. 86). The following research questions were addressed: “What are the long-term effects of primary schools on non-cognitive outcomes of students at age 17?” “If long-term effects are found, which primary school factors play a role?” Amongst others, a group composition characteristic and the academic effectiveness of the school were considered. The authors studied the following non-cognitive outcomes: school well-being, specific types of motivation, as distinguished by Ryan and Deci (2000), Midgley et al. (2000) and Dweck and Leggett (1988), social integration, interest in the learning task, general self-concept and mathematical self-concept (Marsh, 1988). Van Swynsberghen et al. contrast this set of non-cognitive outcomes with other non-cognitive outcomes derived from the Big Five personality theory. They say that their selection presents more malleable concepts than the Big Five traits, which they see as rather stable through the life course and less easily influenced by the environment. They base this observation on results of studies in primary schools that have shown small positive effects on some of these factors. The results showed that the primary school had a significant, but small long-term effect on three non-cognitive outcomes of students at age 17: mastery goal orientation, social integration in the class, and mathematical self-concept.
The variance in non-cognitive outcomes of students situated between primary schools was 0.5–0.8%. On the seven other outcomes, no significant long-term effects of the primary school were found. The authors conclude that significant but only small long-term effects of the primary school were found (for 3 out of 10 variables).³ In their interpretation of these results the authors note that comparable studies of cognitive outcomes had shown larger effects. They point at some possible methodological weaknesses. The first they mention is reference bias, which means that many non-cognitive outcomes, such as school well-being and academic self-concept, are heavily influenced by the direct peer group at the time of data collection. A second source of bias might have been differences in the enthusiasm of students for completing questionnaires about these kinds of features. And thirdly, they mention the possible influence of social desirability. They conclude that the long-term effects of primary schools on non-cognitive outcomes of students are very small and that they would find it hard to draw well-grounded policy conclusions based on these limited findings.

3 Given the large N of the study (6000 pupils), significance of these tiny effects is not so meaningful.

4.5.5 Brunner, Keller, Wenger, Fischbach, and Lüdtke (2018)

This study analyzed between-school and between-student variance in 82 countries on a number of outcomes: student achievement, motivation, affect and learning strategies. Data from five waves of the Programme for International Student Assessment (PISA) were used for these analyses (p. 453). Examples of PISA scales that measured affective and motivational outcomes are: constructs that were related to students’ motivation and drive (e.g., perseverance, openness, interest, and instrumental motivation to learn), the beliefs they held about themselves as learners (e.g., self-efficacy, self-concept), as well as their affective experiences (e.g., enjoyment, anxiety) while learning in each domain (p. 460). Self-reports on students’ use of learning strategies covered three types of strategies: memorization, elaboration and control strategies. The results showed that, overall, schools differed considerably in cognitive achievement and also that the size of these between-school differences varied considerably across countries; the median intraclass correlation (ICC) was 0.40, while the ICC ranged across countries from 0.06 (Finland) to 0.61 (The Netherlands). Sociodemographic characteristics explained a substantial part of the variance in between-school differences and a smaller proportion of the variance in within-school differences in most countries. Compared with the achievement measures, the differences between schools in the scores on affect and motivation scales, overall and across countries, were much smaller. The median intraclass correlation was 0.03, while the values of the ICC varied from 0.02 (e.g., general openness) to 0.08 (mathematics self-efficacy) across countries. Regarding the results for the learning strategies, the authors reported that the pattern of results was quite similar to that obtained from the motivation and affect variables.
This international study strongly corroborates the findings from the educational effectiveness studies referred to above, in showing that schools matter more for cognitive than for non-cognitive outcomes. In summing up the state of affairs with educational effectiveness research as far as non-cognitive outcomes are concerned, the following points can be made: Firstly, educational effectiveness research has been highly focused on cognitive outcomes in basic subjects, like language, mathematics and sometimes science; a small minority of studies has used non-cognitive outcomes as dependent variables. Secondly, school and classroom effects on non-cognitive outcomes tend to be much smaller than effects on cognitive outcomes. Thirdly, the evidence appears to indicate that both instructional strategies (like clarity, structure and opportunity to learn) and characteristics of the personal and social learning environment (like relational classroom climate) are sometimes positively associated with non-cognitive outcomes. And fourthly, the choice of non-cognitive outcomes in educational effectiveness research has so far been relatively limited, favoring factors that are tied to the immediate school setting, such as school well-being, or variables that are instrumental to academic learning, like academic self-concept. These results heavily depend on the use of students’ self-reports.

4.5.6 Experimental and Quasi-Experimental Studies

As was indicated in the introduction on the field of educational effectiveness research, the SREE organization and its associated Journal of Research on Educational Effectiveness show a preference for experimental and quasi-experimental designs, as compared to the earlier strands of educational effectiveness research which were referred to in the previous section. In order to identify experimental or quasi-experimental studies that had addressed social-emotional learning programs and/or assessment of social-emotional outcomes, all volumes of the Journal of Research on Educational Effectiveness (2008–2019) were scrutinized for such studies. The fact that only seven such studies were identified as addressing social-emotional outcomes reinforces earlier conclusions that only a very small minority of educational effectiveness studies have addressed non-cognitive outcomes. An overview of these studies is provided in Table 4.2. Below, a brief summary description of each study is given.

Table 4.2 Summary description of 7 quasi-experimental SEL evaluation studies in the Journal of Research on Educational Effectiveness (2008–2019)

| Study | Disadvantaged area yes/no | Grade level | E.S. Cognitive outcomes | E.S. Social emotional outcomes | Intervention |
| --- | --- | --- | --- | --- | --- |
| Hwang and Cappella (2018) | No | Kindergarten to grade eight | d = −0.19 reading; zero effects on other cognitive outcomes | d = 0.20 for self-concept; zero effects for other s.e. outcomes | Grade retention |
| Polikoff et al. (2018) | Urban district | Fourth grade | 0.48 sd content knowledge | 0.42 sd student interest; other s.e. outcomes 0.25/0.30 sd | Teaching math and science content |
| Gandhi et al. (2018) | High poverty districts | Elementary and/or middle grades | ELA and Mathematics 0.30 and 0.24 | Zero effects on retention, attendance rates and suspensions | Social Emotional Learning |
| O’Connor et al. (2014) | Low income urban | Kindergarten, first grade | Math 0.31, reading 0.55 | Sustained attention 0.39; reduction in behavioral problems 0.54 | INSIGHTS SEL |
| Kraft and Dougherty (2013) | No | Sixth and ninth grade | - | Student participation & behavior results in expected direction (no effect sizes reported) | Increased teacher communication |
| Gottfredson et al. (2010) | Disadvantaged | Middle school | No effects on academic performance | No effects on social emotional outcomes, like school bonding, social competence | After school program |
| Snyder et al. (2010) | Disadvantaged | Grades 4 and 5 | Math 0.50, Language 0.58 | Improved attainment 0.63–0.96 (e.g. absenteeism) | POSITIVE ACTION, SEL |

4.5.7 Hwang and Cappella (2018)

Hwang and Cappella conducted a study about the effects of grade retention on long-term academic and psychosocial outcomes. The study was based on a secondary analysis of the Early Childhood Longitudinal Study, Kindergarten Class 1998–1999, a (US) nationally representative sample following a longitudinal cohort of 21,260 kindergarteners until the eighth grade. The treatment was defined as one instance of grade retention in either the first or the second grade (i.e., during the 1999–2000 or 2000–2001 academic year). The dependent variables of the study included direct assessment of reading and math achievement, student self-report of reading and math competence, and teacher-report of reading competence. Psychosocial outcomes included student self-report on social self-concept, internalizing behaviors, self-esteem, and locus of control. The treatment effect estimations showed a negative effect on reading (d = −0.19) and no detectable effects in mathematics or other academic outcomes. For psychosocial outcomes, there was a statistically significant positive coefficient for social self-concept (d = 0.20) indicating that retention led students to have slightly higher self-perceptions of their social acceptance and peer connectedness than if they were not retained. No detectable effects were found for the remaining psychosocial outcomes (internalizing behaviors, locus of control, and self-esteem).
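To make the meaning of a standardized effect size such as d = −0.19 concrete, the sketch below computes a standardized mean difference with a pooled standard deviation for two groups. It assumes a simple two-group comparison and is not the estimation procedure used by Hwang and Cappella; the function name and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treatment, control):
    """Difference in group means divided by the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical test scores; a negative d means the treatment group scored lower.
retained = [48, 50, 52]
promoted = [49, 51, 53]
print(standardized_mean_difference(retained, promoted))
```

A value of d = −0.19 thus indicates that the treatment group mean lies about one fifth of a standard deviation below the comparison group mean.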

4.5.8 Polikoff, Le, Tien, Danielson, and Marsh (2018)

The study of Polikoff et al. addressed the impact of ‘Speedometry’, a STEM curriculum to teach fourth-grade students science and mathematics content (aligned with US national standards). The Speedometry curriculum consisted of 12 lessons. A cluster randomized controlled trial was conducted, which involved a total of 1,615 fourth-grade students across 48 classrooms and 17 schools in an urban district. The academic outcomes of the study were measured by means of a specifically designed assessment instrument. The non-cognitive outcomes were measured, post implementation, on the basis of a specifically designed student survey, the student interest and emotion survey. The following outcomes were included: general interest in the STEM activity, interest in specific math and science content, three positive emotions (excited, curious, surprised) and three negative emotions (confused, frustrated, bored). Results on student knowledge showed that 2% of the variance in content knowledge posttest scores was between schools, 12% of the variance was between classrooms within schools, and 86% of the variance was among students within classrooms. The intervention led to a 0.48 standard deviation increase in student content knowledge (p < 0.001) versus the control group. Regarding the non-cognitive outcomes, the Speedometry curriculum led to a 0.42 standard deviation increase in student interest, which was, however, not statistically significant at the 0.05 level (p = 0.061). Further, the results showed that Speedometry led to a 0.30 standard deviation increase in excitement (p < 0.05), a 0.30 standard deviation decrease in boredom (p < 0.05), a 0.26 standard deviation decrease in frustration (p < 0.05), and a 0.25 standard deviation decrease in confusion (p < 0.01). The curriculum did not have a significant impact on surprise or curiosity.
It should be noted that, despite the fact that some non-cognitive outcomes were targeted, the intervention in this study was purely cognitive.

4.5.9 Gandhi, Slama, Park, and Williamson (2018)

Gandhi et al. report on a study in which a comparative interrupted time series design was used to examine the impact of the Massachusetts’ Wraparound Zone (WAZ) Initiative on student achievement, attendance, retention, and suspension. One of the motives for this study was that recent research has suggested that reform efforts prioritizing nonacademic strategies, such as school climate, social and emotional learning (SEL), and building community partnerships to address student needs, may be viable strategies for improving student outcomes. The evaluation study focused on five (high poverty) districts that implemented the WAZ strategies, one specific component in each district being one or more Social Emotional Learning (SEL) curricula or programs.

The dependent variables used to examine the impact of WAZ were: (a) student achievement, (b) student attendance, (c) student retention and (d) suspension (a binary variable indicating whether a student received an in-school or out-of-school suspension during the school year). Results on student achievement showed that overall, students in WAZ schools performed better than students in comparison schools, when considering prior achievement trends. Effects were statistically significant after three years of WAZ implementation. Specifically, in the third year of implementation, students in WAZ schools demonstrated ELA and mathematics scores that were 0.30 and 0.24 standard deviations higher, respectively, than what would be expected given prior performance trends and test score changes in non-WAZ comparison schools during the same time. There was no significant impact of WAZ on attendance rates, nor on retention and suspension.

4.5.10 O’Connor, Cappella, McCormick, and McClowry (2014)

McCormick, Cappella, O’Connor and McClowry (2016) used data from the randomized trial of the SEL program INSIGHTS into Children’s Temperament (N = 435 parent/child dyads). The study was conducted in 22 low-income urban elementary schools in the USA, during children’s kindergarten and first-grade year. The authors say that SEL programs aim to improve children’s social-emotional competencies (behavioral regulation, attentional skills, problem-solving, social skills), in order to support their academic development. In this study parental involvement was a central focus of attention. The authors place the mission of social emotional learning programs as a “growing movement, (which) seeks to improve the social-emotional skills of children attending low-income urban elementary schools” (p. 364). The current study built on an earlier impact evaluation of INSIGHTS by O’Connor, Cappella, McCormick and McClowry (2014) in the same 22 schools. The main results of this study will be summarized first. O’Connor et al. (2014) reported on “a group randomized intervention-study, that examined the effects of INSIGHTS versus an attention-control reading program in supporting the academic and behavioral skills of low-income urban children in kindergarten and first grade” (p. 1165). According to the authors, results provided convincing evidence of the benefits of a universal SEL prevention program for children’s academic development as well as self-regulatory capacities. Children in INSIGHTS demonstrated increases in math (ES 0.31) and reading (ES 0.55) achievement, as well as in sustained attention (ES 0.39), and decreases in behavior problems (ES 0.54) compared with their peers in the reading program after statistically accounting for the same skills in the fall of their kindergarten year.
The follow-up study indicated larger effects of INSIGHTS on academic, attentional, and behavioral outcomes for children whose parents participated at lower rates. This somewhat surprising outcome was explained by the lower initial achievement and stronger growth of children of low-participating parents as compared to children of high-participating parents, who had better initial results and more moderate growth.

4.5.11 Kraft and Dougherty (2013)

Kraft and Dougherty studied the effect of teacher-family communication on student engagement, starting from the assumption that the nature of relationships between teachers, students, and their parents plays an important role in determining a child’s level of engagement with school. They conducted a randomized trial involving sixth- and ninth-grade students and their parents. The setting of the study was the MATCH Charter Public Middle School and High School summer academy. Fourteen study groups of roughly 10 students were randomly assigned to the treatment or the control group. The experimental treatment consisted of two components of increased teacher communication. Students in the treatment group (n = 69) were assigned to receive one phone call home per day from either their fiction or nonfiction English teacher. Students received treatment for a total of five consecutive days during the second week of the summer academy. Teachers were asked to complete daily communication logs to track the implementation of the treatment regime. Student engagement outcomes were measured by means of a classroom observational protocol, which documented the total number of instances a teacher redirected a student’s attention or behavior in a given class (REDIRECT), as well as the number of instances each student participated in a given class (PARTICIPATE). The checks on treatment implementation showed a compliance rate of 86.4% with respect to the intended phone calls. Ultimately, only 54.9% of all prescribed calls resulted in a conversation with a parent or guardian. Text messages were delivered with an almost identical rate of success, with 298 of the 345 prescribed messages being sent.
Results are summarized as follows: “On average, students in the control group became measurably less engaged over time; their homework completion rate dropped by more than 6.5 percentage points, teachers had to redirect their attention more frequently, and they participated less in class. In comparison, students in the treatment group maintained their initial levels of engagement and improved their behavior; their homework completion rate dropped by only 0.6 percentage points, teachers had to redirect their attention less frequently, and their class participation increased” (ibid., p. 199). Another finding was that teacher–family communication reduced the frequency with which students’ attention or behavior in class had to be redirected by 25%.
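The implementation figures reported for this study can be checked with simple arithmetic. The sketch below is an illustration for the reader and not part of Kraft and Dougherty's analysis. It assumes that the 345 prescribed text messages correspond to one message per treated student per day (69 students over 5 days), which the summary above does not state explicitly; the variable names are hypothetical.

```python
# Check of the delivery figures reported for the treatment group.
# Assumption (not stated in the source): one text message was prescribed
# per treated student per treatment day.
students_in_treatment = 69   # n reported for the treatment group
treatment_days = 5           # five consecutive days of treatment

prescribed_messages = students_in_treatment * treatment_days
sent_messages = 298          # messages reported as sent

delivery_rate = sent_messages / prescribed_messages

print(f"Prescribed messages: {prescribed_messages}")
print(f"Delivery rate: {delivery_rate:.1%}")
```

Under this assumption the product reproduces the 345 prescribed messages, and the delivery rate rounds to 86.4%, which matches the statement that text messages and phone calls were delivered at almost identical rates.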


4.5.12 Gottfredson, Brown Cross, and Connell (2010)

Gottfredson et al. carried out a study on the effects of participation in after-school programs (ASPs) for middle school students. The authors start out with a literature review of effect studies of after-school programs and note that many evaluation studies do not meet standards of methodological rigor. They discern a trend that less rigorous studies are the only ones that report relatively high effect sizes, while rigorous studies tend to find no effect, small effects, or even negative effects. Their study randomly assigned students within each of five participating schools to an experimental ASP or to a “treatment as usual” control group. The program operated for 9 hours per week for 30 weeks and offered attendance monitoring and reinforcement, academic assistance, a prevention curriculum, and recreational activities. The after-school programs were located in public middle schools in Baltimore County, Maryland, that served high percentages of minority, socioeconomically disadvantaged youths. The experimental programs incorporated the All Stars curriculum, which had been demonstrated in prior research to reduce substance use and aggressive behavior and to increase social competency skills. Outcome data were gathered by means of surveys of youth, parents, and teachers. The main outcome categories were attendance, academic performance, and conduct problems. The non-cognitive outcome variables were reduction of unsupervised socializing, increase of positive peer influences, bonding to school, social competence, and prosocial attitudes. All these variables were seen as mediators of a reduction in conduct problems. The program effects were significant and in the positive direction for one outcome variable, the reduction of unsupervised socializing. Effects on the other outcomes were not significant and partly negative.
No significant posttest differences between treatment and control youths were found on measures of conduct problems, academic performance, school attendance, prosocial/antidrug attitudes, social competence, school bonding, or positive peer influence. The authors concluded that “programs like this are not strong enough to increase academic performance, reduce problem behavior or school nonattendance, or influence any of the targeted intermediate behaviors and attitudes other than time expenditure”. The voluntary nature of the program was seen as a source of substantial non-attendance, which, in turn, might have caused the disappointing results.

4.5.13 Snyder, Flay, Vuchinich, Acock, Washburn, Beets, and Li (2010)

Snyder et al. report on an evaluation of the Positive Action program, which took place in 20 public elementary schools (10 matched pairs) at grade levels 4 and 5 on three Hawai‘ian islands. This study will be discussed in more detail in Chap. 6; effect sizes are reported in Table 4.2.


The described selection of studies, published in the Journal of Research on Educational Effectiveness, confirms the conclusion reached in the overview of nonexperimental educational effectiveness studies: only a very small minority of studies from these fields have addressed non-cognitive outcomes and/or evaluations of social-emotional learning (SEL). The results of the selected studies are in line with the results of the meta-analyses addressed before, namely small to medium effect sizes (0.20–0.50) for non-cognitive outcomes. It should be noted that, like the current selection of seven studies, the meta-analyses were predominantly based on studies designed as experiments and quasi-experiments. The selection of studies also confirms the picture of an over-representation of studies at kindergarten and elementary school level in disadvantaged settings. A recent analysis of randomized controlled trials commissioned by the British Education Endowment Foundation (EEF) and the US-based National Center for Educational Evaluation and Regional Assistance (NCEE) confirms the relative underrepresentation of social-emotional learning interventions among educational effectiveness studies (Lortie-Forgues & Inglis, 2019). Among the 141 selected interventions, published between 2012 and 2018, only 4 involved social-emotional learning programs.⁴ It should be noted that this is a selection from a universe of studies that met rigorous methodological inclusion criteria. It could be the case that considerably more SEL program evaluations exist that do not meet these criteria.

⁴ These four programs were “Good behavior Game”, PATHS, Zippy’s Friends, and Lessons in Character.

4.6 Summary and Conclusions

When taking stock of the evidence on the success of programs and practices in education to enhance socio-emotional outcomes, the question arises how the effect sizes from research reviews and meta-analyses should be benchmarked. When generally accepted norms are applied, like an effect size of 0.20–0.25 as a measure of minimal educational relevance, the meta-analyses that were reviewed are close to this norm or exceed it when it comes to social-emotional outcomes assessed directly after treatment implementation. The effects on academic outcomes are remarkable. In the five meta-analyses that were reviewed these were in the order of 0.20. A recent meta-analysis by Corcoran et al. (2018) found similar effect sizes of SEL programs on academic achievement (reading 0.25, mathematics 0.26, and science 0.19). Considering that in the case of SEL interventions these results might have to be considered as ‘side effects’, and not as the directly intended program effects, they seem quite high, all the more so because meta-analyses of interventions and practices directly dedicated to optimizing cognitive outcomes show comparable, and frequently even lower, effect sizes. Comparable effect sizes for educational interventions, of about 0.23, are reported in meta-analyses by Hill et al. (2008) and Dietrichson et al. (2017) for intervention components such as tutoring (ES = 0.36), feedback and progress monitoring (ES = 0.32), and cooperative learning (ES = 0.22). Lower effect sizes are reported in a meta-analysis of Comprehensive School Reform (CSR) programs by Borman et al. (2003). This analysis showed effect sizes between 0.09 and 0.15. Borman et al.’s results are particularly significant, because the program theory behind CSR projects is considered as evidence-based and grounded in scientific research. A specification of the CSR program theory is given in the Annex to this chapter.
In the study by Lortie-Forgues and Inglis (2019), referred to above, the average effect size of 141 randomized trials in education, conducted between 2012 and 2018, was only 0.06 SD. (The four SEL interventions included in this study had effect sizes of 0.03, −0.11, 0.02, and 0.08.) This and other evidence (Detterman, 2016; Scheerens, 2017; Seidel & Shavelson, 2007) raises the question of whether the malleability of student achievement, in the sense of increments in effectiveness over and above regular practice, is being systematically overestimated. Seen in this light, some of the evidence on the effectiveness of SEL interventions almost seems ‘too good to be true’. The frequently expressed concerns about the quality of social-emotional learning outcome measures cannot be ruled out as a cause of score inflation on the social-emotional learning outcomes (see Chap. 6 for further consideration of these concerns).
Apart from these questions about the internal validity of SEL intervention and effectiveness studies, there are also questions about external validity, the degree to which outcomes are generalizable. Our findings confirm the observation by Kautz et al. (2014) that programs implemented in disadvantaged contexts, and serving students at kindergarten and elementary school level, predominate, which limits the degree to which results can be expected to apply to more average student populations at lower and upper secondary level.
In the next chapter we continue our search by taking a closer look at individual SEL intervention studies in order to capture the nature of the intervention programs, reflect on the underlying “program theory”, and consider the overall quality of the program evaluations.

Annex: Characteristics of Comprehensive School Reform Programs (Cited from Borman et al., 2003)

“The U.S. Department of Education defines CSR using 11 components that, when coherently implemented, represent a “comprehensive” and “scientifically based” approach to school reform. Specifically, a CSR program:

1. Employs proven methods for student learning, teaching, and school management that are based on scientifically based research and effective practices, and have been replicated successfully in schools;
2. Integrates instruction, assessment, classroom management, professional development, parental involvement, and school management;
3. Provides high-quality and continuous teacher and staff professional development and training;
4. Includes measurable goals for student academic achievement and establishes benchmarks for meeting those goals;
5. Is supported by teachers, principals, administrators, and other staff throughout the school;
6. Provides support for teachers, principals, administrators, and other school staff by creating shared leadership and a broad base of responsibility for reform efforts;
7. Provides for the meaningful involvement of parents and the local community in planning, implementing, and evaluating school improvement activities;
8. Uses high-quality external technical support and assistance from an entity that has experience and expertise in schoolwide reform and improvement, which may include an institution of higher education;
9. Includes a plan for the annual evaluation of the implementation of the school reforms and the student results achieved;
10. Identifies federal, state, local, and private financial and other resources available that schools can use to coordinate services that support and sustain the school reform effort; and
11. Meets one of the following requirements: the program has been found, through scientifically based research, to significantly improve the academic achievement of participating students; or the program has been found to have strong evidence that it will significantly improve the academic achievement of participating children”.

References

Baumeister, R. F., Campbell, J. D., Krueger, J. I., & Vohs, K. D. (2003). Does high self-esteem cause better performance, interpersonal success, happiness, or healthier lifestyles? Psychological Science in the Public Interest, 4(1), 1–44. https://doi.org/10.1111/1529-1006.01431
De Bilde, J. (2013). Alternative education. Examining the effects of alternative educational approaches on student achievement, academic motivation and engagement in Flemish primary schools. Leuven: University of Leuven.
Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125–230. https://doi.org/10.3102/00346543073002125
Brunner, M., Keller, U., Wenger, M., Fischbach, A., & Lüdtke, O. (2018). Between-school variation in students’ achievement, motivation, affect, and learning strategies: Results from 81 countries for planning group-randomized trials in education. Journal of Research on Educational Effectiveness, 11(3), 452–478. https://doi.org/10.1080/19345747.2017.1375584
Cefai, C., Bartolo, P. A., Cavioni, V., & Downes, P. (2018). Strengthening social and emotional education as a core curricular area across the EU. A review of the international evidence, NESET II report. Luxembourg: Publications Office of the European Union. http://doi.org/10.2766/664439
Cheung, A. C. K., & Slavin, R. E. (2015). How methodological features affect effect sizes in education. Educational Researcher, 45(5), 283–292. https://doi.org/10.3102/0013189X16656615
Collaborative for Academic, Social, and Emotional Learning. (2005). Safe and sound: An educational leader’s guide to evidence-based social and emotional learning programs—Illinois edition. Chicago: Author.
Corcoran, R. P., Cheung, A., Kim, E., & Xie, C. (2018). Effective universal school-based social and emotional learning programs for improving academic achievement: A systematic review and meta-analysis of 50 years of research. Educational Research Review, 25, 56–72. https://doi.org/10.1016/j.edurev.2017.12.001
Cunha, F., Heckman, J. J., & Schennach, S. M. (2010). Estimating the technology of cognitive and noncognitive skill formation. Econometrica, 78(3), 883–931. https://doi.org/10.3982/ECTA6551
Denham, S. A. (2005). Assessing social-emotional development in children from a longitudinal perspective for the national children’s study. Columbus, OH: Battelle Memorial Institute.
Detterman, D. K. (2016). Education and intelligence: Pity the poor teacher because student characteristics are more significant than teachers or schools. The Spanish Journal of Psychology, 19. https://doi.org/10.1017/sjp.2016.88
Dietrichson, J., Bøg, M., Filges, T., & Klint Jørgensen, A.-M. (2017). Academic interventions for elementary and middle school students with low socioeconomic status: A systematic review and meta-analysis. Review of Educational Research, 87(2), 243–282. https://doi.org/10.3102/0034654316687036
Duckworth, A. L., & Yeager, D. S. (2015). Measurement matters: Assessing personal qualities other than cognitive ability for educational purposes. Educational Researcher, 44(4), 237–251. http://doi.org/10.3102/0013189X15584327
Durlak, J. A., Weissberg, R. P., Dymnicki, A. B., Taylor, R. D., & Schellinger, K. B. (2011). The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions. Child Development, 82(1), 405–432. https://doi.org/10.1111/j.1467-8624.2010.01564.x
Dweck, C., & Leggett, E. (1988). A social-cognitive approach to motivation and personality. Psychological Review, 95(2), 256–273.
Elias, M. J., Zins, J. E., Weissberg, R. P., Frey, K. S., Greenberg, M. T., Haynes, N. M., et al. (1997). Promoting social and emotional learning: Guidelines for educators. Alexandria, VA: Association for Supervision and Curriculum Development.
Evertson, C. M., & Weinstein, C. S. (Eds.). (2006). Handbook of classroom management: Research, practice, and contemporary issues. Mahwah, NJ: Lawrence Erlbaum.
Gandhi, A. G., Slama, R., Park, S. J., & Williamson, S. (2018). Focusing on the whole student: An evaluation of Massachusetts’s wraparound zone initiative. Journal of Research on Educational Effectiveness, 11(2), 240–266. http://doi.org/10.1080/19345747.2017.1413691
Goleman, D. (1998). Working with emotional intelligence. New York, NY: Bantam Books.
Gottfredson, D., Brown Cross, A., Wilson, D., Rorie, M., & Connell, N. (2010). Effects of participation in after-school programs for middle school students: A randomized trial. Journal of Research on Educational Effectiveness, 3(3), 282–313. https://doi.org/10.1080/19345741003686659
Gray-Little, B., Williams, V. S. L., & Hancock, T. D. (1997). An item response theory analysis of the Rosenberg Self-Esteem Scale. Personality and Social Psychology Bulletin, 23(5), 443–451. https://doi.org/10.1177/0146167297235001
Greenberg, M. T., Kusché, C. A., & Riggs, N. (2004). The PATHS curriculum: Theory and research on neurocognitive development and school success. In J. E. Zins, R. P. Weissberg, M. C. Wang, & H. J. Walberg (Eds.), Building academic success on social and emotional learning: What does the research say? (pp. 170–188). Teachers College Press.
Heckman, J. J., & Kautz, T. D. (2012). Hard evidence on soft skills. Retrieved from http://www.nber.org/papers/w18121.pdf
Heckman, J., Moon, S. H., Pinto, R., Savelyev, P., & Yavitz, A. (2010). Analyzing social experiments as implemented: A reexamination of the evidence from the HighScope Perry Preschool Program. Quantitative Economics, 1(1), 1–46. https://doi.org/10.3982/QE8
Heckman, J. J., Pinto, R., & Savelyev, P. A. (2013). Understanding the mechanisms through which an influential early childhood program boosted adult outcomes. American Economic Review, 103(6), 2052–2068. https://doi.org/10.1257/aer.103.6.2052
Heckman, J. J., Stixrud, J., & Urzua, S. (2006). The effects of cognitive and noncognitive abilities on labor market outcomes and social behavior. Journal of Labor Economics, 24(3), 411–482. https://doi.org/10.1086/504455
Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for interpreting effect sizes in research. Child Development Perspectives, 2(3), 172–177. https://doi.org/10.1111/j.1750-8606.2008.00061.x
Hwang, S. H. J., & Cappella, E. (2018). Rethinking early elementary grade retention: Examining long-term academic and psychosocial outcomes. Journal of Research on Educational Effectiveness, 11(4), 559–587. https://doi.org/10.1080/19345747.2018.1496500
Kautz, T., et al. (2014). Fostering and measuring skills: Improving cognitive and non-cognitive skills to promote lifetime success (OECD Education Working Papers, No. 110). Paris: OECD. https://doi.org/10.1787/5jxsr7vr78f7-en
Konu, A. I., Lintonen, T. P., & Autio, V. J. (2002). Evaluation of well-being in schools–A multilevel analysis of general subjective well-being. School Effectiveness and School Improvement, 13(2), 187–200. https://doi.org/10.1076/sesi.13.2.187.3432
Korpershoek, H., Harms, T., de Boer, H., van Kuijk, M., & Doolaard, S. (2016). A meta-analysis of the effects of classroom management strategies and classroom management programs on students’ academic, behavioral, emotional, and motivational outcomes. Review of Educational Research, 86(3), 643–680. https://doi.org/10.3102/0034654315626799
Kraft, M. A., & Dougherty, S. M. (2013). The effect of teacher-family communication on student engagement: Evidence from a randomized field experiment. Journal of Research on Educational Effectiveness, 6(3), 199–222. https://doi.org/10.1080/19345747.2012.743636
Lortie-Forgues, H., & Inglis, M. (2019). Rigorous large-scale educational RCTs are often uninformative: Should we be concerned? Educational Researcher, 48(3), 158–166. http://doi.org/10.3102/0013189X19832850
Marsh, H. W. (1988). The self-description questionnaire: A theoretical and empirical basis for the measurement of multiple dimensions of preadolescent self-concept: A test manual and a research monograph. San Antonio, TX: The Psychological Corporation.
McCormick, M. P., Cappella, E., O’Connor, E., & McClowry, S. (2016). Do effects of social-emotional learning programs vary by level of parent participation? Evidence from the randomized trial of INSIGHTS. Journal of Research on Educational Effectiveness, 9(3), 364–394. http://doi.org/10.1080/19345747.2015.1105892
Midgley, C., Maehr, M. L., Hruda, L. Z., Anderman, E., Anderman, I., & Freeman, K. I. (2000). Manual for the patterns of adaptive learning scales. Ann Arbor, MI: University of Michigan.
Muijs, D., Kyriakides, L., van der Werf, G., Creemers, B., Timperley, H., & Earl, L. (2014). State of the art—Teacher effectiveness and professional learning. School Effectiveness and School Improvement, 25(2), 231–256. https://doi.org/10.1080/09243453.2014.885451
Murray, C. (2012, September/October). Response: Weighing the evidence. Boston Review. Retrieved from https://bostonreview.net/archives/BR37.5/ndf_charles_murray_social_mobility.php
O’Connor, E. E., Cappella, E., McCormick, M. P., & McClowry, S. G. (2014). An examination of the efficacy of INSIGHTS in enhancing the academic and behavioral development of children in early grades. Journal of Educational Psychology, 106(4), 1156–1169. https://doi.org/10.1037/a0036615
OECD. (2015). Skills for social progress: The power of social and emotional skills. OECD Skills Studies, OECD Publishing. http://dx.doi.org/10.1787/9789264226159-en
Opdenakker, M.-C., & Van Damme, J. (2000). Effects of schools, teaching staff and classes on achievement and well-being in secondary education: Similarities and differences between school outcomes. School Effectiveness and School Improvement, 11(2), 165–196.
Payton, J., Weissberg, R. P., Durlak, J. A., Dymnicki, A. B., Taylor, R. D., Schellinger, K. B., et al. (2008). The positive impact of social and emotional learning for kindergarten to eighth-grade students. Chicago, IL: CASEL.
Polikoff, M., Le, Q. T., Danielson, R. W., Sinatra, G. M., & Marsh, J. A. (2018). The impact of speedometry on student knowledge, interest, and emotions. Journal of Research on Educational Effectiveness, 11(2), 217–239. https://doi.org/10.1080/19345747.2017.1390025
Reynolds, D., Sammons, P., De Fraine, B., Van Damme, J., Townsend, T., Teddlie, C., et al. (2014). Educational effectiveness research (EER): A state-of-the-art review. School Effectiveness & School Improvement, 25(2), 197–230. https://doi.org/10.1080/09243453.2014.885450
Roid, G. H., & Miller, L. J. (1997). Leiter international performance scale-revised: Examiners manual. Wood Dale, IL: Stoelting.
Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press. Retrieved from https://www.docdroid.net/Vt9xpBg/society-and-the-adolescent-selfimage-morris-rosenberg-1965.pdf#page=7
Rotter, J. B. (1966). Generalized expectancies for internal versus external control of reinforcement. Psychological Monographs: General and Applied, 80(1), 1–28. https://doi.org/10.1037/h0092976
Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55(1), 68–78. https://doi.org/10.1037/0003-066X.55.1.68
Scheerens, J. (2017). The perspective of “limited malleability” in educational effectiveness: Treatment effects in schooling. Educational Research and Evaluation, 23(5/6), 247–266. https://doi.org/10.1080/13803611.2017.1455286
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454–499. https://doi.org/10.3102/0034654307310317
Sklad, M., Diekstra, R., De Ritter, M., Ben, J., & Gravesteijn, C. (2012). Effectiveness of school-based universal social, emotional, and behavioral programs: Do they enhance students’ development in the area of skill, behavior, and adjustment? Psychology in the Schools, 49(9), 892–909. https://doi.org/10.1002/pits.21641
Snyder, F., Flay, B., Vuchinich, S., Acock, A., Washburn, I., Beets, M., et al. (2010). Impact of a social-emotional and character development program on school-level indicators of academic achievement, absenteeism, and disciplinary outcomes: A matched-pair, cluster-randomized, controlled trial. Journal of Research on Educational Effectiveness, 3(1), 26–55. https://doi.org/10.1080/19345740903353436
Taylor, R., Oberle, E., Durlak, J. A., & Weissberg, R. P. (2017). Promoting positive youth development through school-based social and emotional learning interventions: A meta-analysis of follow-up effects. Child Development, 88(4), 1156–1171. https://doi.org/10.1111/cdev.12864
Van Landeghem, G., Van Damme, J., Opdenakker, M.-C., De Fraine, B., & Onghena, P. (2002). The effect of schools and classes on noncognitive outcomes. School Effectiveness and School Improvement, 13(4), 429–451. https://doi.org/10.1076/sesi.13.4.429.10284
Van Swynsberghen, G., Vanlaar, G., Van Damme, J., & De Fraine, B. (2017). Long-term effects of primary schools on educational positions of students 2 and 4 years after the start of secondary education. School Effectiveness and School Improvement, 28(2), 167–190. https://doi.org/10.1080/09243453.2016.1245667
Wigelsworth, M., Lendrum, A., Oldfield, J., Scott, A., ten Bokkel, I., Tate, K., et al. (2016). The impact of trial stage, developer involvement and international transferability on universal social and emotional learning programme outcomes: A meta-analysis. Cambridge Journal of Education, 46(3), 347–376. https://doi.org/10.1080/0305764X.2016.1195791
Zins, J. E., & Elias, M. J. (2006). Social and Emotional Learning. In G. G. Bear & K. M. Minke (Eds.), Children’s Needs III (pp. 1–13). Bethesda, MD: NASP.

Chapter 5

Opening Black Boxes of the Meta-Analyses: What Do the Underlying Studies Look like?

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_5

5.1 Introduction

In this chapter we take a closer look at intervention programs that were evaluated in the individual studies that formed the building blocks for the meta-analyses discussed in the previous chapter. The review of meta-analyses prompted us to turn to a casuistic review of some of the underlying individual intervention studies, to come to grips with the actual content of social-emotional development programs and the match between programs and outcome measures. Basically, this chapter consists of case studies of intervention studies, in which we paid specific attention to the description of the actual interventions and the match with the SEL outcome measures that were used. Before turning to these case descriptions, however, we revisit the definition of the domain of social and emotional skills, as the overarching content of these programs and evaluation studies, and address the program theories behind the intervention studies. We consider the hypothetical causal structure of programs that are meant to stimulate social-emotional outcomes as program theory. According to Corcoran, Cheung, Kim, and Xie (2018), there is national and international evidence to suggest that improving social and emotional learning (SEL) allows students to connect with others and learn in a more effective way, thereby increasing their chances of success both in school and in later life. Next, they refer to the Collaborative for Academic, Social, and Emotional Learning (CASEL, 2015) for a comprehensive definition of social and emotional development, namely: “the process through which children and adults acquire and effectively apply the knowledge, attitudes, and skills necessary to understand and manage emotions, set and achieve positive goals, feel and show empathy for others, establish and maintain positive relationships, and make responsible decisions” (Corcoran et al., 2018, p. 57).
They continue by saying that the conceptual model that guides SEL outcome research assumes that each of the core competencies contributes to increased skills and knowledge, supportive learning environments, and improved attitudes about school, self, and others, which in turn leads to reduced problem behaviors, reduced emotional stress, improved social behavior, and improved self-esteem, and that each of these factors then helps contribute to improved academic performance in the classroom. But they also refer to alternative models, which “suggest that core competencies change teaching practices and allow for a richer classroom culture and this in turn elevates engagement as well as a feeling of security and support, which in turn has an effect on improved academic skills”. And they also note that “still other models distinguish between pro-social behavior and performance-related skills derived from SEL such as attention, regulation, or grit. Such SEL performance related skills are theorized to have a larger impact on academic outcomes than prosocial behavior” (ibid., p. 57). Schematically, their original model and the subsequent alternative models can be rendered as follows:

Model (1): programs addressing core social-emotional competences >>> (improved attitudes about school, self, and others; reduced problem behaviors; reduced emotional stress; improved social behavior; improved self-esteem) >>> improved academic performance in the classroom.
Model (2): programs addressing core social-emotional competences >>> (teaching practices and a richer classroom culture) >>> (elevated engagement as well as a feeling of security and support) >>> improved academic performance.
Model (3): programs addressing pro-social behavior and performance related skills (such as ‘grit’) >>> academic performance.

In these models, educational interventions are aimed at improving core social and emotional competences in order to stimulate academic performance. The first model hypothesizes intermediary processes that are learner based, while the second model mentions intermediary processes that also refer to qualities of the teaching and learning environment. Model 3 just highlights a particular category of skills.
In principle the components could be included in more complex indirect effect models, with different intermediary constellations. Typically, this is not seen in the program evaluation studies that are the building blocks of the meta-analyses discussed in the previous chapter. Another important feature of these models from Corcoran et al. is that they treat academic performance as the dependent variable. In their view the development of social-emotional skills is instrumental to the furthering of academic performance. We have been working from the perspective of the soft skills movement (see Chap. 1) that social and emotional skills are educational objectives in their own right, and therefore see measures of such skills as the primary outcome variables. Theoretically these two perspectives are reconcilable, either by measuring academic outcomes alongside social and emotional outcomes, as has in fact happened in all of the meta-analyses that were reviewed in the previous chapter, or by carrying out a causal modeling study, in which varying combinations of intermediary conditions can be tested. When academic outcomes are used as the effect criteria in the evaluation of social-emotional learning programs, the inclusion of social-emotional outcomes could serve different purposes: (a) as a second category of outcomes, (b) as a specific kind of moderator, namely as part of implementation control, and (c) as mediator. We return to the issue of program theories and causal models in social-emotional outcome studies in the concluding section of this chapter. Before embarking on the case studies of SEL intervention studies that form the main body of this chapter, we briefly refer to an earlier review of program evaluation studies by Kautz et al. (2014).

5.2 Review of Results of Program Evaluations of Skill Enhancement Programs by Kautz et al. (2014)

In their OECD working paper, Kautz et al. (2014) discuss the rate of return of preschool and school programs that included non-cognitive components. They conclude that the most successful interventions target preschoolers (after age three) and primary school children. “They improve later-life outcomes by developing non-cognitive skills” (p. 33). They mention four general challenges when trying to assess the effectiveness of intervention programs. Firstly, many interventions are assessed with no follow-up, or only a short-term one. Secondly, not all studies measure the same outcomes. Thirdly, many programs target specific student populations, and most of them target disadvantaged groups. Fourthly, “different programs use different, often incompatible, measurement schemes” (p. 32). They note that the evidence on intervention programs for adolescents is particularly problematic, “due to the absence of measures of skills for many adolescent interventions, understanding these programs requires examining the curricula of the programs themselves, for example, whether the program seeks to foster cognitive or non-cognitive skills”.

Out of the three primary school intervention programs Kautz et al. refer to in their summary table (p. 35), only one improved non-cognitive outcomes. This was the Seattle Social Development Project (SSDP). The program targeted public elementary schools in high-crime areas of Seattle and emphasized attachment and interaction between children and their parents and teachers. The program improved self-efficacy at ages 21 and 24, and improved scores on the mental health disorder index, but did not have a statistically significant effect on achievement test scores (p. 45).
Among intervention studies about programs for adolescents, only 2 out of the 8 programs listed in Kautz et al.’s summary showed positive effects on non-cognitive outcomes. Other descriptions of interventions targeted at adolescents present the following summary picture:
– Most programs were targeted at disadvantaged groups and were primarily oriented toward prevention of anti-social behavior.
– Next, most programs were at the same time targeted at improving cognitive outcomes, while improvements in non-cognitive outcomes other than behavioral improvement are mentioned as program facets but rarely assessed by means of psychological measures.
– To the extent that, let us say, psychological interpretations of non-cognitive outcomes are included in the program evaluations, this is usually done in a rather minimal way, focusing on just one or two traits, with limited (abbreviated) versions of instruments, and with little coherence in the choice of skills selected for measurement.

5.3 Case Descriptions of Program Evaluations of SEL Programs

The selection of a limited number of SEL program interventions for more in-depth description was guided, first of all, by choosing two programs that were referred to in several strands of research literature on social and emotional skills. As such we selected the evaluations of the Tools of the Mind curriculum and the PATHS program. Next, we selected studies from two recent meta-analyses, namely the one by Korpershoek et al. (2016), and the one by Corcoran et al. (2018). The selection of studies from these meta-analyses was fairly random, but we purposefully chose the evaluation of the Positive Action program by Snyder et al. (2010) for its rather high effect sizes.

5.3.1 Evaluation of the Tools of the Mind Curriculum (Barnett et al., 2008)

The effectiveness of the Tools of the Mind (Tools) curriculum in improving the education of 3- and 4-year-old children was evaluated by means of a randomized trial. The Tools curriculum, based on the work of Vygotsky, focuses on the development of self-regulation at the same time as teaching literacy and mathematics skills in a way that is socially mediated by peers and teachers and with a focus on play. The Tools curriculum was found to improve classroom quality and children’s executive functioning as indicated by lower scores on a problem behavior scale (Barnett et al., 2008, p. 299). There were indications that Tools also improved children’s language development, but these effects were smaller and did not reach conventional levels of statistical significance in multi-level models or after adjustments for multiple comparisons. The experiment was conducted in a low-income urban school district with a high proportion of children from low-income and non-English-speaking families. The Tools intervention had three main components: the teachers’ use of scaffolding, reducing behavioral problems and elements directly related to literacy skills. The reduction of behavioral problems component was associated with enhancing the students’ self-regulation capacity, in the sense of control of their physical, emotional and cognitive functioning. The Tools’ didactic approach is characterized as providing ample opportunity for play, controlled and supported by teacher interventions to stimulate “mature play” (ibid., p. 302).

The evaluation was designed as a randomized trial, taking place in one school; there were 7 experimental classrooms (88 students) and 10 control classrooms (122 students). In control classrooms, there was greater emphasis on teacher-imposed control and less on children regulating each other and themselves. Comparability of the two treatment conditions was verified by assessing several student characteristics at the start of the 7-month program. The effect of the self-regulation component was assessed by means of the teacher form of the Problem Behaviors Scale of the Social Skills Rating System (SSRS), completed by the child’s teacher near the end of the school year (Gresham & Elliott, 1990). Results were summarized as follows: “The effect on behavior problems as measured by the SSRS is about half a standard deviation (es = 0.47, Glass’ delta). This indicates that behavior problems (as rated by the teacher) were substantially less common for children in the Tools classrooms than for those in the control classrooms” (Barnett et al., 2008, p. 308). In addition, the Tools classrooms outperformed the control classrooms on teacher sensitivity and productivity as measured by the CLASS (a classroom observation rating scale, see Pianta et al., 2005).

While this study is cited as an example demonstrating the malleability of social-emotional skills, the effect variable is measured as problem behavior. Moreover, the assumed underlying construct of self-regulation is indicated as “one of the really central and significant cognitive developmental hallmarks of the early childhood period” (ibid., p. 300). Interestingly, the SSRS has a social skills scale and an academic competence scale next to the Problem Behaviors Scale that was used in the evaluation. One wonders why the social skills scale was not used and, given the fact that multiple entrance tests were used, why the SSRS Problem Behaviors Scale was not administered as a pre-test, to check the comparability of the treatment groups at entrance or to assess progress.
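The effect size quoted above is expressed as Glass' delta, which divides the difference between the group means by the standard deviation of the control group only, whereas Cohen's d (used in some of the studies discussed later in this chapter) divides by the pooled standard deviation of both groups. The Python sketch below shows the two computations side by side. The scores are invented for illustration and are not data from Barnett et al. (2008); the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def glass_delta(treated, control):
    """Mean difference scaled by the control group's standard deviation."""
    return (fmean(treated) - fmean(control)) / stdev(control)


def cohens_d(treated, control):
    """Mean difference scaled by the pooled standard deviation of both groups."""
    n_t, n_c = len(treated), len(control)
    pooled_var = (
        (n_t - 1) * stdev(treated) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (fmean(treated) - fmean(control)) / sqrt(pooled_var)


# Invented problem-behavior ratings (higher = more problems), for illustration only.
control_scores = [10, 12, 14, 16, 18]
tools_scores = [11, 12, 13, 14, 15]

print(f"Glass' delta: {glass_delta(tools_scores, control_scores):.3f}")
print(f"Cohen's d:    {cohens_d(tools_scores, control_scores):.3f}")
```

On a problem-behavior scale a negative value favors the program, and whether an effect is reported with its sign or as an absolute value is a matter of convention. The two indices coincide when both groups have the same standard deviation and diverge when the intervention also changes the spread of scores, as in this invented example.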

5.3.2 Evaluations of the Promoting Alternative Thinking Strategies (PATHS) Program (Greenberg, Kusché, & Riggs, 2004)

The philosophy of this program is based on various strands of theorizing, psychoanalysis and positive psychology among them. Greenberg et al. (2004) refer to one’s ability to regulate strong emotions (anger, anxiety, sadness), and to have self-awareness, as having direct impact on one’s performance, whether social or academic. There is a strong emphasis on countering anti-social behavior. Positive prosocial skills are considered to be valued outcomes and instrumental to the development of cognitive abilities. The program has a strong emphasis on prevention of bad behavior but at the same time a developmental perspective. The following basic principles of the PATHS curriculum are mentioned:
(1) The school environment is a fundamental ecology and can be a locus of change.
(2) A holistic approach that includes a focus on affect, behavior and cognition.
(3) Ability to discuss and understand emotions is related to the students’ ability to inhibit behavior and use effective problem solving in social interactions.
(4) Building protective factors, e.g. reflective thinking, problem solving and the ability to accurately anticipate and evaluate situations, that decrease maladjustment.

The program covers five conceptual social-emotional domains: Self-control, Emotional understanding, Positive self-esteem, Relationships and Interpersonal problem-solving skills. At the time the review by Greenberg et al. (2004) was published, the program consisted of 131 lessons over a 5-year period, starting at the level of Kindergarten. The program intervention was supported by an instructional manual, six volumes of lessons, and additional materials: the readiness and self-control unit; the feelings and relationships unit, teaching emotional and interpersonal understanding; and the interpersonal problem-solving unit. A general objective of the teaching approach was to build positive self-esteem and to improve peer communications and relationships. The evaluation design is described as longitudinal and randomized. Effect measures were based on measures of social cognition, based on teacher ratings of conduct problems. No standardized tests are mentioned in the review.

Goossens et al. (2012, p. 236) present the following overview of evaluations of the PATHS curriculum: “The results of the first PATHS study in regular education indicated that the intervention was effective in improving grade 2 and 3 children’s range of vocabulary and fluency in discussing emotional experiences, their efficacy beliefs regarding the management of emotions, and their developmental understanding of some aspects of emotions (Greenberg, Kusché, Cook, & Quamma, 1995)”. A second, larger study of 329 second and third graders showed that the intervention promoted inhibitory control, verbal fluency, and diminished internalizing and externalizing problem behaviors (Riggs, Greenberg, Kusché, & Pentz, 2006). A third study, with 246 pre-school children, showed that children exposed to the PATHS intervention had higher emotion knowledge skills and were more socially competent than peers (Domitrovich, Cortes, & Greenberg, 2007). In all these studies, the program developers were involved in the research, and the level of support was high (i.e. teachers received monthly or even weekly consultation from the project staff to enhance the quality of implementation). In a final evaluation study, discussed by Goossens et al. (2012), the PATHS program was studied in ten US public elementary schools. Of the twenty child-level outcomes, none was significant. The authors suggest that the lack of positive findings was probably caused by the control condition not being a non-treatment condition, but a standard practice condition including schools that use other social and character development activities. Goossens et al. conclude that, “In sum, PATHS has been shown to be efficacious and potentially effective, but effectiveness depends to a large extent on the implementation conditions” (ibid., p. 237).

A more recent evaluation study (Bierman et al., 2010) involved the Fast Track PATHS (Promoting Alternative Thinking Strategies) curriculum and teacher consultation, embedded within the Fast Track selective prevention model. The longitudinal analysis involved 2937 children of multiple ethnicities who remained in the same intervention or control schools for Grades 1, 2, and 3. The study was a clustered randomized controlled trial, with sets of schools randomized within 3 U.S. locations. The Fast Track PATHS curriculum was implemented in Grades 1–3. Grade 1 contained 57 lessons and Grade 2 contained 46 lessons, approximately 80% of which were drawn from the published version of the curriculum (Kusché & Greenberg, 1995). Grade 3 contained 48 lessons, with approximately 65% drawn from the published version. In order to provide a more detailed impression of the contents of the Fast Track version of the PATHS curriculum, we include a long citation of the intervention, as described in the article by Bierman et al. (pp. 159–60).

“Previously designed for special need populations, this multiyear (first through fifth grade) classroom prevention program was adapted to fit the needs of regular education students in high-risk schools for the Fast Track program. At each grade level, some new lessons were created in order to provide synchronization with the parent training and social skill training components of the Fast Track program. Across all three grades, approximately 40% of the lessons focused on skills related to understanding and communicating emotions. PATHS teaches young children to recognize the internal and external cues of affect and to label them with appropriate terms, as a basic step toward self-control.
In a series of lessons, feeling words are identified and descriptions of the sorts of situations that may elicit the feeling, the external cues to recognize that feeling in others, and the internal cues to identify that feeling in oneself are provided. Additional lessons help children understand the difference between feelings and behaviors. Appropriate and inappropriate behavioral responses are discussed. The teaching of feelings involves a generalization technique (“Feeling Faces”) that is used to promote the student’s use of new knowledge and skills throughout the classroom day. After each emotion concept is introduced, the children personalize their own Feeling Face for that affect; these faces are small cards with idealized line drawings of the affect that are kept on the student’s desk. The faces allow the children to communicate their feelings with minimal difficulty throughout the day, and they facilitate the children’s understanding about how feelings change. Teachers have their own set of Feeling Faces and use the cards as models for their students. Teachers are encouraged to promote generalization at the beginning and the end of the day, after recesses, and after lunchtime by suggesting that the children evaluate how they feel and display the appropriate faces. Regular homework activities are designed to help children engage their parents in cooperative activities, such as completing drawings or sharing stories related to curriculum components”. (Ibid, pp. 159–160)

In the evaluation study by Bierman et al. (2010) outcome measures were based on two independent sources: teacher ratings and peer sociometric nominations. First, in the fall and spring of first grade and the spring of second and third grade, teachers were individually interviewed regarding the behavior of each child in their class using the Teacher Observation of Classroom Adaptation–Revised (TOCA–R; Werthamer-Larsson, Kellam, & Wheeler, 1991) and the Social Health Profile (SHP; CPPRG, 1998; see footnote 1). Second, in the spring of first, second, and third grade, sociometric nominations were collected to assess peer aggression, hyperactive–disruptive behavior, and prosocial behavior. The authors summarize the results as follows: “Modest positive effects of sustained program exposure included reduced aggression and increased prosocial behavior (according to both teacher and peer report) and improved academic engagement (according to teacher report)”. “Significant intervention main effects and significant moderation of intervention were found for all three teacher-rated TOCA-SHP outcomes. On the outcomes of authority acceptance (p < 0.001, effect size 0.24), cognitive concentration (p < 0.001, effect size 0.12), and social competence (p < 0.0001, effect size 0.34), children in the intervention schools had significantly lower problem levels at Grade 3 and less of an increase in problems than did children in the control schools (p < 0.001)” (ibid., p. 163). The analysis of the peer sociometric nominations did not show significant main effects, but only some significant interaction effects (aggressive and hyperactive–disruptive nominations for boys) (ibid., p. 164).

A final evaluation study of PATHS we would like to refer to is based on the implementation of PATHS in all of the eight grades of Dutch primary school (Goossens et al., 2012). The evaluation study did not find intervention effects. The authors blame this on limitations of the program implementation.
Descriptions of the intervention in evaluation studies are quite useful for capturing the scope of a program like PATHS. Perhaps even more so are the outcome measures. Based on the review that was presented, a first impression is that the broad term “social-emotional skills” covers an emphasis that is more narrowly targeted to the prevention of anti-social behavior with a touch of emotion regulation and social skills. The range of outcome measures used in the study by Goossens et al. reveals a somewhat broader coverage of emotional states; their sets of measures go beyond assessing problem behaviors and emotion regulation in the school context, to include emotional awareness and empathy. At the end of the chapter, we will summarize the outcome measures that were used in the PATHS intervention studies (and the other studies that were reviewed) in an Annex, and we will discuss some of them in more detail in Chap. 6.

1 Conduct Problems Prevention Research Group (1998). Technical report for the Social Health Profile. Retrieved from http://sanford.duke.edu/centers/child/fasttrack/techrept/s/shs/shs3tech.pdf.

5.3.3 Good Behavior Game (GBG, Witvliet, van Lier, Cuijpers, & Koot, 2009)

The Good Behavior Game (GBG) is described as a universal classroom-based preventive intervention. The program is aimed at reduction in children’s externalizing (antisocial) behavior and improvement in positive peer relations. Positive peer relations are theorized to prevent externalizing problems “because they provide children a social context in which they can practice social skills, learn social norms and rules, experience social support, and validate a sense of self-worth” (Witvliet et al., 2009). The authors note that several longitudinal studies have shown that problems in peer relations, such as peer rejection and victimization, mediate the association between early problem behavior and later antisocial behavior. One of the lines of approach through which the GBG aims to reduce externalizing behavior and to promote prosocial behavior is to facilitate positive interactions between children through a team-based approach. In the GBG, children are assigned to teams. Team members are encouraged to actively support each other in behaving appropriately, and teams as a whole are systematically rewarded when complying with the explicitly formulated class rules. Because of the active facilitation and rewarding of positive interactions between team members, it is reasonable to assume that these changes in positive peer interactions support the effect of the program on externalizing problems. Previous research on the GBG found that the program was indeed effective in reducing externalizing behavior. Witvliet et al. (2009) describe the following further details of the intervention.

“On the basis of behavioral observations, teachers assign children to teams with an equal number of disruptive and non-disruptive children. Teams contain on average 4–5 members, and team compositions may change throughout the year.
Each team receives a number of cards, and teachers take a card from a team if a team member violates one of the predefined rules. Children in teams are encouraged to actively support each other in behaving appropriately. Teams as a whole are rewarded by receiving tangible rewards when at least one card is left at the end of a 15–60min period. In addition, students and teams are always rewarded by compliments. The GBG is implemented in three phases. In the introduction phase, children and teachers are familiarized with the GBG by playing it three times a week for 10 min. In the expansion phase, the duration of the GBG, the settings in which the GBG is played, and the behaviors targeted by the GBG are expanded. Rewards are delayed for a week and then a month. In the generalization phase, prosocial behavior outside GBG moments is promoted by explaining to children that the rules used during the GBG are also applicable, even when the game is not played. These three phases were implemented in both first and second grades. However, because children were already familiar with the GBG in second grade, classes swiftly moved onto the expansion and generalization phase. Teachers received three afternoons of training and 10 annual classroom supervisions by licensed GBG supervisors. After the classroom observations, the GBG supervisors gave feedback to the teachers” (Ibid., p. 908).

The evaluation study by Witvliet et al. (2009) was a randomized trial, involving 825 kindergarten children from 47 classes in 30 elementary schools from two urban areas in the western part of the Netherlands and one rural area in the eastern part of the Netherlands. The study took place over a 2-year period, when the children attended grades 1 and 2. The following measures were used in the study: Teacher ratings of externalizing behavior and prosocial behavior were obtained with the Problem Behavior at School Interview (Erasmus, 2000), and their ratings of social problems were obtained with the 11-item Social Problems scale of the Teacher’s Report Form (Achenbach, 1991; Verhulst et al., 1997). Data about students’ peer nominations of acceptance and number of mutual friends were obtained by asking children to nominate an unlimited number of children in their class that they liked most. Based on these nominations, proximity to others was computed by using the network analysis software program UCINET (Version 6; Borgatti, Everett, & Freeman, 2002). The results of the study were as follows: The effect size of the mean difference in externalizing behavior (Cohen’s d) between control group and GBG children after 2 years of intervention was 0.45. The effect sizes of the mean differences in peer acceptance, having mutual friends, and proximity to other children in the winter of second grade were 0.34, 0.20, and 0.26 respectively. To the extent that the aims and outcomes of GBG are associated with the development of social-emotional skills, it is one more example of a program strongly oriented to the prevention of anti-social behavior with a touch of social skill development.
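How “liked most” nominations can be turned into the two simplest peer measures mentioned above, acceptance and number of mutual friends, is sketched below in Python. This is an illustration under assumptions, not the procedure of Witvliet et al. (2009): the children and their nominations are invented, acceptance is taken to be the number of nominations received, a mutual friend is taken to be a reciprocated nomination, and the proximity measure that the authors computed with UCINET is not reproduced.

```python
# Invented "liked most" nominations: each child maps to the set of classmates
# he or she nominated (hypothetical names, for illustration only).
nominations = {
    "Anna": {"Ben", "Cara"},
    "Ben": {"Anna"},
    "Cara": {"Anna", "Dev"},
    "Dev": set(),
}


def acceptance(noms):
    """Number of nominations each child receives from classmates."""
    counts = {child: 0 for child in noms}
    for liked in noms.values():
        for child in liked:
            counts[child] += 1
    return counts


def mutual_friends(noms):
    """Number of nominations by each child that are reciprocated."""
    return {
        child: sum(1 for other in liked if child in noms.get(other, set()))
        for child, liked in noms.items()
    }


print(acceptance(nominations))
print(mutual_friends(nominations))
```

In an actual study such raw counts would typically be standardized within classrooms before groups are compared, since class size limits how many nominations a child can receive.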

5.3.4 Zippy’s Friends (Holen, Waaktaar, Lervåg, & Ystgaard, 2012)

Zippy’s Friends is a universal school-based program targeting children between 6 and 8 years of age. The main objective of the program is to prevent psychological problems by increasing children’s coping repertoire and giving them various ways of coping with problems (Holen et al., 2012; Mishara & Ystgaard, 2006). The program builds on theory and empirical findings on the relation between negative life events, coping and mental health. “Emotional coping includes everything we do to regulate the negative emotions triggered by an event, such as playing music, taking a walk or crying. Action-focused coping refers to everything we do to change the situation that frustrates us or makes us unhappy” (Holen et al., 2012, p. 658). Zippy’s Friends is based on six stories about three cartoon characters, their families and friends, and the imaginary stick insect Zippy. Over the course of 24 weekly lessons, children explore themes related to emotions, communication, relations and conflict resolution through the many day-to-day problems, sorrows and joys Zippy and his friends experience (Mishara & Ystgaard, 2006; cited by Holen et al., 2012).

“Through tasks and discussions within a manualized structured program, the children are stimulated to interact and take part in dialogues in class, and to share experiences and perceptions. The program is designed with stepwise progression from simple to more complex elements. The coping aspect recurs in various ways in all the lessons. Activation and the exploration of both emotional and action-oriented coping alternatives are being emphasized. The teacher’s challenge is thus to maintain the program structure and aims as well as guiding the class through structured learning tasks. Learning is transferred across different settings through the repetition of learning experiences in continuously new situations, during project classes and in the day-to-day affairs of school. The teaching material consists of: (1) six stories about Zippy and his friends; (2) large color posters illustrating the stories; (3) a detailed instruction manual for the teachers. Each of the six stories focuses on a main topic: emotions, communication, friendship, conflicts and conflict resolution, loss and change and finally summary lessons about coping, which repeats and integrates everything learnt up to that point. The children work on these topics by drawing, role playing, performing exercises, play and dialogue” (ibid., p. 659).

The study by Holen et al. (2012) used a randomized design, controlling for the hierarchical structure of the data, to examine the hypothesis that participating in the Zippy’s Friends program would improve children’s coping repertoire and prevent mental health problems. Thirty-five Norwegian elementary schools, representing both rural and urban areas, agreed to take part in the study with all their second-grade classes. Over the course of 24 weekly lessons, the children were stimulated to initiate their own activities, interactions and dialogue, and to share perceptions and experiences. The program was implemented by the teachers and was to have a central place in the children’s lives over an extended period (minimum eight months). Three counselling sessions lasting one day each were scheduled for the teachers in the course of the program.
The following instruments were used to measure the outcomes of the study:
– Coping was assessed using the Kidcope questionnaire, based on stress-coping theory for adults and developed by Spirito, Stark, and Williams (1988). The coping measures were reduced to three factors labelled as Support-seeking (Factor 1), Active (Factor 2) and Withdrawal/Oppositional (Factor 3).
– Mental health was assessed using the extended Norwegian version of the Strengths and Difficulties Questionnaire (SDQ), parent and teacher form (Goodman, 1997). It consists of 25 items representing five subscales: Emotional Symptoms, Conduct Problems, Hyperactivity/Inattention, Peer Problems and Prosocial Behavior.

The results of the study were summarized as follows: “While the children reported a significant reduction (Cohen’s d = −0.380) in oppositional coping strategies, their parents reported a significant increase in active strategies (Cohen’s d = 0.186). No significant effects were discovered in the mental health subscales as assessed by the SDQ”. The researchers conclude that the self-reported reduction in oppositional strategies may indicate that the children had learned alternative strategies instead of screaming and blaming others when handling peer rejection at school (p. 671). The intervention as well as the outcome variables of this study had a component, coping with peer rejection at school, which might be seen as a social skill. The other main component was problematic and anti-social behavior.

5.4 The UK Resilience Program (Challen, Noden, West, & Machin, 2011)

In September 2007, three local authorities (South Tyneside, Manchester and Hertfordshire) piloted a program with Year 7 pupils in 22 of their secondary schools, with the aim of building pupils’ resilience and promoting their well-being as well as accurate thinking: the UK Resilience Program. The UK Resilience Program is the UK implementation of the Penn Resiliency Program (PRP), a well-being program that has previously been assessed more than 13 times in different settings.

“The Penn Resiliency Program (PRP) is a curriculum developed by a team of psychologists at the University of Pennsylvania. Its original aim was to prevent adolescent depression, but it now has a broader remit of building resilience and promoting realistic thinking, adaptive coping skills and social problem-solving in children. The primary aim of the program is to improve psychological well-being, but it is possible that any such improvement could also have an impact on behavior, attendance and academic outcomes. Thirteen controlled trials have found PRP to be effective in helping protect children against symptoms of anxiety and depression, and some studies have found an impact on behavior. The skills taught in PRP could be applied in many contexts, including relationships with peers and family members, and achievement in academic or other activities. PRP is a manualized intervention comprising 18 hours of workshops. (“Manualized” means that no additional materials or resources are required to lead the workshops.) The curriculum teaches cognitive-behavioral and social problem-solving skills. Central to PRP is Ellis’s Activating-Belief-Consequences model that beliefs about events mediate their impact on emotions and behavior.
PRP participants are encouraged to identify and challenge (unrealistic) negative beliefs, to employ evidence to make more accurate appraisals of situations and others’ behavior, and to use effective coping mechanisms when faced with adversity. Participants also learn techniques for positive social behavior, assertiveness, negotiation, decision-making, and relaxation. The manualized nature of the curriculum and the intensive training required before using it allows facilitators to be drawn from a wide range of professions and agencies including teachers, learning mentors, teaching assistants, psychologists and health professionals. The training for the original cohort of teachers lasted around 8–10 days, with the first half of the course focusing on teaching trainees the adult-level Cognitive Behavioral Therapy (CBT) skills, and the second week on familiarizing them with the students’ curriculum and practicing how to communicate it to pupils” (Challen et al., 2011, p. 8).
The report on the impact evaluation of the UK application of the PRP summarizes earlier evidence on the effectiveness of the program as follows: “Overall, the controlled trials that have been conducted of PRP suggest that it could prevent symptoms of depression and anxiety in universal, targeted and clinic samples, and some studies have found some evidence of a reduction in disruptive behavior
(see e.g. Jaycox, Reivich, Gillham, & Seligman, 1994; Roberts, Kane, Thomson, Bishop, & Hart, 2003; most studies did not measure behavior). However, there are some inconsistent findings, and a meta-analysis of the PRP research, which includes both published and unpublished research, finds very mixed results across studies (Brunwasser, Gillham, & Kim, 2009). Some studies found no effect on depressive symptoms, while others found an effect on some groups but not others. In an earlier review of the PRP studies (Gillham, Brunwasser, & Freres, 2008), the PRP team further find a link between measured impact and the level of training and supervision of the workshop facilitators, suggesting that despite the manualized curriculum, facilitator quality is important and treatment heterogeneity (differences in the quality of the program delivered) is likely. In addition, the sample sizes used in prior PRP studies are relatively small, and scaling-up is a common evaluation problem, with the efficacy of an intervention frequently decreasing as the number of subjects involved increases” (ibid, p. 9).
The evaluation study by Challen et al. (2011) took place in 22 secondary schools, and involved 6000 students. The following outcome measures were used:
– Symptoms of depression were measured using the Children’s Depression Inventory (CDI).
– Symptoms of anxiety were measured using the Revised Children’s Manifest Anxiety Scale (RCMAS).
– Behavior was measured using the self-report and teacher-report versions of the Goodman SDQ. The SDQ total difficulties score is comprised of 20 items, each scored 0, 1 or 2 according to the perceived severity of the symptom.
– Life satisfaction was measured using the Huebner Brief Multidimensional Students’ Life Satisfaction Scale, which has five items asking about satisfaction with particular domains of a child’s life and one asking about overall life satisfaction.
– “Academic attainment in English, math and science was measured in sublevels, such as 3b, 5.5 etc. Key Stage 2 attainment in these three subjects is obtained from the National Pupil Database, and data on attainment throughout the first three years of secondary school was provided by the schools.” (ibid, p. 19)
Data on academic performance were only collected in 12 of the 22 schools. Results on academic and social-emotional outcomes were as follows: “In the short run, the estimated impact of the workshops on attainment in English language is 0.37 of a standard deviation – a moderate to large policy effect. At one-year follow-up it is 0.22 of a standard deviation, and at two-year follow-up it is positive but not significant (i.e. it is essentially zero). This suggests a positive impact in the short run, which fades over time. We see that the impact on math scores is essentially zero in the short run (about 0.10 but not significant), rising to 0.24 at one-year follow-up and possibly around 0.19 at two year follow-up, although the coefficient on this in our preferred specification in column 12 is not significantly different from zero”. For attainment in science, the coefficient on ‘Treated’ suggests that pupils in the workshops scored about 0.2 of a standard deviation higher in Key Stage 2 science than pupils in the control group. “However, once pupil characteristics and the school they attend are taken into consideration this coefficient drops to zero. There is no measured impact of the workshops on attainment in science in any period” (ibid, p. 34).
Results on psychological dimensions and behavior showed an average short-run improvement in pupils’ depression symptom scores and school attendance, as a result of the workshops. However, this improvement had faded by one-year follow-up for the depression score and for absence from school. There was some impact on anxiety scores, but this was inconsistent and concentrated in a few groups of pupils. This program is another example of a treatment that is mixed: partly aimed at preventing psychological problems and anti-social behavior, and partly cognitively oriented (accurate thinking, social problem solving). The major positive outcome is post-program performance in English language, ES = 0.37. Results on behavior and psychological well-being were negligible. Follow-up effects were altogether missing or small. It is noteworthy that this study, unlike most other program evaluations in this field, was not concentrated in disadvantaged (low-SES) schools and that it took place in secondary rather than pre-primary and primary schools.
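The effect sizes quoted above are expressed in standard deviation units. As a rough reading aid, the sketch below converts such an effect size into the percentile that a median control-group pupil would reach if shifted by that amount. It assumes normally distributed scores with equal variance in both groups; this is an assumption of the illustration, not something reported by Challen et al. (2011), and the function name `percentile_after_shift` is hypothetical.

```python
from statistics import NormalDist


def percentile_after_shift(effect_size_sd: float) -> float:
    """Percentile (0-100) reached by a median control-group pupil whose
    score is shifted by `effect_size_sd` standard deviations.

    Assumption (not from the evaluation report): scores are normally
    distributed with equal variance in both groups.
    """
    return 100 * NormalDist().cdf(effect_size_sd)


# Effect sizes for English attainment as quoted from Challen et al. (2011).
for label, effect in [("short run", 0.37), ("one-year follow-up", 0.22)]:
    print(f"{label}: {percentile_after_shift(effect):.1f}th percentile")
```

Under these assumptions, the short-run effect of 0.37 corresponds to a move from the 50th to roughly the 64th percentile, and the one-year effect of 0.22 to roughly the 59th percentile.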

5.4.1 The Lessons in Character (LIC) Program (Hanson, Dietsch, & Zheng, 2012)
The LIC program is an English language arts-based character education program. Hanson et al. (2012) evaluated its effects on student academic achievement, social competence, and problem behaviors and, secondarily, on the school environment. The program consists of literature-based supplementary material aligned with California English language arts standards and designed to integrate easily into the current English language arts curricula. The LIC curriculum is designed to be easy to implement in the classroom and to involve minimal teacher training, which distinguishes the program from other character education programs. The (US) context of the program is described as being part of “one of the fastest growing reform movements in K–12 education today, partially in response to unacceptable levels of student misbehavior and inadequate endorsement of good character values” (Hanson et al., 2012, p. xi). The aims of the program are enhanced academic achievement and social competence and fewer behavioral problems. Additional expected outcomes are an increased sense of belonging to the school and “greater levels of school expectations consistent with character development” (Hanson et al., 2012, p. xii). In order to obtain an impression of the contents and scope of the intervention program we refer to the teacher training and the (implemented) curriculum.
Teacher training
“The framework for the training was based on the Partnership for Character Education’s “Eleven Principles of Effective Character Education,” which outlines 11 basic
principles that suggest how character will look when implemented in a classroom or school. Throughout the training, character education was defined as the systematic, purposeful teaching of core consensus values that lead to habits of good character. The training was designed to familiarize teachers with the rationale for incorporating character education in a systematic way into their daily schedule through the use of literature and lessons that allow students to learn about what being a person of character is and by practicing the skills learned in the lessons. The training emphasized that teaching about character was not another job but rather reinforcement that they already teach students the habits of good character, and that they should be mindful of the influence teachers have on their students as a model of good character” (Hanson et al., 2012, p. 49). The 11 principles of effective character education are summarized below.
ELEVEN PRINCIPLES of Effective Character Education, by Lickona, Schaps, and Lewis (undated) http://www.educationalimpact.com/resources/TeachChar/pdf/eleven_principles.pdf
1. Character education promotes core ethical values as the basis of good character. (Core values are caring, honesty, fairness, responsibility, and respect for self and others.)
2. “Character” must be comprehensively defined to include thinking, feeling, and behavior.
3. Effective character education requires an intentional, proactive, and comprehensive approach that promotes the core values in all phases of school life.
4. The school must be a caring community.
5. To develop character, students need opportunities for moral action.
6. Effective character education includes a meaningful and challenging academic curriculum that respects all learners and helps them succeed.
7. Character education should strive to develop students’ intrinsic motivation. E.g., students should minimize reliance on extrinsic rewards and punishments that distract students’ attention from the real reasons to behave responsibly: the rights and needs of self and others.
8. The school staff must become a learning and moral community in which all share responsibility for character education and attempt to adhere to the same core values that guide the education of students. E.g., all staff must model the core values in their own behavior and take advantage of other opportunities they have to influence the character of the students with whom they come into contact.
9. Character education requires moral leadership from both staff and students.
10. The school must recruit parents and community members as full partners in the school’s character-building efforts.
11. Evaluation of character education should assess the character of the school, the school staff’s functioning as character educators, and the extent to which the students manifest good character. Effective character education must include an effort to assess progress.

Curriculum implementation
The LIC core curriculum consists of 25 lessons, of which participating intervention group teachers were asked to implement 19 during the academic year. Teachers implemented an average of 12.40 core lessons in year 1 and 9.56 lessons in year 2. “The supplementary curricular materials were used less frequently than the LIC core materials: two-thirds of teachers used at least some of the Daily Oral Language
with Character or Writing with Character materials in their classrooms in year 1 and about half of teachers reported using these materials in year 2. In sum, participating teachers implemented fewer core LIC lessons and used the Daily Oral Language with Character and Writing with Character materials less frequently in their classes than recommended by the program developer” (Hanson et al., 2012, p. 56).
Hanson et al. (2012) report on a program evaluation designed as an experimental trial; the study took place from spring 2007 to spring 2010 in 50 California elementary schools (with teachers of grades 2–5). The impact analyses included 4683 students who were in grade 4 or 5 in year 2. The following outcome measures were used in the evaluation study:
– State English language arts assessments. Student achievement data from state-mandated standardized assessments of English language arts (the California Standards Tests) were collected for the years before and during program implementation. Criterion-referenced to state standards, the California Standards Tests in English language arts are administered to students in grades 2–11.
– Social Skills Rating System teacher reports. Gresham and Elliott’s Social Skills Rating System was used to assess student social skills, problem behaviors, and academic competence.
– Student surveys. A 35-min survey assessing behaviors, attitudes, and values consistent with the goals of character education was administered to all grade 4 and 5 students in the fall and spring of year 1 and the spring of year 2. Using items and subscales from validated instruments, the survey assessed student altruism (Characterplus, 2002), aggression (Orpinas & Frankowski, 2001), delinquent behavior (Kisker et al., 2004), and empathy (Funk, Elliot, Bechtoldt, Pasold, & Tsavoussis, 2003), as well as school belonging and expectations (Characterplus, 2002).
The results of the impact analyses “indicated that grade 4 and 5 students who attended schools in the LIC intervention group did not exhibit higher scores on measures of academic achievement or on measures of social competence after two academic years of potential LIC exposure than grade 4 and 5 students who attended schools in the control group. Nor did intervention group students score lower on measures of problem behaviors. Moreover, the intermediate impact analyses indicated that there were no statistically significant LIC impacts on the school environment measures of school expectations and student feelings of belonging. In addition, although participating teachers in intervention group schools reported implementing fewer core and supplementary lessons during the second year than during the first year, exploratory analyses suggested that there were no statistically significant LIC impacts on grade 4–5 student outcomes or on measured school environment outcomes after the first year of program implementation” (Hanson et al., 2012, p. 71). In their final comments the authors attribute the lack of effects to the already relatively low-dose nature of the program and to the implementation problems that were noted. The viability of the approach as such was not questioned.
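The implementation figures reported above can be put side by side with a little arithmetic. The sketch below expresses the average number of core lessons taught as a share of the 19 lessons teachers were asked to implement and of the 25 lessons in the core curriculum. The lesson counts are taken from the text; the variable and function names are hypothetical and serve only this illustration.

```python
CORE_LESSONS_AVAILABLE = 25   # size of the LIC core curriculum
CORE_LESSONS_REQUESTED = 19   # lessons teachers were asked to implement

# Average number of core lessons actually implemented per teacher.
lessons_implemented = {"year 1": 12.40, "year 2": 9.56}


def implementation_share(implemented: float, target: float) -> float:
    """Fraction of the target number of lessons that was delivered."""
    return implemented / target


for year, implemented in lessons_implemented.items():
    share_requested = implementation_share(implemented, CORE_LESSONS_REQUESTED)
    share_available = implementation_share(implemented, CORE_LESSONS_AVAILABLE)
    print(f"{year}: {share_requested:.0%} of requested, "
          f"{share_available:.0%} of all core lessons")
```

Read this way, teachers delivered on average about two-thirds of the requested core lessons in year 1 and about half in year 2, which is consistent with the authors' remark about the low dose of the program as implemented.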

5.4.2 Comer’s School Development Program (Cook et al., 1999)
The philosophy of James Comer’s approach is that a wide range of skills can be enhanced through an intervention that initially seeks to improve the interpersonal relationships and social climate in preparation for enhancing the academic focus. Comer “pays special attention to improving a school’s social climate under the assumption that such improvements will reduce the teacher-student culture gap, will help students acquire some of the middle-class interaction habits that teachers (and employers) value so highly, and will make learning easier because students feel safer in school and trust the staff more” (p. 545). The program consists of three cooperative structures, for programming and planning, social support, and parent engagement. Next, the program has three “process principles”. The first process principle is that the various adult groups within the building should cooperate with each other, always putting student needs above their own. The second process principle is that the school should operate with a problem-solving rather than a fault-finding orientation. The third is that decisions should be reached by consensus rather than vote (ibid, p. 545).
Cook et al. (1999, p. 547) state that “The theoretical explication of the School Development Program in Anson et al. (1991) forms the backbone of this article. It postulates a linear sequence of causal relationships. Program implementation comes first, and it has to do with how the three teams operate, how widely governance is suffused throughout the school, how Comer’s process goals are disseminated, and how much parent involvement is enhanced. Given quality implementation, the school’s climate should improve next, particularly the more social dimensions of climate that involve the quality of interpersonal relationships among staff, among students, and between teachers and students.
This improved climate should then lead students to feel better about themselves and to behave in ways that are closer to middle-class mainstream norms. Higher grades and test scores should result next, presumably because of the improved school climate and because adopting more mainstream values includes valuing achievement more highly”. The authors comment that, in their judgment, it is not entirely clear in the theory of the School Development Program just what the mechanisms are for transferring changes in a school’s social climate to changes in its students’ academic performance. Reconstructing the program theory behind Comer’s approach, they arrive at three hypotheses. The first hypothesis is that a better social climate will improve students’ social and personal well-being. A second important hypothesis is that a school’s social climate will have a positive impact on students’ academic performance, most likely after it has already influenced their mental health and social behavior. The third hypothesis intends to test whether combining the academic and social climate dimensions leads to additive (or even more complex) positive effects across a wide range of student outcomes (ibid, p. 549). Interestingly, the evaluators pay attention to a possible “counter theory” that would predict a negative impact of investing a lot of energy in improvement of the social
climate and cooperative structures, because it might lead to smaller investment in established conditions of effective instruction. “Achievement is related to students’ opportunities to learn and be recognized for learning and to the opportunities teachers have to use curriculum materials and instructional strategies that are meaningful to children (and to participate in professional development activities that emphasize how to stimulate higher academic achievement)” (ibid, p. 548). Cook et al. (1999) conducted an evaluation study of Comer’s School Development Program by means of a 4-year randomized experiment in 23 middle schools in Prince George’s County, Maryland, involving repeated measurements with more than 12,000 students and 2000 staff, a survey of more than 1000 parents, and extensive access to student records. The school population was predominantly African American, with considerable internal variation in household socioeconomic standing.
Measures
Multifaceted measures of school climate, staff climate and academic climate were developed. Student moderator and outcome measures included administrative data from school records and standardized achievement tests, among others children’s third- and fifth-grade CAT scores, and children’s Maryland State Readiness Test scores in math at the beginning of seventh grade and again either early in ninth grade or at the end of eighth grade (p. 557). Next, several student outcome constructs were measured by questionnaires: “In the psychological well-being domain, assessments were made of five constructs: self-efficacy as a student, satisfaction with self, anger control, depression, and ethnic pride.
Also assessed were aspects of conventional social behavior, including participation in extracurricular activities that adults traditionally consider to be wholesome; the degree to which friends disapprove of drugs and are engaged in positive activities; the extent to which the student disapproves of various forms of misbehavior and values temper control, community participation, and achieving mainstream adult outcomes; and the importance the student attaches to his or her friends. Negative behavior was also assessed, including the number of petty offenses committed; the use of tobacco, alcohol, marijuana, and other more serious illicit drugs; and the antisocial nature of friends’ activities (including their sexual behavior, since we were not allowed to ask about individual sexual practices)” (p. 558). Reliabilities of the respective scales are indicated in an Appendix to the paper.
Evaluation results
The Comer manipulation had no clear effects on either student or staff perceptions of the school climate, although many small relationships were in the expected direction (p. 565). No responsible answer is possible to the question of how being a Comer-like school affects the more social dimensions of school climate (p. 573), because of a lack of discriminant validity between the implementation index and the social climate measures. The best schools for improving test scores were those with the highest academic focus and the least good social climate. Thus, there is no evidence that combining a
better social and academic climate qualitatively transforms a school. In the discussion the authors offer the following further interpretation of this outcome. “If this finding were to be replicated at other Comer sites, it would suggest that the program’s social emphasis can have negative side effects for achievement. It is not clear why this should be the case, but obvious possibilities are that concentrating on interpersonal relationships and school-level structures somehow reduces students’ opportunities to learn or somehow impedes teachers’ opportunities to tailor their instruction to specific local needs or to engage in in-service training that stimulates their knowledge of effective instructional practices” (p. 577). The authors summarize the conclusion as follows: “Results showed that (a) being in the official School Development Program did not affect school climate or student outcomes and that (b) this may have been because of the very mixed quality with which the program was implemented. Nonexperimental analyses suggested that (c) schools with procedures like those specified in Comer’s theory may cause positive changes in social behavior and psychological adjustment but (d) they do not improve math scores and may even lower them.” (p. 579).
Cook et al.’s evaluation study of the Comer program is interesting for a number of reasons:
– It is one of the few SEL-related evaluation studies at (lower) secondary level.
– It offers a reconstruction of the program theory that tries to understand the mechanisms that are assumed when social emotional development is seen as a preliminary step in enhancing academic outcomes.
– It deals in depth with implementation issues of SEL-oriented programs.
– It is one of the rare studies that take alternative model interpretations into consideration, i.e. that attention to SEL might be detrimental to academic achievement.
– It connects evidence from instructional effectiveness research to programs on social emotional learning, also in the sense that there might be trade-offs between the two.

5.4.3 Positive Action (PA)
Key-Reference: Snyder, Flay, Vuchinich, Washburn and Beets 2010
Introduction
PA is described as a comprehensive schoolwide social-emotional and character development (SACD) program. In 2007 it was recognized as the only “character education” program in the nation to meet the evidentiary requirements for improving both academics and behavior by the What Works Clearinghouse. The meta-analysis by Corcoran et al. (2018) marked evaluation studies of the Positive Action Program as having effect sizes on academic outcomes that were considerably higher than those of other Social Emotional Learning programs. This was the first reason to analyze PA and its evaluations in somewhat
more detail. The second reason to do so is the explicit theoretical and conceptual background of the program.
Theoretical background
The theory of triadic influence, abbreviated as TTI, has the ambition of being an integrative theory about the causation of health-related behavior (Flay, Snyder, & Petraitis, 2009). The Positive Action Program is inspired by the TTI, as education is seen as one of society’s health-promoting fields. The TTI is a comprehensive framework that tries to capture the complexity of influences on health-related behavior. Its main components are:
– Three streams: the personal stream, the social stream and the environmental stream. The personal stream covers the domain of biology and personality, the social stream the social situation, and the environmental stream the cultural environment.
– Seven levels that express a continuum running from ultimate, innate general characteristics (level 1) to specific behavior (level 7). The intermediary levels of the personal stream are two “distal predisposing influences” (social personal nexus, level 2, and evaluations and expectations, level 3), and three “proximal influences” (affect and cognitions, level 4, decisions and intentions, level 5, and “trial behaviors and experiences”, level 6) (cf. Flay et al., 2009, p. 455, Fig. 16.1).
– A set of interrelationships, some of them seen as direct causal links, others as feedback loops. These are visualized in the paper’s Fig. 12.2 (ibid.). Major influences run from high to low, i.e. from the more ultimate underlying causes to the distal and proximal influences, and horizontally, from the environmental and social streams to the personal stream, visualizing social and cultural (value-driven) influences on personal dispositions. The latter interpretation (the personal nested in the social context and the larger cultural environment) conforms to Bronfenbrenner’s well-known multi-level ecological model.
Most interesting are influences that run from low to high, to some degree interpretable as feedback, symbolizing learning from experience. Consistent with the definition of ultimate underlying causes, like personality characteristics, the upward moving influences only reach up to level 2, and do not touch level 1. An interesting, perhaps paradoxical claim of the theory is its emphasis on the importance of the “bottom up” influences. “Future Health Programs need to focus less on the micro levels of causation and more on the distal and ultimate levels” (ibid., p. 493). The motive is that “a focus on distal and ultimate levels will lead to programs that are more efficient because they change multiple behaviors at once”, but the inbuilt constraint of the theory is that these higher levels are more difficult to change. Closer reading reveals that bottom up influences of this kind are mentioned with respect to the social and environmental/cultural sphere, but no examples in the personal sphere are given, i.e. influencing distal predisposing factors like “sense of self control” and “social competence” by means of more proximal experiences and behavior. The hierarchical structure of the Positive Action conceptual framework touches on a recurring theme of this book, namely a supposed continuum from general
and innate personality traits to behavior and speculation about intermediary constructs like dispositions, behavioral habits and skills (see Chaps. 2 and 3). Below, we will turn to the way this overarching theory is applied in the conceptualization of the Positive Action program.
Program theory of the Positive Action program
The program rationale for the Positive Action program (Flay & Allred, 2010) retains the comprehensive (or “holistic”) scope of the theory of triadic influence, while focusing on schooling. In the process, some features of the original theory obtain more emphasis. This is particularly the case for the normative and value-oriented facets of the TTI (theory of triadic influence). On this facet the Positive Action program takes its inspiration from the Positive Psychology movement (e.g. Seligman & Csikszentmihalyi, 2000). Flay and Allred (2010) distinguish three core elements of the program: a specific theory about “self-concept”, the “thoughts–feelings–action” cycle and the actual program content, which is described as teaching “specific positive actions for the whole self: the physical, intellectual, social, and emotional areas” (p. 476). The theory about self-concept states “that people determine their self-concepts by what they do; that actions, more than thoughts or feelings, determine self-concept; and that making positive and healthy behavioral choices results in feelings of self-worth/esteem” (ibid, p. 476). The “thoughts–actions–feelings about self” cycle is based on “the intuitive idea that ‘You feel good about yourself when you do positive actions and there is always a positive way to do everything’.” (ibid.). The cycle can be positive or negative. In order to stimulate positive development, values are of central importance. The reasoning is that once the intention is to do good, intrinsic motivation will follow, which in its turn is expected to lead to more sustained learning.
The actual program theory of PA is described as a set of intended components, implementation decisions and actual practice. At this operational level the program is structured as a set of teaching inputs (PA units), student level mediators and outcomes. At the intended level, program components, school and classroom SSLL (skills for successful learning and living) practices and school and classroom climate are distinguished. Program components are teacher/staff training, climate development, “Pre K–12 instruction”, curriculum, a counseling kit, and a family/classes/community kit. School and classroom practices are indicated as social and character development strategies by teachers, administrators, other school staff and parents/caretakers. School climate conditions are indicated in terms of improved leadership and relationships among school personnel, parents and students, increased academic orientation and increased parental involvement. Classroom climate is indicated as improved emotional and instructional climate; increased teacher attitudes, skills and behaviors supportive of social, academic and cultural development of students (ibid, Fig. 28.2, p. 479). At the operational level an input-mediator-outcome model is sketched with the curricular elements of the program as input, a set of student-level cognitive and affective attributes as mediators, and student outcomes, further specified as improved
behavior, health, and learning skills, as well as school attendance, grades, and standardized achievement test scores. Given the focus of this book, the position of the student mediators in the PA program theory is particularly interesting. They comprise academic and social skills, self-esteem, attitudes and moral values, and are seen as influenced both by “given” child and family characteristics and by the curricular inputs. As behavioral tendencies and “skills” they are not purely “intermediary outcomes”, because they also depend on given characteristics, and are thus not thought of as fully malleable. The operational outcomes of the PA program are all stated in behavioral terms: improved behavior, attendance, reduced substance use and academic outcomes, with “improved study skills” as perhaps the only exception. In earlier chapters we made a distinction between various positions of social emotional skills in educational programs. Social and emotional skills could either be seen as instrumental to academic and other outcomes, or as ends in themselves. The above reflection on the PA program theory indicates that in this program the instrumental position of the social emotional “attributes” predominates. This might explain why program evaluations of the PA program, so far, have paid relatively little attention to the measurement and effect measures of social emotional attributes and skills (see subsequent sections). Finally, the intended comprehensiveness of the program in addressing academic, social and moral development should be mentioned once more. This is how Flay and Allred (2010, p. 478) describe how PA “works for academics”: “Positive Action creates an intellectually stimulating learning environment and helps students retain academic lessons by applying them to real-life situations. The lessons also inspire students to value learning and education, and to engage in setting personal goals for a happy and successful life”.
Program structure
A concise description of PA’s program structure is given by Flay and Allred (2003): “The PA program includes a detailed curriculum with almost daily lessons, a schoolwide climate program, and family- and community involvement components, each of which uses research-proven educational strategies and methods such as active learning and positive classroom management. The program has goals and components for each of the individual, family, school, and community levels. Central to all components of the program are 6 program units (Table 1): (1) self-concept; (2) positive actions for one’s mind and body; (3) managing oneself responsibly; (4) getting along with others; (5) being honest with oneself and others; and (6) improving oneself continuously. Schools integrate the program units in a scoped-and-sequenced classroom curriculum and a school-climate program. The K–6 classroom curriculum consists of over 140 lessons per grade. Using teacher’s kits (that include teacher’s manuals and all materials needed for all activities for a whole class), classroom teachers present 15- to 20-min lessons almost every day. Scripted lessons are completely prepared and teacher friendly, employing a variety of methodologies and addressing different learning styles. Activities include stories, role playing, modeling, games, music, questions/answers, activity booklets and sheets, posters, and
manipulatives. The program content teaches students how to use positive actions, to recognize feeling good about themselves, to manage themselves (including thoughts, actions, and feelings), and to treat others the way they want to be treated. The school-climate program encourages and reinforces the practice of positive actions schoolwide and extends the program to families and the community. For each school, a principal’s kit provides directions for a school-climate program to promote the practice and reinforcement of positive actions in the entire school. It also includes parent- and community-involvement activities. The parent program includes coordinated weekly lessons and links the family to the school activities. The family kit contains a manual with 42 multi-age, weekly lessons based on the 6 units and 6 review lessons with enough materials for 6 individuals. This kit coordinates family activities with the PA school curriculum and school-climate activities. It contains all the materials required in the lessons: colorful posters and visuals, hands-on materials, activity worksheets, and music. It contains Words of the Week and the “ICU Doing Something Positive Box” like those used in the school. The community program includes a community kit and combines with the school and parent programs to align all the environments (schools, families, and community) involved in the program. The community kit includes a guide, the Positive Actions for Living text, music CDs and books, family kits, and other materials. It provides community leaders, public servants, social service workers, and business executives with the tools to plan and cultivate positive actions in every aspect of the community while encouraging development in every aspect of the individual citizen”. (ibid., S8, S9) Illustrations of sample lessons from pre-kindergarten to high school 4 are obtainable via the following link: https://www.positiveaction.net/sample-lessons#kindergarten. 
Early evaluation studies

Quasi-experimental (matched pair) evaluations of Positive Action took place in 2001 (Flay, Allred and Ordway), 2003 (Flay and Allred), and 2006 (Flay, Acock, Vuchinich and Beets). Results, in terms of effect sizes on measures of academic achievement and other outcomes, are summarized in Table 5.1.

Table 5.1 Results of early evaluations of the Positive Action program

Study | Reading
Flay, Allred and Ordway (2001) | 0.48
Flay and Allred (2003) | 1.32
Flay, Acock, Vuchinich and Beets (2006) | 0.73

Math. | 0.34
Science |
SEL outcomes | 0.26; 0.57 treatment effect on violence; Retention 0.63; Suspensions 0.71; Daily absence over 4 years 0.55; Student attitudes towards positive behavior 0.42

It is notable that program effects measured as social emotional outcomes do not feature prominently in the early evaluation studies of PA (nor in the more recent
2010 evaluation study, documented below). To the extent that teacher and student self-reports on social emotional skills are included, they are focused on behavioral attitudes. Still, the 2006 evaluation mentions behavioral checklists administered by teachers that cover a broader realm of social emotional and intellectual attributes. The scales were apparently self-constructed and adapted from Edelbrock and Achenbach (1984) and Hightower et al. (1986). The following self-developed scales are mentioned: (1) Self-concept Neg. (2) Physical Pos. (3) Physical Neg. (4) Intellectual Pos. (5) Intellectual Neg. (6) Responsible (7) Self Control (8) Disruptive (9) Considerate (10) Social (11) Honesty Positive (12) Honesty Negative (13) Self-Improvement (14) Avoid substance use (15) Substance use. When discussing the program theory of PA in a previous section we mentioned the intermediary, mediating position of student characteristics, depicted as partly “given” and partly influenced by program inputs. This might explain why the scales that were used in the 2006 evaluation studies are not presented as outcome measures. The research techniques that were applied do not include structural equation modelling, which would be appropriate for dealing with mediation and indirect effects.

The 2010 evaluation by Snyder, Flay, Vuchinich, Washburn and Beets

This study, like the other evaluation studies of PA discussed above, stood out for its above-average effect sizes on mathematics and reading performance in Corcoran et al.’s meta-analysis of social emotional learning programs (Corcoran et al., 2018). This is reason to describe the evaluation study in somewhat more detail.

Research methods

The PA Hawai’i trial was a matched-pair, cluster-randomized, controlled trial, conducted during the 2002–03 through 2005–06 school years, with a 1-year follow-up in 2007, in Hawai’i public elementary schools (K–5 or K–6). 
The trial took place in 20 elementary schools (10 matched pairs) on three Hawai’i islands. Absenteeism, suspensions, retention in grade, and four academic achievement indicators served as the dependent variables for the study; these were chosen because they were the publicly available indicators of school performance. These school-level indicators were collected on a yearly basis and were available for the study in the period between 2002 (baseline) and 2007. Achievement data were collected by means of standardized achievement tests, e.g. the Stanford Achievement Test (SAT). The other school-level indicators used in this study included (a) absenteeism (average number of days absent per year), (b) suspensions (percentage suspended), and (c) retentions (percentage retained in grade, i.e., kept back a grade). Program implementation was controlled on the basis of several indicators. The aggregation level at which data were analyzed was the school level (n = 20). “For data analysis the matched-paired t-tests, Hedges’ adjusted g as a measure of effect size and percentage relative improvement (RI) were used. To assess the robustness of results, permutation tests and random-intercept growth curve models were used for sensitivity analyses” (35).

Table 5.2 Summary of effect sizes (differences between control and experimental schools), from Snyder et al. (2010)

Subject/behavior | ES post program | ES 1 year follow up
Standardized test math | 0.50 | 0.52
Standardized test reading | −0.58 | −0.54
Absenteeism | −0.63 | −0.65
Suspensions | −0.96 | −0.78
Retentions (grade repetitions) | −0.84 | −1.08

Second, effect sizes for absenteeism, suspensions, retentions, and each of the four achievement outcomes were calculated by subtracting the mean difference of control
schools from the mean difference of PA schools and dividing by the pooled posttest standard deviation.

Results

Overall, for the academic achievement outcomes, raw means for PA and control schools were statistically similar at baseline and demonstrated a clearly discernible divergence over time. Although the PA schools were well below state averages at baseline (as planned), they nearly met or exceeded the state averages for academic achievement at posttest and 1-year post trial. Overall, results indicated higher achievement and lower absenteeism and suspension outcomes for the PA schools (Table 5.2). These effect sizes should be interpreted as medium to high. The authors (Snyder et al., 2010) explain the success of the intervention as follows: “First, PA addresses distal influences on behavior in a multifaceted way; PA is a comprehensive approach that involves providing the curriculum to all grades in the school at once, involving all teachers and staff in the school, and involving parents and the community. The PA program assists students and adults to gain not only the knowledge, attitudes, norms, and skills that they might gain from other programs but also improved values, self-concept, family bonding, peer selection, communication, and appreciation of school, with the expected result of improvement in academic performance and a broad range of behaviors. These improved outcomes may occur because positive behaviors tend to correlate negatively with negative behaviors. More specifically, with regards to academic achievement, for example, PA increases positive behaviors and decreases disruptive behaviors, which in turn lead to more time on task for teaching and, in turn, more opportunity for student learning. Also, improvements in students’ positive behaviors, such as attention and inhibitory control, can lead to increased academic achievement throughout formal schooling. 
Second, PA is “interactive” in delivery, using methods that integrate teacher/student contact and communication opportunities for the exchange of ideas, and utilize feedback and constructive criticism in a nonthreatening atmosphere (Tobler et al., 2000). Third, the results observed may also have been a consequence of the intensive nature of the program, with students receiving approximately 1 h of exposure during a typical week over multiple school years” (Snyder et al., 2010, 47).
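The effect-size computation described under Research methods (the difference in mean change between PA and control schools, divided by the pooled posttest standard deviation, with Hedges’ small-sample adjustment) can be sketched as follows. This is a minimal illustration, not the authors’ code: the function name, the school-level scores and the use of n1 + n2 − 2 degrees of freedom in the correction factor are assumptions made for the example, and the matched-pair structure of the trial is ignored.

```python
from math import sqrt
from statistics import fmean, variance


def adjusted_effect_size(pa_base, pa_post, ctl_base, ctl_post):
    """Return (d, g): standardized difference in mean change and its
    small-sample-adjusted version (Hedges' g, approximate correction)."""
    # Mean change from baseline to posttest in each condition
    change_pa = fmean(pa_post) - fmean(pa_base)
    change_ctl = fmean(ctl_post) - fmean(ctl_base)

    # Pooled posttest standard deviation across the two conditions
    n_pa, n_ctl = len(pa_post), len(ctl_post)
    df = n_pa + n_ctl - 2
    pooled_var = ((n_pa - 1) * variance(pa_post)
                  + (n_ctl - 1) * variance(ctl_post)) / df

    d = (change_pa - change_ctl) / sqrt(pooled_var)
    correction = 1 - 3 / (4 * df - 1)  # assumed form of the small-sample correction
    return d, d * correction


# Hypothetical school-level mean scores, chosen only for easy arithmetic
pa_base, pa_post = [40, 42, 44], [50, 54, 58]
ctl_base, ctl_post = [40, 42, 44], [42, 44, 46]

d, g = adjusted_effect_size(pa_base, pa_post, ctl_base, ctl_post)
print(f"d = {d:.2f}, adjusted g = {g:.2f}")
```

With only a handful of schools per condition the correction factor is noticeably below 1, which is why an adjusted estimate is preferred when the unit of analysis is the school.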


The study is remarkable for several reasons.
– It shows above-average effect sizes on academic achievement and behavioral indicators.
– It shows medium to large effect sizes on academic achievement although the program was not specifically dedicated to improvement in these subjects. The authors express this as follows: “The study extends research on the ways that changing a child’s developmental status in nonacademic areas can significantly enhance academic achievement” (49). The main explanation the authors provide for the positive outcomes is the scope and intensity of the program. One aspect is the curriculum time. The objective data they present on curriculum exposure are more modest (1 hour of exposure during a typical week) than the description provided by Flay and Allred (2003), who speak of almost daily lessons.
– A particular aspect of the design is that data are analyzed at school level, which, combined with the small N (20 schools in 10 matched pairs), makes for weak conditions to check the comparability of the treatment and control groups, and possibly for inflated effect size estimates.
– More generally, and despite the explicit program theory, the evaluation study remains largely a “black box” program evaluation, not revealing evidence about the expected intermediary influences, like the influence of social emotional learning effects.
– The way the authors explain the indirect effect of improved behavior and “character” on academic achievement is rather thin (decreased interruptions, more time on task). It does not quite resolve the puzzle of why intensive programs dedicated to effective schooling and instruction, like Comprehensive School Reform programs (e.g. Borman, Hewes, Overman, & Brown, 2003), show effect sizes that are about one third of what Positive Action realizes.
– As matters stand, one can only speculate about why this program has stronger positive effects than many other SEL programs on academic outcomes. 
One hypothesis could be the combination of social emotional learning and character education with a strong normative “mission”. In school effectiveness research this has been given as an explanation for the finding that Catholic schools outperformed public schools (Dronkers, 1966). A related mechanism could be stronger consensus among teachers and stronger consistency in teaching (Creemers & Kyriakides, 2008). Yet another explanation might be the scripted nature of the program, which would likewise enhance internal alignment. An explanation closer to the core of social emotional learning would be the positive dynamics at classroom level that could result from bolstering self-efficacy and stimulating feedback (Yeager & Walton, 2011). The tentative explanation that the success of Positive Action lies in the combination of all of these would require further corroboration. Similarly comprehensive programs like PATHS did not show these overwhelmingly positive effects.
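The concern raised above about analysing only 20 schools can be made concrete with the usual large-sample approximation for the standard error of a standardized mean difference between two independent groups. The sketch below rests on assumptions: it treats the 10 PA and 10 control schools as independent groups (the trial actually used matched pairs), the function name is hypothetical, and the effect size of 0.50 is simply the post-program math value from Table 5.2.

```python
from math import sqrt


def smd_confidence_interval(d, n_treat, n_ctl, z=1.96):
    """Approximate standard error and 95% interval for a standardized
    mean difference d between two independent groups."""
    se = sqrt((n_treat + n_ctl) / (n_treat * n_ctl)
              + d ** 2 / (2 * (n_treat + n_ctl)))
    return se, d - z * se, d + z * se


# 10 schools per condition, effect size taken from Table 5.2 (math, post program)
se, low, high = smd_confidence_interval(0.50, 10, 10)
print(f"SE = {se:.2f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

Under these simplifying assumptions the interval is wide relative to the point estimate. A matched-pair analysis can be more precise than this independent-groups approximation, so the numbers are indicative only.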

5.5 Summary

In summary Table 5.3, an overview is presented of the contents of the programs that were discussed in the case descriptions of program evaluations. Program content was expressed in terms of a range of behavioral and social emotional components: avoidance and prevention of behavioral problems, enhanced self-control, communicating and understanding emotions, facilitating positive social interaction, improved social climate at school, coping and social problem solving. In two of the programs values of ‘good character’ are promoted. In all programs social emotional functioning and behavioral skills were presented in connection to school life. Although in varying degrees, all programs see development of social emotional skills in relationship to cognitive teaching and learning. In some cases, this relationship was more explicitly seen as instrumental: improved behavior and socio-emotional functioning were considered as facilitating cognitive development and academic performance. Most program evaluations used both socio-emotional outcomes and academic outcomes as effect criteria. Intervention modes were most frequently curriculum documents, like lesson programs and scripted teaching approaches, specific teacher training for the program, and sometimes adapted modes of school organization, staff cooperation and parent involvement. As far as effect sizes are concerned the selection of programs showed considerable variability, but given the small and purposeful selection, the overview cannot claim to be representative.

5.6 Discussion: How Does Social-Emotional Learning Take Effect; Interrogating Program Theories

In the sequence of chapters of this book we started out by considering a broad movement to give more prominence to non-cognitive and social emotional outcomes in education. In the initial chapters we concentrated on the nature and conceptual clarification of these types of outcomes, and next turned to fundamental questions about their malleability, among others by means of an excursion into developmental and personality psychology. When, in Chap. 4 and in this chapter, we analyzed empirical evidence on the malleability of social emotional skills by means of educational interventions, we encountered various interpretations, “models” if one likes, of means-goal associations, in research terms “process-outcome” associations. The simplest and most straightforward model would be to consider interventions on social-emotional learning as means, and improved social-emotional attributes (“skills” perhaps) as desired outcomes or goals. At first sight the bulk of the empirical studies that formed the basis for a number of meta-analyses of social-emotional learning programs seemed to conform to this simple and straightforward model. But then it appeared that many empirical studies considered other outcomes, in place of, or in addition to, social-emotional outcomes.


Table 5.3 Summary of program contents in case descriptions

Program | Program contents | Outcomes^a
Tools of the mind | The teachers’ use of scaffolding, diminishing behavioral problems and elements directly related to literacy skills | SE & Ac
PATHS | Self-control, communicating and understanding emotions, positive self-esteem, relationships and interpersonal problem-solving skills | SE & Ac
Good behavior game | Positive peer relations are theorized to prevent externalizing problems. Facilitating positive interactions between children through a team-based approach | SE & Ac
Zippy’s friends | Over the course of 24 weekly lessons, children explore themes related to emotions, communication, relations and conflict resolution through the many day-to-day problems, sorrows and joys Zippy and his friends experience as playing music, taking a walk or crying. Action-focused coping refers to everything we do to change the situation that frustrates us or makes us unhappy” | SE & Ac
The UK resilience program (based on the Pennsylvania Resilience Program, PRP) | Participants are encouraged to identify and challenge (unrealistic) negative beliefs, to employ evidence to make more accurate appraisals of situations and others’ behavior, and to use effective coping mechanisms when faced with adversity. Participants also learn techniques for positive social behavior, assertiveness, negotiation, decision-making, and relaxation | SE & Ac
Lessons in character (LIC) | The aims of the program are enhanced academic achievement and social competence and fewer behavioral problems. Additional expected outcomes are increased sense of belongingness to the schools and greater levels of school expectations consistent with character development. Throughout the training, character education was defined as the systematic, purposeful teaching of core consensus values that lead to habits of good character | SE & Ac
Comer’s school development program | The philosophy of James Comer’s approach is that a wide range of skills can be enhanced through an intervention that initially seeks to improve the interpersonal relationships and social climate in preparation of enhancing the academic focus. Process principles: within schools cooperate, problem solving rather than fault finding orientation, and decisions reached by consensus, not by vote | SE & Ac
Positive action | Lessons cover six major units on topics related to self-concept (i.e., the relationship of thoughts, feelings, and actions), physical and intellectual actions (e.g., hygiene, nutrition, physical activity, avoidance of harmful substances, decision-making skills, creative thinking), social/emotional actions for managing oneself responsibly (e.g., self-control, time management), getting along with others (e.g., empathy, altruism, respect, conflict resolution), being honest with yourself and others | Ac & At

^a Three kinds of outcomes: social-emotional (SE), academic (Ac) and attainment (e.g. absenteeism, dropout; At)

In the economic studies that were cited, social and emotional outcomes of schooling were associated with “life outcomes”, like societal well-being, earnings and position on the labor market. A considerable share of the programs that formed the empirical basis for the meta-analyses discussed in Chap. 4 assessed not only social and emotional outcomes but academic outcomes as well. The meta-analysis of social-emotional learning programs by Corcoran et al. (2018) only considered academic outcomes of these programs. The simplest interpretation of applying both social and emotional outcomes and other outcomes as effect criteria in evaluation studies of social-emotional learning programs would be to see different outcomes as disparate categories. However, the case studies of program evaluations in this chapter indicated that in quite a few studies the relationship between social and emotional skill development and academic outcomes was explicitly seen as instrumental. Similarly, it was noted that the social-emotional learning programs or “packages” frequently leaned towards cognitive processes and content as well. Some socio-emotional learning programs were attached to specific content areas, like language, and others addressed problem solving. The striking empirical finding that social-emotional learning programs frequently showed enhanced academic performance, next to progress on social-emotional attributes, would seem to favor models that incorporate an association between the two domains. In Fig. 5.1, this increased complexity in social emotional learning effect studies is schematically presented. The distinction of progressively more complex models serves two purposes. The first has to do with imperatives and challenges for designing effect studies, and the second is about providing scaffolds for theoretical understanding of how social emotional learning takes effect. 
Model 1: social-emotional learning >>> life outcomes
Model 2: social-emotional learning >>> social-emotional outcomes
Model 3: social-emotional learning >>> (social-emotional outcomes and academic outcomes)
Model 4: social-emotional learning >>> academic outcomes
Model 5: social-emotional learning >>> social-emotional outcomes >>> academic outcomes
Model 6: social-emotional learning >>> social-emotional outcomes >>> academic outcomes >>> life outcomes

Fig. 5.1 Social-emotional learning effect models

Although the two are obviously related, it is at least an option to go beyond mainstream research practice in theoretical conjectures. Although several references have addressed hypotheses on instrumental relationships between social emotional learning and academic performance, these have hardly been
borne out by empirical studies. No examples were found of studies based on models 5 or 6, but we did encounter theoretical hypotheses about such associations in the meta-analysis by Corcoran et al. (2018), and in the case study descriptions on the Comer program (Cook et al., 1999) and Positive Action (Snyder et al., 2010). In order to illustrate what we can learn from such reconstructions of the program’s theory we briefly restate how this was done for these two programs. The reconstruction of the program theory of the Comer program sketches a linear sequence of causal relationships. Program implementation comes first. Given quality implementation, the school’s climate should improve next, and this improved climate should then lead students to feel better about themselves and “to behave in ways that are closer to middle-class mainstream norms”. Next higher grades and test scores should result as a consequence of the improved school climate and “because adopting more mainstream values includes valuing achievement more highly”. The reviewers express their uncertainty on the mechanism that would explain the last link in this chain: from student feeling better about themselves to enhanced academic performance and propose a competing program theory that would predict that more attention for social emotional development would lower academic performance because it would diminish teaching time and learning opportunities. (As a matter of fact, their evaluation found that positive social emotional development was associated with lower academic performance). In the study by Snyder et al. 
(2010) on Positive Action the causal reasoning was as follows: “PA increases positive behaviors and decreases disruptive behaviors, which in turn lead to more time on task for teaching and, in turn, more opportunity for student learning.” What these reconstructions of program theory do is provide hypothetical explanations of what happens in the black box, between program initiation and outcome measurement. Our excursion into the field of educational effectiveness in Chap. 4, particularly the non-experimental studies, opened another avenue for “treatment specification”. In these studies, a set of specific “effectiveness enhancing conditions” has acquired
a more or less established conceptual framework, in which categories of independent variables have been ordered. Broad categories are teacher training and professional development, school organizational and school management interventions, creating a favorable learning environment or “climate”, as well as strategies in the realm of classroom management and instruction. Some of these independent variables are close to components of social-emotional learning programs: creating a safe and orderly school and classroom climate, providing emotional support to learners, stimulating collaboration at school and classroom level, and nurturing positive expectations. The categorization used in the meta-analyses by Korpershoek et al. (2016) on classroom management draws the perspectives from educational effectiveness research and the assessment of social-emotional learning programs together, distinguishing teachers’ behavior-focused interventions, teacher–student relationship-focused interventions, students’ behavior-focused interventions and students’ social-emotional development-focused interventions. Nonexperimental causal modeling in educational effectiveness studies is better adapted than experimental program evaluations to empirically test the more complex models 5 and 6, and so are combinations of experimental designs and causal modeling within the treatment group (Desimone & Hill, 2017). So far, we do not know of applications in the assessment of social emotional learning programs and interventions. In conclusion we could say that specification of mediating processes in programs on social and emotional learning adds to our understanding of how these programs take effect, as it is an initial step towards seeing what happens in the black box of program implementation. An additional step is to check whether more established social scientific and economic theories could explain the research outcomes and generate further research. 
Examples from mainstream educational effectiveness research are the application of micro-economic theory to explain issues of standard setting and extrinsic motivation (De Vos, 1989), a systems perspective of effective schools (Clauset & Gaynor, 1982) and alternative feedback models (Black & Wiliam, 1998). We finish this chapter by providing examples of causal analysis and theory formation more directly related to the “workings” of social-emotional interventions. In doing so we focus on the interplay of cognitive teaching and stimulation of social-emotional learning, as we are intrigued by the finding that social-emotional learning programs appear to affect both cognitive and non-cognitive outcomes. A first example, which was discussed in Chap. 4, is the finding from econometric analyses by Cunha, Heckman, and Schennach (2010), that cognitive skills may affect the accumulation of noncognitive skills and vice versa (noncognitive development affecting cognitive outcomes). In longitudinal analyses the cognitive outcomes at a particular point in time do not only affect cognitive outcomes at a next time point, but also non-cognitive outcomes at this next point in time, and the same applies for progress on noncognitive outcomes, including a “cross-over effect” on cognitive outcomes (OECD, 2015, p. 39). A second example is a study by Usher and Pajares (2008), based on Bandura’s theory on self-efficacy (Bandura, 1986). “Bandura theorized that the beliefs that people hold about their capabilities and about the outcomes of their efforts powerfully
influence the ways in which they behave. According to Bandura’s social cognitive theory, these self-efficacy beliefs help determine the choices people make, the effort they put forth, the persistence and perseverance they display in the face of difficulties, and the degree of anxiety or serenity they experience as they engage the myriad tasks that comprise their life” (Usher & Pajares, 2008, p. 751). Usher and Pajares reviewed research on four sources of self-efficacy in academic contexts:
– Previous attainment or mastery
– Vicarious experience when observing others
– Social persuasion (encouragement from teachers and peers)
– Emotional and physiological states.

The major outcomes are that previous attainment is most convincingly supported as the basic source of self-efficacy. Social comparison and persuasion have also been supported as being positively associated with self-efficacy. Still, these sources are highly correlated, for example, mastery has clear implications for the positive or negative nature of social comparison. From a didactic perspective it should be realized that the malleability of each of these four sources is different. Past performance is a given, social comparison depends on group compositions that may not be fully controllable in educational practice. Likewise, emotional and physiological states are given pre-conditions that are dependent on students’ personality. Persuasion should be seen in the context of dealing with the messages from previous achievement in terms of feedback, encouragement and guidance for next steps in learning. A third and final example addresses a broader set of social psychological interventions and delves deeper into dynamic interactions between social psychological treatments and “regular” teaching. Yeager and Walton (2011) review empirical studies that had shown that seemingly “small” social-psychological interventions in education—“that is, brief exercises that target students’ thoughts, feelings, and beliefs in and about school—can lead to large gains in student achievement and sharply reduce achievement gaps even months and years later” (ibid, p. 267). They reject the impression that such effects “depend on magic” and go about presenting an explanation. 
For example, one of the studies they refer to, by Blackwell, Trzesniewski, and Dweck (2007), found that middle school students who attended an eight-session workshop, which, metaphorically, compared the plasticity of the brain to “a muscle growing with effort”, displayed a sharp increase in math achievement for the rest of the school year, an effect not shown by students who attended a control workshop that taught them study skills (Yeager & Walton, 2011, p. 268). One of Yeager and Walton’s key assumptions is that psychological interventions do not replace traditional educational reforms but operate within the context of existing structures to make them more effective. Psychological interventions change students’ mind-sets to help them take greater advantage of available learning opportunities (p. 274). The authors assume that social psychological interventions can remove restraining forces, allowing students to take greater advantage of learning. Next to the example about convincing students of the malleability of the brain, they provide examples of interventions changing students’ attributions for academic
setback, interventions that mitigate stereotype threat, and interventions in which students are invited to describe personally important values (as a way to stimulate a positive self-image). Quite a few of these examples resemble the work on “positive expectations” in school effectiveness research. Analytically, the way these social psychological interventions work is described as identifying subjective processes of students that stimulate or restrain a positive attitude to learning, targeting and implementing social psychological interventions that speak to these restraining and stimulating processes, and “allowing students to personalize their responses to intervention materials”. Such carefully and persuasively planned “stealthy” interventions are expected to set in motion positive self-reinforcing non-recursive processes,2 which could explain the long-term effects of these initially small interventions. The authors (Yeager & Walton, 2011, p. 268) express this as follows: “As we suggest below, a key to understanding the long-lasting effects of social-psychological interventions is to understand how they interact with recursive processes already present in schools, such as the quality of students’ developing relationships with peers and teachers, their beliefs about their ability, and their acquisition of academic knowledge. It is by affecting self-reinforcing non-recursive processes that psychological interventions can cause lasting improvements in motivation and achievement even when the original treatment message has faded in salience”. For further explanation they refer to Cohen, Garcia, Purdie-Vaughns, Apfel, and Brzustoski (2009). Yeager and Walton give suggestions for the appropriate timing of the social psychological interventions, and for “scaling up” (generalized application) of the improved behavior and attitudes. They suggest delivering psychological interventions at key educational junctures, such as the beginning of an academic year. 
Scaling up is a matter of guided application in diverse settings. The study by Jaeger and Walton reflects two features that are most interesting for embedding social-emotional learning in the every-day school context. Firstly, social psychological learning is targeted to the social and emotional facets of school life and school learning. Secondly the way they are seen as interacting with regular content related teaching offers a possible explanation for the remarkable finding that many evaluations of SEL programs and intervention showed significant improvement of academic outcomes.

Annex: Published Instruments in Program Case Descriptions

2 In causal modelling, such as path analysis and structural equation modeling, recursive paths are linear associations with uncorrelated error terms. Non-recursive models contain one or more feedback loops or reciprocal effects and may have correlated disturbances. Actually “self-reinforcing recursive processes” would be non-recursive rather than recursive.


Name instrument: Problem behaviors scale of the SSRS
Reference: Gresham, E. M., & Elliott, S. N. (1990). The Social Skills Rating System. Circle Pines, MN: American Guidance Service. Harter
“Skill”: Problem behavior
Program: Tools of the mind

Name instrument: CLASS
Reference: Pianta, R., Burchinal, M., Howes, C., and Bryant, D. M. (2005). Features of Pre-Kindergarten Programs, Classrooms, and Teachers: Do They Predict Observed Classroom Quality and Child-Teacher Interactions? Applied Developmental Science 9(3), 144–159
“Skill”: Teacher sensitivity
Program: Tools of the mind

Name instrument: Revised TOCA–R and SHP
Reference: Werthamer-Larsson, L., Kellam, S. G., & Wheeler, L. (1991). Effect of first grade classroom environment on shy behavior, aggressive behavior, and concentration problems. American Journal of Community Psychology, 19, 585–60
“Skill”: Classroom behavior students. Authority acceptance subscale (10 items). The cognitive concentration subscale (12 items) assessed concentration, attention, and work completion. Emotion regulation
Program: PATHS

Name instrument: The teacher-based Preschool and Kindergarten Behavior Scale (PKBS)
Reference: Merrell, Kenneth W. (1996). Socio-Emotional Assessment in Early Childhood: The Preschool and Kindergarten Behaviour Scales. Journal of Early Intervention 20, 132–45. https://doi.org/10.1177/105381519602000205
“Skill”: Social skills and problem behavior
Program: PATHS

Name instrument: The Head Start Competence Scale (HSCS)
Reference: Domitrovich, Celene E., Rebecca C. Cortes, and Mark T. Greenberg. (2001). Head Start Competence Scale Technical Report. Unpublished manuscript, Pennsylvania State University
“Skill”: Social and emotional skills reflecting interpersonal relationships and emotion regulation
Program: PATHS

Name instrument: Emotional Awareness Scale for Children (LEAS-C)
Reference: Bajgar, Jane, Joseph Ciarrochi, Richard Lane, and Frank P. Deane. (2005). Development of the Levels of Emotional Awareness Scale for Children (LEAS-C). British Journal of Developmental Psychology 23, 569–86. https://doi.org/10.1348/026151005x35417
“Skill”: The complexity of children’s emotional awareness
Program: PATHS

Name instrument: The child-based Difficulties in Emotion Regulation Scale (DERS)
Reference: Gratz, Kim L., and Lizabeth Roemer. (2004). Multidimensional Assessment of Emotion Regulation and Dysregulation: Development, Factor Structure and Initial Validation of the Difficulties in Emotion Regulation Scale. Journal of Psychopathology and Behavioural Assessment 26, 41
“Skill”: Deficits of emotion regulation
Program: PATHS

Name instrument: Bryant’s Empathy Index
Reference: Bryant, Brenda K. (1982). An Index of Empathy for Children and Adolescents. Child Development 53(2), 413–25. https://doi.org/10.2307/112898. De Wied, Minet, Cora Maas, Stephanie Van Goozen, Marjolijn Vermande, Rutger Engels, Wim Meeus, Walter Matthys, and Paul Goudena. (2007). Bryant’s Empathy Index: A Closer Examination of its Internal Structure. European Journal of Psychological Assessment 23, 99–104. https://doi.org/10.1027/1015-5759.23.2.99
“Skill”: Children’s affective sharing of others’ emotions
Program: PATHS

Name instrument: Problem Behavior at School Interview
Reference: Erasmus, M. C. (2000). Problem Behavior at School Interview. Rotterdam, the Netherlands: Author
“Skill”: Teacher ratings of externalizing behavior
Program: Good behavior game (GBG)

Name instrument: The network analysis software program UCINET (Version 6)
Reference: Borgatti, S. P., Everett, M. G., & Freeman, L. C. (2002). UCINET for Windows: Software for social network analysis. Harvard, MA: Analytic Technologies
“Skill”: Proximity to others
Program: GBG

Name instrument: The 11-item Social Problems scale of the Teacher’s Report Form
Reference: Achenbach, T. M. (1991). Manual for Teacher’s Report Form and 1991 Profile. Burlington, VT: University of Vermont, Department of Psychiatry. Verhulst, F. C., van der Ende, J., & Koot, H. M. (1997). Handleiding voor de Teacher’s Report Form [Manual for the Teacher’s Report Form]. Rotterdam, the Netherlands: Sophia Kinderziekenhuis/Academisch Ziekenhuis Rotterdam/Erasmus Universiteit Rotterdam
“Skill”: Teacher ratings of social problems in kindergarten
Program: GBG

Name instrument: The Kidcope questionnaire based on stress-coping theory for adults
Reference: Spirito, A., Stark, L. J., & Williams, C. (1988). Development of a brief coping checklist for use with pediatric populations. Journal of Pediatric Psychology, 13(4), 555–574. https://doi.org/10.1093/jpepsy/13.4.555
“Skill”: Coping with stress
Program: Zippy’s friends

Name instrument: The extended Norwegian version of the Strengths and Difficulties Questionnaire (SDQ), parent and teacher form
Reference: Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A research note. Journal of Child Psychology and Psychiatry and Allied Disciplines, 38(5), 581–586. https://doi.org/10.1111/j.1469-7610.1997.tb01545.x
“Skill”: Mental health: emotional symptoms, conduct problems, hyperactivity/inattention, peer problems and prosocial behavior
Program: Zippy’s friends

Name instrument: Children’s Depression Inventory (CDI)
“Skill”: Symptoms of depression
Program: The UK resilience program

Name instrument: The Revised Children’s Manifest Anxiety Scale (RCMAS)
“Skill”: Symptoms of anxiety
Program: The UK resilience program

Name instrument: Huebner Brief Multidimensional Students’ Life Satisfaction Scale
“Skill”: Life satisfaction
Program: The UK resilience program

Name instrument: Social Skills Rating System teacher reports
Reference: Gresham, E. M., & Elliott, S. N. (1990). The Social Skills Rating System. Circle Pines, MN: American Guidance Service. Harter
“Skill”: Social skills
Program: Lessons in Character (LIC)

Name instrument: Student altruism
Reference: Characterplus. (2002). Evaluation resource guide: tools and strategies for evaluating a character education program. St. Louis, MO: Characterplus
“Skill”: Altruism
Program: LIC

Name instrument: Aggression
Reference: Orpinas, P. and Frankowski, R. (2001). The aggression scale: a self-report measure of aggressive behavior for young adolescents. Journal of Early Adolescence, 21(1), 50–67
“Skill”: Aggression
Program: LIC

Name instrument: Delinquent behavior
Reference: Kisker, E., Kalb, L., Miller, M., Sprachman, S., Carey, N., Schochet, P., and James-Burdumy, S. (2004). Social and character development research program evaluation: supporting statement for request for OMB approval of SACD evaluation. Princeton, NJ: Mathematica Policy Research
“Skill”: Delinquent behavior
Program: LIC

Name instrument: School belonging and expectations
Reference: Characterplus. (2002). Evaluation resource guide: tools and strategies for evaluating a character education program. St. Louis, MO: Characterplus
“Skill”: School belonging and expectations
Program: LIC

Name instrument: In the psychological well-being domain, assessments were made of five constructs. No published instruments mentioned
“Skill”: Self-efficacy as a student, satisfaction with self, anger control, depression, and ethnic pride
Program: Comer’s school development program

Name instrument: The Teacher Child Rating Scale, served as a basis for program developed scales
Reference: Hightower, A. D., Work, W. C., Cowen, E. C., Lotyczewski, B. S., Spinwell, A. P., Guare, J. C., and Rohrbeck, C. A. (1986). The Teacher Child Rating Scale: A brief objective measure of elementary children’s school problem behaviors and competencies. School Psychology Review, 15(3), 393–409
“Skill”: 1 Self-concept neg.; 2 Positive physical pos.; 3 physical action neg.; 4 intellectual pos.; 5 intellectual neg.; 6 responsible; 7 self control; 8 disruptive; 9 considerate; 10 social; 11 honesty positive; 12 honesty negative; 13 self-improvement; 14 avoid substance use; 15 substance use

References

Achenbach, T. M. (1991). Manual for teacher’s report form and 1991 profile. Burlington, VT: University of Vermont, Department of Psychiatry.
Anson, A., Cook, T. D., Habib, F., Grady, M. K., Haynes, N., & Comer, J. (1991). The Comer school development program: A theoretical analysis. Journal of Urban Education, 26(1), 56–82.
Bajgar, J., Ciarrochi, J., Lane, R., & Deane, F. P. (2005). Development of the levels of emotional awareness scale for children (LEAS-C). British Journal of Developmental Psychology, 23, 569–586.
Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory. Englewood Cliffs, NJ: Prentice Hall.
Barnett, W. S., Jung, K., Yarosz, D. J., Thomas, J., Hornbeck, A., Stechuk, R., et al. (2008). Educational effects of the tools of the mind curriculum: A randomized trial. Early Childhood Research Quarterly, 23(3), 299–313. https://doi.org/10.1016/j.ecresq.2008.03.001.
Bierman, K. L., Coie, J. D., Dodge, K. A., Greenberg, M. T., Lochman, J. E., McMahon, R. J., et al. (2010). The effects of a multiyear universal social-emotional learning program: The role of student and school characteristics. Journal of Consulting and Clinical Psychology, 78(2), 156–168. https://doi.org/10.1037/a0018607.
Black, P., & Wiliam, D. (1998). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80(2), 139–148.
Blackwell, L. S., Trzesniewski, K. H., & Dweck, C. S. (2007). Implicit theories of intelligence predict achievement across an adolescent transition: A longitudinal study and an intervention. Child Development, 78(1), 246–263. https://doi.org/10.1111/j.1467-8624.2007.00995.x.
Borgatti, S. P., Everett, M. G., & Freeman, L. C. (2002). UCINET for windows: Software for social network analysis. Harvard, MA: Analytic Technologies.
Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125–230. https://doi.org/10.3102/00346543073002125.
Brunwasser, S. M., Gillham, J. E., & Kim, E. S. (2009). A meta-analytic review of the Penn resiliency program’s effect on depressive symptoms. Journal of Consulting and Clinical Psychology, 77(6), 1042–1054. https://doi.org/10.1037/a0017671.
Bryant, B. K. (1982). An index of empathy for children and adolescents. Child Development, 53(2), 413–425.

Challen, A., Noden, P., West, A., & Machin, S. (2011). UK resilience program evaluation. Final report (DFE-RR097). Retrieved from https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/182419/DFE-RR097.pdf.
Characterplus. (2002). Evaluation resource guide: Tools and strategies for evaluating a character education program. St. Louis, MO: Characterplus.
Clauset, K. H., & Gaynor, A. K. (1982). A systems perspective on effective schools. Educational Leadership, 40(3), 54–59.
Cohen, G. L., Garcia, J., Purdie-Vaughns, V., Apfel, N., & Brzustoski, P. (2009). Recursive processes in self-affirmation: Intervening to close the minority achievement gap. Science, 324(5925), 400–403. https://doi.org/10.1126/science.1170769.
Collaborative for Social and Emotional Learning. (2015). CASEL guide: Effective Social and emotional learning programs. Retrieved from http://secondaryguide.casel.org/casel-secondaryguide.pdf.
Conduct Problems Prevention Research Group. (1998). Technical report for the social health profile. Retrieved from http://sanford.duke.edu/centers/child/fasttrack/techrept/s/shs/shs3tech.pdf.
Cook, T. D., Habib, F., Phillips, M., Settersten, R. A., Shagle, S. C., & Degirmencioglu, S. M. (1999). Comer’s school development program in Prince George County, Maryland: A theory-based evaluation. American Educational Research Journal, 36(3), 543–549.
Corcoran, R. P., Cheung, A., Kim, E., & Xie, C. (2018). Effective universal school-based social and emotional learning programs for improving academic achievement: A systematic review and meta-analysis of 50 years of research. Educational Research Review, 25, 56–72. https://doi.org/10.1016/j.edurev.2017.12.001.
Creemers, B. P. M., & Kyriakides, L. (2008). The dynamics of educational effectiveness: A contribution to policy, practice and theory in contemporary schools. New York: Routledge.
Cunha, F., Heckman, J., & Schennach, S. (2010). Estimating the technology of cognitive and noncognitive skill formation. Econometrica, 78(3), 883–931. http://www.econometricsociety.org/tocs.asp.
De Wied, M., Maas, C., Van Goozen, S., Vermande, M., Engels, R., Meeus, W., et al. (2007). Bryant’s empathy index: A closer examination of its internal structure. European Journal of Psychological Assessment, 23, 99–104.
Desimone, L. M., & Hill, K. L. (2017). Inside the black box: Examining mediators and moderators of a middle school science intervention. Educational Evaluation and Policy Analysis, 39(3), 511–536. https://doi.org/10.3102/0162373717697842.
Domitrovich, C. E., Cortes, R. C., & Greenberg, M. T. (2007). Improving young children’s social and emotional competence: A randomized trial of the preschool “PATHS” curriculum. The Journal of Primary Prevention, 28(2), 67–91. https://doi.org/10.1007/s10935-007-0081-0.
Dronkers, J. (1996). Dutch public and religious schools between state and market. A balance between parental choice and national policy? In D. Benner, A. Kell, & D. Lenzen (Eds.), Bildung zwischen Staat und Markt. Beiträge zum 15. Kongreß der Deutschen Gesellschaft für Erziehungswissenschaft vom 11.-13. März 1996 in Halle an der Saale (S. 51–66). Weinheim u.a.: Beltz 1996 (Zeitschrift für Pädagogik, Beiheft; 35).
Edelbrock, C. S., & Achenbach, T. M. (1984). The teacher version of the child behavior profile: I. Boys aged 6–11. Journal of Consulting and Clinical Psychology, 52(2), 207–217. http://doi.org/10.1037/0022-006X.52.2.207.
Erasmus, M. C. (2000). Problem behavior at school interview. Rotterdam, the Netherlands: Afdeling.
Flay, B. R., Acock, A., Vuchinich, S., & Beets, M. (2006). Progress report of the randomized trial of positive action in Hawaii: End of third year of intervention. Retrieved from https://www.researchgate.net/publication/224942204.
Flay, B. R., & Allred, C. G. (2003). Long-term effects of the positive action® program. American Journal of Health Behavior, 27(Suppl1), S6–S21. https://doi.org/10.5993/AJHB.27.1.s1.2.
Flay, B. R., & Allred, C. G. (2010). The positive action program: Improving academics, behavior, and character by teaching comprehensive skills for successful learning and living. In T. Lovat, R. Toomey, & N. Clement (Eds.), International research handbook on values education and student wellbeing (pp. 471–501). Springer Science + Business Media. https://doi.org/10.1007/978-90-481-8675-4_28.
Flay, B. R., Allred, C. G., & Ordway, N. (2001). Effects of the positive action program on achievement and discipline: Two matched-control comparisons. Prevention Science, 2(2), 71–89. https://doi.org/10.1023/A:1011591613728.
Flay, B. R., Snyder, F. J., & Petraitis, J. (2009). The theory of triadic influence. In R. J. DiClemente, R. A. Crosby, & M. C. Kegler (Eds.), Emerging theories in health promotion practice and research (2nd ed., pp. 451–510). San Francisco, CA: Jossey-Bass.
Funk, J., Elliot, R., Bechtoldt, H., Pasold, T., & Tsavoussis, A. (2003). The attitudes toward violence scale: Child version. Journal of Interpersonal Violence, 18(2), 186–196. https://doi.org/10.1177/0886260502238734.
Gillham, J. E., Brunwasser, S. M., & Freres, D. R. (2008). Preventing depression in early adolescence: The Penn resiliency program. In J. R. Z. Abela & B. L. Hankin (Eds.), Handbook of depression in children and adolescents (pp. 309–322). New York, NY: The Guilford Press.
Goodman, R. (1997). The strengths and difficulties questionnaire: A research note. Journal of Child Psychology and Psychiatry and Allied Disciplines, 38(5), 581–586. https://doi.org/10.1111/j.1469-7610.1997.tb01545.x.
Goossens, F. X., Gooren, E. M. J. C., de Castro, B. O., van Overveld, K. W., Buijs, G. J., Monshouwer, K., et al. (2012). Implementation of PATHS through Dutch municipal health services: A quasi-experiment. International Journal of Conflict & Violence, 6(2), 235–248.
Gratz, K. L., & Roemer, L. (2004). Multidimensional assessment of emotion regulation and dysregulation: Development, factor structure and initial validation of the difficulties in emotion regulation scale. Journal of Psychopathology and Behavioural Assessment, 26, 41–54.
Greenberg, M. T., Kusché, C. A., Cook, E. T., & Quamma, J. P. (1995). Promoting emotional competence in school-aged children: The effects of the PATHS curriculum. Development and Psychopathology, 7(1), 117–136. https://doi.org/10.1017/S0954579400006374.
Greenberg, M. T., Kusché, C. A., & Riggs, N. (2004). The PATHS curriculum: Theory and research on neurocognitive development and school success. In J. E. Zins, R. P. Weissberg, M. C. Wang, & H. J. Walberg (Eds.), Building academic success on social and emotional learning: What does the research say? (pp. 170–188). Teachers College Press.
Gresham, E. M., & Elliott, S. N. (1990). The social skills rating system. Circle Pines, MN: American Guidance Service.
Hanson, T., Dietsch, B., & Zheng, H. (2012). Lessons in character impact evaluation. Final report (NCEE2012-4004). Retrieved from https://eric.ed.gov/?id=ED530370.
Hightower, A. D., Work, W. C., Cowen, E. C., Lotyczewski, B. S., Spinwell, A. P., Guare, J. C., et al. (1986). The teacher child rating scale: A brief objective measure of elementary children’s school problem behaviors and competencies. School Psychology Review, 15(3), 393–409.
Holen, S., Waaktaar, T., Lervåg, A., & Ystgaard, M. (2012). The effectiveness of a universal school-based programme on coping and mental health: A randomised, controlled study of Zippy’s Friends. Educational Psychology, 32(5), 657–677. https://doi.org/10.1080/01443410.2012.686152.
Jaycox, L. H., Reivich, K. J., Gillham, J., & Seligman, M. E. P. (1994). Prevention of depressive symptoms in school children. Behaviour Research and Therapy, 32(8), 801–816. https://doi.org/10.1016/0005-7967(94)90160-0.
Kautz, T., et al. (2014). Fostering and measuring skills: Improving cognitive and non-cognitive skills to promote lifetime success. OECD Education Working Papers, No. 110, OECD Publishing. https://doi.org/10.1787/5jxsr7vr78f7-en.
Kisker, E., Kalb, L., Miller, M., Sprachman, S., Carey, N., Schochet, P., et al. (2004). Social and character development research program evaluation: Supporting statement for request for OMB approval of SACD evaluation. Princeton, NJ: Mathematica Policy Research.
Korpershoek, H., Harms, T., de Boer, H., van Kuijk, M., & Doolaard, S. (2016). A meta-analysis of the effects of classroom management strategies and classroom management programs on students’ academic, behavioral, emotional, and motivational outcomes. Review of Educational Research, 86(3), 643–680. https://doi.org/10.3102/0034654315626799.
Kusché, C. A., & Greenberg, M. T. (1995). The PATHS curriculum. Seattle, WA: Developmental Research and Programs.
Lickona, T., Schaps, E., & Lewis, C. (undated). Eleven principles of effective character education. Retrieved from http://www.educationalimpact.com/resources/TeachChar/pdf/eleven_principles.pdf.
Merrell, K. W. (1996). Socio-emotional assessment in early childhood: The preschool and kindergarten behaviour scales. Journal of Early Intervention, 20, 132–145.
Mishara, B. L., & Ystgaard, M. (2006). Effectiveness of a mental health promotion program to improve coping skills in young children: Zippy’s Friends. Early Childhood Research Quarterly, 21(1), 110–123. https://doi.org/10.1016/j.ecresq.2006.01.002.
OECD. (2015). Skills for social progress: The power of social and emotional skills. OECD Skills Studies: OECD Publishing, Paris. https://doi.org/10.1787/9789264226159-en.
Orpinas, P., & Frankowski, R. (2001). The aggression scale: A self-report measure of aggressive behavior for young adolescents. Journal of Early Adolescence, 21(1), 50–67. https://doi.org/10.1177/0272431601021001003.
Pianta, R., Howes, C., Burchinal, M., Bryant, D., Clifford, R., Early, D., et al. (2005). Features of pre-kindergarten programs, classrooms, and teachers: Do they predict observed classroom quality and child-teacher interactions? Applied Developmental Science, 9(3), 144–159. https://doi.org/10.1207/s1532480xads0903_2.
Riggs, N. R., Greenberg, M. T., Kusché, C. A., & Pentz, M. A. (2006). The mediational role of neurocognition in the behavioral outcomes of a social-emotional prevention program in elementary school students: Effects of the PATHS curriculum. Prevention Science, 7(1), 91–102. https://doi.org/10.1007/s11121-005-0022-1.
Roberts, C., Kane, R., Thomson, H., Bishop, B., & Hart, B. (2003). The prevention of depressive symptoms in rural school children: A randomized controlled trial. Journal of Consulting and Clinical Psychology, 71(3), 622–628. https://doi.org/10.1037/0022-006X.71.3.622.
Seligman, M. E. P., & Csikszentmihalyi, M. (2000). Positive psychology. American Psychologist, 55(1), 5–14.
Snyder, F., Flay, B., Vuchinich, S., Acock, A., Washburn, I., Beets, M., et al. (2010). Impact of a social-emotional and character development program on school-level indicators of academic achievement, absenteeism, and disciplinary outcomes: A matched-pair, cluster-randomized, controlled trial. Journal of Research on Educational Effectiveness, 3(1), 26–55. https://doi.org/10.1080/19345740903353436.
Spirito, A., Stark, L. J., & Williams, C. (1988). Development of a brief coping checklist for use with pediatric populations. Journal of Pediatric Psychology, 13(4), 555–574. https://doi.org/10.1093/jpepsy/13.4.555.
Tobler, N. S., Roona, M. R., Ochshorn, P., Marshall, D. G., Streke, A. V., & Stackpole, K. M. (2000). School-based adolescent drug prevention programs: 1998 meta-analysis. Journal of Primary Prevention, 20(4), 275–336. https://doi.org/10.1023/A:1021314704811.
Usher, E. L., & Pajares, F. (2008). Sources of self-efficacy in school: Critical review of the literature and future directions. Review of Educational Research December, 78(4), 751–796. http://doi.org/10.3102/0034654308321456.
Verhulst, F. C., van der Ende, J., & Koot, H. M. (1997). Handleiding voor de Teacher’s Report Form [Manual for the teacher’s report form]. Rotterdam, The Netherlands: Sophia Kinderziekenhuis/Academisch Ziekenhuis Rotterdam/Erasmus Universiteit Rotterdam.
de Vos, H. (1989). A rational choice explanation of composition effects in educational research. Rationality and Society, 1(2), 220–239. https://doi.org/10.1177/1043463189001002004.
Werthamer-Larsson, L., Kellam, S. G., & Wheeler, L. (1991). Effect of first-grade classroom environment on shy behavior, aggressive behavior, and concentration problems. American Journal of Community Psychology, 19(4), 585–602. https://doi.org/10.1007/BF00937993.

Witvliet, M., van Lier, P. A. C., Cuijpers, P., & Koot, H. M. (2009). Testing links between childhood positive peer relations and externalizing outcomes through a randomized controlled intervention study. Journal of Consulting and Clinical Psychology, 77(5), 905–915. https://doi.org/10.1037/a0014597.
Yeager, D. S., & Walton, G. M. (2011). Social psychological interventions in education: They’re not magic. Review of Educational Research, 81(2), 267–301. https://doi.org/10.3102/0034654311405999.

Chapter 6

Measurement of Soft Skills in Education

6.1 Introduction

The main purpose of this chapter is to provide a more specific and concrete illustration of ‘what gets measured’ as part of the evolving trend of fostering social-emotional skills in education. We have therefore opted for somewhat detailed descriptions of a limited set of instruments that appear in the relevant literature, particularly as they are sometimes used as ‘effect measures’ in evaluations of social-emotional learning programs. Our selection of instruments follows the categorization of social and emotional outcomes presented in Chap. 2, Table 2.6. It was our intention to find examples of measures that represented trait/facet measures (column 3) as well as measures that reflected skill equivalents (column 5).

We followed two selection strategies. First, we made use of the SPECTRUM data base published by the British Education Endowment Foundation,¹ in which measurement instruments in the social-emotional domain are rated on psychometric quality and implementation facility. ‘SPECTRUM’ is an acronym that stands for “Social, Psychological, Emotional, Concepts of Self, and Resilience: Understanding and Measurement”. In the user’s guide to the SPECTRUM data base, the authors (Wigelsworth et al., 2017) provide a domain description that is in line with our own conceptual framework, presented in Chap. 2. They recognize that instruments in this domain may be used for different purposes, such as screening and identification, evaluation of educational interventions, and general accountability and practice improvement. The broad conceptualization of social-emotional skills in the SPECTRUM guidelines leaves room for instruments that tend to be more like descriptive trait/facet measures as well as measures that resemble skill equivalents. Secondly, we selected instruments that we encountered as measurements of effect variables in SEL evaluation studies, of which we provided case descriptions in Chap. 5 of this book. Because these instruments had been used in the context of evaluation studies addressing social-emotional outcomes, we expected them to explicitly cover skill equivalents.

Apart from illustrating and describing measurement instruments, this chapter contributes to major themes of this book, as far as conceptual analysis and assessing the evidence base on the malleability of soft skills by means of educational interventions are concerned. In Chaps. 2 and 3 we touched upon several debatable issues in the definition and conceptualization of social-emotional skills. One major theme is stability versus changeability, inherent in the way concepts of personality are thought of. Related to this distinction is the question of the extent to which traits and facets are generalizable across all types of situations and settings, or are rather situation specific. Finally, how loosely or strictly should one take the labelling of a broad range of social-emotional attributes as skills? Scales to measure social and emotional outcomes are to be seen as operational definitions of educational goals, and the expectation is that taking a closer look at these operationalizations will help clarify the conceptual issues. The quality of the soft skills measures used in evaluations of social-emotional learning programs is an important facet of the quality of the program evaluation as such. In this way the quality of outcome measures is a core issue in assessing the evidence base on the malleability of social-emotional outcomes by means of educational interventions, as we described in the previous chapter.

1 EEF (Education Endowment Foundation) (2018). SPECTRUM database. Education Endowment Foundation: https://educationendowmentfoundation.org.uk/projects-and-evaluation/evaluating-projects/measuring-essential-skills/spectrum-database/.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_6

6.2 Criteria to Judge the Quality of Measures

Standard psychometric quality criteria are the reliability and validity of the measures. Reliability in general refers to the degree to which the measure avoids random error. A fundamental approach to establishing the degree of random error is to analyze the correspondence between parallel versions of the same test, but this is often unfeasible. More practical approximations are to study the internal consistency of the item set, the test-retest reliability, or the inter-rater reliability. The internal consistency of the set of items is frequently established by Cronbach’s coefficient alpha. In order to check the dimensionality of the construct, factor analysis techniques may be applied to see whether a one-factor or a multi-factor solution represents the data best. Appropriate test length is another issue for the reliability of measures.

Validity in general involves the extent to which a measure corresponds to the real-world concept it is supposed to reflect. Evidence on criterion-related validity of a measure entails information on the correlation between the measure and another measure of the same concept that is generally accepted as valid (for example a real-life manifestation of the construct). Two specific types of criterion validity are predictive validity and concurrent validity. In the case of predictive validity, the criterion is measured later. In the case of concurrent validity, the criterion is measured at the same time. An example would be assessing the validity of student self-reports on their math performance through a comparison with their scores on a standardized mathematics test. Content validity denotes to what extent a measure covers all facets of a given concept. The construct validity of a measure is conveyed by the extent to which its statistical associations with other measures are in line with theoretical expectations (Carmines & Zeller, 1979). Establishing construct validity frequently entails establishing what is indicated as convergent and discriminant validity of the test; the expectation is that the test will correlate highly with other tests that are expected to measure the same construct (convergent validity) and will not correlate with measures that are not expected to measure the same construct (discriminant validity).

In recent contributions the validity concept has been broadened to include “theoretical rationales to support the adequacy and appropriateness of inferences and actions based on test scores” (Messick, 1995, p. 6; cited by Wolming & Wikström, 2017). A further broadening in the scope of validity theory came from contributions by Kane (1992, 2006), who proposed an ‘argumentative approach’ to validity. Validity was seen as an overall assessment of a ‘theory of action’ in which a test application is situated. A particular additional element is consequential validity, which encompasses assessment of the utility of tests and their social consequences. In educational applications the high-stakes nature of some test applications fits the idea of an action theory and consequential implications for actors, in the sense of desired effects and undesired side effects for actors (teachers and students). The usefulness of this expansion of the scope of the validity concept has been criticized by authors who proposed a return to the narrower basic interpretations of validity sketched above (Mehrens, 1997; Popham, 1997). However, for our purposes the issue of test application seems to be relevant because, as we shall see, with the soft skills movement in education, scales and tests originally designed to measure individual differences are now being proposed as outcome measures in educational evaluations.²

Elaborating on this last theme, it is worthwhile to consider the utility of soft skills measures in relation to different applications in education. Here we should focus on the question whether the frequently used instruments are appropriate as outcome measures in education. Applications for diagnostic purposes, to assess delinquency or pathology, or as a basis for prevention and adaptive teaching, are less central to the theme of this book.

As mentioned in the introduction, we analyze two selections of instruments. The first selection was made to illustrate the main affective and conative traits/facets that are listed in the final ordering table of Chap. 2, Table 2.6 (third column). This purposeful selection was made from a list of positively rated instruments by the British Education Endowment Foundation. The second set of instruments was selected from the outcome measures of SEL evaluation studies, which were described in Chap. 5.
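As a minimal sketch of two of the statistics mentioned in this section, the following Python fragment computes Cronbach's coefficient alpha for a set of items and a Pearson correlation of the kind that is used as evidence for concurrent validity (for example, self-reported versus tested math performance). The function names and the small data sets are illustrative assumptions; they are not taken from any of the instruments discussed in this book.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of every item (column) and of the total score (row sum).
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 scores)")
    mean_x, mean_y = mean(x), mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


# Hypothetical data: five students answering a three-item scale (1-5 points).
scale_responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1], [3, 3, 3]]
print("alpha:", round(cronbach_alpha(scale_responses), 3))

# Hypothetical data: self-reported math performance versus a standardized test.
self_report = [3, 2, 5, 1, 4]
test_score = [55, 48, 80, 30, 62]
print("concurrent validity r:", round(pearson_r(self_report, test_score), 3))
```

Both functions use sample variances and assume complete data without missing responses; a real analysis would normally rely on an established statistics package.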

2 Use of tests for a range of applications is also specified in Sanders and Brouwer’s (2019) rating systems for educational achievement tests, while Wools (2015) distinguishes the criterion “fit for purpose”.
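As an illustration of the convergent and discriminant validity checks described in this section, the following minimal sketch correlates a target scale with a second scale of the same construct and with an unrelated measure. The data are simulated, and the cut-off values (0.50 and 0.30) are hypothetical assumptions chosen for the example only; they are not taken from the sources cited above.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equally long score vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

def validity_check(target, same_construct, other_construct,
                   convergent_min=0.50, discriminant_max=0.30):
    """Compare a target scale with a same-construct and an other-construct measure.

    The two thresholds are illustrative assumptions, not published standards.
    """
    r_conv = pearson_r(target, same_construct)
    r_disc = pearson_r(target, other_construct)
    return {
        "r_convergent": r_conv,
        "r_discriminant": r_disc,
        "convergent_ok": r_conv >= convergent_min,
        "discriminant_ok": abs(r_disc) <= discriminant_max,
    }

# Simulated scores: two scales driven by the same latent trait, one unrelated scale.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
scale_a = latent + rng.normal(scale=0.6, size=n)
scale_b = latent + rng.normal(scale=0.6, size=n)
unrelated = rng.normal(size=n)

result = validity_check(scale_a, scale_b, unrelated)
print(result)
```

In practice the pattern of correlations would be judged against theoretical expectations for the construct at hand rather than against fixed thresholds.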


6 Measurement of Soft Skills in Education

6.3 Descriptions of Instruments Rated by the Education Endowment Foundation (2018)

The information from the Education Endowment Foundation (EEF) is contained in the SPECTRUM data base. “SPECTRUM is a review of how non-academic skills and essential skills are conceptualized and measured in relation to child and adolescent outcomes” (EEF, 2018). It contains information on a total of 86 measurement instruments. The documentation about each instrument consists of a general description of the domain that is covered, a general characterization of the instrument, and ratings on several psychometric criteria. In this section the information from the EEF about 10 instruments (see summary table) is summarized and discussed.

In the guidelines of the SPECTRUM data base, the criteria to rate the instruments on psychometric quality and implementation facility are explained. The criteria to rate psychometric quality concentrate on types of validity and reliability, as discussed in the previous section. The criteria to rate the implementation facility of instruments mentioned in the guidelines are cost friendliness, ease of administering the instrument, ease of scoring, utility (i.e. provides helpful results) and use by multiple respondents at the same time (Wigelsworth et al., 2017, p. 8). The operational psychometric and implementation criteria and the scoring procedure are explained in an Annex to the guidelines. Basically, each of 5 psychometric criteria and 4 implementation criteria gets a star qualification if a certain standard is reached. For example, for the criterion reliability the standard is: each of the subscales has a Cronbach’s alpha above 0.70. So, for implementation the maximum score is 4 points and for psychometric quality the maximum score is 5 points. The summary Table 6.1 provides an overview of the instruments that will be reviewed.
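The reliability standard mentioned above (each subscale has a Cronbach’s alpha above 0.70) can be made concrete with a minimal sketch. It uses the standard formula for Cronbach’s alpha; the function names, the simulated item scores and the way the star is awarded are assumptions made for illustration and do not reproduce the EEF’s actual scoring procedure.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for one subscale.

    items: 2-D array, rows = respondents, columns = items of the subscale.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return float(k / (k - 1) * (1.0 - item_variances.sum() / total_variance))

def reliability_star(subscales, threshold=0.70):
    """Hypothetical star rule: every subscale's alpha must exceed the threshold."""
    return all(cronbach_alpha(scores) > threshold for scores in subscales.values())

# Simulated data (assumption): one coherent subscale and one made of unrelated items.
rng = np.random.default_rng(42)
latent = rng.normal(size=(300, 1))
consistent = latent + rng.normal(scale=0.5, size=(300, 4))  # items share one trait
unrelated = rng.normal(size=(300, 4))                       # items share nothing

print(round(cronbach_alpha(consistent), 2))
print(reliability_star({"coherent": consistent}))
print(reliability_star({"coherent": consistent, "incoherent": unrelated}))
```

Under such a rule a single weak subscale is enough to lose the star, which is one reason why multi-subscale instruments can end up with a lower psychometric rating than unidimensional ones.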
Table 6.1 Summary table of the selection of instruments from the EEF’s SPECTRUM data base

Trait | Facets | Instruments
Conscientiousness | Achievement-orientation, Perseverance, Orderliness, Grit | The Short Grit Scale (GRIT-S); Multidimensional Measure of Children’s Perception of Control (MMCPC); The Self-efficacy Teacher Report Scale (SETRS)
Neuroticism | Anxiety, Fear of failure, Test anxiety | The How I Feel Questionnaire (HIF); Emotion-Regulation Rating Questionnaire for Children and Adults (ERQ-CA); Rosenberg Self-Esteem Scale (RSES)
Extraversion | Sociability, Assertiveness | The Child and Youth Resilience Measure (CYRM-12)
Agreeableness | Cooperative, Trustful | The Children’s Self-Report Social Skills Scale (CS4); The Basic Empathy Scale (BES)
Openness | Imaginativeness, Creativity, Curiousness | The Expression and Emotion Scale for Children (EESC)

The guidelines of the SPECTRUM data base discuss important additional quality considerations of the measures, namely risks of response bias (like reference bias and socially desirable responding), and optimal choice of respondents. However, it is not clear how these have been applied in the rating procedure of the instruments, and how consideration of these additional quality aspects has impacted the overall rating.

6.3.1 The Short Grit Scale (GRIT-S)

Characterization
The instrument is meant to cover the domain of motivation, goal orientation and perseverance. The Grit-S measures trait-level perseverance and passion for long-term goals. The instrument has 12 Likert scale items. The age-range is from 11 to 18.

Psychometric details as assessed by the Education Endowment Foundation
The instrument has no UK norms. Cronbach’s alpha is assessed at 0.72–0.80. Test-retest reliability (4 weeks interval) is r = 0.78. Inter-rater reliability and criterion validity are not reported. Construct validity is addressed and shows convergence with the Self-Control scale, r = 0.68. Predictive validity with respect to academic performance one month later is r = 0.20. Neither responsiveness3 nor floor/ceiling effects have been reported.

Overall rating by the EEF
Psychometry 3; Implementation 4.

References
Duckworth and Quinn (2009), Duckworth, Peterson, Matthews and Kelly (2007), Credé (2018).

Exemplary items (full item sets in ANNEX)
I often set a goal but later choose to pursue a different one.
I am a hard worker.

Comments
The study by Duckworth and Quinn (2009) introduced self-report and informant-report versions of the Grit Scale, which measures trait-level perseverance and passion for long-term goals. The authors provide evidence for the Grit-S’s internal consistency, test-retest stability, consensual validity with informant-report versions, and predictive validity. They conclude that among adults, the Grit-S was associated with educational attainment and fewer career changes and that among adolescents, the Grit-S longitudinally predicted GPA and, inversely, hours watching television. Although the authors note that grit has frequently been used as an outcome indicator in psychological studies, in this article it is specifically addressed as a predictor of success and attainment, among other things as a predictor of academic performance. Because of the strong association of the Grit measure with conscientiousness, the authors controlled for this factor when predicting educational performance and concluded that, despite controlling for conscientiousness, “grittier individuals had attained more education than other individuals of the same age” (ibid., 169).

In a critical analysis and replication of Grit research, Credé (2018, p. 606) concluded that “there appears to be no reason to accept the combination of perseverance and passion for long-term goals into a single grit construct, nor is there any support for the claim that grit is a particularly good predictor of success and performance in an educational setting or that grit is likely to be responsive to interventions”. Among other things, he found that the association between Grit and academic achievement disappeared when conscientiousness was controlled for.

Credé offers several explanations for his results with the Grit scale. First, Grit might be a necessary but not sufficient condition for success; the decisive factor might be innate ability, which might explain the finding that Grit is effective for students at the lower end of the ability distribution. A second alternative explanation might be that the effectiveness of grit depends on situational characteristics, for example hypothesizing that grit manifests itself particularly in situations characterized by high levels of adversity and complexity. A third explanation mentioned by Credé is the psychometric weakness of the current scale.

As an explanation of the modest influence of Grit on educational performance, Credé refers to findings in psychological research, which suggest that interventions designed to increase perseverance (a construct that appears to be isomorphic with conscientiousness) are likely to be of limited value. He argues that interventions designed to change perseverance are likely to require long-term investment by institutions and the involvement of well-trained and skilled teachers and trainers. He also says that “it is as yet unclear whether an individual’s general level of passion can be increased by interventions or whether high levels of passion for long-terms goals are even inherently desirable, particularly for younger children who may benefit more from a general exploration of many different activities rather than the single-minded pursuit of any one activity” (p. 609).

3 Responsiveness refers to the measure being responsive to change. Dependent on the intended purpose and domain, measures can be used to identify ‘meaningful change’ (e.g. above a threshold for intervention).
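The statement that an association “disappears when conscientiousness is controlled for” can be illustrated with a partial correlation. The sketch below is an assumption-laden illustration on simulated data, constructed so that grit and achievement are related only through conscientiousness; it is not a reproduction of the analyses or data of Credé (2018) or Duckworth and Quinn (2009).

```python
import numpy as np

def partial_corr(x, y, control):
    """Correlation of x and y after regressing the control variable out of both."""
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, control))
    design = np.column_stack([np.ones_like(z), z])  # intercept + control variable
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])

# Simulated scores (assumption): both variables depend on conscientiousness only.
rng = np.random.default_rng(1)
n = 2000
conscientiousness = rng.normal(size=n)
grit = 0.8 * conscientiousness + rng.normal(scale=0.6, size=n)
achievement = 0.3 * conscientiousness + rng.normal(scale=0.95, size=n)

zero_order = float(np.corrcoef(grit, achievement)[0, 1])
controlled = partial_corr(grit, achievement, conscientiousness)
print(f"zero-order r = {zero_order:.2f}, partial r = {controlled:.2f}")
```

In this constructed case the zero-order correlation is clearly positive while the partial correlation is close to zero, which is the pattern one would expect if grit adds no predictive information beyond conscientiousness.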


6.3.2 Multidimensional Measure of Children’s Perception of Control (MMCPC)

Characterization
The instrument is meant to measure the domains Perception of Self, and Social and Emotional Competence. The MMCPC provides an idiographic portrayal of children’s perceptions of control. It has four subscales: Cognitive, Social, Physical and General. The instrument consists of 48 Likert scale items. The age-range is from 8 to 14.

Psychometric details as assessed by the Education Endowment Foundation
The instrument has UK norms. Cronbach’s alpha is 0.60–0.70. Test-retest reliability: 9 weeks, r = 0.34; 17 weeks, r = 0.32. Inter-rater reliability is not reported, and neither is criterion validity. As for construct validity, correlations are in the order of 0.30–0.50 with academic achievement, mastery motivation, autonomous judgment, peer acceptance and physical competence. Predictive validity: 5 out of 8 relationships were significant at the 0.05 level for success outcomes and one out of eight for failure outcomes. Concurrent validity, responsiveness and floor/ceiling effects were not addressed.

Overall rating by the EEF
Psychometry 3; Implementation 3.

Reference
Muldoon, Lowry, Prentice, and Trew (2005).

Exemplary items (from the General Scale) (full item sets in ANNEX)
Often, I can’t understand why good things happen to me (unknown control).
To get what I want I have to please the people in charge (powerful others control).
I can pretty much control what will happen in my life (internal control).

Comments
Muldoon et al. (2005) state that the validity of the instrument is bolstered by findings suggesting that perceptions of control as measured by this instrument are related to psychopathology, academic achievement and parental locus of control. The authors re-analyzed the factor structure in a new sample and conclude that it was largely reproduced; problems remain because some of the subdomains are represented by too few items and because some items are paraphrases of other items. Their overall conclusion is that “This replication provides support for the contention that attributions for success and failure in childhood can be attributed to three sources of control and underlines the domain specific nature of these perceptions”. This instrument is a trait-facet rather than a skill measure, related to self-confidence.


6.3.3 The Self-efficacy Teacher Report Scale (SETRS)

Characterization
The scale is unidimensional and measures teachers’ perceptions of students’ self-efficacy. The age range is 8–17. The scale consists of 18 Likert scale items, with the following scoring key:
U means the student Usually displays the behavior.
S means the student Sometimes displays the behavior.
R means the student Rarely or Hardly Ever displays the behavior.

Psychometric details as assessed by the Education Endowment Foundation
The instrument does not have UK norms. Cronbach’s alpha ranges from 0.95 to 0.97. Test-retest reliability (14 days) is 0.89. Inter-rater reliability is not reported. Criterion validity is not reported. There are some indications for convergent and discriminant validity; e.g. the scale converges with the Self-Evaluation Scale for Teachers. Concurrent and predictive validity are not reported, and neither are responsiveness and floor/ceiling effects.

Overall rating by the EEF
Psychometry 3; Implementation 3.

References
Bandura (1977), Bandura et al. (1999, 2001, 2003), Erford, Duncan, and Savin-Murphy (2010).

Exemplary items (full item sets in ANNEX)
When the student begins something, he/she tries hard to finish it.
The student views the chance of failing as a challenge.

Comments
Erford et al. (2010, p. 79) describe the conceptual and theoretical basis of self-efficacy as follows: “Self-efficacy is the belief in one’s ability to use coping resources to successfully gain control over a situation to complete a task (Bandura, 1977) and is related to numerous situational outcomes”. Self-efficacy is an influential factor in overcoming difficult situations; if people think they possess the attributes necessary to succeed in certain situations and to solve problems, their self-efficacy increases. Self-efficacy operates through outcome expectancies (i.e., outcomes resulting from personal behaviors) and self-efficacy expectancies (i.e., belief in one’s ability to perform the personal behaviors). Individual and contextual variations in self-efficacy stem from four primary sources: (a) verbal messages, (b) personal psychological states, (c) past performances, and (d) vicarious learning (Bandura, Caprara, Barbaranelli, Gerbino, & Pastorelli, 2003). Facets that contribute to self-efficacy are perseverance, procrastination (tendency to defer action and be dilatory), achievement efficacy and self-confidence. Some researchers deem confidence an important facet of self-efficacy.

Erford et al. (2010) describe the development of the SETRS. They report that four subscales were designed into the SETRS, but because these subcomponents are so highly intercorrelated, a one-factor structure was hypothesized. The authors conclude that the findings of these preliminary studies investigating reliability and validity of scores on the SETRS seem to suggest the instrument is a psychometrically adequate screening tool for assessing teacher perceptions of the self-efficacy of students. The scale is therefore primarily seen as a diagnostic tool and as a basis to optimize instructional delivery, thus allowing students to have mastery experiences, which in turn will boost self-efficacy. Teachers and professional counselors are seen as strong influences on student performance. These professionals can use the results of the SETRS as they work with students to enhance students’ internal dialogues related to beliefs about ability, in order to assist students in achieving goals and envisioning personal success. The authors refer to Bandura (1977) and Bandura and others (1999, 2001, 2003) for interventions that students with low self-efficacy can receive to bolster self-efficacy.

One could imagine that the SETRS is subsequently used as a post-test to assess the success of such interventions among this restricted group of students, although it is doubtful whether this should be done by means of teacher ratings instead of student self-reports, because teachers are not able to reliably evaluate students’ beliefs. Judging from the item set, the instrument resembles a measure of conscientiousness more than a measure of self-efficacy in the sense of the feeling that one is capable of doing what it takes to accomplish objectives.

6.3.4 How I Feel Questionnaire

Characterization
The domain the instrument covers is described by the EEF as ‘mental health and wellbeing’. It consists of three subscales: Positive emotion, Negative emotion and Emotional control. The instrument is described as a “multi-dimensional, self-report measure of emotional arousal and regulation for children”. The age-range is 8–12.

Psychometric details as assessed by the Education Endowment Foundation
The instrument has no UK norms. Cronbach’s alpha is assessed at 0.77–0.90. The test-retest reliability (15 months) is r = 0.34–0.42. Inter-rater reliability and criterion validity are not reported, nor are concurrent validity, predictive validity, responsiveness and floor/ceiling effects. Assessment of construct validity showed convergence with measures like the Affect Expression Rating Scale (AER), the Positive and Negative Affect Scale (PANAS) and peer ratings of popularity.

Overall rating by the EEF
Psychometry 3; Implementation 3.

Reference
Ciucci, Baroncelli, Grazzani, Ornaghi, and Caprin (2016).

Exemplary items (full item sets in ANNEX)
Excited all the time (positive emotion).
Scared often (negative emotion).
Sad frequently (emotion control).

Comments
The construct in question is described as a measure of children’s emotional arousal and regulation. The HIF is described as a 30-item self-report questionnaire created to simultaneously capture frequency, intensity and control of happiness, excitement, sadness, fear, and anger in children and pre-adolescents (Ciucci et al., 2016). It appeared to have a three-factor structure, with factors labelled Positive emotion, Negative emotion and Emotion control. The purpose of the instrument is described as follows: “The HIF gives access to information about personal emotion arousal and regulation that may be used to test for associations with other personality traits, emotional skills, and overt behaviors. This information is also of key importance given that impaired emotional regulation predicts low levels of social preference and rejection by school peer groups, both of which are serious risk factors for children’s emotional health.” According to Ciucci et al. (2016, p. 201), “Our findings suggest that the HIF may be particularly suitable for evaluating the impact of interventions designed to promote emotional health, and for this reason we recommend its use at school”. No examples of application of the HIF as an intervention effect measure are presented, nor are criterion or predictive validity assessed. The positive and negative emotion subscales are traits. Emotional control might be a skill, but this would depend on the formulation of the items. In this case the items are not formulated as skills.

6.3.5 Emotion-Regulation Rating Questionnaire for Children and Adults (ERQ-CA)

Characterization
The ERQ-CA comprises 10 items assessing the emotion regulation strategies of cognitive reappraisal and expressive suppression (as the two subscales). The domain is indicated as “emotional intelligence”. The age-range is 9–18.

Psychometric details as assessed by the Education Endowment Foundation
The instrument does not have UK norms. Cronbach’s alpha ranges from 0.69 to 0.86. Test-retest reliability (12 months): r = 0.37, 0.63. Inter-rater reliability is not reported. Construct validity is addressed: r = [−0.26, 0.37] with the Child Depression Inventory; r = [−0.37, −0.28] with the Big-Five Questionnaire for Children (BFQ-C); r = [−0.18, 0.15] with the Rosenberg Self-Esteem Scale. No other validity type is addressed, and neither are responsiveness or floor/ceiling effects.

Overall rating by the EEF
Psychometry 3; Implementation 3.

References
Gullone and Taffe (2012), Gross (1998).

Exemplary items (full item sets in ANNEX)
When I want to feel happier, I think about something different.
I control my feelings by not showing them.

Comments
According to Gullone and Taffe (2012) there is general consensus that Emotion Regulation (ER) involves intrinsic and extrinsic processes responsible for managing one’s emotions toward goal accomplishment. ER processes can be conscious or unconscious, automatic or effortful, and include skills and strategies for monitoring, evaluating, and modifying emotional reactions. ER involves not only reducing the intensity or frequency of emotional states but also developing the capacity to generate and sustain emotions. Moreover, ER processes are not solely focused on negative emotions but also include positive ER.

Gullone and Taffe refer to Gross’s (1998) process-oriented approach to further explain the theoretical basis. Gross’s model includes five sets of emotion regulatory strategies: (a) situation selection, (b) situation modification, (c) attention deployment, (d) cognitive change, and (e) response modulation. “Specific ER strategies have been differentiated as antecedent focused or response focused, along timelines consistent with an unfolding emotional response. The former refers to strategies adopted before the emotion-response tendencies have become fully activated and the latter to those adopted once an emotion is already being experienced. Within this model, to date, two ER strategies have been operationalized. These are (a) cognitive reappraisal (CR), a cognitive change strategy that involves redefining a potentially emotion-eliciting situation in such a way that its emotional impact is changed; and (b) expressive suppression (ES), a form of response modulation involving the inhibition of ongoing emotion-expressive behavior. The rationale for the focus on these particular strategies is that each is a good exemplar of antecedent-focused and response-focused strategies, respectively, and both are strategies that are commonly used in everyday life” (Ibid., pp. 409–410).

Construct validation indicated that the Cognitive reappraisal subscale tended to correlate positively with the Big Five traits, while Emotion Suppression tended to correlate negatively with the Big Five traits. When considering the use of this instrument as an outcome measure of social emotional learning interventions, one may wonder what kind of educational intervention would be fitting. Perhaps this would be easier for the cognitive part, which might consist of behavior modification guided by the overall recommendation “when you are feeling bad, think of something different”. Emotion suppression is associated with introversion and neuroticism. These are personality traits, and the instrument is general and not situation specific. What would an educational program aimed at improving emotion regulation, defined in this way, look like? Perhaps it would train behavior in situations where the recommended reaction would be to suppress direct primary reactions. But it would seem that behavior modification programs should be assessed by means of more specific behavioral items, and not with a general trait measure. Although reappraisal and suppression of emotion are conceptually like competences, the instrument in question does not seem sufficiently performance oriented to be seen as a skill measure.

6.3.6 Rosenberg Self-Esteem Scale (RSES)

Characterization
The domain that is measured is Perception of Self. It is a unidimensional measure. It consists of 10 Likert type items. The age-range is from 13 to 18.

Psychometric details as assessed by the Education Endowment Foundation
The measure has UK norms. Cronbach’s α ranges from 0.85 to 0.90. The test-retest reliability over a 7 months period is r = 0.61. Criterion validity is addressed and shows a −0.09 correlation with the Ontario Child Health Study scales. Not reported are construct validity, concurrent validity, responsiveness and floor/ceiling effects.

Overall rating by the EEF
Psychometry 4; Implementation 4.

Reference
Bagley, Bolitho, and Bertrand (1997).

Exemplary items (full item sets in ANNEX)
I’m not good at all.
I have a positive self-attitude.

Comments
Bagley et al. (1997) describe an English study in which a controlled intervention was conducted, randomly allocating high school students who indicated both suicidal ideation and devastated self-esteem (scores in the lowest 5% compared with Coopersmith scale scores for the entire cohort of 14- and 15-year-olds). Intensive social interventions (Rogerian counselling; tuition for those with poor scholastic achievement; task-linkage of sociometrically isolated students with highly popular students; and social work assistance for families of students experiencing poverty, family disruption and divorce) were reflected a year later in significant improvements in self-esteem, and eclipse of suicidal feelings in focus students, in comparison with controls. The authors interpret the results in terms of identification of groups of students who are in need of counseling.

The instrument is more like a trait (facet) measure, generally applicable and not situation specific. The educational use described by Bagley et al. (1997) is diagnosis and treatment of extremely problematic groups of children. Still, the article illustrates, first, that the instrument is fit for use in a pre-post design and, secondly, that complex and intensive treatment pays off. A strong point for the instrument is that control for ‘reference bias’ seems to be built into the design of the scale (self-characteristics relative to the perceived characteristics of peers): when the peer group changes, the self-evaluation might change as well. The documentation points to use in restricted sub-populations (children with severe behavioral problems), which restricts the generalizability of application. As far as the intervention “program theory” is concerned, the expectation seems to be that low self-esteem is a cause rather than an effect of problematic behavior. The obvious rival hypothesis is that low self-esteem is a consequence of problematic behavior. An alternative treatment would be direct modification of the problematic behavioral symptoms.

6.3.7 The Child and Youth Resilience Measure (CYRM-12)

Characterization
The instrument is described as “a brief measure of resilience for young people.” The domain that is measured is resilience and coping. Subscales refer to individual, relational, communal and cultural aspects. The scale consists of 12 Likert items. The age-range is from 10 to 18.

Psychometric details as assessed by the Education Endowment Foundation
The instrument does not have UK norms. Cronbach’s alpha is 0.84. Not reported are test-retest reliability, inter-rater reliability, any type of validity, responsiveness and floor/ceiling effects.

Overall rating by the EEF
Psychometry 3; Implementation 3.

Reference
Liebenberg, Ungar, and LeBlanc (2013).

Exemplary items (full item sets in ANNEX)
I have opportunities to develop job skills.
I eat enough most days.

Comments
The Child and Youth Resilience Measure (CYRM) is described by Liebenberg et al. (2013) as designed to measure youth resilience while accounting for diverse social contexts across numerous cultures. The instrument was validated in regular school contexts as well as among youth using multiple services (child welfare, mental health, juvenile justice, community programs, and special educational supports). The study concludes that “the instrument shows enough content validity to merit its use as a screener for resilience processes in the lives of adolescents”. The usefulness of the instrument as an outcome measure for evaluating social emotional learning programs is doubtful, given its design as a broad indication of resilience across contexts. It is a trait/facet-like measure, without clear reference to a specific personality trait.

6.3.8 The Children’s Self-Report Social Skills Scale (CS4)

Characterization
The domains that are measured are emotional intelligence (mixed and trait),4 mental health and wellbeing, and social and emotional competence. The instrument is made up of three subscales: social rules, likeability and social ingenuousness. The test consists of 21 Likert type items (1 = never to 5 = always). The age-range is from 9 to 12.

Psychometric details as assessed by the Education Endowment Foundation
The measure does not have UK norms. Cronbach’s α ranges from 0.84 to 0.90. Test-retest reliability over a period of 10–14 days is r = 0.74. Inter-rater reliability is not reported, and neither is criterion validity addressed. Construct validity is indicated in terms of significant correlations between CS4 subscales and peer nominations, r = [−0.26, 0.30]. Concurrent validity, predictive validity and responsiveness have not been addressed, and neither have floor/ceiling effects.

Overall rating by the EEF
Psychometry 3; Implementation 3.

Reference
Danielson and Phelps (2003).

Exemplary items (full item sets in ANNEX)
I look others in the face when they talk.
Others do not like me.

Comments
The items of the scale are intended to tap ‘execution’ of social behavior, not ‘potential’. PCA indicated three components: social rules, likeability and social ingenuousness. The impression is that the scale measures children’s perception of their own social behavior. The items look like items from personality questionnaires on traits like agreeableness and extraversion and ask for ‘typical behavior’. This is a trait rather than a skill measuring instrument.

4 The qualification “mixed and trait” means that the instrument is mixed with respect to “ability” and “trait” interpretations of emotional intelligence, but tends to a “trait” interpretation. The items rather resemble items for personality trait questionnaires in the areas of agreeableness and extraversion.

6.3.9 The Basic Empathy Scale (BES)

Characterization
The domain that is measured is described as Emotional Intelligence. The instrument is made up of two subscales detecting two different components of empathetic responsiveness: the affective empathy subscale, measuring the emotional congruence with another person’s emotions, and the cognitive empathy subscale, measuring the ability to understand another person’s emotions. The test consists of 20 Likert scale items (1 = strongly disagree to 5 = strongly agree). The scoring is standardized and the age range is from 9 to 18.

Psychometric details as assessed by the Education Endowment Foundation
UK norms for the measure exist. Cronbach’s α ranges from 0.77 to 0.87. Test-retest reliability over a period of 3 weeks ranges from r = 0.54 to 0.70. Criterion validity is not reported. Construct validity is addressed (discriminant and convergent validity): r = [−0.20, −0.17] with the Toronto–Alexithymia scale; r = [0.35, 0.53] with the Interpersonal Reactivity Index; r = [−0.10, 0.34] with Big Five Personality traits. Concurrent validity, predictive validity, responsiveness and the occurrence of floor/ceiling effects are not addressed.

Overall rating by the EEF
Psychometry 4; Implementation 4 (on a 5 point scale).

References
Jolliffe and Farrington (2006), D’Ambrosio, Olivier, Didon, and Besche (2009), Carré, Stefaniak, and Ambrosio (2013).

Exemplary items (full item sets in ANNEX)
My friends’ emotions don’t affect me much.
I have trouble figuring out when my friends are happy.

Comments
As to the relevance for SEL, the instrument has a cognitive and an affective subscale; each has 10 items. Affective empathy is defined as possession of an appropriate emotional response when confronted with the mental state attributed to another person. Cognitive empathy is defined as the intellectual apprehension of another’s mental state, currently associated with the theory of mind (ToM). Performance related application of the measure is not documented. Criterion and predictive validity have not been referenced. The impression is that this is a trait rather than a skill measure, as most items express general habitual responses, although the cognitive subscale has some “can do” skill-like items. The content of the item set does not give the impression that it is aligned with specific educational interventions.

6.3.10 The Expression and Emotion Scale for Children (EESC) Characterization The domain the instrument intends to measure is Emotional intelligence and the instrument is described by the EF as being designed to assess lack of emotion awareness and reluctance to express emotions. There are two subscales: Expressive reluctance and Poor awareness. The age range is from 9 to 12. Psychometric details as assessed by the Education Endowment Foundation The instrument has no UK norms. Cronbach’s alpha is assessed at 0.81–0.83. The test-retest reliability over a period of 2 weeks amounts to r = 0.56–0.59. Interrater reliability is not reported, nor are criterion validity, concurrent validity, predictive validity, responsiveness and floor/ceiling effect. Construct validity (convergent validity) is addressed: r = 0.55 with the approach scale of the Behavioral Inhibition System (BIS); r = 0.39 with the promotion scale of the Regulatory Focus Questionnaire (RFQ) Overall rating by the EF Psychometry 3; Implementation 4 (on a 5 point scale). Reference Penza-Clyve and Zeman (2002).


Exemplary items (full item sets in ANNEX)

When I feel upset, I do not know how to talk about it. When I’m sad, I try not to show it.

Comments

The central concept is not about creative expression as a facet of Openness but it seems more associated with Introversion/Extraversion and Social Inhibition. The measure is a trait, rather than a skills measure. All items are negatively worded, which suggests application for diagnosis of problematic behavior and attitudes. As Penza-Clyve and Zeman (2002) say, the construct is about “2 aspects of deficient emotion expression: lack of emotion awareness and lack of motivation to express negative emotion” (p. 540). The authors see the development of skills to regulate emotional experience and expression as a prerequisite for adaptive psychological and social development. The context of application that is mentioned is prevention (the assessment of children with poor emotion expression may prove beneficial for the prevention of poor psychological outcomes; ibid., 540). It was anticipated that the Emotion Expression Scale for Children (EESC) would yield two factors: poor awareness, describing difficulty labeling internal emotional experience, and reluctance to express emotion, describing lack of motivation or willingness to communicate or express negative emotions to others. The study confirms the two-factor solution. The study is placed in a scientific context of elaborating the nomological network in the psychology of emotions. Practical application, if any, has to do with prevention and diagnosis, not with intervention effects. No indications of criterion or predictive validity were included in the study.

6.3.11 Brief Summary

In the next paragraph some other measuring instruments will be described, and after these additional descriptions a more detailed summary and discussion will be given. At this stage we will just mention some major trends in the description of the selected instruments from the EEF’s SPECTRUM data base. In all cases Likert scales were used as item format. Only in three out of ten cases were the predictive and criterion validity of the instruments reported. This is of interest for studies where social and emotional variables are modelled as instrumental to academic performance (see Chap. 5). Finally, the instruments were predominantly categorized as trait rather than skill measures. To this last issue we will return in the final summary and discussion of this chapter.


6.4 Description of Instruments that were Used in the Intervention Studies (see Chap. 5)

In addition to the instruments that were selected from the SPECTRUM data base of the Education Endowment Foundation, we also describe six instruments, selected from the instruments which we encountered in the case descriptions of SEL program evaluation in Chap. 5, listed in the ANNEX to that chapter. The order of presentation of these instruments follows the order of appearance in that list. The following instruments were selected (title, key reference and name of the evaluated SEL program):

The Social Skills Rating System (SSRS), Gresham and Elliott (1990). Tools of the Mind
The Emotional Awareness Scale for Children (LEAS-C), Bajgar et al. (2005). PATH
The Empathy Index for Children and Adolescents (IECA) (Bryant’s Empathy Index), de Wied et al. (2007). PATH
Kidcope-Child Version (Holen, Waaktaar, Lervåg, & Ystgaard, 2012). Zippy’s Friends
The Strengths and Difficulties Questionnaire (SDQ) (Goodman, 1997). Zippy’s Friends
The Teacher Child Rating Scale (T-CRS), Hightower et al. (1986). Positive Action

The format for describing and discussing these instruments consists of a schematic summary description, a description of the measure, item examples and comments. In the comments we summarize basic psychometric indicators about reliability and validity and try to qualify the measure as a trait or a skill measure.

6.4.1 The Social Skills Rating System (SSRS), Gresham and Elliott (1990). Tools of the Mind

Description of the measure

The SSRS measures “Essential social skills for success at school, according to teacher reports, include listening to others, following classroom rules, complying with teacher directives, asking for help, cooperating with peers, and controlling temper in conflict situations. Social competence deficits as part: (a) an inability to build or maintain satisfactory interpersonal relationships with peers and teachers, and (b) the expression of inappropriate behavior under normal circumstances” (Gresham, Elliott, Vance, and Cook, 2011, p. 29). The SSRS is commonly used by school psychologists to assess the social skills and problem behaviors of students who are experiencing difficulty in school settings (Diperna & Volpe, 2005, p. 345). The scale has been widely used as a screening tool, but also as part of US national evaluations of the Head Start program (Table 6.2).

Table 6.2 Summary description of the SSRS

Instrument and program reference: The Social Skills Rating System (SSRS), Gresham and Elliott (1990). Tools of the Mind

Theoretical background: The SSRS (Gresham & Elliott, 1990) is a broad-based, multi rater assessment of students’ social behavior that examines teacher-student relations, peer interactions, and academic performance

Characterization of instrument: Measurement intention: Essential social skills for success at school; Social competence deficits

Reliability/internal consistency: Gresham et al. (2011, 38) report high levels of internal consistency: Cronbach’s alpha for any of the total scores on the SSIS-RS is above 0.94. Diperna and Volpe (2005, 350) report a Cronbach’s alpha of 0.86 for the social skills total of the elementary school version of the SSRS. Lower alphas were reported for the separate subscales

Convergent and discriminant validity: Gresham et al. (2011, 41) and Diperna and Volpe (2005, 535) report moderately high concurrent validity for the total score, and mixed results for some of the subscales

Trait or skills measure: Despite the consistent labeling of desired behavior patterns as skills, the student self-reports do not refer to actual performance but to self-rated behavioral response tendencies in social behavior and motivation. Therefore, we consider the SSRS Scales as trait measures

The SSRS is a broad-based, multi rater assessment of students’ social behavior that examines teacher-student relations, peer interactions, and academic performance. The SSRS yields information from three rating sources: teachers, parents, and students. “The SSRS solicits information from these three sources in Grades 3–12 and from parents and teachers for children ages 3–5. The SSRS has three forms reflecting three developmental age ranges: preschool (ages 3–5 years), elementary (grades K–6), and secondary (grades 7–12). The SSRS focuses on a comprehensive assessment of social skills; however, it also includes problem behaviors that often compete with the acquisition and/or performance of socially skilled behaviors. Additionally, the teacher version of the SSRS includes a measure of academic competence.” (Gresham et al., 2011, p. 32). “The SSRS includes five social skills domains: Cooperation, Assertion, Responsibility, Empathy, and Self-Control. Three of these domains are consistent across teacher, parent, and student raters. Responsibility is only included on the parent

form, and Empathy is only found on the student form. The SSRS also has three problem behavior domains: Externalizing, Internalizing, and Hyperactivity.” (ibid.) Schematically the structure of the SSRS is as follows (see Table 6.3).

Table 6.3 The structure of the SSRS

Scales: Social skills; Problem behaviors; Academic competence (teacher ratings only)

Social skills subscales: Cooperation (10 items, all rater categories); Assertion (10 items, all rater categories); Responsibility (10 items, all categories); Self-control (10 items, all rater categories)

Problem behavioral subscales: Externalizing (6 items, all rater categories); Internalizing (6 items, all rater categories)

Rating dimensions: 3-point scales (frequency: never, sometimes, or very often; importance: not important, important, or critical)

Exemplary items

Parent Questionnaire

Could you indicate how often your child manifests the following behavior (never, sometimes or very often):

Cooperation:

Uses free time at home in a responsible manner. Offers to help with domestic tasks.
Responsibility: Introduces him/herself spontaneously to other people. Declines unreasonable demands by others in a polite way.
Assertion: Takes part in group activities on his/her own initiative. Makes friends easily.
Self-control: Talks at home with an acceptable volume of voice. Handles criticism well.

Child questionnaire

Instruction. In this list you see a lot of phrases about what children of your age can do. We are going to read the sentences together and when this happens think of yourself. Tell me how often this behavior occurs with you. Do you do this never, sometimes or very often?

I do not take notice of children in my class who are trying to make fun.
I listen to adults when they talk to me.
I make friends easily.
I keep my desk clean and tidy.
I let friends know when I like them by telling them, or letting them know.


I do what the teacher asks me to do.

Teacher questionnaire

Instruction: Please indicate how often the student manifests this behavior (never, sometimes or very often)

Cooperation with the teacher: This student uses free time in an acceptable way. This student carries out his/her assignments in time.
Assertion: This student asks other children to participate in his/her play. This student tells positive things about him/herself, without boasting.
Self-control: This student handles criticism well. This student associates with and accepts people who are different.

Comments

Gresham et al. (2011, p. 41) conclude that the SSRS had high internal consistency estimates for total scores for both social skills and problem behavior scales. Cronbach’s alpha for any of the total scores on the SSIS-RS is above 0.94. Diperna and Volpe (2005, p. 350) report a Cronbach’s alpha of 0.86 for the Social Skills Total of the elementary school version of the SSRS. Lower alphas were reported for the separate subscales. Both Gresham et al. (2011, p. 41) and Diperna and Volpe (2005, p. 535) report moderately high concurrent validity for the total score, and mixed results for some of the subscales. More specifically these latter authors conclude as follows: “Findings from the current study indicate that the total score demonstrates acceptable internal consistency, moderate 6-month stability, and concurrent validity with related constructs. Evidence was less supportive for the subscales. Specifically, none of the subscales demonstrated acceptable levels of internal consistency for screening purposes, and two subscales (Self-Control and Empathy) demonstrated lower stability. In addition, although concurrent validity evidence supported hypothesized relationships for the Total scale and two subscales (Cooperation and Assertion), evidence for the Self-Control and Empathy subscales was not as predicted. Together, the reliability and validity findings from this study provide evidence to support the use of the SSRS-SEF Total scale for assessing the social behavior of students in the intermediate elementary grades. Due to the limitations noted for the subscales, interpretation of the subscale scores from the SSRS-SEF should only occur in conjunction with other measures of these constructs, such as the SSRS-T form”. (ibid., p. 353).

Although primarily designed as a diagnostic screening instrument to identify students with problematic behavior, the SSRS has been applied as an effect variable in evaluations of well-known educational improvement programs such as Head Start and Tools of the Mind. In a more recent version of the SSRS, the Social Skills Improvement System (SSIS-RS), the assessment scales are tied closely to an intervention


system. “In addition to the SSIS-RS rating scales, the SSIS-RS System includes a Performance Screening Guide (PSG), a Class-wide Intervention Program (CIP), and an Intervention Guide (IG). These are tools to assess, instruct, and monitor progress in a tiered model of instruction”. (Gresham et al., 2011, 33). “As a means to proactively teach social skills within the general education setting, the SSIS-RS evidence-based practices start with the Class wide Intervention Program (CIP). The CIP is a scripted general education program that teaches 10 of the most important social skills as rated by teachers and parents. These skills represent seven critical skill domains covered in the SSIS-RS: Communication, Cooperation, Assertion, Responsibility, Empathy, Engagement and Self-Control”. (32). “The Classroom Intervention Program and the Intervention Guide manuals provide a number of lessons meant to directly teach and reinforce prosocial behaviors. They include components of instruction, peer modeling, positive and negative examples, and role play. Additionally, the Intervention Guide provides teachers with information regarding appropriate reinforcement contingencies that could be useful for increasing prosocial behaviors for students with autism spectrum disorders”. (33).

The SSIS-RS system aligns instruction and social emotional skill performance assessment. The Performance Screening Guide (PSG) is described as “a criterion-related performance measure meant to be used as universal screener for teachers to use to assess all students within a setting. It focuses on observable behaviors in four skill areas: positive social behaviors, motivation to learn, reading skills, and math skills” (ibid., 33). Referring to these four items as “skill areas” is a persuasive way of suggesting that reading and math skills on the one hand, and motivation and social skills on the other can be commonly labeled as skills. Yet, the social and motivational “skills” items refer to rather general habitual responses, about which it is hard to believe that they are changeable in a sustainable way by means of educational interventions. The SSRS is a multi-scale and multi-rater set of instruments, however, and this latter assessment would apply in a different way to the student rated versions than to the scales that depend on teacher observations.

In the evaluation study of the Tools of the Mind program by Barnett et al. (2008, 299) the effect of the self-regulation component of the program, assessed by means of the teacher form of the Problem Behaviors Scale of the SSRS, completed by the child’s teacher near the end of the school year, showed an effect on behavior problems of about half a standard deviation (es = 0.47, Glass’ delta). It is quite conceivable that intensive monitoring of social and on-task behavior in school would pay off in the sense of after-treatment improved behavior, but only sustained effects over a longer period of time would qualify as improvement in social and motivational skills. Moreover, as demonstrated in the study by Credé (2018), control for underlying personality traits, like sociability or conscientiousness, would be important to properly assess intervention effects of this nature.
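For readers less familiar with the effect size mentioned above, Glass' delta divides the difference between the treatment and control group means by the standard deviation of the control group only. The following is a minimal sketch of that computation; the function name and the scores are hypothetical illustrations and are not data from Barnett et al. (2008). The sign convention (treatment minus control) is also an assumption, so a reduction in problem behavior shows up as a negative value here.

```python
from statistics import mean, stdev


def glass_delta(treatment, control):
    """Glass' delta: mean difference scaled by the control group's sample SD."""
    if len(control) < 2:
        raise ValueError("control group needs at least two scores")
    return (mean(treatment) - mean(control)) / stdev(control)


# Hypothetical teacher-rated problem behavior scores (lower is better).
control_scores = [10, 12, 14, 16, 18]
treatment_scores = [9, 11, 13, 14, 15]

delta = glass_delta(treatment_scores, control_scores)
print(round(delta, 3))
```

Using only the control group's standard deviation avoids letting the intervention itself change the yardstick, which matters when a program is expected to alter the spread of scores as well as the mean.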


6.4.2 The Emotional Awareness Scale for Children (LEAS-C), Bajgar et al. (2005). PATH

Description of the measure

The LEAS-C consists of 12 evocative interpersonal scenarios. “Each scenario is described in two to four sentences and involves two people. Subjects are asked to describe the feelings of self and of the other person for each scenario. Two scenarios are presented per page, each scenario followed by two questions: ‘How would you feel?’, and, ‘How would the other person feel?’ (Table 6.4). The scoring procedure for the LEAS-C is identical to that followed by the LEAS. Scoring is aimed at determining the degree of differentiation or specificity in the emotions described, and the range of emotions reported. Each scenario is designed to elicit one of four types of emotion (happiness, anger, sadness, or fear; three samples each). In departure from other emotion knowledge assessments (e.g. Denham’s affective labelling and affective perspective-taking tasks, 1986), this format serves an organizational purpose only, and the particular emotions targeted in the scenarios are not relevant to the scoring of the LEAS-C. The primary purpose of the LEAS-C is to examine the emotion complexity inherent in the responses children generate to each of the scenarios, therefore the correctness of their response is not relevant to the scoring” (Bajgar et al., 2005, p. 575). “A low awareness Level 1 response may stress somatic features (e.g. ‘I would feel sick’), or may directly state a lack of emotional response (e.g. ‘I would feel nothing’). A Level 2 response reflects action (e.g. ‘I would feel like smashing the wall’), or a more global and generalized state not indicative of a specific emotion (e.g. ‘I would feel good’). Level 3 responses reflect specific unidimensional emotions (e.g. ‘I would feel happy’), Level 4 and 5 responses show greater complexity in awareness with emotion blends evident (e.g. ‘I would feel angry but maybe a little bit sad as well’). Where there is no response, or the response reflects cognition (e.g. ‘I would feel like she did it deliberately’), a score of 0 is given. For each scenario, 3 scores are allocated: a score for self-awareness, other-awareness, and for total-awareness. The total-awareness score is taken as the higher of the self- and other-awareness scores” (ibid., p. 575).

Table 6.4 Summary description of the LEAS-C

Instrument and program reference: Emotional awareness scale for children (LEAS-C), Bajgar et al. (2005); PATH

Theoretical background: Developed to measure individual differences in the complexity of emotional awareness (EA): the skill most fundamental to emotional intelligence

Characterization of instrument: The LEAS-C comprises 12 evocative interpersonal scenarios

Reliability/internal consistency: Internal consistency using Cronbach’s alpha was 0.71 for self-scores, 0.64 for other-scores, and 0.66 for total scores (N = 51), p. 579

Convergent and discriminant validity: Yes; e.g. The LEAS-C was significantly related to emotion comprehension

Trait or skills measure: Description of the structure and complexity of EA. EA is described as a cognitive skill

Item examples

The 5 levels of emotional awareness with response examples: LEAS-C Scenario #7. The dentist tells you that you have some problems with your teeth that need to be fixed immediately. The dentist makes an appointment for you to come back the next day. How would you feel? How would the dentist feel?

(0) I would feel like I should have brushed my teeth more often than I did. The dentist would feel like I didn’t brush my teeth enough. (No response/cognition)
(1) I would feel it would hurt. I don’t know how the dentist would feel. (Bodily sensation)
(2) I would feel alright because we had it done before. He would feel good. (Global hedonistic state)
(3) We would both feel angry of course! (Unidimensional emotion)
(4) I would feel scared and worried. The dentist would probably feel worried and happy to fix me and get money. (Differentiated emotions)
(5) I would feel a bit worried for my teeth but excited because I don’t know what will happen. The dentist would feel hopeful and sorry. (More complex and differentiated states)

In parentheses: level of ability to describe emotions (after Bajgar et al., 2005, p. 577).

Comments

Emotional awareness (EA) is described as a skill that is fundamental to emotional intelligence and EA is defined as “the ability to identify and describe one’s own emotions, and those of other people. The construct is derived from the developmental levels of emotional awareness (LEA) model and focuses on the structure and complexity of emotion representations.
That is, the capacity to differentiate emotions from one another, and the level of emotion complexity inherent in the description of emotion experiences. EA is viewed as a cognitive skill that undergoes a developmental process similar to that described by Piaget for cognition in general” (ibid., 569). The LEAS-C measures a disposition on which levels are defined and responses are objectively scored. In this way the instrument is to be seen as a tool to describe children’s developmental levels in the domain of the experience of emotions and thus can be considered as a skill measure. The evaluation study by Goossens et al. (2012) applied the LEAS-C as an outcome measure in an evaluation of the PATH program and found no intervention effects.
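The per-scenario scoring rule quoted in the description of the measure (a self-awareness and an other-awareness level from 0 to 5, with total-awareness taken as the higher of the two) can be sketched as follows. This is only an illustration: the function name and the example levels are hypothetical, and the levels themselves would have to be assigned by a trained rater, which the code does not attempt.

```python
def total_awareness(self_level, other_level):
    """Return the total-awareness score for one scenario.

    Both inputs are rater-assigned levels from 0 to 5; the total is the
    higher of the two, as in the scoring rule quoted from Bajgar et al.
    """
    for level in (self_level, other_level):
        if level not in range(6):
            raise ValueError("levels must be integers from 0 to 5")
    return max(self_level, other_level)


# Hypothetical rater-assigned (self, other) levels for three scenarios.
rated = [(3, 1), (0, 2), (4, 4)]
totals = [total_awareness(s, o) for s, o in rated]
print(totals)  # [3, 2, 4]
```

How scenario scores are aggregated over the 12 scenarios is not specified in the passage above, so the sketch stops at the scenario level.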


6.4.3 The Empathy Index for Children and Adolescents (IECA) (Bryant’s Empathy Index), de Wied et al. (2007). PATH

Description of the measure

Affective empathy concerns the vicarious experience of emotions consistent with those of others, that is, feeling with others. The cognitive component involves understanding another’s feelings, whether by means of simple associations or more complex perspective-taking processes (Table 6.5). The IECA scale contains items that tap a range of affective reactions, including empathy (“Seeing a (boy/girl) who is crying makes me feel like crying”), sympathy (“It makes me sad to see a (boy/girl) who can’t find anyone to play with”), and personal distress (“I get upset when I see a (boy/girl) being hurt”) (de Wied et al., 2007, 99).

Table 6.5 Summary description of the IECA

Instrument and program reference: The empathy index for children and adolescents (IECA) (Bryant’s empathy index), De Wied et al. (2007). PATH

Theoretical background: Affective empathy concerns the vicarious experience of emotions consistent with those of others, that is, feeling with others. The cognitive component involves understanding another’s feelings, whether by means of simple associations or more complex perspective-taking processes

Characterization of instrument: The IECA scale contains 22 (yes/no) items that tap a range of affective reactions, including empathy (“Seeing a (boy/girl) who is crying makes me feel like crying”), sympathy (“It makes me sad to see a (boy/girl) who can’t find anyone to play with”), and personal distress (“I get upset when I see a (boy/girl) being hurt”)

Reliability/internal consistency: Two factor solution. The first factor, labeled empathic sadness, showed good reliability. The second factor, labeled Attitude, showed weak reliability across all samples, 1 (0.59), 2 (0.55), and 3 (0.54). De Wied et al. (2007)

Convergent and discriminant validity: De Wied et al. (2007) conclude that the findings seriously challenge the validity of the 22-item empathy index

Trait or skills measure: The IECA scale is to be characterized as a trait-facet measure

Exemplary Items (full item sets in ANNEX)

The IECA scale contains items that tap a range of affective reactions, including empathy (“Seeing a (boy/girl) who is crying makes me feel like crying”), sympathy (“It makes me sad to see a (boy/girl) who can’t find anyone to play with”), and personal distress (“I get upset when I see a (boy/girl) being hurt”). de Wied et al. (2007, 99).

Comments

The study by de Wied et al. (2007) examined the internal structure of Bryant’s Index of Empathy for Children and Adolescents, a 22-item questionnaire measure of dispositional affective empathy. Third graders (n = 817), fourth to sixth graders (n = 82), and eighth graders (n = 1079) were studied. Factor analyses revealed that the empathy index is multidimensional, encompassing two subscales. The same two-factor solution emerged in all samples. The first factor, labeled empathic sadness, showed good reliability in the two larger samples. Sex differences were established in each sample, with girls reporting more empathic sadness than boys. The second factor, labeled Attitude and reflecting attitudes rather than feelings, showed weak reliability across all samples, 1 (0.59), 2 (0.55), and 3 (0.54), and poor differentiation between the sexes in the two younger age samples. The authors conclude that the findings seriously challenge the validity of the 22-item empathy index. Improvement of the scale as a measure of affective empathy is indicated. The study by De Wied et al. raised doubts about the validity of the scale. In the intervention study by Goossens et al. (2012) the IECA was one of the outcome measures on which the intervention (PATH) did not show any effect. Although it would seem possible to conceive of the cognitive component as a skill (capacity to understand another’s feelings), the items reflect a general way of reacting rather than something respondents think they can do.

6.4.4 Kidcope-Child Version (Holen et al., 2012). Zippy’s Friends

Description of the measure and exemplary items

“The KIDCOPE (child version) asks about 11 different types of coping strategies, using 1 or 2 questions per strategy for a total of 15 questions. Four of the strategies asked about are approach-oriented and thus generally considered to be positive or adaptive (i.e., problem solving, positive emotion regulation, cognitive restructuring, seeking social support), whereas seven are escape-oriented and thus generally considered to be negative or maladaptive (i.e., distraction, negative emotion regulation, social withdrawal, wishful thinking, self-criticism, blaming others, resignation). Youth are asked to indicate both how often a particular coping strategy was used (i.e., frequency) and how much it helped (i.e., efficacy). Sample items include: “I tried to fix the problem by thinking of answers” (problem-solving) and “I just tried to forget it” (distraction). Frequency is assessed by asking youth whether they made use of each strategy (Yes or No); efficacy is assessed by asking youth to rate how helpful the strategy was (if used) on a 3-point scale: Not at all, A little, or A lot. Youth can be asked to self-identify a recent stressor to consider when responding to the questions on the measure or, alternatively, to consider a pre-identified type of stressor (e.g., a recent difficulty experienced in getting along with peers)”. (cited from the National Mentoring Resource Center, undated blog) https://nationalmentoringresourcecenter.org/index.php/toolkit/item/245-adaptive-coping-with-stress.html (Table 6.6).


Table 6.6 Summary description of the Kidcope-child version

Instrument and program reference: Kidcope-child version. Holen et al. (2012). Zippy’s Friends

Theoretical background: The context of the development of this instrument is pediatrics and coping behavior of patients who suffer from chronic mental setbacks and illness

Characterization of instrument: The KIDCOPE (child version) asks about 11 different types of coping strategies, using 1 or 2 questions per strategy for a total of 15 questions. Youth are asked to indicate both how often a particular coping strategy was used (i.e., frequency) and how much it helped (i.e., efficacy)

Reliability/internal consistency: Analysis of the test-retest reliability showed acceptable levels over short periods of time. The highest correlations were obtained when subjects rated the same personal stressors 3 days apart (range = 0.56 to 0.75)

Convergent and discriminant validity: Spirito et al. (1988) found reasonable concurrent validity with the coping strategies inventory (CSI)

Trait or skills measure: Developed as a screening device to distinguish psychological pathology, and a possible effect measure of therapeutic interventions, the application as an outcome measure in educational program evaluations seems risky. The instrument is seen as a skills measure

Separate scores can be computed for positive and negative coping strategies by averaging across responses for the items that ask about each type of coping (see below). This approach provides distinct information about youths’ use of both “adaptive” and “maladaptive” coping strategies. Ratings of frequency and efficacy can be considered separately or in combination when scoring the KIDCOPE. The most straightforward approach is the former, in which frequency can be computed as to whether a coping strategy was used or the total number of strategies used within a given category, and efficacy can be computed as the average of the ratings of helpfulness (0 for Not at all, 1 for A little and 2 for A lot) for those strategies endorsed (ibid.). Higher scores reflect greater reported use and/or perceived helpfulness of the indicated coping strategy or type of coping (e.g., positive or negative). Items of the Kidcope-Child Version are cited in the Annex.

Comments

Spirito et al. (1988) report on the development and testing of a version of the Kidcope instrument for adolescents.


Analysis of the test-retest reliability showed acceptable levels over short periods of time. The highest correlations were obtained when subjects rated the same personal stressors 3 days apart (range = 0.56 to 0.75). Somewhat lower correlations were obtained with the same personal stressor rated 1 week apart (range = 0.41 to 0.83, with one exception, 0.07 for blaming others) (Spirito et al., 1988, 569). When the test-retest correlations were obtained over a 10-week period, the correlations (0.15 to 0.43) were lower than those obtained at shorter intervals (ibid., p. 561). The validity of the Kidcope was assessed via comparisons with previously standardized measures of coping, the Coping Strategies Inventory (CSI) and the Adolescent Coping Orientation for Problem Experiences Inventory (ACOPE). The correlations between the primary coping strategies of the Coping Strategies Inventory and the majority of the 10 items of the Kidcope were moderate to high (range = 0.33 to 0.77). The correlations on the coping scale for adolescents (ACOPE) and the Kidcope were somewhat lower than those between the Kidcope and the Coping Strategies Inventory (ibid., 569). The clinical utility of the checklist was examined by administering the scale to several samples of children with chronic illness. In the sample of pediatric patients referred for psychological intervention, distraction was the most frequently employed coping strategy and used significantly more often than in control patients (571).

The context of application of the Kidcope is described as being “integrated into the daily clinical work of pediatric psychologists, where such a checklist would help in counseling young patients about potential coping strategies they might use for a given situation and/or to gauge the effectiveness of a particular therapeutic intervention. In addition, a brief checklist can also serve as a coping screen for large numbers of pediatric patients” (Spirito et al., 1988, 572–573).

Holen et al. (2012) used the Kidcope child version in their evaluation study of the program “Zippy’s Friends”. The results were summarized as follows: “While the children reported a significant reduction (Cohen’s d = −0.380) in oppositional coping strategies, their parents reported a significant increase in active strategies (Cohen’s d = 0.186)” (ibid., p. 671). Developed as a screening device to distinguish psychological pathology, and a possible effect measure of therapeutic interventions, the application as an outcome measure in educational program evaluations seems risky. For one thing, the declining levels of test-retest reliability, as the time interval gets longer, might confound pre-post intervention effects. The Kidcope can be considered as a skills measure.
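The "straightforward" Kidcope scoring approach cited earlier in this section (frequency as the number of strategies used within a category; efficacy as the mean helpfulness rating, coded 0, 1 or 2, over the strategies that were endorsed) can be sketched as below. The function name, the data layout and the example responses are hypothetical, and treating each response as a single strategy is a simplifying assumption, since the instrument uses 1 or 2 questions per strategy.

```python
from statistics import mean

# Coding of the 3-point helpfulness scale, as given in the cited description.
HELPFULNESS = {"Not at all": 0, "A little": 1, "A lot": 2}


def score_category(responses):
    """Score one coping category (e.g. positive or negative strategies).

    responses: list of (used, helpfulness) pairs, where used is a bool and
    helpfulness is a key of HELPFULNESS (ignored when used is False).
    Returns (frequency, efficacy); efficacy is None if nothing was endorsed.
    """
    endorsed = [HELPFULNESS[rating] for used, rating in responses if used]
    frequency = len(endorsed)
    efficacy = mean(endorsed) if endorsed else None
    return frequency, efficacy


# Hypothetical answers of one child for the four approach-oriented strategies.
positive = [(True, "A lot"), (False, None), (True, "A little"), (True, "A lot")]
frequency, efficacy = score_category(positive)
print(frequency, round(efficacy, 2))  # 3 1.67
```

Returning None rather than 0 for efficacy when no strategy was endorsed keeps "not used" distinct from "used but not helpful", which would otherwise be conflated in later averaging.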

6.4.5 The Strengths and Difficulties Questionnaire (SDQ) (Goodman, 1997). Zippy’s Friends

Description of the measure
The SDQ is a 25-item scale, to be rated by teachers, parents and students. The age range for the student self-report administration is from 11 to 16. Teachers and parents rate children from 4 to 16 years old. “The SDQ asks about 25 attributes, 10 of which would generally be thought of as strengths (e.g. ‘Thinks things out before acting’), 14 of which would generally be thought of as difficulties (e.g. ‘Often unhappy, down-hearted and tearful’), and one of which – ‘gets on better with adults than with other children’ – is neutral.” (Goodman, 1997, p. 582). Each item is rated by the respondents as Not True, Somewhat True, or Certainly True (Table 6.7).

Exemplary items (full item sets in ANNEX)
The 25 SDQ items are divided between 5 scales of 5 items each:
“Hyperactivity Scale. ‘Restless, overactive, cannot stay still for long’; ‘Constantly fidgeting or squirming’; ‘Easily distracted, concentration wanders’; ‘Thinks things out before acting’; and ‘Sees tasks through to the end, good attention span’.
Emotional Symptoms Scale. ‘Often complains of headaches, stomach-ache or sickness’; ‘Many worries, often seems worried’; ‘Often unhappy, down-hearted or tearful’; ‘Nervous or clingy in new situations, easily loses confidence’ and ‘Many fears, easily scared’.
Conduct Problems Scale. ‘Often has temper tantrums or hot tempers’; ‘Generally obedient, usually does what adults request’; ‘Often fights with other children or bullies them’; ‘Often lies or cheats’; and ‘Steals from home, school or elsewhere’.
Peer Problems Scale. ‘Rather solitary, tends to play alone’; ‘Has at least one good friend’; ‘Generally liked by other children’; ‘Picked on or bullied by other children’; and ‘Gets on better with adults than with other children’.
Prosocial Scale. ‘Considerate of other people’s feelings’; ‘Shares readily with other children (treats, toys, pencils, etc.)’; ‘Helpful if someone is hurt, upset or feeling ill’; ‘Kind to younger children’; and ‘Often volunteers to help others (parents, teachers, other children)’” (Goodman, 1997, p. 582).

Table 6.7 Summary description of the SDQ
Instrument and program reference: The Strengths and Difficulties Questionnaire (SDQ), Goodman (1997). Zippy’s Friends
Theoretical background: Diagnostic instrument meant to categorize children as likely psychiatric “cases” or “non-cases”. Yields a total difficulties score, conduct problems score, emotional symptoms score, hyperactivity score, peer problems score and a prosocial behavior score
Characterization of instrument: The SDQ is meant to be applicable to children and young people ranging from 4 to 16 years; the same version should be completed by parents and teachers and a similar version should be available for students’ self-report; both strengths and difficulties should be well represented
Reliability, internal consistency: Interrater correlations (teacher/parent) of 0.62, 0.65, 0.41, 0.54, 0.59 and 0.37 were found for the total deviance/difficulties score, conduct problems score, emotional symptoms score, hyperactivity score, peer problems score and the prosocial behavior score, respectively
Convergent and discriminant validity: High inter-measure correlations with the Rutter scale (0.92, 0.91, 0.87, 0.90) indicate concurrent validity
Trait or skills measure: The SDQ is a diagnostic trait/facet measure associated with sociability and neuroticism

Comments
The study by Goodman (1997) explored the concurrent validity of the SDQ with the Rutter scale. He found relatively high inter-measure correlations with the Rutter scale (0.92, 0.91, 0.87 and 0.90, respectively, for the Total Deviance/Difficulties score, the Conduct Problems score, the Emotional Symptoms score and the Hyperactivity score). Inter-rater reliabilities (teacher/parent) for the SDQ were 0.62, 0.65, 0.41, 0.54, 0.59 and 0.37, respectively, for the Total Deviance/Difficulties score, the Conduct Problems score, the Emotional Symptoms score, the Hyperactivity score, the Peer Problems score and the Prosocial Behavior score (ibid., p. 583). The SDQ is a diagnostic instrument meant to identify “psychiatric cases”. It is a trait/facet measure that appears to be associated with sociability and neuroticism. In the evaluation of Zippy’s Friends by Holen et al. (2012), the SDQ was used as an outcome measure. No significant effects were found on the mental health subscales as assessed by the SDQ.
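The way a five-item SDQ scale could be turned into a scale score can be sketched as follows. The sketch rests on assumptions that are not stated in this chapter: that the three response options are coded 0, 1 and 2, and that the positively worded items within a difficulties scale (here the two “strength” items of the Hyperactivity Scale) are reverse-keyed. The item keys and the example responses are hypothetical labels chosen for illustration.

```python
# Assumed numeric coding of the three SDQ response options.
RESPONSE_SCORES = {"Not True": 0, "Somewhat True": 1, "Certainly True": 2}


def score_scale(responses, reverse_keyed=()):
    """Sum the item scores of one scale, reversing positively worded items.

    responses: dict mapping an item key to one of the three response labels.
    reverse_keyed: item keys whose score is reversed (2 becomes 0, 0 becomes 2).
    """
    total = 0
    for item, answer in responses.items():
        score = RESPONSE_SCORES[answer]
        if item in reverse_keyed:
            score = 2 - score
        total += score
    return total


# Hypothetical teacher ratings on the Hyperactivity Scale for one child.
hyperactivity = {
    "restless": "Certainly True",
    "fidgeting": "Somewhat True",
    "easily_distracted": "Not True",
    "thinks_before_acting": "Certainly True",   # strength item
    "good_attention_span": "Not True",          # strength item
}
hyperactivity_score = score_scale(
    hyperactivity,
    reverse_keyed=("thinks_before_acting", "good_attention_span"),
)
print(hyperactivity_score)
```

Under these assumptions a higher score on a difficulties scale indicates more reported problems, whereas the Prosocial Scale would be summed without reversal.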

6.4.6 The Teacher Child Rating Scale (T-CRS) Hightower et al. (1986) Positive Action

Description of the measure
The Teacher-Child Rating Scale (T-CRS) is a measure of elementary children’s school problem behaviors and competencies, based on teacher ratings. Its context of application is described as early detection and prevention of young children’s social adjustment problems (Hightower et al., 1986, 394). More generally the T-CRS is seen as a social emotional screening device to be used for purposes of assessment, intervention, consultation and program evaluation. Items are rated on a 5-point scale (1 “not a problem”, 5 “a very serious problem”) for part 1, and (1 “not at all”, 5 “very well”) for part 2 of the scale. The two parts of the T-CRS (total of 43 items) were factor analyzed and yielded six factors: A, Acting Out; B, Shy-Anxious; C, Learning; D, Frustration Tolerance; E, Assertive Social Skills; F, Task Orientation (Table 6.8).

Exemplary items; more item examples in ANNEX
Acting out: Constantly seeks attention.
Shy-Anxious: Withdrawn.
Learning: Underachieving.
Frustration Tolerance: Well behaved in school.
Assertive Social Skills: Comfortable as a leader.
Task orientation: Completes work. Well organized.

Table 6.8 Summary description of the T-CRS
Instrument and program reference: The Teacher Child Rating Scale (T-CRS), Hightower et al. (1986). Positive Action
Theoretical background: The developmental context of the instrument is to assess and prevent problem behavior
Characterization of instrument: The Teacher-Child Rating Scale (T-CRS) is a measure of elementary children’s school problem behaviors and competencies, based on teacher ratings. Its context of application is described as early detection and prevention of young children’s social adjustment problems
Reliability/internal consistency: Internal consistency and test-retest reliabilities are in the range of 0.85–0.95 and 0.66–0.77 respectively
Convergent and discriminant validity: Convergent validity is established by means of correlation with similar tests, the Classroom Adjustment Rating Scale (CARS) and the Health Resources Inventory (HRI)
Trait, skill or declarative knowledge measure: The measure establishes personal tendencies and habitual responses, partially associated with school life and partially more general. It is more of a trait/facet measure than a performance oriented skill measure

Comments
Hightower et al. (1986, 402) report the following reliability coefficients:

T-CRS scale | Cronbach’s alpha | Test-retest (20 weeks)
Acting out | 0.94 | 0.66
Shy-anxious | 0.85 | 0.77
Learning | 0.94 | 0.86
Frustration tolerance | 0.92 | 0.66
Assertive social skills | 0.91 | 0.71
Task orientation | 0.95 | 0.85

Inter-rater reliabilities are not reported. With respect to validity, Hightower et al. (1986, 406) conclude that the scale’s ability to discriminate between groups known to differ in adjustment, as well as its convergent and divergent validity with other measures of child adjustment and performance, were satisfactory, which they see as support for the measure’s utility both as a screening/assessment tool and as a program evaluation tool. Further on in their paper, however, they note that this latter application (program evaluation) needs further documentation of the measure’s sensitivity to intervention changes. Flay, Alcock, Vuchinic, and Beets (2006) applied the Teacher Child Rating Scale to assess positive behaviors as an outcome of a trial of the Positive Action Program, but do not report effect sizes based on the teacher ratings. The measure establishes personal tendencies and habitual responses, partially associated with school life and partially more general. It is more of a trait/facet measure than a performance oriented skill measure. The scale was primarily designed for diagnostic screening in a context of prevention of problematic behavior.
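The two kinds of reliability coefficient discussed in these comments can be illustrated with a short sketch. It computes Cronbach’s alpha from a respondents-by-items score matrix and a test-retest coefficient as a Pearson correlation between two administrations. The function names and all data are hypothetical and serve only to show the arithmetic; they are not the T-CRS data, and the sketch makes no claim about the procedures Hightower et al. (1986) followed.

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Variance of each item across respondents (columns of the matrix).
    item_variances = [variance(column) for column in zip(*rows)]
    # Variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation, used here as a test-retest coefficient."""
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return covariance / spread


# Hypothetical ratings: three children rated on a two-item scale.
ratings = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(ratings)

# Hypothetical scale totals at two measurement occasions.
first_occasion = [1, 2, 3]
second_occasion = [2, 4, 6]
retest_r = pearson_r(first_occasion, second_occasion)
print(round(alpha, 3), round(retest_r, 3))
```

Alpha expresses how consistently the items of one administration hang together, whereas the test-retest coefficient expresses stability over time, which is why the two columns of the table above can diverge for the same scale.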

6.5 Summary of all the Instruments

Table 6.9 provides an overview of the 16 instruments that were reviewed. The purpose of our review was to provide a “hands on” overview of instruments, up to the level of citing the actual contents of the measures, the complete item sets. Documentation was therefore the first priority, and not a quality review of specific instruments. Through the way the instruments were selected, we expected to ensure a certain level of quality. The set of measures from the Education Endowment Foundation’s SPECTRUM database had all been rated positively on psychometric standards and “usability” criteria. Still, the psychometric quality ratings varied among the measures, and an overall picture was obtained of the scope of the reliability and validity assessments. The measures that were selected on the basis of being used in the program evaluation studies reviewed in Chap. 5 were all documented on the basis of published articles, in which psychometric standards had been addressed as well. In summary, as also documented in the Summary Table, the results of our overview lead to the following observations.


Table 6.9 Summary table of 16 instrument reviews

Instrument | Trait-skill characterization | Predictive and/or criterion validity addressed? | Outcome measure of educational intervention?
The short grit scale | Trait/facet | Yes | Yes
Multidimensional measure of children’s perception of control (MMCPC) | Trait/facet | Yes | Yes
The self-efficacy teacher-report scale | Trait/facet | No | Yes
How I feel questionnaire | Trait/facet | No | Yes
Emotion-regulation rating questionnaire for children and adults | Trait/facet | No | No
Rosenberg self-esteem scale (RSES) | Trait/facet | Yes | Yes
The child and youth resilience measure | Trait/facet | No | No
The children’s self-report social skills scale | Trait/facet | No | No
The basic empathy scale | Trait/facet | No | No
The expression and emotion scale for children (EESC) | Trait/facet | No | No
The social skills rating system (SSRS) | Trait/facet (skill) | No | Yes
The emotional awareness scale for children (LEAS-C) | Skill | No | Yes
The empathy index for children and adolescents (IECA) | Trait/facet (skill) | No | Yes
Kidcope-child version | Skill | No | Yes
The strengths and difficulties questionnaire (SDQ) | Trait/facet | No | Yes
The teacher child rating scale (T-CRS) | Trait/facet | No | Yes

Note In the second column we express our own qualification; in two cases the authors emphatically addressed what we consider trait/facets as skills, and in those cases the word skill is put between brackets. In the fourth column we put a yes if actual use as an outcome measure was documented, or explicitly recommended by the authors. In these latter cases we would not necessarily agree with those recommendations


First, the internal consistency of the measures was generally acceptable to high. Secondly, there was frequent support for convergent and sometimes discriminant validity of the instruments, for example when comparisons were made with Big Five measures. Thirdly, criterion validity and predictive validity were rarely addressed (only in 3 of the 16 cases). Fourthly, we found no traces of efforts to avoid response bias, such as acquiescence, in any of the instrument reviews, neither in the ones selected from the SPECTRUM database, nor in the ones selected for their use in program evaluations. In the fifth place, the measures differed in whether they were actually used, or characterized by the authors as “potentially useful”, as outcome measures in program evaluations. To a degree we had induced this variation ourselves by the purposeful selection of instruments from the program evaluations described in Chap. 5, but among the measures from the SPECTRUM database only a relative minority referred to such use. This was the case for the Rosenberg self-esteem scale (RSES), the Self-Efficacy Teacher-Report Scale, the short Grit scale and the How I feel questionnaire. Finally, we conclude that the lack of information on predictive and/or criterion validity is not supportive of an instrumental function of social emotional skills with respect to academic test performance or other academic and “life” outcomes, i.e. the situation where programs intend to apply social emotional learning as a means to improve academic outcomes.

In line with the discussion started in other chapters, the review of instruments shed some further light on distinguishing trait and skill measures. In the summary table we made a global assessment of the trait-skill characterization and noted that trait measures predominate. In some cases, we thought measures had all the characteristics of trait measures even when the authors explicitly called them skill measures. In the discussion we will try to qualify the distinction further.

6.6 Discussion

Further reflection on the “fit for purpose” of the scales reviewed in this chapter, as effect measures of educational interventions, showed that the instruments were overwhelmingly based on self-descriptions of inclinations with a general orientation and without a clear performance focus. The scales lacked hierarchical organization as implied in taxonomies of affective educational objectives (which run, for instance, from direct reactions to more encompassing and elaborate emotional reactions). Items and subscales that hinted at action orientation, like the competency dimension of Emotional Intelligence, frequently seemed to be characterized by a “meta” orientation, and moreover a “meta-cognitive” orientation. Trait measures, like most of the ones reviewed, were mostly developed for diagnostic and screening purposes, frequently in a context of prevention rather than development. It could be argued that application should indeed be limited to use for these purposes and that application as outcome measures of educational interventions is questionable. It could also be argued that, if one intends to measure social and emotional skills as goals and effect variables in educational program evaluations and student assessment, inclinations should be properly defined as skills (with a performance reference: skills should be demonstrable) with a certain degree of situation specificity. These features pose problems for the interpretation of the current results of individual program evaluations and meta-analyses thereof (see Chaps. 4 and 5), as these evaluations largely depend on trait measures and self-report data. In the worst case they should be written off as artifacts.

When it comes to the utility of applying the type of instruments that were reviewed in educational settings, two kinds of applications are at stake: (1) applying the instruments for diagnostic purposes and as a basis for adapting education to individual differences in social and emotional attributes and (2) using the instruments to assess “social and emotional outcomes”, for example in the context of evaluating social-emotional learning programs. With respect to the first application, the use of the instruments is in line with what they were developed for: assessing individual differences between persons, students in this case. To what extent this kind of diagnosis is a task for schools and teachers has been questioned by the critics of the soft skills movement, who were cited in Chap. 1. With respect to the second application, the use of this kind of psychological self-report based scales for assessment of students and evaluation of SEL intervention programs is problematic. In Chap. 2 we noted that positioning social and emotional “skills” as outcomes leads to considerable fuzziness when it is left open whether the outcomes in question may be general inclinations (habits), attitudes, or abilities to perform (skills). We therefore proposed to restrict use of the term “skill” to inclinations that can be assessed as demonstrable performance. In the review of existing lists and taxonomies of social-emotional skills in Chap. 2 we noted a less restrictive and precise use of the term skills (all social-emotional attributes are labelled as skills). Some authors, Kyllonen, Lipnevich, Burrus, and Roberts (2014) and John and De Fruyt (2015), introduced “competency” or “skill” facets of general social-emotional traits and states (e.g. Table 2.5).

The distinction between personality traits/facets and skills was reviewed more fundamentally in Chap. 3, which focused on the Big Five taxonomy. In that chapter the use of personality traits/facets as skills was severely criticized when it was concluded that “high scores on personality traits and facets don’t say anything about the social-emotional skills that students can demonstrate in situations that require to use such skills”. The overview in this chapter confirmed the criticism that was given in Chap. 3, which basically states that self-report scales intended to measure individual differences on traits and trait-facets are unfit as instruments to assess growth on social-emotional outcomes. The methodology of self-reporting on habitual responses, as happens in the reviewed instruments, should be questioned.

John and De Fruyt (2015) describe developments which take the competency and skill aspect of traits more seriously, as they refer to the emotional competency approach (Saarni, 1999), the “reflective” or “meta” perspective taken by Salovey et al. (1995), who developed the Trait Meta-Mood Scale, and the development of the MSCEIT (Mayer, Salovey and Caruso Emotional Intelligence Test). These applications are accompanied by alternative methodologies. The MSCEIT is a situational judgement test with answering vignettes in which respondents select appropriate actions in a specific social situation (also compare Lopes et al., 2011; Lopes, Salovey, Côté, Beers, & Petty, 2005). An example of an item from the MSCEIT, cited by John and De Fruyt (2015, p. 37) and Lopes et al. (2005), is the following:

Debbie just came back from vacation. She was feeling peaceful and content. How well would each action preserve her mood?
(1) She started to make a list of things at home that she needed to do.
(2) She began thinking about where and when to go on her next vacation.
(3) She called a friend to tell her about the vacation . . .

The item is seen as representing ability in emotion management, and the respondent is asked to indicate how effective different actions would be for obtaining a specified effect on the person’s experience (here, to preserve Debbie’s good mood) (ibid., p. 37). Still, the ability might also be interpreted as theoretical procedural knowledge about how to obtain certain goals in a social situation, because direct application of the ability in a performance situation is not assessed.

Abrahams et al. (2019) discuss a range of potential methodological improvements of social and emotional skills rating scales: correction for acquiescence (to avoid socially desirable answers), making use of multiple methods and informants (as a means to triangulate responses), applying forced choice answering formats, and application of rubrics and answering vignettes (describing typical situations in which a skill is challenged) as Likert scale anchors. In addition, they mention more objective ways to assess skills, like situational judgement tests and performance measures, which we discussed above.

In the development of the argumentation in this book the distinction between personality trait measures and skills emerges as a major issue. In the evaluation of programs of social emotional learning, and in the choice of instruments these studies use as effect measures, the intention to measure performance oriented inclinations (“skills”) appears to be met only in a minority of cases, and the use of scales that measure general personality trait facets predominates. Additional study and conceptual work should be the basis for improving this state of affairs. This would imply addressing more fundamental issues of rating scale development. As Abrahams et al. (2019, 466) say: “The construction of social-emotional skill rubrics will have to be based on developmental theories of specific skills describing increasing mastery levels of that skill”. This may be part and parcel of test construction in the cognitive domain but is less developed in the domain of social emotional development. For one thing, social emotional “skill levels” might not be aligned in a linear way (Drost & Verra, 2019).

In summing up we conclude that improvement of SEL assessment should take the “skills” and performance oriented nature of “social-emotional outcomes” more seriously. This could be done in the following ways:
– By careful consideration of the general versus situation specific nature of items and scales.
– By constructing items that discriminate between right and wrong choices and scales that are “maximal” rather than “typical”.5
– By considering alternative methodological approaches like situational judgment tests, and direct observations instead of self-assessments.
– By singling out items on emotion regulation (as a basic category of social emotional skills) which appeal to reflective “meta-orientations” and considering them as instances of meta-cognition, rather than “meta-affection” or “meta-conation”, which in its turn would prompt a performance based measurement approach.
– By experimenting with scales in the affective and conative domains that express substantive growth on relevant dimensions.

In addition, some more general methodological issues could also lead to improvement:
– More frequent use of age-norms for measures in the social emotional domain.
– Addressing response bias (e.g. reference bias and social desirability) and rater effects/triangulation of measures more frequently.

Finally, we should mention the limitations of this chapter. As the primary purpose was to document and illustrate use of measures in the social emotional domain, we have opted for a relatively elaborate description of a limited number of cases, while the selection was purposeful and not at all representative. Although we feel that the analysis of existing measures and instruments has further clarified major distinctions that are central in this book (e.g. between traits and skills), we present our conclusions on the conceptual clarification of measures as tentative, and relate them in the final chapter to suggestions for further research.

5 Typical performance measures are characterized by opinion- or belief-based items (typically on a Likert scale). Maximal measures are characterized by having ‘objectively’ correct and incorrect answers (Wigelsworth et al., 2017, p. 10).

Annex

In this Annex we cite item lists, or parts of the items of the instruments, discussed in this chapter. These citations fully depend on published articles, in the public domain, and not on commercial sources. In each case the reference to the publication is included.

The Basic Empathy Scale
Items of the Basic Empathy Scale (20 items)
1. My friends’ emotions don’t affect me much.
2. After being with a friend who is sad about something, I usually feel sad.
3. I can understand my friend’s happiness when she/he does well at something.
4. I get frightened when I watch characters in a good scary movie.
5. I get caught up in other people’s feelings easily.
6. I find it hard to know when my friends are frightened.
7. I don’t become sad when I see other people crying.
8. Other people’s feelings don’t bother me at all.
9. When someone is feeling ‘down’ I can usually understand how they feel.
10. I can usually work out when my friends are scared.
11. I often become sad when watching sad things on TV or in films.
12. I can often understand how people are feeling even before they tell me.
13. Seeing a person who has been angered has no effect on my feelings.
14. I can usually work out when people are cheerful.
15. I tend to feel scared when I am with friends who are afraid.
16. I can usually realize quickly when a friend is angry.
17. I often get swept up in my friends’ feelings.
18. My friend’s unhappiness doesn’t make me feel anything.
19. I am not usually aware of my friends’ feelings.
20. I have trouble figuring out when my friends are happy.

Carré, A., Stefaniak, N., D’Ambrosio, F., Bensalah, L., & Besche-Richard, C. (2013). The Basic Empathy Scale in Adults (BES-A): Factor structure of a revised form. Psychological Assessment, 25(3), 679–691. https://doi.org/10.1037/a0032297

The Children’s Self-Report Social Skills Scale
1. I look others in the face when they talk
2. I say thank you when someone does something nice to me
3. I kick or hit someone if they make me angry
4. I am bossy
5. I take turns with others
6. I listen to others when they talk
7. I share toys and games with others
8. I say I’m sorry when I hurt somebody by accident
9. When I see others playing a game I would like to play, I ask if I can join them
10. I say I’m sorry when I hurt someone on purpose
11. I help others when they need help
12. I ask others to play
13. Others like me and have fun with me
14. I make friends easily
15. Others do not like me
16. Others ask me to play
17. When I come over, others ask me to move or give them more space
18. I don’t play fairly
19. I walk up to others and start conversations
20. I speak or interrupt if someone else is talking
21. I am too loud when I talk


Danielson, C. K., & Phelps, C. R. (2003). The assessment of children’s social skills through self-report: A potential screening instrument for classroom use. Measurement and Evaluation in Counseling and Development, 35(4), 218–229.

Rosenberg Self-Esteem Scale (RSES)
(1) Satisfied with self.
(2) I’m not good at all.
(3) Have good qualities.
(4) I can do things as well as others.
(5) I don’t have much to be proud of.
(6) I feel useless at times.
(7) I am a person of worth.
(8) I don’t respect myself.
(9) I’m a failure.
(10) I have a positive self-attitude.

Bagley, C., Bolitho, F., & Bertrand, L. (1997). Norms and construct validity of the Rosenberg Self-Esteem Scale in Canadian high school populations: Implications for counseling. Canadian Journal of Counseling, 31(1), 82–92.

The child and youth resilience measure
(1) I have people I look up to.
(2) Getting an education is important to me.
(3) My caregiver(s) watch me closely.
(4) My caregiver(s) know a lot about me.
(5) I eat enough most days.
(6) I try to finish what I start.
(7) I solve problems without drugs or alcohol.
(8) I feel supported by my friends.
(9) I know where to go to get help.
(10) I feel I belong at my school.
(11) My caregiver(s) stand(s) by me during difficult times.
(12) My friends stand by me during difficult times.
(13) I am treated fairly in my community.
(14) I am aware of my own strengths.
(15) I think it is important to serve my community.
(16) I feel safe when I am with my caregiver(s).
(17) I have opportunities to develop job skills.
(18) I enjoy my caregiver(s)’ cultural and family tradition.

Liebenberg, L., Ungar, M., & LeBlanc, J. C. (2013). The CYRM-12: A brief measure of resilience. Canadian Journal of Public Health, 104, 131–135.

Emotion-regulation rating questionnaire for children and adults (ERQ-CA)
Items loading on cognitive reappraisal:
When I want to feel happier, I think about something different.
When I want to feel less bad (e.g. sad and worried), I think about something different.
When I am worried about something, I make myself think in a way that helps me feel better.
When I want to feel happier about something, I change the way I am thinking about it.
I control my feelings about things by changing the way I think about them.
When I want to feel less bad (e.g. sad, angry or worried) about something, I change the way I think about it.
Items loading on emotion suppression:
I keep my feelings to myself.
When I am feeling happy, I am careful not to show it.
I control my feelings by not showing them.
When I am feeling bad (e.g. sad, angry, worried), I am careful not to show it.

Gullone, E., & Taffe, J. (2012). The emotion regulation questionnaire for children and adolescents (ERQ-CA): A psychometric evaluation. Psychological Assessment, 24(2), 409–417.

The Self-Efficacy Teacher Report Scale (SETRS)
1. When the student begins something, he/she tries hard to finish it
2. The student begins important activities right away
3. The student believes that he/she can solve a problem no matter how hard it is
4. The student is successful
5. Running into problems only makes the student try harder
6. The student gets down to work when he/she needs to
7. The student tries things that look too hard
8. The student views the chance of failing as a challenge
9. The student achieves what he/she sets out to do
10. The student starts on big projects right away
11. The student is certain of his/her ability to be successful
12. Failing at something just makes the student try harder
13. The student can stick to and complete activities he/she does not like to do
14. When problems come up, the student faces them
15. The student has a lot of self-confidence
16. The student achieves important goals
17. The student believes that, “if at first you don’t succeed, try, try again”
18. The student is a person who does today what could be put off until tomorrow.
(cited from Erford, Duncan, & Savin-Murphy, 2010)

Reference
Erford, B. T., Duncan, K., & Savin-Murphy, J. (2010). Brief psychometric analysis of the Self-Efficacy Teacher Report Scale. Measurement and Evaluation in Counseling and Development, 43(2), 79–90.

Multidimensional measure of children’s perception of control (MMCPC)
The general domain (only the items from the General Scale will be cited).
Unknown control
When good things happen to me often there doesn’t seem to be any reason why.
Often, I can’t understand why good things happen to me.
A lot of times I don’t know why something goes wrong for me.
When something goes wrong for me I usually cannot work out why it happened.
Powerful others control
To get what I want I have to please the people in charge.
If there is something that I want to get, I usually have to please the people in charge to get it.
If an adult doesn’t want me to do something I want to do, I probably won’t be able to do it.
I don’t have much chance of doing what I want if adults don’t want me to do it.
Internal control
I can pretty much control what will happen in my life.
I can pretty much decide what will happen in my life.
When I am unsuccessful, it’s usually my own fault.
When I don’t do well at something, it’s usually my own.

Muldoon, O. T., Lowry, R. G., Prentice, G., & Trew, K. (2005). The factor structure of the multidimensional measure of children’s perceptions of control. Personality and Individual Differences, 38(3), 647–657.

The short Grit Scale
Consistency of Interest
1. I often set a goal but later choose to pursue a different one.
5. I have been obsessed with a certain idea or project for a short time but later lost interest.
6. I have difficulty maintaining my focus on projects that take more than a few months to complete.
2. New ideas and projects sometimes distract me from previous ones.
4. My interests change from year to year.
3. I become interested in new pursuits every few months.
Perseverance of Effort
9. I finish whatever I begin.
10. Setbacks don’t discourage me.
12. I am diligent.
11. I am a hard worker.
7. I have achieved a goal that took years of work.
8. I have overcome setbacks to conquer an important challenge.

Duckworth, A. L., & Quinn, P. D. (2009). Development and validation of the short grit scale (Grit–S). Journal of Personality Assessment, 91(2), 166–174.

How I feel questionnaire
There are 30 items, answering categories: “Children are asked to rate on a 5-point scale (1 = not at all true of me, 2 = a little true of me, 3 = somewhat true of me, 4 = pretty true of me, 5 = very true of me) the extent to which the statements described their emotion experience over the previous 3 months”

Item stems:

F1: Positive Emotion
14. Excited strong.
29. Excited powerful.
19. Excited often.
4. Excited all the time.
26. Happy strong.
11. Happy powerful.
1. Happy often.
16. Happy all the time.

F2: Negative Emotion
13. Mad often.
17. Sad powerful.
2. Sad strong.
7. Sad often.
23. Mad powerful.
28. Mad all the time.
8. Mad strong.
22. Sad all the time.
25. Scared often.
20. Scared strong.
5. Scared powerful.
10. Scared all the time.

F3: Emotion Control (EC)
18. Mad intensity.
27. Sad frequency.
12. Sad intensity.
15. Scared frequency.
30. Scared intensity.
24. Excited intensity.
73. Mad frequency.
69. Excited frequency.
21. Happy frequency.
46. Happy intensity.

Ciucci, E., Baroncelli, A., Grazzani, I., Ornaghi, V., & Caprin, C. (2016). Emotional Arousal and Regulation: Further Evidence of the Validity of the “How I Feel” Questionnaire for Use with School-Age Children. Journal of School Health, 86(3), 195–203.

The emotion expression scale for children (EESC)
Children respond to items using a 5-point Likert scale with scores of 1 (not at all true), 2 (a little true), 3 (somewhat true), 4 (very true), and 5 (extremely true) to indicate how well each item describes their experience with these expressive difficulties.

Poor awareness factor
8. When I feel upset, I do not know how to talk about it.
15. I often do not know why I am angry.
11. Sometimes I just don’t have words to describe how I feel.
9. I often do not know how I am feeling.
10. People tell me I should talk about my feelings more often.
3. When something bad happens, I feel like exploding.
14. I know I should show my feelings, but it is too hard.
5. I have feelings that I can’t figure out.

Expressive reluctance factor
1. I prefer to keep my feelings to myself.
4. I don’t show how I really feel in order not to hurt others’ feelings.
2. I do not like to talk about how I feel.
12. When I’m sad, I try not to show it.
7. When I get upset, I am afraid to show it.
6. I usually do not talk to people until they talk to me first.
16. It is hard for me to show how I feel about somebody.
13. Other people don’t like it when you show how you really feel.

Penza-Clyve, S., & Zeman, J. (2002). Initial validation of the emotion expression scale for children (EESC). Journal of Clinical Child and Adolescent Psychology, 31(4), 540–547.

The Social Skills Rating System (SSRS)
Given the complexity of the SSRS we refrain from citing full item sets, and only cite exemplary items in the chapter’s main text.
Key reference: Gresham, F. M., & Elliott, S. N. (1990). The Social Skills Rating System. Circle Pines, MN: American Guidance Service.

The Levels of Emotional Awareness Scale for Children (LEAS-C)
Given the complexity of the LEAS-C we refrain from citing full item sets, and only cite exemplary items in the chapter’s main text.
Key reference: Bajgar, J., Ciarrochi, J., Lane, R., & Deane, F. P. (2005). Development of the Levels of Emotional Awareness Scale for Children (LEAS-C). British Journal of Developmental Psychology, 23, 569–586. https://doi.org/10.1348/026151005X35417


The Empathy Index for Children and Adolescents (IECA) (Bryant’s Empathy Index)

Statements (answered Yes/No):
1. It makes me sad to see a girl who can’t find anyone to play with (+)
2. People who kiss and hug in public are silly (−)
3. Boys who cry because they are happy are silly (−)
4. I really like to watch people open presents, even when I don’t get a present myself (+)
5. Seeing a boy who is crying makes me feel like crying (+)
6. I get upset when I see a girl being hurt (+)
7. Even when I don’t know why someone is laughing, I laugh too (+)
8. Sometimes I cry when I watch TV (+)
9. Girls who cry because they are happy are silly (−)
10. It’s hard for me to see why someone else gets upset (−)
11. I get upset when I see an animal being hurt (+)
12. It makes me sad to see a boy who can’t find anyone to play with (+)
13. Some songs make me so sad I feel like crying (+)
14. I get upset when I see a boy being hurt (+)
15. Grown-ups sometimes cry, even when they have nothing to be sad about (−)
16. It’s silly to treat dogs and cats as though they have feelings like people (−)
17. I get mad when I see a classmate pretending to need help from the teacher all the time (−)
18. Kids who have no friends probably don’t want any (−)
19. Seeing a girl who is crying makes me feel like crying (+)
20. I think it is funny that some people cry during a sad movie or while reading a sad book (−)
21. I am able to eat all my cookies even when I see someone looking at me wanting one (−)
22. I don’t feel upset when I see a classmate being punished by a teacher for not obeying school rules (−)

References:
Key reference: De Wied, M., Maas, C., Van Goozen, S., Vermande, M., Engels, R., Meeus, W., Matthys, W., and Goudena, P. (2007). Bryant’s Empathy Index: A Closer Examination of its Internal Structure. European Journal of Psychological Assessment 23:99–104.
Other reference: Goossens et al.: Implementation of PATHS Through Dutch Municipal Health Services. IJCV: Vol. 6 (2) 2012, pp. 234–248

Kidcope-Child Version
Coping strategies in the Kidcope-Child Version (source: National Mentoring Resource Center)
1. I just tried to forget it
2. I did something like watch TV or played a game to forget it
3. I stayed by myself
4. I kept quiet about the problem
5. I tried to see the good side of things
6. I blamed myself for causing the problem
7. I blamed someone else for causing the problem
8. I tried to fix the problem by thinking of answers
9. I tried to fix the problem by doing something or talking to someone
10. I yelled, screamed, or got mad
11. I tried to calm myself down
12. I wished the problem had never happened
13. I wished I could make things different
14. I tried to feel better by spending time with others like family, grownups, or friends
15. I didn’t do anything because the problem couldn’t be fixed

Positive or adaptive strategies are measured by items 5, 8, 9, 11, and 14; negative or maladaptive strategies are measured by items 1, 2, 3, 4, 6, 10, 12, 13, and 15. Using this approach, the total number of strategies used within a given category (e.g., positive) can be computed, and efficacy can be computed as the average of the ratings of helpfulness (0 for Not at all, 1 for A little and 2 for A lot) for those strategies endorsed.

References:
The National Mentoring Resource Center, Adaptive coping with stress. Undated blog, https://nationalmentoringresourcecenter.org/index.php/toolkit/item/245-adaptive-coping-with-stress.html
Spirito, A., Stark, L. J., & Williams, C. (1988). Development of a brief coping checklist for use with pediatric populations. Journal of Pediatric Psychology, 13(4), 555–574.

The Strengths and Difficulties Questionnaire (SDQ)
Complete scale, cited from Goodman (1997), appendix, p. 586.
Child’s Name; Date of Birth; Signature. To be completed by: Teacher, Parent, Other
Considerate of other people’s feelings
Restless, overactive, cannot stay still for long
Often complains of headaches, stomach-aches or sickness
Shares readily with other children (treats, toys, pencils etc.)
Often has temper tantrums or hot tempers
Rather solitary, tends to play alone
Generally obedient, usually does what adults request
Many worries, often seems worried
Helpful if someone is hurt, upset or feeling ill
Constantly fidgeting or squirming
Has at least one good friend
Often fights with other children or bullies them
Often unhappy, down-hearted or tearful
Generally liked by other children
Easily distracted, concentration wanders
Nervous or clingy in new situations, easily loses confidence
Kind to younger children
Often lies or cheats
Picked on or bullied by other children
Often volunteers to help others (parents, teachers, other children)
Thinks things out before acting
Steals from home, school or elsewhere
Gets on better with adults than with other children
Many fears, easily scared
Sees tasks through to the end, good attention span
Key reference: Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A research note. Journal of Child Psychology and Psychiatry and Allied Disciplines, 38(5), 581–586.

The Teacher Child Rating Scale (T-CRS)
Examples of items per factor:
Acting Out: Disruptive in class; Constantly seeks attention
Shy-Anxious: Withdrawn; Unhappy, depressed, sad
Learning: Underachieving; Poorly motivated to achieve
Frustration Tolerance: Well behaved in school; Well-liked by classmates
Assertive Social Skills: Defends own views under group pressure; Comfortable as a leader
Task Orientation: Completes work; Well organized
Key reference: Hightower, A. D., Work, W. C., Cowen, E. C., Lotyczewski, B. S., Spinwell, A. P., Guare, J. C., and Rohrbeck, C. A. (1986). The Teacher Child Rating Scale: A brief objective measure of elementary children’s school problem behaviors and competencies. School Psychology Review, 15(3), 393–409.


Flay, B. R., Adcock, A., Vuchinich, S., & Beets, M. (2006). Progress report of the randomized trial of Positive Action in Hawaii: End of third year of intervention. Retrieved from Research Gate website: https://www.researchgate.net/publication/224942204.

References
Abrahams, L., Pancorbo, G., Primi, R., Santos, D., Kyllonen, P., John, O. P., et al. (2019). Social-emotional skill assessment in children and adolescents: Advances and challenges in personality, clinical, and educational contexts. Psychological Assessment, 31(4), 460–473. https://doi.org/10.1037/pas0000591.
Bagley, C., Bolitho, F., & Bertrand, L. (1997). Norms and construct validity of the Rosenberg self-esteem scale in Canadian high school populations: Implications for counselling. Canadian Journal of Counseling, 31(1), 82–92.
Bajgar, J., Ciarrochi, J., Lane, R., & Deane, F. P. (2005). Development of the levels of emotional awareness scale for children (LEAS-C). British Journal of Developmental Psychology, 23, 569–586. https://doi.org/10.1348/026151005X35417.
Bandura, A. (1977). Self-efficacy: Toward a unifying theory for behavioral change. Psychological Review, 84(2), 191–215. https://doi.org/10.1037/0033-295X.84.2.191.
Bandura, A., Barbaranelli, C., Caprara, G. V., & Pastorelli, C. (2001). Self-efficacy beliefs as shapers of children’s aspirations and career trajectories. Child Development, 72(1), 187–206. https://doi.org/10.1111/1467-8624.00273.
Bandura, A., Caprara, G. V., Barbaranelli, C., Gerbino, M., & Pastorelli, C. (2003). Role of affective self-regulatory efficacy in diverse spheres of psychosocial functioning. Child Development, 74(3), 769–782. https://doi.org/10.1111/1467-8624.00567.
Bandura, A., Pastorelli, C., Barbaranelli, C., & Caprara, G. V. (1999). Self-efficacy pathways to childhood depression. Journal of Personality and Social Psychology, 76(2), 258–269. https://doi.org/10.1037/0022-3514.76.2.258.
Barnett, W. S., Jung, K., Yarosz, D. J., Thomas, J., Hornbeck, A., Stechuk, R., & Burns, S. (2008). Educational effects of the tools of the mind curriculum: A randomized trial. Early Childhood Research Quarterly, 23(3), 299–313.
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment. Sage Publications. https://doi.org/10.4135/9781412985642.
Carré, A., Stefaniak, N., D’Ambrosio, F., Bensalah, L., & Besche-Richard, C. (2013). The basic empathy scale in adults (BES-A): Factor structure of a revised form. Psychological Assessment, 25(3), 679–691. https://doi.org/10.1037/a0032297.
Ciucci, E., Baroncelli, A., Grazzani, I., Ornaghi, V., & Caprin, C. (2016). Emotional arousal and regulation: Further evidence of the validity of the “How I Feel” questionnaire for use with school-age children. The Journal of School Health, 86(3), 195–203. https://doi.org/10.1111/josh.12370.
Credé, M. (2018). What shall we do about grit? A critical review of what we know and what we don’t know. Educational Researcher, 47(9), 606–611. https://doi.org/10.3102/0013189X18801322.
D’Ambrosio, F., Olivier, M., Didon, D., & Besche, C. (2009). The basic empathy scale: A French validation of a measure of empathy in youth. Personality and Individual Differences, 46(2), 160–165. https://doi.org/10.1016/j.paid.2008.09.020.
Danielson, C. K., & Phelps, C. R. (2003). The assessment of children’s social skills through self-report: A potential screening instrument for classroom use. Measurement and Evaluation in Counseling and Development, 35(4), 218–229. https://doi.org/10.1080/07481756.2003.12069068.


de Wied, M., Maas, C., van Goozen, S., Vermande, M., Engels, R., Meeus, W., et al. (2007). Bryant’s empathy index: A closer examination of its internal structure. European Journal of Psychological Assessment, 23(2), 99–104. https://doi.org/10.1027/1015-5759.23.2.99.
Diperna, J. C., & Volpe, R. J. (2005). Self-report on the social skills rating system: Analysis of reliability and validity for an elementary sample. Psychology in the Schools, 42(4), 345–354. https://doi.org/10.1002/pits.20095.
Drost, M., & Verra, P. (2019). Handboek RTTI met theoretische beschouwing [Handbook RTTI, with a theoretical reflection]. Bodegraven, The Netherlands: Docentplus.
Duckworth, A. L., Peterson, C., Matthews, M. D., & Kelly, D. R. (2007). Grit: Perseverance and passion for long-term goals. Journal of Personality and Social Psychology, 92(6), 1087–1101. https://doi.org/10.1037/0022-3514.92.6.1087.
Duckworth, A. L., & Quinn, P. D. (2009). Development and validation of the short grit scale (Grit–S). Journal of Personality Assessment, 91(2), 166–174. https://doi.org/10.1080/00223890802634290.
Education Endowment Foundation. https://educationendowmentfoundation.org.uk/public/files/Evaluation/SPECTRUM/V6_Guidance_document.pdf.
EEF (Education Endowment Foundation). (2018). SPECTRUM database. Education Endowment Foundation. Retrieved from https://educationendowmentfoundation.org.uk/projects-and-evaluation/evaluating-projects/measuring-essential-skills/spectrum-database/.
Erford, B. T., Duncan, K., & Savin-Murphy, J. (2010). Brief psychometric analysis of the self-efficacy teacher report scale. Measurement and Evaluation in Counseling and Development, 43(2), 79–90. https://doi.org/10.1177/0748175610373454.
Flay, B. R., Adcock, A., Vuchinich, S., & Beets, M. (2006). Progress report of the randomized trial of Positive Action in Hawaii: End of third year of intervention. Retrieved from Research Gate website: https://www.researchgate.net/publication/224942204.
Goodman, R. (1997). The strengths and difficulties questionnaire: A research note. Journal of Child Psychology and Psychiatry and Allied Disciplines, 38(5), 581–586.
Goossens, F. X., Gooren, E. M. J. C., de Castro, B. O., van Overveld, K. W., Buijs, G. J., Monshouwer, K., et al. (2012). Implementation of PATHS through Dutch Municipal Health Services: A quasi-experiment. International Journal of Conflict & Violence, 6(2), 235–248.
Gresham, F. M., & Elliott, S. N. (1990). The social skills rating system. Circle Pines, MN: American Guidance Service.
Gresham, F. M., Elliott, N., Vance, J., & Cook, R. (2011). Comparability of the social skills rating system to the social skills improvement system: Content and psychometric comparisons across elementary and secondary age. School Psychology Quarterly, 26(1), 27–44. https://doi.org/10.1037/a0022662.
Gross, J. J. (1998). The emerging field of emotion regulation: An integrative review. Review of General Psychology, 2(3), 271–299. http://dx.doi.org.proxy-ub.rug.nl/10.1037/1089-2680.2.3.271.
Gullone, E., & Taffe, J. (2012). The emotion regulation questionnaire for children and adolescents (ERQ-CA): A psychometric evaluation. Psychological Assessment, 24(2), 409–417. https://doi.org/10.1037/a0025777.
Hightower, A. D., Work, W. C., Cowen, E. C., Lotyczewski, B. S., Spinwell, A. P., Guare, J. C., & Rohrbeck, C. A. (1986). The teacher child rating scale: A brief objective measure of elementary children’s school problem behaviors and competencies. School Psychology Review, 15(3), 393–409.
Holen, S., Waaktaar, T., Lervåg, A., & Ystgaard, M. (2012). The effectiveness of a universal school-based programme on coping and mental health: A randomised, controlled study of Zippy’s Friends. Educational Psychology, 32, 657–677.
John, O., & De Fruyt, F. (2015). Framework for the longitudinal study of social and emotional skills in cities. Paris: OECD Publishing.
Jolliffe, D., & Farrington, D. P. (2006). Development and validation of the basic empathy scale. Journal of Adolescence, 29(4), 589–611. https://doi.org/10.1016/j.adolescence.2005.08.010.


Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112(3), 527–535. https://doi.org/10.1037/0033-2909.112.3.527.
Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education/Praeger.
Kyllonen, P., Lipnevich, A. A., Burrus, J., & Roberts, R. D. (2014). Personality, motivation, and college readiness: A prospectus for assessment and development. Research report. Educational Testing Service RR-14-06. ETS Research Report Series.
Liebenberg, L., Ungar, M., & LeBlanc, J. C. (2013). The CYRM-12: A brief measure of resilience. Canadian Journal of Public Health, 104(2), 131–135.
Lopes, P. N., Nezlek, J. B., Extremera, N., Hertel, J., Fernández, B. P., Schütz, A., & Salovey, P. (2011). Emotion regulation and the quality of social interaction: Does the ability to evaluate emotional situations and identify effective responses matter? Journal of Personality, 79(2), 429–467. https://doi-org.proxy-ub.rug.nl/10.1111/j.1467-6494.2010.00689.x.
Lopes, P. N., Salovey, P., Côté, S., Beers, M., & Petty, R. E. (2005). Emotion regulation abilities and the quality of social interaction. Emotion, 5(1), 113–118. https://doi.org/10.1037/1528-3542.5.1.113.
Mehrens, W. A. (1997). The consequences of consequential validity. Educational Measurement: Issues and Practice, 16(2), 16–18.
Messick, S. (1995). Standards of validity and the validity of standards in performance assessment. Educational Measurement: Issues and Practice, 14(4), 5–8. https://doi.org/10.1111/j.1745-3992.1995.tb00881.x.
Muldoon, O. T., Lowry, R. G., Prentice, G., & Trew, K. (2005). The factor structure of the multidimensional measure of children’s perceptions of control. Personality and Individual Differences, 38(3), 647–657. https://doi.org/10.1016/j.paid.2004.05.020.
Penza-Clyve, S., & Zeman, J. (2002). Initial validation of the emotion expression scale for children (EESC). Journal of Clinical Child and Adolescent Psychology, 31(4), 540–547. https://doi.org/10.1207/153744202320802205.
Popham, J. W. (1997). Consequential validity: Right concern-wrong concept. Educational Measurement: Issues and Practice, 16(2), 9–13.
Saarni, C. (1999). The development of emotional competence. New York: Guilford Press.
Salovey, P., Mayer, J. D., Goldman, S. L., Turvey, C., & Palfai, T. P. (1995). Emotional attention, clarity, and repair: Exploring emotional intelligence using the trait meta-mood scale. In J. W. Pennebaker (Ed.), Emotion, disclosure, & health (pp. 125–154). American Psychological Association. https://doi.org/10.1037/10182-006.
Sanders, P., & Brouwers, A. (2019). RCEC rating system for the quality of educational achievement tests and examinations. Arnhem, The Netherlands: RCEC.
Spirito, A., Stark, L. J., & Williams, C. (1988). Development of a brief coping checklist for use with pediatric populations. Journal of Pediatric Psychology, 13(4), 555–574.
The National Mentoring Resource Center, Adaptive Coping with Stress. Undated blog. https://nationalmentoringresourcecenter.org/index.php/toolkit/item/245-adaptive-coping-with-stress.html.
Wigelsworth, M., Humphrey, N., Black, L., et al. (2017). Social, psychological, emotional, concepts of self, and resilience outcomes. Understanding and Measurement (SPECTRUM).
Wolming, S., & Wikström, C. (2010). The concept of validity in theory and practice. Assessment in Education: Principles, Policy & Practice, 17(2), 117–132. https://doi.org/10.1080/09695941003693856.
Wools, S. (2015). All about validity: An evaluation system for the quality of educational assessment. Enschede: Universiteit Twente.

Chapter 7

Meta-Analysis of Educational Interventions Addressing Conscientiousness Facets

7.1 Introduction

In Chap. 3 we discussed the malleability of personality traits and trait facets by therapeutic interventions and showed that the effects of such interventions are not sufficiently impressive to warrant inclusion of personality development in school curricula. In this chapter, we will further test this conclusion by focusing on the effects of educational interventions on the trait conscientiousness and its underlying facets. The reason for choosing this trait is that several studies on the relationship between personality traits and success in later life show that conscientiousness is a relatively important predictor of school success (see for example Pellegrino & Hilton, 2012; Poropat, 2009). More recent studies show that the relation is non-linear (e.g., Palczynska & Swist, 2018; Rammstedt, Danner, & Lechner, 2017): individuals with an intermediate level of conscientiousness have the highest level of educational attainment. A relatively large number of intervention studies in education target social-emotional attributes that are rather similar to conscientiousness facets, which allows us to conduct a meta-analysis of the overall effects of these types of interventions. The conscientiousness facets addressed in the meta-analysis are perseverance, orderliness, achievement orientation and purposefulness.

In the sections below, we first describe the method of the study (literature search, eligibility criteria, coding, and meta-analysis method), followed by a presentation of the descriptive results (kind of interventions and measurement instruments) and the results of the meta-analysis (summary effect, publication bias, moderators). Next, we discuss some highly effective interventions in more detail, focusing on the content of the intervention, the quality of the study and the way in which the outcomes were measured. In the final section we discuss the findings and their implications.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_7


7.2 Method

7.2.1 Literature Search

We used the PsycINFO and ERIC databases to find eligible articles published in peer-reviewed journals between 1988 and 2018 and written in English. We searched on the term ‘conscientiousness’, the names of its four facets (perseverance, orderliness, achievement orientation, and purposefulness), and synonyms and derivatives of these terms. The exact search terms were as follows: conscientious*, persever*, productiv*, industrious*, concentration, concentrated, effort, dedication, orderliness, organized, organised, structured, “time management”, “management of time”, “precise working”, “working precise*”, self-monitor*, “self monitor*”, self-evaluat*, “self evaluat*”, planful*, methodic*, methodologic*, systematic*, “achievement orient*”, “performance orient*”, ambitious, ambition, “task value”, “achievement motivation”, “performance motivation”, “achievement striving”, “performance striving”, “set* high standard*”, purposeful*, self-disciplin*, “self disciplin*”, self-management, “self management”, self-control, “self control”, self-regulation, “self regulation”, self-efficien*, “self efficien*”, “task efficien*”, focus, “time on task”, goal-orient*, “goal orient*”, “goal dedicat*”, “goal set*”, “set* goal*”, and goal-set*. We searched in the descriptors of the articles. For the term ‘conscientiousness’ and the names of its four facets, we also searched in the titles and abstracts of the articles. To narrow down the overwhelming number of hits, we restricted our search to articles with one of the following terms in the title: treatment*, program*, intervention*, instruct*, experiment*, training*. Our search yielded 1965 unique hits.
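To illustrate how truncated search terms such as program* behave, the following is a minimal Python sketch of the title restriction described above. It is not the query syntax of PsycINFO or ERIC: the function names (`term_to_regex`, `compile_terms`, `title_is_eligible`) are hypothetical, and the assumption that `*` matches any run of word characters is ours; the databases may apply truncation differently.

```python
import re

# Title terms quoted from the search description; '*' is a truncation wildcard.
TITLE_TERMS = ["treatment*", "program*", "intervention*",
               "instruct*", "experiment*", "training*"]


def term_to_regex(term: str) -> str:
    """Translate one search term into a regex fragment (assumed semantics).

    A trailing '*' matches any run of word characters; the words of a
    multi-word phrase may be separated by whitespace or a hyphen.
    """
    parts = []
    for word in term.split():
        if word.endswith("*"):
            parts.append(re.escape(word[:-1]) + r"\w*")
        else:
            parts.append(re.escape(word))
    return r"\b" + r"[\s-]+".join(parts) + r"\b"


def compile_terms(terms):
    """Build one case-insensitive pattern that matches any of the terms."""
    return re.compile("|".join(term_to_regex(t) for t in terms), re.IGNORECASE)


TITLE_PATTERN = compile_terms(TITLE_TERMS)


def title_is_eligible(title: str) -> bool:
    """True when the title contains at least one of the restricting terms."""
    return TITLE_PATTERN.search(title) is not None
```

Under these assumptions, a title containing "intervention" or "programmes" passes the filter, while a title without any of the six stems does not.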

7.2.2 Eligibility Criteria

To select articles for coding, we applied eligibility criteria related to content and methodological quality. First of all, articles had to describe an educational intervention study focused on the development of one or more social-emotional attributes related to conscientiousness. Correlational studies were excluded, because it is difficult to detect causal relationships in these kinds of studies. The study had to include a control group so that it was possible to correct for naturally occurring developmental effects. We included randomized as well as non-randomized intervention studies. This is because randomized studies with relatively small samples, such as students in two classrooms, often randomly assign one class as experimental group and the other as control group. In our opinion, such small-scale randomized studies are not better than quasi-experimental studies, because differences might still exist between the groups prior to the intervention. To control for pretest differences, studies were therefore required to include a pretest measure of the same outcome type that was measured at posttest. In order to be selected, a study had to include a measure of social-emotional attributes and had to provide sufficient data to calculate an effect size. Finally, we only included studies that focused on the whole classroom or school, as we are interested in intervention effects in normal educational situations. We included studies implemented in primary school, secondary school and higher education. Studies that only focused on special needs education or Kindergarten were excluded. Applying the eligibility criteria to our hits in PsycINFO and ERIC reduced the 1965 original hits to 25 articles that were approved for further coding. These 25 articles described a total of 30 interventions, each with a different sample.

7.2.3 Coding

We used an extensive coding scheme in which we focused on general intervention characteristics, content aspects and information about the outcome measures. More specifically, we coded the following aspects:
• Duration of the intervention in weeks;
• Implementer of the intervention. We initially coded all types of implementers, and later reduced these into three categories: (1) teacher, (2) computer, and (3) other person, including (assistant) researcher;
• Education type in which the intervention was implemented. We distinguished (1) primary school, (2) secondary school, and (3) higher education;
• The size of the experimental group;
• The socioeconomic status (SES) of the students. We distinguished (1) low SES, (2) average SES, and (3) high SES;
• The country in which the study was executed;
• The conscientiousness facets that were addressed in the intervention. We distinguished four facets, namely perseverance, orderliness, achievement orientation, and purposefulness. Interventions could address multiple facets. We only coded differences between the experimental and control group;
• The focus of the program. This could be promotion or prevention;
• Whether the interventions focused on other attributes than the facets of conscientiousness. In line with the categorization of attributes described in Chap. 2, Table 2.6, we distinguished in our coding three additional categories of attributes that were addressed in the interventions in addition to conscientiousness facets: (meta)cognitive attributes (e.g., application of meta-cognitive strategies, problem-solving), affective attributes (e.g., interest, test anxiety) and social attributes (e.g., peer interaction, being cooperative). We only coded differences between the experimental and control group. When, for example, the experimental group received conscientiousness training, but both the control and the experimental group received training in cognitive attributes, we did not code the intervention for addressing cognitive attributes;
• The type of conscientiousness outcome. We distinguished (1) attitudes, beliefs, opinions (thinking), (2) volition (willing, aiming), (3) behavior (doing), and (4) skills (being able to);
• The reliability of the outcome measurement instrument;
• Reports on the validity of the outcome measurement instrument;
• The rater of the outcome measure;
• Test statistics. Mostly, means, standard deviations and sample sizes were reported, but sometimes also F-values, t-values or regression coefficients;
• When follow-up measures were reported, we also coded these.
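The coding scheme above can be pictured as one structured record per intervention. The sketch below is only an illustration of that idea in Python: the class and field names are hypothetical and were not taken from the authors' coding form, and only a subset of the coded aspects is included.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Implementer(Enum):
    TEACHER = 1
    COMPUTER = 2
    OTHER_PERSON = 3  # includes (assistant) researcher


class EducationType(Enum):
    PRIMARY = 1
    SECONDARY = 2
    HIGHER = 3


class OutcomeType(Enum):
    ATTITUDES_BELIEFS_OPINIONS = 1  # thinking
    VOLITION = 2                    # willing, aiming
    BEHAVIOR = 3                    # doing
    SKILLS = 4                      # being able to


# The four facets distinguished in the coding scheme.
FACETS = {"perseverance", "orderliness",
          "achievement orientation", "purposefulness"}


@dataclass
class InterventionRecord:
    """One coded intervention (hypothetical field names)."""
    study: str
    duration_weeks: Optional[float]
    implementer: Implementer
    education_type: EducationType
    n_experimental: int
    facets: set
    outcome_type: OutcomeType
    # Additional attribute categories, coded only when they differ
    # between the experimental and the control group.
    addresses_cognitive: bool = False
    addresses_affective: bool = False
    addresses_social: bool = False

    def __post_init__(self):
        unknown = set(self.facets) - FACETS
        if unknown:
            raise ValueError(f"unknown facet(s): {sorted(unknown)}")
```

Validating the facet names at construction time is a design choice of this sketch; it simply guards against typos when records are entered by hand.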

7.2.4 Meta-Analysis

7.2.4.1 Calculation of Effect Sizes

To perform the meta-analysis, we used the statistical package Comprehensive Meta-Analysis (CMA) version 2, developed by Biostat (see: www.meta-analysis.com). We started by calculating in CMA the effect size (Hedges’ g) and variance of each outcome measure in the individual interventions, based on the statistical information provided in the primary studies. In most studies, the intervention effect was estimated using more than one test. We included all these measures and calculated the mean effect for each individual intervention. There were a few special situations. In one study (Sanz de Acedo Lizarraga et al., 2009), two different intervention groups were compared with the same control group. This causes statistical dependency in the data (Lipsey & Wilson, 2001), resulting in too much weight being attached to the control group. We corrected for this by dividing the number of students in the control group by the number of interventions with which the control group was compared. Furthermore, in two studies (Kamps et al., 2015; Wills, Kamps, Caldarella, Wehby, & Romine, 2018) the intervention effect was measured at the level of the classroom instead of the student. We converted the class-level effect sizes into student-level effect sizes by multiplying them by the square root of the intra-class correlation, as Hedges (2007) prescribes. The variances were converted by multiplying them by the intra-class correlation. As the intra-class correlation was not reported in the studies, we estimated it at 0.1. This is also the standard for non-achievement outcomes that is recommended by the What Works Clearinghouse (Procedures Handbook, version 4.1, 2020). Table 7.1 shows the effect size on the conscientiousness outcomes and its 95% confidence interval for each of the 30 interventions. The last three columns of the table indicate which other categories of attributes (cognitive, affective and social) besides conscientiousness facets were addressed in the interventions.
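The calculations in this subsection can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of CMA's internals: it uses the common textbook formulas for Hedges' g from posttest means, standard deviations and sample sizes of two independent groups, whereas CMA accepts many other input formats. The function names are hypothetical. The two adjustment helpers implement the corrections exactly as they are described in the text (splitting a shared control group, and rescaling class-level results with an intra-class correlation of 0.1).

```python
import math


def hedges_g(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (g, variance of g) for two independent groups."""
    df = n_exp + n_ctrl - 2
    pooled_sd = math.sqrt(
        ((n_exp - 1) * sd_exp**2 + (n_ctrl - 1) * sd_ctrl**2) / df
    )
    d = (mean_exp - mean_ctrl) / pooled_sd           # Cohen's d
    var_d = (n_exp + n_ctrl) / (n_exp * n_ctrl) + d**2 / (2 * (n_exp + n_ctrl))
    j = 1 - 3 / (4 * df - 1)                         # small-sample correction
    return j * d, j**2 * var_d


def split_shared_control(n_ctrl, n_comparisons):
    """Divide a shared control group's size over the interventions
    that are compared with it, to avoid counting it more than once."""
    return n_ctrl / n_comparisons


def class_to_student_level(g_class, var_class, icc=0.1):
    """Rescale class-level results as described in the text:
    effect size times sqrt(ICC), variance times ICC."""
    return g_class * math.sqrt(icc), var_class * icc
```

For instance, with a shared control group of 60 students and two intervention groups, each comparison would enter `hedges_g` with `n_ctrl = split_shared_control(60, 2)`, that is, 30.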

7.2.5 Combining the Effects

CMA calculated an average weighted effect size for the summary of all interventions. The weight attached to each intervention depended on the variance of its effect size.


Table 7.1 Effect sizes of each intervention on conscientiousness outcomes

| First author (publication year) | Effect size conscientiousness: Hedges’ g | Effect size conscientiousness: SE | Other categories addressed: Cogn. | Other categories addressed: Aff. | Other categories addressed: Social |
|---|---|---|---|---|---|
| Acee and Weinstein (2011) | 0.53* | 0.22 | No | No | No |
| Arco-Tirado et al. (2011) | 0.40* | 0.20 | No | No | No |
| Behnam et al. (2014) | 1.20* | 0.35 | No | No | No |
| Blackwell et al. (2007) | 0.44* | 0.21 | No | No | No |
| Brown et al. (2010) iv 1 | 0.17 | 0.24 | No | No | No |
| Brown et al. (2010) iv 2 | 0.27 | 0.22 | No | No | No |
| Burrus et al. (2013/2017) iv 1 | −0.07 | 0.19 | No | No | No |
| Burrus et al. (2013/2017) iv 2 | −0.09 | 0.18 | No | No | No |
| Carbonero et al. (2017) | 0.28* | 0.12 | No | Yes | Yes |
| Clarke et al. (2014) | 0.25* | 0.08 | No | Yes | Yes |
| Dias and Seabra (2017) | 0.38 | 0.26 | Yes | Yes | Yes |
| Digedlidis et al. (2003) | 0.09 | 0.08 | No | Yes | No |
| Dignath-van Ewijk et al. (2015) | 0.04 | 0.28 | Yes | No | No |
| Eker (2013) | 1.98* | 0.31 | Yes | No | No |
| Goudas et al. (2006) | 0.70* | 0.24 | No | Yes | No |
| Goudas et al. (2008) | 0.21 | 0.18 | Yes | Yes | No |
| Jaakkola and Liukkonen (2006) | 0.21* | 0.10 | No | No | No |
| Kamps et al. (2015) | 1.22* | 0.09 | No | Yes | Yes |
| Lakes and Hoyt (2014) | 0.39* | 0.14 | No | Yes | Yes |
| Perels et al. (2005) iv 1 | −0.01 | 0.18 | No | No | No |
| Perels et al. (2005) iv 2 | 0.34 | 0.19 | No | No | No |
| Perels et al. (2009) | 0.96* | 0.29 | No | No | No |
| Sanz de Acedo Lizarraga et al. (2009) iv 1 | 1.07* | 0.20 | Yes | No | No |
| Sanz de Acedo Lizarraga et al. (2009) iv 2 | 1.39* | 0.25 | Yes | No | No |
| Sanz de Acedo Lizarraga et al. (2009) iv 3 | 0.66* | 0.23 | Yes | No | No |
| Sanz de Acedo Lizarraga et al. (2010) | 0.78* | 0.20 | Yes | No | No |
| Stoeger et al. (2008) | 0.25 | 0.14 | Yes | No | No |
| Stoeger et al. (2010) | 0.44* | 0.14 | Yes | No | No |
| Suminski et al. (2006) | 0.03 | 0.13 | No | No | Yes |
| Wills et al. (2018) | 0.78* | 0.07 | No | Yes | Yes |

Notes: *p < 0.05. Cogn. = cognitive; Aff. = affective

Interventions with a small variance were given more weight than interventions with a larger variance. As the variance depended largely on the sample size (with larger samples having a smaller variance), interventions with large samples were given more weight than interventions with smaller samples. We also performed moderator analyses (meta-ANOVA) to examine how various aspects influenced the summary effect. We used a random effects model to estimate the weighted average effect size, because the interventions included in the meta-analysis differed in many respects. We used a mixed effects model for the moderator analyses.

We also examined whether our selection of primary intervention studies was subject to publication bias. Studies are more likely to be published when the effects found in the study are significant and large rather than small, or when the study is based on a large sample size. Studies based on smaller sample sizes and reporting non-significant effects might, therefore, be underrepresented in the meta-analysis. CMA has the option to examine whether there is a relationship between sample size and effect size. The program assumes that if there is such a relationship, it can be attributed to missing studies. Using Duval and Tweedie’s Trim-and-Fill procedure as implemented in CMA, we estimated whether there were studies missing, and if so, to what extent the summary effect was likely to be biased.
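The weighting logic described above can be illustrated with a minimal random-effects sketch. It assumes the common DerSimonian-Laird moment estimator for the between-study variance; the chapter does not state which estimator CMA applied, and the function name is hypothetical. The Trim-and-Fill procedure is not reproduced here.

```python
import math


def random_effects_summary(effects, std_errors):
    """Pool effect sizes with inverse-variance weights.

    Assumption: DerSimonian-Laird estimate of the between-study
    variance (tau squared); other estimators would give other weights.
    """
    variances = [se**2 for se in std_errors]
    w = [1 / v for v in variances]  # fixed-effect weights: 1 / variance
    sum_w = sum(w)
    fixed_mean = sum(wi * g for wi, g in zip(w, effects)) / sum_w
    # Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (g - fixed_mean) ** 2 for wi, g in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    # Random-effects weights add tau squared to every study variance
    w_star = [1 / (v + tau2) for v in variances]
    mean = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"mean": mean, "se": se, "q": q, "df": df, "tau2": tau2}


# Two hypothetical interventions with equal precision.
print(random_effects_summary([0.2, 0.8], [0.1, 0.1]))
```

Because Table 7.1 reports rounded values, feeding them into this sketch is not guaranteed to reproduce the summary effect reported by CMA.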

7.3 Results

7.3.1 Descriptives

Although we searched for eligible articles published between 1988 and 2018, our final selection only includes articles published from 2003 onward. The 30 interventions described in these articles were well spread over primary school, secondary school and higher education and were carried out in various countries. The students’ socioeconomic status (SES) was reported for 15 interventions. Of these 15, six had a sample of students with an average SES, four with a low SES and five with a high SES. The average intervention took 19.2 weeks, with a large dispersion (SD = 19.6; the shortest intervention took one day, the longest two school years). Most interventions had a relatively small experimental group; only six interventions had an experimental group consisting of more than 100 students. Of the 30 interventions, 17 were implemented by the regular teacher, three were delivered by means of a computer, and in the other ten interventions the researcher, a research assistant or another trainer was the implementer. The large majority of the interventions focused on the promotion of conscientiousness facets; only two focused on the prevention of problems.

The meta-analysis includes 12 interventions that addressed conscientiousness facets only. All other interventions addressed at least one other category of attributes besides conscientiousness, as detailed in Table 7.2. All four conscientiousness facets we distinguished were quite frequently addressed in the interventions. Almost all interventions had a purposefulness component (26 in total), and about half to two thirds of the interventions addressed perseverance, orderliness and/or achievement orientation (17, 19 and 15 interventions, respectively). The numbers show that most interventions addressed multiple facets of conscientiousness. The effects of the interventions on conscientiousness outcomes were measured with a total of 97 instruments.
Most frequently measured were behavioral aspects related to conscientiousness (39.2%), followed by volition (28.9%) and skills (23.7%). Attitudes were measured in only 8.2% of the instruments. The reliability of the instruments was quite good, with an average Cronbach’s alpha of 0.79. However, for 6.2% of the instruments the reliability was not reported, and for the 5.2% of instruments that were observational measures a Cronbach’s alpha could not be calculated. Of the instruments with a reported Cronbach’s alpha, only 12.8% had a reliability below 0.70. Furthermore, 43.0% had a reliability between 0.70 and 0.79, and 44.2% had a reliability of 0.80 or higher.

The articles in which the interventions were described were much less generous in reporting information about the validity of the instruments than about their reliability. Information about validity was reported for only a small minority of the instruments. For several instruments, but not all, a few sample items or questions were described. This made it difficult to evaluate whether the instruments were indeed a good measure of the construct conscientiousness and its facets. A striking characteristic of the way the effect variables were measured was that most measurement instruments relied on self-reports of students, even when the instrument purported to measure a skill. This was the case for 89.7% of the instruments. Teachers were the rater for 5.2% of the instruments, and in another 5.2% of the cases the measurement was based on observations.

Table 7.2 Combinations of categories of attributes addressed in the interventions

Domains of attributes                                       Number of interventions
Conscientiousness only                                      12
Conscientiousness and (meta)cognitive                       8
Conscientiousness and affective                             2
Conscientiousness and social                                1
Conscientiousness, (meta)cognitive and affective            1
Conscientiousness, (meta)cognitive, affective and social    1
Conscientiousness, affective and social                     5

7.4 Meta-Analysis Results

7.4.1 Summary Effect

The summary effect of the 30 conscientiousness interventions on conscientiousness outcomes is Hedges’ g = 0.48 (SE = 0.08; p < 0.01), with a 95% confidence interval of 0.33 to 0.64. This is a significant positive effect. We assessed whether there was any publication bias by using Duval and Tweedie’s trim and fill method for a random effects model. This method is usually applied to see whether any studies are missing from a meta-analysis (Borenstein, Hedges, Higgins, & Rothstein, 2009). The method indicated that two studies were missing at the right side of the mean (thus with higher effect sizes than the average). The estimated summary effect with imputation of these two missing studies is 0.53, slightly higher than the observed summary effect in our meta-analysis. The relatively small difference suggests that our results were not substantially affected by publication bias.

The homogeneity statistic of the summary effect indicates that the variation in effects between the interventions is statistically significant (Q = 249.06; df = 29; p < 0.01), which means that the interventions do not share the same true effect size. The variance of the true effect sizes is estimated at T² = 0.158. The observed variance reflects a substantial proportion of real differences in effect sizes and only a small part of random error, as indicated by the I² of 88.4%. This suggests that there are differences between the interventions related to their characteristics (moderators) that might explain the variation in effect sizes.
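The I² statistic follows directly from the homogeneity statistic Q and its degrees of freedom. The short sketch below (hypothetical function name) shows the standard formula applied to the values reported above.

```python
def i_squared(q, df):
    """Percentage of the observed variation that reflects true
    differences between studies rather than sampling error."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Values reported in the text: Q = 249.06 with df = 29.
print(round(i_squared(249.06, 29), 1))  # prints 88.4
```

The remaining share, df / Q, is the part of the observed variation that would be expected from sampling error alone.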

7.4.2 Moderator Analysis

One moderator of the summary effect appears to be the additional category or categories of attributes addressed in the interventions, next to conscientiousness facets. A comparison of the groups with at least five interventions addressing the same combination of attributes (see Table 7.2) shows a significant between-groups difference (Q-between = 6.84; df = 2; p-value = 0.03). The 12 interventions addressing only conscientiousness had the lowest average effect size (Hedges’ g = 0.30; SE = 0.09; p < 0.01), the five interventions addressing conscientiousness as well as affective and social attributes had an average effect of 0.59 (SE = 0.19; p < 0.01), and the eight interventions addressing conscientiousness as well as (meta)cognitive attributes had the highest average effect, with Hedges’ g = 0.80 (SE = 0.19; p < 0.01). Post-hoc analysis showed that only the difference between the groups ‘conscientiousness and (meta)cognitive’ and ‘conscientiousness only’ was statistically significant (p = 0.02). In this analysis we excluded the five interventions that addressed a combination of categories of attributes that occurred only once or twice in our meta-analysis (as can be seen in Table 7.2), because it is not wise to draw conclusions about groups of interventions based on such a low number of studies.

To have a closer look at the difference between interventions with and without a focus on (meta)cognitive attributes, we performed an additional analysis in which we divided the interventions into just these two groups. This had the advantage that all interventions were included and that both groups (with and without (meta)cognitive attributes) had a sufficient number of studies to base conclusions on. This analysis showed a between-groups difference that just failed to reach significance (Q-between = 2.75; df = 1; p-value = 0.097).
However, the difference between the groups was quite remarkable, with a higher average effect size for interventions that also had a cognitive focus (10 interventions, Hedges’ g = 0.70; SE = 0.16; p < 0.01) and a lower effect size for interventions without a cognitive focus (20 interventions, Hedges’ g = 0.39; SE = 0.10; p < 0.01). Similar analyses for an additional affective or social focus showed virtually no differences between the groups of intervention studies (the differences in effect size between the groups with and without the focus in question were 0.001 and 0.011, respectively).

We also examined whether the reliability of the measurement instruments had any effect on the summary effect. A proper meta-ANOVA in which different groups of interventions are compared was not possible in this case, as an intervention could have instruments of various reliabilities and therefore could not be assigned to a single group. In order to still get an impression of the influence of the reliability of the measurement instrument on the summary effect, we ran multiple meta-analyses, each with a different selection of measurement instruments. We found no clear effects. Seven interventions had a measurement instrument with a reliability below 0.70, and their summary effect was Hedges’ g = 0.25 (SE = 0.09). The 15 interventions with a measurement instrument with a reliability between 0.70 and 0.79 had a summary effect of 0.44 (SE = 0.10), and the 17 interventions with a measurement instrument with a reliability of 0.80 or higher had an average effect of 0.32 (SE = 0.08). Furthermore,
two interventions had a measurement instrument for which the reliability was not reported. Their average effect was 0.19 (SE = 0.11). The three interventions with observations as measure had the highest average effect with 0.81 (SE = 0.21), but it should be noted that the two interventions with the highest effects in this group (Kamps et al., 2015; Wills et al., 2018) came from partly the same research team (the first author of each study was a co-author of the other) and examined the same intervention (CW-FIT). The results of these various meta-analyses indicate that there are differences in average effect size, but these differences seem rather small and not significant (the confidence intervals overlap), and there is no clear trend in the relationship between effect size and the reliability of the measurement instruments.

We performed a similar kind of comparative analysis for the type of measurement instrument. For the seven interventions with a measure of conscientiousness attitudes, beliefs or opinions we found an average effect size of Hedges’ g = 0.27 (SE = 0.10). For volitional, behavioral and skill measures we found average effects of 0.25 (SE = 0.06; 9 interventions), 0.56 (SE = 0.13; 16 interventions) and 0.32 (SE = 0.10; 12 interventions), respectively. Here again, we see small differences in average effects, but the confidence intervals still largely overlap, suggesting no significant differences between the groups of outcomes.
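The between-groups tests in this section can be approximated from the subgroup summaries alone. The sketch below is an assumption-laden illustration rather than the mixed effects analysis run in CMA: it works on the rounded subgroup means and standard errors quoted in the text, the function names are hypothetical, and the chi-square tail probability is only implemented for one or two degrees of freedom, where a closed form exists.

```python
import math


def q_between(means, std_errors):
    """Weighted sum of squared deviations of subgroup means from
    their inverse-variance weighted grand mean."""
    weights = [1 / se**2 for se in std_errors]
    grand_mean = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    q = sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means))
    return q, len(means) - 1


def chi2_upper_tail(x, df):
    """Upper-tail chi-square probability (closed forms for df 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch only covers df = 1 or df = 2")


# With versus without a (meta)cognitive focus (rounded values from the text).
q2, df2 = q_between([0.70, 0.39], [0.16, 0.10])
print(round(q2, 2), round(chi2_upper_tail(q2, df2), 2))

# Three groups: conscientiousness only, plus affective and social,
# plus (meta)cognitive (rounded values from the text).
q3, df3 = q_between([0.30, 0.59, 0.80], [0.09, 0.19, 0.19])
print(round(q3, 2), round(chi2_upper_tail(q3, df3), 2))
```

With these rounded inputs the sketch gives a Q-between of about 2.70 for the two-group comparison and about 6.58 for the three-group comparison, close to but not identical with the reported 2.75 and 6.84. Such differences are to be expected, because the reported values were computed from the study-level data.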

7.4.3 Long Term Effects

Of the 30 interventions included in the meta-analysis, five had a follow-up measure of the intervention effect. One of these (Acee & Weinstein, 2010) had its follow-up test just two weeks after the end of the intervention. Because of this short interval, we did not include this intervention in the examination of the long term effects. The four interventions we did include were from Clarke, Bunting, and Barry (2014), Dias and Seabra (2017), and two interventions from the study of Sanz de Acedo Lizarraga et al. (2009). Clarke et al. (2014) had a follow-up test 12 weeks after the end of the intervention, the others one school year after the intervention’s end. The four interventions had an average posttest effect of Hedges’ g = 0.65 (SE = 0.25; p = 0.01; 95% confidence interval of 0.16 to 1.14), and an average long-term effect of Hedges’ g = 0.32 (SE = 0.15; p = 0.03; 95% confidence interval of 0.03 to 0.61). Both effects are significant and positive, but the long term effect is about half the size of the effect directly after the end of the intervention. Table 7.3 shows the intervention effects of the four studies immediately at the end of the intervention and at the follow-up test.
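The p-values and confidence intervals quoted for the posttest and follow-up effects follow from each effect size and its standard error under a normal approximation. The sketch below (hypothetical function name, normal approximation assumed) reproduces them from the rounded values in the text.

```python
import math


def wald_summary(g, se, z_crit=1.96):
    """z statistic, two-sided p-value and 95% confidence interval
    for an effect size with a known standard error."""
    z = g / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return {"z": z, "p": p, "ci": (g - z_crit * se, g + z_crit * se)}


posttest = wald_summary(0.65, 0.25)
follow_up = wald_summary(0.32, 0.15)
for label, res in (("posttest", posttest), ("follow-up", follow_up)):
    low, high = res["ci"]
    print(label, round(res["p"], 2), round(low, 2), round(high, 2))

# Share of the posttest effect that is still present at follow-up.
print(round(0.32 / 0.65, 2))
```

The last line expresses the follow-up effect as a fraction of the posttest effect, which is the basis for the statement that the long term effect is about half the size of the immediate effect.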

Table 7.3 Intervention effects of the four studies with follow-up measures

First author                                Effect at posttest    Effect at follow-up   Time between posttest
                                            Hedges’ g (SE)        Hedges’ g (SE)        and follow-up
Clarke et al. (2014)                        0.25 (0.08)*          0.09 (0.09)           12 weeks
Dias and Seabra (2017)                      0.39 (0.26)           0.35 (0.26)           1 school year
Sanz de Acedo Lizarraga et al. (2009) iv 2  1.39 (0.25)*          0.72 (0.23)*          1 school year
Sanz de Acedo Lizarraga et al. (2009) iv 3  0.66 (0.23)*          0.32 (0.23)           1 school year
Summary effect                              0.65 (0.25)*          0.32 (0.15)*

Notes *p < 0.05

7.5 Some Examples of Effective Interventions

In the meta-analyses described above it appeared that the cognitive focus of the interventions was a moderator of their effect sizes. The results showed that the average effect size of interventions with a (meta)cognitive focus next to a focus on conscientiousness facets was the highest (0.70), while interventions without a (meta)cognitive focus were on average less effective (0.39). In the following, we describe the two studies with the highest effect sizes in each of the two groups of interventions. For this description, we only selected studies that were conducted in primary and secondary education, being the educational sectors that are most important in focusing on students’ social and emotional functioning. In the description of each study, we pay attention to the content and duration of the interventions, the quality of the studies, and the findings. As for the quality of the studies, we focus on the independence of the measurement (observation by others or self-reports), the context and intervention specificity of the measurement (whether it was bound to the school and/or domain context, and whether it was bound to the content of the intervention, i.e. teaching to the test), and the validity of the instrument in terms of item content, social desirability and acquiescence bias. With regard to the findings of the studies, we not only looked at the effects that were found for the facets of conscientiousness, but also tried to put these in perspective by comparing them with the effects on the other outcomes, short term as well as long term, that were measured in the studies. At the end of the description of each intervention we conclude whether it is plausible that the intervention indeed proved to be effective in changing the conscientiousness facet(s) at which it was aimed.

7.5.1 Two Examples of Interventions with a Cognitive Focus

The study with the highest effect size (1.99) among the cognitively focused interventions is the study of Eker (2013). The study was conducted in Social Studies lessons among two classes of Turkish students in grade 5 of secondary school. It is unclear how the two classes were assigned to the experimental (n = 30) and the control condition (n = 32). The author just mentions that the ideas of the teachers of the
Social Studies lessons and the pre-test scores were taken into account in making the decision. The intervention lasted 5 weeks, during which the Social Studies teachers gave instruction on the topic ‘realized dreams’. Additionally, in the experimental condition, students had to complete homework and, after finishing it, fill in forms that were specifically developed for learning self-regulation strategies. The content of these forms included questions about abilities like listening, taking notes, understanding and summarizing texts, using and planning study time, and getting ready for the exams. Before the intervention started, students were taught how to fill in those forms. During the intervention, the teacher gave feedback on the completed forms, as well as on the quality of the homework. In the control condition, the usual approach to instruction, homework and feedback was applied.

The effect of the intervention was measured with a Turkish version of the scale ‘self-regulation strategies’, originally developed by Pintrich and De Groot (1990) as part of the Motivated Strategies for Learning Questionnaire (MSLQ). For the Turkish version of this scale, the author of the study refers to the PhD thesis of Üredi (2005), in which a scale reliability of 0.84 is reported. Unfortunately, the thesis of Üredi is written in Turkish, so that it is not possible for us to check the validity of this scale. However, several publications about the psychometric quality of the original MSLQ show that the construct validity is still under debate and that the predictive validity is (very) low. Furthermore, all scales of the MSLQ are based on students’ self-reports, which is particularly problematic for scales regarding students’ behavior and skills because their responses suffer from acquiescence and social desirability tendencies.
Although there are methods available to correct the scores for such tendencies, no correction was applied in this study. Given the doubts about the construct validity of the self-reported self-regulation strategies, together with the fact that students in the experimental group were trained before the intervention in how to fill in the forms and had to complete the forms during the intervention after finishing their homework, it does not come as a surprise that the students reported high levels of self-regulation after the intervention: they had mastered the terms in the questionnaire (teaching to the test), they were supposed to have learnt the strategies while doing their homework, in which they had put a lot of effort (tendency of acquiescence), and they were inclined to do their teacher a favor by reporting high levels of agreement with the statements in the questionnaire about applying the strategies (social desirability). Altogether, we must conclude that this study doesn’t convincingly show that the students have indeed become more self-regulated learners as a consequence of this intervention. Because self-regulation strategies were probably practiced only during the intervention (and we aren’t even sure of that), transfer to other domains and long-term effects are uncertain.

The second publication in the group of interventions with a cognitive focus that we want to discuss is that of Sanz de Acedo Lizarraga et al. (2009), which reports two intervention studies; in the second study, two different experimental
conditions were compared with a control group. So, these two studies resulted in three effect sizes, which were 1.07, 1.39 and 0.66, respectively. The interventions were conducted among Spanish secondary school students, aged 13/14 years.

The first intervention study aimed to develop, implement and evaluate a so-called infusion method to stimulate students’ thinking skills. The infusion method (IM) consisted of using the academic course syllabus materials to teach students thinking skills, while teaching creativity and behavioral self-regulation at the same time. The participants in this study were 118 students, who were randomly assigned to the experimental group (n = 57) and the control group (n = 61). In total, four classes with four teachers were involved in the study. The intervention lasted 40 weeks (one school year). Before the intervention started, the teachers were trained in three seminars, each lasting 15 hours. The intervention consisted of teaching thinking skills, creativity, and self-regulation of learning simultaneously, along with the syllabus content of sciences, language, mathematics and social studies. At the beginning of each unit the teachers explained the thinking skill, creativity, self-regulation and transfer. In each unit they first defined the particular skill by analyzing the mental steps to be taken to practice it. They also stimulated students to produce new ideas. Next, they explained the self-regulation activities in three stages and showed how each of these activities should be practiced. In the first stage they addressed the so-called before learning self-regulation activities, which consist of organizing, identifying, generating and deciding. In the second stage they taught the while learning self-regulation activities, such as monitoring and verifying, and in the third stage the after learning activities (assessing, communicating, and learning from experience). Lastly, they stimulated transfer.
The control group was taught as usual, i.e. the lessons focused on teaching the syllabus content of the subjects. During the intervention, the researchers visited the classes of the experimental group once a week.

The effects of the intervention were measured in a pretest-posttest design, in which several measures were applied: an intelligence test (verbal, numerical, abstract, inductive and deductive reasoning), an academic achievement test, a creativity test, and a self-regulation scale. The last-mentioned scale, the most relevant in the context of this chapter, was subscale IV from the Learning Strategies Scale (ACRA), which was developed by Román and Gallego (1994). This scale assesses the application of metacognitive strategies by students, contains 34 Likert-type items, and was completed by the students themselves. The authors of this study report that the reliability of the scale was 0.75, but do not give any information about its validity. More information about the scale might be available in the publication of Román and Gallego, but unfortunately this publication is written in Spanish and not publicly available. Looking at the effect sizes of this intervention, it is remarkable that not only the effect on the conscientiousness facets (self-regulation, including orderliness and goal-directedness) is rather high (1.07), but that the effects on four of the other outcome variables (all cognitive) that were assessed are even higher, with the highest effect for academic performance (1.97).

As regards the second intervention study, the same intervention as in the first study was implemented (the infusion method). Furthermore, at the same time another intervention method, called the Instrumental Enrichment Program (IEP), was implemented with a different experimental group. The purpose of the IEP was to achieve changes in the mental processes of students that intervene in the act of thinking and in the self-regulation of behavior. The effects of both the IEP and the IM were compared with the results of the control group. For the rest, the design, instruments and analysis method were similar to those in study 1. The only exception was that a follow-up assessment took place one year after the completion of the interventions.

The results show that the effect of the IM on academic performance is again the highest (1.94), followed by the effect on the conscientiousness outcome (self-regulation, 1.39). Of the other effects on the outcome variables, four of the seven (all cognitive) are above 1.0. The effect of the second intervention, IEP, on self-regulation (0.66) is half of the effect of IM, and the effects on the cognitive outcome measures are also much lower. It is striking that the effect on academic performance is only 0.21. With respect to the follow-up, it is remarkable that almost all effects of the infusion method have increased in comparison with the posttest effects, with the highest increase for academic performance (2.8). The two exceptions are the effect on the Cattell intelligence test score, with a decrease of 0.10, and the effect on self-regulation, which is half of the effect on the posttest score immediately after the intervention. Also, most long-term effects of the IEP intervention are substantially higher compared with the posttest effects, with the highest increase for academic performance (from 0.21 to 0.80). But again, the long-term effect on self-regulation is half of the effect of this method on the posttest score on this outcome variable.

Two results are striking. First, mainly cognitive interventions integrated in the main school subjects, aimed at improving thinking skills, creativity and self-regulation, are indeed effective in improving students’ academic performance and/or general academic abilities. Moreover, after the interventions, the students who participated in the intervention report higher levels of self-regulation. Second, it is unknown whether the reported increase in self-regulation is a direct effect of the intervention or a side-effect of the improvement of the cognitive skills in the school subjects. This side-effect might be (partly) caused by the fact that students themselves attribute the improvement of their cognitive skills to the use of self-regulation strategies. However, given that in the second intervention (IEP) students’ academic performance did not improve very much while they nevertheless reported an increase in self-regulation, such a side-effect is not likely. In our view, it is more likely that students learnt during the interventions about the importance of self-regulation strategies and became aware of how to fill in questionnaires about their behavior in a socially desirable and acquiescent manner. This awareness could be strongest just after the end of the interventions and could have faded away after one year, which might also explain the decrease in reported self-regulation in the follow-up assessment in both interventions. Here, again, the way in which self-regulation was measured in the studies, without any correction for acquiescence or social desirability, is too problematic to safely conclude that this concept is changeable by educational interventions.

7.5.2 Two Interventions Without a Cognitive Focus

The first study in the category of interventions without a cognitive focus that we describe is the study of Kamps, Wills, Dawson-Bannister, Heitzman-Powell, Kottwitz, Hansen, and Fleming (2015), which had the highest effect size in this category of studies (1.22). The study aimed at determining the effect of the so-called Class-Wide Function-Related Intervention Teams (CW-FIT) program on the improvement of students’ on-task behavior, next to increasing teacher recognition of appropriate behavior. The intervention is a group contingency classroom management program consisting of teaching and reinforcing appropriate behaviors, like getting the teacher’s attention, following directions, ignoring inappropriate behaviors of peers, etc. Seventeen elementary schools in the US participated in the randomized trial of CW-FIT: four schools in year 1, five in year 2, five in year 3, and three in year 4. Each school participated in the study for one school year (40 weeks). A total of 159 teachers participated in the four years of the study, 86 in the CW-FIT group and 73 in the comparison group. Class sizes ranged from about 18 to 25 students in both groups. Prior to the assignment of teachers to the CW-FIT or comparison group, each teacher selected a time of the day with challenging student behaviors for study implementation. Of the 86 teachers in the CW-FIT group, 51% selected math classes, 33% reading classes, 12% writing classes, 1% science classes, and 3% other classes. In the comparison condition the distribution was 47%, 32%, 7%, 4% and 11%, respectively. The CW-FIT intervention consisted of a behavioral intervention designed to teach appropriate skills and reinforce students’ use of the skills by using a game format. The intervention was implemented three to four times a week, beginning in mid to late October and continuing to March of the same school year for participating teachers/classes.
Before the intervention started, teachers received training consisting of a two-hour workshop and two or three sessions of modeling the procedures. During the intervention they received weekly feedback from coaches.

Before the intervention started, baseline data were collected in both the CW-FIT and the comparison group during a period of two to three weeks. During this period, group on-task data were collected by independent, trained observers (interrater agreement was 90%), using a 30-s time sampling procedure. Every 30 s, the observer scanned the group of students and recorded a plus for each row or small group of students (on average four students per row or group) if ALL students were on-task. Within each 30 s, the observer rotated from group 1 to group 2, to group 3, etc. until each group was scored, and then began the sequence again. If any one member of a group was off-task, the observer scored a minus in the box for that group. On-task was defined as all students working appropriately on the assigned activity, including (a) attending to the material/task, (b) making appropriate responses, (c) asking for assistance in an appropriate manner, and (d) waiting appropriately for the teacher to begin or continue with instruction. For the CW-FIT classes, group on-task data were collected for on average 1–2 sessions per week per class during baseline (number of
observations (n) = 277) and intervention (n = 975). For the comparison group, scores were recorded 233 times during baseline in the fall and 420 times during baseline 2 (winter to early spring). The results show that the CW-FIT intervention classes showed larger increases in on-task behavior over time than the comparison group classes. On-task data were higher in the CW-FIT condition during each of the four years of the study. On average, class-wide on-task behavior during CW-FIT increased from 52% at baseline to 83% during the intervention. The comparison group classes increased their on-task behavior from 50% at baseline to 56% at baseline 2, a substantially smaller increase than in the experimental condition.

Our conclusion from this study is that the CW-FIT intervention indeed seems able to promote students’ on-task behavior at group level, which is related to the conscientiousness facets of orderliness and persistence. Also, the outcome variable was measured appropriately, through independent observations of student behavior with a reliable and valid instrument. However, the question is whether this intervention can also change students’ on-task behavior more permanently, when peer and teacher pressure is absent. Because no follow-up data are available on the long-term effects of this intervention, we do not know whether the changes at group or class level were maintained after the intervention had finished, let alone whether on-task behavior has become part of students’ individual personality trait of conscientiousness.

The second study in the category of interventions without a cognitive focus that we want to discuss here is the study of Goudas, Dermitzaki and Leondari (2006), which had an effect size of 0.70.
The purpose of that study was to examine the effects of a life skills training program, taught as part of physical education lessons and aimed at goal setting and positive thinking. Four 7th grade classes in Greece, with a mean student age of 12 years, participated in the study. Two classes were assigned to the experimental group and two to the control group. The implemented life skills program was an abbreviated form of two other programs, Going for the Goal (GOAL; Danish et al., 1992a, b) and Sports United to Promote Education and Recreation (SUPER; Danish, Fazio, Nellen, & Owens, 2002). The program GOAL is a 10-hour program designed to teach adolescents a sense of personal control and confidence about their future so that they can make better decisions and ultimately become better citizens. The program SUPER is a sports-based adaptation of the GOAL program, which assumes that linking sports and life skills makes sense because both require the learning of skills that are taught in the same way. SUPER is taught as sports clinics with participants involved in (a) learning physical skills related to a specific sport, (b) learning life skills related to sports in general, and (c) playing the sports. The procedures for implementing SUPER differ from those for GOAL: skill modules are adapted to fit the specific sport and the time available (most skills require 20 to 30 min to teach), there is less writing, and activities are more action-oriented and related to sport. The abbreviated form of GOAL and SUPER that was implemented in the study of Goudas et al. differed from the other two programs in that the sessions were shorter (10–15 min), they took place during physical education, and the program began with physical fitness tests, the results of which served as stimuli for participants to set goals.

At the beginning of the intervention study, students were evaluated on two physical fitness tests. In the first two sessions of the intervention, in addition to practice, students discussed their performance on the tests and the importance of setting goals with the physical education teacher. They also learnt about the characteristics of reachable goals and how to set goals for themselves that were positively stated, specific and achievable in one month. In session three they learnt how to make a plan to reach their goal. In sessions four, five and six, they learnt about positive thinking and, together with physical practice, practiced positive self-talk and replacing negative thoughts with positive ones. Finally, in sessions seven and eight goal setting and positive thinking were reinforced. During the eight sessions students had as their reference point the personal goal that they had to achieve in the particular physical condition test. The students in the control group practiced the same physical exercises (strength and flexibility activities) as the students in the intervention group, working mostly by themselves using a self-check approach. The physical practice sessions lasted 30–35 min in each group. The remaining minutes of the lesson were spent on the life skills training in the experimental group, while students in the control group received short lectures about the Olympic Games.

The outcomes of the intervention were assessed three times with a physical fitness test (push-ups and sit and reach), a life skills knowledge test (knowledge about how to set goals, achieve them and think positively) and a questionnaire on self-beliefs about the ability to set goals, solve problems and think positively; the questionnaire is the most relevant measure in this context. The questionnaire, developed by Papacharisis et al. (2005), consisted of 10 items, of which five assessed students’ perceived ability to set goals and five assessed their perception of their positive thinking ability. The reliability of the first scale varied between 0.70 and 0.76 across the three measurements, and that of the second scale between 0.76 and 0.88. The authors refer to Papacharisis et al. (2005) for further information about the structural validity of the scales.

Looking at the results of the intervention, it is remarkable that the effect size for perceived goal setting ability (0.70) is almost as high as the effect size for physical fitness (strength; 0.72). So, students in the experimental condition improved more on physical skills than students in the control group (although it is remarkable that this holds only for strength and not for flexibility), while the amount of physical practice was similar in both groups. This might be an effect of the goal setting focus of the intervention. At the same time, however, it is no wonder that students who have attained the goal that they set also perceive themselves as being good at goal setting, which as such says nothing about their actual goal setting ability. So, the conclusion here must be that there is insufficient evidence that this intervention indeed resulted in a positive change in goal setting, which can be considered related to the conscientiousness facet of being goal-directed.
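As an illustration of the observational measure used in the first study described above (Kamps et al., 2015), the following minimal Python sketch scores 30-s time sampling records and reproduces the percentage-point gains reported in the text. It rests on an assumption that is not stated in the source, namely that the class-level on-task percentage is the share of group records scored as a plus. The function name `on_task_percentage` and the example observation record are hypothetical.

```python
from typing import List


def on_task_percentage(records: List[List[bool]]) -> float:
    """Percentage of group records scored as a plus.

    Each inner list is one row or small group of students observed in one
    30-second interval; each boolean says whether that student was on-task.
    A group earns a plus only if ALL of its students were on-task.
    Assumption: the class-level score is pluses divided by total records.
    """
    if not records:
        raise ValueError("no observations recorded")
    pluses = sum(1 for group in records if all(group))
    return 100.0 * pluses / len(records)


# Hypothetical record: four groups of four students, one scan each.
example = [
    [True, True, True, True],    # plus
    [True, False, True, True],   # minus: one student off-task
    [True, True, True, True],    # plus
    [False, False, True, True],  # minus
]
print(on_task_percentage(example))  # 50.0

# Percentage-point gains, using the averages reported in the text.
cwfit_gain = 83 - 52       # CW-FIT classes: baseline to intervention
comparison_gain = 56 - 50  # comparison classes: baseline to baseline 2
print(cwfit_gain, comparison_gain, cwfit_gain - comparison_gain)  # 31 6 25
```

Under these assumptions the reported gains differ by 25 percentage points (31 for the CW-FIT classes against 6 for the comparison classes). The sketch says nothing about statistical significance or about the effect size of 1.22, which depends on information not given here.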

7.6 Conclusions

The meta-analysis of intervention studies focusing on conscientiousness attributes showed that the studies had, on average, a moderate positive effect on conscientiousness outcomes measured right at the end of the intervention. Interventions that addressed (meta)cognitive attributes in addition to facets of conscientiousness were the most effective and had a moderate to high effect size on average; the average effect of interventions without a (meta)cognitive focus was lower but still small to moderate and significant. In the longer term, however, the intervention effects seemed to decrease. The four interventions with a follow-up measure showed that effects were halved three months after the end of the intervention (one study) or one school year after it (three studies). This indicates that the students did not fully internalize the conscientiousness attributes and shows that it is difficult to teach students to use a repertoire of conscientiousness attributes sustainably, at least when the training covers only a demarcated and sometimes limited period. This finding might imply that it is very hard to alter students’ personality traits and attributes and the skills that follow from them, at least in the sense that students can apply what they have learned in other contexts and over a longer period of time. However, there is also another explanation for the fact that intervention effects on conscientiousness attributes are not sustainable and transferable: the students may not actually have learned anything during the training, but merely have become aware of how to complete the questionnaires in a desirable way. The description of the individual intervention studies shows that the conclusions from our meta-analysis are based on several studies in which the hypothesized causal relationship between what is taught and what is measured as effect is open to discussion.
Prior to the start of the meta-analysis, we tried to overcome this problem by selecting only studies with a proper research design (pretest-posttest control group design). It appeared, however, that intervention studies often used measurement instruments of questionable quality. Most studies used students’ self-reports about their behavior or skills, and the students’ scores were computed without taking social desirability or acquiescence tendency into account, while at the same time the wording of the questionnaires was very similar to the concepts used in the training (teaching to the test). The only exception was the study of Kamps et al., in which the effects of the intervention were based on observations of students’ behavior in class, conducted at a rather high frequency by independent observers using a reliable and valid instrument. Given the quality of this study, we may assume that the evidence that it is possible to change students’ on-task behavior in classrooms, across several domains together, is convincing. But again, it is unknown whether the effects that were found were sustained over time and generalize across different contexts, and thus whether the conscientiousness facets of orderliness and perseverance did indeed become part of students’ personality system.

Altogether, we must conclude that, just as appeared from our review of psychological studies (Chap. 3), the evidence from educational studies thus far is too weak
to support the idea that schools should include personality development as a separate domain in their curriculum. Even though the average effect size is rather large, many interventions had no effect at all. And our description above shows clearly that the interventions with high effect sizes are not beyond dispute. Even those highly effective intervention studies used instruments that are not suitable for properly measuring changes in personality traits. In three of the four described interventions, the instruments were based on students’ self-reports without any correction for acquiescence or social desirability. The exception is the intervention of Kamps et al. (2015), but the problem in that study is that the outcome measure is rather narrow (bound to classroom behavior) and that no long-term data are available. These conclusions are largely in line with the conclusions on the effects of educational interventions on a broader set of social and emotional attributes as described in Chaps. 4 and 5, and with the conclusion in Chap. 6 about how these attributes are usually measured. We will come back to these issues in the final chapter of this book, in which the implications of these findings will be discussed.
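The pretest-posttest control group design that served as a selection criterion lends itself to a standardized effect size that compares gains between groups. The sketch below shows one common way to compute such an effect size: the difference in mean gains divided by the pooled pretest standard deviation, with a small-sample correction. It is offered as an illustration only and is not necessarily the procedure used in this meta-analysis. The function names and all input values are hypothetical.

```python
import math


def pooled_sd(sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Pooled standard deviation of the treatment and control groups."""
    return math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )


def gain_effect_size(
    m_pre_t: float, m_post_t: float, sd_pre_t: float, n_t: int,
    m_pre_c: float, m_post_c: float, sd_pre_c: float, n_c: int,
) -> float:
    """Standardized difference in gains for a pretest-posttest control design.

    Assumption: the gain difference is standardized by the pooled pretest SD
    and multiplied by the usual small-sample correction 1 - 3 / (4 * df - 1).
    """
    gain_diff = (m_post_t - m_pre_t) - (m_post_c - m_pre_c)
    d = gain_diff / pooled_sd(sd_pre_t, n_t, sd_pre_c, n_c)
    df = n_t + n_c - 2
    correction = 1 - 3 / (4 * df - 1)
    return correction * d


# Hypothetical values: a self-report scale, 50 students per group.
g = gain_effect_size(
    m_pre_t=3.0, m_post_t=3.5, sd_pre_t=1.0, n_t=50,
    m_pre_c=3.0, m_post_c=3.1, sd_pre_c=1.0, n_c=50,
)
print(round(g, 2))  # 0.4
```

An effect size of this kind describes the difference at posttest only. As the conclusions above stress, it does not show whether the effect persists at follow-up, nor whether a self-report outcome reflects actual behavior.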

References
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Chichester, England: Wiley. http://doi.org/10.1002/9780470743386.
Danish, S. J., Fazio, R., Nellen, V. C., & Owens, S. (2002). Community-based life skills programs: Using sport to teach life skills to adolescents. In J. V. Raalte & B. Brewer (Eds.), Exploring sport and exercise psychology (2nd ed., pp. 269–288). Washington, DC: APA Books.
Danish, S. J., Mash, J. M., Howard, C. W., Curl, S. J., Meyer, A. L., Owens, S., et al. (1992a). Going for the goal leader manual. Richmond VA: Department of Psychology, Virginia Commonwealth University.
Danish, S. J., Mash, J. M., Howard, C. W., Curl, S. J., Meyer, A. L., Owens, S., et al. (1992b). Going for the goal student activity manual. Richmond VA: Department of Psychology, Virginia Commonwealth University.
Hedges, L. V. (2007). Effect sizes in cluster-randomized designs. Journal of Educational and Behavioral Statistics, 32, 341–370. http://doi.org/10.3102/1076998606298043.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.
Palczynska, M., & Swist, K. (2018). Personality, cognitive skills and life outcomes: evidence from the Polish follow-up study to PIAAC. Large-scale Assessments in Education, 6(2). http://doi.org/10.1186/s40536-018-0056-z.
Papacharisis, V., Goudas, M., Danish, S., & Theodorakis, Y. (2005). The effectiveness of teaching a life skills program in a sport context. Journal of Applied Sport Psychology, 17(3), 247–254. https://doi.org/10.1080/10413200591010139.
Pellegrino, J. W., & Hilton, M. L. (2012). Education for Life and Work: Developing Transferable Knowledge and Skills in the 21st Century. National Academies Press.
Pintrich, P. R., & de Groot, E. V. (1990). Motivational and self-regulated learning components of classroom academic performance. Journal of Educational Psychology, 82(1), 33–40. https://doi.org/10.1037/0022-0663.82.1.33.
Poropat, A. E. (2009). A meta-analysis of the five-factor model of personality and academic performance. Psychological Bulletin, 135(2), 322–338. https://doi.org/10.1037/a0014996.

Rammstedt, B., Danner, D., & Lechner, C. (2017). Personality, competences and life outcomes. Results from the German PIAAC longitudinal study. Large-scale Assessments in Education, 5(11), 2. http://doi.org/10.1186/s40536-017-0035-9.
Román, J. M., & Gallego, S. (1994). Escalas de estrategias de aprendizaje Manual [Learning strategies scales, manual]. Madrid: TEA Ediciones.
Üredi, I. (2005). The contributions of perceived parenting styles to 8th class primary school students’ self-regulated learning strategies and motivational beliefs. Unpublished Doctoral thesis. Yildiz Technical University, Social Sciences Institute Istanbul, Turkey.
What Works Clearinghouse. (2020). What Works Clearinghouse Procedures Handbook, Version 4.1. Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. Retrieved from https://ies.ed.gov/ncee/wwc/Docs/referenceresources/WWC-Procedures-Handbook-v4-1-508.pdf.

Studies included in the meta-analysis
Acee, T. W., & Weinstein, C. E. (2010). Effects of a value-reappraisal intervention on statistics students’ motivation and performance. Journal of Experimental Education, 78(4), 487–512. https://doi.org/10.1080/00220970903352753.
Arco-Tirado, J., Fernandez-Martin, F., & Fernandez-Balboa, J. (2011). The impact of a peer-tutoring program on quality standards in higher education. Higher Education: The International Journal of Higher Education and Educational Planning, 62(6), 773–788. https://doi.org/10.1007/s10734-011-9419-x.
Behnam, B., Jenani, S., & Ahangari, S. (2014). The effect of time-management training on test-anxiety and self-efficacy of Iranian intermediate EFL learners. Iranian Journal of Language Teaching Research, 2(1), 45–61.
Blackwell, L. S., Trzesniewski, K. H., & Dweck, C. S. (2007). Implicit theories of intelligence predict achievement across an adolescent transition: A longitudinal study and an intervention. Child Development, 78(1), 246–263. https://doi.org/10.1111/j.1467-8624.2007.00995.x.
Brown, T., Hillier, T., & Warren, A. M. (2010). Youth employability training: Two experiments. Career Development International, 15(2), 166–187. https://doi.org/10.1108/13620431011040950.
Burrus, J., Jackson, T., Holtzman, S., & Roberts, R. D. (2017). Teaching high school students to manage time: The development of an intervention. Improving Schools, 20(2), 101–112. https://doi.org/10.1177/1365480216650309.
Burrus, J., Jackson, T., Holtzman, S., Roberts, R. D., & Mandigo, T. (2013). Examining the efficacy of a time management intervention for high school students. Research report. ETS RR-13-25. ETS Research Report Series.
Carbonero, M. A., Martín-Antón, L. J., Otero, L., & Monsalvo, E. (2017). Program to promote personal and social responsibility in the secondary classroom. Frontiers in Psychology, 8, article 809. https://doi.org/10.3389/fpsyg.2017.00809.
Clarke, A. M., Bunting, B., & Barry, M. M. (2014). Evaluating the implementation of a school-based emotional well-being programme: A cluster randomized controlled trial of Zippy’s Friends for children in disadvantaged primary schools. Health Education Research, 29(5), 786–798. http://doi.org/10.1093/her/cyu047.
Dias, N. M., & Seabra, A. G. (2017). Intervention for executive functions development in early elementary school children: Effects on learning and behaviour, and follow-up maintenance. Educational Psychology, 37(4), 468–486. https://doi.org/10.1080/01443410.2016.1214686.

Digelidis, N., Papaioannou, A., Laparidis, K., & Christodoulidis, T. (2003). A one-year intervention in 7th grade physical education classes aiming to change motivational climate and attitudes towards exercise. Psychology of Sport and Exercise, 4(3), 195–210. https://doi.org/10.1016/S1469-0292(02)00002-X.
Dignath-van Ewijk, C., Fabriz, S., & Buttner, G. (2015). Fostering self-regulated learning among students by means of an electronic learning diary: A training experiment. Journal of Cognitive Education and Psychology, 14(1), 77–97. https://doi.org/10.1891/1945-8959.14.1.77.
Eker, C. (2013). The effect of given homework upon the instruction of self-regulation strategies that were directed to develop self-regulation strategies. Educational Research and Reviews, 8(19), 1804–1809. https://doi.org/10.5897/ERR2013.1570.
Goudas, M., Dermitzaki, I., Leondari, A., & Danish, S. (2006). The effectiveness of teaching a life skills program in a physical education context. European Journal of Psychology of Education, 21(4), 429–438. https://doi.org/10.1007/BF03173512.
Goudas, M., & Giannoudis, G. (2008). A team-sports-based life-skills program in a physical education context. Learning and Instruction, 18(6), 528–536. https://doi.org/10.1016/j.learninstruc.2007.11.002.
Jaakkola, T., & Liukkonen, J. (2006). Changes in students’ self-determined motivation and goal orientation as a result of motivational climate intervention within high school physical education classes. International Journal of Sport and Exercise Psychology, 4(3), 302–324. http://doi.org/10.1080/1612197X.2006.9671800.
Kamps, D., Wills, H., Dawson-Bannister, H., Heitzman-Powell, L., Kottwitz, E., Hansen, B., et al. (2015). Class-wide function-related intervention teams “CW-FIT” efficacy trial outcomes. Journal of Positive Behavior Interventions, 17(3), 134–145. https://doi.org/10.1177/1098300714565244.
Lakes, K. D., & Hoyt, W. T. (2004). Promoting self-regulation through school-based martial arts training. Journal of Applied Developmental Psychology, 25(3), 283–302. http://doi.org/10.1016/j.appdev.2004.04.002.
Perels, F., Dignath, C., & Schmitz, B. (2009). Is it possible to improve mathematical achievement by means of self-regulation strategies? Evaluation of an intervention in regular math classes. European Journal of Psychology of Education, 24(1), 17–31. https://doi.org/10.1007/BF03173472.
Perels, F., Gürtler, T., & Schmitz, B. (2005). Training of self-regulatory and problem-solving competence. Learning and Instruction, 15(2), 123–139. http://doi.org/10.1016/j.learninstruc.2005.04.010.
de Acedo, S., Lizarraga, M. L., de Acedo, S., Baquedano, M. T., Goicoa Mangado, T., & Cardelle-Elawar, M. (2009). Enhancement of thinking skills: Effects of two intervention methods. Thinking Skills and Creativity, 4(1), 30–43. https://doi.org/10.1016/j.tsc.2008.12.001.
de Acedo, S., Lizarraga, M. L., de Acedo, S., Baquedano, M. T., & Pollán Rufo, M. (2010). Effects of an instruction method in thinking skills with students from compulsory secondary education. The Spanish Journal of Psychology, 13(1), 127–137. https://doi.org/10.1017/S1138741600003723.
Stoeger, H., & Ziegler, A. (2008). Evaluation of a classroom based training to improve self-regulation in time management tasks during homework activities with fourth graders. Metacognition and Learning, 3(3), 207–230. https://doi.org/10.1007/s11409-008-9027-z.
Stoeger, H., & Ziegler, A. (2010). Do pupils with differing cognitive abilities benefit similarly from a self-regulated learning training program? Gifted Education International, 26(1), 110–123. https://doi.org/10.1177/026142941002600113.
Suminski, R. R., & Petosa, R. (2006). Web-assisted instruction for changing social cognitive variables related to physical activity. Journal of American College Health, 54(4), 219–225. https://doi.org/10.3200/JACH.54.4.219-226.
Wills, H., Kamps, D., Caldarella, P., Wehby, J., & Romine, R. S. (2018). Class-wide function-related intervention teams (CW-FIT): Student and teacher outcomes from a multisite randomized replication trial. Elementary School Journal, 119(1), 29–51. https://doi.org/10.1086/698818.

Chapter 8

Recapitalization and Discussion of the Main Findings and Implications for Educational Practice, Theory and Research

8.1 Development of the Soft Skills Movement in Education

A first important notion is the realization that attention to students’ social and emotional functioning has a long tradition in education. Good working attitudes and good behavior have always been important areas on which students were monitored. In didactic analysis, taxonomies of educational objectives were designed for the affective domain as well as the cognitive domain. What has changed is that the reservation with which the non-cognitive domains used to be treated, because of privacy concerns and fear of indoctrination, seems to have disappeared and is being replaced by very deliberate attempts to stimulate social-emotional learning and promote social and emotional outcomes, even beyond the school context.

The rise of the soft skills movement should, first and foremost, be seen in line with modernization, in the sense of keeping education responsive to the needs of a changing society, including changing demands of the labor market as well as cultural trends. As part of this ‘normal’ process a certain convergence occurred in underlining the importance of social and emotional attributes for life outcomes. In the case of ‘twenty-first century skills’, labor market demands were the most important impetus, while a more cultural trend was stimulated through the development of the theory of Emotional Intelligence, with the development of social-emotional learning programs (SELs) in its wake. As an additional supportive trend, the quest for a new locus of moral education, caused by the decline of religious education in Western societies, might be discerned in the shape of ‘character education’. And finally, a certain resistance against technology, rational planning and accountability, which is pervasively present, at least latently, in the educational province, could be seen as lying behind calls for holistic approaches and “teaching the whole child”.
Perhaps ironically in relation to this latter perspective, however, the OECD’s current interest

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5_8


in soft skills is criticized by some as a new kind of super accountability (international league tables on soft skill performance).1

Institutionally and geographically the roots of the movement are situated in the United States, both in the sense of the more labor-market-oriented call for “twenty-first century skills” and the propagation of social-emotional learning by organizations like the “Collaborative for Academic, Social and Emotional Learning” (CASEL). The movement obtained an important boost from international organizations, with strong technical developmental input from the OECD, and to a lesser degree also from the European Union. In both cases the call for social-emotional skills was embedded in a larger program of modernization of educational outcomes, which had a strongly cognitive dimension as well. The developments by the OECD can be traced back to the literacy concept in the PISA study, and the conceptualization of ‘cross curricular competencies’ within the framework of the DESECO project. The literacy concept propagates cognitive skills that are broader, are oriented towards real-life applications and transcend national curricula. The DESECO project incorporated social and emotional ‘transferable’ skills. The European Union’s key competences range from basic numeracy and literacy skills, through digital and citizenship-related skills, to transferable social-emotional skills. The EU developments in this area have explicitly incorporated technical and vocational education and embraced the competency concept as a multi-faceted outcome definition, combining cognitive, affective and motivational concepts.

A significant development is the conceptual work of the OECD to connect social and emotional outcomes of schooling with the Big Five taxonomy of personality traits (Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism/Emotional stability). This work underpins the piloting of a large-scale international study. Connecting social-emotional skills to the Big Five traits and facets, encompassing the main dimensions of the CASEL social-emotional model, leads to a well-founded categorization, but at the same time raises new questions about the interpretation of measures and malleability assumptions.

As far as societal importance is concerned, data on investments in social-emotional learning in the USA show that, as of 2010, this had become big business and a multi-billion affair. The global implementation is strongly stimulated by the input from the OECD and the World Bank, and by the diffusion in countries that belong to the European Union.

1 See Chap. 1, the section on criticism.

8.2 Critical Questions and Technical Challenges

8.2.1 Appropriateness

Doubts about the legitimacy of affective educational objectives were already voiced at the time of the well-known taxonomies of educational objectives by Bloom and Krathwohl in the nineteen seventies. Personal characteristics, attitudes, preferences and appreciations are private, and when it comes to education they belong to the domain of the family. As the new movement extends “formation” to the whole personality, and objectives are phrased as improving the whole person, this criticism is given new life. Issues of privacy and indoctrination will tend to be considered more serious to the degree that normative standards and ‘appropriate’ attitudes go beyond what is generally accepted, like the declaration of human rights, or norms that conform to a certain denomination of the school. The edge of this debate depends on whether personality characteristics are treated as desired educational objectives and outcomes or as otherwise relevant conditions. In the first case social and emotional outcomes could be considered as becoming part of the formalized civil effect of schooling (as with examinations) and given a place among indicators for accountability purposes. In the second case social and emotional assessments could be used for diagnostic purposes or seen as supportive of academic performance. More generally the question is to what extent fostering social and emotional outcomes is a core task of the school, as important as knowledge transmission and the development of cognitive skills. Information technology is sometimes seen as a new challenge, as social and emotional data on students might be insufficiently protected.

8.2.2 Conceptual Fuzziness

When we introduced the title of this book as the soft skills movement, the key term was deliberately put in quotation marks, indicating that this label is no more than a compromise to express a complex, not clearly demarcated subject in a brief and accessible way. Various terms are used to indicate the phenomenon: social-emotional learning, social-emotional skills, twenty-first century skills, transferable skills, key competences, non-cognitive skills, non-cognitive attributes, character skills. All these terms have something in common, but each also has a specific, divergent meaning. In this book we had some trouble choosing a uniform term for our subject. Mostly we settled on the terms ‘social-emotional learning’, ‘social-emotional outcomes’ and ‘social-emotional skills’. But depending on the context in which the key term was required, we sometimes deviated and referred to social and emotional attributes (when we were not sure whether the attribute could really be seen as a skill), and to non-cognitive attributes (when we were trying to rule out cognitive/affective hybrids).

The sub-components that are to be subsumed under the chosen label, and their categorization, form another level in the conceptual structure. For example, the domain of non-cognitive attributes is sometimes divided into the categories of affective and conative constructs (the latter concerned with strivings and motivations). But at this level, too, there does not seem to be a consistent application of sub-categories and terms. As some of the nomenclature within the soft skills movement is borrowed from psychology, the origin of key terms is meaningful in the sense that it touches on the important technical question of malleability. This is particularly evident in the use of the term skills with respect to the Big Five personality traits, as these traits were originally defined as relatively stable characteristics, typical of the way individual persons think, feel and behave.

8.2.3 Tenability of Technical Assumptions Concerning Measurement and Malleability

Some psychologists have compared the soft skills movement’s ambition to change social and emotional attributes to witchcraft (Wilbrink, 2016). Questions about the measurement and malleability of social-emotional attributes are closely related to how these psychological attributes are conceptualized and defined: as personality traits, as more specific behavior, or as something in between. Qualifying social-emotional attributes as skills is another definitional issue that has implications for measurement and assessment. There is no consistent application of the term skills in the literature that is brought to bear on the soft skills movement. Sometimes personality traits are indicated as skills, while in other cases the term skills is reserved for observable performance. A third question is related to assessing social and emotional attributes as intended outcomes (seen as realized educational objectives). Only in the case of affective taxonomies of education are “affective dispositions” hierarchically scaled on meaningful dimensions like growing complexity and internal mental organization of value systems.

8.3 The Categorization of Concepts

8.3.1 Social and Emotional Attributes at the Core of the Soft Skills Movement

A broad range of concepts is subsumed under the heading of soft skills. At the fringes, attainment outcomes, such as reduced absenteeism, are sometimes included. Other constructs, like self-regulation and meta-cognition, have a predominantly cognitive interpretation but may be associated with regulating emotions and understanding

social situations as well. Next, good behaviour, pro-social behaviour and reduced “bad behaviour, such as substance-use, bullying and delinquency, are quite central in social-emotional learning programs in the United States, which had a strong orientation towards prevention rather than development. The term ‘character education’ is sometimes used in the sense of ‘personality formation’ but has a strong element of moral education and being positive as well. The psychological background of the concept formation in the area of soft skills, namely the heritage of emotional intelligence and the more recent turn to the Big Five personality taxonomy takes us closer to social and emotional attributes and social and emotional skills as the possible core of the soft skills movement. In behavioural terms the social-emotional domain has to do with emotion control, (control of fear and aggression), adequate social functioning, self-confidence, effective work-related attitudes and self-reflection. The last two categories might be seen as cognitive-affective hybrids, related to meta-cognition. In Chap. 2 we ultimately settled on an ordering framework that follows the direction taken in the preparation of the OECD study on social and emotional skills, by placing the Big Five personality traits on a central axis, while adding distinction between an object and meta-level interpretation of trait facets, and including tentative examples of skill equivalents (see Table 2.6). But de facto the conceptual analysis was continued in Chap. 3, where evidence from personality psychology gave more depth to the distinction between traits and trait facets, on the one hand and skills on the other. And in Chap. 6, the measurement issue forced us to try and become even more precise about operational definitions of traits, skills and more indistinct habitual inclinations.

8.3.2 The Meaning of Social-Emotional Learning at School

Creating a regulated and orderly working climate is a basic task in schooling and most relevant to social and emotional experiences. Over and above cognitive stimulation, regulated learning environments influence students in a way that could be called informal learning of social emotional skills. But the soft skills movement has given rise to more deliberate efforts to stimulate social-emotional learning. Interventions range from explicit reflection on social and emotional behaviour to direct teaching of scripted curricula, teaching for example a vocabulary for discussing emotions and creating specific learning situations to exercise social and emotional skills. In the latter case one could also speak of intended social-emotional outcomes. Teaching and creating specific learning environments that function as dedicated interventions to stimulate social emotional outcomes are intended to go beyond the learning of a set of behavioural tricks. Just as in the case of subject matter related cognitive teaching, transfer, in the sense of more generalized behaviour and application in a broad range of situations, is envisaged. But teaching to stimulate social-emotional learning towards more general dispositions has feasibility challenges. These have to do with the determinacy of the entrance situation of students, having a model of the growth process, and a rationale for setting norms with respect to
social emotional attainment and skill levels. Addressing these feasibility issues was an additional reason to probe deeper into the meaning of the concepts of psychology, to consider spontaneous development or growth of personality over time as a benchmark for intervention effects, and put into question whether one should aim for good personalities, and what this would mean.

8.3.3 Hierarchical Ordering of Personality Concepts and Implications for Measurement and Malleability

Concepts of personality can be put on a continuum running from characteristics that are assumed to apply in all kinds of situations, to behavior in specific situations. A second assumed parallel continuum runs from stable, innate, personal characteristics to modifiable behavior, highly adaptive to the demands of the situation. General personality traits are considered as applying to a broad range of situations and relatively stable. Situation specific behavior is modifiable but constrained by personality traits. Trait facets indicate specific aspects of traits but are still considered as context independent. At the same time, it is conceivable that application of trait facets is connected to broad situational categories, like functioning at school, or functioning in workplace situations. In Chap. 2 we discussed various contributions to specification of the hierarchical ordering of personality concepts but noted that fundamental uncertainties remain. At the end of the chapter these uncertainties were summarized as follows:
– the fuzziness, uncertainty of application, and measurement problems of non-achievement oriented inclinations as discussed by De Groot and Medendorp (1986);
– the complexity of the competency concept, particularly in coming to grips with the way cognitive, motivational and attitudinal facets are supposed to interact;
– unsettled demarcation issues between traits, more specific behavioral dispositions, and skills;
– questionable usefulness of concepts like ‘meta affection’ and ‘meta conation’.

The conceptual apparatus of personality taxonomies is unproblematic as far as its use for establishing individual differences is concerned. But it is problematic in an educational context, when it comes to measurement for assessment purposes and evaluating the effectiveness of interventions aimed at stimulating growth of students in social emotional outcomes.
The key issues that are at stake are the appropriateness of personality measures to assess growth in social and emotional “skills” and expectations about the malleability of the psychological attributes in question by means of educational interventions. Both themes were further addressed in the subsequent chapters of the book.

8.4 The Malleability of the Big Five Personality Traits and Trait Facets; an Excursion into Psychological Research

In Chap. 3 an excursion was made into the domain of psychological studies on various aspects of the Big Five taxonomy. Following Roberts (2009), personality traits were defined as reflecting people’s characteristic, rather enduring, consistent and automatic patterns of thoughts, feelings and behaviors that distinguish people and that are afforded in specific environments. The qualification ‘enduring’ distinguishes traits from states, where states represent thoughts, feelings and behaviours captured in the moment and, by default, in the situation. As for the description of the Big Five dimensions it was noted that a rather high level of agreement exists among researchers on four of the Big Five bi-polar dimensions. These dimensions are Extraversion, Agreeableness, Conscientiousness and Emotional Stability (sometimes referred to by its opposite pole, Neuroticism). There is less agreement about the label of the fifth dimension. It has been referred to as Culture, Intellect, Openness to Experience, or Autonomy, dependent on the personality test that is used. Hendriks et al. (2011) describe the first four mentioned traits, based on the Five-Factor Personality Inventory (FFPI, Hendriks, Hofstee and De Raad, 1999) as follows: “Extraversion refers to social expressiveness and activity level. Extraverted individuals, scoring high on Extraversion, seek other people’s company and are talkative and active, whereas their opposites—introverted people—scoring low on Extraversion, prefer to be left alone. Agreeable people are mild, peace-loving, and cooperative, whereas their opposites—disagreeable people—are bossy, competitive, and quarrelsome. Conscientiousness refers to how people perform tasks. People high on Conscientiousness are organized, dependable, and precise, whereas their opposites score low on Conscientiousness and are chaotic, careless, and procrastinating.
Emotional stability describes a person’s level of emotional reactivity. Emotionally stable individuals are calm, even-tempered, and readily overcome setbacks, whereas their opposites—emotionally unstable individuals—get overwhelmed by emotions easily” (p. 220). Lastly, the fifth factor in the FFPI, Autonomy, refers to an individual’s intellectual approach to life, a trait of which the core meaning is independent thought and decision making (analyzing problems, forming own opinions). The meaning of the fifth factor in other personality assessment instruments, Openness, refers to the tendency to appreciate new art, ideas, values, feelings and behavior. A review of the relevant literature (meta-analyses, research reviews and individual studies) was used to analyze a set of topics that were considered to address major issues of this book. Themes that were subsequently addressed were the stability of personality traits and facets, their consistency across different contexts, the genetic basis and potential changeability of personality, and evidence on the effectiveness of (sub)clinical and counseling interventions aimed at changing personality. The main results of the study are summarized as follows: Although the review showed changing developmental patterns in personality traits, this should not be confused with evidence of malleability by means of specific interventions. In the rare cases that such evidence was found, it was related to personalized clinical interventions. Altogether we did not find sufficient evidence from the psychological literature confirming that it is possible to change students’ traits and facets of traits in education. As we acknowledged the relevance of the Big Five taxonomy as an ordering frame for social-emotional attributes, we noted that it would only make sense to use it as an outcome oriented assessment framework in education if the social and emotional attributes derived from the Big Five were properly defined as skills. In the case of OECD’s current international study on social and emotional skills, we noted that the scales (although referred to as skills) actually refer to trait facets, and this makes a big difference. Personality traits describe individuals’ ‘typical behaviour’, while skills are related to what individuals can demonstrate in a situation in which a performance is required. From the perspective of personality theory, and the research that was reviewed, one would not expect anything more than marginal change on scales that measure trait facets, when these are used as outcome measures in evaluations of educational interventions.

8.5 Overview of the Educational Research Evidence on Malleability

Evaluations of educational interventions aimed at fostering social-emotional attributes provide “the proof of the pudding” as far as the malleability expectations of the soft skills movement are concerned. In Chap. 4 results from three different sources were summarized: studies by economists, meta-analyses based on evaluations of Social Emotional Learning (SEL) programs, and educational effectiveness studies. The studies by economists like Heckman and others have drawn a lot of attention but have been criticized on various grounds as well. The most relevant evidence comes from a series of meta-analyses, all of them showing small to medium-sized effects of SEL. Notwithstanding the thoroughness of these meta-analyses, the validity of the evidence can be questioned on several points. We noted heterogeneity of effect sizes on outcomes assessed directly after program completion and unconvincing evidence on follow-up effects. When discussing the meta-analytic studies in more detail we found considerable ambiguity and vagueness in the definition of the interventions, often many-faceted programs, as well as heterogeneity of the SEL outcome measures and a lack of standardization and quality documentation for them. Together these issues create problems of comparability and interpretation of the cause-and-effect associations that are central in the studies that make up the raw material of the meta-analyses. How certain can we be that SEL effects are really the results of social-emotional learning and not the side effects of cognitive training aspects that are also present in multi-faceted programs? A striking finding was that SEL programs generally also appeared to produce significant effects on academic performance. Effect sizes were comparable to, and sometimes even exceeded, results of meta-analyses based on programs fully dedicated to enhancing cognitive academic performance. Are the evaluations of SEL programs positively biased? Although addressed in the meta-analyses, lack of independence of the basic evaluation studies (because program staff also conducted the evaluations) and subjective bias in the outcome measurement (self-reports) might still have caused inflated effect sizes. Next to effect sizes on academic performance, another possible benchmark for the SEL program effect sizes is a recent meta-analysis of high quality randomized field trials in education, which showed an average effect size of 0.06 (Lortie-Forgues and Inglis, 2019). Among the 141 selected interventions, published between 2012 and 2018, only 4 involved social-emotional learning programs, with effect sizes of 0.03, −0.11, 0.02 and 0.08. In the context of the third research strand considered in Chap. 4, non-experimental educational effectiveness studies, considerably lower effect sizes predominate. Comparisons between these two research strands require some caution, however. Non-experimental effectiveness studies are based on correlations between variables that represent malleable conditions of schooling and outcome measures. Gross indicators of school effects are based on the proportion of total student level variation that is explained by the between school variation on the outcome variables, frequently expressed as intraclass coefficients. Generally, school effects on academic performance are much larger than school effects on social-emotional outcomes. For example, in a study based on data from the PISA international assessment study, Brunner et al. (2018) found that the international median intraclass correlation was 0.4, while the school effects for students’ affect and motivation were much smaller (median values of the intraclass coefficient varied from 0.02, e.g., general openness, to 0.08, mathematics self-efficacy).
Altogether the review of the research evidence left many unanswered questions and prompted us to turn to the individual program evaluations, on which the meta-analyses were based, before making more definite assessments. This was the focus of the next chapter, Chap. 5.
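
The gross school effect referred to above, the share of total student-level variation that lies between schools, can be written out as a small calculation. The sketch below is illustrative only: the function name and the variance components are hypothetical, chosen so that the resulting coefficients equal the medians cited from Brunner et al. (2018); they are not data from that study, and the simple ratio shown is the textbook definition rather than the estimation procedure used there.

```python
def intraclass_correlation(between_school_var: float, within_school_var: float) -> float:
    """Return the share of total outcome variance that lies between schools."""
    total_var = between_school_var + within_school_var
    if between_school_var < 0 or within_school_var < 0 or total_var == 0:
        raise ValueError("variance components must be non-negative and not both zero")
    return between_school_var / total_var


# Hypothetical variance components, with total variance scaled to 1.0 for readability.
achievement_icc = intraclass_correlation(0.40, 0.60)    # academic performance
self_efficacy_icc = intraclass_correlation(0.08, 0.92)  # mathematics self-efficacy
openness_icc = intraclass_correlation(0.02, 0.98)       # general openness

for label, icc in [
    ("academic performance", achievement_icc),
    ("mathematics self-efficacy", self_efficacy_icc),
    ("general openness", openness_icc),
]:
    print(f"{label}: {icc:.2f} of the variance lies between schools")
```

Read this way, a coefficient of 0.02 means that almost all variation in the outcome is found among students within the same school, which leaves little room for school-level conditions to explain it.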

8.6 Opening “Black Boxes”; The Content of Social-Emotional Learning Programs and Further Reflection on Their Evaluations

An important concept in reviewing improvement programs and their evaluations is the concept of program theory. The program theory is the overall rationale of the program, preferably specified as a set of prospective means-goals associations. The most basic program theory is the expectation that program X will positively influence outcomes Y. An operational version of this basic program theory will describe the “means” in terms of interventions, inputs and malleable variables, and the outcomes in terms of measurable variables. Preferably the choice of interventions will rest on substantive expectations, derived from earlier research, or on hypothetical causal mechanisms, which may be theory-based. When it comes to the evaluation of SEL programs, two basic, rudimentary, program theories can be distinguished, which depend on the position of the social-emotional outcome variables. In the first case social-emotional attributes that are influenced by the program are the prime desired outcomes and in the second case, social-emotional outcomes are seen as mediators, instrumental to other outcomes, like academic performance or life outcomes. In a review of the rate of return of programs that included “non-cognitive components” Kautz et al. (2014) found that most programs were at the same time targeted at improving cognitive outcomes, while improvements in non-cognitive outcomes other than behavioral improvement were mentioned as program facets, but rarely assessed by means of psychological measures. In the SEL program evaluations that we selected as illustrative cases, just one program assessed academic outcomes only; the other programs included both types of measures. Explicit substantive assumptions about the supposed instrumental effects of social emotional outcomes on academic outcomes were formulated in just two cases. In all other cases social-emotional attributes and academic performance were treated as separate, “unconnected” outcomes. Though understandable from a more managerial point of view, from the perspective of academic research the lack of specification of the causal structure of the programs is a liability. For advancement of the soft skills movement as a research program, a more theory driven research practice is very much needed. In Chap. 5, eight social-emotional learning programs were described. For each program at least one program evaluation was discussed, while for some programs several evaluation studies were referred to.
In order to give the reader an impression of the contents of these programs, considerable space was given to descriptions of the program interventions. Program content was expressed in terms of a range of behavioral and social-emotional components: avoidance and prevention of behavioral problems, enhanced self-control, communicating and understanding emotions, facilitating positive social interaction, improved social climate at school, coping and social problem solving. In two of the programs values of ‘good character’ are promoted. In all programs social-emotional functioning and behavioral skills were presented in connection to school life. Although to varying degrees, all programs see development of social-emotional skills in relationship to cognitive teaching and learning. In some cases, this relationship was more explicitly seen as instrumental: improved behavior and socio-emotional functioning were considered as facilitating cognitive development and academic performance. Most program evaluations used both socio-emotional outcomes and academic outcomes as effect criteria. Intervention modes were most frequently curriculum documents, like lesson programs and scripted teaching approaches, specific teacher training for the program, and sometimes adapted modes of school organization, staff cooperation and parent involvement. The case descriptions gave rise to the following evaluative reflections, which add to the assessment of the evidence in Chap. 4. Most programs had a strong orientation toward prevention of anti-social behavior. We concur with the review of programs by Kautz et al. (2014) when they note the following challenges: “Firstly, many interventions are only assessed with no, or short-term follow up. Secondly, not all studies measure the same outcomes. Thirdly, many programs target specific student populations, and most of them target disadvantaged groups. Fourthly, “different programs use different, often incompatible, measurement schemes” (p. 32). We noted that program rationales and specification of causal models were almost absent. In this context we mark a possibly important drawback of program evaluations designed as randomized field experiments: they tend to be rather uninformative ‘black box assessments’, particularly in the case of very broad and many-faceted interventions. Non-experimental effectiveness studies may be more adept at testing elaborate causal structures, which include social-emotional outcomes and cognitive academic performance. In the discussion section of the chapter several examples of theory based mechanisms were given, which might explain complex interactions between cognitive and social-emotional facets of teaching and learning.

8.7 Measurement of Social and Emotional Learning Outcomes

The actual use of instruments and outcome measures for evaluation of SEL related interventions was discussed in more detail in Chap. 6. One of the main purposes was to document typical instruments that are used to assess social and emotional attributes, as effect measures of Social and Emotional Learning (SEL) programs and review their quality and “fit for purpose”. The selection of instruments was facilitated by an overview by the British Education Endowment Foundation (EEF, 2018, SPECTRUM database: https://educationendowmentfoundation.org.uk/projects-and-evaluation/evaluating-projects/measuring-essential-skills/spectrum-database/), in which instruments were rated on psychometric quality and scientific support. In our selection we tried to cover the most relevant social and emotional attributes, roughly covering underlying personality dimensions like conscientiousness, emotional stability, extraversion, agreeableness and openness (see Summary Table 1 in Chap. 6). The results showed that the instruments met psychometric criteria of reliability (mostly assessed as internal consistency), and frequently had positive indices on convergent and discriminant validity. Predictive and criterion validity were rarely checked. When it came to judging the fit for purpose as social-emotional outcome measures, the critical conclusion that was drawn in Chap. 3 was largely corroborated (this conclusion basically stated that self-report scales intended to measure individual differences on traits and trait facets are unfit as instruments to assess growth on social and emotional outcomes). The scales developed for measuring the concepts of
personality are developed to measure individual differences and scale scores depend on the number of items that reflect positive or negative instances of the trait. It was concluded that:
– Instruments were mostly based on self-descriptions of inclinations with a general orientation and without a clear performance focus (and therefore, in our view, are not to be seen as skills measures)
– Only in a minority of cases were performance aspects of dispositions included, rather limited to cases where the competency element in Emotional Intelligence was emphasized
– The scales lacked hierarchical organization as implied in taxonomies of affective educational objectives (which run, for instance, from direct reactions to more encompassing and elaborate emotional reactions). Another way of expressing this point is that scales lacked criterion referenced norms.
– Items that hinted at action orientation, for example in terms of emotion control, frequently seemed to be characterized by a “meta” orientation, and moreover a “meta-cognitive” orientation.

In the discussion reference was made to alternative methodological approaches. It was concluded that improvement of SEL assessment should take the performance oriented nature of “social-emotional outcomes” more seriously. This could be done in the following ways:
– By making items criterion referenced
– By constructing items that discriminate between right and wrong choices
– By considering alternative methodological approaches like situational judgment tests, and direct observations in real life performance situations
– By singling out items on emotion regulation (as a basic category of social-emotional skills), which appeal to reflective meta-orientations and considering them as instances of meta-cognition, rather than meta-affection or meta-conation, which in its turn would prompt a performance based measurement approach, targeted at meta-cognitive skills
– By experimenting with assessment instruments that express growth of performance in the social-emotional domain on relevant dimensions substantively, like in the case of taxonomies of educational objectives
– By developing standardized tests of social-emotional functioning, implying that the instruments are normed on representative samples, for different age groups (a recent example is the CITO monitoring system “VISEON” (Kuhlemeier, Knoop, van Boxtel, Papenburg & Hollenberg, 2016)).

8.8 A Meta-Analysis of Educational Intervention Studies Addressing Conscientiousness Aspects

We included a new meta-analysis in the study on which this book is based (Chap. 7), to add to the knowledge base about soft skill effects, and re-address some of the methodological challenges which we had encountered when reviewing earlier meta-analyses. We chose conscientiousness because of the plausibility that facets of this trait have a positive influence on educational outcomes, an impression that, to a degree, has also been empirically confirmed. Several studies on the relationship between personality traits and success in later life show that conscientiousness is a relatively important predictor of school success and life outcomes. The conscientiousness facets that were addressed in the educational intervention studies that were sampled are perseverance, orderliness, achievement orientation and purposefulness. After applying a standard set of inclusion criteria to an initial total of 1965 studies, we obtained 30 eligible interventions. The mean effect size of the 30 conscientiousness interventions on conscientiousness outcomes amounted to Hedges’ g = 0.48 (SE = 0.08; p < 0.01), with a 95% confidence interval of 0.33 to 0.64. Although this is a medium effect size for outcomes directly measured after completion of the intervention, we concluded that our results do not convincingly demonstrate that personality traits (conscientiousness in this case) are altered in the sense that students can apply what they have learned in other contexts, and over a longer period of time. In line with conclusions drawn in other chapters we noted many problems with the outcome measures: frequent self-reports unchecked for social desirability, “trait” instead of skill measures and “teaching to the test” in the sense of exaggerated proximity of intervention and assessment content.
Next, the finding that interventions with cognitive components registered higher effect sizes questions the internal validity of the (social emotional) intervention assessment (pseudo treatment).
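
As a reading aid for the summary statistics above, the relation between the pooled effect size, its standard error and the confidence interval can be sketched as follows. This is a minimal illustration that assumes a plain normal approximation; the function name is hypothetical, and the meta-analysis in Chap. 7 may have used unrounded values or a different interval method, which would explain why this back-of-the-envelope check gives roughly 0.32 to 0.64 rather than exactly the reported 0.33 to 0.64.

```python
from statistics import NormalDist


def normal_confidence_interval(estimate: float, standard_error: float, level: float = 0.95):
    """Two-sided confidence interval under a normal approximation."""
    # Critical value of the standard normal distribution (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Rounded summary values as reported in the text: g = 0.48, SE = 0.08.
lower, upper = normal_confidence_interval(0.48, 0.08)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

The interval excludes zero by a wide margin, which is consistent with the reported p < 0.01, but it says nothing about the validity concerns raised in the text (self-reports, trait measures, proximity of intervention and assessment content).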

8.9 Implications for Educational Policy and Practice, and Research

8.9.1 Levels of Ambition and Evidence

Does the evidence support widespread and intensive implementation of social-emotional learning in education? We showed that social-emotional learning is defined in various ways and educational implementations have different ambition levels. The scope of socio-emotional learning varies from behavioral modification to personality formation, and the ambition level varies from striving for social and emotional outcomes as a new category of formal educational objectives, to a more instrumental function to support attainment of academic goals and life outcomes.

Model 1: informal social-emotional learning → life outcomes
Model 2: social-emotional learning → social-emotional outcomes
Model 3: social-emotional learning → (social-emotional outcomes and academic outcomes)
Model 4: social-emotional learning → academic outcomes
Model 5: social-emotional learning → social-emotional outcomes → academic outcomes
Model 6: social-emotional learning → social-emotional outcomes → academic outcomes → life outcomes

Fig. 8.1 Social-emotional learning effect models

To systemize the argumentation, we turn back to the schematic overview of variations of social-emotional learning effect studies in Chap. 5. We extend the framework by making a distinction between informal and intentional social emotional learning in order to contrast model 1 with the subsequent models (see Fig. 8.1). The figure presents a categorization of the types of evidence that can be obtained by means of different model variants, but it could also be used to scale ambition levels of educational interventions and programs. The relevance of social and emotional development is sometimes motivated by the claim that social and emotional attributes are very important for adequate functioning in modern society, the labor market included. But this claim need not refer to specific educational interventions; it may instead be based on overall personal development in all life contexts, or simply on the overall effect of children and adolescents following formal education. Model 1 should therefore be seen as a ‘zero’ condition with respect to the implementation of special programs for socio-emotional learning at school. This point is less trivial than it might seem. Recommendations stressing the importance of social and emotional attributes are frequently motivated in a way that does not even touch on a potential role of education. For example, as follows: “Evidence confirms that student skills other than academic achievement and ability predict a broad range of academic and life outcomes” (West, 2016). Although the further context of the publication makes it clear that the author has skills in mind that are acquired at school, there is frequently insufficient attention to the fact that model 1 also holds without an identifiable added value attributable to education. It is quite likely that the ‘given’ personality of people is associated with success in life.
Rational strategies to support labor market needs for soft skills also exist outside the sphere of education and might, for example, be fulfilled within the realm of personnel selection, applying personality tests of individual differences. A crucial issue is whether social and emotional attributes are positioned as educational outcomes (in the sense of attainable educational goals) or seen as intermediary conditions, and instrumental to academic performance. Model 2 and model 3 represent social and emotional outcomes as goals in their own right, while models 5 and 6 place social and emotional outcomes as intermediary goals. The implications for educational practice of a choice between these two options are particularly evident in the way the assessment of the social and emotional goals should be seen. When social and emotional outcomes are formal educational objectives, ‘high stakes testing’, involving civil effect consequences for students, and accountability requirements for schools (and perhaps teachers), would be a logical choice. When social and emotional outcomes have an intermediary position, supportive to academic performance, monitoring with a more ‘formative’ intention could suffice. The six models depicted in the figure represent causal models. Tenability of the ambitions that are inherent in the different variants is a matter of measurability of the outcomes (as outlined above), but also a question of proven effectiveness, in the sense of attribution of outcome variation to the antecedent conditions. In this sense models 1 to 4 are to be seen as rather global input-output configurations, whereas models 5 and 6 represent indirect effect models. Research into the simpler direct models answers the question whether informal learning or a deliberate social emotional learning intervention has an effect or not, whereas indirect effect models provide additional evidence on how the program takes effect. The results of our study indicate that measurement and assessment of social and emotional outcomes is currently insufficiently developed to allow for high stakes summative assessment. The trait versus skills discussion has not led to convincing solutions, and as matters stand, the erroneous practice of using trait measures instead of measures of performance oriented “skills” persists. Next, on the issue of evidence for the malleability of social and emotional attributes by means of educational interventions, the message from our study is more mixed.
On the one hand, meta-analyses show educationally significant effect sizes; on the other hand, these effects are not supported by verified program theories and are uncertain because of technical imperfections in the studies on which the meta-analyses are based. The most important technical imperfection is the measurement problem mentioned above. The meta-analyses that were reviewed depend heavily on personality trait measures, frequently based on students’ self-reports, where skill measures would have been required. Learning effects on the cognitive facets of the personality scales (like increased knowledge of social conventions and rules for good behavior) might have to be considered as an alternative explanation for the significant program effects. The above conclusions imply that the evidence base for the effectiveness of social-emotional learning programs is rather weak. In a rational world this would imply that ambition levels and investment scales should be reconsidered. The scope and intensity of social-emotional learning interventions may vary in different ways. We just mentioned the difference between applications where social and emotional learning outcomes are considered as assessable educational objectives ‘in their own right’ and the perspective of seeing social-emotional learning as instrumental to academic performance. The former applications are likely to be more involving than the latter, also depending on the formal or informal nature of the instrumental program, both in the sense of an explicitly scripted program and formal or informal formative assessment methods. Two other dimensions on which the scope and intensity of “dealing

with social and emotional facets at school” may vary are (1) merely using assessment of social and emotional characteristics of students for diagnostic purposes and (2) embedded versus stand alone applications. Diagnostic application has been traditionally tied to recognition of disorders and pathology. As we noted social emotional learning programs in the United States originally had a strong bias towards prevention in schools in disadvantaged areas. Theoretically, diagnosis of social and emotional skills could be more generally (not tied to specifically problematic settings) used as an assessment of entrance behavior as a basis for dealing with social-emotional aspects in regular teaching. For this latter condition, dealing with social-emotional aspects in regular teaching, we use the term embedded, as compared to stand-alone programs. In the next section the embedded approach will be described in more detail.

8.9.2 Practice: Special Programs or an Embedded Approach?

The core of education is knowledge transfer in established subjects, knowledge fields and occupational domains. The logic of didactic analysis and taxonomies of educational objectives shows that mastering content and mastering cognitive skills go hand in hand. Progress is defined in terms of increased content knowledge and carrying out psychological operations of increasingly higher order, with transfer to other contexts, problem solving, innovation and creativity as progressive levels. In the history of didactic analysis, initiatives to teach cognitive skills separately from substantive content and traditional subjects have existed for about five decades. The best-known examples are ‘learning to learn’ programs, but other examples are ‘creative thinking programs’. All in all, these programs seem to have fought an uphill battle for acceptance and to lead a dwindling existence. Critics have pointed at the artificial nature of separating cognitive skills from content teaching (Weinert, 2001). A line of thinking parallel to the cognitive domain could be followed for the teaching of social and emotional skills. Just as in the cognitive domain, teaching social and emotional skills requires content and applications in particular situations. Special stand-alone programs for social-emotional learning have a cognitive element in the teaching of specific vocabulary to talk about motives and feelings, and create a number of learning experiences that evoke social and emotional aspects of situations, ranging from discussing literature and viewing films to projects, role playing, excursions and electronic discussion platforms. An alternative is based on profiting from the everyday experiences at school and in the classroom as a natural context for dealing with social and emotional experiences. This could be described as an embedded approach to stimulate social and emotional development (Scheerens, 2009).
Typical situations in the classroom that are rich sources of reflection and teaching of social and emotional skills are classroom organization and management, situations related to discipline and following behavioral rules, (inter)cultural norms, conflict resolution, collaborative learning and motivational and affective facets of performance feedback. One supportive condition for such an embedded approach to

social-emotional learning is a teaching style that is attentive to social and emotional support and is reflective with respect to everyday experiences concerning affect and motivation. Another relevant condition is a shared and supportive school climate. An approach somewhere in between special SEL programs and an embedded approach is formed by citizenship programs. Such programs have a knowledge and cognitive part that has content on principles of democracy, knowledge of different cultures and human rights principles, but often also mention social and emotional skills and moral development. In national curricula citizenship may be a separate subject, or be attached to specific other subjects, like social sciences or history. Given the state of the empirical evidence, treating social-emotional development as supportive to cognitive development and academic performance, and opting for embedded applications, are more defensible choices for educational policy and practice than stand-alone comprehensive programs that, as the EU would have it, are part of national curricula.

8.9.3 Implications for Research

8.9.3.1 Developing a More Established Conceptual Categorization

We have tried to add to a conceptual overview and categorization of the field of twenty-first century skills, key competences and social-emotional learning. Continued attention to this issue is needed, to avoid confusion and to facilitate collaboration and international exchange. A more established conceptualization would serve the field in important ways. Core areas for analysis and debate are:

– Ethical questions about normative interpretations of personality characteristics, like giving high marks for extraversion. Such issues could arise when educational objectives, erroneously labeled as skills but measured as psychological trait facets, are used for high stakes judgements.
– The cognitive nature of social and emotional skills. In cognitive psychology skills are described as procedural knowledge (knowing how), which can be developed gradually and become internalized and automated into actions, classifiable in performance levels. From this perspective it could be argued that skills are always cognitive and that the expression ‘social-emotional skills’ is a contradictio in terminis. Likewise, from this perspective it could be argued that what we are dealing with are cognitive skills in the social-emotional domain.
– Inclinations in the social-emotional domain that are not performance oriented (non-skills). The labelling in the ‘soft skills movement’ has shifted attention away from functioning in the social-emotional domain that is characterized as habits, liabilities, bents, moods and attitudes. To the extent that these are relevant to students’ experience of school life (e.g. school well-being), they might be considered as objects of monitoring.

8.9.3.2 Improving Measurement and Assessment

In connection with the previous point, a definitional question is at the core of the most serious assessment issue, namely the distinction between personality traits and skills, and a consistent follow-through of this distinction. In Chap. 2 we expressed the expectation that a review of the measurement of social and emotional skills, and of specific instruments to do so, would help in further clarifying conceptual challenges. This applied specifically to the distinction between trait (facets) on the one hand and skills on the other hand. Chapter 3 elaborated on problematic aspects of this distinction when the skills approach to personality was discussed, and the chapter ended with the sentence: “high scores on personality traits and facets don’t say anything about the social-emotional skills that students can demonstrate in situations that require to use such skills”. In Chap. 6 we selected instruments that were used in social-emotional effect studies. These effect studies were further documented in Chaps. 4 and 5. In these chapters we hardly encountered instruments used to assess social and emotional outcomes that matched our characterization of skills as performance-oriented inclinations. When we tried to get an impression of the fitness for purpose of the measures that were analysed in Chap. 6, we frequently doubted their usefulness as SEL outcome measures, because they lacked reference to assessable performance, depended largely on self-reports, and rarely addressed predictive or criterion validity. Other debatable issues in the use of trait/facet measures as if they are skills are: the use of only the positive end of the scale in determining outcome scores, the question of the general or more restricted scope of the scales in terms of broad or narrow situational referents, and the question of ‘levels’ in social and emotional skills.

Just using positive items. On this issue Chap. 3 referred to the erroneous practice of only using the positive poles of the Big Five traits, and changing the scoring format to a Likert scale, ranging from 1 to 5, suggesting in this way that the higher the score, the better the skill. “Because the items actually measure students’ typical behaviour instead of skills, this way of scoring is completely wrong, because it suggests that the higher the score the more adaptive the typical behaviour is, while in fact individuals with the highest scores on the positive poles of personality traits are usually not the most adaptive in terms of their typical feelings, thinking and behaviour and also not the ‘best persons’ in terms of their personality structure”.

General or situation specific skills. As noted in Chap. 5, Social-Emotional Learning programmes frequently have a Prevention, rather than a Development orientation, in disadvantaged settings, and a strong emphasis on countering anti-social behaviour. This striving for good behaviour is labelled as part of social-emotional learning at school. Educational interventions in this domain are likely to address specific behaviour, partly in school and partly in the direct school environment. Similarly, pro-social behaviour in specific learning situations like group work, and experience with affective and conative reactions in a context of performance feedback, could be seen as having a relatively ‘narrow’ situational characterization. It could

be argued that focusing on such narrower areas of good and social behaviour would make outcomes better assessable as demonstrable behaviour and through performance measurement. Perhaps a continuum of social-emotional learning targets running from narrow to broad in terms of situation specificity would be helpful to provide more clarity about the ambitions of social-emotional learning programmes at school. At the narrow end of this continuum could be appropriate social and emotional behaviour as part of school and classroom life, and at the broad end of the continuum appropriate behaviour in all possible life situations (home, peer group, local community, intercultural cooperation, and work situations). Such a continuum could be a basis for choosing the kind of measurement instrument to assess the intended social and emotional outcomes: behavioural and performance-oriented assessments are more likely for situation-specific targets, middle-range dispositions could be assessed as theoretical procedural knowledge in the social-emotional domain, and broad-end skills could be measured as performance in the application of procedural knowledge in different contexts (transfer). The assessment of general traits should be reserved for purposes of screening and diagnosis. A perspective in this debate on situation-specific interventions and the appropriateness of general personality trait measures that should not be overlooked is the ambition in education to go beyond specific experiences and aim for generalisation of what is learned in specific situations. In the cognitive domain this ambition to generalize is expressed by the construct of transfer of training and, more generally, by the distinction of higher order cognitive skills and meta-cognition. In their report for the US National Research Council on the development of 21st century skills, Pellegrino and Hilton (2012) note that theory formation and research on interpersonal and intrapersonal skills are less developed in comparison to cognitive skills, and recommend addressing transfer more explicitly in future research. Such an endeavour would align with giving more meaning to the ambition of social and emotional skills that are ‘transversal’.

Assessment instruments with levels. Further research and development in the domain of social-emotional learning would be served by a more structured design of well-aligned program development and assessment, as outlined by the US National Research Council (2012), starting out from the specification of goals and objectives, didactic specification in the form of curriculum development and scripted programs, and evaluative specification in terms of quantitative scales. In the social-emotional learning domain this might lead to a reconsideration and perhaps further development of taxonomies in the affective and conative domains, and matching scales, which would express “substantive” levels of dispositions, “based on developmental theories of specific skills describing increasing mastery levels of that skill”, as we cited Abrahams et al. (2019, 466) in Chap. 4.

Social-emotional skills or cognitive skills in the social-emotional domain? Ideas about further development of social-emotional learning programs, in terms of scripted programs guided by leveled taxonomies of learning objectives in the social-emotional domain and instruments based on developmental psychology theories of social and emotional skills, are tentative and perhaps an interesting challenge for further

research. One not unlikely failure of such research and development programs could be that the social-emotional domain does not appear amenable to a kind of structuration and theoretical underpinning as the cognitive domain. Or rather: a possibly fitting structuration might appear to copy cognitive taxonomies. When thinking about ‘higher order’ levels on scales in the social-emotional domain, it might be concluded that these higher order levels are nothing different from problem solving, levels of generalization in transfer and meta-cognition. It might then be considered preferable to change the semantics from social-emotional skills to (cognitive) skills in the social-emotional domain and think in terms of declarative and procedural knowledge applied to the social-emotional domain.

8.9.3.3 Causal Modelling

The evidence base for the thesis that social and emotional skills are malleable by means of educational interventions largely depends on field experiments, in which schools that have implemented a particular social-emotional learning program are compared to a control group. Particularly when the assignment to the treatment and the control group is fully randomized, this is considered the gold standard for establishing causality and evidence-based policy advice. When we described illustrative cases of studies that had followed this methodology to evaluate social-emotional learning programs (Chap. 6), we noted that the programs were usually multi-faceted and complex and that a broad range of outcome measures was used for most programs, with sets of chosen outcome variables differing between program evaluations. Such studies show the overall effectiveness of the program, which is important, and could be sufficient for decision-oriented evaluations (as in the case of deciding, on the basis of a pilot study, to implement the program on a larger scale). Looking at these studies from a more scientific perspective, in which obtaining generalizable knowledge about the malleability of particular social and emotional attributes is the aim, our ambitions are different. For scientific purposes we are interested in why and how the program was effective, not just in establishing whether it works or not. SEL program evaluations meet these additional requirements to the degree that program interventions are well specified and preferably supported by an explicit rationale or program theory (see Chap. 5). From the case studies of program evaluations reported in Chap. 5 we noted that programs were multi-faceted and that explicit and operational program theories were largely missing. Despite the advantages of randomized trials, such designs have their limitations when amounting to no more than black box evaluations. In Chap. 4 we also referred to results from non-experimental educational effectiveness studies. Such studies depend on making inferences based on existing variation in schooling and educational practice. Results are obtained by means of establishing covariation between school-controlled independent variables and outcome variables, while adjusting for co-variables, like pre-test information on the outcome variables. Although weaker on internal validity than randomized trials, such designs can handle

elaborate causal models, incorporating multi-level relationships and influences from mediating variables. Higher ecological validity might also be seen as an advantage of non-experimental studies, as reactive arrangements attached to experimental interventions, like the Hawthorne effect, are avoided. Perhaps a solution would be to combine experimental design with non-experimental causal modeling, as suggested in Chap. 6. In order to do this well, the importance of conceptual model specification and substantive theory should be underlined again. Finally, in addition to attainment effects of social-emotional learning interventions, equity indicators might be included in the causal models, as part of assessing the differential effectiveness of these programs for disadvantaged learners (cf. Kyriakides, Creemers, & Charalambous, 2018).
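
The contrast drawn above between black box evaluation and causal modelling with mediating variables can be made concrete with a small numerical sketch. The Python example below is an illustration only, not an analysis from this study: it simulates a randomized trial with entirely synthetic data, and the variable names (treatment, mediator, outcome) and the path values 0.5, 0.4 and 0.1 are assumptions chosen for the example. It splits the total program effect into a direct part and an indirect part that runs through a mediating variable, using ordinary least squares and the product-of-coefficients approach.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients for y regressed on X, intercept first."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def simulate(n=5000, a=0.5, b=0.4, direct=0.1, seed=0):
    """Synthetic trial: all values are hypothetical, chosen for illustration."""
    rng = np.random.default_rng(seed)
    # Randomized assignment to program (1) or control (0).
    treatment = rng.integers(0, 2, size=n).astype(float)
    # Mediating variable, e.g. a hypothetical intermediate attribute.
    mediator = a * treatment + rng.normal(size=n)
    # Outcome depends on the program directly and through the mediator.
    outcome = direct * treatment + b * mediator + rng.normal(size=n)
    return treatment, mediator, outcome


def decompose(treatment, mediator, outcome):
    """Split the total effect into direct and indirect (a * b) components."""
    total = ols(outcome, treatment)[1]            # black box estimate
    a_hat = ols(mediator, treatment)[1]           # program -> mediator
    coefs = ols(outcome, np.column_stack([treatment, mediator]))
    direct_hat, b_hat = coefs[1], coefs[2]        # direct path, mediator -> outcome
    return {"total": total, "direct": direct_hat, "indirect": a_hat * b_hat}


if __name__ == "__main__":
    t, m, y = simulate()
    for name, value in decompose(t, m, y).items():
        print(f"{name:>8}: {value:.3f}")
```

In a linear model fitted this way the total effect equals the sum of the direct and indirect parts, so the black box estimate is recovered while the mediating path is made visible. Because only the treatment is randomized, the estimated path from mediator to outcome still rests on the assumption that no unmeasured variable influences both, which is exactly where conceptual model specification and substantive theory are needed.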

8.9.3.4 Theory Development

The evidence base on the effects of social-emotional learning depends largely on applied research strands like program evaluations and non-experimental educational effectiveness research. Both are empiricist and rarely driven by theory or conceptual models. Still, as empirical evidence accumulates, more general insights can be induced, which, in their turn, might guide further inquiry. In the case of social-emotional learning programs there must be plans and developmental designs, but we encountered few cases of such plans being explicitly tied to the design of program evaluations and explicit hypotheses of cause and effect relationships.3 We noted in the review of case studies of program evaluations that program theories were only developed in a minority of cases. Similarly, we encountered only a few instances in which more established theory was referred to in support of program rationales or as a way to interpret results. The examples that we mentioned were econometric modelling, motivational theory with respect to self-efficacy, and social-psychological models about dynamic interactions between social-psychological treatments and regular teaching (Yeager and Walton, 2011). Specifically, when the interest is in the instrumental perspective on social and emotional attributes, i.e. in support of academic performance, broader insights from educational effectiveness research could be brought to bear upon social-emotional learning. Examples are the relevance of “high expectations” of pupil performance, the Pygmalion effect, and insights about the dynamics of performance feedback. What we would like to emphasize is that use of theory and theory development is not a matter of “theory for the sake of theory” but rather is about ways of better organizing our knowledge, giving focus to future studies and

3 “In addition to informing current practice and policy, research in education should support the development of explanatory and predictive theories of educational processes and mechanisms. Education research must answer questions about why, how, under what circumstances, and for whom, education practices and policies affect individual outcomes. Without an evidence-based theory of educational processes and mechanisms, pragmatic evidence of effectiveness may not be generalizable to new settings or different populations.” Opening statement SREE 2011 Spring meeting. Conference program.

better interpretation of acquired results. There is nothing more practical than a good theory.

8.10 Making Up the Balance: The Political Economy of the Soft Skills Movement

In the introduction of this book we clarified our perspective when speaking of the soft skills movement. We have tried to deal with it as a research program, and most of the content is dedicated to reviewing empirical evidence. However, when we conclude that the evidence is not strong enough to warrant massive implementation and overhaul of national curricula, as the European Union seems to propagate, we realize that we are entering the political domain. When we discussed criticisms of the soft skills movement in the first chapter, we took note of the economic and business interests behind the movement (cf. Williamson, 2018, 2019). But we think that an additional mechanism is at play that could explain the incredible attractiveness of important sounding, but otherwise fuzzy and half-understood innovations in education. In the Netherlands, social constructivism is a case in point. It took an inter-parliamentary review committee (chaired by the internationally known Jeroen Dijsselbloem) to conclude that premature institutionalization of “the student as the owner of his or her learning process” (or similar jargon) was not such a big success, after all. In its final report, which came out in 2008, the committee recommended that, henceforth, major educational reforms should be ruled out unless they were supported by strong evidence on their feasibility. What happened since then was that a research-based initiative of performance-oriented teaching, after a relatively short flourish, gradually dwindled, to be replaced by a strong sentiment against economic return on investment thinking and growing resistance against performance testing, examinations and control by the inspectorate.
All this in a period when the influence of the central government was further devolved to the intermediary structure of influential councils which combine the role of formal employer in education with innovation as a second pillar of their mission. Results of these developments are an increase of market influences on educational development, the ‘innovation market’, and even in the domain of high stakes educational testing. In our view these developments might also ultimately be understood in terms of a political, and partly also financial, economy of interests. The question “who benefits financially?” should be raised more frequently when educational innovations are proposed. At the same time pedagogical philosophies of ‘individual growth’, ‘self-development’ and holism (‘developing the whole child’) experience a revival, and these pedagogical philosophies align quite well with some interpretations of the soft skills movement. In the Netherlands, in 2017 a committee developed a proposal for a major curriculum overhaul, under the title “Education 2032”. Soft skills were recommended to obtain a more prominent place, and examination should be adapted accordingly. Although the proposal for curriculum reform met with a lot of criticism,

the ever more autonomous schools in The Netherlands might be tempted to accept commercial soft skills merchandise, in the form of software packages and formative evaluation tools, uncritically. As researchers we can only do so much to have a positive influence on these developments, but what we can do is to keep critically reviewing evidence, create new evidence, and make our results public.

References

Abrahams, L., Pancorbo, G., Primi, R., Santos, D., Kyllonen, P., John, O. P., et al. (2019). Social-emotional skill assessment in children and adolescents: Advances and challenges in personality, clinical, and educational contexts. Psychological Assessment, 31(4), 460–473. https://doi.org/10.1037/pas0000591.
Brunner, M., Keller, U., Wenger, M., Fischbach, A., & Lüdtke, O. (2018). Between-school variation in students’ achievement, motivation, affect, and learning strategies: Results from 81 countries for planning group-randomized trials in education. Journal of Research on Educational Effectiveness, 11(3), 452–478. https://doi.org/10.1080/19345747.2017.1375584.
EEF (Education Endowment Foundation). (2018). SPECTRUM database. Retrieved from https://educationendowmentfoundation.org.uk/projects-and-evaluation/evaluating-projects/measuring-essential-skills/spectrum-database/.
Groot, A.D., de, & Medendorp, F. L. (1986). Term, begrip, theorie. Inleiding tot signifische begripsanalyse [Term, concept, theory. Introduction to signatory concept analysis]. Meppel, The Netherlands: Boom.
Hendriks, A. A. J., Hofstee, W. K. B., & De Raad, B. (1999). The five-factor personality inventory (FFPI). Personality and Individual Differences, 27(2), 307–325. https://doi.org/10.1016/S0191-8869(98)00245-1.
Hendriks, A. A. J., Kuyper, H., Lubbers, M. J., & Van der Werf, M. P. C. (2011). Personality as a moderator of context effects on academic achievement. Journal of School Psychology, 49(2), 217–248. https://doi.org/10.1016/j.jsp.2010.12.001.
Kautz, T., et al. (2014). Fostering and measuring skills: Improving cognitive and non-cognitive skills to promote lifetime success. OECD Education Working Papers, No. 110, OECD Publishing. https://doi.org/10.1787/5jxsr7vr78f7-en.
Kuhlemeier, H., Knoop, H., Van Boxtel, H., Papenburg, I., & Hollenberg, J. (2016). Wetenschappelijke verantwoording VISEON 2.0. Volginstrument van sociaal emotioneel functioneren [Scientific account of VISEON 2.0, Monitoring instrument of social emotional functioning]. Arnhem: CITO.
Kyriakides, L., Creemers, B. P. M., & Charalambous, E. (2018). Equity and quality dimensions in educational effectiveness. Dordrecht, the Netherlands: Springer.
Lortie-Forgues, H., & Inglis, M. (2019). Rigorous large-scale educational RCTs are often uninformative: Should we be concerned? Educational Researcher, 48(3), 158–166. https://doi.org/10.3102/0013189X19832850.
National Research Council. (2012). Pellegrino, J. W. and Hilton, M.L. (Eds). Education for life and work: Developing transferable knowledge and skills in the 21st century. Washington D.C.: National Academies Press. https://doi.org/10.17226/13398.
Roberts, B. W. (2009). Back to the future: Personality and Assessment and personality development. Journal of Research in Personality, 43(2), 137–145. https://doi.org/10.1016/j.jrp.2008.12.015.
Scheerens, J. (Ed.). (2009). Informal learning for active citizenship at school. An international comparative study in seven European countries. Dordrecht, The Netherlands: Springer.
Weinert, F. E. (2001). Concept of competence: A conceptual clarification. In D. S. Rychen & L. H. Salganik (Eds.), Defining and selecting key competencies (pp. 45–65). Seattle, Toronto, Bern, Göttingen: Hogrefe & Huber Publishers.
West, M. R., & Center on Children and Families at Brookings. (2016). Should Non-Cognitive Skills Be Included in School Accountability Systems? Preliminary Evidence from California’s CORE Districts. Evidence Speaks Reports, Vol 1, #13. Center on Children and Families at Brookings. Retrieved from https://www.brookings.edu/wp-content/uploads/2016/07/EvidenceSpeaksWest031716.pdf.
Wilbrink, B. (2016). Dutch education reform “Education-2032” # onderwijs 2032 – reform indeed: progressivist pseudo-science. Retrieved from http://benwilbrink.nl/projecten/onderwijs2032.htm.
Williamson, B. (2018). PISA for personality testing – the OECD and the psychometric science of social-emotional skills. Retrieved from https://codeactsineducation.wordpress.com/2018/01/16/pisa-for-personality-testing/.
Williamson, B. (2019). Psychodata: disassembling the psychological, economic, and statistical infrastructure of ‘social-emotional learning’. Journal of Education Policy. https://doi.org/10.1080/02680939.2019.1672895.
Yeager, D. S., & Walton, G. M. (2011). Social psychological Interventions in Education: They’re not magic. Review of Educational Research, 81(2), 267–301. https://doi.org/10.3102/0034654311405999.

Epilogue

We titled this book “Soft skills in education. Putting the evidence in perspective”. So let us, at the end of the line, briefly summarize the perspectives and considerations.

Cognitive and non-cognitive parts of curricula as a zero sum game

The initial motivation for this study was our experience in the Netherlands, when a national committee proposed a curriculum overhaul in which more space for soft skills went at the cost of time reduction for traditional school subjects. After having completed the study we reject the “zero-sum” perspective and present arguments for a more modest and “embedded” approach to social-emotional development at school.

Critical perspectives regarding appropriateness

Although good behavior and diligence are age-old and accepted values in school education, personality formation in the affective domain has been contested, for fear of indoctrination and because of privacy considerations. In the information age these concerns obtain additional gravity. From the side of the critical school of thought in sociology, anthropology and theory of justice, new forms of inequality are discerned. Also, the political and material economy of soft skills as a new business warrants scrutiny.

“Sollen impliziert koennen” (“ought to” implies “can do”)

The core of this book is dedicated to reviewing the available research evidence about the technical feasibility of the soft skill movement’s claims, in the sense of having a clear conceptual basis, valid measures and supporting evidence on malleability (effectiveness of educational interventions to influence social-emotional learning outcomes). This is what we found:

On conceptual development

An initial conceptual mess, clarified in later years by means of reference to the Big Five personality taxonomy, but opening up new cans of worms. Our excursion

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 J. Scheerens et al., Soft Skills in Education, https://doi.org/10.1007/978-3-030-54787-5


into the realm of personality testing revealed it to be very unlikely that educational interventions influence personality trait facets.

On soft skill measures

The move to call these trait facets “skills” breaks down when it appears that the measures used to assess social and emotional outcomes are mostly trait measures, either self-ratings or ratings by observers who are not experts. In the book we refer to a number of developments that have the potential to take the measurement of skills in the social-emotional domain seriously, but these are still at the stage of ideas for a research program (as they were in 2012, when the American Council on 21st century skills made basically the same assessment and recommendation (pp. 13, 14)).

On malleability

Technically sound meta-analyses show small to medium effect sizes of social-emotional learning interventions on social-emotional outcomes as well as academic performance. Benchmarking these effects against other evidence, together with in-depth review of the underlying studies, calls for a prudent interpretation of these results. Review of the earlier studies and our newly conducted meta-analysis on facets of conscientiousness draw attention to frequently occurring artifacts in the underlying studies.

Restructuring the soft-skills research paradigm

First, further development in this area would benefit from being placed squarely in the research tradition of empirical-analytic multi-disciplinary educational research, starting out with clearly delineated learning goals and a model of how learning is expected to develop, along with assessments to measure student progress toward and attainment of the goals. Among other things, this approach would put transfer and generalized application of behavior in the social-emotional domain into perspective.
Secondly, exploration of “scales with levels” of attributes in the social-emotional domain, possibly benchmarked against developmental continua from child and educational psychology, would be a critical test of whether “transferable skills” are amenable to educational interventions. Thirdly, perhaps a change in semantics would be required, recognizing that higher order levels of functioning in the social and affective domain are very much like the higher order cognitive processes of transfer and meta-cognition (cf. American Council on 21st century skills, 2012, p. 13). Fourth and finally, theory formation on explanatory mechanisms about the complex interactions between cognitive learning and affective reactions should be placed more centrally. After all, it would not be sufficient to know what works in the social-emotional domain; one also needs to know how and why it works.

The ‘much ado about very little’ perspective

Without belittling the relevance of development in the social-emotional domain at school, and recognizing that all the attention that the soft skills movement is generating has its positive sides, one should realize that the outlined research paradigm might very well falsify the claims. Maybe the attention alone will lastingly


stimulate teachers to enhance their diagnostic skills, reflect more, and broaden their feedback to students, without the need for comprehensive new programs and high-stakes assessment.